Monitoring & Compliance In Broadcast: Monitoring Compute Systems

With the ongoing evolution from dedicated hardware towards software running on COTS and cloud-compute infrastructure, monitoring compute resources is vital.

For many years, broadcasters have in effect been running data centers, usually in the basement or lower floors of a broadcast center building. They were not called data centers; most often they were called the Central Apparatus Room (CAR), sometimes the Central Machine Room, the Central Technical Area, or simply the “rack room”.

A More Diverse Ecosystem

Initially, these areas were full of specialist broadcast hardware, specifically built to cope with the demands of video and audio production. As standard IT hardware became fast enough, and digital storage reached useful sizes for large video and audio files, more and more of the CAR consisted of racks of standard servers with specialist software installed. The use of KVM, remote desktop, and RS232 meant that engineers no longer had to use local interfaces and keyboards within the room to access a particular device. In addition, monitoring tools became available so that devices in the CAR, such as audio and video routers, video servers, compression systems, multiplexers and broadcast playout automation systems, could be monitored from the NOC (Network Operations Center). Typically, though, many broadcast engineers still carry portable specialized test and measurement tools so that they can physically check signals at local points, and waveform monitors, image analyzers and timing tools are still usually present as part of production workflows.

Modern broadcast facilities are often a blend of dedicated hardware and more flexible compute resources, which can be on-premises, in a remote (but owned) data center, or in a public cloud data center. In all of these cases, the trend with compute resources has been towards COTS hardware; however, there is often still a need for specialized equipment, which may be based on standard servers but requires additional boards and specialized software to be installed.

More To Monitor

Whatever the situation, both the devices in use and the networks connecting them need regular monitoring. Broadly speaking, wherever the broadcast system is situated, there will be a substantial requirement for several types of monitoring: facility monitoring (typically air-conditioning, water leak detection and fire safety), core infrastructure monitoring (including servers, networks, storage, operating systems [these days generally standard IT versions], virtual machines and microservices), and monitoring of the media applications themselves. Any of these may be on premises or off premises, and maintenance and test schedules should ideally be in place for all of them.

All of these should be monitored, and to do that, a number of different monitoring systems, both static and dynamic, may need to be used simultaneously. The actual media flow through the system typically needs to be checked frequently, and this requires dynamic monitoring. An example might be a server somewhere in the facility that is reported as functioning properly by the IT monitoring system, yet when examined via the video monitoring system it turns out that, while the server is playing video, the video format it is playing is incorrect. A good monitoring system should also be able to give some indication of the impact a fault somewhere in the system might have, and to do this, several systems may need to be combined to give a top-down view.
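As a minimal sketch of what combining those two viewpoints might look like, the following Python fragment merges a hypothetical IT health report with a hypothetical video probe reading for the same server; the data sources, field names and expected format are assumptions for illustration only, not any particular product's API.

```python
# Sketch: combine two monitoring viewpoints (IT health + media probe) into one status.
# host_health and probe_format are hypothetical stand-ins for whatever the
# IT monitoring system and the video probe actually expose.

EXPECTED_FORMAT = "1080i50"   # assumed house format for this playout server


def combined_status(host_health: dict, probe_format: str) -> str:
    """Return a single status reflecting both the IT view and the media view."""
    if host_health.get("state") != "ok":
        return "FAULT: server hardware/OS problem reported by IT monitoring"
    if probe_format != EXPECTED_FORMAT:
        # The server looks healthy to IT monitoring, but the media flow is wrong.
        return f"FAULT: playing {probe_format}, expected {EXPECTED_FORMAT}"
    return "OK: server healthy and media format as expected"


if __name__ == "__main__":
    # Example: IT monitoring says the box is fine, but the probe disagrees.
    print(combined_status({"state": "ok"}, "720p59.94"))
```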

Where SaaS and cloud systems are in place, these often come with their own monitoring tools and may also have suitable failover procedures in place. In some cases this can conflict with the specific software licensing models under which some specialist professional broadcast tools operate, and this needs consideration when building a new system.

A cloud service provider may insist that their monitoring tools are used as part of their offering, and may require that their own engineers provide this service. This can work well, but it can also cause difficulties when there are specific video and audio production requirements. In some cases, separate monitoring systems may need to look at the same device, but from different viewpoints.

Resources For Monitoring

When building a system, whether on premises or in the cloud, it is also necessary to consider the potential overhead that monitoring may place on the overall resources. When designing a system made up of many third-party products, each provider may give an indication of the typical hardware requirement needed to support the media flow through their product for a system of the requested size. Often, in order to offer the best price, the hardware specification will be sized quite tightly to the product. If constant and frequent monitoring (typically repeated querying of servers) is then added on top, this can present problems for both the networks and the devices. How often to monitor, where to monitor, and where to monitor statically or dynamically across a system are key decisions. Analyzing the criticality of each element of the production system and adjusting the query timing accordingly can pay dividends for the overall efficiency of the system.
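One way to picture criticality-based query timing is the small scheduler sketched below; the device names, criticality tiers and intervals are illustrative assumptions, and the poll function is a placeholder for whatever query (SNMP get, HTTP health endpoint, etc.) a real system would make.

```python
# Sketch: criticality-weighted polling, so that critical elements are queried
# often and less critical ones rarely, keeping monitoring overhead down.
# All names and intervals here are illustrative, not from any real system.

POLL_INTERVALS = {          # seconds between queries, by criticality tier
    "critical": 10,         # e.g. playout servers currently on air
    "important": 60,        # e.g. ingest and transcode nodes
    "routine": 600,         # e.g. archive storage, offline render farm
}

DEVICES = [
    {"name": "playout-01", "criticality": "critical"},
    {"name": "transcode-03", "criticality": "important"},
    {"name": "archive-07", "criticality": "routine"},
]


def poll(device: dict) -> None:
    """Placeholder for the real query (SNMP get, HTTP health check, etc.)."""
    print(f"polling {device['name']}")


def run(now: float = 0.0, horizon: float = 120.0) -> None:
    """Tiny scheduler over a simulated clock: poll each device when it is due."""
    last_polled = {d["name"]: -float("inf") for d in DEVICES}
    while now <= horizon:
        for d in DEVICES:
            if now - last_polled[d["name"]] >= POLL_INTERVALS[d["criticality"]]:
                poll(d)
                last_polled[d["name"]] = now
        now += 10  # simulated 10-second tick; a real system would sleep here


if __name__ == "__main__":
    run()
```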

Maintenance

One thorny problem in integrated, network-based systems, where different third-party providers form part of an overall media workflow, is software updates. Once a system is up and running smoothly and suitable monitoring is in place, particularly where complex API-driven workflows are installed, it is tempting to leave it alone. Unfortunately, as standards change, new workflows are required, and security updates are needed, each manufacturer will issue software updates, which may or may not impact other parts of the system. Corresponding changes in system monitoring tools will also be needed.

Having said that, most manufacturers and broadcast engineers can now use remote working tools, and most fault diagnostics can take place via an external connection once a fault is flagged, either by automated monitoring systems or by the humans attempting to use the system. Monitoring systems can flag errors, or in some cases potential errors that may be about to occur, and send messages to authorized engineers; generally these alerts are sent via SNMP, HTTP, or configurable email addresses. This can present security risks, but if an external connection for maintenance or fault repair is specifically invited by an authorized engineer, the risk is lessened. Modern systems are also starting to use AI and machine learning to predict potential errors before they occur, generally based on historical data logged by monitoring systems, which gives information on where the pressure points might be within a system.
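For the HTTP case, a minimal sketch of forwarding an alert to an engineer-facing endpoint might look like the following; the webhook URL and payload fields are hypothetical, and a real deployment would use whichever endpoint its alerting or messaging system actually exposes (SNMP traps and email being equally common transports for the same message).

```python
import json
import urllib.request

# Sketch: push a monitoring alert to a hypothetical HTTP webhook endpoint.
ALERT_WEBHOOK = "https://alerts.example.com/hook"   # illustrative URL only


def send_alert(device: str, severity: str, message: str) -> int:
    """POST a JSON-encoded alert and return the HTTP status code."""
    payload = json.dumps({
        "device": device,
        "severity": severity,
        "message": message,
    }).encode("utf-8")
    req = urllib.request.Request(
        ALERT_WEBHOOK,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status  # 200 expected if the alert was accepted


if __name__ == "__main__":
    send_alert("playout-01", "warning", "Video format mismatch detected")
```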

Conclusion

As systems become less and less localized to a centralized facility, the tools needed to monitor them become more sophisticated and available remotely, able to analyze the data and present it to engineers in a quickly understood form, usually as a dashboard with the option to drill into deeper analysis once the source of an error is indicated. Remote monitoring systems can also be set up to ignore spurious errors that may be flagged because of issues with the remote connection between the monitoring systems and the resource location. These options free engineers from the need to be physically present in the CAR or data center, although at the time of writing it is still sometimes necessary in many facilities to physically go to a rack and re-route a cable! With modern monitoring systems in place, the ability to analyze and pinpoint a fault quickly and effectively becomes routine and contributes to the efficient running of the facility, whether it be on premises, hybrid, or cloud.
