Point-to-point connectivity and well-defined component functionality allow broadcast engineers to easily monitor and fault-find SDI, AES, and MADI infrastructures. But as more broadcasters transition to IP, can we learn from IT to help us improve reliability?
Striving for one-hundred percent reliability is a noble objective, but all systems suffer a certain amount of failure. We can hope to get closer to perfection, but we never really achieve it. Human error, scenarios we haven’t considered, and cost constraints all conspire against us.
When I first started working with IT professionals, I was completely amazed by the amount of time and energy they spent building and understanding monitoring systems. They weren’t just content with monitoring network bandwidth or server uptime. They got deep into the system and individual components, recording data and even mouse clicks when users accessed web pages.
Open source solutions such as Nagios can monitor all kinds of elements within a system, including protocols, operating systems, and even web servers. Centralized logging and dashboard monitors all help engineers understand what’s going on within the infrastructure. But these are incredibly complex solutions and take a great deal of time to master.
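To give a flavour of how such checks work under the hood, Nagios plugins conventionally report status through exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). A minimal sketch of that threshold logic follows; the metric and threshold values are purely illustrative:

```python
# Nagios-style check sketch: classify a measured value against
# warning and critical thresholds (higher value = worse).
# Plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2

def classify(value, warn, crit):
    """Return a Nagios-style status code for a metric such as
    link utilisation or disk usage, expressed as a percentage."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK

# Illustrative example: a link running at 85% utilisation,
# with warning at 80% and critical at 95%.
status = classify(85, warn=80, crit=95)
print(status)  # → 1 (WARNING)
```

A dashboard or central logger would then aggregate these per-check results across hundreds of hosts, which is where the real complexity lives.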
Traditional broadcasting has always been about point-to-point connections. Most signals have their own dedicated cabling, and the ones that don’t are time-division-multiplexed to provide them with a guaranteed path, so there is almost no data loss.
I believe the key difference between broadcast and IT monitoring is the contended mesh-network approach of IT compared to the point-to-point distribution system within broadcast.
Virtualization also demands that monitoring plays a key role. Applications and even entire server instances can move around hardware as they are enabled and disabled. An instance of a server running a compression engine can easily be deleted when not required, only to be spun up again as business demands. In true virtualization, the instance can move around hardware server clusters each time it is created and deleted.
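The monitoring consequence of this mobility can be illustrated with a toy sketch. The `Cluster` class and host names below are hypothetical, not any real orchestration API; the point is simply that recreating an instance may place it on a different host, so monitoring has to track placement rather than assume it:

```python
import random

class Cluster:
    """Toy scheduler: each time an instance is (re)created it may
    land on a different host, so its location cannot be assumed."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.placements = {}  # instance name -> current host

    def spin_up(self, name):
        """Create the instance on a host chosen by the scheduler."""
        host = random.choice(self.hosts)
        self.placements[name] = host
        return host

    def delete(self, name):
        """Remove the instance; its placement record disappears too."""
        self.placements.pop(name, None)

cluster = Cluster(["host-a", "host-b", "host-c"])

# A compression engine is created, deleted, then created again
# as business demand rises and falls:
first_host = cluster.spin_up("compression-engine")
cluster.delete("compression-engine")
second_host = cluster.spin_up("compression-engine")
# first_host and second_host may differ -- the "same" service
# has moved, and only the monitoring system can say where it is now.
```

This is why IT monitoring leans so heavily on centralized logging: the log stream is often the only durable record of where a workload actually ran.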
Consequently, finding a fault in a virtualized networked system can be a complex and daunting task. Finding constants when variables are everywhere further adds to the challenge, and identifying stream errors when the packets take different routes adds apparent randomness to the equation.
Contrast this with traditional SDI broadcast infrastructures. Each cable does a specific job and is responsible for delivering well-defined data. Every component provides a decisive function such as vision mixing or signal distribution. And the whole system can be visually laid out in diagrams allowing the engineer to follow signals through the system.
Nothing could be further from the truth with IT networks. And broadcasters transitioning to IP have the added challenge of supporting existing SDI systems as well as moving to IT networks.
Building effective monitoring and logging is crucial to maintaining reliable systems and engineering sanity. We can learn a great deal from IT and we should think very carefully about adopting their monitoring philosophy and methodologies as we transition and migrate to IP.
Tony Orme, editor