BCE Going Deeper - Part 3 - Debugging IP

At the start of 2013, BCE at RTL City was a hole in the ground in Luxembourg; less than four years later it was on air, broadcasting 35 different channels across Europe and Singapore. Costas Colombus, BCE’s Technology Projects and Support Director, gave The Broadcast Bridge a unique insight into how they made this mammoth installation work, describing the issues they met and how they overcame them along the way.

In this third article in the series we look at the challenges that occur in IP networks, how to detect them, and the network tools needed to fix them.

BCE decided to use a fully redundant system consisting of two routers: Juniper and Arista. During the installation phase, Costas detected a network problem common to both IP routers: video feeds would randomly break up for no obvious reason.

To introduce video into the network, BCE’s principal technology contractor, SAM, provided video gateways: devices that convert HD-SDI to ST-2022 IP and aggregate many sources onto one 40Gb/s fiber. Video disturbances in one of the gateway sources could manifest themselves as dropped packets in the IP domain.

Data mining tools from the IT industry helped log events and home in on the source of the problems. BCE chose Paessler’s PRTG network monitoring solution to give a deep analysis of the network, including lost packets and data rates. Network bandwidth monitoring is achieved either with remote hardware devices or by extracting statistics using SNMP and sFlow.

But why are flow monitoring and control so important? Intuitively, we might think that adding a high-speed link between two networks would improve transmission speeds. However, making every link in a network as fast as possible would increase costs disproportionately and negate the savings gained from using commercial off-the-shelf (COTS) products. Moreover, Braess’s paradox demonstrates that adding high-speed links between routers doesn’t always increase speeds and could even decrease them. This is counterintuitive and is the subject of a later article.
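As a rough illustration of the paradox, the Python sketch below works through the classic textbook example. Two routes each combine one link whose delay grows with load and one link with a fixed delay. All names and numbers are hypothetical, chosen only to make the arithmetic easy to follow, and are not taken from BCE’s network.

```python
# Textbook illustration of Braess's paradox. All numbers are hypothetical.
FLOWS = 4000        # flows travelling from source S to destination D
FIXED_DELAY = 45.0  # delay of a link that is unaffected by load


def loaded_delay(flows_on_link):
    """Delay of a link that slows down in proportion to its load."""
    return flows_on_link / 100.0


def delays_without_shortcut(flows_via_a):
    """Per-flow delay on routes S-A-D and S-B-D when A and B are not linked."""
    flows_via_b = FLOWS - flows_via_a
    via_a = loaded_delay(flows_via_a) + FIXED_DELAY  # S-A is load-dependent
    via_b = FIXED_DELAY + loaded_delay(flows_via_b)  # B-D is load-dependent
    return via_a, via_b


def delay_with_shortcut():
    """Per-flow delay after adding a zero-delay link from A to B.

    S-A never costs more than 4000 / 100 = 40, which beats the fixed 45 of
    S-B, and the same holds for B-D against A-D. Every flow acting in its
    own interest therefore picks S-A-B-D, and both loaded links fill up.
    """
    return loaded_delay(FLOWS) + 0.0 + loaded_delay(FLOWS)


if __name__ == "__main__":
    print(delays_without_shortcut(FLOWS / 2))  # even split, both routes equal
    print(delay_with_shortcut())
```

With an even split and no shortcut, each flow sees a delay of 65 units. Once the "free" A-B link is added, every flow sees 80 units, even though capacity has only been added to the network.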

Diagram 1 – Two magnified cross-section photographs of a fiber. The one on the left is clean; the one on the right is the same channel after it has been touched very briefly.

Arista and Juniper utilize sFlow to continuously sample and monitor application-level traffic at wire speed, simultaneously on all interfaces of their routers. sFlow distinguishes between monitoring on the wire and monitoring at the protocol level, an important distinction as the two can often give different measurements depending on the protocol being used. For example, UDP will give much faster protocol speeds than TCP, even if the wire speed is the same.
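sFlow agents typically sample one packet in every N, so traffic totals have to be scaled up from the samples. The Python sketch below shows that arithmetic, together with the commonly quoted sFlow rule of thumb that the percentage error is at most 196 × sqrt(1/c) for c samples, at roughly 95% confidence. The function names, sampling rate and sample counts are hypothetical and are not taken from BCE’s configuration.

```python
import math


def estimate_total_packets(samples_in_class, sampling_rate):
    """Scale a sample count up to an estimated packet total.

    sampling_rate is N in "sample 1 packet in every N".
    """
    return samples_in_class * sampling_rate


def percent_error_bound(samples_in_class):
    """Approximate upper bound on the estimate's error, in percent.

    Rule of thumb: error <= 196 * sqrt(1 / c) for c samples, at roughly
    95% confidence. Accuracy depends on the number of samples collected,
    not on the total volume of traffic.
    """
    if samples_in_class <= 0:
        raise ValueError("at least one sample is needed")
    return 196.0 * math.sqrt(1.0 / samples_in_class)


if __name__ == "__main__":
    samples = 10_000  # hypothetical samples seen for one video flow
    rate = 1_000      # hypothetical 1-in-1000 sampling
    print(estimate_total_packets(samples, rate))
    print(round(percent_error_bound(samples), 2))
```

The bound shrinks as more samples accumulate, which is one reason a short sampling window can show a noticeably different rate from a long one on the same link.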

The sFlow specification claims that sampling and monitoring do not impact router performance. Oscilloscopes use high-impedance probes to monitor audio and video systems, so we can be sure the signal is not being loaded or influenced by the measuring device. The same assumption cannot always be made in networks, as network interface cards in servers and PCs buffer incoming and outgoing packets, resulting in critical timing information being lost.

BCE developed their own monitoring software to present the information in a coherent form. Different vendors provide their own monitoring solutions, and a common system was needed to help BCE’s maintenance teams diagnose issues quickly and proactively. The amount of sampled data available is overwhelming, and identifying which attributes to monitor is a full-time job. False positives waste precious time and can be severely detrimental to the smooth operation of a network.

The deep data mining and analysis used by BCE is often found only in high-performance systems such as those used by Google and Amazon, due to the data speeds involved and the level of understanding needed by the engineering teams. It also allows automated systems to detect patterns of behavior that are inconsistent with normal operation and flag potential issues to maintenance teams before a failure occurs. BCE’s network operations centers have fine-tuned this approach: they monitor video, audio and metadata flows 24/7 and continuously record data for later analysis should an intermittent problem develop.
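One simple way to flag behavior that departs from normal operation is to compare each new measurement with a rolling baseline of recent ones. The Python sketch below is only an illustration of that idea, not BCE’s software; the function name, window size, threshold and sample values are all assumptions.

```python
from statistics import mean, pstdev


def flag_anomalies(samples, window=12, threshold=3.0):
    """Return the indexes of samples far above the recent baseline.

    Each sample is compared with the mean and standard deviation of the
    `window` samples immediately before it.
    """
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline does not
        # turn every tiny change into an alarm.
        sigma = max(pstdev(baseline), 1.0)
        if samples[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical dropped-packet counts per polling interval on one port.
    drops = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11, 10, 12, 11, 250, 10]
    print(flag_anomalies(drops))
```

A larger threshold reduces false alarms at the cost of reacting later, which is the kind of trade-off that tuning a monitoring system involves.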

Using these network monitoring tools connected to the real-time, wire-speed monitoring systems in the Arista and Juniper routers, BCE’s maintenance teams were able to record and analyze network speed measurements and reported errors. They noticed tens of thousands of lost packets each day on many of the router QSFP ports.

The amount of historic data gathered allowed engineers to focus on the QSFPs, and they proved that switching from white-label units to manufacturer-approved units reduced the dropped packets from tens of thousands a day to tens of packets a day, and sometimes even zero.
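Daily loss figures like these are usually derived from interface counters that only ever increase, such as the 32-bit discard and error counters exposed over SNMP. The Python sketch below shows one way to turn successive counter readings into a total, allowing for the counter wrapping back to zero. It assumes at most one wrap between polls and no counter reset, and the readings are invented for illustration.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps to zero here


def counter_delta(previous, current):
    """Increase between two readings of a 32-bit counter.

    The modulo handles a single wrap between polls. A device reboot that
    resets the counter would need to be detected separately.
    """
    return (current - previous) % COUNTER32_MODULUS


def total_drops(readings):
    """Sum the increases across a sequence of successive readings."""
    return sum(counter_delta(a, b) for a, b in zip(readings, readings[1:]))


if __name__ == "__main__":
    # Hypothetical readings of one port's discard counter; the last poll
    # happens after the counter has wrapped past 2**32.
    polls = [4294967000, 4294967200, 4294967290, 5]
    print(total_drops(polls))
```

Keeping the per-poll deltas, not just the running total, is what makes before-and-after comparisons such as the QSFP swap possible.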

Diagram 2 – BCE’s bespoke software showing video over IP monitoring.

BCE found that another contributing factor to packet loss can be dirty fiber interfaces, as highlighted in The Broadcast Bridge Essential Guide on Fiber Optics in Production. Dust and grit are the enemies of fiber and this was particularly evident for BCE as their building work continued. Even though construction was taking place far away from the fiber installation, the smallest amount of dust could contaminate the interface. To rectify this, BCE dedicated one specific team to clean and make fiber connections to guarantee the consistency of work.

Historically, engineers working in the SDI domain would only need to deal with the physical and application layers of the ISO 7-layer model, but as we migrate to IP, the need to understand the other five layers soon becomes apparent.

In this series of three articles (Debugging IP; Cable, Standards and ITIL; and Choosing Routers) we’ve witnessed at first hand the importance of working closely with IT engineers to make IP-media systems operate effectively and reliably. One person cannot possibly hope to understand all aspects of the ISO 7-layer model to the depth needed for IP migration, so collaboration between broadcast and IT engineers is key to solving problems, even those that manufacturers are unaware of.
