BCE Going Deeper - Part 3 - Debugging IP

At the start of 2013, BCE at RTL City was a hole in the ground in Luxembourg; less than four years later they were on air, broadcasting 35 different channels across Europe and Singapore. Costas Colombus, BCE’s Technology Projects and Support Director, gave The Broadcast Bridge a unique insight into how they made this mammoth installation work, describing the issues they met and how they overcame them along the way.

In this third article in the series we look at the challenges that occur in IP networks, how to detect them, and the network tools needed to fix them.

BCE decided to use a fully redundant system consisting of two routers, Juniper and Arista. During the installation phase, Costas detected a network problem common to both IP routers: video feeds would randomly break up for no obvious reason.

To introduce video into the network, BCE’s principal technology contractor, SAM, provided video gateways: devices that convert HD-SDI to ST-2022 IP and aggregate many sources onto one 40Gb/s fiber. Video disturbances in one of the gateway sources could manifest themselves as dropped packets in the IP domain.
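ST 2022 video streams are carried in RTP packets, each stamped with a 16-bit sequence number, so a receiver can count dropped packets by looking for gaps in that sequence. The following is a minimal sketch of the idea, not BCE’s or SAM’s implementation; the function name is hypothetical and it assumes packets arrive in order with no duplicates.

```python
def count_lost_packets(sequence_numbers):
    """Count missing packets from received 16-bit RTP sequence numbers.

    Assumption: packets arrive in order and without duplicates. A real
    receiver would also have to cope with reordering and duplication.
    """
    lost = 0
    previous = None
    for seq in sequence_numbers:
        if previous is not None:
            # Distance from the previous packet, modulo 2**16 so that the
            # wrap from 65535 back to 0 is handled correctly.
            gap = (seq - previous) % 65536
            lost += gap - 1  # a gap of 1 means nothing was lost
        previous = seq
    return lost


if __name__ == "__main__":
    # Packet 65535 is missing across the counter wrap.
    print(count_lost_packets([65533, 65534, 0, 1]))
```

Counting at the receiver like this complements the router port counters, because it shows which individual stream was affected rather than only which port.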

Data mining tools from the IT industry helped log events and home in on the source of the problems. BCE chose Paessler’s PRTG network monitoring solution to give a deep analysis of the network, including lost packets and data rates. Network bandwidth monitoring is achieved using remote hardware devices, or by extracting statistics using SNMP and sFlow.
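SNMP exposes interface traffic as ever-increasing octet counters (for example ifInOctets, which is 32 bits wide, and ifHCInOctets, which is 64 bits wide), so a data rate has to be derived from two readings taken a known interval apart. The sketch below shows that arithmetic only. It is an illustration under stated assumptions, not a description of how PRTG works internally, and the function name and the sample readings are hypothetical.

```python
def bits_per_second(octets_start, octets_end, interval_seconds, counter_bits=64):
    """Average bit rate between two readings of an interface octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Counters wrap to zero at 2**counter_bits, so take the difference
    # modulo that value. This assumes at most one wrap between readings.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 / interval_seconds


if __name__ == "__main__":
    # Hypothetical readings taken 10 seconds apart on a 64-bit counter.
    rate = bits_per_second(0, 1_250_000_000, 10)
    print(f"{rate / 1e9:.2f} Gb/s")
```

On fast links the 32-bit counter can wrap more than once between polls, which is why the 64-bit counters are normally preferred for high-speed interfaces.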

But why is flow monitoring and control so important? Intuitively, we might think that adding a high-speed link between two networks would improve transmission speeds. However, making every link in a network as fast as possible would increase costs disproportionately and negate the gains we make by using commercial off-the-shelf (COTS) products. Furthermore, Braess’s paradox demonstrates that adding high-speed links between routers doesn’t always increase speeds and could even decrease them. This is counterintuitive and is the subject of a later article.
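The paradox is easiest to see in its textbook road-traffic form, which the sketch below works through. The figures (4000 drivers, a fixed 45-minute leg, and a congestible leg costing drivers/100 minutes) are the standard illustrative values, not measurements from BCE’s network, and the function names are hypothetical. It is an analogy for selfish routing rather than a model of how IP routers forward packets.

```python
def route_time_without_link(drivers_on_route):
    """Travel time on one of two symmetric routes, in minutes.

    Each route has one congestible leg (drivers / 100 minutes) followed
    or preceded by one fixed 45-minute leg.
    """
    return drivers_on_route / 100 + 45


def route_time_with_link(total_drivers):
    """Travel time once a zero-cost shortcut joins the two congestible legs.

    Every driver individually does best by using both congestible legs,
    so all of the traffic ends up sharing them.
    """
    return total_drivers / 100 + 0 + total_drivers / 100


if __name__ == "__main__":
    total = 4000
    before = route_time_without_link(total / 2)  # traffic splits evenly
    after = route_time_with_link(total)          # everyone takes the shortcut
    print(f"without shortcut: {before:.0f} min, with shortcut: {after:.0f} min")
```

With the shortcut in place nobody can improve their own journey by switching back (the old route would now take 40 + 45 = 85 minutes), yet everyone is slower than before the link was added.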

Diagram 1 – These two images are magnified cross-section photographs of a fiber; the one on the left is clean and the one on the right is the same fiber after it has been touched very briefly.

Arista and Juniper utilize sFlow to continuously sample and monitor application-level traffic at wire speed, simultaneously on all interfaces of their routers. sFlow distinguishes between monitoring on the wire and monitoring at the protocol level, an important distinction as the two can often give different measurements depending on the protocol being used. For example, UDP will usually deliver a higher protocol-level data rate than TCP, even if the wire speed is the same, because TCP adds acknowledgments and flow control.
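Because sFlow samples roughly one packet in every N rather than capturing everything, totals have to be scaled up from the samples, and the accuracy depends on how many samples were collected. The sketch below uses the approximation commonly quoted in the sFlow sampling literature, where the error at 95% confidence is at most 196 × sqrt(1/c) percent for c samples in a traffic class. The function name and example figures are hypothetical.

```python
import math


def estimate_from_samples(samples_in_class, sampling_rate):
    """Scale a sampled packet count up to an estimated total.

    samples_in_class: number of sampled packets seen for one traffic class
    sampling_rate:    N, where the device samples about 1 in N packets

    Returns (estimated_packets, percent_error), where percent_error is the
    approximate upper bound on the relative error at 95% confidence.
    """
    if samples_in_class <= 0:
        raise ValueError("need at least one sample to form an estimate")
    estimated_packets = samples_in_class * sampling_rate
    percent_error = 196 * math.sqrt(1 / samples_in_class)
    return estimated_packets, percent_error


if __name__ == "__main__":
    # Hypothetical: 10,000 samples of one video flow at 1-in-1000 sampling.
    total, error = estimate_from_samples(10_000, 1000)
    print(f"about {total} packets, within roughly {error:.2f}%")
```

Note that the error bound depends on the number of samples and not on the sampling rate itself, so busy flows are measured accurately even at sparse sampling rates.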

The sFlow specification claims that sampling and monitoring does not impact router performance. Oscilloscopes use high-impedance probes to monitor audio and video systems, so we can be sure the signal is not being loaded or influenced by the measuring device. The same assumption cannot always be made in networks, as network interface cards on servers and PCs buffer incoming and outgoing packets, resulting in critical timing information being lost.

BCE developed their own monitoring software to present the information in a coherent form. Different vendors provide their own monitoring solutions, and a common system was needed to assist BCE’s maintenance teams in diagnosing issues quickly and proactively. The amount of sampled data available is overwhelming and identifying which attributes to monitor is a full-time job. False positives waste precious time and can be severely detrimental to the smooth operation of a network.

The deep data mining and analysis used by BCE is often only found in high performance systems such as those used by Google and Amazon due to the data speeds involved and the level of understanding needed by the engineering teams. It also allows automated systems to intelligently detect patterns of behavior that are inconsistent with normal operation and flag potential issues to maintenance teams before a failure occurs. BCE’s network operations centers have fine-tuned this requirement and monitor video, audio and metadata flows 24/7, and continuously record data for later analysis should an intermittent problem develop.
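The article does not describe the detection methods BCE uses, but the general idea of flagging behavior that is inconsistent with normal operation can be illustrated with a simple rolling baseline. The sketch below flags any reading that sits more than a chosen number of standard deviations away from the readings just before it. The function name, window size, threshold and sample data are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(values, window=5, threshold=3.0):
    """Return the indexes of readings that deviate sharply from recent history.

    Each reading is compared with the mean and standard deviation of the
    `window` readings immediately before it.
    """
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is unusual.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical dropped-packet counts per polling interval on one port.
    drops = [10, 12, 11, 9, 10, 11, 500, 10]
    print(flag_anomalies(drops))
```

A production system would need something more robust, since a single spike inflates the baseline for the readings that follow it, but the principle of comparing live data against recorded history is the same.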

Using these network monitoring tools connected to real time wire-speed router monitoring systems in Arista and Juniper, BCE’s maintenance teams were able to record and analyze network speed measurements and reported errors. They noticed tens of thousands of network packet losses each day on many of the router QSFP ports.

The amount of historic data gathered allowed engineers to focus on the QSFPs, and they proved that switching them from white-label units to manufacturer-approved units reduced the dropped packets from tens of thousands a day to tens of packets a day, and sometimes even zero.

Diagram 2 – BCE’s bespoke software showing video over IP monitoring.

BCE found that another contributing factor to packet loss can be dirty fiber interfaces, as highlighted in The Broadcast Bridge Essential Guide on Fiber Optics in Production. Dust and grit are the enemies of fiber and this was particularly evident for BCE as their building work continued. Even though construction was taking place far away from the fiber installation, the smallest amount of dust could contaminate the interface. To rectify this, BCE dedicated one specific team to clean and make fiber connections to guarantee the consistency of work.

Historically, engineers working in the SDI domain would only need to deal with the physical and application layers of the ISO 7-layer model, but as we migrate to IP the need to understand the other five layers soon becomes apparent.

In this series of three articles (Debugging IP; Cable, Standards and ITIL; and Choosing Routers) we’ve witnessed at first hand the importance of working closely with IT engineers to make IP-media systems operate effectively and reliably. One person cannot possibly hope to understand all aspects of the ISO 7-layer model to the depth needed for IP migration, so collaboration between broadcast and IT engineers is key to solving problems, even those that manufacturers are unaware of.
