BCE Going Deeper - Part 3 - Debugging IP

At the start of 2013, BCE at RTL City was a hole in Luxembourg’s ground and in less than four years they were on air broadcasting 35 different channels across Europe and Singapore. Costas Colombus is BCE’s Technology Projects and Support Director and gave The Broadcast Bridge a unique insight into how they made this mammoth installation work, including describing the issues and how they overcame them along the way.

In this third article in the series we look at the challenges that occur in IP networks, how to detect them, and the network tools needed to fix them.

BCE decided to use a fully redundant system consisting of two routers; Juniper and Arista. During the installation phase, Costas detected problems with the network common to both IP routers; randomly, video feeds would break up with no obvious reason.

To introduce video into the network, BCE’s principle technology contractors SAM, provided video gateways, devices to convert HD-SDI to ST-2022 IP and aggregate many sources onto one 40GB/s fiber. Video disturbances in one of the gateway sources could manifest themselves as dropped packets in the IP domain.

Data mining tools from the IT industry helped log events and hone in on the source of the problems. BCE chose Paessler’s PRTG network monitoring solution to give a deep analysis of the network including lost packets and data rates. Network bandwidth monitoring is achieved using remote hardware devices, or extracting statistics using SNMP and sFlow.

But why is flow monitoring and control so important? Intuitively, we might think that adding a high-speed link between two networks would improve transmission speeds. However, making every link in a network as fast as possible would increase costs disproportionately and negate our wins using Commercial off-the-shelf (COTs) products. And the Braess Paradox demonstrates that adding high speed links between routers doesn’t always increase speeds and could even decrease them. This is counterintuitive and is the subject of a later article.

Diagram 1 – These two diagrams are cross-section magnified photographs of a fiber, the one on the left is clean and the one on the right is the same channel after it’s been touched very quickly.

Arista and Juniper utilize sFlow to continuously sample and monitor application level traffic at wire-speed simultaneously on all interfaces of their routers. It makes the distinction of monitoring between on the wire, and at the protocol level - an important distinction as the two can often give different measurements depending on the protocol being used. For example, UDP will give much faster protocol speeds then TCP, even if the wire-speed is the same.

The sFlow specification claims that sampling and monitoring does not impact on router performance. Oscilloscopes use high impedance probes to monitor audio and video systems, so we can be sure the signal is not being loaded or influenced by the measuring device. The same assumption cannot always be made in networks as network interface cards on servers and PC’s buffer incoming and outgoing packets resulting in critical timing information being lost.

BCE developed their own monitoring software to present the information in a coherent form, different vendors provide their own monitoring solutions and a common system was needed to assist BCE’s maintenance teams in diagnosing issues quickly and proactively. The amount of sampled data available is overwhelming and identifying which attributes to monitor is a full-time job. False negatives waste precious time and can be severely detrimental to the smooth operation of a network.

The deep data mining and analysis used by BCE is often only found in high performance systems such as those used by Google and Amazon due to the data speeds involved and the level of understanding needed by the engineering teams. It also allows automated systems to intelligently detect patterns of behavior that are inconsistent with normal operation and flag potential issues to maintenance teams before a failure occurs. BCE’s network operations centers have fine-tuned this requirement and monitor video, audio and metadata flows 24/7, and continuously record data for later analysis should an intermittent problem develop.

Using these network monitoring tools connected to real time wire-speed router monitoring systems in Arista and Juniper, BCE’s maintenance teams were able to record and analyze network speed measurements and reported errors. They noticed tens of thousands of network packet losses each day on many of the router QSFP ports.

The amount of historic data gathered allowed engineers to focus on the QSFP’s and they proved switching them from white-label units to the manufacturer approved units reduced the dropped packets from tens of thousands a day, to tens of packets a day, and sometimes even zero.

Diagram 2 – BCE’s bespoke software showing video over IP monitoring.

BCE found that another contributing factor to packet loss can be dirty fiber interfaces, as highlighted in The Broadcast Bridge Essential Guide on Fiber Optics in Production. Dust and grit are the enemies of fiber and this was particularly evident for BCE as their building work continued. Even though construction was taking place far away from the fiber installation, the smallest amount of dust could contaminate the interface. To rectify this, BCE dedicated one specific team to clean and make fiber connections to guarantee the consistency of work.

Historically, engineers working in the SDI domain would only need to deal with the physical and application layers of the ISO 7-Layer model, but as we migrate to IP the need to understand the other five layers soon becomes apparent.

In this series of three articles - Debugging IP, Cable, Standards and ITIL, and Choosing Routers, we’ve witnessed at first hand the importance of working closely with IT engineers to make IP-media systems operate effectively and reliably. One person cannot possibly hope to understand all aspects of the ISO 7-layer model to the depth needed for IP migration, so collaboration between broadcast and IT engineers is key to solving problems, even those manufacturers are unaware of. 

Let us know what you think…

Log-in or Register for free to post comments…

You might also like...

Applied Technology: EditShare QScan is an Integrated Approach to Content Compliance

Content that does not match broadcast or VOD delivery specifications can have a detrimental impact on any media facility. Non-compliant content is likely to be rejected by the service provider requiring the producer of that content to make the necessary…

Articles You May Have Missed – April 18, 2018

The increase of IP technology, now merged into broadcast links, requires new test equipment. This article looks at RF IP probes and video monitors.

Understanding IP Networks - Production Crews and IP - Part 22

Broadcast television is the point where the creative arts and technology meet. It’s different from any other discipline as to operate at an optimum level, and get the best possible quality, artisans, producers, and creatives have a deeper technical u…

2018 Logging and Compliance Trends - Part 2

Two types of people visit NAB Show exhibits, engineers and not-engineers. The not-engineers fly into Las Vegas, have some meetings and wander the exhibits for a day or two. Then there are the engineers who visit the show with specific…

2018 Logging and Compliance Trends - Part 1

It seems every year brings more broadcasting activities to log and archive, and more laws and regulations to comply with. The 2018 NAB Show is the only place in the world where all the available new technology to automatically log and…