Test, QC & Monitoring Global Viewpoint – May 2023

Ground-to-Cloud Flows

Broadcasters make unique demands on IP networks because of the high-impact nature of media flows, for both baseband and compressed video. We normally have the luxury of working within managed networks, but this isn’t always the case.

Media exchange for broadcasting uses IP in two different ways: UDP/IP and TCP/IP. In the context of media flows, RTP runs on top of UDP/IP and is generally used as a fire-and-forget protocol, which keeps latency very low. The challenge with this method is that lost packets stay lost, as there is no re-send mechanism. Standards such as ST 2110 allow for FEC (forward error correction) so that some corruption or loss can be rectified, but even then the network must be of very high quality and built on non-blocking switches.
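
To make the FEC idea concrete, here is a minimal Python sketch of the simplest possible scheme: one XOR parity packet protecting a group of equal-length packets. It illustrates the principle only and is not the FEC arrangement of any particular standard; the function names are our own.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet covering a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild a group in which at most one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    result = list(received)
    if missing:
        # XOR of the parity with every surviving packet yields the lost one.
        survivors = [p for p in received if p is not None]
        result[missing[0]] = reduce(xor_bytes, survivors, parity)
    return result
```

A second loss in the same group cannot be repaired, which is why FEC helps with occasional loss but does not remove the need for a high-quality network.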

TCP/IP, on the other hand, was designed to cope with lost and out-of-sequence packets in the network, but the price we pay for guaranteed delivery is variable latency, something we don’t really want in the studio environment.

Although RTP/UDP and TCP generally work harmoniously together in the studio, mainly due to the spine-leaf topology and non-blocking switches, this relationship cannot always be assumed, especially when we need to move media flows outside of the studio. One example of this is when we’re processing in the public cloud and have to employ ground-to-cloud designs which require the video and audio streams to be “internet friendly”.

Unless you want to spend massive amounts of money on vendor-specific, high-capacity network pipes into the cloud, the video and audio will not only need to be compressed but will also need to be sent using TCP or an ARQ (Automatic Repeat reQuest) method. ARQ can be thought of as a stripped-down version of TCP, as it runs over UDP with a data-ACK arrangement. That is, a group of packets is sent and, once they are received, the receiver returns an ACK (acknowledge) packet so that the next group can be sent; packets that go unacknowledged are sent again. This has the benefit of recovering lost packets and automatically throttling the send rate to make the best use of the link bandwidth.
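
The data-ACK arrangement can be sketched as a small simulation. This is a toy model under stated assumptions, not any real ARQ protocol: the lossy link is a seeded random number generator, the ACK path is assumed to be lossless, rate throttling is not modelled, and `send_with_arq` and its parameters are hypothetical names.

```python
import random
from typing import Dict, List, Tuple


def send_with_arq(payloads: List[bytes], window: int = 4,
                  loss_rate: float = 0.2, seed: int = 1) -> Tuple[List[bytes], int]:
    """Deliver payloads over a simulated lossy link using group ACKs.

    Returns the payloads as reassembled by the receiver and the total
    number of packet transmissions, retransmissions included.
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in the range [0, 1)")
    rng = random.Random(seed)          # seeded so the run is repeatable
    received: Dict[int, bytes] = {}    # receiver side: sequence number -> payload
    transmissions = 0

    for start in range(0, len(payloads), window):
        pending = list(range(start, min(start + window, len(payloads))))
        while pending:
            for seq in pending:
                transmissions += 1
                if rng.random() >= loss_rate:   # packet survived the link
                    received[seq] = payloads[seq]
            # The ACK reports what arrived; anything unacknowledged is resent.
            pending = [seq for seq in pending if seq not in received]

    return [received[i] for i in range(len(payloads))], transmissions
```

Every payload eventually arrives, but the transmission count grows with the loss rate, and each retransmission round adds delay. That is the latency cost of recovering lost packets.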

Whether the broadcaster uses ARQ or TCP to stream their video and audio into their cloud servers, they will inevitably end up combining the media traffic with all the other auxiliary information that is sent over the same link, such as remote-control data, security keys, and monitoring information. Because of the high-impact nature of the media flows, this can have unintended consequences for the auxiliary data, one of which is increased latency.

Media flows tend to use large amounts of bandwidth over long periods of time. As the data entering the network will be asynchronous, buffers in the network switches are often employed to reduce the risk of packet loss. However, when bursty auxiliary data is also sent on the same link, the buffers in the switches have the potential to overflow, resulting in lost packets and increased latency, especially when employing TCP.
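
A toy queue model shows the effect. The sketch below assumes a single FIFO egress buffer with tail drop; the packet counts, buffer capacity and drain rate are made-up figures chosen only to illustrate the behaviour.

```python
from typing import Iterable, Tuple


def simulate_buffer(arrivals: Iterable[int], capacity: int,
                    drain_per_tick: int) -> Tuple[int, int]:
    """Run a FIFO egress buffer, one tick per entry in arrivals.

    arrivals        packets arriving in each tick
    capacity        packets the buffer can hold
    drain_per_tick  packets the link can transmit per tick
    Returns (packets dropped, peak queue depth).
    """
    depth = dropped = peak = 0
    for count in arrivals:
        depth += count
        if depth > capacity:            # tail drop: the excess is discarded
            dropped += depth - capacity
            depth = capacity
        peak = max(peak, depth)
        depth = max(0, depth - drain_per_tick)
    return dropped, peak


# Hypothetical traffic: a steady media flow at 90% of link capacity,
# plus an auxiliary burst of 40 packets every 25 ticks.
media = [9] * 100
bursts = [40 if t % 25 == 0 else 0 for t in range(100)]
combined = [m + b for m, b in zip(media, bursts)]

media_only = simulate_buffer(media, capacity=32, drain_per_tick=10)
with_bursts = simulate_buffer(combined, capacity=32, drain_per_tick=10)
```

With the media flow alone, the buffer never fills and nothing is dropped. Adding the bursts fills the buffer to capacity and forces drops, and because the media flow leaves so little headroom, the queue takes many ticks to drain, so every packet behind it waits longer.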

One solution to this is to send the media flows and bursty auxiliary flows across two separate links into the cloud to reduce the risk of buffer overflow in the network switches and hence packet loss and increased latency. Although we could increase the size of the buffers to negate the need for two separate links, doing so would significantly increase latency, as packets would spend longer queued in the larger buffers.
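
The latency cost of a bigger buffer is simple arithmetic: a packet arriving behind a full buffer must wait for every byte ahead of it to be transmitted. The sketch below uses hypothetical buffer sizes and a hypothetical 1 Gbit/s link.

```python
def max_queuing_delay_ms(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Worst-case time a packet waits behind a completely full buffer."""
    return buffer_bytes * 8 / link_bits_per_second * 1000


# Hypothetical figures: a 1 Gbit/s link with three candidate buffer sizes.
for size in (1_000_000, 10_000_000, 100_000_000):
    delay = max_queuing_delay_ms(size, 1e9)
    print(f"{size / 1e6:>5.0f} MB buffer -> up to {delay:.0f} ms of added latency")
```

On these assumed figures the worst-case delay scales linearly with buffer size, from 8 ms at 1 MB to 800 ms at 100 MB.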

As broadcasters continue their IP journey, we will find all kinds of challenges that need to be overcome. In the network context we will need to dig deep into the media and auxiliary flows to understand how they are being impacted by the buffers in switches and at other points within the signal path, such as NICs, especially when we stress cloud hybrid systems through dynamic scaling and maintaining resilience. Wireshark and Python are your best friends!
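
As a starting point for that kind of investigation, the sketch below summarises the gaps between packet arrival times for a single flow. It assumes the timestamps (in seconds) have already been exported from a Wireshark capture into a Python list; the export step is not shown and the function name is our own.

```python
from statistics import mean
from typing import Dict, List


def interarrival_stats(timestamps: List[float]) -> Dict[str, float]:
    """Summarise the gaps between consecutive packet arrival times (seconds)."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        "mean": mean(gaps),
        "min": min(gaps),
        "max": max(gaps),
        "spread": max(gaps) - min(gaps),   # a crude indicator of buffering upstream
    }
```

A maximum gap far above the mean suggests packets were held in a buffer somewhere along the path, which is a cue to look more closely at that part of the capture.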