Designing IP Broadcast Systems: Thinking Asynchronously

Broadcast engineers and technologists must think asynchronously if they are to deliver reliable studio IP infrastructures.

SMPTE’s ST2110 suite of standards to facilitate video, audio, and metadata streaming over IP is a game changer for broadcasters. Not just because we can use IP in an integrated and open manner, but also because SMPTE have succeeded in abstracting the media essence away from the underlying transport stream. And it is this success that is empowering broadcasters to fully adopt IP for real-time media distribution within broadcast networks.

Although IP is one specific standard, broadcasters use it in two different ways depending on the application. In OTT, the media streams are predominantly delivered over the internet which means that the methodologies adopted must comply with internet standards. The internet has its history in delivering web page content to users resulting in our need to comply with HTTP/TCP/IP stacks. Conversely, a broadcaster has much more flexibility when working within their own datacenter and therefore has more scope to use different protocols such as RTP/UDP/IP.

Latency – Being Realistic

One of the key challenges facing broadcasters is keeping latency as low as possible. In SDI and AES infrastructures, latency was a function of the combined propagation delays of the components and the transport medium, such as coaxial cable and fiber, and amounted to only a few nanoseconds. Furthermore, SDI and AES are circuit-switched technologies, so system designers did not have to be concerned with the congestion seen in IP packet-switched networks. However, variable and unpredictable latency is at the heart of any packet-switched system.

Variable and unpredictable are two attributes of any system that broadcasters fear, and with good reason. We can easily become obsessed with making latency “as low as possible”, to the point where the pursuit becomes overwhelming and works to the detriment of the design. Instead, we should take a more holistic view of latency and accept that it exists, without spending disproportionate amounts of time trying to achieve the impossible. After all, the only way to make the latencies in an IP network the same as those in an SDI/AES network is to make IP synchronous, which clearly isn’t going to happen.

There is also the need to apply application-specific latency tolerances. For example, within a broadcast studio infrastructure the latency may need to be less than a few frames of video, that is, in the order of 20 to 30ms, but for OTT media file delivery it could easily be several seconds, although less than a second is desirable for live OTT (albeit not yet achievable). In other words, choose a realistic latency tolerance that suits the application and its workflow.

Timing Is Everything

IP is fundamentally asynchronous, unlike SDI and AES, which are synchronous. Also, just to add to the challenge, IP does not guarantee the delivery of packets or their order. This might seem like a shortcoming in IP, but the original designers accepted these limitations so that IP could be as flexible as possible. TCP solves this problem by building a protocol layer on top of IP to reorder any out-of-sequence packets and to resend any lost packets. It’s worth remembering that the reordering takes place in the TCP software that resides in the receiver, so that any out-of-sequence packets are presented in the correct order to higher level applications within the device. However, all this adds significant and variable latency. Consequently, TCP is adequate for the OTT internet domain, where we have a bit of latitude for latency tolerances, but not for the studio domain, where latency tolerances are less flexible.
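
The receiver-side reordering described above can be sketched as follows. This is only a minimal illustration and not how any real TCP stack is written: the ReorderBuffer class and its simple integer sequence numbers are assumptions made for the example, and the resending of lost packets is left out entirely.

```python
class ReorderBuffer:
    """Holds out-of-sequence packets until the gap before them is filled.

    Hypothetical helper for illustration; real TCP uses byte-based
    sequence numbers, acknowledgements, and retransmission timers.
    """

    def __init__(self, first_seq=0):
        self.next_seq = first_seq   # next sequence number the application expects
        self.pending = {}           # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that can now be delivered in order."""
        if seq < self.next_seq or seq in self.pending:
            return []               # duplicate packet, nothing new to deliver
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered


buf = ReorderBuffer()
out = []
# Packet 2 arrives before packet 1.
for seq, payload in [(0, "a"), (2, "c"), (1, "b"), (3, "d")]:
    out.append(buf.receive(seq, payload))
```

Packet 2 cannot be passed to the application until packet 1 has arrived, and the time it spends waiting is one source of the variable latency discussed here.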

Another way of thinking about video and audio over IP is that we’ve taken a synchronous medium and distributed it across an asynchronous network. And an important consequence of this approach is that we’ve destroyed the timing plane.

Even when distributing video and audio across an asynchronous network, the video frames and audio samples must be played to the viewer using a constant and time invariant sampling clock, otherwise the video fluidity of motion will be compromised, and the audio will distort. This leads to the need for a high accuracy timing system within the broadcast infrastructure. To maintain accurate timing in SDI we use a sync-pulse-generator, but in IP we use a PTP (Precision Time Protocol) GM (Grand Master) clock, which has the potential to achieve sub-nanosecond accuracy in timing.
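
As a rough illustration of how a device can align its clock to a grandmaster, the sketch below applies the two-way time-transfer arithmetic used by PTP's delay request-response exchange. The function name and the timestamp values are invented for the example, and the calculation assumes the network delay is the same in both directions, which is not guaranteed on a real network.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and path delay from four timestamps.

    t1: grandmaster sends Sync         (grandmaster clock)
    t2: device receives Sync           (device clock)
    t3: device sends Delay_Req         (device clock)
    t4: grandmaster receives Delay_Req (grandmaster clock)

    Returns (offset of the device clock from the grandmaster, one-way delay).
    Assumes a symmetric path.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


# Illustrative values in nanoseconds: the device clock runs 500 ns ahead
# of the grandmaster and the one-way path delay is 2000 ns.
offset, delay = ptp_offset_and_delay(
    t1=1_000_000, t2=1_002_500, t3=1_010_500, t4=1_012_000
)
```

The device would then steer its local clock by the measured offset. Any asymmetry between the two directions shows up directly as an error in that offset.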

One question arises when using PTP in a studio environment: why do we not use PTP when delivering OTT? In the OTT domain, the viewer only needs to synchronize one device, that is, the television or mobile device they are consuming the media on. In file delivery the solution is relatively straightforward as the viewing device creates its own timebase, which for most home and domestic environments is good enough. And for real-time live broadcasting, the television can synchronize its frame-rate sampling clock to the long-term rate of the incoming video. Although there will be some timing anomalies due to packet jitter and reordering, these can generally be ironed out using a long-term low pass filter. However, the same is not true for studio infrastructures where multiple cameras, video file servers, and graphics equipment must be synchronized.
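
The long-term low pass filtering mentioned above might look something like the following sketch, which smooths jittery frame arrival intervals with an exponential moving average. The function name, the smoothing factor alpha, and the jitter pattern are all assumptions chosen for illustration. A real receiver would typically drive a clock recovery loop rather than a simple average.

```python
def estimate_frame_period(arrival_times, nominal_period, alpha=0.05):
    """Low-pass filter the measured frame intervals (exponential moving average).

    A small alpha gives a long time constant, so short-term jitter is
    largely ignored and the estimate tracks the long-term frame rate.
    """
    estimate = nominal_period
    for previous, current in zip(arrival_times, arrival_times[1:]):
        interval = current - previous
        estimate += alpha * (interval - estimate)
    return estimate


# Frames nominally 20 ms apart (50 fps) with up to +/-2 ms of arrival jitter.
jitter = [0.0, 2.0, 0.0, -2.0] * 50
arrivals = [i * 20.0 + jitter[i] for i in range(len(jitter))]
period = estimate_frame_period(arrivals, nominal_period=20.0)
```

Individual intervals in this example swing between 18 ms and 22 ms, yet the filtered estimate stays close to the 20 ms long-term period.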

Thinking Asynchronously

Timing and latency in broadcast networks become much easier to understand if broadcast engineers and technologists think in terms of asynchronous packet distribution. It’s true that we’re still distributing synchronous media, but the underlying transport stream is asynchronous. Packet-switched networks are a form of multiplexing in which packets are switched between different network links to achieve signal routing. Many packets from other sources share the same link, so it is the role of the layer-2 switch or layer-3 router to multiplex the packets from each source onto the destination link.

Broadcasters are not compelled to use Ethernet, but it is the ubiquitous standard for layer-2 transport within industrial facilities. Other protocols such as SDLC and HDLC are available, but these tend to be specialist protocols used in the backbones of telcos. Consequently, Ethernet over fiber and twisted-pair CATn cable is generally the first choice for broadcasters.

Asynchronous networks have one fundamental flaw: the data packets are not synchronized and are generated at unpredictable times by the sending devices. When multiplexing the packets onto a link, the probability of packets from different sources arriving perfectly aligned one after another with optimal temporal spacing is virtually zero. Switch and router manufacturers deal with this by using buffers within their devices. Furthermore, any device connected to the network will employ buffers to synchronize incoming traffic flows to the device’s own processing events.

Figure 1 – FIFOs (First In, First Out) are a standard method of synchronizing two asynchronous systems. However, if the receiver doesn’t read the packets fast enough then buffer overflow can occur, which results in dropped packets and therefore a compromise in data integrity.
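
The FIFO behavior described in Figure 1 can be sketched in a few lines. The PacketFifo class, its depth of four packets, and the tick-based sender and receiver are assumptions made purely for illustration. Real switches and network interfaces implement their buffers in hardware with more sophisticated drop policies.

```python
from collections import deque


class PacketFifo:
    """Fixed-depth FIFO that discards new packets when it is full."""

    def __init__(self, depth):
        self.depth = depth
        self.queue = deque()
        self.dropped = 0

    def write(self, packet):
        """Store a packet; return False if it was dropped due to overflow."""
        if len(self.queue) >= self.depth:
            self.dropped += 1       # buffer overflow: the packet is lost
            return False
        self.queue.append(packet)
        return True

    def read(self):
        """Return the oldest packet, or None if the FIFO is empty."""
        return self.queue.popleft() if self.queue else None


fifo = PacketFifo(depth=4)
received = []
for tick in range(12):
    fifo.write(tick)                # the sender writes a packet every tick
    if tick % 2 == 0:               # the receiver only reads every other tick
        received.append(fifo.read())
```

Because the receiver reads at half the rate the sender writes, the FIFO gradually fills and then begins to drop packets. A deeper FIFO would only delay the first drop, at the cost of more latency.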


Buffer Management

Buffer analysis for optimizing traffic flow is a massive subject and an area of intense research, as we need to balance two fundamental and seemingly contradictory requirements: keep latency low, and maintain high data integrity by not dropping any data packets. Traffic streams are often bursty, meaning a 1Mb/s flow will not have its packets uniformly distributed across every second. With COTS type IT equipment, it’s not unusual to see an IP flow that averages 1Mb/s peak to 10Mb/s for a brief period and then have a relatively long time of inactivity. The buffer is able to deal with this as it acts as a short-term soak for the data. However, if the buffer is too large then excessive latency can occur, and if it’s too small then packets will be dropped and the data integrity will be compromised.
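
To put some rough numbers on this trade-off, the sketch below takes the 1Mb/s average and 10Mb/s peak from the example above and assumes, purely for illustration, that the burst lasts 10ms and that the output link drains the buffer at the 1Mb/s average rate. The function name and the burst duration are not taken from any standard.

```python
def burst_buffer_requirement(peak_bps, drain_bps, burst_ms):
    """Return (peak buffer occupancy in bytes, worst-case queuing delay in seconds).

    Models a single burst arriving at peak_bps for burst_ms milliseconds
    while the output drains continuously at drain_bps.
    """
    excess_bits = max(peak_bps - drain_bps, 0) * burst_ms / 1000
    buffer_bytes = excess_bits / 8
    queuing_delay_s = excess_bits / drain_bps
    return buffer_bytes, queuing_delay_s


buffer_bytes, delay_s = burst_buffer_requirement(
    peak_bps=10_000_000, drain_bps=1_000_000, burst_ms=10
)
```

Under these assumptions the buffer must hold roughly 11 kilobytes to avoid dropping anything, and the last packet of the burst waits around 90ms before it leaves. That is the latency cost of absorbing the burst without loss.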

IT systems are asynchronous by nature as all the terminal equipment such as servers, desktop computers, and mobile devices are autonomous, and all operate asynchronously. Therefore, it’s not unreasonable to expect the network transporting their data to be asynchronous too. Fundamentally, we need buffers to allow different devices to reliably exchange asynchronous data.

ST2110, along with PTP, has an option to keep transmitted packets evenly gapped, and this goes a long way to reducing the risk of dropping packets. However, it also requires high-end switches with non-blocking buffers and backplanes, which leads to much higher infrastructure costs. This is one of the reasons it’s difficult, if not impossible, to stream ST2110 video across the public internet or into the cloud. Even using FEC (Forward Error Correction), it’s challenging to reliably stream low latency uncompressed video and audio across the internet without some form of packet recovery, purely because the internet is a massive unmanaged network. This is one justification for employing managed networks with non-blocking switches in a broadcast facility.
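
The idea of evenly gapped packets can be illustrated with a simple pacing calculation. This is a sketch of the general principle only, not the sender model defined in the ST2110 documents. The function name, packet size, and stream rate are hypothetical values chosen to keep the arithmetic easy to follow.

```python
def paced_send_times(packet_count, packet_bits, stream_bps, start=0.0):
    """Return send times in seconds that space packets evenly at the stream rate.

    With even spacing, the instantaneous rate seen by the network never
    exceeds the average rate, so downstream buffers see no bursts.
    """
    gap = packet_bits / stream_bps      # time between the starts of consecutive packets
    return [start + n * gap for n in range(packet_count)]


# Hypothetical stream: 10,000-bit packets at 1 Mb/s gives a 10 ms gap.
times = paced_send_times(packet_count=4, packet_bits=10_000, stream_bps=1_000_000)
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
```

Compare this with the bursty sender described earlier: the average rate is the same, but because every gap is identical the buffers in the switches only ever need to absorb a packet or so per flow.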

Design Considerations

When designing and building IP broadcast facilities, engineers must start with the workflows. For example, one workflow may require a newsroom viewing system where journalists can monitor incoming feeds. For this workflow, a non-blocking architecture capable of supporting ST2110 and PTP is complete overkill, and a lightly managed network capable of supporting compressed video over RTP/TCP/IP is more than adequate. However, when building the studio infrastructure, a non-blocking architecture will be a significant requirement to keep packet loss and latency low.

Latency is a major challenge for IP. However, if broadcasters learn to work with latency and understand why and where it occurs, then building reliable infrastructures will be much more achievable.

And finally, thinking in terms of asynchronous packet distribution will help enormously when designing, implementing, and maintaining IP broadcast infrastructures.
