Designing IP Broadcast Systems: Timing

How adding PTP to asynchronous IP networks provides a synchronization layer that maintains fluidity of motion in the video and distortion-free sound in the audio.

Broadcast systems have been synchronous since the first television programs were broadcast in the 1930s. However, IP systems are asynchronous by nature, and this is presenting us with some new and interesting challenges.

There are no moving pictures in television, just a series of still images played quickly to give the illusion of motion. Our perception of the fluidity of motion relies on these images being played in a time-invariant manner, which in turn leads to a synchronous system. To keep our broadcast system as simple as possible we use a constant image sampling rate of either 29.97 or 25 frames per second, depending on where you are in the world. Higher frame rates are being used to improve the immersive experience, but the vast majority of frame rates used throughout the world are either 29.97 or 25 fps.

Although the sample and playback frame rates must be time invariant, there is no reason why the distribution system connecting the broadcaster to the viewer must also be time invariant. Synchronized distribution has certainly been the case when transmitting traditional programs using DVB and ATSC systems, but this is more of a holdover from the need to maintain backwards compatibility with existing home televisions. Relaxing this time-invariant constraint on the distribution mechanism provides broadcasters with many more options for program delivery and is one of the reasons that video and audio over IP works.

Flexibility In IP

The success of IP networks rests on two key attributes: the IP packet has no knowledge of the underlying transport it is travelling across, and IP packets may be received out of sequence. These attributes greatly influence network topologies, where designers must balance flexibility, scalability, and resilience, so there is often a compromise to be struck. For example, increasing resilience can reduce flexibility, and improving scalability can reduce resilience. We can improve flexibility, scalability, and resilience simultaneously, but in doing so the cost and complexity of the network rise sharply. Removing the constraint that packets must all arrive in sequence requires systems to be put in place to reconstruct the order at the receiving device (such as the television). In essence, though, IP packets do not all have to take the same route through the network, and this provides much more flexibility.

Broadcasters have up to now built highly resilient IP networks using topologies such as spine-leaf. Although this works well, flexibility and scalability are compromised when compared to mesh networks. But this is a relatively small price to pay when considering where many broadcasters currently are in their IP journey. The constraints that SMPTE's ST 2110 places on infrastructures, by maintaining evenly gapped packet spacing and keeping delivery sequential, may seem to contradict the strengths of IP networks. However, broadcasters are still learning IP, and reducing the number of variables in the system is to be welcomed. This will change as greater confidence is gained and more ambitious networks are built that further enhance scalability and flexibility.

To keep latency low and predictable, buffer utilization in switches and connected devices must also be kept low. This is achieved by adding a degree of synchronization to the system using PTP (Precision Time Protocol).

Synchronization With PTP

PTP provides a layer of timing synchronization to the devices connected to the network (such as cameras, production switchers, microphones, and sound consoles). It’s important to note that PTP doesn’t time the network, only the devices that are connected to it.

At the heart of PTP is a value that is updated every nanosecond to represent the number of nanoseconds that have passed since the epoch on 1st January 1970. The theory is that if all connected devices hold the same value, and they all agree on when the epoch occurred, then they are in effect synchronized. A series of messages containing the master nanosecond clock value is sent from the PTP Grand Master (PTP-GM) to all connected devices, allowing them to synchronize their own internal version of the nanosecond counter to the value being sent by the PTP-GM.
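
One way to see why a shared counter amounts to synchronization is to derive a frame number from it. The Python sketch below is illustrative only: it assumes, purely for the example, that frame boundaries are counted from the epoch, and the function names frame_index and ns_until_next_frame are hypothetical. The rate 30000/1001 is the exact value behind the nominal 29.97 fps.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def frame_index(ptp_ns: int, fps: Fraction) -> int:
    """Whole frame periods elapsed since the epoch (assumed alignment point)."""
    return (ptp_ns * fps.numerator) // (fps.denominator * NS_PER_SECOND)


def ns_until_next_frame(ptp_ns: int, fps: Fraction) -> int:
    """Nanoseconds from ptp_ns to the next frame boundary, rounded up."""
    next_index = frame_index(ptp_ns, fps) + 1
    numerator = next_index * fps.denominator * NS_PER_SECOND
    boundary_ns = -(-numerator // fps.numerator)  # integer ceiling division
    return boundary_ns - ptp_ns


if __name__ == "__main__":
    counter = 1_700_000_000_000_000_123  # example counter value, not real data
    for name, rate in (("25 fps", Fraction(25)), ("29.97 fps", Fraction(30000, 1001))):
        print(name, frame_index(counter, rate), ns_until_next_frame(counter, rate))
```

Two devices holding the same counter value compute the same frame index without exchanging any further messages, which is the sense in which they are synchronized.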

Figure 1 – At A, a device such as a camera is synchronized to the PTP-GM and creates evenly gapped packets for video frames. After travelling through the network, the packets are temporally shifted and so lose their timing reference based on their position (B). The PTP receiver after B is synchronized to the PTP-GM (and every other PTP device on the network including the cameras) and re-gaps the packets so they recover their temporal parameters.

Computer servers rely on software stacks to process IP messages, and these are generally built into the operating system's kernel. Although this provides greater flexibility, it has some limitations for time synchronization: the server is an asynchronous processing system, so it introduces variable processing delays, and this results in inaccurate synchronization. To counteract this, the network interface card (NIC) often has hardware processing that intercepts the IP packets containing the PTP information and synchronizes its local clock, thus providing a highly resilient and accurate timing system which the rest of the server then uses. Broadcast devices such as cameras and production switchers also employ these hardware methods to improve PTP accuracy, as without them the video and audio sampling systems will drift, creating discontinuities in the motion and distortion in the sound. When this reaches an extreme, video and audio samples can even be lost or dropped, further adding to the problem.

PTP And Networks

PTP accuracy requires us to dig deeper into the network, as we can’t take the detail of its propagation characteristics for granted. For example, accurate PTP synchronization relies on the network being symmetrical and having determinate latency. By symmetrical we mean that the time a message takes to travel from the PTP-GM to a connected device is the same as the time a message takes to travel from the device back to the PTP-GM. This is implied within a single CAT5/6/7/8 cable, as the Tx/Rx signal paths are in the same physical cable and so the propagation time in both directions is equal. But it must be validated when the network is configured, as IP packets may take different routes when going in different directions. The IP specification does not mandate that the source-to-destination path must be exactly the same as the destination-to-source path. Therefore, the network may or may not be symmetrical, and it is up to the broadcast engineer to validate this.
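
The effect of asymmetry can be shown with the delay request-response arithmetic defined in IEEE 1588, which the article does not spell out. The Python sketch below is a simplified model: the simulate helper, its parameter names, and the example values are assumptions made for illustration. It shows that a device can only measure the round trip, so it has to assume each direction takes half of it.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """IEEE 1588 delay request-response arithmetic (all values in ns).

    t1: master sends Sync (master clock)
    t2: device receives Sync (device clock)
    t3: device sends Delay_Req (device clock)
    t4: master receives Delay_Req (master clock)
    """
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    offset_from_master = ((t2 - t1) - (t4 - t3)) / 2
    return offset_from_master, mean_path_delay


def simulate(true_offset, delay_to_device, delay_to_master,
             t1=1_000_000, turnaround=500):
    """Hypothetical helper: build the four timestamps for a device whose
    clock is ahead of the master by true_offset, then run the arithmetic."""
    t2 = t1 + delay_to_device + true_offset      # stamped by the device clock
    t3 = t2 + turnaround                         # device clock
    t4 = (t3 - true_offset) + delay_to_master    # stamped by the master clock
    return ptp_offset_and_delay(t1, t2, t3, t4)


if __name__ == "__main__":
    print(simulate(250, 1000, 1000))  # symmetric paths  -> (250.0, 1000.0)
    print(simulate(250, 1500, 500))   # asymmetric paths -> (750.0, 1000.0)
```

In the asymmetric case the measured offset is wrong by half the difference between the two path delays (500 ns here), and nothing in the timestamps themselves reveals the error, which is why the symmetry has to be validated by other means.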

The buffers in the switches temporally move packets to achieve egress scheduling, which results in jitter, and therefore we cannot assume the network has determinate latency. This causes PTP to suffer from stability and accuracy issues. One solution is to use PTP-aware switches, which can determine how long the PTP packets have queued in the buffer and add to the PTP message an offset that describes how late it will be when it reaches its destination (due to the buffer latency). The connected device is then able to add this offset to the nanosecond time value just received so its local counter can achieve greater accuracy. The PTP system will still operate without PTP-aware switches, but the synchronization the connected devices can achieve will be greatly compromised.
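
The offset mechanism described above can be sketched in a few lines of Python. This is a simplified model rather than the wire format: the SyncMessage class, its field names, and the link_delay_ns parameter are hypothetical, and real switches carry the accumulated value in a dedicated field of the PTP message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncMessage:
    origin_timestamp_ns: int  # master counter value when the message left
    correction_ns: int = 0    # queuing time accumulated in switches so far


def pass_through_switch(msg: SyncMessage, ingress_ns: int, egress_ns: int) -> SyncMessage:
    """A PTP-aware switch adds the time the message spent in its buffer.

    ingress_ns and egress_ns are read from the switch's own local clock, so
    only their difference matters.
    """
    residence_ns = egress_ns - ingress_ns
    return SyncMessage(msg.origin_timestamp_ns, msg.correction_ns + residence_ns)


def master_time_at_arrival(msg: SyncMessage, link_delay_ns: int) -> int:
    """The device's estimate of the master counter at the moment of arrival."""
    return msg.origin_timestamp_ns + msg.correction_ns + link_delay_ns


if __name__ == "__main__":
    msg = SyncMessage(origin_timestamp_ns=1_000_000)
    msg = pass_through_switch(msg, ingress_ns=10, egress_ns=4_010)     # queued 4000 ns
    msg = pass_through_switch(msg, ingress_ns=7_000, egress_ns=8_500)  # queued 1500 ns
    print(msg.correction_ns)                               # 5500
    print(master_time_at_arrival(msg, link_delay_ns=300))  # 1005800
```

Without the correction the device would set its counter 5500 ns early in this example, and because queuing time changes from message to message, the error would appear as jitter rather than a fixed offset.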

Monitoring Time

Although nanosecond timing has been at the heart of every broadcast infrastructure since the first television pictures were broadcast, we must now look at timing in a slightly different manner. The time information that the connected devices require is no longer embedded into the signal itself (as with SDI and AES) but is instead provided by a subsystem that operates independently of the video, audio, and metadata streams. The PTP timestamps must be synchronized, but the mechanism for synchronizing all the connected devices operates without knowledge of the video, audio, or metadata.

Broadcast engineers must look at the PTP messages that form the timing plane and are distributed throughout the network. Not only are we interested in the validity of the data, but also in added fields such as the PTP-aware switch offset values, the number of messages sent per minute, and whether propagation delays are influencing the accuracy of the timing plane.
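
As a rough illustration of this kind of monitoring, the Python sketch below summarizes a batch of captured PTP messages. The record format, a list of (arrival time, switch offset) pairs in nanoseconds, and the summarize function are assumptions made for the example and do not correspond to any particular capture tool.

```python
from statistics import mean

NS_PER_MINUTE = 60 * 1_000_000_000


def summarize(records):
    """Summarize captured messages given as (arrival_ns, correction_ns) pairs."""
    if len(records) < 2:
        raise ValueError("need at least two messages to measure a rate")
    arrivals = sorted(arrival for arrival, _ in records)
    span_ns = arrivals[-1] - arrivals[0]
    if span_ns <= 0:
        raise ValueError("messages must span a non-zero time interval")
    intervals = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    corrections = [correction for _, correction in records]
    return {
        # intervals observed, scaled up to one minute
        "messages_per_minute": len(intervals) * NS_PER_MINUTE / span_ns,
        # a long gap suggests lost or delayed messages
        "max_interval_ns": max(intervals),
        # buffer delay reported by PTP-aware switches
        "mean_correction_ns": mean(corrections),
        "max_correction_ns": max(corrections),
    }


if __name__ == "__main__":
    # Made-up capture: one message every 125 ms with varying switch offsets.
    capture = [(i * 125_000_000, c) for i, c in enumerate([1000, 3000, 2000, 1000, 3000])]
    print(summarize(capture))
```

Tracking these figures over time, rather than at a single instant, makes it easier to spot a switch whose buffer delay is growing or a link where messages are arriving less often than expected.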

Timing has always been at the heart of a broadcast engineer's core thought processes, but as we progress on our IP journey and use PTP more, we must find new methods of monitoring the timing plane. It is not just about monitoring one point in a network, but about the ability to monitor, validate, and understand hundreds if not thousands of messages that are routed around the network. Most importantly, accuracy in the timing plane leads to a reduction in latency, and this is something every broadcast facility is striving for.
