In principle, IP systems for broadcasting should not differ from those for IT. However, as we have seen in the previous nineteen articles in this series, distributing video and audio reliably depends heavily on accurate timing. In this article, we investigate the key components needed to build a reliable broadcast IP infrastructure.
SMPTE’s ST2110 specification abstracts away the link between the video and audio essence and the underlying transport mechanisms of SDI, AES, and MADI. Although ST2022-6 provided a mechanism to distribute SDI over IP, it was wasteful of bandwidth and didn’t provide the flexibility ST2110 has achieved.
As ST2110 removed the sync pulses from the video signal, a new timing model had to be adopted. SMPTE chose IEEE-1588:2008 to provide the timing plane. Under this standard, otherwise known as PTP (Precision Time Protocol), each compliant device synchronizes its own internal clock with the Grand Master clock.
PTP is used in industry to harmonize processes on production lines and in laboratories. Multiple robots moving a car chassis must have their motors synchronized, otherwise one might move ahead of the other and distort the chassis. Sub-microsecond accuracy is achievable with PTP, and it applies directly to television.
In an ST2110 system, all devices, such as cameras, vision switchers, sound consoles and recording servers, must be attached to a PTP network. Each device is synchronized to the PTP Grand Master clock to achieve the sub-microsecond accuracy needed.
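To make the synchronization step concrete, the sketch below shows the arithmetic a device can apply to the four timestamps gathered in the IEEE 1588 delay request-response exchange. It is a minimal illustration, not a PTP stack: the class and function names are hypothetical, the timestamps are invented, and the formulas assume the network delay is the same in both directions.

```python
from dataclasses import dataclass


@dataclass
class PtpExchange:
    """Four timestamps, in nanoseconds, from one Sync / Delay_Req exchange."""
    t1: int  # Grand Master sends Sync (master clock)
    t2: int  # device receives Sync (device clock)
    t3: int  # device sends Delay_Req (device clock)
    t4: int  # Grand Master receives Delay_Req (master clock)


def mean_path_delay_ns(x: PtpExchange) -> float:
    """One-way network delay, assuming the path is symmetric."""
    return ((x.t2 - x.t1) + (x.t4 - x.t3)) / 2


def offset_from_master_ns(x: PtpExchange) -> float:
    """How far the device clock is ahead of the Grand Master (negative if behind)."""
    return ((x.t2 - x.t1) - (x.t4 - x.t3)) / 2


# Invented example: device clock runs 500 ns ahead, path delay is 2,000 ns each way.
example = PtpExchange(t1=1_000_000, t2=1_002_500, t3=1_010_000, t4=1_011_500)
print(offset_from_master_ns(example), mean_path_delay_ns(example))
```

The device subtracts the computed offset from its own clock. Because the calculation assumes a symmetric path, any asymmetry or jitter in the network shows up directly as clock error, which is why the network itself must keep latency variation low.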
Jitter Must be Overcome
Packet jitter is the enemy of video streaming. If an IP packet is moved too far temporally in the stream, then large buffers will be required to re-synchronize the packets. And if a device’s PTP clock is drifting, or is itself suffering from jitter, further buffering must be used to avoid packet loss.
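The relationship between jitter and buffer size can be sketched with a simplified model. The code below compares packet arrival times against an ideal, evenly paced schedule and estimates how many packets of buffering would absorb the spread. The function names, the 5 microsecond packet interval, and the arrival times are all assumptions made for illustration; this is not the sender timing model defined in the ST2110 documents.

```python
import math
from typing import List


def arrival_deviation_us(arrivals_us: List[int], interval_us: int) -> List[int]:
    """Deviation of each arrival from an ideal schedule anchored at the first packet."""
    start = arrivals_us[0]
    return [t - (start + i * interval_us) for i, t in enumerate(arrivals_us)]


def peak_to_peak_jitter_us(arrivals_us: List[int], interval_us: int) -> int:
    """Spread between the earliest and latest packet relative to the ideal schedule."""
    deviations = arrival_deviation_us(arrivals_us, interval_us)
    return max(deviations) - min(deviations)


def buffer_depth_packets(arrivals_us: List[int], interval_us: int) -> int:
    """Whole packets of buffering needed to cover the observed spread."""
    return math.ceil(peak_to_peak_jitter_us(arrivals_us, interval_us) / interval_us)


# Invented capture: packets nominally 5 us apart, two of them displaced in time.
arrivals = [0, 5, 12, 15, 19, 25]
print(peak_to_peak_jitter_us(arrivals, 5), buffer_depth_packets(arrivals, 5))
```

Every extra packet of buffer depth adds one packet interval of latency, which is the trade-off the paragraph above describes.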
To keep latency as low as possible, ST2110 uses UDP rather than TCP. As there is no upper-layer protocol available to resend lost packets, packet loss in video and audio streams cannot be tolerated.
Diagram 1 – PTP synchronization messages must be sent to every broadcast device that processes ST2110 data. The network must keep latency and jitter as low as possible to maintain accurate PTP clocks in each device.
A single HD video feed can generate 200,000 IP packets every second. The data is continuous and relentless. With broadcast systems generating hundreds and sometimes thousands of video streams, the amount of continuous data being moved around a network is colossal.
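A rough calculation shows where a packet rate of this order comes from. The sketch below assumes a 1080p60 feed with 4:2:2 sampling at 10 bits (20 bits per pixel on average), counts active picture only, and assumes about 1,400 bytes of video payload per packet. These figures are illustrative assumptions, not values taken from the ST2110 specification, and real packet sizes vary with the sender.

```python
def active_bit_rate(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Bits per second for the active picture only (no headers, no ancillary data)."""
    return width * height * fps * bits_per_pixel


def packets_per_second(bit_rate: int, payload_bytes: int) -> int:
    """Packets needed each second, rounded up, for a given payload size."""
    bits_per_packet = payload_bytes * 8
    return -(-bit_rate // bits_per_packet)  # integer ceiling division


# Assumed format: 1920x1080, 60 frames/s, 4:2:2 10-bit = 20 bits per pixel.
rate = active_bit_rate(1920, 1080, 60, 20)
print(rate, packets_per_second(rate, 1400))
```

Under these assumptions the feed works out to roughly 2.5 Gbits/sec and a little over 220,000 packets per second, in line with the figure quoted above.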
In any broadcast system, all devices must be synchronized to the stream that is taking the longest to reach the vision switcher. But large buffers result in long latency which in turn can affect other streams being switched in a studio.
Multicasting is used extensively to route video and audio streams from cameras and microphones, to vision switchers, monitors, and sound consoles. In SDI systems, broadcasters use distribution amplifiers to simultaneously distribute video to other devices. IP Multicasting is used to achieve the same effect.
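On the receiving side, a device subscribes to a stream by joining its multicast group, and the network then delivers a copy of the packets to it. The sketch below shows how this looks with the standard Python socket module. The group address 239.1.1.1 and port 5004 used in the comment are hypothetical, and a real ST2110 receiver would normally do this in hardware or a dedicated media stack rather than a script.

```python
import socket
import struct


def build_membership_request(group: str, interface_ip: str = "0.0.0.0") -> bytes:
    """Pack the group address and local interface address for IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip))


def open_multicast_receiver(group: str, port: int,
                            interface_ip: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join the multicast group (triggers an IGMP join)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group, interface_ip))
    return sock


# Hypothetical usage on a host attached to the media network:
#   receiver = open_multicast_receiver("239.1.1.1", 5004)
#   packet, sender = receiver.recvfrom(2048)
```

Any number of receivers can join the same group, which is how one camera feed reaches a vision switcher, a monitor, and a recorder at the same time, much as a distribution amplifier would in SDI.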
Leaf-spine or Central Switch?
There is much debate about the type of network topology to use. Some advocate leaf-spine and others prefer a dual central switch (to achieve redundancy). In deciding which topology to use, engineers must analyze the Ethernet capacity of each link in the network.
Centralized switches imply faster throughput with minimum latency, but they are difficult to scale. As the demands on the network increase, the switches must increase in size accordingly. Running two switches side-by-side creates interesting challenges, as a separate control system is needed to keep their respective look-up tables matched.
Although leaf-spine networks may appear to be scalable, the continuous and repetitive nature of video packets leaves little opportunity to burst data. Detailed modeling is needed at the planning stage to understand data rates between the leaf and spine ports, and even to predict future capacities. The whole point of moving to IP is that broadcasters are future-proofing their installations.
All this leads to Ethernet switches that have speeds and throughput found in high-end ISP backbones. A progressive HD signal consumes approximately 2.5 Gbits/sec of data bandwidth. And ST2110 imposes latencies of less than 250 microseconds between sender and receiver. A network with a thousand video streams will need a switch that can move 2.5 Tbits/sec of data between its ports, without packet loss.
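The kind of capacity modeling involved can be sketched in a few lines. The example below totals the bandwidth for a number of streams and estimates how many leaf-to-spine uplinks a leaf switch would need if every stream it sources had to cross to the spine. The function names, the 100 Gbits/sec uplink speed, the 80% utilization ceiling, and the 40-stream leaf are assumptions chosen for illustration, not recommendations.

```python
import math


def aggregate_gbps(stream_count: int, stream_gbps: float) -> float:
    """Total bandwidth when every stream is flowing at once."""
    return stream_count * stream_gbps


def uplinks_needed(stream_count: int, stream_gbps: float,
                   uplink_gbps: float, max_utilization: float = 0.8) -> int:
    """Uplinks a leaf needs if all of its streams must cross to the spine."""
    usable_per_uplink = uplink_gbps * max_utilization
    return math.ceil(aggregate_gbps(stream_count, stream_gbps) / usable_per_uplink)


# A thousand 2.5 Gbits/sec streams across the whole network.
print(aggregate_gbps(1000, 2.5), "Gbits/sec in total")

# Assumed leaf: 40 streams, 100 Gbits/sec uplinks filled to at most 80%.
print(uplinks_needed(40, 2.5, 100.0), "uplinks")
```

Because video traffic is constant rather than bursty, there is no statistical averaging to rely on. The worst case is the normal case, so uplinks have to be sized for the full sum.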
Regardless of whether leaf-spine or centralized switches are employed, non-blocking Ethernet switches must be used. Due to their complexity, they are at the high-end of the IT budget and will certainly need vendor service level agreements to support them.
Security is a topic broadcast engineers have not given much thought to in the past. It was obvious when somebody had hacked an SDI network because they would have a pair of wire-cutters in their hand. The same is not true of IP networks for video and audio streaming.
Careful VLAN planning is required to keep facilities separate from each other, even in the same company. Quite often, a broadcaster will lease out its facilities to other production companies and sensitive material cannot be seen in other parts of the station.
Diagram 2 – To maintain high levels of security, VLANs are needed to keep traffic isolated between different facilities within the same broadcast station.
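One small part of VLAN planning that lends itself to an automated check is making sure no two facilities share a VLAN ID or an overlapping subnet. The sketch below uses Python's standard ipaddress module. The facility names, VLAN IDs, and address ranges are entirely hypothetical, and a real plan would also need to cover switch port assignments and routing rules between VLANs.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_conflicts(plan: Dict[str, Tuple[int, str]]) -> List[Tuple[str, str, str]]:
    """Return pairs of facilities that share a VLAN ID or have overlapping subnets.

    plan maps a facility name to (vlan_id, subnet in CIDR notation).
    """
    conflicts = []
    for (name_a, (vlan_a, net_a)), (name_b, (vlan_b, net_b)) in combinations(plan.items(), 2):
        if vlan_a == vlan_b:
            conflicts.append((name_a, name_b, "shared VLAN ID"))
        if ipaddress.ip_network(net_a).overlaps(ipaddress.ip_network(net_b)):
            conflicts.append((name_a, name_b, "overlapping subnet"))
    return conflicts


# Hypothetical plan with a deliberate mistake in the leased suite.
plan = {
    "studio_a": (110, "10.10.0.0/24"),
    "studio_b": (120, "10.20.0.0/24"),
    "leased_suite": (120, "10.20.0.128/25"),
}
for conflict in find_conflicts(plan):
    print(conflict)
```

Here the leased suite would share both a VLAN and an address range with studio_b, which is exactly the kind of overlap that lets one production see another's material.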
Keeping media assets safe from the internet is a necessity. Media is no longer distributed on tape, so if adequate security measures are not in place, it’s difficult to discover whether somebody is downloading a blockbuster film from your archive. Such a security breach has all kinds of legal ramifications for the broadcaster, as the owner of the film may claim the broadcaster allowed it to be illegally copied.
Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) must process terabits of data every second. Again, they will be at the higher end of the IT budget. Adequate planning is essential, otherwise they will create unnecessary bottlenecks and swamp true errors with false positives.
Audit systems are playing an increasing role in workflows to keep track of who is uploading and downloading media. Although not directly linked to high-speed IP networks, such systems will be administered by IT.
Critical Timing Information
Monitoring is a challenge in broadcast IP networks due to the volume of data being transferred. Diagnosis usually requires engineers to dig deep into the Ethernet stream and acquire accurate packet timing information as well as decoding the video and audio.
Protocol analyzers such as Wireshark must be used in conjunction with low-latency NICs (Network Interface Cards). Off-the-shelf NICs use buffers to store datagrams as they are received off the wire. Doing so destroys the timing information needed to understand packet jitter and buffer management.
Broadcast IP networks are complex and need careful planning to guarantee continuous reliable delivery of streaming media. Understanding the complexities of video and audio timing, and PTP systems, is key to building a successful and reliable network.