Building An IP Studio: Connecting Cameras - Part 2 - Session Description Protocols

IP is incredibly versatile. It’s data payload agnostic and multiple transport streams have the capability to transport it over many different types of networks. However, this versatility provides many challenges, especially when sending video and audio over networks.

Although SDI networks are fixed, the types of video formats they can transport fits within a relatively tight constraint. This could be HD, SD, 50 fields or 60/1.001 fields per second. SMPTE have done a fantastic job of tightly defining the standards so that there are no surprises when connecting equipment. This virtually guarantees that if I connect an HD-SDI camera to an HD-SDI monitor then I will see the pictures.

The historic backwards compatibility requirements of 50 or 60/1.001 fields per second is more or less finished, and that’s before we even start talking about interlace. We no longer have to maintain the tightly specified timing constraints imposed by the electromagnetic coils of the camera and monitor. Consequently, we can explore a whole new generation of higher frame rates, and different horizontal and vertical image arrays. Although this may still be somewhat futuristic, IP provides the infrastructure that will allow this to happen.

Other than defining the type of higher-level protocol employed, such as UDP or TCP, IP packets have no knowledge of the type of data their payloads are carrying. And as the media essence has been abstracted away from the constraints of the transport stream, the IP stream no longer implies any type of media format. It could equally be carrying audio, video, metadata, control information, or just about anything else. This implies great flexibility, but how do we know what type of media or data an IP packet is carrying?

For each stream we must define a set of attributes that specify the media essence. For example, the image is 1920 x 1080 pixels, the color subsampling is 4:2:2, and the frame rate is 60/1.001 fields per second. These are just a few of the attributes and we could quite easily need ten or more to define a meaningful video signal. And this information must be maintained for every stream.

One method is to provide a spreadsheet of every video, audio, and metadata source defining each of the attributes that specify the signal. For a few media flows this is possible, but when we start reaching hundreds and even thousands of video and audio flows, not to mention their associated metadata streams, providing a spreadsheet approach is clearly an administrative nightmare, bordering on the impossible, especially if we want to take advantage of the versatility, scalability, and flexibility that IP can provide.

Another more practical solution, and one that has been used in AoIP for many years, is the SDP (Session Description Protocol). This is a small file that contains all the parameters needed to specify the media stream. The SDP was originally specified in 1998 by the IETF with a revised version being released in 2006 called RFC 4566. SMPTE adopted this in the early 2000s.

Without the SDP, streams based on formats such as SMPTEs ST2110 are almost impossible to manage. It is theoretically possible to use a spreadsheet to keep a record of the source IP addresses, frame rates, color space, etc., however, the practicalities of maintaining such a spread sheet render the exercise virtually unmanageable. Instead, each source generator, such as a camera, microphone, or frame synchronizer, creates an SDP file. This can then be issued on a periodic basis by the device or retrieved by a management control system.

SDP files are not restricted to ST2110 streams, instead, they are used by a large number of streaming formats to identify the audio and video streams such as RTP/MPEG and DASH. This provides the potential for distributing and identifying many different streams in a broadcast network leading to massive flexibility and scalability. That said, it is the role of the generating source or destination device to make sure the information defined in the SDP is accurate and correct.

Fig 1 – each device on the network can generate SDP files and send them to the broadcasters control system. This gives the control system an overview of all the connected devices and their formats.

Both source and destination devices on the network can generate SDP files. A monitor or loudspeaker can issue SDP files to advertise its connectivity allowing other devices to be connected to it.

In its simplest form, the SDP provides a plug-and-play type service, assuming the device is configured to send its SDP files periodically. Typically, this is often once per second. If the transmit rate of the SDP files is too fast, then the system runs the risk of creating network congestion. And if it is too slow, then system management and other devices may not be able to detect a newly connected piece of equipment fast enough.

SDP files are not particularly long and are typically 4K bytes in length. But care must be taken when calculating their network bandwidth allocation as a single device could consist of multiple streams resulting in one SDP file for each essence. In a studio camera this soon becomes an issue as there could easily be ten separate video and audio input and output streams. Therefore, a 4Kbyte file suddenly consumes 320Kbits/s of network bandwidth (4Kbyte x 8 x 10 = 320Kbits/s). And if there are 500 sources, then this could result in 160Mbits/s of bandwidth resulting in careful network management.

If there are too many devices creating SDP files, then there are two further options: send the files over a different network or use the system control software to actively pull the SDP files from the devices instead of the devices constantly sending them. Using a different network has its merits as the SDP file delivery times are not as critical as the media streams, resulting in their being no interference with the media streams.

The broadcast facilities control software is largely responsible for collecting the SDP files and creating an up-to-date database of the connected devices and their audio and video parameters, as well as their network addresses. This can be performed in the background by the management service so that an overview of all the connected devices is easily viewable. Web page type views allow users to monitor the devices and their associated streams with varying degrees of hierarchy and granularity to determine how the system is configured.

Identifying media streams in a complex broadcast network is not a trivial task. Streaming specifications adopting SDPs help keep track of the media essence parameters so that down stream equipment can easily receive and connect to the streams.

Other related articles posted on The Broadcast Bridge.

Building An IP Studio: Connecting Cameras - Part 3 - Network Switching And Routing

You might also like...

Designing IP Broadcast Systems - The Book

Designing IP Broadcast Systems is another massive body of research driven work - with over 27,000 words in 18 articles, in a free 84 page eBook. It provides extensive insight into the technology and engineering methodology required to create practical IP based broadcast…

Next-Gen 5G Contribution: Part 2 - MEC & The Disruptive Potential Of 5G

The migration of the core network functionality of 5G to virtualized or cloud-native infrastructure opens up new capabilities like MEC which have the potential to disrupt current approaches to remote production contribution networks.

Designing IP Broadcast Systems: Addressing & Packet Delivery

How layer-3 and layer-2 addresses work together to deliver data link layer packets and frames across networks to improve efficiency and reduce congestion.

Next-Gen 5G Contribution: Part 1 - The Technology Of 5G

5G is a collection of standards that encompass a wide array of different use cases, across the entire spectrum of consumer and commercial users. Here we discuss the aspects of it that apply to live video contribution in broadcast production.

Designing IP Broadcast Systems: Integrating Cloud Infrastructure

Connecting on-prem broadcast infrastructures to the public cloud leads to a hybrid system which requires reliable secure high value media exchange and delivery.