Building An IP Studio: Connecting Cameras - Part 2 - Session Description Protocols

IP is incredibly versatile. It is agnostic to the data payload it carries, and many different underlying transports can move it across many different types of network. However, this versatility presents many challenges, especially when sending video and audio over networks.

SDI networks are fixed, and the video formats they can transport fit within relatively tight constraints. These could be HD or SD, at 50 or 60/1.001 fields per second. SMPTE have done a fantastic job of tightly defining the standards so that there are no surprises when connecting equipment. This virtually guarantees that if I connect an HD-SDI camera to an HD-SDI monitor then I will see the pictures.

The historic backwards-compatibility requirement of 50 or 60/1.001 fields per second is more or less finished, and that’s before we even start talking about interlace. We no longer have to maintain the tightly specified timing constraints imposed by the electromagnetic coils of the camera and monitor. Consequently, we can explore a whole new generation of higher frame rates, and different horizontal and vertical image arrays. Although this may still be somewhat futuristic, IP provides the infrastructure that will allow this to happen.

Other than defining the type of higher-level protocol employed, such as UDP or TCP, IP packets have no knowledge of the type of data their payloads are carrying. And as the media essence has been abstracted away from the constraints of the transport stream, the IP stream no longer implies any type of media format. It could equally be carrying audio, video, metadata, control information, or just about anything else. This implies great flexibility, but how do we know what type of media or data an IP packet is carrying?

For each stream we must define a set of attributes that specify the media essence. For example, the image is 1920 x 1080 pixels, the color subsampling is 4:2:2, and the field rate is 60/1.001 fields per second. These are just a few of the attributes and we could quite easily need ten or more to define a meaningful video signal. And this information must be maintained for every stream.

One method is to provide a spreadsheet listing every video, audio, and metadata source, along with the attributes that specify each signal. For a few media flows this is possible. But when we start reaching hundreds or even thousands of video and audio flows, not to mention their associated metadata streams, the spreadsheet approach becomes an administrative nightmare bordering on the impossible, especially if we want to take advantage of the versatility, scalability, and flexibility that IP can provide.

A more practical solution, and one that has been used in AoIP for many years, is the SDP (Session Description Protocol). An SDP description is a small text file that contains all the parameters needed to specify the media stream. The SDP was originally specified in 1998 by the IETF as RFC 2327, with a revised version, RFC 4566, released in 2006. SMPTE adopted this in the early 2000s.
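To make the idea concrete, the following is a minimal sketch of what an SDP description for a video stream might look like and how a receiver could pull the essence attributes out of it. The addresses, port, payload type, and session name are made-up values for illustration, and the format parameter names (sampling, width, height, exactframerate, and so on) follow the style used for uncompressed video over RTP; treat them as assumptions rather than a complete or validated description. The helper functions parse_sdp and fmtp_params are hypothetical names, not part of any library.

```python
# Illustrative SDP text. Every address, port, and ID below is a made-up value.
EXAMPLE_SDP = """\
v=0
o=- 1443716955 1443716955 IN IP4 192.168.10.20
s=Camera 1 video (hypothetical)
t=0 0
m=video 50000 RTP/AVP 96
c=IN IP4 239.10.10.1/64
a=rtpmap:96 raw/90000
a=fmtp:96 sampling=YCbCr-4:2:2; width=1920; height=1080; interlace; exactframerate=30000/1001; depth=10; colorimetry=BT709
"""


def parse_sdp(text):
    """Split SDP text into session-level fields and a list of media sections."""
    session = {"attributes": []}
    media = []
    current = session  # lines before the first m= line belong to the session
    for line in text.splitlines():
        line = line.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip anything that is not a "<letter>=<value>" line
        key, value = line[0], line[2:]
        if key == "m":
            kind, port, proto, *formats = value.split()
            current = {
                "media": kind,
                "port": int(port),
                "proto": proto,
                "formats": formats,
                "attributes": [],
            }
            media.append(current)
        elif key == "a":
            current["attributes"].append(value)
        else:
            current[key] = value
    return session, media


def fmtp_params(attributes):
    """Return the parameters of the first a=fmtp attribute as a dict."""
    for attr in attributes:
        if attr.startswith("fmtp:"):
            _, _, params = attr.partition(" ")
            result = {}
            for item in params.split(";"):
                item = item.strip()
                if not item:
                    continue
                name, sep, value = item.partition("=")
                # Flags such as "interlace" carry no value, so store True.
                result[name] = value if sep else True
            return result
    return {}


session, media = parse_sdp(EXAMPLE_SDP)
video = fmtp_params(media[0]["attributes"])
print(session["s"], media[0]["port"], video["width"], video["height"])
```

A receiver or control system could use the resulting dictionary to decide whether it is able to decode the stream before subscribing to it.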

Without the SDP, streams based on formats such as SMPTE’s ST 2110 are almost impossible to manage. It is theoretically possible to use a spreadsheet to keep a record of the source IP addresses, frame rates, color space, etc. However, the practicalities of maintaining such a spreadsheet render the exercise virtually unmanageable. Instead, each source generator, such as a camera, microphone, or frame synchronizer, creates an SDP file. This can then be issued on a periodic basis by the device or retrieved by a management control system.

SDP files are not restricted to ST 2110 streams. They are used by a large number of streaming formats, such as RTP/MPEG and DASH, to identify the audio and video streams. This provides the potential for distributing and identifying many different streams in a broadcast network, leading to massive flexibility and scalability. That said, it is the role of the generating source or destination device to make sure the information defined in the SDP is accurate.

Fig 1 – each device on the network can generate SDP files and send them to the broadcaster’s control system. This gives the control system an overview of all the connected devices and their formats.

Both source and destination devices on the network can generate SDP files. A monitor or loudspeaker can issue SDP files to advertise its connectivity allowing other devices to be connected to it.

In its simplest form, the SDP provides a plug-and-play type service, assuming the device is configured to send its SDP files periodically. Typically, this is once per second. If the SDP files are sent too frequently, then the system runs the risk of creating network congestion. And if they are sent too infrequently, then system management and other devices may not be able to detect a newly connected piece of equipment fast enough.

SDP files are not particularly long, typically around 4Kbytes. But care must be taken when calculating their network bandwidth allocation, as a single device could produce multiple streams, resulting in one SDP file for each essence. In a studio camera this soon becomes an issue, as there could easily be ten separate video and audio input and output streams. If each file is sent once per second, a 4Kbyte file suddenly consumes 320Kbits/s of network bandwidth (4Kbytes x 8 x 10 = 320Kbits/s). And if there are 500 such sources, this could result in 160Mbits/s of bandwidth, which requires careful network management.
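The arithmetic above can be reproduced with a few lines of code, which makes it easy to try other file sizes or send rates. This is a rough sketch under stated assumptions: 4Kbytes is taken as 4,000 bytes to match the figures in the text, each file is sent once per second, and packet and protocol overheads are ignored. The function name is hypothetical.

```python
def sdp_bandwidth_bits_per_s(file_bytes, streams_per_device,
                             devices=1, sends_per_second=1.0):
    """Estimate the bandwidth used by periodically announced SDP files.

    Ignores IP/UDP headers and any other protocol overhead.
    """
    return file_bytes * 8 * streams_per_device * devices * sends_per_second


# One camera with ten streams, one 4,000-byte SDP file per stream per second.
per_camera = sdp_bandwidth_bits_per_s(4_000, 10)

# 500 such sources on the same network.
facility = sdp_bandwidth_bits_per_s(4_000, 10, devices=500)

print(f"{per_camera / 1e3:.0f} kbit/s per camera")
print(f"{facility / 1e6:.0f} Mbit/s for 500 sources")
```

Halving the send rate halves the bandwidth, at the cost of slower detection of newly connected equipment.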

If there are too many devices creating SDP files, then there are two further options: send the files over a different network, or use the system control software to actively pull the SDP files from the devices instead of the devices constantly sending them. Using a different network has its merits, as the SDP file delivery times are not as critical as those of the media streams, and the SDP traffic cannot then interfere with the media streams.

The broadcast facility’s control software is largely responsible for collecting the SDP files and creating an up-to-date database of the connected devices and their audio and video parameters, as well as their network addresses. This can be performed in the background by the management service so that an overview of all the connected devices is easily viewable. Web page type views allow users to monitor the devices and their associated streams with varying degrees of hierarchy and granularity to determine how the system is configured.
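As a rough illustration of the database such control software might keep, the sketch below stores the most recent SDP text for each stream and drops entries that have not been refreshed within a timeout. It is not how any particular control system works: the class name SdpRegistry, the five-second timeout, and the device and stream identifiers are all assumptions made for the example. How the SDP text arrives, whether announced by the device or pulled by the controller, is left out.

```python
import time


class SdpRegistry:
    """In-memory record of the latest SDP text seen for each stream."""

    def __init__(self, timeout_s=5.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        # (device address, stream id) -> (SDP text, time last seen)
        self._entries = {}

    def update(self, device, stream_id, sdp_text):
        """Store or refresh an SDP entry. Returns True if the stream is new."""
        key = (device, stream_id)
        is_new = key not in self._entries
        self._entries[key] = (sdp_text, self._clock())
        return is_new

    def expire(self):
        """Remove and return streams not refreshed within the timeout."""
        now = self._clock()
        stale = [key for key, (_, seen) in self._entries.items()
                 if now - seen > self._timeout_s]
        for key in stale:
            del self._entries[key]
        return stale

    def streams(self):
        """Return the known (device, stream id) pairs in sorted order."""
        return sorted(self._entries)


registry = SdpRegistry(timeout_s=5.0)
if registry.update("192.168.10.20", "video-1", "v=0"):
    print("new stream detected")
print(registry.streams())
```

The timeout ties back to the send interval: with files announced once per second, a timeout of a few seconds tolerates the occasional lost announcement while still removing disconnected devices promptly.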

Identifying media streams in a complex broadcast network is not a trivial task. Streaming specifications that adopt SDPs help keep track of the media essence parameters so that downstream equipment can easily receive and connect to the streams.
