Connecting a camera in an SDI infrastructure is easy. Just connect the camera output to the monitor input, and all being well, a picture will appear. The story is very different in the IP domain.
In this short series of articles, we look at what it means to connect a camera to a studio infrastructure in the IP domain. We discuss the challenges, solutions, and the practical aspects of routing moving pictures in the IP studio.
Although I say connecting SDI is easy, that wasn’t always the case. In the very early days of SDI, it was difficult to transport images reliably, as the chipsets that provided automatic equalization had yet to be developed and cable designs that could work at 270 Mb/s were in their infancy. Although we probably don’t remember, those early trailblazers working with SDI found themselves ascending a steep learning curve, much as we are doing today with IP.
Fundamentally, communication systems have two methods of routing signals: circuit switched or packet switched.
With circuit switched infrastructures, a point-to-point connection exists between two devices, and a routing matrix is employed to deliver greater flexibility. In the old days of television this consisted of relays and patch bays, but as electronics developed, FET and MOSFET routing matrices were used that allowed any matrix input to be connected to one or many outputs.
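The one-to-many behavior of a routing matrix can be sketched as a simple data structure. This is only an illustrative model, not a description of any real router: the class name `RoutingMatrix`, its methods, and the port labels such as `CAM1` and `MON1` are all hypothetical. The key assumption is that each output listens to at most one input, while any input may feed several outputs.

```python
class RoutingMatrix:
    """Toy crosspoint router: each output is fed by at most one input."""

    def __init__(self, inputs, outputs):
        self.inputs = set(inputs)
        # Map every output to the input it currently listens to (None = unrouted).
        self.crosspoints = {out: None for out in outputs}

    def route(self, source, destination):
        """Connect one input to one output, replacing any previous route."""
        if source not in self.inputs:
            raise ValueError(f"unknown input: {source}")
        if destination not in self.crosspoints:
            raise ValueError(f"unknown output: {destination}")
        self.crosspoints[destination] = source

    def destinations_of(self, source):
        """List every output currently fed by the given input."""
        return sorted(d for d, s in self.crosspoints.items() if s == source)


# Hypothetical port names, for illustration only.
matrix = RoutingMatrix(["CAM1", "CAM2"], ["MON1", "MON2", "VTR1"])
matrix.route("CAM1", "MON1")
matrix.route("CAM1", "MON2")   # the same input now feeds two outputs
matrix.route("CAM2", "VTR1")
fed_by_cam1 = matrix.destinations_of("CAM1")   # ["MON1", "MON2"]
```

Storing the route against the output, rather than the input, is what makes one-to-many connections trivial while preventing two inputs from driving the same output.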
Television has always used some form of synchronous distribution. Although no longer needed, the line, field and frame syncs provided a method of synchronizing the scanning coils of the camera with the scanning coils in the studio monitors, and the scanning coils of all the television sets at home. As we moved to SDI and AES, we kept this timing information to maintain backwards compatibility, which in effect gave us the synchronous transport streams that SDI and AES provide. This in turn led to the adoption of circuit switched networks, as they easily maintain the synchronous nature of the SDI and AES transport streams.
Figure 1 – Left) Early ethernet (IEEE 802.3) used a single coaxial cable to connect multiple computers and devices. CSMA/CD (Carrier Sense Multiple Access with Collision Detection) was a method of allowing multiple devices to send packets and detect collisions, giving the sender the option of resending a packet should a collision be detected. Right) How ethernet is used in modern broadcast facilities. The point-to-point connection between the cameras and the ethernet switch removes collisions, but each port uses a memory buffer to reduce the chance of congestion when the video packets are sent to the next device, in this case, the monitor. Consequently, variable latency occurs.
IP was originally developed to allow computer systems to communicate with each other and exchange data. Critically, the data often consisted of documents and database queries and responses, so exchanges took the form of short bursts of data, and packet delivery was the optimal communication method for them. Furthermore, computers within a network would often share the same physical cable, so methods of arbitration had to be devised to reduce the probability of two computers sending a packet simultaneously and causing a collision, resulting in corrupted data and lost packets.
The bottom line is that IP was never designed to send continuous streams of video or audio. But to take advantage of the advances in IT technology, broadcast engineers and innovators have had to devise methods of distributing continuous streams of video and audio over IP networks.
It’s worth remembering that IP datagrams exist independently of their underlying transport streams. This allows us to easily transport IP datagrams over ethernet, WiFi, or fiber, and switch between them. The IP datagram doesn’t know or care about its transport stream, and this independence is both IP’s greatest strength and its most difficult challenge, because transferring between different transport streams can have a significant effect on the streamed media. As we’ve abstracted the streamed video and audio media away from the timing plane, we can no longer rely on clocks within the transport domain to provide reliable timing information.
Although we don’t need line, field and frame syncs anymore, television is still a sampled system for both video and audio. We must therefore synchronize the video display of the monitor with the video sampling of the camera, and the audio playback of the loudspeaker with the audio sampling of the microphone.
Therefore, connecting a camera to a monitor in an IP network presents us with the following challenges:
- The monitor and other devices need to know how to receive the camera feed
- We must restore the timing to synchronize the camera and downstream devices
- A reliable and flexible method of labelling the video streams must be found
- The camera must be routed to its destinations
- We must be able to monitor and understand where the packets are going, and interact with the streams
Also, we have to contend with the thorny issue of latency. As SDI and AES are synchronous transport streams, the transmit and receive buffers can be very small, potentially in the order of a few samples. There may be some clock jitter on the circuit, but the phase locked loop in the receiver will remove it, assuming the jitter is within tolerable limits. Furthermore, the sync pulse generator supplies a reliable clock source, so most SDI senders and receivers are closely aligned in time, further removing the need for large input buffers.
As IP is asynchronous, we cannot rely on the clocks within the transport stream. For example, a camera may be sending IP datagrams over fiber, and a graphics generator may be sending images over CAT8/ethernet. The two use wildly different transport methods, and it would be almost impossible to choose which of their clocks to use as a timing reference.
Buffers are used extensively within IT networks to synchronize asynchronous events and to reduce the potential for network congestion. As television is still a synchronous system, and we currently need to provide backwards compatibility, we have no option but to use buffers in both send and receive devices, as well as within the network. Although buffers solve the problems of congestion, collision, and the synchronization of asynchronous events, they do so at the expense of latency. More worryingly, this latency is no longer static and predictable, but highly variable.
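The trade-off between buffer depth and latency can be illustrated with a minimal sketch. It is not taken from any real receiver: the function name `simulate_playout`, the uniform random delay used to stand in for network queueing, and all parameter values are assumptions chosen purely for illustration. Packets leave the sender at a fixed interval, arrive after a variable delay, and must be present by their playout time, which is the send time plus the receive buffer delay.

```python
import random


def simulate_playout(num_packets, interval_us, max_jitter_us, buffer_delay_us, seed=1):
    """Count packets that arrive too late for their fixed playout slot.

    Assumed model: every packet suffers an independent random network delay
    between 0 and max_jitter_us microseconds. Real networks behave differently.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    late = 0
    for n in range(num_packets):
        send_time = n * interval_us
        arrival_time = send_time + rng.uniform(0, max_jitter_us)
        # The receiver plays each packet a fixed delay after it was sent.
        playout_time = send_time + buffer_delay_us
        if arrival_time > playout_time:
            late += 1
    return late


# Hypothetical figures: one packet every 100 us, up to 500 us of network jitter.
for delay_us in (0, 100, 250, 500):
    late = simulate_playout(1000, 100, 500, delay_us)
    print(f"buffer delay {delay_us:>3} us -> {late} late packets out of 1000")
```

In this model, a buffer delay at least as large as the worst-case jitter loses no packets, but every picture is then delayed by that amount. A shallower buffer lowers latency at the cost of late packets.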
In the next article in this series, we will look at how a monitor displays a picture being sent from a camera.