IP Video: It's More Than One Thing, And It Needs More Than One Solution. Part 2.

What does IP Video really mean, why is there such hysteria, and where is it all going? At annual industry trade shows, like IBC and NAB, there is often a single 'buzzword' which floats to the top of conversations between vendors, broadcasters and consultants. Over the last year or so, that word has been 'IP Video', and with each day that passes, IP Video hysteria grows. In this article we will try to rationalise what IP Video actually means in practice, and ask questions about how it will start to play a role in the day-to-day lives of broadcasters. We will question the ongoing debate around choosing which IP Video protocol is going to become "the standard", and we will conclude that in fact we need more than one protocol to realise the opportunities offered by working with video in an IP world. Continue reading part 2.

Continued from part 1

Examination of the main protocols for IP Video

Before we move on, let’s stop and take a closer look at the various players in the IP Video world. Like any new technology genre we see different opinions developing on the best way to solve the problem that is IP Video.

We see ‘standards based’ approaches from groups building complex, evolving solutions on top of many layers of previous standards; and we see more pragmatic approaches based on getting something which works today and already has useful applications.

The strong trend towards building more technical layers on top of existing ones (for example, encapsulating SDI streams within RTP, or Transport Streams within UDP) perhaps reveals the industry's reluctance to let go of something we all know and trust, but it runs the risk of over-complicating a challenge which deserves a clean slate.

The IP Video protocols most commonly talked about include SMPTE 2022-6 and ASPEN – both of which are transported over 10 Gb (or faster) Ethernet. There are other solutions without that 10 Gb minimum dependency, including Network Device Interface (NDI). Each of these formats has different strengths and design motivations, and each may be suitable for different applications.

SMPTE 2022–6 creates a very complex data stream

SMPTE 2022–6 is fundamentally the closest to what we are familiar with in SDI. The approach is to take a raw, uncompressed SDI stream, and simply carry that over a 10 Gb network cable using UDP (User Datagram Protocol). The content of the pixels does not change, and the basic SDI interleaved data structure stays the same.

In order to carry SDI over UDP, SMPTE 2022–6 makes use of an existing standard called the Real-time Transport Protocol (RTP). This works by chopping up the SDI stream into tiny pieces, rubbing in some salt and other seasonings before sending it on its way. The penalty is that the data that ends up on the wire is more complex than the original SDI signal it represents. The benefit is that we can now transport our stream over (high performance) commodity networking infrastructure.
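The chopping-up step can be sketched in a few lines of Python. This is a deliberately simplified model of RTP-style packetisation, not the actual SMPTE 2022–6 format: the 1376-byte payload size is an assumed figure for illustration, and real packets carry RTP headers, timestamps and SDI line mappings that are omitted here.

```python
PAYLOAD_BYTES = 1376  # assumed media payload per datagram (illustrative)

def packetise(sdi_bytes: bytes, payload_size: int = PAYLOAD_BYTES):
    """Chop a raw byte stream into (sequence_number, payload) chunks,
    mimicking how RTP slices an SDI stream into small datagrams."""
    packets = []
    for seq, offset in enumerate(range(0, len(sdi_bytes), payload_size)):
        # real RTP sequence numbers wrap at 16 bits
        packets.append((seq & 0xFFFF, sdi_bytes[offset:offset + payload_size]))
    return packets

def reassemble(packets):
    """The receiver's job: put the pieces back in order.
    (Ignores sequence-number wraparound, which a real receiver must handle.)"""
    return b"".join(payload for _, payload in sorted(packets, key=lambda p: p[0]))
```

Even this toy version makes the asymmetry visible: the sender's work is mechanical, but the receiver must buffer, reorder and reconstruct before a single pixel is usable.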

It’s important to recognise that SMPTE 2022–6 creates a very complex data stream requiring work at the sending end to construct that data and work at the receiving end to deconstruct it. You can see that this is even more of a challenge when you consider that the underlying SDI stream itself is already very complex. SDI is not simply a solid frame of video followed by another in sequence - it is an interleaved stream of chopped up picture, sound and metadata - designed for real-time hardware applications.

So, we end up with a very complex SDI salad, chopped further by SMPTE 2022–6, arriving in a tube to be sorted out by a very complex system before getting back to something resembling a video frame. For HD video this is happening at 1.5 Gb/s — rising to 6 or 12 Gb/s for newer standards.

To conclude our salad analogy, that’s 1.5 billion bits of assorted chopped vegetable arriving every second – and we have to reconstruct the original lettuce leaf or a carrot in real time to actually get a picture out.
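The scale of that reconstruction job is easy to underestimate, so here is the back-of-envelope arithmetic behind the figures above. The 1376-byte payload size is the same illustrative assumption as before; the HD-SDI rate of 1.485 Gb/s is the nominal figure.

```python
HD_SDI_BPS = 1.485e9        # nominal HD-SDI data rate, ~1.5 Gb/s
PAYLOAD_BITS = 1376 * 8     # assumed media payload per packet

packets_per_second = HD_SDI_BPS / PAYLOAD_BITS
ns_per_packet = 1e9 / packets_per_second

print(f"{packets_per_second:,.0f} packets every second")
print(f"{ns_per_packet:,.0f} ns to deal with each one")
```

Roughly 135,000 packets a second, which leaves the receiver only a handful of microseconds per packet, and four or eight times less again for 6 or 12 Gb/s formats.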


Some vendors have started to appreciate this complexity, and are trying to find a simpler way to move video and audio between devices, without so much chopping and sorting. Enter ASPEN. This proposed standard shares much of the underlying architecture of SMPTE 2022–6, using 10 Gb/s UDP and the RTP packetisation system.

Where ASPEN differs is in recognising that carrying an SDI interleaved stream on top of RTP and UDP is unnecessary, because most devices at either end really just want to see complete video frames. Instead, ASPEN makes use of MPEG Transport Streams to encapsulate the media essence, and once again the data is chopped into another set of small pieces for transmission along the wire. This approach is less intensive than SMPTE 2022–6, but it's still a lot of work to do at an incredible rate.
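To see what Transport Stream encapsulation looks like in practice, here is a minimal sketch. The 188-byte TS packet size and 0x47 sync byte are genuine MPEG-TS constants; the seven-packets-per-datagram grouping is the common practice for fitting TS into a standard 1500-byte Ethernet frame, not a quote from the ASPEN specification, and the TS header is reduced to just the sync byte.

```python
TS_PACKET_SIZE = 188    # fixed MPEG-TS packet size
TS_PER_DATAGRAM = 7     # common grouping to fit a 1500-byte Ethernet frame
SYNC_BYTE = 0x47        # every TS packet starts with this byte

def make_ts_packets(essence: bytes):
    """Wrap raw essence bytes into 188-byte TS packets.
    (Header greatly simplified: sync byte plus 187 payload bytes,
    zero-padded; real headers carry PIDs, counters and flags.)"""
    payload_size = TS_PACKET_SIZE - 1
    packets = []
    for off in range(0, len(essence), payload_size):
        chunk = essence[off:off + payload_size].ljust(payload_size, b"\x00")
        packets.append(bytes([SYNC_BYTE]) + chunk)
    return packets

def make_datagrams(ts_packets):
    """Group TS packets seven at a time, one group per UDP datagram."""
    return [b"".join(ts_packets[i:i + TS_PER_DATAGRAM])
            for i in range(0, len(ts_packets), TS_PER_DATAGRAM)]
```

The fixed packet size is what makes TS hardware-friendly, and it is also why yet another layer of chopping and regrouping is unavoidable.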

Because of the amount of processing, slicing, multiplexing and demultiplexing required, we’ve almost certainly designed-out any sort of software implementation — and instead have restricted ourselves to using low-level hardware. Did we just lose sight of the key benefit of our dream IP studio?

The dream was one where the CPU in one system could 'talk video' directly to the CPU in another; in practice, the complexity and speed of these 10 Gb protocols make that totally impractical. One of the authors of this piece wrote software to deconstruct a SMPTE 2022–6 data stream on a desktop computer using software alone, with some success, but with evidence that the computer would not be doing much else at the same time. The ultimate conclusion: SMPTE 2022–6 needs to be done in hardware. The same conclusion probably applies to ASPEN.


The uncompressed camp is also now starting to recognise the need for compression, to try and get more streams down a sensibly sized network (after all, SMPTE 2022–6 and ASPEN can't fit even a single HD video stream down a 1 Gb/s network connection).
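The arithmetic behind that parenthetical claim is worth making explicit. The usable-capacity figure below is an assumption to allow for packet headers and headroom, not a measured number.

```python
HD_SDI_BPS = 1.485e9     # nominal HD-SDI data rate
LINK_BPS = 1e9           # 1 Gb/s network connection
USABLE_FRACTION = 0.9    # assumed headroom for headers and other traffic

min_ratio = HD_SDI_BPS / (LINK_BPS * USABLE_FRACTION)
print(f"at least {min_ratio:.1f}:1 compression per HD stream")
# and proportionally more if several streams must share the link
```

Even a modest mezzanine codec clears this bar easily; the hard part, as discussed below, is the compute cost of doing so for many streams at once.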

Codecs including JPEG2000 and TICO are starting to see some usage, although this raises two more challenges — codec licensing and compression overhead. Patents and licensing have been the thorn in a number of common standards adopted in broadcast and on the Internet — particularly noticeable in H.264 and MP3. As we move ahead with IP Video, it would be nice to avoid a repeat of all this trouble and expense.

In practical terms, the main burden of compression is the compute load it entails. Codecs like JPEG2000, H.264 (and even more so H.265) are extremely compute intensive, and require either dedicated silicon or a sizeable portion of a computer processor. Whilst compression reduces the network data rate, that benefit must be offset against the compute load. This becomes particularly important when the channel count increases and your systems are processing multiple IP video streams simultaneously. Equally, long-GOP codecs like H.264 can introduce latencies into the data stream due to backwards and forwards interframe compression and out-of-order delivery.
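The long-GOP latency penalty is easy to quantify in rough terms. An encoder cannot emit a B-frame until the later frame it references has arrived, so a reorder depth of a few frames directly adds delay. The figures below are assumed, typical values for illustration, not measurements of any particular codec.

```python
FRAME_RATE = 50       # frames per second (assumed, e.g. 1080p50)
REORDER_DEPTH = 3     # B-frame reorder depth (assumed, typical)

added_latency_ms = REORDER_DEPTH / FRAME_RATE * 1000
print(f"~{added_latency_ms:.0f} ms added by frame reordering alone")
```

Tens of milliseconds may sound small, but it compounds with encode, network and decode delays, which is why intra-only codecs like JPEG2000 and TICO remain attractive for live production despite their poorer compression ratios.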

The two main '10 gigabit' standards are represented in various camps, with offshoots and flavours emerging through AIMS, ASPEN, VSF, AMWA and a number of other groups. Most major hardware vendors have aligned themselves with one or the other and are rushing to bring IP Video products to the market. Very often these products are simple interfaces between 10 gigabit IP Video and SDI, with the underlying hardware unchanged and still processing baseband SDI video.

Mark Gilbert is CTO at Gallery SIENNA

Simon Haywood is an independent broadcast consultant at Tamed Technology http://tamedtechnology.com

Continue reading part 3
