Virtualizing Live TV Production: The Software-only Facility
“When are you going to virtualize?” It is a common question put to vendors. Let us pull that question apart. What does it mean in a media industry context? How do you virtualize live production with today’s computers and software? In this paper, we will start by listing the advantages that virtualization brings and discuss how to benefit from them. Through a newsroom case study, we will summarize the tools available to us and the state-of-the-art techniques in software development. What can we learn from other industries? Finally, we will put this all together and introduce a framework suitable for distributed, low-latency, high-quality live production.
To Virtualize ...
To virtualize is to benefit from the -ilities of IT:
- agility – try things out, fail fast, be first;
- scalability – grow or shrink – use a bigger computer, or more computers, within minutes;
- reliability – avoid single points of failure and stay on air;
- card-swipe-ability – pay only for what you use;
- composability – construct flexible workflows using a chain of micro-services;
- sustainability – benefit from cross-industry investment in renewables.
The challenge is enabling these benefits for tier 1 live production, where we cannot compromise on either quality or latency. We need both high data rates and strict timing constraints. Encoding streams at every stage is an option, but it takes time, has a processing cost and risks introducing encoding artifacts. Instead, this article focuses on processing uncompressed streams (e.g. HD 1080p50 10-bit at around 3 Gbit/s) across fast networks (e.g. 10 Gbit/s and above).
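The data rates quoted above follow from simple arithmetic. The sketch below works through it, assuming 4:2:2 chroma subsampling (20 bits per pixel of active picture), which is the usual sampling for 3G-SDI but is not stated in the text:

```python
# Back-of-the-envelope data rate for uncompressed HD 1080p50 10-bit video.
# Assumption (not from the text): 4:2:2 chroma subsampling, i.e. two
# 10-bit samples per pixel = 20 bits per pixel of active picture.

WIDTH, HEIGHT = 1920, 1080
BITS_PER_PIXEL = 20          # 10-bit luma + 10-bit chroma (4:2:2 average)
FPS = 50

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
active_rate_gbps = bits_per_frame * FPS / 1e9

print(f"{active_rate_gbps:.2f} Gbit/s active video")   # → 2.07 Gbit/s
# The 3G-SDI link rate (2.97 Gbit/s) also carries blanking and ancillary
# data, which is why roughly 3 Gbit/s of capacity is needed per stream,
# and why even a 10 Gbit/s network carries only a handful of such streams.
```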
A Virtualized Newsroom
Where is virtualization used in live TV production today? I was technical leader for a newsroom system that was virtualized and deployed using Docker containers. Figure 1 shows the systems that were virtualized (green) and those that were not.
My team could receive a bug report or feature request from the morning news and have a fix ready for the lunchtime news. This was possible because of the confidence we had in our infrastructure as code, with automated build, test, container packaging and deployment systems. The green boxes show newsroom control systems (NRCS), rundown automation and prompters. All these systems suit virtualization or deployment with containers. The red boxes are graphics engines, switchers and media players – applications that produce tens of frames every second and need GPU and PCI I/O card access. These are harder to virtualize: they use a computer system much more like a gaming PC than a database or web server.
Where a high percentage of a machine's performance and resources is needed by one process, as for a full-screen 3D game, virtualization delivers little benefit. As an alternative, software automation (e.g. Ansible) allows the remote management of executables, drivers, software libraries and configurations. As with a Steam games library or a mobile app store, automation achieves the benefits of managing infrastructure as code.
Tools And Techniques
Efficient information technology uses one computer to do many parallel tasks. In the context of the web, one computer serves thousands of users at a time. Tens of compute cores run hundreds of processes, using thousands of threads, serving tens of thousands of parallel transactions. How? At the mechanism level, operating systems and hypervisors support massive scale by sharing out resources, using time slicing and context switching. In tandem, software development has evolved to exploit all that a multi-core, multi-threaded machine has to offer.
To scale up further – providing resilience to failure – software and data is distributed across many computers, in different data centres and across the globe.
When I first wrote a computer program, the code started at the top and ran in order to the end, perhaps looping in the middle. The code executed synchronously, line by line, CPU-clock-tick by CPU-clock-tick. If the program had to wait for a resource – reading a file, writing to memory, drawing some graphics – the CPU was blocked, 100% busy while waiting. Another way of making a CPU busy is copying data, with uncompressed video frames requiring millions of cycles for each replica. As CPUs have become faster, holding them up has become increasingly wasteful. Multi-tasking and threading can help to mitigate this, but threads are a limited system resource and can also be blocked.
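The cost of copying versus sharing can be shown in a few lines. This sketch uses Python's `memoryview` to share one frame buffer without duplication; the frame size is the approximate 5 MB of a 1080p 10-bit 4:2:2 frame, an assumption rather than a figure from the text:

```python
# Sketch: copying an uncompressed frame versus sharing it (frame size
# assumed, not from the text). One 1080p 10-bit 4:2:2 frame is ~5 MB.

frame = bytearray(5_184_000)      # one frame's worth of bytes (approx.)

copy = bytes(frame)               # a full copy: every byte is duplicated,
                                  # costing millions of CPU cycles per frame
view = memoryview(frame)[0:1024]  # a view: shares the same buffer, no copy

frame[0] = 0xFF
assert view[0] == 0xFF            # the view sees the change (shared memory)
assert copy[0] == 0x00            # the copy does not (separate memory)
```

The same distinction underlies zero-copy networking stacks and shared-memory inter-process communication: pass a reference to the frame, not a replica of it.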
Therefore, blocking and copying are the enemies of efficient computer software. Our friends are asynchronous processing and sharing data. Over the last decade, programming languages have matured to offer event-driven, asynchronous software development. Operations like waiting for a file read are done in the background, freeing up the CPU and allowing one thread to serve thousands of requests. An efficient computer program is a collection of actions triggered by events, with the flow of the code no longer directly connected to the clock of the CPU. In parallel technical developments, networking stacks have evolved zero-copy operations, and inter-process communication provides no-copy sharing of data.
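A minimal sketch of this event-driven style, using Python's asyncio: one thread services a thousand concurrent "requests" because the waiting happens in the background rather than blocking the CPU. The function names and the sleep standing in for I/O are illustrative:

```python
# Minimal sketch of event-driven, asynchronous processing: one thread
# interleaves many concurrent waits. Names and timings are illustrative.
import asyncio
import threading

async def handle_request(n: int) -> str:
    await asyncio.sleep(0.01)           # stand-in for file or network I/O
    return f"request {n} served by thread {threading.get_ident()}"

async def main() -> list:
    # 1000 concurrent requests, all interleaved on a single thread
    return await asyncio.gather(*(handle_request(n) for n in range(1000)))

results = asyncio.run(main())
print(len(results))                     # → 1000
```

Run synchronously, the same thousand 10 ms waits would take ten seconds; run asynchronously, they complete together in roughly the time of one.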
Standards like SMPTE ST 2110 – and similar real-time protocols – pose problems for pure software implementations. Rapid packet pacing is a synchronous, blocking operation. Also, the specification of bespoke bits-on-the-wire transport protocols forces copying of data in every transmitter and receiver. Ultimately, at any point in time, down at the microsecond level, a virtualized process is not necessarily being serviced by the operating system or hypervisor. When the time slices granted to an application are not regular enough to receive real-time media packets, the software cannot keep up.
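Some illustrative arithmetic shows why packet pacing is so hostile to time-sliced software. The figures below are assumptions (an ST 2110-20 style stream of about 2.07 Gbit/s in packets of roughly 1,400 payload bytes), not numbers from the text:

```python
# Illustrative packet-pacing arithmetic (figures assumed, not from the
# text): an ST 2110-20 style stream of ~2.07 Gbit/s of active video,
# carried in RTP packets of roughly 1,400 bytes of payload each.

stream_bps = 2.0736e9
payload_bits = 1400 * 8

packets_per_second = stream_bps / payload_bits
gap_us = 1e6 / packets_per_second

print(f"{packets_per_second:,.0f} packets/s, one every {gap_us:.1f} µs")
# → 185,143 packets/s, one every 5.4 µs — far finer than the scheduling
# granularity a virtualized process can count on, hence the pacing problem.
```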
How do we replicate the synchronous, clock-centric workflows of the broadcast facility in this asynchronous world … and still make television?
Learning From Others
Other industries are using computers and networks – both on-premises and in the cloud – at data rates that exceed the requirements of a broadcast facility. What can we learn from them? If a network claims to be 25 Gbit/s or 100 Gbit/s and non-blocking, how can we fill the pipes with media streams? Testing is the place to start; then we can replicate what successful tests do in our software.
The Energy Sciences Network is building high-performance data processing networks to aid scientific discovery. It publishes fasterdata, advice on how to tune operating systems and software to get full use out of high-bandwidth networks and fast computers. I followed their instructions in tests on AWS and was able to achieve close to full bandwidth using TCP/IP sockets. In short, this required distributing the network load across a few ports and a few CPU cores.
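The core of that parallelization can be sketched as striping each frame across several TCP connections. The sketch below shows only the partitioning step; the socket setup, port assignment and CPU pinning that the fasterdata guides describe are assumed to follow, and all figures are illustrative:

```python
# Sketch of the parallelization idea behind the fasterdata-style tests:
# stripe each frame across several TCP connections so the load spreads
# over multiple ports and CPU cores. Partitioning step only; sockets and
# core pinning (per the fasterdata guides) are assumed to follow.

FRAME_BYTES = 5_184_000      # one uncompressed 1080p 10-bit frame (approx.)
CONNECTIONS = 4              # e.g. four sockets on four ports/cores

def stripes(total: int, parts: int) -> list:
    """Split [0, total) into contiguous (offset, length) byte ranges."""
    base, extra = divmod(total, parts)
    ranges, offset = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges

for offset, length in stripes(FRAME_BYTES, CONNECTIONS):
    # each (offset, length) range would be sent on its own connection,
    # by its own thread, ideally pinned to its own CPU core
    print(offset, length)
```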
Other accelerated networking technologies are available, such as Elastic Fabric Adapters on AWS and network cards supporting remote direct memory access (RDMA). The OpenFabrics Alliance is creating advanced network ecosystems. It has released libfabric, a software library that enables developers to gain maximum performance and efficiency from whatever underlying network is available. Anyone familiar with AWS CDI is already using a cloud-specific version of libfabric. However, the scope of libfabric is much broader than the cloud. Our tests show it can move frames of video over TCP/IP networks much faster than real time, with low CPU impact.
An Enabling Framework
As an enabler for the transition of broadcast workflows to software, our collective goal should be designing and developing hybrid systems that work equally well on premises and in the cloud. Figure 2 shows a candidate architecture, with composable software services running on multiple hosts, connected by a content transfer fabric, orchestrated and controlled by a server. Real-time I/O at the edge; faster than real time in the core.
The framework is suitable for software automation with manual control, running both on on-premises servers supporting SDI and ST 2110 in hardware, and on virtual machines on premises and/or in the cloud. With composable services, developers can write applications augmented with uncompressed media processing capabilities. The framework facilitates control, scale-up and scale-out of live productions.
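Composable services can be pictured as asynchronous stages chained into a workflow. The sketch below is a generic illustration of that idea; the stage names are hypothetical and do not represent the framework's actual API:

```python
# Hedged sketch of composable services: each service is an async stage
# that receives frames and passes them on, so a workflow is a chain of
# stages. Stage names are illustrative, not the framework's real API.
import asyncio

async def source(n_frames: int):
    for i in range(n_frames):
        yield {"seq": i, "pixels": b"..."}   # placeholder frame

async def add_graphics(frames):
    async for frame in frames:
        frame["graphics"] = True             # stand-in for compositing
        yield frame

async def encode(frames):
    async for frame in frames:
        yield {"seq": frame["seq"], "encoded": True}

async def run_pipeline():
    out = []
    async for frame in encode(add_graphics(source(3))):
        out.append(frame)
    return out

frames = asyncio.run(run_pipeline())
print(frames)   # → [{'seq': 0, 'encoded': True}, ..., {'seq': 2, 'encoded': True}]
```

Because each stage only awaits the one before it, stages can be developed, tested and scaled independently and then composed per production.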
Using a common control track model, work can be duplicated across compute nodes. Two or more nodes can be asked to do the same work at the same time. If one fails to deliver a frame of video, the viewer does not notice as the duplicate node also produced the same frame. This allows architecting for failure, including across cloud availability zones. Another consequence is that services can be live-migrated from system-to-system – even during a broadcast – without dropping a frame.
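The duplicate-node idea can be sketched as first-arrival selection: the receiver keeps the first copy of each sequence number and silently drops the rest, in the spirit of SMPTE ST 2022-7 seamless protection applied at the frame level (the details here are assumed for illustration):

```python
# Sketch of "architecting for failure" by duplication: two nodes render
# the same frames; the receiver delivers the first copy of each sequence
# number to arrive and drops duplicates. Illustrative, details assumed.

def select_frames(arrivals):
    """arrivals: iterable of (node, seq, frame); yields one frame per seq."""
    seen = set()
    for node, seq, frame in arrivals:
        if seq in seen:
            continue          # duplicate of a frame already delivered
        seen.add(seq)
        yield seq, frame

# Node B covers seq 1 when node A drops it; the viewer never notices.
arrivals = [("A", 0, "f0"), ("B", 0, "f0"),
            ("B", 1, "f1"),                  # node A failed to deliver seq 1
            ("A", 2, "f2"), ("B", 2, "f2")]
output = list(select_frames(arrivals))
print(output)   # → [(0, 'f0'), (1, 'f1'), (2, 'f2')]
```

The same selection logic supports live migration: start the new node, wait until both are delivering the same frames, then retire the old one.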
Richard I. Cartwright, Technical Leader, Software Engineering, Matrox Video U.K.
Taking advantage of state-of-the-art software development techniques, software can deliver the benefits of virtualization for broadcast workflows. For high-performance video processing applications – ones that use most of the resources available to a computer – automation, rather than virtualization, delivers the infrastructure-as-code advantages. We can learn from other industries that are already exploiting the capabilities of high-performance computers and network fabrics. Doing so requires an event-driven, asynchronous approach, one that may be counter-intuitive to those familiar with synchronous broadcast engineering practices. Asynchronous processing, facilitated by a media-aware framework such as Matrox ORIGIN, enables the transition of workflows to software, delivers the benefits of IT innovation, and supports deployment both on premises and in the cloud.