Audio For Broadcast: Cloud Based Audio

With several industry-leading audio vendors demonstrating milestone product releases based on new technology at the 2024 NAB Show, the evolution of cloud-based audio took a significant step forward. In light of these developments, the article below replaces previously published content on the topic.

For broadcasters, the concept of investing in the cloud is like investing in a big bag of choices.

Unlike traditional broadcast infrastructures that are scaled to cater for the year’s biggest broadcast events, the cloud promotes the ability to flex broadcast resources to meet the requirements of a specific production. This agile, SaaS deployment is an attractive proposition because it means broadcasters can spend less on processing up front by leaning on operational expenditure as and when it is needed instead.

Accessing a mix engine in a pure software environment promises the ultimate in scalability, and makes traditional systems built to handle the biggest events seem bloated and inefficient, especially when that capacity is unused for most of the year.

The ability to employ a scalable processing strategy is not only a technological evolution, but a huge ideological shift. And it simply isn’t possible by relying on hardware alone.

It’s Already Happening

Cloud control of on-premise (on-prem) servers and remote control of hardware-based equipment via a web app are both workflows which have been referred to as cloud-enabled broadcasting. But while both of these approaches are made possible by global improvements to connectivity, neither really fits the brief.

Both of these workflows really only refer to the method of control. A pure cloud-native application, by contrast, is designed from the outset to run in a cloud computing environment such as AWS, Azure or Google Cloud, and to take advantage of the scalable resources that environment provides.

This is not a new thing; broadcasters have been embracing the cloud’s ability to manage data and automate processes for years. Applications like playout and media asset management (MAM) work well in the cloud because the audio is already embedded with the video, nicely synchronized and coordinated. Content providers have also used cloud workflows to perform specific tasks, like delivering localized commentary from international correspondents or creating multiple language feeds for global broadcast.

Why The Delay?

But full audio processing for live broadcast in the cloud has proved more difficult to implement. Part of the reason is that, rather than embedding the audio with the video as we have done for decades, live broadcasting in the cloud requires audio to travel independently of video. This is similar to the way SMPTE ST 2110 carries audio and video as separate essence streams, and it makes timing much more of a challenge because the signals need to be resynchronized relative to each other.

Sources can come in from different locations with different transport latencies. On a single production there might be edge processing for in-ear monitoring, a few sources from on-prem processors and a couple of assets from MAM systems. Time-alignment can quickly get messy, and achieving deterministic latencies in the cloud is even more demanding when timings can shift during a single broadcast depending on the transport. Tracking the timestamps of audio, video and data flows is of paramount importance if they are to be realigned later.
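The realignment described above can be sketched in a few lines. This is a minimal illustration only, assuming every source is timestamped at capture against a shared clock; the source names, timestamps and the `alignment_delays` helper are all hypothetical and do not describe any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    capture_ts_ms: float  # timestamp applied at capture (shared clock assumed)
    arrival_ts_ms: float  # time the same sample reached the cloud mix engine


def alignment_delays(sources):
    """Return the extra delay (ms) each source needs to line up with the slowest one."""
    latencies = {s.name: s.arrival_ts_ms - s.capture_ts_ms for s in sources}
    slowest = max(latencies.values())
    return {name: slowest - latency for name, latency in latencies.items()}


# Hypothetical sources with invented transport latencies
sources = [
    Source("venue_iem_edge", 1000.0, 1004.0),
    Source("on_prem_processor", 1000.0, 1022.0),
    Source("mam_asset", 1000.0, 1065.0),
]

delays = alignment_delays(sources)
for name, delay in delays.items():
    print(f"{name}: add {delay:.1f} ms of delay")
```

Because transport latency can shift during a broadcast, a calculation like this would need to be repeated continuously rather than performed once at the start of the show.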

Timing Is Right For Vendors

As is so often the challenge in audio workflows, we’re back to having to manage sync and latency.

Thankfully, multiple technology providers are working on ways to deal with these challenges, helping to match up the clocking of audio sources from various locations to a cloud environment.

While working in the cloud is still a huge shift for broadcasters, it is arguably a bigger shift for broadcast vendors, who are scrambling to develop new business models which move away from the hardware-only models that they have relied on for decades.

The result is a fragmented market, with some traditional hardware manufacturers adapting existing systems for cloud use, some implementing cloud-native software to meet broadcasters’ needs, and some teaming up to transplant existing workflows onto host server platforms.

In reality, a full-scale shift of whole broadcast infrastructures into the cloud is unlikely. Hybrid environments designed to connect hardware, software, and cloud-native services are likely to provide the capacity for broadcasters to flex their requirements for some time, and most suppliers are adapting their propositions down two distinct paths.

Virtual Machines

The idea of a virtual machine (VM) is well-established. In the simplest of terms, a VM is a computer made up purely of software components and hosted virtually.

In a broadcast environment there is a lot to like about this approach. VMs can be custom designed to do a very specific job, they are quick to deploy, and they can provide dedicated processing power whenever and wherever it is needed. As each VM is fully isolated from other VMs on the server, it makes them very secure and ensures they remain unaffected by anything that might go wrong elsewhere on the same server.

VMs are also highly portable and can easily be moved from one server to another, which is a benefit to network designers looking to better manage latencies caused by geography. Latency management is critical for live broadcast, and the location of the event, the processing and the control all have an impact on latency. The portable nature of VMs means that there are many ways a network designer can tailor the network to their advantage.

For example, in-ear monitoring (IEM) is an on-site priority, and IEM processing works best when it is located at the venue. But the priority for a mix engineer sat at a connected hardware console in a centralized facility might be to minimize control latency, which can be reduced by locating the processing as close to the control as possible. When working in a cloud environment, the ability to choose the location of the VM can make a big difference.

VMs can also be hosted on on-prem hardware, and often are, but while some broadcasters might prefer the physical server to be somewhere they can see it, doing so misses the point. VMs hosted in cloud environments like AWS and Google Cloud are more portable, and they can deliver more potential computing power because they have access to more processing and memory resources. These environments can also optimize machine types around the needs of the customer.
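The placement decision can be pictured as a simple lookup. The sketch below is an illustration under stated assumptions: the three hosting locations, the latency figures and the `best_location` helper are invented for the example and are not measurements from any real network.

```python
# Hypothetical one-way latencies (ms) from each candidate hosting location
# to the venue and to the control room. Illustrative figures only.
LATENCY_MS = {
    "venue_edge": {"venue": 1.0, "control_room": 40.0},
    "regional_cloud": {"venue": 18.0, "control_room": 12.0},
    "central_facility": {"venue": 40.0, "control_room": 1.0},
}


def best_location(priority_endpoint, latency_ms=LATENCY_MS):
    """Pick the hosting location with the lowest latency to the endpoint that matters most."""
    return min(latency_ms, key=lambda loc: latency_ms[loc][priority_endpoint])


iem_host = best_location("venue")         # performers need low monitoring latency
mix_host = best_location("control_room")  # the mix engineer needs low control latency

print("IEM processing:", iem_host)
print("Mix processing:", mix_host)
```

A real design would weigh several priorities at once, but the same principle applies: the portability of a VM lets the designer move the processing to wherever the most latency-sensitive path is shortest.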


Although a VM can provide much greater scale for audio processing in the cloud, in essence it suffers from the same lack of efficiency as its hardware equivalent: most of its potential remains unused for the majority of the time.

A microservice approach is an entirely different proposition. Microservices are like a pay-as-you-go service for operational compute, where a broadcaster only uses the services it needs for a specific project and leases those services for as long as it needs them. It’s like taking a VM, splitting it out into all its component parts, and leasing only the parts you need for as long as you need them.

In a microservices architecture, each component operates as an independent service. If a broadcaster needs more processing, it can spin up more modules and run them together, or it can use the same architecture to implement other microservices, like intercoms or ST 2110 receivers. It allows users to design a system that best suits their needs using microservices as building blocks. And as with a VM, because each component microservice is deployed independently of any of the others, it promotes security and minimizes the risk of failure across the whole production.

The microservice approach also opens the door to third parties, providing an infrastructure for different manufacturing specialists to provide dedicated software in a microservice environment via common software development kits (SDKs). Several broadcast suppliers are already working with partner companies to build their own ecosystems, and this too is attractive to many broadcasters as it gives them the ability to access a variety of different specialist services in one place, and at a potentially lower cost.
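The building-block idea can be modelled very simply. The sketch below is a toy model, not a real orchestration API: the `Production` class, its methods and the service names are all hypothetical, and it only tracks how many instances of each independent service are running.

```python
from dataclasses import dataclass, field


@dataclass
class Production:
    """Toy model of a production assembled from independently deployed services."""
    instances: dict = field(default_factory=dict)  # service name -> running instances

    def spin_up(self, service, count=1):
        self.instances[service] = self.instances.get(service, 0) + count

    def spin_down(self, service, count=1):
        remaining = self.instances.get(service, 0) - count
        if remaining > 0:
            self.instances[service] = remaining
        else:
            # Removing the last instance of one service leaves the others untouched
            self.instances.pop(service, None)


show = Production()
show.spin_up("mix_engine", 2)
show.spin_up("intercom")
show.spin_up("st2110_receiver", 4)
show.spin_up("mix_engine")   # more processing needed mid-production
show.spin_down("intercom")   # no longer required, so stop paying for it

print(show.instances)
```

Each service is added or removed without touching the others, which is the property that limits the impact of a single failure on the rest of the production.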

Microservices can be spun up and down, which means that the levels of compute are scalable to meet the needs of individual projects, making them very efficient, quick to launch, and cost effective.
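A rough calculation shows why capacity sized for the biggest event sits idle for most of the year. Every figure below is invented for illustration (the event list, channel counts and on-air hours are assumptions, not data from any broadcaster), and the helper names are hypothetical.

```python
# (event, processing channels needed, hours on air per year); invented figures
EVENTS = [
    ("weekly magazine show", 32, 100),
    ("league fixtures", 96, 300),
    ("major tournament final", 256, 20),
]

HOURS_IN_YEAR = 8760


def peak_capacity(events):
    """Channels a fixed system must provide to cover the largest event."""
    return max(channels for _, channels, _ in events)


def channel_hours_used(events):
    """Channel-hours actually consumed, which is what a pay-as-you-go model bills for."""
    return sum(channels * hours for _, channels, hours in events)


peak = peak_capacity(EVENTS)
used = channel_hours_used(EVENTS)
utilization = used / (peak * HOURS_IN_YEAR)

print(f"Fixed system sized for {peak} channels")
print(f"Channel-hours actually used: {used}")
print(f"Utilization of the fixed system: {utilization:.1%}")
```

With these assumed figures the fixed system is used for only a small fraction of its annual capacity, whereas a spun-up-and-down model would be billed for the channel-hours consumed.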

The Hybrid Future

There will always be live broadcasts where on-prem facilities are more efficient, others that call for an onsite Outside Broadcast or a remote production, and some that need a combination of all three. The ability to realign each signal and work on a recognizable worksurface, virtual or otherwise, makes everything feel like a more traditional workflow and puts the operator back in control.

Whichever route they take, more broadcasters are working with technology suppliers to introduce more cloud resources into their workflows, encouraging even more hybrid models and providing greater potential to flex their resources to meet the needs of the production.

The days of having to buy big are over. The adoption of the cloud should be about enabling broadcasters to employ the best mix of resources to meet their needs on a case-by-case basis, treating the cloud as another DSP resource to be accessed when it is appropriate to do so rather than turning to it just because it is there.

Scalability is the end goal, and that scalability will open the door to cover more events at higher quality.
