Audio For Broadcast: Cloud Based Audio

As broadcast production begins to leverage cloud-native production systems, and re-examines how it approaches timing to achieve that potential, audio, with its requirement for very low latency, remains one of the key challenges.

The cloud is a big place; it’s absolutely massive and it is stuffed with promises. Being able to access a mix engine in a pure software environment promises ultimate scalability and the ability to flex broadcast resources to fit the requirement of the production. It promises an agile, SaaS model which can reduce capital expenditure by leaning on operational expenditure instead, not to mention savings on salaries and transport costs for on site staff.

It doesn’t care about geography, and it promotes distributed working and remote production. Continuous improvements to bandwidth and reliability mean expensive dark fiber connectivity is no longer necessary. Standalone 5G is just around the corner, and in the meantime the public internet is already picking up the slack for cloud-enabled production, in both live coverage and proofs of concept.

For broadcasters, the idea of investing in cloud-native systems is like investing in a big bag of choices. The cloud promises the ability to define a workflow that is appropriate to the production. This kind of flexibility simply isn’t possible with a pure hardware solution, and the cloud has geographic flexibility for team members, and a way to cover a wider range of events, built in. It’s no wonder that broadcasters are looking for opportunities to push aspects of production into the cloud.

The problem is, as things stand at the moment, not many of these things are easy. Not yet anyway.

What Is It?

In a live broadcast setting, the cloud wears lots of different hats. Cloud-based and cloud-native software running on private and public cloud systems, cloud control of on-premise (on-prem) servers, and remote control of hardware-based equipment via a web app have all been referred to as “working in the cloud.” But are they?

For this article we are looking at the concept of ‘cloud’ as a production ecosystem that is, at its core, software. Pure cloud-native (based on containerized microservices) and cloud-based (running on virtualized servers) applications are primarily designed to run in public cloud computing environments like AWS, Azure and Google Cloud. They can also be run on private cloud systems. The software applications are developed to utilize the scalable compute resources which such flexible host environments offer.

Live production systems based on the combination of software and scalable compute resources offer the exciting potential of huge flexibility and scalability when compared to bespoke hardware-based systems.

Broadcasters have actually been embracing the cloud’s flexibility, scalability and ability to manage data and automate processes for years. Playout and media asset management (MAM) workflows are easy to manage in the cloud, and broadcasters have made the most of them.

But what these use cases have in common is that audio content is already embedded with the video, so everything is already nicely coordinated. And as we have discussed before, sync between audio and video is one of the linchpins of live broadcasting; it’s why for the longest time live broadcast infrastructures have embedded audio alongside the video, such as in an SDI stream.

Some Old Friends

When creating production systems in the cloud, this embedding cannot and does not happen.

In the cloud, audio is decoupled from the video in the same way that SMPTE ST 2110 carries audio and video as separate, independently processed streams, which makes timing much more of a challenge as signals need to be resynchronized relative to each other.

In a live broadcast environment, it’s even harder as sources can be coming in from different locations with different transport latencies. In the context of remote production infrastructures, edge (on or near location) processing makes sense for in-ear monitoring; but throw in a few sources from on-prem processors and a couple of assets from MAM systems, and time-alignment starts to look complicated.

And there’s more: working in the cloud creates additional latency due to the requirement for data packetization, as a buffer is introduced directly into the workflow to packetize the data, and another to unpack it at the other end.
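As a rough illustration of how these buffers add up, the sketch below estimates the delay contributed by buffering alone. The function names, the packet size and the receive buffer depth are assumptions chosen for the example, not figures from any particular system, and network transit time is deliberately left out.

```python
def packet_time_ms(samples_per_packet: int, sample_rate_hz: int) -> float:
    """Duration of audio carried in one packet, in milliseconds."""
    return 1000.0 * samples_per_packet / sample_rate_hz


def buffering_latency_ms(samples_per_packet: int,
                         sample_rate_hz: int,
                         receive_buffer_packets: int) -> float:
    """Latency added by buffering alone (network transit time is excluded).

    The sender has to collect a full packet of samples before it can send,
    and the receiver holds a number of packets before playing them out.
    """
    one_packet = packet_time_ms(samples_per_packet, sample_rate_hz)
    return one_packet + receive_buffer_packets * one_packet


if __name__ == "__main__":
    # Illustrative values only: 48 samples per packet at 48 kHz,
    # with a hypothetical receive buffer of 4 packets.
    print(buffering_latency_ms(48, 48_000, 4))
```

With these assumed values each packet carries 1 ms of audio, so buffering contributes 5 ms before any transport delay is counted. Smaller packets or shallower receive buffers reduce the figure, at the cost of less tolerance to network jitter.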

We’re back to having to manage sync and latency, so often the challenge in audio workflows, and achieving deterministic latencies in the cloud is even more demanding when timing can shift during a single broadcast depending on the transport infrastructure. Tracking the timestamp of audio, video and data flows is of paramount importance if they are to be realigned later.
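One simple way to picture realignment is to delay every flow to match the slowest one. The sketch below assumes each flow carries a capture timestamp taken against a clock shared with the receiver, which is an assumption of this example; the flow names and numbers are hypothetical, and real systems will handle this in vendor-specific ways.

```python
from typing import Dict


def observed_latency_ms(capture_ts_ms: float, arrival_ts_ms: float) -> float:
    """Latency of one flow, assuming both timestamps use the same clock."""
    return arrival_ts_ms - capture_ts_ms


def alignment_delays_ms(latencies_ms: Dict[str, float]) -> Dict[str, float]:
    """Extra delay to add to each flow so that all match the slowest flow."""
    slowest = max(latencies_ms.values())
    return {name: slowest - latency for name, latency in latencies_ms.items()}


if __name__ == "__main__":
    # Hypothetical flows, all captured at t = 1000 ms on the shared clock.
    latencies = {
        "venue_audio": observed_latency_ms(1000.0, 1040.0),
        "venue_video": observed_latency_ms(1000.0, 1120.0),
        "mam_clip": observed_latency_ms(1000.0, 1015.0),
    }
    print(alignment_delays_ms(latencies))
```

In this example the video flow is the slowest at 120 ms, so the venue audio is held back by 80 ms and the MAM clip by 105 ms. Because transport latency can shift during a broadcast, a calculation like this would need to be repeated as the measured latencies change.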

The ability to realign these signals and work on a recognizable worksurface – virtual or otherwise – makes everything feel like a more traditional workflow and puts the operator back in control.

Where Is The Processing?

There are techniques which network designers can use to minimize latencies, many of which are to do with location.

Every production will depend on a variety of factors: where the event is taking place, where the processing is located and where the control is happening. Cloud networks are complex, but their flexible nature means that there are many ways a network designer can tailor the network.

For example, we already know that the priority for on-site talent is in-ear monitoring and that edge processing works well for IEMs on site at an event. But the priority for a mix engineer sat at a connected hardware console in a centralized facility might be to minimize control latency.

In a cloud environment, control latencies can be influenced by locating the processing as close to the control as possible; in a pure public cloud environment, for example, network engineers can choose the location of the data center, so choosing one which is closer can make a big difference.
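A minimal sketch of that choice, assuming the engineer has already measured round-trip times from the control location to each candidate data center: the region names and sample values here are placeholders rather than real provider regions, and the median is used so that a single slow measurement does not skew the decision.

```python
from statistics import median
from typing import Dict, List


def pick_region(rtt_samples_ms: Dict[str, List[float]]) -> str:
    """Return the region whose median measured round-trip time is lowest."""
    return min(rtt_samples_ms, key=lambda region: median(rtt_samples_ms[region]))


if __name__ == "__main__":
    # Hypothetical measurements from the control room to three regions.
    samples = {
        "region-a": [38.0, 41.0, 95.0],
        "region-b": [22.0, 24.0, 23.0],
        "region-c": [60.0, 58.0, 61.0],
    }
    print(pick_region(samples))
```

Measured round-trip time is used here instead of geographic distance because the network path between two points is rarely a straight line, although the nearest data center will often also be the quickest to reach.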

This way of working is similar to how Content Delivery Networks (CDNs) are used to speed up data access, keeping the distance between users and servers as short as possible. It is a system which has worked well; Netflix has been working this way as far back as 2016, when it completed the migration of its computing architecture to AWS.

Rapid Change

But once a broadcaster understands what is needed and how it might be achieved, then what next?

At the time of writing, November 2023, there are a number of vendors creating solutions to many of these challenges. There is as yet no consensus as to how software-first cloud production infrastructure should be done and, although there are conversations happening, we are a long way off any standards. It is a technology evolution cycle we have been through many times in broadcast.

There are various approaches in use and in development. Some are pursuing a fairly traditional synchronous approach to timing and latency, seeking to match up the clocking of audio sources from various locations within a cloud environment, whilst others are taking different technical approaches to asynchronous processing within timestamped frameworks.

Any further detailed discussion of these varying methods of implementation must happen with individual vendors about their specific systems.

It’s Already Happening

That’s not to say that cloud-native workflows aren’t being used; they definitely are. We have seen a small number of high-profile instances where primarily software-based cloud production infrastructure has been deployed successfully.

So far, most broadcasters taking advantage of these flexible software production environments are veering towards hybrid systems which use cloud systems in conjunction with other infrastructure.

For audio, content providers are using cloud workflows to perform specific services like localized commentary from international correspondents or multiple language feeds for global broadcast, and the cloud is absolutely fabulous for things like this.

Use cases like this make sense because these workflows already exist and broadcasters are simply using the cloud to add scale to them. But scaling up cloud-native DSP for a remote broadcast of a football match? We’re not quite there yet.

The Hybrid Future

There will always be live broadcasts where on-prem facilities, an on-site Outside Broadcast, a remote production, or a combination of all three will be more efficient. As broadcasters work with technology suppliers to introduce more cloud resources, it will add another dimension to encourage even more hybrid models with the ability to flex depending on the production.

The adoption of the cloud should be about what the broadcaster is looking to achieve and using the best mix of resources to achieve it. The best way to look at it is as another DSP resource which can be accessed as and when it is appropriate, rather than turning to the cloud just because it is there.

Scalability is the end goal, and that scalability will open the door to cover more events at higher quality.

As we said earlier, we’re not there yet.

But watch this space. 
