The Sponsors Perspective: Effectively Using The Power Of Immersive Audio

The LAWO mc²56 mkIII with immersive control shown on-screen.

Lawo’s Christian Scheck takes a tour of console functions and features that have a special place in immersive audio production, and how they are developing.

This article was first published as part of Essential Guide: Immersive Audio Pt 3 - Immersive Audio Objects

One thing that is becoming increasingly clear regarding next-generation audio (NGA) is that, while a lot has already been accomplished, quite a few things still need to be done. Current achievements are the result of daring forays into uncharted territory where the only constant seems to be that Dolby’s Atmos system, MPEG-H and a number of other 3D audio formats are here to stay and will play a prominent part in the evolution of immersive audio for the broadcast world.

Let us first look at the monitoring aspect of NGA productions. Since A1s can no longer predict the ways in which audio productions will be consumed by end-users, the only tools they have at their disposal are a growing number of presentations (delivery formats) they can monitor for consistency and quality.

The more presentations there are, the more you need to monitor, at least occasionally. Long before anyone was aware of an object-based scenario, when channel-based formats like the momentous 22.2 immersive audio project developed by NHK in 2011 were the state of the art, it was already clear that integrated monitoring would be a must-have feature. Luckily, the capability for this had been available on Lawo consoles for a long time. For today’s NGA scenarios, A1s need all the help they can get, right at their fingertips.

The reason for this is simple: convenience. The more presentations audio engineers need to monitor in a short time, the more important a convenient way of switching among them becomes. Moving around the control room to change settings on an outboard device (a Dolby DP590, say) is not an option.

LAWO immersive in-console control.

A second consideration is what equipment should be available for monitoring in an OB truck or a control room. In specialized user communities, certain A1s have started to advocate the installation of soundbars in an OB truck, for instance, arguing that most NGA productions are likely to be consumed using these space-saving and relatively affordable speaker systems. The same applies to the binaural delivery format for headphones: predicting the result more accurately requires monitoring the binaural stems as well.

User Interface

Being able to configure the authoring solution directly from the console is another way of saving time. Lawo’s mc² consoles can retrieve the information from a Dolby DP590, for instance, and display it on their built-in screens.

This is enough in most scenarios, as most broadcast productions use immersive sound in a fairly static way. Discussions about building a solution’s full feature set right into a mixing console are still ongoing.

Being able to remotely control the authoring tool from the console is, of course, very nice, but the console also needs to provide substantially more downmix firepower than most solutions offer today.

What would be an effective and reliable user interface? A system similar to the one jointly developed by NHK and Lawo, based on colored balls that provide a clear indication of a stream’s placement in a three-dimensional space? Operators who have worked with it like the intuitive indication of signal placements.

Tools Rule

This leads us to the next consideration: panning signals in a three-dimensional space and the tools required to perform this effectively. Given the fairly static distribution of signals in an immersive broadcast production (hardly any signals need to whiz about listeners’ heads), most operators seem to agree that a joystick for the X/Y axes and an encoder for the Z plane are easy to grasp.

Added flexibility is provided by functions like “X-Z Swap” on mc² consoles. It allows operators to assign the joystick to the X/Z axes and to use the encoder for controlling the Y axis.

So far, this has proven to be the right approach for live broadcast productions using immersive audio. Other controller types, however, are already under consideration.
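The control assignment described above can be sketched in a few lines. This is only an illustration, not Lawo's implementation: the names `PanPosition` and `apply_controls`, the normalized range of -1.0 to 1.0, and the reading of X as left/right, Y as front/back and Z as height are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PanPosition:
    """Hypothetical source position; each axis is assumed to span -1.0..1.0."""
    x: float = 0.0  # assumed: left/right
    y: float = 0.0  # assumed: front/back
    z: float = 0.0  # assumed: height


def clamp(value: float) -> float:
    """Keep a control value inside the assumed -1.0..1.0 range."""
    return max(-1.0, min(1.0, value))


def apply_controls(joy_a: float, joy_b: float, encoder: float,
                   xz_swap: bool = False) -> PanPosition:
    """Map two joystick axes and one encoder onto a 3D position.

    Default: the joystick drives X/Y and the encoder drives Z.
    With xz_swap: the joystick drives X/Z and the encoder drives Y.
    """
    if xz_swap:
        return PanPosition(x=clamp(joy_a), y=clamp(encoder), z=clamp(joy_b))
    return PanPosition(x=clamp(joy_a), y=clamp(joy_b), z=clamp(encoder))


normal = apply_controls(0.5, -0.25, 1.5)
swapped = apply_controls(0.5, -0.25, 1.5, xz_swap=True)
```

In this sketch the same physical gestures yield a different position once the swap is active: the second joystick axis moves the source in height, while the encoder takes over front/back placement.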

LAWO MC²56 mkIII pan section.

One For All

It stands to reason that working with ten (5.1.4) or even more channels (7.1.4, 22.2, etc.) requires the ability to control all relevant busses using a single fader and that the metering system needs to accommodate a sufficient number of bar graphs.
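The channel counts quoted above follow directly from the format labels: adding up the dot-separated fields gives the number of channels a single fader has to control, and hence the number of bar graphs the metering needs. A minimal sketch, where the function name `channel_count` is made up for this example:

```python
def channel_count(layout: str) -> int:
    """Sum the dot-separated fields of a layout label such as "5.1.4"."""
    return sum(int(field) for field in layout.split("."))


# Labels taken from the text, plus plain 5.1 for comparison.
counts = {label: channel_count(label) for label in ("5.1", "5.1.4", "7.1.4", "22.2")}
```

Here `counts` maps 5.1 to 6, 5.1.4 to 10, 7.1.4 to 12 and 22.2 to 24 channels.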

Since nobody as yet knows for sure what the future will bring, the A__UHD Core architecture is prepared to support any multichannel format natively. This goes hand in hand with the required multi-channel dynamics, i.e. processors able to handle a multitude of channels (rather than a mere six).

All For One

In light of the complexity audio engineers face in mixing and, even more so, monitoring multiple presentations, multi-user operation looks likely to become the norm. Mixing systems will have to cater to such scenarios, with ample PFL/AFL support, CUT and DIM functions for all channels, and so on.

Going One Meta

Next-generation audio goes beyond 3D immersive audio by providing tools that allow end users to personalize their listening experience. This is a major evolutionary step, which somewhat redefines the A1’s paradigm.

While audio engineers have long lived in a comfort zone of at least a certain degree of objectivity regarding a “good” mix, personalization does away with this. Today’s and tomorrow’s A1s will at best be able to make educated guesses and monitor a rising number of presentations.

Still, this will likely not be enough for satisfactory listening experiences in users’ homes if the decoders stationed there lack sufficient information about the stems they are receiving.

Enter the next buzzword for NGA productions: “metadata”. That is, clear descriptions of the playback range of stems. Several approaches are being discussed regarding the kind of data that will be required by renderers to make sense of what comes in and what end users expect them to send out. A simple example in this respect is the commentary voice: with “low”, “mid” and “high” settings, it will be easier for end users (and their renderers) to achieve a predictable result.
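To make the commentary example concrete, here is a purely hypothetical sketch of how a renderer might turn such a setting into a gain. The preset names come from the text; the decibel offsets, the dictionary `COMMENTARY_PRESETS_DB` and the function `commentary_gain` are invented for illustration and do not reflect the metadata of MPEG-H, Dolby Atmos or any other format.

```python
# Hypothetical offsets in dB for the commentary stem; real values would be
# defined by the production's metadata, not by the renderer.
COMMENTARY_PRESETS_DB = {"low": -6.0, "mid": 0.0, "high": 6.0}


def commentary_gain(setting: str) -> float:
    """Return the linear gain factor for a named commentary setting."""
    if setting not in COMMENTARY_PRESETS_DB:
        raise ValueError(f"unknown commentary setting: {setting!r}")
    # Standard conversion from decibels to a linear amplitude factor.
    return 10.0 ** (COMMENTARY_PRESETS_DB[setting] / 20.0)


gains = {name: commentary_gain(name) for name in COMMENTARY_PRESETS_DB}
```

With a small, named set of choices like this, the end user picks a label and the renderer applies a level the A1 has already auditioned, which is what makes the result predictable.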

We definitely are living in interesting times…
