The Sponsors Perspective: Effectively Using The Power Of Immersive Audio

Lawo’s Christian Scheck takes a tour of console functions and features that have a special place in immersive audio production, and how they are developing.


This article was first published as part of Essential Guide: Immersive Audio Pt 3 - Immersive Audio Objects

One thing that is becoming increasingly clear regarding next-generation audio (NGA) is that, while a lot has already been accomplished, quite a few things still need to be done. Current achievements are the result of daring forays into uncharted territory where the only constant seems to be that Dolby’s Atmos system, MPEG-H and a number of other 3D audio formats are here to stay and will play a prominent part in the evolution of immersive audio for the broadcast world.

Let us first look at the monitoring aspect of NGA productions. Since A1s can no longer predict the ways in which audio productions will be consumed by end-users, the only tools they have at their disposal are a growing number of presentations (delivery formats) they can monitor for consistency and quality.

The more presentations there are, the more you need to monitor, at least occasionally. Long before anyone was aware of an object-based scenario, in the days of channel-based formats like the momentous 22.2 immersive audio project developed by NHK in 2011, it was already clear that integrated monitoring would be a must-have feature. Luckily, the capability for this had been available on Lawo consoles for a long time. For today’s NGA scenarios, A1s need all the help they can get, right at their fingertips.

The reason for this is simple: convenience. The more presentations an audio engineer needs to monitor in a short time, the more important a convenient way of switching among them becomes. Moving around the control room to change settings on an outboard device (a Dolby DP590, say) is not an option.

LAWO immersive in-console control.


A second consideration is what equipment should be available for monitoring in an OB truck or a control room. In specialized user communities, certain A1s have started to advocate the installation of soundbars in an OB truck, for instance, arguing that most NGA productions were likely to be consumed using these space-saving and relatively affordable speaker systems. The same applies to the binaural delivery format for headphones. Increasing one’s chances to accurately predict the result requires monitoring the binaural stems as well.

User Interface

Being able to configure the authoring solution directly from the console is another way of saving time. Lawo’s mc² consoles can retrieve the information from a Dolby DP590, for instance, and display it on their built-in screens.

This is enough in most scenarios, as most broadcast productions use immersive sound in a fairly static way. Discussions about building a solution’s full feature set right into a mixing console are still ongoing.

Being able to remotely control the authoring tool from the console is, of course, very nice, but the console also needs to provide substantially more downmix firepower than most solutions offer today.

What would be an effective and reliable user interface? A system similar to the one jointly developed by NHK and Lawo, based on colored balls that provide a clear indication of a stream’s placement in a three-dimensional space? Operators who have worked with it like the intuitive indication of signal placements.

Tools Rule

This leads us to the next consideration: panning signals in a three-dimensional space and the tools required to perform this effectively. Given the fairly static distribution of signals in an immersive broadcast production (hardly any signals need to whiz about listeners’ heads), most operators seem to agree that a joystick for the X/Y axes and an encoder for the Z axis are easy to grasp.

Added flexibility is provided by functions like “X-Z Swap” on mc² consoles. It allows operators to assign the joystick to the X/Z axes and to use the encoder for controlling the Y axis.
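To make the control assignment concrete, here is a minimal sketch of how a joystick and an encoder could be mapped to a three-dimensional pan position, with and without the swap. It is an illustration only, not Lawo’s implementation: the names `PanPosition` and `map_controls`, the normalized value range and the meaning given to each axis (X as left/right, Y as front/back, Z as height) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PanPosition:
    # Axis meanings are assumed for this sketch, not taken from any console spec.
    x: float = 0.0  # left/right
    y: float = 0.0  # front/back
    z: float = 0.0  # height


def map_controls(joystick, encoder, xz_swap=False):
    """Map a two-axis joystick and a single encoder to a 3D pan position.

    joystick: (horizontal, vertical) deflection of the joystick
    encoder:  value of the rotary encoder
    xz_swap:  if True, the joystick drives X/Z and the encoder drives Y
    """
    horizontal, vertical = joystick
    if xz_swap:
        return PanPosition(x=horizontal, y=encoder, z=vertical)
    return PanPosition(x=horizontal, y=vertical, z=encoder)


# Default assignment: joystick on X/Y, encoder on Z.
default_pos = map_controls((0.5, -0.2), 0.8)

# Swapped assignment: joystick on X/Z, encoder on Y.
swapped_pos = map_controls((0.5, -0.2), 0.8, xz_swap=True)
```

In both cases the X axis stays on the joystick. Only the roles of the joystick’s second axis and the encoder are exchanged.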

So far, this has proven to be the right approach for live broadcast productions using immersive audio. Other controller types, however, are already under consideration.

LAWO mc²56 mkIII pan section.

One For All

It stands to reason that working with ten (5.1.4) or even more channels (7.1.4, 22.2, etc.) requires the ability to control all relevant busses using a single fader and that the metering system needs to accommodate a sufficient number of bar graphs.
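The channel counts quoted above follow directly from the format names, whose dot-separated numbers give the number of channels in each group. A minimal sketch of that arithmetic, using a hypothetical helper named `channel_count`:

```python
def channel_count(layout: str) -> int:
    """Total number of channels in a layout name such as "5.1.4" or "22.2"."""
    return sum(int(part) for part in layout.split("."))


for name in ("5.1.4", "7.1.4", "22.2"):
    print(name, channel_count(name))
```

A single multichannel fader and its metering therefore have to cover 10, 12 and 24 channels respectively for these three formats.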

Since nobody as yet knows for sure what the future will bring, the A__UHD Core architecture is prepared to support any multichannel format natively. This goes hand in hand with the required multi-channel dynamics, i.e. processors able to handle a multitude of channels (rather than a mere six).

All For One

In the light of the complexity facing audio engineers regarding mixing and—even more so—monitoring multiple presentations, multi-user operation looks likely to become the norm. Mixing systems will have to cater to such scenarios, with ample PFL/AFL support, CUT and DIM functions for all channels, and so on.

Going One Meta

Next-generation audio goes beyond 3D immersive audio by providing tools that allow end users to personalize their listening experience. This is a major evolutionary step, which somewhat redefines the A1’s paradigm.

While audio engineers have long lived in a comfort zone of at least a certain degree of objectivity regarding a “good” mix, personalization does away with this. Today’s and tomorrow’s A1s will at best be able to make educated guesses and monitor a rising number of presentations.

Still, this will likely be insufficient for satisfactory listening experiences in users’ homes if the decoders stationed there have insufficient information regarding the stems they are receiving.

Enter the next buzzword for NGA productions: “metadata”. That is, clear descriptions of the playback range of stems. Several approaches are being discussed regarding the kind of data that will be required by renderers to make sense of what comes in and what end users expect them to send out. A simple example in this respect is the commentary voice: with “low”, “mid” and “high” settings, it will be easier for end users (and their renderers) to achieve a predictable result.
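As a purely illustrative sketch of what such a description might look like, the example below attaches a permitted gain range and three named settings to a commentary stem and shows how a renderer could keep the listener’s choice within that range. The field names, the decibel values and the `resolve_gain` helper are hypothetical. They do not reflect the metadata syntax of MPEG-H, Dolby Atmos or any other real format.

```python
from dataclasses import dataclass, field


@dataclass
class StemMetadata:
    # All fields are hypothetical and exist only for this illustration.
    name: str
    min_gain_db: float  # lowest gain the renderer may apply
    max_gain_db: float  # highest gain the renderer may apply
    presets: dict = field(default_factory=dict)  # setting name -> requested gain in dB


def resolve_gain(meta: StemMetadata, choice: str) -> float:
    """Return the gain for the chosen setting, limited to the stem's permitted range."""
    requested = meta.presets[choice]
    return max(meta.min_gain_db, min(meta.max_gain_db, requested))


# Made-up values: the "high" setting asks for more gain than the stem permits.
commentary = StemMetadata(
    name="commentary",
    min_gain_db=-6.0,
    max_gain_db=6.0,
    presets={"low": -6.0, "mid": 0.0, "high": 9.0},
)

for setting in ("low", "mid", "high"):
    print(setting, resolve_gain(commentary, setting))
```

With a range like this carried alongside the stem, the renderer can honor the listener’s preference while staying inside limits set during production.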

We definitely are living in interesting times…
