The Sponsors Perspective: Deep Dive Into Immersive Audio For All

The familiar “Faster, Higher, Stronger” motto is not just for Olympic athletes but also applies to the common man (and woman). The first black-and-white television sets were considered a sensation until color TV came along, opening a whole new dimension to the viewing audience. In parallel, cinema screens grew ever taller and wider. This went hand in hand with the switch from mono to stereo and a stunning series of dramatic improvements in picture quality. While 4K is still in the process of being implemented, more daring early adopters are already preparing their 8K offerings.


This article was first published as part of Essential Guide: Immersive Audio Pt 1 - An Immersive Audio Primer

Similar leaps and bounds occurred in the audio domain and were considered heaven-sent by the entertainment and broadcast sectors where operators were on the lookout for new, ever more spectacular ways of engaging with their audiences. Just think of quadrophony in the early 1970s or sports, concert and show telecasts in 5.1-Surround. What quickly became a standard in movie theaters eventually only made it into a disappointing number of home cinemas—the living room of most families firmly remained a “no-fly zone”, mainly for space and practical reasons. For a while, this seemed to put the great expectations and the ostensibly bright future of surround sound (5.1, 7.1, Dolby Surround, etc.) for the masses to rest.

The Irony Of Fate

True to the saying that hope is the last to die, countless public broadcasters and broadcast facilities kept investing in 5.1 audio, to the point where they could no longer offer a “proper stereo” sound to viewers with “only” two HiFi speakers. Down-conversions of their original multi-channel format were all they could muster.

Leading console manufacturers were only too willing to lend a helping hand and quickly came up with 5.1-bus desks. Still, as early as 2003, Lawo went one better by refusing to restrict its mc²-series consoles to a fixed number of channels per bus. The engineers in Rastatt realized that, for a convincing result, 3D audio required a Z axis for vertical localization and hence more than six channels. Some call this approach “9.1”, others refer to it as “5.1.4”, “7.1.4”, etc.
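To make the naming concrete: layout labels of this kind are conventionally read as ear-level speakers, then LFE channels, then (optionally) height speakers, so the total channel count is simply the sum of the parts. The following is a minimal sketch under that assumption; the function name `channel_count` is purely illustrative and not part of any console or codec API.

```python
def channel_count(label: str) -> int:
    """Return the total number of channels implied by a layout label.

    Assumes the common convention "ear-level.LFE[.height]",
    e.g. "5.1" or "5.1.4". Any other label shape is rejected.
    """
    parts = label.split(".")
    if not 2 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unsupported layout label: {label!r}")
    # Every field counts discrete channels, so the total is their sum.
    return sum(int(p) for p in parts)


if __name__ == "__main__":
    for label in ("5.1", "9.1", "5.1.4", "7.1.4", "22.2"):
        print(label, "->", channel_count(label), "channels")
```

Read this way, “9.1” and “5.1.4” both add up to ten channels, which is why the two labels can describe the same setup, while NHK's 22.2 comes to 24.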

For the London Olympics in 2012, the Japanese public broadcaster NHK devised its “Super Hi-Vision” project, which relied on 22.2 channels to match the revolutionary 8K picture quality with a genuinely immersive audio experience. Having been instrumental in bringing this project to fruition, Lawo leveraged its experience for the development of its Immersive Mixing Engine (LIME).

This seemed to mark the beginning of a bright future for immersive audio technologies, not least because there is currently an abundance of 3D solutions: Dolby Atmos, MPEG-H, AURO-3D, DTS:X, NHK 22.2, IMAX 6.0 and 12.0 as well as Sennheiser AMBEO 9.1 and later. The technology is available, and there are clear signs that new developments will be announced in 2019. The only snag is that general adoption of this immersive prowess has been so subdued that some broadcasters are seriously considering shutting down their 5.1 operations because the cost is hard to justify. How come?

Practicability Comes Before Immersion

The main reason seems to be that consumers had to wait until 2019 before they could reasonably be expected to partake in these developments. As more than two speakers (preferably hidden in the TV set) are simply not an option in most households, some manufacturers started developing soundbars and tools for binaural signal processing for headphones. These two listening solutions offer the advantage that they require next to no space while providing an immersive experience so convincing that no listener who has tried it will ever want to return to the two-dimensional stereo world.

Providing a consumer-friendly infrastructure is thus paramount for the triumph of immersive audio technology and the success of any “new” playback format. Only now does it seem sensible for radio and TV channels to plan full speed ahead and match their superior picture quality with an equally rich audio rendition. The good news therefore is that all the pieces of the puzzle are slowly falling into place and will soon put a smile on the faces of both content providers and consumers.

And there is more: the object-based approach offers the additional benefit that consumers will soon be able to adjust the level balance to their own liking, i.e. to personalize their listening experience. This will come in handy for those wishing to improve the intelligibility of dialog without raising the overall playback level to a degree that disturbs those around them. Other options will include muting commentary altogether for an unfiltered live experience. This is possible because the audio information is supplied as distinct channel groups, or stems, whose levels can be adjusted individually. It is left to the content providers’ discretion how far they want to go with these additional options.
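As a rough illustration of what such personalization amounts to on the playback side, the sketch below sums a handful of stems after applying a listener-chosen level offset to each. It is a deliberately simplified assumption, not how Dolby Atmos, MPEG-H or any Lawo product actually renders audio: the stems are plain lists of mono samples, and the names `personalize`, `offsets_db` and `muted` are hypothetical.

```python
def db_to_gain(level_db: float) -> float:
    """Convert a level offset in decibels to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def personalize(stems, offsets_db=None, muted=()):
    """Mix equal-length mono stems with per-stem level offsets.

    stems      -- dict mapping a stem name to a list of samples
    offsets_db -- dict mapping a stem name to an offset in dB (default 0 dB)
    muted      -- names of stems to leave out of the mix entirely
    """
    offsets_db = offsets_db or {}
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same length")
    mix = [0.0] * lengths.pop()
    for name, samples in stems.items():
        if name in muted:
            continue  # e.g. drop the commentary for a pure stadium feel
        gain = db_to_gain(offsets_db.get(name, 0.0))
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    return mix


if __name__ == "__main__":
    # Illustrative sample values only.
    stems = {
        "dialog": [0.1, 0.2],
        "commentary": [0.5, 0.5],
        "ambience": [0.2, 0.1],
    }
    # Lift the dialog by 6 dB and mute the commentary.
    print(personalize(stems, offsets_db={"dialog": 6.0}, muted={"commentary"}))
```

The point of the sketch is that the listener's choices only touch the gain applied to each stem; the stems themselves arrive unchanged, which is why their quality matters so much.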

Close up on an immersive panning technology.

What Does This Mean for The Production Side?

Current trends seem to indicate that the immersive audio realm will eventually split into two territories—Asia and Rest of the World: In the US and Europe, Dolby Atmos is in the process of asserting itself, while Korea and China have set their ears on MPEG-H.

The immersive approach for objects, OTT content, binaural mixes, sound image personalization, etc., means more work for audio engineers. For instance, the fact that audio content is consumed on a variety of platforms (telecasts, cable, internet streaming) and needs to sound convincing on all of them requires substantially more monitoring than before. And we haven’t even touched on the errors likely to occur in stressful situations when most processing parameters can only be tweaked on outboard gear.

Features like Automix and Audio-follows-Video are generally considered a given. With its KICK software released in 2015, Lawo added automated control of mc² channels based on external tracking data. This system for “crisp”, close-miked audio on soccer pitches is already mandatory in Germany’s Bundesliga (first division), where highly complex mixing operations are carried out without the slightest artefact or phase glitch. KICK’s reliability is such that the automation routines remain rock-solid all through extra time and the ensuing penalty shootout: all level balances remain consistent and reproducible, leaving sound engineers (A1s) more time for other important tasks, like the overall mix.

Consistency is even more important in object-based offerings, where consumers need to gain a tangible advantage. Personalization will soon allow consumers to change the level balance. Yet this only makes sense if the provided audio objects are pro-grade. Crossfades, level jumps and crosstalk are artefacts content providers need to avoid at all costs for a satisfactory listening experience.

Soundbars are taking over: The Yamaha MusicCast YSP-5600 sound bar uses sound beam technology and 44 speakers to produce a 7-channel surround image, including two height channels.

The Destination Is The Journey

The rising adoption of immersive audio can be felt everywhere—the number of new movie theaters with cutting-edge 3D audio technology keeps growing almost by the day.

A new soundbar generation seems to have so much going for it that it will only be a matter of time before consumers give in to the undisputed advantages of immersive 3D sound reproduction. Immersive formats optimized for headphones will further stoke demand.

The appeal of the interactive, object-based formats that broadcasters and broadcast facilities will soon offer cannot be overestimated. Their nicest side effect is that every consumer can shape their own listening experience by boosting some signals (and attenuating others), shortening contributions based on object metadata without losing important bits of information, or recreating the live atmosphere they remember from a stadium, arena, etc. In combination, these factors offer enormous potential for the market, and broadcasters will happily cater to these new expectations.

Lawo supports the adoption of immersive audio through the seamless integration of all relevant control features into its mc² consoles and will launch new solutions in the very near future. The next Olympic Games and other high-profile events are just around the corner. High time, then, that all parties concerned got cracking. From Lawo’s point of view, the year 2019 will see widespread acceptance of immersive audio. Stay tuned for Lawo’s product announcements later this year!
