The Sponsors Perspective: An Ambisonics Primer

Sennheiser examines the theory, implementation, and uses of the Ambisonic soundfield, and its important role in the immersive audio world.

This article was first published as part of Essential Guide: Immersive Audio Pt 2 - Immersive Audio Compatibility

Ambisonics is probably the original speaker-agnostic immersive format, and it’s been waiting a while for everyone to catch up. If you’re familiar with the Mid-Side microphone technique, that gives you an idea of how this format works. In those terms, ‘first-order’ Ambisonics is essentially a central omni-directional ‘mid’ or pressure component (W), plus three different ‘side’ figure-of-eights orientated front-back (X), left-right (Y), and up-down (Z). These four signals make up the so-called ‘B-Format’ first-order Ambisonic format.
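To make the W, X, Y, Z idea concrete, here is a minimal sketch of how a single mono sample could be panned into first-order B-Format. It assumes the traditional convention in which azimuth is measured anticlockwise from straight ahead, elevation is measured upwards, and W is attenuated by 1/√2. The function name `encode_first_order` is purely illustrative and does not come from any particular tool.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumptions: azimuth is anticlockwise from straight ahead, elevation
    is upwards, and W uses the traditional 1/sqrt(2) scaling.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omni 'mid' component
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z
```

With these assumptions, a source directly in front lands entirely in W and X, while a source hard left lands in W and Y.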

This is not an object-based format like Atmos. In fact, if you tried to split Ambisonics up into individual objects, each with its own position, you would defeat one of its most useful features. All components, together, form the entire soundfield and are, as such, inseparable. However, it is a speaker-agnostic immersive format, as it describes a full 360-degree sound field without referencing speaker positions.

Because of the way this format stores the soundfield, it can easily be ‘decoded’ for any type of speaker set-up or number of speakers. Panning and effects can also be implemented directly in B-format, which maintains that speaker-agnostic status. This explains its starring role in the upcoming 360-degree video boom, especially with live head tracking, which enables audio sources to remain static in the space while the video reflects the viewing angle. The soundfield can also be processed relatively easily with environment and/or HRTF information at the replay end, if required, to enhance it for headphones (see Essential Guide “Immersive Audio – Part 1” on binaural audio and the personalised HRTF).
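As a rough illustration of both points, the sketch below rotates a first-order B-Format sample about the vertical axis (the kind of operation head tracking relies on) and then performs a very basic horizontal decode by aiming a virtual cardioid microphone at each speaker. It assumes W carries the traditional 1/√2 scaling and that angles are anticlockwise from straight ahead. The function names are hypothetical, and real decoders are considerably more refined than this.

```python
import math

def rotate_yaw(w, x, y, z, angle_deg):
    """Rotate a first-order B-Format sample about the vertical axis.

    A positive angle turns the whole soundfield anticlockwise (to the left).
    W and Z are unaffected by a rotation about the vertical axis.
    """
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z

def compensate_head_yaw(w, x, y, z, head_yaw_deg):
    """Counter-rotate the field so sources stay put when the head turns."""
    return rotate_yaw(w, x, y, z, -head_yaw_deg)

def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Basic decode: one virtual cardioid aimed at each speaker azimuth.

    Assumes W carries the traditional 1/sqrt(2) scaling.
    """
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feed = 0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az))
        feeds.append(feed)
    return feeds
```

Under these assumptions, the same four signals can feed a square of four speakers or a ring of eight simply by changing the list of azimuths, and a listener turning their head 90 degrees to the left hears a frontal source move to their right.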

The Sennheiser AMBEO A-B converter for transforming A-Format channels into B-Format Ambisonics components.

Getting High

Higher Order Ambisonics (HOA) - anything above 1st order - effectively increases the number of ‘sides’ in our virtual Ambisonic microphone (except they’re no longer figure-of-eights), using a mathematical idea termed spherical harmonics. As you work your way up the ‘orders’ of Ambisonics, the effective resolution of the sound field increases, the sweet spot gets bigger, and the number of channels required goes up too. A full-sphere signal of order N needs (N + 1)² channels, so second-order Ambisonics needs nine channels and third-order needs 16.
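The channel counts quoted above follow the standard rule for full-sphere Ambisonics, where an order-N signal carries (N + 1) squared components. A small sketch, with an illustrative function name:

```python
def ambisonic_channel_count(order):
    """Return the number of channels for full-sphere Ambisonics of a given order."""
    if order < 0:
        raise ValueError("order must be zero or greater")
    return (order + 1) ** 2

# Print the channel count for the first few orders.
for order in range(1, 4):
    print(order, ambisonic_channel_count(order))
```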

You don’t need a microphone to create and work with an Ambisonic sound field. There are plenty of Ambisonic panning and processing tools available for different platforms, DAWs, phones, and so on, including headphone encoders for working on Ambisonics when you don’t have the luxury of lots of speakers, along with head tracking options so you can effectively monitor your head-tracking-enabled VR 360 mixes.

Ambisonic audio is specified as an option for both MPEG-H Audio and for DTS-UHD, and therefore can be part of DVB-MPEG/UHD or ATSC 3.0 broadcasts.

Slightly confusingly, standard formats and files for higher order Ambisonics are rather fraught with variations, mainly because there are different options for the derivation and ordering of the spherical harmonic components. The main sequences are ACN and Furse-Malham (FuMa). ACN starts with WYZX for 1st order while FuMa starts with WXYZ. It’s important to be aware of the potential for mixing up the order, which will definitely lead to a disappointing, or disorientating, Ambisonic experience. There are also different options for the normalisation of those components, such as maxN (for FuMa ordering), SN3D, N3D, and more. Of the proposed file formats, AmbiX seems to be the most popular option and is scalable to any order. It uses ACN ordering, SN3D normalisation, and the core audio format (.caf) container.
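For first order only, the difference between the two conventions is small enough to sketch in a few lines: the channels are reordered from WXYZ to WYZX, and W is scaled up by √2 because FuMa attenuates it by 3 dB while SN3D does not. The function below is an illustrative assumption rather than part of any named tool, and higher orders need additional per-channel gains that are not shown here.

```python
import math

def fuma_to_ambix_first_order(w, x, y, z):
    """Convert one first-order sample from FuMa (WXYZ) to AmbiX (ACN/SN3D).

    First order only: W gains sqrt(2); X, Y and Z keep their level and
    are simply reordered to W, Y, Z, X.
    """
    return w * math.sqrt(2.0), y, z, x

def ambix_to_fuma_first_order(w, y, z, x):
    """Inverse conversion: AmbiX (W, Y, Z, X) back to FuMa (W, X, Y, Z)."""
    return w / math.sqrt(2.0), x, y, z
```

Feeding FuMa material into a decoder that expects AmbiX without a step like this would swap the left-right, up-down, and front-back axes, which is exactly the mix-up described above.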

YouTube and Facebook now support 360 video and Ambisonic audio. In fact, there is a free software suite called Facebook 360 Spatial Workstation for designing spatial audio for Facebook, which is also compatible with YouTube 360 spatial audio metadata. YouTube’s encoding process specifies the Spatial Media Metadata Injector.

Ambisonic Microphones

The standard way of recording Ambisonics has always been a tetrahedral array of cardioid capsules. This was first seen in the Soundfield Microphone, brought to market in the 70s by Calrec. More recently, a good number of tetrahedral array mics have come to market, made economically viable by the upsurge of interest in immersive audio and probably, in particular, the 360 video trend.

The Sennheiser AMBEO VR microphone for Ambisonic recording.

The raw audio from a tetrahedral array of cardioid microphones is normally termed ‘A-format’. This can then be transformed into the B-Format 1st-order Ambisonic components of W, X, Y, and Z.
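The core of that transformation is a set of sums and differences, sketched below. The sketch assumes the four capsules point front-left-up (FLU), front-right-down (FRD), back-left-down (BLD), and back-right-up (BRU), which is a common tetrahedral layout but should be checked against the microphone in use. It also leaves out the gain scaling and capsule-spacing correction filters that a real converter applies, so it is an outline of the idea rather than a usable converter.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert one A-Format sample set to unscaled first-order B-Format.

    Assumed capsule directions: FLU = front-left-up, FRD = front-right-down,
    BLD = back-left-down, BRU = back-right-up. Gain scaling and correction
    filters are deliberately omitted.
    """
    w = flu + frd + bld + bru  # sum of all capsules: omni component
    x = flu + frd - bld - bru  # front minus back
    y = flu - frd + bld - bru  # left minus right
    z = flu - frd - bld + bru  # up minus down
    return w, x, y, z
```

A sound that reaches all four capsules equally ends up only in W, while a sound picked up mainly by the two front capsules produces a positive X.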

The Sennheiser AMBEO VR microphone is one such product and fits into the Sennheiser AMBEO immersive technology landscape along with products like the free AMBEO Orbit plug-in for mixing various sources into binaural audio, plug-ins from its partner in VR, Dear Reality, the Neumann KU 100 dummy head microphone, and - for the end-user - the high-end Sennheiser AMBEO Soundbar.

The AMBEO VR microphone uses four matched KE 14 capsules and outputs four corresponding audio channels for the A-Format feed. It also comes with the A-B converter tool for getting the A-Format signal into a DAW in B-Format with various adjustments, such as FuMa or AmbiX ordering / normalisation, microphone position, and filters.

Ambisonic Potential

The rise of Ambisonics has been a long time coming. People are waking up to the advantages of speaker-agnostic immersive audio, and the consumer now has the technology and every opportunity to experience it in many convenient forms. Together, these are driving this boost.

It fits very nicely into the grand immersive scheme along with object-based audio, channel-based beds with height, and binaural audio for headphones, which is why it’s included in the MPEG-H Audio and DTS-UHD specs. A-format capture is well-suited to encoding into a channel-based bed as well, so even if you didn’t want to include the raw Ambisonic channels, the techniques and technology can be the basis of a high-quality ambience feed for sports broadcast and so on.

Ambisonics should be a valuable part of your immersive audio toolbox.
