The Sponsors Perspective: The Personal HRTF - An Aural Fingerprint

Could personalized HRTFs be the next big step change for both the consumer and professional immersive headphone experience?

HRTF stands for Head Related Transfer Function and, simply put, is a catch-all term for the characteristics a human head imparts on sound before it enters the ear canal. Everything from the level and tonal changes caused by our head, shoulders, and pinnae (the external parts of the ear) to the arrival-time difference between the two ears (Interaural Time Difference, or ITD) has an effect on our perception of the direction and distance of sources.
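To give a feel for the size of the ITD, here is a minimal sketch using Woodworth's classic spherical-head approximation, ITD = (a / c)(θ + sin θ). This formula is not from the article; it is a textbook simplification that ignores the pinnae and torso entirely, and the head radius and speed of sound below are assumed typical values rather than measurements.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C
HEAD_RADIUS_M = 0.0875      # assumed: a commonly quoted average head radius


def woodworth_itd(azimuth_deg, head_radius_m=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_S):
    """Approximate ITD in seconds for a distant source in the horizontal plane.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    The spherical-head formula used here is only meant for 0..90 degrees.
    """
    if not 0 <= azimuth_deg <= 90:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))


for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1e6:6.1f} microseconds")
```

With these assumed values the ITD rises from zero for a source straight ahead to roughly 0.66 ms for a source directly to one side. A real head departs from the sphere model, which is part of what makes each HRTF individual.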

This article was first published as part of Essential Guide: Immersive Audio Pt 1 - An Immersive Audio Primer

It’s a concept that explains why binaural sound needs headphones, for example. If you record a source by sticking two microphones in your ears, that recording will incorporate your HRTF for both the direct sound and the room reflections. If you then play that back through speakers, the HRTF effect becomes a disadvantage because of pronounced coloring and new room reflections clashing against the recording. The source would have to be replayed through headphones to avoid the HRTF effect being imparted a second time.

A Place of Your Own

Part of the issue with immersive audio reproduced through speakers in a space is the effective localization of sources within that space. With object-based reproduction or with wave field synthesis you can approximate actual source position, but in the end it all gets injected into your ear canal after processing by your own HRTF. Therefore, a binaural source over headphones should be capable of producing the ultimate immersive experience.

However, everyone has their own personal HRTF. Our aural perception filter is as personal as a fingerprint. A generic binaural signal such as might be recorded with a ‘dummy head’ microphone will be a good approximation, but to a certain extent it will always be like looking through someone else’s spectacles.

What if you could easily measure and define your own HRTF? That could then be used by rendering engines to produce a personalized binaural feed from any source – including the most extreme object- and scene-based immersive formats. Set-top boxes, sound cards, games, mixing console monitoring sections, and DAWs could all incorporate rendering engines based on personalized HRTFs.

Enter SOFA

The SOFA file, or ‘Spatially Oriented Format for Acoustics’, is a general-purpose file format for storing spatial acoustic data, standardized by the AES as ‘AES69’. The data does not have to be an HRTF; it could equally describe a specific listening position in a room, or model the full acoustic response of a concert hall at various positions, for example.

The data is made up of multiple impulse responses, each a representation of how a given input is changed at an output. In the case of measuring an HRTF, each impulse response represents a measurement for one ear from a particular direction, defined by elevation and azimuth. Therefore, to measure an HRTF with microphones you need to take enough responses to adequately represent the full source sphere around a test subject.
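How a renderer uses one such pair of impulse responses can be sketched in a few lines: the mono source is convolved with the left-ear and right-ear responses for the chosen direction. This is an illustrative sketch only; the function names and the toy impulse responses are made up for the example, and real responses are typically hundreds of samples long and processed with FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one fixed source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy, made-up responses for a source on the listener's left:
# the right ear receives the sound two samples later and at half the level.
hrir_left = [1.0, 0.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5, 0.0]

mono = [1.0, -1.0, 0.5]
left, right = render_binaural(mono, hrir_left, hrir_right)
print("left: ", left)
print("right:", right)
```

Even in this toy case the two outputs differ in both timing and level, which are the cues the ear uses. A measured response adds the direction-dependent tonal shaping on top.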

How many responses is enough? Well, this method of modelling and quantifying HRTFs is not new. The HRTF Database from the CIPIC Interface Laboratory at the University of California, Davis, has been in existence for some time, with a compiled library of HRTFs where each one is made up of 1250 directional readings for each ear of the subject. However, numbers of readings in the 200 region are more common, such as for the Listen Library, which was a joint project between microphone and headphone manufacturer AKG and IRCAM (Institute for Research and Coordination in Acoustics/Music).
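One way to see why the number of readings matters: a simple renderer might pick the measured direction closest to where the source should appear, and the sparser the set, the further away that nearest measurement can be. The sketch below assumes a nearest-neighbour lookup and two made-up regular grids; neither reflects the actual layout of the databases mentioned above, and practical renderers usually interpolate between neighbours rather than snapping to one.

```python
import math


def to_unit_vector(azimuth_deg, elevation_deg):
    """Convert a direction in degrees to a 3D unit vector."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def angular_distance_deg(a, b):
    """Great-circle angle in degrees between two (azimuth, elevation) pairs."""
    dot = sum(x * y for x, y in zip(to_unit_vector(*a), to_unit_vector(*b)))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def nearest_direction(target, measured):
    """Return the measured direction with the smallest angle to the target."""
    return min(measured, key=lambda d: angular_distance_deg(target, d))


def regular_grid(step_deg):
    """Hypothetical measurement grid: elevations -30..90, all azimuths."""
    return [(az, el)
            for el in range(-30, 91, step_deg)
            for az in range(0, 360, step_deg)]


target = (47, 12)  # desired source direction: (azimuth, elevation) in degrees
for step in (30, 10):
    grid = regular_grid(step)
    best = nearest_direction(target, grid)
    error = angular_distance_deg(target, best)
    print(f"{len(grid):4d} directions: nearest is {best}, {error:.1f} deg away")
```

The denser grid can never do worse here, because it contains every direction in the coarse one.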

Aural ID

Thankfully, an alternative to sitting in an anechoic chamber for several hours is here… Genelec recently announced its new Aural ID process, which models an individual’s HRTF and compiles it into a SOFA file without sticking microphones in your ears.

The idea is to create each model from a 360-degree video of the head and shoulders of each customer that can be acquired simply on a high-quality mobile phone.

Simplified HRTF: a couple of HRTF aspects that help determine source direction. HRTF is more complicated than this though, as it involves the entire upper torso and acts in three dimensions, where both elevation and azimuth are relevant.

That video is uploaded to the Genelec web-based calculation service, which builds a virtual 3D model, including especially detailed modelling of the pinna. This model is put into a full wave analysis of the HRTF using lots of virtual sources from many angles, which in turn generates the full HRTF data and the SOFA file.

Once you have your own personal HRTF data, a rendering engine can personalize any sound reproduction specifically for your headphones, bringing stereo and immersive content straight to your ear canals, and missing out those pesky monitors.

Of course, the monitors themselves, the room they are in, head movements, and other people listening with you have such a significant effect on a social listening experience that Aural ID is unlikely to spell the end of monitors just yet (something Genelec is no doubt pleased about), but this technology does have some significant practical applications and advantages in both consumer and professional worlds.

Immersive games should get a big reality boost for a start, and if mixing on headphones is necessary, it won’t be such a hit-and-miss affair if your DAW or console headphone output can model stereo, surround, and immersive experiences comparable to loudspeaker reproduction at the touch of a button.

The Aural ID service should be available from Genelec very soon.

The SOFA file format is already in use in game development and is specified as the format of choice for Steam Audio from Valve Corporation, for example - a solution for developers that integrates environment and listener simulation.

Personalized HRTFs can be loaded into the Unity, FMOD, Unreal, and C environments, so expect to be able to load your Aural ID into your favorite VR game in the not-too-distant future...

A Head Related Future

In the creative space, you could argue that awareness of HRTF and its effects could inform mixers and engineers to an extent, particularly in narrative audio and effects for film and TV, for example. But because of the issues around headphones versus monitors, and the complications in generating content for every eventuality, history has generally settled for ignoring HRTF principles, choosing to mix on monitors and leave everything else to take care of itself. Binaural productions have tended to be niche products because translation has been best assured using in-room monitoring.

However, listening habits are changing and more people are putting on headsets and consuming content as a personal experience. Real-time rendering of a binaural experience from immersive source material is already happening and will be directly relevant to how we approach broadcast audio production in the future.
