Studio B control room at Japanese game developer Capcom's facility, featuring Genelec The Ones for 7.1.4 monitoring.
Immersive audio has long been promoted as the natural partner of new, enveloping video technologies. First there was stereoscopic 3D and now Ultra High Definition (UHD), for which spatial sound with a sensation of height as well as width and depth is regarded as the best way to enhance the experience of watching films and TV programmes and playing computer games. Several formats are now available for feature and drama production, games and virtual reality (VR) - including Dolby Atmos, Ambisonics and binaural - but there are still important issues that need to be considered before applying such technologies.
A key consideration is monitoring, both for professional production and the final playback in cinemas or on TVs, smartphones and VR headsets. Among the pro audio companies researching the requirements for spatial sound in these applications is loudspeaker monitor designer Genelec. Some of its research and general observations on the subject will be presented in a paper at the 2018 NAB Show in Las Vegas.
Immersive Audio Monitoring for Drama, Games and VR was written by Genelec's senior technologist Thomas Lund, who will also present the paper. The session will include a description of the human auditory system, specifically the abilities to produce spherical localisation and immersive sensation, which are key in experiencing spatial sound.
Lund will also discuss recent research into perceptual techniques carried out by both Genelec and outside institutions. "We have done a lot of internal research specifically into binaural (headphone) listening," he says. "I did that as well a while back when I was manager of professional R&D at TC Electronic. The paper for the NAB Show will have our specific research combined with a review of the latest medical studies on perception."
This has resulted in more details emerging with regard to cross-modal sensing, learning and latency, among other aspects. "New non-invasive techniques have been employed to make this possible over the last few years," Lund explains. "My original background is in medicine and we recently performed a broad review of such issues, related to pro audio. I also gave a cross-modal presentation of some of the results at the HPA [Hollywood Professional Association] Tech Retreat during February."
Lund's paper will also highlight the differences between in-room monitoring arrangements for film, broadcast, games and VR and those needed for binaural playback. As part of this he will discuss consistency, the 'end-listener' experience and inter-subject variability, plus the pros and cons of both monitoring methods. These include head movement, spectral balance, calibrated listening level, requirements for SPL (sound pressure level) and the prevention of production-side listener fatigue.
A major practical difference between the various application areas is the defined and much higher volume experienced in the cinema. Lund adds that another important differentiation is experiencing sound in a collective environment as opposed to something more individual, as with viewing on a smartphone or wearing a VR headset.
Lund believes binaural reproduction is likely to become a "credible delivery possibility", bearing in mind the need for precise, individual head-related transfer functions (HRTFs) to take head movement into account. "However, in-room production monitoring ensures immersive qualities and translation across a variety of playback conditions, including different binaural rendering devices," he says.
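The binaural rendering Lund refers to amounts, at its core, to convolving a sound source with a pair of head-related impulse responses (HRIRs, the time-domain form of HRTFs) measured at the listener's ears. As a rough illustration only (not Genelec's method, and using synthetic toy HRIRs rather than real measurements), a minimal sketch might look like this:

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono source with an HRIR pair to place it at the
    direction the HRIRs were measured for. Returns an (N, 2) stereo
    array for headphone playback."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    # Pad the shorter channel so both have equal length before stacking.
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right], axis=-1)

# Toy example: a unit-impulse source and synthetic HRIRs in which the
# right ear receives the sound later and quieter than the left,
# crudely mimicking a source off to the listener's left.
source = np.zeros(8)
source[0] = 1.0
hrir_l = np.array([1.0, 0.5])             # near ear: early, louder
hrir_r = np.array([0.0, 0.0, 0.6, 0.3])   # far ear: delayed, attenuated

stereo = render_binaural(source, hrir_l, hrir_r)
```

Real systems interpolate HRIRs as the head moves (tracked at low latency) and must cope with the inter-subject variability Lund highlights: a non-individualised HRTF set rarely localises as well as one measured on the listener's own ears.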
Despite this, the individual features of a person's outer ear and any body movement will always affect how people hear, both in a natural environment and in rooms set up with audio playback and monitoring equipment. "Binaural production is therefore more prone to cause fatigue or even 'cyber sickness'," Lund comments. "Human hearing also crosses over from the ears to haptic sensation, which is mostly picked up by the abdomen. This is below 50Hz, so much low-frequency effects (LFE) content cannot be conveyed using headphones. Loudspeaker-based production, however, needs a decent room and a monitoring system with calibrated frequency response, taking each monitor's placement into account."
Lund observes that the AES (Audio Engineering Society) will "have an important role to play" in ensuring reference listening is "tightly specified" for production, with open standards for the whole process. The paper Immersive Audio Monitoring for Drama, Games and VR by Thomas Lund will be presented on Tuesday 10 April at 14.30 in North Hall Meeting Room N255.