MPEG-H Broadcasts Bring Viewers Unprecedented Control

With consumers viewing (and listening to) content on more devices and in more places than ever before, broadcasters are being challenged to meet demands for new and better audio experiences in the most cost-effective way. This means upping the ante on multichannel audio from the existing 5.1 surround sound systems found in homes across the world. Consequently, broadcasters are assessing the capabilities of existing infrastructures and determining how new developments in audio and video technology will affect their ability to deliver enhanced services to a broad array of end-user technologies, from high-end home theaters to tablets and smartphones.

There’s been a lot of talk about the next-generation ATSC 3.0 television broadcast specification, which is expected to be standardized within two years and will include much higher-resolution video signals (UHDTV, or 4K) and multichannel “immersive” audio. However, ATSC 3.0 is not backward compatible with current HDTV receivers, so implementation concerns persist.

As on the video side, several companies are vying to have their schemes for delivering new types of audio experiences included in the ATSC 3.0 spec. Fraunhofer IIS has developed an audio processing scheme called MPEG-H, which was demonstrated at the IBC Show in September and at the more recent SMPTE Fall conference.

MPEG-H is a new standard, expected to be adopted soon (perhaps in spring 2015), that offers “object-oriented audio” features for TVs and mobile devices. A special SMPTE committee is expected to adopt MPEG-H in stages, with the rollout following over the ensuing months (or years).

The goal is to give viewers more audio capabilities that they can use in the home or on their mobile devices, in the hope of retaining eyeballs. These include choosing a language, choosing the home team announcer over the away team announcer, listening to a specific race car driver communicating with his pit crew, and raising audio levels for dialogue only (or for ambient sounds), as well as other controls that cannot be offered with the current ATSC standard, which uses AC-3 audio compression.

To this end, the Germany-based Fraunhofer IIS, Technicolor and Qualcomm have joined forces to form the MPEG-H Audio Alliance, which is promoting its version of “the next generation for interactive and immersive sound.” The alliance said it has developed a roadmap for MPEG-H Audio deployment that allows broadcasters to add new functionality at the rate and pace of their choosing, while preserving existing investments in technology and processes. The standard is backward compatible with the systems and practices currently used for AC-3 or HE-AAC surround sound broadcasting.

“Basically, the broadcaster will have control over what interfaces are presented to the viewer at home, while giving the viewer more audio features than they have ever had available to them before,” Robert Bleidt, General Manager, Audio and Multimedia Division, Fraunhofer USA Digital Media Technologies, told the website Display Daily during the 2014 SMPTE Fall conference in Hollywood. “This can range from nothing to a couple of preset buttons, to giving the viewers full control. This is a significant challenge for broadcasters.”


Bleidt said MPEG-H is about personalising the audio experience by allowing viewers to set the parameters that make them most comfortable or most engaged with a broadcast. They can turn the dialogue up or down (if the broadcaster and content creator have authored the content to permit that). In addition, different TV genres carry different levels of creative intent: there might be more interactive capabilities during a live sports broadcast than in a post-produced feature film.

New interactive menu features will make MPEG-H Audio-based offerings a key service for customers interested in more control over their listening experience, whether at home or on mobile devices. MPEG-H Audio content will also automatically optimize audio playback across different speaker configurations or headsets, allowing consumers to enjoy the best sound quality possible – no matter where they are or what device they use.

MPEG-H Audio lays the foundation for broadcasters to deliver a more personalized, interactive, and immersive audio experience for end-users by offering:

  • Interactive “sound mixing” through object coding, which allows viewers to customise the levels of different sound elements—for example, boosting selected commentary or creating a “home team” mix for sports broadcasts.
  • Rich 3-D sound with the ability to capitalize on additional front- and rear-height speaker channels. This enhances today’s surround sound broadcasts and creates a truly realistic audio experience.
  • Higher Order Ambisonics (HOA), to provide a fully immersive sound experience that is ideal for live broadcasts and performances, such as sporting events.
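
The interactive “sound mixing” idea in the first bullet can be pictured with a minimal sketch. This is not the MPEG-H decoder or any Fraunhofer API: the AudioObject class, its gain limits, and the mix function are hypothetical names chosen for illustration. It assumes each object is a mono signal whose level the viewer may adjust only within a range the broadcaster has authored.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioObject:
    """Hypothetical audio object; not a structure defined by MPEG-H."""
    name: str
    samples: List[float]   # mono samples, illustrative values only
    min_gain_db: float     # lowest viewer gain the broadcaster permits
    max_gain_db: float     # highest viewer gain the broadcaster permits


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def mix(objects: List[AudioObject], viewer_gains_db: Dict[str, float]) -> List[float]:
    """Sum all objects after applying each viewer gain, clamped to the authored range."""
    length = max((len(obj.samples) for obj in objects), default=0)
    output = [0.0] * length
    for obj in objects:
        requested = viewer_gains_db.get(obj.name, 0.0)  # 0 dB means unchanged
        allowed = max(obj.min_gain_db, min(obj.max_gain_db, requested))
        factor = db_to_linear(allowed)
        for i, sample in enumerate(obj.samples):
            output[i] += factor * sample
    return output


if __name__ == "__main__":
    home = AudioObject("home_commentary", [0.2, 0.2], -6.0, 6.0)
    crowd = AudioObject("crowd", [0.1, 0.1], 0.0, 0.0)  # level locked by broadcaster
    # The viewer asks for +20 dB on commentary; only the permitted +6 dB is applied.
    print(mix([home, crowd], {"home_commentary": 20.0}))
```

Clamping to an authored range mirrors the point made earlier in the article: the broadcaster decides how much control is exposed, from none at all to full viewer control.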


Qualcomm Technologies said it is already incorporating MPEG-H Audio support into its roadmap for future mobile chipsets. This is an important step toward widespread distribution of new audio functionality across a range of consumer devices.

Fraunhofer IIS (Institute for Integrated Circuits) is a veteran developer of MPEG audio standards, with its technology currently used in more than seven billion devices worldwide. It will contribute its advanced MPEG-H codec to the alliance, along with other engineering support. Technicolor is a co-developer of the MP3 standard and also provides production services for content creators and distributors around the world. It will implement MPEG-H decoding technology in set-top boxes. Finally, Qualcomm will make MPEG-H receiver chips for mobile devices.

The MPEG-H Audio Alliance hopes that promoting a practical, end-to-end TV audio solution, from the broadcaster to the home, will lead the ATSC to look more favorably on the technology. The equipment appears to be ready.

At this year’s IBC Show in Amsterdam, Fraunhofer IIS showed a real-time MPEG-H hardware prototype that can encode audio for live broadcasts from stereo up to 3D sound in the 7.1+4 H format, with additional tracks for interactive objects such as commentary in several languages, ambient sound or sound effects. The company’s system comprises a real-time encoder for contribution from outside broadcasts to the studio, where a professional decoder recovers the uncompressed audio for further editing and mixing; a real-time encoder for emission to consumers, whether over the Internet for new media use or for trials of upcoming over-the-air broadcast systems such as ATSC 3.0; and a professional decoder used to monitor the emission encoder's output.

“Our work with the new MPEG-H TV audio system so far has been done by capturing audio from a live event, and then encoding it with software on a computer. At IBC we were showing the next step for live broadcast use—the world’s first real-time encoder for interactive and immersive TV audio. With this prototype hardware, we will be able to demonstrate how we can integrate MPEG-H into a broadcaster’s plant for live trials and tests,” said Bleidt. “The system will encode elements of the audio as interactive objects so viewers at home may adjust the sound to their preference. This new hardware will give broadcasters the ability to encode true 3D sound, enhancing today’s surround sound broadcasts to create a truly realistic audio experience.”

Dolby Labs, with its Atmos 3D audio system, and several other patent-holding companies are also vying for inclusion in the final ATSC 3.0 spec.

For broadcasters, MPEG-H’s advanced compression scheme offers the ability to send more audio data using less bandwidth. Today’s 5.1 broadcasts (using AC-3 and requiring 448 kbps) could be delivered with the same quality at 160 kbps. Alternatively, the broadcaster could elect to add 30 kbps for interactive audio elements (another announcer, sound effects, pit crew radios, etc.). The new standard will also be able to carry the latest immersive sound formats found in movie theaters (up to 22 speakers) and deliver them to audiophile consumers through a single wireless 3-D immersive 7.1+4 channel sound bar.
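
To put the quoted bitrates side by side, the short calculation below works out the bandwidth a broadcaster might free up. It is only a back-of-the-envelope sketch using the figures in this article. It assumes the 30 kbps for interactive elements is a single total add-on rather than a per-object cost, which the article does not specify, and the function names are hypothetical.

```python
from typing import Tuple

AC3_5_1_KBPS = 448            # 5.1 surround with AC-3, as quoted in the article
MPEGH_5_1_KBPS = 160          # same-quality 5.1 with MPEG-H, as quoted
INTERACTIVE_EXTRA_KBPS = 30   # assumed to be one total add-on for interactive elements


def mpegh_total_kbps(include_interactive: bool) -> int:
    """Total MPEG-H audio bitrate, with or without the interactive add-on."""
    extra = INTERACTIVE_EXTRA_KBPS if include_interactive else 0
    return MPEGH_5_1_KBPS + extra


def saving_vs_ac3(total_kbps: int) -> Tuple[int, float]:
    """Return (kbps saved, percent saved) relative to the AC-3 5.1 figure."""
    saved = AC3_5_1_KBPS - total_kbps
    return saved, 100.0 * saved / AC3_5_1_KBPS


if __name__ == "__main__":
    for interactive in (False, True):
        total = mpegh_total_kbps(interactive)
        saved, percent = saving_vs_ac3(total)
        label = "with interactive elements" if interactive else "5.1 only"
        print(f"{label}: {total} kbps, saves {saved} kbps ({percent:.1f}%)")
```

On these figures, 5.1 alone would free 288 kbps (about 64.3 percent), and 5.1 plus interactive elements would still free 258 kbps (about 57.6 percent).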

Ultimately, the MPEG-H Audio standard will allow TV broadcasters to offer live broadcasts with object-based 3D audio across all devices, giving viewers the ability to tailor the audio to suit their personal listening preferences.

