Controlling loudness is now a regulatory requirement for many broadcasters. Fortunately, the process can be automated.
Anyone who has watched television knows that audio loudness is an issue. Oftentimes, commercials are louder than the regular programming, causing viewer complaints. In addition, variations in loudness frequently occur across multiple language versions of the same program and across multiple programs.
For a long time, the broadcast industry had no standard or specification for objectively measuring perceived loudness. Then the ITU developed an algorithm for objective loudness measurement, ITU-R BS.1770-2. It has since been revised as BS.1770-3, and today it is commonly used, and regulated, around the globe.
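At its core, the BS.1770 measurement is a mean-square energy calculation with a fixed offset. As a rough illustration, here is a deliberately simplified sketch in Python; the K-weighting pre-filter and the gating stages required by the real standard are omitted, so the numbers are only indicative:

```python
import math

def integrated_loudness_simplified(samples):
    """Very simplified BS.1770-style loudness estimate (LUFS).

    The real standard applies a K-weighting pre-filter and a gated
    measurement; both are omitted here for brevity, so this only
    sketches the core mean-square formula.
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return -0.691 + 10.0 * math.log10(mean_square)

# A full-scale 1 kHz sine (1,000 whole cycles at 48 kHz) has a mean
# square of 0.5, so the estimate lands near -3.7 LUFS.
sine = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
print(round(integrated_loudness_simplified(sine), 2))  # ≈ -3.7
```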
Despite these efforts, audio loudness problems still exist. This article examines a few of the root causes of audio loudness issues and compares the differences between loudness monitoring approaches for file-based and real-time workflows. To conclude, it will provide strategies for correcting loudness in a file-based workflow.
Main causes of audio loudness issues
There are several common causes of audio loudness issues. One of the biggest culprits is commercials mixed deliberately loud to grab viewers’ attention. Another is a lack of awareness among media creators about loudness requirements. In addition, movies and other types of media are often produced with a high dynamic range (i.e., a large difference between the loudest and quietest sounds).
In the U.S., the mandate for loudness control can be traced back to Rep. Anna Eshoo, a California Democrat. Irritated by overly loud TV commercials at one of her in-home dinner parties, she decided to introduce legislation prohibiting the practice.
It is imperative for broadcasters to find a tool that can measure loudness per the ITU standard, both to ensure a comfortable listening experience for viewers and to comply with regulatory requirements. In 2010, the U.S. passed H.R. 1084, known as the CALM (Commercial Advertisement Loudness Mitigation) Act, which requires broadcasters to transmit advertisements at a loudness no greater than that of the accompanying program. In Europe, the EBU published a comparable recommendation in 2011, Loudness Recommendation EBU R 128, with which broadcasters must likewise comply.
What to look for in an audio monitoring solution
Broadcasters will want to choose a monitoring solution that is highly flexible and easily configurable, so that it can adhere to the different loudness control regulations in force around the globe. It is also important to deploy a monitoring solution that can detect issues in both live and file-based workflows. This will be discussed in more detail in the next section.
Loudness correction should be automated based on monitoring reports, employing both metadata and audio normalization algorithms. Audio normalization algorithms must not deteriorate the quality of audio. The correction should go beyond simple gain or attenuation methods, taking into account factors such as true peak and dynamic range.
What’s more, the structural integrity of media files needs to be maintained in the correction process. It should not introduce new encoding errors. Correction should only affect the required metadata or audio samples without touching any other section of the file.
Monitoring audio in a file-based workflow
The tools and methodology for loudness monitoring in file-based content vs. live content scenarios are somewhat different. For content workflows in which file-based monitoring is possible, it is preferable to identify and correct loudness issues in the early stages to ensure that the issue does not propagate to later stages. Ideally, once the content is ingested or ready after editing, the content should be free from any loudness violations. However, because the content may be delivered across the globe to different regions, the final delivered content may need to comply with different loudness standards. Hence, automated file-based QC systems should be deployed at multiple stages of the workflow for regulatory loudness compliance. In particular, it’s recommended to repeat the monitoring process after editing and before the assets are stored for playout.
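As a sketch of what multi-standard QC logic might look like, the following compares a measured program loudness against per-region targets. The target levels match the commonly cited EBU R 128 (-23 LUFS) and ATSC A/85 (-24 LKFS) figures, but the tolerances here are illustrative placeholders, not authoritative values from the regulations:

```python
# Illustrative loudness targets. The exact target and tolerance for a
# given region come from the applicable regulation (e.g. EBU R 128 in
# Europe, ATSC A/85 under the CALM Act in the U.S.); the tolerance
# values below are assumptions for this sketch.
TARGETS = {
    "EBU R128": {"target_lufs": -23.0, "tolerance_lu": 1.0},
    "ATSC A/85": {"target_lufs": -24.0, "tolerance_lu": 2.0},
}

def check_compliance(measured_lufs, standard):
    """Return (compliant, deviation in LU) for one regional standard."""
    spec = TARGETS[standard]
    deviation = measured_lufs - spec["target_lufs"]
    return abs(deviation) <= spec["tolerance_lu"], deviation

# A file measured at -25.5 LUFS fails the tight R128 window here but
# passes the wider A/85 window, illustrating why the same asset may
# need region-specific QC passes.
print(check_compliance(-25.5, "EBU R128"))
print(check_compliance(-25.5, "ATSC A/85"))
```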
For live broadcasts (i.e., sports, events, and news), where it is not possible to perform file-based monitoring, real-time monitoring is required to ensure loudness compliance before the signals go on-air. Real-time monitoring has one big limitation: the content must be processed in a single pass. This makes normalization of live content very challenging. These systems can provide a quick fix for true peak level or short-term loudness but it is difficult to achieve the desired program loudness level.
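The single-pass constraint can be made concrete with a simple streaming meter: each sample is seen exactly once, and the meter can only report over a sliding window (3 s is the short-term window in EBU terminology). This sketch omits K-weighting, so its readings are approximate; the point is that a live meter can only react to what has already aired, not retrospectively renormalize the program:

```python
import math
from collections import deque

class ShortTermMeter:
    """Single-pass short-term loudness over a 3 s sliding window.

    Simplified sketch: K-weighting is omitted, so readings are only
    approximate relative to a true BS.1770 short-term measurement.
    """
    def __init__(self, rate=48000, window_s=3.0):
        self.window = deque(maxlen=int(rate * window_s))
        self.sum_sq = 0.0

    def push(self, sample):
        # Evict the oldest sample's energy once the window is full.
        if len(self.window) == self.window.maxlen:
            self.sum_sq -= self.window[0] ** 2
        self.window.append(sample)
        self.sum_sq += sample * sample

    def loudness(self):
        if not self.window:
            return float("-inf")
        mean_square = self.sum_sq / len(self.window)
        return -0.691 + 10.0 * math.log10(mean_square + 1e-12)

# Feed 3 s of a 1 kHz sine at half amplitude (mean square 0.125).
meter = ShortTermMeter()
for t in range(48000 * 3):
    meter.push(0.5 * math.sin(2 * math.pi * 1000 * t / 48000))
print(round(meter.loudness(), 2))  # ≈ -9.72
```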
Effective loudness correction strategies
Once violations of loudness standards are detected, correction techniques need to be deployed to ensure compliance. The goal is to achieve consistent loudness across different programs as well as within a program. This keeps viewers satisfied and watching longer.
Traditionally, peak normalization has been the technique used to control the loudness of programs. However, peak normalization alone is insufficient when it comes to properly normalizing loudness based on modern metrics. Through metadata-based or signal processing-based correction techniques, monitoring systems can help broadcasters reach the Holy Grail of delivering consistent audio.
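A small experiment shows why peak normalization alone falls short: two signals with identical peak levels can differ enormously in measured loudness. The loudness function below is the same simplified, unweighted estimate used for illustration (not a full BS.1770 implementation):

```python
import math

def peak(samples):
    return max(abs(s) for s in samples)

def loudness_lufs(samples):
    # Simplified BS.1770-style estimate (no K-weighting or gating).
    mean_square = sum(s * s for s in samples) / len(samples)
    return -0.691 + 10.0 * math.log10(mean_square)

n = 48000
sustained = [0.5 * math.sin(2 * math.pi * 1000 * t / n) for t in range(n)]
# Same signal, but present only 1% of the time (a brief transient),
# so the peak level is unchanged while the energy collapses.
sparse = [s if t < n // 100 else 0.0 for t, s in enumerate(sustained)]

# Identical peaks (0.5 each), yet the sparse signal measures about
# 20 LU quieter: peak normalization would treat them as equally loud.
print(peak(sustained), peak(sparse))
print(round(loudness_lufs(sustained) - loudness_lufs(sparse), 1))
```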
Metadata-based monitoring systems work with Dolby formats such as AC-3, which carries loudness specific metadata such as dialnorm and dynamic range control profiles. It’s absolutely essential that broadcasters properly set dialnorm and DRC (Dynamic Range Control) in order to allow automatic adjustments of audio levels during playout. If the audio material does not match these parameters, the audio signal or the metadata must be modified. Correcting the metadata is the easiest approach, as it does not require the primary content to be decoded and re-encoded.
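As an illustration of the metadata-only approach, the dialnorm field in an AC-3 stream is an integer from 1 to 31, interpreted as -1 to -31 dB. A correction tool can simply rewrite this value to match the measured program loudness, leaving the encoded audio untouched; the mapping might be sketched as follows, with out-of-range measurements clamped to the representable range:

```python
def dialnorm_from_loudness(measured_lkfs):
    """Map a measured program loudness to the AC-3 dialnorm field.

    AC-3 encodes dialnorm as an integer 1..31 meaning -1 to -31 dB,
    so a program measured at -24.3 LKFS gets dialnorm 24. Rewriting
    this header field avoids decoding and re-encoding the audio.
    """
    value = round(-measured_lkfs)
    return max(1, min(31, value))  # clamp to the representable range

print(dialnorm_from_loudness(-24.3))  # 24
```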
Signal processing correction techniques are geared toward broadcasters working with non-Dolby formats such as PCM or AAC. This method modifies the audio signal itself in order to achieve target loudness. Normalization algorithms need to be devised to control all three aspects of loudness: program loudness, true peak, and dynamic range.
When it comes to program loudness, dynamic range is an important parameter to consider during the correction process. Ignoring the dynamic range could lead to a situation where low audio levels may become so low that the audio is inaudible.
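A minimal version of a loudness-aware gain calculation might look like this: the gain needed to hit the target is capped by the available true-peak headroom (a -1 dBTP ceiling is assumed here, a common but not universal choice). A real normalizer would then fall back to limiting or dynamic range processing rather than simply clipping:

```python
def normalization_gain_db(measured_lufs, target_lufs, true_peak_dbtp,
                          peak_ceiling_dbtp=-1.0):
    """Gain (dB) to reach target loudness without exceeding the
    true-peak ceiling.

    The -1 dBTP default ceiling is an assumption of this sketch.
    When the required boost would push the true peak over the
    ceiling, the gain is capped; a real system would then apply
    limiting or dynamic range processing instead of clipping.
    """
    gain = target_lufs - measured_lufs
    headroom = peak_ceiling_dbtp - true_peak_dbtp
    return min(gain, headroom)

# A quiet file needs +6 dB to reach -23 LUFS, but its -5 dBTP true
# peak leaves only +4 dB of headroom below the ceiling.
print(normalization_gain_db(-29.0, -23.0, -5.0))  # 4.0
```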
It’s also important to keep in mind that the structural integrity of the media file should be maintained during the correction process. After audio samples have been normalized, audio will need to be re-encoded and re-wrapped into the main container file. This step can potentially introduce encoding or wrapping errors in the content. If any of the video/audio/ancillary information is altered or lost during this step, it could adversely impact the content distribution chain. Furthermore, the processing algorithm should not create any kind of audio distortion.
Over the last decade, the broadcast industry has taken the issue of audio loudness very seriously, instituting regulatory standards designed to elevate the consumer television experience, from a comfort perspective. To effectively correct audio loudness issues, broadcasters need a loudness monitoring system that can automatically measure and control loudness for both file-based and live content workflows. Software-based monitoring systems have emerged as the best solution, as they enable broadcasters to keep pace with the wide and ever-changing range of standards and formats.
Manik Gupta, Engineering Manager at Interra Systems.