How Audio Normalization Works

Audio normalization is the application of a constant amount of gain to a recording with the goal of bringing the amplitude to a target level. The signal-to-noise ratio and relative dynamics of the audio remain unchanged in the process because the same amount of gain is applied across the entire recording. Is it a good practice to normalize audio? The answer is: sometimes.
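Since the operation is just one multiplication per sample, a minimal sketch in Python (using NumPy, with illustrative names such as `apply_gain` and `gain_db` that are not tied to any particular DAW) looks like this:

```python
import numpy as np

def apply_gain(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Apply a constant gain (in dB) to every sample.

    Because every sample is scaled by the same factor, the relative
    dynamics and the signal-to-noise ratio are left unchanged.
    """
    gain_linear = 10.0 ** (gain_db / 20.0)  # convert dB to a linear factor
    return samples * gain_linear
```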

There are two types of audio normalization, a function found in most digital audio workstations (DAWs). Peak normalization adjusts the recording based on the highest signal level it contains, while loudness normalization adjusts it based on perceived loudness. Both types adjust the gain by a constant value across the entire recording. With peak normalization, the gain is changed to bring the highest PCM (Pulse Code Modulation) sample value or analog signal peak to a given level – usually 0 dBFS, the loudest level allowed in a digital audio system. Because peak normalization searches only for the highest level, it alone does not account for the apparent loudness of the content.
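As a rough sketch of peak normalization, assuming floating-point samples in the range -1.0 to +1.0 (where 1.0 corresponds to 0 dBFS) and again using illustrative names:

```python
import numpy as np

def peak_normalize(samples: np.ndarray, target_dbfs: float = 0.0) -> np.ndarray:
    """Scale the recording so its highest absolute sample lands at target_dbfs."""
    peak = np.max(np.abs(samples))
    if peak == 0.0:
        return samples  # pure silence: nothing to normalize
    target_linear = 10.0 ** (target_dbfs / 20.0)
    return samples * (target_linear / peak)
```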

Peak normalization is mostly used during the mastering stage of a digital recording to make optimal use of the available dynamic range. When combined with compression or limiting, peak normalization can provide a loudness advantage over material that has not been processed this way. This combination, compression and limiting followed by peak normalization, is what enables contemporary program loudness.

With loudness normalization, the gain is changed to bring the average amplitude of the recording to a target level. That average can be a measurement of average power, such as the RMS value, or a measure of human-perceived loudness, such as LUFS. This type of normalization was created to contend with the varying loudness of multiple songs played in sequence.
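A hedged sketch of the RMS flavor of loudness normalization follows; true perceptual measures such as LUFS add frequency weighting and gating that this example deliberately omits, and the names are again illustrative:

```python
import numpy as np

def rms_normalize(samples: np.ndarray, target_db: float = -20.0) -> np.ndarray:
    """Scale the recording so its RMS (average power) level lands at target_db."""
    rms = np.sqrt(np.mean(samples ** 2))
    if rms == 0.0:
        return samples  # pure silence: nothing to normalize
    target_linear = 10.0 ** (target_db / 20.0)
    return samples * (target_linear / rms)
```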

Because loudness normalization works from the overall average of a file, a recording can contain both large peaks and softer sections, and raising that average can push the peaks past the recording medium's limits. Software offering such normalization normally provides the option of applying dynamic range compression to prevent clipping when this happens.
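To illustrate the problem and the usual remedy, here is a sketch that follows RMS normalization with a crude hard clip; real software applies far gentler dynamic range compression, so treat `np.clip` purely as a stand-in:

```python
import numpy as np

def loudness_normalize_safe(samples: np.ndarray, target_db: float = -20.0) -> np.ndarray:
    """RMS-normalize, then hard-limit any peaks pushed beyond full scale."""
    rms = np.sqrt(np.mean(samples ** 2))
    if rms == 0.0:
        return samples
    gain = 10.0 ** (target_db / 20.0) / rms
    boosted = samples * gain
    # Crude stand-in for a limiter/compressor: clamp anything past 0 dBFS.
    return np.clip(boosted, -1.0, 1.0)
```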

Audio should be normalized for two reasons: 1. to achieve maximum volume, and 2. to match the volume of different songs or program segments. Peak normalization to 0 dBFS is a bad idea for any material that will be used in a multi-track recording; as soon as extra processing or additional tracks are added, the audio may overload.
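Using the hypothetical `peak_normalize` sketch from earlier, leaving headroom is just a matter of choosing a lower target:

```python
# Normalize to -6 dBFS instead of 0 dBFS, leaving 6 dB of headroom
# for later processing or for summing with other tracks.
normalized = peak_normalize(samples, target_dbfs=-6.0)
```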

It should be remembered that audio normalization is a destructive process: performing digital processing on a file changes it in some way. Normalization's poor reputation dates from the early days of digital audio, when all files were 16-bit and turning the volume down reduced the effective bit depth. As digital audio quality has improved, normalization no longer degrades the audio's quality. Now it amounts to nothing more than turning up the volume.

Whether to use normalization at all depends on the program content and the skill of the operator. If gain staging has not been done properly, maxing out the audio can introduce audible artifacts. When gain staging is correct, however, normalization can be beneficial. But remember, times have changed: many streaming services now adjust music levels at their own facilities, so know where your audio is headed before deciding to normalize.

Normalization is a tool whose results depend on the skill of the person using it. It can easily be abused and cause an unnecessary loss of sound quality. So as with any tool, use it with caution. Know what you are doing.
