How Audio Normalization Works

Audio normalization is the application of a constant amount of gain to a recording with the goal of bringing the amplitude to a target level. The signal-to-noise ratio and relative dynamics of the audio remain unchanged in the process because the same amount of gain is applied across the entire recording. Is it a good practice to normalize audio? The answer is: sometimes.

There are two types of audio normalization, and both are standard functions in most digital audio workstations (DAWs). Peak normalization adjusts the recording based on the highest signal level it contains; loudness normalization adjusts it based on perceived loudness. Both apply a constant gain across the entire recording. With peak normalization, the gain is chosen to bring the highest pulse-code modulation (PCM) sample value or analog signal peak to a given level – usually 0 dBFS, the loudest level allowed in a digital audio system. Because peak normalization only searches for the highest level, it alone says nothing about the apparent loudness of the content.
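The gain calculation for peak normalization can be sketched in a few lines. This is a minimal illustration, assuming floating-point samples in the range -1.0 to 1.0; the function name and the example values are hypothetical:

```python
def peak_normalize(samples, target_dbfs=0.0):
    """Apply one constant gain so the highest absolute sample
    reaches the target level (in dBFS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    target_linear = 10 ** (target_dbfs / 20.0)  # dBFS -> linear amplitude
    gain = target_linear / peak
    return [s * gain for s in samples]

# A quiet recording whose peak is 0.25 (about -12 dBFS):
quiet = [0.1, -0.25, 0.05]
loud = peak_normalize(quiet, target_dbfs=0.0)
# The new peak is 1.0 (0 dBFS); the samples keep their relative sizes,
# so the dynamics and signal-to-noise ratio are unchanged.
```

Note that the same multiplier is applied to every sample, which is exactly why normalization leaves relative dynamics intact.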

Peak normalization is mostly used during the mastering stage of a digital recording to make optimal use of the available dynamic range. When combined with compression or limiting, it can also provide a loudness advantage over material that has not been peak normalized. This chain – compression and limiting followed by peak normalization – is what enables the high program loudness of contemporary releases.

Loudness normalization is when the gain is changed to bring the average amplitude to a target level. This average can be a measurement of average power, such as the RMS value, or it can be a measure of human-perceived loudness. This type of normalization was created to contend with varying levels of loudness when listening to multiple songs in a sequence.
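An average-power version of this can be sketched with an RMS measurement. Real loudness normalization uses perceptually weighted measures (such as LUFS, per ITU-R BS.1770) rather than plain RMS; the simple RMS stand-in and the -20 dBFS target below are illustrative assumptions only:

```python
import math

def rms_dbfs(samples):
    """Average power of the signal, expressed in dBFS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def loudness_normalize(samples, target_dbfs=-20.0):
    """Apply one constant gain so the RMS level reaches the target."""
    gain_db = target_dbfs - rms_dbfs(samples)
    gain = 10 ** (gain_db / 20.0)
    return [s * gain for s in samples]

# A square-wave-like signal sitting at about -6 dBFS RMS:
out = loudness_normalize([0.5, -0.5, 0.5, -0.5], target_dbfs=-20.0)
# Its RMS level is now -20 dBFS, regardless of where its peaks were.
```

Because the target is an average rather than a peak, two songs normalized this way will sound similarly loud even if their peak levels differ.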

Loudness normalization can result in peaks that exceed the recording medium's limits. Software offering this mode normally provides the option of applying dynamic range compression to prevent clipping when that happens. Because the process considers the overall loudness of a file – averaging across both the large peaks and the softer sections – the gain required to hit the target can push individual peaks past full scale.
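The clipping risk can be shown with a simple check. This is a sketch only – the function names are hypothetical, RMS stands in for a true loudness measure, and a real tool would follow up with a limiter rather than just detecting the overload:

```python
import math

def loudness_gain_db(samples, target_dbfs):
    """Constant gain (in dB) needed to bring the RMS level to the target."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return target_dbfs - 10 * math.log10(mean_square)

def would_clip(samples, gain_db):
    """True if applying the gain would push any sample past 0 dBFS."""
    gain = 10 ** (gain_db / 20.0)
    return max(abs(s) for s in samples) * gain > 1.0

# Mostly quiet material with one large transient:
audio = [0.01] * 999 + [0.5]
gain = loudness_gain_db(audio, target_dbfs=-20.0)
# The gain needed to raise the quiet average also multiplies the
# transient, pushing it past full scale -- hence the limiter option.
```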

Audio should be normalized for two reasons: 1. to get the maximum volume, and 2. to match the volumes of different songs or program segments. Peak normalization to 0 dBFS is a bad idea for any component of a multi-track recording: as soon as extra processing is applied or more tracks are added, the audio may overload.
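Why 0 dBFS components overload a multi-track mix can be seen with a two-track toy example (the sample values are illustrative; a real mix bus also involves pan and fader gains):

```python
# Two tracks, each peak normalized to 0 dBFS (peak amplitude 1.0):
track_a = [0.0, 1.0, 0.5]
track_b = [0.5, 1.0, 0.0]

# A simple mix bus sums the tracks sample by sample:
mix = [a + b for a, b in zip(track_a, track_b)]

# Where the peaks coincide, the sum is 2.0 -- 6 dB over full
# scale -- so the combined signal overloads even though each
# track was "safely" at 0 dBFS on its own.
```

This is why components intended for further mixing are normalized to a lower target, leaving headroom for summing and processing.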

It should be remembered that audio normalization is a destructive process: performing digital processing on a file changes it in some way. Normalization earned its poor reputation in the early days of digital audio, when all files were 16-bit. Turning the volume down discarded the lowest bits, effectively reducing the resolution, and that information could not be recovered. As digital audio has improved, with greater bit depths and floating-point processing, normalization no longer degrades the audio's quality in any audible way. Now it is more like turning up the volume, nothing more.
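The old 16-bit problem can be illustrated with integer arithmetic. This is a simplified sketch: a 12 dB level drop is roughly a divide-by-four, which at a fixed 16-bit word length throws away the two lowest bits:

```python
# A 16-bit PCM sample, turned down by about 12 dB (divide by 4)
# and stored back at 16 bits, loses its two lowest bits:
x = 12345                 # original sample value
down = x >> 2             # gain reduced: 12345 // 4 == 3086
restored = down << 2      # gain restored: 3086 * 4 == 12344

# restored != x: the quantization error is permanent, which is
# why level changes at 16 bits gave normalization a bad name.
```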

Whether to use normalization at all depends on the program content and the skill of the operator. If gain staging has not been done properly, maximizing the level can introduce audible artifacts. When gain staging is correct, however, normalization can be beneficial. But remember, times have changed: many streaming services now adjust playback loudness in their own facilities. So know where your audio is headed before making a decision to normalize.

Normalization is a tool whose results depend on the skill of the person using it. It can easily be abused and cause an unnecessary loss of sound quality. So, as with any tool, use it with caution and know what you are doing.
