Standards: Audio - Standards For Audio Coding

Audio coding demands very different tools and workflows to video, but the same fundamental principles around quality apply to both. This guide surveys the standards, codecs and container formats you need to navigate modern audio workflows.

Resources For Audio Production

Audio production follows a similar workflow concept to video but with some differences. Not only are the tools and container files slightly different, but the necessary computing and storage capacity is also reduced. Within broadcast workflows, the management of audio content can be approached as additional tracks held within the video container, or separately in a specialized audio container. In a radio or podcast production workflow, for example, there is no accompanying video.

Some file formats that store audio efficiently are useful when you ingest and file new recordings in a digital librarian system. The audio samples should be uncompressed and stored in a lossless format to avoid introducing artefacts. The files support at least some metadata tagging of the content but this varies depending on the container type. Additional metadata goes into the content management database.

In addition to the summaries below there is a more comprehensive listing of the AES Standards & Recommended Practices, AES Information Documents and AES Project Reports online in Appendix H.

Tools & Software Apps

There is a diverse and sophisticated choice of audio production tools available, and many of the most popular are platform specific. Digital Audio Workstations and other tools aimed primarily at music production can be used very effectively for broadcast audio editing and post-production. Most video post-production platforms offer increasingly sophisticated, tightly integrated audio editing and production tools. Professional software supports most of the commonly used standards, but if you need specific file formats it is wise to confirm that the tools you select are compatible before embarking on any project. Work-arounds are usually possible with additional format conversion tools. Many commercially available tools are supported on macOS and Windows but not on Linux.

Deploying open-source audio tools is appropriate in these scenarios:

  • Editing on Linux workstations.
  • Portability across all the major platforms.
  • Conversion between formats.
  • Detailed analysis of the content.
  • Diagnosis of problems.

The ffmpeg tool is often thought of as a video-conversion tool, but it also has powerful support for audio file conversions. Useful analysis tools are built in and accessible from the command-line interface.
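As a minimal sketch of an audio-only conversion, the Python snippet below assembles an ffmpeg command line that transcodes a file to FLAC. The file names and the helper name build_flac_command are hypothetical, and the 48 kHz default is an assumption. The snippet only builds and prints the command, so it can be inspected before anything is run.

```python
import shlex


def build_flac_command(source: str, target: str, sample_rate: int = 48000) -> list[str]:
    """Return an ffmpeg argument list that converts `source` to FLAC audio."""
    return [
        "ffmpeg",
        "-i", source,              # input file
        "-vn",                     # ignore any video stream in the input
        "-c:a", "flac",            # encode the audio with the lossless FLAC codec
        "-ar", str(sample_rate),   # output sample rate in Hz
        target,
    ]


# Hypothetical file names, used for illustration only.
command = build_flac_command("interview.wav", "interview.flac")
print(shlex.join(command))

# To execute the conversion (requires ffmpeg to be installed and on the PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.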

Quality Control

Use uncompressed or lossless formats for editing and preparing content. Lossy compression introduces artefacts which are not obvious at first, but multiple compress-uncompress cycles will compound them, distorting the output. The damage is not repairable.

Sound playback must be continuous and consistent. Your customers can tolerate dropouts in the visual content far more easily than losing the sound momentarily. A program with perfect video and intermittent sound is far less enjoyable than lower quality video with robust soundtracks.

The compression algorithm in MP3 discards sounds that the human ear will theoretically not hear, for example soft violins during a loud cymbal crash. The human ear is very good at reconstructing that missing sound component, so theoretically the listener will not notice. At the highest bitrates the result can be quite effective, but reducing the bitrate will degrade the quality of the audio. Never use MP3 as a production or archival format. Retain the original raw uncompressed source material in the archives.

Be aware that some cameras and recording technologies are inherently lossy and are incapable of delivering lossless audio. Avoid them if you can. Alternatively, up-convert to a lossless format at the point of ingest. This cannot restore what the recorder has already discarded, but it preserves the best available quality and prevents further loss downstream.

Deployment

At the final deployment stage, you will want to compress the audio for streaming or downloading by the end-user.

For an audio-only deployment, MP3 is very widely supported, but AAC delivers higher quality at the same bitrate. HE-AAC is more efficient still at low bitrates if your target player devices support it.

Even though the quality is very good with HE-AAC, these are all lossy formats.
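The practical effect of the bitrate choice on download size is simple arithmetic: size is bitrate multiplied by duration. The sketch below works this out for a 30 minute programme. The bitrates in the table are illustrative assumptions only, not recommendations or a claim that the codecs sound equivalent at those settings, and container overhead is ignored.

```python
def encoded_size_bytes(bitrate_kbps: int, duration_seconds: int) -> int:
    """Approximate payload size of a constant-bitrate stream, ignoring container overhead."""
    return bitrate_kbps * 1000 * duration_seconds // 8


# Assumed example bitrates in kbit/s, chosen purely for illustration.
ILLUSTRATIVE_BITRATES_KBPS = {"MP3": 192, "AAC": 128, "HE-AAC": 64}
PROGRAMME_SECONDS = 30 * 60  # a 30 minute programme

for codec, kbps in ILLUSTRATIVE_BITRATES_KBPS.items():
    size_mb = encoded_size_bytes(kbps, PROGRAMME_SECONDS) / 1_000_000
    print(f"{codec}: {kbps} kbit/s -> {size_mb:.1f} MB")
```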

Proprietary Standards

Proprietary standards are embedded in the production tools. These tools store their project assets in a more compact format, but the assets need exporting for more portable use downstream in the workflow.

The table below lists some proprietary container formats by file extension. The license fees depend on how the formats are used and deployed and what the target platforms are. They are usually included in the purchase price of the tools or hardware used to create the files. Some of these formats are platform specific, which makes them less portable. They might be designed to carry combined video and audio but can also be used in audio-only scenarios.

Extension  Format
.ac3       Dolby AC-3 surround sound file.
.aif       See AIFF.
.aifc      Compressed AIFF file.
.aiff      Audio Interchange File Format, often used for audio extracted from a CD. Designed by Apple and based on IFF.
.alac      Apple Lossless Audio Codec.
.asf       Advanced Systems Format (alternative to wmv).
.avi       Audio Video Interleave.
.caf       Core Audio Format file developed by Apple and used in GarageBand and other applications. Can also carry Apple Lossless (ALAC) audio (uncommon).
.dts       Digital Theater Systems sound file.
.evo       Enhanced VOB.
.f4v       Flash Video file with H.264 video & AAC audio.
.flv       Flash Video file containing SWF encoded content. Deprecated and should not be used for new projects.
.iff       Electronic Arts Interchange File Format.
.mov       QuickTime File Format.
.qt        Early QuickTime File Format.
.rmvb      RealMedia Variable Bitrate file.
.vob       DVD Video Object.

Open Standards

Open-source codecs and storage container files offer many advantages. They are supported by a community of enthusiastic developers and perform well. They are ported to virtually every platform. Because the supporting source code is available, you can customize them or port them to new platforms yourself. Open-source projects also actively seek to avoid patents and license fees, which makes open-source standards and tools attractive commercially as well as technically.

Extension  Format
.ape       Monkey's Audio file.
.flac      Free Lossless Audio Codec (FLAC) coded audio.
.mka       Matroska audio.
.mpc       Musepack audio file.
.mxf       Material Exchange Format.
.ofr       OptimFROG lossless coded audio.
.oga       Ogg audio file.
.ogg       Ogg audio/video file.
.ogm       Ogg media file.
.opus      An Ogg format container containing Opus coded audio.
.wav       WAV audio file. These are often used in radio broadcasting.
.wave      See wav.
.webm      WebM based on the Matroska format.

If you benefit from open-source technology, then an occasional donation to support the project team is often appreciated and helps the project continue to thrive. Open does not mean free. Paying is optional, but exploiting the goodwill of the developers is never a good thing!


Choosing An Appropriate Sample-rate

Harry Nyquist suggested the sample rate should be at least twice the highest audible frequency to capture all the content. The bit-depth is also important because it sets the size of the quantization steps and so limits the staircase effect in the output. The stepped output introduces harmonics that are well above the audible range and are removed by a post-processing filter on playback.
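Both rules can be expressed as simple arithmetic. The sketch below computes the minimum sample rate for a given highest frequency, and the theoretical signal-to-noise ratio of an ideal uniform quantizer driven by a full-scale sine wave (roughly 6.02 dB per bit plus 1.76 dB). The 20 kHz upper limit of hearing is an assumed, commonly quoted figure, and real converters fall short of the ideal numbers.

```python
import math


def nyquist_min_sample_rate(highest_frequency_hz: float) -> float:
    """Minimum sample rate needed to capture content up to the given frequency."""
    return 2.0 * highest_frequency_hz


def ideal_quantization_snr_db(bit_depth: int) -> float:
    """Theoretical SNR of an ideal quantizer with a full-scale sine input."""
    # 20*log10(2**N) is about 6.02*N dB; 10*log10(1.5) is about 1.76 dB.
    return 20 * math.log10(2 ** bit_depth) + 10 * math.log10(1.5)


# Assumed upper limit of human hearing: 20 kHz.
print(f"Minimum sample rate: {nyquist_min_sample_rate(20_000):.0f} Hz")

for bits in (16, 24):
    print(f"{bits}-bit: about {ideal_quantization_snr_db(bits):.1f} dB")
```

This is why 44.1 kHz and 48 kHz both sit a little above 40 kHz: the margin leaves room for the filters to roll off.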

If the sample-rate is too low, then ghost frequencies that were not in the original recording will appear when the samples are rendered as output audio. This is called aliasing and should be avoided by choosing a high enough sample-rate.
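To see where a ghost frequency lands, the sketch below folds a tone back into the range the sample rate can represent. The function name alias_frequency is hypothetical, and the example assumes a pure tone sampled with no anti-aliasing filter in front of the converter.

```python
import math


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a pure tone appears after sampling without an anti-alias filter."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


SAMPLE_RATE = 44_100

# A 30 kHz tone is above half the sample rate, so it folds down to a ghost frequency.
ghost_hz = alias_frequency(30_000, SAMPLE_RATE)
print(f"30000 Hz sampled at {SAMPLE_RATE} Hz appears at {ghost_hz} Hz")

# The samples of the two tones are indistinguishable apart from a sign flip.
for n in range(5):
    original = math.sin(2 * math.pi * 30_000 * n / SAMPLE_RATE)
    ghost = -math.sin(2 * math.pi * ghost_hz * n / SAMPLE_RATE)
    assert math.isclose(original, ghost, abs_tol=1e-9)

# A 10 kHz tone is below half the sample rate and is captured unchanged.
print(f"10000 Hz sampled at {SAMPLE_RATE} Hz appears at {alias_frequency(10_000, SAMPLE_RATE)} Hz")
```

Once the samples are taken, nothing can tell the ghost from a genuine tone at that frequency, which is why the problem has to be prevented before sampling rather than fixed afterwards.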

The deployment sample-rate should be 44.1 kHz for audio-only projects and 48 kHz if you want to embed the audio into a video project. Stick to one of these throughout your project to avoid unnecessary sample-rate conversions.

If you have sufficient storage and fast enough computers, work at twice or four times your deployment sample-rate. Filter, mix and process your audio with effects, then down-sample the finished recording without degradation.
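The storage cost of working at a higher sample rate is easy to estimate, because uncompressed PCM grows linearly with the rate. The sketch below assumes 24-bit stereo production audio and a 48 kHz deployment rate. Both are assumptions for illustration, and file header overhead is ignored.

```python
def pcm_bytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM storage needed for one minute of audio."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


DEPLOYMENT_RATE = 48_000  # assumed deployment rate for a video project

for multiple in (1, 2, 4):
    rate = DEPLOYMENT_RATE * multiple
    megabytes = pcm_bytes_per_minute(rate, bit_depth=24, channels=2) / 1_000_000
    print(f"{rate} Hz, 24-bit stereo: {megabytes:.2f} MB per minute")
```

Multiply the result by the number of tracks in a session to gauge whether your storage and processing capacity can support the higher rate.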

Lossless Audio

In the context of a production workflow, lossless coding formats are the right solution. They can still provide some compression efficiency but the raw original audio samples can be recovered intact with no loss of data.

This is also beneficial when archiving content since a lossless format can be rendered down to any other format when needed. Using lossy codecs in production or archiving leads to a gradual degradation of quality as each compression cycle permanently adds more artefacts.

This is true of any kind of lossy format regardless of whether it is audio, video or any other kind of media. Here are some examples:

Format        Lossy or lossless
AAC           Somewhat lossy.
AES67         Lossless.
AIFF          Lossless.
ALAC          Lossless.
FLAC          Lossless.
HE-AAC        High quality but still slightly lossy.
IFF           Lossless but very large files.
MP3           Very lossy.
Ogg Vorbis    Lossy.
Ogg Opus      Lossy.
PCM           Lossless but very large files.
TAK           A recent lossless audio format which is free to use, although the source code is not yet published.
WAV           Lossless.
WMA           Lossy.
Dolby AC-3    Lossy.
Dolby AC-4    Lossy.
Dolby TrueHD  Lossless.
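The difference between the two families can be demonstrated with a short round-trip test. In the sketch below, Python's general-purpose zlib compressor stands in for a lossless audio codec, and discarding the low-order bits of each sample stands in for a lossy one. Neither is a real audio codec; they are assumptions chosen only to show that one path returns the original samples exactly and the other does not.

```python
import array
import math
import zlib

SAMPLE_RATE = 48_000


def sine_pcm16(frequency_hz: float, seconds: float) -> array.array:
    """Generate a sine tone as signed 16-bit PCM samples."""
    count = int(SAMPLE_RATE * seconds)
    return array.array("h", (
        int(20_000 * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE))
        for n in range(count)
    ))


original = sine_pcm16(440.0, 0.1)

# Lossless stand-in: compress, decompress and compare with the source.
packed = zlib.compress(original.tobytes())
restored = array.array("h")
restored.frombytes(zlib.decompress(packed))
lossless_round_trip_ok = restored == original

# Lossy stand-in: throw away the 8 least significant bits of every sample.
crushed = array.array("h", ((sample >> 8) << 8 for sample in original))
lossy_round_trip_ok = crushed == original

print("Lossless round trip identical:", lossless_round_trip_ok)
print("Lossy round trip identical:", lossy_round_trip_ok)
```

The same bit-for-bit comparison is a useful acceptance test when you introduce a new lossless format into an archive workflow.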

Symbolic Audio Representation

Symbolic representation of sound is a completely lossless solution. It depends on the receiving system having sound synthesis capabilities. If this is based on sampled sounds that are delivered with the playing instructions, the results should be predictable. There are a variety of symbolic music representation formats available.

MIDI is a similar idea, but devices only render similar sounding results if they implement the General MIDI standard. Modern studio equipment has MIDI connectivity to manage presets and parameter storage. Therefore, MIDI may be useful in the control plane to manage a live audio data flow through mixers, routers and compressors.

Any symbolic representation can potentially deliver very high quality and versatile audio experiences at extremely low bitrates.
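A rough size comparison shows why. The sketch below encodes an eight-note scale as MIDI note-on and note-off channel messages (three bytes each) and compares that with the same passage as CD-style PCM. The note durations, velocity and channel are assumptions, and the timing information a real MIDI file would add is left out, so treat the numbers as an order-of-magnitude illustration.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """MIDI note-on channel message: status byte 0x9n, note number, velocity."""
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """MIDI note-off channel message: status byte 0x8n, note number, release velocity 0."""
    return bytes([0x80 | channel, note, 0])


# C major scale starting at middle C, as MIDI note numbers.
melody = [60, 62, 64, 65, 67, 69, 71, 72]
symbolic = b"".join(note_on(0, n, 100) + note_off(0, n) for n in melody)

# Assumed performance: half a second per note, so four seconds in total.
duration_seconds = len(melody) // 2
pcm_bytes = 44_100 * 2 * 2 * duration_seconds  # 16-bit (2 bytes) stereo (2 channels)

print(f"Symbolic: {len(symbolic)} bytes")
print(f"PCM:      {pcm_bytes} bytes")
```

The symbolic version only describes what to play, so the quality of the result depends entirely on the synthesis capability of the receiving device.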

Choosing The Best Solution

There are many free and open-source audio coding formats. Whilst they are useful and perform well, they struggle to gain dominance against codecs such as AAC, which are embedded in many mobile devices and TV receivers.

Production systems may still benefit from using open-source software since the choice of codecs is less of an issue. With workflows becoming predominantly software based, this problem goes away entirely and you are free to choose whatever codecs suit your production process needs. Production systems should always be based on lossless codecs to avoid introducing unwanted artefacts.
