Standards: Containers - AIFF Audio Containers
Thirty years on, Apple’s AIFF format remains a reliable workhorse for uncompressed audio production and archiving. We explore its chunk-based structure, clarify the differences between AIFF and AIFF-C, and demonstrate why this ancient format refuses to fade away.
The Audio Interchange File Format
Compared with other popular standards in use, Apple’s Audio Interchange File Format (AIFF) is ancient. The core functionality was stabilized over 30 years ago and has not changed since then. Given the pace of development in our industry that is a remarkable achievement.
AIFF files contain raw audio samples. This is optimum for production use. Most editing tools can open and save AIFF content after editing.
The embedded metadata describes the sample rates, track configuration and bit depths for the audio essence.
Despite its simplicity, AIFF is a versatile format and is unlikely to fall out of use soon, not least because of the massive legacy of archived recordings in this format.
Compressed Audio Storage With AIFF-C
The original classic AIFF files contain only uncompressed data. The AIFF-C specification was published as a replacement for version 1.3 of the original AIFF document and allows any kind of sound data to be carried. It has become the de facto default format. A file that appears to be a classic AIFF file is quite likely to be an AIFF-C file.
Searching online for documentation only reveals a draft document, but additional supporting information can be found in these developer documents from Apple:
- Original AIFF Version 1.3 specification.
- Sound Manager.
- Core Audio.
- QuickTime version 4 to 7.
- Source code header files AIFF.h and Sound.h in the QuickTime developer kit.
- AV Foundation.
There are a few important differences between AIFF and AIFF-C files summarized here:
| Element | AIFF | AIFF-C |
|---|---|---|
| The FORM type identifier. | 'AIFF' | 'AIFC' |
| FVER chunks. | Never present. | Always present. |
| COMM chunk. | Four properties describing the sampling structure. | Six properties describing sampling and the compression codec wrapper used for the SSND chunk. |
| SSND chunk. | Always Big-Endian data. | Can be any format, compressed or uncompressed. |
| Preferred file extension. | .aiff | .aifc |
Multi-byte Data Formats
Early microprocessors had 8-bit architectures and ran at much slower clock speeds. As CPU designs grew to 16 bits, then 32 bits and 64 bits, the ordering of the bytes within a multi-byte value became important. The two principal CPU manufacturers (Motorola and Intel) chose opposing byte arrangements.
Since Apple based the original Macintosh design on Motorola 68000 and then PowerPC microprocessors, the AIFF data was organized with the highest ordered byte first. This is called Big-Endian.
Conversely, Intel arranged their data with the lowest ordered byte first. This is called Little-Endian. Since Windows PCs were based on the Intel architecture, this required some format conversion when moving content between the two platforms. AIFF-C adds the option of storing the samples in a Little-Endian arrangement (the 'sowt' compression type), which eliminates this conversion for the sample data. The chunk headers themselves remain Big-Endian.
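As a minimal sketch of the difference, the snippet below packs the same 16-bit sample both ways using Python's standard `struct` module. The sample value is arbitrary and chosen only for illustration.

```python
import struct

sample = 0x1234  # arbitrary 16-bit sample value, for illustration only

big = struct.pack(">h", sample)     # Big-Endian: most significant byte first
little = struct.pack("<h", sample)  # Little-Endian: least significant byte first

print(big.hex())     # 1234
print(little.hex())  # 3412

# Converting between the two layouts is a byte swap within each sample.
assert big[::-1] == little
```

The conversion is cheap for a single sample, but it has to be applied to every sample in the file when content moves between the two layouts.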
Endianness In CPU Architectures
Early Apple Macintosh hardware was based on Big-Endian architectures. The move from PowerPC to Intel forced the OS to be rebuilt around the Little-Endian design. Windows had been running on Intel for some time, so switching to Intel CPUs facilitated running Windows on Apple hardware or emulating it in a virtual machine.
When Apple moved to their own CPU design (Apple Silicon), retaining the Little-Endian architecture made sense because it avoids rebuilding the entire OS again, possibly introducing bugs into the bargain. This is the third (remarkably seamless) CPU architecture migration that Apple has undergone with the Macintosh operating system and the AIFF format has survived intact.
| Architecture | Endianness |
|---|---|
| Motorola 68000 | Big-Endian |
| PowerPC | Bi-Endian |
| PowerPC in Apple Macintosh | Big-Endian |
| Intel | Little-Endian |
| ARM | Bi-Endian |
| Apple Silicon | Little-Endian |
Note that the PowerPC and generic ARM CPUs can operate in both modes and are described as Bi-Endian.
Multiple Channel Support
AIFF can theoretically support an unlimited number of channels. The samples are interleaved together so that samples for a stereo pair (left and right channels) are stored adjacently. A group of samples stored like this across all the channels is called a Sample Frame.
Monaural sound, which we covered in the last section, is just a sequence of single samples so the sample framing is implied.
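The interleaving can be sketched in a few lines of Python. The function name `deinterleave` is hypothetical, and the input is assumed to be a flat list of samples that have already been decoded to integers.

```python
def deinterleave(samples, num_channels):
    """Split a flat list of interleaved samples into one list per channel.

    Each run of num_channels consecutive samples is one sample frame.
    """
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    if len(samples) % num_channels:
        raise ValueError("sample count is not a whole number of sample frames")
    # Channel n owns every num_channels-th sample, starting at index n.
    return [samples[ch::num_channels] for ch in range(num_channels)]


# Three stereo sample frames: (L, R), (L, R), (L, R)
left, right = deinterleave([1, -1, 2, -2, 3, -3], 2)
print(left)   # [1, 2, 3]
print(right)  # [-1, -2, -3]
```

With `num_channels` set to 1 the function returns the input unchanged as a single channel, which matches the implied framing of monaural sound.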
Multiple channels are mapped to numbers according to this grid (extracted from the specification). A sample frame spans all of the channels in use.
There are other alternative channel mapping arrangements to support more channels. The AIFF version 1.3 specification only describes 6 channel surround source mapping. This is sufficient for 5.1 surround but the source mapping is undefined beyond that. The accepted convention is to use the ITU-R BS.2159 standard. You may encounter systems that map the surround channels differently.
Sample Sizes
AIFF supports sample sizes from 1 to 32 bits.
Interestingly, music can be produced with a simple square wave by adjusting the on and off duty cycle in the time domain. So, specifying 1-bit audio is not such a crazy idea after all.
Realistically you would probably need at least 5-bits for low-quality speech coding. Sampling music at 8-bits delivers poor quality and 16-bits is considered a minimum for most applications. Larger sample sizes are useful in studio and production environments.
| Sample Size | Storage Arrangement |
|---|---|
| 1-bit to 8-bits. | One byte per sample. |
| 9-bits to 16-bits. | Two bytes per sample. |
| 17-bits to 24-bits. | Three bytes per sample. |
| 25-bits to 32-bits. | Four bytes per sample. |
If the sample size does not fully occupy the bytes in the storage arrangement, it is left justified and right padded with zero bits. This example shows how a 12-bit sample is placed into a 16-bit word.
Mathematically (in a Boolean sense), the padding would have been more helpful had it been placed at the most significant end of the data word. That would have avoided bit-shifting the values.
There is a clever concept involved here. Placing the padding at the least significant end doesn’t improve the resolution or reduce the quantization noise. What it does do is give the padded sample (nearly) the same full-scale amplitude (loudness) as genuine 16-bit data. This is elegant simplicity at work again because 16-bit and 12-bit samples can be mixed with no audible change to the loudness.
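The storage rule and the left justification can be sketched as follows. The function names are hypothetical, and the samples are assumed to be Python integers that already fit within the stated bit depth.

```python
def bytes_per_sample(bits):
    """Return the storage size in bytes for a given sample size in bits."""
    if not 1 <= bits <= 32:
        raise ValueError("AIFF sample sizes range from 1 to 32 bits")
    return (bits + 7) // 8  # round up to a whole number of bytes


def left_justify(value, bits):
    """Shift a sample up so the zero padding sits in the low-order bits."""
    pad = 8 * bytes_per_sample(bits) - bits
    return value << pad


def right_justify(stored, bits):
    """Undo left_justify: shift the padding bits back out."""
    pad = 8 * bytes_per_sample(bits) - bits
    return stored >> pad


# The largest positive 12-bit sample lands near the top of the 16-bit range.
print(hex(left_justify(0x7FF, 12)))  # 0x7ff0
```

The largest 12-bit value becomes 0x7FF0, which is only 15 steps short of the 16-bit maximum of 0x7FFF. This is the "nearly the same" loudness described above, and the right shift in `right_justify` is the bit-shifting cost mentioned earlier.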
Sample Rates
The AIFF specification does not mandate any particular sample rates. The sample frames are delivered at a rate determined by an 80-bit (10-byte) IEEE 754 extended-precision floating-point value in the COMM chunk. The range of possible values supported by this floating point value far exceeds any sample rates we might ever encounter.
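The AIFF 1.3 specification stores the sample rate as a 10-byte IEEE 754 extended-precision value, a type that most modern languages cannot read directly. The following is a minimal sketch of a decoder, assuming the 10 bytes have already been extracted from the COMM chunk. The function name is hypothetical, and the result is rounded to the precision of a Python float.

```python
import math
import struct


def decode_extended80(data):
    """Decode a Big-Endian 80-bit IEEE 754 extended value into a float."""
    if len(data) != 10:
        raise ValueError("an extended-precision value is exactly 10 bytes")
    # 1 sign bit + 15 exponent bits, then a 64-bit mantissa whose
    # integer bit is stored explicitly (unlike 32-bit and 64-bit floats).
    sign_exp, mantissa = struct.unpack(">HQ", data)
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF
    if mantissa == 0 and exponent != 0x7FFF:
        return sign * 0.0
    if exponent == 0x7FFF:
        return sign * math.inf if mantissa << 1 & (2**64 - 1) == 0 else math.nan
    # The exponent bias is 16383; the mantissa is scaled by 2**-63.
    return sign * math.ldexp(mantissa, exponent - 16383 - 63)


print(decode_extended80(bytes.fromhex("400EAC44000000000000")))  # 44100.0
```

Common audio rates such as 44100 Hz and 48000 Hz are small integers, so they survive the conversion to a 64-bit float exactly.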
Chunk Format Details
AIFF files are organized into chunks of data. Each chunk has a header and a payload body.
A chunk header starts with a 32-bit long word containing a FourCC chunk type value. This determines how the application software should interpret the payload. The size of the payload is another 32-bit value. This size value does not include the 8 bytes of header data.
The chunk type ID is composed of four printable ASCII compatible characters packaged into a 32-bit long-word. The space character and other punctuation symbols are permitted.
Refer to Chapter 4-2 for a description of how FourCC codes work.
In some scenarios, non-printing characters are used which require special software to interpret and present in a human readable fashion.
The entire chunk must be constructed with an even number of bytes. Zero padding bytes must be added to variable length chunks such as the SSND sound data chunks to ensure they meet this criterion.
The chunk size value facilitates a rapid walk through the AIFF file without parsing every payload. Adding the 8 bytes of header data and the chunk size (plus one padding byte when the size is odd) to the current index position in the file locates the start of the next chunk.
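A chunk walk of this kind can be sketched as a small generator. The name `iter_chunks` is hypothetical, the data is assumed to be held in memory as bytes, and the size field is read as an unsigned Big-Endian 32-bit value.

```python
import struct


def iter_chunks(data, start=0, end=None):
    """Yield (chunk_id, payload_offset, payload_size) for each chunk found.

    Only the 8-byte headers are read; the payloads are skipped over.
    """
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from(">4sI", data, pos)
        yield chunk_id, pos + 8, size
        # Skip the header, the payload and the pad byte that follows
        # an odd-sized payload (the pad is not counted in the size).
        pos += 8 + size + (size & 1)


# Two hand-built chunks: a 3-byte payload (padded) and a 2-byte payload.
demo = b"NAME" + struct.pack(">I", 3) + b"abc\x00"
demo += b"ANNO" + struct.pack(">I", 2) + b"hi"
for chunk_id, offset, size in iter_chunks(demo):
    print(chunk_id, offset, size)
```

For a complete file, the walk would start at offset 12, immediately after the FORM header and its type code.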
File Format Details
AIFF files are based on the IFF format originally designed by Electronic Arts in 1985. The first chunk in an AIFF file is always a container whose FourCC type code is ‘FORM’. The rest of the chunks in the file are semantically nested inside it.
This chunk defines the file type more reliably than the file extension. The length value in the header defines the data length minus the 8-byte header for this containing chunk.
| Chunk | Description |
|---|---|
| FORM | Form chunks describe the file format & act as a container for the rest of the chunks. They must always be present as they are the root chunk at the head of the hierarchical tree structure. |
| FVER | Format Version descriptor describing the AIFF-C revision date. Only present (& mandatory) in AIFF-C files but not present in older AIFF files. |
| COMM | Common data describing the fundamental attributes of the sampled sound. This must always be present & describes the format of the samples contained in the SSND chunks. |
| SSND | Sound sample frames containing the essence data. Their payload is organized according to the description in the COMM chunk. An empty AIFF container with no SSND chunks is technically valid but pointless. It may crash some players. |
After that, the chunks can be presented in any order. The second and subsequent chunks are described as Local data. The maximum file size is 4GB. This is the only constraint on the number of channels or the run-time of the sampled sound data. These chunk types are important and must be present when mandated by the specification. Note that there are different constraints for classic AIFF files and the later AIFF-C files.
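The effect of the file size limit on run-time is simple arithmetic. This sketch assumes that 4GB means 2^32 bytes and ignores the small overhead of the chunk headers. The function name is hypothetical.

```python
def max_duration_seconds(sample_rate, channels, bits_per_sample):
    """Estimate the longest recording that fits under the 4GB size limit."""
    bytes_per_frame = channels * ((bits_per_sample + 7) // 8)
    bytes_per_second = sample_rate * bytes_per_frame
    return 2**32 / bytes_per_second


# CD-quality stereo: 44.1 kHz, 2 channels, 16 bits per sample.
hours = max_duration_seconds(44100, 2, 16) / 3600
print(round(hours, 2))  # 6.76
```

Doubling the channel count or the sample rate halves the available run-time, which is why the limit matters most for multi-channel, high-resolution production work.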
Here is an illustration showing a simple AIFF-C file with four chunks. The brackets indicate the example value stored in each property. The shaded boxes represent the individual bytes:
- 1 box = 8-bit byte.
- 2 boxes = 16-bit word.
- 4 boxes = 32-bit long-word.
- 8 boxes = 64 bit double-precision value.
- 16 boxes = Illustrates a container for a text string.
Here is an example of the 16 box framing:
FORM – File Format Descriptor Chunk
The FORM chunks describe the file format as AIFF or AIFF-C. This is the outermost container and must always be present. It will always be the first chunk in the AIFF file. The foundational IFF specification allows for a range of nested containers similar to the ISOBMFF file structure. The AIFF specification profiles this behavior to constrain it to one single top-level FORM container.
The payload of the FORM chunk is a single FourCC code that identifies the file type as one of these values:
| Type Codes | Description |
|---|---|
| AIFF | Generic Big-Endian classic AIFF file. |
| AIFC | AIFF-C files containing compressed or Little-Endian uncompressed data. |
| AIFS | Not canonical. A deprecated file type used during initial development of the format. A few files of this type have escaped into the wild. File readers should reject these files as not being compatible with the AIFF specification. |
This chunk accurately describes the overall length of the data in the file. This is more reliable than deriving it from the value provided by the OS file system manager.
The form type value will either be ‘AIFF’ or ‘AIFC’. Note that this may not be consistent with the file extension on the physical file.
Refer to Annex A in the AIFF-C specification for examples of how a FORM container is constructed.
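Identifying the file type from the FORM container needs only the first 12 bytes of the file. This is a minimal sketch with a hypothetical function name. It follows the advice above and rejects any form type other than 'AIFF' or 'AIFC'.

```python
import struct


def read_form_header(data):
    """Return (form_type, declared_size) from the first 12 bytes of a file."""
    if len(data) < 12:
        raise ValueError("too short to hold a FORM header")
    form_id, size, form_type = struct.unpack_from(">4sI4s", data, 0)
    if form_id != b"FORM":
        raise ValueError("first chunk is not a FORM container")
    if form_type not in (b"AIFF", b"AIFC"):
        # Covers the deprecated 'AIFS' type as well as unrelated IFF files.
        raise ValueError("unsupported form type: %r" % form_type)
    return form_type.decode("ascii"), size


# A hand-built header for an otherwise empty AIFF-C container.
header = b"FORM" + struct.pack(">I", 4) + b"AIFC"
print(read_form_header(header))  # ('AIFC', 4)
```

The declared size counts everything after the 8-byte FORM header, so a complete file occupies `size + 8` bytes.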
FVER – Format Version Chunk
Format Version (FVER) chunks have a 4-byte payload containing a version descriptor for the applicable version of the specification. The value is unique to each revision and rendered as a timestamp. This timestamp is the release date of the specification that the file conforms to.
An FVER chunk will only be present in an AIFF-C file and was not supported in the earlier classic AIFF files.
Note that this is not related to the creation or modification date of the physical file in any way. It is a description of the content not the container. This is a mandatory item and exactly one FVER chunk must be present in an AIFF-C file.
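Decoding the timestamp can be sketched as below. It assumes the convention in the AIFF-C draft that the value counts seconds from 1 January 1904, and it uses the `AIFCVersion1` constant (0xA2805140) defined there. The Python names are hypothetical.

```python
import struct
from datetime import datetime, timedelta

MAC_EPOCH = datetime(1904, 1, 1)  # assumed epoch for the FVER timestamp
AIFC_VERSION_1 = 0xA2805140       # AIFCVersion1 from the AIFF-C draft


def decode_fver(payload):
    """Convert the 4-byte FVER payload into a specification release date."""
    (timestamp,) = struct.unpack(">I", payload)
    return MAC_EPOCH + timedelta(seconds=timestamp)


print(decode_fver(struct.pack(">I", AIFC_VERSION_1)))  # 1990-05-23 14:40:00
```

A reader that finds a different value can still attempt to parse the file, but it should expect features that it does not understand.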
COMM – Common Data Chunk
The Common chunk describes fundamental parameters for the sampled data. This chunk must be present. Without it, the player cannot unpack and stream the sample frames.
The basic format for classic AIFF files carries an 18-byte payload with the following properties:
- Number of channels (16-bit), which affects the size of the sample frame structure.
- Total number of sample frames in the SSND chunk (32-bit).
- Number of bits-per-sample (16-bit). This describes how to extract the sample data after stripping off the padding zero-bits.
- The sample-rate of the audio. It is stored as an 80-bit (10-byte) IEEE 754 extended-precision floating point value. It describes the rate in sample frames per second, where each frame delivers a sample for all channels simultaneously.
The Extended Common chunk in AIFF-C files has two additional parameters that increase the size of the payload. Most AIFF files use this format now:
- A 32-bit FourCC compression type that describes the audio codec being used.
- A human readable name for the compression type.
Note that the compression type value is case sensitive, and some values exist in both upper-case and lower-case variants (for example 'fl32' and 'FL32'). Consult the specifications for each one to ascertain whether they are otherwise identical.
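Unpacking the Common chunk can be sketched as follows. The function name and dictionary keys are hypothetical. The sketch assumes the compression name is stored as a Pascal-style string (a length byte followed by the text), as described in the AIFF-C draft, and it leaves the 10-byte sample rate undecoded.

```python
import struct


def parse_comm(payload):
    """Unpack a COMM payload from either a classic AIFF or an AIFF-C file."""
    if len(payload) < 18:
        raise ValueError("COMM payload must be at least 18 bytes")
    channels, frames, bits = struct.unpack_from(">hIh", payload, 0)
    info = {
        "channels": channels,
        "sample_frames": frames,
        "bits_per_sample": bits,
        "sample_rate_raw": payload[8:18],  # 10-byte value, not decoded here
    }
    if len(payload) >= 23:
        # AIFF-C only: FourCC compression type, then the readable name.
        info["compression_type"] = payload[18:22]
        name_len = payload[22]
        name = payload[23:23 + name_len]
        info["compression_name"] = name.decode("ascii", "replace")
    return info


# A hand-built AIFF-C example: stereo, 1000 frames, 16-bit, 44.1 kHz, 'sowt'.
demo = struct.pack(">hIh", 2, 1000, 16)
demo += bytes.fromhex("400EAC44000000000000")
demo += b"sowt" + b"\x0enot compressed\x00"
print(parse_comm(demo)["compression_type"])  # b'sowt'
```

The compression type is kept as raw bytes rather than text so that case-sensitive comparisons such as `b"fl32"` versus `b"FL32"` stay exact.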
The compression type registry is managed by Apple. Send them details of any compression types you define. The address described in Appendix B of the standard should reach the correct department but it is quite old.
The compression types registry does not appear to be publicly available so it must be synthesized by assembling the information from whatever sources are available.
If the same type is used for multiple different compression algorithms it creates a namespace-collision. The problem escalates if the content is propagated and the media is distributed with a compromised type value.
These are the compression type values to avoid if you are creating your own. There may be other types not documented here that we have not yet found.
Many earlier codecs that were used with AIFF-C are now obsolete and have been superseded by MPEG and other standards. That doesn’t necessarily mean you can use the newer codecs inside an AIFF-C file. It may be technically possible but your files would be proprietary and use a potentially unregistered FourCC code for the compression type. That would render them unreadable to most other applications. Simply registering the FourCC code will not automatically enable other apps to read your files.
| Type | SSND format | Description |
|---|---|---|
| NONE | Big-Endian. | Raw uncompressed samples. |
| ACE2 | ACE 2-to-1. | 2-to-1 IIGS ACE (Audio Compression/Expansion). |
| ACE8 | ACE 8-to-3. | 8-to-3 IIGS ACE (Audio Compression/Expansion). |
| ADP4 | 4:1 Intel/DVI ADPCM. | Stéphane Tavenard (Audio Convert/Player) AmigaOS. |
| alaw | 8-bit samples. | ITU-T G.711 ALaw 2:1. |
| DWVW | Delta with variable word width. | TX16W Typhoon. |
| fl32 | IEEE-32-bit Float. | 32-bit floating point. |
| FL32 | IEEE-32-bit Float. | SoundHack & Csound. |
| fl64 | IEEE-64-bit Float. | 64-bit floating point. |
| ima4 | IMA 4:1 – ADPCM. | Adaptive differential pulse-code modulation. IMA is defunct and the specification is stored in a publicly accessible archive. |
| MAC3 | MACE 3-to-1. | 3-to-1 Macintosh Audio Compression/Expansion. |
| MAC6 | MACE 6-to-1. | 6-to-1 Macintosh Audio Compression/Expansion. |
| Qclp | Qualcomm PureVoice. | Qualcomm. |
| QDMC | QDesign Music. | QDesign. |
| rt24 | RT24 50:1. | Voxware. |
| rt29 | RT29 50:1. | Voxware. |
| SDX2 | Square-Root-Delta. | Big-endian. 3DO (Panasonic)/Mac (Apple). |
| sowt | Little-Endian. | Raw uncompressed samples, byte-swapped when compared with 'NONE'. |
| ulaw | 8-bit samples. | ITU-T G.711 µLaw 2:1. |
This is an unofficial list of the codec types aggregated from a variety of sources.
SSND – Sampled Sound Data Chunk
The audio essence is stored in a Sampled Sound Data Chunk. Although the chunks can appear in any order, this chunk is normally placed at the end of the file. There will only be one SSND chunk with all the sample frames contained within it.
An SSND chunk must be present if the number of sample frames described in the COMM chunk is non-zero.
The SSND header contains these properties:
- Size of the sound data chunk (not including the header data).
- Offset to the first sample frame at the start of the playable sound. This could adjust the in-point for playback. It is normally set to zero to play sample frames from the beginning.
- Size of alignment blocks. This indicates the size in bytes of the blocks that the audio data is packaged into. It is used in conjunction with the offset value. Most applications do not use this and it is usually set to zero. Block alignment speeds up disk access for real-time recording applications.
The sampled sound data immediately follows the block-size value in the header.
The sound data must contain an even number of bytes. A padding zero-byte might be added to the end to ensure the samples finish on a word boundary. Additional zero-byte padding may be necessary if alignment blocks are used.
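Reading the SSND payload can be sketched as below. The function name is hypothetical, and the payload is assumed to be the chunk body that follows the 8-byte chunk header, so the chunk size itself has already been consumed.

```python
import struct


def parse_ssnd(payload):
    """Return (offset, block_size, sound_data) from an SSND chunk payload."""
    if len(payload) < 8:
        raise ValueError("SSND payload must hold offset and block size")
    offset, block_size = struct.unpack_from(">II", payload, 0)
    # The first sample frame starts 'offset' bytes after the two fields.
    sound_data = payload[8 + offset:]
    return offset, block_size, sound_data


# Typical case: both fields are zero and the samples follow immediately.
demo = struct.pack(">II", 0, 0) + b"\x00\x01\x00\x02"
print(parse_ssnd(demo))  # (0, 0, b'\x00\x01\x00\x02')
```

The returned bytes still need to be split into sample frames using the channel count and sample size taken from the COMM chunk.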
Other Optional AIFF Chunk Types
These chunk types are optional and may not be present in an AIFF file. This is compiled from a variety of sources and includes some non-standard items. It may help when disassembling AIFF files. Some chunk types may only be supported by specific implementations.
| FourCC ID | Description |
|---|---|
| MARK | Markers point to uncompressed sample frame locations. Used by instruments to define loop points or as chapter marks and cue points for UI controls. |
| INST | Instrument description. Configures the sound generation. |
| MIDI | MIDI Data containing system exclusive data, note on/off and controller instructions. |
| AESD | Recording device configuration. |
| APPL | Application specific info. |
| SAXL | Hardware sound accelerator configuration and parameters. This is an experimental AIFF-C specification. Refer to Annex D for more information. |
| COMT | Comment texts that describe the file content. |
| NAME | Name of the sampled audio. |
| AUTH | Author/creator of recording. |
| (c)<space-char> | Copyright notice and date. Note the use of punctuation characters and the trailing space-character. |
| ANNO | Annotation carrying an additional commentary text. |
| ID3<space-char> | Non-standard extension for carrying ID3 tag data. Note the trailing space-character. |
Tagging & Metadata
Searching online for information about metadata storage in AIFF files is challenging. Some commentators state that AIFF files cannot carry metadata. Others tell us that ID3 is incompatible with AIFF containers. Neither of these assertions is true.
AIFF is flexible and extensible and carries data chunks other than audio samples. This is self-evident with the use of AIFF in iTunes where artist, track information and cover-artwork will survive when moving AIFF files from one library to another.
Any arbitrary metadata can be stored in the text chunks within an AIFF file. Since that data does not affect the sound playback, it could be ID3 structured tag data.
There are three embedded metadata/tagging approaches suitable for AIFF files. Externally maintained metadata is always a possibility too:
- AIFF standardized Native chunks – the AIFF specification describes Name, Author, Comment, Annotation, and Copyright text chunks. These are well supported by applications running on macOS. The specification is completely open and available for other platform developers to implement.
- ID3 tags – These are widely supported by many tools. Some tools are free and others have commercial fee-based licenses. All platforms are supported to some extent. You must use ID3v2 structured tag data in AIFF files.
- XMP – The Adobe Extensible Metadata Platform (XMP) was devised for use in JPEG images but can also be used with AIFF files. Even though XMP has been standardized as ISO 16684-1:2012, the tools to support it are predominantly provided as proprietary solutions by Adobe.
There are occasional references online to an ‘ID3 ’ chunk (note the embedded space character). This is not mentioned in any AIFF specifications. Theoretically, it should not cause any problems because players are supposed to ignore chunks they do not recognize. Non-standard chunks may not survive an edit cycle though.
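Writing one of the native text chunks can be sketched as below. The function name is hypothetical and the text is assumed to be plain ASCII.

```python
import struct


def build_text_chunk(chunk_id, text):
    """Build a complete text chunk such as NAME, AUTH, ANNO or '(c) '."""
    if len(chunk_id) != 4:
        raise ValueError("a chunk ID is exactly four bytes")
    payload = text.encode("ascii")
    chunk = chunk_id + struct.pack(">I", len(payload)) + payload
    if len(payload) % 2:
        chunk += b"\x00"  # pad byte, not counted in the size field
    return chunk


print(build_text_chunk(b"NAME", "abc"))  # b'NAME\x00\x00\x00\x03abc\x00'
```

Adding a chunk like this to an existing file also means increasing the size value in the FORM header by the length of the new chunk.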
File Name Extensions
Do not rely on the file extension to determine the exact format of an AIFF or AIFF-C file. It is helpful for invoking an AIFF parser or player but then inspect the FORM and FVER chunks to properly identify the file content.
| Extension | Description |
|---|---|
| .aiff | Preferred for classic AIFF files but also used for AIFF-C files which can be determined from the FORM and FVER chunks inside the file. |
| .aif | Less common file type for platforms that cannot support more than 3-character file extensions. |
| .aifc | Preferred for AIFF-C files. |
| .caf | This is a Core Audio File but AIFF data may be carried inside it when used for sample loops in GarageBand and Logic Pro. |
Media Type Identifiers
There are a lot of different media type identifiers for AIFF files but aside from the special cases, you should stick to using audio/aiff for most applications.
| Media type | Status | Description |
|---|---|---|
| audio/aiff | Preferred | This is the preferred media type for AIFF files. Use this value for new work. |
| audio/x-gsm | Special case | Mobile device audio. |
| audio/x-midi | Special case | MIDI files containing AIFF data. |
| audio/vnd.qcelp | Special case | Speech coding at low bit rates. |
| audio/x-aiff | Deprecated | This is a common alternative that you should recognize but it is a legacy value. Do not use this for new projects. |
| sound/aiff | Deprecated | Alternative media type. |
| audio/x-pn-aiff | Deprecated | Progressive Networks variant of the AIFF file format. |
| audio/x-rmf | Deprecated | Beatnik audio files. |
| audio/rmf | Deprecated | Beatnik audio files. |
Applying AIFF
Although AIFF is an old format, it is still a good solution for storing uncompressed audio, especially for long-term archiving purposes.
Whilst this is an ancient and no longer actively developed format, files of this type will be widely used in libraries and archives. It has been sufficiently popular that media libraries will very likely continue to hold AIFF and AIFF-C files for hundreds of years into the future.