Standards: Containers - MPEG ISOBMFF Containers

Born out of Apple’s QuickTime format, ISOBMFF underpins the ubiquitous MP4 container and countless derivatives. Here we explain the atom-based structure that made it so extensible and map the sprawling family of formats it spawned.

ISO Base Media File Format (ISOBMFF)

ISO/MPEG adopted the Apple QuickTime File Format (QTFF) because it was already very close to the candidate technology they were looking for when designing the ISO Base Media File Format (ISOBMFF). The result is published as ISO/IEC 14496-12 (MPEG-4) for video, and ISO/IEC 15444-12 (JPEG 2000) for images.

The ISOBMFF supports all the features and benefits that QTFF provided for timing, structure and metadata. This extensible format carries any kind of time-based media elements you might ever need.

The specification allows external ancillary files of any kind to be called into use by reference. The primary timing and framing information must be carried in the main ISO base media file.

Since ISOBMFF and QTFF share the same DNA, applications designed to manage either kind of file can share a lot of common code. This is good for application developers because less code means easier maintenance.

QuickTime File Format (QTFF)

Apple developed the QuickTime file format as an object-oriented storage container. The media is stored in many small chunks called atoms. The data type of those atoms is indicated with a four-character (FourCC) code. The format is extensible because new atom types can be added to the repertoire at any time. Older file readers simply ignore the atoms they don’t recognize. New atom descriptions are introduced via a registry process.

The atom payload can carry Unicode characters. This allows URLs and internationally localized text to use the full range of the character set.

QTFF supported very sophisticated VR techniques for creating panoramas and object movies by rapidly switching frames in an entirely non-linear fashion. Apple called them Navigable Movies. This has been inherited by the ISOBMFF and points towards some future Metaverse and VR applications which MPEG is developing standards for.

The Apple QuickTime File Format is described here:

https://developer.apple.com/documentation/quicktime-file-format

FourCC Codes

The FourCC concept was designed by Apple in 1984 when the original Macintosh was released. These identifiers were widely used in the file system to identify resource data types. Apple describes them as OSTypes but the FourCC nomenclature is more popular. In the registry and some standards documents, they are also described as Code Points.

In 1985, Electronic Arts developed the IFF media file format for use on the Amiga personal computer. It used the same FourCC concept to identify the chunks of media data within the IFF file. The IFF technical documents credit Apple with the original innovation. IFF files are the foundation for many other formats (AIFF, RIFF, AVI and WAV). The Apple QuickTime files emerged in 1991 using the same idea but must surely have been in development for some time before that.

FourCC values are 32-bit integers composed of four separate 8-bit ASCII characters. The data is arranged in big-endian fashion to render the characters left to right. The big-endian format allows them to be assembled into a viable 32-bit value without needing any transformation.

Normally the codes are constructed with ASCII printable characters. Spaces are also allowed and are significant. This can cause problems for the unwary because a code written without its space character is no longer four characters long and will not match the intended atom type. In rare cases, non-printing control characters are used, which makes the codes even harder for humans to read without special formatting software. By convention, most of the codes are spelled with lower-case characters. The codes are case-sensitive, so an upper-case variant defines a completely different atom type from its lower-case counterpart.
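The packing rules described above can be sketched in a few lines of Python. This is only an illustration: the function names are invented for this example, and 'moov' and 'url ' are used simply as familiar type codes.

```python
import struct

def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into a big-endian 32-bit integer."""
    raw = code.encode("latin-1")
    if len(raw) != 4:
        # A missing space is the classic mistake: "url" is not "url ".
        raise ValueError(f"FourCC must be exactly 4 bytes, got {raw!r}")
    return struct.unpack(">I", raw)[0]

def int_to_fourcc(value: int) -> str:
    """Unpack a 32-bit integer back into its four characters."""
    return struct.pack(">I", value).decode("latin-1")

def printable(code: str) -> str:
    """Render non-printing bytes as hex escapes so a human can read them."""
    return "".join(
        c if 0x20 <= ord(c) <= 0x7E else f"\\x{ord(c):02x}" for c in code
    )

print(hex(fourcc_to_int("moov")))   # 0x6d6f6f76
print(hex(fourcc_to_int("url ")))   # 0x75726c20 (trailing space is 0x20)
print(int_to_fourcc(0x6D6F6F76))    # moov
```

Because the characters are stored big-endian, a hex dump of the file shows them in reading order, which is why the integer 0x6D6F6F76 reads directly as 'moov'.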

Serialization Into Files

Essence data needs to be serialized so it can be written to a file. The receiving application reverses this process (de-serializing) when it opens the file and reconstructs the original format.

The earliest media file formats only contained raw data. The internal structure was very simple with a header preceding the essence data.

The header contains a few metadata items to describe the format and scope of the raw data.

The next evolution splits the stored media into chunks. An optional trailer might be added to carry ancillary metadata.

After editing, the entire content is written to a new file. This takes a long time, leading to an unsatisfactory user experience.

An index to list the active chunks allows them to be rearranged, omitted or appended. Access to the index is easy when it is placed at the end of the file. The file manager provides the file length for your application code. If the last item in the file is a negative offset (N) back to the start of the index, then a simple file pointer seek function can locate it.
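As a minimal sketch of that seek, assume a purely hypothetical layout (it is not any real container format): the file ends with a signed 32-bit big-endian offset, and the index is a run of (offset, length) pairs that extends up to that trailer.

```python
import io
import struct

TRAILER = struct.Struct(">i")   # assumed: signed offset relative to end of file
ENTRY = struct.Struct(">II")    # assumed: (chunk offset, chunk length)

def read_index(f):
    """Return the (offset, length) pairs that list the active chunks."""
    f.seek(-TRAILER.size, io.SEEK_END)            # last item in the file
    (back,) = TRAILER.unpack(f.read(TRAILER.size))
    if back >= 0:
        raise ValueError("trailer offset must be negative")
    f.seek(back, io.SEEK_END)                     # jump back to the index start
    index_bytes = f.read(-back - TRAILER.size)    # everything up to the trailer
    return [
        ENTRY.unpack_from(index_bytes, i)
        for i in range(0, len(index_bytes), ENTRY.size)
    ]

def build_demo() -> bytes:
    """Three chunks on disk; the index plays chunk 2, then chunk 0."""
    body = b"AAAA" + b"BBBBBB" + b"CC"
    index = ENTRY.pack(10, 2) + ENTRY.pack(0, 4)  # chunk 1 has been edited out
    back = -(len(index) + TRAILER.size)
    return body + index + TRAILER.pack(back)

demo = io.BytesIO(build_demo())
print(read_index(demo))   # [(10, 2), (0, 4)]
```

The application never scans the file body. Two seeks relative to the end of the file are enough to find the list of active chunks.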

This is effective and quick but after many edits there might be some redundant chunks left in the file. These waste space. On poorly implemented solutions, the old redundant indexes are left intact as well but are ignored. Flatten the file to its optimal size by rewriting it without the garbage. Software engineers describe this as ‘Garbage-Collection’.

Run the garbage-collecting, file-flattening process offline as a scheduled background task at a low priority so that it does not consume an annoying amount of CPU resource. Read about the nice command, which manages process priority from the command-line shell.

A properly designed file editor will add the new chunks and replace the old index with the new one. That will only leave the redundant blocks in the main body to be flattened.
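Flattening can be sketched as follows, again assuming a hypothetical layout in which the file ends with a signed 32-bit offset back to an index of (offset, length) pairs. Only chunks that the current index references are copied. Orphaned chunks and any stale index left in the body are dropped.

```python
import struct

TRAILER = struct.Struct(">i")   # assumed: signed offset relative to end of file
ENTRY = struct.Struct(">II")    # assumed: (chunk offset, chunk length)

def flatten(data: bytes) -> bytes:
    """Rewrite a chunked file so it holds only the chunks the index uses."""
    (back,) = TRAILER.unpack_from(data, len(data) - TRAILER.size)
    index_start = len(data) + back
    index_end = len(data) - TRAILER.size
    entries = [
        ENTRY.unpack_from(data, pos)
        for pos in range(index_start, index_end, ENTRY.size)
    ]

    body = bytearray()
    new_index = bytearray()
    for offset, length in entries:
        new_index += ENTRY.pack(len(body), length)  # the chunk's new position
        body += data[offset:offset + length]        # copy live chunks only
    new_back = -(len(new_index) + TRAILER.size)
    return bytes(body) + bytes(new_index) + TRAILER.pack(new_back)

# A file with one orphaned chunk (BBBBBB) and one stale 8-byte index.
stale = (
    b"AAAA" + b"BBBBBB" + b"CC"
    + ENTRY.pack(0, 4)                      # old index, no longer referenced
    + ENTRY.pack(10, 2) + ENTRY.pack(0, 4)  # current index
    + TRAILER.pack(-20)
)
flat = flatten(stale)
print(len(stale), "->", len(flat))   # 40 -> 26
```

This sketch works on the whole file in memory for brevity. A production tool would stream the chunks into a temporary file and swap it into place once the copy has succeeded.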

If the index or the chunks themselves are enhanced to identify a distinct data type, then metadata can be stored in a chunk as well. This facilitates mixed media and multi-channel storage in a single file.

Editing the file removes redundant chunks from the index, leaving them embedded in the file, and appends new ones to the end of the file. The index is updated when the file is saved.

ISO MPEG-4 Containers

The ISO Base Media File Format is similar to the Apple MOV file format developed for QuickTime. Apple originally designed this file structure to support multiple simultaneous media tracks of any type.

Additional features that support MPEG-4 coded media are described in MPEG-4 Part 14. Part 15 addresses the storage of AVC video content packaged for transmission on a network.

Combine MPEG-4 Parts 12, 14 and 15 and RFC 6381 to fully understand the MP4 container file structure. Then examine the MP4 registry to find descriptions of the internal chunk storage ATOMs.

Base Media File Formats

Base Media File Formats are foundations that provide a basic structure for storing arbitrary data of any kind. Other container formats are derived from them either by extending them or by using profiles.

Name | Description
MPEG-4 Part 1:2010 | MPEG-4 Part 1 describes the version 1 storage format. That is obsoleted by Part 14. The rest of the Part 1 standard is undergoing revision.
MPEG-4 Part 12:2022 | ISO Base Media File Format (ISOBMFF) is technically identical to the JPEG 2000 file storage defined in ISO 15444 Part 12 which it replaces. A new revision is under development.
MPEG-4 Part 14:2020 | MPEG-4 file format version 2 completely replaces the specification in Part 1 and is based on Part 12.
MPEG-4 Part 15:2022 | Describes how to package and store AVC (MPEG-4 Part 10) and HEVC video. It is based on Part 12. Amendments have been published.
IETF RFC 6381 | Specifies how supported media elements are described in the metadata, using the 'codecs' and 'profiles' parameters of the MIME type.
MP4 ATOM registry | The coded content is stored as structured chunks of data called ATOMs. A complete list of the ATOM types is available from the MP4 Registration Authority.
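To illustrate the kind of description RFC 6381 provides, here is a minimal sketch that pulls the codecs list out of a MIME type string and decodes an AVC entry. The function names are invented for this example, and the parsing is deliberately naive (it assumes no semicolons inside the quoted value).

```python
def parse_codecs(content_type: str) -> list[str]:
    """Extract the comma-separated codecs list from a MIME type string."""
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "codecs":
            return [c.strip() for c in value.strip('"').split(",")]
    return []

def describe_avc(codec: str) -> dict:
    """Decode 'avc1.PPCCLL': profile, constraint flags and level, in hex."""
    sample_entry, _, hexpart = codec.partition(".")
    if sample_entry not in ("avc1", "avc3") or len(hexpart) != 6:
        raise ValueError(f"not an AVC codecs entry: {codec!r}")
    profile, constraints, level = (int(hexpart[i:i + 2], 16) for i in (0, 2, 4))
    return {"profile_idc": profile, "constraints": constraints, "level_idc": level}

codecs = parse_codecs('video/mp4; codecs="avc1.64001F, mp4a.40.2"')
print(codecs)                    # ['avc1.64001F', 'mp4a.40.2']
print(describe_avc(codecs[0]))   # profile_idc 100 (High), level_idc 31 (3.1)
```

A player can inspect this string to decide whether it can decode a file before downloading any of the media data.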

Registering Atom & Box Type Codes

QTFF describes the media objects inside the container as Atoms and ISO calls them boxes. Each box has a size parameter, type code specifier and a payload. The entire contents of the file are managed in boxes with no other data allowed.
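The size, type and payload layout can be walked with very little code. The following is a minimal sketch rather than a complete parser: it handles the 32-bit size, the 64-bit extended size (signalled by a size of 1) and the "runs to the end" case (a size of 0), and the helper names are invented for this example.

```python
import io
import struct

def iter_boxes(f, start, end):
    """Yield (type, payload_offset, payload_size) for each box in [start, end)."""
    pos = start
    while pos + 8 <= end:
        f.seek(pos)
        size, box_type = struct.unpack(">I4s", f.read(8))
        header = 8
        if size == 1:                  # 64-bit size follows the type code
            (size,) = struct.unpack(">Q", f.read(8))
            header = 16
        elif size == 0:                # box extends to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size {size} at offset {pos}")
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size                    # the size field covers header and payload

def make_box(box_type: str, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (demo helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload

data = make_box("ftyp", b"isom" + bytes(4)) + make_box("free") + make_box("mdat", bytes(5))
for box in iter_boxes(io.BytesIO(data), 0, len(data)):
    print(box)
# ('ftyp', 8, 8)
# ('free', 24, 0)
# ('mdat', 32, 5)
```

Because every box declares its own size, a reader can skip any type it does not recognize by seeking past it, which is what makes the format so extensible.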

The Box and Atom type codes are managed by the MP4 Registration Authority. This is administered by Apple on behalf of ISO and the wider QuickTime community. The complete list of type codes is available for public access at their web site:

https://mp4ra.org/

The registry provides additional information and links to relevant standards defining the content and structure of each payload type. QTFF supports a few Atom types that are not available in ISOBMFF files. They are described in the registry to avoid name space collisions. The registry is available here:

https://mp4ra.org/registered-types/boxes

The Atom type codes are collated under these categories:

Category | Notes
ISO family | 270 unique code points.
User-data | 40 unique code points.
QuickTime specific | 16 unique code points. This table is allegedly not yet complete.
Deprecated Atoms | These are described in the Apple QuickTime File Format documentation.


Debugging ISO Media Container Files

Before QuickTime was revised to use the AVFoundation library, Apple developers had a variety of tools for disassembling movie files for inspection. Those tools are long extinct now that QuickTime 7 is deprecated.

This is an example output from an ancient Apple QuickTime de-compiler tool which illustrates the internal atom structure. ISOBMFF files would look very similar (see above).

Clearly, atoms are nested at multiple levels contained within one another. This tree structure is reflected into the object graph cached in memory when the file is loaded by an application.
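A recursive walk reproduces that tree in memory. The sketch below is illustrative only: the set of container types is a small, non-exhaustive selection, the helper names are invented, and the sample file uses placeholder payloads that are far shorter than real ones.

```python
import struct

# Box types that simply wrap child boxes (a small, non-exhaustive selection).
CONTAINERS = {"moov", "trak", "mdia", "minf", "stbl", "edts", "dinf"}

def parse_tree(data, start=0, end=None):
    """Return nested [(type, size, children), ...] for data[start:end]."""
    end = len(data) if end is None else end
    nodes, pos = [], start
    while pos + 8 <= end:
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                  # 64-bit size follows the type code
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:                # box extends to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size {size} at offset {pos}")
        box_type = raw_type.decode("latin-1")
        children = []
        if box_type in CONTAINERS:     # descend into the payload
            children = parse_tree(data, pos + header, pos + size)
        nodes.append((box_type, size, children))
        pos += size
    return nodes

def dump(nodes, depth=0):
    """Flatten the tree into indented text lines."""
    lines = []
    for box_type, size, children in nodes:
        lines.append(f"{'  ' * depth}{box_type} ({size} bytes)")
        lines.extend(dump(children, depth + 1))
    return lines

def box(box_type, payload=b""):
    """Build a box with a 32-bit size field (demo helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload

sample = box("ftyp", b"isom" + bytes(4)) + box(
    "moov", box("mvhd", bytes(4)) + box("trak", box("tkhd", bytes(4)))
)
print("\n".join(dump(parse_tree(sample))))
# ftyp (16 bytes)
# moov (40 bytes)
#   mvhd (12 bytes)
#   trak (20 bytes)
#     tkhd (12 bytes)
```

The parser needs to know which types are containers because nothing in the box header says so. That knowledge comes from the specifications and the registry.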


You will encounter similar tree structures in many scenarios in a software based production environment. They can be expressed in many different forms.


MP4 and ISOBMFF formatting problems can be traced with these more recent inspection tools:

Tool | Details
MediaInfo | Available from the MediaArea company. Lists the contents of the video file.
ffprobe | This is part of the ffmpeg toolkit and displays some of the metrics relating to the file content.
Boxdumper | An open-source tool that displays the Box structure.
IsoViewer | Inspects the internal box structure of an ISO file.
MP4Box.js | This is a JavaScript library to process MP4 files in a web browser.
Mp4dump | Built on top of the Bento4 C++ class library. This displays the entire box structure of an MP4 file.

MPEG Application Formats

More recently, MPEG has standardized specific ISOBMFF file formats for different target applications as the ISO 23000 MPEG Application Format.

The various parts of ISO 23000 collate media storage requirements under various categories. Collectively, these are described as MPEG-A.

Part | Application format
1 | Purpose for multimedia application formats.
2 | MPEG music player.
3 | MPEG photo player.
4 | Musical slide show.
5 | Media streaming.
6 | Professional archiving.
7 | Open access.
8 | Portable video.
9 | Digital Multimedia Broadcasting.
10 | Surveillance.
11 | Stereoscopic video.
12 | Interactive music.
13 | Augmented reality.
15 | Multimedia preservation.
16 | Publish/Subscribe.
17 | Multiple sensorial media.
18 | Media linking.
19 | Common media (CMAF) for segmented media.

File Name Extensions

These file name extensions are based on standards for the ISOBMFF containers:

Extension | Content
.mp4 | MP4 file format.
.3gp | 3GPP file format for 3G UMTS multimedia services. Used on 3G mobile phones.
.3g2 | 3GPP2 file format for 3G CDMA2000 multimedia services. It is more efficient than 3GPP.
.mj2 | Motion JPEG 2000.
.dvb | The DVB specific features of the ISOBMFF file format are described in ETSI TR 102 833 and DVB document A158.
.dcf | Stores a group of digital camera raw images.
.m21 | Contains MPEG-21 data.
.f4v | Adobe Flash Video.
.heif | High Efficiency Image File Format.

Applying ISOBMFF

The MPEG containers probably have the best support for end-users across all the client-player applications.

MPEG containers can carry a variety of different coded media elements but are designed around the MPEG codecs. These have been around for some time and are being overtaken by better performing and newer codecs. Those new codecs are often stored in a Matroška file container.

MPEG containers are not likely to disappear. They may become even more popular when the patents expire and there are no more license fees incurred. However, by then the Matroška format may be dominant.
