Standards: SMPTE ST 2110 - ST 2110-4x Metadata Transport
Metadata affects the entire lifecycle of media assets and describes many properties which are all important at different times during the media lifecycle. Sufficient detail is necessary to manage visual effects in post production and workflow processing and the rabbit hole is very deep. We find out how deep it goes.
The ST 2110-4x Metadata Documents
Metadata affects the entire lifecycle of media assets. Imaging and sound recording theory describes many subtle properties of the assets, all of which are vitally important at different times during that lifecycle. The rabbit hole is very deep. Sufficient detail is necessary to manage visual effects in post production and workflow processing effectively.
If you neglect to gather metadata at any stage, forensic reconstruction later on may sometimes be possible. However, if the data is truly lost, it is gone forever.
Before looking at ST 2110-40, it is worth reviewing ST 291, which is the source of the ancillary metadata carried in the ST 2110-40 streams.
The ST 2110 Metadata Standards
SMPTE has already published or is working on several standards covering the use of metadata in an ST 2110 based IP network.
| Standard | Description |
|---|---|
| ST 291-1 | Ancillary Data Packet and Space Formatting. |
| ST 2110-40 | Carriage of SMPTE ST 291-1 Ancillary Data extracted from an SDI feed. This was developed early in the ST 2110 architectural design. |
| ST 2110-41 | Fast Metadata Framework (FMX). A new approach which is more flexible than the earlier standard. |
| ST 2110-42 | Metadata for 2110-x streams using FMX. This is a work in progress that is on hold at the time of writing. |
| ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41. This partially replaces the need for ST 2110-42. |
| ST 2109 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata. |
| 29 | Caption data. |
Understanding how the ST 291 standard describes the ancillary data embedded in an SDI stream shows how that metadata is transformed for carriage in ST 2110-40 streams.
ST 291 Ancillary Data
There are two parts of the ST 291 standard:
- ST 291-1 – Ancillary data packet and Space Formatting.
- RP 291-2 – Ancillary data packet payload formats.
ST 291-1 is the standard for ancillary data but you should also read Recommended Practice RP 291-2 to apply it.
Vertical blanking is organized as lines of TV signals carrying a variety of test and other data. It is located before and after the active-image area. A safe area reserves some lines for special purposes. Vertical blanking does not include the horizontal blanking time which is located at the start of all lines whether they are visible or not.
Any audio found in the Horizontal Ancillary data (HANC) area will be delivered according to Part 30. The remaining ancillary data will be delivered in the Part 40 ANC stream. If there is no ancillary data to send, an empty ‘keep-alive’ heartbeat packet will be transmitted at least once for every field, frame or segment. ANC packets should be transmitted within 1 ms of the data appearing in the SDI feed.
The data arriving from an SDI feed should be identifiable using the identifying codes in ST 352. That can be used to determine the mapping of RP 291-2 data-types for the ANC packets.
Consult IETF RFC 8331 for the RTP payload definition for ST 291 ancillary data.
Note that an errata document has been published for ST 2110-40 (ed 2023) which describes some typographical errors.
ST 291 – Part 1
ST 291-1 is a robust metadata system designed to carry anything that could potentially appear in Vertical ANCillary data (VANC) and HANC data spaces within an incoming SDI feed.
There are two similar kinds of Ancillary Data Packets described: Type 1 & Type 2. Type 1 is more useful when lengthy user data is being transmitted (in excess of 255 words). Type 2 supports a wider variety of message types for delivering controls and signals.
The packets all carry a Data ID (DID) code that allows the receiving client player to handle each kind of packet correctly. Type 2 extends this with a Secondary Data ID (SDID) value.
High-definition ANC data formatting is described in recommended practice RP 291-2.
SMPTE often publishes recommended practice documents and sometimes promotes them to full standards when they are ready. The ST and RP documents therefore share a common numbering space.
Data ID & Secondary ID Codes
The Data ID (DID) and Secondary Data ID (SDID) values describe the type of ANC packet being transmitted. The Type 1 packets contain a single DID value while the Type 2 packets have a DID, plus an additional SDID value that allows for a greater range of semantic meanings:
- Type 1 – DID + Sequence Number.
- Type 2 – DID + SDID.
The DID and SDID formats are set out in ST 291-1 and the range of possible values is maintained in a registry managed by SMPTE at this URL:
https://smpte-ra.org/smpte-ancillary-data-smpte-st-291/
Note that a different URL is described at the top of Page 5 in the ST 291 standard but the target web page for it is missing.
The entire registry can be printed, saved as a PDF or downloaded as an Excel spreadsheet.
Prior to the online registry, this data was published in the SMPTE RP 291 document. This is now out of date and has been withdrawn.
Refer to Section 4 in the ST 291-1 standard for a description of how the registry is operated and how to request new ID values.
Once a meaning has been registered against a DID or SDID value, it can never be altered or deleted from the registry, although it might be deprecated.
Type 1 ANC Packets
Type 1 packets can carry these payloads in the user data area:
- Audio Data.
- Camera position.
- Error Detection and Handling.
- Continuous sequences of related packets.
- At the time of writing a generic Time Labelling format is also under development.
The internal structure of a Type 1 ANC packet comprises these parts:
| Value | Description |
|---|---|
| ADF | The Ancillary Data Flag indicates the start of the packet. It may be exactly one or three words long. |
| DID | The Data ID word describes what kind of data is being carried as a payload. |
| DBN | A Data Block Number describes this packet's position within a sequence of packets belonging to a specific DID value. The range of values is 1-255 and the cycle can then be restarted as many times as needed. |
| DC | A Data Count number describes how many words make up the user data payload. |
| UDW | The User Data Words payload, comprising up to 255 words in each packet. The format and syntax of this data will be defined in a separate document specific to the application. |
| CS | A Checksum word which provides basic error detection but not correction. |
Hexadecimal numeric notation is based on 16 symbols from 0 to F. A Hexadecimal number is suffixed with a small letter ‘h’ to denote this format.
The value of the Ancillary Data Flag is important. It is a pattern that must not appear anywhere else in the stream. Consequently, these hexadecimal word values are never permitted in the payload:
000h, 001h, 002h, 003h
3FCh, 3FDh, 3FEh, 3FFh
The ADF value is composed of three words for a component interface and only one word for a composite interface. The three-word hexadecimal ADF value for component interfaces is:
000h 3FFh 3FFh
The one-word hexadecimal ADF value for composite interfaces is:
3FCh
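The packet layout and ADF values described above can be illustrated with a minimal Python sketch. It assumes that the DID, DBN and DC words (and, in this sketch, 8-bit user data) carry their value in bits b0 to b7, with even parity in b8 and its inverse in b9, and that the checksum is a 9-bit sum over DID through the last UDW with b9 set to the inverse of b8. Those bit-level rules are a reading of ST 291-1 rather than something stated in this article, so check them against the standard. The function names are hypothetical.

```python
from typing import List

# Three-word Ancillary Data Flag for component interfaces (from the text above).
ADF_COMPONENT = [0x000, 0x3FF, 0x3FF]


def is_protected(word: int) -> bool:
    """True for the 10-bit values that may never appear in the payload."""
    return word <= 0x003 or word >= 0x3FC


def with_parity(value: int) -> int:
    """Wrap an 8-bit value in a 10-bit word (assumed: b8 even parity, b9 = NOT b8)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    b8 = bin(value).count("1") & 1
    b9 = b8 ^ 1
    return (b9 << 9) | (b8 << 8) | value


def checksum(words: List[int]) -> int:
    """Assumed rule: 9-bit sum of the nine LSBs of each word, b9 = NOT b8."""
    total = sum(w & 0x1FF for w in words) & 0x1FF
    b9 = ((total >> 8) & 1) ^ 1
    return (b9 << 9) | total


def build_type1_packet(did: int, dbn: int, payload: bytes) -> List[int]:
    """Assemble ADF, DID, DBN, DC, UDW and CS as a list of 10-bit words."""
    if len(payload) > 255:
        raise ValueError("at most 255 user data words per packet")
    body = [with_parity(did), with_parity(dbn), with_parity(len(payload))]
    body += [with_parity(b) for b in payload]
    return ADF_COMPONENT + body + [checksum(body)]
```

For example, `build_type1_packet(0xC0, 1, b"\x01\x02")` (with an arbitrary DID chosen only for illustration) yields nine words: the three ADF words, DID, DBN, DC, two user data words and the checksum. Under the stated assumptions, the parity bits keep every generated word out of the protected ranges.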
There are potential data losses when the ANC data is passed through legacy equipment that only supports 8-bits. The 10-bit ADF values are truncated by removing the two least significant bits.
This alters the value of the data word in a different (and more complex) way compared with truncating the most significant bits. Restoring back to a 10-bit value cannot replace the lost data in the two least significant bits. There is always some uncertainty regarding the restored word values.
For example, the value 3FFh becomes FFh in the 8-bit environment. When it is restored by padding with zero-bits, it becomes 3FCh. In fact, all four of the values between 3FCh and 3FFh are restored as 3FCh. Likewise, the values 0 to 3 all become zero in the 8-bit world and are always zero when restored to a 10-bit value.
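The loss can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the truncation is modelled as a plain two-bit shift, as described above.

```python
def to_8bit(word10: int) -> int:
    """Drop the two least significant bits of a 10-bit word."""
    return (word10 & 0x3FF) >> 2


def restore_10bit(word8: int) -> int:
    """Pad an 8-bit word back to 10 bits with two zero bits."""
    return (word8 & 0xFF) << 2


if __name__ == "__main__":
    for original in (0x000, 0x003, 0x3FC, 0x3FD, 0x3FE, 0x3FF):
        restored = restore_10bit(to_8bit(original))
        # Print in the article's notation, e.g. "3FFh -> FFh -> 3FCh".
        print(f"{original:03X}h -> {to_8bit(original):02X}h -> {restored:03X}h")
```

All four values from 3FCh to 3FFh come back as 3FCh, and 000h to 003h all come back as 000h, so a receiver cannot tell which of the four original values was sent.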
The truncation is not applied in a consistent manner across the whole packet which adds to the complexity.
This has implications for the synchronization of packet starts and also the identification of DID and SDID values that determine the meaning of each packet. Read about Legacy Equipment on Page 3, then read Sections 5 and 6 of the ST 291 standard to fully understand the complexities of this.
You need not worry about this if there are no 8-bit legacy systems in your enterprise.
Type 2 ANC Packets
Type 2 packets can carry a wider variety of payloads:
- Acquisition Metadata Sets for video camera parameters.
- AFD and Bar Data.
- Ancillary Time Code.
- ANSI/SCTE 104 messages.
- Compressed audio metadata.
- Data broadcast (DTV).
- DVB/SCTE VBI data.
- EIA 608 Data mapping.
- EIA 708B Data mapping.
- Extended HDR/WCG for SDI.
- Film Codes.
- HD-SDTI transport in active frame space.
- KLV Metadata transport.
- Link Encryption Messages & Metadata.
- Lip Sync data as specified by ST 2064-1.
- Metadata to monitor errors of audio and video signals on a broadcasting chain.
- MPEG recoding data.
- MPEG TS packets.
- Packing UMID and Program Identification Label Data into SMPTE 291M Ancillary Data Packets.
- Pan-Scan Data.
- Payload Identification.
- Program Description.
- SDTI transport in active frame space.
- Stereoscopic 3D Frame Compatible Packing and Signaling.
- Structure of inter-station control data conveyed by ancillary data packets.
- Subtitling Distribution packet (SDP).
- Time Code for High Frame Rate Signals.
- Transport of ANC packet in an ANC Multi-packet.
- Two Frame Marker.
- VBI Data.
- Vertical Ancillary Data Mapping of KLV Formatted HDR/WCG Metadata.
- WSS data per RDD 8.
In the internal structure of a Type 2 ANC packet, the Data Block Number (DBN) is replaced with a Secondary Data ID (SDID) value. Hence, Type 2 packets cannot be used to construct sequences of related packets unless the sequencing is managed by the payload content.
ST 2110-40 – Carriage Of ST 291-1 Ancillary Data
Part 40 describes how to carry ST 291 compliant ancillary data transmitted during the horizontal and vertical blanking intervals. Ancillary data includes these different elements:
- Closed captioning.
- Subtitles.
- Timecode.
- Metadata.
- Slate information.
- Teletext.
- Copy prevention signals.
- Vertical Interval Test Signals.
Ancillary data (ANC) extracted from an SDI stream is delivered in real-time alongside the video and audio essence data. They all travel in separate RTP streams around the IP network. The system architecture is described in ST 2110-10 which covers timing and all the common characteristics of the different streams. The streams are all synchronized to a common reference clock.
The individual packets of ANC data are described in SMPTE standard ST 291 Part 1. The source of this data is the VANC and HANC data spaces within an SDI signal. Those data spaces are described in RP 291 Part 2 for high-definition TV services.
The code that supports this process is quite simple to build:
- The HANC and VANC ancillary data spaces are de-embedded from the incoming SDI feed.
- The individual data items are parsed out and stored in local variables.
- The metadata is formatted into ST 291-1 packets.
- The ST 291 packets are marshalled into an RTP stream for output.
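As a rough illustration of the last two steps, the sketch below groups the items de-embedded from one field or frame into descriptions of outgoing RTP packets. It is not a wire format. The class names, the function name and the `max_items` limit are all hypothetical, and the de-embedding and parsing steps are assumed to have been done already by SDI hardware or a vendor library. The keep-alive behaviour and the shared timestamp follow the rules described in this article.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AncItem:
    """One ancillary data item parsed out of the HANC or VANC space."""
    line_number: int
    horizontal_offset: int
    did: int
    sdid: int
    user_data: bytes


@dataclass
class RtpAncPayload:
    """Description of one outgoing RTP packet (illustrative only)."""
    rtp_timestamp: int
    items: List[AncItem]


def packetize_field(rtp_timestamp: int, deembedded: Iterable[AncItem],
                    max_items: int = 16) -> List[RtpAncPayload]:
    """Group the ANC items of one field or frame into RTP payloads.

    Every payload shares the same timestamp. When there is nothing to
    send, a single empty payload acts as the keep-alive heartbeat.
    """
    items = list(deembedded)
    if not items:
        return [RtpAncPayload(rtp_timestamp, [])]
    return [RtpAncPayload(rtp_timestamp, items[i:i + max_items])
            for i in range(0, len(items), max_items)]
```

A real sender would go on to serialize each payload according to RFC 8331 and ST 2110-10, which is where most of the detail lies.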
Mapping ST 291 Packets Into RTP Streams
The mapping of ST 291 ANC data packets into the RTP streams is described in IETF RFC 8331. This is constrained by the requirements set out in ST 2110-10. It is important to obtain all of these standards first and cross-refer while you build, test and debug your IP network infrastructure.
RFC 8331 retains the location within the SDI ANC data spaces where the source metadata was found. It is described in terms of these properties:
- Line Number.
- Horizontal Offset.
- Stream Number.
- Interlaced Field (where applicable).
A stream of properly constructed RFC 8331 packets will allow the original SDI signal to be reconstructed for output at the destination.
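To make the location properties concrete, here is a minimal sketch that packs and unpacks them as a single 32-bit word. The field widths (a 1-bit C flag, 11-bit line number, 12-bit horizontal offset, 1-bit S flag and 7-bit stream number) are taken from a reading of RFC 8331 and should be checked against the RFC. The interlaced field indication is carried elsewhere in the payload header and is not modelled here. The class and function names are hypothetical.

```python
from typing import NamedTuple


class AncLocation(NamedTuple):
    c: int                  # assumed: selects the colour-difference channel
    line_number: int        # SDI line where the ANC packet was found
    horizontal_offset: int  # position of the packet within that line
    s: int                  # assumed: set when stream_num is meaningful
    stream_num: int         # source data stream number


# Bit widths of the fields above, most significant first (assumption).
_WIDTHS = (1, 11, 12, 1, 7)


def pack_location(loc: AncLocation) -> bytes:
    """Pack the location fields into four bytes, network byte order."""
    word = 0
    for value, width in zip(loc, _WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError("field value out of range")
        word = (word << width) | value
    return word.to_bytes(4, "big")


def unpack_location(data: bytes) -> AncLocation:
    """Recover the location fields from the first four bytes of data."""
    word = int.from_bytes(data[:4], "big")
    fields = []
    for width in reversed(_WIDTHS):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return AncLocation(*reversed(fields))
```

Keeping the widths in one tuple means the packer and the unpacker cannot drift apart if the layout needs correcting.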
Timestamps are based on the video fields and should be the same for all ANC packets derived from a single interlaced field or progressive frame.
If there is no ANC data to be transmitted, a ‘keep-alive’ packet is delivered as a heartbeat to maintain the connection.
Timing is critical to maintain synchronization. Keeping latency to a minimum is important too. Section 6 of the ST 2110-40 standard explores this in some depth.
The Session Description Protocol (SDP) is widely used within the ST 2110 environment. This is also described in RFC 8331 and ST 2110-10. Refer to Section 7 of ST 2110-40 for details of how SDP is employed by metadata streams.
Section 7 also mentions that the flow identification semantics described in RFC 8331 should not be used as they are incompatible with ST 2110.
A Format Specific Parameter is also described in Section 7 and must be defined correctly to include the frame rate.
ST 2110-41 – Fast Metadata Framework (FMX)
Part 41 specifies how to transport Fast Metadata (FMX) that did not originate from an ST 291 compliant source. Obtaining signaling data from a separate data stream is easier than unpacking video streams.
ST 2110-40 could potentially consume many IP addresses. FMX described in ST 2110-41 is a more flexible format for transmitting metadata payloads in RTP packets. It is designed to use the available network resources more efficiently.
This document is still being worked on but is due to be released very soon. There is some outstanding work required to define the payload formats to be carried by this transport.
The design goals and advantages of FMX over ST 2110-40 + ST 291 are:
- More easily parsed.
- Rapid delivery of metadata.
- Time aware and synchronized with the video and audio essence.
- Extensible architecture beyond what we currently think we need.
- Payload agnostic and can transport anything (such as XML, JSON or any other format).
- Will use ST 2110-10 and ST 2059 to synchronize.
- Upwards compatible and will support ST 291 SDI ANC data.
- Useable with any kind of media stream.
- Delivered independently of an elementary stream.
- Must require no changes to any other ST 2110 specifications.
- More efficient use of IP address space.
- Can be synchronized with the system clock more easily.
- Uses a Key-Length-Value (KLV) packaging model.
- Single level of encapsulation.
- Allows more sophisticated metadata schemas than SDI.
- Supports larger amounts of metadata than SDI.
FMX Packet Structure
FMX supports the delivery of time-synchronized metadata associated with audio and video data streams or less tightly coupled metadata that does not need to be synchronous.
Like many other ST 2110 standards, this also describes RTP payloads and SDP messaging formats.
Section 5 describes the RTP packet format comprising two parts:
- RTP Header.
- RTP Payload.
The header is described in RFC 3550. ST 2110-41 adds constraints to profile this and render the packets compatible with the rest of ST 2110.
This is much simpler than ST 2110-40 in the way that it describes the format of the payload content. The RTP packet size is still constrained by the requirements set out in ST 2110-10.
The payload is constructed from a collection of Data Item Packages. An empty RTP packet can be delivered without any Data Item Packages for keep-alive purposes.
The Data Item Packages must not be split across multiple RTP packets. Their size must be a multiple of 32 bits but can be variable within that constraint. Any necessary padding is considered to be part of the Data Item Package and would be specified in the application document.
| Value | Description |
|---|---|
| DIT | The Data Item Type is a 22-bit value that describes the contents of the package. |
| K-bit | This flag bit manages the segmentation of the payload into objects. See Annex A of ST 2110-41 for details. |
| DIL | The Data Item Length. The number of 32-bit fragments in the payload body. |
| 32-bit Fragments | All of the payload data must be split into 32-bit fragments. |
| Padding | The last fragment may contain padding to ensure the data length is a multiple of 32-bits. |
| Payload | Some semantic information may be carried in the payload to describe the data structure and how much padding is included. |
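The fields in the table can be sketched as a small Python packer. This is a hedged illustration, not the normative layout: it assumes the DIT, K-bit and DIL share a single 32-bit big-endian header word in that order, which leaves 9 bits for the DIL, and it pads with zero bytes even though the actual padding rules belong to the application document. The function name is hypothetical.

```python
def build_data_item(dit: int, payload: bytes, k_bit: int = 0) -> bytes:
    """Build one Data Item Package: header word plus 32-bit aligned body."""
    if not 0 <= dit < (1 << 22):
        raise ValueError("DIT must fit in 22 bits")
    if k_bit not in (0, 1):
        raise ValueError("K-bit must be 0 or 1")

    # Pad the body to a whole number of 32-bit fragments (zero fill assumed).
    padded = payload + b"\x00" * (-len(payload) % 4)
    dil = len(padded) // 4
    if dil >= (1 << 9):
        raise ValueError("payload too long for the assumed 9-bit DIL")

    # Assumed header layout: DIT (22 bits) | K (1 bit) | DIL (9 bits).
    header = (dit << 10) | (k_bit << 9) | dil
    return header.to_bytes(4, "big") + padded
```

With these assumptions, a five-byte payload is padded to eight bytes, the DIL is 2, and the whole package is twelve bytes long.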
An alternative segmentation scheme for supporting an object-like structure is described in Annex A.
Section 6 describes the SDP signaling message format which must conform to RFC 4566 and ST 2110-10. This mandates that the SSN value should describe an ST 2110-41 stream and enumerate any Data Item Types that may appear in the stream. Here is an example SDP message:
a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00
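A receiver could pull the stream type and the advertised Data Item Types out of that line with a short parser. The sketch below is an illustration only. It assumes the parameters are separated by semicolons and that the DIT values are hexadecimal, which is how they appear in the example but is not confirmed here. The function name and the returned dictionary keys are hypothetical.

```python
from typing import Any, Dict


def parse_fmx_fmtp(line: str) -> Dict[str, Any]:
    """Parse an 'a=fmtp:' line for an ST 2110-41 stream into a dict."""
    prefix, _, params = line.strip().partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute line")
    payload_type = int(prefix[len("a=fmtp:"):])

    # Split 'KEY=value; KEY=value' into a dictionary.
    fields: Dict[str, str] = {}
    for part in params.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key.strip()] = value.strip()

    # DIT values are assumed to be hexadecimal, comma separated.
    dits = [int(v, 16) for v in fields.get("DIT", "").split(",") if v.strip()]
    return {"payload_type": payload_type, "ssn": fields.get("SSN"), "dit": dits}


if __name__ == "__main__":
    example = "a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00"
    print(parse_fmx_fmtp(example))
```

Read as hexadecimal, all four values in the example fit within the 22-bit Data Item Type described earlier.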
The payload formats are described in these separate standards:
| Payload | Description |
|---|---|
| ST 2109 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata. |
| ST 2110-42 | Technical metadata. Currently on-hold at the time of writing. |
| ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41. |
ST 2127-2 – Mapping MGA Audio Metadata To ST 2110-41
ST 2127 Part 2 describes the mapping of Metadata-Guided Audio (MGA) payloads into the Fast Metadata Framework (FMX). MGA is described more fully in ST 2127 Part 1 where it is used with MXF files and the FMX framework is described in ST 2110-41.
This standard was described in one of the SMPTE quarterly reports in 2022 as a partial replacement for ST 2110-42.
The standard is quite short. It refers to the audio essence being carried by an ST 2110-30 stream and controlled by the ST 2127-2 metadata, which is carried in an ST 2110-41 framework using rules defined in ST 2127-1.
Acquire and read all four documents to understand the entire structure.
ST 2110-42 – FMX Payload For 2110 Technical Metadata
The original intent for this standard was to describe an object-based format for carrying the technical metadata about the video/audio streams. This would be delivered using the framework defined by ST 2110-41.
This is not intended to replace NMOS IS-04 or the SDP messaging protocols. The metadata payloads would be packaged as described in ST 2110-41. These are some of the items that were proposed:
| ST 2110 Part | Metadata carried |
|---|---|
| 20 | Format-specific parameter (fmtp) values for the stream. |
| 30 & 31 | The Packetization time (ptime) value and the number of channels. |
| 40 | The video format tag (VPID byte). |
| All | AMWA Sender ID and/or Flow ID. |
Initial work was started in 2020 but the only public mention of ST 2110-42 is in the SMPTE quarterly reports. The most recent information suggests this standard may not be necessary as the requirements will be covered by other standards that are nearing completion. In particular ST 2127-2 is mentioned.
As of February 2023, further progress on ST 2110-42 appears to have halted for the time being, indicating that it may now be considered redundant.
Relevant Standards
Refer to these other standards for supporting material that will assist in understanding and deploying ST 2110-4x metadata content:
| Document | Vintage | Description |
|---|---|---|
| ST 125 | 2013 | SDTV Component Video Signal Coding 4:4:4 and 4:2:2 for 13.5 MHz and 18 MHz Systems. |
| RP 165 | 1994 | Error Detection Check-words and Status Flags for Use in Bit-Serial Digital Interfaces for Television. |
| RP 168 | 2009 | Definition of Vertical Interval Switching Point for Synchronous Video Switching. |
| ST 259 | 2008 | SDTV Digital Signal/Data – Serial Digital Interface. |
| ST 274 | 2008 | 1920 x 1080 Image Sample Structure, Digital Representation and Digital Timing Reference Sequences for Multiple Picture Rates. |
| ST 291-1 | 2011 | Ancillary Data Packet and Space Formatting. |
| RP 291-2 | 2013 | Ancillary Data Space Use – 4:2:2 SDTV and HDTV Component Systems and 4:2:2 2048 × 1080 Production Image Formats. |
| ST 292-1 | 2018 | 1.5 Gb/s Signal/Data Serial Interface. |
| ST 296 | 2012 | 1280 × 720 Progressive Image 4:2:2 and 4:4:4 Sample Structure – Analog and Digital Representation and Analog Interface. |
| ST 305 | 2005 | Serial Data Transport Interface. |
| ST 352 | 2013 | Payload Identification Codes for Serial Digital Interfaces. |
| ST 2059-1 | 2021 | SMPTE Profile for Use of IEEE-1588 Precision Time Protocol in Professional Broadcast Applications. This covers the generation and Alignment of Interface Signals to the SMPTE Epoch. |
| ST 2109 | 2019 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata. |
| ST 2110-10 | 2022 | Professional Media over Managed IP Networks – System Timing and Definitions. |
| ST 2110-20 | 2022 | Professional Media Over Managed IP Networks – Uncompressed Active Video. |
| ST 2110-21 | 2022 | Professional Media Over Managed IP Networks – Traffic Shaping and Delivery Timing for Video. |
| ST 2127-1 | 2022 | Mapping Metadata-Guided Audio (MGA) signals into the MXF Constrained Generic Container. |
| ST 2127-2 | 2024 | Mapping MGA Audio Metadata to ST 2110-41. |
| RFC 3550 | 2003 | RTP: A Transport Protocol for Real-Time Applications. |
| RFC 4566 | 2006 | SDP: Session Description Protocol. |
| RFC 5234 | 2008 | Augmented BNF for Syntax Specifications. |
| RFC 8285 | 2017 | A General Mechanism for RTP Header Extensions. |
| RFC 8331 | 2018 | RTP Payload for SMPTE ST 291 Ancillary Data. |
| VSF TR-03 | 2015 | Transport of Uncompressed Elementary Stream Media over IP. |
| VSF TR-04 | 2015 | Utilization of ST 2022-6 Media Flows within a VSF TR-03 Environment. |
Using Metadata In ST 2110
Metadata management within the ST 2110 architecture is a work in progress. The ST 2110-40 standard is designed around the ancillary data found in an SDI signal.
A more generalized approach is specified in ST 2110-41 but this only describes the framework for carriage of the metadata.
ST 2110-42 was intended to describe the message payloads and metadata formats but this work is currently suspended.
Some audio metadata is described in ST 2127-2 and ST 2109 but there is currently no corresponding standardized description for video-related metadata delivered via FMX.
For now, we must watch and wait.
Metadata keeps track of your products in the archives so they can be repurposed and monetized later.