Standards: Part 20 - ST 2110-4x Metadata Standards
Our series continues with Metadata. It is the glue that connects all your media assets to each other and steers your workflow. You cannot find content in the library or manage your creative processes without it. Metadata can also control the end-user playback experience.
This article is part of our growing series on Standards.
There is an overview of all 26 articles in Part 1 - An Introduction To Standards.
Metadata is vital to managing your collection of assets in an archive or librarian system. It describes relationships between different media assets. It is also vital when managing the production workflow.
During playback, metadata-driven signals trigger remote actions that present supporting assets in the client player, synchronized with the content the viewer is watching.
It is the essential glue that steers the production processes and integrates all your media assets.
The ST 2110 Metadata Standards
SMPTE has already published or is working on several standards covering the use of metadata in an ST 2110 based IP network.
Standard | Description |
---|---|
ST 2110-40 | Carriage of SMPTE ST 291-1 Ancillary Data extracted from an SDI feed. This was developed early in the ST 2110 architectural design. |
ST 2110-41 | Fast Metadata Framework (FMX). A new approach which is more flexible than the earlier standard. |
ST 2110-42 | Metadata for 2110-x streams using FMX. This is a work in progress that is currently on hold. |
ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41. This partially replaces the need for ST 2110-42. |
ST 2109 | Format for Non-PCM Audio and Data in AES3 - Audio Metadata. |
ST 2110-40 - Carriage Of SMPTE ST 291-1 Ancillary Data
Ancillary data (ANC) extracted from an SDI stream is delivered in real-time alongside the video and audio essence data. They all travel in separate RTP streams around the IP network. The system architecture is described in ST 2110-10 which covers timing and all the common characteristics of the different streams. The streams are all synchronized to a common reference clock.
The individual packets of ANC data are described in SMPTE standard ST 291 Part 1. The source of this data is the VANC and HANC data spaces within an SDI signal. Those data spaces are described in RP 291 Part 2 for high-definition TV services.
The code that supports this process is quite simple to build:
- The HANC and VANC ancillary data spaces are de-embedded from the incoming SDI feed.
- The individual data items are parsed out and stored in local variables.
- The metadata is formatted into ST 291-1 packets.
- The ST 291 packets are marshalled into an RTP stream for output.
ST 291 Part 1
ST 291-1 is a robust metadata system designed to carry anything that could potentially appear in VANC and HANC data spaces within an incoming SDI feed.
There are two similar kinds of Ancillary Data Packets described (Type 1 & 2). Type 1 is more useful when lengthy user data is being transmitted (in excess of 255 words). Type 2 supports a wider variety of message types for delivering controls and signals.
The packets all carry a Data ID (DID) code that allows the receiving client player to handle each kind of packet correctly. Type 2 extends this with a Secondary Data ID (SDID) value.
High-definition ANC data formatting is described in recommended practice RP 291-2.
SMPTE often publishes recommended practice documents and sometimes promotes them to full standards when they are ready. The ST and RP documents therefore share a common numbering space.
Data ID & Secondary ID Codes
The Data ID (DID) and Secondary Data ID (SDID) values describe the type of ANC packet being transmitted. The Type 1 packets contain a single DID value while the Type 2 packets have a DID plus an additional SDID value that allows for a greater range of semantic meanings:
- Type 1 - DID + Data Block Number (a sequence number).
- Type 2 - DID + SDID.
The DID and SDID formats are set out in ST 291-1 and the range of possible values is maintained in a registry managed by SMPTE at this URL:
https://smpte-ra.org/smpte-ancillary-data-smpte-st-291/
Note that a different URL is described at the top of Page 5 in the ST 291 standard but the target web page for it is missing.
The entire registry is available to print or save as a PDF. Downloading the entire table in an Excel spreadsheet is also possible but there is currently a fault in the way the link is constructed. Modify the link to remove the invalid parameter flagged by your web browser to download the Excel spreadsheet version. The URL parameter to remove is: &Privatelink=…. Leave the rest intact. Ignore these instructions when the problem is fixed:
https://creatorapp.zoho.com/smptezoho/smpte-291-2010/xls/Identifiers_Status/…
&appLinkName=smpte-291-2010
&viewLinkName=Identifiers_Status
&Privatelink=… … …
&fileType=xls
Prior to the online registry, this data was published in the SMPTE RP 291 document. This is now out of date and has been withdrawn.
Refer to Section 4 in the ST 291-1 standard for a description of how the registry is operated and how to request new ID values.
Once a meaning has been registered against a DID or SDID value, it can never be altered or deleted from the registry, although it might be deprecated.
Type 1 ANC Packets
Type 1 packets can carry these payloads in the user data area:
- Audio Data.
- Camera position.
- Error Detection and Handling.
- Continuous sequences of related packets.
- A generic Time Labelling format is under development.
Here is the internal structure for a Type 1 ANC packet, in transmission order: ADF, DID, DBN, DC, UDW, CS.
These are the various parts of the packet:
Value | Description |
---|---|
ADF | The Ancillary Data Flag indicates the start of the packet. It may be exactly one or three words long. |
DID | The Data ID word describes what kind of data is being carried as a payload. |
DBN | A Data Block Number describes this packet's position within a sequence of packets belonging to a specific DID value. The range of values is 1-255 and the cycle can then be restarted as many times as needed. |
DC | A Data Count number describes how many words make up the user data payload. |
UDW | The User Data Words payload, comprising up to 255 words in each packet. The format and syntax of this data will be defined in a separate document specific to the application. |
CS | A Checksum word which provides basic error detection but not correction. |
Hexadecimal numeric notation is based on 16 symbols from 0 to F. A Hexadecimal number is suffixed with a small letter ‘h’ to denote this format.
The value of the Ancillary Data Flag is important. It is a pattern that must not appear anywhere else in the stream. Consequently, these hexadecimal word values are never permitted in the payload:
000h, 001h, 002h, 003h
3FCh, 3FDh, 3FEh, 3FFh
The ADF value is composed of three words for a component interface and only one word for a composite interface. The three-word hexadecimal ADF value for component interfaces is:
000h 3FFh 3FFh
The one-word hexadecimal ADF value for composite interfaces is:
3FCh
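As a rough illustration of the packet layout described above, the following Python sketch assembles a Type 1 packet for a component interface. It assumes the usual ST 291-1 word coding, where bits 0-7 carry the value, bit 8 is even parity and bit 9 is the inverse of bit 8, and that the checksum is the 9-bit sum of the DID, DBN, DC and UDW words. The function names and the example DID value are hypothetical, so check the details against the standard before relying on them.

```python
ADF_COMPONENT = [0x000, 0x3FF, 0x3FF]  # three-word ADF for component interfaces


def with_parity(value):
    """Wrap an 8-bit value in a 10-bit word (b8 = even parity, b9 = NOT b8)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    parity = bin(value).count("1") & 1
    return value | (parity << 8) | ((parity ^ 1) << 9)


def checksum(words):
    """9-bit sum of the nine LSBs of each word, with b9 = NOT b8 (assumed rule)."""
    total = sum(w & 0x1FF for w in words) & 0x1FF
    return total | ((((total >> 8) & 1) ^ 1) << 9)


def build_type1_packet(did, dbn, user_bytes):
    """Return the packet as a list of 10-bit words: ADF, DID, DBN, DC, UDW, CS."""
    if len(user_bytes) > 255:
        raise ValueError("a packet carries at most 255 user data words")
    body = [with_parity(did), with_parity(dbn), with_parity(len(user_bytes))]
    body += [with_parity(b) for b in user_bytes]
    return ADF_COMPONENT + body + [checksum(body)]


# 0xC0 is an arbitrary example DID, not a value taken from the registry.
packet = build_type1_packet(0xC0, 1, bytes([0x01, 0x02]))
print([f"{w:03X}h" for w in packet])
```

Because bit 9 is always the inverse of bit 8 in this coding, none of the words after the ADF can land in the protected ranges 000h-003h or 3FCh-3FFh.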
There are potential data losses when the ANC data is passed through legacy equipment that only supports 8-bits. The 10-bit ADF values are truncated by removing the two least significant bits. This alters the value of the data word in a different (and more complex) way compared with truncating the most significant bits. Restoring back to a 10-bit value cannot replace the lost data in the two least significant bits. There is always some uncertainty regarding the restored word values.
For example, the value 3FFh becomes FFh in the 8-bit environment. When it is restored by padding with zero-bits, it becomes 3FCh. In fact, all four of the values between 3FCh and 3FFh are restored as 3FCh. Likewise, the values 0 to 3 all become zero in the 8-bit world and are always zero when restored to a 10-bit value.
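The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the truncate-and-restore round trip described above; the function names are made up for illustration.

```python
def truncate_to_8bit(word):
    """Drop the two least significant bits of a 10-bit word."""
    return (word & 0x3FF) >> 2


def restore_to_10bit(byte):
    """Pad an 8-bit value with two zero bits to get back to 10 bits."""
    return (byte & 0xFF) << 2


for word in (0x000, 0x003, 0x3FC, 0x3FF, 0x2C5):
    narrow = truncate_to_8bit(word)
    restored = restore_to_10bit(narrow)
    print(f"{word:03X}h -> {narrow:02X}h -> {restored:03X}h")
```

Every group of four adjacent 10-bit values collapses to the same restored value, which is why the receiver cannot tell which of them was originally sent.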
The truncation is not applied in a consistent manner across the whole packet which adds to the complexity.
This has implications for the synchronization of packet starts and also the identification of DID and SDID values that determine the meaning of each packet. Read about Legacy Equipment on Page 3, then read Sections 5 and 6 of the ST 291 standard to fully understand the complexities of this.
You need not worry about this if there are no 8-bit legacy systems in your enterprise.
Type 2 ANC Packets
Type 2 packets can carry a wider variety of payloads:
- Acquisition Metadata Sets for Video Camera Parameters.
- AFD and Bar Data.
- Ancillary Time Code.
- ANSI/SCTE 104 messages.
- Compressed Audio Metadata.
- Data broadcast (DTV).
- DVB/SCTE VBI data.
- EIA 608 Data mapping.
- EIA 708B Data mapping.
- Extended HDR/WCG for SDI.
- Film Codes.
- HD-SDTI transport in active frame space.
- KLV Metadata transport.
- Link Encryption Messages & Metadata.
- Lip Sync data as specified by ST 2064-1.
- Metadata to monitor errors of audio and video signals on a broadcasting chain.
- MPEG recoding data.
- MPEG TS packets.
- Packing UMID and Program Identification Label Data into SMPTE 291M Ancillary Data Packets.
- Pan-Scan Data.
- Payload Identification.
- Program Description.
- SDTI transport in active frame space.
- Stereoscopic 3D Frame Compatible Packing and Signaling.
- Structure of inter-station control data conveyed by ancillary data packets.
- Subtitling Distribution packet (SDP).
- Time Code for High Frame Rate Signals.
- Transport of ANC packet in an ANC Multi-packet.
- Two Frame Marker.
- VBI Data.
- Vertical Ancillary Data Mapping of KLV Formatted HDR/WCG Metadata.
- WSS data per RDD 8.
Here is the internal structure for a Type 2 ANC packet, in transmission order: ADF, DID, SDID, DC, UDW, CS.
Type 2 packets replace the Data Block Number (DBN) with a Secondary Data ID (SDID) value. Hence, they cannot be used to construct sequences of related packets unless the sequencing is managed by the payload content.
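A receiver has to decide whether the word after the DID is a DBN or an SDID. The sketch below assumes that ST 291-1 signals this with bit 7 of the DID (set for Type 1, clear for Type 2) and that the checksum is the 9-bit sum of the preceding words with bit 9 inverted from bit 8. Both points, and the function name, are assumptions to verify against the standard.

```python
def parse_anc_words(words):
    """Decode 10-bit words starting at the DID (ADF already removed)."""
    did = words[0] & 0xFF
    second = words[1] & 0xFF       # DBN for Type 1, SDID for Type 2
    count = words[2] & 0xFF        # Data Count: number of user data words
    end = 3 + count
    if len(words) < end + 1:
        raise ValueError("packet is shorter than its Data Count implies")

    # Recompute the checksum over DID, DBN/SDID, DC and the user data words.
    total = sum(w & 0x1FF for w in words[:end]) & 0x1FF
    expected_cs = total | ((((total >> 8) & 1) ^ 1) << 9)

    packet = {
        "did": did,
        "data_count": count,
        "user_data": list(words[3:end]),
        "checksum_ok": words[end] == expected_cs,
    }
    if did & 0x80:                 # assumed: bit 7 set means Type 1
        packet["type"] = 1
        packet["dbn"] = second
    else:
        packet["type"] = 2
        packet["sdid"] = second
    return packet


example = [0x161, 0x101, 0x200, 0x262]  # DID 61h, SDID 01h, empty payload
print(parse_anc_words(example))
```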
Mapping ST 291 Packets Into RTP Streams
The mapping of ST 291 ANC data packets into the RTP streams is described in IETF RFC 8331. This is constrained by the requirements set out in ST 2110-10. It is important to obtain all of these standards first and cross-refer while you build, test and debug your IP network infrastructure.
RFC 8331 retains the location within the SDI ANC data spaces where the source metadata was found. It is described in terms of these properties:
- Line Number.
- Horizontal Offset.
- Stream Number.
- Interlaced Field (where applicable).
A stream of properly constructed RFC 8331 packets will allow the original SDI signal to be reconstructed for output at the destination.
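To make the location properties more concrete, here is a minimal Python sketch that packs them into a single 32-bit word. The bit widths used (C flag 1 bit, Line_Number 11 bits, Horizontal_Offset 12 bits, S flag 1 bit, StreamNum 7 bits) are an assumption based on a reading of RFC 8331, and the function names are hypothetical. The interlaced field indication is not part of this word in the sketch. Confirm the layout against the RFC before using it.

```python
import struct


def pack_anc_location(line_number, horizontal_offset, stream_num=0,
                      s_flag=0, c_flag=0):
    """Pack C(1) | Line_Number(11) | Horizontal_Offset(12) | S(1) | StreamNum(7)."""
    if not 0 <= line_number < (1 << 11):
        raise ValueError("line_number must fit in 11 bits")
    if not 0 <= horizontal_offset < (1 << 12):
        raise ValueError("horizontal_offset must fit in 12 bits")
    if not 0 <= stream_num < (1 << 7):
        raise ValueError("stream_num must fit in 7 bits")
    word = (((c_flag & 1) << 31) | (line_number << 20)
            | (horizontal_offset << 8) | ((s_flag & 1) << 7) | stream_num)
    return struct.pack("!I", word)  # network byte order


def unpack_anc_location(data):
    """Reverse of pack_anc_location for the first four bytes of data."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "c_flag": word >> 31,
        "line_number": (word >> 20) & 0x7FF,
        "horizontal_offset": (word >> 8) & 0xFFF,
        "s_flag": (word >> 7) & 1,
        "stream_num": word & 0x7F,
    }


location = pack_anc_location(line_number=9, horizontal_offset=0)
print(location.hex(), unpack_anc_location(location))
```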
Timestamps are based on the video fields and should be the same for all ANC packets derived from a single interlaced-field or progressive-frame.
If there is no ANC data to be transmitted, a ‘keep-alive’ packet is delivered as a heartbeat to maintain the connection.
Timing is critical to maintain synchronization. Keeping latency to a minimum is important too. Section 6 of the ST 2110-40 standard explores this in some depth.
The Session Description Protocol (SDP) is widely used within the ST 2110 environment. This is also described in RFC 8331 and ST 2110-10. Refer to Section 7 of ST 2110-40 for details of how SDP is employed by metadata streams.
Section 7 also mentions that the flow identification semantics described in RFC 8331 should not be used as they are incompatible with ST 2110.
A Format Specific Parameter is also described in Section 7 and must be defined correctly to include the frame rate.
ST 2110-41 - Fast Metadata Framework (FMX)
The Fast Metadata Framework (FMX) describes a more flexible format for transmitting metadata payloads in RTP packets. It is designed to use network resources more efficiently than ST 2110-40, which can potentially consume many IP addresses. These are the design goals:
- Rapid delivery of metadata.
- Time aware and synchronized with the video and audio essence.
- Extensible architecture.
- Payload agnostic and can transport anything.
- Will use ST 2110-10 and ST 2059 to synchronize.
- Upwards compatible and will support ST 291 SDI ANC data.
- Useable with any kind of media stream.
- Can also run independently of any streams.
- Must require no changes to any other ST 2110 specifications.
FMX supports the delivery of time synchronized metadata associated with audio and video data streams or more loosely coupled metadata that does not need to be synchronous.
Like many other ST 2110 standards, this also describes RTP payloads and SDP messaging formats.
Section 5 describes the RTP packet format comprising two parts:
- RTP Header.
- RTP Payload.
The header is described in RFC 3550. ST 2110-41 adds constraints to profile this and render the packets compatible with the rest of ST 2110.
This is much simpler than ST 2110-40 in the way that it describes the format of the payload content. The RTP packet size is still constrained by the requirements set out in ST 2110-10.
The payload is constructed from a collection of Data Item Packages. An empty RTP packet can be delivered without any Data Item Packages for keep-alive purposes.
The Data Item Packages must not be split across multiple RTP packets. They can vary in size, but each must be a multiple of 32 bits long. Any necessary padding is considered to be part of the Data Item Package and would be specified in the application document.
Value | Description |
---|---|
DIT | The Data Item Type is a 22-bit value that describes the contents of the package. |
K-bit | This flag bit manages the segmentation of the payload into objects. See Annex A of ST 2110-41 for details. |
DIL | The Data Item Length describes the number of 32-bit fragments in the payload body. |
32-bit fragments | All of the payload data must be split into 32-bit fragments. |
Padding | The last fragment may contain some padding to ensure the data length is a multiple of 32-bits. |
Payload | Some semantic information may be carried in the payload to describe the data structure and how much padding is included. |
An alternative segmentation scheme for supporting an object-like structure is described in Annex A.
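The Data Item Package layout can be sketched in Python as follows. The 22-bit DIT width comes from the table above. The sketch additionally assumes that the K-bit and a 9-bit DIL share the same 32-bit header word as the DIT, and it pads with zero bytes. Those details and the function names are assumptions for illustration only, so check them against ST 2110-41.

```python
import struct


def build_data_item(dit, body, k_bit=0):
    """Return header + body, with the body zero-padded to a 32-bit boundary."""
    if not 0 <= dit < (1 << 22):
        raise ValueError("DIT must fit in 22 bits")
    padded = body + b"\x00" * (-len(body) % 4)
    dil = len(padded) // 4                       # length in 32-bit fragments
    if dil >= (1 << 9):
        raise ValueError("body too long for the assumed 9-bit DIL")
    header = (dit << 10) | ((k_bit & 1) << 9) | dil
    return struct.pack("!I", header) + padded


def parse_data_items(payload):
    """Split an RTP payload into (dit, k_bit, body) tuples."""
    items, pos = [], 0
    while pos < len(payload):
        (header,) = struct.unpack_from("!I", payload, pos)
        dit, k_bit, dil = header >> 10, (header >> 9) & 1, header & 0x1FF
        start, end = pos + 4, pos + 4 + dil * 4
        if end > len(payload):
            raise ValueError("Data Item Package runs past the end of the payload")
        items.append((dit, k_bit, payload[start:end]))
        pos = end
    return items


payload = build_data_item(0x100, b"abcde") + build_data_item(0x2000A1, b"")
print(parse_data_items(payload))
print(parse_data_items(b""))  # an empty payload, as used for keep-alive
```

Note that the parsed body still includes the padding bytes. As the text above says, how much padding is present has to be signalled by the application-specific payload format.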
Section 6 describes the SDP signaling message format which must conform to RFC 4566 and ST 2110-10. This mandates that the SSN value should describe an ST 2110-41 stream and enumerate any Data Item Types that may appear in the stream. Here is an example of the corresponding SDP attribute line:
a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00
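A receiver could pull the SSN and DIT values out of that line with a short parser like the one below. This is a minimal sketch: the function name is hypothetical, and it assumes the DIT list is written in hexadecimal, which the example values suggest but the line itself does not confirm.

```python
def parse_fmx_fmtp(line):
    """Return (payload_type, ssn, [dit, ...]) from an a=fmtp attribute line."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    payload_type, _, params = line[len(prefix):].partition(" ")

    fields = {}
    for part in params.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key.strip()] = value.strip()

    # Assumption: DIT values are hexadecimal without a prefix.
    dits = [int(v, 16) for v in fields.get("DIT", "").split(",") if v.strip()]
    return int(payload_type), fields.get("SSN"), dits


line = "a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00"
print(parse_fmx_fmtp(line))
```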
The payload formats are described in these separate standards:
Payload | Description |
---|---|
ST 2109 | Format for Non-PCM Audio and Data in AES3 - Audio Metadata. |
ST 2110-42 | Technical metadata. Currently on-hold. |
ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41. |
ST 2110-42 - FMX Payload For 2110 Technical Metadata
The original intent for this standard was to describe an object-based format for carrying the technical metadata about the video/audio streams. This is not intended to replace NMOS IS-04 or the SDP messaging protocols. The metadata payloads would be packaged as described in ST 2110-41. These are some of the items that were proposed:
ST 2110 part | Metadata carried |
---|---|
20 | The values of the FMTP parameters for the stream. |
30 | The ptime value and the number of channels. |
31 | The ptime value and the number of channels. |
40 | The video format tag (VPID byte). |
All | AMWA Sender ID and/or Flow ID. |
Initial work was started in 2020 but the only public mention of ST 2110-42 is in the SMPTE quarterly reports. The most recent information suggests this standard may not be necessary as the requirements will be covered by other standards that are nearing completion. In particular ST 2127-2 is mentioned. As of February 2023, any further progress on ST 2110-42 appears to have halted for the time being.
ST 2127-2 - Mapping MGA Audio Metadata To ST 2110-41
ST 2127 Part 2 describes the mapping of Metadata-Guided Audio (MGA) audio metadata payloads into the Fast Metadata Framework (FMX). MGA is described more fully in ST 2127 Part 1, where it is used with MXF files, and the FMX framework is described in ST 2110-41.
This standard was described in one of the SMPTE quarterly reports in 2022 as a partial replacement for ST 2110-42.
The standard is quite short. It refers to audio essence carried in an ST 2110-30 stream and controlled by ST 2127-2 metadata, which is carried in the ST 2110-41 framework using rules defined in ST 2127-1.
Acquire and read all four documents to understand the entire structure.
About ST 2110-43
The ST 2110-43 standard describes Timed Text Markup Language for Captions and Subtitles. Timed Text is essence media. It is only mentioned here because it shares many physical characteristics with metadata formats for delivery and is grouped with the ST 2110 metadata standards.
Related Standards
Refer to these other standards for supporting material that will assist in understanding and deploying ST 2110-4x metadata content:
Standard | Version | Description |
---|---|---|
ST 125 | 2013 | SDTV Component Video Signal Coding 4:4:4 and 4:2:2 for 13.5 MHz and 18 MHz Systems. |
RP 165 | 1994 | Error Detection Check-words and Status Flags for Use in Bit-Serial Digital Interfaces for Television. |
RP 168 | 2009 | Definition of Vertical Interval Switching Point for Synchronous Video Switching. |
ST 259 | 2008 | SDTV Digital Signal/Data - Serial Digital Interface. |
ST 274 | 2008 | 1920 x 1080 Image Sample Structure, Digital Representation and Digital Timing Reference Sequences for Multiple Picture Rates. |
ST 291-1 | 2011 | Ancillary Data Packet and Space Formatting. |
RP 291-2 | 2013 | Ancillary Data Space Use - 4:2:2 SDTV and HDTV Component Systems and 4:2:2 2048 × 1080 Production Image Formats. |
ST 292-1 | 2018 | 1.5 Gb/s Signal/Data Serial Interface. |
ST 296 | 2012 | 1280 × 720 Progressive Image 4:2:2 and 4:4:4 Sample Structure - Analog and Digital Representation and Analog Interface. |
ST 305 | 2005 | Serial Data Transport Interface. |
ST 352 | 2013 | Payload Identification Codes for Serial Digital Interfaces. |
ST 2059-1 | 2021 | Generation and Alignment of Interface Signals to the SMPTE Epoch. The companion ST 2059-2 defines the SMPTE Profile for Use of IEEE-1588 Precision Time Protocol in Professional Broadcast Applications. |
ST 2109 | 2019 | Format for Non-PCM Audio and Data in AES3 - Audio Metadata. |
ST 2110-10 | 2022 | Professional Media over Managed IP Networks: System Timing and Definitions. |
ST 2110-20 | 2022 | Professional Media Over Managed IP Networks: Uncompressed Active Video. |
ST 2110-21 | 2022 | Professional Media Over Managed IP Networks: Traffic Shaping and Delivery Timing for Video. |
ST 2127-1 | 2022 | Mapping Metadata-Guided Audio (MGA) signals into the MXF Constrained Generic Container. |
ST 2127-2 | 2024 | Mapping MGA Audio Metadata to ST 2110-41. |
RFC 3550 | 2003 | RTP: A Transport Protocol for Real-Time Applications. |
RFC 4566 | 2006 | SDP: Session Description Protocol. |
RFC 5234 | 2008 | Augmented BNF for Syntax Specifications. |
RFC 8285 | 2017 | A General Mechanism for RTP Header Extensions. |
RFC 8331 | 2018 | RTP Payload for Society of Motion Picture and Television Engineers (SMPTE) ST 291 Ancillary Data. |
VSF TR-03 | 2015 | Transport of Uncompressed Elementary Stream Media over IP. |
VSF TR-04 | 2015 | Utilization of ST 2022-6 Media Flows within a VSF TR-03 Environment. |
Conclusion
Metadata management within the ST 2110 architecture is a work in progress. The ST 2110-40 standard is designed around the ancillary data found in an SDI signal.
A more generalized approach is specified in ST 2110-41 but this only describes the framework for carriage of the metadata. ST 2110-42 was intended to describe the message payloads and metadata formats but this work is currently suspended. Some audio metadata is described in ST 2127-2 and ST 2109 but there is currently no corresponding standardized description for video related metadata delivered via FMX.
For now, we must watch and wait.