Standards: SMPTE ST 2110 - ST 2110-4x Metadata Transport

Metadata affects the entire lifecycle of media assets and describes many properties which are all important at different times during the media lifecycle. Sufficient detail is necessary to manage visual effects in post production and workflow processing and the rabbit hole is very deep. We find out how deep it goes.

The ST 2110-4x Metadata Documents

Metadata affects the entire lifecycle of media assets. Imaging and sound recording theory describes many subtle properties for the assets which are all vitally important at different times during the lifecycle. The rabbit hole is very deep. Sufficient detail is necessary to manage visual effects post production and workflow processing effectively.

If you neglect to gather metadata at any stage, forensic reconstruction later on may sometimes be possible. However, if the data is truly lost, it is gone forever.

Before looking at ST 2110-40, it is worth reviewing ST 291, which is the source of the ancillary metadata carried in the ST 2110-40 streams.

The ST 2110 Metadata Standards

SMPTE has already published or is working on several standards covering the use of metadata in an ST 2110 based IP network. 

Standard | Description
ST 291-1 | Ancillary Data Packet and Space Formatting.
ST 2110-40 | Carriage of SMPTE ST 291-1 Ancillary Data extracted from an SDI feed. This was developed early in the ST 2110 architectural design.
ST 2110-41 | Fast Metadata Framework (FMX). A new approach which is more flexible than the earlier standard.
ST 2110-42 | Metadata for 2110-x streams using FMX. This is a work in progress that is on hold at the time of writing.
ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41. This partially replaces the need for ST 2110-42.
ST 2109 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata.
29 | Caption data.

Understanding how ancillary data is embedded in an SDI stream, as described in ST 291, shows how the metadata is transformed for carriage in ST 2110-40 streams.

ST 291 Ancillary Data

There are two parts of the ST 291 standard:

  • ST 291-1 – Ancillary data packet and Space Formatting.
  • RP 291-2 – Ancillary data packet payload formats.

ST 291-1 is the standard for ancillary data but you should also read Recommended Practice RP 291-2 to apply it.

Vertical blanking is organized as lines of TV signals carrying a variety of test and other data. It is located before and after the active-image area. A safe area reserves some lines for special purposes. Vertical blanking does not include the horizontal blanking time which is located at the start of all lines whether they are visible or not.

Any audio found in the Horizontal Ancillary data (HANC) area will be delivered according to Part 30. The remaining ancillary data will be delivered in the Part 40 ANC stream. If there is no ancillary data to send, an empty ‘keep-alive’ heartbeat packet will be transmitted at least once for every field, frame or segment. ANC packets should be transmitted within 1 ms of the data appearing in the SDI feed.

The data arriving from an SDI feed should be identifiable using the identifying codes in ST 352. That can be used to determine the mapping of RP 291-2 data-types for the ANC packets.

Consult IETF RFC 8331 for the RTP payload definition for ST 291 ancillary data.


Note that an errata document has been published for ST 2110-40 (ed 2023) which describes some typographical errors.


ST 291 – Part 1

ST 291-1 is a robust metadata system designed to carry anything that could potentially appear in Vertical ANCillary data (VANC) and HANC data spaces within an incoming SDI feed.

There are two similar kinds of Ancillary Data Packets described: Type 1 & Type 2. Type 1 is more useful when lengthy user data is being transmitted (in excess of 255 words). Type 2 supports a wider variety of message types for delivering controls and signals.

The packets all carry a Data ID (DID) code that allows the receiving client to handle each kind of packet correctly. Type 2 extends this with a Secondary Data ID (SDID) value.

High-definition ANC data formatting is described in recommended practice RP 291-2.


SMPTE often publishes recommended practice documents and sometimes promotes them to full standards when they are ready. The ST and RP documents therefore share a common numbering space.


Data ID & Secondary ID Codes

The Data ID (DID) and Secondary Data ID (SDID) values describe the type of ANC packet being transmitted. The Type 1 packets contain a single DID value while the Type 2 packets have a DID, plus an additional SDID value that allows for a greater range of semantic meanings:

  • Type 1 – DID + Data Block Number (DBN), which acts as a sequence number.
  • Type 2 – DID + SDID.

The DID and SDID formats are set out in ST 291-1 and the range of possible values is maintained in a registry managed by SMPTE at this URL:

https://smpte-ra.org/smpte-ancillary-data-smpte-st-291/


Note that a different URL is described at the top of Page 5 in the ST 291 standard but the target web page for it is missing.


The entire registry can be printed, saved as a PDF or downloaded as an Excel spreadsheet.

Prior to the online registry, this data was published in the SMPTE RP 291 document. This is now out of date and has been withdrawn.

Refer to Section 4 in the ST 291-1 standard for a description of how the registry is operated and how to request new ID values.

Once a value has been registered against a DID or SDID value, the meaning can never be altered, and nor can it be deleted from the registry. It might be deprecated though.

Type 1 ANC Packets

Type 1 packets can carry these payloads in the user data area:

  • Audio Data.
  • Camera position.
  • Error Detection and Handling.
  • Continuous sequences of related packets.
  • At the time of writing a generic Time Labelling format is also under development.

Here is the internal structure for a Type 1 ANC packet:

These are the various parts of the packet:

Value | Description
ADF | The Ancillary Data Flag indicates the start of the packet. It may be exactly one or three words long.
DID | The Data ID word describes what kind of data is being carried as a payload.
DBN | A Data Block Number describes this packet's position within a sequence of packets belonging to a specific DID value. The range of values is 1-255 and the cycle can then be restarted as many times as needed.
DC | A Data Count number describes how many words make up the user data payload.
UDW | The User Data Words payload, comprising up to 255 words in each packet. The format and syntax of this data will be defined in a separate document specific to the application.
CS | A Checksum word which provides basic error detection but not correction.
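The field layout above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation. It assumes an 8-bit payload application in which each DID, DBN, DC and UDW word carries its value in bits b0-b7, even parity in b8 and the inverse of b8 in b9, and a checksum formed from the 9-bit sum of those words with b9 again set to the inverse of b8. The function names are hypothetical, so check the details against ST 291-1 before relying on them.

```python
ADF_COMPONENT = [0x000, 0x3FF, 0x3FF]  # three-word flag for component interfaces


def with_parity(value: int) -> int:
    """Place an 8-bit value in a 10-bit word: b8 = even parity, b9 = NOT b8."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    b8 = bin(value).count("1") & 1
    return ((b8 ^ 1) << 9) | (b8 << 8) | value


def checksum(words: list[int]) -> int:
    """Sum bits b0-b8 of every word modulo 512, then set b9 = NOT b8."""
    total = sum(w & 0x1FF for w in words) & 0x1FF
    b8 = (total >> 8) & 1
    return ((b8 ^ 1) << 9) | total


def build_type1_packet(did: int, dbn: int, payload: bytes) -> list[int]:
    """Return ADF + DID + DBN + DC + UDW + CS as a list of 10-bit words."""
    if len(payload) > 255:
        raise ValueError("UDW is limited to 255 words per packet")
    body = [with_parity(did), with_parity(dbn), with_parity(len(payload))]
    body += [with_parity(b) for b in payload]
    return ADF_COMPONENT + body + [checksum(body)]


packet = build_type1_packet(0x80, 1, b"\x01\x02")
print(" ".join(f"{w:03X}h" for w in packet))
```

Because b9 is always the inverse of b8, every word after the ADF falls between 100h and 2FFh, so none of them can be mistaken for the flag values.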

 


Hexadecimal numeric notation is based on 16 symbols from 0 to F. A Hexadecimal number is suffixed with a small letter ‘h’ to denote this format.


The value of the Ancillary Data Flag is important. It is a pattern that must not appear anywhere else in the stream. Consequently, these hexadecimal word values are never permitted in the payload:

000h, 001h, 002h, 003h

3FCh, 3FDh, 3FEh, 3FFh

The ADF value is composed of three words for a component interface and only one word for a composite interface. The three-word hexadecimal ADF value for component interfaces is:

000h 3FFh 3FFh

The one-word hexadecimal ADF value for composite interfaces is:

3FCh
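As a rough sketch of how a receiver might use these values, the following Python locates component ADF sequences in a list of 10-bit words and flags the protected values. The helper names are hypothetical and the input is assumed to be already de-serialized into integers.

```python
# The eight protected word values that may not appear in the payload.
PROTECTED = set(range(0x000, 0x004)) | set(range(0x3FC, 0x400))
ADF_COMPONENT = (0x000, 0x3FF, 0x3FF)


def is_protected(word: int) -> bool:
    """True for 000h-003h and 3FCh-3FFh."""
    return word in PROTECTED


def find_component_adf(words: list[int]) -> list[int]:
    """Return the index of every 000h 3FFh 3FFh sequence in the word list."""
    return [
        i
        for i in range(len(words) - 2)
        if tuple(words[i:i + 3]) == ADF_COMPONENT
    ]


sample = [0x200, 0x000, 0x3FF, 0x3FF, 0x180, 0x101]
print(find_component_adf(sample))
```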

There are potential data losses when the ANC data is passed through legacy equipment that only supports 8-bits. The 10-bit ADF values are truncated by removing the two least significant bits.

This alters the value of the data word in a different (and more complex) way compared with truncating the most significant bits. Restoring back to a 10-bit value cannot replace the lost data in the two least significant bits. There is always some uncertainty regarding the restored word values.

For example, the value 3FFh becomes FFh in the 8-bit environment. When it is restored by padding with zero-bits, it becomes 3FCh. In fact, all four of the values between 3FCh and 3FFh are restored as 3FCh. Likewise, the values 0 to 3 all become zero in the 8-bit world and are always zero when restored to a 10-bit value.
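The arithmetic can be checked with a short Python sketch. It assumes the legacy path simply drops the two least significant bits and that restoration pads them with zeros, as described above. The function names are illustrative only.

```python
def to_8bit(word10: int) -> int:
    """Drop the two least significant bits of a 10-bit word."""
    return (word10 >> 2) & 0xFF


def to_10bit(word8: int) -> int:
    """Restore a 10-bit word by padding two zero bits on the right."""
    return (word8 & 0xFF) << 2


for original in (0x000, 0x003, 0x3FC, 0x3FF):
    restored = to_10bit(to_8bit(original))
    print(f"{original:03X}h -> {to_8bit(original):02X}h -> {restored:03X}h")
```

Any word whose two lowest bits are non-zero comes back altered, which is why the ADF and protected values are defined as ranges of four rather than as single values.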

The truncation is not applied in a consistent manner across the whole packet which adds to the complexity.

This has implications for the synchronization of packet starts and also the identification of DID and SDID values that determine the meaning of each packet. Read about Legacy Equipment on Page 3, then read Sections 5 and 6 of the ST 291 standard to fully understand the complexities of this.


You need not worry about this if there are no 8-bit legacy systems in your enterprise.


Type 2 ANC Packets

Type 2 packets can carry a wider variety of payloads:

  • Acquisition Metadata Sets for video camera parameters.
  • AFD and Bar Data.
  • Ancillary Time Code.
  • ANSI/SCTE 104 messages.
  • Compressed audio metadata.
  • Data broadcast (DTV).
  • DVB/SCTE VBI data.
  • EIA 608 Data mapping.
  • EIA 708B Data mapping.
  • Extended HDR/WCG for SDI.
  • Film Codes.
  • HD-SDTI transport in active frame space.
  • KLV Metadata transport.
  • Link Encryption Messages & Metadata.
  • Lip Sync data as specified by ST 2064-1.
  • Metadata to monitor errors of audio and video signals on a broadcasting chain.
  • MPEG recoding data.
  • MPEG TS packets.
  • Packing UMID and Program Identification Label Data into SMPTE 291M Ancillary Data Packets.
  • Pan-Scan Data.
  • Payload Identification.
  • Program Description.
  • SDTI transport in active frame space.
  • Stereoscopic 3D Frame Compatible Packing and Signaling.
  • Structure of inter-station control data conveyed by ancillary data packets.
  • Subtitling Distribution packet (SDP).
  • Time Code for High Frame Rate Signals.
  • Transport of ANC packet in an ANC Multi-packet.
  • Two Frame Marker.
  • VBI Data.
  • Vertical Ancillary Data Mapping of KLV Formatted HDR/WCG Metadata.
  • WSS data per RDD 8.

Here is the internal structure for a Type 2 ANC packet:

Type 2 packets replace the Data Block Number (DBN) with a Secondary Data ID (SDID) value. Hence, they cannot be used to construct sequences of related packets unless the sequencing is managed by the payload content.
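A receiver has to decide whether the word after the DID is a DBN or an SDID. The sketch below assumes the ST 291-1 convention that DID values of 80h and above denote Type 1 packets and lower values denote Type 2. The lookup table is a tiny illustrative excerpt written from memory, not a copy of the SMPTE registry, so verify every entry against the registry before use. The function name is hypothetical.

```python
# Illustrative excerpt only: (DID, SDID) -> payload. Verify against the registry.
KNOWN_TYPE2 = {
    (0x41, 0x01): "Payload Identification",
    (0x41, 0x05): "AFD and Bar Data",
    (0x41, 0x07): "ANSI/SCTE 104 messages",
    (0x60, 0x60): "Ancillary Time Code",
    (0x61, 0x01): "EIA 708B Data mapping",
    (0x61, 0x02): "EIA 608 Data mapping",
}


def describe(did_word: int, second_word: int) -> str:
    """Classify a packet from its DID word and the word that follows it."""
    did = did_word & 0xFF        # strip the parity bits b8 and b9
    second = second_word & 0xFF
    if did & 0x80:               # assumed rule: top bit set means Type 1
        return f"Type 1, DID {did:02X}h, DBN {second}"
    name = KNOWN_TYPE2.get((did, second), "unlisted in this excerpt")
    return f"Type 2, DID {did:02X}h, SDID {second:02X}h: {name}"


print(describe(0x161, 0x101))
print(describe(0x180, 0x205))
```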

ST 2110-40 – Carriage Of ST 291-1 Ancillary Data

Part 40 describes how to carry ST 291 compliant ancillary data transmitted during the horizontal and vertical blanking intervals. Ancillary data includes these different elements:

  • Closed captioning.
  • Subtitles.
  • Timecode.
  • Metadata.
  • Slate information.
  • Teletext.
  • Copy prevention signals.
  • Vertical Interval Test Signals.

Ancillary data (ANC) extracted from an SDI stream is delivered in real-time alongside the video and audio essence data. They all travel in separate RTP streams around the IP network. The system architecture is described in ST 2110-10 which covers timing and all the common characteristics of the different streams. The streams are all synchronized to a common reference clock.

The individual packets of ANC data are described in SMPTE standard ST 291 Part 1. The source of this data is the VANC and HANC data spaces within an SDI signal. Those data spaces are described in RP 291 Part 2 for high-definition TV services.

The code that supports this process is quite simple to build:

  • The HANC and VANC ancillary data spaces are de-embedded from the incoming SDI feed.
  • The individual data items are parsed out and stored in local variables.
  • The metadata is formatted into ST 291-1 packets.
  • The ST 291 packets are marshalled into an RTP stream for output.
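The final marshalling step involves packing 10-bit words into octets. The following Python is a minimal sketch of that step only. It assumes the words are packed most significant bit first and zero-padded to a 32-bit boundary, which is how we understand the RFC 8331 payload to be aligned. The function names are hypothetical.

```python
def pack_10bit_words(words: list[int]) -> bytes:
    """Pack 10-bit words MSB-first and zero-pad to a multiple of 32 bits."""
    acc = 0
    nbits = 0
    for word in words:
        if not 0 <= word <= 0x3FF:
            raise ValueError("each word must fit in 10 bits")
        acc = (acc << 10) | word
        nbits += 10
    pad = (-nbits) % 32
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_10bit_words(data: bytes, count: int) -> list[int]:
    """Recover `count` 10-bit words from the start of a packed buffer."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    return [(acc >> (total_bits - 10 * (i + 1))) & 0x3FF for i in range(count)]


words = [0x180, 0x101, 0x102, 0x101, 0x102, 0x186]
packed = pack_10bit_words(words)
print(len(packed), unpack_10bit_words(packed, len(words)) == words)
```

Python's arbitrary-precision integers keep the bit handling short. A production implementation would more likely use a streaming bit writer.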

Mapping ST 291 Packets Into RTP Streams

The mapping of ST 291 ANC data packets into the RTP streams is described in IETF RFC 8331. This is constrained by the requirements set out in ST 2110-10. It is important to obtain all of these standards first and cross-refer while you build, test and debug your IP network infrastructure.

RFC 8331 retains the location within the SDI ANC data spaces where the source metadata was found. It is described in terms of these properties:

  • Line Number.
  • Horizontal Offset.
  • Stream Number.
  • Interlaced Field (where applicable).
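As a sketch of how the location properties travel with each ANC packet, the Python below packs them into a single 32-bit word. It assumes the RFC 8331 field order of a 1-bit C flag, an 11-bit Line_Number, a 12-bit Horizontal_Offset, a 1-bit S flag and a 7-bit StreamNum. The interlaced field indication is carried separately in the payload header and is not shown. The function names are hypothetical, so confirm the layout against the RFC before use.

```python
import struct


def pack_location(line_number: int, horizontal_offset: int,
                  stream_num: int = 0, c: int = 0, s: int = 0) -> bytes:
    """Pack C | Line_Number | Horizontal_Offset | S | StreamNum into 4 octets."""
    if not (0 <= line_number < 1 << 11 and 0 <= horizontal_offset < 1 << 12):
        raise ValueError("line number or horizontal offset out of range")
    if not (0 <= stream_num < 1 << 7 and c in (0, 1) and s in (0, 1)):
        raise ValueError("stream number or flag out of range")
    word = ((c << 31) | (line_number << 20) | (horizontal_offset << 8)
            | (s << 7) | stream_num)
    return struct.pack("!I", word)


def unpack_location(data: bytes) -> dict:
    """Reverse pack_location for the first four octets of `data`."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "c": word >> 31,
        "line_number": (word >> 20) & 0x7FF,
        "horizontal_offset": (word >> 8) & 0xFFF,
        "s": (word >> 7) & 0x1,
        "stream_num": word & 0x7F,
    }


print(unpack_location(pack_location(9, 0)))
```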

A stream of properly constructed RFC 8331 packets will allow the original SDI signal to be reconstructed for output at the destination.

Timestamps are based on the video fields and should be the same for all ANC packets derived from a single interlaced field or progressive frame.

If there is no ANC data to be transmitted, a ‘keep-alive’ packet is delivered as a heartbeat to maintain the connection.

Timing is critical to maintain synchronization. Keeping latency to a minimum is important too. Section 6 of the ST 2110-40 standard explores this in some depth.

The Session Description Protocol (SDP) is widely used within the ST 2110 environment. This is also described in RFC 8331 and ST 2110-10. Refer to Section 7 of ST 2110-40 for details of how SDP is employed by metadata streams.

Section 7 also mentions that the flow identification semantics described in RFC 8331 should not be used as they are incompatible with ST 2110.

A Format Specific Parameter (fmtp) is also described in Section 7 and must be defined correctly to include the frame rate.

ST 2110-41 – Fast Metadata Framework (FMX)

Part 41 specifies how to transport Fast Metadata (FMX) that did not originate from an ST 291 compliant source. Obtaining signaling data from a separate data stream is easier than unpacking video streams.

ST 2110-40 could potentially consume many IP addresses. FMX described in ST 2110-41 is a more flexible format for transmitting metadata payloads in RTP packets. It is designed to use the available network resources more efficiently.

This document is still being worked on but is due to be released very soon. There is some outstanding work required to define the payload formats to be carried by this transport.

The design goals and advantages of FMX over ST 2110-40 + ST 291 are:

  • More easily parsed.
  • Rapid delivery of metadata.
  • Time aware and synchronized with the video and audio essence.
  • Extensible architecture beyond what we currently think we need.
  • Payload agnostic and can transport anything (such as XML, JSON or any other format).
  • Will use ST 2110-10 and ST 2059 to synchronize.
  • Upwards compatible and will support ST 291 SDI ANC data.
  • Useable with any kind of media stream.
  • Delivered independently of an elementary stream.
  • Must require no changes to any other ST 2110 specifications.
  • More efficient use of IP address space.
  • Can be synchronized with the system clock more easily.
  • Uses a Key-Length-Value (KLV) packaging model.
  • Single level of encapsulation.
  • Allows more sophisticated metadata schemas than SDI.
  • Supports larger amounts of metadata than SDI.

FMX Packet Structure

FMX supports the delivery of time-synchronized metadata associated with audio and video data streams or less tightly coupled metadata that does not need to be synchronous.

Like many other ST 2110 standards, this also describes RTP payloads and SDP messaging formats.

Section 5 describes the RTP packet format comprising two parts:

  • RTP Header.
  • RTP Payload.

The header is described in RFC 3550. ST 2110-41 adds constraints to profile this and render the packets compatible with the rest of ST 2110.

This is much simpler than ST 2110-40 in the way that it describes the format of the payload content. The RTP packet size is still constrained by the requirements set out in ST 2110-10.

The payload is constructed from a collection of Data Item Packages. An empty RTP packet can be delivered without any Data Item Packages for keep alive purposes.

The Data Item Packages must not be split across multiple RTP packets. Their size must be a multiple of 32 bits but can vary within that constraint. Any necessary padding is considered to be part of the Data Item Package and would be specified in the application document.

Value | Description
DIT | The Data Item Type is a 22-bit value that describes the contents of the package.
K-bit | This flag bit manages the segmentation of the payload into objects. See Annex A of ST 2110-41 for details.
DIL | The Data Item Length. The number of 32-bit fragments in the payload body.
32-bit Fragments | All of the payload data must be split into 32-bit fragments.
Padding | The last fragment may contain padding to ensure the data length is a multiple of 32 bits.
Payload | Some semantic information may be carried in the payload to describe the data structure and how much padding is included.
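The package structure can be sketched as follows. This is a hedged illustration that makes several assumptions not stated above: that the DIT, K-bit and DIL share a single 32-bit header word in that order, that the DIL is 9 bits wide (22 + 1 + 9 = 32), and that padding is made of zero octets. The function name is hypothetical. Consult Section 5 of ST 2110-41 for the normative layout.

```python
import struct


def build_data_item(dit: int, contents: bytes, k_bit: int = 0) -> bytes:
    """Build one Data Item Package: assumed header word plus padded contents."""
    if not 0 <= dit < 1 << 22:
        raise ValueError("DIT must fit in 22 bits")
    if k_bit not in (0, 1):
        raise ValueError("K-bit must be 0 or 1")
    padded = contents + b"\x00" * (-len(contents) % 4)  # pad to 32-bit fragments
    dil = len(padded) // 4                              # length in 32-bit fragments
    if dil >= 1 << 9:
        raise ValueError("contents too long for the assumed 9-bit DIL")
    header = (dit << 10) | (k_bit << 9) | dil           # assumed field positions
    return struct.pack("!I", header) + padded


package = build_data_item(0x100, b"abcde")
print(package.hex())
```

The application document for each payload would still need to say how a receiver tells real data from padding in the last fragment.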

 

An alternative segmentation scheme for supporting an object-like structure is described in Annex A.

Section 6 describes the SDP signaling message format which must conform to RFC 4566 and ST 2110-10. This mandates an SSN value that identifies an ST 2110-41 stream and a DIT parameter that enumerates any Data Item Types that may appear in the stream. Here is an example SDP attribute line:

a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00
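A receiver could pull the payload type, SSN and Data Item Types out of that line with something like the following Python. It is a minimal sketch that assumes semicolon-separated key=value parameters and treats the DIT values as hexadecimal, which the example suggests but which should be confirmed against Section 6. The function name is hypothetical.

```python
def parse_fmx_fmtp(line: str) -> dict:
    """Parse an 'a=fmtp:' line into payload type, SSN and a list of DIT values."""
    prefix, _, params = line.partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute line")
    fields = {}
    for part in params.split(";"):
        key, _, value = part.strip().partition("=")
        if key:
            fields[key] = value.strip()
    # Assumption: DIT entries are hexadecimal numbers separated by commas.
    dits = [int(v, 16) for v in fields.get("DIT", "").split(",") if v]
    return {
        "payload_type": int(prefix[len("a=fmtp:"):]),
        "ssn": fields.get("SSN"),
        "dit": dits,
    }


info = parse_fmx_fmtp("a=fmtp:117 SSN=ST2110-41:2024; DIT=100,2000A1,1013FC,3FFF00")
print(info["ssn"], [f"{d:06X}" for d in info["dit"]])
```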

The payload formats are described in these separate standards:

Payload | Description
ST 2109 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata.
ST 2110-42 | Technical metadata. On hold at the time of writing.
ST 2127-2 | Mapping Metadata Guided Audio (MGA) to ST 2110-41.

ST 2127-2 – Mapping MGA Audio Metadata To ST 2110-41

ST 2127 Part 2 describes the mapping of Metadata-Guided Audio (MGA) payloads into the Fast Metadata Framework (FMX). MGA is described more fully in ST 2127 Part 1 where it is used with MXF files and the FMX framework is described in ST 2110-41.

This standard was described in one of the SMPTE quarterly reports in 2022 as a partial replacement for ST 2110-42.

The standard is quite short and makes reference to the audio essence being carried by an ST 2110-30 stream controlled by the ST 2127-2 metadata carried in a ST 2110-41 framework using rules defined in ST 2127-1.

Acquire and read all four documents to understand the entire structure.

ST 2110-42 – FMX Payload For 2110 Technical Metadata

The original intent for this standard was to describe an object-based format for carrying the technical metadata about the video/audio streams. This would be delivered using the framework defined by ST 2110-41.

This is not intended to replace NMOS IS-04 or the SDP messaging protocols. The metadata payloads would be packaged as described in ST 2110-41. These are some of the items that were proposed:

ST 2110 Part | Metadata carried
20 | SDP format-specific parameter (fmtp) values for the stream.
30 & 31 | The packetization time (ptime) value and the number of channels.
40 | The video format tag (VPID byte).
All | AMWA Sender ID and/or Flow ID.

 

Initial work was started in 2020 but the only public mention of ST 2110-42 is in the SMPTE quarterly reports. The most recent information suggests this standard may not be necessary as the requirements will be covered by other standards that are nearing completion. In particular ST 2127-2 is mentioned.

As of February 2023, any further progress on ST 2110-42 appears to have halted for the time being indicating that it may now be considered redundant.

Relevant Standards

Refer to these other standards for supporting material that will assist in understanding and deploying ST 2110-4x metadata content:

Document | Vintage | Description
ST 125 | 2013 | SDTV Component Video Signal Coding 4:4:4 and 4:2:2 for 13.5 MHz and 18 MHz Systems.
RP 165 | 1994 | Error Detection Check-words and Status Flags for Use in Bit-Serial Digital Interfaces for Television.
RP 168 | 2009 | Definition of Vertical Interval Switching Point for Synchronous Video Switching.
ST 259 | 2008 | SDTV Digital Signal/Data – Serial Digital Interface.
ST 274 | 2008 | 1920 x 1080 Image Sample Structure, Digital Representation and Digital Timing Reference Sequences for Multiple Picture Rates.
ST 291-1 | 2011 | Ancillary Data Packet and Space Formatting.
RP 291-2 | 2013 | Ancillary Data Space Use – 4:2:2 SDTV and HDTV Component Systems and 4:2:2 2048 × 1080 Production Image Formats.
ST 292-1 | 2018 | 1.5 Gb/s Signal/Data Serial Interface.
ST 296 | 2012 | 1280 × 720 Progressive Image 4:2:2 and 4:4:4 Sample Structure – Analog and Digital Representation and Analog Interface.
ST 305 | 2005 | Serial Data Transport Interface.
ST 352 | 2013 | Payload Identification Codes for Serial Digital Interfaces.
ST 2059-1 | 2021 | SMPTE Profile for Use of IEEE-1588 Precision Time Protocol in Professional Broadcast Applications. This covers the generation and Alignment of Interface Signals to the SMPTE Epoch.
ST 2109 | 2019 | Format for Non-PCM Audio and Data in AES3 – Audio Metadata.
ST 2110-10 | 2022 | Professional Media over Managed IP Networks – System Timing and Definitions.
ST 2110-20 | 2022 | Professional Media Over Managed IP Networks – Uncompressed Active Video.
ST 2110-21 | 2022 | Professional Media Over Managed IP Networks – Traffic Shaping and Delivery Timing for Video.
ST 2127-1 | 2022 | Mapping Metadata-Guided Audio (MGA) signals into the MXF Constrained Generic Container.
ST 2127-2 | 2024 | Mapping MGA Audio Metadata to ST 2110-41.
RFC 3550 | 2003 | RTP: A Transport Protocol for Real-Time Applications.
RFC 4566 | 2006 | SDP: Session Description Protocol.
RFC 5234 | 2008 | Augmented BNF for Syntax Specifications.
RFC 8285 | 2017 | A General Mechanism for RTP Header Extensions.
RFC 8331 | 2018 | RTP Payload for SMPTE ST 291 Ancillary Data.
VSF TR-03 | 2015 | Transport of Uncompressed Elementary Stream Media over IP.
VSF TR-04 | 2015 | Utilization of ST 2022-6 Media Flows within a VSF TR-03 Environment.

Using Metadata In ST 2110

Metadata management within the ST 2110 architecture is a work in progress. The ST 2110-40 standard is designed around the ancillary data found in an SDI signal.

A more generalized approach is specified in ST 2110-41 but this only describes the framework for carriage of the metadata.

ST 2110-42 was intended to describe the message payloads and metadata formats but this work is currently suspended.

Some audio metadata is described in ST 2127-2 and ST 2109 but there is currently no corresponding standardized description for video-related metadata delivered via FMX.

For now, we must watch and wait.

Metadata keeps track of your products in the archives so they can be repurposed and monetized later.
