Standards: Video - Standards For Video Coding
From 4K to 32K, the demand for ever-larger video formats is pushing codec technology to its limits. This guide surveys the landscape of video coding standards – from legacy MPEG formats to AI-driven neural network compression – to help navigate the choices shaping the future of video delivery.
The Fundamental Challenge
Video content is getting larger with image sizes that are routinely 4K, and often 8K. We now have 16K on the horizon and 32K camera development is underway. Couple that with very high frame rates and HDR imaging and you have the recipe for a huge challenge.
The task is to deliver the multi-gigabyte files across an increasingly congested Internet for consumers to enjoy.
Codec designers have risen to these demands admirably and the newest codecs are astonishingly good at compressing video content. And we are not done yet with finding improvements; new techniques based on Neural Networks are on the way and we are on the threshold of another revolution in coding efficiency.
Video Is Becoming More Pervasive
Video went mobile with the introduction of the iPhone. After that all mobile phones followed the same fundamental design. Although they were actually developed prior to the iPhone, iPads ushered in the era of tablets which provide a larger format for the video viewing experience, and these are used everywhere. According to the statistics, tablets are the most popular platform for mobile video viewing.
Tablet displays are now commonplace in vehicles. Cameras are replacing mirrors for reversing and the low-latency transport of video over short distances becomes more important as autonomous vehicles require computer vision and Lidar to be employed to model the surrounding environment.
Digital signage and inexpensive large flat screens facilitate the ubiquitous deployment of video advertising. We see examples in shops, on public transport, on entertainment systems in cars, and many other places. Personalized advertising as portrayed in films like “Minority Report” is technically feasible now. Facial recognition rather than retina scans is already in widespread use.
Examples of current and future applications are:
- TV broadcasting.
- Removable media (DVD & Blu-ray).
- Streaming services (Netflix, Disney+, AppleTV, Prime etc.).
- Digital signage.
- Digital advertising.
- Automotive and self-driving solutions.
- Autonomous vehicles.
- Long distance drone delivery systems.
- Augmented reality.
- Virtual reality (Metaverse).
- Security monitoring.
- Machine vision.
- Toys & gaming.
A Smorgasbord Of Choices
Historically, international standards for video coding have been driven mainly by MPEG. Proprietary standards are popular with streaming services and the organizations that sell technology to them. Open-source codecs are gaining traction because they are almost exclusively patent free and cost effective to deploy.
There are a variety of ways to group codecs into different categories. We could arrange them according to coding efficiency or the lossy nature of the resulting output. Perhaps for commercial reasons they can be divided into licensed or free codecs. The most important criteria are based on how they are going to be used:
- Lossless coding.
- Production codecs.
- Codecs suited to streaming deployment.
- Broadcasting codecs.
- Mobile friendly codecs.
- Archival quality codecs.
- Legacy and deprecated codecs and how to extract content from them.
Deployment codecs stream video to end users via players or TV apps. Codecs for production workflows tend to be proprietary and based on the camera output or the editing and visual effects pipeline tools:
- Production use demands lossless behavior.
- Delivery to consumers is bandwidth limited and sacrifices quality for lower bitrates.
The H-Series Codecs
The H series codecs begin life as research prototypes and become international standards once their development is complete. The series is managed by ITU-T and shared with the ISO/MPEG standards developers via a joint video coding team. That illustrious group of specialists has had several names over the years. It is currently known as the Joint Video Experts Team (JVET).
These are the ITU-T H series video coding standards names with corresponding MPEG and ISO references and their familiar names:
| H Number | MPEG | Standard | Name |
|---|---|---|---|
| H.120 | - | - | Video Conferencing |
| H.261 | MPEG-1 | ISO 11172 | MPG |
| H.262 | MPEG-2 | ISO 13818 | MPEG |
| H.263 | MPEG-4 part 2 | ISO 14496-2 | MP4 Visual |
| H.264 | MPEG-4 part 10 | ISO 14496-10 | AVC |
| H.265 | MPEG-H part 2 | ISO 23008-2 | HEVC |
| H.266 | MPEG-I part 3 | ISO 23090-3 | VVC |
| H.267 | Not Yet Defined | Under Development | ECM |
ISO MPEG Video Codec Evolution
The ISO MPEG codecs have evolved from a common ancestor whose provenance can be traced back at least as far as the H.120 video conferencing standard, which CCITT first published in 1984 and revised in 1988.
There are a variety of abbreviated names for ISO codecs. Some of these have fallen into disuse as they have been out-performed by later alternatives. The more advanced coding designs are targeting VR/AR content:
| Name | Description |
|---|---|
| M-JPEG | Motion-JPEG. |
| JPEG XS | Lightweight low latency video coding. |
| AVC | Advanced Video Coding. |
| MVC | Multiview Video Coding. |
| SVC | Scalable Video Coding. |
| RVC | Reconfigurable Video Coding. |
| HEVC | High Efficiency Video Coding. |
| VVC | Versatile Video Coding. |
| VCB | Video Coding for Browsers. |
| IVC | Internet Video Coding. |
| GVC | General Video Coding. |
| EVC | Essential Video Coding. |
| LCEVC | Low Complexity Enhancement Video Coding. |
| ECM | Enhanced Compression Model Coding. |
| V-PCC | Video-based Point Cloud Compression. |
| G-PCC | Geometry-based Point Cloud Compression. |
| MIV | MPEG Immersive Video. |
Looking forwards, the new EVC codec is a recent initiative based on a patent-free toolkit.
The LCEVC codec is designed to extend any underlying codec. The enhancement layers improve the resulting decompressed picture quality.
The VP9 and AV1 codecs are developed independently by open-source groups or corporates. They can also be enhanced with LCEVC.
The H.264/AVC codec has been very successful. In 2004, it was uncertain whether H.264/AVC or VC-1 would become dominant. VC-1 was based on a popular Microsoft Windows Media format offered to SMPTE for ratification. Eventually, H.264 did become the codec of choice for a wide variety of applications. AVC support is embedded in most player devices currently shipping. The successors (SVC, HEVC & LCEVC) must offer significant advantages over AVC to gain similar traction.
There are variants of the AVC codec to support stereographic 3D viewing, although this has fallen out of favor now. Consult the online resources for more details about MVC, MVCD and 3D-AVC should you need them.
On the horizon is the new ECM video codec which is currently being designed. Looking forward, there are many opportunities for new standards to be created where there are none available at present.
Interoperability Challenges
Although everything can be built on standards, that does not always guarantee interoperable systems. The standards can be interpreted differently in many ways. Standards developers employ several approaches to solve this:
- Profiles – Define a set of constraints that limit the available features of a coding architecture. Profiles could be used to constrain the coded output so it does not require the use of patented tools. Another profile might be optimized for speed or bandwidth reduction.
- Tiers – Introduced by HEVC to simplify the use of profiles.
- Levels – Define limits for the size of the image rectangle or pixel depth and other metrics.
Profiles constrain feature sets and levels constrain the range of data values used by the codec.
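As an illustration of how a level caps picture size and throughput, here is a minimal Python sketch that picks the lowest AVC level able to carry a given format. The table is a small subset of the AVC level limits quoted from memory, and the names `AVC_LEVEL_LIMITS` and `lowest_level` are invented for this example. Treat the numbers as assumptions and check them against the published edition of the standard before relying on them.

```python
import math

# Subset of AVC level limits: (max macroblocks per frame, max macroblocks per second).
# ASSUMPTION: values recalled from the AVC level tables; verify against the standard.
AVC_LEVEL_LIMITS = {
    "3.1": (3600, 108000),
    "4": (8192, 245760),
    "4.2": (8704, 522240),
    "5.1": (36864, 983040),
}


def macroblocks_per_frame(width, height, mb_size=16):
    """Count the 16x16 macroblocks needed to cover the picture."""
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)


def lowest_level(width, height, fps):
    """Return the first listed level whose limits the format fits, or None."""
    frame_mbs = macroblocks_per_frame(width, height)
    for level, (max_frame_mbs, max_mbs_per_sec) in AVC_LEVEL_LIMITS.items():
        if frame_mbs <= max_frame_mbs and frame_mbs * fps <= max_mbs_per_sec:
            return level
    return None


for fmt in [(1280, 720, 30), (1920, 1080, 30), (1920, 1080, 60), (3840, 2160, 30)]:
    print(fmt, "->", lowest_level(*fmt))
```

A real decoder capability check also considers the profile, bitrate and buffer limits, which this sketch leaves out.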
Popular ISO/MPEG Video Codecs
Here are the ISO/MPEG standards currently in popular use for video coding technologies. MPEG provides useful insights online and uses the name as a key. ISO has a web page for each standard which links to each published edition. Make sure you are using the latest edition unless your project mandates a specific earlier one.
| Name | Standard-Part | Description |
|---|---|---|
| MPEG-1 | ISO 11172-2 | The original MPEG standard is most likely only found in media archive collections. Video CD disks were released when MPEG-1 was popular. Old MPG files will contain MPEG-1 coded video content. This is likely to be VHS quality at best. |
| MPEG-2 | ISO 13818-2 | Facilitated the switch from Analog to standard definition Digital TV broadcasting and DVD disks. |
| MPEG-4 Visual | ISO 14496-2 | The part 2 (Visual) codec was an improvement over MPEG-2. It was very quickly overtaken by part 10 (AVC) coding. Examples coded in this format will be very rare. |
| AVC | ISO 14496-10 | Widely adopted for high-definition as a successor to MPEG-2. It can potentially work up to 8K resolution. |
| HEVC | ISO 23008-2 | Intended to replace AVC for very high-definition video. It is well supported in web-browsers. |
The MPEG Taxonomy
The later MPEG standards gradually replace or enhance their earlier counterparts. The MPEG-2 standard improves MPEG-1 and MPEG-4 adds further enhancements. After MPEG-4, the major component parts are broken out to separate standards which themselves have many parts.
These are the major video coding related ISO standards:
| Document | Version | Description |
|---|---|---|
| ISO 11172 | MPEG-1 | Coding of Moving Pictures and Associated Audio. |
| ISO 13818 | MPEG-2 | Generic Coding of Moving Pictures and Associated Audio Information. |
| ISO 14496 | MPEG-4 | Coding of Audio-Visual Objects. |
| ISO 15938 | MPEG-7 | Multimedia Content Description Interface. |
| ISO 21000 | MPEG-21 | Multimedia Framework. |
| ISO 23000 | MPEG-A | Multimedia application format. |
| ISO 23001 | MPEG-B | MPEG systems technologies. |
| ISO 23002 | MPEG-C | MPEG video technologies. |
| ISO 23003 | MPEG-D | MPEG audio technologies. |
| ISO 23004 | MPEG-E | Multimedia Middleware. |
| ISO 23005 | MPEG-V | Media context and control. |
| ISO 23006 | MPEG-M | Extensible Middleware (MXM). |
| ISO 23007 | MPEG-U | Rich media user interfaces. |
| ISO 23008 | MPEG-H | High Efficiency Coding and Media Delivery in Heterogeneous Environments. |
| ISO 23009 | MPEG-DASH | Dynamic adaptive streaming over HTTP. |
| ISO 23090 | MPEG-I | Coded representation of immersive media. |
| ISO 23091 | MPEG-CICP | Coding-independent code points. |
| ISO 23094 | MPEG-5 | General Video Coding (GVC) EVC & LCEVC. |
Current & Future Deployment Codecs
Historically, most codec specifications were described by ISO standards based on the MPEG working groups' research. Now they have competition from other developers. There have always been proprietary standards used in closed streaming services where the provider controls the choice of player operated by the end-user. Open-source patent-free codecs are rapidly gaining traction and challenge the status quo because they are commercially attractive:
| Codec | Observations |
|---|---|
| MPEG-2 | Still used for DVDs and Digital TV broadcasting and continues to be important for SD delivery. |
| AVC | Widely used by streaming services for HD broadcasting. |
| HEVC | The licensing model is being challenged by AV1. |
| VP series | The VP8 and VP9 codecs thrive within the Google ecosystem but are challenged by AV1. |
| AV series | The AV1 codec is seen as a very strong contender against the VP and HEVC codecs. |
| VVC | Seen as a potential successor to HEVC and may displace it. |
| AV1 + LCEVC | Using the LCEVC enhancement layer on top of AV1 could become dominant. |
| EVC | This codec is attractive because it is entirely patent free when the Baseline profile is used. |
| MPAI-EVC | Artificial Intelligence (AI) enhanced EVC coding may deliver very good compression ratios. |
| ECM | Enhanced Compression Model Coding |
Looking Forward To New ISO Codecs
New and interesting video coding concepts are emerging as we look forward to VR and other future applications. Projecting a moving video image into a 3-Dimensional space is already happening. To succeed commercially it needs to be delivered consistently and portable to many different headsets and players. There are also some interesting developments that allow existing codecs to be enhanced to improve viewing quality:
| Name | ISO Standard | Description |
|---|---|---|
| M-JPEG | ISO 15444-3 | Motion JPEG 2000. |
| SVC | ISO 14496-10-F, ISO 14496-10-G | Scalable Video Coding which enhances AVC compression and could be applied to other codecs as well in a WebRTC scenario. See MPEG-4 part 10. |
| RVC | ISO 23001-1, ISO 23001-4, ISO 23002-4 | Reconfigurable Video Coding uses interchangeable component tools to construct a specific encoder. See MPEG-B part 4 and MPEG-C part 4. |
| MPEG-B | ISO 23001 | Systems technologies (many parts). |
| Derived Visual Tracks | ISO 23001-16 | Non-destructively applies transforms and visual effects such as blends and dissolves to source video material. See MPEG-B part 16. |
| MPEG-C | ISO 23002 | Media Tool Library. |
| MVC | ISO 23002-3 | Multiview coding. |
| VVC | ISO 23090-3 | Versatile Video Coding is designed to halve the output size of HEVC compressed 8K video. Also known as FVC (Future Video Coding). See MPEG-I part 3 and H.266. |
| V-PCC | ISO 23090-5 | Video-based Point Cloud Compression for describing moving video in 3D scenes using volumetric points. See MPEG-I part 5. |
| G-PCC | ISO 23090-9 | Geometry-based Point Cloud Compression for describing moving video in 3D scenes using a triangulated mesh. See MPEG-I part 9. |
| MIV | ISO 23090-12 | MPEG Immersive Video for creating Virtual Reality scenes rendered from Visual Point Clouds V-PCC. See MPEG-I part 12. |
| GVC | ISO 23094 | General video coding. |
| EVC | ISO 23094-1 | Essential Video Coding offers a royalty free Baseline profile and a more sophisticated Main profile with licensable tools that can be switched in and out as needed. See MPEG-5 part 1. |
| LCEVC | ISO 23094-2 | Low Complexity Enhancement Video Coding improves any underlying existing codec to fix compression artefacts. See MPEG-5 part 2. |
| MPAI-EVC | Not Yet | Replaces individual coding tools in an EVC codec with AI empowered counterparts. |
| ECM | Not Yet | Based on the H.267 research project. Provides very fast, low-latency coding advancements for gaming and automotive solutions. |
Which Is The Best Codec?
The finished size of the media content directly affects the cost of transmission. Broadcasters want to fit more channels into a multiplex and streaming services want to deliver more on-demand content within their available bandwidth. This drives the adoption of more efficient codecs.
The optimum choice of codec is most likely driven by the cost of licensing and the level of support already deployed in the target client viewing devices. There are many alternative proprietary and open-source high-performance codecs to choose from but they lack market penetration or deployed player support.
If ISO standards are mandated, then AVC is optimal for standard-definition. HEVC would perform well on higher definition and VVC will outperform both. All three require royalty payments.
License free open-source codecs that are widely deployed in web-browsers are becoming dominant. VP9 is also very popular. Support is good on Windows and in the Chrome browser.
Safari is the browser built in to all the Apple iPhone and iPad products; it has partial support and it is improving. The core WebKit support is mandated by Apple even when competing browser apps are developed on these platforms. That might change, which would allow third-party render engines that support other codecs to be deployed.
Apple Silicon chips have built-in hardware support for handling AV1 content. This will likely migrate into the A-Series chips in iPads and iPhones as well. The leverage from this implementation virtually guarantees its dominance in the market. If a royalty-free solution is required, then AV1 appears to be the strongest contender.
Image Sizes & Resolutions
Very early streamed video resolution was microscopic compared with modern 8K displays. The progress in creating larger display formats has been remarkable. It was not long ago that NHK was demonstrating 8K TV projected onto a large theatrical screen to an astonished audience.
TV image sizes are characterized as follows:
- SDTV – Standard Definition TV.
- EDTV – Enhanced Definition TV.
- HDTV – High Definition TV.
- UHDTV – Ultra High Definition TV.
These are the common digital format image sizes you will most likely encounter when processing content.
| Class | W | H | Scan Type | Description |
|---|---|---|---|---|
| SDTV | 640 | 480 | Interlaced | Analog NTSC compatible as used in the USA. |
| SDTV | 720 | 576 | Interlaced | Analog PAL/SECAM compatible as used in Europe. |
| EDTV | 720 | 480 | Progressive | USA region 1 DVD format. |
| EDTV | 720 | 576 | Progressive | European region 2 DVD format. |
| HDTV | 1280 | 720 | Progressive | Medium definition. |
| HDTV | 1920 | 1080 | Interlaced | Compromise format to save bandwidth. |
| HDTV | 1920 | 1080 | Progressive | Full HD (Blu-ray). |
| UHDTV | 3840 | 2160 | Progressive | 4K UHD format. |
| UHDTV | 7680 | 4320 | Progressive | 8K UHD format. |
HEVC is already capable of encoding 16K video and the advanced research labs are working on cameras that can deliver 32K video formats.
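To put these image sizes in perspective, the following Python sketch estimates the uncompressed data rate a codec has to squeeze down. The 60 fps frame rate, 10-bit depth and 4:2:0 sampling (1.5 samples per pixel) are illustrative assumptions, not values taken from any particular service.

```python
def raw_bitrate_bps(width, height, fps, bit_depth=10, samples_per_pixel=1.5):
    """Uncompressed video data rate in bits per second.

    samples_per_pixel is 3.0 for 4:4:4, 2.0 for 4:2:2 and 1.5 for 4:2:0.
    """
    return width * height * samples_per_pixel * bit_depth * fps


FORMATS = {
    "HD 1080p": (1920, 1080),
    "4K UHD": (3840, 2160),
    "8K UHD": (7680, 4320),
}

for name, (w, h) in FORMATS.items():
    gbps = raw_bitrate_bps(w, h, fps=60) / 1e9
    print(f"{name}: {gbps:.2f} Gbit/s uncompressed")
```

Each doubling of width and height quadruples the raw data rate, which is why coding efficiency has to keep improving as formats grow.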
Frame Rates
Anything less than about 15 frames per second (fps) will be perceived by a human viewer as individual frames. Early movie films used 18 fps as a minimum rate.
Conventional movie film is shot at 24 fps but projected at 48 fps with each frame being flashed on the screen twice.
Movie frame rates are close to the UK and European TV frame rates which are 25 fps, but in the USA the TV presents images at 29.97 fps. This odd value is chosen for obscure technical reasons and is often referred to as 30 fps.
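The "29.97" figure is a rounded form of the exact ratio 30000/1001. A short Python sketch using exact fractions shows why treating it as 30 fps causes drift over time. The constant and function names are invented for this illustration.

```python
from fractions import Fraction

FILM_RATE = Fraction(24)            # conventional movie film
PAL_RATE = Fraction(25)             # UK and European TV
NTSC_RATE = Fraction(30000, 1001)   # the exact value behind "29.97 fps"


def frames_in(seconds, rate):
    """Exact number of frames presented in the given duration."""
    return rate * seconds


# Each film frame is flashed twice by the projector.
projector_flash_rate = FILM_RATE * 2

# Frames "missing" after one hour if 29.97 fps material is counted as 30 fps.
hourly_shortfall = frames_in(3600, Fraction(30)) - frames_in(3600, NTSC_RATE)

print("NTSC rate:", float(NTSC_RATE))
print("Projector flash rate:", projector_flash_rate)
print("Shortfall per hour:", float(hourly_shortfall), "frames")
```

Keeping rates as rational numbers rather than floats avoids accumulating rounding errors when converting between timecodes and frame counts.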
The introduction of digital TV services freed us from any constraints and higher frame rates became practical. These make motion appear more natural. Sport content looks much better at 100 or 120 fps as there is less motion blur.
The downside of higher frame rates is that there are more frames to compress, but on the upside there is a greater likelihood of there being compression macro-blocks that are similar or identical across more frames, which evens out the downside somewhat by improving compression efficiency.
Chroma Sub-sampling
Human visual perception is extremely good at resolving detail conveyed as luminance. Coloring is less important because the eye compensates for the effects of ambient illumination to adjust the relative colors. We can cope very well with about a third of the resolution in the color domain vs. luminance.
Video transmission and compression technology exploits this by carrying less information about the color. Here is a much-simplified description of how it works:
TV signals are converted from their RGB values into these components by mathematically operating on the color intensity values for each pixel:
- Luminance, Y (a weighted sum of R, G and B).
- Blue difference (Blue minus luminance, B−Y).
- Red difference (Red minus luminance, R−Y).
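The conversion into a brightness signal and two color difference signals can be sketched in a few lines of Python. This example assumes the ITU-R BT.709 luma weights and normalized, gamma-encoded R'G'B' inputs in the range 0 to 1. The function name `rgb_to_ycbcr` is illustrative, and a real pipeline would also scale and offset the results into integer code values.

```python
# BT.709 luma weights for red and blue; green takes the remainder.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """Convert normalized R'G'B' (0..1) to Y' (0..1) and Cb, Cr (-0.5..0.5)."""
    y = KR * r + KG * g + KB * b          # weighted sum of the three primaries
    cb = (b - y) / (2.0 * (1.0 - KB))     # blue minus luma, scaled to +/-0.5
    cr = (r - y) / (2.0 * (1.0 - KR))     # red minus luma, scaled to +/-0.5
    return y, cb, cr


print("white:", rgb_to_ycbcr(1.0, 1.0, 1.0))
print("blue: ", rgb_to_ycbcr(0.0, 0.0, 1.0))
print("red:  ", rgb_to_ycbcr(1.0, 0.0, 0.0))
```

Neutral tones (equal R, G and B) produce zero in both difference channels, which is why grayscale content costs almost nothing to carry in the color components.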
This notation describes the proportion of information delivered for each component:
| Sampling | Description |
|---|---|
| 4:4:4 | Full resolution for all components. Used in high-end film scanners and movie post production. |
| 4:2:2 | Luminance at full resolution, both color difference channels at half resolution. Used by many professional video cameras and interfaces. |
| 4:1:1 | Luminance at full resolution, both color difference channels at quarter resolution. Pro-consumer cameras and the NTSC TV system use this technique. |
| 4:2:0 | Luminance at full resolution, difference channels at half resolution alternating on each line. Used by many consumer formats, digital TV, the PAL system, DVDs, Blu-ray disks and JPEG images. |
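The saving from each sampling scheme can be estimated by counting samples per frame. The sketch below is a simplified Python model that treats each scheme as a horizontal and vertical divisor applied to both color difference planes. The names `SUBSAMPLING` and `samples_per_frame` are made up for this example.

```python
# (horizontal divisor, vertical divisor) applied to each color difference plane.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}


def samples_per_frame(width, height, scheme):
    """Total luminance plus color difference samples in one frame."""
    h_div, v_div = SUBSAMPLING[scheme]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)   # two difference planes
    return luma + chroma


def relative_size(scheme, width=1920, height=1080):
    """Frame size as a fraction of the same frame in 4:4:4."""
    return samples_per_frame(width, height, scheme) / samples_per_frame(width, height, "4:4:4")


for scheme in SUBSAMPLING:
    print(f"{scheme}: {relative_size(scheme):.3f} of the 4:4:4 sample count")
```

Both 4:1:1 and 4:2:0 halve the raw data before any compression is applied; they differ only in whether the color detail is discarded horizontally or in both directions.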
Colorimetry Terminology
Several new terms are introduced when contemplating colorimetry:
| Term | Description |
|---|---|
| Color Primaries | The primary color components. |
| Color Space | The range of colors that could be realized by the color primaries. |
| White-Point | The center of the color space where the components are equal and produce a pure white tone. |
| Gamut | The limited range of colors supported by a model within the color space. |
| Color Temperature | Originally defined by the operating temperature of an incandescent lightbulb that is used to specify the white point. |
Colorimetry Specifications
One of the important differences is the way colorimetry is specified. The range of colors is constrained in traditional broadcasting because illegal colors cannot be reliably transported through the systems. This is due to legacy limitations existing long after analog video has been deprecated. Streaming services are not limited because they never used analog video.
This is a complex subject. Human visual perception naturally compensates for color shifts, different dynamic ranges and brightness levels. The colorimetry standards describe transfer functions that map the input brightness levels to the output. When the output color gamut cannot include all the colors in the input, the transfer function ensures the image is perceived correctly under all viewing conditions.
Video color legalization in traditional broadcasting limits the dynamic range of pixel color values. Streaming services are not affected by this and can accommodate the full range of values.
Basic transfer functions are implemented with Gamma correction shown here with positive and negative curves alongside the unmodified linear transform:
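For readers who want to reproduce these curves, a basic gamma transform is a simple power law. The Python sketch below assumes a display gamma of 2.4 (the value used by the ITU-R BT.1886 reference display) and ignores the linear toe segment that practical camera curves add near black.

```python
def gamma_encode(linear, gamma=2.4):
    """Map linear light (0..1) to a signal value; lifts mid-tones above linear."""
    return linear ** (1.0 / gamma)


def gamma_decode(signal, gamma=2.4):
    """Map a signal value (0..1) back to linear light; pulls mid-tones below linear."""
    return signal ** gamma


for level in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"linear {level:.2f} -> encoded {gamma_encode(level):.3f}, "
          f"decoded {gamma_decode(level):.3f}")
```

Encoding and decoding are exact inverses, so a signal passed through both arrives unchanged apart from any quantization applied in between.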
Dolby Laboratories developed a more sophisticated technique for displaying High Dynamic Range (HDR) content. This is called the Perceptual Quantizer (PQ). It greatly enhances the detail in the darker regions of the image. This is a simplified example with the positive Gamma curve to show the difference:
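The PQ curve can be sketched in Python from the constants published in SMPTE ST 2084 and ITU-R BT.2100. The function names are illustrative, and the code works on a single luminance value in nits (cd/m²) rather than full RGB pixels.

```python
# PQ constants as published in ST 2084 / BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0   # PQ signal value 1.0 represents 10,000 nits


def pq_encode(nits):
    """Absolute luminance in nits -> PQ signal value (0..1)."""
    y = max(nits, 0.0) / PEAK_NITS
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2


def pq_decode(signal):
    """PQ signal value (0..1) -> absolute luminance in nits."""
    n = signal ** (1.0 / M2)
    numerator = max(n - C1, 0.0)
    return PEAK_NITS * (numerator / (C2 - C3 * n)) ** (1.0 / M1)


for nits in (0.1, 1, 10, 100, 1000, 10000):
    print(f"{nits:>7} nits -> PQ signal {pq_encode(nits):.4f}")
```

Roughly half of the signal range is spent on luminance below 100 nits, which is where the extra shadow detail comes from.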
The BBC collaborated with NHK on an alternative approach called Hybrid Log Gamma (HLG). This applies gamma correction to the lower range of values and logarithmic correction to the upper. It doesn’t increase detail in the dark region as much as PQ. The mid-range values are increased more than the basic Gamma transform but less than PQ. The positive Gamma curve is included for comparison:
The join between the two curves used in HLG must be implemented carefully to avoid discontinuities that would introduce contouring (posterization) artefacts in the output images. Contouring is discussed in the context of High Dynamic Range imaging with an example of the effect of reducing the number of bits per pixel.
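The two-part HLG curve and its join can be checked numerically. This Python sketch uses the OETF constants from ITU-R BT.2100, with scene light normalized to the range 0 to 1. The variable names are chosen for this example.

```python
import math

# HLG OETF constants from BT.2100.
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)
JOIN = 1.0 / 12.0   # scene-light value where the two segments meet


def hlg_oetf(e):
    """Normalized scene light (0..1) -> HLG signal value (0..1)."""
    if e <= JOIN:
        return math.sqrt(3.0 * e)             # square-root (gamma-like) segment
    return A * math.log(12.0 * e - B) + C     # logarithmic segment


# Evaluate both segments at the join to confirm they meet at the same value.
lower_at_join = math.sqrt(3.0 * JOIN)
upper_at_join = A * math.log(12.0 * JOIN - B) + C
print("lower segment at join:", lower_at_join)
print("upper segment at join:", upper_at_join)
print("signal at peak scene light:", hlg_oetf(1.0))
```

The constants are chosen so that both the value and the slope of the two segments match at the join, which is what keeps the transition free of visible steps.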
Refer to EBU TR 038 which describes a subjective evaluation of the HLG transfer function.
Relevant Colorimetry Standards
These colorimetry standards define the transfer function behavior:
| Document Standard | Vintage | Description |
|---|---|---|
| CIE D65 | 1967 | Specification of the white-point for a color temperature of 6504 K. Described in ITU-R BT.709 where the gamut is limited for use in TV systems. |
| Dolby Vision CM 2.9 | 2018* | Dynamic Mastering Metadata for Color Volume Transformation. |
| Dolby Vision CM 4.0 | 2023* | Recommended and improved Dynamic Mastering Metadata for Color Volume Transformation with enhanced algorithms and a more complex tone-curve that supports additional post-processing options. This is a superset of the version 2.9 specification. |
| EBU TR 038 | 2017 | Subjective Evaluation of Hybrid Log Gamma (HLG). |
| ITU-R BT.709 | 2015 | Parameter values for the HDTV standards for production and international program exchange. |
| ITU-R BT.1886 | 2011 | Reference electro-optical transfer function for flat panel displays used in HDTV studio production. |
| ITU-R BT.2020 | 2015 | Parameter values for Ultra-High Definition television systems for production and international program exchange. |
* The version dates for the Dolby Vision metadata specs are approximate and based on available documentation.
The Dolby Vision work on colorimetry has facilitated the wide adoption of High Dynamic Range (HDR) content which enhances the Ultra High Definition (UHD) viewing experience to a very high standard. There are open and non-proprietary alternatives if you need them.
High Dynamic Range (HDR)
The range of colors that can be rendered on a display surface is called the color gamut. Different displays exhibit varying ranges of color. Creatives working on color grading workstations have their screens calibrated carefully and their ambient working conditions managed so they reproduce the same colors consistently. The HDR specifications include descriptions of the color sub-sampling configurations that are necessary to resolve the range of colors. These are the factors that affect the color rendition:
- The type of display.
- The physical dynamic range of each component color.
- The linearity of the component colors (gamma curves).
- Ambient lighting conditions.
- Calibration of transfer functions.
Display types might be any of these:
- CRT – Traditional glass screens, almost completely eliminated.
- Plasma – Early flat screens.
- LCD – Backlit color filtered.
- OLED – Individual LEDs for each component of the pixels.
The brightness of a display is measured in nits (candelas per square meter). It is impossible to actually make black any darker. Because the human eye adjusts to different dynamic ranges, if the white is brighter, the blacks appear to be darker. HDR therefore requires higher nit levels than SDR.
Here is a comparison between standard and high dynamic range brightness levels:
The colors are more vibrant with HDR and you can see more detail in the dark areas of the picture.
The difference between the color gamuts for standard RGB and High Dynamic Range is obvious when using the CIE 1931 color space diagram.
High dynamic range enhancement uses a technique called tone mapping to extend the range of realizable colors in the image. This technique has been around for a long time. Early experiments in the 1850s, which combined multiple images taken at different exposures, were the beginning. In 1985, computer graphics software introduced HDR imaging when rendering animations. In the 1990s the technique started being applied to video, and screen displays capable of displaying HDR images arrived after the millennium.
Here are the SDR and HDR gamuts side by side:
The number of bits in a colored image does not affect the overall dynamic range but reducing the bits will introduce contouring (banding) artefacts. At extremes, this creates a solarization effect which is artistically interesting but utterly horrible when viewing video.
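The link between bit depth and contouring can be demonstrated by quantizing a smooth ramp and counting how many distinct levels survive. The Python sketch below is a simplified model; the gradient range of 0.4 to 0.5 spread across 1920 pixels is an assumed stand-in for a gently shaded sky.

```python
def quantize(value, bits):
    """Round a normalized value (0..1) to the nearest code at the given bit depth."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


def distinct_steps(bits, samples=4096, lo=0.0, hi=1.0):
    """Count the distinct output levels when a smooth ramp from lo to hi is quantized."""
    ramp = (lo + (hi - lo) * i / (samples - 1) for i in range(samples))
    return len({quantize(v, bits) for v in ramp})


# A subtle gradient covering a tenth of the signal range across a 1920-pixel width.
for bits in (6, 8, 10, 12):
    steps = distinct_steps(bits, samples=1920, lo=0.4, hi=0.5)
    print(f"{bits}-bit: {steps} levels, bands roughly {1920 // steps} pixels wide")
```

Fewer levels across the same gradient means wider bands of flat color, which is what the eye picks up as contouring.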
You will occasionally observe this artefact on over-compressed DVD video. Consider a sunset with a cloudless sky; that image has a large area with a circular gradient toned area which is very prone to contouring. Alleviate this artefact by lowering the compression ratio during that sequence. It will cause a momentary burst in the bitstream, but careful control so the maximum bitrate limit is not exceeded will solve the problem.
HDR is generally acceptable with a 10-bit resolution. Experiments with 12 bits do yield better quality but it is unlikely to become a consumer format. HDR10 performs well but does require the correct HDMI cables for display connection. The HDR10 media profile is described by ITU-R BT.2100.
The higher resolution 8K TV services will benefit from the HDR10+ ADAPTIVE specification which is not yet standardized.
Relevant HDR Standards
These standards are useful if you need to understand the technical background to HDR displays:
| Specification | Vintage | Description |
|---|---|---|
| ITU-R BT.709 | 2015 | Parameter values for the HDTV standards for production and international program exchange. |
| ITU-R BT.2020 | 2015 | Specifies the color primaries and the gamut of available tones. |
| ITU-R BT.2100 | 2018 | Image parameter values for High Dynamic Range television for use in production and international program exchange. Describes both the PQ and the Hybrid Log-Gamma transfer function. |
| ISO 23008-14 | 2018 | Conversion and coding practices for HDR/WCG Y'CbCr 4:2:0 video with PQ transfer characteristics. |
| P3-D65 | 2015 | Wide-Color Gamut colorimetry specification for Dolby Vision HDR content. Also limited by ITU-R BT.709. The P3 primaries originate in the DCI digital cinema specification; Apple Inc. adopted them for its Display P3 color space. |
| ST 2084 | 2014 | The image transfer function from SDR to HDR. This is the High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays. The Perceptual Quantizer (PQ) is used as the Colorimetry specification for Dolby Vision HDR content. |
| ST 2086 | 2018 | Static metadata describing the light levels and the Mastering Display Color Volume Metadata Supporting High Luminance and Wide-Color Gamut Images. |
There are other specifications available with slightly different properties.
What Is Coming Next?
At the time of writing H.267 is being worked on but not likely to become a standard for some time yet. It is being designed to facilitate immersive content for Augmented Reality (AR) and Virtual Reality (VR) systems. Possible target applications are gaming and AI systems. H.267 may require enhanced decoder hardware to be deployed to render the output video. That may prove to be an obstacle since it will mandate a costly hardware replacement cycle for the consumer audience.
The goal is to accomplish another 50% reduction in the resulting bitrate compared with VVC. Given that VVC is already expected to be much more efficient than HEVC, this will be a significant leap forward.
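The compounding effect of successive 50% reductions is easy to work through. In this Python sketch the 20 Mbit/s HEVC starting point is a hypothetical figure chosen only to make the arithmetic concrete, and the halving at each generation is the design target rather than a measured result.

```python
def successive_bitrates(start_mbps, reductions):
    """Apply a sequence of fractional bitrate reductions, one per codec generation."""
    rates = [start_mbps]
    for reduction in reductions:
        rates.append(rates[-1] * (1.0 - reduction))
    return rates


# HYPOTHETICAL baseline: an HEVC stream at 20 Mbit/s, then VVC and H.267
# each meeting a 50% reduction target against its predecessor.
generations = ["HEVC", "VVC", "H.267"]
for name, rate in zip(generations, successive_bitrates(20.0, [0.5, 0.5])):
    print(f"{name}: {rate:.1f} Mbit/s")
```

Two generations of halving leave a quarter of the original bitrate for the same picture quality, assuming both targets are met in practice.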
The standard is expected to be finalized around 2028-Q3 with possible wider deployment around 2035. This is still a long way off and this will offer HEVC an opportunity to gain some traction in the meantime. Some documentation refers to this as the Enhanced Compression Model (ECM).
H.267 will likely be the last of the macroblock structured codecs. Current advanced research into AI driven coding is investigating how to describe moving images in a way that can be created with Neural Networks. AI synthesized video already uses this technique.
This will not need macroblocks, DCT, motion vectors, GOPs or any of the fundamental techniques that the H series and ISO/MPEG codecs are based on. We will need a new way to describe the coded video stream so it can be efficiently pushed through the Neural Network. That is an opportunity for another standard.
Neural Network Processing Units (NPU) are being built into computing devices inside the core CPU chip. Apple Silicon is an example of how this has already been happening for some time. The necessary hardware to benefit from this block-free format will be deployed and ready by the time the AI coding research is complete.