Improving Compression Efficiency With AI

A group of international technology vendors and broadcasters is developing and implementing Artificial Intelligence (AI) standards to improve video coding. The group, which calls itself MPAI (Moving Picture, Audio and Data Coding by Artificial Intelligence), believes that machine learning can improve the efficiency of the existing Enhanced Video Coding (EVC) standard by about 25 percent.

The collective efforts of MPAI’s experts are focused on a horizontal hybrid approach that combines AI-based algorithms with traditional video codecs by replacing one or more blocks of the traditional coding loop with machine learning-based blocks.
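To make the hybrid idea concrete, here is a minimal sketch of a prediction-plus-residual loop in which the predictor is a swappable block. It is a toy illustration, not MPAI's code: the function names are hypothetical, and the "learned" predictor is a hand-written stand-in for a trained model.

```python
from typing import Callable, List

Block = List[int]
# A predictor maps already-decoded neighbouring samples to a predicted block.
Predictor = Callable[[Block], Block]


def dc_predictor(neighbours: Block) -> Block:
    """Traditional-style block: predict every sample as the neighbours' mean."""
    mean = round(sum(neighbours) / len(neighbours))
    return [mean] * len(neighbours)


def learned_predictor_stub(neighbours: Block) -> Block:
    """Hypothetical stand-in for an ML-based block; it simply copies the neighbours."""
    return list(neighbours)


def encode_block(block: Block, neighbours: Block, predict: Predictor) -> Block:
    """Return the residual that the rest of the codec would transform and entropy-code."""
    prediction = predict(neighbours)
    return [s - p for s, p in zip(block, prediction)]


def decode_block(residual: Block, neighbours: Block, predict: Predictor) -> Block:
    """Rebuild the block; the decoder must run the same predictor as the encoder."""
    prediction = predict(neighbours)
    return [r + p for r, p in zip(residual, prediction)]


block = [10, 12, 14, 16]
neighbours = [10, 12, 14, 15]
for predictor in (dc_predictor, learned_predictor_stub):
    residual = encode_block(block, neighbours, predictor)
    assert decode_block(residual, neighbours, predictor) == block
    print(predictor.__name__, residual)
```

The rest of the loop is untouched when the predictor is swapped, which is the point of the horizontal approach. The smaller residual from the stub on this contrived input says nothing about real-world gains.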

The latest activities of the non-profit global group were discussed at length during the recent HPA Tech Retreat online conference. The group was officially launched in September 2020 and counts members from 15 countries, including broadcasters like Italy's RAI. Its stated goals include leveraging AI as a core technology for its standards and developing patent-friendly framework licenses to help its members monetize their intellectual property.

MPAI defines data coding as the transformation of data from a given representation to an equivalent one more suited to a specific application. Examples it cites are compression and semantics extraction. To date it has identified an AI module (AIM) and its interfaces as the AI building block. The group says the syntax and semantics of the interfaces determine what AIMs should perform, not how. AIMs can be implemented in hardware or software, using AI, machine learning or legacy data processing.
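The "what, not how" principle can be sketched with a small interface and two interchangeable implementations. Everything here is an assumption made for illustration: the `AIM` protocol, its `process` method and both classes are hypothetical and are not taken from any MPAI specification.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class AIM(Protocol):
    """Hypothetical AIM interface: fixes what goes in and out, not how it is computed."""

    def process(self, samples: List[float]) -> List[float]: ...


class MovingAverageAIM:
    """Legacy data-processing implementation: 3-tap moving average, shorter at the edges."""

    def process(self, samples: List[float]) -> List[float]:
        out = []
        for i in range(len(samples)):
            window = samples[max(0, i - 1): i + 2]
            out.append(sum(window) / len(window))
        return out


class GainAIM:
    """Stand-in for a trained model: one hypothetical 'learned' gain value."""

    def __init__(self, gain: float = 0.5) -> None:
        self.gain = gain

    def process(self, samples: List[float]) -> List[float]:
        return [self.gain * s for s in samples]


def run_workflow(aims: List[AIM], samples: List[float]) -> List[float]:
    """Chain AIMs; each one only needs to honour the shared interface."""
    for aim in aims:
        samples = aim.process(samples)
    return samples


print(run_workflow([MovingAverageAIM(), GainAIM(0.5)], [2.0, 4.0, 6.0]))
# prints [1.5, 2.0, 2.5]
```

Because the workflow depends only on the interface, either module could be replaced by another vendor's implementation without changing `run_workflow`.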

The group is looking at many facets of utilizing AI and will establish a licensing model for its application.

“MPAI’s AI framework, which enables the creation, execution, composition and update of AIM-based workflows, is the cornerstone of MPAI standardization because it enables building high-complexity AI solutions by interconnecting multi-vendor AIMs trained to specific tasks, operating in the standard AI framework and exchanging data in standard formats,” said Mikhail Tsinberg, president and CEO at Key Digital Systems and a founding member of MPAI.

The group is looking at many facets of utilizing AI, including: MPAI-AF, for creation and execution of AI-ML-DP workflows; MPAI-EVC, extending video codec capability with AI; MPAI-MMC, conversation with machines, naturally as with humans; MPAI-CAE, audio at home, in the office, on the go and in the studio; MPAI-OSD, visual object and scene analysis using AI.

For MPAI-EVC, which is focused on coding efficiency, the goal is to provide up to 25% additional bitrate reduction for the same resolution over existing EVC (MPEG-5) codecs. Testing is now ongoing using different content types to build a data set that can be used to improve performance in a highly automated way. Content the group is now working with includes natural video (video camera-captured content), moving film (film-captured content), computer-generated graphics and video gaming.
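As a rough illustration of what a 25% bitrate reduction would mean in practice, the arithmetic can be sketched as follows. The 8,000 kbps baseline is an arbitrary, hypothetical figure chosen for the example and is not an MPAI test result.

```python
def reduced_bitrate(baseline_kbps: float, reduction_pct: float) -> float:
    """Bitrate after applying a percentage reduction to a baseline."""
    return baseline_kbps * (1 - reduction_pct / 100)


def gigabytes_per_hour(bitrate_kbps: float) -> float:
    """Data volume for one hour of video at a constant bitrate (1 GB = 10^9 bytes)."""
    bits = bitrate_kbps * 1000 * 3600
    return bits / 8 / 1e9


baseline = 8000.0  # hypothetical baseline stream, in kbps
target = reduced_bitrate(baseline, 25)

print(f"{baseline:.0f} kbps -> {target:.0f} kbps")
print(f"{gigabytes_per_hour(baseline):.1f} GB/h -> {gigabytes_per_hour(target):.1f} GB/h")
```

With these assumed numbers, an hour of video would shrink from about 3.6 GB to about 2.7 GB.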

According to Tsinberg, MPAI’s initial work has involved using lossy and visually lossless compression (currently, mathematically lossless coding is not being considered). Specific parameters include up to 8K resolution; rectangular video with a wide range of aspect ratios, including video banners and vertical video; SDR/HDR; standard and wide color gamut; 4:2:0, 4:2:2, and 4:4:4 color formats (initial focus will be on YUV-based coding); and multiple frame rates of up to 120 Hz.

They are also looking at IP-based (including HTTP live streaming) protocols, MPEG-2 Transport Streams, MPEG Media Transport and others. Using CBR, VBR and Capped VBR, they are hoping to deploy systems with end-to-end delay levels of: High, from more than 100 msec up to offline encoding; Low, from 30 msec to less than 100 msec; and Very Low, less than 30 msec (less than one picture frame).
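The delay classes above can be expressed as a small lookup, sketched here under two assumptions that are not in MPAI's description: Low is read as 30 msec up to, but not including, 100 msec, and a delay of exactly 100 msec is placed in the High class. The function names are hypothetical.

```python
def classify_delay(delay_ms: float) -> str:
    """Map an end-to-end delay in milliseconds to a delay class.

    Assumption: exactly 100 ms is treated as "high"; the article does not say.
    """
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 30:
        return "very low"
    if delay_ms < 100:
        return "low"
    return "high"


def frame_duration_ms(frame_rate_hz: float) -> float:
    """Duration of one picture frame in milliseconds."""
    return 1000.0 / frame_rate_hz


for delay in (20, 30, 99.9, 150):
    print(delay, "ms ->", classify_delay(delay))

# One frame lasts about 33.3 ms at 30 Hz but only about 8.3 ms at 120 Hz,
# so "less than one picture frame" depends on the frame rate in use.
print(round(frame_duration_ms(30), 1), round(frame_duration_ms(120), 1))
```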

These AI-accelerated systems could be located on premises or in the cloud. Backward compatibility with existing systems and scalability are also being taken into account.

“The MPAI Enhanced Video Coding mission is to exploit advances in AI to develop video coding standards that improve coding efficiency,” Tsinberg said. “The rationale is to use AI tools that are able to distill aspects of the data semantics relevant to video compression. These will be highly adaptive systems. Our focus currently is on Intra Predictions and we have already built a dataset for training.”

To manage the bi-directional communication between EVC codecs and the AI tools, he said the MPAI has come out with a Web socket method that looks to build an abstraction layer agnostic to the AI frameworks, the operating systems and the physical location.

The future plan is to port the code developed in the group’s Evidence Project to FPGA boards, which outperform generic processors and offer low latency and high throughput.

“By accelerating the maturity of AI-enabled data compression, MPAI will help accelerate large-scale adoption of AI Technologies in devices leading to a future where AI is the dominating technology in devices,” Tsinberg said.

Recognizing the moral responsibilities linked to AI, the group’s website states that: “Although it is a technical body, MPAI is aware of the revolutionary impact AI will have on the future of human society. MPAI pledges to address ethical questions raised by its technical work with the involvement of high-profile external thinkers. The initial significant step is to enable the understanding of the inner working of complex AI systems.”

Editor’s Note: As part of its fourth standard project “Compression and understanding of industrial data (MPAI-CUI),” the MPAI has issued a Call for Technologies. The standard aims to enable AI-based filtering and extraction of information from a company’s “governance, financial and risk data enabling prediction of company performance.” It is also calling for technologies to develop two standards related to audio (MPAI-CAE) and multimodal conversation (MPAI-MMC).
