Image courtesy of XIPH.ORG
In Part 2 of this series on Compression Technology we learned how Motion Vectors are generated when motion estimation is employed as the first step of creating P-frames and B-frames. In Part 3 we’ll learn how these motion vectors are used to generate Predicted Frames.
Let’s review the nature of P- and B-frames by first looking at forward dependencies. Two types of frames serve as references for other frames: an I-frame can support a future P-frame and/or B-frame, and a P-frame can likewise support a future P-frame and/or B-frame. Put differently, a P-frame or a B-frame can depend on a previous I-frame or a previous P-frame. Arrows that point leftward in the Closed GOP diagram below show such dependencies.
Dependencies among I-, P-, and B-Frames (Apple).
Video frames that will become P- and B-frames are partitioned into macroblocks in the same way as an I-frame. Starting with the first macroblock in the Present Image (current video frame), a search is made to determine where its content can be found in the Adjacent Image (next video frame). When the contents of a macroblock have not moved, the macroblock’s motion vector is set to zero.
When a match is not found at X=0 and Y=0, the Present Image’s comparison macroblock is moved an increasing distance from its origin until there is a match—or ultimately no match. After the first macroblock has been searched, the process is repeated for every remaining macroblock, so every macroblock within a Present Image is assigned a motion vector. A Present Image’s motion vectors are stored in a Motion Estimation Block. Although an estimation block will ultimately be stored, it is first used to generate a Predicted Frame.
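The search described above can be sketched as an exhaustive block-matching loop. This is a minimal illustration, not any particular codec's algorithm: the function names, the sum-of-absolute-differences (SAD) cost, and the small search window are all assumptions for the example.

```python
def sad(a, b):
    # Sum of absolute differences between two equal-sized blocks.
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def get_block(frame, top, left, size):
    # Extract a size-by-size macroblock whose upper-left corner is (top, left).
    return [row[left:left + size] for row in frame[top:top + size]]

def motion_vector(present, adjacent, top, left, size, search_range):
    """Find where the Present Image's macroblock went in the Adjacent Image:
    start at offset (0, 0) and try increasing displacements, keeping the
    offset with the lowest SAD. A still macroblock keeps vector (0, 0)."""
    block = get_block(present, top, left, size)
    best = (0, 0)
    best_cost = sad(block, get_block(adjacent, top, left, size))
    h, w = len(adjacent), len(adjacent[0])
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if 0 <= t and t + size <= h and 0 <= l and l + size <= w:
                cost = sad(block, get_block(adjacent, t, l, size))
                if cost < best_cost:
                    best_cost, best = cost, (dy, dx)
    return best
```

For example, if a bright 2x2 block moves down and right by one pixel between frames, the search returns the vector (1, 1); if the two frames are identical, it returns (0, 0). Real encoders use smarter search patterns than this brute-force scan, but the principle is the same.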
Below, the upper-left image is the Present Image. The upper-right image is the Adjacent Image (next video frame). One difference that has occurred between the capture of the Present Image and the capture of the Adjacent Image is obvious – the person has opened their eyes.
Steps to Generate a Predicted Frame (Wang).
The lower-left image is the Adjacent Image with the calculated motion vectors superimposed. These motion vectors are applied to the Present Image (current video frame) to construct a Predicted Frame. Simply put, the vectors move macroblocks in the Present Image to new locations. The lower-right image shows the generated Predicted Frame.
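A sketch of this motion-compensation step follows. Note one practical detail: rather than pushing Present Image blocks to new positions (which can leave holes and overlaps), real decoders do the reverse lookup, filling each macroblock of the Predicted Frame from the reference image displaced by that block's vector. The function name and the vector dictionary layout are illustrative assumptions.

```python
def predict_frame(reference, vectors, size):
    """Build a Predicted Frame from a reference (Present) image.
    vectors[(top, left)] = (dy, dx) means the content of the block at
    (top, left) lies at (top + dy, left + dx) in the reference; blocks
    without an entry default to zero motion."""
    h, w = len(reference), len(reference[0])
    predicted = [[0] * w for _ in range(h)]
    for top in range(0, h, size):
        for left in range(0, w, size):
            dy, dx = vectors.get((top, left), (0, 0))
            for r in range(size):
                for c in range(size):
                    # Clamp the source position to the frame edges.
                    sr = min(max(top + dy + r, 0), h - 1)
                    sc = min(max(left + dx + c, 0), w - 1)
                    predicted[top + r][left + c] = reference[sr][sc]
    return predicted
```

With a vector per macroblock, a bright patch in the reference can be relocated to approximate the next frame; any macroblock whose vector is imperfect contributes errors that the Difference Frame must later correct.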
Ideally, these vectors would move pixels exactly to their new locations. However, as shown, the Predicted Frame has errors. To eliminate motion estimation errors, a Difference Frame is created.
A Difference Frame is generated by subtracting the Predicted Frame from the Adjacent Image (the next video frame). Were the motion vectors able to create a perfect Predicted Frame, the Predicted Frame would match the Adjacent Image and the Difference Frame would be empty. With motion video, there will likely be information in the Difference Frame – as shown below.
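The subtraction is a simple per-pixel operation, sketched below with illustrative function names. The second function shows the decoder's side of the same relationship: adding the residual back to the prediction recovers the image exactly.

```python
def difference_frame(adjacent, predicted):
    # Residual: the actual next image minus its motion-compensated prediction.
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(adjacent, predicted)]

def apply_difference(predicted, diff):
    # Decoder side: adding the residual back corrects every prediction error.
    return [[p + d for p, d in zip(row_p, row_d)]
            for row_p, row_d in zip(predicted, diff)]
```

A perfect prediction yields an all-zero Difference Frame; every mis-predicted pixel shows up as a nonzero residual value, and applying the Difference Frame restores the Adjacent Image exactly.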
Difference Frame (Wang).
The Difference Frame is compressed with the DCT, after which lossless data reduction (VLC and RLC) is applied. This is the same process used to compress an I-frame. The motion estimation blocks are also VLC and RLC compressed. The compressed Difference Frame and the compressed motion estimation block are then stored.
To summarize the compression process: each I-frame is intra-frame compressed and stored in a long-GOP stream. Each compressed P-frame includes two types of information: a motion estimation block and a Difference Frame. (Each compressed B-frame has two motion estimation blocks and two Difference Frames.)
As a stream is decompressed, an I-frame is re-created by reversing its lossless compression and then performing an Inverse DCT. This yields a Present Image that is output as a video picture. (A Present Image can also be obtained from a previous I- or P-frame.) When a P-frame is encountered in a long-GOP stream, its motion estimation block is decompressed and its vectors are applied to the Present Image to create a Predicted Frame.
Next, the P-frame’s Difference Frame is re-created by reversing its lossless compression and then performing an Inverse DCT. With both a Predicted Frame and Difference Frame available, an Adjacent Image – output as a video picture – is generated by using the Difference Frame to correct errors in the Predicted Frame. (A B-frame’s single Adjacent Image is obtained from a previous I- or P-frame by appropriately employing two Difference Frames to correct errors in two Predicted Frames.)
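The decode loop just described can be sketched as follows. The stream layout here is a hypothetical simplification: the VLC/RLC and DCT stages are assumed already reversed, and the motion-compensation step is passed in as a function so the loop stays focused on the I-frame/P-frame logic.

```python
def decode_gop(stream, motion_compensate):
    """Decoder sketch for a simplified long-GOP stream. `stream` holds
    ('I', image) entries and ('P', vectors, diff) entries. For each P-frame,
    `motion_compensate(reference, vectors)` returns the Predicted Frame, and
    the Difference Frame then corrects its prediction errors."""
    reference = None
    decoded = []
    for entry in stream:
        if entry[0] == 'I':
            reference = entry[1]  # intra frame: output directly
        else:
            _, vectors, diff = entry
            predicted = motion_compensate(reference, vectors)
            # Predicted Frame + Difference Frame = exact next image,
            # which also becomes the reference for the following frame.
            reference = [[p + d for p, d in zip(row_p, row_d)]
                         for row_p, row_d in zip(predicted, diff)]
        decoded.append(reference)
    return decoded
```

For instance, with a trivial zero-motion compensator (`lambda ref, vecs: ref`), a stream of one I-frame followed by one P-frame decodes to the I-frame image and then that image corrected by the P-frame's Difference Frame.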
This process is repeated for the remaining frames in each GOP, and begins again when the next I-frame is encountered. Although P- and B-frames are more efficient than I-frames (they require less stored data), the use of Difference Frames gives them the same visual quality.