Understanding Compression Technology: Part 1

Over the decades of our industry's move from analog to digital technology, codecs (encoders and decoders) have played a key role in the capture, editing, and distribution of video media. The MPEG-2 codec has supported all three tasks. Camera-oriented codecs have moved from MPEG-2 to MPEG-4 (H.264). Distribution has followed two paths. Television program distribution continues to employ MPEG-2. Internet distribution immediately went to progressive video encoded with H.264 because the codec is roughly twice as efficient as MPEG-2. The newest distribution format is H.265, also known as HEVC (High Efficiency Video Coding).

While the details of compression change across successive implementations, seven aspects of MPEG-2 and H.264/H.265 technology are fundamental to compression: macroblocks, DCT, quantization, lossless compression, motion estimation, predicted frames, and difference frames. Part 1 of this article covers macroblocks, DCT, and quantization. Lossless compression and motion estimation will be covered in Part 2. Part 3 will detail the critical role of predicted frames and difference frames.


MPEG-2, H.264 & H265 can be compressed (encoded) two ways. The simplest, is intra-frame encoding. Here each video frame is compressed without any dependence on any other frame. (Both the DV and HEVC/AVC-Intra codecs use intra-frame compression.) The more typical form of compression is inter-frame encoding because it is far more efficient. Inter-frame encoding employs an “I-frame” plus “P-frames” and usually “B-frames.” An I-frame is compressed using the same process as is use by intra-frame encoding.


Video frames may need to be chroma down-sampled to 4:2:0 prior to encoding. Low-pass filtering then reduces noise within each video frame. Both operations are mild forms of lossy compression.
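
As a simplified illustration of 4:2:0 down-sampling, the Python sketch below averages each 2x2 block of a full-resolution chroma plane. Real encoders use more carefully designed filters and must respect chroma siting, so treat this box filter as an approximation.

```python
import numpy as np

def downsample_420(chroma):
    """Average each 2x2 block of a chroma plane (4:4:4 -> 4:2:0).

    A simple box filter, assuming the plane is a 2D NumPy array;
    production encoders use better-designed low-pass filters.
    """
    h, w = chroma.shape
    blocks = chroma[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))  # half resolution in each dimension
```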

MPEG-2 encoding starts with a video frame partitioned into non-overlapping 16x16 pixel blocks called "macroblocks"; encoding begins with the macroblock in the upper-left corner. Assuming 4:2:0 color sampling, the amount of red chroma (Cr) data is one-quarter the amount of luminance (Y) data, so one 8x8 block of Cr data is pulled from a macroblock. Likewise, one 8x8 block of blue chroma (Cb) data is pulled from a macroblock. Each luminance macroblock is subdivided into four 8x8 blocks, yielding a total of six 8x8 blocks. (H.264 employs an adaptive scheme: small blocks are employed where there is extensive detail, while large blocks are used where there are few details, for example an area of blue sky.)
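
The sketch below, written in Python with NumPy and hypothetical plane names, pulls the six 8x8 blocks for one 16x16 macroblock, assuming the chroma planes have already been down-sampled to 4:2:0.

```python
import numpy as np

def split_macroblock(y_plane, cr_plane, cb_plane, mb_row, mb_col):
    """Pull the six 8x8 blocks for one 16x16 macroblock (4:2:0).

    y_plane is full resolution; cr_plane and cb_plane are assumed
    to be half resolution in each dimension.
    """
    y0, x0 = mb_row * 16, mb_col * 16
    mb = y_plane[y0:y0 + 16, x0:x0 + 16]
    y_blocks = [mb[r:r + 8, c:c + 8]             # four 8x8 luminance blocks
                for r in (0, 8) for c in (0, 8)]
    cy0, cx0 = mb_row * 8, mb_col * 8
    cr_block = cr_plane[cy0:cy0 + 8, cx0:cx0 + 8]  # one 8x8 Cr block
    cb_block = cb_plane[cy0:cy0 + 8, cx0:cx0 + 8]  # one 8x8 Cb block
    return y_blocks + [cr_block, cb_block]         # six 8x8 blocks total
```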

A Discrete Cosine Transform (DCT) is then applied to each of the six 8x8 data blocks. The DCT is a two-dimensional transform, closely related to the Fast Fourier Transform (FFT), that takes an 8x8 pixel block and converts it from the spatial domain to the frequency domain. In other words, pixel information is converted to a mathematical representation (cosine coefficients), as shown by the grayscale in the figure below.

Image 1: Pixel information is converted into a mathematical representation of cosine coefficients, shown here as grayscale values.

The 64 coefficients come from an 8x8 block of pixels; the upper-left cell represents the DC (zero-frequency) value. From left to right and top to bottom, the values represent increasing signal frequency, which corresponds to increasing image detail. To this point, encoding has been mathematically lossless.
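
As a quick illustration, the following Python sketch (using SciPy's DCT routine) applies a type-II 2D DCT to a flat 8x8 block; all of the block's energy lands in the single DC coefficient, and the 63 higher-frequency coefficients are zero.

```python
import numpy as np
from scipy.fft import dct

def dct2(block):
    """Type-II 2D DCT of an 8x8 pixel block (orthonormal scaling)."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

# A flat block has no detail: all energy concentrates in the
# DC (upper-left) coefficient after the transform.
flat = np.full((8, 8), 128.0)
coeffs = dct2(flat)
print(round(coeffs[0, 0]))  # 1024; every other coefficient is ~0
```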

The next encoding stage, quantization, performs lossy compression. During quantization, each coefficient from the DCT is divided by the corresponding entry in a pre-defined "quantization matrix" and rounded to an integer. (The quantization matrix determines the amount of compression to be applied.) The matrix favors coefficients toward the upper-left of the coefficient matrix; out-of-favor coefficients typically become zero and thus generate very little data.

Image 2: The quantization stage, illustrated above, performs lossy compression: each DCT coefficient is divided by the corresponding entry in a pre-defined quantization matrix and rounded.
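
To make the divide-and-round step concrete, here is a short Python sketch. The quantization matrix below is purely illustrative (not a table from any standard); its step sizes grow toward the lower-right, so high-frequency coefficients tend to round to zero.

```python
import numpy as np

# Illustrative matrix: step sizes grow toward the lower-right, so
# high-frequency coefficients are divided by larger values and tend
# to become zero. Real encoders use standard or custom tables.
q_matrix = 8 + 4 * (np.arange(8)[:, None] + np.arange(8)[None, :])

def quantize(coeffs, q_matrix):
    """Divide DCT coefficients element-wise by the matrix and round."""
    return np.round(coeffs / q_matrix).astype(int)

def dequantize(levels, q_matrix):
    """Approximate reconstruction; the rounding loss is permanent."""
    return levels * q_matrix

coeffs = np.zeros((8, 8))
coeffs[0, 0], coeffs[7, 7] = 1024.0, 30.0
levels = quantize(coeffs, q_matrix)
print(levels[0, 0], levels[7, 7])  # 128 0 -- the small high-frequency
                                   # coefficient is rounded away
```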

Lossless data reduction, Variable Length Coding (VLC) and Run Length Coding (RLC), follows quantization. Variable length coding identifies the most frequent patterns in the quantized coefficients and represents them with codes only a few bits long. Less frequent patterns, conversely, are represented by codes that require more bits.
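
Huffman coding is the classic technique for building such variable-length codes (MPEG-2 itself uses fixed, pre-computed VLC tables). The Python sketch below computes only the code lengths, which is enough to show that frequent symbols receive short codes.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return a code length per symbol: frequent symbols get short codes."""
    heap = [(count, i, {sym: 0}) for i, (sym, count)
            in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    tie = len(heap)  # unique tiebreaker so tuples stay comparable
    while len(heap) > 1:
        c1, _, a = heapq.heappop(heap)   # two least-frequent subtrees
        c2, _, b = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**a, **b}.items()}  # one level deeper
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

print(huffman_code_lengths("aaaaaaabbbccd"))
# 'a' (most frequent) gets a 1-bit code; rare 'c' and 'd' get 3 bits
```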

Next, the numeric values resulting from VLC are run length encoded. RLC generates a unique code that represents a repeating pattern found in the output of the VLC stage. For example, a "run" of 31 zero values, as seen in the figure above, becomes a single code to which a repeat count is appended.
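
A minimal run-length encoder in Python shows the principle: a long run of identical values collapses into a single (value, count) pair.

```python
def run_length_encode(values):
    """Collapse runs of repeated values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1       # extend the current run
        else:
            runs.append([v, 1])    # start a new run
    return [(v, n) for v, n in runs]

# A run of zeros, like the 31 zero values mentioned above, becomes
# a single (0, 31) pair instead of 31 separate symbols.
print(run_length_encode([5, 0, 0, 0, 2] + [0] * 31))
# -> [(5, 1), (0, 3), (2, 1), (0, 31)]
```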

All the data from the RLC stage are then stored. Remember that DCT, quantization, VLC, and RLC must be completed for all six 8x8 pixel blocks before a new macroblock can be processed.
