Broadcast For IT - Part 16 - Video Compression

To deliver efficient media solutions, IT engineers must be able to communicate effectively with broadcast engineers. In this series of articles, we present the most important topics in broadcasting that IT engineers must understand. Here, we look at compression: why we use it, and how.

Television signals continually consume copious amounts of data. Typically, an HD transmission distributed over an SDI network will consume data at a rate of 1.485Gbits/second. A progressive HD signal will use 2.97Gbits/sec and 8K signals use more than 100Gbits/sec.
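The SDI figures quoted above follow from simple raster arithmetic. The sketch below assumes the nominal 1080-line HD raster (2200 total samples per line and 1125 total lines, blanking included) and 20 bits per sample period (10-bit luma plus 10-bit multiplexed chroma). The function name is illustrative only.

```python
def sdi_bit_rate(samples_per_line, total_lines, frames_per_second,
                 bits_per_sample_period=20):
    """Serial bit rate in bits/second for a full raster, blanking included."""
    return (samples_per_line * total_lines * frames_per_second
            * bits_per_sample_period)

hd_interlaced = sdi_bit_rate(2200, 1125, 30)   # 1080-line HD at 30 frames/sec
hd_progressive = sdi_bit_rate(2200, 1125, 60)  # 1080-line HD at 60 frames/sec

print(hd_interlaced / 1e9)   # 1.485
print(hd_progressive / 1e9)  # 2.97
```

Systems running at 29.97 frames per second use the same raster with the rate divided by 1.001, so the result is marginally lower than the nominal figure.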

Baseband is the term used to describe a signal that leaves the camera uncompressed. This is a difficult description as all television signals are compressed by the camera at the point of capture. That is, we take an infinitely varying scene and sample it at 29.97 or 25 frames per second, and then split the light into red, green and blue channels ready for digitization and quantization.

Each of these steps is technically a form of compression. However, for historical reasons, a signal that has gone through this process without further bit rate reduction is referred to as “baseband”.

Constant Bit Rate is Evenly Gapped

Furthermore, video signals are inherently periodic. The frame rate provides regular data at 29.97 or 25 Hertz. This is evened out by SDI to provide a constant bit rate (CBR) of data that is evenly gapped throughout each second.

Broadcast engineers prefer to work with baseband signals as they suffer the least degradation and the fewest concatenation errors when processed.

Diagram 1 – Timeline of the most popular compression formats.

Compression is the process of taking a baseband signal and further reducing the bit rate without noticeable degradation of vision or sound. More advanced systems change the distribution to variable bit rate (VBR) to create bursty data distributions and improve efficiency over packet transport networks.

Compression is a Three Step Process

Video signals use three methods to achieve compression: removing surplus synchronizing information, intra-frame reduction, and temporal reduction.

In modern workflows, a video signal may be compressed and decompressed many times throughout the broadcast chain. Depending on the compression used, this process can be highly destructive and leads to the concept of concatenation error. Video compression tends to be lossy, so repeatedly compressing and decompressing a signal leads to excessive noise and distortion.

In highly optimized systems, concatenation errors may not be immediately obvious but will be seen when one too many compression and decompression cycles are completed, leading to the cliff-edge effect.
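Concatenation error can be illustrated with a toy model in which each codec is reduced to a single rounding step. This is a deliberately simplified sketch: real codecs quantize transform coefficients rather than raw values, and the step sizes of 5 and 7 are arbitrary assumptions chosen for illustration.

```python
import math

def quantize(value, step):
    """Round a value to the nearest multiple of `step` (the lossy part of a codec)."""
    return int(math.floor(value / step + 0.5)) * step

original = 11
same_codec = quantize(quantize(original, 5), 5)    # second pass changes nothing
mixed_codecs = quantize(quantize(original, 7), 5)  # a different codec upstream

print(abs(same_codec - original))    # 1
print(abs(mixed_codecs - original))  # 4
```

Repeating an identical quantization adds no further error, but cascading two different quantizers leaves the value further from the original than either would alone. This is one reason why chains that mix compression formats degrade faster.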

Removing Syncs Reduces Bit Rates by 20%

Due to the historic operation of broadcast systems, much of the signal is taken up by line, frame, and field synchronizing pulses. Typically, a line pulse in SDI systems uses nearly twenty percent of the signal bandwidth. This is a legacy of the NTSC and PAL systems, where long pulses were needed to reduce the amount of back-emf induced in the electromagnetic scan coils of a cathode ray tube television.

True digital systems can dispense with these pulses and replace them with unique code values. A saving of twenty percent of the video data-rate can be easily achieved using this method.
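The size of this saving can be estimated by comparing the active picture area with the full raster. The sketch below assumes the nominal raster dimensions for 525-line and 625-line standard definition and for 1080-line HD. The function and dictionary names are illustrative only.

```python
def blanking_overhead(total_samples, total_lines, active_samples, active_lines):
    """Fraction of the raster that carries no picture (blanking and sync intervals)."""
    total = total_samples * total_lines
    active = active_samples * active_lines
    return 1 - active / total

# (total samples per line, total lines, active samples per line, active lines)
rasters = {
    "525-line SD": (858, 525, 720, 486),
    "625-line SD": (864, 625, 720, 576),
    "1080-line HD": (2200, 1125, 1920, 1080),
}
for name, raster in rasters.items():
    print(f"{name}: {blanking_overhead(*raster):.1%} of the raster is not picture")
```

Under these assumptions the non-picture portion comes out at roughly 16 to 23 percent depending on the format, which is in line with the twenty percent quoted above.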

Once the sync information has been reduced, a compression system will further reduce the bit-rate using intra-frame reduction. This method starts by dividing each frame into 8 x 8 blocks of pixels and then performing a specialist form of Fourier transform on them called the DCT (Discrete Cosine Transform).

Discrete Cosine Transforms

The DCT algorithms provide coefficients that represent discrete frequencies within each 8 x 8 block. Due to the cyclic nature of the images we see, patterns of closely related values appear over the screen. Constraining these coefficients to a set of pre-defined values (quantization) starts the compression process. Applying coarser limits and normalization parameters leads to greater compression, at the cost of picture detail.

Diagram 2 – In intra-frame compression, DCT functions are applied to 8x8 pixel blocks to convert from the spatial domain to frequency domain.
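The intra-frame stage can be sketched in a few lines of plain Python. This is a minimal illustration, assuming an orthonormal DCT-II and a single uniform quantization step. Real encoders use per-frequency quantization matrices and optimized integer transforms, and the sample block and step size here are invented for the example.

```python
import math

N = 8  # block size used by the intra-frame stage

def dct_1d(values):
    """Orthonormal DCT-II of a list of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        total = sum(v * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                    for n, v in enumerate(values))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Apply the 1-D DCT to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def quantize(coeffs, step):
    """Divide by a step size and round; a coarser step zeroes more coefficients."""
    return [[int(math.floor(c / step + 0.5)) for c in row] for row in coeffs]

# A smooth horizontal ramp, typical of a gently shaded picture area.
block = [[100 + 2 * x for x in range(N)] for _ in range(N)]
quantized = quantize(dct_2d(block), step=16)
nonzero = sum(1 for row in quantized for q in row if q != 0)
print(f"{nonzero} of {N * N} coefficients survive quantization")
```

For smooth picture content most of the energy collects in a few low-frequency coefficients, so the majority of the 64 values quantize to zero and cost very little to transmit.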

After intra-frame compression is complete, the compressor goes on to temporal, or motion compensated, compression. This method looks for recurring patterns between successive frames and, instead of sending similar pixel values for consecutive frames, sends just one value for several frames.

Motion compensated processing (MCP) is extremely complex and can analyze up to 60 frames simultaneously to determine common motion between them, so that vector representations can be sent instead of absolute pixel values. This can best be seen when a ball is thrown into the air.

Vector Representation is More Efficient

Motion compensation will analyze each frame in turn and be able to pick out the ball. Instead of sending the pixel values of the ball, the MCP will provide a vector representation of the ball-object. In the extreme, and with the correct picture content, this provides a great deal of bit-rate reduction.
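The ball example can be sketched as a block-matching search, one common way of deriving motion vectors. This is a simplified illustration rather than the method of any particular encoder: it uses an exhaustive search with a sum-of-absolute-differences cost, and the frame size, ball size, and function names are assumptions made for the example.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(previous, current, x, y, size, search):
    """Search `previous` for the best match to the block at (x, y) in `current`."""
    height, width = len(previous), len(previous[0])
    best = (None, (0, 0))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(current, previous, x, y, px, py, size)
                if best[0] is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best[1]

def make_frame(ball_x, ball_y, width=16, height=16, ball=4):
    """Black frame with a bright square 'ball' whose top-left corner is (ball_x, ball_y)."""
    return [[255 if ball_x <= x < ball_x + ball and ball_y <= y < ball_y + ball else 0
             for x in range(width)] for y in range(height)]

previous = make_frame(4, 8)
current = make_frame(6, 5)   # the ball moved 2 pixels right and 3 pixels up
vector = find_motion_vector(previous, current, x=6, y=5, size=4, search=4)
print(vector)  # (-2, 3): where the block came from, relative to its current position
```

The encoder can then send the two-number vector in place of the sixteen pixel values in the block. Practical encoders also send a small residual to correct any mismatch, and use faster search strategies than the exhaustive one shown here.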

MCP only generates data when there is movement in a scene. This further adds to the burstiness of a stream, creating VBR data and enhancing distribution over packet networks.

MPEG - Moving Picture Experts Group

Broadcasters have been using digital compression since the early 1990s. MPEG (Moving Picture Experts Group) provided the first commercial systems and MPEG-2 was widely adopted. It is still used today in satellite and terrestrial transmissions, but advances in MPEG-4 and HEVC (High Efficiency Video Coding) are starting to make an impact.

MPEG-2 has two fundamental forms of operation: intra-frame and inter-frame. Intra-frame is frame-only compression using DCTs on the 8 x 8 pixel blocks. This is often referred to as “I-Frame” compression. It’s often used in archiving and editing as this type of compression reduces the possibility of concatenation errors found with multiple edits.

High Compression Efficiency

Inter-frame is the process of motion compensation. Although this can provide some fantastic compression ratios, it is also highly destructive of the original video signal. Concatenation errors soon appear in the form of picture stutter and break-up if the available bit-rate is heavily restricted, especially with scenes that have fast, high-dynamic movement.

MPEG-2 Inter-frame is usually reserved for transmission to home viewers as the picture will only need to be constructed once.

Diagram 3 – Inter-frame compression achieves lower bit rates using “B” (bi-directional) and “P” (predicted) frames to apply motion compensation and greatly reduce data between I-Frames. This comes at the expense of concatenation errors during multiple compression and decompression cycles. These errors are less evident in intra-frame only compression, but at the expense of higher bit rates.

Group of Pictures (GOP) is the term used to describe the construction of a motion compensated MPEG-2 stream. I-Frames are used as anchor frames to provide a reference for each of the compressed inter-frames. Without them, we would just see partially formed images forming over many seconds. Terms such as 12-GOP and 30-GOP describe the number of frames in each group, that is, the interval from one I-Frame to the next.
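A GOP structure can be written out as a string of frame types. The sketch below assumes that a figure such as 12-GOP gives the number of frames per group, and that a second parameter (often called M) sets the spacing between I/P anchor frames. The function name and default spacing are illustrative only.

```python
def gop_pattern(gop_length, anchor_spacing=3):
    """Return the frame types, in display order, for one GOP.

    gop_length     -- frames per group (N), e.g. 12 for a "12-GOP" stream
    anchor_spacing -- distance between I/P anchor frames (M);
                      M - 1 B-frames sit between consecutive anchors
    """
    frames = []
    for index in range(gop_length):
        if index == 0:
            frames.append("I")
        elif index % anchor_spacing == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)

print(gop_pattern(12))     # IBBPBBPBBPBB
print(gop_pattern(12, 1))  # IPPPPPPPPPPP
print(gop_pattern(1))      # I (intra-frame only)
```

A longer GOP spends fewer bits on I-Frames and so lowers the bit rate, but a decoder joining the stream must wait longer for the next I-Frame before it can build a complete picture.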

Null-Padding is a Cheat

Although GOP streams natively create VBR data, they can be configured to create CBR streams. In this instance, a CBR stream is just a VBR stream with Null-Padding data added to it. In effect, this creates wasteful data and selecting the correct version is the responsibility of the engineer configuring the system.
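The cost of null-padding can be estimated with some simple arithmetic. The sketch below assumes the stream is carried as an MPEG-2 transport stream, where padding takes the form of 188-byte null packets. The per-second payload figures and the 5 Mbit/s channel rate are hypothetical values chosen for illustration.

```python
TS_PACKET_BITS = 188 * 8  # MPEG-2 transport stream packets are 188 bytes long

def null_packets_needed(payload_bits, target_bit_rate, interval_seconds=1.0):
    """Null packets required to pad one interval of VBR payload up to the CBR target."""
    capacity_bits = int(target_bit_rate * interval_seconds)
    if payload_bits > capacity_bits:
        raise ValueError("payload exceeds the constant bit rate budget")
    return (capacity_bits - payload_bits) // TS_PACKET_BITS

# Hypothetical per-second payload of a VBR stream carried in a 5 Mbit/s CBR channel.
vbr_payload_bits = [3_200_000, 4_900_000, 1_500_000, 4_100_000]
for second, bits in enumerate(vbr_payload_bits):
    nulls = null_packets_needed(bits, 5_000_000)
    wasted = nulls * TS_PACKET_BITS / 5_000_000
    print(f"second {second}: {nulls} null packets, {wasted:.0%} of the channel is padding")
```

The quieter the scene, the more of the channel is filled with padding, which is the wasted capacity an engineer has to weigh against the convenience of a constant rate.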

The process of converting from one compressed format to another, for example 525i29.97-12GOP at 25Mbits/sec to 525i29.97-25GOP at 5Mbits/sec, is referred to as “transcoding”. Modern transcoders tend to be software applications as they process either files or live streams provided on IP/Ethernet links.

Codecs or Transcoders?

A device that converts from SDI to a compression format, and then back to SDI, is described by the term “codec”. Codecs tend to be hardware devices as specialist electronics is used to convert the SDI signal to data streams that can be processed in software. SDI-PCI cards are now found in computer servers, allowing x86 type computer architecture to be used as codecs. However, specification of the server is critical due to the high bandwidth data channels needed to move data between the SDI card, CPU, memory, and disk storage.

Video compression is a highly specialized discipline and many of the controls are inter-dependent. Modern transcoders provide many outputs of differing bit rates to meet the requirements of delivery to multiple devices such as cell phones and tablets, as well as traditional televisions. Migrating to IP is accelerating the need to understand and deliver highly optimized video compression systems.
