Color and Colorimetry – Part 11

The dominant reason for the adoption of color difference working is that it allows the color difference signals to be reduced in bandwidth without obvious loss of picture quality. Only the luma signal needs to be retained at full bandwidth. There are various ways in which that can be done.

The first use of color difference in television was in the NTSC system of 1953, which was extraordinarily advanced for its time. The NTSC encoding process began with the creation of the Y' signal and color difference signals. However, study of the human visual system (HVS) revealed that color acuity was not constant but varied with the axis on which the color was changing. Acuity on the axis running from blue/magenta through the white point was lower than on the axis running from orange through the white point.

To capitalize on that, NTSC re-matrixed the color difference information onto two new orthogonal axes as shown in Fig.1. This had the effect of rotating the color space by 33 degrees. The vertical axis was called I (for in-phase) and color difference information on that axis was filtered to 1.3MHz bandwidth, whereas the horizontal axis was called Q (for quadrature) and was filtered to only 600kHz.

Fig.1 - In the NTSC encoder, the color difference signals were re-matrixed on to new axes called I and Q. These axes were rotated by 33 degrees and allowed the Q signal to be filtered to the least bandwidth.
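The re-matrixing amounts to a rotation of the color difference plane. The following is a minimal Python sketch of that rotation, assuming the inputs are already-scaled B' - Y' and R' - Y' values (called u and v here); the function names are illustrative and not taken from any standard.

```python
import math

ANGLE = math.radians(33.0)  # rotation between the original axes and I/Q

def uv_to_iq(u, v):
    """Rotate scaled color difference values (u, v) onto the I and Q axes."""
    i = v * math.cos(ANGLE) - u * math.sin(ANGLE)
    q = v * math.sin(ANGLE) + u * math.cos(ANGLE)
    return i, q

def iq_to_uv(i, q):
    """Inverse rotation back to the original color difference axes."""
    u = q * math.cos(ANGLE) - i * math.sin(ANGLE)
    v = q * math.sin(ANGLE) + i * math.cos(ANGLE)
    return u, v
```

Because this is a pure rotation, the length of the color vector is unchanged and the round trip returns the original values; only the axes on which the filtering is applied differ.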

The NTSC encoder used the band-limited I and Q signals to modulate a subcarrier and a quadrature subcarrier. When these two were linearly added, a chroma signal was created whose phase represented the hue and whose amplitude represented the saturation. The I and Q axes were only necessary for the perceptual filtering at the encoder, and the chroma signal could still be sampled on the R' - Y' and B' - Y' axes in the receiver. To aid in that, the burst, which could be extended in a phase-locked loop to act as a reference subcarrier, was aligned with the negative B' - Y' axis.
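The equivalence between modulating on I and Q and demodulating on the color difference axes can be checked numerically. This sketch assumes a convention in which the scaled B' - Y' value u rides on a sine carrier and the scaled R' - Y' value v on a cosine carrier; the names and the convention are assumptions made for illustration.

```python
import math

def chroma_sample(u, v, phase):
    """Chroma value at a given subcarrier phase (radians): two carriers in quadrature."""
    return u * math.sin(phase) + v * math.cos(phase)

def chroma_polar(u, v):
    """Amplitude (saturation) and phase angle in degrees (hue) of the chroma."""
    amplitude = math.hypot(u, v)
    hue_deg = math.degrees(math.atan2(v, u)) % 360.0
    return amplitude, hue_deg

def chroma_sample_iq(i, q, phase):
    """The same signal built from I and Q on carriers advanced by 33 degrees."""
    shifted = phase + math.radians(33.0)
    return q * math.sin(shifted) + i * math.cos(shifted)
```

With this convention a vector on the negative B' - Y' axis, where the burst sits, has a phase angle of 180 degrees.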

The frequency of the subcarrier in NTSC was carefully chosen to be an odd multiple of half the line rate, so that it completed a whole number of cycles only over two TV lines (455 cycles, or 227.5 per line). That meant that in large areas of constant color, the chroma phase would invert from one line to the next, making the chroma less visible on screen. The chroma energy was also interleaved between the clusters of luma energy, which sit at multiples of line rate.
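The line-to-line inversion follows directly from the number of subcarrier cycles per line. A small sketch, assuming the NTSC figure of 455/2 cycles per line and using exact fractions so no rounding creeps in:

```python
from fractions import Fraction

CYCLES_PER_LINE = Fraction(455, 2)  # NTSC subcarrier cycles in one TV line

def subcarrier_phase_at_line_start(line):
    """Fraction of a cycle (0 <= x < 1) reached at the start of a given line."""
    return (CYCLES_PER_LINE * line) % 1

# Phase at the start of four successive lines, in cycles.
phases = [subcarrier_phase_at_line_start(n) for n in range(4)]
```

The phase alternates between zero and half a cycle, which is the inversion described above, and returns to its starting value every two lines.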

The frequency of the sound carrier in black and white 525/60 had been chosen to sit between multiples of line rate so as to minimize interference from the video signal. Unfortunately harmonics of the NTSC chroma signal now fell on the sound carrier. The solution was to shift the entire luma and chroma spectrum slightly so the chroma no longer interfered. As the subcarrier was locked to line rate and the line rate was locked to the frame rate, this required the frame rate to be reduced by the factor 1000/1001.
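The chain of locked frequencies can be worked through with exact arithmetic. The sketch below assumes the commonly quoted figures of 525 lines per frame, 455/2 subcarrier cycles per line and a 4.5MHz sound carrier offset; these values are not stated in the text above.

```python
from fractions import Fraction

SCALE = Fraction(1000, 1001)

mono_frame_rate = Fraction(30)                    # Hz
mono_line_rate = mono_frame_rate * 525            # 15750 Hz

color_frame_rate = mono_frame_rate * SCALE        # about 29.97 Hz
color_line_rate = color_frame_rate * 525          # about 15734.27 Hz
subcarrier = color_line_rate * Fraction(455, 2)   # about 3.579545 MHz

sound_carrier = Fraction(4_500_000)               # Hz, assumed unchanged
sound_over_line = sound_carrier / color_line_rate # sound carrier in line-rate units
```

Under these assumptions the shifted line rate places the sound carrier at exactly the 286th multiple of line rate, which is one way of seeing why the particular factor of 1000/1001 was used.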

The sound interference problem was solved, but 30 frames no longer lasted exactly one second. This led to the invention of color time, which differs from real time by the factor 1000/1001. Ultimately the shift would be handled by the adoption of drop-frame time code, in which agreed frame numbers were omitted in a sequence that kept the time code reasonably close to real time, in the same way that leap years stop the seasons from drifting.
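Drop-frame counting can be sketched in a few lines. This example assumes the usual rule in which frame numbers 00 and 01 are skipped at the start of every minute except each tenth minute; the function name and output format are illustrative.

```python
FRAMES_PER_MINUTE = 60 * 30 - 2                 # a minute that skips two numbers
FRAMES_PER_TEN_MINUTES = 10 * 60 * 30 - 9 * 2   # nine such minutes plus one full one

def drop_frame_timecode(frame_count):
    """Label a running frame count with drop-frame time code (HH:MM:SS;FF)."""
    tens, rest = divmod(frame_count, FRAMES_PER_TEN_MINUTES)
    skipped = 18 * tens                          # numbers skipped in whole ten-minute blocks
    if rest >= 2:
        skipped += 2 * ((rest - 2) // FRAMES_PER_MINUTE)
    n = frame_count + skipped                    # equivalent non-drop frame number
    frames = n % 30
    seconds = (n // 30) % 60
    minutes = (n // 1800) % 60
    hours = (n // 108000) % 24
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"
```

No pictures are discarded; only the labels jump, so that after 107892 frames, which is very nearly one hour of real time at the reduced frame rate, the time code reads exactly one hour.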

Fig.2 - In composite digital, the sampling phase is aligned to I and Q, which means that samples are not taken at burst peaks or crossings.

When it is considered how early NTSC was designed, it worked extremely well. The basic problem with NTSC was that the sheer geographical size of the United States meant video signals had to travel huge distances and suffered generation loss. That meant that the picture on the average TV in the average bar in the average town was a lot worse than the picture in the studio.

In the composite digital format the sampling rate was four times the subcarrier frequency and the sampling was phase-locked to the chroma, which in practice meant that the sampling clock was obtained from the burst. However, the sampling phases were aligned to I and Q axes, which meant that the burst was not sampled at peaks and crossings, but was instead sampled with a 57 degree phase shift as is shown in Fig.2.
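The effect of the sampling phase on the burst samples can be illustrated with a unit-amplitude sine wave sampled four times per cycle. This is only a sketch: the burst is idealized as a continuous sine and the function name is hypothetical.

```python
import math

def burst_samples(offset_deg, count=4):
    """Sample a unit-amplitude sine at four times its own frequency.

    offset_deg is the phase of the first sample relative to a zero crossing.
    """
    return [math.sin(math.radians(offset_deg + 90.0 * k)) for k in range(count)]

aligned = burst_samples(0.0)    # samples land on crossings and peaks
shifted = burst_samples(57.0)   # samples land on neither
```

With a 57 degree shift every sample has a magnitude of either sin(57°) or cos(57°), so none of them lands on a peak or a zero crossing.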

In the component digital domain the equivalent goal to filtering is to reduce the bit rate. If the information is subject to suitable low-pass filtering, then it would be possible to reduce the sampling rate without risk of aliasing. The simplest rate reduction is by a factor of two, which means that alternate samples are dropped after filtering. A factor of two (or four) also has the advantage that the color difference samples remain co-sited with the luma samples. In practice it is wasteful to compute values that will be discarded, so it is usual to combine the filtering and sample dropping in a device called a decimator that only computes the value of samples that are to be retained.
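The saving made by a decimator can be seen by comparing it with filtering at the full rate and then discarding alternate samples. The sketch below uses a three-tap filter chosen purely for illustration; it is not the filter specified by any standard.

```python
def fir_filter(samples, taps):
    """Full-rate FIR low-pass filter (inputs beyond the edges treated as zero)."""
    half = len(taps) // 2
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = n + k - half
            if 0 <= idx < len(samples):
                acc += tap * samples[idx]
        out.append(acc)
    return out

def decimate_by_two(samples, taps):
    """Compute only the even-numbered filter outputs, skipping the rest."""
    half = len(taps) // 2
    out = []
    for n in range(0, len(samples), 2):      # step of two: half the work
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = n + k - half
            if 0 <= idx < len(samples):
                acc += tap * samples[idx]
        out.append(acc)
    return out

TAPS = [0.25, 0.5, 0.25]  # illustrative symmetric low-pass only
```

The decimator output is identical to every other sample of the full-rate filter output, but only half as many values are computed. A symmetrical filter with an odd number of taps keeps each retained sample centered on an original sample position, which is what preserves co-siting.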

Fig.3 shows that CCIR (now ITU-R) 601 filtered the color difference signals to half the bandwidth of luma in the horizontal axis only, because that is what its immediate analog forerunner did. It is important to realize that 601 was only barely digital video; it was more like analog video expressed as numbers to avoid generation loss.

Fig.3 - The 4:2:2 sampling structure of 601 is the same in every line in order to be compatible with legacy analog formats that only filtered color difference signals horizontally.

601 used a sampling rate of 6.75MHz for color difference instead of the 13.5MHz of luma. That allowed the color difference samples Cb and Cr to be multiplexed along with Y' into a 27MHz signal having a CbY'CrY'... structure, in which every other Y' pixel had co-sited color difference pixels.
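The multiplex can be sketched as a simple interleave of three sample streams. The word order Cb, Y', Cr, Y' used here is an assumption of the sketch, as are the function names.

```python
def multiplex_422(y, cb, cr):
    """Interleave one line of 4:2:2 samples as Cb, Y', Cr, Y', ..."""
    if len(y) != 2 * len(cb) or len(cb) != len(cr):
        raise ValueError("expected two luma samples per color difference pair")
    words = []
    for n in range(len(cb)):
        # The color difference pair is co-sited with the first luma sample.
        words += [cb[n], y[2 * n], cr[n], y[2 * n + 1]]
    return words

def demultiplex_422(words):
    """Recover the luma, Cb and Cr streams from the multiplex."""
    return words[1::2], words[0::4], words[2::4]
```

Two luma words and one word each of Cb and Cr make four words per pair of pixels, so 13.5MHz plus twice 6.75MHz gives the 27MHz word rate.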

This sampling structure, which is common for production purposes, is known as 4:2:2. It follows from the requirement for co-siting that the color sampling rates should be binary fractions of the luma sampling rate. The lowest useful sampling rate for color difference was one quarter of the luma sampling rate, and that rate was labelled as 1. Therefore 13.5MHz luma plus two 6.75MHz color differences would be called 4:2:2. Under this convention RGB would be 4:4:4.
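The naming convention can be turned into arithmetic. In this sketch the unit rate is taken as one quarter of 13.5MHz, as described above; the function names are hypothetical.

```python
UNIT_MHZ = 3.375  # the rate labelled "1": one quarter of 13.5 MHz

def sampling_rates_mhz(notation):
    """Map a label such as '4:2:2' to three component sampling rates in MHz."""
    return tuple(int(part) * UNIT_MHZ for part in notation.split(":"))

def total_rate_mhz(notation):
    """Total sample rate across all three components, in MHz."""
    return sum(sampling_rates_mhz(notation))
```

For example, `sampling_rates_mhz("4:2:2")` gives 13.5, 6.75 and 6.75, which total 27, whereas 4:4:4 totals 40.5. The rule should not be applied literally to 4:2:0, where the last digit indicates vertical subsampling rather than a sampling rate of zero.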

Alternative down sampling schemes include 4:2:0 in which the color difference signals are decimated in two dimensions. Fig.4 shows that there are several ways of doing it. In JPEG and MPEG-1, the color information is not co-sited. Instead it is located in the center of the corresponding four luma pixels. This requires that the decimator is also an interpolator. Interpolation is also needed at the decoder to return to co-sited pixels so that they can be matrixed back to RGB for display.

In MPEG-2 frame-based coding, the color difference data are sited on vertical lines running through the luma pixels, but are half way between the scan lines. Once again interpolation is needed. 4:2:0 coding reduces the data rate to one half of what it would be in 4:4:4, which is extremely useful. Although technically information has been lost, it will not be obvious to the human viewer.

Fig.4 - At a) the 4:2:0 color subsampling of JPEG and MPEG-1 locates the color information in the center of a square of luma samples. At b), the 4:2:0 subsampling of MPEG-2 is co-sited with luma samples in one axis but not in the other.
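A center-sited 4:2:0 color difference plane can be sketched by averaging each square of four samples. The plain average is an assumption made for brevity, since practical encoders may use longer filters, and the picture dimensions are assumed to be even.

```python
def subsample_420_center(plane):
    """Average each 2x2 block so the result sits at the center of four pixels."""
    rows, cols = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

def samples_per_frame(width, height, scheme):
    """Total samples in one frame for '4:4:4', '4:2:2' or '4:2:0'."""
    chroma = {
        "4:4:4": width * height,
        "4:2:2": (width // 2) * height,
        "4:2:0": (width // 2) * (height // 2),
    }[scheme]
    return width * height + 2 * chroma
```

Counting samples this way shows 4:2:0 carrying one and a half samples per pixel against three for 4:4:4, which is the halving of data rate mentioned above.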

It is obvious that if subsampling is used at all, then it should be used in both horizontal and vertical axes for the best reduction in bit rate. However, this was not always possible and one of the obstacles was interlace. In an interlaced picture, vertically adjacent lines on the screen belong to different fields captured at different times, so in the presence of motion it is not smart to decimate or interpolate between them. When interlace was in common use, 4:2:2 was the best that could be done, but now that interlace has finally been dragged to the boneyard, 4:2:0 has become more common.

Chroma subsampling can be used in different ways. For example instead of halving the bandwidth of the chroma signal, it can be used to halve the time it takes to send it. That is the basic idea behind time compression, which is a more radical use of a time base corrector. The color difference signal is filtered to reduce the bandwidth, and it is then sampled at an appropriate frequency. If the samples are reproduced at twice that frequency, the time taken to transmit the line will be halved.

It doesn't matter if the time compression is done using digital memory or using charge-coupled analog technology.
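Whatever the storage technology, the principle is that samples are written at one rate and read at a higher one. A minimal sketch, with hypothetical names and with the rates chosen only as an example:

```python
def time_compress(samples, write_rate_hz, factor):
    """Read stored samples back 'factor' times faster than they were written.

    Returns the unchanged samples, the read rate and both durations in seconds.
    """
    read_rate_hz = write_rate_hz * factor
    duration_in = len(samples) / write_rate_hz
    duration_out = len(samples) / read_rate_hz
    return samples, read_rate_hz, duration_in, duration_out

# Example: 351 samples written at 6.75 MHz and read at twice that rate.
line = list(range(351))
_, read_rate, t_in, t_out = time_compress(line, 6.75e6, 2)
```

The sample values are untouched; only their spacing in time changes, so the duration halves and the bandwidth occupied by the signal doubles.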

The analog Betacam format used time compression to squeeze the color difference signals by a factor of two so both could be multiplexed into a single tape track having the same bandwidth as luma. This meant that the three components could be recorded in only two tape tracks by two heads.

The Multiplexed Analog Component (MAC) systems time compressed audio samples, luma and color difference into the modulation of a single carrier for satellite broadcasting. The picture began life in the 4:2:0 format, so that the color difference signals could be sent on alternate lines. The color difference signals were time compressed by 3:1 and luma was time compressed by 3:2 with the result that the video bandwidth went up to 8.5MHz.
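The effect of those compression ratios on the line timing can be estimated as follows. The 52 microsecond active line and 64 microsecond line period are nominal 625-line figures assumed for this sketch; they are not taken from the text above, and the actual MAC line format allocated its time slots somewhat differently.

```python
from fractions import Fraction

ACTIVE_LINE_US = Fraction(52)   # assumed active line period, microseconds
LINE_PERIOD_US = Fraction(64)   # assumed total line period, microseconds

def compressed_duration(duration_us, ratio):
    """Duration after time compression by the given ratio, e.g. Fraction(3, 2)."""
    return duration_us / ratio

luma_us = compressed_duration(ACTIVE_LINE_US, Fraction(3, 2))    # about 34.7
chroma_us = compressed_duration(ACTIVE_LINE_US, Fraction(3, 1))  # about 17.3
remaining_us = LINE_PERIOD_US - luma_us - chroma_us              # left for sound and data
```

Under these assumptions one compressed color difference signal and the compressed luma together fill the original active line period, leaving the rest of the line for the audio samples, while the 3:2 compression multiplies the luma bandwidth by one and a half.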

The results compared with the prevailing use of composite video were extremely good, but in practice if the video reaching a MAC encoder had ever been in the composite domain the advantage was lost.
