Is Gamma Still Needed?: Part 7 - HLG And PQ

The legacy gamma adopted in 709 and 240M has recently been supplanted by two newer approaches to applying non-linearity to luminance, namely the Hybrid Log Gamma (HLG) system developed by the BBC and NHK and the Perceptual Quantizer (PQ) developed by Dolby.

I hesitate to call HLG and PQ gamma functions, because mathematically they are not. They are only gamma functions in a loose sense.

PQ and HLG are relatively easy to contrast because they represent completely different approaches to the same problem. If they have something in common, it is that in their own ways they both represent the end of the line for non-linearity in luminance. Within the constraints surrounding their design both have gone as far as it is possible to go. If and when a requirement arises for a further improvement in image coding, the solution will not be found in the use of non-linearity.

If HLG and PQ represent different approaches to the same problem, perhaps it would be as well to consider what the problem is, or perhaps was. Going back to legacy formats, there was only one type of display, namely the CRT, and it is fundamentally a non-linear device. It was essential to compensate for most of that non-linearity and it was the characteristic of the CRT that primarily determined the gamma correction function that was adopted.

In legacy formats, the improvement in perceived noise performance that followed from the use of gamma was a bonus. By a pure coincidence, the sensitivity of human vision to noise is a function of brightness, and adoption of a logarithmic non-linearity will give the best noise performance. From a distance, or to a non-mathematician, a gamma curve and a logarithmic curve are somewhat similar, so gamma in legacy television was doing the right thing psycho-optically, even if it wasn't optimal.

The CRT was limited in the brightness it could produce, so there was no point in sending it signals that could not be reproduced. Now there are lots of different display technologies that offer ever increasing brightness and the CRT is history. In the cinema, the brightness bottleneck caused by projection is giving way to light emitting screens. Clearly higher performance signals are needed to take advantage of these new displays.

It is important to realize that wide dynamic range pictures are intended to use the higher peak brightness of modern displays to give a more realistic representation of highlights such as specular reflections and to allow detail to be seen in bright areas. Apart from that, most of the picture will be as before. In particular, HDR technology is powerless to improve dynamic range in dark areas of the picture.

The darkest thing the viewer sees on the screen is set by the ability of the display to cut off the light it generates, made worse by any ambient light falling on the screen. The video signal can't affect either. Many people will spend money on an HDR display and will then fail to see an HDR image because the ambient illumination is too high.

It is also worth remembering that many displays are only capable of producing peak brightness over a small percentage of the screen area, and will run out of power or overheat if greater areas are used. With a correctly graded signal there won't be a problem.

In many respects HDR video has become more like audio, in which brief transients exceed the moderate average power level. HDR video can be thought of as video with headroom, and graded accordingly.

Nevertheless, it was inevitable that modern bright displays would also be situated in locations where ambient brightness would render a CRT completely useless. Using bright displays in that way means that high dynamic range is simply not possible.

Fig.1a) In legacy television, the transmitted signal is scene related, meaning that it is a result of what the camera saw, how the camera dealt with the image, and what gamma correction was applied. Fig.1b) Optimal video encoding requires the viewing conditions to be known. The brightness the display can deliver affects that. In display related systems the display characteristics determine the encoding.

With the CRT out of the picture, there was more freedom in the type of non-linearity that could be used and both HLG and PQ recognized that. The fundamental difference between the two approaches is that PQ was optimized for the best psycho-optics, whereas HLG was optimized for the best compatibility with the existing infrastructure of broadcast television. The good news is that both work a whole lot better than what went before.

Fig.1a) shows that in legacy TV standards, the signal that was transmitted was determined primarily by three factors: the scene that the camera saw, what the camera did to what it saw, and the gamma correction applied on the assumption that a CRT would reproduce it. Neglecting any processing that may have been carried out for creative purposes, the transmitted signal was basically scene related. The whole history of television broadcasting has mostly been based on scene related signals, although the CRT represented a fixed type of display and to that extent signals intended for CRTs are display related. HLG is designed to work within that scene related legacy.

The CRT mostly defined legacy gamma correction. Now the opportunity arises to optimize the non-linearity for the best psycho-optic result. In considering that approach, it must be stressed that the psycho-optics takes place at the display. The viewer is at the display, sees light emitted from the display at whatever brightness the display can achieve, and in whatever surroundings the display happens to be.

It follows immediately that any non-linearity optimized for psycho-optic performance must be display related, as shown in Fig.1b) and that is why PQ took that approach. Definitive work on the perception of noise in video signals was done by Peter Barten. PQ encodes the video signal closely according to Barten's findings. By definition, PQ encoding requires knowledge of the characteristics of the display to which the pictures are being sent. Once that is known, the best psycho-optic result for the available data rate or wordlength can be obtained.
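To make the display related idea concrete, here is a minimal Python sketch of the PQ curve. The constants and the 10,000 cd/m² ceiling are taken from SMPTE ST 2084 and ITU-R BT.2100 rather than from this article, and the function names are invented for illustration. The thing to notice is that the input is absolute display luminance, not a fraction of peak white.

```python
# PQ constants as published in SMPTE ST 2084 / ITU-R BT.2100 (not from this article).
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # luminance represented by a full-scale PQ signal


def pq_encode(nits):
    """Map absolute display luminance in cd/m^2 to a PQ signal in the range 0..1."""
    y = max(0.0, min(nits / PEAK_NITS, 1.0))
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2


def pq_decode(signal):
    """Map a PQ signal in the range 0..1 back to display luminance in cd/m^2."""
    p = max(0.0, min(signal, 1.0)) ** (1.0 / M2)
    return PEAK_NITS * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


if __name__ == "__main__":
    for nits in (0.1, 1.0, 100.0, 1000.0, 10000.0):
        print(f"{nits:8.1f} cd/m^2 -> PQ signal {pq_encode(nits):.4f}")
```

Running the sketch shows that roughly half of the signal range is spent on luminance below about 100 cd/m², which is where most picture content sits.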

Fig.2. At a), the standard gamma correction function has a hard clip at peak white. b) Camera makers place a soft clip or knee in the transfer function so that some detail in white areas is preserved, even though compressed. c) HLG is rather like the sum of a) and b), in that the camera knee becomes standardized so a suitable display can decode it. d) HLG on a legacy display appears like legacy video with a stronger camera knee.

As the PQ encoding must convey the brightness that will be created at the display, the PQ signal must also be accompanied by the appropriate metadata explaining the basis on which it was encoded. This means that displays not having the same dynamic range as the encoded signal will be able to incorporate their own knee to limit dynamic range in an acceptable manner.

PQ can best be thought of as a container that has a gamut somewhat larger than most input signals. For a given camera output, if a display of a given brightness used with PQ is replaced by a brighter one, most of the picture content will still have the same brightness, but the highlights will get brighter. In HLG the brighter display will make the entire picture brighter. HLG might have benefited from some form of metadata, but it was an early decision to do without because the delivery of the metadata could not be guaranteed in legacy systems.

Whilst the non-linearities of 709 and 240M are standardized, in practice that is not the transfer function the viewer gets. The reason is that video signals have a fixed gamut and no signal can get outside that gamut. For many years cameras have had much greater dynamic range than the legacy television formats and it was commonplace for camera sensors to produce signals that would go far out of gamut on highlights. If nothing were done, the out of gamut region would be hard clipped and reproduced as a featureless peak white area. Fig.2a) shows the standard transfer function that saturates at reference white.

In practice camera manufacturers incorporated a processor that modified the transfer function in bright areas of the picture. Fig.2b) shows that there is a knee in the camera which effectively reduces the gain in bright areas so that some detail can be retained. This gives a result that is subjectively more acceptable.
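The difference between a hard clip and a knee can be sketched in a few lines of Python. This is a simplified piecewise-linear model and not any manufacturer's actual processing. The knee point of 0.85 and the maximum input of 4.0 (four times the level at which a hard clip would occur) are hypothetical values chosen only for illustration.

```python
def hard_clip(level):
    """Clip a normalized level at 1.0, where 1.0 is the largest value the signal can carry."""
    return min(level, 1.0)


def soft_knee(level, knee_point=0.85, max_input=4.0):
    """Apply a piecewise-linear knee (hypothetical parameters).

    Levels up to knee_point pass unchanged. Levels from knee_point to
    max_input are squeezed into the remaining range knee_point..1.0 by
    reducing the gain. Anything above max_input still clips.
    """
    if level <= knee_point:
        return level
    reduced_gain = (1.0 - knee_point) / (max_input - knee_point)
    return min(1.0, knee_point + (level - knee_point) * reduced_gain)


if __name__ == "__main__":
    for level in (0.5, 1.0, 2.0, 3.0, 4.0):
        print(f"input {level:.1f}: hard clip {hard_clip(level):.3f}, knee {soft_knee(level):.3f}")
```

With the hard clip, inputs of 2.0 and 3.0 both come out at the same value and the detail between them is lost. With the knee they remain distinct, although much closer together than they were in the scene.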

In theory, if the precise knee used in the camera were known, an inverse function could be used with a brighter display to extend the dynamic range. However, the signal within the knee has used a relatively small number of quantizing intervals to describe a large range of brightness. Expanding that out would invite visible contouring.
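The contouring risk can be estimated with some rough arithmetic, sketched here in Python. The model assumes a piecewise-linear knee and ignores the gamma curve itself. The knee point, maximum input and function name are hypothetical. The figure of 876 quantizing intervals corresponds to 10-bit narrow-range video, which runs from code 64 to code 940.

```python
def code_density(knee_point=0.85, max_input=4.0, intervals=876):
    """Return (below, above): quantizing intervals per unit of input level
    on each side of a piecewise-linear knee.

    The knee is assumed to map input levels knee_point..max_input onto the
    signal range knee_point..1.0. All default values are hypothetical.
    """
    below = (knee_point * intervals) / knee_point
    above = ((1.0 - knee_point) * intervals) / (max_input - knee_point)
    return below, above


if __name__ == "__main__":
    below, above = code_density()
    print(f"below knee: {below:.0f} intervals per unit level")
    print(f"above knee: {above:.1f} intervals per unit level")
    print(f"step size after an inverse knee grows by a factor of {below / above:.0f}")
```

With these assumed values, each quantizing step above the knee would represent about 21 times more light than a step below it once the knee was undone, which is why expansion invites contouring.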

One way of considering how HLG works is that it has incorporated the camera knee into the gamma corrector. Fig.2c) shows that HLG is substantially the same as 709 at low levels, but there is a knee in the transfer function that is low enough that brightness above the knee is quantized accurately enough to avoid contouring when an opposing transfer function expands the dynamic range at the display.
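For readers who want to see the shape of the curve, the following Python sketch implements the HLG OETF. The constants and the break point at one twelfth of peak scene light are those published in ITU-R BT.2100 and do not appear in this article. The function name is invented for illustration.

```python
import math

# HLG OETF constants as published in ITU-R BT.2100 (not from this article).
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)


def hlg_oetf(scene_light):
    """Map normalized linear scene light (0..1) to an HLG signal (0..1)."""
    e = max(0.0, min(scene_light, 1.0))
    if e <= 1.0 / 12.0:
        # Square-root segment: behaves like a conventional gamma curve.
        return math.sqrt(3.0 * e)
    # Logarithmic segment: the standardized knee.
    return A * math.log(12.0 * e - B) + C


if __name__ == "__main__":
    for e in (0.0, 0.01, 1.0 / 12.0, 0.25, 0.5, 1.0):
        print(f"scene light {e:.4f} -> HLG signal {hlg_oetf(e):.4f}")
```

The lower segment is a square root, that is a power of 0.5, which is close to the 0.45 exponent used in 709. The two segments meet at a signal level of 0.5, so the entire upper half of the signal range is available for scene light above the knee.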

It follows that if a video signal having the HLG OETF were to be displayed on a legacy monitor having 709 compatible gamma, it could be viewed. As seen in Fig.2d), it would look like an SDR picture from a camera having a knee, albeit set somewhat lower than usual. This means that for many purposes an HLG signal can be shown on a 709 display, giving HLG a degree of backward compatibility with legacy hardware. Such backward compatibility was never a goal of PQ and displaying PQ signals on legacy equipment often results in a reduction of dynamic range.
