In this final part of the series, an attempt will be made to summarize all that has gone before and to see what it means.
The use of traditional gamma in television reduces the amplitude of detail in bright screen areas by a factor of 4 (240M) or 4.5 (709) whereas the amplitude of detail in dark areas is not reduced at all. The way in which the video signal is compressed was originally based on the response of a CRT, which by a complete coincidence just happens to require a transfer function that somewhat resembles human noise perception.
Now that there are no CRTs left, the need to compensate for their non-linearity no longer exists and modern display technologies, many of which are linear or nearly so, have to (almost) linearize the incoming gamma-compensated signal in order to make it look right on the screen.
One occasionally reads that had gamma not been required by the response of the CRT, it would have been necessary to invent it. The facts don’t support that. What is supportable is that had the first form of display to be invented not been a CRT, but instead something linear, it might have been necessary to invent some sort of video compression scheme. Any such scheme would have been optimized to suit human vision and would have out-performed gamma.
In today's CRT-free world, gamma has to be viewed as an accidental or coincidental form of perceptual coding. It can be expected to display the characteristics of compression systems that information theory predicts and experience confirms.
One of the rules of compression is that once a signal is compressed, production becomes more difficult and less accurate. Another rule is that concatenating compression schemes is generally a bad idea.
In all compression systems, the coder needs a matching decoder that reproduces something close to the original signal despite going through a channel of finite bandwidth/bit rate or signal-to-noise ratio. In such systems the decoder has to reverse the encoding accurately and this means that manipulation of the signal in the compressed domain is not allowed.
Maintaining Production Quality
Older readers will remember the Dolby systems that extended the dynamic range of analog audiotape. On replay the decoder had to track the encoder and this was achieved by accurate setting of replay level using Dolby tone. No one would have dreamed of attempting a production step on the encoded signal, because that would prevent it from being properly decoded.
When digital audio recording formats were limited to 16 bits, Prism Sound developed a codec called Dynamic Range Extension (DRE) that encoded 18-bit audio into 16 bits for recording. A matching decoder was needed on replay. Once again no one in his right mind would have contemplated performing a production step on a DRE recording, as it would damage the decoding.
There is no conceptual difference between audio Dolby, DRE and video gamma, as in all three the decoder tracks the encoder using signal level. Yet in video we naïvely manipulate the encoded signal as if it were linear and in doing so we prevent it from being properly decoded. Instead we kid ourselves that the resulting artifacts and problems are not serious.
Even something as simple as a cross-fade doesn't work properly when performed on a gamma-corrected signal. At the center of the fade the calculation of luminance is incorrect. The same error occurs at any boundary between two colors, and is most obvious on a green-magenta transition.
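The cross-fade error can be illustrated with a short sketch. It assumes a pure power-law gamma with an exponent of 0.45 and Rec. 709 luminance weights, both chosen for illustration only, and all function names are hypothetical.

```python
# Sketch: mid-point of a green-to-magenta cross-fade, mixed in linear light
# versus mixed on gamma-coded values. Constants are illustrative assumptions.
WEIGHTS = (0.2126, 0.7152, 0.0722)   # Rec. 709 luminance weights (assumed)
EXPONENT = 0.45                      # pure power-law gamma (assumed)

def encode(linear):
    """Camera-side gamma: linear light to coded value."""
    return linear ** EXPONENT

def decode(coded):
    """Display-side inverse gamma: coded value back to linear light."""
    return coded ** (1.0 / EXPONENT)

def luminance(rgb_linear):
    return sum(w * c for w, c in zip(WEIGHTS, rgb_linear))

def mix(a, b, t):
    """Linear interpolation between two RGB triples."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))

green = (0.0, 1.0, 0.0)
magenta = (1.0, 0.0, 1.0)

# Mixing the light itself: what an optical cross-fade would produce.
lum_linear = luminance(mix(green, magenta, 0.5))

# Mixing the coded values, then decoding at the display.
coded_mid = mix(tuple(encode(c) for c in green),
                tuple(encode(c) for c in magenta), 0.5)
lum_gamma = luminance(tuple(decode(c) for c in coded_mid))

print("mid-fade luminance, linear mix:", round(lum_linear, 3))
print("mid-fade luminance, gamma mix: ", round(lum_gamma, 3))
```

Under these assumptions the linear mix sits at half luminance, whereas the gamma-domain mix decodes to well under a quarter, which appears on screen as a dark band at the center of the fade.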
When gamma is used, color difference signals are not free from luma information and if they are low-pass filtered the result is a loss of luma information that gets worse in the presence of saturated colors. This was true of PAL and NTSC and continued to be true for 601 digital. High-end post-production stayed with RGB to avoid some of the problems. With linear light signals, color difference signals work well and significant reduction of bandwidth/bit rate can be obtained by low-pass filtering the difference signals.
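A rough sketch of the luma loss follows. It takes two adjacent pixels, one green and one magenta, and replaces their color difference values with the average of the pair as a crude stand-in for a low-pass filter. The Rec. 709 weights, the 0.45 power law, the unscaled difference signals and all names are assumptions made for illustration.

```python
# Sketch: averaging color difference signals across a green/magenta edge,
# once on gamma-coded values and once on linear light.
KR, KG, KB = 0.2126, 0.7152, 0.0722  # Rec. 709 weights (assumed)
EXPONENT = 0.45                      # pure power-law gamma (assumed)

def luminance(rgb):
    r, g, b = rgb
    return KR * r + KG * g + KB * b

def to_differences(rgb):
    """Return (weighted sum, B minus sum, R minus sum), unscaled."""
    y = luminance(rgb)
    r, _, b = rgb
    return y, b - y, r - y

def from_differences(y, b_diff, r_diff):
    r = r_diff + y
    b = b_diff + y
    g = (y - KR * r - KB * b) / KG
    return (r, g, b)

def clamp(v):
    return min(1.0, max(0.0, v))

def luminance_after_filtering(pixels, use_gamma):
    """Displayed luminance of each pixel after the difference signals are averaged."""
    if use_gamma:
        pixels = [tuple(c ** EXPONENT for c in p) for p in pixels]
    triples = [to_differences(p) for p in pixels]
    mean_b = sum(t[1] for t in triples) / len(triples)
    mean_r = sum(t[2] for t in triples) / len(triples)
    rebuilt = [from_differences(t[0], mean_b, mean_r) for t in triples]
    if use_gamma:
        rebuilt = [tuple(clamp(c) ** (1.0 / EXPONENT) for c in p) for p in rebuilt]
    return [luminance(p) for p in rebuilt]

edge = [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0)]  # green next to magenta, linear light
original = [luminance(p) for p in edge]
gamma_result = luminance_after_filtering(edge, use_gamma=True)
linear_result = luminance_after_filtering(edge, use_gamma=False)

print("original luminance:   ", [round(v, 3) for v in original])
print("gamma-domain filter:  ", [round(v, 3) for v in gamma_result])
print("linear-light filter:  ", [round(v, 3) for v in linear_result])
```

With linear light the filtered pixels lose their color but keep their luminance. With gamma-coded signals both pixels come out darker than they should, because part of the luminance was being carried by the difference signals that were filtered.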
Considering the entire chain through which television signals pass, following capture, there is production then distribution. Increasingly content is rendered instead of coming from a camera. Only after production is there the serious bandwidth/bit rate constraint of distribution media or transmission/broadcast channels. Only after production is the use of compression mandatory.
Floating Point Solutions
The problems of gamma led to the adoption of floating point coding of linear light for high quality image processing. The dynamic range available with modern cameras and floating point coding is so great that it allows the effective camera exposure to be set in post-production.
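As a minimal sketch of why floating point linear light allows exposure to be chosen afterwards: an exposure change is simply a gain of two raised to the number of stops, and values above nominal white survive until that gain is applied. The sample values and the function name below are invented for illustration.

```python
# Sketch: exposure set in post-production on floating point linear light,
# compared with a signal that was clipped to nominal white at capture.
def adjust_exposure(linear_pixels, stops):
    """Apply an exposure change of `stops` (negative values darken)."""
    gain = 2.0 ** stops
    return [p * gain for p in linear_pixels]

# Hypothetical scene values relative to nominal white = 1.0.
scene = [0.02, 0.18, 1.0, 4.0, 16.0]

# The same scene after passing through a channel that clips at white.
clipped = [min(p, 1.0) for p in scene]

pulled_float = adjust_exposure(scene, -4)
pulled_clipped = adjust_exposure(clipped, -4)

print("float, -4 stops:  ", pulled_float)
print("clipped, -4 stops:", pulled_clipped)
```

In the floating point version the three brightest values remain distinct after the exposure is pulled down, whereas in the clipped version they have merged into a single level and the highlight detail cannot be recovered.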
Such coding cannot be employed in systems that rely on legacy digital video interfaces because the longer word length cannot be supported. As more and more installations start to use IT solutions that are format agnostic, it is probable that the use of floating point will increase, leaving gamma to be applied for final delivery.
The Dolby systems were optimized for analog tape and DRE was optimized for digital audio. No one would have dreamed of trying to use an analog Dolby system in conjunction with digital audio so the word length could be reduced. In audio production today we have as much dynamic range as could ever be needed simply by allowing the samples the appropriate word length. In television we failed to embrace the opportunities offered by the digital domain and instead digitized the output of a legacy analog compressor.
Digital Legacy Limitations
In many respects 601 digital video wasn't really digital, because it simply took an existing analog TV format and expressed it as a series of numbers. The analog timing was still there, the gamma and interlace were still there. Error checking, an obvious necessity in digital interfaces, was not in the original standard and had to be added on later.
In audio, digital technology was embraced in a way that allowed for future development. On its introduction, the AES/EBU digital audio interface was capable of working with much longer word lengths than the prevailing 16-bit technology. It had error checking from the outset. In contrast the digital video interfaces were no better than was needed for the TV signals of the time and very little scope for future development was left. As a result, when High Dynamic Range began to emerge, broadcasters found themselves with a limited infrastructure that could not readily handle the necessary longer word lengths.
That limited infrastructure meant High Dynamic Range had to be accommodated by increasing the compression factor of the gamma instead of simply using longer word length. The immediate consequence of increasing the compression factor is to increase the level of artifacts.
No Gamma Justification
If gamma is considered to be a compression system, there might be a case to retain it with linear cameras and linear displays if it saved a lot of data. But in 601 all it can do is to express ten- or eleven-bit linear video as eight-bit video. The higher compression factors of PQ and HLG do a little better, but compared to other compression technologies, gamma is rather unimpressive. Even setting its drawbacks aside and judging it on performance alone, there is not much justification for retaining gamma purely as a compression scheme used with linear sensors and displays.
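The ten or eleven bit figure can be checked with a back-of-envelope sketch. It assumes that the finest quantizing step is needed near black, where the transfer function has its greatest slope, and that full-range eight-bit coding is used. The slopes of 4.5 for 709 and 4 for 240M are those of the linear segments of the two curves; the function name is hypothetical.

```python
import math

# Sketch: linear word length needed to match the step size of gamma-coded
# video near black, where one coded step spans the smallest change in light.
def equivalent_linear_bits(coded_bits, slope_near_black):
    coded_steps = 2 ** coded_bits - 1
    # One coded step near black corresponds to 1 / (steps * slope) in linear light,
    # so a uniform linear code needs this many steps over the full range.
    linear_steps = coded_steps * slope_near_black
    return math.log2(linear_steps + 1)

bits_709 = equivalent_linear_bits(8, 4.5)
bits_240m = equivalent_linear_bits(8, 4.0)

print("709:  about", round(bits_709, 2), "bits, so", math.ceil(bits_709), "in practice")
print("240M: about", round(bits_240m, 2), "bits, so", math.ceil(bits_240m), "in practice")
```

On these assumptions eight-bit gamma-coded video is worth roughly ten bits of linear coding for 240M and slightly over ten for 709, a saving of only two or three bits per sample.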
Although gamma resembles a compression system that offers an improvement in the signal-to-noise ratio of the video signal, it does so by requiring increased bandwidth. The non-linearity causes harmonics to be generated. Gamma and inverse gamma only cancel out if the channel between them can carry the increased bandwidth. If the bandwidth is curtailed, the decoded video signal will be distorted.
On account of the bandwidth increase, perhaps gamma should be considered as a modulation scheme, like FM, that buys improved SNR at the cost of increased bandwidth.
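The effect of curtailing the bandwidth can be sketched as follows. A single sinusoid of linear light is gamma coded, everything above the original frequency is removed from the coded signal, and the result is decoded. The 0.45 power law, the test signal and the brick-wall filter built from a plain DFT are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math

N, K = 64, 4          # samples per block and cycles per block (illustrative)
EXPONENT = 0.45       # pure power-law gamma (assumed)

def dft(x):
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * i / n)
                 for k in range(n)) / n).real for i in range(n)]

def band_limit(x, max_bin):
    """Brick-wall low-pass: discard every frequency above max_bin."""
    spectrum = dft(x)
    n = len(x)
    kept = [c if min(k, n - k) <= max_bin else 0.0 for k, c in enumerate(spectrum)]
    return idft(kept)

linear = [0.5 + 0.4 * math.sin(2 * math.pi * K * i / N) for i in range(N)]
coded = [v ** EXPONENT for v in linear]

# Harmonic content created by the non-linearity (second harmonic shown).
second_harmonic = abs(dft(coded)[2 * K]) / N

# Channel that only passes the original bandwidth, then inverse gamma.
decoded = [max(v, 0.0) ** (1.0 / EXPONENT) for v in band_limit(coded, K)]
decode_error = max(abs(a - b) for a, b in zip(linear, decoded))

# Control: the same filter applied to the linear signal changes nothing.
control_error = max(abs(a - b) for a, b in zip(linear, band_limit(linear, K)))

print("second harmonic in coded signal:", second_harmonic)
print("peak error after band-limited decode:", decode_error)
print("peak error for linear signal, same filter:", control_error)
```

The linear signal passes through the filter essentially unchanged, whereas the gamma-coded signal loses its harmonics and no longer decodes to the original waveform.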
Gamma Harmonic Distortion
As stated above, concatenation of coding schemes is not a good idea. When gamma was thought up, systems like MPEG were not even dreamed of, but today it is the norm to feed gamma-corrected video into a transform-based coder using a discrete cosine transform or some later transform to convert the video signal into the frequency domain. The idea is that typical video is redundant in the frequency domain. Unfortunately the non-linearity of gamma causes harmonics to be generated, so after the transform, gamma-corrected video results in a larger number of significant coefficients, damaging compression performance.
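A small sketch of the coefficient growth, using an eight-point DCT on a test signal that is itself one DCT basis function plus a DC offset. The signal, the 0.45 power law, the normalization and the significance threshold are all illustrative assumptions, not a model of any particular codec.

```python
import math

N = 8             # block size (illustrative)
EXPONENT = 0.45   # pure power-law gamma (assumed)

def dct(samples):
    """DCT-II, scaled so a basis function of amplitude A yields a coefficient of A."""
    n = len(samples)
    return [(2.0 / n) * sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                            for i, s in enumerate(samples))
            for k in range(n)]

def significant(coefficients, threshold=1e-3):
    """Count coefficients a coder could not simply discard."""
    return sum(1 for c in coefficients if abs(c) > threshold)

# Linear light: a DC level plus the first DCT basis function.
linear = [0.5 + 0.4 * math.cos(math.pi * (2 * i + 1) / (2 * N)) for i in range(N)]
coded = [v ** EXPONENT for v in linear]

linear_count = significant(dct(linear))
coded_count = significant(dct(coded))

print("significant coefficients, linear light:", linear_count)
print("significant coefficients, gamma coded: ", coded_count)
```

The linear block needs only its DC term and one AC coefficient, while the gamma-coded version of the same picture content spreads energy into additional coefficients that then have to be coded or quantized away.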
Once there was little variation in the gamma correction used from one format to the next, and standards conversion didn’t attempt any change. But now we have several types of gamma that are quite different and converting from one format to another requires gamma conversion. In other words we are converting from one type of compression codec to another. Concatenation of codecs is well known to cause generation loss and gamma is no exception.
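Generation loss from gamma conversion can be sketched with eight-bit codes. The two transfer functions here are pure power laws with display gammas of 2.2 and 2.4, chosen only because they are simple and different; real format conversions involve more elaborate curves, and the function name is hypothetical.

```python
# Sketch: converting 8-bit codes from one gamma to another and back again.
def convert(code, from_gamma, to_gamma, levels=255):
    """Decode to linear light with one gamma, re-encode with another, re-quantize."""
    linear = (code / levels) ** from_gamma
    return round(levels * linear ** (1.0 / to_gamma))

codes = range(256)

# One conversion: how many distinct output codes remain.
converted = [convert(c, 2.2, 2.4) for c in codes]
distinct_after_one_pass = len(set(converted))

# There and back: how many codes fail to return to their original value.
round_trip = [convert(v, 2.4, 2.2) for v in converted]
changed = sum(1 for c, r in zip(codes, round_trip) if c != r)

print("distinct codes after one conversion:", distinct_after_one_pass, "of 256")
print("codes altered by the round trip:", changed)
```

Wherever the conversion curve has a slope of less than one, neighboring input codes collapse onto the same output code, and that information cannot be restored by converting back.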
Floating-Point Is The Solution
Trying to produce a Standard Dynamic Range output from HDR signals is non-trivial. It may be better to produce in floating point linear and then apply the appropriate gamma(s) for transmission.