Is Gamma Still Needed?: Part 4 - The Weber Fraction, Distortion And Noise

Now that the CRT is history, any justification for retaining gamma must rest on its performance as a perceptual compression codec. That requires its effect on human vision to be considered.

One useful guide is the work of Weber and Fechner, who studied human responses to stimuli. They found that the sensitivity of the human response to a change in a stimulus is not absolute, but under certain conditions is instead proportional to the size of the stimulus. That is another way of saying the percentage accuracy of a human sense, known as the Weber Fraction, is practically constant under the appropriate conditions.

Fig.1.  Note that both scales are logarithmic here. The Weber Fraction stays constant over a wide range of brightness, but rises at low light levels as the eye cannot have infinite accuracy. Diagram from William Schreiber.


William Schreiber found that the Weber Fraction is about two percent of the brightness over a large range down to about 1 candela per square meter, where the sensitivity falls as Fig.1 shows. At 0.1 candela per square meter it is more like four percent. This is unsurprising as maintaining constant percentage accuracy requires the absolute accuracy to become infinite as black is approached.

Fig.2 shows that if, in linear light, we were halfway up the eight-bit binary coding range at a code of 128, the change in brightness due to moving to 129 would not be seen, as it is less than one percent. If, however, we were at 32 above black and we changed to 33, the change would be about three percent and it could be seen. Up near 192 the change would be nearer half a percent and totally undetectable.
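The arithmetic behind those three cases can be checked in a few lines. This is a minimal sketch that takes Schreiber's two percent Weber fraction as the assumed threshold of visibility and compares it against the relative size of a one-code step on a linear scale:

```python
# Relative size of a one-code step at various points on a linear scale,
# compared against an assumed 2% Weber fraction (per Schreiber).
WEBER_FRACTION = 0.02  # assumed threshold of visibility

for code in (32, 128, 192):
    step = 1 / code  # relative brightness change from code to code + 1
    visible = step > WEBER_FRACTION
    print(f"code {code:3d}: step = {step:.1%}, visible = {visible}")
```

Only the step at code 32 (about three percent) clears the threshold; the steps at 128 and 192 fall below it.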

What this means is that if a purely linear coding scheme is made good enough in the dark parts of the picture, it will be over-specified in the bright parts. For transmission, but not necessarily for production, what is needed is a coding scheme whose accuracy reflects the characteristics of human vision. This is not a concrete argument for gamma, as there are other coding schemes that have the desired characteristic and may have fewer side effects. In computers, for example, the need for constant accuracy with wide dynamic range was met with great success by floating point coding.

Fig.2.  A characteristic of linear scales is that percentage accuracy falls with small signals. Here the change from 128 to 129 is a smaller percentage than the change from 32 to 33.


The concrete argument for a particular gamma was the black and white CRT. Now that is in the past the main argument for retaining gamma is compatibility with existing hardware.

If gamma correction is applied immediately before the display, the channel is linear so the effect of noise in the channel between the camera and the display is independent of brightness. However, if gamma correction is applied at the source, the effect is that noise is diminished in dark areas and increased in bright areas by the non-linearity. Fig.3 shows that the slope of the CRT function is smaller near black so noise there is amplified less than the same noise in brighter areas.

Signals near black therefore have a better signal-to-noise ratio than signals near white. That fits more or less with the characteristics of human vision described by Weber, in which the effects of noise are more visible in dark areas of the picture. 
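The slope argument of Fig.3 is easy to demonstrate numerically. In this sketch the CRT is assumed to follow a simple power law with an exponent of 2.4 (the exact figure is display-dependent), and a fixed amount of signal-domain noise is applied near black and near white:

```python
# A fixed amount of signal-domain noise produces a smaller brightness
# change near black, because the slope of the CRT function is smaller there.
GAMMA = 2.4    # assumed CRT exponent; the exact value is display-dependent
NOISE = 0.01   # fixed noise amplitude in the signal (voltage) domain

def brightness(v):
    """Assumed CRT transfer function: light out = voltage ** gamma."""
    return v ** GAMMA

dL_black = brightness(0.1 + NOISE) - brightness(0.1)   # near black
dL_white = brightness(0.9 + NOISE) - brightness(0.9)   # near white
print(f"brightness change near black: {dL_black:.5f}")
print(f"brightness change near white: {dL_white:.5f}")
```

With these assumed values the same noise step produces roughly twenty times more brightness change near white than near black, which is the subjective noise advantage described above.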

Fig.3.  The gamma characteristic of the CRT causes a constant amount of noise to produce less brightness change in dark areas.


It is a complete coincidence that the physics of the CRT had approximately the right properties to act as a decoder for a compression system that gave a small subjective improvement in noise performance to the HVS. Whether it was a happy coincidence or a short-term fix that would give rise to long-term problems is debatable.

I have seen it argued that non-linearity should be used in video signals because of the Weber/Fechner findings. However, their findings also apply to human hearing which has just as many non-linear aspects, for example the logarithmic loudness sensation. Most quantities in audio are expressed in dB, which are fundamentally logarithmic, yet audio is predominantly captured, processed and distributed in linear form. Non-linear audio formats are only acceptable in telephony where quality is less important than in other applications.

If within our visual system there are any non-linearities, they will be the same for the original scene and for the reproduced scene in the example above and so they do not justify a non-linear channel. Instead they justify non-linear application of decisions in a compression system, which can readily be applied to a linear channel. Audio is almost universally linear, yet uses logarithmic metering and volume controls.

It is true that the traditional use of gamma gives an improvement in perceived signal-to-noise ratio, but the improvement is not very great. Fig.4 shows how its extent can be calculated. The figure contains the transfer function of 240M, with the linear part extended. As the slope of the linear part is 4, it follows that if the system remained linear the output for unity input would have to be four times as big.

The best noise performance is where the encoder slope is the steepest, so keeping the system linear and transmitting a signal four times as big would give exactly the same noise performance near black.
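The 240M transfer function of Fig.4 can be sketched directly. The constants below are those commonly quoted for the SMPTE 240M opto-electronic transfer function; note how the linear segment near black has a slope of exactly 4, which is where the 4:1 compander figure comes from:

```python
# SMPTE 240M opto-electronic transfer function (commonly quoted constants):
# a linear segment of slope 4 near black, joined to a 0.45-power law.
def smpte_240m(L):
    """Map scene light L (0..1) to video signal V (0..1)."""
    if L < 0.0228:
        return 4.0 * L                      # linear region, slope 4
    return 1.1115 * L ** 0.45 - 0.1115      # power-law region

# Near black the encoder slope is 4, so an all-linear channel would need
# a signal four times as big to match the noise performance there.
print(smpte_240m(0.01))    # deep in the linear region
print(smpte_240m(1.0))     # unity input maps to unity output
```

The two segments meet at L = 0.0228, so the function is continuous, and unity input gives unity output, confirming that extending the linear slope-4 region to peak white would require a signal four times larger.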

Fig.4.  If the linear region of 240M video were to be extended to peak white, there would be no change of performance near black, but the video signal would have to be four times as big. Gamma can be seen to be a 4:1 compander.


In the analog domain traditional gamma allows the same performance as a signal 12 dB (four times, or two stops) larger. Whilst that would be a nuisance in analog television, in 240M digital television we can easily make the number range four times as big by adding two bits to the luma samples. The option of 10-bit signaling was always there in 601 digital component. In 709 the steepest slope is 4.5, so we would need an extra three bits.

It should be noted that HLG dispensed with the linear portion of the transfer function, so the above technique cannot be used and it has to be assessed in a different way.

If we consider traditional SDR in the light of Weber/Fechner, the use of gamma compresses the luminance range by a factor of four so the 220 codes of eight-bit video are the equivalent of 880 codes in a linear system. If we consider the limited light output of a CRT, Schreiber’s work suggests a Weber fraction of five percent at that order of brightness. The lowest code that allows sufficient accuracy is 20 above black, since 21 and 20 differ by five percent.

A code value of 880 is just over five stops above 20, so the range over which noise is invisible in 240M SDR is just over five stops. 709 is substantially the same. Note that this range is not the same as the dynamic range under some definitions. Five stops isn’t very much, hence the pressure to improve it.
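The five-stop estimate is simple arithmetic, and can be reproduced from the figures given above: 220 usable eight-bit codes, the 4:1 companding action of gamma, and an assumed Weber fraction of five percent at CRT brightness levels:

```python
import math

# Reproducing the five-stop estimate for eight-bit SDR from the text.
codes_8bit = 220       # usable luma codes in eight-bit video
compander = 4          # gamma acts as a 4:1 compander near black
weber = 0.05           # assumed Weber fraction at CRT brightness (Schreiber)

linear_equiv = codes_8bit * compander    # 880 equivalent linear codes
lowest_usable = round(1 / weber)         # code 20: a one-code step is 5%
stops = math.log2(linear_equiv / lowest_usable)
print(f"{stops:.2f} stops")              # just over five stops
```

log2(880 / 20) = log2(44) is about 5.46, i.e. just over five stops of noise-free range.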

Ten-bit video is available in studios, and as the maximum brightness of the CRT must remain the same, the extra number range allows the region near black to be expressed with more precision. Two extra bits would offer two more stops of numerical range, but the Weber fraction would be even larger at such low levels, so ten-bit SDR might offer about eight stops, which is still somewhat below the capabilities of human vision.

If it is intended to rely on gamma to obtain an increase in the useful number of stops, then clearly the non-linearity would have to increase.

One puzzling aspect of gamma is that, whenever some type of gamma or other is proposed, reference is made to the word length necessary to prevent visible banding or contouring. There are two difficulties with that.

One is that contouring is not noise. Noise is by definition de-correlated from the signal, whereas contouring is a direct function of the signal and is classed as a distortion: quantizing distortion. Contouring by definition produces a recognizable pattern across a series of pixels that is obviously more visible, as much as two stops more visible, than random changes to individual pixels, so assessing noise performance on the visibility of contouring gives misleading results.

The human visual system has a noise floor, as do electronic cameras. Contouring is not found in human vision, in film chemistry or in the output of a camera sensor.

The second difficulty is that if contouring is visible in a digital imaging system other than as a deliberate effect, then there is something wrong. It has been known for many decades that quantizers need to be dithered in order to linearize them and give them a noise floor. A properly dithered quantizer never displays contouring.

In cameras, noise from the sensor can dither the convertor, or a suitable noise source can be used. The problem comes when artificial images are computed, because then there are no natural noise sources. Graphics can be computed to arbitrary accuracy with any desired word length, but when the results are output to a broadcast standard of eight or ten-bit word length, if the truncation is achieved simply by omitting the low order bits, the result is contouring.

Shortening of word length has to be done by simulating the dither process of an ADC. Digitally generated noise is added to the source data before truncation so that the quantizing error is de-correlated. Done correctly, no contouring or banding exists and the greatest possible artifact-free dynamic range is obtained. 
