Information: Part 5 - Moving Images

Signal transducers such as cameras, displays, microphones and loudspeakers handle information, ideally converting it from one form to another, but practically losing some. Information theory can be used to analyze such devices.

A transducer is something that converts energy from one form to another. Light is converted to electricity, electricity to sound and so on. Here I intend to restrict the debate to transducers that work with meaningful signals rather than with pure power.

A signal is a form of communication that contains information. As Fig.1 shows, the signal passes down a channel. A signal transducer can be considered to be part of a channel. An ideal channel is able to convey the information unchanged to the destination. Ideal channels exist only in our imagination because all real channels have finite information capacity.

In digital systems, we can use techniques such as error correction to ensure the amount of corruption of our messages is negligible. However, most transducers are analog, and error correction is much harder to apply to them. The best that can be done is to use feedback and/or feedforward, where possible.

In cameras, the information is constrained by a number of factors. Firstly, the scanning standard adopted sets bounds on the information capacity and practical matters then prevent those bounds being reached. For example the existence of aperture effect in sensors prevents the resolution theoretically possible from the number of lines or pixels being reached.

Fig.1 - An information channel can include transducers. Here, a microphone, an electrical circuit and a speaker form the channel.

In legacy analog television, many of the routine tests that were made were actually measuring some aspect of information capacity. Tests of the frequency response of the video channel were essentially checking that the signal could make use of the full bandwidth allowed for the channel. A bandwidth test combined with a measurement of the signal to noise ratio is essentially measuring the Shannon information capacity.
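As a rough illustration of how bandwidth and signal to noise ratio combine, the Shannon-Hartley relationship C = B log2(1 + S/N) can be evaluated in a few lines. This is only a sketch: the function name and the 5.5MHz and 50dB figures are assumptions chosen for the example, not measurements of any particular television system.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley capacity C = B * log2(1 + S/N), in bits per second."""
    snr_linear = 10 ** (snr_db / 10)  # convert the power ratio from dB
    return bandwidth_hz * math.log2(1 + snr_linear)


# Illustrative figures only (assumed for this example):
# a 5.5 MHz video channel with a 50 dB signal to noise ratio.
capacity = shannon_capacity_bps(5.5e6, 50.0)
print(f"{capacity / 1e6:.1f} Mbit/s")
```

Halving the bandwidth halves the capacity, whereas the signal to noise ratio only acts through the logarithm, which is one reason the two measurements have to be taken together.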

A common test for video equipment is to provide a test chart that is placed before the camera and which carries markings that correspond to various spatial frequencies. When scanned by the camera, these markings will correspond to temporal frequencies that can be measured.

Even if bandwidth and signal to noise ratio are adequate, a video signal can still be impaired if the channel is not phase linear. In other words, if different frequencies take different times to arrive, the detail in the picture will be smeared across the screen. Tests for phase linearity allow that deficiency to be detected.

The test cards and associated tests were basically derived from photography and tested the ability of a television system to reproduce still pictures. What was being measured was static resolution. When it is considered that the goal of television is to reproduce moving pictures, it should be clear that measuring static resolution might not tell the whole story. In practice it is misleading.

Unfortunately, mankind does not possess any moving picture technology. The best we currently have is to present the viewer with a series of still pictures and to hope the viewer will interpret some of the differences between them as motion. Perhaps one day we will have proper moving pictures, but until then we are faced with serious problems when anything moves.

In principle, it is not very difficult to create genuine moving pictures. All that is necessary is to reproduce at each display pixel the time history of the light that struck the corresponding pixel in the camera. This was a feature of television inventions a hundred or more years ago. Such systems do not have or need a frame rate and cannot flicker or judder. In today's frame-based television the time history is distorted out of recognition.

If, as a thought exercise, we imagine a conventional TV system that has near infinite static resolution, we can then assess the effects of motion on it.

Fig.2a) shows a line from a static test card having practically zero width reproduced by the camera. However, if the camera or the test card should move during the exposure time, the result is shown in Fig.2b). The near zero width line has now become a rectangle. Motion has converted the camera into a spatial filter and the rectangle is the impulse response. Filtering reduces bandwidth and reduces the amount of information in proportion.

The rectangular impulse response is that of the moving average filter that is used in post-production where blurring is to be applied to a picture. The math is identical to that of the aperture effect in digital sensors, where the ideal zero-sized photo site is replaced by a square pixel of finite size.
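The moving average action can be sketched as a convolution with a rectangular impulse response. In the example below, the function name, the sample values and the smear width of four samples are all assumptions made for illustration. A single bright sample stands in for the near zero width line.

```python
def moving_average(samples, width):
    """Convolve samples with a rectangular impulse response of `width` taps.

    Each tap has weight 1/width, so the total light is conserved.
    The result is the full convolution, len(samples) + width - 1 long.
    """
    out = [0.0] * (len(samples) + width - 1)
    for i, s in enumerate(samples):
        for k in range(width):
            out[i + k] += s / width
    return out


# A near zero width line: one bright sample on a dark background.
line = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

static = moving_average(line, 1)   # no motion: the line is unchanged
smeared = moving_average(line, 4)  # motion of four samples during exposure

print(static)
print(smeared)
```

The smeared result is the rectangle of Fig.2b): the same amount of light, spread over four samples at a quarter of the amplitude.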

That gives us a way of assessing the magnitude of the problem. In an HD camera having 1440 pixels along the active line, the sensor is active for typically half the picture period. That corresponds to 1/100 of a second at a picture rate of 50Hz, or 1/120 of a second at 60Hz.

Fig.2c) shows that if the camera is panned by one pixel in 1/100 or 1/120 of a second, the pixels have become twice as big and the HD camera is now an SD camera having effectively 720 pixels along the active line and having an information bound half as great.

Fig.2 - a) A line feature in a video system of near infinite static resolution. b) The line feature of a) becomes a rectangle in the presence of motion, limiting resolution. c) Panning by one pixel width during the exposure time of the sensor effectively halves the resolution of the sensor.

A panning speed of one pixel per 1/100 or 1/120 of a second corresponds to one picture width in about 14 seconds for 50Hz and 12 seconds for 60Hz. That is an extremely slow panning speed, which would be exceeded most of the time in most real program material. The only sport that could be televised with actual high definition is snail racing.
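The arithmetic behind those figures is easy to check. The sketch below uses hypothetical function names and assumes, as in the text, that the sensor is active for half the picture period. The effective_pixels function generalizes the one pixel case of Fig.2c) to other amounts of smear, which is a simplifying assumption of the example rather than something derived here.

```python
def exposure_time_s(frame_rate_hz, shutter_fraction=0.5):
    """Time for which the sensor is active in each picture period."""
    return shutter_fraction / frame_rate_hz


def pan_time_for_picture_width_s(pixels_per_line, frame_rate_hz,
                                 smear_pixels=1.0, shutter_fraction=0.5):
    """Seconds taken to pan one picture width when the image moves
    `smear_pixels` pixels during each exposure."""
    pixels_per_second = smear_pixels / exposure_time_s(frame_rate_hz,
                                                       shutter_fraction)
    return pixels_per_line / pixels_per_second


def effective_pixels(pixels_per_line, smear_pixels):
    """Simplified model (assumed): a smear of n pixels makes each pixel
    behave as if it were (1 + n) times as wide."""
    return pixels_per_line / (1.0 + smear_pixels)


for rate in (50, 60):
    seconds = pan_time_for_picture_width_s(1440, rate)
    print(f"{rate} Hz: one picture width in {seconds:.1f} s")

print(effective_pixels(1440, 1.0))
```

Panning any faster than this increases smear_pixels and the effective pixel count falls further.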

The dominant loss of resolution in television is due to the width of the rectangular impulse in Fig.2b). So why not reduce it? Why not use short shutter speeds as photographers do? If one imagines a racecar driving by, the videographer pans with it so it is more or less stationary on the camera sensor and the best resolution is available.

If the background is now considered, that will appear in a different place in each picture and will jump from one place to the next in a phenomenon called background strobing. The shutter speed needs to be long enough to smear out the strobing, meaning that the short shutter speeds of photography cannot be used.

The only solution is to increase the frame rate so that the shutter speeds can become shorter without making background strobing worse. In the same way that we don't have any moving picture technology, as long as we adhere to picture rates based on electricity supply frequencies we won't actually have any high definition technology. We can call it by any name we like, but the information simply isn't there.

Information is lost as it falls on the sensor, due to the moving average filtering action of motion smear, and the pixel count becomes irrelevant. In television, the fixation on increasing pixel count, which doesn't work, parallels the audiophile's fixation on frequency response.

The only winners in all of this are the designers of compression codecs. The loss of information due to motion smear makes television signals highly redundant because the amount of information in the video signal is far below the capacity of the scanning standard. Increasing the pixel count when it is not the dominant source of information loss simply increases the degree of redundancy and allows increasingly impressive compression factors to be reported. Effectively we are compressing the emperor's new clothes.

It is also possible to see how gamma works in television. The non-linearity of gamma causes harmonics of the TV signal to be generated. These can still pass through the TV channel because the base band signal forms such a small fraction of the available bandwidth.

The number of pixels in a picture increases as the square of the resolution. If we know the word length of the pixels, the resolution and the frame rate, we can calculate the raw bit rate. The same bit rate can be obtained by a near infinite number of combinations of pixel count and frame rate. For example, if the number of pixels in the active line is reduced to 0.7 of its original value, and the number of lines is reduced in the same proportion, the pixel count per frame is halved, so the frame rate could be doubled with no change in raw bit rate.
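The trade between pixel count and frame rate can be checked numerically. In this sketch the 1080 lines and the 20 bits per pixel are assumed values chosen for illustration, as is the function name. The scale factor is the exact reciprocal of the square root of two, which the figure of 0.7 approximates.

```python
def raw_bit_rate(pixels_per_line, lines, frame_rate_hz, bits_per_pixel):
    """Raw (uncompressed) bit rate of the active picture, in bits per second."""
    return pixels_per_line * lines * frame_rate_hz * bits_per_pixel


# Assumed starting point: 1440 x 1080 pixels, 50 Hz, 20 bits per pixel.
base = raw_bit_rate(1440, 1080, 50, 20)

# Scale both dimensions by 1/sqrt(2), about 0.7, and double the frame rate.
scale = 2 ** -0.5
traded = raw_bit_rate(1440 * scale, 1080 * scale, 100, 20)

print(f"base:   {base / 1e9:.3f} Gbit/s")
print(f"traded: {traded / 1e9:.3f} Gbit/s")
```

The two rates agree, which is the point: the bit rate alone does not dictate how it should be divided between spatial and temporal resolution.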

For any motion speed that is to be portrayed, there is an optimum combination where the smear due to the finite shutter time causes the same loss of information as the finite pixel count. What we are actually doing is optimizing the scanning parameters not for a mythical still picture, but for an actual moving picture. The approach maximizes the dynamic resolution. The calculation is very simple and for all reasonable bit rates and motion speeds, an answer as low as 50 or 60Hz is never obtained. As for 24Hz, well, that only works because the acuity of human vision is reduced when chewing popcorn.
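One possible reading of that optimum can be sketched as follows. This is not necessarily the calculation intended above: it assumes the balance point is where the smear during one exposure equals one pixel width, that the shutter is open for half the picture period, that the picture is 16:9 with square pixels, and that the bits per pixel are fixed. The function name and the example figures of 1.5Gbit/s and one picture width in five seconds are likewise assumptions.

```python
def balanced_scan(bit_rate_bps, speed_widths_per_s, bits_per_pixel=20,
                  aspect=16 / 9, shutter_fraction=0.5):
    """Return (pixels_per_line, frame_rate_hz) at which the smear in one
    exposure is exactly one pixel, for a fixed raw bit rate.

    With N pixels per line and frame rate F:
      smear:    speed * N * shutter_fraction / F = 1
                so F = speed * N * shutter_fraction
      bit rate: N * (N / aspect) * F * bits_per_pixel = bit_rate
    Substituting F gives
      N**3 = bit_rate * aspect / (bits_per_pixel * speed * shutter_fraction)
    """
    n_cubed = bit_rate_bps * aspect / (
        bits_per_pixel * speed_widths_per_s * shutter_fraction)
    pixels_per_line = n_cubed ** (1 / 3)
    frame_rate_hz = speed_widths_per_s * pixels_per_line * shutter_fraction
    return pixels_per_line, frame_rate_hz


# Assumed example: 1.5 Gbit/s raw, motion of one picture width in 5 seconds.
pixels, rate = balanced_scan(1.5e9, 0.2)
print(f"{pixels:.0f} pixels per line at {rate:.0f} Hz")
```

Under these assumptions, even a fairly gentle motion speed pushes the balanced frame rate well above 50 or 60Hz and the balanced pixel count below that of HD.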

The arguments for increased frame rates in television and cinema have been developed over the years and are now irrefutable. Unfortunately, most efforts to get higher rates adopted meet with exceptional resistance.
