Having looked at the traditional approach to moving pictures and found that the portrayal of motion was irremediably poor, thoughts turn to how moving pictures might be portrayed properly.
Presently we have no moving picture technology. Instead, we present the viewer with a series of still pictures that give an illusion of motion along with a host of artefacts.
In earlier parts of this series we saw that the artefacts could be diminished by panning or by the tracking shot, but this only reproduced the dominant object of interest reasonably well; any background or other object moving differently was reproduced with incorrect motion, an artefact known as background strobing.
As William Schreiber pointed out decades ago, raising the frame rate would not eliminate background strobing, because it follows from the misuse of sampling. Sampling per se is not the problem, as has been demonstrated many times over in systems such as digital audio and fly-by-wire. The problem with moving pictures is that motion causes the relevant axes to become non-orthogonal, so that they interact with one another.
With motion, detail in the scene does not move along the time axis; it moves along an optic flow axis, which is not orthogonal to the image plane. This means that when sampling creates steps in time, those steps have a component in the image plane, seen as judder or strobing.
As image sampling on the time axis is fundamentally flawed, a way forward would be to find something not having that flaw, something that abandons both temporal image sampling and the dependence on frames. It is not necessary to look far. At the same time it is important to understand the reasoning that led to the evolution of what we have now, to find that the decisions made in the past based on limitations of technology are no longer valid now technology has advanced so dramatically.
An early patent, filed during the reign of Queen Victoria, claimed a method of remotely displaying a moving image. An optical system focused an image on a plane that was an array of light sensors. Today we would call them pixels, but the term was not then in use. Each light sensor was connected by a wire to a corresponding point light emitter in the display.
The system worked in real time and each pixel on the display reproduced the time history of the light falling on the corresponding sensor pixel. Although the discrete sensors or pixels used spatial sampling across the image, there was no sampling on the time axis and so there would have been no image smear or strobing.
The obvious difficulty with such a system is that between the sensor and the display one piece of wire was required for every pixel. Whilst that might have been possible, if difficult, over a short distance, any thoughts of broadcasting such a system were ruled out. The technology of the day didn’t allow it.
Instead, two different yet similar approaches were adopted. In the cinema, the audience was presented with a series of photographs that changed mechanically at 24 Hz. In broadcasting, the technology of the day allowed only a single waveform to be transmitted. Initially this was used for sound radio, but the principle was retained in television.
The problem with television was how to send a two-dimensional image in a single waveform. The obvious solution was to use scanning, in which the picture is broken into a series of rows or columns. Along each row or column the brightness of the image would modulate the waveform to produce a video signal.
After all of the rows or columns had been scanned, it would be time to start over, so it follows immediately that scanning results in a frame structure. Every place on the image is visited, or sampled, once per frame. Early systems, such as that of Baird, used mechanical scanning, with a single light sensor scanned mechanically in columns over the image.
This was later improved upon by using an electron beam to do the scanning. Television today continues to use the same principle, with its implied frame rate. Analog television was sampled in time and vertically, but the lines themselves were continuous; digital television samples along the lines as well, so the picture can be represented by an array of pixels expressed as binary numbers.
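The scanning principle described above can be illustrated with a short sketch. This is purely illustrative: the function name and the toy image are invented for this example, not taken from any broadcast standard. Serializing a two-dimensional array row by row yields a single waveform, and visiting every row once before starting over is exactly what creates the frame structure.

```python
def raster_scan(image):
    """Serialize a 2D brightness array row by row into one 1D signal."""
    signal = []
    for row in image:          # visit each row in turn
        signal.extend(row)     # brightness values modulate the waveform
    return signal              # one frame's worth of video signal

# Toy 4 x 3 image: a bright cross on a dark background.
frame = [
    [0, 0, 9, 0],
    [0, 9, 9, 9],
    [0, 0, 9, 0],
]
print(raster_scan(frame))  # [0, 0, 9, 0, 0, 9, 9, 9, 0, 0, 9, 0]
```

Repeating `raster_scan` at a fixed rate is all a frame-based system does; every sample location is revisited once per frame, which is the constraint the rest of this article sets out to escape.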
The digital video signal was eventually serialized, one bit at a time, one pixel at a time to create the long-lived serial digital interface (SDI). SDI was a perfect example of the use of multiplexing to transmit a large number of signals using a single waveform. Revisiting the Victorian invention, it is immediately possible to see how the problem of connecting the sensor to the display can be solved thanks to developments in multiplexing electronics and multiple carrier broadcasting techniques such as OFDM.
In traditional television, the multiplexing is rigid and results in each pixel being sampled at the frame rate of 25 or 30 Hz. It is necessary to move on from this approach if correct portrayal of motion is to be achieved. The solution is not difficult to grasp: each display pixel must reconstruct the way in which the light falling on the sensor pixel changes with time.
Essentially the sensor outputs one continuous analog waveform per pixel. If those waveforms drive corresponding pixels in the display, then for the first time a true moving picture will be seen in a video system.
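The contrast between that scheme and frame-based sampling can be sketched as a toy model. Everything here is assumed for illustration: the sinusoidal waveform, the function names and the 25 Hz figure are not from the article. A frame-based display freezes each pixel at its last sampled value for a whole frame period, whereas a true-motion display pixel simply tracks the sensor pixel's waveform.

```python
import math

def sensor_pixel(t):
    """Light on one sensor pixel over time (arbitrary illustrative waveform)."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * 3.0 * t)

def frame_based_display(t, frame_rate=25.0):
    """Sample-and-hold: the value is frozen at the start of each frame."""
    frame_start = math.floor(t * frame_rate) / frame_rate
    return sensor_pixel(frame_start)

def true_motion_display(t):
    """The display pixel reproduces the sensor waveform continuously."""
    return sensor_pixel(t)

# Partway through the first 25 Hz frame, the frame-based pixel is still
# showing the value captured at t = 0, while the true-motion pixel is
# up to date.
print(frame_based_display(0.033))   # 0.5 (the value held from t = 0)
print(true_motion_display(0.033))
```

The held value is what produces smear and strobing when the image moves; the continuous version has no frame period during which the picture can go stale.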
Such a system offers a number of advantages over traditional film and television. There can be no flicker or background strobing. These artefacts are absent because the system does not have a frame rate. From that it follows that no standards conversion will be required. There can be no more temporal aliasing with stagecoach wheels going backwards. There is no more motion blur when the image moves across the sensor during the frame period. This means that a still image taken from the system will have near photographic quality, whereas still frames taken from traditional television are always poor.
It is now well understood that in cinema and television static resolution is meaningless. The parameter that matters is dynamic resolution: the resolution with which a moving object is portrayed. In a traditional TV system the dynamic resolution function falls with motion. Doubling the static resolution without changing the frame rate simply causes the dynamic resolution to fall more steeply. One significant result of the true motion approach is that the dynamic resolution does not fall with motion until a motion speed chosen by design is reached. It is not currently known what that speed should be.
It follows immediately that true-motion systems do not need absurd pixel counts because they do not suffer motion smear. 720 lines in the broadcast channel would be enough for many purposes, up-sampled as required at the display. Dispensing with absurd pixel counts is doubly beneficial in a parallel system, because it drastically reduces the number of channels needed and simultaneously reduces the noise level in each sensor pixel because it can be bigger.
Implementation of the system depends on two processes. The first is a parallel process that digitizes the time function signal from each pixel. The second multiplexes the results into a serial bit stream compatible with a network, cable or radio channel.
As the frequency of the signal from a pixel is the product of the spatial frequency in the picture and the motion speed, the maximum sampling rate required at each pixel would probably be in the kilohertz region. On the other hand, the minimum sampling rate could fall to zero in a non-moving picture area, or where a moving area contains no detail.
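A back-of-envelope check of the kilohertz claim, using illustrative numbers of my own rather than figures from the article: the temporal frequency seen by one pixel is the product of the spatial frequency of the detail and the speed at which it moves past.

```python
def pixel_frequency_hz(cycles_per_picture_width, picture_widths_per_second):
    """Temporal frequency at one pixel as moving detail passes it."""
    return cycles_per_picture_width * picture_widths_per_second

# Fine detail (360 cycles across the picture width) moving at
# 5 picture widths per second:
print(pixel_frequency_hz(360, 5.0))  # 1800.0 Hz -- the kilohertz region

# A static area, or a moving area with no detail, produces nothing:
print(pixel_frequency_hz(360, 0.0))  # 0.0 Hz
print(pixel_frequency_hz(0, 5.0))    # 0.0 Hz
```

Either factor going to zero takes the required sampling rate to zero, which is why the average data rate of such a system depends heavily on picture content.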
The sampling rate is relatively slow by electronic standards, and the massive parallelism means that the output of many pixels could be handled by one ADC fed by an analog multiplexer. Provided the time between a pixel being sampled and the result appearing on a screen is constant, there is no requirement for the sampling of pixels to be synchronous or even at the same rate.
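A minimal sketch of that arrangement, under the assumptions stated in the text; the names, the round-robin selection and the delay figure are all invented for illustration. One ADC is stepped across the pixels by an analog multiplexer, each sample carries a pixel index and a timestamp, and the only constraint is that every sample reaches the screen after the same fixed delay.

```python
PIPELINE_DELAY = 0.005  # seconds; assumed constant for every sample

def multiplex(pixel_waveforms, sample_times):
    """One ADC round-robins across pixels; emit (pixel, time, value) samples."""
    stream = []
    for t in sample_times:                          # ADC conversion instants
        pixel = len(stream) % len(pixel_waveforms)  # mux selects the next pixel
        stream.append((pixel, t, pixel_waveforms[pixel](t)))
    return stream

def display(stream):
    """Light each pixel at a fixed delay after its sample was captured."""
    return [(pixel, t + PIPELINE_DELAY, value) for pixel, t, value in stream]

# Three toy pixel waveforms, sampled at four asynchronous instants.
waveforms = [lambda t: 0.0, lambda t: t, lambda t: 1.0 - t]
samples = multiplex(waveforms, [0.000, 0.001, 0.002, 0.003])
print(display(samples))
```

Because each sample is self-describing, the scheme works equally well if some pixels are sampled often and others hardly at all; nothing in it resembles a frame.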
It is not anticipated that these proposals will be taken up; there is too much inertia in traditional frame-based systems for any radical change to be considered. On the other hand, applications where traditional television is not good enough, such as gaming, self-driving cars and some military and research functions, would benefit from true motion portrayal.
One advantage of true motion television is that it is relatively easy to display in slow motion. All that is necessary is to slow the display clock and the incoming data stream by the same factor and everything slows down. This might find applications in research or in sport.
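The slow-motion idea reduces to scaling every timestamp in the sample stream (and the display clock) by the same factor. A minimal sketch, with a hypothetical `(pixel, timestamp, value)` stream format assumed for illustration:

```python
def slow_motion(stream, factor):
    """Stretch a (pixel, timestamp, value) stream by `factor` (e.g. 4x slower)."""
    return [(pixel, t * factor, value) for pixel, t, value in stream]

stream = [(0, 0.000, 0.2), (1, 0.001, 0.9), (0, 0.002, 0.3)]
# Timestamps stretch to roughly 0.000, 0.004, 0.008: the same events,
# replayed four times more slowly, with no interpolation required.
print(slow_motion(stream, 4.0))
```

No frames exist to be repeated or interpolated, so the slowed replay is as smooth as the original.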
As the true motion signal contains essentially the complete history of the image, a still picture could be produced anywhere on the time axis. This means that converting the true motion signal back to a traditional frame based television format is relatively easy, although the advantages of the system would then be lost. There is an argument for using true motion as a production format from which traditional television signals at more than one frame rate could be derived.