Interlace: Part 3 - Deinterlacing
Now that interlace is obsolete, we are left only with the problem of dealing with archive material that exists in the interlaced format. The overwhelming majority of video tapes, whether component or composite, analog or digital, are interlaced.
Viewing interlaced material on a progressive scan format is essentially a form of standards conversion in which the frame rate is doubled and the picture is resized. For example, a historic NTSC video recording, with a frame rate of 30Hz, a field rate of 60Hz (well, almost) and 525 lines per frame, needs to be converted so that the frame rate becomes 60Hz. The resizing will depend on the target progressive format required.
It would be as well not to have great expectations of deinterlacing. There are two problems which compound one another. Firstly, interlace is a non-recoverable process in which half of the information is discarded at the outset. That information has gone and it's not coming back. Secondly, the output of the deinterlacer will be viewed in a more recent format and on a modern display, both of which are capable of higher definition and on which it will be more obvious that resolution has been lost.
A simplistic approach to doubling the frame rate is to convert every incoming field individually into a frame. An input field having odd lines is interpolated to create the even lines, and an input field having even lines is interpolated to create the odd lines. Whilst this would work, the vertical resolution is extremely poor, as it is determined by the number of lines in an input field.
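To make the idea concrete, here is a minimal sketch of that intra-field approach in Python with NumPy. It is an illustration only: the function name, the greyscale arrays and the edge handling are assumptions, not anything drawn from a real product.

```python
import numpy as np

def bob_deinterlace(field: np.ndarray, top_field: bool) -> np.ndarray:
    """Expand one field (H/2 x W) into a frame (H x W) by interpolating
    the missing lines from their vertical neighbours. The vertical
    resolution can be no better than what a single field carries."""
    f = field.astype(np.float32)
    h2, w = f.shape
    frame = np.empty((h2 * 2, w), dtype=np.float32)
    if top_field:
        frame[0::2] = f                       # real lines: 0, 2, 4, ...
        below = np.vstack([f[1:], f[-1:]])    # neighbour below, last line replicated
        frame[1::2] = 0.5 * (f + below)       # synthesised odd lines
    else:
        frame[1::2] = f                       # real lines: 1, 3, 5, ...
        above = np.vstack([f[:1], f[:-1]])    # neighbour above, first line replicated
        frame[0::2] = 0.5 * (above + f)       # synthesised even lines
    return frame
```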
Another simple approach would be to combine an odd field and an even field to make a frame. Fig.1a) shows that on a still picture the result would be an improvement in vertical resolution, as significantly more information has now been provided for the output frame. Without motion, successive TV frames are identical, so the lines discarded in one field are the same as the lines retained in the next field and they can be recombined. Deinterlacing in that way would not quite double the vertical resolution, as most practical interlaced equipment contained vertical filters to reduce flicker.
However, there is one huge flaw in that arrangement: it breaks down in the presence of motion. The fields were captured at different times. As Fig.1b) shows, the slightest motion between fields means they cannot be recombined. In the case of horizontal motion the result is feathering, where a vertical detail is represented in a different place in successive fields.
Fig.1a) In the absence of motion, two successive fields can be superimposed to produce a frame. Fig.1b) As successive fields are captured at different times, they cannot be superimposed when there is motion. Here a moving vertical edge suffers feathering.
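The combining approach of Fig.1a), often called weave, is equally simple to sketch. Again this is illustrative Python/NumPy under the same greyscale assumption:

```python
import numpy as np

def weave_deinterlace(top_field: np.ndarray, bottom_field: np.ndarray) -> np.ndarray:
    """Interleave a top field and a bottom field into one frame.
    Ideal on a still picture; any motion between the two capture
    instants produces the feathering of Fig.1b)."""
    h2, w = top_field.shape
    frame = np.empty((h2 * 2, w), dtype=top_field.dtype)
    frame[0::2] = top_field       # even frame lines from the top field
    frame[1::2] = bottom_field    # odd frame lines from the bottom field
    return frame
```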
The viewer seldom sees feathering when watching interlaced material because of eye tracking. Eye tracking attempts to follow a moving object across the screen. When correctly tracking, the eye will move the display image across the retina from field to field by the same amount as the image moved across the sensor, so the object will appear in the same place on the retina, cancelling out the feathering.
Compression theory tells us there is a good deal of redundancy in video signals, especially in the time domain. Redundancy is basically a repeat of the same information that reveals nothing new. A compressor is trying to identify that redundancy because it need not be sent if the decoder already has the information.
In deinterlacing we have the opposite problem, which is that half of the information in a frame was thrown away and we are hoping to find it, or some of it, in nearby fields. Temporal redundancy suggests that information is indeed spread over a number of pictures, so there is a basis for that hope.
In a bidirectional MPEG decoder, a given output picture is recreated by shifting pixel data from earlier and later pictures. Something similar can be done by a deinterlacer.
The requirements for doing that are significantly stricter than in a compression codec. The outline of a moving object must be correctly identified so that only the motion of that object is cancelled, and not that of any other object or of the background.
The motion vectors must be extremely accurate because if they are not, image information from different fields will not superimpose correctly and resolution will be lost. That is a fundamental problem because one of the drawbacks of interlaced formats is that motion portrayal is not very good, yet accurate motion information is needed for deinterlacing. Interlace is self-defeating in that way.
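A full motion-compensated deinterlacer is beyond a short sketch, but a simpler motion-adaptive cousin conveys the idea: weave where the picture is static, interpolate where it moves. Everything here, including the threshold, the per-pixel difference test and the reuse of bob_deinterlace from the earlier sketch, is an illustrative assumption:

```python
import numpy as np

def motion_adaptive_deinterlace(prev_f, cur_f, next_f, top_field, thresh=12.0):
    """Fill in cur_f's missing lines. prev_f and next_f have the opposite
    parity to cur_f and therefore carry exactly the lines cur_f lacks.
    Where those two fields agree, the picture is assumed static and the
    previous field's lines are woven in; elsewhere intra-field (bob)
    interpolation is used. A true motion-compensated design would instead
    shift pixel data along measured motion vectors."""
    frame = bob_deinterlace(cur_f, top_field)    # safe fallback everywhere
    prev = prev_f.astype(np.float32)
    nxt = next_f.astype(np.float32)
    static = np.abs(nxt - prev) < thresh         # per-pixel motion test
    missing = slice(1, None, 2) if top_field else slice(0, None, 2)
    frame[missing] = np.where(static, prev, frame[missing])
    return frame
```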
Sometimes the fundamental problems of interlace are absent and we get a break. The output of a telecine machine, for example, may adhere to some interlaced video format, but it is not really interlaced. When a telecine machine plays a film in the old world, it has to produce a 50Hz video signal. The frame rate of the film would typically be 24Hz, and this was converted to 25Hz by transporting the film slightly too fast. Sometimes film for television purposes was shot at 25Hz, so the problem was avoided.
With only 25 frames per second on the film itself, it follows that pairs of fields at 50 Hz must have come from the same film frame and so there could not be any relative motion between the fields. That pair of fields could be recombined to make a frame that might as well have been progressively scanned in the first place. In fact some telecine machines did that. They scanned the whole frame progressively into memory and then interlaced the output by selectively reading the memory. That was a reversible process.
In the new world, the telecine machine had to output a 60Hz format from 24Hz film, and this was achieved using a process called 3:2 pulldown, in which a first film frame was converted to two fields and the next frame to three fields. Making two frames into five fields increased the rate by a factor of two and a half, turning 24Hz into 60Hz. This did not do much for motion portrayal, but at the time it was the best that could be done and it became accepted practice.
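The cadence itself is easy to express in code. This sketch (the names and the frame/parity tuples are illustrative) generates the classic field pattern:

```python
def three_two_pulldown(film_frames):
    """Map 24Hz film frames to a 60Hz field sequence. Frames alternately
    contribute two and three fields while the output parity alternates,
    so every second frame repeats one of its own fields:
    A-t A-b  B-t B-b B-t  C-b C-t  D-b D-t D-b ..."""
    fields = []
    parity = 'top'
    for i, frame in enumerate(film_frames):
        for _ in range(2 if i % 2 == 0 else 3):
            fields.append((frame, parity))
            parity = 'bottom' if parity == 'top' else 'top'
    return fields  # 2.5 fields per frame: 24Hz becomes 60Hz
```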
However, when an attempt is made to put 3:2 pulldown video through a standards converter to make 50Hz video, the result is generally pretty awful. To this day the problem is known as the Dallas effect, after the popular US soap opera in which it first came to light.
Originally Dallas was shot and edited on film, and the film then went through two parallel telecine processes to produce 60Hz NTSC for the US and 50Hz PAL for the UK. All was well. Then it was decided to edit in the video domain. The film was scanned to 60Hz with 3:2 pulldown, and the video output after editing could be broadcast in the US directly and looked fine. The problem came when the 3:2 NTSC was standards converted to PAL. The results were atrocious and howls of protest came from viewers.
The situation was saved by a special box built by Snell and Wilcox. This was a pipeline of frame stores and differencing engines that could accept a 3:2 pulldown video signal and identify the film frames that had been converted to three fields, so that the redundant third field could be discarded. This was non-trivial because editing could break the 3:2 sequence.
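The detection side can be sketched in the same spirit. This is an illustration of the principle only, not the Snell and Wilcox design; the threshold and the mean-absolute-difference measure are assumptions:

```python
import numpy as np

def find_repeated_fields(fields, thresh=2.0):
    """Flag fields that are near-duplicates of the field two positions
    earlier, i.e. the previous field of the same parity. In an unbroken
    3:2 cadence this fires once every five fields, marking the redundant
    copy for discard. Because edits can break the cadence, a practical
    detector must continually re-lock rather than trust a fixed phase."""
    flags = [False, False]
    for i in range(2, len(fields)):
        diff = np.mean(np.abs(fields[i].astype(np.float32) -
                              fields[i - 2].astype(np.float32)))
        flags.append(diff < thresh)
    return flags
```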
With the third fields gone, the surviving pairs of fields from the same original film frame were deinterlaced to create an internal progressive format having 525 lines at 24Hz. This was then resized and reinterlaced to produce 625 lines at 48Hz. A modified VTR could record the 48Hz PAL and make a standard tape. When played at 50Hz, the tape was almost as good as if the original film had passed through a PAL telecine.
As a final comment on the whole interlace saga, a modern high performance deinterlacer can be considered from the standpoint of compression theory. The interlacer is the encoder and the deinterlacer is the decoder. As has been seen, the interlacing process is trivially easy and only requires lines to be discarded according to a simple pattern. On the other hand the deinterlacer has been seen here to be an incredibly complicated piece of work. That puts interlace in the category of an asymmetrical codec, where the encoder and decoder have different complexities.
Much of the success of MPEG in broadcasting has been due to the fact that it is an asymmetrical codec. A small number of complex expensive encoders communicate with a large number of simple, inexpensive decoders. In comparison interlace requires a simple inexpensive encoder communicating with a large number of complex expensive decoders. That makes interlace precisely the opposite of what is required.
Now that we can place interlace alongside lead water pipes and asbestos as a building material, if there is anything to be learned it is that it took far too long to get rid of it. Without doubt there are other suboptimal institutions and processes today that continue because they are not questioned. Perhaps they should be.