Audio Global Viewpoint – May 2021

The Importance Of Being Timed – Or Not

Accurate timing continues to be central to video, audio and metadata delivery. But as we progress to IP, do we need to be so obsessed with nanosecond tolerances?

We still discuss television images in terms of interlace, lines and frames, even though these terms describe antiquated CRT and camera tube technology that hasn’t been seen for a good thirty years. Video reference signals are often referred to as “black and burst”, but we haven’t used the burst since SDI took over from PAL and NTSC. And can anybody remember the need for fractional frame rates? Something to do with the color subcarrier beating against the FM audio carrier, back when NTSC color arrived in 1953.

Maintaining backwards compatibility has always been the mantra for broadcasters. The argument goes something like this: there are millions of televisions out there that must still be able to display the existing standard, even though we want to transmit the new one. Any engineer around during the 4:3 to 16:9 migration probably still has sleepless nights thinking about active aspect ratio switching, and letterbox and pillarbox images, all resulting from an attempt to display the new 16:9 images on 4:3 televisions.

When we were trying to synchronize the electron beam in a CRT to the scanning electron beam of the camera in the studio, there was clearly a need to keep tight timing tolerances. Large synchronizing pulses were needed to reduce back EMFs and avoid blowing up the television’s coil driver circuits, and interlace came about due to the limitations of the electronics and RF bandwidth of the time.

Fast forward fifty years and we’re still using fractional frame rates and still taking line, frame, and field syncs into consideration. Flat-screen televisions and CCD and CMOS image sensors do not need scanning coils, and never have. They are equally happy to work at 30/1.001 fps, 30 fps, 25 fps, and a whole spectrum of other frame rates.

I would suggest that the need to provide backwards compatibility is now unjustified and something we should really move away from. This isn’t just holding us back; it’s also damaging the way we think about broadcast infrastructure.

IP systems are asynchronous by design. Yes, we use PTP to provide timing references, but we don’t need to provide the historic sync references. In terms of timing, all we really need to know is the approximate frame rate and where pixel one of line one is within the data stream. A decent software flywheel oscillator would even be able to reconstruct the frame rate.
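
To make the flywheel idea concrete, here is a minimal sketch of one way it could be done in software: a simple proportional-integral tracking loop that locks onto jittery frame arrival times. The class name, the gain values and the use of arrival timestamps in seconds are all assumptions made for illustration, not a description of any particular product or standard.

```python
class FrameFlywheel:
    """Illustrative software flywheel (hypothetical, not from any standard).

    Tracks the frame period and phase from frame arrival timestamps,
    smoothing out network jitter with a proportional-integral loop.
    """

    def __init__(self, nominal_fps, period_gain=0.05, phase_gain=0.2):
        # Start from the approximate frame rate; the loop refines it.
        self.period = 1.0 / nominal_fps
        self.period_gain = period_gain  # assumed value: how fast the rate estimate adapts
        self.phase_gain = phase_gain    # assumed value: how fast the phase is pulled in
        self.next_tick = None           # predicted arrival time of the next frame

    def update(self, arrival_time):
        """Feed one frame arrival time (seconds); return the smoothed frame tick."""
        if self.next_tick is None:
            # First frame: nothing to compare against, so lock phase to it.
            self.next_tick = arrival_time + self.period
            return arrival_time
        # Timing error between the actual arrival and the prediction.
        error = arrival_time - self.next_tick
        # Nudge the phase a little towards the observed arrival.
        tick = self.next_tick + self.phase_gain * error
        # Nudge the period estimate so persistent errors are absorbed.
        self.period += self.period_gain * error
        self.next_tick = tick + self.period
        return tick

    @property
    def fps(self):
        """Current estimate of the frame rate."""
        return 1.0 / self.period
```

Seeded with a nominal 30 fps and fed arrivals spaced at 1001/30000 seconds, the estimate settles towards 30/1.001 fps without any external sync reference. Smaller gains give a heavier flywheel that rejects more jitter but takes longer to lock.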

The challenge I see is that the habit of thinking in terms of line and field syncs is constraining us. We’re still limiting ourselves to nanosecond accuracy when we only need to think in terms of frame rates, and even that is arguable with light-field cameras. As long as a complete frame is received within plus or minus 10 ms, the display is guaranteed to show it at the correct frame rate, although a frame timing source would still be beneficial for keeping latency low.
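
As a rough illustration of how loose a frame-level tolerance is compared with nanosecond timing, the sketch below flags frames that arrive outside a window around their nominal slot. The function name is hypothetical, the 10 ms default is simply the figure quoted above, and treating the first arrival as the timing origin is an assumption.

```python
def frames_outside_tolerance(arrival_ms, fps, tolerance_ms=10.0):
    """Return the indices of frames arriving outside +/- tolerance_ms.

    arrival_ms: arrival time of each complete frame, in milliseconds.
    fps: nominal frame rate, e.g. 25 or 30 / 1.001.
    The first arrival is assumed to define the timing origin.
    """
    if not arrival_ms:
        return []
    period_ms = 1000.0 / fps      # 40 ms at 25 fps
    origin = arrival_ms[0]
    offenders = []
    for index, arrival in enumerate(arrival_ms):
        nominal = origin + index * period_ms
        if abs(arrival - nominal) > tolerance_ms:
            offenders.append(index)
    return offenders
```

At 25 fps the frame period is 40 ms, so a window of plus or minus 10 ms is half a frame period wide, which is millions of times looser than nanosecond alignment.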

IP is opening many opportunities for us, but we must look beyond the notion of nanosecond synchronous timing to take full advantage of them.