The conversion of monochrome TV to color was quite a trick, but it came at a cost.
Because monochrome TV scanned the image, the picture was effectively sampled into lines, and the resulting video signal had a spectrum rich in multiples of the line rate. The designers of the original black and white TV standards wisely placed the sound carrier so that it would fall between multiples of the line rate. This reduced any interference between the vision and the sound.
Then along came color, which needed a subcarrier. In order to make the subcarrier invisible, it was arranged to invert in phase from one line to the next, so that it took two line periods to have a whole number of cycles. This meant that the sampling spectrum of the chroma was effectively divided in half so that the spectrum was at multiples of half line rate. Whilst this approach made the subcarrier invisible, it also served to align harmonics of the chroma with the sound carrier.
On an unmodified monochrome TV set, which was expected to accept an NTSC color signal and show a black and white picture, and on color sets alike, the result was interference between the new chroma signal and the sound channel. This resulted in some head scratching.
There wasn't much room for maneuver. The sound carrier frequency couldn't be changed, as that would have required every existing TV set to be re-aligned. The subcarrier had to have a complete number of cycles in two lines to make it invisible, but to shift the subcarrier frequency required the line frequency to change, and to stay within the TV format that meant the frame rate had to change to keep the same number of lines in a frame.
So that was the only way out, and NTSC took it. The sound carrier frequency stayed the same, but everything else changed. The frame rate, the field rate, the line rate and the subcarrier frequency were all shifted down together by 0.1 percent so the field rate became 59.94 Hz instead of 60 Hz.
Technically it worked beautifully as the sound carrier was now in an emptier part of the spectrum and the TVs of the day could lock to syncs that were only 0.1 percent off without any trouble.
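The arithmetic of that shift can be sketched in a few lines of Python. The 4.5 MHz spacing of the sound carrier from the vision carrier, the factor of 286 and the 455/2 subcarrier cycles per line are standard NTSC figures rather than anything stated above, so treat them as assumptions of this sketch.

```python
from fractions import Fraction

# Assumed figures (not given in the text): sound carrier 4.5 MHz above the
# vision carrier, line rate chosen as 1/286 of that spacing, and 455/2
# subcarrier cycles per line.
SOUND_OFFSET_HZ = Fraction(4_500_000)
LINES_PER_FRAME = 525

# Monochrome: 60 fields/s, two fields per frame, 525 lines per frame.
mono_line_rate = Fraction(60, 2) * LINES_PER_FRAME

# Color: the sound offset stays put and everything else is derived from it.
color_line_rate = SOUND_OFFSET_HZ / 286
color_field_rate = color_line_rate * 2 / LINES_PER_FRAME
subcarrier = color_line_rate * Fraction(455, 2)

print("line rate ratio :", color_line_rate / mono_line_rate)
print("field rate (Hz) :", float(color_field_rate))
print("subcarrier (Hz) :", float(subcarrier))
```

Using exact fractions shows that the shift is a ratio of 1000/1001, which is slightly less than 0.1 percent.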
By this time Henry Warren had managed to get the United States humming along at exactly 60 Hz, so that any AC outlet could power an accurate clock, and dividing the original TV field rate by 60 would do the same. But when the field rate was changed to 59.94 Hz on the introduction of NTSC, that didn't work anymore.
NTSC color sync pulse generators were no longer phase-locked to the local power frequency, and dividing the field rate by sixty did not result in seconds. This was OK for live TV, but once video recording and film were thrown in there was a problem, because nothing would play for the expected time owing to the shift in the frame rate. The error was about 3.6 seconds per hour, or roughly 86 seconds a day.
An early way of dealing with this was the adoption of color time. This was based on using electric clocks intended for 60 Hz AC power, but running them instead from the NTSC field rate. This meant that they could count TV frames and produce a running time reference, but with a 0.1 percent error. If the length of a tape or film was known, the error could be calculated ahead of time and the length converted to color time. In that way the color time clock could predict when a clip was due to finish, so that continuity became easier.
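As a rough illustration of the conversion, the sketch below models a color time clock as one that treats every 60 NTSC fields as a second. The function names are invented for this example, and the exact 60000/1001 field rate is the standard NTSC figure, assumed here rather than taken from the text.

```python
from fractions import Fraction

# Assumed exact NTSC field rate (about 59.94 Hz).
FIELD_RATE = Fraction(60000, 1001)

def color_time_seconds(real_seconds):
    """Reading of a clock that counts 60 NTSC fields as one second."""
    return real_seconds * FIELD_RATE / 60

def drift_seconds(real_seconds):
    """How far the color time clock lags behind real time."""
    return real_seconds - color_time_seconds(real_seconds)

print("lag after one hour:", float(drift_seconds(3600)), "s")
print("lag after one day :", float(drift_seconds(86400)), "s")
```

A clip timed at one hour by a real-time clock would therefore be expected to finish a little before the color time clock reads one hour.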
The NTSC color signal was intended as a transmission format from a transmitter to a receiver, but before long people started recording it on tape and another learning curve began. Replay of color video signals from tape is only possible for broadcast purposes if a time base corrector (TBC) is used. Otherwise the timing errors in the chroma signal will be too great to decode it properly. Time base correctors naturally evolved to get the chroma phase correct in the replayed signal.
As has been mentioned more than once, the whole success of NTSC revolves around the subcarrier phase inverting on alternate lines. There are 525 lines in a frame, an odd number. If a frame begins with regular subcarrier phase, the next frame will begin with inverted phase. This means that two complete frames have to go by before H, V and subcarrier have the same relationship again. These were called Type A and Type B frames.
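The two-frame sequence falls straight out of the arithmetic. The sketch below assumes the standard NTSC figure of 455/2 subcarrier cycles per line, which is not stated above, and the function name is made up for the illustration.

```python
from fractions import Fraction

# Assumption: 455/2 subcarrier cycles per line (half a cycle left over on
# every line, which is what inverts the phase from one line to the next).
CYCLES_PER_LINE = Fraction(455, 2)
LINES_PER_FRAME = 525

def phase_at_frame_start(frame_index):
    """Subcarrier phase at the start of a frame, as a fraction of a cycle."""
    return (frame_index * LINES_PER_FRAME * CYCLES_PER_LINE) % 1

for n in range(4):
    frame_type = "A" if phase_at_frame_start(n) == 0 else "B"
    print(n, phase_at_frame_start(n), frame_type)
```

An odd number of lines, each leaving half a cycle over, leaves half a cycle over per frame, so the phase only repeats every second frame.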
For transmission, this doesn't matter in the least, but suppose someone edits an NTSC tape such that the sequence is broken and it goes A, B, A, B, B, A, B...
The time base corrector sees the burst phase invert at the edit, which looks like a half-cycle timing error. As the TBC lives to get the chroma timing correct, it has to shift the entire picture left or right on the screen by half a cycle of subcarrier.
That then raised the necessity of defining the A and B Type frames in NTSC in terms of the subcarrier to horizontal sync timing, the so-called Sc-H phase. An installation using an accurate Sc-H phase sync pulse generator and an editor that respected the A and B frame sequence could perform edits without picture jumps.
Things were different in Europe. The adoption of V-switch in PAL color TV led to the subcarrier having a quarter-cycle offset with respect to line rate, which avoided any interference between chroma and the sound carrier, so the field rate could remain at 50 Hz exactly. However, the lower frequencies introduced by V-switch meant that the PAL video signal contained an eight-field sequence that can be seen today as the forerunner of the Group of Pictures of MPEG.
To make matters worse the BBC never ratified the EBU's Sc-H standards, so manufacturers would proudly take their EBU specification equipment to the BBC and find that it didn't work because Sc-H phase at the BBC in those days changed with the direction of the wind.
Back in the New World, the problems of using 59.94 Hz were finally palliated with the adoption of the so-called drop-frame time code. This was a bit of a misnomer, because no frames were actually dropped. The reduced frame rate meant they were never there in the first place. What drop-frame actually means is that 0.1 percent of the time codes were dropped at carefully planned intervals so that the time code reading would stay reasonably close to real time.
Effectively the time code system worked as if the seconds count was obtained by dividing the frame count by 29.97 instead of 30. In whole numbers the ratio is 1000 to 1001, which meant dropping about one code in every thousand. However, with time code every frame is numbered, and it is therefore possible for an editor to tell whether a frame is an A frame or a B frame by looking at the time code. In order not to break the AB sequence of NTSC, codes for single frames could not be dropped. They had to be dropped in pairs so the AB sequence would be undisturbed by the code drop. Effectively the ratio became 2000 to 2002.
The 0.1 percent reduction in frame rate resulted in the video signal having a ten-minute sequence containing eighteen fewer TV frames than a pure 60 Hz system would have. This suggested a very simple scheme that is easy to remember, whereby two time codes are dropped at the beginning of every minute except every tenth minute. Dropping two codes nine times in ten minutes gave the necessary 18-frame correction. Every NTSC editor knew the rules, so it wouldn't go looking for a time code that didn't exist.
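The rule is simple enough to sketch in code. The function below is a minimal illustration, not taken from any particular editor: it converts a running frame count into a drop-frame time code by adding back the codes that have been skipped so far. The function name is invented here, and the semicolon before the frame digits is a common display convention for drop-frame that is assumed rather than stated in the text.

```python
def frames_to_dropframe(frame_count):
    """Convert a count of actual frames into a drop-frame time code string."""
    frames_per_10_min = 10 * 60 * 30 - 18   # 17982 real frames per ten minutes
    frames_first_min = 60 * 30              # minute 0 of each block: no drop
    frames_other_min = 60 * 30 - 2          # minutes 1-9: two codes dropped

    blocks, rem = divmod(frame_count, frames_per_10_min)
    skipped = 18 * blocks                   # codes dropped in whole blocks
    if rem >= frames_first_min:
        # Two more codes for each minute boundary crossed inside this block.
        skipped += 2 * (1 + (rem - frames_first_min) // frames_other_min)

    code = frame_count + skipped            # now count as if at 30 per second
    ff = code % 30
    ss = code // 30 % 60
    mm = code // 1800 % 60
    hh = code // 108000 % 24
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

print(frames_to_dropframe(1799))    # last frame of minute 0
print(frames_to_dropframe(1800))    # first frame of minute 1
print(frames_to_dropframe(17982))   # first frame of minute 10
```

Note how codes ;00 and ;01 never appear at the start of minute 1, while minute 10 starts at ;00 as normal.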
Then along came digital audio, which was initially recorded on videocassettes to get enough bandwidth. The sampling rate of 44.1 kHz came about because it could be supported by a 50 Hz VCR recording three samples on 294 lines per field or by a 60 Hz VCR recording three samples on 245 lines per field. Compact Disc production recorders were based on black and white U-Matic VCRs and ran at 60 Hz exactly.
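The figures quoted can be checked with a one-line calculation. This is only a sketch of the arithmetic, with a function name chosen for the example.

```python
def sampling_rate(field_rate_hz, lines_per_field, samples_per_line=3):
    """Audio samples per second stored on the video lines of a VCR."""
    return field_rate_hz * lines_per_field * samples_per_line

print(sampling_rate(50, 294))  # 50 Hz VCR
print(sampling_rate(60, 245))  # 60 Hz VCR
```

Both cases come to the same number of samples per second, which is what made the rate workable on either kind of machine.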
Digital VTRs for production had to adopt the AES standard audio sampling rate of 48 kHz, even though the rotary heads were locked to 59.94 Hz in NTSC countries. As the frame rate was slightly low, there were 48,048 audio samples in 60 fields and not 48,000. This meant the audio blocks on the tape would contain varying numbers of samples in a pattern, so that in the long term the correct sampling rate was recorded.
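To see why the sample count per block has to vary, consider the sketch below. It spreads 48 kHz samples over frames using simple floor arithmetic. That allocation is an assumption made for illustration; the actual pattern used by any given tape format may be ordered differently.

```python
from fractions import Fraction
from math import floor

SAMPLE_RATE = 48000
FRAME_RATE = Fraction(30000, 1001)              # assumed exact NTSC frame rate
SAMPLES_PER_FRAME = SAMPLE_RATE / FRAME_RATE    # 8008/5, not a whole number

def samples_in_frame(n):
    """Whole samples allotted to frame n under a floor-based allocation."""
    return floor((n + 1) * SAMPLES_PER_FRAME) - floor(n * SAMPLES_PER_FRAME)

pattern = [samples_in_frame(n) for n in range(5)]
print(pattern, sum(pattern))
print(sum(samples_in_frame(n) for n in range(30)))  # 30 frames = 60 fields
```

The pattern repeats every five frames, which is the shortest run of frames that holds a whole number of samples.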
The 59.94 Hz rate was only necessary to prevent sound and picture interference in NTSC broadcasts. It was possible to play, for example, a C-format NTSC videotape at exactly 60 Hz, because the sound had its own tape track. Once the broadcasting of NTSC ended, the requirement for 59.94 Hz went away and it became a pain in the ass that persisted purely because of inertia rather than any technical requirement.