Digital Audio: Part 5 - The Mathematics Of Oversampling

Oversampling is a topic that is central to digital audio and has almost become universal, but what does it mean?

Mathematically speaking, oversampling isn't necessary, because if we know the audio bandwidth we need, then simply doubling it gives us the appropriate sampling rate and that's that. The definition of oversampling is mathematical: any sampling rate that is more than twice the audio bandwidth is oversampled.
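As a minimal sketch of that definition, the oversampling factor can be written as the ratio of the actual sampling rate to twice the audio bandwidth. The function names are illustrative assumptions, not established terminology.

```python
def oversampling_factor(sample_rate_hz: float, audio_bandwidth_hz: float) -> float:
    """Ratio of the actual sampling rate to the minimum rate (twice the bandwidth)."""
    return sample_rate_hz / (2.0 * audio_bandwidth_hz)


def is_oversampled(sample_rate_hz: float, audio_bandwidth_hz: float) -> bool:
    """True when the sampling rate is more than twice the audio bandwidth."""
    return oversampling_factor(sample_rate_hz, audio_bandwidth_hz) > 1.0


# A 20 kHz bandwidth sampled at exactly 40 kHz is not oversampled;
# the same bandwidth sampled at 48 kHz is, by a small margin.
print(is_oversampled(40_000, 20_000))
print(oversampling_factor(48_000, 20_000))
```

By this strict definition, even a conventional 48kHz system is slightly oversampled; the term is normally reserved for much larger factors.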

Oversampling is not necessary to the mathematicians, because they can assume ideal devices such as brick-wall filters that allow the sampling rate to be twice the audio bandwidth. Unfortunately such devices cannot practically be made.

Digital audio requires a low-pass filter prior to the ADC to prevent aliasing and a second filter after the DAC to return the signal to the continuous domain. The reproduced audio signal has passed through those two filters, so if they have any shortcomings we are going to hear them.

It's a fundamental characteristic of filters that the steeper the slope of the response in the cut-off region, the longer the delay caused by the filter. It's obvious really: to distinguish between two frequencies, allowing one and blocking the other, the filter must examine a certain length of signal, known as the window, and the longer the window, the easier it is to see any difference in period. It follows that a brick-wall filter having a vertical cut-off slope must have an infinite delay.
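The link between window length and frequency discrimination can be sketched with a rough rule of thumb: two tones drift apart by one whole cycle after a time equal to the reciprocal of their frequency difference. That rule, and the function name, are assumptions made for illustration rather than a filter design formula.

```python
import math


def window_to_resolve(f1_hz: float, f2_hz: float) -> float:
    """Time in seconds over which two tones slip by one full cycle.

    Rule of thumb only: a filter that must separate f1 from f2 needs to
    look at roughly this much signal, which implies a comparable delay.
    """
    delta = abs(f1_hz - f2_hz)
    if delta == 0.0:
        # No difference in period can ever be seen: the brick-wall case.
        return math.inf
    return 1.0 / delta


for f_pass, f_stop in [(20_000, 24_000), (20_000, 20_100), (20_000, 20_000)]:
    print(f_pass, f_stop, window_to_resolve(f_pass, f_stop))
```

As the two frequencies approach one another, the window grows without limit, which is the infinite delay of the brick-wall filter.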

Fig.1a) shows that in conventional sampling, the sampling rate is raised slightly so that filters with finite slope can be used. In much audio equipment a bandwidth of 20kHz is obtained with a sampling rate of 48kHz.

Fig.1 - At a) a conventional ADC has a sampling rate Fs a little more than twice the highest audio frequency to be converted. In an oversampling convertor, b), the sampling rate is higher than necessary so that a more gentle anti-aliasing filter can be used.

Analog anti-aliasing and reconstruction filters are difficult to make. A steep-cut filter must delay the signal significantly, yet delay in the analog domain is difficult to achieve, and making a low-pass filter with an acceptable phase response is very difficult. Sometimes a phase correction stage is added, increasing the complexity. Analog filters are also physically large, owing to the need to use reactive components such as inductors and capacitors.

In contrast, delay in the digital domain is trivially easy. Samples are just left in memory and they are delayed with no loss of quality. A digital filter can have much better performance than an analog filter, and can be implemented at lower cost, but the problem is that the anti-aliasing filter needs to be in the analog domain, before the sampling stage. The problem is resolved by oversampling.

In an oversampling ADC, the sampling rate is temporarily made higher than sampling theory would predict. Fig.1b) shows that the anti-aliasing filter can then have a much reduced slope in the cut-off region, and a correspondingly improved phase response in the audio band.
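To put rough numbers on the reduced slope, the sketch below measures the room available to the analog filter in octaves and the average slope it would need. It assumes the filter must reach full rejection by half the sampling rate, and the 96dB rejection figure is a hypothetical target chosen for illustration.

```python
import math


def transition_octaves(audio_bandwidth_hz: float, sample_rate_hz: float) -> float:
    """Octaves between the top of the audio band and half the sampling rate.

    Assumes the filter must be fully attenuating at half the sampling rate,
    a conventional but not universal design target.
    """
    return math.log2((sample_rate_hz / 2.0) / audio_bandwidth_hz)


def required_slope_db_per_octave(audio_bandwidth_hz: float,
                                 sample_rate_hz: float,
                                 rejection_db: float = 96.0) -> float:
    """Average slope needed to reach rejection_db across the transition region."""
    return rejection_db / transition_octaves(audio_bandwidth_hz, sample_rate_hz)


# 20 kHz audio bandwidth: conventional 48 kHz versus four-times oversampling.
conventional = required_slope_db_per_octave(20_000, 48_000)
oversampled = required_slope_db_per_octave(20_000, 4 * 48_000)
print(f"conventional: {conventional:.0f} dB/octave")
print(f"oversampled:  {oversampled:.0f} dB/octave")
```

Under these assumptions the conventional filter needs a slope several times steeper than the oversampled one, which is why its phase response in the audio band suffers.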

Once the conversion has taken place at the oversampled rate, that rate can be reduced in a digital filter. Although the same spectrum as Fig.1a) results, the overall frequency response has been defined by the digital filter, not by the analog filter, and the digital filter can be as accurate as we please and have any phase response we specify, including linear phase.

The digital filter does not just determine the frequency response; it also reduces the sampling rate. If the oversampling factor is an integer, like 2 or 4, each sample at the output rate always coincides in time with a sample at the oversampled rate. The design of the filter is eased because reducing the sampling rate is then only a matter of omitting the samples that do not coincide.

In practice there is no point in computing the value of samples that are going to be discarded, so the low-pass filter and the rate reduction are combined into a single stage, known as a decimator, in which only the samples that are to be retained are computed.
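The saving can be illustrated with a minimal sketch in which a small FIR low-pass filter is applied in two ways: computing every output and then discarding, or computing only the outputs that are kept. The three filter coefficients are hypothetical and far too short for a real convertor; they are only there to make the comparison runnable.

```python
def fir_output(samples, taps, n):
    """One output sample of an FIR filter at index n (zero before the start)."""
    total = 0.0
    for k, tap in enumerate(taps):
        if n - k >= 0:
            total += tap * samples[n - k]
    return total


def filter_then_discard(samples, taps, factor):
    """Wasteful reference: compute every filtered sample, keep one in `factor`."""
    filtered = [fir_output(samples, taps, n) for n in range(len(samples))]
    return filtered[::factor]


def decimate(samples, taps, factor):
    """Decimator: compute only the output samples that will be retained."""
    return [fir_output(samples, taps, n) for n in range(0, len(samples), factor)]


taps = [0.25, 0.5, 0.25]            # hypothetical low-pass coefficients
oversampled = [float(n % 7) for n in range(32)]
print(decimate(oversampled, taps, 4) == filter_then_discard(oversampled, taps, 4))
```

Both routes give identical results, but with an oversampling factor of four the decimator does only a quarter of the multiplications.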

Fig.2 - In an oversampling DAC, the convertor runs at some multiple of the original sampling rate, again allowing a gentle slope to be used in the reconstruction filter.

A mirrored approach can be used in a DAC. The DAC itself runs at an elevated sampling rate as shown in Fig.2, so that the analog reconstruction filter can have a relatively gentle slope in the cut-off region and a benign phase response in the audio band. The DAC needs to be supplied with samples from an interpolator that calculates sample values between those from the audio data. Again the oversampling factor will be an integer so that some of the output samples will coincide with the input samples and need not be computed.

The operation of digital interpolators will have to wait for another time.

Let's try a thought experiment. Suppose an ADC is oversampling by a factor of two, so that each output sample will occupy the same time span as two conversions. Also suppose one sample has a value of 62 whereas the next sample has a value of 63. Six bits are needed to encode each sample. Now let us consider what the filter does with the samples. If it averages them, the result will be 62.5, a value that requires an additional bit below the radix point, having a value of one half.

So from two six-bit words, we have extracted seven-bit data by filtering. That's the real power of oversampling, because it allows convertors to be built that have longer word length so they can offer more dynamic range. In the simple system described here, a doubling of the sampling rate allows one extra bit of resolution to be obtained. Quadrupling of the sampling rate allows two extra bits and so on.
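The thought experiment can be sketched in integer arithmetic. The values 62 and 63 are used here as illustrative adjacent codes that both fit in six unsigned bits, and the function name is an assumption. Adding two samples rather than dividing by two keeps the half-step as an extra least significant bit.

```python
def average_with_extra_bit(a: int, b: int, bits: int = 6) -> int:
    """Average two `bits`-bit codes, returned as a (bits + 1)-bit code.

    The sum of the two samples is their average scaled by two, so the
    new least significant bit has a weight of one half.
    """
    limit = 1 << bits
    for value in (a, b):
        if not 0 <= value < limit:
            raise ValueError(f"{value} does not fit in {bits} unsigned bits")
    return a + b


code = average_with_extra_bit(62, 63)
print(code, format(code, "07b"), code / 2)
```

The seven-bit result is 125, which read with one bit below the radix point is 62.5. This simple picture relies on successive conversions actually differing; a real convertor depends on noise or dither to make that happen.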

This follows from fundamental information theory and is the basis of all modulation techniques. Information is like a pat of butter as Fig.3 shows. Squash it down and the signal to noise ratio falls whilst the bandwidth increases. Squeeze it from the side and the bandwidth reduces whilst the signal to noise ratio goes up.
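The pat of butter can be given numbers using the Shannon-Hartley formula, which is assumed here to be the piece of information theory the text has in mind. The sketch holds the capacity constant and shows what happens to the signal-to-noise ratio when the bandwidth is doubled; the 20kHz and 96dB starting figures are hypothetical.

```python
import math


def capacity_bits_per_second(bandwidth_hz: float, snr_power_ratio: float) -> float:
    """Shannon-Hartley capacity: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + snr_power_ratio)


def snr_for_capacity(capacity_bps: float, bandwidth_hz: float) -> float:
    """Power signal-to-noise ratio needed to carry capacity_bps in bandwidth_hz."""
    return 2.0 ** (capacity_bps / bandwidth_hz) - 1.0


def to_db(power_ratio: float) -> float:
    return 10.0 * math.log10(power_ratio)


narrow_snr = 10.0 ** (96.0 / 10.0)                 # 96 dB as a power ratio
capacity = capacity_bits_per_second(20_000, narrow_snr)
wide_snr = snr_for_capacity(capacity, 40_000)      # squash the butter flat
print(f"{to_db(narrow_snr):.1f} dB in 20 kHz, {to_db(wide_snr):.1f} dB in 40 kHz")
```

Doubling the bandwidth roughly halves the signal-to-noise ratio in dB that is needed to carry the same information, and the exchange works equally well in the other direction.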

From a storage standpoint, oversampling is grossly inefficient. Doubling the sampling rate to get one extra bit of resolution doubles the data rate. Inside a convertor this is of little consequence, but for recording or transmission it is an issue. For storage, the most efficient approach is always to filter down to the lowest acceptable sampling rate and carry the resolution in a longer word length.
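The inefficiency is easy to quantify. The sketch below follows the simple model used in this article, in which one bit of word length can be exchanged for a doubling of sampling rate, and compares two hypothetical ways of reaching the same resolution.

```python
def data_rate_bits_per_second(sample_rate_hz: int, word_length_bits: int,
                              channels: int = 1) -> int:
    """Raw data rate of uncompressed PCM audio."""
    return sample_rate_hz * word_length_bits * channels


# Same nominal resolution under the simple model above:
# 16-bit words at 48 kHz, or 15-bit words at twice the sampling rate.
filtered_down = data_rate_bits_per_second(48_000, 16)
oversampled = data_rate_bits_per_second(96_000, 15)
print(filtered_down, oversampled, oversampled / filtered_down)
```

The oversampled version needs nearly twice the data rate, whereas the extra bit at the lower rate costs only one sixteenth more.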

The same approach can be applied in a DAC. If the DAC is oversampled by a factor of two, and two successive conversions differ by one quantizing interval, the analog reconstruction filter will average them to produce an analog voltage half way between the quantizing intervals, effectively as if the DAC had an extra bit of resolution.

This approach was put to good use when the Compact Disc was launched. At one time the CD was intended to be a 14-bit medium and Philips had developed a suitable 14-bit DAC for a consumer player. Rather late in the day the CD format was upgraded to 16 bits, and suddenly Philips had to find a suitable DAC. Someone had the bright idea of oversampling the 14-bit DAC by a factor of 4 to give it 16-bit resolution.

A suitable interpolator was designed and the first generation Philips CD player came to the market using oversampling from the outset, and offered an extremely good phase performance. The implementation of the oversampling was the easy part. The problem Philips had was how to explain to potential customers that their player really did have 16-bit resolution. 

Fig.3 - The information capacity of a signal can be doubled by doubling the bandwidth or by doubling the signal level so the SNR improves.

The vinyl disk was probably the last audio medium the consumer could understand and the technological leap of digital audio was too much. Some hi-fi magazines of the day tried to explain how it worked and most of their efforts were hilarious. Some tried to argue that it didn't work and that it was inferior to prior formats. Fortunately the medium spoke for itself and no technical qualifications whatsoever were needed to hear the improvement.

Audio convertors are simple in concept but when it comes to actually making one the problem of component tolerances soon arises. For example a 16-bit convertor has 65535 quantizing intervals and every one of them needs to span the same voltage range in the analog domain. The percentage accuracy needed is phenomenal.
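A rough feel for the accuracy involved comes from expressing one quantizing interval as a percentage of the full range. This is a sketch only: a practical convertor needs errors well below one interval, so the true requirement is tighter than the figure printed here.

```python
def interval_percent_of_full_scale(bits: int) -> float:
    """One quantizing interval as a percentage of the full conversion range."""
    intervals = (1 << bits) - 1      # 65535 intervals for a 16-bit convertor
    return 100.0 / intervals


for bits in (8, 14, 16):
    print(f"{bits}-bit: {interval_percent_of_full_scale(bits):.5f} % of full scale")
```

For 16 bits the result is between one and two thousandths of one percent, far beyond the tolerance of ordinary components.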

Oversampling comes to the rescue because it allows some of the necessary precision to be obtained by averaging over time. Timing accuracy is much easier to achieve in electronics than close component tolerance, so although it's not theoretically needed, oversampling allows a more accurate convertor to be made with components of a given accuracy or, conversely, allows the same accuracy to be obtained with components having wider tolerance, which reduces the cost.

Another factor is that when more of a convertor is implemented in the digital domain, it can become physically smaller because digital microelectronics can mostly manage without physically large passive components such as capacitors. Without oversampling, it would have been much harder for digital audio to have entered universal use, especially where the audio processes have to be incorporated in some more complex device such as a portable computer where space is severely limited.
