Digital audio relies completely on sampling and no treatment of the subject can be complete without looking at how it works.
Sampling is simply a process of periodic measurement. In audio the time between samples is exactly the same, which simplifies things. Fig.1 shows that the sampling process is multiplicative; in other words the incoming audio waveform is multiplied by a pulse train to produce a pulse amplitude modulated (PAM) signal, in which the height of the individual pulses corresponds to the height the audio waveform had at the sampling instant.
Fig.1. Sampling can be thought of as the audio waveform amplitude modulating a series of narrow pulses or needles, to produce a pulse-amplitude-modulated (PAM) signal.
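The multiplication described above can be sketched in a few lines of Python. This is only an illustration: the grid size, the needle spacing and the names `audio`, `needles` and `pam` are assumptions made for the example, with a fine time grid standing in for continuous time.

```python
import math

def sample_by_multiplication(signal, pulse_train):
    """Multiply the waveform by the pulse train, point by point."""
    return [s * p for s, p in zip(signal, pulse_train)]

# Assumed values: 64 grid points stand in for continuous time,
# with one sampling needle every 8 points.
GRID_POINTS = 64
SAMPLE_SPACING = 8

# One cycle of a sine wave as the incoming audio waveform.
audio = [math.sin(2 * math.pi * n / GRID_POINTS) for n in range(GRID_POINTS)]

# The pulse train: unit-height needles, zero everywhere else.
needles = [1.0 if n % SAMPLE_SPACING == 0 else 0.0 for n in range(GRID_POINTS)]

# The PAM signal is zero between needles; each needle takes the height
# the audio waveform had at that instant.
pam = sample_by_multiplication(audio, needles)
sample_values = [pam[n] for n in range(0, GRID_POINTS, SAMPLE_SPACING)]
```

Between the needles the product is zero, which is the sense in which the process takes no notice of the waveform there.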
Sampling can be described in either the time domain or the frequency domain, and if the two explanations don’t agree there is something wrong. In order to switch between time and frequency domains, it is necessary to have a basic understanding of transforms and transform duality. Any hi-fi journalist would of course be able to speak at length on that subject, but we are not all privileged to meet such luminaries and it may be necessary to outline the topic here.
Obviously, the sampling process itself takes no notice of what happened between the samples. Fig.2 shows that the same set of samples could have come from two different waveforms. Which of the waveforms of Fig.2 is the correct one and which one is the aliased waveform? In the absence of more information all that can be said is that it could be either but not both.
Fig.2. Here, two different waveforms are represented by the same set of samples. Which is correct depends on the associated filters.
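A small numerical check makes the ambiguity concrete. The sketch below assumes an 8 kHz sampling rate and compares a 1 kHz cosine with a 7 kHz cosine; the figures and names are chosen for the example and are not taken from Fig.2.

```python
import math

FS = 8000.0          # assumed sampling rate in Hz
F_LOW = 1000.0       # a frequency below FS / 2
F_HIGH = FS - F_LOW  # 7000 Hz, a frequency above FS / 2

def sample_cosine(freq, fs, count):
    """Return `count` samples of a unit cosine of frequency `freq`."""
    return [math.cos(2 * math.pi * freq * n / fs) for n in range(count)]

low = sample_cosine(F_LOW, FS, 16)
high = sample_cosine(F_HIGH, FS, 16)

# The two waveforms differ between the samples, but the samples agree.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(low, high))
```

Here `same` comes out true, so nothing in the sample values themselves can say which of the two frequencies was present.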
The further information that is needed is the frequency band in which the sampled signal lies. Sampling on its own is not meaningful and it is absolutely fundamental that any sampling process must be preceded by and followed by suitable filters as shown in Fig.3, that define the frequency band in which the signal exists. Such filters would only admit one of the waveforms in Fig.2 and thereby resolve the uncertainty.
Sampling theory sets out the necessary relationships between the filters and the sampling rate. In order to specify the filters, it is necessary to know what the sampling process does to the spectrum. As it is a modulation process, effectively the amplitude modulation of a series of pulses, the result must contain sidebands.
Fig.3. All sampling processes must be preceded and followed by appropriate filters. It is not meaningful to discuss sampling without considering the action of the filters.
Fig.4 shows that a sine wave is one component of a steady rotation. To eliminate the cosine component there must be two rotations in opposing directions, so that the cosine components cancel. It follows that all real sine waves comprise equal amounts of positive and negative frequency energy. In the baseband these cannot be distinguished, but a modulation process reveals them. In a modulation process, one opposed pair of rotations is carried around by another. The results are both sum and difference frequencies, which is where the upper and lower sidebands of modulation processes come from.
This is no mere mathematical construct; it is found in real life. When starting a helicopter rotor the blades will be out of pattern and they swing about with respect to the head. But the head is rotating and the frequencies felt in the cockpit are heterodyned. As the rotor starts up the hull starts to sway. Pilots call it padding. Initially the frequency of the padding rises, but then as the rotor gets faster the padding frequency starts to fall again because it is due to the lower sideband of the swinging blades.
Fig.4. When there is contra-rotation, as shown here, the horizontal processes are in opposition and cancel out, whereas the vertical process results in a sine wave having equal parts of positive and negative frequency. Modulation causes one sine wave to carry another so that negative frequencies subtract to produce lower side bands.
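The contra-rotation idea can be written compactly using standard identities. The symbols are assumptions introduced here for illustration: ω is any angular frequency, ω_a that of the audio and ω_c that of the carrier, or pulse-train component, being modulated.

```latex
\[
  \sin\omega t = \frac{e^{j\omega t} - e^{-j\omega t}}{2j}
\]
\[
  \sin(\omega_c t)\,\sin(\omega_a t)
    = \tfrac{1}{2}\Bigl[\cos\bigl((\omega_c - \omega_a)t\bigr)
      - \cos\bigl((\omega_c + \omega_a)t\bigr)\Bigr]
\]
```

The first line is the pair of opposed rotations whose cosine parts cancel. The second shows the product of two sine waves containing only the difference frequency (the lower sideband) and the sum frequency (the upper sideband).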
Fig.5a) shows that the spectrum of a pulse train consists of the fundamental plus a theoretically infinite series of harmonics according to Fourier. If such a pulse train is amplitude modulated by a baseband signal as in Fig.5b) the baseband will repeat as upper and lower sidebands around the harmonics.
If the base bandwidth is increased as in Fig.5c), the limit is when the baseband has reached half way up to the sampling frequency and the lower sideband has reached half way down. Half the sampling rate is often called the Nyquist frequency. This is because Nyquist discovered around 1924 that the pulse rate of a telegraphy system could not exceed twice the bandwidth of the channel.
An ideal anti-aliasing filter with a rectangular frequency response could prevent the baseband exceeding the Nyquist frequency and an identical filter later on could reject the images and leave only the baseband.
Fig.5d) shows another possibility. Here the input signal is limited by an anti-aliasing filter having a bandpass characteristic. The lower sideband is now below the baseband with the upper sideband symmetrically above the sampling rate. The lower sideband of twice the sampling rate is below the upper sideband of the sampling rate. The sidebands and the baseband all occupy different frequencies so the input signal is recoverable by a bandpass filter.
It should be clear from Fig.5d) that sampling theory does not actually require the input spectrum to lie below half the sampling frequency. Instead the requirement is that the signal bandwidth must not exceed half the sampling frequency.
Fig.5. a) An un-modulated set of sampling needles has a spectrum like this. b) In conventional sampling, the baseband is repeated as upper and lower sidebands around multiples of the sampling rate. c) The limit is reached at Fs/2, the Nyquist frequency, when the lower sideband touches the baseband. d) An alternative sampling system in which the input is limited by a band-pass filter. The lower sideband is now below the baseband in the spectrum. The baseband can still be recovered with a band-pass filter.
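Whether a given band survives sampling can be checked by listing where its images fall. The following is a minimal sketch, assuming ideal rectangular filters and a 48 kHz sampling rate; the function name `is_recoverable` and the band edges are hypothetical and chosen only to mirror the cases of Fig.5.

```python
import math

def is_recoverable(f_lo, f_hi, fs):
    """Return True if no image created by sampling at `fs` overlaps the
    band f_lo..f_hi (all in Hz). A real signal also occupies the mirror
    band -f_hi..-f_lo, and sampling repeats both at every multiple of fs."""
    reach = math.ceil(2 * f_hi / fs) + 1  # multiples of fs worth checking
    for k in range(-reach, reach + 1):
        if k == 0:
            continue  # the unshifted band and its mirror cannot collide
        images = [
            (k * fs + f_lo, k * fs + f_hi),  # shifted copy of the band
            (k * fs - f_hi, k * fs - f_lo),  # shifted copy of its mirror
        ]
        for lo, hi in images:
            if lo < f_hi and hi > f_lo:  # bands that merely touch are allowed
                return False
    return True

FS = 48000.0  # assumed sampling rate in Hz

conventional = is_recoverable(0.0, 20000.0, FS)    # baseband below FS / 2
too_wide = is_recoverable(0.0, 30000.0, FS)        # baseband beyond FS / 2
band_pass = is_recoverable(26000.0, 46000.0, FS)   # as in Fig.5d), above FS / 2
straddling = is_recoverable(20000.0, 40000.0, FS)  # band straddles FS / 2
```

The band-pass case passes even though it lies entirely above half the sampling rate. The last case fails despite having the same 20 kHz bandwidth, because in this simple check a band that straddles a multiple of half the sampling rate collides with its own lower sideband.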
It should also be clear why one or other of the waveforms of Fig.2 could be correct, depending on the filtering used. Whichever one is correct, the other one could not pass the filters.
In audio compression systems it is common to divide the audio spectrum into a large number of frequency bands, typically 32. As the bandwidth of each is 1/32 of the input spectrum, it follows that the sampling rate of each band can also be 1/32 of the input sampling rate, so the total sampling rate does not change after the filter bank.
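The arithmetic can be checked quickly. The 48 kHz input rate below is an assumption made for the example; only the figure of 32 bands comes from the text.

```python
INPUT_RATE = 48000.0  # Hz, assumed input sampling rate
BANDS = 32            # number of frequency bands in the filter bank

input_bandwidth = INPUT_RATE / 2           # 24000 Hz available to the input
band_bandwidth = input_bandwidth / BANDS   # each band is 1/32 as wide
band_rate = 2 * band_bandwidth             # twice the bandwidth of one band
total_rate = band_rate * BANDS             # samples per second over all bands
```

With these figures each band is 750 Hz wide and is sampled at 1500 Hz, and the 32 bands together still carry 48000 samples per second.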
Aliasing is not always detrimental. The stroboscope is an example of a beneficial use of aliasing. When the input frequency and the sampling rate coincide, the lower sideband frequency is zero. Marks cut on the perimeter of an audio turntable appear to be stationary when illuminated by 100Hz or 120Hz light from an a.c. powered source. It is ironic that those who prefer vinyl disks to this new-fangled digital stuff use sampling theory to get the speed correct.
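The turntable case reduces to simple arithmetic, sketched below with exact fractions. The function names are hypothetical, and 33 1/3 rpm is assumed as the target speed. The marks appear stationary when exactly one mark spacing passes per flash of light.

```python
from fractions import Fraction

def strobe_marks(rpm, flash_hz):
    """Marks per revolution that appear stationary at the given speed."""
    revs_per_second = Fraction(rpm) / 60
    return Fraction(flash_hz) / revs_per_second

def apparent_drift_hz(marks, rpm, flash_hz):
    """Difference between the rate at which marks pass and the flash rate.
    Zero means the pattern appears stationary."""
    return Fraction(marks) * Fraction(rpm) / 60 - Fraction(flash_hz)

LP_SPEED = Fraction(100, 3)  # 33 1/3 rpm

marks_100hz = strobe_marks(LP_SPEED, 100)  # light from 50 Hz mains
marks_120hz = strobe_marks(LP_SPEED, 120)  # light from 60 Hz mains

on_speed = apparent_drift_hz(marks_100hz, LP_SPEED, 100)
running_fast = apparent_drift_hz(marks_100hz, 34, 100)
```

This gives 180 marks for 100Hz light and 216 marks for 120Hz light. At the correct speed the difference frequency is zero, whereas at 34 rpm the 180-mark pattern appears to drift by two marks per second.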
Fig.6. In a reconstruction filter the zero crossings in the filter impulses align with adjacent samples so the output waveform effectively joins the samples into a continuous waveform identical within the available bandwidth to the original.
Turning to the time domain, Fig.6 shows that the impulse response of an ideal low-pass filter is a sin x/x curve, which periodically passes through zero. If the filter has a bandwidth of one half the sampling rate, the zero crossings of the impulse response coincide with the locations of adjacent samples. In other words at the centre of a given sample, the voltage due to all other samples is zero. That means the filtered waveform must join up the tops of the samples.
In between the samples the voltage is the sum of all of the impulses within the filter window. So it doesn’t matter that the sampling process doesn’t know what happens between samples, because the filters do. Once the sampling rate exceeds twice the signal bandwidth, the filters guarantee that the audio waveform could only change between samples in one way, which means that a well engineered sampling system need not change the original waveform in the slightest.
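The way the impulses add up can be sketched as follows. This is an illustration under stated assumptions: time is measured in sample periods, the eight sample values are invented for the example, and the sum runs over only those eight samples, so values between samples are approximate.

```python
import math

def sinc(x):
    """sin(pi x) / (pi x), with the limiting value 1 at x = 0."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Output of an ideal reconstruction filter at time t (in sample
    periods): each sample contributes one scaled sinc impulse."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

samples = [0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7]  # invented values

# At a sampling instant every other impulse is at a zero crossing,
# so the output equals the sample itself.
at_sample = reconstruct(samples, 2)

# Between samples the output is the sum of all the impulses.
between = reconstruct(samples, 2.5)
```

`at_sample` returns the third sample value, 1.0, to within rounding error, because the contributions of all the other samples vanish there.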
This was first explained by the British mathematician Edmund Whittaker in 1915, which was somewhat ahead of the publications of Vladimir Kotelnikov in the Soviet Union and of Claude Shannon in the USA.
Fig.7 shows that the Fourier transform of a square wave is an infinite series of harmonics whose envelope is a sin x/x curve. That curve passes through zero periodically, which is why a square wave has no even harmonics. Transform duality tells us that if a filter has a square, or brick wall, frequency response, the impulse response must also be a sin x/x curve stretching to infinity.
Those infinities are bad news because they can’t be reached in the real world. A perfect square wave isn’t possible because the vertical ends require infinite bandwidth. Equally a filter with a square frequency response isn’t possible because it would have an infinite impulse response and cause infinite delay to the signal.
Fig.7. Transform duality holds that a square in either the frequency or time domains results in a sin x/x in the other domain.
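The absence of even harmonics can be checked from the standard Fourier-series result for a rectangular pulse train. The sketch assumes unit pulse height and a 50 per cent duty cycle standing in for the square wave; the function name is hypothetical.

```python
import math

def harmonic_amplitude(n, duty=0.5):
    """Fourier-series coefficient of harmonic n for a unit-height
    rectangular pulse train: duty * sin(pi n duty) / (pi n duty)."""
    if n == 0:
        return duty  # the d.c. term is simply the average level
    x = math.pi * n * duty
    return duty * math.sin(x) / x

# Harmonics 1 to 8 of a 50 per cent duty cycle wave.
amplitudes = [harmonic_amplitude(n) for n in range(1, 9)]

# With a duty cycle of one half, the sin x/x envelope passes through
# zero at every even harmonic.
even_harmonics = [abs(harmonic_amplitude(n)) for n in (2, 4, 6, 8)]
```

The even-numbered entries are zero apart from rounding error, while the odd harmonics fall away in proportion to 1/n.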
The ideal spectrum of sampling theory where the rectangular baseband meets the rectangular lower sideband at the Nyquist frequency cannot be realised. It is therefore sensible to treat a sampling rate of twice the signal bandwidth as a bound. By that is meant not the means of marsupial locomotion, but a condition that can’t be exceeded and which can only be approached asymptotically and with diminishing returns.
It’s not the end of the world, because it is only necessary to increase the sampling rate slightly and the filters then have a finite slope and the infinities go away.
Sampling is also used by opinion polls, where it is not possible to ask everyone and instead the mathematics of probability are used to estimate the accuracy of a poll based on a limited sample. There comes a point where increasing the size of the sample doesn’t materially increase the accuracy. The same is true of audio sampling rates. As the sampling rate is increased for a signal of fixed bandwidth, the accuracy increases at first, but there comes a point where increasing the sampling rate no longer increases the accuracy. That happens when the sampling rate reaches twice the signal bandwidth, and any higher sampling rate is defined as oversampling.