As we saw earlier when discussing transform duality, when something happens on one side of a transform, we can predict through duality what to expect on the other side.
Other articles in this series:
As the digital domain works entirely with numbers, and numbers are fundamentally discrete, it follows that we can only use the digital domain to describe discrete information. Transform duality tells us that if we start with discrete information, the spectra we compute will also be discrete.
If a signal is sampled, as most of today's signals are, the sampling process has two effects. In the frequency domain the spectrum extends dramatically because the original spectrum now repeats above and below all of the harmonics of the sampling clock. This extended spectrum is redundant in the information theory context, as the repeats of the baseband spectrum add no information that was not in the baseband.
Were this not so, reconstruction could not work. In reconstruction, a filter simply rejects all of the redundant repeat spectra, leaving only the baseband spectrum that existed before it was sampled.
The repeating spectra of the sampled signal warrant some study. The baseband signal is continuous. Using the concepts of Fourier analysis, we can visualize the sampling process as one which adds further waveforms to the baseband waveform that have the effect of making the signal take on a value of zero except at the sample instants.
Clearly that waveform must have a significant high frequency content in order to turn the original analog waveform into a series of spikes. The additional waveform is synthesized from repeated harmonics of the sampling frequency carrying sidebands of the input signal.
That slightly unusual approach to looking at sampling does, however allow a very simple view of how reconstruction works. If those additional harmonics convert our continuous input waveform to a series of pulses, all we have to do is to remove said harmonics in a suitable filter and the original waveform will re-appear. The fundamental point that a correctly sampled signal carries the entire waveform and not just the signal value at a few places is well made, possible more directly than was the case with earlier explanations.
Given that the entire waveform is carried by samples that obey Shannon's theory, it also follows that it doesn't matter where the samples are taken, provided they are evenly spaced. The original waveform will emerge on reconstruction wherever the samples were taken.
Sampling is the process that allows analog signals to become discrete. If we adhere to the rules of sampling theory, the sampling process loses no information. There is one exceptional case in which there is a compound loss of information and that is in the education process of the hi-fi journalist which leaves them not only not knowing how sampling theory works but also not knowing that they don't know.
In the digital domain the signal becomes discontinuous as it is only represented at the sites of the samples. If the sampled input waveform is windowed it will consist of a finite number of samples. Transform duality suggests that it can therefore be represented by the same number of coefficients in the frequency domain.
In other words a finite number of samples having a finite word length allows a finite number of waveforms to exist. The result of a transform is a finite number of coefficients having finite word length, that also allow a finite number of waveforms to be synthesized. If those numbers are the same nothing has been lost.
That, basically, is what the Discrete Fourier Transform (DFT) is about. Starting with discrete number representing a sampled waveform, the DFT gives us discrete coefficients representing a sampled spectrum. Duality also tells us that if we wished, we could reconstruct a continuous spectrum by filtering the coefficients appropriately.
Fig.1 - The fundamental frequency coefficient is calculated here. As the positive and negative parts of a sine wave only differ in polarity, it is more efficient to subtract pairs of samples and perform one multiplication.
The number of samples in the input window and the sampling rate determine the time span of the input data. The frequency spacing of the discrete coefficients is the reciprocal of that time. This is entirely consistent with transform duality, because if the duration of the window is increased, duality suggests the frequency resolution of the transform would increase.
The Discrete Fourier Transform is carried out in basically the same way as the continuous transform except that all of the relevant signals are sampled at the same rate and synchronously. The discrete-time input waveform is multiplied by the discrete-time basis function for each frequency on a sample-by-sample basis to produce a discrete number of products that are summed to calculate each coefficient.
As before, each basis function exists in both sine and cosine form in order to calculate the complex coefficients (in the mathematical sense) that allow the phase of each frequency component in the input waveform to be established.
Whilst it is perfectly possible to calculate a discrete Fourier transform simply by copying the analog process and performing a sampled version of the continuous transform, such an approach would not be efficient, meaning that it would requires more computations than necessary. The calculation of a DFT contains redundancy, and if that can be identified, the amount of computation can go down.
Fig.2a) The FFT butterfly consists of summing and differencing a pair of samples. Fig.2b) The processing of Fig.1 is implemented here with butterflies.
In an algorithm, what we mean by redundancy is that the same calculation is made more than once as the computation proceeds. Another factor is that as the basis functions are sinusoidal, the negative half-cycles are the same as the positive half-cycles except for their polarity, so it is not necessary to calculate them fully.
The Fast Fourier Transform (FFT) takes full advantage of these ideas to calculate the transform using the least amount of computation. The advantages of a more efficient computation are numerous and depend on the application. In addition to running faster on a given problem in a given machine, the use of the FFT allows a slower machine to achieve the same throughput, and in the case portable equipment, allows reduced power consumption. That is an important consideration in, for example, digital radio, where modern modulation schemes such as OFDM require a portable radio to perform a Fourier analysis on the received signal.
The transform simply multiplies the input sample sequence by sine and cosine waves of various frequencies and integrates the results. In order to multiply an input sample sequence by a sine wave, instead of performing one multiplication for each sample, we note that half a cycle later, the sine wave has simply inverted.
Fig.1 shows what happens. Instead of using the negative half-cycle of the sine wave, we invert the input samples instead. In Fig1, subtracting sample 5 from sample 1 and multiplying once produces the same result as multiplying sample 1 by a positive value and sample 2 by a negative value before adding. The number of multiplications has been halved.
This is significant because multiplication is an intensive process in a computer processor that requires a considerable number of microinstructions to perform, takes a lot of time and consumes a lot of energy. Anything that cuts the number of multiplications an algorithm needs, without cutting any corners, is generally a good thing.
Fig.1 also shows that instead of repeating the process to look for the cosine component, which would require the same sample subtractions, the subtractions are used twice, by computing the sine and cosine in parallel.
The FFT cuts down the number of multiplications needed by performing where possible a single multiplication once and storing the result in memory where it can be accessed whenever that product is needed again. It follows immediately that a processor having a cache memory would offer a speed advantage for FFT calculations.
The lowest frequency in the FFT is zero, and mathematically the zero coefficient is found by multiplying each sample by a constant of unity and adding the products. This is identical to averaging the samples, which is intuitively correct.
The next lowest frequency is one where a complete cycle fits into the window as shown in Fig.1. The figure also shows that subtractions are needed between sample 0 and 4, 1 and 5, 2 and 6 and 3 and 7.
Calculation of the first two coefficients can be combined by calculating sums and differences using a building block shown in Fig. 2a), which is called a butterfly on account of its shape. Fig.2b) shows that the four pairs of samples are used in four butterflies to produce four sums and four differences. The sums are added to obtain the first coefficient (the DC component) and the differences are multiplied by the sine and cosine of the phase angle and summed to obtain the coefficients of the fundamental.
The remaining coefficients are calculated in a similar way to obtain a discrete spectrum looking something like Fig.3. Here F8 is the sampling rate, F4 is the Nyquist frequency and F5 - F7 are the lower sideband, which is simply the baseband mirrored. The spectrum is therefore redundant, because F0 to F3 tell us all there is to know. As the coefficients are complex, each one consists of two numbers. The FFT has converted eight samples to four coefficient containing two numbers, so there has been no loss of information.
You might also like...
Compression is almost taken for granted despite its incredible complexity. But it’s worth remembering how compression has developed so we can progress further.
John Watkinson moves on to discussion of the effects of the medium waves are travelling in and explains why loudspeaker enclosures contain foam.
Compression is the ultimate enabling technology behind broadcasting. Without it, life would be very difficult indeed. In this new series, the whole topic will be explored at some depth.
Broadcasting is totally dependent on waves which crop up in a surprising number of places. Sound waves and light waves form the message, which is delivered by further types of wave.
Delivering determinant latency is more important than fighting variable latency, even if it is small. In this article, we look at how codec design and JPEG-XS can scale to make the best use of network bandwidth while keeping latency predictable.