There are two approaches to digital filtering. One is to implement the impulse response directly. The other is to use recursion. Here we look at the direct implementation.
As has been seen, any filter is characterized by its impulse response. In the case of digital audio the ideal impulse is very simple to grasp: it is no more than a single non-zero sample surrounded by samples of zero value. Mathematicians have different names for it, but that doesn't change what it is.
Fig.1 - In convolution, the impulse response is slid across the input waveform and the area of overlap is measured. The impulse must be mirrored so the tail arrives last.
The theoretical digital audio impulse has zero width and therefore requires infinite bandwidth to reproduce it. That ideal impulse only exists as a concept. All real digital audio systems are bandwidth limited according to sampling theory. Samples can never be heard unless they have passed through a reconstruction filter that forms part of every DAC.
Filters can only reduce the bandwidth of a signal. They can't increase it. Information theory states that the amount of information a signal could carry increases with bandwidth, but a filter has no source of additional information. In other words, passing 20kHz of audio bandwidth through a 40kHz low-pass filter still leaves you with 20kHz of audio. Re-mastering old Rolling Stones albums onto SACD (Super Audio CD) can't add to the information limit of the original tapes. That was one of the reasons SACD failed.
However, if we are trying to build an oversampling DAC, we take advantage of that theory. Raising the sampling rate cannot and does not put any more information into the signal, but it does ease the design of the reconstruction filter. If we have a 48kHz audio file and we want to oversample it, the impulse response of the interpolator should be that of a 24kHz low-pass filter, since that is the bandwidth of a 48kHz audio file.
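The first step of such an interpolator can be sketched in code. This is a hedged illustration, not a complete oversampler: it only shows the zero-stuffing that raises the sampling rate, after which the 24kHz low-pass filter described above would fill in the inserted zeros. The function name is illustrative.

```python
# Sketch of the first stage of 2x oversampling a 48 kHz stream: insert a
# zero between consecutive samples, raising the rate to 96 kHz. The
# interpolation filter (a 24 kHz low-pass, i.e. 0.25 of the new rate)
# would then replace the zeros with properly interpolated values.
def zero_stuff(samples, factor=2):
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))  # (factor - 1) zeros per sample
    return out

print(zero_stuff([1.0, 2.0, 3.0]))  # [1.0, 0.0, 2.0, 0.0, 3.0, 0.0]
```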
In the previous part the concept of convolution was mentioned. Fig.1 shows how it works. The input waveform (in this example a rectangular pulse) is considered fixed and the impulse response waveform is slid across it. The impulse response needs to be time reversed or mirrored so that the tail of the response arrives last. One exception is if the impulse response is perfectly symmetrical, in which case mirroring can be omitted.
The output is proportional to the area by which the input waveform and the impulse overlap. That area is shown shaded in the figure along with the way the output waveform evolves during the overlapping process.
In the digital domain the samples are a constant distance apart, which makes calculating area simple: since every sample has the same width, its area is proportional to its value. Fig.2 shows an example of sampled convolution. Here the impulse is stepped across the input one sample period at a time. At each step, or point, the area of the overlap is calculated. As the samples have constant width, the area is proportional to the extent to which the sample values overlap. Mathematicians would call it a summation of coincident cross products.
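The stepped overlap described above can be written out directly. The following is a minimal sketch of discrete convolution: at each output step the time-reversed impulse response is slid one sample along the input and the coincident cross products are summed.

```python
# Discrete convolution by summation of coincident cross products.
def convolve(x, h):
    y = []
    for n in range(len(x) + len(h) - 1):
        acc = 0.0
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                acc += h[k] * x[n - k]  # cross product of overlapping samples
        y.append(acc)
    return y

# Convolving a rectangular pulse with itself yields a triangle,
# as the area of overlap grows and then shrinks.
pulse = [1.0, 1.0, 1.0]
print(convolve(pulse, pulse))  # [1.0, 2.0, 3.0, 2.0, 1.0]
```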
There are many ways of implementing a filter of the type shown, including the use of suitable software in a processor, but the implementation that best illustrates what is actually happening is hardware. Stepping samples one sample period at a time is an obvious application for a shift register.
Fig.3 shows a shift register in which the sample train representing the input waveform can be stepped across. The length of the shift register forms a window in which a fixed number of points are simultaneously available. Any impulse response we choose to use has to fit inside the window, hence this topology is known as a finite impulse response (FIR) filter.
Each point, also known as a tap, of the shift register feeds one input of a multiplier. The other input is a value known as a coefficient, which is effectively a sample of the required impulse response. The products from the multipliers are summed to produce the filter output.
If this filter is supplied with a test impulse, namely a single non-zero sample surrounded by samples of zero value, then as the input shifts across the window, the output of every multiplier except one will be zero. That single multiplier outputs a cross product, so what emerges from the summer as the filter shifts is the impulse response.
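The shift-register-and-multipliers topology can be modelled in a few lines. This is an explanatory sketch of the hardware structure, not production code; the class and method names are illustrative. Feeding it the test impulse described above replays the coefficients, i.e. the impulse response.

```python
# Hardware-style FIR filter: a shift register of taps, one multiplier
# per tap, and a summer combining the products.
class FIRFilter:
    def __init__(self, coeffs):
        self.coeffs = list(coeffs)        # one coefficient per tap
        self.taps = [0.0] * len(coeffs)   # the shift-register window

    def step(self, sample):
        # Shift the new sample in; the oldest sample falls off the end.
        self.taps = [sample] + self.taps[:-1]
        # Sum of coincident cross products = one output sample.
        return sum(c * t for c, t in zip(self.coeffs, self.taps))

# A test impulse in gives the impulse response (the coefficients) out.
fir = FIRFilter([0.25, 0.5, 0.25])
print([fir.step(s) for s in [1.0, 0.0, 0.0]])  # [0.25, 0.5, 0.25]
```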
With a real audio input there will be samples in every stage and cross products from every stage that will be summed. The FIR filter performs convolution of the impulse response and the input waveform. The FIR filter is always causal, in that any output must occur after the corresponding input has been supplied. Where symmetrical impulse responses are used to obtain phase linearity, the filter causes a delay corresponding to one half the size of the window.
The number of multiplications available per output sample determines the finite length of the impulse. This means that practically all impulses will have to be shortened to fit within the available window. The most brutal way of doing that is simply to truncate the impulse, meaning that the parts outside the window are literally chopped off to leave sharp ends.
The rectangular window causes a sudden transition from samples that matter to samples that don't. One might expect some effects on account of that discontinuity, and the result is known as the Gibbs phenomenon. Discontinuities contain high frequencies, and one explanation of the Gibbs phenomenon is that the filter rings in response to those high frequencies.
Another way of looking at the issue is to consider that windowing restricts the number of samples the filter can see and transform theory suggests that also reduces the frequency resolution of the filter, in the same way that putting a small block into a Discrete Fourier Transform produces a small number of coefficients.
Fig.4 shows the effect of different numbers of points on the Gibbs phenomenon. Increasing the number of points has made the rate of cut-off steeper because the frequency resolution of the filter has gone up in proportion. For the same reason the frequency difference between successive ripples becomes smaller.
Not surprisingly it was found that tapering the coefficients towards the ends of the window gave better results as this has the effect of softening the discontinuities and reducing the ringing. Whilst FIR filters can ring, they only contain forward signal paths and have no feedback. Accordingly they are unconditionally stable, cannot oscillate and must return to zero output within one window width of the input being muted.
The filter design process consists of selecting an impulse response and a window function and multiplying the two together to obtain coefficients for the multipliers. In practice the coefficients will need to be normalized so that the passband of the filter has unity gain.
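That recipe can be sketched directly: sample an ideal low-pass impulse response (a sinc), multiply it by a window function (a Hamming window here, one of many possible choices), and normalize so the passband has unity gain. The function name and parameters are illustrative.

```python
import math

# Window-method FIR design: ideal impulse response x window, normalized.
def design_fir(num_taps, cutoff):
    """cutoff is a fraction of the sampling rate (0 to 0.5)."""
    m = num_taps - 1
    coeffs = []
    for n in range(num_taps):
        x = n - m / 2.0
        # Ideal low-pass impulse response (sinc), centred in the window.
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window tapers the coefficients towards the ends,
        # softening the discontinuity of simple truncation.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / m)
        coeffs.append(ideal * w)
    gain = sum(coeffs)  # DC gain before normalization
    return [c / gain for c in coeffs]

taps = design_fir(21, 0.25)   # half-band filter, 21 points
print(sum(taps))              # ~1.0: unity gain in the passband
```

Note that the resulting coefficients are perfectly symmetrical about the centre tap, which is what gives the filter its linear phase.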
There are many such window functions, most of which are variations on a curve that is somewhat Gaussian or bell-shaped, and it would be tedious to describe them all here. There is no single ideal window, and the one chosen should reflect the best compromise between good stop band attenuation, which may be important to avoid aliasing, the steepness of the filter cut-off and the amount of pass band ripple, which could impair sound quality.
In the stop band, attenuation is obtained when positive and negative products are summed and cancel out. If the coefficients are not sufficiently accurate, the products will be inaccurate and the attenuation will be poor. FIR filter economics is two-dimensional: the window size, or number of points, determines the number of multiplications per sample, while the stop band performance determines the coefficient word length. The filter will always be a compromise between performance and cost.
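The cancellation argument can be demonstrated numerically. The sketch below, with illustrative function names, evaluates a filter's response directly from its definition and rounds coefficients to a short fixed-point word length; a 4-point moving average is used because it has an exact stop-band null at one quarter of the sampling rate.

```python
import math

# Stop-band attenuation depends on positive and negative cross products
# cancelling; rounding coefficients to a short word spoils the cancellation.
def response_at(coeffs, freq):
    """Magnitude of the filter's frequency response at freq, where freq
    is a fraction of the sampling rate."""
    re = sum(c * math.cos(2 * math.pi * freq * n) for n, c in enumerate(coeffs))
    im = sum(c * math.sin(2 * math.pi * freq * n) for n, c in enumerate(coeffs))
    return math.hypot(re, im)

def quantize(coeffs, bits):
    """Round each coefficient to a fixed-point word of the given length."""
    scale = 2.0 ** (bits - 1)
    return [round(c * scale) / scale for c in coeffs]

avg = [0.25, 0.25, 0.25, 0.25]   # 4-point moving average
print(response_at(avg, 0.25))    # ~0: the products cancel at fs/4
print(quantize([0.3], 4))        # [0.25]: the 4-bit word can't hold 0.3
```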
Optimization of filters has a long history and began way before the digital era. The Remez Exchange Algorithm was published in 1934 and was an iterative approach that converged on the optimal design. The delays needed between the taps of an FIR filter are difficult to implement in the analog domain, especially in the case of audio, which covers a wide range of octaves. Delay is trivially easy to implement in the digital domain with no loss of quality so it was inevitable that the FIR filter would increase in importance. That led to the Parks-McClellan algorithm of 1972, which was optimized to design FIR filters.
The falling cost of processing power famously described by Moore's Law applies to digital filters, making complex filters more economical to implement as time goes by.
When the impulse response is perfectly symmetrical, coefficients at equal distances left and right of the center will be the same. Instead of repeating the same multiplication twice, the filter can be folded so that the product is calculated once but used twice. This approximately halves the number of multiplications per sample. Fig.5 shows a folded filter in which three of the products are used twice. This filter has been simplified for explanatory purposes. A real audio filter would need many more points.
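The folding trick amounts to pre-adding each pair of samples that share a coefficient before the single multiplication. The following sketch, with illustrative names, computes one output sample both ways to show they agree while using roughly half the multiplications.

```python
# Folded FIR: with a symmetrical impulse response, samples equidistant
# from the centre share a coefficient, so they are added first and
# multiplied once.
def folded_fir_output(window, half_coeffs):
    """One output sample from a full window of input samples, using only
    the first half (plus centre) of the symmetric coefficient set."""
    n = len(window)
    half = n // 2
    acc = 0.0
    for i in range(half):
        # Pre-add the pair of samples that share coefficient i.
        acc += half_coeffs[i] * (window[i] + window[n - 1 - i])
    if n % 2:
        acc += half_coeffs[half] * window[half]  # centre tap is unpaired
    return acc

full = [0.1, 0.2, 0.4, 0.2, 0.1]        # symmetric impulse response
window = [1.0, 2.0, 3.0, 4.0, 5.0]      # samples currently in the window
unfolded = sum(c * t for c, t in zip(full, window))   # 5 multiplications
folded = folded_fir_output(window, [0.1, 0.2, 0.4])   # 3 multiplications
print(abs(folded - unfolded) < 1e-12)   # True: same result, fewer products
```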