It seems almost superfluous today to specify that audio is digital, because most audio capture, production and distribution is now done numerically. This was not always the case. At one time audio was primarily done without the help of numbers, and the term digital audio was introduced to distinguish the new technology from what went before.
Fundamentally, sound is transient. The equilibrium of the air is disturbed, but then it returns to normal. Sound cannot therefore be recorded or distributed in its original form, but instead has to be expressed as an analog that can. Early versions of that included writing and Morse code, which allowed words to be recorded and transmitted, and sheet music, which was an early form of MIDI, where dots on paper could control an orchestra, using a guy with a baton as a sync pulse generator.
Words change their meaning and analog is a good example of that. Originally an analog was the representation of one thing by another. In audio there must be two analogs. One of them is the pressure or velocity of the air and the other is time. The electrical signal from a microphone is in real time, but all recording media must have a means of reproducing the recorded sound in the original time frame.
Fig.1 shows the various analogs used in audio. The movement of the air can be represented by an electrical voltage on a wire, the frequency or the amplitude of a radio wave, the displacement of a groove in a cylinder or disk, the strength of magnetization along a tape, the variation in light transmission of a film or by the magnitude of a series of numbers. The time axis is taken care of in disks, tape and film by having the medium move at steady rotational or linear speed. In numerical systems, the magnitude of the number is changed at some constant clock rate.
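The numerical analog can be illustrated with a few lines of Python. This is only a minimal sketch: the 48 kHz clock rate, the 1 kHz test tone, the 16-bit word length and the function name sample_tone are all assumptions chosen for illustration, and the sketch leaves out the filtering and dither that a properly implemented converter needs.

```python
import math

SAMPLE_RATE_HZ = 48_000  # assumed constant clock rate
TONE_HZ = 1_000          # assumed test tone standing in for the sound
BITS = 16                # assumed word length


def sample_tone(num_samples, sample_rate_hz=SAMPLE_RATE_HZ,
                tone_hz=TONE_HZ, bits=BITS):
    """Express a sine wave as a series of integers taken at a constant rate."""
    full_scale = 2 ** (bits - 1) - 1  # largest positive value, 32767 for 16 bits
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # the time axis is implied by the sample index
        pressure = math.sin(2 * math.pi * tone_hz * t)  # stands in for air pressure
        samples.append(round(pressure * full_scale))    # magnitude as a number
    return samples


# One cycle of a 1 kHz tone takes 48 samples at 48 kHz.
samples = sample_tone(48)
```

Note that the list holds only magnitudes. The time axis is not stored at all; it is recreated on playback by presenting the numbers at the same constant clock rate.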
So in that original sense, digital audio is just another analog of the sound waveform. It might equally have been called numerical audio, as indeed the French do call it, or perhaps discrete audio. Before digital audio, there was just audio, but once digital audio arrived, its forerunner had retrospectively to be given a name. Unfortunately that name was analog. It was unfortunate because it suggests that audio properly expressed as a series of numbers is not an analog and not continuously variable.
But there it is, and it was not long before people started using the word to mean "not digital". A watch with hands rather than a numerical display became an analog watch, even though the hands jumped every second and it was obviously discrete in operation. Movie cameras using film became analog cameras, even though they captured discrete frames. NTSC TV sets became analog TVs, even though the picture was broken up into frames and lines.
As a consequence when I read the word analog today, I can't be sure what the writer meant. I prefer not to use it, in case readers attribute some new meaning I didn't intend.
The words we choose to use have no effect on nature and, as will be explained in due course, properly implemented digital audio is an analog of a sound waveform and it is continuously variable. Not only that, but it also allows a more accurate and more continuously variable representation of a sound waveform than any previous technology could. Whilst digital audio allows great accuracy, it does not guarantee it and where the pressures of economy are unfettered, or where there is insufficient design skill, digital audio can also sound pretty awful.
Fig.1. The various analogs used to preserve an audio waveform. One of the most successful analogs is to express the waveform using a series of numbers.
There are various aspects of digital audio that distinguish it from prior technologies and these are shown in Fig.2. In the case of sound quality, it is possible to compare the quality of digital audio with the quality of prior media. However, in a further aspect, which we might call freedom, no comparison is possible because prior technologies simply could not offer the facilities available with digital audio.
The freedom also overlapped into economics, where new and previously impractical techniques could save money. Although there will be much more to say about these aspects, they can briefly be introduced here.
In all prior audio media, the sound quality was defined by the way the storage medium worked. Failure to maintain constant speed would introduce wow and flutter. Vinyl discs had tracing distortion, surface noise and scratches; tapes had hiss, dropouts and modulation noise, and so on. These impairments could be recognized, so that all legacy media had a characteristic sound.
Digital audio is not like that, because what is recorded on the medium are data, typically binary and, thanks to error correction, the data that are recovered are bit-for-bit identical to those that were recorded. Thanks to time base correction there is no wow and flutter. The words have practically disappeared from the language.
In that sense digital audio media have no sound quality; the sound quality has become independent of the medium. Instead the quality of a recording depends only on the quality of the converter that was used to express the waveform as numbers. Clearly if we wish fully to enjoy that recording, the conversion back again needs to be no worse.
There is one major exception to the previous paragraph, and that is where lossy compression has been used. In that case the converter and the medium are substantially blameless and the compressor determines the sound quality.
Once audio signals have become data they are almost indistinguishable from any other type of data, such as images, text or computer instructions and can thus be stored on any suitable medium. As the storage is bit-accurate, there is no generation loss when data are transferred from one medium to another or through networks. The only difference is that audio data should be reproduced on a certain time axis.
In prior audio media, there was always a quality loss on playback and this would build up to become generation loss when copies were made. As a result recorders for professional use had to be over-specified on their first generation quality so that they would be adequate after several generations. Professional machines were therefore larger and more expensive than consumer machines and used greater quantities of media.
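The difference between the two kinds of copying can be sketched numerically. The following is an illustration under stated assumptions, not a model of any real recorder: the legacy medium is represented by nothing more than a small amount of additive Gaussian noise per copy, and the names analog_copy, digital_copy and rms_error are hypothetical.

```python
import math
import random


def analog_copy(signal, noise_rms, rng):
    """Copy through a noisy medium: every sample picks up a small random error."""
    return [s + rng.gauss(0.0, noise_rms) for s in signal]


def digital_copy(data):
    """Copy numerical data: every value is reproduced exactly."""
    return list(data)


def rms_error(reference, copy):
    """Root-mean-square difference between a copy and the original."""
    total = sum((x - y) ** 2 for x, y in zip(reference, copy))
    return math.sqrt(total / len(reference))


rng = random.Random(0)  # fixed seed so the run is repeatable
original = [math.sin(2 * math.pi * n / 48) for n in range(4800)]

analog = original
digital_master = [round(s * 32767) for s in original]  # assumed 16-bit samples
digital = digital_master

analog_errors = []
for generation in range(5):
    analog = analog_copy(analog, noise_rms=0.01, rng=rng)  # assumed noise level
    digital = digital_copy(digital)
    analog_errors.append(rms_error(original, analog))

# The error in the noisy copies accumulates from one generation to the next,
# while the fifth-generation digital copy still equals the master.
print(analog_errors)
print(digital == digital_master)
```

In this sketch the error of the noisy copies grows with each generation because every pass adds fresh noise to what is already there, whereas the numerical copy is unchanged however many times it is made.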
The adoption of digital audio gave an economic advantage: in the absence of generation loss the over-specification could be reduced and the purchase and running costs could come down.
There is no generation loss in digital recording, but manipulation of audio data for production purposes does introduce a slight loss. Production equipment would typically have a higher dynamic range than consumer equipment to allow for uncertain levels when recording.
Further economies were realized because digital technology allows facilities such as random access, which speeds up processes such as editing and simplifies or automates playout for radio stations. To contrast random access with the previous technologies that were linear, the term non-linear was adopted. Fortunately that applied to the time axis and not to the transfer function.
Fig.2. Legacy audio can partially be compared with numerical audio, for example in the areas of sound quality and generation loss. In other areas, such as access time, no comparison is possible.
Finally, digital technology requires little or no maintenance or adjustment and those who previously did that for a living were shown the door.
There is a third aspect of digital audio in Fig.2 that differs from what went before. It requires no great background knowledge to understand how a vinyl disc works as it can be grasped by looking at the grooves through a magnifying glass. In that respect the vinyl disc resembles the steam locomotive; the way it works is immediately obvious to the onlooker from the rods and cranks that are on display.
Digital audio is not like that. It’s more like nuclear power, in that not only is there nothing to see, but what is going on cannot be described in simple terms and becomes a source of fear. It is just not possible to discuss anything meaningful about digital audio with someone lacking a certain technical background. The coherence of an argument is irrelevant if it is not understood.
The unfortunate result was that journalism did what it always did, and provided something plausible whether it was correct or not, and the consumer did what he always did, which was to prefer to believe something incorrect rather than not to know.
The raft of half-truths, mythology, pseudoscience and downright nonsense that emerged over digital audio was as breathtaking as it was disappointing. Naturally enough, manufacturers of legacy audio equipment made unwarranted attacks on digital audio. None of this had any serious effect, because the technology spoke for itself.
I remember the first time I heard a Compact Disc. It would have been in 1981 somewhat before the official launch. It was a revelation, and so obviously obsoleted everything that had gone before that it was no contest. I have been associated with digital audio ever since and over the decades I have learned a few things and found ways of explaining the seemingly incomprehensible.
I have also taken pleasure in exploding a few myths along the way. But myths are like horoscopes; there’s an insatiable demand and as long as my supply of dynamite holds out I won’t be changing my lifestyle any time soon.