In this series of articles, we explain broadcasting for IT engineers. Television is an illusion: there are no moving pictures, and today’s broadcast formats depend heavily on decisions engineers made in the 1930s and 1940s. In this article, we investigate analogue audio and its importance in television.
Without vision, television is radio. Audio preceded television by many years and much research and development had already been completed when the first television channels were launched.
From the perspective of human psychology, audio quality is more important than video quality. If this wasn’t the case, then radio would not exist. The importance of audio is rooted deep in our subconscious mind established by our ancestors millions of years ago. If we hear a noise in the dead of night, we wake up instantly, fearing an attacker or somebody wanting to do us harm.
Distorted or interrupted audio triggers within us the same systems that woke us in the dead of night to fight or run away from danger. To our subconscious, the sound of a predator breaking a twig is no different from that of a crackling audio signal.
Generally, viewers are much more tolerant of a distorted or interrupted vision signal than they are of a distorted or interrupted audio signal. Although audio only has a bandwidth of approximately 20kHz, compared to 5, 10 or 100MHz for television video, making television audio work reliably takes a disproportionate amount of time and effort.
Hearing System is Complex
The human hearing system has three attributes that the broadcast engineer needs to be aware of and work with: frequency response, spatial awareness, and perception of loudness. Humans have an average frequency response from 20Hz to 20kHz, and the ear consists of three main parts: the outer, middle, and inner ear.
The outer ear acts as a megaphone in reverse by funneling acoustic sound waves to the inner ear via the middle ear.
Our brains require the hearing system to create electrical impulses that replicate the sound waves in the air. This is achieved by allowing a fluid in the cochlea of the inner ear to vibrate over hairs within the cochlea in sympathy with the sound to create the required electrical impulses.
Diagram 1 – A schematic of the main components of the human hearing system. The function of the ear is to turn sound waves in the air into electrical impulses for the brain to process.
The middle ear acts as an interface to convert the air-sound waves of the outer ear into the liquid sound waves of the inner ear.
The eardrum of the middle ear connects to a small oval window on the cochlea via the three bones of the middle ear. They allow the vibrations established in the eardrum to transfer to the cochlea thus providing the air-wave to liquid-wave conversion.
Human Impedance Matching
Because the liquid in the cochlea is more difficult to move than air, the middle ear also amplifies the sound, performing impedance matching between the air-waves of the outer ear and the liquid-waves of the cochlea in the inner ear.
It is no coincidence that we have two ears. Our brain can detect the phase and time difference between the sounds arriving at our two ears, allowing us to locate the source of a sound. Again, this is a useful feature handed down to us by our ancestors, as localization of sound allows us to pinpoint where a predator is coming from, whether from the front, back or sides.
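The timing cue described above can be put into rough numbers. The following Python sketch is only an illustration, assuming a simple path-difference model for a distant source. The ear spacing of 0.18m, the speed of sound of 343m/s and the function names are all assumptions made for this example and do not come from any standard.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: illustrative distance between the ears


def interaural_time_difference(angle_deg: float) -> float:
    """Arrival-time difference in seconds between the two ears.

    angle_deg is measured from straight ahead (0) towards one side (90).
    Simple model: extra path length = ear spacing * sin(angle).
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(angle_deg))
    return extra_path_m / SPEED_OF_SOUND_M_S


def phase_difference_deg(angle_deg: float, frequency_hz: float) -> float:
    """Phase difference in degrees between the ears for a pure tone."""
    return 360.0 * frequency_hz * interaural_time_difference(angle_deg)


for angle in (0, 30, 60, 90):
    itd_us = interaural_time_difference(angle) * 1e6
    phase = phase_difference_deg(angle, 500.0)
    print(f"{angle:>2} degrees: {itd_us:6.1f} microseconds, "
          f"{phase:5.1f} degrees of phase at 500Hz")
```

With these assumed values, a source directly to one side reaches the nearer ear about half a millisecond before the other, which gives a sense of the timing precision the hearing system works with.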
Sound Source Synthesis
Stereo recording mimics the localization function of human hearing to give the perception of distance between actors in a scene, or instruments in an orchestra. The most basic stereo recording uses two microphones spaced apart to record two tracks. On replay, two loudspeakers are placed in a room and each microphone track is fed to its own loudspeaker, resulting in a stereo image that gives the perception of physical distance in a recording.
Adding more and more microphones, with suitable spacing and sound mixing, gives better spatial awareness for the listener. In a recording of an orchestra, each group of instruments can be pin-pointed using just two loudspeakers.
Increasing the number of microphones and loudspeakers allows sound engineers to give greater spatial awareness to a sound image. In a 5.1 surround-sound system, sound sources can appear to be behind the listener, and adding loudspeakers above and below the listener gives the perception of sounds above or below. This is exceptionally useful in action-packed programs where the director wants to give the perception of depth to their program or film.
Diagram 2 – The human hearing system detects the source of a sound by determining the phase difference between the two ears. Anybody with only one functioning ear will find it difficult to locate the spatial source of a sound; a challenge when crossing the road.
Although the human hearing system has a frequency response of 20Hz to 20kHz, the response is not linear, and our sensitivity varies with the level of the sound. This is referred to as “loudness”.
At low sound levels, human hearing is less sensitive at lower frequencies. But as we increase the volume and play a recording louder, our ability to resolve the lower frequencies improves and we become more sensitive to them.
The change in sensitivity is a function of the human hearing system. As the volume level increases, it’s our response that changes, not the actual frequency components within the signal.
Home hi-fi equipment often has a “loudness” button on the control panel; pushing this boosts the bass frequencies at lower volume levels. This is different from using bass-boost or the tone controls, because those affect frequency gain equally over the whole volume range.
Adverts Are Perceived to be Louder
Loudness is a relatively new concern in broadcast television. It has gained prominence because adverts shown during commercial breaks appeared to be louder than their measured levels suggested. In response, many countries throughout the world have introduced legislation over the past few years to restrict loudness levels and combat the perception that adverts sound louder.
Prior to the new loudness rules, sound engineers would mix their levels to a specific voltage limit measured on their audio measurement devices, often called VU or PPM meters. But these meters measured the overall voltage level rather than weighting the power at each frequency to mimic human hearing. If there was too much energy in a mix, the listener perceived the sound to be much louder than it really was, even if the voltage levels were correct.
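The gap between a voltage limit and the energy in a mix can be illustrated with two test signals. The Python sketch below is a simplified illustration only: it compares plain peak and RMS values for one cycle of a sine wave and a square wave, and it does not model the ballistics of a VU or PPM meter or any frequency weighting. The function names and the sample count are assumptions made for this example.

```python
import math


def peak_level(samples):
    """Largest absolute sample value (a crude stand-in for a voltage limit)."""
    return max(abs(s) for s in samples)


def rms_level(samples):
    """Root-mean-square level, which tracks the energy in the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


N = 1000  # assumed number of samples in one cycle

# Two signals that both reach the same peak value of 1.0
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
square = [1.0 if n < N // 2 else -1.0 for n in range(N)]

for name, signal in (("sine", sine), ("square", square)):
    print(f"{name:>6}: peak {peak_level(signal):.3f}, rms {rms_level(signal):.3f}")

difference_db = 20 * math.log10(rms_level(square) / rms_level(sine))
print(f"RMS difference: {difference_db:.2f} dB")
```

Both signals hit the same peak, yet the square wave carries about 3dB more RMS level. A real loudness meter goes further by weighting frequencies and averaging over time, which this sketch does not attempt.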
Bad Sound Causes Stress
Other factors also influence loudness measurements. Periods of low-level audio or silence, punctuated by short high-level transients, also give the perception of louder levels, a trick not missed by the sound engineers making high-impact adverts.
Sound engineering is a very difficult discipline in broadcast television. The audio must satisfy the three attributes needed by the human hearing system: frequency response, spatial awareness and perception of loudness. And to avoid causing any stress to the audience, we must not have any glitches, pops, or breaks in the transmission. Even loss of sound for one millisecond can cause distress to viewers.