On Recording the Human Voice

Recording the human speaking voice can be one of the trickiest tasks a professional sound recordist encounters. Even when working with seasoned professional voice artists, problems can creep in. Here are a few of them and how to solve them.

Let’s begin by clarifying one thing. I don’t mean a singing vocalist, but a speaking voice — perhaps an announcer, a narrator, a person doing a commercial or even someone recording a book on tape. We are talking everyday speech.

First of all, the voice must sound natural and be clear and understandable. This is not music. Special efforts to manipulate the voice are not allowed. No masking is allowed either. We are talking about the purity of the human voice here.

Courtesy Alt Recording Studios.

Normally, in such situations, the voice talent sits or stands in front of a microphone in a treated studio or voice-over booth. It can be at a broadcast station, a recording studio or even on-location with the proper acoustic treatment. The problems we must deal with here occur even under ideal recording conditions.

The first situation that can occur is sibilance, a manner of articulation of fricative and affricate consonants. Sibilance occurs when a stream of air is directed with the tongue toward the sharp edge of the teeth. It causes a sibilant — or strident — sound.

Sibilance is an unpleasant tonal harshness that can happen on consonant sounds (like S, T and Z), caused by disproportionate audio dynamics in upper midrange frequencies. Sibilance is often centered between 5 kHz and 8 kHz, but can occur well above that frequency range.

This problem is usually caused by the actual vocal formant, but can also be exaggerated by microphone placement and technique. Every human voice is different, so don’t presuppose that anything you’ve tried before will or will not work again. It’s all up for grabs.

The best way to start is to leave some space, about 12 to 18 inches, between the speaker and the microphone. Forget a pop filter here; it won’t help. Once you find a suitable microphone and distance combination that reduces sibilance, angle the microphone downward 10 to 15 degrees so it points toward the throat instead of directly at the mouth. Another good tip is to change out the type of microphone. Dynamic mics often work in these situations when condensers don’t.

If electronics are required, de-essers are the tools of choice. The de-esser technique typically uses a narrow peak EQ in the sidechain to boost the most offensive sibilant frequencies. This EQ exaggerates the dynamic difference between the sibilant band and the rest of the vocal waveform, making it much easier to achieve gain reduction during those consonants.
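As a rough illustration of that sidechain idea, here is a minimal Python sketch. It is not how any particular commercial de-esser works. The function names are made up for this example, and the 6.5 kHz center frequency, Q, 12 dB boost, threshold, ratio and attack/release times are all assumed values. The peak EQ uses the widely published "Audio EQ Cookbook" biquad formulas. Note that the boosted signal is only used to decide when to turn the voice down; it is never heard.

```python
import math


def peaking_eq_coeffs(fs, f0, q, gain_db):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook), normalized so a0 == 1."""
    amp = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha / amp
    b = ((1 + alpha * amp) / a0, -2 * math.cos(w0) / a0, (1 - alpha * amp) / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha / amp) / a0)
    return b, a


def biquad(samples, b, a):
    """Run a direct-form I biquad over a list of float samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def de_ess(samples, fs, f0=6500.0, q=2.0, boost_db=12.0,
           threshold=0.25, ratio=4.0, attack_ms=1.0, release_ms=50.0):
    """Broadband de-esser sketch: all parameter defaults are illustrative assumptions."""
    # Sidechain: a copy of the voice with the sibilant band exaggerated.
    b, a = peaking_eq_coeffs(fs, f0, q, boost_db)
    sidechain = biquad(samples, b, a)

    # One-pole smoothing coefficients for the envelope follower.
    att = math.exp(-1.0 / (fs * attack_ms / 1000.0))
    rel = math.exp(-1.0 / (fs * release_ms / 1000.0))

    env = 0.0
    out = []
    for x, sc in zip(samples, sidechain):
        level = abs(sc)
        coeff = att if level > env else rel
        env = coeff * env + (1 - coeff) * level
        if env > threshold:
            # Compressor law: level above threshold is scaled down by the ratio.
            gain = (env / threshold) ** (1.0 / ratio - 1.0)
        else:
            gain = 1.0
        # Gain is applied to the original voice, not the boosted sidechain.
        out.append(x * gain)
    return out


if __name__ == "__main__":
    fs = 48000
    # Two test tones at the same level: one low "vowel-like", one in the sibilant band.
    low = [0.2 * math.sin(2 * math.pi * 200 * i / fs) for i in range(4800)]
    ess = [0.2 * math.sin(2 * math.pi * 6500 * i / fs) for i in range(4800)]
    for name, sig in (("200 Hz", low), ("6.5 kHz", ess)):
        peak = max(abs(v) for v in de_ess(sig, fs)[2400:])
        print(name, "peak after de-essing:", round(peak, 3))
```

Both test tones go in at the same level, yet only the 6.5 kHz tone pushes the boosted sidechain over the threshold, so only it gets turned down. That is the "exaggerated dynamic difference" at work. In a real session the center frequency and threshold would be tuned by ear to the individual voice.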

Another vocal issue that can develop into a problem is plosives: blasts of air produced by certain consonant sounds, usually heard on words with Ps and Bs. This is where a pop shield does help. Position it a couple of inches from the mic and cross your fingers.

Plosives can be especially bad with cardioid or hypercardioid mics and can cause the diaphragm to bottom out, hit the backplate insulator and cause mechanical clipping, which can ruin a recording. In this case, try a mic with an omnidirectional pickup pattern, which can lessen the effect. Sometimes, though, plosives are unavoidable.

If electronics are needed to fix plosives, try iZotope’s De-Plosive module in RX Advanced. As with all such problems, though, it is best to solve them in the recording session rather than depend on electronic solutions in post.

Finally, there are the assorted pops, clicks, smacks, swallows and other odd sounds that creep into human speech, especially when the talent is not a top-tier trained voice-over artist. They can happen at any moment and often test the skill set of the engineer doing the recording session.

These oddball sounds fall under the idiosyncrasies of human speech. Dealing with them depends more on the talent and professionalism of the person doing the recording than on anyone else. It may involve working with the voice talent to address the problem and making sure the person is well hydrated before the recording session. It is always good to have hot tea, lemon and honey on the set to help soothe the voice.

Of course, switching mics and other gear can help, but in the end iZotope’s RX Advanced and modules from Waves can also help save the day. Editing with these tools has become the go-to fix for many tiny, indescribable problems.

Recording the human voice has never been easy. It tests the skills of every recordist. When you think you’ve seen it all, there is something new waiting in the wings to test you again.
