Production & Post Global Viewpoint – November 2016

Adobe MAX 2016 Teases New Technology

Adobe holds a yearly confab called Adobe MAX. The event is used to announce updates and new features for its premier product, Creative Cloud. The conference is also used to tease the audience with demonstrations of creative capabilities that may or may not eventually become Adobe products.

This year’s event was held in San Diego November 2-4 and attended by some 10,000 creatives. The attendees come to see and hear a range of experts working in the creative spaces. It is a learn-from-the-experts kind of experience. Filmmaker Quentin Tarantino and photographer Lynsey Addario were two of the keynote speakers. For those unable to attend, the keynote presentations were recorded and are available for on-demand viewing at MAX online.

As previously mentioned, Adobe uses MAX to tempt the audience with demonstrations of newly developed technology that may eventually become Adobe products. This year’s event featured 11 such demonstrations, and they proved to be big hits with the audience.

Photoshopping voiceovers

Admittedly, I am not familiar with most of the Adobe Creative Cloud intricacies. However, there was one demonstration that blew me away. Even better, the tool appears so easy to use that even I could edit the audio on a video track.

Adobe calls the technology “Photoshopping Voiceovers,” or #VoCo. It was one of 11 experimental technologies demoed at Adobe MAX 2016.

Consider this scenario. Your producers complete a video project and forward it to the client for approval. However, the client asks for a few minor changes to the voiceover. Unfortunately the voiceover artist has long left the scene and is on her way to a destination wedding.

Thanks to the Adobe technology “Photoshopping Voiceovers,” your producer could complete the project.

#VoCo first creates real-time text from the captured audio. The text is displayed on screen alongside the traditional audio waveform one would expect to see on a digital audio workstation. Each word is vertically aligned with the portion of the waveform it belongs to, so the editor can see the dynamics of the audio and where each word falls with respect to those visual cues.
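For readers who think in code, the word-to-waveform alignment described above can be pictured as a list of words, each carrying a start and end time. The sketch below is only an illustration of that idea, not Adobe's implementation: the AlignedWord class, the word_at helper and every timestamp are hypothetical values made up for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class AlignedWord:
    """One transcribed word and the time span (in seconds) it occupies."""
    text: str
    start: float
    end: float


# Hypothetical alignment for the start of a voiceover; the times are invented.
words = [
    AlignedWord("I", 0.00, 0.15),
    AlignedWord("awoke", 0.15, 0.60),
    AlignedWord("kissed", 0.80, 1.20),
    AlignedWord("my", 1.20, 1.35),
    AlignedWord("dog", 1.35, 1.80),
]


def word_at(aligned, t):
    """Return the word being spoken at time t, or None during silence."""
    starts = [w.start for w in aligned]
    i = bisect_right(starts, t) - 1  # last word that starts at or before t
    if i >= 0 and aligned[i].start <= t < aligned[i].end:
        return aligned[i]
    return None


print(word_at(words, 0.30).text)  # awoke
print(word_at(words, 0.70))       # None (pause between "awoke" and "kissed")
```

With data in this shape, an editing screen can draw each word directly above its stretch of waveform, which is the kind of display the demonstration showed.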

#VoCo allows an editor to change words in a voiceover simply by moving words in the sentence, or even adding unspoken words to the sentence. Once the desired copy text is entered, #VoCo creates new audio to match exactly those words. And the sound of the created words matches exactly the sound of the original speaker’s voice. You have to hear it to believe it.

Creating audio from text

The #VoCo demonstration was performed by Adobe Creative Technologies Lab Intern Zeyu Jin. His demonstration of turning text into audio drew crowd applause. Although Zeyu mentioned audio books as one example of how the technology could be used, I can imagine hundreds more.

Here is an example of how the tool might be used.

Editing the audio of a voiceover is as simple as cutting and pasting words in a sentence. Once the audio is captured, the accompanying text is displayed below the traditional audio waveform. To change the audio, simply cut and paste words until the sentence reads the way you want. Here is an example from Zeyu’s demonstration.

His original recorded audio: “I awoke, kissed my dog, then kissed my wife.”

With two quick text edits, exchanging “wife” for “dog”, the audio becomes, “I awoke, kissed my wife, then kissed my dog.” Again, there is no waveform scrubbing or audio expertise needed. The edits are simple text changes to a sentence.
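The word swap above can be sketched in a few lines, under the assumption that the software already knows which stretch of audio belongs to each word. This is a toy illustration and not Adobe's method: swap_words is a hypothetical helper, the "samples" are single characters standing in for real audio samples, and the span positions are invented.

```python
def swap_words(samples, spans, first, second):
    """Return a new sample list with the audio for two words exchanged.

    samples: list of audio samples (characters here, as a stand-in).
    spans:   list of (word, start, end) index ranges, sorted and non-overlapping.
    """
    i = next(k for k, (w, _, _) in enumerate(spans) if w == first)
    j = next(k for k, (w, _, _) in enumerate(spans) if w == second)
    if i > j:
        i, j = j, i  # make sure span i comes earlier in the audio
    _, s1, e1 = spans[i]
    _, s2, e2 = spans[j]
    # Rebuild: audio before, later word, audio in between, earlier word, audio after.
    return (samples[:s1] + samples[s2:e2] + samples[e1:s2]
            + samples[s1:e1] + samples[e2:])


# "D" marks the samples for "dog" and "W" the samples for "wife".
samples = list("..DDD--WW..")
spans = [("dog", 2, 5), ("wife", 7, 9)]

edited = swap_words(samples, spans, "dog", "wife")
print("".join(edited))  # ..WW--DDD..
```

A real tool would also have to smooth the joins between segments so the cut points are not audible; the sketch only shows that the text edit and the audio edit can be the same operation.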

Now here is where the Adobe program really gets interesting.

Zeyu demonstrated how the program could create words that were not even spoken. The original text was, “I awoke, kissed my dog, then kissed my wife.” Zeyu now changed the sentence to say, “I awoke, kissed my dog three times, then kissed my wife.” Note he added the phrase, “three times.”

When the audio was played, it perfectly said, “I awoke, kissed my dog three times, then kissed my wife.”

That phrase was not spoken by the actor, but rather entirely created by the Adobe software. The result was indistinguishable from the original actor’s speech. It absolutely sounded authentic. 

Zeyu said #VoCo required about 20 minutes of a person’s voice before it could create new and unspoken words from written text. He also said the audio could be watermarked so edits could be detected. 

Unfortunately, #VoCo is not yet an available program—and may never become an Adobe product. I’m betting it does make it into Adobe’s Creative Cloud because it solves a common production problem.

I could go on about the other demonstrations and teased technology, but it is better that you view them for yourself. All of the Adobe MAX conference demonstration videos can be found online.

