Crossing the Uncanny Valley with Volumetric Capture

Microsoft has adopted technology from IO Industries to raise Virtual Reality production to new heights of realism and flexibility.

The technology of video production, especially as it relates to VR (Virtual Reality) and AR (Augmented Reality), is about to take a giant leap forward. In fact, it’s about to jump a valley—the famous uncanny valley.

The concept of the “uncanny valley” was identified by the robotics professor Masahiro Mori as “bukimi no tani genshō” (不気味の谷現象) in 1970. The term was first translated as “uncanny valley” in the 1978 book Robots: Fact, Fiction, and Prediction, written by Jasia Reichardt. It has since become a guiding warning for animators, CG artists and computer game creators that, just as Commander John J. Adams (Leslie Nielsen) said to Altaira (Anne Francis) at the end of 1956’s “Forbidden Planet”, “We are, after all, not God.”

The abyss of the uncanny valley. “Bunraku” refers to large Japanese puppets manipulated by puppeteers.

As this graph shows, crafting an imitation of life-like humans can be entertaining, up to a point. But if you come too close, and there is still something unidentifiable missing, it becomes creepy.

Uncanny.

That’s why, up to now, most motion capture approaches, no matter how pseudo god-like, have been better at creating Gollum than Swan Lake.

But the world of VR is changing.

Microsoft has been pursuing the concept of Volumetric Capture for the past seven years. “The Microsoft Research Division had been working on holographic, free-viewpoint 3D capture video for some time, with the idea of using it in their HoloLens head-mounted display, or Windows Mixed Reality platform,” Andrew Searle, sales manager at IO Industries, explained to me. “But about two years ago they ran into a roadblock with the scalability of their capture platform, so they reached out to us.”

IO Industries’ multiple high-speed cameras and capture systems had already proven successful in scientific and research applications such as fluid dynamics, aerospace airfoils, and weapons testing.

But when I asked Searle if what IO Industries and Microsoft were working on could be compared to the famous “bullet time” or “time slice” shot from 1999’s “The Matrix”, he acknowledged the similarities but used the differences to define the new technique.

“In ‘The Matrix’, a ramp of cameras was set up around computer hacker Neo (Keanu Reeves) inside a circular green screen,” he explained, “and when Neo jumps they all capture one picture simultaneously. When merged together, all those images spin around Neo in one instant of time. With Volumetric Capture, this is done with moving video from multiple cameras in real time, so the action is not frozen.”

This lets the director move the viewpoint of the audience as the action demands, while maintaining the illusion of 3D depth.

Another useful reference is today’s motion capture techniques, in which multiple cameras record actors from different angles, often while they wear special suits. The views are combined inside a computer to create a skeletal model of the person, onto which a CG character can be modeled.

As we have seen in many brilliant motion capture films, this can look highly realistic for fantasy characters, but never quite real enough to represent actual humans.

Hence the uncanny valley.

You can even interact with VR images created by Volumetric Capture on a 2D screen.

“Since Microsoft’s Volumetric Capture is recording the entire actor from all angles, all the muscles, all the facial expressions, all the soft tissue movements, we can cross that realism gap,” Searle told me.

Microsoft came to IO Industries for several reasons: they wanted a system that could be scaled from small studios to large installations, they needed a recording technology that could handle input from 100 or more cameras, and they needed cameras that could be synchronized to within a microsecond of each other.
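To make the microsecond requirement concrete, here is a minimal sketch of how one might check that every camera exposed a given frame within a one-microsecond window. The function names and the sample timestamps are hypothetical and purely illustrative; the article does not describe how IO Industries or Microsoft actually verify synchronization.

```python
# Hypothetical check: are all cameras' timestamps for one frame
# within a given tolerance of each other? Times are integer
# nanoseconds to avoid floating-point rounding.

def max_skew_ns(timestamps_ns):
    """Return the spread between the earliest and latest timestamp."""
    return max(timestamps_ns) - min(timestamps_ns)

def is_synchronized(timestamps_ns, tolerance_ns=1_000):
    """True if all timestamps fall within tolerance_ns (default 1 microsecond)."""
    return max_skew_ns(timestamps_ns) <= tolerance_ns

# Made-up timestamps for one frame from four cameras.
frame_timestamps_ns = [1_000_000_000, 1_000_000_350, 1_000_000_720, 999_999_900]

skew = max_skew_ns(frame_timestamps_ns)
print(f"skew: {skew} ns, synchronized: {is_synchronized(frame_timestamps_ns)}")
```

With these invented numbers the spread is 820 nanoseconds, which is inside the one-microsecond window.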

IO Industries' DVR Express raw recorders offer LTC timestamp, frame triggering and configurable SSD storage.

The system employs one additional rather clever trick. Only half of the cameras are outputting RAW, RGB color video. The other half are recording near-infrared images of the tiny dots projected onto the actors’, props’ and objects’ forms by several laser pattern projectors mounted all around the set.

“These overlapping near-IR cameras give us the texture and depth analysis that the Microsoft processing algorithm needs to provide the ultimate in shape rendering,” Searle said. “Motion capture suits can only fit so many reference dots on their surface. But this technique of using pinpoints of laser light gives us a much higher resolution for spatial and depth localization.”

The Victorem camera uses advanced global shutter CMOS sensors with high dynamic range.

Searle wanted to be clear during our interview that Volumetric Capture is the result of almost seven years of research by the Microsoft Research Division, and that Microsoft came to IO Industries about two years ago for some key enabling technologies.

“They are using our Victorem high speed cameras,” he said. “Within a small form factor, they are capable of recording 2K/4K/HD/UHD and non-standard video formats with shutter speeds in a fraction of a second.”

Most importantly, a common Volumetric Capture configuration uses 100 or more cameras for each take, fully shutter-synchronized with microsecond-level accuracy and recording unprocessed RAW images. Together they generate over 10 Gigabytes of data per second.
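A rough back-of-the-envelope calculation shows how a figure of this size can arise. The sketch below multiplies camera count, frame size, bit depth and frame rate. Only the camera count of about 100 comes from the interview; the resolution (2048 x 2048), the 8 bits per pixel and the 30 frames per second are assumptions chosen for illustration, not specifications confirmed by IO Industries or Microsoft.

```python
# Illustrative estimate of the aggregate uncompressed data rate.
# All parameter values except the camera count are assumptions.

def aggregate_rate_gb_per_s(cameras, width, height, bits_per_pixel, fps):
    """Return the combined raw data rate in gigabytes (10**9 bytes) per second."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return cameras * bytes_per_frame * fps / 1e9

rate = aggregate_rate_gb_per_s(
    cameras=100,        # "100 or more" per the article
    width=2048,         # assumed sensor width in pixels
    height=2048,        # assumed sensor height in pixels
    bits_per_pixel=8,   # assumed RAW bit depth
    fps=30,             # assumed frame rate
)
print(f"{rate:.1f} GB/s")  # prints "12.6 GB/s"
```

Under these assumptions each camera contributes roughly 126 megabytes per second, so the rig as a whole lands in the same range as the figure quoted above. Higher bit depths or frame rates would push it higher still.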

Then, to wrangle all that data, IO Industries’ StudioCap management software merges these files so they can be fed into Microsoft’s own processing engine, which outputs them as real-time, moving 360-degree imagery.

StudioCap management software gives you live video streaming with focus assist and production database integration.

Microsoft is not the only tech firm going after this Volumetric Capture concept for future production, but they are definitely at the forefront.

And making great progress in both VR and AR production using it.

“We’re going to be on the other side of the uncanny valley any time soon,” Searle concluded.

In August, a new studio called Metastage providing Volumetric Capture capabilities based on the Microsoft approach to VR and AR was opened at Culver Studios in Culver City, California. This joins Microsoft’s own Mixed Reality Capture Studios in San Francisco and Redmond, Washington. The system has also been licensed to London’s Dimension Studios in the UK.