Virtual Production For Broadcast: Capturing Objects In 3D

Sometimes, there’ll be a need to represent real-world objects in the virtual world. Simple objects can be built like any VFX asset; more complex ones might be better scanned as 3D objects – something some studios have begun to consider offering as a service.

Populating a virtual world with convincing objects is a lot of work. As we’ve seen, not every application of virtual production will need to create a full, three-dimensional virtual world, but when we do, the virtual art department might be faced with the need to design and build an intimidating number of assets.

It’s not always necessary for everything in a virtual world to have been built from scratch. A lot of library assets are available, albeit mainly intended for game development. While they might look great on a PlayStation, the detail and finish might not be enough to convince a movie or TV audience. Even so, depending on the nature of the object, it might make sense to purchase an existing asset and improve it as opposed to building something from scratch. That might require artists with experience in games and visual effects, although any virtual art department will need those skills anyway.

Inevitably, though, at least some assets must be built from first principles, especially when something which exists in reality must appear in the virtual world. Sometimes, vague shapes are enough to lend a little three-dimensionality to live-action background plates. Where more complex objects are involved, particularly irregular, highly detailed shapes, capturing the real-world object becomes attractive.

Traditional Modelling

The earliest computer generated imaging programs built their objects from primitives such as spheres, cylinders and cubes. By stretching, moving and rotating those primitives, and by mathematically subtracting one from another, complex modelling is possible. With that in mind, the techniques behind the light cycles of the 1982 film Tron become clear. Advantages include the fact that a sphere or cylinder has no resolution limit; it never becomes faceted. It’s an inflexible approach, though, making complex objects hard to build, and the idea of modelling surfaces defined by flat polygons quickly became popular.
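To make the idea concrete, here’s a minimal Python sketch of that kind of boolean subtraction, using signed distance functions as one convenient way to express it (the shapes and test values are purely illustrative, not how any particular era of software did it):

```python
import numpy as np

# Signed distance functions for two primitives: negative inside, positive outside.
def sphere_sdf(points, centre, radius):
    return np.linalg.norm(points - centre, axis=-1) - radius

def box_sdf(points, half_extents):
    q = np.abs(points) - half_extents
    return (np.linalg.norm(np.maximum(q, 0.0), axis=-1)
            + np.minimum(np.max(q, axis=-1), 0.0))

# Boolean subtraction: keep the box, carve the sphere out of it.
def subtract(d_keep, d_cut):
    return np.maximum(d_keep, -d_cut)

# Test a point that lies inside the box but also inside the sphere,
# so the subtraction removes it from the final solid.
p = np.array([[0.4, 0.0, 0.0]])
d = subtract(box_sdf(p, np.array([0.5, 0.5, 0.5])),
             sphere_sdf(p, np.array([0.5, 0.0, 0.0]), 0.3))
print(d)  # positive: the point has been carved away
```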

Polygons are invariably triangular because they are assumed to be flat, and three points can only define a flat plane (camera tripods are stable on uneven surfaces for the same reason, while a four-legged table might rock). It’s still possible to create spheres, cubes and cylinders, but what we describe is their outer surface, not their shape and volume. Tools for polygon modelling are a key feature of most 3D software, and techniques for capturing real places and objects are invariably designed to create polygonal models.
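A minimal sketch of what that surface description looks like in practice – shared vertex positions plus triangles that index into them – might read as follows (the example geometry is illustrative):

```python
import numpy as np

# A unit square described purely as a surface: four shared vertices, two triangles.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
])
faces = np.array([[0, 1, 2], [0, 2, 3]])  # indices into the vertex list

# Any three points define exactly one plane, so each triangle has one normal.
def face_normal(tri):
    a, b, c = vertices[tri]
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

for tri in faces:
    print(face_normal(tri))  # both triangles face +Z
```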

Laser Scanning

Virtual production is often about placing the scene in a novel location, and capturing locations has often involved laser scanning. The basics are easy to understand – in a laser rangefinder, distance is measured by timing pulses of light. Repeat that process thousands of times using a rotating mount, and a three-dimensional representation of an area can be created. It can be a slow process, though, taking tens of minutes to scan an area at good resolution.
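The underlying arithmetic is simple enough to sketch in a few lines of Python; the function names and timing value below are illustrative:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_distance(round_trip_seconds):
    # The pulse travels out and back, so halve the round trip.
    return C * round_trip_seconds / 2.0

def to_cartesian(distance, azimuth, elevation):
    # Convert one sample (range plus the two mount angles) to an x, y, z point.
    x = distance * math.cos(elevation) * math.cos(azimuth)
    y = distance * math.cos(elevation) * math.sin(azimuth)
    z = distance * math.sin(elevation)
    return x, y, z

# A 66.7-nanosecond round trip puts the surface about ten metres away.
d = tof_distance(66.7e-9)
print(d, to_cartesian(d, math.radians(45), math.radians(10)))
```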

Laser scanners can only scan surfaces which are visible from the position of the scanner. That might mean taking several scans of a single environment to fill the voids, and very complex environments may make it difficult to ensure every last surface is scanned. Also, laser scans (like many 3D scanning techniques) produce a series of distance measurements, sometimes called a point cloud, which must be converted to a series of polygonal surfaces for most uses. Given the cloud may contain millions of points, that’s necessarily an automatic process, although it may fail where an area of the model was not scanned clearly.
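That conversion step is typically automated. As an illustration only, here’s a minimal sketch using the open-source Open3D library’s Poisson surface reconstruction, assuming a hypothetical scan file; real pipelines add considerable cleanup around this:

```python
import open3d as o3d

# Load a scan (hypothetical file) and estimate per-point normals, which
# surface reconstruction needs in order to tell inside from outside.
pcd = o3d.io.read_point_cloud("scan.ply")
pcd.estimate_normals()

# Fit a watertight triangle mesh to the points. Higher depth preserves
# more detail but produces far more polygons.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

# Poisson reconstruction extrapolates over unscanned voids, which is where
# automated conversion tends to fail; those areas usually need manual review.
o3d.io.write_triangle_mesh("scan_mesh.ply", mesh)
```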

Laser scanners are commonly used for creating a model of an environment that the scanner is within, at least to within the maximum range of the scanner. Smaller scanners can be used to create a model of an individual object, although that’s perhaps not the commonest application.

Photogrammetry

The term photogrammetry covers a wide range of computer vision techniques, but in film and television effects it generally means creating a 3D model of something by taking a series of photographs. It derives from the stereoscopic techniques behind 3D cinema, expanded such that we can use more than two photographs – dozens, even hundreds – to accurately create models of almost anything that can be photographed.

Photogrammetry works on everything from tiny objects to vast landscapes. The reference pictures can be taken fairly quickly using a reasonably low-cost, conventional stills camera, and software to turn those photos into a model, at least in simpler situations, might even be free. Shooting photogrammetry reference images demands some knowledge of how the process works, and ideally enough experience to understand what images are needed. Under-coverage can mean inaccurate models, while overdoing it will slow down the processing of the images.
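At its core, each 3D point is recovered by triangulating a feature matched between photographs taken from known (or solved-for) camera positions. Here’s a hedged sketch of that single step using OpenCV, with invented camera matrices and pixel coordinates; full photogrammetry repeats this across thousands of features while also solving for the cameras themselves:

```python
import numpy as np
import cv2

# Two illustrative 3x4 projection matrices: identical cameras, the second
# positioned one unit along X (in effect, a simple stereo pair).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Pixel coordinates of the same surface point as seen in each photo.
pt1 = np.array([[400.0], [240.0]])
pt2 = np.array([[320.0], [240.0]])

# Triangulate, then convert from homogeneous coordinates.
X_h = cv2.triangulatePoints(P1, P2, pt1, pt2)
X = (X_h[:3] / X_h[3]).ravel()
print(X)  # roughly (1, 0, 10): ten units in front of the first camera
```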

Photogrammetry is often the technique behind 3D capture on cellphones (some supplement it with small laser scanners), and it’s one of the most immediately accessible approaches to capturing a 3D model.

Structured Light

One of the most widely-deployed consumer applications of 3D capture involves structured light. Microsoft’s Kinect accessory for the Xbox used the technique, projecting a pseudo-random (but known) pattern of invisible infrared light into the room and photographing that pattern with an offset camera. The degree to which the pattern appears to have shifted sideways reveals distance: the closer the surface, the larger the apparent shift.
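The triangulation behind that shift can be sketched in a single line of arithmetic; the focal length, baseline and shift below are illustrative, roughly Kinect-like values:

```python
def depth_from_shift(focal_length_px, baseline_m, shift_px):
    # Standard triangulation: depth is inversely proportional to the
    # apparent sideways shift (disparity) of the projected pattern.
    return focal_length_px * baseline_m / shift_px

# Illustrative values: 580-pixel focal length, 75 mm projector-to-camera
# baseline, and an observed 20-pixel pattern shift.
print(depth_from_shift(580.0, 0.075, 20.0))  # about 2.2 metres
```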

Some handheld object scanners use the same approach, often with higher-resolution cameras and higher-resolution structured light patterns than Microsoft’s 2010 design. The fundamental limitation is one of range: a structured light scanner can only scan objects which are close enough for the light pattern to be visible. It can be more accurate than photogrammetry, but is most often used for objects small enough to put on a desk.

The Limits

The limits of 3D object scanners generally arise from the fact that most of them – with the exception of laser scanners – are dependent on image recognition. Very dark objects which just don’t bounce back much light can create problems, as can complex patterns which may make it difficult for a structured light scanner to recognise the target pattern. Conversely, objects without surface texture can be difficult for photogrammetry software to track. Almost any scanning technique can be confused by ambient lighting conditions.

Transparent and reflective objects often cause problems. Laser scanners may mistake a mirror for a doorway into a complete, depth-reversed copy of the room being scanned. Photogrammetry may struggle to track transparent or reflective objects, and that can create large errors requiring a manual fix.

Possibly the most limiting caveat is noise. Even the most well-finished wall, which a 3D artist might represent with a simple plane, is likely to be seen by the scanner as a mass of slightly uneven triangles, due to inevitable random noise in the data. Lots of 3D modelling software has features to automatically optimise the polygon layout of a model, although a scanned model will generally involve more polygons, in a less ideal layout, than an equivalent handmade model. More polygons demand more rendering performance, and rendering performance is at a premium in virtual production, so scanned assets are likely to be handled carefully and only when necessary.
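That optimisation step can be sketched with Open3D’s quadric decimation, assuming hypothetical file names and a hypothetical triangle budget; the idea is that a noisy, nearly flat surface collapses towards the simple plane it should have been:

```python
import open3d as o3d

# Load a scanned mesh (hypothetical file) and reduce its polygon count.
mesh = o3d.io.read_triangle_mesh("scanned_wall.ply")
print(len(mesh.triangles), "triangles before")

# Quadric decimation merges triangles where doing so changes the shape
# least, cutting the rendering cost of the asset at some loss of detail.
simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=5000)
print(len(simplified.triangles), "triangles after")

o3d.io.write_triangle_mesh("scanned_wall_light.ply", simplified)
```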

Materials

Experienced visual effects people will happily confirm that the materials assigned to a 3D object are often much more crucial to its believability than the fine detail in its geometry. Most of the scanning techniques we’ve discussed here are capable of capturing both shape and colour. That’s clearly true of photogrammetry, which relies on colour photographs of the subject. Even laser scanners often include a supplementary colour camera to record surface detail. One consideration is that these images will represent the fall of light on the object as it was in reality, which may cause problems later when the 3D object is used in a different lighting scenario (in the parlance of the field, the object has its lighting baked in).

3D scanning is therefore best done under diffuse, even lighting, which may be difficult to arrange over a large area. Even then, reflective or transparent surfaces risk being misrepresented. Some advanced scanning techniques can estimate surface reflectance very accurately using modulated lighting, and they’ve often been used to scan faces. Very often, though, really good materials will still mean manual intervention. Advances in real-time 3D rendering technology mean that physically-accurate materials, using real-world values for phenomena such as reflectance and diffusion, are increasingly available outside the world of offline, non-realtime rendering. Physically-based materials tend to require less tweaking and may carry over better between software packages and lighting scenarios.
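As an illustration of what real-world values can mean in practice, here’s a sketch of two materials in the style of glTF’s metallic-roughness model, written as Python dictionaries; the figures are representative rather than measured:

```python
# Physically-based materials in the glTF metallic-roughness style.
# Because these describe properties of the surface rather than of any one
# lighting setup, the object can be relit convincingly in the virtual world.
brushed_steel = {
    "baseColorFactor": [0.56, 0.57, 0.58, 1.0],  # typical reflectance of iron/steel
    "metallicFactor": 1.0,    # fully metallic: reflections take the base colour
    "roughnessFactor": 0.4,   # microsurface scatter: 0 = mirror, 1 = matte
}

painted_plaster = {
    "baseColorFactor": [0.85, 0.82, 0.78, 1.0],  # diffuse albedo
    "metallicFactor": 0.0,    # dielectric: a small, untinted specular component
    "roughnessFactor": 0.9,
}
```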

For Speed And Accuracy

There’s no straightforward answer as to whether 3D scanning is the right approach for the creation of any particular asset. Poorly-done 3D scans can create complex lighting problems and impact performance. Well-done scans can also be a fast route to very convincing objects. What’s almost certain is that many applications of 3D graphics, from video games to visual effects to virtual production, are starting to tax the limits of human ingenuity in creating sufficiently detailed environments. Ever more realistic environments are likely to become more dependent on object scanning, so there’s every reason to hope that the tools will continue to improve as time goes on.
