We talk to the BBC R&D team about its work within the MAX-R project, in which a consortium of industry innovators is collaborating on a set of standards and workflows that could bring genuinely deliverable, XR-based immersive experiences several steps closer to mainstream broadcast reality across a number of interesting use cases.
The BBC’s research and development department has its roots in the earliest days of one of the world’s most august and respected media organizations. The department has been involved in pivotal developments covering more or less the entire history of television, so it’s no surprise to find it enthusiastically involved at the cutting edge.
Graham Thomas, head of applied research for production at BBC R&D, explains how long that’s been the case. “I think we've existed for pretty much the entire history of the BBC. We've been involved in developments related to color TV. We've been instrumental in the rollout of digital TV, teletext, the RDS system for traffic announcement in cars… HLG was developed by one of my colleagues. Back in the day when you couldn't really buy broadcast equipment commercially, we used to design loudspeakers and mics, and some of the loudspeakers we designed are still available.”
MAX-R is a relatively new addition to the BBC’s R&D portfolio. A thirty-month initiative founded in September 2022, MAX-R is designed to explore and unify the technologies involved in mixed, augmented, and extended reality work, concepts broadly referred to as XR. The field ranges from the vast walls of virtual production to the pocket-sized screens on personal electronics, and the consortium (listed in full at the project homepage at Universitat Pompeu Fabra’s site) includes academic institutions, production facilities and a number of VR/XR-focused broadcast technology vendors, alongside industry stalwarts Filmlight, Arri and the BBC itself.
A sampling of the work under the MAX-R banner gives some idea of the project’s scope. OpenAssetIO is a standard for managing tools and resources; Rooms is a web-based 3D content creation system intended to make creation easier for inexperienced users. At NAB 2023, Brainstorm showcased a collaboration with Filmlight which integrated two remote presenters onto a single virtual stage, with tracked cameras for each. Part of the BBC’s involvement is based on sports broadcasts augmented with 3D graphics, a clear precursor to augmented reality.
Thomas describes the pre-existing system as “a sports graphics system that's used by the BBC and fifty or sixty other companies around the world. One of the things it can do is capture pictures of players on the pitch, then create a virtual stadium so you can simulate the view a referee or linesman would have had. At that time, all that game engine technology was something you would put at the broadcaster’s end. It was a production tool, but now it’s available in people's phones.”
“We've been thinking about how you can take that mix of the real and virtual worlds, and give them the ability to interact,” Thomas continues, outlining a core principle of anything that might be called extended reality. “That led us to thinking about how we could bring certain kinds of program to people who don’t think of TV as their first point of consumption… in the next few years we're expecting the number of hours delivered online to exceed that of over-the-air. It’s not that people used to watch a linear program on TV and now they're going to watch a linear program on YouTube. There's so much more you can do with local rendering and interaction.”
MAX-R is divided into three broad categories. Each category has one or more defined use cases, and each use case is separated into demonstrable scenarios, as illustrated in the diagram above. Image courtesy Universitat Pompeu Fabra, Barcelona.
That 3D rendering technology is essential across many disciplines, and, given the membership of the consortium, it’s perhaps no surprise that virtual production was a unifying cause. The stack of technologies required to make it work is always intricate, sometimes fragile, and addressing those issues is well within the purview of MAX-R. “When we were approached, the focus was on VP technology,” Thomas states. “We've done projects with Arri and Foundry and we were approached by a group who were trying to start this project, thinking that the BBC must be interested in what we can do with XR production.”
One context in which those interests converge involves the desire of virtual productions to record camera tracking and lens data, and increasingly image-based interactive lighting, for later reference by visual effects people. “That's one of the main interests of Arri in this area,” Thomas points out, “especially in the world of things like lighting. How you can get information out of a lighting rig or a set of screens and log that information in such a way that you can bring the whole thing into post? Can you do that in a manner that’s nice and easy and interoperable?”
Answers to questions like that have frequently been drawn from existing technologies which were not designed for virtual production, even though they’ve often worked well. “In the past I was involved in the development of one of the first camera tracking systems,” Thomas recalls, “and it's interesting that the data format it used, the FreeD format, is about the only thing which survives from that system. A number of manufacturers used it. These ad-hoc standards tend to appear and there have been various attempts at standardizing things.”
The wider world of XR has been leveraging many of the same tools, albeit to very different ends. Thomas gives a few examples. “There are a few areas that the MAX-R project is focusing on. One of those, being driven by Improbable, is a way to have very large numbers of people in immersive events. From what I understand, in Unreal - and I think it's the same with experiences built on Unity - there's a limit of perhaps around a hundred people who can be in one online space. One thing Improbable has developed is a way of distributing animation data more efficiently, and they can run up to ten thousand people in one virtual space.”
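To see why crowd size becomes a bottleneck, it helps to count messages. The sketch below is not Improbable's method, which the article does not describe; it assumes a hypothetical worst case in which every participant's animation state is relayed to every other participant on each update tick. The function name `naive_updates_per_tick` is invented for illustration.

```python
def naive_updates_per_tick(participants: int) -> int:
    """Animation updates delivered per tick if each participant's state
    is relayed to every other participant (hypothetical worst case)."""
    return participants * (participants - 1)


# Compare the two crowd sizes mentioned in the interview.
for crowd in (100, 10_000):
    deliveries = naive_updates_per_tick(crowd)
    print(f"{crowd} participants: {deliveries:,} deliveries per tick")
```

Under that assumption a hundred people generate 9,900 deliveries per tick, while ten thousand generate 99,990,000. The crowd is a hundred times larger but the traffic is roughly ten thousand times larger, which is why any system at that scale has to distribute animation data more selectively.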
The idea of putting a lot of people in one place throws up interesting questions which, as Thomas points out, are less about technology than sociology. “Do you need ten thousand people to get the feeling of attending a large event? If you're trying to create the feeling of a smaller, intimate event, twenty or thirty people may be enough. In a larger event, it's more about what the crowd are doing and how they move.”
At the same time, Thomas goes on, interactivity between audience and performer has sometimes been overlooked even at the highest end. “In a number of the large, headline-grabbing music events in some of these digital spaces – things like Lil Nas [X], Ariana Grande and Travis Scott - what people are watching is an animated music video. It's been either motion captured or hand animated and a lot of money has been spent to create something very polished.”
Building on its work in augmented sports broadcasting, Thomas says, the BBC is exploring live production, “using a video-based route rather than a motion capture route. If you don't want to use trackers and a big rig of cameras, or if they pick up a guitar and you don't want to have to track the guitar, or the smoke, the lights… it’s all a big part of the performance that's not a part of the traditional motion capture. So, we're going down a route of using several cameras pointed at a stage. With a bit of care about lighting and camera position, you can give the impression that what you’re seeing is 3D rendered, but what you’re really seeing is a 2D representation.”
The ability of visual effects artists to get away with carefully rendered virtual crimes in this regard is already well known, as Thomas explains. “So, do you really need volumetric capture, motion capture? Or can you do something that would get you ninety-five per cent of the way there much more simply, and get you a production pipeline that's how a broadcaster would cover a live event normally? We've been working in our little experimental studio setting up one or several cameras, putting it into Unreal Engine.”
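One common way to make flat video read as three-dimensional in a game engine is billboarding, where a 2D plane carrying the video is rotated to face the viewer. The article does not say that the BBC's experiments work this way, so the following is only a minimal sketch of the general technique, assuming a ground plane described by x and z coordinates and a hypothetical helper named `billboard_yaw_degrees`.

```python
import math


def billboard_yaw_degrees(plane_xz, camera_xz):
    """Yaw (rotation about the vertical axis) that turns a flat video
    plane at plane_xz to face a viewer at camera_xz.

    Assumed convention: 0 degrees means the plane faces along +z.
    """
    dx = camera_xz[0] - plane_xz[0]
    dz = camera_xz[1] - plane_xz[1]
    return math.degrees(math.atan2(dx, dz))


# A performer's video plane at the origin, viewed from two positions.
print(billboard_yaw_degrees((0.0, 0.0), (0.0, 5.0)))  # viewer straight ahead
print(billboard_yaw_degrees((0.0, 0.0), (5.0, 0.0)))  # viewer off to the side
```

A plane turned this way never shows its edge, but it also never reveals a new side of the performer, which is one reason the choice of real camera positions matters as much as Thomas suggests.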
There’s still some distance between these experiments and a technology that’s ready for widespread deployment, but that’s a gap BBC R&D was more or less expressly designed to cross. Thomas emphasizes that the prerequisites are lightweight, with distribution demanding nothing more than conventional video streaming. “The approach being taken is pixel streaming. In the cloud somewhere you have a bunch of machines with nice GPUs, but what people receive is a video encoded stream. So, all they need is a web browser that can decode.” Subtleties abound – some of the BBC’s work on this subject involves breaking the video image up into individual objects, relighting them, and casting shadows based on their shape and position.
Finding an audience for this means reaching people whose expectations differ sharply from those shaped by the long history of linear television. Thomas’ colleague Fiona Rivera, a senior R&D producer, is keenly aware that the BBC’s XR work represents a fundamental change in how people used to watching TV might enjoy what organizations like the BBC create.
Rivera returns quickly to the core goal of interactivity. “Watching implies being passive. This is more about active participation - being able to respond by dancing, cheering, talking to the artist, and the artist being able to see the virtual crowd. Maybe they can pick out avatars to shout out to. It's all about being there together, and having that social experience. You can move around the virtual space by using touch controls if you're on a touch device, or with your mouse and keyboard.”
The word “curating” is often used to describe the process of bringing this kind of virtual event to a real crowd, and Rivera reports that MAX-R member Improbable is an early leader in the field. “Improbable are very experienced at curating experiences - having MCs promoting activities that encourage interaction for those who want to have interaction. You can have quizzes, spontaneous polls, and different activities to support the main event. Improbable's platform allows for text chats, and for proximity-based audio so you can press a button and talk to those around you.”
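Proximity-based audio of the kind Rivera describes can be illustrated with a simple distance check. This is a sketch under stated assumptions rather than a description of Improbable's platform: the `proximity_gains` function, the ten-unit radius, and the linear volume falloff are all hypothetical choices made for the example.

```python
import math


def proximity_gains(speaker, listeners, radius=10.0):
    """Return {listener_id: gain} for listeners within radius of the speaker.

    Gain falls linearly from 1.0 at the speaker's position to 0.0 at the
    radius (an assumed falloff curve); listeners at or beyond it are omitted.
    """
    gains = {}
    for listener_id, position in listeners.items():
        distance = math.dist(speaker, position)
        if distance < radius:
            gains[listener_id] = 1.0 - distance / radius
    return gains


# Three avatars at different distances from a speaker at the origin.
crowd = {"ana": (1.0, 0.0), "ben": (5.0, 0.0), "cat": (30.0, 0.0)}
print(proximity_gains((0.0, 0.0), crowd))
```

In this example only the two nearby avatars would hear the speaker, with the closer one hearing them more loudly, while the distant avatar receives no audio at all.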
With eighteen months to run, and a specific remit to think big, a lot of things could happen. Only time will tell which of those things will engage the world’s imagination, but if there’s any organization with the institutional patience to wait and the mandate to keep coming up with new ideas while it does so, it’s the BBC’s department of new ideas. Which ideas those might be remains to be seen: even Rivera concludes that “what we end up launching is yet to be determined,” though she emphasizes that barriers to entry should be low. “Experiences we're targeting will be browser based, so that people can access through their laptops, desktops, tablets, phones... we would like to not have people have to have special kit in order to join in.”