Machine Generated Content - Part 1

Machine generated content covers a wide landscape, from metadata-driven sports statistics and graphics superimposed on action replays to footage synthesized entirely by AI systems.

This article is part 1 of our ‘Essential Guide: Machine Generated Content’. It focuses on the various types of machine-generated content used in broadcasting, looks at the technical challenges of real-time rendering, and identifies different approaches for automating its creation.

This article is part of 'Essential Guide: Machine Generated Content'. Download the entire Essential Guide for free here.

A Grand Tradition

In common with most overnight success stories, machine generated content has been evolving for a long time. It was used in the 1950s to create the opening credits sequence for Alfred Hitchcock’s Vertigo, and experiments to create abstract images were conducted even earlier, in the 1940s. In fact, the study and understanding of animation techniques began when motion picture film was first invented in the late 1800s.

Machine generated imagery is widely used in broadcasting and film production. Here is a summary of the broad scope of applications where machine generated content is useful:

Full frame rendered images.
Real world simulations.
Stings and trails.
Virtual studios.
Set extensions.
Animated featurettes.
Textual overlays.
Statistical analysis (sports results and historical records).
Visual effects.
Music visualizers.
Maps and diagrams.
Weather forecast imagery and animations.
Scientific visualizations.
Architectural visualization.
Infographics to support news and documentary content.
Title sequences.
End credits crawls.

All of these could be classified as content generated by an Artificial Intelligence system, but the degree of intelligence needed varies significantly.

Overview

Like any other content creation process, machine generation has several stages. At the outset, incoming data is required to steer the process. This must be filtered and massaged into an abstract representation of the final output, such as a table of sports results or instructions to fly a camera through a 3D model. If the render system is built around a generative AI engine, then a prompting text needs to be created. Rendering latency tends to increase with the complexity of the prompt.

A simple sports results table should render in a few milliseconds whereas a 3D rendering of a complex scene would take considerably longer, even on a reasonably fast processor. Rendering video sequences with Generative AI systems is likely to take even longer depending on the prompt.

The engineering team creates a rendering pipeline and specifies the data ingest filtering, the reformatting, and the constraints that steer it. Whether the data is created manually or by some automated means, it must be compatible with the Application Programming Interface. How demanding that preparation is depends on the kind of input being produced:

Filtering and constraining simple data structures for statistical displays is deterministic. The incoming data conforms to an expected format and contains values within a predicted range.
Constructing prompt texts to drive a Generative AI system involves natural language processing and semantics. Constructing a prompt that delivers the expected result is a complex task whose outcome is not easily predicted, and it may take multiple attempts to get it right.
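
The difference between the two kinds of input can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (home, away, home_score, away_score), the MAX_SCORE limit and the wording of the prompt are all hypothetical and do not come from any particular data feed or AI engine. The point is that the validation step is deterministic and either accepts or rejects a record, while the prompt is assembled only from values that have already passed validation.

```python
from dataclasses import dataclass

# Hypothetical upper limit; a real range depends on the sport and the feed.
MAX_SCORE = 200


@dataclass(frozen=True)
class Result:
    home: str
    away: str
    home_score: int
    away_score: int


def parse_result(row: dict) -> Result:
    """Validate one incoming record and return a clean Result.

    Raises ValueError if a field is missing, malformed or out of range.
    """
    try:
        home = str(row["home"]).strip()
        away = str(row["away"]).strip()
        home_score = int(row["home_score"])
        away_score = int(row["away_score"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"malformed record: {row!r}") from exc
    if not home or not away:
        raise ValueError("team name is empty")
    for score in (home_score, away_score):
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"score out of range: {score}")
    return Result(home, away, home_score, away_score)


def build_prompt(result: Result) -> str:
    """Assemble a prompt string from validated fields only."""
    return (
        "Create a five second celebratory sting for a match in which "
        f"{result.home} scored {result.home_score} and "
        f"{result.away} scored {result.away_score}."
    )
```

Whether the resulting prompt produces the expected footage still has to be checked by a person or by a later review stage; only the data preparation is predictable.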

Live Content Rendering

One of the most difficult problems to solve with live streamed content is the latency introduced by codecs that prepare the video for streaming.

Adding live rendered overlays and content (common in sports presentations) introduces additional latency. This is often alleviated by showing the animation as a replay insert rather than directly in the live feed.

The image creation process needs to track the motion in the scene and interpret it correctly so that object trajectories and predictions can be superimposed onto the live footage. This requires significant computing resources.

Originally developed to monitor cricket matches as an aid to the umpire, live rendered overlays are now used for a variety of other sports. The input from a large array of high-quality (resolution and framerate) cameras is used to triangulate the position of a ball with an accuracy of a few millimeters. That position is then projected into the co-ordinate space of a 3D model of the stadium and the ball trajectory is then animated and rendered.
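
The geometric idea behind triangulation can be illustrated with just two cameras. The sketch below is a deliberate simplification and is not how any particular tracking product works: it assumes each camera has already been calibrated so that a detected ball gives a ray (a camera position plus a direction) in stadium co-ordinates. Because of measurement noise the two rays rarely meet exactly, so the function returns the midpoint of the shortest segment joining them. Production systems combine many more cameras and typically use a least-squares fit.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def _dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate(p1: Vec, d1: Vec, p2: Vec, d2: Vec) -> Vec:
    """Estimate a 3D point from two rays p1 + t*d1 and p2 + s*d2.

    Returns the midpoint of the shortest segment between the rays.
    Raises ValueError if the rays are parallel.
    """
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a = _dot(d1, d1)
    b = _dot(d1, d2)
    c = _dot(d2, d2)
    d = _dot(d1, w0)
    e = _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; position is undetermined")
    t = (b * e - c * d) / denom  # parameter of closest point on ray 1
    s = (a * e - b * d) / denom  # parameter of closest point on ray 2
    q1 = tuple(p + t * k for p, k in zip(p1, d1))
    q2 = tuple(p + s * k for p, k in zip(p2, d2))
    return tuple((u + v) / 2 for u, v in zip(q1, q2))
```

For example, a camera at the origin looking along (1, 2, 3) and a second camera at (10, 0, 0) looking along (-9, 2, 3) both point at the same spot, so the function returns a position very close to (1, 2, 3).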

More recently, other ML-based systems have evolved. They need fewer cameras and less computing capacity. Whilst they may not be as accurate, they are sufficient for commentators to present an analysis of gameplay and potential strategies.

Tracking systems are covered in greater detail in Sports Graphics: Part 1 - Data Driven Visualization.

At least for now, Generative AI processes are too slow to use directly in the live broadcast path.

Building Frameworks And Steering Mechanisms

Constructing the pipeline through which incoming data is processed and used to steer the engine that creates machine generated content is straightforward, and a variety of architectural patterns are available. Some are simple to implement but impose a degree of latency. Others are highly performant and can be called to action instantly on demand, but they are more complex to build. Frameworks include:

Batch Queues. Run batch queues to process a series of jobs offline one at a time where the content is being created well in advance of when it is needed. The results are cached ready for deployment. This is easy to construct on UNIX platforms with the cron scheduler calling shell scripts to action.
Service Listeners. Run jobs via service listeners that are re-entrant so they can be spawned as multiple simultaneous instances. Consider the processing time for these jobs and scope the available CPU capacity to cope with the maximum expected workload. These are compiled executables registered with the operating system and only consume CPU resources when they are called to action.
Virtualized Containers. Deploy cloud-based containers on a high-performance multi-processor platform. The jobs may still take some time (latency) but computing capacity is not likely to be a limiting factor. Take account of the time needed to move the finished product to the deployment cache. Large video files take time to transfer. This will determine the speed of the network link required.
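
As an illustration of the simplest of these patterns, the batch queue, here is a minimal Python sketch. It is written under assumptions of our own: the function name run_batch, the ".bin" file extension and the render callable are all hypothetical stand-ins for whatever tool actually produces the content. A script like this could be started by cron; it works through the jobs one at a time and leaves each result in a cache directory, skipping anything that has already been rendered.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def run_batch(
    jobs: Iterable[str],
    render: Callable[[str], bytes],
    cache_dir: Path,
) -> List[Path]:
    """Render each job in turn and store the result in cache_dir.

    Jobs whose output is already cached are not rendered again.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for job_id in jobs:
        target = cache_dir / f"{job_id}.bin"
        if not target.exists():
            # Write to a temporary name first so that a reader never
            # sees a half-written file in the cache.
            tmp = cache_dir / f"{job_id}.tmp"
            tmp.write_bytes(render(job_id))
            tmp.replace(target)
        outputs.append(target)
    return outputs
```

The same function could sit behind a service listener or inside a container; what changes between the three patterns is how and when it is invoked, not the shape of the job itself.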

As you plan your system, think about whether tasks are urgent or not, how many need to run at the same time and how large the output is. Scalability and capacity planning are key.
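
Two back-of-envelope calculations cover most of this planning. The sketch below is a rough guide only; the function names and the 80% link efficiency default are assumptions chosen for illustration, and real networks and render farms will behave less tidily.

```python
import math


def transfer_seconds(
    size_bytes: float,
    link_bits_per_second: float,
    efficiency: float = 0.8,  # assumed usable fraction of the nominal link speed
) -> float:
    """Estimate how long a file takes to move across a network link."""
    return size_bytes * 8 / (link_bits_per_second * efficiency)


def workers_needed(jobs_per_hour: float, seconds_per_job: float) -> int:
    """Estimate how many parallel workers keep up with the arrival rate."""
    return math.ceil(jobs_per_hour * seconds_per_job / 3600)
```

For example, a 10 GB file on a 1 Gbit/s link at 80% efficiency needs about 100 seconds to transfer, and 120 jobs per hour that each take 90 seconds to render need at least 3 workers running in parallel.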

Driving Tools With Scripts

Open-source tools often support a command-line mode of operation. This can be called directly from a shell script. These shell scripts can be manufactured directly with AI or can use data feeds supplied by AI. Data feeds are intrinsically safer than trusting AI (or any other external agent) to create executable code.
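
The safety argument can be made concrete. In the Python sketch below, values from a data feed are only ever passed as arguments to a fixed command, never executed as code. The tool name render-tool and its --title and --output flags are hypothetical placeholders, as are the feed field names; substitute the command line of whichever tool you actually drive.

```python
import re
import subprocess
from typing import List

# Output names must start with a letter or digit so that they cannot be
# mistaken for a command-line flag or escape the working directory.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*")


def build_command(feed: dict) -> List[str]:
    """Turn one feed record into an argument list for a fixed tool."""
    title = str(feed["title"])
    output = str(feed["output"])
    if not SAFE_NAME.fullmatch(output):
        raise ValueError(f"unsafe output name: {output!r}")
    # "render-tool" and its flags are hypothetical.
    return ["render-tool", "--title", title, "--output", output]


def run(feed: dict) -> None:
    """Run the tool without a shell, so feed text is never interpreted."""
    subprocess.run(build_command(feed), check=True)
```

Because the command is passed as a list and no shell is involved, a title such as "Final; rm -rf /" reaches the tool as one harmless string.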

More specialized tools may only be available on the macOS or Windows platforms. Scripting these tools is done with AppleScript on macOS and PowerShell on Windows. Custom tasks can be built with these scripting languages as a wrapper and then called to action as needed.

There are several levels of scripting to control our tools:

Internal scripting allows the user interface to be extended by creating macros. These would usually be called to action by an operator but can be triggered externally. If they can be installed when needed, then an AI system could create them.
External scripting is used to call the application to action. AI might be used to generate a run-time script from a template and fill in file names and other parameters. This might also install custom internal script macros.
Some applications host additional plug-in modules which themselves may be scriptable.
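
External scripting from a template might look like the following Python sketch, which fills file names into a fixed shell script. The render-tool command and its flags are hypothetical, and the template text is an assumption made for illustration. The parameters are quoted before substitution so that spaces or punctuation in a file name cannot change the meaning of the script.

```python
import shlex
from string import Template

# The command and flags in this template are hypothetical placeholders.
SCRIPT_TEMPLATE = Template(
    "#!/bin/sh\n"
    "# Generated file: do not edit by hand.\n"
    "render-tool --project $project --output $output\n"
)


def make_script(project: str, output: str) -> str:
    """Return the text of a run-time script with parameters filled in."""
    return SCRIPT_TEMPLATE.substitute(
        project=shlex.quote(project),
        output=shlex.quote(output),
    )
```

Keeping the template under human control and letting the automated system supply only the parameters limits what a faulty or malicious input can do.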

Bridging from one scripting environment to another provides more flexibility. For example, AppleScript can call UNIX command-line tools and shell scripts to action. Shell scripts can in turn call AppleScripts, so the bridge works in both directions. PowerShell provides similar mechanisms on Windows.
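
On macOS the bridge relies on two standard pieces: the osascript command, which runs AppleScript from a shell, and AppleScript's do shell script command, which runs a shell command from AppleScript. The Python sketch below only builds the argument list for such a call and does not execute it; the function name is our own, and the escaping covers just backslashes and double quotes, which is an assumption that suits simple commands.

```python
from typing import List


def applescript_calls_shell(command: str) -> List[str]:
    """Build an osascript invocation that runs a shell command.

    The shell command is embedded in an AppleScript string literal, so
    backslashes and double quotes are escaped first.
    """
    escaped = command.replace("\\", "\\\\").replace('"', '\\"')
    return ["osascript", "-e", f'do shell script "{escaped}"']
```

Passing the returned list to a process launcher on a Mac completes the round trip from shell to AppleScript and back to shell.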

An alternative to scripting is using a node graph user interface to construct a pipeline. Apple's Quartz Composer tool was a good example of how to build a flexible machine generated video tool with a simple and intuitive user experience.

Project Level Synthesis

It is important to inspect the way that your creative tools store their project files. Often these are simple containers holding the assets.

A manifest and parameter store is often implemented as a simple text file that can be reproduced by an automation system. XML is popular, but more modern applications might use JSON or simple comma- or tab-delimited files, and a little reverse engineering will reveal the format. Adding a publishing pipeline to an AI-fronted database system then lets you manufacture project containers that load directly into your creative tools. Provided you have created a valid container, your tools cannot distinguish these from projects that they saved themselves.
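
A minimal sketch of manufacturing such a container is shown below. The layout is entirely hypothetical: the manifest.json file name and the title, assets, id and file keys are invented for illustration, and a real tool will have its own schema that you must match exactly.

```python
import json
from pathlib import Path
from typing import List


def write_project(container: Path, title: str, assets: List[str]) -> Path:
    """Create a project folder with a JSON manifest describing its assets.

    The manifest schema here is hypothetical; replace it with the format
    your creative tool actually reads.
    """
    container.mkdir(parents=True, exist_ok=True)
    manifest = {
        "title": title,
        "assets": [
            {"id": index, "file": name}
            for index, name in enumerate(assets, start=1)
        ],
    }
    path = container / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return path
```

A sensible check is to save an equivalent project from the tool itself and compare it with the generated container until the two match.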

Transforming To A Different Domain

Extracting content from proprietary formats may be awkward, but converting those files to another format may make it easier to operate on them. For example, an MS Word DOCX file is really a Zip archive container. Run it through an extractor and you can lift the full resolution images that were pasted in. Embed this in an automation system to extract and store the images without needing to open the files in Word.
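
The extraction step can be sketched with Python's standard zipfile module. Embedded images in a DOCX archive are stored under the word/media/ folder; the function name and the choice to flatten everything into one output folder are our own.

```python
import zipfile
from pathlib import Path, PurePosixPath
from typing import List


def extract_docx_images(docx_path: Path, out_dir: Path) -> List[Path]:
    """Copy every embedded media file out of a DOCX into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Embedded images live under word/media/ inside the archive.
            if name.startswith("word/media/") and not name.endswith("/"):
                # Keep only the file name so nothing is written
                # outside out_dir.
                target = out_dir / PurePosixPath(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

Run over a folder of documents, this lifts the images without ever launching Word.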

You can also convert documents to RTF format and embed replaceable tags inside the document, then use UNIX command line tools to find and replace them. This gives you a simple templated document publishing mechanism without needing to use a word processor at all.
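
The same find-and-replace idea is shown here in Python rather than with UNIX command line tools, so that it matches the other sketches in this article. The %%NAME%% tag convention and the function names are hypothetical. Backslashes and braces are control characters in RTF, so they are escaped before insertion; this sketch assumes the replacement text is plain ASCII and does not handle other characters.

```python
import re
from typing import Mapping

# Hypothetical tag convention: %%NAME%% with upper-case letters and underscores.
TAG = re.compile(r"%%([A-Z_]+)%%")


def rtf_escape(text: str) -> str:
    """Escape the characters that have special meaning in RTF."""
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def fill_template(rtf: str, values: Mapping[str, str]) -> str:
    """Replace every %%NAME%% tag with its escaped value in one pass.

    Raises KeyError if the template uses a tag with no supplied value.
    """
    def swap(match: "re.Match[str]") -> str:
        return rtf_escape(values[match.group(1)])

    return TAG.sub(swap, rtf)
```

Replacing all tags in a single pass means a value that happens to contain tag-like text is not substituted a second time.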

Moving problems to a different domain often makes them much easier to solve. Use this approach inside your AI driven pipelines to perform tasks that were deemed difficult or impossible previously.
