HDR: Part 24 - Creative Technology - Artificial Intelligence

Every decade has had a buzzword. Watch a 1950s educational movie and realize how dated the term “atomic” sounds now, and not only because the downsides of nuclear power have since become so painfully apparent. Since then, we’ve been sold technology marketed as “transistor,” “digital,” and now “AI,” although sometimes it’s not quite clear how key those things are to the success of a technology.

If that seems unfair, consider that portable radios were being promoted as “transistor radios” from the mid-50s. The term was still being used a quarter of a century later to promote new designs, long after the underlying technology had become fairly mundane. Digital processing emerged in the early 1980s and is now so ubiquitous that using it as a prefix is positively inane. The titles of government departments such as the United States Digital Service and the UK’s Department for Digital, Culture, Media and Sport tell us practically nothing about what they do. What isn’t digital, in 2021?

The big news right now, though, is the idea of artificial intelligence. It’s already been shown that AI (or at least, things called AI) can do some amazing things, potentially getting us around some of the most intractable problems in information systems and enabling things like subtitles and audio description on the fly, for live TV. Like a lot of poorly-understood terms, though, it’s being abused, often by startups who may not have a particularly comprehensive understanding of what artificial intelligence actually is, or any real idea how to develop products or services that genuinely use it. They are, however, hyper-aware that venture capitalists take a favourable view of the concept.

And to be fair, film and television production is a field in which genuine applications of machine learning can be spectacular, for good or ill. It’s no longer a shock to see a piece of video depicting a major public figure saying things that major public figure would never say, and that’s likely to change the public perception of video evidence in the same way that Photoshop changed the perception of photographic evidence – at least, we can only hope that perception will change. Still, deliberate fakery, whether for comedy or political malfeasance, seems likely to remain a niche application in comparison to things like significantly better video upscalers.

Prediction Limitations

That particular application is such a fantastic example of where machine learning scores that it’s worth pausing to think about it. Turning a lower-resolution image into a higher-resolution one runs into issues that are part of formal information theory, but it’s instinctive to most people that we can’t get something from nothing. On paper, machine learning solutions operate under exactly the same limitations, but they have more information to add – information about the world, about how pictures should look, based on all the data used to train them. Conventional scalers only have the input image to work with. Like all the best solutions to long-term, intractable problems, machine learning doesn’t break the rules. It can’t. What it can do is to provide another way for us to put information into the process.
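To make the contrast concrete, here is a minimal sketch (in plain Python, with made-up pixel values) of a conventional, non-ML scaler using nearest-neighbour interpolation. Every output pixel is computed purely from the input pixels; no outside knowledge of “how pictures should look” is ever added, which is exactly the extra information a trained model brings.

```python
# A conventional (non-ML) upscaler: nearest-neighbour interpolation.
# Every "new" pixel is a copy of an existing one -- the output contains
# no information that was not already in the input image.

def upscale_nearest(image, factor):
    """Upscale a 2D list of pixel brightness values by an integer factor."""
    out = []
    for row in image:
        # Repeat each pixel horizontally...
        wide = [p for p in row for _ in range(factor)]
        # ...then repeat each widened row vertically.
        out.extend([wide[:] for _ in range(factor)])
    return out

src = [[0, 255],
       [255, 0]]
big = upscale_nearest(src, 2)
# big is now a 4x4 image, but with no new detail -- just duplicated pixels.
```

A machine learning scaler replaces that duplication step with a prediction informed by millions of training images, which is how it appears to conjure detail without breaking information theory.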

At the time of writing, even the best machine learning scalers don’t look as good as going back to a 35mm original negative and scanning it on high-quality modern equipment, but that’s not always practical. One demonstration, put together by an enthusiast, showed images from Star Trek: Deep Space Nine. That production could, in theory, be remastered from its original 35mm negatives. The problem is that it used a lot of CG effects, which are difficult and complicated to remaster. The remastered original series of Star Trek made money. The followup Star Trek: The Next Generation, however, performed only moderately well. The next followup, Deep Space Nine, is therefore unlikely to see the same treatment, to the dismay of fans. With modern techniques, though, an upscale might look a lot better than the standard-definition originals, and it’s only a matter of time before that sort of capability is in every TV anyway.

So, AI sounds legitimate. What’s the problem?

Many of the most impressive examples of things promoted as “AI” are perhaps better called machine learning, which we should differentiate from the more general term “artificial intelligence”. AI itself isn’t a particularly well-defined field anyway, though most people know better than to expect a spoken conversation with a computer. Researchers might call that an “artificial general intelligence,” which can learn any task a human can, and which remains science fiction. There’s no immediate risk that directors, editors and writers will be replaced by AI.

Improving On Algorithms

Most people do, however, expect AI to offer more than the simple, predictable behavior of historic computer algorithms. Google’s hugely capable image-recognition system, which stands to revolutionize media asset management systems and search functions by providing a text description of the contents of an image, is a great example. The actual implementation of machine learning as a piece of computer software can be done in a number of ways, though neural networks are a common example.

A full explanation of neural networks is more than we have space for, but broadly, they involve layers of cells each holding a value, such as the brightness of a pixel, the likelihood that the image represents a particular letter of the alphabet, or some intermediate value. Several layers of these neurons are interconnected, with each connection weighted such that data placed on the input layer (such as pixel brightness) ultimately provokes useful data at the output layer (such as letters). Crucially, this arrangement means that the computing resources required, which must represent every neuron and every interconnection, grow very quickly as the task becomes more complex.
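The arrangement described above can be sketched in a few lines. This is a toy forward pass through a two-layer network: the weights and biases here are invented purely for illustration (a real network learns them from training data), but the structure – every neuron summing every weighted input – is the genuine article, and it shows why the connection count balloons as layers grow.

```python
# A toy neural network forward pass. All weights and biases below are
# made up for illustration; real values come from training.
import math

def sigmoid(x):
    # Squashes any value into the range 0..1, e.g. a "likelihood".
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each output neuron sums every input, scaled by a per-connection weight,
    # adds a bias, and passes the result through the squashing function.
    return [sigmoid(sum(i * w for i, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

# Input layer: say, the brightness of two pixels.
pixels = [0.9, 0.1]

# Hidden layer: 3 neurons fully connected to 2 inputs = 6 weights already.
hidden = layer(pixels,
               [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.2]],
               [0.1, 0.0, 0.2])

# Output layer: one neuron, e.g. "likelihood this image is the letter A".
output = layer(hidden, [[0.7, -0.2, 0.5]], [0.0])
```

Even this trivial example has nine weighted connections; an image-recognition network with millions of input pixels and many layers makes the resource problem obvious.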

There is some controversy over whether absolutely all of machine learning is an example of artificial intelligence. To some extent, the idea of using information about a subject to guide decisions on that subject is just statistics, although most people would accept neural networks are an example of AI.

Making AI Work

Perhaps the most difficult issue is taking all this theory and making it into a usable piece of software, much less a saleable product. Doing that requires expertise which still isn’t that common, and fast computers too. It took fast, modern phones, with their processors that outpace desktop workstations of a decade earlier, to make handwriting recognition work. Imagine, based on that, the sheer complexity of the system that allows Google to do its image content to text translation. Hiring specialists to assemble a commercial service based on really powerful AI is necessarily an expensive, longwinded operation – and all that before there’s even an indication that the product might work as a business.

The human response to a difficult task has often been not to undertake it, or at least to undertake it to the minimum extent possible. Since “artificial intelligence” is not a well-defined term, it’s hard to stop anyone using it to describe more or less anything, even something that uses techniques that few people would think of as legitimately AI. And it works: we can confidently predict that at any time of the day or night, for most of the last few years, somewhere on the planet a venture capitalist has been looking impressed at a proposal which might involve only the most tangential connection to AI proper.

Or, perhaps, no connection at all. One of the things current AI techniques do particularly well is to form an interface between the fuzzy, irregular real world and the precision of digital information, hence the handwriting recognition and Google’s object recognition. One other way of providing that interface is to pay a lot of minimum-wage people in a low-wage jurisdiction to do it. Startups have been launched which claimed to be using AI, but were in fact using sweatshops of real people to transcribe speech and read handwriting. That approach is difficult to scale to a growing customer base in the way we might add more cloud computing resources to a true AI deployment, but – well – hiring people gets a product to market quickly, without all the expensive, time-consuming distraction of actually engineering something new. Faking a speech by a head of state is a potential downside of AI; faking the use of AI in the first place deserves some sort of Machiavellian prize.

Classifying Training Data

Sometimes, this sort of thing can be legitimate. Training an AI often relies on large amounts of pre-classified data. These systems might need images which are already known to contain (say) a football, so that they can learn to identify footballs in general, and ideally identify them better than the camera-operating AI that repeatedly mistook a linesman’s bald head for the ball during a football match in Scotland in October 2020. Classifying that training data must be done by real people, and it’s normal to use people to correct and assist a real AI which may need its decisions double-checked so it can learn from the results. The difference between that and just having humans do all the work can be hard to see, so even downright dishonesty can be hard to identify.
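The legitimate human-in-the-loop pattern described above is easy to sketch. In this hypothetical example, the classifier is a stand-in (a real one would be a trained model), and the confidence threshold is an invented parameter: confident predictions pass straight through, while uncertain ones are routed to a human reviewer, whose corrected answers become fresh training data.

```python
# A hypothetical human-in-the-loop pipeline. The classifier below is a
# stand-in for a real trained model; only the routing logic is the point.

def classify(frame):
    # Stand-in model: returns a (label, confidence) guess for a frame.
    if "round" in frame:
        return ("football", 0.55)   # uncertain -- bald heads are round too
    return ("not_football", 0.95)

def review_queue(frames, threshold=0.8):
    """Split frames into auto-accepted labels and a queue for human review."""
    auto, needs_human = [], []
    for frame in frames:
        label, confidence = classify(frame)
        if confidence >= threshold:
            # Confident predictions are accepted automatically...
            auto.append((frame, label))
        else:
            # ...the rest go to a person, whose verdict is kept as
            # corrected training data for the next round of learning.
            needs_human.append(frame)
    return auto, needs_human

auto, pending = review_queue(["round bald head", "square flag"])
```

The business question is simply what fraction of the traffic ends up in `needs_human`: a small queue is a model being refined; a queue containing everything is a sweatshop with an AI logo.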

In the end, with normal advancement in the field, AI and its associated technologies should one day be just as available and just as easy to use as any of those historic buzzwords. Improved tools have already made certain types of AI – machine learning for image processing, most obviously – more easily available to a wide range of developers, and those almost-magic ways around information theory might become more everyday. Of course, they’ll also seem less magical, so by the time AI is actually available, it may become more difficult to impress Silicon Valley investors with talk of it.

Being Cautious

To risk a prediction, the most difficult outcome of this is likely to be the number of jobs displaced by ever greater amounts of automation. Computerization has always nibbled most freely at the lower-skill end of the job market, and AI raises much the same spectre. Quite how ab-initio television editors are to be trained when there may no longer be any need to manually log rushes remains to be seen, especially given the traditional reluctance of the industry to allocate more than the meanest resources to training new people.

In the meantime, while there’s perhaps a reason to be cautious about the extravagant claims of companies promoting products involving AI, it is already saving many of us huge amounts of time. On the basis that it’s actually been under development effectively since the dawn of computing – certainly since the 1950s – a reasonable reaction might actually be a relieved gasp of finally.

We might malign the hype, but without the power of a buzzword, we might have stopped pushing for it decades ago.
