What’s Next For AI And Camera And Imaging Technology?
Over the last few years, advances in camera and imaging technology powered by artificial intelligence have been staggering. Today, we are seeing increasingly sophisticated applications of AI in mid- and higher-end professional cameras like the Olympus OM-D E-M1X and Sony a6600.
These cameras employ AI-powered focus systems that can distinguish, among other things, humans from non-humans, eyes from ears, and trees from dogs. This capability, derived from ever-improving machine-learning algorithms, has enabled faster, smarter, and more precise auto-focus and follow-focus operations.
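To make that concrete, here is a minimal sketch of detection-driven autofocus in Python, using OpenCV's classical Haar-cascade detectors as a stand-in for the cameras' proprietary deep-learning subject recognition. The priority logic, eye over face, face over everything else, is the part that mirrors what these cameras actually do.

```python
# A minimal sketch of detection-driven autofocus. Haar cascades stand in
# for the deep-learning models real cameras use on dedicated silicon.
import cv2

face_model = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_model = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def pick_focus_point(frame):
    """Return (x, y) of the highest-priority focus target in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_model.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # fall back to conventional contrast/phase-detect AF
    # Prioritize the largest face, then look for an eye inside it.
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    eyes = eye_model.detectMultiScale(gray[y:y + h, x:x + w])
    if len(eyes) > 0:
        ex, ey, ew, eh = eyes[0]
        return (x + ex + ew // 2, y + ey + eh // 2)  # eye-priority AF point
    return (x + w // 2, y + h // 2)  # fall back to face center
```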
[Figure: a) Olympus OM-D E-M1X; b) Sony Alpha a6600; c) ARRI Alexa LF.] AI has enabled pro cameras like the Olympus OM-D E-M1X and Sony a6600 to reliably find and hold critical focus on human and non-human subjects. Most smartphones make extensive use of AI to reduce noise and enhance images, adding illumination and shadow detail to night scenes, for example. AI can be much more effective in small-sensor devices like the iPhone, since large-format cinema and broadcast cameras require substantially greater processing power and sophistication.
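The core trick behind those smartphone night modes is frame stacking: averaging a burst of aligned short exposures suppresses random sensor noise while preserving scene detail. A minimal sketch, assuming the frames are already aligned (real pipelines also register the frames and tone-map the shadows):

```python
# A minimal sketch of night-mode frame stacking. Averaging N aligned
# exposures reduces random noise by roughly the square root of N:
# an 8-frame burst cuts noise by a factor of about 2.8.
import numpy as np

def stack_exposures(frames):
    """Average a burst of aligned uint8 frames to reduce sensor noise."""
    stack = np.mean([f.astype(np.float32) for f in frames], axis=0)
    return np.clip(stack, 0, 255).astype(np.uint8)
```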
The advances in post-camera imaging utilizing artificial intelligence have been even more remarkable, with major strides apparent in the latest-generation rotoscoping tools that enable easy replacement of a drab gray sky, or removal of ill-placed utility wires and telephone poles jutting into frame.
Artificial intelligence is fundamentally changing how we approach our work: there is little point now for grips to spend hours flagging the sun off the side of an aircraft hangar, or cropping out a modern skyscraper when composing a 19th-century cityscape.
For cinematographers and everyone else in the creative process, it’s logical to consider the potential endgame for AI as it pertains to film and TV content creators. Will we see, soon or ever, a MAKE MOVIE BUTTON on a professional broadcast camera?
[Figure: a) MAKE MOVIE BUTTON; b) iPhone?] Coming from Apple in 2033? Don’t laugh. Judging by the success of the iPhone and its single-button interface, does a simple MAKE MOVIE BUTTON seem all that far-fetched?
Sure, the idea may seem preposterous now, but the truth is we are already seeing widespread use of AI to generate content in the greater media context on financial business websites like MarketWatch. Each trading day, a two-hundred-word story is generated from concept to finished copy with nary a human being’s input, nuanced wordplay, or pithy comic aside.
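For flavor, here is a minimal sketch of the template-driven generation behind such routine market reports. The ticker and prices are hypothetical; real systems pull closing data from a market feed.

```python
# A minimal sketch of template-based story generation for market reports.
# Ticker and prices are hypothetical illustration data.
def market_story(ticker, close, prev_close):
    change = close - prev_close
    pct = 100 * change / prev_close
    direction = "rose" if change >= 0 else "fell"
    return (f"Shares of {ticker} {direction} {abs(pct):.1f}% on the day, "
            f"closing at ${close:.2f} versus ${prev_close:.2f} "
            f"in the previous session.")

print(market_story("XYZ", 101.25, 98.40))
# Shares of XYZ rose 2.9% on the day, closing at $101.25 versus $98.40 ...
```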
We are all familiar with Apple’s signature, Steve Jobs-inspired one-button approach, as dramatically evidenced by the meteoric success of the single-button iPhone. The same philosophy has long applied to Apple software as well, like the one-button MAKE SLIDESHOW function inside iPhoto. Introduced over a decade ago, that capability lives on in today’s Photos application, which has grown increasingly sophisticated with the addition of AI-powered facial and content recognition. The finished slideshows may not be hugely inspired or compelling, but the sheer convenience and economy of so easily producing finished content is bound to attract many appreciative takers.
GOOD, BETTER, or BEST? Apple’s old Compressor application once served up this mind-blowing choice for filmmakers outputting their movies. Thinking about it, one has to wonder: who would choose ‘GOOD’ when one can simply choose ‘BEST’?
Listening to the presentations at Adobe MAX in October, one got the impression that artificial intelligence will soon rule the roost for most, if not all, of the major post-camera applications. Leveraging the company’s Sensei machine-learning algorithm, Adobe has significantly reduced the umpteen labor-hours required for many common processes like the reformatting of content, color correction, and sky control, at least for those who desire simple, one-click convenience and are willing to accept a generalized result.
Sky replacement and the creation of intricate mattes that used to take days have now been reduced to a simple operation. Object selection, rotoscoping, even the changing of seasons via a single button click, are all powered by artificial intelligence that is dramatically improving in scope and capabilities every year.
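Once an AI segmentation model has produced a per-pixel sky matte, which is the hard part these tools now automate, the replacement itself reduces to a single blend. A minimal sketch, assuming the mask is already in hand:

```python
# A minimal sketch of mask-based sky replacement. The AI does the hard
# work of generating sky_mask; compositing is then a simple blend.
import numpy as np

def replace_sky(frame, new_sky, sky_mask):
    """Blend new_sky into frame wherever sky_mask (float, 0.0-1.0) says 'sky'."""
    mask = sky_mask[..., np.newaxis]  # broadcast the mask over color channels
    out = (frame.astype(np.float32) * (1 - mask)
           + new_sky.astype(np.float32) * mask)
    return np.clip(out, 0, 255).astype(np.uint8)
```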
Patrick Palmer, Director of Product Management for Pro Video & Motion Design, places the AI juggernaut in perspective. “The idea at Adobe is to use AI’s power to reduce the drudgery of mundane tasks like rotoscoping, which is notoriously labor-intensive as it must be performed at a frame-by-frame level.”
Many broadcasters are already reaping the benefits of machine learning for automating segmentation, frame analysis, and speech-to-text. The intelligent reformatting of content has proven to be a real time-saver, easily and quickly applied via Auto Reframe inside Premiere Pro, for example.
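Here is a minimal sketch of speech-to-text for searchable dailies, using the open-source Whisper model (pip install openai-whisper) rather than Adobe’s own engine; the clip name is hypothetical.

```python
# A minimal sketch of speech-to-text for logging footage, using the
# open-source Whisper model. The clip filename is a placeholder.
import whisper

model = whisper.load_model("base")
result = model.transcribe("interview_take_03.mov")

# Each segment carries start/end times, so spoken words map back to
# positions in the footage for search and logging.
for seg in result["segments"]:
    print(f"{seg['start']:7.2f}s  {seg['text'].strip()}")
```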
[Figure: Adobe neural filters: a) colorization; b) sky control; c) Super Zoom.] Adobe has introduced a bevy of post-camera imaging tools leveraging the company’s Sensei machine-learning algorithm. The neural filters inside Photoshop enable reasonably accurate colorization of black-and-white images, a sky control feature that affects only the sky and not the surrounding landscape, and a Super Zoom function that intelligently restores the detail lost in zoomed-in images. AI works at the pixel level, and therefore offers its greatest potential to cinematographers inside the frame, where motion is not a factor.
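For a taste of what Super Zoom-style upscaling involves, here is a minimal sketch using OpenCV’s dnn_superres module, which requires opencv-contrib-python and a pretrained EDSR model file downloaded separately. This is a generic stand-in, not Adobe’s implementation.

```python
# A minimal sketch of AI super-resolution in the spirit of Super Zoom.
# Requires opencv-contrib-python and the pretrained EDSR_x4.pb model.
import cv2

sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("EDSR_x4.pb")      # pretrained super-resolution network
sr.setModel("edsr", 4)          # 4x upscale

image = cv2.imread("crop.png")  # a zoomed-in, detail-starved crop
upscaled = sr.upsample(image)   # network synthesizes plausible detail
cv2.imwrite("crop_4x.png", upscaled)
```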
Beyond reformatting, AI’s potential for broadcasters includes, most notably, a vastly more powerful search function. I recall, as a cinematographer in the 1980s, being handed two 400-foot rolls of 35mm film to shoot a 30-second commercial. If I asked my producer for more, he would grimace and shoot me a disparaging glare: ‘You’re only shooting a 30-second commercial, not War and Peace!’
Today, for a comparable undertaking, if I don’t fill two or three 1TB memory cards, I’ll be accused of dogging it or taking the day off. The advent of digital media with its perceived economy has led to overshooting on an epic scale. Shooting 1000:1 or more is not uncommon these days, which leads, of course, to a truly overwhelming amount of footage and frames (in 4K no less) that must be viewed and sorted, and ultimately archived.
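The back-of-envelope arithmetic makes the point, assuming a 30-second spot, a 1000:1 shooting ratio, and roughly 1 GB per minute of 4K footage (a typical long-GOP codec; ProRes would be several times larger):

```python
# Back-of-envelope arithmetic for a 1000:1 shooting ratio.
# The 1 GB/min data rate is an assumed, codec-dependent figure.
spot_seconds = 30
ratio = 1000                  # 1000:1 shooting ratio
gb_per_minute = 1.0           # assumed 4K long-GOP data rate

footage_minutes = spot_seconds * ratio / 60
print(f"Footage shot: {footage_minutes:.0f} min "
      f"(~{footage_minutes / 60:.1f} hours)")
print(f"Storage: ~{footage_minutes * gb_per_minute:.0f} GB")
# Footage shot: 500 min (~8.3 hours) -> ~500 GB to view, sort, and archive
```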
So how would a MAKE MOVIE BUTTON actually work? First, AI must identify what is important in the frame. This would entail a rigorous gaze-analysis study involving thousands of test subjects’ eyes following content in a lab. Even then, the learning algorithm would likely produce a less-than-satisfactory result, since a rating system is also needed to make more human-like decisions.
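A minimal sketch of that first step, scoring frames by visual saliency as a crude proxy for importance, using OpenCV’s spectral-residual saliency detector (opencv-contrib-python); a shipping system would add the gaze-trained rating model described above.

```python
# A minimal sketch of saliency-based shot ranking, the crude first step
# a MAKE MOVIE BUTTON would need before any human-like rating model.
import cv2

detector = cv2.saliency.StaticSaliencySpectralResidual_create()

def saliency_score(frame):
    """Mean saliency of a frame: higher means more visually 'important'."""
    ok, saliency_map = detector.computeSaliency(frame)
    return float(saliency_map.mean()) if ok else 0.0

def rank_shots(shots):
    """Order candidate shots (lists of frames) by average saliency."""
    return sorted(
        shots,
        key=lambda frames: sum(map(saliency_score, frames)) / len(frames),
        reverse=True)
```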
Now, with AI powering ever more capable search functions, we can scan vast libraries of footage and identify content by person, location, activity, or spoken word, and this capability will be readily available in even the lowest-cost editing and archiving tools.
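Here is a minimal sketch of such text-driven search, using the open-source CLIP model (pip install git+https://github.com/openai/CLIP.git) to rank frame thumbnails against a natural-language query; the filenames are hypothetical.

```python
# A minimal sketch of natural-language footage search with CLIP.
# Frame thumbnails stand in for a real clip library.
import torch
import clip
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")

def search(frame_paths, query, top_k=5):
    """Rank frame images by similarity to a natural-language query."""
    images = torch.stack([preprocess(Image.open(p)) for p in frame_paths])
    text = clip.tokenize([query])
    with torch.no_grad():
        img_emb = model.encode_image(images)
        txt_emb = model.encode_text(text)
        img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
        txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
        scores = (img_emb @ txt_emb.T).squeeze(1)
    best = scores.argsort(descending=True)[:top_k].tolist()
    return [(frame_paths[i], float(scores[i])) for i in best]

# Example: search(["frame_0001.jpg", "frame_0002.jpg"], "a dog on a beach")
```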
Which brings us back to our initial question: will cinematographers’ cameras of the future feature a one-click MAKE MOVIE BUTTON?
While artificial intelligence will no doubt offer the ability to identify and locate shots in an archive, the smart technology, vis-à-vis the cinematographer, will ultimately prove much more applicable inside the frame: for finding and holding critical focus, for example, or for sky replacement and wire removal.
Generating a complete sequence, however, or an entire movie, with the concomitant timing and continuity, is a totally different matter. Given the permutations, choices, and endless vagaries of editing and assembling motion video, one can safely say that AI will never replace what a human brain can come up with. As Adobe’s Palmer is quick to point out, “Machine learning can be only as good as the previously defined tasks in the experience of the machine-learning algorithm.”
Someday, years in the future, perhaps Apple (or someone else) will harness the power of artificial intelligence to produce a smartphone with a one-click MAKE MOVIE BUTTON. If and when that ever happens, we can take some solace from Adobe’s current suite of pro applications: the cinematographer’s camera, like Adobe’s software, must feature the transparency and overrides that are essential to a fully professional creative storytelling tool.
AI will never be your boss. But it can be a fabulous assistant.