Machine Learning and AI dominate SMPTE 2017 conference

Attendees at the Society of Motion Picture and Television Engineers (SMPTE) conference might conclude that Machine Learning (ML) and Artificial Intelligence (AI) have progressed well beyond hype as they start to enter just about every aspect of video production and distribution.

ML and AI have climbed out of analyst group Gartner’s famous trough of disillusionment onto the sunlit uplands of enlightenment. Both featured in many of the presentations covering different aspects of the video lifecycle. That is perhaps not so surprising, since they represent the general shift in computation away from explicit series of instructions towards implicit techniques that are more flexible and allow a system to adapt to its environment without further modification of the software. This in turn reflects the growing ubiquity of computing as it permeates every aspect of life, requiring solutions to more complex problems that cannot readily be programmed explicitly.

Both AI and ML have been around since the dawn of computing, but it is only recently that the almost astronomical increase in both processing power and storage capacity has finally enabled them to be applied seriously at scale. This in turn has stimulated adaptation of the algorithms to exploit the far greater resources available. AI covers a wide variety of advanced techniques designed in various ways to capture or simulate human expertise. ML has emerged from some of these, especially pattern recognition and computational learning theory, to embrace tasks not suited to traditional software based on serial instructions. Tasks where ML has already proven itself include email filtering, intruder detection and computer vision, where machines learn essentially by trial and error combined with feedback to respond more intelligently or appropriately to their inputs. ML is also playing a central role in major emerging sectors, such as autonomous driving and video security monitoring.

At the SMPTE conference, Konstantin Wilms, Principal Solution Architect at Amazon Web Services, discussed the security angle as part of his presentation “Integrating AI and ML technologies into cloud-based media workflows”. Deep learning can be applied both to threat monitoring, where it helps distinguish real attacks from false alarms and so cuts down on false positives and false negatives, and to authentication via biometric techniques such as facial recognition. Wilms argued that ML can be infused into a wide variety of processes, including classification of management metadata and sentiment analysis, which has been applied in TV for at least a decade but can again be enhanced with the latest tools and hardware. Sentiment analysis involves assessing a user’s interest in, or opinion of, a particular program through a variety of cues, including statements, viewing time, gaps between watching episodes where relevant, and recommendations made on social media. The technique can be applied to text in messages or social media postings to build up a picture of a user’s opinion.
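As a rough illustration of how the cues listed above might be combined, here is a minimal Python sketch. It is not taken from the presentation: the word list, the function names and the equal weighting of the three cues are all assumptions made for the example, and the fixed lexicon stands in for what a production system would learn from labelled data.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration);
# an ML-based system would learn such weights from labelled examples.
LEXICON = {"love": 2, "great": 1, "good": 1, "boring": -1, "bad": -1, "hate": -2}


def text_sentiment(posts):
    """Return the average lexicon score per post, or 0.0 if there are no posts."""
    if not posts:
        return 0.0
    total = 0
    for post in posts:
        words = re.findall(r"[a-z']+", post.lower())
        total += sum(LEXICON.get(word, 0) for word in words)
    return total / len(posts)


def viewer_interest(posts, minutes_watched, episode_minutes, days_between_episodes):
    """Combine text sentiment with viewing-time and episode-gap cues.

    The simple sum used here is an assumed weighting, not a published formula.
    """
    # Fraction of the episode actually watched, capped at 1.0.
    completion = min(minutes_watched / episode_minutes, 1.0)
    # Shorter gaps between episodes are treated as stronger interest.
    recency = 1.0 / (1.0 + days_between_episodes)
    return text_sentiment(posts) + completion + recency
```

For example, `viewer_interest(["great"], 30, 60, 1)` adds a text score of 1, a completion of 0.5 and a recency of 0.5.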

Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Laboratories, reviewed the technology underlying color and dynamic range management and how these issues relate to human perception.

AI and ML are also starting to make a major contribution on the automation front, during both video processing and preparation. At the SMPTE conference, Martin Wahl, Principal Program Manager for Microsoft’s Azure Media Services, explained how its AI-based video indexing service allows users to curate metadata directly from uploaded video content, exploiting speech-to-text transcription and closed captioning, as well as face and object detection combined with language translation. This is a good example of various AI methods working together, with applications in this case including automatic advertisement and content classification, dynamic adaptation of content based on audience preferences, and automatic creation of highlight reels, as well as summaries based on detection of scenes, motion and people within a video. Wahl explained that the core technology is derived from underlying vision, speech, language, knowledge and search modules that Microsoft’s R&D has developed over some years. These reduce effort and at the same time extend the scope of metadata to describe content in greater depth and nuance.

On another very contemporary theme, color balancing for content captured in HDR (High Dynamic Range) with WCG (Wide Color Gamut), Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Laboratories, covered techniques for managing color and dynamic range in multi-camera production. She stressed that for HDR and WCG to deliver the best possible experience, color management must be geared towards human perception, and she explored possible future solutions on that front.

There is also scope for applying ML and AI to optimize encoding, with several discussions at the conference highlighting the potential of ML for adaptive compression taking better account of redundancy and variation in detail as the video sequence progresses. By matching compression better to the content at a higher level than just individual frames there is scope for further reductions in bit rate at a given video quality. Delegates at the conference might be surprised then that ML has not already been incorporated into compression algorithms, given its now well proven scope for much more efficient encoding with great potential for live content in particular.
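The idea of matching compression to content above the level of individual frames can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 200 kbps floor and the frame-difference complexity measure are all hypothetical, and the hand-written measure stands in for the score a trained model would supply in an ML-based encoder.

```python
def scene_complexity(frames):
    """Mean absolute difference between consecutive frames.

    Each frame is a flat list of pixel values. This hand-written measure is a
    stand-in for the complexity estimate a trained model would produce.
    """
    if len(frames) < 2:
        return 0.0
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur))
    return sum(diffs) / len(diffs)


def allocate_bitrate(scenes, total_kbps, floor_kbps=200):
    """Split a bitrate budget across scenes in proportion to their complexity.

    Every scene gets at least floor_kbps (an assumed value); the remainder is
    shared by complexity. Falls back to an even split when there is nothing
    to share or no scene shows any motion.
    """
    scores = [scene_complexity(scene) for scene in scenes]
    spare = total_kbps - floor_kbps * len(scenes)
    total_score = sum(scores)
    if spare <= 0 or total_score == 0:
        return [total_kbps / len(scenes)] * len(scenes)
    return [floor_kbps + spare * score / total_score for score in scores]
```

With one static scene and one moving scene, `allocate_bitrate([[[0, 0], [0, 0]], [[0, 0], [10, 10]]], 1000)` leaves the static scene at the floor and gives the rest of the budget to the moving one.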

Indeed, no significant commercially available video encoder employs machine learning, as they have nearly all evolved around explicit heuristic methods that have been fine-tuned over many years. Given relatively limited scope for significant further improvements through fine tuning, there is growing demand from some experts for applying ML techniques to develop software that combines well proven techniques with much greater adaptation to content in real time. There is mounting evidence that greater efficiency can be achieved with the help of ML while reducing programming effort. This is leading towards application of ML for automatic and iterative software development, which would realize a long-standing dream of computer programming and IT project management. It might be less good news for humans if even skilled tasks such as software development become automated, although the same people might find employment in AI and ML.

Yet another application of ML discussed at the SMPTE conference, again with an automation focus, was Quality Control (QC). This is required right across the media supply chain, as Telestream’s product managers Dominic Jackson and James Welch explained. In a presentation called “Zen and the Art of Media in Motion: The Many Aspects of Quality in the Media Supply Chain”, they discussed how ML and AI can be applied to automate QC processes that at present can only be performed by people. This can reduce costs and at the same time has the potential to improve quality and reliability further, because once machines have mastered a task they perform it, in principle at least, flawlessly.
