Machine Learning and AI dominate SMPTE 2017 conference

Attendees at the Society of Motion Picture and Television Engineers (SMPTE) conference might conclude that Machine Learning (ML) and Artificial Intelligence (AI) have progressed well beyond hype as they enter just about every aspect of video production and distribution.

ML and AI have climbed out of analyst group Gartner’s famous trough of disillusionment and onto the slope of enlightenment. Both featured in many of the presentations covering different aspects of the video lifecycle. This is perhaps not so surprising, since they represent the course of computation in general away from explicit series of instructions towards implicit techniques that are more flexible and allow a system to adapt to its environment without further modification of its software. That shift in turn reflects the growing ubiquity of computing as it permeates every aspect of life, demanding solutions to complex problems that cannot readily be programmed explicitly.

Both AI and ML have been around since the dawn of computing, but only recently has the almost astronomical increase in both processing power and storage capacity enabled them to be applied seriously at scale. This has in turn stimulated adaptation of the algorithms to exploit the far greater resources available. AI covers a wide variety of advanced techniques designed in various ways to capture or simulate human expertise. ML has emerged from some of these, especially pattern recognition and computational learning theory, to embrace tasks not suited to traditional software based on serial instructions. Tasks where ML has already proven itself include email filtering, intrusion detection and computer vision, where machines learn essentially from trial and error combined with feedback to respond more intelligently or appropriately to their inputs. ML is also playing a central role in major emerging sectors such as autonomous driving and video security monitoring.

At the SMPTE conference, Konstantin Wilms, Principal Solution Architect at Amazon Web Services, discussed the security angle in his presentation “Integrating AI and ML technologies into cloud-based media workflows”. Deep learning can be applied both to threat monitoring, distinguishing real attacks from false alarms and so cutting down on false positives and negatives alike, and to authentication via biometric techniques such as facial recognition. Wilms argues that ML can be infused into a wide variety of processes, including classification of management metadata and sentiment analysis, which has been applied in TV for at least a decade but can likewise be enhanced with the latest tools and hardware. Sentiment analysis involves assessing a user’s interest in, or opinion of, a particular program through a variety of cues, including statements, viewing time, gaps between watching episodes where relevant, and recommendations made on social media. The technique can also be applied to text in messages or social media postings to build up a picture of a user’s opinion.
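To make the text side of this concrete, a minimal sketch of lexicon-based sentiment scoring of viewer posts might look like the following. The word lists and scoring rule are invented for illustration; real systems use trained models rather than fixed lexicons.

```python
# Naive lexicon-based sentiment scoring for viewer posts.
# POSITIVE/NEGATIVE word sets are invented for this sketch; a production
# system would use a trained classifier, not a hand-built lexicon.

POSITIVE = {"great", "love", "brilliant", "gripping"}
NEGATIVE = {"boring", "hate", "awful", "slow"}

def sentiment_score(post: str) -> int:
    """Return +1 per positive word and -1 per negative word found."""
    words = post.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def classify(post: str) -> str:
    """Map the raw score to a coarse sentiment label."""
    score = sentiment_score(post)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In practice such text signals would be combined with the behavioural cues mentioned above (viewing time, gaps between episodes) before drawing conclusions about a viewer’s opinion.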

AI and ML are also starting to make a major contribution to automation, during both video processing and preparation. At the SMPTE conference, Martin Wahl, Principal Program Manager for Microsoft’s Azure Media Services, explained how its AI-based video indexing service allows users to curate metadata directly from uploaded video content, exploiting speech-to-text transcription and closed captioning, as well as face and object detection combined with language translation. This is a good example of various AI methods working together, with applications including automatic advertisement and content classification, dynamic adaptation of content based on audience preferences, and automatic creation of highlight reels and summaries based on detection of scenes, motion and people within a video. Wahl explained that the core technology is derived from underlying vision, speech, language, knowledge and search modules that Microsoft’s R&D has developed over some years. These reduce effort and at the same time extend the scope of metadata to describe content in greater depth and nuance.
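The essence of such an indexing pipeline is merging the outputs of separate AI modules into one time-coded metadata track. The sketch below shows that merging step with invented input structures; it is not the Azure service’s actual API, just an illustration of the principle.

```python
# Illustrative merge of two AI module outputs (speech transcription and
# face detection) into a single time-coded metadata track. The tuple
# formats below are assumptions for this sketch, not a real service API.

def build_index(transcript_segments, face_detections):
    """Merge transcript and face events into one list sorted by time.

    transcript_segments: list of (start_sec, end_sec, text)
    face_detections:     list of (timestamp_sec, person_name)
    """
    events = []
    for start, end, text in transcript_segments:
        events.append({"time": start, "type": "speech", "detail": text})
    for ts, name in face_detections:
        events.append({"time": ts, "type": "face", "detail": name})
    return sorted(events, key=lambda e: e["time"])
```

A downstream application such as highlight-reel generation would then scan this unified track for interesting combinations of events, rather than querying each module separately.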

On another very contemporary theme, color balancing for content captured in HDR (High Dynamic Range) with WCG (Wide Color Gamut), Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Laboratories, covered techniques for managing color and dynamic range in multi-camera production. She stressed that for HDR and WCG to deliver the best possible experience, color management must be geared towards human perception, and she explored possible future solutions on that front.

There is also scope for applying ML and AI to optimize encoding, with several discussions at the conference highlighting the potential of ML for adaptive compression that takes better account of redundancy and variation in detail as a video sequence progresses. By matching compression to the content at a higher level than individual frames, there is scope for further reductions in bit rate at a given video quality. Delegates might be surprised, then, that ML has not already been incorporated into compression algorithms, given its potential for much more efficient encoding, particularly of live content.
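The idea of adapting compression above the frame level can be sketched as per-scene bitrate allocation driven by a complexity score; in an ML-based encoder that score would come from a trained model, whereas the stand-in below is a simple linear mapping with invented numbers.

```python
# Content-adaptive bitrate sketch: scale a base bitrate by an estimated
# per-scene complexity score in [0, 1]. All constants are invented for
# illustration; a real encoder would predict complexity with a trained
# model and respect codec-specific rate-control constraints.

BASE_BITRATE_KBPS = 3000
MIN_KBPS, MAX_KBPS = 1000, 8000

def scene_bitrate(complexity: float) -> int:
    """Map complexity 0..1 to a bitrate between 0.5x and 1.5x the base."""
    target = BASE_BITRATE_KBPS * (0.5 + complexity)
    return int(max(MIN_KBPS, min(MAX_KBPS, target)))

def allocate(scene_complexities):
    """Return a per-scene bitrate plan for a sequence of scenes."""
    return [scene_bitrate(c) for c in scene_complexities]
```

The payoff described above comes from spending fewer bits on static, low-detail scenes and more on complex ones, keeping perceived quality constant at a lower average bit rate.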

Indeed, no significant commercially available video encoder yet employs machine learning; nearly all have evolved around explicit heuristic methods fine-tuned over many years. With relatively limited scope for further improvement through such tuning, some experts are now calling for ML techniques to be applied in developing software that combines well-proven methods with much greater real-time adaptation to content. There is mounting evidence that greater efficiency can be achieved with the help of ML while reducing programming effort. This points towards applying ML to automatic and iterative software development, which would realize a long-standing dream of computer programming and IT project management. It might be less good news for humans if even skilled tasks such as software development become automated, although the same people might find employment in AI and ML.

Yet another application of ML with an automation focus discussed at the SMPTE conference was Quality Control (QC), which is required right across the media supply chain, as Telestream’s product managers Dominic Jackson and James Welch explained. In their presentation “Zen and the Art of Media in Motion: The Many Aspects of Quality in the Media Supply Chain”, they discussed how ML and AI can automate QC processes that at present can only be performed by people. This can reduce costs while also improving quality and reliability, because once machines have mastered a task they are, in principle at least, flawless.
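One of the simplest QC checks that can be automated is flagging runs of black frames. The sketch below uses a mean-luma threshold over frames represented as flat lists of 8-bit luma samples; the threshold value and frame representation are assumptions for illustration, and would be tuned and adapted to real decoded video in practice.

```python
# Automated QC sketch: flag frames whose mean luma falls below a
# threshold, a classic black-frame check. Frames here are flat lists of
# 8-bit luma samples; BLACK_THRESHOLD is an invented value that would be
# tuned against real content in a production QC tool.

BLACK_THRESHOLD = 16

def is_black(frame) -> bool:
    """Judge a frame black if its mean luma is below the threshold."""
    return sum(frame) / len(frame) < BLACK_THRESHOLD

def black_frame_indices(frames):
    """Return the indices of frames judged black, for an operator report."""
    return [i for i, f in enumerate(frames) if is_black(f)]
```

ML enters where such fixed-threshold rules break down, for example distinguishing an intentional fade-to-black from a dropout, which is exactly the kind of judgment currently left to human operators.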
