Telestream is promoting Machine Learning for Quality Control at the SMPTE’s annual conference.
Attendees at the Society of Motion Picture and Television Engineers (SMPTE) conference might conclude that Machine Learning (ML) and Artificial Intelligence (AI) have progressed well beyond hype, as they start to enter just about every aspect of video production and distribution.
ML and AI have climbed out of analyst group Gartner’s famous trough of disillusionment onto the slope of enlightenment. Both featured in many of the presentations covering different aspects of the video lifecycle. That is perhaps not so surprising, since they represent the general course of computation away from explicit series of instructions towards implicit techniques that are more flexible and allow a system to adapt to its environment without further modification of software. This in turn reflects the growing ubiquity of computing as it permeates every aspect of life, requiring solutions to more complex problems that cannot be readily programmed explicitly.
Both AI and ML have been around since the dawn of computing, but it is only recently that the almost astronomical increase in both processing power and storage capacity has enabled them to be applied seriously at scale. This in turn has stimulated adaptation of the algorithms to exploit the far greater resources available. AI covers a wide variety of advanced techniques designed in various ways to capture or simulate human expertise. ML has emerged from some of these, especially pattern recognition and computational learning theory, to embrace tasks not suited to traditional software based on serial instructions. Tasks where ML has already proven itself include email filtering, intruder detection and computer vision, where machines learn essentially from trial and error combined with feedback to respond more intelligently or appropriately to their inputs. ML is also playing a central role in major emerging sectors, such as autonomous driving and video security monitoring.
At the SMPTE conference, Konstantin Wilms, Principal Solution Architect at Amazon Web Services, discussed the security angle as part of his presentation “Integrating AI and ML technologies into cloud-based media workflows”. Deep learning can be applied to threat monitoring, where it helps distinguish real attacks from false alarms and so cuts down on both false positives and false negatives, and to authentication via biometric techniques such as facial recognition. Wilms argues that ML can be infused into a wide variety of processes, including classification of management metadata and sentiment analysis, which has been applied in TV for at least a decade but can be enhanced with the latest tools and hardware. Sentiment analysis involves assessing a user’s interest in, or opinion of, a particular program through a variety of cues, including statements, viewing time, gaps between watching episodes where relevant, and recommendations made on social media. The technique can be applied to text in messages or social media postings to build up a picture of a user’s opinion.
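As a rough illustration of the text side of sentiment analysis, the sketch below scores a posting by counting positive and negative words. It is a deliberately simple lexicon-based stand-in, not the ML approach described in the presentation, and the word lists and the `sentiment_score` function are invented for this example.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"love", "great", "brilliant", "gripping"}
NEGATIVE = {"boring", "awful", "dull", "hate"}


def sentiment_score(text):
    """Return a score from -1.0 (negative) to 1.0 (positive).

    Returns 0.0 when the text contains no word from either list.
    """
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total_hits = positive_hits + negative_hits
    if total_hits == 0:
        return 0.0
    return (positive_hits - negative_hits) / total_hits


if __name__ == "__main__":
    for post in ("Love the new series, gripping finale!",
                 "So boring, I hate it"):
        print(post, "->", sentiment_score(post))
```

A production system would typically replace the word lists with a trained model and combine the text score with the behavioural cues mentioned above, such as viewing time.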
Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Laboratories, reviewed the technology underlying color and dynamic range management and how these issues relate to human perception.
AI and ML are also starting to make a major contribution to automation in both video processing and preparation. At the SMPTE conference, Martin Wahl, Principal Program Manager for Microsoft’s Azure Media Services, explained how the company’s AI-based video indexing service allows users to curate metadata directly from uploaded video content, exploiting speech-to-text transcription and closed captioning, as well as face and object detection combined with language translation. This is a good example of various AI methods working together. Applications in this case include automatic advertisement and content classification, dynamic adaptation of content based on audience preferences, and automatic creation of highlight reels, as well as summaries based on detection of scenes, motion and people within a video. Wahl explained that the core technology is derived from underlying vision, speech, language, knowledge and search modules that Microsoft’s R&D has developed over some years. These reduce effort and at the same time extend the scope of metadata to describe content in greater depth and nuance.
On another very contemporary theme, color balancing for content captured in HDR (High Dynamic Range) with WCG (Wide Color Gamut), Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Laboratories, covered techniques for managing color and dynamic range in multi-camera production. She stressed that for HDR and WCG to deliver the best possible experience, color management must be geared towards human perception, and she explored possible future solutions on that front.
There is also scope for applying ML and AI to optimize encoding, with several discussions at the conference highlighting the potential of ML for adaptive compression that takes better account of redundancy and variation in detail as the video sequence progresses. By matching compression to the content at a higher level than individual frames, there is scope for further reductions in bit rate at a given video quality. Delegates at the conference might be surprised, then, that ML has not already been incorporated into compression algorithms, given its potential for much more efficient encoding, for live content in particular.
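The idea of matching compression to content above the frame level can be sketched as a per-scene bit budget. The example below is a minimal illustration under stated assumptions, not how any encoder discussed at the conference works: the `allocate_bitrate` function and the complexity scores are hypothetical, and in an ML-driven system the scores would come from a trained model rather than being supplied by hand.

```python
def allocate_bitrate(complexities, durations, average_kbps):
    """Split a fixed bit budget across scenes in proportion to complexity.

    complexities: hypothetical per-scene complexity scores (higher = more detail)
    durations:    scene lengths in seconds
    average_kbps: target average bit rate over the whole sequence
    Returns the bit rate in kbps assigned to each scene.
    """
    if len(complexities) != len(durations):
        raise ValueError("complexities and durations must have the same length")
    total_kilobits = average_kbps * sum(durations)
    # Weight each scene by complexity multiplied by how long it lasts.
    total_weight = sum(c * d for c, d in zip(complexities, durations))
    if total_weight <= 0:
        raise ValueError("total weighted complexity must be positive")
    # Scene bits = budget * (c * d) / total_weight, so rate = bits / d.
    return [total_kilobits * c / total_weight for c in complexities]


if __name__ == "__main__":
    # A static interview scene followed by a fast sports scene, 10 s each.
    rates = allocate_bitrate([1.0, 3.0], [10, 10], average_kbps=4000)
    print(rates)  # the busier scene receives three times the bit rate
```

The overall average stays at the target, while simple scenes give up bits that detailed scenes can use, which is the saving the paragraph above describes.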
Indeed, no significant commercially available video encoder employs machine learning, as nearly all have evolved around explicit heuristic methods that have been fine-tuned over many years. Given the relatively limited scope for significant further improvement through fine-tuning, there is growing demand from some experts to apply ML techniques in developing software that combines well-proven techniques with much greater adaptation to content in real time. There is mounting evidence that greater efficiency can be achieved with the help of ML while reducing programming effort. This is leading towards application of ML for automatic and iterative software development, which would realize a long-standing dream of computer programming and IT project management. It might be less good news for humans if even skilled tasks such as software development become automated, although the same people might find employment in AI and ML.
Yet another application of ML discussed at the SMPTE conference, again with an automation focus, was Quality Control (QC). This is required right across the media supply chain, as Telestream’s product managers Dominic Jackson and James Welch explained. In a presentation called “Zen and the Art of Media in Motion: The Many Aspects of Quality in the Media Supply Chain”, they discussed how ML and AI can be applied to automate QC processes that at present can only be performed by people. This can reduce costs and at the same time has the potential to improve quality and reliability further, because once machines have mastered a task they are, in principle at least, flawless.