Facial recognition - game changer
Video compression solutions provider V-Nova and Metaliquid, an AI video analysis solutions provider, have announced a strategic partnership to develop and commercialize products for machine learning powered content indexing.
To effectively deliver video analysis, a combination of speed and accuracy is paramount, states V-Nova. Currently, broadcasters can only afford to analyze a small portion of their media archive or a limited sample of frames. “They are often forced to reduce the resolution at which the analysis is performed because it’s faster and cheaper to process,” it argues. “However, lower resolutions lose details, which reduce the accuracy when recognizing key features like faces or the OCR of small text.”
After a proof-of-concept was shown at IBC 2019, Metaliquid’s video analysis solution and V-Nova’s PPro (previously PERSEUS Pro) have been combined to deliver an AI-powered software library for encoding and decoding SMPTE VC-6, which uses a hierarchical approach to represent images.
Guendalina Cobianchi, SVP Business Development & Partnerships at V-Nova, comments: “PPro is very smart: each video frame includes multiple levels of resolution and you can not only selectively access these resolutions, but also decode specific areas of the frame that are important for the analysis, plus it’s extremely fast. We can perform each video analysis task on the most appropriate set of pixels without having to trade off speed and accuracy.”
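The idea of a frame that carries several resolutions, from which an analysis task reads only the level and area it needs, can be illustrated with a toy resolution pyramid. This is a minimal sketch in plain Python and not PPro or the SMPTE VC-6 format: the function names (`downsample`, `build_pyramid`, `read_region`) are hypothetical, the 2x2 averaging is an assumed stand-in for whatever the codec really does, and no compression is modelled.

```python
def downsample(img):
    """Halve width and height by averaging each 2x2 block (even dimensions assumed)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def build_pyramid(img, levels):
    """Return [full, half, quarter, ...] resolution layers, `levels` entries in total."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def read_region(pyramid, level, x, y, w, h):
    """Read the window (x, y, w, h), given in full-resolution pixels, from one level."""
    scale = 2 ** level
    layer = pyramid[level]
    x0, y0 = x // scale, y // scale
    x1 = -(-(x + w) // scale)  # ceiling division so the window is fully covered
    y1 = -(-(y + h) // scale)
    return [row[x0:x1] for row in layer[y0:y1]]


# Hypothetical 8x8 single-channel frame with distinct pixel values.
frame = [[x + 8 * y for x in range(8)] for y in range(8)]
pyramid = build_pyramid(frame, 3)

# A coarse pass (e.g. scene classification) can use the smallest layer...
overview = pyramid[2]
# ...while a detail-sensitive pass (e.g. a face or small text) reads only
# the relevant window at full resolution.
detail = read_region(pyramid, 0, 4, 4, 4, 4)
```

In this sketch every layer is held in memory at once; it only shows the access pattern the quote describes, where each task picks the level and region that suit it instead of decoding the whole frame at full resolution.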
The initial proof-of-concept demonstrated an “outstanding” 3.2x performance gain from using PPro instead of JPEG, which, combined with the performance of Metaliquid’s algorithm, outperformed by “over an order of magnitude the benchmarked solutions currently used by the sponsors, with a comparable or higher accuracy level.”
Even further gains are expected during productisation.
Testimonies in support of their claims include:
Alan Winthroub, Director, Software Engineering at AP says, “We have one of the world’s largest multimedia archives and its long-term value is dependent on powerful indexing to deliver the rich metadata to make it discoverable. The step-change in performance this catalyst project has delivered means we can process more content, more quickly while generating richer data.”
Deirdre Temple, Head of Solutions, Transformation & Technology at RTÉ, says, “If we have an election or referendum, as a public service provider we need to show that we're giving balanced coverage for all parties involved, or for both sides of the debate. Providing this data is very labour-intensive; during a live debate we use stopwatches. Ideally, we would like to have real-time data available online to show our balanced coverage throughout a campaign. Metaliquid and V-Nova’s solution that has emerged from the Catalyst programme is a real game-changer for us.”
Simone Bronzin, CEO and founder at Metaliquid, said, “The full-stack control and customization capabilities we have over our proprietary deep-learning technology has yielded products that respond to industry challenges that are not easy to solve with general purpose AI solutions. Increasing the performance of image encoding and decoding in our workflow was another important step in offering a best-in-class solution. We are tremendously excited by the results of the catalyst project and look forward to delivering this step-change solution to the market very soon.”