Quantum is launching aiWARE for Xcellis, an on-premise version of Veritone’s cloud-based artificial intelligence platform. This solution enables organizations to apply cognitive analytics to video and audio content libraries without the cost and hassle of moving their large media libraries to the cloud.
Here, we talk with Quantum's senior director of product marketing Dave Frederick about the development and begin by asking whether the industry in general is confusing customers by badging solutions as artificially intelligent when in fact they are perhaps better described as advanced analytics.
Dave Frederick, Quantum: Perhaps. We see customers looking for three levels of awareness. The first level is storage awareness with respect to infrastructure, and it answers questions about storage capacity, how is it allocated, etc. The second level is file awareness. This is awareness about what’s being stored — files and directories — in the capacity, what’s being archived, when was a file touched last, who changed a file and when, etc. The third level is content awareness — the knowledge of what files contain, what content looks like, what’s being said, etc.
The first two levels of awareness are usually generated by systems that automatically collect and store information and report it when needed. The third level can be human-generated if necessary. Or, it can be generated with artificial intelligence (AI).
AI offers an automated way of generating content-related information that traditionally could be extracted only through human interaction. AI typically relies on some training to recognize the material it’s evaluating. Some AI engines are capable of improving their performance by inspecting more data. This is referred to as machine learning.
AI isn’t needed to analyze disk utilization or find duplicate or similarly named files. Rather, AI is about discovering the attributes of content and generating metadata that enhances the utilization of the content itself.
The new solution is initially available with optical character recognition (OCR), object recognition and transcription, which organizations can use to extract additional value from their on-premise video and audio content. Can you explain what OCR is and how it works?
DF: OCR engines have the ability to identify readable text in an image or video and convert that text into encoded characters that can be used in compute processes such as word processing, spreadsheets, etc. Use cases for OCR analysis include documenting on-screen graphics to identify names of people, reading scoreboards or license plates, or capturing and transcribing text from any scene that includes readable content.
How in practice is aiWARE applied to video and audio assets?
DF: Video and audio represent data that can only be evaluated and cataloged by watching (listening to) it. Customers are using AI to transcribe audio into a searchable text file so that they can quickly find specific content within an entire library of files. Once a text directory is created, further inspection might take the form of OCR (described above), object detection (finding known objects), object recognition (finding objects that look like a specific example), facial detection (the presence of a face on the screen), facial recognition (the person on the screen), speaker separation (split an interview or conversation into separate speakers), and more. While it’s cost-effective to run transcription against an entire library and beneficial to create an index with the resulting data, more advanced AI engines typically will be applied only to a subset of files to save time and money.
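The idea of turning transcripts into a searchable index can be illustrated with a minimal Python sketch. This is not aiWARE's API or Quantum's implementation; the file names, transcript text, and the `build_index` and `search` helpers are all hypothetical, and a real deployment would start from the output of a transcription engine rather than hand-written strings.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")

def build_index(transcripts):
    """Map each lowercase word to the set of files whose transcript contains it."""
    index = defaultdict(set)
    for file_name, text in transcripts.items():
        for word in WORD.findall(text.lower()):
            index[word].add(file_name)
    return index

def search(index, query):
    """Return the sorted file names whose transcripts contain every query word."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

# Hypothetical transcripts standing in for transcription-engine output.
transcripts = {
    "interview_01.wav": "The final score was three to one.",
    "news_02.mp4": "The mayor announced the final budget.",
}
index = build_index(transcripts)
print(search(index, "final score"))
```

Because the index is built once and queried many times, this kind of lookup stays cheap even across a large library, which is why transcription is a natural first pass before more expensive engines are applied to a subset of files.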
Can you elaborate on the orchestration element (multiple engines sequentially processing the same data and access to cloud-based AI engines for additional processes when desired) by explaining how this works and what this means for organizations?
DF: When the content being sought requires a series of machines for content analysis, multiple analyses can be linked so that each benefits from the information already gleaned by others. This model reduces the overall time and cost of processing as multiple engines home in on the ultimate result. Orchestration allows these operations to be scheduled and run automatically in sequence.
By 2020, how will AI have improved, perhaps to perform things it cannot do now?
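As a rough illustration of engines running in sequence and building on each other's results, here is a minimal Python sketch. It assumes each engine is a plain function that receives the asset plus the metadata gathered so far. The engine functions, the `orchestrate` helper, and the frame data are hypothetical stand-ins, not aiWARE's actual interface.

```python
def detect_faces(asset, metadata):
    """Stand-in face detection: list the ids of frames that contain a face."""
    return {"face_frames": [f["id"] for f in asset["frames"] if f["face"] is not None]}

def recognize_faces(asset, metadata):
    """Stand-in face recognition: inspect only the frames flagged by detection."""
    flagged = set(metadata["face_frames"])
    people = {f["face"] for f in asset["frames"] if f["id"] in flagged}
    return {"people": sorted(people)}

def orchestrate(asset, engines):
    """Run engines in order, passing each one the metadata gathered so far."""
    metadata = {}
    for engine in engines:
        metadata.update(engine(asset, metadata))
    return metadata

# Hypothetical asset: each frame records who (if anyone) is on screen.
asset = {
    "frames": [
        {"id": 0, "face": None},
        {"id": 1, "face": "anchor_a"},
        {"id": 2, "face": "guest_b"},
        {"id": 3, "face": None},
    ]
}
result = orchestrate(asset, [detect_faces, recognize_faces])
print(result)
```

In this sketch the recognition step skips frames that detection found empty, which is the sense in which a later engine benefits from information already gleaned by an earlier one.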
DF: Besides becoming more accurate, AI engines will also get faster. This is important in that AI will be used more and more often at the time of ingest or even at initial capture to create metadata in real time. The resulting metadata could be used throughout the entire production, post-production and delivery process. We will also see specialized engines that understand different topics or fields, such as medical, legal, financial, etc.