Digital Nirvana’s Trance AI Closed-Captioning Workflow Accelerates USTA Caption Generation

Digital Nirvana announced that Trance, its AI-powered automated postproduction captioning solution, is powering the closed-captioning workflow for the United States Tennis Association (USTA), the US governing body for tennis.

Integrated with Adobe Premiere Pro, Trance improves the speed and efficiency of the association’s caption generation process while freeing technical personnel to focus on the creative aspects of their jobs.

“The USTA aims to give tennis enthusiasts access to its content with minimal delay, especially when it pertains to live events or other timely content — something it couldn’t do with its old captioning process. Now with Trance seamlessly integrated into its Adobe postproduction workflow, the league gets a rapid turnaround time that is virtually unheard of in the captioning industry,” said Russell Wise, senior vice president of sales and marketing for Digital Nirvana. “Not only that, but the USTA has the industry’s highest caption-quality standards, with captions that are virtually 100% accurate and display with minimal delay, as the words are spoken.”

The USTA’s previous captioning process was time-consuming and inefficient, relying on manual tasks and transcripts generated by a third-party speech-to-text service. Video personnel had to copy and paste text from the transcripts, line by line, onto the video timeline and enter timecodes manually. Depending on the duration of the video program, captioning could take several hours, an unsatisfactory turnaround. Now, with Digital Nirvana’s AI-powered Trance workflow combined with Adobe Premiere Pro, most of the manual work has been eliminated and captioning is completed in a fraction of the time.

Since moving to Digital Nirvana, the USTA production team has been able to cut the turnaround time for captioning an average video title from hours to only 30 minutes, with the captioning task completely offloaded from the technical team. In a typical captioning workflow, a video technician simply drags the clip to a “hot folder” in the Digital Nirvana media service portal. The clip is then automatically uploaded to the processing center, where the Digital Nirvana team uses AI-based speech-to-text algorithms to generate a transcript, validate its accuracy, and then burn captions into the video. Once complete, the captioned video is automatically uploaded to the Premiere Pro timeline.

Digital Nirvana’s Trance application unites cutting-edge speech-to-text (STT) technology and other AI-driven processes with cloud-based architecture to drive efficient broadcast and media workflows. By implementing cloud-based metadata generation and closed captioning as part of their existing operations, media companies can radically reduce the time and cost of delivering accurate, compliant content for publishing worldwide. They can also enrich and classify content, enabling more effective repurposing of media libraries and more intelligent targeting of advertising spots. The latest version, Trance 3.0, adds a text translation engine that simplifies and speeds captioning in additional languages, along with automated caption conformance that accelerates delivery of content to new platforms and geographic regions.
