Making Broadcast Captioning Accessible With ASR

From the devices we use and the type of content we consume to our viewing habits, almost every aspect of media and, more importantly, our relationship with it has changed significantly in the last 10 to 15 years. As an example, in the UK, adults aged 16 and over watched, on average, 34 minutes less television in 2017 than in 2010 (according to Statista). While the rise of streaming services and on-demand content is undoubtedly a contributing factor, a bigger influence is almost certainly the rise in social media consumption. In fact, in 2015, more than 370 years of video - just shy of 200 million minutes - were watched on Twitter worldwide on a daily basis. As is the case with any industry, such rapid change brings with it new challenges and the need for evolution and adaptation, and captioning is no exception.

Broadening our reach

From a technological perspective, captioning is as good as it’s ever been, but technological advancements across the board have raised people’s expectations: mobile phones are as capable as they’ve ever been, homes are now smart and connected, and cars are close to driving themselves. This mentality translates to all technologies, meaning that people’s expectations of captioning are higher than ever before.

This demands constant innovation and improvement to deliver more efficient captioning, at a high level of accuracy, across the ever-expanding broadcast and digital landscape.

Speaking of increases, not only are expectation and demand growing, but the amount of content produced that requires captions and the amount of content consumed are both higher than ever before, meaning that the focus should be placed firmly on the audience.

While regulated content will continue to require skilled captioners to meet regulatory demands, there is a wealth of content within unregulated markets that could benefit from technological advancements to reach a wider pool of people.

Browsing through your average Twitter, LinkedIn or Facebook feed reveals that captioned content can be hit and miss. The methods used to meet regulatory demands are simply too expensive to apply to this media. As a result, the significant majority of SMEs lack the resources to caption their content across their social channels, which negatively impacts engagement with their users.

According to multiple publishers (DigiDay), up to 85% of all videos on Facebook are watched without sound. When you consider that over 500 million people watch video on Facebook every single day (via Forbes), there is an incomprehensibly large audience that is currently not being catered to.

This becomes an even bigger problem when viewed through the lens of accessibility. With over 900 million people (or one in ten) estimated to suffer from disabling hearing loss by 2050, it is now more important than ever for all producers of video content to make sure accessibility remains at the forefront of their minds.

The challenge therefore lies not only in accurate captioning but also in captioning the sheer amount of content that’s uploaded across the various social media platforms every minute. It’s imperative that we as an industry continue to add the tools necessary to make more content available with better-quality captioning across all media platforms - but how?

It goes without saying that the logistics of producing, managing and delivering those captions are a challenge of substantial complexity. One possible solution is Automatic Speech Recognition (ASR).

Huge advances in recent years have made the technology a realistic player in captioning for the very first time. It can now be used in our workflows to drive up productivity, allowing us to cover more content than ever before, more quickly and more effectively.

At Red Bee Media, introducing the element of ASR has allowed us to create a solution that makes the production of accurate captions more cost-efficient. Using Speechmatics’ highly accurate ASR technology, we are able to apply captions to video content for online and social more efficiently and more accurately than before. This has also enabled us to transcribe thousands of hours of video content to make it easily searchable in a secure environment.

Nor does this apply only to new content; we can also draw on existing video content that needs repurposing, which is now significantly more efficient and easier to do.


Captioning is at an inflection point; new technologies, new media, and new problems have created the perfect storm of issues, but also new opportunities. The industry is ripe for embracing new technological solutions, widening its reach to allow for the creation of better, more efficient captioning for the broadest possible audience.

Tom Wootton is product manager at Red Bee Media, part of Ericsson.
