Making Broadcast Captioning Accessible With ASR

From the devices we use to the type of content we consume and our viewing habits, almost every aspect of media and, more importantly, our relationship with it has changed significantly in the last 10 to 15 years. For example, UK adults aged 16 and over watched, on average, 34 minutes less television in 2017 than in 2010 (according to Statista). While the rise of streaming services and on-demand content is undoubtedly a contributing factor, a bigger influence is almost certainly the rise in social media consumption. In fact, in 2015, more than 370 years of video (just shy of 200 million minutes) were watched on Twitter worldwide every day. As in any industry, such rapid change brings new challenges and a need to evolve and adapt, and captioning is no exception.

Broadening our reach

From a technological perspective, captioning is as good as it’s ever been, but technological advancements across the board have raised people’s expectations: mobile phones are as capable as they’ve ever been, homes are now smart and connected, and cars are close to driving themselves. This mentality carries over to all technologies, meaning that people’s expectations of captioning are higher than ever before.

Meeting those expectations requires constant innovation and improvement to deliver efficient, highly accurate captioning across the ever-expanding broadcast and digital landscape.

Expectations and demand are not the only things growing: the amount of content produced that requires captions, and the amount of content consumed, are higher than ever before, so the focus should be placed firmly on the audience.

While regulated content will continue to require skilled captioners to meet regulatory demands, there is a wealth of content within unregulated markets that could benefit from technological advancements to reach a wider pool of people.

Browsing through an average Twitter, LinkedIn or Facebook feed reveals that the presence of captioned content can be hit and miss. The methods used to meet regulatory demands are simply too expensive to apply to this kind of media. As a result, the vast majority of SMEs lack the resources to caption content across their social channels, which hurts engagement with their users.

According to multiple publishers (including DigiDay), up to 85% of all videos on Facebook are watched without sound. When you consider that over 500 million people watch video on Facebook every single day (via Forbes), there is an incomprehensibly large audience that is currently not being catered to.

This becomes an even bigger problem when viewed through the lens of accessibility. With over 900 million people (or one in ten) estimated to suffer from disabling hearing loss by 2050, it is more important than ever for all producers of video content to keep accessibility at the forefront of their minds.

The challenge therefore lies not only in captioning accurately but in captioning the sheer amount of content that’s uploaded across the various social media platforms every minute. It’s imperative that we as an industry continue to add the tools needed to make more content available with better quality captioning across all media platforms - but how?

It goes without saying that the logistics of producing, managing and delivering those captions are a challenge of substantial complexity. A possible solution comes in the form of Automatic Speech Recognition (ASR).

Huge advances in recent years have made the technology a realistic player in captioning for the very first time. It can be used in our workflows to drive up productivity, allowing us to cover more content than ever before, more quickly and more effectively.

At Red Bee Media, introducing ASR has allowed us to create a solution that makes the production of accurate captions more cost-efficient. Using Speechmatics’ highly accurate ASR technology, we are able to apply captions to video content for online and social channels more efficiently and more accurately than before. This has also enabled us to transcribe thousands of hours of video content to make it easily searchable in a secure environment.

This does not apply only to new content; we can also pull from existing video content that needs repurposing, and doing so is significantly more efficient and easier to execute.


Captioning is at an inflection point: new technologies, new media, and new problems have created a perfect storm of issues, but also new opportunities. The industry is ripe for embracing new technological solutions and widening its reach, allowing for the creation of better, more efficient captioning for the broadest possible audience.

Tom Wootton is product manager at Red Bee Media, part of Ericsson.
