How to Add Live Captions to Streaming Content

If adding captions to your streaming content – whether it’s on-demand or live – feels daunting, you’re not alone. Both present new challenges that arise in step with fast-evolving formats and consumer tastes.

Of the two possibilities, however, live captioning for streaming content is the more difficult format for producers to nail down. That’s because, for starters, no universal standard format exists to uplink captions separately from video. In the past, this has forced even many web-only productions to use a hardware closed caption appliance that generates VANC captions for input to a streaming media encoder with an SDI input.

In addition, there’s a whole host of factors that can affect live workflows: the video codec, the streaming video encoder, the streaming server, and the multitude of consumer players may each introduce specific restrictions on the types of caption data accepted through the signal path.
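One way to reason about these restrictions is to treat each stage of the signal path as accepting a set of caption formats: captions only reach the viewer in a format that every stage accepts. The Python sketch below illustrates that idea. The stage names and the format sets are illustrative placeholders, not statements about any real codec, encoder, server, or player.

```python
from functools import reduce

# Hypothetical capabilities for each stage of a signal path.
# These sets are placeholders for illustration only.
SIGNAL_PATH = {
    "video_codec": {"CEA-608", "CEA-708"},
    "streaming_encoder": {"CEA-608", "CEA-708", "WebVTT"},
    "streaming_server": {"CEA-608", "WebVTT"},
    "player": {"CEA-608", "WebVTT"},
}


def formats_surviving_path(path):
    """Return the caption formats accepted by every stage in the path."""
    if not path:
        return set()
    return reduce(set.intersection, path.values())


def blocking_stages(path, caption_format):
    """List the stages that would reject the given caption format."""
    return sorted(
        name for name, accepted in path.items() if caption_format not in accepted
    )
```

With the placeholder data above, `formats_surviving_path(SIGNAL_PATH)` leaves only one format, and `blocking_stages(SIGNAL_PATH, "CEA-708")` names the stages that would need to change for another format to get through.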

But getting it right is worth it: Done correctly, closed captioning exposes content to a much larger audience, has a positive impact on video search, and is increasingly required for compliance with FCC standards. And it’s not just for broadcasters: In addition to the online presentation of news and sporting events, quality closed captioning has clear benefits for e-learning classrooms, corporate webinars, government facilities, and much more.

While hardware options exist for online streaming, the recent emergence of cloud-based real-time caption processing applications is making captioning for live online content more efficient and affordable than ever. A virtual caption encoder that functions completely in the cloud can now be employed on demand, allowing live content producers to connect to transcriptionists through a caption delivery network, such as iCap, and route captions directly to compatible web video servers. This method makes live closed captioning much more accessible for streaming-only media, since it can eliminate the need for a hardware caption encoder in an SDI signal path feeding a caption-compatible H.264 or RTMP streaming video encoder.

With the availability of cloud storage and transmission, captioning can be more easily generated and distributed.

    Here’s one signal path with which live streaming content producers could caption their projects, using a virtual caption encoder:

    1) Create an account with a cloud-based caption processing software solution and familiarize yourself with its use. Make sure ahead of time that the service you are using is compatible with the streaming server or service that you are uplinking your video to.

    2) Source a captioning or transcription service provider who can provide real-time transcription for your event. The highest quality captions come from professional stenographers or re-speaking “voice writers” who are certified by the National Court Reporters Association or similar professional groups for specific accuracy and words-per-minute standards. Training in-house staff in voice captioning is possible, but takes a much greater time investment than many media companies are prepared for. Operator-less Automatic Speech Recognition software packages also exist, but have many limitations.

    3) Configure the caption processing application with essential details about the target stream – these will let it know exactly where to send the captions. Specifications include:

    a: The captioning or transcription service provider covering the event

    b: The streaming service being used. Common examples include YouTube Live Events, VideoLinq, UVault, and independent systems running the Wowza Streaming Engine software package.

    c: A secure iCap Access Code for your event, or alternate method of reaching the caption processor. If you are working with an outside captioner, they will typically request this information well before the planned event or program so that they can test their connection to you.

    d: Finally, the URL at which your server accepts captioning input. Pasting it in allows the virtual caption encoder to send captions to that server.

    4) Initiate the project via a “Create” button or similar.
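The four details in step 3 can be pictured as a small configuration record that is checked before the project is created. The Python sketch below is only an illustration: the class, its field names, and the assumption that the caption input URL uses http or https are all hypothetical and do not describe any particular caption processing product.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CaptionJobConfig:
    """Hypothetical settings for one virtual caption encoder instance."""

    caption_provider: str    # a: who is captioning the event
    streaming_service: str   # b: where the video is being uplinked
    access_code: str         # c: code the captioner uses to connect
    caption_ingest_url: str  # d: URL where the server accepts captions

    def problems(self):
        """Return a list of readable issues; an empty list means ready to create."""
        issues = []
        for name in ("caption_provider", "streaming_service", "access_code"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        # Assumption for this sketch: the ingest endpoint is an http(s) URL.
        parsed = urlparse(self.caption_ingest_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append("caption_ingest_url is not an http(s) URL")
        return issues


# Placeholder values; none of these refer to a real account or endpoint.
example = CaptionJobConfig(
    caption_provider="Example Captioning Co.",
    streaming_service="Example Streaming Service",
    access_code="DEMO-CODE",
    caption_ingest_url="https://example.com/captions/ingest",
)
```

Checking `example.problems()` before pressing "Create" mirrors the advice in step 1 to confirm compatibility ahead of time rather than during the event.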

Arranging for cloud captioning can be as easy as filling out a form.

From there, the video producer’s job is typically finished, and the remaining steps fall to the caption service provider when they begin their work:

1. The captioner receives the iCap Access Code or other access information that was created in the virtual caption encoder application.

2. The captioner enters the iCap Access Code into their local live writing software, and provides captions through the same workflows typically used for live TV content or in-venue captioning/CART.

3. The virtual caption encoder routes these captions behind the scenes into your live web stream, which is directly linked to the iCap Access Code being used.
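The link between an access code and a stream in the steps above can be sketched as a simple lookup table. The Python class below is a toy stand-in written for illustration. It is not iCap's protocol or any vendor's API, and the names `CaptionRouter`, `register`, and `submit` are hypothetical.

```python
class CaptionRouter:
    """Toy stand-in for a virtual caption encoder's routing table."""

    def __init__(self):
        self._streams_by_code = {}  # access code -> stream identifier
        self.delivered = []         # (stream identifier, caption text) pairs

    def register(self, access_code, stream_id):
        """Producer side: tie an access code to one live stream."""
        self._streams_by_code[access_code] = stream_id

    def submit(self, access_code, text):
        """Captioner side: send text using only the access code."""
        stream_id = self._streams_by_code.get(access_code)
        if stream_id is None:
            raise KeyError("unknown access code")
        self.delivered.append((stream_id, text))
        return stream_id
```

The point of the design is that the captioner never needs to know the stream's address: the code alone determines where the captions go.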

Use Case Scenario

Imagine a situation where an iCap Access Code has been created for a virtual caption encoder instance and provided to a captioner servicing the event – they now have the ability to access the stream through this code and provide captions from their caption delivery network software.

Meanwhile, the content being captured with a video camera is sent live directly through a software streaming media encoder which is pointed at the chosen streaming video service. This is the live stream.

Simultaneously, the captioner provides captions to this stream from their caption delivery network software, using the connection details provided to them at the outset. The stream and the live captions meet at the video-streaming server, which merges them for the configured set of live delivery formats, including RTMP and HTTP Live Streaming, and passes them down to the end viewer. All captioning-enabled players should display a “CC” icon in the play bar and allow selectable viewing of the formatted transcription data.
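The article does not specify which caption format the server produces for each delivery format. As one illustration of what "formatted transcription data" can look like, the Python sketch below turns timed caption text into WebVTT, a text track format that HTTP Live Streaming players can read. The function names and the sample cue are hypothetical, and the sketch ignores styling and positioning.

```python
def vtt_timestamp(seconds):
    """Format a non-negative time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Build a WebVTT document from (start, end, text) tuples, times in seconds."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Sample cue for illustration only.
sample = to_webvtt([(1.0, 4.0, "Welcome to the live event.")])
```

A real server would segment and time-align this data with the video, and that part of the work is out of scope for this sketch.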


The above is just one of many scenarios where a virtual caption encoder can be deployed to make live streaming captioning possible. The end result is a captioned live web stream that feels seamless to the content producer, and just as importantly, to the viewer.
