How to Add Live Captions to Streaming Content

If adding captions to your streaming content – whether it’s on-demand or live – feels daunting, you’re not alone. Both present new challenges that arise in step with fast-evolving formats and consumer tastes.

Of the two possibilities, however, live captioning for streaming content is the more difficult format for producers to nail down. That’s because, for starters, no universal standard format exists to uplink captions separately from video. In the past, this has forced even many web-only productions to use a hardware closed caption appliance that generates VANC captions for input to an SDI-in streaming media encoder.

In addition, a whole host of factors can affect live workflows: the video codec, the streaming video encoder, the streaming server, and the multitude of consumer players may each introduce specific restrictions on the types of caption data accepted through the signal path.

But getting it right is worth it: Done correctly, closed captioning exposes content to a much larger audience, has a positive impact on video search, and is increasingly required for compliance with FCC standards. And it’s not just for broadcasters: In addition to the online presentation of news and sporting events, quality closed captioning has clear benefits for e-learning classrooms, corporate webinars, government facilities, and much more.

While hardware options exist for online streaming, the recent emergence of cloud-based real-time caption processing applications is making captioning for live online content more efficient and affordable than ever. A virtual caption encoder that runs entirely in the cloud can now be employed on demand, allowing live content producers to connect to transcriptionists through a caption delivery network, such as iCap, and route captions directly to compatible Web video servers. This method makes live closed captioning much more accessible for streaming-only media, since it can eliminate the need for a hardware caption encoder in the SDI signal path feeding a caption-compatible H.264 or RTMP streaming video encoder.

With the availability of cloud storage and transmission, captioning can be more easily generated and distributed.

    Here’s one signal path with which live streaming content producers could caption their projects, using a virtual caption encoder:

    1) Create an account with a cloud-based caption processing software solution and familiarize yourself with its use. Make sure ahead of time that the service you are using is compatible with the streaming server or service that you are uplinking your video to.

    2) Source a captioning or transcription service provider who can provide real-time transcription for your event. The highest quality captions come from professional stenographers or re-speaking “voice writers” who are certified by the National Court Reporters Association or similar professional groups for specific accuracy and words-per-minute standards. Training in-house staff in voice captioning is possible, but takes a much greater time investment than many media companies are prepared for. Operator-less Automatic Speech Recognition software packages also exist, but have many limitations.

    3) Configure the caption processing application with essential details about the target stream – these will let it know exactly where to send the captions. Specifications include:

    a: The captioning or transcription service provider covering the event

    b: The streaming service being used. Common examples include YouTube Live Events, VideoLinq, UVault, and independent systems running the Wowza Streaming Engine software package.

    c: A secure iCap Access Code for your event, or alternate method of reaching the caption processor. If you are working with an outside captioner, they will typically request this information well before the planned event or program so that they can test their connection to you.

    d: The URL at which your server accepts captioning input. Pasting it in allows the virtual caption encoder to send captions to that address.

    4) Initiate the project via a “Create” button or similar.
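As a rough illustration of step 3, the four details could be gathered and sanity-checked before the project is created. The sketch below is a minimal one under stated assumptions: the class name `CaptionJobConfig`, its field names, and the example values are all hypothetical and do not correspond to any particular vendor's form or API. It also assumes the caption input address is an http(s) URL, which may not hold for every streaming server.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass(frozen=True)
class CaptionJobConfig:
    """Hypothetical container for the four details listed in step 3."""
    caption_provider: str    # a: who is captioning the event
    streaming_service: str   # b: e.g. "YouTube Live Events"
    access_code: str         # c: iCap Access Code or equivalent
    caption_ingest_url: str  # d: where the server accepts caption input

    def problems(self) -> List[str]:
        """Return a list of problems; an empty list means nothing is missing."""
        issues = []
        for name in ("caption_provider", "streaming_service", "access_code"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        parsed = urlparse(self.caption_ingest_url)
        # Assumption: the ingest address is an http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append("caption_ingest_url is not an http(s) URL")
        return issues


# Example values are placeholders, not real credentials or endpoints.
config = CaptionJobConfig(
    caption_provider="Example Captioning Co.",
    streaming_service="YouTube Live Events",
    access_code="DEMO-CODE",
    caption_ingest_url="https://captions.example.com/ingest",
)
print(config.problems())
```

Checking the form this way before the event mirrors the advice above to confirm compatibility ahead of time: a missing access code or malformed URL is much cheaper to catch before the captioner tries to connect.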

Arranging for cloud captioning can be as easy as filling out a form.

From there, the video producer’s job is typically finished, and the remaining steps fall to the caption service provider when they begin their work:

1. The captioner receives the iCap Access Code or other access information that was created in the virtual caption encoder application.

2. The captioner enters the iCap Access Code into their local live writing software and provides captions through the same workflows typically used for live TV content or in-venue captioning/CART.

3. The virtual caption encoder routes these captions behind the scenes into your live Web stream, which is directly linked to the iCap Access Code being used.
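The link between an access code and a stream can be pictured as a simple lookup. The following is a toy, in-memory sketch only: `VirtualCaptionRouter` and its methods are hypothetical names, it does not represent iCap's actual protocol or any vendor's API, and a real encoder would deliver captions over the network rather than appending them to a list.

```python
from typing import Dict, List, Tuple


class VirtualCaptionRouter:
    """Toy model: maps an access code to a caption ingest URL."""

    def __init__(self) -> None:
        self._targets: Dict[str, str] = {}
        # Records (ingest URL, caption text) pairs in place of a network send.
        self.delivered: List[Tuple[str, str]] = []

    def register(self, access_code: str, ingest_url: str) -> None:
        """Producer side: link an access code to the stream's caption URL."""
        self._targets[access_code] = ingest_url

    def submit(self, access_code: str, caption_text: str) -> str:
        """Captioner side: send text using only the access code."""
        try:
            url = self._targets[access_code]
        except KeyError:
            raise PermissionError("unknown access code") from None
        self.delivered.append((url, caption_text))
        return url


router = VirtualCaptionRouter()
router.register("DEMO-CODE", "https://captions.example.com/ingest")
destination = router.submit("DEMO-CODE", "Good evening and welcome.")
```

The point of the sketch is the division of labor: the producer registers the destination once, and the captioner only ever needs the access code, never the server address.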

Use Case Scenario

Imagine a situation where an iCap Access Code has been created for a virtual caption encoder instance and provided to a captioner servicing the event – they now have the ability to access the stream through this code and provide captions from their caption delivery network software.

Meanwhile, the content being captured with a video camera is sent live directly through a software streaming media encoder which is pointed at the chosen streaming video service. This is the live stream.

Simultaneously, the captioner provides captions to this stream from their caption delivery network software, using the connection details given to them at the outset. The stream and the live captions meet at the video-streaming server, which merges them for the configured set of live delivery formats, including RTMP and HTTP Live Streaming, and then passes them down to the end viewer. All captioning-enabled players should display a “CC” icon in the play bar and allow selectable viewing of the formatted transcription data.


The above is just one of many scenarios where a virtual captioning encoder can be deployed to make live streaming captioning possible. The end result is a perfectly captioned live web-stream that feels seamless to the content producer, and just as importantly, to the viewer.
