The World Of OTT: Part 6 - Content Origination

Content Origination is in the midst of significant transformation, like all parts of the OTT video ecosystem. As OTT grows and new efficiencies are pursued, Origination must play its part as a fundamental element of the delivery chain. But Origination is not just about smooth and efficient content delivery. It’s also about providing key features to the OTT service.

Content Origination is the point at which Live, Linear and VOD content are prepared for final delivery and streamed into the delivery networks. It is where the push systems of broadcast playout and VOD asset publication meet the pull system of streaming to multiple device types according to the required bit-rate and format.

Today, this push-pull line can be blurred. Linear OTT often still treats the Content Origination platform as part of the push system, passing all formats and bit-rates through to the CDNs whether or not they were requested. In many CDN environments this is required to “warm up the caches”, so that content is already in place when it is requested, achieving the lowest possible latency with minimal requests back to the Origination platform.

For VOD, the push system ends as content is moved into Central Storage, ready to be streamed on-demand. That said, some CDN environments will replicate an entire VOD library in their own storage, pushing content deeper into the delivery network.

Content Origination Functions

Content Origination combines a set of functions that prepare content for OTT delivery. The diagram below shows the primary functions.

Figure 1: Primary functions of OTT Content Origination.

The first step is encoding and transcoding, which ensures the live streams and VOD files are at the correct bit-rates for OTT delivery. Generally, an ABR group is prepared according to a set of pre-defined profiles, designed to handle the variations in network conditions and the range of devices that can request content. The choice of codec is an important consideration for the whole Origination Platform, given that newer codecs offer improved efficiency but at the cost of higher levels of processing (e.g. HEVC is more intensive than H.264).
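
To make the idea of an ABR group concrete, the following is a minimal Python sketch of a profile ladder and of how a rendition might be chosen for a given network condition and device. The ladder values, the `Rendition` type, the `pick_rendition` function and the 0.8 safety margin are all illustrative assumptions, not profiles taken from this article or from any particular service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # target video bit-rate


# Hypothetical ladder; real profiles are tuned per service and per codec.
ABR_LADDER = [
    Rendition("240p", 240, 400),
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 3000),
    Rendition("1080p", 1080, 6000),
]


def pick_rendition(throughput_kbps: float, max_height: int,
                   safety: float = 0.8) -> Rendition:
    """Return the highest bit-rate rendition that fits both the measured
    throughput (scaled by a safety margin) and the device's maximum
    resolution. Falls back to the lowest rendition if nothing fits."""
    budget = throughput_kbps * safety
    candidates = [r for r in ABR_LADDER
                  if r.bitrate_kbps <= budget and r.height <= max_height]
    if not candidates:
        return min(ABR_LADDER, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


# A 5 Mbps connection leaves a 4 Mbps budget, so the 3 Mbps profile is chosen.
print(pick_rendition(5000, max_height=1080).name)  # 720p
```

In practice this selection usually happens in the player; the Origination platform's job is to make every rendition in the ladder available.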

Once content is encoded/transcoded there can be multiple workflows depending on live or VOD, the availability of “timeshifted viewing”, and whether or not low latency is required.

The core functions are therefore used in different ways according to the workflow. In general:

  • Live Recording and/or File Ingest – to provide Live TV catch-up services, now a basic feature of OTT services, the live stream is recorded to create a back-up for content held in the CDN. This ranges from short-term Live Pause to long-term CloudDVR. For VOD, files need to be ingested into storage. These functions integrate closely with the OTT Central Storage and synchronize with the Content Management Systems in order to confirm that content is available for viewer consumption. The “and/or” distinction reflects that some platforms are unified for both Live and VOD services while others are separate, a choice generally driven by the differing operational requirements of Live and VOD services, including scalability.
  • Storage Management – not only is content recorded and ingested into Storage, but the Storage must be managed. As metadata is updated, as content “ages” and as content is deleted, something must manage where that content is and synchronize with the Content Management System. This content life-cycle management is often a software function in a module within the Content Origination platform.
  • (JIT) Packaging – this function is required in order to deliver content to different device operating systems, like Android (DASH), Apple (HLS) and Microsoft (MSS). Some OTT systems package every piece of content regardless of demand – below this is referred to as Legacy Linear OTT. Just-in-Time (JIT) Packaging is best practice in large VOD environments where it is inefficient to package every piece of content before storing it.
  • (JIT) Encryption – once packaged, content is encrypted for secure delivery over the internet to authorized viewers. Each package type has a respective encryption method (e.g. HLS uses Fairplay), although some package types use multiple encryption methods (e.g. DASH can use Widevine and Playready). Encryption follows packaging, so if content is packaged and then stored, it is generally also encrypted. JIT-Encryption accompanies JIT-Packaging for a more efficient storage model for large VOD libraries.
  • Low Latency Processing – this can be isolated as a specific function of the Origination platform. The compute resources can be set up to act on specific streams or specific pieces of content to deliver them in Low Latency formats, which involves reducing GOP sizes and managing many more connection requests as smaller segments are delivered. The decision to do this is made further upstream in the Content Management Systems, and then executed at Origination.
  • File and/or Live Streaming – once one or more of these content processing steps are complete, the content is streamed, pulled by requests from the CDN(s). Typically, the “origin server” has been an internet-facing web server specifically designed for delivering streams, which passes through streams from the Packager. Today these are software functions rather than discrete servers, although depending on workflow scalability and operational models it can make sense to deploy the software in dedicated hardware environments. 
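
The JIT packaging and encryption functions described above can be sketched in a few lines of Python. This is only an illustration of the pull principle: a package is produced the first time it is requested and reused afterwards. The function names, the string "package" returned, and the use of an in-memory cache are assumptions made for the example. The platform-to-format and format-to-DRM pairings follow the bullets above, except for PlayReady with MSS, which is a common pairing the article does not state.

```python
from functools import lru_cache

# Pairings from the list above (device platform -> package format).
PACKAGE_FOR_PLATFORM = {"android": "DASH", "apple": "HLS", "microsoft": "MSS"}

# HLS/FairPlay and DASH/Widevine+PlayReady come from the list above;
# MSS/PlayReady is an assumption added for completeness.
DRM_FOR_PACKAGE = {
    "HLS": ("FairPlay",),
    "DASH": ("Widevine", "PlayReady"),
    "MSS": ("PlayReady",),
}

# Records every time the (simulated) packager actually runs.
packaging_runs = []


@lru_cache(maxsize=1024)
def jit_package(asset_id: str, package: str) -> str:
    """Simulate packaging and encrypting a mezzanine asset on demand.
    The lru_cache means repeat requests reuse the earlier result."""
    packaging_runs.append((asset_id, package))
    drm = "+".join(DRM_FOR_PACKAGE[package])
    return f"{asset_id}.{package.lower()}[{drm}]"


def handle_request(asset_id: str, platform: str) -> str:
    """Map the requesting platform to a package format, then package JIT."""
    package = PACKAGE_FOR_PLATFORM[platform.lower()]
    return jit_package(asset_id, package)
```

Calling `handle_request("movie1", "apple")` twice runs the packager once; a later request from an Android device triggers a second, DASH-specific run. Formats that are never requested are never produced, which is the storage saving the JIT approach aims for.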

Evolution Of Content Origination

Today there are two primary models for Content Origination, with two new models on the horizon.

The 1st generation Broadcast OTT model (Figure 2) emerged when Linear TV channels were first streamed OTT. It is still widely used today because it is a simple way for linear broadcasters to provide OTT content. This model takes an output from a broadcast channel, which is then encoded, packaged and encrypted for OTT delivery. It is also recorded for catch-up TV. In this model, VOD utilizes a similar workflow to package file-based content before storage, to simplify operations and onward delivery to Streaming.

As shown in the grey Outputs boxes, this model produces a set of outputs which are consistent across the delivery chain. There are two drawbacks of this model: 1) it requires an unnecessary amount of central storage (shown with a value of “X”) and 2) it creates a pre-formatted VOD library that is inflexible to changes in format type that routinely occur as new consumer devices come to market.

Figure 2: 1st generation Broadcast OTT – Linear OTT with Catch-up VOD.

The “JIT” model (Figure 3) is increasingly used today. Pioneered by large VOD businesses like Cable TV operators, JIT addresses the two main weaknesses of the 1st generation model. First, it provides flexibility in the fast-changing consumer device world by storing content libraries in a mezzanine format and then packaging and encrypting on-demand. This means that if a new format is required, the OTT operator does not need to transcode all VOD assets to the new format, which could require weeks of processing. Secondly, it significantly reduces the size of the central storage. For example, storing content in HLS, DASH and MSS triples the number of files stored compared with a JIT model. As broadcaster OTT libraries grow into the multi-PB range this makes a big difference, not just for storage costs but also for streaming performance (how to avoid the performance issue is discussed in The World Of OTT: Part 3 – OTT’s Unique Storage Requirements).
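
The storage comparison can be worked through with simple arithmetic. The Python sketch below assumes, as a simplification, that each pre-packaged copy is roughly the same size as the mezzanine copy; the 2 PB library size and the `stored_petabytes` helper are hypothetical values chosen for illustration.

```python
def stored_petabytes(mezzanine_pb: float, package_formats: int,
                     jit: bool) -> float:
    """Approximate central storage needed for a VOD library.

    Simplifying assumption: each pre-packaged copy is about the same size
    as the mezzanine copy, so container overheads are ignored.
    """
    if package_formats < 1:
        raise ValueError("at least one package format is required")
    return mezzanine_pb if jit else mezzanine_pb * package_formats


library_pb = 2.0  # hypothetical library size
formats = 3       # HLS, DASH and MSS

pre_packaged = stored_petabytes(library_pb, formats, jit=False)
just_in_time = stored_petabytes(library_pb, formats, jit=True)
print(pre_packaged, just_in_time, pre_packaged - just_in_time)  # 6.0 2.0 4.0
```

Under these assumptions the pre-packaged library needs 6 PB against 2 PB for JIT, and the gap widens with every additional package format.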

Figure 3: 2nd generation Broadcast OTT – JIT Model for Live and large-scale VOD.

The 3rd generation model can be called “Common Format & Encryption” (Figure 4). This is enabled by CMAF and CENC, which are the basis of today’s DASH and HLS low latency formats and represent a move towards a truly common format. From a storage and streaming efficiency perspective, this model no longer requires JIT packaging or encryption of the media segments. Instead, the single format and encryption type means that the stream or file is prepared, then stored and streamed. This simplifies the processing needed for packaging and encryption. It also greatly improves cache efficiency through the use of common media segments, retains the storage efficiency of the JIT model, and reduces the complexity of egress for the Streaming component of the platform, which previously egressed multiple package types.
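
The cache-efficiency gain from common media segments can be illustrated with a small Python sketch that counts how many distinct objects a cache must hold for the same set of viewer requests. The request list and the `distinct_cache_objects` function are assumptions for the example, and only media segments are modelled; manifests still differ between HLS and DASH.

```python
def distinct_cache_objects(requests, common_format: bool) -> int:
    """Count the distinct media segments a cache must hold.

    Each request is a (segment_id, package_format) pair. With per-format
    packaging the cache key includes the format; with a common format the
    same segment serves every package type.
    """
    keys = set()
    for segment_id, package in requests:
        keys.add(segment_id if common_format else (segment_id, package))
    return len(keys)


# Hypothetical requests: two segments, each asked for by HLS and DASH players.
requests = [
    ("seg1", "HLS"), ("seg1", "DASH"),
    ("seg2", "HLS"), ("seg2", "DASH"),
    ("seg1", "HLS"),
]

print(distinct_cache_objects(requests, common_format=False))  # 4
print(distinct_cache_objects(requests, common_format=True))   # 2
```

Halving the number of cached objects for the same audience means more requests are served from cache and fewer travel back to the Origination platform.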

This model is deployable today but is likely to be limited to a low percentage of use cases until CMAF is adopted end-to-end (including players, devices, etc.). Leaders in this space expect the 3rd generation model to become the standard over the next 3-4 years, while working alongside the 1st and 2nd generation models for many years. In the meantime, the benefits of VOD library flexibility and storage efficiency can still be achieved through the JIT model.

Figure 4: 3rd generation Broadcast OTT – Common Format & Encryption.

The 4th Generation model is emerging now, more in discussion than in deployment, aiming to leverage artificial intelligence, machine learning and the most modern codecs. We can call this the “Consumption-Aware” model (Figure 5). In this model a new level of consumer personalization is achieved as the system applies the video codec and the bit-rate profile according to the specific piece of content and the type of device requesting it.

This model creates the most efficient Origination platform based on intelligence being applied before video is processed, and it enables even more efficient storage by leveraging the benefits of newer codecs where possible and creating more optimal ABR profiles. It is another step towards the perfect pull-system that seeks to optimize content delivery at scale.

This model is not about individual consumer customization. Decisions must be made about groups of customers and groups of content. This model will use metadata about customers’ devices and each piece of content, plus artificial intelligence to analyze consumer behavior and the level of demand for particular content.

An example of this in action on a live stream could be the decision to stream a graphically rich program in a higher-quality codec like HEVC or AV1. This would optimize the infrastructure being used, perhaps on a pay-per-use basis, and allow premium content to have premium viewing experiences. On a VOD library, AI-driven processing could be used to streamline or enrich codecs and bit-rates available in the library based on consumer demand. Codec licensing could become part of the variable factors used to optimize customer satisfaction, OTT throughput and total cost.
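
As a rough illustration of this kind of decision, the Python sketch below picks a codec for a group of content and a group of devices. It is a hand-written rule table standing in for the AI-driven analysis the model envisages; the content classes, the preference order between AV1 and HEVC, and the `choose_codec` function are all assumptions made for the example.

```python
# Hypothetical preference lists per content group, most preferred first.
CODEC_PREFERENCE = {
    "graphically_rich": ["AV1", "HEVC", "H.264"],
    "standard": ["H.264"],
}

BASELINE_CODEC = "H.264"  # assumed to be decodable on every device group


def choose_codec(content_class: str, device_codecs: set) -> str:
    """Return the most preferred codec for this content group that the
    device group can decode, falling back to the baseline codec."""
    for codec in CODEC_PREFERENCE.get(content_class, [BASELINE_CODEC]):
        if codec in device_codecs:
            return codec
    return BASELINE_CODEC


# A graphically rich program requested by devices that decode HEVC but not AV1.
print(choose_codec("graphically_rich", {"H.264", "HEVC"}))  # HEVC
```

A production system would replace the static table with decisions derived from device metadata, content analysis, demand and codec licensing cost, applied to groups rather than to individual viewers.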

Figure 5: 4th generation Broadcast OTT – Consumption Aware.

Content Origination is the execution point for delivering required content formats to consumers. While it is not a part of the OTT infrastructure that needs to dramatically expand in capacity as audiences grow (unlike storage and edge caching – see other articles in The World of OTT series), it is the first part of the OTT delivery system where the dual objectives of customer personalization and delivery efficiency are simultaneously addressed. This is why the Content Origination platform directly enables OTT operators to achieve the vision for OTT – the delivery of highly personalized viewing experiences, at scale.