OTT (Or Is It ABR?) - Part 2 - The Mechanics Of ABR Delivery

ABR delivery offers the prospect of automatic adaptation of bit rates to the prevailing network conditions. Since the client is aware of the set of available bit rates, it can determine, segment-by-segment, which is the optimal bit rate to use.

This article was first published as part of Essential Guide: OTT (or is it ABR?)


The aim is to receive the highest bit rate that can be delivered given the network availability (i.e. based on the measured download rate) while avoiding too many rate step-changes. There are two scenarios that clients should avoid: a) running at too low a bit rate (since that results in more quantization artifacts and possibly reduced resolution), and b) changing the selected rate too often. Changes in either direction cause a visual disturbance to the viewer, but when a “panic” change down is needed (i.e. when a segment at a given rate is failing to download quickly enough), the visual disruption is severe.

One unfortunate aspect is that ABR clients create bursty, periodic traffic. That is not an issue when there is only one client, but when multiple ABR clients share the same capacity, the measured download rate can be unstable or unfair. The degree of instability is a function of how responsive the client is to changes in measured rate: too fast and the selected rate will change frequently, too slow and there is a risk of congestion and a resulting panic rate change. Stability is also a function of the segment length.

Understanding The Challenge of Low Latency ABR

Achieving low latency for OTT, or even for ABR on-network, is not trivial. To understand why, it’s important to consider the way data is formatted and transferred, and how networks and clients behave. A typical configuration today uses 6-second segments at 30 or 60 frames per second, depending on the profile. In the sections below, the diagrams show a reduced number of frames per segment, purely for clarity.

Figure 2: ABR live workflow.

Live content is initially processed by the live transcoder in real time. Some latency is incurred at that point; however, in principle that latency is equivalent to the buffering delay of conventional TV, so it is not included in this comparison. Encoded video is fed to the packager as a continuous data flow (with embedded boundary point markers denoting the start and end of each segment), where it is formatted into segment files and published onto the origin server. The segment duration is typically about 6 seconds, and it is not until the complete segment is placed on the origin that the manifest can be updated to advertise its availability. Once it is available, it can be pulled by a client via the CDN (which caches it in the process) using a single HTTP transfer. Whole-file transfers must take place, and unless the CDN bandwidth is very much higher than the total data being moved across it, each file needs a time that is shorter than a full segment duration, but still not negligible, before its transfer can be guaranteed to complete. In addition, the CDN may not serve the data from its cache until the segment has been fully received.

At the client, there are typically three segments in operation: one being decoded, one ready to be decoded next and one being received. The “next segment” must be complete and waiting because of the need to adapt bit rate. Each time the bit rate adapts significantly, the perceived quality is impaired (the quality discontinuity is visually unpleasant). Since each client makes an autonomous decision about which bit rate to attempt to download, it needs to:

  • Measure download rate with a reasonable accuracy.
  • Determine whether the segment will be received sufficiently ahead of time.
  • Abandon an HTTP transfer, start a new one and complete the download without underflowing the buffer.
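
The first two requirements can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any particular player works: the function names, and the idea of expressing the deadline as seconds since the request started, are hypothetical.

```python
def measured_rate_bps(bytes_received: int, elapsed_s: float) -> float:
    """Average download rate so far, in bits per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return bytes_received * 8 / elapsed_s


def will_finish_in_time(segment_bytes: int, bytes_received: int,
                        elapsed_s: float, deadline_s: float) -> bool:
    """Project the completion time of the in-flight transfer at the measured rate.

    deadline_s is the time (since the request started) by which the whole
    segment must have arrived to avoid underflowing the buffer.
    """
    rate = measured_rate_bps(bytes_received, elapsed_s)
    if rate == 0:
        return False  # nothing has arrived yet, so the transfer cannot be on time
    remaining_bits = (segment_bytes - bytes_received) * 8
    return elapsed_s + remaining_bits / rate <= deadline_s


# A 6-second segment encoded at 4 Mbit/s is 3,000,000 bytes.
on_track = will_finish_in_time(3_000_000, 1_000_000, elapsed_s=2.0, deadline_s=6.0)
too_slow = will_finish_in_time(3_000_000, 250_000, elapsed_s=2.0, deadline_s=6.0)
```

In the second call only 250,000 bytes have arrived after 2 seconds (1 Mbit/s), so the projection misses the deadline and the client would have to abandon the transfer and request a lower representation.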

Consider what happens when the available bandwidth suddenly reduces, for example because of competing traffic on the network connection.

Figure 3 shows a single 3-representation (or “profile”) example, with a, b and c designating the representation, and 1, 2, 3 and 4 showing the time sequence. Since the content is being encoded live, there is an earliest availability point, after which it can be downloaded. In the absence of any restriction, the highest representation will be downloaded.

Figure 3: Segments available over time to be downloaded by client.

Next consider the case of competing traffic causing the transfer rate to reduce. Having downloaded 2a, the client next tries to download 3a. This time, however, because of the competing traffic, the rate is much lower, meaning the transfer would not complete in time. In order for the client to be able to make that decision, enough data must have been downloaded to give a reasonable accuracy for the measurement. While there is no specific right answer to how long to measure, about 2 seconds’ worth of data appears to give a reasonable balance between time taken and accuracy. Since there is inevitably some uncertainty, clients ask for a lower bit rate representation than they measure as the capacity: about 70% is typical. Without this, the clients would spuriously adapt rates too frequently, causing frequent visual disturbance.
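
The 70% margin described above can be illustrated with a small selection function. This is a minimal sketch: the function name and the example bit rate ladder are assumptions made for illustration, and real clients combine this with buffer level and other heuristics.

```python
def select_bitrate(available_bps: list[int], measured_bps: float,
                   safety_factor: float = 0.7) -> int:
    """Pick the highest representation that fits within a fraction of the measured rate.

    safety_factor=0.7 mirrors the "about 70%" figure quoted in the text.
    If nothing fits, fall back to the lowest representation.
    """
    budget = measured_bps * safety_factor
    candidates = [b for b in available_bps if b <= budget]
    return max(candidates) if candidates else min(available_bps)


ladder = [1_000_000, 2_500_000, 5_000_000]  # hypothetical 3-representation ladder

choice_at_8m = select_bitrate(ladder, 8_000_000)  # budget 5.6 Mbit/s
choice_at_6m = select_bitrate(ladder, 6_000_000)  # budget 4.2 Mbit/s
choice_at_1m = select_bitrate(ladder, 1_000_000)  # budget 0.7 Mbit/s, nothing fits
```

With a measured 6 Mbit/s, the 5 Mbit/s representation would nominally fit, but the margin steers the client to 2.5 Mbit/s so that ordinary measurement noise does not trigger an adaptation.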

Figure 4: Rate adaptation by the client device.

Once the client has made the decision to abandon a transfer and drop to a lower rate, some time will have elapsed (i.e. the time taken to make the measurement and, consequently, the decision). The alternative representation of the same segment must therefore be downloaded in substantially less time than normal, which is why clients often drop all the way to the lowest-rate representation when adapting downwards. Unfortunately, this also maximizes the visual disturbance caused by the adaptation. If shorter segments were used, the period available for measurement would be shorter, leading to a corresponding increase in measurement volatility (and therefore in the frequency of spurious adaptation). This can be mitigated to some extent by lowering the 70% figure further, but either way, using shorter segments to reduce latency will lead to a reduction in quality.
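
To see why the drop tends to be so large, consider a sketch of the choice a client faces after abandoning a transfer. The names here are hypothetical, and the model assumes the measured rate stays constant for the rest of the download, which is an idealization.

```python
def fallback_bitrate(available_bps: list[int], measured_bps: float,
                     segment_s: float, time_left_s: float,
                     safety_factor: float = 0.7) -> int:
    """Highest representation whose whole segment can still arrive in time_left_s.

    A segment of segment_s seconds at bit rate b holds b * segment_s bits,
    so at the usable rate it takes b * segment_s / usable seconds to fetch.
    """
    usable = measured_bps * safety_factor
    if usable <= 0:
        return min(available_bps)
    fits = [b for b in available_bps if b * segment_s / usable <= time_left_s]
    return max(fits) if fits else min(available_bps)


ladder = [1_000_000, 2_500_000, 5_000_000]  # hypothetical ladder

# Full 6 s available: 2.5 Mbit/s fits within 70% of a measured 4 Mbit/s.
normal_choice = fallback_bitrate(ladder, 4_000_000, segment_s=6.0, time_left_s=6.0)

# 2 s already spent measuring, so only 4 s remain for the same 6 s segment.
panic_choice = fallback_bitrate(ladder, 4_000_000, segment_s=6.0, time_left_s=4.0)
```

With the same measured rate, losing 2 seconds to measurement shrinks the effective budget from 2.8 Mbit/s to about 1.9 Mbit/s, which pushes the choice from 2.5 Mbit/s down to the lowest representation.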

The client’s need for measurement is fundamental to the client being able to adapt autonomously. Other system delays, however, are not fundamental and there exist mechanisms to improve the latency.

Overall, the end-to-end latency looks similar to Figure 5, with complete segments moving between each stage.

Figure 5: Overall end-to-end latency with conventional ABR and HTTP transfer.
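
As a rough illustration of how whole-segment handling accumulates, the stages can be summed in units of segment duration. The stage weights below are illustrative assumptions rather than measured values: one full segment held at the packager, half a segment for the CDN transfer (the text says only that it is less than a full segment), and two segments held at the client ahead of the decoder.

```python
def end_to_end_latency_s(segment_s: float,
                         packager_segments: float = 1.0,
                         cdn_segments: float = 0.5,
                         client_segments: float = 2.0) -> float:
    """Approximate latency when every stage waits for complete segments.

    All three weights are assumptions for illustration; transcoder delay is
    excluded, as in the text.
    """
    return segment_s * (packager_segments + cdn_segments + client_segments)


latency_6s = end_to_end_latency_s(6.0)  # 21.0 seconds
latency_2s = end_to_end_latency_s(2.0)  # 7.0 seconds
```

Under these assumptions, latency scales directly with segment duration, which is why shortening segments is the obvious lever, and why its cost in measurement accuracy matters.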

Another aspect to consider is that a client also needs to guess when to next request a segment (or in the case of HLS, when to poll for the next update to the manifest). This results in further latency as the client needs to avoid asking too early, so must apply some degree of conservatism.
