OTT (Or Is It ABR?) - Part 2 - The Mechanics Of ABR Delivery

ABR delivery offers the prospect of automatic adaptation of bit rates to the prevailing network conditions. Since the client is aware of the set of available bit rates, it can determine, segment-by-segment, which is the optimal bit rate to use.

This article was first published as part of Essential Guide: OTT (or is it ABR?)


The aim is to receive the highest rates that can be delivered given the network availability (i.e. based on measured download rate) and avoiding too many rate step-changes. There are two scenarios that the clients should avoid: a) running at too low a bit rate (since that will result in more quantization artifacts and possibly reduced resolution), and b) avoiding too many changes in selected rate (both directions of change cause a visual disturbance to the viewer, but if there is a need for a “panic” change down – i.e. when a segment at a given rate is failing to be downloaded quickly enough – the visual disruption is severe).

One unfortunate aspect is that ABR clients create bursty, periodic traffic, and while that’s not an issue if there is only one client, with multiple ABR clients in a shared capacity, the measured download rate can be unstable or unfair. The degree of instability is a function of the responsiveness of the client to changes in measured rate: too fast and the rates will change frequently, too slow and there is the risk of congestion and a resulting panic rate change. Stability is also a function of the segment length.

Understanding The Challenge of Low Latency ABR

Achieving low latency for OTT, or even for ABR on-network, is not trivial. To understand why, it’s important to consider the way data is formatted and transferred, and how networks and clients behave. A typical configuration today will use 6 second segments, with 30 or 60 frames per second, depending on the profile. In the sections below, the diagrams show a reduced number of frames per segment, just for clarity.

Figure 2: ABR live workflow.

Figure 2: ABR live workflow.

Live content is initially processed by the live transcoder in real time. Some latency is incurred at that point, however in principle that latency is equivalent to the conventional TV’s buffering delay, so is not included here for the comparison. Encoded video is fed to the packager as a continuous data flow (with embedded boundary point markers denoting the start & end of each segment), where it is formatted into segment files and published onto the origin server. The segment duration is typically about 6 seconds and it is not until the complete segment is placed on the origin that the manifest can be updated to advertise its availability. Once it is available, it can be pulled by a client via the CDN (which caches in the process) using a single HTTP transfer. Whole file transfers must take place and unless the CDN bandwidth is very much higher than the sum of the data being moved across it, each file will take a time that is less than a full segment, but still finite, to be guaranteed to be moved. The CDN may not provide the data from the cache until the segment is fully received.

At the client, there are typically three segments in operation: one being decoded, one ready for decode next and one being received. The reason for the “next segment” to be complete and waiting is due to the need to adapt bit rate. Each time bit rate adapts significantly, the perception of quality is impaired (the quality discontinuity is visually unpleasant.  Since each client makes an autonomous decision about what bit rate to attempt to download, there is a need for it to:

  • Measure download rate with a reasonable accuracy.
  • Determine whether the segment will be received sufficiently ahead of time.
  • Abandon an HTTP transfer, start a new one and complete the download without underflowing the buffer.

Consider what happens when the available bandwidth suddenly reduces, for example because of competing traffic on the network connection.

Figure-3 shows a single 3-representation (or “profile”) example, with a, b and c designating the representation, and 1, 2, 3 and 4 showing the time sequence. Since the content is being encoded live, there is an earliest availability point, after which it can be downloaded. In the absence of any restriction, the highest representation will be downloaded.

Figure 3: Segments available over time to be downloaded by client.

Figure 3: Segments available over time to be downloaded by client.

Next consider the case of competing traffic causing the transfer rate to reduce. Having downloaded 2a, the client next tries to download 3a. This time, however, because of the competing traffic, the rate is much lower, meaning the transfer would not complete in time. In order for the client to be able to make that decision, enough data must have been downloaded to give a reasonable accuracy for the measurement. While there is no specific right answer to how long to measure, about 2 seconds’ worth of data appears to give a reasonable balance between time taken and accuracy. Since there is inevitably some uncertainty, clients ask for a lower bit rate representation than they measure as the capacity: about 70% is typical. Without this, the clients would spuriously adapt rates too frequently, causing frequent visual disturbance.

Figure 4: Rate adaptation by the client device.

Figure 4: Rate adaptation by the client device.

Once the client has made the decision to abandon a transfer and drop to a lower rate, some time will have elapsed (i.e. the time taken to make the measurement and consequently the decision). The alternative representation for the same segment in time therefore must be downloaded in substantially less time than normal, which is why clients often drop all the way to the lowest rate representation when adapting downwards. Unfortunately, this also maximizes the visual disturbance caused by the adaptation.  If shorter segments were used, the period available for measurement would be shorter, leading to a corresponding increase in measurement volatility (and therefore frequency of spurious adaptation). This can be mitigated to some extent by reducing further the 70% figure, but either way, using shorter segments to reduce latency will lead to a reduction in quality.

The client’s need for measurement is fundamental to the client being able to adapt autonomously. Other system delays, however, are not fundamental and there exist mechanisms to improve the latency.

Overall, the end-to-end latency looks similar to Figure 5, with complete segments moving between each stage.

Figure 5: Overall end-to-end latency with conventional ABR and HTTP transfer.

Figure 5: Overall end-to-end latency with conventional ABR and HTTP transfer.

Another aspect to consider is that a client also needs to guess when to next request a segment (or in the case of HLS, when to poll for the next update to the manifest). This results in further latency as the client needs to avoid asking too early, so must apply some degree of conservatism.

Part of a series supported by

Broadcast Bridge Survey

You might also like...

NAB Show 2024 BEIT Sessions Part 2: New Broadcast Technologies

The most tightly focused and fresh technical information for TV engineers at the NAB Show will be analyzed, discussed, and explained during the four days of BEIT sessions. It’s the best opportunity on Earth to learn from and question i…

Standards: Part 6 - About The ISO 14496 – MPEG-4 Standard

This article describes the various parts of the MPEG-4 standard and discusses how it is much more than a video codec. MPEG-4 describes a sophisticated interactive multimedia platform for deployment on digital TV and the Internet.

The Big Guide To OTT: Part 9 - Quality Of Experience (QoE)

Part 9 of The Big Guide To OTT features a pair of in-depth articles which discuss how a data driven understanding of the consumer experience is vital and how poor quality streaming loses viewers.

Chris Brown Discusses The Themes Of The 2024 NAB Show

The Broadcast Bridge sat down with Chris Brown, executive vice president and managing director, NAB Global Connections and Events to discuss this year’s gathering April 13-17 (show floor open April 14-17) and how the industry looks to the show e…

Essential Guide: Next-Gen 5G Contribution

This Essential Guide explores the technology of 5G and its ongoing roll out. It discusses the technical reasons why 5G has become the new standard in roaming contribution, and explores the potential disruptive impact 5G and MEC could have on…