CDN Optimization for VR Streaming

Virtual Reality (VR) 360° content is still very new, but that does not make viewer expectations any more relaxed. If anything, quality matters even more with VR content because viewers expect immersion, which even a minor glitch in delivery can break.

Virtual Reality content is unique in many ways, the first of which is that VR video even has physical implications in a way that other video does not. A low-quality VR video may cause motion sickness, for example, creating a lasting, discouraging perception of VR experiences.

Additionally, 360° VR content is by its very nature extremely voluminous, which creates new challenges for VR streaming providers. Not only is there more video to stream to create a picture surrounding the viewer, but the quality must be very high. For example, YouTube recommends uploading 360° videos with a bitrate of 150 Mbps, which would make a five-minute video approximately 5.6 GB. For reference, the recommended bitrate for a standard video in 4K resolution is only 35 to 45 Mbps.
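The size figure above follows directly from bitrate multiplied by duration. The short sketch below works through that arithmetic; the function name is illustrative, and decimal units (1 GB = 1,000 MB) are assumed.

```python
def video_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bitrate stream in decimal gigabytes."""
    megabits = bitrate_mbps * duration_s   # total megabits transferred
    megabytes = megabits / 8               # 8 bits per byte
    return megabytes / 1000                # assumes 1 GB = 1000 MB


five_minutes = 5 * 60
size_360 = video_size_gb(150, five_minutes)  # 360-degree upload bitrate
size_4k = video_size_gb(45, five_minutes)    # upper end of the 4K range

print(f"360 video: {size_360:.1f} GB, 4K video: {size_4k:.1f} GB")
```

At these bitrates the 360° file is a little over three times the size of the 4K file of the same length.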

Streaming such high-quality 360° video uninterrupted and with minimal buffering requires a great deal of network bandwidth. While that level of capacity may be available on managed networks such as cable TV, delivering 360° video with acceptable quality of experience is a challenge on unmanaged networks like the internet. The open nature of the public internet means that consumers compete for bandwidth to receive content and the path it takes to the viewer may not be consistent, even packet by packet.

Viewport Adaptive Delivery

One of the most common and efficient methods to decrease the bandwidth required by 360° content is to deliver only the content in the user’s current Field of Vision (FOV) in high quality while delivering the rest of the video in low quality. By prioritizing the content that the viewer is actually seeing at a given moment, providers can deliver high-quality experiences while preparing for a viewer’s every move without wasting bandwidth.

This is achieved by dividing each video frame into smaller pieces, called tiles. Tiles can be delivered individually based on the user’s current FOV and reassembled by the client before the frame is passed to the video decoder. Several organizations have used this technique of adapting the video stream to the current viewport as the basis for different methods.
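As a rough illustration of how a client might decide which tiles to request in high quality, the sketch below maps a viewing direction onto a grid of tiles. It assumes an equirectangular frame split into an 8 by 4 grid and treats the viewport as a simple rectangle in yaw and pitch, which ignores distortion near the poles. The grid size, FOV values, and function names are all assumptions made for the example, not part of any particular player.

```python
import math


def tiles_in_view(yaw_deg, pitch_deg, h_fov=100.0, v_fov=80.0, cols=8, rows=4):
    """Return the set of (row, col) tiles overlapped by the viewport.

    yaw_deg is the horizontal viewing angle (0-360), pitch_deg the vertical
    angle (-90 looking down, +90 looking up). Row 0 is the top of the frame.
    """
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows

    # Horizontal extent; columns wrap around at 360 degrees.
    first_col = math.floor((yaw_deg - h_fov / 2) / tile_w)
    last_col = math.floor((yaw_deg + h_fov / 2) / tile_w)

    # Vertical extent, clamped to the poles (no wrap-around).
    top = min(90.0, pitch_deg + v_fov / 2)
    bottom = max(-90.0, pitch_deg - v_fov / 2)
    first_row = math.floor((90.0 - top) / tile_h)
    last_row = min(rows - 1, math.floor((90.0 - bottom) / tile_h))

    return {
        (row, col % cols)
        for col in range(first_col, last_col + 1)
        for row in range(first_row, last_row + 1)
    }


def quality_map(visible, cols=8, rows=4):
    """Assign 'high' to visible tiles and 'low' to everything else."""
    return {
        (row, col): ("high" if (row, col) in visible else "low")
        for row in range(rows)
        for col in range(cols)
    }


visible = tiles_in_view(yaw_deg=0.0, pitch_deg=0.0)
qualities = quality_map(visible)
print(f"{len(visible)} of {len(qualities)} tiles requested in high quality")
```

With these example settings, a viewer looking straight ahead needs only a quarter of the tiles in high quality, which is where the bandwidth saving comes from.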

One such method is called Tile-Based Adaptive VR streaming. In this case, tiles are served using a custom packaging format for the video asset that provides random access to the tiles of a frame. High-quality tiles are fetched from the origin by content delivery network (CDN) edge servers using HTTP byte-range requests based on the client FOV, while the URL in the manifest for the video remains unchanged. This technique makes it easier to adapt to network conditions and to new angles as the viewer turns his or her head.
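To make the byte-range idea concrete, the sketch below builds an HTTP Range header value for a set of wanted tiles. The tile index, a mapping from tile number to (offset, length) inside a packaged segment, is a hypothetical stand-in for whatever the custom packaging format provides; only the `bytes=start-end` header syntax comes from the HTTP specification. Adjacent tiles are merged into one range to keep the request small.

```python
def range_header(tile_index, wanted_tiles):
    """Build an HTTP Range header value covering the wanted tiles.

    tile_index maps tile number -> (byte offset, byte length) within one
    packaged segment. This layout is an assumption for illustration.
    """
    if not wanted_tiles:
        raise ValueError("at least one tile is required")

    spans = sorted(tile_index[tile] for tile in wanted_tiles)
    merged = []
    for offset, length in spans:
        end = offset + length - 1          # Range end positions are inclusive
        if merged and offset <= merged[-1][1] + 1:
            # Contiguous or overlapping with the previous span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])

    return "bytes=" + ",".join(f"{start}-{end}" for start, end in merged)


# Hypothetical index for a segment holding four tiles back to back.
index = {0: (0, 1000), 1: (1000, 500), 2: (1500, 800), 3: (2300, 700)}

header = range_header(index, wanted_tiles=[0, 1, 3])
print(header)  # bytes=0-1499,2300-2999
```

Because the segment URL never changes, only this header differs between viewers looking in different directions.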

Figure 1. Splitting a 360° VR video into tiles at the origin server makes it easier to deliver to different kinds of devices at different quality levels across the CDN network.

Proximity-Aware Content Prepositioning

Another proven bandwidth optimization technique involves prefetching tiles that will be needed in the future, based on client proximity and the user’s head movement. Such methods proactively load the predicted content into a CDN cache server. This reduces the time it takes to replace a low-quality tile with a high-quality one in the user’s FOV by more than 50 percent, compared to not prefetching on the CDN.

Moreover, since tiles can be treated like any other simple binary object that needs to be delivered from the origin server to the client via the edge over HTTP or HTTPS, tile delivery can be optimized in several different ways. Hence, other tile-based approaches to VR streaming, such as the one proposed by Fraunhofer HHI, which delivers customized bitstreams for each user on the fly, can also benefit from such prefetching.
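One simple way to predict which tiles to prefetch is to extrapolate recent head movement. The sketch below is a minimal illustration under that assumption: it estimates yaw velocity from the last two samples, projects it forward, and lists the tile columns the edge does not yet hold. Linear extrapolation, the 8-column grid, and all names here are assumptions for the example; a production system would likely use a more robust predictor.

```python
import math


def predict_yaw(samples, lookahead_s):
    """Linearly extrapolate yaw from the last two (timestamp_s, yaw_deg) samples."""
    (t0, yaw0), (t1, yaw1) = samples[-2], samples[-1]
    # Shortest signed angular difference, so 350 -> 10 counts as +20 degrees.
    delta = (yaw1 - yaw0 + 180.0) % 360.0 - 180.0
    velocity = delta / (t1 - t0)           # degrees per second
    return (yaw1 + velocity * lookahead_s) % 360.0


def columns_to_prefetch(samples, lookahead_s, cached, h_fov=100.0, cols=8):
    """Tile columns needed at the predicted yaw that are not already cached."""
    yaw = predict_yaw(samples, lookahead_s)
    tile_w = 360.0 / cols
    first = math.floor((yaw - h_fov / 2) / tile_w)
    last = math.floor((yaw + h_fov / 2) / tile_w)
    needed = {col % cols for col in range(first, last + 1)}
    return sorted(needed - set(cached))


# The viewer is turning right at 4 degrees per second, crossing the 0/360 seam.
head_samples = [(0.0, 350.0), (0.5, 352.0)]
to_fetch = columns_to_prefetch(head_samples, lookahead_s=2.5, cached={7, 0})
print(to_fetch)  # [1, 6]
```

The returned columns are the ones an edge server would request from the origin ahead of time, so the high-quality tiles are already cached when the viewer's gaze arrives.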

Figure 2. An edge server can pre-fetch and cache data so it is available when needed by the viewer. This is key to ensuring a smooth playback.

Next-Generation Protocols

A common approach to media delivery optimization is switching to a less chatty protocol to reduce overhead. Some newer protocols that are ripe for VR delivery, such as QUIC and HTTP/2, are already supported out of the box and ready to be experimented with. A presentation at a recent ACM Multimedia conference examined how HTTP/2 server push increases throughput, especially on mobile, high-RTT networks.
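A back-of-the-envelope model helps show why fewer round trips matter most on high-RTT links. The sketch below is deliberately idealized: it assumes each request that must wait for the previous response costs one full round trip, and it ignores connection setup, congestion control, and parallel connections. The tile size, bandwidth, and RTT values are made-up example numbers, not measurements.

```python
def fetch_time_s(n_tiles, tile_bytes, rtt_s, bandwidth_bps, round_trips):
    """Idealized time to fetch tiles: waiting on round trips plus transfer time."""
    transfer_s = n_tiles * tile_bytes * 8 / bandwidth_bps
    return round_trips * rtt_s + transfer_s


N_TILES = 8
TILE_BYTES = 250_000        # example tile size
BANDWIDTH_BPS = 20e6        # example 20 Mbps link

for rtt_ms in (20, 100, 300):
    rtt_s = rtt_ms / 1000
    # One request per tile, each waiting for the previous response.
    chatty = fetch_time_s(N_TILES, TILE_BYTES, rtt_s, BANDWIDTH_BPS, round_trips=N_TILES)
    # All tiles delivered after a single request (multiplexed or pushed).
    single = fetch_time_s(N_TILES, TILE_BYTES, rtt_s, BANDWIDTH_BPS, round_trips=1)
    print(f"RTT {rtt_ms:>3} ms: {chatty:.2f} s vs {single:.2f} s")
```

In this model the transfer time is identical in both cases, so the gap between the two columns is purely round-trip waiting, and it widens as RTT grows.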

Figure 3. HTTP/2 will enable applications to be faster, simpler, and more robust by resolving some of the limitations of HTTP/1.1. The standard will also open many new opportunities to optimize applications and add features, all while improving performance.

Ad Insertion

360° video presents a new avenue for ad-based monetization. For example, dynamic ads can be inserted at different spatial locations in the 360° video space as the video is being played out. If tile-based encoding is used, some of the tiles can be overlaid with ads at run-time.

These tile-based ads can be stored in the cloud and delivered to the client with minimal latency. For a much more dynamic experience, edge servers can pull tile-based ads from third-party ad servers at run-time based on ad targeting rules. To make the most of the opportunity presented by advertising in VR, content providers should target individuals based on their demographics rather than on the content they are viewing, which allows more relevant ads to be delivered in each session. This requires an advanced platform that balances connections to an ad decision-making server with content delivery.

Getting It Right

As VR gathers momentum, it will be increasingly important for content providers to get video delivery right. Early adopters will need to validate the experiences that VR providers are promising in order for the market to grow. That means VR providers need to make good on those promises by delivering high-quality video. Irrespective of the techniques used to transmit 360° video, mature capabilities exist that can be used to achieve a high quality of experience and usher in the new era of 360° content.

Vishal Changrani, Enterprise Architect, Global Consulting Services, Akamai Technologies (left) and Eugene Zhang, Enterprise Architect Director, Global Consulting Services, Akamai Technologies (right).
