The World Of OTT (Quality Assurance Pt2) - Monitoring From The Network Side

Part 1 of this series described how network-side QoE (Quality of Experience) measurement is fundamental to proactively assuring the quality of OTT services. At its core, the network-side can be an early warning system for QoS, which in turn correlates to actual QoE performance. This article considers the two types of network monitoring available to us, relative priorities for the points of measurement, and how the video platforms contributing to OTT services are evolving to support OTT quality at scale.

The Measurement Method

Network-side measurement of content, delivery and QoE performance takes one of two forms: service monitoring, which focuses on an individual or a small subset of the audience, or platform monitoring, which focuses on all inputs and outputs of a particular technical component such as a CDN or encoder.

Service monitoring uses active testing, where probes simulate OTT clients. This is typically the most widely used measurement method, because it is most cost-effective for smaller, targeted sample testing and is generally able to identify most quality issues. Service monitoring can determine whether, for example, all ABR profiles can be streamed successfully to a particular device type, or if the latest VOD content can be streamed from each CDN, or if the queued-up ads are ready to be played during the break. Because active testing draws content to a client device, a side-benefit of the tests is that caches can be pre-warmed with the tested content, which is particularly helpful if multiple caches in the CDN can all be populated with the necessary content from the test activity.
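As an illustration of the ABR profile check described above, here is a minimal sketch of one step an active probe might perform. It assumes the service uses HLS, and it only checks that each variant listed in a master playlist answers with HTTP 200. The names `parse_variants`, `probe_profiles`, `fetch_status` and the sample playlist are hypothetical, and a real probe would go further by downloading media playlists and segments.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical master playlist used only for illustration.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=2500000,BANDWIDTH=3000000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080
high/index.m3u8
"""


def parse_variants(master_playlist: str) -> List[Tuple[int, str]]:
    """Return (bandwidth_bps, uri) for each variant in an HLS master playlist."""
    lines = [ln.strip() for ln in master_playlist.splitlines() if ln.strip()]
    variants = []
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        # Match the BANDWIDTH attribute but not AVERAGE-BANDWIDTH.
        match = re.search(r"[:,]BANDWIDTH=(\d+)", line)
        # The variant URI is the line that follows the tag.
        if match and i + 1 < len(lines) and not lines[i + 1].startswith("#"):
            variants.append((int(match.group(1)), lines[i + 1]))
    return variants


def probe_profiles(
    master_playlist: str, fetch_status: Callable[[str], int]
) -> Dict[int, bool]:
    """Map each variant bandwidth to True if its playlist returned HTTP 200.

    fetch_status is an assumed callable (URI -> HTTP status code) so the
    probe logic can be exercised without a network connection.
    """
    return {
        bandwidth: fetch_status(uri) == 200
        for bandwidth, uri in parse_variants(master_playlist)
    }
```

In practice `fetch_status` would wrap an HTTP client pointed at a specific CDN hostname, so the same check could be repeated per CDN and per device profile.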

Service monitoring can be targeted or broad. It can focus on single live streams for the duration of an event, or it can focus on an entire VOD library. It can simulate 20 end customers, or 20,000. The deployment depends on the budget and value of the content, but the flexibility exists to cater to a wide range of requirements with today’s cloud-based SaaS solutions. Monitoring processes can be expanded or contracted according to the content and audience locations and paid for accordingly. Data analytics, required in real-time to be of most use to the OTT operator, can be accelerated by elastic computing.

Service monitoring for live events generally involves monitoring of all stream variants continuously, as any Playout MCR demonstrates. But in OTT it is not enough to monitor the streaming output of the live encoder or the origin to confirm all required bitrates and packages are streaming as expected. The output to the client devices is where the quality issues manifest once the streams have passed through the various network paths. So, in the absence of direct network control, or real-time stream-level reporting from CDN suppliers and the ISPs, or sufficiently scalable external monitoring tools, OTT operators have naturally relied upon what they could control – client-side monitoring. But, as mentioned before, relying on client-side monitoring alone makes it impossible to troubleshoot root causes or to assure quality proactively.

“There is a clear shift underway from linear delivery to OTT delivery,” states Anupama Anantharaman, VP Product Management at Interra. “The need to provide great video quality and user experience on the complex OTT platform, across different geographies has caused new quality assurance strategies to emerge. Operators want intelligent error correlation and troubleshooting tools, and flexible monitoring, from deep, persistent monitoring for content quality at the encoder and origin server to lighter, delivery-specific checks at the CDN and edge points. Cloud-based monitoring solutions offer solid advantages and are evolving to meet these needs.”

As with most troubleshooting activities, when the root cause remains undiagnosed there comes a point where it becomes necessary to see 100% of the platform for extended periods of time to know what is truly happening. Platform monitoring, fulfilled through passive monitoring by tools external to the video delivery platforms, can meet this need.

This external platform monitoring method is intensive and relies on collaboration with the platform owners to insert line taps to see what is “on the wire”. This can also become expensive, but sometimes it is the only way to resolve a persistent issue.

External platform monitoring is made more complex by the distributed nature of OTT delivery networks. A single CDN could have tens or hundreds of edge cache servers all contributing to the delivery of streams. Or there could be 20 different ISPs contributing to the final delivery over their networks. There are multiple multi-tenant platforms working together to deliver OTT video – CDN, IXP, ISP, Access Network, and home routers. Often, external platform monitoring has to be focused on the most critical network junctions, such as an Origin interfacing to multiple CDNs.

Internal platform monitoring is provided by the platforms themselves, like a CDN or an ISP. Because these platforms are often multi-tenant and based on total compute / storage / network performance, the internal monitoring activity is generally focused on availability – i.e., if the infrastructure is operating within tolerances, then it is healthy. But this can hide a plethora of issues with video quality and QoE.

One of the main quality challenges is to consistently sustain a stream’s bitrate. Bitrates can fluctuate due to congestion in the CDN, congestion in the ISP networks and peering networks, the impact of many devices simultaneously requesting the same content from the Origin, the nature of the access network technology (e.g., ADSL vs. FTTP), and more. ABR was invented to deal with the fact that sustaining a bitrate over the internet is an almost impossible task. But stepping up and down the bitrate ladder is not an ideal customer experience, and for high-performance streaming to paying customers ABR is not the final solution. There is a need to solve this issue as far as possible, given the myriad of potential hurdles a stream can face on its way to a device. This QoE issue can be understood in detail by network-side measurement tools.
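To make the bitrate stability problem concrete, the following minimal sketch summarises how often a single session steps up and down the bitrate ladder. It assumes a monitoring tool can supply the bitrate (in kbps) of each delivered segment in order. The function name `ladder_stability` and the three metrics it reports are illustrative choices, not an industry-standard definition.

```python
from typing import Dict, List


def ladder_stability(segment_bitrates_kbps: List[int]) -> Dict[str, float]:
    """Summarise bitrate ladder movement for one session.

    Returns the number of rung changes, how many of those were downshifts,
    and the share of segments delivered at the highest bitrate observed.
    """
    if not segment_bitrates_kbps:
        return {"switches": 0, "downshifts": 0, "top_rung_share": 0.0}

    # Compare each segment with the one delivered immediately before it.
    pairs = list(zip(segment_bitrates_kbps, segment_bitrates_kbps[1:]))
    switches = sum(1 for prev, cur in pairs if cur != prev)
    downshifts = sum(1 for prev, cur in pairs if cur < prev)

    top = max(segment_bitrates_kbps)
    top_share = segment_bitrates_kbps.count(top) / len(segment_bitrates_kbps)

    return {
        "switches": switches,
        "downshifts": downshifts,
        "top_rung_share": top_share,
    }
```

A session that delivers segments at 6000, 6000, 3000, 3000 and 6000 kbps would report two switches, one downshift and three of five segments on the top rung.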

CDNs, which represent the last video-specialised environment in the delivery chain before the consumer device, are evolving to give this stream-level QoE data to their OTT operator customers. As a rule of thumb, if the quality of the stream meets QoE specifications at the egress point from the CDN Edge Cache it is most likely that the consumer will be happy. As the final point of delivery into the last-mile network, this works much the same way as the head-end for over-the-air broadcasting.

Considering the order of priority for service assurance monitoring today, Figure 1 indicates priority 1 for active testing of encoder output (including file transcoders for VOD assets), origin output (which can be multiple points of output) and CDN egress. Already this raises the bar on what most OTT operators see, because CDN egress data is not often available at the appropriate granularity. Note, however, that content quality monitoring, if done post-encryption, requires decryption to visualize the asset. While eminently achievable, and most often needed in workflows with combined transcoder-packager functions that cannot be monitored post-transcoder, this requires an extra level of integration with the DRM supplier. Delivery and QoE monitoring, by contrast, can be done simply on the packaged and encrypted streams.

Priority 2 is the egress from the edge of each access network, which is a bigger task and currently requires sample-based service monitoring or a deep relationship with an ISP. Priority 2 also includes continuous passive monitoring of the Origin egress, which is often required when active testing of origin and/or CDN egress does not lead to a clear diagnosis. Congestion, request time-out, and load balancing configuration issues on the Origin are more complex problems to understand and often require passive monitoring.

Figure 1: Prioritized monitoring points for video stream quality.

Evasive Action

Measuring is one thing. Performing real-time analysis to create actionable insight is another. Typically, monitoring is against agreed tolerances such as a minimum average bitrate across all streams from a single CDN. Thresholds are pre-defined based on the known consumer ecosystem served by the OTT operator. Alarm thresholds are normally set to be as proactive as possible, to report signs of degradation that need to be addressed, rather than waiting reactively for outages.
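A minimal sketch of the threshold check described above follows, using the example of a minimum average bitrate across all streams from a single CDN. The function name `evaluate_cdn`, the two-level warning and critical scheme, and the numeric thresholds in the usage note are assumptions made for illustration. Real thresholds would be derived from the operator's own consumer ecosystem.

```python
from statistics import mean
from typing import List


def evaluate_cdn(
    stream_bitrates_kbps: List[float],
    warn_kbps: float,
    critical_kbps: float,
) -> str:
    """Classify one CDN's health from the bitrates of its current streams.

    The warning threshold sits above the critical one, so degradation is
    reported before it becomes an outage.
    """
    if not stream_bitrates_kbps:
        return "no-data"

    average = mean(stream_bitrates_kbps)
    if average < critical_kbps:
        return "critical"
    if average < warn_kbps:
        return "warning"
    return "ok"
```

For example, streams at 5000, 4000 and 3000 kbps average 4000 kbps, so with a hypothetical warning threshold of 4500 kbps and a critical threshold of 3000 kbps the CDN would be reported as "warning".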

Raising an alarm against a threshold is straightforward. Correlating network-side service degradation with customer dissatisfaction is more complex. But this is the standard that must be achieved for OTT operators to assure quality. Advanced solutions today can correlate leading (as opposed to trailing) performance indicators in real-time across various network domains, including data supplied by the client-side monitoring domains. The evasive action often relies on network re-routing to avoid the problematic routes for the future streams, which today means that client-side monitoring tools re-direct streams to another CDN. But is there a better way?

“Highly evolved OTT operators are often frustrated by a lack of visibility into their CDN services,” states Philippe Tripodi, COO-Product at MainStreaming. “CDNs should be transparent in order to support OTT operators’ quality assurance efforts. By adding CDN data to holistic service monitoring tools, the OTT operator can have a deeper understanding of the reasons for each customer’s experience. And if CDNs incorporate data from downstream network domains into their own stream management algorithms, they can make better decisions about how to manage the end customer’s QoE.”

The Future Of Network-side QoE Monitoring

Quality assurance is a never-ending task. New content formats, dynamic networks, changing devices and evolving customer expectations mean that OTT operators need quality assurance to be a core competence.

OTT operators need to have holistic service monitoring tools to understand the overall service they are providing to their audience. The good news now is that SaaS solutions and cloud deployments make this cost effective and easily deployable. Live streams, VOD streams and VOD assets can all be routinely sampled with measurement rules defined ever more precisely over time. OTT operators can really know their stream health and customer QoE.

The partnership between the OTT operator and their monitoring systems, plus their CDN and ISP platform partners, will provide the ability to understand QoE from a network perspective. Today, CDNs are the primary video-centric platform partner to rely on, as they look upstream to the Origin and downstream to the ISPs. They have the final touch and view of the video before it is received by the consumer, and their ability to act and proactively move a stream to a new network path is fundamental to excellent QoE. CDNs already work with ISPs, but this relationship will evolve to become much closer and more proactive, particularly for the largest streamers delivering to the largest audiences.

We are on a path towards OTT at scale with many millions of concurrent streams from single OTT operators to their audience. Some OTT services already reach this size on a regular basis, but in the coming years many more OTT services will. Quality control is already a very real issue for many OTT operators, and it will only become more important.
