Delivering live, high-quality broadcast video over the internet has always been an interesting challenge. Broadcast engineers are expected to understand and manage complex video, networking, scale, resilience, and playback to deliver reliable programming to multi-platform viewing devices. In this series of articles, we delve deeper into live-OTT broadcasting, identify some of these challenges, and present strategies for achieving reliable live-OTT distribution.
Streaming over the internet enables broadcasters to reach a much larger audience than traditional terrestrial, cable, and satellite models. Viewers are now watching their favorite programming on a wide range of devices including cellular phones, gaming systems, PCs, and smart TVs. To increase their audience base, and hence revenue, broadcasters must deliver to these viewers.
The public internet opens new opportunities for multi-platform delivery, providing audiences with many new viewing options. However, this is not as straightforward as it first appears, as most broadcasters will face three specific challenges: bandwidths vary, latency is unpredictable, and picture sizes are determined by the playback device the viewer is using.
OTT Pulls Data Streams
Fundamentally, broadcasting and OTT delivery differ in one important aspect. Satellite, cable, and terrestrial systems all push data to the set-top-box and TV set. In contrast, OTT playback devices request a stream and pull the data from the broadcaster, giving each audience member a unique view.
The web was developed to deliver text documents using a client-server model. To initiate a data transfer, a client typically sends a “GET” command to a listening agent, usually a web server. Web servers are permanently in listening mode, and when they receive the “GET” command from an authorized client, they send the requested information back to the client at its IP address.
Diagram 1 – OTT uses HTTP’s GET command to allow web-browsers to request (pull) consecutive segments of video and audio from the media server.
Internet-connected devices generally use the HTTP (Hypertext Transfer Protocol) model to communicate with web servers. HTTP sits on top of TCP (Transmission Control Protocol), which in turn sits on top of IP datagrams. Although more commands have been added to HTTP as it has developed over the years, the client-server, demand-supply model is how most internet-connected devices work today. Even if the viewer watches on a dedicated app, the HTTP client-server approach is used.
HTTP generally operates on top of TCP/IP to guarantee data is reliably exchanged between the client and server. TCP is highly effective at resending lost packets, which would otherwise significantly degrade a video feed and affect the viewing experience, but its overhead can lead to increased latency and network traffic.
Other systems do exist, such as RTMP (Real-Time Messaging Protocol) and WebRTC (Web Real-Time Communication). Traditionally, RTMP was used in Flash viewers, but its use has declined as delivery networks look to consolidate infrastructures around a common delivery method and Flash has been deprecated in many viewing environments.
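The pull model can be illustrated with a minimal sketch in Python. It assumes a toy in-memory "media server" running on the local machine; the segment names (`/seg0.ts` and so on), their payloads, and the helper names `pull_segments` and `demo` are hypothetical and chosen only for illustration. A real player would also fetch a manifest and decode the media.

```python
import http.server
import threading
import urllib.request

# Hypothetical in-memory media server: segment names and payloads are made up.
SEGMENTS = {f"/seg{i}.ts": f"video-chunk-{i}".encode() for i in range(3)}


class SegmentHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer each GET with the requested segment, or 404 if unknown.
        body = SEGMENTS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def pull_segments(base_url, count):
    """Request `count` consecutive segments, one HTTP GET per segment."""
    # An empty ProxyHandler bypasses any system proxy for this local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    chunks = []
    for i in range(count):
        with opener.open(f"{base_url}/seg{i}.ts") as response:
            chunks.append(response.read())
    return chunks


def demo():
    # Port 0 asks the operating system for any free port.
    server = http.server.HTTPServer(("127.0.0.1", 0), SegmentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return pull_segments(f"http://{host}:{port}", 3)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the three payloads in order. The server never sends anything unprompted: every chunk arrives only because the client asked for it.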
While not initially developed to stream live video over the public internet, HTTP has become the most commonly used video stream delivery protocol today. Because it is the de-facto language for most web traffic, standards-based infrastructure already broadly exists.
Work Back from the Playback Device
To understand OTT distribution from a broadcast engineer’s point of view, start at the playback device and work back to the playout center.
In IT terms, streaming is the process of breaking a file into segments and making them available to a playback device to facilitate video and audio viewing. The alternative is to download the entire file into the player. While progressive download exists for on-demand, it’s not ideal as the long download times will affect the viewer experience and cost.
VOD and live-OTT are similar in that they both fragment the media, so the playback device can request consecutive chunks of data and play the clips in an orderly fashion. Where they differ is that VOD has all the data available before fragmentation begins, whereas live-OTT does not, and must compress and fragment video, audio, and metadata on the fly.
In a live-OTT single client-server model, this works well, as the viewer’s playback device sends HTTP-GET commands to the web server approximately once every second to retrieve consecutive chunks of video and audio data. However, life gets interesting when the number of people watching the event increases to national and international volumes. If 10 million people are watching the event, then 10 million HTTP-GET requests will be sent every second.
Diagram 2 – VOD and live-OTT both segment media into small segments so the browser can generate HTTP-GET commands to retrieve the data and then assemble it for viewing.
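The request arithmetic can be made explicit with a short sketch. It assumes every viewer fetches exactly one segment per segment duration; the function name `requests_per_second` and the 4-second segment length in the second call are illustrative assumptions, not figures from any particular deployment.

```python
def requests_per_second(viewers, segment_seconds):
    """Aggregate HTTP GET rate if each viewer fetches one segment per segment duration."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    return viewers / segment_seconds


# One request per viewer per second, as in the 10 million viewer case.
print(requests_per_second(10_000_000, 1.0))  # 10000000.0

# Hypothetical longer segments reduce the request rate proportionally.
print(requests_per_second(10_000_000, 4.0))  # 2500000.0
```

Under this simple model, the load on the origin grows linearly with the audience, which is why a single web server cannot serve a national event on its own.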
Scalable Systems are Critical
Before the IP revolution, broadcasters had to predict audience and viewing capacity requirements years in advance during a system installation. Although SDI broadcast systems have proved their worth and reliability, they have resulted in proprietary, rigid infrastructures that were incredibly expensive to build and largely only applicable to video and audio delivery.
IT scaling allows us to build a template system at a relatively modest cost, and then duplicate it as customer demand increases, giving CEOs the perfect business model, where operational costs scale in direct proportion to customer demand, and hence revenue. This model is available in live-OTT using distributed streaming-servers.
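As a rough illustration of template scaling, the sketch below estimates how many identical streaming-server instances an audience would need. The per-server capacity of 20,000 viewers and the function name `servers_needed` are assumptions made for the example; real capacity depends on bit rates, hardware, and caching.

```python
def servers_needed(viewers, viewers_per_server):
    """Number of identical template servers needed, rounded up."""
    if viewers_per_server <= 0:
        raise ValueError("viewers_per_server must be positive")
    # Integer ceiling division; keep at least one instance running (an assumption).
    return max(1, (viewers + viewers_per_server - 1) // viewers_per_server)


# Assumed capacity: 20,000 viewers per template server.
print(servers_needed(10_000_000, 20_000))  # 500
print(servers_needed(10_000_001, 20_000))  # 501
```

Because each instance is a copy of the same template, growing from one audience size to the next is a matter of adding instances, not redesigning the system.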
Quality of Experience is Paramount
Many modern streaming architectures now include just-in-time packaging (JITP). JITP consists of a resilient origin-server that hosts the live content, and a multi-CDN approach with optimized caching layers that scale on-demand as audience size grows. Robust monitoring systems are placed throughout the delivery chain to enable broadcasters to proactively guarantee viewers’ quality of experience (QoE).
An added benefit of the HTTP model is that it allows broadcasters to gather statistics directly from all viewers enabling them to not only improve QoE, but also better target and evaluate monetization opportunities.
Viewers are on the Move
Multi-platform viewing introduces new challenges as consumers watch events on mobile devices. By their very nature, cellular phones and notepads are moved from location to location as viewers go about their business. In doing so, the reliability and quality of the network varies. Watching a live event on a cellular phone while on a train journey is a good example of this.
As a viewer moves between cellular networks, environmental conditions cause variations in the phone signal resulting in periodic changes in data rates and latency. Without intervention, viewers would suffer a poor QoE as the phone-app would be constantly buffering data. The picture would stutter, sound would break up, and the “please wait” icon would be regularly displayed.
To resolve this, most modern streaming platforms deliver adaptive bitrate (ABR) packages of content. As HTTP delivery requires the video streams to be segmented, streaming platforms take advantage of this and provide multiple bit rates for each program. Manifest files are periodically updated and retrieved, enabling playback devices to choose the optimum bit rate for the quality of network available to them.
DASH to the Rescue
Dynamic Adaptive Streaming over HTTP (DASH) is a relatively new system that provides optimal multi-platform and multi-data-rate viewing. DASH is codec agnostic and relies on the viewer’s device to adopt strategies to effectively switch between differing data-rates and provide the optimum QoE.
In a typical deployment, a broadcaster might provide five video and audio compression outputs to deliver data-rates of 18, 12, 8, 5, and 3 Mbits/sec. These would be continuously made available to the origin or CDN, which would in turn make them available to the viewer’s device. Software running on the playback device interprets the manifest and programmatically identifies the best data-rate to use given the network conditions.
Diagram 3 – The viewer’s device uses its own algorithms to determine which compression stream to use to provide the best QoE.
If the viewer were watching a game on a cellular phone in an empty coffee shop, they might achieve WiFi coverage at 8 Mbits/sec, but as more customers enter and use the WiFi, it might decrease to 5 Mbits/sec. With DASH, their playback software identifies the reduction in data-rate and dynamically switches to the lower data-rate stream on the fly. The switch is bound to “key-frames” so the user would be unaware the change had taken place. The opposite happens as customers leave the shop and higher data-rates become available.
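The coffee shop scenario can be sketched as a simple selection rule. DASH leaves the switching strategy to the player, so the rule below (pick the highest data-rate that fits the measured throughput) is only one possible approach, not the algorithm any particular player uses. The function name `choose_bitrate` and the `safety` margin parameter are assumptions added for illustration.

```python
# Data-rates in Mbits/sec, taken from the five-output example.
LADDER_MBPS = [18, 12, 8, 5, 3]


def choose_bitrate(measured_mbps, ladder=LADDER_MBPS, safety=1.0):
    """Pick the highest data-rate that fits within measured throughput * safety."""
    budget = measured_mbps * safety
    candidates = [rate for rate in ladder if rate <= budget]
    # If nothing fits, fall back to the lowest rung instead of stopping playback.
    return max(candidates) if candidates else min(ladder)


print(choose_bitrate(8))  # empty coffee shop: 8
print(choose_bitrate(5))  # busy coffee shop: 5
print(choose_bitrate(2))  # below the lowest rung: 3
```

A player would typically re-run a decision like this as each new segment is requested, and apply the change at the next key-frame so the viewer does not notice the switch. Setting `safety` below 1.0 leaves headroom; for example, `choose_bitrate(8, safety=0.8)` selects 5.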
Keep New Viewers
Regardless of how a transmission reaches the viewer, and the device they are viewing on, any degradation of QoE will be blamed on the broadcaster. Customers are completely unaware of the route the game they are watching takes, nor do they care. All they know is that they can’t watch their event if something goes wrong in the supply chain.
It’s imperative that broadcasters not only improve their offering through OTT and multi-platform devices, but also invest heavily in their monitoring, at all levels. HTTP systems provide unprecedented access to user viewing habits, network reliability, and QoE. And all this information can be harvested by the broadcaster using appropriate monitoring tools to deliver unified OTT transmissions to keep their existing viewers and gain new audiences.