We continue our series of articles on design considerations for Media Supply Chains by looking at the technology trends influencing Ingest.
The pace of change across the AV (Audio Video) supply chain has been especially marked for ingest, driven by several factors including cloudification, the rise in live sports streaming, and the boom in remote production accelerated by the Covid-19 pandemic.
The changing of the guard in the field of content production and delivery has also had an impact, with innovation now being led by the big streamers such as Disney, YouTube, Netflix and Amazon’s Twitch, more than major broadcasters such as the BBC, NBC or ZDF.
Sometimes new issues are thrown up by developments designed to solve longstanding challenges. This is evident with cloudification, motivated by the desire to achieve scale economies and reduce complexity by offloading as much video processing as possible to the cloud. This has been driven not just by the rise of streaming, but also by the migration of overall video workflows towards IP, with a decisive shift away from dedicated head-ends based on bespoke hardware from long-established vendors, which have had to reinvent themselves to stay in business. Instead, software encoding is increasingly employed in-house, as well as in hybrid cloud deployments.
One problem is that although network bandwidth has been rising fast and falling in price, it is still not enough to ingest full RAW video as it comes off the camera into the cloud. As a result, the encoding process itself cannot usually be cloudified for high quality content. Instead, the need for high quality and low delay at an affordable operational cost is often met by ingesting source content into the cloud through a mezzanine encoder located at the site of the content owner, whether this is a traditional broadcaster, rights holder, or streaming service provider. The availability and affordability of bandwidth between the encoders on the content owner’s site and the cloud installation where other video processing is performed is a key factor determining the particular codec used.
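To give a sense of scale, the following minimal sketch estimates the bit rate of uncompressed video. The format chosen (UHD at 60 frames per second, 10-bit, 4:2:2) and the function name are assumptions for illustration, not figures from a specific camera, and the calculation ignores blanking, audio and transport overhead.

```python
def uncompressed_bitrate_bps(width, height, fps, bit_depth, samples_per_pixel):
    """Active-picture bit rate in bits per second, with no compression applied."""
    return width * height * fps * bit_depth * samples_per_pixel


# Assumed example format: UHD, 60 frames/s, 10-bit samples, 4:2:2 subsampling.
# 4:2:2 carries one luma sample plus, on average, one chroma sample per pixel.
rate = uncompressed_bitrate_bps(3840, 2160, 60, 10, 2)
print(f"{rate / 1e9:.2f} Gbit/s")  # prints "9.95 Gbit/s"
```

At roughly 10 Gbit/s for a single feed under these assumptions, it becomes clear why a mezzanine encoder on the content owner's site is usually needed before upload.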
In cases where bandwidth is plentiful and affordable, a codec like JPEG XS might be used for just a light compression, maximizing the quality and integrity of the video for processing. More typically, though, a stronger codec such as HEVC or AVC will be used at the contribution stage, configured to upload content at little more than double the bit rate eventually used for distribution to the end device. A compromise therefore has to be made at this stage between bandwidth and the eventual quality of the video.
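The trade-off described above can be sketched as a simple decision rule. Everything numeric here is an assumption for illustration: the 10:1 light compression ratio for JPEG XS, the factor of 2.2 standing in for "little more than double", and the names `plan_contribution` and `ContributionPlan` are hypothetical rather than taken from any product.

```python
from dataclasses import dataclass

# Illustrative assumptions, not vendor or standards figures.
JPEG_XS_RATIO = 10         # assumed light compression ratio
CONTRIBUTION_FACTOR = 2.2  # "little more than double" the distribution rate


@dataclass
class ContributionPlan:
    codec: str
    bitrate_mbps: float


def plan_contribution(raw_mbps, distribution_mbps, link_mbps):
    """Pick the lightest compression that fits the uplink to the cloud."""
    light_rate = raw_mbps / JPEG_XS_RATIO
    if light_rate <= link_mbps:
        return ContributionPlan("JPEG XS", light_rate)
    heavy_rate = distribution_mbps * CONTRIBUTION_FACTOR
    if heavy_rate <= link_mbps:
        return ContributionPlan("HEVC", heavy_rate)
    raise ValueError("uplink too slow for the assumed contribution rates")


# A 2 Gbit/s link has room for light compression; a 100 Mbit/s link does not.
print(plan_contribution(9953.28, 16, 2000).codec)
print(plan_contribution(9953.28, 16, 100).codec)
```

In practice the choice would also depend on latency targets, cost per gigabyte transferred and the processing planned in the cloud, none of which this sketch models.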
The field of ingest is still evolving, with one likely change coming in the underlying protocols used for transmitting the encoded content into the cloud. One side effect of the transition of ingest to the cloud is a revival of RTMP (Real-Time Messaging Protocol), if only temporarily. RTMP was developed by Macromedia (later acquired by Adobe) for streaming from video servers to end devices, running over TCP for basic transport (a UDP-based variant, RTMFP, also exists). Because RTMP playback relied on Adobe Flash, which is no longer supported by either web browsers or mobile devices, it is no longer practical for streaming delivery. But it has found favour for uploading content from an encoder to an online video host in the cloud as part of the ingest process. The ingesting itself may then be performed over a different protocol, depending on the content type and quality required.
RTMP itself is suitable for ingesting live content when ultra low latency is more important than absolute quality. For on-demand content where high quality is required, especially 4K resolution at 3840x2160 pixels per frame, HLS (HTTP Live Streaming) or DASH (Dynamic Adaptive Streaming over HTTP) are recommended. So there are decisions broadcasters and content providers have to make, almost on a case by case basis, if they want to enable the best experience downstream for the user.
The increasing amount of sport being broadcast live is driving live linear delivery. Lower cost remote production and more automation make it cost effective to cover less popular niche events that can still amass substantial audiences on an aggregated global basis via online delivery. In some of these cases, content is being contributed to the cloud before encoding to make use of faster hardware-based systems to minimize latency. For this, lower latency protocols are required for uplinking over dedicated IP links, the main contenders being SRT (Secure Reliable Transport), RIST (Reliable Internet Stream Transport), and the proprietary Zixi protocol.
Again, these protocols were developed in the first instance for streaming delivery, but more recently than RTMP. It is likely over time that RTMP will gradually become extinct for ingest as well, perhaps to be replaced by these newer streaming protocols.
Above the protocol, ingest is evolving at a higher level as part of overall supply chain optimization, particularly among larger broadcasters, streaming service providers and content aggregators. At this level, considerable cost savings and efficiency improvements can be made by optimizing the location of ingest and the routing of content at the various stages. This has been illustrated by a number of major aggregators and distributors, such as YouTube and Twitch. Twitch recently discussed its proprietary ingest routing system called Intelligest, developed to distribute live video ingest traffic as intelligently as possible from its PoPs (Points of Presence) to the origins. This has two components: the Intelligest media proxy running in each PoP, and the Intelligest Routing Service (IRS) running in AWS (Amazon Web Services), which Twitch, as an Amazon subsidiary, uses for its video transport, both ingest and distribution.
As with the cloud, this expansion and diversification into multiple PoPs and origin centers created new problems. A major one was inefficient utilization of origin servers, resulting from the cyclical nature of broadcasting, which is subject to sharp peaks and troughs in demand throughout the day. Typically, origin servers would run at peak load for a short period, early in the evening for example, with far lower utilization at other times. This was expensive because each origin had to be provisioned to cope with its own peak load.
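The cost of provisioning each origin for its own peak can be illustrated with a small worked example. The load figures below are invented for illustration (concurrent streams at four sample hours for three hypothetical origins whose peaks fall at different times) and do not come from Twitch.

```python
# Hypothetical load per origin at four sample hours (concurrent streams).
loads = {
    "origin_a": [20, 90, 30, 10],
    "origin_b": [10, 30, 80, 20],
    "origin_c": [70, 20, 10, 40],
}


def individual_capacity(loads):
    """Capacity needed when every origin must absorb its own peak."""
    return sum(max(series) for series in loads.values())


def pooled_capacity(loads):
    """Capacity needed if load could be moved freely between origins."""
    return max(sum(hour) for hour in zip(*loads.values()))


print(individual_capacity(loads))  # prints 240
print(pooled_capacity(loads))      # prints 140
```

The pooled figure is an idealized lower bound, since real routing is constrained by network distance and backbone capacity, but the gap shows where the savings from smarter ingest routing come from.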
To overcome that and other performance related challenges, including meeting growing demands of live video traffic, as well as enabling new applications and services for its broadcasters, Twitch decided to revamp its software architecture by developing Intelligest.
The Intelligest media proxy serves as the ingest gateway to the Twitch infrastructure, terminating live video streams from broadcasters and extracting relevant properties from their metadata. It then hands the streams over to the Intelligest Routing Service (IRS), where the real intelligence resides, to determine which origins the streams should be sent to. This supports rule-based routing to suit different use cases, so that, for example, live streams associated with given video channels can always be forwarded to the same origin no matter which PoPs the video streams arrive at, perhaps to satisfy specific processing needs that can only be serviced in that one place.
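As a rough illustration of rule-based routing of this kind, the sketch below pins selected channels to a fixed origin and falls back to a per-PoP default otherwise. It is not Twitch's implementation; the `Router` class, its fields and the channel, PoP and origin names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    # channel -> origin that must always receive this channel (hypothetical rule)
    pinned: dict = field(default_factory=dict)
    # PoP -> origin used when no channel-specific rule applies
    pop_default: dict = field(default_factory=dict)

    def route(self, channel, pop):
        """Return the origin for a stream arriving at a given PoP."""
        if channel in self.pinned:
            return self.pinned[channel]
        return self.pop_default[pop]


router = Router(
    pinned={"esports_final": "origin_us_west"},
    pop_default={"pop_london": "origin_eu", "pop_tokyo": "origin_ap"},
)
print(router.route("esports_final", "pop_tokyo"))  # prints origin_us_west
print(router.route("casual_stream", "pop_tokyo"))  # prints origin_ap
```

Keeping the rules in a central service rather than in each proxy means they can be changed in one place while the proxies stay simple.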
The major objective though was to utilize the overall compute and network resources in the infrastructure more effectively, to make it more scalable within a budget, as well as reduce operational costs. This involved a two-stage process that could be relevant for other large content aggregators. Firstly, Twitch developed a static routing system involving offline optimization leading to regular updating of routing rules to move ingest streams to origins with compute resources available at peak times, spreading the computational load around the world as far as possible.
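The offline step can be pictured as a batch job that turns forecast load into a routing table. The sketch below uses a simple greedy heuristic (largest stream first, placed on the origin with the most spare capacity); this is a stand-in chosen for brevity, not the optimization Twitch describes, and the function name, costs and capacities are hypothetical.

```python
def build_static_rules(stream_costs, origin_capacity):
    """Greedy offline assignment: returns a channel -> origin routing table."""
    remaining = dict(origin_capacity)
    rules = {}
    # Place the most expensive streams first so they get the roomiest origins.
    for channel, cost in sorted(stream_costs.items(), key=lambda kv: -kv[1]):
        origin = max(remaining, key=remaining.get)
        if remaining[origin] < cost:
            raise ValueError(f"no origin has capacity for {channel}")
        rules[channel] = origin
        remaining[origin] -= cost
    return rules


# Hypothetical forecast compute cost per channel and capacity per origin.
rules = build_static_rules(
    {"ch1": 5, "ch2": 3, "ch3": 4},
    {"eu": 6, "us": 8},
)
print(rules)  # prints {'ch1': 'us', 'ch3': 'eu', 'ch2': 'us'}
```

Because the table is computed from forecasts, it is only as good as those forecasts, which is the limitation the next stage addresses.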
This greatly improved utilization but, because it relied on offline optimization, still failed to handle unexpected traffic patterns and system fluctuations in real time, resulting for example from unanticipated demand during a major event, or a sudden loss of video transcoding capacity in an origin data center.
Making the overall system more dynamic involved addition of real time monitoring of both compute resources in the origin and the backbone network. This means that both availability of resources and their changing utilization are monitored in much nearer real time, allowing the whole infrastructure to flex more effectively to fluctuations in demand, even when these are not anticipated.
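One way to picture the dynamic stage is as an override on top of the static rule, driven by the latest monitoring snapshot. The following is a minimal sketch under assumed conventions: utilization is a fraction between 0 and 1, `None` means no recent data, and the 0.85 threshold and the `choose_origin` name are hypothetical.

```python
def choose_origin(static_origin, utilization, threshold=0.85):
    """Keep the static choice while it is healthy, else use the least-loaded origin.

    utilization maps origin -> latest load fraction; None or a missing entry
    means no recent monitoring data, which is treated as unavailable.
    """
    current = utilization.get(static_origin)
    if current is not None and current < threshold:
        return static_origin
    healthy = {
        origin: load
        for origin, load in utilization.items()
        if load is not None and load < threshold
    }
    if not healthy:
        raise RuntimeError("no origin is below the utilization threshold")
    return min(healthy, key=healthy.get)


print(choose_origin("eu", {"eu": 0.50, "us": 0.30}))              # prints eu
print(choose_origin("eu", {"eu": 0.95, "us": 0.30, "ap": 0.60}))  # prints us
```

A fuller version would also weigh backbone network load between the PoP and each candidate origin, which the article notes is monitored alongside compute.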
This is still a work in progress, as any complex system should be, much like the ingest field as a whole.