News consumption on smartphones is growing faster than other video sectors.
Consumption of news on mobile devices has been soaring for over a decade, overtaking desktop PCs in 2016 in many markets such as the US and some European countries.
That, then, is old news, but more recently consumer expectations of quality and robustness have grown to challenge broadcasters and video service providers, with much lower tolerance of buffering and artefacts, coupled with elevated expectations that content be presented in an optimum format for small screens. Such demands have been stoked further by the arrival of 5G and superior large-screen smartphones that expose quality to greater scrutiny and make defects more visible.
In the longer term, the distinction between fixed and mobile delivery will blur, with wireless increasingly being the last hop of a service delivered over wired broadband, as is already often the case when video is viewed over a home or public WiFi network. Yet even as 5G is being rolled out and WiFi has been upgraded to its sixth generation, many old problems of wireless video delivery still dog service providers. Wireless networks remain susceptible to varying levels of signal attenuation from obstacles in the path or RF interference, leading to packet losses and inconsistent round-trip delays compared with their wired counterparts.
Such continuing inconsistencies in wireless delivery were outlined recently by California-based online video optimization and analytics firm Conviva, in a survey finding that video start-up times were 11% worse on mobiles in the USA and 9% worse in Asia as a result of buffering. That might not sound huge, but it amounted to 5.1 billion viewing hours lost in total across those regions. It also provokes churn away from services: American instant messaging app Snapchat reported that consumers become restive and start jumping ship after just two seconds of start-up delay, with every additional second causing 5.8% of viewers on average to abandon.
Similar findings have been reported by others coming from different perspectives, including Akamai on the Content Delivery Network (CDN) front, and Facebook, rebranded as Meta Platforms, as the world's biggest social media conglomerate. This led Facebook to recalibrate the way it measures mobile video consumption, since it was finding that many users had already dropped off before reaching the previous minimum threshold of three seconds' viewing. A lot of that Facebook content is news.
The underlying point is that improvements in data delivery under both cellular and WiFi in their new generations are not sufficient to meet growing demand for quality on mobile devices. Separate technology advances are required to improve playout on mobile devices, starting with the web's fundamental protocol, HTTP. This evolved in the early 1990s to enable application-level data to be transmitted reliably over the TCP transport protocol. It was upgraded with a serious revision to HTTP/2 around 2015, to exploit new techniques for increased speed and also to support application types that had evolved, notably pre-emptive data delivery. The latter became valuable for some mobile news services, avoiding the need to rely on live delivery by enabling playout from the device.
However, HTTP/2, although better than HTTP/1 most of the time, still failed to cope well with poor mobile connections subject to wide variations in delay and packet loss. A fundamental problem was reliance on the rather rigid TCP protocol that is connection oriented, requiring a path to be set up over the network with retransmission of lost or corrupted IP packets.
It is true that upgrade from 4G to 5G, or from WiFi 5 to WiFi 6/6E, improves the situation, but not enough necessarily to meet rising expectations of quality.
This led the major internet technology players such as Google and Facebook to promote a new protocol called QUIC with video services in mind, especially for mobile delivery of live snippets such as news bites. A key change was replacement of TCP with the alternative connectionless User Datagram Protocol (UDP), which does not require one-to-one connections to be set up and, with the help of additional changes, cuts latency. One such change reduces overhead during connection setup by avoiding the need for a separate dedicated handshake to exchange encryption keys; instead, key exchange is folded into the initial handshake over UDP.
The use of UDP rather than TCP bypasses the latter's loss recovery mechanism, whose packet retransmissions can impose significant delay, especially as TCP cannot cater independently for separate video streams within an overall session: a single lost packet stalls everything queued behind it, so-called head-of-line blocking. QUIC, by contrast, continues servicing other streams while one stream's errored packets are retransmitted.
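The connectionless property QUIC builds on can be seen with nothing more than standard sockets. This is a minimal sketch, Python standard library only, showing that a UDP datagram can be sent immediately with no listen/accept/handshake phase before the first byte of application data; it is an illustration of the transport primitive, not of QUIC itself.

```python
import socket

def udp_round_trip(payload: bytes) -> bytes:
    # "Server": a bound UDP socket; no listen(), no accept(), no handshake.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))          # let the OS pick a free port
    addr = server.getsockname()

    # "Client": sends a datagram straight away; no connection is set up,
    # so no round trip is spent before the first byte of data.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(payload, addr)

    data, client_addr = server.recvfrom(2048)
    server.sendto(data, client_addr)        # echo back to the sender
    reply, _ = client.recvfrom(2048)
    client.close()
    server.close()
    return reply

print(udp_round_trip(b"news-bite"))        # b'news-bite'
```

QUIC layers its own loss recovery, encryption and stream multiplexing on top of exactly this kind of datagram exchange.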
There are various other improvements that collectively cut overall latency and improve throughput. A notable one for mobile is the ability to cope more efficiently when users roam between mobile cells under, say, 4G/5G, or between WiFi hotspots. Under TCP, such network switch events require existing connections to time out laboriously one by one and then be re-established on demand. QUIC instead includes a tag uniquely identifying the connection to the server regardless of source address, so a session can survive a change of network path.
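The roaming behaviour described above can be sketched in a few lines. This is a hedged illustration of the idea only: the server keys session state on a connection ID rather than on the (IP, port) pair, so the session survives an address change. The names and formats are invented for illustration and bear no relation to the real QUIC wire format.

```python
# connection_id -> per-session state, looked up by ID rather than address
sessions: dict[str, dict] = {}

def handle_packet(connection_id: str, source_addr: tuple, payload: str) -> str:
    # A TCP-style server would key on source_addr and treat a new address
    # as a new (or broken) connection; here the ID identifies the session.
    state = sessions.setdefault(connection_id, {"addr": source_addr, "count": 0})
    if state["addr"] != source_addr:
        # Client roamed to a new network: update the path, keep the session.
        state["addr"] = source_addr
    state["count"] += 1
    return f"ack {state['count']} via {state['addr'][0]}"

# The same connection ID survives a change of source address (WiFi -> 5G).
print(handle_packet("c1", ("10.0.0.5", 4433), "segment-1"))
print(handle_packet("c1", ("172.16.8.9", 5012), "segment-2"))
```

In real QUIC the connection ID is negotiated during the handshake and can be rotated for privacy, but the lookup principle is the same.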
As QUIC has matured it has been adopted for HTTP/3, the next version of HTTP, yet to be formally ratified as an internet standard but already supported by 73% of currently running web browsers, including Google Chrome, Microsoft Edge, Chrome for Android, and Mozilla Firefox.
Despite these improvements, inevitably some specialists in the field argue there is a need for more fine-tuning to accelerate delivery of live video such as news. One of these is Portuguese mobile CDN software specialist Codavel, which has developed a protocol called Bolina, claiming it clearly outperforms not just HTTP/2 as currently widely deployed, but even HTTP/3 built on QUIC. There is no doubt Codavel has improved on QUIC, but as always there is the question of whether the gains merit supporting a proprietary protocol rather than one with such industry weight behind it.
For mobile video delivery, the other great challenge lies in shaping content optimally for different screens as well as networks, catering for varying capacities, bandwidths and display resolutions. There is a need to cater not just for varying service commitments but also for the fluctuating quality of the mobile network and differing receiving device capabilities, in the real sense of "best effort": delivering the highest quality possible to a given device at a particular time. This is enabled by adaptive bitrate streaming (ABRS), or multi-bitrate streaming, under which broadcasters or content providers generate and stream multiple renditions of the same video simultaneously. The ABRS player calculates the best bitrate for each individual viewer, on the basis of measurements from the network and knowledge of the client device, and automatically invokes the optimal rendition.
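The player-side selection logic just described can be sketched very simply. This is an illustrative throughput-based ABR rule, not the algorithm of any particular player: pick the highest rendition whose bitrate fits within measured throughput, with a safety margin so minor fluctuations do not immediately trigger rebuffering. The ladder values and the safety factor are assumptions.

```python
# Example encoding ladder (kbps); real ladders are tuned per service.
RENDITIONS_KBPS = [400, 1200, 2500, 4500, 8000]

def select_bitrate(measured_kbps: float, safety: float = 0.8) -> int:
    # Leave 20% headroom so a small dip in throughput does not stall playback.
    budget = measured_kbps * safety
    usable = [r for r in RENDITIONS_KBPS if r <= budget]
    # If even the lowest rendition exceeds the budget, fall back to it anyway.
    return max(usable) if usable else min(RENDITIONS_KBPS)

print(select_bitrate(3800))   # 2500: 3800 * 0.8 = 3040, so 2500 is the best fit
print(select_bitrate(300))    # 400: below the ladder, so use the lowest rung
```

Production players typically blend throughput estimates with buffer occupancy, but the core "highest rendition that fits" decision looks much like this.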
This is done in conjunction with transcoding, which is the process of creating those multiple renditions of the video at different sizes, catering for varying bit rates. Transcoding is really a catch-all term for taking a video file that has already been encoded, or compressed, and decompressing it so that it can be manipulated to create the different versions. The changes can include transrating, meaning changing the bit rate, and transizing, meaning altering how video frames are displayed, whether in resolution, frame rate, or both. Transrating and transizing are often combined, as when converting, say, a 4K video source at 14 Mbps down to a 1080p "full HD" stream at 4 Mbps. This requires adjustment of both resolution and bit rate, while in some cases frame rate may also be altered.
This revolves around the video codec, with H.264 from the MPEG camp still widely used for mobile delivery, although there is likely to be a swing towards the Alliance for Open Media's (AOM) AV1 in future. That is one area where the iOS and Android camps are aligned, with Apple a governing member of AOM alongside Amazon, ARM, Cisco, Facebook, Google, Huawei, Intel, Microsoft, Mozilla, Netflix, Nvidia, Samsung Electronics and China's Tencent.
That said, Apple does not yet support AV1 on iPhones, although momentum is growing generally, with Netflix starting to stream in that codec to Android mobile devices as early as February 2020. H.264, its successor H.265, and AV1 will all be in the frame for mobile video delivery over the coming few years.
The likes of Avid are producing bolt-on lights and microphones for smartphones to enhance remote mobile journalism.
Irrespective of the codec used, the bit rate adopted for streaming will also determine the target quality, in addition to device and network capability. For example, video may be delivered at just 400 Kbps at a frame size of 320x240 and frame rate of 30 fps for acceptable quality on old legacy devices. But expectations are generally higher, and users of current top-end smartphones will need to be served at 2 Mbps or more over cellular, and higher still over WiFi, at the same frame rate but frame sizes of at least 960 x 640 or even 1280 x 720, which used to be called "half HD".
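The figures above can be thought of as rows in a device profile table that the streaming back end consults. This is a hedged sketch: the device class names are invented, the legacy and cellular rows use the numbers from the text, and the WiFi bitrate is an assumed value for illustration.

```python
# Illustrative delivery profiles; "smartphone-wifi" kbps is an assumption,
# the other figures follow the examples given in the text.
PROFILES = {
    "legacy":          {"kbps": 400,  "size": (320, 240),  "fps": 30},
    "smartphone-cell": {"kbps": 2000, "size": (960, 640),  "fps": 30},
    "smartphone-wifi": {"kbps": 3500, "size": (1280, 720), "fps": 30},
}

def profile_for(device: str) -> dict:
    # Unknown devices fall back to the safest (lowest) profile.
    return PROFILES.get(device, PROFILES["legacy"])

print(profile_for("smartphone-cell"))
```

In a real service such a table is only the starting point: the ABR player then moves up and down within the ladder as network conditions fluctuate.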
These different qualities can be served by transcoding as already described, but demands and capabilities are evolving all the time, which broadcasters are challenged to stay abreast of. Quality monitoring across the stream categories is essential, and broadcasters will need to keep assessing the range of settings they employ to serve the wide range of device types.
News is adding to these challenges, partly as a result of mobile proliferation extending the range and reach of coverage for breaking events. When some of the material is coming in from non-professional sources, captured more and more on users' smartphones, quality control has to be extended accordingly. Automation is also critical for a news organization of any size, which has to cope with large numbers of inputs as well as outputs. Incoming footage has to be assessed for standards compliance, including loudness, and then converted to the various output frame rates, bit rates and resolutions, before being distributed over one of the main adaptive bitrate streaming systems, chiefly Smooth Streaming, DASH or Apple HLS. Vendors such as Interra Systems offer automated hybrid quality control, with systems they claim will monitor content across the entire creation and distribution chain, embracing the main target devices.
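An automated loudness gate of the kind such QC systems apply can be sketched as follows. The target of -23 LUFS comes from the EBU R128 broadcast loudness recommendation; the tolerance value, file names and measured figures here are illustrative, and a real tool would measure loudness from the audio itself rather than take it as an input.

```python
# EBU R128 integrated loudness target for broadcast programmes.
TARGET_LUFS = -23.0

def loudness_ok(measured_lufs: float, tolerance_lu: float = 1.0) -> bool:
    # Pass clips whose integrated loudness sits within tolerance of target.
    return abs(measured_lufs - TARGET_LUFS) <= tolerance_lu

# Illustrative measurements for incoming clips (values are made up).
incoming = {"clip_a.mxf": -22.6, "clip_b.mxf": -18.9}
for name, lufs in incoming.items():
    verdict = "pass" if loudness_ok(lufs) else "needs normalization"
    print(name, verdict)
```

Clips that fail such a gate would be loudness-normalized automatically before transcoding into the output renditions.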
Beyond quality control, there is a growing need for automation across the whole playout cycle to cope with the complexities and content volumes now entailed, and that is boosting the channel-in-a-box field. Channel-in-a-box systems started taking over from legacy linear playout chains some years ago, replacing separate dedicated hardware, often from multiple manufacturers, connected via wired serial or Ethernet cabling. Such systems had become too expensive and unwieldy, especially for emerging content producers, constrained by the need for purpose-built hardware and physical wiring.
Channel-in-a-box playout instead implemented a growing number of those hardware-based playout functions in software. This led the way towards virtualization and the use of commodity underlying hardware components divorced from the application. Initially some more specialized functions remained on dedicated hardware, including special effects, subtitling and more advanced graphics.
Over time though, as off the shelf systems based on common Intel or other platforms rapidly increased further in computational power, and as IP based technology for video advanced, even those more advanced functions could be virtualized and, in many cases, offloaded to the cloud for more effective remote operation and scale economies. Such clouds could be privately owned in the case of larger content or video service providers, or public, in either case taking over the complete playout chain.
This has brought in new vendors as well as galvanizing established players. Vendors in the cloud playout arena, or playout in a box, include Evertz, Grass Valley, Imagine Communications, Harmonic, Playbox Technology, Evrideo, Rohde & Schwarz subsidiary Pixel Power, Broadstream Solutions, and Ericsson’s Red Bee Media.
The greater flexibility of cloud-based playout has been a boon for a number of news organizations, including both traditional broadcasters and emerging providers. It enables much more rapid launches of channels at lower cost, capable of reaching multiple mobile platforms. It has also led to the rise of pop-up broadcast channels, particularly conducive to news, enabling a prospective service to be tested for just a few months to assess demand, without the time and cost of starting up on a legacy platform.
Mobile devices are changing the face of news journalism at both ends of the content cycle: capture and consumption. There is a thriving industry for tools that enhance content capture, enabling professional journalists to roam much more readily with just their smartphones to cover events, as well as enabling amateurs to contribute when they are first on a scene. For professionals, while smartphone cameras are now capable of capturing footage at high quality, often with ultra-wide and telephoto capabilities, sound and lighting can lag behind. Traditional players in video production and editing such as Avid serve this market, equipping professionals for smartphone content capture in the field with small clip-on microphones and battery-operated light sources.
It is, though, the consumption side that poses the greatest challenges for news organizations, and it also feeds back into content creation by changing what people want to watch, increasing demand for snackable short-form content, especially in the case of news.