Testing and measurements were performed using a real-time encoder transmitting A/V content, a network simulator, and a real-time decoder.
Internet packets can occasionally be dropped, primarily due to instantaneous congestion in routers. Dropped packets need to be recovered for glitch-free video. Given enough time, any losses can be recovered, but contribution applications are typically latency-sensitive and cannot ‘wait forever’.
This article is the second of a two-part series based on White Papers by Ciro A. Noronha, Ph.D., at Cobalt Digital and Juliana W. Noronha at UC Davis. Part 2 is based on a White Paper on RIST details that Dr. Noronha presented at a 2019 NAB Show BEIT session. Dr. Noronha presented the White Paper again at the 2019 NAB Show IP Showcase.
The speed and reliability of the Internet have made it possible for broadcasters to use it as a cost-effective low-latency contribution link. Many companies have commercial products providing this functionality, and all these products are implemented as a variation of the Automatic Repeat reQuest (ARQ) protocol.
Many commercial products use ARQ with proprietary implementations that do not interoperate. To address that interoperability issue, the Video Services Forum (VSF) formed the Reliable Internet Stream Transport (RIST) Activity Group in 2017 to create a common protocol specification. In October 2018, the VSF published the TR-06-1 RIST protocol specification to provide interoperability between products using ARQ, and successfully demonstrated it at IBC 2018.
RIST Simple Profile
The ARQ technique for packet recovery was devised in the 1960s and can be found in most computer networking textbooks. In general, the protocol works as follows:
- Sender transmits packets without waiting for any kind of feedback from the receiver.
- Packets have sequence numbers so that the receiver can identify packet losses.
- No acknowledgement is sent for packets that are correctly received.
- The receiver will request a re-transmission for lost packets.
- A lost packet may be requested multiple times.
This example of two successive losses shows that when the ARQ receiver detects a packet loss, it requests a retransmission. At that point, it must wait for one network round-trip delay until that retransmission can possibly arrive. If it does not, then the packet may be requested again.
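The loss-detection step can be sketched in a few lines. This is a minimal illustration, not code from any RIST implementation: it assumes 16-bit sequence numbers that wrap around (as in RTP), and the function name `missing_between` is hypothetical.

```python
SEQ_MOD = 1 << 16  # assumption: 16-bit sequence numbers that wrap, as in RTP


def missing_between(last_seq: int, new_seq: int) -> list[int]:
    """Return the sequence numbers skipped between two received packets."""
    gap = (new_seq - last_seq) % SEQ_MOD
    # A gap of 0 is a duplicate; a gap of half the sequence space or more
    # is treated as an old or re-ordered packet, not as a loss.
    if gap == 0 or gap >= SEQ_MOD // 2:
        return []
    # gap == 1 means consecutive packets, so the range below is empty.
    return [(last_seq + i) % SEQ_MOD for i in range(1, gap)]


print(missing_between(10, 13))     # packets 11 and 12 were skipped
print(missing_between(65534, 1))   # the gap spans the wrap-around point
```

Each number returned would become a candidate for a retransmission request.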
If we denote the maximum number of retransmission requests by R and the network round-trip delay in seconds by T, it follows that both the receiver and the sender must have a buffer large enough to hold RT (R × T) seconds of content, and that the added latency of the protocol is RT. By controlling R, it is possible to control the latency-reliability tradeoff of the protocol.
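As a quick worked example of this relationship, the sketch below computes the buffer depth (and therefore the added latency) for a given R and T. It works in milliseconds to keep the arithmetic exact; the function name is illustrative only.

```python
def arq_buffer_ms(max_requests: int, round_trip_ms: int) -> int:
    """Buffer depth, and added latency, for R requests and round-trip T."""
    if max_requests < 0 or round_trip_ms < 0:
        raise ValueError("R and T must be non-negative")
    return max_requests * round_trip_ms


# Example: 4 retransmission requests over a 200 ms round trip.
print(arq_buffer_ms(4, 200))  # 800 ms of buffering and added latency
```

Halving R halves the added latency, at the cost of fewer chances to recover each lost packet.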
RIST Simple Profile Protocol Description
RIST selected the Real Time Transport Protocol (RTP) for the media transport. RTP is a very simple layer on top of UDP and provides sequence numbers (to detect packet loss) and timestamps (to remove network jitter, if required). The use of RTP ensures that RIST-compliant systems can interoperate with non-RIST systems at a base level without packet loss recovery.
The RTP specification also requires that senders and receivers periodically transmit RTCP packets that are used for control.
In the receiver, RIST uses this mechanism to request retransmissions of lost packets. In the sender, the same mechanism is used to create and maintain state in firewalls along the path for the receiver messages.
How RIST Operates
- Sender transmits the media stream to UDP port P, where P is an even number and configured by the user.
- Sender transmits periodic RTCP messages to UDP port P+1, with source port S. Messages should be transmitted at least 10 times/second.
- Sender listens for the RTCP messages on port S (as “response” to its RTCP messages).
- Receiver listens for the media stream on port P and for the RTCP stream on port P+1.
- Receiver sends periodic RTCP messages from port P+1 directed to sender port S, at least 10 times/second. When necessary, these RTCP messages will include retransmission requests.
- RIST does not require the receiver to process the content of the sender's RTCP messages.
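The port rules in the list above can be captured in a small configuration check. This is a sketch under the assumptions stated in the list (P is even, RTCP uses P+1, and RTCP is sent at least 10 times per second); the class and function names are hypothetical and not part of any RIST library.

```python
from dataclasses import dataclass

MIN_RTCP_RATE_HZ = 10                         # "at least 10 times/second"
MAX_RTCP_INTERVAL_S = 1.0 / MIN_RTCP_RATE_HZ  # so at most 100 ms apart


@dataclass(frozen=True)
class RistPorts:
    media_port: int  # P: RTP media stream
    rtcp_port: int   # P+1: RTCP, including retransmission requests


def rist_ports(p: int) -> RistPorts:
    """Derive the RTCP port from the user-configured media port P."""
    if not 1 <= p <= 65534:
        raise ValueError("P and P+1 must both be valid UDP ports")
    if p % 2 != 0:
        raise ValueError("media port P must be an even number")
    return RistPorts(media_port=p, rtcp_port=p + 1)


print(rist_ports(5000))
```

The sender's source port S is not derived here because the text leaves it unspecified; the receiver learns it from the RTCP packets it receives.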
Two types of retransmission request messages are defined. The Bitmask-Based Retransmission Request covers a range of 17 consecutive packets and can request any loss pattern within this range. The Range-Based Retransmission Request can request a continuous range of packets. Both types of requests can include multiple ranges. A RIST receiver may use either type of message, and a RIST sender will respond to both. An advanced RIST receiver may dynamically decide which message to use based on the loss pattern, thus optimizing the bandwidth utilization.
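To make the two request styles concrete, the sketch below groups a set of lost sequence numbers both ways. It is illustrative only: the bit assignment (bit 0 of the mask stands for the packet immediately after the first one) is assumed to follow the generic NACK layout of RFC 4585, the range form is shown simply as (first, last) pairs rather than any wire format, sequence-number wrap-around is ignored for brevity, and both function names are hypothetical.

```python
def bitmask_requests(lost: list[int]) -> list[tuple[int, int]]:
    """Group losses into (first_seq, mask) pairs covering 17 packets each."""
    requests = []
    remaining = sorted(set(lost))
    while remaining:
        first = remaining[0]
        mask = 0
        rest = []
        for seq in remaining[1:]:
            offset = seq - first
            if 1 <= offset <= 16:
                mask |= 1 << (offset - 1)  # assumed bit order, see lead-in
            else:
                rest.append(seq)           # outside this 17-packet window
        requests.append((first, mask))
        remaining = rest
    return requests


def range_requests(lost: list[int]) -> list[tuple[int, int]]:
    """Group losses into inclusive (first_seq, last_seq) continuous runs."""
    runs: list[tuple[int, int]] = []
    for seq in sorted(set(lost)):
        if runs and seq == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], seq)  # extend the current run
        else:
            runs.append((seq, seq))        # start a new run
    return runs


lost = [100, 101, 103, 116, 117]
print(bitmask_requests(lost))  # two entries for this scattered pattern
print(range_requests(lost))    # three entries for the same pattern
```

A receiver choosing dynamically between the two could, for instance, build both lists and send whichever needs fewer entries: scattered losses favor the bitmask form, while long continuous bursts favor the range form.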
To ensure protocol stability, it is necessary for the receiver to differentiate between original packets and retransmissions. RIST uses the SSRC field in the RTP header to make this differentiation, as recommended by RFC 4588. However, unlike RFC 4588, the retransmitted packet is an exact copy of the original RTP packet, except for the data in the SSRC field.
RIST and Firewalls
RIST Simple Profile only requires that UDP ports P and P+1 be opened at the receive site firewall. The sender is configured to transmit to the public IP address at the receiving site. Flow from the sender to UDP port P is unidirectional and will contain the audio/video content, as shown in Figure 1.
The receiver directs its RTP Control Protocol (RTCP) packets towards the source IP address and UDP port of the traffic received on port P+1. The firewalls along the path treat these packets as a “response” to the sender RTCP packets, and forward them back along the path.
The RIST Simple Profile includes IP Multicast support, and it operates similarly to unicast.
- Sender transmits media stream to UDP port P (even number) and a multicast IP address M.
- Sender transmits periodic RTCP packets to UDP port P+1, and the same multicast IP address M as the media stream.
- Receiver joins multicast M and listens on UDP port P for the media stream and port P+1 for RTCP.
- Receiver sends its RTCP packets to multicast M, UDP port P+1.
- Sender also joins multicast M and listens for receiver RTCP packets on port P+1.
This gives every receiver the ability to “see” the RTCP packets from all other receivers and optimize its retransmission requests accordingly. It does not need to request a retransmission that has been recently requested by another receiver.
RIST Simple Profile supports multiple paths between the sender and the receiver. This is applicable in the following scenarios that are not mutually exclusive:
- The media stream may be split over multiple lower-bandwidth paths for bonding. A common example is the use of multiple cellular connections for media transmission.
- The media stream may be replicated over two or more paths for reliability, similar to what is specified in SMPTE ST 2022-7. In fact, a SMPTE ST 2022-7 Class-C compliant receiver may be able to accept a multipath RIST stream, depending on its buffer sizes.
In order to support re-ordering of packets, the RIST receiver needs to expand its buffer to include a Re-ordering Section prior to the retransmission buffer. This Re-ordering Section can be seen as a 'grace period' for any packets still in flight to arrive. A packet is deemed lost only if it does not arrive after this grace period. In other words, packet loss is detected at the boundary between the re-ordering buffer and the retransmission buffer.
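The grace-period idea can be sketched as a small class that records when each gap is first noticed and only reports a packet as lost once the grace period has elapsed. This is an illustrative model, not the buffer design of any particular product; the class name is hypothetical, timestamps are passed in explicitly, and sequence-number wrap-around is ignored for brevity.

```python
class ReorderSection:
    """Hold back loss declarations until a grace period has passed."""

    def __init__(self, grace_s: float):
        self.grace_s = grace_s
        self.highest = None                   # highest sequence number seen
        self.pending: dict[int, float] = {}   # missing seq -> time gap noticed

    def on_packet(self, seq: int, now: float) -> None:
        if self.highest is None:
            self.highest = seq
            return
        if seq > self.highest:
            # Every skipped number becomes a candidate loss.
            for missing in range(self.highest + 1, seq):
                self.pending[missing] = now
            self.highest = seq
        else:
            # A late (re-ordered) arrival fills its gap: no request needed.
            self.pending.pop(seq, None)

    def lost(self, now: float) -> list[int]:
        """Return packets whose grace period expired; these get requested."""
        expired = sorted(s for s, t in self.pending.items()
                         if now - t >= self.grace_s)
        for s in expired:
            del self.pending[s]
        return expired


section = ReorderSection(grace_s=0.05)
section.on_packet(1, now=0.00)
section.on_packet(4, now=0.01)  # packets 2 and 3 are now pending
section.on_packet(3, now=0.02)  # 3 arrives out of order, within the grace period
print(section.lost(now=0.03))   # nothing has expired yet
print(section.lost(now=0.07))   # packet 2 is declared lost
```

With a grace period of zero, every gap is reported immediately, which corresponds to running without a Re-ordering Section.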
RIST Performance Measurements
The use of a network simulator allows precise control of the network conditions. The setup is shown in the diagram at the top of this article. The measurement parameters and procedure used were:
- Media bit rate: 8 Mb/s (1920×1080i59.94 source).
- Simulated round-trip delay: 200 milliseconds.
- Random, independent and identically distributed packet losses, tested both as single-packet losses and as 5-packet burst losses.
- Two-minute runs.
- Independent variable: number of retries, tested from 1 to 10.
- Receiver re-transmission buffer set to (200R + 100) milliseconds, where R is the number of retries.
- Sender buffer set high enough to handle the worst-case receiver buffer.
- For each retry value, increase the packet loss until at least one unrecovered packet is detected in the two-minute run.
- Record this packet loss rate.
- Repeat each test 10 times.
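For reference, the receiver buffer settings implied by the (200R + 100) millisecond rule above can be tabulated for the tested range of retries. The function name and its parameter names are illustrative; only the formula and the 1 to 10 range come from the procedure.

```python
def receiver_buffer_ms(retries: int, rtt_ms: int = 200, extra_ms: int = 100) -> int:
    """Receiver re-transmission buffer used in the tests: 200R + 100 ms."""
    return rtt_ms * retries + extra_ms


for retries in range(1, 11):
    print(f"R={retries:2d}  buffer={receiver_buffer_ms(retries)} ms")
```

The buffer, and with it the added latency, ranges from 300 ms at one retry to 2100 ms at ten retries.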
The test results are shown in the figure below, which indicates the safe operating region as a function of the packet losses. The figure shows that a properly configured RIST system can safely operate with packet losses in excess of 30%.
The results for 5-packet burst losses are similar to those for single-packet losses, especially in regard to the average behavior. One artifact of this type of testing is that, to keep the same packet loss rate, burst losses must be less frequent, making recovery somewhat easier at the lower packet loss values.
In practical terms, there is only one reason why packets may arrive at the destination out-of-order: when there are multiple network paths between the source and the destination. Routers may classify and transmit packets based on priority, but packets belonging to the same flow should receive the same classification.
The presence of multiple paths may be intentional (e.g., bonding – multiple end-to-end paths), or something that happens in the network backbone, outside of the control of the user. Service providers will likely route most packets through the same path; this path may change over time, and out-of-order instances will happen at the changeover.
The cost of adding a Reorder Section to deal with out-of-order packets is added latency. The cost of not having one is reduced bandwidth efficiency: the receiver may request the re-transmission of a packet that is still in flight and will arrive shortly. Since packets have sequence numbers, receivers can identify and discard duplicates, so the wasted bandwidth is the only penalty.
What should be the size of the Reorder Section when the user has a single Internet connection for the sender and the receiver? One can always measure and characterize their individual link, but the great majority of users will not have the capability to do so, and most providers do not have data on out-of-order packets.
In order to provide a recommendation, we rely on data from a paper by Jaiswal et al. (2003), where the authors characterized the out-of-order behavior of a number of Internet backbone links. They found that, on average, only 0.365% of the packets were out of order. Therefore, in the absence of any additional data, it may be reasonable to simply not have a Reorder Section. The penalty for that is a small increase in the re-transmission data.
RIST Set-Up Recommendations (Noronha and Noronha)
When setting up a RIST Simple Profile link, the user must manually choose a few parameters to optimize the link. Our recommendations are:
- Find out the round-trip time between the sender and the receiver, using the “ping” utility. If using bonding, do this for all links.
- If the network loss is known (e.g., there is an SLA in place), read the minimum number of retries from Figure 2 to be in the Safe Operating region. Common Internet Service Level Agreement (SLA) values are 99% (1% loss) and 99.9% (0.1% loss). The plot suggests the use of two or more retries for the first case, and one or more retries for the second. A safety margin is also recommended. For example, operation at 1% loss and 2 retries is marginal as it is at the border of the operating region in Figure 2. Add at least one retry for margin.
- If the network loss is not known, we recommend starting with 4 retries. Our experience is that 4 retries will give good results in most links. Indeed, the 2018 IBC demonstration was performed with 4 retries. Alternatively, if the application has a maximum latency requirement, divide that by the round-trip time to find the number of retries, and use this value.
- If R is the number of retries selected and T is the round-trip time, the Re-transmission Reassembly Section of the receiver buffer should be set to at least RT. If the application can tolerate it, we recommend a 10% additional margin as network delays tend to vary. In a bonding situation, use the highest round-trip time for T.
- If the transmit buffer is configurable, it should be set as high as possible. At the very least, it must not be less than the receive buffer.
- If using bonding, the Reorder Section must be set to at least the difference between the highest and the lowest round-trip delay, divided by two. A safety margin is also recommended. If not using bonding, this can be left at zero.
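The arithmetic in these recommendations can be sketched as three small helpers. This is a minimal illustration of the rules as stated above (retries from a latency budget, RT plus a 10% margin using the highest round-trip time, and half the round-trip spread for the Reorder Section); the function names are hypothetical, and no extra safety margin is added to the Reorder Section.

```python
DEFAULT_RETRIES = 4  # suggested starting point when the loss rate is unknown


def retries_from_latency_budget(max_latency_ms: float, rtt_ms: float) -> int:
    """Number of whole round trips that fit in the latency budget."""
    return int(max_latency_ms // rtt_ms)


def retransmission_section_ms(retries: int, rtts_ms: list[float],
                              margin: float = 0.10) -> float:
    """R x T with a margin, using the highest round-trip time when bonding."""
    return retries * max(rtts_ms) * (1.0 + margin)


def reorder_section_ms(rtts_ms: list[float]) -> float:
    """Half the spread between slowest and fastest link; zero if one link."""
    if len(rtts_ms) < 2:
        return 0.0
    return (max(rtts_ms) - min(rtts_ms)) / 2


# Example: two bonded links with measured round-trip times of 80 and 200 ms.
rtts = [80.0, 200.0]
print(retransmission_section_ms(DEFAULT_RETRIES, rtts))  # about 880 ms
print(reorder_section_ms(rtts))                          # 60 ms
print(retries_from_latency_budget(1000, 200))            # 5 retries
```

The transmit buffer would then be set to at least the resulting receive buffer, or higher if configurable.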
Commissioning a RIST Link
It is always recommended that a new link be monitored for an initial period to validate the settings. The recommended adjustments are:
- If the receiver reports late packets, its buffers should be increased – the link latency is probably higher than expected.
- A certain number of duplicate packets is expected. However, if this number is significant, either increase the time between retries, or increase the size of the Reorder Section.
- If there are too many unrecovered packets, the number of retries should be increased if possible, with a corresponding increase in the Re-transmission Reassembly Section of the receiver.
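These adjustments can be expressed as a simple rule check over receiver statistics. The sketch below is only an illustration: the counter names are hypothetical, and the thresholds for "significant" duplicates and "too many" unrecovered packets are placeholder assumptions that each operator would need to choose.

```python
def commissioning_advice(late_packets: int, duplicates: int, received: int,
                         unrecovered: int,
                         duplicate_ratio_limit: float = 0.01,  # assumed threshold
                         unrecovered_limit: int = 0            # assumed threshold
                         ) -> list[str]:
    """Map observed receiver counters to the adjustments listed above."""
    advice = []
    if late_packets > 0:
        advice.append("increase receiver buffers")
    if received > 0 and duplicates / received > duplicate_ratio_limit:
        advice.append("increase time between retries or the Reorder Section")
    if unrecovered > unrecovered_limit:
        advice.append("increase retries and the Re-transmission Reassembly Section")
    return advice


print(commissioning_advice(late_packets=3, duplicates=500,
                           received=10000, unrecovered=2))
```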
Future RIST Work
This article describes RIST Simple Profile, whose main purpose is to recover from network packet loss. The RIST Activity Group is currently working on RIST Main Profile, which will include features such as tunneling, encryption, bandwidth optimization, and extensions for high bitrate operation. A demo of RIST Main Profile is planned for IBC 2019.
Jaiswal, S., Iannaccone, G., Diot, C., Kurose, J., and Towsley, D., “Measurement and Classification of Out-of-Sequence Packets in a Tier-1 IP Backbone”, IEEE INFOCOM 2003, San Francisco, April 2003.