Building An IP Studio: Connecting Cameras - Part 3 - Network Switching And Routing
In the last article in this series, we looked at SDP files and the important role they play in identifying source and destination devices. In this article, we look at why we combine layer-2 switching and layer-3 routing, and why broadcasters are moving towards Software Defined Networking (SDN).
To fully understand IP networks, we should take a step back and ask, “what problems do networks solve and why do we need them?” Fundamentally, the network performs two tasks: it allows connected devices to exchange data, and it stops packets of data being unnecessarily delivered to other devices.
Network topologies vary greatly, but in modern IT and broadcast applications they consist of layer-2 ethernet switches that connect devices within a close physical locality, and layer-3 routers that connect switches, or groups of switches, together to provide more dispersed physical connectivity in the form of a LAN, or greatly dispersed connectivity in the form of a WAN.
Each device, such as a computer, server, IP-camera, or IP-microphone, connects to a specific port on a layer-2 switch. Although this may seem an obvious method, one-to-one mapping of device-to-switch-port was not always the case. In the early days of ethernet, multiple devices were connected to the same physical coaxial cable. This required them to listen for traffic on the cable, and only when none was detected could they send their own data. Collisions still happened and added latency, which led to the use of CSMA/CD (Carrier-sense multiple access with collision detection). Connecting multiple devices to a single passive cable was necessary because switches were incredibly expensive and the technology was in its infancy. Luckily, this form of topology is now very rare, and we certainly wouldn’t use it in modern IT and broadcast infrastructures.
Keeping To Layer-2
This raises the question, “why not just connect devices using layer-2 frames and dispense with layer-3 IP packets altogether?” It is an interesting question, as some networks do operate in this way, such as AVB (Audio Video Bridging) with protocols including IEEE 802.1Qav. Because AVB operates at layer-2, and each device is assigned its own switch port, latency and collisions are very low, which meets many of the requirements broadcasters have of a network. The reason we don’t exclusively use AVB becomes clear when we look at larger networks.
Layer-2 switching has no knowledge of IP datagrams and doesn’t care about the IP datagram’s source and destination addressing. Instead, layer-2 uses the MAC source and destination addressing to switch frames to the appropriate connected device. Following on from this, layer-2 has a broadcast message facility that allows the sending device to set the frame’s destination MAC address to FF:FF:FF:FF:FF:FF. When this frame is sent to the connected switch, it instructs the switch, and all connected switches, to send the frame to all their ports and connected devices, thus potentially flooding the network with unnecessary frames of data. Broadcasts are mainly used during discovery, but the more devices that are connected, the greater the chance of flooding the network with broadcast traffic, leading to the possibility of congestion, frame loss, and high latency.
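The switching and flooding behaviour described above can be sketched in a few lines of Python. This is a toy model rather than how any particular switch is implemented: the `LearningSwitch` class, the port numbers, and the MAC addresses are all hypothetical names chosen for illustration.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy layer-2 switch: learns source MACs, floods broadcasts and unknowns."""

    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        src_mac, dst_mac = src_mac.lower(), dst_mac.lower()
        self.mac_table[src_mac] = in_port  # learn where the sender is attached

        if dst_mac != BROADCAST_MAC and dst_mac in self.mac_table:
            out_port = self.mac_table[dst_mac]
            # A frame is never sent back out of the port it arrived on.
            return [] if out_port == in_port else [out_port]

        # Broadcast or unknown destination: flood to every other port.
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(port_count=4)
camera = "00:11:22:33:44:01"   # hypothetical addresses
mixer = "00:11:22:33:44:02"

flooded = switch.receive(1, camera, BROADCAST_MAC)  # camera broadcasts
reply = switch.receive(2, mixer, camera)            # mixer answers the camera
unicast = switch.receive(1, camera, mixer)          # camera now sends directly
print(flooded, reply, unicast)
```

The broadcast frame leaves on every port except the one it arrived on, whereas the later unicast frames each leave on a single port. With more ports and more switches, every broadcast is multiplied in the same way, which is the flooding risk the text describes.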
Most traffic will occur within the locality of the layer-2 network. For example, in a studio, the cameras, production switcher, multi-viewer, microphones, sound console, and monitoring would be connected to a small number of layer-2 switches, all interconnected with high-speed network connections to form a mesh topology, or individually connected to an aggregate switch to form a spine-leaf type topology. This would provide its own broadcast-message domain thus keeping the broadcast traffic localized and network latency low. However, if we wanted to connect any of these devices to another studio in a different building or even country, and we did this using layer-2 technology alone, the broadcast traffic from the local network would be sent to the newly connected network, and this broadcast traffic would also infiltrate our local network. This would greatly increase the possibility of congestion caused by excessive broadcasting of ethernet frames leading to increased latency.
Layer-3 Routing
To fix this we use layer-3 IP routing. We still use layer-2 switches to connect devices within the same physical location to gain the best possible latency, but we group layer-2 networks together and interconnect them using IP routing. This means the broadcast messaging does not jump the gap between layer-2 networks as the IP router stops this from happening, thus keeping latency low.
Fig 1 – layer 2 switches provide high speed connectivity for the majority of the traffic in Los Angeles, but if New York needed a feed from camera 1, it could be routed via R1-R2-R3 to the facility in New York. Although the video would probably be compressed, the routers keep traffic localized to stop congestion and keep latency low. R1-R4-R3 provides another possible routing for resilience and distributed network loading using ECMP – see text.
Another positive side effect of IP is that the IP addresses used by the connected devices are assigned by the system administrator and can be changed as required. However, any equipment that connects using a MAC-based link layer, such as ethernet, requires a MAC address taken from a block allocated to the manufacturer by the IEEE. In other words, the MAC address is static. This provides some interesting challenges when replacing equipment, as the whole management system must be notified of the newly connected device with the new MAC address.
One consequence of using IP and ethernet together, which most IT and broadcast networks do, is that a connected device has two addresses: its dynamic IP address and its static MAC address. This may seem an unnecessary overhead. However, as we’ve seen, layer-2 switching is fast and has lower latency within a small neighborhood, while IP routing provides greater flexibility for more dispersed systems. Hence, we can use both and gain the advantages of low latency and greater flexibility.
We could easily build a network that used only ethernet frames without IP datagrams, but this would lack the flexibility of IP and restrict us to a small family of transport streams. But one of the greatest benefits of IP datagrams is that they are transport stream agnostic. They don’t care whether they are being sent over ethernet, Wi-Fi, or fiber etc.
When we switch an IP datagram from an ethernet network to a Wi-Fi network, the IP datagram encapsulated by the ethernet frame doesn’t change, but the underlying layer-2 frame does. This allows the IP datagram to move seamlessly between layer-2 ethernet and Wi-Fi.
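As a rough illustration of this layering, the sketch below wraps one datagram in two different layer-2 frames. The `IPDatagram` and `Layer2Frame` classes, the `rewrap` helper, and the header fields are simplified, hypothetical stand-ins and do not reflect the real ethernet or Wi-Fi frame layouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPDatagram:
    src_ip: str
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class Layer2Frame:
    link_type: str        # "ethernet" or "wifi" in this sketch
    header: tuple         # simplified, link-specific header fields
    datagram: IPDatagram  # the encapsulated layer-3 datagram


def rewrap(frame, link_type, header):
    """Carry the same datagram in a new layer-2 frame for a different link."""
    return Layer2Frame(link_type, header, frame.datagram)


datagram = IPDatagram("10.0.1.11", "10.0.1.20", b"media essence")
wired = Layer2Frame(
    "ethernet",
    (("dst", "00:11:22:33:44:02"), ("src", "00:11:22:33:44:01")),
    datagram,
)
wireless = rewrap(
    wired,
    "wifi",
    (("dst", "00:11:22:33:44:02"), ("src", "00:11:22:33:44:01"),
     ("access_point", "00:11:22:33:44:aa")),
)

same_datagram = wired.datagram == wireless.datagram
same_header = wired.header == wireless.header
print(same_datagram, same_header)
```

The datagram compares equal on both links while the layer-2 header differs, which is the sense in which IP is agnostic to the transport beneath it.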
MAC And IP Addressing
That said, one of the first actions of an IP-device wanting to exchange data with another device on a layer-2 network is to associate the MAC address and IP address of the destination device, as the IP datagram is encapsulated by the layer-2 ethernet frame. This is necessary as the layer-2 switches only look at the header information in the layer-2 frame and not any of the payload where the IP datagram resides. That is, they don’t look at the layer-3 IP datagram addressing information. Consequently, the sending device must know the MAC address as well as the IP address of the device it is transmitting its IP datagram to. This is achieved using ARP (Address Resolution Protocol).
When an IP-camera tries to connect to a production switcher with a known IP address for the first time, it will broadcast a layer-2 network message to all devices on the network asking, “who has this IP address?”. The camera may know the IP address of the production switcher, but it doesn’t initially know its MAC address, so it cannot address the ethernet layer-2 frame it sends to its connected layer-2 switch. ARP is the protocol running on the production switcher (and every compliant connected device) that allows the production switcher to send back its MAC address when it sees its IP address in the ARP request. The camera will then know the ethernet MAC address so it can populate the ethernet frame for future data exchange. This allows the camera to send the ethernet frame to the switch. In turn, the switch knows the MAC address of the production switcher, so it knows which port to switch the frame to, allowing the production switcher to receive the frame. Only then does the production switcher extract the IP datagram from the ethernet frame.
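The request and reply exchange can be modelled as follows. This is a minimal sketch and not a real ARP implementation: the `Device` class, its method names, and the addresses are hypothetical, and the broadcast is simulated by looping over every device on the segment.

```python
class Device:
    def __init__(self, name, ip, mac):
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache = {}        # IP address -> MAC address
        self.broadcasts_sent = 0   # how many ARP requests we had to broadcast

    def handle_arp_request(self, target_ip):
        """Reply with our MAC only if the request is for our IP address."""
        return self.mac if target_ip == self.ip else None

    def resolve(self, target_ip, segment):
        """Return the MAC for target_ip, broadcasting only on a cache miss."""
        if target_ip in self.arp_cache:
            return self.arp_cache[target_ip]

        self.broadcasts_sent += 1  # "who has this IP address?" to everyone
        for device in segment:
            if device is self:
                continue
            mac = device.handle_arp_request(target_ip)
            if mac is not None:
                self.arp_cache[target_ip] = mac
                return mac
        return None  # nobody on this layer-2 segment owns that address


camera = Device("camera-1", "10.0.1.11", "00:11:22:33:44:01")
mixer = Device("production-switcher", "10.0.1.20", "00:11:22:33:44:02")
multiviewer = Device("multiviewer", "10.0.1.30", "00:11:22:33:44:03")
segment = [camera, mixer, multiviewer]

first = camera.resolve("10.0.1.20", segment)   # broadcast, then cached
second = camera.resolve("10.0.1.20", segment)  # answered from the cache
print(first, second, camera.broadcasts_sent)
```

Only the first lookup generates a broadcast. The second is served from the camera's cache, which is why broadcast traffic is concentrated around discovery rather than steady-state streaming.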
If an IP-camera needs to stream media to an IP-production switcher on a separate network, a router is needed to allow the IP-datagrams to be sent between them. Again, we could keep this at layer-2 and essentially connect the two networks together, but this could cause congestion due to the excessive broadcast messaging. A far more efficient method of operation is to keep the two layer-2 networks separate but allow IP-datagrams to move between them, thus removing the need to exchange broadcast messages.
For example, an IP-camera layer-2 switch may be connected to router-A and the IP-production layer-2 switch may be connected to router-B. Forwarding tables in router-A and router-B will allow the IP-camera’s media stream to be sent to the IP-production switcher. Router-A and router-B will probably extract the layer-3 IP datagram and re-encapsulate it with the layer-2 frame associated with the routers’ network and the interconnect between them. This keeps the layer-2 networks logically separate but maintains the flexibility of sending only specific datagrams between them, thus keeping congestion and latency low.
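A forwarding table lookup of this kind can be sketched with Python's standard `ipaddress` module. The `Router` class, the subnets, and the next-hop labels are assumptions made for illustration. The selection rule used is longest-prefix match, the usual rule for IP forwarding.

```python
import ipaddress


class Router:
    def __init__(self, name):
        self.name = name
        self.routes = []  # list of (network, next_hop) pairs

    def add_route(self, cidr, next_hop):
        self.routes.append((ipaddress.ip_network(cidr), next_hop))

    def next_hop(self, dst_ip):
        """Pick the matching route with the longest prefix, or None."""
        address = ipaddress.ip_address(dst_ip)
        matches = [(net, hop) for net, hop in self.routes if address in net]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]


router_a = Router("router-A")
router_a.add_route("10.0.1.0/24", "local camera switch")  # hypothetical subnets
router_a.add_route("10.0.2.0/24", "router-B")
router_a.add_route("0.0.0.0/0", "default gateway")

to_production = router_a.next_hop("10.0.2.20")
to_camera = router_a.next_hop("10.0.1.11")
elsewhere = router_a.next_hop("192.0.2.5")
print(to_production, "|", to_camera, "|", elsewhere)
```

Only datagrams whose destination falls in the production subnet are handed to router-B. Layer-2 broadcast frames never reach this lookup at all, so they stay inside the network where they originated.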
Load Balancing Networks
Although routing may be statically assigned, it may also be dynamically distributed over multiple routers within a network. This typically happens over networks with inbuilt redundancy where multiple routes exist to the same destination. Techniques such as ECMP (Equal Cost Multi-Path) spread packets over these routes to distribute the load and make more efficient use of the network.
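One common way to spread traffic over equal-cost routes is to hash the fields that identify a flow and use the result to pick a next hop, so that all packets of one flow follow the same path. The sketch below assumes that approach. The `ecmp_next_hop` function, the choice of SHA-256, and the addresses and ports are illustrative assumptions and not taken from any specific router.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Choose one equal-cost next hop from a hash of the flow's fields."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


paths = ["R2", "R4"]  # the two routes out of R1 in Fig 1

# (source IP, destination IP, protocol, source port, destination port)
camera_flow = ("10.0.1.11", "10.0.9.20", "udp", 5004, 5004)

choice_a = ecmp_next_hop(camera_flow, paths)
choice_b = ecmp_next_hop(camera_flow, paths)
print(choice_a, choice_b)  # the same flow always maps to the same path

# Many different flows are spread across both paths.
usage = {hop: 0 for hop in paths}
for port in range(5000, 5100):
    flow = ("10.0.1.11", "10.0.9.20", "udp", port, 5004)
    usage[ecmp_next_hop(flow, paths)] += 1
print(usage)
```

Keeping a flow on one path avoids packets arriving out of order, but the split between paths depends on the hash and not on the bandwidth of each flow, which matters when a few very large media streams dominate the traffic.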
WANs (Wide Area Networks) develop these ideas further with protocols such as BGP (Border Gateway Protocol) where packets can be routed through other networks to reach a final destination. One example of this is the public internet.
Other network protocols such as MPLS, ICMP, and IPsec, to name but a few, are all available within the IP domain. These help connect layer-2 networks and provide security.
This system works well for traditional IT infrastructures where short bursts of data are sent and tight latency tolerances do not have to be adhered to. However, for broadcasters distributing media where latency tolerances are critical, leaving the network to its own devices through ARP, ECMP and BGP causes significant challenges.
Broadcasters have had the privilege and convenience of employing managed networks for many years through static SDI and AES connectivity. These networks have predictable latency and data throughput but suffer from a lack of scalability and flexibility. Although IP and ethernet networks solve these challenges, they introduce unpredictable latency as each router forwards the IP packet based on its source and destination addresses, and it does this every time a packet needs forwarding. In effect, the router has no memory and makes its routing decision based on the analysis of the packet when it is received.
One solution to this is to use SDN (Software Defined Networking), as it allows broadcasters to remove some of the automation within their networks and gain greater control of how layer-2 frames are switched and how layer-3 datagrams are routed. In the next article in this series, we will dig deeper into how SDNs operate and the benefits they provide for broadcasters.