Designing IP Broadcast Systems: Network Layers & Topologies

How layer-2 switching and layer-3 routing are intrinsic parts of networks, and how adding SDNs to spine-leaf and mesh networks improves flexibility and scalability.

Broadcasters have widely adopted the spine-leaf topology for their media signal networks. However, other topologies exist in IT network infrastructures, and understanding them helps explain why spine-leaf is so ubiquitous in the broadcast environment.

Networks have undergone massive transformations since their inception in the 1970s. A combination of improved technology and use-case requirements has allowed researchers and engineers to design and build a whole plethora of network topologies and designs. Although some have moved into the history books, others have stood the test of time.

Networks roughly split into two types: wired and wireless. And these further break down into subcategories and topologies. For the purposes of uncompressed media streaming in broadcast networks, wired networks form most of the systems. There may be specific wireless networks that join LANs to form WANs, but these are usually provided by specialist service providers.

Wired topologies fundamentally form seven network types: ring, star, partial mesh, fully connected mesh, line, tree, and bus. Although we often think of Ethernet as the dominant network transport layer, this is not always the case in wider generalized IT networks. It’s also important to remember that layer-2 protocols, such as Ethernet, HDLC, and FDDI, are not the same as IP, which is a layer-3 protocol. In effect, IP exists as packets of data in memory, and the protocol does not provide a defined or specified transport system of its own. So, without a layer-2 transport layer, an IP packet would never leave the host device, whether this is a camera, microphone, or production switcher. But far from being a hindrance, this is one of IP’s greatest strengths, as it can traverse many different transport layer types as it is moved between buildings, cities, or even countries.

Network Layers

Differentiating between layer-2 and layer-3 in the OSI seven-layer model can be challenging for any broadcast engineer new to IP, as the two are often mistakenly interchanged. As broadcasters, we often speak of IP, but when referring to network topologies we speak in terms of spine-leaf, which is fundamentally a layer-2 network (often Ethernet over copper or fiber). The differentiation is practical, as layer-2 is the layer that physically transports the layer-3 IP packets between devices. A layer-3 (IP) network cannot exist in isolation as it requires a layer-2 network to transport the IP packets. However, a layer-3 network doesn’t necessarily have to be IP. Other layer-3 protocols such as AppleTalk, IPX, and NetBEUI all exist, but to a much lesser extent due to the popularity and almost complete dominance of TCP/IP. And to differentiate further, layer-2 refers to data frames, and layer-3 refers to data packets.

Defining Ethernet networks relies on an understanding of broadcast messages. In Ethernet, only layer-2 frames with the same destination MAC address as the NIC (Network Interface Card) will be passed to the host’s operating system for processing. All other Ethernet frames that are sent to the NIC will be ignored. This is fundamental to the reliable operation of the host device, as one of the roles of the NIC is to pass on only the layer-2 frames meant for its host device. If this wasn’t the case, then the host would probably be overwhelmed with filtering out data frames meant for other devices.

However, there is one exception to this and that is the broadcast message. The Ethernet specification uses a source and a destination MAC address in the layer-2 header to indicate where the frame has come from and where it is destined to be sent to. Every Ethernet device is intended to have a globally unique MAC address: the IEEE allocates address blocks to manufacturers, who hard code an address into each device during manufacture. One address is reserved and that is the broadcast address FF:FF:FF:FF:FF:FF. In other words, all the bits in the MAC destination address of the frame are set to logical ‘1’. Any NIC receiving this frame must pass it on to its host’s processing system so the data can be processed. An Ethernet network, in this sense, is the set of nodes that can send this broadcast frame to, and receive it from, one another, often called a broadcast domain.
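The filtering rule described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular NIC is implemented: the function names (`parse_mac`, `build_frame`, `nic_accepts`) and the example addresses are hypothetical, and multicast addresses and promiscuous mode are deliberately ignored.

```python
# Simplified model of NIC destination filtering (multicast is ignored here).
BROADCAST_MAC = bytes([0xFF] * 6)  # FF:FF:FF:FF:FF:FF, all bits set to 1


def parse_mac(text: str) -> bytes:
    """Convert 'AA:BB:CC:DD:EE:FF' into six raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six octets, got {text!r}")
    return bytes(int(part, 16) for part in parts)


def build_frame(dst: bytes, src: bytes, payload: bytes) -> bytes:
    """Build a minimal Ethernet II frame: destination, source, EtherType, payload."""
    ethertype_ipv4 = (0x0800).to_bytes(2, "big")  # payload is an IPv4 packet
    return dst + src + ethertype_ipv4 + payload


def nic_accepts(frame: bytes, nic_mac: bytes) -> bool:
    """Return True if the frame should be passed up to the host.

    The first six bytes of an Ethernet frame hold the destination MAC.
    """
    destination = frame[:6]
    return destination == nic_mac or destination == BROADCAST_MAC


if __name__ == "__main__":
    camera = parse_mac("02:00:00:00:00:01")    # hypothetical addresses
    switcher = parse_mac("02:00:00:00:00:02")
    monitor = parse_mac("02:00:00:00:00:03")

    unicast = build_frame(switcher, camera, b"ip packet")
    broadcast = build_frame(BROADCAST_MAC, camera, b"ip packet")

    print(nic_accepts(unicast, switcher))    # addressed to the switcher
    print(nic_accepts(unicast, monitor))     # addressed to another device
    print(nic_accepts(broadcast, monitor))   # broadcast reaches every NIC
```

The frame layout also shows the layer relationship: the layer-3 IP packet is simply the payload carried inside the layer-2 frame.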

For example, studio-1 may have one layer-2 switch that connects its video infrastructure and studio-2 would have a separate layer-2 switch infrastructure. Assuming they are not connected, studio-1’s layer-2 broadcast messages would only be consumed by devices in studio-1, thus keeping unnecessary network traffic low in studio-2. If they were connected, then studio-2 would receive all of studio-1’s broadcast traffic and vice versa, and this could lead to network congestion and video dropout.

It may well be necessary to send some of studio-1’s traffic to studio-2, and to achieve this at a layer-2 level a bridge could be used. Its switching table would allow specific devices to cross to the other studio. This method works and has practical applications, but it is very difficult to manage this type of network in a highly dynamic environment such as a broadcast studio infrastructure, where thousands of latency-sensitive media streams are exchanged across the network and require unprecedented levels of data integrity. This is mainly because we don’t want to use TCP in studios due to the unpredictable and variable latency that it introduces.

Figure 1 – The spine-leaf topology provides a good compromise between reliability, flexibility, low-latency, and data integrity.


Moving media streams across asynchronous networks, such as Ethernet and IP, is challenging as the streams are time sensitive and require high levels of data integrity. Protocols such as TCP improve data integrity at the expense of increased latency. Therefore, to keep latency low, standards such as SMPTE’s ST 2110 use UDP packets, which are in effect fire-and-forget. This greatly reduces latency but requires the network design to be of an incredibly high quality, as lost packets cannot be tolerated.
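The fire-and-forget behavior of UDP can be illustrated with a minimal Python sketch using the standard `socket` module on the loopback interface. This is a generic UDP datagram, not an ST 2110 stream, and the helper names `open_receiver` and `send_datagram` are assumptions made for illustration.

```python
import socket


def open_receiver(host: str = "127.0.0.1") -> socket.socket:
    """Open a UDP socket on a free port chosen by the operating system."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))   # port 0 asks the OS for any free port
    sock.settimeout(1.0)   # do not wait forever if the datagram is lost
    return sock


def send_datagram(payload: bytes, address) -> int:
    """Send one UDP datagram and return the number of bytes handed to the OS.

    There is no handshake, acknowledgment, or retransmission: the sender
    never learns whether the datagram arrived.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        return sender.sendto(payload, address)


if __name__ == "__main__":
    receiver = open_receiver()
    send_datagram(b"media payload", receiver.getsockname())
    try:
        data, source = receiver.recvfrom(2048)
        print(f"received {len(data)} bytes from {source}")
    except socket.timeout:
        # With UDP, recovering from loss is entirely up to the application.
        print("datagram lost; nothing will resend it")
    finally:
        receiver.close()
```

Because nothing in the protocol recovers a lost datagram, the burden of reliability shifts from the transport protocol to the network design itself.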

Broadcasters tend to use the same layer-3 (IP) transport method widely found in enterprise IT infrastructures, that is, IP over Ethernet (carried on twisted pair such as CAT8, on fiber, or on a combination of the two).

Network Topologies

As previously stated, there are seven broad types of network topology. Due to the unique requirements of high-quality media signal exchange, topologies such as ring, star, line, and bus can all be discounted because of the bottlenecks within their designs. This leaves partially connected mesh, fully connected mesh, and tree.

Partially connected topologies are a subset of mesh topologies as they only allow for partial connection between the nodes. In this context, the node is a layer-2 switching device that is connected to other nodes to form the network. The host devices such as cameras, microphones, and production switchers then connect to the node devices to allow data to be sent to the rest of the devices on the network. Although partially connected and fully connected mesh topologies seem like the utopian solution for resilience, fully connected meshes suffer from connection bloat, as the number of connections between the nodes increases quadratically as the network scales and more nodes are added. For example, a fully connected mesh of 4 nodes requires 6 connections in total (3 per node), but a mesh of 6 nodes requires 15 connections (5 per node). And if we take this to the extreme and use fifty nodes, then the network will need 1,225 individual connections, with 49 on every node. Partially connected mesh topologies are a solution to this as they reduce the number of node interconnections but do so at the expense of resilience.
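The link counts for a fully connected mesh follow from the fact that every pair of nodes needs one link, giving n(n-1)/2 links in total and n-1 links on each node. A minimal Python sketch of that arithmetic follows; the function names are illustrative only.

```python
def mesh_total_links(nodes: int) -> int:
    """Total links in a fully connected mesh: one per pair of nodes."""
    if nodes < 1:
        raise ValueError("a mesh needs at least one node")
    return nodes * (nodes - 1) // 2


def mesh_links_per_node(nodes: int) -> int:
    """Each node links directly to every other node."""
    if nodes < 1:
        raise ValueError("a mesh needs at least one node")
    return nodes - 1


if __name__ == "__main__":
    for n in (4, 6, 50):
        print(f"{n} nodes: {mesh_total_links(n)} links in total, "
              f"{mesh_links_per_node(n)} per node")
```

For 4, 6, and 50 nodes this gives 6, 15, and 1,225 links in total, which is the quadratic growth that makes large fully connected meshes impractical to cable and manage.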

Figure 2 – The fully connected mesh network when combined with a software controller to provide the Software Defined Network provides even greater flexibility and scalability due to its distributed nature.


To date, the use of fully or partially connected mesh topologies in studio infrastructures hasn’t really taken off using COTS equipment, probably due to the network complexity that results from scaling. However, some vendors have adopted the partially connected topology using time division multiplexed layer-2 Ethernet, which is proving very effective for broadcast studio infrastructures.

Tree topologies can also be incredibly complex due to the potential for a massive number of branches. However, a simplified variation on this is becoming widespread in broadcast studio infrastructures and is called spine-leaf. This is essentially a one-level tree topology with the spine acting as the root of the tree with a single level of branches to the nodes (leaves). A variation on this allows for two spines, with each leaf connected to both, so that dual resilience is provided.

Spine-leaf topologies are much easier to design than mesh topologies as the data paths are much better defined, which leads to easier signal routing for broadcasters. Using the two-spine model, resilience can be built into the network to reduce the risk of single points of failure. And importantly, the physical frame switching layer can be abstracted from the control layer, leading to a greater potential for software defined networking (SDN), which will greatly improve the efficiency and flexibility of the network.
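The well-defined paths and the two-spine resilience can be shown with a small Python model. It is a sketch under simple assumptions: every leaf connects to every spine, the names `spine-1`, `leaf-1`, and so on are hypothetical, and link capacity is not modeled.

```python
from itertools import product


def build_spine_leaf(spines: int, leaves: int) -> set:
    """Return the set of (spine, leaf) links, with every leaf wired to every spine."""
    return {
        (f"spine-{s}", f"leaf-{l}")
        for s, l in product(range(1, spines + 1), range(1, leaves + 1))
    }


def spines_between(links: set, src_leaf: str, dst_leaf: str) -> list:
    """List the spines offering a two-hop path between two leaves."""
    via_src = {spine for spine, leaf in links if leaf == src_leaf}
    via_dst = {spine for spine, leaf in links if leaf == dst_leaf}
    return sorted(via_src & via_dst)


def fail_spine(links: set, spine: str) -> set:
    """Return the links that remain after one spine is lost."""
    return {(s, l) for s, l in links if s != spine}


if __name__ == "__main__":
    fabric = build_spine_leaf(spines=2, leaves=50)
    print(len(fabric))                                    # links in the fabric
    print(spines_between(fabric, "leaf-1", "leaf-50"))    # both spines usable
    degraded = fail_spine(fabric, "spine-1")
    print(spines_between(degraded, "leaf-1", "leaf-50"))  # one path survives
```

With two spines and fifty leaves the fabric needs 100 links, and every leaf-to-leaf path is exactly leaf, spine, leaf, so losing one spine still leaves a path between any pair of leaves.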

Spine-leaf topologies currently tick many boxes for broadcasters, but higher levels of efficiency are often gained through distributed hardware, and mesh networks have the potential to achieve this. SDNs are relatively new to IT but are currently finding many applications in networks. And as technology and our understanding of network routing improve, so will the efficiency and flexibility with which we can operate media flows across spine-leaf and mesh networks, especially when combined with SDNs.

