Scalable Dynamic Software For Broadcasters: Part 7 - Connecting Container And Microservice Apps

Messaging is an incredibly important concept in microservices and forms the backbone of their communication. But keeping systems coherent and resilient requires an understanding of how microservices communicate and why.

There are two fundamental types of messaging in a microservice ecosystem: public facing and private. Public-facing messages are needed by users to execute a process, such as an ingest operation. Private messages are those the microservices use to communicate with each other without being exposed to the outside world.

As users will generally access the operational side of the microservice through a web browser or over the internet, the public messages must be internet compliant. Generally, this means HTTPS over TCP/IP. The TCP layer provides a logical connection with reliable delivery, and HTTP provides the transfer protocol that allows web browsers and servers to exchange data.

Messaging Resilience

Messaging becomes particularly interesting when we consider what happens when things go wrong. A message may be lost in transit or corrupted before it is delivered, a node may fail, or an instance might crash or be overwhelmed with requests, rendering it unable to process jobs in the message queue.

There may be tens, hundreds, or even thousands of instances of microservices running in a particular ecosystem. All these need to exchange messages to maintain the workflow and provide the relevant updates.

Failures fall into two camps: transient and complete. A transient failure may only exist for a short period of time as the system effectively repairs itself. This could be caused by a network switch failing, resulting in the IP packets taking another route, or by a node becoming temporarily overloaded but recovering quickly. Complete failures exist for much longer periods of time and can be caused by situations such as a server failing or the power system not having sufficient failsafe systems.

Recovery

If a recovery strategy is not adopted, the messages will either cause congestion in the network and servers or render the system unstable. Consequently, the transient and complete failure types require different message recovery strategies, and these fall into two different types: retry and circuit breaker.

The “retry” method deals best with suspected transient failures as the sender will keep transmitting messages to the target microservice assuming the failure will recover quickly. If it does, then the previous messages will be discarded or serve no practical use and the system will continue to function. However, this is a big assumption as continually sending the same message to a microservice assumes it is idempotent, that is, sending multiple messages will not have a detrimental effect on the microservice.
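The retry behavior described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the names `TransientError`, `send_with_retry` and `make_flaky_sender` are hypothetical, and the exponential back-off between attempts is an assumption that is not part of the description above. It is also only safe if the receiving microservice is idempotent.

```python
import time


class TransientError(Exception):
    """Hypothetical error standing in for a lost or unacknowledged message."""


def send_with_retry(send, message, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call send(message) until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message)
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Assumed policy: wait twice as long after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_sender(failures):
    """Build a fake sender that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def send(message):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("no acknowledgement")
        return f"ack:{message}"

    return send, state
```

With a sender that fails twice, `send_with_retry` makes three calls in total and returns the acknowledgement from the third.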

Figure 1 – If microservice ‘B’ is not idempotent, resending a failed message will result in the microservice executing the instruction twice. This will have potentially disastrous side effects for the system making it unstable.


Maintaining idempotent software is incredibly important, otherwise the system can become unstable. For example, if microservice ‘A’ sends a message to microservice ‘B’ requesting an ‘id’ to check whether an operation has been executed, the request is considered idempotent because it does not change the state of microservice ‘B’ in any way. So, if multiple messages were sent, there would be no unintended side effects for microservice ‘B’. However, if microservice ‘A’ sent a message instructing microservice ‘B’ to read the next file in a directory, the operation would certainly not be idempotent, as every message received moves microservice ‘B’ on to another file. If the same message was sent ten times by microservice ‘A’, and some of the messages were lost, then the state of microservice ‘B’ would be unknown; it would be indeterminate.
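One common way to make a "read the next file" style operation safe to resend is for the sender to attach a unique id to each message and for the receiver to remember which ids it has already handled. The sketch below assumes that approach; the `FileReader` class and the message ids are hypothetical and are not taken from any particular product.

```python
class FileReader:
    """Hypothetical microservice 'B' that steps through a list of files."""

    def __init__(self, files):
        self.files = list(files)
        self.position = 0
        self.results = {}  # message id -> file name already returned for it

    def read_next(self, message_id):
        # A repeated id returns the stored result instead of advancing again,
        # so a resent message has no additional side effect.
        if message_id in self.results:
            return self.results[message_id]
        name = self.files[self.position]
        self.position += 1
        self.results[message_id] = name
        return name
```

Sending the message with id `"m1"` twice returns the same file both times and advances the reader only once, so the state of microservice ‘B’ remains known even when messages are repeated.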

If too many failed messages are transmitted by the sender, a bottleneck can easily occur leading to congestion in the network and the destination microservices as the messages overflow in the queue buffer. The circuit breaker will detect the failure of the messages after reaching a predetermined threshold and stop the sender transmitting. The error will then be raised to the orchestrator software where other action can be taken such as re-routing or raising of an alarm.
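A circuit breaker of this kind can be sketched in a few lines. The class below is a simplified assumption of how one might look: `CircuitBreaker`, `CircuitOpenError` and the `on_open` callback (standing in for the report to the orchestrator) are hypothetical names, and the timed reset that production circuit breakers usually include is left out.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker has tripped and refuses to send."""


class CircuitBreaker:
    def __init__(self, send, failure_threshold=3, on_open=None):
        self.send = send
        self.failure_threshold = failure_threshold
        self.on_open = on_open  # hypothetical hook, e.g. notify the orchestrator
        self.failures = 0
        self.open = False

    def call(self, message):
        if self.open:
            # Stop transmitting so the network and the target queue can drain.
            raise CircuitOpenError("circuit open; message not sent")
        try:
            result = self.send(message)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.open = True
                if self.on_open:
                    self.on_open(message)
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result
```

Once the threshold of consecutive failures is reached, further calls are rejected locally without touching the network, and the callback gives the orchestrator a chance to re-route or raise an alarm.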

Synchronous And Asynchronous Messaging

When discussing synchronous and asynchronous messaging, it’s important to note that these terms refer to the protocol, not the underlying I/O. An operating system can provide either synchronous or asynchronous I/O access, also referred to as blocking and non-blocking. With blocking I/O access, the calling software has to wait for the operation to complete before it can continue, but with non-blocking access, the software execution is not held up. Synchronous and asynchronous messaging follow a similar idea, but they work at the protocol level, not the I/O level. It is entirely possible for a synchronous message to operate over asynchronous I/O.

Synchronous and asynchronous messaging systems each have their pros and cons, but the fundamental difference is whether the sender waits for a response from the receiver (synchronous) or not (asynchronous). That is, with asynchronous messaging the sending thread is not blocked, which potentially results in improved latency. For example, if microservice ‘A’ calls microservice ‘B’, and this in turn calls microservice ‘C’, then with a synchronous system microservice ‘A’ could find itself blocked until microservice ‘C’ completes its action and reports back to microservice ‘B’, which in turn reports back to microservice ‘A’.
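The difference in ordering can be illustrated with a small sketch. All function names here are hypothetical, and a simple in-memory queue stands in for the network; in a real system the receiver would run as a separate process rather than being drained by an explicit call.

```python
from collections import deque


def service_c(job, log):
    log.append(f"C finished {job}")
    return "done"


def service_b(job, log):
    result = service_c(job, log)  # B waits for C
    log.append(f"B finished {job}")
    return result


def send_synchronous(job, log):
    """Microservice 'A' calls 'B' directly and waits for the whole chain."""
    result = service_b(job, log)  # A is blocked here until B and C return
    log.append(f"A continues after {job}")
    return result


def send_asynchronous(job, outbox, log):
    """Microservice 'A' queues the message and carries on without a reply."""
    outbox.append(job)
    log.append(f"A continues after {job}")


def deliver_pending(outbox, log):
    """Stand-in for the receiver picking up queued messages later."""
    while outbox:
        service_b(outbox.popleft(), log)
```

In the synchronous case the log shows ‘C’ and ‘B’ finishing before ‘A’ continues. In the asynchronous case ‘A’ continues first, and ‘B’ and ‘C’ only appear in the log once `deliver_pending` runs.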

An asynchronous messaging system can also be thought of as a fire-and-forget communication system.

Figure 2 – In the top diagram Microservice ‘A’ sends a synchronous message to microservice ‘B’ but it must wait until ‘B’ has responded before it can continue. In the bottom diagram, microservice ‘A’ sends messages to the message broker using a fire-and-forget methodology, so microservice ‘A’ does not have to wait for any responses. Furthermore, the messaging broker allows multiple microservices to subscribe to, and receive the messages from microservice ‘A’.


As asynchronous events do not need to wait for a response from the receiver, they lend themselves well to broadcasting or streaming messages to multiple receivers, often referred to as the pub/sub (publisher-subscriber) model. Conceptually, this is similar to IP multicast, where routers subscribe to a stream to receive the data for multiple destination devices. Within microservice architectures, a separate messaging bus is often employed to deliver this type of event-driven communication. For example, five microservices may subscribe to a sending microservice through the messaging bus. As the sender does not expect a response from any of the receivers, it will send (publish) the message and every service subscribing to it will receive the message.
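The five-subscriber example can be sketched with a very small in-memory messaging bus. The `MessageBus` class and the topic name are hypothetical; a production bus would deliver over the network and would not call the subscribers inside the publisher's own call as this simplified version does.

```python
from collections import defaultdict


class MessageBus:
    """Hypothetical in-memory bus: topic name -> list of subscriber callbacks."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher gets nothing back; it neither knows nor cares
        # how many services are listening.
        for handler in self.subscribers[topic]:
            handler(message)


bus = MessageBus()
inboxes = [[] for _ in range(5)]  # one inbox per subscribing microservice
for inbox in inboxes:
    bus.subscribe("ingest-complete", inbox.append)

bus.publish("ingest-complete", "clip-001")
```

After the single `publish` call, each of the five inboxes contains `"clip-001"`, and the publisher has not waited on a reply from any of them.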

Message Brokers

To facilitate message delivery, microservice architectures often employ a messaging broker. This is effectively an intelligent queue management system that buffers the messages in memory and makes sure they are delivered to the microservices that have subscribed to the publishing microservice.
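The buffering role of a broker can be illustrated with one queue per subscriber, so that a slow or temporarily unavailable microservice does not lose messages or hold up the others. This is a sketch under stated assumptions: `BufferingBroker`, its methods and the subscriber names are hypothetical, and real brokers add persistence, acknowledgements and limits on queue size.

```python
from collections import deque


class BufferingBroker:
    """Hypothetical broker that holds messages in memory until they are polled."""

    def __init__(self):
        # topic -> {subscriber name -> queue of messages not yet collected}
        self.queues = {}

    def subscribe(self, topic, subscriber):
        self.queues.setdefault(topic, {})[subscriber] = deque()

    def publish(self, topic, message):
        # Every subscriber gets its own copy, buffered independently.
        for pending in self.queues.get(topic, {}).values():
            pending.append(message)

    def poll(self, topic, subscriber):
        """Return the oldest pending message, or None if the queue is empty."""
        pending = self.queues[topic][subscriber]
        return pending.popleft() if pending else None
```

If a transcoder and a QC service both subscribe to the same topic, each receives every published message in order, and one service falling behind has no effect on what the other sees.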

The messaging broker also handles all the management of the subscriptions and facilitates their delivery. A microservice architecture may employ thousands of publishers and subscribers making the whole asynchronous messaging system incredibly complicated.

Without a messaging broker, the microservices themselves would have to provide the necessary management and protocols to facilitate asynchronous delivery. This is a complex and arduous task and so is often left to the expert service providers that deliver this type of service thus leaving the broadcast vendor to focus on building their application.

Messaging within microservice architectures not only provides a method of communication for different microservice apps but also delivers resilience and scaling to greatly improve system reliability and flexibility.
