Building Software Defined Infrastructure: Monitoring Microservices

Breaking production systems into individual microservice-based processors requires monitoring over IP via RESTful APIs, along with a database system to capture the results.
Microservices, by their very nature, are a collection of smaller programs that each perform a specific task and then combine to deliver a complete solution. This methodology contrasts with the monolithic design philosophy and makes applications more flexible and easier to maintain. However, we need to understand how the applications interact, and why, if we are to build highly reliable broadcast systems.
Providing a simple eight-input playout switcher requires three basic microservice applications: the video switcher, the audio switcher, and the input controller. Each of these needs to collaborate at the control and signal processing level. Consequently, each requires a different type of monitoring.
The video and audio input switcher microservice applications may well have media signal monitoring built into their front ends so that the quality of the video and audio stream can be assessed without adding an extra monitoring probe. How the control system operates also needs to be understood. For example, does the Human Control Interface (HCI) operate the video and audio switcher apps directly? Or does it control the video switch, which then sends messages to the audio switch microservice?
It’s worth remembering that in a pure off-prem public cloud infrastructure there are no SDI, AES, GPIO or RS422 systems. Instead, the only access method we have is IP. The underlying transport may well be Ethernet, but the only protocol available to us is IP, so all our interaction with microservice systems must be through IP connectivity.
RESTful APIs
To maintain maximum flexibility, the control and monitoring system may employ a web browser, in which case the microservice apps will need to comply with RESTful software architecture design. REST (Representational State Transfer) provides a methodology that creates stateless and reliable applications. A stateless communications protocol means the receiver (our microservice) does not need to retain any session state information from previous requests. In essence, the microservice does not retain any information about the browser-type client, and it is the browser's job to do this. For example, a web service that is streaming to a browser requires the browser to keep a record of which video frames or chunks are to be sent next. Adopting this method allows many browsers to be connected to the same streaming application and makes scalability possible.
Achieving RESTful communication requires the microservice to expose API calls via its IP interface and, to be truly RESTful compliant, to have the API requests managed through HTTP over TCP/IP. Consequently, the APIs adopted by microservices are incredibly important to broadcast engineers managing a software defined infrastructure.
This all sounds well and good, as RESTful APIs are a proven technology that delivers reliable system control and monitoring. However, as broadcast engineers, we are used to working with dedicated and isolated systems such as SDI and AES, which are circuit switched networks dedicated to specific signals. As we move into the world of scalable IT systems, API becomes very much plural. The number of devices that can connect to a microservice API is not limited to one, and more often than not multiple connections are in use.
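The chunk example can be illustrated with a minimal sketch. It is not taken from any particular streaming product: the `CHUNKS` list, `get_chunk` handler, and `play_all` client loop are hypothetical names chosen for illustration. The point is that the service keeps no per-client position, and the client carries the index of the next chunk in each request.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical media item, already split into chunks held by the service.
CHUNKS: List[bytes] = [b"chunk-0", b"chunk-1", b"chunk-2"]


@dataclass
class ChunkResponse:
    data: bytes
    next_index: Optional[int]  # None once the last chunk has been served


def get_chunk(index: int) -> ChunkResponse:
    """Stateless handler: everything needed to answer is in the request."""
    if not 0 <= index < len(CHUNKS):
        raise IndexError(f"no chunk {index}")
    following = index + 1 if index + 1 < len(CHUNKS) else None
    return ChunkResponse(CHUNKS[index], following)


def play_all() -> List[bytes]:
    """Client side: the 'browser' remembers its own position in the stream."""
    received: List[bytes] = []
    index: Optional[int] = 0
    while index is not None:
        response = get_chunk(index)
        received.append(response.data)
        index = response.next_index
    return received
```

Because `get_chunk` stores nothing between calls, any number of clients can call it in any order, which is the property that makes scaling out straightforward.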
API Multi Client Access
Microservice applications rarely work in isolation, and the reliability of the API is one of the first ports of call for monitoring. If too many devices try to connect to the API simultaneously, it can freeze: requests either fail outright or suffer high levels of latency. Equally, a frozen interface, where sending a command either does nothing or takes too long to complete, could be indicative of a more serious underlying bug. Other types of API error occur when the microservice loses contact with its database, or when the database it is connected to has too many connections and fails. These are just a few examples of the types of issue broadcasters must now be aware of in highly flexible and scalable asynchronous systems.
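One way to watch for the failed or slow responses described above is a small probe that times each API call. The sketch below is an assumption-laden illustration rather than a reference design: `probe` and the `slow_after_s` threshold are hypothetical, and the API call itself is passed in as a plain callable so the probe is independent of any specific HTTP client.

```python
import time
from typing import Callable


def probe(call: Callable[[], object], slow_after_s: float = 0.5) -> dict:
    """Time one API call and classify it as ok, slow, or failed."""
    start = time.monotonic()
    try:
        call()
    except Exception as exc:  # any error counts as a failed response
        return {
            "status": "failed",
            "latency_s": time.monotonic() - start,
            "error": repr(exc),
        }
    latency = time.monotonic() - start
    status = "slow" if latency > slow_after_s else "ok"
    return {"status": status, "latency_s": latency, "error": None}
```

In practice the callable would wrap a real request to the microservice, and the result would be logged so that trends in latency can be seen before the interface freezes completely.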

Diagram 1 – The microservices on the left can store monitoring data directly to a centralized or localized database. This allows monitoring systems in the MCR and studio to extract, display and interrogate the monitoring data without having to access the microservices via their APIs, thus reducing the influence that monitoring and observation have on the microservices themselves.
Knowing the number of active connections to a microservice's API may also help with security issues. Although microservice systems are highly dynamic, they must also be secure, and if a user can randomly connect to an API then all kinds of problems can occur. Apart from the obvious issue of an unauthorized user connecting to an API and taking control of the process, there are other more subtle challenges, such as being able to route the video and audio to a storage device to enable illegal copying of material. As well as monitoring the status of the API, the broadcaster must also keep a record of the connected devices and be able to validate them all.
In the context of our eight-input playout switcher, the broadcast engineer must regularly extract the statistics from the HCI, video and audio switcher microservices. This can be achieved through the API directly or, with more advanced systems, from the microservice's database. As the microservices are stateless, they do not keep any information about the sender’s request between requests. This is an important distinction: the microservice does keep localized information about the sender’s request while it is being processed, so that it can respond correctly. This may include the GUID (Globally Unique Identifier), for example, but once the request has been serviced the GUID is forgotten. This is the primary need for a database.
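Validating connected devices can be as simple as comparing the set of currently connected clients against a list of authorized ones. The following is a minimal sketch; the `audit_connections` function and the device identifiers are hypothetical, and how the list of connected clients is obtained will depend on the microservice in question.

```python
from typing import Dict, List, Set, Union


def audit_connections(
    connected: Set[str], authorized: Set[str]
) -> Dict[str, Union[int, List[str]]]:
    """Report how many clients are connected and which are not authorized."""
    unknown = connected - authorized
    return {"active": len(connected), "unauthorized": sorted(unknown)}


# Hypothetical device identifiers, for illustration only.
report = audit_connections(
    connected={"hci-01", "mcr-monitor", "unknown-laptop"},
    authorized={"hci-01", "mcr-monitor", "studio-monitor"},
)
```

Here `report["unauthorized"]` would contain `"unknown-laptop"`, which is the kind of entry that should raise an alarm.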
Offloading To Databases
When the microservice is responding to a sender’s request (such as a control message from an HCI or monitoring system), it may well write updates to a database, which may be centralized or local to the microservice. This system topology complies with RESTful architectures because the API is not maintaining state itself but is using the database to record information, such as the number of API requests per second, a list of authorized devices, or the video and audio measurements.
If the microservice can write to a database directly, it solves many of the issues with the observer influencing the operation of the microservice, because the monitoring system doesn’t need to communicate with the microservice to read back its monitoring statistics. Instead, the monitoring system can either connect to the database and access the information itself, or it can connect through a monitoring microservice API. Connecting a user directly to the database is not a great idea due to the security implications. Instead, a monitoring microservice can be employed to act as an intermediary between the database and the monitoring system. Not only does this maintain higher levels of security, but it also stops the database being overwhelmed with monitoring requests (because the monitoring microservice can block a user requesting too much data).
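As a rough illustration of a microservice recording its statistics in a database rather than holding them in memory, the sketch below uses Python's built-in `sqlite3` module. The table layout, the `record` and `latest` helpers, and the metric names are all assumptions made for this example; a production system would more likely use a networked or time-series database.

```python
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the metrics store and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics "
        "(ts REAL, service TEXT, name TEXT, value REAL)"
    )
    return conn


def record(conn: sqlite3.Connection, service: str, name: str,
           value: float, ts: Optional[float] = None) -> None:
    """Called by the microservice after handling a request."""
    stamp = time.time() if ts is None else ts
    conn.execute("INSERT INTO metrics VALUES (?, ?, ?, ?)",
                 (stamp, service, name, value))
    conn.commit()


def latest(conn: sqlite3.Connection, service: str, name: str) -> Optional[float]:
    """Called by a monitoring system; the microservice itself is not touched."""
    row = conn.execute(
        "SELECT value FROM metrics WHERE service = ? AND name = ? "
        "ORDER BY ts DESC LIMIT 1",
        (service, name),
    ).fetchone()
    return None if row is None else row[0]
```

The microservice only ever calls `record`, while the monitoring side only ever calls `latest`, so reading the statistics places no load on the microservice's API.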
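Blocking a user who requests too much data could be done with a simple sliding-window limiter inside the monitoring microservice. The sketch below is one possible approach, not a description of any specific product; `RequestLimiter`, its parameters, and the client identifiers are hypothetical. The current time is passed in explicitly so the behaviour is easy to test.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RequestLimiter:
    """Allow at most max_requests per client within a sliding window."""

    def __init__(self, max_requests: int, window_s: float) -> None:
        self.max_requests = max_requests
        self.window_s = window_s
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Return True if the request may go on to query the database."""
        recent = self._history[client_id]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A monitoring microservice would call `allow` with the client's identity and a monotonic timestamp before each database query, rejecting the request when it returns `False`.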
Local & Centralized Monitoring Storage
Employing a database for the microservice monitoring data also helps with logging for later forensic analysis, especially if a service goes off-air and a microservice has crashed. The amount of data that can be logged becomes a function of the network and database capacity as opposed to just the capability of each microservice. But again, great care must be taken when establishing how much monitoring data to record. It may be the case that all microservices update the database once a second or so with their monitored data, but if an area of the infrastructure is misbehaving, then more of a focus can be established around that area and extra monitoring probes in the guise of specialized microservices can be added, thus creating an optimized and scalable monitoring system.
With all the monitoring data in the database, which may be a single global device or multiple localized devices, monitoring applications such as Grafana can be used to display the information on strategic monitors for the broadcast engineers. Specialized monitoring and alarm software can also be used to access the database and generate alarms if parameters fall outside their defined limits.
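The alarm logic itself can be very small. As a hedged sketch, assuming readings have already been fetched from the database as name and value pairs, the hypothetical `check_limits` function below compares each reading against a lower and upper limit. The parameter names and limit values are invented for illustration.

```python
from typing import Dict, List, Tuple


def check_limits(
    readings: Dict[str, float],
    limits: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return an alarm message for every reading outside its (low, high) limits."""
    alarms: List[str] = []
    for name, value in readings.items():
        if name not in limits:
            continue  # no limits defined for this parameter
        low, high = limits[name]
        if not low <= value <= high:
            alarms.append(f"{name}={value} outside [{low}, {high}]")
    return alarms


# Illustrative values only.
alarms = check_limits(
    readings={"audio_level_dbfs": -3.0, "api_latency_ms": 40.0},
    limits={"audio_level_dbfs": (-30.0, -6.0), "api_latency_ms": (0.0, 100.0)},
)
```

With these example values only the audio level falls outside its limits, so a single alarm message is produced.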
Monitoring microservices is more detailed than the traditional video and audio monitoring adopted by broadcast engineers. In addition to video and audio signal monitoring, we must dig deep into the architecture to understand the interaction of RESTful messaging.