Scalable Dynamic Software For Broadcasters: Part 6 - Containers And Storage

Monolithic software established a model in which database storage provides persistence and maintains data integrity, but that same model restricts scalability. Microservices offer a new method of maintaining data integrity while also facilitating scalability.

Data Consistency

In an ideal world, one application accessing one database would meet all the needs of the broadcaster. But it doesn’t take long to realize this is an unrealistic expectation. Even the simplest broadcast facility with a single playout workflow would require a massive database holding an abundance of data, including the media asset library, playout lists, and advertiser metadata.

Other business units would soon demand access to the central database. Sales departments, for example, would want to know which Ads were played and when, resulting in two applications attempting to access the same data. Although the single-database approach may work in theory, the reality is quite different, as the structure of the database places limits on how each application can be developed. If the playout service were being upgraded and a change to a database record were required, then any application accessing the same record would also need to change.

Making a database change for one application is difficult, but with two or more it becomes a real challenge, as multiple vendors must collaborate and agree on roll-out schedules. Consequently, the playout and sales applications will often each create their own database. Two databases holding similar, if not identical, information is a recipe for disaster, as we suddenly have two versions of the truth. When they disagree, we have data inconsistency, and this results in serious errors within the broadcast facility.

One solution is to maintain a single database and have the principal application provide access to the data through an API for all other applications. In essence this works; however, it is difficult to scale, as the principal application would need to be installed on ever faster servers to meet the changing demands of the service, and there is a limit to how fast a server can operate.
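The single-database-behind-an-API idea can be illustrated with a minimal sketch. All names here (`PlayoutService`, `record_play`, `plays_for`) are hypothetical, chosen only to show how consumers such as a sales application would query through the API rather than touching the database directly:

```python
class PlayoutService:
    """Principal application: owns the database and mediates all access."""

    def __init__(self):
        # Stand-in for the real database: playout history keyed by Ad ID.
        self._db = {}

    # --- API exposed to every other application ---
    def record_play(self, ad_id, timestamp):
        """Called by the playout workflow when an Ad goes to air."""
        self._db.setdefault(ad_id, []).append(timestamp)

    def plays_for(self, ad_id):
        """Called by consumers (e.g. sales); the internal schema stays hidden."""
        return list(self._db.get(ad_id, []))


playout = PlayoutService()
playout.record_play("AD-001", "2024-05-01T20:15:00Z")
print(playout.plays_for("AD-001"))
```

Because every application goes through the same calls, the database structure can change without breaking consumers, though, as noted above, the principal application itself becomes the scaling bottleneck.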

Distributed Processing

To overcome the limitations of the monolithic solution just described, distributed processing was developed, leading to parallel processing of data. Although this better facilitates scaling, the applications are still accessing the same database. The resulting contention degrades database access, causing latency and potentially lost or corrupted data.

In adopting faster servers with common APIs, all we have done is kick the proverbial bottleneck down the road, from the application servers to the database.

Microservices deliver the distributed processing needed for scalability, but they still face challenges when accessing central databases. For example, if a group of microservices were operating in California but the database resided in Paris, there would be clear latency issues due to the physical distance between them. Although it’s somewhat romantic to think of the public cloud as a massive blob of limitless scalable resource, it really does physically exist, and one eye must always be kept on this inconvenient fact.

Another solution, and probably the most reliable and efficient, is for each microservice to have its own database. More fundamentally, though, we should ask why a microservice needs access to a database at all. In isolation, a microservice can operate without one: if the information needed to complete a task does not have to persist, the service operates statelessly and a database is an unnecessary overhead. In real-world applications, however, the microservice will be working as part of a complex workflow in which multiple microservices providing different functions must collaborate.
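The stateless case can be sketched in a few lines. The function below is purely illustrative (a hypothetical gain-adjustment task, not a real audio API): everything it needs arrives in the request, and everything it produces leaves in the response, so no database is involved.

```python
def apply_gain(samples, gain):
    """Stateless microservice task: the output depends only on the inputs.

    Nothing is remembered between calls, so there is no state to persist
    and no database is required.
    """
    return [s * gain for s in samples]


# Each invocation is self-contained; calling twice with the same inputs
# always yields the same result.
print(apply_gain([1, 2], 2))  # → [2, 4]
```

The need for persistence only appears once such services must collaborate, because then some record of what has been done, and by whom, has to outlive any single call.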

Workflows With Persistence

It is this collaboration that creates the need for data persistence and therefore for database applications. The database exists independently of the microservices but holds important information which must be maintained. For example, during ingest a media file will be moved to a storage device, and information such as the video rate, color space, frame size, and audio sample rate will need to be recorded. The transcoding microservice would then process the file to convert it into the house mezzanine format, creating another media file along with associated metadata that differs from the original’s.

A broadcaster will probably keep the original file, along with its associated metadata, so that there is a reference should anything go wrong with the mezzanine format, or if another format needs to be derived; transcoding from the original will always provide the highest quality. Each process in the workflow adds more metadata. For example, if the media file is a movie that needs to be edited to create Ad-break junctions, then a timecode list will be required to allow the playout system to insert the Ads, adding another layer of metadata.

Any process that makes use of a microservice has the potential to create more metadata for the media file and this must be stored for as long as that media file exists.
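The accumulation of metadata described above can be sketched as a record that each workflow step extends. The field names here are illustrative, not a real schema:

```python
# Hypothetical asset record; every workflow step adds to it, and earlier
# entries (including the original) are never overwritten.
asset = {"id": "MOVIE-042", "versions": []}


def add_version(asset, fmt, metadata):
    """Append a new version with its own metadata, leaving prior versions intact."""
    asset["versions"].append({"format": fmt, "metadata": metadata})


# Ingest records the original file's technical metadata.
add_version(asset, "original",
            {"frame_size": "3840x2160", "color_space": "BT.2020"})

# Transcoding creates the house mezzanine version with different metadata.
add_version(asset, "mezzanine",
            {"frame_size": "1920x1080", "color_space": "BT.709"})

# Editing adds the Ad-break timecode list as yet another layer.
asset["ad_breaks"] = ["00:12:30:00", "00:27:45:00"]

print(len(asset["versions"]))  # → 2
```

All of this must persist for the lifetime of the media file, which is exactly the job the microservice’s local database takes on.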

Figure 1 – Access to the microservices’ local database through the API makes systems much more reliable as upgrades can be performed to individual microservices without impacting the rest of the system.


Polyglot Persistence

There is another potential caveat: not all database technologies are the same. A database might be hierarchical, NoSQL, or relational, to name but a few, and these different technologies serve different applications. A relational database may be needed for the orchestration layer to keep track of how workflows and their associated media assets are progressing, while a NoSQL database may be chosen to maintain simplicity and improve execution speed.

Because microservices can have their own associated database, developers can choose the technology that best fits the application. Furthermore, the databases can be distributed so they physically reside near the servers hosting the associated microservice, keeping latency low while maintaining the highest possible data integrity.

Polyglot programming is a term that expresses the idea that not all applications benefit from the same language. The workflow aspect of a web server application may be written in Node.js to deliver a product to market quickly, but a high-speed video or audio processing function may be written in C or C++ to build the fastest application possible. The same principle applies to databases: by using a polyglot approach, developers can choose the best database technology that delivers for their application.
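As a minimal sketch of polyglot persistence, the snippet below pairs a relational store (SQLite in memory, standing in for the orchestration layer’s database) with a simple key-value store (a plain dict, standing in for a NoSQL cache). The table, column, and key names are all illustrative:

```python
import sqlite3

# Relational store: the orchestrator needs queryable structure to track
# how workflows are progressing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflows (asset_id TEXT, step TEXT, status TEXT)")
conn.execute("INSERT INTO workflows VALUES ('MOVIE-042', 'transcode', 'done')")

# Key-value store: a latency-sensitive microservice just needs fast lookups,
# so a simple NoSQL-style cache fits better than a relational schema.
kv_cache = {"MOVIE-042:mezzanine_path": "/store/movie-042.mxf"}

row = conn.execute(
    "SELECT status FROM workflows WHERE asset_id = 'MOVIE-042'"
).fetchone()
print(row[0], kv_cache["MOVIE-042:mezzanine_path"])
```

Each store is owned by the service whose access pattern it suits, rather than forcing every service onto one technology.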

Accessing Data

Microservices benefit greatly from APIs as they provide a unified method of communicating with the service while maintaining backwards compatibility. The same is true for database access.

Applications that need access to the microservice’s data will achieve the greatest reliability by using the API. This way, the underlying database structure can be changed and updated as required without any impact on the rest of the system. New API versions can be added to provide extra data as it becomes available through the natural development cycle, while the existing APIs are maintained without change. Any dependent services or functions will therefore continue to work reliably even after a major change to the database or microservice.
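This backwards-compatibility discipline can be sketched as two API versions backed by one evolving record. The names (`get_asset_v1`, `get_asset_v2`, the `hdr` field) are hypothetical; the point is that the v1 response shape stays frozen even after the internal schema gains a field:

```python
# After an upgrade, the internal record gained a new "hdr" field.
_db = {"MOVIE-042": {"title": "Example", "duration_s": 5400, "hdr": True}}


def get_asset_v1(asset_id):
    """Original contract: returns exactly the fields v1 callers expect,
    so existing consumers keep working unchanged."""
    rec = _db[asset_id]
    return {"title": rec["title"], "duration_s": rec["duration_s"]}


def get_asset_v2(asset_id):
    """New contract: exposes the extra data without touching v1."""
    rec = _db[asset_id]
    return dict(rec)


print(get_asset_v1("MOVIE-042"))
```

Consumers migrate to v2 on their own schedule, which is what allows individual microservices to be upgraded without impacting the rest of the system.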

Using this method of API data access allows any process within the ecosystem to reliably and efficiently read data from each microservice. Orchestration and monitoring systems can take full advantage of this so that they can mine data from all over the infrastructure, no matter where it resides in the world.

Microservices hold a wealth of application-specific information and their associated databases can make this available to any function or service to log and manage the infrastructure effectively.
