# Machine Learning (ML) For Broadcasters: Part 1 - Overview

Machine Learning is generating a great deal of interest in the broadcast industry, and in this short series we cut through the marketing hype and discover what ML is, and what it isn’t, with particular emphasis on neural networks (NNs).

It’s important to remember that ML is a vast subject that is still under active research. Even while these articles were being written, the LSTM (Long Short-Term Memory) NN used in time-series applications was facing strong competition from Transformers. Consequently, much of the terminology is still open to interpretation and change, especially in general usage.

ML is a generic term that encompasses a whole range of mathematical tools, including Naïve Bayes, Random Forest, and K-Means, each suited to different applications. Of particular interest to broadcasters are NNs, as they excel at prediction and classification. Broadcast applications of prediction and classification include video compression, QC, metadata tagging during library ingest, and IP network optimization.

Understanding ML requires an appreciation of tools such as regression analysis. Although the name may sound grandiose, regression is really about establishing relationships between a dependent variable, or output (the variable we want to predict or classify), and the independent variables, or input data. A simple example is shown in Figure 1, where we can easily see a delineation between the two classes of data points. In this example, the “plus” datapoints may represent videos that have passed QC and the boxes those that have failed QC.

Figure 1 – both diagrams represent a simple linear delineation for classification between the two types of datapoints.

The left diagram demonstrates a simple linear relationship of the type *y* = *mx* + *c*. The diagram on the right follows a parabolic relationship of the type *y* = *ax*² + *bx* + *c*. Although that curve is not a straight line, the model is still linear in its coefficients *a*, *b* and *c*, so the same linear regression methods can find it. Figure 2, however, is anything but linear, and the delineation would be incredibly challenging to find as it doesn’t fit a standard linear equation model.
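To illustrate why both relationships in Figure 1 count as linear models, the following minimal sketch fits a straight line and a parabola with the same least-squares routine. The sample points, the helper names `solve_linear_system` and `fit_least_squares`, and the coefficient values are all assumptions chosen for illustration; they are not taken from the figures.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix: each row holds the coefficients followed by the right-hand side.
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_least_squares(xs, ys, features):
    """Return the coefficients that minimise squared error for the given features.

    `features` is a list of functions of x. The model is a weighted sum of
    them, which is what makes it linear in the coefficients.
    """
    rows = [[f(x) for f in features] for x in xs]
    k = len(features)
    # Normal equations: (A^T A) w = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


# Hypothetical sample points generated from known relationships.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
line_ys = [2.0 * x + 1.0 for x in xs]                   # y = 2x + 1
parabola_ys = [0.5 * x * x - 2.0 * x + 3.0 for x in xs]  # y = 0.5x^2 - 2x + 3

# Straight line: features are x and a constant.
slope, intercept = fit_least_squares(xs, line_ys, [lambda x: x, lambda x: 1.0])

# Parabola: only the feature list changes, the solver is identical.
a, b, c = fit_least_squares(
    xs, parabola_ys, [lambda x: x * x, lambda x: x, lambda x: 1.0]
)

print(slope, intercept)
print(a, b, c)
```

The only difference between the two fits is the list of features handed to the solver, which is why the parabola can still be treated with linear methods.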

In a classification example, where we want a pass or fail outcome, Figure 1 shows that the relationship is relatively easy to find, even if drawing the delineation line isn’t trivial. Finding the relationships in Figure 2, however, is almost impossible using linear regression. So we turn to ML, and specifically NNs, to find the non-linear relationships demonstrated in Figure 2.

Figure 2 – two examples of non-linear relationships between the input data and the delineation for the classified output data.
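A classic small-scale stand-in for the kind of non-linear delineation in Figure 2 is the XOR pattern, used here purely as an assumption for illustration. The sketch below searches a grid of straight-line classifiers and then compares the best of them with a tiny two-layer network whose weights have been set by hand (not trained), to show that an extra layer can represent a boundary that no single line can.

```python
from itertools import product

# XOR pattern: the two classes cannot be separated by one straight line.
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def step(z):
    """Threshold activation: 1 when the input is positive, otherwise 0."""
    return 1 if z > 0 else 0


def linear_classifier(w1, w2, bias, point):
    """A single straight-line delineation: w1*x1 + w2*x2 + bias > 0."""
    return step(w1 * point[0] + w2 * point[1] + bias)


def accuracy(classify):
    """Fraction of XOR points that `classify` labels correctly."""
    correct = sum(1 for point, label in XOR_DATA if classify(point) == label)
    return correct / len(XOR_DATA)


# Try every line whose weights and bias lie on a coarse grid from -2 to 2.
grid = [i * 0.5 for i in range(-4, 5)]
best_linear_accuracy = max(
    accuracy(lambda p, w1=w1, w2=w2, b=b: linear_classifier(w1, w2, b, p))
    for w1, w2, b in product(grid, repeat=3)
)


def two_layer_network(point):
    """Two hidden units (OR and AND) feeding one output unit.

    The weights are hand-picked for illustration; a real network would
    learn them from training data.
    """
    x1, x2 = point
    hidden_or = step(x1 + x2 - 0.5)
    hidden_and = step(x1 + x2 - 1.5)
    return step(hidden_or - hidden_and - 0.5)


network_accuracy = accuracy(two_layer_network)

print("best straight line:", best_linear_accuracy)
print("two-layer network:", network_accuracy)
```

No straight line gets all four points right, whereas combining two simple units in a second layer does. This is the basic reason NNs are attractive for non-linear problems.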

Linear regression usually relies on statistical analysis to find the equation that predicts the output from the independent input data. The major difference between statistical analysis and ML is how we use the data. In statistics, the data scientist analyzes the data, initially looking for equations that describe simple linear relationships between the input and the output. As the models become more complex and the patterns harder to find, the data scientist adopts a whole new array of mathematical tools. The point here is that the analysis of the data, and hence the design of the statistical model, relies almost entirely on the skill of the data scientist and their ability to find the relationships between the data and the predictions. In other words, they must be domain experts.

ML differs greatly from statistical analysis as the model “learns” patterns within the data set based on generalized models. These include MLPs (Multi-Layer Perceptrons), RNNs (Recurrent Neural Networks), LSTMs (Long Short-Term Memory networks) and GANs (Generative Adversarial Networks), to name but a few. The basic concept is that we apply “training” data to the models and, through some fairly involved mathematical processes, the parameters within the networks are optimized so that when an input is presented to the network, the output matches the desired prediction. More fundamentally, the data scientist does not need to be a domain expert in the field they are working in. It’s fair to say that we still need a domain expert to classify the data, but this would be somebody such as a QC engineer who can pass or fail the images. The data scientist building the complex ML models, however, treats the video and its QC classification as data. They don’t need to understand where the boundary between pass and fail lies, only that it exists, and that they can design and train a model to provide the desired classification or prediction.
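As a rough sketch of what “optimizing the parameters” means, the example below uses gradient descent to adjust a single parameter `w` so that the model output `w * x` matches the training targets. Real networks have many more parameters and more involved mathematics, and the data, learning rate and step count here are assumptions chosen for illustration.

```python
# Hypothetical training data generated from the relationship y = 3x.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0 * x for x in inputs]


def mean_squared_error(w):
    """Average squared difference between model output w*x and the target."""
    return sum((w * x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


w = 0.0                 # the model's single parameter, starting from a poor guess
learning_rate = 0.01    # how far each adjustment moves the parameter
initial_error = mean_squared_error(w)

for _ in range(500):
    # Gradient of the mean squared error with respect to w.
    gradient = sum(2.0 * (w * x - y) * x for x, y in zip(inputs, targets)) / len(inputs)
    # Move the parameter a small step in the direction that reduces the error.
    w -= learning_rate * gradient

final_error = mean_squared_error(w)
print("learned parameter:", w)
print("error before and after:", initial_error, final_error)
```

At no point does the loop contain any knowledge of what the data represents. It only reduces the error between output and target, which is why the data scientist does not need to be a domain expert.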

ML covers two distinct processes: training and evaluation (often called inference). Training is the process of teaching the model to find the patterns within the training dataset that match the desired output. After the model is trained, previously unseen data is presented to it during the evaluation phase, and the model in turn provides an output such as a pass or fail.

Training is based on presenting tens, or even hundreds, of thousands of datapoints so that the parameters within the model can be automatically adjusted. These same parameters are then used during evaluation so that the model can provide a classification or prediction for previously unseen data. Later in this series we delve deeper into the fundamentals of training and evaluation to demonstrate how complex this process really is.
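The two phases can be sketched with a perceptron, the simplest building block of a neural network. In this minimal example the two input scores per video, the eight labeled training points and the unseen points are all hypothetical, and a real training set would be vastly larger. The model's parameters are adjusted during training and then reused unchanged during evaluation.

```python
def predict(weights, bias, features):
    """Evaluation: apply the fixed parameters to one datapoint (1 = pass, 0 = fail)."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train(dataset, max_epochs=2000):
    """Training: adjust the parameters whenever a training point is misclassified."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in dataset:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:
            break  # every training point is now classified correctly
    return weights, bias


# Hypothetical training data: two quality scores per video, labeled by a QC engineer.
training_data = [
    ((8.0, 9.0), 1), ((9.0, 7.0), 1), ((7.0, 8.0), 1), ((9.0, 9.0), 1),  # passed QC
    ((2.0, 3.0), 0), ((3.0, 1.0), 0), ((1.0, 2.0), 0), ((2.0, 2.0), 0),  # failed QC
]

# Training phase: the parameters are learned from the labeled data.
weights, bias = train(training_data)

training_accuracy = sum(
    1 for features, label in training_data if predict(weights, bias, features) == label
) / len(training_data)

# Evaluation phase: previously unseen videos, same parameters, no further adjustment.
unseen_good = predict(weights, bias, (8.0, 8.0))
unseen_bad = predict(weights, bias, (2.0, 2.5))

print("training accuracy:", training_accuracy)
print("unseen videos:", unseen_good, unseen_bad)
```

The labels come from the domain expert, while the code treats them simply as numbers to be matched.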

Although classification is used extensively by ML to tag images and sound, ML also comes into its own when we start considering prediction, specifically in compression and standards rate conversion. If an ML model can predict the next frame of video, or a sequence of frames, then we have an incredibly powerful tool. In principle, we would no longer need to be concerned with explicit motion compensation, as the model would be able to predict the next pixel values. A similar argument applies to video and audio compression.
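Why prediction helps compression can be sketched with simple predictive coding: the encoder sends only the difference between each sample and its predicted value, so a better predictor leaves less to send. The two hand-written predictors below are hypothetical stand-ins for a trained ML model, and the sample values are invented for illustration.

```python
def encode(samples, predictor):
    """Replace each sample with the residual between it and its prediction."""
    history = []
    residuals = []
    for sample in samples:
        residuals.append(sample - predictor(history))
        history.append(sample)
    return residuals


def decode(residuals, predictor):
    """Rebuild the samples by running the same predictor and adding the residuals."""
    history = []
    for residual in residuals:
        history.append(predictor(history) + residual)
    return history


def predict_previous(history):
    """Naive predictor: assume the next value equals the last one."""
    return history[-1] if history else 0


def predict_trend(history):
    """Better predictor: assume the most recent change continues."""
    if len(history) < 2:
        return history[-1] if history else 0
    return 2 * history[-1] - history[-2]


# Hypothetical pixel values rising steadily over six frames.
samples = [10, 12, 14, 16, 18, 20]

residuals_previous = encode(samples, predict_previous)
residuals_trend = encode(samples, predict_trend)

# The total magnitude of the residuals is a rough proxy for the data left to send.
cost_previous = sum(abs(r) for r in residuals_previous)
cost_trend = sum(abs(r) for r in residuals_trend)

print(residuals_previous, cost_previous)
print(residuals_trend, cost_trend)
print(decode(residuals_trend, predict_trend) == samples)
```

Both predictors allow the decoder to rebuild the original samples exactly, but the one that models the signal more accurately leaves mostly zeros to transmit.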

One of the interesting aspects of ML is that, because it relies on training data, the solution is based on past experience. This is loosely similar to the operation of the human brain, and is one of the reasons pundits draw comparisons to the workings of the mind. ML doesn’t pretend to replace the human brain in any way, but it does mirror the way we all learn from experience. In later parts of this series, we look at the importance of training data and how confirmation bias can affect the quality of the ML solution. ML is all about the training data!
