AI is finding lots of new and interesting applications in broadcast television, so why should ML unconscious bias concern us?
Machine Learning is roughly divided into two types: supervised and unsupervised learning. The learning, or training as it is also known, is key to the success of any ML application and this is where the concept of bias can be silent and deadly.
Unsupervised learning allows the ML model to find relevant information for itself. It’s an incredibly interesting technology and is the subject of much research. Various ML models transform the sample data into a lower-dimensional space with the intention of finding relevant features that can be used in applications. Data scientists still influence which classifications emerge by adjusting various parameters during training.
Supervised learning, on the other hand, requires the data scientist to classify the data in some way prior to training so that the model can learn the attributes of the classification. For example, if the model were to detect images of red buses, then tens of thousands of representative sample images would be fed into the ML engine to facilitate training. Not every image of a red bus is fed into the ML engine, just enough for it to be able to recognize an image of a red bus it hasn’t seen before.
The obvious challenge with supervised learning is that if the data scientist mis-classifies an image, the learning suffers. For example, if images of red trucks are labeled as red buses, then the ML engine will be biased toward reporting both red buses and red trucks as red buses when used in an application.
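The red bus and red truck example can be sketched in a few lines of Python. This is only an illustration under loud assumptions: each image is reduced to two made-up numbers (a window count and a cargo volume), the values are invented, and a simple nearest-neighbor lookup stands in for a real image classifier. The names `label_all` and `predict` are hypothetical helpers, not part of any ML library.

```python
from math import dist

# Hypothetical features per image: (window_count, cargo_volume_m3).
# All values are invented purely for illustration.
buses = [(20, 1), (22, 2), (18, 1)]
trucks = [(2, 30), (3, 28), (2, 32)]
cars = [(4, 1), (5, 2), (4, 1)]


def label_all(samples, label):
    """Attach the same label to every sample, as a human labeler would."""
    return [(features, label) for features in samples]


# Training set with correct labels.
correct_labels = (
    label_all(buses, "red_bus")
    + label_all(trucks, "not_red_bus")
    + label_all(cars, "not_red_bus")
)

# Training set where the red trucks were mistakenly labeled as red buses.
mislabeled = (
    label_all(buses, "red_bus")
    + label_all(trucks, "red_bus")
    + label_all(cars, "not_red_bus")
)


def predict(training_set, features):
    """1-nearest-neighbor: return the label of the closest training sample."""
    _, label = min(training_set, key=lambda pair: dist(pair[0], features))
    return label


unseen_bus = (21, 1)
unseen_truck = (2, 29)

print(predict(correct_labels, unseen_bus))    # red_bus
print(predict(correct_labels, unseen_truck))  # not_red_bus
print(predict(mislabeled, unseen_bus))        # red_bus
print(predict(mislabeled, unseen_truck))      # red_bus
```

The model trained on the mislabeled set still finds red buses, but it now also reports an unseen red truck as a red bus. Nothing in the code is faulty; the error lives entirely in the labels.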
This is well understood but what happens if the person classifying the dataset has an unconscious bias? The simple answer is that their bias will probably find its way into the data classification, and hence influence the ML model. Although this is more of an inconvenience when considering the red bus and red truck example, it becomes a little more concerning when we look at the implications for classification based on our perceptions within society, some of which may be unconsciously biased. After all, television is a great community influencer.
Presumably, if the data scientist’s bias were conscious, they would take care not to classify the data in a biased way. But if their bias is unconscious then, by definition, they are not aware that they may be introducing it.
We might assume that unsupervised learning is free from unconscious bias because the computer has no prior knowledge, but the data scientist still influences the outcome of the learning through their analysis of the ML engine’s classifications and the parameter adjustments they make in response.
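As a rough sketch of how a parameter choice shapes an unsupervised result, consider a toy one-dimensional k-means clustering. Everything here is an assumption made for illustration: the data (invented clip durations in minutes), the function name `kmeans_1d`, and the simple deterministic way the starting centroids are picked. Production tools use more sophisticated initialization and work on many more dimensions.

```python
def kmeans_1d(values, k, iterations=20):
    """Toy 1-D k-means; returns the clusters as lists of values."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: pick k evenly spread points from the sorted data.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return clusters


# Invented clip durations in minutes.
durations = [5, 6, 7, 28, 30, 32, 58, 60, 62]

print(kmeans_1d(durations, k=2))
# [[5, 6, 7, 28, 30, 32], [58, 60, 62]]
print(kmeans_1d(durations, k=3))
# [[5, 6, 7], [28, 30, 32], [58, 60, 62]]
```

The data is identical in both runs. With `k=2` the half-hour clips are lumped in with the short ones, while with `k=3` they form their own group. The algorithm had no prior knowledge, yet the person who chose `k` decided which categories exist.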
As broadcasting matures from its cottage-industry working practices into a streamlined, production-line discipline, any mistake or anomaly has the potential to be magnified many times over.
ML has massive potential to advance the viewing experience and improve efficiencies for broadcasters. But we must be aware of how it really works so we can protect against some of the challenges such as unconscious bias. After all, who is guarding the guard?