Audio Global Viewpoint – February 2023
The Future Of ML
As more copyright owners turn to litigation to take on the commercial might of machine learning, what does the future hold for ML?
ML relies on massive amounts of training data to teach its neural networks, using the relatively new phenomenon of data-led learning. This is particularly powerful in image classification, where ML engineers can create working systems that identify images with very little domain knowledge or expertise. But ML has applications in many other domains, including video, audio, human pose estimation and vehicle position estimation, to name but a few. For example, a vendor building an automated QC system would need hundreds of thousands of video and audio sequences, each carrying a pass or fail label.
Supervised learning requires classification labels so that the ML model “knows” what it is looking for as it trains its network. This is similar to how parents teach toddlers to recognize objects such as an orange or an apple, by holding up an apple and saying “apple”. But try turning the apple upside down and watch the toddler become confused by what looks like a new object. In other words, not only does ML need massive amounts of data, it also needs the objects in the images to be presented in a multitude of orientations.
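To make the point about orientations concrete, here is a minimal sketch of how one labeled image might be expanded into several orientations that all share the same label. The tiny 2x2 "image" and the function names (`rotate_90`, `augment_with_rotations`) are hypothetical and chosen purely for illustration; real systems work on far larger images and typically use a dedicated image library.

```python
from typing import List, Tuple

# A tiny grayscale "image" represented as rows of pixel values (illustrative only).
Image = List[List[int]]


def rotate_90(image: Image) -> Image:
    """Rotate an image 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise rotation.
    return [list(row) for row in zip(*image[::-1])]


def augment_with_rotations(image: Image, label: str) -> List[Tuple[Image, str]]:
    """Return the image in four orientations, each paired with the same label."""
    examples = []
    current = image
    for _ in range(4):
        examples.append((current, label))
        current = rotate_90(current)
    return examples


# One human-supplied label now covers four training examples.
apple = [[1, 2],
         [3, 4]]
training_examples = augment_with_rotations(apple, "apple")
```

The design point is that the human labels the image once and the rotations inherit that label, which reduces the labeling effort but not the need for the original data.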
So, supervised ML has two challenges: it needs massive amounts of data, and it needs somebody to classify and label that data, which often requires a human. The question is, who owns the training data?
Although broadcasting is expanding its use of ML with great enthusiasm and success, a look outside the industry reveals a multitude of AI/ML innovations and commercialized products and services that point to some of our future challenges. Just look at the incredible ChatGPT (GPT stands for Generative Pre-trained Transformer) and its ability to create dialogue in many different styles and genres at the request of the user. This is surely paving the way for text-to-movie applications. But the newspaper headlines show Getty Images suing the creators of the AI art tool Stable Diffusion for using Getty’s images to train their models. And this is where ML training data gets really interesting.
An outsider without any emotional or commercial attachment to art may say “what is the difference between scraping the internet for images to train an ML model and going to multiple galleries and gaining inspiration so that an artist can create their next painting?”. Well, in terms of copyright and the law, we will find out when the courts make their judgements. But the moral perspective is a whole different ball game, and no doubt will keep philosophers gainfully employed for years to come.
In my view, the issue of how ML innovators use training data will come down to commercial agreements. If I owned a stock photo agency, I would be in the business of licensing images, and licensing them as training data is an extension of this. But this raises the question of who is responsible for making sure the correct licenses are in place for the training data: the service vendor or the service user? I’m sure contracts could be written to define this, but the moral question belongs to a higher domain. In years to come there will be ML services that can turn text into movies or television productions. In the same way that a text-to-image service creates an image in a particular genre, a text-to-movie service will create a movie in one. But then the question will be: what happens to human creativity? Is this the beginning of the end for human innovation, something broadcasting greatly relies on?
If it’s any consolation, I do not believe ML will take over humanity or the whole of human creativity, and in part I cite Gödel’s incompleteness theorems to support this, but we’ll have to save that for another day.