Hardware Infrastructure Global Viewpoint – June 2021
ML, GPUs And Cloud Computing
Machine Learning is having a massive impact on broadcast workflows, and this looks set to continue. But what do GPUs do? And could they be the Achilles heel of cloud computing?
The primary role of a GPU is to take much of the intensive video processing and rendering away from a system’s CPU. Dedicated processing units and memory provide hardware acceleration that optimizes many of the repetitive tasks that would otherwise slow down the processor and severely impact the user experience.
The resources available on GPUs are massive. The latest NVIDIA RTX 3090 boasts 10,496 floating point processing cores and 24 GB of RAM, not to mention a 1.4 GHz base clock and a memory bandwidth of 936 GB/s. However, all this processing power comes at a price: each GPU consumes up to 350 Watts, meaning a single 5V bus bar for one GPU card would have to supply 70 amps. In practice such cards take most of their power from 12V rails, but that still amounts to around 29 amps.
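The current figure follows directly from I = P / V. A minimal sketch of that arithmetic is shown here; the function name is hypothetical, and the 12V case is included only as an assumed comparison rail, not a figure from the original text.

```python
def rail_current_amps(power_watts: float, rail_volts: float) -> float:
    """Current drawn from a DC supply rail, using I = P / V."""
    if rail_volts <= 0:
        raise ValueError("rail voltage must be positive")
    return power_watts / rail_volts


GPU_POWER_WATTS = 350.0  # peak power per GPU, as quoted in the text

# 5 V is the bus bar voltage used in the text; 12 V is an assumed comparison.
for volts in (5.0, 12.0):
    amps = rail_current_amps(GPU_POWER_WATTS, volts)
    print(f"{volts:.0f} V rail: {amps:.1f} A")
```

The same calculation scales linearly, so a chassis holding several such cards multiplies the supply current accordingly.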
What have GPUs got to do with ML? The simple answer is linear algebra, which forms the basis of the hardware-accelerated processing found in GPUs. ML-based neural networks use a massive number of processing units called neurons, arranged in a structure loosely modelled on the human brain. Nobody would dare suggest that an ML solution is equivalent to the human brain; it’s just that we base the neurons and their associated activation functions loosely on how some parts of the brain operate.
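To make the linear algebra connection concrete, a layer of neurons can be sketched as a matrix-vector multiply followed by an activation function. This is an illustrative sketch in plain Python rather than a GPU implementation; the weights, biases and inputs are made-up values, and the sigmoid is just one common choice of activation.

```python
import math


def sigmoid(z: float) -> float:
    """A common activation function that squashes any value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(weights, biases, inputs):
    """Compute activation(W x + b), where each row of W is one neuron."""
    outputs = []
    for row, bias in zip(weights, biases):
        # Dot product of one neuron's weights with the input vector.
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs


# Hypothetical layer: 2 neurons, each with 3 inputs.
W = [[0.5, -0.25, 0.1],
     [0.2, 0.4, -0.3]]
b = [0.0, 0.1]
x = [1.0, 2.0, 0.0]

layer_output = dense_layer(W, b, x)
print(layer_output)
```

Every neuron in the layer performs the same multiply-and-add independently of the others, which is why the work maps so naturally onto thousands of parallel GPU cores.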
There are two parts to an ML solution: training and evaluation (often called inference). Training is where the heavy processing takes place. Evaluation still does a lot of complex processing but not as much as the training process. The evaluation process can be thought of as the user-product that would operate in our workflows, such as ML-based video compression, or image detection for metadata tagging when material is ingested into a library archive.
ML training is an incredibly complex task and requires tens or even hundreds of thousands of data points to be processed by the ML engine. During this process backwards propagation takes place, and this is where the neurons’ parameters are updated and optimized. Back-prop uses calculus, in the form of partial derivatives arranged as matrices, to work out how the loss function changes with each parameter, so that the parameters can be stepped towards a minimum of the loss. And by some strange and interesting coincidence of the universe, a side effect of GPUs is that they are able to provide hardware acceleration for this process, thus greatly reducing the training time.
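The idea of using partial derivatives to reduce the loss can be sketched with the simplest possible case: a single linear neuron trained by gradient descent. This is a toy illustration, not how a production ML engine is written; the data, learning rate and epoch count are assumptions chosen so the example converges, and real back-prop applies the same chain-rule step across many layers.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            error = (w * x + b) - y        # forward pass, then the error
            grad_w += 2.0 * error * x / n  # partial derivative of loss w.r.t. w
            grad_b += 2.0 * error / n      # partial derivative of loss w.r.t. b
        # Step each parameter against its gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up training data generated from y = 2x + 1.
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b = train_linear_neuron(samples)
print(f"learned w={w:.4f}, b={b:.4f}")
```

Even this toy loop repeats the same multiply-and-accumulate thousands of times; scale it to millions of parameters and data points and the appeal of hardware acceleration becomes clear.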
Training can still take days or weeks, even with high power GPUs, but this is significantly less time than just relying on CPUs.
Anybody thinking one step ahead will now realize a potential flaw in ML applications and cloud computing: we need massive GPU accelerators to make it work, and this may well breach our COTS ideology. But there are two points to remember. Evaluation doesn’t need as much processing because it’s not doing the back-prop process, and GPUs are available in the public cloud.
It’s fair to assume that supervised ML accounts for most of the broadcast applications in workflows and that all the training takes place on the vendor’s resources. Retraining is a continuous process that vendors conduct to improve their solutions. The output of the training process is a binary file that contains the parameters for the ML engine used during evaluation. This can be supplied to the broadcaster as part of a software update, or, if the vendor is providing a microservices SaaS solution, the vendor can update the ML parameters directly and automatically without the broadcaster needing to be involved.
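The split between a trained parameter file and an evaluation engine can be sketched as follows. The file layout here (a count followed by 32-bit floats), the file name and the function names are all hypothetical, chosen only for illustration; real vendors use their own formats.

```python
import os
import struct
import tempfile


def save_parameters(path, params):
    """Write a count followed by little-endian float32 values."""
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(params)))
        f.write(struct.pack(f"<{len(params)}f", *params))


def load_parameters(path):
    """Read back the parameter list written by save_parameters."""
    with open(path, "rb") as f:
        (count,) = struct.unpack("<I", f.read(4))
        return list(struct.unpack(f"<{count}f", f.read(4 * count)))


def evaluate(params, inputs):
    """Evaluation only: a weighted sum plus bias, with no parameter updates."""
    *weights, bias = params
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Vendor side: made-up "trained" values (two weights and a bias).
trained = [0.5, -1.0, 0.25]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_params.bin")  # hypothetical file name
    save_parameters(path, trained)
    # Broadcaster side: load the delivered file and run evaluation.
    loaded = load_parameters(path)

result = evaluate(loaded, [2.0, 1.0])
print(result)
```

Swapping in a newer parameter file changes the behaviour of the evaluation code without touching the code itself, which is what allows retrained models to be delivered as routine updates.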
Cloud GPUs can provide enough processing power for ML training, but at a cost. The really good news is that the finished product working in evaluation mode doesn’t do the learning so only needs a smaller GPU, and sometimes no GPU at all. So once again, cloud computing provides multiple benefits for broadcasters and takes away the risk and inconvenience of procuring and maintaining hardware.