In the good old days when you were thinking about upgrading your computer you began by reading printed reviews such as those published by Byte magazine. These reviews usually included industry standard benchmarks. Now, of course, you are far more likely to watch internet video reviews.
When I view these online reviews, it’s clear the majority have been written by “gamers.” These reviews have titles such as “DeathAxe Laptop Reviewed Using 23 Games.”
A gaming computer reads compressed artificial world descriptions from a disk file. This artificial world is regenerated by the CPU and loaded into the GPU, where it is displayed to the gamer. The gamer’s actions are fed back to the game engine running on the CPU, which dynamically modifies the artificial world the GPU displays.
This is a different process from video playback on a content creator’s system. Here a continuous sequence of frames is rapidly pulled from disk, decompressed by the CPU, and sent to GPU buffers, where mathematical operations such as color correction are applied before each frame is displayed for a precise interval. This process is repeated for the next frame.
When a content creator’s computer is transcoding media the workflow is also different. Transcoding involves reading compressed video frames from a disk, each of which is then decompressed and recompressed, with the recompressed frame written back to a disk.
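To make that workflow concrete, here is a minimal sketch, assuming the ffmpeg command-line tool is installed; the file names and encoder settings are placeholders, not anything used in this exploration.

```python
import subprocess

# Hypothetical example: re-encode a compressed source clip to H.264.
# ffmpeg reads compressed frames from disk, decompresses each one,
# recompresses it with the requested encoder, and writes the result
# back to disk - the read/decode/encode/write loop described above.
subprocess.run(
    [
        "ffmpeg",
        "-i", "source_clip.mov",   # compressed frames read from disk
        "-c:v", "libx264",         # recompress with the x264 encoder
        "-crf", "18",              # quality target for the new file
        "transcoded_clip.mp4",     # recompressed frames written to disk
    ],
    check=True,
)
```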
There is one thing that gaming and content creation do have in common - both involve long task execution times during which heat is generated by the CPU and GPU. Even when a laptop has a good cooling system (see Figure 1), playing games for hours or working with high-resolution compressed video - often with complex effects - can push a computer toward, or into, thermal throttling. So, while gaming and content creation tasks do not employ the same compute intensive processes, they both can generate high thermal loads.
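As an aside, here is a minimal sketch of how such a thermal check might be scripted, assuming the cross-platform psutil package is available; temperature sensors are only exposed on some platforms, and nothing here formed part of my test procedure. A sustained drop in the reported clock speed well below the maximum during a long run is a sign of throttling.

```python
import time

import psutil  # assumed available: pip install psutil

def log_thermals(duration_s: int = 300, interval_s: int = 5) -> None:
    """Sample CPU clock speed (and temperature, where exposed) during a long run."""
    end = time.time() + duration_s
    while time.time() < end:
        freq = psutil.cpu_freq()
        line = f"{time.strftime('%H:%M:%S')}  {freq.current:.0f} MHz (max {freq.max:.0f} MHz)"
        # sensors_temperatures() exists only on some platforms (e.g. Linux).
        temps = getattr(psutil, "sensors_temperatures", lambda: {})()
        if temps:
            first = next(iter(temps.values()))[0]
            line += f"  {first.current:.0f} C"
        print(line)
        time.sleep(interval_s)

if __name__ == "__main__":
    log_thermals()
```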
Understanding this, we want to know whether game play performance or industry standard benchmark performance has a higher correlation with content creation performance.
Let’s, for the time being, ignore measuring content creation performance. At this point, we need only a measure of game play performance and a benchmark performance measure. (When data were being collected, each laptop was powered by its charger from mains power.)
Both the gaming and Geekbench tests ran for well over a minute, thus allowing heat to build as it would during editing and color correction. Laptop display resolution, at 1920x1080, remained constant during testing.
We will need to measure performance on multiple computers because each measure, game play performance and benchmark performance, requires multiple data points.
Before describing the measures I took, you need descriptions of the computers. In the first phase of this exploration I had access to four systems. Thankfully these systems offered a wide performance range.
Number 1. Samsung Galaxy TabPro S Windows 10 Tablet: Intel m3-6Y30 CPU/GPU (similar to an i3); 4GB RAM; 2GB Video memory; 256GB SSD. (See Figure 2.)
Number 2. Lenovo IdeaCentre Y910-27: Intel i7-6700 (3.4-4.0GHz); 16GB DDR4-2133; NVIDIA GTX 1080, 8GB GDDR5 memory; 128GB M.2 NVMe PCIe SSD. (See Figure 3.)
Number 3. Gigabyte AERO 15 OLED Laptop: Intel i7-9750H (2.6-4.5GHz); 16GB DDR4-2666 dual-channel RAM; NVIDIA GTX 1660Ti, 6GB GDDR6 memory; 512GB M.2 NVMe PCIe SSD. (See Figure 4.)
Number 4. HP Omen 15: Intel i7-9750H (2.6-4.5GHz); 16GB 2666MHz DDR4 dual-channel RAM; NVIDIA GTX 1660Ti, 6GB GDDR6 memory; 512GB M.2 NVMe PCIe SSD; 32GB Intel Optane memory. The Intel Optane module acts as a storage cache that can increase system responsiveness. (See Figure 5.)
I call the first phase of my two-part exploration an “experiment” because it involves a null hypothesis: “there is no difference between a measure of game performance and a measure of benchmark performance.”
Why might this be my finding? Perhaps my free game, War Thunder, did not stress my computers as much as would an expensive game owned by a gamer.
To maximize the load put on a computer, I set the game’s visual quality to Movie, the maximum possible setting. See Figure 6.
Had the data from my experiment not allowed me to reject the null hypothesis, I would have been unable to say anything about the result. Since you are reading this, you know I was able to reject the null hypothesis. Figure 7 presents Geekbench 4 performance. (The AERO’s Geekbench 5 multi-core score is 5486.)
How did I determine there was a difference in performance? As shown by the following example, I merely “looked” at data plots of the two measures.
To better understand how this would work, I created the three sets of data shown by Figure 8. The orange plot shows a linear correlation between two variables: a Y-axis variable (e.g., heating BTU estimate) and an X-axis variable (e.g., number of office windows). The blue plot shows a linear correlation between a Y-axis variable (cooling BTU estimate) and the same X-axis variable (windows).
We can see that although the orange and blue data values are different (2.4 to 7.8 and 3.4 to 8.3), their plots have the same slope and therefore represent the same kind of linear correlation.
The green plot, however, shows a non-linear correlation between a Y-axis variable (heating/cooling power costs) and an X-axis variable—number of windows. (When bundling heating and cooling together, the power company provides a “consumption discount.”)
An Excel logarithmic trendline (dashed red) has been overlaid on the green curve. Looking at this logarithmic trendline and the two linear data plots, we can see the nature of the correlations is different.
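For readers who prefer numbers to pictures, the same distinction can be made numerically. The sketch below uses synthetic stand-in values, not the Figure 8 data; the point is that Excel’s logarithmic trendline is simply a straight line fitted against ln(x).

```python
import numpy as np

# Synthetic stand-ins (not the article's Figure 8 values):
x = np.arange(1, 11, dtype=float)        # e.g. number of office windows
y_linear = 2.0 + 0.6 * x                 # e.g. heating BTU estimate
y_log = 1.5 + 2.0 * np.log(x)            # e.g. discounted power cost

# Linear trendline: least-squares fit of y = m*x + c.
m, c = np.polyfit(x, y_linear, 1)

# Logarithmic trendline: the same least-squares fit, but against ln(x),
# giving y = a*ln(x) + b.
a, b = np.polyfit(np.log(x), y_log, 1)

print(f"linear trendline:      y = {m:.2f}*x + {c:.2f}")
print(f"logarithmic trendline: y = {a:.2f}*ln(x) + {b:.2f}")
```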
Figure 9 presents a histogram of Geekbench 4 multi-core performance generated by the four computers. (A histogram is the correct way to plot discrete data.) A linear trendline has been overlaid on these data. The coefficient of determination, r², provides an estimate of the correlation (0.0 to 1.0) between the test data and the trendline.
Figure 10 presents game performance, in frames-per-second, from these four computers. Each computer’s data point is the average of the automatic playback of three benchmark battles: Pacific War (morning), Battle of Berlin, and Tank Battle. A logarithmic trendline has been overlaid on these data. Again, r² provides an estimate of the correlation between the test data and the trendline.
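If you want to compute r² yourself rather than read it off a chart, it can be derived directly from the data points and the trendline’s predictions. The scores below are placeholders for illustration, not the measured values behind Figures 9 and 10.

```python
import numpy as np

def r_squared(y_observed: np.ndarray, y_trendline: np.ndarray) -> float:
    """Coefficient of determination between data points and a fitted trendline."""
    ss_res = np.sum((y_observed - y_trendline) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_observed - y_observed.mean()) ** 2)    # total sum of squares
    return 1.0 - ss_res / ss_tot

# Placeholder four-machine example (not the article's scores):
x = np.array([1.0, 2.0, 3.0, 4.0])                     # machine index
scores = np.array([3500.0, 17000.0, 22000.0, 23500.0])
m, c = np.polyfit(x, scores, 1)                        # linear trendline
print(f"r^2 = {r_squared(scores, m * x + c):.3f}")
```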
Figures 11 and 12 present these same data using connected data points so the shape of their plots is rendered more clearly.
To make the plot shapes more comparable, Figure 13 shows the two measures superimposed.
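The article does not state how Figure 13 scales the two measures; one straightforward way to put frames-per-second and benchmark points on a common footing for a shape comparison is min-max normalization, sketched here with placeholder values.

```python
import numpy as np

def min_max(values: np.ndarray) -> np.ndarray:
    """Rescale a series to the 0-1 range so differently scaled measures can share one plot."""
    return (values - values.min()) / (values.max() - values.min())

# Placeholder values for four machines (not the measured results):
game_fps = np.array([12.0, 95.0, 110.0, 108.0])
geekbench = np.array([3500.0, 17000.0, 22000.0, 23500.0])

print("game FPS  ", np.round(min_max(game_fps), 2))
print("Geekbench ", np.round(min_max(geekbench), 2))
```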
In the second part of this two-part exploration, we will compare these two performance measures with performance data collected while editing with DaVinci Resolve. We should then see whether the game play curve or the multi-core benchmark curve is a better match to content creation performance.
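As a preview of how that comparison might be quantified, assuming hypothetical DaVinci Resolve timings for the same four machines (the numbers below are made up purely to show the shape of the calculation), the correlation of each measure with the content creation result can be computed directly.

```python
import numpy as np

# Hypothetical placeholder data for four machines (not measured results):
resolve_export_s = np.array([640.0, 150.0, 120.0, 115.0])   # lower is better
game_fps = np.array([12.0, 95.0, 110.0, 108.0])             # higher is better
geekbench = np.array([3500.0, 17000.0, 22000.0, 23500.0])   # higher is better

# Pearson correlation of each candidate measure with export time; a value
# near -1 means the measure tracks content creation performance closely.
for name, measure in (("game FPS", game_fps), ("Geekbench multi-core", geekbench)):
    r = np.corrcoef(measure, resolve_export_s)[0, 1]
    print(f"{name}: r = {r:.2f}")
```

Whether a correlation coefficient or a visual comparison of curve shapes is used, the goal is the same: to see which of the two readily available measures better predicts real editing performance.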