In the good old days, when you were thinking about upgrading your computer, you began by reading printed reviews such as those published by Byte magazine. These reviews usually included industry-standard benchmarks. Now, of course, you are far more likely to watch internet video reviews.
When I view these online reviews, it’s clear the majority have been produced by “gamers.” They have titles such as “DeathAxe Laptop Reviewed Using 23 Games.”
A gaming computer reads compressed artificial world descriptions from a disk file. This artificial world is regenerated by the CPU and loaded into the GPU where it is displayed to the gamer. The gamer’s actions are fed back to the GPU which dynamically modifies the artificial world it displays.
This is a different process from video playback on a content creator’s system. Here a continuous sequence of frames is rapidly pulled from a disk, decompressed by the CPU, and sent to GPU buffers, where mathematical operations, such as those for color correction, are applied. Each frame is then displayed for a precise interval, and the process is repeated for the next frame.
When a content creator’s computer is transcoding media the workflow is also different. Transcoding involves reading compressed video frames from a disk, each of which is then decompressed and recompressed, with the recompressed frame written back to a disk.
There is one thing that gaming and content creation do have in common - both involve long task execution times during which heat is generated by the CPU and GPU. Even when a laptop has a good cooling system (see Figure 1), playing games for hours or working with high-resolution compressed video - often with complex effects - can push a computer toward, or into, thermal throttling. So, while gaming and content creation tasks do not employ the same compute intensive processes, they both can generate high thermal loads.
Understanding this, we want to know whether game play performance or industry standard benchmark performance has a higher correlation with content creation performance.
Let’s, for the time being, ignore measuring content creation performance. At this point, we need only a measure of game play performance and a benchmark performance measure. (When data were being collected, each laptop was powered by its charger from mains power.)
Both the gaming and Geekbench tests ran for well over a minute, allowing heat to build as it would during editing and color correction. Laptop display resolution remained constant at 1920x1080 during testing.
We will need to measure performance on multiple computers because we need multiple data points for the measure of game play performance and multiple data points for the measure of benchmark performance.
Before I describe the measures I took, you need descriptions of the computers. In the first phase of this exploration I had access to four systems. Thankfully, these systems offered a wide performance range.
Number 1. Samsung Galaxy TabPro S Windows 10 Tablet: Intel m3-6Y30 CPU/GPU (similar to an i3); 4GB RAM; 2GB Video memory; 256GB SSD. (See Figure 2.)
Number 2. Lenovo Ideacenter Y910-27: Intel i7-6700 (3.4-4.0GHz); 16GB DDR4-2133; NVIDIA GTX 1080, 8GB GDDR5 memory; 128GB M.2 NVMe PCIe SSD. (See Figure 3.)
Number 3. Gigabyte 15 OLED Laptop: Intel i7-9750H (2.6-4.5GHz); 16GB DDR4-2666 dual-channel RAM; NVIDIA GTX 1660Ti 6GB GDDR6 memory; 512GB M.2 NVMe PCIe SSD. (See Figure 4.)
Number 4. HP Omen 15: Intel i7-9750H (2.6-4.5GHz); 16GB DDR4-2666 dual-channel RAM; NVIDIA GTX 1660Ti 6GB GDDR6 memory; 512GB M.2 NVMe PCIe SSD; 32GB Intel Optane memory. The Intel Optane memory option acts as a cache for storage, not as additional RAM, and can increase system performance. (See Figure 5.)
I call the first phase of my two-part exploration an “experiment” because it involves a null hypothesis. My null hypothesis is “there is no difference between a measure of game performance and a measure of benchmark performance.”
Why might this be my finding? Perhaps my free game, War Thunder, did not stress my computers as much as would an expensive game owned by a gamer.
To maximize the load put on a computer, I set the game’s visual quality to Movie, which is the maximum possible. See Figure 6.
Had the data from my experiment not allowed me to reject the null hypothesis, I would have been unable to say anything about the experiment. Since you are reading this, we know I was able to reject the null hypothesis. Figure 7 presents Geekbench 4 performance. (The Gigabyte AERO’s Geekbench 5 multi-core score is 5486.)
How did I determine there was a difference in performance? As shown by the following example, I merely “looked” at data plots of the two measures.
To better understand how this would work, I created the three sets of data shown by Figure 8. The orange plot shows a linear correlation between two variables: a Y-axis variable (e.g., heating BTU estimate) and an X-axis variable (e.g., number of office windows). The blue plot shows a linear correlation between a Y-axis variable (cooling BTU estimate) and the same X-axis variable (windows).
We can see that although the orange and blue data values are different (2.4 to 7.8 and 3.4 to 8.3), their plots have nearly the same slope and the same linear shape.
The green plot, however, shows a non-linear correlation between a Y-axis variable (heating/cooling power costs) and an X-axis variable (number of windows). (When bundling heating and cooling together, the power company provides a “consumption discount.”)
An Excel logarithmic trendline (dashed red) has been overlaid on the green curve. Looking at this logarithmic trendline and the two linear data plots, we can see the nature of the correlations is different.
Figure 9 presents a bar chart of Geekbench 4 multi-core performance generated by the four computers. (A bar chart is an appropriate way to plot discrete data.) A linear trendline has been overlaid on these data. The coefficient of determination, r², estimates how closely the trendline fits the test data, on a scale of 0.0 to 1.0.
Figure 10 presents game performance, in frames per second, from these four computers. Each computer’s data point is the average of the automatic playback of three benchmark battles: Pacific War (morning), Battle of Berlin, and Tank Battle. A logarithmic trendline has been overlaid on these data. Again, r² estimates how closely the trendline fits the test data.
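For readers who want to reproduce this kind of comparison outside a spreadsheet, here is a minimal sketch of fitting a linear trendline (y = a·x + b) and a logarithmic trendline (y = a·ln(x) + b) by ordinary least squares and computing the coefficient of determination for each. The function names and the sample values are hypothetical and chosen only for illustration; they are not the measurements plotted in Figures 9 and 10.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def fit_linear(xs, ys):
    """Fit y = a*x + b and return (a, b, r2)."""
    a, b = least_squares(xs, ys)
    return a, b, r_squared(ys, [a * x + b for x in xs])

def fit_logarithmic(xs, ys):
    """Fit y = a*ln(x) + b by regressing y on ln(x) and return (a, b, r2)."""
    log_xs = [math.log(x) for x in xs]
    a, b = least_squares(log_xs, ys)
    return a, b, r_squared(ys, [a * lx + b for lx in log_xs])

# Hypothetical frame rates for computers 1..4 (NOT the article's measurements).
computers = [1, 2, 3, 4]
hypothetical_fps = [20.0, 55.0, 75.0, 90.0]

lin = fit_linear(computers, hypothetical_fps)
log = fit_logarithmic(computers, hypothetical_fps)
print(f"linear trendline:      r2 = {lin[2]:.3f}")
print(f"logarithmic trendline: r2 = {log[2]:.3f}")
```

Whichever trendline yields the higher r² is the better description of the plot's shape, which is the same judgment made by eye in the figures.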
Figures 11 and 12 present these same data using connected data points so the shape of their plots is rendered more clearly.
To make the plot shapes more comparable, Figure 13 shows the two measures superimposed.
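Because benchmark scores and frames per second use very different units, superimposing them requires some common scale. One possible approach, sketched below, is min-max normalization, which rescales each series to run from 0.0 to 1.0 so that only the shapes are compared. This is an assumption made for illustration and not necessarily how Figure 13 was produced; the function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values for four computers (NOT the article's measurements).
hypothetical_benchmark = [3000.0, 14000.0, 20000.0, 22000.0]  # benchmark score
hypothetical_fps = [20.0, 55.0, 75.0, 90.0]                   # frames per second

norm_benchmark = min_max_normalize(hypothetical_benchmark)
norm_fps = min_max_normalize(hypothetical_fps)

# Print both series on the shared 0-to-1 scale, one row per computer.
for i, (b, f) in enumerate(zip(norm_benchmark, norm_fps), start=1):
    print(f"computer {i}: benchmark {b:.2f}, game {f:.2f}")
```

After normalization both series start at 0.0 and end at 1.0, so any difference between them at the middle computers reflects a difference in curve shape rather than in units.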
In the second part of this two-part exploration, we will compare these two performance measures with performance data collected when editing with DaVinci Resolve. We should then see whether the game play curve or the multi-core benchmark curve is a better match to content creation performance.