V-Nova PERSEUS encoding system
V-Nova commissioned the independent consultancy informitv to develop a formal methodology to evaluate the performance of PERSEUS video compression for deployment across various scenarios. This article summarizes the results.
The tests examined the V-Nova PERSEUS as it might be used in professional video contribution applications. Images were compared to the uncompressed source material in various formats at a range of data rates, from 75Mbps to 300Mbps.
Experienced observers scored the image results. When aggregated, these scores identified the data rate at which the encoder’s compressed output is visually lossless compared to an equivalent uncompressed source.
Four sample sequences were evaluated, including a range of fast action and high detail scenes, representative of typical video programme material. The sequences were selected to be deliberately challenging to encode, and included camera movement, shot cuts, slow dissolves and graphic overlays.
Test model characteristics
The test sequences used RAW files direct from the camera, recorded at high frame rates in progressive format at resolutions of 4K or greater. The images were neutrally graded, cropped and scaled. The frame rate was conformed to 50 frames per second without interpolation.
The shots were edited into a series of 10-second sequences with a variety of cuts, mixes and superimposed graphics. This enabled various features representing different characteristics of television output to be tested. These master materials were used to render uncompressed source files in 10-bit 4:2:2 RGB format. Interlaced output was derived from the progressive source where necessary.
The source video was played out in 10-bit 4:2:2 YUV format and encoded and decoded in real time on a V-Nova P.Link 2.0 encoder and decoder combination. The output was recorded in uncompressed YUV format for comparison. This removed any dependencies on the codec for playback and enabled direct comparison with an uncompressed source.
Both source and encoded outputs were evaluated using a Video Clarity ClearView analyzer, connected via four SDI links to a SONY PVMX300 30-inch 4K TRIMASTER LCD professional monitor. This enabled the playback of two uncompressed 1080p channels of video in real time or a single channel of 2160p50 video. Playback of two channels of 2160p50 for side-by-side comparison was limited to 30 frames per second. No frames were dropped, so playback effectively ran at 60% of the normal rate.
The viewing panel consisted of seven consultancy employees with normal corrected colour vision, all experienced in evaluating video compression. For each 10-second sequence they individually and independently compared the identified source to the previously compressed output, without knowing its encoding parameters.
Each observer was permitted to review the sequences any number of times, from any distance, in full screen, side-by-side split screen or seamless split screen. This approach was intended to encourage close and careful visual inspection to identify any discernible difference.
The observers were asked to assess the compressed image with respect to the uncompressed source and then score it on a nine-point degradation rating scale.
The anonymous results of the observations were aggregated to determine the Mean Opinion Score for each of the four sequences. The resulting overall means for each format and data rate are shown in Figure 1.
Figure 1. Mean opinion score of recorded output of real-time V-Nova P.Link 2.0 encoder compared to uncompressed source as rated by seven experienced viewers.
PERSEUS scored an average of over 8 out of 9 for each sequence at each data rate, which corresponds to 4.5 on a 5-point scale. The lowest score for any observation was 7, given in 23 observations. A score of 7 corresponds to the point at which degradation is perceptible but not annoying.
Observers were instructed to score 7 or lower if they were able to identify any specific degradation or visual artifact not present in the source. A total of 24 observations scored 8. This corresponds to the threshold of degradation, or the point at which the observer is not completely sure they can perceive degradation. An image with this score is considered to be visually lossless. Out of 168 observations, 121, or 72%, scored 9, meaning that any degradation was imperceptible and the observers were unable to reliably tell the source from the encoded version.
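The arithmetic behind these figures can be checked with a short sketch. It uses only the score counts quoted above. The linear mapping from the nine-point to the five-point scale is an assumption, chosen because it reproduces the stated equivalence of 8 out of 9 to 4.5 out of 5. The function names are illustrative and are not part of the test methodology.

```python
# Score counts quoted in the article: 23 sevens, 24 eights, 121 nines.
score_counts = {7: 23, 8: 24, 9: 121}


def mean_opinion_score(counts):
    """Return the mean of all individual scores, given a {score: count} map."""
    total = sum(counts.values())
    return sum(score * n for score, n in counts.items()) / total


def nine_to_five_point(score):
    """Linearly map a score on a 1-9 scale onto a 1-5 scale (assumed mapping)."""
    return 1 + (score - 1) * 4 / 8


total_observations = sum(score_counts.values())
overall_mos = mean_opinion_score(score_counts)
share_of_nines = score_counts[9] / total_observations

print(total_observations)            # 168
print(round(overall_mos, 2))         # 8.58
print(round(share_of_nines * 100))   # 72
print(nine_to_five_point(8))         # 4.5
```

The overall mean of about 8.58 is derived here from the pooled counts. The article itself reports means per format and data rate rather than this single pooled value.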
There was some variation in scores between different sequences but this was not substantial. There was also some variation between observers, with some scoring lower overall than others. The scores increased with data rate for each format, indicating an increase in quality, but in each case the lowest data rate tested was on average considered to be visually lossless.
Objective tests were separately performed using standard metrics to compare the source input with the encoded output at the same resolution. In each case the PSNR or peak signal-to-noise ratio and MS-SSIM or multi-scale structural similarity index were calculated. The metrics were computed using the Video Clarity ClearView analyzer and represent the average across the frames of each sequence in the Y or luminance channel.
PSNR is a popular measure of distortion, expressed in decibels (dB). Higher numbers indicate less distortion. Although PSNR does not necessarily correspond to subjective image fidelity, a value over 35 dB indicates relatively low distortion and is generally considered to be good, while values around 50 dB are consistent with visually lossless compression.
Figure 2. The average PSNR ranged from 45.44 dB for the 75Mbps interlaced high-definition format, to 49.39 dB for the 300Mbps ultra-high-definition format.
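For readers unfamiliar with the metric, the following is a minimal sketch of the standard PSNR definition, applied to luma samples and averaged across frames. It is not the ClearView analyzer's implementation, whose internals are not described here. The peak value of 1023 (full-range 10-bit) is an assumption, and the sample values are invented purely for illustration.

```python
import math


def psnr(reference, test, bit_depth=10):
    """PSNR in dB between two equal-length sequences of luma samples."""
    if len(reference) != len(test):
        raise ValueError("frames must contain the same number of samples")
    # Mean squared error between corresponding samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no distortion at all
    peak = (1 << bit_depth) - 1  # 1023 for 10-bit video (assumed full range)
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(reference_frames, test_frames, bit_depth=10):
    """Average the per-frame PSNR across a sequence of frames."""
    values = [psnr(r, t, bit_depth) for r, t in zip(reference_frames, test_frames)]
    return sum(values) / len(values)


# Toy data: four luma samples per "frame", each off by 2 code values.
source_frame = [512, 600, 700, 800]
decoded_frame = [514, 598, 702, 798]
print(round(psnr(source_frame, decoded_frame), 1))  # about 54.2 dB
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB. The spread between 45 dB and 50 dB therefore represents a sizeable difference in numerical error even though both are high values.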
The PSNR varied by sequence, indicating that such results are image dependent. As shown in Figure 2, the higher data rates delivered higher PSNR values. The results closely followed the pattern of the mean opinion scores of the video experts. A PSNR approaching 50 dB or above was consistent with an average Mean Opinion Score (MOS) of 9, indicating visually lossless compression.
What did we learn?
PERSEUS delivered visually lossless compression in observer evaluations at data rates of 75Mbps for 1080i25, 150Mbps for 1080p50 and 200Mbps for 2160p50. The compressed output at these operating points scored an average of over 8 on the MOS scale of 1 to 9, with 72% of the total observations scoring 9. This effectively means that observers were unable to perceive any degradation, and that V-Nova PERSEUS is visually lossless at the operating points tested.
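To put these operating points in context, the sketch below estimates the uncompressed data rate of each format and the implied compression ratio. These ratios are derived, not reported in the tests. The sketch assumes active picture sizes of 1920x1080 and 3840x2160, 10-bit 4:2:2 sampling (an average of two samples per pixel), 25 full frames per second for 1080i25, and 1Mbps equal to one million bits per second.

```python
def uncompressed_mbps(width, height, frames_per_second, bit_depth=10):
    """Active-picture data rate in Mbps for 4:2:2 video (assumed sampling)."""
    # 4:2:2 carries one luma sample per pixel plus, on average,
    # one chroma sample per pixel: two samples per pixel in total.
    samples_per_pixel = 2
    bits_per_second = (
        width * height * frames_per_second * samples_per_pixel * bit_depth
    )
    return bits_per_second / 1e6


# Format: (width, height, full frames per second, compressed rate in Mbps).
operating_points = {
    "1080i25": (1920, 1080, 25, 75),
    "1080p50": (1920, 1080, 50, 150),
    "2160p50": (3840, 2160, 50, 200),
}

for name, (width, height, fps, compressed) in operating_points.items():
    source = uncompressed_mbps(width, height, fps)
    ratio = source / compressed
    print(f"{name}: {source:.0f} Mbps uncompressed, about {ratio:.1f}:1 at {compressed} Mbps")
```

Under these assumptions the two high-definition operating points work out to roughly 14:1, and the 2160p50 point to roughly 41:1.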