HDR: Part 17 - Creative Technology - Is RAW Really Uncompressed & Unprocessed?

It’s hard to object to raw recording. The last thing anyone wants is for the creative intent to be adulterated by unfortunate technical necessities like compression, and the flexibility of raw makes for… well. Let’s admit it: better grading, but also easier rectification of mistakes after the fact, to the point where the glitch isn’t really noticeable.

Or at least, that all makes sense in terms of what the word “raw” originally meant. It meant uncompressed and, to the maximum extent possible, unprocessed. That made it difficult, because uncompressed images are huge, and uncompressed moving images are even more huge. Because raw is such an attractive idea, though, there’s commercial interest in being able to call something raw while avoiding the inconvenience, which has meant the introduction of compression and at least some colour processing. The validity of both those things depends on the circumstances, but they have ramifications for compatibility, and for exactly the things that raw was designed to give us.

In short, has the whole idea of raw recording been watered down?

The idea emerged in digital stills cameras. JPEG images predate mainstream DSLRs, and given the need to store lots of images before big flash cards were available, it was an easy choice. Even then, the desire to promote a high picture count meant images were sometimes squeezed harder than perhaps they should have been; maybe enough for the compression to become visible. Raw went further than just avoiding compression: it took the unaltered pixel values from the sensor and stored them in a file, not even unscrambling the matrix of red, green and blue-filtered information from a single-chip camera. Doing that unscrambling later, on a desktop computer, might let it be done better.
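To picture what that matrix of red, green and blue-filtered information looks like, here is a minimal Python sketch. It assumes a hypothetical 4x4 readout with an RGGB filter layout; real sensors differ in layout, bit depth and black level, and the sample values are made up for illustration.

```python
# Hypothetical 4x4 sensor readout. The RGGB layout is an assumption:
# even rows alternate R, G and odd rows alternate G, B.
mosaic = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]


def split_rggb(mosaic):
    """Sort raw samples into red, green and blue lists by filter position."""
    planes = {"R": [], "G": [], "B": []}
    for y, row in enumerate(mosaic):
        for x, value in enumerate(row):
            if y % 2 == 0 and x % 2 == 0:
                planes["R"].append(value)   # red-filtered photosite
            elif y % 2 == 1 and x % 2 == 1:
                planes["B"].append(value)   # blue-filtered photosite
            else:
                planes["G"].append(value)   # green photosites, two per 2x2 cell
    return planes


planes = split_rggb(mosaic)
print(planes)
```

Each photosite holds only one colour sample, so half of the samples are green and a quarter each are red and blue. Turning that into full-colour pixels is the interpolation job a raw file defers until later.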

Grey Definitions

We should be careful about terms like “unaltered pixel values” because there’s a grey area between the action of reading information from the sensor, which we have to do in camera, and starting to process it into an image, which we can do at any time. For one thing, there’s the issue of how we relate numbers stored in a file to light levels hitting a sensor or coming out of a monitor, which isn’t always simple. Classical raw formats simply stored the numbers read from the sensor, which meant the number stored was, usually, roughly, proportional to the amount of light detected.

That’s encouragingly simple, but it doesn’t respect how eyes or monitors work, so the adjustments required to create a normal-looking image can be quite extreme. That risks visible edges – banding – between adjacent brightness levels. Some formats mathematically modify the relationship between light level and stored number to allow for gentler adjustments later, minimising the problem. That could be considered “processing the image,” although it’s hard to claim that the result is not still legitimately raw; it’s still a matrix of red, green and blue samples from a sensor and not really a viewable picture yet, and it’s usually still uncompressed.
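The banding risk can be illustrated by counting how many code values each encoding spends on a given stop of exposure. The Python sketch below assumes a 12-bit container covering 12 stops and an idealised pure logarithmic curve; real camera curves are more elaborate, so the figures are indicative only.

```python
BITS = 12                  # assumed container bit depth
STOPS = 12                 # assumed scene range covered, in stops
MAX_CODE = 2 ** BITS - 1   # 4095 for 12 bits


def linear_codes_in_stop(stop_below_clip):
    """Code values a linear encoding spends on one stop.

    Stop 1 is the brightest stop (just below clipping); each stop
    further down covers half the light and so half the code values.
    """
    top = MAX_CODE / 2 ** (stop_below_clip - 1)
    return int(top) - int(top / 2)


def log_codes_in_stop(stop_below_clip):
    """An idealised log curve spends the same number of codes on every stop."""
    return MAX_CODE // STOPS


for stop in (1, 6, 12):
    print(stop, linear_codes_in_stop(stop), log_codes_in_stop(stop))
```

Under these assumptions, the linear encoding gives the brightest stop thousands of code values and the darkest stop almost none, which is why lifting shadows heavily in a linear file can reveal steps between adjacent levels. The log-style curve shares the codes out evenly.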

And that’s what most people expect raw to be. It’s been implemented on more or less every DSLR since the late 90s, and that’s the source of the first real issue: more or less every manufacturer does it differently, and there’s no technical reason why. It’s not particularly difficult to write a long list of sample values to a file and store it on a flash card. It’s objectively less computing work than creating a JPEG file. Some additional data, such as white balance, has to be included in raw files so that they can be decoded correctly into a viewable image, but most of this is fairly straightforward and it’s hard to see why it requires engineering that’s custom to every major manufacturer.

Standardized DNG

As early as 2003, Adobe had begun work on the DNG (“digital negative”) format, with the idea that it should allow raw images to be stored in a standardized format. It was arguably too late; many manufacturers, including Nikon and Canon, eschewed DNG in favor of the in-house designs they were already using. By and large, they still do; the DNG format was not adopted nearly as widely as had been hoped, and many stills cameras still use a proprietary format.

With grim inevitability, similar things applied to moving image cameras. The short-lived Dalsa Origin was announced at NAB 2003, around the same time Adobe started to consider DNG, and… didn’t record to that format. Neither did Arri’s early digital cameras. At that time, there wasn’t any practical way to record 24 uncompressed files every second onto a flash card in the back of the camera, so recording was generally to a big box of hard disks on the other end of a cable, but the file format consideration is the same. Various Alexa models have long been able to record ArriRaw internally – but none of these things natively produce DNG or any other standard format. Blackmagic’s Ursa series actually did, at least initially, though that option has vanished from more recent firmware.

Most of the time all of these things do conceptually the same job, offering all the benefits that raw provided to stills people, though the sheer data load was vast. Since the early 2000s, the world market for flash storage has provoked a lot of progress in terms of capacity, speed and cost, though much of that has been offset by the push for ever-increasing resolution and frame rates. The solution, perhaps predictably, was compression. Arguably, a raw frame isn’t really an image, since it isn’t viewable without quite a lot of processing, but we certainly have three blocks of photographic samples that can be compressed using more or less the same techniques we’d use on images.
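To put a rough number on that data load, the Python sketch below estimates the uncompressed rate for a single-chip camera storing one sample per photosite. The resolution, bit depth and frame rate are illustrative assumptions, not figures for any particular camera, and file headers and metadata are ignored.

```python
def raw_data_rate_mb_per_s(width, height, bits_per_sample, fps):
    """Uncompressed raw data rate in megabytes (10^6 bytes) per second.

    Assumes one sample per photosite and ignores headers and metadata.
    """
    bits_per_frame = width * height * bits_per_sample
    return bits_per_frame * fps / 8 / 1_000_000


# Hypothetical 4096x2160 sensor, 12-bit samples, 24 frames per second.
rate = raw_data_rate_mb_per_s(4096, 2160, 12, 24)
print(f"{rate:.1f} MB/s")
```

Doubling the frame rate doubles that figure, and doubling the resolution in both dimensions quadruples it, which is how storage progress gets eaten up so quickly.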

Trading Disk Space

DNG had always offered lossless compression, but version 1.4 included provision for compression that traded (a small amount of) quality for (a lot of) disk space. That wasn’t until 2012, though, by which time it was already clear the format might not become ubiquitous. Compressed raw had been introduced some time previously by both Cineform and Red, both using wavelet compression. All of them, though, were still proprietary formats, compatible only with software anointed by their developers (though Cineform’s codec would later become open source). Compressed raw was quickly entrenched, with most of the benefits of raw intact – albeit caveated by the exigencies of compression.

Given all this, it’s perhaps no surprise that there is sometimes confusion about what raw really means. Users sometimes work on the basis that “raw” means “uncompressed” when that’s certainly not always true. Sometimes, raw can be very heavily compressed. Back when even standard definition was difficult, broadcast-quality pictures were compressed using the JPEG algorithm at perhaps 3:1. Some modern raw formats may use up to 18:1, depending on resolution and frame rate, and while the mathematics have become cleverer, they probably haven’t become six times cleverer.
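The “six times” figure is simply the ratio between the two compression ratios. The short Python sketch below applies both to a hypothetical 12 MB uncompressed frame; the frame size is an assumption chosen only to make the arithmetic easy to follow.

```python
def compressed_size(original_bytes, ratio):
    """Size after compression at a given ratio, e.g. ratio=3 for 3:1."""
    return original_bytes / ratio


frame_bytes = 12_000_000                        # hypothetical uncompressed frame
jpeg_era = compressed_size(frame_bytes, 3)      # 3:1, as in early broadcast JPEG
modern_raw = compressed_size(frame_bytes, 18)   # 18:1, the heaviest ratio cited above

# How many times smaller the 18:1 frame is than the 3:1 frame.
relative = jpeg_era / modern_raw
print(jpeg_era, modern_raw, relative)
```

In other words, at 18:1 the codec has to represent the same frame in one sixth of the data that a 3:1 encode would use.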

Yes, it’s nice to be able to make lengthy recordings of raw material to inexpensive media, and yes, it’s nice to be able to handle that material without needing massive hard disk arrays in post production, and yes, it’s nice to be able to record high frame rates without making special provisions, but users should know what they’re getting into.

Increasing Flash Speeds

Is it technically possible to solve these problems? Sure. As flash becomes ever faster and more capacious, we might expect to find less need for compression. But even if we do use compression, it’s hard to see why proprietary approaches are really needed. Most of the mathematical techniques used in image compression are public. Meanwhile, processing the raw data into a finished image is a target for lots of original research, but it’s still something that’s quite widely discussed in scientific literature. While companies might add proprietary tweaks to both compression and processing, it’s not clear that the improvements are enough to offset the convenience of broader compatibility.

Evidence of that is available in the form of the open source program dcraw, or in the derived library LibRaw, based on much the same code. It implements algorithms to generate viewable final images from raw files recorded by a wide variety of digital cameras, somewhat torpedoing the mystique that each manufacturer’s sensor technology requires special handling. It is difficult to objectively assess the quality of this sort of software, since there are many compromises involved. Some software might prioritise sharpness, while other code might prioritise minimising aliasing and colored moire patterning, which are potential flaws in raw image processing. Commercial implementations of this sort of software are possibly more about expressing a particular choice on that spectrum of compromise than they are about any game-changing innovation. If such a thing were possible, one company’s software might well process another company’s raw data perfectly acceptably.

Raw Format Unification

As a result, the idea of one unified raw recording standard was never beyond the realm of technical possibility. That’s what Cineform Raw tries (or tried) to be; it’s also what ProRes Raw tries to be, and it’s what Blackmagic Raw tries to be. There’s nothing wrong with any of them; the problem is that they exist simultaneously. There’s no question that these technologies are effective, and they’re capable of sufficiently light compression (5:1 in the case of what Blackmagic calls Q0) that it’s hard to argue for a serious image quality problem.

So, when we encounter a raw setting on a modern camera, no, it may not be the uncompressed, untouched sensor data that the term originally implied. Compatibility issues have, perhaps, been boiled down to a choice of two mainstream options plus a smattering of pretenders, but most are compressed and some do a degree of processing in camera. Almost unavoidably, this represents at least some blurring of what “raw” really means. The upside is that it’s made things a lot easier, particularly given the push toward very high-resolution workflows; 8K and 12K cameras would be almost impossibly cumbersome and expensive without fairly high ratio compression. The question of whether the resulting files reliably contain 8K or 12K of data is hard to answer.

Perhaps the most powerful argument is that compression and raw processing algorithms are both notoriously difficult to objectively assess. The same could always be said of film, of course, and if modern raw recording allows us to get past the numbers and be more accepting of a completely subjective assessment of what the pictures look like, it’s hard to object.
