The Meaning Of Metadata

Metadata is increasingly used to automate media management, from creation and acquisition to increasingly granular delivery channels and everything in between. There's nothing much new about metadata, which predated digital media by decades, but it is poised to become pivotal in broadcast technology's current phase of rapid evolution.

From the earliest days of film and professional audio, metadata has been part of creative people's workflows, even if it wasn't talked about as such. The moment you put a strip of film in a "bin," write a label on a reel of tape, or even put it on a particular shelf, you're creating metadata. Many of our current working methods originated in the analog world, and we needed the administrative techniques to be efficient even back then.

In those pre-digital, pre-database days, we didn't call it metadata at all; we just called it "organization." As computers inexorably became part of our working lives, we clung to pre-digital concepts like files, filing cabinets, folders, and even desktops. We did this with good reason; these ideas were effective.

Eventually, we would find good reasons to change them, but only when these old-world concepts started to constrain digital organization and data manipulation.

At the risk of meandering through a philosophical wormhole, it's worth looking more closely at what we mean by "metadata" and what exactly it is. This is not to suggest that our current understanding of metadata is flawed in some way but to see how far we can expand the scope of metadata and perhaps use it in new and fruitful ways.

"Meta" means, essentially, "beyond", as in metaphysics and the (very topical) metaverse. "Metaphysics" describes things beyond the boundaries of physics (religion, aesthetics, etc.), and the "Metaverse" is what the internet is likely to evolve into. You find similar uses in words like "Metamorphosis" and "Metabolic", which ultimately refer to changes beyond the initial state of something. (So the idea of "beyond" refers more to a temporal and physical relationship than to one simply of abstraction.)

When we talk about digital media, and all the activities surrounding its production and distribution, we tend to use the notion that metadata is "data about data." You can't really argue with that, except that it doesn't tell us very much. What we do know is that metadata can be a transformative element in modern media workflows, making them smarter, more adaptable, and more robust.

Here's a simple example to illustrate that.

In media asset management, the idea is essentially to take assets (video, still photography, documents, audio, etc.), store them somewhere, and eventually retrieve them. As MAM systems grow, they often incorporate facilities like transcoding and other processes such as adding subtitles, potentially in several languages.

A chunk of media with no associated metadata can only be treated as a closed box. With no idea what's in it, all the system can do is deliver it. The more metadata you have, the more you can do with the payload inside the box. In this context, metadata lets you see what's in the box without opening it. One distinction might be, "Is this a proxy file, or is it a full-resolution file?" If the box is labelled "proxy", then the system will know to treat it as such (for example, by not sending it down a path that would lead to it being transmitted to end users).

The more metadata, the smarter the system gets. Ultimately, you could design a system where the content contains sufficient information to "find its own way" to its intended destination, complete with processing like transcoding during its journey.
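To make the closed-box idea concrete, here is a minimal Python sketch of a routing decision driven purely by a label. Everything in it is hypothetical: the `Asset` class, the `kind` key and the route names are invented for illustration and do not come from any particular MAM product.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A piece of media plus whatever metadata travels with it (hypothetical)."""
    name: str
    metadata: dict = field(default_factory=dict)


def choose_route(asset: Asset) -> str:
    """Pick a processing path from the label alone, without opening the box."""
    kind = asset.metadata.get("kind")
    if kind is None:
        # A closed box: with no metadata, all the system can do is deliver it.
        return "deliver_as_is"
    if kind == "proxy":
        # Proxies must never reach a path that ends at the viewer.
        return "editing_only"
    if kind == "full_resolution":
        return "transcode_then_playout"
    # A label the system does not recognise is handed to a person.
    return "manual_review"
```

An asset labelled `{"kind": "proxy"}` would be routed to `editing_only`, while an unlabelled one falls back to plain delivery. The richer the label, the more branches a function like this can have.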

Metadata is typically associated with moving media, even if that media is in storage, because, somehow, it got there, and somehow, it needs to get from there to whoever needs it. Metadata is information, so how do you move information? Does that information exist in one place, or is it, like an idea, independent of location and, essentially, everywhere at once?

How do you move the number 7 from London to New York? You can't. That's a category mistake. You can't weigh it or measure its dimensions. It simply exists everywhere at the same time. It's a concept. You can't put the idea of "Red" in a box and ship it.

Think about a history book about an ancient battle. The book doesn't "contain" the battle: the battle took place in the past, somewhere other than where the book is. The book isn't the battle; it is about the battle. There are no real soldiers fighting between the pages.

But if the book is the only physical record of the battle, you need to care for it when you move it.

So, metadata exists in a kind of duality. It is a bunch of concepts that don't exist in any particular place, yet at the same time it can be as rare and fragile as the Crown Jewels, depending on its designated value and the precariousness of its storage medium.

In practical terms, you can often break down and simplify metadata by structuring the process that you're describing. In this sense, structure is also metadata, but it's built into the entity being described. You don't need to write the word "tree" on every tree because you can see it's a tree.

You might never have considered the distinction between "Denotation" and "Connotation". These are relevant here. The word "tree" denotes the thing itself (i.e. a component of a forest): the object we would pick out if we were looking for somewhere to stick a label that says "Tree". The connotation is everything the word implies: branches, leaves, bark, roots, etc. If you understand the connotation of a label, you only need to denote a package; you don't have to describe it fully, because that information will be in a database or look-up table somewhere.
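As a rough sketch of that look-up idea, the short label on the package (the denotation) can be a key into a table that holds the fuller description (the connotation). The table contents and the `expand_label` helper below are assumptions made up for this example.

```python
# Hypothetical look-up table: the label is the denotation,
# the stored profile is the connotation.
LABEL_PROFILES = {
    "proxy": {"resolution": "low", "deliverable": False},
    "full_resolution": {"resolution": "native", "deliverable": True},
}


def expand_label(label: str) -> dict:
    """Return the full description that a short label stands for."""
    profile = LABEL_PROFILES.get(label)
    if profile is None:
        raise KeyError(f"unknown label: {label!r}")
    # Return a copy so callers cannot alter the shared table by accident.
    return dict(profile)
```

The package itself only ever carries the word "proxy"; anything that needs the detail asks the table.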

Think of a freight ship. A ship dedicated to carrying cars, frozen fish, electronics, or even fresh produce is easy to label. You need one label because wherever you look on the ship, you'll find the same thing. You may need to associate specific handling instructions with that label (frozen fish can't be allowed to thaw out when it's unloaded; fresh food needs prompt onward shipment, etc.), but that is all the detail you need.

Container ships are different. It's easy to handle containers, but they could contain anything, so the labels have to go to another level, where each container needs its own specific instructions.

However, some containers might not have uniform cargo. They might be loaded with individual packages (teddy bears and lawnmowers, for example), and these parcels would need individual labels.
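One way to picture these nested levels of labelling is as layers of metadata in which the most specific label wins. The sketch below is only an illustration of that idea; the `resolve_labels` function and the field names are invented for the purpose.

```python
def resolve_labels(*levels: dict) -> dict:
    """Merge labels from the most general level to the most specific.

    Later (more specific) levels override earlier ones, so a parcel's
    own label takes precedence over its container's, and so on.
    """
    resolved: dict = {}
    for level in levels:
        resolved.update(level)
    return resolved


# Hypothetical labels at each level of the shipping analogy.
ship = {"cargo": "mixed", "handling": "standard"}
container = {"cargo": "toys and garden tools"}
parcel = {"cargo": "lawnmower", "handling": "heavy"}
```

Calling `resolve_labels(ship, container)` keeps the ship's handling instructions but takes the container's description of the cargo, while adding `parcel` as a third argument overrides both fields.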

You can immediately see how helpful metadata is in all walks of life. In broadcasting, the equivalent of the shipping container is a standardized container format (like MP4), which conveys compressed digital media and is largely agnostic about the specific formats of what it carries.

Metadata opens up file and content transfer from being one-dimensional (let's say with a Media Asset Management system at each end of a wire) to being multidimensional. This means that the more you know about the media (through its metadata), the more intelligent ways you have to manage and orchestrate it. You can set up conditional workflows that apply quality control and send different content to different destinations; you can regionalize, personalize and customize, as well as add new levels of security.

You can orchestrate media movement because each piece of media essence carries associated metadata that informs the system where it should be going. There's no limit to this. If your metadata is detailed enough, you can achieve an extraordinary level of automation.
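A conditional workflow of this kind might be sketched as a list of rules evaluated against each asset's metadata. This is a simplified illustration, not a description of any real orchestration product: the metadata keys (`qc_passed`, `region`, `rights`) and the destination names are all assumptions.

```python
from typing import Callable, List, Tuple

# Each rule pairs a test on the metadata with a destination (all hypothetical).
RULES: List[Tuple[Callable[[dict], bool], str]] = [
    (lambda m: m.get("region") == "EU", "eu_playout"),
    (lambda m: m.get("region") == "US", "us_playout"),
    (lambda m: m.get("rights") == "restricted", "secure_archive"),
]


def plan_destinations(metadata: dict, rules=RULES) -> List[str]:
    """Decide where an asset should go, based only on its metadata."""
    # Quality control acts as a gate: nothing moves on until it has passed.
    if not metadata.get("qc_passed", False):
        return ["qc_hold"]
    matched = [destination for test, destination in rules if test(metadata)]
    # If no rule applies, a person decides rather than the system guessing.
    return matched or ["manual_review"]
```

With this shape, adding a new region or a new security tier means adding a rule rather than rewiring the workflow, which is where the automation comes from.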

How do you achieve this practically?

Here are two common approaches.

You can have a giant, overseeing database of all metadata.

Or have an object-oriented system where all media essence carries its own metadata in its container or a sidecar file (a file containing metadata that is always associated with the media it is describing).

Each approach has its advantages, but which is better depends on the complexity of the metadata. Structured databases can become unwieldy when presented with a data schema that is too granular. Systems where the "payload" can find its own way through the maze can be more robust, but it is hard and perhaps foolhardy to generalize.
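The two approaches can be contrasted in a few lines of Python. This is a minimal sketch under stated assumptions: the sidecar is taken to be a JSON file named after the media file with `.json` appended, and the `CentralCatalog` class stands in for a real database. Neither convention comes from a standard.

```python
import json
from pathlib import Path


def sidecar_path(media_path: Path) -> Path:
    """Assumed naming convention: clip.mp4 -> clip.mp4.json."""
    return media_path.with_name(media_path.name + ".json")


def write_sidecar(media_path: Path, metadata: dict) -> Path:
    """Store metadata in a file that sits next to the media it describes."""
    path = sidecar_path(media_path)
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


def read_sidecar(media_path: Path) -> dict:
    """Read the sidecar back; media without one is a closed box."""
    path = sidecar_path(media_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


class CentralCatalog:
    """Stand-in for the overseeing database: metadata lives apart from the media."""

    def __init__(self) -> None:
        self._records: dict = {}

    def register(self, asset_id: str, metadata: dict) -> None:
        self._records[asset_id] = dict(metadata)

    def lookup(self, asset_id: str) -> dict:
        return self._records.get(asset_id, {})
```

With the sidecar, the metadata goes wherever the file goes, provided the two are always moved together. With the catalog, the file can travel light, but every system that handles it needs access to the database.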

Most metadata is currently generated at ingest and can be as fine-grained as necessary, even to the level where it records what the sound supervisor had for breakfast. Today's AI developments mean that metadata is likely to be generated in every segment of a workflow, which will make an almost unimaginable difference to its usefulness.

Eventually, digital media essence and metadata will merge, creating a kind of singularity in which frame rates and resolution become irrelevant. My prediction is that this will happen sometime between the next century and next week.
