Storing & Optimizing M&E Content in a Big Data-Fast Data World

The storage requirements for digital media never stop growing. Image sizes and resolutions increase. Metadata is added to every frame. And the data must live forever. All these factors and others combine to place serious demands on a production or broadcast facility’s storage infrastructure. Fortunately, there is a solution.

Film studios and post-production houses are constantly challenged as technology evolves. Video resolutions have advanced to 4K, and 8K is rapidly emerging. Frame rates are also increasing, transitioning from 30fps (at 2K resolution) to 60fps (at 4K resolution). As camera capture technology evolves, additional storage capacity and bandwidth will be required to support these higher resolutions and frame rates. If a studio moves from 2K at 30fps to 4K at 60fps, approximately eight times as much data must be stored and streamed to support the change. Looking at it another way, 4K digital cameras can produce about 2.63TB of data per hour, and emerging 8K video can generate 100TB per hour. Ingesting petabyte-scale content requires a new approach to storage tiering, one that protects valuable footage and enables both global collaboration and data analysis.
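As a rough sanity check on those figures, the eight-fold jump falls directly out of the pixel and frame-rate arithmetic. The sketch below assumes uncompressed 8-bit RGB frames, so its absolute TB/hour numbers differ from the compressed camera rates cited above, but the 4K/60fps-to-2K/30fps ratio holds regardless:

```python
# Rough data-rate math for uncompressed footage. Illustrative assumptions:
# 2K ~ 2048x1080 and 4K ~ 4096x2160, so a 4K frame has 4x the pixels of 2K,
# and doubling the frame rate doubles the data again -- 8x in total.
def hourly_terabytes(width, height, fps, bytes_per_pixel=3):
    """Approximate uncompressed data per hour of footage, in TB."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second * 3600 / 1e12

tb_2k = hourly_terabytes(2048, 1080, 30)
tb_4k = hourly_terabytes(4096, 2160, 60)
print(f"2K/30fps: {tb_2k:.2f} TB/hour")
print(f"4K/60fps: {tb_4k:.2f} TB/hour ({tb_4k / tb_2k:.0f}x the data)")
```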

As digital film content continually grows in number and size, data storage capabilities must expand at similar rates, accounting for the ever-changing content activity in the production workflow while protecting data that must last forever. The way studios approach their storage strategies is becoming increasingly important, and once implemented, these strategies can be optimized by big data and fast data to deliver added value, intelligence and unexpected outcomes. It is no longer just about saving film or digital assets, but about understanding what has been saved, how to access it, and how to extract value from it. To reach that data storage nirvana in film production, a tiered strategy is required that best matches the content to the storage medium on which it is stored.

The Traditional M&E Storage Workflow

In the classic tiered enterprise storage pyramid model, data that is accessed continuously is the most important and considered hot, or tier 0, data. It requires very fast, high-performance media to store and retrieve. Enterprise flash SSDs and all-flash arrays (AFAs) are commonly used as tier 0 storage. When tier 0 content is accessed less frequently, it moves to warm status, or tier 1, and is typically housed on more cost-efficient, slower-performing media such as enterprise hard drives. Content that is accessed only occasionally becomes cold, or tier 2, and is typically stored on even slower, lower-priced media such as commercial hard drives, until it becomes really cold, or tier 3, and is moved to very inexpensive media, such as tape, for archival.
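The hot/warm/cold policy above can be sketched as a simple lookup from access recency to tier. The day thresholds here are illustrative assumptions for the sake of the example, not industry standards:

```python
# A minimal sketch of the tiering policy described above.
# The day thresholds are illustrative assumptions, not industry standards.
TIERS = [
    (7,   "tier 0: enterprise flash SSD / AFA"),  # hot: accessed continuously
    (30,  "tier 1: enterprise hard drives"),      # warm: less frequent access
    (180, "tier 2: commercial hard drives"),      # cold: occasional access
]

def assign_tier(days_since_last_access):
    """Map content (by days since last access) onto a storage tier."""
    for max_days, tier in TIERS:
        if days_since_last_access <= max_days:
            return tier
    return "tier 3: tape archive"                 # very cold: read rarely

print(assign_tier(1))    # fresh footage lands on tier 0
print(assign_tier(365))  # year-old content heads to tape
```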

When raw footage is captured by a camera, that content is as hot as it gets (Figure 1). It is moved from embedded storage (SD and microSD cards) within the camera to enterprise SSDs or HDDs as tier 0 storage. From the storage device, the footage goes to a digital imaging technician (DIT) cart that immediately makes two to three copies of the footage for post-production use associated with editing, transcoding, controlling image quality, performing on-set color correction, adding audio and special effects like 3D, collaborating in real-time, and troubleshooting.
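The duplication step can be illustrated with a checksum-verified copy loop, the basic safeguard behind DIT offload workflows. This is a minimal sketch, not a real DIT tool; the file names are hypothetical stand-ins for a camera card and its copies:

```python
# Sketch of the DIT-cart duplication step: copy a camera card to multiple
# destinations and verify each copy against a checksum of the source.
# Illustrative only -- not a real DIT application.
import hashlib
import os
import shutil
import tempfile

def sha256(path):
    """Checksum a file in 1MB chunks (camera files won't fit in RAM)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def duplicate(source, destinations):
    """Copy footage to several destinations, verifying every copy."""
    reference = sha256(source)
    for dest in destinations:
        shutil.copyfile(source, dest)
        if sha256(dest) != reference:
            raise IOError(f"verification failed for {dest}")
    return reference

# Usage with throwaway files standing in for the card and two copies.
workdir = tempfile.mkdtemp()
card = os.path.join(workdir, "card.mov")
with open(card, "wb") as f:
    f.write(b"raw footage bytes")
copies = [os.path.join(workdir, n) for n in ("copy_a.mov", "copy_b.mov")]
checksum = duplicate(card, copies)
print("all copies verified:", checksum[:12])
```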

Once the post-production work is complete, the production footage is moved to tier 1 storage and considered warm, as further revisions can still be made to the content. To protect these post-production revisions, daily back-ups to tape or capacity disk are performed; this is tier 2 storage. As the content grows colder and no longer needs to be modified, only read from time to time, it moves to tier 3 storage, for which tape has been widely used.

Figure 1. Traditional tiered M&E storage model. Click to enlarge.

The traditional M&E storage process presents many challenges. The main problem is the speed required to ingest petabyte-scale content to the DIT cart or workstation for post-production. Flash-based SSDs or HDDs used in this tier 0 configuration are RAID-based, do not scale well and are not optimal for such large workflows. If the storage media cannot properly manage the influx of content, dropped frames can occur, impacting the technical quality of the film and causing on-screen distractions and a flawed viewing experience.

Additionally, if a hardware or data integrity issue occurs, it could take weeks to rebuild the content, negatively affecting workflow productivity and production schedules. The risk of content loss skyrockets if a subsequent issue occurs. When tape is used to archive data, content may become unreadable or difficult to access. As such, to ingest petabyte-scale content requires a new approach to storage tiering.

The New M&E Storage Tiering Model

There are two key adjustments that can improve the traditional tiered M&E storage model (Figure 2). The first is deploying high-performance SSDs based on the NVMe (Non-Volatile Memory Express) standard, which have become a popular choice for ingesting film content for post-production use. Because NVMe was designed specifically for flash media, it delivers significant improvements in latency and throughput compared to hard drives or legacy SSDs based on SCSI commands. Its streamlined memory interface, command set and queue design bypasses much of the legacy storage stack to deliver significantly faster performance than traditional interfaces, enabling petabyte-scale film assets to be stored on fewer devices with smaller physical hardware footprints.
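The practical impact of device throughput on ingest time shows up in simple arithmetic. The sustained-throughput figures below are rough, illustrative assumptions (real devices vary widely with workload and configuration):

```python
# Back-of-the-envelope ingest times at different sustained throughputs.
# The GB/s figures are illustrative assumptions, not benchmarks: NVMe SSDs
# commonly sustain several GB/s, SATA SSDs roughly 0.5 GB/s, and
# enterprise HDDs roughly 0.2 GB/s for sequential workloads.
def ingest_hours(dataset_tb, throughput_gb_per_s):
    """Hours to ingest a dataset at a sustained sequential throughput."""
    return dataset_tb * 1000 / throughput_gb_per_s / 3600

dataset_tb = 100  # e.g. an hour of 8K footage, per the figure cited above
for name, gbps in [("NVMe SSD", 3.0), ("SATA SSD", 0.5), ("HDD", 0.2)]:
    print(f"{name:>8}: {ingest_hours(dataset_tb, gbps):6.1f} hours")
```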

Figure 2. The new tiered M&E storage model. Click to enlarge.

The second adjustment to the tiered storage structure comes when the post-production work completes and data is moved accordingly. Instead of placing file-based content in a NAS or SAN environment, migrating it to a production workstation for editing, and then archiving it, the new paradigm is simply to place all of this content in an object storage system, which becomes the pivot point for everything and bypasses the need for other storage media.

Object storage is an architecture that stores unstructured data as objects, whether a document, film, video, audio, image, photo, etc., and includes metadata that provides descriptive information about the object and the data itself. Since object data and metadata can be placed in a flat address space, the need for a hierarchical file structure is eliminated, simplifying data access. Since metadata is defined by users, data analytics and other discovery techniques can be enabled, and the film assets can also be aggregated to deliver very efficient capacity scaling.
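The flat, metadata-keyed model described above can be illustrated with a toy in-memory store. This is not a real object storage API; every class and method name here is a hypothetical illustration:

```python
# A toy illustration of the flat, metadata-rich object model described
# above -- not a real object storage API.
import uuid

class ObjectStore:
    def __init__(self):
        # Flat address space: object ID -> (data, metadata). No folders.
        self._objects = {}

    def put(self, data, **metadata):
        """Store an object with user-defined metadata; return its ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Retrieve (data, metadata) directly by ID -- no hierarchy walk."""
        return self._objects[object_id]

    def find(self, **criteria):
        """Query by metadata -- the hook that enables later data analysis."""
        return [oid for oid, (_, md) in self._objects.items()
                if all(md.get(k) == v for k, v in criteria.items())]

store = ObjectStore()
oid = store.put(b"<raw frames>", title="Scene 12", codec="ProRes",
                resolution="4K")
print(store.find(resolution="4K") == [oid])  # metadata query finds it
```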

Global Collaboration Requires Fast Data

Collaborating with team members and partners is also considered a workflow in the film production process requiring seamless and immediate access to all content at a global scale. File-based and block-based storage systems of the past have created silos of data as storage capacity grew, making it difficult to ensure a consistent global view to all data, and fast access to it. Leveraging object storage (on-premises or in the cloud) can help to eliminate these challenges and enable fast data to deliver the speed and performance to meet the global collaboration objective.

Data Analysis using Big Data

Many studios and post-production houses have accumulated vast libraries of film assets over time but have not taken advantage of the buried treasures that may lie within. This abundance of data can be mined for golden nuggets of importance, or analyzed as part of a big data application to extract further value, intelligence, predictions, associations or some other desired outcome. As such, the content must be stored without losing any of it.

Object storage is where the film industry is headed. These systems can spread data across three locations, delivering protection similar to the triple mirroring model while storing only about one-third of the object data in each location. They can also detect and self-heal sector-level bit errors in the background, using erasure coding and data-scrubbing technologies to achieve up to 19 nines of data durability. The result is data with high integrity, delivering very precise analytic outcomes.
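The capacity trade-off between triple replication and erasure coding reduces to a ratio. The 6+3 shard layout below is an illustrative assumption for a three-site deployment, not the scheme any particular product uses:

```python
# Storage overhead: triple replication vs. erasure coding.
# The 6 data + 3 parity layout is an illustrative assumption.
def replication_overhead(copies=3):
    """Raw capacity consumed per unit of usable data, full copies."""
    return float(copies)  # 3x raw for 1x usable

def erasure_overhead(data_shards, parity_shards):
    """Raw capacity consumed per unit of usable data, erasure coded."""
    return (data_shards + parity_shards) / data_shards

print(f"Triple replication: {replication_overhead():.2f}x raw capacity")
print(f"Erasure coding 6+3: {erasure_overhead(6, 3):.2f}x raw capacity")
# With 6+3 shards spread evenly across three sites, each site holds only
# one-third of the stored shards, and any single site can be lost while
# the object remains reconstructible from the six surviving shards.
```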

Final Thoughts

Studios and post-production houses are collecting ever higher-resolution content at every stage of film production. The workflow is growing exponentially and becoming more complex and compute-intensive, requiring large-scale storage and processing as well as efficient infrastructure solutions that maintain comprehensive digital libraries and support a global organization. The answer lies in NVMe-based storage media to ingest film content for post-production use, and object storage for everything else, from moving a film asset to a production workstation for editing, to archiving it, to enabling global collaboration and data analysis. Ingesting and processing this much film content requires a different approach to data storage and a new tiering structure.

Erik Weaver is the Director of Product Marketing for media and entertainment solutions within the Data Center Systems group of Western Digital Corporation.
