Industry studies suggest that only about 20% of all digital data is ever accessed or used again after it is stored. Yet, the demand to produce more content is driving many media managers to seek better ways to store, search and retrieve enormous volumes of archival content to help build productions. A key partner technology in this quest should be active archiving.
The Active Archive Becomes Integral to Your Game Plan
Relentless Data Growth
Newly created worldwide digital data is expected to grow at 30% or more annually, reaching 163 zettabytes (1 ZB = 10^21 bytes) by 2025, according to a report from IDC. Data is rapidly piling up in archives as retention requirements of up to 100 years to forever are now common. The top external factors driving archival and long-term retention requirements include government compliance regulations, growing dependency on security and surveillance systems, advanced 3D and 4D video capabilities, content producers in Media & Entertainment, the relentless growth of Big Data analytics, and the emerging IoT (Internet of Things).
Most data reaches archival status in 90 days or less. IDC estimates that by the end of 2025, only 15% of the data in the global data-sphere of 163 zettabytes will be tagged and only 20% of that will be analyzed. Therefore, if 80% of the data created is never analyzed, that data will likely reach archival status upon creation.
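The IDC percentages quoted above can be turned into absolute volumes with simple arithmetic. The short sketch below is illustrative only: the variable names are ours, and it assumes the 20% applies to the tagged portion of the data, as the sentence reads. Under that reading, the analyzed share is even smaller than the 20% used in the sentence above, which only strengthens the point.

```python
# Illustrative arithmetic using the IDC figures quoted in the text.
# Assumption: "20% of that" refers to the tagged data, not to all data.
TOTAL_ZB = 163.0            # projected global datasphere in 2025, in zettabytes
TAGGED_SHARE = 0.15         # share of all data expected to be tagged
ANALYZED_OF_TAGGED = 0.20   # share of the tagged data expected to be analyzed

tagged_zb = TOTAL_ZB * TAGGED_SHARE
analyzed_zb = tagged_zb * ANALYZED_OF_TAGGED
analyzed_share = TAGGED_SHARE * ANALYZED_OF_TAGGED
never_analyzed_zb = TOTAL_ZB - analyzed_zb

print(f"Tagged:         {tagged_zb:.2f} ZB")
print(f"Analyzed:       {analyzed_zb:.2f} ZB ({analyzed_share:.0%} of all data)")
print(f"Never analyzed: {never_analyzed_zb:.2f} ZB")
```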
Recent analyst surveys indicate that fewer than 40% of corporations have a dedicated archive strategy in place. This reveals a huge, unaddressed and growing archive challenge that can be met with modern archiving concepts.
These enormous data volumes will require that businesses build their storage architectures by optimizing SSD, HDD and tape in tiered storage solutions. The greatest economic advantages of tiered storage occur when tape is included. Classifying your data upon its creation by its value, performance and capacity requirements will enable the right data to be in the right place at the right time.
It is projected that by 2025, 5% of digital data will be stored on Tier 0 SSD, 35% on Tier 1 and Tier 2 HDDs and 60% on Tier 3 tape. Tier 3 is typically referred to as the archive tier, with an average of 60% of data classified as archival upon creation. See Figure 1.
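To get a feel for the scale of each tier, the projected split can be applied to a total volume. The sketch below assumes, purely for illustration, that the 5/35/60 split is applied to the full 163 ZB quoted earlier; the source does not say how much of that data will actually be stored, so the absolute numbers are hypothetical. The function and variable names are ours.

```python
# Illustrative only: applies the projected 2025 tier split to an assumed total.
# Assumption: the full 163 ZB is stored, which the source does not claim.
TOTAL_ZB = 163.0

TIER_SHARES = {
    "Tier 0 (SSD)": 0.05,
    "Tier 1 and Tier 2 (HDD)": 0.35,
    "Tier 3 (tape)": 0.60,
}


def tier_volumes(total_zb: float, shares: dict) -> dict:
    """Return the volume per tier, checking that the shares sum to 100%."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1.0")
    return {tier: total_zb * share for tier, share in shares.items()}


volumes = tier_volumes(TOTAL_ZB, TIER_SHARES)
for tier, zb in volumes.items():
    print(f"{tier:<24} {zb:7.2f} ZB")
```

Changing `TOTAL_ZB` to the capacity of a single facility gives the same breakdown at a scale that is easier to budget against.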
Backup And Archive Are Entirely Different Processes
It’s important to distinguish between backup and archive, as these core IT processes are not the same and are often misunderstood. Many businesses still use backup copies to store archival data and repeatedly back up unchanging archives, wasting HDD space. Backup and archive are entirely different processes and have different objectives.
The backup process creates copies of data for recovery purposes which may be used to restore the original copy after a data loss or data corruption event. Backups are cycled and updated frequently to account for and protect the latest versions of important data assets.
Archiving moves unchanging and less frequently used data to one or more new locations and refers to data specifically selected for long-term retention. Archival data is typically unchanging and is not overwritten. See Figure 2.
Figure 2. An active archive is far more than a simple backup copy or long-term archive storage solution. This is a key difference that needs to be appreciated when selecting a data storage topology.
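The difference between the two processes can be shown with ordinary file operations: a backup copies data and leaves the original in place, while an archive moves data and frees the primary location. The sketch below is a minimal illustration using a temporary directory; the function names, folder names and file names are hypothetical and do not represent any particular product.

```python
import shutil
import tempfile
from pathlib import Path


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy the file for recovery purposes; the original stays where it is."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, backup_dir / source.name))


def archive(source: Path, archive_dir: Path) -> Path:
    """Move the file to long-term storage; the primary location is freed."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(source), str(archive_dir / source.name)))


def demo() -> dict:
    """Run both processes on sample files and report where the data ends up."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        primary = root / "primary"
        primary.mkdir()

        active = primary / "current_edit.mov"       # still changing
        finished = primary / "finished_master.mov"  # unchanging
        active.write_text("work in progress")
        finished.write_text("final cut")

        backup_copy = back_up(active, root / "backup")
        archived_copy = archive(finished, root / "archive")

        return {
            "active_still_on_primary": active.exists(),
            "backup_copy_exists": backup_copy.exists(),
            "finished_still_on_primary": finished.exists(),
            "archived_copy_exists": archived_copy.exists(),
        }


if __name__ == "__main__":
    print(demo())
```

Backing up the archived file again on every cycle would create redundant copies of data that never changes, which is the wasted space described above.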
Building An Active Archive
An active archive integrates SSD, HDD, tape, and the public or private cloud, leveraging the popular tiered storage model that is focused on the archive function. The active archive supports file, block or object storage systems using advanced data management software to maintain end user accessibility to archival data regardless of the storage device it is residing on.
Intelligent data management software provides faster online random access, search and retrieval capability for archival data in a single virtualized storage pool, and automatically migrates data between storage tiers based on user policy. The widespread usage of SSDs and high capacity HDDs coupled with tape’s highly favorable economics, security, and archive characteristics have propelled the successful emergence of active archives. A typical active archive in block diagram form is illustrated in Figure 3.
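Policy-based migration can be pictured as a rule that maps how long an asset has sat idle to the tier where it should live. The sketch below is a simplified illustration, not the logic of any specific data management product: the `Asset` class, the tier names and the 7-day threshold are hypothetical, and the 90-day threshold is borrowed loosely from the observation that most data reaches archival status within 90 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Asset:
    name: str
    tier: str                 # "ssd", "hdd" or "tape"
    last_accessed: datetime


# Hypothetical user policy: (maximum idle time, tier), checked in order.
POLICY = [
    (timedelta(days=7), "ssd"),
    (timedelta(days=90), "hdd"),
]
ARCHIVE_TIER = "tape"  # anything idle longer than the last threshold


def target_tier(asset: Asset, now: datetime) -> str:
    """Return the tier the policy assigns to an asset based on idle time."""
    idle = now - asset.last_accessed
    for max_idle, tier in POLICY:
        if idle <= max_idle:
            return tier
    return ARCHIVE_TIER


def plan_migrations(assets: list, now: datetime) -> list:
    """List (name, from_tier, to_tier) for every asset on the wrong tier."""
    moves = []
    for asset in assets:
        wanted = target_tier(asset, now)
        if wanted != asset.tier:
            moves.append((asset.name, asset.tier, wanted))
    return moves


now = datetime(2025, 1, 1)
assets = [
    Asset("promo.mov", "ssd", datetime(2024, 12, 30)),       # recently used
    Asset("interview.mov", "ssd", datetime(2024, 11, 15)),   # cooling off
    Asset("game_1998.mov", "hdd", datetime(2023, 6, 1)),     # long idle
    Asset("highlights.mov", "tape", datetime(2024, 12, 31)), # recalled
]

for name, src, dst in plan_migrations(assets, now):
    print(f"{name}: {src} -> {dst}")
```

Because the plan only lists assets whose current tier differs from the policy's target, recently recalled content moves back up to faster storage just as idle content moves down to tape.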
Active archiving can use existing storage devices to build an integrated hardware and software solution and can incorporate enhanced file systems such as the open standard LTFS (Linear Tape File System) for Linux, Apple and Windows, or the TAR (Tape Archive) format on Unix systems. For those who do not want to build their own repository using existing equipment, multiple vendors offer preconfigured active archive appliances and file systems that work with almost any tape library back end. A summary of the key benefits of active archiving is shown in Figure 4.
Digital Archives Embrace Object Storage
Archiving is the earliest enterprise use case for object storage, which has provided scalable, long-term data preservation for over a decade. Object storage enables IT managers to organize archival content with its associated metadata into containers, making it easier to retain massive amounts of unstructured data. In July 2017, IBM announced that Spectrum Archive Enterprise Edition V1.2.4, which uses LTFS, connects with OpenStack Swift to enable movement of cold (archive) data from object storage to more economical tape and cloud storage for long-term retention. LTFS now provides a back-end connector for open source SwiftHLM (Swift High Latency Media), a high-latency storage back end that makes it much easier to perform bulk operations using tape within a Swift data ring. Cloud storage is the most prominent use case for object storage.
Enterprises archive their data either because they want to or because they have to, but either way, the magnitude of this requirement can quickly become overwhelming. With the amount of archival data soaring and no end in sight, active archiving is poised to play an increasingly important role in tomorrow’s media centers as it reawakens the archives. It is critical that technical managers and producers now begin leveraging the value of archival data. It’s time to develop your game plan.
Peter Faulhaber, President of Fujifilm Recording Media U.S.A. and Chairman of the Active Archive Alliance.