Welcome to the PetaScale: Are You Ready?

We are creating and saving more data than ever, but we are no longer limited as to where we store that data. Centralized home servers, remote storage facilities, and, of course, “the cloud” in all its incarnations have changed the way we think about saving data, what we save, and for how long. Ultimately, these storage options have also changed how we use and consume that data. Creating and acquiring massive amounts of content has become easier than ever. A terabyte of data was once barely manageable; now a single petabyte seems common.

It was just a few years ago that I got my first one terabyte hard drive in my laptop, and I thought, “How will I ever fill this up?” Six months later I was wondering when I would be able to get a two terabyte drive. Luckily, I never ended up needing it. It is not that I am saving less data; I am just saving it in more places. In fact, more and more people and companies are doing the same.

Managing Complexity
Many organizations must access large volumes of data just to perform day-to-day analysis and tasks. This problem affects departments, divisions, and whole companies. Data comes from a growing number of sources, with each piece of data filed independently. The rate of ingest is overwhelming, and there isn’t enough time in a day to effectively manage the data. Automated data-management tools are only just catching up to this reality, so a significant amount of the data housekeeping responsibility remains a manual task that falls to users. This model is hard to sustain and can even run afoul of companies’ standard operating policies.

The cloud has now become a convenient warehouse for data — much more than a closet — and this shift has put enormous bandwidth demands on the public cloud. At the same time, a growing number of internal private cloud deployments have emerged to meet the demand for cloud-based storage.

A growing number of organizations with petabyte-sized, or "petascale", archives face ever-increasing cost and complexity in maintaining local data and content. Many are turning to the public cloud and specialized public cloud services; others, less comfortable storing all that information in the public cloud, are looking at private cloud solutions. In both cases, organizations have come to realize that, as they move forward, the traditional storage technology used to manage larger data sets at petabyte scale is simply not cost-effective, reliable, or efficient enough.

Relying on traditional RAID technology to maintain long-term archives of active data is not only highly inefficient, but also unsustainable from a power and cooling standpoint. The high-speed disk and CPU-intensive design of tier-one enterprise disk is simply not capable of scaling to several petabytes, not to mention exabyte scale, which seems to be just a few years away.

Data Usage Is Changing
Data usage patterns have also changed. Until recently, most data was at rest, and users were satisfied with storing it on tape. Now, data must be readily available for use.

For example, a reality TV producer could easily ingest and maintain 16 terabytes of content monthly for each show created, and a single producer might manage several shows concurrently. In such a case, it would be desirable not only to keep all stored assets indefinitely, but also to maintain the ability to retrieve those assets quickly for repurposing or analysis.
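To put that in perspective, a quick back-of-envelope calculation shows how fast such an archive reaches petabyte scale. Only the 16 terabytes per show per month figure comes from the example above; the show count and retention period in the sketch below are illustrative assumptions.

```python
# Back-of-envelope sketch of archive growth. The 16 TB per show per month figure
# comes from the example above; the show count and retention period are
# hypothetical assumptions for illustration only.

TB_PER_SHOW_PER_MONTH = 16
SHOWS = 5            # assumption: several shows managed concurrently
YEARS_RETAINED = 3   # assumption: assets kept for at least three years

total_tb = TB_PER_SHOW_PER_MONTH * SHOWS * 12 * YEARS_RETAINED
print(f"Archive after {YEARS_RETAINED} years: {total_tb} TB (~{total_tb / 1000:.1f} PB)")
# Prints: Archive after 3 years: 2880 TB (~2.9 PB)
```

Under those assumptions, a modest production operation crosses the petabyte threshold in roughly a year.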

With capacity needs for single-namespace data sets outstripping tape capacities and with users demanding nearly instantaneous access to stored data, disk was at times the only viable solution. Eventually, however, evolving technology provided a better way to address these storage requirements.

Enter Object Storage
Next-generation object storage helps power the cloud for many service providers today. This technology, which already has been used for reliable satellite communication transmissions in space and for cell phone communications, has now been reengineered for use by commercial cloud service providers as durable, efficient data protection.

Object storage saves data as objects with IDs rather than in traditional hierarchical file structures. This enables large volumes of data to be spread geographically yet remain locally accessible to users in any location at nearly tier-one storage performance.
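As a rough illustration of that data model, the sketch below shows a toy object store in Python: a flat namespace of opaque object IDs with attached metadata, in contrast to a hierarchical file path. It is purely illustrative and does not represent any particular vendor's API.

```python
# Illustrative only: a toy object store with a flat namespace of opaque IDs,
# contrasted with a hierarchical file path. Not any vendor's actual API.
import uuid

class ObjectStore:
    def __init__(self):
        self._objects = {}  # flat namespace: object ID -> (metadata, data)

    def put(self, data: bytes, metadata: dict) -> str:
        object_id = str(uuid.uuid4())        # the store assigns an opaque ID
        self._objects[object_id] = (metadata, data)
        return object_id                     # the caller keeps the ID, not a path

    def get(self, object_id: str) -> bytes:
        metadata, data = self._objects[object_id]
        return data

store = ObjectStore()
oid = store.put(b"...video essence...", {"show": "Episode 12", "codec": "ProRes"})
clip = store.get(oid)  # retrieved by ID, not by a path like /shows/ep12/clip.mov
```

Because every object is addressed by an ID plus metadata rather than a location in a directory tree, replicas can live in multiple sites while presenting the same namespace everywhere.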

Object storage leverages cloud-grade hard drives paired with highly efficient CPUs and memory, all while requiring less power. The new technology has the added benefit of offering more flexible durability standards, meaning that data contained in object storage can actually be more durable than data in a similarly sized RAID 6 storage array, even when the RAID 6 array is mirrored at another location. In addition, some object storage technologies can virtually eliminate the need for the time-consuming and risky migration of data from one platform to another in order to maintain long-term archives.
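The durability claim is easier to see with a rough comparison of overhead and fault tolerance. The sketch below contrasts a mirrored 12-drive RAID 6 array with a hypothetical 10+6 erasure-coded layout of the kind many object stores use; both geometries are illustrative assumptions, not figures from this article.

```python
# Rough comparison sketch of raw-capacity overhead vs. tolerated drive failures.
# The specific geometries (12-drive RAID 6, 10+6 erasure coding) are
# illustrative assumptions only.

def raid6_mirrored(drives_per_array: int):
    usable = drives_per_array - 2                # RAID 6: two parity drives
    raw = drives_per_array * 2                   # mirrored at a second site
    return raw / usable, 2                       # (overhead, failures tolerated per array)

def erasure_coded(data_shards: int, parity_shards: int):
    raw = data_shards + parity_shards
    return raw / data_shards, parity_shards      # (overhead, failures tolerated)

print("Mirrored RAID 6 (12 drives):", raid6_mirrored(12))   # (2.4, 2)
print("Erasure coding (10+6):      ", erasure_coded(10, 6)) # (1.6, 6)
```

Under those assumptions, the erasure-coded layout tolerates three times as many simultaneous drive failures while consuming roughly a third less raw capacity than the mirrored RAID 6 configuration.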

Object storage provides tremendous cost benefits when it comes to fast, convenient access to large volumes of archived data. With the right access technology as the on-ramp to object storage, companies can move toward a primarily cloud-based storage infrastructure. Other companies simply interested in building a private cloud infrastructure gain similar benefits, but with greater control.

The single most cost-effective solution for storage at petabyte scale and beyond remains to be determined. But today, and for the foreseeable future, the combination of private and public cloud object storage with optimized, on-premises, cloud-aware storage seems to be the answer. This is true both for consumers who have access to services that use these technologies and for organizations that, like consumers, can benefit from the scale of the public cloud and the security of the private cloud. As the world’s insatiable thirst for data continues to grow, technology companies will continue to develop approaches to help us manage a petascale lifestyle.

Alex Grossman is vice president, media and entertainment at Quantum.
