Welcome to the PetaScale: Are You Ready?

We are creating and saving more data than ever, but we are no longer limited as to where we store that data. Centralized home servers, remote storage facilities, and, of course, “the cloud” in all its incarnations have changed the way we think about saving data, what we save, and for how long. Ultimately, these storage options also have changed how we use or consume that data. Creating and acquiring massive amounts of content has become easier than ever. A terabyte of data once was barely manageable, and now a single petabyte seems common.

It was just a few years ago that I got my first one-terabyte hard drive in my laptop, and I thought, “How will I ever fill this up?” Six months later I was wondering when I would be able to get a two-terabyte drive. Luckily, I never ended up needing it. It is not that I am saving less data; I am just saving it in more places. In fact, more and more people and companies are doing the same.

Managing Complexity
Many organizations face requirements to access volumes of data just to perform day-to-day analysis and tasks. This problem affects departments, divisions, and whole companies. Data comes from a growing number of sources, with each piece of data filed independently. The rate of ingest is overwhelming, and there isn’t enough time in a day to effectively manage the data. Automated data-management tools are just catching up to this reality, but a significant amount of the data housekeeping responsibility remains a manual task that falls onto users. This model is hard to sustain and even breaks companies’ standard operating policies.

The cloud has now become a convenient warehouse for data — much more than a closet — and this shift has put enormous bandwidth demands on the public cloud. At the same time, a growing number of internal private cloud deployments have emerged to meet the demand for cloud-based storage.

A growing number of organizations with petabyte-sized, or “petascale,” archives are facing ever-increasing cost and complexity in maintaining local data and content. Many are turning to the public cloud and to specialized public cloud services; others, less comfortable storing all that information on the public cloud, are looking at private cloud solutions. In both cases, organizations have come to realize that as they move forward, the traditional storage technology used to manage larger data sets at petabyte scale is simply not cost-effective, reliable, or efficient enough.

Relying on traditional RAID technology to maintain long-term archives of active data is not only highly inefficient, but also unsustainable from a power and cooling standpoint. The high-speed disk and CPU-intensive design of tier one enterprise disk is just not capable of scaling into several petabytes — not to mention exabyte-scale, which seems to be just a few years away.

Data Usage is Changing
Data usage patterns have also changed. Until recently, data would have been mostly at rest, and users were satisfied with storage of data on tape. Now, data must be readily available for use.

For example, a reality TV producer could easily ingest and maintain 16 terabytes of content monthly for each show created, and that producer might manage several shows concurrently. In such a case, it would be desirable not only to keep all stored assets indefinitely, but also to maintain the ability to retrieve those assets quickly for repurposing or analysis.
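To put that ingest rate in perspective, the short Python sketch below estimates how long such a producer would take to accumulate a petabyte. Only the 16 terabytes per show per month comes from the example above; the show counts, the decimal convention of 1,000 terabytes per petabyte, and the function name are assumptions made for illustration.

```python
import math

TB_PER_SHOW_PER_MONTH = 16   # figure quoted in the example above
TB_PER_PB = 1000             # assumption: decimal petabyte


def months_to_reach(target_tb: float, shows: int,
                    tb_per_show_per_month: float = TB_PER_SHOW_PER_MONTH) -> int:
    """Whole months of ingest needed before the archive reaches target_tb."""
    monthly_ingest_tb = shows * tb_per_show_per_month
    return math.ceil(target_tb / monthly_ingest_tb)


# Hypothetical show counts; the article only says "several".
for shows in (1, 4, 8):
    months = months_to_reach(TB_PER_PB, shows)
    print(f"{shows} show(s): about {months} months to reach 1 PB")
```

Under these assumptions, a producer running four shows at once would cross the petabyte mark in well under two years if nothing is ever deleted.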

With capacity needs in single namespace data sets outstripping tape capacities and with users demanding nearly instantaneous access to stored data, disk was at times the only viable solution. Eventually, however, evolving technology provided a better way to address storage requirements.

Enter Object Storage
Next-generation object storage helps power the cloud for many service providers today. The underlying technology, which has already been used for reliable satellite transmissions in space and for cell phone communications, has now been reengineered for use by commercial cloud service providers as durable, efficient data protection.

Object storage saves data as objects with IDs rather than in traditional hierarchical file structures. This enables large volumes of data to be spread geographically yet remain locally accessible to users in any location at nearly tier-one storage performance.
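The difference between an ID-addressed object and a file in a folder tree can be sketched in a few lines of Python. This is a minimal in-memory illustration, not any vendor's API: the class name, its methods, and the choice to derive the ID from a SHA-256 hash of the content are all assumptions, and real object stores may assign IDs differently.

```python
import hashlib
from typing import Optional


class FlatObjectStore:
    """Hypothetical in-memory stand-in for an object store (illustration only)."""

    def __init__(self) -> None:
        # One flat namespace: object ID -> (data, metadata). No directories.
        self._objects = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        # Assumption: the ID is derived from the content itself.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
clip_id = store.put(b"raw footage bytes", {"show": "example-show", "episode": 3})
print(clip_id[:12], store.metadata(clip_id))
```

Because the caller holds only an ID, the store is free to decide where the bytes physically live, which is what makes geographic spreading possible without changing how users ask for their data.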

Object storage leverages cloud-grade hard drives, paired with highly efficient CPUs and memory, while requiring less power. The new technology has the added benefit of offering more flexible durability standards, meaning that data contained in object storage can actually be more durable than that in a similarly sized RAID 6 storage array, even when the RAID 6 array is mirrored at another location. In addition, some object storage technologies can virtually eliminate the need for the time-consuming and risky migration of data from one platform to another in order to maintain long-term archives.
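A rough way to see how that durability claim can hold is to compare raw capacity consumed and the number of device losses each layout is guaranteed to survive. The sketch below assumes the flexible durability comes from spreading each object across data and parity fragments (erasure coding), which the article does not name explicitly. The 8+2 RAID 6 geometry and the 12+6 spread are hypothetical values chosen for illustration, and counting tolerated losses is only a proxy for durability, since it ignores rebuild times and correlated failures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    data: int        # disks or fragments holding data
    parity: int      # redundant disks or fragments
    copies: int = 1  # number of full mirrored copies

    def raw_per_usable(self) -> float:
        """Raw capacity consumed per unit of usable capacity."""
        return self.copies * (self.data + self.parity) / self.data

    def losses_always_survived(self) -> int:
        """Largest number of device losses that can never cause data loss.

        Data is lost only when every copy has lost parity + 1 devices,
        so the worst-case guarantee is copies * (parity + 1) - 1.
        """
        return self.copies * (self.parity + 1) - 1


# Hypothetical layouts, not taken from the article.
mirrored_raid6 = Scheme("RAID 6 (8+2), mirrored", data=8, parity=2, copies=2)
spread = Scheme("12+6 fragment spread", data=12, parity=6)

for s in (mirrored_raid6, spread):
    print(f"{s.name}: {s.raw_per_usable():.2f}x raw capacity, "
          f"survives any {s.losses_always_survived()} losses")
```

With these example numbers, the fragment spread survives more simultaneous losses than the mirrored RAID 6 pair while consuming less raw capacity, and the data and parity counts can be tuned to trade one against the other.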

Object storage provides tremendous cost benefits when it comes to quick, ready access to large volumes of archived data. Using the right access technology as the on-ramp to object storage, companies can move toward a primarily cloud-based storage infrastructure. Other companies simply interested in building a private cloud infrastructure gain similar benefits, but with greater control.

The single most cost-effective solution for storage at petabyte scale and beyond remains to be determined. But today, and for the foreseeable future, the combination of private and public cloud object storage with optimized on-premises, cloud-aware storage seems to be the answer. This is true both for consumers who have access to services that use these technologies and for organizations that, like consumers, can benefit from the scale of the public cloud and the security of the private cloud. As the world’s insatiable thirst for data continues to grow, technology companies will continue to develop approaches to help us manage a petascale lifestyle.

Alex Grossman is vice president, media and entertainment, at Quantum.
