Closing In On Methods For Long Term Archiving

As the amount of data in the world keeps exponentially multiplying, a Holy Grail in research is finding a way to reliably preserve that data for the ages. Researchers are now closing in on methods to make data permanent. The problem is that there is no way to be absolutely sure any of them will work far into the future.

By 2023, Microsoft predicts that over 100 zettabytes of data — including movies, television programming and audio — will be stored in the cloud. That staggering amount of data requires a fundamental re-thinking of how large-scale storage systems operate.

In 2016, Microsoft began a partnership with the University of Southampton Optoelectronics Research Centre in the UK to tackle the archiving issue. It is called Project Silica. The project is designed to store cold data — or data that is infrequently accessed. It doesn’t need to sit on a server for instant use.

Through the project, Microsoft is testing glass as a long term storage medium. Recently, it did an experiment with Warner Brothers to store a copy of the 1978 film, Superman, on a glass disc that is 7.5 cm x 7.5 cm x 2 mm.

Microsoft Project Silica senior optical scientist Patrick Anderson loads the system to write data to glass. Photo by Jonathan Banks

The glass contains 75.6 GB of data plus error redundancy codes. It is said to be the first test of the new archiving technology for long term storage of films and television programs.

Theoretically, the glass storage could last thousands of years. If it works, a studio like Warner Brothers, which houses some 20 million film assets in temperature-controlled warehouses, would have an extra level of protection.

Glass has long been used to preserve audio programming, going back to the radio drama days. In World War II, metal record platters were banned due to metal shortages and glass was substituted for recording. Though glass lasts a long time, it is also delicate. Everyone who has worked with glass discs has opened boxes to find the platters shattered.

However, Microsoft’s methods are different. Project Silica uses lasers similar to those used for LASIK eye surgery to burn small geometrical shapes, also known as voxels, into the glass. Multiple bits are encoded in each voxel, and the data is written in multiple layers. For the Warner Brothers experiment, 74 layers were used for the Superman film.
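To put those figures in perspective, the short calculation below combines the disc dimensions, the 75.6 GB payload and the 74 layers quoted in this article. It is only a back-of-the-envelope sketch: it assumes the payload is spread evenly across the layers, which the article does not state, and it ignores the error redundancy codes. The function names are made up for illustration.

```python
# Figures quoted in the article; everything else is derived from them.
DISC_WIDTH_CM = 7.5
DISC_HEIGHT_CM = 7.5
DISC_THICKNESS_CM = 0.2   # 2 mm
PAYLOAD_GB = 75.6         # excludes the error redundancy codes
LAYERS = 74


def disc_volume_cm3() -> float:
    """Volume of the glass disc in cubic centimeters."""
    return DISC_WIDTH_CM * DISC_HEIGHT_CM * DISC_THICKNESS_CM


def payload_per_layer_gb() -> float:
    """Average payload per layer, ASSUMING an even spread across layers."""
    return PAYLOAD_GB / LAYERS


def payload_density_gb_per_cm3() -> float:
    """Payload divided by the whole disc volume."""
    return PAYLOAD_GB / disc_volume_cm3()


if __name__ == "__main__":
    print(f"Disc volume:       {disc_volume_cm3():.2f} cm^3")
    print(f"Payload per layer: {payload_per_layer_gb():.2f} GB")
    print(f"Payload density:   {payload_density_gb_per_cm3():.2f} GB/cm^3")
```

Under those assumptions, each layer carries roughly 1 GB and the disc as a whole holds about 6.7 GB per cubic centimeter.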

Once the data for the program is embedded into the glass, the content is accessed by shining a light through the disc and capturing the data with microscope-like readers. The Warner Brothers film was checked bit by bit and it was flawless.

Microsoft senior optical scientist James Clegg reads data with a specialized microscope. Photo by Jonathan Banks

So what about the easy breakage of glass? Microsoft said it did extensive tests to make sure that Project Silica storage media was not easily damaged. The glass was baked in hot ovens, submerged in boiling water, microwaved and scratched with steel wool. But all glass still breaks. Apple’s iPhone screens are supposed to be the toughest glass in the world and the screens still easily break when dropped. Only time will tell if the Project Silica glass is tough enough.

Also, there is a question of whether or not the readers for such discs will still be manufactured a thousand years into the future. Technology changes and companies go out of business. It is anybody’s guess how this will play out.

Microsoft’s own cloud, called Azure, already has a major interest in safekeeping vast amounts of both hot and cold data. Azure still uses tape, which has to be checked frequently and re-copied to maintain data integrity. Glass could one day be a more secure solution to safekeep data for the company and its customers.
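The "check, then re-copy" routine that tape requires can be illustrated with a simple checksum comparison. The sketch below is not how Azure or any tape library actually does it; it is a generic example that assumes a SHA-256 digest was recorded when the file was first archived. The names `sha256_of` and `needs_recopy` are hypothetical.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time so large files fit in memory


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_recopy(path: Path, recorded_digest: str) -> bool:
    """True if the stored copy no longer matches the digest recorded at archive time."""
    return sha256_of(path) != recorded_digest
```

An archive would run a check like this on a schedule and restore any file that fails from a second copy. A medium that does not degrade would, in principle, need far fewer of these passes.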

Much work remains to be done on Project Silica. Read and write operations need to be unified into a single device, and the amount of data stored on one piece of glass needs to increase. But the company is betting that the future of long term archiving is in glass.

Microsoft also has a parallel project using DNA molecules for archival storage. The beauty of DNA is that it can archive an exabyte per cubic millimeter and has a life of over 500 years. But how will it be read far into the future?
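To get a feel for that density, the calculation below asks how much DNA would be needed to hold the 100 zettabytes mentioned earlier in this article. It is a rough sketch using decimal units and the one exabyte per cubic millimeter figure quoted above, with no allowance for redundancy or packaging. The function name is made up for illustration.

```python
BYTES_PER_EXABYTE = 10**18
BYTES_PER_ZETTABYTE = 10**21
DNA_EXABYTES_PER_MM3 = 1      # density figure quoted in the article
CLOUD_DATA_ZETTABYTES = 100   # cloud storage forecast quoted in the article


def dna_volume_mm3(zettabytes: int) -> float:
    """Volume of DNA, in cubic millimeters, needed to store `zettabytes` of data."""
    exabytes = zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_EXABYTE
    return exabytes / DNA_EXABYTES_PER_MM3


if __name__ == "__main__":
    volume_mm3 = dna_volume_mm3(CLOUD_DATA_ZETTABYTES)
    print(f"{volume_mm3:,.0f} mm^3, or {volume_mm3 / 1000:,.0f} cm^3")
```

On those numbers, 100 zettabytes would occupy about 100 cubic centimeters.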

Others are also researching long term archiving. Group 47 was formed in 2008 to secure the patents, designs and manufacturing processes for DOTS, which was developed by the Eastman Kodak Company.

DOTS (Digital Optical Technology System) is a 100-year archival technology that is non-magnetic, chemically inert and immune to electromagnetic fields, including electromagnetic pulse (EMP). The storage media can be stored in normal office environments or in extremes ranging from 15 to 150 degrees F.

DOTS is stored on a phase change media composed of a metallic alloy sputtered on an archival polyester base. To tackle reader availability in the future, DOTS is a true visual “eye-readable” method of storing digital files. With sufficient magnification, any eye can actually see the digital information.

A “Rosetta Leader” specification calls for microfiche-scale, human-readable text at the beginning of each tape with instructions on how the data is encoded and how to actually construct a reader. Because the information is visible, any high-magnification camera can read the information.

Long term archival systems are incredibly complex because computer operating systems, hardware/software and technology as a whole are constantly changing. What works today may not work tomorrow, much less 1,000 years from now.

And perhaps most problematic of all, how does anyone living in today’s world know how long anything will last? It’s a major problem with no easy solutions. 
