Closing In On Methods For Long Term Archiving

As the amount of data in the world keeps growing exponentially, a Holy Grail of research is finding a way to reliably preserve that data for the ages. Researchers are now closing in on methods to make data permanent. The problem is that there is no way to be absolutely sure they will work far into the future.

Microsoft predicts that by 2023 over 100 zettabytes of data, including movies, television programming and audio, will be stored in the cloud. That staggering amount of data requires a fundamental rethinking of how large-scale storage systems operate.

In 2016, Microsoft began a partnership with the University of Southampton Optoelectronics Research Centre in the UK to tackle the archiving issue. It is called Project Silica. The project is designed to store cold data, meaning data that is infrequently accessed and does not need to sit on a server for instant use.

Through the project, Microsoft is testing glass as a long-term storage medium. Recently, it ran an experiment with Warner Brothers to store a copy of the 1978 film Superman on a glass disc measuring 7.5 cm x 7.5 cm x 2 mm.

Microsoft Project Silica senior optical scientist Patrick Anderson loads the system to write data to glass. Photo by Jonathan Banks

The glass contains 75.6 GB of data plus error redundancy codes. It is said to be the first test of the new archiving technology for long-term storage of films and television programs.

Theoretically, the glass storage could last thousands of years. If it works, a studio like Warner Brothers, which houses some 20 million film assets in temperature-controlled warehouses, would have an extra level of protection.

Glass has long been used to preserve audio programming, going back to the radio drama days. In World War II, metal record platters were banned due to metal shortages and glass was substituted for recording. Though glass lasts a long time, it is also delicate. Everyone who has worked with glass discs has opened boxes to find the platters shattered.

However, Microsoft’s methods are different. Project Silica uses lasers similar to those used in LASIK eye surgery to burn small geometrical shapes, known as voxels, into the glass. Each voxel encodes multiple bits, and the data is written in multiple layers. For the Warner Brothers experiment, 74 layers were used for the Superman film.
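As a rough back-of-the-envelope check on the figures quoted for the Superman disc, the sketch below spreads the 75.6 GB payload over the 74 layers and over the volume of the glass. It assumes decimal gigabytes and an even spread of data across layers, neither of which Microsoft has confirmed, and the function and constant names are purely illustrative.

```python
# Back-of-the-envelope figures for the Superman glass disc.
# Assumptions (not confirmed by Microsoft): the payload is spread evenly
# over all layers and over the full glass volume. All names are illustrative.

PAYLOAD_GB = 75.6   # data stored, not counting the error redundancy codes
LAYERS = 74         # voxel layers written for the film
WIDTH_CM, HEIGHT_CM, THICKNESS_CM = 7.5, 7.5, 0.2  # 7.5 cm x 7.5 cm x 2 mm


def gb_per_layer(payload_gb: float, layers: int) -> float:
    """Average payload per layer, assuming an even spread."""
    return payload_gb / layers


def gb_per_cubic_cm(payload_gb: float, w: float, h: float, t: float) -> float:
    """Average payload density over the whole glass volume."""
    return payload_gb / (w * h * t)


if __name__ == "__main__":
    per_layer = gb_per_layer(PAYLOAD_GB, LAYERS)
    density = gb_per_cubic_cm(PAYLOAD_GB, WIDTH_CM, HEIGHT_CM, THICKNESS_CM)
    print(f"{per_layer:.2f} GB per layer")
    print(f"{density:.2f} GB per cubic cm")
```

On these assumptions the disc averages roughly 1 GB per layer and just under 7 GB per cubic centimeter, which gives a sense of how far capacity would have to grow for large archives.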

Once the data for the program is embedded in the glass, the content is accessed by shining light through the disc and capturing the data with microscope-like readers. The Warner Brothers film was checked bit by bit and found to be flawless.
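A bit-by-bit check of this kind can be pictured as a direct comparison between the original file and the copy read back from the medium. The sketch below is a generic illustration of that idea, not Microsoft's verification tooling; the function name `first_mismatch` and the chunk size are assumptions made for the example.

```python
# Minimal sketch of a bit-for-bit check between an original and a read-back copy.
# Generic illustration only; this is not Microsoft's verification tooling.
# Assumes buffered streams (open(..., "rb") or io.BytesIO), where read(n)
# returns fewer than n bytes only at end of stream.
import io
from typing import BinaryIO, Optional

CHUNK_SIZE = 1 << 20  # compare 1 MiB at a time to keep memory use flat


def first_mismatch(original: BinaryIO, readback: BinaryIO) -> Optional[int]:
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    while True:
        a = original.read(CHUNK_SIZE)
        b = readback.read(CHUNK_SIZE)
        if a != b:
            # Locate the exact differing byte inside this chunk.
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            # One stream ended early: the mismatch is at the shorter length.
            return offset + min(len(a), len(b))
        if not a:  # both streams exhausted with no difference
            return None
        offset += len(a)


if __name__ == "__main__":
    master = io.BytesIO(b"SUPERMAN-1978" * 1000)
    copy = io.BytesIO(b"SUPERMAN-1978" * 1000)
    print(first_mismatch(master, copy))  # None when the copies match
```

In practice an archive would more likely compare cryptographic checksums than raw streams, but the direct comparison has the advantage of reporting exactly where a copy diverges.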

Microsoft senior optical scientist James Clegg reads data with a specialized microscope. Photo by Jonathan Banks

So what about the fragility of glass? Microsoft said it ran extensive tests to make sure that Project Silica storage media were not easily damaged. The glass was baked in hot ovens, submerged in boiling water, microwaved and scratched with steel wool. But all glass still breaks. Apple’s iPhone screens are supposed to be the toughest glass in the world, and the screens still break easily when dropped. Only time will tell if the Project Silica glass is tough enough.

Also, there is a question of whether or not the readers for such discs will still be manufactured a thousand years into the future. Technology changes and companies go out of business. It is anybody’s guess how this will play out.

Microsoft’s own cloud, called Azure, already has a major interest in safekeeping vast amounts of both hot and cold data. Azure still uses tape, which has to be checked frequently and re-copied to maintain data integrity. Glass could one day be a more secure solution to safekeep data for the company and its customers.

Much work remains to be done on Project Silica. Read and write operations need to be unified into a single device, and the amount of data stored on one piece of glass needs to increase. But the company is betting that the future of long-term archiving is in glass.

Microsoft also has a parallel project using DNA molecules for archival storage. The beauty of DNA is that it can archive an exabyte per cubic millimeter and has a life of over 500 years. But how will it be read far into the future?

Others are also researching long-term archiving. Group 47 was formed in 2008 to secure the patents, designs and manufacturing processes for DOTS, a technology developed by the Eastman Kodak Company.

DOTS (Digital Optical Technology System) is a 100-year archival technology that is non-magnetic, chemically inert and immune to electromagnetic fields, including electromagnetic pulse (EMP). The storage media can be kept in normal office environments or in extremes ranging from 15 to 150 degrees F.

DOTS data is stored on a phase-change medium composed of a metallic alloy sputtered on an archival polyester base. To tackle reader availability in the future, DOTS is a true visual “eye-readable” method of storing digital files. With sufficient magnification, anyone can actually see the digital information.

A “Rosetta Leader” specification calls for microfiche-scale, human-readable text at the beginning of each tape, explaining how the data is encoded and how to construct a reader. Because the information is visible, any high-magnification camera can read it.

Long-term archival systems are incredibly complex because computer operating systems, hardware, software and technology as a whole are constantly changing. What works today may not work tomorrow, much less 1,000 years from now.

And perhaps most problematic of all, how does anyone living in today’s world know how long anything will last? It’s a major problem with no easy solutions. 
