Auto QC in a Digitized Workflow: Part 1

Over the years, content creators and broadcasters have accumulated large libraries of assets in analog formats. With the switch to digital workflows, there is a critical need to digitize these assets, both to keep them from being lost and to realize the monetary value they represent.

Introduction

Over the years, content creators and broadcasters have accumulated large libraries of assets in analog formats. With the switch to digital workflows, there is a critical need to digitize all these assets, for several reasons:

  • Risk of losing the asset forever if not digitized
  • Preservation of the asset for posterity, since the digital format offers immunity from degradation over time
  • Getting the asset ready to be used and monetized in today’s digital workflows
  • Space and operational costs optimization

As a result, digitizing analog tapes and archiving them into digital libraries is critical to completing the transition to the file-based digital world. After digitization, the digital content becomes the master and the analog tapes are thrown away. The asset, however, is only as good as the digitization process. That process can be faulty, or can itself introduce errors: the tape head was not aligned, the tape was read twisted, the audio and video drifted apart, an audio track went missing entirely, colors bled, the ingested material carried a hue error or too much red, or a dirty tape head inserted a vertical line on every frame. The possibilities for things going wrong are immense, and all of these are known to happen. If the process is faulty in any way, priceless assets will be lost forever.

So what does the archiving process rely on? Eyeball QC of the ingested content. But with thousands of hours of content to be digitized, manual QC is neither practical nor a good option. It lacks consistency and, as human fatigue sets in, becomes unreliable and error-prone. Further, several errors in file-based digital workflows cannot be detected by a human and are only machine detectable. Metadata checking can also be erroneous, and an asset with wrong metadata may be practically lost in the archives, never to be retrieved when needed.

To overcome these shortcomings, auto QC is now an essential and central part of the digitization workflow. The process is fast, efficient, consistent and reliable. When it is coupled with a manual review of random and/or erroneous digitized content, the quality of the digitized content improves considerably. Good auto QC tools with deep video quality checks for analog dropouts are increasingly deployed in the tape archiving process. Auto QC checks need to be enhanced to handle many digitization-specific issues. Sensitive and detailed video dropout checks are critical for good archiving, and one cannot take shortcuts with simple file-checking tools: industry-grade QC tools with in-depth video dropout checks developed specially for analog tape ingest need to be deployed. A word of caution: video dropout detection is still a subject of R&D, and several checks are still evolving. This paper explores some of the checks in depth and how auto QC is being deployed in digitization workflows.

The Need for Digitization and the Process

Most archives, broadcasters, universities, governments and television stations have thousands of hours of content accumulated and stored on analog tapes over the years. A typical broadcaster may have nearly 100,000 to 200,000 tapes of one hour duration collected over a ten-year period. While a few stations might have Super 8 or U-Matic tapes, the majority of the tapes are a mix of Betacam (SP/SX Digital/IMX), XDCAM or HDCAM.

Preserving the tapes requires space, correct tagging and sorting of the tapes into the correct sequence (for example, keeping all sports sequences together). Beyond that, the sheer number of tapes makes it very hard for management to use them effectively when needed. The quality of the tapes also deteriorates with time due to the inherent nature of magnetic tape. In many cases, the recoverability of programs from old tapes can no longer be guaranteed. Maintaining the tapes is costly, yet the quality of the content is still not guaranteed. The digital workflow offers a solution to this.

Once tapes are digitized, facilities gain multiple benefits. These include:

  • Preservation of assets without the fear of quality loss or degradation
  • Optimization of space and operational costs: retaining large archives of tapes in temperature- and humidity-controlled spaces is expensive, while storing digital content in files is far less expensive
  • Faster access and retrieval from the archives with enhanced metadata search capabilities
  • Online content for new audiences and monetization possibilities

Facilities are rapidly migrating to fully digital, file-based workflows and getting rid of their old tape archives.

Figure 1 below shows a typical digitization process deployed during migration.

Figure 1: A Typical Digitization Process.

Stage 1. Tape Preview

At this stage, the different types of tapes (IMX, Betacam, HDCAM, XDCAM etc.) are sorted and tagged. Tapes are also physically checked for tape quality, the presence of any foreign body in the tape, physical damage, tape twisting etc. If important tapes are found to be considerably damaged, they are usually sent to an external specialist for restoration.

Stage 2. Tape Preparation and Cleaning

The tapes identified and sorted for digitization are moved to the ingest area at least 24 hours before ingestion, to avoid sudden expansion or contraction of the tapes. The ready tapes are then loaded into a tape cleaning machine to remove dust and residues such as oxide deposits. With a huge number of tapes to be digitized, barcode labels are generally put on the tapes for better tracking and for mapping the metadata to the assets. Barcodes are also used by the downstream tools to automatically select the transcode profile during the digitization process.
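As a rough illustration of barcode-driven profile selection, the sketch below maps a barcode to a transcode profile name. The barcode layout (a format prefix, a hyphen, then a numeric serial), the prefixes and the profile names are all assumptions made for this example; real facilities define their own schemes.

```python
# Hypothetical barcode scheme: FORMAT_PREFIX-SERIAL, e.g. "IMX-000123".
# Prefixes and profile names are illustrative assumptions, not a standard.
PROFILE_BY_PREFIX = {
    "BSP": "sd_betacam_sp",
    "IMX": "sd_imx",
    "HDC": "hd_hdcam",
}


def select_transcode_profile(barcode: str) -> str:
    """Return the transcode profile name for a tape barcode."""
    prefix, sep, serial = barcode.strip().upper().partition("-")
    # Reject barcodes that do not match the assumed PREFIX-SERIAL layout.
    if not sep or not serial.isdigit():
        raise ValueError(f"malformed barcode: {barcode!r}")
    try:
        return PROFILE_BY_PREFIX[prefix]
    except KeyError:
        raise ValueError(f"unknown tape format prefix: {prefix!r}") from None
```

Failing loudly on an unknown or malformed barcode, rather than falling back to a default profile, keeps a mislabeled tape from being silently ingested with the wrong settings.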

Stage 3. Digitization of Tapes

At this stage, the tapes are played back and the ingested content is encoded to house formats such as JPEG 2000 or AVC-Intra. For large-scale digitization, automated robots are deployed that use barcodes to pick tapes from the stack and feed them to the VTR automatically. Besides the digitized files themselves, a metadata database is updated for the digitized assets. A low-res proxy file is also generated along with the hi-res files.

Stage 4. Quality Checking

The quality of the ingested content must be checked to ensure proper ingestion. Post digitization, the digital content becomes the master and the tape becomes redundant. It is therefore essential to ensure that the right quality has been achieved in the digital master, before the tape is thrown away. If the content volume is low, one may rely on eyeball or manual QC to check the digitized content quality. However, with hundreds of hours of digitized content, manual QC is not a practical option. The manual process is also fraught with errors:

  • Manual QC lacks consistency
  • Some errors are not perceptible, but manifest themselves only during playback on some equipment
  • Human fatigue sets in, leading to an unreliable QC process
  • Metadata checking can be erroneous – the asset with wrong metadata may also be practically lost in the archives, never to be retrieved when needed

Large scale digitization hence relies on auto QC tools to assist in the quality checking process. However, as we will see in the next section, there are a host of issues that can crop up in the storage of tapes and the playback process, which impact the video quality of the digitized assets. Similar issues can arise in the audio as well.

All of these can lead to different kinds of artefacts in the ingested digitized content. These artefacts are classified as “analog dropouts” in the video and the associated audio. A good auto QC tool should be able to detect such artefacts reliably and accurately. While one can identify these artefacts by visual inspection, identifying all of them automatically with auto QC tools is still a subject of active research and development (we will go into more detail on this in the next section). Some advanced auto QC tools provide a higher degree of reliability, accuracy and coverage of these analog artefacts, and are much better suited for deployment in the digitization workflow.
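To give a feel for what an automated dropout check involves, here is a deliberately naive sketch for one artefact mentioned earlier: a vertical line that a dirty tape head leaves on every frame. It is not how any particular QC product works. The frame representation (rows of 8-bit luma values), the function names and both thresholds are assumptions chosen for illustration, and the heuristic only catches one-pixel-wide lines away from the picture edges.

```python
from typing import Dict, List, Sequence, Set

Frame = Sequence[Sequence[int]]  # assumed layout: rows of 8-bit luma samples


def column_means(frame: Frame) -> List[float]:
    """Average luma of each pixel column in one frame."""
    height = len(frame)
    width = len(frame[0])
    return [sum(row[x] for row in frame) / height for x in range(width)]


def outlier_columns(frame: Frame, threshold: float = 40.0) -> Set[int]:
    """Columns much brighter or much darker than BOTH neighbouring columns."""
    means = column_means(frame)
    found = set()
    for x in range(1, len(means) - 1):
        left = means[x] - means[x - 1]
        right = means[x] - means[x + 1]
        # Same sign and both large: the column stands out on both sides.
        if left * right > 0 and abs(left) > threshold and abs(right) > threshold:
            found.add(x)
    return found


def persistent_vertical_lines(
    frames: Sequence[Frame], threshold: float = 40.0, min_fraction: float = 0.9
) -> List[int]:
    """Columns flagged as outliers in at least min_fraction of the frames."""
    counts: Dict[int, int] = {}
    for frame in frames:
        for x in outlier_columns(frame, threshold):
            counts[x] = counts.get(x, 0) + 1
    needed = min_fraction * len(frames)
    return sorted(x for x, n in counts.items() if n >= needed)
```

Requiring the same column to stand out across most frames is what separates a tape-head line from legitimate picture content, such as a pole or a door edge, that appears in only a few shots.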

The auto QC process can be complemented with a manual review process to finally accept or reject the digitized content.
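One way to combine the two, sketched below under stated assumptions, is to send every file that auto QC flagged to manual review, plus a random sample of the files that passed. The `QCResult` record, the `build_review_queue` function and the sampling rate are hypothetical names and values for this example, not part of any specific QC tool.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class QCResult:
    """Hypothetical summary of one auto QC run on a digitized file."""
    asset_id: str
    errors: List[str] = field(default_factory=list)


def build_review_queue(
    results: Sequence[QCResult],
    sample_rate: float = 0.05,
    seed: Optional[int] = None,
) -> List[QCResult]:
    """Return all failed assets plus a random spot-check of passed ones."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    failed = [r for r in results if r.errors]
    passed = [r for r in results if not r.errors]
    # Round up so that a small batch still gets at least one spot-check.
    sample_size = min(len(passed), math.ceil(sample_rate * len(passed)))
    spot_checks = rng.sample(passed, sample_size)
    return failed + spot_checks
```

Spot-checking files that passed gives the operator a running estimate of how often auto QC misses something, which matters because the tape is discarded once the digital master is accepted.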

Stage 5. The Archiving Process

Once the digitized content is accepted, it is archived using the selected archiving software. Metadata is updated, along with the proxy file. The process is complete and the corresponding tape can be discarded.

Editor's Note

Part two in this two-part series can be found here.
