Moving stored content into an archive is a multi-step process and requires proper QC at key points.
In Part 1 of this series on preserving taped resources and moving them to an archive, we reviewed the typical problems that may be encountered and the tools needed to resolve them. With those identified, let’s see how it is possible to automate the process to ensure maximum throughput and quality.
The Auto QC Process
As discussed in Part 1 of this series, auto QC is now an essential and central part of the archiving workflow. The auto QC process is fast, efficient, consistent and reliable. When it is coupled with manual review of a random sample of the digitized content and of any files flagged as erroneous, we achieve higher levels of productivity with vastly improved results.
Figure 1. A typical auto QC workflow in a digitisation project.
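The pairing of auto QC with manual review can be sketched in a few lines. This is only an illustration under stated assumptions: the `qc_errors` mapping (file name to error count), the function name and the 10% default sample rate are all hypothetical and are not taken from any particular QC product.

```python
import math
import random


def select_for_manual_review(qc_errors, sample_rate=0.1, seed=None):
    """Pick files for manual review.

    qc_errors maps a file name to the number of errors auto QC reported
    (a hypothetical result shape, chosen for illustration).
    Returns (flagged, sampled): every file with errors, plus a random
    sample of the files that passed auto QC.
    """
    rng = random.Random(seed)
    flagged = sorted(name for name, n in qc_errors.items() if n > 0)
    passed = sorted(name for name, n in qc_errors.items() if n == 0)
    if passed:
        # Review at least one clean file whenever any exist.
        sample_size = min(len(passed), max(1, math.ceil(len(passed) * sample_rate)))
    else:
        sample_size = 0
    sampled = sorted(rng.sample(passed, sample_size))
    return flagged, sampled
```

Every flagged file goes to an operator, while the random sample of passed files acts as a spot check on the auto QC results themselves.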
Four main types of checks are performed as part of quality checking on digitized files:
- Checking the compliance of the generated content
- Checking for timecodes and metadata
- Checking the baseband quality in audio and video
- Checking for encoding/transcoding errors, if content is compressed
Compliance and metadata checking is a straightforward process needed to ensure that digitized content will work with all downstream tools. It is similar in nature to the checking done in current file-based workflows. The real complexity comes with ensuring that the baseband quality of the digitized content is above the defined and agreed threshold level. This becomes even more challenging when it has to be done reliably with auto QC tools. Video issues can manifest themselves in different ways, and each one requires deep R&D before it can be detected reliably and accurately. With one large broadcaster, we saw over 50 different types of video quality issues in the digitized tapes. In the next section, we describe some of these in more detail.
Quality Issues and Detection
The information embedded within tapes is in the form of voltage signals. Each pixel, frame or picture is formed from signal values stored on the magnetic tape. Any alteration in the natural variation of these signal values will lead to incorrect color values for captured pixels and cause errors in the formation of fields, frames or pictures. These alterations are caused by mishandling, ageing and improper maintenance of tapes. They can also be due to errors within the digitization process being used. The resulting video artefacts are collectively termed here analog dropouts. Some examples are blotches, scratches, mistracking, head clog, skew error and horizontal/vertical sync pulse loss. The following sections discuss some of the commonly observed analog video dropouts in further detail.
Horizontal / Vertical Sync Pulse Loss
A video frame consists of multiple horizontal scan lines spread across the vertical resolution. A specific voltage level exists at the end of each scan line, marking its end and the start of the next scan line. Any variation in this voltage level (due to noise) will shift lines of content, which the viewer perceives as horizontal lines. This is shown in the snapshot below (Figure 2 a).
The vertical sync pulse is another such voltage level; it controls where one video frame ends and the next begins. Any deviation in the voltage level will disturb the start of the formation of the next frame. Vertical sync pulse loss merges the two adjoining frames at the frame boundary (Figure 2 b).
Figure 2(a) Horizontal Sync Pulse Loss (left)
Figure 2(b) Vertical Sync Pulse Loss (right)
Skew Error
A magnetic tape can change dimensions as the tape surface expands or shrinks over time. As a result, the recorded tracks change in length and angle and become misaligned with respect to the playback head. During playback/recording, this loss in alignment will shift a band of scan lines at the top/bottom of the picture. This horizontally shifted portion at the top or bottom of the video frame is termed Skew Error (Figure 3).
Figure 3. Skew Error.
Line Repetition Error
An analog to digital conversion device receives the video data in the form of scan lines. The buffers that store each scan line are updated after each sample-and-hold period. The line repetition error is caused by faults in the control signals: the current scan line is not captured and is replaced by the previously fetched scan line. The fault in the control signal persists for a while, and it shows up as a repeated set of horizontal content lines. The artefact is shown in Figure 4.
Figure 4. Line Repetition Error.
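Because the defect shows up as a scan line copied several times in a row, a first-pass detector can simply look for runs of identical consecutive rows. The sketch below is a minimal illustration, assuming a frame is available as a list of rows of luma values; the function name, `min_run` and `tolerance` are hypothetical parameters, and a production detector would need considerably more logic.

```python
def find_repeated_line_runs(frame, min_run=4, tolerance=0):
    """Return (start_row, run_length) for each run of at least min_run
    consecutive rows in which every row matches the row before it.

    frame is a list of rows, each row a list of luma values.
    tolerance is the largest per-pixel difference still treated as a match.
    """
    runs = []
    run_start = None
    for y in range(1, len(frame)):
        same = all(abs(a - b) <= tolerance
                   for a, b in zip(frame[y], frame[y - 1]))
        if same:
            if run_start is None:
                run_start = y - 1  # the run includes the row being copied
        else:
            if run_start is not None and y - run_start >= min_run:
                runs.append((run_start, y - run_start))
            run_start = None
    # Close a run that reaches the bottom of the frame.
    if run_start is not None and len(frame) - run_start >= min_run:
        runs.append((run_start, len(frame) - run_start))
    return runs
```

Flat regions such as a black frame or a plain caption background also produce identical rows, so a practical check would have to exclude low-detail areas before reporting an error.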
Blotches
Blotches occur due to the presence of dirt/sparkle on the surface of a magnetic tape. The dirt disrupts the reception of signals during video data capture. The area for which data is not received appears as white or black spots. A snapshot of a video frame with blotches is shown in Figure 5.
Figure 5. Blotches.
Scratches
Scratches appear in the video frame when oxide has been removed from the tape surface. The loss of oxide is due to wear and tear after prolonged or continuous usage of a tape. Generally, these scratches take the form of a thick horizontal line with some breaks at its boundary. The artefact is shown in Figure 6 below.
Figure 6. Scratches.
Chroma Phase Error
Composite video signals consist of chrominance components combined with the luminance component using the phase modulation method. Any deviation in the phase will affect all the constituent components. With a phase error, the hue and saturation of the pixel colors may change, so that colors deviate from their natural values, e.g., skin tones or the natural color of leaves, flowers or the sky. One such example is shown in Figure 7.
Figure 7: Chroma Phase Error.
Dot Crawl / Rainbow Artefact
While capturing from a tape using composite signals, luma can sometimes be misinterpreted as chroma or vice versa. If chroma is treated as luma, the resulting artefact is termed Dot Crawl. If luma is treated as chroma, the resulting artefact is termed Rainbow Artefact.
Ghosting
The ghosting artefact is the perception of weak shadows around the edges of the primary visible objects within a scene. It happens due to the transfer of magnetic signals between adjacent layers of tape. A snapshot frame for this error is shown in Figure 8.
Figure 8. Ghosting.
Apart from the errors listed above, other errors may be introduced while capturing the color values for each pixel location in a frame. In some cases, values are not retrieved at all, and localized patches appear abruptly within the content. If the captured values differ from their natural values, video signal level and out-of-gamut errors are introduced in the captured video sequence. In addition, different kinds of noise or noise patterns can be perceived due to noise introduced while capturing analog signals.
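A video signal level check of the kind mentioned above can be illustrated with a simple range test. The sketch assumes 8-bit luma samples and uses 16 to 235, the nominal 8-bit luma range in ITU-R BT.601, as default limits; whether a given project uses these limits is an assumption, and the function name is hypothetical. A gamut check would additionally require conversion to RGB, which is not shown.

```python
def luma_level_violation_ratio(frame, low=16, high=235):
    """Return the fraction of luma samples outside [low, high].

    frame is a list of rows, each row a list of 8-bit luma values.
    The default limits are the nominal 8-bit luma range; treat them
    as an assumption and adjust to the agreed delivery specification.
    """
    total = 0
    out_of_range = 0
    for row in frame:
        for value in row:
            total += 1
            if value < low or value > high:
                out_of_range += 1
    return out_of_range / total if total else 0.0
```

Reporting a ratio rather than a pass/fail flag leaves it to the QC profile to decide how many out-of-range samples are tolerable before a frame is flagged.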
Fortunately, there are processes and tools to correct some, though not all, of the errors introduced by analog to digital conversion. Some of these involve specialized steps to correct the tape device or the conversion process itself. There are also post-processing tools to remove noise from the digitized content and to correct the hue, saturation, balance and contrast of colors. But before applying any such correction step, it is necessary to know whether there is an error and what type of error it is. Knowing the type of error helps in selecting the correction steps to be followed.
Like video, audio is stored as voltage signals on magnetic tape. Any aberration while capturing the audio signal during the digitization process can lead to audio distortion of different types, as discussed below.
Audio Click/Audio Crackle/Transient Noise
Clicks, crackle, transient noise and glitches are introduced by scratches and dust on the surface of a tape. These are localised degradations that affect only certain groups of samples and thus cause a discontinuity in the waveform.
Scratches lead to disrupted audio samples during the digitization process. These are perceived as ticking, popping or scratchy noise lasting for a very short duration.
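Since a click is a local discontinuity in an otherwise smooth waveform, one simple way to illustrate detection is to compare each sample with the average of its two neighbours and flag large outliers. This is a hedged sketch, not how any particular QC tool works: the function name, the `factor` threshold and the use of the median deviation as a baseline are all assumptions made for illustration.

```python
from statistics import median


def find_clicks(samples, factor=8.0):
    """Return indices of samples that deviate sharply from their neighbours.

    For each interior sample, measure how far it lies from the average of
    the samples on either side. A sample is flagged when that deviation
    exceeds factor times the median deviation over the whole signal.
    """
    deviations = [
        abs(samples[i] - 0.5 * (samples[i - 1] + samples[i + 1]))
        for i in range(1, len(samples) - 1)
    ]
    if not deviations:
        return []
    # Small floor so that pure digital silence does not give a zero threshold.
    threshold = factor * max(median(deviations), 1e-9)
    return [i + 1 for i, d in enumerate(deviations) if d > threshold]
```

A single corrupted sample also disturbs the measurement for the samples on either side of it, so clicks are reported as short clusters of adjacent indices rather than as single points.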
Audio Dropout
Audio Dropout is defined as a distortion in audio signals in which silent frames of short duration (from 4 ms to 300 ms) are introduced in the midst of normal audio data. It is characterized by an abrupt fall in the signal level at the start of the dropout and an abrupt rise at its end.
Audio Dropouts are mostly introduced during digitization due to damage appearing on the tape. If a certain part of the tape is damaged, it won’t be possible for the head to read the corresponding audio data resulting in audio loss for that specific duration.
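The definition above (a silent stretch of 4 ms to 300 ms bounded by normal audio on both sides) translates fairly directly into a run-length search. The sketch below is a minimal illustration assuming floating-point samples; the function name and the `silence_level` threshold are hypothetical, and it does not test how abrupt the fall and rise are.

```python
def find_dropouts(samples, sample_rate, silence_level=1e-4,
                  min_ms=4, max_ms=300):
    """Return (start_index, length) of silent runs that look like dropouts.

    A run qualifies when it lasts between min_ms and max_ms and has
    non-silent audio both before and after it. silence_level is an
    assumed amplitude threshold, not a standardised value.
    """
    min_len = int(sample_rate * min_ms / 1000)
    max_len = int(sample_rate * max_ms / 1000)
    dropouts = []
    run_start = None
    for i, sample in enumerate(samples):
        if abs(sample) <= silence_level:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            # run_start > 0 means audio was present before the silence.
            if run_start > 0 and min_len <= length <= max_len:
                dropouts.append((run_start, length))
            run_start = None
    # Silence that runs to the end of the file is not bounded by audio
    # on both sides, so it is not reported.
    return dropouts
```

The minimum duration keeps ordinary zero crossings from being reported, and the maximum duration separates dropouts from intentional pauses or silence between programme segments.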
In addition to the above defects, the digitization process can also cause Audio Clipping. Dust and dirt contamination can drive voltages so high that some audio samples exceed the legal maximum of 0 dB.
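As an illustration, clipping can be screened for by measuring the peak level and looking for runs of consecutive samples pinned at full scale. The sketch assumes normalised floating-point samples in which 1.0 corresponds to 0 dB (digital full scale); the function names and the `min_run` value are hypothetical.

```python
import math


def peak_db(samples):
    """Return the peak level in dB relative to full scale (1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def find_clipped_runs(samples, full_scale=1.0, min_run=3):
    """Return (start_index, length) of runs of at least min_run
    consecutive samples whose magnitude is at or above full_scale."""
    runs = []
    run_start = None
    for i, sample in enumerate(samples):
        if abs(sample) >= full_scale:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # Close a run that lasts until the final sample.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs
```

Requiring several consecutive full-scale samples avoids flagging a single legitimate peak that merely touches the maximum.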
To detect audio defects, checks for loudness, audio dropout, audio clipping and different types of audio noise are very common in the quality checking process.
If closed captions and burnt-in subtitles are present in the content, an advanced quality checking tool will not only check them for consistency and dropout, but will also make sure they are present in the safe area of the screen.
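Once a subtitle's bounding box is known, the safe area test itself reduces to a rectangle containment check. The sketch below assumes the box has already been located (for example by a text detector, which is not shown) and treats the safe area as the central 90% of the frame; that fraction and the function name are illustrative assumptions, not values taken from a specific delivery specification.

```python
def in_safe_area(box, frame_width, frame_height, safe_fraction=0.9):
    """Return True if box lies entirely inside the centred safe area.

    box is (x, y, width, height) in pixels, with the origin at the
    top-left corner of the frame. safe_fraction is the assumed share
    of each frame dimension covered by the safe area.
    """
    x, y, width, height = box
    margin_x = frame_width * (1 - safe_fraction) / 2
    margin_y = frame_height * (1 - safe_fraction) / 2
    return (x >= margin_x
            and y >= margin_y
            and x + width <= frame_width - margin_x
            and y + height <= frame_height - margin_y)
```

For a 1920x1080 frame the default leaves margins of roughly 96 pixels at the sides and 54 pixels at the top and bottom.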
Unlike errors in compressed digital data, errors in the analog medium are difficult to model. Analog data errors are random and do not follow a known pattern. This is partly due to variations in the conversion processes and to the variety of electro-mechanical components used. Because of this uncertainty, detecting these artefacts is quite tedious. Highly specialized image and video processing concepts and algorithms are required for accurate and reliable detection of errors in the digitized data.
The Challenges Faced by Auto QC Tools
Selecting the correct auto QC tool for digitisation is critical because it has a direct impact on the quality of the digitised content. A good auto QC tool can make the digitization process more efficient by detecting issues accurately and reliably. Algorithms to detect analog errors are more complex than those for digital errors. The detection algorithms need to consider and model the various non-linear processes involved in analog to digital conversion. Detection algorithms have been developed for some of these errors, but research is continuing on the difficult ones, where the actual error context is complex to model. The benefits of the auto QC tool you deploy in your workflow are only as good as the depth and accuracy with which it detects such analog dropouts. Some QC tools pay only lip service to detecting such issues, so it is advisable to select a tool only after properly testing its results. Fortunately, some industry-grade QC tools with in-depth video dropout detection are now available. These tools have exhaustive checks dedicated to the analog tape ingest process and have been successfully deployed in large archiving projects.
The digitization of tapes and archiving to digital formats is a necessity to complete the transition to the file-based digital workflow. During this process, it is critical to use the right set of tools to ensure the quality of the content being archived. Artefacts can manifest in multiple ways in these tapes and need to be detected. Detecting these artefacts, called analog dropouts, is complex, and several sophisticated algorithms have been developed for the purpose. While a lot more research needs to be done to cover a larger set of analog dropouts, using the right auto QC tool during the archiving process helps detect these complex analog errors more accurately and reliably, and enables you to preserve and deliver high-quality content.
Part one in this two-part series can be found here.