Data Recording and Transmission: Part 23 - Delivering Data

The requirements for data transmission have changed out of all recognition since the early days of computing, when the goal was simply to make something that worked. Today that’s the easy part.

As has been seen, using error correction it is possible to protect data against naturally occurring loss that takes place in recording media and transmission channels. At one time that was all that was necessary. However, the explosion in IT made possible by microelectronics meant that data became exposed to the ways of the world at large, which are a mixture of the genuine concerns of commerce along with the interest of the criminal.

In the event that a data file arrives at a given destination and the error correction strategy pronounces it to be free of error, there could be a number of possibilities. Ideally the destination was supposed to receive the file and it genuinely is what the source sent. Today there are more possibilities and Fig.1 shows some of them.

At Fig.1a) the file somehow turned up at one or more destinations in addition to the intended destination. If that file contained advertising material, then that’s a bonus. If it contained details of a military invasion, that’s a disaster. When sending a file over a public network, it must be assumed that it has essentially been shouted from the rooftops and could turn up anywhere, either by accident, foul play or by the activities of the security services.

Fig.1. In the real world a number of unintended things can happen to a data transmission. See text for details.

There is a requirement to prevent harm if a legitimate file arrives at an inappropriate destination, but on the other hand if the file contains plans for some criminal act it is in the interests of society for it to be detected. A difficulty arises in corrupt regimes where opponents of the corruption are deemed to be criminals.

At Fig.1b) the file has not come from its purported source but has come from elsewhere. If the recipient believes the file to be genuine, harm may be done. A variation on that theme is where the file actually did come from the sender, but the sender subsequently denies it.

At Fig.1c) is shown a variation on b) in which the file did originate from the purported source but was subject to tampering on the way. Once more if the recipient is unaware, harm may be done.

At Fig.1d) is the case in which additional data have been added to the file. If the destination is a computer, the additional data may cause it to do things the owner would rather it did not do.

One approach to prevention of unauthorized receipt of a file is to render it meaningless using encryption. There are various ways in which this can be done, according to the requirements. For example, a commercial undertaking might provide files that contain copyright material such as movies that are paid for by subscribers. If it were possible for large numbers of non-subscribers to view them, there would be economic loss.

Digital cinema was not about picture quality. It was about piracy. Reels of film were too easily intercepted and copied and would be mass duplicated and sold cheaply. Digital cinema now has extremely powerful encryption where every projector is unique and has its own decryption key. The movie is encoded at source to suit the key of each selected cinema and if the file is intercepted it is meaningless.

For subscription movies over the Internet a relatively simple encryption system would be adequate as the small number of cases where the encryption had been defeated would not be commercially significant. On the other hand where the message is of national importance a much higher level of security must be provided. The only meaningful metric of an encryption system is the probability that it will be broken.

Early encryption systems used analog techniques and were referred to as scrambling. In one system intended for secure speech the key was electronically generated noise recorded on precisely two gramophone disks, one of which had to be transported to each end of the secure channel. If the receiving end did not confirm receipt of the disk, the sending end would not encode using the matching disk, so if one disk fell into the wrong hands it was useless.

With the disks playing in synchrony, noise added to the speech at the source could be subtracted at the destination. Eavesdroppers just heard the noise. The most sophisticated of these systems was called SIGSALY and went into service in 1943. The apparatus weighed tons but was never compromised.

Data having discrete symbols, such as text or digital information, is ideal for encryption as one symbol can be substituted for another. Provided both ends of the channel know what the substitution was, information can be transferred. If the substitution is done by algorithm, known as a cipher, and the same algorithm is always used, there is some danger that a determined interloper might work out the algorithm and compromise the system.
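The substitution idea can be sketched in a few lines of Python. This is an illustration only: the helper names and the use of a shared seed to derive the substitution table are assumptions made for the example, and Python's random module is not suitable for real security.

```python
import random
import string

def make_substitution_table(seed):
    """Derive a letter-for-letter substitution table from a shared seed.

    Hypothetical helper: both ends of the channel must use the same seed.
    """
    letters = list(string.ascii_uppercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)  # NOT cryptographically secure
    return dict(zip(letters, shuffled))

def substitute(text, table):
    """Replace each upper-case letter using the table; other symbols pass through."""
    return "".join(table.get(ch, ch) for ch in text)

def invert(table):
    """Build the reverse table used at the receiving end."""
    return {cipher: plain for plain, cipher in table.items()}

table = make_substitution_table(seed=2024)
ciphertext = substitute("ATTACK AT DAWN", table)
recovered = substitute(ciphertext, invert(table))
```

Because the same table is applied to every symbol, repeated letters in the message produce repeated letters in the ciphertext. That regularity is exactly what a determined interloper would exploit to work out the substitution.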

Fig.2. Public and private key transmission. The sender encodes using the recipient's public key. The message cannot be decoded with the public key. The recipient decodes with the private key.

If security is paramount, the substitution should take the form of a one-time pad. This means that the substitution is not algorithmic, and the key is as big as the message and is used only once. The key must be truly random, which means that it should pass the so-called next-bit test. This means that no algorithm attempting to predict the next bit should be able to do better than a 50% success rate anywhere in the key.
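A minimal sketch of a one-time pad follows, assuming the message is a byte string and that combining message and key with exclusive-OR is an acceptable stand-in for the substitution. The function names are hypothetical, and secrets.token_bytes draws on the operating system's random source, which is used here only as a convenient approximation to a truly random key.

```python
import secrets

def otp_encrypt(message: bytes):
    """Encrypt with a fresh pad that is exactly as long as the message."""
    pad = secrets.token_bytes(len(message))  # assumption: OS randomness stands in for a truly random key
    ciphertext = bytes(m ^ k for m, k in zip(message, pad))
    return ciphertext, pad

def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XOR with the same pad restores the message; the pad must then be discarded."""
    if len(pad) != len(ciphertext):
        raise ValueError("pad must be the same length as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, pad))

ciphertext, pad = otp_encrypt(b"MEET AT NOON")
recovered = otp_decrypt(ciphertext, pad)
```

The pad still has to reach the recipient by some secure route and must never be reused, which is why the approach is reserved for cases where security is paramount.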

Early ciphers were symmetrical, meaning that the same key was used both to encode and decode the file, so the key had to be shared between the sender and the receiver and no-one else. This meant there was a vulnerability to interception of the key.

Subsequently the asymmetric cipher was developed, in which the key is divided into two parts, one public and one private.

Most practical security systems rely on what are called one-way functions. This means that it is not possible to tell from the output how it was generated. There is presently no mathematical proof that one-way functions exist and all that can be said is that working backwards is computationally infeasible.

By way of example if the output of an OR-gate is true, it is not possible uniquely to tell what the input states were. Clearly it would be trivial to test all the possibilities for something as simple as an OR-gate, so this is not a robust one-way function.
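The OR-gate example can be checked by brute force. The short sketch below, with hypothetical function names, lists every input pair that produces a given output.

```python
from itertools import product

def or_gate(a: bool, b: bool) -> bool:
    return a or b

def preimages(output: bool):
    """Try every input combination and keep those that give the stated output."""
    return [(a, b) for a, b in product([False, True], repeat=2)
            if or_gate(a, b) == output]

true_inputs = preimages(True)    # three candidate input pairs
false_inputs = preimages(False)  # only one candidate input pair
```

A true output leaves three candidate inputs, so the input cannot be identified uniquely, yet only four cases need to be tried. A useful one-way function needs a search space far too large to enumerate.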

A compliant MPEG bitstream is a one-way function because it is not possible to establish how the encoder works by analyzing it. That was a goal of the MPEG standard so that encoder manufacturers wouldn’t give away the secrets of their intellectual property in the bit stream.

An asymmetrical cipher uses two related keys that are generated in a single process. One of the keys can be made public, the other remains private and known only to the holder. To send a file securely to the key holder, the sender encrypts it using the public key. As encryption with the public key is a one-way function, it is not possible (strictly it is not computationally feasible) to decrypt the message without the private key. The public key may be widely known and cannot be used to decrypt.

Essentially the encryption using the public key produces a message that could have resulted from a huge number of possible input files. Only the private key can establish the correct one. The advantage of the asymmetrical cipher is that the private key doesn’t have to be shared for the transfer of encrypted files and so cannot be intercepted.

Fig.2 shows the mechanism of public-private key transmission. In order to render decryption without the private key computationally infeasible, the keys in asymmetrical encryption are much larger than in symmetrical encryption.
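The public and private key relationship can be illustrated with a toy version of the RSA scheme, which is one well-known asymmetrical cipher and is chosen here as an assumption, since the text does not name a particular one. The primes 61 and 53 and the exponent 17 are deliberately tiny so the arithmetic can be followed, the function names are hypothetical, and the padding a real system needs is omitted. It requires Python 3.8 or later for the modular inverse.

```python
def generate_toy_keypair(p=61, q=53, e=17):
    """Generate both keys in a single process from two secret primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # private exponent: modular inverse of e
    return (n, e), (n, d)      # (public key, private key)

def encrypt(message: int, public_key) -> int:
    n, e = public_key
    return pow(message, e, n)  # anyone holding the public key can do this

def decrypt(ciphertext: int, private_key) -> int:
    n, d = private_key
    return pow(ciphertext, d, n)

def break_by_factoring(public_key):
    """Recover the private key by trial division; feasible only because n is tiny."""
    n, e = public_key
    p = next(k for k in range(2, n) if n % k == 0)
    q = n // p
    return (n, pow(e, -1, (p - 1) * (q - 1)))

public_key, private_key = generate_toy_keypair()
ciphertext = encrypt(65, public_key)
recovered = decrypt(ciphertext, private_key)
cracked_key = break_by_factoring(public_key)
```

With a modulus this small, break_by_factoring recovers the private key at once. That is the reason practical keys are made so large: the same search must take longer than any attacker can afford.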

Fig.3. The owner of the private key encodes a message using it. Anyone having the public key can decode the message, but if that is possible, the message can only have come from the key owner.

The two keys can be used in another way. Fig.3 shows that if the key holder uses a private key to add a checksum to a message, the recipient or recipients of the message can use the public key to check the message. If the public key check succeeds, the message must be from the key holder and has not been modified en route. This is the basis of digital signatures that are used to validate transactions. It is important that the messages sent encoded with the private key are always different in order to make it harder to compromise the private key.
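The signing mechanism can be sketched in the same toy RSA style. Everything here is an assumption for illustration: the two primes are small, publicly known Mersenne primes, the checksum is a SHA-256 digest reduced modulo the key, and the function names are hypothetical. A real signature scheme uses far larger keys and a standardized padding method. It requires Python 3.8 or later.

```python
import hashlib

# Toy key material: two small, publicly known Mersenne primes (illustration only).
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, known only to the key holder

def checksum(message: bytes) -> int:
    """Reduce a SHA-256 digest to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Key holder: transform the checksum with the private key."""
    return pow(checksum(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Recipient: undo the transform with the public key and compare checksums."""
    return pow(signature, E, N) == checksum(message)

signature = sign(b"pay 100 to Alice")
genuine = verify(b"pay 100 to Alice", signature)
tampered = verify(b"pay 900 to Alice", signature)
```

The check passes for the original message and fails if the message is altered in transit, because the altered message no longer matches the signed checksum.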

When the distributed technology commonly called the cloud is used for storage, strong encryption of important data such as copyright material is an elementary precaution. The problem comes when computing power in the cloud is used to perform processes on important data. By definition no processing can take place on encrypted data. It has to be decrypted first. This has two consequences. Firstly, it means that the key to the encrypted data must be known to others and secondly whilst being processed the data are unencrypted and therefore vulnerable to theft.

Whilst much is made of the economics of cloud computing, little is said about the cost of a security breach, which could easily eclipse any saving. It is not possible for a cloud user independently to assess the security of the cloud and so it cannot be assumed to be secure. The best form of security is to know where your data are and what access there is to them.
