In the UK we have Oxford v Cambridge. In the USA it’s Princeton v Harvard. The only difference is that one is a boat race and the other is a computer architecture race.
Princeton and Harvard were the locations where the early US development of computing took place. The early machines in these two locations differed in architecture and the names stuck, just as they did for the hamburger and the frankfurter.
Whilst no one would confuse those foodstuffs, the same cannot be said for the architectures. Today the definition of Harvard architecture seems to be rather elusive.
Long ago, when life was much simpler, the two dominant computer architectures looked like Fig. 1. The Princeton architecture, also known as the von Neumann architecture after its inventor, is characterised by using the same memory for instructions and for the data to be processed. In contrast, the Harvard architecture had a separate memory with its own bus or pathway for instructions.
In most treatments of the differences, the concept of security is not mentioned even though the differences from the security point of view are great. Traditionally computer technology goes for speed first and never mind the consequences: a silicon version of James Dean. When processors were relatively slow, the memory speed and the architecture were not limiting factors, but as processors got faster, both became important.
Fig. 1. a) The von Neumann or Princeton architecture has one address space and one memory that is used for instructions and variables alike. Separation between them is primarily logical, with some help from memory management hardware. b) The Harvard architecture has separate memories for instructions and data and each has its own address space. This gives the Harvard architecture an extra level of executable space protection because the separation is physical.
The processor has to fetch instructions, so it knows what to do, and it has to fetch data to be processed and output the results. If these processes take too long, the processor has to wait and is said to be memory bound. In the case of the Harvard architecture, there is a speed advantage. Because the instruction and data memories are separate, they can be accessed simultaneously. In the von Neumann architecture a single memory has to be accessed sequentially.
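The speed argument can be put into numbers with a deliberately crude model. The sketch below assumes every memory access takes one cycle, that the processor never stalls for any other reason, and that a Harvard machine overlaps instruction and data accesses perfectly; the function names and the workload figures are illustrative, not measurements of any real machine.

```python
def cycles_von_neumann(instruction_fetches, data_accesses):
    # One shared memory: every access takes its own turn on the single bus.
    return instruction_fetches + data_accesses


def cycles_harvard(instruction_fetches, data_accesses):
    # Separate memories: an instruction fetch and a data access can overlap,
    # so the longer of the two streams sets the total (idealised assumption).
    return max(instruction_fetches, data_accesses)


# Hypothetical workload: 1000 instructions, 600 of which touch data once.
print(cycles_von_neumann(1000, 600))  # 1600
print(cycles_harvard(1000, 600))      # 1000
```

Real machines fall somewhere between the two figures, but the model shows why the single memory becomes the bottleneck once the processor is fast enough to keep both streams busy.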
Another advantage of different memories is that they do not have to have the same word length. The instruction memory can be optimised for the instruction word length and the data memory can have a different word length. Data are often organized as bytes, whereas typical instructions are larger than a byte. The Harvard machine also has two address spaces and addresses must individually be created for the two memories. Instruction memory would typically be addressed from the program counter.
Exploring that difference further, a Harvard machine does not have to use the same memory technology for instructions and data. From a security standpoint that has some advantages. For example, the operating system could be stored in memory that is write-locked or implemented in read-only memory, meaning it cannot be changed, whether by accident or by malicious intent.
When programs contain loops, where every iteration executes the same sequence of instructions, a speed advantage can be had using a cache memory, which is a high-speed memory close to the processor that keeps a record of the most recent memory accesses. Both the address and the data are stored. If the processor returns to an address that is cached, the address is recognised and the data are provided more quickly from the cache than from the main memory.
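As a rough illustration of that behaviour, the sketch below models a tiny cache that stores address and data pairs and discards the least recently used entry when full. The `Cache` class, its capacity and the memory contents are all assumptions made for the example; real caches work on lines of several words and use hardware-friendly replacement schemes.

```python
from collections import OrderedDict


class Cache:
    """Small cache holding the most recently used (address, data) pairs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, address, main_memory):
        if address in self.lines:
            # Address recognised: serve the data from the cache.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Miss: go to main memory and remember the result.
        self.misses += 1
        data = main_memory[address]
        self.lines[address] = data
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # discard the least recently used entry
        return data


main_memory = {addr: addr * 2 for addr in range(64)}  # hypothetical contents
cache = Cache(capacity=8)
for _ in range(10):          # a loop of four instructions, run ten times
    for pc in range(4):
        cache.read(pc, main_memory)
print(cache.hits, cache.misses)  # 36 4
```

Only the first pass through the loop goes to main memory; the other nine iterations are served entirely from the cache.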
It is very simple to equip a Harvard architecture machine with two cache memories, one for instructions and one for data as each will be connected to a different pathway and they work independently. Initially the von Neumann machines only had one cache memory that had to work with instructions and data. The result was that the cache would get overwritten with data and when an instruction was wanted again it would no longer be there.
The solution was to provide the von Neumann machine with two cache memories as shown in Fig. 2. This is often called a split cache. As the processor knows whether it is accessing an instruction or fetching data, it can determine which cache to use. With instructions and data separated in that way, the chances of a looped program getting cache hits when instructions were fetched went up significantly. The question then became what the idea should be called.
Despite the underlying architecture still being that of the von Neumann machine, with the same main memory having a single address space and a single word length shared between data and instructions, the split cache approach that separated instruction and data caches caused the idea to be called a modified Harvard architecture. This has caused endless confusion. According to that definition, practically every modern computer has a modified Harvard architecture. The hamburger has become a modified frankfurter. The result is that the naming of computer architectures has become unreliable and it is always necessary to refer to a block diagram to see what is going on.
Fig. 2. In the split cache approach, the single main memory bus is fitted with two caches, one for instructions and one for data. The processor knows which is involved in any transaction and can enable the appropriate cache.
With different cache memories for data and instructions, it would be possible for a machine with a single main memory to access instructions and data simultaneously. For example if the instruction fetch resulted in a cache hit, a data fetch could come from the data cache or from main memory. In the case of cache misses, the machine reverts to the von Neumann performance. A split cache machine has only one address space, whereas the true Harvard machine has two.
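The effect of splitting the cache can be seen in a toy simulation. The sketch below assumes a four-instruction loop in which every instruction reads two fresh data words, a least-recently-used replacement policy, and the same total capacity in both cases (one cache of eight entries against two of four). The `LRUCache` class and all the numbers are hypothetical and chosen only to make the effect visible.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit; on a miss, load the entry and evict the oldest."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True
        self.lines[address] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)
        return False


def instruction_hits(icache, dcache, iterations=10, loop_len=4, data_per_instr=2):
    """Count instruction-fetch hits for a loop that streams through new data."""
    hits = 0
    data_addr = 1000  # data lives elsewhere in the same single address space
    for _iteration in range(iterations):
        for pc in range(loop_len):
            hits += icache.access(pc)        # instruction fetch
            for _word in range(data_per_instr):
                dcache.access(data_addr)     # data fetch, never repeated
                data_addr += 1
    return hits


unified = LRUCache(8)
unified_hits = instruction_hits(unified, unified)        # one cache for everything
split_hits = instruction_hits(LRUCache(4), LRUCache(4))  # split cache
print(unified_hits, split_hits)  # 0 36
```

In the unified case the streaming data pushes every instruction out before the loop comes round again, so no instruction fetch ever hits. With the split cache the data can no longer displace the instructions, and only the first pass through the loop misses.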
This leads to a more recent and useful definition of the Harvard machine, which is that it always has two address spaces. From a computer security standpoint, that definition is especially meaningful, because it is the separation of address spaces that gives the Harvard architecture an edge in security measures where it allows better executable space protection. Unfortunately that definition is widely ignored.
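What two address spaces mean in practice can be shown with a toy machine. In the sketch below the instruction memory and data memory are separate objects with separate address ranges, and the only store operation takes a data-space address, so there is simply no way for a running program to name a location in instruction space. The `HarvardMachine` class and its three opcodes are invented for the illustration and do not correspond to any real instruction set.

```python
class HarvardMachine:
    def __init__(self, program, data_size=16):
        self.instructions = tuple(program)  # instruction space, read-only here
        self.data = bytearray(data_size)    # data space, with its own addresses
        self.pc = 0                         # program counter addresses instructions
        self.acc = 0

    def run(self):
        while self.pc < len(self.instructions):
            op, arg = self.instructions[self.pc]  # fetch via the program counter
            self.pc += 1
            if op == "LOADI":
                self.acc = arg
            elif op == "STORE":
                self.data[arg] = self.acc         # arg is a data-space address
            elif op == "LOAD":
                self.acc = self.data[arg]
            else:
                raise ValueError(f"unknown opcode {op!r}")


# Address 0 exists in both spaces, but they are different locations.
program = [("LOADI", 0xFF), ("STORE", 0), ("LOAD", 0)]
machine = HarvardMachine(program)
machine.run()
print(machine.data[0], machine.instructions[0])  # 255 ('LOADI', 255)
```

The store to address 0 changes data memory only; the instruction at address 0 is untouched. On a von Neumann machine the same store would have overwritten the first instruction unless memory management hardware intervened.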
The Harvard architecture is popular in digital signal processing (DSP) in applications such as image and audio processing where the same algorithm is repeatedly run on a steady throughput of data. The improved executable space protection is not really needed in DSP applications, but the increased speed is useful. In present Harvard machines, built for speed, the instruction memory is typically not very big. A Harvard machine built for security would have to be different, not in principle, but in detail.
For high security applications, it is possible to conceive of a modified Harvard architecture machine in which the operating system resides in instruction space that will be accessed in kernel mode in a memory that is either ROM or write-locked RAM (perhaps loaded from flash memory on startup). An area of instruction memory would need to be read-write capable to accommodate stacks and so on. All peripheral and I/O control registers would be in instruction space. The modified architecture also allows instructions to be fetched from data memory, so that suitable application code written for von Neumann machines can be run. When executing code from data memory, the processor cannot be in kernel mode.
Applications cannot affect the memory management unit controlling main memory, because the registers are not in the data address space. Applications cannot control the peripherals because they are not in data address space either. If an application wants data, it has to request the OS to get it. The OS sends commands to a peripheral, such as a hard drive, in instruction space, but the disk controller can transfer data by DMA to data space so the application can find it. This means that device controllers have to be specifically designed for security. This is not a big deal, as many disk controllers use separate buses; one for commands and status and one for mass transfer.
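The rules described above can be sketched as a small model. It is a simplification under stated assumptions: all code fetched from instruction space is treated as operating system code running in kernel mode, the control registers are reduced to a single hypothetical `disk_command` register, and DMA is a plain copy into data space. The class, method and register names are invented for the example.

```python
class ProtectionFault(Exception):
    pass


class SecureMachine:
    def __init__(self):
        # Peripheral control registers live in instruction space only.
        self.control_registers = {"disk_command": 0}
        self.data = bytearray(16)   # data space, visible to applications
        self.kernel_mode = False

    def fetch_from(self, space):
        # Code fetched from data memory can never run in kernel mode.
        self.kernel_mode = (space == "instruction")

    def write_control(self, name, value):
        if not self.kernel_mode:
            raise ProtectionFault("control registers are not in data address space")
        self.control_registers[name] = value

    def dma_to_data(self, address, payload):
        # The device controller delivers its data into data space.
        self.data[address:address + len(payload)] = payload


machine = SecureMachine()

machine.fetch_from("data")            # application code is running
try:
    machine.write_control("disk_command", 1)
    blocked = False
except ProtectionFault:
    blocked = True

machine.fetch_from("instruction")     # the OS handles the request instead
machine.write_control("disk_command", 1)
machine.dma_to_data(0, b"\x2a")       # the controller places the result in data space

print(blocked, machine.control_registers["disk_command"], machine.data[0])  # True 1 42
```

The application never touches the controller: its direct attempt is refused, the operating system issues the command from instruction space, and the result arrives in data space where the application can read it.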
In a modern computer the arithmetic logic unit forms a surprisingly small part of the machine, hiding amongst split cache memories, memory management units and floating-point processors. It is not too difficult to conceive of a computer that has not just a separate address space for the operating system, but a separate processor as well. A kernel-only processor does not need to have the same instruction set as the main processor, and it could happily run with a reduced instruction set.
Although the main purpose of a kernel processor is security, it would also speed up the main processor as switching between user and kernel mode could be much faster. Housekeeping processes performed by the kernel processor would be in parallel to the main processor.
With such executable space protection, the operating system cannot be modified and code cannot be written to it or changed by malware. The opportunity is there to build machines that are practically un-hackable and no new technology is needed to create them. All that is needed is the commitment to proceed in that direction. Logic suggests that any IT problem should define what the software needs to do and then hardware needs to be found that will provide an environment for that software. This is only a further application of that logic. The problem is not the technology, but that James Dean is still at the wheel.