Abstract
This paper discusses the limitations of the Von Neumann architecture and the opportunities for parallel processing offered by non-Von Neumann architectures. It provides background on the Von Neumann model and its key components, then explains how the increasing demand for faster computers is driving researchers toward parallel processing approaches, such as SIMD and MIMD, that can solve larger and more complex problems by distributing work across multiple processors. Finally, it describes non-Von Neumann architectures and parallel algorithms.
Non-Von Neumann Architectures: A Study into Parallel Processing and Parallel Algorithms
The Von Neumann Architecture, proposed by the brilliant mathematician John Von Neumann in about 1946, has served the computer science field for over seven and a half decades. However, many computer scientists believe that it may be reaching the end of its useful life [1]. This is because the problems computers are being asked to solve have grown significantly in size and complexity since the appearance of the first-generation computers in the late 1940s and early 1950s [1].
It must also be stated that computer engineers, through advances in hardware design, manufacturing methods, and circuit technology, have been able to take the basic sequential architecture described by Von Neumann and improve its performance by three or four orders of magnitude [1]. Today, even a desktop computer can perform about 300 to 500 million instructions per second. However, "even though the rate of increase of performance of newer machines is slowing down, the size of problems that researchers are attempting to solve is not slowing down" [1]. New application domains such as computational modelling, real-time graphics, and bioinformatics are greatly increasing the demands placed on new computer systems [1].
Due to this demand for faster computers, computer engineers are rethinking the basic ideas of the Von Neumann model [1]. The resulting new approaches to computer organization are known as Non-Von Neumann Architectures. They rest on a basic principle: "if you cannot build something to work twice as fast, do two things at once. The results will be identical" [1]. This truism leads to Parallel Processing: building computers with tens, hundreds, or thousands of processors, so that when each processor is engaged in meaningful work, the solution to large and complex problems is sped up.
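As a rough illustration of this "do two things at once" principle, the following sketch splits one summation into two halves that run concurrently. Python threads are used here purely to illustrate the idea; an actual speedup requires genuinely parallel hardware with multiple processors.

```python
# "Do two things at once": split one large task into two halves and run them
# concurrently, then combine the partial results. The results are identical
# to doing the work sequentially.
from concurrent.futures import ThreadPoolExecutor

numbers = list(range(1_000_000))
mid = len(numbers) // 2

with ThreadPoolExecutor(max_workers=2) as pool:
    # Each worker sums one half of the data.
    lo = pool.submit(sum, numbers[:mid])
    hi = pool.submit(sum, numbers[mid:])
    total = lo.result() + hi.result()

print(total == sum(numbers))  # True
```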
This paper is concerned with this concept of Parallel Processing and with its close companion, Parallel Algorithms. The paper first describes the Von Neumann Architecture, then moves on to the various types of Non-Von Neumann Architectures, and finally to Parallel Algorithms.
The Von Neumann Architecture
The Von Neumann Architecture is based on the following three features:
• A computer consists of four major subsystems namely Memory, Input/Output, the
Arithmetic/Logic Unit (ALU) and the Control Unit [1].
• The Stored Program Concept, in which instructions to be executed by the computer are
represented as binary values and stored in memory [1].
• The Sequential Execution of Instructions, in which one instruction at a time is fetched from memory to the control unit, where it is decoded and executed [1].
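The sequential fetch-decode-execute cycle described above can be sketched in software. The toy machine below is a hypothetical illustration (the opcodes and instruction format are invented for this sketch, not taken from [1]): instructions and data share a single memory, and a control loop fetches, decodes, and executes one instruction at a time.

```python
# A toy stored-program machine: instructions and data share one memory, and
# the control unit fetches, decodes, and executes one instruction at a time.
# The instruction format and opcode names are invented for illustration.

def run(memory):
    """Execute the program stored in `memory`; return the accumulator."""
    acc = 0      # a single register holding operands and results
    pc = 0       # program counter: address of the next instruction
    while True:
        op, arg = memory[pc]          # fetch
        pc += 1
        if op == "LOAD":              # decode and execute
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return acc

# Program: compute memory[10] + memory[11] and store the result in memory[12].
memory = [("LOAD", 10), ("ADD", 11), ("STORE", 12), ("HALT", 0),
          None, None, None, None, None, None,   # unused cells
          3, 4, 0]                              # data at addresses 10-12
print(run(memory))  # 7
```

Note how the stored-program concept appears directly: the program is just data in memory, and sequential execution is the `while` loop advancing the program counter.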
We describe below the various subsystems of a computer as outlined by the Von Neumann
Architecture.
Memory
Memory is the functional unit of a computer that stores and retrieves the instructions and data being executed [1]. Computer memory uses an access technique known as random access, and the acronym RAM (Random Access Memory) is frequently used to refer to the memory unit [1]. The acronym ROM (Read-Only Memory) denotes a random access memory in which the ability to store information is disabled; it is only possible to read, or fetch, information [1].
Input/Output and Mass Storage
The Input/Output (I/O) units are the devices that allow the computer to communicate with the outside world and to store information [1]. Unlike RAM, which is volatile storage whose contents disappear when the computer is shut down, mass storage devices such as disks and tapes are nonvolatile [1]. There are two groups of I/O devices [1]: those that represent information in human-readable form, such as keyboards, screens, and printers, and those that store information in machine-readable form, such as floppy disks, hard disks, CD-ROMs, DVDs, and streaming tapes [1].
Arithmetic/Logic Unit (ALU)
The Arithmetic/Logic Unit is the subsystem that performs mathematical and logical operations such as addition, subtraction, and comparison for equality [1]. Although the ALU can be seen as a component conceptually distinct from the Control Unit, modern computers combine them into a single component known as a processor [1]. The ALU is made up of three parts: registers, the interconnections between components, and the ALU circuitry [1]. Together these components are called the data path [1]. A register is a storage cell that holds the operands of an arithmetic operation as well as its result [1]. Registers differ from memory cells in the following ways:
• They do not have numeric memory addresses but are accessed by a special register designator such as A, X, or R0.
• They can be accessed more quickly than memory cells.
• They are not used for general-purpose storage but for specific purposes such as holding the operands of an arithmetic operation.
Control Unit
One of the defining characteristics of the Von Neumann Architecture is the concept of a stored program [1]. The task of the control unit is to fetch the next instruction from memory, decode it, and execute it [1]. The set of operations that can be executed by a processor is called its instruction set [1]. One way of designing an instruction set is to make it as small and simple as possible, say 30 to 50 instructions [1]. Computers with this type of instruction set are called Reduced Instruction Set Computers, or RISC machines [1]. This approach minimizes the amount of hardware circuitry (gates and transistors) needed to build a processor [1]. The opposite philosophy, common on computers of the 1970s, 1980s, and 1990s, is to include a large number, say 300 to 500, of very powerful instructions in the instruction set [1]. These types of processors are called Complex Instruction Set Computers, or CISC machines [1]. As such, CISC machines are more complex, more expensive, and more difficult to build [1].
Non-Von Neumann Architectures
The inability of the sequential, one-instruction-at-a-time Von Neumann model to handle complex and large-scale problems is called the Von Neumann bottleneck [1]. This is a fundamental problem of computer organization [1]. The ideas proposed for solving this problem are collectively called Non-Von Neumann Architectures. One form of these new approaches to overcoming the Von Neumann bottleneck is Parallel Processing [1]. There are two main ways of designing parallel processing systems: SIMD Parallel Processing (Single Instruction Stream/Multiple Data Stream) and MIMD Parallel Processing (Multiple Instruction Stream/Multiple Data Stream) [1].
It must be stated that Parallel Processing is a catch-all term for a variety of computing architectures and algorithmic approaches that try to provide the speed needed for the kind of complex and large-scale problems researchers are trying to solve [2].
SIMD Parallel Processing
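In a SIMD machine there is a single control unit broadcasting one instruction stream, and many processors that each execute that instruction in lockstep on their own data element. A minimal software sketch of this lockstep behaviour follows; the opcode names and the list standing in for the processors' local data are illustrative assumptions, not a real SIMD machine.

```python
# SIMD: one instruction stream, multiple data streams. A single control unit
# broadcasts each instruction, and every (simulated) processor executes it
# simultaneously on its own local data element.

data = [1, 2, 3, 4]                  # one element per simulated processor

program = [("MUL", 10), ("ADD", 5)]  # the single shared instruction stream

for op, operand in program:          # the control unit broadcasts each instruction
    if op == "MUL":
        data = [x * operand for x in data]   # all processors execute it in lockstep
    elif op == "ADD":
        data = [x + operand for x in data]

print(data)  # [15, 25, 35, 45]
```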
MIMD Parallel Processing
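In an MIMD machine, by contrast, each processor runs its own independent instruction stream on its own data. A minimal sketch, assuming Python worker threads as stand-ins for independent processors:

```python
# MIMD: multiple instruction streams, multiple data streams. Each worker
# executes a different program on different data; the results are then
# collected by the coordinating program.
from concurrent.futures import ThreadPoolExecutor

evens = [2, 4, 6]
words = ["non", "von", "neumann"]

with ThreadPoolExecutor(max_workers=2) as pool:
    total = pool.submit(sum, evens)          # "processor" 1: arithmetic
    joined = pool.submit(" ".join, words)    # "processor" 2: text processing

print(total.result(), joined.result())  # 12 non von neumann
```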
References
1. G. M. Schneider and J. L. Gersting, Invitation to Computer Science, 3rd Edition, Chapter 5.
2. G. M. Schneider and J. L. Gersting, Invitation to Computer Science, 3rd Edition, Chapter 9.
3. https://www.tutorialspoint.com/parallel_algorithm/parallel_algorithm_introduction.htm#