UNIT - V
Syllabus
Need, types, applications and challenges,
Architecture of Parallel Systems-Flynn’s
classification; ARM Processor: The Thumb
instruction set, Processor and CPU cores,
Instruction Encoding format, Memory load
and Store instruction, Basics of I/O operations.
Case study: ARM 5 and ARM 7 Architecture
Need of Architecture of Parallel Systems-Flynn’s
classification
Parallel computing is a form of computing in which a job is broken into discrete parts that can
be executed concurrently.
Each part is further broken down into a series of instructions.
Instructions from each part execute simultaneously on different CPUs.
Dividing the parts of a task among multiple processors helps to
reduce the amount of time needed to run a program.
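The idea of breaking a job into parts and running the parts concurrently can be sketched in Python. This is a minimal illustration, not a benchmark: the function names and the sum-of-squares workload are assumptions made for the example. Note that in standard CPython, threads do not execute Python bytecode truly in parallel, so this shows the structure of the decomposition rather than a real speed-up; a process pool would be the usual choice for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, num_parts):
    """Break the job into roughly equal contiguous parts."""
    size = max(1, (len(data) + num_parts - 1) // num_parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(part):
    """The work done on one part: a plain sequential series of instructions."""
    total = 0
    for value in part:
        total += value * value
    return total


def parallel_sum_of_squares(data, num_workers=4):
    """Run each part on its own worker, then combine the partial results."""
    parts = split_into_parts(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, parts))
    return sum(partial_sums)


if __name__ == "__main__":
    numbers = list(range(1, 1001))
    print(parallel_sum_of_squares(numbers))
```

The final combining step (adding the partial sums) is the part that has to be coordinated, which is one reason parallel programs are harder to write than sequential ones.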
Parallel systems deal with the simultaneous use of multiple computer resources that
can include a single computer with multiple processors, a number of computers
connected by a network to form a parallel processing cluster, or a combination of both.
Parallel systems are more difficult to program than computers with a single
processor because the architecture of parallel computers varies widely and
the processes of multiple CPUs must be coordinated and synchronized.
A difficult problem in parallel processing is portability.
An instruction stream is a sequence of instructions that are read from memory.
A data stream is the sequence of data on which those instructions operate in the processor.
Flynn’s taxonomy is a classification scheme for computer architectures proposed by
Michael Flynn in 1966. The taxonomy is based on the number of instruction
streams and data streams that can be processed simultaneously by a computer
architecture.
The need for different architectures of parallel systems, as defined by Flynn's
classification, arises from the variety of applications, performance requirements,
and efficiency considerations in the world of computing. Different architectures are
tailored to specific use cases to optimize system performance and resource
utilization.
Performance Improvement: Parallelism allows for the execution of multiple tasks
simultaneously, leading to improved performance and reduced execution times for
computationally intensive applications.
Scalability: Parallel systems can be scaled by adding more processing units, making
them suitable for both small and large-scale computational tasks.
Diverse Applications: Different applications have varying degrees of parallelism.
SIMD architectures are well-suited for tasks like image and video processing, while
MIMD architectures are better for general-purpose computing.
Efficiency: Parallel architectures can optimize resource utilization by distributing
workloads efficiently across multiple processors, reducing idle time and enhancing
overall system efficiency.
Fault Tolerance: In some cases, MISD architectures are used for fault tolerance,
where multiple instructions can be executed to provide redundancy and reliability.
Real-Time Processing: Some real-time control systems require multiple instructions
to be executed on a single data stream to meet strict timing constraints.
Types of Architecture of Parallel Systems-Flynn’s classification
 Single Instruction Single Data (SISD): In a SISD architecture, there is a
single processor that executes a single instruction stream and operates
on a single data stream. This is the simplest type of computer
architecture and is used in most traditional computers.
 Single Instruction Multiple Data (SIMD): In a SIMD architecture, there
is a single processor that executes the same instruction on multiple
data streams in parallel. This type of architecture is used in
applications such as image and signal processing.
 Multiple Instruction Single Data (MISD): In a MISD architecture,
multiple processors execute different instructions on the same data
stream. This type of architecture is not commonly used in practice, as
few applications need several different instruction streams applied
to one data stream.
 Multiple Instruction Multiple Data (MIMD): In a MIMD architecture,
multiple processors execute different instructions on different data
streams. This type of architecture is used in distributed computing,
parallel processing, and other high-performance computing
applications.
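Because the taxonomy depends only on how many instruction streams and data streams are active at once, the four classes can be derived mechanically. The small sketch below illustrates this; the function name and the use of plain integer counts are assumptions made for the example.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn category for the given numbers of concurrent streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"  # Single or Multiple
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


if __name__ == "__main__":
    for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
        print(streams, flynn_class(*streams))
```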
Single-instruction, single-data (SISD) Systems
 An SISD computing system is a uniprocessor machine that
is capable of executing a single instruction, operating on a
single data stream.
 In SISD, machine instructions are processed in a sequential
manner and computers adopting this model are popularly
called sequential computers.
 Most conventional computers have SISD architecture. All
the instructions and data to be processed have to be stored
in primary memory.
 The speed of the processing element in the SISD model is
limited by (dependent on) the rate at which the computer can
transfer information internally.
 Dominant representative SISD systems are the IBM PC and
workstations.
Single-instruction, multiple-data (SIMD) systems
 An SIMD system is a multiprocessor machine capable of
executing the same instruction on all the CPUs but
operating on different data streams.
 Machines based on the SIMD model are well suited to
scientific computing, since such workloads involve many vector and
matrix operations.
 So that work can be passed to all the processing elements (PEs),
the data elements of a vector are divided into multiple sets
(N sets for an N-PE system), and each PE processes one data set.
 Dominant representative SIMD systems are Cray’s vector
processing machines.
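The division of a vector into N sets for N PEs can be sketched as follows. This is a software model only: the interleaved assignment of elements to PEs, the function names, and the doubling operation are assumptions chosen for illustration, and real SIMD hardware applies the operation in lockstep rather than in a Python loop.

```python
def partition(vector, num_pes):
    """Divide the vector into num_pes sets: PE k gets elements k, k+N, k+2N, ..."""
    return [vector[k::num_pes] for k in range(num_pes)]


def simd_step(operation, data_sets):
    """Every PE applies the same operation to its own data set."""
    return [[operation(x) for x in data_set] for data_set in data_sets]


def gather(results, length):
    """Put the per-PE results back into their original vector positions."""
    num_pes = len(results)
    out = [None] * length
    for k, part in enumerate(results):
        out[k::num_pes] = part
    return out


if __name__ == "__main__":
    vector = list(range(1, 9))
    sets = partition(vector, 4)              # one data set per PE
    doubled = simd_step(lambda x: 2 * x, sets)
    print(gather(doubled, len(vector)))
```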
Multiple-instruction, single-data (MISD) systems
 An MISD computing system is a multiprocessor machine
capable of executing different instructions on different
PEs but all of them operate on the same dataset.
 Example Z = sin(x)+cos(x)+tan(x)
 The system performs different operations on the same
data set.
 Machines built using the MISD model are not useful in
most applications; a few machines have been built, but none of
them are available commercially.
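The Z = sin(x)+cos(x)+tan(x) example can be modelled in a few lines: each "PE" runs a different operation, but all of them receive the same input x. The function names are assumptions made for this sketch, and the operations run one after another here rather than on separate hardware units.

```python
import math


def misd_evaluate(x, instruction_streams):
    """Each PE runs a different operation on the same data item x."""
    return [operation(x) for operation in instruction_streams]


def z(x):
    """Combine the per-PE results: Z = sin(x) + cos(x) + tan(x)."""
    partial_results = misd_evaluate(x, [math.sin, math.cos, math.tan])
    return sum(partial_results)


if __name__ == "__main__":
    print(z(0.5))
```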
Multiple-instruction, multiple-data (MIMD) systems
 An MIMD system is a multiprocessor machine that is
capable of executing multiple instructions on multiple
data sets.
 Each PE in the MIMD model has separate instruction and
data streams; therefore, machines built using this model
can support any kind of application.
 Unlike SIMD and MISD machines, PEs in MIMD machines
work asynchronously.
 MIMD machines are broadly categorized into shared-
memory MIMD and distributed-memory MIMD based on
the way PEs are coupled to the main memory.
 In the shared memory MIMD model (tightly coupled multiprocessor systems), all the PEs are
connected to a single global memory and they all have access to it.
 The communication between PEs in this model takes place through the shared memory; a modification
of the data stored in the global memory by one PE is visible to all other PEs.
 The dominant representative shared memory MIMD systems are Silicon Graphics machines and
Sun/IBM’s SMP (Symmetric Multi-Processing).
 In Distributed memory MIMD machines (loosely coupled multiprocessor systems) all PEs
have a local memory.
 The communication between PEs in this model takes place through the interconnection
network (the inter-process communication channel, or IPC).
 The network connecting PEs can be configured as a tree, a mesh, or another topology in accordance
with the requirement.
 The shared-memory MIMD architecture is easier to program but is less tolerant to failures
and harder to extend than the distributed memory MIMD model.
 Failures in a shared-memory MIMD affect the entire system, whereas this is not the case in
the distributed model, in which each of the PEs can be easily isolated.
 Moreover, shared memory MIMD architectures are less likely to scale because the addition
of more PEs leads to memory contention.
 This is a situation that does not happen in the case of distributed memory, in which each
PE has its own memory.
 As a result of these practical outcomes and user requirements, the distributed memory MIMD
architecture is often preferred over the other models for large systems.
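The difference between the two MIMD memory models can be sketched with threads. In the first function all workers update one shared location and must coordinate with a lock; in the second, each worker keeps its data private and only sends a message. This is an analogy rather than a real multiprocessor: the function names are assumptions, and the queue merely stands in for the interconnection network.

```python
import queue
import threading


def shared_memory_sum(parts):
    """Shared-memory style: every worker updates one global total."""
    state = {"total": 0}
    lock = threading.Lock()

    def worker(part):
        local = sum(part)
        with lock:  # coordinate access to the shared location
            state["total"] += local

    threads = [threading.Thread(target=worker, args=(p,)) for p in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["total"]


def distributed_memory_sum(parts):
    """Distributed-memory style: workers own their data and send messages."""
    channel = queue.Queue()  # stands in for the interconnection network

    def worker(part):
        channel.put(sum(part))  # only the result message leaves the worker

    threads = [threading.Thread(target=worker, args=(p,)) for p in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(channel.get() for _ in parts)


if __name__ == "__main__":
    work = [[1, 2, 3], [4, 5], [6]]
    print(shared_memory_sum(work), distributed_memory_sum(work))
```

The lock in the first version is where memory contention appears as more workers are added; the second version avoids the shared location but pays for every message it sends.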
Applications of Architecture of Parallel
Systems-Flynn’s classification
 SISD (Single Instruction, Single Data):
 Traditional single-core processors used in personal computers and laptops.
 Sequential programming and tasks that cannot be parallelized.
 SIMD (Single Instruction, Multiple Data):
 Graphics Processing: SIMD architectures are widely used in GPUs for rendering images
and videos, as they can perform the same calculations on multiple pixels or vertices
simultaneously.
 Scientific Computing: SIMD is beneficial for scientific simulations and calculations where
the same operation is applied to a large dataset.
 Encryption: SIMD instructions can be used to speed up encryption algorithms, such as
AES.
 MISD (Multiple Instruction, Single Data):
 Fault-Tolerant Systems: MISD architectures can be used for fault tolerance by having multiple
processing units execute different instructions on the same data. Errors in one unit can be cross-
checked by others.
 Real-Time Systems: In certain real-time control systems, multiple instructions may need to be
applied to the same data stream to ensure precise timing and control.
 MIMD (Multiple Instruction, Multiple Data):
 Cluster and Distributed Computing: MIMD systems are widely used in clusters of computers, data
centers, and distributed computing environments where multiple machines independently process
data and tasks.
 Scientific Simulations: Complex simulations in physics, chemistry, weather forecasting, and
computational biology often rely on MIMD architectures to divide computational workloads.
 Parallel Databases: MIMD architectures can speed up database operations, enabling simultaneous
queries and updates.
 Supercomputing: Many of the world's fastest supercomputers are MIMD-based systems, used for
simulations, climate modeling, and scientific research.
Challenges of Architecture of Parallel Systems-Flynn’s classification
The key challenges associated with each category:
 SISD (Single Instruction, Single Data):
 Lack of Parallelism: SISD systems do not exploit parallelism, which can limit their
performance for computationally intensive tasks.
 Inefficient for Parallel Workloads: Applications that inherently involve parallelism may
not benefit from SISD architectures.
 SIMD (Single Instruction, Multiple Data):
 Data Dependencies: SIMD architectures require careful management of data
dependencies. If data elements are dependent on each other, parallelism can be limited.
 Programming Complexity: Writing software that effectively utilizes SIMD instructions can
be complex and require expertise in vectorization.
 Limited Applicability: SIMD architectures are best suited for tasks with a high degree of
data-level parallelism, which may not cover all computational needs.
 MISD (Multiple Instruction, Single Data):
 Complex Coordination: Coordinating multiple instruction streams operating on the same
data can be challenging and may introduce overhead.
 Limited Practical Use: MISD architectures are rare in practice and are typically used in
specialized systems. They have limited applicability in mainstream computing.
 MIMD (Multiple Instruction, Multiple Data):
 Synchronization Overhead: Managing synchronization between multiple processors in
MIMD systems can be complex and introduce overhead, potentially reducing overall
performance.
 Load Balancing: Ensuring that workloads are evenly distributed among processors is
crucial for efficient MIMD operation. Load imbalances can lead to underutilization of
resources.
 Communication Overhead: Data exchange and communication between processors in
MIMD systems can introduce latency and impact performance.
 Programming Complexity: Writing software for MIMD architectures can be challenging
due to the need for parallel algorithms, data distribution strategies, and synchronization
mechanisms.
In addition to these challenges, there are common issues that apply to parallel computing
in general, regardless of the Flynn category, such as:
 Parallel Programming Complexity: Developing parallel software can be more complex
than sequential programming, requiring careful consideration of parallel algorithms,
data dependencies, and synchronization.
 Scalability: Ensuring that a parallel system scales efficiently as more processors are
added can be challenging.
 Load Balancing: Efficiently distributing workloads among processors to ensure they
are all utilized optimally can be complex.
 Data Movement: Minimizing data movement between processors is crucial to avoid
communication bottlenecks and latency.
 Fault Tolerance: Ensuring the reliability of parallel systems, especially in large-scale
environments, can be challenging.
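The effect of load balancing can be made concrete with a small calculation. The sketch below compares a naive round-robin assignment with a simple "largest task to the least-loaded processor" heuristic. The task costs, the function names, and the utilization measure (useful work divided by the time all processors are held until the slowest one finishes) are assumptions made for illustration; the heuristic is not guaranteed to be optimal.

```python
def round_robin(task_costs, num_procs):
    """Assign task i to processor i mod P, ignoring how expensive it is."""
    loads = [0] * num_procs
    for i, cost in enumerate(task_costs):
        loads[i % num_procs] += cost
    return loads


def greedy_least_loaded(task_costs, num_procs):
    """Assign tasks, largest first, to the currently least-loaded processor."""
    loads = [0] * num_procs
    for cost in sorted(task_costs, reverse=True):
        loads[loads.index(min(loads))] += cost
    return loads


def utilization(loads):
    """Useful work divided by total processor time until the slowest finishes."""
    busiest = max(loads)
    if busiest == 0:
        return 1.0
    return sum(loads) / (len(loads) * busiest)


if __name__ == "__main__":
    costs = [8, 1, 7, 2, 6, 3]
    for assign in (round_robin, greedy_least_loaded):
        loads = assign(costs, 2)
        print(assign.__name__, loads, round(utilization(loads), 3))
```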
ARM Processor: The Thumb instruction set
 The Thumb instruction set is a reduced instruction set computer (RISC) instruction set
architecture developed by ARM Holdings.
 It is designed to improve code density while maintaining reasonable performance levels,
making it suitable for embedded systems with limited memory and lower processing power.
 The Thumb instruction set is used in various ARM processor families, especially in
microcontrollers and other resource-constrained devices.
 Instruction Encoding: Thumb instructions are 16 bits long, which is half the size of the
standard ARM instructions. This reduced size allows for more compact code, saving memory
space.
 Code Density: Thumb instructions are designed to be more code-efficient, which is essential
in embedded systems where memory is often limited. By using a subset of ARM instructions,
Thumb code can be denser, which means it takes up less memory.
 Performance: Thumb code is more compact, but a task usually needs more Thumb instructions than
ARM instructions, so Thumb code may run slower than equivalent ARM code when instructions are
fetched from fast 32-bit memory. On systems with narrow (16-bit) memory, Thumb code can run
faster because each instruction needs only one fetch.
 Compatibility: Classic ARM cores with Thumb support (for example the ARM7TDMI) can execute
both Thumb and ARM instructions and switch between the two states at run time. This provides
flexibility for developers to choose the instruction set that best suits each part of their
application.
 Thumb-2: In addition to the original Thumb instruction set, ARM introduced Thumb-2, which
combines the benefits of Thumb with improved performance. Thumb-2 instructions are 16
or 32 bits in length and provide better performance while maintaining code density
advantages.
 Compiler Support: To write code for the Thumb instruction set, developers typically use
compilers that generate Thumb instructions. Most ARM development toolchains include
support for both Thumb and ARM instruction sets, making it easy to switch between them as
needed.
 Application Areas: Thumb is commonly used in microcontrollers, embedded systems, and
other devices where code size and power efficiency are critical factors. It is not typically used
in high-performance computing environments.
 Trade-offs: While Thumb instructions are efficient in terms of code size and are suitable for
many embedded applications, they may not be ideal for tasks requiring high computational
performance.
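Since Thumb-2 mixes 16-bit and 32-bit instructions, a decoder has to tell from the first halfword how long each instruction is. In Thumb-2, a first halfword whose top five bits are 11101, 11110, or 11111 starts a 32-bit instruction; anything else is a complete 16-bit instruction. The sketch below applies that rule; the function names and the sample halfword stream are assumptions made for the example.

```python
def thumb_instruction_length(first_halfword):
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("a halfword must fit in 16 bits")
    top5 = first_halfword >> 11
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def count_instructions(halfwords):
    """Walk a halfword stream; return (instruction count, code size in bytes)."""
    count = 0
    index = 0
    while index < len(halfwords):
        index += thumb_instruction_length(halfwords[index]) // 2
        count += 1
    return count, 2 * len(halfwords)


if __name__ == "__main__":
    # Sample stream: a 16-bit NOP, a 32-bit BL (two halfwords), another NOP.
    stream = [0xBF00, 0xF000, 0xF800, 0xBF00]
    print(count_instructions(stream))
```

The same three instructions encoded as fixed 32-bit ARM instructions would occupy 12 bytes instead of 8, which is the code-density advantage described above.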
ARM Processor: Processor and CPU cores
 ARM (Advanced RISC Machine, originally Acorn RISC Machine) processors are a family of microprocessor
architectures developed by ARM Holdings and widely used in a variety of computing devices,
from mobile phones to embedded systems and supercomputers.
 ARM processors are known for their power efficiency, scalability, and versatility.
 These processors can be found in single-core and multi-core configurations, each with its own set of
characteristics and applications.
ARM Started With Micro computing
 Understanding the applications of ARM processors starts with the history of the ARM processor.
 Before ARM, x86 processors, which were launched in 1978, were widely used.
 When complex and hard-to-implement instructions are removed from an instruction set, the
remaining instructions take less power and space and run faster; this approach is called
Reduced Instruction Set Computer (RISC) Architecture.
 x86 is a Complex Instruction Set Architecture (CISC).
 Advanced RISC Machine (ARM) Architecture is one of the
most common processor architectures in the market,
alongside x86, which is very common in the
server market.
 ARM Architecture is widely used in
smartphones, feature phones, and also in
laptops.
 Though x86 processors are optimized for
performance, ARM processors are cost-
effective and small, take less
power to run, and give better battery
life.
 ARM processors are not limited to mobile
phones; they are also used in Fugaku, which
topped the TOP500 supercomputer list in 2020 and 2021.
 ARM processors also give hardware designers
more flexibility in their designs.
ARM Processor
Features of ARM Processor
Multiprocessing System
Tightly Coupled Memory
Memory Management
Thumb-2 Technology
One-Cycle Execution Time
Pipelining
A large number of Registers
Advantages of ARM Processor
 Most ARM instructions execute in a single cycle, which makes the
processor fast, and it also consumes less power.
 ARM processors work in multiprocessing systems, where
more than one processor is used to process information.
 ARM processors are cheaper than other processors, which makes them
suitable for mobile phones.
 ARM processors are scalable, which allows them to be used in a wide variety
of devices.
Disadvantages of ARM Processor
 ARM processors are not binary-compatible with x86 processors, so
software built for x86 Windows systems cannot run on them natively.
 Low-power ARM cores are not designed for very high performance, which limits
their use in some applications.
 Programming ARM processors efficiently is somewhat hard and requires skilled
programmers.
 ARM processors are inefficient in handling instruction scheduling.
ARM Processor: Instruction Encoding format
 ARM processors use a fixed-length instruction encoding format for their instructions. The basic
instruction encoding format for ARM instructions is 32 bits (4 bytes) in length. These 32 bits are
divided into several fields, each serving a specific purpose. Here's an overview of the basic
instruction encoding format for ARM instructions:
 Operation Code (Opcode): The opcode field specifies the operation to be performed, such as an
arithmetic or logical operation. In data-processing instructions it is a 4-bit field in bits 24 to 21;
bits 27 and 26 are 00 for this class, and other bit patterns select loads/stores, branches, etc.
 Condition Field: Almost every ARM instruction includes a condition field that specifies under which
conditions the instruction should be executed. The condition field is a 4-bit field in the top
four bits of the instruction (bits 31 to 28). Common conditions include "equal," "not equal," "greater than," and so on.
 Operands and Registers: Depending on the instruction, there are fields that specify the
source and destination operands, as well as the registers involved in the operation. These
fields can vary in size and position within the instruction, depending on the specific
instruction format.
 Immediate Values: Some instructions allow for immediate values (constants) to be
encoded within the instruction itself. The immediate value field specifies these constants,
which can be used in arithmetic operations or as offsets for memory access.
 Shift and Rotate Operations: ARM instructions often include fields that specify shift or
rotate operations on data. These fields define how data should be shifted or rotated
before the operation is performed.
 Data Size and Type: The instruction encoding includes fields that specify the data size and
type (e.g., byte, half-word, word) being operated on.
 Condition Flags: ARM instructions can set or modify condition flags in the processor's
status register (e.g., the N flag for negative results or the Z flag for zero results). These
flags help control program flow through conditional branching.
 Addressing Modes: Instructions may include addressing mode fields that determine how
memory addresses are calculated, especially in load and store operations.
 Immediate Offsets: In load and store instructions, an immediate offset may be included to
specify the address offset from a base register.
 Bit-Handling Instructions: Some ARM instructions are used for bit manipulation
operations, such as setting, clearing, testing, or toggling individual bits within registers or
memory. These instructions have specific fields for specifying the target bit(s).
 It's important to note that while this is the basic format for ARM instructions, there are
different instruction sets and variations within the ARM architecture, such as Thumb,
Thumb-2, ARMv7-A, ARMv8-A, and others. Each of these may have its own specific
encoding formats and instruction set extensions to cater to different application domains,
including 32-bit and 64-bit instruction sets. The specifics of the instruction encoding can
vary between these architectures and instruction sets.
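The field layout of a classic 32-bit ARM data-processing instruction can be illustrated by decoding one. The sketch below assumes the standard layout (condition in bits 31 to 28, immediate flag in bit 25, opcode in bits 24 to 21, S bit in bit 20, Rn in bits 19 to 16, Rd in bits 15 to 12, operand 2 in bits 11 to 0). The function and dictionary key names are assumptions made for the example, and the check on bits 27 and 26 is deliberately loose: it does not separate out multiplies and other special encodings that share that space.

```python
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]


def decode_data_processing(word):
    """Split a 32-bit ARM data-processing instruction into its fields."""
    if (word >> 26) & 0b11 != 0b00:
        raise ValueError("bits 27:26 are not 00: not a data-processing encoding")
    return {
        "cond": CONDITIONS[(word >> 28) & 0xF],   # when to execute
        "immediate": bool((word >> 25) & 1),      # operand 2 is a constant
        "opcode": OPCODES[(word >> 21) & 0xF],    # which operation
        "set_flags": bool((word >> 20) & 1),      # S bit: update N, Z, C, V
        "rn": (word >> 16) & 0xF,                 # first source register
        "rd": (word >> 12) & 0xF,                 # destination register
        "operand2": word & 0xFFF,                 # register/shift or immediate
    }


def immediate_value(operand2):
    """Expand an immediate operand 2: an 8-bit value rotated right by 2 * rot."""
    shift = 2 * ((operand2 >> 8) & 0xF)
    imm8 = operand2 & 0xFF
    return ((imm8 >> shift) | (imm8 << (32 - shift))) & 0xFFFFFFFF


if __name__ == "__main__":
    print(decode_data_processing(0xE0810002))  # ADD R0, R1, R2
    print(decode_data_processing(0xE3A00001))  # MOV R0, #1
```

The rotated 8-bit immediate is why only certain constants fit directly into an instruction: the 12 bits of operand 2 encode a rotation and a byte rather than an arbitrary 12-bit number.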
ARM Processor: Memory load and Store instruction
 ARM processors use load and store instructions to manipulate data stored in
memory.
 These instructions are fundamental to the operation of ARM-based systems, and they
are used to move data between registers and memory.
 In ARM assembly language, load and store instructions are typically represented as
"LDR" (Load Register) and "STR" (Store Register) instructions, respectively.
 LDR (Load Register):
 LDR Rd, [Rn, #offset]: Loads a word from memory at the address specified by Rn plus the #offset into
register Rd.
 Example: LDR R0, [R1, #4] loads a word from the memory location at R1 + 4 into R0.
 STR (Store Register):
 STR Rd, [Rn, #offset]: Stores the value in register Rd into memory at the address specified by
Rn plus the #offset.
 Example: STR R0, [R1, #8] stores the value in R0 into the memory location at R1 + 8.
 LDM (Load Multiple):
 LDM Rn, {Rd1, Rd2, ...}: Loads multiple registers with values from memory starting at the
address in Rn. Useful for loading several values at once.
 Example: LDM R1, {R2, R3, R4} loads values from memory into R2, R3, and R4 starting at the
address in R1.
 STM (Store Multiple):
 STM Rn, {Rd1, Rd2, ...}: Stores multiple registers' values into memory starting at the address
in Rn. Useful for saving several values at once.
 Example: STM R1, {R2, R3, R4} stores the values in R2, R3, and R4 into memory starting at the
address in R1.
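The behaviour of these four instructions can be modelled with a toy register file and memory. This is a simplified sketch, not an emulator: the class and method names are assumptions, memory is a dictionary of aligned 32-bit words keyed by byte address, and LDM/STM are modelled in their default increment-after form without base write-back, where the lowest-numbered register uses the lowest address.

```python
class TinyArm:
    """Toy model: 16 registers and a sparse memory of aligned 32-bit words."""

    def __init__(self):
        self.regs = [0] * 16
        self.mem = {}

    def ldr(self, rd, rn, offset=0):
        """LDR Rd, [Rn, #offset]: load the word at Rn + offset into Rd."""
        self.regs[rd] = self.mem.get(self.regs[rn] + offset, 0)

    def str_(self, rd, rn, offset=0):
        """STR Rd, [Rn, #offset]: store Rd at address Rn + offset."""
        self.mem[self.regs[rn] + offset] = self.regs[rd] & 0xFFFFFFFF

    def ldm(self, rn, reg_list):
        """LDM Rn, {...}: load consecutive words, lowest register first."""
        address = self.regs[rn]
        for reg in sorted(reg_list):
            self.regs[reg] = self.mem.get(address, 0)
            address += 4  # each register occupies one 4-byte word

    def stm(self, rn, reg_list):
        """STM Rn, {...}: store consecutive words, lowest register first."""
        address = self.regs[rn]
        for reg in sorted(reg_list):
            self.mem[address] = self.regs[reg] & 0xFFFFFFFF
            address += 4


if __name__ == "__main__":
    cpu = TinyArm()
    cpu.regs[1] = 0x1000                      # base address in R1
    cpu.regs[2], cpu.regs[3], cpu.regs[4] = 10, 20, 30
    cpu.stm(1, [2, 3, 4])                     # STM R1, {R2, R3, R4}
    cpu.ldr(0, 1, 4)                          # LDR R0, [R1, #4]
    print(cpu.regs[0], cpu.mem)
```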
ARM Processor: Basics of I/O operations
 In ARM-based systems, I/O (Input/Output) operations are the interactions between
the CPU (Central Processing Unit) and external devices such as sensors, displays,
keyboards, and storage devices.
 The I/O operations are essential for interfacing ARM processors with a diverse
range of external devices and for enabling various functionalities in embedded
systems, IoT (Internet of Things) devices, and other applications.
 The specific type of I/O operation used depends on the nature of the external
devices and the requirements of the application.
 Here are the key aspects of I/O operations with ARM processors:
 Memory-Mapped I/O (MMIO):
 Purpose: Memory-mapped I/O is a common technique where I/O devices are mapped into the memory address
space of the ARM processor.
 Usage: It allows the ARM processor to read from and write to I/O devices using standard load (LDR) and store (STR)
instructions, treating I/O registers as if they were memory locations.
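 Port-Mapped I/O:
 Purpose: Port-mapped I/O is an alternative to memory-mapped I/O where I/O devices are accessed through a
separate I/O port address space.
 Usage: Specialized instructions (such as IN and OUT on x86) are used to read from and write to I/O ports, which are
distinct from regular memory locations. The ARM architecture has no such I/O instructions or separate I/O address
space, so ARM peripherals are accessed through memory-mapped I/O.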
 Serial Communication (UART, SPI, I2C):
 Purpose: Serial communication protocols like UART, SPI (Serial Peripheral Interface), and I2C (Inter-Integrated
Circuit) are used for communication between the ARM processor and external devices.
 Usage: These protocols enable the ARM processor to exchange data serially with devices such as sensors, displays,
and other microcontrollers.
 File I/O (Sequential and Random Access):
 Purpose: File I/O operations involve reading from and writing to files stored in external storage devices like SD
cards or hard drives.
 Usage: The ARM processor uses file I/O operations to access and manipulate data stored in files sequentially or
randomly.
 GPIO (General-Purpose Input/Output):
 Purpose: GPIO pins are used for basic digital I/O operations, allowing the ARM processor to read input signals
from external devices (e.g., sensors) and control output signals to external devices (e.g., LEDs).
 Usage: GPIO pins can be configured as inputs or outputs, and their states can be controlled programmatically.
 Interrupt Handling:
 Purpose: Interrupts are used to signal the ARM processor when external events, such as data arrival or device
status changes, require attention.
 Usage: ARM processors handle interrupts by executing interrupt service routines (ISRs), allowing for timely
responses to external events without constant polling.
 DMA (Direct Memory Access):
 Purpose: DMA is used for high-speed data transfer between external devices and memory without CPU intervention.
 Usage: DMA controllers in ARM-based systems enable peripherals to read from or write to memory directly, reducing CPU
load and increasing data transfer speed.
 Analog-to-Digital Conversion (ADC) and Digital-to-Analog Conversion (DAC):
 Purpose: ADC and DAC operations involve converting analog signals to digital and vice versa, enabling the ARM processor
to interface with sensors and control analog devices.
 Usage: ARM processors can read analog data from sensors through ADCs and generate analog output signals through
DACs.
 Ethernet and Network Communication:
 Purpose: ARM processors can communicate with other devices over Ethernet networks for data transfer and network-
based I/O operations.
 Usage: Network protocols like TCP/IP and UDP are commonly used for tasks such as data exchange, remote control, and
device management.
 USB (Universal Serial Bus) Communication:
 Purpose: USB interfaces allow ARM-based systems to connect to a wide range of USB devices, such as keyboards, mice,
storage devices, and more.
 Usage: ARM processors can use USB controllers to manage data transfer and communication with USB peripherals.
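Memory-mapped I/O and GPIO can be illustrated together with a small software model in which a load or store to a particular address range reaches a device register instead of RAM. Everything here is hypothetical: the GpioPort register layout (a direction register at offset 0 and a data register at offset 4), the base address 0x40000000, and the class names are assumptions made for the sketch and do not describe any specific ARM chip.

```python
class GpioPort:
    """Hypothetical GPIO peripheral with two 32-bit registers."""
    DIR = 0x0    # direction register: bit = 1 means the pin is an output
    DATA = 0x4   # data register: write drives outputs, read samples pins

    def __init__(self):
        self.direction = 0
        self.output_latch = 0
        self.input_pins = 0  # levels driven by external devices

    def read(self, offset):
        if offset == self.DIR:
            return self.direction
        if offset == self.DATA:
            outputs = self.output_latch & self.direction
            inputs = self.input_pins & ~self.direction & 0xFFFFFFFF
            return outputs | inputs
        raise ValueError("no register at this offset")

    def write(self, offset, value):
        value &= 0xFFFFFFFF
        if offset == self.DIR:
            self.direction = value
        elif offset == self.DATA:
            self.output_latch = value
        else:
            raise ValueError("no register at this offset")


class Bus:
    """Routes loads and stores either to RAM or to a mapped device."""

    def __init__(self):
        self.ram = {}
        self.devices = []  # list of (base address, size, device)

    def map_device(self, base, size, device):
        self.devices.append((base, size, device))

    def _find(self, address):
        for base, size, device in self.devices:
            if base <= address < base + size:
                return device, address - base
        return None, None

    def load(self, address):           # what an LDR would do
        device, offset = self._find(address)
        if device is not None:
            return device.read(offset)
        return self.ram.get(address, 0)

    def store(self, address, value):   # what an STR would do
        device, offset = self._find(address)
        if device is not None:
            device.write(offset, value)
        else:
            self.ram[address] = value & 0xFFFFFFFF


if __name__ == "__main__":
    GPIO_BASE = 0x40000000             # hypothetical peripheral base address
    bus, gpio = Bus(), GpioPort()
    bus.map_device(GPIO_BASE, 8, gpio)
    bus.store(GPIO_BASE + GpioPort.DIR, 0b0001)   # pin 0 becomes an output
    bus.store(GPIO_BASE + GpioPort.DATA, 0b0001)  # drive pin 0 high (LED on)
    gpio.input_pins = 0b0010                      # a sensor pulls pin 1 high
    print(bin(bus.load(GPIO_BASE + GpioPort.DATA)))
```

The program uses the same load and store operations for RAM and for the device; only the address decides where the access goes.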
Case study: ARM 5 Architecture
ARMv5 Architecture Overview:
 32-bit Architecture: ARMv5 is a 32-bit architecture, meaning it primarily deals with 32-bit data and instructions.
It was widely used in various embedded systems, mobile devices, and microcontrollers.
 Thumb Instruction Set: ARMv5T variants support the Thumb instruction set, a 16-bit compressed form of a subset
of the ARM instruction set that was first introduced with ARMv4T. Thumb is designed for code density, making it
suitable for applications with limited memory space.
 Classic RISC Design: ARMv5 processors typically use a classic RISC (Reduced Instruction Set Computer) pipeline
architecture, which consists of multiple stages for instruction execution. This design emphasizes simplicity and
efficiency.
 Memory Management: Many ARMv5 processors (for example the ARM926EJ-S) support virtual memory management
and include a Memory Management Unit (MMU) for address translation and memory protection. This enables
features like multitasking and memory isolation.
 Clock Speeds: The clock speed of ARMv5 processors can vary depending on the specific chip design and
application requirements. ARMv5 processors were used in a wide range of devices with different clock
frequencies.
Use Cases and Applications of ARM 5 Architecture
ARMv5 processors were widely used in various applications, including:
Embedded Systems: ARMv5 was popular in embedded applications due to its
power efficiency and versatility.
Mobile Devices: Some early smartphones and mobile devices utilized ARMv5
processors.
Consumer Electronics: ARMv5 was found in devices like digital cameras, set-top
boxes, and portable media players.
Network Equipment: ARMv5-based processors were used in networking
equipment like routers and switches.
Industrial Automation: ARMv5 was used in controllers and automation systems.
Limitations of ARM 5 Architecture
Compared to later ARM architectures, ARMv5 lacked some of the advanced
features and performance enhancements found in newer versions.
It had limitations in terms of floating-point performance and multimedia
capabilities.
Case study: ARM 7 Architecture
 ARM 7
 Introduced in 1994, the ARM7™ processor family has been immensely successful, and has helped establish
ARM as the architecture of choice in the digital world.
 Over the years, more than 10 billion ARM7 processor family-based devices have powered a wide variety of
cost and power-sensitive applications.
 ARM 7 Series: ARM7500, ARM7DI, ARM710T.
 While the ARM7 processor family continues to be used today for simple 32-bit devices, newer embedded
designs are increasingly making use of latest ARM processors such as the Cortex™-M0 and Cortex-M3
processors, both of which offer significant technical enhancements over the ARM7 family.
 ARM7500
 It is a highly integrated single chip computer based around the ARM RISC microprocessor macrocell.
 ARM7500 contains all the functionality required to create a complete computing system with the minimum
of external components.
 The wide range of features incorporated into ARM7500 make it an extremely flexible device, which can be
programmed according to the required application to optimize for high performance or low power, or a
combination of both.
Features of ARM7500
Highly integrated RISC computer
30 Dhrystone 2.1 MIPS ARM7 core @ 33MHz
4 Kbyte combined instruction and data cache
Flexible Memory Management Unit
Supports 16 or 32 bit wide memory via internal ROM and DRAM controllers
3 channel DMA
I/O controller
2 serial ports, 4 A/D channels
8 stereo sound channels
32-bit CD quality serial sound channel
Video controller with up to 120MHz pixel clock
16 million colours from 256-entry palette
16-level grey scales for LCD displays
Suspend and stop power saving modes
Applications of ARM7500
ARM7500 is ideally suited to those applications requiring a compact, low-cost,
power efficient, high performance, RISC computing system on a single chip. These
include:
Multimedia
Interactive visual display terminals
Portable computing
Handheld test instrumentation
Games consoles
Desktop computing
ARM7DI
 The ARM7DI is a low-power, general purpose 32-bit RISC microprocessor with integrated debug
support.
 It comprises the ARM7D CPU core, an ICEbreaker module and a TAP controller.
 Its simple, elegant and fully static design is particularly suitable for cost and power
sensitive applications.
Enhancements
 The ARM7DI is similar to the ARM6 but with the following enhancements:
 advanced debug (integrated ICE) support for faster time to market
 fabrication on a sub-micron process for increased speed and reduced power
consumption
 3V operation, for very low power consumption, as well as 5V operation for
system compatibility
 higher clock speed for faster program execution.
Applications
 The ARM7DI is ideally suited to those applications requiring RISC performance from
a compact, power-efficient processor.
These include:
 Telecomms: GSM terminal controller
 Datacomms: Protocol conversion
 Portable Computing: Palmtop computer
 Portable Instruments: Handheld data acquisition unit
 Automotive: Engine management unit
 Information Systems: Smart cards
 Imaging: JPEG controller
ARM710T
 The ARM710T is a general-purpose 32-bit microprocessor with 8KB cache, enlarged
write buffer and Memory Management Unit (MMU) combined in a single chip.
 The CPU within the ARM710T is the ARM7TDMI. The ARM710T is software compatible
with the ARM processor family.
 The ARM710T architecture is based on Reduced Instruction Set Computer (RISC)
principles, and the instruction set and related decode mechanism are greatly simplified
compared with microprogrammed Complex Instruction Set Computers (CISC).
 The on-chip mixed data and instruction cache, together with the write buffer,
substantially raise the average execution speed and reduce the average amount of
memory bandwidth required by the processor.
 This allows the external memory to support additional processors or Direct Memory
Access (DMA) channels with minimal performance loss.
ARM710T
 The MMU supports a conventional two-level page-table structure and a number of
extensions which make it ideal for embedded control, UNIX and Object Oriented systems.
 The memory interface has been designed to allow the performance potential to be
realized without incurring high costs in the memory system.
 Speed-critical control signals are pipelined to allow system control functions to be
implemented in standard low-power logic, and these control signals permit the
exploitation of paged mode access offered by industry standard DRAMs.
 ARM710T is a fully static part and has been designed to minimize power requirements.
 This makes it ideal for portable applications where both these features are essential.
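The two-level page-table lookup mentioned for the ARM710T MMU can be sketched as follows. This is only an illustration, not the ARM710T's actual table format: it assumes 4 KB pages with a 12-bit first-level index, an 8-bit second-level index and a 12-bit page offset, and the dictionaries `level1` and the function names are hypothetical stand-ins for tables held in memory.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (level-1 index, level-2 index, offset).

    Assumed layout: bits 31..20, bits 19..12, bits 11..0.
    """
    return (va >> 20) & 0xFFF, (va >> 12) & 0xFF, va & 0xFFF


def translate(va, level1):
    """Walk a two-level table held in nested dicts (a stand-in for real tables)."""
    l1_index, l2_index, offset = split_virtual_address(va)
    level2 = level1.get(l1_index)
    if level2 is None:
        raise LookupError("first-level translation fault")
    frame = level2.get(l2_index)
    if frame is None:
        raise LookupError("second-level translation fault")
    # The physical address is the frame number followed by the page offset.
    return (frame << 12) | offset


# Hypothetical mapping: one virtual page mapped to physical frame 0x80001.
level1 = {0x123: {0x45: 0x80001}}
physical = translate(0x12345678, level1)
print(hex(physical))
```

Unmapped addresses raise an error here, which plays the role of the fault that a real MMU would signal to the operating system.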

Computer organisation and architecture unit 5, SRM

  • 1.
    Need, types, applications and challenges, Architecture of Parallel Systems-Flynn’s classification; ARM Processor: The Thumb instruction set, Processor and CPU cores, Instruction Encoding format, Memory load and Store instruction, Basics of I/O operations. Case study: ARM 5 and ARM 7 Architecture UNIT - V Syllabus
  • 2.
    Need of Architecture of Parallel Systems-Flynn’s classification
    Parallel computing is computing where the jobs are broken into discrete parts that can be executed concurrently. Each part is further broken down into a series of instructions. Instructions from each part execute simultaneously on different CPUs. Dividing the parts of a task among multiple processors helps to reduce the time needed to run a program. Parallel systems deal with the simultaneous use of multiple computer resources that can include a single computer with multiple processors, a number of computers connected by a network to form a parallel processing cluster, or a combination of both.
  • 3.
    Parallel systems are more difficult to program than computers with a single processor because the architecture of parallel computers varies from system to system and the processes of multiple CPUs must be coordinated and synchronized. A particularly difficult problem in parallel processing is portability.
    An instruction stream is a sequence of instructions read from memory. A data stream is the sequence of data on which those instructions operate in the processor.
    Flynn’s taxonomy is a classification scheme for computer architectures proposed by Michael Flynn in 1966. The taxonomy is based on the number of instruction streams and data streams that can be processed simultaneously by a computer architecture.
    The need for different architectures of parallel systems, as defined by Flynn's classification, arises from the variety of applications, performance requirements, and efficiency considerations in computing. Different architectures are tailored to specific use cases to optimize system performance and resource utilization.
  • 4.
     Performance Improvement: Parallelism allows multiple tasks to execute simultaneously, leading to improved performance and reduced execution times for computationally intensive applications.
     Scalability: Parallel systems can be scaled by adding more processing units, making them suitable for both small and large-scale computational tasks.
     Diverse Applications: Different applications have varying degrees of parallelism. SIMD architectures are well suited to tasks like image and video processing, while MIMD architectures are better for general-purpose computing.
     Efficiency: Parallel architectures can optimize resource utilization by distributing workloads efficiently across multiple processors, reducing idle time and enhancing overall system efficiency.
     Fault Tolerance: In some cases, MISD architectures are used for fault tolerance, where multiple instructions are executed on the same data to provide redundancy and reliability.
     Real-Time Processing: Some real-time control systems require multiple instructions to be executed on a single data stream to meet strict timing constraints.
  • 5.
    Types of Architecture of Parallel Systems-Flynn’s classification
     Single Instruction Single Data (SISD): In a SISD architecture, a single processor executes a single instruction stream and operates on a single data stream. This is the simplest type of computer architecture and is the model of the traditional uniprocessor computer.
     Single Instruction Multiple Data (SIMD): In a SIMD architecture, a single instruction stream is applied to multiple data streams in parallel, so every processing element performs the same operation on its own data. This type of architecture is used in applications such as image and signal processing.
     Multiple Instruction Single Data (MISD): In a MISD architecture, multiple processors execute different instructions on the same data stream. This type of architecture is not commonly used in practice, because few applications need several different instruction streams applied to one data stream.
     Multiple Instruction Multiple Data (MIMD): In a MIMD architecture, multiple processors execute different instructions on different data streams. This type of architecture is used in distributed computing, parallel processing, and other high-performance computing applications.
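    The four categories above follow mechanically from the two stream counts. The helper below is a minimal sketch of that rule; the function name `flynn_class` is made up for illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category name for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instruction_part = "S" if instruction_streams == 1 else "M"
    data_part = "S" if data_streams == 1 else "M"
    return f"{instruction_part}I{data_part}D"


# One uniprocessor, one vector unit, one redundant pipeline, one cluster.
for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```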
  • 6.
    Single-instruction, single-data (SISD) systems
     An SISD computing system is a uniprocessor machine that is capable of executing a single instruction, operating on a single data stream.
     In SISD, machine instructions are processed in a sequential manner, and computers adopting this model are popularly called sequential computers.
     Most conventional computers have SISD architecture. All the instructions and data to be processed have to be stored in primary memory.
     The speed of the processing element in the SISD model is limited by the rate at which the computer can transfer information internally.
     Dominant representative SISD systems are the IBM PC and workstations.
  • 7.
    Single-instruction, multiple-data (SIMD) systems
     An SIMD system is a multiprocessor machine capable of executing the same instruction on all the CPUs but operating on different data streams.
     Machines based on the SIMD model are well suited to scientific computing, since it involves many vector and matrix operations.
     So that the work can be spread across all the processing elements (PEs), the data elements of a vector are divided into multiple sets (N sets for an N-PE system), and each PE processes one set.
     Dominant representative SIMD systems are Cray’s vector processing machines.
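    The division of a vector into N sets for N PEs can be sketched as below. This is a sequential simulation written for illustration only (real SIMD hardware runs the PEs in lockstep); the names `split_into_sets` and `simd_apply` are hypothetical.

```python
def split_into_sets(data, n_pes):
    """Divide data into n_pes contiguous sets of nearly equal size, one per PE."""
    base, extra = divmod(len(data), n_pes)
    sets, start = [], 0
    for pe in range(n_pes):
        size = base + (1 if pe < extra else 0)
        sets.append(data[start:start + size])
        start += size
    return sets


def simd_apply(instruction, data, n_pes):
    """Apply the same instruction to every PE's data set, then rejoin the results."""
    partial = [[instruction(x) for x in s] for s in split_into_sets(data, n_pes)]
    return [y for s in partial for y in s]


vector = list(range(10))
result = simd_apply(lambda x: 2 * x + 1, vector, n_pes=4)
print(result)
```

    Every PE runs the same `instruction`; only the data set it receives differs, which is the defining property of SIMD.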
  • 8.
    Multiple-instruction, single-data (MISD) systems
     An MISD computing system is a multiprocessor machine capable of executing different instructions on different PEs, with all of them operating on the same data set.
     Example: Z = sin(x) + cos(x) + tan(x). Here the system performs three different operations on the same data item x.
     Machines built using the MISD model are not useful in most applications; a few such machines have been built, but none of them are available commercially.
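    The example Z = sin(x) + cos(x) + tan(x) can be written so that the MISD structure is visible: three different "instructions" receive the same datum. The sketch below runs them one after another, so it only models the data flow, not real parallel hardware; the function name `misd` is made up.

```python
import math


def misd(instructions, x):
    """Each simulated PE runs a different instruction on the same data item."""
    return [instruction(x) for instruction in instructions]


x = 0.5
partials = misd([math.sin, math.cos, math.tan], x)
z = sum(partials)  # combine the three per-PE results into Z
print(partials, z)
```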
  • 9.
    Multiple-instruction, multiple-data (MIMD) systems
     An MIMD system is a multiprocessor machine that is capable of executing multiple instructions on multiple data sets.
     Each PE in the MIMD model has separate instruction and data streams; therefore machines built using this model are suited to any kind of application.
     Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.
     MIMD machines are broadly categorized into shared-memory MIMD and distributed-memory MIMD based on the way PEs are coupled to the main memory.
     In the shared-memory MIMD model (tightly coupled multiprocessor systems), all the PEs are connected to a single global memory and they all have access to it.
     The communication between PEs in this model takes place through the shared memory; a modification of the data stored in the global memory by one PE is visible to all other PEs.
     The dominant representative shared-memory MIMD systems are Silicon Graphics machines and Sun/IBM’s SMP (Symmetric Multi-Processing).
  • 10.
     In distributed-memory MIMD machines (loosely coupled multiprocessor systems), all PEs have a local memory.
     The communication between PEs in this model takes place through the interconnection network (the inter-process communication channel, or IPC).
     The network connecting the PEs can be configured as a tree, a mesh, or another topology as required.
     The shared-memory MIMD architecture is easier to program, but it is less tolerant of failures and harder to extend than the distributed-memory MIMD model.
     Failures in a shared-memory MIMD system affect the entire system, whereas in the distributed model each PE can be easily isolated.
     Moreover, shared-memory MIMD architectures are less likely to scale, because adding more PEs leads to memory contention.
     This contention does not arise with distributed memory, in which each PE has its own memory.
     For these practical reasons, the distributed-memory MIMD architecture is often preferred when a system must scale and tolerate failures.
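    The difference between the two MIMD models can be sketched with threads standing in for PEs. This is an illustrative simulation only: in the first function all workers update one shared variable (guarded by a lock, which is where contention arises), while in the second each worker keeps private data and communicates only by sending a message over a queue that stands in for the interconnection network. All names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """Shared-memory style: every PE adds its partial sum into one global cell."""
    total = {"value": 0}          # the single global memory
    lock = threading.Lock()       # PEs must synchronize their updates

    def worker(chunk):
        partial = sum(chunk)
        with lock:
            total["value"] += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def distributed_memory_sum(chunks):
    """Distributed-memory style: PEs keep local data and send results as messages."""
    channel = queue.Queue()       # stands in for the interconnection network

    def worker(chunk):
        local = list(chunk)       # private copy = the PE's local memory
        channel.put(sum(local))

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(channel.get() for _ in chunks)


chunks = [range(1, 26), range(26, 51), range(51, 76), range(76, 101)]
print(shared_memory_sum(chunks), distributed_memory_sum(chunks))
```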
  • 11.
    Applications of Architecture of Parallel Systems-Flynn’s classification
     SISD (Single Instruction, Single Data):
       Traditional single-core processors used in personal computers and laptops.
       Sequential programming and tasks that cannot be parallelized.
     SIMD (Single Instruction, Multiple Data):
       Graphics Processing: SIMD architectures are widely used in GPUs for rendering images and videos, as they can perform the same calculations on multiple pixels or vertices simultaneously.
       Scientific Computing: SIMD is beneficial for scientific simulations and calculations where the same operation is applied to a large dataset.
       Encryption: SIMD instructions can be used to speed up encryption algorithms, such as AES.
  • 12.
     MISD (Multiple Instruction, Single Data):
       Fault-Tolerant Systems: MISD architectures can be used for fault tolerance by having multiple processing units execute different instructions on the same data. Errors in one unit can be cross-checked by the others.
       Real-Time Systems: In certain real-time control systems, multiple instructions may need to be applied to the same data stream to ensure precise timing and control.
     MIMD (Multiple Instruction, Multiple Data):
       Cluster and Distributed Computing: MIMD systems are widely used in clusters of computers, data centers, and distributed computing environments where multiple machines independently process data and tasks.
       Scientific Simulations: Complex simulations in physics, chemistry, weather forecasting, and computational biology often rely on MIMD architectures to divide computational workloads.
       Parallel Databases: MIMD architectures can speed up database operations, enabling simultaneous queries and updates.
       Supercomputing: Many of the world's fastest supercomputers are MIMD-based systems, used for simulations, climate modeling, and scientific research.
  • 13.
    Challenges of Architecture of Parallel Systems-Flynn’s classification
    The key challenges associated with each category:
     SISD (Single Instruction, Single Data):
       Lack of Parallelism: SISD systems do not exploit parallelism, which can limit their performance for computationally intensive tasks.
       Inefficient for Parallel Workloads: Applications that inherently involve parallelism may not benefit from SISD architectures.
     SIMD (Single Instruction, Multiple Data):
       Data Dependencies: SIMD architectures require careful management of data dependencies. If data elements depend on each other, parallelism can be limited.
       Programming Complexity: Writing software that effectively utilizes SIMD instructions can be complex and requires expertise in vectorization.
       Limited Applicability: SIMD architectures are best suited to tasks with a high degree of data-level parallelism, which may not cover all computational needs.
  • 14.
     MISD (Multiple Instruction, Single Data):
       Complex Coordination: Coordinating multiple instruction streams operating on the same data can be challenging and may introduce overhead.
       Limited Practical Use: MISD architectures are rare in practice and are typically used in specialized systems. They have limited applicability in mainstream computing.
     MIMD (Multiple Instruction, Multiple Data):
       Synchronization Overhead: Managing synchronization between multiple processors in MIMD systems can be complex and introduce overhead, potentially reducing overall performance.
       Load Balancing: Ensuring that workloads are evenly distributed among processors is crucial for efficient MIMD operation. Load imbalances can lead to underutilization of resources.
       Communication Overhead: Data exchange and communication between processors in MIMD systems can introduce latency and impact performance.
       Programming Complexity: Writing software for MIMD architectures can be challenging due to the need for parallel algorithms, data distribution strategies, and synchronization mechanisms.
  • 15.
    In addition to these challenges, there are common issues that apply to parallel computing in general, regardless of the Flynn category, such as:
     Parallel Programming Complexity: Developing parallel software can be more complex than sequential programming, requiring careful consideration of parallel algorithms, data dependencies, and synchronization.
     Scalability: Ensuring that a parallel system scales efficiently as more processors are added can be challenging.
     Load Balancing: Efficiently distributing workloads among processors so that they are all utilized optimally can be complex.
     Data Movement: Minimizing data movement between processors is crucial to avoid communication bottlenecks and latency.
     Fault Tolerance: Ensuring the reliability of parallel systems, especially in large-scale environments, can be challenging.
  • 16.
    ARM Processor: The Thumb instruction set
     The Thumb instruction set is a compact 16-bit encoding of a subset of the ARM instruction set, developed by ARM Holdings for its RISC processors.
     It is designed to improve code density while maintaining reasonable performance levels, making it suitable for embedded systems with limited memory and lower processing power.
     The Thumb instruction set is used in various ARM processor families, especially in microcontrollers and other resource-constrained devices.
     Instruction Encoding: Thumb instructions are 16 bits long, which is half the size of the standard 32-bit ARM instructions. This reduced size allows for more compact code, saving memory space.
     Code Density: Thumb instructions are designed to be more code-efficient, which is essential in embedded systems where memory is often limited. By encoding a subset of ARM instructions in 16 bits, Thumb code can be denser, which means it takes up less memory.
     Performance: A Thumb instruction generally does less work than an ARM instruction, so the same task often needs more Thumb instructions and may run more slowly than ARM code. On systems with narrow (16-bit) memory, however, Thumb code can run faster because each memory access fetches a complete instruction.
  • 17.
     Compatibility: On classic ARM cores that implement Thumb, such as the ARM7TDMI, the processor can execute both Thumb and ARM instructions and switch between the two states. This gives developers the flexibility to choose the instruction set that best suits each part of their application.
     Thumb-2: In addition to the original Thumb instruction set, ARM introduced Thumb-2, which combines the benefits of Thumb with improved performance. Thumb-2 instructions are 16 or 32 bits in length and provide better performance while maintaining code density advantages.
     Compiler Support: To write code for the Thumb instruction set, developers typically use compilers that generate Thumb instructions. Most ARM development toolchains include support for both Thumb and ARM instruction sets, making it easy to switch between them as needed.
     Application Areas: Thumb is commonly used in microcontrollers, embedded systems, and other devices where code size and power efficiency are critical factors. It is not typically used in high-performance computing environments.
     Trade-offs: While Thumb instructions are efficient in terms of code size and are suitable for many embedded applications, they may not be ideal for tasks requiring high computational performance.
  • 18.
    ARM Processor: Processor and CPU cores
     ARM (Advanced RISC Machine, originally Acorn RISC Machine) processors are a family of microprocessor architectures first developed at Acorn Computers and now licensed by Arm Holdings. They are widely used in a variety of computing devices, from mobile phones to embedded systems and supercomputers.
     ARM processors are known for their power efficiency, scalability, and versatility.
     These processors can be found in single-core and multi-core configurations, each with its own set of characteristics and applications.
    ARM Started With Microcomputing
     The applications of ARM processors are easier to understand with some knowledge of their history.
     Before ARM, x86 processors were already in use; the first of them was launched in 1978.
     When complex and hard-to-implement instructions are removed from an instruction set, the remaining simple instructions take less power and chip space and run faster. This approach is called Reduced Instruction Set Computer (RISC) architecture.
     x86 is a Complex Instruction Set Computer (CISC) architecture.
  • 19.
     One of the most common processor architectures on the market is the Advanced RISC Machine (ARM) architecture; x86 remains very common in the desktop and server markets.
     The ARM architecture is widely used in smartphones and feature phones, and also in some laptops.
     Although x86 processors are optimized for performance, ARM processors are cost-effective, small, take less power to run, and give better battery life.
     ARM processors are not limited to mobile phones: they are also used in Fugaku, which ranked first on the TOP500 supercomputer list in 2020 and 2021.
     ARM also gives hardware designers more flexibility in their designs.
  • 20.
    Features of ARM Processor
     Multiprocessing System
     Tightly Coupled Memory
     Memory Management
     Thumb-2 Technology
     One-Cycle Execution Time
     Pipelining
     A large number of Registers
  • 21.
    Advantages of ARM Processor
     Most ARM instructions perform a single simple operation at a time, which makes execution fast and keeps power consumption low.
     ARM processors also work in multiprocessing systems, where more than one processor is used to process information.
     ARM processors are cheaper than many other processors, which makes them practical for mobile phones.
     ARM processors are scalable, which allows them to be used in a wide variety of devices.
  • 22.
    Disadvantages of ARM Processor
     ARM processors are not binary compatible with x86 processors, so software built for x86 systems, including traditional Windows applications, cannot run on them without recompilation or emulation.
     Simpler ARM cores trade peak performance for low power, which limits their use in some demanding applications.
     Getting the best out of an ARM processor can be hard and requires skilled programmers.
     ARM performance depends heavily on how well instructions are scheduled, so poorly scheduled code runs inefficiently.
  • 23.
  • 24.
    ARM Processor: Instruction Encoding format
     ARM processors use a fixed-length instruction encoding format. The basic ARM instruction is 32 bits (4 bytes) long. These 32 bits are divided into several fields, each serving a specific purpose. Here is an overview of the basic instruction encoding format for ARM instructions:
     Condition Field: Almost every ARM instruction includes a condition field that specifies under which conditions the instruction should be executed. It occupies the most significant 4 bits of the instruction (bits 31 to 28). Common conditions include "equal," "not equal," "greater than," and "always."
     Operation Code (Opcode): The opcode field specifies the operation to be performed, such as an arithmetic or logical operation. In data-processing instructions it is a 4-bit field (bits 24 to 21); other bits below the condition field select the instruction class (data processing, data transfer, branch, and so on).
     Operands and Registers: Depending on the instruction, there are fields that specify the source and destination operands, as well as the registers involved in the operation. Each register is selected by a 4-bit field; the position of these fields depends on the specific instruction format.
  • 25.
     Immediate Values: Some instructions allow immediate values (constants) to be encoded within the instruction itself. The immediate value field specifies these constants, which can be used in arithmetic operations or as offsets for memory access.
     Shift and Rotate Operations: ARM instructions often include fields that specify shift or rotate operations on data. These fields define how an operand should be shifted or rotated before the operation is performed.
     Data Size and Type: Load and store encodings include fields that specify the data size and type (e.g., byte, half-word, word) being transferred.
     Condition Flags: ARM instructions can set or modify condition flags in the processor's status register (e.g., the N flag for negative results or the Z flag for zero results). These flags help control program flow through conditional execution and branching.
  • 26.
     Addressing Modes: Instructions may include addressing mode fields that determine how memory addresses are calculated, especially in load and store operations.
     Immediate Offsets: In load and store instructions, an immediate offset may be included to specify the address offset from a base register.
     Bit-Handling Instructions: Some ARM instructions are used for bit manipulation, such as setting, clearing, testing, or toggling individual bits. Because ARM is a load/store architecture, these instructions operate on registers; bits in memory are changed by loading the word, modifying it, and storing it back.
     It's important to note that while this is the basic format for ARM instructions, there are different instruction sets and variations within the ARM architecture, such as Thumb, Thumb-2, ARMv7-A, ARMv8-A, and others. Each of these may have its own specific encoding formats and instruction set extensions to cater to different application domains, including 32-bit and 64-bit instruction sets. The specifics of the instruction encoding can vary between these architectures and instruction sets.
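    The field layout of a 32-bit ARM data-processing instruction can be illustrated by pulling the fields out of an instruction word with shifts and masks. The sketch below assumes the classic data-processing layout (condition in bits 31 to 28, immediate flag in bit 25, opcode in bits 24 to 21, S bit in bit 20, Rn in bits 19 to 16, Rd in bits 15 to 12, operand 2 in bits 11 to 0). It is a teaching aid, not a full decoder: it does not distinguish multiplies and other encodings that share the same class bits, and the function name is made up.

```python
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]


def decode_data_processing(word):
    """Split a 32-bit data-processing instruction word into named fields."""
    if (word >> 26) & 0b11 != 0b00:
        raise ValueError("not a data-processing instruction")
    return {
        "cond": CONDITIONS[(word >> 28) & 0xF],   # bits 31..28
        "immediate": bool((word >> 25) & 1),      # bit 25: operand 2 is a constant
        "opcode": OPCODES[(word >> 21) & 0xF],    # bits 24..21
        "set_flags": bool((word >> 20) & 1),      # bit 20: update N, Z, C, V
        "rn": (word >> 16) & 0xF,                 # first operand register
        "rd": (word >> 12) & 0xF,                 # destination register
        "operand2": word & 0xFFF,                 # register/shift or immediate
    }


add_fields = decode_data_processing(0xE0810002)   # ADD R0, R1, R2
mov_fields = decode_data_processing(0xE3A00005)   # MOV R0, #5
print(add_fields)
print(mov_fields)
```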
  • 27.
    ARM Processor: Memory load and Store instruction
     ARM processors use load and store instructions to manipulate data stored in memory.
     These instructions are fundamental to the operation of ARM-based systems, and they are used to move data between registers and memory.
     In ARM assembly language, load and store instructions are typically written as "LDR" (Load Register) and "STR" (Store Register), respectively.
     LDR (Load Register):
       LDR Rd, [Rn, #offset]: Loads a word from memory at the address given by Rn plus the offset into register Rd.
       Example: LDR R0, [R1, #4] loads a word from the memory location at R1 + 4 into R0.
  • 28.
     STR (Store Register):
       STR Rd, [Rn, #offset]: Stores the value in register Rd into memory at the address given by Rn plus the offset.
       Example: STR R0, [R1, #8] stores the value in R0 into the memory location at R1 + 8.
     LDM (Load Multiple):
       LDM Rn, {Rd1, Rd2, ...}: Loads multiple registers with values from memory starting at the address in Rn. Useful for loading several values at once.
       Example: LDM R1, {R2, R3, R4} loads values from memory into R2, R3, and R4 starting at the address in R1.
     STM (Store Multiple):
       STM Rn, {Rd1, Rd2, ...}: Stores multiple registers' values into memory starting at the address in Rn. Useful for saving several values at once.
       Example: STM R1, {R2, R3, R4} stores the values in R2, R3, and R4 into memory starting at the address in R1.
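    The effect of LDR, STR, LDM and STM on registers and memory can be traced with a small simulation. The `TinyArm` class below is a hypothetical model written for this explanation: it only handles aligned word accesses, and it assumes LDM/STM without a suffix behave as "increment after" with the lowest-numbered register at the lowest address.

```python
class TinyArm:
    """Toy register file plus word-addressed memory (not a real ARM model)."""

    def __init__(self):
        self.regs = [0] * 16
        self.mem = {}              # word-aligned address -> 32-bit value

    @staticmethod
    def _check(addr):
        if addr % 4:
            raise ValueError("unaligned word access")

    def ldr(self, rd, rn, offset=0):
        """LDR Rd, [Rn, #offset]"""
        addr = self.regs[rn] + offset
        self._check(addr)
        self.regs[rd] = self.mem.get(addr, 0)

    def str_(self, rd, rn, offset=0):
        """STR Rd, [Rn, #offset]"""
        addr = self.regs[rn] + offset
        self._check(addr)
        self.mem[addr] = self.regs[rd] & 0xFFFFFFFF

    def ldm(self, rn, reglist):
        """LDM Rn, {reglist}: consecutive words, lowest register first."""
        addr = self.regs[rn]
        self._check(addr)
        for r in sorted(reglist):
            self.regs[r] = self.mem.get(addr, 0)
            addr += 4

    def stm(self, rn, reglist):
        """STM Rn, {reglist}: consecutive words, lowest register first."""
        addr = self.regs[rn]
        self._check(addr)
        for r in sorted(reglist):
            self.mem[addr] = self.regs[r] & 0xFFFFFFFF
            addr += 4


cpu = TinyArm()
cpu.regs[1] = 0x1000
cpu.regs[2], cpu.regs[3], cpu.regs[4] = 10, 20, 30
cpu.stm(1, [2, 3, 4])      # STM R1, {R2, R3, R4}
cpu.ldr(0, 1, 4)           # LDR R0, [R1, #4]
loaded = cpu.regs[0]
cpu.regs[0] = 99
cpu.str_(0, 1, 8)          # STR R0, [R1, #8]
cpu.ldm(1, [5, 6, 7])      # LDM R1, {R5, R6, R7}
print(loaded, cpu.regs[5:8])
```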
  • 29.
    ARM Processor: Basics of I/O operations
     In ARM-based systems, I/O (Input/Output) operations cover the interactions between the CPU (Central Processing Unit) and external devices such as sensors, displays, keyboards, and storage devices.
     I/O operations are essential for interfacing ARM processors with a diverse range of external devices and for enabling various functionalities in embedded systems, IoT (Internet of Things) devices, and other applications.
     The specific type of I/O operation used depends on the nature of the external devices and the requirements of the application.
     Here are the key aspects of I/O operations with ARM processors:
  • 30.
     Memory-Mapped I/O (MMIO):
       Purpose: Memory-mapped I/O is a common technique where I/O devices are mapped into the memory address space of the ARM processor.
       Usage: It allows the ARM processor to read from and write to I/O devices using standard load (LDR) and store (STR) instructions, treating I/O registers as if they were memory locations.
     Port-Mapped I/O:
       Purpose: Port-mapped I/O is an alternative to memory-mapped I/O where I/O devices are accessed through a separate I/O port address space.
       Usage: Specialized instructions are used to read from and write to I/O ports, which are distinct from regular memory locations. This scheme is found on architectures such as x86; the ARM instruction set has no dedicated I/O instructions, so ARM systems rely on memory-mapped I/O.
     Serial Communication (UART, SPI, I2C):
       Purpose: Serial communication protocols like UART, SPI (Serial Peripheral Interface), and I2C (Inter-Integrated Circuit) are used for communication between the ARM processor and external devices.
       Usage: These protocols enable the ARM processor to exchange data serially with devices such as sensors, displays, and other microcontrollers.
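    Memory-mapped I/O can be pictured as a bus that routes each load or store either to RAM or to a device register, depending on the address. The sketch below is a simulation with made-up names: the `Uart` device, its register offsets, and the base address `UART_BASE` are all hypothetical and do not describe any particular ARM chip.

```python
class Uart:
    """Hypothetical serial device with a data register and a status register."""
    DATA = 0x0
    STATUS = 0x4

    def __init__(self):
        self.sent = []

    def read(self, offset):
        # In this toy model the transmitter is always ready (status = 1).
        return 1 if offset == self.STATUS else 0

    def write(self, offset, value):
        if offset == self.DATA:
            self.sent.append(value & 0xFF)


class Bus:
    """Routes loads and stores to RAM or to a device mapped at an address range."""

    def __init__(self):
        self.ram = {}
        self.devices = []          # list of (base, size, device)

    def attach(self, base, size, device):
        self.devices.append((base, size, device))

    def _find(self, addr):
        for base, size, device in self.devices:
            if base <= addr < base + size:
                return device, addr - base
        return None, addr

    def load(self, addr):          # what an LDR would do
        device, offset = self._find(addr)
        if device is not None:
            return device.read(offset)
        return self.ram.get(addr, 0)

    def store(self, addr, value):  # what an STR would do
        device, offset = self._find(addr)
        if device is not None:
            device.write(offset, value)
        else:
            self.ram[addr] = value & 0xFFFFFFFF


UART_BASE = 0x40000000             # assumed address, for illustration only
bus, uart = Bus(), Uart()
bus.attach(UART_BASE, 8, uart)

bus.store(0x1000, 1234)            # ordinary memory write
for byte in b"OK":
    while bus.load(UART_BASE + Uart.STATUS) == 0:
        pass                       # poll the status register until ready
    bus.store(UART_BASE + Uart.DATA, byte)
print(bus.load(0x1000), uart.sent)
```

    The program uses the same `load` and `store` operations for RAM and for the device, which is the point of the technique: no special I/O instructions are needed.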
  • 31.
     File I/O (Sequential and Random Access):
       Purpose: File I/O operations involve reading from and writing to files stored in external storage devices like SD cards or hard drives.
       Usage: The ARM processor uses file I/O operations to access and manipulate data stored in files sequentially or randomly.
     GPIO (General-Purpose Input/Output):
       Purpose: GPIO pins are used for basic digital I/O operations, allowing the ARM processor to read input signals from external devices (e.g., sensors) and control output signals to external devices (e.g., LEDs).
       Usage: GPIO pins can be configured as inputs or outputs, and their states can be controlled programmatically.
     Interrupt Handling:
       Purpose: Interrupts are used to signal the ARM processor when external events, such as data arrival or device status changes, require attention.
       Usage: ARM processors handle interrupts by executing interrupt service routines (ISRs), allowing for timely responses to external events without constant polling.
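    The interrupt mechanism described above (an event is raised, and the matching ISR runs without the main program polling the device) can be sketched as a table of handlers. This is a software simulation with hypothetical names; the IRQ number, the `InterruptController` class and the LED state are invented for illustration and do not model a real ARM interrupt controller.

```python
class InterruptController:
    """Toy controller: maps IRQ numbers to interrupt service routines."""

    def __init__(self):
        self.vectors = {}          # irq number -> ISR
        self.pending = []          # raised but not yet serviced

    def register(self, irq, isr):
        self.vectors[irq] = isr

    def raise_irq(self, irq):
        """Called by a device (here: a GPIO input) when an event occurs."""
        self.pending.append(irq)

    def dispatch(self):
        """Run the ISR for every pending interrupt; return how many ran."""
        handled = 0
        while self.pending:
            irq = self.pending.pop(0)
            isr = self.vectors.get(irq)
            if isr is not None:
                isr()
                handled += 1
        return handled


led = {"on": False}                # stands in for a GPIO output pin


def button_isr():
    """ISR: toggle the LED each time the button interrupt fires."""
    led["on"] = not led["on"]


BUTTON_IRQ = 3                     # assumed IRQ number
controller = InterruptController()
controller.register(BUTTON_IRQ, button_isr)

idle_handled = controller.dispatch()   # nothing pending, nothing runs
controller.raise_irq(BUTTON_IRQ)       # button pressed
handled = controller.dispatch()
print(idle_handled, handled, led)
```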
  • 32.
     DMA (Direct Memory Access):
       Purpose: DMA is used for high-speed data transfer between external devices and memory without CPU intervention.
       Usage: DMA controllers in ARM-based systems enable peripherals to read from or write to memory directly, reducing CPU load and increasing data transfer speed.
     Analog-to-Digital Conversion (ADC) and Digital-to-Analog Conversion (DAC):
       Purpose: ADC and DAC operations involve converting analog signals to digital and vice versa, enabling the ARM processor to interface with sensors and control analog devices.
       Usage: ARM processors can read analog data from sensors through ADCs and generate analog output signals through DACs.
     Ethernet and Network Communication:
       Purpose: ARM processors can communicate with other devices over Ethernet networks for data transfer and network-based I/O operations.
       Usage: Network protocols like TCP/IP and UDP are commonly used for tasks such as data exchange, remote control, and device management.
     USB (Universal Serial Bus) Communication:
       Purpose: USB interfaces allow ARM-based systems to connect to a wide range of USB devices, such as keyboards, mice, storage devices, and more.
       Usage: ARM processors can use USB controllers to manage data transfer and communication with USB peripherals.
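    For the ADC and DAC item above, the conversion between a digital code and a voltage is simple arithmetic. The sketch below assumes an ideal linear converter; the 10-bit resolution and the 3.3 V reference are example values chosen for illustration, not properties of any particular ARM device, and the function names are made up.

```python
def adc_code_to_volts(code, bits=10, vref=3.3):
    """Convert a raw ADC reading to volts for an ideal linear converter."""
    full_scale = (1 << bits) - 1
    if not 0 <= code <= full_scale:
        raise ValueError("code out of range for this resolution")
    return code * vref / full_scale


def volts_to_dac_code(volts, bits=10, vref=3.3):
    """Convert a target voltage to the nearest DAC code, clamped to the range."""
    full_scale = (1 << bits) - 1
    clamped = min(max(volts, 0.0), vref)
    return round(clamped * full_scale / vref)


reading = 512                                  # example raw sensor reading
volts = adc_code_to_volts(reading)
print(volts, volts_to_dac_code(volts))
```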
  • 33.
    Case study: ARM5 Architecture
    ARMv5 Architecture Overview:
     32-bit Architecture: ARMv5 is a 32-bit architecture, meaning it primarily deals with 32-bit data and instructions. It was widely used in various embedded systems, mobile devices, and microcontrollers.
     Thumb Instruction Set: ARMv5T variants support the Thumb instruction set, a 16-bit encoding of a subset of the ARM instruction set that was first introduced with ARMv4T. Thumb is designed for code density, making it suitable for applications with limited memory space.
     Classic RISC Design: ARMv5 processors typically use a classic RISC (Reduced Instruction Set Computer) pipeline architecture, which consists of multiple stages for instruction execution. This design emphasizes simplicity and efficiency.
     Memory Management: ARMv5 supports virtual memory management, and many ARMv5 processors include a Memory Management Unit (MMU) for address translation and memory protection. This enables features like multitasking and memory isolation.
     Clock Speeds: The clock speed of ARMv5 processors varies with the specific chip design and application requirements. ARMv5 processors were used in a wide range of devices with different clock frequencies.
  • 34.
    Use Cases and Applications of ARM 5 Architecture
    ARMv5 processors were widely used in various applications, including:
     Embedded Systems: ARMv5 was popular in embedded applications due to its power efficiency and versatility.
     Mobile Devices: Some early smartphones and mobile devices utilized ARMv5 processors.
     Consumer Electronics: ARMv5 was found in devices like digital cameras, set-top boxes, and portable media players.
     Network Equipment: ARMv5-based processors were used in networking equipment like routers and switches.
     Industrial Automation: ARMv5 was used in controllers and automation systems.
  • 35.
    Limitations of ARM5 Architecture Compared to later ARM architectures, ARMv5 lacked some of the advanced features and performance enhancements found in newer versions. It had limitations in terms of floating-point performance and multimedia capabilities.
  • 36.
    Case study: ARM7 Architecture
  • 37.
    Case study: ARM7 Architecture
    ARM 7
     Introduced in 1994, the ARM7™ processor family has been immensely successful, and has helped establish ARM as the architecture of choice in the digital world.
     Over the years, more than 10 billion ARM7 processor family-based devices have powered a wide variety of cost and power-sensitive applications.
     ARM 7 Series: ARM7500, ARM7DI, ARM710T.
     While the ARM7 processor family continues to be used today for simple 32-bit devices, newer embedded designs increasingly use more recent ARM processors such as the Cortex™-M0 and Cortex-M3, both of which offer significant technical enhancements over the ARM7 family.
    ARM7500
     It is a highly integrated single-chip computer based around the ARM RISC microprocessor macrocell.
     ARM7500 contains all the functionality required to create a complete computing system with the minimum of external components.
     The wide range of features incorporated into ARM7500 makes it an extremely flexible device, which can be programmed according to the required application to optimize for high performance or low power, or a combination of both.
  • 38.
    Features of ARM7500
     Highly integrated RISC computer
     30 Dhrystone 2.1 MIPS ARM7 core @ 33MHz
     4 Kbyte combined instruction and data cache
     Flexible Memory Management Unit
     Supports 16 or 32 bit wide memory via internal ROM and DRAM controllers
     3 channel DMA
     I/O controller
     2 serial ports, 4 A/D channels
     8 stereo sound channels
     32-bit CD quality serial sound channel
     Video controller with up to 120MHz pixel clock
     16 million colours from 256-entry palette
     16-level grey scales for LCD displays
     Suspend and stop power saving modes
  • 39.
    Applications of ARM7500
    The ARM7500 is ideally suited to applications requiring a compact, low-cost, power-efficient, high-performance RISC computing system on a single chip. These include:
      - Multimedia
      - Interactive visual display terminals
      - Portable computing
      - Handheld test instrumentation
      - Games consoles
      - Desktop computing
  • 40.
    ARM7DI
    The ARM7DI is a low-power, general-purpose 32-bit RISC microprocessor with integrated debug support.
    It comprises the ARM7D CPU core, an ICEbreaker module and a TAP controller.
    Its simple, elegant and fully static design is particularly suitable for cost- and power-sensitive applications.
    Enhancements
    The ARM7DI is similar to the ARM6 but with the following enhancements:
      - advanced debug (integrated ICE) support for faster time to market
      - fabrication on a sub-micron process for increased speed and reduced power consumption
      - 3 V operation for very low power consumption, as well as 5 V operation for system compatibility
      - higher clock speed for faster program execution
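    The link between 3 V operation and low power can be made concrete with the usual first-order model of CMOS switching power, P = a · C · V² · f. This model is a general textbook approximation, not taken from the ARM7DI documentation. It ignores leakage, and the function and parameter names below are hypothetical.

```python
def dynamic_power(c_eff_farads: float, vdd_volts: float,
                  freq_hz: float, activity: float = 1.0) -> float:
    """First-order CMOS switching power: P = a * C * V^2 * f (watts)."""
    return activity * c_eff_farads * vdd_volts ** 2 * freq_hz


def relative_power(v_new: float, v_ref: float) -> float:
    """Power ratio when only the supply voltage changes (same C, f, a)."""
    return (v_new / v_ref) ** 2


# Same chip, same clock, supply lowered from 5 V to 3 V.
print(relative_power(3.0, 5.0))

# A fully static design can stop its clock; switching power then drops to zero.
print(dynamic_power(1e-9, 3.0, 0.0))
```

    Under this model, running at 3 V instead of 5 V at the same clock frequency cuts switching power to about 36% of its 5 V value. The 1 nF effective capacitance is an arbitrary placeholder, not a measured figure.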
  • 41.
    Applications
    The ARM7DI is ideally suited to applications requiring RISC performance from a compact, power-efficient processor. These include:
      - Telecomms: GSM terminal controller
      - Datacomms: protocol conversion
      - Portable computing: palmtop computer
      - Portable instruments: handheld data acquisition unit
      - Automotive: engine management unit
      - Information systems: smart cards
      - Imaging: JPEG controller
  • 42.
    ARM710T
    The ARM710T is a general-purpose 32-bit microprocessor with an 8 KB cache, an enlarged write buffer and a Memory Management Unit (MMU) combined in a single chip.
    The CPU within the ARM710T is the ARM7TDMI. The ARM710T is software compatible with the ARM processor family.
    The ARM710T architecture is based on Reduced Instruction Set Computer (RISC) principles, and its instruction set and related decode mechanism are greatly simplified compared with microprogrammed Complex Instruction Set Computers (CISC).
    The on-chip mixed data and instruction cache, together with the write buffer, substantially raises the average execution speed and reduces the average memory bandwidth required by the processor.
    This allows the external memory to support additional processors or Direct Memory Access (DMA) channels with minimal performance loss.
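    The effect of the cache on average speed and external bandwidth can be sketched with a simple hit/miss model. The hit rate and cycle counts below are made-up illustrative values, not ARM710T measurements. The function names are hypothetical, and the write buffer is not modelled.

```python
def average_access_cycles(hit_rate: float, hit_cycles: float,
                          miss_cycles: float) -> float:
    """Mean cycles per memory access for a given cache hit rate.

    Hits are served by the on-chip cache in hit_cycles; misses go to
    external memory and cost miss_cycles in total.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def external_accesses(total_accesses: int, hit_rate: float) -> float:
    """Accesses that still reach external memory (the cache misses)."""
    return total_accesses * (1.0 - hit_rate)


# Hypothetical numbers: 90% hit rate, 1-cycle hit, 5-cycle miss.
with_cache = average_access_cycles(0.9, 1, 5)
without_cache = average_access_cycles(0.0, 1, 5)
print(with_cache, without_cache)
print(external_accesses(1000, 0.9))
```

    With these assumed values the average access cost falls from 5 cycles to about 1.4, and only about one access in ten reaches external memory. That spare bandwidth is what the text says can be given to other processors or DMA channels.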
  • 43.
    ARM710T (continued)
    The MMU supports a conventional two-level page-table structure and a number of extensions that make it ideal for embedded control, UNIX and object-oriented systems.
    The memory interface has been designed to allow the performance potential to be realized without incurring high costs in the memory system.
    Speed-critical control signals are pipelined so that system control functions can be implemented in standard low-power logic, and these control signals permit exploitation of the paged-mode access offered by industry-standard DRAMs.
    The ARM710T is a fully static part and has been designed to minimize power requirements.
    This makes it ideal for portable applications, where both static operation and low power consumption are essential.
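    A two-level page-table lookup can be sketched as follows. The field widths are an assumption based on the ARM small-page layout: a 12-bit first-level index, an 8-bit second-level index and a 12-bit offset for 4 KB pages. The slide only says "two-level". Dictionaries stand in for the in-memory tables, and sections, domains and access permissions are left out. All names are hypothetical.

```python
L1_SHIFT = 20        # VA[31:20] selects one of 4096 first-level entries
L2_SHIFT = 12        # VA[19:12] selects one of 256 second-level entries
L2_MASK = 0xFF
OFFSET_MASK = 0xFFF  # VA[11:0] is the byte offset within a 4 KB page


def split_virtual_address(va: int) -> tuple[int, int, int]:
    """Split a 32-bit virtual address into (L1 index, L2 index, offset)."""
    va &= 0xFFFFFFFF
    return va >> L1_SHIFT, (va >> L2_SHIFT) & L2_MASK, va & OFFSET_MASK


def translate(va: int, first_level: dict[int, dict[int, int]]) -> int:
    """Walk both table levels and return the physical address.

    first_level maps an L1 index to a second-level table, which maps
    an L2 index to a physical page frame number.
    """
    l1, l2, offset = split_virtual_address(va)
    second_level = first_level.get(l1)
    if second_level is None:
        raise LookupError("translation fault at level 1")
    frame = second_level.get(l2)
    if frame is None:
        raise LookupError("translation fault at level 2")
    return (frame << L2_SHIFT) | offset


# One mapped page: virtual page 0x12345 -> physical frame 0xABCDE.
tables = {0x123: {0x45: 0xABCDE}}
print(hex(translate(0x12345678, tables)))
```

    The two-step walk means a second-level table is needed only for regions that are actually mapped, which keeps the tables small on memory-constrained embedded systems.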