ARM Architecture and
Programming Model
E.C. Engineering Department,
GEC Rajkot
Advanced Microcontroller
Content
■ARM Registers
■CPSR and SPSR format
■Registers available in Thumb state
■Pipelining organisation
■ARM Exceptions
■Operating modes of ARM
■Exception handlers
■ARM Input output system
■ARM Based Embedded Device
[Figure: ARM7TDMI datapath block diagram. Blocks: address register, PC incrementer, register bank, multiplier (Mult), barrel shifter, ALU, data-in register, data-out register, instruction decode and control. Buses and signals: A[31:0], D[31:0], A bus, B bus, ALU bus, control signals.]
Architecture of ARM7TDMI
■The register bank stores the processor state. It has two read ports and one write port.
■These ports can be used to access any register.
■The barrel shifter shifts or rotates one operand by any number of bits. It is a combinational circuit that maximizes hardware utilization.
■The address register and incrementer select and hold all memory addresses and generate sequential addresses when required. Sequential addresses are used for:
■Auto-increment and auto-decrement addressing
■Load and store of multiple data blocks
■In a single-cycle data processing instruction, two register operands are read. The value on the B bus is shifted and combined with the value on the A bus in the ALU, and the result is written back into the register bank.
■The program counter value is held in the address register, from where it is fed to the incrementer. The incremented PC value is copied back to register R15.
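The data path described above (B-bus operand through the barrel shifter, then into the ALU with the A-bus operand) can be sketched in a few lines. This is a minimal illustration, not a model of the real hardware; the function names `lsl`, `ror` and `data_processing_add` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # all values are kept within 32 bits


def lsl(value, amount):
    """Logical shift left within 32 bits."""
    return (value << amount) & MASK32


def ror(value, amount):
    """Rotate right within 32 bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def data_processing_add(a_bus, b_bus, shift=lsl, amount=0):
    """Shift the B-bus operand, then add it to the A-bus operand in the ALU."""
    shifted_b = shift(b_bus, amount)
    return (a_bus + shifted_b) & MASK32


# Models an instruction of the form ADD Rd, Rn, Rm, LSL #2
# with Rn = 0x1000 on the A bus and Rm = 3 on the B bus.
result = data_processing_add(0x1000, 3, lsl, 2)
print(hex(result))
```

The shift and the addition are kept as separate steps to mirror the separate barrel shifter and ALU blocks in the diagram.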
Simplified block diagram of ARM chip
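ARM Registers
■Total of 37 32-bit registers (31 general-purpose registers and 6 status registers)
■16 visible at any time, R0 – R15
■R0 to R7 are unbanked registers (the same 32-bit physical registers in all processor modes)
■R8 to R14 are banked registers (separate physical copies exist for some modes)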
ARM Programmer’s model: registers visible in each mode

System & User  FIQ       IRQ       Supervisor  Abort     Undefined
R0             R0        R0        R0          R0        R0
R1             R1        R1        R1          R1        R1
R2             R2        R2        R2          R2        R2
R3             R3        R3        R3          R3        R3
R4             R4        R4        R4          R4        R4
R5             R5        R5        R5          R5        R5
R6             R6        R6        R6          R6        R6
R7             R7        R7        R7          R7        R7
R8             R8_fiq    R8        R8          R8        R8
R9             R9_fiq    R9        R9          R9        R9
R10            R10_fiq   R10       R10         R10       R10
R11            R11_fiq   R11       R11         R11       R11
R12            R12_fiq   R12       R12         R12       R12
R13            R13_fiq   R13_irq   R13_svc     R13_abt   R13_und
R14            R14_fiq   R14_irq   R14_svc     R14_abt   R14_und
R15 (PC)       R15 (PC)  R15 (PC)  R15 (PC)    R15 (PC)  R15 (PC)
CPSR           CPSR      CPSR      CPSR        CPSR      CPSR
-              SPSR_fiq  SPSR_irq  SPSR_svc    SPSR_abt  SPSR_und
■Special roles:
■Hardware
■R14 – Link Register (LR): holds the return address when a branch-with-link (BL) instruction is used
■R15 – Program Counter (PC)
■Software
■R13 – Stack Pointer (SP)
■Current Program Status Register (CPSR)
■Saved Program Status Register (SPSR)
■On an exception that enters mode "mod":
■(PC + 4) ⭢ LR
■CPSR ⭢ SPSR_mod
■PC ⭠ IV address
■R13, R14 are replaced by R13_mod, R14_mod
■In FIQ mode, R8 – R12 are also replaced
CPSR & SPSR format
■The CPSR stores the condition code flags:
■N: Negative (set if the ALU result is negative)
■Z: Zero (set if the ALU result is zero)
■C: Carry (set if the ALU operation generates a carry out)
■V: Overflow (set if a signed operation overflows)
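As a worked illustration of the four flags, the sketch below computes N, Z, C and V for a 32-bit addition. It assumes flag-setting addition only (subtraction treats the carry flag differently and is not covered); the function name `add_with_flags` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Return (result, flags) for a 32-bit flag-setting addition."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # unbounded sum, used to detect carry out
    result = full & MASK32
    sign_a = (a >> 31) & 1
    sign_b = (b >> 31) & 1
    sign_r = (result >> 31) & 1
    flags = {
        "N": sign_r,                                        # bit 31 of the result
        "Z": int(result == 0),                              # result is zero
        "C": int(full > MASK32),                            # carry out of bit 31
        "V": int(sign_a == sign_b and sign_r != sign_a),    # signed overflow
    }
    return result, flags


# Largest positive signed value plus one overflows into the sign bit.
print(add_with_flags(0x7FFFFFFF, 1))
# All ones plus one wraps to zero with a carry out.
print(add_with_flags(0xFFFFFFFF, 1))
```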
Registers for Thumb mode
Features of ARM7TDMI architecture
■Fabricated in HCMOS technology with feature sizes of 0.25 μm and below. A small feature size permits low-voltage operation and gives low power dissipation
■Gives high performance of 300 MIPS at a feature size of 0.13 μm
■Fully static operation (since it is MOSFET based)
■Large register set of 37 registers
■Three-stage pipeline
■2^32 addresses, giving a 4 GByte linear address space
■32/16-bit RISC architecture (ARM v4T) with a 32-bit ARM instruction set and a 16-bit Thumb instruction subset as an extension
■32-bit RALU and high-performance multiplier
■Instructions process data of different sizes: 8-bit, 16-bit and 32-bit
■Two types of interrupt request: FIQ (Fast Interrupt Request) and IRQ (Interrupt Request). FIQ has the higher priority
■Co-processor interface available
■Extensive debug facilities: ICE (in-circuit emulator), RT (real-time) debug and on-chip JTAG interface
Pipeline Organization
■A pipelined processor works on more than one instruction at a time: while one instruction is executing, the next is being decoded and a third is being fetched
■Ideally, throughput is up to n times higher in an n-stage pipeline because the stages operate simultaneously
■Pipeline organization increases speed: most instructions execute in a single cycle
■Versions:
■3-stage (ARM7TDMI and earlier)
■5-stage (ARMS, ARM9TDMI)
■6-stage (ARM10TDMI)
■3-stage pipeline: Fetch – Decode – Execute
■Three-cycle latency and one-instruction-per-cycle throughput

Pipeline timing (columns are clock cycles):

instruction  t       t+1     t+2     t+3     t+4
i            Fetch   Decode  Execute
i+1                  Fetch   Decode  Execute
i+2                          Fetch   Decode  Execute
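The timing diagram can be reproduced with a small schedule generator. This is a sketch under idealised assumptions: no branches, no stalls, and every instruction spends exactly one cycle in each stage. The names `STAGES` and `pipeline_schedule` are hypothetical.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """For each cycle, map every in-flight instruction index to its stage."""
    total_cycles = n_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for instr in range(n_instructions):
            stage_index = cycle - instr  # instruction k enters Fetch at cycle k
            if 0 <= stage_index < len(stages):
                row[instr] = stages[stage_index]
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(3)
for cycle, row in enumerate(schedule):
    print(f"t+{cycle}:", row)
```

With three instructions the schedule spans five cycles (t to t+4): the first result appears after three cycles, and one instruction completes per cycle after that.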
■5-stage pipeline:
■Reduces work per cycle => allows higher clock frequency
■Separates data and instruction memory => reduction of CPI (average number of clock Cycles Per Instruction)
■Stages: Fetch, Decode, Execute, Buffer/data, Write-back
■The pipeline is flushed and refilled on a branch, which slows execution
■Special features in the instruction set, such as conditional execution, eliminate small jumps in code to obtain the best flow through the pipeline
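ARM Exceptions
When an exception occurs:
■The mode changes
■The CPSR is saved to the SPSR of the new mode
■The PC-derived return address is saved to the LR of the new mode
■The CPSR is set to the exception mode
■The PC is set to the address of the exception handler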
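The entry sequence can be modelled as a few assignments on a simplified core state. This is a sketch, not the architecture's full behaviour: interrupt masking and Thumb state are left out, the return address is passed in by the caller rather than derived, and the `Core` class and `enter_exception` function are hypothetical. The mode bit values are the standard CPSR mode encodings, which the slides do not list.

```python
from dataclasses import dataclass, field

MODE_MASK = 0x1F  # CPSR bits [4:0] select the mode
MODE_BITS = {
    "usr": 0x10, "fiq": 0x11, "irq": 0x12, "svc": 0x13,
    "abt": 0x17, "und": 0x1B, "sys": 0x1F,
}


@dataclass
class Core:
    pc: int = 0
    cpsr: int = MODE_BITS["usr"]
    spsr: dict = field(default_factory=dict)  # banked SPSR per exception mode
    lr: dict = field(default_factory=dict)    # banked LR (R14) per exception mode


def enter_exception(core, mode, vector, return_address):
    """Apply the entry steps: save CPSR, save return address, switch mode, jump."""
    core.spsr[mode] = core.cpsr
    core.lr[mode] = return_address & 0xFFFFFFFF
    core.cpsr = (core.cpsr & ~MODE_MASK) | MODE_BITS[mode]
    core.pc = vector


core = Core(pc=0x8000)
# Assumption for illustration: the saved return address is PC + 4.
enter_exception(core, "irq", vector=0x18, return_address=core.pc + 4)
print(hex(core.pc), hex(core.cpsr), hex(core.lr["irq"]), hex(core.spsr["irq"]))
```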
Operating Modes
■Seven operating modes:
■User
■Privileged:
■System (version 4 and above)
■Exception modes: FIQ, IRQ, Abort, Undefined, Supervisor
User mode:
■Normal program execution mode
■System resources unavailable
■Mode changed by exception only
Exception modes:
■Entered upon exception
■Full access to system resources
■Mode changed freely
Exceptions
Table 1 - Exception types, sorted by Interrupt Vector (IV) address

Exception              Mode        Priority  IV Address
Reset                  Supervisor  1         0x00000000
Undefined instruction  Undefined   6         0x00000004
Software interrupt     Supervisor  6         0x00000008
Prefetch Abort         Abort       5         0x0000000C
Data Abort             Abort       2         0x00000010
Interrupt              IRQ         4         0x00000018
Fast interrupt         FIQ         3         0x0000001C
Memory organisation
[Figure: microcontroller memory map. The CPU connects over a bus to internal ROM and internal RAM. Internal ROM (address label 00000000): vector table, exception and interrupt handlers, startup code (RESET handler), read-only data, main code, unused ROM. Internal RAM (address label 20000000): read-write data, heap, stack, unused RAM.]
Exception handling
[Figure: the same memory map, with the vector table at address 00000000 followed by the exception and interrupt handlers, startup code (RESET handler), read-only data, main code and unused ROM; RAM holds read-write data, heap, stack and unused RAM.]
Exception routines
A vector address does not have enough space to hold an entire exception routine, so each vector holds an instruction that transfers control to the routine at another location:
■B <Address>
■MOV PC,#<Immediate Value>
■LDR PC,[PC,#<Offset Value>]
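To make the first option concrete, the sketch below computes the 32-bit encoding of an unconditional ARM-state `B` instruction placed at a vector. It relies on two facts about the ARM instruction set that the slides do not state: the PC reads as the instruction address plus 8, and `B` stores a signed 24-bit word offset. The function name `encode_branch` and the handler address 0x20 are hypothetical.

```python
def encode_branch(instr_address, target):
    """Encode an unconditional ARM-state B instruction located at instr_address."""
    # The branch offset is relative to the PC, which reads as address + 8.
    offset = target - (instr_address + 8)
    if offset % 4:
        raise ValueError("target must be word aligned")
    word_offset = offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is out of range for B")
    # 0xEA = condition AL (always) + branch opcode; low 24 bits hold the offset.
    return 0xEA000000 | (word_offset & 0x00FFFFFF)


# A branch in the reset vector (0x00) to a hypothetical handler at 0x20.
print(hex(encode_branch(0x00, 0x20)))
# A branch to itself in the IRQ vector (0x18), a common placeholder loop.
print(hex(encode_branch(0x18, 0x18)))
```

The limited range of `B` is one reason the `LDR PC,[PC,#<Offset Value>]` form is also listed: it loads a full 32-bit handler address from a nearby literal.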
Exception priorities

Exception              Priority  I bit  F bit
Reset                  1         1      1
Data Abort             2         1      -
FIQ                    3         1      1
IRQ                    4         1      -
Prefetch Abort         5         1      -
SWI                    6         1      -
Undefined instruction  6         1      -

(1 = bit set on exception entry, - = bit unchanged)

The I and F bits are the interrupt disable bits: when the I bit is set, IRQ interrupts are disabled; when the F bit is set, FIQ interrupts are disabled.
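The table can be expressed as a set of bit masks applied to the CPSR on entry. In this sketch the bit positions (I is CPSR bit 7, F is CPSR bit 6) come from the ARM architecture rather than from the slides, and the names `ENTRY_MASK`, `mask_on_entry`, `irq_enabled` and `fiq_enabled` are hypothetical.

```python
I_BIT = 1 << 7  # CPSR bit 7: IRQ disable
F_BIT = 1 << 6  # CPSR bit 6: FIQ disable

# Bits set on entry to each exception, following the priorities table.
ENTRY_MASK = {
    "reset": I_BIT | F_BIT,
    "data_abort": I_BIT,
    "fiq": I_BIT | F_BIT,
    "irq": I_BIT,
    "prefetch_abort": I_BIT,
    "swi": I_BIT,
    "undefined": I_BIT,
}


def mask_on_entry(cpsr, exception):
    """Return the CPSR with the disable bits set for the given exception."""
    return cpsr | ENTRY_MASK[exception]


def irq_enabled(cpsr):
    return not cpsr & I_BIT


def fiq_enabled(cpsr):
    return not cpsr & F_BIT


cpsr_in_irq = mask_on_entry(0x10, "irq")
cpsr_in_fiq = mask_on_entry(0x10, "fiq")
print(irq_enabled(cpsr_in_irq), fiq_enabled(cpsr_in_irq))
print(irq_enabled(cpsr_in_fiq), fiq_enabled(cpsr_in_fiq))
```

Because only the I bit is set on IRQ entry, an FIQ can still interrupt an IRQ handler, which matches FIQ's higher priority.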
Exception handlers
RESET:
• Initialize the system and set up stack pointers and memory before enabling IRQ and FIQ
• Code should be designed to avoid triggering further unwanted exceptions
Data Abort:
• Occurs when an invalid memory address is accessed
• An FIQ exception can be raised within the Data Abort handler
FIQ:
• Occurs when an external peripheral asserts the FIQ input signal
• On entry, the core disables both FIQ and IRQ interrupts
IRQ:
• Occurs when an external peripheral or device asserts the IRQ input signal
• An IRQ is taken only if no FIQ or Data Abort is pending
• On entry, IRQ is disabled and stays disabled until the handler enables it again
Pre-fetch abort:
• Occurs when an attempt to fetch an instruction results in a memory fault
• An FIQ can be serviced within the pre-fetch abort handler
Undefined:
• Occurs when an instruction is not in the ARM or Thumb instruction set
• SWI and Undefined share priority level 6 because they cannot occur together
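Return from exceptions
There is no RET instruction.
• The only way to return is to move the value of the link register (LR) to the program counter (PC):
MOV PC,R14
• The exception handler must not corrupt the value of LR
• The CPSR must be restored from the SPSR on return (the S-suffixed form, MOVS PC,R14, does this in the same instruction)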
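The value in LR is not always the exact address to resume at, so handlers usually subtract a small offset when returning. The sketch below tabulates the usual ARM-state adjustments; these values are taken from the ARM architecture rather than from the slides, and the names `RETURN_ADJUST` and `return_address` are hypothetical.

```python
# Bytes subtracted from LR on return, per exception type (ARM state).
RETURN_ADJUST = {
    "swi": 0,             # MOVS PC, R14
    "undefined": 0,       # MOVS PC, R14
    "irq": 4,             # SUBS PC, R14, #4
    "fiq": 4,             # SUBS PC, R14, #4
    "prefetch_abort": 4,  # SUBS PC, R14, #4
    "data_abort": 8,      # SUBS PC, R14, #8 (retries the aborted instruction)
}


def return_address(exception, lr):
    """Compute the address loaded into the PC when the handler returns."""
    return (lr - RETURN_ADJUST[exception]) & 0xFFFFFFFF


print(hex(return_address("irq", 0x8008)))
print(hex(return_address("data_abort", 0x8008)))
```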
Interrupt assignment
• An interrupt controller can be used to connect general-purpose interrupts to FIQ or IRQ
• FIQ is usually reserved for interrupts that require a fast response time
• IRQ is assigned to general-purpose interrupts, such as the periodic timer interrupt used for context switching between processes
Interrupt Latency
• Latency has hardware and software components
• Hardware: use more registers
• Software: use a nested interrupt handler, which allows an interrupt inside an interrupt
• A stack organization is needed to reduce interrupt latency in software. The average latency of high-priority interrupts is reduced
• A nested interrupt handler requires a larger stack
• The stack size can be chosen by examining the embedded application: what the sources of interrupts are, how much interrupt latency can be tolerated, etc.
ARM Input-Output System
• The ARM I/O system uses memory-mapped I/O
• There is no separate address range for I/O devices: device registers share the memory address space
• Interrupt support: IRQ and FIQ
• DMA support: for high-bandwidth data transfer
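Memory-mapped I/O means that an ordinary load or store reaches a device when the address falls in the device's range. The toy bus below illustrates the idea only; the register address `LED_REG`, the 8-bit register width and the `MemoryMappedBus` class are all hypothetical and do not describe any real device.

```python
LED_REG = 0xE0000000  # hypothetical address of an 8-bit device register


class MemoryMappedBus:
    """Toy bus: one device register and word-addressed RAM share one address space."""

    def __init__(self):
        self.ram = {}
        self.led_state = 0

    def store(self, address, value):
        if address == LED_REG:
            self.led_state = value & 0xFF           # store goes to the device
        else:
            self.ram[address] = value & 0xFFFFFFFF  # store goes to memory

    def load(self, address):
        if address == LED_REG:
            return self.led_state
        return self.ram.get(address, 0)


bus = MemoryMappedBus()
bus.store(0x20000000, 5)    # ordinary RAM write
bus.store(LED_REG, 0x1FF)   # same store operation, but it reaches the device
print(bus.load(0x20000000), hex(bus.load(LED_REG)))
```

The same `store` and `load` operations serve both memory and the device, which is why no special I/O instructions are needed.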
ARM Architecture Versions …
ARM Based Embedded Device
References:
[1] Andrew N. Sloss, Dominic Symes, Chris Wright, ARM System Developer’s Guide: Designing and Optimizing System Software, Elsevier
[2] www.arm.com
Thank You
