ARM Processors
Dr. N. Mathivanan
Visiting Professor
Department of Instrumentation and Control Engineering
National Institute of Technology,
TRICHY, TAMILNADU
INDIA
N. Mathivanan
Topics
• Features of ARM processors
• ARM architecture variants and processor families
• ARM7-TDMI internal architecture
• Register organization
• Pipelining
• Operating modes
• Exception handling
• ARM bus architecture
• Debug architecture
• Interface signals
ARM Processors
• Advanced RISC Machine (ARM)
ARM Ltd. does not manufacture chips; it licenses its processor designs to manufacturers
Used in high-end applications involving complex computation
Handheld devices, robotics, automation systems, consumer electronics
• Features
High performance, low power, small size (ideal for embedded systems)
Large register file, small instruction set, load-store instructions,
fixed-length instructions, conditional execution of instructions,
high code density, most instructions executable in a single cycle,
32-bit in-line barrel shifter, built-in circuit for hardware debugging,
DSP-enhanced instructions, Jazelle (Java bytecode extension, a third execution state),
TrustZone (SoC approach to security)
ARM Architecture Variants (core), Processor Families
• Each family has its own instruction set, mem management, etc.
Architecture version | Processor family | Processor | Features | Microcontroller
ARM v4T | ARM7TDMI (1995) | ARM720T, ARM740T | Von Neumann, 3-stage pipeline | LPC2100 series
ARM v4T | ARM9TDMI | ARM920T, ARM922T, ARM940T | MMU, Harvard, 5-stage pipeline | SAM9G, LPC29xx, LPC3xxx, STR9
ARM v5TE, ARM v5TEJ | ARM9E (1997) | ARM926EJ-S | MMU, DSP, Jazelle | SAM9XE
ARM v5TE, ARM v5TEJ | ARM9E (1997) | ARM946E-S | MPU, DSP | -
ARM v5TE, ARM v5TEJ | ARM9E (1997) | ARM966HS | MPU (optional), DSP | -
ARM v5TE, ARM v5TEJ | ARM10E (1999) | ARM1020E | MMU, DSP | -
ARM v5TE, ARM v5TEJ | ARM10E (1999) | ARM1026EJ-S | MMU/MPU, DSP, Jazelle | -
ARM v6 | ARM11 (2003) | ARM1136J(F)-S | MMU, TrustZone, DSP, Jazelle | MSM7000, i.MX3x
ARM v6 | ARM11 (2003) | ARM1156T2(F)-S | MPU, DSP | -
ARM v6 | ARM11 (2003) | ARM1176JZ(F)-S | MMU, TrustZone, DSP, Jazelle | BCM2835
ARM v6 | ARM11 (2003) | ARM11 MPCore | MMU, multiprocessor cache support, DSP, Jazelle | -
Architecture version | Processor family | Processor | Features | Microcontroller
ARM v6-M | Cortex | Cortex-M0 | NVIC | LPC1200, 1100 series, STM32F0x0, x1, x2
ARM v6-M | Cortex | Cortex-M1 | FPGA TCM interface, NVIC | STM32F1, F2, L1, W
ARM v7-M | Cortex | Cortex-M3 | MPU (optional), NVIC | ST32F512-M, LPC1300, 1700, 1800
ARM v7-R | Cortex | Cortex-R4 | MPU, DSP | STA1095, SAM4L, SAM4N, SAM4S
ARM v7-R | Cortex | Cortex-R4F | MPU, DSP, floating point | SAM4C, SAM4E, LPC40xx, 43xx, STM32 F3, F4
ARM v7-A | Cortex | Cortex-A8 | MMU, TrustZone, DSP, Jazelle, Neon, floating point | Freescale i.MX5X
ARM v7-A | Cortex | Cortex-A9 | MMU, TrustZone, multiprocessor, DSP, Jazelle, Neon, floating point | Freescale i.MX6QP
ARM Nomenclature
• ARM x y z T D M I E J F S (Example: ARM7-TDMI-S)
x – Series
y – MMU
z – Cache
T – Thumb (16-bit instruction set)
D – Debug support (JTAG)
M – Enhanced multiplier
I – EmbeddedICE (In-Circuit Emulator) macrocell
E – Enhanced instructions for DSP
J – Java acceleration by Jazelle
F – Floating-point
S – Synthesizable version
ARM7-TDMI – Internal Architecture
• Von Neumann architecture
• Data bus – 32-bit
• Address bus – 32-bit
Addressable memory space – 4 GB
• Register bank – (31+6) 32-bit regs.
• In-line barrel shifter
• Multiplier
• ALU
• Incrementer
• Address register
• Instruction decoder & control logic
[Figure: ARM7-TDMI block diagram showing the address register, address incrementer, register bank (31 x 32-bit registers, 6 status registers), 32 x 8 multiplier, barrel shifter, ALU, data-in register, data-out register, and instruction decoder and control logic, interconnected by the A bus, B bus, ALU bus, PC bus and incrementer bus, with external address bus A[31:0] and data bus D[31:0]]
Register Organization
31 general-purpose registers
Only 16 registers are accessible at any one time
The other 15 registers are hidden (banked)
Visible registers are named r0 – r15
r13, r14, r15 are SP, LR, PC
8-bit/16-bit/32-bit data can be read/written
6 status registers
One CPSR, accessible in all modes, and five SPSRs, one per exception mode
Contain flags and control bits
Register bank has
2 read ports and 1 write port, plus
1 read port and 1 write port for the PC
• Bit definitions of Program Status registers
• Barrel shifter
Combinational logic circuit
Shifts left/right by any number of bit positions in one cycle
Preprocesses one of the source register operands before it is passed to the ALU
• Multiplier
32-bit x 8-bit multiplier with early termination, Booth's algorithm
32-bit x 32-bit multiply completes in 5 cycles
Non-M variants use a 32 x 2-bit multiplier; a 32-bit x 32-bit multiply takes 17 cycles
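Program status register (CPSR/SPSR) bit layout:
B31 – N (Negative)
B30 – Z (Zero)
B29 – C (Carry)
B28 – V (Overflow)
B27 to B8 – not used for flags or control bits
B7 – I (IRQ Interrupt Mask)
B6 – F (FIQ Interrupt Mask)
B5 – T (Thumb State Flag)
B4 to B0 – M4 to M0 (Mode)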
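The bit definitions of the program status register can be illustrated with a small decoder. This is only a sketch: the function name `decode_psr` and the example value are assumptions made for illustration, while the bit positions follow the layout given on this slide.

```python
# Bit positions of the flag and control bits in a program status register.
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28}
CONTROL_BITS = {"I": 7, "F": 6, "T": 5}
MODE_MASK = 0x1F  # M4..M0 occupy bits 4..0


def decode_psr(value):
    """Split a 32-bit PSR value into flags, control bits and the mode field."""
    fields = {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields.update({name: (value >> bit) & 1 for name, bit in CONTROL_BITS.items()})
    fields["mode"] = value & MODE_MASK
    return fields


# Hypothetical value: Z and C set, IRQ masked, ARM state, mode bits 10011 (SVC).
example = decode_psr(0x60000093)
print(example)
```

A debugger view of the CPSR does essentially this: it masks and shifts the raw 32-bit value rather than storing the fields separately.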
• ALU
Connected to the register bank through the A bus and B bus
ALU and barrel shifter operations take place in the same cycle
Result of an ALU operation goes back to the register bank through the ALU bus
• Address register
Holds the address of the next instruction to be fetched
• Instruction decoder and control logic
Generates the control signals that interface the processor to memory and peripherals
Has Thumb decompressor –
Decompresses 16-bit Thumb code to 32-bit ARM code
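The way the barrel shifter preprocesses one operand before the ALU uses it can be modelled in a few lines. The sketch below is illustrative only: the helper names are assumptions, shift amounts are assumed to lie in 0 to 31, and `add_shifted` imitates an instruction of the form `ADD Rd, Rn, Rm, LSL #n`, which is not taken from the slides.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left, result truncated to 32 bits."""
    return (x << n) & MASK32


def lsr(x, n):
    """Logical shift right, zeros shifted in from the left."""
    return (x & MASK32) >> n


def asr(x, n):
    """Arithmetic shift right, sign bit replicated from the left."""
    x &= MASK32
    shifted = x >> n
    if x >> 31:
        shifted |= (MASK32 << (32 - n)) & MASK32
    return shifted


def ror(x, n):
    """Rotate right within 32 bits."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


SHIFTS = {"LSL": lsl, "LSR": lsr, "ASR": asr, "ROR": ror}


def add_shifted(rn, rm, shift="LSL", amount=0):
    """Shift the second operand first (barrel shifter), then add (ALU)."""
    operand2 = SHIFTS[shift](rm, amount)
    return (rn + operand2) & MASK32


print(hex(add_shifted(0x1000, 3, "LSL", 2)))
```

Here the second operand 3 is shifted left by 2 to give 12 before the addition, so a base address plus a scaled index is formed in what the hardware treats as a single ALU operation.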
• Data Types
o Word – 32-bit, Halfword – 16-bit, Byte – 8-bit
o Memory is byte addressable, can hold 2^32 bytes (= 4 GB)
o Word/halfword/byte size data are placed at word/halfword/byte aligned addresses
o 32-bit ARM instructions are placed at word-aligned addresses
• Byte order – Endian format
o Word/halfword size data can be saved/retrieved in big endian or little endian format
o Big endian: the most significant byte (MSB) of word/halfword data is stored at the lowest address, and the data is addressed by the address of its MSB
o Little endian: the least significant byte (LSB) of word/halfword data is stored at the lowest address, and the data is addressed by the address of its LSB
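The two byte orders and the alignment rule can be checked with a short sketch. The helper names `store` and `is_aligned` are assumptions introduced for illustration; the word 0x6789ABCD is borrowed from the review questions.

```python
def store(value, size, byteorder, base=0):
    """Return {address: byte} for a value of `size` bytes stored at `base`."""
    data = value.to_bytes(size, byteorder)
    return {base + offset: byte for offset, byte in enumerate(data)}


def is_aligned(address, size):
    """Word (4), halfword (2) and byte (1) data sit at multiples of their size."""
    return address % size == 0


big = store(0x6789ABCD, 4, "big")
little = store(0x6789ABCD, 4, "little")

for name, layout in (("big", big), ("little", little)):
    print(name, {hex(addr): hex(byte) for addr, byte in layout.items()})
```

In the big endian layout the byte 0x67 lands at the lowest address, while in the little endian layout 0xCD does; in both cases the word is addressed by that lowest address.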
Pipelining
• Executes instructions in Fetch, Decode, Execute stages (3-stage pipeline)
• Pipelining increases the flow of instructions to the processor
• Enables processor and memory to work continuously
• Performs Fetch, Decode and Execute operations simultaneously on successive instructions
Example:
• In T1 cycle, fetch 1st instruction
• In T2, fetch 2nd, decode 1st instruction
• In T3, fetch 3rd, decode 2nd, execute 1st instruction
• From T3 onwards, one instruction is executed in each cycle
[Figure: ARM7 3-stage pipeline. FETCH: instruction fetch. DECODE: Thumb-to-ARM decompression, ARM decode, register select. EXECUTE: register read, shift, ALU, register write]
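The T1, T2, T3 example can be generated for any number of instructions. This is a minimal sketch of an ideal pipeline, assuming no branches and no load/store stalls; the function name `pipeline_schedule` is an assumption for illustration.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_schedule(n_instructions):
    """For each cycle T1, T2, ... list which instruction occupies each stage."""
    schedule = []
    total_cycles = n_instructions + len(STAGES) - 1
    for cycle in range(1, total_cycles + 1):
        row = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # instruction number currently in this stage
            if 1 <= instr <= n_instructions:
                row[stage] = instr
        schedule.append(row)
    return schedule


for t, row in enumerate(pipeline_schedule(3), start=1):
    print(f"T{t}: {row}")
```

Three instructions need five cycles in total, but once the pipeline is full (T3 in the example) one instruction reaches the execute stage in every cycle.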
Pipelining Continued……
• PC points to the memory address of the instruction currently being fetched
• Address of the instruction currently being decoded is PC-4
• Address of the instruction currently being executed is PC-8
• Branch instructions break the pipeline; LDR/STR instructions stall the pipeline
• Solution: Harvard architecture; advanced processors have deeper pipelines
• ARM9 – 5 stage, ARM10 – 6 stage, ARM11 – 8 stage
[Figure: 5-stage pipeline. FETCH: instruction fetch. DECODE: Thumb-to-ARM decompression, ARM decode, register select, register read. EXECUTE: shift + ALU. MEMORY: memory access. WRITE: register write]
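The PC-4 and PC-8 relationships follow directly from the pipeline depth. The sketch below assumes ARM state with 4-byte instructions by default; the 2-byte instruction size used for Thumb state is an assumption added here, not a figure from the slides, and both function names are hypothetical.

```python
def pipeline_addresses(pc, instr_size=4):
    """Addresses of the instructions in each stage, given the current PC."""
    return {
        "fetch": pc,
        "decode": pc - instr_size,
        "execute": pc - 2 * instr_size,
    }


def pc_seen_by(execute_addr, instr_size=4):
    """Value an executing instruction reads from the PC (two fetches ahead)."""
    return execute_addr + 2 * instr_size


print({stage: hex(addr) for stage, addr in pipeline_addresses(0x8008).items()})
print(hex(pc_seen_by(0x8000)))
```

This is why PC-relative addressing on ARM7 has to account for an offset of 8 bytes in ARM state: the instruction that reads the PC is two instructions behind the one being fetched.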
Operating Modes
• ARM7 has 7 operating modes, classified as privileged or non-privileged
o In non-privileged mode the processor cannot change the control bits of the CPSR
• Each mode has a different subset of registers (refer to register organization)
• Switching between modes requires saving / restoring register values
• Banked registers are accessible only in their respective modes
Operating Modes Continued….
Mode | Function
User | Normal programs and applications. The only non-privileged mode.
FIQ | Entered when the highest-priority interrupt occurs. Fast data transfer/processing mode.
IRQ | Entered when a normal interrupt occurs. General-purpose interrupt handling.
SVC | Entered when the CPU is reset or an SWI instruction is executed. Protected mode for the OS, default on startup or reset.
ABT | Entered when an illegal memory access occurs. Used to implement virtual memory and/or memory protection.
SYS | Privileged user mode for the OS (runs OS tasks).
UND | Entered when an unknown/illegal instruction is executed. Used for software emulation of hardware coprocessors.
[Figure: register sets of User mode (r0 to r12, r13_sp, r14_lr, r15_pc, CPSR) and IRQ mode, which has its own banked r13_irq, r14_irq and SPSR_irq]
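Register banking can be modelled as a lookup from (mode, register number) to a physical register. This is a sketch under stated assumptions: the slide's figure shows only the IRQ bank, so the FIQ bank of r8 to r14 and the sharing of User registers by System mode are filled in from general ARM7 knowledge, and the names `BANKED`, `physical_register` and `has_spsr` are hypothetical.

```python
# Register numbers that each mode replaces with its own banked copy.
BANKED = {
    "usr": (),
    "sys": (),                    # System mode shares the User registers
    "fiq": tuple(range(8, 15)),   # r8..r14 (assumption, not shown on the slide)
    "irq": (13, 14),
    "svc": (13, 14),
    "abt": (13, 14),
    "und": (13, 14),
}


def physical_register(mode, n):
    """Name of the physical register seen as r<n> in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED[mode]:
        return f"r{n}_{mode}"
    return f"r{n}"


def has_spsr(mode):
    """Only the exception modes have a saved program status register."""
    return mode not in ("usr", "sys")


all_registers = {physical_register(m, n) for m in BANKED for n in range(16)}
print(len(all_registers), "general-purpose registers")
print(sum(has_spsr(m) for m in BANKED) + 1, "status registers")
```

Counting the distinct names reproduces the figures quoted under register organization: 31 general-purpose registers, and 6 status registers (one CPSR plus five SPSRs).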
Exceptions
• Generated by external events or internal sources
• Seven types of exceptions
o Reset: Occurs when the ‘Reset’ pin is asserted (power-up/reset)
o Undefined: Occurs when the instruction currently being executed cannot be recognized
o SWI: Occurs when a program in user mode executes the SWI instruction to request OS services that are available in supervisor mode
o Prefetch Abort: Occurs if an instruction is fetched from an invalid address. The exception is generated at the execution stage.
o Data Abort: Occurs if a data load/store is attempted at an illegal address
o IRQ: Occurs if the IRQ pin goes low (only if the CPSR IRQ mask bit is 0)
o FIQ: Occurs if the FIQ pin goes low (only if the CPSR FIQ mask bit is 0)
Exception Handling
• An exception diverts execution to a particular memory location
• The Exception Vector Table holds, at each exception vector location, an instruction that branches the processor to the actual exception handler routine
• The addresses assigned to the 7 exceptions and one reserved entry run from 0x0000 0000 to 0x0000 001C at 4-byte intervals
• Response of the processor to exceptions (performed automatically):
o Copies CPSR to SPSR_<exception mode>
o Changes CPSR bits (enters ARM state, changes mode bits, disables interrupts)
o Saves the return address, i.e. (current PC–4), in LR_<exception mode>
o Places the vector address of the exception in PC
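The automatic response of the processor can be sketched as a function that takes the current CPSR and returns the state on entry to the handler. The vector addresses, mode bits, masking rules and link register offsets are the ones tabulated in this deck; the function name, the dictionary layout and the example values are assumptions made for illustration.

```python
I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5
MODE_MASK = 0x1F

# name: (vector, mode bits, banked suffix, FIQ also masked?, LR offset)
EXCEPTIONS = {
    "reset":          (0x00, 0b10011, "svc", True,  None),
    "undefined":      (0x04, 0b11011, "und", False, 4),
    "swi":            (0x08, 0b10011, "svc", False, 4),
    "prefetch_abort": (0x0C, 0b10111, "abt", False, 4),
    "data_abort":     (0x10, 0b10111, "abt", False, 8),
    "irq":            (0x18, 0b10010, "irq", False, 8),
    "fiq":            (0x1C, 0b10001, "fiq", True,  8),
}


def enter_exception(kind, cpsr, instr_addr):
    """Model exception entry.

    instr_addr is the address of the faulting instruction (aborts, SWI,
    undefined) or of the last executed instruction (IRQ, FIQ).
    """
    vector, mode_bits, suffix, mask_fiq, lr_offset = EXCEPTIONS[kind]
    new_cpsr = (cpsr & ~MODE_MASK) | mode_bits  # change mode bits
    new_cpsr &= ~T_BIT                          # enter ARM state
    new_cpsr |= I_BIT                           # disable IRQ
    if mask_fiq:
        new_cpsr |= F_BIT                       # disable FIQ (reset and FIQ only)
    return {
        # None stands for "unpredictable" after reset.
        "spsr_" + suffix: None if kind == "reset" else cpsr,
        "r14_" + suffix: None if lr_offset is None else instr_addr + lr_offset,
        "cpsr": new_cpsr,
        "pc": vector,
    }


state = enter_exception("irq", cpsr=0x60000010, instr_addr=0x8000)
print({name: hex(value) for name, value in state.items()})
```

In the example an IRQ arrives in User mode: the old CPSR is preserved in SPSR_irq, the mode bits become 10010 with IRQ masked, and execution continues at vector 0x18.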
• Execution branches to the exception handler routine, which then runs
The exception handler contains code to return to the previous mode at the correct location
Restores CPSR from SPSR_<exception mode>
Restores PC from LR_<exception mode>
Both are performed by a single instruction such as MOVS PC,LR
At entry/exit, registers are saved/restored using STMFD/LDMFD
The value in PC at the instant of the exception is not the same for all exceptions,
hence the return instruction differs from one exception to another
Response of Processor to Exceptions
What is the delay in execution introduced by 3-stage pipeline when an
IRQ interrupt occurs (interrupt latency)?
Ans.: 7 cycles
Exception: Reset (vector 0x00000000, priority 1)
Processor response to enter exception handler:
SPSR_svc = unpredictable
CPSR[4:0] = 10011B (SVC mode)
CPSR[5] = 0 (ARM state), CPSR[6] = 1 (Disable FIQ)
CPSR[7] = 1 (Disable IRQ)
r14_svc = unpredictable
PC = 0x00000000 (Exception vector)
Return instruction: none (no return)

Exception: Undefined (vector 0x00000004, priority 6)
Processor response to enter exception handler:
SPSR_und = CPSR
CPSR[4:0] = 11011B (Undefined mode)
CPSR[5] = 0 (ARM state), CPSR[6] unchanged
CPSR[7] = 1 (Disable IRQ)
r14_und = address of the instruction following the undefined instruction
PC = 0x00000004 (Exception vector)
Return instruction:
If registers are not saved in stack: MOVS pc,lr
If registers are saved in stack at entry using STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^
Exception: SWI (vector 0x00000008, priority 6)
Processor response to enter exception handler:
SPSR_svc = CPSR
CPSR[4:0] = 10011B (SVC mode)
CPSR[5] = 0 (ARM state)
CPSR[6] unchanged, CPSR[7] = 1 (Disable IRQ)
r14_svc = address of the instruction following the SWI
PC = 0x00000008 (Exception vector)
Return instruction:
If registers are not saved in stack: MOVS pc,lr
If registers are saved in stack at entry using STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^

Exception: Prefetch Abort (vector 0x0000000C, priority 5)
Processor response to enter exception handler:
SPSR_abt = CPSR
CPSR[4:0] = 10111B (Abort mode)
CPSR[5] = 0 (ARM state)
CPSR[6] unchanged, CPSR[7] = 1 (Disable IRQ)
r14_abt = aborted instruction address + 4
PC = 0x0000000C (Exception vector)
Return instruction:
If registers are not saved in stack: SUBS pc,lr,#4
If registers are saved in stack at entry using SUB lr,lr,#4 followed by STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^
Exception: Data Abort (vector 0x00000010, priority 2)
Processor response to enter exception handler:
SPSR_abt = CPSR
CPSR[4:0] = 10111B (Abort mode)
CPSR[5] = 0 (ARM state), CPSR[6] unchanged
CPSR[7] = 1 (Disable IRQ)
r14_abt = aborted instruction address + 8
PC = 0x00000010 (Exception vector)
Return instruction:
If registers are not saved in stack: SUBS pc,lr,#8
If registers are saved in stack at entry using SUB lr,lr,#8 followed by STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^

Reserved (vector 0x00000014): response, return instruction and priority are reserved
Exception: IRQ (vector 0x00000018, priority 4)
Processor response to enter exception handler:
SPSR_irq = CPSR
CPSR[4:0] = 10010B (IRQ mode), CPSR[5] = 0 (ARM state)
CPSR[6] unchanged, CPSR[7] = 1 (Disable IRQ)
r14_irq = last executed instruction address + 8
PC = 0x00000018 (Exception vector)
Return instruction:
If registers are not saved in stack: SUBS pc,lr,#4
If registers are saved in stack at entry using SUB lr,lr,#4 followed by STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^

Exception: FIQ (vector 0x0000001C, priority 3)
Processor response to enter exception handler:
SPSR_fiq = CPSR
CPSR[4:0] = 10001B (FIQ mode), CPSR[5] = 0 (ARM state)
CPSR[6] = 1 (Disable FIQ), CPSR[7] = 1 (Disable IRQ)
r14_fiq = last executed instruction address + 8
PC = 0x0000001C (Exception vector)
Return instruction:
If registers are not saved in stack: SUBS pc,lr,#4
If registers are saved in stack at entry using SUB lr,lr,#4 followed by STMFD sp!,{<reglist>,lr}: LDMFD sp!,{<reglist>,pc}^
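The different return instructions exist because each exception leaves a different offset in the link register. The sketch below collects the priorities and the amounts subtracted from LR as listed in the tables of this section; the names `EXCEPTION_RETURN`, `return_address` and `by_priority` are assumptions for illustration.

```python
# name: (priority, amount subtracted from LR by the return instruction)
EXCEPTION_RETURN = {
    "reset":          (1, None),  # no return
    "data_abort":     (2, 8),     # SUBS pc,lr,#8
    "fiq":            (3, 4),     # SUBS pc,lr,#4
    "irq":            (4, 4),     # SUBS pc,lr,#4
    "prefetch_abort": (5, 4),     # SUBS pc,lr,#4
    "undefined":      (6, 0),     # MOVS pc,lr
    "swi":            (6, 0),     # MOVS pc,lr
}


def return_address(kind, lr):
    """Address at which execution resumes after the handler returns."""
    _, offset = EXCEPTION_RETURN[kind]
    if offset is None:
        raise ValueError("reset does not return")
    return lr - offset


def by_priority(pending):
    """Order simultaneously pending exceptions, highest priority first."""
    return sorted(pending, key=lambda kind: EXCEPTION_RETURN[kind][0])


# A data abort at 0x8000 leaves LR = 0x8008; returning re-executes 0x8000.
print(hex(return_address("data_abort", 0x8008)))
print(by_priority(["irq", "swi", "data_abort", "fiq"]))
```

After a data abort or prefetch abort the faulting instruction itself is retried, whereas after an SWI, IRQ or FIQ execution resumes at the instruction that follows.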
Bus Architecture
• Advanced Microcontroller Bus Architecture (AMBA)
o Bus system that connects memory, controllers and peripherals to the ARM core in an ARM processor based microcontroller
o AMBA is a bus protocol standard, adopted as the on-chip bus by many microcontrollers
o ARM core is bus master, peripherals are slaves
o 3 buses within the AMBA spec: AHB, ASB, APB
▪ AHB (Advanced High-performance Bus):
Provides high bandwidth
Supports multiple masters and slaves (examples of masters: DMA, test interface, DSP; example of slaves: external memory)
Includes bus arbiter and decoder
Used in complex and more sophisticated systems
▪ ASB (Advanced System Bus):
AHB and ASB have many things in common
Both support bursting, pipelining, split transactions
ASB is used in simple, cost-effective designs
▪ APB (Advanced Peripheral Bus):
Simple, low-speed, low-power bus for peripherals such as UARTs
Implemented with a simple tri-stated data bus
AHB-APB bridge: buffers data and operations between the two buses
Debug Architecture
• Uses boundary scan technology and JTAG to access core
• JTAG can be used in programming, debugging, testing
• ARM7-TDMI is JTAG enabled
[Figure: JTAG boundary scan architecture. A Test Access Port (TAP) controller with TDI, TCK, TMS, nTRST and TDO pins drives the instruction register, BYPASS register, ID register and other registers; boundary scan cells form a boundary scan path between the I/O pads and the core]
[Figure: scan cell (input cell) with Din (parallel-in, from device pin), Dout (parallel-out, to core logic), Serial-in (from previous cell), Serial-out (to next cell), and Shift/Load, Clock, Update and Mode control signals]
[Figure: scan cell logic built from a capture/scan flip-flop, a hold cell flip-flop and two multiplexers; Mode = 0 for functional mode, Mode = 1 for test mode]
• Boundary scan cells – read/write pins without physically accessing them
Capture: a parallel load captures the signals on the input pins into the input cells and the core outputs into the output cells
Update: a parallel unload passes the values in the output cells to the output pins and the values in the input cells to the core
Serial shift: the scan-out of each scan cell is passed to the scan-in of the next cell
Normal mode: Din goes to Dout for normal operation, bypassing the scan cell
• Scan chains – scan cells linked to form a chain around ARM macrocells
TAP provides the JTAG signals that control boundary scan operation
Data can be shifted from TDI to TDO, collected and probed
• ARM7-TDMI debug – 3 scan chains around the core and EmbeddedICE
‘D’ and ‘I’ stand for Debug and In-Circuit Emulator
Debug system: host, protocol converter, debug target
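The capture, serial shift and update operations of a scan chain can be imitated in software. This is a simplified behavioural sketch, not a model of the TAP state machine; the class names `ScanCell` and `ScanChain` and the bit patterns are assumptions made for illustration.

```python
class ScanCell:
    """One boundary scan cell: a shift flip-flop plus a hold flip-flop."""

    def __init__(self):
        self.scan = 0  # capture/shift stage
        self.hold = 0  # update (hold) stage

    def dout(self, din, mode):
        # mode = 0: functional mode, Din passes straight through.
        # mode = 1: test mode, the held value drives Dout.
        return self.hold if mode else din


class ScanChain:
    def __init__(self, length):
        self.cells = [ScanCell() for _ in range(length)]

    def capture(self, pins):
        """Parallel load: copy the pin values into the shift stages."""
        for cell, bit in zip(self.cells, pins):
            cell.scan = bit

    def shift(self, tdi):
        """Move every bit one cell towards TDO and return the bit shifted out."""
        tdo = self.cells[-1].scan
        for i in range(len(self.cells) - 1, 0, -1):
            self.cells[i].scan = self.cells[i - 1].scan
        self.cells[0].scan = tdi
        return tdo

    def update(self):
        """Parallel unload: copy the shift stages into the hold stages."""
        for cell in self.cells:
            cell.hold = cell.scan

    def outputs(self, pins, mode):
        return [cell.dout(bit, mode) for cell, bit in zip(self.cells, pins)]


pins = [1, 0, 1, 1]
chain = ScanChain(len(pins))
chain.capture(pins)
read_back = [chain.shift(bit) for bit in [0, 1, 1, 0]]  # read pins, load new data
chain.update()
print(read_back, chain.outputs(pins, mode=1), chain.outputs(pins, mode=0))
```

A single pass of shifts both reads out the captured pin values at TDO (last cell first) and loads a new pattern from TDI, which the update step then applies in test mode while functional mode still sees the real pins.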
• Host: PC running an ARM debugger; allows setting breakpoints/watchpoints and examining the contents of registers from the host
• EmbeddedICE forces the ARM processor into debug state, in which the processor is isolated from the rest of the system
• The Debug Comm Channel allows the ARM core to communicate with the host while the program is running, even without entering debug state
Interface Signals
• Clocks & Timing signals
• Processor mode signals
• Memory interface signals
• Bus control signals
• Interrupts
• Memory management signals
• Coprocessor interface signals
• Debug interface signals
[Figure: ARM7-TDMI interface signals, grouped as clocks and timing, interrupts, bus controls, power, processor mode, processor state, memory interface, memory management interface, coprocessor interface, debug, boundary scan, and boundary scan control signals]
Signals shown: TDO, IR[3:0], TCK1, SCREG[3:0], nM[4:0], A[31:0], TAPSM[3:0], nTDOEN, CPB, TBIT, BL[3:0], ABORT, nOPC, CPA, DIN[31:0], nMREQ, SEQ, MAS[1:0], LOCK, D[31:0], nRW, nTRANS, nCPI, DOUT[31:0], TCK2, VDD, COMMRX, DBGRQI, ABE, BIGEND, TBE, BUSDIS, RANGEOUT0, BREAKPT, RANGEOUT1, DBGACK, DBGEN, VSS, nENOUTI, nENOUT, nEXEC, nENIN, nHIGHZ, ECAPCLK, ALE, HIGHZ, BUSEN, nRESET, EXTERN0, DBGRQ, EXTERN1, COMMTX, DBE, APE, TMS, TDI, nTRST, TCK, ECLK, ISYNC, MCLK, nWAIT, nIRQ, nFIQ
Review Questions
1. What are the sizes of the address and data buses in the ARM7 processor?
2. What is the size of memory space ARM7 processor can address?
3. List the features of ARM processors.
4. What does ‘TDMI-S’ in ARM7-TDMI-S refer to?
5. What are the special functions of r13, r14 and r15 registers?
6. What are special features of multiplier block in ARM7 processor?
7. What are CPSR and SPSR?
8. What are the functions of control bits of program status register?
9. What is the purpose and feature of barrel shifter?
10. Describe the internal architecture of ARM7 processor
11. List modes of operation of ARM7 processor.
12. What is the width of half-word size data?
13. Show schematically how the following data are saved in memory
starting from memory address 0x0000 0000 in big endian and
little endian byte order.
Data: 0xEF, 0x1234, 0xAB, 0x6789ABCD.
14. Illustrate ARM7 processor 3-stage pipelining.
15. What is interrupt latency?
16. Illustrate breaking and stalling of the pipeline by branch and
Load/Store instructions with suitable examples.
17. List ARM modes and the events that cause the processor to enter
into the corresponding mode.
18. Which one of the ARM modes is non-privileged mode?
19. How are the privileged and non-privileged modes distinguished?
20. What are the seven types of exceptions? List the events that
generate the exceptions.
21. How does the ARM7 processor handle exceptions? Explain the
response of the processor to exceptions.
22. What is exception vector table? What does it hold?
23. Give examples of return instructions used in exception routines.
24. Write down the ARM exceptions in the order of their priority.
25. What is the delay in execution introduced by 3-stage pipeline when
an IRQ interrupt occurs?
26. Discuss a typical ARM bus architecture implemented in an ARM7
processor based microcontroller.
27. What are the AHB and APB buses and their characteristics?
28. What are technologies used in ARM debug architecture?
N. Mathivanan

ARM Processors

  • 1.
    ARM Processors Dr. N.Mathivanan Visiting Professor Department of Instrumentation and Control Engineering National Institute of Technology, TRICHY, TAMILNADU INDIA N. Mathivanan
  • 2.
    Topics • Features ofARM processors • ARM architecture variants and processor families • ARM7-TDMI internal architecture • Register organization • Pipelining • Operating modes • Exception handling • ARM bus architecture • Debug architecture • Interface signals N. Mathivanan
  • 3.
    ARM Processors • AdvancedRISC Machine - ARM Ltd. not a manufacturing Co., provides license to manufacturers Used in high end applications involving complex computation Hand held device, Robotic, Automation system, Consumer electronics • Features High performance, low power, small in size (ideal for embedded sys) Large Register File, Small instruction set, Load-Store instructions, Fixed length instructions, Conditional execution of instructions, High code density, most instructions executable in single cycle, 32-bit in-line barrel shifter, built-in circuit for hardware debugging, DSP enhanced instructions, Jazelle (Java byte code extn. 3rd state), TrustZone (SoC approach to security) N. Mathivanan
  • 4.
    ARM Architecture Variants(core), Processor Families • Each family has its own instruction set, mem management, etc. Architecture version Processor Families Processor Features Microcontroller ARM v4T ARM7TDMI (1995) ARM720T ARM740T Von Neumann, 3-stage pipeline LPC2100 series ARM9TDMI ARM920T ARM922T ARM942T MMU, Harvard, 5-stage pipeline SAM9G, LPC29xx, LPC3xxx, STR9 ARM v5TE, ARM v5TEJ ARM9E (1997) ARM926EJ-S, MMU, DSP, Jazelle, SAM9XE ARM946E-S, MPU, DSP ARM966HS MPU (optional), DSP ARM10E (1999) ARM1020E MMU, DSP ARM1026EJ-S MMU/MPU, DSP, Jazelle ARM v6 ARM11 (2003) ARM1136J(F)-S MMU, TrustZone, DSP, Jazelle MSM7000, i.MX3x ARM1156T2(F)-S MPU, DSP ARM1176JZ(F)-S, MMU, TrustZone, DSP, Jazelle BCM2835 ARM11 MP core MMU, Multiprocessor cache support, DSP, JazelleN. Mathivanan
  • 5.
    Architecture version Processor Families Processor Features Microcontroller ARMv6-M Cortex Cortex-M0 NVIC LPC1200, 1100 series STM32F0x0, x1, x2 Cortex-M1 FPGA TCM Interface, NVIC STM32F1, F2, L1, W ARM v7-M Cortex Cortex-M3 MPU (optional), NVIC ST32F512-M, LPC1300, 1700, 1800 ARM v7-R Cortex Cortex-R4 MPU, DSP STA1095, SAM4L, SAM4N, SAM4S Cortex-R4F MPU, DSP, Floating Point SAM4C, SAM4E, LPC40xx, 43xx, STM32 F3, F4 ARM v7-A Cortex Cortex-A8 MMU, Trust Zone, DSP, Jazelle, Neon, Floating Point Freescale i.MX5X Cortex-A9 MMU, Trust Zone, Multiprocessor, DSP, Jazelle, Neon, Floating Point Freescale i.MX6QP N. Mathivanan
  • 6.
    ARM Nomenclature • AR M x y z T D M I E J F S (Example: ARM7-TDMI-S) x – Series y – MMU z – Cache T – Thumb D – Debugger M – Multiplier I – Embedded In-Circuit Emulator (ICE) macrocell E – Enhanced Instructions for DSP J – JAVA acceleration by Jazelle F – Floating-point S – Synthesizable version N. Mathivanan
  • 7.
    ARM7-TDMI – InternalArchitecture • Von Neumann architecture • Data bus – 32-bit • Address bus – 32-bit Addressable memory space – 4 GB • Register bank – (31+6) 32-bit regs. • In-line barrel shifter • Multiplier • ALU • Incrementer • Address register • Instruction decoder & control logic Address Register Register Bank (31 x 32-bit registers) (6 status registers) Address Incrementer 32 x 8 Data Bus D[31:0] A[31:0] BBus Data In Register PCBus ALUBus ALU LogicControl Barrel Shifter InstructionDecoderand Data Out Register Address Bus IncrementerBus ABus multiplier N. Mathivanan
  • 8.
    Register Organization 31 gen.purpose registers. Only 16 regs accessible 15 registers are hidden Named as r0 – r15 r13,r14, r15 are SP, LR, PC 8-bit/16-bit/32-bit data can be read/write 6 status registers Only 1 is accessible Named as CPSR, SPSR Contains flags, control bits Register bank has 2 read and 1 write port and 1 read and 1 write port for PC N. Mathivanan
  • 9.
    • Bit definitionsof Program Status registers • Barrel shifter Combinational logic circuit Shifts left/right any no. of bits position in one cycle Preprocess one of data from source reg. before passed to ALU • Multiplier 32-bit x 8-bit with early termination, Booth Algorithm 32-bit x 32-bit in 5 cycles Non M type multiplies in 32x2-bit and for 32-bitx32-bit - 17 cycles M0M1M2CN I B31 B23 V B15 B8 F B0 M3Z T B16 B7 Thumb State Flag M4 B24 Zero Overflow FIQ Interrupt Mask Mode Carry Negative IRQ Interrupt Mask N. Mathivanan
  • 10.
    • ALU Connected toregister bank using A-bus and B-bus ALU and barrel shifter operations take place in same cycle Result of ALU operation goes back to register bank thro’ ALU bus • Address register Holds the address of next instruction to be fetched • Instruction decoder and control logic Enables interfacing peripherals to processors Has Thumb decompressor – Decompresses 16-bit Thumb code to 32-bit ARM code N. Mathivanan
  • 11.
    • Data Types oWord – 32-bit, Halfword – 16-bit, Byte – 8-bit o Memory is byte addressable, can hold 232 bytes (= 4 GB) o Word/ halfword /byte size data are placed at word/ halfword/ byte aligned addresses. o 32-bit ARM instructions are placed at word aligned addresses • Byte order – Endian format o Word/halfword size data can be saved/retrieved in big endian or little endian format. o Big endian: MSB of word/halfword data are stored in lowest address and the data is addressed by address of MSB o Little endian: LSB of word/halfword data are stored in lowest address and the data is addressed by address of LSB N. Mathivanan
  • 12.
    Pipelining • Executes instructionsin Fetch, Decode, Execute cycles – 3 stages • Pipelining increases flow of instructions to processor • Enables processor and memory to work continuously • Performs Fetch, Decode, Execute operations simultaneously Example: • In T1 cycle, fetch 1st instruction • In T2, fetch 2nd, decode 1st instructions • In T3, fetch 3rd, decode 2nd, execute 1st instructions. • In each cycle one instruction is executed Instruction Fetch Register E X E C U T E ARM Decode F E T C H Read ALU Write Register Shift Thumb - ARM D E C O D E Register SelectDecompression N. Mathivanan
  • 13.
    Pipelining Continued…… • PCpoints to memory address of instruction currently fetched • Address of instruction currently decoded is PC-4 • Address of instruction currently executed is PC-8 • Branch instruction break pipeline, LDR/STR instruction stall pipeline • Solution: Harvard architect., adv. processors have deeper pipelining, • ARM9 – 5 stage, ARM10 – 6 stage, ARM11 – 8 stage E X E C U T E M E M O R Y W R I T E Write Reg. Instruction Fetch Reg. Read Shift + ALUDecompression D E C O D E RegisterSelect ARMDecode F E T C H Thumb- ARM MemoryAccess N. Mathivanan
  • 14.
    Operating Modes • ARM7has 7 modes, classified into privileged and non-privileged o In non-privileged mode processor can’t change control bits of CPSR • Each mode has different subset of registers. (refer to reg. organization) • Switching between the modes requires saving / retrieving of register values. • Banked registers accessible in their respective modes N. Mathivanan
  • 15.
    Operating Modes Continued…. ModeFunction User Normal programs and applications. Only non-privileged mode. FIQ Enters when highest priority interrupt occurs. Fast data transfer/processing mode. IRQ Enters when normal interrupt occurs. General-purpose interrupt handling. SVC Enters when CPU is reset or SVC is executed. Protected mode for OS, default on startup or reset. ABT Enters when illegal memory accesses occurs. Implements virtual memory and/or memory protection SYS Privileged user mode for OS (runs OS tasks) UND Enters when unknown/illegal instruction executed. Software emulation of hardware coprocessors r5 r4 r6 r2 r1 r0 r3 r7 r8 r14_lr CPSR r11 r13_sp r9 IRQ r15_pc r12 r10 User Mode r14_irq SPSR_irq r13_irq N. Mathivanan
  • 16.
    Exceptions • Generated byexternal events or internal sources • Seven types of exceptions o Reset: Occurs when ‘Reset’ pin is asserted – power-up/reset o Undefined: Occurs when currently executing instruction could not be recognized o SWI: Occurs if program in user mode executes SWI instruction to request OS services that are available in supervisor mode. o Prefetch Abort: Occurs if instruction fetched from invalid address. Exception is generated at execution stage. o Data Abort: Occurs if data load/store attempt at illegal address o IRQ: Occurs if IRQ pin goes low (only if CPSR IRQ mask bit is 0) o FIQ: Occurs if FIQ pin goes low (only if CPSR FIQ mask bit is 0) N. Mathivanan
  • 17.
    • Exception causesdiversion of execution to a particular mem location • Exception Vector Table holds instructions at exception vector locations to branch processor to actual exception handler routine. • The addresses assigned for the 7 exceptions and one reserved exception are from 0x0000 0000 to 0x0000 001C at 4 bytes interval. • Response of the processor to exceptions: (automatically performed) o Copies CPSR to SPSRexception o Changes CPSR bits (enable ARM state, change mode bits, disable interrupts) o Saves return address i.e. (current PC–4) in LRexception o Places vector address of exception mode in PC. Exception Handling N. Mathivanan
  • 18.
    • Execution branchesto exception handler routine and executes Exception handler has codes to return back to previous mode at correct loc Restores CPSR from SPSRexception Restores PC from LRexception Above two are performed using instruction like (MOVS PC,LR) At the entry/exit registers are stored/retrieved using STMFD/LDRFD Value in PC at the instant of exception is not same for all exceptions: Hence instruction to return back is different for different exceptions N. Mathivanan
  • 19.
    Response of Processorto Exceptions What is the delay in execution introduced by 3-stage pipeline when an IRQ interrupt occurs (interrupt latency)? Ans.: 7 cycles N. Mathivanan
  • 20.
    Except Exception Vector Processor Response toEnter Exception Handler Return Instruction Pri Reset 0x00000000 SPSR_svc = unexpected CPSR[4:0]= 10011B (SVC mode) CPSR[5] = 0 (ARM state), CPSR[6] = 1 (Disable FIQ) CPSR[7] = 1 (Disable IRQ) r14_svc = unexpected PC = 0x00000000 (Exception vector) No return 1 Un defined 0x00000004 SPSR_und = CPSR CPSR[4:0]= 11011B (undefined mode) CPSR[5] = 0 (ARM state), CPSR[6] unchanged CPSR[7] = 1 (Disable IRQ) r14_und = addr of next to undefined instruction PC = 0x00000004 (Exception vector) If registers are not saved in stack MOVS pc,lr 6 Registers are saved in stack at entry using: STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ N. Mathivanan
  • 21.
    Except Exception Vector Processor Response toEnter Exception Handler Return Instruction Pri SWI 0x00000008 SPSR_svc = CPSR CPSR[4:0]= 10011B (SVC mode), CPSR[5] = 0 (ARM state) CPSR[6] unchanged, CPSR[7] = 1 (Disable IRQ) r14_swi = addr of next instruction to SWI PC = 0x00000008 ;Exception vector If registers are not saved in stack MOVS pc,lr 6 Registers are saved in stack at entry using: STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ Prefetch Abort 0x0000000C SPSR_abt = CPSR CPSR[4:0]= 10111B (Abort mode), CPSR[5] = 0 (ARM state) CPSR[6] unchanged, CPSR[7] = 1 (Disable IRQ) r14_abt = aborted instruction addr.+4 PC = 0x0000000C ;Exception vector If registers are not saved in stack SUBS pc,lr,#4 5 Registers are saved in stack at entry using: SUB lr,lr,#4 STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ N. Mathivanan
  • 22.
    Exception Exception Vector Processor Responseto Enter Exception Handler Return Instruction Pri Data Abort 0x00000010 SPSR_abt = CPSR CPSR[4:0]= 10111B (Abort mode), CPSR[5] = 0 (ARM state) , CPSR[6] unchanged CPSR[7] = 1 (Disable IRQ) r14_abt = aborted instruction addr.+8 PC = 0x00000010 (Exception vector) If registers are not saved in stack SUBS pc,lr,#8 2 Registers are saved in stack at entry using: SUB lr,lr,#8 STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ Reserved 0x00000014 Reserved Reserved Reserved N. Mathivanan
  • 23.
    Except ion Exception Vector Processor Response toEnter Exception Handler Return Instruction Pri IRQ 0x00000018 SPSR_irq = CPSR CPSR[4:0]= 10010B (IRQ mode), CPSR[5] = 0 (ARM state) CPSR[6] = unchanged, CPSR[7] = 1 (Disable IRQ) r14_irq = last executed instruction addr.+8 PC = 0x00000018 ;Exception vector If registers are not saved in stack SUBS pc,lr,#4 4 Registers are saved in stack at entry using: SUB lr,lr,#4 STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ FIQ 0x0000001C SPSR_fiq = CPSR CPSR[4:0]= 10001B (FIQ mode), CPSR[5] = 0 (ARM state) CPSR[6] = 1 (Disable FIQ), CPSR[7] = 1 (Disable IRQ) r14_fiq=last executed instruction addr.+8 PC = 0x0000001C ;Exception vector If registers are not saved in stack SUBS pc,lr,#4 3 Registers are saved in stack at entry using: SUB lr,lr,#4 STMFD sp!,{<reglist>,lr} If saved: LDMFD sp!, {<reglist>,pc}^ N. Mathivanan
  • 24.
    Bus Architecture • AdvancedMicrocontroller Bus Architecture (AMBA) o Bus system connects memory, controllers and peripherals in ARM processor based microcontroller to ARM core o AMBA bus protocol std., adopted as on-chip bus by many mC o ARM core is bus master, peripherals are slaves o 3 buses within AMBA spec: AHP, ASP, APB  AHP (Advanced High-performance Bus): Provides high band-width. Supports multiple masters, slaves (e.g. of masters: DMA, Test interface, DSP, and e.g. of slaves: external memory). Includes bus arbiter, decoder Used in complex and more sophisticated systems N. Mathivanan
  • 25.
     ASB (AdvancedSystem Bus): AHB and ASB have many things in common Both support bursting, pipelining, split transaction ASB is used in simple cost effective designs  APB (Advanced Peripheral Bus): Simple, low speed, low power bus, for UART, .... peripherals Implemented with simple tri-stated data bus AHB-APB bridge: buffers data & operations between the two N. Mathivanan
  • 26.
    Debug Architecture
    • Uses boundary scan technology and JTAG to access the core
    • JTAG can be used in programming, debugging, testing
    • ARM7-TDMI is JTAG enabled
    [Figure: JTAG boundary scan structure. A Test Access Port (TAP) controller with TDI, TCK, TMS, nTRST and TDO signals, together with the Instruction, BYPASS, ID and other registers, drives a boundary scan path of scan cells placed between the I/O pads and the core. (a) Scan cell (input cell) with Din (parallel-in), Dout (parallel-out), serial-in, serial-out, Shift/Load, Clock, Update and Mode signals. (b) Scan cell logic: capture cell, hold cell and two multiplexers. (c) Mode = 0 for functional mode, Mode = 1 for test mode.]
  • 27.
    • Boundary scan cells: read and write pins without physically accessing them
      Capture: a parallel load captures the signal on each input pin into its input cell and each core output into its output cell
      Update: a parallel unload passes the values in the output cells to the output pins and the values in the input cells to the core
      Serial shift: the scan-out of each scan cell is passed to the scan-in of the next cell
      Normal mode: Din goes to Dout for normal operation, bypassing the scan cell
    • Scan chains: scan cells linked to form a chain around ARM macrocells
      TAP provides the JTAG signals that control boundary scan operation
      Data can be shifted around from TDI to TDO, collected and probed
    • ARM7-TDMI debug: 3 scan chains around the core and EmbeddedICE
      'D' and 'I' stand for Debug and In-Circuit Emulator
      Debug system: host, protocol converter, debug target
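The capture, serial shift and update operations can be tried out in a small behavioural model. This is a sketch under simplifying assumptions: the classes `ScanCell` and `ScanChain` are hypothetical names, each cell holds one bit, and the TAP state machine and clocking are not modelled.

```python
class ScanCell:
    """One boundary scan cell: a capture/shift stage plus a hold stage."""

    def __init__(self):
        self.shift_reg = 0      # capture/shift flip-flop
        self.hold = 0           # hold (update) latch
        self.test_mode = False  # Mode: False = functional, True = test

    def capture(self, din):
        self.shift_reg = din & 1

    def update(self):
        self.hold = self.shift_reg

    def dout(self, din):
        # Functional mode passes Din straight through; test mode drives
        # the value held in the update latch.
        return self.hold if self.test_mode else (din & 1)


class ScanChain:
    """Scan cells linked serially between TDI and TDO."""

    def __init__(self, length):
        self.cells = [ScanCell() for _ in range(length)]

    def capture(self, pins):
        for cell, bit in zip(self.cells, pins):
            cell.capture(bit)

    def shift(self, tdi):
        """One shift step: TDI enters cell 0, the last cell's bit leaves on TDO."""
        tdo = self.cells[-1].shift_reg
        for i in range(len(self.cells) - 1, 0, -1):
            self.cells[i].shift_reg = self.cells[i - 1].shift_reg
        self.cells[0].shift_reg = tdi & 1
        return tdo

    def update(self):
        for cell in self.cells:
            cell.update()


chain = ScanChain(4)
chain.capture([1, 0, 1, 1])                       # sample the pins
read_out = [chain.shift(b) for b in (0, 1, 1, 0)] # shift out while shifting in
chain.update()                                    # latch the new pattern
print(read_out, [c.hold for c in chain.cells])
```

Shifting as many times as there are cells reads the captured pin values out on TDO while loading a new test pattern from TDI, which is how pins are observed and driven without direct physical access.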
  • 28.
    • Host: PC running an ARM debugger; allows setting breakpoints and watchpoints and examining register contents from the host
    • EmbeddedICE forces the ARM processor into debug state, in which the processor is isolated from the rest of the system
    • Debug Communications Channel allows the ARM core to communicate with the host while the program is running, without entering debug state
  • 29.
    Interface Signals
    • Clocks & Timing signals
    • Processor mode signals
    • Memory interface signals
    • Bus control signals
    • Interrupts
    • Memory management signals
    • Coprocessor interface signals
    • Debug interface signals
    [Figure: ARM7-TDMI interface signal diagram. Signal groups shown: clocks and timing, interrupts, bus controls, power, debug, boundary scan and boundary scan control signals, processor mode, processor state, memory interface, memory management interface, coprocessor interface. Signals labelled in the figure include MCLK, nWAIT, ECLK, nIRQ, nFIQ, ISYNC, nRESET, BUSEN, HIGHZ, nHIGHZ, BIGEND, nENIN, nENOUT, ABE, APE, ALE, DBE, TBE, BUSDIS, ECAPCLK, VDD, VSS, DBGRQ, BREAKPT, DBGACK, nEXEC, EXTERN0, EXTERN1, DBGEN, RANGEOUT0, RANGEOUT1, DBGRQI, COMMRX, COMMTX, TCK, TMS, TDI, nTRST, TDO, TAPSM[3:0], IR[3:0], nTDOEN, TCK1, TCK2, SCREG[3:0], nM[4:0], TBIT, A[31:0], DOUT[31:0], D[31:0], DIN[31:0], nMREQ, SEQ, nRW, MAS[1:0], BL[3:0], LOCK, nTRANS, ABORT, nOPC, CPA, CPB.]
  • 30.
    Review Questions
    1. What is the size of the address and data buses in the ARM7 processor?
    2. What is the size of the memory space the ARM7 processor can address?
    3. List the features of ARM processors.
    4. What does 'TDMI-S' in ARM7-TDMI-S refer to?
    5. What are the special functions of the r13, r14 and r15 registers?
    6. What are the special features of the multiplier block in the ARM7 processor?
    7. What are CPSR and SPSR?
    8. What are the functions of the control bits of the program status register?
    9. What is the purpose and feature of the barrel shifter?
    10. Describe the internal architecture of the ARM7 processor.
    11. List the modes of operation of the ARM7 processor.
    12. What is the width of half-word size data?
  • 31.
    13. Show schematically how the following data are saved in memory starting from memory address 0x0000 0000 in big endian and little endian byte order. Data: 0xEF, 0x1234, 0xAB, 0x6789ABCD.
    14. Illustrate ARM7 processor 3-stage pipelining.
    15. What is interrupt latency?
    16. Illustrate breaking and stalling of the pipeline by branch and Load/Store instructions with suitable examples.
    17. List the ARM modes and the events that cause the processor to enter each mode.
    18. Which one of the ARM modes is the non-privileged mode?
    19. How are the privileged and non-privileged modes distinguished?
    20. What are the seven types of exceptions? List the events that generate the exceptions.
  • 32.
    21. How does the ARM7 processor handle exceptions? Explain the response of the processor to exceptions.
    22. What is the exception vector table? What does it hold?
    23. Give examples of return instructions used in exception routines.
    24. Write down the ARM exceptions in the order of their priority.
    25. What is the delay in execution introduced by the 3-stage pipeline when an IRQ interrupt occurs?
    26. Discuss the typical ARM bus architecture implemented in an ARM7 processor based microcontroller.
    27. What are the AHB and APB buses and their characteristics?
    28. What technologies are used in the ARM debug architecture?