Introduction to Digital Signal Processors
Dr. S. Periyanayagi
Professor & Head/ECE
Ramco Institute of Technology
DSP Processors
• A digital signal processor (DSP) is a specialized
microprocessor targeted at digital signal processing.
• DSP processors are designed to meet the needs of specific
digital signal processing applications.
• Advanced microprocessors: RISC (Reduced
Instruction Set Computer) processors and CISC (Complex
Instruction Set Computer) processors.
• For real-time signal processing, DSP processors are
rated best among the programmable processors.
Salient features
• For efficient performance of DSP operations:
– Multiplier and multiplier-accumulator (MAC)
– Modified bus structure and memory access schemes
– Multiple-access memory
– Very long instruction word (VLIW) architecture
– Pipelining
– Special addressing modes
– On-chip peripherals
Categories of DSPs
• General-purpose digital signal processors
– Fixed-point processors:
TMS320C5x, TMS320C54x and Motorola
DSP563x, DSP56156/166 (16 bit)
– Floating-point processors:
TMS320C4x, TMS320C67xx, Motorola
DSP96002 (32 bit)
• Special-purpose processors
– Hardware designed for specific DSP algorithms, e.g.
FFT
– Hardware designed for specific applications, e.g. PCM
• Analog Devices
– ADSP-2100 family (16-bit fixed point)
– ADSP-21020 (32-bit floating point)
– ADSP-2106x (32-bit floating point)
• Texas Instruments
– TMS320C1x (16-bit fixed point)
– TMS320C2x (16-bit fixed point)
– TMS320C3x (32-bit floating point)
– TMS320C4x (32-bit floating point)
– TMS320C5x (16-bit fixed point)
– TMS320C67x (32-bit floating point)
USE of DSP
Evolution of DSP
• The TMS320 family of processors includes four
basic types of processors.
– Fixed-point processors: low-power, low-cost
devices that operate at high speed.
– 32-bit floating-point processors: large
dynamic range, wider instruction word and
more addressing modes.
– VLIW architecture processors: execute several
instructions in parallel using multiple
execution units.
– Multiprocessor DSPs: provide parallel
processing.
Architecture of Microprocessor
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
Basic General Hardware Architecture for Signal
Processing
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
Techniques in DSP Processor
• Harvard architecture
• Pipelining
• Fast, dedicated hardware multiplier/
accumulator
• Special instruction dedicated to DSP
• Replication
• On chip memory/Cache
• Extended parallelism – SIMD, VLIW and
static superscalar processing
Special Features of Digital Signal
Processing
• Fast Data Access
• Fast Computation
• Numerical fidelity
• Fast execution control
Fast Data Access
• High-bandwidth Memory Architectures
 Von Neumann Architecture
 Harvard Architecture
 Modified Harvard Architecture
 Architecture of Advanced digital Signal
processors.
• Specialized Addressing Modes
 Circular Addressing
 Bit reversed addressing
• Direct Memory Access (DMA)
Fast Computation
• MAC (Multiply/Accumulate) Unit
• Pipelining of Instruction Execution
 Phase 1 : Fetch the opcode (or instruction code)
from program memory.
 Phase 2 : Decode the instruction code.
 Phase 3 : Read the operands (or data) from
data/program memory.
 Phase 4 : Execute the task specified by the
instruction and store the result.
Von Neumann Architecture
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
• MAC operation with data move (MACD
instruction) requires 4 memory accesses per
instruction cycle:
– Fetch the MACD instruction from the program
memory
– Fetch one of the operands from the program
memory
– Fetch the second operand from the data
memory
– Write the content of the data memory location with
address DMA into the location with address
DMA+1
• In a Von Neumann architecture, with a single bus to memory, these accesses take 4 clock cycles
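The four memory accesses of a MACD-style step can be sketched in Python. This is a simplified model, not the actual TMS320 instruction semantics: the function name `macd_fir_output`, the use of plain lists for program and data memory, and the spare slot at the end of the delay line are all assumptions made for illustration.

```python
def macd_fir_output(coeffs, delay_line, new_sample):
    """Compute one FIR output with a MACD-style loop (simplified model).

    coeffs     : filter coefficients, standing in for program memory
    delay_line : past input samples, standing in for data memory;
                 must hold len(coeffs) + 1 entries (one spare slot)
    Returns (output, memory_accesses).
    """
    n = len(coeffs)
    if len(delay_line) != n + 1:
        raise ValueError("delay_line needs len(coeffs) + 1 entries")
    delay_line[0] = new_sample
    acc = 0
    accesses = 0
    # Walk from the oldest sample to the newest so the data move
    # (address -> address + 1) never overwrites a sample still needed.
    for k in range(n - 1, -1, -1):
        accesses += 1            # 1: fetch the MACD instruction
        c = coeffs[k]
        accesses += 1            # 2: operand from program memory
        x = delay_line[k]
        accesses += 1            # 3: operand from data memory
        acc += c * x             # multiply and accumulate
        delay_line[k + 1] = x
        accesses += 1            # 4: data move to address + 1
    return acc, accesses


coeffs = [1, 2, 3]
delay = [0, 0, 0, 0]
outputs = [macd_fir_output(coeffs, delay, s)[0] for s in (1, 0, 0, 0)]
print(outputs)
```

With a single memory bus each of the four accesses needs its own cycle, which is where the 4-cycle figure comes from; separate program and data buses let some of them overlap.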
Von Neumann Architecture
• Consist of three buses
 Data bus
 Address bus
 Control bus
Non-Harvard architecture with single
memory space
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
Types of instruction
• Instruction Fetch
• Instruction decode
• Instruction execute
Class Poll
• A Microprocessor and DSP processor
differ in
a. Speed of operation
b. Real time signal processing
c. Multiple Busses
d. All the above
Harvard Architecture
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
Basic Harvard Architecture
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
Instruction overlap made in Harvard
architecture
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
• The number of clock cycles is reduced by
using two separate buses for the program
and data memory.
• The contents of program and data memory can
be accessed in parallel.
• Instruction code can be fed from the
program memory to the control unit while
the operand is fed to the processing unit
from the data memory.
• The processing unit consists of
– Registers
– Processing elements: MAC units, multiplier,
ALU, shifter
• The number of memory accesses per clock
cycle can be increased by using more
buses
– Motorola DSP5600x, DSP96002: three
separate buses
– TMS320C54x: 4 address buses
• The cost of the IC increases with the number of pins
in the IC
• Extending the number of buses outside the chip unduly
increases the price
• P-DSPs therefore use multiple buses only for connecting
on-chip memory to the control unit and
data path.
Modified Harvard Architecture
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
• One set of buses accesses both program and
data memory
• The other set accesses data memory alone
• Data can be transferred from one memory
to another
• Used in DSPs from Texas Instruments and Analog Devices
Special Purpose DSPs Examples
FFT processor
PDSP 16515A,TM-44,TM-66
Programmable FIR filter
UPDSP 16256, Model13092
Architecture of Advanced Digital
Signal Processors
VLIW Architecture
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
VLIW Architecture
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
Numerical Fidelity
• Guard Bits
• Dynamic range
$$\text{Dynamic range} = 20\log_{10}\!\left(\frac{\text{Largest value}}{\text{Smallest value}}\right)\ \text{dB} = 20\log_{10}\!\left(\frac{1}{1/2^{31}}\right)\ \text{dB} = 186.6\ \text{dB}$$
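The 186.6 dB dynamic range quoted on this slide can be checked numerically. The sketch below assumes a 32-bit fractional format whose largest magnitude is taken as 1 and whose smallest non-zero magnitude is 2^-31; the function name `dynamic_range_db` is made up for this example.

```python
import math


def dynamic_range_db(largest, smallest):
    """Dynamic range in dB: 20 * log10(largest / smallest)."""
    return 20.0 * math.log10(largest / smallest)


# 32-bit fractional word: smallest step assumed to be 2**-31
dr_32 = dynamic_range_db(1.0, 2.0 ** -31)
# 16-bit fractional word, for comparison: smallest step assumed 2**-15
dr_16 = dynamic_range_db(1.0, 2.0 ** -15)

print(round(dr_32, 1), round(dr_16, 1))
```

Each additional bit adds about 6.02 dB, which is why the 16-bit case comes out near half of the 32-bit figure.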
Fast Execution control
• Zero-overhead Hardware loop
• Very fast interrupt handling by employing
shadow registers.
Applications of TI DSPs
• C1x, C2x, C2xx, C5x, C54x: toys, hard disk drives,
modems, cellular phones and active car suspensions.
• C3x: filters, analysers, hi-fi systems, voice mail,
imaging, barcode readers, motor control, 3D graphics or
scientific processing.
• C4x: parallel-processing clusters in virtual reality,
image recognition, telecom routing, and parallel
processing systems.
• C6x: wireless base stations, pooled modems, remote-
access servers, digital subscriber loop systems, cable
modems and multichannel telephone systems.
• C8x: video telephony, 3D computer graphics, virtual
reality and a number of multimedia applications.
IC Number
• TI DSP chips have IC numbers with the
prefix TMS320.
• Next letter C: CMOS technology (TMS320Cxx)
• Next letter E: CMOS technology and on-chip non-volatile
EPROM (TMS320E5x)
• No letter (TMS3205x): NMOS technology and on-chip
non-volatile ROM
• Within the C5x family, the C50, C51 and other C5x devices are identical in
instruction set but differ in the capacity of on-chip
ROM and RAM.
Characteristics of some TMS320 family DSP chips

                  'C15   'C25   'C30   'C50   'C541
Cycle time (ns)   200    100    60     50     25
On-chip RAM       4K     4K     4K     2K     5K
Total memory      4K     128K   16M    128K   128K
Parallel ports    8      16     16M    64K    64K
Architecture of TMS320C5x
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
• Bus structure
– Program bus (PB): carries instruction code
and immediate operands from program
memory to the CPU.
– Program address bus (PAB): provides
addresses to program memory space for both
reads and writes.
– Data read bus (DB): interconnects various
elements of the CPU to data memory space.
– Data read address bus (DAB): provides the
address to access the data memory space.
Features of TMS320C5x Family
• Central Arithmetic Logic Unit (CALU)
– 16-bit CPU
– 20 to 50 ns single cycle instruction execution time
– Single cycle 16 x 16-bit MAC (Multiply/
Accumulate) unit
– 64k x 16-bit external Program memory address
space
– 64k x 16-bit external data memory address space
• 64k x 16-bit external IO address space
• 32k x 16-bit external global memory address
space
• 2k to 32k x 16-bit single-access On-chip
PROM
• 1k to 9k x 16-bit single-access On-chip
Program/data RAM
• 1k x 16-bit dual-access On-chip program/ data
RAM
• Synchronous, TDM and buffered serial ports
• Programmable timer and PLL (Phase Locked
Loops)
• IEEE standard JTAG ports
• 5 V/3 V operation with low power dissipation
and power down modes
• DMA interface
• 100/128/132/144 pins in plastic QFP and
TQFP
• Central Processing Unit
 Central arithmetic logic unit (CALU)
 Parallel logic unit(PLU)
 Auxiliary register arithmetic unit(ARAU)
 Memory mapped registers
 Program controller
Central arithmetic logic unit (CALU)
• 16X16 Bit Parallel Multiplier
• 32 bit Accumulator(ACC)
• 32 bit Accumulator Buffer (ACCB)
• Product register (PREG)
• 0-16 bit barrel shifters(right and left)
• 32 bit ALU
• One of the operands for an ALU operation comes from the ACC.
• The result is stored in the ACC.
• The 32-bit ACCB is used for temporary storage of the ACC.
• The hardware multiplier performs 16x16 multiplication of
numbers represented in 2’s complement form.
• The 32-bit PREG holds the result of the multiplication.
• The 0-16 bit left and right barrel shifters in the CALU permit
the contents of memory to be shifted by 0-16 bits
before they are fed to the ALU or stored from the ALU to
memory.
• Auxiliary register arithmetic unit (ARAU):
– Eight auxiliary registers (AR0-AR7), each 16 bits
long
– 3-bit ARP (auxiliary register pointer) and unsigned 16-bit
ALU
– ARs are used as address pointers and general-purpose registers
• Other registers and units: index register,
ARCR, BMAR, BRR (RPTC, BRCR, PASR, PAER), PLU
– Index register:
 Used by ARAU as step value to modify the address in AR's during
indirect addressing.
– Auxiliary Register compare register:
 Used for Address boundary comparison
– Block Move Address Register (BMAR)
 16 bit holds an address value to be used with block
moves and multiply/accumulate operations.
 provides 16 bit address for indirect addressed second
operand
– Block Repeat Register (BRR)
 16 bit wide
 Repeat counter register (RPTC)
 Block repeat counter register (BRCR)
 Block repeat program address start register (PASR)
 Block repeat program address end register (PAER)
– Parallel Logic Unit (PLU)
Performs Boolean operations or bit manipulations.
The logic unit executes logic operations: set, clear, test
or toggle multiple bits in a control register or any
data memory location.
• Memory mapped registers
 96 registers
 Used for indirect data pointer, temporary
storage
• Instruction Registers
• Interrupt Registers
• Status Registers
• Program Controller
Program Counter
Hardware Stack
Program Memory Address Generation
Status and Control Registers
Circular Buffer Registers
Process Mode Status Register
Status Register (ST0 and ST1)
Status Registers
• ST0 bit assignment
– ARP: auxiliary register pointer; selects the AR used in indirect
addressing
– OV: overflow flag bit; indicates arithmetic overflow in the ALU
– OVM: overflow mode bit; selects accumulator overflow saturation
mode
– INTM: interrupt mode bit; globally masks or enables all
interrupts
– DP: data memory page pointer; address of the current
data memory page

Bit     15-13  12   11    10   9      8-0
Field   ARP    OV   OVM   1    INTM   DP
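As an illustration of the ST0 layout, the sketch below unpacks a 16-bit status word into its fields. It assumes ARP occupies bits 15-13 (three bits, consistent with the 3-bit ARP described earlier), with OV at bit 12, OVM at bit 11, INTM at bit 9 and DP in bits 8-0; the function name `decode_st0` is hypothetical.

```python
def decode_st0(st0):
    """Split a 16-bit ST0 value into named fields (assumed layout)."""
    if not 0 <= st0 <= 0xFFFF:
        raise ValueError("ST0 is a 16-bit register")
    return {
        "ARP": (st0 >> 13) & 0x7,   # bits 15-13: auxiliary register pointer
        "OV": (st0 >> 12) & 0x1,    # bit 12: overflow flag
        "OVM": (st0 >> 11) & 0x1,   # bit 11: overflow mode
        "INTM": (st0 >> 9) & 0x1,   # bit 9: interrupt mode
        "DP": st0 & 0x1FF,          # bits 8-0: data memory page pointer
    }


# Example word: ARP = 5, OVM = 1, reserved bit 10 = 1, INTM = 1, DP = 0x0A3
example = (5 << 13) | (1 << 11) | (1 << 10) | (1 << 9) | 0x0A3
print(decode_st0(example))
```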
• ST1 Bit assignment
• ARB: Auxiliary Register Buffer – Holds previous value of
ARP
• CNF on chip RAM configuration control bit
CNF: 0-on chip DARAM B0 is mapped to data memory
CNF:1 – on chip DARAM B0 is mapped to Program memory
• TC test/ control flag bit – Stores the result of ALU or PLU test
bit operations
• SXM: Sign extension mode bit- enables /disables sign
extension of an arithmetic operation
Bit     15-13  12    11   10    9   8-7   6    5   4    3-2   1-0
Field   ARB    CNF   TC   SXM   C   11    HM   1   XF   11    PM
• C: Carry bit-Indicates arithmetic carry or borrow
• HM: Hold mode bit- indicates CPU stops or continues
execution
• XF: pin status bit – determines level of external flag
output pin
• PM: product shift mode bits
00 – No shift
01-Left shifted 1 bit; LSBs zero filled
10- Left shifted 4 bits; LSBs zero filled
11- Right shifted 6 bits;6 LSBs lost
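The four PM settings can be modelled as a small function applied to the 32-bit product. This is a sketch under assumptions: the product is treated as a signed 32-bit value, the right shift is assumed to be arithmetic (sign-preserving), and the name `apply_pm_shift` is invented for this example.

```python
def apply_pm_shift(preg, pm):
    """Shift a signed 32-bit product according to the 2-bit PM field."""
    if pm == 0b00:
        result = preg            # no shift
    elif pm == 0b01:
        result = preg << 1       # left 1 bit, LSB zero filled
    elif pm == 0b10:
        result = preg << 4       # left 4 bits, LSBs zero filled
    elif pm == 0b11:
        result = preg >> 6       # right 6 bits, 6 LSBs lost (sign kept)
    else:
        raise ValueError("PM is a 2-bit field")
    # Wrap back into the signed 32-bit range, as a 32-bit register would.
    result &= 0xFFFFFFFF
    return result - (1 << 32) if result & 0x80000000 else result


for mode in range(4):
    print(format(mode, "02b"), apply_pm_shift(0x100, mode))
```

The left shift by 1 is the mode typically used with fractional arithmetic, where the product of two fractional operands carries a redundant sign bit.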
• On-chip Memory
Program Memory
Data/Program Dual access RAM
Data/Program single Access RAM
On Chip Memory Protection
• On-chip Peripherals
Clock Generator
Hardware Timer
Software Programmable wait state generators
General Purpose I/O Pins
Parallel I/O Ports
Serial Port Interface
Buffered Serial Port
TDM Serial Port
Host Port Interface
User maskable interrupts
Addressing Modes
An addressing mode is the method of specifying the data to be operated
on by an instruction.
Direct addressing
Memory mapped register addressing
Indirect addressing
Immediate addressing
Register addressing
Circular addressing mode
Direct Addressing Mode
• Address of the data is directly specified in the
instruction itself.
• 16-bit data memory address bus(DAB)
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
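How a 16-bit direct address might be assembled can be sketched as follows. The 9-bit DP field comes from the ST0 layout in these slides; the 7-bit offset taken from the instruction word is an assumption added here (it is not stated on the slide), and `direct_address` is a hypothetical helper name.

```python
def direct_address(dp, offset):
    """Form a 16-bit data address from a page pointer and an offset.

    dp     : 9-bit data page pointer (0..511), as held in ST0
    offset : 7-bit offset within the page (0..127), assumed to come
             from the instruction word
    """
    if not 0 <= dp < 512:
        raise ValueError("DP must fit in 9 bits")
    if not 0 <= offset < 128:
        raise ValueError("offset must fit in 7 bits")
    return (dp << 7) | offset


print(hex(direct_address(4, 0x10)))
```

Under this assumption the data space is viewed as 512 pages of 128 words each, and changing DP switches the page that subsequent direct accesses refer to.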
Memory Mapped Register
Addressing
• LAMM- Load accumulator with memory
mapped register
• LMMR – Load memory mapped register
• SAMM – Store accumulator in memory
mapped register
• SMMR – Store memory mapped register
Bit Reversed Addressing Mode
• Bit Reversed operation – FFT
Immediate Addressing
• The immediate addressing mode can be used
to load either a 16-bit constant or a constant of
length 13, 9 or 7 bits.
• Accordingly, it is referred to as long immediate
or short immediate addressing mode.
• This mode is indicated by the symbol #.
Indirect Addressing
Symbol   Value of the AR pointed to by ARP after instruction execution
*        AR unaltered
*+       AR incremented by 1
*-       AR decremented by 1
*0+      AR incremented by the content of INDX
*0-      AR decremented by the content of INDX
*BR0+    AR incremented by the content of INDX with reverse carry propagation
*BR0-    AR decremented by the content of INDX with reverse carry propagation
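Reverse carry propagation, as used by the *BR0+ mode, can be illustrated with a short sketch. It works on buffer offsets only (no base address), sets INDX to half the FFT length, and emulates the reverse carry by bit-reversing, adding, and bit-reversing again; the function names are made up for this example and the hardware does not literally work this way.

```python
def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(ar, indx, bits):
    """Add indx to ar with the carry propagating from MSB towards LSB."""
    mask = (1 << bits) - 1
    total = (bit_reverse(ar, bits) + bit_reverse(indx, bits)) & mask
    return bit_reverse(total, bits)


def bit_reversed_sequence(n_points):
    """Offsets visited by repeated reverse-carry increments of n_points // 2."""
    bits = n_points.bit_length() - 1   # n_points assumed to be a power of 2
    ar, indx = 0, n_points // 2
    sequence = []
    for _ in range(n_points):
        sequence.append(ar)
        ar = reverse_carry_add(ar, indx, bits)
    return sequence


print(bit_reversed_sequence(8))
```

The resulting order is the bit-reversed index order that a radix-2 FFT needs for its input or output, which is why this mode avoids a separate reordering pass.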
Dedicated - Register Addressing
• The advantage of this addressing mode is that
the address of the block of memory to be acted
upon can be changed during execution of the
program
Circular Addressing Mode
• CBSR1 - Circular buffer 1 start register
• CBSR2 - Circular buffer 2 start register
• CBER1 - Circular buffer 1 end register
• CBER2 - Circular buffer 2 end register
• CBCR - Circular buffer control register
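The role of the start and end registers can be sketched with a small model of one circular buffer. The class name `CircularPointer` and the step-by-one update are assumptions for illustration; the attributes `cbsr` and `cber` stand in for a start/end register pair such as CBSR1 and CBER1.

```python
class CircularPointer:
    """Address pointer that wraps from the end address back to the start."""

    def __init__(self, cbsr, cber):
        if cber < cbsr:
            raise ValueError("end address must not precede start address")
        self.cbsr = cbsr      # circular buffer start address
        self.cber = cber      # circular buffer end address
        self.ar = cbsr        # auxiliary register used as the pointer

    def next_address(self):
        """Return the current address, then advance with wrap-around."""
        address = self.ar
        # When the pointer sits on the end address, reload the start address.
        self.ar = self.cbsr if self.ar == self.cber else self.ar + 1
        return address


pointer = CircularPointer(0x100, 0x102)
print([hex(pointer.next_address()) for _ in range(5)])
```

Because the wrap is handled by the address generator, a filter loop can walk a delay line indefinitely without spending instructions on bounds checks.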
Pipelining
• Pipelining a processor means breaking down
the execution of each instruction into a series of discrete pipeline
stages which are completed in sequence.
• Phases of pipelining
– Fetch (F)
– Decode (D)
– Read (R)
– Execute (E)
Pipelining
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
Advantages
• Improves system Performance
• Increases the speed of operation
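The speed advantage can be made concrete with a small sketch that lays out the four phases (F, D, R, E) cycle by cycle. It assumes an ideal pipeline with one cycle per phase and no stalls or branches; the function names are invented for this example.

```python
STAGES = ("F", "D", "R", "E")


def pipeline_chart(n_instructions, stages=STAGES):
    """One string per instruction; column c shows its stage in cycle c."""
    depth = len(stages)
    total_cycles = n_instructions + depth - 1 if n_instructions else 0
    chart = []
    for i in range(n_instructions):
        row = ["."] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name      # instruction i enters stage s in cycle i + s
        chart.append("".join(row))
    return chart


def cycles_pipelined(n_instructions, depth=len(STAGES)):
    """Cycles needed when a new instruction starts every cycle."""
    return n_instructions + depth - 1 if n_instructions > 0 else 0


def cycles_sequential(n_instructions, depth=len(STAGES)):
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * depth


for line in pipeline_chart(3):
    print(line)
print(cycles_pipelined(100), cycles_sequential(100))
```

For long instruction sequences the ideal pipelined count approaches one cycle per instruction, against four cycles per instruction without overlap.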
MAC Operation
• The dominant numerical operations in DSP are multiplication
and addition
• For real-time DSP to be fast, a MAC unit is
mandatory
• A fixed-point or floating-point hardware MAC is standard
in DSPs
• In fixed point, it multiplies two 16-bit 2’s
complement fractional numbers and computes
a 32-bit product in a single cycle (25 ns)
• The DSP hardware MAC configuration is depicted in the diagram that follows
MAC Configuration in DSPs
Diagram taken from ‘Digital signal Processors-Architecture, Programming and Applications’-
B.Venkataramani & M Bhaskar, Second Edition book
• The multiplier has a pair of input registers that
hold the inputs to the multiplier and a 32 bit
product register which holds the result of a
multiplication.
• The output of the P (product) register is connected
to a double precision accumulator.
• The principle is very much the same for hardware
floating point multiplier accumulators.
• Floating-point MACs allow fast computation of
DSP results with minimal errors.
• Floating point offers a wide dynamic range and
reduced arithmetic errors, but in many applications the
dynamic range provided by the fixed-point
representation is adequate.
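A fixed-point multiply-accumulate on 16-bit fractional operands can be sketched as follows. The sketch assumes the common Q15 convention (value = integer / 2^15) for the 16-bit 2's complement fractions, so each product is a Q30 value that fits in 32 bits; the helper names are hypothetical, and the accumulator here is an unbounded Python integer rather than a fixed-width register.

```python
def to_q15(x):
    """Convert a float in [-1, 1) to a 16-bit Q15 integer, with saturation."""
    value = int(round(x * 32768))
    return max(-32768, min(32767, value))


def mac(acc, a_q15, b_q15):
    """Multiply two Q15 operands (Q30 product) and add to the accumulator."""
    return acc + a_q15 * b_q15


def q30_to_float(acc):
    """Interpret an accumulated Q30 value as a real number."""
    return acc / float(1 << 30)


acc = 0
acc = mac(acc, to_q15(0.5), to_q15(0.25))    # + 0.125
acc = mac(acc, to_q15(-0.5), to_q15(0.5))    # - 0.25
print(q30_to_float(acc))
```

Keeping the full-width product in a wider accumulator, rather than rounding each product back to 16 bits, is what preserves precision across a long sum of products.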
First generation – Fixed point processor
Fixed Point digital signal processors
• Key features of the four generations of fixed-point
DSP processors from 4 leading semiconductor
manufacturers.
• Basic architecture of the first-generation fixed-point
DSP processor TMS320C1x by Texas Instruments:
• Dedicated arithmetic units: a multiplier and an
accumulator
• The processor family uses a modified Harvard architecture
with two separate memory spaces for programs and
data.
• On-chip memory and special instructions for executing DSP algorithms
• Has three separate address spaces for program
memory, data memory and I/O.
• Two 16-bit auxiliary registers (AR0-AR1)
• The content of the auxiliary registers can be saved in
and loaded from data memory with the SAR and
LAR instructions
• Provides 144/256 words of 16-bit on-chip data
RAM
• 1.5K/4K words of program ROM/EPROM
Second generation – Fixed point processor
• Second-generation fixed-point DSPs have enhanced
features compared to the first generation.
• Much larger on-chip memories and more special
instructions to support efficient execution of DSP
algorithms
• Computational performance is 4 to 6 times that of the
first generation.
• Special instructions for DSP
operations include a multiply and accumulate with
data move instruction and a repeat instruction, which together execute
an FIR filter in less time.
• The second generation provides more on-chip memory
• TMS320C2x: modified Harvard architecture for
speed and flexibility
• The 32-bit ALU and accumulator perform a wide range of
arithmetic and logical instructions.
• Separate program and data memory spaces, each
with 16-bit address and on-chip data buses.
• 16x16-bit hardware multiplier capable of computing a
signed or unsigned 32-bit product in a single machine
cycle.
• Six registers:
– A serial port receive register
– A serial port transmit register
– A timer register
– A period register
– An interrupt mask register
– A memory allocation register
• The TMS320C2x allows flexible configurations
– A stand-alone processor
– A multiprocessor with devices in parallel
– A slave/host multiprocessor with global memory space
– A peripheral processor interfaced via processor-controlled
signals to another device
Second generation – Motorola DSP56002
Third Generation – Fixed Point processor
• Third Generation fixed point DSPs –
enhancement of second generation DSPs
• Performance Enhancement – Achieved by
increasing and/or making more effective use of
available on chip resources.
• More data paths, wider data paths, Larger on
chip memory and instruction cache and dual
MAC.
• Third generation DSPs - 2 or 3 times superior
to second generation
• Texas Instruments TMS320C3x,
TMS320C54X
• Third Generation TMS320C3X -executes 60
million floating point operations per second.
• On-chip parallelism in the processor: up to 11
operations in a single instruction
• High performance
– Performs parallel multiply and ALU
operations on integer or floating-point data in a single cycle
– General purpose register file
– Large on chip memory
– High degree of parallelism
– Direct memory access controller
Fourth Generation-Fixed point processors
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
• Fourth Generation fixed point DSP processors
– Multi channel Applications
• Digital Subscriber loop
• Remote Access server modem
• Wireless base station
• 3 G Mobile systems
• Medical Imaging
 Uses VLIW Architecture
 Wider Instruction word
 Wider data paths
 More registers
 Larger Instruction Cache
 Multiple Arithmetic unit
• The core processor has two independent arithmetic
paths, each with four execution units
– Logic unit (Li)
– Shifter/logic unit (Si)
– Multiplier (Mi)
– Data address unit (Di)
• The core processor fetches eight 32-bit instructions at a time,
giving an instruction width of 256 bits
• Executes up to 8 instructions in parallel in one cycle
• Large program and data memory
• Advantage of the VLIW architecture: high
computational performance.
Floating Point Representation
• First generation – TMS320 C3x
• Second generation – TMS320C4x
• Third generation – TMS320C67X
Floating Point Representation
• TMS320C6x: VelociTI architecture, the first DSP
to use an advanced VLIW architecture
• Excellent choice for multiple-execution and
multifunction applications
• VelociTI architecture: reduced code size,
flexibility of code and data types, and zero overhead in
branching.
• TMS320C62x, C64x: fixed-point processors
• TMS320C67x: floating-point processor with 32
general-purpose registers, each 32 bits wide.
Features of TMS320C6x processors
• Advanced VLIW CPU with 8 functional units (2 multipliers and
6 ALUs)
• Executes up to 8 instructions per cycle
• Instruction packing reduces code size, program fetches and
power consumption
• Conditional execution of all instructions
• Efficient code execution on independent functional units
• Supports 8/16/32-bit data formats
• 40-bit arithmetic, saturation and normalization operations
• Bit-field extract, set, clear and bit-counting
instructions.
• Supports single-precision (32-bit) and double-precision (64-bit)
IEEE floating-point operations
• 32 x 32-bit integer multiplication with 32- or 64-bit results.
Floating Point Representation
Internal Architecture
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
• The C6x contains a 32-bit CPU
• On-chip program memory
• On-chip data memory and on-chip peripherals
• Peripherals such as
– External memory interface (EMIF)
– Direct memory access (DMA) controller
– Timers
– Multichannel buffered serial ports (McBSP)
– Host port interface (HPI)
– Power-down logic
CPU unit of TMS320C6X
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
• The CPU contains
– Program fetch unit
– Instruction dispatch unit
– Instruction decode unit
– Two data paths, each with 4 functional units
– A register file for each data path
– Control registers
– Control logic
– Test, emulation and interrupt logic
• Each functional unit accepts one 32-bit instruction
at a time (the fetch packet size is 256 bits)
• The program fetch unit generates the address of eight
instructions and sends it to program memory for
each fetch packet; once fetched, the CPU receives
the packet.
• The instruction dispatch unit receives the fetch
packet and splits it into execute packets.
• Instructions in the execute packets are assigned to
the appropriate functional units among the eight in the data paths
• In instruction decode, the source registers,
destination register and associated paths are
decoded for execution of the instruction in the
functional unit.
• Instructions are executed by the functional units
• The register files A and B hold 32 registers of 32 bits in total
(16 registers for each data path)
• 8 functional units: 6 ALUs and 2 multipliers
(.L1, .L2, .S1, .S2, .M1, .M2, .D1, .D2)
Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second
edition book
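The dispatch step, splitting a 256-bit fetch packet into execute packets, can be sketched as below. The slides do not say how the split is encoded; this sketch assumes the C6x convention that bit 0 of each 32-bit instruction (the p-bit) marks whether the next instruction runs in parallel with it, and that an execute packet never extends past the end of its fetch packet. The function name `split_fetch_packet` is hypothetical.

```python
def split_fetch_packet(words):
    """Group eight 32-bit instruction words into execute packets.

    Assumption: bit 0 of a word is its p-bit. A p-bit of 1 means the
    following instruction executes in the same cycle; a p-bit of 0
    closes the current execute packet.
    """
    if len(words) != 8:
        raise ValueError("a fetch packet holds eight 32-bit instructions")
    packets, current = [], []
    for index, word in enumerate(words):
        current.append(word)
        p_bit = word & 1
        # Close the packet on p-bit 0, or at the fetch packet boundary.
        if p_bit == 0 or index == len(words) - 1:
            packets.append(current)
            current = []
    return packets


# Only the p-bits matter here, so the example words are just 0 or 1.
example = [1, 1, 0, 0, 1, 0, 0, 0]
print([len(p) for p in split_fetch_packet(example)])
```

A fetch packet whose first seven p-bits are all 1 becomes a single execute packet that keeps all eight functional units busy in one cycle, while all-zero p-bits give eight packets executed one after another.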
Functional units of C6x

Unit      Type of floating-point operation
.L unit   Arithmetic operations
.S unit   Compare, square root and absolute value operations
.M unit   32x32-bit fixed-point multiply operations and floating-point multiply operations
.D unit   Load double word with 5-bit constant offset
• Data Path
– Register file data path
– Register file cross path
– Register file Memory access path
• Control Register File –
– 10 control registers,
– .S2 can read and write to control register
– Accessed by MVC (Move between Control and register
file)
– Addressing mode register
– Control Status register
– Program counter
– Interrupt flag , set, clear, enable register
– Interrupt return pointer
– Non-maskable interrupt return pointer
Visual Quiz
• The feature in which a PDSP is superior to an advanced
microprocessor is
A. Low cost
B. Low power
C. Computational speed
D. Real time I/O Capability
• The addressing mode convenient for FFT computation
is
A. Indirect addressing
B. Circular mode
C. Bit reversed addressing
D. Memory mapped addressing
• The results of operations performed in the CALU are
stored in
A. ACC
B. ACCB
C. TREG0
D. PREG
• The ........ permits execution of logical operations
on data without affecting the contents of the ALU
A. PLU
B. Auxiliary ALU
C. CALU
D. Memory mapped addressing
• The Register used for indirect addressing of
memory
A. ARs
B. Block move address register
C. TREG
D. Index register
• The indirect addressing mode symbol that leaves the
AR content unaltered is
A. *
B. *0+
C. #
D. *+
• The C6X processor is based on ..........
architecture
A. Modified Harvard
B. VelociTI
C. Advanced Harvard
D. Davinci
• The functional unit used for 32/40 bit shift
operation
A. .L
B. .S
C. .M
D. .D
• The number of registers in the C62x and C67x CPU
register file is
A. 16
B. 32
C. 40
D. 64
• The floating point devices in C6x processors
are
A. C62x
B. C67x
C. C64x
D. C62x and C74x

Introduction to Digital Signal processors

  • 1.
    Introduction to DigitalSignal processors Dr.S.Periyanayagi Professor& Head/ECE Ramco Institute of Technology
  • 2.
    DSP Processors • Adigital Signal Processor is a specialized microprocessor targeted at digital signal processing. • DSP Processors - needs of specific digital signal processing applications. • Advanced Microprocessor – RISC (Reduced Instruction Set Computer) Processor, CISC (Complex Instruction set Computer) Processor • For real time signal processing, DSP processors are rated best among the programmable processors
  • 3.
    Salient features • ForEfficient performance of DSP Operations Multiplier and Multiplier Accumulator Modified Bus Structure and Memory Access Schemes Multiple Access Memory Very Long Instruction Word VLIW Architecture Pipelining Special Addressing Modes On Chip Peripherals
  • 4.
    Categories of DSPs •General Purpose digital Signal processors  Fixed Point Processors TMS320C5X,TMS320C54x and Motorola DSP563x, DSP56156/166 (16 bit)  Floating Point Processors TMS320C4x,TMS320C67xx, Motorola DSP6002(32 bit) • Special Purpose Processors  Design for specific DSP algorithms FFT  Hardware designed for specific applications PCM,
  • 5.
    • Analog Devices –ADSP-2100 Family (16-bit Fixed Point) – ADSP- 21020 (32 bit Floating point) – ADSP-2106x(32 bit Floating Point Texas Instruments TMS320C1x (16 bit fixed point) TMS320C2x(16 bit fixed point) TMS320C3x(32 bit floating point) TMS320C4x(32 bit floating point) TMS320C5x(16 bit fixed point) TMS320C67x(32 bit floating point)
  • 6.
  • 7.
  • 9.
    • TMS 320Family of processors includes four basic types of processors.  Fixed Point Processors – Low power, Low cost device and operates at high speed.  32-bit floating point Processors - Large dynamic range, wider instruction word size and more addressing modes.  VLIW architecture processors - Executes Parallel instructions at a time by multiple execution unit. Multiprocessor DSPs – Provides parallel processing
  • 10.
    Architecture of Microprocessor Diagramtaken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second edition book
  • 11.
    Basic General HardwareArchitecture for Signal Processing Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second edition book
  • 12.
    Techniques in DSPProcessor • Harvard architecture • Pipelining • Fast, dedicated hardware multiplier/ accumulator • Special instruction dedicated to DSP • Replication • On chip memory/Cache • Extended parallelism – SMID, VLIW and static superscalar processing
  • 13.
    Special Features ofDigital Signal Processing • Fast Data Access • Fast Computation • Numerical fidelity • Fast execution control
  • 14.
    Fast Data Access •High-bandwidth Memory Architectures  Von Neumann Architecture  Harvard Architecture  Modified Harvard Architecture  Architecture of Advanced digital Signal processors. • Specialized Addressing Modes  Circular Addressing  Bit reversed addressing • Direct Memory Access (DMA)
  • 15.
    Fast Computation • MAC(Multiply/Accumulate) Unit • Pipelining of Instruction Execution  Phase 1 : Fetch the opcode (or instruction code) from program memory.  Phase 2 : Decode the instruction code.  Phase 3 : Read the operands (or data) from data/program memory.  Phase 4 : Execute the task specified by the instruction and store the result.
  • 16.
    Von Neumann Architecture Diagramtaken from ‘Digital signal Processors-Architecture, Programming and Applications’- B.Venkataramani & M Bhaskar, Second Edition book
  • 17.
    • MAC operationwith data move (MACD instruction) – Requires 4 memory access per instruction cycle.  Fetch the MACD instruction from the program memory  Fetch one of the operands from the program memory  Fetch the second operand from the data memory  Write the content of the data memory with address DMA into the location with address DMA+1 • Von Neumann architecture – 4 clock cycles
  • 18.
    Von Neumann Architecture •Consist of three buses  Data bus  Address bus  Control bus
  • 19.
    Non-Harvard architecture withsingle memory space Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second edition book
  • 20.
    Types of instruction •Instruction Fetch • Instruction decode • Instruction execute
  • 21.
    Class Poll • AMicroprocessor and DSP processor differ in a. Speed of operation b. Real time signal processing c. Multiple Busses d. All the above
  • 22.
    Harvard Architecture Diagram takenfrom ‘Digital signal Processors-Architecture, Programming and Applications’- B.Venkataramani & M Bhaskar, Second Edition book
  • 23.
    Basic Harvard Architecture Diagramtaken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second edition book
  • 24.
    Instruction overlap madein Harvard architecture Diagram taken from ‘Digital signal Processing’- Emmanuel Ifeachor & Barrie W.Jervis, Second edition book
  • 25.
    • Number ofclock cycles are reduced by using two separate buses for the program and data memory • Content of program and data memory can be accessed parallel. • Instruction code can be fed from the program memory to the control unit while the operand is fed to the processing unit from the data memory. • Processing unit consist of – Registers – Processing elements - MAC units, Multiplier, ALU, shifter
  • 26.
    • The numberof memory accesses/ Clock cycles - Increased by using more number of buses  Motorola DSP5600X.DSP96002 – three separate buses  TMS320C54X – 4 address buses • The cost of IC increases - Number of pins in the IC • Extending number of buses – unduly increases the price • P-DSPs – multiple buses for connecting on chip memory to the control unit and data path.
  • 27.
    Modified Harvard Architecture Diagramtaken from ‘Digital signal Processors-Architecture, Programming and Applications’- B.Venkataramani & M Bhaskar, Second Edition book
  • 28.
    • One setof bus - access both program and data memory • Other – data alone • Data can be transferred from one memory to another • Texas instruments and Analog devices
  • 29.
    Special Purpose DSPsExamples FFT processor PDSP 16515A,TM-44,TM-66 Programmable FIR filter UPDSP 16256, Model13092
  • 30.
    Architecture of AdvancedDigital Signal Processors
  • 31.
    VLIW Architecture Diagram takenfrom ‘Digital signal Processors-Architecture, Programming and Applications’- B.Venkataramani & M Bhaskar, Second Edition book
  • 32.
    VLIW Architecture Diagram takenfrom ‘Digital signal Processors-Architecture, Programming and Applications’- B.Venkataramani & M Bhaskar, Second Edition book
  • 33.
    Numerical Fidelity
    • Guard bits
    • Dynamic range

    $$\text{Dynamic range} = 20\log_{10}\frac{\text{Largest value}}{\text{Smallest value}}\ \text{dB} = 20\log_{10}\frac{1}{1/2^{31}}\ \text{dB} = 186.6\ \text{dB}$$
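The dynamic range figure on this slide can be checked numerically. The following is a minimal sketch, assuming the largest representable magnitude is 1 and the smallest is 2^-31, as in the slide; the function name `dynamic_range_db` is introduced here for illustration only.

```python
import math


def dynamic_range_db(fraction_bits: int) -> float:
    """Dynamic range in dB when the largest value is 1 and the
    smallest non-zero value is 2**-fraction_bits (assumed model)."""
    largest = 1.0
    smallest = 2.0 ** -fraction_bits
    return 20.0 * math.log10(largest / smallest)


if __name__ == "__main__":
    # 31 fraction bits corresponds to the 32-bit case on the slide.
    for bits in (15, 31):
        print(f"{bits} fraction bits: {dynamic_range_db(bits):.1f} dB")
```

Each additional bit adds about 6.02 dB, which is why the 31-bit case comes to roughly 186.6 dB.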
  • 34.
    Fast Execution Control
    • Zero-overhead hardware loops
    • Very fast interrupt handling by employing shadow registers
  • 35.
    Applications of TI DSPs
    • C1x, C2x, C2xx, C5x, C54x: toys, hard disk drives, modems, cellular phones and active car suspensions.
    • C3x: filters, analysers, hi-fi systems, voice mail, imaging, barcode readers, motor control, 3D graphics and scientific processing.
    • C4x: parallel-processing clusters in virtual reality, image recognition, telecom routing and parallel-processing systems.
    • C6x: wireless base stations, pooled modems, remote-access servers, digital subscriber loop systems, cable modems and multichannel telephone systems.
    • C8x: video telephony, 3D computer graphics, virtual reality and a number of multimedia applications.
  • 36.
    IC Number
    • TI DSP chips have IC numbers with the prefix TMS320.
    • Next letter C: CMOS technology (TMS320Cxx).
    • Next letter E: CMOS technology with on-chip non-volatile EPROM (TMS320E5x).
    • No letter (TMS3205x): NMOS technology with on-chip non-volatile ROM.
    • Within the C5x family, the C50, C51 and other C5x devices are identical in instruction set but differ in the capacity of on-chip ROM and RAM.
  • 37.
    Characteristics of some TMS320 family DSP chips

                       'C15   'C25   'C30   'C50   'C541
    Cycle time (ns)    200    100    60     50     25
    On-chip RAM        4K     4K     4K     2K     5K
    Total memory       4K     128K   16M    128K   128K
    Parallel ports     8      16     16M    64K    64K
  • 38.
  • 39.
    Diagram taken from ‘Digital Signal Processors: Architecture, Programming and Applications’, B. Venkataramani & M. Bhaskar, second edition.
  • 40.
    • Bus Structure
      – Program bus (PB): carries instruction code and immediate operands from program memory to the CPU.
      – Program address bus (PAB): provides addresses to program memory space for both reads and writes.
      – Data read bus (DB): interconnects various elements of the CPU to data memory space.
      – Data read address bus (DAB): provides the address to access the data memory space.
  • 41.
    Features of the TMS320C5x Family
    • Central Arithmetic Logic Unit (CALU)
      – 16-bit CPU
      – 20 to 50 ns single-cycle instruction execution time
      – Single-cycle 16 x 16-bit MAC (multiply/accumulate) unit
      – 64K x 16-bit external program memory address space
      – 64K x 16-bit external data memory address space
  • 42.
    • 64K x 16-bit external I/O address space
    • 32K x 16-bit external global memory address space
    • 2K to 32K x 16-bit single-access on-chip program ROM
    • 1K to 9K x 16-bit single-access on-chip program/data RAM
    • 1K x 16-bit dual-access on-chip program/data RAM
  • 43.
    • Synchronous, TDM and buffered serial ports
    • Programmable timer and PLL (phase-locked loop)
    • IEEE standard JTAG port
    • 5 V/3 V operation with low power dissipation and power-down modes
    • DMA interface
    • 100/128/132/144 pins in plastic QFP and TQFP packages
  • 44.
    • Central Processing Unit
      – Central arithmetic logic unit (CALU)
      – Parallel logic unit (PLU)
      – Auxiliary register arithmetic unit (ARAU)
      – Memory-mapped registers
      – Program controller
  • 45.
    Central Arithmetic Logic Unit (CALU)
    • 16 x 16-bit parallel multiplier
    • 32-bit accumulator (ACC)
    • 32-bit accumulator buffer (ACCB)
    • Product register (PREG)
    • 0-16 bit barrel shifters (right and left)
    • 32-bit ALU
  • 46.
    • One of the operands for an ALU operation comes from the ACC.
    • The result is stored in the ACC.
    • The 32-bit ACCB is used for temporary storage of the ACC.
    • The hardware multiplier performs 16 x 16 multiplication of numbers represented in 2's complement form.
    • The 32-bit PREG holds the result of the multiplication.
    • The 0-16 bit left and right barrel shifters in the CALU permit the contents of memory to be shifted left by 0-16 bits before they are fed to the ALU or stored from the ALU to memory.
  • 47.
    • Auxiliary register arithmetic unit (ARAU)
      – Eight auxiliary registers (AR0-AR7), each 16 bits long, a 3-bit ARP (auxiliary register pointer) and an unsigned 16-bit ALU
      – The ARs are used as address pointers and general-purpose registers
      – Associated registers and units: index register (INDX), ARCR, BMAR, BRR (RPTC, BRCR, PASR, PAER), PLU
    • Index register: used by the ARAU as the step value to modify the address in the ARs during indirect addressing.
    • Auxiliary register compare register (ARCR): used for address boundary comparison.
  • 48.
    • Block move address register (BMAR)
      – 16 bits; holds an address value to be used with block moves and multiply/accumulate operations
      – Provides the 16-bit address for an indirectly addressed second operand
    • Block repeat registers (BRR), each 16 bits wide
      – Repeat counter register (RPTC)
      – Block repeat counter register (BRCR)
      – Block repeat program address start register (PASR)
      – Block repeat program address end register (PAER)
    • Parallel logic unit (PLU)
      – Performs Boolean operations or bit manipulations
      – Executes logic operations: set, clear, test or toggle multiple bits in a control register or any data memory location
  • 49.
    • Memory-mapped registers
      – 96 registers
      – Used for indirect data pointers and temporary storage
    • Instruction registers
    • Interrupt registers
    • Status registers
    • Program controller
      – Program counter
      – Hardware stack
      – Program memory address generation
  • 50.
    Status and Control Registers
    • Circular buffer registers
    • Processor mode status register
    • Status registers (ST0 and ST1)
  • 51.
    Status Registers
    • ST0 bit assignment
      – ARP: auxiliary register pointer. Selects the AR used in indirect addressing.
      – OV: overflow flag bit. Indicates an arithmetic operation overflow in the ALU.
      – OVM: overflow mode bit. Selects accumulator overflow saturation mode.
      – INTM: interrupt mode bit. Globally masks or enables all interrupts.
      – DP: data memory page pointer bits. Hold the address of the current data memory page.

    Bits    15-13   12   11    10   9      8-0
    Field   ARP     OV   OVM   1    INTM   DP
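The ST0 layout can be made concrete by unpacking a 16-bit register value into its fields. This is a minimal sketch, assuming ARP occupies bits 15-13 (it is a 3-bit pointer), OV bit 12, OVM bit 11, INTM bit 9 and DP bits 8-0; the function name `decode_st0` is hypothetical and not part of any TI tool.

```python
def decode_st0(st0: int) -> dict:
    """Split a 16-bit ST0 value into named fields.

    Assumed layout: ARP 15-13, OV 12, OVM 11, bit 10 reserved (reads 1),
    INTM 9, DP 8-0.
    """
    return {
        "ARP": (st0 >> 13) & 0x7,   # which of AR0-AR7 is current
        "OV": (st0 >> 12) & 0x1,    # overflow flag
        "OVM": (st0 >> 11) & 0x1,   # overflow (saturation) mode
        "INTM": (st0 >> 9) & 0x1,   # global interrupt mask
        "DP": st0 & 0x1FF,          # 9-bit data page pointer
    }


if __name__ == "__main__":
    # Example value: ARP=5, OV=1, OVM=0, reserved bit set, INTM=1, DP=0xA3
    value = (5 << 13) | (1 << 12) | (1 << 10) | (1 << 9) | 0x0A3
    print(decode_st0(value))
```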
  • 52.
    • ST1 bit assignment
      – ARB: auxiliary register buffer. Holds the previous value of ARP.
      – CNF: on-chip RAM configuration control bit.
        CNF = 0: on-chip DARAM B0 is mapped to data memory.
        CNF = 1: on-chip DARAM B0 is mapped to program memory.
      – TC: test/control flag bit. Stores the result of ALU or PLU test-bit operations.
      – SXM: sign extension mode bit. Enables or disables sign extension of an arithmetic operation.

    Bits    15-13   12    11   10    9   8-7   6    5   4    3-2   1-0
    Field   ARB     CNF   TC   SXM   C   11    HM   1   XF   11    PM
  • 53.
    ARB Auxiliary Register Buffer
    • C: carry bit. Indicates an arithmetic carry or borrow.
    • HM: hold mode bit. Indicates whether the CPU stops or continues execution.
    • XF: pin status bit. Determines the level of the external flag output pin.
    • PM: product shift mode bits
      – 00: no shift
      – 01: left shifted 1 bit; LSB zero filled
      – 10: left shifted 4 bits; LSBs zero filled
      – 11: right shifted 6 bits; 6 LSBs lost
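The four PM settings can be sketched as a small function applied to the product value. This is an illustrative model only: the names `wrap32` and `apply_product_shift` are hypothetical, the product is treated as a signed 32-bit integer, and the right shift is assumed to be arithmetic (sign preserving).

```python
def wrap32(value: int) -> int:
    """Wrap an integer into the signed 32-bit range."""
    return ((value + 2**31) % 2**32) - 2**31


def apply_product_shift(product: int, pm: int) -> int:
    """Shift a 32-bit product according to the 2-bit PM field."""
    if pm == 0b00:
        return product                 # no shift
    if pm == 0b01:
        return wrap32(product << 1)    # left 1, LSB zero filled
    if pm == 0b10:
        return wrap32(product << 4)    # left 4, LSBs zero filled
    if pm == 0b11:
        return product >> 6            # right 6, six LSBs lost
    raise ValueError("PM is a 2-bit field (0..3)")


if __name__ == "__main__":
    for pm in range(4):
        print(f"PM={pm:02b}: {apply_product_shift(0x40, pm):#x}")
```

The 1-bit left shift is the setting commonly used with fractional arithmetic, since multiplying two signed fractions leaves a redundant sign bit in the product.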
  • 54.
    • On-chip Memory ProgramMemory Data/Program Dual access RAM Data/Program single Access RAM On Chip Memory Protection • On-chip Peripherals Clock Generator Hardware Timer Software Programmable wait state generators General Purpose I/O Pins
  • 55.
    Parallel I/O Ports SerialPort Interface Buffered Serial Port TDM Serial Port Host Port Interface User maskable interrupts
  • 56.
    Addressing Modes
    The method of specifying the data to be operated on by an instruction is called the addressing mode.
    • Direct addressing
    • Memory-mapped register addressing
    • Indirect addressing
    • Immediate addressing
    • Register addressing
    • Circular addressing
  • 57.
    Direct Addressing Mode
    • The address of the data is specified directly in the instruction itself.
    • 16-bit data memory address bus (DAB)
    Diagram taken from ‘Digital Signal Processors: Architecture, Programming and Applications’, B. Venkataramani & M. Bhaskar, second edition.
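On the C5x, the 16-bit address placed on the DAB in direct addressing is formed by concatenating the 9-bit data page pointer (DP, held in ST0) with a 7-bit offset taken from the instruction word. A minimal sketch of that concatenation follows; the function name `direct_address` is introduced here for illustration.

```python
def direct_address(dp: int, offset: int) -> int:
    """Form a 16-bit data memory address from a 9-bit page pointer
    and a 7-bit offset (DP in the upper bits, offset in the lower)."""
    if not 0 <= dp <= 0x1FF:
        raise ValueError("DP is a 9-bit value")
    if not 0 <= offset <= 0x7F:
        raise ValueError("offset is a 7-bit value")
    return (dp << 7) | offset


if __name__ == "__main__":
    # Page 4, offset 0x10: each page holds 128 words.
    print(hex(direct_address(4, 0x10)))
```

Because only 7 bits come from the instruction, a program must reload DP whenever it moves to a different 128-word page.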
  • 58.
    Memory-Mapped Register Addressing
    • LAMM: load accumulator with memory-mapped register
    • LMMR: load memory-mapped register
    • SAMM: store accumulator in memory-mapped register
    • SMMR: store memory-mapped register
  • 59.
    Bit-Reversed Addressing Mode
    • Bit-reversed addressing is used to reorder data for FFT computation.
  • 60.
    Immediate Addressing
    • The immediate addressing mode can be used to load either a 16-bit constant or a constant of length 13, 9 or 7 bits.
    • Accordingly, it is referred to as long immediate or short immediate addressing mode.
    • This mode is indicated by the symbol #.
  • 61.
    Indirect Addressing

    Symbol   Value of the AR pointed to by ARP after instruction execution
    *        AR unaltered
    *+       AR incremented by 1
    *-       AR decremented by 1
    *0+      AR incremented by the content of INDX
    *0-      AR decremented by the content of INDX
    *BR0+    AR incremented by the content of INDX with reverse carry propagation
    *BR0-    AR decremented by the content of INDX with reverse carry propagation
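Reverse carry propagation is what produces the bit-reversed index order needed by the FFT. The sketch below models only the low `width` bits of an auxiliary register and implements the reverse-carry add by reversing the bits, adding normally, and reversing back; the function names are hypothetical, and INDX is assumed to hold half the FFT length.

```python
def bit_reverse(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(ar: int, indx: int, width: int) -> int:
    """Add indx to ar with the carry moving from MSB towards LSB."""
    mask = (1 << width) - 1
    total = (bit_reverse(ar, width) + bit_reverse(indx, width)) & mask
    return bit_reverse(total, width)


def br0_plus_sequence(start: int, indx: int, width: int, count: int) -> list:
    """Addresses visited by repeated *BR0+ style updates."""
    sequence, ar = [], start
    for _ in range(count):
        sequence.append(ar)
        ar = reverse_carry_add(ar, indx, width)
    return sequence


if __name__ == "__main__":
    # 8-point FFT: 3 address bits, INDX = 8 / 2 = 4
    print(br0_plus_sequence(0, 4, 3, 8))
```

For an 8-point FFT this visits the offsets 0, 4, 2, 6, 1, 5, 3, 7, which is the bit-reversed ordering of 0 to 7.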
  • 62.
    Dedicated-Register Addressing
    • The advantage of this addressing mode is that the address of the block of memory to be acted upon can be changed during execution of the program.
  • 63.
    Circular Addressing Mode
    • CBSR1: circular buffer 1 start register
    • CBSR2: circular buffer 2 start register
    • CBER1: circular buffer 1 end register
    • CBER2: circular buffer 2 end register
    • CBCR: circular buffer control register
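The role of the start and end registers can be illustrated with a simple pointer-update rule: when the address pointer reaches the end address, the next update reloads the start address. This is a simplified sketch with a step of 1; the function names are hypothetical, and the addresses used are example values, not taken from the slides.

```python
def next_circular_address(ar: int, cbsr: int, cber: int) -> int:
    """Return the next pointer value for a circular buffer that
    spans cbsr (start) to cber (end), inclusive."""
    return cbsr if ar == cber else ar + 1


def walk(start: int, end: int, steps: int) -> list:
    """List the addresses visited over a number of accesses."""
    addresses, ar = [], start
    for _ in range(steps):
        addresses.append(ar)
        ar = next_circular_address(ar, start, end)
    return addresses


if __name__ == "__main__":
    # A 4-word buffer at 0x100..0x103 accessed six times
    print([hex(a) for a in walk(0x100, 0x103, 6)])
```

This wrap-around is what lets a delay line for an FIR filter be kept in a fixed block of memory without moving the samples.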
  • 64.
    Pipelining
    • Pipelining a processor means breaking down the execution of each instruction into a series of discrete pipeline stages that are completed in sequence.
    • Phases of pipelining
      – Fetch (F)
      – Decode (D)
      – Read (R)
      – Execute (E)
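The benefit of overlapping the four phases can be estimated with a simple cycle count. The sketch below assumes an ideal pipeline in which every phase takes one cycle and there are no stalls or branches; the function names are introduced here for illustration.

```python
STAGES = ("F", "D", "R", "E")  # fetch, decode, read, execute


def sequential_cycles(n_instructions: int) -> int:
    """Cycles needed when each instruction finishes before the next starts."""
    return len(STAGES) * n_instructions


def pipelined_cycles(n_instructions: int) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle."""
    if n_instructions <= 0:
        return 0
    return len(STAGES) + n_instructions - 1


def stage_at(instruction: int, cycle: int):
    """Stage occupied by an instruction (0-based) in a cycle (0-based),
    or None if it is not in the pipeline during that cycle."""
    index = cycle - instruction
    return STAGES[index] if 0 <= index < len(STAGES) else None


if __name__ == "__main__":
    n = 10
    print(sequential_cycles(n), "cycles without pipelining")
    print(pipelined_cycles(n), "cycles with pipelining")
    for i in range(3):
        row = [stage_at(i, c) or "." for c in range(6)]
        print(f"I{i}: " + " ".join(row))
```

Once the pipeline is full, one instruction completes per cycle, so for long instruction sequences the speed-up approaches the number of stages.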
  • 65.
    Pipelining
    Diagram taken from ‘Digital Signal Processors: Architecture, Programming and Applications’, B. Venkataramani & M. Bhaskar, second edition.
  • 66.
    Advantages
    • Improves system performance
    • Increases the speed of operation
  • 67.
    MAC Operation
    • The basic numerical operations in DSP are multiplication and addition.
    • For real-time DSP to be fast, a MAC unit is mandatory.
    • A fixed-point or floating-point hardware MAC is standard in DSPs.
    • In fixed point, it multiplies two 16-bit 2's complement fractional numbers and computes a 32-bit product in a single cycle (25 ns).
    • The DSP hardware MAC configuration is depicted in the following figure.
  • 68.
    MAC Configuration in DSPs
    Diagram taken from ‘Digital Signal Processors: Architecture, Programming and Applications’, B. Venkataramani & M. Bhaskar, second edition.
  • 69.
    • The multiplier has a pair of input registers that hold the inputs to the multiplier and a 32-bit product register that holds the result of a multiplication.
    • The output of the P (product) register is connected to a double-precision accumulator.
    • The principle is very much the same for hardware floating-point multiplier-accumulators.
    • Floating-point MACs allow fast computation of DSP results with minimal errors.
    • Floating point offers a wide dynamic range and reduced arithmetic errors, but in many applications the dynamic range provided by the fixed-point representation is adequate.
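The fixed-point multiply-accumulate described here can be imitated in software. The following is a minimal sketch, assuming 16-bit operands in Q15 fractional format and a 32-bit product in Q30 format; overflow and saturation of the accumulator are not modelled, and all function names are hypothetical.

```python
def to_q15(x: float) -> int:
    """Convert a fraction in [-1, 1) to a 16-bit Q15 integer, with clamping."""
    return max(-32768, min(32767, int(round(x * 32768))))


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate step: 16 x 16 -> 32-bit product added to acc."""
    product = a * b  # product of two Q15 numbers is in Q30 format
    return acc + product


def fir_output(coeffs: list, samples: list) -> int:
    """Sum of products for one FIR output sample (Q30 result)."""
    acc = 0
    for c, s in zip(coeffs, samples):
        acc = mac(acc, c, s)
    return acc


def q30_to_float(acc: int) -> float:
    """Interpret a Q30 accumulator value as a real number."""
    return acc / (1 << 30)


if __name__ == "__main__":
    coeffs = [to_q15(0.5), to_q15(0.25)]
    samples = [to_q15(0.5), to_q15(0.5)]
    print(q30_to_float(fir_output(coeffs, samples)))
```

On a DSP each pass through the loop body is a single-cycle hardware operation, which is why an N-tap FIR filter costs roughly N cycles per output sample.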
  • 70.
    First generation – Fixed-point processor
  • 71.
    Fixed-Point Digital Signal Processors
    • Key features of the four generations of fixed-point DSP processors from four leading semiconductor manufacturers.
    • Basic architecture of the first-generation fixed-point DSP processor, the TMS320C1x by Texas Instruments.
    • Dedicated arithmetic units: a multiplier and an accumulator.
    • The processor family uses a modified Harvard architecture with two separate memory spaces for programs and data.
    • On-chip memory and special instructions for execution.
  • 72.
    • Has three separate address spaces for program memory, data memory and I/O.
    • Two 16-bit auxiliary registers (AR0-AR1).
    • The contents of the auxiliary registers can be saved in and loaded from data memory with SAR and LAR.
    • Provides 144/256 words of 16-bit on-chip data RAM.
    • 1.5K/4K words of program ROM/EPROM.
  • 73.
    Second generation – Fixed-point processor
  • 74.
    • Second-generation fixed-point DSPs have enhanced features compared to the first generation.
    • Much larger on-chip memories and more special instructions to support efficient execution of DSP algorithms.
    • Computational performance is 4 to 6 times that of the first generation.
    • Special instructions for DSP operations (see figure) include a multiply-and-accumulate with data move instruction and a repeat instruction, which together execute an FIR filter with a time saving.
    • The second generation provides more on-chip memory.
  • 75.
    • TMS320C2x: modified Harvard architecture for speed and flexibility.
    • The 32-bit ALU and accumulator perform a wide range of arithmetic and logical instructions.
    • Separate program and data memory spaces, each with a 16-bit address bus and on-chip data buses.
    • 16 x 16-bit hardware multiplier capable of computing a signed or unsigned 32-bit product in a single machine cycle.
  • 76.
    • Six registers:
      – A serial port receive register
      – A serial port transmit register
      – A timer register
      – A period register
      – An interrupt mask register
      – A memory allocation register
    • The TMS320C2x allows flexible configurations:
      – A stand-alone processor
      – A multiprocessor with devices in parallel
      – A slave/host multiprocessor with global memory space
      – A peripheral processor interfaced via processor-controlled signals to another device
  • 77.
    Second generation – Motorola DSP56002
  • 78.
    Third generation – Fixed-point processor
  • 79.
    • Third-generation fixed-point DSPs are enhancements of second-generation DSPs.
    • Performance enhancement is achieved by increasing and/or making more effective use of the available on-chip resources.
    • More data paths, wider data paths, larger on-chip memory and instruction cache, and dual MAC.
    • Third-generation DSPs are 2 to 3 times faster than second-generation DSPs.
    • Examples: Texas Instruments TMS320C3x, TMS320C54x
  • 80.
    • The third-generation TMS320C3x executes 60 million floating-point operations per second.
    • On-chip parallelism in the processor: 11 operations in a single instruction.
    • High performance
      – Performs parallel multiply and arithmetic unit operations on integer or floating-point data in a single cycle
      – General-purpose register file
      – Large on-chip memory
      – High degree of parallelism
      – Direct memory access controller
  • 81.
    Fourth generation – Fixed-point processors
    Diagram taken from ‘Digital Signal Processing’, Emmanuel Ifeachor & Barrie W. Jervis, second edition.
  • 82.
    • Fourth-generation fixed-point DSP processors target multichannel applications:
      – Digital subscriber loop
      – Remote access server modems
      – Wireless base stations
      – 3G mobile systems
      – Medical imaging
    • They use the VLIW architecture:
      – Wider instruction word
      – Wider data paths
      – More registers
      – Larger instruction cache
      – Multiple arithmetic units
  • 83.
    • The core processor has two independent arithmetic paths, each with four execution units:
      – Logic unit (Li)
      – Shifter/logic unit (Si)
      – Multiplier (Mi)
      – Data address unit (Di)
    • The core processor fetches eight 32-bit instructions at a time, giving an instruction word width of 256 bits.
    • Executes 8 instructions in parallel in one cycle.
    • Large program and data memory.
    • Advantage of the VLIW architecture: high computational performance.
  • 84.
    Floating Point Representation
    • First generation: TMS320C3x
    • Second generation: TMS320C4x
    • Third generation: TMS320C67x
  • 85.
    Floating Point Representation
    • TMS320C6x: VelociTI architecture, the first DSP to use an advanced VLIW architecture.
    • Excellent choice for multiple-execution and multifunction applications.
    • VelociTI architecture: reduced code size, flexibility of code and data types, and zero overhead in branching.
    • TMS320C62x, C64x: fixed-point processors.
    • TMS320C67x: floating-point processor with 32 general-purpose registers, each 32 bits wide.
  • 86.
    Features of TMS320C6x Processors
    • Advanced VLIW CPU with 8 functional units (2 multipliers and 6 ALUs)
    • Executes 8 instructions per cycle
    • Instruction packing reduces code size, program fetches and power consumption
    • Conditional execution of all instructions
    • Efficient code execution on independent functional units
    • Supports 8/16/32-bit data formats
    • 40-bit arithmetic, saturation and normalization operations
    • Field manipulation instructions: extract, set, clear and bit-counting operations
    • Supports single-precision (32-bit) and double-precision (64-bit) IEEE floating-point operations
    • 32 x 32-bit integer multiplication with 32-bit or 64-bit results
  • 87.
    Floating Point Representation: Internal Architecture
    Diagram taken from ‘Digital Signal Processing’, Emmanuel Ifeachor & Barrie W. Jervis, second edition.
  • 88.
    • The C6x contains a 32-bit CPU, on-chip program memory, on-chip data memory and on-chip peripherals.
    • Peripherals include:
      – External memory interface (EMIF)
      – Direct memory access (DMA) controller
      – Timers
      – Multichannel buffered serial ports (McBSP)
      – Host port interface (HPI)
      – Power-down logic
  • 89.
    CPU of the TMS320C6x
    Diagram taken from ‘Digital Signal Processing’, Emmanuel Ifeachor & Barrie W. Jervis, second edition.
  • 90.
    • The CPU contains:
      – Program fetch unit
      – Instruction dispatch unit
      – Instruction decode unit
      – Two data paths, each with 4 functional units
      – A register file for each data path
      – Control registers
      – Control logic
      – Test, emulation and interrupt logic
    • Each functional unit accepts one 32-bit instruction at a time (the instruction packet size is 256 bits).
    • The program fetch unit generates the address of eight instructions and sends it to program memory for each fetch packet; once fetched, the CPU receives the packet.
  • 91.
    • The instruction dispatch unit receives the fetch packet and splits it into execute packets.
    • Instructions in the execute packets are assigned to the appropriate functional units among the eight in the data paths.
    • In instruction decode, the source registers, destination registers and associated paths are decoded for execution of the instruction in a functional unit.
    • Instructions are executed by the functional units.
    • Register files A and B hold 32 registers of 32 bits in total (16 registers for each data path).
    • 8 functional units: 6 ALUs and 2 multipliers (.L1, .L2, .S1, .S2, .M1, .M2, .D1, .D2)
  • 92.
    Diagram taken from ‘Digital Signal Processing’, Emmanuel Ifeachor & Barrie W. Jervis, second edition.
  • 93.
    Functional units of C6x

    Unit       Type of floating-point operation
    .L unit    Arithmetic operations
    .S unit    Compare, square root and absolute value operations
    .M unit    32 x 32-bit fixed-point multiply operations and floating-point multiply operations
    .D unit    Load double word with 5-bit constant offset
  • 94.
    • Data paths
      – Register file data path
      – Register file cross path
      – Register file memory access path
    • Control register file
      – 10 control registers
      – Only .S2 can read and write the control registers
      – Accessed by MVC (move between control and register file)
      – Addressing mode register
      – Control status register
      – Program counter
      – Interrupt flag, set, clear and enable registers
      – Interrupt return pointer
      – Non-maskable interrupt return pointer
  • 95.
    Visual Quiz
    • The feature in which a P-DSP is superior to an advanced microprocessor is
      A. Low cost  B. Low power  C. Computational speed  D. Real-time I/O capability
    • The addressing mode convenient for FFT computation is
      A. Indirect addressing  B. Circular mode  C. Bit-reversed addressing  D. Memory-mapped addressing
  • 96.
    • The results of operations performed in the CALU are stored in
      A. ACC  B. ACCB  C. TREG0  D. PREG
    • The ........ permits execution of logical operations on data without affecting the contents of the ALU
      A. PLU  B. Auxiliary ALU  C. CALU  D. Memory-mapped addressing
  • 97.
    • The registers used for indirect addressing of memory are
      A. ARs  B. Block move address register  C. TREG  D. Index register
    • The indirect addressing symbol that leaves the content of the AR unaltered is
      A. *  B. *0+  C. #  D. *+
  • 98.
    • The C6x processor is based on the .......... architecture
      A. Modified Harvard  B. VelociTI  C. Advanced Harvard  D. DaVinci
    • The functional unit used for 32/40-bit shift operations is
      A. .L  B. .S  C. .M  D. .D
  • 99.
    • The number of registers in the C62x and C67x CPU register file is
      A. 16  B. 32  C. 40  D. 64
    • The floating-point devices among the C6x processors are
      A. C62x  B. C67x  C. C64x  D. C62x and C74x