DSP Processor


Published in: Technology, Business

  1. 2nd Half of Unit 6: DSP Processor. By Prof. L. S. Kalkonde, Department of Electronics & Telecommunication, Prof. Ram Meghe College of Engineering & Technology, Badnera
  2. Digital Signal Processor: Definition
     A digital signal processor (DSP) is an integrated circuit designed for high-speed data manipulation. It is used in audio, communications, image manipulation, and other data-acquisition and data-control applications.
  3. How Digital Signal Processing Works
     1. To understand how digital signal processing works, you must first understand the difference between analog and digital signals.
     2. Analog signals, such as sound intensity, pressure, and light intensity, are continuously variable.
     3. Each of our senses is sensitive to a different kind of analog signal.
     4. Our ears are sensitive to sound, our eyes to light, and so on.
     5. Once we receive a signal, our sensory organs convert it to an electrical signal and send it to our analog computer (the brain).
  4. How Digital Signal Processing Works
     6. Our brains are very powerful parallel computers whose performance is currently unmatched by any digital computer.
     7. Our brains not only analyze the information received but also make decisions using this data.
     8. Digital signals are those transmitted within or between computers, in which information is represented by discrete states.
  5. How Analog and Digital Signals Work Together
  6. How Analog and Digital Signals Work Together
     Digital technology, such as the personal computer (PC), assists us in many ways: writing documents, spell checking, and drawing. Unfortunately, the world is analog, and electronic analog computers are not as versatile as digital computers. To make use of the processing power that digital technology offers, we must do the following:
     • Convert the analog signals into electrical signals using a transducer (such as a microphone, as shown in the diagram).
     • Digitize these signals, i.e., convert them from analog to digital using an analog-to-digital converter (ADC), as shown in the diagram.
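The digitizing step can be illustrated with a minimal Python sketch. It assumes an idealized ADC with a hypothetical 8-bit resolution and a 0 to 1 V input range, sampling a 1 kHz tone at 8 kHz. None of these figures come from the slides, and the function names are illustrative only.

```python
import math

def adc_sample(voltage, v_ref=1.0, bits=8):
    """Idealized ADC: map a voltage in [0, v_ref] to an integer code."""
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)   # out-of-range inputs clip
    code = int(clamped / v_ref * levels)
    return min(code, levels - 1)              # full scale maps to the top code

def digitize(signal, sample_rate, n_samples, **adc_kwargs):
    """Sample a continuous-time function at fixed intervals and quantize it."""
    return [adc_sample(signal(n / sample_rate), **adc_kwargs)
            for n in range(n_samples)]

def tone(t):
    """Assumed transducer output: a 1 kHz sine centered at 0.5 V."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * 1000 * t)

codes = digitize(tone, sample_rate=8000, n_samples=8)
print(codes)
```

Each entry in `codes` is one discrete state standing in for the continuously variable voltage at that sampling instant.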
  7. Why Do We Need Digital Signal Processors?
  8. Why Do We Need Digital Signal Processors?
     Add and Subtract: General-purpose microprocessors perform add and subtract operations in a single clock cycle or very few clock cycles. Binary addition works like decimal addition. The example shows adding 1 and 2; the result is decimal 3.
     Multiply and Divide: These operations are more complex. A digital multiply consists of a series of shift and add operations. The example shows the multiplication of 3 and 5. General-purpose microprocessors are quite slow at multiply and divide: they typically execute a series of shift, add, and subtract operations sequentially from microcode, so a single multiply may take many cycles. The DSP performs a multiplication in a single cycle by implementing all the shift and add operations in parallel.
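The slide's two examples (1 plus 2, and 3 times 5) can be reproduced with a small Python sketch of sequential shift-and-add multiplication. The function name and the step counter are illustrative assumptions. The loop performs one step per bit of the multiplier, which is the work a hardware multiplier does in parallel.

```python
def shift_add_multiply(a, b):
    """Multiply two non-negative integers using only shifts and adds.

    Returns (product, steps), where steps counts loop iterations,
    one per significant bit of b.
    """
    product = 0
    steps = 0
    while b:
        if b & 1:          # lowest bit of b set: add the shifted a
            product += a
        a <<= 1            # shift multiplicand left
        b >>= 1            # move to the next multiplier bit
        steps += 1
    return product, steps

# Binary addition of 1 and 2.
print(format(1, "04b"), "+", format(2, "04b"), "=", format(1 + 2, "04b"))

# Multiplication of 3 and 5 as a sequence of shifts and adds.
product, steps = shift_add_multiply(3, 5)
print(product, steps)
```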
  9. What’s Inside DSP (Elements of DSP)
  10. • Program Memory: Stores the programs the DSP uses to process data.
      • Data Memory: Stores the information to be processed.
      • Compute Engine: Performs the math processing, accessing the program from the Program Memory and the data from the Data Memory.
      • Input/Output: Serves a range of functions to connect to the outside world.
  11. Types of Architecture
      • Harvard Architecture
      • Von Neumann Architecture
      • Super/Modified Harvard Architecture
  12. Von Neumann Architecture (diagram): the CPU connects to a single memory holding both instructions and data through one address bus and one data bus.
  13. Harvard Architecture (diagram): the CPU connects to a separate program memory and data memory, each through its own address bus and data bus.
  14. Which Architecture is Best Suited for DSP?
      1. Common general-purpose personal computers use processors designed with the von Neumann architecture, while the Harvard architecture is more commonly used in specialized microprocessors for real-time and embedded applications.
      2. DSPs typically use the Harvard architecture, although von Neumann DSPs also exist.
      3. Many signal and image processing applications require fast, real-time machines.
      4. The drawback of a true Harvard architecture is that, because it uses separate program and data memories, it needs twice as many address and data pins on the chip and twice as much external memory. Unfortunately, as the number of pins or chips increases, so does the price.
  15. Which Architecture is Best Suited for DSP?
      An elegant solution:
      • A single data and address bus is used externally.
      • Two (or more) separate buses for program and data are used internally.
      • Timing (multiplexing) handles the separation of program and data information: in one clock cycle the program information flows on the pins, and in the second cycle the data follows on the same pins.
      • Program and data information is then routed onto separate internal program and data buses.
      Such machines are called modified Harvard architecture processors because the internal architecture is Harvard while the external architecture is von Neumann. They also provide multiple internal RAM/ROM cells for high-use instructions and data.
  16. Fixed vs. Floating Point

      | Characteristic      | Floating point (32-bit) | Fixed point (16-bit) |
      |---------------------|-------------------------|----------------------|
      | Dynamic range       | much larger             | smaller              |
      | Resolution          | comparable              | comparable           |
      | Ease of programming | much easier             | more difficult       |
      | Compiler efficiency | more efficient          | less efficient       |
      | Power consumption   | comparable              | comparable           |
      | Chip cost           | comparable              | comparable           |
      | System cost         | comparable              | comparable           |
      | Design cost         | less                    | more                 |
      | Time to market      | faster                  | slower               |
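The dynamic-range row can be made concrete with a short Python sketch. It assumes the 16-bit fixed-point format is Q15 (one sign bit, 15 fractional bits) and that the 32-bit floating-point format is IEEE 754 single precision. The slides name neither format, so both are assumptions, and the helper names are illustrative.

```python
import math

def to_q15(x):
    """Convert a float in [-1, 1) to a 16-bit Q15 integer, saturating on overflow."""
    scaled = int(round(x * 32768))
    return max(-32768, min(32767, scaled))

def from_q15(q):
    """Convert a Q15 integer back to a float."""
    return q / 32768

def dynamic_range_db(largest, smallest):
    """Ratio of largest to smallest representable magnitude, in decibels."""
    return 20 * math.log10(largest / smallest)

# 16-bit fixed point: magnitudes from 1 LSB up to 32767 LSBs.
fixed_db = dynamic_range_db(32767, 1)

# IEEE 754 single precision, normalized values only (assumed format).
float_db = dynamic_range_db(3.4028235e38, 2.0 ** -126)

print(round(fixed_db, 1), round(float_db, 1))
print(to_q15(0.5), to_q15(1.0), from_q15(to_q15(0.5)))
```

The saturation in `to_q15` hints at why fixed-point programming is listed as more difficult: the programmer has to scale signals so that they stay inside the representable range.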
  17. TMS320 Family
      16-Bit Fixed Point Devices
      • C5x: Voice Processing
      • C54x: Digital Cellular Phones
      32-Bit Floating Point Devices
      • C6x: Advanced VLIW Processor, Wireless Base Stations/Pooled Modems
  18. TMS320C54XX
  19. Features of TMS320C54XX
      • 16-bit CPU
      • Can execute 40 to 120 million instructions per second (MIPS)
      • 17 × 17-bit MAC
      • 64K × 16-bit physical program memory address space
      • 64K × 16-bit external data memory address space
      • 64K × 16-bit external I/O address space
      • Programmable timer and PLL
      • DMA interface
      • 100/128/144-pin TQFP and BGA packages
  20. Functional Units
      • 40-bit ALU
      • Two 40-bit accumulators, ACCA and ACCB
      • Barrel shifter
      • 17 × 17-bit multiplier
      • 40-bit adder
      • CSSU: compare, select, and store unit
      • Exponent encoder
      • Data address generation unit
      • Program address generation unit
  21. TMS320C54XX
      • Uses an advanced, modified Harvard architecture
      • Maximizes processing power by providing 4 pairs of buses
      Bus structure: 3 pairs for data memory, 1 pair for program memory
  22. ALU
      • 40-bit ALU
      • Performs a wide range of arithmetic and logic operations in a single clock cycle.
      • After an ALU operation, the result goes to either:
        – an accumulator, or
        – memory
  23. Accumulators
      • Two 40-bit accumulators, ACCA and ACCB
      • Store the results of the ALU and the multiplier/adder.
      • Temporary storage for other.
  24. Barrel Shifter
      • The barrel shifter can produce a left shift of 0 to 31 bits and a right shift of 0 to 16 bits on the input data.
      • The shift requirements are defined in:
        – the shift count field of the instruction,
        – the shift count field (ASM) of status register ST1, or
        – the temporary register T.
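The shift range described above can be modeled with a minimal Python sketch. It assumes a signed shift count (positive for left, negative for right), a 40-bit result width, and simple wrap-around on overflow. These are simplifying assumptions for illustration, not a description of the device's sign-extension or overflow modes.

```python
ACC_BITS = 40  # assumed result width, matching the 40-bit accumulators

def wrap_signed(value, bits=ACC_BITS):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value

def barrel_shift(value, count):
    """Shift by `count` places in one step.

    count 0..31 shifts left; count -1..-16 is an arithmetic right shift.
    """
    if not -16 <= count <= 31:
        raise ValueError("shift count out of range")
    if count >= 0:
        return wrap_signed(value << count)
    return wrap_signed(value >> -count)   # Python's >> keeps the sign

print(hex(barrel_shift(0x1234, 16)))   # move a 16-bit word into the high half
print(barrel_shift(-4, -1))            # arithmetic right shift of a negative value
```

A barrel shifter completes any of these shifts in a single operation, regardless of the count, rather than shifting one bit per cycle.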
  25. Multiplier/Adder Unit
      • The multiplier/adder block consists of several elements:
        – a multiplier,
        – an adder,
        – signed/unsigned input control logic,
        – fractional control logic,
        – a zero detector,
        – a rounder,
        – overflow/saturation logic, and
        – a 16-bit temporary storage register (T).
      • The multiplier/adder unit performs a 17 × 17-bit 2s-complement multiplication with a 40-bit addition in a single instruction cycle.
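A multiply-accumulate (MAC) loop shows why the accumulator is wider than the 32-bit product. The Python sketch below assumes signed 16-bit operands and clamps the running sum at 40 bits. The clamping rule and the function names are simplifying assumptions, not a description of the device's overflow/saturation logic.

```python
def saturate(value, bits):
    """Clamp value to the range of a signed `bits`-bit integer."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def mac(acc, x, y):
    """One multiply-accumulate step: acc + x * y, kept within 40 bits."""
    product = x * y                  # 16-bit x 16-bit fits in 32 bits
    return saturate(acc + product, 40)

def dot_product(xs, ys):
    """Sum of products, the core operation of FIR filters and correlations."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)         # one MAC per tap
    return acc

# 200 worst-case products overflow 32 bits but still fit in 40 bits.
total = dot_product([32767] * 200, [32767] * 200)
print(total, total > 2 ** 31 - 1, total < 2 ** 39)
```

On the processor each call to `mac` corresponds to one instruction cycle, so an N-tap sum of products costs roughly N cycles.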
  26. CSSU: Compare, Select, and Store Unit
      • The compare, select, and store unit (CSSU) performs maximum comparisons between the accumulator’s high and low words.
      • It allows both the test/control flag bit (TC) in status register ST0 and the transition register (TRN) to keep their transition histories.
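The compare-select-store idea can be sketched in Python as follows. The sketch treats the low 32 bits of the accumulator as two signed 16-bit words, keeps the larger one, and shifts a decision bit into a 16-bit history register. The bit convention (0 when the high word wins, 1 otherwise) and the function names are assumptions made for illustration.

```python
def to_signed16(word):
    """Interpret the low 16 bits of word as a two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def compare_select_store(acc, trn):
    """Compare the accumulator's high and low words and keep the maximum.

    Returns (selected_word, new_trn, tc): the larger word, the history
    register with the decision bit shifted in, and the decision flag.
    """
    high = to_signed16(acc >> 16)
    low = to_signed16(acc)
    if high > low:
        selected, decision = high, 0
    else:
        selected, decision = low, 1
    new_trn = ((trn << 1) | decision) & 0xFFFF   # 16-bit decision history
    return selected, new_trn, decision

trn = 0
for acc in [(5 << 16) | 3, (2 << 16) | 9, (7 << 16) | 7]:
    selected, trn, tc = compare_select_store(acc, trn)
    print(selected, format(trn, "016b"), tc)
```

Keeping a history of decisions in this way is what makes the unit useful for algorithms, such as Viterbi decoding, that trace back through a sequence of earlier comparisons.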
  27. Exponent Encoder
      • Implementing floating-point arithmetic on a fixed-point processor requires separating the exponent and the mantissa of the floating-point data.
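One way to picture this separation is normalization: count the redundant sign bits of a value, then shift them out so the mantissa uses the full word. The Python sketch below assumes a 32-bit word. The width and the function names are illustrative assumptions, not the device's exact encoding.

```python
def redundant_sign_bits(value, bits=32):
    """Count the bits below the sign bit that merely repeat it."""
    if value < 0:
        value = ~value               # leading ones become leading zeros
    return (bits - 1) - value.bit_length()

def normalize(value, bits=32):
    """Return (mantissa, shift) such that value == mantissa >> shift.

    The mantissa is the value shifted left until no redundant
    sign bits remain; shift plays the role of the exponent.
    """
    shift = redundant_sign_bits(value, bits)
    return value << shift, shift

for v in (3, -5, 0x40000000):
    mantissa, shift = normalize(v)
    print(v, hex(mantissa & 0xFFFFFFFF), shift)
```

Two normalized operands can then be multiplied as ordinary fixed-point mantissas while their shift counts are added separately.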
  28. Data Address Generation Unit
      • Two auxiliary register arithmetic units, ARAU0 and ARAU1, generate addresses for the indirect addressing modes: increment, decrement, indexing, bit-reversed, and circular addressing.
      • Eight auxiliary registers, AR0 to AR7, allow two data memory addresses to be generated simultaneously.
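Two of the addressing modes listed above, circular and bit-reversed, can be sketched in a few lines of Python. The sketch uses plain modulo arithmetic on an index. Buffer alignment, the registers involved, and the function names are not taken from the slides and are simplifying assumptions.

```python
def circular_next(index, step, size):
    """Advance an index by step, wrapping around a buffer of `size` entries."""
    return (index + step) % size

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (used to reorder FFT data)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

# Walking an 8-entry delay line: the index wraps from 7 back to 0.
idx = 6
visited = []
for _ in range(4):
    visited.append(idx)
    idx = circular_next(idx, 1, 8)
print(visited)

# Bit-reversed order for an 8-point FFT.
print([bit_reverse(i, 3) for i in range(8)])
```

When the address unit does this arithmetic in hardware, the wrap-around and reordering cost no extra instructions inside a filter or FFT loop.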
  29. JTAG: Joint Test Action Group
      JTAG, as defined by IEEE Std 1149.1, is an integrated method for testing interconnects on printed circuit boards (PCBs) that is implemented at the integrated circuit (IC) level. The JTAG test architecture provides a means to test interconnects between integrated circuits on a board without using physical test probes.
      Potential benefits of JTAG:
      • Shorter test times
      • Higher test coverage
      • Increased diagnostic capability
      • Lower capital equipment cost
  30. Instruction Pipelining in TMS320C54X Processors
      1. Program Prefetch: PAB is loaded with the address of the next instruction to be fetched.
      2. Program Fetch: The opcode is fetched from PB and loaded into the instruction register.
      3. Decode: The opcode is decoded to determine the access operation.
  31. Instruction Pipelining in TMS320C54X Processors
      4. Access: The operand address is loaded onto the data address bus (DAB). If a second operand is required, another address is loaded onto CAB.
      5. Read: The operands are read from the buses DB and CB.
      6. Execute: The task specified by the instruction is performed.
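The overlap between the six stages can be visualized with a small Python sketch of an ideal pipeline. It assumes one instruction enters per cycle and no stalls or conflicts occur, which is an idealization. The function names are illustrative.

```python
STAGES = ["Prefetch", "Fetch", "Decode", "Access", "Read", "Execute"]

def pipeline_schedule(n_instructions):
    """Map each cycle to the (instruction, stage) pairs active in it."""
    schedule = {}
    for instr in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            schedule.setdefault(instr + offset, []).append((instr, stage))
    return schedule

def total_cycles(n_instructions):
    """Cycles to finish n instructions in the ideal, stall-free case."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1

for cycle, active in sorted(pipeline_schedule(4).items()):
    print(cycle, active)
print(total_cycles(10))
```

Once the pipeline is full, one instruction completes per cycle even though each instruction individually takes six cycles, which is how the single-cycle throughput figures for the processor arise.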
  32. DSP Processor vs. General-Purpose Processor (GPP)

      | Sr no | Parameter | DSP Processor | GPP Processor |
      |-------|-----------|---------------|---------------|
      | 1 | Instruction cycle | Single cycle (i.e., true instruction cycle) | Multiple instruction cycles for one instruction |
      | 2 | Instruction execution | Parallel execution is possible | Execution is always sequential |
      | 3 | Operand fetched from memory | Multiple operands are fetched simultaneously | Operands are fetched sequentially |
      | 4 | Memories | Separate program memory and data memory are present | Normally no such separate memories are present |
      | 5 | On-chip/off-chip memories | Program memory and data memory are present on-chip and expandable off-chip | Normally on-chip cache memory is present. Main memory is off-chip. |
      | 6 | Address generation | Addresses are generated jointly by DAGs and the program sequencer | Program counter is incremented sequentially to generate addresses |
      | 7 | Address/data bus multiplexing | Address and data buses are not multiplexed. They are separate on chip as well as off chip. | Address/data buses can be separate on the chip but are usually multiplexed off-chip |
      | 8 | Computational units | Three separate computational units: ALU, MAC, and shifter | ALU is the main computational unit |
      | 9 | Suitable for | Array processing operations | General-purpose processing |
      | 10 | Queuing/Pipelining | Queuing is implemented through the instruction register and instruction cache | Queuing is performed explicitly by a queuing register for pipelining of instructions |