
Introduction to Blackfin BF532 DSP


Published in: Education, Technology

  2. 2. Agenda
     • Introduction to DSP
     • Introduction to Blackfin family
     • Getting Started with VisualDSP++ 5.0
     • Programming and Optimizing C on Blackfin
     • Examples
  3. 3. What is DSP
     What is a DSP? In brief, DSPs are processors or microcomputers whose hardware, software, and instruction sets are optimized for high-speed numeric processing, which is essential for processing digital data that represents analog signals in real time.
     What a DSP does is straightforward. When acting as a digital filter, for example, the DSP receives digital values based on samples of a signal, calculates the results of a filter function operating on these values, and provides digital values that represent the filter output. The DSP’s high-speed arithmetic and logical hardware is programmed to rapidly execute algorithms modeling the filter transformation.
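The digital-filter behaviour described above (take a sample in, apply a filter function over recent samples, put a sample out) can be sketched in portable C. This is only an illustration: the 4-tap moving-average coefficients and the names `fir_state` and `fir_process_sample` are assumptions made for the sketch, not something taken from the slides, and floating point is used purely for readability.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4-tap moving-average filter; the coefficients are
   illustrative only. */
#define FIR_TAPS 4

typedef struct {
    float history[FIR_TAPS];   /* recent input samples, newest first */
} fir_state;

static const float fir_coeffs[FIR_TAPS] = { 0.25f, 0.25f, 0.25f, 0.25f };

/* Accept one input sample and return one filtered output sample. */
float fir_process_sample(fir_state *s, float x)
{
    assert(s != NULL);

    /* Shift the history by one position and store the new sample. */
    for (size_t i = FIR_TAPS - 1; i > 0; --i)
        s->history[i] = s->history[i - 1];
    s->history[0] = x;

    /* Multiply-accumulate over all taps. */
    float acc = 0.0f;
    for (size_t i = 0; i < FIR_TAPS; ++i)
        acc += fir_coeffs[i] * s->history[i];
    return acc;
}
```

Feeding a constant input of 4.0 into a zero-initialised `fir_state` gives 1.0, 2.0, 3.0 and then 4.0 as the history fills up, which is the expected step response of a moving average.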
  4. 4. EX
  5. 5. Key Difference
     The combination of design elements (arithmetic operators, memory handling, instruction set, parallelism, data addressing) that provide this ability forms the key difference between DSPs and other kinds of processors.
     The real-time signal comes to the DSP as a train of individual samples from an analog-to-digital converter (ADC). To do filtering in real time, the DSP must complete all the calculations and operations required for processing each sample (usually updating a process involving many previous samples) before the next sample arrives. Performing high-order filtering of real-world signals that have significant frequency content therefore calls for very fast processors.
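The real-time constraint above can be turned into a rough cycle budget: the number of core cycles between two samples is the core clock divided by the sample rate. The sketch below is a back-of-the-envelope check under stated assumptions; the function names are hypothetical, and the MACs-per-cycle figure is an optimistic input supplied by the caller that ignores loop, memory and interrupt overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles available between two consecutive samples. */
uint32_t cycles_per_sample(uint32_t core_clock_hz, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz != 0);
    return core_clock_hz / sample_rate_hz;
}

/* Returns 1 if a filter needing `taps` multiply-accumulates per sample
   fits in the budget, assuming `macs_per_cycle` MACs complete per cycle
   and nothing else costs any time. */
int filter_fits_budget(uint32_t core_clock_hz, uint32_t sample_rate_hz,
                       uint32_t taps, uint32_t macs_per_cycle)
{
    assert(macs_per_cycle != 0);
    uint32_t budget = cycles_per_sample(core_clock_hz, sample_rate_hz);
    uint32_t needed = (taps + macs_per_cycle - 1) / macs_per_cycle; /* round up */
    return needed <= budget;
}
```

For example, a 400 MHz core sampling at 48 kHz has 8333 cycles per sample, so a 1024-tap filter at two MACs per cycle (512 cycles) fits, while a 20000-tap filter (10000 cycles) does not.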
  6. 6. General features of DSP
     • Efficient ALU and MAC units (multiple)
     • Harvard or Super Harvard architecture
     • Extended precision in computational units
     • Hardware looping
     • Efficient and fast peripherals
     • Circular buffering
     • High speeds of operation
  7. 7. Continued….
     • Fast multipliers
     • Multiple execution units
     • Efficient memory access: Harvard and Super Harvard architectures
     • Data format: fixed point and floating point
     • Zero-overhead looping
     • Streaming I/O
     • Specialized instruction sets
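Circular buffering, listed among the features above, is something a DSP does in its addressing hardware. What it computes can be sketched in plain C as a delay line whose index wraps around. The names `delay_line`, `delay_push` and `delay_read` and the length of 8 are assumptions for this sketch; a power-of-two length is chosen only so the wrap can be done with a mask.

```c
#include <assert.h>
#include <stdint.h>

#define DELAY_LEN 8u   /* hypothetical length; must be a power of two */

typedef struct {
    int16_t  data[DELAY_LEN];
    uint32_t head;             /* index of the next slot to overwrite */
} delay_line;

/* Store a new sample, overwriting the oldest one. */
void delay_push(delay_line *d, int16_t x)
{
    d->data[d->head] = x;
    d->head = (d->head + 1u) & (DELAY_LEN - 1u);   /* wrap around */
}

/* Read the sample written `age` pushes ago (0 = most recent). */
int16_t delay_read(const delay_line *d, uint32_t age)
{
    assert(age < DELAY_LEN);
    uint32_t idx = (d->head + DELAY_LEN - 1u - age) & (DELAY_LEN - 1u);
    return d->data[idx];
}
```

No samples are ever moved: only the index advances, which is why hardware support for the wrap makes filter loops cheap.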
  8. 8. Introduction
     Blackfin processors embody a new type of 16/32-bit embedded processor designed specifically to meet the computational demands and power constraints of today’s embedded audio, video, automotive, industrial/instrumentation, and communications applications.
     Blackfin processors combine a 32-bit RISC instruction set, dual 16-bit multiply-accumulate (MAC) digital signal processing functionality, and 8-bit video processing performance.
  9. 9. Roadmap
  10. 10. Continued……………..
  11. 11. Characteristics of an Embedded Processor
  12. 12. BF531/2/3
  13. 13. BF532 Block Diagram
  14. 14. Features (ADSP-BF531/BF532/BF533 family; the “up to” figures are family maxima)
     • Up to 600 MHz high-performance processor
     • Two 16-bit MACs, two 40-bit ALUs, four 8-bit video ALUs, and a 40-bit shifter
     • 0.85 V to 1.30 V core VDD with on-chip voltage regulation
     • 1.8 V, 2.5 V, and 3.3 V compliant I/O
     • Up to 148K bytes of on-chip memory, which can be used as cache or SRAM and has both data and code banks
     • External memory controller with glueless support for SDRAM, SRAM, flash, and ROM
     • Multiple boot options from SPI and parallel flash
  15. 15. Peripherals and Units
     • Dynamic Power Management Unit
     • Direct Memory Access
     • SPI interface
     • Parallel Port Interface
     • Serial Port Controllers
     • UART
     • Programmable Flags
     • Timers and RTC
     • EBIU (External Bus Interface Unit)
  16. 16. Core
     The Blackfin processor core contains two 16-bit multipliers, two 40-bit accumulators, two 40-bit ALUs, four video ALUs, and a 40-bit shifter. The computation units process 8-bit, 16-bit, or 32-bit data from the register file.
     The compute register file contains eight 32-bit registers. When performing compute operations on 16-bit operand data, the register file operates as 16 independent 16-bit registers.
     The ALUs perform a traditional set of arithmetic and logical operations on 16-bit or 32-bit data.
  17. 17. Each MAC can perform a 16-bit by 16-bit multiply in each cycle, accumulating the results into the 40-bit accumulators. Signed and unsigned formats, rounding, and saturation are supported.
     The 40-bit shifter can perform shifts and rotates and is used to support normalization, field extract, and field deposit instructions.
     The program sequencer controls the flow of instruction execution, including instruction alignment and decoding. For program flow control, the sequencer supports PC relative and
  18. 18. Hardware is provided to support zero-overhead looping. The architecture is fully interlocked, meaning that the programmer need not manage the pipeline when executing instructions with data dependencies.
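The multiply-accumulate behaviour of the core described above (16-bit by 16-bit products summed into a 40-bit accumulator with saturation) can be emulated in portable C to see what the extra accumulator bits buy. This is a sketch under assumptions: it uses plain signed integer arithmetic, does not model the fractional or rounding modes, and the names `mac16_sat40` and `dot16` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Limits of a signed 40-bit accumulator. */
#define ACC40_MAX ((int64_t)0x7FFFFFFFFFLL)
#define ACC40_MIN (-ACC40_MAX - 1)

/* Return acc + a*b, saturated to the 40-bit range. */
int64_t mac16_sat40(int64_t acc, int16_t a, int16_t b)
{
    int64_t sum = acc + (int64_t)a * (int64_t)b;
    if (sum > ACC40_MAX) return ACC40_MAX;
    if (sum < ACC40_MIN) return ACC40_MIN;
    return sum;
}

/* Dot product of two 16-bit vectors using the emulated accumulator. */
int64_t dot16(const int16_t *x, const int16_t *y, int n)
{
    assert(x != 0 && y != 0 && n >= 0);
    int64_t acc = 0;
    for (int i = 0; i < n; ++i)
        acc = mac16_sat40(acc, x[i], y[i]);
    return acc;
}
```

A single 16-by-16 product needs up to 32 bits, so the 8 additional accumulator bits leave headroom for many products to be summed before saturation can occur.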
  19. 19. Operating Modes
     The architecture provides three modes of operation: user mode, supervisor mode, and emulation mode.
     User mode has restricted access to certain system resources, thus providing a protected software environment, while supervisor mode has unrestricted access to the system and core resources.
     Emulation mode is used for testing purposes only.
  20. 20. Core
  21. 21. Memory
  22. 22. Booting
     Booting is the process by which the processor itself loads its internal memories from external memory.
     The processor has two boot-mode pins, BMODE0 and BMODE1, so it supports four boot modes, which are shown in the side window.
  23. 23. Dynamic Power Management Unit
     The processor has three power domains:
     • VDDEXT (peripherals)
     • VDDINT (core)
     • VDDRTC (RTC)
     It has two clock domains: the peripherals run from SCLK and the core runs from CCLK.
     The processor has an internal PLL, so different operating frequencies can be obtained just by changing register values.
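How two clock domains can come out of one PLL is easiest to see as arithmetic. The sketch below assumes a simple model in which the PLL multiplies the input clock to a VCO frequency and two independent dividers produce CCLK and SCLK. The names `msel`, `core_div` and `sys_div`, and the input values used afterwards, are assumptions for illustration; the actual register layout and legal ranges are not given in the slides and should be taken from the hardware reference.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cclk_hz;   /* core clock */
    uint32_t sclk_hz;   /* system (peripheral) clock */
} clock_tree;

/* Assumed model: VCO = CLKIN * msel, CCLK = VCO / core_div,
   SCLK = VCO / sys_div. No range checking against device limits. */
clock_tree derive_clocks(uint32_t clkin_hz, uint32_t msel,
                         uint32_t core_div, uint32_t sys_div)
{
    assert(core_div != 0 && sys_div != 0);
    uint32_t vco_hz = clkin_hz * msel;
    clock_tree t = { vco_hz / core_div, vco_hz / sys_div };
    return t;
}
```

With a hypothetical 27 MHz input, a multiplier of 10, a core divider of 1 and a system divider of 5, this model gives a 270 MHz core clock and a 54 MHz system clock; changing only the three values changes both frequencies.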
  24. 24. The dynamic power management feature of the ADSP-BF531/ADSP-BF532/ADSP-BF533 processor allows both the processor’s input voltage (VDDINT) and clock frequency (fCCLK) to be dynamically controlled.
     Different applications require different clock speeds. When a lower clock speed is sufficient, VDDINT can be reduced along with it, thereby reducing overall power dissipation.
     Because of this feature, Blackfin processors are used in low-power applications.
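Why lowering both the clock and VDDINT pays off can be shown with the usual CMOS approximation that dynamic power scales with frequency times voltage squared. The sketch below computes that ratio only; it ignores static (leakage) power and time spent in each state, and the function name and the operating points used afterwards are assumptions, not figures from the slides.

```c
#include <assert.h>

/* Relative dynamic power after scaling, using P proportional to f * V^2.
   1.0 means no change; 0.25 means a quarter of the nominal power. */
double relative_dynamic_power(double f_reduced_hz, double f_nominal_hz,
                              double v_reduced, double v_nominal)
{
    assert(f_nominal_hz > 0.0 && v_nominal > 0.0);
    double f_ratio = f_reduced_hz / f_nominal_hz;
    double v_ratio = v_reduced / v_nominal;
    return f_ratio * v_ratio * v_ratio;
}
```

Halving the clock alone halves the dynamic power; halving the clock and also dropping from a hypothetical 1.2 V to 0.9 V leaves roughly 28% of it, because the voltage term enters squared.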
  25. 25. Different Modes Available
     Different applications require different power modes, and these modes offer different levels of power savings.
     Hibernate mode gives the highest power saving, whereas Full-On mode gives the least power saving but the most performance.
  26. 26. VisualDSP++ 5.0
  27. 27. Start UP
  28. 28. Creating the session
  29. 29. Selecting the Processor
  30. 30. Selecting Connection Type
  31. 31. Selecting the Platform
  32. 32. Completing the session
  33. 33. Selecting the session
  34. 34. Creating a Project
  35. 35. Project Information
  36. 36. Processor selection
  37. 37. Application Settings
  38. 38. Startup code/ldf
  39. 39. Completed
  40. 40. Project Layout
  41. 41. Adding Source Files to the Project
  42. 42. Contd….
  43. 43. Building and Running the Project
     Build the project by performing one of these actions:
     • Click the Build Project button, or
     • From the Project menu, choose Build Project, or
     • Click the Rebuild All button.
     When the build finishes, the message “Build completed successfully.” appears. The C source file opens in an editor window, and execution halts at main().
     Press F5 to run the project.
  44. 44. Changing the Project Options
  45. 45. Options
     1. Processor: BF532
     2. Type: Loader File
     3. Revision: Automatic
  46. 46. Loader file Settings
     Choose the boot mode as Flash/PROM, the boot format as ASCII, and the output width as 16-bit.
     Choose a folder for the output file.
     After changing the options, run Rebuild All again.
  47. 47. C Language
     Advantages:
     • C is much cheaper to develop (encourages experimentation).
     • C is much cheaper to maintain.
     • C is comparatively portable.
     Disadvantages:
     • ANSI C is not designed for DSP.
     • DSP processor designs usually expect assembly in key areas.
     • DSP applications continue to evolve (faster than ANSI Standard C).
  48. 48. Missing operations are provided by software emulation (floating point!).
     C is more machine-dependent than you might think. For example: is a “short” 16 or 32 bits?
     C can be a poor match for DSP: accumulators? SIMD? Fractions?
     C does not really have a mathematical focus; it is a systems programming language.
  49. 49. Increasing C Performance
     Performance tuning is the specialization of the program for the particular hardware.
     Work at the higher level first:
     • Improve the algorithm
     • Make sure that the algorithm suits the architecture
     Look at the machine capabilities:
     • It may have specialized instructions
  50. 50. Linear Profiling tools
     • Using compiler optimization (automatic compiler optimization)
     • Optimizing the algorithm for the hardware
     • Using the Pipeline Viewer
     • Using the supplied compiler libraries, which are routines already optimized for the hardware:
       • fractional built-ins
       • fract types fract16 and fract32
       • ETSI (European Telecommunications Standards Institute) fract functions
     Fractional arithmetic is quoted as about 100 times faster than floating-point arithmetic, which has to be emulated in software.
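What a 16-bit fractional type represents can be sketched in portable C. The example below emulates 1.15 fixed-point multiply and add with saturation using ordinary integers. It does not use the toolchain's own fract types or built-ins; the names `q15_t`, `q15_mul` and `q15_add` are hypothetical, and the multiply truncates rather than rounds, which may differ from what a particular built-in does.

```c
#include <stdint.h>

typedef int16_t q15_t;   /* hypothetical name: value = raw / 32768 */

/* Saturate a 32-bit intermediate to the 16-bit range. */
static q15_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (q15_t)v;
}

/* Multiply two 1.15 values: form the 32-bit product, then scale it
   back by 2^15 (truncating toward zero) and saturate. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return sat16(product / 32768);
}

/* Add two 1.15 values with saturation instead of wrap-around. */
q15_t q15_add(q15_t a, q15_t b)
{
    return sat16((int32_t)a + (int32_t)b);
}
```

Here 16384 stands for 0.5, so `q15_mul(16384, 16384)` yields 8192, that is 0.25. Everything is integer work, which is why fractional code maps well onto a fixed-point core.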
  51. 51. Arrays and Pointers
     Arrays are easier to analyse.
         void va_ind(int a[], int b[], int out[], int n) {
             int i;
             for (i = 0; i < n; ++i)
                 out[i] = a[i] + b[i];
         }
     Pointers are closer to the hardware.
         void va_ptr(int a[], int b[], int out[], int n) {
             int i;
             for (i = 0; i < n; ++i)
                 *out++ = *a++ + *b++;
         }
     Which produces the fastest code? Mostly there is no difference. Start with arrays; if the performance is not sufficient, use pointers.
  52. 52. Avoid Loop Carried dependencies
     Bad: scalar dependency.
         for (i = 0; i < n; ++i)
             x = a[i] - x;
     The value from the previous iteration is used, so iterations cannot be overlapped.
     Bad: array dependency.
         for (i = 0; i < n; ++i)
             a[i] = b[i] * a[c[i]];
     The value may come from a previous iteration, so iterations cannot be overlapped.
  53. 53. Avoid Loop Carried dependencies
     • Using hardware loops
     • Word align your data
       • 32-bit loads help keep compute units busy
       • 32-bit references must be at 4-byte boundaries
       • Top-level arrays are allocated on 4-byte boundaries
       • Only pass the address of the first element of arrays
       • Write loops that process input arrays an element at a time
  54. 54. Use of the keywords “volatile” and “const”
     volatile: essential for hardware or interrupt-related data. Such data is changed by the hardware rather than by the program, and the program only uses it.
     const: prevents accidental writes to memory whose contents must not be changed.
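The two qualifiers above can be shown in one small portable example. This is a sketch with hypothetical names: `on_sample_interrupt` stands in for an interrupt handler and is called directly here, whereas on real hardware it would be registered with the interrupt controller.

```c
#include <assert.h>
#include <stdint.h>

/* Written by the interrupt handler, read by the main program.
   `volatile` makes the compiler re-read it from memory on every access
   instead of caching it in a register. */
static volatile uint32_t sample_ready = 0;

/* Hypothetical interrupt handler. */
void on_sample_interrupt(void)
{
    sample_ready = 1;
}

/* Returns 1 and clears the flag if a sample is pending, otherwise 0. */
int take_sample_flag(void)
{
    if (sample_ready) {
        sample_ready = 0;
        return 1;
    }
    return 0;
}

/* `const` states that the table is only read; an attempt to write
   through `table` inside this function is rejected at compile time. */
int32_t sum_table(const int16_t *table, int n)
{
    assert(table != 0 && n >= 0);
    int32_t total = 0;
    for (int i = 0; i < n; ++i)
        total += table[i];
    return total;
}
```

Without `volatile`, an optimizing compiler is allowed to assume the flag never changes inside a polling loop and may hoist the read out of it.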
  55. 55. Further techniques
     • Use circular addressing
     • Use the keyword “asm”
     • Replace conditionals with min, max, and abs
     • Avoid jump statements
     • Avoid division: for division by a power of two, use a shift instead (a shift by 2 divides by 4)
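Two of the items above, replacing conditionals with min/max and replacing division with a shift, can be sketched as follows. The helper names are hypothetical. The clip is written so that a compiler can recognise the min/max pattern, and the shift example uses an unsigned operand because, for negative signed values, a shift and a division do not round the same way.

```c
#include <stdint.h>

/* Small helpers in the min/max form that compilers commonly map to
   single instructions when the target has them. */
static int32_t min32(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t max32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Clip x to the range [lo, hi] without an if/else chain. */
int32_t clip32(int32_t x, int32_t lo, int32_t hi)
{
    return max32(lo, min32(x, hi));
}

/* Divide an unsigned value by 4 using a right shift by 2. */
uint32_t div4_u32(uint32_t x)
{
    return x >> 2;
}
```

For instance, `clip32(500, -100, 100)` returns 100, and `div4_u32(23)` returns 5, the same as the integer division 23 / 4.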
  56. 56. Removing Conditionals
     Duplicate small loops rather than have a conditional in a small loop.
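The advice above can be illustrated by moving a loop-invariant test out of a small loop and duplicating the loop body. Both functions below are hypothetical examples that compute the same result; the second one tests `scale` once instead of on every iteration, which leaves two simple loop bodies.

```c
/* Conditional inside the loop: the test is repeated every iteration. */
void scale_or_copy_branchy(const int *in, int *out, int n, int scale, int gain)
{
    for (int i = 0; i < n; ++i) {
        if (scale)
            out[i] = in[i] * gain;
        else
            out[i] = in[i];
    }
}

/* Conditional hoisted: the small loop is duplicated, one copy per case. */
void scale_or_copy_split(const int *in, int *out, int n, int scale, int gain)
{
    if (scale) {
        for (int i = 0; i < n; ++i)
            out[i] = in[i] * gain;
    } else {
        for (int i = 0; i < n; ++i)
            out[i] = in[i];
    }
}
```

The cost is a little more code; the benefit is that each loop has a straight-line body, which is friendlier to hardware loops and to overlapping iterations.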
  57. 57. //Analog The Scientist and Engineers guide to //A very nice site which is having allDSP
  58. 58. Any Queries?