ARM Boards for DSP Applications

M. S. Ramaiah School of Advanced Studies
M. Sc. (Engg.) in Electronics System Design Engineering
GREESHMA S
CWB0913004, FT-2013
6th Module Presentation
Module code : ESE2511
Module name : Microcontrollers and Interfacing
Module leader: Mr. Nagananda S.N.
Presentation on : 07/05/2014

Overview
•Introduction
•ARM9E-S
•DM3730
•Functional Block Diagram
•Block Diagram
•Software Architecture
•Characteristics of DSP Processors
•Features of DM3730
•Representing a Digital Signal
•Addition and Subtraction of Fixed-Point Signals
•Multiplication and Division of Fixed-Point Signals
•Square Root of a Fixed-Point Signal
•DSP on the ARM9E
•DSP on the ARM10E
•FIR Filters
•IIR Filters
•The Discrete and Fast Fourier Transform
•Applications
•Conclusion
•References

Introduction
Emerging standards for algorithms in many application areas have put further demands on the ability of processing platforms to deliver efficient control capability
ARM's approach has been to design RISC core architectures with instruction sets that provide efficient support for particular applications, with an optimal balance between hardware and software implementation
To accelerate signal-processing algorithms, ARM adds new DSP instructions to the ARM instruction set
The ARM DSP extensions broaden the suitability of the ARM CPU family to applications that require intensive signal processing, while retaining the power and efficiency of a high-performance RISC microcontroller
The ARM DSP extensions have already been implemented in the ARM926EJ-S, ARM946E-S, ARM966E-S, and ARM9E-S

Introduction (Contd…)
Processing digitized signals requires high memory bandwidth and fast multiply-accumulate operations
Traditionally, a microcontroller handles the user interface and a separate DSP processor manipulates digitized signals such as audio
A single-core design can reduce cost and power consumption compared with a two-core solution
The ARMv5TE extensions available in the ARM9E and later cores provide efficient multiply-accumulate operations
DSP applications are typically multiply and load-store intensive
Filtering is the most commonly used signal-processing operation
Another very common algorithm is the Discrete Fourier Transform


ARM9E-S
The ARM9E-S core implements the ARMv5TE architecture
This includes an enhanced multiplier design for improved DSP performance
It is a 32-bit processor core
It offers high performance for very low power consumption and gate count
The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles
The reduced instruction set and related decode mechanism are much simpler than those of Complex Instruction Set Computer (CISC) designs
This simplicity gives
•a high instruction throughput
•an excellent real-time interrupt response
•a small, cost-effective processor macrocell

DM3730
Based on an enhanced device architecture
Integrated on TI's advanced 45-nm process technology
The device supports high-level operating systems (HLOS) and real-time operating systems (RTOS)
Fully backward compatible with the OMAP3530

Functional Block Diagram
Figure 1: DM3730 Functional Block Diagram

Block Diagram
Figure 2: DM3730 Block Diagram
Benefits
•2000 DMIPS for operating systems such as Linux, Windows CE, and RTOSes
•3-D graphics at up to 20M polygons per second for robust GUIs
•Backward compatible with the OMAP3530
Applications
•Smart connected devices
•Patient monitoring
•Media players

Software Architecture
Figure 3: Software Architecture of DM3730
Figure legend: industry-standard OS components, TI-provided components, open-source components

Characteristics of DSP Processors
Harvard architecture
High-performance multiply-accumulate (MAC) unit
Saturating math
SIMD instructions for parallel computation
Barrel shifters
Floating-point hardware

Features of DM3730
ARM microprocessor subsystem
Enhanced direct memory access (DMA) controller
Video hardware accelerators
Tile-based graphics architecture delivering up to 20 MPoly/s
Little-endian DSP instructions and data
NEON multimedia architecture
Load-store architecture with non-aligned access support
64 32-bit general-purpose registers
Six ALUs, each supporting single 32-bit, dual 16-bit, or quad 8-bit arithmetic per clock cycle

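Representing a Digital Signal
Figure 4: Digitizing an Analogue Signal
x is the signal and t is time
In an analogue signal x[t], the index t and the value x are both continuous real variables
A digital signal samples x at a finite set of times and stores each sample in a finite number of bits
DSP code on ARM typically uses a fixed-point representation for the sample values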
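As a minimal sketch of what a fixed-point representation looks like in practice, the C code below converts between real values and Q15, a common 16-bit format in which a value x is stored as the integer round(2^15 x). The function names q15_from_double and q15_to_double are hypothetical, and the choice of Q15 is an assumption made for illustration; the slides do not name a specific Q format.

```c
#include <stdint.h>

/* Convert a real value in roughly [-1, 1) to Q15 (hypothetical helper).
   Rounds to nearest and saturates at the ends of the int16_t range. */
int16_t q15_from_double(double x)
{
    double scaled = x * 32768.0;              /* scale by 2^15 */
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round to nearest */
    if (scaled > 32767.0)  return INT16_MAX;  /* +1.0 is not representable */
    if (scaled < -32768.0) return INT16_MIN;
    return (int16_t)scaled;
}

/* Convert a Q15 integer back to the real value it represents. */
double q15_to_double(int16_t X)
{
    return X / 32768.0;
}
```

For example, 0.5 is stored as 16384, and 1.0 saturates to 32767 because Q15 cannot represent +1.0 exactly.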

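Addition and Subtraction of Fixed-Point Signals
The general case is to convert the signal equation y[t] = x[t] + c[t] into fixed-point format, where X[t], C[t], and Y[t] are the Qn, Qm, and Qd representations of x[t], c[t], and y[t] (a Qk value is the real value scaled by 2^k and stored as an integer)
Fixed-point format: Y[t] = 2^(d−n) X[t] + 2^(d−m) C[t]
or in integer C: Y[t] = (X[t] << (d - n)) + (C[t] << (d - m)), where a negative left shift means a right shift
In DSP applications the usual choice is n = m = d. Therefore normal integer addition gives a fixed-point addition: Y[t] = X[t] + C[t]
Provided d = m or d = n, only one operand needs a shift, which the ARM barrel shifter can fold into the add instruction
Subtraction follows the same pattern with the sign of the second term reversed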
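The general addition case can be sketched in C as follows, assuming X is held in Qn format, C in Qm format, and the result is wanted in Qd format. The helper names shift_q and fixed_add are hypothetical. The sketch also assumes the shifted values fit in 32 bits and that right-shifting a negative integer is an arithmetic shift, which is true of common ARM compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Shift v left by s bits; a negative s means a rounded right shift by -s.
   Assumption: >> on a negative int32_t is an arithmetic shift. */
int32_t shift_q(int32_t v, int s)
{
    if (s >= 0)
        return v * (int32_t)(1 << s);       /* multiply avoids shifting a negative value left */
    return (v + (1 << (-s - 1))) >> -s;     /* add half an LSB, then shift right */
}

/* Y = 2^(d-n) X + 2^(d-m) C: add a Qn value and a Qm value, giving a Qd result. */
int32_t fixed_add(int32_t X, int n, int32_t C, int m, int d)
{
    return shift_q(X, d - n) + shift_q(C, d - m);
}
```

For example, adding 0.5 in Q15 (16384) to 0.25 in Q12 (1024) with a Q14 result gives 12288, which is 0.75 in Q14. When n = m = d both shifts are zero and the function reduces to a plain integer add.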

Addition and Subtraction of Fixed-Point Signals (Contd…)
There are four common ways to prevent overflow
•Ensure that the X[t] and C[t] representations have one bit of spare headroom each
•Use a larger container type for Y than for X and C
•Use a smaller Q representation for y[t]. For example, if d = n − 1 = m − 1, then the operation becomes Y[t] = (X[t] + C[t]) >> 1
•Use saturation
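Two of the overflow strategies above can be sketched in portable C for 16-bit samples. The function names sat_add16 and half_add16 are hypothetical. Both use a 32-bit intermediate so the sum itself cannot overflow. ARMv5TE cores also provide a QADD instruction for 32-bit saturating addition, but this sketch does not rely on it.

```c
#include <stdint.h>

/* Saturating add: clamp the result to the int16_t range instead of wrapping. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + b;            /* exact sum in a wider type */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Smaller Q representation for the result (d = n - 1 = m - 1):
   Y = (X + C) >> 1, so the result always fits in 16 bits.
   Assumption: >> on a negative value is an arithmetic shift. */
int16_t half_add16(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a + b) >> 1);
}
```

Saturation keeps the output format unchanged at the cost of clipping large results, while the smaller Q format never clips but loses one bit of precision.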

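Multiplication of Fixed-Point Signals
The signal equation is y[t] = x[t] c[t], with X, C, and Y held in Qn, Qm, and Qd format
Fixed-point format: Y[t] = 2^(d−n−m) X[t] C[t]
or in integer C: Y[t] = (X[t] * C[t]) >> (n + m - d)
Division of Fixed-Point Signals
The signal equation is y[t] = x[t] / c[t], with X, C, and Y held in Qn, Qm, and Qd format
Fixed-point format: Y[t] = 2^(d−n+m) X[t] / C[t]
or in integer C: Y[t] = (X[t] << (d - n + m)) / C[t]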
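A minimal C sketch of fixed-point multiplication and division follows, assuming X is in Qn format, C in Qm format, and the result in Qd format. The names fixed_mul and fixed_div are hypothetical. The sketch assumes n + m − d and d − n + m are both non-negative, that C is non-zero for division, and that right-shifting a negative value is an arithmetic shift. A 64-bit intermediate is used so the full product is kept before scaling.

```c
#include <stdint.h>

/* Y = (X * C) >> (n + m - d): the product of a Qn and a Qm value is
   Q(n+m), so shift right to bring it back to Qd. */
int32_t fixed_mul(int32_t X, int n, int32_t C, int m, int d)
{
    int64_t product = (int64_t)X * C;             /* exact Q(n+m) product */
    return (int32_t)(product >> (n + m - d));
}

/* Y = (X << (d - n + m)) / C: pre-scale the numerator so the quotient
   lands in Qd. Multiplication is used instead of << to avoid shifting
   a negative value left. */
int32_t fixed_div(int32_t X, int n, int32_t C, int m, int d)
{
    int64_t numerator = (int64_t)X * ((int64_t)1 << (d - n + m));
    return (int32_t)(numerator / C);
}
```

With all three values in Q15, 0.5 × 0.5 gives 8192 (0.25) and 0.25 / 0.5 gives 16384 (0.5).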

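Square Root of a Fixed-Point Signal
The signal equation is y[t] = sqrt(x[t]), with X and Y held in Qn and Qd format
Fixed-point format: Y[t] = sqrt(2^(2d−n) X[t])
or in integer C: Y[t] = isqrt(X[t] << (2*d - n)), where isqrt is an integer square root routine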
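The fixed-point square root can be sketched in C as an integer square root applied to a pre-scaled input. The names isqrt64 and fixed_sqrt are hypothetical, and the bit-by-bit integer square root shown here is one standard method chosen for illustration, not necessarily the routine the presentation had in mind. The sketch assumes 2d − n is non-negative and that the input is non-negative.

```c
#include <stdint.h>

/* Integer square root: returns floor(sqrt(v)) using the classic
   bit-by-bit (digit-by-digit) method. */
uint32_t isqrt64(uint64_t v)
{
    uint64_t root = 0;
    uint64_t bit = (uint64_t)1 << 62;    /* highest power of four in 64 bits */

    while (bit > v)
        bit >>= 2;                       /* start at the highest power of four <= v */

    while (bit != 0) {
        if (v >= root + bit) {
            v -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)root;
}

/* Y = isqrt(X << (2d - n)): square root of a Qn value, result in Qd. */
uint32_t fixed_sqrt(uint32_t X, int n, int d)
{
    return isqrt64((uint64_t)X << (2 * d - n));
}
```

With Q15 input and output, the square root of 0.25 (8192) comes out as 16384, which is 0.5.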

DSP on the ARM9E
The ARM9E core has a very fast pipelined multiplier array that performs a 32-bit by 16-bit multiply in a single issue cycle
Writing DSP Code for the ARM9E
The ARMv5TE multiply operations can unpack 16-bit halves from 32-bit words and multiply them
The multiply operations do not terminate early, so use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
Multiply is the same speed as multiply-accumulate, so use the SMLAxy instruction rather than a separate multiply and add
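The packed 16-bit multiply-accumulate pattern can be sketched in portable C as below. Each 32-bit word is assumed to hold two signed 16-bit samples, and the function names lo16, hi16, and dot_packed16 are hypothetical. A compiler targeting ARMv5TE may map the 16 × 16 + 32 expressions onto SMLAxy instructions, but that depends on the compiler and its options, so treat it as an expectation rather than a guarantee.

```c
#include <stdint.h>

/* Extract the bottom and top 16-bit halves of a packed word as signed values.
   Assumption: converting an out-of-range value to int16_t wraps (two's complement). */
int16_t lo16(uint32_t w) { return (int16_t)(w & 0xFFFFu); }
int16_t hi16(uint32_t w) { return (int16_t)(w >> 16); }

/* Dot product of two arrays of packed 16-bit samples.
   Each step is a 16x16 multiply accumulated into a 32-bit sum,
   which is the operation SMLABB and SMLATT perform on ARMv5TE. */
int32_t dot_packed16(const uint32_t *x, const uint32_t *c, int nwords)
{
    int32_t acc = 0;
    for (int i = 0; i < nwords; i++) {
        acc += (int32_t)lo16(x[i]) * lo16(c[i]);   /* bottom x bottom */
        acc += (int32_t)hi16(x[i]) * hi16(c[i]);   /* top x top */
    }
    return acc;
}
```

Packing two samples per word halves the number of loads, which matters because DSP loops are load-store intensive as well as multiply intensive.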

DSP on the ARM10E
The ARM10E implements a background loading mechanism to accelerate load and store multiples
It uses a 64-bit-wide data path that can transfer two registers on every background cycle
Writing DSP Code for the ARM10E
Load and store multiples run in the background to give a high memory bandwidth
Ensure data arrays are 64-bit aligned so that load and store multiple operations can transfer two words per cycle
The multiply operations do not terminate early, so use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
The SMLAxy instruction takes one cycle more than SMULxy
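One way to request 64-bit alignment for data arrays is the C11 alignas specifier, sketched below. The array names samples and coeffs, the size N_SAMPLES, and the helper is_64bit_aligned are hypothetical. Toolchains that predate C11 usually offer an equivalent compiler-specific attribute instead.

```c
#include <stdalign.h>
#include <stdint.h>

#define N_SAMPLES 64

/* Request 8-byte (64-bit) alignment so that load and store multiples
   can move two words per cycle over a 64-bit data path. */
alignas(8) int16_t samples[N_SAMPLES];
alignas(8) int16_t coeffs[N_SAMPLES];

/* Returns non-zero if the pointer is aligned to a 64-bit boundary. */
int is_64bit_aligned(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}
```

A check such as is_64bit_aligned(samples) can be placed in a debug assertion to catch buffers that were allocated without the required alignment.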

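FIR Filters
The finite impulse response (FIR) filter is a basic building block of many DSP applications
FIR filters are used to remove unwanted frequency ranges, boost certain frequencies, or implement special effects
The FIR filter is the simplest type of digital filter
The filtered sample y[t] depends linearly on a fixed, finite number M of unfiltered samples: y[t] = c[0] x[t] + c[1] x[t−1] + ... + c[M−1] x[t−M+1]
Calculating accumulated values A[t]: in fixed point, the filter accumulates A[t] = C[0] X[t] + C[1] X[t−1] + ... + C[M−1] X[t−M+1] and then scales A[t] back to the output format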
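A minimal fixed-point FIR filter can be sketched in C as below, assuming Q15 samples and Q15 coefficients. The function name fir_q15 and the choice to treat samples before the start of the buffer as zero are assumptions made for illustration. The accumulator is 64 bits wide so that the sum of products cannot overflow, and the result is rounded, scaled back to Q15, and saturated.

```c
#include <stdint.h>

/* y[t] = sum over i of c[i] * x[t - i], for t = 0 .. n-1.
   x and y hold n Q15 samples; c holds m Q15 coefficients.
   Samples before x[0] are treated as zero.
   Assumption: >> on a negative value is an arithmetic shift. */
void fir_q15(const int16_t *x, int n, const int16_t *c, int m, int16_t *y)
{
    for (int t = 0; t < n; t++) {
        int64_t acc = 0;                          /* accumulated value A[t], Q30 */
        for (int i = 0; i < m && i <= t; i++)
            acc += (int32_t)c[i] * x[t - i];      /* 16x16 multiply-accumulate */

        acc = (acc + (1 << 14)) >> 15;            /* round and scale Q30 -> Q15 */
        if (acc > INT16_MAX) acc = INT16_MAX;     /* saturate to 16 bits */
        if (acc < INT16_MIN) acc = INT16_MIN;
        y[t] = (int16_t)acc;
    }
}
```

With the two coefficients 0.5 and 0.5 (16384 each in Q15) the filter is a two-point moving average, so the inputs 1000, 3000, 5000 produce 500, 2000, 4000.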

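IIR Filters
An infinite impulse response (IIR) filter is a digital filter whose output depends linearly on a finite number of input samples and a finite number of previous filter outputs
Mathematically: y[t] = b[0] x[t] + b[1] x[t−1] + ... + b[M] x[t−M] − a[1] y[t−1] − ... − a[L] y[t−L]
Factorize the filter into a series of biquads, where a biquad is an IIR filter with M = L = 2
Z-Transform: H(z) = (b[0] + b[1] z^−1 + ... + b[M] z^−M) / (1 + a[1] z^−1 + ... + a[L] z^−L)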
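A single biquad section (M = L = 2) can be sketched in fixed-point C as below. The struct biquad_q14, the function biquad_step, and the choice of Q14 coefficients are assumptions made for illustration. Q14 is used here so that coefficients with magnitude up to 2 fit in 16 bits. The sign convention assumed is y[t] = b0 x[t] + b1 x[t−1] + b2 x[t−2] − a1 y[t−1] − a2 y[t−2].

```c
#include <stdint.h>

/* State and coefficients for one biquad section (direct form I). */
typedef struct {
    int16_t b0, b1, b2;   /* feed-forward coefficients, Q14 */
    int16_t a1, a2;       /* feedback coefficients, Q14 */
    int16_t x1, x2;       /* previous two inputs */
    int16_t y1, y2;       /* previous two outputs */
} biquad_q14;

/* Process one input sample and return one output sample.
   Assumption: >> on a negative value is an arithmetic shift. */
int16_t biquad_step(biquad_q14 *f, int16_t x)
{
    int64_t acc = (int64_t)f->b0 * x
                + (int64_t)f->b1 * f->x1
                + (int64_t)f->b2 * f->x2
                - (int64_t)f->a1 * f->y1
                - (int64_t)f->a2 * f->y2;

    acc >>= 14;                                /* remove the Q14 coefficient scaling */
    if (acc > INT16_MAX) acc = INT16_MAX;      /* saturate to 16 bits */
    if (acc < INT16_MIN) acc = INT16_MIN;

    f->x2 = f->x1;  f->x1 = x;                 /* shift the input history */
    f->y2 = f->y1;  f->y1 = (int16_t)acc;      /* shift the output history */
    return (int16_t)acc;
}
```

A higher-order IIR filter is then run as a chain of these sections, feeding the output of one biquad_step call into the next section.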

The Discrete and Fast Fourier Transform
The Discrete Fourier Transform (DFT) converts a time-domain signal to a frequency-domain signal
A Fast Fourier Transform (FFT) is an algorithm that computes the discrete Fourier transform and its inverse efficiently
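To make the DFT definition concrete, the sketch below computes an 8-point DFT directly from X[k] = sum over n of x[n] exp(−j 2π k n / N), using a Q14 cosine table instead of floating point. This is the direct O(N²) form, not an FFT. An FFT produces the same result with fewer operations. The names DFT_N, cos_q14, and dft8_q14 and the fixed size of 8 are assumptions made for illustration.

```c
#include <stdint.h>

#define DFT_N 8

/* cos(2*pi*i/8) in Q14 for i = 0..7 */
static const int16_t cos_q14[DFT_N] = {
    16384, 11585, 0, -11585, -16384, -11585, 0, 11585
};

/* Direct 8-point DFT of real input x.
   re[k] and im[k] receive the real and imaginary parts of X[k]. */
void dft8_q14(const int16_t x[DFT_N], int32_t re[DFT_N], int32_t im[DFT_N])
{
    for (int k = 0; k < DFT_N; k++) {
        int64_t acc_re = 0;
        int64_t acc_im = 0;
        for (int n = 0; n < DFT_N; n++) {
            int idx = (k * n) % DFT_N;                  /* twiddle angle index */
            int16_t c = cos_q14[idx];
            int16_t s = cos_q14[(idx + 6) % DFT_N];     /* sin(a) = cos(a - pi/2) */
            acc_re += (int32_t)x[n] * c;                /* x[n] * cos */
            acc_im -= (int32_t)x[n] * s;                /* -x[n] * sin */
        }
        re[k] = (int32_t)(acc_re / 16384);              /* remove the Q14 table scaling */
        im[k] = (int32_t)(acc_im / 16384);
    }
}
```

An impulse at n = 0 gives a flat spectrum, and a constant input puts all of its energy in bin 0, which are two quick sanity checks for any DFT or FFT routine.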

Applications
Portable data terminals
Navigation
Automotive infotainment
Gaming
Medical imaging
Home automation
Single-board computers

Conclusion
The DM3730 is cost effective
It is low power and has high performance
The DM3730 delivers a nearly 40% increase in ARM performance
It delivers an over 50% increase in DSP performance
It has twice the graphics capability, while reducing power consumption
Use a fixed-point representation for DSP applications where speed is critical and the dynamic range is moderate

References
1. DM3730, http://www.ti.com/lit/ds/symlink/dm3730.pdf
2. DM3730, http://www.ti.com/lit/ml/sprt571/sprt571.pdf
3. DM3730, http://media.digikey.com/pdf/DM3730_AM3703TorpedoSOMBrief.pdf
