Presented by:
SHALY JOSE
ECE
VJCET
2011-2015
4/12/2015
INTRODUCTION
• Discrete Fourier Transform (DFT)
• Fast Fourier Transform (FFT): an algorithm that computes the DFT using a reduced number of calculations.
Applications of FFT
Used in signal processing applications, such as:
• OFDM (Orthogonal Frequency Division Multiplexing): a method of encoding digital data on multiple carrier frequencies.
• Software-defined radio (SDR): radio communication components implemented by means of software on a personal computer or embedded system.
64-POINT FFT PROCESSOR: TURBO64
• Developed for the IEEE 802.11(a) standard.
• Core area: 6.8 sq. mm
• Average power consumption: 41 mW at 1.8 V and 20 MHz
Why TURBO64 at 20 MHz?
• IEEE 802.11(a) standard: 4 µs time budget for one 64-point FFT
• Cooley-Tukey algorithm: 192 complex butterfly operations for a 64-point FFT
• One butterfly operation every 20.8 ns (4 µs / 192)
• Equivalent clock frequency: 48 MHz
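The figures on this slide can be reproduced with a short calculation. This is a sketch, assuming the 192 butterflies come from the usual radix-2 count (N/2)·log2(N) and that butterflies are executed one per clock cycle; the constant names are illustrative and not taken from the slides.

```python
import math

FFT_DEADLINE_S = 4e-6   # time budget for one 64-point FFT (from the slide)
N_POINTS = 64
CHIP_CLOCK_HZ = 20e6    # TURBO64 operating frequency (from the slide)

# Radix-2 Cooley-Tukey: N/2 butterflies in each of log2(N) stages (assumption).
butterflies = (N_POINTS // 2) * int(math.log2(N_POINTS))

# Time available per butterfly if they are executed one after another.
time_per_butterfly_ns = FFT_DEADLINE_S / butterflies * 1e9

# Clock needed to finish one butterfly per cycle within the deadline.
required_clock_mhz = butterflies / FFT_DEADLINE_S / 1e6

# Clock cycles that fit in the deadline at the chip's 20 MHz clock.
cycles_available = round(FFT_DEADLINE_S * CHIP_CLOCK_HZ)

print(butterflies, round(time_per_butterfly_ns, 1), round(required_clock_mhz, 1), cycles_available)
```

At 20 MHz only 80 clock cycles fit into the 4 µs budget, so a one-butterfly-per-cycle design would not meet the deadline at that clock rate.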
• FFT:
$$A(r) = \sum_{k=0}^{N-1} B(k)\, W_N^{rk} \qquad (1)$$
where B(k) is the complex data sequence, N is the length of the sequence, and $W_N = e^{-2j\pi/N}$.
• Consider N = MT, r = s + Tt, k = l + Mm, with
s ∈ {0, 1, …, T−1}, l ∈ {0, 1, …, M−1}, t ∈ {0, 1, …, M−1}, m ∈ {0, 1, …, T−1}.
Applying these values in the above equation:
$$A(s+Tt) = \sum_{l=0}^{M-1} W_M^{lt} \left[ W_{MT}^{sl} \sum_{m=0}^{T-1} B(l+Mm)\, W_T^{sm} \right] \qquad (2)$$
• Eqn (2) shows the FFT decomposed into M-point and T-point FFTs, which are combined for the final result.
• Considering M = 8 and T = 8, the 64-point FFT can be expressed as:
$$A(s+8t) = \sum_{l=0}^{7} W_8^{lt} \left[ W_{64}^{sl} \sum_{m=0}^{7} B(l+8m)\, W_8^{sm} \right] \qquad (3)$$
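The 8 × 8 decomposition can be checked numerically against the direct DFT of eqn (1). The following is a minimal sketch in plain Python; the function names `w`, `dft` and `fft64_8x8` are illustrative, and the loop order is chosen for readability rather than to mirror the hardware pipeline.

```python
import cmath

def w(n, e):
    """Twiddle factor W_n^e = exp(-2j*pi*e/n)."""
    return cmath.exp(-2j * cmath.pi * e / n)

def dft(b):
    """Direct DFT, eqn (1)."""
    n = len(b)
    return [sum(b[k] * w(n, r * k) for k in range(n)) for r in range(n)]

def fft64_8x8(b):
    """64-point DFT via the M = T = 8 decomposition, eqn (3)."""
    M = T = 8
    a = [0j] * (M * T)
    for s in range(T):
        # Inner 8-point sums over m, each scaled by the
        # interdimensional constant W_64^(s*l).
        inner = [
            w(64, s * l) * sum(b[l + M * m] * w(T, s * m) for m in range(T))
            for l in range(M)
        ]
        # Outer 8-point sums over l give A(s + 8t).
        for t in range(M):
            a[s + T * t] = sum(w(M, l * t) * inner[l] for l in range(M))
    return a

# Compare both methods on an arbitrary deterministic test sequence.
data = [complex(k % 5, -(k % 3)) for k in range(64)]
max_err = max(abs(x - y) for x, y in zip(dft(data), fft64_8x8(data)))
print(max_err < 1e-9)
```

The factor `w(64, s * l)` is the only place where 64-point twiddles appear; everything else is an 8-point transform.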
Signal flow graph of an 8-point DFT
1. Swap the real and imaginary terms.
2. Apply the forward FFT.
3. Swap the real and imaginary terms again.
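These three steps match the standard way of obtaining an inverse transform from forward-FFT hardware. The sketch below assumes that this is what the slide intends, and it adds a 1/N scaling that the slide does not mention; `dft`, `swap` and `idft_via_forward` are illustrative names.

```python
import cmath

def dft(x, sign=-1):
    """Direct DFT; sign=-1 is the forward transform, sign=+1 the unscaled inverse."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * r * k / n) for k in range(n))
        for r in range(n)
    ]

def swap(z):
    """Exchange the real and imaginary parts of a complex number."""
    return complex(z.imag, z.real)

def idft_via_forward(x):
    """Inverse DFT using only the forward transform plus two swaps."""
    n = len(x)
    y = dft([swap(v) for v in x])      # steps 1 and 2
    return [swap(v) / n for v in y]    # step 3, with 1/N scaling (assumption)

# Round trip: a forward transform followed by the swap-based inverse.
signal = [complex(k, 2 * k - 3) for k in range(8)]
recovered = idft_via_forward(dft(signal))
print(max(abs(a - b) for a, b in zip(signal, recovered)) < 1e-9)
```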
ARCHITECTURE OF TURBO64
Block diagram: Input unit → 1st 8-pt FFT unit → Multiplier unit → CB unit → 2nd 8-pt FFT unit → Output unit
Diagram labels: Data (16 bit), Start input, 5-bit binary counter, Start mode, count, 16-bit complex output, Data out
INPUT UNIT
• Input register bank: 16-bit wordlength, 57 complex samples
• Three additional registers
Block diagram of input unit
8-POINT FFT UNIT
• Fully parallel 8-point FFT
• Internal wordlength: 16 bit
Signal flow graph of an 8-point FFT
MULTIPLIER UNIT
Interdimensional constants:
• 49 non-trivial interdimensional constants, $W_{64}^{sl}$ with s, l ∈ {1, 2, …, 7}, are multiplied with the results of the 1st 8-point FFT.
• Only nine sets are unique.
• The nine unique constant pairs: (1, 0), (0.995178, 0.097961), (0.980773, 0.195068), (0.956909, 0.290283), (0.923828, 0.382629), (0.881896, 0.471374), (0.831420, 0.555541), (0.773010, 0.634338), (0.707092, 0.707092)
• Each constant is decomposed as a sum/difference of powers of 2.
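The count of nine can be reproduced by folding every exponent s·l onto the first octant of the unit circle. This is a sketch under the assumption that two constants count as the same set when they differ only by sign changes or by exchanging the real and imaginary parts; the helper `fold` is an illustrative name.

```python
import math

def fold(k):
    """Map an exponent of W_64 to an index 0..8 in the first octant.

    Steps of 16 are quarter turns (sign changes and real/imaginary exchange);
    reflecting r -> 16 - r exchanges cosine and sine.
    """
    r = k % 16
    return min(r, 16 - r)

# Distinct folded indices over all 49 products s*l with s, l in 1..7.
unique_indices = sorted({fold(s * l) for s in range(1, 8) for l in range(1, 8)})

# Ideal (cos, sin) pair for each folded index.
ideal_pairs = [
    (math.cos(2 * math.pi * r / 64), math.sin(2 * math.pi * r / 64))
    for r in unique_indices
]

# Values as listed on the slide (finite-wordlength approximations).
listed_pairs = [
    (1.0, 0.0), (0.995178, 0.097961), (0.980773, 0.195068),
    (0.956909, 0.290283), (0.923828, 0.382629), (0.881896, 0.471374),
    (0.831420, 0.555541), (0.773010, 0.634338), (0.707092, 0.707092),
]

worst = max(
    max(abs(ic - lc), abs(i_s - ls))
    for (ic, i_s), (lc, ls) in zip(ideal_pairs, listed_pairs)
)
print(len(unique_indices), worst < 1e-4)
```

The listed values differ from the ideal cosines and sines by less than 1e-4, which is consistent with constants built from a small number of power-of-2 terms.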
• E.g.: $0.995178 \approx 1 - 2^{-8} - 2^{-10} + 2^{-14}$
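A power-of-2 decomposition turns a constant multiplication into shifts and additions. The sketch below evaluates the sum 1 − 2^−8 − 2^−10 + 2^−14, which comes to 0.995178 (the second constant in the list), and applies it to an integer sample. The term list `TERMS` and the function names are illustrative, and the per-term truncation shown here is an assumption about rounding, not a description of the actual circuit.

```python
# Each term is (sign, right-shift amount): +2^0 - 2^-8 - 2^-10 + 2^-14.
TERMS = [(+1, 0), (-1, 8), (-1, 10), (+1, 14)]

def constant_value(terms):
    """Real value represented by the signed power-of-2 terms."""
    return sum(sign * 2.0 ** -shift for sign, shift in terms)

def shift_add_multiply(x, terms):
    """Multiply integer x by the constant using only shifts and adds.

    Every shifted term is truncated separately, as a plain
    shifter-and-adder structure would do.
    """
    return sum(sign * (x >> shift) for sign, shift in terms)

print(constant_value(TERMS))             # exact value of the four-term sum
print(shift_add_multiply(16384, TERMS))  # 16384 - 64 - 16 + 1
```

For the sample value 16384 (2^14) every shift is exact, so the result divided by 16384 equals the constant itself.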
Circuit diagram of the proposed multiplier unit: hard-wired representation of the constant
TABLE I: REALIZATION OF DIFFERENT CONSTANTS IN TERMS OF POWERS OF 2
• Final design of the multiplier unit:
Block diagram of the complete multiplier unit
TABLE II: UTILIZATION OF THE DIFFERENT HARD-WIRED CONSTANTS DURING THE 49 COMPLEX MULTIPLICATION OPERATIONS
COMPARISON
• Synthesized multiplier unit: cell area 0.6 sq. mm, average power consumption 19 mW
• Multiplier unit with 8 complex multipliers: cell area 1.1 sq. mm, average power consumption 31 mW
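The relative savings implied by these figures follow from simple arithmetic. This is a short sketch; the percentages are derived here and do not appear on the slide, and the dictionary keys are illustrative names.

```python
# Figures from the comparison slide.
synthesized_unit = {"area_mm2": 0.6, "power_mw": 19.0}
eight_multiplier_unit = {"area_mm2": 1.1, "power_mw": 31.0}

def saving_percent(new, old):
    """Relative reduction of `new` with respect to `old`, in percent."""
    return 100.0 * (old - new) / old

area_saving = saving_percent(synthesized_unit["area_mm2"], eight_multiplier_unit["area_mm2"])
power_saving = saving_percent(synthesized_unit["power_mw"], eight_multiplier_unit["power_mw"])

print(round(area_saving, 1), round(power_saving, 1))
```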
The multiplier unit also has 2 shuffle networks:
1. The first routes data to the appropriate constants.
2. The second maps the multiplied data to the appropriate index of the CB unit.
OUTPUT UNIT
Block diagram of the output unit
• Fabricated using a 0.25 µm, 3-metal-layer BiCMOS process.
• 85 I/O ports
• Average power dissipation over 55 fabricated chips:
o 41 mW at 1.8 V and 20 MHz
o 84 mW at 2.5 V and the same frequency
• Maximum frequency of operation:
o 26 MHz at 1.8 V
o 38 MHz at 2.5 V
Die photograph of the TURBO64 processor
CONCLUSION
• Requires a smaller number of clock cycles
• Better power performance and less silicon area
• The proposed architecture can be used for any application that requires fast, low-power operation.