Naist2015 dec ver1

Lecture slides for NAIST Seminar 1.
The topic is a digital spectrometer on an FPGA.


  1. A Digital Spectrometer on a Radio Telescope, and its Realization on FPGAs. Hiroki Nakahara, Ehime University
  2. Outline • Introduction • Digital spectrometer for a radio telescope • ROACH system at Oxford University • Realization on the FPGA • Nested residue number system (Nested RNS) • Implementation • Future plans • Conclusion
  3. Ehime University (map labels: KU, KIT, EU)
  4. Field Programmable Gate Array (FPGA): PLL (Phase-Locked Loop), Block Memory (BRAM), Logic Cell, Look-Up Table (LUT), DSP Block, I/O Block
  5. High-end Processes for FPGAs: 16nm! 14nm!
  6. Comparison of FPGAs with ASICs
  7. Xilinx UltraScale FPGA: Coming Soon!
  8. Custom Computing Machine: Multi-valued logic • Pattern matching circuit (regular expression matching circuit, packet classifier, IP address look-up) • 40m Radio telescope • Deep neural network
  9. Radio Telescope: 45m. Figure labels: Airbus A321, 44.51m, 53m
  10. SKA (Square Kilometer Array)
  11. Spectrometer: Feed horn, Amplifier, Mixer, Sub Reflector, Main Reflector. CASPER ROACH-2 Revision 2, stand-alone FPGA board • FPGA: Xilinx Virtex-6 SX475T • PowerPC 440EPx • Multi-gigabit transceiver (SFP+) • 2 x ZDOKs
  12. Digital Spectrometer. Block diagram labels: Data from Antenna, ADC, Window Coefficient (BRAM), FFT, Magnitude, + Reg., Power Spectrum. Stages: Window, FFT, Accumulation
  13. Window Function. Same block diagram with the window multiplier (×) highlighted. Plots: Voltage vs. Time (two panels)
  14. Fast Fourier Transform (FFT). Same block diagram. Plots: Voltage vs. Time, Power vs. Frequency
  15. Accumulation. Same block diagram. Plots: Power vs. Frequency (two panels)
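The three stages named on slides 12 to 15 (window, FFT, accumulation) can be sketched in a few lines of plain Python. This is only an illustrative software model, not the FPGA design from the slides: the Hann window, the naive DFT, the squared magnitude, and the names `hann`, `dft`, and `power_spectrum` are all assumptions made for the example.

```python
import cmath
import math

def hann(n):
    """Hann window coefficients (assumed window; the slides do not name one)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(x):
    """Naive O(N^2) DFT, standing in for the FFT block."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def power_spectrum(samples, n_fft):
    """Window each frame, transform it, and accumulate |X(k)|^2 per bin."""
    window = hann(n_fft)
    acc = [0.0] * n_fft
    for f in range(len(samples) // n_fft):
        frame = samples[f * n_fft:(f + 1) * n_fft]
        spectrum = dft([s * w for s, w in zip(frame, window)])
        for k, value in enumerate(spectrum):
            acc[k] += abs(value) ** 2
    return acc

# Hypothetical test signal: a tone centred on bin 3, four frames long.
N_FFT = 16
tone = [math.cos(2 * math.pi * 3 * t / N_FFT) for t in range(4 * N_FFT)]
spec = power_spectrum(tone, N_FFT)
peak_bin = max(range(N_FFT // 2), key=lambda k: spec[k])
```

Accumulating several frames averages down the noise on each frequency bin, which is the purpose of the adder and register pairs in the block diagram.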
  16. Doppler Effect. Figure: FFT outputs plotted against Frequency (two panels)
  17. Case: Solar Radio Burst. Correlation, Cleaning. On-line Computation, Off-line Computation
  18. Requirements • High-resolution: 2^30 to 2^40 points FFT (for comparison, OFDM: 2^8; CT scanner: 2^16) • Wide-band: 0.1 to 1000 GHz (for comparison, digital TV: 470-770 MHz (UHF in Japan); cellular phone: 0.8-2 GHz) • Plot axes: Frequency [Hz] • Source: SKA, “SKA phase 1 system (level 1) requirements specification,” http://www.astronomers.skatelescope.org.
  19. Goal • A spectrometer with high throughput per area. Figure labels: ADC (5-10 GHz), FPGA (300-400 MHz), FFT, High-Resolution FFT
  20. Outline • Introduction • Digital spectrometer for a radio telescope • ROACH system at Oxford University • Realization on the FPGA • Nested residue number system (Nested RNS) • Implementation • Future plans • Conclusion
  21. October, 2011
  22. Oxford University
  23. Sightseeing at Oxford
  24. Quiz
  25. Discussion with Prof. Nakanishi
  26. Digital Spectrometer
  27. ROACH System
  28. CASPER
  29. Dinner... but,
  30. Spy!?
  31. 1st Generation ROACH System
  32. Mt. Nobeyama
  33. 45m Radio Telescope at Mt. Nobeyama
  34. Inside the Radio Telescope
  35. Observation Building
  36. Inside the Observation Building
  37. First Observation on Dec. 13, 2013
  38. Outline • Introduction • Digital spectrometer for a radio telescope • ROACH system at Oxford University • High-throughput-per-area realization on the FPGA • Nested residue number system (Nested RNS) • Implementation • Future plans • Conclusion
  39. Signal Flow Graph for FFT (8-point, radix-2 butterflies). Inputs: x(0), x(1), ..., x(7). Outputs in bit-reversed order: X(0), X(4), X(2), X(6), X(1), X(5), X(3), X(7). Twiddle factors: $W_8^1$, $W_8^2$, $W_8^3$; the lower branch of each butterfly is multiplied by -1.
  40. Pipelined Binary FFT: Radix-4 Butterfly, Swap Mem., Radix-4 Butterfly, Swap Mem., with pipeline registers (Reg.); $\log_4 N$ stages. H. Nakahara, H. Nakanishi, and T. Sasao, "On a wideband fast Fourier transform for a radio telescope," ACM SIGARCH Computer Architecture News, Vol. 40, No. 5, 2012, pp. 46-51.
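As a software illustration of the signal flow graph on slide 39, the following sketch implements a radix-2 decimation-in-frequency FFT whose raw output appears in bit-reversed order, as in the figure. It is a minimal model under stated assumptions: it uses floating-point complex numbers, it is radix-2 rather than the radix-4 pipeline of slide 40, and the function names are hypothetical.

```python
import cmath

def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT; output is in bit-reversed order."""
    a = [complex(v) for v in x]
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))  # twiddle factor
                u, v = a[start + k], a[start + k + span]
                a[start + k] = u + v                # upper butterfly output
                a[start + k + span] = (u - v) * w   # lower output: times -1, then W
        span //= 2
    return a

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft(x):
    """Natural-order FFT obtained by undoing the bit reversal."""
    n = len(x)
    bits = n.bit_length() - 1
    y = fft_dif_radix2(x)
    return [y[bit_reverse(k, bits)] for k in range(n)]

def naive_dft(x):
    """Reference O(N^2) DFT used only for checking."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

data = [1, 2, 3, 4, 0, -1, -2, -3]
fast = fft(data)
slow = naive_dft(data)
```

For an 8-point input, position 1 of the raw `fft_dif_radix2` output holds X(4), matching the output ordering listed on slide 39.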
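  41. Chinese Remainder Theorem • Now there are things whose number is unknown. Counted by threes, 2 remain; counted by fives, 3 remain; counted by sevens, 2 remain. How many things are there? • Answer: 23. • Method: for "counted by threes, 2 remain", put down 140; for "counted by fives, 3 remain", put down 63; for "counted by sevens, 2 remain", put down 30. Adding these gives 233; subtracting 210 gives the answer. In general, for each remainder of 1 when counting by threes, put down 70; for each remainder of 1 when counting by fives, put down 21; for each remainder of 1 when counting by sevens, put down 15. If the sum is 106 or more, subtract 105 to obtain the answer.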
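The classical method on slide 41 can be checked directly. The sketch below first applies the fixed weights 70, 21, and 15 from the text, then shows a general form that derives the same weights from modular inverses. The function names are hypothetical, and `pow(a, -1, m)` and `math.prod` require Python 3.8 or later.

```python
from math import prod

def sunzi(r3, r5, r7):
    """Classical rule for moduli 3, 5, 7: weight the remainders by 70, 21, 15."""
    return (70 * r3 + 21 * r5 + 15 * r7) % 105

def crt(residues, moduli):
    """General Chinese Remainder Theorem for pairwise coprime moduli."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        part = big_m // m                     # product of the other moduli
        total += r * part * pow(part, -1, m)  # part * inverse is 1 mod m
    return total % big_m

answer = sunzi(2, 3, 2)           # 140 + 63 + 30 = 233, and 233 mod 105 = 23
general = crt([2, 3, 2], [3, 5, 7])
```

The weights arise because 70 is 1 mod 3 and 0 mod 5 and 7, and likewise for 21 and 15.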
  42. Residue Number System (RNS) • Defined by a set of L mutually prime integer constants 〈m1, m2, ..., mL〉 • An arbitrary integer X can be represented by a tuple of L integers (X1, X2, …, XL), where $X_i \equiv X \pmod{m_i}$ • Dynamic range: $M = \prod_{i=1}^{L} m_i$
  43. Multiplication on RNS (parallel multiplication) • Moduli set 〈3,4,5〉, X=8, Y=2 • Z = X×Y = 16 = (1,0,1) • Binary2RNS conversion: X=(2,0,3), Y=(2,2,2) ➔ Z = (4 mod 3, 0 mod 4, 6 mod 5) = (1,0,1) ➔ RNS2Binary conversion: Z = 16
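The example on slide 43 can be reproduced with a short sketch. The helper names `to_rns`, `rns_mul`, and `from_rns` are hypothetical, and the RNS2Binary step uses the Chinese Remainder Theorem, which is one common choice rather than necessarily the converter used in the talk.

```python
from math import prod

MODULI = (3, 4, 5)  # moduli set from slide 43; dynamic range M = 60

def to_rns(x, moduli=MODULI):
    """Binary2RNS: one residue per modulus."""
    return tuple(x % m for m in moduli)

def rns_mul(a, b, moduli=MODULI):
    """Digit-wise modular multiplication; the digits are independent."""
    return tuple((p * q) % m for p, q, m in zip(a, b, moduli))

def from_rns(residues, moduli=MODULI):
    """RNS2Binary via the Chinese Remainder Theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        part = big_m // m
        total += r * part * pow(part, -1, m)
    return total % big_m

x = to_rns(8)        # (2, 0, 3)
y = to_rns(2)        # (2, 2, 2)
z = rns_mul(x, y)    # (1, 0, 1)
result = from_rns(z)
```

No carry propagates between digits, so each small multiplier can run in parallel; the result is only valid while the true product stays below the dynamic range M.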
  44. RNS FFT. Input signal (from ADC), 8-14 [bit] ➔ ROM (Bin 2 RNS) ➔ L parallel N-point FFTs, one per modulus, $X(k) = \left\langle \sum_{j=0}^{N-1} x(j) W^{jk} \right\rangle_{m_i}$ for $i = 1, 2, \dots, L$, each with $\lceil \log_2 m_i \rceil$-bit inputs and outputs (online computation) ➔ RNS2Binary (offline computation) ➔ X
  45. Reduction of Dynamic Range. Figure labels: Binary FFT, RNS FFT, Memory, Butterfly Circuit
  46. Increase of Dynamic Range. With modulus 3 only, X = 0, 1, 2, 3, 4, 5, 6 maps to X mod 3 = 0, 1, 2, 0, 1, 2, 0. With moduli 3 and 5, the same X maps to (X mod 3, X mod 5) = (0,0), (1,1), (2,2), (0,3), (1,4), (2,0), (0,1).
  47. RNS2RNS Converter
  48. Single ROM Realization. ROM inputs: m1, m2, ..., mL; ROM outputs: m'1, m'2, ..., m'L'. Mem. size: $2^{\sum_{i=1}^{L} \lceil \log_2 m_i \rceil} \times \sum_{i=1}^{L'} \lceil \log_2 m'_i \rceil$ [bit]
  49. RNS2RNS Converter • Compact realization • Input: {m1, m2, ..., mL} • Output: {m1, m2, ..., mL, mL+1} • The ROM realizes only g(m1, m2, ..., mL) → mL+1; the digits for m1, m2, ..., mL pass through • Keep the relation m1 < m2 < ... < mL
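To get a feel for the ROM sizes on slides 48 and 49, the following sketch evaluates the usual ROM size of 2^(address bits) × (word bits), taking the address bits as the sum of ⌈log2 m_i⌉ over the input moduli and the word bits as the same sum over the output moduli. The moduli 〈7,11,13〉 with an added modulus 17 are a hypothetical example, not figures from the slides, and the helper names are assumptions.

```python
def digit_bits(m):
    """ceil(log2(m)) for m >= 2: bits needed to hold a residue modulo m."""
    return (m - 1).bit_length()

def rom_size_bits(in_moduli, out_moduli):
    """ROM size in bits: 2^(total input bits) words of (total output bits)."""
    address_bits = sum(digit_bits(m) for m in in_moduli)
    word_bits = sum(digit_bits(m) for m in out_moduli)
    return (1 << address_bits) * word_bits

# Single ROM realization: every output digit is stored.
single = rom_size_bits([7, 11, 13], [7, 11, 13, 17])   # 2^11 * 16 bits

# Compact realization: only the new digit for modulus 17 is stored.
compact = rom_size_bits([7, 11, 13], [17])             # 2^11 * 5 bits
```

The address width still grows with every input modulus, so the table size remains exponential in the total input width even in the compact form.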
  50. Decomposition of the RNS2RNS Converter: (m1, m2, ..., mL) ➔ RNS 2 Binary Converter ➔ Binary 2 Modulus Converter ➔ mL+1; the digits for m1, m2, ..., mL pass through unchanged
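The two-step decomposition on slide 50 can be sketched as follows: first recover the binary value from the residues, then reduce it by the new modulus. The function names are hypothetical, the RNS2Binary step uses the Chinese Remainder Theorem as an assumed method, and the moduli 〈3,4,5〉 with a new modulus 7 are an illustrative choice.

```python
from math import prod

def rns_to_binary(residues, moduli):
    """RNS2Binary converter, modelled with the Chinese Remainder Theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        part = big_m // m
        total += r * part * pow(part, -1, m)
    return total % big_m

def binary_to_modulus(x, new_modulus):
    """Binary2Modulus converter: a single modular reduction."""
    return x % new_modulus

def rns_to_rns(residues, moduli, new_modulus):
    """Append the digit for new_modulus; the existing digits pass through."""
    x = rns_to_binary(residues, moduli)
    return tuple(residues) + (binary_to_modulus(x, new_modulus),)

extended = rns_to_rns((1, 0, 1), (3, 4, 5), 7)   # 16 in <3,4,5> gains 16 mod 7
```

After the extension, the representable range grows from 60 to 420, which is the "increase of dynamic range" the converter is for.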
  51. Example of the LUT Cascade Based on the mod-EVMDD, for m1=2, m2=3, m3=5. LUT for x1: 0 ➔ 0, 1 ➔ 15. LUT for x2: 0 ➔ 0, 1 ➔ 10, 2 ➔ 20. LUT for x3: 0 ➔ 0, 1 ➔ 6, 2 ➔ 12, 3 ➔ 18, 4 ➔ 24. The LUT outputs y1, y2, y3 are summed by two mod 30 adders.
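The tables on slide 51 are enough to build a working RNS2Binary converter in software. The sketch below copies the three look-up tables from the slide and chains two modulo-30 adders; the names `LUT_X1`, `mod_add`, and `rns_to_binary` are hypothetical, and the code models only the arithmetic, not the mod-EVMDD structure itself.

```python
M = 30                          # dynamic range for moduli 2, 3, 5

# Look-up tables from slide 51: residue -> weighted contribution.
LUT_X1 = [0, 15]                # modulus 2
LUT_X2 = [0, 10, 20]            # modulus 3
LUT_X3 = [0, 6, 12, 18, 24]     # modulus 5

def mod_add(a, b, m=M):
    """Modulo-m adder for operands already in the range 0..m-1."""
    s = a + b
    return s - m if s >= m else s

def rns_to_binary(x1, x2, x3):
    """Sum the three table outputs with two chained mod-30 adders."""
    return mod_add(mod_add(LUT_X1[x1], LUT_X2[x2]), LUT_X3[x3])

value = rns_to_binary(23 % 2, 23 % 3, 23 % 5)   # digits (1, 2, 3)
```

Each table entry is the residue times a weight (15, 10, or 6) that is 1 modulo its own modulus and 0 modulo the others, which is why the sum reconstructs X.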
  52. Functional Decomposition. A 4-input function with X1=(x1, x2) and X2=(x3, x4); the decomposition chart has only two distinct column patterns (column multiplicity 2), so X1 is encoded by a 1-bit function h(X1): h = 0, 1, 0, 1 for (x1, x2) = 00, 01, 10, 11. Second table, indexed by (x3, x4) and h(X1): 00 ➔ 0 for h=0 and 1 for h=1; 01 ➔ 1 and 1; 10 ➔ 1 and 0; 11 ➔ 1 and 0. Memory: 2^4×1 = 16 [bit] for the direct realization, 2^2×1 + 2^3×1 = 12 [bit] after decomposition.
  53. Decomposition Chart for X mod 3. Columns: X2=(x3, x4, x5), from 000 to 111 (bound variables). Rows: X1=(x1, x2), from 00 to 11 (free variables). Entries are X mod 3 (0 mod 3 = 0, 1 mod 3 = 1, 2 mod 3 = 2, 3 mod 3 = 0, 4 mod 3 = 1, …); for example, column 000 reads 0, 1, 2, 0 and column 001 reads 1, 2, 0, 1.
  54. Decomposition Chart for X mod 3 (compressed). Only three distinct column patterns occur, so the bound variables X2=(x3, x4, x5) are encoded as h(X2) = 0, 1, 2, 0, 1, 2, 0, 1 for (x3, x4, x5) = 000, 001, ..., 111. The compressed chart has columns h(X2) = 0, 1, 2 and rows X1 = 00, 01, 10, 11 with entries (0, 1, 2), (1, 2, 0), (2, 0, 1), (0, 1, 2).
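The decomposition charts on slides 53 and 54 can be reproduced by counting distinct column patterns. The sketch below assumes the bit assignment X = 4·X2 + X1, which is a reading of the chart values rather than something stated on the slides, and the function names and the memory figures computed at the end are illustrative assumptions.

```python
def column_multiplicity(f, free_bits, bound_bits):
    """Number of distinct columns in the decomposition chart of f.

    Columns are indexed by the bound variables, rows by the free variables.
    """
    columns = set()
    for x_bound in range(1 << bound_bits):
        column = tuple(f(x_free, x_bound) for x_free in range(1 << free_bits))
        columns.add(column)
    return len(columns)

def x_mod_3(x1, x2):
    """Target function, assuming X = 4 * X2 + X1 (X1: 2 bits, X2: 3 bits)."""
    return (4 * x2 + x1) % 3

def h(x2):
    """First block: encode the bound variables by their column pattern."""
    return x2 % 3

def g(h_value, x1):
    """Second block: combine the encoded column with the free variables."""
    return (h_value + x1) % 3

mu = column_multiplicity(x_mod_3, free_bits=2, bound_bits=3)

# Memory for 2-bit outputs: direct table versus the two smaller tables.
direct_bits = (1 << 5) * 2                      # 32 words of 2 bits
decomposed_bits = (1 << 3) * 2 + (1 << 4) * 2   # h: 8 words, g: 16 words
```

Because 4 is congruent to 1 modulo 3, the column for X2 depends only on X2 mod 3, which is why three patterns suffice and h needs only two output bits.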
  55. RNS2RNS Converter using LUT Cascades. RNS2Binary converter: one ROM per modulus (m1, m2, ..., mL) feeding modulo M adders, using LUT cascades based on the mod-EVMDD. Binary2Modulus converter: a cascade of ROMs producing the $\lceil \log_2 m_{L+1} \rceil$-bit digit for mL+1, using LUT cascades based on the MTMDD.
  56. Problem • The moduli set of an RNS consists of mutually prime numbers • The sizes of the circuits are all different • Example: <7,11,13>: Binary2RNS converter by BRAMs ➔ arithmetic on 3-, 4-, and 4-bit digits (6-input LUT, 8-input LUT, 8-input LUT) ➔ RNS2Binary converter by DSP blocks and BRAMs
  57. Nested RNS • (Z1, Z2, …, Zi, …, ZL) ➔ (Z1, Z2, …, (Zi1, Zi2, …, Zij), …, ZL) • Ex: <7,11,13>×<7,11,13> ➔ <7,<5,6,7>_11,<5,6,7>_13>×<7,<5,6,7>_11,<5,6,7>_13>, where the subscript is the original modulus • 1. Reuse the same moduli set 2. Decompose a large modulus into smaller ones
  58. Example of Nested RNS • 19×22 (=418) on <7,<5,6,7>_11,<5,6,7>_13> • 19×22 = <5,8,6>×<1,0,9> = <5,<3,2,1>_11,<1,0,6>_13>×<1,<0,0,0>_11,<4,3,2>_13> = <5,<0,0,0>_11,<4,0,5>_13> = <5,0,2> = 418 • Step labels: Binary2NRNS conversion, modulo multiplication, Bin2RNS on NRNS, RNS2Bin
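The 19×22 example on slide 58 can be traced step by step in software. The sketch below is an illustrative model with hypothetical names (`to_nrns`, `nrns_mul`, `from_nrns`); it uses the Chinese Remainder Theorem for both the inner and outer back-conversions, which is an assumed method rather than the converter hardware from the talk.

```python
from math import prod

OUTER = (7, 11, 13)   # outer moduli set from slide 58
INNER = (5, 6, 7)     # inner moduli set replacing moduli 11 and 13
NESTED = {11, 13}     # outer moduli that are represented by the inner set

def crt(residues, moduli):
    """Chinese Remainder Theorem for pairwise coprime moduli."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        part = big_m // m
        total += r * part * pow(part, -1, m)
    return total % big_m

def to_nrns(x):
    """Binary2NRNS: outer residues, with nested digits re-encoded by INNER."""
    digits = []
    for m in OUTER:
        r = x % m
        digits.append(tuple(r % q for q in INNER) if m in NESTED else r)
    return tuple(digits)

def nrns_mul(a, b):
    """Digit-wise multiplication; nested digits are multiplied per inner modulus."""
    out = []
    for m, da, db in zip(OUTER, a, b):
        if m in NESTED:
            out.append(tuple((p * q) % n for p, q, n in zip(da, db, INNER)))
        else:
            out.append((da * db) % m)
    return tuple(out)

def from_nrns(z):
    """NRNS2Binary: decode nested digits, reduce by the outer modulus, then CRT."""
    residues = [crt(d, INNER) % m if m in NESTED else d for m, d in zip(OUTER, z)]
    return crt(residues, OUTER)

a = to_nrns(19)             # (5, (3, 2, 1), (1, 0, 6))
b = to_nrns(22)             # (1, (0, 0, 0), (4, 3, 2))
product = nrns_mul(a, b)    # (5, (0, 0, 0), (4, 0, 5))
value = from_nrns(product)
```

The inner range 5×6×7 = 210 exceeds the largest product of two residues modulo 13 (12×12 = 144), so the nested multiplication does not overflow, and every arithmetic unit works on 3-bit digits.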
  59. Realization of Nested RNS. Binary2NRNS (realized by BRAMs): Bin2<7,11,13> and Bin2<5,6,7> blocks. Modular arithmetic (realized by LUTs): seven 6-input LUTs. NRNS2Binary (realized by BRAMs and DSP blocks): <5,6,7>2Bin, Bin2<7,11,13>, and <7,11,13>2Bin blocks. Signal widths: 3 or 4 bits.
  60. NRNS FFT. Input signal (from ADC), 8-14 [bit] ➔ ROM (Bin 2 NRNS) ➔ L parallel N-point FFTs, one per modulus, $X(k) = \left\langle \sum_{j=0}^{N-1} x(j) W^{jk} \right\rangle_{m_i}$ for $i = 1, 2, \dots, L$, each with $\lceil \log_2 m_i \rceil$-bit inputs and outputs (online computation) ➔ NRNS2Binary (offline computation) ➔ X
  61. Comparison of NRNS with RNS. RNS: (m1, m2, ..., mL) ➔ RNS 2 Binary Convert. ➔ Binary 2 Modulus Convert. ➔ mL+1, followed by one arithmetic circuit. NRNS: (m1, m2, ..., mL) ➔ RNS 2 Binary Convert. ➔ Binary 2 Modulus Convert. ➔ Modulus 2 NRNS Convert. ➔ (mL+1,1, mL+1,2, ..., mL+1,i), followed by one arithmetic circuit per nested modulus. Smaller or larger?
  62. Gain (LFPGA=64). Plot: gain for #LUTs (0.00 to 4.50) versus RNS modulus mL+1 (2 to 24), with mL+1=15 marked.
  63. Comparison with other FFTs • Implemented on the Xilinx Virtex-7 FPGA • Binary FFT: Xilinx FFT (v.7.1); butterfly operator is realized by LUTs; transpose memory is realized by BRAMs • RNS FFT (Applied NRNS) • N=1024: {5,7,9,11,13,16} • N=2048: {7,8,9,11,13,17} • N=4096: {7,8,9,11,13,15,31} • N=8192: {7,11,13,15,17,19}
  64. Comparison of #6-LUTs. Plot: #6-LUTs (0 to 12000) versus # of FFT points (1024 to 16384) for Binary FFT (Xilinx Library), RNS FFT (without RNS2RNS converters), RNS FFT (with RNS2RNS converters), and NRNS FFT (proposed). NRNS FFT: 9.4-20.5% reduction compared with RNS FFT; 42.4-47.8% reduction compared with Binary FFT.
  65. Comparison of #BRAMs. Plot: #BRAMs (0 to 200) versus # of FFT points (1024 to 16384) for Binary FFT (Xilinx Library), RNS FFT (without RNS2RNS converters), RNS FFT (with RNS2RNS converters), and NRNS FFT (proposed). NRNS FFT: 34.1% increase compared with RNS FFT; 20.0-156.5% increase compared with Binary FFT.
  66. Outline • Introduction • Digital spectrometer for a radio telescope • ROACH system at Oxford University • High-throughput-per-area realization on the FPGA • Nested residue number system (Nested RNS) • Implementation • Future plans • Conclusion
  67. Present Status • Nested RNS (NRNS) FFT • NRNS2NRNS converter • Comparison of NRNS FFT with RNS FFT • Implemented on Xilinx Virtex-7 • Compared with conventional FFTs • #LUTs: Reduced by 42-47% • #BRAMs: Increased by 20-156%
  68. Next Generation ROACH ”3”: NetFPGA SUME (Virtex-7 FPGA), FMC-ZDOK Converter, CASPER ADC1.3 (5Gsps)
  69. (No text on this slide)
  70. Questions?