CFD and FPGAs


  1. Ongoing developments of FPGAs and PCIe technologies: CFD acceleration
     Gabriel Caffarena, Laboratory of Integrated Systems (LSI), Universidad Politécnica de Madrid
     CFD on Future Architectures, C²A²S²E, DLR Braunschweig, October 2009
  2. Agenda
     - CFD with FPGAs
     - Hardware Design Methodology
     - Current results
     - Future work
  3. People involved in CFD acceleration: 1 senior researcher, 3 researchers (PhD), 4 PhD candidates, 4 students
     - Scientific applications acceleration: CFD, bioinformatics
     - CAD tools (precision analysis, C-to-HW, power, etc.)
     - FPGA prototyping (wireless, cryptanalysis, etc.)
     - Universidad Politécnica de Madrid: www.upm.es
     - LSI: www.lsi.die.upm.es
     - [email_address]
  4. Agenda
     - CFD with FPGAs
     - Hardware Design Methodology
     - Current results
     - Future work
  5. CFD with FPGAs
     - CFD is essential for the aeronautics industry
     - A huge amount of computational power is required
  6. CFD with FPGAs
     - Use of custom hardware
       - Field-Programmable Gate Arrays (FPGAs)
       - PCIe In-socket
     [Figure label: PC host]
  7. CFD with FPGAs: FPGAs
     - ASIC → Fixed hardware
     - FPGA → Configurable hardware
     - Microprocessor → Programmable hardware
  8. CFD with FPGAs: Design flow
     - Analyze CFD code (C)
     - Design custom processor
       - Computation: CFD, mathematical precision
       - Communications: Host-FPGA, RAM-FPGA
       - Intellectual Property (IP)
     - Implement HW: VHDL, CAD tools
     - Develop HW-SW interface: API
     - Integrate and validate
  9. CFD with FPGAs
     - Total control
       - CFD HW processor
       - Custom precision
       - High speedups
     - Many decisions to make
       - Complexity → Longer design times
       - Solution: high-level in-house and commercial tools
  10. Agenda
      - CFD with FPGAs
      - Hardware Design Methodology
      - Current results
      - Future work
  11. Hardware design methodology
      - Analyze CFD code (C)
      - HW-oriented C code
      - Precision analysis → Fixed-point
      - VHDL code and CFD processor architecture
      - FPGA programming file
        - Synthesis + debugging → VHDL-to-gates
        - Place and route → Location and interconnection
        - Post-place-and-route debugging
        - Bitstream generation
  12. Hardware design methodology
      - Analyze CFD code (C) → MANUAL
      - HW-oriented C code → MANUAL
      - Precision analysis → AUTOMATIC
      - VHDL code
        - Computation → AUTOMATIC
        - Communications → MANUAL
      - FPGA programming → Standard FPGA flow
  13. Hardware Design Methodology: Precision Analysis
      - Floating-point vs fixed-point
        - Fixed-point results in faster, lower-power, smaller-area designs
      - Automatic precision analysis tool
        - Design time reduction: from months to hours
      - Custom precision
        - Custom precision for each variable/block
        - Control over result accuracy: 10^-5, 10^-6, …
        - "Smaller" HW resources → Faster processor
      - Detailed precision requirements for CFD?
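To make the floating-point vs fixed-point trade-off concrete, here is a minimal C sketch of fixed-point conversion and multiplication with a configurable number of fractional bits. The helper names (`to_fixed`, `from_fixed`, `fixed_mul`) are hypothetical and do not come from the presentation; the slides do not describe how the LSI tool represents numbers internally.

```c
#include <stdint.h>

/* Hypothetical helpers, not from the slides: signed fixed-point values are
 * stored in an int64_t with `frac_bits` fractional bits (1 <= frac_bits <= 30). */

/* Convert a double to fixed point, rounding half away from zero. */
static int64_t to_fixed(double x, int frac_bits)
{
    double scaled = x * (double)((int64_t)1 << frac_bits);
    return (int64_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a fixed-point value back to double. */
static double from_fixed(int64_t q, int frac_bits)
{
    return (double)q / (double)((int64_t)1 << frac_bits);
}

/* Multiply two fixed-point values. The raw product carries 2*frac_bits
 * fractional bits, so it is rescaled with rounding. Division is used instead
 * of a right shift to stay portable for negative products. */
static int64_t fixed_mul(int64_t a, int64_t b, int frac_bits)
{
    int64_t prod = a * b;
    int64_t half = (int64_t)1 << (frac_bits - 1);
    int64_t one  = (int64_t)1 << frac_bits;
    return (prod >= 0 ? prod + half : prod - half) / one;
}
```

With 8 fractional bits, `to_fixed(0.75, 8)` gives 192, and the rounding error of a single conversion is at most half of the least significant bit. Fewer fractional bits mean narrower adders and multipliers on the FPGA, which is the motivation for choosing the word length per variable.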
  14. Hardware Design Methodology: Precision Analysis
      [Diagram blocks: CFD code; Sensitivity analysis, Error = f(parameters, bits); Error parameters; Fast WL (word-length) optimization; Accuracy check, double-precision vs fixed-point]
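The precision-analysis flow can be illustrated with a deliberately simplified sketch: quantize every value in a small kernel, compare against the double-precision result, and pick the smallest number of fractional bits that meets an error target. Everything here is an assumption made for illustration: the kernel `(a + b) * c`, the names `Sample`, `quantize`, `max_error` and `min_frac_bits`, and the brute-force scan over one uniform word length. The tool described in the slides assigns precision per variable/block and uses a fast optimization, neither of which is reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* One set of inputs for the illustrative kernel (hypothetical type). */
typedef struct { double a, b, c; } Sample;

/* Round x to the nearest multiple of 2^-frac_bits (1 <= frac_bits <= 40). */
static double quantize(double x, int frac_bits)
{
    double scale  = (double)((int64_t)1 << frac_bits);
    double scaled = x * scale;
    int64_t q = (int64_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    return (double)q / scale;
}

/* Double-precision reference. */
static double kernel_ref(Sample s)
{
    return (s.a + s.b) * s.c;
}

/* Same kernel with inputs and every intermediate result quantized. */
static double kernel_fixed(Sample s, int frac_bits)
{
    double a   = quantize(s.a, frac_bits);
    double b   = quantize(s.b, frac_bits);
    double c   = quantize(s.c, frac_bits);
    double sum = quantize(a + b, frac_bits);
    return quantize(sum * c, frac_bits);
}

/* Accuracy check: worst absolute error over the sample set. */
static double max_error(const Sample *samples, size_t n, int frac_bits)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double e = kernel_fixed(samples[i], frac_bits) - kernel_ref(samples[i]);
        if (e < 0.0) e = -e;
        if (e > worst) worst = e;
    }
    return worst;
}

/* Smallest number of fractional bits meeting the tolerance, or -1 if none
 * is found up to max_bits. */
static int min_frac_bits(const Sample *samples, size_t n, double tol, int max_bits)
{
    for (int bits = 1; bits <= max_bits; ++bits)
        if (max_error(samples, n, bits) <= tol)
            return bits;
    return -1;
}
```

Calling `min_frac_bits(samples, n, 1e-5, 40)` on a representative input set returns the narrowest uniform format that keeps the error within 10^-5 on those inputs. The result only holds for the inputs tested, which is one reason a real analysis needs an error model rather than a plain scan.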
  15. Hardware Design Methodology: C to VHDL
      - Automation of process
        - C → Intermediate Language → VHDL
          - Design time reduction (x20)
        - Reduction of human errors
      - FPGA-optimized arithmetic blocks
      - High-speed approach: full pipeline
      - Ad-hoc methodology for CFD
        - Improvement over general-purpose commercial tools
      Example C statement: if (z != b - c) z = (a + b) * c;
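As an illustration of what "HW-oriented C" might look like for the example statement on this slide, the sketch below rewrites the conditional assignment so that each line corresponds to one hardware operator and the branch becomes a multiplexer. The function names and the assignment of operators to pipeline stages are assumptions for illustration only; the slides do not show the intermediate language or the code the in-house tool generates.

```c
/* Software-style version, taken from the slide's example statement. */
static int update_branch(int z, int a, int b, int c)
{
    if (z != b - c)
        z = (a + b) * c;
    return z;
}

/* Dataflow-style version (hypothetical): both paths are always computed and
 * a select signal picks the result, which maps naturally onto a pipeline. */
static int update_dataflow(int z, int a, int b, int c)
{
    int diff = b - c;        /* stage 1: subtractor                 */
    int sum  = a + b;        /* stage 1: adder, parallel to diff    */
    int prod = sum * c;      /* stage 2: multiplier                 */
    int sel  = (z != diff);  /* stage 2: comparator                 */
    return sel ? prod : z;   /* stage 3: 2-to-1 multiplexer         */
}
```

Both functions return the same value for every input. The second form has no control flow, so a new set of operands can enter the datapath every clock cycle once the pipeline registers are in place.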
  16. Hardware Design Methodology: FPGA flow, bitstream generation
      - Xilinx (www.xilinx.com)
        - ISE
        - EDK
      - Altera (www.altera.com)
        - Quartus II
        - Nios-II IDE
  17. Agenda
      - CFD with FPGAs
      - Hardware Design Methodology
      - Current results
      - Future work
  18. Current results
      - HiTech Global board (HTG-V5-PCIE-110)
        - Xilinx FPGA (Virtex5-LX)
        - 1x512 MB DDR-2 RAM
      - Implementation of Sod shock tube
        - Procedural: Euler 1D (SW) + Roe 1D (HW)
        - Algorithmic: Euler 1D + Roe 1D (HW)
      - CFD coding style guidelines for FPGAs
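For readers unfamiliar with the test case, the following C sketch sets up the standard Sod shock tube initial state and converts the primitive variables [rho, u, p] to conserved variables for the 1D Euler equations. The left/right states and gamma = 1.4 are the usual textbook values; the unit domain, the diaphragm at x = 0.5, and all type and function names are assumptions, since the slides do not give the exact setup used on the FPGA.

```c
#include <stddef.h>

#define GAMMA 1.4  /* ratio of specific heats for an ideal diatomic gas */

/* Primitive variables: density, velocity, pressure. */
typedef struct { double rho, u, p; } Primitive;

/* Conserved variables: density, momentum, total energy per unit volume. */
typedef struct { double rho, mom, E; } Conserved;

/* Ideal-gas conversion from primitive to conserved variables. */
static Conserved to_conserved(Primitive w)
{
    Conserved q;
    q.rho = w.rho;
    q.mom = w.rho * w.u;
    q.E   = w.p / (GAMMA - 1.0) + 0.5 * w.rho * w.u * w.u;
    return q;
}

/* Fill n cells on the assumed domain [0, 1] with the Sod initial state:
 * left of x = 0.5 (rho, u, p) = (1, 0, 1), right (0.125, 0, 0.1). */
static void sod_init(Primitive *cells, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        double x = ((double)i + 0.5) / (double)n;  /* cell centre */
        if (x < 0.5) {
            cells[i].rho = 1.0;
            cells[i].u   = 0.0;
            cells[i].p   = 1.0;
        } else {
            cells[i].rho = 0.125;
            cells[i].u   = 0.0;
            cells[i].p   = 0.1;
        }
    }
}
```

In the procedural split, arrays like these would stay on the host while the Roe flux evaluation runs on the FPGA. In the algorithmic split, the whole time loop moves to hardware.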
  19. Current results: Procedural approach
      - Speedup x1.6 (theoretical limit x2.8, from Amdahl's law)
      - Precision 10^-5
      - PCIe bottleneck (16 Gbps → 6-8 Gbps)
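The Amdahl's law figures on this slide can be unpacked with a short calculation. The sketch below assumes the usual reading of the law: a fraction p of the runtime is offloaded and accelerated by a factor s, and the x2.8 limit is the speedup for s going to infinity. Under that assumption, which the slides do not state explicitly, the offloaded fraction and the effective accelerator speedup (including transfer overhead) follow directly. Function names are hypothetical.

```c
/* Overall speedup when a fraction p of the runtime is accelerated by s. */
static double amdahl_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}

/* Offloaded fraction implied by a theoretical limit (s -> infinity):
 * limit = 1 / (1 - p)  =>  p = 1 - 1 / limit. */
static double fraction_from_limit(double limit)
{
    return 1.0 - 1.0 / limit;
}

/* Effective accelerator speedup implied by a measured overall speedup:
 * solve 1 / measured = (1 - p) + p / s for s. */
static double accel_speedup_from_measured(double p, double measured)
{
    return p / (1.0 / measured - (1.0 - p));
}
```

With a limit of 2.8, `fraction_from_limit` gives p of about 0.64, and a measured overall speedup of 1.6 corresponds to an effective accelerator speedup of about 2.4 on the offloaded part. This is consistent with the slide's remark that the PCIe link, not the computation, is the bottleneck.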
  20. Current results: Algorithmic approach
      - Speedup x7.5 (theoretical limit x30)
      - Precision 10^-4
        - Single memory and alternate read/write bottleneck
        - Clock frequency limitations (current board and commercial tools)
        - 100 MHz vs 300 MHz
      [Diagram labels: CPU, DMA, RAM, PCI-e, FPGA, DDR, FIFOs, Euler 1D; data blocks [rho, u, p] in, [rho, u, p] t, [rho, u, p] out]
  21. Current results: Lessons learnt
      - Algorithmic better than procedural approach
        - x2.8 → x30 (theoretical limits)
      - More than one memory
        - x30 → x60 (theoretical limits)
      - Larger DSP-oriented FPGAs
      - Communications IPs
        - Design time reduction
      - Ongoing research on in-house tools
        - Design time reduction
      New FPGA boards: DINI (Xilinx), GIDEL (Altera)
  22. Agenda
      - CFD with FPGAs
      - Hardware Design Methodology
      - Current results
      - Future work
  23. Future work
      - Roe 2D, Euler 2D
        - Speedup depends on actual code
      - Algorithmic approach is the main target
      - HPC FPGA board
        - 2 large FPGAs
        - 2x4 GB DDR-2 memories per FPGA
        - Communication IPs
      - Evaluation of DDR-3 memories
  24. Future work
      - Reduce precision analysis time
      - Create real C-to-VHDL compiler
      - Research on architecture optimization
        - CFD processor
        - Memory accesses
        - Multiple FPGAs
      - Integration in cluster nodes
        - Communications between nodes
  25. Future work
      - Analyze CFD code (C)
      - HW-oriented C code → AUTOMATIC
      - Precision analysis → AUTOMATIC
      - VHDL code
        - Computation → AUTOMATIC
        - Communications → IP-based
      - FPGA programming → Standard FPGA flow
  26. Projects and collaborations
      - DOVRES (Fusim-E)
      - DOMINO
      - AMEBA 3
      - AMURA
      - Airbus Spain
      - Universidad Politécnica de Madrid
      - Universidad Autónoma de Madrid
      - INTA
