bibbidi N-BObbiDY boo
Magic Acceleration of N-Body Simulation
E. Del Sozzo, M. Rabozzi, M. Nanni, M. D. Santambrogio
emanuele.delsozzo@polimi.it
marco.rabozzi@polimi.it
marco3.nanni@mail.polimi.it
marco.santambrogio@polimi.it
Xilinx Open Hardware 2017 Contest
N-Body Simulation 2
[Figure: bodies 1, 2 and 3 with pairwise forces F1,2, F1,3, F2,1, F2,3, F3,1, F3,2]
N-Body Simulation 3
Algorithm Overview 4
COMPUTE FORCES: O(N^2)
UPDATE VELOCITIES AND BODY POSITIONS: O(N)
N = Number of Bodies
Repeat T times
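A minimal Python sketch of the two phases on this slide, repeated for T steps. It is not the authors' FPGA design: the gravitational constant, the softening term, the semi-implicit Euler update and all function names are assumptions chosen for illustration.

```python
import math

G = 1.0           # gravitational constant in arbitrary units (assumption)
SOFTENING = 1e-3  # softening length to avoid division by zero (assumption)


def compute_forces(pos, mass):
    """O(N^2) phase: accumulate on each body the force from every other body."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from body i to body j
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            dist2 = sum(c * c for c in d) + SOFTENING ** 2
            inv_dist3 = 1.0 / (dist2 * math.sqrt(dist2))
            s = G * mass[i] * mass[j] * inv_dist3
            for k in range(3):
                forces[i][k] += s * d[k]
    return forces


def update_bodies(pos, vel, mass, forces, dt):
    """O(N) phase: advance velocities first, then positions, by one time step."""
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += forces[i][k] / mass[i] * dt
            pos[i][k] += vel[i][k] * dt


def simulate(pos, vel, mass, dt, steps):
    """Repeat both phases `steps` times (T on the slide); updates lists in place."""
    for _ in range(steps):
        forces = compute_forces(pos, mass)
        update_bodies(pos, vel, mass, forces, dt)
    return pos, vel
```

Calling `simulate(pos, vel, mass, dt=0.01, steps=10)` with `pos` and `vel` as lists of `[x, y, z]` values and `mass` as a list of floats advances the system by ten steps. The nested loop in `compute_forces` evaluates N(N-1) body pairs per step, which is why the force phase dominates the running time.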
Platforms 5
ASIC (Application-Specific Integrated Circuit)
CPU (Central Processing Unit)
GPU (Graphics Processing Unit)
FPGA (Field Programmable Gate Array)
Which one? 6
Parameters 7
ASIC
2 x 480 GFLOPS
FPGA
CPU
GPU
3.1 – 68.7 Mpairs/s
192.0 - 6312.0 Mpairs/s ≃2327.3 Mpairs/s
J. Makino and H. Daisaka, “GRAPE-8: An accelerator for gravitational N-body simulation with 20.5 Gflops/W performance,” in 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE, 2012, pp. 1–10.
B. Peng, T. Wang, X. Jin, and C. Wang, “An accelerating solution for N-body MOND simulation with FPGA-SoC,” International Journal of Reconfigurable Computing, vol. 2016, 2016.
E. Del Sozzo, L. Di Tucci, M. D. Santambrogio, “A Highly Scalable and Efficient Parallel Design of N-Body Simulation on FPGA,” Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, Milano, Italy.
Parameters 8
ASIC
20.5 GFLOPS/W
FPGA
CPU
GPU
2.1 – 6.7 Mpairs/s/W
63.1 – 96.0 Mpairs/s/W ≃116.4 Mpairs/s/W
Parameters 9
ASIC
$200,000.00
FPGA
CPU
GPU
$200.00 - $400.00
$3,990.00 $3,495.00
Parameters 10
ASIC
FPGA
CPU
GPU
Our Choice 11
Thanks for your attention 12
Bibbidi N-Bobbidy boo at NECST
(https://www.facebook.com/BibbidiNBobbidyboo/)
Bibbidy N-BObbiDY boo at NECST
(https://www.slideshare.net/bibbidyN-BObbiDYboo)
Emanuele Del Sozzo
emanuele.delsozzo@polimi.it
Marco Rabozzi
marco.rabozzi@polimi.it
Marco Nanni
marco3.nanni@mail.polimi.it
Marco D. Santambrogio
marco.santambrogio@polimi.it
@N_BodyAtNECST
(https://twitter.com/N_BodyAtNECST)

2. Rationale behind FPGA
