Reconfigurable FPGA-based Clusters: Next step in Supercomputing
Vivek Venugopal, Kevin Shinpaugh

Introduction

• Current High Performance Computing (HPC) applications include genome sequencing (BLAST), molecular dynamics simulation (AMBER, NAMD), astrophysics simulation, weather prediction, etc.

Motivation

• HPC systems cater to two types of applications: compute bound applications and I/O bound applications.
• HPC systems need to be built according to the application for maximum efficiency.

[Figure: Current HPC scenario vs. future HPReC scenario. Current: the application runs on a GPP with fixed hardware for computation and communication; performance and speedup scale with more processors. Future: the application runs on flexible logic blocks; better performance and speedup with FPGAs.]

Reconfigurable systems

• Reconfigurable computing is based on the concept that the application defines the processor.
• FPGAs are inherently parallel with lower power dissipation and are available with a huge library of application cores.
• Reconfiguration can result in (i) efficient hardware utilization for repeated operations in a specific application and (ii) better data passing on the interconnection network between the processors.
• Partially Reconfigurable System
[Figure: Inside each node, a bus/switch connects an FPGA, RAM and a GPP; nodes are joined by an interconnection network.]
• Completely Reconfigurable System
[Figure: Inside each node, a bus/switch connects an I/F FPGA, RAM and an FPGA; nodes are joined by an interconnection network.]

HPReC systems

[Figure: FPGA processor/co-processor nodes on an interconnect network, drawn as a 4 x 4 grid of nodes labelled 00 to 33 with an Infiniband interconnect. Each node contains memory, user logic (>100 million gates) and a Power PC.]

• Cluster of FPGAs equivalent to a huge processor with embedded reconfigurability, replication and parallelism
• Issues prevalent with systems:
     • Scalability of the system with respect to type of application
     • Availability of a fast interconnection network for I/O bound applications (Bandwidth access)
     • More processors or more floating point cores for compute bound applications

Test platforms (HPReC)

                        Cray XD1 platform                       SGI RASC platform
Processing hardware     AMD Opteron + Xilinx Virtex-4 FPGAs     Intel Itanium + Xilinx Virtex-II FPGAs
Interconnect network    RapidArray Interconnect                 NUMAlink Interconnect
Bandwidth access        4 GB/s                                  12.8 GB/s

[Figure: Cray XD1 node. AMD Opteron CPU, RapidArray Interconnect interface chip, Xilinx Virtex-4 FPGA, 16 MB QDR SDRAM cache memory, RapidArray Interconnect bus. Link labels: 3.2 GB/s, 3.2 GB/s, 12.8 GB/s, 2 x 2 GB/s.]

[Figure: SGI RASC node. Intel Itanium2 CPU, PCI 66 MHz, loader FPGA with SelectMAP programming interface, TIO, Xilinx Virtex-II algorithm FPGA, 16 MB QDR SDRAM cache memory, NUMAlink 4 interconnect bus. Link labels: 2 x 3.2 GB/s, 9.6 GB/s, 4 x 3.2 GB/s.]

• Most of the hardware mapping from the software is automated and is based on the availability of specified libraries or processors for implementation, e.g. Mitrionics, Handel-C, etc.

HPReC applications

• A combination of I/O bound and compute bound applications
• Bioinformatics
     • Smith-Waterman algorithm
     • BLAST
• Physics
     • Collider data
• Molecular Simulation Dynamics
     • AMBER
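
The Smith-Waterman algorithm listed among the bioinformatics applications gives a concrete picture of the parallelism mentioned above. The sketch below is a minimal illustration, not the poster's implementation: the scoring values (match, mismatch, linear gap) and both function names are assumptions chosen for the example. The first function fills the scoring matrix row by row; the second fills it one anti-diagonal at a time. Cells on the same anti-diagonal do not depend on each other, which is the kind of regular, independent work that parallel hardware can typically evaluate side by side.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, filling the matrix row by row.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            H[i][j] = max(0, diag, up, left)  # local alignment never goes below 0
            best = max(best, H[i][j])
    return best


def smith_waterman_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Same recurrence, but visiting cells one anti-diagonal (i + j = d) at a time.

    Every cell on a given anti-diagonal reads only cells from the two
    previous anti-diagonals, so the inner loop has no internal dependencies.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for d in range(2, rows + cols - 1):
        i_lo = max(1, d - (cols - 1))
        i_hi = min(rows - 1, d - 1)
        for i in range(i_lo, i_hi + 1):  # independent cells: candidates for parallel evaluation
            j = d - i
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    s, t = "ACACACTA", "AGCACACA"
    print(smith_waterman(s, t), smith_waterman_wavefront(s, t))
```

Both functions return the same score for any input; the wavefront ordering only changes the order in which cells are visited, so it is the form to start from when mapping the inner loop onto parallel processing elements.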


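The bandwidth figures quoted for the two test platforms (4 GB/s for the Cray XD1, 12.8 GB/s for the SGI RASC) can be turned into a rough check of whether a kernel is likely to be I/O bound or compute bound. The sketch below is only a back-of-the-envelope model: the two bandwidth values come from the poster, while the `classify` helper, the sustained operation rate and the kernel sizes in the usage section are hypothetical. It also assumes 1 GB = 10^9 bytes and ignores latency and any overlap of transfer with computation.

```python
# Bandwidth access per platform, in GB/s, as listed in the poster's comparison.
PLATFORM_BANDWIDTH_GBPS = {"Cray XD1": 4.0, "SGI RASC": 12.8}


def classify(bytes_moved, operations, bandwidth_gbps, ops_per_second):
    """Compare data-transfer time with compute time for one kernel.

    bytes_moved    -- total bytes crossing the interconnect (hypothetical input)
    operations     -- total arithmetic operations (hypothetical input)
    bandwidth_gbps -- interconnect bandwidth in GB/s (1 GB assumed to be 1e9 bytes)
    ops_per_second -- sustained operation rate of the node (hypothetical input)
    Returns (label, transfer_seconds, compute_seconds).
    """
    transfer_s = bytes_moved / (bandwidth_gbps * 1e9)
    compute_s = operations / ops_per_second
    label = "I/O bound" if transfer_s > compute_s else "compute bound"
    return label, transfer_s, compute_s


if __name__ == "__main__":
    # Made-up kernel: 8 GB of data, 1e9 operations, 1e10 operations per second.
    for name, bw in PLATFORM_BANDWIDTH_GBPS.items():
        label, t_io, t_cpu = classify(8e9, 1e9, bw, 1e10)
        print(f"{name}: {label} (transfer {t_io:.3f} s, compute {t_cpu:.3f} s)")
```

When the transfer time dominates, the model points towards a faster interconnect; when the compute time dominates, it points towards more processors or more floating point cores, which matches the two issues listed for HPReC systems.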
References
[1] “Cray XD1 datasheet,” Cray Inc., Technical report, 2005. Available: http://www.cray.com/downloads/Cray_XD1_Datasheet.pdf
[2] “Cray XD1 supercomputer for reconfigurable computing,” Cray Inc., Technical report, 2005. Available: http://www.cray.com/downloads/FPGADatasheet.pdf
[3] “SGI Reconfigurable Application Specific Computing: Accelerating Production Workflows,” Silicon Graphics Inc., Technical report, December 2006. Available: http://www.sgi.com/pdfs/3984.pdf
[4] “Extraordinary Acceleration of Workflows with Reconfigurable Application-specific Computing from SGI,” Silicon Graphics Inc., Technical report, November 2004. Available: http://www.sgi.com/pdfs/3721.pdf

Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 

Reconfigurable FPGA-based Clusters: Next step in Supercomputing

Vivek Venugopal, Kevin Shinpaugh

Introduction
  • Current High Performance Computing (HPC) applications include genome sequencing (BLAST), molecular dynamics simulation (AMBER, NAMD), astrophysics simulation, weather prediction, etc.

Motivation
  • HPC systems cater to two types of applications: compute bound applications and I/O bound applications.
  • HPC systems need to be built according to the application for maximum efficiency.
  • Current platform (HPC scenario): GPP with fixed hardware for computation and communication. Performance and speedup scale with more processors.
  • Future platform (HPReC scenario): flexible hardware logic blocks. Better performance and speedup with FPGA.

Reconfigurable systems
  • Reconfigurable computing is based on the concept that the application defines the processor.
  • FPGAs are inherently parallel with lower power dissipation and are available with a huge library of application cores.
  • Reconfiguration can result in (i) efficient hardware utilization for repeated operations in a specific application and (ii) better data passing on the interconnection network between the processors.

HPReC systems
  [Figure: FPGA processor/co-processor nodes, numbered 00 to 33, joined by an interconnect network.]
  • Partially Reconfigurable System. [Figure labels, inside each node: Bus/Switch, FPGA, RAM, GPP, Interconnection Network.]
  • Completely Reconfigurable System. [Figure labels, inside each node: Bus/Switch, Infiniband interconnect I/F, Memory, RAM, FPGA, User Logic >100 million gates, FPGA Power PC, Interconnection Network.]
  • Cluster of FPGAs equivalent to a huge processor with embedded reconfigurability, replication and parallelism.
  • Issues prevalent with systems:
      • Scalability of the system with respect to type of application
      • Availability of a fast interconnection network for I/O bound applications (bandwidth access)
      • More processors or more floating point cores for compute bound applications

Test platforms (HPReC)
                          Cray XD1                      SGI RASC
  Processing hardware     AMD Opteron +                 Intel Itanium +
                          Xilinx Virtex-4 FPGAs         Xilinx Virtex-II FPGAs
  Interconnect network    Rapid Array Interconnect      Numalink Interconnect
  Bandwidth access        4 GB/s                        12.8 GB/s

  [Figure: Cray XD1 node. Labels: CPU AMD Opteron; FPGA Xilinx Virtex 4; Cache Memory 16 MB QDR SDRAM; RapidArray Interconnect Interface chip; RapidArray Interconnect Bus; link bandwidths 3.2 GB/s, 3.2 GB/s, 12.8 GB/s, 2 x 2 GB/s.]
  [Figure: SGI RASC node. Labels: CPU Intel Itanium2; Loader FPGA; Select map prog. interface; PCI 66 MHz; TIO; Algorithm FPGA Xilinx Virtex II; Cache Memory 16 MB QDR SDRAM; Numalink 4 Interconnect Bus; link bandwidths 2 x 3.2 GB/s, 9.6 GB/s, 4 x 3.2 GB/s.]
  • Most of the hardware mapping from the software is automated and is based on the availability of specified libraries or processors for implementation, e.g. Mitrionics, Handel-C, etc.

HPReC applications
  • A combination of I/O bound and compute bound applications
  • Bioinformatics
      • Smith Waterman algorithm
      • BLAST
  • Physics
      • Collider data
  • Molecular Simulation Dynamics
      • AMBER

References
[1] "Cray XD1 datasheet," Cray Inc., Technical report, 2005. Available: http://www.cray.com/downloads/Cray_XD1_Datasheet.pdf
[2] "Cray XD1 supercomputer for reconfigurable computing," Cray Inc., Technical report, 2005. Available: http://www.cray.com/downloads/FPGADatasheet.pdf
[3] "SGI Reconfigurable Application Specific Computing: Accelerating Production Workflows," Silicon Graphics Inc., Technical report, December 2006. Available: http://www.sgi.com/pdfs/3984.pdf
[4] "Extraordinary Acceleration of Workflows with Reconfigurable Application-specific Computing from SGI," Silicon Graphics Inc., Technical report, November 2004. Available: http://www.sgi.com/pdfs/3721.pdf
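
Illustrative sketch (not part of the original poster): the Smith Waterman algorithm listed under the bioinformatics applications is a compute bound dynamic-programming kernel. A minimal software version, assuming a linear gap penalty and hypothetical scoring values (match=2, mismatch=-1, gap=-1) chosen only for illustration, might look like this:

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the poster.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on one anti-diagonal can be computed independently. That regular, repeated operation is the kind of workload that maps onto parallel hardware logic, although the code above is only a sequential reference and not how either test platform implements it.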