Reconfigurable FPGA-based Clusters: Next step in Supercomputing
Vivek Venugopal, Kevin Shinpaugh
Introduction
• Current High Performance Computing (HPC) applications include genome sequencing (BLAST), molecular dynamics simulation (AMBER, NAMD), astrophysics simulation, weather prediction, etc.

Motivation
• HPC systems cater to two types of applications: compute bound applications and I/O bound applications.
• More processors or more floating point cores for compute bound applications.
• HPC systems need to be built according to the application for maximum efficiency.
[Figure: HPC applications divided into compute bound and I/O bound applications. Labels: Infiniband interconnect, Memory.]
[Figure: Current HPC scenario (GPP with fixed hardware for computation and communication; performance and speedup scale with more processors) versus future HPReC scenario (FPGA with flexible logic blocks; better performance and speedup with FPGA).]

Reconfigurable systems
• Reconfigurable computing is based on the concept that the application defines the processor.
• FPGAs are inherently parallel with lower power dissipation and are available with a huge library of application cores.
• Reconfiguration can result in (i) efficient hardware utilization for repeated operations in a specific application and (ii) better data passing on the interconnection network between the processors.

HPReC systems
• Partially Reconfigurable System
[Figure: processor/co-processor network; a 4 x 4 grid of nodes (00 to 33) connected by a Bus/Switch interconnect. Inside each node: FPGA, RAM, GPP, Interconnection Network.]
• Completely Reconfigurable System
[Figure: nodes connected by a Bus/Switch interconnect. Inside each node: I/F FPGA, RAM, User Logic (>100 million gates), n FPGA, Power PC, Interconnection Network.]
• Cluster of FPGAs equivalent to a huge processor with embedded reconfigurability, replication and parallelism
• Issues prevalent with systems:
  • Scalability of the system with respect to type of application
  • Availability of a fast interconnection network for I/O bound applications (Bandwidth access)

Test platforms (HPReC)
                      | Cray XD1 platform                    | SGI RASC platform
Processing hardware   | AMD Opteron + Xilinx Virtex-4 FPGAs  | Intel Itanium + Xilinx Virtex-II FPGAs
Interconnect network  | Rapid Array Interconnect             | Numalink Interconnect
Bandwidth access      | 4 GB/s                               | 12.8 GB/s
• Most of the hardware mapping from the software is automated and is based on the availability of specified libraries or processors for implementation, e.g. Mitrionics, Handel-C, etc.
[Figure: Cray XD1 node. Labels: CPU (AMD Opteron), 3.2 GB/s, Cache Memory (16 MB QDR SDRAM), FPGA (Xilinx Virtex 4), Application Interface chip, RapidArray Interconnect, 2 x 2 GB/s, RapidArray Interconnect Bus.]
[Figure: SGI RASC node. Labels: CPU (Intel Itanium2), Select map, Loader FPGA, prog. interface, PCI 66 MHz, 2 x 3.2 GB/s, Cache Memory (16 MB QDR SDRAM), Algorithm FPGA (Xilinx Virtex II), TIO, 9.6 GB/s, 12.8 GB/s, 4 x 3.2 GB/s, Numalink 4 Interconnect Bus.]

HPReC applications
• A combination of I/O bound and compute bound applications
• Bioinformatics
  • Smith Waterman algorithm
  • BLAST
• Physics
  • Collider data
• Molecular dynamics simulation
  • AMBER
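
The Smith Waterman algorithm named among the HPReC applications is a typical case of the "repeated operations" that reconfigurable hardware can exploit. The following is a minimal software sketch of its scoring recurrence, not the implementation used on the test platforms. The function name `smith_waterman_score` and the scoring values (match = 2, mismatch = -1, linear gap = -1) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    # prev is the previous row of the scoring matrix H; row 0 is all zeros.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Gap in b (move down) or gap in a (move right).
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal of the matrix can be computed at the same time. That same small max-and-add operation, repeated for every cell, is the kind of workload that maps naturally onto an array of identical processing elements in an FPGA.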