GROMACS Molecular Dynamics on GPU

Benchmarks showing the benefits of running the GROMACS molecular dynamics application on GPUs

  • Nodes: CPU only / with GPU (nanoseconds/day). 1 node: 2.26 / 8.36; 2 nodes: 3.58 / 13.01; 4 nodes: 6.7 / 21.68
  • Nodes: CPU / GPU (nanoseconds/day). 8 nodes: 6.613 / 20.335; 16 nodes: 11.282 / 37.016; 32 nodes: 23.067 / 63.876; 64 nodes: 42.284 / 96.628; 128 nodes: 72.694 / 144.424
  • Nanoseconds/day. 8 X5550: 6.7; 2 M2090 + 2 X5550: 8.36. CPU node: 4 x 2 x $1000 = $8000. CPU + GPU node: 1 x 2 x $1000 + 2 x $2000 = $6000
  • GPU: 640 (watts) * 10,334 (seconds/nanosecond) = 6.6 megajoules. CPU: 760 (watts) * 12,895 (seconds/nanosecond) = 9.8 megajoules
  • Before we end this session I would like to tell you about GPU Test Drive. It is an excellent resource for computational chemistry researchers such as yourself to evaluate the benefits of GPU computing in speeding up your simulations. Most importantly, it is free. NVIDIA, along with its partners, is offering access to a remotely hosted GPU cluster. You can run applications such as AMBER and NAMD to find out how your models speed up. You can also try code that you have developed to run on GPUs and see how it scales on an 8-GPU cluster. All you need to do is sign up and log in; it is really that easy! We have several partners who are demonstrating the GPU Test Drive on the GTC show floor. Please plan on visiting them. Sign-up forms have been given out. If you are interested, please fill them out and return them to me.
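
The cost and energy arithmetic in the notes above can be reproduced with a short sketch. It takes only the figures stated in the notes (760 W and 6.7 ns/day for the four CPU nodes, 640 W and 8.36 ns/day for the GPU node, $1000 per CPU, $2000 per GPU). The function and variable names are illustrative assumptions, not part of the original deck, and the GPU node cost follows the notes' unit prices ($6000) rather than the $6,500 label on the slide 5 chart.

```python
# Sketch of the cost and energy arithmetic from the slide notes.
# All names are hypothetical; all numbers come from the notes.

SECONDS_PER_DAY = 86_400


def seconds_per_ns(ns_per_day):
    """Wall-clock seconds needed to simulate one nanosecond."""
    return SECONDS_PER_DAY / ns_per_day


def energy_per_ns_mj(power_watts, ns_per_day):
    """Energy = power x time, reported in megajoules per simulated nanosecond."""
    return power_watts * seconds_per_ns(ns_per_day) / 1e6


# ADH in water (134K atoms): four CPU nodes vs. one node with two M2090s.
cpu_ns_per_day, cpu_watts = 6.7, 760
gpu_ns_per_day, gpu_watts = 8.36, 640

# Hardware cost from the unit prices in the notes.
cpu_cost = 4 * 2 * 1000             # 4 nodes x 2 CPUs x $1000
gpu_cost = 1 * 2 * 1000 + 2 * 2000  # 1 node x 2 CPUs x $1000 + 2 GPUs x $2000

cpu_mj = energy_per_ns_mj(cpu_watts, cpu_ns_per_day)
gpu_mj = energy_per_ns_mj(gpu_watts, gpu_ns_per_day)

energy_saving = 1 - gpu_mj / cpu_mj                # fraction of energy saved per ns
speed_gain = gpu_ns_per_day / cpu_ns_per_day - 1   # fractional throughput gain

print(f"CPU nodes: {cpu_mj:.1f} MJ/ns, ${cpu_cost}")
print(f"GPU node:  {gpu_mj:.1f} MJ/ns, ${gpu_cost}")
print(f"Energy saved per ns: {energy_saving:.0%}, speed gain: {speed_gain:.0%}")
```

The results should land close to the 9.8 and 6.6 megajoules in the notes, and to the "33% less energy" and "25% faster" claims on slides 5 and 6.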

Transcript

  • 1. GROMACS 4.6 Pre-Beta and 4.6 Beta
  • 2. Benefits of GPU Accelerated Computing. Faster than CPU-only systems in all tests. Large performance boost with marginal price increase. Energy usage cut by more than half. GPUs scale well within a node and over multiple nodes. K20 GPU is our fastest and lowest-power high-performance GPU yet. Try GPU-accelerated GROMACS for free: www.nvidia.com/GPUTestDrive
  • 3. Great Scaling in Small Systems. Running GROMACS 4.6 pre-beta with CUDA 4.1. Each blue node contains 1x Intel X5550 CPU (95W TDP, 4 cores per CPU). Each green node contains 1x Intel X5550 CPU (95W TDP, 4 cores per CPU) and 1x NVIDIA M2090 (225W TDP per GPU). Benchmark system: RNAse in water with 16,816 atoms in a truncated dodecahedron box. [Chart: nanoseconds/day vs. number of nodes, CPU only vs. with GPU. With-GPU values: 8.36 (3.7x), 13.01 (3.6x), 21.68 (3.2x).] Get up to 3.7x performance compared to CPU-only nodes.
  • 4. Additional Strong Scaling on Larger System. 128K Water Molecules. Running GROMACS 4.6 pre-beta with CUDA 4.1. Each blue node contains 1x Intel X5670 (95W TDP, 6 cores per CPU). Each green node contains 1x Intel X5670 (95W TDP, 6 cores per CPU) and 1x NVIDIA M2070 (225W TDP per GPU). [Chart: nanoseconds/day vs. number of nodes (8, 16, 32, 64, 128), CPU only vs. with GPU. Labeled speedups: 3.1x, 2.8x, 2x.] Up to 128 nodes, NVIDIA GPU-accelerated nodes deliver 2-3x performance when compared to CPU-only nodes.
  • 5. Replace 3 Nodes with 2 GPUs. Running GROMACS 4.6 pre-beta with CUDA 4.1. ADH in Water (134K Atoms). The blue node contains 2x Intel X5550 CPUs (95W TDP, 4 cores, $1000 per CPU). The green node contains 2x Intel X5550 CPUs (95W TDP, 4 cores, $1000 per CPU) and 2x NVIDIA M2090s as the GPU (225W TDP, $2000 per GPU). [Chart: nanoseconds/day and cost. 4 CPU nodes: 6.7 ns/day, $8,000. GPU node: 8.36 ns/day, $6,500.] Save thousands of dollars and perform 25% faster.
  • 6. Greener Science. ADH in Water (134K Atoms). Running GROMACS 4.6 with CUDA 4.1. The blue nodes contain 2x Intel X5550 CPUs (95W TDP, 4 cores per CPU). The green node contains 2x Intel X5550 CPUs (95W TDP, 4 cores per CPU) and 2x NVIDIA M2090 GPUs (225W TDP per GPU). Energy Expended = Power x Time. [Chart: energy expended in kilojoules, lower is better. 4 nodes (760 Watts) vs. 1 node + 2x M2090 (640 Watts).] In simulating each nanosecond, the GPU-accelerated system uses 33% less energy.
  • 7. The Power of Kepler. RNase Solvated Protein, 24k Atoms. Running GROMACS version 4.6 beta. The grey nodes contain 1 or 2 E5-2687W CPUs (150W each, 8 cores per CPU) and 1 or 2 NVIDIA M2090s. The green nodes contain 1 or 2 E5-2687W CPUs (8 cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each). [Chart: M2090 vs. K20X for 1 CPU + 1 GPU, 1 CPU + 2 GPU, 2 CPU + 1 GPU, 2 CPU + 2 GPU.] Upgrading an M2090 to a K20X increases performance 10-45%.
  • 8. K20X – Fast. RNase Solvated Protein, 24k Atoms. Running GROMACS version 4.6 beta. The blue nodes contain 1 or 2 E5-2687W CPUs (150W each, 8 cores per CPU). The green nodes contain 1 or 2 E5-2687W CPUs (8 cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each). [Chart: nanoseconds/day for 1 CPU and 2 CPUs, CPU only vs. with 1 K20X.] Adding a K20X increases performance by up to 3x.
  • 9. K20X, the Fastest Yet. 192K Water Molecules. Running GROMACS version 4.6-beta2 and CUDA 5.0.35. The blue node contains 2 E5-2687W CPUs (150W each, 8 cores per CPU). The green nodes contain 2 E5-2687W CPUs (8 cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each). [Chart: nanoseconds/day for CPU, CPU + K20X, CPU + 2x K20X.] Using K20X nodes increases performance by 2.5x.
  • 10. Recommended GPU Node Configuration for GROMACS Computational Chemistry. Workstation or Single Node Configuration:
    # of CPU sockets: 2
    Cores per CPU socket: 6+
    CPU speed (GHz): 2.66+
    System memory per socket (GB): 32
    GPUs: Kepler K10, K20, K20X; Fermi M2090, M2075, C2075
    # of GPUs per CPU socket: 1x. Kepler-based GPUs (K20X, K20 or K10) need fast Sandy Bridge or perhaps the very fastest Westmeres, or high-end AMD Opterons
    GPU memory preference (GB): 6
    GPU to CPU connection: PCIe 2.0 or higher
    Server storage: 500 GB or higher
    Network configuration: Gemini, InfiniBand
    Scale to multiple nodes with same single node configuration
  • 11. GPU Test Drive. Experience GPU Acceleration. For Computational Chemistry Researchers, Biophysicists. Preconfigured with Molecular Dynamics Apps. Remotely Hosted GPU Servers. Free & Easy: Sign up, Log in and See Results. www.nvidia.com/gputestdrive
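
The speedup labels on slides 3 and 4 can be checked against the nanoseconds/day figures in the slide notes. The sketch below does that, and also derives a strong-scaling efficiency relative to the 8-node run. That efficiency metric is an added illustration, not something the deck reports, and all function and variable names are hypothetical.

```python
# Sketch: recompute GPU-vs-CPU speedups from the ns/day figures in the notes.
# Names are illustrative; the efficiency metric is an assumption added here.


def speedup(cpu_ns_per_day, gpu_ns_per_day):
    """How many times faster the GPU-accelerated run is than the CPU-only run."""
    return gpu_ns_per_day / cpu_ns_per_day


def scaling_efficiency(rows, rate_index):
    """Throughput gain divided by node-count gain, relative to the first row."""
    base_nodes = rows[0][0]
    base_rate = rows[0][rate_index]
    return [
        (row[0], (row[rate_index] / base_rate) / (row[0] / base_nodes))
        for row in rows
    ]


# RNAse in water (slide 3): (CPU-only ns/day, with-GPU ns/day) per chart group.
rnase = [(2.26, 8.36), (3.58, 13.01), (6.7, 21.68)]

# 128K water molecules (slide 4): (nodes, CPU ns/day, GPU ns/day).
water = [
    (8, 6.613, 20.335),
    (16, 11.282, 37.016),
    (32, 23.067, 63.876),
    (64, 42.284, 96.628),
    (128, 72.694, 144.424),
]

rnase_speedups = [speedup(cpu, gpu) for cpu, gpu in rnase]
water_speedups = [(nodes, speedup(cpu, gpu)) for nodes, cpu, gpu in water]
cpu_efficiency = scaling_efficiency(water, 1)
gpu_efficiency = scaling_efficiency(water, 2)

for value in rnase_speedups:
    print(f"RNAse speedup: {value:.1f}x")
for (nodes, value), (_, eff) in zip(water_speedups, gpu_efficiency):
    print(f"{nodes:>3} nodes: {value:.1f}x speedup, GPU scaling efficiency {eff:.0%}")
```

The speedup shrinks as node count grows, which is consistent with the deck's summary of "2-3x" up to 128 nodes.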