Your SlideShare is downloading. ×
Nvidia SC13 Podcast
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Nvidia SC13 Podcast

544
views

Published on

In this slidecast, Sumit Gupta from Nvidia discusses the latest product news on GPU computing for HPC. …

In this slidecast, Sumit Gupta from Nvidia discusses the latest product news on GPU computing for HPC.

* IBM and NVIDIA Partner to Build Next-Generation Supercomputers
* NVIDIA Launches the Tesla K40 GPU Accelerator, their fastest accelerator ever

Learn more: http://nvidianews.nvidia.com/Releases/NVIDIA-Launches-World-s-Fastest-Accelerator-for-Supercomputing-and-Big-Data-Analytics-a66.aspx

Watch the video presentation: http://wp.me/p3RLHQ-aRY

Published in: Technology

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
544
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
22
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. SUPERCOMPUTING 2013 PRESS DECK Sumit Gupta | General Manager, Tesla Accelerated Computing
  • 2. SC13 News 1 IBM Taps GPU Accelerators 2 New Product Announcements 3 New Supercomputer Announcements
  • 3. Accelerated Computing Growing Fast 2x Growth in One Year 50% Percent of HPC Systems With Accelerators 44% Hundreds of GPU Accelerated Apps 300 242 250 40% 200 30% 22% 24% 150 20% NVIDIA GPU is Accelerator of Choice INTEL PHI 4% OTHERS 11% 182 113 100 10% 50 0% 0 2010 2011 2012 Intersect360 Research HPC User Site Census: Systems, July 2013 NVIDIA GPUs 85% 2011 2012 2013 Intersect360 Research HPC User Site Census: Systems, July 2013
  • 4. IBM Using GPUs to Accelerate Enterprise & Data Analytics Applications Application Infrastructure Business Intelligence Predictive Analytics Risk Analytics
  • 5. IBM Partners with NVIDIA to Build NextGeneration Supercomputers + Tesla GPU POWER8 CPU GPU-Accelerated POWER-Based Systems Available in 2014
  • 6. GPU Computing in Data Centers Power ARM64 x86 x86 2007 2008 2009 2010 2011 2012 2013 2014
  • 7. Linux GCC Compiler to Support GPU Accelerators Open Source OpenACC in GCC by Mentor Graphics & Samsung Pervasive Impact Free to all Linux users Mainstream Most Widely Used HPC Compiler “ Incorporating OpenACC into GCC is an excellent example of open source and open standards working together to make accelerated computing broadly accessible to all Linux developers. ” 7 OpenACC-standard.org confidential Oscar Hernandez Oak Ridge National Laboratory
  • 8. SC13 News 1 IBM Taps GPU Accelerators 2 New Product Announcements 3 New Supercomputer Announcements
  • 9. Tesla K40 World’s Fastest Accelerator for Supercomputing and Big Data Analytics CUDA 6 Dramatically Simplifies Parallel Programming with Unified Memory
  • 10. Tesla K40 World’s Fastest Accelerator FASTER 1.4 TF| 2880 Cores | 288 GB/s ns/day 5 LARGER 2x Memory Enables More Apps AMBER Benchmark 4 SMARTER Unlock Extra Performance Using Power Headroom 6GB 3 2 Fluid Rendering Dynamics Seismic Analysis 1 0 CPU K20X K40 GPU Boost 12GB AMBER Benchmark: SPFP-Nucleosome CPU: Dual E5-2687W @ 3.10GHz, 64GB System Memory, CentOS 6.2, GPU systems: Single Tesla K20X or Single Tesla K40
  • 11. GPU Boost Up to 25% Extra Performance on Applications Use Power Headroom to Run at Higher Clocks 1.40 25% Faster 1.20 20% Faster 14% Faster 17% Faster 1.00 0.80 13% Faster 0.60 0.40 0.20 11% Faster 0.00 AMBER SPFP-TRPCage Tesla K40 (base) LAMMPS-EAM NAMD 2.9-APOA1 Tesla K40 with GPU Boost
  • 12. ANNOUNCING Unified Memory CUDA 6
  • 13. Unified Memory Dramatically Lower Developer Effort Developer View Today System Memory GPU Memory Developer View With Unified Memory Unified Memory
  • 14. Super Simplified Memory Management Code CPU Code void sortfile(FILE *fp, int N) { char *data; data = (char *)malloc(N); CUDA 6 Code with Unified Memory void sortfile(FILE *fp, int N) { char *data; cudaMallocManaged(&data, N); fread(data, 1, N, fp); qsort(data, N, 1, compare); qsort<<<...>>>(data,N,1,compare); cudaDeviceSynchronize(); use_data(data); use_data(data); free(data); } fread(data, 1, N, fp); cudaFree(data); }
  • 15. SC13 News 1 IBM Taps GPU Accelerators 2 New Product Announcements 3 New Supercomputer Announcements
  • 16. Fastest Supercomputer In Europe 6.27 PetaFLOPS (80% Linpack Efficiency) Piz Daint Greenest Petascale System 3110 MFLOPS/W #2: JUQUEEN: 2176 MFLOPS/W Production-Grade Weather Forecasts: COSMO 7 National Weather Agencies Germany | Greece | Italy | Poland | Russia | Romania | Switzerland
  • 17. Greenest Supercomputer in the World Tokyo Tech KFC System 4000+ MFLOPS per Watt 25% Higher than #1 Green500 System 160 Tesla K20X GPUs Oil Immersion Technology Current Green500 #1: CINECA Eurora System, Italy, 3208 MF/W
  • 18. ANSYS Fluent Doubles Performance with GPUs Automobile Drag Simulation Throughput 30 Number of Jobs per Day 25 90% Faster 20 15 2x 10 Better Insight for Low Drag Design 5 2% 0 CPU K40 2 x E5-2680 CPUs 8 cores used; 2 Tesla K40s Sedan Geometry, 3.6M mixed cells Steady, turbulent, external aerodynamics- Coupled PBNS, DP Solver 1.5B Less Drag Gal. of Fuel Saved/Year
  • 19. SUPERCOMPUTING 2013 PRESS DECK Sumit Gupta | General Manager, Tesla Accelerated Computing
  • 20. Additional Information
  • 21. Tesla K40 20-40% Faster than K20X on Applications 1.5 1.4x K20X 1.3x 1.2x 1.3x K40 @ base 1.3x 1.3x K40 @ boost 1.3x 1.0 0.5 0.0 ANSYS 14 LAMMPS NAMD 2.9 AMBER LSMS QMCPACK SMP-V14sp-4 EAM APOA1 SPFP-Nucleosome Fe32 3x3x1 CUBLAS
  • 22. First Tesla K40 Customers CSC Finland Texas Advanced Computing Center CEA France Swinburne Australia
  • 23. Tesla K40 OEM Partners
  • 24. K20X K40 Peak Single Precision Peak SGEMM 3.93 TF 2.95 TF 4.29 TF 3.22 TF Peak Double Precision Peak DGEMM 1.31 TF 1.22 TF 1.43 TF 1.33 TF Memory size 6 GB 12 GB Memory BW (ECC off) 250 GB/s 288 GB/s Memory Clock 2.6 GHz 3.0 GHz PCIe Gen Gen 2 Gen 3 # of Cores 2688 2880 Core Clock 732 MHz Base: 745 MHz Boost Clocks: 810 & 875 Mhz Total Board Power 235W 235W Form Factor PCIe Passive PCIe Passive, Active 9

×