SUPERCOMPUTING 2013 PRESS DECK

Sumit Gupta | General Manager, Tesla Accelerated Computing

Nvidia SC13 Podcast

In this slidecast, Sumit Gupta from Nvidia discusses the latest product news on GPU computing for HPC.

* IBM and NVIDIA Partner to Build Next-Generation Supercomputers
* NVIDIA Launches the Tesla K40 GPU Accelerator, its fastest accelerator ever

Learn more: http://nvidianews.nvidia.com/Releases/NVIDIA-Launches-World-s-Fastest-Accelerator-for-Supercomputing-and-Big-Data-Analytics-a66.aspx

Watch the video presentation: http://wp.me/p3RLHQ-aRY

Published in: Technology

Nvidia SC13 Podcast

  1. SUPERCOMPUTING 2013 PRESS DECK
     Sumit Gupta | General Manager, Tesla Accelerated Computing
  2. SC13 News
     1. IBM Taps GPU Accelerators
     2. New Product Announcements
     3. New Supercomputer Announcements
  3. Accelerated Computing Growing Fast: 2x Growth in One Year
     * Percent of HPC Systems With Accelerators (bar chart; data labels: 22%, 24%, 44%)
     * Hundreds of GPU Accelerated Apps (bar chart; data labels: 113, 182, 242)
     * NVIDIA GPU is Accelerator of Choice (pie chart): NVIDIA GPUs 85%, Others 11%, Intel Phi 4%
     Year axes in the extraction: 2010, 2011, 2012 and 2011, 2012, 2013
     Source: Intersect360 Research HPC User Site Census: Systems, July 2013
  4. IBM Using GPUs to Accelerate Enterprise & Data Analytics Applications
     * Application Infrastructure
     * Business Intelligence
     * Predictive Analytics
     * Risk Analytics
  5. IBM Partners with NVIDIA to Build Next-Generation Supercomputers
     Tesla GPU + POWER8 CPU
     GPU-Accelerated POWER-Based Systems Available in 2014
  6. GPU Computing in Data Centers
     Timeline chart, 2007 to 2014; platform labels: x86, ARM64, Power
  7. Linux GCC Compiler to Support GPU Accelerators
     * Open Source: OpenACC in GCC by Mentor Graphics & Samsung
     * Pervasive Impact: Free to all Linux users
     * Mainstream: Most Widely Used HPC Compiler
     “Incorporating OpenACC into GCC is an excellent example of open source and open standards working together to make accelerated computing broadly accessible to all Linux developers.”
     Oscar Hernandez, Oak Ridge National Laboratory
  8. SC13 News
     1. IBM Taps GPU Accelerators
     2. New Product Announcements
     3. New Supercomputer Announcements
  9. Tesla K40: World’s Fastest Accelerator for Supercomputing and Big Data Analytics
     CUDA 6: Dramatically Simplifies Parallel Programming with Unified Memory
  10. Tesla K40: World’s Fastest Accelerator
      * FASTER: 1.4 TF | 2880 Cores | 288 GB/s
      * LARGER: 2x Memory Enables More Apps (6GB vs. 12GB): Fluid Dynamics, Rendering, Seismic Analysis
      * SMARTER: Unlock Extra Performance Using Power Headroom (GPU Boost)
      AMBER Benchmark (bar chart, ns/day; CPU vs. K20X vs. K40): SPFP-Nucleosome
      CPU: Dual E5-2687W @ 3.10GHz, 64GB System Memory, CentOS 6.2. GPU systems: Single Tesla K20X or Single Tesla K40
  11. GPU Boost: Up to 25% Extra Performance on Applications
      Use Power Headroom to Run at Higher Clocks
      Bar chart, Tesla K40 (base) vs. Tesla K40 with GPU Boost
      Benchmarks: AMBER SPFP-TRPCage, LAMMPS-EAM, NAMD 2.9-APOA1
      Data labels: 25% Faster, 20% Faster, 14% Faster, 17% Faster, 13% Faster, 11% Faster
  12. ANNOUNCING Unified Memory (CUDA 6)
  13. Unified Memory: Dramatically Lower Developer Effort
      * Developer View Today: System Memory, GPU Memory
      * Developer View With Unified Memory: Unified Memory
  14. Super Simplified Memory Management Code

      CPU Code:

          void sortfile(FILE *fp, int N) {
            char *data;
            data = (char *)malloc(N);
            fread(data, 1, N, fp);
            qsort(data, N, 1, compare);
            use_data(data);
            free(data);
          }

      CUDA 6 Code with Unified Memory:

          void sortfile(FILE *fp, int N) {
            char *data;
            cudaMallocManaged(&data, N);
            fread(data, 1, N, fp);
            qsort<<<...>>>(data,N,1,compare);
            cudaDeviceSynchronize();
            use_data(data);
            cudaFree(data);
          }
  15. SC13 News
      1. IBM Taps GPU Accelerators
      2. New Product Announcements
      3. New Supercomputer Announcements
  16. Fastest Supercomputer In Europe: Piz Daint
      * 6.27 PetaFLOPS (80% Linpack Efficiency)
      * Greenest Petascale System: 3110 MFLOPS/W (#2: JUQUEEN: 2176 MFLOPS/W)
      * Production-Grade Weather Forecasts: COSMO
      * 7 National Weather Agencies: Germany | Greece | Italy | Poland | Russia | Romania | Switzerland
  17. Greenest Supercomputer in the World: Tokyo Tech KFC System
      * 4000+ MFLOPS per Watt
      * 25% Higher than #1 Green500 System
      * 160 Tesla K20X GPUs
      * Oil Immersion Technology
      Current Green500 #1: CINECA Eurora System, Italy, 3208 MF/W
  18. ANSYS Fluent Doubles Performance with GPUs
      * Automobile Drag Simulation Throughput (bar chart, Number of Jobs per Day; CPU vs. K40): 2x, 90% Faster
      * Better Insight for Low Drag Design: 2% Less Drag, 1.5B Gal. of Fuel Saved/Year
      Configuration: 2 x E5-2680 CPUs, 8 cores used; 2 Tesla K40s. Sedan Geometry, 3.6M mixed cells. Steady, turbulent, external aerodynamics. Coupled PBNS, DP Solver
  19. SUPERCOMPUTING 2013 PRESS DECK
      Sumit Gupta | General Manager, Tesla Accelerated Computing
  20. Additional Information
  21. Tesla K40: 20-40% Faster than K20X on Applications
      Bar chart, K20X vs. K40 @ base vs. K40 @ boost
      Benchmarks: ANSYS 14 (SMP-V14sp-4), LAMMPS (EAM), NAMD 2.9 (APOA1), AMBER (SPFP-Nucleosome), LSMS (Fe32), QMCPACK (3x3x1), CUBLAS
      Data labels: 1.4x, 1.3x, 1.2x, 1.3x, 1.3x, 1.3x, 1.3x
  22. First Tesla K40 Customers
      * CSC Finland
      * Texas Advanced Computing Center
      * CEA France
      * Swinburne Australia
  23. Tesla K40 OEM Partners
  24.                          K20X            K40
      Peak Single Precision    3.93 TF         4.29 TF
      Peak SGEMM               2.95 TF         3.22 TF
      Peak Double Precision    1.31 TF         1.43 TF
      Peak DGEMM               1.22 TF         1.33 TF
      Memory size              6 GB            12 GB
      Memory BW (ECC off)      250 GB/s        288 GB/s
      Memory Clock             2.6 GHz         3.0 GHz
      PCIe Gen                 Gen 2           Gen 3
      # of Cores               2688            2880
      Core Clock               732 MHz         Base: 745 MHz; Boost Clocks: 810 & 875 MHz
      Total Board Power        235W            235W
      Form Factor              PCIe Passive    PCIe Passive, Active
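
Slide 7 announces OpenACC support in GCC but shows no code. As a rough illustration of what that support is meant to enable, here is a minimal sketch of an OpenACC-annotated loop in C. The `saxpy` function and its data clauses are illustrative assumptions, not taken from the deck. A compiler without OpenACC support ignores the pragma and runs the loop serially on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative kernel (not from the deck): y[i] = a * x[i] + y[i]. */
void saxpy(int n, float a, const float *x, float *y) {
    assert(n >= 0 && x != NULL && y != NULL);

    /* Ask an OpenACC compiler to parallelize the loop on an accelerator.
       copyin: x is only read on the device.
       copy:   y is sent to the device and copied back afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

In GCC releases that include OpenACC support, the directive is enabled with `-fopenacc`. Whether the loop is actually offloaded to a GPU depends on how that GCC build was configured.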

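
Slide 14 shows only the body of `sortfile`, so neither version compiles as printed. Below is a minimal sketch of the CPU version as complete C. The byte-wise `compare`, the two-argument `use_data`, the `last_buffer_sorted` flag, and the integer return value are all hypothetical additions for illustration. Comments mark the three places where the slide's CUDA 6 version differs; those CUDA lines are quoted from the slide and are not compiled here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical comparator: the slide passes `compare` to qsort but does not
   define it. This one orders bytes ascending. */
int compare(const void *a, const void *b) {
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Set by use_data(): 1 if the last buffer it saw was sorted, else 0. */
int last_buffer_sorted = 0;

/* Hypothetical stand-in for the slide's use_data(data). It takes the length
   as an extra argument and only checks that the bytes are in order. */
void use_data(const char *data, int n) {
    last_buffer_sorted = 1;
    for (int i = 1; i < n; ++i) {
        if ((unsigned char)data[i - 1] > (unsigned char)data[i]) {
            last_buffer_sorted = 0;
        }
    }
}

/* CPU version of sortfile from slide 14, with error handling added.
   Returns 0 on success, -1 if the allocation or the read fails. */
int sortfile(FILE *fp, int N) {
    assert(fp != NULL && N > 0);

    /* Slide's CUDA 6 version: cudaMallocManaged(&data, N); */
    char *data = (char *)malloc((size_t)N);
    if (data == NULL) {
        return -1;
    }

    /* Identical in both versions: the CPU fills the buffer directly. */
    if (fread(data, 1, (size_t)N, fp) != (size_t)N) {
        free(data);
        return -1;
    }

    /* Slide's CUDA 6 version: qsort<<<...>>>(data,N,1,compare);
       followed by cudaDeviceSynchronize(); */
    qsort(data, (size_t)N, 1, compare);

    use_data(data, N);

    /* Slide's CUDA 6 version: cudaFree(data); */
    free(data);
    return 0;
}
```

The point the slide makes survives in the comments: with Unified Memory the same pointer is used by `fread`, the GPU sort, and `use_data`, so only the allocation, the launch plus synchronization, and the free change.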
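
The K20X and K40 figures on slide 24 can be cross-checked with a little arithmetic. The sketch below assumes, and this is not stated in the deck, that each core retires one fused multiply-add (2 FLOPs) per clock in single precision and that double precision runs at one third of that rate. The function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which `newer` exceeds `older`. */
double percent_gain(double older, double newer) {
    assert(older > 0.0);
    return (newer / older - 1.0) * 100.0;
}

/* Theoretical single-precision peak in TFLOPS.
   ASSUMPTION (not from the deck): 2 FLOPs per core per clock. */
double peak_sp_tflops(int cores, double clock_mhz) {
    return cores * 2.0 * clock_mhz * 1e6 / 1e12;
}

/* Theoretical double-precision peak in TFLOPS.
   ASSUMPTION (not from the deck): one third of the single-precision rate. */
double peak_dp_tflops(int cores, double clock_mhz) {
    return peak_sp_tflops(cores, clock_mhz) / 3.0;
}

/* Print figures derived from the K20X and K40 columns of slide 24. */
void print_k40_summary(void) {
    printf("K40 peak SP at base clock:  %.2f TF\n", peak_sp_tflops(2880, 745.0));
    printf("K40 peak DP at base clock:  %.2f TF\n", peak_dp_tflops(2880, 745.0));
    printf("K20X peak DP:               %.2f TF\n", peak_dp_tflops(2688, 732.0));
    printf("Peak DP gain, K40 vs K20X:  %.1f%%\n", percent_gain(1.31, 1.43));
    printf("Memory bandwidth gain:      %.1f%%\n", percent_gain(250.0, 288.0));
    printf("Memory size gain:           %.1f%%\n", percent_gain(6.0, 12.0));
    printf("Top boost clock over base:  %.1f%%\n", percent_gain(745.0, 875.0));
}
```

Under those assumptions the computed peaks land on the table's 4.29 TF and 1.43 TF for K40 and 1.31 TF for K20X. The same helper applied to slide 17, `percent_gain(3208, 4000)`, gives roughly 24.7, consistent with the "25% Higher than #1 Green500 System" claim.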