GPUs in Big Data - StampedeCon 2014

At StampedeCon 2014, John Tran of NVIDIA presented "GPUs in Big Data." Modern graphics processing units (GPUs) are massively parallel general-purpose processors that are taking Big Data by storm. In terms of power efficiency, compute density, and scalability, it is clear now that commodity GPUs are the future of parallel computing. In this talk, we will cover diverse examples of how GPUs are revolutionizing Big Data in fields such as machine learning, databases, genomics, and other computational sciences.

Transcript

  • 1. GPUs in Big Data. John Tran | StampedeCon 2014, May 29, 2014, St. Louis, MO
  • 2. “If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?” —Seymour Cray
  • 3. Example CPU: Xeon E5-2687W. 2.27 B transistors; 8 cores, 16 threads @ 3.1 GHz; 0.35 SP TFLOPS; 0.17 DP TFLOPS; up to 256 GB DDR3 @ 1600 MHz; 51.2 GB/s memory BW; 150 W; 20 MB L3 cache; strong single-thread performance via branch prediction and out-of-order execution.
  • 4. Example GPU: Tesla K40. 7.1 B transistors; 2880 cores, 30,720 threads @ 745 MHz; 4.29 SP TFLOPS; 1.43 DP TFLOPS; 12 GB GDDR5 @ 3 GHz; 288 GB/s memory BW; 235 W; PCIe Gen3 x16 host link (12 GB/s).
  • 5. Math and memory peak throughput (chart). Xeon E5-2687W: 0.35 SP TFLOPS, 0.17 DP TFLOPS, 51.2 GB/s memory BW. Tesla K40: 4.29 SP TFLOPS, 1.43 DP TFLOPS, 288 GB/s memory BW.
  • 6. The Chickens Are Winning: parallel computing is no longer "the future"; if you are not parallel, you are already behind. GPUs win on performance, power, and cost, and each of those wins translates into $$.
  • 7. Where did these GPUs come from?
  • 8. OK, but what about computing?
  • 9. All Computing is Parallel Computing
  • 10. Parallel Computing CPU GPU
  • 11. The Basic Idea – Accelerated Computing. The compute-intensive functions of the application code are offloaded to the GPU through CUDA, while the rest of the sequential code continues to run on the CPU.
  • 12. Quick CUDA C example (http://developer.nvidia.com/cuda-toolkit)

    Standard C code:

        void saxpy(int n, float a, float *x, float *y)
        {
            for (int i = 0; i < n; ++i)
                y[i] = a*x[i] + y[i];
        }

        int N = 1<<20;
        // Perform SAXPY on 1M elements
        saxpy(N, 2.0f, x, y);

    Parallel CUDA C code:

        __global__ void saxpy(int n, float a, float *x, float *y)
        {
            int i = blockIdx.x*blockDim.x + threadIdx.x;
            if (i < n)
                y[i] = a*x[i] + y[i];
        }

        int N = 1<<20;
        // Copy the inputs into device buffers d_x and d_y (sizes are in bytes)
        cudaMemcpy(d_x, x, N*sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_y, y, N*sizeof(float), cudaMemcpyHostToDevice);
        // Perform SAXPY on 1M elements: 4096 blocks of 256 threads
        saxpy<<<4096,256>>>(N, 2.0f, d_x, d_y);
        // Copy the result back to the host
        cudaMemcpy(y, d_y, N*sizeof(float), cudaMemcpyDeviceToHost);
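    The slide snippet elides the host-side setup (allocating x and y, cudaMalloc of the device buffers, a correctness check, cleanup). A minimal, self-contained sketch of the same SAXPY example, assuming only the standard CUDA runtime API (the initial values and the error check are illustrative, not from the slide):

        #include <stdio.h>
        #include <stdlib.h>
        #include <math.h>
        #include <cuda_runtime.h>

        // Each thread handles one element of y = a*x + y.
        __global__ void saxpy(int n, float a, const float *x, float *y)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n)
                y[i] = a * x[i] + y[i];
        }

        int main(void)
        {
            int N = 1 << 20;                     // 1M elements
            size_t bytes = N * sizeof(float);

            // Host buffers with illustrative initial values
            float *x = (float *)malloc(bytes);
            float *y = (float *)malloc(bytes);
            for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

            // Device buffers
            float *d_x, *d_y;
            cudaMalloc(&d_x, bytes);
            cudaMalloc(&d_y, bytes);

            // Copy the inputs to the device
            cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
            cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

            // Launch enough 256-thread blocks to cover all N elements
            int threads = 256;
            int blocks  = (N + threads - 1) / threads;
            saxpy<<<blocks, threads>>>(N, 2.0f, d_x, d_y);

            // Copy the result back and spot-check it (expect 2*1 + 2 = 4)
            cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost);
            float maxError = 0.0f;
            for (int i = 0; i < N; ++i)
                maxError = fmaxf(maxError, fabsf(y[i] - 4.0f));
            printf("Max error: %f\n", maxError);

            cudaFree(d_x); cudaFree(d_y);
            free(x); free(y);
            return 0;
        }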
  • 13. How else can you program it? Libraries: Thrust, cuBLAS, cuSPARSE, cuFFT, NPP, cuRAND. Directives: OpenACC. Languages: CUDA C, CUDA C++, Thrust, Python, Fortran, a C++ standard proposal, MATLAB, GPU.NET. Learn: "Get CUDA," Udacity, Coursera. (A Thrust sketch of the same SAXPY follows below.)
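    As one illustration of the library route, Thrust (which ships with the CUDA Toolkit) can express the same SAXPY without writing an explicit kernel. A minimal sketch, assuming the Thrust headers bundled with the toolkit (the fill values are illustrative):

        #include <thrust/device_vector.h>
        #include <thrust/transform.h>
        #include <thrust/functional.h>
        #include <iostream>

        int main(void)
        {
            const int   N = 1 << 20;
            const float a = 2.0f;

            // Constructing device_vectors allocates and fills GPU memory.
            thrust::device_vector<float> x(N, 1.0f);
            thrust::device_vector<float> y(N, 2.0f);

            // y = a*x + y, written with placeholders; Thrust generates
            // and launches the kernel for us.
            using namespace thrust::placeholders;
            thrust::transform(x.begin(), x.end(), y.begin(), y.begin(),
                              a * _1 + _2);

            // Reading y[0] copies one element back to the host.
            std::cout << "y[0] = " << y[0] << std::endl;  // expect 4
            return 0;
        }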
  • 14. How does this matter to Big Data?
  • 15. Shazam: 90 M monthly active users; 17 M tracks tagged per day; 27 M tracks in the database. "GPUs enable us to handle our tremendous processing needs at a substantial cost savings, delivering twice the performance per dollar compared to a CPU-based system." -Jason Titus, CTO, Shazam
  • 16. Deep Neural Networks for image classification
  • 17. Google datacenter: 1,000 CPU servers, 600 kW, $5,000,000. Stanford AI Lab: 3 GPU-accelerated servers, 3.6 kW, $21,000. (Deep learning with COTS HPC systems, A. Coates, B. Huval, T. Wang, D. Wu, A. Ng, B. Catanzaro, ICML 2013)
  • 18. Speech Recognition
  • 19. The Data-Scope at JHU: 5 PB of science data (in 2010). "The Data-Scope will allow us to mine out relationships among data that already exist but that we can’t yet handle and to sift discoveries from what seems like an overwhelming flow of information. New discoveries will definitely emerge this way. There are relationships and patterns that we just cannot fathom buried in that onslaught of data. Data-Scope will tease these out." – Alex Szalay, JHU
  • 20. HIV Capsid
  • 21. Beating-heart surgery: a patient stands to lose one point of IQ for every 10 minutes the heart is stopped, yet only ~2% of heart surgeons will operate on a beating heart. GPUs enable real-time motion compensation that virtually stops the beating heart for the surgeon. (Courtesy Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier)
  • 22. NVBIO: NVIDIA's open-source, GPU-accelerated library for high-throughput genomics and sequence analysis.
  • 23. Final Thoughts: parallel computing is here; re-think parallel or get left behind. Scale up before scaling out: a single GPU delivers several orders of magnitude more parallelism, so ask whether you really need a cluster. GPUs are the most efficient solution for parallel problems, in both perf/$ and perf/Watt.
  • 24. All Computing is Parallel Computing
