
The Potential of GPU-driven High Performance Data Analytics in Spark

Spark Summit EU talk by Andy Steinbach

Published in: Data & Analytics

  1. Spark Summit Brussels, October 26, 2016. THE POTENTIAL OF GPU-DRIVEN HIGH PERFORMANCE DATA ANALYTICS IN SPARK. Andy Steinbach, Sr. Director, NVIDIA
  2. HOW TO SCALE AI & DATA ANALYTICS? Scale up: compute intensive. Scale out: data intensive. We are headed here.
  3. HIGH PERFORMANCE DATA ANALYTICS. Scale out; scale up. Spark + TensorFlow + GPU. Spark + AI framework + GPU. Machine Learning & DB Query. Deep Learning.
  4. DEEP LEARNING - A NEW COMPUTING MODEL. “Training”; ImageNet; “Inference”.
  5. BEYOND JUST COMPUTER VISION
  6. A REVOLUTION IN MEDICINE. Labelled training examples; trained model; inference applied to unseen inputs.
  7. A REVOLUTION IN ROBOTICS
  8. GPU-POWERED SELF-DRIVING CARS
  9. SUPERHUMAN PERFORMANCE
  10. WHAT DOES DEEP LEARNING LEARN? Input; feature representation; learning algorithm.
  11. PREDICTIVE ANALYTICS IS NEXT
  12. PREDICTIVE ANALYTICS IS NEXT. "10,000s of features make up today's fraudulent behavior. AI can detect patterns faster and more accurately than humans." -Hui Wang, Senior Director of Global Risk Sciences, PayPal
  13. THE NEED TO SCALE UP & OUT IS HUGE. INCREASING DATA VARIETY: Search Marketing, Behavioral Targeting, Dynamic Funnels, User Generated Content, Mobile Web, SMS/MMS, Sentiment, HD Video, Speech To Text, Product/Service Logs, Social Network, Business Data Feeds, User Click Stream, Sensors, Infotainment Systems, Wearable Devices, Cyber Security Logs, Connected Vehicles, Machine Data, IoT Data, Dynamic Pricing, Payment Record, Purchase Detail, Purchase Record, Support Contacts, Segmentation, Offer Details, Web Logs, Offer History, A/B Testing, Streaming Video, Natural Language Processing. Volume labels: GIGABYTES, TERABYTES, PETABYTES, EXABYTES, ZETTABYTES. Category labels: BUSINESS PROCESS, WEB, DIGITAL, AI.
  14. DGX-1 DEEP LEARNING SUPERCOMPUTER
  15. PERFORMANCE GAP INCREASES. [Two charts, NVIDIA GPU (M1060, M2090, K20, K80, Pascal) vs. x86 CPU: Peak Memory Bandwidth in GB/s, 2008 to 2016; Peak Double Precision FLOPS, 2008 to 2016.]
  16. HOW TO SCALE DATA ANALYTICS? In a nutshell: a complex numerical function. In practice, compute: [formula not recovered] with: [formula not recovered]. [Bar chart: Runtime (sec), axis 0 to 180, for Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU.]
