Discusses how GOAI (the GPU Open Analytics Initiative) makes it possible to use GPU acceleration in Python without having to learn CUDA.
Updated slides here: https://www.slideshare.net/MikeWendt/updated-171127-pydata-ny-lightning-talk-gpu-acceleration-with-goai-in-python
4. DATA PROCESSING EVOLUTION
Faster Data Access, Less Data Movement
Hadoop Processing, Reading from disk:
HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train
5. DATA PROCESSING EVOLUTION
Faster Data Access, Less Data Movement
Hadoop Processing, Reading from disk:
HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train
Spark In-Memory Processing:
HDFS Read -> Query -> ETL -> ML Train
25-100x Improvement, Less code, Language flexible, Primarily In-Memory
7. DATA PROCESSING EVOLUTION
Faster Data Access, Less Data Movement
Hadoop Processing, Reading from disk:
HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train
Spark In-Memory Processing:
HDFS Read -> Query -> ETL -> ML Train
25-100x Improvement, Less code, Language flexible, Primarily In-Memory
GPU/Spark In-Memory Processing:
HDFS Read -> GPU Read -> Query -> CPU Write -> GPU Read -> ETL -> CPU Write -> GPU Read -> ML Train
5-10x Improvement, More code, Language rigid, Substantially on GPU
10. GPU-ACCELERATED ARCHITECTURE THEN
Too much data movement and too many different data formats
[Diagram: CPU and GPU lanes; Read Data and Load Data steps; APP A and APP B, each with its own GPU Data; three Copy & Convert steps.]
H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD, Kinetica
11. IDEAL GPU-ACCELERATED ARCHITECTURE
Too much data movement and too many different data formats
[Diagram: CPU and GPU lanes; Read Data and Load Data steps; APP A and APP B, each with its own GPU Data; three Copy & Convert steps.]
H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD, Kinetica
12. GPU-ACCELERATED ARCHITECTURE NOW
Single data format and shared access to data on GPU
[Diagram: CPU and GPU lanes; Read Data and Load Data steps feed one Shared GPU Data Frame in GPU MEM.]
H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD, Kinetica
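The difference between "Copy & Convert" and a shared data frame can be sketched on the CPU with Python's standard library. This is only an analogy, not the GOAI or pyGDF API: `copy_and_convert`, `share`, and the two "apps" are hypothetical names, and an `array` with `memoryview` stands in for a column resident in GPU memory.

```python
from array import array


def copy_and_convert(column):
    """'Then' style hand-off: build a new copy in a different container."""
    return [float(x) for x in column]


def share(column):
    """'Now' style hand-off: expose the same memory through a view."""
    return memoryview(column)


# One column of data, loaded once (a CPU stand-in for GPU-resident data).
column = array("d", [1.0, 2.0, 3.0, 4.0])

# Then: App A and App B each hold their own converted copy.
copy_a = copy_and_convert(column)
copy_b = copy_and_convert(column)
copy_a[0] = 100.0  # App A's update never reaches App B
copies_agree = copy_a[0] == copy_b[0]

# Now: App A and App B both view the single shared column.
view_a = share(column)
view_b = share(column)
view_a[0] = 100.0  # App A's update is visible to App B without a copy
views_agree = view_b[0] == 100.0 and column[0] == 100.0

print("copies agree:", copies_agree)
print("views agree:", views_agree)
```

With copies, every hand-off costs a pass over the data and the apps can drift apart; with a shared view, the hand-off is a reference to memory that already exists.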
13. GPU OPEN ANALYTICS INITIATIVE
github.com/gpuopenanalytics
GPU Data Frame (GDF)
Ingest/Parse, Exploratory Analysis, Feature Engineering, ML/DL Algorithms, Grid Search, Scoring, Model Export
@gpuoai
20. DATA PROCESSING EVOLUTION
Faster Data Access, Less Data Movement
Hadoop Processing, Reading from disk:
HDFS Read -> Query -> HDFS Write -> HDFS Read -> ETL -> HDFS Write -> HDFS Read -> ML Train
Spark In-Memory Processing:
HDFS Read -> Query -> ETL -> ML Train
25-100x Improvement, Less code, Language flexible, Primarily In-Memory
GPU/Spark In-Memory Processing:
HDFS Read -> GPU Read -> Query -> CPU Write -> GPU Read -> ETL -> CPU Write -> GPU Read -> ML Train
5-10x Improvement, More code, Language rigid, Substantially on GPU
End to End GPU Processing (GOAI):
Arrow Read -> Query -> ETL -> ML Train
25-100x Improvement, Same code, Language flexible, Primarily on GPU
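The "Less Data Movement" claim can be made concrete by counting the non-compute steps in each of the four pipelines on this slide. This is a rough sketch: the step order is an assumption reconstructed from the diagram labels, and `PIPELINES`, `COMPUTE`, and `movement_steps` are hypothetical names used only for illustration.

```python
# Each pipeline as an ordered list of the step labels shown on the slide.
PIPELINES = {
    "Hadoop Processing, Reading from disk": [
        "HDFS Read", "Query", "HDFS Write", "HDFS Read", "ETL",
        "HDFS Write", "HDFS Read", "ML Train",
    ],
    "Spark In-Memory Processing": [
        "HDFS Read", "Query", "ETL", "ML Train",
    ],
    "GPU/Spark In-Memory Processing": [
        "HDFS Read", "GPU Read", "Query", "CPU Write", "GPU Read", "ETL",
        "CPU Write", "GPU Read", "ML Train",
    ],
    "End to End GPU Processing (GOAI)": [
        "Arrow Read", "Query", "ETL", "ML Train",
    ],
}

# Steps that do useful work; everything else only moves data around.
COMPUTE = {"Query", "ETL", "ML Train"}


def movement_steps(steps):
    """Count the steps that read, write, or transfer data."""
    return sum(1 for step in steps if step not in COMPUTE)


counts = {name: movement_steps(steps) for name, steps in PIPELINES.items()}
for name, n in counts.items():
    print(f"{n} data-movement steps: {name}")
```

Under this reading, the GPU/Spark pipeline moves data more often than any other, which is the overhead the end-to-end GOAI pipeline removes by keeping data on the GPU after a single read.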
21. LET’S COLLABORATE AND SHARE DATA ON GPUS!
@gpuoai
github.com/gpuopenanalytics
Docker Demos with Jupyter Notebooks
pyGDF Library
gpuopenanalytics.com
Google Groups
Public Slack
Wiki