Mike Wendt
@mike_wendt
FLEXIBLE AND FAST STORAGE FOR
DEEP LEARNING WITH ALLUXIO
Yupeng Fu
2
ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA and Fast Storage from Alluxio
Apache Arrow
+
3
DATA PROCESSING EVOLUTION
Faster Data Access Less Data Movement
Store
Read
HDFS
Write
HDFS
Read
HDFS
Write
HDFS
Read
Query ETL ML Train
Hadoop Processing, Reading from disk
4
DATA PROCESSING EVOLUTION
Faster Data Access Less Data Movement
HDFS
Read
HDFS
Write
HDFS
Read
HDFS
Write
HDFS
Read
Query ETL ML Train
HDFS
Read
Query ETL ML Train
Hadoop Processing, Reading from disk
25-100x
Improvement
Less code
Language flexible
Primarily In-Memory
Storage bottleneck
Spark In-Memory Processing
5
25-100x
Improvement
Less code
Language flexible
Primarily In-Memory
DATA PROCESSING EVOLUTION
Faster Data Access Less Data Movement
HDFS
Read
HDFS
Write
HDFS
Read
HDFS
Write
HDFS
Read
Query ETL ML Train
HDFS
Read
Query ETL ML Train
HDFS
Read
GPU
Read
Query
CPU
Write
GPU
Read
ETL
CPU
Write
GPU
Read
ML
Train
5-10x Improvement
More code
Language rigid
Substantially on GPU
Storage still a
bottleneck
GPU/Spark In-Memory Processing
Hadoop Processing, Reading from disk
Spark In-Memory Processing
6
WE CAN DO
BETTER!
7
ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA and Fast Storage from Alluxio
Apache Arrow
+
8
APP A
GPU-ACCELERATED ARCHITECTURE THEN
Too much data movement and too many different data formats
CPU GPU
APP B
Read DataH2O.ai
Anaconda Gunrock
Graphistry
BlazingDB MapD
Copy & Convert
Copy & Convert
Copy & Convert
Load Data
APP A GPU
Data
APP B
GPU
Data
9
GPU-ACCELERATED ARCHITECTURE NOW
Single data format and shared access to data on GPU
CPU GPU
GPU
MEM
Read DataH2O.ai
Anaconda Gunrock
Graphistry
BlazingDB MapD Load Data
Apache Arrow
Powered by:
GPU Data Frame
10
GRAPH
PROCESSING
ANALYTICS
GPU DATABASES
github.com/gpuopenanalytics
@gpuoai
Apache Arrow
Powered by:
11
GPU ACCELERATION ACROSS THE ECOSYSTEM
Apache Arrow
12
25-100x
Improvement
Less code
Language flexible
Primarily In-Memory
DATA PROCESSING EVOLUTION
Faster Data Access Less Data Movement
HDFS
Read
HDFS
Write
HDFS
Read
HDFS
Write
HDFS
Read
Query ETL ML Train
HDFS
Read
Query ETL ML Train
HDFS
Read
GPU
Read
Query
CPU
Write
GPU
Read
ETL
CPU
Write
GPU
Read
ML
Train
Alluxio
Read
Query ETL
ML
Train
5-10x Improvement
More code
Language rigid
Substantially on GPU
25-100x Improvement
Same code
Language flexible
Primarily on GPU
Alluxio Fast and
Flexible Storage
End to End GPU Processing (GoAi)
GPU/Spark In-Memory Processing
Hadoop Processing, Reading from disk
Spark In-Memory Processing
13
ACCELERATE THE ENTIRE ANALYTICS STACK
GPU-Acceleration by NVIDIA and Fast Storage from Alluxio
Apache Arrow
+
14
DATA & AI ECOSYSTEM EXPLODES
…
• Many Compute
Frameworks
• Many Storage Systems
• Most not co-located
…
15
Data & AI Ecosystem Issues
• Each app manages
multiple data sources
• Data source changes
require global updates
• Storage optimizations
requires app change
• Poor performance due
to lack of locality
…
…
16
Data & AI Ecosystem with Alluxio
• Apps only talk to
Alluxio
• Simple Add/Remove
• No App Changes
• Highest performance
in Memory
Java File API HDFS Interface
Amazon S3
Interface
REST Web Service
HDFS Interface
Amazon S3
Interface
Swift Interface NFS Interface
…
…
17
Storage Challenges for DL& ML
2 Data Freshness
• Cross-network movement is slow
• Copies create lag
• Data quality suffers with copies
4 Security & Governance
• Data security & governance is
increasingly complex
1 Speed & Complexity
• Integration and interoperability issues
(on prem, hybrid, cloud)
• Many departments & groups
3 Cost
• Cloud storage is cheap and reliable, but
slow
• Data duplication
17
Heavy integrations create painful organizational drag
18
Alluxio Design Principles
2 Data Sharing
• Don’t own the data
• Multiple apps sharing common data
• Data stored in multiple, hybrid systems
4 Enterprise Class
• Distributed architecture
• Commodity hardware
• Service-oriented
• High availability
• Security
1 Big Data & Machine Learning
• Interoperability with leading projects
• Large scale data sets
• High IO
3 High Speed Data Access
• Remote data
• Hot/warm/cold data
• Temporary data
• Read/write support
18
19
Alluxio FUSE
Deep Learning
Frameworks
Unified
Data
Storage
Systems
ALLUXIO FUSE
20
Filesystem in Userspace (FUSE)
Running file system code
in user space
21
Alluxio Innovation:
Unified Namespace
Enables effective data management across different Under Stores
Uses Mounting with Transparent Naming
22
Alluxio Innovation:
Unified Namespace
Create a catalog of available data sources for Data Scientists
/finance/customer-transactions/
/finance/vendor-transactions/
/operations/device-logs/
/operations/phone-call-recordings/
/operations/check-images/
/research/us-economic-data/
/research/intl-economic-data/
/marketing/advertising-dataset/
/marketing/marketing-funnel-dataset/
alluxio://
23
Alluxio Innovation:
Intelligent Cache
Local performance from remote data using native multi-tier storage
RAM
SSD
HDD
Hot Warm Cold
24
Deep Learning Input Pipeline
Deep Learning training involves three stages of utilizing different
resources:
• Data reads (I/O): e.g. choose and read image files from source.
• Data Preprocessing (CPU): e.g. decode image records into
images, preprocess, and organize into mini-batches.
• Modeling training (GPU): Calculate and update the parameters
in the multiple convolutional layers
25
Alluxio overcomes I/O bottleneck
26
Alluxio Architecture
Alluxio
Master
Zookeeper
Standby
Master
Alluxio
Worker
Alluxio
Worker
Under Store
Under Store
Alluxio
Client
Application
RAM / SSD / HDD
RAM / SSD / HDD
Control Path
Data Path
28
Read data in Alluxio, on same node as client
Alluxio
Worker
RAM / SSD / HDD
Memory Speed Read of Data
Application
Alluxio
Client
Alluxio
Master
29
Read data not in Alluxio + Caching
29
RAM / SSD / HDD
Network / Disk Speed Read of Data
Application
Alluxio
Client
Alluxio
Master
Alluxio
WorkerUnder Store
30
Read data in Alluxio, not on same node as
client + Caching
RAM / SSD / HDD
Network Speed Read of Data
Application
Alluxio
Client
Alluxio
Master
Alluxio
Worker
RAM / SSD / HDD
Alluxio
Worker
31
Write data only to Alluxio on same
node as client
Alluxio
Worker
RAM / SSD / HDD
Memory SpeedWrite of Data
Application
Alluxio
Client
Alluxio
Master
32
Write data to Alluxio and Under Store
synchronously
RAM / SSD / HDD
Network / Disk SpeedWrite of Data
Application
Alluxio
Client
Alluxio
Master
Alluxio
Worker
Under Store
33
100+ known production deployments
AND MORE!
Twitter.com/alluxio
Linkedin.com/alluxio
Website
www.alluxio.com
E-mail
info@alluxio.com
@
Social Media
á
™
34
Thank you!
Yupeng Fu
yupeng@alluxio.com
Github: yupeng9
Twitter.com/alluxio
Linkedin.com/alluxio
Website
www.alluxio.org
E-mail
info@alluxio.com
@
Social Media
á
™

Flexible and Fast Storage for Deep Learning with Alluxio

  • 1.
    Mike Wendt @mike_wendt FLEXIBLE ANDFAST STORAGE FOR DEEP LEARNING WITH ALLUXIO Yupeng Fu
  • 2.
    2 ACCELERATE THE DEEPLEARNING STACK GPU-Acceleration by NVIDIA and Fast Storage from Alluxio Apache Arrow +
  • 3.
    3 DATA PROCESSING EVOLUTION FasterData Access Less Data Movement Store Read HDFS Write HDFS Read HDFS Write HDFS Read Query ETL ML Train Hadoop Processing, Reading from disk
  • 4.
    4 DATA PROCESSING EVOLUTION FasterData Access Less Data Movement HDFS Read HDFS Write HDFS Read HDFS Write HDFS Read Query ETL ML Train HDFS Read Query ETL ML Train Hadoop Processing, Reading from disk 25-100x Improvement Less code Language flexible Primarily In-Memory Storage bottleneck Spark In-Memory Processing
  • 5.
    5 25-100x Improvement Less code Language flexible PrimarilyIn-Memory DATA PROCESSING EVOLUTION Faster Data Access Less Data Movement HDFS Read HDFS Write HDFS Read HDFS Write HDFS Read Query ETL ML Train HDFS Read Query ETL ML Train HDFS Read GPU Read Query CPU Write GPU Read ETL CPU Write GPU Read ML Train 5-10x Improvement More code Language rigid Substantially on GPU Storage still a bottleneck GPU/Spark In-Memory Processing Hadoop Processing, Reading from disk Spark In-Memory Processing
  • 6.
  • 7.
    7 ACCELERATE THE DEEPLEARNING STACK GPU-Acceleration by NVIDIA and Fast Storage from Alluxio Apache Arrow +
  • 8.
    8 APP A GPU-ACCELERATED ARCHITECTURETHEN Too much data movement and too many different data formats CPU GPU APP B Read DataH2O.ai Anaconda Gunrock Graphistry BlazingDB MapD Copy & Convert Copy & Convert Copy & Convert Load Data APP A GPU Data APP B GPU Data
  • 9.
    9 GPU-ACCELERATED ARCHITECTURE NOW Singledata format and shared access to data on GPU CPU GPU GPU MEM Read DataH2O.ai Anaconda Gunrock Graphistry BlazingDB MapD Load Data Apache Arrow Powered by: GPU Data Frame
  • 10.
  • 11.
    11 GPU ACCELERATION ACROSSTHE ECOSYSTEM Apache Arrow
  • 12.
    12 25-100x Improvement Less code Language flexible PrimarilyIn-Memory DATA PROCESSING EVOLUTION Faster Data Access Less Data Movement HDFS Read HDFS Write HDFS Read HDFS Write HDFS Read Query ETL ML Train HDFS Read Query ETL ML Train HDFS Read GPU Read Query CPU Write GPU Read ETL CPU Write GPU Read ML Train Alluxio Read Query ETL ML Train 5-10x Improvement More code Language rigid Substantially on GPU 25-100x Improvement Same code Language flexible Primarily on GPU Alluxio Fast and Flexible Storage End to End GPU Processing (GoAi) GPU/Spark In-Memory Processing Hadoop Processing, Reading from disk Spark In-Memory Processing
  • 13.
    13 ACCELERATE THE ENTIREANALYTICS STACK GPU-Acceleration by NVIDIA and Fast Storage from Alluxio Apache Arrow +
  • 14.
    14 DATA & AIECOSYSTEM EXPLODES … • Many Compute Frameworks • Many Storage Systems • Most not co-located …
  • 15.
    15 Data & AIEcosystem Issues • Each app manages multiple data sources • Data source changes require global updates • Storage optimizations requires app change • Poor performance due to lack of locality … …
  • 16.
    16 Data & AIEcosystem with Alluxio • Apps only talk to Alluxio • Simple Add/Remove • No App Changes • Highest performance in Memory Java File API HDFS Interface Amazon S3 Interface REST Web Service HDFS Interface Amazon S3 Interface Swift Interface NFS Interface … …
  • 17.
    17 Storage Challenges forDL& ML 2 Data Freshness • Cross-network movement is slow • Copies create lag • Data quality suffers with copies 4 Security & Governance • Data security & governance is increasingly complex 1 Speed & Complexity • Integration and interoperability issues (on prem, hybrid, cloud) • Many departments & groups 3 Cost • Cloud storage is cheap and reliable, but slow • Data duplication 17 Heavy integrations create painful organizational drag
  • 18.
    18 Alluxio Design Principles 2Data Sharing • Don’t own the data • Multiple apps sharing common data • Data stored in multiple, hybrid systems 4 Enterprise Class • Distributed architecture • Commodity hardware • Service-oriented • High availability • Security 1 Big Data & Machine Learning • Interoperability with leading projects • Large scale data sets • High IO 3 High Speed Data Access • Remote data • Hot/warm/cold data • Temporary data • Read/write support 18
  • 19.
  • 20.
    20 Filesystem in Userspace(FUSE) Running file system code in user space
  • 21.
    21 Alluxio Innovation: Unified Namespace Enableseffective data management across different Under Stores Uses Mounting with Transparent Naming
  • 22.
    22 Alluxio Innovation: Unified Namespace Createa catalog of available data sources for Data Scientists /finance/customer-transactions/ /finance/vendor-transactions/ /operations/device-logs/ /operations/phone-call-recordings/ /operations/check-images/ /research/us-economic-data/ /research/intl-economic-data/ /marketing/advertising-dataset/ /marketing/marketing-funnel-dataset/ alluxio://
  • 23.
    23 Alluxio Innovation: Intelligent Cache Localperformance from remote data using native multi-tier storage RAM SSD HDD Hot Warm Cold
  • 24.
    24 Deep Learning InputPipeline Deep Learning training involves three stages of utilizing different resources: • Data reads (I/O): e.g. choose and read image files from source. • Data Preprocessing (CPU): e.g. decode image records into images, preprocess, and organize into mini-batches. • Modeling training (GPU): Calculate and update the parameters in the multiple convolutional layers
  • 25.
  • 26.
    26 Alluxio Architecture Alluxio Master Zookeeper Standby Master Alluxio Worker Alluxio Worker Under Store UnderStore Alluxio Client Application RAM / SSD / HDD RAM / SSD / HDD Control Path Data Path
  • 27.
    28 Read data inAlluxio, on same node as client Alluxio Worker RAM / SSD / HDD Memory Speed Read of Data Application Alluxio Client Alluxio Master
  • 28.
    29 Read data notin Alluxio + Caching 29 RAM / SSD / HDD Network / Disk Speed Read of Data Application Alluxio Client Alluxio Master Alluxio WorkerUnder Store
  • 29.
    30 Read data inAlluxio, not on same node as client + Caching RAM / SSD / HDD Network Speed Read of Data Application Alluxio Client Alluxio Master Alluxio Worker RAM / SSD / HDD Alluxio Worker
  • 30.
    31 Write data onlyto Alluxio on same node as client Alluxio Worker RAM / SSD / HDD Memory SpeedWrite of Data Application Alluxio Client Alluxio Master
  • 31.
    32 Write data toAlluxio and Under Store synchronously RAM / SSD / HDD Network / Disk SpeedWrite of Data Application Alluxio Client Alluxio Master Alluxio Worker Under Store
  • 32.
    33 100+ known productiondeployments AND MORE!
  • 33.
    Twitter.com/alluxio Linkedin.com/alluxio Website www.alluxio.com E-mail info@alluxio.com @ Social Media á ™ 34 Thank you! YupengFu yupeng@alluxio.com Github: yupeng9 Twitter.com/alluxio Linkedin.com/alluxio Website www.alluxio.org E-mail info@alluxio.com @ Social Media á ™