SlideShare a Scribd company logo
1 of 21
Download to read offline
Which Is Deeper
Comparison of Deep Learning
Frameworks Atop Spark
Zhe Dong, Dr. Yu Cao
EMC Corporation
Outline
• Motivation
• Theoretical Principle
• State-of-the-Art
• Evaluation Criteria
• Evaluation Results
• Summary
• Conclusion
2
Deep Learning on Spark Motivation
• Single-machine DL
• Low efficiency (in hours to even days)
• Limited DNN model capability (hard to
support billions of parameters)
• Dedicated deep learning cluster
• Massive data movement
• High maintenance cost
• Spark+Deep Learning = Truly All-in-One
3
Theoretical Principle
• Large Scale Distributed Deep Networks,Jeffrey Dean,2012
• Model parallelism
• Data parallelism
https://papers.nips.cc/paper/4687-large-scale-
distributed-deep-networks.pdf 4
Data Parallelism for distributed SGD
• Model is replicated on worker nodes
• Two repeating steps
– Train each model replica with mini-batches
– Synchronize model parameters across cluster
• Specific implementations can be different
– How parameters are combined
– Synchronization (strong or weak)
– Parameter server (centralized or not)
5
DownpourSGD Client Pseudo code
http://www.cs.toronto.edu/~ranzato/publications/DistBeliefNIPS2012_withAppendix
.pdf
6
DL on Spark – State-of-the-Art
l AMPLab SparkNet
l Yahoo! CaffeOnSpark
l Arimo Tensorflow On Spark
l Skymind DeepLearning4J
l DeepDist
l H2O Spark
7
Evaluation Criteria
Evaluation
Criteria
Dimensions For Example
Ease of Getting
Started
Documentation Are there detailed, well-organized, up-to-date documents?
Installation How automatic it is?
Built-in Examples Examples available for quick warming up?
Ease of Use Interface Programming language support
Model Encapsulation Model/Layer/Node
Functionality Built-in Models Which NN models have been implemented?
Parallelism Model parallelism or data parallelism
Performance Performance MNIST benchmark results
Status Quo Community Vitality Github project statistics
Enterprise Support Contributions from organizations? 8
SparkNet
• Started by AMPLab from 2015
• Wrapper of Caffe and Tensorflow
• Centralized parameter server
• Strong SGD synchronization
• Differentiating feature: A fixed number (τ) of iterations (mini-
batch) on its subset of data
http://arxiv.org/pdf/1511.06051v4 9
AMPLab SparkNet - EvaluationEvaluation
Criteria
Dimensions SparkNet Score
Ease of Getting
Started
Documentation Paper; No Blog; README.md in Github
Installation No Installation; Have to copy to each worker node
Built-in Examples Cifar10/MNIST/ImageNet
Ease of Use Interface Java/Scala
Model
Encapsulation
Model/Layer
Functionality Built-in Models Tensorflow and Caffe
Parallelism Data Parallelism
Performance Performance MNIST
Status Quo Community
Vitality
Enterprise
Support
AMPLab
1 2 3 4
Iterations 1000 2000 5000 10000
Time (seconds) 2130 4218 10471 21003
Accuracy 94.13% 94.26% 94.01% 94.22%
10
Deeplearning4J
• Started by Skymind from 2014
• An open-source, distributed deep-learning project in Java
and Scala
• Parameter server: IterativeReduce
• Strong SGD synchronization
http://deeplearning4j.org/iterativereduce.html 11
Deeplearning4J - Evaluation
Evaluation
Criteria
Dimensions
DL4J Score
Ease of Getting
Started
Documentation Comprehensive but bad-organized
Installation No Installation
Built-in Examples Only For CDH5;MNIST/IRIS/GravesLSTM
Ease of Use Interface Java/Scala
Model
Encapsulation
Layer
Functionality Built-in Models CNN/RNN/LSTM/DBN/SAE
Parallelism Data Parallelism
Performance Performance MNIST
Status Quo Community
Vitality
Enterprise
Support
Skymind
1 2 3 4
Epochs 5 10 15 20
Time (seconds) 2098 4205 6303 8367
Accuracy 70% 79% 82.7% 84.6%
12
CaffeOnSpark
• Started by Yahoo! from 2015
• Peer-to-Peer parameter server
• Strong SGD synchronization
• Distinguishing feature: MPI Allreduce, RMDA, Infiniband
w1 w2 w3
w1 w2 w3w1 w2 w3
Worker 1 (Parameter Server for w1)
Worker 3
(Parameter Server for w3)
Worker 2
(Parameter Server for w2)
Weights
propagation
(Gradients are sent
in reverse
direction)
13
CaffeOnSpark - Evaluation
Evaluation
Criteria
Dimensions
CaffeOnSpark Score
Ease of Getting
Started
Documentation Blog; README.md in github
Installation Have to install all Caffe needed in each node
Built-in
Examples
Cifar10/MNIST
Ease of Use Interface Java/Scala, DataFrames
Model
Encapsulation
Model
Functionality Built-in Models Caffe
Parallelism Data Parallelism
Performance Performance MNIST
Status Quo Community
Vitality
Enterprise
Support
Yahoo!
1 2 3 4
Iterations 1000 2000 5000 10000
Time(seconds) 224 445 1113 2229
Accuracy 97% 99.4% 99.7% 99.6%
14
Tensorflow on Spark
• Started by Arimo from 2014
• A data-parallel Downpour SGD implementation on Spark
• Centralized parameter server
• Weak SGD synchronization
15
Tensorflow on Spark - Evaluation
Evaluation
Criteria
Dimensions
Tensorflow on Spark Score
Ease of Getting
Started
Documentation Blog; Spark Summit East 2016 slides and video
Installation Dependent on Tensorflow and tornado
Built-in
Examples
MNISTcnn/MNISTdnn/higgsdnn/moleculardnn
Ease of Use Interface Python
Model
Encapsulation
Model/Layer
Functionality Built-in Models Tensorflow
Parallelism Data Parallelism
Performance Performance MNIST
Status Quo Community
Vitality
Enterprise
Support
Arimo
1 2 3 4
Epochs 5 10 15 20
Time(seconds) 223 415 615 828
Accuracy 93% 94% 94.2% 95.4%
16
Benchmark – MNIST
20
30
40
50
60
70
80
90
100
0 2000 4000 6000 8000 10000
SparkNet
DL4J
CaffeOnSpark
Tensorflow on Spark
One master (16-Core,64GB)
Five slaves (8-Core,32GB)
Executor memory: 20GB
Batch size: 64
Accuracy
Time (seconds) 17
Benchmark – MNIST
20
30
40
50
60
70
80
90
100
0 200 400 600 800 1000
SparkNet
CaffeOnSpark
Tensorflow on Spark
One master (16-Core,64GB)
Five slaves (8-Core,32GB)
Executor memory: 20GB
Batch size: 64
Accuracy
Time (seconds) 18
Tensorflow On Spark - Evaluation
• Easy of Use
– Language:Java/Scala
– Interface Level: Model/High Level Network Structure
• Function
– Algorithm:Excellent
– Data Parallel
– Only Ethernet
• Easy to Get Start
– Document: Average
– Need to setup in each node
– Example: Average
• Performance
• Maturity
– Early Stage
– Community: bad
– Commercial or Big company support: AMPLab
Evaluation
Criteria
Dimensions SparkNet DL4J CaffeOnSpark Tensorflow on
Spark
Ease of
Getting Started
Documentation
Installation
Built-in Examples
Ease of Use Interface
Model Encapsulation
Functionality Built-in Models
Parallelism
Performance Performance
Status Quo Community Vitality
Enterprise Support
19
Conclusion
• Common issues
– Lack of model parallelism
– Potential network congestion
– Early-stage development
• Future evaluation work
– GPU integration
– SGD synchronization
– Scalability
20
THANK YOU.
Zhe.Dong@emc.com

More Related Content

What's hot

GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleGPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleSpark Summit
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonJen Aman
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...Databricks
 
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDistributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDatabricks
 
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Databricks
 
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Databricks
 
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark Clusters
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark ClustersTensorFlowOnSpark: Scalable TensorFlow Learning on Spark Clusters
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark ClustersDataWorks Summit
 
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaJen Aman
 
How to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsHow to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsDatabricks
 
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...Databricks
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016MLconf
 
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnPrediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnJosef A. Habdank
 
Scalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In BaiduScalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In BaiduJen Aman
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningSpark Summit
 
Build, Scale, and Deploy Deep Learning Pipelines with Ease
Build, Scale, and Deploy Deep Learning Pipelines with EaseBuild, Scale, and Deploy Deep Learning Pipelines with Ease
Build, Scale, and Deploy Deep Learning Pipelines with EaseDatabricks
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Databricks
 
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakLeveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakDatabricks
 
Advertising Fraud Detection at Scale at T-Mobile
Advertising Fraud Detection at Scale at T-MobileAdvertising Fraud Detection at Scale at T-Mobile
Advertising Fraud Detection at Scale at T-MobileDatabricks
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterDatabricks
 

What's hot (20)

GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleGPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark with Ma...
 
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim DowlingDistributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling
 
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distr...
 
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
 
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark Clusters
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark ClustersTensorFlowOnSpark: Scalable TensorFlow Learning on Spark Clusters
TensorFlowOnSpark: Scalable TensorFlow Learning on Spark Clusters
 
Tensorflow vs MxNet
Tensorflow vs MxNetTensorflow vs MxNet
Tensorflow vs MxNet
 
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
 
How to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML modelsHow to use Apache TVM to optimize your ML models
How to use Apache TVM to optimize your ML models
 
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...
Simplify Distributed TensorFlow Training for Fast Image Categorization at Sta...
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearnPrediction as a service with ensemble model in SparkML and Python ScikitLearn
Prediction as a service with ensemble model in SparkML and Python ScikitLearn
 
Scalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In BaiduScalable Deep Learning Platform On Spark In Baidu
Scalable Deep Learning Platform On Spark In Baidu
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
 
Build, Scale, and Deploy Deep Learning Pipelines with Ease
Build, Scale, and Deploy Deep Learning Pipelines with EaseBuild, Scale, and Deploy Deep Learning Pipelines with Ease
Build, Scale, and Deploy Deep Learning Pipelines with Ease
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
 
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakLeveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
 
Advertising Fraud Detection at Scale at T-Mobile
Advertising Fraud Detection at Scale at T-MobileAdvertising Fraud Detection at Scale at T-Mobile
Advertising Fraud Detection at Scale at T-Mobile
 
From Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim HunterFrom Pipelines to Refineries: scaling big data applications with Tim Hunter
From Pipelines to Refineries: scaling big data applications with Tim Hunter
 

Similar to Which Is Deeper - Comparison Of Deep Learning Frameworks On Spark

Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Spark Summit
 
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Spark Summit
 
Urs Köster Presenting at RE-Work DL Summit in Boston
Urs Köster Presenting at RE-Work DL Summit in BostonUrs Köster Presenting at RE-Work DL Summit in Boston
Urs Köster Presenting at RE-Work DL Summit in BostonIntel Nervana
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesAhsan Javed Awan
 
Profiling & Testing with Spark
Profiling & Testing with SparkProfiling & Testing with Spark
Profiling & Testing with SparkRoger Rafanell Mas
 
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...Indrajit Poddar
 
Pankaj_Kapila
Pankaj_Kapila Pankaj_Kapila
Pankaj_Kapila Panapka
 
Deep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDeep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDataWorks Summit
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsS N
 
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Chris Fregly
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime PlatformAlexey Kharlamov
 
Use Case: Apollo Group at Oracle Open World
Use Case: Apollo Group at Oracle Open WorldUse Case: Apollo Group at Oracle Open World
Use Case: Apollo Group at Oracle Open WorldMongoDB
 
Pankaj_Kapila
Pankaj_KapilaPankaj_Kapila
Pankaj_KapilaPanapka
 
Spark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit
 
Spark to Production @Windward
Spark to Production @WindwardSpark to Production @Windward
Spark to Production @WindwardDemi Ben-Ari
 

Similar to Which Is Deeper - Comparison Of Deep Learning Frameworks On Spark (20)

Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
 
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach, Case St...
 
01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf01-06 OCRE Test Suite - Fernandes.pdf
01-06 OCRE Test Suite - Fernandes.pdf
 
Urs Köster Presenting at RE-Work DL Summit in Boston
Urs Köster Presenting at RE-Work DL Summit in BostonUrs Köster Presenting at RE-Work DL Summit in Boston
Urs Köster Presenting at RE-Work DL Summit in Boston
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of Techniques
 
Profiling & Testing with Spark
Profiling & Testing with SparkProfiling & Testing with Spark
Profiling & Testing with Spark
 
Spark
SparkSpark
Spark
 
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...
Lessons Learned from Deploying Apache Spark as a Service on IBM Power Systems...
 
Pankaj_Kapila
Pankaj_Kapila Pankaj_Kapila
Pankaj_Kapila
 
Spark Workshop
Spark WorkshopSpark Workshop
Spark Workshop
 
Deep Learning with Spark and GPUs
Deep Learning with Spark and GPUsDeep Learning with Spark and GPUs
Deep Learning with Spark and GPUs
 
Paddle_Spark_Summit
Paddle_Spark_SummitPaddle_Spark_Summit
Paddle_Spark_Summit
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloads
 
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime Platform
 
Use Case: Apollo Group at Oracle Open World
Use Case: Apollo Group at Oracle Open WorldUse Case: Apollo Group at Oracle Open World
Use Case: Apollo Group at Oracle Open World
 
Pankaj_Kapila
Pankaj_KapilaPankaj_Kapila
Pankaj_Kapila
 
Percona presentation v2
Percona presentation v2Percona presentation v2
Percona presentation v2
 
Spark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef Habdank
 
Spark to Production @Windward
Spark to Production @WindwardSpark to Production @Windward
Spark to Production @Windward
 

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Recently uploaded

Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 

Recently uploaded (17)

Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 

Which Is Deeper - Comparison Of Deep Learning Frameworks On Spark

  • 1. Which Is Deeper Comparison of Deep Learning Frameworks Atop Spark Zhe Dong, Dr. Yu Cao EMC Corporation
  • 2. Outline • Motivation • Theoretical Principle • State-of-the-Art • Evaluation Criteria • Evaluation Results • Summary • Conclusion 2
  • 3. Deep Learning on Spark Motivation • Single-machine DL • Low efficiency (in hours to even days) • Limited DNN model capability (hard to support billions of parameters) • Dedicated deep learning cluster • Massive data movement • High maintenance cost • Spark+Deep Learning = Truly All-in-One 3
  • 4. Theoretical Principle • Large Scale Distributed Deep Networks,Jeffrey Dean,2012 • Model parallelism • Data parallelism https://papers.nips.cc/paper/4687-large-scale- distributed-deep-networks.pdf 4
  • 5. Data Parallelism for distributed SGD • Model is replicated on worker nodes • Two repeating steps – Train each model replica with mini-batches – Synchronize model parameters across cluster • Specific implementations can be different – How parameters are combined – Synchronization (strong or weak) – Parameter server (centralized or not) 5
  • 6. DownpourSGD Client Pseudo code http://www.cs.toronto.edu/~ranzato/publications/DistBeliefNIPS2012_withAppendix .pdf 6
  • 7. DL on Spark – State-of-the-Art l AMPLab SparkNet l Yahoo! CaffeOnSpark l Arimo Tensorflow On Spark l Skymind DeepLearning4J l DeepDist l H2O Spark 7
  • 8. Evaluation Criteria Evaluation Criteria Dimensions For Example Ease of Getting Started Documentation Are there detailed, well-organized, up-to-date documents? Installation How automatic it is? Built-in Examples Examples available for quick warming up? Ease of Use Interface Programming language support Model Encapsulation Model/Layer/Node Functionality Built-in Models Which NN models have been implemented? Parallelism Model parallelism or data parallelism Performance Performance MNIST benchmark results Status Quo Community Vitality Github project statistics Enterprise Support Contributions from organizations? 8
  • 9. SparkNet • Started by AMPLab from 2015 • Wrapper of Caffe and Tensorflow • Centralized parameter server • Strong SGD synchronization • Differentiating feature: A fixed number (τ) of iterations (mini- batch) on its subset of data http://arxiv.org/pdf/1511.06051v4 9
  • 10. AMPLab SparkNet - EvaluationEvaluation Criteria Dimensions SparkNet Score Ease of Getting Started Documentation Paper; No Blog; README.md in Github Installation No Installation; Have to copy to each worker node Built-in Examples Cifar10/MNIST/ImageNet Ease of Use Interface Java/Scala Model Encapsulation Model/Layer Functionality Built-in Models Tensorflow and Caffe Parallelism Data Parallelism Performance Performance MNIST Status Quo Community Vitality Enterprise Support AMPLab 1 2 3 4 Iterations 1000 2000 5000 10000 Time (seconds) 2130 4218 10471 21003 Accuracy 94.13% 94.26% 94.01% 94.22% 10
  • 11. Deeplearning4J • Started by Skymind from 2014 • An open-source, distributed deep-learning project in Java and Scala • Parameter server: IterativeReduce • Strong SGD synchronization http://deeplearning4j.org/iterativereduce.html 11
  • 12. Deeplearning4J - Evaluation Evaluation Criteria Dimensions DL4J Score Ease of Getting Started Documentation Comprehensive but bad-organized Installation No Installation Built-in Examples Only For CDH5;MNIST/IRIS/GravesLSTM Ease of Use Interface Java/Scala Model Encapsulation Layer Functionality Built-in Models CNN/RNN/LSTM/DBN/SAE Parallelism Data Parallelism Performance Performance MNIST Status Quo Community Vitality Enterprise Support Skymind 1 2 3 4 Epochs 5 10 15 20 Time (seconds) 2098 4205 6303 8367 Accuracy 70% 79% 82.7% 84.6% 12
  • 13. CaffeOnSpark • Started by Yahoo! from 2015 • Peer-to-Peer parameter server • Strong SGD synchronization • Distinguishing feature: MPI Allreduce, RMDA, Infiniband w1 w2 w3 w1 w2 w3w1 w2 w3 Worker 1 (Parameter Server for w1) Worker 3 (Parameter Server for w3) Worker 2 (Parameter Server for w2) Weights propagation (Gradients are sent in reverse direction) 13
  • 14. CaffeOnSpark - Evaluation Evaluation Criteria Dimensions CaffeOnSpark Score Ease of Getting Started Documentation Blog; README.md in github Installation Have to install all Caffe needed in each node Built-in Examples Cifar10/MNIST Ease of Use Interface Java/Scala, DataFrames Model Encapsulation Model Functionality Built-in Models Caffe Parallelism Data Parallelism Performance Performance MNIST Status Quo Community Vitality Enterprise Support Yahoo! 1 2 3 4 Iterations 1000 2000 5000 10000 Time(seconds) 224 445 1113 2229 Accuracy 97% 99.4% 99.7% 99.6% 14
  • 15. Tensorflow on Spark • Started by Arimo from 2014 • A data-parallel Downpour SGD implementation on Spark • Centralized parameter server • Weak SGD synchronization 15
  • 16. Tensorflow on Spark - Evaluation Evaluation Criteria Dimensions Tensorflow on Spark Score Ease of Getting Started Documentation Blog; Spark Summit East 2016 slides and video Installation Dependent on Tensorflow and tornado Built-in Examples MNISTcnn/MNISTdnn/higgsdnn/moleculardnn Ease of Use Interface Python Model Encapsulation Model/Layer Functionality Built-in Models Tensorflow Parallelism Data Parallelism Performance Performance MNIST Status Quo Community Vitality Enterprise Support Arimo 1 2 3 4 Epochs 5 10 15 20 Time(seconds) 223 415 615 828 Accuracy 93% 94% 94.2% 95.4% 16
  • 17. Benchmark – MNIST 20 30 40 50 60 70 80 90 100 0 2000 4000 6000 8000 10000 SparkNet DL4J CaffeOnSpark Tensorflow on Spark One master (16-Core,64GB) Five slaves (8-Core,32GB) Executor memory: 20GB Batch size: 64 Accuracy Time (seconds) 17
  • 18. Benchmark – MNIST 20 30 40 50 60 70 80 90 100 0 200 400 600 800 1000 SparkNet CaffeOnSpark Tensorflow on Spark One master (16-Core,64GB) Five slaves (8-Core,32GB) Executor memory: 20GB Batch size: 64 Accuracy Time (seconds) 18
  • 19. Tensorflow On Spark - Evaluation • Easy of Use – Language:Java/Scala – Interface Level: Model/High Level Network Structure • Function – Algorithm:Excellent – Data Parallel – Only Ethernet • Easy to Get Start – Document: Average – Need to setup in each node – Example: Average • Performance • Maturity – Early Stage – Community: bad – Commercial or Big company support: AMPLab Evaluation Criteria Dimensions SparkNet DL4J CaffeOnSpark Tensorflow on Spark Ease of Getting Started Documentation Installation Built-in Examples Ease of Use Interface Model Encapsulation Functionality Built-in Models Parallelism Performance Performance Status Quo Community Vitality Enterprise Support 19
  • 20. Conclusion • Common issues – Lack of model parallelism – Potential network congestion – Early-stage development • Future evaluation work – GPU integration – SGD synchronization – Scalability 20