Naveen Swamy
Distributed Deep Learning Inference
using Apache MXNet* and Apache Spark
Amazon AI
Outline
• Review of Deep Learning
• Apache MXNet Framework
• Distributed Inference using MXNet and Spark
Deep Learning
• Originally inspired by our biological neural systems.
• A system that learns important features from experience.
• Layers of neurons learning concepts.
• Deep learning != deep understanding
Figure: a deep network builds increasingly abstract features – input layer (raw pixels) → 1st hidden layer (edges) → 2nd hidden layer (corners & contours) → 3rd hidden layer (object parts) → output (object identity: car, person, dog).
Credit: Ian Goodfellow et al., Deep Learning Book
• Algorithmic Advances (Faster Learning)
• Abundance of Data (Deeper Networks)
• High Performance Compute, GPUs (Faster Experiments)
Bigger and Better Models = Better AI Products
Why does Deep Learning matter?
• Autonomous Vehicles
• Personal Assistants
• Solve Intelligence ???
• Health care
Deep Learning & AI, Limitations
Figure: Deep Learning is a subset of Machine Learning, which is itself a subset of Artificial Intelligence.
DL Limitations:
• Requires lots of data and compute power.
• Cannot detect inherent bias in the data - transparency.
• Uninterpretable results.
Deep Learning Training
• Pass data through the network – forward pass
• Define an objective – Loss function
• Send the error back – backward pass
• Model: the output of training a neural network
Figure: the training loop – data and labels flow in, the forward pass produces a prediction ("dog"), the prediction is compared against the label to compute an error, and the backward pass sends that error back through the network.
Figure: a small example network with inputs x1, x2 (both 0.1), hidden units h1, h2, and weights w1–w6 (all 0.5 except w5 = 0.4); the forward pass predicts y' = 0.9 against the target y = 1.0, giving loss l = y – y' = 0.1, which the backward pass propagates to update the weights.
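Below is a minimal sketch of one training step (forward pass, loss, backward pass) using the MXNet Gluon Python API; the tiny two-layer network, constant weight initialization, and single data point are illustrative assumptions, not the exact example from the slide.

```python
import mxnet as mx
from mxnet import nd, autograd, gluon

# A tiny network: 2 inputs -> 2 hidden units -> 1 output
net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(2, activation='sigmoid'),
        gluon.nn.Dense(1))
net.initialize(mx.init.Constant(0.5))        # all weights start at 0.5 (illustrative)

loss_fn = gluon.loss.L2Loss()                # the objective / loss function
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

x = nd.array([[0.1, 0.1]])                   # data
y = nd.array([[1.0]])                        # label

with autograd.record():                      # forward pass
    y_hat = net(x)
    loss = loss_fn(y_hat, y)
loss.backward()                              # backward pass: send the error back
trainer.step(1)                              # update the weights
print(y_hat.asnumpy(), loss.asnumpy())
```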
Deep Learning Inference
• Real-time Inference: tasks that require an immediate result.
• Batch Inference: tasks that run over large data sets.
o Pre-computations are necessary - recommender systems.
o Backfilling with state-of-the-art models.
o Testing new models on historic data.
Figure: at inference time, input data is passed forward through the trained model to produce a prediction ("dog").
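A minimal sketch of real-time (single-image) inference: load a pretrained model and run one forward pass. It uses the MXNet Gluon model zoo; the image file name and the simplified preprocessing (no ImageNet mean/std normalization) are illustrative assumptions.

```python
import mxnet as mx
from mxnet import nd
from mxnet.gluon.model_zoo import vision

# Load a pretrained ImageNet ResNet-18 from the Gluon model zoo
net = vision.resnet18_v1(pretrained=True)

# Read and preprocess one image (proper mean/std normalization omitted for brevity)
img = mx.image.imread('dog.jpg')                  # HWC, uint8
img = mx.image.imresize(img, 224, 224)
img = nd.transpose(img, axes=(2, 0, 1))           # -> CHW
img = nd.expand_dims(img, axis=0).astype('float32') / 255.0

# Forward pass: one prediction for one image
prob = nd.softmax(net(img))
print('predicted class id:', int(nd.argmax(prob, axis=1).asscalar()))
```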
Types of Learning
• Supervised Learning – uses labeled training data to learn to associate input data with outputs.
Example: image classification, speech recognition, machine translation.
• Unsupervised Learning – learns patterns from unlabeled data.
Example: clustering, association discovery.
• Active Learning – semi-supervised, with a human in the loop.
• Reinforcement Learning – learns from the environment, using rewards and feedback.
Outline
• Apache MXNet Framework
• Distributed Inference using MXNet and Spark
Why MXNet
MXNet – NDArray & Symbol
• NDArray – imperative tensor operations that work on both CPUs and GPUs.
• Symbol APIs – similar to NDArray but adopt declarative programming, enabling graph-level optimization.
Figure: a symbolic program and its corresponding computation graph.
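A minimal sketch contrasting the two styles with the MXNet Python API (shapes and values are illustrative):

```python
import mxnet as mx

# NDArray: imperative tensor operations, executed eagerly on CPU or GPU
a = mx.nd.ones((2, 3))                     # on the CPU by default
b = mx.nd.ones((2, 3)) * 2                 # pass ctx=mx.gpu(0) to place arrays on a GPU
c = a + b                                  # computed immediately
print(c.asnumpy())

# Symbol: declarative - build the computation graph first, optimize and execute later
x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
z = (x + y) * 2                            # only constructs the graph, nothing runs yet
executor = z.bind(mx.cpu(), {'x': a, 'y': b})   # bind concrete data to the graph
print(executor.forward()[0].asnumpy())     # the graph is executed here
```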
MXNet - Module
High level APIs to work with Symbol
1) Create Graph
2) Bind
3) Pass data
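A minimal sketch of the three Module steps; the tiny fully-connected network, batch size, and random input batch are illustrative assumptions:

```python
import mxnet as mx

# 1) Create graph: a small symbolic network
data = mx.sym.Variable('data')
fc = mx.sym.FullyConnected(data, num_hidden=10, name='fc1')
net = mx.sym.SoftmaxOutput(fc, name='softmax')
mod = mx.mod.Module(symbol=net, data_names=['data'], label_names=['softmax_label'])

# 2) Bind: allocate memory for the given input/label shapes, then initialize parameters
mod.bind(data_shapes=[('data', (32, 100))],
         label_shapes=[('softmax_label', (32,))])
mod.init_params()

# 3) Pass data: run a forward pass on one batch
batch = mx.io.DataBatch(data=[mx.nd.random.uniform(shape=(32, 100))],
                        label=[mx.nd.zeros((32,))])
mod.forward(batch, is_train=False)
print(mod.get_outputs()[0].shape)          # (32, 10) class probabilities
```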
Outline
• Distributed Inference using MXNet and Spark
Distributed Inference Challenges
• Similar to large scale data processing systems:
o High Performance DL framework
o Distributed Cluster
o Resource Management
o Job Management
o Efficient Partitioning of Data
o Deep Learning Setup
Apache Spark:
• Multiple Cluster Managers
• Works well with MXNet.
• Integrates with Hadoop & big data tools.
MXNet + Spark for Inference.
• ImageNet-trained ResNet-18 classifier.
• For the demo, the CIFAR-10 test dataset with 10K images.
• PySpark on Amazon EMR; MXNet is also available in Scala.
• Inference on CPUs, can be extended to use GPUs.
Distributed Inference Pipeline
1) Download the S3 keys on the driver.
2) Create an RDD of keys and partition it.
3) On each executor (via mapPartitions), initialize the model only once per partition, then fetch batches of images, decode them to numpy arrays, and run prediction.
4) Collect the predictions.
MXNet + Spark for Inference.
• On the driver
• On the executor
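Below is a minimal PySpark sketch of the pipeline, with the model initialized once per partition inside mapPartitions. The bucket name, the ResNet-18 checkpoint prefix ("resnet-18"), the partition count, and the preprocessing are illustrative assumptions, not the exact code from the talk.

```python
import boto3
import numpy as np
import mxnet as mx
from mxnet import nd
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mxnet-batch-inference").getOrCreate()
sc = spark.sparkContext

# On the driver: download the list of S3 keys and partition them into an RDD
s3 = boto3.client("s3")
keys = [o["Key"] for o in s3.list_objects_v2(Bucket="my-image-bucket")["Contents"]]
rdd = sc.parallelize(keys, numSlices=16)

def predict_partition(key_iter):
    # On the executor: initialize the model only once per partition
    sym, arg_params, aux_params = mx.model.load_checkpoint("resnet-18", 0)
    mod = mx.mod.Module(symbol=sym, label_names=None)
    mod.bind(data_shapes=[("data", (1, 3, 224, 224))], for_training=False)
    mod.set_params(arg_params, aux_params)
    s3 = boto3.client("s3")
    for key in key_iter:
        # Fetch an image from S3 and decode it to an array
        raw = s3.get_object(Bucket="my-image-bucket", Key=key)["Body"].read()
        img = mx.image.imresize(mx.image.imdecode(raw), 224, 224)
        img = nd.expand_dims(nd.transpose(img, axes=(2, 0, 1)), axis=0).astype("float32")
        # Run prediction
        mod.forward(mx.io.DataBatch([img]), is_train=False)
        prob = mod.get_outputs()[0].asnumpy()[0]
        yield key, int(np.argmax(prob))

# Collect the predictions back on the driver
predictions = rdd.mapPartitions(predict_partition).collect()
```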
Summary
• Overview of Deep Learning
o How Deep Learning works and Why Deep Learning is a big deal.
o Phases of Deep Learning
o Types of Learning
• Apache MXNet – Efficient deep learning library
o NDArray/Symbol/Module
• Apache MXNet and Spark for distributed Inference.
What’s Next?
• Released simplified Scala Inference APIs (v1.2.0)
o Available on Maven: org.apache.mxnet
• Working on Java APIs for Inference.
• DataFrame support is under consideration.
• MXNet community is fast evolving, join hands to democratize AI.
Resources/References
• https://github.com/apache/incubator-mxnet
• Blog: Distributed Inference using MXNet and Spark
• Distributed Inference code sample on GitHub
• Apache MXNet Gluon Tutorials
• Apache MXNet – Flexible and efficient deep learning.
• The Deep Learning Book
• MXNet – Using pre-trained models
• Amazon Elastic MapReduce
Thank You
nswamy@apache.org
