CaffeOnSpark Update: Recent Enhancements and Use Cases
Mridul Jain and Jun Shi
Yahoo!
CaffeOnSpark Open Sourced (Q1 2016)
▪ Apache 2.0 license
▪ Distributed deep learning
› GPU or CPU servers
› Ethernet or InfiniBand connection
▪ Easily deployed on public cloud (e.g., EC2) or private cloud
github.com/yahoo/CaffeOnSpark
Deep Learning on Spark
▪ TensorFlowOnSpark: open-sourced by Yahoo! in 2017
› Design pattern of CaffeOnSpark applied to TensorFlow
› RDMA support for TensorFlow, contributed by Yahoo!
› DataWorks Summit 2017 talk by Lee Yang and Andy Feng
▪ Spark Deep Learning from Databricks
› Enables TensorFlow inferencing via Spark MLlib pipelines
Agenda
Use cases from Yahoo!
› Flickr
› NSFW
› e-Sports
New Features
› Unified data layer with multi-label datasets
› Training with validation
› LSTM - training/inference
› Docker
Demo: Autocaptioning (LSTM)
Deep Learning at Flickr
● In production for 2 years
● Powers search and Magic View
● Recognizes ~2K concepts
● Photo aesthetics models
Flickr Magic View
• Released with Flickr 4.0
• https://flickr.com/cameraroll
• Photos organized according to 70 categories
• Allows serendipitous photo discovery
NSFW: open-source framework built on CaffeOnSpark
https://github.com/yahoo/open_nsfw
Yahoo eSports: Game Highlight Reel
https://soundcloud.com/theaipodcast/ep-23-how-yahoo-uses-ai-to-create-instant-esports-highlight-reels/s-oeTfz
Deep Learning vs. Hadoop
1. Prepare datasets
2. DL training & test
3. Apply DL model
(Diagram labels: Data, Model)
CaffeOnSpark: Deep Learning on Spark
Addressing DL Challenges
➔ Data/Feature processing
◆ Spark enables large scale distributed processing
◆ DataFrames and multi-label input
➔ DL Model exploration and training
◆ Classic ML and Deep Learning models
◆ Python, Scala API
◆ Training with cross validation
➔ DL Inferencing
◆ Distributed inferencing using Spark
DataFrames: Our Primary Data Format
Distributed collection of data in named columns
Training/test dataframe:
id  label  data
1   cat
2   cat
3   dog
Feature dataframe:
id  label  fn1     fn2
1   cat    [1.5]   [0.1, 2.1]
2   cat    [1.7]   [0.2, 1.0]
3   dog    [11.9]  [22.0, 2.0]
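The two dataframes on this slide can be sketched in plain Python. This is only an illustration: lists of dicts stand in for Spark DataFrames so the example runs without a cluster, the `data` byte strings are placeholders for image bytes, and the helper names `column` and `join_on_id` are hypothetical, not part of CaffeOnSpark.

```python
# Plain-Python stand-in for the two dataframes on the slide.
# A real pipeline would hold these rows in a Spark DataFrame.

# Training/test dataframe: id, label, and the raw example.
# The byte strings are placeholders, not real image data.
train_rows = [
    {"id": 1, "label": "cat", "data": b"<image bytes>"},
    {"id": 2, "label": "cat", "data": b"<image bytes>"},
    {"id": 3, "label": "dog", "data": b"<image bytes>"},
]

# Feature dataframe: same ids and labels, plus named feature columns.
feature_rows = [
    {"id": 1, "label": "cat", "fn1": [1.5], "fn2": [0.1, 2.1]},
    {"id": 2, "label": "cat", "fn1": [1.7], "fn2": [0.2, 1.0]},
    {"id": 3, "label": "dog", "fn1": [11.9], "fn2": [22.0, 2.0]},
]


def column(rows, name):
    """Return one named column as a list, in row order."""
    return [row[name] for row in rows]


def join_on_id(left, right):
    """Inner-join two row lists on the shared 'id' column."""
    right_by_id = {row["id"]: row for row in right}
    return [
        {**row, **right_by_id[row["id"]]}
        for row in left
        if row["id"] in right_by_id
    ]


joined = join_on_id(train_rows, feature_rows)
print(column(feature_rows, "fn1"))
```

Named columns are what make the feature dataframe easy to join back onto the original examples by `id`.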
Unified Data Layer: Replaces Memory DataLayer
● Scalar Column
● Multi-dimensional Column
● Byte arrays (encoded or raw images)
id  label        data
1   cat, feline
2   dog, animal
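One common way to turn multi-label rows like these into a fixed-width numeric column is multi-hot encoding. The sketch below is an assumption for illustration; the slides do not say how CaffeOnSpark encodes multiple labels internally, and `build_vocabulary` and `multi_hot` are hypothetical helper names.

```python
def build_vocabulary(rows):
    """Map each distinct label to a column index, in sorted order."""
    labels = sorted({label for row in rows for label in row["labels"]})
    return {label: index for index, label in enumerate(labels)}


def multi_hot(labels, vocab):
    """Return a vector with 1.0 at the index of every label present."""
    vector = [0.0] * len(vocab)
    for label in labels:
        vector[vocab[label]] = 1.0
    return vector


# The two rows from the slide's table (image data omitted).
rows = [
    {"id": 1, "labels": ["cat", "feline"]},
    {"id": 2, "labels": ["dog", "animal"]},
]

vocab = build_vocabulary(rows)
encoded = {row["id"]: multi_hot(row["labels"], vocab) for row in rows}
print(vocab)
print(encoded)
```

Each row then carries one multi-dimensional label column instead of a single scalar label.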
Multi-label per image
cooper by paddy patterson
● labels: cooper, labrador, dog, yacht, pet
● predictions: animal, dog, pet, beach, grass
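To make the comparison concrete, the overlap between the labels and the predictions listed above can be scored with set-based precision and recall. This is a generic multi-label metric offered as a sketch, not something the slides say CaffeOnSpark computes; `precision_recall` is a hypothetical helper.

```python
def precision_recall(labels, predictions):
    """Score one image: which predictions hit a true label, and how often."""
    labels, predictions = set(labels), set(predictions)
    hits = labels & predictions
    precision = len(hits) / len(predictions) if predictions else 0.0
    recall = len(hits) / len(labels) if labels else 0.0
    return sorted(hits), precision, recall


# Values copied from the slide.
labels = ["cooper", "labrador", "dog", "yacht", "pet"]
predictions = ["animal", "dog", "pet", "beach", "grass"]

hits, precision, recall = precision_recall(labels, predictions)
print(hits, precision, recall)  # ['dog', 'pet'] 0.4 0.4
```

Exact matching undercounts here: "animal" is a sensible prediction for a labrador even though it is not among the user-supplied labels.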
Training with Validation
▪ Runs the test set interleaved with training iterations
▪ Users control how many test iterations run after each configurable block of training iterations
▪ Helps check how your models are evolving without waiting until training ends
CaffeOnSpark now on Docker
▪ Docker support for both CPU and GPU
In addition to:
› Single node
› Spark standalone cluster
› YARN cluster
› EC2
▪ Enables seamless deployment on multiple platforms
▪ Thanks to Arun Das (Open Cloud Institute) for contributing
LSTM - Long Short-Term Memory
Courtesy: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-sequences.pdf
CaffeOnSpark LSTM Demo
CaffeOnSpark IPython Notebook:
https://github.com/yahoo/CaffeOnSpark/blob/master/caffe-grid/src/main/python/examples/ImageCaptioning.ipynb
▪ Dataset creation
▪ Image model training: CNN
▪ Caption model training: LSTM
▪ Caption generation
Dataset: http://mscoco.org
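As a rough illustration of the dataset creation step, the sketch below groups captions by image into one row per image. The dictionary follows the general layout of an MS COCO captions annotation file (`images` and `annotations` lists), but the entries are invented for the example, `caption_rows` is a hypothetical helper, and the notebook's actual preparation code may differ.

```python
from collections import defaultdict

# Tiny in-memory stand-in shaped like an MS COCO captions file.
# File names and captions are made up for this example.
coco = {
    "images": [
        {"id": 1, "file_name": "0001.jpg"},
        {"id": 2, "file_name": "0002.jpg"},
    ],
    "annotations": [
        {"image_id": 1, "caption": "A dog on a beach."},
        {"image_id": 2, "caption": "A cat on a sofa."},
        {"image_id": 1, "caption": "A labrador running on sand."},
    ],
}


def caption_rows(coco):
    """Build one row per image with all of its captions attached."""
    captions = defaultdict(list)
    for annotation in coco["annotations"]:
        captions[annotation["image_id"]].append(annotation["caption"])
    return [
        {
            "id": image["id"],
            "file_name": image["file_name"],
            "captions": captions.get(image["id"], []),
        }
        for image in coco["images"]
    ]


rows = caption_rows(coco)
for row in rows:
    print(row["id"], row["file_name"], row["captions"])
```

Rows in this shape map naturally onto a dataframe with an id column, an image column, and a multi-valued caption column.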
Summary
▪ CaffeOnSpark makes Deep Learning Scalable
› Easy to install with Docker or on EC2
› Easy integration with your existing big data Spark pipelines
▪ Contributions welcome:
› Caffe2 integration
› Java API
› Asynchronous Distributed Training
Thank You!
● github.com/yahoo/caffeonspark
● caffeonspark-users@googlegroups.com
CaffeOnSpark: Scalable Architecture
