Bootstrap Custom Image Classification using Transfer Learning by Danielle Dean and Wee Hyong at Strata Data Conference Singapore 2017

Bootstrap Custom Image Classification using
Transfer Learning
Credits: Mark Hamilton, Ilia Karmanov, Anusua Trivedi, Vivek Gupta, Patrick Buehler, Alok Kirpal
Danielle Dean PhD, Wee Hyong Tok PhD
Principal Data Scientist Lead
Cloud AI
Microsoft
@danielleodean | @weehyong
Strata Singapore 2017

What are the common models?
CNN: Convolutional Neural Network
RNN: Recurrent Neural Network

Before 2017: ResNet-50, NVIDIA M40 GPU, 14 days (10^18 single precision operations)
April 2017: ResNet-50, 32 CPUs and 256 Nvidia P100 GPUs, 1 hour (Facebook)
Sept 2017: ResNet-50, 1,600 CPUs, 31 minutes (UC Berkeley, TACC, UC Davis)
Nov 2017: ResNet-50, 1,024 P100 GPUs, 15 minutes (Preferred Networks, ChainerMN)
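To put the timeline in perspective, the listed training times can be turned into speedup factors. The short sketch below does only that arithmetic; it reads the slide's operation count as 10^18, and the dictionary keys are labels chosen here for illustration. The runs used very different hardware, so the ratios describe wall-clock time, not per-device efficiency.

```python
# ResNet-50 training times as listed on the slide, converted to minutes.
training_minutes = {
    "Before 2017 (NVIDIA M40 GPU)": 14 * 24 * 60,
    "April 2017 (256 P100 GPUs)": 60,
    "Sept 2017 (1,600 CPUs)": 31,
    "Nov 2017 (1,024 P100 GPUs)": 15,
}

baseline = training_minutes["Before 2017 (NVIDIA M40 GPU)"]

# Wall-clock speedup of each run relative to the 14-day baseline.
speedups = {name: baseline / minutes for name, minutes in training_minutes.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.0f}x faster than the 14-day baseline")

# Average throughput implied by about 10^18 operations spread over 14 days.
ops_per_second = 1e18 / (14 * 24 * 60 * 60)
print(f"Implied average throughput: {ops_per_second:.2e} operations per second")
```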

Overview
Use Cases
Image Classification using
Transfer Learning
How to jumpstart

Computer Vision Patterns
[Figure: computer vision patterns; surviving labels: "Yes", "Similar image", "Query image"]

Use Cases
Aerial Use Classification
ESmart – Connected Drone and Power line inspection
Jabil – Defect Inspection
Lung Cancer Detection

What is common in all these use cases?
Convolutional Neural Network (CNN) + Transfer Learning

Clothing texture dataset: Striped, Argyle, Dotted
Applying transfer learning to accurately classify clothing texture
https://github.com/Azure/MachineLearningSamples-ImageClassificationUsingCNTK

Clothing texture dataset: Striped, Argyle, Dotted
Applying transfer learning to accurately classify clothing texture
https://github.com/Azure/ImageSimilarityUsingCntk

ImageNet
14,197,122 images
21,841 synsets
Diverse images, lots of labels!

AlexNet, 8 layers (ILSVRC 2012)
11x11 conv, 96, /4, pool/2
5x5 conv, 256, pool/2
3x3 conv, 384
3x3 conv, 384
fc, 4096
fc, 4096
fc, 1000
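Each AlexNet row reads as kernel size, number of output channels, then optional stride ("/4") and pooling ("pool/2"). As a rough illustration of what that notation implies, the sketch below counts the weights and biases of the listed convolution layers. It assumes an RGB (3-channel) input and no channel grouping, neither of which is stated on the slide, and the helper name `conv_params` is hypothetical.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square 2-D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels

# (kernel size, output channels) for the convolution rows listed on the slide.
alexnet_convs = [(11, 96), (5, 256), (3, 384), (3, 384)]

in_channels = 3  # assumption: RGB input image
counts = []
for kernel, out_channels in alexnet_convs:
    counts.append(conv_params(kernel, in_channels, out_channels))
    in_channels = out_channels  # the next layer consumes this layer's output

for (kernel, out_channels), count in zip(alexnet_convs, counts):
    print(f"{kernel}x{kernel} conv, {out_channels}: {count:,} parameters")
```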
VGG, 19 layers (ILSVRC 2014)
3x3 conv, 64
3x3 conv, 128
3x3 conv, 256
3x3 conv, 256
3x3 conv, 256
3x3 conv, 512
3x3 conv, 512
3x3 conv, 512
3x3 conv, 512
3x3 conv, 512
3x3 conv, 512
fc, 4096
fc, 4096
fc, 1000
GoogLeNet, 22 layers (ILSVRC 2014)
[Figure: GoogLeNet architecture diagram: a stem of Conv, MaxPool and LocalRespNorm layers, followed by stacked modules of parallel 1x1, 3x3 and 5x5 convolutions and max pooling joined by DepthConcat, with three softmax outputs (softmax0, softmax1, softmax2)]
ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

ResNet, 152 layers
7x7 conv, 64, /2, pool/2
1x1 conv, 64
3x3 conv, 64
1x1 conv, 256
1x1 conv, 64
3x3 conv, 64
1x1 conv, 256
1x1 conv, 64
3x3 conv, 64
1x1 conv, 256
1x1 conv, 128, /2
3x3 conv, 128
1x1 conv, 512
1x1 conv, 128
3x3 conv, 128
1x1 conv, 512
1x1 conv, 128
3x3 conv, 128
1x1 conv, 512
1x1 conv, 128
3x3 conv, 128
1x1 conv, 512
1x1 conv, 128
3x3 conv, 128
1x1 conv, 512
1x1 conv, 128
3x3 conv, 128
1x1 conv, 512
Microsoft

Example – Visualizing the different layers
Olah, et al., "Feature Visualization", Distill, 2017
https://distill.pub/2017/feature-visualization/
Another fun site:
https://deepart.io/nips/submissions/random/
http://cs231n.stanford.edu/

https://github.com/ilkarman/Blog/tree/master/Visuals

Types of Transfer Learning

Standard DNN
  How to initialize featurization layers: Random
  Output layer initialization: Random
  How is transfer learning used? None
  How to train? Train featurization and output jointly

Headless DNN
  How to initialize featurization layers: Learn using another task
  Output layer initialization: Separate ML algorithm
  How is transfer learning used? Use the features learned on a related task
  How to train? Use the features to train a separate classifier

Fine Tune DNN
  How to initialize featurization layers: Learn using another task
  Output layer initialization: Random
  How is transfer learning used? Use and fine tune features learned on a related task
  How to train? Train featurization and output jointly with a small learning rate

Multi-Task DNN
  How to initialize featurization layers: Random
  Output layer initialization: Random
  How is transfer learning used? Learned features need to solve many related tasks
  How to train? Share a featurization network across both tasks. Train all networks jointly with a loss function (sum of individual task loss functions)
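The Multi-Task DNN row describes a joint loss that is the sum of the individual task losses over a shared featurization network. A minimal NumPy sketch of that idea follows. The two tasks (one regression, one binary classification), the layer sizes and all variable names are assumptions made for illustration, and only the loss computation is shown, not training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs and targets for two hypothetical tasks on the same examples.
x = rng.normal(size=(8, 10))
y_a = rng.normal(size=8)          # task A: regression targets
y_b = rng.integers(0, 2, size=8)  # task B: binary labels

# One shared featurization layer and one output head per task.
W_shared = 0.1 * rng.normal(size=(10, 4))
w_task_a = rng.normal(size=4)
w_task_b = rng.normal(size=4)

# Both heads read the same shared features.
features = np.maximum(x @ W_shared, 0.0)

# Task A: mean squared error.
loss_a = float(np.mean((features @ w_task_a - y_a) ** 2))

# Task B: binary cross-entropy computed from logits in a numerically stable way.
logits_b = features @ w_task_b
loss_b = float(np.mean(np.logaddexp(0.0, logits_b) - y_b * logits_b))

# Joint objective: the sum of the individual task losses.
total_loss = loss_a + loss_b
print(f"task A: {loss_a:.4f}, task B: {loss_b:.4f}, total: {total_loss:.4f}")
```

Because the total is a plain sum, the gradient reaching `W_shared` is the sum of the gradients from both tasks, which is what pushes the shared features to serve both.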

Pre-Built CNN from General Task on Millions of Images
Layers, from input to output:
  Low-Level Features (lines, edges, color fields, etc.)
  High-Level Features (corners, contours, simple shapes)
  Object Parts (wheels, faces, windows, etc.)
  Complex Objects & Scenes (people, animals, cars, beach scene, etc.)
Output Layer (cat? YES, dog? NO, car? NO): Stripped
Classifier, e.g. SVM (dotted?)
Outputs of penultimate layer of ImageNet Trained CNN provide excellent general purpose image features
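The headless approach above can be sketched in a few lines: features come from a frozen network, and a separate classifier is trained on them. In the sketch below the "pre-trained CNN" is a hypothetical stand-in (a fixed random projection plus ReLU), the images are synthetic, and a nearest-centroid classifier is swapped in for the SVM mentioned on the slide to keep the example dependency-free. None of this is the presenters' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained CNN with its output layer stripped.
# A real setup would load a network trained on a large dataset instead.
W_pretrained = rng.normal(size=(64, 16))

def featurize(images):
    """Map flattened 8x8 images (n, 64) to penultimate-layer features (n, 16)."""
    return np.maximum(images @ W_pretrained, 0.0)

def make_toy_images(n_per_class):
    """Two synthetic classes of flattened images (toy data, not real textures)."""
    class0 = -0.5 + 0.1 * rng.normal(size=(n_per_class, 64))
    class1 = 0.5 + 0.1 * rng.normal(size=(n_per_class, 64))
    images = np.vstack([class0, class1])
    labels = np.array([0] * n_per_class + [1] * n_per_class)
    return images, labels

train_x, train_y = make_toy_images(20)  # only a few labeled examples
test_x, test_y = make_toy_images(50)

# Train the separate classifier on the frozen features: one centroid per class.
train_f = featurize(train_x)
centroids = np.stack([train_f[train_y == c].mean(axis=0) for c in (0, 1)])

def predict(images):
    """Assign each image to the class whose feature centroid is nearest."""
    f = featurize(images)
    distances = np.linalg.norm(f[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

accuracy = float((predict(test_x) == test_y).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

The weights in `W_pretrained` are never updated, which is the defining property of the headless setup: only the classifier on top sees the new labels.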

Pre-Built CNN from General Task on Millions of Images
Output Layer (cat? YES, dog? NO, car? NO): Stripped
Train one or more layers in new network (dotted?)
Using a pre-trained DNN, an accurate model can be achieved with thousands (or fewer) of labeled examples instead of millions
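Fine-tuning differs from the headless setup in that the pre-trained layers keep training together with the new output layer, typically with a small learning rate so the learned features are only nudged. The NumPy sketch below illustrates this on synthetic data. The "pre-trained" weights are a random stand-in, and the two learning rates and all names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary task on flattened 8x8 "images" (synthetic data).
n = 100
labels = np.array([0] * (n // 2) + [1] * (n // 2), dtype=float)
images = np.where(labels[:, None] == 1, 0.5, -0.5) + 0.1 * rng.normal(size=(n, 64))

# Hypothetical stand-in for pre-trained featurization weights.
W_feat = rng.normal(size=(64, 16))
W_feat_start = W_feat.copy()

# New output layer for the new task, initialized from scratch.
w_out = np.zeros(16)
b_out = 0.0

LR_FEATURES = 1e-4  # small rate: adjust pre-trained features gently
LR_OUTPUT = 1e-2    # larger rate for the freshly initialized output layer

def forward(x):
    pre = x @ W_feat
    hidden = np.maximum(pre, 0.0)
    logits = np.clip(hidden @ w_out + b_out, -30.0, 30.0)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return pre, hidden, probs

def log_loss(probs, y):
    eps = 1e-12
    return float(-np.mean(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps)))

initial_loss = log_loss(forward(images)[2], labels)

for _ in range(200):
    pre, hidden, probs = forward(images)
    d_logits = (probs - labels) / n                   # gradient of mean log loss
    grad_w_out = hidden.T @ d_logits
    grad_b_out = d_logits.sum()
    d_hidden = np.outer(d_logits, w_out) * (pre > 0)  # backprop through ReLU
    grad_W_feat = images.T @ d_hidden
    # Train featurization and output jointly, with different step sizes.
    w_out -= LR_OUTPUT * grad_w_out
    b_out -= LR_OUTPUT * grad_b_out
    W_feat -= LR_FEATURES * grad_W_feat

final_probs = forward(images)[2]
final_loss = log_loss(final_probs, labels)
accuracy = float(((final_probs > 0.5) == (labels == 1)).mean())
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, accuracy {accuracy:.2f}")
```

Setting `LR_FEATURES` to zero would freeze the featurization layers and train only the new output layer, which is the other end of the "train one or more layers" spectrum.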

DNN featurization
Input Image Size: 224x224 pixels
Area Under Curve: 0.59
Classification Accuracy: 69.0%
Fine-tuning (full CNN)
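The slide reports two metrics, area under the curve and classification accuracy. For readers unfamiliar with them, here is a small sketch of how each can be computed from classifier scores. The labels and scores are made-up toy values, not the results on the slide, and the 0.5 decision threshold is an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example gets the higher score (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Toy example: four images, two per class.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(f"AUC: {auc(toy_labels, toy_scores):.2f}")
print(f"Accuracy: {accuracy(toy_labels, toy_scores):.2f}")
```

AUC depends only on how the scores rank the examples, while accuracy also depends on the chosen threshold, so the two numbers can move independently.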

How do you get started with
transfer learning?

Pre-trained Models
http://bit.ly/2jf97NE
http://bit.ly/2zNiN8B
http://bit.ly/1KlVMf0

Summary
Use Cases
Image Classification using
Transfer Learning
How to jumpstart
