Machine Learning Specialization
Deep Learning:
Searching for Images
Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Visual product recommender
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
3	
  
I want to buy new shoes, but…
©2015 Emily Fox & Carlos Guestrin
Too many
options online…
Machine Learning Specialization
4	
  
Text search doesn’t help…
©2015 Emily Fox & Carlos Guestrin
“Dress shoes”
Machine Learning Specialization
Visual product search demo
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Features are key to
machine learning
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
7	
  
Goal: revisit classifiers, but using more
complex, non-linear features
©2015 Emily Fox & Carlos Guestrin
Sentence
from
review
Classifier
MODEL
Input: x
Output: y
Predicted class
Machine Learning Specialization
8	
  
Image classification
©2015 Emily Fox & Carlos Guestrin
Input: x
Image pixels
Output: y
Predicted object
Machine Learning Specialization
Neural networks
ê
Learning *very*
non-linear features
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
10	
  
Linear classifiers
w
0
+
w
1
x
1
+
w
2
x
2
+
…
+
w
d
x
d
=
0
Score(x) > 0 Score(x) < 0
©2015 Emily Fox & Carlos Guestrin
Score(x) = w0 + w1 x1 + w2 x2 + … + wd xd
Machine Learning Specialization
11	
  
Graph representation of classifier:
useful for defining neural networks
x1
x2
xd
y
…
1
w1
w2
> 0, output 1	
  
< 0, output 0	
  
Input Output
©2015 Emily Fox & Carlos Guestrin
Score(x) =
w0 + w1 x1 + w2 x2 + … + wd xd
Machine Learning Specialization
12	
  
What can a linear classifier represent?
x1 OR x2 x1 AND x2
x1
x2
1
y x1
x2
1
y
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
13	
  
What can’t a simple linear classifier
represent?
XOR
the counterexample
to everything
Need non-linear features
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Solving the XOR problem:
Adding a layer
XOR = x1 AND NOT x2 OR NOT x1 AND x2
z1
-0.5
1
-1
z1 z2
z2
-0.5
-1
1
x1
x2
1
y
1 -0.5
1
1
Thresholded to 0 or 1
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
15	
  
A neural network
• Layers and layers and layers of
linear models and non-linear transformations
• Around for about 50 years
- Fell in “disfavor” in 90s
• In last few years, big resurgence
- Impressive accuracy on
several benchmark problems
- Powered by huge datasets, GPUs,
& modeling/learning alg improvements
x1
x2
1
z1
z2
1
y
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Application of deep learning
to computer vision
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
17	
  
Image features
• Features = local detectors
- Combined to make prediction
- (in reality, features are more low-level)
Face!
Eye
Eye
Nose
Mouth
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
18	
  
Typical local detectors look for
locally “interesting points” in image
• Image features: collections of
locally interesting points
- Combined to build classifiers
©2015 Emily Fox & Carlos Guestrin
Face!
Machine Learning Specialization
19	
  
SIFT [Lowe ‘99]
©2015 Emily Fox & Carlos Guestrin
•Spin Images
[Johnson & Herbert ‘99]
•Textons
[Malik et al. ‘99]
•RIFT
[Lazebnik ’04]
•GLOH
[Mikolajczyk & Schmid ‘05]
•HoG
[Dalal & Triggs ‘05]
•…
Many hand created features exist
for finding interest points…
Machine Learning Specialization
20	
  
Standard image
classification approach
Input Use simple classifier
e.g., logistic regression, SVMs
Face?
©2015 Emily Fox & Carlos Guestrin
Extract features
Hand-created
features
Machine Learning Specialization
21	
  
SIFT [Lowe ‘99]
©2015 Emily Fox & Carlos Guestrin
•Spin Images
[Johnson & Herbert ‘99]
•Textons
[Malik et al. ‘99]
•RIFT
[Lazebnik ’04]
•GLOH
[Mikolajczyk & Schmid ‘05]
•HoG
[Dalal & Triggs ‘05]
•…
Many hand created features exist
for finding interest points…
Hand-created
features
… but very painful to design
Machine Learning Specialization
22	
  
Deep learning:
implicitly learns features
Layer 1 Layer 2 Layer 3 Prediction
©2015 Emily Fox & Carlos Guestrin
Example
detectors
learned
Example
interest
points
detected
[Zeiler & Fergus ‘13]
Machine Learning Specialization
Deep learning performance
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Sample results using deep neural networks
• German traffic sign
recognition benchmark
- 99.5% accuracy (IDSIA team)
• House number recognition
- 97.8% accuracy per character
[Goodfellow et al. ’13]
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
ImageNet 2012 competition:
1.2M training images, 1000 categories
©2015 Emily Fox & Carlos Guestrin
0
0.05
0.1
0.15
0.2
0.25
0.3
SuperVision ISI OXFORD_VGG
Error
(best
of
5
guesses)
Huge
gain
Exploited hand-coded features like SIFT
Top 3 teams
Machine Learning Specialization
ImageNet 2012 competition:
1.2M training images, 1000 categories
©2015 Emily Fox & Carlos Guestrin
Winning entry: SuperVision
8 layers, 60M parameters [Krizhevsky et al. ’12]
Achieving these amazing results required:
• New learning algorithms
• GPU implementation
Machine Learning Specialization
Deep learning in computer vision
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Scene parsing with deep learning
©2015 Emily Fox & Carlos Guestrin
[Farabet et al. ‘13]
Machine Learning Specialization
Retrieving similar images
©2015 Emily Fox & Carlos Guestrin
Input Image Nearest neighbors
Machine Learning Specialization
Challenges of deep learning
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Deep learning score card
Pros
• Enables learning of features
rather than hand tuning
• Impressive performance gains
- Computer vision
- Speech recognition
- Some text analysis
• Potential for more impact
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Deep learning workflow
Lots of
labeled
data
Training
set
Validation
set
Learn
deep
neural net
Validate
Adjust
parameters,
network
architecture,…
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
33	
  
Many tricks needed to work well…
Different types of layers, connections,…
needed for high accuracy
[Krizhevsky et al. ’12]
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Deep learning score card
Pros
• Enables learning of features
rather than hand tuning
• Impressive performance gains
- Computer vision
- Speech recognition
- Some text analysis
• Potential for more impact
Cons
• Requires a lot of data for
high accuracy
• Computationally
really expensive
• Extremely hard to tune
- Choice of architecture
- Parameter types
- Hyperparameters
- Learning algorithm
- …
©2015 Emily Fox & Carlos Guestrin
Computational cost+ so many choices
=
incredibly hard to tune
Machine Learning Specialization
Deep features:
Deep learning
+
Transfer learning
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
36	
  
Standard image
classification approach
Input Use simple classifier
e.g., logistic regression, SVMs
Face?
©2015 Emily Fox & Carlos Guestrin
Extract features
Hand-created
features
Can we learn
features from
data, even when
we don’t have
data or time?
Machine Learning Specialization
37	
  
Transfer learning: Use data from
one task to help learn on another
Lots of data:
Learn
neural net
Great
accuracy on
cat v. dog
Some data: Neural net as
feature extractor
+
Simple classifier
Great
accuracy on
101 categories
Old idea, explored for deep learning by Donahue et al. ’14 & others
©2015 Emily Fox & Carlos Guestrin
vs.
Machine Learning Specialization
38	
  
What’s learned in a neural net
Very specific
to Task 1
Should be ignored
for other tasks
More generic
Can be used as feature extractor
©2015 Emily Fox & Carlos Guestrin
vs.
Neural net trained for Task 1: cat vs. dog
Machine Learning Specialization
39	
  
Transfer learning in more detail…
Very specific
to Task 1
Should be ignored
for other tasks
More generic
Can be used as feature extractor
©2015 Emily Fox & Carlos Guestrin
For Task 2, predicting 101 categories,
learn only end part of neural net
Use simple classifier
e.g., logistic regression,
SVMs, nearest neighbor,…
Class?
Keep weights fixed!
Neural net trained for Task 1: cat vs. dog
Machine Learning Specialization
40	
  
Careful where you cut:
latter layers may be too task specific
Layer 1 Layer 2 Layer 3 Prediction
©2015 Emily Fox & Carlos Guestrin
Example
detectors
learned
Example
interest
points
detected
[Zeiler & Fergus ‘13]
Too specific
for new task
Use these!
Machine Learning Specialization
Transfer learning with deep features workflow
Some
labeled
data
©2015 Emily Fox & Carlos Guestrin
Extract
features
with
neural net
trained on
different
task
Learn
simple
classifier
Validate
Training
set
Validation
set
Machine Learning Specialization
How general are deep features?
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
Summary of deep learning
©2015 Emily Fox & Carlos Guestrin
Machine Learning Specialization
44	
  
What you can do now…
• Describe multi-layer neural network models
• Interpret the role of features as local detectors
in computer vision
• Relate neural networks to hand-crafted image
features
• Describe some settings where deep learning
achieves significant performance boosts
• State the pros & cons of deep learning model
• Apply the notion of transfer learning
• Use neural network models trained in one
domain as features for building a model in
another domain
• Build an image retrieval tool using deep features
©2015 Emily Fox & Carlos Guestrin

deeplearning-annotated.pdf

  • 1.
    Machine Learning Specialization DeepLearning: Searching for Images Emily Fox & Carlos Guestrin Machine Learning Specialization University of Washington ©2015 Emily Fox & Carlos Guestrin
  • 2.
    Machine Learning Specialization Visualproduct recommender ©2015 Emily Fox & Carlos Guestrin
  • 3.
    Machine Learning Specialization 3   I want to buy new shoes, but… ©2015 Emily Fox & Carlos Guestrin Too many options online…
  • 4.
    Machine Learning Specialization 4   Text search doesn’t help… ©2015 Emily Fox & Carlos Guestrin “Dress shoes”
  • 5.
    Machine Learning Specialization Visualproduct search demo ©2015 Emily Fox & Carlos Guestrin
  • 6.
    Machine Learning Specialization Featuresare key to machine learning ©2015 Emily Fox & Carlos Guestrin
  • 7.
    Machine Learning Specialization 7   Goal: revisit classifiers, but using more complex, non-linear features ©2015 Emily Fox & Carlos Guestrin Sentence from review Classifier MODEL Input: x Output: y Predicted class
  • 8.
    Machine Learning Specialization 8   Image classification ©2015 Emily Fox & Carlos Guestrin Input: x Image pixels Output: y Predicted object
  • 9.
    Machine Learning Specialization Neuralnetworks ê Learning *very* non-linear features ©2015 Emily Fox & Carlos Guestrin
  • 10.
    Machine Learning Specialization 10   Linear classifiers w 0 + w 1 x 1 + w 2 x 2 + … + w d x d = 0 Score(x) > 0 Score(x) < 0 ©2015 Emily Fox & Carlos Guestrin Score(x) = w0 + w1 x1 + w2 x2 + … + wd xd
  • 11.
    Machine Learning Specialization 11   Graph representation of classifier: useful for defining neural networks x1 x2 xd y … 1 w1 w2 > 0, output 1   < 0, output 0   Input Output ©2015 Emily Fox & Carlos Guestrin Score(x) = w0 + w1 x1 + w2 x2 + … + wd xd
  • 12.
    Machine Learning Specialization 12   What can a linear classifier represent? x1 OR x2 x1 AND x2 x1 x2 1 y x1 x2 1 y ©2015 Emily Fox & Carlos Guestrin
  • 13.
    Machine Learning Specialization 13   What can’t a simple linear classifier represent? XOR the counterexample to everything Need non-linear features ©2015 Emily Fox & Carlos Guestrin
  • 14.
    Machine Learning Specialization Solvingthe XOR problem: Adding a layer XOR = x1 AND NOT x2 OR NOT x1 AND x2 z1 -0.5 1 -1 z1 z2 z2 -0.5 -1 1 x1 x2 1 y 1 -0.5 1 1 Thresholded to 0 or 1 ©2015 Emily Fox & Carlos Guestrin
  • 15.
    Machine Learning Specialization 15   A neural network • Layers and layers and layers of linear models and non-linear transformations • Around for about 50 years - Fell in “disfavor” in 90s • In last few years, big resurgence - Impressive accuracy on several benchmark problems - Powered by huge datasets, GPUs, & modeling/learning alg improvements x1 x2 1 z1 z2 1 y ©2015 Emily Fox & Carlos Guestrin
  • 16.
    Machine Learning Specialization Applicationof deep learning to computer vision ©2015 Emily Fox & Carlos Guestrin
  • 17.
    Machine Learning Specialization 17   Image features • Features = local detectors - Combined to make prediction - (in reality, features are more low-level) Face! Eye Eye Nose Mouth ©2015 Emily Fox & Carlos Guestrin
  • 18.
    Machine Learning Specialization 18   Typical local detectors look for locally “interesting points” in image • Image features: collections of locally interesting points - Combined to build classifiers ©2015 Emily Fox & Carlos Guestrin Face!
  • 19.
    Machine Learning Specialization 19   SIFT [Lowe ‘99] ©2015 Emily Fox & Carlos Guestrin •Spin Images [Johnson & Herbert ‘99] •Textons [Malik et al. ‘99] •RIFT [Lazebnik ’04] •GLOH [Mikolajczyk & Schmid ‘05] •HoG [Dalal & Triggs ‘05] •… Many hand created features exist for finding interest points…
  • 20.
    Machine Learning Specialization 20   Standard image classification approach Input Use simple classifier e.g., logistic regression, SVMs Face? ©2015 Emily Fox & Carlos Guestrin Extract features Hand-created features
  • 21.
    Machine Learning Specialization 21   SIFT [Lowe ‘99] ©2015 Emily Fox & Carlos Guestrin •Spin Images [Johnson & Herbert ‘99] •Textons [Malik et al. ‘99] •RIFT [Lazebnik ’04] •GLOH [Mikolajczyk & Schmid ‘05] •HoG [Dalal & Triggs ‘05] •… Many hand created features exist for finding interest points… Hand-created features … but very painful to design
  • 22.
    Machine Learning Specialization 22   Deep learning: implicitly learns features Layer 1 Layer 2 Layer 3 Prediction ©2015 Emily Fox & Carlos Guestrin Example detectors learned Example interest points detected [Zeiler & Fergus ‘13]
  • 23.
    Machine Learning Specialization Deeplearning performance ©2015 Emily Fox & Carlos Guestrin
  • 24.
    Machine Learning Specialization Sampleresults using deep neural networks • German traffic sign recognition benchmark - 99.5% accuracy (IDSIA team) • House number recognition - 97.8% accuracy per character [Goodfellow et al. ’13] ©2015 Emily Fox & Carlos Guestrin
  • 25.
    Machine Learning Specialization ImageNet2012 competition: 1.2M training images, 1000 categories ©2015 Emily Fox & Carlos Guestrin 0 0.05 0.1 0.15 0.2 0.25 0.3 SuperVision ISI OXFORD_VGG Error (best of 5 guesses) Huge gain Exploited hand-coded features like SIFT Top 3 teams
  • 26.
    Machine Learning Specialization ImageNet2012 competition: 1.2M training images, 1000 categories ©2015 Emily Fox & Carlos Guestrin Winning entry: SuperVision 8 layers, 60M parameters [Krizhevsky et al. ’12] Achieving these amazing results required: • New learning algorithms • GPU implementation
  • 27.
    Machine Learning Specialization Deeplearning in computer vision ©2015 Emily Fox & Carlos Guestrin
  • 28.
    Machine Learning Specialization Sceneparsing with deep learning ©2015 Emily Fox & Carlos Guestrin [Farabet et al. ‘13]
  • 29.
    Machine Learning Specialization Retrievingsimilar images ©2015 Emily Fox & Carlos Guestrin Input Image Nearest neighbors
  • 30.
    Machine Learning Specialization Challengesof deep learning ©2015 Emily Fox & Carlos Guestrin
  • 31.
    Machine Learning Specialization Deeplearning score card Pros • Enables learning of features rather than hand tuning • Impressive performance gains - Computer vision - Speech recognition - Some text analysis • Potential for more impact ©2015 Emily Fox & Carlos Guestrin
  • 32.
    Machine Learning Specialization Deeplearning workflow Lots of labeled data Training set Validation set Learn deep neural net Validate Adjust parameters, network architecture,… ©2015 Emily Fox & Carlos Guestrin
  • 33.
    Machine Learning Specialization 33   Many tricks needed to work well… Different types of layers, connections,… needed for high accuracy [Krizhevsky et al. ’12] ©2015 Emily Fox & Carlos Guestrin
  • 34.
    Machine Learning Specialization Deeplearning score card Pros • Enables learning of features rather than hand tuning • Impressive performance gains - Computer vision - Speech recognition - Some text analysis • Potential for more impact Cons • Requires a lot of data for high accuracy • Computationally really expensive • Extremely hard to tune - Choice of architecture - Parameter types - Hyperparameters - Learning algorithm - … ©2015 Emily Fox & Carlos Guestrin Computational cost+ so many choices = incredibly hard to tune
  • 35.
    Machine Learning Specialization Deepfeatures: Deep learning + Transfer learning ©2015 Emily Fox & Carlos Guestrin
  • 36.
    Machine Learning Specialization 36   Standard image classification approach Input Use simple classifier e.g., logistic regression, SVMs Face? ©2015 Emily Fox & Carlos Guestrin Extract features Hand-created features Can we learn features from data, even when we don’t have data or time?
  • 37.
    Machine Learning Specialization 37   Transfer learning: Use data from one task to help learn on another Lots of data: Learn neural net Great accuracy on cat v. dog Some data: Neural net as feature extractor + Simple classifier Great accuracy on 101 categories Old idea, explored for deep learning by Donahue et al. ’14 & others ©2015 Emily Fox & Carlos Guestrin vs.
  • 38.
    Machine Learning Specialization 38   What’s learned in a neural net Very specific to Task 1 Should be ignored for other tasks More generic Can be used as feature extractor ©2015 Emily Fox & Carlos Guestrin vs. Neural net trained for Task 1: cat vs. dog
  • 39.
    Machine Learning Specialization 39   Transfer learning in more detail… Very specific to Task 1 Should be ignored for other tasks More generic Can be used as feature extractor ©2015 Emily Fox & Carlos Guestrin For Task 2, predicting 101 categories, learn only end part of neural net Use simple classifier e.g., logistic regression, SVMs, nearest neighbor,… Class? Keep weights fixed! Neural net trained for Task 1: cat vs. dog
  • 40.
    Machine Learning Specialization 40   Careful where you cut: latter layers may be too task specific Layer 1 Layer 2 Layer 3 Prediction ©2015 Emily Fox & Carlos Guestrin Example detectors learned Example interest points detected [Zeiler & Fergus ‘13] Too specific for new task Use these!
  • 41.
    Machine Learning Specialization Transferlearning with deep features workflow Some labeled data ©2015 Emily Fox & Carlos Guestrin Extract features with neural net trained on different task Learn simple classifier Validate Training set Validation set
  • 42.
    Machine Learning Specialization Howgeneral are deep features? ©2015 Emily Fox & Carlos Guestrin
  • 43.
    Machine Learning Specialization Summaryof deep learning ©2015 Emily Fox & Carlos Guestrin
  • 44.
    Machine Learning Specialization 44   What you can do now… • Describe multi-layer neural network models • Interpret the role of features as local detectors in computer vision • Relate neural networks to hand-crafted image features • Describe some settings where deep learning achieves significant performance boosts • State the pros & cons of deep learning model • Apply the notion of transfer learning • Use neural network models trained in one domain as features for building a model in another domain • Build an image retrieval tool using deep features ©2015 Emily Fox & Carlos Guestrin