Intro to Deep Learning for Computer Vision
Slides for my presentation at the Seminar in Computer Graphics at Vienna University of Technology, 2016
Applications of Deep Learning in Computer Vision
Christoph Körner
Outline
1) Introduction to Neural Networks
2) Deep Learning
3) Applications in Computer Vision
4) Conclusion
Why Deep Learning?
● Wins every major computer vision challenge (classification, segmentation, etc.)
● Can be applied in various domains (speech recognition, game prediction, computer vision, etc.)
● Beats human accuracy on several benchmarks
● Big communities and resources
● Dedicated hardware for deep learning
Perceptron (1958)
● Weighted sum of inputs
● Threshold operator
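The model on this slide fits in a few lines. A minimal NumPy sketch; the AND weights and threshold below are hand-picked for illustration, not taken from the slide:

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs followed by a hard threshold."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Example: weights chosen by hand so the perceptron computes logical AND.
w = np.array([1.0, 1.0])
b = -1.5
print([perceptron(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# → [0, 0, 0, 1]
```

A single perceptron can only draw one linear decision boundary, which is why it cannot solve XOR on its own.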
Artificial Neural Network (1960)
● Universal function approximator
● Can solve the XOR problem
Backpropagation (1982)
● Propagates the error backwards through the network
● Allows gradient-based optimization (SGD, etc.)
● Enables training of multi-layer networks
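Error propagation plus SGD can be shown on the smallest possible case: a single linear neuron fit to made-up data (the target function, learning rate, and pass count are illustrative assumptions, not from the slide):

```python
import numpy as np

rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, 100)
ys = 2.0 * xs + 1.0              # toy target: y = 2x + 1

w, b, lr = 0.0, 0.0, 0.1
for _ in range(200):             # 200 passes over the data
    for x, y in zip(xs, ys):
        err = (w * x + b) - y    # forward pass, then the error to propagate
        w -= lr * err * x        # gradient of 0.5 * err**2 w.r.t. w
        b -= lr * err            # gradient of 0.5 * err**2 w.r.t. b

print(round(w, 2), round(b, 2))  # → 2.0 1.0
```

With multiple layers, the same chain-rule logic propagates `err` backwards through every layer, which is the step that made deep networks trainable.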
Convolution and Pooling (1989)
● Fewer parameters than fully-connected hidden layers
● More efficient training
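A rough NumPy sketch of both operations; the input and the edge kernel are made up. The parameter-saving point is visible directly: the 1x2 kernel has 2 weights no matter how large the image is, whereas a fully-connected layer would need one weight per input pixel per unit:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (no kernel flipping, i.e. cross-correlation)."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling: keep the maximum of each size x size block."""
    h, w = img.shape[0] // size, img.shape[1] // size
    return img[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
edge = np.array([[1.0, -1.0]])     # tiny horizontal edge filter
print(conv2d(img, edge).shape)     # → (4, 3)
print(max_pool(img))               # 2x2 block maxima: 5, 7, 13, 15
```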
Handwritten ZIP Codes (1989)
● 30 training passes
● Achieved 92% accuracy
What happened until 2011?
● Better initialization
● Better non-linearities: ReLU
● 1000 times more training data
● More computing power
● A factor-of-a-million speedup in training time through parallelization on GPUs
Deep Learning
● Conv-, pool-, and fully-connected layers
● ReLU activations
● Deeply nested models with many parameters
● New layer types and structures
● New techniques to reduce overfitting
● Loads of training data and compute power
  ● 10,000,000 images
  ● Weeks of training on multi-GPU machines
AlexNet (2012)
● 62,378,344 parameters (250MB)
● 24 layers
VGGNet (2014)
● 102,908,520 parameters (412MB)
● 23 layers
GoogLeNet (2014)
● 6,998,552 parameters (28MB)
● 143 layers
Inception Module
● Heavy use of 1x1 convolutions (applied along the depth dimension)
● Very efficient
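A 1x1 convolution is just a matrix multiply applied independently at every spatial position, projecting the depth dimension. A NumPy sketch with arbitrarily chosen shapes, reducing 64 channels to 16:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 64))  # H x W x C_in feature map
w = rng.standard_normal((64, 16))      # 1x1 conv weights: only 64 * 16 of them

# Matmul broadcasts over the spatial dims: each (h, w) position is
# projected from 64 channels down to 16, independently of its neighbours.
y = x @ w
print(y.shape)   # → (32, 32, 16)
```

Reducing the depth this way before the expensive 3x3 and 5x5 convolutions is what lets GoogLeNet get by with roughly 7M parameters.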
ResNet (2015)
● Residual learning
● 152 layers
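Residual learning can be sketched as y = relu(F(x) + x): the layers learn only the residual F, with an identity shortcut around them. The two-layer F and the shapes below are illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), with F a small two-layer transformation."""
    f = relu(x @ w1) @ w2    # the residual F(x)
    return relu(f + x)       # identity shortcut added before the activation

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1, w2 = rng.standard_normal((8, 8)), np.zeros((8, 8))

# With w2 all zeros, F(x) == 0 and the block collapses to relu(x):
# the identity mapping is trivial to represent, which is what makes
# very deep (152-layer) stacks trainable.
print(np.allclose(residual_block(x, w1, w2), relu(x)))   # → True
```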
Applications in Computer Vision
Classification
● One class per image
● Softmax layer at the end
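The softmax layer turns the network's raw class scores into a probability distribution; the predicted class is the one with the highest probability. A minimal sketch with made-up logits for 3 classes:

```python
import numpy as np

def softmax(z):
    """Map raw scores to a probability distribution over classes."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # illustrative logits
probs = softmax(scores)
print(probs.argmax())          # → 0  (the highest-scoring class wins)
print(round(probs.sum(), 6))   # → 1.0
```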
Localization
● Bounding-box regression
● Sigmoid layer with 4 outputs at the end
● Via classification
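The sigmoid squashes each of the 4 raw outputs into [0, 1], so they can be read as a box in coordinates normalized by the image size. A sketch with made-up logits; the (x, y, w, h) reading is one common convention, assumed here:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([0.0, -1.2, 2.0, 0.5])   # illustrative raw network outputs
x, y, w, h = sigmoid(logits)               # normalized box coordinates
print(round(float(x), 2))   # → 0.5  (sigmoid(0) is exactly 0.5)
```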
Detection
● Multiple objects, multiple classes
● Solved using multiple networks
Segmentation
More Applications
● Compression
  ● Auto-encoders, self-organizing maps
● Image captioning
  ● Solved with recurrent architectures
● Image stylization
● Clustering
● Many more...
Conclusion
● Powerful: learns features from data instead of relying on hand-crafted feature extraction
● Better than humans on some benchmarks
● Deeper is always better?
  ● Overfitting
● More data is always better?
  ● Data quality
  ● Ground truth
Thank you!
Christoph Körner