
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)

https://telecombcn-dl.github.io/2017-dlcv/

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.


  1. [course site] Elisa Sayrol Clols, Elisa.sayrol@upc.edu, Associate Professor, Universitat Politècnica de Catalunya (Technical University of Catalonia). Convolutional Neural Networks, Day 1 Lecture 3. #DLUPC
  2. Deep Neural Network. The i-th layer is defined by a matrix W_i and a vector b_i, and its activation is simply a dot product plus the bias: h_i = W_i h_{i-1} + b_i. Number of parameters to learn at the i-th layer: dim(W_i) + dim(b_i). [Figure: a fully-connected network with inputs x1, x2, x3, x4, outputs y1, y2, and Layers 0-3.]
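A minimal sketch of one such layer in NumPy; the sizes match the figure (four inputs, two outputs), and all names are illustrative, not from the course materials:

    import numpy as np

    # One fully-connected layer: h_i = W_i @ h_prev + b_i
    n_in, n_out = 4, 2                # inputs x1..x4, outputs y1, y2
    W = np.random.randn(n_out, n_in)  # weight matrix W_i
    b = np.zeros(n_out)               # bias vector b_i

    def dense(h_prev):
        return W @ h_prev + b         # dot product plus bias

    # Parameters to learn at this layer: n_out*n_in weights + n_out biases
    print(W.size + b.size)            # 4*2 + 2 = 10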
  3. From Neurons to Convolutional Neural Networks. What if the input is an image?
  4. From Neurons to Convolutional Neural Networks. For a 200x200 image, a fully-connected layer with one neuron per pixel has 4x10^4 neurons, each with 4x10^4 inputs; that is 16x10^8 parameters for a single layer (not counting the bias)! Figure credit: Ranzato
  5. From Neurons to Convolutional Neural Networks. For the same 200x200 image, with 4x10^4 neurons each connected to a 10x10 "local connection" (also called a receptive field), that drops to 4x10^6 parameters. What else can we do to reduce the number of parameters? Figure credit: Ranzato
  6. From Neurons to Convolutional Neural Networks. Translation invariance: we can use the same parameters to capture a specific "feature" in any area of the image, and try different sets of parameters to capture different features. These operations are equivalent to performing convolutions with different filters. Ex: with 100 different filters (or feature extractors) of size 10x10, the number of parameters is 10^4. Figure credit: Ranzato
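To see parameter sharing in numbers, a quick check with PyTorch (the library choice is mine, not the slide's; a single-channel input is assumed):

    import torch.nn as nn

    # 100 shared filters of size 10x10 over a 1-channel image
    conv = nn.Conv2d(in_channels=1, out_channels=100, kernel_size=10, bias=False)

    # 100 * (10*10) = 10^4 weights, independent of the image size --
    # versus 16x10^8 for the fully-connected layer of slide 4
    print(sum(p.numel() for p in conv.parameters()))  # 10000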
  7. From Neurons to Convolutional Neural Networks: ● sparse interactions ● parameter sharing ● equivariant representations
  8. From Neurons to Convolutional Neural Networks. ... and don't forget the activation function, e.g. ReLU or PReLU! Figure credit: Ranzato
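A small sketch of the two activations named on the slide, assuming PyTorch (in PReLU, the negative slope a is a learned parameter):

    import torch
    import torch.nn as nn

    x = torch.linspace(-2, 2, 5)  # [-2, -1, 0, 1, 2]
    relu = nn.ReLU()              # max(0, x)
    prelu = nn.PReLU()            # max(0, x) + a * min(0, x), a learned (init 0.25)
    print(relu(x))                # tensor([0., 0., 0., 1., 2.])
    print(prelu(x))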
  9. From Neurons to Convolutional Neural Networks. Most ConvNets use pooling (or subsampling) to reduce dimensionality and to provide invariance to small local changes. Pooling options: • Max • Average • Stochastic pooling. Figure credit: Ranzato
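A pooling sketch, again assuming PyTorch: 2x2 max pooling halves each spatial dimension.

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 4, 4)         # (batch, channels, height, width)
    pool = nn.MaxPool2d(kernel_size=2)  # use nn.AvgPool2d for average pooling
    print(pool(x).shape)                # torch.Size([1, 1, 2, 2])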
  10. From Neurons to Convolutional Neural Networks. Padding (P): when computing the convolution at the borders, you may add values around the input so the filter fits. When the added values are zero, which is quite common, the technique is called zero-padding. When padding is not used, the output is smaller than the input. Filter size FxF = 3x3.
  11. From Neurons to Convolutional Neural Networks. Padding (P), continued: the same idea with a larger filter, FxF = 5x5; without padding, the larger the filter, the more the output shrinks.
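A sketch of the effect of zero-padding on the output size, assuming PyTorch; choosing P = (F-1)/2 preserves the input size:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 8, 8)
    same = nn.Conv2d(1, 1, kernel_size=5, padding=2)   # P = (5-1)/2 = 2
    valid = nn.Conv2d(1, 1, kernel_size=5, padding=0)  # no padding
    print(same(x).shape)   # torch.Size([1, 1, 8, 8]) -- size preserved
    print(valid(x).shape)  # torch.Size([1, 1, 4, 4]) -- shrinks by F-1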
  12. From Neurons to Convolutional Neural Networks. Stride (S): when doing the convolution, or another operation like pooling, we may slide the filter not pixel by pixel but every 2 or more pixels. The stride is the number of pixels we step between applications. It can be used to reduce the dimensionality of the output.
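Putting F, P and S together: for an N-wide input, the output width of a convolution is (N - F + 2P)/S + 1 (the standard formula, found e.g. in the CS231n notes referenced below). A quick check of it, assuming PyTorch:

    import torch
    import torch.nn as nn

    # Output size: (N - F + 2P) // S + 1
    N, F, P, S = 8, 3, 1, 2
    print((N - F + 2 * P) // S + 1)             # 4

    conv = nn.Conv2d(1, 1, kernel_size=F, padding=P, stride=S)
    print(conv(torch.randn(1, 1, N, N)).shape)  # torch.Size([1, 1, 4, 4])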
  13. References:
  • Kunihiko Fukushima, "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position," NHK Broadcasting Science Research Laboratories, Kinuta, Setagaya, Tokyo, Japan. Biological Cybernetics, 36(4), 1980, 193-202.
  • Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press. http://www.deeplearningbook.org/contents/convnets.html
  • Stanford course on Convolutional Neural Networks for Visual Recognition (2017): http://cs231n.github.io/convolutional-networks/ (2016 lectures: https://youtu.be/LxfUGhug-iQ)
  14. Questions?
