Visual perception through Deep Learning



Renyi Hour talk for Barcelona GSE Data Science students by Dario Garcia-Gasulla, Barcelona Supercomputing Center

Published in: Data & Analytics


  1. Visual perception through Deep Learning. Dario Garcia-Gasulla (dario.garcia@bsc.es), Barcelona Supercomputing Center (BSC), June 1, 2016
  2. Outline: Deep Learning basics, Convolutional Neural Networks, CNN Applications, Mining CNNs, Summary. Section: The basics
  3. The Artificial Neuron and the Artificial Neural Network. Definition (Maureen Caudill): "...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs." McCulloch & Pitts, 1943; Rosenblatt, 1958.
  4. Training Neural Networks. Backpropagation algorithm: measure the error on the output (loss function); optimize the weights to reduce the loss (gradient descent); backpropagate the loss, layer by layer, until all neuron weights have been updated; repeat!
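The training loop on this slide can be sketched for the simplest possible case: a single linear neuron, where backpropagation collapses to one application of the chain rule. The toy data (y = 2x + 1), the learning rate, and the step count are assumptions chosen for illustration, not values from the talk.

```python
# Toy gradient descent for one linear neuron: prediction = w * x + b.
# Data, learning rate, and step count are illustrative assumptions.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]  # target function the neuron should learn


def mse_loss(w, b):
    """Mean squared error of the neuron on the toy data (the loss function)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Derivatives of the loss with respect to w and b, via the chain rule."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def train(steps=2000, learning_rate=0.05):
    """Repeat: compute the gradient of the loss, then move the weights against it."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = gradients(w, b)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
print(f"w={w:.3f}, b={b:.3f}, loss={mse_loss(w, b):.6f}")
```

In a multi-layer network the same update is applied to every layer, with each layer's gradient computed from the gradient of the layer above it.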
  5. (Old) Neural Networks. Traditionally used as classifiers for simple problems; capable of finding non-linearities in the data. Limitations: large networks are increasingly expensive to train (millions of weights); they need large amounts of data to find complex non-linearities; training easily stalls in local optima.
  6. Deep Neural Networks (aka Deep Learning). More layers! Made possible by: hardware advances (GPUs); more efficient types of neurons; training optimizations.
  7. Deep Learning families, based on neuron, layer and training particularities. Convolutional Neural Networks (CNNs): capture 2D features; appropriate for visual data. Recurrent Neural Networks (RNNs): capture streams of data; may include memory components (LSTM); appropriate for text, sound, etc. Deep Belief Networks: probabilistic model. ...and many others.
  8. Convolutional Neural Networks
  9. The explosion of Deep Learning. The ImageNet Challenge: a visual recognition competition to recognize 1,000 different objects. In 2012, Alex Krizhevsky et al. trained a CNN with 5 convolutional layers and improved the best result by 11%. In 2014, all candidates were based on CNNs. In 2015, human-level performance was achieved.
  10. CNNs: The Origin. Design: Fukushima, 1980 (neocognitron); LeCun, 1998, 2003. Based on the visual cortex of animals: each neuron perceives a small portion of the input and exploits the spatial correlation; neuron weights are reused to reduce complexity. What was missing: a feasible implementation (GPUs).
  11. CNN Layers: Convolution. Convolutional layers: each neuron takes as input a small patch of data (its receptive field, e.g., 3x3). A neuron's parameters are convolved over the whole input. This provides translation invariance.
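A minimal sketch of what a convolutional layer computes for one filter, written in plain Python so it runs without dependencies. The 4x4 image and the 2x2 vertical-edge kernel are made-up examples; real layers also add a bias, a non-linearity, and many filters per layer.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    Like most deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before sliding.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # The same kernel weights are reused at every position (weight sharing).
            response = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_rows)
                for dj in range(k_cols)
            )
            row.append(response)
        output.append(row)
    return output


# Toy input: a bright 2x2 square on a dark background.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Toy filter that responds to dark-to-bright transitions from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, edge_kernel)
print(feature_map)
```

Because the same weights are applied at every position, the filter responds to the edge wherever the square sits in the image; only the location of the response in the output map moves.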
  12. CNN Layers: Pooling. Pooling layers: a down-sampling technique that reduces complexity at the price of precision. The values within the pooling filter (e.g., 2x2) are reduced to their maximum or average (max pooling, average pooling). The exact location is not as important as the relative location.
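Pooling as described here can be sketched in a few lines. The function name `pool2d`, its parameters, and the 4x4 input are illustrative assumptions; windows are non-overlapping (stride equal to the window size), which is a common but not the only choice.

```python
def pool2d(feature_map, size=2, reducer=max):
    """Down-sample by applying `reducer` to each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


def average(values):
    """Arithmetic mean, used as the reducer for average pooling."""
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
max_pooled = pool2d(feature_map, size=2, reducer=max)
avg_pooled = pool2d(feature_map, size=2, reducer=average)
print(max_pooled, avg_pooled)
```

Each 2x2 window collapses to a single value, so the output has a quarter of the input's entries and no longer records where inside the window the strongest response occurred.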
  13. CNN Layers: Fully-connected. Fully-connected layers: the standard NN layer. Each neuron takes as input all neurons from the previous layer. Spatial information is no longer taken into account. The output is an estimate of the prediction (class probability).
  14. CNN Architecture. Standard architecture: stack convolution and pooling layers. To estimate probabilities, use fully-connected layers at the end. The output feeds a classifier (softmax, SVM).
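The final stage of this architecture (flatten, fully-connected layer, softmax) can be sketched as follows. The pooled input, the weights, and the two-class setup are invented for illustration; in a trained network the weights come from backpropagation.

```python
import math


def dense(inputs, weights, biases):
    """Fully-connected layer: every output neuron sees every input value."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical output of the last pooling layer, flattened into a 1D vector.
pooled = [[4, 2], [2, 8]]
flat = [value for row in pooled for value in row]

# Hypothetical weights for two output classes (one row per class).
weights = [
    [0.1, 0.0, 0.0, 0.2],
    [0.0, 0.3, 0.3, 0.0],
]
biases = [0.0, 0.0]

probabilities = softmax(dense(flat, weights, biases))
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```

Flattening is the point where spatial layout is discarded: the dense layer treats the four values as an unordered feature list.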
  15. CNNs in Action. During training: a CNN trained to recognize objects learns different representations at each depth: (1) lines and angles, (2) composed shapes, (3) parts of entities, (4) full entities. During deployment: the CNN looks for increasingly complex patterns in the image. Finally, a class prediction is made by considering the most complex patterns (top layer).
  16. CNNs: Practical Notes. Requirements: a large set of labeled data for training; computational power for training (GPUs); deployment is cheap. Where to start? Almost out-of-the-box CNNs: Caffe, Torch, Theano, TensorFlow. Pre-trained models are available for download.
  17. Applications of Convolutional Neural Networks
  18. Object Recognition (A. Krizhevsky et al., 2012)
  19. Image Segmentation (LC Chen et al., 2014)
  20. Style Transfer (Gatys et al., 2015)
  21. Colorization (Zhang et al., 2016)
  22. Colorization II (Iizuka et al., 2016)
  23. Other applications. Mobile apps: Aipoly, AI Scry, BlindTool (textual description of images); Artify (artistic style); Nippler, AwesomeCNN, WhatPlant (object detection). AI challenges: playing videogames, Go, ... Self-driving cars. Image retrieval (Google, Facebook, Instagram, etc.).
  24. Mining CNN learnt representations: A case of research
  25. Beyond CNNs. CNNs learn lots of relevant representations (millions) from a training set, and characterize input data based on the learnt representations. Our hypothesis: a CNN is a kind of feature extractor. What mining/learning can be performed with those features?
  26. Step 1: Vector Embeddings. From images to vectors: for a given image, record which neurons activate for it and their activation strength. Use a subset of those neurons to define a fixed vector length. Produce a vector for each image, assuming each variable is independent. The vector represents everything the CNN perceives in the image.
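One way to picture Step 1, assuming the per-layer activations of an image have already been read out of the network. The layer names, the activation values, and the helper `image_to_vector` are hypothetical; the talk does not specify how neurons were selected or how activations were summarized.

```python
def image_to_vector(activations, selected_layers):
    """Concatenate the activations of the selected layers into one flat vector.

    Using the same layers in the same order for every image guarantees
    that all image vectors have the same length.
    """
    vector = []
    for layer in selected_layers:
        vector.extend(activations[layer])
    return vector


# Hypothetical activations recorded for one image, keyed by layer name.
activations = {
    "conv1": [0.0, 1.5, 0.2],
    "conv2": [0.7, 0.0],
    "fc": [0.9],
}
selected_layers = ["conv1", "fc"]  # the subset of neurons that defines the embedding

vector = image_to_vector(activations, selected_layers)
print(vector)
```

Each position in the resulting vector is treated as an independent variable, one per selected neuron.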
  27. Step 2: Abstract Representations. Image class vectors: images are imperfect representations of entities (changes in perspective, illumination, specimen, etc.). To build stable class representations we need to aggregate the evidence provided by many images of the same entity. Result: one vector with millions of numerical values for each abstract class.
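A minimal sketch of Step 2, assuming the aggregation is an element-wise mean over the image vectors of a class. The slide only says the evidence is aggregated, so the choice of the mean, the helper name `class_vector`, and the toy vectors are assumptions.

```python
def class_vector(image_vectors):
    """Aggregate same-length image vectors into one class vector (element-wise mean)."""
    count = len(image_vectors)
    return [sum(values) / count for values in zip(*image_vectors)]


# Two hypothetical image vectors for the same class.
dog_images = [
    [1.0, 0.0, 3.0],
    [3.0, 2.0, 3.0],
]
dog_vector = class_vector(dog_images)
print(dog_vector)
```

Averaging damps features that appear in only a few images (a particular background or pose) and keeps those that are consistently active for the class.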
  28. Step 3: Exploit vectors. Vector operations: compute distances to perform clustering (unsupervised learning); visualize class vectors (see what the CNN sees); vector arithmetic (visual reasoning).
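The distance computation that underlies the clustering can be sketched with cosine distance. The metric is an assumption (the slide does not name one), and the three-dimensional class vectors are toy values, not data from the talk.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for vectors pointing the same way, 1 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_class(name, class_vectors):
    """Return the other class whose vector is closest to that of `name`."""
    return min(
        (other for other in class_vectors if other != name),
        key=lambda other: cosine_distance(class_vectors[name], class_vectors[other]),
    )


# Toy class vectors, invented for illustration.
class_vectors = {
    "dog": [1.0, 0.9, 0.1],
    "wolf": [0.9, 1.0, 0.2],
    "car": [0.1, 0.0, 1.0],
}
print(nearest_class("dog", class_vectors))
```

A clustering algorithm would start from the full matrix of such pairwise distances between class vectors.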
  29. Mining process: Step by Step. (1) Build a million-pattern description for a set of images (image to vector). (2) Aggregate images by class (image class to vector). (3) Compute distances, clusters, arithmetic (image class clustering and equations).
  30. Actual Data. The model: GoogLeNet architecture pretrained to recognize 1,000 classes using 1.5M images (80MB). Extract 1.2M features from the CNN; one vector < 3MB. The data: process 50,000 images (ImageNet test set); aggregate 50 images per class into 1,000 class vectors.
  31. Clustering (I). 114 Dogs (black), 44 Wheeled vehicles (grey), ? Similar things are close. Implicit high-level knowledge.
  32. Clustering (II). Which semantics does the vector space actually capture? Find n clusters. For each cluster, find its most representative WordNet label.
  33. Class Visualization (I). Vector to image: generate images from class vectors; see a concept as the CNN perceives it. Based on Gatys et al., 2015.
  34. Class Visualization (II)
  35. Vector Arithmetics (I). Church - Mosque = Bellcote. Horse cart - Horse = Rickshaw.
  36. Vector Arithmetics (III). Panda bear - Brown bear = Skunk, Football, Indri, Angora rabbit. What do these four image classes have in common?
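The equations on these slides can be read as "subtract one class vector from another, then look for the class closest to the result". A minimal sketch under that reading: the three toy features (wheels, animal, building), the vector values, the Euclidean metric, and the helper `closest_class` are all assumptions for illustration, not the talk's data or method.

```python
import math


def subtract(a, b):
    """Element-wise difference of two class vectors."""
    return [x - y for x, y in zip(a, b)]


def closest_class(query, class_vectors, exclude=()):
    """Name of the class whose vector has the smallest Euclidean distance to `query`."""
    candidates = [name for name in class_vectors if name not in exclude]
    return min(candidates, key=lambda name: math.dist(query, class_vectors[name]))


# Toy class vectors over three invented features: [wheels, animal, building].
class_vectors = {
    "horse cart": [1.0, 1.0, 0.0],
    "horse": [0.0, 1.0, 0.0],
    "rickshaw": [1.0, 0.0, 0.1],
    "church": [0.0, 0.0, 1.0],
}

difference = subtract(class_vectors["horse cart"], class_vectors["horse"])
result = closest_class(difference, class_vectors, exclude=("horse cart", "horse"))
print(result)
```

The operands are excluded from the search so that the answer is a third class rather than one of the inputs.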
  37. Deep Learning and CNNs. Technically: not so new; made possible by increased computational power and a few optimizations; currently, a trial-and-error research approach. Impact: anything related to visual data has changed; the same will happen with text, sound and others. Just the tip of the iceberg!
  38. Deep Learning and CNNs online materials: http://cs231n.github.io/convolutional-networks/ and http://ufldl.stanford.edu/tutorial/ Almost out-of-the-box CNNs: Caffe, Torch, Theano, TensorFlow. Contact: dario.garcia@bsc.es
