
Deep Learning for Computer Vision: Image Classification (UPC 2016)


http://imatge-upc.github.io/telecombcn-2016-dlcv/

Deep Learning for Computer Vision Barcelona

Summer seminar UPC TelecomBCN (July 4-8, 2016)

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks that had previously been addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course covers the basic principles and applications of deep learning to computer vision problems such as image classification, object detection and image captioning.



  1. Day 1 Lecture 2: Classification [course site] Eva Mohedano
  2. Outline: Image Classification, Train/Test Splits, Metrics, Models
  3. Outline: Image Classification, Train/Test Splits, Metrics, Models
  4. Image Classification: a set of predefined categories [e.g., table, apple, dog, giraffe]; binary classification [1, 0]: DOG
  5. Image Classification
  6. Image Classification pipeline: input image → "Dog". Slide credit: Jose M. Álvarez
  7. Image Classification pipeline: learned representation → "Dog". Slide credit: Jose M. Álvarez
  8. Image Classification pipeline, Part I: End-to-end learning (E2E): learned representation → "Dog". Slide credit: Jose M. Álvarez
  9. Image Classification: Example Datasets: a training set of 60,000 examples and a test set of 10,000 examples
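
The 60,000/10,000 split quoted in slide 9 is the canonical MNIST split. Assuming the example dataset is indeed MNIST, here is a minimal Python loading sketch with Keras (any MNIST loader would work equally well):

    # Minimal sketch: load MNIST, whose canonical split is 60,000 train / 10,000 test.
    # Assumes TensorFlow/Keras is installed.
    from tensorflow.keras.datasets import mnist

    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
    print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)
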
  10. Image Classification: Example Datasets
  11. Outline: Image Classification, Train/Test Splits, Metrics, Models
  12. Training set: an N × D matrix of feature values (N training examples as rows, D features as columns), together with a column of labels (e.g., 0/1) for the N examples
  13. Train/Test Splits: Dataset → shuffle → Shuffled data → split → Training data (70%) and Test data (30%). Learning algorithm: fit(X, y) → Model; Prediction algorithm: predict(X) → Predictions; Compute error/accuracy: score(X, y) → Out-of-sample error estimate. The diagram's "NO!" marks the forbidden path from the test data into the learning algorithm.
  14. Data hygiene: Split your dataset into train and test at the very start ● It is usually good practice to shuffle the data (exception: time series) ● Do not look at the test data (data snooping)! Lock it away at the start to prevent contamination ● NB: never, ever train on the test data! If you do, you have no way to estimate the error; your model could easily overfit the test data and generalize poorly, and without held-out test data you have no way of knowing; the model may then fail in production (see the sketch below)
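
A minimal sketch of the flow on slides 13–14, using scikit-learn; X and y here are toy placeholders for a real feature matrix and label vector:

    # Shuffle, 70/30 split, fit on train only, score on the locked-away test set.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))              # N=100 examples, D=5 features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy binary labels

    # Shuffle + split once at the very start; the test set is not touched again
    # until the final evaluation.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, shuffle=True, random_state=0)

    model = LogisticRegression().fit(X_train, y_train)  # fit(X, y)
    predictions = model.predict(X_test)                 # predict(X)
    print(model.score(X_test, y_test))                  # out-of-sample error estimate
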
  15. Outline: Image Classification, Train/Test Splits, Metrics, Models
  16. Metrics: confusion matrices provide a per-class comparison of the automatic classification results against the ground-truth annotations.
  17. Metrics: correct classifications appear on the diagonal, while the remaining cells correspond to errors. (The slide shows a 3×3 confusion matrix: rows are Ground Truth Class 1–3, columns are Prediction Class 1–3, with entries x(i,j).)
  18. Metrics, special case: binary classifiers in terms of "Positive" vs. "Negative". Rows are Ground Truth, columns are Prediction: (Positive, Positive) = True Positive (TP); (Positive, Negative) = False Negative (FN); (Negative, Positive) = False Positive (FP); (Negative, Negative) = True Negative (TN).
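
A small sketch of how these four cells are obtained in practice, assuming scikit-learn and illustrative label vectors:

    # Binary confusion matrix: rows = ground truth, columns = prediction.
    from sklearn.metrics import confusion_matrix

    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

    # For 0/1 labels, sklearn lays the cells out as [[TN, FP], [FN, TP]].
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=3 FN=1 FP=1 TN=3
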
  19. Metrics: the "accuracy" measures the proportion of correct classifications, without distinguishing between classes: the sum of the diagonal cells divided by the sum of all cells; in the binary case, (TP + TN) / (TP + TN + FP + FN). (The slide repeats the multi-class and binary confusion matrices from slides 17–18.)
  20. Metrics: given a reference class, its Precision (P) and Recall (R) are complementary measures of relevance: P = TP / (TP + FP) and R = TP / (TP + FN). Example: the relevant class is "Positive" in a binary classifier. (Figure: "Precisionrecall" by Walber, own work, licensed under Creative Commons Attribution-Share Alike 4.0 via Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Precisionrecall.svg#mediaviewer/File:Precisionrecall.svg)
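
A minimal sketch of the two definitions, reusing the toy counts from the previous snippet:

    # Precision and recall for the reference ("Positive") class.
    def precision(tp, fp):
        # Of everything predicted Positive, the fraction that really is Positive.
        return tp / (tp + fp)

    def recall(tp, fn):
        # Of everything truly Positive, the fraction retrieved as Positive.
        return tp / (tp + fn)

    print(precision(tp=3, fp=1))  # 0.75
    print(recall(tp=3, fn=1))     # 0.75
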
  21. Metrics: binary classification results often depend on a parameter (e.g., a decision threshold) whose value directly impacts precision and recall. For this reason, a Receiver Operating Characteristic (ROC) curve, plotting the True Positive Rate against the False Positive Rate as the threshold varies, is often provided as a result.
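
A minimal sketch of computing ROC points by sweeping the threshold, assuming scikit-learn and illustrative scores:

    # ROC: one (FPR, TPR) point per decision threshold over the classifier scores.
    from sklearn.metrics import auc, roc_curve

    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]  # e.g., sigmoid outputs of a binary classifier

    fpr, tpr, thresholds = roc_curve(y_true, scores)
    print(tpr)            # True Positive Rate at each threshold
    print(fpr)            # False Positive Rate at each threshold
    print(auc(fpr, tpr))  # area under the ROC curve; 0.75 for these toy values
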
  22. Outline: Image Classification, Train/Test Splits, Metrics, Models
  23. Image Classification pipeline: input image → "Dog". Slide credit: Jose M. Álvarez
  24. Linear Models: a mapping function that predicts a score for the class label: f(x, w) = wᵀx + b. (Source: CS231n: Convolutional Neural Networks for Visual Recognition)
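
A minimal NumPy sketch of the linear score function; the shapes (D = 3072 for a flattened 32×32×3 image, C = 10 classes) follow the usual CS231n example rather than anything stated on the slide:

    # Linear model: f(x, W) = Wx + b gives one score per class.
    import numpy as np

    D, C = 3072, 10                  # flattened 32x32x3 image, 10 classes
    rng = np.random.default_rng(0)
    W = rng.normal(size=(C, D))      # one row of weights per class label
    b = np.zeros(C)
    x = rng.normal(size=D)           # input image, flattened to a D-vector

    scores = W @ x + b               # C class scores
    print(int(scores.argmax()))      # predicted label = class with highest score
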
  25. Logistic Regression: f(x, w) = g(wᵀx + b), where the sigmoid activation function g turns the score into a probability.
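
The same linear score with a sigmoid on top, as a sketch (toy weights, single example):

    # Logistic regression: squash the linear score into a probability in (0, 1).
    import numpy as np

    def sigmoid(z):
        # The activation function g of slide 25.
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    w = rng.normal(size=5)   # weights for D=5 features (toy values)
    b = 0.0
    x = rng.normal(size=5)

    score = w @ x + b        # linear score from slide 24
    p = sigmoid(score)       # probability of the positive class
    print(p)
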
  26. Neuron
  27. Summary: Image classification aims to build a model that automatically predicts the content of an image. When posed as a supervised problem, the data should be structured into train/test splits to learn the model. It is important to have a way to assess the performance of the models. The simplest classification models are linear models. A linear model with a non-linearity on top represents the "neuron", the basic element of neural networks.
