Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problems - StampedeCon AI Summit 2017

In this session, we’ll discuss approaches for applying convolutional neural networks to novel computer vision problems, even without having millions of images of your own. Pretrained models and generic image data sets from Google, Kaggle, universities, and other places can be leveraged and adapted to solve industry and business specific problems. We’ll discuss the approaches of transfer learning and fine tuning to help anyone get started on using deep learning to get cutting edge results on their computer vision problems.

Published in: Data & Analytics

  1. 1. Don’t Start From Scratch Florian Muellerklein fmuellerklein@minerkasch.com
  2. 2. Contents ● What is transfer learning? ● How to do it ● Why does it work?
  3. 3. What is transfer learning? ● Transfer learning is taking a neural network that has already been trained and adapting it to a new dataset. ● It allows us to use deep learning on problems for which we have very little training data. ● It improves the generalization of our network on some novel task. ● It lets big AI labs do all the hard work for us.
  4. 4. Pretrained Neural Networks Big research labs typically design and test their new deep learning models against large public benchmark datasets. Most commonly this is ImageNet (1.2 million images across 1000 categories) or MS COCO (330k images for tasks like object detection, image segmentation, and keypoint detection).
  5. 5. Pretrained Neural Networks - Classification VGG - https://arxiv.org/pdf/1409.1556.pdf, Inception V3 - https://arxiv.org/pdf/1512.00567v3.pdf, ResNet - https://arxiv.org/abs/1512.03385
  6. 6. Pretrained Neural Networks - Detection Faster R-CNN - https://arxiv.org/pdf/1506.01497.pdf, SSD - https://arxiv.org/pdf/1512.02325.pdf, YOLO - https://arxiv.org/abs/1506.02640
  7. 7. Transfer learning - Classification Ok, so we have a state of the art neural network that someone else trained to recognize images and classify them in one of 1000 categories. But our problem has maybe 5 categories. What do we do?
  9. 9. Transfer learning - Classification We replace the network's final 1000-way fully connected layer with a new one sized for our own categories, here a 5-way layer (fc 5).
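The "fc 5" idea on this slide can be sketched in PyTorch as follows. This is a minimal illustration, not the speaker's code: `TinyClassifier` is a hypothetical stand-in for a real pretrained network, and its layer sizes are assumptions chosen so the example runs without downloading any weights.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a pretrained 1000-class image classifier."""

    def __init__(self, num_classes: int = 1000):
        super().__init__()
        # "Deep" part: turns an image into a feature vector.
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Final fully connected layer: one output per original category.
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.fc(self.features(x))


# In practice this model would be loaded with pretrained weights.
model = TinyClassifier(num_classes=1000)

# Swap the 1000-way output layer for a fresh 5-way layer ("fc 5").
# The new layer has the same input width but randomly initialized weights.
model.fc = nn.Linear(model.fc.in_features, 5)

batch = torch.randn(2, 3, 32, 32)  # two fake RGB images
logits = model(batch)
print(logits.shape)  # one row per image, one column per new category
```

Everything before the swapped layer is untouched, so any features it learned on the original task are carried over to the new 5-category problem.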
  10. 10. Transfer learning - Detection Detection is very similar. The only portions of the pretrained detection network that are not flexible are the last ones, which we can call the ‘detector’ portion of the network. We can swap that out for our own ‘detector’ portion that will learn the specific objects we want to detect.
  11. 11. What is transfer learning? Once you have your own classification, regression, or other output layer tacked on, you can start training. If your dataset is small, you may want to train only your new layer. If your dataset is large, or very different from the one the network was originally trained on, you may want to retrain the whole network.
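One way to express the "train only the new layer" versus "retrain the whole network" choice in PyTorch is to toggle `requires_grad` on the pretrained parameters. The sketch below assumes a tiny made-up `backbone` and `head` in place of a real pretrained model, and the helper `set_trainable` is hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: `backbone` plays the pretrained layers,
# `head` plays the newly added 5-class output layer.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
head = nn.Linear(32, 5)
model = nn.Sequential(backbone, head)


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Turn gradient updates on or off for every parameter in a module."""
    for param in module.parameters():
        param.requires_grad = trainable


# Small dataset: freeze the pretrained layers and train only the new head.
set_trainable(backbone, False)
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)

backbone_before = backbone[0].weight.detach().clone()
head_before = head.weight.detach().clone()

# One training step on fake data.
x = torch.randn(8, 16)
y = torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone_before, backbone[0].weight)
head_changed = not torch.equal(head_before, head.weight)

# Large or very different dataset: unfreeze everything and retrain it all.
set_trainable(backbone, True)
```

When unfreezing, the optimizer would also need to be rebuilt over `model.parameters()` so that the pretrained layers actually receive updates.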
  12. 12. What if I don’t want to train? We can even make use of a pretrained neural network without retraining. Once trained, the deep learning model becomes an effective way to represent any image as a feature-rich vector. We can pass any image through our pretrained neural network and get a dense vector representation of that image. This allows us to use these features as the input for any traditional machine learning algorithm (logistic regression, random forest, etc.). Or we can set up a reverse image search type of system by computing similarity scores between the vectors.
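The reverse image search idea can be sketched with cosine similarity, which is one common choice of similarity score (the slide does not name a specific one). The file names and 4-dimensional vectors below are made up for illustration; real feature vectors from a pretrained network would typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, index, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors, as if extracted by a pretrained network.
index = {
    "cat_1.jpg": [0.9, 0.1, 0.0, 0.2],
    "cat_2.jpg": [0.8, 0.2, 0.1, 0.1],
    "truck_1.jpg": [0.0, 0.9, 0.8, 0.1],
}
query = [1.0, 0.0, 0.0, 0.1]  # features of the image we are searching with

results = most_similar(query, index)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The same vectors could instead be fed as input columns to a logistic regression or random forest, as the slide suggests.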
  13. 13. Transfer learning – How Well Does it Work? Remi Cadene’s master’s thesis - Deep Learning for Visual Recognition
  14. 14. What’s going on here?
  15. 15. Representation Learning!
  16. 16. Representation Learning? http://www.rsipvision.com/exploring-deep-learning/
  19. 19. Representation Learning? http://www.rsipvision.com/exploring-deep-learning/ Input data, then data transformations, then output: usually some shallow machine learning layer (logistic regression, linear SVM, linear regression).
  20. 20. Representation learning Deep learning presents us with a unique philosophy for machine learning. Instead of learning the mapping from our features directly to an output, we are learning the best representation of our features. Each layer of a neural network introduces feature representations that are expressed in terms of other, simpler representations. The process has multiple steps, with each layer building on the representations created by the previous layer.
  21. 21. Representation Learning Deep Learning Book (2016)
  22. 22. Representation Learning Here is a test with natural images: unsupervised clustering is run on the raw images and then again on the features extracted from the final layer of a deep neural network.
  23. 23. Representation Learning
  27. 27. Transfer Learning Context
  28. 28. Transfer Learning Context Edges are very generic features. Probably don’t need to finetune these.
  29. 29. Representation Learning
  30. 30. Representation Learning Contours, shapes, and corners are slightly more specific. Finetuning could be helpful here, maybe not necessary.
  31. 31. Representation Learning Layers are starting to get really specific to the original problem. Should finetune these.
  32. 32. Representation Learning Definitely finetune these
  33. 33. Representation learning This is an interesting observation! The ‘deep’ part of this neural network is essentially working to try and make the problem easier. We will leverage this property to very quickly and effectively make use of the power of deep learning without the headache of training the networks from scratch.
  34. 34. Recap
  35. 35. Applied Deep Learning is Mostly Finetuning Most applied deep learning work revolves around two strategies, both of which utilize neural networks that have already been trained. 1. We can take a pretrained network and finetune it on our specific problem. a. Ex. A computer vision model already knows how to ‘see’; we just need to finetune it to see whatever our specific problem requires. 2. We can use a pretrained network to extract its internal representations of a dataset and use them as features for another model.
  36. 36. Fine Tuning a Network To finetune a network, we typically choose a network that was already trained on a problem similar to our own. We then train the network in much the same way that it was originally trained, with a few differences: 1. Use a much smaller learning rate. 2. Only train for a handful of iterations.
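The two differences listed on this slide might look like the following in PyTorch. The network is a hypothetical stand-in for a pretrained model, the data is random, and the specific values (`ORIGINAL_LR`, the 100x reduction, `NUM_STEPS`) are assumptions for illustration rather than recommendations from the talk.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained network with a 5-class output.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 5))

# Assumed values, for illustration only.
ORIGINAL_LR = 0.1                 # rate imagined for the original training run
FINETUNE_LR = ORIGINAL_LR / 100   # 1. use a much smaller learning rate
NUM_STEPS = 5                     # 2. only train for a handful of iterations

# All layers are updated, but gently, so pretrained weights shift only slightly.
optimizer = torch.optim.SGD(model.parameters(), lr=FINETUNE_LR, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# Fake batch standing in for the new, smaller dataset.
x = torch.randn(32, 16)
y = torch.randint(0, 5, (32,))

losses = []
for step in range(NUM_STEPS):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses)
```

The small learning rate is what keeps the pretrained weights close to their starting point; a large rate could quickly overwrite what the network already learned.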
  37. 37. Feature Extraction We can use these networks as feature extractors to represent our data as dense, information-rich vectors. We can vectorize the following quite easily: 1. Images a. Using convolutional neural networks trained to classify images 2. Words a. Using neural networks trained to predict words given their context 3. Sentences a. Using neural networks that are trained to reconstruct sentences given their context
  38. 38. Further Resources – Classification Tutorials TensorFlow - https://www.tensorflow.org/tutorials/image_retraining, PyTorch - http://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html, Caffe - http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html, MXNet - https://mxnet.incubator.apache.org/how_to/finetune.html
  39. 39. Further Resources – Theory How transferable are features in deep neural networks? Jason Yosinski, Jeff Clune, Yoshua Bengio, Hod Lipson - https://arxiv.org/abs/1411.1792; Deep Learning Book, Ian Goodfellow, Yoshua Bengio, Aaron Courville - http://www.deeplearningbook.org/; OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun - https://arxiv.org/pdf/1312.6229.pdf
