Image classification on Imagenet (D1L4 2017 UPC Deep Learning for Computer Vision)


https://telecombcn-dl.github.io/2017-dlcv/

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

Published in: Data & Analytics

  1. Image Classification on ImageNet. #DLUPC. [course site] Xavier Giro-i-Nieto, xavier.giro@upc.edu, Associate Professor, Universitat Politecnica de Catalunya (Technical University of Catalonia).
  2. ImageNet Challenge. 1,000 object classes (categories). Images: 1.2 M train, 100k test.
  3. ImageNet Dataset. Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang et al. "Imagenet large scale visual recognition challenge." International Journal of Computer Vision 115, no. 3 (2015): 211-252. [web]
  4. ImageNet Challenge: 2012. Based on SIFT + Fisher Vectors. -9.8%. Slide credit: Rob Fergus (NYU). Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2014). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
  5. AlexNet (SuperVision). A Krizhevsky, I Sutskever, GE Hinton, “Imagenet classification with deep convolutional neural networks”, NIPS 2012.
  6. ImageNet Challenge: 2013. ImageNet Classification 2013. Slide credit: Rob Fergus (NYU). Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
  7. Zeiler-Fergus (ZF). The development of better convnets is reduced to trial-and-error. Visualization can help in proposing better architectures. Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing.
  8. Zeiler-Fergus (ZF). “A convnet model that uses the same components (filtering, pooling) but in reverse, so instead of mapping pixels to features does the opposite.” Zeiler, Matthew D., Graham W. Taylor, and Rob Fergus. "Adaptive deconvolutional networks for mid and high level feature learning." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
  9. Zeiler-Fergus (ZF). Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing.
  10. Zeiler-Fergus (ZF): Dropout. Regularization with more dropout: introduced in the input layer. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580.
  11. ImageNet Challenge: 2013. ImageNet Classification 2013. -5%. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
  12. ImageNet Challenge: 2014. NVIDIA, “NVIDIA and IBM Cloud Support ImageNet Large Scale Visual Recognition Challenge” (2015).
  13. ImageNet Challenge: 2014.
  14. GoogLeNet (Inception). Movie: Inception (2010).
  15. GoogLeNet (Inception). 22 layers! Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions."
  16. GoogLeNet (Inception).
  17. GoogLeNet (Inception). Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
  18. GoogLeNet (Inception). Multiple scales. Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
  19. GoogLeNet (NiN). 3x3 and 5x5 convolutions deal with different scales. Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014. [Slides]
  20. GoogLeNet (Inception). Dimensionality reduction. Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
  21. GoogLeNet (Inception). 1x1 convolutions perform dimensionality reduction (c3 < c2) and include rectified linear units (ReLU). Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014. [Slides]
  22. GoogLeNet (Inception). In GoogLeNet, the cascaded 1x1 convolutions compute reductions before the expensive 3x3 and 5x5 convolutions.
  23. GoogLeNet (Inception). Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
  24. GoogLeNet (Inception). Two softmax classifiers at intermediate layers combat the vanishing gradient while providing regularization at training time... and no fully connected layers are needed (12 times fewer parameters than AlexNet!).
  25. GoogLeNet (Inception). Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." CVPR 2015. [video] [slides] [poster]
  26. E2E: Classification: VGG. Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." ICLR 2015. [video] [slides] [project]
  27. E2E: Classification: VGG. Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." ICLR 2015. [video] [slides] [project]
  28. E2E: Classification: VGG: 3x3 Stacks. Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." ICLR 2015. [video] [slides] [project]
  29. E2E: Classification: VGG. No pooling between some convolutional layers. Convolution strides of 1 (no skipping). Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." ICLR 2015. [video] [slides] [project]
  30. ImageNet Challenge: 2015. 3.6% top-5 error... with 152 layers!
  31. E2E: Classification: ResNet. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." CVPR 2016. [slides]
  32. E2E: Classification: ResNet. Deeper networks (34 is deeper than 18) are more difficult to train. Thin curves: training error. Bold curves: validation error. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." CVPR 2016. [slides]
  33. ResNet. Residual learning: reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." CVPR 2016. [slides]
  34. E2E: Classification: ResNet. He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." CVPR 2016. [slides]
  35. Learn more. Li Fei-Fei, “How we’re teaching computers to understand pictures”, TEDTalks 2014. Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang et al. "Imagenet large scale visual recognition challenge." International Journal of Computer Vision 115, no. 3 (2015): 211-252. [web]
  36. The end of the challenge. http://image-net.org/challenges/beyond_ilsvrc
  37. Thanks! Q&A? Follow me at https://imatge.upc.edu/web/people/xavier-giro, @DocXavi, /ProfessorXavi
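
Slide 10 mentions dropout as a regularizer, applied also at the input layer. As a rough illustration only, the sketch below uses the "inverted dropout" variant that is common in practice, which may differ from the exact formulation in the cited paper. The function name `dropout` and its parameters are hypothetical, chosen for this example.

```python
import random


def dropout(values, p_drop, rng, training=True):
    """Inverted dropout on a flat list of activations (illustrative sketch).

    Each value is zeroed with probability p_drop during training. Surviving
    values are divided by the keep probability so the expected activation is
    unchanged, which lets inference skip dropout entirely.
    """
    if not training or p_drop == 0.0:
        return list(values)
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in values]


# Hypothetical usage: drop roughly 20% of eight input activations.
rng = random.Random(0)
inputs = [1.0] * 8
print(dropout(inputs, 0.2, rng))
print(dropout(inputs, 0.2, rng, training=False))  # unchanged at test time
```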
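
Slide 21 describes 1x1 convolutions as a dimensionality reduction over channels combined with a ReLU. One way to picture this is as the same small linear map applied independently at every pixel. The sketch below assumes a feature map stored as nested lists (height x width x channels); the function name `conv1x1_relu` and the toy weights are hypothetical.

```python
def conv1x1_relu(feature_map, weights, biases):
    """Apply a 1x1 convolution followed by ReLU.

    feature_map: H x W x C_in nested lists.
    weights:     C_out x C_in matrix shared by every pixel.
    biases:      C_out values.
    Returns an H x W x C_out nested list.
    """
    out = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            out_row.append([
                max(0.0, sum(w * x for w, x in zip(w_row, pixel)) + b)
                for w_row, b in zip(weights, biases)
            ])
        out.append(out_row)
    return out


# Toy example: a 1 x 2 map with 3 channels reduced to 2 channels.
feature_map = [[[1.0, 2.0, 3.0], [0.0, -1.0, 4.0]]]
weights = [[1.0, 1.0, 0.0],
           [0.5, 0.5, 0.0]]
biases = [0.0, 1.0]
reduced_map = conv1x1_relu(feature_map, weights, biases)
print(reduced_map)  # [[[3.0, 2.5], [0.0, 0.5]]]
```

The spatial size is untouched; only the channel count shrinks, which is why the operation is cheap.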
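
Slide 22 states that the 1x1 convolutions compute reductions before the expensive 3x3 and 5x5 convolutions. The saving can be estimated by counting multiply-accumulate operations. The sizes below (a 28 x 28 map, 192 input channels, a reduction to 16 channels, 32 output filters) are illustrative assumptions, as are stride 1 and padding that preserves the spatial size.

```python
def conv_macs(height, width, c_in, c_out, kernel):
    """Multiply-accumulates for a stride-1, size-preserving convolution."""
    return height * width * c_out * kernel * kernel * c_in


H = W = 28                          # assumed spatial size
C_IN, C_REDUCED, C_OUT = 192, 16, 32  # assumed channel counts

# 5x5 convolution applied directly to all input channels.
direct_cost = conv_macs(H, W, C_IN, C_OUT, 5)

# 1x1 reduction first, then the 5x5 convolution on fewer channels.
reduced_cost = (conv_macs(H, W, C_IN, C_REDUCED, 1)
                + conv_macs(H, W, C_REDUCED, C_OUT, 5))

print(direct_cost)   # 120422400
print(reduced_cost)  # 12443648
print(round(direct_cost / reduced_cost, 1))
```

Under these assumptions the reduced path needs roughly a tenth of the operations.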
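
Slide 28 is titled "3x3 Stacks". The usual argument for such stacks is that several stride-1 3x3 layers cover the same receptive field as one larger filter while using fewer weights. The sketch below checks that arithmetic; the channel count of 64 is an assumption, biases are ignored, and the helper names are hypothetical.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 kernel x kernel convs."""
    return num_layers * (kernel - 1) + 1


def conv_weights(kernel, channels):
    """Weights in one conv layer with `channels` inputs and outputs (no bias)."""
    return kernel * kernel * channels * channels


CHANNELS = 64  # assumed width, for illustration only

three_3x3 = 3 * conv_weights(3, CHANNELS)
one_7x7 = conv_weights(7, CHANNELS)

print(stacked_receptive_field(2))  # 5: two 3x3 layers see a 5x5 window
print(stacked_receptive_field(3))  # 7: three 3x3 layers see a 7x7 window
print(three_3x3, one_7x7)          # 110592 200704
```

The stack also interleaves more non-linearities than a single large filter would.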
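
Slide 30 quotes a top-5 error. Under this metric a prediction counts as correct when the true label is among the five highest-scored classes. The sketch below computes it on made-up scores; `top_k_error` and the toy data are hypothetical and not taken from the slides.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k top scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data: 4 examples, 6 classes.
scores = [
    [0.05, 0.40, 0.20, 0.15, 0.12, 0.08],
    [0.30, 0.25, 0.20, 0.15, 0.06, 0.04],
    [0.10, 0.35, 0.30, 0.15, 0.06, 0.04],
    [0.50, 0.20, 0.10, 0.10, 0.06, 0.04],
]
labels = [1, 5, 2, 0]

print(top_k_error(scores, labels, k=1))  # 0.5
print(top_k_error(scores, labels, k=5))  # 0.25
```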
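
Slide 33 summarizes residual learning: the layers learn a residual function F(x) relative to their input, and the block outputs F(x) + x. The sketch below uses two small fully connected layers as a stand-in for the convolutional layers of a real ResNet block; that substitution, the helper names, and the weights are all assumptions made for illustration.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def linear(weights, vector):
    """Matrix-vector product with weights given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F is two linear layers with a ReLU between."""
    residual = linear(w2, relu(linear(w1, x)))      # F(x)
    return relu([f + xi for f, xi in zip(residual, x)])  # identity shortcut


x = [1.0, 2.0]

# With all-zero weights F(x) = 0, so the block passes its input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block(x, zeros, zeros))  # [1.0, 2.0]

# With non-zero weights the block only has to model the change to x.
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.5, 0.0], [0.0, -1.0]]
print(residual_block(x, w1, w2))  # [1.5, 0.0]
```

The first case shows why extra blocks need not hurt: an identity mapping is reachable simply by driving the residual weights toward zero.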
