Amirkabir University of Technology
Department of Computer Engineering and Information Technology

Image Classification with Deep Convolutional Neural Networks

Sepehr Rasouli
Outline
• Introduction to Image Classification & Deep Networks
• Proposed Method
• Main Idea
• Data Set
• Architecture
• Techniques
• Comparison & Results
• Conclusion
Why Deep Learning?
• “Shallow” vs. “deep” architectures
• Deep networks learn a feature hierarchy all the way from pixels to classifier

[Figure: a shallow pipeline uses hand-designed feature extraction followed by a trainable classifier; a deep pipeline learns features in Layer 1 … Layer N, followed by a simpler classifier]
5. Introduction > Methods > Results > Conclusion5
Our Method
• Deep Convolutional Neural Network
• 5 convolutional and 3 fully connected layers
• 650,000 neurons, 60 million parameters
• Techniques used to boost performance
• ReLU nonlinearity
• Training on Multiple GPUs
• Overlapping max pooling
• Data Augmentation
• Dropout
Overall Architecture
• Trained with stochastic gradient descent on two NVIDIA GPUs for about a week (5–6 days)
• 650,000 neurons, 60 million parameters, 630 million connections
• The last layer contains 1,000 neurons and produces a distribution over the 1,000 class labels
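A minimal NumPy sketch of how an output layer of 1,000 neurons can be turned into a distribution over the 1,000 class labels via a softmax; the random logits here are placeholders, not the network's actual scores.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# 1,000 raw scores from the last fully connected layer (random here).
rng = np.random.default_rng(0)
logits = rng.normal(size=1000)
probs = softmax(logits)

print(probs.sum())     # sums to 1.0: a distribution over the 1,000 labels
print(probs.argmax())  # index of the most likely class
```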
Dataset
• ImageNet
  • Over 15 million high-quality labeled images
  • About 22,000 categories
  • Collected from the web, labeled by humans on Amazon's Mechanical Turk
  • Variable-resolution images
• ILSVRC Competition
  • ImageNet Large Scale Visual Recognition Challenge
  • Annual competition of image classification at large scale
  • Subset of ImageNet: 1.2M images in 1,000 categories, about 1,000 images each
  • Classification: make 5 guesses about the image label
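The "5 guesses" metric above is the top-5 criterion: a prediction counts as correct if the true label is among the five highest-scoring classes. A small illustrative check (the scores and labels below are made up for the example):

```python
import numpy as np

def top5_correct(scores, true_label):
    # The model's five guesses are the five highest-scoring classes.
    top5 = np.argsort(scores)[-5:]
    return true_label in top5

scores = np.array([0.1, 0.9, 0.3, 0.05, 0.2, 0.8, 0.7, 0.6])
print(top5_correct(scores, true_label=1))  # True: class 1 has the top score
print(top5_correct(scores, true_label=3))  # False: class 3 is not in the top five
```

The top-5 error rate reported later in the deck is simply one minus the fraction of test images for which this check succeeds.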
Rectified Linear Units
x = w1·f(z1) + w2·f(z2) + w3·f(z3)

x is called the total input to the neuron, and f(x) is its output.

[Figure: f(x) = tanh(x) is very bad (slow to train); f(x) = max(0, x) is very good (quick to train)]
Rectified Linear Units
• Biological plausibility: one-sided, compared to the antisymmetry of tanh.
• Sparse activation: in a randomly initialized network, only about 50% of hidden units are activated (i.e., have a non-zero output).
• Efficient gradient propagation: no vanishing or exploding gradients through the nonlinearity.
• Efficient computation: only comparison, addition, and multiplication.
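The sparsity and cheap-computation claims above can be checked with a few lines of NumPy (a sketch, not the paper's GPU implementation):

```python
import numpy as np

def relu(x):
    # Only a comparison and a selection: cheap compared to tanh's exponentials.
    return np.maximum(0.0, x)

rng = np.random.default_rng(42)
# Pre-activations of a randomly initialized hidden layer: zero-mean,
# so roughly half of them are negative.
z = rng.normal(size=100_000)
a = relu(z)

print((a > 0).mean())  # ≈ 0.5: about half the units are active

# The ReLU gradient is 1 wherever the unit is active and 0 elsewhere,
# so it neither shrinks nor blows up as it propagates backwards.
grad = (z > 0).astype(float)
```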
Training on Multiple GPUs
• Network spread across two GTX 580 GPUs with 3 GB memory each
• The architecture is particularly well-suited to cross-GPU parallelization
• Very efficient implementation of the CNN on GPUs
Results & Comparison
• ILSVRC-2010 test set

Model              Top-1   Top-5
Sparse coding [3]  47.1%   28.2%
SIFT + FVs [4]     45.7%   25.7%
CNN (our method)   37.5%   17.0%

Comparison of results on the ILSVRC-2010 test set. The first two rows are the best results achieved by others: the ILSVRC-2010 winner (sparse coding) and the previous best published result (SIFT + FVs).
Conclusion
• A large, deep convolutional neural network for large-scale image classification was proposed
• 5 convolutional layers, 3 fully connected layers
• 650,000 neurons, 60 million parameters
• Several techniques for boosting performance
• The proposed method won ILSVRC-2012
• Achieved a winning top-5 error rate of 15.3%, compared to 26.2% for the second-best entry
References
[1] http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/fergus_dl_tutorial_final.pptx
[2] http://web.engr.illinois.edu/~slazebni/spring14/lec24_cnn.pdf
[3] A. Berg, J. Deng, and L. Fei-Fei. Large scale visual recognition challenge 2010. www.imagenet.org/challenges, 2010.
[4] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image classification. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1665–1672. IEEE, 2011.
Thank you for your attention
Any Questions?
Pooling
• Spatial pooling
• Non-overlapping / overlapping regions
• Reduce each region with sum or max

[Figure: example of max pooling vs. sum pooling over image regions]
Dropout
• Independently set each hidden unit's activity to zero with probability 0.5
• Used in the two globally connected hidden layers at the net's output
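A minimal NumPy sketch of the training-time masking described above; the layer activations here are dummies. Note the paper compensates at test time by multiplying all outputs by 0.5 instead of dropping units (modern "inverted dropout" scales at training time instead).

```python
import numpy as np

def dropout(activations, p=0.5, rng=None):
    # Independently zero each hidden unit with probability p (training time).
    # At test time the paper instead multiplies all outputs by (1 - p).
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask

rng = np.random.default_rng(0)
h = np.ones(10_000)  # dummy activities of a fully connected hidden layer
dropped = dropout(h, p=0.5, rng=rng)
print(dropped.mean())  # ≈ 0.5: about half the units were zeroed
```

Because each forward pass samples a fresh mask, the net effectively trains an ensemble of sub-networks that share weights, which is why dropout reduces overfitting.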