convolutional neural networks for image classification
Evidence from Kaggle National Data Science Bowl
Dmytro Mishkin, ducha.aiki at gmail com
March 25, 2015
Czech Technical University in Prague
kaggle national data science bowl overview
The image classification problem
 130,400 test images
 30,336 train images
 1 channel (grayscale)
 121 (biased) classes
 90% images ≤ 100x100 px
 logloss score = $-\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\log p_{ij}$
 No external data
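The logloss score above can be sketched in a few lines of NumPy. This is a minimal illustration, not the competition's official scorer: the function name, the clipping constant `eps`, and the row renormalisation are assumptions added here.

```python
import numpy as np

def multiclass_logloss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood over N samples and M classes.

    y_true: integer class labels, shape (N,)
    probs:  predicted probabilities, shape (N, M)
    eps:    clipping constant (an assumption; keeps log() finite)
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    probs = probs / probs.sum(axis=1, keepdims=True)  # rows sum to 1 again
    n = probs.shape[0]
    # y_ij is one-hot, so only the probability of the true class contributes.
    return -np.log(probs[np.arange(n), y_true]).mean()

# A uniform guess over 121 classes scores log(121), roughly 4.80.
uniform = np.full((4, 121), 1.0 / 121)
print(multiclass_logloss(np.array([0, 5, 17, 120]), uniform))
```

A uniform prediction is a useful sanity baseline: any model scoring above log(121) is doing worse than guessing.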
classes diagram
1
1url: http://npow.github.io/plankton/viewer/index.html.
final leaderboard
Which approach to use?
lunch time chat at kth’s computer vision group
 a computer vision scientist: How long does it take to train these
generic features on ImageNet?
 Hossein: 2 weeks
 Ali: almost 3 weeks depending on the hardware
 the computer vision scientist: hmmmm...
 Stefan: Well, you have to compare the three weeks to the last 40
years of computer vision2
2http://www.csc.kth.se/cvap/cvg/DL/ots/
convolutional networks
CNNs are the state of the art in image recognition tasks such as:3
 – Object Image Classification
 – Scene Image Classification
 – Action Image Classification
 – Object Detection
 – Semantic Segmentation
 – Fine-grained Recognition
 – Attribute Detection
 – Metric Learning
 – Instance Retrieval (almost).
3CNNs beat classic computer vision methods on 19 of 20 datasets:
http://www.csc.kth.se/cvap/cvg/DL/ots/
contents
1. Basics of convolutional networks
2. Image preprocessing
3. Network architectures
4. Ensembling
5. What does and does not (seem to) work
6. Winner’s solution highlights
basics of convolutional networks
what is convolution
4
4https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
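A minimal NumPy sketch of what a 2D convolution computes, assuming a single-channel image, stride 1, and no padding ("valid" output). The function name is hypothetical. Note that most deep learning libraries skip the kernel flip and compute cross-correlation; for learned filters the difference does not matter.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide a flipped kernel over the image and sum the products."""
    kernel = np.flipud(np.fliplr(kernel))  # true convolution flips the kernel
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
box_blur = np.ones((3, 3)) / 9.0   # averaging filter
print(convolve2d_valid(image, box_blur))
```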
softmax classifier
Softmax (cross-entropy) loss
$L = -\log \frac{e^{f_{y_i}}}{\sum_j e^{f_j}}$
SVM (hinge) loss
$L = \sum_{j \neq y_i} \max(0, f(x_i, W)_j - f(x_i, W)_{y_i} + \Delta)$
5
5http://vision.stanford.edu/teaching/cs231n/linear-classify-demo/
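Both losses can be evaluated for a single sample as follows. This is a sketch under assumptions: `scores` stands for the vector of class scores f(x_i, W), the function names are made up for illustration, and the margin defaults to Δ = 1.

```python
import numpy as np

def softmax_loss(scores, y):
    """Cross-entropy loss -log(e^{f_y} / sum_j e^{f_j}) for one sample."""
    shifted = scores - scores.max()      # stabilises exp(); the loss is unchanged
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[y]

def hinge_loss(scores, y, delta=1.0):
    """Multiclass SVM loss: sum over j != y of max(0, f_j - f_y + delta)."""
    margins = np.maximum(0.0, scores - scores[y] + delta)
    margins[y] = 0.0                     # the true class is excluded from the sum
    return margins.sum()

f = np.array([3.2, 5.1, -1.7])           # made-up class scores, true class 0
print(softmax_loss(f, 0))
print(hinge_loss(f, 0))
```

The hinge loss is exactly zero once the true class wins by the margin, while the softmax loss keeps pushing the true-class probability towards 1.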
lenet-5. no other layers are necessary
6
The idea was first proposed by LeCun7 in 1989 and recently revived by
Springenberg et al. in "All Convolutional Net"8.
6http://eblearn.sourceforge.net/beginner_tutorial2_train.html
7url: https://www.facebook.com/yann.lecun/posts/10152766574417143.
8J. T. Springenberg et al. “Striving for Simplicity: The All Convolutional Net”. In:
ArXiv e-prints (2014). arXiv: 1412.6806 [cs.LG].
non-linearities
[Plot: activation vs. input for TanH, Sigmoid, ReLU, maxout (sort of), and LeakyReLU]
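The plotted non-linearities can be written down directly. A minimal sketch: the LeakyReLU slope of 0.01 is an assumed value, and `maxout` is shown for a scalar input as the maximum over a few affine pieces, which is why it can mimic the other piecewise-linear functions.

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # slope for negative inputs is an assumed value
    return np.where(x > 0, x, slope * x)

def maxout(x, w, b):
    # maximum over k affine pieces w[i] * x + b[i] (scalar-input version)
    return np.max(np.outer(w, x) + b[:, None], axis=0)

x = np.linspace(-3, 3, 7)
print(relu(x))
print(leaky_relu(x))
# Two pieces with slopes 1 and 0.01 reproduce LeakyReLU.
print(maxout(x, np.array([1.0, 0.01]), np.zeros(2)))
```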
regularization - dropout, weight decay
9
9Nitish Srivastava et al. “Dropout: A Simple Way to Prevent Neural Networks from
Overfitting”. In: Journal of Machine Learning Research 15 (2014), pp. 1929–1958.
url: http://jmlr.org/papers/v15/srivastava14a.html.
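A minimal sketch of both regularizers in NumPy. It uses the "inverted" dropout variant, which rescales at training time so that nothing changes at test time; the original paper instead rescales the weights at test time. Function names and hyperparameter values are assumptions for illustration.

```python
import numpy as np

def dropout(x, p_drop, rng, train=True):
    """Zero each activation with probability p_drop and rescale the rest."""
    if not train or p_drop == 0.0:
        return x                          # identity at test time
    mask = rng.random(x.shape) >= p_drop  # True where the unit is kept
    return x * mask / (1.0 - p_drop)

def sgd_step(w, grad, lr, weight_decay):
    """Plain SGD update with L2 weight decay added to the gradient."""
    return w - lr * (grad + weight_decay * w)

rng = np.random.default_rng(0)
activations = np.ones((2, 8))
print(dropout(activations, 0.5, rng))
print(sgd_step(np.array([1.0, -2.0]), np.zeros(2), lr=0.1, weight_decay=0.5))
```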
deep learning libraries
Table 1: Popular deep learning GPU libraries
Name | URL | Languages | Notes
caffe | github.com/BVLC/caffe | C++/Python/no | largest community
cxxnet | github.com/dmlc/cxxnet | C++/no | good memory management
Theano | github.com/Theano/Theano | Python | huge flexibility
Torch | facebook/fbcunn | Lua | LeCun Facebook library
cuda-convnet2 | code.google.com/p/cuda-convnet2/ | C++/Python |
SparseConvNet | http://tinyurl.com/pu65cfp | C++/CUDA | differs from others
image preprocessing
basic network architecture
72x72x1 → Crop to 64x64 → 20C5 → MP2 → 50C5 → MP2 → 500IP → clf
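The feature-map sizes along this chain can be traced with a small helper. This is a sketch under assumptions not stated on the slide: "20C5" is read as 20 filters of size 5x5 with stride 1 and no padding, "MP2" as non-overlapping 2x2 max pooling, and "500IP" as a fully connected (inner product) layer with 500 units. The helper name is hypothetical.

```python
def trace_shapes(input_size, layers):
    """Return (height, width, channels) after each layer of a square input."""
    size, channels = input_size, 1
    shapes = [(size, size, channels)]
    for name in layers:
        if name.startswith("MP"):
            k = int(name[2:])
            size = (size - k) // k + 1        # pooling with stride == kernel
        else:
            n_filters, k = name.split("C")
            channels = int(n_filters)
            size = size - int(k) + 1          # "valid" convolution, stride 1
        shapes.append((size, size, channels))
    return shapes

shapes = trace_shapes(64, ["20C5", "MP2", "50C5", "MP2"])
for shape in shapes:
    print(shape)
h, w, c = shapes[-1]
print("inputs to 500IP:", h * w * c)
```

Under these assumptions the 64x64 crop shrinks to 13x13x50, so the inner product layer sees 8450 inputs.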
basic data preprocessing
Table 2: 5-layer network experiments, 48x48 input image, no non-linearities,
mean pixel subtraction
Name, augmentation Val logloss Train logloss
No mean subtraction, no scaling – –
mirror 1.67 0.64
histeq, mirror 1.74 0.64
mirror + ReLU 1.61 0.44
mirror + scale 1.42 0.937
mirror + scale LeakyReLU 1.34 0.83
mirror + rand rot 1.53 1.31
basic data preprocessing
Table 3: 5-layer network experiments, 48x48 input image, LeakyReLU
non-linearities, mean pixel subtraction
Name, augmentation Val logloss Train logloss
mirror + scale 1.34 0.83
invert, mirror + scale 1.27 0.80
invert, norm, mirror + scale 1.24 0.505
invert, norm, mirror + scale, salt-pepper 1.15 n/a
more geometric transformations
Table 4: 5-layer network experiments, 64x64 input image, LeakyReLU
Name, augmentation Val logloss
mirror 1.30
mirror + scale (resize modes) 1.12
h+v mirror, scale 1.10
h+v mirror, scale + rot 1.08
mirror, less baselr 1.04 :)
h+v mirror, scale + rot, depolar imgs 1.28
regularization methods
Table 5: 5-layer network experiments, 64x64 input image, LeakyReLU
Name, augmentation Val logloss
h+v mirror, scale + rot, vanilla 1.08
h+v mirror, scale + rot, PReLU10 (but slows training down a lot) 1.03
h+v mirror, scale + rot, BatchNorm11 1.10
h+v mirror, scale + rot, StochPool12 0.98
10K. He et al. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on
ImageNet Classification”. In: ArXiv e-prints (2015). arXiv: 1502.01852 [cs.CV].
11S. Ioffe and C. Szegedy. “Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift”. In: ArXiv e-prints (2015). arXiv: 1502.03167
[cs.LG].
12M. D. Zeiler and R. Fergus. “Stochastic Pooling for Regularization of Deep
Convolutional Neural Networks”. In: ArXiv e-prints (2013). arXiv: 1301.3557 [cs.LG].
data augmentation - don’t forget about it during test time
for i = 0, 90, 180, 270 degrees rotation
    for 9 crops (N, NE, E, ...)
        get predictions for mirrored/non-mirrored
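The loop above might look like this in NumPy. It is a sketch, not the code used in the competition: `predict` is a hypothetical callable mapping one crop to a vector of class probabilities, the image is assumed square, the 9 crops are taken on a 3x3 grid of offsets, and the 4 x 9 x 2 = 72 predictions are combined with an arithmetic mean.

```python
import numpy as np

def crop_offsets(image_size, crop_size):
    """Top-left corners of 9 crops: corners, edge midpoints, and centre."""
    last = image_size - crop_size
    steps = (0, last // 2, last)
    return [(r, c) for r in steps for c in steps]

def tta_predict(predict, image, crop_size):
    """Average predictions over 4 rotations x 9 crops x 2 mirror states."""
    preds = []
    for k in range(4):                         # 0, 90, 180, 270 degrees
        rotated = np.rot90(image, k)
        for r, c in crop_offsets(rotated.shape[0], crop_size):
            crop = rotated[r:r + crop_size, c:c + crop_size]
            preds.append(predict(crop))
            preds.append(predict(np.fliplr(crop)))
    return np.mean(preds, axis=0)

# Stand-in model: always returns the same two-class distribution.
dummy_model = lambda crop: np.array([0.25, 0.75])
print(tta_predict(dummy_model, np.zeros((72, 72)), crop_size=64))
```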
network architectures
cifar/lenet for testing
Pros
 + Training time  20 min
 + Can be done in parallel
 + therefore lots of experiments
Cons
 - Not complex enough to test some things (e.g. BatchNorm)
 - Can therefore lead to wrong conclusions about "bad" things (e.g.
random rotations hurt CifarNets but help VGGNets)
 - Or about "good" things (e.g. stochastic pooling helps CifarNets but
not VGGNets)
We need to go deeper
googlenet
GoogLeNet architecture13
13C. Szegedy et al. “Going Deeper with Convolutions”. In: ArXiv e-prints (2014).
arXiv: 1409.4842 [cs.CV].
googlenet
22 layers, but a simple building block: "Inception"
internal ensemble
Take the mean of all auxiliary classifiers' predictions instead of just throwing them away
Table 6: GoogLeNet, validation loss
Name Public LB
clf on inc3 0.722
clf on inc4a 0.754
clf on inc4b 0.757
clf on inc5b 0.855
average 0.693
Table 7: VGGNet, validation loss
Name Public LB
clf on pool4 0.762
clf on pool5 0.657
clf on fc7 0.707
average 0.630
14
14J. Xie, B. Xu, and Z. Chuang. “Horizontal and Vertical Ensemble with Deep
Representation for Classification”. In: ArXiv e-prints (2013). arXiv: 1306.2759
[cs.LG].
googlenet-results
Table 8: GoogLeNet, 64x64 input image, Leaky ReLU (unless stated otherwise),
AlexNet-oversample
Name Public LB
No inv, scale, ReLU, last-clf 0.910
No inv, scale, ReLU 0.859
No inv, scale 0.816
No inv scale, maxout-clf 0.785
Inv, scale, maxout-clf, retrain 0.703
96x96, inv, scale, maxout-clf, retrained, no-aug-ft15 0.684
112x112, inv, scale, maxout-clf, retrained, no-aug-ft 0.716
48x48, inv, scale, maxout-clf, retrained, no-aug-ft + test rot 0.749
96x96, inv, scale, maxout-clf, retrained, no-aug-ft + test rot 0.679
48x48+96x96+112x112, inv, scale, maxout-clf, retrained, no-aug-ft 0.677
15Ben Graham’s trick: fine-tune the converged model for 1-5 epochs without
data augmentation and with a small learning rate.
http://blog.kaggle.com/2015/01/02/cifar-10-competition-winners-interviews-with-dr-ben-graham-phil-culliton-zygmu
vggnet
VGGNet architectures16
Differences: Dropout in conv-layers (0.3), SPP-pooling for pool5, LeakyReLU,
aux. clf.
16K. Simonyan and A. Zisserman. “Very Deep Convolutional Networks for Large-Scale
Image Recognition”. In: ArXiv e-prints (Sept. 2014). arXiv: 1409.1556 [cs.CV].
spatial pyramid pooling
17
17K. He et al. “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual
Recognition”. In: ArXiv e-prints (2014). arXiv: 1406.4729 [cs.CV].
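A minimal NumPy sketch of spatial pyramid pooling: a feature map of any spatial size is max-pooled over 1x1, 2x2, and 4x4 grids and the results are concatenated into a fixed-length vector. The pyramid levels and the function name are assumptions for illustration, not the settings used for pool5 in this deck, and each spatial dimension must be at least as large as the finest grid.

```python
import numpy as np

def spp(feature_map, levels=(1, 2, 4)):
    """feature_map: (channels, height, width) -> vector of length C * sum(n*n)."""
    c, h, w = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # bin i covers rows floor(i*h/n) .. ceil((i+1)*h/n)
            r0, r1 = (i * h) // n, -((-(i + 1) * h) // n)
            for j in range(n):
                c0, c1 = (j * w) // n, -((-(j + 1) * w) // n)
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
print(spp(rng.random((3, 5, 7))).shape)   # same length ...
print(spp(rng.random((3, 8, 8))).shape)   # ... for a different input size
```

The fixed output length is what lets a fully connected layer follow a convolutional stack fed with differently sized inputs.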
vggnet-results
Table 9: VGGNet, 64x64 input image, Leaky ReLU (unless stated otherwise),
AlexNet-oversample, no-SPP
Name Public LB
No inv, scale, ReLU, fc-maxout 0.752
Inv, scale, single random crop 0.773
Inv, scale, 50 random crops 0.751
Inv, scale 0.729
Inv, scale, retrained 0.720
Inv, scale, fc-maxout 0.662
Inv, scale, fc-maxout, SPP 0.654
All VGGNets Mix 0.650
sparseconvnet
 – 0.79 LB Score
 – Unusual library
 – C2 instead of C3 convolution
 – Only padding - for input image
 – Kaggle CIFAR-10 winning architecture
320C2 - 320C2 - MP2 -
640C2 - 10% dropout - 640C2 - 10% dropout - MP2 -
960C2 - 20% dropout - 960C2 - 20% dropout - MP2 -
1280C2 - 30% dropout - 1280C2 - 30% dropout - MP2 -
1600C2 - 40% dropout - 1600C2 - 40% dropout - MP2 -
1920C2 - 50% dropout - 1920C1 - 50% dropout - 121C1 - Softmax output
ensemble-results
Table 10: Different mixes of all models (3 GoogLeNets, 4 VGGNets, 1
SparseConvNet)
Name Public LB Private LB
4 VGG 0.650 0.651
3 VGG, 1 GLN 0.625 0.629
4 VGG, 3 GLN 0.617 0.618
4 VGG, 3 GLN, 1 Sparse 0.611 0.616
4 VGG, 3 GLN, 1 Sparse, figure-skating 0.609 0.613
misc
batchnorm
Works for CIFAR, but made no big difference for VGGNet on the Kaggle
National Data Science Bowl for me. It does work for other people, e.g.
Jae Hyun Lim18 (22nd place).
18https://github.com/lim0606/ndsb
what else seems to work here
 – Retrain top layers with a different non-linearity (cheap diversity)
 – Figure-skating average: throw away the max and min predictions (0.003
LB score gain)
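The figure-skating average can be sketched as a trimmed mean across models. Assumptions made here for illustration: predictions are stacked as an array of shape (models, samples, classes), the highest and lowest value are dropped independently for every sample and class, and the result is renormalised so each row sums to 1.

```python
import numpy as np

def figure_skating_average(preds):
    """Drop the max and min prediction per entry, average the remainder."""
    preds = np.asarray(preds, dtype=float)
    if preds.shape[0] < 3:
        raise ValueError("need at least three models to trim max and min")
    trimmed = np.sort(preds, axis=0)[1:-1]   # discard lowest and highest
    mean = trimmed.mean(axis=0)
    # Renormalisation is an assumption; trimming can break the sum-to-one property.
    return mean / mean.sum(axis=-1, keepdims=True)

# Four made-up models, one sample, two classes.
preds = [[[0.1, 0.9]], [[0.2, 0.8]], [[0.3, 0.7]], [[0.9, 0.1]]]
print(figure_skating_average(preds))
```

In this toy case the outlying fourth model is ignored, which is the point of the trick: one badly miscalibrated ensemble member cannot drag the average.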
what does not seem to work here
 – Dense SIFT + BOW / Fisher Vector: ~60% accuracy
 – Random forest on CNN features: ~65% accuracy
 – Mix of hinge and cross-entropy losses
 – Averaging with a mean other than the arithmetic mean
 – Image enhancement or preprocessing (histogram equalization, etc.)
winner’s solution highlights
team work
 – Roll-pool
 – Hand-engineered features
 – RMS-Pool
 – Knowledge distillation
19
19http://benanne.github.io/2015/03/17/plankton.html
Questions?
thanks
This nice presentation theme is taken from
github.com/matze/mtheme
The theme itself is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License.

More Related Content

What's hot

Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Seonho Park
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationGioele Ciaparrone
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)Fellowship at Vodafone FutureLab
 
ujava.org Deep Learning with Convolutional Neural Network
ujava.org Deep Learning with Convolutional Neural Network ujava.org Deep Learning with Convolutional Neural Network
ujava.org Deep Learning with Convolutional Neural Network 신동 강
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksHannes Hapke
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationYogendra Tamang
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural NetworkJunho Cho
 
Deep Learning Tutorial
Deep Learning Tutorial Deep Learning Tutorial
Deep Learning Tutorial Ligeng Zhu
 
#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language ProcessingBerlin Language Technology
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)Fellowship at Vodafone FutureLab
 
Deep Learning behind Prisma
Deep Learning behind PrismaDeep Learning behind Prisma
Deep Learning behind Prismalostleaves
 
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...GeeksLab Odessa
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsMathias Niepert
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Vincenzo Lomonaco
 
Tutorial on convolutional neural networks
Tutorial on convolutional neural networksTutorial on convolutional neural networks
Tutorial on convolutional neural networksHojin Yang
 
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...Alex Conway
 
Neuroevolution and deep learing
Neuroevolution and deep learing Neuroevolution and deep learing
Neuroevolution and deep learing Accenture
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural NetworksYogendra Tamang
 
TensorFlow Tutorial Part2
TensorFlow Tutorial Part2TensorFlow Tutorial Part2
TensorFlow Tutorial Part2Sungjoon Choi
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketakiKetaki Patwari
 

What's hot (20)

Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentation
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
ujava.org Deep Learning with Convolutional Neural Network
ujava.org Deep Learning with Convolutional Neural Network ujava.org Deep Learning with Convolutional Neural Network
ujava.org Deep Learning with Convolutional Neural Network
 
Introduction to Convolutional Neural Networks
Introduction to Convolutional Neural NetworksIntroduction to Convolutional Neural Networks
Introduction to Convolutional Neural Networks
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
 
Convolutional Neural Network
Convolutional Neural NetworkConvolutional Neural Network
Convolutional Neural Network
 
Deep Learning Tutorial
Deep Learning Tutorial Deep Learning Tutorial
Deep Learning Tutorial
 
#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing#4 Convolutional Neural Networks for Natural Language Processing
#4 Convolutional Neural Networks for Natural Language Processing
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Deep Learning behind Prisma
Deep Learning behind PrismaDeep Learning behind Prisma
Deep Learning behind Prisma
 
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...AI&BigData Lab. Артем Чернодуб  "Распознавание изображений методом Lazy Deep ...
AI&BigData Lab. Артем Чернодуб "Распознавание изображений методом Lazy Deep ...
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for Graphs
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
 
Tutorial on convolutional neural networks
Tutorial on convolutional neural networksTutorial on convolutional neural networks
Tutorial on convolutional neural networks
 
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
 
Neuroevolution and deep learing
Neuroevolution and deep learing Neuroevolution and deep learing
Neuroevolution and deep learing
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
TensorFlow Tutorial Part2
TensorFlow Tutorial Part2TensorFlow Tutorial Part2
TensorFlow Tutorial Part2
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 

Viewers also liked

Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNNShuai Zhang
 
Deep Learning - Convolutional Neural Networks - Architectural Zoo
Deep Learning - Convolutional Neural Networks - Architectural ZooDeep Learning - Convolutional Neural Networks - Architectural Zoo
Deep Learning - Convolutional Neural Networks - Architectural ZooChristian Perone
 
PyCon 2016: Personalised emails with Spark and Python
PyCon 2016:  Personalised emails  with Spark and PythonPyCon 2016:  Personalised emails  with Spark and Python
PyCon 2016: Personalised emails with Spark and PythonTomas Sirny
 
Intrusion Detection with Neural Networks
Intrusion Detection with Neural NetworksIntrusion Detection with Neural Networks
Intrusion Detection with Neural Networksantoniomorancardenas
 
DeepFix: a fully convolutional neural network for predicting human fixations...
DeepFix:  a fully convolutional neural network for predicting human fixations...DeepFix:  a fully convolutional neural network for predicting human fixations...
DeepFix: a fully convolutional neural network for predicting human fixations...Universitat Politècnica de Catalunya
 
Convolutional neural networks for sentiment classification
Convolutional neural networks for sentiment classificationConvolutional neural networks for sentiment classification
Convolutional neural networks for sentiment classificationYunchao He
 
Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Keunwoo Choi
 
Deep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewDeep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewKeunwoo Choi
 
101: Convolutional Neural Networks
101: Convolutional Neural Networks 101: Convolutional Neural Networks
101: Convolutional Neural Networks Mad Scientists
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoSeongwon Hwang
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1ananth
 
CNN for Text Classification
CNN for Text ClassificationCNN for Text Classification
CNN for Text ClassificationEmory NLP
 
Convolution Neural Networks
Convolution Neural NetworksConvolution Neural Networks
Convolution Neural NetworksAhmedMahany
 
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...Deep Learning JP
 
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010Search First Migration - Using SharePoint 2013 Search for SharePoint 2010
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010Bob German
 
Architecting Your Content For the Unknown Consumer
Architecting Your Content For the Unknown ConsumerArchitecting Your Content For the Unknown Consumer
Architecting Your Content For the Unknown ConsumerRichard Jones
 
Ipsos MORI Political Monitor: August 2016
Ipsos MORI Political Monitor: August 2016Ipsos MORI Political Monitor: August 2016
Ipsos MORI Political Monitor: August 2016Ipsos UK
 

Viewers also liked (20)

Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 
Deep Learning - Convolutional Neural Networks - Architectural Zoo
Deep Learning - Convolutional Neural Networks - Architectural ZooDeep Learning - Convolutional Neural Networks - Architectural Zoo
Deep Learning - Convolutional Neural Networks - Architectural Zoo
 
PyCon 2016: Personalised emails with Spark and Python
PyCon 2016:  Personalised emails  with Spark and PythonPyCon 2016:  Personalised emails  with Spark and Python
PyCon 2016: Personalised emails with Spark and Python
 
Intrusion Detection with Neural Networks
Intrusion Detection with Neural NetworksIntrusion Detection with Neural Networks
Intrusion Detection with Neural Networks
 
DeepFix: a fully convolutional neural network for predicting human fixations...
DeepFix:  a fully convolutional neural network for predicting human fixations...DeepFix:  a fully convolutional neural network for predicting human fixations...
DeepFix: a fully convolutional neural network for predicting human fixations...
 
GoogLeNet Insights
GoogLeNet InsightsGoogLeNet Insights
GoogLeNet Insights
 
Convolutional neural networks for sentiment classification
Convolutional neural networks for sentiment classificationConvolutional neural networks for sentiment classification
Convolutional neural networks for sentiment classification
 
Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24
 
Deep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewDeep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - Overview
 
101: Convolutional Neural Networks
101: Convolutional Neural Networks 101: Convolutional Neural Networks
101: Convolutional Neural Networks
 
Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in Theano
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1
 
CNN for Text Classification
CNN for Text ClassificationCNN for Text Classification
CNN for Text Classification
 
Convolution Neural Networks
Convolution Neural NetworksConvolution Neural Networks
Convolution Neural Networks
 
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...
[DL輪読会]Combining Fully Convolutional and Recurrent Neural Networks for 3D Bio...
 
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010Search First Migration - Using SharePoint 2013 Search for SharePoint 2010
Search First Migration - Using SharePoint 2013 Search for SharePoint 2010
 
Strategy Wars
Strategy WarsStrategy Wars
Strategy Wars
 
Architecting Your Content For the Unknown Consumer
Architecting Your Content For the Unknown ConsumerArchitecting Your Content For the Unknown Consumer
Architecting Your Content For the Unknown Consumer
 
Ipsos MORI Political Monitor: August 2016
Ipsos MORI Political Monitor: August 2016Ipsos MORI Political Monitor: August 2016
Ipsos MORI Political Monitor: August 2016
 

Similar to Convolutional neural networks for image classification — evidence from Kaggle National Data Science Bowl

Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOleg Mygryn
 
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)
Deep convnets for global recognition (Master in Computer Vision Barcelona 2016)Universitat Politècnica de Catalunya
 
San Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep LearningSan Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep LearningSri Ambati
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through ExamplesSri Ambati
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614Sri Ambati
 
H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14Sri Ambati
 
DIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe WorkshopDIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe Workshopodsc
 
Convolution Neural Network Lecture Slides
Convolution Neural Network Lecture SlidesConvolution Neural Network Lecture Slides
Convolution Neural Network Lecture SlidesAdnanHaider234505
 
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsDiscovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsWee Hyong Tok
 
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Universitat Politècnica de Catalunya
 
H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215Sri Ambati
 
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...
Software Defined Visualization (SDVis): Get the Most Out of ParaView* with OS...Intel® Software
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...Sri Ambati
 
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...Herman Wu
 
Convolutional neural networks for image classification — evidence from Kaggle National Data Science Bowl

  • 1. convolutional neural networks for image classification: Evidence from Kaggle National Data Science Bowl. Dmytro Mishkin, ducha.aiki at gmail com. March 25, 2015. Czech Technical University in Prague
  • 2. kaggle national data science bowl overview. The image classification problem:
      - 130,400 test images
      - 30,336 train images
      - 1 channel (grayscale)
      - 121 (biased) classes
      - 90% of images ≤ 100x100 px
      - logloss score $= -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij} \log p_{ij}$
      - No external data
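The logloss score on the overview slide can be sketched in a few lines of plain Python. Here N is the number of images and M the number of classes; because y_ij is a one-hot indicator, only the probability given to the true class contributes. The function name `multiclass_logloss`, the clipping constant `eps` and the renormalisation step are assumptions made for this illustration, not details taken from the slides.

```python
import math

def multiclass_logloss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of true class indices, one per image.
    probs:  list of per-image probability lists (one entry per class).
    eps:    clipping constant (an assumption; it only avoids log(0)).
    """
    total = 0.0
    for label, p in zip(y_true, probs):
        # Clip so that a predicted probability of exactly 0 does not give log(0).
        clipped = [min(max(v, eps), 1.0 - eps) for v in p]
        # Renormalise so the clipped row still sums to 1.
        norm = sum(clipped)
        total += math.log(clipped[label] / norm)
    return -total / len(y_true)
```

A uniform guess over the 121 classes scores log(121), roughly 4.8, which is a useful baseline when reading the validation losses on the later slides.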
  • 6. lunch time chat at kth’s computer vision group
      A computer vision scientist: How long does it take to train these generic features on ImageNet?
      Hossein: 2 weeks
      Ali: almost 3 weeks, depending on the hardware
      The computer vision scientist: hmmmm...
      Stefan: Well, you have to compare the three weeks to the last 40 years of computer vision [2]
      [2] http://www.csc.kth.se/cvap/cvg/DL/ots/
  • 7. convolutional networks. CNNs are state of the art in the following fields of image recognition [3]:
      - Object Image Classification
      - Scene Image Classification
      - Action Image Classification
      - Object Detection
      - Semantic Segmentation
      - Fine-grained Recognition
      - Attribute Detection
      - Metric Learning
      - Instance Retrieval (almost)
      [3] CNNs beat classic computer vision methods on 19 datasets out of 20: http://www.csc.kth.se/cvap/cvg/DL/ots/
  • 8. contents
      1. Basics of convolutional networks
      2. Image preprocessing
      3. Network architectures
      4. Ensembling
      5. What seems to work and what does not
      6. Winner's solution highlights
  • 11. softmax classifier [5]
      Softmax (cross-entropy) loss: $L = -\log \frac{e^{f_{y_i}}}{\sum_j e^{f_j}}$
      SVM (hinge) loss: $L = \sum_{j \neq y_i} \max\left(0, f(x_i, W)_j - f(x_i, W)_{y_i} + \Delta\right)$
      [5] http://vision.stanford.edu/teaching/cs231n/linear-classify-demo/
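As a rough illustration of the two losses on the softmax-classifier slide, the sketch below evaluates both for a single example given its vector of class scores f. The function names and the default margin `delta=1.0` are assumptions for this example; the max-subtraction in the softmax loss is a standard numerical-stability step and does not change the value.

```python
import math

def softmax_loss(scores, y):
    """Cross-entropy loss for one example: -log(e^{f_y} / sum_j e^{f_j})."""
    m = max(scores)  # subtract the max score for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[y]

def hinge_loss(scores, y, delta=1.0):
    """Multiclass SVM loss: sum over j != y of max(0, f_j - f_y + delta)."""
    return sum(max(0.0, s - scores[y] + delta)
               for j, s in enumerate(scores) if j != y)
```

For scores [3.2, 5.1, -1.7] with true class 0, the hinge loss is max(0, 5.1 - 3.2 + 1) + max(0, -1.7 - 3.2 + 1) = 2.9: only the class that violates the margin contributes.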
  • 12. lenet-5. no other layers are necessary [6]. The idea was first proposed by LeCun [7] in 1989 and recently revived by Springenberg et al. in "All Convolutional Net" [8].
      [6] http://eblearn.sourceforge.net/beginner_tutorial2_train.html
      [7] https://www.facebook.com/yann.lecun/posts/10152766574417143
      [8] J. T. Springenberg et al. "Striving for Simplicity: The All Convolutional Net". In: ArXiv e-prints (2014). arXiv: 1412.6806 [cs.LG].
  • 13. non-linearities [Figure: activation vs. input for TanH, Sigmoid, ReLU, maxout (sort of) and LeakyReLU]
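The curves named on the non-linearities slide can be written down as scalar functions. This is a minimal sketch: the LeakyReLU slope of 0.01 and the two linear pieces used for the "sort of" maxout are assumptions chosen for illustration, since the slide does not state the values used in the plot.

```python
import math

def tanh(x):
    return math.tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # Small non-zero slope for negative inputs (slope value is an assumption).
    return x if x > 0 else slope * x

def maxout(x, pieces=((1.0, 0.0), (-0.5, 0.0))):
    # Maximum over several linear functions w * x + b; the pieces are hypothetical.
    return max(w * x + b for w, b in pieces)
```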
  • 14. regularization - dropout, weight decay [9]
      [9] Nitish Srivastava et al. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting". In: Journal of Machine Learning Research 15 (2014), pp. 1929–1958. url: http://jmlr.org/papers/v15/srivastava14a.html.
  • 15. deep learning libraries. Table 1: Popular deep learning GPU libraries
      Name | url | languages | Notes
      caffe | github.com/BVLC/caffe | C++/Python/no | largest community
      cxxnet | github.com/dmlc/cxxnet | C++/no | good memory management
      Theano | github.com/Theano/Theano | Python | huge flexibility
      Torch | facebook/fbcunn | lua | LeCun Facebook library
      cuda-convnet2 | code.google.com/p/cuda-convnet2/ | C++/python |
      SparseConvNet | http://tinyurl.com/pu65cfp | C++/CUDA | differs from others
  • 17. basic network architecture: 72x72x1 → Crop to 64x64 → 20C5 → MP2 → 50C5 → MP2 → 500IP → clf
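The notation on the basic-architecture slide (20C5 = 20 feature maps from 5x5 convolutions, MP2 = 2x2 max pooling, 500IP = inner-product layer with 500 units) can be checked with a little shape arithmetic. The sketch below assumes unpadded stride-1 convolutions and non-overlapping pooling, as in the classic LeNet setup; the slides do not state these settings, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (assumed: stride 1, no padding)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel):
    """Spatial output size of non-overlapping pooling (assumed stride == kernel)."""
    return size // kernel

def trace_shapes(size=64):
    """Return (layer, channels, spatial size) after each layer of the basic net."""
    layers = [("20C5", "conv", 20, 5), ("MP2", "pool", None, 2),
              ("50C5", "conv", 50, 5), ("MP2", "pool", None, 2)]
    channels = 1  # grayscale input
    shapes = [("crop", channels, size)]
    for name, kind, maps, kernel in layers:
        if kind == "conv":
            size, channels = conv_out(size, kernel), maps
        else:
            size = pool_out(size, kernel)
        shapes.append((name, channels, size))
    return shapes
```

Under these assumptions the 64x64 crop shrinks to 60, 30, 26 and finally 13 pixels per side, so the 500IP layer would see 50 x 13 x 13 = 8450 inputs.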
  • 18. basic data preprocessing. Table 2: 5-layer network experiments, 48x48 input image, no non-linearities, mean pixel subtraction
      Name, augmentation | Val logloss | Train logloss
      No mean subtraction, no scaling | – | –
      mirror | 1.67 | 0.64
      histeq, mirror | 1.74 | 0.64
      mirror + ReLU | 1.61 | 0.44
      mirror + scale | 1.42 | 0.937
      mirror + scale, LeakyReLU | 1.34 | 0.83
      mirror + rand rot | 1.53 | 1.31
  • 19. basic data preprocessing. Table 3: 5-layer network experiments, 48x48 input image, LeakyReLU non-linearities, mean pixel subtraction
      Name, augmentation | Val logloss | Train logloss
      mirror + scale | 1.34 | 0.83
      invert, mirror + scale | 1.27 | 0.80
      invert, norm, mirror + scale | 1.24 | 0.505
      invert, norm, mirror + scale, salt-pepper | 1.15 | n/a
  • 20. more geometric transformations. Table 4: 5-layer network experiments, 64x64 input image, LeakyReLU
      Name, augmentation | Val logloss
      mirror | 1.30
      mirror + scale (resize modes) | 1.12
      h+v mirror, scale | 1.10
      h+v mirror, scale + rot | 1.08
      mirror, lower base learning rate | 1.04 :)
      h+v mirror, scale + rot, depolar imgs | 1.28
  • 21. regularization methods. Table 5: 5-layer network experiments, 64x64 input image, LeakyReLU
      Name, augmentation | Val logloss
      h+v mirror, scale + rot, vanilla | 1.08
      h+v mirror, scale + rot, PReLU (but slows training down a lot) [10] | 1.03
      h+v mirror, scale + rot, BatchNorm [11] | 1.10
      h+v mirror, scale + rot, StochPool [12] | 0.98
      [10] K. He et al. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification". In: ArXiv e-prints (2015). arXiv: 1502.01852 [cs.CV].
      [11] S. Ioffe and C. Szegedy. "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift". In: ArXiv e-prints (2015). arXiv: 1502.03167 [cs.LG].
      [12] M. D. Zeiler and R. Fergus. "Stochastic Pooling for Regularization of Deep Convolutional Neural Networks". In: ArXiv e-prints (2013). arXiv: 1301.3557 [cs.LG].
  • 22. data augmentation - don't forget about it during test time
      for i = 0, 90, 180, 270 degrees rotation
        for 9 crops (N, NE, E, ...)
          get predictions for mirrored/non-mirrored
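The test-time augmentation loop from the slide above can be sketched in plain Python, with images as lists of rows. Four rotations, nine crops and two mirror states give 72 views per image. The helper names, the 3x3 crop grid used for the nine compass crops, and the final step of averaging the per-view predictions are assumptions; the slide only says to get predictions for each view.

```python
def rot90(img):
    """Rotate a 2-D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def mirror(img):
    """Flip a 2-D list of pixels horizontally."""
    return [row[::-1] for row in img]

def nine_crops(img, crop):
    """Corner, edge and centre crops on a 3x3 grid (assumed layout for N, NE, E, ...)."""
    h, w = len(img), len(img[0])
    ys = [0, (h - crop) // 2, h - crop]
    xs = [0, (w - crop) // 2, w - crop]
    return [[row[x:x + crop] for row in img[y:y + crop]] for y in ys for x in xs]

def tta_views(img, crop):
    """All 4 rotations x 9 crops x mirrored/non-mirrored views of one image."""
    views = []
    rotated = img
    for _ in range(4):  # 0, 90, 180, 270 degrees
        for c in nine_crops(rotated, crop):
            views.append(c)
            views.append(mirror(c))
        rotated = rot90(rotated)
    return views

def predict_tta(predict, img, crop):
    """Average class probabilities over all views (averaging is an assumption)."""
    preds = [predict(v) for v in tta_views(img, crop)]
    n_classes = len(preds[0])
    return [sum(p[k] for p in preds) / len(preds) for k in range(n_classes)]
```

Here `predict` stands for any callable that maps one cropped image to a list of class probabilities, for example a wrapper around a trained network.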
  • 24. cifar/lenet for testing
      Pros:
      + Training time is 20 min
      + Can be done in parallel
      + Therefore, lots of experiments
      Cons:
      - Not complex enough to check some things (e.g. BatchNorm)
      - May therefore lead to wrong conclusions about "bad" things (e.g. random rotations hurt CifarNets but help VGGNets)
      - Or about "good" things (e.g. stochastic pooling helps CifarNets but not VGGNets)
  • 25. We need to go deeper
  • 26. googlenet. GoogLeNet architecture [13]
      [13] C. Szegedy et al. "Going Deeper with Convolutions". In: ArXiv e-prints (2014). arXiv: 1409.4842 [cs.CV].
  • 27. googlenet. 22 layers, but a simple basic building block: "Inception"
  • 28. internal ensemble. Take the mean of all auxiliary classifiers instead of just throwing them away [14]
      Table 6: GoogLeNet, validation loss
      Name | Public LB
      clf on inc3 | 0.722
      clf on inc4a | 0.754
      clf on inc4b | 0.757
      clf on inc5b | 0.855
      average | 0.693
      Table 7: VGGNet, validation loss
      Name | Public LB
      clf on pool4 | 0.762
      clf on pool5 | 0.657
      clf on fc7 | 0.707
      average | 0.630
      [14] J. Xie, B. Xu, and Z. Chuang. "Horizontal and Vertical Ensemble with Deep Representation for Classification". In: ArXiv e-prints (2013). arXiv: 1306.2759 [cs.LG].
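The internal ensemble amounts to averaging the class probabilities produced by every classifier head attached to one network. A minimal sketch, assuming each head already outputs a probability vector for the same image and that all heads get equal weight (the slide says only "mean"); the function name is hypothetical.

```python
def average_heads(head_probs):
    """Equal-weight mean of the probability vectors from several classifier heads.

    head_probs: list with one probability list per head (main + auxiliary).
    """
    n_heads = len(head_probs)
    n_classes = len(head_probs[0])
    return [sum(h[k] for h in head_probs) / n_heads for k in range(n_classes)]
```

Because each head's vector sums to 1, the averaged vector also sums to 1 and can be scored with logloss directly.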
  • 29. googlenet-results. Table 8: GoogLeNet, 64x64 input image, Leaky ReLU (unless stated otherwise), AlexNet-oversample
      Name | Public LB
      No inv, scale, ReLU, last-clf | 0.910
      No inv, scale, ReLU | 0.859
      No inv, scale | 0.816
      No inv, scale, maxout-clf | 0.785
      Inv, scale, maxout-clf, retrain | 0.703
      96x96, inv, scale, maxout-clf, retrained, no-aug-ft [15] | 0.684
      112x112, inv, scale, maxout-clf, retrained, no-aug-ft | 0.716
      48x48, inv, scale, maxout-clf, retrained, no-aug-ft + test rot | 0.749
      96x96, inv, scale, maxout-clf, retrained, no-aug-ft + test rot | 0.679
      48x48+96x96+112x112, inv, scale, maxout-clf, retrained, no-aug-ft | 0.677
      [15] Ben Graham's trick: fine-tune a converged model for 1-5 epochs without data augmentation and with a small learning rate. http://blog.kaggle.com/2015/01/02/cifar-10-competition-winners-interviews-with-dr-ben-graham-phil-culliton-zygmu
  • 30. vggnet. VGGNet architectures [16]. Differences: dropout in conv layers (0.3), SPP pooling for pool5, LeakyReLU, auxiliary classifiers.
      [16] K. Simonyan and A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition". In: ArXiv e-prints (Sept. 2014). arXiv: 1409.1556 [cs.CV].
  • 31. spatial pyramid pooling [17]
      [17] K. He et al. "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". In: ArXiv e-prints (2014). arXiv: 1406.4729 [cs.CV].
  • 32. vggnet-results. Table 9: VGGNet, 64x64 input image, Leaky ReLU (unless stated otherwise), AlexNet-oversample, no-SPP
      Name | Public LB
      No inv, scale, ReLU, fc-maxout | 0.752
      Inv, scale, single random crop | 0.773
      Inv, scale, 50 random crops | 0.751
      Inv, scale | 0.729
      Inv, scale, retrained | 0.720
      Inv, scale, fc-maxout | 0.662
      Inv, scale, fc-maxout, SPP | 0.654
      All VGGNets Mix | 0.650
  • 33. sparseconvnet
      - 0.79 LB score
      - Unusual library
      - C2 instead of C3 convolution
      - Padding only for the input image
      - Kaggle CIFAR-10 winning architecture:
        320C2 - 320C2 - MP2 - 640C2 - 10% dropout - 640C2 - 10% dropout - MP2 - 960C2 - 20% dropout - 960C2 - 20% dropout - MP2 - 1280C2 - 30% dropout - 1280C2 - 30% dropout - MP2 - 1600C2 - 40% dropout - 1600C2 - 40% dropout - MP2 - 1920C2 - 50% dropout - 1920C1 - 50% dropout - 121C1 - Softmax output
  • 34. ensemble-results. Table 10: Different mixes of all models (3 GoogLeNets, 4 VGGNets, 1 SparseConvNet)
      Name | Public LB | Private LB
      4 VGG | 0.650 | 0.651
      3 VGG, 1 GLN | 0.625 | 0.629
      4 VGG, 3 GLN | 0.617 | 0.618
      4 VGG, 3 GLN, 1 Sparse | 0.611 | 0.616
      4 VGG, 3 GLN, 1 Sparse, figure-skating | 0.609 | 0.613
  • 36. batchnorm. Works for CIFAR, but made no big difference for VGGNet in the Kaggle NDSB for me. However, it works for other people, e.g. Jae Hyun Lim [18], 22nd place.
      [18] https://github.com/lim0606/ndsb
  • 37. what else seems to work here
      - Retrain top layers with a different non-linearity (cheat diversity)
      - Figure-skating average: throw away the max and min predictions (about 0.003 LB score gain)
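The figure-skating average can be read as a trimmed mean across models: for each class, drop the single highest and the single lowest prediction, then average what is left. The sketch below follows that reading; applying the trimming per class and renormalising the result to sum to 1 are assumptions, since the slides give only the one-line description.

```python
def figure_skating_average(model_probs):
    """Per-class mean over models after discarding the max and min prediction.

    model_probs: list with one probability list per model, all for the same image.
    """
    if len(model_probs) < 3:
        raise ValueError("need at least 3 models to drop both max and min")
    n_classes = len(model_probs[0])
    averaged = []
    for k in range(n_classes):
        votes = sorted(p[k] for p in model_probs)
        trimmed = votes[1:-1]  # drop the lowest and the highest vote
        averaged.append(sum(trimmed) / len(trimmed))
    # Renormalise so the result is a probability vector again (an assumption).
    total = sum(averaged)
    return [v / total for v in averaged]
```

With four models voting 0.1, 0.5, 0.6 and 0.9 for a class, the outliers 0.1 and 0.9 are discarded and the class gets 0.55 before renormalisation.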
  • 38. what seems not to work here
      - Dense SIFT + BOW / Fisher Vector: ~60% accuracy
      - Random forest on CNN features: ~65% accuracy
      - Mix of hinge and cross-entropy losses
      - Averaging with a mean other than the arithmetic mean
      - Image enhancement or preprocessing (histogram equalization, etc.)
  • 40. team work [19]
      - Roll-pool
      - Hand-engineered features
      - RMS-Pool
      - Knowledge distillation
      [19] http://benanne.github.io/2015/03/17/plankton.html
  • 42. thanks. This nice presentation theme is taken from github.com/matze/mtheme. The theme itself is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.