Tuning CNN: Tips & Tricks
Dmytro Panchenko
Machine learning engineer, Altexsoft
Workshop setup
1. Clone code from https://github.com/hokmund/cnn-tips-and-tricks
2. Download data and checkpoints from http://tiny.cc/4flryy
3. Extract them from the archive and place under src/ in the source
code folder
4. Run pip install -r requirements.txt
Agenda
1. Workshop setup
2. Transfer learning
3. Learning curves interpretation
4. Learning rate management & cyclic learning rate
5. Augmentations
6. Dealing with imbalanced classification
7. TTA
8. Pseudolabeling
Exploratory data analysis
data-analysis.ipynb
Exploratory data analysis
• Real-world images of various goods.
• Different occlusions, illumination, etc.
• Most items are centered in the picture.
• Some classes are extremely similar to each other.
Exploratory data analysis
Dataset split
• Validation set is used for hyperparameter tuning.
• Test set is used for the final evaluation of the tuned model.
• Train set: 37184 samples (imbalanced).
• Validation set: 12800 samples (balanced).
• Test set: 25600 samples (balanced).
Transfer learning
Transfer learning: using a CNN pre-trained on a very large dataset
instead of training one from scratch.
Transfer learning
Datasets are similar:
• You have little data: train a classifier (usually logistic regression or an MLP) on bottleneck features.
• You have a lot of data: fine-tune several or all layers.
Datasets are different:
• You have little data: train a classifier on deep features of the CNN.
• You have a lot of data: fine-tune all layers (use pre-trained weights as an initialization for your CNN).
Fine-tuning pre-trained CNN
fine-tuning.ipynb
Learning curve
Underfitting (accuracy still improves, so you probably need a higher
learning rate and more training epochs)
Learning curve
Underfitting (accuracy doesn't improve, so you need a deeper network)
Learning curve
Overfitting (train accuracy increases while validation accuracy gets
worse, so you need to add regularization or enlarge the dataset if
possible)
Learning curve
Overfitting with oscillations (the network became unstable after
several epochs; you need to decrease the learning rate during training)
Learning curve
Almost perfect learning
curve
Tuning more layers
fine-tuning.ipynb
Learning rate strategies
Time-based decay:
$$\mathit{lr} = \mathit{lr} \cdot \frac{1}{1 + \mathit{decay} \cdot \mathit{epoch}}$$
This is the decay scheme built into Keras optimizers (the decay
argument).
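As a rough illustration of the time-based decay formula, the sketch below computes the learning rate per epoch. It assumes that the lr on the right-hand side is the initial learning rate and that epochs are counted from 0; the function name and the example values are hypothetical and not taken from the workshop code.

```python
def time_based_decay(initial_lr: float, decay: float, epoch: int) -> float:
    """Learning rate for a 0-based epoch under time-based decay.

    Assumption: the decay is always applied to the initial learning rate.
    """
    return initial_lr / (1.0 + decay * epoch)


# Example: print the schedule for the first few epochs (illustrative values).
for epoch in range(5):
    print(epoch, time_based_decay(initial_lr=0.1, decay=0.5, epoch=epoch))
```

Larger decay values shrink the learning rate faster; with decay = 0 the rate stays constant.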
Learning rate strategies
Step decay:
$$\mathit{lr} = \mathit{lr}_{\mathit{start}} \cdot \frac{1}{1 - \mathit{decay} \cdot \mathit{drop}}, \qquad \mathit{drop} = \frac{\mathit{epoch}}{\mathit{step}}$$
Reducing learning rate on plateau
Reduce the learning rate whenever the validation metric stops improving
(can be combined with the previously discussed strategies).
Keras implementation: the ReduceLROnPlateau callback.
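To make the idea concrete, here is a minimal framework-free sketch of reduce-on-plateau logic. It is not the Keras implementation; the class name, the parameters (factor, patience, min_lr) and their defaults are assumptions chosen for illustration, and the metric is assumed to be one where higher is better (for example validation accuracy).

```python
class PlateauReducer:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr: float, factor: float = 0.5, patience: int = 2, min_lr: float = 1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("-inf")  # best validation metric seen so far
        self.wait = 0              # epochs since the last improvement

    def step(self, val_metric: float) -> float:
        """Report this epoch's validation metric and get the learning rate to use next."""
        if val_metric > self.best:
            self.best = val_metric
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


# Example with made-up validation accuracies: the rate drops once the metric stalls.
reducer = PlateauReducer(lr=0.1, factor=0.5, patience=2)
schedule = [reducer.step(acc) for acc in [0.5, 0.6, 0.6, 0.6]]
print(schedule)
```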
Cyclic learning rate
Learning rate increases and
decreases in a cycle.
Upper bound of the cycle can be
static or can decrease with time.
Upper bound is selected by LR
finder algorithm.
Lower bound is chosen to be 1-2
orders of magnitude less than
upper bound.
Original paper - https://arxiv.org/abs/1506.01186
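The triangular policy from the paper linked above can be sketched in a few lines. The function below follows the paper's formula for the triangular schedule; the optional halving of the amplitude after each cycle corresponds to an upper bound that decreases with time. The function name, the halve_each_cycle flag and the example values are illustrative assumptions.

```python
import math


def triangular_clr(iteration: int, base_lr: float, max_lr: float,
                   step_size: int, halve_each_cycle: bool = False) -> float:
    """Cyclic learning rate for a 0-based training iteration.

    step_size is the number of iterations in half a cycle
    (the rate climbs for step_size iterations, then falls for step_size).
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    amplitude = (max_lr - base_lr) * max(0.0, 1.0 - x)
    if halve_each_cycle:
        amplitude /= 2 ** (cycle - 1)  # upper bound shrinks with every cycle
    return base_lr + amplitude


# Example: one full cycle with a half-cycle of 4 iterations (illustrative values).
for it in range(9):
    print(it, triangular_clr(it, base_lr=0.001, max_lr=0.01, step_size=4))
```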
Learning rate finder
1. Select reasonably small lower
bound (e.g. 1e-6)
2. Usually, 1e0 is a good choice
for an upper bound
3. Increase learning rate
exponentially
4. Plot smoothed loss vs LR
5. Select a point slightly lower
than the global minimum
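The steps above can be sketched without any framework. The helpers below build the exponential learning rate sweep, smooth the recorded losses with an exponential moving average, and suggest a rate somewhat below the one at the loss minimum. The loss values are expected to come from training on one mini-batch per learning rate, which is not shown here; all function names, the smoothing factor beta and the backoff divisor are assumptions for illustration.

```python
from typing import List


def lr_sweep(lower: float, upper: float, num_steps: int) -> List[float]:
    """Exponentially spaced learning rates from lower to upper (inclusive)."""
    ratio = upper / lower
    return [lower * ratio ** (i / (num_steps - 1)) for i in range(num_steps)]


def smooth(values: List[float], beta: float = 0.9) -> List[float]:
    """Exponential moving average with bias correction."""
    avg, out = 0.0, []
    for i, value in enumerate(values, start=1):
        avg = beta * avg + (1.0 - beta) * value
        out.append(avg / (1.0 - beta ** i))
    return out


def suggest_lr(lrs: List[float], losses: List[float],
               beta: float = 0.9, backoff: float = 3.0) -> float:
    """Pick the rate at the minimum of the smoothed loss, then step back a little.

    backoff is a hypothetical safety divisor; the slide only says "slightly lower".
    """
    smoothed = smooth(losses, beta)
    best = min(range(len(lrs)), key=smoothed.__getitem__)
    return lrs[best] / backoff


# Example with made-up losses recorded during a sweep from 1e-6 to 1e0.
lrs = lr_sweep(1e-6, 1.0, 7)
losses = [2.0, 1.9, 1.5, 1.0, 0.8, 1.5, 5.0]
print(suggest_lr(lrs, losses, beta=0.0, backoff=2.0))
```

In practice the sweep is usually stopped early once the loss starts to diverge, so that the exploding tail does not dominate the plot.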
Snapshot ensemble
Source - https://arxiv.org/pdf/1704.00109.pdf
Learning rate finder and CLR
fine-tuning.ipynb
Augmentation
• Augmentation increases dataset size by applying natural transformations to images.
• Useful strategy:
  • Start with soft augmentation.
  • Make it harsher with time.
  • If the dataset is big enough, finish training with several epochs with soft augmentation / without any.
Implementation:
https://github.com/albu/albumentations
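One possible way to express the "soft, then harsher, then soft again" strategy is a per-epoch strength value that the augmentation pipeline reads. The sketch below is an assumption, not the workshop's implementation: the function name, the linear ramp and the default values are all hypothetical, and mapping the strength onto concrete transform probabilities or magnitudes is left to the pipeline.

```python
def augmentation_strength(epoch: int, total_epochs: int, soft: float = 0.2,
                          harsh: float = 1.0, cooldown_epochs: int = 3) -> float:
    """Augmentation strength for a 0-based epoch.

    Ramps linearly from soft to harsh, then returns to soft for the
    last cooldown_epochs epochs of training.
    """
    ramp_epochs = total_epochs - cooldown_epochs
    if epoch >= ramp_epochs or ramp_epochs <= 1:
        return soft  # final epochs (or no room for a ramp): soft augmentation
    progress = epoch / (ramp_epochs - 1)
    return soft + (harsh - soft) * progress


# Example: 13 epochs, the last 3 with soft augmentation (illustrative values).
for epoch in range(13):
    print(epoch, round(augmentation_strength(epoch, total_epochs=13), 3))
```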
Tuning whole network
fine-tuning.ipynb
Dealing with imbalanced train set
Common ways to deal with imbalanced classification are upsampling and
downsampling. In deep learning there is also the option of a weighted loss.
Weighted loss example:
Class A has 1000 samples.
Class B has 2000 samples.
Class C has 400 samples.
The weights in this example are consistent with weighting each class by
(largest class size) / (class size): 2000/1000 = 2 for A, 2000/2000 = 1
for B and 2000/400 = 5 for C.
Overall loss:
$$\mathit{loss} = \frac{\sum_{\mathit{class}=0}^{n} \mathit{weight}_{\mathit{class}} \cdot \mathit{loss}_{\mathit{class}}}{\sum_{\mathit{class}=0}^{n} \mathit{weight}_{\mathit{class}}} = \frac{2 \cdot \mathit{loss}_A + \mathit{loss}_B + 5 \cdot \mathit{loss}_C}{8}$$
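The worked example can be reproduced with a short sketch. It assumes that each class weight is the size of the largest class divided by the size of that class, which matches the weights 2, 1 and 5 in the example; the function names and the per-class loss values are made up for illustration.

```python
from typing import Dict


def class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by (largest class size) / (class size)."""
    largest = max(counts.values())
    return {name: largest / n for name, n in counts.items()}


def weighted_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the per-class losses."""
    total_weight = sum(weights[name] for name in losses)
    return sum(weights[name] * losses[name] for name in losses) / total_weight


# Sample counts from the example; the per-class losses are illustrative.
weights = class_weights({"A": 1000, "B": 2000, "C": 400})
print(weights)
print(weighted_loss({"A": 0.5, "B": 0.2, "C": 1.0}, weights))
```

With these weights a mistake on the rare class C costs five times as much as a mistake on the most frequent class B.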
Weighted loss
fine-tuning.ipynb
Test-time augmentation
• One way to apply TTA is to use augmentations similar to training but softer.
• Simpler strategies:
  • Only flips
  • Flips + crops
• Caution: TTA increases inference time!
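A flips-only TTA can be sketched as averaging the model's predictions over the original image and its flipped copies. The code below uses nested lists as images and a toy predict function as a stand-in for a real model; every name in it is hypothetical. Note that the model is called once per view, which is where the extra inference time comes from.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]


def hflip(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def vflip(image: Image) -> Image:
    """Mirror the image top to bottom."""
    return image[::-1]


def predict_with_tta(predict: Callable[[Image], List[float]], image: Image,
                     transforms: Sequence[Callable[[Image], Image]]) -> List[float]:
    """Average class probabilities over the original image and its transformed views."""
    views = [image] + [transform(image) for transform in transforms]
    predictions = [predict(view) for view in views]  # one model call per view
    n_classes = len(predictions[0])
    return [sum(p[k] for p in predictions) / len(predictions) for k in range(n_classes)]


def toy_predict(image: Image) -> List[float]:
    """Stand-in for a model: scores 'left-heavy' against 'right-heavy' images."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    total = left + right
    return [left / total, right / total]


image = [[1.0, 0.0], [1.0, 0.0]]
print(predict_with_tta(toy_predict, image, [hflip]))
```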
Predictions with TTA
fine-tuning.ipynb
Semi-supervised approach
• Deep layers of a CNN learn very generic features.
• You can refine such feature extractors by training on unlabeled data.
• The most popular approach for such training is called pseudolabeling.
Pseudolabeling
1. Train a classifier on the initial training set.
2. Predict labels for the validation / test set with your classifier.
3. Optional: remove images with low-confidence labels.
4. Add the pseudolabeled data to your training set.
5. Use it to train a CNN from scratch (some kind of warmup) or to refine your previous classifier.
Source - https://www.analyticsvidhya.com/blog/2017/09/pseudo-labelling-semi-supervised-learning-technique/
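Steps 2 and 3 can be sketched as follows: take the predicted class probabilities for the unlabeled images, assign each image its most probable class, and keep only the confident ones. The function name and the 0.9 threshold are assumptions for illustration; the probabilities would come from the classifier trained in step 1.

```python
from typing import List, Sequence, Tuple


def pseudolabel(probabilities: Sequence[Sequence[float]],
                threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Return (sample index, predicted class) pairs for confident predictions only."""
    selected = []
    for index, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)  # most probable class
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


# Made-up predictions for three unlabeled images and two classes.
predictions = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]]
print(pseudolabel(predictions, threshold=0.9))
```

The selected pairs are then added to the training set (step 4); lowering the threshold keeps more images at the cost of noisier labels.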
Pseudolabeling constraints
1. The test dataset has a reasonable size (at least comparable to the training set).
2. The network trained on pseudolabels is deep enough (especially when the pseudolabels are generated by an ensemble of models).
3. Training data and pseudolabeled data are mixed in proportions of 1:2 to 1:4, respectively.
Using pseudolabeling
In competitions:
- Label test set with your ensemble;
- Train new model;
- Add it to the final ensemble.
In production:
- Collect as much data as possible (both labeled and unlabeled);
- Train model on labeled data;
- Apply pseudolabeling.
Pseudolabeling
pseudolabeling.ipynb
Summary
1. Train the network's head
2. Add the head to the convolutional part
3. Add augmentations and learning rate scheduling / CLR
4. Select an appropriate loss
5. Predict with test-time augmentations
6. If you don't have enough training data, apply pseudolabeling
7. Good luck!
Other tricks (out of scope)
• How to select a network architecture (size, regularization, pooling type, classifier structure)
• How to select an optimizer (Adam, RMSprop, etc.)
• Training on a higher resolution
• Hard sample mining
• Ensembling
Thank you for your attention
Questions are welcome
