Comparing Incremental Learning Strategies for Convolutional Neural Networks


In the last decade, Convolutional Neural Networks (CNNs) have been shown to perform incredibly well in many computer vision tasks such as object recognition and object detection, being able to extract meaningful high-level invariant features. However, partly because of their complex training and tricky hyper-parameter tuning, CNNs have been scarcely studied in the context of incremental learning, where data are available in consecutive batches and retraining the model from scratch is unfeasible. In this work we compare different incremental learning strategies for CNN-based architectures, targeting real-world applications.

If you are interested in this work, please cite:
Lomonaco, V., & Maltoni, D. (2016, September). Comparing Incremental Learning Strategies for Convolutional Neural Networks. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition (pp. 175-184). Springer International Publishing.

For further information visit my website: http://www.vincenzolomonaco.com/

Comparing Incremental Learning Strategies for Convolutional Neural Networks

  1. COMPARING INCREMENTAL LEARNING STRATEGIES FOR CONVOLUTIONAL NEURAL NETWORKS Vincenzo Lomonaco & Davide Maltoni {vincenzo.lomonaco, davide.maltoni}@unibo.it Department of Computer Science and Engineering – DISI University of Bologna
  2. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  3. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  4. INTRODUCTION – CNNs and Current Limitations • State-of-the-art for many tasks in CV, NLP, SR, etc. • Very general and adaptive • Works directly on raw data (no hand-engineered features required) • Computationally demanding • Tricky hyper-parametrization • Applicability in an incremental learning scenario?
  5. INTRODUCTION – Incremental learning: Why? [Diagram: Batch_0 (initial batch), Batch_1, ..., Batch_n (incremental batches) arriving over time]
  6. INTRODUCTION – Incremental learning: Why? Constraints: • Memory: we can't afford to keep all the batches in memory. • Computational power: we can't afford to retrain our classification model from scratch after each batch. [Diagram: Batch_0 (initial batch), Batch_1, ..., Batch_n (incremental batches)]
  7. INTRODUCTION – Incremental learning: Why? Goal: • Maximize the accuracy after each batch • Move towards smoother, more natural learning while still using CNNs [Diagram: Batch_0 (initial batch), Batch_1, ..., Batch_n (incremental batches)]
  8. INTRODUCTION – Incremental learning: Why? [Diagram: model M_0 trained on Batch_0]
  9. INTRODUCTION – Incremental learning: Why? • We can free the memory occupied by Batch_0 and obtain M_1 just by updating M_0 with the newly arrived batch • However, we risk forgetting what we have previously learned [Diagram: M_0 updated with Batch_1 to obtain M_1]
  10. INTRODUCTION – Incremental learning: Why? [Diagram: M_0, M_1 over the batch sequence]
  11. INTRODUCTION – Incremental learning: Why? [Diagram: M_0, M_1 over the batch sequence]
  12. INTRODUCTION – Incremental learning: Why? [Diagram: M_0, M_1, ..., M_n: the model is updated after each incremental batch]
  13. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  14. INC. LEARNING STRATEGIES FOR CNNS – Definitions The different possibilities we explored for dealing with an incremental tuning/learning scenario can be conveniently framed as three main strategies: 1. Training/tuning an ad hoc CNN architecture suitable for the problem. 2. Using an already trained CNN as a fixed feature extractor in conjunction with an incremental classifier. 3. Fine-tuning an already trained CNN.
  15. INC. LEARNING STRATEGIES FOR CNNS – Instantiations In our experiments (with a focus on image classification) we tested three instantiations of the aforementioned strategies, respectively: 1. (Ad-hoc arch.) → LeNet7: the classical “LeNet7” proposed by Yann LeCun in 2004; still competitive on low/medium-scale problems. 2. (CNN-fixed w. inc. classifier) → CaffeNet + SVM: a pre-trained CNN provided in the Caffe library (“BVLC Reference CaffeNet”, based on the “AlexNet” architecture) used as a fixed feature extractor, with an incremental linear SVM as classifier. 3. (CNN fine-tuning) → CaffeNet + FT: again the “BVLC Reference CaffeNet”, but instead of using it as a fixed feature extractor the network is fine-tuned to suit the new task.
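A minimal sketch of strategy 2 (pre-trained CNN as a fixed feature extractor feeding an incremental linear classifier). Everything here is a stand-in rather than the authors' exact setup: torchvision's ImageNet-pretrained AlexNet plays the role of the BVLC Reference CaffeNet, scikit-learn's SGDClassifier with hinge loss plays the role of the incremental linear SVM, and `update_on_batch` is a hypothetical helper.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.linear_model import SGDClassifier

# Pre-trained CNN used as a *fixed* feature extractor (stand-in for CaffeNet).
cnn = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
cnn.classifier = cnn.classifier[:-1]   # drop the final 1000-way layer: outputs 4096-dim features
cnn.eval()                             # the CNN is never updated during incremental learning

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Incremental linear classifier (stand-in for the incremental linear SVM).
svm = SGDClassifier(loss="hinge")

def update_on_batch(pil_images, labels, classes):
    """Extract fixed CNN features for one incremental batch and update only the classifier."""
    with torch.no_grad():
        x = torch.stack([preprocess(img) for img in pil_images])
        feats = cnn(x).numpy()
    svm.partial_fit(feats, labels, classes=classes)
```

Because only the linear classifier is updated, past batches never need to be revisited and the CNN weights themselves cannot be forgotten.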
  16. INC. LEARNING STRATEGIES FOR CNNS – Instantiations Furthermore, for the “BigBrother” dataset we decided to test an additional pair of strategies: 4. (CNN-fixed w. inc. classifier) → VGG_Face + SVM: a pre-trained 16-layer CNN called “VGG_Face”, trained on a very large dataset of faces (2,622 subjects and 2.6M images), used as a fixed feature extractor; again, an incremental linear SVM as classifier. 5. (CNN fine-tuning) → VGG_Face + FT: again the “VGG_Face” CNN, but instead of using it as a fixed feature extractor the network is fine-tuned to suit the new task.
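For the fine-tuning strategies (CaffeNet + FT and VGG_Face + FT), the general recipe is: replace the network's final layer with a new head for the task's classes and continue training at reduced learning rates on each incremental batch. The sketch below is only illustrative: torchvision's ImageNet-pretrained VGG-16 stands in for the 16-layer VGG_Face network (whose weights are distributed separately), the learning rates and epoch counts are assumptions, and `finetune_on_batch` is a hypothetical helper.

```python
import torch
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 7   # e.g., the 7 identities of the BigBrother dataset

net = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
net.classifier[-1] = nn.Linear(net.classifier[-1].in_features, NUM_CLASSES)  # new task head

# Lower learning rate for the pre-trained layers, higher for the freshly added head.
optimizer = torch.optim.SGD([
    {"params": net.features.parameters(),        "lr": 1e-4},
    {"params": net.classifier[:-1].parameters(), "lr": 1e-4},
    {"params": net.classifier[-1].parameters(),  "lr": 1e-2},
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

def finetune_on_batch(loader, epochs=2):
    """One incremental tuning round on the current batch of (image, label) pairs."""
    net.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(net(x), y)
            loss.backward()
            optimizer.step()
```

Keeping the learning rate of the pre-trained layers small is what limits how much previously learned features drift while the new head adapts to the incoming batch.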
  17. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  18. DATASETS We were interested in datasets where: • The objects of interest have been acquired in a number of successive sessions • The environmental conditions can change across sessions. We focused on two application fields where incremental learning is very relevant (robotics and biometrics) and chose one dataset for each: • iCubWorld28 • BigBrother
  19. DATASETS – iCubWorld28 Key features: • Img size: 128×128 • Num. classes: 7 (× 4 obj) • Tot. imgs: 39,693 • Num. batches: 9 +1 (test)
  20. DATASETS – BigBrother Key features: • Img size: 70×70 • Num. classes: 7 • Tot. imgs: 23,842 • Num. batches: 56 +1 (test)
  21. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  22. EXPERIMENTS AND RESULTS – Exp. Design Experiment policy: • We trained the models until full convergence on the first batch of data • We tuned them on the successive incremental batches, trying to balance the trade-off between accuracy gain and forgetting.
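A compact sketch of this experiment policy, under simplifying assumptions: `batches` is a list of (X, y) NumPy arrays with `batches[0]` the initial batch, `(X_test, y_test)` is the fixed test batch, scikit-learn's SGDClassifier stands in for the CNN-based models, and the epoch counts are illustrative rather than the paper's values.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

def run_incremental(batches, X_test, y_test, init_epochs=50, tune_epochs=5):
    """Train fully on the first batch, tune lightly on each incremental batch,
    and record test accuracy after every batch."""
    classes = np.unique(np.concatenate([y for _, y in batches]))
    model = SGDClassifier(loss="hinge")
    accuracies = []
    for t, (X_t, y_t) in enumerate(batches):
        # Batch_0: train until (approximate) convergence; later batches: a few tuning passes,
        # balancing accuracy gain against forgetting.
        for _ in range(init_epochs if t == 0 else tune_epochs):
            model.partial_fit(X_t, y_t, classes=classes)
        accuracies.append(accuracy_score(y_test, model.predict(X_test)))
    return accuracies
```

After the run, `accuracies[t]` is the test accuracy obtained right after batch t, which is the quantity the strategies are compared on.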
  23. EXPERIMENTS AND RESULTS – iCubWorld28 Results
  24. EXPERIMENTS AND RESULTS – iCubWorld28 Results • CaffeNet + SVM shows a very good recognition-rate increment • CaffeNet + FT is the most effective • LeNet7 struggles to learn the complex invariant features necessary for this problem
  25. EXPERIMENTS AND RESULTS – BigBrother Results
  26. EXPERIMENTS AND RESULTS – BigBrother Results • The LeNet7 model performs slightly better than CaffeNet + SVM or CaffeNet + FT • VGG_Face + SVM and VGG_Face + FT achieve impressive performance on this problem • VGG_Face + SVM seems to be the best choice for both accuracy and stability
  27. EXPERIMENTS AND RESULTS – Dealing with Forgetting
  28. EXPERIMENTS AND RESULTS – Dealing with Forgetting • An adjustable learning rate is significantly more stable • A simple thresholding approach has been used • We did not find any significant difference using a continuous approach
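The slide does not spell out the exact adjustment rule, so the following is only a hypothetical illustration of what a simple thresholding adjustment versus a continuous adjustment of the tuning learning rate could look like; the signal driving the adjustment is assumed here to be the current model's accuracy on the newly arrived batch, and all constants are made up.

```python
def thresholded_lr(acc_on_new_batch, base_lr=1e-3, low_lr=1e-4, threshold=0.8):
    """Thresholding variant: if the model already handles the new batch well,
    tune gently to limit forgetting; otherwise use the full rate."""
    return low_lr if acc_on_new_batch >= threshold else base_lr

def continuous_lr(acc_on_new_batch, base_lr=1e-3, low_lr=1e-4):
    """Continuous variant: interpolate smoothly between the two rates
    (reported on the slide as not significantly different from thresholding)."""
    return base_lr - (base_lr - low_lr) * acc_on_new_batch

# Example: a new batch the current model already classifies with 90% accuracy.
print(thresholded_lr(0.90))   # -> 0.0001   (gentle tuning)
print(continuous_lr(0.90))    # -> 0.00019
```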
  29. OUTLINE 1. Introduction • CNNs and current limitations • Incremental learning: Why? 2. Incremental Learning Strategies for CNNs • Definitions • Possible instantiations used in the experiments 3. Datasets • iCubWorld28 • BigBrother 4. Experiments and Results • Exp. design • Results analysis 5. Conclusions and Future Work
  30. CONCLUSIONS AND FUTURE WORK • When possible (i.e., transfer learning from the same domain), it is preferable to use a CNN as a fixed feature extractor feeding an incremental classifier • If the features are not optimized, tuning the low-level layers may be preferable, and the learning strength can be used to control forgetting • Training a CNN from scratch can be advantageous if the problem patterns (and feature invariances) are highly specific and a sufficient number of samples is available
  31. CONCLUSIONS AND FUTURE WORK In the near future we plan to extend this work by: • Performing a more extensive experimental evaluation • Finding a more principled way to control forgetting and adapting the tuning parameters to the size (and bias) of each incremental batch • Studying real-world applications of semi-supervised incremental learning strategies for CNNs
  32. COMPARING INCREMENTAL LEARNING STRATEGIES FOR CONVOLUTIONAL NEURAL NETWORKS Vincenzo Lomonaco & Davide Maltoni {vincenzo.lomonaco, davide.maltoni}@unibo.it Department of Computer Science and Engineering – DISI University of Bologna Thank you for your attention. Any questions?
