Study Meeting Presentation:
(A very quick introduction) Continual Learning
Author: Aurélie Peng
Date: 2021/10/27
What is Continual Learning?
• Learn sequentially.
• Try not to forget previously learned classes
• Old data are not available when training new classes
Sorry, this is a very simple introduction by a very beginner-level person.
Terminology
• Lifelong Learning
• Reinforcement Learning
• Online Learning
• Incremental Learning
• Continual Learning
Terminology
“Task” = learn a set of classes at once (see the code sketch after the example below).
Example :
Task 1. “Cat”, “Dog”, “Bird”
Task 2. “House”, “Building”
Task 3. “Apple”, “Orange”, “Pear”
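A minimal sketch of this setup in plain Python (my own illustration; `train_on` and `full_dataset` are hypothetical placeholders, not from any library):

```python
# Hypothetical class-incremental setup using the example tasks above.
tasks = [
    ["cat", "dog", "bird"],       # Task 1
    ["house", "building"],        # Task 2
    ["apple", "orange", "pear"],  # Task 3
]

def filter_task(dataset, task_classes):
    # Keep only the (image, label) pairs belonging to the current task.
    return [(x, y) for (x, y) in dataset if y in task_classes]

for step, task_classes in enumerate(tasks, start=1):
    print(f"Task {step}: learn {task_classes}")
    # train_on(model, filter_task(full_dataset, task_classes))
    # Earlier tasks' data is assumed gone by the time the next task arrives.
```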
Why it’s useful
- Memory: no need to store all the data, only the newest.
- Speed: no need to train from scratch every time there is new data / a new task.
- Ex. If a client wants to add a new class to the model but only gives you one hour and a 100-yen budget…
Catastrophic forgetting, or just a difficult question?
[Image examples: Shiba vs. Drink, and Shiba vs. Akita]
Main methods against catastrophic forgetting
• Memory and rehearsal
• Regularization
• Architectural method
• (Meta-Learning)
Memory and rehearsal
- Store samples from previous tasks
- When training on new tasks, sometimes reuse samples from memory
“Standard” memory size:
- ImageNet: about 20 images / class
- CIFAR-10: about 10–20 images / class
“On Tiny Episodic Memories in Continual Learning”, A. Chaudhry, https://arxiv.org/abs/1902.10486
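A minimal sketch of rehearsal (my own illustration, not the exact method of the cited paper; reservoir sampling is one common way to keep such a buffer, and `train_step` is a hypothetical helper):

```python
import random

class EpisodicMemory:
    """Tiny replay buffer filled with reservoir sampling, so the
    `capacity` stored examples stay roughly uniform over all data seen."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))

# During training on a new task:
# replay = memory.sample(len(current_batch))
# loss = train_step(model, current_batch + replay)  # mix new data and memory
# for example in current_batch:
#     memory.add(example)
```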
Regularization
Add constraints to protect the weights that were important for previous tasks.
Example: important weights cannot be changed too much.
Many other methods exist, like the distillation loss of:
“Learning without Forgetting”, Z. Li, https://arxiv.org/pdf/1606.09282.pdf
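A rough PyTorch sketch of the “important weights cannot change too much” idea (an EWC-style quadratic penalty; `old_params` and `importance` are assumed to have been saved after the previous task):

```python
import torch

def quadratic_penalty(model, old_params, importance, lam=1.0):
    # Pull important weights back toward the values they had after the
    # previous task; `importance` weights how much each parameter matters
    # (e.g. a diagonal Fisher information estimate, as in EWC).
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + (importance[name] * (p - old_params[name]) ** 2).sum()
    return lam * penalty

# total_loss = task_loss + quadratic_penalty(model, old_params, importance)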
Architectural method
• Add more parameters to the architecture
• Often used in the multi-task setting (if the task identity is known, the matching parameters can be selected)
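A minimal PyTorch sketch of this (my own illustration): a shared backbone plus one output head per task, where the known task id selects the parameters to use.

```python
import torch.nn as nn

class MultiHeadNet(nn.Module):
    def __init__(self, in_features, feature_dim=256):
        super().__init__()
        # Backbone shared across all tasks.
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features, feature_dim),
            nn.ReLU(),
        )
        self.feature_dim = feature_dim
        self.heads = nn.ModuleList()  # one classifier head per task

    def add_task(self, num_classes):
        # Growing the architecture: new parameters for each new task.
        self.heads.append(nn.Linear(self.feature_dim, num_classes))

    def forward(self, x, task_id):
        # The task identity must be known to pick the right head.
        return self.heads[task_id](self.backbone(x))
```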
(Meta-learning)
• Many complicated methods exist…
• “Learn to learn”
• Backpropagation on the backpropagation…
• Example 1: learn to learn continually
“Meta-Learning Representations for Continual Learning”, K. Javed, https://arxiv.org/abs/1905.12588
• Example 2: learn to select “important” features
“Learning to Continually Learn”, S. Beaulieu, https://arxiv.org/abs/2002.09571
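To make “backpropagation on the backpropagation” concrete, here is a generic MAML-style sketch (my own illustration, not the exact method of either paper): the outer loss is differentiated through an inner gradient step.

```python
import torch
from torch.func import functional_call

def meta_step(model, loss_fn, support, query, inner_lr=0.01):
    params = dict(model.named_parameters())
    # Inner loop: one gradient step on the support data,
    # keeping the graph so we can differentiate through it.
    x_s, y_s = support
    inner_loss = loss_fn(functional_call(model, params, (x_s,)), y_s)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    fast = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}
    # Outer loop: evaluate the adapted ("fast") weights on the query data.
    # Backpropagating this loss goes *through* the inner update,
    # i.e. backpropagation on the backpropagation.
    x_q, y_q = query
    outer_loss = loss_fn(functional_call(model, fast, (x_q,)), y_q)
    outer_loss.backward()
    return outer_loss
```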
Benchmarks – Does it work well?
[Figure: accuracy of continual learning with nothing against catastrophic forgetting, vs. “normal” (not continual) training]
Source: “Dark Experience for General Continual Learning: a Strong, Simple Baseline”, P. Buzzega, https://arxiv.org/pdf/2004.07211.pdf
In the Industry …
• What matters: performance, speed, memory
• Example:
Goal: make robots learn by being shown pictures of objects:
- Low computational capacity
- Can’t store many previously seen images
CVPR 2020 challenge on Continual Learning
• Dataset for object classification (CORe50)
- 50 classes (about 300 images per class)
- 11 environments
- Fine-grained
https://sites.google.com/view/clvision2020/challenge
CVPR 2020 challenge on Continual Learning
• 3 scenarios
1. Each step = all classes in one environment
2. Each step = some classes in all environments
3. Each step = one class in one environment
• Metrics
Accuracy on Test, Average Accuracy on Validation,
Training Time, Test Time,
RAM, Disk Usage
https://sites.google.com/view/clvision2020/challenge
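For instance, the average accuracy metric above is typically computed like this (a sketch; `evaluate` is a hypothetical helper returning the accuracy on one step’s test set):

```python
def average_accuracy(model, test_sets, evaluate):
    # After all training steps, average the accuracy
    # over the held-out sets of every step seen so far.
    scores = [evaluate(model, data) for data in test_sets]
    return sum(scores) / len(scores)
```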
CVPR 2020 challenge on Continual Learning
• Results [table omitted]
Source: https://arxiv.org/abs/2009.09929
What seems to work in the industry ?
Finalists of the challenge
• All used a pretrained model (the biggest factor)
• The majority used memory
• One participant used regularization
Research world: almost no papers use pretrained models.
There might currently be a big gap between research & industry…
End!
