Continual Learning in
Deep Neural Networks
why, how, and when
Gabriele Graffieti
Ph.D. Day DS&C 2021
A tale about machine learning
Once upon a time, a lot of data was collected.
That data was fed into a huge machine learning model.
The model was then able to properly process the data, and when new, similar data
arrived at the model, it could handle it correctly.
What’s wrong with this?
A tale about machine learning
Once upon a time, a lot of data was collected.
That data was fed into a huge machine learning model.
The model was then able to properly process the data, and when new, similar data
arrived at the model, it could handle it correctly.
What’s wrong with this? Nothing! I just described how machine learning works!
And it works wonderfully!
The antagonists of our tale
● ...a lot of data was collected
○ is it always possible?
○ How to store it?
○ What about training time?
○ What if I don’t want to wait until I collected a lot of data?
○ When does “some data” become “a lot”? When is the data sufficient?
● ...when new, similar data arrived at the model, it could handle it correctly
○ What if dissimilar data arrives at the model?
○ What if I want to adapt the model to new data?
○ What if data changes over time?
○ What if I want to continually train the model?
The main villain of ML: catastrophic forgetting
● Catastrophic interference, also known as catastrophic forgetting, is the tendency of an artificial neural
network to completely and abruptly forget previously learned information upon learning new
information.
● This holds for all ML models trained with “greedy” algorithms (e.g., stochastic gradient
descent, CART, …)
● The training algorithm optimizes the parameters of the model using only the currently available
data; past data and past knowledge are not taken into consideration.
● Heavily related to the stability-plasticity dilemma.
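The “greedy” training described above can be seen in a toy experiment (an illustrative sketch, not from the slides): a one-feature logistic classifier trained with plain SGD on one task, then on a conflicting one, loses the first task entirely. All names and data here are made up for illustration.

```python
import math

def sgd_train(w, b, data, lr=0.5, epochs=100):
    # Plain (greedy) SGD on the logistic loss: only the current data matters,
    # nothing protects previously learned knowledge.
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

def accuracy(w, b, data):
    return sum(((w * x + b) > 0) == (y == 1) for x, y in data) / len(data)

task_a = [(1.0, 1), (2.0, 1), (-1.0, 0), (-2.0, 0)]   # rule: positive x -> class 1
task_b = [(1.5, 0), (2.5, 0), (-1.5, 1), (-2.5, 1)]   # conflicting rule

w, b = 0.0, 0.0
w, b = sgd_train(w, b, task_a)
acc_before = accuracy(w, b, task_a)   # task A is learned well
w, b = sgd_train(w, b, task_b)        # train only on the new data...
acc_after = accuracy(w, b, task_a)    # ...and task A is catastrophically forgotten
```

Training on task B drives the single weight to the opposite sign, so accuracy on task A collapses, abruptly and completely, exactly as in the definition above.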
Catastrophic Forgetting: an example
You collect some data from the real world, and after each collection phase you train the model (only on the most recently collected data) to classify it into two classes:
[Three plots: Data 1 · Data 2 · Data 3]
Catastrophic Forgetting: an example
You collect some data from the real world, and after each collection phase you train the model (only on the most recently collected data) to classify it into two classes:
[Three plots: final solution on data 1 · final solution on all data · optimal final solution]
ML fails every day!
● Social network
○ Impossible to train a model with all the available data (too much computation and time needed)
○ Training the model only on the latest data ≠ training it on the whole data
○ Goal: train a model incrementally only on new data (less time/computation required) without forgetting past knowledge
● Self-driving cars
○ Train the car while it is driving (using corrections from the driver as training signals) - Tesla is doing this!
○ Do not forget past knowledge! If I live in NYC, I don’t want my car to forget how to drive in the countryside
● Personalized devices
○ Keyboard next word suggestion may learn my writing style without forgetting about “good” writing style
○ Domestic robots can learn to recognize and handle new objects without forgetting how to handle other objects
● Scientific analysis of data
○ A model may be able to merge and reason about different results of different experiments
● Carbon footprint and energy consumption
○ Retraining a model on all the data every time new data arrives is costly, both economically and environmentally
○ Extreme case: training a complex language model (GPT-3) can emit as much CO2 as 5 cars over their entire lifetimes
Continual learning (CL) in a nutshell
We define a continual learning scenario as one in which we do not have all
the data at once, but discover new data as time progresses.
More specifically, the newly discovered data may not be a good approximation
of the total data distribution.
Other constraints:
● Every time new data arrives the model needs to be updated
● The model update should be fast enough to be used before new data
arrives
● Past knowledge must not be forgotten (at least not catastrophically)
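The scenario above can be sketched as a training/evaluation protocol (hypothetical names; the toy “model” here is a lookup table that trivially never forgets, used only to show the shape of the loop, since a real network trained with SGD would not be so lucky):

```python
def continual_fit(model_update, evaluate, stream):
    # Data arrives as a stream of experiences; the model is updated on each
    # one and then evaluated on *every* experience seen so far, which is how
    # forgetting is measured in continual learning.
    seen, history = [], []
    for experience in stream:
        model_update(experience)                      # fast update on current data only
        seen.append(experience)
        history.append([evaluate(e) for e in seen])   # scores on all past experiences
    return history

# Toy "model": a lookup table that never forgets.
memory = {}
def update(exp):
    memory.update(exp)
def eval_exp(exp):
    return sum(memory.get(k) == v for k, v in exp.items()) / len(exp)

stream = [{"a": 1}, {"b": 2}, {"c": 3}]
history = continual_fit(update, eval_exp, stream)
# history forms a triangle: one score after experience 1, two after 2, ...
```

The triangular `history` is the standard continual-learning accuracy matrix: each row shows whether earlier experiences are still handled correctly after the latest update.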
Continual Learning
Continual learning strategies
● Architectural
✅ Works pretty well
❌ Does not scale well
❌ Not biologically plausible
● Regularization
✅ Mathematically sound
❌ Difficult to optimize and implement
● Rehearsal / replay
✅ Straightforward to implement
✅ Works pretty well in most scenarios
❌ Memory
❌ Privacy
❌ Computation
Replay strategy
[Diagram: a stream of experiences e(k−2), e(k−1), e(k), e(k+1), with past samples stored in an external memory and replayed during training.]
Pros:
● Catastrophic forgetting is highly reduced.
● Simple and easy-to-implement strategy.
● Memory is cheap and abundant.
Cons:
● Memory is not infinite (while the stream of experiences can be).
● What about privacy and private data?
● Not biologically plausible.
● Computational cost.
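A minimal sketch of the replay strategy, assuming a bounded buffer filled by reservoir sampling (class and parameter names are illustrative, not from the slides):

```python
import random

class ReplayBuffer:
    """Bounded episodic memory. Reservoir sampling keeps (approximately)
    a uniform random sample of the whole stream in fixed memory, even if
    the stream itself is unbounded."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(item)
        else:
            j = self.rng.randrange(self.n_seen)   # keep with prob capacity/n_seen
            if j < self.capacity:
                self.buffer[j] = item

    def sample(self, k):
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))

buf = ReplayBuffer(capacity=100)
for step in range(1000):
    new_example = step                      # stand-in for a real (input, label) pair
    batch = [new_example] + buf.sample(9)   # mix new data with replayed old data
    # model.train_on(batch)                 # the model would be updated here
    buf.add(new_example)
```

The fixed `capacity` makes the memory and computation cons above concrete: after 1000 examples only 100 survive in the buffer, and every training batch pays the extra cost of the replayed samples.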
Latent replay
Pellegrini, L., Graffieti, G., Lomonaco, V., & Maltoni, D. Latent replay for real-time continual learning. In 2020 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS)
● Only latent activations are memorized (smaller memory footprint)
● Only a portion of the network needs to be trained with replay data (smaller computational footprint)
● The latent replay layer can be moved to balance speed and accuracy
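The latent replay idea can be sketched as follows (an illustrative toy, not the paper's implementation: `lower` stands in for the frozen part of the network, and the “latents” are just small tuples):

```python
latent_memory = []   # stores small latent codes, not raw inputs

def lower(x):
    # Stand-in for the frozen lower layers of the network: maps a raw
    # input to a (much smaller) latent representation.
    return (x % 7, x % 3)

def train_step(head_update, new_inputs, replay_size=4):
    new_latents = [lower(x) for x in new_inputs]   # one forward pass through the frozen part
    replay = latent_memory[-replay_size:]          # replayed latents (no raw data kept)
    head_update(new_latents + replay)              # only the head sees replay data
    latent_memory.extend(new_latents)              # store latents for future rehearsal

head_batches = []
train_step(head_batches.append, [10, 11])
train_step(head_batches.append, [12, 13])
```

Storing latents instead of raw inputs shrinks the memory footprint, and replaying them only through the trainable head avoids re-running the frozen lower layers, which is what makes the real-time training in the paper feasible.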
Latent replay
My talk about it
Training on smartphone devices
Pellegrini, L., Lomonaco, V., Graffieti, G., & Maltoni, D. Continual Learning at the Edge: Real-Time Training on Smartphone Devices. In 2021 European Symposium on Artificial Neural Networks (ESANN)
Video demo
Generative replay
[Diagram: a stream of experiences e(k−2), e(k−1), e(k), e(k+1), with past data re-created by a generative model and replayed during training.]
Pros:
● No replay memory is needed.
● More biologically plausible.
● Can also generate unseen or new plausible data.
● Generative replay can generalize and possibly yield better results.
Cons:
● How to train the generative model?
○ The problem shifts to the continual training of the generator instead of the classifier.
○ Data quality is the main issue here.
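Generative replay can be sketched like this (everything here is a hypothetical stand-in: the “generator” just remembers per-experience means and samples around them, whereas a real system would train a GAN or VAE continually):

```python
import random

past_stats = []   # toy generator "parameters": one mean per past experience

def fit_generator(data):
    # Stand-in for continually training a generative model on new data.
    past_stats.append(sum(data) / len(data))

def generate(n, rng):
    # Stand-in for sampling pseudo-examples of past data from the generator.
    return [rng.choice(past_stats) + rng.gauss(0, 0.1) for _ in range(n)]

rng = random.Random(0)
experience_1 = [1.0, 1.2, 0.8]
fit_generator(experience_1)

experience_2 = [5.0, 5.2, 4.8]
replayed = generate(3, rng)               # pseudo-data standing in for experience 1
training_batch = experience_2 + replayed  # train on new data + generated old data
fit_generator(experience_2)
```

No buffer of real samples is kept, which addresses the memory and privacy cons of plain replay; the quality of `replayed` now depends entirely on how well the generator itself survives continual training, which is exactly the open problem listed above.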
Negative generative replay
● Instead of using low-quality generated data as replay data, use them only as negative examples.
○ E.g., do not tell the model “this image is a cat”; instead tell it “this image is not a dog”
Graffieti, G., Maltoni D., Pellegrini, L. & Lomonaco, V. __________ - Under double blind review at some conference
A continual learning Avalanche
Lomonaco, V., Pellegrini, L., Cossu, A., Carta, A., Graffieti, G., Hayes, T. L., ... & Maltoni, D. Avalanche: an End-to-End Library for Continual Learning. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
https://avalanche.continualai.org
Future work
Continual learning is a very active research area and can be addressed from many directions.
● We are now investigating replay (especially generative replay) and the role of (episodic) memory.
● Bio-inspired models and neuroscience may be the key to solving many issues.
● Gaussian processes (and Bayesian statistics) are being explored as continual learning frameworks.
My thoughts (not necessarily right)
● We need a “non-greedy” optimization algorithm (SGD and the majority of optimization algorithms for
NN are greedy).
● Memory is a key concept and should be analyzed and explored.
● We need a more “natural” way of learning (learn how to walk before learning how to run).
● Machine learning is the perfect fit if we want to solve a single task. ML is not the perfect fit if we want
to build real intelligence.
Publications
Gabriele Graffieti, Davide Maltoni, Lorenzo Pellegrini, Vincenzo Lomonaco, _________ - Under double blind review at some conference
Guido Borghi, Annalisa Franco, Gabriele Graffieti, Davide Maltoni, Automated Artifact Retouching in Morphed Images with Attention Maps, IEEE Access (2021)
Lorenzo Pellegrini, Vincenzo Lomonaco, Gabriele Graffieti, Davide Maltoni, Continual Learning at the Edge: Real-Time Training on Smartphone Devices.
Proceedings of the 29th European Symposium on Artificial Neural Networks (ESANN), 2021
Gabriele Graffieti, Davide Maltoni. Artifacts-Free Single Image Defogging. Atmosphere 12 (5), 577, 2021.
V Lomonaco, L Pellegrini, A Cossu, A Carta, G Graffieti, et al. Avalanche: an End-to-End Library for Continual Learning. Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition (CVPR) 2021.
Gabriele Graffieti, Davide Maltoni. Towards Artifacts-free Image Defogging. International Conference on Pattern Recognition (ICPR) 2020.
Publications (cont.)
H Bae, E Brophy, RHM Chan, B Chen, F Feng, G Graffieti, et al. IROS 2019 Lifelong Robotic Vision: Object Recognition Challenge. IEEE Robotics & Automation Magazine 27 (2), 11-16, 2020
Lorenzo Pellegrini, Gabriele Graffieti, Vincenzo Lomonaco, Davide Maltoni. Latent replay for real-time continual learning. International Conference on
Intelligent Robots and Systems (IROS) 2020.
Gabriele Graffieti, Lorenzo Pellegrini, Vincenzo Lomonaco, Davide Maltoni. Efficient Continual Learning with Latent Rehearsal. International Conference on
Intelligent Robots and Systems (IROS) 2019.
Competitions
Self-supervised Learning for Next-Generation Industry-level Autonomous Driving (ICCV 2021) - 1st place ($5k prize)
Lifelong Robotic Vision Challenge (IROS 2019) - 2nd place
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 

Continual Learning: why, how, and when

  • 1. Continual Learning in Deep Neural Networks why, how, and when Gabriele Graffieti Ph.D. Day DS&C 2021
  • 2. A tale about machine learning Once upon a time, a lot of data was collected. That data was fed into a huge machine learning model. The model was then able to properly process the data, and when new similar data arrived at the model, it could correctly handle it. What’s wrong with this?
  • 3. A tale about machine learning Once upon a time, a lot of data was collected. That data was fed into a huge machine learning model. The model was then able to properly process the data, and when new similar data arrived at the model, it could correctly handle it. What’s wrong with this? Nothing! I just described how machine learning works! And it works wonderfully!
  • 4.
  • 5. The antagonists of our tale ● ...a lot of data was collected ○ Is it always possible? ○ How to store it? ○ What about training time? ○ What if I don’t want to wait until I have collected a lot of data? ○ When does some data count as a lot? When is it sufficient? ● ...when new similar data arrived at the model, it could correctly handle it ○ What if dissimilar data arrives at the model? ○ What if I want to adapt the model to new data? ○ What if data changes over time? ○ What if I want to continually train the model?
  • 6. The main villain of ML: catastrophic forgetting ● Catastrophic interference, also known as catastrophic forgetting, is the tendency of an artificial neural network to completely and abruptly forget previously learned information upon learning new information. ● This holds for all ML models that are trained with “greedy” algorithms (e.g. stochastic gradient descent, CART…) ● The training algorithm optimizes the parameters of the model using only the currently available data; past data and past knowledge are not taken into consideration. ● Heavily related to the stability-plasticity dilemma.
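The slide's point can be reproduced in a toy experiment: a one-parameter model trained greedily with SGD on one task and then on a second one forgets the first. A minimal sketch (the tasks, model, and names are invented for illustration, not taken from the talk):

```python
import random

def sgd(w, data, lr=0.1, steps=200):
    """Greedy SGD on squared error for a 1-parameter model y = w * x.
    Only the currently available data is used: past data is ignored."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

random.seed(0)
task1 = [(1.0, 2.0)]    # best fit on task 1: w = 2
task2 = [(1.0, -2.0)]   # best fit on task 2: w = -2

w = sgd(0.0, task1)               # train on task 1 only
loss_on_t1_before = loss(w, task1)
w = sgd(w, task2)                 # then train on task 2 only
loss_on_t1_after = loss(w, task1)
# the loss on task 1 has exploded: the model abruptly forgot it
assert loss_on_t1_after > loss_on_t1_before
```

The single parameter is pulled entirely toward whatever the optimizer saw last, which is exactly the stability-plasticity tension the slide mentions.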
  • 7. Catastrophic Forgetting: an example You collect some data from the real world, and after each collection phase you train the model (only on the most recently collected data) to classify it into two classes: Data 1 Data 2 Data 3
  • 8. Catastrophic Forgetting: an example You collect some data from the real world, and after each collection phase you train the model (only on the most recently collected data) to classify it into two classes: Final solution on data 1 Final solution on all data Optimal final solution
  • 9. ML fails every day! ● Social networks ○ Impossible to train a model with all the available data (too much computation and time needed) ○ Training the model only on the latest data ≠ training it on the whole data ○ Train the model incrementally only on new data (less time/computation required) without forgetting past knowledge ● Self-driving cars ○ Train the car while it is driving (corrections from the driver as training signals) - Tesla is doing this! ○ Do not forget past knowledge! If I live in NYC, I don’t want my car to forget how to drive in the countryside ● Personalized devices ○ Keyboard next-word suggestion may learn my writing style without forgetting about “good” writing style ○ Domestic robots can learn to recognize and handle new objects without forgetting how to handle other objects ● Scientific analysis of data ○ A model may be able to merge and reason about the different results of different experiments ● Carbon footprint and energy consumption ○ Retraining a model on all the data every time new data arrives is costly both economically and environmentally ○ Extreme case: training complex language models (GPT-3) can emit as much CO2 as 5 cars in their entire lifetimes
  • 10. Continual learning (CL) in a nutshell We define a continual learning scenario as one in which we do not have all the data at once, but discover new data as time progresses. More specifically, the data we discover may not be a good approximation of the total data distribution. Other constraints: ● Every time new data arrives the model needs to be updated ● The model update should be fast enough to be usable before new data arrives ● Past knowledge must not be forgotten (at least not catastrophically) Continual Learning
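The scenario described on the slide can be sketched as a stream of experiences, with the model updated after each one and never given the whole dataset at once (function and variable names here are illustrative, not from the talk):

```python
def experience_stream(dataset, n_experiences):
    """Split a dataset into a sequence of experiences: the learner
    receives one experience at a time, never the full dataset."""
    size = len(dataset) // n_experiences
    for k in range(n_experiences):
        yield dataset[k * size:(k + 1) * size]

def update(model, experience):
    # stand-in for a fast model update on the new data only;
    # here the "model" just accumulates what it has been shown
    return model + experience

model = []
seen = []
for exp in experience_stream(list(range(10)), n_experiences=5):
    model = update(model, exp)  # must finish before new data arrives
    seen.append(list(exp))
```

Each experience may come from a different distribution than the previous ones, which is precisely why the naive update in the sketch is not enough in practice: without extra machinery it forgets.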
  • 11. Continual learning strategies ● Architectural ✅ Works pretty well ❌ Does not scale well ❌ Not biologically plausible ● Regularization ✅ Mathematically sound ❌ Difficult to optimize and implement ● Rehearsal / replay ✅ Straightforward to implement ✅ Works pretty well in most scenarios ❌ Memory ❌ Privacy ❌ Computation
  • 12. Replay strategy e(k+1) e(k) e(k-1) e(k-2) External memory Pros: ● Catastrophic forgetting highly reduced. ● Simple and easy-to-implement strategy. ● Memory is cheap and abundant. Cons: ● Memory is not infinite (the stream of experiences can be infinite). ● What about privacy and private data? ● Not biologically plausible. ● Computational cost.
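One common way to keep the external memory bounded while the stream of experiences is unbounded is reservoir sampling, which maintains a uniform sample of everything seen so far. A minimal sketch (the class and method names are illustrative, not a specific library's API):

```python
import random

class ReplayBuffer:
    """Fixed-capacity memory filled by reservoir sampling: every example
    in the (possibly unbounded) stream has equal probability of being
    kept, no matter how long the stream runs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append(example)
        else:
            # replace a stored example with probability capacity / seen
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.memory[j] = example

    def sample(self, k):
        # mini-batch of past examples to mix with the current experience
        return random.sample(self.memory, min(k, len(self.memory)))

random.seed(0)
buf = ReplayBuffer(capacity=10)
for i in range(1000):
    buf.add(i)
```

At training time, each mini-batch of new data is concatenated with `buf.sample(k)` so the gradient also reflects past experiences, which is what reduces forgetting.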
  • 13. Latent replay Pellegrini, L., Graffieti, G., Lomonaco, V., & Maltoni, D. Latent replay for real-time continual learning. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) ● Only latent activations are memorized (less memory footprint) ● Only a portion of the network needs to be trained with replay data (less computational footprint) ● The latent replay layer can be moved to balance speed and accuracy
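The idea on the slide, freezing the layers below a chosen latent replay layer and storing activations instead of raw inputs, can be sketched as follows. The backbone here is a trivial stand-in and all names are invented for illustration; see the cited IROS paper for the actual method:

```python
def frozen_backbone(x):
    # stand-in for the frozen lower layers of the network
    # (everything below the latent replay layer)
    return [v * 0.5 for v in x]

latent_memory = []  # stores (latent activation, label) pairs

def store(batch):
    # memorize latents, not raw inputs: smaller memory footprint,
    # and raw data never needs to be kept around
    for x, y in batch:
        latent_memory.append((frozen_backbone(x), y))

def head_training_batch(new_batch, n_replay):
    # new data is forwarded through the frozen backbone once, then its
    # latents are mixed with replayed latents; only the head (the layers
    # above the latent replay layer) is trained on this batch
    new_latents = [(frozen_backbone(x), y) for x, y in new_batch]
    return new_latents + latent_memory[:n_replay]

store([([2.0, 4.0], 0)])
batch = head_training_batch([([6.0, 8.0], 1)], n_replay=1)
```

Moving the latent replay layer down gives the head more capacity to adapt (higher accuracy, more computation); moving it up makes updates cheaper, which is the speed/accuracy trade-off the slide mentions.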
  • 14. Latent replay Pellegrini, L., Graffieti, G., Lomonaco, V., & Maltoni, D. Latent replay for real-time continual learning. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) My talk about it
  • 15. Training on smartphone devices Pellegrini, L., Lomonaco, V., Graffieti, G., & Maltoni, D. Continual Learning at the Edge: Real-Time Training on Smartphone Devices.In 2021 European Symposium on Artificial Neural Networks (ESANN) Video demo
  • 16. Generative replay e(k+1) e(k) e(k-1) e(k-2) Generative Model Pros: ● No replay memory is needed. ● More biologically plausible. ● Can also generate unseen or new plausible data. ● Generative replay can generalize and possibly yield better results. Cons: ● How to train the generative model? ○ The problem is now the continual training of the generator instead of the classifier. ○ Data quality is the main issue here
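To make the generator's role concrete: instead of storing past examples, a model of their distribution is stored and sampled from at training time. The deliberately crude Gaussian "generator" below is a stand-in for a real generative model (e.g. a GAN or VAE), and all names are invented for illustration:

```python
import random

class ToyGenerator:
    """Stand-in for a generative model: remembers per-class mean/std of
    a 1-D feature and samples pseudo-examples from a Gaussian."""

    def __init__(self):
        self.stats = {}

    def fit(self, data):
        by_class = {}
        for x, y in data:
            by_class.setdefault(y, []).append(x)
        for y, xs in by_class.items():
            mean = sum(xs) / len(xs)
            var = sum((v - mean) ** 2 for v in xs) / len(xs)
            self.stats[y] = (mean, var ** 0.5)

    def sample(self, n):
        # pseudo-examples of past classes, used in place of replay memory
        out = []
        for _ in range(n):
            y = random.choice(sorted(self.stats))
            mean, std = self.stats[y]
            out.append((random.gauss(mean, std), y))
        return out

random.seed(0)
gen = ToyGenerator()
gen.fit([(0.0, "cat"), (2.0, "cat"), (10.0, "dog")])
samples = gen.sample(5)
```

The slide's main caveat shows up immediately: the generator itself must be trained continually, and if its samples are poor the classifier rehearses on wrong data.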
  • 17. Negative generative replay ● Instead of using low-quality generated data as replay data, use it only as negative examples. ○ E.g. do not tell the model “this image is a cat”; instead tell it “this image is not a dog” Graffieti, G., Maltoni D., Pellegrini, L. & Lomonaco, V. __________ - Under double blind review at some conference
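One plausible reading of the "negative example" idea is a loss that never asserts a possibly-wrong class for a generated sample, and only penalizes the probability of a class the sample is known not to be. This is my own sketch of that reading, not the loss from the paper under review:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def negative_replay_loss(logits, not_class):
    """-log(1 - p(not_class)): penalize only the probability mass the
    model puts on the class the generated sample is NOT ("this image is
    not a dog"), without claiming what it actually is."""
    p = softmax(logits)[not_class]
    return -math.log(max(1.0 - p, 1e-12))

# a model confident in the forbidden class is penalized heavily...
high = negative_replay_loss([5.0, 0.0, 0.0], not_class=0)
# ...one putting its mass elsewhere is barely penalized at all
low = negative_replay_loss([0.0, 5.0, 0.0], not_class=0)
```

The appeal, as the slide frames it, is robustness to generator quality: a blurry generated "cat" may be a terrible positive example of a cat but is still a perfectly valid non-dog.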
  • 18. A continual learning Avalanche Lomonaco, V., Pellegrini, L., Cossu, A., Carta, A., Graffieti, G., Hayes, T. L., ... & Maltoni, D. Avalanche: an End-to-End Library for Continual Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://avalanche.continualai.org
  • 19. Future work Continual learning is a very active research area and can be addressed in many directions. ● We are now investigating replay (especially generative replay) and the role of (episodic) memory. ● Bio-inspired models and neuroscience can be the key to solving many issues. ● Gaussian processes (and Bayesian statistics) are being explored as continual learning frameworks. My thoughts (not necessarily right) ● We need a “non-greedy” optimization algorithm (SGD and the majority of optimization algorithms for NNs are greedy). ● Memory is a key concept and should be analyzed and explored. ● We need a more “natural” way of learning (learn how to walk before learning how to run). ● Machine learning is a perfect fit if we want to solve a single task; it is not if we want to build real intelligence.
  • 20. Publications Gabriele Graffieti, Davide Maltoni, Lorenzo Pellegrini, Vincenzo Lomonaco, _________ - Under double blind review at some conference Guido Borghi, Annalisa Franco, Gabriele Graffieti, Davide Maltoni, Automated Artifact Retouching in Morphed Images with Attention Maps, IEEE Access (2021) Lorenzo Pellegrini, Vincenzo Lomonaco, Gabriele Graffieti, Davide Maltoni, Continual Learning at the Edge: Real-Time Training on Smartphone Devices. Proceedings of the 29th European Symposium on Artificial Neural Networks (ESANN), 2021 Gabriele Graffieti, Davide Maltoni. Artifacts-Free Single Image Defogging. Atmosphere 12 (5), 577, 2021. V Lomonaco, L Pellegrini, A Cossu, A Carta, G Graffieti, et al. Avalanche: an End-to-End Library for Continual Learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Gabriele Graffieti, Davide Maltoni. Towards Artifacts-free Image Defogging. International Conference on Pattern Recognition (ICPR) 2020.
  • 21. Publications (cont.) H Bae, E Brophy, RHM Chan, B Chen, F Feng, G Graffieti, et al. Iros 2019 lifelong robotic vision: Object recognition challenge. IEEE Robotics & Automation Magazine 27 (2), 11-16, 2020 Lorenzo Pellegrini, Gabriele Graffieti, Vincenzo Lomonaco, Davide Maltoni. Latent replay for real-time continual learning. International Conference on Intelligent Robots and Systems (IROS) 2020. Gabriele Graffieti, Lorenzo Pellegrini, Vincenzo Lomonaco, Davide Maltoni. Efficient Continual Learning with Latent Rehearsal. International Conference on Intelligent Robots and Systems (IROS) 2019.
  • 22. Competitions Self-supervised Learning for Next-Generation Industry-level Autonomous Driving (ICCV 2021) - 1st place (5k$ prize) Lifelong Robotic Vision Challenge (IROS 2019) - 2nd place