CS 330
Lifelong Learning
Reminders
Tuesday (Nov 30th): Project poster session
Wednesday (Dec 8th): Project due
This Wednesday: Lecture and instructor office hours over zoom
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement
from the meta-learning perspective
A brief review of problem statements:

Multi-Task Learning: learn to solve a set of tasks (learn the tasks, then perform them).

Meta-Learning: given an i.i.d. task distribution, learn a new task efficiently (learn to learn from the tasks, then quickly learn a new task).
In contrast, many real-world settings look like a sequence of tasks unfolding over time.

Some examples:
- a student learning concepts in school
- a deployed image classification system learning from a stream of images from users
- a robot acquiring an increasingly large set of skills in different environments
- a virtual assistant learning to help different users with different tasks at different points in time
- a doctor’s assistant aiding in medical decision-making

Our agents may not be given a large batch of data/tasks right off the bat!
Some Terminology

Sequential learning settings: online learning, lifelong learning, continual learning, incremental learning, streaming data
(distinct from sequence data and sequential decision-making)
What is the lifelong learning problem statement?

Exercise:
1. Pick an example setting.
2. Discuss the problem statement in your break-out room:
(a) how would you set up an experiment to develop & test your algorithm?
(b) what are the desirable/required properties of the algorithm?
(c) how do you evaluate such a system?

Example settings:
A. a student learning concepts in school
B. a deployed image classification system learning from a stream of images from users
C. a robot acquiring an increasingly large set of skills in different environments
D. a virtual assistant learning to help different users with different tasks at different points in time
E. a doctor’s assistant aiding in medical decision-making
Some considerations:
- computational resources
- memory
- model performance
- data efficiency
Problem variations:
- task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial
- discrete task boundaries vs. continuous shifts (vs. both)
- known task boundaries/shifts vs. unknown
- others: privacy, interpretability, fairness, test-time compute & memory
Substantial variety in problem statement!
What is the lifelong learning problem statement?
General [supervised] online learning problem:

for t = 1, …, n:
  observe 𝑥𝑡 (if task boundaries are observable: observe 𝑥𝑡, 𝑧𝑡)
  predict 𝑦̂𝑡
  observe label 𝑦𝑡

i.i.d. setting: 𝑥𝑡 ∼ 𝑝(𝑥), 𝑦𝑡 ∼ 𝑝(𝑦|𝑥), where 𝑝 is not a function of 𝑡 (true in some cases, but not in many!)
otherwise: 𝑥𝑡 ∼ 𝑝𝑡(𝑥), 𝑦𝑡 ∼ 𝑝𝑡(𝑦|𝑥)

streaming setting: cannot store (𝑥𝑡, 𝑦𝑡) (recall: replay buffers assume you can), e.g. due to:
- lack of memory
- lack of computational resources
- privacy considerations
- wanting to study neural memory mechanisms
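The protocol above can be sketched as a minimal online regression loop. Everything specific here (the drifting linear stream, dimensions, learning rate) is an illustrative assumption, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-i.i.d. stream: the labeling function p_t drifts with t.
def observe(t):
    w_true = np.array([1.0, -2.0]) + 0.01 * t  # slow distribution shift
    x = rng.normal(size=2)
    return x, float(w_true @ x)

theta = np.zeros(2)        # learner parameters theta_t
lr = 0.05
cumulative_loss = 0.0
T = 1000
for t in range(1, T + 1):
    x_t, y_t = observe(t)                   # observe x_t (label revealed below)
    y_hat = theta @ x_t                     # predict y_hat_t
    loss = (y_hat - y_t) ** 2               # observe label y_t, incur loss
    cumulative_loss += loss
    theta -= lr * 2 * (y_hat - y_t) * x_t   # online SGD update
avg_loss = cumulative_loss / T
```

Because the learner updates after every observation, it tracks the slowly shifting distribution and the average per-step loss stays small.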
What do you want from your lifelong learning algorithm?
minimal regret (that grows slowly with 𝑡)
regret: cumulative loss of learner minus cumulative loss of best learner in hindsight
(cannot be evaluated in practice, but useful for analysis)
Regret𝑇 := ∑𝑡=1..𝑇 ℒ𝑡(𝜃𝑡) − min𝜃 ∑𝑡=1..𝑇 ℒ𝑡(𝜃)
Regret that grows linearly in 𝑡 is trivial. Why?
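As a concrete (made-up) illustration: for squared loss on a scalar stream, the best fixed prediction in hindsight is the label mean, and a running-mean learner (follow the leader for this loss) accumulates regret that grows much slower than 𝑡:

```python
import numpy as np

rng = np.random.default_rng(1)
ys = rng.normal(loc=3.0, scale=1.0, size=2000)   # i.i.d. label stream

# Learner: predict the running mean of past labels (follow the leader
# for squared loss, since the best fixed prediction is the mean).
preds = np.zeros_like(ys)
running_sum = 0.0
for t, y in enumerate(ys):
    preds[t] = running_sum / t if t > 0 else 0.0
    running_sum += y

learner_loss = np.sum((preds - ys) ** 2)
best_fixed = ys.mean()                           # best theta in hindsight
hindsight_loss = np.sum((best_fixed - ys) ** 2)
regret = learner_loss - hindsight_loss
print(regret, regret / len(ys))                  # average regret shrinks toward 0
```

Per-step regret shrinks with 𝑡 here, so Regret𝑇/𝑇 → 0, i.e. sub-linear regret; a learner with linearly growing regret would be no better per step than its first guess.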
What do you want from your lifelong learning algorithm?
regret: cumulative loss of learner minus cumulative loss of best learner in hindsight

Regret𝑇 := ∑𝑡=1..𝑇 ℒ𝑡(𝜃𝑡) − min𝜃 ∑𝑡=1..𝑇 ℒ𝑡(𝜃)

Worked example (predictions vs. labels):

t | 𝑦̂𝑡 | 𝑦𝑡
1 | 10 | 30
2 | 30 | 28
3 | 28 | 32
positive & negative transfer
What do you want from your lifelong learning algorithm?
positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch
positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch
positive -> negative : better -> worse
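These notions can be quantified with the backward/forward-transfer metrics of Lopez-Paz & Ranzato (GEM, NeurIPS ’17); the accuracy matrix below is made-up data for illustration:

```python
import numpy as np

# BWT/FWT metrics from Lopez-Paz & Ranzato (GEM, NeurIPS '17).
# R[i, j] = test accuracy on task j after sequentially training through task i.
# b[j]   = accuracy of a freshly initialized model on task j (from-scratch baseline).

def backward_transfer(R):
    T = R.shape[0]
    # How training on later tasks changed accuracy on earlier ones.
    return float(np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)]))

def forward_transfer(R, b):
    T = R.shape[0]
    # Accuracy on task j just before training on it, vs. the baseline.
    return float(np.mean([R[j - 1, j] - b[j] for j in range(1, T)]))

# Made-up accuracy matrix for illustration.
R = np.array([[0.9, 0.3, 0.2],
              [0.7, 0.9, 0.4],
              [0.6, 0.8, 0.9]])
b = np.array([0.1, 0.1, 0.1])
print(backward_transfer(R))      # negative here: forgetting
print(forward_transfer(R, b))    # positive here: previous tasks help future ones
```

Negative BWT is exactly the "forgetting" discussed below; positive FWT means earlier tasks gave a head start on later ones.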
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement
from the meta-learning perspective
Approaches

Follow the leader: store all the data you’ve seen so far, and train on it.
+ will achieve very strong performance
- computation intensive (continuous fine-tuning can help)
- can be memory intensive [depends on the application]

Stochastic gradient descent: take a gradient step on the datapoint you observe.
+ computationally cheap
+ requires zero memory
- subject to negative backward transfer (“forgetting”, sometimes referred to as catastrophic forgetting)
- slow learning
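A minimal sketch of these two basic approaches on a toy linear-regression stream (the classes, dimensions, and learning rate are illustrative assumptions):

```python
import numpy as np

class FollowTheLeader:
    """Store all data seen so far and refit from scratch at every step:
    strong performance, but O(t) memory and growing compute."""
    def __init__(self, dim):
        self.X, self.Y = [], []
        self.w = np.zeros(dim)
    def update(self, x, y):
        self.X.append(x)
        self.Y.append(y)
        A, b = np.array(self.X), np.array(self.Y)
        self.w, *_ = np.linalg.lstsq(A, b, rcond=None)  # full refit on everything
    def predict(self, x):
        return self.w @ x

class OnlineSGD:
    """One gradient step per observed datapoint: cheap and memoryless,
    but slower to learn and prone to forgetting under distribution shift."""
    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr
    def update(self, x, y):
        err = self.w @ x - y
        self.w -= self.lr * 2 * err * x
    def predict(self, x):
        return self.w @ x

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
ftl, sgd = FollowTheLeader(2), OnlineSGD(2)
for _ in range(200):
    x = rng.normal(size=2)
    ftl.update(x, w_true @ x)
    sgd.update(x, w_true @ x)
```

On this stationary stream both recover the true weights; the trade-off only bites once compute/memory budgets or distribution shift enter.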
Very simple continual RL algorithm
Julian, Swanson, Sukhatme, Levine, Finn, Hausman. Never Stop Learning. 2020.
[Figures: 7 robots collected 580k grasps; 86% vs. 49%]
Can we do better?
What about negative transfer?
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement
from the meta-learning perspective
Case Study: Can we modify vanilla SGD to avoid negative backward transfer?

Idea:
(1) store a small amount of data per task in memory
(2) when making updates for new tasks, ensure that they don’t unlearn previous tasks

Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ‘17

How do we accomplish (2)?

Learning predictor 𝑦𝑡 = 𝑓𝜃(𝑥𝑡, 𝑧𝑡), with memory ℳ𝑘 for each task 𝑧𝑘.

For 𝑡 = 0, …, 𝑇:
  minimize ℒ(𝑓𝜃(⋅, 𝑧𝑡), (𝑥𝑡, 𝑦𝑡))
  subject to ℒ(𝑓𝜃, ℳ𝑘) ≤ ℒ(𝑓𝜃^(𝑡−1), ℳ𝑘) for all 𝑧𝑘 < 𝑧𝑡
  (i.e. s.t. loss on previous tasks doesn’t get worse)

Assume local linearity; the constraints become:
⟨𝑔𝑡, 𝑔𝑘⟩ := ⟨∂ℒ(𝑓𝜃, (𝑥𝑡, 𝑦𝑡))/∂𝜃, ∂ℒ(𝑓𝜃, ℳ𝑘)/∂𝜃⟩ ≥ 0 for all 𝑧𝑘 < 𝑧𝑡

Can formulate & solve as a QP.
Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ‘17
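For a single memory gradient the constraint has a closed-form fix; this sketch shows only that one-constraint case (GEM proper solves a QP over all past-task constraints):

```python
import numpy as np

def project_gradient(g, g_mem):
    """If the proposed update g conflicts with the memory gradient g_mem
    (negative inner product, i.e. it would increase loss on the old task
    under local linearity), project g onto the half-space <g', g_mem> >= 0.
    Closed form for one constraint; GEM solves a QP over all of them."""
    dot = g @ g_mem
    if dot >= 0:
        return g                                  # no conflict: keep the update
    return g - (dot / (g_mem @ g_mem)) * g_mem    # remove the conflicting component

g = np.array([1.0, -1.0])       # gradient on the current task
g_mem = np.array([0.0, 1.0])    # gradient on memory of an old task
g_proj = project_gradient(g, g_mem)
print(g_proj)                   # conflict removed: update no longer opposes g_mem
```

After projection the update is orthogonal to (or aligned with) the old-task gradient, so to first order it cannot increase the old task's loss.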
Experiments

Problems:
- MNIST permutations
- MNIST rotations
- CIFAR-100 (5 new classes/task)
Total memory size: 5012 examples
(BWT: backward transfer, FWT: forward transfer)

If we take a step back… do these experimental domains make sense?
Can we meta-learn how to avoid negative backward transfer?
Javed & White. Meta-Learning Representations for Continual Learning. NeurIPS ‘19
Beaulieu et al. Learning to Continually Learn. ‘20
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement
from the meta-learning perspective
More realistically:
[Diagram: a stream of tasks over time; slow learning on early tasks, rapid learning on later ones]
What might be wrong with the online learning formulation?

Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks while minimizing static regret; evaluates zero-shot performance at each step.

Online Meta-Learning (Finn*, Rajeswaran*, Kakade, Levine, ICML ’18): efficiently learn a sequence of tasks from a non-stationary distribution; evaluates performance after seeing a small amount of data from each task.

Primarily a difference in evaluation, rather than the data stream.
The Online Meta-Learning Setting (Finn*, Rajeswaran*, Kakade, Levine, ICML ’18)

for task t = 1, …, n:
  observe 𝒟𝑡^tr
  use update procedure Φ(𝜃𝑡, 𝒟𝑡^tr) to produce parameters 𝜙𝑡
  observe 𝑥𝑡
  predict 𝑦̂𝑡 = 𝑓𝜙𝑡(𝑥𝑡)
  observe label 𝑦𝑡

(cf. the standard online learning setting, which has no inner update step)

Goal: a learning algorithm with sub-linear regret (loss of algorithm minus loss of best algorithm in hindsight).
Can we apply meta-learning in lifelong learning settings?

Recall the follow the leader (FTL) algorithm: store all the data you’ve seen so far, and train on it.

Follow the meta-leader (FTML) algorithm:
1. Store all the data you’ve seen so far, and meta-train on it.
2. Run the update procedure on the current task.
3. Deploy the model on the current task.
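A rough sketch of the FTML loop on toy linear-regression tasks, using a first-order (Reptile-style) meta-update as a stand-in for the MAML-based update in the paper; all numeric choices are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_adapt(theta, X, Y, lr=0.1, steps=10):
    """Inner-loop update procedure Phi(theta, D_tr): a few SGD steps."""
    w = theta.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - Y) / len(Y)
        w -= lr * grad
    return w

def ftml_meta_train(task_buffer, theta, meta_lr=0.5, meta_steps=20):
    """Meta-train on all stored tasks with a Reptile-style first-order
    update (a stand-in for the MAML update used in the FTML paper)."""
    for _ in range(meta_steps):
        X, Y = task_buffer[rng.integers(len(task_buffer))]
        phi = sgd_adapt(theta, X, Y)
        theta = theta + meta_lr * (phi - theta)  # move init toward adapted params
    return theta

# Hypothetical stream of related linear-regression tasks.
task_buffer = []
theta = np.zeros(3)
for t in range(5):
    w_task = np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=3)
    X = rng.normal(size=(32, 3))
    Y = X @ w_task
    task_buffer.append((X, Y))                    # 1) store all data seen so far
    theta = ftml_meta_train(task_buffer, theta)   #    ... and meta-train on it
    phi_t = sgd_adapt(theta, X[:5], Y[:5])        # 2) run update procedure on current task
    # 3) deploy phi_t on the current task
```

The meta-trained initialization ends up near the shared task structure, so each new task needs only a handful of datapoints to adapt.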
What meta-learning algorithms are well-suited for FTML?
What if 𝑝𝑡(𝒯) is non-stationary?
Experiment with sequences of tasks:
- Colored, rotated, scaled MNIST
- 3D object pose prediction
- CIFAR-100 classification
Example pose prediction tasks: plane, car, chair.
Experiments

Comparisons:
- TOE (train on everything): train on all data so far
- FTL (follow the leader): train on all data so far, fine-tune on current task
- From Scratch: train from scratch on each task

[Plots: learning efficiency (# datapoints) and learning proficiency (error) vs. task index, on Rainbow MNIST and pose prediction]

Follow The Meta-Leader learns each new task faster & with greater proficiency, approaching the few-shot learning regime.
Takeaways
Many flavors of lifelong learning, all under the same name.
Defining the problem statement is often the hardest part.
Meta-learning can be viewed as a slice of the lifelong learning problem.
A very open area of research.

More Related Content

Similar to cs330_2021_lifelong_learning.pdf

AI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxAI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptx
MohammadAsim91
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
BigDataExpo
 
Learning how to learn
Learning how to learnLearning how to learn
Learning how to learn
Joaquin Vanschoren
 
ML crash course
ML crash courseML crash course
ML crash course
mikaelhuss
 
Machine Learning Landscape
Machine Learning LandscapeMachine Learning Landscape
Machine Learning Landscape
Eng Teong Cheah
 
Machine learning
Machine learningMachine learning
Machine learning
Digvijay Singh
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
butest
 
[update] Introductory Parts of the Book "Dive into Deep Learning"
[update] Introductory Parts of the Book "Dive into Deep Learning"[update] Introductory Parts of the Book "Dive into Deep Learning"
[update] Introductory Parts of the Book "Dive into Deep Learning"
Young-Min kang
 
Unit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptxUnit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptx
jawad184956
 
Ron's muri presentation
Ron's muri presentationRon's muri presentation
Ron's muri presentation
gowinraj
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
NeeleEilers
 
introduction to machine learning and nlp
introduction to machine learning and nlpintroduction to machine learning and nlp
introduction to machine learning and nlp
Mahmoud Farag
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
RahulKirtoniya
 
Teachers Little Helper: Multi-Math-Coach
Teachers Little Helper: Multi-Math-CoachTeachers Little Helper: Multi-Math-Coach
Teachers Little Helper: Multi-Math-Coach
Educational Technology
 
Learning for Big Data-林軒田
Learning for Big Data-林軒田Learning for Big Data-林軒田
Learning for Big Data-林軒田
台灣資料科學年會
 
Machine learning: A Walk Through School Exams
Machine learning: A Walk Through School ExamsMachine learning: A Walk Through School Exams
Machine learning: A Walk Through School Exams
Ramsha Ijaz
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introduction
Adwait Bhave
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
NAVER Engineering
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learning
butest
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-code
Osama Ghandour Geris
 

Similar to cs330_2021_lifelong_learning.pdf (20)

AI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxAI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptx
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
Learning how to learn
Learning how to learnLearning how to learn
Learning how to learn
 
ML crash course
ML crash courseML crash course
ML crash course
 
Machine Learning Landscape
Machine Learning LandscapeMachine Learning Landscape
Machine Learning Landscape
 
Machine learning
Machine learningMachine learning
Machine learning
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 
[update] Introductory Parts of the Book "Dive into Deep Learning"
[update] Introductory Parts of the Book "Dive into Deep Learning"[update] Introductory Parts of the Book "Dive into Deep Learning"
[update] Introductory Parts of the Book "Dive into Deep Learning"
 
Unit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptxUnit 1 - ML - Introduction to Machine Learning.pptx
Unit 1 - ML - Introduction to Machine Learning.pptx
 
Ron's muri presentation
Ron's muri presentationRon's muri presentation
Ron's muri presentation
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
 
introduction to machine learning and nlp
introduction to machine learning and nlpintroduction to machine learning and nlp
introduction to machine learning and nlp
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
 
Teachers Little Helper: Multi-Math-Coach
Teachers Little Helper: Multi-Math-CoachTeachers Little Helper: Multi-Math-Coach
Teachers Little Helper: Multi-Math-Coach
 
Learning for Big Data-林軒田
Learning for Big Data-林軒田Learning for Big Data-林軒田
Learning for Big Data-林軒田
 
Machine learning: A Walk Through School Exams
Machine learning: A Walk Through School ExamsMachine learning: A Walk Through School Exams
Machine learning: A Walk Through School Exams
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introduction
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learning
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-code
 

More from Kuan-Tsae Huang

TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
Kuan-Tsae Huang
 
0305吳家樂lecture qva lecture , letcutre , my talk.pdf
0305吳家樂lecture qva lecture , letcutre , my talk.pdf0305吳家樂lecture qva lecture , letcutre , my talk.pdf
0305吳家樂lecture qva lecture , letcutre , my talk.pdf
Kuan-Tsae Huang
 
lecture_16_jiajun.pdf
lecture_16_jiajun.pdflecture_16_jiajun.pdf
lecture_16_jiajun.pdf
Kuan-Tsae Huang
 
MIT-Lec-11- RSA.pdf
MIT-Lec-11- RSA.pdfMIT-Lec-11- RSA.pdf
MIT-Lec-11- RSA.pdf
Kuan-Tsae Huang
 
PQS_7_ruohan.pdf
PQS_7_ruohan.pdfPQS_7_ruohan.pdf
PQS_7_ruohan.pdf
Kuan-Tsae Huang
 
lecture_1_2_ruohan.pdf
lecture_1_2_ruohan.pdflecture_1_2_ruohan.pdf
lecture_1_2_ruohan.pdf
Kuan-Tsae Huang
 
lecture_6_jiajun.pdf
lecture_6_jiajun.pdflecture_6_jiajun.pdf
lecture_6_jiajun.pdf
Kuan-Tsae Huang
 
discussion_3_project.pdf
discussion_3_project.pdfdiscussion_3_project.pdf
discussion_3_project.pdf
Kuan-Tsae Huang
 
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
Kuan-Tsae Huang
 
半導體實驗室.pptx
半導體實驗室.pptx半導體實驗室.pptx
半導體實驗室.pptx
Kuan-Tsae Huang
 
B 每產生1度氫電排放多少.pptx
B 每產生1度氫電排放多少.pptxB 每產生1度氫電排放多少.pptx
B 每產生1度氫電排放多少.pptx
Kuan-Tsae Huang
 
20230418 GPT Impact on Accounting & Finance -Reg.pdf
20230418 GPT Impact on Accounting & Finance -Reg.pdf20230418 GPT Impact on Accounting & Finance -Reg.pdf
20230418 GPT Impact on Accounting & Finance -Reg.pdf
Kuan-Tsae Huang
 
電動拖板車100208.pptx
電動拖板車100208.pptx電動拖板車100208.pptx
電動拖板車100208.pptx
Kuan-Tsae Huang
 
電動拖板車100121.pptx
電動拖板車100121.pptx電動拖板車100121.pptx
電動拖板車100121.pptx
Kuan-Tsae Huang
 
FIN330 Chapter 20.pptx
FIN330 Chapter 20.pptxFIN330 Chapter 20.pptx
FIN330 Chapter 20.pptx
Kuan-Tsae Huang
 
FIN330 Chapter 22.pptx
FIN330 Chapter 22.pptxFIN330 Chapter 22.pptx
FIN330 Chapter 22.pptx
Kuan-Tsae Huang
 
FIN330 Chapter 01.pptx
FIN330 Chapter 01.pptxFIN330 Chapter 01.pptx
FIN330 Chapter 01.pptx
Kuan-Tsae Huang
 
Thoughtworks Q1 2022 Investor Presentation.pdf
Thoughtworks Q1 2022 Investor Presentation.pdfThoughtworks Q1 2022 Investor Presentation.pdf
Thoughtworks Q1 2022 Investor Presentation.pdf
Kuan-Tsae Huang
 
20220828-How-do-you-build-an-effic.9658238.ppt.pptx
20220828-How-do-you-build-an-effic.9658238.ppt.pptx20220828-How-do-you-build-an-effic.9658238.ppt.pptx
20220828-How-do-you-build-an-effic.9658238.ppt.pptx
Kuan-Tsae Huang
 
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
Kuan-Tsae Huang
 

More from Kuan-Tsae Huang (20)

TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...TWCC22_PPT_v3_KL.pdf  quantum computer, quantum computer , quantum computer ,...
TWCC22_PPT_v3_KL.pdf quantum computer, quantum computer , quantum computer ,...
 
0305吳家樂lecture qva lecture , letcutre , my talk.pdf
0305吳家樂lecture qva lecture , letcutre , my talk.pdf0305吳家樂lecture qva lecture , letcutre , my talk.pdf
0305吳家樂lecture qva lecture , letcutre , my talk.pdf
 
lecture_16_jiajun.pdf
lecture_16_jiajun.pdflecture_16_jiajun.pdf
lecture_16_jiajun.pdf
 
MIT-Lec-11- RSA.pdf
MIT-Lec-11- RSA.pdfMIT-Lec-11- RSA.pdf
MIT-Lec-11- RSA.pdf
 
PQS_7_ruohan.pdf
PQS_7_ruohan.pdfPQS_7_ruohan.pdf
PQS_7_ruohan.pdf
 
lecture_1_2_ruohan.pdf
lecture_1_2_ruohan.pdflecture_1_2_ruohan.pdf
lecture_1_2_ruohan.pdf
 
lecture_6_jiajun.pdf
lecture_6_jiajun.pdflecture_6_jiajun.pdf
lecture_6_jiajun.pdf
 
discussion_3_project.pdf
discussion_3_project.pdfdiscussion_3_project.pdf
discussion_3_project.pdf
 
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
經濟部科研成果價值創造計畫-簡報格式111.05.30.pptx
 
半導體實驗室.pptx
半導體實驗室.pptx半導體實驗室.pptx
半導體實驗室.pptx
 
B 每產生1度氫電排放多少.pptx
B 每產生1度氫電排放多少.pptxB 每產生1度氫電排放多少.pptx
B 每產生1度氫電排放多少.pptx
 
20230418 GPT Impact on Accounting & Finance -Reg.pdf
20230418 GPT Impact on Accounting & Finance -Reg.pdf20230418 GPT Impact on Accounting & Finance -Reg.pdf
20230418 GPT Impact on Accounting & Finance -Reg.pdf
 
電動拖板車100208.pptx
電動拖板車100208.pptx電動拖板車100208.pptx
電動拖板車100208.pptx
 
電動拖板車100121.pptx
電動拖板車100121.pptx電動拖板車100121.pptx
電動拖板車100121.pptx
 
FIN330 Chapter 20.pptx
FIN330 Chapter 20.pptxFIN330 Chapter 20.pptx
FIN330 Chapter 20.pptx
 
FIN330 Chapter 22.pptx
FIN330 Chapter 22.pptxFIN330 Chapter 22.pptx
FIN330 Chapter 22.pptx
 
FIN330 Chapter 01.pptx
FIN330 Chapter 01.pptxFIN330 Chapter 01.pptx
FIN330 Chapter 01.pptx
 
Thoughtworks Q1 2022 Investor Presentation.pdf
Thoughtworks Q1 2022 Investor Presentation.pdfThoughtworks Q1 2022 Investor Presentation.pdf
Thoughtworks Q1 2022 Investor Presentation.pdf
 
20220828-How-do-you-build-an-effic.9658238.ppt.pptx
20220828-How-do-you-build-an-effic.9658238.ppt.pptx20220828-How-do-you-build-an-effic.9658238.ppt.pptx
20220828-How-do-you-build-an-effic.9658238.ppt.pptx
 
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
Amine solvent development for carbon dioxide capture by yang du, 2016 (doctor...
 

Recently uploaded

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 

Recently uploaded (20)

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 

cs330_2021_lifelong_learning.pdf

  • 2. Reminders Tuesday (Nov 30th): Project poster session Wednesday (Dec 8th): Project due This Wednesday: Lecture and instructor office hours over zoom
  • 3. Plan for Today The lifelong learning problem statement Basic approaches to lifelong learning Can we do better than the basics? Revisiting the problem statement from the meta-learning perspective
  • 4. A brief review of problem statements. Meta-Learning Given i.i.d. task distribution, learn a new task efficiently quickly learn new task learn to learn tasks Multi-Task Learning Learn to solve a set of tasks. perform tasks learn tasks
  • 5. In contrast, many real world settings look like: Meta-Learning learn to learn tasks quickly learn new task Multi-Task Learning perform tasks learn tasks time - a student learning concepts in school - a deployed image classification system learning from a stream of images from users - a robot acquiring an increasingly large set of skills in different environments - a virtual assistant learning to help different users with different tasks at different points in time - a doctor’s assistant aiding in medical decision-making Some examples: Our agents may not be given a large batch of data/tasks right off the bat!
  • 6. Sequential learning settings online learning, lifelong learning, continual learning, incremental learning, streaming data distinct from sequence data and sequential decision-making Some Terminology
  • 7. 1. Pick an example setting. 2. Discuss problem statement in your break-out room: (a) how would you set-up an experiment to develop & test your algorithm? (b) what are desirable/required properties of the algorithm? (c) how do you evaluate such a system? What is the lifelong learning problem statement? A. a student learning concepts in school B. a deployed image classification system learning from a stream of images from users C. a robot acquiring an increasingly large set of skills in different environments D. a virtual assistant learning to help different users with different tasks at different points in time E. a doctor’s assistant aiding in medical decision-making Example settings: Exercise:
  • 9. Some considerations: - computational resources - memory - model performance - data efficiency Problem variations: - task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial - others: privacy, interpretability, fairness, test time compute & memory - discrete task boundaries vs. continuous shifts (vs. both) - known task boundaries/shifts vs. unknown Substantial variety in problem statement! What is the lifelong learning problem statement?
  • 10. What is the lifelong learning problem statement?
    General [supervised] online learning problem:
    for t = 1, …, n:
      observe 𝑥𝑡
      predict 𝑦̂𝑡
      observe label 𝑦𝑡
    i.i.d. setting: 𝑥𝑡 ∼ 𝑝(𝑥), 𝑦𝑡 ∼ 𝑝(𝑦|𝑥), where 𝑝 is not a function of 𝑡
    <— true in some cases, but not in many cases!
    otherwise: 𝑥𝑡 ∼ 𝑝𝑡(𝑥), 𝑦𝑡 ∼ 𝑝𝑡(𝑦|𝑥)
    if observable task boundaries: observe 𝑥𝑡, 𝑧𝑡
    streaming setting: cannot store (𝑥𝑡, 𝑦𝑡), e.g. due to:
    - lack of memory (recall: replay buffers)
    - lack of computational resources
    - privacy considerations
    - wanting to study neural memory mechanisms
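The protocol on this slide can be sketched in a few lines. This is a minimal toy illustration (not from the lecture): the stream is a fixed linear target, and the learner takes one SGD step per datapoint, honoring the streaming constraint that (𝑥𝑡, 𝑦𝑡) is discarded after use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stream: x_t ~ p(x) i.i.d., labels from a fixed linear target (illustrative).
w_star = np.array([2.0, -1.0])

def stream(n):
    for _ in range(n):
        x = rng.normal(size=2)
        yield x, w_star @ x

# Online protocol from the slide: observe x_t, predict y_hat_t, then observe y_t.
w = np.zeros(2)
losses = []
for x, y in stream(500):
    y_hat = w @ x                      # predict before the label is revealed
    losses.append((y_hat - y) ** 2)
    w -= 0.1 * 2 * (y_hat - y) * x     # single gradient step; nothing stored

assert np.mean(losses[-50:]) < np.mean(losses[:50])  # learner improves over the stream
```

In the non-i.i.d. case, `stream` would draw from a time-varying 𝑝𝑡 instead; the loop itself is unchanged.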
  • 11. What do you want from your lifelong learning algorithm?
    Minimal regret (that grows slowly with 𝑡).
    regret: cumulative loss of learner − cumulative loss of best learner in hindsight
    Regret_𝑇 := ∑𝑡=1..𝑇 ℒ𝑡(𝜃𝑡) − min𝜃 ∑𝑡=1..𝑇 ℒ𝑡(𝜃)
    (cannot be evaluated in practice, but useful for analysis)
    Regret that grows linearly in 𝑡 is trivial. Why?
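On the "Why?": for bounded losses, even a learner that never updates incurs at most a constant excess loss per step, so linear regret comes for free; the meaningful bar is sub-linear regret, i.e. average excess loss tending to zero. A toy illustration (not from the lecture) with quadratic losses, where follow-the-leader achieves roughly logarithmic regret:

```python
import numpy as np

rng = np.random.default_rng(1)

# Online quadratic losses L_t(theta) = (theta - c_t)^2 with i.i.d. c_t (illustrative).
c = rng.normal(loc=3.0, size=2000)

theta, cum, regrets = 0.0, 0.0, []
for t in range(1, len(c) + 1):
    c_t = c[t - 1]
    cum += (theta - c_t) ** 2                       # learner's cumulative loss
    best = c[:t].mean()                             # best fixed theta in hindsight
    regrets.append(cum - np.sum((best - c[:t]) ** 2))
    theta -= (theta - c_t) / t                      # keeps theta = running mean (follow the leader)

# Average regret Regret_T / T shrinks over time: sub-linear regret.
assert regrets[-1] / len(c) < regrets[99] / 100
```

The hindsight baseline here is the empirical mean, which exactly minimizes the sum of quadratics, so `regrets` matches the slide's definition term for term.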
  • 12. What do you want from your lifelong learning algorithm?
    regret: cumulative loss of learner − cumulative loss of best learner in hindsight
    Regret_𝑇 := ∑𝑡=1..𝑇 ℒ𝑡(𝜃𝑡) − min𝜃 ∑𝑡=1..𝑇 ℒ𝑡(𝜃)
    Worked example (predictions vs. labels over time):
    t | 𝑦̂𝑡 | 𝑦𝑡
    1 | 10 | 30
    2 | 30 | 28
    3 | 28 | 32
  • 13. What do you want from your lifelong learning algorithm?
    Positive & negative transfer:
    positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch
    positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch
    (negative transfer: replace "better" with "worse")
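One common way to quantify these notions (used, e.g., in the GEM paper's BWT/FWT metrics) is an accuracy matrix R, where R[i, j] is test accuracy on task j after training through task i. The numbers below are hypothetical, purely for illustration:

```python
import numpy as np

# R[i, j] = test accuracy on task j after training on tasks 0..i (hypothetical numbers).
R = np.array([
    [0.90, 0.20, 0.15],
    [0.85, 0.88, 0.30],
    [0.80, 0.84, 0.91],
])
b = np.array([0.20, 0.25, 0.20])  # per-task accuracy of a randomly initialized model

T = R.shape[0]
# Backward transfer: how later training changed accuracy on earlier tasks.
bwt = np.mean([R[T - 1, i] - R[i, i] for i in range(T - 1)])
# Forward transfer: zero-shot accuracy on task i after tasks < i, vs. random init.
fwt = np.mean([R[i - 1, i] - b[i] for i in range(1, T)])

assert bwt < 0   # slight forgetting in this example
assert fwt > 0   # earlier tasks helped future ones
```

Negative `bwt` is exactly the "forgetting" discussed on the next slides; positive `fwt` is the transfer benefit we hope lifelong learners provide.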
  • 14. Plan for Today The lifelong learning problem statement Basic approaches to lifelong learning Can we do better than the basics? Revisiting the problem statement from the meta-learning perspective
  • 15. Approaches
    Store all the data you’ve seen so far, and train on it. —> follow the leader algorithm
    + will achieve very strong performance
    - computation intensive
    - can be memory intensive [depends on the application]
    —> Continuous fine-tuning can help.
    Take a gradient step on the datapoint you observe. —> stochastic gradient descent
    + computationally cheap
    + requires 0 memory
    - subject to negative backward transfer ("forgetting"), sometimes referred to as catastrophic forgetting
    - slow learning
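The two baselines above can be contrasted directly. A hedged sketch on a toy linear regression stream (setup and names are illustrative, not from the lecture): follow-the-leader refits on everything stored, while online SGD touches each point once.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.5, -0.5])

# Follow the leader: store everything seen so far and refit from scratch each step.
def ftl_step(X_seen, y_seen):
    # least-squares fit on all stored data (memory- and compute-hungry)
    return np.linalg.lstsq(X_seen, y_seen, rcond=None)[0]

# Online SGD: one gradient step per datapoint, nothing stored.
def sgd_step(w, x, y_t, lr=0.05):
    return w - lr * 2 * (w @ x - y_t) * x

w_sgd = np.zeros(2)
for t in range(1, len(X)):
    w_ftl = ftl_step(X[:t], y[:t])
    w_sgd = sgd_step(w_sgd, X[t - 1], y[t - 1])

# FTL nails the solution; SGD gets close with a fraction of the compute/memory.
assert np.allclose(w_ftl, [1.5, -0.5], atol=1e-6)
assert np.linalg.norm(w_sgd - np.array([1.5, -0.5])) < 0.2
```

On a single stationary task both work; the forgetting downside of SGD only shows up once the stream switches tasks, which is what the gradient-projection methods later in the lecture address.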
  • 16. Very simple continual RL algorithm
    Julian, Swanson, Sukhatme, Levine, Finn, Hausman, Never Stop Learning, 2020
    [Figure: 7 robots collected 580k grasps; grasp success 86% vs. 49%]
  • 17. Very simple continual RL algorithm Julian, Swanson, Sukhatme, Levine, Finn, Hausman, Never Stop Learning, 2020
  • 18. Very simple continual RL algorithm Julian, Swanson, Sukhatme, Levine, Finn, Hausman, Never Stop Learning, 2020 Can we do better? What about negative transfer?
  • 19. Plan for Today The lifelong learning problem statement Basic approaches to lifelong learning Can we do better than the basics? Revisiting the problem statement from the meta-learning perspective
  • 20. Case Study: Can we modify vanilla SGD to avoid negative backward transfer? (from scratch)
  • 21. Idea: (Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ‘17)
    (1) store a small amount of data per task in memory: ℳ𝑘 for task 𝑧𝑘
    (2) when making updates for new tasks, ensure that they don’t unlearn previous tasks
    Learning predictor 𝑦𝑡 = 𝑓𝜃(𝑥𝑡, 𝑧𝑡).
    How do we accomplish (2)?
    For 𝑡 = 0, …, 𝑇:
      minimize ℒ(𝑓𝜃(⋅, 𝑧𝑡), (𝑥𝑡, 𝑦𝑡))
      subject to ℒ(𝑓𝜃, ℳ𝑘) ≤ ℒ(𝑓𝜃^{𝑡−1}, ℳ𝑘) for all 𝑧𝑘 < 𝑧𝑡
      (i.e., s.t. loss on previous tasks doesn’t get worse)
    Assume local linearity; then the constraint becomes:
    ⟨𝑔𝑡, 𝑔𝑘⟩ := ⟨∂ℒ(𝑓𝜃, (𝑥𝑡, 𝑦𝑡))/∂𝜃, ∂ℒ(𝑓𝜃, ℳ𝑘)/∂𝜃⟩ ≥ 0 for all 𝑧𝑘 < 𝑧𝑡
    Can formulate & solve as a QP.
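GEM enforces all of these inner-product constraints jointly via a QP. As an illustrative sketch, here is the single-memory-constraint special case, where the projection has a closed form (this is the simplification used by the follow-up A-GEM method rather than GEM's full QP):

```python
import numpy as np

def project_gradient(g_task, g_mem):
    """If the new-task gradient would increase loss on the memory (negative
    inner product), project it onto the closest gradient that doesn't."""
    dot = g_task @ g_mem
    if dot >= 0:                        # local-linearity check from the slide
        return g_task                   # no interference; use the gradient as-is
    # Single-constraint closed form: remove the conflicting component.
    return g_task - (dot / (g_mem @ g_mem)) * g_mem

# Conflicting gradients: stepping along g_task alone would undo the memory task.
g_task = np.array([1.0, 0.0])
g_mem = np.array([-1.0, 1.0])
g = project_gradient(g_task, g_mem)

assert g @ g_mem >= -1e-9                                                # constraint holds
assert np.allclose(project_gradient(np.ones(2), np.ones(2)), np.ones(2))  # aligned: unchanged
```

With many stored tasks, GEM instead solves the QP over all constraints at once; the projected gradient `g` then replaces the raw gradient in the SGD update.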
  • 22. Experiments (Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ‘17)
    Problems:
    - MNIST permutations
    - MNIST rotations
    - CIFAR-100 (5 new classes/task)
    Total memory size: 5012 examples
    BWT: backward transfer, FWT: forward transfer
    If we take a step back… do these experimental domains make sense?
  • 23. Can we meta-learn how to avoid negative backward transfer? Javed & White. Meta-Learning Representations for Continual Learning. NeurIPS ‘19 Beaulieu et al. Learning to Continually Learn. ‘20
  • 24. Plan for Today The lifelong learning problem statement Basic approaches to lifelong learning Can we do better than the basics? Revisiting the problem statement from the meta-learning perspective
  • 25. What might be wrong with the online learning formulation?
    Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks while minimizing static regret.
    [Diagram: a timeline of "perform, perform, perform, …" — i.e., zero-shot performance on each task]
    More realistically: [a timeline of "learn, learn, learn, …" — slow learning early, rapid learning later]
  • 26. What might be wrong with the online learning formulation?
    Online Learning (Hannan ’57, Zinkevich ’03): perform a sequence of tasks while minimizing static regret — evaluates zero-shot performance.
    Online Meta-Learning (Finn*, Rajeswaran*, Kakade, Levine ICML ’18): efficiently learn a sequence of tasks from a non-stationary distribution — evaluates performance after seeing a small amount of data.
    Primarily a difference in evaluation, rather than the data stream.
  • 27. The Online Meta-Learning Setting (Finn*, Rajeswaran*, Kakade, Levine ICML ’18)
    for task t = 1, …, n:
      observe 𝒟𝑡^tr
      use update procedure Φ(𝜃𝑡, 𝒟𝑡^tr) to produce parameters 𝜙𝑡
      observe 𝑥𝑡, predict 𝑦̂𝑡 = 𝑓𝜙𝑡(𝑥𝑡)   <— standard online learning setting
      observe label 𝑦𝑡
    Goal: a learning algorithm with sub-linear regret (loss of algorithm − loss of best algorithm in hindsight).
  • 28. Can we apply meta-learning in lifelong learning settings?
    Recall the follow the leader (FTL) algorithm: store all the data you’ve seen so far, and train on it.
    Follow the meta-leader (FTML) algorithm:
    1. Store all the data you’ve seen so far, and meta-train on it.
    2. Run the update procedure on the current task.
    3. Deploy the model on the current task.
    What meta-learning algorithms are well-suited for FTML? What if 𝑝𝑡(𝒯) is non-stationary?
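The FTML loop might be sketched as follows. This is a hedged toy version: the paper uses MAML as the meta-learner, but any update procedure Φ fits the template; here a Reptile-style meta-update and a least-squares "model" keep the sketch short, and every function and constant is illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(theta, data, inner_lr=0.1, steps=10):
    """Update procedure Phi(theta, D^tr): a few gradient steps on the task's
    training data (least squares stands in for a real model)."""
    X, y = data
    w = theta.copy()
    for _ in range(steps):
        w -= inner_lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

def make_task(t):
    # Non-stationary task stream: the target weights drift slowly over time.
    w_true = np.array([np.sin(0.3 * t), np.cos(0.3 * t)])
    X = rng.normal(size=(40, 2))
    return X, X @ w_true

theta = np.zeros(2)   # meta-parameters
buffer = []           # all data seen so far
for t in range(30):
    task = make_task(t)
    phi_t = phi(theta, task)        # adapt to the current task, then deploy
    buffer.append(task)
    for past in buffer:             # meta-train on everything seen so far
        theta += 0.1 * (phi(theta, past) - theta)   # Reptile-style meta-step

# Adapting from the meta-learned theta beats a from-scratch (zero) predictor.
X, y = buffer[-1]
assert np.mean((X @ phi_t - y) ** 2) < np.mean(y ** 2)
```

The design choice that makes this "meta": `theta` is trained so that a few steps of `phi` succeed on new tasks, rather than being trained to perform well directly, which is what distinguishes FTML from plain FTL.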
  • 29. Experiments
    Experiment with sequences of tasks:
    - Colored, rotated, scaled MNIST
    - 3D object pose prediction (example tasks: plane, car, chair)
    - CIFAR-100 classification
  • 30. Experiments
    Comparisons:
    - TOE (train on everything): train on all data so far
    - FTL (follow the leader): train on all data so far, fine-tune on current task
    - From Scratch: train from scratch on each task
    [Plots: learning efficiency (# datapoints) and learning proficiency (error) vs. task index, on Rainbow MNIST and pose prediction]
    Follow The Meta-Leader learns each new task faster & with greater proficiency, approaching the few-shot learning regime.
  • 31. Takeaways
    Many flavors of lifelong learning, all under the same name. Defining the problem statement is often the hardest part.
    Meta-learning can be viewed as a slice of the lifelong learning problem.
    A very open area of research.
  • 32. Reminders Tuesday (Nov 30th): Project poster session Wednesday (Dec 8th): Project due This Wednesday: Lecture and instructor office hours over zoom