SlideShare a Scribd company logo
1 of 26
A Simple Neural AttentIve Meta-
Learner
PR-153
Mar 31, 2019
Taekmin Kim
1
2
Machine Learning vs. Human
● Machine Learning
○ Try to learn data points
■ Supervised/Unsupervised Learning
■ Reinforcement Learning
● Human
○ Fast adaptation with prior knowledge
■ Few-shot learning
■ Generalization across tasks
3
● Related Work
○ LEARNING TO REINFORCEMENT LEARN
○ RL^2
○ MAML
○ Auto-Meta
Meta-Learner?
Multi-armed bandit problem
https://blog.floydhub.com/meta-rl/
4
Meta-RL
● Goal: Generalization across tasks
● Notations
○ T: Task distribution e.g., driving, multi-armed bandit problems
○ T_i: Specific task e.g., Sonata, Porsche, ...
○ x_t: state
○ a_t: action
5
RNN-based Meta-RL(Agent)
● sequence-to-sequence problem
○ refer to past experience
● Drawbacks:
○ Temporally-linear dependency
https://blog.floydhub.com/meta-rl/
6
Motivation
● Temporal(Causal) Convolution
○ depends on previous steps
● Soft Attention
○ weighted sum
https://www.slideshare.net/ThomasHjeldeThoresen/temporal-convolutional-networks-dethroning-rnns-for-sequence-modelling
https://medium.com/syncedreview/memory-attention-sequences-8522f531dd43
7
Temporal Convolution
https://www.slideshare.net/ThomasHjeldeThoresen/temporal-convolutional-networks-dethroning-rnns-for-sequence-modelling
Vanilla 1D TCs
(exponential) Dilated 1D TCs
Vanilla 1D Convolution Temporal Convolution(TC)
8
Attention is All you Need(2017)
https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/#.XJ6U6-szZ0c
https://medium.com/@hyponymous/paper-summary-attention-is-all-you-need-22c2c7a5e06
Q: Hidden State of Decoder
K: Hidden State of Encoder
V: (normalized) Weights
9
PR-049: https://www.youtube.com/watch?v=6zGgVIlStXs
Motivation
● Temporal(Causal) Convolution
○ depends on previous steps
● Soft Attention
○ weighted sum
https://www.slideshare.net/ThomasHjeldeThoresen/temporal-convolutional-networks-dethroning-rnns-for-sequence-modelling
https://medium.com/syncedreview/memory-attention-sequences-8522f531dd43
10
Simple Neural AttentIve Learner
Building Blocks
● DenseBlock
● TCBlock
● AttentionBlock
11
Dense Block
12
TC Block
13
Attention Block
Query: Hidden State of Decoder
Key: Hidden State of Encoder
Value: (normalized) Weights
14
15
Simple Neural AttentIve Learner
Building Blocks
● DenseBlock
● TCBlock
● AttentionBlock
16
Experiments
● Supervised Learning
○ Few-Shot Learning(Image Classification)
■ n-Way DATASET
■ m-shot
● Reinforcement Learning
○ Multi-Armed Bandits
○ Tabular MDPs
○ Continuous Control
○ Visual Navigation
17
Results: Few-shot Learning
MAML
Omniglot
18
MAML
19
MAML: Optimization-based Meta-RL
20
https://arxiv.org/abs/1703.03400
PR-094: MAML https://www.youtube.com/watch?v=fxJXXKZb-ik
Results: Multi-armed Bandits
MAML
21
Results: Visual Navigation
22
Results: Tabular MDPs
23
가깝고도 먼 TRPO(이웅원 님)
https://www.slideshare.net/WoongwonLee/trpo-87165690
MAML
24
Results: Continous Control
25
Summary
26
● SNAIL
○ Temporal Convolution
○ Soft Attention
● Meta-RL is promising
● Related Work
○ LEARNING TO REINFORCEMENT LEARN
○ RL^2
○ MAML
○ Auto-Meta
● Materials
○ Meta-RL(Chelsea Finn): http://rail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-20.pdf

More Related Content

What's hot

What's hot (20)

Machine Learning and Applications
Machine Learning and ApplicationsMachine Learning and Applications
Machine Learning and Applications
 
LLaMA 2.pptx
LLaMA 2.pptxLLaMA 2.pptx
LLaMA 2.pptx
 
A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3A brief primer on OpenAI's GPT-3
A brief primer on OpenAI's GPT-3
 
Machine learning
Machine learning Machine learning
Machine learning
 
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
大規模言語モデル開発を支える分散学習技術 - 東京工業大学横田理央研究室の藤井一喜さん
 
Automatic Machine Learning, AutoML
Automatic Machine Learning, AutoMLAutomatic Machine Learning, AutoML
Automatic Machine Learning, AutoML
 
Gpt models
Gpt modelsGpt models
Gpt models
 
Customizing LLMs
Customizing LLMsCustomizing LLMs
Customizing LLMs
 
Siamese networks
Siamese networksSiamese networks
Siamese networks
 
[DL輪読会]BERT: Pre-training of Deep Bidirectional Transformers for Language Und...
[DL輪読会]BERT: Pre-training of Deep Bidirectional Transformers for Language Und...[DL輪読会]BERT: Pre-training of Deep Bidirectional Transformers for Language Und...
[DL輪読会]BERT: Pre-training of Deep Bidirectional Transformers for Language Und...
 
Pydata_リクルートにおけるbanditアルゴリズム_実装前までのプロセス
Pydata_リクルートにおけるbanditアルゴリズム_実装前までのプロセスPydata_リクルートにおけるbanditアルゴリズム_実装前までのプロセス
Pydata_リクルートにおけるbanditアルゴリズム_実装前までのプロセス
 
Machine learning basics
Machine learning basics Machine learning basics
Machine learning basics
 
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Model-Agnostic Meta-Learning for Fast Adaptation of Deep NetworksModel-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
MLP深層学習 LSTM
MLP深層学習 LSTMMLP深層学習 LSTM
MLP深層学習 LSTM
 
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
Optunaを使ったHuman-in-the-loop最適化の紹介 - 2023/04/27 W&B 東京ミートアップ #3
 
알아두면 쓸데있는 신비한 딥러닝 이야기
알아두면 쓸데있는 신비한 딥러닝 이야기알아두면 쓸데있는 신비한 딥러닝 이야기
알아두면 쓸데있는 신비한 딥러닝 이야기
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
A Brief Introduction on Recurrent Neural Network and Its Application
A Brief Introduction on Recurrent Neural Network and Its ApplicationA Brief Introduction on Recurrent Neural Network and Its Application
A Brief Introduction on Recurrent Neural Network and Its Application
 
Machine learning
Machine learningMachine learning
Machine learning
 

Recently uploaded

TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
Wonjun Hwang
 

Recently uploaded (20)

Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 

PR-153: SNAIL: A Simple Neural Attentive Meta-Learner