Perception and Intelligence Laboratory
Seoul National University
End-to-End Memory Networks
Sukhbaatar, S., Weston, J., & Fergus, R. (NIPS 2015)
Junho Cho, 2016.08.05
Memory networks: reasoning with a long-term memory component that can be read and written.
What to solve?
Reading a long story and answering questions about it: a problem that requires long-term memory.
What about RNNs?
• The memory (encoded by hidden states and weights) is typically too small.
• They do not accurately remember facts from the past (knowledge is compressed into dense vectors).
• They are not even able to reproduce the input as output (Zaremba & Sutskever, 2014).
What is a Memory Network (MemNN)?
• First introduced in 2014, published at ICLR 2015 [Memory Networks]
• A class of models that combines a large memory with a learning component that can read and write to it
• Incorporates reasoning with attention over memory
< End-to-End Memory Network >
A neural network with a recurrent attention model over an external memory.
< End-to-End Memory Network >
End-to-End Memory Network (MemN2N)
Note that the memory component is external and is not changed after the sentences are embedded into memory.
Full Scheme
A, B, C, W are trained jointly.
Full Scheme
1. Store the story sentences in the input memory.
Full Scheme
2. Embed the question and take its inner product with each memory vector. Memories related to the question receive more attention.
Full Scheme
3. Knowing which memories to attend to, take a weighted sum of the output memory vectors. Add the embedded question to this output and predict the answer as a one-hot vector.
Full Scheme
4. All calculations are done in the embedding space.
Full Scheme
$a = \mathrm{Softmax}(W(o + u))$
$m_i = \sum_j A x_{ij}$
$u = \sum_j B q_j$
$p_i = \mathrm{Softmax}(u^T m_i)$
$o = \sum_i p_i c_i$
$c_i = \sum_j C x_{ij}$
Tasks: bAbI
• Facebook AI Research
• 20 tasks for testing text understanding and reasoning
• Generated by simulation; humans should get 100%.
• Each task has 1,000 questions and answers.
• Sample task: factoid QA with two supporting facts
John is in the school. (SUPPORTING FACT)
Jason is in the office.
John picked up the football. (SUPPORTING FACT)
Jason went to the kitchen.
Q: Where is the football?
A: School
To answer the question, the two supporting facts are what matter; the other sentences are just distractions. The memory network should focus on the supporting facts.
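To make the data format concrete, here is a minimal Python sketch of how one such example could be represented; the field names (story, question, answer, supporting) are illustrative and not the official bAbI file format.

```python
# A minimal, illustrative representation of one bAbI-style example.
example = {
    "story": [
        "John is in the school.",        # supporting fact
        "Jason is in the office.",
        "John picked up the football.",  # supporting fact
        "Jason went to the kitchen.",
    ],
    "question": "Where is the football?",
    "answer": "school",
    "supporting": [0, 2],  # indices of the supporting sentences (not used by MemN2N training)
}
```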
Tasks: bAbI
• The 20 tasks (the full task list is shown as figures in the original slides)
Tasks: bAbI
• Dynamic Memory Network (not introduced today) DEMO
• http://yerevann.com/dmn-ui/#/
• Example test story + predictions:
Antoine went to the kitchen. Antoine got the milk. Antoine travelled to the office. Antoine dropped the milk. Sumit picked up the football. Antoine went to the bathroom. Sumit moved to the kitchen.
• Where is the milk now? A: office
• Where is the football? A: kitchen
• Where is Antoine? A: bathroom
• Where is Sumit? A: kitchen
• Where was Antoine before the bathroom? A: office
Challenging
• The model must be able to infer the answer through the supporting facts.
• Supporting facts are labeled at the sentence level.
• They can also be given to the model for fully supervised learning.
• In this paper, only the answer is given as ground truth.
• The memory network must predict the answer by inferring which sentences are the supporting facts.
• This is useful in realistic QA tasks and language modeling.
John is in the school. (SUPPORTING FACT)
Jason is in the office.
John picked up the football. (SUPPORTING FACT)
Jason went to the kitchen.
Q: Where is the football?
A: School
End-to-End Memory Network (MemN2N)
• Presented at NIPS 2015
• New end-to-end model:
• Reads from memory with soft attention
• Performs multiple lookups (hops) over memory
• End-to-end training with back-propagation
• Only needs supervision on the final output
• It builds on MemNN, which had:
• Hard attention
• Requires explicit supervision of attention during training
• Only feasible for simple tasks
Problem Statement
1. bAbI: synthetic QA tasks (covered today)
2. Language modeling
Sentences and Question: BoW Representation
Input: context sentences and a question
x_1 = Mary journeyed to the den.
x_2 = Mary went back to the kitchen.
x_3 = John journeyed to the bedroom.
x_4 = Mary discarded the milk.
Q: Where was the milk before the den?
Each word $x_{ij} \in \mathbb{R}^V$ is a one-hot vector (V = 177).
(slide from TaeHoon Kim)
A sentence $x_i$ is a combination of words:
"Mary journeyed to the den"
$x_i = \{x_{i1}, x_{i2}, \dots, x_{in}\}$, with $x_{i1}$ = mary, $x_{i2}$ = journeyed, $x_{i3}$ = to, $x_{i4}$ = the, $x_{i5}$ = den.
A single word is represented as a Bag-of-Words (BoW) one-hot vector, e.g. $x_{i1}$ = mary = (1, 0, 0, 0, ..., 0).
A sentence is then a set of such BoW vectors:
mary      = (1, 0, 0, 0, ..., 0)
journeyed = (0, 1, 0, 0, ..., 0)
to        = (0, 0, 1, 0, ..., 0)
the       = (0, 0, 0, 1, ..., 0)
den       = (0, 0, 0, 0, ..., 1)
Input: the context sentences and the question, all represented as BoW one-hot vectors.
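As a concrete illustration, here is a minimal NumPy sketch of this representation, using a toy vocabulary rather than the actual 177-word bAbI vocabulary:

```python
import numpy as np

# Toy vocabulary for illustration; the actual bAbI vocabulary in the paper has V = 177 words.
vocab = ["mary", "journeyed", "to", "the", "den", "kitchen", "milk"]
word2idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

def one_hot(word):
    """Return the one-hot vector x_ij in R^V for a single word."""
    x = np.zeros(V)
    x[word2idx[word]] = 1.0
    return x

# A sentence x_i is represented as the list of one-hot vectors of its words.
sentence = "mary journeyed to the den"
x_i = [one_hot(w) for w in sentence.split()]
print(np.stack(x_i))   # a 5 x V matrix, one row per word
```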
Full Scheme
$a = \mathrm{Softmax}(W(o + u))$
$m_i = \sum_j A x_{ij}$
$u = \sum_j B q_j$
$p_i = \mathrm{Softmax}(u^T m_i)$
$o = \sum_i p_i c_i$
$c_i = \sum_j C x_{ij}$
Memory: each memory vector is built from one sentence
$m_i = \sum_j A x_{ij}$, where A is the embedding matrix.
$m_1 = A x_{11} + A x_{12} + A x_{13} + A x_{14} + A x_{15}$
(mary + journeyed + to + the + den)
Dimension check: A is (d x V), x is V-dimensional with V = 177, m is d-dimensional with d = 20 or 50.
(slide from TaeHoon Kim)
Thus, the number of memory vectors equals the number of sentences.
Embedding to Memory
$m_i = \sum_j A x_{ij}$, with A the embedding matrix: the one-hot word vectors of a sentence are embedded by A and summed into a single memory vector such as $m_1$.
Embedding to Memory
x_1 = Mary journeyed to the den.
x_2 = Mary went back to the kitchen.
x_3 = John journeyed to the bedroom.
x_4 = Mary discarded the milk.
Each sentence is embedded into a d-dimensional memory vector $m_1, m_2, m_3, m_4$ via $m_i = \sum_j A x_{ij}$, with A the embedding matrix.
In the bAbI tasks, a simpler form of the memory component is used.
Memory: only what is needed is used as input
Input memory
In practice, the real memory is the embedding matrix A: A embeds the BoW representations into memory vectors via $m_i = \sum_j A x_{ij}$, and A is learned during training.
Number of sentences: n < 320, so there are n memory vectors, but the memory capacity is restricted to the 50 most recent sentences.
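A minimal NumPy sketch of this embedding step (the vocabulary, story, and random initialization of A are illustrative; d = 20 and the 50-sentence capacity follow the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["mary", "journeyed", "to", "the", "den", "kitchen", "milk"]   # toy vocabulary
word2idx = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 20            # d = 20 (or 50) as stated on the slides

A = rng.normal(scale=0.1, size=(d, V))   # embedding matrix A (d x V), learned during training

def bow(sentence):
    """Sentence -> list of one-hot word vectors x_ij in R^V."""
    vecs = []
    for w in sentence.lower().split():
        x = np.zeros(V)
        x[word2idx[w.strip(".")]] = 1.0
        vecs.append(x)
    return vecs

def embed_sentence(x_i, E):
    """m_i = sum_j E @ x_ij: sum of the embedded one-hot word vectors."""
    return sum(E @ x for x in x_i)

story = ["Mary journeyed to the den.", "Mary journeyed to the kitchen."]
# One memory vector per sentence, keeping only the 50 most recent sentences.
memory = np.stack([embed_sentence(bow(s), A) for s in story[-50:]])    # shape: (num_sentences, d)
```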
Question
Q: Where was the milk before the den?
$u = \sum_j B q_j$
Attention model on external memory
$u = \sum_j B q_j$
Which memory vector to attend to, based on the question: $p_i = \mathrm{Softmax}(u^T m_i)$
x_1 = Mary journeyed to the den.
x_2 = Mary went back to the kitchen.
x_3 = John journeyed to the bedroom.
x_4 = Mary discarded the milk.
Q: Where was the milk before the den?
Attention model on external memory
How much should the model attend to each memory vector, given the question q?
$p_i = \mathrm{Softmax}(u^T m_i)$
If u and $m_i$ are related, $m_i$ is attended to more strongly, i.e. $p_i$ is higher.
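A tiny NumPy sketch of this attention step; the numbers are made up, and in the real model u and the memory vectors come from the learned embeddings B and A:

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Illustrative values only: 4 memory vectors (one per sentence) of dimension d = 3.
memory = np.array([[0.9, 0.1, 0.0],    # m_1
                   [0.2, 0.8, 0.1],    # m_2
                   [0.0, 0.1, 0.9],    # m_3
                   [0.8, 0.2, 0.1]])   # m_4
u = np.array([0.9, 0.1, 0.1])          # embedded question

p = softmax(memory @ u)                # p_i = Softmax(u^T m_i)
print(p)   # memories whose embedding is close to u receive higher attention weight
```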
Output
$c_i = \sum_j C x_{ij}$
Purpose of each embedding matrix:
B: embeds the question
A: embeds the input memory vectors used for attention
C: embeds the output vectors weighted by the attention
Output: summarized information o + question information u
$o = \sum_i p_i c_i$
o: response vector read from memory, weighted by the attention probabilities computed from the input
(slide from TaeHoon Kim)
Output: summarized information o + question information u
o + u: both are taken into account to derive the answer.
o: response vector read from memory, weighted by the attention probabilities
u: internal state, i.e. the embedding of the input question
(slide from TaeHoon Kim)
Output: the actual answer word a
$a = \mathrm{Softmax}(W(o + u))$
W is a (V x d) matrix mapping the d-dimensional vector o + u to scores over the V-word vocabulary.
(slide from TaeHoon Kim)
Model scheme
• Input sentences $x_1, x_2, \dots, x_n$ are taken.
• The sentences are embedded into memory vectors $m_i$ and $c_i$ using embedding matrices A and C.
• The question is embedded into the internal state u.
• Matching between u and each $m_i$ is performed with a Softmax: $p_i = \mathrm{Softmax}(u^T m_i)$.
• The output is calculated as $o = \sum_i p_i c_i$.
• Another Softmax produces the final prediction after summing o with u: $a = \mathrm{Softmax}(W(o + u))$.
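Putting the pieces together, a single-hop forward pass could look like the following sketch (NumPy, randomly initialized parameters; in the actual model A, B, C, W are learned end-to-end with back-propagation):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def memn2n_single_hop(story_bow, question_bow, A, B, C, W):
    """One hop of MemN2N.
    story_bow: list of sentences, each a list of one-hot word vectors.
    question_bow: list of one-hot word vectors. A, B, C are (d x V); W is (V x d)."""
    m = np.stack([sum(A @ x for x in s) for s in story_bow])   # input memory vectors, (n, d)
    c = np.stack([sum(C @ x for x in s) for s in story_bow])   # output memory vectors, (n, d)
    u = sum(B @ x for x in question_bow)                       # question embedding, (d,)
    p = softmax(m @ u)                                         # attention: p_i = Softmax(u^T m_i)
    o = p @ c                                                  # o = sum_i p_i c_i
    return softmax(W @ (o + u))                                # a = Softmax(W(o + u))

# Randomly initialized parameters for a toy vocabulary (learned by back-propagation in practice).
V, d = 7, 20
rng = np.random.default_rng(0)
A, B, C = (rng.normal(scale=0.1, size=(d, V)) for _ in range(3))
W = rng.normal(scale=0.1, size=(V, d))
```

Calling memn2n_single_hop with the BoW story and question from the earlier sketches returns a length-V probability vector over answer words.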
Recurrent attention model with external memory
(slide from TaeHoon Kim)
1 hop: $u^{k+1} = o^k + u^k$
Recurrent attention model with external memory
2 hops: $u^{k+1} = o^k + u^k$
Recurrent attention model with external memory
3 hops: $u^{k+1} = o^k + u^k$
Recurrent attention model with external memory
$u^{k+1} = o^k + u^k$: the model is recurrent over the memory component. More hops give better inference over multiple supporting facts.
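A sketch of the multi-hop extension of the single-hop function above (reusing its softmax helper and NumPy import, and assuming the RNN-like weight tying where the same A and C are used at every hop):

```python
def memn2n_multi_hop(story_bow, question_bow, A, B, C, W, hops=3):
    """Multi-hop MemN2N with RNN-like weight tying: the same A and C are reused
    at every hop, and the internal state is updated as u^{k+1} = o^k + u^k."""
    m = np.stack([sum(A @ x for x in s) for s in story_bow])   # input memory
    c = np.stack([sum(C @ x for x in s) for s in story_bow])   # output memory
    u = sum(B @ x for x in question_bow)                       # initial internal state u^1
    for _ in range(hops):
        p = softmax(m @ u)        # attention over the memory at this hop
        o = p @ c                 # read-out from the output memory
        u = o + u                 # u^{k+1} = o^k + u^k
    return softmax(W @ u)         # a = Softmax(W(o^K + u^K)), since u now equals o^K + u^K
```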
Recurrent attention model with external memory
Weight sharing: the matrices are constrained to ease training and reduce the number of parameters.
1) Adjacent: $C^k = A^{k+1}$, $W^T = C^K$, $B = A^1$
2) RNN-like: $A^1 = A^2 = \dots = A^K$ and $C^1 = C^2 = \dots = C^K$
$u^{k+1} = o^k + u^k$ (acts like the hidden state in an RNN)
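As an illustration, the adjacent tying scheme could be set up like this (reusing rng, d, and V from the sketches above; the matrices are random stand-ins for learned parameters):

```python
# Adjacent weight sharing for K = 3 hops: C^k = A^{k+1}, B = A^1, W^T = C^K.
K = 3
A_layers = [rng.normal(scale=0.1, size=(d, V)) for _ in range(K + 1)]  # A^1 ... A^{K+1}
C_layers = [A_layers[k + 1] for k in range(K)]                         # C^k = A^{k+1}
B = A_layers[0]                                                        # B = A^1
W = C_layers[-1].T                                                     # W^T = C^K, so W = (C^K)^T
```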
Memory Network & LSTM
$u^{k+1} = o^k + u^k$ (acts like the cell state in an LSTM)
Recurrent attention model with external memory
Factoid QA with two supporting facts:
John is in the school. (SUPPORTING FACT)
Jason is in the office. (NOT USED)
John picked up the football. (SUPPORTING FACT)
Jason went to the kitchen. (NOT USED)
Q: Where is the football?
A: School
- MemNN: fully supervised with supporting facts
- MemN2N: weakly supervised with only the answer; the supporting facts are not used
Result
• The best MemN2N models come close to the supervised MemNN.
• All beat the weakly supervised baseline models.
• Joint training helps.
• More hops improve performance.
Result
More hops are better: each hop attends to a different memory unit, and the model succeeds in focusing on the correct supporting sentences.
20 bAbI Tasks
Model             Test Acc   Failed tasks
MemNN             93.3%      4
LSTM              49%        20
MemN2N, 1 hop     74.82%     17
MemN2N, 2 hops    84.4%      11
MemN2N, 3 hops    87.6%      11
Conclusion
• A neural network with explicit memory and a recurrent attention mechanism for reading that memory
• Trained via back-propagation, jointly on all tasks
• Trainable with weak supervision
• No supervision of supporting facts is needed
• Can be used in a wider range of settings
• Performs better than other models with the same level of supervision
• In language modeling, performs better than LSTMs and RNNs
• Increasing the number of hops gives better results
• Still fails on some bAbI tasks
• Not yet applied to large memories
Thank you