End-to-End Memory Networks
Sukhbaatar, S., Weston, J., & Fergus, R. (NIPS 2015)
Junho Cho, Perception and Intelligence Laboratory, Seoul National University
2016.08.05
8. Full Scheme
2. Embed the question and take its inner product with each memory vector. Memories related to the question receive more attention.
9. Full Scheme
3. Knowing which memories to attend to, take a weighted sum over the output memory vectors. Add the embedded question to this output and predict the answer as a one-hot vector.
17. End-to-end Memory Network (MemN2N)
• Presented at NIPS 2015
• New end-to-end model:
• Reads from memory with soft attention
• Performs multiple lookups (hops) on memory
• Trained end-to-end with back-propagation
• Needs supervision only on the final output
• Based on MemNN, which had:
• Hard attention
• Requires explicit supervision of attention during training
• Only feasible for simple tasks
19. Input: context sentences and a question
𝑥1 = Mary journeyed to the den.
𝑥2 = Mary went back to the kitchen.
𝑥3 = John journeyed to the bedroom.
𝑥4 = Mary discarded the milk.
Q: Where was the milk before the den?
Word: 𝑥𝑖𝑗 ∈ ℝ^V, a one-hot vector (V = 177)
Sentences and question: BoW representation
slide from TaeHoon Kim
20. One sentence 𝑥𝑖: a combination of words
A sentence: 𝑥𝑖 = "Mary journeyed to the den"
𝑥𝑖 = (𝑥𝑖1, 𝑥𝑖2, …, 𝑥𝑖𝑛)
𝑥𝑖1 = mary, 𝑥𝑖2 = journeyed, 𝑥𝑖3 = to, 𝑥𝑖4 = the, 𝑥𝑖5 = den
21. One word: represented as a one-hot Bag-of-Words (BoW) vector
𝑥𝑖1 = mary → (1, 0, 0, …, 0)ᵀ ∈ ℝ^V
𝑥𝑖 = (𝑥𝑖1, 𝑥𝑖2, …, 𝑥𝑖𝑛) = "Mary journeyed to the den"
22. One sentence: a set of BoW vectors
"Mary journeyed to the den" → 𝑥𝑖 = (𝑥𝑖1, …, 𝑥𝑖5):
mary → (1, 0, 0, 0, …, 0)ᵀ
journeyed → (0, 1, 0, 0, …, 0)ᵀ
to → (0, 0, 1, 0, …, 0)ᵀ
the → (0, 0, 0, 1, …, 0)ᵀ
den → (0, 0, 0, 0, …, 1)ᵀ
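The one-hot BoW encoding above can be sketched as follows. This is a minimal illustration with a tiny five-word vocabulary; the actual bAbI setup in the slides uses V = 177.

```python
import numpy as np

# Hypothetical toy vocabulary for illustration (the slides use V = 177).
vocab = ["mary", "journeyed", "to", "the", "den"]
V = len(vocab)
word_to_idx = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Encode a single word as a one-hot vector x_ij in R^V."""
    x = np.zeros(V)
    x[word_to_idx[word]] = 1.0
    return x

# A sentence x_i is the set of one-hot vectors of its words.
sentence = "mary journeyed to the den".split()
x_i = np.stack([one_hot(w) for w in sentence])  # shape (n_words, V)
```

Each row of `x_i` is one word's one-hot vector, matching the columns drawn on the slide.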
23. Input: context sentences and a question in BoW representation
𝑥1 = Mary journeyed to the den.
𝑥2 = Mary went back to the kitchen.
𝑥3 = John journeyed to the bedroom.
𝑥4 = Mary discarded the milk.
Q: Where was the milk before the den?
Each word becomes a one-hot column, and each sentence becomes a set of such columns.
27. Embedding to Memory
𝑥1 = Mary journeyed to the den.
𝑥2 = Mary went back to the kitchen.
𝑥3 = John journeyed to the bedroom.
𝑥4 = Mary discarded the milk.
→ 𝑚1, 𝑚2, 𝑚3, 𝑚4 ∈ ℝ^d
In the bAbI task, a simpler form of the memory component is used:
𝑚𝑖 = Σ𝑗 𝑨𝑥𝑖𝑗, where 𝑨 is the embedding matrix
28. Memory: use only what is needed as input
Input memory
The real substance of the memory is the embedding matrix 𝑨:
𝑨 embeds each BoW representation into a d-dimensional memory vector,
𝑚𝑖 = Σ𝑗 𝑨𝑥𝑖𝑗, and 𝑨 is learned during training.
Number of sentences: n < 320; number of memory vectors: n,
but the capacity is restricted to the most recent 50 sentences.
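The embedding step 𝑚𝑖 = Σ𝑗 𝑨𝑥𝑖𝑗 can be sketched as below. The matrix A is random here purely for illustration; in the model it is learned by back-propagation, and the sentence indices are hypothetical.

```python
import numpy as np

V, d = 177, 20
rng = np.random.default_rng(0)
A = rng.normal(size=(d, V))  # embedding matrix (learned during training)

def embed_sentence(x_i, A):
    """m_i = sum_j A x_ij: sum the embeddings of the sentence's words (BoW)."""
    return (A @ x_i.T).sum(axis=1)  # shape (d,)

# One 5-word sentence as rows of one-hot vectors (word indices are made up).
x = np.zeros((5, V))
x[np.arange(5), [3, 17, 42, 7, 99]] = 1.0
m = embed_sentence(x, A)  # memory vector m_i in R^d
```

Because the words are summed, the order inside a sentence is lost: summing the one-hot rows first and embedding once gives the same vector.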
30. What memory vector to attend to, based on the question
𝑢 = Σ𝑗 𝑩𝑞𝑗
𝑝𝑖 = Softmax(𝑢ᵀ𝑚𝑖)
𝑥1 = Mary journeyed to the den.
𝑥2 = Mary went back to the kitchen.
𝑥3 = John journeyed to the bedroom.
𝑥4 = Mary discarded the milk.
Q: Where was the milk before the den?
Attention model on external memory
31. For a question q: which memory vectors to attend to, and how much
𝑝𝑖 = Softmax(𝑢ᵀ𝑚𝑖)
If 𝑢 and 𝑚𝑖 are related, 𝑚𝑖 is attended to in memory and 𝑝𝑖 is higher.
Attention model on external memory
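The soft-attention step 𝑝𝑖 = Softmax(𝑢ᵀ𝑚𝑖) is a single matrix-vector product followed by a softmax. A minimal sketch, with random stand-in values for the memories and question embedding:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
d, n_mem = 20, 4
memory = rng.normal(size=(n_mem, d))  # m_1..m_4 (illustrative values)
u = rng.normal(size=d)                # embedded question, u = sum_j B q_j

p = softmax(memory @ u)               # p_i: soft attention over memories
```

Because the softmax is differentiable, gradients flow through `p` to the embedding matrices, which is what makes the end-to-end training possible (unlike MemNN's hard attention).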
32. 𝑐𝑖 = Σ𝑗 𝑪𝑥𝑖𝑗
Purpose of each embedding matrix:
B: embeds the question
A: embeds sentences into input memory vectors, used for attention
C: embeds sentences into output vectors, combined according to attention
Output
33. 𝑜 = Σ𝑖 𝑝𝑖 𝑐𝑖
Output: summarized information 𝑜 + question information 𝑢
o: response vector from memory, weighted by the probability vector from the input side
34. Output: summarized information 𝑜 + question information 𝑢
𝑜 + 𝑢: both are considered to derive the answer
o: response vector from memory, weighted by the probability vector from the input side
u: internal state, the embedding of the input (question)
35. Output: the actual answer word 𝑎
𝑎 = Softmax(𝑊(𝑜 + 𝑢))
W: V × d dimensional
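Putting the output side together, the answer step 𝑎 = Softmax(𝑊(𝑜 + 𝑢)) can be sketched as follows, again with random stand-in values; only the shapes and operations matter here.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
V, d, n_mem = 177, 20, 4
C_out = rng.normal(size=(n_mem, d))  # output memory vectors c_i
p = softmax(rng.normal(size=n_mem))  # attention weights from the input side
u = rng.normal(size=d)               # embedded question
W = rng.normal(size=(V, d))          # final weight matrix, V x d

o = p @ C_out                        # o = sum_i p_i c_i
a = softmax(W @ (o + u))             # distribution over the V vocabulary words
answer_idx = int(a.argmax())         # index of the predicted answer word
```

The predicted answer is the vocabulary word with the highest probability in `a`, matching the one-hot answer target used in training.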
40. Recurrent attention model with external memory
𝑢^(k+1) = 𝑜^k + 𝑢^k
Recurrent on the memory component: more hops give better inference over multiple supporting facts.
41. Recurrent attention model with external memory
Weight sharing, constrained to ease training and reduce parameters:
1) Adjacent: 𝐴^(k+1) = 𝐶^k, 𝑊ᵀ = 𝐶^K, 𝐵 = 𝐴^1
2) RNN-like (layer-wise): 𝐴^1 = ⋯ = 𝐴^K, 𝐶^1 = ⋯ = 𝐶^K
𝑢^(k+1) = 𝑜^k + 𝑢^k (acts like the hidden state in an RNN)
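The multi-hop recurrence 𝑢^(k+1) = 𝑜^k + 𝑢^k is just the single-hop read repeated K times with the updated question state. A sketch with per-hop memories already embedded (values are random placeholders; in the adjacent scheme the input memory of hop k+1 would share weights with the output memory of hop k):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(3)
d, n_mem, K = 20, 4, 3
M_in = [rng.normal(size=(n_mem, d)) for _ in range(K)]   # hop-k input memories (A^k embeddings)
M_out = [rng.normal(size=(n_mem, d)) for _ in range(K)]  # hop-k output memories (C^k embeddings)
u = rng.normal(size=d)                                   # u^1 = B q

for k in range(K):
    p = softmax(M_in[k] @ u)  # attention at hop k
    o = p @ M_out[k]          # o^k = sum_i p_i c_i
    u = o + u                 # u^(k+1) = o^k + u^k, the RNN-like state update

# After K hops, u feeds the final prediction a = Softmax(W u^(K+1)).
```

Each hop can attend to a different memory, which is how the model chains multiple supporting facts.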
43. Recurrent attention model with external memory
Factoid QA with two supporting facts:
John is in the school. (SUPPORTING FACT)
Jason is in the office. (NOT USED)
John picked up the football. (SUPPORTING FACT)
Jason went to the kitchen. (NOT USED)
Q: Where is the football?
A: School
- MemNN: fully supervised with supporting facts
- MemN2N: weakly supervised with only the answer; supporting facts are not used