5. Introduction
• Memory-based architectures such as LSTMs and memory-augmented networks can store information over time, but their memories do not explicitly interact with one another.
• Relational reasoning — computing relations between entities — is thought to require such interactions, and standard recurrent memories struggle with it.
• Proposal: a Relational Memory Core (RMC) that uses multi-head dot product attention to let memories interact with each other and with incoming inputs.
6. Model
• The core of the model is a matrix M of row-wise memory slots, updated at every time step.
• The update is built in three steps, detailed on the following slides: (1) allowing memories to interact using multi-head dot product attention, (2) encoding new memories, (3) introducing recurrence and embedding into an LSTM.
7. 1. Allowing memories to interact using multi-head dot product attention
• Each memory slot attends over all the others using multi-head dot product attention, as in the Transformer.
• Queries, keys, and values are linear projections of the memory matrix M: Q = MW^q, K = MW^k, V = MW^v.
• The updated memory is M̃ = softmax(QKᵀ / √d^k) V, computed independently per attention head and concatenated across heads.
(Figure: multi-head attention over the memory matrix M, producing the updated memory M̃.)
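The attention step on this slide can be sketched in NumPy. A minimal single-module sketch, assuming an evenly split head dimension and weight names `Wq`/`Wk`/`Wv` chosen for illustration (not the paper's exact parameterization):

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(M, Wq, Wk, Wv, n_heads):
    """Memories in M (n_slots x d) attend over each other.

    Wq, Wk, Wv: (d, d) projections, split evenly across heads.
    Returns the updated memory M_tilde with the same shape as M.
    """
    n_slots, d = M.shape
    d_head = d // n_heads

    def split(X):  # -> (n_heads, n_slots, d_head)
        return X.reshape(n_slots, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(M @ Wq), split(M @ Wk), split(M @ Wv)
    # Scaled dot-product attention, independently per head.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    A = softmax(scores, axis=-1)            # (n_heads, n_slots, n_slots)
    out = A @ V                             # (n_heads, n_slots, d_head)
    # Concatenate heads back to (n_slots, d).
    return out.transpose(1, 0, 2).reshape(n_slots, d)

rng = np.random.default_rng(0)
d, n_slots, n_heads = 8, 4, 2
M = rng.standard_normal((n_slots, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
M_tilde = multi_head_attention(M, Wq, Wk, Wv, n_heads)
print(M_tilde.shape)  # (4, 8)
```

Note that the updated memory keeps the shape of M: attention mixes information across slots without growing the memory.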
8. 2. Encoding new memories
• To encode new information, the input x is concatenated row-wise with the memory before computing keys and values: K = [M; x]W^k, V = [M; x]W^v, while queries still come from M alone.
• Each memory slot can therefore attend over both the previous memories and the new input, and the memory size stays fixed.
(Figure: the memory M attends over [M; x], the memory concatenated with the new input.)
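The concatenation trick can be shown with a single-head sketch; the function and weight names are illustrative assumptions, and the residual/normalization details of the full model are omitted:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attend_with_input(M, x, Wq, Wk, Wv):
    """Single-head memory update that also encodes a new input.

    M: (n_slots, d) memory; x: (d,) new input, treated as one extra row.
    Queries come from M only; keys and values come from [M; x], so each
    memory slot attends over the old memories and the new input.
    """
    Mx = np.vstack([M, x])                  # (n_slots + 1, d)
    Q, K, V = M @ Wq, Mx @ Wk, Mx @ Wv
    A = softmax(Q @ K.T / np.sqrt(M.shape[1]))
    return A @ V                            # (n_slots, d): size unchanged

rng = np.random.default_rng(1)
d, n_slots = 8, 4
M = rng.standard_normal((n_slots, d))
x = rng.standard_normal(d)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
M_tilde = attend_with_input(M, x, Wq, Wk, Wv)
print(M_tilde.shape)  # (4, 8)
```

Because only keys and values see the extra row, the output still has one row per memory slot: the input is absorbed into the memory rather than appended to it.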
9. 3. Introducing recurrence and embedding into an LSTM
• The attention output is passed through a row-wise MLP, with residual connections and layer normalization around both blocks.
• The result is treated as the candidate update in an LSTM-style recurrence: input and forget gates, computed from the previous memory and the current input, decide how much of the candidate and how much of the old memory enter the next memory state.
• Gate parameters are shared across memory rows, so the number of parameters does not depend on the number of memory slots.
10. 3. Introducing recurrence and embedding into an LSTM
1. Concatenate the memory M with the input x.
2. Apply multi-head self-attention followed by a row-wise MLP to obtain the candidate memory M̃.
3. Compute the input and forget gates from the previous memory M and the input.
4. Apply LSTM-style gating to produce the next memory state.
5. Flatten the memory to produce the output h_t.
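The five steps above can be sketched end to end. This is a simplified single-head version with a one-layer MLP and plain residual connections standing in for the paper's residual + layer-norm blocks; every name in the parameter dictionary is an illustrative assumption:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rmc_step(M, x, p):
    """One relational-memory step, following the numbered steps above.

    M: (n_slots, d) previous memory, x: (d,) current input.
    Returns (next_memory, flattened_output).
    """
    n_slots, d = M.shape
    # 1. Concatenate the input to the memory (as an extra row).
    Mx = np.vstack([M, x])
    # 2. Self-attention (queries from M, keys/values from [M; x]),
    #    then a row-wise MLP, each with a residual connection.
    A = softmax((M @ p['Wq']) @ (Mx @ p['Wk']).T / np.sqrt(d))
    attended = M + A @ (Mx @ p['Wv'])
    M_tilde = attended + np.tanh(attended @ p['Wmlp'])
    # 3. Input and forget gates from the previous memory and the input;
    #    gate parameters are shared across memory rows (broadcast of x).
    gate_in = M @ p['Wgm'] + x @ p['Wgx']   # (n_slots, 2*d)
    i_gate = sigmoid(gate_in[:, :d])
    f_gate = sigmoid(gate_in[:, d:])
    # 4. LSTM-style gating between the old memory and the candidate.
    M_next = f_gate * M + i_gate * np.tanh(M_tilde)
    # 5. Flatten the memory to produce the step's output.
    return M_next, M_next.reshape(-1)

rng = np.random.default_rng(2)
d, n_slots = 8, 4
p = {
    'Wq': rng.standard_normal((d, d)),
    'Wk': rng.standard_normal((d, d)),
    'Wv': rng.standard_normal((d, d)),
    'Wmlp': rng.standard_normal((d, d)),
    'Wgm': rng.standard_normal((d, 2 * d)),
    'Wgx': rng.standard_normal((d, 2 * d)),
}
M = np.zeros((n_slots, d))
M_next, h = rmc_step(M, rng.standard_normal(d), p)
print(M_next.shape, h.shape)  # (4, 8) (32,)
```

Running `rmc_step` in a loop over a sequence of inputs gives the recurrence: the memory carries state between steps and the flattened output h plays the role of the LSTM's hidden state.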
11. Experiments
• Nth Farthest: a supervised task that stresses relational reasoning — given a set of vectors, answer questions of the form "which vector is the Nth farthest from vector m?".
• Program Evaluation (Learning to Execute): predict the output of short programs, produced character by character.
• Reinforcement Learning: Mini Pacman, played from a partially observed viewport.
• Language Modeling: word-level modeling on WikiText-103, Project Gutenberg, and GigaWord.
12. Results
1. Nth Farthest: the Relational Memory Core reaches far higher accuracy than the LSTM and DNC (Differentiable Neural Computer) baselines.
2. Program Evaluation: competitive with or better than the baselines on Learning to Execute.
3. Reinforcement Learning: on Mini Pacman with a viewport, the RMC scores higher than the baselines.
4. Language Modeling: lower perplexity than comparable models on WikiText-103, Project Gutenberg, and GigaWord.