Multiple Vector Encoding Techniques for Deep Learning
This article covers 1) RNNs, 2) the attention mechanism, and 3) CNNs for multiple-vector encoding.
Image Classification
From Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NIPS, 2012.
V to V′ – Linear Algebra

Weighted Sum. Given the input row vector X = [x1 x2 … x9] (a [1x9] matrix) and the weight matrix W = (w_{j,i}) (a [9x2] matrix), the product X·W is the [1x2] matrix

    [ Σ_{i=1}^{9} x_i w_{1,i} ,  Σ_{i=1}^{9} x_i w_{2,i} ]

i.e. each output element is a weighted sum of all nine inputs. This is the basic operation of a Fully Connected Network.
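The weighted sum above can be sketched in a few lines of NumPy. The shapes follow the slide's [1x9] · [9x2] = [1x2] example; the concrete input values and random weights are illustrative assumptions.

```python
import numpy as np

# Input row vector X: [1 x 9]
X = np.arange(1.0, 10.0).reshape(1, 9)  # [[1, 2, ..., 9]]

# Weight matrix W: [9 x 2]; column j holds the weights w_{j,1}..w_{j,9}
# (random values here, purely for illustration)
rng = np.random.default_rng(0)
W = rng.standard_normal((9, 2))

# Fully connected layer = one matrix product: [1x9] @ [9x2] -> [1x2]
Y = X @ W

# Each output element is a weighted sum over all 9 inputs
y1 = sum(X[0, i] * W[i, 0] for i in range(9))
y2 = sum(X[0, i] * W[i, 1] for i in range(9))

assert Y.shape == (1, 2)
assert np.allclose(Y, [[y1, y2]])
```

The explicit Python sums at the end verify, element by element, that the matrix product is exactly the pair of weighted sums from the slide.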
V to V′ – Projection Notation

[Figure: data of dimension V is mapped through a weight set W to dimension V′]

Let W denote the weight set that can transform data of dimension V into dimension V′:

    W = V → V′
RNN Input/Output

1) Sequence to sequence: Data 1, Data 2, Data 3 → Out 1, Out 2, Out 3
   ✓ Vs → V′s
   ✓ Len(Vs) = Len(V′s)
   An output can be produced for each element of the input.

2) Summarization: Data 1, Data 2, Data 3 → Out
   ✓ Vs → 1
   The input can be summarized while reflecting its temporal information.
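Both input/output patterns can be sketched with a minimal Elman-style RNN in NumPy (random weights, toy dimensions; real models would be trained, but the shapes are what matter here).

```python
import numpy as np

def rnn_states(xs, W_xh, W_hh):
    """Run a minimal Elman RNN over a sequence; return all hidden states."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(x @ W_xh + h @ W_hh)  # state depends on input AND history
        states.append(h)
    return states

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
xs = [rng.standard_normal(d_in) for _ in range(3)]   # Data 1..3
W_xh = rng.standard_normal((d_in, d_h)) * 0.5
W_hh = rng.standard_normal((d_h, d_h)) * 0.5

states = rnn_states(xs, W_xh, W_hh)

# 1) Sequence to sequence: one output per input, so Len(Vs) == Len(V's)
outs = states
assert len(outs) == len(xs)

# 2) Summarization: the last state summarizes the whole sequence,
#    since it has seen every input in temporal order
summary = states[-1]
assert summary.shape == (d_h,)
```

The recurrence `h @ W_hh` is what carries temporal information forward, which is why the final state can serve as a summary of the whole sequence.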
[Reference] Translation Pyramid
Bernard Vauquois' pyramid showing comparative depths of intermediary representation: interlingual machine translation at the peak, followed by transfer-based, then direct translation.
[http://en.wikipedia.org/wiki/Machine_translation]
RNN Encoder-Decoder for Machine Translation
Cho et al. (2014)
http://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-2/
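The encoder-decoder idea can be sketched as follows. Note this is a simplified illustration, not the exact model of Cho et al. (2014): they use GRU units and feed the summary vector into every decoder step, while this sketch uses plain tanh RNNs and the summary only as the decoder's initial state. All weights and dimensions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # shared hidden size for encoder and decoder (assumption, for brevity)

def step(x, h, W_x, W_h):
    """One recurrent step of a plain tanh RNN."""
    return np.tanh(x @ W_x + h @ W_h)

# --- Encoder: compress the source sequence into one fixed vector c ---
src = [rng.standard_normal(d) for _ in range(4)]
We_x = rng.standard_normal((d, d)) * 0.5
We_h = rng.standard_normal((d, d)) * 0.5
h = np.zeros(d)
for x in src:
    h = step(x, h, We_x, We_h)
c = h  # the summary vector: the only link between source and target

# --- Decoder: unroll conditioned on c (here used as the initial state) ---
Wd_x = rng.standard_normal((d, d)) * 0.5
Wd_h = rng.standard_normal((d, d)) * 0.5
W_out = rng.standard_normal((d, d)) * 0.5
s = c
y = np.zeros(d)  # start-symbol embedding (zeros, for illustration)
outputs = []
for _ in range(3):
    s = step(y, s, Wd_x, Wd_h)
    y = s @ W_out      # next output representation (softmax omitted)
    outputs.append(y)

assert c.shape == (d,) and len(outputs) == 3
```

The bottleneck is visible in the code: everything the decoder knows about the source must pass through the single fixed-size vector `c`, which is exactly the limitation that attention (next section) removes.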
Example: Attention-based Neural Translation Model – Bahdanau et al. (2014)

[Figure: one-hot inputs x1…x4 are embedded (EMB) and fed to a bidirectional RNN (BiRNN); the forward states F1…F4 and backward states B1…B4 are concatenated into annotations N1…N4. At each decoder step, attention (Att) computes alignment weights A over the annotations and forms a context vector C; the decoder states S1…S3 emit outputs D1…D3 through a softmax. D0 is a special start symbol.]

A : alignment weight
EMB, EMB′ : Embedding (input side and output side)
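The alignment weights A and context vector C from the figure can be sketched in NumPy. This follows the additive (Bahdanau-style) scoring form; the dimensions and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
N = [rng.standard_normal(d) for _ in range(4)]  # encoder annotations N1..N4
s = rng.standard_normal(d)                      # previous decoder state

# Additive scoring: e_i = v . tanh(s Wa + n_i Ua)
Wa = rng.standard_normal((d, d))
Ua = rng.standard_normal((d, d))
v = rng.standard_normal(d)
e = np.array([v @ np.tanh(s @ Wa + n @ Ua) for n in N])

# Alignment weights A: softmax over the source positions
A = np.exp(e - e.max())
A = A / A.sum()

# Context vector C: alignment-weighted sum of the annotations
C = sum(a * n for a, n in zip(A, N))

assert np.isclose(A.sum(), 1.0)
assert C.shape == (d,)
```

Because A is recomputed from the current decoder state at every output step, each target word can attend to a different part of the source sentence instead of relying on one fixed summary vector.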
Xu et al. (2015), Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
http://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-3/
Attention Modeling for Image2Text
Xu et al. (2015)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
In the encoder/decoder framework, the model still works when the text-sequence encoding is simply replaced by an image-sequence encoding.
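The swap described above can be sketched directly: treat each spatial location of a CNN feature map as an annotation vector, exactly as the token annotations were treated in the translation model. The 4x4 grid, dot-product scoring, and random values below are simplifying assumptions (Xu et al. use larger conv feature maps, e.g. 14x14, and a learned scoring function).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Stand-in for a CNN feature map: 4x4 spatial grid with d channels
feat = rng.standard_normal((4, 4, d))
annotations = feat.reshape(-1, d)   # 16 location vectors, like token annotations

s = rng.standard_normal(d)          # decoder state
scores = annotations @ s            # dot-product scoring (a simplification)
A = np.exp(scores - scores.max())   # softmax over image locations
A = A / A.sum()
C = A @ annotations                 # context vector attends over the image

assert annotations.shape == (16, d)
assert np.isclose(A.sum(), 1.0) and C.shape == (d,)
```

Nothing on the decoder side changes: only the source of the annotation vectors differs, which is why swapping text encoding for image encoding "just works".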