[IMPL] Neural Machine Translation
장재호
19. 09. 06 (Fri)
Contents
• Sequence to sequence with attention
• Data
• Tokenizer
• NMTDataset
• Model
• EncLayer
• DecCell
• DecLayer
• Utils
Sequence to sequence with attention
References
“Effective Approaches to Attention-based Neural Machine Translation”. Luong et al. 2015.
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq
Sequence to sequence with attention
• Model: Sequence to sequence with attention (Luong et al. 2015)
• Tokenizer (SentencePiece):
• BPE (Byte Pair Encoding)
• WPM (WordPieceModel)
• Attention
• Dot Product Attention
• Multiplicative Attention
• Additive Attention
Sequence to sequence with attention
• Encoder-Decoder
• Encoder: Stacked Bidirectional
LSTM
• Decoder: Stacked LSTM
Sequence to sequence with attention
• Attention
• Dot Product Attention
• Multiplicative Attention
• Additive Attention
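The three score functions from Luong et al. (2015) can be sketched as small modules; the dimension names and the `attn_dim` bottleneck of the additive variant are illustrative, not the project's exact code.

```python
import torch
import torch.nn as nn

class DotProdAttention(nn.Module):
    """score(s, h_i) = s · h_i (needs equal decoder/encoder dims)."""
    def forward(self, s, h):                          # s: (B, D), h: (B, T, D)
        return torch.bmm(h, s.unsqueeze(2)).squeeze(2)  # (B, T)

class MulAttention(nn.Module):
    """score(s, h_i) = s^T W h_i ('general'/multiplicative attention)."""
    def __init__(self, dec_dim, enc_dim):
        super().__init__()
        self.W = nn.Linear(enc_dim, dec_dim, bias=False)
    def forward(self, s, h):
        return torch.bmm(self.W(h), s.unsqueeze(2)).squeeze(2)

class AddAttention(nn.Module):
    """score(s, h_i) = v^T tanh(W [s; h_i]) (additive attention)."""
    def __init__(self, dec_dim, enc_dim, attn_dim=64):
        super().__init__()
        self.W = nn.Linear(dec_dim + enc_dim, attn_dim)
        self.v = nn.Linear(attn_dim, 1, bias=False)
    def forward(self, s, h):
        s_exp = s.unsqueeze(1).expand(-1, h.size(1), -1)   # (B, T, D_dec)
        return self.v(torch.tanh(self.W(torch.cat([s_exp, h], -1)))).squeeze(2)

def attend(score, h):
    """Softmax over scores, then a weighted sum of encoder states."""
    w = torch.softmax(score, dim=1)                   # (B, T)
    return torch.bmm(w.unsqueeze(1), h).squeeze(1)    # (B, D_enc)
```

`attend` turns the scores into a context vector; in Luong et al. this context is combined with the decoder state before prediction.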
Data
Train: WMT’14 English-German Data
Test: Newstest2013 English-German Data
Data source: https://nlp.stanford.edu/projects/nmt/
SentencePiece: https://github.com/google/sentencepiece
Data
• SentencePiece
• Subword Segmentation Library
• PyTorch data utilities
• torch.utils.data.Dataset
• Abstract class
• Iterable object
• __len__()
• __getitem__()
• torch.utils.data.DataLoader
• Generator object
• Collate (callback function)
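A minimal sketch of how these utilities combine: `Dataset` supplies `__len__`/`__getitem__`, and a collate callback pads each batch. The class name matches the NMTDataset listed in the contents, but the pad id and field layout here are assumptions.

```python
import torch
from torch.utils.data import Dataset, DataLoader

PAD = 0  # assumed pad id

class NMTDataset(Dataset):
    """Pairs of pre-tokenized (source, target) id sequences."""
    def __init__(self, pairs):
        self.pairs = pairs
    def __len__(self):
        return len(self.pairs)
    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        return torch.tensor(src), torch.tensor(tgt)

def collate(batch):
    """Callback passed to DataLoader: pad each side to the batch max length."""
    srcs, tgts = zip(*batch)
    pad = torch.nn.utils.rnn.pad_sequence
    return (pad(srcs, batch_first=True, padding_value=PAD),
            pad(tgts, batch_first=True, padding_value=PAD))

pairs = [([5, 6, 7], [8, 9]), ([5, 6], [8, 9, 10, 11])]
loader = DataLoader(NMTDataset(pairs), batch_size=2, collate_fn=collate)
src_batch, tgt_batch = next(iter(loader))
print(src_batch.shape, tgt_batch.shape)  # padded to (2, 3) and (2, 4)
```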
Data
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20data.ipynb
Model
References
“Effective Approaches to Attention-based Neural Machine Translation”. Luong et al. 2015.
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq
Model
• Layers
• EncLayer
• DecCell
• DecLayer
• Attention
• BaseAttention
• DotProdAttention
• MulAttention
• AddAttention
• Model
• Seq2SeqWithAttn
Model
• EncLayer
[Figure: EncLayer. Source tokens w_1^(s), …, w_{T_s}^(s) pass through an Embedding Layer and an LSTM Layer, producing hidden states h_1, …, h_{T_s}; the full state sequence h is the encoder output.]
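A sketch of the EncLayer: an embedding followed by a stacked bidirectional LSTM. The dimensions follow the configuration slide, but the exact signature is an assumption.

```python
import torch
import torch.nn as nn

class EncLayer(nn.Module):
    """Embedding + stacked bidirectional LSTM over the source sentence."""
    def __init__(self, vocab_size, embedding_dim, hidden_dim,
                 n_layers=2, bidirectional=True):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                            batch_first=True, bidirectional=bidirectional)
    def forward(self, src):                 # src: (B, T_s) token ids
        emb = self.embedding(src)           # (B, T_s, E)
        h, (h_n, c_n) = self.lstm(emb)      # h: (B, T_s, H * n_directions)
        return h, (h_n, c_n)

enc = EncLayer(vocab_size=32000, embedding_dim=256, hidden_dim=256)
src = torch.randint(0, 32000, (4, 10))
h, _ = enc(src)
print(h.shape)  # (4, 10, 512): hidden_dim * 2 directions
```

The doubled last dimension is why the configuration sets dec_hidden_dim to enc_hidden_dim * n_directions.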
Model
• DecCell
[Figure: DecCell. The target token w_t is embedded and passed to an LSTM Cell with the previous state (s_{t-1}, c_{t-1}), giving (s_t, c_t); Attention over the encoder outputs h and a Linear Classifier then produce logit_t.]
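One way to sketch the DecCell: a single LSTMCell step plus attention over the encoder outputs h. Dot-product scoring and the concatenation order fed to the classifier are assumptions for illustration.

```python
import torch
import torch.nn as nn

class DecCell(nn.Module):
    """One decoding step: embed w_t, update (s, c), attend over h, classify."""
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.cell = nn.LSTMCell(embedding_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim * 2, vocab_size)
    def forward(self, w_t, state, h):       # w_t: (B,), h: (B, T_s, H)
        s, c = self.cell(self.embedding(w_t), state)
        # Dot-product attention: weights over encoder states, then context.
        weights = torch.softmax(torch.bmm(h, s.unsqueeze(2)).squeeze(2), dim=1)
        context = torch.bmm(weights.unsqueeze(1), h).squeeze(1)
        logit = self.classifier(torch.cat([s, context], dim=-1))
        return logit, (s, c)

cell = DecCell(vocab_size=100, embedding_dim=16, hidden_dim=32)
w_t = torch.randint(0, 100, (2,))
state = (torch.zeros(2, 32), torch.zeros(2, 32))
h = torch.randn(2, 5, 32)
logit, state = cell(w_t, state, h)
print(logit.shape)  # (2, 100): one score per target vocabulary entry
```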
Model
• DecLayer
• train
[Figure: DecLayer (train). Teacher forcing: each DecCell_t receives the gold token (starting from bos) and emits logit_t; Cross Entropy compares logit_1, …, logit_{T_t} with the gold target sequence shifted by one step, ending in eos.]
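The teacher-forcing loop in this slide might look like the following. The simplified `ToyDecCell` (no attention, shared embedding/hidden size) stands in for the real decoder cell; the `train_step` signature is an assumption.

```python
import torch
import torch.nn as nn

class ToyDecCell(nn.Module):
    """Simplified stand-in for DecCell: one step -> (logit, new state)."""
    def __init__(self, vocab_size, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        self.cell = nn.LSTMCell(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, vocab_size)
    def forward(self, w_t, state):
        s, c = self.cell(self.embedding(w_t), state)
        return self.classifier(s), (s, c)

def train_step(cell, tgt, state, bos_id, criterion):
    """Teacher forcing: feed gold tokens, accumulate cross entropy.

    tgt: (B, T_t) gold target ids, ending in eos."""
    B, T = tgt.shape
    w_t = torch.full((B,), bos_id, dtype=torch.long)
    loss = 0.0
    for t in range(T):
        logit, state = cell(w_t, state)
        loss = loss + criterion(logit, tgt[:, t])
        w_t = tgt[:, t]            # next input is the gold token, not the prediction
    return loss / T
```

Calling `loss.backward()` on the result then trains the whole unrolled sequence in one step.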
Model
• DecLayer
• infer
[Figure: DecLayer (infer). Greedy decoding from the encoder's final state (h_{T_s}, c_{T_s}): DecCell_1 starts from bos, and each predicted token ŵ_t is fed back as the input of the next DecCell until eos is produced.]
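At inference time the gold tokens are unavailable, so each prediction is fed back as the next input. A greedy-decoding sketch, again with a simplified stand-in cell (the real DecCell also attends over the encoder outputs):

```python
import torch
import torch.nn as nn

class ToyDecCell(nn.Module):
    """Simplified stand-in for DecCell: one step -> (logit, new state)."""
    def __init__(self, vocab_size, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        self.cell = nn.LSTMCell(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, vocab_size)
    def forward(self, w_t, state):
        s, c = self.cell(self.embedding(w_t), state)
        return self.classifier(s), (s, c)

def greedy_decode(cell, state, bos_id, eos_id, max_len=20):
    """Feed each argmax prediction back in until eos (or max_len)."""
    w_t = torch.tensor([bos_id])
    out = []
    for _ in range(max_len):
        logit, state = cell(w_t, state)
        w_t = logit.argmax(dim=-1)        # prediction becomes the next input
        if w_t.item() == eos_id:
            break
        out.append(w_t.item())
    return out
```

Beam search would keep the k best partial hypotheses per step instead of a single argmax.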
Model
• Configuration
• src_vocab_size=32000
• tar_vocab_size=32000
• attention=AddAttention
• embedding_dim=256
• hidden_dim=256
• enc_hidden_dim = 256
• dec_hidden_dim = 256*2 (enc_hidden_dim * n_directions)
• n_layers=2
• bidirectional=True
Model
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20model.ipynb
Utils
References
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq
Utils
• Train & evaluate
• train_batches()
• evaluate_batches()
• generate_sentence()
• EarlyStopping
• Metrics (pytorch-nlp)
• bleu_score
• Accuracy
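A minimal EarlyStopping helper of the kind listed above might look like this; the patience default and attribute names are assumptions, not this project's code.

```python
class EarlyStopping:
    """Stop training when validation loss fails to improve for `patience` epochs."""
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.counter = 0
        self.should_stop = False

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss     # improvement: reset the counter
            self.counter = 0
        else:
            self.counter += 1        # no improvement this epoch
            if self.counter >= self.patience:
                self.should_stop = True
        return self.should_stop

stopper = EarlyStopping(patience=2)
for loss in [1.0, 0.8, 0.9, 0.95]:
    if stopper.step(loss):
        break
print(stopper.should_stop)  # True: no improvement for 2 epochs after 0.8
```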
Utils
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20utils.ipynb
Thank you
