[Impl] neural machine translation


These slides summarize an implementation of neural machine translation using a sequence-to-sequence model with attention (Luong et al. 2015). They cover the encoder-decoder architecture (a stacked bidirectional LSTM encoder and a stacked LSTM decoder), three attention variants (dot product, multiplicative, and additive; the listed configuration uses additive attention), training on WMT'14 English-German data with Newstest2013 as the test set, and the tokenizer, dataset class, model configuration, and training utilities.


[IMPL] Neural Machine Translation
장재호
19. 09. 06 (Fri)

Contents
• Sequence to sequence with attention
• Data
  • Tokenizer
  • NMTDataset
• Model
  • EncLayer
  • DecCell
  • DecLayer
• Utils

Sequence to sequence with attention
References
“Effective Approaches to Attention-based Neural Machine Translation”. Luong et al. 2015.
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq

Sequence to sequence with attention
• Model: Sequence to sequence with attention (Luong et al. 2015)
• Tokenizer (SentencePiece):
  • BPE (Byte Pair Encoding)
  • WPM (WordPieceModel)
• Attention
  • Dot Product Attention
  • Multiplicative Attention
  • Additive Attention

Sequence to sequence with attention
• Encoder-Decoder
  • Encoder: Stacked Bidirectional LSTM
  • Decoder: Stacked LSTM

Sequence to sequence with attention
• Attention
  • Dot Product Attention
  • Multiplicative Attention
  • Additive Attention
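
The slides name the three attention variants without stating their scoring functions. As a reference sketch, the forms commonly associated with these names are given below. The symbols are assumptions chosen for illustration: s_t is the decoder hidden state at step t, h_i is the i-th encoder output, W, W_1, W_2 and v are learned parameters, and z_t is the resulting context vector (written z_t to avoid clashing with the LSTM cell state c_t used on later slides). The notebook's implementation may differ in detail.

```latex
\begin{aligned}
\text{Dot product:} \quad & \mathrm{score}(s_t, h_i) = s_t^\top h_i \\
\text{Multiplicative:} \quad & \mathrm{score}(s_t, h_i) = s_t^\top W h_i \\
\text{Additive:} \quad & \mathrm{score}(s_t, h_i) = v^\top \tanh(W_1 s_t + W_2 h_i) \\
\text{Weights and context:} \quad & \alpha_{t,i} = \frac{\exp\big(\mathrm{score}(s_t, h_i)\big)}{\sum_j \exp\big(\mathrm{score}(s_t, h_j)\big)}, \qquad z_t = \sum_i \alpha_{t,i}\, h_i
\end{aligned}
```

The dot-product form requires s_t and h_i to have the same dimension; the other two use learned projections and do not.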

Data
Train: WMT’14 English-German Data
Test: Newstest2013 English-German Data
Data source: https://nlp.stanford.edu/projects/nmt/
SentencePiece: https://github.com/google/sentencepiece
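
To make the subword idea concrete, here is a toy sketch of a single BPE merge step in plain Python. It is not SentencePiece's implementation and does not use its API; the corpus, function names, and frequencies are all made up for illustration.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus and return the most frequent one."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get)


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


# Toy corpus (hypothetical): each word is a tuple of characters mapped to its frequency.
corpus = {tuple("low"): 5, tuple("lot"): 3, tuple("new"): 2}
first_pair = most_frequent_pair(corpus)
corpus_after_merge = merge_pair(corpus, first_pair)
```

Repeating the two calls grows the vocabulary one merged symbol at a time until a target size is reached (32000 in the configuration listed later in these slides).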

Data
• SentencePiece
  • Subword Segmentation Library
• PyTorch data utilities
  • torch.utils.data.Dataset
    • Abstract class
    • Iterable object
    • __len__()
    • __getitem__()
  • torch.utils.data.DataLoader
    • Generator object
    • Collate (callback function)
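
The protocol listed above can be sketched without any framework. The example below is an assumption-laden stand-in, not the notebook's NMTDataset: `ToyNMTDataset`, `PAD_ID`, `collate`, and `iterate_batches` are hypothetical names. In PyTorch one would subclass torch.utils.data.Dataset and pass the collate function to DataLoader through its `collate_fn` argument.

```python
class ToyNMTDataset:
    """Map-style dataset: defines __len__ and __getitem__, as torch.utils.data.Dataset expects."""

    def __init__(self, src_ids, tgt_ids):
        assert len(src_ids) == len(tgt_ids)
        self.src_ids = src_ids
        self.tgt_ids = tgt_ids

    def __len__(self):
        return len(self.src_ids)

    def __getitem__(self, index):
        return self.src_ids[index], self.tgt_ids[index]


PAD_ID = 0  # assumed padding id


def pad_to(seq, length):
    """Right-pad a list of token ids to the given length."""
    return seq + [PAD_ID] * (length - len(seq))


def collate(batch):
    """Pad every sequence in the batch to the longest one on its side."""
    srcs, tgts = zip(*batch)
    src_len = max(len(s) for s in srcs)
    tgt_len = max(len(t) for t in tgts)
    return ([pad_to(s, src_len) for s in srcs],
            [pad_to(t, tgt_len) for t in tgts])


def iterate_batches(dataset, batch_size):
    """Stand-in for DataLoader: yield collated batches in order."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield collate([dataset[i] for i in range(start, stop)])


dataset = ToyNMTDataset(
    src_ids=[[5, 6, 7], [8, 9], [10]],
    tgt_ids=[[11, 12], [13, 14, 15, 16], [17]],
)
batches = list(iterate_batches(dataset, batch_size=2))
```

Padding per batch rather than to a global maximum keeps short batches short, which is the usual reason for supplying a custom collate callback.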

Data
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20data.ipynb

Model
References
“Effective Approaches to Attention-based Neural Machine Translation”. Luong et al. 2015.
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq

Model
• Layers
  • EncLayer
  • DecCell
  • DecLayer
• Attention
  • BaseAttention
  • DotProdAttention
  • MulAttention
  • AddAttention
• Model
  • Seq2SeqWithAttn
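
As a rough illustration of what a class like DotProdAttention computes, the sketch below implements dot-product attention for one decoder state in plain Python. It is not the notebook's code: the function names and the list-of-floats representation are assumptions, and a real implementation would operate on batched tensors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys):
    """Return (weights, context) for one decoder state against all encoder outputs."""
    # Score each encoder output by its dot product with the decoder state.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder outputs.
    dim = len(query)
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context


# Toy values (hypothetical): three encoder outputs of dimension 2.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = dot_product_attention(decoder_state, encoder_outputs)
```

The first and third encoder outputs receive equal weight because both have the same dot product with the decoder state; the second, being orthogonal to it, receives less.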

Model
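• EncLayer
[Figure: EncLayer. Source tokens w_1, w_2, …, w_n pass through the Embedding Layer and the LSTM Layer, producing hidden states h_1, h_2, …, h_n, collectively 𝒉. Legend: Source, Input, Output, Target.]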

Model
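• DecCell
[Figure: DecCell. Components: Embedding, LSTM Cell, Attention, Linear Classifier. Inputs: target token w_t, previous state (s_{t-1}, c_{t-1}), encoder outputs 𝒉. Outputs: logit_t and the new state (s_t, c_t). Legend: Source, Input, Output, Target.]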

Model
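• DecLayer
  • train
[Figure: DecLayer (train). DecCell_1, DecCell_2, …, DecCell_m are unrolled over the target sequence. Inputs: w_1 (bos), w_2, …, w_{m-1}. Outputs: logit_1, logit_2, …, logit_m. Targets: w_2, w_3, …, w_m (eos). Loss: Cross Entropy between the logits and the targets. Legend: Source, Input, Output, Target.]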
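
The training setup on this slide (feed the ground-truth token at each step, compare each logit with the next token) can be sketched as follows. This is an illustrative plain-Python version, not the notebook's code: the token ids for bos and eos, the function names, and the unweighted mean over steps are assumptions.

```python
import math

BOS, EOS = 1, 2  # assumed special-token ids


def teacher_forcing_pairs(target_ids):
    """Decoder inputs start with BOS; labels are the same sequence shifted by one, ending in EOS."""
    full = [BOS] + target_ids + [EOS]
    return full[:-1], full[1:]


def cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[label]


def sequence_loss(step_logits, labels):
    """Mean cross entropy over decoding steps."""
    losses = [cross_entropy(logits, label) for logits, label in zip(step_logits, labels)]
    return sum(losses) / len(losses)


inputs, labels = teacher_forcing_pairs([7, 8, 9])
# With uniform logits over a 10-token vocabulary, every step costs log(10).
uniform_logits = [[0.0] * 10 for _ in labels]
loss = sequence_loss(uniform_logits, labels)
```

In a batched setting, padded positions would normally be masked out of the mean; that detail is omitted here.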

Model
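• DecLayer
  • infer
[Figure: DecLayer (infer). DecCell_1, DecCell_2, …, DecCell_m are unrolled, starting from the encoder state (h_n, c_n). Inputs: w_1 (bos), then the previous predictions ŵ_2, …, ŵ_{m-1}. Outputs: logit_1, logit_2, …, logit_m, giving predictions ŵ_2, ŵ_3, …, ŵ_m (eos). Legend: Source, Input, Output, Target.]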
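
At inference time each predicted token is fed back as the next input. The sketch below shows that loop with greedy (argmax) selection, which is an assumption: the slides do not say which search strategy the notebook uses. `step_fn` and `toy_step` are hypothetical stand-ins for a decoder cell, and the bos/eos ids are made up.

```python
BOS, EOS = 1, 2  # assumed special-token ids


def greedy_decode(step_fn, init_state, max_len=20):
    """Feed each predicted token back in until EOS or max_len is reached."""
    token, state, output = BOS, init_state, []
    for _ in range(max_len):
        logits, state = step_fn(token, state)
        token = max(range(len(logits)), key=logits.__getitem__)  # argmax over the vocabulary
        if token == EOS:
            break
        output.append(token)
    return output


def toy_step(token, state):
    """Hypothetical decoder cell: ignores its input and emits 5, 6, 7, then EOS."""
    script = [5, 6, 7, EOS]
    logits = [0.0] * 10
    logits[script[state]] = 1.0
    return logits, state + 1


decoded = greedy_decode(toy_step, init_state=0)
```

The `max_len` cap guards against a model that never emits eos.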

Model
• Configuration
• src_vocab_size=32000
• tar_vocab_size=32000
• attention=AddAttention
• embedding_dim=256
• hidden_dim=256
• enc_hidden_dim=256
• dec_hidden_dim=256*2 (enc_hidden_dim * n_directions)
• n_layers=2
• bidirectional=True
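
The relationship between the encoder and decoder sizes can be written down as a small config object. `Seq2SeqConfig` is a hypothetical container, not a class from the notebook; it only mirrors the values on the slide. The decoder width is presumably doubled so that it matches the concatenated forward and backward encoder states.

```python
from dataclasses import dataclass


@dataclass
class Seq2SeqConfig:
    """Hypothetical container mirroring the values on the slide."""
    src_vocab_size: int = 32000
    tar_vocab_size: int = 32000
    embedding_dim: int = 256
    enc_hidden_dim: int = 256
    n_layers: int = 2
    bidirectional: bool = True

    @property
    def n_directions(self) -> int:
        return 2 if self.bidirectional else 1

    @property
    def dec_hidden_dim(self) -> int:
        # dec_hidden_dim = enc_hidden_dim * n_directions, as stated on the slide.
        return self.enc_hidden_dim * self.n_directions


config = Seq2SeqConfig()
unidirectional = Seq2SeqConfig(bidirectional=False)
```

Deriving `dec_hidden_dim` instead of storing it keeps the two sizes from drifting apart when `bidirectional` is toggled.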

Model
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20model.ipynb

Utils
References
IBM/pytorch-seq2seq: https://github.com/IBM/pytorch-seq2seq

Utils
• Train & evaluate
• train_batches()
• evaluate_batches()
• generate_sentence()
• EarlyStopping
• Metrics (pytorch-nlp)
• bleu_score
• Accuracy
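
The slides list an EarlyStopping utility without describing its behavior. A common patience-based design is sketched below; the constructor arguments, the `step` method, and the choice to monitor validation loss are assumptions and may not match the notebook.

```python
class EarlyStopping:
    """Signal a stop once the monitored loss fails to improve `patience` times in a row."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, one per epoch.
stopper = EarlyStopping(patience=2)
history = [1.0, 0.8, 0.85, 0.9, 0.7]
stopped_at = next((epoch for epoch, loss in enumerate(history) if stopper.step(loss)), None)
```

With this history the loss fails to improve twice after epoch 1, so training would stop before the later improvement is seen; a larger `patience` trades extra epochs for robustness to such dips.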

Utils
• https://github.com/om00839/Seq2Seq/blob/master/Script%20-%20utils.ipynb


