SlideShare a Scribd company logo
1 of 37
Download to read offline
[course site]
Day 3 Lecture 4
Neural Machine
Translation
Marta R. Costa-jussà
2
Acknowledgments
Kyunghyun Cho, NVIDIA BLOGS:
https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/
3
Previous concepts from this course
● Recurrent neural network (LSTM and GRU) (handle
variable-length sequences)
● Word embeddings
● Language Modeling (assign a probability to a sentence)
4
Machine Translation background
Machine Translation is the application that is able to
automatically translate from source (S) to target (T).
5
Rule-based approach
Main approaches have been either rule-based or statistical-based
6
Statistical-based approach
Main approaches have been either rule-based or statistical-based
7
Why a new approach?
We need years to develop a nice rule-based approach
Regarding statistical systems:
(1) Word alignment and Translation are optimized separately
(2) Translation at the level of words, but difficulties with high variations in morphology (e.g. translation
English-to-Finnish)
(3) Translation by language pairs
(a) difficult to think of an automatic interlingua
(b) bad performance with low resourced-languages
8
Why Neural Machine Translation?
● Integrated MT paradigm
● Trainable at the subword/character level
● Multilingual advantages
9
What do we need?
● Parallel Corpus
Same requirement than phrase-based
systems
10
Sources of parallel corpus
● European Plenary Parlament Speeches (EPPS)
transcriptions
● Canadian Handsards
● United Nations
● CommonCrawl
● ...
International evaluation campaigns:
Conference on Machine Translation (WMT)
International Workshop on Spoken Language Translation (IWSLT)
11
What else do we need?
Same requirements than phrase-based
systems
Automatic measure
12
Towards Neural Machine Translation
13
Encoder-Decoder
Front View Side View
Representation of the sentence
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger
Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for
statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
Encoder
15
Encoder in three steps
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
(2)
(3)
(1) One hot encoding
(2) Continuous space
representation
(3) Sequence summarization
16
Step 1: One-hot encoding
Natural language words can also be one-hot encoded on a
vector of dimensionality equal to the size of the dictionary (K).
Word One-hot encoding
economic 000010...
growth 001000...
has 100000...
slowed 000001...
From previous lecture on
language modeling
17
Step 2: Projection to continuous space
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
siM Wi
E
The one-hot is linearly projected to a space of lower dimension
(typically 100-500) with matrix E for learned weights.
K
Ksi
= Ewi
18
Step 3: Recurrence
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
19
Step 3: Recurrence
Sequence
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015)
Activation function should be either
LSTM or GRU
Decoder
21
Decoder
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
RNN’s internal state zi
depends on: summary vector ht
,
previous output word ui-1
and previous internal state zi-1
.
NEW INTERNAL
STATE
22
Decoder
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
With zi
ready, we can score each word k in the vocabulary with
a dot product given this hidden state...
RNN
internal
state
Neuron
weights for
word k
23
Decoder
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
A score is higher if word vector wk
and the decoder’s internal
state zi
are similar to each other.
RNN
internal
state
Neuron
weights for
word k
Remember:
a dot product gives the
length of the projection of
one vector onto another.
Similar vectors (nearly
parallel) the projection is
longer than if they are
very different (nearly
perpendicular)
24
Decoder
Bridle, John S. "Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum
Mutual Information Estimation of Parameters." NIPS 1989
...we can finally normalize to word probabilities with a softmax.
Probability that the ith word is word k
Previous words Hidden state
Given the score for word k
25
Decoder
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
go back to the 1st step…
(1) computing the decoder’s internal state
(2) score and normalize target words
(3) select the next word
26
Decoder
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
More words for the decoded sentence are generated until a
<EOS> (End Of Sentence) “word” is predicted.
EOS
Training
28
Training: Maximum Likelihood Estimation
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
29
Computational Complexity
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
Why this may not work?
Why this may not work?
We are encoding the entire source sentence
into a single context vector
How to solve this?
With the attention-based mechanism…
more details tomorrow
33
Summary
● Machine Translation is faced as a sequence-to-sequence
problem
● The source sentence is encoded into a fixed length vector
and this fixed length vector is decoded into the final most
probable target sentence
● Only parallel corpus and automatic evaluation measures
are required to train a neural machine translation system
34
Learn more
Natural Language Understanding with
Distributed Representation, Kyunghyun Cho,
Chapter 6, 2015 (available in github)
Thanks ! Q&A ?
https://www.costa-jussa.com
marta.ruiz@upc.edu
37
Another useful image for encoding-decoding
Kyunghyun Cho, “Natural Language Understanding with Distributed Representations” (2015)
ENCODER
DECODER
input words
output words

More Related Content

What's hot

NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers Arvind Devaraj
 
Natural language processing (NLP)
Natural language processing (NLP) Natural language processing (NLP)
Natural language processing (NLP) ASWINKP11
 
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...Universitat Politècnica de Catalunya
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer modelsDing Li
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingToine Bogers
 
Natural language processing
Natural language processingNatural language processing
Natural language processingAbash shah
 
Natural language processing
Natural language processingNatural language processing
Natural language processingHansi Thenuwara
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingMariana Soffer
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Hady Elsahar
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkKnoldus Inc.
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review Jayneel Vora
 

What's hot (20)

NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
Natural language processing (NLP)
Natural language processing (NLP) Natural language processing (NLP)
Natural language processing (NLP)
 
NLP
NLPNLP
NLP
 
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Nlp
NlpNlp
Nlp
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Introduction to Transformer Model
Introduction to Transformer ModelIntroduction to Transformer Model
Introduction to Transformer Model
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Bert
BertBert
Bert
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ?
 
Word embedding
Word embedding Word embedding
Word embedding
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review
 

Viewers also liked

Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...
Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...
Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...Universitat Politècnica de Catalunya
 
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Universitat Politècnica de Catalunya
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Universitat Politècnica de Catalunya
 
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Universitat Politècnica de Catalunya
 
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...TAUS - The Language Data Network
 
Summariz3 개발기
Summariz3 개발기Summariz3 개발기
Summariz3 개발기Jamie Seol
 
Google's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation SystemGoogle's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation SystemNamHyuk Ahn
 
Deep Learning for Machine Translation, by Jean Senellart, SYSTRAN
Deep Learning for Machine Translation, by Jean Senellart, SYSTRANDeep Learning for Machine Translation, by Jean Senellart, SYSTRAN
Deep Learning for Machine Translation, by Jean Senellart, SYSTRANTAUS - The Language Data Network
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword UnitsKanji Takahashi
 

Viewers also liked (19)

Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...
Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...
Advanced Neural Machine Translation (D4L2 Deep Learning for Speech and Langua...
 
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
 
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
 
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
 
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
 
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
 
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
Deep Belief Networks (D2L1 Deep Learning for Speech and Language UPC 2017)
 
The Perceptron (D1L2 Deep Learning for Speech and Language)
The Perceptron (D1L2 Deep Learning for Speech and Language)The Perceptron (D1L2 Deep Learning for Speech and Language)
The Perceptron (D1L2 Deep Learning for Speech and Language)
 
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)
Multimodal Deep Learning (D4L4 Deep Learning for Speech and Language UPC 2017)
 
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
 
Summariz3 개발기
Summariz3 개발기Summariz3 개발기
Summariz3 개발기
 
Machine Translation: The Neural Frontier
Machine Translation: The Neural FrontierMachine Translation: The Neural Frontier
Machine Translation: The Neural Frontier
 
Google's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation SystemGoogle's Multilingual Neural Machine Translation System
Google's Multilingual Neural Machine Translation System
 
Deep Learning for Machine Translation, by Jean Senellart, SYSTRAN
Deep Learning for Machine Translation, by Jean Senellart, SYSTRANDeep Learning for Machine Translation, by Jean Senellart, SYSTRAN
Deep Learning for Machine Translation, by Jean Senellart, SYSTRAN
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units
 
On using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translationOn using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translation
 

Similar to Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)

Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)
Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)
Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Master Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorMaster Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorGiuseppe D'Onofrio
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ijnlc
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...kevig
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...kevig
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Roelof Pieters
 
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...AI Frontiers
 
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptxLongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptxSan Kim
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetEric Haibin Lin
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPindico data
 
IRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET Journal
 
Semantical Cognitive Scheduling
Semantical Cognitive SchedulingSemantical Cognitive Scheduling
Semantical Cognitive Schedulingigalshilman
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...Edge AI and Vision Alliance
 
Monotonic Multihead Attention review
Monotonic Multihead Attention reviewMonotonic Multihead Attention review
Monotonic Multihead Attention reviewJune-Woo Kim
 
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...Eugene Nho
 
Multi-modal Neural Machine Translation - Iacer Calixto
Multi-modal Neural Machine Translation - Iacer CalixtoMulti-modal Neural Machine Translation - Iacer Calixto
Multi-modal Neural Machine Translation - Iacer CalixtoSebastian Ruder
 

Similar to Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017) (20)

Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)
Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)
Neural Machine Translation (D2L10 Insight@DCU Machine Learning Workshop 2017)
 
Master Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorMaster Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslator
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
 
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptxLongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
IRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text Detection
 
Semantical Cognitive Scheduling
Semantical Cognitive SchedulingSemantical Cognitive Scheduling
Semantical Cognitive Scheduling
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Unit 1.pptx
Unit 1.pptxUnit 1.pptx
Unit 1.pptx
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
 
Monotonic Multihead Attention review
Monotonic Multihead Attention reviewMonotonic Multihead Attention review
Monotonic Multihead Attention review
 
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...
NLP Project: Machine Comprehension Using Attention-Based LSTM Encoder-Decoder...
 
Multi-modal Neural Machine Translation - Iacer Calixto
Multi-modal Neural Machine Translation - Iacer CalixtoMulti-modal Neural Machine Translation - Iacer Calixto
Multi-modal Neural Machine Translation - Iacer Calixto
 
Tackling Deep Software Variability Together
Tackling Deep Software Variability TogetherTackling Deep Software Variability Together
Tackling Deep Software Variability Together
 

More from Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 

Recently uploaded (20)

9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 

Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)

  • 1. [course site] Day 3 Lecture 4 Neural Machine Translation Marta R. Costa-jussà
  • 2. 2 Acknowledgments Kyunghyun Cho, NVIDIA BLOGS: https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/
  • 3. 3 Previous concepts from this course ● Recurrent neural network (LSTM and GRU) (handle variable-length sequences) ● Word embeddings ● Language Modeling (assign a probability to a sentence)
  • 4. 4 Machine Translation background Machine Translation is the application that is able to automatically translate from source (S) to target (T).
  • 5. 5 Rule-based approach Main approaches have been either rule-based or statistical-based
  • 6. 6 Statistical-based approach Main approaches have been either rule-based or statistical-based
  • 7. 7 Why a new approach? We need years to develop a nice rule-based approach Regarding statistical systems: (1) Word alignment and Translation are optimized separately (2) Translation at the level of words, but difficulties with high variations in morphology (e.g. translation English-to-Finnish) (3) Translation by language pairs (a) difficult to think of an automatic interlingua (b) bad performance with low resourced-languages
  • 8. 8 Why Neural Machine Translation? ● Integrated MT paradigm ● Trainable at the subword/character level ● Multilingual advantages
  • 9. 9 What do we need? ● Parallel Corpus Same requirement than phrase-based systems
  • 10. 10 Sources of parallel corpus ● European Plenary Parlament Speeches (EPPS) transcriptions ● Canadian Handsards ● United Nations ● CommonCrawl ● ... International evaluation campaigns: Conference on Machine Translation (WMT) International Workshop on Spoken Language Translation (IWSLT)
  • 11. 11 What else do we need? Same requirements than phrase-based systems Automatic measure
  • 13. 13 Encoder-Decoder Front View Side View Representation of the sentence Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
  • 15. 15 Encoder in three steps Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) (2) (3) (1) One hot encoding (2) Continuous space representation (3) Sequence summarization
  • 16. 16 Step 1: One-hot encoding Natural language words can also be one-hot encoded on a vector of dimensionality equal to the size of the dictionary (K). Word One-hot encoding economic 000010... growth 001000... has 100000... slowed 000001... From previous lecture on language modeling
  • 17. 17 Step 2: Projection to continuous space Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) siM Wi E The one-hot is linearly projected to a space of lower dimension (typically 100-500) with matrix E for learned weights. K Ksi = Ewi
  • 18. 18 Step 3: Recurrence Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
  • 19. 19 Step 3: Recurrence Sequence Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) Activation function should be either LSTM or GRU
  • 21. 21 Decoder Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) RNN’s internal state zi depends on: summary vector ht , previous output word ui-1 and previous internal state zi-1 . NEW INTERNAL STATE
  • 22. 22 Decoder Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) With zi ready, we can score each word k in the vocabulary with a dot product given this hidden state... RNN internal state Neuron weights for word k
  • 23. 23 Decoder Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) A score is higher if word vector wk and the decoder’s internal state zi are similar to each other. RNN internal state Neuron weights for word k Remember: a dot product gives the length of the projection of one vector onto another. Similar vectors (nearly parallel) the projection is longer than if they are very different (nearly perpendicular)
  • 24. 24 Decoder Bridle, John S. "Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters." NIPS 1989 ...we can finally normalize to word probabilities with a softmax. Probability that the ith word is word k Previous words Hidden state Given the score for word k
  • 25. 25 Decoder Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) go back to the 1st step… (1) computing the decoder’s internal state (2) score and normalize target words (3) select the next word
  • 26. 26 Decoder Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) More words for the decoded sentence are generated until a <EOS> (End Of Sentence) “word” is predicted. EOS
  • 28. 28 Training: Maximum Likelihood Estimation Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
  • 29. 29 Computational Complexity Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
  • 30. Why this may not work?
  • 31. Why this may not work? We are encoding the entire source sentence into a single context vector
  • 32. How to solve this? With the attention-based mechanism… more details tomorrow
  • 33. 33 Summary ● Machine Translation is faced as a sequence-to-sequence problem ● The source sentence is encoded into a fixed length vector and this fixed length vector is decoded into the final most probable target sentence ● Only parallel corpus and automatic evaluation measures are required to train a neural machine translation system
  • 34. 34 Learn more Natural Language Understanding with Distributed Representation, Kyunghyun Cho, Chapter 6, 2015 (available in github)
  • 35. Thanks ! Q&A ? https://www.costa-jussa.com marta.ruiz@upc.edu
  • 36.
  • 37. 37 Another useful image for encoding-decoding Kyunghyun Cho, “Natural Language Understanding with Distributed Representations” (2015) ENCODER DECODER input words output words