Translation as probability
“Decoding”
Training
“Log-linear”
Ain’t got nothin’ but the BLEUs?
The SMT lifecycle
Statistical machine translation in a few slides
Mikel L. Forcada¹,²
¹Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant,
E-03071 Alacant (Spain)
²Prompsit Language Engineering, S.L., E-03690 St. Vicent del Raspeig (Spain)
April 14-16, 2009: Free/open-source MT tutorial at the
CNGL
Mikel L. Forcada SMT in a few slides
Contents
1 Translation as probability
2 “Decoding”
3 Training
4 “Log-linear”
5 Ain’t got nothin’ but the BLEUs?
6 The SMT lifecycle
The “canonical” model
Translation as probability/1
Instead of saying that
a source-language (SL) sentence s in an SL text
and a target-language (TL) sentence t,
as found in an SL–TL bitext, are or are not a translation of
each other,
in SMT one says that they are a translation of each other
with a probability p(s, t) = p(t, s) (a joint probability).
We’ll assume we have such a probability model available.
Or at least a reasonable estimate.
Translation as probability/2
According to basic probability laws, we can write:
p(s, t) = p(t, s) = p(s|t)p(t) = p(t|s)p(s) (1)
where p(x|y) is the conditional probability of x given y.
We are interested in translating from SL to TL. That is, we
want to find the most likely translation given the SL
sentence s:
\[ t^* = \arg\max_t \, p(t \mid s) \qquad (2) \]
The “canonical” model
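We can rewrite eq. (1) as
\[ p(t \mid s) = \frac{p(s \mid t)\, p(t)}{p(s)} \qquad (3) \]
and combine it with (2). Since p(s) does not depend on t, it can be
dropped from the arg max, which gives
\[ t^* = \arg\max_t \, p(s \mid t)\, p(t) \qquad (4) \]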
“Decoding”/1
\[ t^* = \arg\max_t \, p(s \mid t)\, p(t) \]
We have a product of two probability models:
A reverse translation model p(s|t) which tells us how likely
the SL sentence s is a translation of the candidate TL
sentence t, and
a target-language model p(t) which tells us how likely the
sentence t is in the TL side of bitexts.
These may be related (respectively) to the usual notions of
[reverse] adequacy: how much of the meaning of t is
conveyed by s
fluency: how fluent is the candidate TL sentence.
The arg max strikes a balance between the two.
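As a minimal sketch of how the arg max balances the two models, assume made-up probabilities for three candidate translations of a single SL sentence. None of the numbers below come from a real system; the names `candidates` and `noisy_channel_score` are illustrative only.

```python
import math

# Hypothetical toy values for one SL sentence s and three candidate TL sentences t.
# "p_s_given_t" plays the role of the reverse translation model p(s|t),
# "p_t" the role of the target-language model p(t).
candidates = {
    "the house is small": {"p_s_given_t": 0.20, "p_t": 0.010},
    "small the is house": {"p_s_given_t": 0.25, "p_t": 0.0001},
    "the home is little": {"p_s_given_t": 0.05, "p_t": 0.004},
}


def noisy_channel_score(entry):
    """Return log p(s|t) + log p(t); logs avoid underflow on long sentences."""
    return math.log(entry["p_s_given_t"]) + math.log(entry["p_t"])


# The arg max over the (here, tiny and explicit) candidate set.
best = max(candidates, key=lambda t: noisy_channel_score(candidates[t]))
print(best)
```

In this toy setting the scrambled candidate has the highest p(s|t) but a very low p(t), so the fluent candidate wins once both factors are multiplied.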
“Decoding”/2
In SMT parlance, the process of finding t* is called
decoding.¹
Obviously, it does not explore all possible translations t in
the search space. There are infinitely many.
The search space is pruned.
Therefore, one just gets a reasonable t instead of the
ideal t*.
Pruning and search strategies are a very active research
topic.
Free/open-source software: Moses.
¹ Reading SMT articles usually entails deciphering jargon which may be
very obscure to outsiders or newcomers.
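One common pruning strategy is beam search. The sketch below is a deliberately simplified, monotone (no reordering) word-by-word decoder, not how Moses works internally. The tables `LEX` and `BIGRAM`, the `FLOOR` value, and the function `beam_decode` are all hypothetical.

```python
import math

# Hypothetical lexical model: SL word -> {TL word: p(sl_word | tl_word)}.
LEX = {
    "la": {"the": 0.9, "it": 0.1},
    "casa": {"house": 0.7, "home": 0.3},
}
# Hypothetical bigram language model p(word | previous word).
BIGRAM = {
    ("<s>", "the"): 0.5, ("<s>", "it"): 0.2,
    ("the", "house"): 0.3, ("the", "home"): 0.05,
    ("it", "house"): 0.01, ("it", "home"): 0.02,
}
FLOOR = 1e-4  # probability assigned to bigrams missing from the table


def beam_decode(source_words, beam_size=2):
    """Translate left to right, keeping only the beam_size best partial hypotheses."""
    beam = [(0.0, ["<s>"])]  # each hypothesis: (log score, TL words so far)
    for sl_word in source_words:
        expanded = []
        for score, words in beam:
            for tl_word, p_lex in LEX[sl_word].items():
                p_lm = BIGRAM.get((words[-1], tl_word), FLOOR)
                new_score = score + math.log(p_lex) + math.log(p_lm)
                expanded.append((new_score, words + [tl_word]))
        # Pruning: everything outside the beam is discarded for good.
        expanded.sort(key=lambda hyp: hyp[0], reverse=True)
        beam = expanded[:beam_size]
    best_score, best_words = beam[0]
    return " ".join(best_words[1:]), best_score


translation, log_score = beam_decode(["la", "casa"])
print(translation, log_score)
```

Because discarded hypotheses are never revisited, a small beam can miss the true arg max; that is the trade-off between speed and search error.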
Training/1
So where do these probabilities come from?
p(t) may easily be estimated from a large monolingual TL
corpus (free/open-source software: irstlm)
The estimation of p(s|t) is more complex. It’s usually made
of
a lexical model describing the probability that the
translation of a certain TL word or sequence of words
(“phrase”²) is a certain SL word or sequence of words.
an alignment model describing the reordering of words or
“phrases”.
² A very unfortunate choice in SMT jargon.
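To illustrate how p(t) might be estimated from a monolingual TL corpus, here is a minimal bigram model with add-one smoothing. It is a sketch under simplifying assumptions (a three-sentence corpus, whitespace tokenisation), not a description of what irstlm does; `train_bigram_lm` and `sentence_prob` are hypothetical names.

```python
from collections import Counter


def train_bigram_lm(sentences):
    """Return a function prob(word, prev) estimating p(word | prev) with add-one smoothing."""
    context_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(words)
        for prev, word in zip(words, words[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    vocab_size = len(vocab)

    def prob(word, prev):
        # Add-one smoothing keeps unseen bigrams from getting probability zero.
        return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)

    return prob


def sentence_prob(prob, sentence):
    """p(t) as a product of bigram probabilities, including sentence boundaries."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= prob(word, prev)
    return p


corpus = ["the house is small", "the house is big", "the car is small"]
lm = train_bigram_lm(corpus)
print(sentence_prob(lm, "the house is small"))
print(sentence_prob(lm, "small is house the"))
```

The fluent word order receives a much higher probability than the scrambled one, which is the behaviour the decoder relies on.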
Training/2
The lexical model and the alignment model are estimated
using a large sentence-aligned bilingual corpus through a
complex iterative process.
An initial set of lexical probabilities is obtained by
assuming, for instance, that any word in the TL sentence
aligns with any word in its SL counterpart with equal probability. And then:
Alignment probabilities in accordance with the lexical
probabilities are computed.
Lexical probabilities are obtained in accordance with the
alignment probabilities.
This process (“expectation maximization”) is repeated a
fixed number of times or until some convergence is
observed (free/open-source software: Giza++).
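The iterative process above can be sketched with the simplest word-based model (usually called IBM Model 1), which learns only lexical probabilities p(s|t) and treats all alignments as equally likely a priori. Giza++ implements considerably richer models; the toy bitext and the function name `ibm_model1` below are assumptions for illustration.

```python
from collections import defaultdict


def ibm_model1(bitext, iterations=10):
    """bitext: list of (sl_words, tl_words). Returns {(s, t): estimate of p(s | t)}."""
    sl_vocab = {s for sl_words, _ in bitext for s in sl_words}
    # Uniform start: any TL word may align with any co-occurring SL word.
    prob = {(s, t): 1.0 / len(sl_vocab)
            for sl_words, tl_words in bitext for s in sl_words for t in tl_words}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # Expectation: soft alignment counts under the current lexical probabilities.
        for sl_words, tl_words in bitext:
            for s in sl_words:
                norm = sum(prob[(s, t)] for t in tl_words)
                for t in tl_words:
                    fraction = prob[(s, t)] / norm
                    count[(s, t)] += fraction
                    total[t] += fraction
        # Maximization: re-estimate lexical probabilities from the soft counts.
        for (s, t), c in count.items():
            prob[(s, t)] = c / total[t]
    return prob


bitext = [
    (["la", "casa"], ["the", "house"]),
    (["la", "flor"], ["the", "flower"]),
    (["una", "flor"], ["a", "flower"]),
]
lexical = ibm_model1(bitext)
for pair in [("la", "the"), ("casa", "the"), ("casa", "house"), ("flor", "flower")]:
    print(pair, round(lexical[pair], 3))
```

Even though no sentence pair says which word translates which, co-occurrence across sentences lets the probabilities for pairs such as ("la", "the") grow at the expense of ("casa", "the").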
Training/3
In “phrase-based” SMT, alignments may be used to extract
(SL-phrase, TL-phrase) pairs of phrases
and their corresponding probabilities
for easier decoding and to avoid “word salad”.
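A minimal sketch of phrase-pair extraction from one word-aligned sentence pair follows. It uses the usual consistency criterion (no word inside the pair may be aligned to a word outside it) but omits refinements such as extending pairs over unaligned words; `extract_phrases`, `max_len`, and the example alignment are hypothetical.

```python
def extract_phrases(sl_words, tl_words, alignment, max_len=4):
    """alignment: set of (i, j) links between sl_words[i] and tl_words[j]."""
    pairs = []
    for i1 in range(len(sl_words)):
        for i2 in range(i1, min(i1 + max_len, len(sl_words))):
            # TL positions linked to the SL span [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: no word in the TL span may be linked outside the SL span.
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.append((" ".join(sl_words[i1:i2 + 1]),
                              " ".join(tl_words[j1:j2 + 1])))
    return pairs


sl = "la casa verde".split()
tl = "the green house".split()
links = {(0, 0), (1, 2), (2, 1)}  # la-the, casa-house, verde-green
phrase_pairs = extract_phrases(sl, tl, links)
for pair in phrase_pairs:
    print(pair)
```

Note that ("casa verde", "green house") is extracted as a unit, so the local reordering is stored inside the phrase pair rather than left to the decoder. Probabilities for the pairs could then be estimated, for instance, by relative frequency over the whole corpus.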
“Log-linear”/1
More SMT jargon!
It’s short for linear combination of logarithms of
probabilities.
And, sometimes, even features that aren’t logarithms or
probabilities of any kind.
OK, let’s take a look at the maths.
“Log-linear”/2
One can write a more general formula:
\[ p(t \mid s) = \frac{\exp\left( \sum_{k=1}^{n_F} \lambda_k f_k(t, s) \right)}{Z} \qquad (5) \]
with nF feature functions fk(t, s) which can depend on s, t
or both.
Setting nF = 2, f1(t, s) = log p(s|t), f2(t, s) = log p(t),
λ1 = λ2 = 1, and Z = p(s) one recovers the canonical formula (3).
The best translation is then
\[ t^* = \arg\max_t \sum_{k=1}^{n_F} \lambda_k f_k(t, s) \qquad (6) \]
Most of the fk(t, s) are logarithms, hence “log-linear”.
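The following sketch scores candidates with a weighted sum of features and normalises over an explicit candidate list. That normaliser is only a stand-in for Z, since a real system cannot enumerate all t. The feature values, `loglinear_score`, and `loglinear_posterior` are assumptions made for illustration.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of feature values: the quantity maximised in the arg max."""
    return sum(w * f for w, f in zip(weights, features))


def loglinear_posterior(candidate_features, weights):
    """exp(score) normalised over the given candidates only (a stand-in for Z)."""
    scores = {t: loglinear_score(f, weights) for t, f in candidate_features.items()}
    z = sum(math.exp(v) for v in scores.values())
    return {t: math.exp(v) / z for t, v in scores.items()}


# Hypothetical features per candidate: (log p(s|t), log p(t)).
candidate_features = {
    "the house is small": (math.log(0.20), math.log(0.010)),
    "small the is house": (math.log(0.25), math.log(0.0001)),
}

# Both weights equal to 1 reproduce the ranking of the canonical model.
posterior = loglinear_posterior(candidate_features, weights=(1.0, 1.0))
best = max(posterior, key=posterior.get)

# Setting the language-model weight to 0 lets the disfluent candidate win.
no_lm = loglinear_posterior(candidate_features, weights=(1.0, 0.0))
best_without_lm = max(no_lm, key=no_lm.get)
print(best, "|", best_without_lm)
```

Changing the weights changes which candidate wins, which is why their values have to be chosen with some care.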
“Log-linear”/3
“Feature selection is a very open problem in SMT” (Lopez
2008)
Other possible functions include length penalties
(discouraging unreasonably short or long translations),
“inverted” versions of p(s|t), etc.
Where do we get the λk’s from?
They are usually tuned so as to optimize the results on a
tuning set, according to a certain objective function that
is taken to be an indicator that correlates with translation
quality, and
may be automatically obtained from the output of the SMT
system and the reference translation in the corpus.
This is sometimes called MERT (minimum error rate training)
(free/open-source software: the Moses suite).
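The idea of tuning can be sketched with a brute-force grid search over two weights. Real MERT implementations use a much more efficient search and an indicator such as BLEU; here the objective is a plain exact-match rate, and the tuning set, feature values, and function names are all hypothetical.

```python
import itertools

# Hypothetical tuning set: per SL sentence, a reference translation and an n-best
# list of (candidate, (log p(s|t), log p(t))) pairs.
TUNING_SET = [
    {"reference": "the house is small",
     "nbest": [("small the is house", (-1.4, -9.2)),
               ("the house is small", (-1.6, -4.6))]},
    {"reference": "the flower is red",
     "nbest": [("red flower the is", (-1.2, -9.0)),
               ("the flower is red", (-2.0, -5.0))]},
]


def pick_best(nbest, weights):
    """Candidate with the highest weighted feature sum under the given weights."""
    return max(nbest, key=lambda item: sum(w * f for w, f in zip(weights, item[1])))[0]


def exact_match_rate(weights):
    """Toy objective: fraction of tuning sentences whose best candidate is the reference."""
    hits = sum(pick_best(ex["nbest"], weights) == ex["reference"] for ex in TUNING_SET)
    return hits / len(TUNING_SET)


def tune(grid=(0.0, 0.5, 1.0)):
    """Try every weight pair on the grid and keep the best-scoring one."""
    return max(itertools.product(grid, repeat=2), key=exact_match_rate)


best_weights = tune()
print(best_weights, exact_match_rate(best_weights))
```

A grid search scales poorly with the number of features, which is one reason dedicated tuning algorithms exist.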
Ain’t got nothin’ but the BLEUs?
The most famous “quality indicator” is called BLEU, but
there are many others.
BLEU counts which fraction of the 1-word,
2-word, ..., n-word sequences in the output match the
reference translation.
Correlation with subjective assessments of quality is still an
open question.
A lot of SMT research is currently BLEU-driven and makes
little contact with real applications of MT.
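A simplified, sentence-level, single-reference version of BLEU can be sketched as follows. The real metric is computed over a whole test corpus and may use several references; the function names here are hypothetical.

```python
import math
from collections import Counter


def ngrams(words, n):
    """Multiset of the n-word sequences in a list of words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with clipped counts."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def simple_bleu(candidate, reference, max_n=4):
    """Geometric mean of 1..max_n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = [modified_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalise outputs shorter than the reference, which would otherwise score well.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * geo_mean


print(simple_bleu("the house is small", "the house is small"))
print(simple_bleu("the house is big", "the house is small", max_n=2))
```

With max_n=2, the second call combines a unigram precision of 3/4 and a bigram precision of 2/3. Note that any sentence with no matching 4-word sequence scores zero at the default setting, one reason BLEU is normally reported at corpus level.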
The SMT lifecycle
Development:
Training: monolingual and sentence-aligned
bilingual corpora are used to estimate
probability models (features)
Tuning: a held-out portion of the
sentence-aligned bilingual corpus is
used to tune the coefficients λk
Decoding: sentences s are fed into the SMT system and
“decoded” into their translations t.
Evaluation: the system is evaluated against a reference
corpus.
License
This work may be distributed under the terms of
the Creative Commons Attribution–Share Alike license:
http://creativecommons.org/licenses/by-sa/3.0/
the GNU GPL v. 3.0 License:
http://www.gnu.org/licenses/gpl.html
Dual license! E-mail me to get the sources: mlf@ua.es
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 

Smt in-a-few-slides

  • 1. Statistical machine translation in a few slides. Mikel L. Forcada (1, 2). (1) Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, E-03071 Alacant (Spain); (2) Prompsit Language Engineering, S.L., E-03690 St. Vicent del Raspeig (Spain). April 14-16, 2009: Free/open-source MT tutorial at the CNGL.
  • 2. Contents: 1. Translation as probability; 2. “Decoding”; 3. Training; 4. “Log-linear”; 5. Ain’t got nothin’ but the BLEUs?; 6. The SMT lifecycle.
  • 3. Translation as probability/1. Instead of saying that a source-language (SL) sentence s in a SL text and a target-language (TL) sentence t, as found in a SL–TL bitext, are or are not a translation of each other, in SMT one says that they are a translation of each other with a probability $p(s, t) = p(t, s)$ (a joint probability). We’ll assume we have such a probability model available. Or at least a reasonable estimate.
  • 4. Translation as probability/2. According to basic probability laws, we can write: $p(s, t) = p(t, s) = p(s \mid t)\, p(t) = p(t \mid s)\, p(s)$ (1), where $p(x \mid y)$ is the conditional probability of x given y. We are interested in translating from SL to TL. That is, we want to find the most likely translation given the SL sentence s: $t^{*} = \arg\max_{t} p(t \mid s)$ (2).
  • 5. The “canonical” model. We can rewrite eq. (1) as $p(t \mid s) = \frac{p(s \mid t)\, p(t)}{p(s)}$ (3) and combine it with (2). Since $p(s)$ does not depend on t, it can be dropped from the arg max, which gives $t^{*} = \arg\max_{t} p(s \mid t)\, p(t)$ (4).
  • 6. “Decoding”/1. $t^{*} = \arg\max_{t} p(s \mid t)\, p(t)$. We have a product of two probability models: a reverse translation model $p(s \mid t)$, which tells us how likely the SL sentence s is as a translation of the candidate TL sentence t, and a target-language model $p(t)$, which tells us how likely the sentence t is in the TL side of bitexts. These may be related (respectively) to the usual notions of [reverse] adequacy (how much of the meaning of t is conveyed by s) and fluency (how fluent the candidate TL sentence is). The arg max strikes a balance between the two.
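To make the balance between the two models concrete, here is a minimal sketch in Python. All the numbers are hypothetical, invented for illustration only; the candidate list, `p_s_given_t`, `p_t` and `noisy_channel_score` are names assumed here and do not come from the slides.

```python
import math

# Hypothetical toy values for one fixed SL sentence s = "la casa verde".
# Reverse translation model p(s|t): word order barely matters to it.
p_s_given_t = {
    "the green house": 0.30,
    "the house green": 0.35,
    "green the house": 0.30,
}
# Target-language model p(t): strongly prefers fluent word order.
p_t = {
    "the green house": 0.020,
    "the house green": 0.001,
    "green the house": 0.0001,
}

def noisy_channel_score(t):
    """Return log p(s|t) + log p(t), the log of the product in eq. (4)."""
    return math.log(p_s_given_t[t]) + math.log(p_t[t])

# Best candidate under the translation model alone, and under the product.
adequacy_only = max(p_s_given_t, key=p_s_given_t.get)
best = max(p_s_given_t, key=noisy_channel_score)
print(adequacy_only, "|", best)
```

With these made-up values the translation model alone prefers the disfluent candidate, while the product picks the fluent one.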
  • 7. “Decoding”/2. In SMT parlance, the process of finding $t^{*}$ is called decoding (footnote 1). Obviously, it does not explore all possible translations t in the search space: there are infinitely many. The search space is pruned. Therefore, one just gets a reasonable t instead of the ideal $t^{*}$. Pruning and search strategies are a very active research topic. Free/open-source software: Moses. Footnote 1: Reading SMT articles usually entails deciphering jargon which may be very obscure to outsiders or newcomers.
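One common way to prune the search space is beam search. The sketch below is a heavily simplified illustration, assuming a monotone, word-by-word decoder with no reordering and no phrases, which a real decoder such as Moses does handle. The `lexicon` and `bigram` tables, the back-off value 0.01 and the function names are all hypothetical.

```python
import math

# Hypothetical lexical model: lexicon[s_word][t_word] stands for p(s_word | t_word).
lexicon = {
    "la": {"the": 0.9, "it": 0.1},
    "casa": {"house": 0.8, "home": 0.2},
    "verde": {"green": 0.9, "unripe": 0.1},
}
# Hypothetical bigram language model; unseen bigrams get a small constant.
bigram = {
    ("<s>", "the"): 0.5,
    ("the", "house"): 0.3,
    ("the", "home"): 0.1,
    ("house", "green"): 0.05,
}

def lm(prev, cur):
    """Return the toy bigram probability of cur following prev."""
    return bigram.get((prev, cur), 0.01)

def beam_decode(source, beam_size=2):
    """Translate left to right, keeping only the beam_size best partial hypotheses."""
    beam = [(0.0, ["<s>"])]  # (log score, target words so far)
    for s_word in source.split():
        expanded = []
        for score, hyp in beam:
            for t_word, p_lex in lexicon[s_word].items():
                new_score = score + math.log(p_lex) + math.log(lm(hyp[-1], t_word))
                expanded.append((new_score, hyp + [t_word]))
        # Pruning step: discard everything outside the beam.
        beam = sorted(expanded, key=lambda item: item[0], reverse=True)[:beam_size]
    best_score, best_hyp = beam[0]
    return " ".join(best_hyp[1:]), best_score

translation, score = beam_decode("la casa verde")
print(translation, score)
```

Because the sketch never reorders words, its output keeps the SL word order; the point is only that each step scores a handful of hypotheses instead of all of them.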
  • 8. Training/1. So where do these probabilities come from? $p(t)$ may easily be estimated from a large monolingual TL corpus (free/open-source software: irstlm). The estimation of $p(s \mid t)$ is more complex. It’s usually made of: a lexical model describing the probability that the translation of a certain TL word or sequence of words (“phrase”, footnote 2) is a certain SL word or sequence of words; and an alignment model describing the reordering of words or “phrases”. Footnote 2: A very unfortunate choice in SMT jargon.
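As an illustration of estimating $p(t)$ from monolingual text, here is a minimal sketch of a bigram language model with add-one smoothing. This is an assumed, textbook-style estimator and not the interface or the method of the irstlm toolkit; the three-sentence corpus and the function names are hypothetical.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count bigrams and their histories over whitespace-tokenized sentences."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(tokens)
        history.update(tokens[:-1])  # every token that is followed by another
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return history, bigrams, len(vocab)

def log_prob(sentence, model):
    """Return log p(t) under the add-one-smoothed bigram model."""
    history, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (history[prev] + vocab_size))
    return total

corpus = ["the house is green", "the green house is big", "the house is big"]
model = train_bigram_lm(corpus)
fluent = log_prob("the green house", model)
salad = log_prob("house the green", model)
print(fluent, salad)
```

Even on this tiny corpus, the fluent word order receives a higher log-probability than the scrambled one.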
  • 9. Training/2. The lexical model and the alignment model are estimated using a large sentence-aligned bilingual corpus through a complex iterative process. An initial set of lexical probabilities is obtained by assuming, for instance, that any word in the TL sentence aligns with any word in its SL counterpart. And then: (a) alignment probabilities in accordance with the lexical probabilities are computed; (b) lexical probabilities are obtained in accordance with the alignment probabilities. This process (“expectation maximization”) is repeated a fixed number of times or until some convergence is observed (free/open-source software: Giza++).
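The alternation described above can be sketched with a simplified estimator in the style of IBM Model 1, which has lexical probabilities only and no reordering model. This is an illustrative assumption, not the procedure Giza++ actually runs; the toy bitext and the name `ibm_model1` are hypothetical.

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """Estimate prob[(s_word, t_word)], standing for p(s_word | t_word), by EM.

    bitext is a list of (sl_tokens, tl_tokens) sentence pairs.
    """
    sl_vocab = {w for sl, _ in bitext for w in sl}
    # Initialization: every SL word is equally likely for every TL word.
    prob = defaultdict(lambda: 1.0 / len(sl_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each SL word over the TL words of its sentence,
        # in proportion to the current lexical probabilities.
        for sl, tl in bitext:
            for s_word in sl:
                norm = sum(prob[(s_word, t_word)] for t_word in tl)
                for t_word in tl:
                    frac = prob[(s_word, t_word)] / norm
                    count[(s_word, t_word)] += frac
                    total[t_word] += frac
        # M-step: re-estimate lexical probabilities from the fractional counts.
        prob = defaultdict(
            float, {pair: c / total[pair[1]] for pair, c in count.items()}
        )
    return prob

bitext = [
    ("la casa".split(), "the house".split()),
    ("la flor".split(), "the flower".split()),
    ("casa".split(), "house".split()),
]
prob = ibm_model1(bitext)
print(prob[("casa", "house")], prob[("la", "the")])
```

Starting from uniform values, the one-word pair (“casa”, “house”) pulls probability mass away from the competing alignments over the iterations.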
  • 10. Training/3. In “phrase-based” SMT, alignments may be used to extract (SL-phrase, TL-phrase) pairs of phrases and their corresponding probabilities, for easier decoding and to avoid “word salad”.
  • 11. “Log-linear”/1. More SMT jargon! It’s short for linear combination of logarithms of probabilities. And, sometimes, even features that aren’t logarithms or probabilities of any kind. OK, let’s take a look at the maths.
  • 12. “Log-linear”/2. One can write a more general formula: $p(t \mid s) = \frac{\exp\left(\sum_{k=1}^{n_F} \lambda_k f_k(t, s)\right)}{Z}$ (5), with $n_F$ feature functions $f_k(t, s)$ which can depend on s, t or both. Setting $n_F = 2$, $f_1(t, s) = \log p(s \mid t)$, $f_2(t, s) = \log p(t)$, $\lambda_1 = \lambda_2 = 1$ and $Z = p(s)$, one recovers the canonical formula (3). The best translation is then $t^{*} = \arg\max_{t} \sum_{k=1}^{n_F} \lambda_k f_k(t, s)$ (6). Most of the $f_k(t, s)$ are logarithms, hence “log-linear”.
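The following sketch shows eqs. (5) and (6) on a toy candidate list. It assumes that Z is approximated by summing over a small, explicit set of candidates, which is an illustrative shortcut, and all feature values and names are hypothetical.

```python
import math

def log_linear_score(features, weights):
    """Return the weighted sum of feature values, i.e. the exponent in eq. (5)."""
    return sum(w * f for w, f in zip(weights, features))

def log_linear_posterior(candidates, weights):
    """Return p(t|s) for each candidate, with Z summed over the candidate set only."""
    scores = {t: log_linear_score(f, weights) for t, f in candidates.items()}
    z = sum(math.exp(sc) for sc in scores.values())
    return {t: math.exp(sc) / z for t, sc in scores.items()}

# Hypothetical features per candidate: (log p(s|t), log p(t)).
candidates = {
    "the green house": (math.log(0.30), math.log(0.020)),
    "the house green": (math.log(0.35), math.log(0.001)),
}

# Both weights set to 1: the canonical model of eq. (4).
canonical = log_linear_posterior(candidates, (1.0, 1.0))
best_canonical = max(canonical, key=canonical.get)

# Language model switched off: only the translation model counts.
no_lm = log_linear_posterior(candidates, (1.0, 0.0))
best_no_lm = max(no_lm, key=no_lm.get)
print(best_canonical, "|", best_no_lm)
```

Changing the weights changes which candidate wins, which is why their values have to be chosen with some care.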
  • 13. “Log-linear”/3. “Feature selection is a very open problem in SMT” (Lopez 2008). Other possible feature functions include length penalties (discouraging unreasonably short or long translations), “inverted” versions of $p(s \mid t)$, etc. Where do we get the $\lambda_k$’s from? They are usually tuned so as to optimize the results on a tuning set, according to a certain objective function that (a) is taken to be an indicator that correlates with translation quality and (b) may be automatically obtained from the output of the SMT system and the reference translation in the corpus. This is sometimes called MERT (minimum error rate training) (free/open-source software: the Moses suite).
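To illustrate the idea of tuning, the sketch below runs a brute-force grid search over two weights. It is only a stand-in: actual MERT implementations use a more efficient search, and the objective here is a crude exact-match rate rather than a metric such as BLEU. The tuning set, feature values and function names are hypothetical.

```python
from itertools import product

# Hypothetical tuning set: (n-best list of (candidate, features), reference),
# where features = (log p(s|t), log p(t)).
tuning_set = [
    ([("the house green", (-1.0, -6.9)), ("the green house", (-1.2, -3.9))],
     "the green house"),
    ([("a flower red", (-1.1, -7.5)), ("a red flower", (-1.5, -4.2))],
     "a red flower"),
]

def decode(nbest, weights):
    """Return the candidate with the highest weighted feature sum."""
    return max(nbest, key=lambda c: sum(w * f for w, f in zip(weights, c[1])))[0]

def accuracy(weights):
    """Return the fraction of tuning sentences whose top candidate is the reference."""
    hits = sum(decode(nbest, weights) == ref for nbest, ref in tuning_set)
    return hits / len(tuning_set)

# Try every weight combination on a coarse grid and keep the best one.
grid = [0.0, 0.5, 1.0]
best_weights = max(product(grid, repeat=2), key=accuracy)
print(best_weights, accuracy(best_weights))
```

On this toy data, any weight setting that ignores the language model scores zero, so the search ends up with a nonzero language model weight.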
  • 14. Ain’t got nothin’ but the BLEUs? The most famous “quality indicator” is called BLEU, but there are many others. BLEU counts which fraction of the 1-word, 2-word, ..., n-word sequences in the output match the reference translation. Correlation with subjective assessments of quality is still an open question. A lot of SMT research is currently BLEU-driven and makes little contact with real applications of MT.
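The n-gram counting can be sketched as follows. This is a simplified, sentence-level, single-reference variant with no smoothing; it combines the n-gram precisions by a geometric mean and applies a brevity penalty, as BLEU does, but it is not a replacement for a standard BLEU implementation. The function names are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-word sequences in tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=4):
    """Return a simplified BLEU score between 0 and 1 for one sentence pair."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: punish outputs shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

reference = "the house is green"
print(simple_bleu("the house is green", reference))
print(simple_bleu("the house is big", reference, max_n=2))
print(simple_bleu("green is house the", reference, max_n=2))
```

The scrambled output shares every word with the reference but no 2-word sequence, so its score collapses, which shows how the longer n-grams reward word order.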
  • 15. The SMT lifecycle. Development: (a) Training: monolingual and sentence-aligned bilingual corpora are used to estimate probability models (features); (b) Tuning: a held-out portion of the sentence-aligned bilingual corpus is used to tune the coefficients $\lambda_k$. Decoding: sentences s are fed into the SMT system and “decoded” into their translations t. Evaluation: the system is evaluated against a reference corpus.
  • 16. License. This work may be distributed under the terms of: the Creative Commons Attribution–Share Alike license: http://creativecommons.org/licenses/by-sa/3.0/; the GNU GPL v. 3.0 License: http://www.gnu.org/licenses/gpl.html. Dual license! E-mail me to get the sources: mlf@ua.es