STATISTICAL MACHINE
TRANSLATION
Presented by Asim Ullah Jan
Agenda
• What is Statistical Machine Translation?
• How to build an SMT system?
• Sentence alignment
• Word alignment
• What is Probability?
• History of Probability
• Conditional Probability
Agenda (Cont.)
• Language Model
• Translation Model
• SMT Basic Architecture
• Bayes' Theorem
• Chain Rule
SMT Model
• Fundamental Equation of Machine Translation:
• Ê = argmax_E P(F|E) · P(E)
• Translation model: find the best target-language word or phrase for each input unit.
Result: a bag of words/phrases.
• P(F|E): probability that the English candidate E would produce the French input F
• Language model: pick one word/phrase at a time from the bag of words/phrases
and build up a sentence.
• P(E): probability governing how the English words are properly composed into a fluent sentence
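The decision rule above can be sketched as a tiny search over candidate English sentences. All probability values below are invented for illustration; a real system would score millions of hypotheses with learned models.

```python
# Toy translation-model scores P(F|E): how likely the French input is,
# given each candidate English sentence (illustrative values only).
p_f_given_e = {
    "the house is small": 0.20,
    "the home is little": 0.25,
    "small is the house": 0.22,
}

# Toy language-model scores P(E): how fluent each English candidate is.
p_e = {
    "the house is small": 0.30,
    "the home is little": 0.10,
    "small is the house": 0.02,
}

def decode(candidates):
    """Pick E maximizing P(F|E) * P(E): the fundamental equation of SMT."""
    return max(candidates, key=lambda e: p_f_given_e[e] * p_e[e])

best = decode(p_f_given_e)
print(best)  # "the house is small": 0.20 * 0.30 beats the other products
```

Note how the language model rescues the output: "the home is little" has the highest translation score, but its low fluency score P(E) pushes the product below the more natural candidate.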
Basic SMT Architecture
Bayes' Rule
P(A,B) = P(B) · P(A|B)
P(A,B) = P(A) · P(B|A)
P(B) · P(A|B) = P(A) · P(B|A)
P(A|B) = P(A) · P(B|A) / P(B)
The probability of A given B equals the probability of A, multiplied by the probability of B given
A, divided by the probability of B.
Thomas Bayes
1701 – 1761
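The derivation above can be checked numerically. The three input probabilities here are arbitrary illustrative numbers, not data from the slides.

```python
# Arbitrary illustrative probabilities (not from any real model).
p_a = 0.3          # P(A)
p_b_given_a = 0.5  # P(B|A)
p_b = 0.4          # P(B)

# Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)
p_a_given_b = p_a * p_b_given_a / p_b

# Both factorizations of the joint probability P(A,B) must agree.
joint_1 = p_b * p_a_given_b   # P(B) * P(A|B)
joint_2 = p_a * p_b_given_a   # P(A) * P(B|A)
print(p_a_given_b, joint_1, joint_2)  # 0.375 0.15 0.15
```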
Probabilistic Language Modeling
Goal: compute the probability of a sentence or sequence of words:
P(W) = P(w1,w2,w3,w4,w5…wn)
• Related task: probability of an upcoming word:
P(w5|w1,w2,w3,w4)
• A model that computes either of these:
P(W) or P(wn|w1,w2…wn-1) is called a language model.
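A language model in this sense is just a function that scores word sequences. A minimal sketch with a hand-built bigram table (the conditional probabilities are invented for illustration) shows both tasks: predicting an upcoming word and scoring a whole sentence.

```python
# Minimal bigram language model; "<s>" marks the sentence start.
# The conditional probabilities below are invented for illustration.
bigram = {
    ("<s>", "its"): 0.5,
    ("its", "water"): 0.4,
    ("water", "is"): 0.6,
    ("is", "so"): 0.2,
    ("so", "transparent"): 0.1,
}

def p_next(history, word):
    """P(word | history), approximated here by the last word only."""
    return bigram.get((history[-1], word), 0.0)

def p_sentence(words):
    """P(W) via the chain rule, under the bigram approximation."""
    p, history = 1.0, ["<s>"]
    for w in words:
        p *= p_next(history, w)
        history.append(w)
    return p

print(p_sentence(["its", "water", "is", "so", "transparent"]))
# 0.5 * 0.4 * 0.6 * 0.2 * 0.1 = 0.0024
```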
How to compute P(W)
• How to compute this joint probability:
• P(its, water, is, so, transparent, that)
• Intuition: let’s rely on the Chain Rule of Probability
The Chain Rule
• Recall the definition of conditional probabilities
More variables:
P(A,B,C,D) = P(A)P(B|A)P(C|A,B)P(D|A,B,C)
• The Chain Rule in General
P(x1,x2,x3,…,xn) = P(x1)P(x2|x1)P(x3|x1,x2)…P(xn|x1,…,xn-1)
The Chain Rule applied to compute joint probability
of words in sentence
P(“its water is so transparent”) =
P(its) × P(water|its) × P(is|its water)
× P(so|its water is) × P(transparent|its water is so)
P(w1,w2,…,wn) = ∏i P(wi | w1,w2,…,wi-1)
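The full chain-rule decomposition can be computed directly by counting prefixes in a corpus. The three-sentence toy corpus below is invented for illustration; with full histories this maximum-likelihood estimate only works on tiny data, which is why real language models back off to short n-gram histories.

```python
# Chain-rule decomposition with full histories, estimated by counting
# prefix occurrences in a tiny toy corpus (invented for illustration).
corpus = [
    "its water is so transparent".split(),
    "its water is so clear".split(),
    "its color is nice".split(),
]

def p_cond(word, history):
    """MLE estimate of P(word | history) from full-prefix counts."""
    prefix_count = sum(1 for s in corpus if s[:len(history)] == history)
    full_count = sum(1 for s in corpus if s[:len(history) + 1] == history + [word])
    return full_count / prefix_count if prefix_count else 0.0

def p_sentence(words):
    """P(w1,...,wn) = product over i of P(wi | w1,...,wi-1), as on the slide."""
    p = 1.0
    for i, w in enumerate(words):
        p *= p_cond(w, words[:i])
    return p

# P(its)=3/3, P(water|its)=2/3, P(is|its water)=1, P(so|...)=1,
# P(transparent|its water is so)=1/2, so the product is 1/3.
print(p_sentence("its water is so transparent".split()))
```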