4. Machine Learning: French to English Translation
data
model
Méditerranée: 3200 personnes secourues en cinq jours
Mediterranean: 3200 people rescued in five days
6. MT: Translation Model
Source    Target         Good translation?
bonjour   hello          ✔
bonjour   blue           ✗
bonjour   morning        ~
bonjour   good morning   ✔
bonjour   hi             ✔
7. MT: Language Model
Target                                               Good language?
Be the change that you wish to see in the world.     ✔
Be the world that you wish to see in the change.     ✗
The be change which you wish to see on the world.    ✗✗
Be that the you world to in wish see change . the    ✗✗✗✗✗
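The two judgments illustrated on these slides (is the target a good translation of the source, and is the target good language) are commonly combined in statistical MT through the noisy-channel formulation. As a sketch, writing s for the source sentence and t for a candidate target sentence:

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s)
        \;=\; \arg\max_{t} \frac{P(s \mid t)\,P(t)}{P(s)}
        \;=\; \arg\max_{t} P(s \mid t)\,P(t)
```

The denominator P(s) does not depend on t, so it can be dropped from the maximization. The translation model supplies P(s | t) and the language model supplies P(t).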
8. MT: Training
Statistical Analysis
Translation Model
Source     Target    Probability
la         the       80%
la         a         12%
la                   8%
capitale   capital   70%
capitale   death     30%
de         of        53%
de         from      47%
france     france    100%
est        is        75%
est        was       25%
paris      paris     100%
Language Model
Trigram                Probability
the death of           54%
the capital of         34%
a capital of           11%
capital of france      41%
capital from france    9%
of france is           45%
of the france          2%
france is paris        23%
france was paris       22%
Parallel corpus (French, English) → Statistical Analysis → Translation Model P(s|t)
Monolingual corpus (English) → Statistical Analysis → Language Model P(t)
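The Statistical Analysis step for the language model can be illustrated with relative-frequency counting over a monolingual corpus. This is a minimal sketch, not the method used to produce the figures on the slide: the three-sentence corpus `TOY_CORPUS` and the function name `train_trigram_lm` are hypothetical, and a real system would add smoothing for unseen trigrams.

```python
from collections import Counter

# Hypothetical toy monolingual corpus (illustration only, not from the slides).
TOY_CORPUS = [
    "the capital of france is paris",
    "the capital of spain is madrid",
    "paris is the capital of france",
]

def train_trigram_lm(sentences):
    """Estimate P(w3 | w1, w2) as count(w1 w2 w3) / count(w1 w2 followed by any word)."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 2):
            trigrams[tuple(words[i:i + 3])] += 1
            # Count the two-word context only where a third word follows it.
            contexts[tuple(words[i:i + 2])] += 1
    return {tri: n / contexts[tri[:2]] for tri, n in trigrams.items()}

lm = train_trigram_lm(TOY_CORPUS)
print(lm[("capital", "of", "france")])  # 2 of the 3 "capital of ..." trigrams
```

In this toy corpus "capital of" is followed by "france" twice and by "spain" once, so the estimate for "capital of france" is 2/3.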
9. MT: Decoding
Statistical Search
Translation                       Score
the capital of france is paris    94%
capital of france is paris        71%
a capital of france is paris      65%
...                               ...
a death from france was paris     3%
Translation Model and Language Model: same tables as in slide 8 (MT: Training)
Input
la capitale de la france est paris
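The decoding step can be sketched as a search over candidate translations, each scored by translation-model probability times language-model probability. The sketch below makes several assumptions that are not on the slide: the "est" rows are read as est → is / est → was, the 8% entry for "la" with no target is treated as dropping the word, unseen trigrams get a hypothetical constant `UNSEEN`, and exhaustive word-by-word enumeration stands in for a real search procedure. The resulting scores are raw products and are not expected to match the percentages shown in the Score column.

```python
from itertools import product
from math import prod

# Translation table from the slide; "" means the source word is dropped (assumption).
TM = {
    "la": {"the": 0.80, "a": 0.12, "": 0.08},
    "capitale": {"capital": 0.70, "death": 0.30},
    "de": {"of": 0.53, "from": 0.47},
    "france": {"france": 1.00},
    "est": {"is": 0.75, "was": 0.25},
    "paris": {"paris": 1.00},
}

# Trigram table from the slide.
LM = {
    ("the", "death", "of"): 0.54, ("the", "capital", "of"): 0.34,
    ("a", "capital", "of"): 0.11, ("capital", "of", "france"): 0.41,
    ("capital", "from", "france"): 0.09, ("of", "france", "is"): 0.45,
    ("of", "the", "france"): 0.02, ("france", "is", "paris"): 0.23,
    ("france", "was", "paris"): 0.22,
}

UNSEEN = 0.01  # hypothetical probability for trigrams missing from LM

def lm_score(words):
    """Product of trigram probabilities over the candidate sentence."""
    return prod(LM.get(tuple(words[i:i + 3]), UNSEEN) for i in range(len(words) - 2))

def decode(source, top_k=3):
    """Enumerate every word-by-word translation and rank by TM score * LM score."""
    options = [list(TM[word].items()) for word in source.split()]
    scored = []
    for choice in product(*options):
        words = [target for target, _ in choice if target]  # skip dropped words
        tm_score = prod(p for _, p in choice)
        scored.append((tm_score * lm_score(words), " ".join(words)))
    scored.sort(reverse=True)
    return scored[:top_k]

for score, sentence in decode("la capitale de la france est paris"):
    print(f"{score:.2e}  {sentence}")
```

Under these assumptions the top candidate is "the capital of france is paris", followed by "capital of france is paris" and "a capital of france is paris". Enumeration is only feasible here because the input yields 72 candidates; longer inputs need a pruned search.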
10. SMT Models
Adaptive Models
Neural Models
• Translation Model P(s|t)
• Language Model P(t)
• Distortion
• Alignment
• Phrase
• POS
• Syntactic Translation
• Syntactic Language
• Reordering
• Lexicalized Reordering
• Preordering
• Word Deletion
• Lexicalized Smoothing
• Capitalization
• Morphology
• Transliteration
• Semantic
• Informal Models
• Social Media Components
20. Social Media Translation
Source
• la2a hia katir fi lakhbar.
• ma 3ajbanish kida. Lazim t3'iyyer l3ounouane
• Enty habla ?
• Kalemni lama t3raf ezay tebatal teshtemni
• 3andy soda3 fi rassi... 5oshy namy badal chat. a7san lik Ah sa7
Generic MT
• La2a hia katir Fi lakhbar.
• Ma 3ajbanish kida. lazim T3 (iyyer L3ounouane
• enty habla?
• kalemni Lama T3RAF ezay tebatal teshtemni
• 3Andy soda3 Fi rassi ... 5oshy namy badal Chat. A7San lik Ah SA7
21. Social Media Translation
Source
• la2a hia katir fi lakhbar.
• ma 3ajbanish kida. Lazim t3'iyyer l3ounouane
• Enty habla ?
• Kalemni lama t3raf ezay tebatal teshtemni
• 3andy soda3 fi rassi... 5oshy namy badal chat. a7san lik Ah sa7
Social Media MT
• No, it is very much in the news.
• I don't like this. We must change the title
• Are you an idiot?
• Talk to me when you know how to stop insulting me
• I have a headache in my head. Go to sleep, instead of chat. It is better for you, Yes, sa7
24. Broadcast News Translation
Reçu mardi à Varsovie par Bronislaw Komorowski, Barack Obama a participé aux cérémonies marquant le vingt-cinquième anniversaire des premières élections démocratiques en Pologne.
Received Tuesday in Warsaw by Bronislaw Komorowski, Barack Obama has participated in ceremonies marking the twenty-fifth anniversary of the first democratic elections in Poland
✗