Machine Translator
What is a Machine Translator?
Automatic translation from one language to another
Koehn: "Translating between languages is a task for which even humans require special training."
Why Even Humans Require Special Training
Source sentence (Persian): "By being shown the 'bogeyman' of the Security Council, the people of Iran will not be turned toward the qibla" (a Persian idiom for being on one's deathbed).
Newsweek translation: * has said that if the Security Council appears like the creatures that frighten children, the people of Iran will not lie down toward the qibla of the world's Muslims.
Translation by the Spanish paper El País: * said that even if the Security Council shows the Iranians something frightening, the people of Iran still will not sleep facing Saudi Arabia.
Translation by the French paper L'Humanité: * said that whether Iranians lie down toward the center of Muslim beliefs depends on whether they are afraid of mythical creatures; this is an Iranian tale.
Other Similar Concepts
• computer-aided translation
• machine-aided human translation (MAHT)
History
Ideas related to machine translation were put forward as early as the 17th century
by the philosophers René Descartes and Gottfried Wilhelm Leibniz.
MT in Computer
Applications
• Dissemination: publication in other languages
• Communication: emails, chats
• Assimilation: understanding the content
Challenges
• Input
• Typology
• Lexical
• Other
Input: Ambiguity
I saw a man with a telescope (who has the telescope: the speaker or the man?)
Input: Complexity
General relativity includes a dynamical spacetime, so it is
difficult to see how to identify the conserved energy and
momentum. Noether's theorem allows these quantities to be
determined from a Lagrangian with translation invariance, but
general covariance makes translation invariance into
something of a gauge symmetry.
Input: Wrong Sentence
I try for getting best grades but I did not can achive it
Typology: Morphology
Typology: Syntax
order of verbs (V), subjects (S) and objects (O)
Typology: Argument structure and linking
Typology: Pronoun omission
Lexical: Ambiguity
Lexical: Grammar
Lexical: Lexical gap
Lexical: Idiom
Other Challenges
Human Translation Process
• Decoding the meaning of the source text
• Re-encoding this meaning in the target language.
Simplest Machine Translator
Apple → سیب
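The simplest translator above amounts to a dictionary lookup. A minimal sketch follows, assuming a hypothetical three-entry English-to-Persian dictionary (only the "apple" entry comes from the slide; the function name and the other entries are illustrative).

```python
# Toy bilingual dictionary; the entries are illustrative, not a real lexicon.
TOY_DICTIONARY = {
    "apple": "سیب",
    "water": "آب",
    "book": "کتاب",
}


def translate_word_for_word(sentence, dictionary=TOY_DICTIONARY):
    """Replace each known word by its dictionary entry; keep unknown words as-is."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


print(translate_word_for_word("Apple"))
```

Anything beyond isolated words (word order, agreement, ambiguity) is outside what such a lookup can handle, which is what the following approaches try to address.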
MT
• Human Translation With Machine Aid
• Machine Translation With Human Aid
• Fully Automated Translation
• Rule Based MT: Direct MT, Transfer MT, Interlingua
• Knowledge Based MT
• Principle Based MT
• Empirical Based MT
• Statistical MT: Word Based Translation, Phrase Based Translation, Hierarchical Phrase Based Translation
• Example Based MT
• Online Interactive MT
• Hybrid MT
• Neural MT
Rule-Based MT
Direct Translation
• dictionary has to cover all cross-lingual phenomena
• need to include contextual information in dictionary (long phrases)
• inflectional agreement, shifts in word order & structure
+direct translation systems include simplistic rules
Direct Translation Approach
• simplistic: only low-level pre/post-processing (tokenization, etc.)
• advanced: handle some specific phenomena
  • identification & handling of syntactic ambiguity
  • morphological processing/synthesis
  • word re-ordering rules
  • rules for prepositions
  • handling of compounds and idioms, ...
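As an illustration of the direct approach, the sketch below combines low-level pre-processing (tokenization), dictionary lookup and a single word re-ordering rule. The English-to-Spanish lexicon, the tags and all function names are assumptions made for this example; the rule only handles one adjective directly before a noun.

```python
import re

# Hypothetical toy English->Spanish lexicon with a coarse part-of-speech tag.
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "big": ("grande", "ADJ"),
    "car": ("coche", "NOUN"),
    "book": ("libro", "NOUN"),
}


def tokenize(text):
    """Low-level pre-processing: lowercase the text and keep the word tokens."""
    return re.findall(r"\w+", text.lower())


def reorder_adjectives(tagged):
    """Word re-ordering rule: English ADJ NOUN becomes Spanish NOUN ADJ."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


def direct_translate(text):
    """Look up every token (unknown words pass through), then apply the rule."""
    tagged = [LEXICON.get(token, (token, "UNK")) for token in tokenize(text)]
    return " ".join(word for word, _ in reorder_adjectives(tagged))


print(direct_translate("The red car"))   # el coche rojo
```

Every further phenomenon (gender, prepositions, idioms) would need another hand-written rule or a longer dictionary entry, which is the cost listed on the slide above.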
Is Direct Translation Feasible?
Transfer Based Translation
Motivation:
• complete analysis of source language sentences
• handle lexical & structural ambiguity in one formalism
Transfer Based Needed Information/Tools
• source language parser (morpho-syntactic analysis)
• transfer engine (e.g. unification based grammar)
• target language generator
• Morphological analysis. Surface forms of the input text are classified by
part of speech (e.g. noun, verb, etc.) and sub-category (number, gender, tense, etc.).
All of the possible "analyses" for each surface form are typically output at
this stage, along with the lemma of the word.
• Lexical categorisation. In any given text some of the words may have more than
one meaning, causing ambiguity in analysis. Lexical categorisation looks at the
context of a word to try to determine the correct meaning in the context of the
input. This can involve part-of-speech tagging and word sense disambiguation.
• Lexical transfer. This is basically dictionary translation; the source language
lemma (perhaps with sense information) is looked up in a bilingual dictionary and
the translation is chosen.
• Structural transfer. While the previous stages deal with words, this stage deals
with larger constituents, for example phrases and chunks. Typical features of this
stage include concordance of gender and number, and re-ordering of words or
phrases.
• Morphological generation. From the output of the structural transfer stage, the
target language surface forms are generated.
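The stages above can be sketched as a small pipeline. This is a toy illustration under strong assumptions: the English-to-Spanish tables, the feature names and the functions are all hypothetical, the input is a single noun phrase containing one noun, gender is ignored, and the lexical categorisation stage is skipped because the toy words are unambiguous.

```python
# Toy tables; all entries and feature names are illustrative assumptions.
ANALYSES = {  # surface form -> (lemma, part of speech, number)
    "the": ("the", "DET", None),
    "black": ("black", "ADJ", None),
    "cat": ("cat", "NOUN", "sg"),
    "cats": ("cat", "NOUN", "pl"),
}
BILINGUAL = {"the": "el", "black": "negro", "cat": "gato"}  # lemma -> lemma
GENERATION = {  # (target lemma, number) -> target surface form
    ("el", "sg"): "el", ("el", "pl"): "los",
    ("negro", "sg"): "negro", ("negro", "pl"): "negros",
    ("gato", "sg"): "gato", ("gato", "pl"): "gatos",
}


def morphological_analysis(sentence):
    """Map each surface form to its lemma, part of speech and number."""
    return [dict(zip(("lemma", "pos", "number"), ANALYSES[word]))
            for word in sentence.lower().split()]


def lexical_transfer(units):
    """Dictionary translation of each source lemma."""
    return [{**unit, "lemma": BILINGUAL[unit["lemma"]]} for unit in units]


def structural_transfer(units):
    """Number agreement within the phrase, then ADJ NOUN -> NOUN ADJ."""
    number = next(unit["number"] for unit in units if unit["pos"] == "NOUN")
    units = [{**unit, "number": number} for unit in units]
    i = 0
    while i < len(units) - 1:
        if units[i]["pos"] == "ADJ" and units[i + 1]["pos"] == "NOUN":
            units[i], units[i + 1] = units[i + 1], units[i]
            i += 2
        else:
            i += 1
    return units


def morphological_generation(units):
    """Produce target surface forms from lemma and number."""
    return " ".join(GENERATION[(unit["lemma"], unit["number"])] for unit in units)


def transfer_translate(sentence):
    analysed = morphological_analysis(sentence)
    return morphological_generation(structural_transfer(lexical_transfer(analysed)))


print(transfer_translate("the black cats"))   # los gatos negros
```

Unlike a plain dictionary lookup, the plural marking on "cats" is analysed once and then spread to the determiner and adjective during structural transfer.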
Transfer Based: Syntactic Transfer
What are the problems?
• lots of grammar engineering (writing rules ...)
• language-pair specific rules
• exponential ambiguity
• variation & preference
Interlingua-based Translation
[Figure: translation among Persian, English, Spanish and Japanese, pairwise versus through a shared interlingua to which a new language can be added]
Advantages & Disadvantages
Advantages:
• no language-pair specific transfer
• simple to add new languages (add new analysis/generation component)
Disadvantages:
• need to design an interlingua that covers all language phenomena
• need a semantic representation (and that's hard!)
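The first two points can be made concrete by counting components. The sketch below assumes one transfer component per directed language pair versus one analysis and one generation component per language; the function names are illustrative.

```python
def transfer_modules(n_languages):
    """Directed language pairs, each needing its own transfer component."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages):
    """One analysis and one generation component per language."""
    return 2 * n_languages


for n in (3, 4, 10):
    print(n, transfer_modules(n), interlingua_modules(n))
# 3 6 6
# 4 12 8
# 10 90 20
```

Adding an eleventh language would mean 20 new transfer components in the pairwise setting but only 2 new components with an interlingua.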
Statistical MT
(1) build a language model which allows us to estimate P(e)
(2) build a translation model which allows us to estimate P(f|e)
(3) search for the e that maximizes the product P(f|e) · P(e)
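The product in step (3) follows from Bayes' rule. As a short worked derivation, assuming the usual convention that f is the source sentence and e a candidate target sentence:

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        = \arg\max_{e} P(f \mid e)\, P(e)
```

The denominator P(f) can be dropped because it is the same for every candidate e, so only the translation model P(f|e) and the language model P(e) are needed.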
Language Modeling
Which N-Gram?
• The 1-gram (unigram) model is not very realistic
• The bigram model is more realistic
• More realistic still is the trigram model
Problem
• 50,000 English words
• 2.5 billion possible bigrams
• Many bigrams have zero count in the corpus but may still be needed in translations
• Remedy: linear interpolation
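A minimal sketch of linear interpolation for a bigram model, assuming maximum-likelihood counts from a tiny made-up corpus and an arbitrary weight lam = 0.7 (in practice the weight would be tuned on held-out data). The function names are illustrative.

```python
from collections import Counter


def train_counts(sentences):
    """Collect unigram and bigram counts from whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def interpolated_bigram_prob(prev, word, unigrams, bigrams, lam=0.7):
    """lam * P(word | prev) + (1 - lam) * P(word), both estimated from counts."""
    total = sum(unigrams.values())
    p_unigram = unigrams[word] / total if total else 0.0
    p_bigram = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bigram + (1 - lam) * p_unigram


unigrams, bigrams = train_counts(["the cat sat", "the dog sat", "a cat ran"])
# "dog ran" never occurs as a bigram, yet it still gets a non-zero probability.
print(interpolated_bigram_prob("dog", "ran", unigrams, bigrams))
```

The unseen bigram falls back on the unigram estimate of "ran", so a sentence containing it is not assigned probability zero.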
Translation Model
(i) a model of the sentence-aligned source–target training corpus
(ii) a method for computing the probability that S and T are
equivalent using that model
Translation Model Example
Example
Word Alignment
Simple Word Alignment
Expectation-Maximisation (EM) algorithm
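One common way to use EM for word alignment is IBM Model 1; the slides do not say which model they had in mind, so the sketch below is an assumption. It estimates word translation probabilities t(f|e) from sentence pairs, omits the NULL word for brevity, and uses a made-up three-sentence German-English corpus.

```python
from collections import defaultdict


def train_ibm_model1(sentence_pairs, iterations=10):
    """Estimate t(f | e) with EM.

    sentence_pairs: list of (source_words, target_words) tuples, where f
    ranges over source words and e over target words.
    """
    source_vocab = {f for f_words, _ in sentence_pairs for f in f_words}
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: 1.0 / len(source_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (f, e) links
        total = defaultdict(float)  # expected number of links involving e
        # E-step: share each source word among the target words of its sentence.
        for f_words, e_words in sentence_pairs:
            for f in f_words:
                norm = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    fraction = t[(f, e)] / norm
                    count[(f, e)] += fraction
                    total[e] += fraction
        # M-step: re-estimate t(f | e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


pairs = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
t = train_ibm_model1(pairs)
print(round(t[("das", "the")], 3), round(t[("haus", "the")], 3))
```

Even though no alignment is given, "das" co-occurs with "the" in two sentences, so its probability grows with each iteration while the competing candidates shrink.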
MT Evaluation
• How can we measure MT quality?
• How can we compare MT engines?
• How can we measure progress in MT development?
• Adequacy: Does the output convey the same meaning as the input sentence?
Is part of the message lost, added, or distorted?
• Fluency: Is the output good fluent English?
This involves both grammatical correctness and idiomatic word choices.
What do We Expect from MT?
• adequacy & informativeness (preserve meaning)
• fluency & grammaticality (translation needs to be natural)
• acceptance (for its task)
Task-specific evaluation
• browsing quality: Is the translation understandable in its
context?
• post-editing quality: How many edit operations are required
to turn it into a good translation?
• publishing quality: How many human interventions are
necessary to make the entire document ready for printing?
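For post-editing quality, one simple proxy for "edit operations" is the word-level edit distance between the MT output and its corrected version. The sketch below assumes whitespace tokenization and counts insertions, deletions and substitutions equally; the function name is illustrative.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    hyp, ref = hypothesis.split(), reference.split()
    previous = list(range(len(ref) + 1))  # distances from an empty hypothesis
    for i, h in enumerate(hyp, start=1):
        current = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            current.append(min(previous[j] + 1,          # delete h
                               current[j - 1] + 1,       # insert r
                               previous[j - 1] + cost))  # keep or substitute
        previous = current
    return previous[-1]


print(word_edit_distance("the cat sat on mat", "the cat sat on the mat"))   # 1
```

Dividing the distance by the length of the corrected sentence gives a rate that can be compared across sentences of different lengths.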
Evaluation is Difficult!
• What is the best translation? (language variation!)
• Subjective aspects (What is "fluent"? Clarity? Style?)
• What is "grammatical"?
• What is "adequate"? (Is it possible to be adequate?)
MT evaluation
Manual Evaluation
• ask actual users to rate translations
• statistics over user responses
• separate evaluations of adequacy &
fluency
• requires guidelines
• task-specific evaluation
Automatic Evaluation
• compare to reference translations
• approximations by measuring overlaps
• strong bias but useful for rapid
development
Fluency and Adequacy: Scales
Manual MT evaluation: What are the problems?
• need volunteers (every time we want to evaluate)
• expensive evaluation!
• subjective measures & disagreement between annotators
Automatic Evaluation: BLEU-score
• introduced in 2002 by Papineni et al.
• badly needed for rapid MT development
• quickly adopted by the statistical MT community
• created a boom in MT research/experiments
• many MT papers report only BLEU scores and don't even look at the
translations
BLEU-score
The closer a machine translation is to a professional human translation,
the better it is.
Definition
• p_n: computed for each pair of candidate and reference sentences.
• This score represents the proportion of n-word sequences (n-grams) in the
candidate translation which also occur in the reference translation.
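The n-gram precision p_n can be sketched as follows, assuming whitespace tokenization and a single reference. Counts are clipped as in the BLEU paper, so a repeated n-gram is credited at most as often as it appears in the reference. This covers p_n only; the full BLEU score additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(words, n):
    """All contiguous n-word sequences of a token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference (clipped)."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


reference = "the cat is on the mat"
print(ngram_precision("the cat sat on the mat", reference, 1))   # 5 of 6 unigrams
print(ngram_precision("the cat sat on the mat", reference, 2))   # 3 of 5 bigrams
```

Clipping matters for degenerate output: a candidate that repeats "the" seven times matches only two of its seven unigrams against this reference.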
References
• Koehn, Philipp. Statistical Machine Translation. Cambridge University Press, 2009.
• Arnold, D., et al. Machine Translation: An Introductory Guide. NCC Blackwell, 1994.
• https://www.slideshare.net/rushdishams/types-of-machine-translation
• https://en.wikipedia.org/wiki/Machine_translation
• Brown, Peter F., et al. "A statistical approach to machine translation." Computational Linguistics 16.2 (1990): 79-85.
• Hearne, Mary, and Andy Way. "Statistical machine translation: a guide for linguists and translators." Language and Linguistics Compass 5.5 (2011): 205-226.
Question?

More Related Content

What's hot

Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translationRushdi Shams
 
natural language processing help at myassignmenthelp.net
natural language processing  help at myassignmenthelp.netnatural language processing  help at myassignmenthelp.net
natural language processing help at myassignmenthelp.net
www.myassignmenthelp.net
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
saurabhnarhe
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
Jaganadh Gopinadhan
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
Stephen Peacock
 
Machine Translation Introduction
Machine Translation IntroductionMachine Translation Introduction
Machine Translation Introduction
nlab_utokyo
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
Robert Lujo
 
Techniques for translation
Techniques for translationTechniques for translation
Techniques for translation
Randy Morales
 
Nlp
NlpNlp
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Toine Bogers
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
VenkateshMurugadas
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
Marcis Pinnis
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Varunjeet Singh Rekhi
 
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Edureka!
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
VeenaSKumar2
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Abash shah
 
Machine translation from English to Hindi
Machine translation from English to HindiMachine translation from English to Hindi
Machine translation from English to Hindi
Rajat Jain
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
Seth Grimes
 

What's hot (20)

Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translation
 
natural language processing help at myassignmenthelp.net
natural language processing  help at myassignmenthelp.netnatural language processing  help at myassignmenthelp.net
natural language processing help at myassignmenthelp.net
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
 
Machine Translation Introduction
Machine Translation IntroductionMachine Translation Introduction
Machine Translation Introduction
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Techniques for translation
Techniques for translationTechniques for translation
Techniques for translation
 
Nlp
NlpNlp
Nlp
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Machine translation from English to Hindi
Machine translation from English to HindiMachine translation from English to Hindi
Machine translation from English to Hindi
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
 

Similar to Machine translator Introduction

Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsParisa Niksefat
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Abdullah al Mamun
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Saurabh Kaushik
 
Machine Translation System: Chhattisgarhi to Hindi
Machine Translation System: Chhattisgarhi to HindiMachine Translation System: Chhattisgarhi to Hindi
Machine Translation System: Chhattisgarhi to Hindi
Padma Metta
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
IOSR Journals
 
1 Introduction.ppt
1 Introduction.ppt1 Introduction.ppt
1 Introduction.ppt
tanishamahajan11
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShashank Shisodia
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
PriyadharshiniG41
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
PriyadharshiniG41
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
Kalyanee Baruah
 
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
Antonio Toral
 
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
Lifeng (Aaron) Han
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
Alia Hamwi
 
E-Translation
E-TranslationE-Translation
E-Translation
Marcielyne Razalan
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
RajkiranVeluri
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
Kuppusamy P
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptx
SHIBDASDUTTA
 
REPORT.doc
REPORT.docREPORT.doc
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLifeng (Aaron) Han
 

Similar to Machine translator Introduction (20)

SMT3
SMT3SMT3
SMT3
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 
Machine Translation System: Chhattisgarhi to Hindi
Machine Translation System: Chhattisgarhi to HindiMachine Translation System: Chhattisgarhi to Hindi
Machine Translation System: Chhattisgarhi to Hindi
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
1 Introduction.ppt
1 Introduction.ppt1 Introduction.ppt
1 Introduction.ppt
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliterator
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
 
Natural Language Processing.pptx
Natural Language Processing.pptxNatural Language Processing.pptx
Natural Language Processing.pptx
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
 
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Mach...
 
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
E-Translation
E-TranslationE-Translation
E-Translation
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Natural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptxNatural Language Processing (NLP).pptx
Natural Language Processing (NLP).pptx
 
REPORT.doc
REPORT.docREPORT.doc
REPORT.doc
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metric
 

Recently uploaded

678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
CarlosHernanMontoyab2
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 

Recently uploaded (20)

678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 

Machine translator Introduction

  • 2. What is Machine Translator? Automatic translation from one language to another Koehn: „Translating between languages is a task for which even humans require special training.“
  • 3. Why Even Humans Require Special Training ‫دادن‬ ‫نشان‬ ‫با‬«‫لولو‬»‫نمی‬ ‫قبله‬ ‫به‬ ‫رو‬ ‫ایران‬ ‫مردم‬ ،‫امنیت‬ ‫شورای‬ ‫ی‬‫ش‬‫وند‬ ‫ترجمه‬‫نیوزویك‬:*‫گفته‬‫است‬‫كه‬‫اگر‬‫شورای‬‫امنیت‬‫مثل‬‫موجوداتی‬‫كه‬‫بچه‬‫ها‬‫را‬ ‫می‬‫ترسانند‬‫ظاهر‬،‫شود‬‫مردم‬‫ایران‬‫به‬‫سوی‬‫قبله‬‫مسلمانان‬‫جهان‬‫دراز‬‫نمی‬‫كشند‬. ‫ترجمه‬‫نشریه‬‫اسپانیایی‬‫ال‬‫پائیس‬:*‫گفت‬‫كه‬‫اگر‬‫شورای‬‫امنیت‬‫چیز‬‫ترس‬‫ناكی‬‫را‬‫هم‬ ‫به‬‫ایرانیان‬‫نشان‬،‫دهد‬‫باز‬‫هم‬‫مردم‬‫ایران‬‫به‬‫سوی‬‫عربستان‬‫سعودی‬‫نمی‬‫خوابن‬‫د‬. ‫ترجمه‬‫نشریه‬‫فرانسوی‬‫اومانیته‬:*‫گفت‬‫كه‬‫دراز‬‫كشیدن‬‫ایرانیان‬‫به‬‫سوی‬‫مركز‬ ‫اعتقادات‬‫مسلمانان‬‫بستگی‬‫به‬‫این‬‫دارد‬‫كه‬‫آنها‬‫از‬‫موجودات‬‫افسانه‬‫ای‬‫بترسن‬،‫د‬‫این‬ ‫یك‬‫داستان‬‫ایرانی‬‫است‬.
  • 4. Other Similar Concepts • computer-aided translation • machine-aided human translation (MAHT)
  • 5. History: ideas related to machine translation appear as early as the 17th century, in the work of philosophers René Descartes and Gottfried Wilhelm Leibniz
  • 10. Input: Ambiguity. "I saw a man with a telescope" (did I use the telescope, or did the man have it?)
  • 11. Input: Complexity. "General relativity includes a dynamical spacetime, so it is difficult to see how to identify the conserved energy and momentum. Noether's theorem allows these quantities to be determined from a Lagrangian with translation invariance, but general covariance makes translation invariance into something of a gauge symmetry."
  • 12. Input: Wrong Sentence. "I try for getting best grades but I did not can achive it" (a deliberately ill-formed input)
  • 14. Typology: Syntax. The order of verbs (V), subjects (S) and objects (O)
  • 23. Human Translation Process • Decoding the meaning of the source text • Re-encoding this meaning in the target language.
  • 25. MT: Human Translation With Machine Aid; Machine Translation With Human Aid; Fully Automated Translation. Approaches: Rule Based MT (Direct MT, Transfer MT, Interlingua); Knowledge Based MT; Principle Based MT; Empirical Based MT (Statistical MT: Word Based Translation, Phrase Based Translation, Hierarchical Phrase Based Translation; Example Based MT); Online Interactive MT; Hybrid MT; Neural MT
  • 27. Direct Translation • dictionary has to cover all cross-lingual phenomena • need to include contextual information in dictionary (long phrases) • inflectional agreement, shifts in word order & structure → direct translation systems include simplistic rules
  • 28. Direct Translation Approach • simplistic: only low-level pre/post-processing (tokenization, etc.) • advanced: handle some specific phenomena: identification & handling of syntactic ambiguity; morphological processing/synthesis; word re-ordering rules; rules for prepositions; handling of compounds and idioms, ...
  • 30. Transfer Based Translation Motivation: • complete analysis of source language sentences • handle lexical & structural ambiguity in one formalism
  • 31. Transfer Based Needed Information/Tools • source language parser (morpho-syntactic analysis) • transfer engine (e.g. unification based grammar) • target language generator
  • 33. • Morphological analysis. Surface forms of the input text are classified by part-of-speech (e.g. noun, verb, etc.) and sub-category (number, gender, tense, etc.). All of the possible "analyses" for each surface form are typically output at this stage, along with the lemma of the word. • Lexical categorisation. In any given text some of the words may have more than one meaning, causing ambiguity in analysis. Lexical categorisation looks at the context of a word to try to determine the correct meaning in the context of the input. This can involve part-of-speech tagging and word sense disambiguation. • Lexical transfer. This is basically dictionary translation; the source language lemma (perhaps with sense information) is looked up in a bilingual dictionary and the translation is chosen. • Structural transfer. While the previous stages deal with words, this stage deals with larger constituents, for example phrases and chunks. Typical features of this stage include concordance of gender and number, and re-ordering of words or phrases. • Morphological generation. From the output of the structural transfer stage, the target language surface forms are generated.
  • 35. What are the problems? • lots of grammar engineering (writing rules ...) • language-pair specific rules • exponential ambiguity • variation & preference
  • 41. Advantages & Disadvantages • no language-pair specific transfer • simple to add new languages (add new analysis/generation component) • need to design interlingua that covers all language phenomena • need semantic representation (and that’s hard!)
  • 47. Statistical MT: (1) build a language model which allows us to estimate P(e); (2) build a translation model which allows us to estimate P(f|e); (3) search for the e maximizing the product P(f|e) · P(e)
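The three steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probability tables hold invented toy values, and `best_translation` is a hypothetical helper name, not part of any library. A real system would estimate P(e) and P(f|e) from corpora and search a far larger candidate space.

```python
# Toy language model P(e): invented values, for illustration only.
LANGUAGE_MODEL = {
    "the house": 0.6,
    "house the": 0.1,
    "the home": 0.3,
}

# Toy translation model P(f | e), keyed by (f, e): also invented values.
TRANSLATION_MODEL = {
    ("la maison", "the house"): 0.5,
    ("la maison", "house the"): 0.5,
    ("la maison", "the home"): 0.2,
}


def best_translation(f, candidates):
    """Return the candidate e that maximizes P(f | e) * P(e)."""
    def score(e):
        return TRANSLATION_MODEL.get((f, e), 0.0) * LANGUAGE_MODEL.get(e, 0.0)
    return max(candidates, key=score)


print(best_translation("la maison", list(LANGUAGE_MODEL)))
```

In this toy setup the translation model alone cannot separate "the house" from "house the"; the language model's preference for fluent word order decides between them.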
  • 49. Which N-Gram? • A 1-gram model is not very realistic • More realistic still is the trigram model. Problem: with 50,000 English words there are 2.5 billion possible bigrams; many bigrams have zero count in the corpus but may still be needed in translations
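The zero-count problem can be made concrete with a small bigram model. The sketch below uses add-one (Laplace) smoothing, which the slide does not mention; it is assumed here only as one simple, well-known remedy, and the corpus and function names are made up for the example.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Add-one estimate of P(word | prev); unseen bigrams get a small non-zero value."""
    vocabulary_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocabulary_size)


corpus = ["the house is green", "the house is small"]  # toy corpus (assumption)
unigrams, bigrams = train_bigram_model(corpus)
seen = bigram_probability("the", "house", unigrams, bigrams)
unseen = bigram_probability("green", "house", unigrams, bigrams)
print(seen, unseen)
```

Without smoothing, the unseen bigram "green house" would receive probability zero, and any candidate translation containing it would be ruled out entirely.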
  • 51. Translation Model (i) a model of the sentence-aligned source–target training corpus (ii) a method for computing the probability that S and T are equivalent using that model
  • 64. MT Evaluation • How can we measure MT quality? • How can we compare MT engines? • How can we measure progress in MT development?
  • 65. • Adequacy: Does the output convey the same meaning as the input sentence? Is part of the message lost, added, or distorted? • Fluency: Is the output good fluent English? This involves both grammatical correctness and idiomatic word choices.
  • 66. What do We Expect from MT? • adequacy & informativeness (preserve meaning) • fluency & grammaticality (translation needs to be natural) • acceptance (for its task)
  • 67. Task-specific evaluation • browsing quality: Is the translation understandable in its context? • post-editing quality: How many edit operations are required to turn it into a good translation? • publishing quality: How many human interventions are necessary to make the entire document ready for printing?
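One way to approximate the post-editing question is to count word-level edit operations between the MT output and a corrected version. The sketch below uses the standard Levenshtein distance over words; treating insertions, deletions and substitutions as equally costly is an assumption made for this example, and `word_edit_distance` is a hypothetical name.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    hyp, ref = hypothesis.split(), reference.split()
    # previous[j] is the distance between the hypothesis words seen so far and ref[:j].
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            substitution = previous[j - 1] + (hyp_word != ref_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


print(word_edit_distance("the cat", "the black cat"))
```

A lower count suggests less post-editing effort, although real post-editing also involves moving whole phrases, which this simple measure does not model.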
  • 68. Evaluation is Difficult! • What is the best translation? (language variation!) • Subjective aspects (What is “fluent”? Clarity? Style?) • What is “grammatical”? • What is “adequate”? (Is it possible to be adequate?)
  • 69. MT evaluation. Manual Evaluation: • ask actual users to rate translations • statistics over user responses • separate evaluations of adequacy & fluency • requires guidelines • task-specific evaluation. Automatic Evaluation: • compare to reference translations • approximations by measuring overlaps • strong bias but useful for rapid development
  • 72. Manual MT evaluation: What are the problems? • need volunteers (every time we want to evaluate) • expensive evaluation! • subjective measures & disagreement between annotators
  • 77. Automatic Evaluation: BLEU-score • introduced in 2002 by Papineni et al. • desperately needed for rapid MT development • quickly adopted by the statistical MT community • created a boom in MT research/experiments • Many MT papers report only BLEU scores and don’t even look at the translations
  • 78. BLEU-score: the closer a machine translation is to a professional human translation, the better it is
  • 79. Definition • Pn: computed for each pair of candidate and reference sentences. • This score represents the proportion of n-word sequences (n-grams) in the candidate translation which also occur in the reference translation.
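The definition of Pn above can be sketched directly. The code computes only the n-gram precision against a single reference, with counts clipped as in the modified precision of Papineni et al.; it is not a full BLEU implementation, which also combines several values of n and applies a brevity penalty. The function names are made up for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-word sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Proportion of candidate n-grams that also occur in the reference.

    Counts are clipped: a candidate n-gram is credited at most as many
    times as it appears in the reference.
    """
    candidate_counts = Counter(ngrams(candidate.split(), n))
    reference_counts = Counter(ngrams(reference.split(), n))
    total = sum(candidate_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return matched / total


reference = "the cat is on the mat"
print(ngram_precision("the cat sat on the mat", reference, 1))
print(ngram_precision("the cat sat on the mat", reference, 2))
```

Clipping matters for degenerate output: a candidate that repeats "the" seven times is credited for only the two occurrences found in the reference.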
  • 81. • Koehn, Philipp. Statistical Machine Translation. Cambridge University Press, 2009. • Arnold, D., et al. Machine Translation: An Introductory Guide. NCC Blackwell, 1994. • https://www.slideshare.net/rushdishams/types-of-machine-translation • https://en.wikipedia.org/wiki/Machine_translation • Brown, Peter F., et al. "A statistical approach to machine translation." Computational Linguistics 16.2 (1990): 79-85. • Hearne, Mary, and Andy Way. "Statistical machine translation: a guide for linguists and translators." Language and Linguistics Compass 5.5 (2011): 205-226.

Editor's Notes

  1. Why hard
  2. Not enough
  3. More talk
  4. Represent grammar, parser based
  5. run, runs, ran and running are forms of the same lexeme, with run as the lemma
  6. Hardness design
  7. aut
  8. It can't be done
  9. What we do?
  10. For example, take the word pair green and maison. This word pair is linked in a single word alignment in the dataset (the first one), and this word alignment has probability 1/2 according to the E-step. House and maison: this word pair is linked in two different word alignments in the dataset, each of which has probability 1/2. At this point we still have frequencies rather than probabilities, so we add up the frequencies of the alternative translations for each target word and then divide each alternative translation frequency by the total.
  11. Quality
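The E-step and normalization described in note 10 can be sketched as a small EM loop in the style of IBM Model 1. Several things here are assumptions, not taken from the slides: the two-sentence corpus (in particular the word "verte" and the second sentence pair), the direction of the model t(f | e), the omission of a NULL word, and the function name.

```python
from collections import defaultdict

# Assumed toy corpus of (English, French) sentence pairs.
CORPUS = [
    ("green house", "maison verte"),
    ("the house", "la maison"),
]


def train_word_translation(corpus, iterations):
    """Estimate word translation probabilities t(f | e) with EM."""
    pairs = [(e.split(), f.split()) for e, f in corpus]
    # Start from uniform scores for every co-occurring word pair.
    t = {(f, e): 1.0 for es, fs in pairs for e in es for f in fs}
    for _ in range(iterations):
        counts = defaultdict(float)  # fractional counts for (f, e)
        totals = defaultdict(float)  # fractional counts summed per e
        # E-step: share each French word's count among the English words.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    share = t[(f, e)] / norm
                    counts[(f, e)] += share
                    totals[e] += share
        # M-step: divide each frequency by the total to get probabilities.
        t = {(f, e): counts[(f, e)] / totals[e] for (f, e) in counts}
    return t


first = train_word_translation(CORPUS, 1)
later = train_word_translation(CORPUS, 10)
print(first[("maison", "house")], later[("maison", "house")])
```

After the first iteration, house and maison have collected two half-counts while green and maison have collected one, matching the note; further iterations shift more probability toward the house/maison pairing.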