HUL 455
Evaluation of Hindi-English MT systems: Challenges and Solutions
A Presentation by:
Sajeed Mahaboob
2011ME1111
MACHINE TRANSLATION
Translation can be defined as the act or process of translating,
especially from one language into another.
MT investigates the use of computer software to translate text
or speech from one language (SL) to another language (TL).
It is an automated system.
It analyzes text in the Source Language (SL), processes it, and
produces “equivalent” text in the Target Language (TL).
It should work without human intervention.
MT systems are supposed to break the language barrier.
METHODS AND STRATEGIES
Direct Method
Transfer Method
Interlingual Method
DIRECT METHOD
The majority of MT systems of the 1950s and
1960s were based on this approach.
Each system is designed in every detail for one
particular pair of languages.
Translation relies on word-by-word matching between the SL and the TL.
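The word-by-word idea can be illustrated with a minimal sketch. The three-entry lexicon and the function name `direct_translate` are hypothetical and chosen only for illustration; a real direct system would use a large bilingual dictionary and language-pair-specific rules.

```python
# Toy bilingual lexicon (hypothetical entries, for illustration only).
TOY_LEXICON = {
    "ram": "राम",
    "eats": "खाता है",
    "mango": "आम",
}


def direct_translate(sentence: str, lexicon: dict[str, str]) -> str:
    """Translate word by word, keeping the source-language word order."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in words)


print(direct_translate("Ram eats mango", TOY_LEXICON))
```

Because the source word order is kept, the English subject-verb-object order carries over into the Hindi output, which normally places the verb last. This is the kind of limitation that structure-aware methods try to address.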
TRANSFER METHOD
Works in two stages, using underlying representations
for both SL and TL texts.
The first stage converts SL texts into SL
‘transfer’ representations.
The second stage converts these into TL
‘transfer’ representations.
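A minimal sketch of the two stages, assuming a three-word subject-verb-object English sentence and a toy lexicon. All names here (`analyze`, `transfer`, `generate`, `TOY_LEXICON`) are hypothetical, and the final generation step is added only to turn the TL representation back into a string.

```python
# Toy bilingual lexicon (hypothetical entries, for illustration only).
TOY_LEXICON = {"ram": "राम", "eats": "खाता है", "mango": "आम"}


def analyze(sentence: str) -> dict[str, str]:
    """Stage 1: turn a three-word English SVO sentence into an SL representation.

    Raises ValueError if the sentence does not have exactly three words.
    """
    subject, verb, obj = sentence.lower().split()
    return {"subject": subject, "verb": verb, "object": obj}


def transfer(sl_repr: dict[str, str], lexicon: dict[str, str]) -> dict[str, str]:
    """Stage 2: map the SL representation onto a TL representation."""
    return {role: lexicon.get(word, word) for role, word in sl_repr.items()}


def generate(tl_repr: dict[str, str]) -> str:
    """Linearize the TL representation in subject-object-verb order."""
    return " ".join([tl_repr["subject"], tl_repr["object"], tl_repr["verb"]])


print(generate(transfer(analyze("Ram eats mango"), TOY_LEXICON)))
```

Because the roles are kept separate from the word order, the generator can place the verb last, as Hindi normally does.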
INTERLINGUAL METHOD
Converts SL texts into semantico-syntactic
representations common to more than one
language.
From such ‘interlingual’ representations,
texts are then generated in other
languages.
MT IN INDIA: WHY DO WE NEED IT?
India is a multilingual country where the spoken language is said to change every 50
miles.
22 official languages and approximately 2000 dialects are spoken.
State governments carry out their official work in their respective regional
languages.
Translating documents manually is very time-consuming and costly.
ENGLISH-HINDI MT SYSTEMS
MANTRA MT (1997)
Developed for information preservation. The text available in one Indian
language is made accessible in another Indian language with the help of
this system.
It uses an XTAG-based super tagger and a light dependency analyzer to
analyze the input English text. The system produces
several outputs for a given input.
MANTRA MT (1999)
It translates English text into Hindi in the specific domain of personnel
administration, which includes gazette notifications, office orders, office
memorandums and circulars.
It uses the Tree Adjoining Grammar (TAG) formalism to represent the
English and Hindi grammars.
It uses tree transfer for translating from English to Hindi.
The system was tested on the translation of administrative documents such
as appointment letters, notifications and circulars issued by the central
government, from English to Hindi.
English–Hindi Translation System
A system based on the transfer-based translation approach, which uses
the grammatical rules of the source and target languages and a
bilingual dictionary for translation.
The translation module consists of pre-processing, English tree
generation, post-processing of the English tree, generation of the Hindi tree,
post-processing of the Hindi tree, and output generation.
The domain of the system was weather narration.
EVALUATION OF ENGLISH-HINDI MT SYSTEMS
Low accuracy, fluency and acceptability in the output of a machine translation
system adversely affect the reliability and usage of that system. Evaluation
can ascertain how and in what ways the results of these systems are
lacking.
Evaluation is one of the most important parts of the development of MT systems,
and one cannot claim an MT system's success without it.
The need for evaluating an MT system therefore always has high
priority.
Here, we evaluate the output for the Hindi-English language pair from
two MT systems: Bing and Google.
Google MT/Translator is based on statistical and machine learning
approaches that use parallel corpora. It runs for 73 language pairs.
Bing (Microsoft) MT is also based on statistical and machine learning
approaches that use parallel corpora. It also uses language-specific
rule-based components to decode and encode sentences from one language
to another, an approach described as
“linguistically informed statistical machine translation”. Bing MT
runs for 44 language pairs.
EVALUATION STRATEGIES
Evaluation strategies are mainly divided into two categories: (a) automatic
evaluation and (b) manual or human evaluation.
Automatic evaluation of an MT system is very difficult and is not as effective
as human evaluation. Several established automatic measures are
frequently used, for example BLEU, mWER, mPER and NIST.
Human evaluation is time-consuming and costly, but
it is the best strategy for improving an MT system's accuracy.
Often more than one correct translation of a sentence exists.
In such cases a human translator-cum-evaluator can judge the output
correctly.
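To give a sense of how an automatic measure works, here is a minimal sketch of plain word error rate (WER), the single-reference measure that mWER extends to multiple references. The function name, the whitespace tokenization and the choice of reference are assumptions made for illustration. The example strings borrow from the वह जाती है example used later in these slides, with final punctuation left out.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference,
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("She goes", "She goes by"))
print(word_error_rate("She goes", "He is"))
```

A score of 0 means the hypothesis matches the reference word for word. Because a valid alternative translation is penalized just like a wrong one, a single-reference measure of this kind cannot replace the human judgement described above.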
CHALLENGES DURING EVALUATION
Sentences from the health and cuisine domains of the ILCI3 corpora are used
for evaluating the MT systems.
These sentences are entered in each of the systems in bulk and the output is
crawled, and discrepancies are marked.
In the resulting English output, several problems are noted particularly with
respect to gender agreement, structural mapping, Named Entity Recognition
(NER) and plural marker morphemes.
During the evaluation process the following kinds of
challenges are encountered.
1. Tokenization
2. Morph Issue
3. Structural/grammatical Differences
4. Errors with Gender agreement
5. Parser Issues
TOKENIZATION
(i) With/without punctuation (BO = Bing output, GO = Google output):
(a) वह जाती है।
She goes by. (BO)
He is. (GO)
(b) वह जाती है
He is (BO)
He is (GO)
Manual Translation: She goes.
Examples (a) and (b) show how the presence of a punctuation mark can significantly
affect translation. This variation in results is seen only in Bing; Google's output is consistent.
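One way to control for this effect when preparing test input is to normalize sentence-final punctuation before submitting sentences. The sketch below is only an assumption about how that could be done; the function name `strip_final_full_stop` is hypothetical, and it removes only a trailing danda (।) or full stop, leaving question marks and sentence-internal dots alone.

```python
DANDA = "\u0964"  # Devanagari danda, the Hindi full stop
FULL_STOPS = DANDA + "."


def strip_final_full_stop(sentence: str) -> str:
    """Remove a trailing danda or full stop, plus surrounding whitespace."""
    return sentence.strip().rstrip(FULL_STOPS).rstrip()


for text in ["वह जाती है।", "वह जाती है", "वी. आइ. पी. क्या है?"]:
    print(repr(strip_final_full_stop(text)))
```

With this normalization, inputs (a) and (b) become the same string, so any remaining difference in output cannot be attributed to the final punctuation mark.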
TRANSLITERATION ISSUE:
एक नौन-स्टिक तवा गरम करें
A naun-stick frying pan and heat (BO)
A Non - stick frying pan and heat (GO)
Manual Translation: Heat a non-stick frying pan.
MORPH ISSUE
(i) Unknown words:
छुआरे डालकर मिलाएं और
एक मिनट पकाएँ
One minute into the match and put chuare (BO)
Mix and cook one minute, add Cuare (GO)
Manual Translation: Add dried dates, stir and cook for a minute.
(ii) Error with paradigm fixation:
कॅन्सर 1000 से अधिक बीमारियों
का एक समूह है
Cancer is a group of more than 1000 berryman (BO)
Cancer is a group of more than 1000 illnesses (GO)
कॅन्सर 1000 से अधिक बीमारी
का एक समूह है
Cancer is a group of more than 1,000 diseases (BO)
Cancer is a group of more than 1000 illnesses (GO)
Manual Translation: Cancer is a group of more than 1000 diseases.
STRUCTURAL/GRAMMATICAL DIFFERENCES
वी. आइ. पी. क्या है?
What is the VIP? (BO)
VIP what is it? (GO)
Manual Translation: What is the VIP?
ERRORS WITH GENDER AGREEMENT
वह जाती है।
She goes by. (BO)
He is. (GO)
Manual Translation: She goes.
PARSER ISSUES
आँख की मांसपेशियों की कमजोरी के कारण लेंस अपना आकार नहीं बदल पाता पढ़ते या नजदीकी काम
करते समय प्रकाश की किरणे रेटिना के पीछे पड़ती है यह 40 वर्ष और उससे ऊपर की उम्र में पाई जाती
है
Due to the weakness of the muscles of the eye lens cannot read or
change their size does proximity to work while the light rays have
it 40 years behind the retina and above in age (BO)
NO OUTPUT (GO)
A human evaluation strategy was adopted to evaluate the Bing
(Microsoft) and Google MT (Hindi-English) output.
Methodology of MT testing:
For testing the MT systems, 1,000 sentences were used. Their outputs were
then distributed among three human evaluators, who rated the MT
outputs for comprehensibility and fluency.
Instructions for evaluators:
Read the translated target-language output first.
Judge each sentence for its comprehensibility.
Rate it on the scale 0 to 4.
Read the original source sentence only to verify the faithfulness of the translation (only for reference).
Do not read the source-language sentence first.
If the rating needs revision, change it to the new rating.
Evaluation guidelines (5-point scale, 0-4):
Each output sentence is given the following scores:
(A) For comprehensibility
4 = all meaning
3 = most meaning
2 = much meaning
1 = little meaning
0 = none
(B) For fluency
4 = Flawless or perfect (like someone who knows the language)
3 = Good, or comprehensible but has quite a few errors (like someone
speaking Hindi getting all its genders wrong)
2 = Non-native, or comprehensible but has quite a few errors (like
someone who can speak your language but makes lots of errors;
however, you can make sense of what is being said)
1 = Disfluent, or some parts make sense but it is not comprehensible
overall (like listening to a language that has a lot of words borrowed from your
language: you understand those words but nothing more)
0 = Incomprehensible or nonsense (the sentence does not make any
sense at all; it is like someone speaking to you in a language you do not
know)
EVALUATION METHOD
If scoring is done for N sentences and each of the N sentences is given a score
as above, the two parameters are as follows:
(a) Comprehensibility = (Number of sentences with the score of 2, 3, or 4) / N
(b) Fluency = $\frac{1}{N} \sum_{i=1}^{N} S_i$
where S_i is the score of the i-th sentence. For instance, if N = 10 and the scores obtained
for the 10 sentences are S1=3, S2=3, S3=2, S4=1, S5=4, S6=0, S7=0, S8=1, S9=0,
S10=0, this gives the following histogram:
Number of sentences with score 4 = 1
Number of sentences with score 3 = 2
Number of sentences with score 2 = 1
Number of sentences with score 1 = 2
Number of sentences with score 0 = 4
Weighted sum = 4×1 + 3×2 + 2×1 + 1×2 + 0×4 = 14, which gives:
Comprehensibility = 40% (because 4 out of 10 sentences have a score of 2, 3, or 4)
Fluency = 14/10 = 1.4 (on a scale of 0-4), i.e. 35% (on a scale of 0-100)
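The two measures can be computed directly from a list of per-sentence scores. This is a minimal sketch using the ten example scores above; the function names are hypothetical, and rescaling fluency to a percentage by dividing by the maximum score of 4 is an assumption about how the 0-100 figure is obtained.

```python
from collections import Counter


def comprehensibility(scores: list[int]) -> float:
    """Fraction of sentences scored 2, 3, or 4."""
    return sum(1 for s in scores if s >= 2) / len(scores)


def fluency(scores: list[int]) -> float:
    """Mean score on the 0-4 scale."""
    return sum(scores) / len(scores)


def fluency_percent(scores: list[int], max_score: int = 4) -> float:
    """Mean score rescaled to 0-100 (assumes max_score is the top of the scale)."""
    return 100 * sum(scores) / (max_score * len(scores))


# Example scores S1..S10 from the slide.
scores = [3, 3, 2, 1, 4, 0, 0, 1, 0, 0]

histogram = Counter(scores)
for value in range(4, -1, -1):
    print(f"sentences with score {value}: {histogram[value]}")

print("comprehensibility:", comprehensibility(scores))
print("fluency (0-4):", fluency(scores))
print("fluency (0-100):", fluency_percent(scores))
```

Printing the histogram alongside the two measures makes it easy to check that the per-score counts add up to N before trusting the averages.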
Table 1: Score Table to Compute Comprehensibility
Table 2: Score Table to Compute Fluency
We evaluated the Bing and Google MT systems and
found many errors in their output.
Fluency was
found to be very low, although the output was mostly comprehensible. In
comparison, Google was found to be better than Bing MT in
comprehensibility.
SUGGESTIONS
When giving input sentences, tokenize them and avoid using the full stop
marker in final position.
Both MT systems should improve their morph dictionaries using corpus data
and add linguistic rules for paradigm fixation (how to analyze inflectional
and derivational categories). If the MT systems are trained on a large number
of words and sentences, parsing issues might also be resolved.
With these steps, the systems should improve and their errors should decrease to some
extent, raising both the fluency and the comprehensibility of the Bing and Google MT
systems.
Thanks for your patience
34

More Related Content

What's hot

Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
baskaran_md
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
Karthik Sankar
 
Personalising speech to-speech translation
Personalising speech to-speech translationPersonalising speech to-speech translation
Personalising speech to-speech translation
behzad66
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
IJERA Editor
 
An Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity RemovalAn Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity Removal
Waqas Tariq
 
Parameters Optimization for Improving ASR Performance in Adverse Real World N...
Parameters Optimization for Improving ASR Performance in Adverse Real World N...Parameters Optimization for Improving ASR Performance in Adverse Real World N...
Parameters Optimization for Improving ASR Performance in Adverse Real World N...
Waqas Tariq
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
Hrishikesh Nair
 
SMT3
SMT3SMT3
D3 dhanalakshmi
D3 dhanalakshmiD3 dhanalakshmi
D3 dhanalakshmi
Jasline Presilda
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
csandit
 
A Review on a web based Punjabi t o English Machine Transliteration System
A Review on a web based Punjabi t o English Machine Transliteration SystemA Review on a web based Punjabi t o English Machine Transliteration System
A Review on a web based Punjabi t o English Machine Transliteration System
Editor IJCATR
 
Jq3616701679
Jq3616701679Jq3616701679
Jq3616701679
IJERA Editor
 
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNINGDETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
cscpconf
 
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNINGDETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
csandit
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Waqas Tariq
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Sheeyam Shellvacumar
 
A Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis SystemA Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis System
iosrjce
 
Chunking in manipuri using crf
Chunking in manipuri using crfChunking in manipuri using crf
Chunking in manipuri using crf
ijnlc
 

What's hot (18)

Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
 
Personalising speech to-speech translation
Personalising speech to-speech translationPersonalising speech to-speech translation
Personalising speech to-speech translation
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
 
An Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity RemovalAn Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity Removal
 
Parameters Optimization for Improving ASR Performance in Adverse Real World N...
Parameters Optimization for Improving ASR Performance in Adverse Real World N...Parameters Optimization for Improving ASR Performance in Adverse Real World N...
Parameters Optimization for Improving ASR Performance in Adverse Real World N...
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
 
SMT3
SMT3SMT3
SMT3
 
D3 dhanalakshmi
D3 dhanalakshmiD3 dhanalakshmi
D3 dhanalakshmi
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
 
A Review on a web based Punjabi t o English Machine Transliteration System
A Review on a web based Punjabi t o English Machine Transliteration SystemA Review on a web based Punjabi t o English Machine Transliteration System
A Review on a web based Punjabi t o English Machine Transliteration System
 
Jq3616701679
Jq3616701679Jq3616701679
Jq3616701679
 
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNINGDETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
 
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNINGDETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
DETECTION OF JARGON WORDS IN A TEXT USING SEMI-SUPERVISED LEARNING
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.
 
A Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis SystemA Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis System
 
Chunking in manipuri using crf
Chunking in manipuri using crfChunking in manipuri using crf
Chunking in manipuri using crf
 

Viewers also liked

Sanskrit subhashitas with english meaning
Sanskrit subhashitas with english meaningSanskrit subhashitas with english meaning
Sanskrit subhashitas with english meaning
Dokka Srinivasu
 
Save The Trees
Save The TreesSave The Trees
Save The Trees
SICAMP Caucasus
 
The effective salesperson
The effective salespersonThe effective salesperson
The effective salespersonmtlui4
 
Presentación1
Presentación1Presentación1
Presentación1
Harriet Kuiper
 
Полиуретан Для Горнодобывающей отрасли.
Полиуретан Для Горнодобывающей отрасли.Полиуретан Для Горнодобывающей отрасли.
Полиуретан Для Горнодобывающей отрасли.
Pavel Zmeev
 
Powerpoint presentation fb+en
Powerpoint presentation fb+enPowerpoint presentation fb+en
Powerpoint presentation fb+en
sigvesandvik
 
โครงงานคอมฯ
โครงงานคอมฯโครงงานคอมฯ
โครงงานคอมฯ
sensitive_area
 
PRA 304 Pure Company öykü eren
PRA 304 Pure Company öykü erenPRA 304 Pure Company öykü eren
PRA 304 Pure Company öykü eren
oykueren
 
Infographic - Modern CX
Infographic - Modern CXInfographic - Modern CX
Infographic - Modern CX
James R. Bailey
 
Mi paso por el valera 2
Mi paso por el valera 2Mi paso por el valera 2
Mi paso por el valera 2
sextomemola
 
проект 4 стихии
проект 4 стихиипроект 4 стихии
проект 4 стихииMaxibonko
 
Psychology journal
Psychology journalPsychology journal
Psychology journal
Moon Leong
 
Sd pertemuan 1 & 2
Sd   pertemuan 1 & 2Sd   pertemuan 1 & 2
Sd pertemuan 1 & 2muissyahril
 
Iv living things,
Iv living things,Iv living things,
Iv living things,
jaume2014
 
10 Steps to Author Branding
10 Steps to Author Branding10 Steps to Author Branding
10 Steps to Author Branding
Iola Goulton
 
Psycho video report
Psycho video reportPsycho video report
Psycho video report
Moon Leong
 
อุปกรณ์เครือข่ายคอมพิวเตอร์
อุปกรณ์เครือข่ายคอมพิวเตอร์อุปกรณ์เครือข่ายคอมพิวเตอร์
อุปกรณ์เครือข่ายคอมพิวเตอร์Ekkachai Juneg
 
Eng 2
Eng 2Eng 2
Eng 2
Moon Leong
 
Workshop details
Workshop detailsWorkshop details
Workshop details
Aguna Soft Technologies LLP
 
ข้อสอบ7วิชาสามัญ เคมี
ข้อสอบ7วิชาสามัญ เคมีข้อสอบ7วิชาสามัญ เคมี
ข้อสอบ7วิชาสามัญ เคมีRuetaitid Khamentdee
 

Viewers also liked (20)

Sanskrit subhashitas with english meaning
Sanskrit subhashitas with english meaningSanskrit subhashitas with english meaning
Sanskrit subhashitas with english meaning
 
Save The Trees
Save The TreesSave The Trees
Save The Trees
 
The effective salesperson
The effective salespersonThe effective salesperson
The effective salesperson
 
Presentación1
Presentación1Presentación1
Presentación1
 
Полиуретан Для Горнодобывающей отрасли.
Полиуретан Для Горнодобывающей отрасли.Полиуретан Для Горнодобывающей отрасли.
Полиуретан Для Горнодобывающей отрасли.
 
Powerpoint presentation fb+en
Powerpoint presentation fb+enPowerpoint presentation fb+en
Powerpoint presentation fb+en
 
โครงงานคอมฯ
โครงงานคอมฯโครงงานคอมฯ
โครงงานคอมฯ
 
PRA 304 Pure Company öykü eren
PRA 304 Pure Company öykü erenPRA 304 Pure Company öykü eren
PRA 304 Pure Company öykü eren
 
Infographic - Modern CX
Infographic - Modern CXInfographic - Modern CX
Infographic - Modern CX
 
Mi paso por el valera 2
Mi paso por el valera 2Mi paso por el valera 2
Mi paso por el valera 2
 
проект 4 стихии
проект 4 стихиипроект 4 стихии
проект 4 стихии
 
Psychology journal
Psychology journalPsychology journal
Psychology journal
 
Sd pertemuan 1 & 2
Sd   pertemuan 1 & 2Sd   pertemuan 1 & 2
Sd pertemuan 1 & 2
 
Iv living things,
Iv living things,Iv living things,
Iv living things,
 
10 Steps to Author Branding
10 Steps to Author Branding10 Steps to Author Branding
10 Steps to Author Branding
 
Psycho video report
Psycho video reportPsycho video report
Psycho video report
 
อุปกรณ์เครือข่ายคอมพิวเตอร์
อุปกรณ์เครือข่ายคอมพิวเตอร์อุปกรณ์เครือข่ายคอมพิวเตอร์
อุปกรณ์เครือข่ายคอมพิวเตอร์
 
Eng 2
Eng 2Eng 2
Eng 2
 
Workshop details
Workshop detailsWorkshop details
Workshop details
 
ข้อสอบ7วิชาสามัญ เคมี
ข้อสอบ7วิชาสามัญ เคมีข้อสอบ7วิชาสามัญ เคมี
ข้อสอบ7วิชาสามัญ เคมี
 

Similar to Evaluation of hindi english mt systems, challenges and solutions

A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
aciijournal
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
aciijournal
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
Kalyanee Baruah
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
muthukumaran.tdr95
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analytics
R Systems International
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
Parisa Niksefat
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
Lifeng (Aaron) Han
 
Pxc3898474
Pxc3898474Pxc3898474
Pxc3898474
Sivajyothi Chandra
 
Machine translation evaluation: a survey
Machine translation evaluation: a surveyMachine translation evaluation: a survey
Machine translation evaluation: a survey
Lifeng (Aaron) Han
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
IOSR Journals
 
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
aciijournal
 
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
aciijournal
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi Language
Editor IJCATR
 
IRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational VideosIRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational Videos
IRJET Journal
 
An exploratory research on grammar checking of Bangla sentences using statist...
An exploratory research on grammar checking of Bangla sentences using statist...An exploratory research on grammar checking of Bangla sentences using statist...
An exploratory research on grammar checking of Bangla sentences using statist...
IJECEIAES
 
Ac04507168175
Ac04507168175Ac04507168175
Ac04507168175
IJERA Editor
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metric
Lifeng (Aaron) Han
 
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT LEPOR: an augmented machine translation evaluation metric - Thesis PPT
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
Lifeng (Aaron) Han
 
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
cscpconf
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indian
eSAT Publishing House
 

Similar to Evaluation of hindi english mt systems, challenges and solutions (20)

A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analytics
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 
Pxc3898474
Pxc3898474Pxc3898474
Pxc3898474
 
Machine translation evaluation: a survey
Machine translation evaluation: a surveyMachine translation evaluation: a survey
Machine translation evaluation: a survey
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...Classification of MT-Output Using Hybrid MT-Evaluation Metrics for Post-Editi...
Evaluation of Hindi-English MT systems: challenges and solutions

  • 1. HUL 455: Evaluation of Hindi-English MT Systems: Challenges and Solutions. A presentation by Sajeed Mahaboob (2011ME1111).
  • 2. MACHINE TRANSLATION: Translation is the act or process of rendering text from one language into another. Machine translation (MT) investigates the use of computer software to translate text or speech from a source language (SL) to a target language (TL). It is an automated system.
  • 3. An MT system analyzes text in the source language (SL), processes it, and produces "equivalent" text in the target language (TL). Ideally it works without human intervention. MT systems are meant to break the language barrier.
  • 4. METHODS AND STRATEGIES: Direct method; Transfer method; Interlingual method.
  • 5. DIRECT METHOD: The majority of MT systems of the 1950s and 1960s were based on this approach. Each system was designed in all its details for one particular pair of languages and relied on word-by-word matching between SL and TL.
  • 6. TRANSFER METHOD: Translation proceeds in two stages built on underlying representations for both SL and TL texts. The first stage converts SL texts into SL 'transfer' representations. The second stage converts these into TL 'transfer' representations.
  • 7. INTERLINGUAL METHOD: SL texts are converted into semantico-syntactic representations common to more than one language. From such 'interlingual' representations, texts can then be generated in other languages.
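The word-by-word matching of the direct method can be illustrated with a toy sketch. The three-entry lexicon and the function name `translate_direct` are invented here for illustration; no real system from the slides is being reproduced.

```python
# Hypothetical three-word Hindi-English lexicon, invented for illustration only.
LEXICON = {"वह": "he", "जाती": "goes", "है": "is"}

def translate_direct(sentence, lexicon=LEXICON):
    """Replace each source word by its single lexicon entry, keeping source word order."""
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in sentence.split())

print(translate_direct("वह जाती है"))  # he goes is
```

The output keeps Hindi word order and ignores the feminine verb form जाती, which is the kind of structural and agreement error that pure word-by-word matching cannot avoid.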
  • 8. MT IN INDIA: WHY DO WE NEED IT? India is a multilingual country where the spoken language changes every 50 miles. It has 22 official languages, and approximately 2000 dialects are spoken. State governments carry out their official work in their respective regional languages. Translating documents manually is very time-consuming and costly.
  • 9. ENGLISH-HINDI MT SYSTEMS: MANTRA MT (1997). Developed for information preservation. Text available in one Indian language is made accessible in another Indian language with the help of this system. It uses an XTAG-based supertagger and a light dependency analyzer to analyze the input English text. The system produces several outputs for a given input.
  • 10. MANTRA MT (1999): It translates English text into Hindi in the specific domain of personnel administration, which includes gazette notifications, office orders, office memorandums, and circulars. It uses the Tree Adjoining Grammar (TAG) formalism to represent English and Hindi grammar, and tree transfer to translate from English to Hindi. The system was tested on the translation of administrative documents such as appointment letters, notifications, and circulars issued by the central government, from English to Hindi.
  • 11. English-Hindi Translation System: A system based on the transfer approach, which uses grammatical rules of the source and target languages together with a bilingual dictionary. The translation module consists of pre-processing, English tree generation, post-processing of the English tree, Hindi tree generation, post-processing of the Hindi tree, and output generation. The domain of the system was weather narration.
  • 12. EVALUATION OF ENGLISH-HINDI MT SYSTEMS: Low accuracy, fluency, and acceptability of the output of any machine translation system adversely affect the reliability and usage of that system. Evaluation can establish how, and in what ways, the results of these systems fall short. Evaluation is one of the most important parts of MT development, and one cannot claim an MT system's success without it. Evaluating an MT system is therefore always a high priority. Here, we evaluate the output for the Hindi-English language pair from two MT systems: Bing and Google.
  • 13. Google MT/Translator is based on statistical and machine learning approaches trained on parallel corpora. It runs for 73 language pairs. Bing (Microsoft) MT is also based on statistical and machine learning approaches trained on parallel corpora, and it additionally uses language-specific rule-based components to decode and encode sentences from one language to another. This is described as "linguistically informed statistical machine translation". Bing MT runs for 44 language pairs.
  • 14. EVALUATION STRATEGIES: Evaluation strategies fall into two groups: (a) automatic evaluation and (b) manual (human) evaluation. Automatic evaluation of an MT system is very difficult and is not as effective as human evaluation. Several tested automatic measures are in frequent use, for example BLEU, mWER, mPER, and NIST. Human evaluation is time-consuming and costly, but it is the best strategy for improving an MT system's accuracy. A sentence often has more than one valid translation, and in such cases a human translator-cum-evaluator can judge the output correctly.
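To make the automatic measures named above more concrete, here is a minimal sketch of a word error rate, with mWER read as the lowest error rate against any of several reference translations. That reading, the punctuation-stripping tokenizer, and all function names are assumptions made for illustration; the slides do not specify how the measures were computed, and references are assumed to be non-empty.

```python
import string

def tokenize(text):
    """Lowercase, split on whitespace, and strip ASCII punctuation from each token."""
    tokens = (w.strip(string.punctuation) for w in text.lower().split())
    return [t for t in tokens if t]

def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a missing reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def wer(hypothesis, reference):
    """Edit distance divided by the reference length (reference must be non-empty)."""
    ref = tokenize(reference)
    return edit_distance(tokenize(hypothesis), ref) / len(ref)

def multi_reference_wer(hypothesis, references):
    """Lowest WER over all available reference translations."""
    return min(wer(hypothesis, r) for r in references)

print(wer("She goes by.", "She goes."))  # 0.5
print(wer("He is.", "She goes."))        # 1.0
```

Taking the minimum over references reflects the point made on the slide that one sentence can have several acceptable translations.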
  • 15. CHALLENGES DURING EVALUATION: Sentences from the health and cuisine domains of the ILCI3 corpora are used to evaluate the MT systems. The sentences are entered into each system in bulk, the output is crawled, and discrepancies are marked. In the resulting English output, several problems are noted, particularly with gender agreement, structural mapping, Named Entity Recognition (NER), and plural marker morphemes.
  • 16. During the evaluation process the following kinds of challenges are encountered: (1) tokenization; (2) morph issues; (3) structural/grammatical differences; (4) errors with gender agreement; (5) parser issues.
  • 17. TOKENIZATION: (i) With/without punctuation (BO = Bing output, GO = Google output): (a) वह जाती है। BO: "She goes by." GO: "He is." (b) वह जाती है BO: "He is" GO: "He is" Manual translation: "She goes." Examples (a) and (b) show how a punctuation mark can significantly affect translation. The variation appears only in Bing; Google is consistent.
  • 18. TRANSLITERATION ISSUE: (b) एक नौन-स्टिक तवा गरम करें BO: "A naun-stick frying pan and heat" GO: "A Non - stick frying pan and heat" Manual translation: "Heat the non-stick fry pan."
  • 19. MORPH ISSUE: (i) Unknown words: छुआरे डालकर मिलाएं और एक मिनिट पकाएँ BO: "One minute into the match and put chuare" GO: "Mix and cook one minute, add Cuare" Manual translation: "Put date palm, stir and cook for a minute."
  • 20. (ii) Error with paradigm fixation: कॅन्सर 1000 से अधिक बीमारियों का एक समूह है BO: "Cancer is a group of more than 1000 berryman" GO: "Cancer is a group of more than 1000 illnesses" कॅन्सर 1000 से अधिक बीमारी का एक समूह है BO: "Cancer is a group of more than 1,000 diseases" GO: "Cancer is a group of more than 1000 illnesses" Manual translation: "Cancer is a group of more than 1000 diseases."
  • 21. STRUCTURAL/GRAMMATICAL DIFFERENCES: वी. आइ. पी. क्या है? BO: "What is the VIP?" GO: "VIP what is it?" Manual translation: "What is the VIP?" ERRORS WITH GENDER AGREEMENT: वह जाती है। BO: "She goes by." GO: "He is." Manual translation: "She goes."
  • 22. PARSER ISSUES: आँख की मांसपेशियों की कमजोरी के कारण लेंस अपना आकार नहीं बदल पाता पढ़ते या नजदीकी काम करते समय प्रकाश की किरणे रेटिना के पीछे पड़ती है यह 40 वर्ष और उससे ऊपर की उम्र में पाई जाती है BO: "Due to the weakness of the muscles of the eye lens cannot read or change their size does proximity to work while the light rays have it 40 years behind the retina and above in age" GO: no output.
  • 23. A human evaluation strategy was adopted to evaluate the Bing (Microsoft) and Google MT (Hindi-English) output. Methodology of MT testing: 1,000 sentences were used to test the MT systems. Their outputs were distributed among three human evaluators, who rated them for comprehensibility and fluency.
  • 24. Instructions for evaluators: Read the translated target-language output first. Judge each sentence for its comprehensibility. Rate it on the scale 0 to 4. Read the original source sentence only to verify the faithfulness of the translation (for reference only). Do not read the source-language sentence first. If the rating needs revision, change it to the new rating.
  • 25. Guidelines for evaluation (5-point scale, 0-4): The following score is given to each output sentence. (A) For comprehensibility: 4 = all meaning; 3 = most meaning; 2 = much meaning; 1 = little meaning; 0 = none.
  • 26. (B) For fluency: 4 = flawless or perfect (like someone who knows the language); 3 = good, or comprehensible but has quite a few errors (like someone speaking Hindi and getting all its genders wrong); 2 = non-native, or comprehensible but has quite a few errors (like someone who can speak your language but makes lots of errors, although you can make sense of what is being said); 1 = disfluent, or some parts make sense but the whole is not comprehensible (like listening to a language with many words borrowed from your own: you understand those words but nothing more); 0 = incomprehensible or nonsense (the sentence makes no sense at all, like someone speaking to you in a language you do not know).
  • 27. EVALUATION METHOD: If N sentences are scored as above, the two parameters are computed as follows: (a) Comprehensibility = (number of sentences with a score of 2, 3, or 4) / N. (b) Fluency = $\frac{1}{N}\sum_{i=1}^{N} S_i$.
  • 28. Here $S_i$ is the score of the $i$-th sentence. For instance, if N = 10 and the scores for the 10 sentences are S1 = 3, S2 = 3, S3 = 2, S4 = 1, S5 = 4, S6 = 0, S7 = 0, S8 = 1, S9 = 0, S10 = 0, the histogram is: score 4: 1 sentence; score 3: 2 sentences; score 2: 1 sentence; score 1: 2 sentences; score 0: 4 sentences. The weighted sum is 4(1) + 3(2) + 2(1) + 1(2) + 0(4) = 14, which gives: Comprehensibility = 4/10 = 40% (4 of the 10 sentences score 2, 3, or 4). Fluency = 14/10 = 1.4 on the 0-4 scale, that is 1.4/4 = 35% of the maximum possible score.
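The two parameters defined on the evaluation-method slides can be checked with a short script. This is only a sketch: the function names are chosen here for illustration, and the percentage form of fluency is taken to be the mean score divided by the scale maximum of 4.

```python
from collections import Counter

def comprehensibility(scores):
    """Fraction of sentences rated 2, 3, or 4 on the 0-4 scale."""
    return sum(1 for s in scores if s >= 2) / len(scores)

def fluency(scores):
    """Mean sentence score on the 0-4 scale."""
    return sum(scores) / len(scores)

# Scores S1..S10 from the worked example on the slide.
scores = [3, 3, 2, 1, 4, 0, 0, 1, 0, 0]

histogram = Counter(scores)
for score in range(4, -1, -1):
    print(f"score {score}: {histogram[score]} sentence(s)")

print(comprehensibility(scores))               # 0.4
print(fluency(scores))                         # 1.4
print(round(100 * fluency(scores) / 4, 1))     # 35.0
```

Printing the histogram alongside the two parameters makes it easy to confirm that the per-score counts add up to N before the averages are reported.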
  • 29. Table 1: Score table to compute comprehensibility. Table 2: Score table to compute fluency.
  • 31. We have thus evaluated the Bing and Google MT systems. Examining their output revealed many errors. Fluency was found to be very low, although the output was almost always comprehensible. In comparison, Google was found to be better than Bing MT in comprehensibility.
  • 32. SUGGESTIONS: When giving input sentences, tokenize them and avoid using the full stop marker in final position. Both MT systems should improve their morph dictionaries using corpus data and add linguistic rules for paradigm fixation (how to analyze inflectional and derivational categories). If the MT systems are trained on a large number of words and sentences, parsing issues might also be resolved. With these changes the systems should improve and the errors should decrease to some extent, raising both the fluency and the comprehensibility of the Bing and Google MT systems.
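The first suggestion, dropping the sentence-final full stop before submitting input, could be sketched as a small preprocessing step. The function name `strip_final_stop` is hypothetical, and treating both the Devanagari danda (।) and the ASCII full stop as the "full stop marker" is an assumption.

```python
def strip_final_stop(sentence):
    """Remove a sentence-final danda or full stop, plus surrounding whitespace."""
    # rstrip with a character set removes any trailing run of those characters.
    return sentence.strip().rstrip("।.").rstrip()

batch = ["वह जाती है।", "वह जाती है", "वी. आइ. पी. क्या है?"]
for sentence in batch:
    print(strip_final_stop(sentence))
```

Only trailing stops are removed, so sentence-internal abbreviation periods and a final question mark are left untouched. A sentence that ends in an abbreviation would lose its last period, which a fuller implementation would need to handle.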
  • 34. Thanks for your patience.