A SEMINAR REPORT
ON
English-Assamese Statistical
Machine Translation using Moses
by
KALYANEE KANCHAN BARUAAH
Contents
➢ Introduction
➢ Literature Review
➢ Implementation
➢ Problems and proposed solutions
➢ Results and Evaluation
➢ Conclusion and future work
➢ References
Introduction
➢ Natural Language Processing
➢ Machine Translation
    ➢ Need for Machine Translation
    ➢ Problems in MT
➢ Approaches to machine translation
    ➢ Direct-based MT
    ➢ Rule-based MT
    ➢ Corpus-based MT
    ➢ Knowledge-based MT
➢ Some existing MT systems
➢ Statistical Machine Translation
Introduction
NLP (Natural Language Processing)
– Deals with understanding and developing computational
theories of human language. Such theories allow us to
understand the structure of language and to build computer
software that can process it.
– Plays a major role in human-machine communication as well as
human-human communication.
– Machine Translation (MT) is a sub-field of computational
linguistics that investigates the use of computer software to
translate text or speech from one natural language to another.
Machine Translation
Machine translation is the application of computer and language
sciences to the development of systems answering practical needs.
– Need for Machine Translation
    – Needed to translate literary works from any language into
    native languages.
    – Most of the information available is in English, which is
    understood by only 3% of the population.
    – Makes rich sources of literature available to people
    across the world.
– Problems in Machine Translation
    – Translation is not straightforward
    – Automating translation is not easy
    – Idioms
    – Ambiguity
Machine Translation
➢ MT Approaches
MT Approaches
➢ Direct MT
    ➢ The most basic form of MT. It translates the individual words in a
    sentence from one language to another using a two-way
    dictionary, and makes use of very simple grammar rules.
    ➢ Little analysis of the source language
    ➢ No parsing
    ➢ Reliance on a large two-way dictionary
➢ Rule-Based Machine Translation
    ➢ RBMT (also known as “Knowledge-Based Machine
    Translation” or the “Classical Approach” of MT) is a general term
    for machine translation systems based on linguistic
    information about the source and target languages, retrieved mainly
    from (bilingual) dictionaries and grammars that cover the main
    semantic, morphological, and syntactic regularities of each
    language.
MT Approaches
• Interlingua-based Machine Translation
• Fig 2.3: Multilingual MT system with Interlingua approach
– Interlingua machine translation converts the source text into a
universal, language-independent representation created for the MT
system, from which it can be translated into more than one language.
– It can also be used in applications such as information retrieval.
– It is more practical when several languages are involved, since
the source text only needs to be translated into the interlingua once.
MT Approaches
Transfer-based MT
• Transfer-based translation rests on the same idea as interlingua:
to produce a correct translation, an intermediate representation
that captures the "meaning" of the original sentence is needed.
In interlingua-based MT this intermediate representation must be
independent of the languages in question, whereas in transfer-based
MT it depends to some extent on the language pair involved [9].
Knowledge-based MT
• Semantics-based approaches to language analysis have been introduced
by AI researchers. These approaches require a large knowledge base
that includes both ontological and lexical knowledge [6].
MT Approaches
Corpus-Based Machine Translation
Corpus-based MT is classified into statistical and example-based machine translation.
• Example-based MT
– Example-based systems use previous translation examples to
generate translations for a given input. When an input
sentence is presented to the system, it retrieves a similar source
sentence from the example base, together with its translation.
• Statistical Machine Translation
– SMT requires less human effort to undertake translation. It is
a machine translation paradigm in which translations are generated
on the basis of statistical models.
Example-based MT vs. Statistical MT
– Example-based MT: uses a variety of linguistic resources, such as
dictionaries and thesauri, to translate text.
– Statistical MT: uses purely statistical methods to align words
and generate text.
SOME EXISTING MT SYSTEMS
• Google Translate
• Systran
• Bing Translator
• Babel Fish
Statistical Machine Translation
➢ A statistical machine translation system consists of a language
model (LM), a translation model (TM) and a decoder.
➢ The language model encourages fluent output and the translation
model encourages similarity between input and output. The decoder
searches for the target-language text with the highest probability.
➢ SMT is based on ideas from information theory, in particular
Shannon’s noisy-channel model. The purpose of this model is to
identify a message that is transmitted through a communication
channel and is hence prone to errors due to the channel’s quality.
Statistical Machine Translation
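Parallel corpus C = a collection of text chunks and their
translations (a by-product of human translation).
Given a source sentence f, select the target sentence e:
\[ \hat{e} = \operatorname*{arg\,max}_{e \in E(f)} p(e \mid f) = \operatorname*{arg\,max}_{e \in E(f)} p(e)\, p(f \mid e) \]
where E(f) is the set of hypothesized translations of f. The second form
follows from Bayes' rule, because p(f) is constant for a given source sentence.
Estimating p(f|e) is difficult because the two languages diverge in:
– Word order
– Morphology
– Syntactic relations
– Idiomatic ways of expression
– Sparse datasets make these divergences harder to learn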
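The selection rule above can be sketched in a few lines of Python. This is a minimal illustration only: the candidate sentences and every probability value are invented for the example (they are not taken from the experiment), and the candidates are written in English simply so that they are readable.

```python
import math

# Toy probabilities, invented purely for illustration.
# Language model p(e): how fluent each candidate target sentence is.
LANGUAGE_MODEL = {
    "udaipur is a famous city": 0.6,
    "udaipur a famous city is": 0.1,
    "udaipur is a known town": 0.3,
}

# Translation model p(f | e): how well each candidate explains the source f.
TRANSLATION_MODEL = {
    "udaipur is a famous city": 0.5,
    "udaipur a famous city is": 0.7,
    "udaipur is a known town": 0.2,
}


def decode(candidates):
    """Return the candidate e that maximises p(e) * p(f | e)."""
    def score(e):
        # Sum of logs instead of a product of probabilities, to avoid underflow.
        return math.log(LANGUAGE_MODEL[e]) + math.log(TRANSLATION_MODEL[e])
    return max(candidates, key=score)


best = decode(list(LANGUAGE_MODEL))
print(best)
```

Note how the second candidate has the highest translation-model score but loses overall because the language model penalises its word order; this is the division of labour between the two models.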
Statistical Machine Translation
SMT-Language Model
• A language model gives the probability of a sentence, computed
using an n-gram model. It can be viewed as computing the probability
of each word given all of the words that precede it in the sentence.
• The probability of a sentence is decomposed into a product of
conditional probabilities.
• Using the chain rule, the probability of a sentence P(S) is broken
down into the probabilities of the individual words, each conditioned
on the words before it.
• An n-gram model simplifies the task by approximating the
probability of a word given all the previous words by its
probability given only the previous n-1 words.
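Written out, the decomposition described in these points looks as follows. The symbols w_i (the i-th word), m (the sentence length) and the count(...) notation are introduced here for illustration and are assumptions, not notation from the original slides.

```latex
\begin{align*}
P(S) &= P(w_1, w_2, \dots, w_m)
      = \prod_{i=1}^{m} P(w_i \mid w_1, \dots, w_{i-1})
      && \text{(chain rule)} \\
P(w_i \mid w_1, \dots, w_{i-1}) &\approx P(w_i \mid w_{i-n+1}, \dots, w_{i-1})
      && \text{($n$-gram approximation)} \\
P(w_i \mid w_{i-1}) &= \frac{\operatorname{count}(w_{i-1}\, w_i)}{\operatorname{count}(w_{i-1})}
      && \text{(bigram case, maximum-likelihood estimate)}
\end{align*}
```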
SMT-Language Model
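As a rough illustration of an n-gram language model, the sketch below estimates bigram probabilities by counting and multiplies them to score a sentence. It is not IRSTLM and uses no smoothing; the three training sentences and the `<s>` / `</s>` boundary markers are assumptions made for the example.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenised sentences."""
    history_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts


def sentence_probability(sentence, history_counts, bigram_counts):
    """P(S) as a product of maximum-likelihood bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if history_counts[prev] == 0:
            return 0.0  # unseen history; this sketch has no smoothing
        prob *= bigram_counts[(prev, word)] / history_counts[prev]
    return prob


# Tiny invented training corpus.
corpus = [
    "udaipur is a famous city",
    "jaipur is a famous city",
    "udaipur is a city",
]
hist, bi = train_bigram(corpus)
p_fluent = sentence_probability("udaipur is a famous city", hist, bi)
p_scrambled = sentence_probability("city famous a is udaipur", hist, bi)
print(p_fluent, p_scrambled)
```

The scrambled sentence receives probability zero because none of its bigrams were seen, which is why real toolkits apply smoothing.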
SMT-Translation Model
English: Udaipur is a famous city
Assamese: উদয়পুৰ এখন বিখ্যাত চহৰ
The translation model helps to compute the
conditional probability P(T|S).
Implementation
❑ Install Moses and its supporting packages
    • Install GIZA++
    • Install IRSTLM
❑ Training
❑ Tuning
❑ Generate output (decoding)
TRAINING THE MOSES DECODER
• Prepare data
• Run GIZA++
• Align words
• Get lexical translation table
• Extract phrases
• Score phrases
• Build lexicalized reordering model
• Build generation models
• Create configuration file
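To make one of the steps above concrete, the sketch below shows how a lexical translation table can be estimated by relative frequency once word alignments exist. It is an illustration, not the Moses training script: the alignments are given by hand as (source index, target index) pairs rather than produced by GIZA++, and the target tokens (`t_city`, `t_town`, ...) are hypothetical placeholders, not real Assamese words.

```python
from collections import Counter, defaultdict


def lexical_table(aligned_pairs):
    """Estimate w(target | source) = count(source, target) / count(source)."""
    pair_counts = Counter()
    source_counts = Counter()
    for source_tokens, target_tokens, alignment in aligned_pairs:
        for i, j in alignment:  # source position i is aligned to target position j
            pair_counts[(source_tokens[i], target_tokens[j])] += 1
            source_counts[source_tokens[i]] += 1
    table = defaultdict(dict)
    for (src, tgt), count in pair_counts.items():
        table[src][tgt] = count / source_counts[src]
    return table


# Hand-made toy data: (source tokens, target tokens, alignment links).
pairs = [
    (["famous", "city"], ["t_famous", "t_city"], [(0, 0), (1, 1)]),
    (["old", "city"], ["t_old", "t_town"], [(0, 0), (1, 1)]),
    (["city"], ["t_city"], [(0, 0)]),
]
table = lexical_table(pairs)
print(table["city"])
```

Here "city" is aligned to `t_city` in two of its three occurrences, so the table assigns it two thirds of the probability mass.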
PREPARING DATA
➢ Tokenising - inserting spaces between words and
punctuation.
➢ Truecasing - converting the first word of each
sentence to its most probable case.
➢ Cleaning - removing empty lines, redundant
spaces, and lines that are too short or too long.
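The tokenising and cleaning steps can be sketched as below. This is a simplified stand-in for the scripts shipped with Moses, not a copy of them: the punctuation set, the function names and the length limits (`min_len`, `max_len`) are assumptions chosen for the example, and a real tokeniser handles many more cases, such as abbreviations and decimal numbers.

```python
import re


def tokenise(line):
    """Insert spaces around punctuation, then collapse redundant whitespace."""
    # \u0964 is the danda, the sentence-final mark used in Assamese text.
    spaced = re.sub(r'([.,!?;:()"\u0964])', r" \1 ", line)
    return " ".join(spaced.split())


def clean(pairs, min_len=1, max_len=80):
    """Keep sentence pairs whose two sides both have min_len..max_len tokens."""
    kept = []
    for source, target in pairs:
        source, target = tokenise(source), tokenise(target)
        n_src, n_tgt = len(source.split()), len(target.split())
        if min_len <= n_src <= max_len and min_len <= n_tgt <= max_len:
            kept.append((source, target))
    return kept


print(tokenise("September to March is the best season to visit Udaipur."))
print(clean([("Hello,   world.", "x y ."), ("", "z")]))
```

Cleaning works on sentence pairs rather than single lines so that the two sides of the parallel corpus stay aligned after filtering.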
Sample of Parallel Corpus
Files: eng-ass1.en (English) | eng-ass1.as (Assamese)
1. English: Shopping in Udaipur is always a delightful experience and it
   displays excellent handicrafts and works developed by local traders.
   Assamese: [Assamese text garbled in the source]
2. English: September to March is the best season to visit Udaipur.
   Assamese: [Assamese text garbled in the source]
3. English: The Shilpagram is designed on the concept of a village with
   little emphasis on the modern concept.
   Assamese: [Assamese text garbled in the source]
4. English: A part of the City Palace is now converted into a museum that
   displays some of the best forms of art and culture.
   Assamese: [Assamese text garbled in the source]
Sample output
English sentences as input | Corresponding output in Assamese
1. English: Kanak Vrindavan is a popular picnic spot in Jaipur
   Assamese: [Assamese text garbled in the source]
2. English: City Palace is a synthesis of Mughal and Rajasthani architecture
   Assamese: [Assamese text garbled in the source]
3. English: Jama Masjid is the largest mosque in India
   Assamese: [Assamese text garbled in the source]
4. English: A part of the City Palace is now converted into a museum
   Assamese: [Assamese text garbled in the source]
Block diagram showing SMT using Moses
Results and evaluation
– The output of the experiment was evaluated using
BLEU (Bilingual Evaluation Understudy).
– Assamese-English translation using the same parallel
corpus obtained a BLEU score of 5.72. This is very low,
probably because we used a very small data set.
• BLEU scores are not commensurate even between different corpora in
the same translation direction. BLEU is really only comparable for
different systems or system variants on the exact same data.
Source/Target        BLEU score
English-Assamese     4.71
Assamese-English     5.72
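For readers unfamiliar with how a BLEU score is computed, the following is a minimal sentence-level sketch with a single reference and no smoothing. It is not the evaluation script used in the experiment, and the scores reported above are corpus-level figures that this simplified function will not reproduce; the example sentences are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference BLEU on a 0-100 scale, without smoothing."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # any zero precision gives BLEU 0 without smoothing
        log_precision_sum += math.log(matched / total)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(log_precision_sum / max_n)


reference = "the city palace is now a museum"
print(bleu("the city palace is now a museum", reference))
print(bleu("the city palace is a museum now", reference))
```

Because the score is a geometric mean of 1-gram to 4-gram precisions, a single missing 4-gram match drives it to zero, which is one reason small test sets and short sentences yield very low BLEU.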
Problems and proposed solution
Problems
– We used only a limited amount of English-Assamese parallel
corpus. The translation model is therefore less effective, since
translation quality improves with more training data
(parallel corpus).
– Translation of out-of-vocabulary (OOV) words is poor, because
Moses either leaves OOV words untranslated or drops them.
OOV words are words that are not present in the training corpus,
such as some proper nouns. We are trying to implement
transliteration for these words.
Problems and proposed solution
Solution
Transliteration
• কু-মা-ৰ -> (ku-ma-r)    ৰাজ-কু-মা-ৰ -> (Raj-ku-ma-r)
• Example: English-Assamese transliteration
To translate the sentence “panaji is a city”, we used the following
command to incorporate transliteration into translation:
echo 'panaji is a city'| ~/mymoses/bin/moses -f ~/work/mert-work/moses.ini |
./space.pl | ~/mymoses/bin/moses -f ~/work1/train/model/moses.ini | ./join.pl
This gives us the output:
• [Assamese output garbled in the source]
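The idea of falling back to transliteration for OOV words can be sketched as follows. Note the substitution: the pipeline above uses a second Moses model, whereas this sketch uses a hand-written lookup table with greedy longest-match, purely for illustration. The table holds only the four pieces from the ku-ma-r / Raj-ku-ma-r example, and the function names are hypothetical.

```python
# Toy syllable table based on the ku-ma-r / Raj-ku-ma-r example;
# a real system would learn such mappings from data.
SYLLABLES = {"raj": "ৰাজ", "ku": "কু", "ma": "মা", "r": "ৰ"}


def transliterate(word, table=SYLLABLES):
    """Greedy longest-match transliteration; None if a piece is unknown."""
    word = word.lower()
    out, i = [], 0
    longest = max(len(key) for key in table)
    while i < len(word):
        for size in range(min(longest, len(word) - i), 0, -1):
            piece = word[i:i + size]
            if piece in table:
                out.append(table[piece])
                i += size
                break
        else:
            return None  # no table entry covers this position
    return "".join(out)


def translate_with_fallback(tokens, known_translations):
    """Use the known translation, else transliterate, else keep the word."""
    result = []
    for token in tokens:
        if token in known_translations:
            result.append(known_translations[token])
        else:
            result.append(transliterate(token) or token)
    return result


print(transliterate("Rajkumar"))
print(translate_with_fallback(["kumar", "xyz"], {}))
```

Words that the table cannot cover are passed through unchanged, so nothing is dropped from the output.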
Conclusion and Future Work
• The work can be extended in the following directions.
– The system can be deployed in a web-based portal to translate the content
of a web page from English to Assamese.
– We will try to enlarge the corpus to improve training and translation quality.
– We will try to develop our own translation system instead of using the
Moses MT system.
– Since most Indian languages follow SOV order and are relatively rich in
morphology, the methodology presented should be applicable to
English-to-Indian-language SMT in general. Since morphological analysers
and parsers are not widely available for Indian languages, an
approach like this, which minimizes the use of such tools for the target
language, would be quite handy.
Conclusion and Future Work
• We will try to obtain more corpora from different domains so
that they cover a wider vocabulary. Since BLEU is not well suited
to evaluating rough translations, we also need other evaluation
techniques.
• We should try incorporating shallow syntactic
information (POS tags) into our discriminative model to boost
translation performance.
References
– Antony P. J., “Machine Translation Approaches and Survey for Indian Languages”.
– G. Singh and G. Singh Lehal, “A Punjabi to Hindi Machine Translation System”, Coling 2008: Companion
volume - Posters and demonstrations, Manchester, August 2008.
– F. J. Och, “GIZA++: Training of statistical translation models”, [Online]. Available:
http://fjoch.com/GIZA++.html
– Moses Manual.
– “Natural language processing”, Wikipedia, the free encyclopedia.
– D. D. Rao, “Machine Translation A Gentle Introduction”, RESONANCE, July 1998.
– “Statistical machine translation”, [Online]. Available:
http://en.wikipedia.org/wiki/Statistical_machine_translation
– S. K. Dwivedi and P. P. Sukadeve, “Machine Translation System Indian Perspectives”,
Journal of Computer Science, Vol. 6, No. 10, pp. 1082-1087, May 2010.
– “Machine Translation”, [Online]. Available:
http://www.ida.liu.se/~729G11/projekt/studentpapper-10/maria-hedblom.pdf
– “Machine Translation”, [Online]. Available:
http://faculty.ksu.edu.sa/homiedan/Publications/Machine%20Translation.pdf
Experiments with Different Models of Statistcial Machine Translation
 
project present
project presentproject present
project present
 
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
ON THE UTILITY OF A SYLLABLE-LIKE SEGMENTATION FOR LEARNING A TRANSLITERATION...
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
 
Direct Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsDirect Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete Units
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
 
LLM.pdf
LLM.pdfLLM.pdf
LLM.pdf
 
Moses
MosesMoses
Moses
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Improvement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A ReviewImprovement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A Review
 

Recently uploaded

What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝soniya singh
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 

Recently uploaded (20)

What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi đŸ’¯Call Us 🔝8264348440🔝
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi đŸĢĻ HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi đŸĢĻ HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi đŸĢĻ HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi đŸĢĻ HOT AND SEXY VVIP 🍎 SE...
 

Translationusing moses1

  • 7. MT Approaches
      – Direct MT: the most basic form of MT. It translates the individual words of a sentence from one language to another using a two-way dictionary and applies only very simple grammar rules.
          – Little analysis of the source language
          – No parsing
          – Reliance on a large two-way dictionary
      – Rule-Based Machine Translation (RBMT; also known as “Knowledge-Based Machine Translation” or the “Classical Approach” of MT): a general term for machine translation systems based on linguistic information about the source and target languages, retrieved mainly from (bilingual) dictionaries and grammars that cover the main semantic, morphological, and syntactic regularities of each language.
  • 8. MT Approaches
      – Interlingua-based Machine Translation
      – Fig 2.3: Multilingual MT system with Interlingua approach
          – Interlingua MT converts the source text into a universal intermediate language, created for the MT system, from which it can be translated into more than one target language.
          – It can also be used in applications such as information retrieval.
          – It is more practical when several languages are involved, since each source language only needs to be translated into the interlingua.
  • 9. MT Approaches
      – Transfer-based MT: Transfer-based translation shares the core idea of interlingua MT: to produce a translation, the system needs an intermediate representation that captures the "meaning" of the original sentence, from which the correct translation is generated. In interlingua-based MT this intermediate representation must be independent of the languages in question, whereas in transfer-based MT it depends to some extent on the language pair involved [9].
      – Knowledge-based MT: Semantic-based approaches to language analysis were introduced by AI researchers. These approaches require a large knowledge base that includes both ontological and lexical knowledge [6].
  • 10. MT Approaches
      – Corpus-based Machine Translation is classified into statistical and example-based Machine Translation.
      – Example-based MT: Example-based systems use previous translation examples to generate translations for a given input. When an input sentence is presented, the system retrieves a similar source sentence and its translation from the example base.
      – Statistical Machine Translation: SMT is a machine translation paradigm in which translations are generated on the basis of statistical models. It requires less human effort to produce translations.
      – Comparison:
          Example-based MT: uses a variety of linguistic resources, such as dictionaries and thesauri, to translate text.
          Statistical-based MT: uses purely statistical methods for aligning words and generating text.
  • 11. SOME EXISTING MT SYSTEMS
      – Google Translate
      – Systran
      – Bing Translator
      – Babel Fish
  • 12. Statistical Machine Translation
      – Statistical Machine Translation consists of a Language Model (LM), a Translation Model (TM), and a decoder.
      – The language model encourages fluent output, the translation model encourages similarity between input and output, and the decoder searches for the target-language text with the highest probability.
      – SMT is based on ideas from Information Theory, in particular Shannon’s noisy-channel model. The purpose of this model is to recover a message that has been transmitted through a communication channel and is therefore prone to errors caused by the channel’s quality.
  • 13. Statistical Machine Translation
      – Parallel corpus C = a collection of text chunks and their translations (a byproduct of human translation).
      – Given a source sentence f, select the target sentence e:
          $\arg\max_{e \in E(f)} \{ p(e \mid f) \} = \arg\max_{e \in E(f)} \{ p(e)\, p(f \mid e) \}$
        where E(f) is the set of hypothesized translations of f.
      – p(f|e) diverges due to:
          – Word order
          – Morphology
          – Syntactic relations
          – Idiomatic ways of expression
          – Sparse datasets
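The noisy-channel selection rule above can be illustrated with a minimal Python sketch. The probability tables `LM_PROB` and `TM_PROB`, the placeholder source sentence `"src"`, and the function name `decode` are all hypothetical and chosen only for illustration; a real system such as Moses estimates these scores from corpora and searches a far larger hypothesis space.

```python
import math

# Hypothetical toy scores (not from the report): p(e) and p(f|e).
LM_PROB = {"this is a city": 0.020, "city a is this": 0.0001}
TM_PROB = {
    ("src", "this is a city"): 0.30,
    ("src", "city a is this"): 0.35,
}

def decode(f, candidates):
    """Return the candidate e that maximizes log p(e) + log p(f|e)."""
    def score(e):
        return math.log(LM_PROB[e]) + math.log(TM_PROB[(f, e)])
    return max(candidates, key=score)

best = decode("src", ["this is a city", "city a is this"])
print(best)
```

In this toy setting the translation model slightly prefers the scrambled candidate, but the language model's preference for fluent word order outweighs it, which is the division of labour between the two models that the slide describes.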
  • 15. SMT-Language Model
      – A language model gives the probability of a sentence. The probability is computed using an n-gram model. A language model can be viewed as computing the probability of a single word given all of the words that precede it in the sentence.
      – The probability of a sentence is decomposed into a product of conditional probabilities.
      – Using the chain rule, the probability of a sentence P(S) is broken down into the probabilities of its individual words P(w), each conditioned on the preceding words.
      – An n-gram model simplifies the task by approximating the probability of a word given all the previous words with its probability given only the previous n-1 words.
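As a rough illustration of the chain-rule decomposition described above, the following Python sketch estimates an unsmoothed bigram model (n = 2) by maximum likelihood. The three-sentence corpus and the function names are assumptions made for this example; the report itself uses IRSTLM, which additionally applies smoothing so that unseen n-grams do not receive zero probability.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens[:-1])                  # counts of each history word
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # counts of (previous, word)
    return unigrams, bigrams

def sentence_prob(sentence, unigrams, bigrams):
    """P(S) as a product of P(w_i | w_{i-1}) under the bigram approximation."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen history: no estimate without smoothing
        p *= bigrams[(prev, word)] / unigrams[prev]
    return p

# Hypothetical toy corpus, not the report's data.
corpus = ["udaipur is a city", "jaipur is a city", "udaipur is famous"]
unigrams, bigrams = train_bigram(corpus)
p = sentence_prob("udaipur is a city", unigrams, bigrams)
print(p)
```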
  • 17. SMT-Translation Model
      – Example sentence pair: "Udaipur is a famous city" -> "উদয়পুৰ এখন বিখ্যাত চহৰ"
      – The Translation Model helps to compute the conditional probability P(T|S).
  • 18. Implementation
      – Install all packages in Moses
          – Install GIZA++
          – Install IRSTLM
      – Training
      – Tuning
      – Generate output (decoding)
  • 19. TRAINING THE MOSES DECODER
      – Prepare data
      – Run GIZA++
      – Align words
      – Get lexical translation table
      – Extract phrases
      – Score phrases
      – Build lexicalized reordering model
      – Build generation models
      – Create configuration file
  • 20. PREPARING DATA
      – Tokenising: inserting spaces between words and punctuation.
      – Truecasing: setting the case of the first word in each sentence.
      – Cleaning: removing empty lines, redundant spaces, and lines that are too short or too long.
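The three preparation steps can be sketched in Python as follows. This is a simplified illustration, not the scripts shipped with Moses: the function names, the punctuation set, the `known_lowercase` vocabulary, and the length limits are all assumptions made for this example.

```python
import re

def tokenize(line):
    """Insert spaces around punctuation, then collapse redundant spaces."""
    line = re.sub(r'([.,!?;:()"।])', r" \1 ", line)
    return " ".join(line.split())

def truecase(line, known_lowercase):
    """Lowercase the first word only if it is normally written in lowercase."""
    tokens = line.split()
    if tokens and tokens[0].lower() in known_lowercase:
        tokens[0] = tokens[0].lower()
    return " ".join(tokens)

def clean(pairs, min_len=1, max_len=80):
    """Drop sentence pairs where either side is empty, too short, or too long."""
    kept = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if min_len <= n_src <= max_len and min_len <= n_tgt <= max_len:
            kept.append((src, tgt))
    return kept

tokens = tokenize("The city, Udaipur.")
cased = truecase("The city is famous", {"the", "is", "city"})
pairs = clean([("a b", "x"), ("", "y")])
print(tokens, cased, pairs, sep="\n")
```

Cleaning operates on sentence pairs rather than single lines so that the two sides of the parallel corpus stay aligned after filtering.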
  • 21. Sample of Parallel Corpus
      – eng-ass1.en: Shopping in Udaipur is always a delightful experience and it displays excellent handicrafts and works developed by local traders.
        eng-ass1.as: উদয়পুৰত বজাৰ কৰাটো এক আনন্দদায়ক অভিজ্ঞতা আৰু ইয়ে স্থানীয় ব্যৱসায়ীসকলৰ হস্তকলা আৰু কামৰ উত্তম নিদৰ্শন দাঙি ধৰে।
      – eng-ass1.en: September to March is the best season to visit Udaipur.
        eng-ass1.as: ছেপ্টেম্বৰৰ পৰা মাৰ্চলৈ উদয়পুৰ ভ্ৰমণৰ উপযুক্ত সময়।
      – eng-ass1.en: The Shilpagram is designed on the concept of a village with little emphasis on the modern concept.
        eng-ass1.as: এখন গাঁৱৰ পটভূমিত শিল্পগ্ৰাম অংকিত কৰা হৈছে য'ত আধুনিক ধাৰণাৰ ওপৰত গুৰুত্ব দিয়া হোৱা নাই।
      – eng-ass1.en: A part of the City Palace is now converted into a museum that displays some of the best forms of art and culture.
        eng-ass1.as: চিটি পেলেচৰ এটা অংশ এতিয়া এটা সংগ্ৰাহালয়লৈ ৰূপান্তৰিত কৰা হৈছে আৰু ইয়াত কিছুমান উন্নত কলা আৰু সংস্কৃতিৰ প্ৰদৰ্শন কৰা হৈছে।
  • 22. Sample output
      – English input: Kanak Vrindavan is a popular picnic spot in Jaipur
        Assamese output: কনক বৃন্দাবন হৈছে জয়পুৰৰ এখন জনপ্ৰিয় বনভোজ স্থান
      – English input: City Palace is a synthesis of Mughal and Rajasthani architecture
        Assamese output: চিটি পেলেচ মোগল আৰু ৰাজস্থানী স্থাপত্য বিদ্যাৰ সংমিশ্ৰণ।
      – English input: Jama Masjid is the largest mosque in India
        Assamese output: জামা মছজিদ ভাৰতৰ ভিতৰত আটাইতকৈ ডাঙৰ মছজিদ।
      – English input: A part of the City Palace is now converted into a museum
        Assamese output: চিটি পেলেচৰ এটা অংশ এতিয়া এটা সংগ্ৰাহালয়লৈ ৰূপান্তৰিত কৰা হৈছে
  • 23. Block diagram showing SMT using Moses
  • 24. Results and evaluation
      – The output of the experiment was evaluated using BLEU (Bilingual Evaluation Understudy).
          Source/Target       BLEU score
          English-Assamese    4.71
      – Translating in the Assamese-English direction with the same parallel corpus gives a BLEU score of 5.72. This score is very low, probably because we used a very small data set.
      – BLEU scores are not commensurate even between different corpora in the same translation direction. BLEU is really only comparable for different systems or system variants on the exact same data.
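For readers unfamiliar with how a BLEU score is formed, here is a minimal single-reference, sentence-level sketch in Python: the geometric mean of modified n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. It is an illustration under simplifying assumptions (one reference, no smoothing, whitespace tokenisation), not the evaluation script used for the scores above; those scores are corpus-level and reported on a 0-100 scale.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Unsmoothed single-reference BLEU in the range 0..1."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)

score = bleu("the cat sat on the mat", "the cat sat on the red mat")
print(round(100 * score, 2))
```

Because a single missing 4-gram match drives the unsmoothed sentence-level score to zero, scores on small test sets are very sensitive to the data, which is consistent with the remark that BLEU is only comparable across systems evaluated on exactly the same data.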
  • 25. Problems and proposed solution
      Problems…
      – We used only a limited amount of English-Assamese parallel corpus, so the translation model is less effective; its performance improves as more training data (parallel corpus) becomes available.
      – Translation quality is poor for OOV (out-of-vocabulary) words, because the Moses tool either ignores OOV words, leaving them untranslated, or drops them. We are trying to implement transliteration for these OOV words. OOV words are words that are not present in the corpus, such as some proper nouns.
  • 26. Problems and proposed solution
      Solution: Transliteration
      – কু-মা-ৰ -> (ku-ma-r)
      – ৰাজ কু মা ৰ -> (Raj-ku-ma-r)
      – Example: English-Assamese transliteration. To translate the sentence "panaji is a city", we used the following command to incorporate transliteration into translation:
          echo 'panaji is a city' | ~/mymoses/bin/moses -f ~/work/mert-work/moses.ini | ./space.pl | ~/mymoses/bin/moses -f ~/work1/train/model/moses.ini | ./join.pl
      – This gives the output: "পানাজিহৈছেএটাচহৰ"
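The report does not show the contents of `space.pl` and `join.pl`. The Python sketch below is one plausible reading of their roles, stated here as an assumption: the first step splits any token left untranslated (still in Latin script) into characters so that a character-level transliteration model can process it, and the last step removes the spaces again. The function names `space_out` and `join_up` are hypothetical.

```python
def space_out(line):
    """Split ASCII (untranslated) tokens into characters; keep other tokens whole."""
    out = []
    for token in line.split():
        out.append(" ".join(token) if token.isascii() else token)
    return " ".join(out)

def join_up(line):
    """Remove all spaces, mirroring the unspaced output shown in the report."""
    return "".join(line.split())

spaced = space_out("panaji \u099a\u09b9\u09f0")  # OOV word followed by an Assamese word
joined = join_up("a b c")
print(spaced)
print(joined)
```

A joiner that removes every space also merges separate words, so a more careful variant would rejoin only the characters of each transliterated token.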
  • 27. Conclusion and Future Work
      The work can be extended in the following directions:
      – The system can be deployed in a web-based portal to translate the content of a web page from English to Assamese.
      – We will try to enlarge the corpus to improve training and translation quality.
      – We will try to develop our own translation system instead of using the Moses MT system.
      – Since most Indian languages follow SOV order and are relatively rich in morphology, the methodology presented should be applicable to English to Indian language SMT in general. Since morphological and parsing tools are not widely available for Indian languages, an approach like this, which minimizes the use of such tools for the target language, would be quite handy.
  • 28. Conclusion and Future Work
      – We will try to collect more corpora from different domains so that the vocabulary coverage is as broad as possible. Since BLEU is not well suited to rough translations, we also need other evaluation techniques.
      – We should try incorporating shallow syntactic information (POS tags) into our discriminative model to boost translation performance.
  • 29. References
      – Antony P. J., “Machine Translation Approaches and Survey for Indian Languages”.
      – G. Singh and G. Singh Lehal, “A Punjabi to Hindi Machine Translation System”, Coling 2008: Companion volume - Posters and demonstrations, Manchester, August 2008.
      – F. J. Och, “GIZA++: Training of statistical translation models”, [Online]. Available: http://fjoch.com/GIZA++.html
      – Moses Manual.
      – “Natural language processing”, Wikipedia, the free encyclopedia.
      – D. D. Rao, “Machine Translation A Gentle Introduction”, RESONANCE, July 1998.
      – “Statistical machine translation”, [Online]. Available: http://en.wikipedia.org/wiki/Statistical_machine_translation
      – S. K. Dwivedi and P. P. Sukadeve, “Machine Translation System Indian Perspectives”, Journal of Computer Science, Vol. 6, No. 10, pp. 1082-1087, May 2010.
      – “Machine Translation”, [Online]. Available: http://www.ida.liu.se/~729G11/projekt/studentpapper-10/maria-hedblom.pdf
      – “Machine Translation”, [Online]. Available: http://faculty.ksu.edu.sa/homiedan/Publications/Machine%20Translation.pdf