Presented by
NIKHIL.P
MCA S4
CHINTECH
INTRODUCTION
• TRANSLATION?
  Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text.
• TRANSLITERATION?
  Transliteration is the conversion of a text from one script to another.
INTRODUCTION
• Why TRANSLATION?
  Establishing links between two languages allows resources to be transferred from one language to another.
  Books written in a foreign language we do not know can be read by translating their contents into our own language.
[Figure: Computers. Areas shown: Databases, Robotics, Artificial Intelligence, Algorithms, Natural Language Processing, Information Retrieval, Machine Translation, Networking, Search]
INTRODUCTION
• Natural Language Processing (NLP)
  NLP is a field of computer science, artificial intelligence and linguistics concerned with the interactions between computers and human (natural) languages.
• Applications of NLP
  - Machine translation
  - Database access
  - Information retrieval
Machine Translation?
• Machine translation is the automatic translation, for example by a computer system, of text from one language (the source language) into another (the target language).
Background
• Automatic machine translation was one of the first natural language processing applications developed in computer science.
• The field has explored rule-based, example-based, knowledge-based and statistical approaches.
• Statistical Machine Translation (SMT) is the preferred approach in much industrial and academic research.
• Rule-based machine translation: a system of lexical, grammatical and reordering rules is created for each source/target language pair. The rules are then applied to the source text to produce the output.
• Example-based machine translation: a bilingual text corpus is compared directly against the source text, and case-based reasoning is applied to create the output.
What is Moses?
• It is an open source toolkit for Statistical Machine Translation (SMT).
• Moses is released under the LGPL license.
• It uses standard external toolkits such as GIZA++ and SRILM.
Statistical Machine Translation?
• The goal is to produce, from a source sentence, the target sentence that maximizes the translation probability.
• A statistical MT system is modeled as three separate parts:
  - language model
  - translation model
  - decoder
• Language model (LM): assigns a probability P(e) to any target string of words e. An LM is a probability distribution over strings S that attempts to reflect how frequently a string S occurs as a sentence.
• Translation model (TM): assigns a probability P(f|e) to any pair of target string e and source string f.
• Decoder: determines the translation based on the probabilities from the LM and TM.
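How the three parts fit together can be illustrated with a minimal Python sketch: the decoder picks the target string e that maximizes P(e) * P(f|e). The candidate strings, the probabilities and the names lm, tm and decode are all made up for illustration; a real decoder searches a far larger space.

```python
import math

# Toy, made-up probabilities for illustration only.
# Language model: P(e) for a few candidate target strings.
lm = {"the house": 0.02, "house the": 0.0001, "the home": 0.005}

# Translation model: P(f | e) for the source string "das haus".
tm = {
    ("das haus", "the house"): 0.5,
    ("das haus", "house the"): 0.5,
    ("das haus", "the home"): 0.3,
}

def decode(f, candidates):
    """Return the candidate e that maximizes log P(e) + log P(f|e)."""
    def score(e):
        return math.log(lm[e]) + math.log(tm[(f, e)])
    return max(candidates, key=score)

best = decode("das haus", list(lm))
print(best)
```

Note that the translation model alone cannot separate "the house" from "house the" here; the language model breaks the tie in favour of the fluent word order.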
GIZA++
• It is used for making word alignments.
• This toolkit is an implementation of the original IBM Models that started statistical machine translation research.
• First, the language pair is aligned in both directions, for example English to German and German to English.
• This generates two word alignments, which are then combined:
  - Intersection: gives a high-precision alignment of high-confidence alignment points.
  - Union: gives a high-recall alignment with additional alignment points.
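The intersection and union of the two directional alignments can be sketched with plain Python sets. The alignment points below are hypothetical; each pair is (source word index, target word index), and this is not GIZA++'s own output format.

```python
# Hypothetical alignment points from the two directional runs.
src_to_tgt = {(0, 0), (1, 1), (2, 3)}
tgt_to_src = {(0, 0), (1, 1), (2, 2)}

# Points both runs agree on: fewer points, but high confidence.
intersection = src_to_tgt & tgt_to_src

# Points proposed by either run: more coverage, less certainty.
union = src_to_tgt | tgt_to_src

print(sorted(intersection))
print(sorted(union))
```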
SRILM
• It is used for language modeling.
• It consists of the following components:
  - A set of C++ class libraries implementing language models, supporting data structures and miscellaneous utility functions.
  - A set of executable programs built on top of these libraries to perform standard tasks such as training LMs and testing them on data.
  - A collection of miscellaneous scripts facilitating minor related tasks.
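To make "training LMs and testing them on data" concrete, here is a minimal Python sketch of an unsmoothed bigram language model. It does not use SRILM or its API; the function names and the two training sentences are assumptions for illustration, and real toolkits add smoothing and higher-order n-grams.

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Count histories and bigrams, with <s> and </s> as sentence boundaries."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sentence_prob(sentence, unigrams, bigrams):
    """Unsmoothed maximum-likelihood probability of a sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0                            # unseen history
        prob *= bigrams[(prev, word)] / unigrams[prev]
    return prob

unigrams, bigrams = train_bigram_lm(["the house is small", "the house is big"])
p_fluent = sentence_prob("the house is small", unigrams, bigrams)
p_scrambled = sentence_prob("house the is small", unigrams, bigrams)
```

The fluent sentence receives a non-zero probability, while the scrambled word order receives zero because the bigram "<s> house" was never seen in training.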
Moses Translation Process
• It involves:
  - Segmenting the source sentence into source phrases
  - Translating each source phrase into a target phrase
  - Optionally reordering the target phrases into a target sentence
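The first two steps above can be sketched in Python as a greedy, left-to-right segmentation with no reordering. This is a deliberate simplification: the toy phrase table and the function translate_monotone are hypothetical, and Moses itself considers many competing segmentations rather than one greedy choice.

```python
# Hypothetical toy phrase table: source phrase -> target phrase.
phrase_table = {
    ("das",): "the",
    ("haus",): "house",
    ("ist",): "is",
    ("klein",): "small",
    ("das", "ist"): "this is",
}

def translate_monotone(words, table, max_len=3):
    """Greedily segment left to right, preferring longer phrases; no reordering."""
    out, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in table:
                out.append(table[phrase])
                i += n
                break
        else:
            out.append(words[i])   # unknown word: pass it through unchanged
            i += 1
    return " ".join(out)

print(translate_monotone("das haus ist klein".split(), phrase_table))
print(translate_monotone("das ist klein".split(), phrase_table))
```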
Moses Toolkit
• Consists of all the components needed to preprocess data and to train the language models and translation models.
• Also contains tools for tuning these models using minimum error rate training (MERT).
• Uses external tools such as GIZA++ and SRILM.
Moses Toolkit
• The decoder is the core component of Moses.
• A phrase-based decoder is used.
• The job of the decoder is to find the highest-scoring sentence in the target language corresponding to a given source sentence.
• It can also output a ranked list of translation candidates.
• Principles used when developing the Moses decoder:
  - Accessibility
  - Easy to maintain
  - Flexibility
  - Easy for distributed team development
  - Portability
• It was developed in C++ for efficiency and follows a modular, object-oriented design.
• The decoding process can be varied in several ways:
  - Input: can be a plain sentence
  - Translation model
  - Decoding algorithm
  - Language model
• Contributed tools
  - Moses Server: provides an XML-RPC interface to the decoder
  - Web translation: a set of scripts to translate web pages
  - Analysis tools: scripts to enable the analysis and visualization of Moses output
Moses Decoder
A simple translation model

Contains two files:
• phrase-table (the phrase translation table), with entries such as
  der ||| the ||| 0.3 ||| |||
• moses.ini (the configuration file)

• Phrase table:
  The phrase translation tables are the main knowledge source for the machine translation decoder.
  The entry above means that the probability of translating the German word der into the English word the is 0.3.
• Configuration file:
  The decoder is controlled by the Moses configuration file moses.ini. The translation model files and language model files are specified here.
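The layout of a phrase-table entry can be illustrated by a small Python sketch that splits the example line on the ||| separator. The function name is hypothetical, and the sketch only reads the first three fields (source phrase, target phrase, scores); it is not how Moses itself loads the table.

```python
def parse_phrase_table_line(line):
    """Split one phrase-table entry into source phrase, target phrase and scores."""
    fields = [field.strip() for field in line.split("|||")]
    source, target = fields[0], fields[1]
    scores = [float(s) for s in fields[2].split()]
    return source, target, scores

entry = parse_phrase_table_line("der ||| the ||| 0.3 ||| |||")
print(entry)
```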
Moses Decoder
Trace
This option reveals which phrase translations were used in the best translation found by the decoder.
Moses Decoder
Tuning for Quality
The probability cost is assigned by four models:
• Phrase translation table, phi(f|e): ensures that the source and target phrases are good translations of each other.
• Language model, LM(e): ensures that the output is fluent in the target language.
• Reordering model, D(e,f): allows for the reordering of the input sentence.
• Word penalty, W(e): ensures that the translations do not get too long or too short.
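One common way to combine four such models is a weighted sum of their log-scores, and tuning then means adjusting the weights. The Python sketch below assumes that form; the function name, the weight values and the sign conventions are illustrative assumptions, not Moses's actual settings.

```python
import math

def hypothesis_score(phrase_prob, lm_prob, distortion, num_words, weights):
    """Weighted sum of log-scores from the four models (higher is better)."""
    return (weights["tm"] * math.log(phrase_prob)   # phi(f|e)
            + weights["lm"] * math.log(lm_prob)     # LM(e)
            - weights["d"] * distortion             # D(e,f): cost per skipped word
            - weights["w"] * num_words)             # W(e): cost per output word

# Hypothetical weights; with w = 0 the word penalty has no effect.
weights = {"tm": 1.0, "lm": 1.0, "d": 1.0, "w": 0.0}
score = hypothesis_score(0.3, 0.1, distortion=2, num_words=3, weights=weights)
```

Raising the weight of one model makes the decoder listen to it more: a larger "lm" weight favours fluent output, while a larger "d" weight discourages reordering.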
Moses Decoder
Tuning for Speed
Speed-ups are achieved by limiting the search space of the decoder:
• Translation table size
• Hypothesis stack size

• Translation table size
  One strategy is to reduce the number of translation options used for each input phrase, i.e., the number of table entries that are retrieved.
  There are two ways to limit the table size:
  I. a fixed limit on the number of translation options retrieved
  II. a minimum value that the phrase translation probability has to be above
• Hypothesis stack size
  Another way to reduce the search space is to reduce the size of the hypothesis stacks. For each number of foreign words translated, the decoder keeps a stack of the best translations.
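Both kinds of pruning can be sketched in a few lines of Python. The function and parameter names (prune_options, table_limit, min_prob, prune_stack) are hypothetical and are not Moses option names; the probabilities are made up.

```python
def prune_options(options, table_limit=None, min_prob=None):
    """Limit the translation options kept for one input phrase.

    options: list of (target_phrase, probability) pairs.
    """
    kept = sorted(options, key=lambda o: o[1], reverse=True)
    if min_prob is not None:
        kept = [o for o in kept if o[1] >= min_prob]   # probability threshold
    if table_limit is not None:
        kept = kept[:table_limit]                      # fixed limit
    return kept

def prune_stack(stack, stack_size):
    """Keep only the best hypotheses; each hypothesis is (score, output)."""
    return sorted(stack, key=lambda h: h[0], reverse=True)[:stack_size]

options = [("the", 0.3), ("that", 0.15), ("which", 0.05), ("who", 0.01)]
print(prune_options(options, table_limit=2))
print(prune_options(options, min_prob=0.05))
print(prune_stack([(-2.0, "a"), (-1.0, "b"), (-3.0, "c")], stack_size=2))
```

Tighter limits make decoding faster but increase the risk that the best translation is pruned away before it can be completed.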
Moses Decoder
Limit on Distortion
• Reordering cost is measured by the number of words skipped when foreign phrases are picked out of order.
• The reordering cost is included in the score used to find the most probable translation.
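Counting skipped words can be sketched in Python as follows. The sketch assumes the usual distance-based formulation, in which each phrase contributes |start - previous_end - 1|; the span representation and the function name are illustrative choices.

```python
def distortion_cost(spans):
    """Total number of source words skipped between consecutive phrases.

    spans: (start, end) source positions, inclusive, in the order
    in which the phrases are translated.
    """
    cost, prev_end = 0, -1
    for start, end in spans:
        cost += abs(start - prev_end - 1)   # 0 when the phrase follows directly
        prev_end = end
    return cost

monotone = distortion_cost([(0, 1), (2, 2), (3, 4)])
reordered = distortion_cost([(0, 0), (3, 4), (1, 2)])
```

Translating the phrases in source order costs nothing, while jumping ahead to words 3-4 and then back to words 1-2 costs 2 + 4 = 6.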
Moses Decoder
Decoding Algorithm
• The decoder uses a beam search algorithm.
• The output sentence is generated left to right in the form of hypotheses.
• The final states in the search are hypotheses that cover all foreign words.
• Beam search
  - An efficient search algorithm that quickly finds a high-probability translation among the exponential number of choices (it is not guaranteed to find the best one).
  - The search through the space of generated hypotheses keeps, at each node, a list of the best translations for that node.
  - The score of a translation is computed from the weights of the individual phrases that make it up and the overall LM probability of their combination.
  - The scores are computed by querying the standard Moses phrase table and the LM for the target language.
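A much-simplified beam search over hypothesis stacks might look like the Python sketch below. It is monotone (no reordering) and uses a unigram language model, both of which are simplifying assumptions; the toy tables PHRASES and UNIGRAM_LM and all probabilities are made up, and the real Moses decoder is considerably more elaborate.

```python
import math

# Hypothetical toy models; all numbers are made up.
PHRASES = {
    ("das",): [("the", 0.6), ("that", 0.4)],
    ("haus",): [("house", 0.8), ("home", 0.2)],
    ("das", "haus"): [("the house", 0.7)],
}
UNIGRAM_LM = {"the": 0.3, "that": 0.1, "house": 0.2, "home": 0.1}

def lm_logprob(text):
    """Unigram LM log-probability; unknown words get a tiny probability."""
    return sum(math.log(UNIGRAM_LM.get(w, 1e-6)) for w in text.split())

def beam_search(source, stack_size=2, max_phrase_len=2):
    """stacks[k] holds hypotheses (log score, output) covering k source words."""
    stacks = [[] for _ in range(len(source) + 1)]
    stacks[0].append((0.0, ""))
    for k in range(len(source)):
        # Beam pruning: expand only the best hypotheses of this stack.
        stacks[k] = sorted(stacks[k], reverse=True)[:stack_size]
        for score, output in stacks[k]:
            for n in range(1, max_phrase_len + 1):
                phrase = tuple(source[k:k + n])
                if len(phrase) < n:
                    break                       # ran past the end of the input
                for target, prob in PHRASES.get(phrase, []):
                    new_score = score + math.log(prob) + lm_logprob(target)
                    new_output = (output + " " + target).strip()
                    stacks[k + n].append((new_score, new_output))
    final = stacks[len(source)]                 # hypotheses covering all words
    return max(final) if final else None

best_score, best_output = beam_search(["das", "haus"])
print(best_output)
```

The last stack contains the hypotheses that cover every source word, matching the final states described above; a smaller stack_size trades accuracy for speed.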
Language Models
• The decoder works with the following language models:
  - SRI language model (SRILM)
  - IRST language model (IRSTLM)
  - RandLM
  - KenLM (included by default in Moses)
Translating Webpages with Moses
• Moses servers are installed on one or several computers.
• On each Moses server, a daemon (daemon.pl) accepts network connections on a given port and copies everything it receives from the connection to Moses.
• A separate web server runs Apache or any other web server software.
• Through the web server, the CGI scripts (index.cgi, translate.cgi) are served to clients.
• A client requests index.cgi via the web server, and a form containing a text box for entering the URL is served back.
• The form is submitted to translate.cgi, which does the job:
  - fetches the page from the web
  - extracts the plain text from it
  - sends the text to a Moses server
  - inserts the translation back into the document and returns it to the client
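The "extract the text, translate it, insert it back" part of this pipeline can be sketched with Python's standard html.parser module. This is not the translate.cgi implementation: the class and function names are hypothetical, the call to a Moses server is replaced by a stand-in function, and comments and doctype declarations are dropped for brevity.

```python
from html.parser import HTMLParser

class TextTranslator(HTMLParser):
    """Rebuild an HTML document, passing each text node through `translate`."""

    def __init__(self, translate):
        super().__init__()
        self.translate = translate
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())   # keep the tag as written

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())   # e.g. <br/>

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Only non-blank text is sent to the translator.
        self.parts.append(self.translate(data) if data.strip() else data)

def translate_html(html, translate):
    parser = TextTranslator(translate)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

# Stand-in for a request to a Moses server: upper-casing instead of translating.
page = translate_html("<p>hello <b>world</b></p>", str.upper)
print(page)
```

A fuller version would also skip the contents of script and style elements and re-escape special characters in the translated text.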
Setting up MOSES server
• Choose machines for the Moses servers
  Running Moses is a slow and expensive process, so the machine used must have a fast processor and as many GB of memory as possible.
• Install Moses
  On each Moses server, install and configure Moses for the language pair that we wish to use.
• Install daemon.pl
  Open bin/daemon.pl and edit the $MOSES and $MOSES_INI paths to point to the location of the moses binary and the moses configuration file.
• Choose a port number
  Pick any port number between 1024 and 49151 for the daemon process to listen on.
• Start the daemon
  To activate the Moses server, type in a shell on the server:
  ./daemon.pl <hostname> <port>
  where hostname is the name of the host where Moses is installed and port is the selected port.
• Configure the web server to connect to the Moses server
  The final step is to tell the front-end web server where to find the back-end Moses server. In the translate.cgi script, set the @MOSES_ADDRESS array to the list of hostname:port strings identifying the Moses servers.
[Figure: comparison with Pharaoh and Phramer on a French-to-English (fr-en) translation of 2000 sentences]
Installing Moses
• Install Boost:
  sudo apt-get install libboost-all-dev
• Get the source code:
  git clone https://github.com/moses-smt/mosesdecoder.git
Installing GIZA++
  wget http://giza-pp.googlecode.com/files/giza-pp-v1.0.7.tar.gz
  tar xzvf giza-pp-v1.0.7.tar.gz
  cd giza-pp
  make
  cd ~/mosesdecoder
  mkdir tools
  cp ~/giza-pp/GIZA++-v2/GIZA++ ~/giza-pp/GIZA++-v2/snt2cooc.out ~/giza-pp/mkcls-v2/mkcls tools
Installing IRSTLM
  tar zxvf irstlm-5.80.01.tgz
  cd irstlm-5.80.01
  ./regenerate-makefiles.sh
  ./configure --prefix=$HOME/irstlm
  make install
Moses Platform
• The primary development platform for Moses is Linux.
• Linux is also the recommended platform, since it is easier to get support for it.
• However, Moses also works on other platforms.
Moses Releases
• Moses 1.0 (28th Jan 2013)
• Moses 0.91 (12th Oct 2012)
Importance of Moses
• Moses is installable software, unlike online-only translation systems.
• Online systems cannot be trained on our own data.
• There is also a privacy problem with online systems if we have to translate sensitive information.
Conclusion
Moses is an open source toolkit, so users can modify and customize it according to their needs and requirements.
Reference
• www.statmt.org/moses/
• www.crosslang.com/en/machine-translation/custom-built-mt-engines/moses-smt
Questions??

More Related Content

What's hot

Les « copeaux de coco » : origine, propriétés, applications et valorisation f...
Les « copeaux de coco » : origine, propriétés, applications et valorisation f...Les « copeaux de coco » : origine, propriétés, applications et valorisation f...
Les « copeaux de coco » : origine, propriétés, applications et valorisation f...idealconnaissances
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problemJaeHo Jang
 
Lecture1: What is interpreting?
Lecture1: What is interpreting?Lecture1: What is interpreting?
Lecture1: What is interpreting?Trang Tran
 
07 Specialised Translation #3 Legal Translation
07 Specialised Translation #3 Legal Translation07 Specialised Translation #3 Legal Translation
07 Specialised Translation #3 Legal TranslationOlga Łabendowicz
 
Natural Language Processing: Parsing
Natural Language Processing: ParsingNatural Language Processing: Parsing
Natural Language Processing: ParsingRushdi Shams
 
Projet reseau-de-kherfallah-ipm-2010-2011
Projet reseau-de-kherfallah-ipm-2010-2011Projet reseau-de-kherfallah-ipm-2010-2011
Projet reseau-de-kherfallah-ipm-2010-2011Boubaker KHERFALLAH
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine TranslationJaganadh Gopinadhan
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translationMarcis Pinnis
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversionankit_saluja
 
Translation vs. Interpretation
Translation vs. Interpretation Translation vs. Interpretation
Translation vs. Interpretation Rolando Tellez
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processinggulshan kumar
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Databricks
 
Inconsistent Speech Disorder and The Core Vocabulary Approach
Inconsistent Speech Disorder and The Core Vocabulary ApproachInconsistent Speech Disorder and The Core Vocabulary Approach
Inconsistent Speech Disorder and The Core Vocabulary Approachkmbrlyslp
 
Machine Translation: What it is?
Machine Translation: What it is?Machine Translation: What it is?
Machine Translation: What it is?Multilizer
 
Interpretation vs. translation
Interpretation vs. translationInterpretation vs. translation
Interpretation vs. translationErika Sandoval
 

What's hot (20)

Les « copeaux de coco » : origine, propriétés, applications et valorisation f...
Les « copeaux de coco » : origine, propriétés, applications et valorisation f...Les « copeaux de coco » : origine, propriétés, applications et valorisation f...
Les « copeaux de coco » : origine, propriétés, applications et valorisation f...
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
 
Lecture1: What is interpreting?
Lecture1: What is interpreting?Lecture1: What is interpreting?
Lecture1: What is interpreting?
 
07 Specialised Translation #3 Legal Translation
07 Specialised Translation #3 Legal Translation07 Specialised Translation #3 Legal Translation
07 Specialised Translation #3 Legal Translation
 
Natural Language Processing: Parsing
Natural Language Processing: ParsingNatural Language Processing: Parsing
Natural Language Processing: Parsing
 
Projet reseau-de-kherfallah-ipm-2010-2011
Projet reseau-de-kherfallah-ipm-2010-2011Projet reseau-de-kherfallah-ipm-2010-2011
Projet reseau-de-kherfallah-ipm-2010-2011
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Translation vs. Interpretation
Translation vs. Interpretation Translation vs. Interpretation
Translation vs. Interpretation
 
Machine translation
Machine translationMachine translation
Machine translation
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
 
gpt3_presentation.pdf
gpt3_presentation.pdfgpt3_presentation.pdf
gpt3_presentation.pdf
 
Td1 gsm
Td1 gsmTd1 gsm
Td1 gsm
 
Technical Translation
Technical TranslationTechnical Translation
Technical Translation
 
Inconsistent Speech Disorder and The Core Vocabulary Approach
Inconsistent Speech Disorder and The Core Vocabulary ApproachInconsistent Speech Disorder and The Core Vocabulary Approach
Inconsistent Speech Disorder and The Core Vocabulary Approach
 
Machine Translation: What it is?
Machine Translation: What it is?Machine Translation: What it is?
Machine Translation: What it is?
 
Interpretation vs. translation
Interpretation vs. translationInterpretation vs. translation
Interpretation vs. translation
 
Umts
UmtsUmts
Umts
 

Viewers also liked (20)

Intro to trans 350 lecture 1
Intro to trans 350 lecture 1Intro to trans 350 lecture 1
Intro to trans 350 lecture 1
 
The Story of Moses
The Story of MosesThe Story of Moses
The Story of Moses
 
SMT3
SMT3SMT3
SMT3
 
December 14,2014 Pass The Test of Offering for God's Great Blessings
December 14,2014 Pass The Test of Offering for God's Great BlessingsDecember 14,2014 Pass The Test of Offering for God's Great Blessings
December 14,2014 Pass The Test of Offering for God's Great Blessings
 
MOSE Project
MOSE ProjectMOSE Project
MOSE Project
 
Territories of urban design
Territories of urban designTerritories of urban design
Territories of urban design
 
Robert moses
Robert mosesRobert moses
Robert moses
 
Isaiah: 'The Song of Moses and the Lamb
Isaiah:  'The Song of Moses and the LambIsaiah:  'The Song of Moses and the Lamb
Isaiah: 'The Song of Moses and the Lamb
 
210 Moses course WH
210 Moses course WH210 Moses course WH
210 Moses course WH
 
Heroes of Faith
Heroes of FaithHeroes of Faith
Heroes of Faith
 
Joseph the Dreamer
Joseph the DreamerJoseph the Dreamer
Joseph the Dreamer
 
Moses Presentation (religion grade 11)
Moses Presentation (religion grade 11)Moses Presentation (religion grade 11)
Moses Presentation (religion grade 11)
 
Rem koolhass
Rem  koolhassRem  koolhass
Rem koolhass
 
Storia di Mosè
Storia di MosèStoria di Mosè
Storia di Mosè
 
Rem Koolhaas
Rem KoolhaasRem Koolhaas
Rem Koolhaas
 
Rem koolhaas
Rem koolhaasRem koolhaas
Rem koolhaas
 
Seattle public library
Seattle public librarySeattle public library
Seattle public library
 
Moses
MosesMoses
Moses
 
Peckham Library Case Study
Peckham Library Case StudyPeckham Library Case Study
Peckham Library Case Study
 
Rem Koolhaas –designing the design process
Rem Koolhaas –designing the design processRem Koolhaas –designing the design process
Rem Koolhaas –designing the design process
 

Similar to Moses

Compiler design Introduction
Compiler design IntroductionCompiler design Introduction
Compiler design IntroductionAman Sharma
 
Compiler_Lecture1.pdf
Compiler_Lecture1.pdfCompiler_Lecture1.pdf
Compiler_Lecture1.pdfAkarTaher
 
2 Programming Language.pdf
2 Programming Language.pdf2 Programming Language.pdf
2 Programming Language.pdfKINGZzofYouTube
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overviewamudha arul
 
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGESOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGEIJCI JOURNAL
 
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGESOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGEIJCI JOURNAL
 
Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfDrIsikoIsaac
 
Lecture1 compilers
Lecture1 compilersLecture1 compilers
Lecture1 compilersAftab Ahmad
 
Chapter One
Chapter OneChapter One
Chapter Onebolovv
 
Lecture 1 introduction to language processors
Lecture 1  introduction to language processorsLecture 1  introduction to language processors
Lecture 1 introduction to language processorsRebaz Najeeb
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Sheeyam Shellvacumar
 
Introduction to compiler development
Introduction to compiler developmentIntroduction to compiler development
Introduction to compiler developmentDeepOad
 
Compiler Design Introduction
Compiler Design Introduction Compiler Design Introduction
Compiler Design Introduction Thapar Institute
 
Language translators
Language translatorsLanguage translators
Language translatorsAditya Sharat
 
Zerfass trends in translation technologies
Zerfass trends in translation technologiesZerfass trends in translation technologies
Zerfass trends in translation technologiesascetlan
 
compiler construction tool in computer science .
compiler construction tool in computer science .compiler construction tool in computer science .
compiler construction tool in computer science .RanitHalder
 

Similar to Moses (20)

Compiler design Introduction
Compiler design IntroductionCompiler design Introduction
Compiler design Introduction
 
How to Translate from English to Khmer using Moses
How to Translate from English to Khmer using MosesHow to Translate from English to Khmer using Moses
How to Translate from English to Khmer using Moses
 
Compiler_Lecture1.pdf
Compiler_Lecture1.pdfCompiler_Lecture1.pdf
Compiler_Lecture1.pdf
 
2 Programming Language.pdf
2 Programming Language.pdf2 Programming Language.pdf
2 Programming Language.pdf
 
Compiler an overview
Compiler  an overviewCompiler  an overview
Compiler an overview
 
3.2
3.23.2
3.2
 
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGESOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
 
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGESOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE
 
Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdf
 
Lecture1 compilers
Lecture1 compilersLecture1 compilers
Lecture1 compilers
 
Chapter One
Chapter OneChapter One
Chapter One
 
Lecture 1 introduction to language processors
Lecture 1  introduction to language processorsLecture 1  introduction to language processors
Lecture 1 introduction to language processors
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.
 
Introduction to compiler development
Introduction to compiler developmentIntroduction to compiler development
Introduction to compiler development
 
Chapter#01 cc
Chapter#01 ccChapter#01 cc
Chapter#01 cc
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
 
Compiler Design Introduction
Compiler Design Introduction Compiler Design Introduction
Compiler Design Introduction
 
Language translators
Language translatorsLanguage translators
Language translators
 
Zerfass trends in translation technologies
Zerfass trends in translation technologiesZerfass trends in translation technologies
Zerfass trends in translation technologies
 
compiler construction tool in computer science .
compiler construction tool in computer science .compiler construction tool in computer science .
compiler construction tool in computer science .
 

Recently uploaded

Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 

Recently uploaded (20)

Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
How to Create and Manage Wizard in Odoo 17

Moses

  • 3. INTRODUCTION: What is translation? Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text. What is transliteration? It is the conversion of a text from one script to another.
  • 4. INTRODUCTION: Why translation? Being able to establish links between two languages allows resources to be transferred from one language to another. For example, books written in an unfamiliar foreign language can be read by translating their contents into our own language.
  • 5. Computers, Databases, Robotics, Artificial Intelligence, Algorithms, Natural Language Processing, Information Retrieval, Machine Translation, Networking, Search
  • 6. INTRODUCTION: Natural Language Processing (NLP). NLP is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. Applications of NLP include machine translation, database access, and information retrieval.
  • 7. What is machine translation? Machine translation is the automatic translation of text, for example by a computer system, from one language (the source language) into another (the target language).
  • 8. Background: Machine translation was one of the first natural language processing applications developed in computer science. Research has explored rule-based, example-based, knowledge-based, and statistical approaches. Statistical Machine Translation (SMT) is the preferred approach in much industrial and academic research.
  • 9. Rule-based machine translation: a system of lexical, grammatical, and reordering rules is created for each source/target language pair. The rules are then applied to the source text to produce the output. Example-based machine translation: a bilingual text corpus is used directly for comparison against the source text, and case-based reasoning is applied to create the output.
  • 10. What is Moses? It is an open-source toolkit for Statistical Machine Translation (SMT). Moses is released under the LGPL license. It uses standard external toolkits such as GIZA++ and SRILM.
  • 11. What is statistical machine translation? The goal is to produce, from a source sentence f, the target sentence e that maximizes the probability P(e|f). A statistical MT system is modeled as three separate parts: a language model, a translation model, and a decoder.
  • 12. Language model (LM): assigns a probability P(e) to any target string of words. An LM is a probability distribution over strings S that attempts to reflect how frequently a string S occurs as a sentence.
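To make the idea of a language model concrete, here is a minimal sketch of a bigram model with add-one smoothing. It is an illustration only, not how SRILM or KenLM are implemented; the function names and the two-sentence training corpus are hypothetical.

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Count bigrams and their histories from whitespace-tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(sentence.split())
        histories.update(tokens[:-1])            # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return histories, bigrams, len(vocab)

def sentence_prob(sentence, model):
    """P(sentence) as a product of add-one smoothed bigram probabilities."""
    histories, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size)
    return prob

# Hypothetical toy corpus, invented for illustration.
model = train_bigram_lm(["the house is small", "the house is big"])
fluent = sentence_prob("the house is small", model)
scrambled = sentence_prob("small is house the", model)
print(fluent > scrambled)
```

The fluent word order receives the higher probability, which is the property the decoder relies on when it uses the LM to prefer fluent output.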
  • 13. Translation model (TM): assigns a probability P(f|e) to any pair of target and source strings. Decoder: determines the translation based on the probabilities from the LM and TM.
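The interplay of the three parts can be sketched as follows: the decoder picks the candidate e with the highest P(e) · P(f|e). This is a toy illustration under stated assumptions: the probability tables are invented, and a real decoder searches a very large candidate space rather than scoring a fixed list.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
LANGUAGE_MODEL = {"the house": 0.6, "house the": 0.1}   # P(e)
TRANSLATION_MODEL = {                                    # P(f|e)
    ("das Haus", "the house"): 0.5,
    ("das Haus", "house the"): 0.5,
}

def decode(source, candidates):
    """Return the candidate e maximizing log P(e) + log P(f|e)."""
    def score(candidate):
        return (math.log(LANGUAGE_MODEL[candidate])
                + math.log(TRANSLATION_MODEL[(source, candidate)]))
    return max(candidates, key=score)

print(decode("das Haus", ["the house", "house the"]))
```

Both candidates are equally good according to the translation model here, so the language model decides in favor of the fluent word order.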
  • 14. GIZA++: It is used for making word alignments. This toolkit is an implementation of the original IBM Models that started statistical machine translation research.
  • 16. First the language pair is aligned bidirectionally, for example English to German and German to English. This generates two word alignments, which are then combined. Intersection: gives a high-precision alignment of high-confidence alignment points. Union: gives a high-recall alignment with additional alignment points.
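The intersection and union of the two directional alignments can be sketched with plain set operations. The function name and the index pairs below are hypothetical; in practice the alignment points would come from two GIZA++ runs.

```python
def symmetrize(src_to_tgt, tgt_to_src):
    """Combine two directional word alignments.

    src_to_tgt holds (source_index, target_index) pairs;
    tgt_to_src holds (target_index, source_index) pairs.
    Returns (intersection, union) as sets of (source_index, target_index).
    """
    forward = set(src_to_tgt)
    backward = {(src, tgt) for (tgt, src) in tgt_to_src}
    return forward & backward, forward | backward

# Hypothetical alignment points for a three-word sentence pair.
english_to_german = {(0, 0), (1, 1), (2, 1)}
german_to_english = {(0, 0), (1, 1), (2, 2)}
intersection, union = symmetrize(english_to_german, german_to_english)
print(sorted(intersection))  # points both directions agree on
print(sorted(union))         # points proposed by either direction
```

The intersection keeps only the points confirmed in both directions (high precision), while the union keeps every proposed point (high recall).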
  • 17. SRILM: It is used for language modeling. It consists of the following components: a set of C++ class libraries implementing language models, supporting data structures, and miscellaneous utility functions; a set of executable programs built on top of these libraries to perform standard tasks such as training LMs and testing them on data; and a collection of miscellaneous scripts facilitating minor related tasks.
  • 18. Moses Translation Process: It involves segmenting the source sentence into source phrases, translating each source phrase into a target phrase, and optionally reordering the target phrases into a target sentence.
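A much simplified sketch of the first two steps, segmentation and phrase translation, is shown below. It assumes a greedy longest-match segmentation and no reordering, which is not what Moses does (the decoder searches over many segmentations); the phrase table is invented for illustration.

```python
# Hypothetical toy phrase table, invented for illustration.
PHRASE_TABLE = {
    "das ist": "this is",
    "ein": "a",
    "kleines haus": "small house",
    "haus": "house",
}

def segment(words, table, max_len=3):
    """Greedily split words into the longest phrases found in the table."""
    phrases, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # Fall back to a single word when no known phrase starts here.
            if candidate in table or n == 1:
                phrases.append(candidate)
                i += n
                break
    return phrases

def translate(sentence, table=PHRASE_TABLE):
    """Translate phrase by phrase in source order; unknown words pass through."""
    phrases = segment(sentence.lower().split(), table)
    return " ".join(table.get(phrase, phrase) for phrase in phrases)

print(translate("das ist ein kleines Haus"))
```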
  • 19. Moses Toolkit: Consists of all the components needed to preprocess data and to train the language models and the translation models. It also contains tools for tuning these models using minimum error rate training (MERT). It relies on external tools such as GIZA++ and SRILM.
  • 20. Moses Toolkit: The decoder is the core component of Moses. A phrase-based decoder is used. The job of the decoder is to find the highest-scoring sentence in the target language corresponding to a given source sentence. It can also output a ranked list of translation candidates.
  • 21. Principles used when developing the Moses decoder: accessibility, ease of maintenance, flexibility, ease of distributed team development, and portability. It was developed in C++ for efficiency and follows a modular, object-oriented design.
  • 22. The decoding process can be varied in several ways: the input (which can be a plain sentence), the translation model, the decoding algorithm, and the language model.
  • 23. Contributed Tools: Moses Server provides an XML-RPC interface to the decoder. Web translation is a set of scripts to translate web pages. Analysis tools are scripts to enable the analysis and visualization of Moses output.
  • 24. Moses Decoder: A simple translation model consists of two files: the phrase table (phrase translation table), with entries such as {der ||| the ||| 0.3 ||| |||}, and moses.ini (the configuration file). The decoder is controlled by moses.ini.
  • 25. Phrase table: The phrase translation tables are the main knowledge source for the machine translation decoder. The example entry means that the probability of translating the English word "the" from the German "der" is 0.3.
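A phrase table entry of the form shown above can be read with a few lines of code. This is a minimal sketch that assumes only the first three `|||`-separated fields matter (source phrase, target phrase, scores); the function name is hypothetical and real phrase tables may carry further fields.

```python
def parse_phrase_table_line(line):
    """Split one phrase table entry into source, target, and a list of scores."""
    fields = [field.strip() for field in line.split("|||")]
    source, target = fields[0], fields[1]
    scores = [float(value) for value in fields[2].split()]
    return source, target, scores

source, target, scores = parse_phrase_table_line("der ||| the ||| 0.3 ||| |||")
print(source, target, scores)
```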
  • 26. Configuration file: The decoder is controlled by the Moses configuration file, moses.ini. The translation model files and language model files are specified here.
  • 27. Moses Decoder Trace: This option reveals which phrase translations were used in the best translation found by the decoder.
  • 28. Moses Decoder Tuning for Quality: The probability cost is assigned by four models. The phrase translation table (phi(f|e)) ensures that the source and target language phrases are good translations of each other. The language model (LM(e)) ensures that the output is fluent target language.
  • 29. The reordering model (D(e,f)) allows for the reordering of the input sentence. The word penalty (W(e)) ensures that the translations do not get too long or too short.
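One common way to combine such models is a weighted sum of log scores, where tuning adjusts the weights. The sketch below assumes that form; the feature names, the weight values, and the use of raw distortion and length counts as penalties are illustrative assumptions, not the exact feature set of Moses.

```python
import math

def model_score(phrase_prob, lm_prob, distortion, length, weights):
    """Weighted sum of four feature scores (higher is better)."""
    features = {
        "tm": math.log(phrase_prob),  # phrase translation table, phi(f|e)
        "lm": math.log(lm_prob),      # language model, LM(e)
        "d": -float(distortion),      # reordering model, D(e,f), as a penalty
        "w": -float(length),          # word penalty, W(e)
    }
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical weights; in practice they would be set by tuning.
weights = {"tm": 1.0, "lm": 1.0, "d": 1.0, "w": 1.0}
monotone = model_score(0.3, 0.01, distortion=0, length=2, weights=weights)
reordered = model_score(0.3, 0.01, distortion=3, length=2, weights=weights)
print(monotone > reordered)
```

With a negative word penalty weight the same formula would favor longer output, which is how a single feature can guard against translations that are either too long or too short.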
  • 30. Moses Decoder Tuning for Speed speed-ups are achieved by limiting the search space of the decoder • Translation table size • Hypothesis stack size
  • 31. Translation table size   one strategy is to reduce the number of translation options used for each input phrase , i.e., number of table entries that are retrieved. two ways to limit table size I. II. fixed limits on translation options retrieved phrase translation probability has to above some value
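Both ways of limiting the translation options can be sketched in one small function. The function name, parameter names, and example options are hypothetical and do not correspond to actual Moses switches.

```python
def prune_options(options, limit=None, threshold=None):
    """Keep the most probable translation options for one input phrase.

    options is a list of (target_phrase, probability) pairs.
    threshold drops options whose probability is below the given value;
    limit keeps at most that many of the remaining options.
    """
    kept = sorted(options, key=lambda option: option[1], reverse=True)
    if threshold is not None:
        kept = [option for option in kept if option[1] >= threshold]
    if limit is not None:
        kept = kept[:limit]
    return kept

# Hypothetical options for one German input phrase.
options = [("this", 0.1), ("the", 0.3), ("of the", 0.01), ("that", 0.05)]
print(prune_options(options, limit=2))
print(prune_options(options, threshold=0.05))
```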
  • 32. Hypothesis stack size: Another way to reduce the search space is to reduce the size of the hypothesis stacks. For each number of foreign words translated, the decoder keeps a stack of the best translations.
  • 33. Moses Decoder Limit on Distortion: Reordering cost is measured by the number of words skipped when foreign phrases are picked out of order. The reordering cost contributes to the overall score used to find the best translation.
  • 36. Decoding Algorithm: The decoder uses a beam search algorithm. The output sentence is generated left to right in the form of hypotheses. Final states in the search are hypotheses that cover all foreign words.
  • 37. Beam Search: An efficient search algorithm that quickly finds a high-probability translation among the exponential number of choices. The search through the space of generated hypotheses is performed using beam search, which keeps at each node a list of the best translations for that node.
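A strongly simplified stack decoder illustrates the idea. This sketch assumes monotone decoding (no reordering), no language model, and a toy phrase table invented for illustration, so it shows the stack organization and pruning only, not the actual Moses search.

```python
import math

# Hypothetical toy phrase table: source phrase -> [(target phrase, probability)].
OPTIONS = {
    ("das",): [("the", 0.6), ("that", 0.4)],
    ("haus",): [("house", 0.8), ("home", 0.2)],
    ("das", "haus"): [("the house", 0.7)],
}

def beam_search(source, options, beam_size=2, max_phrase_len=2):
    """Return (log score, translation) of the best full hypothesis, or None."""
    n = len(source)
    # stacks[k] holds hypotheses that cover the first k source words.
    stacks = [[] for _ in range(n + 1)]
    stacks[0].append((0.0, ""))
    for covered in range(n):
        # Prune: keep only the beam_size best hypotheses in this stack.
        stacks[covered] = sorted(stacks[covered], reverse=True)[:beam_size]
        for score, output in stacks[covered]:
            for length in range(1, min(max_phrase_len, n - covered) + 1):
                phrase = tuple(source[covered:covered + length])
                for target, prob in options.get(phrase, []):
                    extended = (output + " " + target).strip()
                    stacks[covered + length].append((score + math.log(prob), extended))
    # Final states are the hypotheses that cover all source words.
    return max(stacks[n]) if stacks[n] else None

print(beam_search(["das", "haus"], OPTIONS))
```

A smaller `beam_size` makes the search faster but increases the risk of pruning away the hypothesis that would have led to the best translation.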
  • 38. The score for a translation is computed using the weights of the individual phrases that make up the translation and the overall LM probability of the combination. The scores are computed by querying the standard Moses phrase table and the LM for the target language.
  • 39. Language Models: The decoder works with the following language models: the SRI language model (SRILM), the IRST language model (IRSTLM), RandLM, and KenLM. KenLM is included by default in Moses.
  • 41. Moses servers are installed on one or several computers. On each Moses server, a daemon (daemon.pl) accepts network connections on a given port and copies everything it receives from the connection to Moses. Another machine runs Apache or any other web server software. Through the web server, CGI scripts (index.cgi, translate.cgi) are served to clients.
  • 42. A client requests index.cgi via the web server, and a form containing a text box for entering a URL is served back. The form is submitted to translate.cgi, which does the job: it fetches the page from the web, extracts the plain text from it, sends the text to a Moses server, inserts the translations back into the document, and returns the document to the client.
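The extract, translate, and re-insert steps can be sketched with the standard library HTML parser. This is not the code of translate.cgi: the class and function names are hypothetical, the `translate` callable stands in for the round trip to a Moses server, and comments, doctype declarations, and script contents are not handled specially.

```python
from html.parser import HTMLParser

class PageTranslator(HTMLParser):
    """Rebuild an HTML page, passing each text node through translate()."""

    def __init__(self, translate):
        super().__init__(convert_charrefs=True)
        self.translate = translate
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Copy the tag exactly as it appeared in the input.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Leave whitespace-only text untouched.
        self.parts.append(self.translate(data) if data.strip() else data)

def translate_page(html, translate):
    parser = PageTranslator(translate)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

# str.upper stands in for a real translation call.
print(translate_page("<p>hallo welt</p>", str.upper))
```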
  • 43. Setting up a Moses server: Choosing machines for Moses servers: running Moses is a slow and expensive process, so the machines used should have a fast processor and as much memory as possible. Installing Moses: on each Moses server, install and configure Moses for the language pair that you wish to use.
  • 44. Setting up a Moses server: Installing daemon.pl: open bin/daemon.pl and edit the $MOSES and $MOSES_INI paths to point to the locations of the Moses binary and the Moses configuration file. Choosing a port number: pick any port number between 1024 and 49151 for the daemon process to listen on.
  • 45. Setting up a Moses server: Starting the daemon: to activate the Moses server, type the following in a shell on the server: ./daemon.pl <hostname> <port> where hostname is the name of the host on which Moses is installed and port is the selected port.
  • 46. Setting up a Moses server: Configuring the web server to connect to the Moses server: the final step is to tell the front-end web server where to find the back-end Moses servers. In the translate.cgi script, set the @MOSES_ADDRESS array to the list of hostname:port strings identifying the Moses servers.
  • 47. Comparison with Pharaoh and Phramer for a French-English (fr-en) translation of 2000 sentences
  • 48. Installing Moses: First install Boost:
      sudo apt-get install libboost-all-dev
    Then get the source code:
      git clone https://github.com/moses-smt/mosesdecoder.git
  • 49. Installing GIZA++:
      wget http://giza-pp.googlecode.com/files/giza-pp-v1.0.7.tar.gz
      tar xzvf giza-pp-v1.0.7.tar.gz
      cd giza-pp
      make
      cd ~/mosesdecoder
      mkdir tools
      cp ~/giza-pp/GIZA++-v2/GIZA++ ~/giza-pp/GIZA++-v2/snt2cooc.out ~/giza-pp/mkcls-v2/mkcls tools
  • 50. Installing IRSTLM:
      tar zxvf irstlm-5.80.01.tgz
      cd irstlm-5.80.01
      ./regenerate-makefiles.sh
      ./configure --prefix=$HOME/irstlm
      make install
  • 51. Moses Platform: The primary development platform for Moses is Linux. Linux is also the recommended platform, since it is easier to get support for it. However, Moses works on other platforms as well.
  • 52. Moses Releases: Moses 1.0 (28 Jan 2013); Moses 0.91 (12 Oct 2012)
  • 53. Importance of Moses: Moses is installable software, unlike online-only translation systems. Online systems cannot be trained on your own data. There is also a privacy problem if you have to translate sensitive information.
  • 54. Conclusion: Moses is an open-source toolkit, so users can modify and customize it based on their needs and requirements.