SlideShare a Scribd company logo
International Association of Scientific Innovation and Research (IASIR)
(An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational
and Applied Sciences (IJETCAS)
www.iasir.net
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 401
ISSN (Print): 2279-0047
ISSN (Online): 2279-0055
Rule Based Machine Translation for Assamese-English Using Apertium
Pranjal Das1
, Kalyanee Kanchan Baruah2
, Abdul Hannan3
, Shikhar Kr. Sarma4
Department of Information Technology
IST, Gauhati University
Guwahati
India
Abstract: Machine Translation is the process of using a software application to convert a source language into
a target language without the intervention of human. It is an important part of Natural Language Processing.
Here we have proposed a method for translating Assamese to English sentences and vice versa using a Rule
Based Machine Translation (RBMT) approach. RBMT uses linguistic rules to analyze the inputted content in the
source language to generate the output text in the target language. We have used Apertium toolkit, which is a
free and open-source platform for RBMT. We have used two monolingual dictionaries and a bilingual
dictionary for building the Assamese-English RBMT system. We wrote some transfer rules to deal with word-
reordering and other grammatical stuffs.
Keywords: Apertium, Transfer Rules, Assamese, Morphology, Machine Translation, Dictionary
I. Introduction
In this paper, we have described the development of the Assamese-English (asm-eng) language pair using
Apertium RBMT. Assamese is a language mainly spoken by the people of Assam, a state in the north-east of
India [9]. In Assam, not much has been done in the field of NLP and Machine Translation. Assamese is a
computationally underdeveloped language and research on Natural Language Processing (NLP) is still at its
developing stage. Today, almost everything is web-based and majority of the information is available in English.
So the purpose of our project is to enable the common Assamese people, who have lesser knowledge about the
English language, to stay abreast with the recent advancements of the world by providing him an Machine
Translation interface. The purpose of the project is also to let the world know about the Assamese language. Since
Indian languages are morphologically very rich and agglutinative in nature, majority of the MT projects in India
are hybrid or statistical based. But we believe that if we have a firm grip on the language, RBMT can be better. In
this paper, we have briefly described about the design of the RBMT system using the Apertium platform. Then,
we have described about the dictionaries and the rules used. Following this, we have given an evaluation of the
quality of output of our system and finally we concluded our paper with a discussion and prospective for future
work.
II. Related Works
Machine Translation in India is not a new process. There are many machine translation projects in India [2].
Some of the Machine Translation systems for Indian languages are:
Table I Some Machine Translation Projects in India
Machine Translation System Source-Target Language Developer Approach Used
ANGLABHARATI (1991) English-Indian Languages
(primarily Hindi)
IIT, Kanpur Pseudo-interlingua
UCSG-based MT English-Kannada University of Hyderabad Transfer based
Anusaaraka (1995) English to Hindi and five other
Indian Languages to Hindi
IIT, Kanpur LWG Mapping/PG
VAASAANUBAADA
(2002)
Bengali- Assamese Kommaluri Vijayanand, S
Choudhury and Pranab Ratna
EBMT
Anuvadaksh English to Hindi, Urdu, Oriya,
Bangla, Marathi, Tamil
EILMT consortium Hybrid Approach
Several universities in India, along with CDAC have been working on improving Machine Translation for Indian
Languages. MHRD, India is also taking several measures to make Machine Translation serve for the people of
India.
We are using Apertium to develop our RBMT system. There are many language pairs in use for Apertium. Work
is still going on for some language pairs. Some of the stable language pairs that are currently in use are shown in
the following figure:
Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
401-406
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 402
Figure 1: Stable language pairs of Apertium.
All the released pairs are stored in the trunk [3] of Apertium.
III. Files and Tools used for Implementation
A. Apertium
Apertium is a free and open-source platform used in Rule Based Machine Translation. Apertium uses FST (finite-
state transducers) for lexical processing to treat many kinds of multi-word expressions. Hidden Markov models
(HMM) are used in Apertium for part-of-speech tagging. Apertium also uses chunking, which is finite-state based
used for structural transfer.
B. Lttoolbox
It is a finite state toolkit used for lexical processing, morphological analysis and generation of words [4]. Three
programs are split in the lttoolbox:
lt-comp – the compiler
lt-proc – the processor
lt-expand – generates all possible mappings in the dictionary between surface forms and lexical forms.
C. Monolingual Dictionaries
The monolingual dictionary or the morphological dictionary contains the rules how words in that dictionary are
inflected. It is written in XML format. The monolingual dictionary establishes the correspondences between
Surface Forms (SFs) and Lexical Forms (LFs) [5]. For our language pair Assamese-English, we have two
monolingual dictionaries –
apertium-asm-eng.asm.dix – for Assamese
apertium-asm-eng.eng.dix – for English
D. Bilingual Dictionary
The bilingual dictionary contains relations between words and symbols in the two languages i.e. it establishes
the correspondences between source language LFs and target language LFs [5]. For our language pair, we have
the following bilingual dictionary –
apertium-asm-eng.asm-eng.dix
E. Transfer Rules
The transfer rule file contains some grammatical rules for how a source language will be changed to a target
language. These rules are mainly used because all languages do not follow the same word order. For e.g.
English follows SVO construction, while Assamese follows SOV word order construction. Two modules of
structural transfer can be used –
1-stage transfer - only one set of transfer rules is used (.t1x file).
3-stage transfer - transfer rules are used to group words into chunks, on which later operations such as inter-
chunk and post-chunking can be performed (.t1x, .t2x and .t3x files).
1-stage transfer can be used for languages which are closely related [6], for e.g. Assamese-Bengali. For
Assamese-English we have to use 3-stage transfer rules. We have used the following transfer files –
Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
401-406
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 403
apertium-asm-eng.asm-eng.t1x, apertium-asm-eng.asm-eng.t2x, apertium-asm-eng.asm-eng.t3x for Assamese to
English.
apertium-asm-eng.eng-asm.t1x, apertium-asm-eng.eng-asm.t2x, apertium-asm-eng.eng-asm.t3x for English to
Assamese.
IV. Methodology
A. The Apertium Architecture
The eight module architecture of Apertium [7] is shown in the following figure:
Figure 2: Architecture of Apertium
Deformatter
The deformatter separates text from the format information (html tags, space etc) by encapsulating it with
brackets, thus the rest of the module ignores it thinking it as blank space between words [12].
For example the html text দিল্লী <em>ভাৰতৰ</em> ৰাজধানী। would be transformed by the deformatter as দিল্লী [<em>]
ভাৰতৰ [</em>] ৰাজধানী।
The deformatter allows only text to be translated from the format information not the html tags [14].
Morphological Analyser
Here text in the surface forms is tokenized and one or more LFs (lexical forms) are delivered from each SF
(Surface form). For e.g.
^দিল্লী/দিল্লী<np><top><sg><nom>$ ^ভাৰতৰ/ভাৰত<np><top><sg><gen>$
^ৰাজধানী/ৰাজধানী<n><m><sg><nom>$^।/।<sent>$
Here ‘দিল্লী’ is analyzed into lemma দিল্লী, proper noun, toponym, singular and nominative; ‘ভাৰতৰ’ is analyzed into
lemma ভাৰত, proper noun, toponym, singular and genitive. The characters “^” and “$” delimit the analyses for
each SF; LFs for each SF are separated by “/”; angle brackets “< . . >” are used to delimit grammatical symbols.
The string after the “^” and before the first “/” is the SF as it appears in the source input text [13].
Parts of Speech Tagger
This step chooses one lexical form for those words having more than one.
The output of the tagger is something like this:
^দিল্লী<np><top><sg><nom>$ ^ভাৰত<np><top><sg><gen>$ ^ৰাজধানী<n><m><sg><nom>$^।<sent>$
Structural Transfer and Lexical Transfer
As the name says ‘Lexical’ changes words while ‘Structural’ changes the word order.
The output of the lexical transfer is:
^দিল্লী<np><top><sg><nom>/Delhi<np><top><sg><nom>$
^ভাৰত<np><top><sg><gen>/India<np><top><sg><gen>$
^ৰাজধানী<n><m><sg><nom>/capital<n><sg><nom>$^।<sent>/.<sent>$
Structural Transfer is performed in 3 stages: chunking, interchunking and postchunking.
The final output of the Structural transfer stage would be as follows:
^Delhi<np><top><sg>$ ^be<vbser><pres><p3><sg>$ ^capital<n><sg>$ ^of<pr>$
^India<np><top><sg>$^.<sent>$
Morphological Generator
Generate the surface forms of the target language words from the decomposition in lemma plus attributes
obtained from the previous steps.
Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
401-406
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 404
Post Generator
Performs some TL orthographical transformations, such as contractions (দিন্দু + স্থান  দিন্দুস্থান), inserting
apostrophes (can + not  can’t) [14].
Reformatter
The reformatter restores formatted information encapsulated by the deformatter.
For e.g. “Delhi [<em>] is capital of [</em>] India.” is restored into “Delhi <em>is capital of</em> India.”
B. Creating Dictionaries
Apertium basically works on dictionaries and shallow transfer rules. There are 3 basic dictionaries: two
monolingual dictionaries and a bilingual dictionary.
1) The Monolingual Dictionaries
We have two monolingual dictionaries, the Assamese monolingual dictionary and the English monolingual
dictionary. The skeleton of our monolingual dictionary is given below:
<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
<sdefs>
<sdef n="n"/>
<sdef n="sg"/>
<sdef n="pl"/>
</sdefs>
<pardefs>
</pardefs>
<section id="main" type="standard">
</section>
</dictionary>
Now we want to add some more entries. So, at first we add some alphabets. In case of the Assamese
monolingual dictionary:
<alphabet>অআইঈউঊঋএঐওঔকখগঘঙচছজঝঞটঠডঢণতথিধনপফবভমযৰলৱশষসিক্ষড়ঢ়য়ংঃ ঃৎ</alphabet>
We have to place the alphabet below the <dictionary> tag.
Next we need to add some symbols inside <sdefs> tag. The symbols can be noun, pronoun, proper noun, singular,
plural, present continuous etc.
<sdefs>
<sdef n="n"/> <!-- noun -->
<sdef n="sg"/> <! -- singular -->
<sdef n="pl"/> <!-- plural -->
</sdefs>
Next we have to add the paradigms i.e. the <pardefs> tag.
<pardefs>
<pardef n="অপৰাধ_n">
<e><p><l/><r><s n="n"/><s n="nt"/><s n="sg"/><s n="nom"/></r></p></e>
<e><p><l>ববাৰ</l><r><s n="n"/><s n="nt"/><s n="pl"/><s n="nom"/></r></p></e>
</pardef>
</pardefs>
Here the singular form the noun is “অপৰাধ” and the plural form is “অপৰাধববাৰ”. Here the noun “অপৰাধ” is also neuter
gender and nominative case.
Now we put the entry in the <section>.
<section id="main" type="standard">
<e lm="অপৰাধ"><i>অপৰাধ</i><par n="অপৰাধ_n"/></e>
</section>
This entry states the lemma of the word, অপৰাধ, the root, অপৰাধ and the paradigm with which it inflects অপৰাধ_n
[8].
In the same way we have created the English monolingual dictionary.
2) The Bilingual Dictionary
The bilingual dictionary describes the mapping between words [8]. Lexical transfer is performed by the bilingual
dictionary i.e. it gives translations between words and not tags. The skeleton of the bilingual dictionary is shown
below:
<?xml version="1.0" encoding="UTF-8"?>
Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
401-406
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 405
<dictionary>
<alphabet/>
<sdefs>
<sdef n="n"/>
<sdef n="sg"/>
<sdef n="pl"/>
</sdefs>
<section id="main" type="standard">
</section>
</dictionary>
The skeleton is very much similar to that of the monolingual dictionary. Here the paradigm section is not
required. We just need add an entry to translate between the two words in Assamese and English, inside the
<section> tag. This can be done as following:
<section id="main" type="standard">
<e><p><l>অপৰাধ<s n="n"/><s n="nt"/></l><r>crime<s n="n"/></r></p></e>
</section>
All the nouns, pronouns, verbs etc. are added to the bilingual dictionary inside the <section> tag likewise.
C. Transfer Rules
Making rules is the toughest part in RBMT based systems. A sound knowledge in the relevant language is very
much required in making the grammatical rules. The lexical transfer only gives translations between words. The
reordering of words is not performed by lexical transfer. Now we will perform structural transfer, which can
change the order of words and can change or add tags on a per-category (group of lemmas) basis. As we are
dealing with Assamese and English languages, we have to perform the 3-stage transfer [10]. The stages are:
 Transfer stage or chunk stage: Here the words are translated using the bilingual dictionary and
grouped and put into chunks (in the .t1x file). The tags in the words can also be added here and removed or
made into 'pointers' that points to the tags in the enclosing chunk.
 Interchunk stage: In this stage the chunks are reordered, combined and split and chunk tags are
changed (in the .t2x file).
 Postchunk stage: This is the final stage of structural transfer. Here the chunks are restored (in the .t3x
file).
V. Results and Evaluation
A. Results and Statistics
Now as we have the dictionaries and the transfer rules, we can test our system with some sentences. Some
translations are not perfect, because of the problem in the transfer rules. We are currently working on adding
more and more rules into our system.
The current status of our work is shown in the following table:
Table II Status of our language pair
Number Entries
Monolingual Dictionary (asm) 3983 words
Bilingual Dictionary (asm-eng) 4341 words
Monolingual Dictionary (eng) 5940 words
Transfer rules (asmeng) 13 rules
Transfer rules (engasm) 11 rules
Some test results for Assamese to English and vice versa are shown in the following tables:
Table III Assamese to English Translation
Translating Output Correct Output
জয়পুৰ ৰাজস্থানৰ এখন দবখযাত চিৰ। Jaipur is a famous city of Rajasthan. Jaipur is a famous city of Rajasthan.
তাজমিল আগ্ৰাত অৱদস্থত। Taj Mahal is situated Agra. Taj Mahal is situated in Agra.
জামা মছদজি শ্বািজািাবন দনমমাণ কদৰদছল। Jama Masjid was built by Shahjahan. Jama Masjid was built by Shahjahan.
দিল্লী ভাৰতৰ ৰাজধানী। Delhi is capital of India. Delhi is the capital of India.
গঙ্গা নিীৰ পাৰত অৱদস্থত উজ্জদয়নী ভাৰতৰ সপ্ততীথমৰ দভতৰত
অনযতম।
on bank of Ganges river situated Ujjain
India of #seven sacred cities of inside #one
of the.
Situated near the banks of the river
Ganga, Ujjain is one of the seven sacred
cities of India.
মমতাজ মিল শ্বািজািানৰ পত্নী আদছল। Mumtaz Mahal is wife of Shahjahan #was. Mumtaz Mahal was the wife of
Shahjahan.
অন্ধ্ৰপ্ৰবিশ ভাৰতৰ এখন ৰাজয। Andhra Pradesh is a state of India. Andhra Pradesh is a state of India.
Table IV English to Assamese Translation
Translating Output Correct Output
Delhi is the capital of India. দিল্লী #খন ভাৰতৰ ৰাজধানী। দিল্লী ভাৰতৰ ৰাজধানী।
Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
401-406
IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 406
Assam is a beautiful place. অসম এক সুন্দৰ #বজগা। অসম এখন ধুনীয়া ঠাই।
Kanak Vrindavan is a popular picnic spot in
Jaipur.
কনক বৃন্দাবন এক জনদপ্ৰয় #বনবভাজ #িাগ *in জয়পুৰ। কনক বৃন্দাবন হিবছ জয়পুৰৰ এখন জনদপ্ৰয় বনবভাজ স্থান।
Hyderabad is situated in Bangalore. িায়িৰাবাি অৱদস্থত *in #বাংগাবলাৰ । িায়িৰাবাি বাংগাবলাৰত অৱদস্থত।
Udaipur is a famous city of Rajasthan. উিয়পুৰ ৰাজস্থানৰ এক দবখযাত চিৰ। উিয়পুৰ ৰাজস্থানৰ এখন দবখযাত চিৰ।
Canada is a vast country. কানাডা এক দবশাল #বিশ। কানাডা এখন দবশাল বিশ।
Here ‘*<word>’ means that the word is not in the analyser and ‘#<word>” means that the word is not in the
generator.
B.Evaluation
Now we want to see how well our language pairs works in practice. We can see this by calculating the WER
(word error rate) and PWER (post-independent word error rate). The tool used for evaluating WER and PWER
is apertium-eval-translator [11]. The results of the evaluation are shown in the following table:
Table V Evaluation Results
Corpus WER PWER
Tourism data 53.80% 36.96%
The scores are not amusing enough, which indicates that our system is still not ready for use in general. Work is
still going on the improvement and addition of more transfer rules.
VI. Conclusion
We have tried to implement a bidirectional RBMT system between Assamese and English language. The results
obtained from our work are encouraging but not satisfactory. We are working on the transfer rules and increasing
the coverage of words in our dictionary. In the future we are looking forward to improve the translation quality of
our system and make it a full-fledged MT system. The long-term plan of our project is also to make our system
hybrid, taking the good qualities of both RBMT and SMT systems and see if it gives a better translation output.
References
[1] “List of language pairs”, Available: http://wiki.apertium.org/wiki/List_of_language_pairs
[2] Antony P.J., “Machine Translation and Survey for Indian Languages”, “The Association for Computational Linguistics and Chinese
Language Processing” Vol. 18, No. 1, March 2013, pp. 47-78.
[3] “Trunk”, Available: http://wiki.apertium.org/wiki/Trunk.
[4] “Lttoolbox”, Available: http://wiki.apertium.org/wiki/Lttoolbox.
[5] Carme Armentano-Oller, Rafael C. Carrasco, Antonio M. Corbí-Bellot, Mikel L. Forcada, Mireia Ginestí-Rosell, Sergio Ortiz-Rojas,
Juan Antonio Pérez-Ortiz, Gema Ramírez-Sánchez, Felipe Sánchez-Martínez, Miriam A. Scalco (2006) "Open-source Portuguese-
Spanish machine translation", in In Lecture Notes in Computer Science 3960 (Computational Processing of the Portuguese Language,
Proceedings of the 7th International Workshop on Computational Processing of Written and Spoken Portuguese, PROPOR 2006), May
13-17, 2006, ME - RJ / Itatiaia, Rio de Janeiro, Brazil. ((c) Springer-Verlag 2006), p. 50-59
[6] Francis M. Tyers, Jacob Nordfalk, “Shallow-transfer rule-based machine translation for Swedish to Danish”, “Proceedings of the First
International Workshop on Free/Open-Source Rule-Based Machine Translation “, 2009, pp. 27-33
[7] Forcada, M. L., Ginestí-Rosell, M., Nordfalk, J., O’Regan, J., Ortiz-Rojas, S., Pérez-Ortiz, J. A., Sánchez-Martínez, F., Ramírez-
Sánchez, G., and Tyers, F. M. (2011). Apertium: a free/opensource platform for rule-based machine translation. Machine Translation,
25(2):127–144.
[8] “Apertium New Language Pair HOWTO”, Available: http://wiki.apertium.org/wiki/Apertium_New_Language_Pair_HOWTO
[9] “Assamese Language”, Available: http://en.wikipedia.org/wiki/Assamese_Language.
[10] “Chunking: A full example”, Available: http://wiki.apertium.org/wiki/Chunking:_A_full_example.
[11] “Evaluation”, Available: http:// wiki.apertium.org/wiki/Evaluation.
[12] Roche, E., Schabes, Y.: Introduction. In Roche, E., Schabes, Y., eds.: Finite-State Language Processing. MIT Press, Cambridge, Mass.
(1997) 1–65.
[13] Garrido-Alenda, A., Forcada, M.L., Carrasco, R.C.: Incremental construction and maintenance of morphological analysers based on
augmented letter transducers. In: Proceedings of TMI 2002 (Theoretical and Methodological Issues in Machine Translation,
Keihanna/Kyoto, Japan, March 2002). (2002) 53–62
[14] “Transfer Rules Examples”, Available: http://wiki.apertium.org/wiki/Transfer_rules_examples
Acknowledgement
The authors are thankful to the Department of Information Technology, Gauhati University for providing us the
corpus, which helped us in making the dictionaries and building the RBMT system with Apertium.

More Related Content

What's hot

Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translation
Vaibhav Agarwal
 
An implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzerAn implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzer
ijnlc
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsParisa Niksefat
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
Jaganadh Gopinadhan
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShashank Shisodia
 
Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
baskaran_md
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
Marcis Pinnis
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
Karthik Sankar
 
Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translationRushdi Shams
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translation
khyati gupta
 
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
Syeful Islam
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
ijma
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
ijaia
 
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
kevig
 
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
IJERA Editor
 

What's hot (16)

Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translation
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
An implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzerAn implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzer
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
 
Shallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliteratorShallow parser for hindi language with an input from a transliterator
Shallow parser for hindi language with an input from a transliterator
 
Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
 
Types of machine translation
Types of machine translationTypes of machine translation
Types of machine translation
 
Experiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine TranslationExperiments with Different Models of Statistcial Machine Translation
Experiments with Different Models of Statistcial Machine Translation
 
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
 
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
 
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
 

Viewers also liked

Ijetcas14 523
Ijetcas14 523Ijetcas14 523
Ijetcas14 523
Iasir Journals
 

Viewers also liked (19)

Aijrfans14 271
Aijrfans14 271Aijrfans14 271
Aijrfans14 271
 
Ijetcas14 335
Ijetcas14 335Ijetcas14 335
Ijetcas14 335
 
Ijetcas14 409
Ijetcas14 409Ijetcas14 409
Ijetcas14 409
 
Ijetcas14 394
Ijetcas14 394Ijetcas14 394
Ijetcas14 394
 
Ijetcas14 446
Ijetcas14 446Ijetcas14 446
Ijetcas14 446
 
Aijrfans14 223
Aijrfans14 223Aijrfans14 223
Aijrfans14 223
 
Ijetcas14 323
Ijetcas14 323Ijetcas14 323
Ijetcas14 323
 
Ijebea14 207
Ijebea14 207Ijebea14 207
Ijebea14 207
 
Ijetcas14 393
Ijetcas14 393Ijetcas14 393
Ijetcas14 393
 
Ijebea14 211
Ijebea14 211Ijebea14 211
Ijebea14 211
 
Ijetcas14 523
Ijetcas14 523Ijetcas14 523
Ijetcas14 523
 
Ijetcas14 448
Ijetcas14 448Ijetcas14 448
Ijetcas14 448
 
Ijetcas14 345
Ijetcas14 345Ijetcas14 345
Ijetcas14 345
 
Ijetcas14 313
Ijetcas14 313Ijetcas14 313
Ijetcas14 313
 
Ijetcas14 438
Ijetcas14 438Ijetcas14 438
Ijetcas14 438
 
Ijetcas14 361
Ijetcas14 361Ijetcas14 361
Ijetcas14 361
 
Ijetcas14 399
Ijetcas14 399Ijetcas14 399
Ijetcas14 399
 
Aijrfans14 294
Aijrfans14 294Aijrfans14 294
Aijrfans14 294
 
Ijetcas14 378
Ijetcas14 378Ijetcas14 378
Ijetcas14 378
 

Similar to Ijetcas14 444

IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET Journal
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionary
iosrjce
 
G1803013542
G1803013542G1803013542
G1803013542
IOSR Journals
 
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABIRULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
ijnlc
 
Rule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to PunjabiRule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to Punjabi
kevig
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Waqas Tariq
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
Kalyanee Baruah
 
Ey4301913917
Ey4301913917Ey4301913917
Ey4301913917
IJERA Editor
 
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVALDICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
csandit
 
Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...
Waqas Tariq
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
ijnlc
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
kevig
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
gerogepatton
 
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
gerogepatton
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
iosrjce
 
Jq3616701679
Jq3616701679Jq3616701679
Jq3616701679
IJERA Editor
 
A Review on a web based Punjabi to English Machine Transliteration System
A Review on a web based Punjabi to English Machine Transliteration SystemA Review on a web based Punjabi to English Machine Transliteration System
A Review on a web based Punjabi to English Machine Transliteration System
Editor IJCATR
 
Contextual Analysis for Middle Eastern Languages with Hidden Markov Models
Contextual Analysis for Middle Eastern Languages with Hidden Markov ModelsContextual Analysis for Middle Eastern Languages with Hidden Markov Models
Contextual Analysis for Middle Eastern Languages with Hidden Markov Models
ijnlc
 

Similar to Ijetcas14 444 (20)

IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionary
 
G1803013542
G1803013542G1803013542
G1803013542
 
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABIRULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABI
 
Rule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to PunjabiRule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to Punjabi
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
Translationusing moses1
Translationusing moses1Translationusing moses1
Translationusing moses1
 
Ey4301913917
Ey4301913917Ey4301913917
Ey4301913917
 
Cf32516518
Cf32516518Cf32516518
Cf32516518
 
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVALDICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
 
Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
I1 geetha3 revathi
I1 geetha3 revathiI1 geetha3 revathi
I1 geetha3 revathi
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
 
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
Jq3616701679
Jq3616701679Jq3616701679
Jq3616701679
 
A Review on a web based Punjabi to English Machine Transliteration System
A Review on a web based Punjabi to English Machine Transliteration SystemA Review on a web based Punjabi to English Machine Transliteration System
A Review on a web based Punjabi to English Machine Transliteration System
 
Contextual Analysis for Middle Eastern Languages with Hidden Markov Models
Contextual Analysis for Middle Eastern Languages with Hidden Markov ModelsContextual Analysis for Middle Eastern Languages with Hidden Markov Models
Contextual Analysis for Middle Eastern Languages with Hidden Markov Models
 

More from Iasir Journals

ijetcas14 650
ijetcas14 650ijetcas14 650
ijetcas14 650
Iasir Journals
 
Ijetcas14 648
Ijetcas14 648Ijetcas14 648
Ijetcas14 648
Iasir Journals
 
Ijetcas14 647
Ijetcas14 647Ijetcas14 647
Ijetcas14 647
Iasir Journals
 
Ijetcas14 643
Ijetcas14 643Ijetcas14 643
Ijetcas14 643
Iasir Journals
 
Ijetcas14 641
Ijetcas14 641Ijetcas14 641
Ijetcas14 641
Iasir Journals
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
Iasir Journals
 
Ijetcas14 632
Ijetcas14 632Ijetcas14 632
Ijetcas14 632
Iasir Journals
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
Iasir Journals
 
Ijetcas14 619
Ijetcas14 619Ijetcas14 619
Ijetcas14 619
Iasir Journals
 
Ijetcas14 615
Ijetcas14 615Ijetcas14 615
Ijetcas14 615
Iasir Journals
 
Ijetcas14 608
Ijetcas14 608Ijetcas14 608
Ijetcas14 608
Iasir Journals
 
Ijetcas14 605
Ijetcas14 605Ijetcas14 605
Ijetcas14 605
Iasir Journals
 
Ijetcas14 604
Ijetcas14 604Ijetcas14 604
Ijetcas14 604
Iasir Journals
 
Ijetcas14 598
Ijetcas14 598Ijetcas14 598
Ijetcas14 598
Iasir Journals
 
Ijetcas14 594
Ijetcas14 594Ijetcas14 594
Ijetcas14 594
Iasir Journals
 
Ijetcas14 593
Ijetcas14 593Ijetcas14 593
Ijetcas14 593
Iasir Journals
 
Ijetcas14 591
Ijetcas14 591Ijetcas14 591
Ijetcas14 591
Iasir Journals
 
Ijetcas14 589
Ijetcas14 589Ijetcas14 589
Ijetcas14 589
Iasir Journals
 
Ijetcas14 585
Ijetcas14 585Ijetcas14 585
Ijetcas14 585
Iasir Journals
 
Ijetcas14 584
Ijetcas14 584Ijetcas14 584
Ijetcas14 584
Iasir Journals
 

More from Iasir Journals (20)

ijetcas14 650
ijetcas14 650ijetcas14 650
ijetcas14 650
 
Ijetcas14 648
Ijetcas14 648Ijetcas14 648
Ijetcas14 648
 
Ijetcas14 647
Ijetcas14 647Ijetcas14 647
Ijetcas14 647
 
Ijetcas14 643
Ijetcas14 643Ijetcas14 643
Ijetcas14 643
 
Ijetcas14 641
Ijetcas14 641Ijetcas14 641
Ijetcas14 641
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
 
Ijetcas14 632
Ijetcas14 632Ijetcas14 632
Ijetcas14 632
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
 
Ijetcas14 619
Ijetcas14 619Ijetcas14 619
Ijetcas14 619
 
Ijetcas14 615
Ijetcas14 615Ijetcas14 615
Ijetcas14 615
 
Ijetcas14 608
Ijetcas14 608Ijetcas14 608
Ijetcas14 608
 
Ijetcas14 605
Ijetcas14 605Ijetcas14 605
Ijetcas14 605
 
Ijetcas14 604
Ijetcas14 604Ijetcas14 604
Ijetcas14 604
 
Ijetcas14 598
Ijetcas14 598Ijetcas14 598
Ijetcas14 598
 
Ijetcas14 594
Ijetcas14 594Ijetcas14 594
Ijetcas14 594
 
Ijetcas14 593
Ijetcas14 593Ijetcas14 593
Ijetcas14 593
 
Ijetcas14 591
Ijetcas14 591Ijetcas14 591
Ijetcas14 591
 
Ijetcas14 589
Ijetcas14 589Ijetcas14 589
Ijetcas14 589
 
Ijetcas14 585
Ijetcas14 585Ijetcas14 585
Ijetcas14 585
 
Ijetcas14 584
Ijetcas14 584Ijetcas14 584
Ijetcas14 584
 

Recently uploaded

The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
chanes7
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
chanes7
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
JEE1_This_section_contains_FOUR_ questions
JEE1_This_section_contains_FOUR_ questionsJEE1_This_section_contains_FOUR_ questions
JEE1_This_section_contains_FOUR_ questions
ShivajiThube2
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
Krisztián Száraz
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 

Recently uploaded (20)

The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
JEE1_This_section_contains_FOUR_ questions
JEE1_This_section_contains_FOUR_ questionsJEE1_This_section_contains_FOUR_ questions
JEE1_This_section_contains_FOUR_ questions
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 

Ijetcas14 444

  • 1. International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) www.iasir.net IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 401 ISSN (Print): 2279-0047 ISSN (Online): 2279-0055 Rule Based Machine Translation for Assamese-English Using Apertium Pranjal Das1 , Kalyanee Kanchan Baruah2 , Abdul Hannan3 , Shikhar Kr. Sarma4 Department of Information Technology IST, Gauhati University Guwahati India Abstract: Machine Translation is the process of using a software application to convert a source language into a target language without the intervention of human. It is an important part of Natural Language Processing. Here we have proposed a method for translating Assamese to English sentences and vice versa using a Rule Based Machine Translation (RBMT) approach. RBMT uses linguistic rules to analyze the inputted content in the source language to generate the output text in the target language. We have used Apertium toolkit, which is a free and open-source platform for RBMT. We have used two monolingual dictionaries and a bilingual dictionary for building the Assamese-English RBMT system. We wrote some transfer rules to deal with word- reordering and other grammatical stuffs. Keywords: Apertium, Transfer Rules, Assamese, Morphology, Machine Translation, Dictionary I. Introduction In this paper, we have described the development of the Assamese-English (asm-eng) language pair using Apertium RBMT. Assamese is a language mainly spoken by the people of Assam, a state in the north-east of India [9]. In Assam, not much has been done in the field of NLP and Machine Translation. Assamese is a computationally underdeveloped language and research on Natural Language Processing (NLP) is still at its developing stage. Today, almost everything is web-based and majority of the information is available in English. So the purpose of our project is to enable the common Assamese people, who have lesser knowledge about the English language, to stay abreast with the recent advancements of the world by providing him an Machine Translation interface. The purpose of the project is also to let the world know about the Assamese language. Since Indian languages are morphologically very rich and agglutinative in nature, majority of the MT projects in India are hybrid or statistical based. But we believe that if we have a firm grip on the language, RBMT can be better. In this paper, we have briefly described about the design of the RBMT system using the Apertium platform. Then, we have described about the dictionaries and the rules used. Following this, we have given an evaluation of the quality of output of our system and finally we concluded our paper with a discussion and prospective for future work. II. Related Works Machine Translation in India is not a new process. There are many machine translation projects in India [2]. Some of the Machine Translation systems for Indian languages are: Table I Some Machine Translation Projects in India Machine Translation System Source-Target Language Developer Approach Used ANGLABHARATI (1991) English-Indian Languages (primarily Hindi) IIT, Kanpur Pseudo-interlingua UCSG-based MT English-Kannada University of Hyderabad Transfer based Anusaaraka (1995) English to Hindi and five other Indian Languages to Hindi IIT, Kanpur LWG Mapping/PG VAASAANUBAADA (2002) Bengali- Assamese Kommaluri Vijayanand, S Choudhury and Pranab Ratna EBMT Anuvadaksh English to Hindi, Urdu, Oriya, Bangla, Marathi, Tamil EILMT consortium Hybrid Approach Several universities in India, along with CDAC have been working on improving Machine Translation for Indian Languages. MHRD, India is also taking several measures to make Machine Translation serve for the people of India. We are using Apertium to develop our RBMT system. There are many language pairs in use for Apertium. Work is still going on for some language pairs. Some of the stable language pairs that are currently in use are shown in the following figure:
  • 2. Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 401-406 IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 402 Figure 1: Stable language pairs of Apertium. All the released pairs are stored in the trunk [3] of Apertium. III. Files and Tools used for Implementation A. Apertium Apertium is a free and open-source platform used in Rule Based Machine Translation. Apertium uses FST (finite- state transducers) for lexical processing to treat many kinds of multi-word expressions. Hidden Markov models (HMM) are used in Apertium for part-of-speech tagging. Apertium also uses chunking, which is finite-state based used for structural transfer. B. Lttoolbox It is a finite state toolkit used for lexical processing, morphological analysis and generation of words [4]. Three programs are split in the lttoolbox: lt-comp – the compiler lt-proc – the processor lt-expand – generates all possible mappings in the dictionary between surface forms and lexical forms. C. Monolingual Dictionaries The monolingual dictionary or the morphological dictionary contains the rules how words in that dictionary are inflected. It is written in XML format. The monolingual dictionary establishes the correspondences between Surface Forms (SFs) and Lexical Forms (LFs) [5]. For our language pair Assamese-English, we have two monolingual dictionaries – apertium-asm-eng.asm.dix – for Assamese apertium-asm-eng.eng.dix – for English D. Bilingual Dictionary The bilingual dictionary contains relations between words and symbols in the two languages i.e. it establishes the correspondences between source language LFs and target language LFs [5]. For our language pair, we have the following bilingual dictionary – apertium-asm-eng.asm-eng.dix E. Transfer Rules The transfer rule file contains some grammatical rules for how a source language will be changed to a target language. These rules are mainly used because all languages do not follow the same word order. For e.g. English follows SVO construction, while Assamese follows SOV word order construction. Two modules of structural transfer can be used – 1-stage transfer - only one set of transfer rules is used (.t1x file). 3-stage transfer - transfer rules are used to group words into chunks, on which later operations such as inter- chunk and post-chunking can be performed (.t1x, .t2x and .t3x files). 1-stage transfer can be used for languages which are closely related [6], for e.g. Assamese-Bengali. For Assamese-English we have to use 3-stage transfer rules. We have used the following transfer files –
  • 3. Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 401-406 IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 403 apertium-asm-eng.asm-eng.t1x, apertium-asm-eng.asm-eng.t2x, apertium-asm-eng.asm-eng.t3x for Assamese to English. apertium-asm-eng.eng-asm.t1x, apertium-asm-eng.eng-asm.t2x, apertium-asm-eng.eng-asm.t3x for English to Assamese. IV. Methodology A. The Apertium Architecture The eight module architecture of Apertium [7] is shown in the following figure: Figure 2: Architecture of Apertium Deformatter The deformatter separates text from the format information (html tags, space etc) by encapsulating it with brackets, thus the rest of the module ignores it thinking it as blank space between words [12]. For example the html text দিল্লী <em>ভাৰতৰ</em> ৰাজধানী। would be transformed by the deformatter as দিল্লী [<em>] ভাৰতৰ [</em>] ৰাজধানী। The deformatter allows only text to be translated from the format information not the html tags [14]. Morphological Analyser Here text in the surface forms is tokenized and one or more LFs (lexical forms) are delivered from each SF (Surface form). For e.g. ^দিল্লী/দিল্লী<np><top><sg><nom>$ ^ভাৰতৰ/ভাৰত<np><top><sg><gen>$ ^ৰাজধানী/ৰাজধানী<n><m><sg><nom>$^।/।<sent>$ Here ‘দিল্লী’ is analyzed into lemma দিল্লী, proper noun, toponym, singular and nominative; ‘ভাৰতৰ’ is analyzed into lemma ভাৰত, proper noun, toponym, singular and genitive. The characters “^” and “$” delimit the analyses for each SF; LFs for each SF are separated by “/”; angle brackets “< . . >” are used to delimit grammatical symbols. The string after the “^” and before the first “/” is the SF as it appears in the source input text [13]. Parts of Speech Tagger This step chooses one lexical form for those words having more than one. The output of the tagger is something like this: ^দিল্লী<np><top><sg><nom>$ ^ভাৰত<np><top><sg><gen>$ ^ৰাজধানী<n><m><sg><nom>$^।<sent>$ Structural Transfer and Lexical Transfer As the name says ‘Lexical’ changes words while ‘Structural’ changes the word order. The output of the lexical transfer is: ^দিল্লী<np><top><sg><nom>/Delhi<np><top><sg><nom>$ ^ভাৰত<np><top><sg><gen>/India<np><top><sg><gen>$ ^ৰাজধানী<n><m><sg><nom>/capital<n><sg><nom>$^।<sent>/.<sent>$ Structural Transfer is performed in 3 stages: chunking, interchunking and postchunking. The final output of the Structural transfer stage would be as follows: ^Delhi<np><top><sg>$ ^be<vbser><pres><p3><sg>$ ^capital<n><sg>$ ^of<pr>$ ^India<np><top><sg>$^.<sent>$ Morphological Generator Generate the surface forms of the target language words from the decomposition in lemma plus attributes obtained from the previous steps.
  • 4. Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 401-406 IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 404 Post Generator Performs some TL orthographical transformations, such as contractions (দিন্দু + স্থান  দিন্দুস্থান), inserting apostrophes (can + not  can’t) [14]. Reformatter The reformatter restores formatted information encapsulated by the deformatter. For e.g. “Delhi [<em>] is capital of [</em>] India.” is restored into “Delhi <em>is capital of</em> India.” B. Creating Dictionaries Apertium basically works on dictionaries and shallow transfer rules. There are 3 basic dictionaries: two monolingual dictionaries and a bilingual dictionary. 1) The Monolingual Dictionaries We have two monolingual dictionaries, the Assamese monolingual dictionary and the English monolingual dictionary. The skeleton of our monolingual dictionary is given below: <?xml version="1.0" encoding="UTF-8"?> <dictionary> <sdefs> <sdef n="n"/> <sdef n="sg"/> <sdef n="pl"/> </sdefs> <pardefs> </pardefs> <section id="main" type="standard"> </section> </dictionary> Now we want to add some more entries. So, at first we add some alphabets. In case of the Assamese monolingual dictionary: <alphabet>অআইঈউঊঋএঐওঔকখগঘঙচছজঝঞটঠডঢণতথিধনপফবভমযৰলৱশষসিক্ষড়ঢ়য়ংঃ ঃৎ</alphabet> We have to place the alphabet below the <dictionary> tag. Next we need to add some symbols inside <sdefs> tag. The symbols can be noun, pronoun, proper noun, singular, plural, present continuous etc. <sdefs> <sdef n="n"/> <!-- noun --> <sdef n="sg"/> <! -- singular --> <sdef n="pl"/> <!-- plural --> </sdefs> Next we have to add the paradigms i.e. the <pardefs> tag. <pardefs> <pardef n="অপৰাধ_n"> <e><p><l/><r><s n="n"/><s n="nt"/><s n="sg"/><s n="nom"/></r></p></e> <e><p><l>ববাৰ</l><r><s n="n"/><s n="nt"/><s n="pl"/><s n="nom"/></r></p></e> </pardef> </pardefs> Here the singular form the noun is “অপৰাধ” and the plural form is “অপৰাধববাৰ”. Here the noun “অপৰাধ” is also neuter gender and nominative case. Now we put the entry in the <section>. <section id="main" type="standard"> <e lm="অপৰাধ"><i>অপৰাধ</i><par n="অপৰাধ_n"/></e> </section> This entry states the lemma of the word, অপৰাধ, the root, অপৰাধ and the paradigm with which it inflects অপৰাধ_n [8]. In the same way we have created the English monolingual dictionary. 2) The Bilingual Dictionary The bilingual dictionary describes the mapping between words [8]. Lexical transfer is performed by the bilingual dictionary i.e. it gives translations between words and not tags. The skeleton of the bilingual dictionary is shown below: <?xml version="1.0" encoding="UTF-8"?>
  • 5. Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 401-406 IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 405 <dictionary> <alphabet/> <sdefs> <sdef n="n"/> <sdef n="sg"/> <sdef n="pl"/> </sdefs> <section id="main" type="standard"> </section> </dictionary> The skeleton is very much similar to that of the monolingual dictionary. Here the paradigm section is not required. We just need add an entry to translate between the two words in Assamese and English, inside the <section> tag. This can be done as following: <section id="main" type="standard"> <e><p><l>অপৰাধ<s n="n"/><s n="nt"/></l><r>crime<s n="n"/></r></p></e> </section> All the nouns, pronouns, verbs etc. are added to the bilingual dictionary inside the <section> tag likewise. C. Transfer Rules Making rules is the toughest part in RBMT based systems. A sound knowledge in the relevant language is very much required in making the grammatical rules. The lexical transfer only gives translations between words. The reordering of words is not performed by lexical transfer. Now we will perform structural transfer, which can change the order of words and can change or add tags on a per-category (group of lemmas) basis. As we are dealing with Assamese and English languages, we have to perform the 3-stage transfer [10]. The stages are:  Transfer stage or chunk stage: Here the words are translated using the bilingual dictionary and grouped and put into chunks (in the .t1x file). The tags in the words can also be added here and removed or made into 'pointers' that points to the tags in the enclosing chunk.  Interchunk stage: In this stage the chunks are reordered, combined and split and chunk tags are changed (in the .t2x file).  Postchunk stage: This is the final stage of structural transfer. Here the chunks are restored (in the .t3x file). V. Results and Evaluation A. Results and Statistics Now as we have the dictionaries and the transfer rules, we can test our system with some sentences. Some translations are not perfect, because of the problem in the transfer rules. We are currently working on adding more and more rules into our system. The current status of our work is shown in the following table: Table II Status of our language pair Number Entries Monolingual Dictionary (asm) 3983 words Bilingual Dictionary (asm-eng) 4341 words Monolingual Dictionary (eng) 5940 words Transfer rules (asmeng) 13 rules Transfer rules (engasm) 11 rules Some test results for Assamese to English and vice versa are shown in the following tables: Table III Assamese to English Translation Translating Output Correct Output জয়পুৰ ৰাজস্থানৰ এখন দবখযাত চিৰ। Jaipur is a famous city of Rajasthan. Jaipur is a famous city of Rajasthan. তাজমিল আগ্ৰাত অৱদস্থত। Taj Mahal is situated Agra. Taj Mahal is situated in Agra. জামা মছদজি শ্বািজািাবন দনমমাণ কদৰদছল। Jama Masjid was built by Shahjahan. Jama Masjid was built by Shahjahan. দিল্লী ভাৰতৰ ৰাজধানী। Delhi is capital of India. Delhi is the capital of India. গঙ্গা নিীৰ পাৰত অৱদস্থত উজ্জদয়নী ভাৰতৰ সপ্ততীথমৰ দভতৰত অনযতম। on bank of Ganges river situated Ujjain India of #seven sacred cities of inside #one of the. Situated near the banks of the river Ganga, Ujjain is one of the seven sacred cities of India. মমতাজ মিল শ্বািজািানৰ পত্নী আদছল। Mumtaz Mahal is wife of Shahjahan #was. Mumtaz Mahal was the wife of Shahjahan. অন্ধ্ৰপ্ৰবিশ ভাৰতৰ এখন ৰাজয। Andhra Pradesh is a state of India. Andhra Pradesh is a state of India. Table IV English to Assamese Translation Translating Output Correct Output Delhi is the capital of India. দিল্লী #খন ভাৰতৰ ৰাজধানী। দিল্লী ভাৰতৰ ৰাজধানী।
  • 6. Pranjal Das et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 401-406 IJETCAS 14-444; © 2014, IJETCAS All Rights Reserved Page 406 Assam is a beautiful place. অসম এক সুন্দৰ #বজগা। অসম এখন ধুনীয়া ঠাই। Kanak Vrindavan is a popular picnic spot in Jaipur. কনক বৃন্দাবন এক জনদপ্ৰয় #বনবভাজ #িাগ *in জয়পুৰ। কনক বৃন্দাবন হিবছ জয়পুৰৰ এখন জনদপ্ৰয় বনবভাজ স্থান। Hyderabad is situated in Bangalore. িায়িৰাবাি অৱদস্থত *in #বাংগাবলাৰ । িায়িৰাবাি বাংগাবলাৰত অৱদস্থত। Udaipur is a famous city of Rajasthan. উিয়পুৰ ৰাজস্থানৰ এক দবখযাত চিৰ। উিয়পুৰ ৰাজস্থানৰ এখন দবখযাত চিৰ। Canada is a vast country. কানাডা এক দবশাল #বিশ। কানাডা এখন দবশাল বিশ। Here ‘*<word>’ means that the word is not in the analyser and ‘#<word>” means that the word is not in the generator. B.Evaluation Now we want to see how well our language pairs works in practice. We can see this by calculating the WER (word error rate) and PWER (post-independent word error rate). The tool used for evaluating WER and PWER is apertium-eval-translator [11]. The results of the evaluation are shown in the following table: Table V Evaluation Results Corpus WER PWER Tourism data 53.80% 36.96% The scores are not amusing enough, which indicates that our system is still not ready for use in general. Work is still going on the improvement and addition of more transfer rules. VI. Conclusion We have tried to implement a bidirectional RBMT system between Assamese and English language. The results obtained from our work are encouraging but not satisfactory. We are working on the transfer rules and increasing the coverage of words in our dictionary. In the future we are looking forward to improve the translation quality of our system and make it a full-fledged MT system. The long-term plan of our project is also to make our system hybrid, taking the good qualities of both RBMT and SMT systems and see if it gives a better translation output. References [1] “List of language pairs”, Available: http://wiki.apertium.org/wiki/List_of_language_pairs [2] Antony P.J., “Machine Translation and Survey for Indian Languages”, “The Association for Computational Linguistics and Chinese Language Processing” Vol. 18, No. 1, March 2013, pp. 47-78. [3] “Trunk”, Available: http://wiki.apertium.org/wiki/Trunk. [4] “Lttoolbox”, Available: http://wiki.apertium.org/wiki/Lttoolbox. [5] Carme Armentano-Oller, Rafael C. Carrasco, Antonio M. Corbí-Bellot, Mikel L. Forcada, Mireia Ginestí-Rosell, Sergio Ortiz-Rojas, Juan Antonio Pérez-Ortiz, Gema Ramírez-Sánchez, Felipe Sánchez-Martínez, Miriam A. Scalco (2006) "Open-source Portuguese- Spanish machine translation", in In Lecture Notes in Computer Science 3960 (Computational Processing of the Portuguese Language, Proceedings of the 7th International Workshop on Computational Processing of Written and Spoken Portuguese, PROPOR 2006), May 13-17, 2006, ME - RJ / Itatiaia, Rio de Janeiro, Brazil. ((c) Springer-Verlag 2006), p. 50-59 [6] Francis M. Tyers, Jacob Nordfalk, “Shallow-transfer rule-based machine translation for Swedish to Danish”, “Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation “, 2009, pp. 27-33 [7] Forcada, M. L., Ginestí-Rosell, M., Nordfalk, J., O’Regan, J., Ortiz-Rojas, S., Pérez-Ortiz, J. A., Sánchez-Martínez, F., Ramírez- Sánchez, G., and Tyers, F. M. (2011). Apertium: a free/opensource platform for rule-based machine translation. Machine Translation, 25(2):127–144. [8] “Apertium New Language Pair HOWTO”, Available: http://wiki.apertium.org/wiki/Apertium_New_Language_Pair_HOWTO [9] “Assamese Language”, Available: http://en.wikipedia.org/wiki/Assamese_Language. [10] “Chunking: A full example”, Available: http://wiki.apertium.org/wiki/Chunking:_A_full_example. [11] “Evaluation”, Available: http:// wiki.apertium.org/wiki/Evaluation. [12] Roche, E., Schabes, Y.: Introduction. In Roche, E., Schabes, Y., eds.: Finite-State Language Processing. MIT Press, Cambridge, Mass. (1997) 1–65. [13] Garrido-Alenda, A., Forcada, M.L., Carrasco, R.C.: Incremental construction and maintenance of morphological analysers based on augmented letter transducers. In: Proceedings of TMI 2002 (Theoretical and Methodological Issues in Machine Translation, Keihanna/Kyoto, Japan, March 2002). (2002) 53–62 [14] “Transfer Rules Examples”, Available: http://wiki.apertium.org/wiki/Transfer_rules_examples Acknowledgement The authors are thankful to the Department of Information Technology, Gauhati University for providing us the corpus, which helped us in making the dictionaries and building the RBMT system with Apertium.