Machine translation is being carried out by the researchers from quite a long time. However, it is still a
dream to materialize flawless Machine Translator and the small numbers of researchers has focussed at
translating Marathi Text to English. Perfect Machine Translation Systems have not yet been fully built
owing to the fact that languages differ syntactically as well as morphologically. Majority of the researchers
have opted for Statistical Machine translation whereas in this paper we have addressed the challenges of
Rule based Machine Translation. The paper describes the major divergences observed in language
Marathi and English and many challenges encountered while attempting to build machine translation
system form Marathi to English using rule based approach and rules to handle these challenges. As there
are exceptions to the rules and limit to the feasibility of maintaining knowledgebase, the practical machine
translation from Marathi to English is a complex task.
Phrase Identification is one of the most critical and widely studied in Natural Language processing (NLP) tasks. Verb Phrase Identification within a sentence is very useful for a variety of application on NLP. One of the core enabling technologies required in NLP applications is a Morphological Analysis. This paper presents the Myanmar Verb Phrase Identification and Translation Algorithm and develops a Markov Model with Morphological Analysis. The system is based on Rule-Based Maximum Matching Approach. In Machine Translation, Large amount of information is needed to guide the translation process. Myanmar Language is inflected language and there are very few creations and researches of Lexicon in Myanmar, comparing to other language such as English, French and Czech etc. Therefore, this system is proposed Myanmar Verb Phrase identification and translation model based on Syntactic Structure and Morphology of Myanmar Language by using Myanmar- English bilingual lexicon. Markov Model is also used to reformulate the translation probability of Phrase pairs. Experiment results showed that proposed system can improve translation quality by applying morphological analysis on Myanmar Language.
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELSijnlc
Globalization and growth of Internet users truly demands for almost all internet based applications to
support
l
oca
l l
anguages. Support
of
l
oca
l
l
anguages can be
given in all internet based applications by
means of Machine Transliteration
and
Machine Translation
.
This paper provides the thorough survey on
machine transliteration models and machine learning
approaches
used for machine transliteration
over the
period
of more than two decades
for internationally used languages as well as Indian languages.
Survey
shows that linguistic approach provides better results for the closely related languages and probability
based statistical approaches are good when one of the
languages is phonetic and other is non
-
phonetic.
B
etter accuracy can be achieved only by using Hybrid and Combined models.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachIJERA Editor
The language is an effective medium for the communication that conveys the ideas and expression of the human
mind. There are more than 5000 languages in the world for the communication. To know all these languages is
not a solution for problems due to the language barrier in communication. In this multilingual world with the
huge amount of information exchanged between various regions and in different languages in digitized format,
it has become necessary to find an automated process to convert from one language to another. Natural
Language Processing (NLP) is one of the hot areas of research that explores how computers can be utilizing to
understand and manipulate natural language text or speech. In the Proposed system a Hybrid approach to
transliterate the proper nouns from Punjabi to Hindi is developed. Hybrid approach in the proposed system is a
combination of Direct Mapping, Rule based approach and Statistical Machine Translation approach (SMT).
Proposed system is tested on various proper nouns from different domains and accuracy of the proposed system
is very good.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
Tamil-English Document Translation Using Statistical Machine Translation Appr...baskaran_md
The Paper presents a new method for translating a text document from Tamil to English. Our method is based on the Statistical Machine Translation Approach, combined with the Morphological Analysis, due to the fact that Tamil is a highly-inflected language. This paper presents a slight modification in SMT to make the approach more efficient and effective, and the experimental results have proven the method to be speed and accurate in the translation process.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Phrase Identification is one of the most critical and widely studied in Natural Language processing (NLP) tasks. Verb Phrase Identification within a sentence is very useful for a variety of application on NLP. One of the core enabling technologies required in NLP applications is a Morphological Analysis. This paper presents the Myanmar Verb Phrase Identification and Translation Algorithm and develops a Markov Model with Morphological Analysis. The system is based on Rule-Based Maximum Matching Approach. In Machine Translation, Large amount of information is needed to guide the translation process. Myanmar Language is inflected language and there are very few creations and researches of Lexicon in Myanmar, comparing to other language such as English, French and Czech etc. Therefore, this system is proposed Myanmar Verb Phrase identification and translation model based on Syntactic Structure and Morphology of Myanmar Language by using Myanmar- English bilingual lexicon. Markov Model is also used to reformulate the translation probability of Phrase pairs. Experiment results showed that proposed system can improve translation quality by applying morphological analysis on Myanmar Language.
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELSijnlc
Globalization and growth of Internet users truly demands for almost all internet based applications to
support
l
oca
l l
anguages. Support
of
l
oca
l
l
anguages can be
given in all internet based applications by
means of Machine Transliteration
and
Machine Translation
.
This paper provides the thorough survey on
machine transliteration models and machine learning
approaches
used for machine transliteration
over the
period
of more than two decades
for internationally used languages as well as Indian languages.
Survey
shows that linguistic approach provides better results for the closely related languages and probability
based statistical approaches are good when one of the
languages is phonetic and other is non
-
phonetic.
B
etter accuracy can be achieved only by using Hybrid and Combined models.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachIJERA Editor
The language is an effective medium for the communication that conveys the ideas and expression of the human
mind. There are more than 5000 languages in the world for the communication. To know all these languages is
not a solution for problems due to the language barrier in communication. In this multilingual world with the
huge amount of information exchanged between various regions and in different languages in digitized format,
it has become necessary to find an automated process to convert from one language to another. Natural
Language Processing (NLP) is one of the hot areas of research that explores how computers can be utilizing to
understand and manipulate natural language text or speech. In the Proposed system a Hybrid approach to
transliterate the proper nouns from Punjabi to Hindi is developed. Hybrid approach in the proposed system is a
combination of Direct Mapping, Rule based approach and Statistical Machine Translation approach (SMT).
Proposed system is tested on various proper nouns from different domains and accuracy of the proposed system
is very good.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
Tamil-English Document Translation Using Statistical Machine Translation Appr...baskaran_md
The Paper presents a new method for translating a text document from Tamil to English. Our method is based on the Statistical Machine Translation Approach, combined with the Morphological Analysis, due to the fact that Tamil is a highly-inflected language. This paper presents a slight modification in SMT to make the approach more efficient and effective, and the experimental results have proven the method to be speed and accurate in the translation process.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES ijnlc
Stemming is the process of term conflation. It conflates all the word variants to a common form called as stem. It plays significant role in numerous Natural Language Processing (NLP) applications like morphological analysis, parsing, document summarization, text classification, part-of-speech tagging, question-answering system, machine translation, word sense disambiguation, information retrieval (IR), etc. Each of these tasks requires some pre-processing to be done. Stemming is one of the important building blocks for all these applications. This paper, presents an overview of various stemming techniques, evaluation criteria for stemmers and various existing stemmers for Indic languages.
Transliteration by orthography or phonology for hindi and marathi to english ...ijnlc
e-Governance and Web based online commercial multilingual applications has given utmost importance to
the task of translation and transliteration. The Named Entities and Technical Terms occur in the source
language of translation are called out of vocabulary words as they are not available in the multilingual
corpus or dictionary used to support translation process. These Named Entities and Technical Terms need
to be transliterated from source language to target language without losing their phonetic properties. The
fundamental problem in India is that there is no set of rules available to write the spellings in English for
Indian languages according to the linguistics. People are writing different spellings for the same name at
different places. This fact certainly affects the Top-1 accuracy of the transliteration and in turn the
translation process. Major issue noticed by us is the transliteration of named entities consisting three
syllables or three phonetic units in Hindi and Marathi languages where people use mixed approach to
write the spelling either by orthographical approach or by phonological approach. In this paper authors
have provided their opinion through experimentation about appropriateness of either approach.
Machine Translation Approaches and Design AspectsIOSR Journals
Machine translation is a sub-field of computational linguistics that investigates the use of software to
translate text or speech from one natural language to another. On a basic level, MT performs simple
substitution of words in one natural language for words in another, but that alone usually cannot produce a
good translation of a text, because recognition of whole phrases and their closest counterparts in the target
language is needed. The paper focuses on Example Based Machine Translation (EBMT) system that translates
sentences from English to Hindi. Development of a machine translation (MT) system typically demands a large
volume of computational resources. For example, rule based MT systems require extraction of syntactic and
semantic knowledge in the form of rules, statistics-based MT systems require huge parallel corpus containing
sentences in the source languages and their translations in target language. Requirement of such computational
resources is much less in respect of EBMT. This makes development of EBMT systems for English to Hindi
translation feasible, where availability of large-scale computational resources is still scarce. Example based
machine translation relies on the database for its translation. The frequency of word occurrence is important for
translation.
A New Approach to Parts of Speech Tagging in Malayalamijcsit
Parts-of-speech tagging is the process of labeling each word in a sentence. A tag mentions the word’s
usage in the sentence. Usually, these tags indicate syntactic classification like noun or verb, and sometimes
include additional information, with case markers (number, gender etc) and tense markers. A large number
of current language processing systems use a parts-of-speech tagger for pre-processing.
There are mainly two approaches usually followed in Parts of Speech Tagging. Those are Rule based
Approach and Stochastic Approach. Rule based Approach use predefined handwritten rules. This is the
oldest approach and it use lexicon or dictionary for reference. Stochastic Approach use probabilistic and
statistical information to assign tag to words. It use large corpus, so that Time complexity and Space
complexity is high whereas Rule base approach has less complexity for both Time and Space. Stochastic
Approach is the widely used one nowadays because of its accuracy.
Malayalam is a Dravidian family of languages, inflectional with suffixes with the root word forms. The
currently used Algorithms are efficient Machine Learning Algorithms but these are not built for
Malayalam. So it affects the accuracy of the result of Malayalam POS Tagging.
My proposed Approach use Dictionary entries along with adjacent tag information. This algorithm use
Multithreaded Technology. Here tagging done with the probability of the occurrence of the sentence
structure along with the dictionary entry.
Survey on Indian CLIR and MT systems in Marathi LanguageEditor IJCATR
Cross Language Information Retrieval (CLIR) deals with retrieving relevant information stored in a language different from
the language of user’s query. This helps users to express the information need in their native languages. Machine translation based (MTbased)
approach of CLIR uses existing machine translation techniques to provide automatic translation of queries. This paper covers the
research work done in CLIR and MT systems for Marathi language in India.
Marathi Text-To-Speech Synthesis using Natural Language Processingiosrjce
IOSR journal of VLSI and Signal Processing (IOSRJVSP) is a double blind peer reviewed International Journal that publishes articles which contribute new results in all areas of VLSI Design & Signal Processing. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on advanced VLSI Design & Signal Processing concepts and establishing new collaborations in these areas.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
IOSR journal of VLSI and Signal Processing (IOSRJVSP) is a double blind peer reviewed International Journal that publishes articles which contribute new results in all areas of VLSI Design & Signal Processing. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on advanced VLSI Design & Signal Processing concepts and establishing new collaborations in these areas.Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...IJECEIAES
In Natural Language Processing (NLP), Word segmentation and Part-ofSpeech (POS) tagging are fundamental tasks. The POS information is also necessary in NLP’s preprocessing work applications such as machine translation (MT), information retrieval (IR), etc. Currently, there are many research efforts in word segmentation and POS tagging developed separately with different methods to get high performance and accuracy. For Myanmar Language, there are also separate word segmentors and POS taggers based on statistical approaches such as Neural Network (NN) and Hidden Markov Models (HMMs). But, as the Myanmar language's complex morphological structure, the OOV problem still exists. To keep away from error and improve segmentation by utilizing POS data, segmentation and labeling should be possible at the same time.The main goal of developing POS tagger for any Language is to improve accuracy of tagging and remove ambiguity in sentences due to language structure. This paper focuses on developing word segmentation and Part-of- Speech (POS) Tagger for Myanmar Language. This paper presented the comparison of separate word segmentation and POS tagging with joint word segmentation and POS tagging.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
ABSTRACT
Natural Language processing is an interdisciplinary branch of linguistic and computer science studied under the Artificial Intelligence (AI) that gave birth to an allied area called
‘Computational Linguistic’ which focuses on processing of natural languages on computational devices. A natural language consists of a large number of sentences which are linguistic units involving one or more words linked together in accordance with a set of predefined rules called grammar. Grammar checking is the task of validating sentences syntactically and is a prominent tool within language engineering. Our review draws on the recent development of various grammar checkers to look at past, present and the future in a new light. Our review covers grammar checkers of many languages with the aim of seeking their approaches, methodologies for developing new tool and system as a whole. The survey concludes with the discussion of various features included in existing grammar checkers of foreign languages as well as a few Indian Languages.
ANNOTATED GUIDELINES AND BUILDING REFERENCE CORPUS FOR MYANMAR-ENGLISH WORD A...ijnlc
Reference corpus for word alignment is an important resource for developing and evaluating word alignment methods. For Myanmar-English language pairs, there is no reference corpus to evaluate the word alignment tasks. Therefore, we created the guidelines for Myanmar-English word alignment annotation between two languages over contrastive learning and built the Myanmar-English reference corpus consisting of verified alignments from Myanmar ALT of the Asian Language Treebank (ALT). This reference corpus contains confident labels sure (S) and possible (P) for word alignments which are used to test for the purpose of evaluation of the word alignments tasks. We discuss the most linking ambiguities to define consistent and systematic instructions to align manual words. We evaluated the results of annotators agreement using our reference corpus in terms of alignment error rate (AER) in word alignment tasks and discuss the words relationships in terms of BLEU scores.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with each other for different purposes using various languages. Machine translation is highly demanding due to increasing the usage of web based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding whether a word is a naming word or not. Here we have introduced a new approach to identify naming word from a Bengali sentence for machine translation system without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion with minimal storing word in dictionary.
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorWaqas Tariq
In recent decades speech interactive systems have gained increasing importance. Performance of an ASR system mainly depends on the availability of large corpus of speech. The conventional method of building a large vocabulary speech recognizer for any language uses a top-down approach to speech. This approach requires large speech corpus with sentence or phoneme level transcription of the speech utterances. The transcriptions must also include different speech order so that the recognizer can build models for all the sounds present. But, for Telugu language, because of its complex nature, a very large, well annotated speech database is very difficult to build. It is very difficult, if not impossible, to cover all the words of any Indian language, where each word may have thousands and millions of word forms. A significant part of grammar that is handled by syntax in English (and other similar languages) is handled within morphology in Telugu. Phrases including several words (that is, tokens) in English would be mapped on to a single word in Telugu.Telugu language is phonetic in nature in addition to rich in morphology. That is why the speech technology developed for English cannot be applied to Telugu language. This paper highlights the work carried out in an attempt to build a voice enabled text editor with capability of automatic term suggestion. Main claim of the paper is the recognition enhancement process developed by us for suitability of highly inflecting, rich morphological languages. This method results in increased speech recognition accuracy with very much reduction in corpus size. It also adapts Telugu words to the database dynamically, resulting in growth of the corpus.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat
services. As a result, a text containing various English and Bangla words and phrases has become
increasingly common. Existing transliteration tools display poor performance for such texts. This paper
proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English
words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of
dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it
significantly outperforms the benchmark transliteration tool.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES ijnlc
Stemming is the process of term conflation. It conflates all the word variants to a common form called as stem. It plays significant role in numerous Natural Language Processing (NLP) applications like morphological analysis, parsing, document summarization, text classification, part-of-speech tagging, question-answering system, machine translation, word sense disambiguation, information retrieval (IR), etc. Each of these tasks requires some pre-processing to be done. Stemming is one of the important building blocks for all these applications. This paper, presents an overview of various stemming techniques, evaluation criteria for stemmers and various existing stemmers for Indic languages.
Transliteration by orthography or phonology for hindi and marathi to english ...ijnlc
e-Governance and Web based online commercial multilingual applications has given utmost importance to
the task of translation and transliteration. The Named Entities and Technical Terms occur in the source
language of translation are called out of vocabulary words as they are not available in the multilingual
corpus or dictionary used to support translation process. These Named Entities and Technical Terms need
to be transliterated from source language to target language without losing their phonetic properties. The
fundamental problem in India is that there is no set of rules available to write the spellings in English for
Indian languages according to the linguistics. People are writing different spellings for the same name at
different places. This fact certainly affects the Top-1 accuracy of the transliteration and in turn the
translation process. Major issue noticed by us is the transliteration of named entities consisting three
syllables or three phonetic units in Hindi and Marathi languages where people use mixed approach to
write the spelling either by orthographical approach or by phonological approach. In this paper authors
have provided their opinion through experimentation about appropriateness of either approach.
Machine Translation Approaches and Design AspectsIOSR Journals
Machine translation is a sub-field of computational linguistics that investigates the use of software to
translate text or speech from one natural language to another. On a basic level, MT performs simple
substitution of words in one natural language for words in another, but that alone usually cannot produce a
good translation of a text, because recognition of whole phrases and their closest counterparts in the target
language is needed. The paper focuses on Example Based Machine Translation (EBMT) system that translates
sentences from English to Hindi. Development of a machine translation (MT) system typically demands a large
volume of computational resources. For example, rule based MT systems require extraction of syntactic and
semantic knowledge in the form of rules, statistics-based MT systems require huge parallel corpus containing
sentences in the source languages and their translations in target language. Requirement of such computational
resources is much less in respect of EBMT. This makes development of EBMT systems for English to Hindi
translation feasible, where availability of large-scale computational resources is still scarce. Example based
machine translation relies on the database for its translation. The frequency of word occurrence is important for
translation.
A New Approach to Parts of Speech Tagging in Malayalamijcsit
Parts-of-speech tagging is the process of labeling each word in a sentence. A tag mentions the word’s
usage in the sentence. Usually, these tags indicate syntactic classification like noun or verb, and sometimes
include additional information, with case markers (number, gender etc) and tense markers. A large number
of current language processing systems use a parts-of-speech tagger for pre-processing.
There are mainly two approaches usually followed in Parts of Speech Tagging. Those are Rule based
Approach and Stochastic Approach. Rule based Approach use predefined handwritten rules. This is the
oldest approach and it use lexicon or dictionary for reference. Stochastic Approach use probabilistic and
statistical information to assign tag to words. It use large corpus, so that Time complexity and Space
complexity is high whereas Rule base approach has less complexity for both Time and Space. Stochastic
Approach is the widely used one nowadays because of its accuracy.
Malayalam is a Dravidian family of languages, inflectional with suffixes with the root word forms. The
currently used Algorithms are efficient Machine Learning Algorithms but these are not built for
Malayalam. So it affects the accuracy of the result of Malayalam POS Tagging.
My proposed Approach use Dictionary entries along with adjacent tag information. This algorithm use
Multithreaded Technology. Here tagging done with the probability of the occurrence of the sentence
structure along with the dictionary entry.
Survey on Indian CLIR and MT systems in Marathi LanguageEditor IJCATR
Cross Language Information Retrieval (CLIR) deals with retrieving relevant information stored in a language different from
the language of user’s query. This helps users to express the information need in their native languages. Machine translation based (MTbased)
approach of CLIR uses existing machine translation techniques to provide automatic translation of queries. This paper covers the
research work done in CLIR and MT systems for Marathi language in India.
Marathi Text-To-Speech Synthesis using Natural Language Processingiosrjce
IOSR journal of VLSI and Signal Processing (IOSRJVSP) is a double blind peer reviewed International Journal that publishes articles which contribute new results in all areas of VLSI Design & Signal Processing. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on advanced VLSI Design & Signal Processing concepts and establishing new collaborations in these areas.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
IOSR journal of VLSI and Signal Processing (IOSRJVSP) is a double blind peer reviewed International Journal that publishes articles which contribute new results in all areas of VLSI Design & Signal Processing. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on advanced VLSI Design & Signal Processing concepts and establishing new collaborations in these areas.Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...IJECEIAES
In Natural Language Processing (NLP), Word segmentation and Part-ofSpeech (POS) tagging are fundamental tasks. The POS information is also necessary in NLP’s preprocessing work applications such as machine translation (MT), information retrieval (IR), etc. Currently, there are many research efforts in word segmentation and POS tagging developed separately with different methods to get high performance and accuracy. For Myanmar Language, there are also separate word segmentors and POS taggers based on statistical approaches such as Neural Network (NN) and Hidden Markov Models (HMMs). But, as the Myanmar language's complex morphological structure, the OOV problem still exists. To keep away from error and improve segmentation by utilizing POS data, segmentation and labeling should be possible at the same time.The main goal of developing POS tagger for any Language is to improve accuracy of tagging and remove ambiguity in sentences due to language structure. This paper focuses on developing word segmentation and Part-of- Speech (POS) Tagger for Myanmar Language. This paper presented the comparison of separate word segmentation and POS tagging with joint word segmentation and POS tagging.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
ABSTRACT
Natural Language processing is an interdisciplinary branch of linguistic and computer science studied under the Artificial Intelligence (AI) that gave birth to an allied area called
‘Computational Linguistic’ which focuses on processing of natural languages on computational devices. A natural language consists of a large number of sentences which are linguistic units involving one or more words linked together in accordance with a set of predefined rules called grammar. Grammar checking is the task of validating sentences syntactically and is a prominent tool within language engineering. Our review draws on the recent development of various grammar checkers to look at past, present and the future in a new light. Our review covers grammar checkers of many languages with the aim of seeking their approaches, methodologies for developing new tool and system as a whole. The survey concludes with the discussion of various features included in existing grammar checkers of foreign languages as well as a few Indian Languages.
ANNOTATED GUIDELINES AND BUILDING REFERENCE CORPUS FOR MYANMAR-ENGLISH WORD A...ijnlc
Reference corpus for word alignment is an important resource for developing and evaluating word alignment methods. For Myanmar-English language pairs, there is no reference corpus to evaluate the word alignment tasks. Therefore, we created the guidelines for Myanmar-English word alignment annotation between two languages over contrastive learning and built the Myanmar-English reference corpus consisting of verified alignments from Myanmar ALT of the Asian Language Treebank (ALT). This reference corpus contains confident labels sure (S) and possible (P) for word alignments which are used to test for the purpose of evaluation of the word alignments tasks. We discuss the most linking ambiguities to define consistent and systematic instructions to align manual words. We evaluated the results of annotators agreement using our reference corpus in terms of alignment error rate (AER) in word alignment tasks and discuss the words relationships in terms of BLEU scores.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with each other for different purposes using various languages. Machine translation is highly demanding due to increasing the usage of web based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding whether a word is a naming word or not. Here we have introduced a new approach to identify naming word from a Bengali sentence for machine translation system without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion with minimal storing word in dictionary.
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorWaqas Tariq
In recent decades speech interactive systems have gained increasing importance. Performance of an ASR system mainly depends on the availability of large corpus of speech. The conventional method of building a large vocabulary speech recognizer for any language uses a top-down approach to speech. This approach requires large speech corpus with sentence or phoneme level transcription of the speech utterances. The transcriptions must also include different speech order so that the recognizer can build models for all the sounds present. But, for Telugu language, because of its complex nature, a very large, well annotated speech database is very difficult to build. It is very difficult, if not impossible, to cover all the words of any Indian language, where each word may have thousands and millions of word forms. A significant part of grammar that is handled by syntax in English (and other similar languages) is handled within morphology in Telugu. Phrases including several words (that is, tokens) in English would be mapped on to a single word in Telugu.Telugu language is phonetic in nature in addition to rich in morphology. That is why the speech technology developed for English cannot be applied to Telugu language. This paper highlights the work carried out in an attempt to build a voice enabled text editor with capability of automatic term suggestion. Main claim of the paper is the recognition enhancement process developed by us for suitability of highly inflecting, rich morphological languages. This method results in increased speech recognition accuracy with very much reduction in corpus size. It also adapts Telugu words to the database dynamically, resulting in growth of the corpus.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat
services. As a result, a text containing various English and Bangla words and phrases has become
increasingly common. Existing transliteration tools display poor performance for such texts. This paper
proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English
words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of
dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it
significantly outperforms the benchmark transliteration tool.
EXTENDING A MODEL FOR ONTOLOGY-BASED ARABIC-ENGLISH MACHINE TRANSLATION (NAN) ijaia
The acceleration in telecommunication needs leads to many groups of research, especially in communication facilitating and Machine Translation fields. While people contact with others having different languages and cultures, they need to have instant translations. However, the available instant translators are still providing somewhat bad Arabic-English Translations, for instance when translating books or articles, the meaning is not totally accurate. Therefore, using the semantic web techniques to deal with the homographs and homonyms semantically, the aim of this research is to extend a model for the ontology-based Arabic-English Machine Translation, named NAN, which simulate the human way in translation. The experimental results show that NAN translation is approximately more similar to the Human Translation than the other instant translators. The resulted translation will help getting the translated texts in the target language somewhat correctly and semantically more similar to human translations for the Non-Arabic Natives and the Non-English natives.
Quality estimation of machine translation outputs through stemmingijcsa
Machine Translation is the challenging problem for Indian languages. Every day we can see some machine
translators being developed , but getting a high quality automatic translation is still a very distant dream .
The correct translated sentence for Hindi language is rarely found. In this paper, we are emphasizing on
English-Hindi language pair, so in order to preserve the correct MT output we present a ranking system,
which employs some machine learning techniques and morphological features. In ranking no human
intervention is required. We have also validated our results by comparing it with human ranking.
This paper presents an unsupervised approach for the development of a stemmer (For the case of Urdu &
Marathi language). Especially, during last few years, a wide range of information in Indian regional
languages has been made available on web in the form of e-data. But the access to these data repositories
is very low because the efficient search engines/retrieval systems supporting these languages are very
limited. Hence automatic information processing and retrieval is become an urgent requirement. To train
the system training dataset, taken from CRULP [22] and Marathi corpus [23] are used. For generating
suffix rules two different approaches, namely, frequency based stripping and length based stripping have
been proposed. The evaluation has been made on 1200 words extracted from the Emille corpus. The
experiment results shows that in the case of Urdu language the frequency based suffix generation approach gives the maximum accuracy of 85.36% whereas Length based suffix stripping algorithm gives maximum accuracy of 79.76%. In the case of Marathi language the systems gives 63.5% accuracy in the case of frequency based stripping and achieves maximum accuracy of 82.5% in the case of length based suffix stripping algorithm.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisIJERA Editor
We describe in detail a Grapheme-to-Phoneme (G2P) converter required for the development of a good quality
Marathi Text-to-Speech (TTS) system. The Festival and Festvox framework is chosen for developing the
Marathi TTS system. Since Festival does not provide complete language processing support specie to various
languages, it needs to be augmented to facilitate the development of TTS systems in certain new languages.
Because of this, a generic G2P converter has been developed. In the customized Marathi G2P converter, we
have handled schwa deletion and compound word extraction. In the experiments carried out to test the Marathi
G2P on a text segment of 2485 words, 91.47% word phonetisation accuracy is obtained. This Marathi G2P has
been used for phonetising large text corpora which in turn is used in designing an inventory of phonetically rich
sentences. The sentences ensured a good coverage of the phonetically valid di-phones using only 1.3% of the
complete text corpora.
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...ijnlc
Machine Translation for Indian languages is an emerging research area. Transliteration is one such module that we design while designing a translation system. Transliteration means mapping of source language text into the target language. Simple mapping decreases the efficiency of overall translation system. We propose the use of stemming and part-of-speech tagging for transliteration. The effectiveness of translation can be improved if we use part-of-speech tagging and stemming assisted transliteration. We have shown that much of the content in Gujarati gets transliterated while being processed for translation to Hindi language.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Design and Development of a Malayalam to English Translator- A Transfer Based...Waqas Tariq
This paper describes a transfer based scheme for translating Malayalam, a Dravidian language, to English. This system inputs Malayalam sentences and outputs equivalent English sentences. The system comprises of a preprocessor for splitting the compound words, a morphological parser for context disambiguation and chunking, a syntactic structure transfer module and a bilingual dictionary. All the modules are morpheme based to reduce dictionary size. The system does not rely on a stochastic approach and it is based on a rule-based architecture along with various linguistic knowledge components of both Malayalam and English. The system uses two sets of rules: rules for Malayalam morphology and rules for syntactic structure transfer from Malayalam to English. The system is designed using artificial intelligence techniques.
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...kevig
Neural machine translation is a new approach to machine translation that has shown the effective results
for high-resource languages. Recently, the attention-based neural machine translation with the large scale
parallel corpus plays an important role to achieve high performance for translation results. In this
research, a parallel corpus for Myanmar-English language pair is prepared and attention-based neural
machine translation models are introduced based on word to word level, character to word level, and
syllable to word level. We do the experiments of the proposed model to translate the long sentences and to
address morphological problems. To decrease the low resource problem, source side monolingual data are
also used. So, this work investigates to improve Myanmar to English neural machine translation system.
The experimental results show that syllable to word level neural mahine translation model obtains an
improvement over the baseline systems.
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ijnlc
Neural machine translation is a new approach to machine translation that has shown the effective results
for high-resource languages. Recently, the attention-based neural machine translation with the large scale
parallel corpus plays an important role to achieve high performance for translation results. In this
research, a parallel corpus for Myanmar-English language pair is prepared and attention-based neural
machine translation models are introduced based on word to word level, character to word level, and
syllable to word level. We do the experiments of the proposed model to translate the long sentences and to
address morphological problems. To decrease the low resource problem, source side monolingual data are
also used. So, this work investigates to improve Myanmar to English neural machine translation system.
The experimental results show that syllable to word level neural mahine translation model obtains an
improvement over the baseline systems.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
1. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
DOI: 10.5121/ijnlc.2019.8404 39
HANDLING CHALLENGES IN RULE BASED
MACHINE TRANSLATION FROM MARATHI TO
ENGLISH
Namrata G Kharate1
, Dr.Varsha H. Patil2
1
Department of Computer Engineering, VIIT,Pune, Maharashtra, India
2
Head of Department, Department of Computer Engineering, MCOERC, Nashik,
Maharashtra, India
ABSTRACT
Machine translation is being carried out by the researchers from quite a long time. However, it is still a
dream to materialize flawless Machine Translator and the small numbers of researchers has focussed at
translating Marathi Text to English. Perfect Machine Translation Systems have not yet been fully built
owing to the fact that languages differ syntactically as well as morphologically. Majority of the researchers
have opted for Statistical Machine translation whereas in this paper we have addressed the challenges of
Rule based Machine Translation. The paper describes the major divergences observed in language
Marathi and English and many challenges encountered while attempting to build machine translation
system form Marathi to English using rule based approach and rules to handle these challenges. As there
are exceptions to the rules and limit to the feasibility of maintaining knowledgebase, the practical machine
translation from Marathi to English is a complex task.
KEYWORDS
NLP; Machine Translation; English; Marathi; grammar.
1. INTRODUCTION
Language is one of the most popular medium of communication and there are many languages
used in the world for verbal and written communication. Different languages use different ways
to encode information.
There is a need of Translation when the information has to be communicated among the people
speaking different languages. Translation is a process of encoding the information from one
language and decoding it another language using the rules of target language. This process has
been attempted for automation between a few pair of languages since a long time. Though
accuracy is not fully achieved in the pair of languages and very less attempt has been observed in
regional languages such as Marathi – English as a pair, Machine Translation often produces
coarse yet understandable translations.
In this paper, Machine Translation from Marathi to English has been considered for simple
assertive sentences along with the challenges in the translation and rules to handle challeneges.
Marathi is an Indo-Aryan language that has more than 42 identified dialects. English, on the other
hand is a West Germanic language. Its origin is in the Anglo-Frisian dialects of North West
Germany and the Netherlands[2]English is now considered as a global language, whereas Marathi
is a language spoken mainly in the central and Western regions of India. English is spoken as a
first language by around 375 million people whereas the number of Marathi speakers is 90
2. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
40
million speakers worldwide. Marathi is the 15th most spoken language in the world and 4th most
spoken language in India [6].
Many official documents and lot of information these days are available in the Marathi language,
especially in a state of Maharashtra. Existing documents that are currently in the Marathi
language need to be translated to English for their widespread use. Manual translation is very
costly and time consuming and hence there is a need to have an automated translation system
which would do the language translation in an effective way. There are major challenges in the
process due to the structural difference between Marathi language and English language. English
follows Subject-Verb-Object grammar structure, while Marathi language follows Subject-Object-
Verb grammar structure, relatively of free word order and has large number of inflections. Hence
its translation to English is a task [5] and handling challenges is most challenging task.to handle
challenges need to design countless rules and endless lexicon dictionary.
Further, Marathi is highly dominated by inflections and case-suffixes. Thus, a rule based machine
translation system from Marathi to English would have to take into consideration these
differences in the languages. Such a Machine Translation system will not only promote the
language on a global scale, but it will also open the gates to the people who are facing problems
while translating Marathi to English.
Google translator is only tool available for Marathi to English translation .It uses Statistical
Machine Translation that is machine translation in which translation is done using statistical
translation models, parameters of which are derived from the analysis of bilingual text corpora. If
corresponding word is not found in the text corpora, accurate translation is not obtained.
Moreover the Google translate does not check the syntax of the given sentence. [7]
2. RELATED WORK
In the existing literature, the issue of translation divergence for Marathi and English MT has not
been exhaustively examined. S. B. Kulkarni [2] discuss syntactic and structural divergence issues
in English-Marathi machine translation and the same translation pair is then examined for reverse
translation so as to examine the nature of the divergence in each case. R.K.sinha[3] discuss
different types of translation divergences in Hindi and English MT. G.V.Garje [4] describes the
differences between the languages English and Marathi from a Machine Translation point of view
and also encountered challenges while attempting to build a Machine Translation system from
English to Marathi using Rule based Machine Translation approach. In this paper we describes
the major divergences observed in language Marathi and English and many challenges
encountered while attempting to build machine translation system form Marathi to English using
rule based approach.
3. CHALLENGES IN TRANSLATION
Rule Based Machine Translation uses grammar to formulate transfer-rules from source language
to target language. At times, these grammatical rules may not be formally defined. The transfer
rules include rules for word-reordering, disambiguation and grammatical additions in the target
language. Formation of transfer-rules in a language pair belonging to distant families is a
daunting task. Following are the few major challenges authors have come across [1] [3] [4] [13].
1. Unavailability of Lexical Resources
2. Constituent-order Divergence
3. Adjunction Divergence
4. Pleonastic Divergence
3. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
41
5. Case suffix
6. Divergence in Determiner System
7. Replicative Words
8. Expressive Elements
9. Indirect Speech
10. Mapping of Time
11. Difference in methods of encoding information
12. Lexical Gap
13. Adposition
14. Difference in inaminate objects
15. Capitalization
16. Noun Inflection
17. Verb Inflection
1. Unavailability of Lexical Resources
Marathi is a very low resource language [9]. The lexemes in Marathi have their own morphology.
It is needed to acquaint with the properties of the language for translation. In high resource
languages these may be acquired using language analysis tools like parsers, POS taggers and
Named Entity Taggers. The semantic information tools such as language pair dictionaries and
wordnets provide with senses of the lexemes. These are later helpful in word sense
disambiguation. However these tools are not yet fully developed for Marathi.[4]. So it becomes
very difficult to translate as well as disambiguate the words in the sentence. The parallel corpora
for Marathi and Englsih are not sufficient to pursue statistical machine translation. Machine
translation methods such as Statistical MT and Corpus based MT require a large amount of
corpora which are not yet available on a large scale. This poses a restriction to the number of
methods which can be used for such a translation system.
2. Constituent-order Divergence
Constituent-order divergence relates to the word-order distinctions between English and Marathi.
Essentially, the constituent order describes where the specifier and the complements of a phrase
are positioned. For example, in English the complement of a verb is placed after the verb and the
specifier of the verb is placed before. Thus English is a Subject-Verb-Object (SVO) language.
Marathi, on the other hand, is an Subject-Object- Verb (SOV) language. Example 1 shows the
constituent-order divergence between English and Marathi.
Ex.1. तो आंबा खातो आहे.
S O V
He is eating mango.
S V O
3. Adjunction Divergence
Syntactic divergences associated with different types of adjunct structures are classified as
Adjunction divergence. Marathi and English differ in the possible positioning of the adjective
phrase. In Marathi a Prepositional Phrase(PP)and/or adjective phrase(AP)can be placed between a
verb and its object or before the object, while in English it can generally at the terminal of the
sentence; consider the following example,
4. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
42
Ex.2. मी उद्या माझ्या बाइकवरआणीन.
S O AP V
I will bring it tomorrow on my bike.
S V O PP
4. Pleonastic Divergence
Another related point of divergence between Marathi and English is regarding the mapping of the
words like ‘there’ and ‘it’ in the sentences in English. In English constructions, ‘there’ and ‘it’ are
used to denote existential sentences, called as introductory subject. Marathi does not have a
pleonastic subject construction and the contrast between existential and non-existential sentences
is realized by several other ways such as the movement of the noun phrase from its canonical
position and the use of demonstrative elements[1].Let us consider following sentence.
Ex.3.खोली मध्ये सापआहे.
There is a snake in the room.
साप खोली मध्ये आहे.
The snake is in the room.
It is observed that the bare noun phrase साप and ‘snake’ are mapped by indefinite and definite
noun phrases in English. However, the only difference between these two Marathi sentences is
the respective positions of the subject Noun Phrase(NP) and the खोलीमध्येadverbial phrase. This
type of divergence is related to more than one aspect of grammar such as the word order, lexical
and structural gaps in languages. Hence there is a need to examine it in detail to categorize the
type of divergence it represents.
5. Case Suffixes
In modern languages there is less number of cases. E.g. in Sanskrit and in Marathi there are 7
cases; in German there are 4 cases while in English there are mainly 2 cases.
Each having its own functional meaning and suffixes. It is difficult to identify these cases from
the Marathi sentence and also it is difficult to map cases form Marathi to English. As each case
suffix represents different meaning it is utmost important to determine the exact case of the noun.
And cases were replaced by prepositions in the evolution of languages.
In the absence of case marker the case is called as “Nominative”.
Example.
Second vibhakti is actually preposition "To"
Third vibhakti is preposition "by" as used etc.
6. Determiner System
English has articles that mark the definiteness of the noun phrase overtly. Marathi lacks an overt
article system and different devices are used to realize the definiteness of a noun phrase in
Marathi. For instance, mapping of a bare NP in Marathi onto an NP with an article “a-an/the” in
English is dependent on a detailed syntactic and semantic analysis of the noun phrases in both the
languages, as in the following example,
5. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
43
Ex.4. मी कु त्रा पाहहला.कु त्रा खूप गोंडस होता
I saw a dog. The dog was very cute
7. Replicative Words
Marathi has replicative words for which it is difficult to find an exact counterpart in European
languages such as English. Almost all kinds of words can be replicated to denote a number of
different functions in Marathi. A verb-verb replication such as पाहतापाहता in Marathi cannot be
translated as ‘see see’ in English, even though पाहता and see are the correct translations or
adjective replication उंचचउंच can be used to denote different types of functions that are mapped
onto English in various ways depending on a number of factors. For example,
Ex.5.पाहतापाहता सकाळझाली.
Ex.6.येथे उंचचउंच देवदार आणि चीड वृक्ांणिवाय काहीही नव्हते .
A closer translation will be ‘In the meanwhile, it became morning.’ And ‘There was nothing here
except for dense trees of cedar and pine’.
8. Expressive Elements
Expressive words exist in all natural languages and pose difficulty in processing, particularly in
mapping onto another language. The reason is that these words do not have exact parallel
translation in another language. Thus the word धडकन/धाडकनis only distantly mapped by
“bump” in English, as given below,
Ex.7.ती धाडकन पडली.
She fell with a ‘bump’.
The expressive words usually originate from the sound associated with the semantics of the action
verb and can be adverbial or verbalized action-verbs. One may argue this to be just a lexical gap
but indeed it is not so. However, some of these words can be handled in the lexicon but as in
many cases the mapping also involves structural changes, the issue involves a wider scope of
interpretation.
9. Indirect Speech
The indirect speech sentences in Marathi and English differ in both the form use of pronominal
elements.
Ex.8.राम म्हणाला की मी नाही जाणार.
Ram said that I/he would not go.
The use of the pronoun “मी “in the example is ambiguous and can be translated either by ‘I’ or
‘he’ in English. The example shows that the tense in the English indirect speech sentences is past
but must be mapped by present tense in the Marathi sentence. Although some aspects of this type
of translation divergence have been partially discussed in Dave et al (2001) [14] for Hindi-
English MT, but it needs further examination to comprehend.
6. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
44
10. Mappings of Time
Usually, people’s perception of different objects in the world is dependent upon several socio-
cultural beliefs. For example, time is conceptualized in the Indian culture differently than that is
used in the Western culture. These concepts are expressed through our respective languages and
difference in concepts manifests itself in the language that is the source of translation divergence.
The representation of small span of the time changes as per the language. The example below
shows that the time at the 5 o’clock in the morning is denoted by a.m. in English but the exact
translation of ‘सकाळी’ in Marathi does not produce an appropriate English expression. However,
in second example, it is noticed that the time at 3 o’clock in the afternoon which is expressed in
English by p.m. but the exact translation of ‘दुपारी’ in Marathi does not produce an appropriate
English expression.
Ex.9.तो सकाळी ५ वाजता आला.
He arrived at 5 o’clock in the morning
He arrived at 5 a.m.
Ex.10.दुपारी ३ वाजता आला.
He arrived at 3o’clock in the afternoon
He arrived at 3 p.m.
Ex.11.तो संध्याकाळी ७ वाजता आला.
He arrived at 7o’clock in the evening
He arrived at 7 p.m.
11. Difference in methods of encoding information
English uses word ordering to encode information while Marathi uses morphemes. Here in the
Marathi sentence the additions of vibhakti pratyay(case suffixes) ने (ne) and ला (la)conveys that
‘Shyam’ is the doer of the action and ‘Ram’ is the receiver of the action. Even though positions of
doer and receiver are different in both sentences, their English translations are same.
For example:
Ex.12.रामला श्यामने आंबा हदला.
Shyam gave a mango to Ram.
Or
Ex.13. श्यामने रामला आंबा हदला.
Shyam gave a mango to Ram.
12. Lexical Gap
English and Marathi are spoken by societies belonging to diverse cultural backgrounds. Hence,
there are certain concepts those exist in only one of the languages. Such concepts may cause
problems while translating.
For example: ‘कमम’ in Marathi.
A solution to this problem is that, some of these concepts are directly borrowed in English.
For example: “karma”
7. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
45
13. Adpositions
Adpositions are words those can occur before or after a phrase, word, or a clause that is necessary
to complete the meaning of a given sentence. The languages which follow SOV Structure use
postpositions. Hence, while translating a Marathi sentence (SOV structure) to an English sentence
(SVO structure), we need to change the postpositions (of Marathi) to prepositions (of English).
This is a major issue which needs to be resolved for inflecting the nouns, verbs and Cases
(Vibhakti). In example below, word ‘var’ appears after noun but in English translated sentence
word ‘on’ is before noun.
Ex.14.तो खुर्ची वर बसला
He sat on the chair
14. Difference in inanimate objects
In Marathi gender is an important morphological property of the nouns. In English the inanimate
objects are referred using the neuter gender. On the other hand, the inanimate objects in Marathi
have their own genders.
For example:
खुर्ची(F) ->Chair (N)
र्चमर्चा (M) -> Spoon (N)
15. Capitalization
In the English writing systems that distinguish between the upper and lower case have two
parallel sets of letters. While writing sentences in English there are some rules of capitalization
for example ,the first word of a sentence, The name of people, Months, days, and holidays ,The
titles of books, articles and movies etc. Name entity tagger can tag person name, place, city etc.,
so by using it we can capitalize. In the following example Marathi sentence is written in simple
but in English translated sentence first word of sentence and first word of name of state that is
Korean is capitalized.
Ex.15. हे हवद्यार्थी कोररयन आहेत.
These students are Korean
16. Noun Inflection
Noun inflection is performed on the basis of change in Multiplicity or Case (Vibhakti) in English
grammar. The inflection of a word can be determined from the word endings. Noun Inflection in
Multiplicity, there are different rules to get plural form of noun .The rules changes according to
end alphabets of noun.
Examples.
Box=boxes
Cat=cats
There are some exceptions also for plural forms
Ex. Woman=women
Noun Inflection in case that is possessive case. In possessive case “‘s” is attached to noun, there
are different rules to attach “‘s” depending on multiplicity of noun.
8. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
46
Ex. Boys’ school, Men’s club, Boy’s pen
17. Verb Inflection
In Marathi, however, the word according to which the verb gets conjugated is either the doer of
the action (karta) or the receiver of the action (karma). In English the main verb in the sentence is
conjugated according to the person, tense, number of the subject of the sentence Further, the
attributes of the word that determine the conjugation are also different in English.
English verbs consider person, number, tense [9].
For example:
मी हिके ट खेळतो ->I play cricket. (First person)
तो हिके ट खेळतो->He plays cricket. (Third person)
According to suffix attached with Marathi verb and verb followed by आहे/होते/असेल, there is
concept of auxiliary verb in English language. According to tense we have to change to be form
and verb form.
For Example
मी खोका उघडला होता -> I had opened box.
तीने दरवाजा उघडला असेल ->She will have opened door.
4. HANDLING CHALLENGES
Rule based machine translation relies on countless built in linguistic rules and millions of
bilingual dictionaries for Marathi –English language pair. Replicative words, Expressive words
some of these words can be handled in the lexicon but as in many cases the mapping also
involves structural changes; the issue involves a wider scope of interpretation [1].
The divergences of the types under Constituent-order Divergence, Adjunction Divergence,
Pleonastic Divergence, Indirect Speech, Adposition will be handle through rules [15].
For example, noun inflection on basis of change of multiplicity design rules depending on end
words of English word as shown in table 1.
Table 1.Noun multiplicity inflection handling rules
Original word type Inflection Rule Examples
Noun ending in
-s,-sh,-ch,-x,-o
Adding -es Class-classes
Match-Matches
Box-Boxes
Nouns ending in –y &
preceded by a consonant
-y replace by –i and add -es City-Cities
Story-Stories
Nouns ending in –f or -
fe
-f, -fe replace by v and add
–es
Wife-Wives
Leaf-leaves
Nouns with inside vowel (Build database) Man-men
Woman-women
General noun Adding -s Book-Books
Desk-Desks
9. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
47
For example, noun inflection on basis of case rules depending on end words of English word as
shown in table 2.
Table 2.Noun case inflection handling rules
Original word type Inflection Rule Examples
Noun-Singular Add “ ‘s “ The boy’s school
Noun-Plural &
Ends with ‘s’
ADD “ ‘ ” Boys’ school
Horses’ tails.
Noun-Plural & but does
not
Ends with ‘s’
Add “ ‘s “ Men’s club
Children’s books
Two nouns are closely
connected
Add “ ‘s “ to second noun Karim and Salim’s
Bakery
General noun Adding -s Book-Books
Desk-Desks
With the help of NER tag we can handle capitalization. Capitalize its first letter of word with
NER tag or proper noun.
Case suffix handling, noun inflection, verb inflection, inanimate objects, mapping of time, lexical
gap these challenges will be handled by designing tables in database. For example, noun
inflection on basis of case will be handling through following table 3. Depending upon pratay
attached with Marathi noun suffix/preposition is attached to respective English noun with the help
of table.
Likewise we can define tables for handling challenges. Determiner (a/an/the) concept is present in
English language which is not for Marathi language .While handling this challenge design tables
in database for countable and uncountable nouns.
Table 3.Noun case inflection
Marathi suffix English
SuffixCase (Vibhakti) Singular Plural
Nominative _ _ _
Accusative ला,स ना,स To
Instrumental
/Agent ने नी By /with
Dative ला ना for/to
Ablative उन /हून उन/हून from
Genitive र्चा /र्ची /र्चे र्चा /र्ची /र्चे Of , 's
Locative त त In
Vocative _ नो Oh
10. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
48
5. CONCLUSIONS
The translation divergence is a challenging problem in the area of machine translation. A detailed
study of divergence issues in machine translation is required for their proper interpretation and
detection. Rule Based Machine Translation is a tedious approach. It needs lots of human work to
design number of rules. As number of rules increases the quality of translation improves.
However, it usually generates nearly accurate translations. As seen in this paper a number of rules
and dictionary need to be formed to achieve these translations. Moreover, resolution of these
problems is a pre-requisite for designing a robust machine translation system between the
languages considered for the present study. In this paper we have explained the various types of
divergence Patterns with respect to Marathi and English language pair. Also the identification and
classification of these divergence patterns along with. In spite of all these efforts there exist many
exceptions in the language which do not conform to these rules. The accuracy of this approach is
dependent on the size of the grammatical knowledge base. Handling of exceptional cases leads to
an increase in the size and the complexity of the knowledge base. As the size of the knowledge
base increases the accuracy increases up to a certain threshold. If the size increases above certain
threshold then the accuracy may reduce due to conflicting rules. Thus a system based on this
approach has to balance accuracy and the number of exceptions it can handle.
6. REFERENCES
[1] Sinha, R. M. K., & Thakur, A., 2005c, Divergence patterns in machine translation between Hindi
and English, Proceeding of MT Summit X. Phuket, Thailand, pp. 346-353
[2] S. B. Kulkarni, P. D. Deshmukh, M. M. Kazi, K. V. Kale, “Linguistic to Socio-And-Psyco
Linguistic Aspects in English-To-Marathi Language Translation”, International Journal of Research
in Computer Applications And Robotics, 2013; 1(9), pp.197-205
[3] S. B. Kulkarni, P. D. Deshmukh and K. V. Kale, “Syntactic and Structural Divergence in English-to-
Marathi Machine Translation”, IEEE 2013 International Symposium on Computational and Business
Intelligence, August 24-26, 2013, New Delhi, pp. 191-194,doi: 10.1109/ISCBI.2013.46
[4] G.V. Garje, G.K. Kharate,”Challenges in Rule Based Machine Translation from English to Marathi”,
3rd International Conference on Recent Trends in Engineering &Technology (ICRTET’2014),pp.
243-248.
[5] Namrata G Kharate, Dr.Varsha H. Patil “Survey of Machine Translation for Indian Languages to
English and Its Approaches” International Journal of Scientific Research in Computer Science,
Engineering and Information Technology ,Volume 3,Issue 1,ISSN : 2456-3307,pp. 613-622.
[6] Joshi A., Sasikumar N. Constructive approach to teach inflections in Marathi language, Proceedings
of National Conference on Advances in Technology andRecent Developments, Mumbai, India,
2008, pp.10-16
[7] Khan Md., Anwarus S., Amada S., Nishino T. Sublexical Translations for low-resource language,
Proceedings of Workshop on Machine Translation andParsing in Indian Languages (MTPIL-2012),
24th International Conference on Computer Linguistics (Coling12)
[8] M. R. Walimbe. Sugam Marathi VyakranLekhan, G.Y. Rane Publication
[9] Wren P., Martin H. High School English Grammar and Composition, S Chand Publication
[10] CharugatraTidke, Shital B, Shivani P (2013) “Inflection Rules for English to Marathi Machine
Translation”IJCSMC, Vol. 2, Issue. 4, April 2013, pg.7 – 18
[11] EshaPalta IITB. Word Sense Disambiguation, 2006-07, Master of Technology First Stage Report.
[12] Walker D. and Amsler R. 1986. The Use of Machine Readable Dictionaries in Sublanguage
Analysis. In Analyzing Language in Restricted Domains, Grishmanand Kittredge (eds), LEA Press,
pp. 69-83
11. International Journal on Natural Language Computing (IJNLC) Vol.8, No.4, August 2019
49
[13] Namrata G Kharate,Dr.Varsha H. Patil ” Challenges in Rule Based Machine Translation from
Marathi to English ” 5th International Conference on Advances in Computer Science and
Information Technology (ACSTY-2019), August 17-18, 2019.pp 45-54
Authors
Ms.Namrata Kharate is Pursuing Ph.D. in computer department with Specialization in
natural language processing from Savitribai Phule, Pune University. She received her
BE from Savitribai Phule Pune University & M.E from Savitribai Phule Pune
University. Working as an Assistant Professor in vishwakarma institute of information
and technology
Dr.Varsha H. Patil is Professor, University of Pune, and Head, Computer Science
Department, at Matoshri College of Engineering & Research Centre, Nashik. She is
also serving as the Vice Principal of the same institute.She has received her Ph.D in
Computer Engineering from BVCOE, Bharati Vidyapeeth Pune .She has received her
M.E in Computer Engineering from COEP, Pune .She has more than 20 years of
teaching experience and several research papers, published in national and international
journals of repute, to her credit. She is a member of the Board of Studies of Computer
Engineering at the University of Pune.