This document discusses reduplication in Hindi and proposes developing a tool to identify and translate reduplicated words from Hindi to Punjabi. It begins by defining reduplication and describing different types, including morphological, lexical, echo formation, compounds, word and discontinuous reduplication. It then states the problem of automatically identifying echo words in Hindi conversations to translate to Punjabi. Next, it discusses using rule-based and statistical approaches and choosing rule-based. Finally, it outlines the work to be done, including studying reduplication, past research, translating echo words, and rule-based approaches.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
ISOLATING WORD LEVEL RULES IN TAMIL LANGUAGE FOR EFFICIENT DEVELOPMENT OF LAN...ijnlc
With the advent of social media, the amount of text available for processing across different natural
languages has become enormous. In the past few decades, there has been tremendous increase in the
number of language processing applications. The tools for natural language computing of various
languages are very different because each language has its own set of grammatical rules. This paper
focuses on identifying the basic inflectional principles of Tamil language at word level. Three levels of
word inflection concepts are considered – Patterns, Rules and Exceptions. How grammatical principles for
word inflections in Tamil can be grouped in these three levels and applied for obtaining different word
forms is the focus of this paper. These can be made use of in a wide variety of natural language
applications like morphological analysis, morphological generation, word level translation, spelling and
grammar check, information extraction etc. The tools using these rules will account for faster operation
and better implementation of Tamil grammatical rules referred from [த ொல்த ொப்பியம் |
tholgaappiyam] and [ நன்னூல் | nannool] in NLP applications.
An implementation of apertium based assamese morphological analyzerijnlc
Morphological Analysis is an important branch of linguistics for any Natural Language Processing Technology. Morphology studies the word structure and formation of word of a language. In current scenario of NLP research, morphological analysis techniques have become more popular day by day. For processing any language, morphology of the word should be first analyzed. Assamese language contains very complex morphological structure. In our work we have used Apertium based Finite-State-Transducers for developing morphological analyzer for Assamese Language with some limited domain and we get 72.7% accuracy
Kannada Phonemes to Speech Dictionary: Statistical ApproachIJERA Editor
The input or output of a natural Language processing system can be either written text or speech. To process written text we need to analyze: lexical, syntactic, semantic knowledge about the language, discourse information, real world knowledge to process spoken language, we need to analyze everything required to process written text, along with the challenges of speech recognition and speech synthesis. This paper describes how articulatory phonetics of Kannada is used to generate the phoneme to speech dictionary for Kannada; the statistical computational approach is used to map the elements which are taken from input query or documents. The articulatory phonetics is the place of articulation of a consonant. It is the point of contact where an obstruction occurs in the vocal tract between an articulatory gesture, an active articulator, typically some part of the tongue, and a passive location, typically some part of the roof of the mouth. Along with the manner of articulation and the phonation, this gives the consonant its distinctive sound. The results are presented for the same.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
ISOLATING WORD LEVEL RULES IN TAMIL LANGUAGE FOR EFFICIENT DEVELOPMENT OF LAN...ijnlc
With the advent of social media, the amount of text available for processing across different natural
languages has become enormous. In the past few decades, there has been tremendous increase in the
number of language processing applications. The tools for natural language computing of various
languages are very different because each language has its own set of grammatical rules. This paper
focuses on identifying the basic inflectional principles of Tamil language at word level. Three levels of
word inflection concepts are considered – Patterns, Rules and Exceptions. How grammatical principles for
word inflections in Tamil can be grouped in these three levels and applied for obtaining different word
forms is the focus of this paper. These can be made use of in a wide variety of natural language
applications like morphological analysis, morphological generation, word level translation, spelling and
grammar check, information extraction etc. The tools using these rules will account for faster operation
and better implementation of Tamil grammatical rules referred from [த ொல்த ொப்பியம் |
tholgaappiyam] and [ நன்னூல் | nannool] in NLP applications.
An implementation of apertium based assamese morphological analyzerijnlc
Morphological Analysis is an important branch of linguistics for any Natural Language Processing Technology. Morphology studies the word structure and formation of word of a language. In current scenario of NLP research, morphological analysis techniques have become more popular day by day. For processing any language, morphology of the word should be first analyzed. Assamese language contains very complex morphological structure. In our work we have used Apertium based Finite-State-Transducers for developing morphological analyzer for Assamese Language with some limited domain and we get 72.7% accuracy
Kannada Phonemes to Speech Dictionary: Statistical ApproachIJERA Editor
The input or output of a natural Language processing system can be either written text or speech. To process written text we need to analyze: lexical, syntactic, semantic knowledge about the language, discourse information, real world knowledge to process spoken language, we need to analyze everything required to process written text, along with the challenges of speech recognition and speech synthesis. This paper describes how articulatory phonetics of Kannada is used to generate the phoneme to speech dictionary for Kannada; the statistical computational approach is used to map the elements which are taken from input query or documents. The articulatory phonetics is the place of articulation of a consonant. It is the point of contact where an obstruction occurs in the vocal tract between an articulatory gesture, an active articulator, typically some part of the tongue, and a passive location, typically some part of the roof of the mouth. Along with the manner of articulation and the phonation, this gives the consonant its distinctive sound. The results are presented for the same.
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABIijnlc
Machine Transliteration has come out to be an emerging and a very important research area in the field of
machine translation. Transliteration basically aims to preserve the phonological structure of words. Proper
transliteration of name entities plays a very significant role in improving the quality of machine translation.
In this paper we are doing machine transliteration for English-Punjabi language pair using rule based
approach. We have constructed some rules for syllabification. Syllabification is the process to extract or
separate the syllable from the words. In this we are calculating the probabilities for name entities (Proper
names and location). For those words which do not come under the category of name entities, separate
probabilities are being calculated by using relative frequency through a statistical machine translation
toolkit known as MOSES. Using these probabilities we are transliterating our input text from English to
Punjabi.
Anaphora Resolution is a process of finding referents in discourse. In computational linguistic, Anaphora
resolution is complex and challenging task. This paper focuses on pronominal anaphora resolution. It is a
subpart of anaphora resolution where pronouns are referred to noun referents. Including anaphora
resolution into many applications like automatic summarization, opinion mining, machine translation,
question answering systems etc. increase their accuracy by 10%. Related work in this field has been done
in many languages. This paper focuses on resolving anaphora for Punjabi language. A model is proposed
for resolving anaphora and an experiment is conducted to measure the accuracy of the system. The model
uses two factors: Recency and Animistic knowledge. Recency factor works on the concept of Lappin Leass
approach and for introducing animistic knowledge gazetteer method is used. The experiment is conducted
on a Punjabi story containing more than 1000 words and result is drawn with the future directions.
IMPROVEMENT OF CRF BASED MANIPURI POS TAGGER BY USING REDUPLICATED MWE (RMWE)cscpconf
This paper gives a detail overview about the modified features selection in CRF (Conditional Random Field) based Manipuri POS (Part of Speech) tagging. Selection of features is so
important in CRF that the better are the features then the better are the outputs. This work is an
attempt or an experiment to make the previous work more efficient. Multiple new features are
tried to run the CRF and again tried with the Reduplicated Multiword Expression (RMWE) as
another feature. The CRF run with RMWE because Manipuri is rich of RMWE and identification
of RMWE becomes one of the necessities to bring up the result of POS tagging. The new CRF
system shows a Recall of 78.22%, Precision of 73.15% and F-measure of 75.60%. With the
identification of RMWE and considering it as a feature makes an improvement to a Recall of 80.20%, Precision of 74.31% and F-measure of 77.14%.
Anaphora resolution in hindi language using gazetteer methodijcsa
Anaphora resolution is one of the active research areas within the realm of natural language processing.
Resolution of anaphoric reference is one of the most challenging and complex task to be handled. This
paper completely emphasis on pronominal anaphora resolution for Hindi Language. There are various
methodologies for resolving anaphora. This paper presents a computational model for anaphora resolution
in Hindi that is based on Gazetteer method. Gazetteer method is a creation of lists and then applies
operations to classify elements present in the list. There are many salient factors for resolving anaphora.
The proposed model resolves anaphora by using two factors that is Animistic and Recency. Animistic factor
always represent living things and non living things whereas Recency describes that the referents
mentioned in current sentence tends to have higher weights than those in previous sentence. This paper
demonstrate the experiments conducted on short Hindi stories ,news articles and biography content from
Wikipedia, its result & future directions to improve accuracy.
Stemming is the process of clipping off the affixes from the input word to obtain the respective
root word, but it is not necessary that stemming provide us the genuine and meaningful root
word. To overcome this problem we come up with a solution- Lemmatizer. It is the process by
which we crave out the lemma from the given word and can also add additional rules to make
the clipped word a proper stem. In this paper we have created an inflectional lemmatizer which
generates the rules for extracting the suffixes and also added rules for generating a proper
meaningful root word.
Stemming is the process of clipping off the affixes from the input word to obtain the respective root word, but it is not necessary that stemming provide us the genuine and meaningful root word. To overcome this problem we come up with a solution- Lemmatizer. It is the process by
which we crave out the lemma from the given word and can also add additional rules to make the clipped word a proper stem. In this paper we have created an inflectional lemmatizer which generates the rules for extracting the suffixes and also added rules for generating a proper
meaningful root word.
OPTIMIZE THE LEARNING RATE OF NEURAL ARCHITECTURE IN MYANMAR STEMMERijnlc
Morphological stemming becomes a critical step toward natural language processing. The process of stemming is to reduce alternative forms to a common morphological root. Word segmentation for Myanmar Language, like for most Asian Languages, is an important task and extensively-studied sequence labelling problem. Named entity detection is one of the issues in Asian Language that has traditionally required a large amount of feature engineering to achieve high performance. The new approach is integrating them that would benefit in all these processes. In recent years, end-to-end sequence labelling models with deep learning are widely used. This paper introduces a deep BiGRUCNN-CRF network that jointly learns word segmentation, stemming and named entity recognition tasks. We trained the model using manually annotated corpora. State-of-the-art named entity recognition systems rely heavily on handcrafted feature built in our new approach, we introduce the joint model that relies on two sources of information: character level representation and syllable level representation.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...ijnlc
Machine Translation for Indian languages is an emerging research area. Transliteration is one such module that we design while designing a translation system. Transliteration means mapping of source language text into the target language. Simple mapping decreases the efficiency of overall translation system. We propose the use of stemming and part-of-speech tagging for transliteration. The effectiveness of translation can be improved if we use part-of-speech tagging and stemming assisted transliteration. We have shown that much of the content in Gujarati gets transliterated while being processed for translation to Hindi language.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Myanmar word sorting is very important in indexing of search engine to optimize in the searching process of keywords. This paper proposed an efficient sorting algorithm for Myanmar words based on the weights of consonants, vowels, devowelizers, and consonant combination of each syllable of the words since Myanmar words are composed of one syllable or more than one syllable and finally the words are sorted based on Quick sort. The proposed algorithm is intended to design for Zawgyi_One font, which is mainly dominant in Myanmar Web pages.
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEijnlc
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and is monosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language
Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech
(POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
Implementation of Enhanced Parts-of-Speech Based Rules for English to Telugu ...Waqas Tariq
Words of a sentence will not follow same ordering in different languages. This paper proposes certain Parts-of-Speech (POS) based rules for reordering the given English sentence to get translation in Telugu. The added rules for adverbs, exceptional conjunctions in addition to improved handling of inflections enable the system to achieve more accurate translation. The proposed rules along with existing system gave a score of 0.6190 with BLEU evaluation metric while translating sentences from English to Telugu. This paper deals with simple form of sentences in a better way.
Green Roofs and Green Building Rating SystemsIJERA Editor
The environmental benefits for green building from the Leadership in Energy and Environment Design (LEED)
and Ecology, Energy, Waste, and Health (EEWH) rating systems have been extensively investigated; however,
the effect of green roofs on the credit-earning mechanisms is relatively unexplored. This study is concerned with
the environmental benefits of green roofs with respect to sustainability, stormwater control, energy savings, and
water resources. We focused on the relationship between green coverage and the credits of the rating systems,
evaluated the credits efficiency, and performed cost analysis. As an example, we used a university building in
Keelung, Northern Taiwan. The findings suggest that with EEWH, the proposed green coverage is 50–75%,
whereas with LEED, the proposed green coverage is 100%. These findings have implications for the application
of green roofs in green building.
Use of Hidden Markov Mobility Model for Location Based ServicesIJERA Editor
These days people prefer to use portable and wireless devices such as laptops, mobile phones, They are connected through satellites. As user moves from one point to other, task of updating stored information becomes difficult. Provision of Location based services to users, faces some challenges like limited bandwidth and limited client power. To optimize data accessibility and to minimize access cost, we can store frequently accessed data item in cache of client. So small size of cache is introduced in mobile devices. Data fetched from server is stored on cache. So requested data from user is provided from cache and not from remote server. Question arises that which data should be kept in the cache? Performance of cache majorly depends on the cache replacement policies which select data suitable for eviction from cache. This paper presents use of Hidden Markov Models(HMMs) for prediction of user‟s future location. Then data item irrelevant to this predicted location is fetched out from the cache. The proposed approach clusters location histories according to their location characteristics and also it considers each user‟s previous actions. This results in producing high packet delivery ratio and minimum delay.
Photovoltaic Modules Performance Loss Evaluation for Nsukka, South East Niger...IJERA Editor
The Photovoltaic (PV) systems and technology offer excellent reliability when designed with the right implementation tools and based on good technical judgements of components that make up each of the critical sections of solar power system. The PV array is an essential section of a solar power system and it is expected to function to deliver pre – estimated power based on design estimations. There are factors that derail the performance of PV modules; the contributions of these factors are peculiar to specific sites of installation, hence the need to empirically evaluate and characterize installation sites before deployment of PV systems. This paper presents the characterization of Nsukka (South East, Nigeria) environment using decent instrumentation; and consequently highlights the power loss indicators for PV modules in the target site while presenting equally mitigable design.
Privacy Preservation and Restoration of Data Using Unrealized Data SetsIJERA Editor
In today’s world, there is an improved advance in hardware technology which increases the capability to store and record personal data about consumers and individuals. Data mining extracts knowledge to support a variety of areas as marketing, medical diagnosis, weather forecasting, national security etc successfully. Still there is a challenge to extract certain kinds of data without violating the data owners’ privacy. As data mining becomes more enveloping, such privacy concerns are increasing. This gives birth to a new category of data mining method called privacy preserving data mining algorithm (PPDM). The aim of this algorithm is to protect the easily affected information in data from the large amount of data set. The privacy preservation of data set can be expressed in the form of decision tree. This paper proposes a privacy preservation based on data set complement algorithms which store the information of the real dataset. So that the private data can be safe from the unauthorized party, if some portion of the data can be lost, then we can recreate the original data set from the unrealized dataset and the perturbed data set.
RULE BASED TRANSLITERATION SCHEME FOR ENGLISH TO PUNJABIijnlc
Machine Transliteration has come out to be an emerging and a very important research area in the field of
machine translation. Transliteration basically aims to preserve the phonological structure of words. Proper
transliteration of name entities plays a very significant role in improving the quality of machine translation.
In this paper we are doing machine transliteration for English-Punjabi language pair using rule based
approach. We have constructed some rules for syllabification. Syllabification is the process to extract or
separate the syllable from the words. In this we are calculating the probabilities for name entities (Proper
names and location). For those words which do not come under the category of name entities, separate
probabilities are being calculated by using relative frequency through a statistical machine translation
toolkit known as MOSES. Using these probabilities we are transliterating our input text from English to
Punjabi.
Anaphora Resolution is a process of finding referents in discourse. In computational linguistic, Anaphora
resolution is complex and challenging task. This paper focuses on pronominal anaphora resolution. It is a
subpart of anaphora resolution where pronouns are referred to noun referents. Including anaphora
resolution into many applications like automatic summarization, opinion mining, machine translation,
question answering systems etc. increase their accuracy by 10%. Related work in this field has been done
in many languages. This paper focuses on resolving anaphora for Punjabi language. A model is proposed
for resolving anaphora and an experiment is conducted to measure the accuracy of the system. The model
uses two factors: Recency and Animistic knowledge. Recency factor works on the concept of Lappin Leass
approach and for introducing animistic knowledge gazetteer method is used. The experiment is conducted
on a Punjabi story containing more than 1000 words and result is drawn with the future directions.
IMPROVEMENT OF CRF BASED MANIPURI POS TAGGER BY USING REDUPLICATED MWE (RMWE)cscpconf
This paper gives a detail overview about the modified features selection in CRF (Conditional Random Field) based Manipuri POS (Part of Speech) tagging. Selection of features is so
important in CRF that the better are the features then the better are the outputs. This work is an
attempt or an experiment to make the previous work more efficient. Multiple new features are
tried to run the CRF and again tried with the Reduplicated Multiword Expression (RMWE) as
another feature. The CRF run with RMWE because Manipuri is rich of RMWE and identification
of RMWE becomes one of the necessities to bring up the result of POS tagging. The new CRF
system shows a Recall of 78.22%, Precision of 73.15% and F-measure of 75.60%. With the
identification of RMWE and considering it as a feature makes an improvement to a Recall of 80.20%, Precision of 74.31% and F-measure of 77.14%.
Anaphora resolution in hindi language using gazetteer methodijcsa
Anaphora resolution is one of the active research areas within the realm of natural language processing.
Resolution of anaphoric reference is one of the most challenging and complex task to be handled. This
paper completely emphasis on pronominal anaphora resolution for Hindi Language. There are various
methodologies for resolving anaphora. This paper presents a computational model for anaphora resolution
in Hindi that is based on Gazetteer method. Gazetteer method is a creation of lists and then applies
operations to classify elements present in the list. There are many salient factors for resolving anaphora.
The proposed model resolves anaphora by using two factors that is Animistic and Recency. Animistic factor
always represent living things and non living things whereas Recency describes that the referents
mentioned in current sentence tends to have higher weights than those in previous sentence. This paper
demonstrate the experiments conducted on short Hindi stories ,news articles and biography content from
Wikipedia, its result & future directions to improve accuracy.
Stemming is the process of clipping off the affixes from the input word to obtain the respective
root word, but it is not necessary that stemming provide us the genuine and meaningful root
word. To overcome this problem we come up with a solution- Lemmatizer. It is the process by
which we crave out the lemma from the given word and can also add additional rules to make
the clipped word a proper stem. In this paper we have created an inflectional lemmatizer which
generates the rules for extracting the suffixes and also added rules for generating a proper
meaningful root word.
Stemming is the process of clipping off the affixes from the input word to obtain the respective root word, but it is not necessary that stemming provide us the genuine and meaningful root word. To overcome this problem we come up with a solution- Lemmatizer. It is the process by
which we crave out the lemma from the given word and can also add additional rules to make the clipped word a proper stem. In this paper we have created an inflectional lemmatizer which generates the rules for extracting the suffixes and also added rules for generating a proper
meaningful root word.
OPTIMIZE THE LEARNING RATE OF NEURAL ARCHITECTURE IN MYANMAR STEMMERijnlc
Morphological stemming becomes a critical step toward natural language processing. The process of stemming is to reduce alternative forms to a common morphological root. Word segmentation for Myanmar Language, like for most Asian Languages, is an important task and extensively-studied sequence labelling problem. Named entity detection is one of the issues in Asian Language that has traditionally required a large amount of feature engineering to achieve high performance. The new approach is integrating them that would benefit in all these processes. In recent years, end-to-end sequence labelling models with deep learning are widely used. This paper introduces a deep BiGRUCNN-CRF network that jointly learns word segmentation, stemming and named entity recognition tasks. We trained the model using manually annotated corpora. State-of-the-art named entity recognition systems rely heavily on handcrafted feature built in our new approach, we introduce the joint model that relies on two sources of information: character level representation and syllable level representation.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...ijnlc
Machine Translation for Indian languages is an emerging research area. Transliteration is one such module that we design while designing a translation system. Transliteration means mapping of source language text into the target language. Simple mapping decreases the efficiency of overall translation system. We propose the use of stemming and part-of-speech tagging for transliteration. The effectiveness of translation can be improved if we use part-of-speech tagging and stemming assisted transliteration. We have shown that much of the content in Gujarati gets transliterated while being processed for translation to Hindi language.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Myanmar word sorting is very important in indexing of search engine to optimize in the searching process of keywords. This paper proposed an efficient sorting algorithm for Myanmar words based on the weights of consonants, vowels, devowelizers, and consonant combination of each syllable of the words since Myanmar words are composed of one syllable or more than one syllable and finally the words are sorted based on Quick sort. The proposed algorithm is intended to design for Zawgyi_One font, which is mainly dominant in Myanmar Web pages.
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEijnlc
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and is monosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language
Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech
(POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
Implementation of Enhanced Parts-of-Speech Based Rules for English to Telugu ...Waqas Tariq
Words of a sentence will not follow same ordering in different languages. This paper proposes certain Parts-of-Speech (POS) based rules for reordering the given English sentence to get translation in Telugu. The added rules for adverbs, exceptional conjunctions in addition to improved handling of inflections enable the system to achieve more accurate translation. The proposed rules along with existing system gave a score of 0.6190 with BLEU evaluation metric while translating sentences from English to Telugu. This paper deals with simple form of sentences in a better way.
Green Roofs and Green Building Rating SystemsIJERA Editor
The environmental benefits for green building from the Leadership in Energy and Environment Design (LEED)
and Ecology, Energy, Waste, and Health (EEWH) rating systems have been extensively investigated; however,
the effect of green roofs on the credit-earning mechanisms is relatively unexplored. This study is concerned with
the environmental benefits of green roofs with respect to sustainability, stormwater control, energy savings, and
water resources. We focused on the relationship between green coverage and the credits of the rating systems,
evaluated the credits efficiency, and performed cost analysis. As an example, we used a university building in
Keelung, Northern Taiwan. The findings suggest that with EEWH, the proposed green coverage is 50–75%,
whereas with LEED, the proposed green coverage is 100%. These findings have implications for the application
of green roofs in green building.
Use of Hidden Markov Mobility Model for Location Based ServicesIJERA Editor
These days people prefer to use portable and wireless devices such as laptops, mobile phones, They are connected through satellites. As user moves from one point to other, task of updating stored information becomes difficult. Provision of Location based services to users, faces some challenges like limited bandwidth and limited client power. To optimize data accessibility and to minimize access cost, we can store frequently accessed data item in cache of client. So small size of cache is introduced in mobile devices. Data fetched from server is stored on cache. So requested data from user is provided from cache and not from remote server. Question arises that which data should be kept in the cache? Performance of cache majorly depends on the cache replacement policies which select data suitable for eviction from cache. This paper presents use of Hidden Markov Models(HMMs) for prediction of user‟s future location. Then data item irrelevant to this predicted location is fetched out from the cache. The proposed approach clusters location histories according to their location characteristics and also it considers each user‟s previous actions. This results in producing high packet delivery ratio and minimum delay.
Photovoltaic Modules Performance Loss Evaluation for Nsukka, South East Niger...IJERA Editor
The Photovoltaic (PV) systems and technology offer excellent reliability when designed with the right implementation tools and based on good technical judgements of components that make up each of the critical sections of solar power system. The PV array is an essential section of a solar power system and it is expected to function to deliver pre – estimated power based on design estimations. There are factors that derail the performance of PV modules; the contributions of these factors are peculiar to specific sites of installation, hence the need to empirically evaluate and characterize installation sites before deployment of PV systems. This paper presents the characterization of Nsukka (South East, Nigeria) environment using decent instrumentation; and consequently highlights the power loss indicators for PV modules in the target site while presenting equally mitigable design.
Privacy Preservation and Restoration of Data Using Unrealized Data SetsIJERA Editor
In today’s world, there is an improved advance in hardware technology which increases the capability to store and record personal data about consumers and individuals. Data mining extracts knowledge to support a variety of areas as marketing, medical diagnosis, weather forecasting, national security etc successfully. Still there is a challenge to extract certain kinds of data without violating the data owners’ privacy. As data mining becomes more enveloping, such privacy concerns are increasing. This gives birth to a new category of data mining method called privacy preserving data mining algorithm (PPDM). The aim of this algorithm is to protect the easily affected information in data from the large amount of data set. The privacy preservation of data set can be expressed in the form of decision tree. This paper proposes a privacy preservation based on data set complement algorithms which store the information of the real dataset. So that the private data can be safe from the unauthorized party, if some portion of the data can be lost, then we can recreate the original data set from the unrealized dataset and the perturbed data set.
Geomatics Based Landslide Vulnerability Zonation Mapping - Parts Of Nilgiri D...IJERA Editor
Landslide includes a wide range of ground movements, such as rock falls, deep failure of slope, and shallow debris flows. Although gravity acting on an over steepened slope is the primary reason for a landslide. The Nilgiri Hills (Mountains) of Tamil Nadu, India are prone to landslides, which often result in considerable damage to private property, public infrastructure, and loss of life. The mapping of LVZ includes, the preparation of various thematic layers from different data sources, such as Survey of India topographic sheets, Satellite data, Geological Survey of India maps etc. These landslides are typically the result of the structural failure of thick laterite soils that have been saturated by heavy rains during the monsoon season. . GIS have proved to be useful tools for analyzing and managing landslide related data. GIS has been widely used in quantitative estimation landslide susceptibility. The methodology adopted for the identification of landslide vulnerable zones, and suggestion of remedial measures based on the vulnerability of landslides on different terrain parameters per unit area. Through this study, it is evinced again that the geomatics technology is a proven tool for landslide studies in order to properly understand, identify and suggest remedial measures.
By combining the wireless network and E-commerce, suppliers can provide a more convenient and quicker
service on a human scale for their customers. The main advantages of such services are their high availability,
independence of physical location and time. Mobile commerce raises a number of security and privacy
challenges. However, security has always been the key issue for the development of mobile E-commerce, which
is more vulnerable than the traditional E-commerce mode. . In order to solve the problem of security gap in the
transmission of mobile E-commerce information, a framework based on J2ME/MIDP is proposed, which
combines double layer encryption schemes, stego-image and secure XML messages which transferred between
the mobile terminal and the server. Our method provide strong secure and invisible communication with high
security and high operating efficiency that compatible with many types of mobile terminal.
Design of Memory Cell for Low Power ApplicationsIJERA Editor
Aggressive CMOS scaling results in lower threshold voltage and thin oxide thickness for transistors manufactured in nano regime. As a result, reducing the sub-threshold and tunneling gate leakage currents has become crucial in the design of ICs. This paper presents a new method to reduce the total leakage power dissipation of static random access memories (SRAMs) while maintaining their performance.
As a new material, the researches on polyvinyl alcohol fiber cement stabilized macadam are also less. This
article studies the effect of different volume fraction of fiber length on the compressive strength and splitting
tensile strength of cement stabilized macadam.To promote polyvinyl alcohol fiber cement stabilized macadam’s
application,it has theoretical significance and engineering application value.
Design and Implementation of Multibit Flip flops by Using Single Phase Adiaba...IJERA Editor
As technology advances, a system on chip (SOC) design can contain more and more components that lead to a
high power density. Now-a-days in IC's, the power consumed by clocking gradually takes a dominant part. .
Power, Area, Performance has become a main issue in VLSI design. In previous designs they have performed a
co-ordinate transformation to identify the flip-flops that can be merged and their legal regions. A combinational
table is formed by that flip-flops. Finally they used hierarchical way to merge the flip-flops. Due to this design
the power consumption can be reduced by replacing some flip-flops by fewer multi-bit flip-flops. We have
proposed an single phase adiabatic clock in for further reduction of the power. In our design also first we will
identify the mergeable flip-flops, build a combinational table from mergeable flip-flops and all the flip-flops are
divided into sub regions to check whether they are working properly or not and finally these flip-flops are
combined together, at this point we are giving an single phase adiabatic clock as input to those flip-flops for the
reduction of power consumption and timing.
Research on License Plate Recognition and Extraction from complicated ImagesIJERA Editor
Vehicle number plate recognition has attracted many researchers for intelligent transportation systems such as the payment of parking fee, controlling the traffic volume, traffic data collection, etc. We are presenting an enhanced license plate extraction methodology which includes edge statistics and the morphology. The proposed methodology includes vertical edge extraction, background curve and noise removing, edge statistical analysis and morphology-based license plate extraction.
Preparation and Characterization of Lithium Ion Conducting Solid Polymer Elec...IJERA Editor
Solid Polymer electrolyte films have been prepared from Starch-Poly vinyl alcohol (PVA) blend a well acknowledged biodegradable material. Solution cast technique was employed for the preparation of solid polymer electrolyte films added with Lithium Bromide (LiBr) salt. X-ray diffraction (XRD) studies of the prepared films portrayed the evolution of an amorphous structure with increasing content of salt which is an important factor that leads to the augmentation of conductivity. Electrochemical impedance spectroscopic analysis revealed noticeable ionic conductivity ~ 5x 10-3 S/cm for 20 wt% of salt at ambient conditions. Ionic conductivity showed an increasing trend with salt content at ambient conditions. Transference number measurements confirmed the ionic nature of the prepared solid polymer electrolyte films. Dielectric studies revealed a sharp increase in the number of charge carriers which contributed to enhancement in conductivity. Low values of activation energy extracted from temperature dependent conductivity measurements could be favorable for device applications. For the composition with highest conductivity a temperature independent relaxation mechanism was confirmed by electric modulus scaling.
Using Tunisian Phosphate Rock and Her Converted Hydroxyapatite for Lead Remov...IJERA Editor
Natural and synthesis apatites represent a cost effective soil amendment, which can be used for in situ reduction of lead bioavailability and mobility. In our previous work, we selected Tunisian Phosphate Rock (TPR) and Hydroxyapatite (CaHAp) as promising minerals for the removal of lead from aqueous solutions. X-ray powder diffraction patterns (DRX), Infra Red (IR), Thermogravimetric analysis (TGA) and Scanning Electron Microscopy (SEM) were used to characterize TPR and CaHAp. CaHAp was prepared from TPR and employed for the removal of Pb2+ ions at different concentrations from aqueous solution to determine the adsorption properties of CaHAp and compare them with those of a TPR. The kinetic data obtained indicated that the adsorption performances of the adsorbents depended both on their specific surface area and crystallinity. Complexation of lead ion on the adsorbent surface favoured the dissolution of hydroxyapatites characterized by a Ca/Pb molar ratio of 1.69. The maximum adsorption capacity of CaHAp for Pb2+ ions at 25 °C was 1.806 mmol /g relative to 1.035 mmol /g for TPR at the same temperature. The higher capacity of CaHAp was explained in terms of its porosity and crystallinity. The Pb2+ ions sorption results could be modelled by the Langmuir and Freundlich isotherms. The simulations of adsorption isotherms of Pb2+ on CaHAp allow us to conclude that there is a good correlation between the experimental data and the Langmuir model. On TPR, we show a good correlation between the experimental data and the Langmuir and Freundlich model.
On Some Double Integrals of H -Function of Two Variables and Their ApplicationsIJERA Editor
This paper deals with the evaluation of four integrals of H -function of two variables proposed by Singh and
Mandia [7] and their applications in deriving double half-range Fourier series for the H -function of two
variables. A multiple integral and a multiple half-range Fourier series of the H -function of two variables are
derived analogous to the double integral and double half-range Fourier series of the H -function of two
variables.
The behavior of hybrid reinforced concrete after heat resistanceIJERA Editor
This study is trying to provide the behavior of concrete when additional fibers are added under the effect of
evaluated temperatures. Three types of polypropylene fibers are used with different length respectively 3 mm, 6
mm and 12 mm and two types of steel fibers are used of length respectively of 3cm and 5 cm. Hybrid specimens
of concrete are prepared by using two different combinations: 0.5% steel fibers in combination with 0.2%
polypropylene fibers by the volume of concrete; and 0.25% of steel fibers in combination with 0.1%
polypropylene fibers by the volume of concrete. The specimens were subject to different temperatures. An
electric furnace was used to heat the specimens up to 200 0C, 400 0C and 600 0C. The mass loss and compressive
strength were determined by using twelve different mixtures.
Validation of the Newly Developed Fabric Feel Tester for Its Accuracy and Rep...IJERA Editor
The present paper deals with a comprehensive study of reproducibility of the newly developed instrument to
study fabric handle characteristics using extraction principle. As reported earlier that a new nozzle extraction
method for objective measurement of fabric handle characteristics has been developed. The force exerted by the
fabric being drawn out of the nozzle is known as extraction force and the force exerted by the fabric at the side
wall of the nozzle is known as radial force. A few fabric samples have been tested on this newly developed
instrument and the effect of numbers of tests has been studied. It has been observed that minimum five samples
of a fabric test in this instrument gives lower standard deviation of the test results. Also the overall deviations of
results justified the reproducibility of the instrument and hence the said instrument if validated for its testing
parameters.
Optimization of Preventive Maintenance Practice in Maritime Academy OronIJERA Editor
This research study investigates preventive maintenance management of diesel engine generators at the Nigerian Maritime Academy, Oron. An optimization methodology taking cognisance of equipment age was applied on failure data of diesel engine generators obtained from the institution’s maintenance data base to provide cost effective maintenance management / replacement programme for critical components of diesel engine generators. The analyzed results using Matlab provides a cost and reliability template which can be used to perform diesel generator maintenance management programme in the academy.
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorWaqas Tariq
In recent decades speech interactive systems have gained increasing importance. Performance of an ASR system mainly depends on the availability of large corpus of speech. The conventional method of building a large vocabulary speech recognizer for any language uses a top-down approach to speech. This approach requires large speech corpus with sentence or phoneme level transcription of the speech utterances. The transcriptions must also include different speech order so that the recognizer can build models for all the sounds present. But, for Telugu language, because of its complex nature, a very large, well annotated speech database is very difficult to build. It is very difficult, if not impossible, to cover all the words of any Indian language, where each word may have thousands and millions of word forms. A significant part of grammar that is handled by syntax in English (and other similar languages) is handled within morphology in Telugu. Phrases including several words (that is, tokens) in English would be mapped on to a single word in Telugu.Telugu language is phonetic in nature in addition to rich in morphology. That is why the speech technology developed for English cannot be applied to Telugu language. This paper highlights the work carried out in an attempt to build a voice enabled text editor with capability of automatic term suggestion. Main claim of the paper is the recognition enhancement process developed by us for suitability of highly inflecting, rich morphological languages. This method results in increased speech recognition accuracy with very much reduction in corpus size. It also adapts Telugu words to the database dynamically, resulting in growth of the corpus.
Word sense disambiguation using wsd specific wordnet of polysemy wordsijnlc
This paper presents a new model of WordNet that is used to disambiguate the correct sense of polysemy
word based on the clue words. The related words for each sense of a polysemy word as well as single sense
word are referred to as the clue words. The conventional WordNet organizes nouns, verbs, adjectives and
adverbs together into sets of synonyms called synsets each expressing a different concept. In contrast to the
structure of WordNet, we developed a new model of WordNet that organizes the different senses of
polysemy words as well as the single sense words based on the clue words. These clue words for each sense
of a polysemy word as well as for single sense word are used to disambiguate the correct meaning of the
polysemy word in the given context using knowledge based Word Sense Disambiguation (WSD) algorithms.
The clue word can be a noun, verb, adjective or adverb.
Lecture: Fluency Fitness! One larger size fits all!ETAI 2010
Elisheva Barkon: Lecture: Fluency Fitness! One larger size fits all!
Research has established fluency as a critical factor in smooth, efficient language processing. In this presentation, I will discuss approaches to language acquisition and reading that encourage recognition and use of chunks/multi word units as a way forward in the promotion of fluency.
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGEkevig
Sentiment analysis has been performed in different languages and in various domains, such as movie reviews, product reviews and tourism reviews. However, not much work has been done in the area of books considering the high availability of book reviews on Hindi blogs and online forums. In this paper, a scorebased sentiment mining system for Hindi language is discussed, which captures the sentiment behind the words of book review sentences. We conducted three experiments using scores from the HindiSentiWordNet (H-SWN), first using parts-of-speech tags of opinion words to extract their potential scores. Then, we focused on word-sense disambiguation (WSD) to increase the accuracy of system. Finally, the classification results were improved by handling morphological variations. The results were validated against human annotations achieving an overall accuracy of 86.3%. The work was extended further using Hindi Subjective Lexicon (HSL). We also developed an annotated corpus of book reviews in Hindi.
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGEijnlc
Sentiment analysis has been performed in different languages and in various domains, such as movie reviews, product reviews and tourism reviews. However, not much work has been done in the area of books considering the high availability of book reviews on Hindi blogs and online forums. In this paper, a scorebased sentiment mining system for Hindi language is discussed, which captures the sentiment behind the
words of book review sentences. We conducted three experiments using scores from the HindiSentiWordNet
(H-SWN), first using parts-of-speech tags of opinion words to extract their potential scores. Then, we focused on word-sense disambiguation (WSD) to increase the accuracy of system. Finally, the classification results were improved by handling morphological variations. The results were validated against human annotations achieving an overall accuracy of 86.3%. The work was extended further using Hindi Subjective Lexicon (HSL). We also developed an annotated corpus of book reviews in Hindi.
Nowadays peoples are actively involved in giving comments and reviews on social networking websites
and other websites like shopping websites, news websites etc. large number of people everyday share
their opinion on the web, results is a large number of user data is collected .users also find it trivial task
to read all the reviews and then reached into the decision. It would be better if these reviews are
classified into some category so that the user finds it easier to read. Opinion Mining or Sentiment
Analysis is a natural language processing task that mines information from various text forms such as
reviews, news, and blogs and classify them on the basis of their polarity as positive, negative or neutral.
But, from the last few years, user content in Hindi language is also increasing at a rapid rate on the Web.
So it is very important to perform opinion mining in Hindi language as well. In this paper a Hindi
language opinion mining system is proposed. The system classifies the reviews as positive, negative and
neutral for Hindi language. Negation is also handled in the proposed system. Experimental results using
reviews of movies show the effectiveness of the system.
In this paper, we made a survey on Word Sense Disambiguation (WSD). Near about in all major languages
around the world, research in WSD has been conducted upto different extents. In this paper, we have gone
through a survey regarding the different approaches adopted in different research works, the State of the
Art in the performance in this domain, recent works in different Indian languages and finally a survey in
Bengali language. We have made a survey on different competitions in this field and the bench mark
results, obtained from those competitions.
Stemming is the process of clipping off the affixes from the input word to obtain the respective root word, but it is not necessary that stemming provide us the genuine and meaningful root
word. To overcome this problem we come up with a solution- Lemmatizer. It is the process by which we crave out the lemma from the given word and can also add additional rules to make
the clipped word a proper stem. In this paper we have created an inflectional lemmatizer which generates the rules for extracting the suffixes and also added rules for generating a proper meaningful root word.
ISOLATING WORD LEVEL RULES IN TAMIL LANGUAGE FOR EFFICIENT DEVELOPMENT OF LAN...kevig
With the advent of social media, the amount of text available for processing across different natural languages has become enormous. In the past few decades, there has been tremendous increase in the number of language processing applications. The tools for natural language computing of various languages are very different because each language has its own set of grammatical rules. This paper focuses on identifying the basic inflectional principles of Tamil language at word level. Three levels of word inflection concepts are considered – Patterns, Rules and Exceptions. How grammatical principles for
word inflections in Tamil can be grouped in these three levels and applied for obtaining different word forms is the focus of this paper. These can be made use of in a wide variety of natural language applications like morphological analysis, orphological generation, word level translation, spelling and grammar check, information extraction etc. The tools using these rules will account for faster operation and better implementation of Tamil grammatical rules referred from [த ொல்த ொப்பியம் |
tholgaappiyam] and [ நன்னூல் | nannool] in NLP applications.
Using automated lexical resources in arabic sentence subjectivityijaia
A common point in almost any work on Sentiment analysis is the need to identify which elements of
language (words) contribute to express the subjectivity in text. Collecting of these elements (sentiment
words) regardless the context with their polarities (positive/negative) is called sentiment lexical resources
or subjective lexicon. In this paper, we investigate the method for generating Sentiment Arabic lexical
Semantic Database by using lexicon based approach. Also, we study the prior polarity effects of each word
using our Sentiment Arabic Lexical Semantic Database on the sentence-level subjectivity and multiple
machine learning algorithms. The experiments were conducted on MPQA corpus containing subjective and
objective sentences of Arabic language, and we were able to achieve 76.1 % classification accuracy.
USING AUTOMATED LEXICAL RESOURCES IN ARABIC SENTENCE SUBJECTIVITYijaia
A common point in almost any work on Sentiment analysis is the need to identify which elements of
language (words) contribute to express the subjectivity in text. Collecting of these elements (sentiment
words) regardless the context with their polarities (positive/negative) is called sentiment lexical resources
or subjective lexicon. In this paper, we investigate the method for generating Sentiment Arabic lexical
Semantic Database by using lexicon based approach. Also, we study the prior polarity effects of each word
using our Sentiment Arabic Lexical Semantic Database on the sentence-level subjectivity and multiple
machine learning algorithms. The experiments were conducted on MPQA corpus containing subjective and
objective sentences of Arabic language, and we were able to achieve 76.1 % classification accuracy.
Phonetic Dictionary for Natural Language Processing: KannadaIJERA Editor
India has 22 officially recognized languages: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Tamil, Telugu, and Urdu. Clearly, India owns the language diversity problem. In the age of Internet, the multiplicity of languages makes it even more necessary to have sophisticated Systems for Natural Language Process. In this paper we are developing the phonetic dictionary for natural language processing particularly for Kannada. Phonetics is the scientific study of speech sounds. Acoustic phonetics studies the physical properties of sounds and provides a language to distinguish one sound from another in quality and quantity. Kannada language is one of the major Dravidian languages of India. The language uses forty nine phonemic letters, divided into three groups: Swaragalu (thirteen letters); Yogavaahakagalu (two letters); and Vyanjanagalu (thirty-four letters), similar to the vowels and consonants of English, respectively.
Acoustic Duration of Arabic Utterances Spoken by the Indonesian Learners of A...iosrjce
IOSR Journal of Humanities and Social Science is a double blind peer reviewed International Journal edited by International Organization of Scientific Research (IOSR).The Journal provides a common forum where all aspects of humanities and social sciences are presented. IOSR-JHSS publishes original papers, review papers, conceptual framework, analytical and simulation models, case studies, empirical research, technical notes etc.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
A Tool to Search and Convert Reduplicate Words from Hindi to Punjabi
1. Shivani Sachdeva Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 8( Version 3), August 2014, pp.31-35
www.ijera.com 31 | P a g e
A Tool to Search and Convert Reduplicate Words from Hindi to Punjabi Shivani Sachdeva, Dr. Ashwani Sethi Department of Computer Science, GKU Talwandi Sabo Professor, GKU Talwandi Sabo Department of Computer Science, GKU Talwandi Sabo
I. Introduction
1.1 Introduction
Language, the best means of communication, is not a static one. It changes from time to time. The causes for these changes may be borrowing, language contact and interference and convergence from other languages. Due to these processes many structural features may come from one language into another Language. These exchanges reflect in the phonological, morphological and lexical levels. While working with the various kind of languages there are some circumstances when there is reduplication of words that is repetition of all or the part of the word. Reduplication, the doubling or partial doubling of syllables and words, is very frequent in Hindi-Urdu. It may apply to full words of any kind: nouns, adjectives, adverbs, verbs (but not auxiliary verbs), even pronouns: 1.2 Reduplication Definition Reduplication is the repetition of all or part of the base with or without internal change before or after the base itself. It is of two types: 1.3 Types… Morphological & Lexical 1.3.1. Morphological Reduplication: Morphological reduplication refers to the minimally meaningful and segmentally indivisible morphemes that are a larger number of expressions used in speech where sound and sense seems to be united. These expressions have been termed as Onomatopoeia such as: 1. Some Acoustic Noises Monkey chattering U? U? Cat Chattering Mu?Mu? 2. Noises of Natural Phenomenon Rain pattering tap tap Thundering Sound gar gar 3. Noises by Humans Laughing Ha! Ha! Khick! Khick!
1.3.2. Lexical Reduplication: Lexical reduplication refers to the repetition of any sequence of phonological units comprising a word. Lexical reduplication, unlike morphological reduplication, is not minimally meaningful and thus can be further divided as they are formed of two identical words or two non – identical phonological words. It can be partial or complete at this level.“Complete Lexical Reduplication is constituted of two identical (bimodal) words, Ex.: baiThebaiThe “While sitting” in Hindi.Partial reduplication, on the other hand, is constituted of partial repetition of a word either phonologically or semantically. Echo words such as khanavana “Food etc.” or compounds It consists of four types: 1.3.2.1.Echo-Formation: An echo word is defined as a partially repeated form of the base word; partially repeated in the sense that either the initial phoneme which may be either consonant or vowel or the syllable of the base is replaced by another phoneme or another syllable. Here one thing we find that the replacer sound sequences are more or less fixed and rigid. In Bengali repetition starts with –T, Punjabi –S and Hindi –V. The echo word has neither any individual occurrence nor any meaning of its own in the language. It acquires the status of a meaningful element only after it is attached to a word. Here are some examples Hindi Gloss nam – wam Name khun – Vun Blood asan – vasan Easy Phal – val Fruit pyar – vyar Love gana- -vana Song Echo Formation Used in Sentences: 1 nam - vam: nam – vam pata kar ke kya karoge? (Hindi) 2 khun -vun: khun – vun nikle ki nehi? (Hindi) 3. asan -vasan : asan - vasan question mat kya karo (Hindi)
RESEARCH ARTICLE OPEN ACCESS
2. Shivani Sachdeva Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 8( Version 3), August 2014, pp.31-35
www.ijera.com 32 | P a g e
4. phal -val : phal – val kha kar wo ghar chala gaya (Hindi) 5.pyar -vyar : pyar -vyar sab bekarkicheejhai (Hindi) 6gana-vana: gana –vana gaya karo (Hindi) 1.3.2.2. Compounds Compounds refer to the paired constitution in which the second word is not an exact repetition of the first but has some similarity or relationship to the first word either on the semantic or on the phonetic level. A compound may be used independently in a sentence. Here, the paired construction does not have a new meaning. 1.3.2.3. Word Reduplication Word reduplication refers to those paired constructions when a single word or a clause is repeated once in the same sentence without any phonological or morphological variation. This word reduplication is of two types:
Word reduplication Complete partial
i. Complete Reduplication:
When both the parts have meanings and are meaningful, then the reduplication is called a complete reduplication. Examples of Complete Reduplication: Hindi Gloss chalte – chalte Running dhire – dhire Slowly achche –achche Good safed – safed White chhote – chhote Little (ii) Partial Reduplication: Partial reduplication in the sense that here only one word or free morpheme is meaningful other word is meaningless. Examples of Partial Reduplication Hindi Gloss pani – wani Water kam – wam Work chawye-waye Tea pan-wan Betel
1.3.2.4. Discontinuous Reduplication:
It is the reduplication with inter-fusion of a syllable which could be a vowel or vowel- consonant(VC) or consonant-vowel( CV) which is termed as discontinuous reduplication. Example of Discontinuous Reduplication
1. criss-cross: kisi ke sath criss-cross mat khela karo (Hindi)
2. zig-zag: ye rastazig-zaghai (Hindi)
3. tip-top: hamesha tip-top rahakaro (Hindi)
II. Problem Definition
The aim of the research work here is to automatically identify the echo words in hindi language that comes frequently in conversation of the day today life by using existing mechanism that is rule based approach so that we will be able to translate that particular one onto the Punjabi language. To identify these words we can use some lexical dictionary or the other methodologies to be followed. As we know that to work on to that particular research work we have to go through various phases such as the collection of data from various sources of hindi then we have to apply some particular mechanism either rule based or the statistical approach to identify that echo reduplicated word and then finally translate them onto the other language and that is the Punjabi language. 2.1 Sources for the Data: The data for this study are to collected by using various types of methods. The prime data for the present study are collected from various sources like the hindi texts, lexicons, daily news papers , weekly and monthly magazines, short story books, novels, etc.
III. Methodologies
3.1 Methodologies to be used: There are the two approaches that can to be used to achieve the particular goal so as to identify the echo words and then translate that particular one onto the punjabi language. The technique that can be followed are:
Rule based approach
Statistical approach
In the rule based approach, rules are to be followed various rules according to the circumstances or depending upon the situation rules are to be applied where as in statistical approach various statistical tools are to be used to perform the desired task. Here in this research work we will use the first one that is rule based approach in which various rules are to be be developed so that it can identify the echo words reduplication and then work on it as per the requirements
3. Shivani Sachdeva Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 8( Version 3), August 2014, pp.31-35
www.ijera.com 33 | P a g e
IV. Literature Survey
4.1 Introduction This chapter presents a comprehensive literature survey the various types of reduplication with having its fully and partially reduplication in the form of echo words.in this work we will work on to identify the echo reduplicated words in hindi and how to translate them into the other punjabi language. 4.2 Previous Work R.M.K Sinha and Anil Thakur,“Dealing with Replicative Words in Hindi for Machine Translation to English” they discuss different types of replicative words in Hindi and their syntactic and semantic characteristics to formulate rules and strategies to identify their multiple functions and mapping patterns in English for machine translation from Hindi to English.The phenomenon, in a natural language, where a word repeats itself in a sequence without any morphological variation has been referred to as “complete word reduplication”. In this paper, they call this process as replication, retaining the definition1 given in (Abbi, 1992): “those paired constructions when a single word or a clause is repeated once in the same sentence without any phonological or morphological variation. In Thoudam Doren Singh, Sivaji Bandy opadhyay;” Web Based Manipuri Corpus for Multiword NER and Reduplicated MWEs Identification using SVM” A web based Manipuri corpus is developed for identification of reduplicated multiword expression (MWE) and multiword named entity recognition (NER). The mode of corpus collection and the identification of reduplicated MWEs and multiword NE based on support vector machine (SVM) learning technique are reported. The SVM based reduplicated MWE system is evaluated with recall, precision and FScore values of 94.62%, 93.53% and 94.07% respectively outperforming the rule based approach. The recall, precision and F- Score for multiword NE are evaluated as 94.82%, 93.12% and 93.96% respectively. Annie Montaut, Inalco, Paris, Reduplication and „echo words‟ in Hindi/ Urdu, enquire into the various meanings of reduplicationas a linguistic operation, and not as a merely stylistic or expressivedevice. The theoretical frame is Antoine Culioli‟s „énonciative‟ linguistics(notion and located occurrence, notional domain and boundary); contextand intersubjectivity are taken into account as much as possible. section deals with total reduplication, within the nominal, verbal and adjectival category: it shows that reduplication on an occurrence modifies therelation between the reduplicated term and the term syntactically associated to it by denying the occurrence any specific stable value. It thus modifies the scheme of individuation of the notion
Dr. H.S. Ananthanarayana (1976) describes “Reduplication in Sanketi Tamil”.. He defines the reduplication as “The repetition of all or part of a base word”. To describe the reduplication he considers the onomatopoeic words vs. non- onomatopoeic words. He starts to describe reduplication by taking the onomatopoeic words which are formed from sounds which resemble those associated with the object or action. He gives examples like
1. kaTakaTa “to grind one‟s teeth”
2. kilikili “to laugh heartly”
Dr. Imtiaz Hasnain, Aligarh Muslim University, reviewed Dr. AnvitaAbbi`s book “Reduplication in South Asian Languages: An Areal Typological and Historical Study”. The summary of the book is in the following: i) There is an impending danger in the identification of linguistic area on the basis of commonality of features. ii) With a view to circumventing the impending danger the author rightly prefers to talk of a particular linguistic features area with an emphasis on Reduplicated Structures (RSs). iii) RSs have been examined in the entire major and some minor languages of South Asia. iv) The language data consist of 17 tribal Tibeto-Burman and 4 Austro-Asiatic languages and the remaining five namely Juang, Kurukh, Santhali, Sora and Dakhini, which are spokenoutside the Indian subcontinent but are akin to the Austric branch of the languages of India, Hokkien and Chinese structures have also been looked into, with a view to unravel their RS patterns. v) The RSs used in this study are the ones which are most commonly found in all the languages, thus constituting areal universals. M. Arunachalam (1977) treats echo-words in Tamil. In his article he has taken up three terms iraTTaikkiLavi, aTukkuttoTar and echo-words for consideration. The Tamil grammatical works Tolkaapiyam and Nannuul mentioned iraTTaikiLavi and aTukkuttoTar. They referred to these two forms always occurring as double. If they are occurring as single it won‟t give meaning. These are called as iTaiccolin the Tamil grammar.
Dr. Gnanasundram has written a paper entitled “Echo words in Tamil” in the year1972. He takes up onomatopoeia and lexical doublets for his study and defines echo-words as “When an onomatopoeia is reduplicated and when reduplicated part undergoes a sound change then that is called echo words”. He elaborately has collected all these kinds of words. He analyzes these words and finds out only the sound change which has taken place in these words. Because the first or the second syllable is changed in the duplicated part he considers all the above said words as echo-words. He actually lists the echo words but he does not take them into account for his
4. Shivani Sachdeva Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 8( Version 3), August 2014, pp.31-35
www.ijera.com 34 | P a g e
study. Most probably he does not recognize that one, as actual echo-word. But he lists many onomatopoeia words and lexical doublets as echo-words in his paper on the basis of consonant and vowel changes. Dr. PeriBhaskarrao has published a small book in the year 1977 titled “Reduplication and Onomatopoeia in Telugu.” He notes that Telugu utilizes the reduplication process for bringing out various subtle meaning differences. He includes the formation of echo-word under the reduplication. He suggests that the onomatopoeia occurring, in a reduplicated form, can be taken together.. Sangamitra has written a thesis on “Reduplication in Bengali, Mundari and Telugu: A Linguiistic Study”. In it, her aim is to study not only the reduplication but also the other reduplicative structures like echo – words, expressives and copulative compounds with focus on semantic analysis in Bengali, Mundari and Telugu. She concludes her thesis on three grounds, namely, structural pattern, findings on semantic grounds and future scope. For structural pattern she follows Moravisik‟s proposition and categorizes the reduplication under two heads, namely, monomorphemic model and bimodel. Bengali, Mundari and Telugu languages have identical, discontinuous and partial reduplication Dr. A. Usha Devi (2001) has prepared a small dictionary for Telugu onomatopoeic forms. In the introduction she mentions that the language is operating on two systems, viz., grammatical and lexical. The grammatical system analyses the phoneme, morpheme, sandhi, compound and sentence. The lexical system deals with the lexical categorization on the basis of sense and the relation between the common, opposite and the synonymous words
V. Objectives
5.1 Objectives of the Project Work
i. To thoroughly study of the basic concepts of REDUPLICTION echo words in hindi.
ii. To automatically identify the reduplicated echo words in hindi language.
iii. To use various resources such as lexical dictionaries or other methodologies.
iv. To translate the particular words onto the Punjabi language that is to be required
VI. Work to be carried out
6.1 Work to be carried out in Fourth Semester
Reduplication is very common in our day today language conversation and the different languages have different properties related to it. Such that these languages have their own combination of vowel/consonants, consonant/vowels and so on .but the problem here to be discussed is that the reduplicated words that comes out while working with the hindi language have to be identified automatically by the system.
The framework of rule based approach is being followed here to identify such words, as we know the rule based approach uses various rules depending upon their sentences and all Following work will be carried out during this semester.
Study the problem of reduplication
Study various past and recent researches on the topic reduplication.
Study the previous implementation of reduplicated echo words translation.
Translate reduplicated echo words of hindi to punjabi
Thoroughly study about rule based approach.
Comparative analysis of proposed research existing research.
References [1] M.D SohelRana (P hd)nov 2010, “reduplication in bengali language”; Volume 10:11 ;pp 1930-2940. [2] Kishorjit, N. &Sivaji, B., (2010) Identification of MWEs Using CRF in Manipuri and Improvement Using Reduplicated MWEs, In the Proceedings of 8th International Conference on Natural Language (ICON-2010), IIT Kharagpur, India, pp51
[3] Thoudam Doren Singh, Sivaji Bandyopadhyay, (august 2010)” Web Based Manipuri Corpus for Multiword NER and Reduplicated MWEs Identification using SVM” Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing (WSSANLP), ,the 23rd International Conference on Computational Linguistics (COLING), Beijing, pages 35- 42 [4] A.PARIMALAGANTHAM,(8 august 2009)“A STUDY OF STRUCTURAL REDUPLICATION IN TAMIL AND TELUGU”: LANGUAGE IN INDIA ,Strength for Today and Bright Hope for Tomorrow “Volume 9 : ISSN 1930-2940 [5] JANA NOVOTNA,(2000)”REDUPLICATION IN SWAHILI”, AAP 64 : Swahili Forum VII • 57-73 [6] Rajendra singh (2005)“Reduplication in Modern Hindi and the Theory of Reduplication”, In Hurch (ed.), 263-81. [6] Jeffrey Lidz,” (1999) “Echo Reduplication in Kannada”:Implications for a Theory of Word-formation” U. Penn Working Papers in Linguistics, Volume 6.3.
5. Shivani Sachdeva Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 8( Version 3), August 2014, pp.31-35
www.ijera.com 35 | P a g e
[7] Akinlabi, A.M. & E. E Urua (1996) “A Prosodic Account of Reduplication in Ibibio”. Nigerian Language Studies. “A Journal of the National Institute for Nigerian Languages, Vol. 4” 1-15 [8] Giannakis, Geokrgios, 1992. “Reduplication as a Morphological marker in the Indo- Europian Languages” Word, vol.43. No 2. [9] Al Mtenje, 1988, “On tone and transfer in Chichew: A Reduplication” Linguistics26. pp. 125-155. [10] Anvita Abbi.(1997)”Reduplicated Adverbs in Hindi Indian Linguistics”Journal of the Linguistic Society of India vol:38;1977:pp: 125-135.