Sentiment analysis is a natural language processing task that extracts sentiment from various text forms and classifies them as positive, negative, or neutral in polarity. It analyzes the emotions, feelings, and attitude of a speaker or writer towards a topic. This paper gives a comparative study of various sentiment classification techniques and discusses in detail the two main categories of such techniques: machine learning based and lexicon based. The paper also presents the challenges associated with sentiment analysis, along with the lexical resources available.
Sentiment analysis is an important current research area, and the demand for sentiment analysis and classification is growing day by day. This paper presents a novel method to classify Urdu documents, as no prior work on sentiment classification for Urdu text has been reported. We address the problem of determining whether a review or sentence is positive, negative, or neutral. For this purpose we use two machine learning methods: Naïve Bayes and Support Vector Machines (SVM). First the documents are preprocessed and sentiment features are extracted; then the polarity is calculated, judged, and classified using the machine learning methods.
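The preprocess-extract-classify scheme described above can be illustrated with a minimal from-scratch multinomial Naïve Bayes classifier. This is a sketch only: the tiny English reviews below are hand-made placeholders, not the paper's Urdu corpus, and real preprocessing would be far richer.

```python
# Minimal multinomial Naive Bayes polarity classifier (illustrative
# sketch; the tiny English reviews stand in for a real Urdu corpus).
import math
from collections import Counter, defaultdict

train = [("great product works well", "pos"),
         ("excellent quality very happy", "pos"),
         ("terrible broke after a day", "neg"),
         ("worst purchase do not buy", "neg")]

word_counts = defaultdict(Counter)   # per-class word frequencies
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    # Log-space scoring with add-one (Laplace) smoothing.
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("very happy great quality"))   # -> pos
```

An SVM would replace only the final scoring step; the preprocessing and feature extraction stages stay the same.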
Evaluation of Support Vector Machine and Decision Tree for Emotion Recognitio... (journalBEEI)
In this paper, the performance of Support Vector Machine (SVM) and Decision Tree (DT) classifiers in recognizing emotions from Malay folklores is presented. This work continues our storytelling speech synthesis work by adding emotions for more natural storytelling. A total of 100 documents from children's short stories are collected and used as the datasets of the text-based emotion recognition experiment. Term Frequency-Inverse Document Frequency (TF-IDF) features are extracted from the text documents and classified using SVM and DT. Four common emotions, namely happy, angry, fearful and sad, are classified using the two classifiers. Results showed that DT outperformed SVM by more than 22.2% in accuracy. However, the overall emotion recognition rate is only moderate, suggesting that improvement is needed in future work. The accuracy of the emotion recognition should be improved in future studies by using semantic feature extractors or by incorporating deep learning for classification.
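The TF-IDF weighting that the abstract relies on can be sketched in a few lines of plain Python. The three mini "story" documents below are illustrative stand-ins, not the Malay folklore dataset:

```python
# Hand-rolled TF-IDF (with a smoothed idf), sketching the feature
# extraction step described above; the mini documents are illustrative.
import math

docs = ["the tiger was angry and roared",
        "the mousedeer was happy and laughed",
        "the children were fearful in the forest"]
tokenized = [d.split() for d in docs]
N = len(tokenized)

def tfidf(term, doc_tokens):
    tf = doc_tokens.count(term) / len(doc_tokens)   # term frequency
    df = sum(term in d for d in tokenized)          # document frequency
    idf = math.log(N / df) + 1                      # smoothed idf
    return tf * idf

print(round(tfidf("angry", tokenized[0]), 3))   # high: unique to doc 0
print(round(tfidf("the", tokenized[0]), 3))     # low: occurs everywhere
```

The resulting per-document weight vectors are what a classifier such as SVM or a decision tree is trained on.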
Natural Language Toolkit (NLTK) is a generic platform for processing data in various natural (human) languages, and it also provides resources for Indian languages such as Hindi, Bangla and Marathi. In the proposed work, the repositories provided by NLTK are used to carry out the processing of Hindi text and, further, the analysis of Multiword Expressions (MWEs). MWEs are lexical items that can be decomposed into multiple lexemes and display lexical, syntactic, semantic, pragmatic and statistical idiomaticity. The main focus of this paper is the processing and analysis of MWEs in Hindi text. The corpus used for Hindi text processing is taken from the famous Hindi novel "KaramaBhumi" by Munshi PremChand. The result analysis is done using the Hindi corpus provided by the Resource Centre for Indian Language Technology Solutions (CFILT). Results are analysed to justify the accuracy of the proposed work.
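A common first step for spotting MWE candidates of the kind studied above is frequency-based bigram extraction, sketched here with only the standard library. The transliterated toy tokens are illustrative assumptions, not text from the actual corpus:

```python
# Frequency-based bigram candidates for multiword expressions, as a
# stdlib-only sketch; the transliterated tokens are illustrative only.
from collections import Counter

tokens = ("karam bhumi ki katha karam bhumi mein shuru hoti hai "
          "karam bhumi ek upanyas hai").split()

bigrams = Counter(zip(tokens, tokens[1:]))
# Repeated adjacent pairs are MWE candidates for the further lexical,
# syntactic and statistical idiomaticity checks the paper describes.
candidates = [pair for pair, n in bigrams.items() if n > 1]
print(candidates)   # [('karam', 'bhumi')]
```

Association measures such as pointwise mutual information would normally follow this raw-frequency filter.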
Comparative performance analysis of two anaphora resolution systems (ijfcstjournal)
Anaphora resolution is the process of finding referents in a given discourse and is one of the complex tasks of linguistics. This paper presents a performance analysis of two computational models that use the Gazetteer method for resolving anaphora in the Hindi language. In the Gazetteer method, different classes (gazettes) of elements are created; these gazettes provide external knowledge to the system. The two models use Recency and Animistic factors for resolving anaphors. For the Recency factor, the first model uses the Centering approach and the second uses the Lappin and Leass approach; gazetteers provide the animistic knowledge. This paper presents the experimental results of both models. The experiments are conducted on short Hindi stories, news articles and biography content from Wikipedia. The accuracy of each model is analyzed, and finally a conclusion is drawn about the model best suited to the Hindi language.
Analysis of anaphora resolution system for... (ijitjournal)
Anaphora resolution is a complex problem in linguistics and has attracted the attention of many researchers. It is the problem of identifying referents in a discourse, and it plays an important role in natural language processing tasks. This paper focuses entirely on pronominal anaphora resolution for the English language, in which pronouns refer to an intended noun in the discourse. Two computational models are proposed for anaphora resolution. Resolution of anaphora is based on various factors, among which these models use the Recency factor and Animistic knowledge. The Recency factor is implemented using the Lappin and Leass approach in the first model and the Centering approach in the second. Information about animacy is obtained by the Gazetteer method; identifying animistic elements improves the accuracy of the system. The paper presents experiments conducted with both models on data sets from different domains. Comparative results of the two models are summarized, and a conclusion is drawn about the best-suited model.
Creation of speech corpus for emotion analysis in Gujarati language and its e... (IJECEIAES)
In the last couple of years, emotion recognition has proven its significance in the areas of artificial intelligence and man-machine communication. Emotion recognition can be done using speech or images (facial expressions); this paper deals with speech emotion recognition (SER) only. For emotion recognition, an emotional speech database is essential. In this paper we propose an emotional database developed in Gujarati, one of the official languages of India. The proposed speech corpus covers six emotional states: sadness, surprise, anger, disgust, fear and happiness. To observe the effect of different emotions, analysis of the proposed Gujarati speech database is carried out using established speech parameters, namely pitch, energy and MFCC, using MATLAB software.
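One of the parameters named above, short-time energy, is simple to compute frame by frame. The paper works in MATLAB; this stdlib Python fragment only mirrors the idea, and the synthetic sine wave, sample rate, and frame sizes are illustrative assumptions rather than values from the corpus:

```python
# Short-time (frame) energy, one of the speech parameters named above;
# the synthetic 200 Hz sine is a stand-in for real Gujarati speech.
import math

sr = 8000                                # assumed sample rate (Hz)
signal = [math.sin(2 * math.pi * 200 * t / sr) for t in range(sr)]

frame, hop = 256, 128                    # assumed frame/hop sizes
energies = [sum(x * x for x in signal[i:i + frame])
            for i in range(0, len(signal) - frame, hop)]
print(len(energies), round(max(energies), 1))
```

Pitch tracking and MFCC extraction follow the same framing step, but apply autocorrelation and mel filterbank analysis per frame instead of a plain sum of squares.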
Business intelligence analytics using sentiment analysis: a survey (IJECEIAES)
Sentiment analysis (SA) is the study and analysis of sentiments, appraisals and impressions expressed by people about entities, persons, happenings, topics and services. SA uses text analysis techniques and natural language processing methods to locate and extract information from big data. As most people are networked through social websites, they tend to express their sentiments through them. These sentiments prove fruitful to individuals, businesses and governments for making decisions. The impressions posted on the various available sources are used by organizations to gauge the market mood about the services they provide. Analyzing the huge volume of moods expressed with different features and styles poses a challenge. This paper focuses on understanding the fundamentals of sentiment analysis and the techniques used for sentiment extraction and analysis. These techniques are then compared in terms of accuracy, advantages and limitations. Based on the accuracy expected of an approach, a suitable technique can be chosen.
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi... (TELKOMNIKA JOURNAL)
Sentiment analysis in short informal texts such as product reviews is especially challenging: short texts are sparse, noisy, and lack context information. Traditional text classification methods may not be suitable for analyzing the sentiment of short texts given these difficulties. A common approach to overcoming them is to enrich the original texts with additional semantics to make them resemble a large document of text, after which traditional classification methods can be applied. In this study, we developed an automatic sentiment analysis system for short informal Indonesian texts using Naïve Bayes and synonym-based feature expansion. The system consists of three main stages: preprocessing and normalization, feature expansion, and classification. After preprocessing and normalization, we use Kateglo to find synonyms of every word in the original text and append them. Finally, the text is classified using Naïve Bayes. The experiments show that the proposed method can improve the performance of sentiment analysis of short informal Indonesian product reviews; the best sentiment classification performance using the proposed feature expansion reaches an accuracy of 98%. The experiments also show that feature expansion gives a higher improvement with a small amount of training data than with a large amount.
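The synonym-based feature expansion step can be sketched with a small hand-made synonym dictionary standing in for Kateglo (the Indonesian lexical resource the paper uses); the mapping below is an illustrative assumption, not real Kateglo data:

```python
# Synonym-based feature expansion sketch; the tiny dictionary below is
# a hand-made stand-in for the Kateglo lexical resource.
SYNONYMS = {
    "bagus": ["baik", "hebat"],        # "good" and two rough synonyms
    "jelek": ["buruk"],                # "bad" and a rough synonym
}

def expand(text):
    # Append every known synonym after the original token, making the
    # short review "look like" a larger document before classification.
    out = []
    for tok in text.split():
        out.append(tok)
        out.extend(SYNONYMS.get(tok, []))
    return " ".join(out)

print(expand("produk bagus sekali"))   # -> produk bagus baik hebat sekali
```

The expanded string, not the original review, is what gets fed to the Naïve Bayes classifier, which is why sparse short texts benefit most.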
Several attempts have been made to analyze emotion words in the fields of linguistics, psychology and sociology; with the advent of computers, the analysis of these words has taken a different dimension. Unfortunately, limited attempts have so far been made to use interval type-2 fuzzy logic (IT2FL) to analyze these words in native languages. This study uses IT2FL to analyze Igbo emotion words. IT2F sets are computed using the interval approach method, which is divided into two parts: the data part and the fuzzy set part. The data part preprocesses the data and computes statistics for the intervals that survive the preprocessing stages, while the fuzzy set part determines the nature of the footprint of uncertainty; IT2F set mathematical models are also computed for each emotion characteristic of each emotion word. The data used in this work were collected from fifteen subjects who were asked to enter an interval for each of the emotion characteristics Valence, Activation and Dominance in an interval survey of thirty Igbo emotion words. With this, the words are analyzed and can be used for translation between vocabularies with consideration of context.
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE (Journal For Research)
Natural Language Processing (NLP) techniques are among the most used techniques in the field of computer applications, and the area has become vast and advanced. Language is the means of communication among humans, and in the present scenario, where everything is computerized, communication between computers and humans has become a necessity. To fulfill this necessity, NLP emerged as a means of interaction that narrows the gap between machines (computers) and humans. It evolved from the study of linguistics and was put through the Turing test, though early systems were limited to small sets of data. Later, various algorithms were developed, along with the concept of AI (Artificial Intelligence), for the successful execution of NLP. In this paper, the main emphasis is on the different NLP techniques developed to date, their applications, and a comparison of those techniques on different parameters.
Anaphora resolution in Hindi language using gazetteer method (ijcsa)
Anaphora resolution is one of the active research areas within the realm of natural language processing. Resolution of anaphoric reference is one of the most challenging and complex tasks to be handled. This paper focuses entirely on pronominal anaphora resolution for the Hindi language. Among the various methodologies for resolving anaphora, this paper presents a computational model for Hindi based on the Gazetteer method, in which lists are created and operations are then applied to classify the elements present in them. Of the many salient factors for resolving anaphora, the proposed model uses two: Animacy and Recency. The Animistic factor distinguishes living from non-living things, whereas Recency gives referents mentioned in the current sentence higher weight than those in previous sentences. The paper presents experiments conducted on short Hindi stories, news articles and biography content from Wikipedia, their results, and future directions for improving accuracy.
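A toy version of the gazetteer-plus-recency scheme: animate nouns come from a gazette list, and among compatible candidates the most recent mention wins. The gazette entries and the mention list here are illustrative assumptions, and the example uses English tokens rather than Hindi:

```python
# Toy gazetteer + recency anaphora resolver; the gazette and mentions
# are illustrative stand-ins for the Hindi resources described above.
ANIMATE_GAZETTE = {"ram", "sita", "teacher", "dog"}

def resolve(pronoun, mentions):
    """Pick an antecedent for an animate pronoun: filter candidates by
    the animacy gazette, then prefer the most recent mention."""
    candidates = [m for m in mentions if m.lower() in ANIMATE_GAZETTE]
    return candidates[-1] if candidates else None

# Mentions in discourse order; "he" binds to the latest animate one,
# skipping the inanimate "school".
mentions = ["Ram", "school", "teacher"]
print(resolve("he", mentions))   # -> teacher
```

A fuller system would also score gender and number agreement before applying the recency preference.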
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor (Waqas Tariq)
In recent decades, speech-interactive systems have gained increasing importance. The performance of an ASR system mainly depends on the availability of a large speech corpus. The conventional method of building a large-vocabulary speech recognizer for any language uses a top-down approach to speech. This approach requires a large speech corpus with sentence- or phoneme-level transcription of the speech utterances; the transcriptions must also cover different speech orders so that the recognizer can build models for all the sounds present. But for Telugu, because of its complex nature, a very large, well-annotated speech database is very difficult to build. It is very difficult, if not impossible, to cover all the words of any Indian language, where each word may have thousands or even millions of word forms. A significant part of the grammar that is handled by syntax in English (and similar languages) is handled within morphology in Telugu: phrases comprising several words (that is, tokens) in English map onto a single word in Telugu. The Telugu language is phonetic in nature in addition to being morphologically rich, which is why speech technology developed for English cannot be applied directly to Telugu. This paper highlights work carried out in an attempt to build a voice-enabled text editor with automatic term suggestion. The main claim of the paper is the recognition enhancement process we developed for highly inflecting, morphologically rich languages. This method increases speech recognition accuracy with a much-reduced corpus size, and it adapts Telugu words to the database dynamically, resulting in growth of the corpus.
Paraphrases are sentences that differ in their textual content or in the arrangement of their words but convey the same meaning. Identifying a paraphrase is exceptionally important in various real-life applications such as information retrieval, plagiarism detection, text summarization and question answering. A large amount of work on paraphrase detection has been done in English and many Indian languages; however, there is no existing system to identify paraphrases in Marathi.
SCORE-BASED SENTIMENT ANALYSIS OF BOOK REVIEWS IN HINDI LANGUAGE (ijnlc)
Sentiment analysis has been performed in different languages and in various domains, such as movie reviews, product reviews and tourism reviews. However, not much work has been done in the area of books, despite the high availability of book reviews on Hindi blogs and online forums. In this paper, a score-based sentiment mining system for the Hindi language is discussed, which captures the sentiment behind the words of book review sentences. We conducted three experiments using scores from the Hindi SentiWordNet (H-SWN): first using parts-of-speech tags of opinion words to extract their potential scores, then focusing on word-sense disambiguation (WSD) to increase the accuracy of the system, and finally improving the classification results by handling morphological variations. The results were validated against human annotations, achieving an overall accuracy of 86.3%. The work was extended further using the Hindi Subjective Lexicon (HSL). We also developed an annotated corpus of book reviews in Hindi.
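The score-based classification described above ultimately reduces to summing per-word polarity scores from a lexicon. A stdlib sketch follows; the mini lexicon with transliterated keys is a hand-made stand-in for H-SWN, not actual H-SWN data, and the real system additionally uses POS tags, WSD and morphological handling:

```python
# Score-based polarity sketch; the mini lexicon is a hand-made
# stand-in for Hindi SentiWordNet (H-SWN), transliterated for clarity.
LEXICON = {"achchha": 0.75, "sundar": 0.5, "kharab": -0.75, "bura": -0.5}

def polarity(tokens):
    # Sum the lexicon scores; unknown words contribute nothing.
    score = sum(LEXICON.get(t, 0.0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity(["kitab", "achchha", "sundar"]))   # -> positive
print(polarity(["kahani", "kharab"]))             # -> negative
```

WSD and morphological normalization both act before this step, by choosing the right lexicon entry for each surface word.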
Nowadays, people are actively giving comments and reviews on social networking websites and other sites such as shopping and news websites. A large number of people share their opinions on the Web every day, resulting in a large amount of collected user data. Users find it a tedious task to read all the reviews before reaching a decision; it would be better if these reviews were classified into categories so that users find them easier to read. Opinion mining, or sentiment analysis, is a natural language processing task that mines information from various text forms such as reviews, news and blogs and classifies them on the basis of their polarity as positive, negative or neutral. Over the last few years, user content in the Hindi language has also been increasing at a rapid rate on the Web, so it is important to perform opinion mining in Hindi as well. In this paper, a Hindi-language opinion mining system is proposed. The system classifies reviews as positive, negative or neutral for the Hindi language, and negation is also handled. Experimental results using movie reviews show the effectiveness of the system.
An Improved Sentiment Classification for Objective Word (IJSRD)
Sentiment classification is an ongoing and interesting area of research because of its application in various fields. Customer sentiments play a very important role in daily life. Currently, sentiment classification focuses on subjective statements and ignores objective statements, which also carry sentiment. During sentiment classification, problems arise from the ambiguous senses (meanings) of words and from negation words. In the word-sense disambiguation method, semantic scores are calculated from SentiWordNet for WordNet gloss terms: the correct sense of a word is extracted by determining the similarity among WordNet gloss terms. SentiWordNet extracts the first sense of a word, which is its general sense. This work aims at improving sentiment classification by modifying the sentiment values returned by SentiWordNet and comparing the classification accuracy of support vector machines and naïve Bayes.
Opinion Mining in Hindi Language: A Survey (ijfcstjournal)
Opinions are very important in the lives of human beings, and they help humans make decisions. As the impact of the Web increases day by day, Web documents can be seen as a new source of opinion for human beings. The Web contains a huge amount of information generated by users through blogs, forum entries, social networking websites and so on. To analyze this large amount of information, a method is required that automatically classifies the information available on the Web. This domain is called Sentiment Analysis and Opinion Mining. Opinion Mining or Sentiment Analysis is a natural language processing task that mines information from various text forms such as reviews, news, and blogs and classifies them on the basis of their polarity as positive, negative or neutral. Over the last few years, an enormous increase has been seen in Hindi-language content on the Web. Research in opinion mining has mostly been carried out in English, but it is important to perform opinion mining in Hindi as well, since a large amount of information in Hindi is also available on the Web. This paper gives an overview of the work that has been done for the Hindi language.
Senti-Lexicon and Analysis for Restaurant Reviews of Myanmar Text (IJAEMSJORNAL)
Social media has become influential with the rapidly growing popularity of online customer reviews available on social sites, written in informal language and with emoticons. These reviews are very helpful for new customers and for the decision-making process. Sentiment analysis states the feelings and opinions in people's reviews together with their sentiment. Most researchers have applied sentiment analysis to the English language; no research efforts have sought to provide sentiment analysis of Myanmar text. To tackle this problem, we propose a Myanmar-language resource for mining food and restaurant reviews. This paper aims to build a language resource to overcome language-specific problems and to extract opinion words from Myanmar-text consumer reviews. We adopt the dictionary-based approach of lexicon-based sentiment analysis for opinion-word extraction in the food and restaurant domain. This research also assesses the challenges and problems faced in sentiment analysis of the Myanmar language for future work.
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS... (mathsjournal)
For a one-dimensional homogeneous, isotropic aquifer without accretion, the governing Boussinesq equation under Dupuit assumptions is a nonlinear partial differential equation. In the present paper an approximate analytical solution of the nonlinear Boussinesq equation is obtained using the Homotopy perturbation transform method (HPTM). The solution is compared with the exact solution, and the comparison shows that the HPTM is efficient, accurate and reliable. The effects of two important aquifer parameters, namely specific yield and hydraulic conductivity, on the height of the water table are studied. The results agree well with the physical phenomena.
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS (mlaij)
Sentiment analysis and opinion mining have emerged as popular and efficient techniques for information retrieval and web data analysis. The exponential growth of user-generated content has opened new horizons for research in the field of sentiment analysis. This paper proposes a model for sentiment analysis of movie reviews using a combination of natural language processing and machine learning approaches. Firstly, different data pre-processing schemes are applied to the dataset. Secondly, the behaviour of two classifiers, Naive Bayes and SVM, is investigated in combination with different feature selection schemes to obtain the results for sentiment analysis. Thirdly, the proposed model for sentiment analysis is extended to obtain results for higher-order n-grams.
Word and Sentence Level Emotion Analyzation in Telugu Blog and News (IJCSEA Journal)
Emotion analysis, a recent sub-discipline at the crossroads of information retrieval and computational linguistics, is becoming increasingly important from the application viewpoint of affective computing. Emotion is crucial to identify, as it is not open to objective observation or verification. In this paper, emotion analysis of blog texts has been carried out for a less privileged language, Telugu, and the same system has been applied to the English SemEval 2007 affect-sensing corpus containing only news headlines. A set of six emotion tags, namely happy, sad, anger, fear, surprise and disgust, has been selected for this emotion detection task for reliable and semi-automatic annotation of blog and news data. A Conditional Random Field (CRF) based classifier has been applied to recognize the six basic emotion tags for the different words of a sentence. The classifier accuracy has been improved by arranging an equal distribution of emotional tags and the non-emotional tag. A score-based technique has been adopted to calculate and assign tag weights to each of the six emotion tags. A sense-based scoring strategy has been applied to identify sentence-level emotion scores for the six emotion tags based on the acquired word-level emotion tags. Sentence-level emotion tagging has been carried out based on the maximum obtained sentence-level emotion scores. Evaluation has been conducted for each emotion class separately on 200 test sentences from each of the Telugu blog and English news data. The system achieved accuracies of 69.82% and 71.06% for happy, 70.24% and 66.42% for sad, 65.73% and 64.27% for anger, 76.01% and 69.90% for disgust, 72.19% and 73.59% for fear, and 70.54% and 66.64% for surprise emotion classes on blog and news test data respectively.
EXTENDING THE KNOWLEDGE OF THE ARABIC SENTIMENT CLASSIFICATION USING A FOREIG... (ijnlc)
This article introduces a methodology for analyzing sentiment in Arabic text using a global foreign lexical source. Our method leverages resources available in another language, such as the SentiWordNet in English, to supplement the limited language resources of Arabic. The knowledge taken from the external resource is injected into the feature model while the machine-learning-based classifier is trained. The first step of our method is to build the bag-of-words (BOW) model of the Arabic text. The second step calculates the polarity score using machine translation and the English SentiWordNet. The scores for each text are added to the model as three values: objective, positive, and negative. The last step of our method involves training the ML classifier on that model to predict the sentiment of the Arabic text. Our method increases performance compared with the baseline BOW model in most cases. In addition, it seems a viable approach to sentiment analysis of Arabic text where available resources are limited.
Sentiment analysis is inevitable in the current era. The Internet is growing day by day, and nowadays everything is online: we can shop, buy, and sell online, and people can give feedback and opinions on the internet. Customers can compare various products by analyzing product reviews. As more and more people from different age groups and languages become new internet users, we need sentiment analysis in regional languages. To date, most work related to sentiment analysis has been done in the English language, but when it comes to Indian languages, not much research has been done except for a few. This paper mainly focuses on performing sentiment analysis in one of the Indian languages, namely Marathi.
Automatic classification of bengali sentences based on sense definitions pres... (ijctcm)
Based on the sense definition of words available in the Bengali WordNet, an attempt is made to classify the
Bengali sentences automatically into different groups in accordance with their underlying senses. The input
sentences are collected from 50 different categories of the Bengali text corpus developed in the TDIL
project of the Govt. of India, while information about the different senses of particular ambiguous lexical
item is collected from the Bengali WordNet. On an experimental basis we have used the Naive Bayes probabilistic
model as a useful classifier of sentences. We have applied the algorithm over 1747 sentences that contain a
particular Bengali lexical item which, because of its ambiguous nature, is able to trigger different senses
that render sentences in different meanings. In our experiment we have achieved around 84% accurate
result on the sense classification over the total input sentences. We have analyzed those residual sentences
that did not comply with our experiment and did affect the results to note that in many cases, wrong
syntactic structures and less semantic information are the main hurdles in semantic classification of
sentences. The applicational relevance of this study is attested in automatic text classification, machine
learning, information extraction, and word sense disambiguation.
A novel meta-embedding technique for drug reviews sentiment analysis (IAESIJAI)
Traditional word embedding models have been used in the feature extraction process of deep learning models for sentiment analysis. However, these models ignore the sentiment properties of words while maintaining the contextual relationships and have inadequate representation for domain-specific words. This paper proposes a method to develop a meta embedding model by exploiting domain sentiment polarity and adverse drug reaction (ADR) features to render word embedding models more suitable for medical sentiment analysis. The proposed lexicon is developed from the medical blogs corpus. The polarity scores of the existing lexicons are adjusted to assign new polarity score to each word. The neural network model utilizes sentiment lexicons and ADR in learning refined word embedding. The refined embedding obtained from the proposed approach is concatenated with original word vectors, lexicon vectors, and ADR feature to form a meta-embedding model which maintains both contextual and sentimental properties. The final meta-embedding acts as a feature extractor to assess the effectiveness of the model in drug reviews sentiment analysis. The experiments are conducted on global vectors (GloVE) and skip-gram word2vector (Word2Vec) models. The empirical results demonstrate the proposed meta-embedding model outperforms traditional word embedding in different performance measures.
Sentiment analysis on Bangla conversation using machine learning approach (IJECEIAES)
Nowadays, online communication is more convenient and popular than face-to-face conversation, so people prefer online communication over face-to-face meetings. Enormous numbers of people use online chatting systems to speak with their loved ones at any given time throughout the world, and people create massive quantities of conversation every second because of their online engagement. People's feelings during the conversation period can be gleaned as useful information from these conversations. Text analysis and summarization of any material can be done using sentiment analysis with natural language processing. The use of conversations in customer service portals of various e-commerce platforms, and in crime investigations based on digital evidence, is increasing the need for sentiment analysis of conversations. Other languages, such as English, have well-developed libraries and resources for natural language processing, yet few studies have been conducted on Bangla. It is more challenging to extract sentiments from Bangla conversational data due to the language's grammatical complexity; as a result, it opens vast study opportunities. So, support vector machine, multinomial naïve Bayes, k-nearest neighbors, logistic regression, decision tree, and random forest were used. From the dataset, the extracted information was labeled as positive and negative.
One fundamental problem in sentiment analysis is categorization of sentiment polarity. Given a piece of written text, the problem is to categorize the text into one specific sentiment polarity, positive or negative (or neutral). Based on the scope of the text, there are three distinctions of sentiment polarity categorization, namely the document level, the sentence level, and the entity and aspect level. Consider a review “I like multimedia features but the battery life sucks.” This sentence has a mixed emotion. The emotion regarding multimedia is positive whereas that regarding battery life is negative. Hence, it is required to extract only those opinions relevant to a particular feature (like battery life or multimedia) and classify them, instead of taking the complete sentence and the overall sentiment. In this paper, we present a novel approach to identify pattern specific expressions of opinion in text.
International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.2, April 2015
DOI: 10.5121/ijcsa.2015.5202
A SURVEY OF SENTIMENT CLASSIFICATION TECHNIQUES USED FOR INDIAN REGIONAL LANGUAGES
Pooja Pandey and Sharvari Govilkar
Department of Computer Engineering, University of Mumbai, PIIT, New Panvel, India
ABSTRACT
Sentiment Analysis is a natural language processing task that extracts sentiment from various text forms
and classifies them according to positive, negative or neutral polarity. It analyzes emotions, feelings, and
the attitude of a speaker or a writer towards a context. This paper gives a comparative study of various
sentiment classification techniques and discusses in detail the two main categories of sentiment
classification techniques: machine learning based and lexicon based. The paper also presents challenges
associated with sentiment analysis along with the lexical resources available.
KEYWORDS
NLP, sentiment, sentiment analysis, classification techniques, challenges, lexical resources, features,
machine learning, lexicon based.
1. INTRODUCTION
Sentiment Analysis (SA) is a natural language processing task that deals with finding the orientation
of opinion in a piece of text with respect to a topic [1]. It deals with analyzing the emotions, feelings,
and attitude of a speaker or a writer from a given piece of text. Sentiment Analysis involves
capturing a user's behaviour and the likes and dislikes of an individual from the text. The main goal
of sentiment analysis is to identify the sentiment associated with a text by extracting the
sentimental context from the text.

The purpose of sentiment analysis is to determine the attitude or inclination of a communicator
through the contextual polarity of their speaking or writing. This attitude may be reflected in
their own judgment, the emotional state of the subject, or the state of any emotional communication
they use to affect a reader or listener. In short, it tries to determine a person's state of mind about
the subject they are communicating about. This information can be mined from various data
sources such as texts, tweets, blogs, social media, news articles and product comments.
There are three classification levels in SA: document level, sentence level and aspect level.
Document-level SA aims to classify the opinion of a whole document as expressing a positive or
negative sentiment. Sentence-level SA aims to classify the sentiment expressed in each sentence,
which involves first identifying whether the sentence is subjective or objective. Aspect-level SA aims to
classify the sentiment with respect to the specific aspects of entities, which is done by identifying
the entities and their aspects. For instance, researchers need a tool to generate summaries for
deciding whether to read an entire document, and for summarizing information searched by
users on the internet; news groups can use multi-document summarization to cluster information
from different media and summarize it.
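As a rough illustration of the difference between document-level and sentence-level classification, the following toy sketch (with a hypothetical lexicon and review text, not drawn from the survey) scores the same review at both levels:

```python
# Toy illustration of document-level vs sentence-level sentiment.
# The lexicon and review text are invented for this example.
LEXICON = {"good": 1, "great": 1, "poor": -1, "terrible": -1}

def sentence_polarity(sentence):
    # Sentence-level SA: label one sentence by summing lexicon scores.
    score = sum(LEXICON.get(w.strip(".,").lower(), 0) for w in sentence.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def document_polarity(document):
    # Document-level SA: one overall label for the whole text.
    sentences = [s for s in document.split(".") if s.strip()]
    score = sum(1 if sentence_polarity(s) == "positive"
                else -1 if sentence_polarity(s) == "negative" else 0
                for s in sentences)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

review = "The camera is great. The battery is terrible. Overall a good phone."
print(document_polarity(review))  # one label for the document
print([sentence_polarity(s) for s in review.split(".") if s.strip()])
```

Note how the document-level label hides the per-sentence disagreement that sentence-level (and, more finely, aspect-level) analysis would expose.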
This paper presents a detailed survey of various sentiment classification techniques. Related work
and past literature are discussed in Section 2. The baseline algorithm is defined in Section 3, along
with the challenges associated with performing sentiment analysis. The two main categories of sentiment
classification techniques, machine-learning-based SA and lexicon-based SA, are discussed in
detail in Section 4, along with a comparison of each method cited for Indian regional languages.
Finally, Section 5 concludes the paper.
2. LITERATURE SURVEY
In this section we review the relevant past literature on research done in the field of sentiment
analysis for Indian languages.
Amitava Das and Bandyopadhyay developed SentiWordNet for the Bengali language, an
automatically constructed lexical resource in which WordNet synsets are assigned positive and
negative scores. SentiWordNet and the Subjectivity Word List were used to generate a merged sentiment
lexicon from which duplicate words were removed. The Bengali SentiWordNet was created by applying
word-level lexical transfer, using an English-Bengali dictionary, to the content available in the English
SentiWordNet [4].
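The word-level lexical-transfer step can be sketched as follows. The score table and the English-Bengali dictionary entries below are invented placeholders, not actual SentiWordNet or dictionary data:

```python
# Sketch of word-level lexical transfer: project English SentiWordNet-style
# (positive, negative) scores onto a target language through a bilingual
# dictionary. All entries are toy placeholders.
english_scores = {"good": (0.75, 0.0), "bad": (0.0, 0.625)}   # word -> (pos, neg)
en_to_bn = {"good": "bhalo", "bad": "kharap"}                 # toy EN->BN dictionary

def build_target_lexicon(scores, bilingual):
    lexicon = {}
    for en_word, (pos, neg) in scores.items():
        target = bilingual.get(en_word)
        if target is not None:            # skip words with no translation
            lexicon[target] = (pos, neg)  # transfer the scores unchanged
    return lexicon

print(build_target_lexicon(english_scores, en_to_bn))
```

A real pipeline would also have to handle one-to-many translations and duplicate target words, which is why the authors merge lexicons and remove duplicates.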
The authors of [1] developed a modified approach to identify the sentiments associated with Hindi
content by handling negation and discourse relations. They updated the existing Hindi
SentiWordNet (HSWN) by extracting words with the same meaning from the English SentiWordNet
when specific sentiment words were not found in the existing HSWN. Through handling of the negation and
discourse associated with the text, their proposed algorithm achieved approximately 80% accuracy in the
classification of Hindi reviews.
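A minimal sketch of this kind of negation handling, with an invented toy lexicon and negation list rather than actual HSWN entries, flips the polarity of sentiment words that fall within a short window after a negation marker:

```python
# Minimal negation-handling sketch: invert the polarity of lexicon words
# that appear within WINDOW tokens after a negation marker.
# Lexicon and negation list are illustrative, not HSWN data.
LEXICON = {"accha": 1, "kharab": -1}   # toy Hindi words: good / bad
NEGATIONS = {"nahin", "na"}            # toy negation markers
WINDOW = 2                             # tokens after a negation that get flipped

def score_with_negation(tokens):
    total, flip_until = 0, -1
    for i, tok in enumerate(tokens):
        if tok in NEGATIONS:
            flip_until = i + WINDOW    # open a negation scope
            continue
        polarity = LEXICON.get(tok, 0)
        total += -polarity if i <= flip_until else polarity
    return total

print(score_with_negation(["film", "accha", "hai"]))           # → 1
print(score_with_negation(["film", "nahin", "accha", "hai"]))  # → -1
```

The fixed window is a simplification; discourse-aware approaches like [1] determine the negation scope from the structure of the sentence instead.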
Amandeep Kaur and Vishal Gupta proposed an algorithm for sentiment analysis of Punjabi text.
They used the Hindi WordNet to develop a subjective lexicon for the Punjabi language [8],
applying three popular methods for the generation of a subjective lexicon: use of a bilingual
dictionary, machine translation, and use of WordNet. They then devised an algorithm
combining the unigram method and a simple scoring method, which provides better efficiency. The
overall efficiency of the proposed algorithm is 54.2%.
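The unigram method with simple scoring can be illustrated roughly as below; the lexicon words are invented placeholders, not entries from their Punjabi lexicon:

```python
# Sketch of unigram matching with simple scoring: count positive and
# negative unigrams from a subjective lexicon and label by the majority.
# The lexicon entries are invented placeholders.
POSITIVE = {"changa", "sohna"}   # toy positive unigrams
NEGATIVE = {"mada"}              # toy negative unigram

def classify(tokens):
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(classify(["film", "changa", "sohna", "mada"]))  # → positive (2 vs 1)
```

Because each unigram is scored in isolation, this baseline cannot handle negation or context, which is one reason its reported efficiency stays modest.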
Aditya Joshi and Pushpak Bhattacharyya [2] proposed a fallback strategy for finding the sentiment
associated with Hindi text. Machine translation, in-language SA and resource-based SA were
the three approaches proposed for SA in Hindi. Through WordNet linking, words present in the English
SentiWordNet were replaced by similar Hindi words to construct the Hindi SentiWordNet
(HSWN). To determine the polarity of the opinion in the text, an SVM classifier was used to
perform in-language SA. In the machine-translation-based method, Google Translate was used to
translate the Hindi corpus into English, and the resulting corpus was used as input to the classifier to
determine polarity. In the resource-based SA approach, the synsets corresponding to the English
SentiWordNet were mapped to the corresponding synsets in Hindi to build the SentiWordNet (H-SWN)
for Hindi. The best accuracy, 78.14%, was achieved through in-language sentiment analysis of Hindi
documents.
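The fallback order of the three approaches can be sketched as a simple dispatch. The three stage functions below are stubs that stand in for the real components (in-language SVM, machine translation plus English SA, mapped lexicon); they return a fixed label only when their resource is available, purely to make the fallback order visible:

```python
# Control-flow sketch of a fallback strategy: try in-language SA first,
# fall back to machine translation, then to a resource-based lookup.
# The stage functions are hypothetical stubs, not the authors' components.
def in_language_sa(text, hswn):
    return None if hswn is None else "positive"        # stub: needs HSWN lexicon

def translate_and_classify(text, translator):
    return None if translator is None else "negative"  # stub: MT + English SA

def resource_based_sa(text):
    return "neutral"                                   # stub: last-resort lookup

def classify_with_fallback(text, hswn=None, translator=None):
    for attempt in (lambda: in_language_sa(text, hswn),
                    lambda: translate_and_classify(text, translator),
                    lambda: resource_based_sa(text)):
        label = attempt()
        if label is not None:    # first stage that can answer wins
            return label

print(classify_with_fallback("...", hswn=None, translator=None))
```

The design point is that each stage only fires when the previous, more accurate stage lacks the resources it needs, matching the paper's finding that in-language SA gives the best accuracy when its resources exist.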
Kishorjit and Sivaji Bandyopadhyay proposed verb-based Manipuri sentiment analysis using the
conditional random field (CRF) approach, in which the system learns from training data and can
then be used to test other texts. The text is first processed for part-of-speech tagging using the
CRF. With the help of the POS tagger, the verbs of each sentence are identified, and a modified
lexicon of verbs is used to determine the polarity of the sentiment in the sentence, because the
sentiment of a sentence is highly dependent on its verbs. Their proposed algorithm achieved
approximately 75% accuracy on sentiment analysis of Manipuri [9].
International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.2, April 2015
For detecting the sentiment in Tamil content, the authors of [7] used a self-learning neural network that takes linguistic and part-of-speech emotive features as input. The primary inputs to the neural network are: first, the outputs of a domain classifier, which retrieves nouns and verbs and term-domain frequencies; second, two scorers, a Negation Scorer and a Flow Scorer, which assign a score to each document based on the pleasantness of the words in it; and lastly three taggers, for nouns, verbs and urichol. Unsupervised learning with Hebbian learning was selected, since emotion recognition has to deal with emotional intelligence. A two-dimensional animated face generator is used to display the resulting emotion, which is identified by assigning weights to features based on their affective influence [7].
Das and Bandyopadhyay [5] used different strategies to predict the sentiment of a word in a given text. In one strategy they annotated the words with their associated polarity manually. In another strategy, a bilingual dictionary for English and Indian languages was used to determine the polarity of the text. In the next strategy, the synonym and antonym relations of words in WordNet were used to determine the polarity. The final strategy learned the polarity of the text from pre-annotated corpora.
For extracting sentiment from Urdu text, the authors of [6] used a lexicon-based approach built on SentiUnits. SentiUnits are expressions made up of one or more words that carry the sentiment information of the whole sentence. Shallow parsing is used to identify and extract SentiUnits from the given text. Two types of SentiUnits were used: single adjective phrases and multiple adjective phrases. Adjectives, modifiers and orientation are the three attributes associated with a SentiUnit. The process of sentiment analysis is composed of three phases: pre-processing, where normalization and segmentation of the text are performed; shallow parsing, to extract the SentiUnits; and finally comparison of the extracted SentiUnits with the lexicon, where their polarities are calculated for classification as positive or negative and the overall polarity is obtained by combining the individual polarities [6].
The authors of [3] built a robust sense-based classifier: a supervised document-level sentiment classifier based on a semantic space of WordNet senses. Words in the corpus were annotated with their senses using a combination of manual sense annotation and automatic iterative WSD. In the synset replacement algorithm, when a synset encountered in a test document is not found in the training corpus, it is replaced by one of the synsets present in the training corpus; the substitute synset is chosen using similarity metrics, on the basis of its similarity with the synset in the test document. The synset that is replaced is referred to as an unseen synset, as it is not known to the trained model.
The authors of [10] proposed Cross-Lingual Sentiment Analysis (CLSA) using WordNet senses as features for supervised sentiment classification when machine translation is not available between the specific languages. The concept of a linked WordNet is used to bridge the gap between the two languages. The WordNets of Hindi and Marathi were developed using an expansion approach with shared synset identifiers, so that the words in corresponding synsets are translations of each other in specific contexts. The words of the training and test corpora were mapped to their WordNet synset identifiers, giving new corpora represented in a common feature space, the sense space; a classification model was then learnt on the training corpus and tested on the test corpus. Accuracies of 72% and 84% were obtained for sentiment classification of Hindi and Marathi respectively.
3.SENTIMENT ANALYSIS
Sentiment analysis is the computational study of emotions, opinions and, above all, the sentiment expressed in text by users. It is a challenging task, due to the many difficulties associated with processing natural language. Any sentiment analysis system first needs to extract features, i.e. sentimental words or phrases, from the given text; a suitable text classifier then extracts the overall sentiment associated with the text.
3.1.Challenges for sentiment analysis
3.1.1.Contextual Information
Identifying the context of the text is an important challenge in SA, since the behaviour and use of a word can change greatly depending on the context.
Ex-1 The journey was long.
Ex-2 Seminar was long.
Ex-3 Battery life of Nexus 5 is long.
In all three examples above, the meaning of "long" is the same: it indicates a duration or passage of time. In Ex-1 and Ex-2, "long" suggests boredom and hence a negative expression, whereas in Ex-3, "long" indicates efficiency and hence a positive expression.
3.1.2.Sarcasm Detection
Sarcasm involves a statement or remark that is usually an indirect taunt towards an object, or an appraisal expressed in a negative way. Detecting sarcasm is a tough task for humans and equally hard for machines. An example of sarcasm: "Amazing presentation by Mr. X, I won't ever attend such a presentation again."
3.1.3.Word Sense Disambiguation
Word sense disambiguation (WSD) is the problem of determining in which sense a word having a
number of distinct senses is used in a given sentence. The same word can have multiple
meanings, and based on the sense of its usage the polarity of the word also changes. [19] For
example, the word "cold" has several senses and may refer to a disease, a temperature sensation,
or an environmental condition. The specific sense intended is determined by the textual context in
which an instance of the ambiguous word appears. In "I am taking aspirin for my cold" the
disease sense is intended, in "Let's go inside, I'm cold" the temperature sensation sense is meant,
while "It's cold today, only 2 degrees", implies the environmental condition sense.
3.1.4.Word Order:
Word order plays a vital role in deciding the polarity of a text, in the text same set of words with
slight variations and changes in the word order affect the polarity aspect. For example “X is
efficient than Y” conveys the exact opposite sentiment from “Y is efficient than X”.
3.1.5.Identify subjective portions of text:
The same word can be treated as subjective in one context and objective in another. This makes it difficult to identify the subjective (sentiment-bearing) portions of text. Consider the following examples:
Ex-1 The language of the author was very crude.
Ex-2 Crude oil is extracted from the sea beds.
3.1.6.Indirect negation of sentiment
Negation of sentiment [17] is assigned to words like no, not, never, etc. But there are certain
words tend to reverse the sentiment polarity implicitly. For example the sentence, “This movie
avoids all predictable and boring drama found in most of the bollywood movies.” The negative
sentiment associated with words predictable and boring is reversed by associating word avoid
with those words.
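This implicit reversal can be approximated by treating such words as negators in a simple polarity-flipping heuristic. The sketch below is illustrative only: the word lists, scores and window size are invented for the example, not taken from any cited system.

```python
# Hypothetical word lists for illustration; real systems use full lexicons.
NEGATORS = {"no", "not", "never", "avoids", "lacks"}   # explicit and implicit reversers
LEXICON = {"predictable": -1, "boring": -1, "entertaining": 1}

def sentence_polarity(tokens, window=4):
    """Sum word polarities, flipping a sentiment word's polarity when a
    negator appears within the preceding `window` tokens (a crude heuristic)."""
    score = 0
    for i, word in enumerate(tokens):
        polarity = LEXICON.get(word, 0)
        if polarity and any(t in NEGATORS for t in tokens[max(0, i - window):i]):
            polarity = -polarity
        score += polarity
    return score

tokens = "this movie avoids all predictable and boring drama".split()
print(sentence_polarity(tokens))  # 2: both negative words are flipped by "avoids"
```

A fixed window is of course fragile; real systems typically determine the scope of negation from the parse of the sentence.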
3.1.7.Entity Recognition
Same entity is not seen for all the texts in a document. When multiple entities are being
mentioned about in a single document, the overall document polarity does not make much sense.
We need to separate out the text about a particular entity and then analyze its sentiment. Ex- I
hate Heat, but I like ICE.
3.2.Features for sentiment analysis
Converting a piece of text to a feature vector is the basic step in any data driven approach to SA.
Some commonly used features in Sentiment Analysis are:
3.2.1.Term Presence vs. Term Frequency:
Term frequency is widely used in Information Retrieval (IR) and text classification tasks, but Pang and Lee [11] found that term presence is more important to sentiment analysis than term frequency. For term presence, binary-valued feature vectors are used, in which each entry merely indicates whether a term occurs (value 1) or does not occur (value 0). Presence-based classification is very useful for SA, as the overall sentiment is not usually highlighted through repeated use of the same term, and an occurrence of a rare word can carry more sentimental value than a frequently repeated word.
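The distinction between the two representations can be sketched in a few lines of Python (the vocabulary and tokens here are made up for illustration):

```python
from collections import Counter

def presence_vector(tokens, vocabulary):
    """Binary feature vector: 1 if the term occurs at all, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]

def frequency_vector(tokens, vocabulary):
    """Frequency feature vector: raw count of each term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

vocab = ["good", "bad", "movie"]
tokens = "good good good movie".split()
print(presence_vector(tokens, vocab))   # [1, 0, 1]
print(frequency_vector(tokens, vocab))  # [3, 0, 1]
```

In the presence vector the three occurrences of "good" collapse to a single 1, which is exactly the property Pang and Lee found helpful for sentiment.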
3.2.2.Term Position:
Certain words are given more sentimental value based on their position in a sentence or document. Generally, words appearing in the first few and last few sentences of a text are given more weight than those appearing elsewhere in the document.
3.2.3.Parts of speech (POS):
Part-of-speech (POS) information is commonly exploited in sentiment analysis to find the adjectives and adverbs in the text, as they are important indicators of sentiment. Among all parts of speech (nouns, verbs, adjectives, etc.), the adjective is considered the most important POS for representing sentimental features, or simply subjective text. Adverbs, when they occur alongside adjectives, improve the probability of finding the exact sentiment of a given text.
3.2.4.Topic-Oriented Features:
Interactions between topic and sentiment play an important role in extracting sentiment from content. For example, in a hypothetical article on Reebok, the sentences "Reebok reports that profits rose" and "Target reports that profits rose" could indicate completely different types of news with respect to the subject of the document. Topic information can therefore be incorporated into the feature set, as it also helps in specifying sentiment.
3.2.5.Opinion words and phrases:
There are words like good or bad, like or hate, happy or sad, which are mostly used to express opinions towards an object. Sometimes there are also phrases which express opinions without using any of the opinion words. For example: "It cost me my full month's salary."
3.3.Lexical resources for sentiment analysis
Lexical resources provide [17] a polarity value associated with words. Lexical resources can be developed either manually by humans or automatically by training machines. Manually built lexical resources tend to be more accurate but are time-consuming to build, whereas automatically built resources can attain much higher coverage in less time. WordNet is a hierarchical lexical database whose nodes represent word meanings (synsets) instead of the words themselves, with the relationships between synsets forming the edges between nodes. Multiple words can be assigned to a single node, and the same word may be present in more than one node at a time.
3.3.1.Manually built Lexical Resources
The General Inquirer consists of over 11,000 words compiled from two different sources, with roughly 2,000 positive and 2,000 negative words; each word is tagged as positive, negative or neither. The precision with which the Inquirer tags a word is very high, but because of its low coverage the recall rate is very low. The Bing Liu lexicon consists of around 2,000 positive and 4,000 negative words which are frequently seen in social media. It has an advantage over the General Inquirer in terms of coverage, as it contains more sentiment-bearing words, but it has lower precision. The Subjectivity Lexicon is developed from multiple sources, manually tagged, and contains over 8,000 sentiment words, each tagged as either positive or negative. The problem with manually tagged sentiment lexicons is that they have no notion of degree, such as mildly positive or strongly negative, for deeper classification.
3.3.2.SentiWordNet
SentiWordNet is a lexical resource [17] which is an extension of WordNet. It appends sentiment information to every synset present in WordNet. Positive, negative and objective scores are attached to each synset, and the three scores sum to 1.0 for a synset. A synset's sentiment is not fixed to a single category: the same synset may have non-zero scores for both classes, as it can be positive in one context and negative in another.
3.3.3. SentiWS
SentimentWortschatz, or SentiWS [15], is a German-language resource for sentiment analysis. It contains lists of positive and negative sentiment words, each assigned a weight in the interval [-1, 1] along with a part-of-speech tag and its inflections. Currently SentiWS contains 1,650 negative and 1,818 positive words. The lists contain adjectives and adverbs which explicitly express a sentiment, in addition to nouns and verbs.
3.3.4.WordNet Affect
WordNet Affect is a lexical resource [12] that represents the affective content of synsets by
dividing them into affective categories. Thus, it gives more affective information as compared to
SentiWordNet and is used when analysis to be done is with respect to emotions like sad, anger,
happy, joy and others.
3.4. Baseline Algorithm
The baseline algorithm for sentiment analysis consists of tokenization, feature extraction and sentiment classification [18] using a classifier such as SVM, Naive Bayes, etc.
Figure 1. Baseline Algorithm
We first have to tokenize the sentence, which involves breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens. This is done by segmenting the text, splitting it by spaces and punctuation marks, to form a bag of words. Care should be taken so that short forms such as "don't", "I'll", "she'd" remain as one word.
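A minimal tokenizer along these lines, keeping contractions such as "don't" as one token, might look like the following sketch (the regular expression is one simple choice, not the only one):

```python
import re

def tokenize(text):
    """Lowercase the text, then split into word tokens; the optional
    apostrophe group keeps contractions attached to their word."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())

print(tokenize("Don't worry, I'll watch it; she'd love this movie!"))
```

Punctuation and whitespace are discarded, while each contraction survives as a single token.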
Then important features are identified, such as term presence and frequency, parts of speech (POS), opinion words and phrases, and negations. We need to take care of negations, since they reverse polarities, and decide whether to use only adjectives, adjectives plus adverbs, or simply all the words as features. Lexicon-based or statistical feature selection methods, which treat a document as a bag of words (BOW) or as a string, can be used to select features from documents. Stemming and removal of stop-words are the most common feature selection steps.
After we have tokenized and decided which features to use, we need to classify the sentiment: is it positive or negative? Classification can be done with different algorithms, for example Naïve Bayes, Support Vector Machines, or Maximum Entropy. Lexical resources like dictionaries, WordNet and SentiWordNet are used by these classifier algorithms. The classifier attempts to determine whether a text is objective or subjective, and whether a subjective text contains positive or negative sentiments.
4.SENTIMENT CLASSIFICATION TECHNIQUES
Sentiment classification is a task under Sentiment Analysis (SA) that tags text as positive,
negative or neutral automatically. Thus, a sentiment classifier tags the sentence ‘the movie is
entertaining and totally worth your money!’ in a movie review as positive with respect to the
movie. On the other hand, a sentence ‘The movie is so boring that I was dozing away through the
second half.’ is labelled as negative. Finally, ‘The movie is directed by Nolan’ is labelled as
neutral [2]. There are two main techniques for sentiment classification: machine learning based
and lexicon based. Better performance can be obtained by combining these two methods.
4.1.Machine Learning Approach
The machine learning method uses learning algorithms to determine the sentiment by training on a known dataset. Machine-learning-based techniques mostly need two document sets: a training set and a test set. The training set is used by the classifier to learn the differentiating characteristics of documents, and the test set is used to check the overall performance of the classifier.
4.1.1.Supervised learning
Supervised learning methods depend on the existence of labelled training documents. The supervised learning process has two steps: learning (training), where a model is learnt from the training data, and testing, where the model is tested on unseen test data to assess its accuracy. There are different types of supervised classifiers, such as rule-based classifiers, decision tree classifiers, linear classifiers and probabilistic classifiers.
a. Probabilistic Classifier
Probabilistic classifiers use mixture models for classification, under the assumption that each class is a component of the mixture. Each mixture component is a generative model which provides the probability of sampling a particular term for that component.
1.Naive Bayes Classifier (NB)
The Naive Bayes classifier computes the posterior probability of a class based on the way words are distributed in the document. The positions of words in the document are not considered, as the model uses the bag-of-words (BOW) feature representation. Bayes' theorem is used to predict the probability that a given feature set belongs to a particular label:
P(label | features) = ( P(label) * P(features | label) ) / P(features)
P(label) is the prior probability of a label; P(features | label) is the probability of observing the particular feature set given the label; and P(features) is the prior probability that the given feature set occurs. Under the naive assumption that all features are independent, the equation can be rewritten as:
P(label | features) = ( P(label) * P(f1 | label) * … * P(fn | label) ) / P(features)
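As a concrete illustration of this formula, the toy classifier below estimates the prior and the smoothed per-word likelihoods from a handful of invented training documents, and picks the label with the highest (log-space) posterior. The documents and labels are made up for the example; a real system would train on a labelled corpus.

```python
import math
from collections import Counter, defaultdict

def train_nb(labelled_docs):
    """Estimate priors P(label) and per-label word counts from (tokens, label) pairs."""
    priors = Counter(label for _, label in labelled_docs)
    word_counts = defaultdict(Counter)
    for tokens, label in labelled_docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return priors, word_counts, vocab

def classify_nb(tokens, priors, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    total_docs = sum(priors.values())
    best_label, best_score = None, float("-inf")
    for label in priors:
        score = math.log(priors[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens:
            # Laplace (add-one) smoothing so unseen words get nonzero probability
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [
    (["good", "great", "movie"], "pos"),
    (["excellent", "acting"], "pos"),
    (["boring", "bad", "plot"], "neg"),
    (["terrible", "boring"], "neg"),
]
model = train_nb(docs)
print(classify_nb(["good", "acting"], *model))      # pos
print(classify_nb(["boring", "terrible"], *model))  # neg
```

Working in log space avoids the numerical underflow that multiplying many small probabilities would cause; the constant P(features) is dropped, since it is the same for every label.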
2.Bayesian Network (BN)
A Bayesian network model is a directed acyclic graph in which nodes represent random variables and edges represent conditional dependencies. The complete joint probability distribution (JPD) is specified for the model, so it is reckoned a complete model of the variables and their relationships. BNs are not frequently used for text mining, as the computation is very expensive.
3.Maximum Entropy Classifier (ME)
The Maximum Entropy classifier, also known as a conditional exponential classifier, converts labelled feature sets into encoded vectors. These encoded vectors are used to calculate weights for each feature, which are then combined to determine the most likely label for a feature set. The main parameters of this classifier are a set of weights, used to combine the joint features that are generated from a feature set by an encoding; the encoding maps each (featureset, label) pair to a vector.
B.Linear classifiers
In linear classifiers, the probability of a particular classification is based on a linear combination of features and their weights. A linear classifier decides which class an object belongs to by making a classification decision based on the value of a linear combination of the object's characteristics, which are represented as a feature vector. Different linear classifiers are available.
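The decision rule shared by all linear classifiers can be written in a few lines; the weights below are invented for illustration, and the various linear classifiers (SVMs, perceptrons, logistic regression) differ mainly in how such weights are learned:

```python
def linear_classify(features, weights, bias=0.0):
    """Label by the sign of the weighted combination of features."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "positive" if score >= 0 else "negative"

# Hypothetical weights over the feature vector [count("good"), count("bad")]
weights = [1.5, -2.0]
print(linear_classify([2, 0], weights))  # positive (score 3.0)
print(linear_classify([0, 1], weights))  # negative (score -2.0)
```

The hyperplane score = 0 is the separator; training amounts to positioning that hyperplane so that it best separates the classes.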
1.Support Vector Machines Classifiers (SVM)
The main principle of SVMs is to determine linear separators in the search space which best separate the different classes. Due to its sparse nature, text data is well suited to SVM classification: some of the features are irrelevant, but most tend to be correlated with one another and can generally be organized into linearly separable categories. The basic SVM is non-probabilistic: it simply predicts which of two possible classes forms the output for a given input.
2.Neural Network (NN)
A neural network consists of neurons arranged in layers, which convert an input vector into a meaningful output. Each neuron processes its input by applying a nonlinear function to it, and the output is passed to the next layer for further processing. Most neural networks are designed as feed-forward networks. Signals passing from one neuron to another are assigned different weights, and during the training phase these weights are adjusted so that the network adapts to the particular problem. Multi-layer neural networks are used for non-linear boundaries: the multiple layers create multiple piecewise linear boundaries, which approximate the enclosed regions belonging to a particular output class. Training is more difficult here, as errors must be back-propagated over several layers.
C.Decision tree classifier
In a decision tree classifier, the training data are decomposed hierarchically, using a condition on an attribute value to divide the data. The condition, or predicate, on an attribute typically signifies the presence or absence of one or more words. The data space is divided recursively until the leaf nodes contain a certain minimum number of records, which are then used for classification.
D.Rule based classifiers
In rule-based classifiers, the data space is modelled with a set of rules, where the left-hand side is a condition on the feature set expressed in disjunctive normal form and the right-hand side is the class label. During the training phase, rules are constructed based on various criteria. Support and confidence are the most commonly used criteria, where support signifies the absolute number of instances in the training data set that are relevant to the rule, and confidence refers to the conditional probability that the right-hand side of the rule is satisfied when the left-hand side is satisfied for a given input.
4.1.2.Unsupervised learning
Unsupervised learning deals with finding hidden structure in unlabelled data. There is no error or reward signal to evaluate a potential solution, as the examples given to the learner are unlabelled. Unsupervised methods are useful when the documents to classify are unlabelled. k-nearest neighbour (KNN) is an instance-based machine learning algorithm in which an object is classified by a majority vote among its k nearest neighbours: the object is assigned the class most common among the training objects most similar to it. Selection is based on either majority voting or distance-weighted voting.
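A minimal KNN classifier over bag-of-words vectors, using cosine similarity and simple majority voting, can be sketched as follows (the tiny training set is invented for illustration):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_classify(tokens, training, k=3):
    """Majority vote among the k labelled documents most similar to the query."""
    query = Counter(tokens)
    neighbours = sorted(training, key=lambda item: cosine(query, Counter(item[0])),
                        reverse=True)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

training = [
    ("good great film".split(), "pos"),
    ("wonderful acting good".split(), "pos"),
    ("bad boring plot".split(), "neg"),
    ("awful bad film".split(), "neg"),
]
print(knn_classify("good film".split(), training, k=3))  # pos
```

KNN needs no training phase beyond storing the examples, which is why it is often called a lazy learner; an odd k avoids ties in the two-class case.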
4.2.Lexicon-based approach
The lexicon-based approach calculates the sentiment polarity of a review from the semantic orientation of the words or sentences in the review, where semantic orientation is a measure of the subjectivity and opinion in text. A sentiment lexicon contains lists of words and expressions used to express people's subjective feelings and opinions. For example, starting from positive and negative word lexicons, analyze the document whose sentiment is to be found: if the document contains more positive lexicon words than negative ones, it is positive, otherwise it is negative. The lexicon-based approach to sentiment analysis is a form of unsupervised learning [13], because it does not require prior training in order to classify the data.
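The counting scheme just described can be sketched directly; the two word lists here are tiny invented stand-ins for a real sentiment lexicon:

```python
import re

# Tiny hypothetical lexicons; real resources list thousands of words.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "boring", "terrible", "hate"}

def lexicon_sentiment(text):
    """Classify by the sign of (positive hits - negative hits)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_sentiment("The food was great and the service excellent"))  # positive
print(lexicon_sentiment("A boring, terrible film"))                       # negative
```

No training corpus is needed, which is the defining property of the lexicon-based approach; its accuracy is bounded by the coverage and domain fit of the lexicon.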
Sentiment lexicons are constructed by manual construction, dictionary-based methods or corpus-based methods. Manual construction of a sentiment lexicon is difficult, as it requires humans to assign polarities to sentiment words by hand, and it is a time-consuming task. The dictionary-based method is an iterative technique: a small seed set of sentiment words is first constructed manually, and this set then grows iteratively by adding synonyms and antonyms from WordNet; the process continues until no new words remain to be added to the seed list. A limitation of the dictionary-based approach is that it cannot find opinion words with domain-specific orientations. Corpus-based techniques rely on syntactic patterns in large corpora and can produce opinion words with relatively high accuracy. Most corpus-based methods need very large labelled training data, but they make it easier to find domain-specific opinion words and their orientations towards a context.
The following tables present a comparison of the sentiment classification techniques cited for Indian regional languages.
Table 1. Sentiment Analysis techniques used in Hindi text.
Type: Hindi [2]
Corpus: 250 Hindi movie reviews and English movie reviews
Classification Technique / Features / Accuracy:
  In-language using SVM: term frequency 74.57%; term presence 72.57%; TF-IDF 78.14%
  Machine-translation-based using SVM: TF-IDF 65.96%
  Resource-based using SentiWord list: most common sense 56.35%; all senses 60.31%
Benefit: MT-based systems give superior classification performance as compared to majority-based systems based on lexical resources.
Limit: The error of the machine translation system affects the performance of MT-based SA.
Table 2. Sentiment Analysis techniques used in Hindi text.
Type: Hindi [1]
Corpus: Hindi movie reviews
Classification Technique: Semantic approach using HindiSentiWordNet (HSWN)
Features / Accuracy:
  Improved HSWN: 69.78%
  Improved HSWN + negation: 78.39%
  Improved HSWN + negation + discourse: 80.21%
Benefit: Increases the coverage of HindiSentiWordNet (HSWN).
Limit: Low accuracy for words which have a dual nature.
Table 3. Sentiment Analysis techniques used in Manipuri text.
Type: Manipuri [9]
Corpus: Manipuri newspaper text, 2,75,000 words in total
Classification Technique: Conditional Random Field (CRF)
Features: Part of speech (POS)
Accuracy: Recall: 72.10, Precision: 78.14, F-score: 75.0
Benefit: The model is easy to interpret.
Limit: More methods and algorithms need to be explored and implemented in order to improve the accuracy.
Table 4. Sentiment Analysis techniques used in Punjabi text.
Type: Punjabi [8]
Corpus: Documents written in the Punjabi language
Classification Technique: Subjective lexicon
Features: Part of speech (POS)
Accuracy: Recall: 70, Precision: 78, F-score: 67
Benefit: Better accuracy.
Limit: Performance is low; the lexicon developed for the Hindi language has limited coverage.
Table 5. Sentiment Analysis techniques used in Tamil text.
Type: Tamil [7]
Corpus: Tamil news text
Classification Technique: Neural network
Features: Part of speech (POS)
Accuracy: Precision: 60
Benefit: The emotion is identified by assigning weights to features based on their affective influence.
Limit: Local minima and overfitting.
Table 6. Sentiment Analysis techniques used in Urdu text.
Type: Urdu [6]
Corpus: Movie and product reviews
Classification Technique: Lexicon-based using SentiUnits
Features: Part of speech (POS)
Accuracy: 72% (movie reviews), 78% (product reviews)
Benefit: Achieves better results by using SentiUnits.
Limit: SentiUnits in which adjectives are formed by a postposition combined with a noun cause errors; hence an improved algorithm is required.
5.CONCLUSIONS
Sentiment analysis determines the attitude or inclination of a communicator through the contextual polarity of their speech or writing. Sentiments can be mined from texts, tweets, blogs, social media, news articles, comments, or any other source of information.
Sentiment analysis has become quite popular and has led to the building of better products, understanding of users' opinions, and the execution and management of business decisions. People rely on and make decisions based on reviews and opinions. This research area has given more importance to mass opinion than to word of mouth.
A large amount of work in sentiment analysis has been done for English, as it is a global language, but there is a need to perform sentiment analysis in other languages as well. Large amounts of content in other languages are available on the Web and need to be mined to determine their sentiment.
ACKNOWLEDGEMENTS
I am taking this opportunity to express my gratitude to all the people who contributed in some way to the work described in this paper. My deepest thanks to my project guide for giving timely inputs and intellectual freedom of work. I express my thanks to the head of the Computer Department and to the Principal of Pillai Institute of Information Technology, New Panvel, for extending their support.
REFERENCES
[1] Mittal, Namita, et al. "Sentiment Analysis of Hindi Review based on Negation and Discourse
Relation," 11th Workshop on Asian Language Resources (ALR), In Conjunction with IJCNLP. 2013.
[2] Joshi, Aditya, A. R. Balamurali, and Pushpak Bhattacharyya, "A fall-back strategy for sentiment
analysis in Hindi: a case study," Proceedings of the 8th ICON (2010).
[3] Balamurali, A. R., Aditya Joshi, and Pushpak Bhattacharyya, “Robust sense-based sentiment
classification," Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and
Sentiment Analysis, Association for Computational Linguistics, 2011.
[4] Das, Amitava, and Sivaji Bandyopadhyay. "SentiWordNet for Bangla," Knowledge Sharing Event-4:
Task 2 (2010).
[5] Das, Amitava, and Sivaji Bandyopadhyay, "SentiWordNet for Indian languages," The 8th Workshop
on Asian Language Resources. 2010.
[6] Syed, Afraz Z., Muhammad Aslam, and Ana Maria Martinez-Enriquez. "Lexicon based sentiment
analysis of Urdu text using SentiUnits," Advances in Artificial Intelligence, Springer Berlin
Heidelberg, 2010.
[7] Giruba Beulah and Madhan Karky, “On Emotion Detection from Tamil Text.”
[8] Kaur, Amandeep, and Vishal Gupta. ,"Proposed Algorithm of Sentiment Analysis for Punjabi Text,"
Journal of Emerging Technologies in Web Intelligence 6.2 (2014): 180-183.
[9] Nongmeikapam, Kishorjit, et al. "Verb Based Manipuri Sentiment Analysis," 2014.
[10] Balamurali, A. R. "Cross-lingual sentiment analysis for Indian languages using linked wordnets."
(2012).
[11] Pang, Bo, and Lillian Lee. "Opinion mining and sentiment analysis," Foundations and trends in
information retrieval 2.1-2 (2008): 1-135.
[12] Strapparava, Carlo, Alessandro Valitutti, and Oliviero Stock. "The affective weight of lexicon,"
Proceedings of the Fifth International Conference on Language Resources and Evaluation. 2006.
[13] VOHRA, MRSM, and JB TERAIYA, "A Comparative Study of Sentiment Analysis Techniques,"
JIKRCE, 2013.
[14] Medhat, Walaa, Ahmed Hassan, and Hoda Korashy. "Sentiment analysis algorithms and applications:
A survey." Ain Shams Engineering Journal (2014).
[15] Remus, Robert, Uwe Quasthoff, and Gerhard Heyer, "SentiWS-A Publicly Available German-
language Resource for Sentiment Analysis," LREC, 2010.
[16] Esuli, Andrea, and Fabrizio Sebastiani, "SentiWordNet: A publicly available lexical resource for
opinion mining," Proceedings of LREC, 2006.
[17] http://www.cfilt.iitb.ac.in/resources/surveys/SA-Literature%20Survey-2012-Akshat.pdf.
[18] https://class.coursera.org/nlp/lecture/145
[19] http://wsd.nlm.nih.gov
Authors
Pooja Pandey is currently a graduate student pursuing a masters in Computer
Engineering at PIIT, New Panvel, University of Mumbai, India. She received
her B.E. in Computer Engineering from RGPV University. She has 2 years of
teaching experience. Her areas of interest are Natural Language Processing,
theory of computation and ethical hacking.
Sharvari Govilkar is an Associate Professor in the Computer Engineering Department
at PIIT, New Panvel, University of Mumbai, India. She received her M.E. in
Computer Engineering from the University of Mumbai, where she is currently pursuing
her PhD in Information Technology. She has 17 years of teaching experience. Her
areas of interest are text mining, Natural Language Processing, compiler design
and information retrieval.