SlideShare a Scribd company logo
1 of 9
Download to read offline
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
A SCALABLE, LEXICON BASED TECHNIQUE FOR 
SENTIMENT ANALYSIS 
Chetan Kaushik and Atul Mishra 
Computer Engg. Department, YMCA University of Science & Technology, Faridabad 
ABSTRACT 
Rapid increase in the volume of sentiment rich social media on the web has resulted in an increased 
interest among researchers regarding Sentimental Analysis and opinion mining. However, with so much 
social media available on the web, sentiment analysis is now considered as a big data task. Hence the 
conventional sentiment analysis approaches fails to efficiently handle the vast amount of sentiment data 
available now a days. The main focus of the research was to find such a technique that can efficiently 
perform sentiment analysis on big data sets. A technique that can categorize the text as positive, negative 
and neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large 
data set of tweets using Hadoop and the performance of the technique was measured in form of speed and 
accuracy. The experimental results shows that the technique exhibits very good efficiency in handling big 
sentiment data sets. 
KEYWORDS 
Big data, Hadoop, Lexicon, Machine learning, Negation, NLP, Sentiment Analysis. 
1. INTRODUCTION 
Sentiment Analysis is a Natural Language Processing task that includes obtaining the writer’s 
feeling about several products or services which are posted on the internet via various posts or 
comments. In other words Sentiment Analysis helps in determining the response of a user or a 
group of users on a topic and categorizing their opinion as positive, negative or neutral. In recent 
years, a huge increase is seen in the amount of opinions expressed on the web by social media 
users and hence it has captured the interest of more and more researchers. 
Liu [1] defined a sentiment as a quintuple “<oj, fjk, soijkl, hi, tl >, where oj is a target object, fjk 
is a feature of the object oj, soijkl is the sentiment value of the opinion of the opinion holder hi on 
feature fjk of object oj at time tl, soijkl is +ve, -ve, or neutral, or a more granular rating, hi is an 
opinion holder, tl is the time when the opinion is expressed.” 
Sentiment Analysis has its applications in several fields such as in marketing for knowing 
customers’ response towards a new product or service or by social media users to find the general 
opinion about a topic which is currently trending. It can help manufactures in deciding the 
strategies for the launch of new products based on the responses towards the previous versions of 
that product in various geographical areas. It can also be used to detect heated arguments in 
comments, use of abusive language or spam detection. 
Sentimental Analysis can be phrase based in which the sentiment of a single phrase is taken in to 
consideration, sentence based in which the sentiment of the whole sentence is calculated or it can 
DOI:10.5121/ijfcst.2014.4504 35
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
be document based in which an aggregated sentiment of the whole document is categorized as 
positive, negative or neutral 
Sentiment Analysis is generally carried out in three steps. First, the subject towards which the 
sentiment is directed is found then, the polarity of the sentiment is calculated and finally the 
degree of the polarity is assigned with the help of a sentiment score which denotes the intensity of 
the sentiment. 
Over the last few years, the amount of sentiment rich data on the web has increased rapidly with 
the increase in social media users. Various companies are now turning to online forums, blogs 
etc. to get the review of their products from customers. The sentiment data hence is now 
considered as big data which needs techniques that are capable of handling such large amounts of 
data efficiently. 
In this research a lexicon based approach is used to perform sentiment analysis. There has been a 
lot of research on sentiment calculation using lexicon based and other techniques. Some of which 
are described in the following section. 
2. RELATED WORK 
A lot of work has been done in the field of opinion mining or sentiment analysis for well over a 
decade now. Different techniques are used to classify the text according to polarity. Most of these 
techniques can be classified under two categories: 
2.1. Machine learning 
Machine learning strategies work by training our algorithm with a training data set before 
applying it to the actual data set. That is, a machine learning algorithm needs to be trained first for 
both supervised and unsupervised learning tasks. 
Machine learning techniques first trains the algorithm with some particular inputs with known 
outputs so that later it can work with new unknown data. Some of the most renowned works 
based on machine learning are as follows: 
Melville [2], Rui [3], Ziqiong [4], Songho [5], Qiang [6] and Smeureanu [7] used naïve bayes for 
classification of text. Naïve bayes is one of the most popular method in text classification. It is 
considered as one of the most simple and efficient approaches in NLP. It works by calculating the 
probability of an element being in a category. First the prior probability is calculated which is 
afterwards multiplied with the likelihood to calculate the final probability. The method assumes 
every word in the sentence to be independent which makes it easier to implement but less 
accurate. That’s why the method is given the name ‘naïve’. 
Rui [3], Ziqiong [4], Songho [5], and Rudy [8] used SVM (Support Vector Machines) in their 
approach. The sentiment classification was done using discriminative classifier. This approach is 
based on structural risk minimization in which support vectors are used to classify the training 
data sets into two different classes based on a predefined criteria. Multiclass SVM can also be 
used for text classification [9]. 
Songho [5] used centroid classifier which assign a centroid vector to different training classes and 
then uses this vector to calculate the similarity values of each element with a class. If the 
similarity value exceeds a defined threshold then the element is assigned to that class (polarity in 
this case). 
36
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
Songho [5] also used K-Nearest Neighbor (KNN) approach which finds ‘K’ nearest neighbours of 
a text document among the documents in the training data set. Then classification is performed on 
the basis of similarity score of a class with respect to a neighbour. 
Some approaches use prediction method. Winnow first predicts a class for an entity and then 
feedback is used to check for mistakes. The weight vectors are changed according to the feedback 
and the process is repeated over a large training data set. 
A combined approach was proposed based on included rule classification, machine learning and 
supervised learning [8]. Each training set was subjected to a 10 fold cross validation process. 
Several classifiers were used in series. Whenever a classifier failed, the text was passed on to the 
next classifier until all the texts are classified or there are no remaining classifiers Rui [3] used 
Ensemble technique which combines the output of several classification methods into a single 
integrated output. 
Another approach based on artificial neural networks was used to divide the document into 
positive, negative and fuzzy tone [10]. The approach was based on recursive least squares back 
propagation training algorithm. 
Long-Sheng [11] used a neural network based approach to combine the advantages of machine 
learning and information retrieval techniques. 
2.2. Lexicon Based 
Lexicon Based techniques work on an assumption that the collective polarity of a document or 
sentence is the sum of polarities of the individual words or phrases. Some of the significant works 
done using this technique are: 
Kamps [12] used a simple technique based on lexical relations to perform classification of text. 
Andrea [13] used word net to classify the text using an assumption that words with similar 
polarity have similar orientation. 
Ting-Chun [14] used an algorithm based on pos (part of speech) patter. A text phrase was used as 
a query for a search engine and the results were used to classify the text. 
Prabhu [15] which used a simple lexicon based technique to extract sentiments from twitter data 
Turney [19] used semantic orientation on user reviews to identify the underlying sentiments. 
Taboada [20] used lexicon based approach to extract sentiments from microblogs. 
Sentiment analysis for microblogs is more challenging because of problems like use of short 
length status message, informal words, word shortening, spelling variation and emoticons. Twitter 
data was used for sentiment analysis by [21]. 
Negation word can reverse the polarity of any sentence. Taboada [20] performed sentiment 
analysis while handling negation and intensifying words. Role of negation was surveyed by [25]. 
Minquing [26] classified the text using a simple lexicon based approach with feature detection. 
It was observed that most of these existing techniques doesn’t scale to big data sets efficiently. 
While various machine learning methodologies exhibits better accuracy than lexicon based 
techniques, they take more time in training the algorithm and hence are not suitable for big data 
sets. In this paper, lexicon based approach is used to classify the text according to polarity. 
37
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
3. PROPOSED APPROACH 
As explained earlier the focus of this research was to device an approach that can perform 
sentiment analysis quicker because vast amount of data needed to be analyzed. Also, it had to be 
made sure that accuracy is not compromised too much while focusing on speed. 
The proposed approach is a lexicon based technique i.e. a dictionary of sentiment bearing words 
along with their polarities was used to classify the text into positive, negative or neutral opinion. 
Machine learning techniques are not used because although they are more accurate than the 
lexicon based approaches, they take far too much time performing sentiment analysis as they have 
to be trained first and hence are not efficient in handling big sentiment data. 
The main component of this approach is the sentiment lexicon or dictionary. 
3.1. Sentiment Lexicon 
Unlike the small sized dictionaries which are used in other approaches, which are build up from a 
small set of words using a training data set, the dictionary used in this technique contains a large 
set of sentiment bearing words along with their polarity. 
The dictionary is domain specific i.e. the polarities of the words in the dictionary are set 
according to a specific domain e.g. book reviews, political blogs etc. Same word in different 
domains can have different meanings, the dictionary used in this approach is made for movie 
review domain. 
The dictionary contains all forms of a word i.e. every word is stored along with its various verb 
forms e.g. applause, applauding, applauded, applauds. Hence eliminating the need for stemming 
each word which saves more time. 
Emoticons are generally used by people to depict emotions. Hence it is obvious that they contain 
very useful sentiment information in them. The dictionary used in the approach contains more 
than 30 different emoticons along with their polarities. 
The Dictionary also contains the strength of the polarity of every word. Some word depicts 
stronger emotions than others. For example good and great are both positive words but great 
depicts a much stronger emotion. 
Negation and blind negation are very important in identifying the sentiments, as their presence 
can reverse the polarity of the sentence. The dictionary used here also contains various negation 
and blind negation words so that they can be identified in the sentence. 
The dictionary contains several different adjectives, nouns, negation words, emoticons etc. Some 
of these words are shown as an example in table 1. 
38 
Table 1. Some examples of the words and emoticons in the sentiment dictionary 
Strength Word POS Stemmed Polarity 
weaksubj abandoned adj n negative 
weaksubj abandonment noun n negative 
weaksubj abandon verb y negative 
strongsubj needed verb n blindnegation
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
39 
strongsubj require verb n blindnegation 
strongsubj not advb n negation 
strongsubj neither conj n negation 
strongsubj nor conj n negation 
strongsubj :) emoti n positive 
strongsubj :( emoti n negative 
3.2. Feature detection using Twitter hashtags 
A hashtag is a word or an unspaced phrase prefixed with the number sign ("#"). It is a form of 
metadata tag. Words in messages on microblogging and social networking services such as 
Twitter, Facebook, Google+ or Instagram may be tagged by putting "#" before them, either as 
they appear in a sentence or they are appended to it. 
In this research, twitter data is categorized according to polarity. Detecting the subject towards 
which the sentiment is directed is a tedious task to perform, but as twitter is used as data source 
hashtags can be used to easily identify the subject hence eliminating the need for using a complex 
mechanism for feature detection thereby saving time and effort. 
3.3. Handling negation and blind negation 
Negation words are the words which reverse the polarity of the sentiment involved in the text. For 
example ‘the movie was not good’. Although the word ‘good’ depicts a positive sentiment the 
negation – ‘not’ reverses its polarity. In the proposed approach whenever a negation word is 
encountered in a tweet, its polarity is reversed. 
Blind negation words are the words which operates on the sentence level and points out a feature 
that is desired in a product or service. For example in the sentence ‘the acting needed to be 
better’, ‘better’ depicts a positive sentiment but the presence of the blind negation word- ‘needed’ 
suggests that this sentence is actually depicting negative sentiment. In the proposed approach 
whenever a blind negation word occurs in a sentence its polarity is immediately labelled as 
negative. 
3.4. Sentiment Calculation 
Sentiment calculation is done for every tweet and a polarity score is given to it. If the score is 
greater than 0 then it is considered to be positive sentiment on behalf of the user, if less than 0 
then negative else neutral. The polarity score is calculated by using algorithm 1. 
Algorithm 1: ALGO_SENTICAL 
Input: Tweets, SentiWord_Dictionary Output: Sentiment (positive, negative or neutral) 
1) BEGIN 
2) For each tweet Ti 
3) { 
4) SentiScore = 0; 
5) For each word Wj in Ti that exists in Sentiword_Dictionary 
6) { 
7) If polarity[Wj] = blindnegation
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
8) { 
9) Return negative; 
10) } 
11) Else 
12) { 
13) If polarity[Wj] = positive && strength[Wj] = Strongsubj 
14) { 
15) SentiScore = SentiScore + 1; 
16) } 
17) Else If polarity[Wj] = positive && strength[Wj] = Weaksubj 
18) { 
19) SentiScore = SentiScore + 0.5; 
20) } 
21) Else If polarity[Wj] = negative && strength[Wj] = Strongsubj 
22) { 
23) SentiScore = SentiScore – 1; 
24) } 
25) Else If polarity[Wj] = negative && strength[Wj] = Weaksubj 
26) { 
27) SentiScore = SentiScore – 0.5; 
28) } 
29) } 
30) If polarity[Wj] = negation 
31) { 
32) Sentiscore = Sentiscore * -1 
33) } 
34) } 
35) If Sentiscore of Ti >0 
36) { 
37) Sentiment = positive 
38) } 
39) Else If Sentiscore of TI<0 
40) { 
41) Sentiment = negative 
42) } 
43) Else 
44) { 
45) Sentiment = neutral 
46) } 
47) Return Sentiment 
48) } 
49) END 
4. PERFORMANCE EVALUATION 
In this section, description of the tools which were used to perform sentiment analysis is given 
along with the performance of the approach and its comparison with existing approaches. 
4.1. Experimental Setup 
The proposed algorithm was implemented using a sandboxed version of Hadoop. Hadoop is 
designed to work in a huge multimode environment consisting of thousands of servers situated at 
40
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
different locations. But for research purposes often a single node virtual environment is used that 
creates an illusion of several nodes which are situated at different locations and are working 
together. An Intel Core i5-3210M CPU@2.50GHz processor with 6 GB memory was used to 
simulate the Hadoop Environment. Data was imported from Twitter using Flume, a distributed, 
reliable, and available service for efficiently collecting, aggregating, and moving large amounts of 
streaming data into the Hadoop Distributed File System (HDFS). 
Then the data is refined and the algorithm described in the previous section is implemented upon 
it with the help of HiveQL. HiveQL is an SQL like language which is used in hadoop for 
interacting and manipulating huge databases. 
4.2. Results and Evaluation 
As explained earlier the purpose of this research was to device a method that can quickly compute 
the sentiments of huge data sets without compromising too much with accuracy. 
The proposed approach has performed very well in terms of speed. It took 14.8 Seconds to 
analyse the sentiments behind 6,74,412 tweets. Table 1 shows the performance of the algorithm. 
Moreover the speed can be substantially improved by using an actual multimode hadoop 
environment instead of a single node sandboxed configuration. 
The algorithm also performed very well in terms of accuracy. Tweets were analyzed with an 
accuracy of 73.5 %. As expected it is not as high as the machine learning methodologies but it 
shows a fairly good accuracy when compared with other lexicon based approaches. Figure 1 
shows a comparison between average accuracies of machine learning and lexicon based 
techniques discussed in section II and the accuracy of the proposed approach. 
41 
Table 2 Performance of the proposed approach 
Number of Tweets Time Taken Accuracy 
6,74,412 14.8 seconds 73.5 % 
Figure 1. Comparison of the proposed approach with other methodologies
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
5. CONCLUSION AND FUTURE WORK 
Sentiment analysis is being used for different applications and can be used for several others in 
future. It is evident that its applications will definitely expand to more areas and will continue to 
encourage more and more researches in the field. We have done an overview of some state-of-the- 
art solutions applied to sentiment classification and provided a new approach that scales to 
big data sets efficiently. 
A scalable and practical lexicon based approach for extracting sentiments using emoticons and 
hashtags is introduced. Hadoop was used to classify Twitter data without need for any kind of 
training. 
Our approach performed extremely well in terms of both speed and accuracy while showing signs 
that it can be further scaled to much bigger data sets with similar, in fact better performance. 
In this research, main focus was on performing sentiment analysis quickly so that big data sets 
can be handled efficiently. The work can be further expanded by introducing techniques that 
increase the accuracy by tackling problems like thwarted expressions and implicit sentiments 
which still needs to be resolved properly. 
Also as explained earlier, this work was implemented on a single node sandboxed configuration 
and although it is expected that it will perform much better in a multimode enterprise level 
configuration, it is desirable to check its performance in such environment in future. 
REFERENCES 
[1] B. Liu. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, Second 
42 
Edition, (editors: N. Indurkhya and F. J. Damerau), 2010. 
[2] Melville, Wojciech Gryc, “Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text 
Classification”, KDD‟09, June 28–July 1, 2009, Paris, France.Copyright 2009 ACM 978-1-60558- 
495-9/09/06. 
[3] Rui Xia, Chengqing Zong, Shoushan Li, “Ensemble of feature sets and classification algorithms for 
sentiment classification”, Information Sciences 181 (2011) 1138–1152. 
[4] Ziqiong Zhang, Qiang Ye, Zili Zhang, Yijun Li, “Sentiment classification of Internet restaurant 
reviews written in Cantonese”, Expert Systems with Applications xxx (2011) xxx–xxx. 
[5] Songbo Tan, Jin Zhang, “An empirical study of sentiment analysis for chinese documents”, Expert 
Systems with Applications 34 (2008) 2622–2629. 
[6] Qiang Ye, Ziqiong Zhang, Rob Law, “Sentiment classification of online reviews to travel destinations 
by supervised machine learning approaches”, Expert Systems with Applications 36 (2009) 6527– 
6535. 
[7] Ion SMEUREANU, Cristian BUCUR, “Applying Supervised Opinion Mining Techniques on Online 
User Reviews”, Informatica Economică vol. 16, no. 2/2012. 
[8] Rudy Prabowo, Mike Thelwall, “Sentiment analysis: A combined approach.” Journal of Informetrics 
3 (2009) 143–157. 
[9] Kaiquan Xu , Stephen Shaoyi Liao , Jiexun Li, Yuxia Song, “Mining comparative opinions from 
customer reviews for Competitive Intelligence”, Decision Support Systems 50 (2011) 743–754. 
[10] ZHU Jian , XU Chen, WANG Han-shi, “" Sentiment classification using the theory of ANNs”, The 
Journal of China Universities of Posts and Telecommunications, July 2010, 17(Suppl.): 58–62 .[16] 
Ziqiong Zhang, Qiang Ye, Zili Zhang, Yijun Li, “Sentiment classification of Internet restaurant 
reviews written in Cantonese”, Expert Systems with Applications xxx (2011) 
[11] Long-Sheng Chen, Cheng-Hsiang Liu, Hui-Ju Chiu, “A neural network based approach for sentiment 
classification in the blogosphere”, Journal of Informetrics 5 (2011) 313–322.
International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
[12] Kamps, Maarten Marx, Robert J. Mokken and Maarten De Rijke, “Using wordnet to measure 
semantic orientation of adjectives”, Proceedings of 4th International Conference on Language 
Resources and Evaluation, pp. 1115-1118, Lisbon, Portugal, 2004. 
[13] Andrea Esuli and Fabrizio Sebastiani, “Determining the semantic orientation of terms through gloss 
classification”, Proceedings of 14th ACM International Conference on Information and Knowledge 
Management,pp. 617-624, Bremen, Germany, 2005. 
[14] Ting-Chun Peng and Chia-Chun Shih , “An Unsupervised Snippet-based Sentiment Classification 
Method for Chinese Unknown Phrases without using Reference Word Pairs”, 2010 IEEE/WIC/ACM 
International Conference on Web Intelligence and intelligent Agent Technology JOURNAL OF 
COMPUTING, VOLUME 2, ISSUE 8, AUGUST 2010, ISSN 2151-9617 . 
[15] Prabu Palanisamy, Vineet Yadav, Harsha Elchuri, “Serendio: Simple and Practical lexicon based 
43 
approach to Sentiment Analysis”, Serendio Software Pvt Ltd, 2013. 
[16] Subhabrata Mukherjee, Dr. Pushpak Bhattacharyya, Indian Institute of Technology, Bombay, 
Department of Computer Science and Engineering, June 2012. 
[17] Bickel, S.Bruckner, M. Scheffer, “Discriminative learning for differing training and test 
distributions”, International Conference on Machine Learning, 2007. 
[18] Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato and VS Subrahmanian, 
“Sentiment analysis: Adjectives and adverbs are better than adjectives alone” International 
Conference on Weblogs and Social Media, ICWSM, Boulder, CO. 2007. 
[19] Peter Turney and Michael Littman. 2003. Measuring praise and criticism: Inference of semantic 
orientationfrom association. ACM Transactions on Information Systems 21(4):315–346. 
[20] Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll and Manfred Stede. 2011. Lexicon-based 
methods for sentiment analysis. Computational linguistics, volume 37, number2, 267–307, MIT 
Press. 
[21] Albert Bifet and Eibe Frank. 2010. Sentiment knowledge discovery in twitter streaming data, 
DiscoveryScience 1–14, Springer. 
[22] Bo Han and Timothy Baldwin. 2011. Lexical normalisation of short text messages: Makn sens a# 
twitter.Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: 
Human Language Technologies Volume 1 , 368–378. 
[23] Max Kaufmann and Jugal Kalita. 2010. Syntactic normalization of Twitter messages. International 
Conference on Natural Language Processing Kharagpur, India. 
[24] Dmitry Davidov, Oren Tsur and Ari Rappoport. 2010. Enhanced sentiment learning using twitter 
hashtagsand smileys. Proceedings of the 23rd International Conference on Computational Linguistics 
241 –249, Association for Computational Linguistics. 
[25] Michael Wiegand, Alexandra Balahur, Benjamin Roth, Dietrich Klakow, Andr´es Montoyo. 2010. A 
survey on the role of negation in sentiment analysis. Proceedings of the workshop on negation and 
speculation in natural language processing 60–68, Association for Computational Linguistics. 
[26] Minqing Hu, Bing Liu. Mining and Summarizing Customer Reviews, Department of Computer 
Science, University of Illinois at Chicago, Research Track Paper. 
Authors 
Chetan Kaushik1 is a student in the Department of Computer Engineering, YMCA 
University of Science and Technology, Haryana, India. He holds a bachelor’s degree 
in information Technology from Kurukshetra University, kurukshetra and is pursuing 
his master’s degree in computer engineering from the Department of Computer 
Engineering, YMCA University of Science and Technology. His research interests 
includes big data and Social network mining. 
Atul Mishra2 is working as an Associate Professor in the Department of Computer 
Engineering, YMCA University of Science and Technology, Haryana, India. He holds 
a Master’s Degree in Computer Science and Technology, from University of 
Roorkee(now IIT,Roorkee) and PhD in Computer Science and Engg. from MD 
University, Rohtak. He has about 16 years of work experience in the Optical 
Telecommunication Industry specializing in Optical Network Planning and Network 
Management tools development. His research interests include SOA, Network 
Management & Mobile Agents.

More Related Content

What's hot

Analyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedAnalyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedjournalBEEI
 
Sentiment Analysis and Classification of Tweets using Data Mining
Sentiment Analysis and Classification of Tweets using Data MiningSentiment Analysis and Classification of Tweets using Data Mining
Sentiment Analysis and Classification of Tweets using Data MiningIRJET Journal
 
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...IJECEIAES
 
Methods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature StudyMethods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature Studyvivatechijri
 
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve Bayes
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve BayesTwitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve Bayes
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve BayesTELKOMNIKA JOURNAL
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESJournal For Research
 
Framework for opinion as a service on review data of customer using semantics...
Framework for opinion as a service on review data of customer using semantics...Framework for opinion as a service on review data of customer using semantics...
Framework for opinion as a service on review data of customer using semantics...IJECEIAES
 
Supervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmSupervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmIJSRD
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataIRJET Journal
 
Hybrid Classifier for Sentiment Analysis using Effective Pipelining
Hybrid Classifier for Sentiment Analysis using Effective PipeliningHybrid Classifier for Sentiment Analysis using Effective Pipelining
Hybrid Classifier for Sentiment Analysis using Effective PipeliningIRJET Journal
 
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet IJECEIAES
 
Amazon Product Sentiment review
Amazon Product Sentiment reviewAmazon Product Sentiment review
Amazon Product Sentiment reviewLalit Jain
 
NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241Urjit Patel
 
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET- Sentimental Analysis of Product Reviews for E-Commerce WebsitesIRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET- Sentimental Analysis of Product Reviews for E-Commerce WebsitesIRJET Journal
 
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET Journal
 
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...IRJET Journal
 
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...kevig
 

What's hot (19)

Analyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedAnalyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-based
 
Sentiment Analysis and Classification of Tweets using Data Mining
Sentiment Analysis and Classification of Tweets using Data MiningSentiment Analysis and Classification of Tweets using Data Mining
Sentiment Analysis and Classification of Tweets using Data Mining
 
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
 
Methods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature StudyMethods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature Study
 
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve Bayes
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve BayesTwitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve Bayes
Twitter’s Sentiment Analysis on Gsm Services using Multinomial Naïve Bayes
 
Viva
VivaViva
Viva
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
 
Framework for opinion as a service on review data of customer using semantics...
Framework for opinion as a service on review data of customer using semantics...Framework for opinion as a service on review data of customer using semantics...
Framework for opinion as a service on review data of customer using semantics...
 
Supervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithmSupervised Sentiment Classification using DTDP algorithm
Supervised Sentiment Classification using DTDP algorithm
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
 
Hybrid Classifier for Sentiment Analysis using Effective Pipelining
Hybrid Classifier for Sentiment Analysis using Effective PipeliningHybrid Classifier for Sentiment Analysis using Effective Pipelining
Hybrid Classifier for Sentiment Analysis using Effective Pipelining
 
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet
A Framework for Arabic Concept-Level Sentiment Analysis using SenticNet
 
Abstract
AbstractAbstract
Abstract
 
Amazon Product Sentiment review
Amazon Product Sentiment reviewAmazon Product Sentiment review
Amazon Product Sentiment review
 
NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241
 
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET- Sentimental Analysis of Product Reviews for E-Commerce WebsitesIRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
 
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...
 
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
 
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
 

Similar to A scalable, lexicon based technique for sentiment analysis

APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...mathsjournal
 
A Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion MiningA Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion MiningIJSRD
 
The sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionThe sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionEditorIJAERD
 
Analysis Levels And Techniques A Survey
Analysis Levels And Techniques   A SurveyAnalysis Levels And Techniques   A Survey
Analysis Levels And Techniques A SurveyLiz Adams
 
Online Product Reviews Based on Sentiment Analysis
Online Product Reviews Based on Sentiment AnalysisOnline Product Reviews Based on Sentiment Analysis
Online Product Reviews Based on Sentiment AnalysisBRNSSPublicationHubI
 
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Andrew Parish
 
An Approach To Sentiment Analysis
An Approach To Sentiment AnalysisAn Approach To Sentiment Analysis
An Approach To Sentiment AnalysisSarah Morrow
 
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...IJECEIAES
 
A fuzzy logic based on sentiment
A fuzzy logic based on sentimentA fuzzy logic based on sentiment
A fuzzy logic based on sentimentIJDKP
 
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATION
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATIONA SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATION
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATIONijcsa
 
Sentiment Analysis in Social Media and Its Operations
Sentiment Analysis in Social Media and Its OperationsSentiment Analysis in Social Media and Its Operations
Sentiment Analysis in Social Media and Its OperationsIRJET Journal
 
Graph-based Analysis and Opinion Mining in Social Network
Graph-based Analysis and Opinion Mining in Social NetworkGraph-based Analysis and Opinion Mining in Social Network
Graph-based Analysis and Opinion Mining in Social NetworkKhan Mostafa
 
Sentiment analysis on Bangla conversation using machine learning approach
Sentiment analysis on Bangla conversation using machine  learning approachSentiment analysis on Bangla conversation using machine  learning approach
Sentiment analysis on Bangla conversation using machine learning approachIJECEIAES
 
opinion feature extraction using enhanced opinion mining technique and intrin...
opinion feature extraction using enhanced opinion mining technique and intrin...opinion feature extraction using enhanced opinion mining technique and intrin...
opinion feature extraction using enhanced opinion mining technique and intrin...INFOGAIN PUBLICATION
 
A Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsA Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsLisa Graves
 
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring Tool
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring ToolAutomatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring Tool
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring ToolLaurie Smith
 

Similar to A scalable, lexicon based technique for sentiment analysis (19)

APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
 
A Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion MiningA Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion Mining
 
The sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionThe sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regression
 
Analysis Levels And Techniques A Survey
Analysis Levels And Techniques   A SurveyAnalysis Levels And Techniques   A Survey
Analysis Levels And Techniques A Survey
 
Online Product Reviews Based on Sentiment Analysis
Online Product Reviews Based on Sentiment AnalysisOnline Product Reviews Based on Sentiment Analysis
Online Product Reviews Based on Sentiment Analysis
 
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...
 
An Approach To Sentiment Analysis
An Approach To Sentiment AnalysisAn Approach To Sentiment Analysis
An Approach To Sentiment Analysis
 
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...Dialectal Arabic sentiment analysis based on tree-based pipeline  optimizatio...
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...
 
A fuzzy logic based on sentiment
A fuzzy logic based on sentimentA fuzzy logic based on sentiment
A fuzzy logic based on sentiment
 
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATION
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATIONA SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATION
A SURVEY OF MACHINE LEARNING TECHNIQUES FOR SENTIMENT CLASSIFICATION
 
Sentiment Analysis in Social Media and Its Operations
Sentiment Analysis in Social Media and Its OperationsSentiment Analysis in Social Media and Its Operations
Sentiment Analysis in Social Media and Its Operations
 
Anu paper(IJARCCE)
Anu paper(IJARCCE)Anu paper(IJARCCE)
Anu paper(IJARCCE)
 
Graph-based Analysis and Opinion Mining in Social Network
Graph-based Analysis and Opinion Mining in Social NetworkGraph-based Analysis and Opinion Mining in Social Network
Graph-based Analysis and Opinion Mining in Social Network
 
Sentiment analysis on Bangla conversation using machine learning approach
Sentiment analysis on Bangla conversation using machine  learning approachSentiment analysis on Bangla conversation using machine  learning approach
Sentiment analysis on Bangla conversation using machine learning approach
 
opinion feature extraction using enhanced opinion mining technique and intrin...
opinion feature extraction using enhanced opinion mining technique and intrin...opinion feature extraction using enhanced opinion mining technique and intrin...
opinion feature extraction using enhanced opinion mining technique and intrin...
 
A Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsA Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And Applications
 
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring Tool
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring ToolAutomatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring Tool
Automatic Movie Rating By Using Twitter Sentiment Analysis And Monitoring Tool
 
J1803015357
J1803015357J1803015357
J1803015357
 
Ijetcas14 580
Ijetcas14 580Ijetcas14 580
Ijetcas14 580
 

More from ijfcstjournal

A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLES
A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLESA COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLES
A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLESijfcstjournal
 
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...ijfcstjournal
 
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...ijfcstjournal
 
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...ijfcstjournal
 
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...ijfcstjournal
 
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDING
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDINGAN OPTIMIZED HYBRID APPROACH FOR PATH FINDING
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDINGijfcstjournal
 
EAGRO CROP MARKETING FOR FARMING COMMUNITY
EAGRO CROP MARKETING FOR FARMING COMMUNITYEAGRO CROP MARKETING FOR FARMING COMMUNITY
EAGRO CROP MARKETING FOR FARMING COMMUNITYijfcstjournal
 
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHS
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHSEDGE-TENACITY IN CYCLES AND COMPLETE GRAPHS
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHSijfcstjournal
 
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEM
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEMCOMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEM
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEMijfcstjournal
 
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMS
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMSPSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMS
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMSijfcstjournal
 
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...ijfcstjournal
 
A MUTATION TESTING ANALYSIS AND REGRESSION TESTING
A MUTATION TESTING ANALYSIS AND REGRESSION TESTINGA MUTATION TESTING ANALYSIS AND REGRESSION TESTING
A MUTATION TESTING ANALYSIS AND REGRESSION TESTINGijfcstjournal
 
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...ijfcstjournal
 
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCH
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCHA NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCH
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCHijfcstjournal
 
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKS
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKSAGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKS
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKSijfcstjournal
 
International Journal on Foundations of Computer Science & Technology (IJFCST)
International Journal on Foundations of Computer Science & Technology (IJFCST)International Journal on Foundations of Computer Science & Technology (IJFCST)
International Journal on Foundations of Computer Science & Technology (IJFCST)ijfcstjournal
 
AN INTRODUCTION TO DIGITAL CRIMES
AN INTRODUCTION TO DIGITAL CRIMESAN INTRODUCTION TO DIGITAL CRIMES
AN INTRODUCTION TO DIGITAL CRIMESijfcstjournal
 
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...ijfcstjournal
 
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMSA STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMSijfcstjournal
 
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERINGA LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERINGijfcstjournal
 

More from ijfcstjournal (20)

A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLES
A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLESA COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLES
A COMPARATIVE ANALYSIS ON SOFTWARE ARCHITECTURE STYLES
 
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
SYSTEM ANALYSIS AND DESIGN FOR A BUSINESS DEVELOPMENT MANAGEMENT SYSTEM BASED...
 
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
AN ALGORITHM FOR SOLVING LINEAR OPTIMIZATION PROBLEMS SUBJECTED TO THE INTERS...
 
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...
LBRP: A RESILIENT ENERGY HARVESTING NOISE AWARE ROUTING PROTOCOL FOR UNDER WA...
 
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...
STRUCTURAL DYNAMICS AND EVOLUTION OF CAPSULE ENDOSCOPY (PILL CAMERA) TECHNOLO...
 
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDING
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDINGAN OPTIMIZED HYBRID APPROACH FOR PATH FINDING
AN OPTIMIZED HYBRID APPROACH FOR PATH FINDING
 
EAGRO CROP MARKETING FOR FARMING COMMUNITY
EAGRO CROP MARKETING FOR FARMING COMMUNITYEAGRO CROP MARKETING FOR FARMING COMMUNITY
EAGRO CROP MARKETING FOR FARMING COMMUNITY
 
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHS
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHSEDGE-TENACITY IN CYCLES AND COMPLETE GRAPHS
EDGE-TENACITY IN CYCLES AND COMPLETE GRAPHS
 
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEM
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEMCOMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEM
COMPARATIVE STUDY OF DIFFERENT ALGORITHMS TO SOLVE N QUEENS PROBLEM
 
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMS
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMSPSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMS
PSTECEQL: A NOVEL EVENT QUERY LANGUAGE FOR VANET’S UNCERTAIN EVENT STREAMS
 
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
 
A MUTATION TESTING ANALYSIS AND REGRESSION TESTING
A MUTATION TESTING ANALYSIS AND REGRESSION TESTINGA MUTATION TESTING ANALYSIS AND REGRESSION TESTING
A MUTATION TESTING ANALYSIS AND REGRESSION TESTING
 
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...
GREEN WSN- OPTIMIZATION OF ENERGY USE THROUGH REDUCTION IN COMMUNICATION WORK...
 
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCH
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCHA NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCH
A NEW MODEL FOR SOFTWARE COSTESTIMATION USING HARMONY SEARCH
 
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKS
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKSAGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKS
AGENT ENABLED MINING OF DISTRIBUTED PROTEIN DATA BANKS
 
International Journal on Foundations of Computer Science & Technology (IJFCST)
International Journal on Foundations of Computer Science & Technology (IJFCST)International Journal on Foundations of Computer Science & Technology (IJFCST)
International Journal on Foundations of Computer Science & Technology (IJFCST)
 
AN INTRODUCTION TO DIGITAL CRIMES
AN INTRODUCTION TO DIGITAL CRIMESAN INTRODUCTION TO DIGITAL CRIMES
AN INTRODUCTION TO DIGITAL CRIMES
 
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...
DISTRIBUTION OF MAXIMAL CLIQUE SIZE UNDER THE WATTS-STROGATZ MODEL OF EVOLUTI...
 
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMSA STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
A STATISTICAL COMPARATIVE STUDY OF SOME SORTING ALGORITHMS
 
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERINGA LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING
A LOCATION-BASED MOVIE RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING
 

Recently uploaded

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 

A scalable, lexicon based technique for sentiment analysis

  • 1. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 A SCALABLE, LEXICON BASED TECHNIQUE FOR SENTIMENT ANALYSIS Chetan Kaushik and Atul Mishra Computer Engg. Department, YMCA University of Science & Technology, Faridabad ABSTRACT Rapid increase in the volume of sentiment rich social media on the web has resulted in an increased interest among researchers regarding Sentimental Analysis and opinion mining. However, with so much social media available on the web, sentiment analysis is now considered as a big data task. Hence the conventional sentiment analysis approaches fails to efficiently handle the vast amount of sentiment data available now a days. The main focus of the research was to find such a technique that can efficiently perform sentiment analysis on big data sets. A technique that can categorize the text as positive, negative and neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large data set of tweets using Hadoop and the performance of the technique was measured in form of speed and accuracy. The experimental results shows that the technique exhibits very good efficiency in handling big sentiment data sets. KEYWORDS Big data, Hadoop, Lexicon, Machine learning, Negation, NLP, Sentiment Analysis. 1. INTRODUCTION Sentiment Analysis is a Natural Language Processing task that includes obtaining the writer’s feeling about several products or services which are posted on the internet via various posts or comments. In other words Sentiment Analysis helps in determining the response of a user or a group of users on a topic and categorizing their opinion as positive, negative or neutral. In recent years, a huge increase is seen in the amount of opinions expressed on the web by social media users and hence it has captured the interest of more and more researchers. Liu [1] defined a sentiment as a quintuple “<oj, fjk, soijkl, hi, tl >, where oj is a target object, fjk is a feature of the object oj, soijkl is the sentiment value of the opinion of the opinion holder hi on feature fjk of object oj at time tl, soijkl is +ve, -ve, or neutral, or a more granular rating, hi is an opinion holder, tl is the time when the opinion is expressed.” Sentiment Analysis has its applications in several fields such as in marketing for knowing customers’ response towards a new product or service or by social media users to find the general opinion about a topic which is currently trending. It can help manufactures in deciding the strategies for the launch of new products based on the responses towards the previous versions of that product in various geographical areas. It can also be used to detect heated arguments in comments, use of abusive language or spam detection. Sentimental Analysis can be phrase based in which the sentiment of a single phrase is taken in to consideration, sentence based in which the sentiment of the whole sentence is calculated or it can DOI:10.5121/ijfcst.2014.4504 35
  • 2. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 be document based in which an aggregated sentiment of the whole document is categorized as positive, negative or neutral Sentiment Analysis is generally carried out in three steps. First, the subject towards which the sentiment is directed is found then, the polarity of the sentiment is calculated and finally the degree of the polarity is assigned with the help of a sentiment score which denotes the intensity of the sentiment. Over the last few years, the amount of sentiment rich data on the web has increased rapidly with the increase in social media users. Various companies are now turning to online forums, blogs etc. to get the review of their products from customers. The sentiment data hence is now considered as big data which needs techniques that are capable of handling such large amounts of data efficiently. In this research a lexicon based approach is used to perform sentiment analysis. There has been a lot of research on sentiment calculation using lexicon based and other techniques. Some of which are described in the following section. 2. RELATED WORK A lot of work has been done in the field of opinion mining or sentiment analysis for well over a decade now. Different techniques are used to classify the text according to polarity. Most of these techniques can be classified under two categories: 2.1. Machine learning Machine learning strategies work by training our algorithm with a training data set before applying it to the actual data set. That is, a machine learning algorithm needs to be trained first for both supervised and unsupervised learning tasks. Machine learning techniques first trains the algorithm with some particular inputs with known outputs so that later it can work with new unknown data. Some of the most renowned works based on machine learning are as follows: Melville [2], Rui [3], Ziqiong [4], Songho [5], Qiang [6] and Smeureanu [7] used naïve bayes for classification of text. Naïve bayes is one of the most popular method in text classification. It is considered as one of the most simple and efficient approaches in NLP. It works by calculating the probability of an element being in a category. First the prior probability is calculated which is afterwards multiplied with the likelihood to calculate the final probability. The method assumes every word in the sentence to be independent which makes it easier to implement but less accurate. That’s why the method is given the name ‘naïve’. Rui [3], Ziqiong [4], Songho [5], and Rudy [8] used SVM (Support Vector Machines) in their approach. The sentiment classification was done using discriminative classifier. This approach is based on structural risk minimization in which support vectors are used to classify the training data sets into two different classes based on a predefined criteria. Multiclass SVM can also be used for text classification [9]. Songho [5] used centroid classifier which assign a centroid vector to different training classes and then uses this vector to calculate the similarity values of each element with a class. If the similarity value exceeds a defined threshold then the element is assigned to that class (polarity in this case). 36
  • 3. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 Songho [5] also used K-Nearest Neighbor (KNN) approach which finds ‘K’ nearest neighbours of a text document among the documents in the training data set. Then classification is performed on the basis of similarity score of a class with respect to a neighbour. Some approaches use prediction method. Winnow first predicts a class for an entity and then feedback is used to check for mistakes. The weight vectors are changed according to the feedback and the process is repeated over a large training data set. A combined approach was proposed based on included rule classification, machine learning and supervised learning [8]. Each training set was subjected to a 10 fold cross validation process. Several classifiers were used in series. Whenever a classifier failed, the text was passed on to the next classifier until all the texts are classified or there are no remaining classifiers Rui [3] used Ensemble technique which combines the output of several classification methods into a single integrated output. Another approach based on artificial neural networks was used to divide the document into positive, negative and fuzzy tone [10]. The approach was based on recursive least squares back propagation training algorithm. Long-Sheng [11] used a neural network based approach to combine the advantages of machine learning and information retrieval techniques. 2.2. Lexicon Based Lexicon Based techniques work on an assumption that the collective polarity of a document or sentence is the sum of polarities of the individual words or phrases. Some of the significant works done using this technique are: Kamps [12] used a simple technique based on lexical relations to perform classification of text. Andrea [13] used word net to classify the text using an assumption that words with similar polarity have similar orientation. Ting-Chun [14] used an algorithm based on pos (part of speech) patter. A text phrase was used as a query for a search engine and the results were used to classify the text. Prabhu [15] which used a simple lexicon based technique to extract sentiments from twitter data Turney [19] used semantic orientation on user reviews to identify the underlying sentiments. Taboada [20] used lexicon based approach to extract sentiments from microblogs. Sentiment analysis for microblogs is more challenging because of problems like use of short length status message, informal words, word shortening, spelling variation and emoticons. Twitter data was used for sentiment analysis by [21]. Negation word can reverse the polarity of any sentence. Taboada [20] performed sentiment analysis while handling negation and intensifying words. Role of negation was surveyed by [25]. Minquing [26] classified the text using a simple lexicon based approach with feature detection. It was observed that most of these existing techniques doesn’t scale to big data sets efficiently. While various machine learning methodologies exhibits better accuracy than lexicon based techniques, they take more time in training the algorithm and hence are not suitable for big data sets. In this paper, lexicon based approach is used to classify the text according to polarity. 37
  • 4. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 3. PROPOSED APPROACH As explained earlier the focus of this research was to device an approach that can perform sentiment analysis quicker because vast amount of data needed to be analyzed. Also, it had to be made sure that accuracy is not compromised too much while focusing on speed. The proposed approach is a lexicon based technique i.e. a dictionary of sentiment bearing words along with their polarities was used to classify the text into positive, negative or neutral opinion. Machine learning techniques are not used because although they are more accurate than the lexicon based approaches, they take far too much time performing sentiment analysis as they have to be trained first and hence are not efficient in handling big sentiment data. The main component of this approach is the sentiment lexicon or dictionary. 3.1. Sentiment Lexicon Unlike the small sized dictionaries which are used in other approaches, which are build up from a small set of words using a training data set, the dictionary used in this technique contains a large set of sentiment bearing words along with their polarity. The dictionary is domain specific i.e. the polarities of the words in the dictionary are set according to a specific domain e.g. book reviews, political blogs etc. Same word in different domains can have different meanings, the dictionary used in this approach is made for movie review domain. The dictionary contains all forms of a word i.e. every word is stored along with its various verb forms e.g. applause, applauding, applauded, applauds. Hence eliminating the need for stemming each word which saves more time. Emoticons are generally used by people to depict emotions. Hence it is obvious that they contain very useful sentiment information in them. The dictionary used in the approach contains more than 30 different emoticons along with their polarities. The Dictionary also contains the strength of the polarity of every word. Some word depicts stronger emotions than others. For example good and great are both positive words but great depicts a much stronger emotion. Negation and blind negation are very important in identifying the sentiments, as their presence can reverse the polarity of the sentence. The dictionary used here also contains various negation and blind negation words so that they can be identified in the sentence. The dictionary contains several different adjectives, nouns, negation words, emoticons etc. Some of these words are shown as an example in table 1. 38 Table 1. Some examples of the words and emoticons in the sentiment dictionary Strength Word POS Stemmed Polarity weaksubj abandoned adj n negative weaksubj abandonment noun n negative weaksubj abandon verb y negative strongsubj needed verb n blindnegation
  • 5. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 39 strongsubj require verb n blindnegation strongsubj not advb n negation strongsubj neither conj n negation strongsubj nor conj n negation strongsubj :) emoti n positive strongsubj :( emoti n negative 3.2. Feature detection using Twitter hashtags A hashtag is a word or an unspaced phrase prefixed with the number sign ("#"). It is a form of metadata tag. Words in messages on microblogging and social networking services such as Twitter, Facebook, Google+ or Instagram may be tagged by putting "#" before them, either as they appear in a sentence or they are appended to it. In this research, twitter data is categorized according to polarity. Detecting the subject towards which the sentiment is directed is a tedious task to perform, but as twitter is used as data source hashtags can be used to easily identify the subject hence eliminating the need for using a complex mechanism for feature detection thereby saving time and effort. 3.3. Handling negation and blind negation Negation words are the words which reverse the polarity of the sentiment involved in the text. For example ‘the movie was not good’. Although the word ‘good’ depicts a positive sentiment the negation – ‘not’ reverses its polarity. In the proposed approach whenever a negation word is encountered in a tweet, its polarity is reversed. Blind negation words are the words which operates on the sentence level and points out a feature that is desired in a product or service. For example in the sentence ‘the acting needed to be better’, ‘better’ depicts a positive sentiment but the presence of the blind negation word- ‘needed’ suggests that this sentence is actually depicting negative sentiment. In the proposed approach whenever a blind negation word occurs in a sentence its polarity is immediately labelled as negative. 3.4. Sentiment Calculation Sentiment calculation is done for every tweet and a polarity score is given to it. If the score is greater than 0 then it is considered to be positive sentiment on behalf of the user, if less than 0 then negative else neutral. The polarity score is calculated by using algorithm 1. Algorithm 1: ALGO_SENTICAL Input: Tweets, SentiWord_Dictionary Output: Sentiment (positive, negative or neutral) 1) BEGIN 2) For each tweet Ti 3) { 4) SentiScore = 0; 5) For each word Wj in Ti that exists in Sentiword_Dictionary 6) { 7) If polarity[Wj] = blindnegation
  • 6. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 8) { 9) Return negative; 10) } 11) Else 12) { 13) If polarity[Wj] = positive && strength[Wj] = Strongsubj 14) { 15) SentiScore = SentiScore + 1; 16) } 17) Else If polarity[Wj] = positive && strength[Wj] = Weaksubj 18) { 19) SentiScore = SentiScore + 0.5; 20) } 21) Else If polarity[Wj] = negative && strength[Wj] = Strongsubj 22) { 23) SentiScore = SentiScore – 1; 24) } 25) Else If polarity[Wj] = negative && strength[Wj] = Weaksubj 26) { 27) SentiScore = SentiScore – 0.5; 28) } 29) } 30) If polarity[Wj] = negation 31) { 32) Sentiscore = Sentiscore * -1 33) } 34) } 35) If Sentiscore of Ti >0 36) { 37) Sentiment = positive 38) } 39) Else If Sentiscore of TI<0 40) { 41) Sentiment = negative 42) } 43) Else 44) { 45) Sentiment = neutral 46) } 47) Return Sentiment 48) } 49) END 4. PERFORMANCE EVALUATION In this section, description of the tools which were used to perform sentiment analysis is given along with the performance of the approach and its comparison with existing approaches. 4.1. Experimental Setup The proposed algorithm was implemented using a sandboxed version of Hadoop. Hadoop is designed to work in a huge multimode environment consisting of thousands of servers situated at 40
  • 7. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 different locations. But for research purposes often a single node virtual environment is used that creates an illusion of several nodes which are situated at different locations and are working together. An Intel Core i5-3210M CPU@2.50GHz processor with 6 GB memory was used to simulate the Hadoop Environment. Data was imported from Twitter using Flume, a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming data into the Hadoop Distributed File System (HDFS). Then the data is refined and the algorithm described in the previous section is implemented upon it with the help of HiveQL. HiveQL is an SQL like language which is used in hadoop for interacting and manipulating huge databases. 4.2. Results and Evaluation As explained earlier the purpose of this research was to device a method that can quickly compute the sentiments of huge data sets without compromising too much with accuracy. The proposed approach has performed very well in terms of speed. It took 14.8 Seconds to analyse the sentiments behind 6,74,412 tweets. Table 1 shows the performance of the algorithm. Moreover the speed can be substantially improved by using an actual multimode hadoop environment instead of a single node sandboxed configuration. The algorithm also performed very well in terms of accuracy. Tweets were analyzed with an accuracy of 73.5 %. As expected it is not as high as the machine learning methodologies but it shows a fairly good accuracy when compared with other lexicon based approaches. Figure 1 shows a comparison between average accuracies of machine learning and lexicon based techniques discussed in section II and the accuracy of the proposed approach. 41 Table 2 Performance of the proposed approach Number of Tweets Time Taken Accuracy 6,74,412 14.8 seconds 73.5 % Figure 1. Comparison of the proposed approach with other methodologies
  • 8. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 5. CONCLUSION AND FUTURE WORK Sentiment analysis is being used for different applications and can be used for several others in future. It is evident that its applications will definitely expand to more areas and will continue to encourage more and more researches in the field. We have done an overview of some state-of-the- art solutions applied to sentiment classification and provided a new approach that scales to big data sets efficiently. A scalable and practical lexicon based approach for extracting sentiments using emoticons and hashtags is introduced. Hadoop was used to classify Twitter data without need for any kind of training. Our approach performed extremely well in terms of both speed and accuracy while showing signs that it can be further scaled to much bigger data sets with similar, in fact better performance. In this research, main focus was on performing sentiment analysis quickly so that big data sets can be handled efficiently. The work can be further expanded by introducing techniques that increase the accuracy by tackling problems like thwarted expressions and implicit sentiments which still needs to be resolved properly. Also as explained earlier, this work was implemented on a single node sandboxed configuration and although it is expected that it will perform much better in a multimode enterprise level configuration, it is desirable to check its performance in such environment in future. REFERENCES [1] B. Liu. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, Second 42 Edition, (editors: N. Indurkhya and F. J. Damerau), 2010. [2] Melville, Wojciech Gryc, “Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification”, KDD‟09, June 28–July 1, 2009, Paris, France.Copyright 2009 ACM 978-1-60558- 495-9/09/06. [3] Rui Xia, Chengqing Zong, Shoushan Li, “Ensemble of feature sets and classification algorithms for sentiment classification”, Information Sciences 181 (2011) 1138–1152. [4] Ziqiong Zhang, Qiang Ye, Zili Zhang, Yijun Li, “Sentiment classification of Internet restaurant reviews written in Cantonese”, Expert Systems with Applications xxx (2011) xxx–xxx. [5] Songbo Tan, Jin Zhang, “An empirical study of sentiment analysis for chinese documents”, Expert Systems with Applications 34 (2008) 2622–2629. [6] Qiang Ye, Ziqiong Zhang, Rob Law, “Sentiment classification of online reviews to travel destinations by supervised machine learning approaches”, Expert Systems with Applications 36 (2009) 6527– 6535. [7] Ion SMEUREANU, Cristian BUCUR, “Applying Supervised Opinion Mining Techniques on Online User Reviews”, Informatica Economică vol. 16, no. 2/2012. [8] Rudy Prabowo, Mike Thelwall, “Sentiment analysis: A combined approach.” Journal of Informetrics 3 (2009) 143–157. [9] Kaiquan Xu , Stephen Shaoyi Liao , Jiexun Li, Yuxia Song, “Mining comparative opinions from customer reviews for Competitive Intelligence”, Decision Support Systems 50 (2011) 743–754. [10] ZHU Jian , XU Chen, WANG Han-shi, “" Sentiment classification using the theory of ANNs”, The Journal of China Universities of Posts and Telecommunications, July 2010, 17(Suppl.): 58–62 .[16] Ziqiong Zhang, Qiang Ye, Zili Zhang, Yijun Li, “Sentiment classification of Internet restaurant reviews written in Cantonese”, Expert Systems with Applications xxx (2011) [11] Long-Sheng Chen, Cheng-Hsiang Liu, Hui-Ju Chiu, “A neural network based approach for sentiment classification in the blogosphere”, Journal of Informetrics 5 (2011) 313–322.
  • 9. International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 [12] Kamps, Maarten Marx, Robert J. Mokken and Maarten De Rijke, “Using wordnet to measure semantic orientation of adjectives”, Proceedings of 4th International Conference on Language Resources and Evaluation, pp. 1115-1118, Lisbon, Portugal, 2004. [13] Andrea Esuli and Fabrizio Sebastiani, “Determining the semantic orientation of terms through gloss classification”, Proceedings of 14th ACM International Conference on Information and Knowledge Management,pp. 617-624, Bremen, Germany, 2005. [14] Ting-Chun Peng and Chia-Chun Shih , “An Unsupervised Snippet-based Sentiment Classification Method for Chinese Unknown Phrases without using Reference Word Pairs”, 2010 IEEE/WIC/ACM International Conference on Web Intelligence and intelligent Agent Technology JOURNAL OF COMPUTING, VOLUME 2, ISSUE 8, AUGUST 2010, ISSN 2151-9617 . [15] Prabu Palanisamy, Vineet Yadav, Harsha Elchuri, “Serendio: Simple and Practical lexicon based 43 approach to Sentiment Analysis”, Serendio Software Pvt Ltd, 2013. [16] Subhabrata Mukherjee, Dr. Pushpak Bhattacharyya, Indian Institute of Technology, Bombay, Department of Computer Science and Engineering, June 2012. [17] Bickel, S.Bruckner, M. Scheffer, “Discriminative learning for differing training and test distributions”, International Conference on Machine Learning, 2007. [18] Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato and VS Subrahmanian, “Sentiment analysis: Adjectives and adverbs are better than adjectives alone” International Conference on Weblogs and Social Media, ICWSM, Boulder, CO. 2007. [19] Peter Turney and Michael Littman. 2003. Measuring praise and criticism: Inference of semantic orientationfrom association. ACM Transactions on Information Systems 21(4):315–346. [20] Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll and Manfred Stede. 2011. Lexicon-based methods for sentiment analysis. Computational linguistics, volume 37, number2, 267–307, MIT Press. [21] Albert Bifet and Eibe Frank. 2010. Sentiment knowledge discovery in twitter streaming data, DiscoveryScience 1–14, Springer. [22] Bo Han and Timothy Baldwin. 2011. Lexical normalisation of short text messages: Makn sens a# twitter.Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies Volume 1 , 368–378. [23] Max Kaufmann and Jugal Kalita. 2010. Syntactic normalization of Twitter messages. International Conference on Natural Language Processing Kharagpur, India. [24] Dmitry Davidov, Oren Tsur and Ari Rappoport. 2010. Enhanced sentiment learning using twitter hashtagsand smileys. Proceedings of the 23rd International Conference on Computational Linguistics 241 –249, Association for Computational Linguistics. [25] Michael Wiegand, Alexandra Balahur, Benjamin Roth, Dietrich Klakow, Andr´es Montoyo. 2010. A survey on the role of negation in sentiment analysis. Proceedings of the workshop on negation and speculation in natural language processing 60–68, Association for Computational Linguistics. [26] Minqing Hu, Bing Liu. Mining and Summarizing Customer Reviews, Department of Computer Science, University of Illinois at Chicago, Research Track Paper. Authors Chetan Kaushik1 is a student in the Department of Computer Engineering, YMCA University of Science and Technology, Haryana, India. He holds a bachelor’s degree in information Technology from Kurukshetra University, kurukshetra and is pursuing his master’s degree in computer engineering from the Department of Computer Engineering, YMCA University of Science and Technology. His research interests includes big data and Social network mining. Atul Mishra2 is working as an Associate Professor in the Department of Computer Engineering, YMCA University of Science and Technology, Haryana, India. He holds a Master’s Degree in Computer Science and Technology, from University of Roorkee(now IIT,Roorkee) and PhD in Computer Science and Engg. from MD University, Rohtak. He has about 16 years of work experience in the Optical Telecommunication Industry specializing in Optical Network Planning and Network Management tools development. His research interests include SOA, Network Management & Mobile Agents.