A Survey on Sentiment Mining Techniques


A Survey of Sentiment Mining Techniques

Khan Mostafa
Graduate Student, Computer Science, Stony Brook University, NY 11794, USA
Email: khan.mostafa@stonybrook.edu
Student ID# 109365509

ABSTRACT

A survey of publications addressing challenges in and techniques of sentiment mining.

1 INTRODUCTION

Text conveys subjective and objective information, as well as the sentiments associated with it. Identifying the sentiment of a text is an intuitive task for a human. However, identifying collective as well as individual sentiments across a large collection of textual data can be an enormous task. It requires data mining and classification techniques that automatically associate sentiments with textual data. Sentiment mining can be used to identify how people feel about a product, a topic or, more generally, an entity, which is useful to manufacturers from a business point of view. In recent years there has been much academic research in sentiment analysis, as well as practical commercial application.

Generally, a sentiment is negative or positive. Nevertheless, not every text conveys sentiment; some texts are merely objective statements. Thus, while mining sentiment, an application needs to classify texts as positive, negative or neutral. Sentiment analysis has been studied from the perspectives of data mining, machine learning, natural language processing and statistical analysis.

In this article, I address several aspects of sentiment mining. I survey several papers, starting with a text that familiarizes readers with the basic ideas of automatic sentiment analysis. Then I briefly address a well-cited paper that instigated much of the research treating sentiment mining as a specialized classification task. The next article discusses utilizing microblogging sites like Twitter for sentiment analysis and opinion mining. The following papers discuss different approaches to sentiment classification.
One focuses specifically on mining large real-time streaming data, and the last paper gives hints on the case of ironic speech. Each of the surveyed papers addresses a slightly different aspect of sentiment mining, and together they subtly cover the overall problem domain.

2 SENTIMENT ANALYSIS AND OPINION MINING

2.1 Automatic Sentiment Analysis in On-line Text

I open my survey by addressing a relatively old, but not ancient, text (Boiy, et al. 2007) about sentiment analysis which introduces readers to the basic concepts, methodology, techniques and challenges of the topic. The authors first objectify sentiment by introducing the concept of emotions. Emotions can occur in text as appraisal, direct expressions, elements of action and remarks.

Survey paper submitted for CSE590 Networks and Data Mining Techniques on Sep 26, 2013
They then introduce readers to methodologies for identifying the emotion (and thus the sentiment) of a text, exploring both symbolic techniques and machine learning techniques. To employ machine learning techniques, we first need to select features. Candidate features commonly drawn upon include parts of speech (POS), unigrams, n-grams, lemmas, negations, opinion words and adjectives. The authors then mention support vector machines, multinomial naïve Bayes and maximum entropy as three example supervised methods.

The authors also focus on several challenges. One challenge is that a text often expresses sentiment about several different topics, some negative and some positive; therefore, it can be useful to investigate the topic-sentiment relation. Again, many texts are not subjective but merely neutral, objective statements, so before estimating sentiment polarity it is useful to identify whether a text really bears sentiment at all. A similar challenge is cross-domain classification. Another important issue is text quality: especially when gathered from the web, text is intertwined with a fair amount of junk, which requires a decent amount of filtration.

2.2 Thumbs up? Sentiment Classification using Machine Learning Techniques

Pang, et al. (Pang, Lee and Vaithyanathan 2002) investigated the field of sentiment classification at an early stage and posed several challenges. They aimed to "examine whether it suffices to treat sentiment classification simply as a special case of topic-based categorization or whether special sentiment-categorization methods need to be developed." They employed three machine learning techniques that perform well in topic categorization, namely (a) naïve Bayes, (b) maximum entropy classification and (c) support vector machines, only to find that they do not perform satisfactorily in sentiment classification.
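As a concrete illustration of one of these techniques, the following is a minimal multinomial naïve Bayes sentiment classifier in pure Python. The toy corpus is made up for illustration only; Pang et al. worked with a movie-review corpus and compared all three learners under richer feature sets.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Train a multinomial naive Bayes model.
    docs: list of (list_of_tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify_nb(model, tokens):
    """Pick the label maximizing log prior + smoothed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            # Laplace (add-one) smoothing for unseen words
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny illustrative corpus (not the movie-review data Pang et al. used)
train = [
    ("a thrilling and moving film".split(), "pos"),
    ("brilliant acting and a great plot".split(), "pos"),
    ("dull plot and terrible acting".split(), "neg"),
    ("a boring and predictable film".split(), "neg"),
]
model = train_nb(train)
print(classify_nb(model, "great and thrilling plot".split()))  # pos
print(classify_nb(model, "terrible boring film".split()))      # neg
```

The same bag-of-words representation feeds the other two learners; what Pang et al. found is that no choice among the three closes the gap to topic classification.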
Thus, they ended with an open question for researchers to investigate further.

2.3 Twitter as a Corpus for Sentiment Analysis and Opinion Mining

A. Pak and P. Paroubek (Pak and Paroubek 2010) study how a microblogging platform can be used for sentiment analysis. They mined Twitter to automatically collect a corpus of negative- and positive-sentiment (subjective) as well as objective (neutral) posts. They cleverly exploited emoticons to associate sentiments with tweets; a similar approach was exemplified by J. Read (Read 2005). They queried Twitter for two types of emoticons:

  Happy emoticons: ":-)", ":)", "=)", ":D" etc.
  Sad emoticons: ":-(", ":(", "=(", ";(" etc.

In addition, they collected objective/neutral posts by retrieving posts from newspapers and magazines. Pak, et al. analyzed the collected corpus by first tagging its posts using TreeTagger (Schmid 1994) and then performing a pairwise comparison of the tag distributions over the two sets. Comparing the subjective set against the objective set, they observed that POS tags are not evenly distributed and postulated that this feature can be used to classify objective and subjective posts. A similar observation held for positive- versus negative-sentiment posts.

For training a sentiment classifier, they used the presence of n-grams as binary features. They claimed that high-order n-grams perform better at capturing sentiment, while unigrams have good coverage of the data. While constructing n-grams, they attached negations to adjacent terms. They then used a naïve Bayes classifier, which they claimed performs better than an SVM or a CRF (Lafferty, McCallum and Fernando 2001). They trained two Bayes classifiers: (a) n-gram based and (b) POS based. To attain a final result, they estimate sentiment using both classifiers and calculate the log likelihood of each sentiment. To increase accuracy, they suggested discarding common n-grams. For this, they only used
n-grams with low Shannon entropy values. They evaluated their system on hand-annotated real Twitter posts.

The methodology presented here is an ideal one for this particular case. In particular, the automatic training of the classifier is a clever corpus-building idea. Besides, the combination of n-gram-based and POS-based classification significantly addresses the challenge of the topic-sentiment relation. However, this methodology does not address how to handle streaming data that changes over time.

2.4 Using Appraisal Taxonomies for Sentiment Analysis

In their paper on sentiment analysis, Whitelaw, et al. (Whitelaw, Garg and Argamon 2005) suggest using appraisal taxonomies for sentiment classification. They argued that sentiment analysis approaches should go beyond (a) bags of words and (b) mood-classified words. They identified the need for semantic analysis of attitude expressions and also hypothesized that the atomic units of sentiment expression are not individual words but rather appraisal groups. They adopted four main types of attributes for appraisal groups, taken from Martin and White's Appraisal Theory (Martin and White 2005): Attitude, Orientation, Graduation and Polarity.

They discussed a semi-automated technique to construct a lexicon of appraisal groups. To do so, they used terms from (Martin and White 2005) as seed terms and generated candidate expansions using WordNet and two other thesauri. They used a coarse ranking of relevance to enlist such terms; however, they manually inspected each ranked list to produce the final set of terms. They then tested several feature sets, e.g. Words by Attitude, Systems by Attitude, Appraisal Group by Attitude & Orientation, etc. They evaluated the effectiveness of these feature sets for classifying IMDb movie reviews and found that the union of bag-of-words and appraisal group by attitude & orientation (BoW+G:AO) yields the best result.
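To make the idea concrete, the winning BoW+G:AO feature union might be sketched as below. The miniature appraisal lexicon and the polarity-reverser handling are hypothetical stand-ins for the authors' WordNet-expanded lexicon and grammar, not their actual resources.

```python
from collections import Counter

# Hypothetical miniature appraisal lexicon: head word -> (attitude, orientation).
# The real lexicon was seeded from Martin and White's taxonomy and
# semi-automatically expanded via WordNet and other thesauri.
APPRAISAL = {
    "brilliant": ("appreciation", "positive"),
    "dull": ("appreciation", "negative"),
    "honest": ("judgment", "positive"),
    "happy": ("affect", "positive"),
}
REVERSERS = {"not", "never", "hardly"}  # simple polarity reversers

def bow_gao_features(tokens):
    """Union of bag-of-words features and appraisal-group
    Attitude & Orientation features (the BoW+G:AO combination)."""
    feats = Counter(f"w={t}" for t in tokens)  # plain bag of words
    for i, t in enumerate(tokens):
        if t in APPRAISAL:
            attitude, orientation = APPRAISAL[t]
            # a preceding reverser flips the orientation of the group
            if i > 0 and tokens[i - 1] in REVERSERS:
                orientation = "negative" if orientation == "positive" else "positive"
            feats[f"g={attitude}:{orientation}"] += 1
    return feats

feats = bow_gao_features("not brilliant but honest".split())
print(feats["g=appreciation:negative"])  # "not brilliant" flips to negative
print(feats["g=judgment:positive"])      # "honest"
```

Any standard classifier can then consume these feature counts; the point is that the group features abstract over individual word choice.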
The approach demonstrated in this paper has several drawbacks in terms of scalability, especially since building the lexicon involves much manual effort and the objective function for classification tends to be computation-intensive. However, their work drew researchers' attention to an important notion: sentiment analysis should concentrate more on key terms than on the whole corpus. A similar observation was made by (Benamara, et al. 2007) and (Subrahmanian and Reforgiato 2008), who state that "Adjectives and Adverbs are better than Adjectives Alone". The essence of the outcome of (Whitelaw, Garg and Argamon 2005)'s work is also analogous to what (Pak and Paroubek 2010) exploit by classifying sentiments based on both POS tags and word groups (n-grams).

2.5 Joint Sentiment/Topic Model for Sentiment Analysis

Lin and He (2009) addressed sentiment analysis from a slightly different perspective by combining it with topic modeling. They proposed an extension of the topic model Latent Dirichlet Allocation (LDA) that adds a sentiment layer to it. Their model, described as the Joint Sentiment/Topic (JST) model, is fully unsupervised and detects sentiment and topic simultaneously at the document level. They describe it thus: "The existing framework of LDA has three hierarchical layers, where topics are associated with documents, and words are associated with topics. In order to model document sentiments, we propose a joint sentiment/topic (JST) model by adding an additional sentiment layer between the document and the topic layer. Hence, JST is effectively a four-layer model, where sentiment labels are associated with documents, under which topics are associated with sentiment labels and words are associated with both sentiment labels and topics." They observed that the sentiment-document distribution plays an important role in determining the polarity of a document.
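JST's four-layer generative story (document, then sentiment label, then topic, then word) can be sketched as a forward simulation. All distributions below are hand-set illustrative stand-ins; in the actual model they are drawn from Dirichlet priors and learned from data.

```python
import random

random.seed(0)

SENTIMENTS = ["pos", "neg"]
TOPICS = [0, 1]
# per-document sentiment distribution (pi_d) -- hypothetical values
pi_d = {"pos": 0.7, "neg": 0.3}
# per-(document, sentiment) topic distributions (theta_{d,l})
theta = {"pos": [0.8, 0.2], "neg": [0.4, 0.6]}
# per-(sentiment, topic) word distributions (phi_{l,z})
phi = {
    ("pos", 0): {"great": 0.6, "battery": 0.4},
    ("pos", 1): {"sharp": 0.5, "screen": 0.5},
    ("neg", 0): {"weak": 0.5, "battery": 0.5},
    ("neg", 1): {"dim": 0.5, "screen": 0.5},
}

def draw(dist):
    """Sample a key from a {key: probability} dict."""
    r, acc = random.random(), 0.0
    for k, p in dist.items():
        acc += p
        if r < acc:
            return k
    return k  # guard against floating-point rounding

def generate_word():
    l = draw(pi_d)                         # sentiment label for this word
    z = draw(dict(zip(TOPICS, theta[l])))  # topic conditioned on sentiment
    w = draw(phi[(l, z)])                  # word conditioned on (sentiment, topic)
    return l, z, w

doc = [generate_word() for _ in range(5)]
print(doc)
```

Inference runs this story in reverse: given only the words, the model recovers the hidden sentiment labels and topics, which is why JST needs no document-level supervision.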
They also examined an alternative model, called Tying-JST, which uses a single topic-document distribution as opposed to JST's individual distribution per document; however, Tying-JST performs consistently worse than JST. JST incorporates prior information into the model to enhance accuracy. The authors examined four model priors: (a) a paradigm word list, (b) mutual information, (c) a full subjectivity lexicon and (d) a filtered subjectivity lexicon. They evaluated the accuracy obtained with the different priors, demonstrating significant improvement when a prior is incorporated compared to the results obtained without one; the filtered subjectivity lexicon proved best among the studied options.

JST is stipulated to be a novel text mining approach for sentiment analysis and topic extraction. By simultaneously identifying topic, this model addresses the problem of the domain dependence of subjectivity (i.e., a single word can have a negative connotation in one domain whereas the same word might be positive in another). However, the complexity of this approach can pose a major challenge for large-scale commercial implementation. The method also considers document-level sentiment, while many applications are interested in much more granular sentiment, especially sentiment towards entities.

2.6 Sentiment Knowledge Discovery in Twitter Streaming Data

Yet another perspective on sentiment analysis is investigated by (Bifet and Frank 2010), addressing the challenges of mining streaming "data whose nature or distribution changes over time". It specifically addresses the Twitter data stream, where data arrives at high speed and prediction algorithms are required to perform in real time. The paper addresses specifics of the Twitter API and other implementation details, which I keep aside from this survey discussion.
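The real-time requirement, learning from one post at a time in bounded memory, can be sketched with a stochastic gradient descent learner over hashed unigram features. This is an illustrative sketch, not the authors' implementation; the stream data and learning rate are made up.

```python
import math
from zlib import crc32

DIM = 2 ** 18  # hashed feature space keeps memory constant on an unbounded stream

def features(text):
    """Hash unigram presence features into a fixed-size index set."""
    return {crc32(tok.encode()) % DIM for tok in text.lower().split()}

class StreamingSGD:
    """Logistic regression trained by stochastic gradient descent,
    updating on one labelled post at a time."""
    def __init__(self, lr=0.5):
        self.w = [0.0] * DIM
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, idxs):
        z = self.b + sum(self.w[i] for i in idxs)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, idxs, y):  # y: 1 = positive, 0 = negative
        err = y - self.predict_proba(idxs)  # gradient of the log loss
        for i in idxs:
            self.w[i] += self.lr * err
        self.b += self.lr * err

model = StreamingSGD()
stream = [("happy with the new phone :)", 1),   # emoticon gives the label,
          ("this update is awful :(", 0),       # as in (Pak and Paroubek 2010)
          ("love the camera :)", 1),
          ("awful battery life :(", 0)]
for text, label in stream:  # one pass, one update per arriving post
    model.update(features(text), label)

print(model.predict_proba(features("awful phone")))  # below 0.5: negative
```

Each update touches only the features of the current post, so cost per item is constant, which is exactly what a high-speed stream demands.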
Regarding sentiment analysis, they note the challenges posed by the succinctness of tweets and the possibility of sarcasm and irony. They also leverage the fact that many tweets are annotated by their authors with emoticons, the same idea utilized by (Pak and Paroubek 2010), and use such tweets as training data for a sentiment classifier. Before training, however, they filter the tweets by (a) replacing mentions with the tag USER and hyperlinks with the tag URL, and (b) removing emoticons.

The authors argue that the frequently used measure, "prequential accuracy is not well-suited for data streams with unbalanced data, and that a prequential estimate of Kappa should be used instead." The reason they identify is that the classes are not balanced, can vary over time, and often one class is much more frequent than the other. Hence, a more appropriate measure is one that normalizes a classifier's accuracy against a chance predictor, such as the Kappa statistic (Cohen 1960). They build on a suggestion by (Gama, Sebastião and Rodrigues 2009), which proposed forgetting old estimations either by (a) sliding a window over the most recent observations or (b) weighing observations with fading factors. The authors indicate that the outputs of both approaches are almost identical and thus suggest using a sliding window with the Kappa statistic.

The authors then experimented with three fast incremental methods for mining this data stream: (a) multinomial naïve Bayes, (b) stochastic gradient descent (SGD) and (c) Hoeffding trees. On the basis of their demonstration, they suggest using SGD. This work successfully addresses the problem of streaming data, and its solution can be viewed as an ideal one.

2.7 The case of irony

The last paper I investigate is a much more recent one by Bosco, et al. (Bosco, Patti and Bolioli 2013), a portion of which addresses the case of irony. In our perspective, irony can be identified as
a polarity reverser. That being said, the question arises of how to identify irony (and other figures of speech). The authors suggest that contextual knowledge is important for identifying irony: in Facebook comment threads, diagonal comments can be marked as ironic, but in context-less circumstances (e.g. Twitter) world knowledge is required. Moreover, the interpretation of ironic speech can be subjective. Hence, the authors find it necessary to develop manually annotated corpora for irony detection and pose this as an open question to investigate.

3 CONCLUSION

In this paper, I have tried to present the core ideas behind the surveyed texts. These texts all relate to the problem domain of sentiment mining and sentiment analysis and its challenges. Each addresses a different aspect of this vast problem domain and provides insight into how to build a complete solution for mining large text collections and extracting sentiment from them. This survey defines what sentiment is and discusses how to classify it using data mining and machine learning techniques to extract opinion from large corpora. It also discusses several approaches to the challenges of domain dependence, ironic speech, streaming data and so forth, along with insights for identifying opinions about entities and tracing sentiment transitions over time.

4 REFERENCES

Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato, and VS Subrahmanian. 2007. "Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone." International Conference on Weblogs and Social Media. Boulder, CO, USA: ICWSM.

Bifet, Albert, and Eibe Frank. 2010. "Sentiment knowledge discovery in Twitter streaming data." In Discovery Science, 1-15. Berlin Heidelberg: Springer.

Boiy, Erik, Pieter Hens, Koen Deschacht, and Marie-Francine Moens. 2007. "Automatic Sentiment Analysis in On-line Text." Proceedings of the Conference on Electronic Publishing. Vienna, Austria: ELPUB. 349-360.
Bosco, Cristina, Viviana Patti, and Andrea Bolioli. 2013. "Developing Corpora for Sentiment Analysis: The Case of Irony and Senti-TUT." IEEE Intelligent Systems (IEEE Computer Society) 55-63.

Cohen, Jacob. 1960. "A coefficient of agreement for nominal scales." Educational and Psychological Measurement 37-46.

Gama, João, Raquel Sebastião, and Pedro Pereira Rodrigues. 2009. "Issues in evaluation of stream learning algorithms." Proceedings of the 15th ACM SIGKDD International Conference. ACM. 329-338.

Lafferty, John D., Andrew McCallum, and N.C. Fernando. 2001. "Conditional random fields: Probabilistic models for segmenting and labeling sequence data." Proceedings of the Eighteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. 282-289.

Lin, Chenghua, and Yulan He. 2009. "Joint sentiment/topic model for sentiment analysis." Proceedings of the 18th ACM Conference on Information and Knowledge Management. ACM. 375-384.

Martin, J. R., and P. R. R. White. 2005. Language of Evaluation: Appraisal in English. London: Palgrave. http://grammatics.com/appraisal/.
Pak, Alexander, and Patrick Paroubek. 2010. "Twitter as a Corpus for Sentiment Analysis and Opinion Mining." Language Resources and Evaluation. 1320-1326.

Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. "Thumbs up? Sentiment Classification using Machine Learning Techniques." Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing. Philadelphia, PA, USA: Association for Computational Linguistics. 79-86.

Read, Jonathon. 2005. "Using emoticons to reduce dependency in machine learning techniques for sentiment classification." The Association for Computational Linguistics.

Schmid, Helmut. 1994. "Probabilistic part-of-speech tagging using decision trees." Proceedings of the International Conference on New Methods in Language Processing. 44-49.

Subrahmanian, Venkatramana S., and Diego Reforgiato. 2008. "AVA: Adjective-verb-adverb combinations for sentiment analysis." IEEE Intelligent Systems 23 (4): 43-50.

Whitelaw, Casey, Navendu Garg, and Shlomo Argamon. 2005. "Using appraisal groups for sentiment analysis." Proceedings of the 14th ACM International Conference on Information and Knowledge Management. ACM. 625-631.