Aspect mining and sentiment association



An Approach to Extract Aspects and Sentence-Level Sentiment From User Feedback Using a Rule-Based Approach in Conjunction with SentiWordNet and POS Tagging

Aaruna G, Ramachandra Kousik A.S
Imaginea Technologies, a BU of Pramati Technologies.

Abstract

The most integral part of our work is to extract aspects from user feedback and associate sentiment and opinion terms with them. The dataset at our disposal is a set of feedback documents for various departments in a hospital, in XML format, with comments represented in tags. It contains about 65,000 responses to a survey taken in the hospital. Every response or comment is treated as a sentence or a set of sentences. We perform sentence-level aspect and sentiment extraction: we mine the user feedback data to gather aspects, then extract the sentiment mentions, evaluate them contextually for sentiment, and associate those sentiment mentions with the corresponding aspects. To start with, we clean up the user feedback data; this is followed by aspect extraction and sentiment polarity calculation, with the help of POS tagging and SentiWordNet [1] filters respectively. The obtained sentiments are further classified according to a set of linguistic rules, and the scores are normalized to nullify any noise that might be present. We lay emphasis on a rule-based approach, the rules being linguistic rules that correspond to the positioning of various parts-of-speech words in a sentence.

Keywords: Aspect Mining, Opinion Mining, Sentiment Analysis, Polarity Classification.

1. Introduction

The primary focus of our work is aspect extraction and grouping. Aspects form an important part of any classification, and they essentially define the context in which a certain opinion or response is expressed. We group aspects in order to achieve closeness, i.e. to put aspects that are linguistically related to each other together in a common bucket.
Our work also focuses on extracting the relevant sentiment for an aspect. We perform this analysis at the sentence level using a rule-based approach, where the rules are English-language rules.

Recently there has been a change of attitude in the field, from plainly extracting positive or negative opinions to introducing opinion weights and classifying opinions as neutral; the focus is therefore no longer the binary classification into positive or negative referred to by Turney in [3]. Corpus-based methods work by utilizing dictionary-based approaches, and these approaches depend on existing lexicographical resources (such as WordNet) to provide semantic data regarding individual senses and words [4]. We lay emphasis on language rules rather than a plain look-up from sources like WordNet or SentiWordNet, because we base our work on the fact that the meaning of a word is relevant only in a context: the presence of other words alongside a particular word in an expression changes the intensity of the whole expression, as when an adjective or an adverb intensifies or further deprecates the intensity of a noun or a verb. This is also the distinguishing factor between text mining and Information Retrieval (IR), where the latter is only information access while the former involves pruning the complete text data, as also argued by Hearst in [2]. In contrast to other works, ours presents a sentence-level lexical/dictionary knowledge-base method to tackle the domain-adaptability problem for different types of data, as shown by Khan et al. in [5].

The dataset used for our work is a set of documents that are responses to a survey conducted in a hospital, and these responses have been categorized by department. Overall, we conducted our study on about 65,000 responses. We chose not to do any subjectivity analysis because, on a feedback form or a "post your comment" section, the number of objective expressions is negligible and would not constitute any significant noise. One of the main reasons we limited ourselves to sentence-level classification is that an aspect that appears in a response predominantly contributes to that response alone, not to the whole document. Hence there is no need for document-level (paragraph-level) analysis; should document-level analysis be needed, a simple aggregation of all the sentence-level results would be fairly accurate. We also wanted to remain as domain-independent as possible, which could only be achieved with sentence-level classification. We analyze the performance of our approach by comparing our results (shown in Section 4) against manually annotated results.

Section 2 presents the approach we followed to solve the problem of aspect extraction and sentiment association and classification. Section 3 details our score-calculation metrics along with the pseudo-code, followed by performance analysis in Section 4, conclusion and future work in Section 5, and references in Section 6.

2. Our Approach

Each response is about an aspect or a group of aspects, and these aspects form the theme of a comment or response.
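The feedback documents are XML files with comments represented in tags. A minimal sketch of reading one such document follows; the tag names, attribute name, and layout shown here are assumptions for illustration, not the actual schema of our dataset:

```python
import xml.etree.ElementTree as ET

# Assumed layout: one document per department, comments as child tags.
SAMPLE = """
<feedback department="SECTCARE">
  <comment>The doctor was very approachable.</comment>
  <comment>Could not ask for a better place.</comment>
</feedback>
"""

def load_comments(xml_text):
    """Return (department, list of comment strings) from one feedback document."""
    root = ET.fromstring(xml_text)
    comments = [c.text.strip() for c in root.findall("comment")]
    return root.get("department"), comments

dept, comments = load_comments(SAMPLE)
```

Each extracted comment string then becomes one response for the sentence-level pipeline.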
As the comments or responses were obtained through a survey, it is intuitive that a certain aspect appears in a response only when the user who wrote that response deems that aspect fit, and every aspect that appears in a sentence has a part to play in the overall sentence's expression quotient. Keeping this in mind, certain metrics like TF-IDF were ruled out because a) a response is not as elaborate as a document, and b) when two aspects appear in a response, we treat them both as equally important to that response and completely rule out the concept of relative importance (where TF-IDF would have come in handy), since a response is typically not more than a couple of sentences long and the chances of an aspect appearing multiple times in such a response are negligible. We lay emphasis on sentence-level classification to increase the efficiency of the model and to keep it as generic as possible.

Aspects that are one-hop neighbors of each other in a dictionary more or less mean the same thing. We use WordNet [8] to gather one-hop neighbors for an aspect. The co-occurrence of aspects in a linguistic sense implies that two aspects could mean the same thing, but in no way suggests that they should always co-occur in that context. We use these neighbors only to group aspects together, which helps us filter out redundant aspects when calculating top aspects.
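The one-hop grouping idea can be sketched as follows. A small hand-built synonym table stands in for the real WordNet lookup here, and the function name is illustrative; in the actual pipeline the neighborhood would come from WordNet synsets:

```python
# Toy stand-in for a WordNet one-hop lookup: in the real pipeline this
# would query WordNet for synonyms/neighbors of the aspect term.
ONE_HOP = {
    "doctor": {"physician", "medic"},
    "physician": {"doctor"},
    "nurse": {"nursing", "sister"},
}

def group_aspects(aspects):
    """Bucket aspects whose one-hop neighborhoods overlap."""
    buckets = []
    for aspect in aspects:
        neighborhood = {aspect} | ONE_HOP.get(aspect, set())
        for bucket in buckets:
            if bucket & neighborhood:   # shares a one-hop neighbor
                bucket |= neighborhood
                break
        else:
            buckets.append(neighborhood)
    return buckets

# "doctor" and "physician" fall into one bucket; "nurse" gets its own.
buckets = group_aspects(["doctor", "physician", "nurse"])
```

Redundant aspects within a bucket can then be collapsed before computing the top aspects for a department.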
We follow a four-step process to accomplish our objective:
1) Extracting the entities from the corpus/text-base that are potential sentiment holders and objects of potential sentiment, which we term aspects.
2) Filtering out noise by separating stop words and other irrelevant terms in a user comment using the part-of-speech tagger of [6].
3) Associating the respective sentiment terms with the corresponding aspect.
4) Assigning normalized sentiment scores to a feature/entity, to keep the output unaffected by changes in the algorithm or in the weights we assign through our rules.

Another case in point in our work is the user profile. It is not imperative that everyone records their response in adherence to correct grammar and the conventions of the language. So we have chosen our rules (explained in Section 3) in such a way that we only consider rules that are more generic than extremely specific language rules. For instance, there is a rule applied when intensifiers are present ("very good"), but we ignore the usage of exclamation marks and emoticons (! or :D, :P, etc.) because of the tendency to use them at will and sometimes rather arbitrarily.

Our approach essentially consists of two agents, and these agents operate serially:
1) The Aspect Extraction Agent (AEA)
2) The Sentiment Association Agent (SAA)

Before the AEA takes over, we remove noise from the user feedback. By that we mean we remove all special symbols, stop words like {a, hmm, is, yea, etc.}, and blank spaces, and feed the filtered dataset to the AEA.

The Aspect Extraction Agent: The Aspect Extraction Agent (AEA) makes use of a POS tagger [6] to separate the subjects from the opinion part of the comment. The aspects form the subjective entities about which a respective sentiment is expressed. The AEA also filters out noise by not ranking any special symbols left over, or obvious objective features.
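The pre-AEA cleanup step can be sketched as follows. The stop-word set shown is illustrative, not the exact list used in our pipeline:

```python
import re

def clean_response(text, stop_words):
    """Strip special symbols, stop words, and extra blanks from a response."""
    # Keep only letters and whitespace; everything else becomes a space.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    tokens = [t for t in text.lower().split() if t not in stop_words]
    return " ".join(tokens)

# Illustrative stop-word set in the spirit of {a, hmm, is, yea, ...}.
cleaned = clean_response("Hmm, the doctor is very good!!",
                         {"a", "an", "the", "is", "hmm", "yea"})
```

The cleaned string is what the AEA then POS-tags to separate aspects from opinion words.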
The assumption is that, on a feedback form, the comments are predominantly subjective. The POS tagger also helps us identify the opinion words present in the sentence. Intuitively, the AEA constructs a {key: value} pair, where the key is the aspect (or set of aspects) forming the context of the expression, and the value is the list of opinion words related to that aspect, i.e. those used to express that particular aspect. This output is then supplied to the SAA. The aspects are further grouped to remove redundant aspects and to re-adjust the weights of the aspects in the context of the department, which further aids us in prioritizing the aspects for a department.

The Sentiment Association Agent: The sentiment association agent receives a set of {key: value} pairs from the AEA. It then makes use of SentiWordNet, detailed in [1], to compute the opinion score of each word and produces an aggregate opinion score for that particular feature. The scores thus output are collected and pruned according to a set of language rules, such as the presence (or absence) of intensifiers, negations, adverbs, and adjectives, to obtain the sentiment score for that particular feature. The sentiment score thus obtained has three components, namely positive, negative, and neutral. It is assumed with considerable conviction that almost every positive opinion term, phrase, or entity will have some sort of negative and neutral score when used in different senses, and vice versa. E.g.:

  "It's incredible. I absolutely love it." [incredible is positive here]
  "Ah, what incredibly awful stuff that is!" [incredible is negative here]

The reason for normalizing is that any rescaling of an input vector can be effectively undone by changing the corresponding weights and biases, leaving us with exactly the same outputs as before. However, there are a variety of practical reasons why standardizing the inputs can make training faster and reduce the chances of getting stuck in local optima; weight decay and Bayesian estimation can also be done more conveniently with standardized inputs. We can always tell the system by how much the value has changed since the previous input. Moreover, the respective scores of an opinion word in SentiWordNet sum to 1, so it makes sense to normalize our aggregate of opinion-word scores to converge to 1, to ensure consistency regardless of changes in the input-vector scales. The score-calculation metrics and pseudo-code are further detailed in Section 3.

However, there is a catch with the way sentiment scores are organized in SentiWordNet. Every term in the SentiWordNet database is classified into a number of senses, each sense ranked according to the frequency of its usage in general (with the help of a sense number), indicating in how many different contexts that particular term could be used. There might be cases where a term carries conflicting scores across its different senses. Table 1 illustrates this case.
  Synset                                    | SentiWordNet score (pos, neg) | Gloss
  huffy, mad#1, sore (roused to anger)      | (0.0, 0.125)  | "she gets mad when you wake her up so early"; "mad at his friend"
  brainsick, crazy, demented, disturbed,    | (0.0, 0.5)    | "a man who had gone mad"
  mad#2, sick, unbalanced, unhinged         |               |
  delirious, excited, frantic, mad#3,       | (0.375, 0.125)| "a crowd of delirious baseball fans"; "a mad whirl of pleasure"
  unrestrained                              |               |
  harebrained, insane, mad#4 (very foolish) | (0.0, 0.25)   | "harebrained ideas"; "took insane risks behind the wheel"; "a completely mad scheme to build a bridge between two mountains"

  Table 1: Example of multiple scores for the same term from SentiWordNet
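One simple way to reconcile such conflicting sense scores is a weighted average that down-weights higher sense numbers. A sketch follows; the 1/rank weighting is our assumption about how "deprecating the individual sense scores as the sense number increases" might be realized, and the function name is illustrative:

```python
def collapse_senses(sense_scores):
    """sense_scores: list of (pos, neg) pairs ordered by sense number
    (sense 1 = most frequent usage). Returns a single (pos, neg) pair,
    weighting sense k by 1/k so rarer senses contribute less
    (the exact weighting scheme is an assumption)."""
    weights = [1.0 / k for k in range(1, len(sense_scores) + 1)]
    total = sum(weights)
    pos = sum(w * p for w, (p, _) in zip(weights, sense_scores)) / total
    neg = sum(w * n for w, (_, n) in zip(weights, sense_scores)) / total
    return pos, neg

# The four senses of "mad" from Table 1:
pos, neg = collapse_senses([(0.0, 0.125), (0.0, 0.5), (0.375, 0.125), (0.0, 0.25)])
```

With this weighting, "mad" collapses to a predominantly negative score, which matches the intuition from Table 1.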
In Table 1, the word mad, belonging to the adjective part of speech, has ambiguous positive and negative senses, and the disambiguation becomes a very primitive problem; it is related, in some sense, to the Word Sense Disambiguation (WSD) problem. Due to limited time and the complexity of introducing WSD into this model, a simple approach is proposed to solve this problem:
  • Evaluate the scores for each term in a given sentence.
  • If there are conflicting scores, i.e. different sense scores for the same word, calculate the weighted average of all positive scores and all negative scores. By doing so, we deprecate the individual sense scores as the sense number increases.

3. Score Calculation and Pseudo-Code

Our model is a blend of a traditional bag of words and intelligent look-up and priority evaluation using a set of language rules. The simplest rule to start with is the Negation Rule. When a negation or a negative word like "not" or "neither" is found in a response, the polarity of the opinion associated with the aspect in context is reversed. If R is a response, {A} is the set of aspects associated with that response in context Θ_S, and {N} denotes the set of negative words, then:

  If ∃ n ∈ Θ_S where n ∈ N, then invert the polarities of A.
  Rule R1: The Negation Rule

The second rule is the Modal Rule. Modals are the trickiest to deal with. To understand why modals are important, consider the following cases.

  Case 1: "The doctor could have been more positive."

A response like that in Case 1 would be tagged as a positive response, for there is no exact negative inference there. Similarly,

  Case 2: "I would have not gotten as much attention in any other hospital."

A response like that in Case 2 would be tagged as negative, for the same reason mentioned in Case 1. Given the extensive usage of modals in spoken and written language, it makes a huge difference to the results if they are not handled appropriately. So the following rule is proposed to deal with modals. If R is a response, {A} is the set of aspects associated with that response in context Θ_S, and {M} denotes the set of phrases like "would have", "could have", etc., which we term modals, then:

  If ∃ m ∈ Θ_S where m ∈ M, then invert the polarities of A.
  Rule R2: The Modal Rule

The adjustment of polarity with respect to adjectives and adverbs is a very important aspect of sentence-level sentiment extraction. We take the intensifier into account and re-prune our polarity scores according to the score of the intensifier. Let I = [i_p, i_n, i_o] denote the positive, negative, and objectiveness scores of the intensifier (or reducer), and let [Ψ_p, Ψ_n, Ψ_o] denote the positive, negative, and objectiveness scores of the quantity being intensified or reduced by I. The re-pruned values are given by rules R3 and R4 for intensifiers and R5 and R6 for reducers.

If i_p > i_n and Ψ_p > Ψ_n, the re-pruned score for Ψ is given by:
  Ψ'_n = (i_p · Ψ_n) / (ΣΨ_k − Ψ_o)
  Ψ'_p = (ΣΨ_k − Ψ_o) − Ψ'_n
  Rule R3: Rule to intensify the positive quotient

If i_p > i_n and Ψ_n > Ψ_p, the re-pruned score for Ψ is given by:
  Ψ'_p = (i_p · Ψ_p) / (ΣΨ_k − Ψ_o)
  Ψ'_n = (ΣΨ_k − Ψ_o) − Ψ'_p
  Rule R4: Rule to intensify the negative quotient

If i_n > i_p and Ψ_p > Ψ_n, the re-pruned score for Ψ is given by:
  Ψ'_p = (i_n · Ψ_p) / (ΣΨ_k − Ψ_o)
  Ψ'_n = (ΣΨ_k − Ψ_o) − Ψ'_p
  Rule R5: Rule to reduce the positive quotient

If i_n > i_p and Ψ_n > Ψ_p, the re-pruned score for Ψ is given by:
  Ψ'_n = (i_n · Ψ_n) / (ΣΨ_k − Ψ_o)
  Ψ'_p = (ΣΨ_k − Ψ_o) − Ψ'_n
  Rule R6: Rule to reduce the negative quotient
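Rules R1 and R2 both amount to swapping the positive and negative components of a score when a trigger term from {N} or {M} appears in the response. A minimal sketch, with illustrative trigger sets (the real sets used in our pipeline may differ):

```python
NEGATIONS = {"not", "neither", "never", "no"}          # illustrative set {N}
MODALS = {"could have", "would have", "should have"}   # illustrative set {M}

def apply_inversion_rules(response, score):
    """score: (pos, neg, obj). Apply R1 (negation) then R2 (modal);
    each trigger that is present flips the positive/negative components."""
    pos, neg, obj = score
    text = response.lower()
    if any(n in text.split() for n in NEGATIONS):      # Rule R1
        pos, neg = neg, pos
    if any(m in text for m in MODALS):                 # Rule R2
        pos, neg = neg, pos
    return pos, neg, obj

# Case 1: modal present, no negation -> polarity flipped once by R2.
flipped = apply_inversion_rules("The doctor could have been more positive",
                                (0.5, 0.2, 0.3))
# Negation present, no modal -> polarity flipped once by R1.
neg_flipped = apply_inversion_rules("The nurse was not attentive",
                                    (0.4, 0.1, 0.5))
```

When both a negation and a modal are present, as in Eg 1 below, the two flips cancel and the original polarity is restored.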
The above division rules are valid only when Ψ_p + Ψ_n > Ψ_o; otherwise, the denominator in the division rules is set to 1. The intuition is to reduce the impact of the opinion word in the case of a reducer, and vice versa for an intensifier; the denominator component in the division ensures that the values are not scaled down or up by a huge margin. We equate the denominator to 1 when the sum of the positive and negative scores of an opinionated term is less than its objective score, to tackle the problem of polarity inversion. The above rules amplify or reduce the impact of an intensifier or a reducer, respectively, on an opinionated word, and sufficient care is taken that the values converge to 1, to ensure domain and overall consistency. The following cases illustrate rules R1...R6.

Eg 1: "Could not ask for any better place or doctor for any cancer patient."
  Aspects: doctor, place, cancer, patient
  Positive sentiment value: 0.30319; negative sentiment value: 0.00736 (before R1 and R2)
  Positive sentiment value: 0.00736; negative sentiment value: 0.30319 (after R1)
  Positive sentiment value: 0.30319; negative sentiment value: 0.00736 (after R1 and R2)

Eg 2: "The doctors were very approachable and easy to talk to, understood my problem, and I could clearly understand them."
  Aspects: doctor, problem
  Positive sentiment value: 0.21824; negative sentiment value: 0.09563

Eg 3: Effect of adverbs on adjectives (approximated to two or three decimal places):

  Term      | Positive sentiment value     | Negative sentiment value        | Neutral sentiment value
  good      | 0.5                          | 0.25                            | 1 − (0.5 + 0.25) = 0.25
  very      | 0.25                         | 0.17                            | 1 − (0.25 + 0.17) = 0.58
  very good | (0.5 + 0.25) − 0.083 = 0.667 | (0.25 × 0.25) / (0.5 + 0.25) = 0.083 | 0.25

  Table 2: Demonstrating the effect of adverbs on adjectives
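Rules R3...R6, together with the denominator guard, can be sketched in one function. The function name is illustrative; the "very good" row of Table 2 is reproduced as a check:

```python
def reprune(intensifier, target):
    """intensifier: (i_p, i_n, i_o); target: (psi_p, psi_n, psi_o).
    Applies rules R3-R6: amplify or reduce the dominant polarity of the
    target according to the dominant polarity of the intensifier/reducer."""
    i_p, i_n, i_o = intensifier
    p, n, o = target
    # Denominator guard: valid only when psi_p + psi_n > psi_o, else 1.
    denom = (p + n) if (p + n) > o else 1.0
    if i_p > i_n:            # intensifier
        if p > n:            # R3: intensify the positive quotient
            new_n = (i_p * n) / denom
            new_p = (p + n) - new_n
        else:                # R4: intensify the negative quotient
            new_p = (i_p * p) / denom
            new_n = (p + n) - new_p
    else:                    # reducer
        if p > n:            # R5: reduce the positive quotient
            new_p = (i_n * p) / denom
            new_n = (p + n) - new_p
        else:                # R6: reduce the negative quotient
            new_n = (i_n * n) / denom
            new_p = (p + n) - new_n
    return new_p, new_n

# Table 2: "very" (0.25, 0.17, 0.58) intensifying "good" (0.5, 0.25, 0.25).
p, n = reprune((0.25, 0.17, 0.58), (0.5, 0.25, 0.25))
```

For "very good" this reproduces the Table 2 values of roughly 0.667 positive and 0.083 negative.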
The following algorithm, Algorithm 1, shows the pseudo-code of the implementation.

  Start
  sentimentMap = Load SentiWordNet
  For each comment c in Comments C:
      hasNegation = false
      hasModals = false
      NounBag = {}
      SentimentValue = {}
      For each word w in comment c:
          If negation(w):
              hasNegation = not hasNegation
          If modal(w):
              hasModals = true
          If POS(w) == Noun:
              NounBag.append(w)
              SentimentValue = getSentimentValue(sentimentMap, w)
          Elif POS(w) == Adjective or POS(w) == Verb:
              SentimentValue = getSentimentValue(sentimentMap, w)
              Reprune()
          Elif POS(w) == Adverb:
              SentimentValue = getSentimentValue(sentimentMap, w)
              RepruneAdverb()
          EndIf
          If hasNegation:
              InversePolarities()
          EndIf
          If hasModals:
              PruneModals()
          EndIf
      EndFor
  EndFor
  End

  ALGORITHM 1: Semi-rule-based model for sentence-level sentiment extraction.

The steps explain the order of operations, which start with extracting every comment and, in turn, every opinion word in the comment, finding its part of speech, and then detecting the presence of intensifiers and performing the subsequent re-pruning. A lot of work still has to be done in evaluating modals like "would've been" and "could've been", and conjunctions. Currently the sentiment scores on both sides of a conjunction are aggregated, but effort has to be put into finding a metric to efficiently evaluate the opinion weights. Current efforts go into disambiguating senses and finding linguistic rules, in the presence of conjunctions and modals, to prune opinion weights.

4. Performance Analysis

For sentiment-term association and classification we ran our algorithm in four iterations, and at each iteration we achieved better results, outperforming the previous iteration. Of the dataset we have, we considered four departments {SECTSCHE, SECTCARE, SECTFACI, SECTOVER} as our data sample. The reason to run four iterations was to understand how our algorithm achieved better results as rules were added. The following table illustrates our strategy at each iteration.

  Iteration 1: Gathering adjectives and adverbs from a response using POS tagging, using SentiWordNet to look up their scores, and associating those scores with the corresponding aspects.

  Iteration 2: 1) The impact of different senses was realized, and nouns, adjectives, adverbs, and verbs were separated from the rest. 2) Each of these four POS categories was looked up differently in SentiWordNet, and the impact of one POS term on another was considered (rules R3...R6). 3) The notion of positive, negative, and neutral scores for each entity was introduced, which essentially means every positive word has some amount of negative sense and vice versa. These scores were normalized to adjust to changes in input scales.

  Iteration 3: Similar to Iteration 2, but nouns were ignored during the sentiment association and classification process, owing to the notion that nouns usually form the aspect/context of a response but do not greatly influence its opinion quotient.

  Iteration 4: Rule R2, to deal with modals, was introduced.

  Table 3: List of iterations for sentiment extraction and classification.

The tabular representations in FIG 1 and FIG 2 show the results of all four iterations. At each iteration the false positives and false negatives have been calculated.
False positives are negative opinions miscalculated as positive, and false negatives are the reverse; these are denoted FPOS and FNEG in FIG 1. At each iteration the percentage errors in positive and negative are also shown. The false positives and false negatives are mainly due to the fact that every opinion, however negative, is expressed more with positive words than with negative words. The columns T-Positive and T-Negative denote the total numbers of originally positive and negative responses, as manually annotated. We check our calculated results against these manually annotated results to determine the accuracy and efficiency of our model. The results are detailed in the following figures.
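The FPOS/FNEG counts and percentage errors can be computed from the predicted and manually annotated labels as sketched below; the labels shown are toy values for illustration, not figures from our dataset:

```python
def error_stats(predicted, annotated):
    """FPOS: annotated-negative responses miscalculated as positive;
    FNEG: annotated-positive responses miscalculated as negative.
    Percentage errors are taken over the annotated totals."""
    fpos = sum(1 for p, a in zip(predicted, annotated) if a == "neg" and p == "pos")
    fneg = sum(1 for p, a in zip(predicted, annotated) if a == "pos" and p == "neg")
    t_pos = sum(1 for a in annotated if a == "pos")
    t_neg = sum(1 for a in annotated if a == "neg")
    return {"FPOS": fpos, "FNEG": fneg,
            "pct_err_pos": 100.0 * fneg / t_pos,   # % error in positive
            "pct_err_neg": 100.0 * fpos / t_neg}   # % error in negative

# Toy example: four responses, one false negative and one false positive.
stats = error_stats(["pos", "neg", "pos", "pos"],
                    ["pos", "pos", "neg", "pos"])
```

The same accounting, applied per department and per iteration, produces the tables in FIG 1 and FIG 2.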
FIG 1: The sentiment classification results at various iterations.

As can be seen from the tabular column, with every iteration the numbers of false positives and false negatives came down (refer to Table 3 for iteration details). The following figure, FIG 2, gives the percentage errors in every iteration after the first.

FIG 2: Percentage errors at each iteration.

The figure above shows the percentage errors with respect to each iteration. The % error in positive is the number of positive opinions miscalculated as negative (i.e. the false negatives), and the % error in negative is the reverse. The percentage errors could be brought down further by improving the rules for modals and by using, or building, a domain-specific lexicon. As can be seen from FIG 2, the negative error percentage has to do with people expressing weak negatives with the help of positive qualifiers. The increase in the positive error percentage from Iteration 3 to Iteration 4 could be tackled by fine-tuning rule R2. Part of the miscalculations can be attributed to limitations of the POS tagger [6], which in turn have to do with our dataset's responses not having a proper grammatical structure.

5. Conclusion and Future Work

Our model is a rule-based approach proposed for aspect extraction and the association of opinions with those respective aspects. The contextual information and the sense of each individual sentence are extracted according to the pattern structure of the sentence using a part-of-speech tagger. The first-stage opinion score for the extracted sense is assigned to the sentence using SentiWordNet. The eventual opinion score is calculated after checking the linguistic orientation of each term in the sentence with the help of rules R1...R6, explained in Section 3, and normalizing the results to ensure that the eventual score of an aspect in a response converges to 1, irrespective of the number of opinion words associated with that aspect.

The accuracy of our model could be improved by having a lexicon specific to the domain, or by employing a learning mechanism with the help of a feedback loop, which could also be manual. However, natural language processing is just not black and white. A lot of work still has to be done in disambiguating word senses and the weights associated with the subjects and objects in a sentence, as the two do not necessarily have the same impact on the sentence's sentiment. Work is also being carried out to separate weak positives and negatives from strong positives and negatives, to provide the customer with potential standpoints to improve their product quality. An approach to deal with conjunctions also has to be worked on for better accuracy.

6. References

[1] S. Baccianella, A. Esuli, and F. Sebastiani, "SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining," in Proc. of LREC'10, 2010.
[2] M.A. Hearst, "Untangling text data mining," Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 1999, pp. 3-10.
[3] P. Turney, "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews," Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), 2002, pp. 417-424.
[4] A. Andreevskaia and S. Bergler, "When specialists and generalists work together: Overcoming domain dependence in sentiment tagging," Proceedings of ACL-08: HLT, 2008, pp. 290-298.
[5] A. Khan, B. Baharudin, and K. Khan, "Sentiment Classification from Online Customer Reviews Using Lexical Contextual Sentence Structure," Communications in Computer and Information Science, Software Engineering and Computer Systems, Springer-Verlag, 2011, pp. 317-331.
[6] K. Toutanova, D. Klein, C. Manning, and Y. Singer, "Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network," in Proceedings of HLT-NAACL 2003, pp. 252-259.
[7] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment Classification using Machine Learning Techniques," Proceedings of EMNLP, 2002.
[8] G.A. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K.J. Miller, "Introduction to WordNet: An On-line Lexical Database," International Journal of Lexicography, Vol. 3, No. 4, 1990, pp. 235-244.