SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter


Lexicon-based approaches to Twitter sentiment analysis are gaining popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, in which a collection of words is marked with fixed sentiment polarities. However, a word's sentiment orientation (positive, neutral, negative) and/or sentiment strength can change depending on context and targeted entities. In this paper we present SentiCircle, a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when compared to the state-of-the-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure.

  • The workflow in our approach starts with capturing the contextual semantics of words.

    First, we compute the semantics of a term m by considering the relations of m with all its context words (i.e., words that occur with m in the same context). To compute the individual relation between the term m and a context term ci, we propose the Term Degree of Correlation (TDOC) metric, which is inspired by the TF-IDF weighting scheme.

    We now need to encode this information in a way that enables us to measure sentiment orientation and sentiment strength separately.

    Harith suggested adding a slide about the transition to SentiCircles.
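The notes above name the TDOC metric, but its formula did not survive on this page. The sketch below therefore assumes a TF-IDF-style form, TDOC(m, c) = f(c, m) * log(N / N_c), where f(c, m) counts the tweets in which c occurs with m, N is the vocabulary size and N_c is the number of distinct terms that co-occur with c. That form, the function names and the whitespace tokenisation are assumptions made for illustration, not the authors' code. The last function places a context term on the SentiCircle using the relations shown on slide 16 (r = TDOC, θ = prior sentiment × π, x = r cos θ, y = r sin θ).

```python
import math
from collections import Counter
from itertools import permutations


def cooccurrence_counts(tweets):
    """Count, for each ordered pair of distinct terms, the tweets they share."""
    pair_counts = Counter()
    for tweet in tweets:
        terms = set(tweet.lower().split())  # naive tokenisation (assumption)
        for a, b in permutations(terms, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def tdoc(term, context, pair_counts, vocab_size):
    """Assumed TF-IDF-style correlation between `term` and `context`."""
    f = pair_counts[(context, term)]  # co-occurrence frequency
    # Number of distinct terms that co-occur with the context term.
    n_c = sum(1 for (a, _b) in pair_counts if a == context)
    if f == 0 or n_c == 0:
        return 0.0
    return f * math.log(vocab_size / n_c)


def senticircle_point(radius, prior_sentiment):
    """Map (TDOC radius, prior sentiment in [-1, 1]) to Cartesian (x, y)."""
    theta = prior_sentiment * math.pi
    return radius * math.cos(theta), radius * math.sin(theta)
```

With this encoding the sign of y separates positive from negative context terms, while x and the radius carry strength, which is what lets orientation and strength be read off separately. The radius is left unnormalised here; scaling it into [0, 1] would be a further assumption.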
  • SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

    1. SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter. Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani. The Eleventh Extended Semantic Web Conference (ESWC2014), May 2014.
    2. Outline: Sentiment Analysis; Approaches; SentiCircles; Evaluation; Conclusion.
    3. Sentiment Analysis: “Sentiment analysis is the task of identifying positive and negative opinions, emotions and evaluations in text.” Examples: “The weather is great today :)” (opinion); “yes, It is sunny, but also very humid :(” (opinion); “I think its almost 30 degrees today” (fact).
    4. Sentiment Analysis Approaches: Lexicon-Based Approach; Machine Learning Approach.
    5. Machine Learning Approach
    6. Lexicon-based Approach: the tweet “I had nightmares all night long last night :(” is passed to a text processing algorithm that consults a sentiment lexicon (great, sad, down, wrong, horrible, love) and is labeled Negative.
    7. Machine Learning Approach on Twitter? Requires labeled Twitter corpora (a labor-intensive task; distant supervision gives noisy labels). Domain specific (needs re-training for new domains).
    8. Lexicon-based Approach on Twitter? Traditional lexicons (MPQA, SentiWordNet, etc.) are not tailored to noisy Twitter data (lol, gr8, wow, :), :P) and contain a fixed number of words (e.g., great, sad, down, wrong, horrible, love).
    9. Twitter-specific Lexicon-based Methods, such as SentiStrength: a rule-based method for sentiment analysis on the social web; uses the Thelwall-Lexicon; built specifically to work on social data; contains lists of emoticons, slang, abbreviations, etc.
    10. Thelwall-Lexicon & SentiStrength: a fixed number of words; offer context-insensitive prior sentiment orientations and strengths of words (e.g., “great” is Positive in both “great smile” and “great problem”).
    11. We need an approach that is unsupervised, understands the semantics of words, captures their contexts, and updates sentiment.
    12. SentiCircles
    13. SentiCircles: a lexicon-based approach; builds a dynamic representation of words; captures contextual and conceptual semantics of words; updates words’ sentiment orientation and strength accordingly.
    14. Contextual Semantics: “Words that occur in similar context tend to have similar meaning” Wittgenstein (1953); “You shall know a word by the company it keeps” Firth (1957). Example words: Great, Problem, Look, Smile, Concert, Song, Weather, Loss, Game, Taylor Swift, Amazing.
    15. Capturing Contextual Semantics: for a term m, build the context-term vector (C1, C2, …, Cn), compute the degree of correlation between m and each context term, and take each context term's prior sentiment from a sentiment lexicon. Example: m = Great with context terms Smile and Look. Outputs: contextual sentiment strength in [-1 (very negative), +1 (very positive)] and contextual sentiment orientation (Positive, Negative, Neutral).
    16. SentiCircles Model (Capturing Contextual Semantics): each context term Ci of term m becomes a point with radius ri = TDOC(Ci) and angle θi = Prior_Sentiment(Ci) * π, so that xi = ri * cos(θi) and yi = ri * sin(θi). Both axes run from -1 to +1; the circle is divided into Very Positive, Positive, Negative and Very Negative quadrants plus a Neutral Region. Example: m = Great, Ci = Smile.
    17. SentiCircles (Example)
    18. Overall Contextual Sentiment: Senti-Median of a SentiCircle and the Sentiment Function. Diagram: SentiCircle of term m with a context term Ci at (ri, θi), i.e. (xi, yi); axes from -1 to +1; regions Very Positive, Positive, Negative, Very Negative and a Neutral Region. Paper excerpt (left edge cropped on the slide, gaps marked […]): “[…] in which each term is used. To compute the new sentiment of […] SentiCircle we use the Senti-Median metric. We now have the […] which is composed by the set of (x, y) Cartesian coordinates of […] where the y value represents the sentiment and the x value […] strength. An effective way to approximate the overall sentiment […] by calculating the geometric median of all its points. Formally, […] (p1, p2, ..., pn) in a SentiCircle Ω, the 2D geometric median […]” $g = \arg\min_{g \in \mathbb{R}^2} \sum_{i=1}^{n} \lVert p_i - g \rVert_2$ (5)
    19. SentiCircles & Conceptual Semantics
    20. Enriching SentiCircles with Conceptual Semantics. Example tweets with the concept of the entity they mention: “Sushi time for fabulous Jesse's last day on dragons den” (Jesse: Person); “@Stace_meister Ya, I have Rugby in an hour” (Rugby: Sport); “Dear eBay, if I win I owe you a total 580.63 bye paycheck” (eBay: Company).
    21. Enriching SentiCircles with Conceptual Semantics. Example: in “Cycling under a heavy rain.. What a #luck!”, rain maps to the concept Weather Condition, which also covers Wind, Snow and Humidity.
    22. SentiCircles for Tweet-level Sentiment Analysis: detecting the overall sentiment of a given tweet message (positive vs. negative).
    23. SentiCircles for Tweet-level Sentiment Analysis (1) The Median Method: for the tweet “Cycling under a heavy rain.. what a #luck!”, compute the Senti-Median (S-Median) of each term's SentiCircle, then take the median of the Senti-Medians.
    24. Tweet-level Sentiment Analysis (2) The Pivot Method. Example tweet tk: “I like my new iPad”. Diagram: SentiCircle of pivot term pj (iPad) with the points like (r1, θ1) and new (r2, θ2); regions Very Positive, Positive, Negative, Very Negative. Paper excerpt (left edge cropped on the slide, gaps marked […]): “[…] terms to be equal. Each tweet ti ∈ T is turned into a vector of Senti-Medians (g1, g2, ..., gn) of size n, where n is the number of terms that compose the […] the Senti-Median of the SentiCircle associated with term mj. Equation […] calculate the median point q of g, which we use to determine the overall […] tweet ti using Function 6. […] Method: This method favours some terms in a tweet over others, based on […] that sentiment is often expressed towards one or more specific targets, […] to as “Pivot” terms. In the tweet example above, there are two pivot […] and “iPad” since the sentiment word “amazing” is used to describe […] Hence, the method works by (1) extracting all pivot terms in a tweet and; […] for each sentiment label, the sentiment impact that each pivot term […] other terms. The overall sentiment of a tweet corresponds to the sentiment […] highest sentiment impact. Opinion target identification is a challenging […] beyond the scope of our current study. For simplicity, we assume that the […] those having the POS tags {Common Noun, Proper Noun, Pronoun} in […] each candidate pivot term, we build a SentiCircle from which the sentiment […] pivot term receives from all the other terms in a tweet can be computed. […] Pivot-Method seeks to find the sentiment ŝ that receives the maximum […] impact within a tweet as:” $\hat{s} = \arg\max_{s \in S} H_s(p) = \arg\max_{s \in S} \sum_{i}^{N_p} \sum_{j}^{N_w} H_s(p_i, w_j)$ (7)
    25. Experiments
    26. Experimental Setup: (1) Datasets. (2) Sentiment lexicons: SentiWordNet [3], MPQA Subjectivity Lexicon [4], Thelwall-Lexicon [5].
    27. Experimental Setup: (3) Baselines. 1. Lexicon-Labeling (MPQA & SentiWordNet): average of positive and negative words in a tweet. 2. SentiStrength (state-of-the-art): a lexicon-based method built for Twitter that applies a set of syntactic rules.
    28. Results
    29. Sentiment Detection with Contextual Semantics
    30. SentiCircles vs. Lexicon-Labeling Methods (bar chart, accuracy / F-measure): MPQA-Lex 52.35 / 52.34; SentiWNet-Lex 52.74 / 52.30; SentiCircle 74.96 / 68.06.
    31. SentiCircle vs. SentiStrength (better method per dataset, accuracy / F1): OMD: SentiCircle / SentiCircle; HCR: SentiCircle / SentiStrength; STS-Gold: SentiStrength / SentiStrength; Average: SentiCircle / SentiStrength.
    32. Why Such Variance? The sentiment class distribution in our datasets: SentiCircle produces, on average, 2.5% lower recall than SentiStrength on positive tweet detection, and our datasets contain more negative tweets than positive ones. The topic distribution in the three datasets. More research is required.
    33. Sentiment Detection with Conceptual Semantics: win/loss in accuracy and F-measure from incorporating conceptual semantics into SentiCircles, where Mdn is SentiCircle with the Median method and Pvt is SentiCircle with the Pivot method.
    34. Conclusion: We proposed a novel semantic sentiment approach called SentiCircle. SentiCircle captures context and updates sentiment accordingly. We showed how SentiCircle can be applied to tweet-level sentiment analysis. SentiCircle outperformed the other lexicon-labeling methods and overtook the state-of-the-art SentiStrength approach in accuracy, with a marginal drop in F-measure.
    35. SentiCircles for Sentiment Analysis: 1. Tweet-level Sentiment Analysis: Saif et al. (2014) at ESWC Conference, Crete, Greece. 2. Entity-level Sentiment Analysis: Saif et al. (2014), IPM Journal. 3. Sentiment Lexicon Adaptation: Saif et al. (2014) at ESWC Conference. 4. Dynamic Stopwords Generation: Saif et al. (2014) at LREC Conference, Reykjavik, Iceland. 5. Sentiment Patterns Discovery: Saif et al. (2014), submitted to ISWC Conference.
    36. Thank You. Email: hassan.saif@open.ac.uk; Twitter: hrsaif; Website: tweenator.com
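The Senti-Median on slide 18 and the Median method on slide 23 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it approximates the geometric median of Equation 5 with Weiszfeld's iteration (a standard technique, chosen here as an assumption), and it replaces the unnamed Sentiment Function with a simple threshold on the y coordinate. The names `geometric_median`, `sentiment_label`, `tweet_sentiment` and the `neutral_band` width are hypothetical.

```python
import math


def geometric_median(points, iterations=100, eps=1e-9):
    """Approximate the 2D geometric median with Weiszfeld's iteration."""
    # Start from the centroid of the points.
    gx = sum(x for x, _ in points) / len(points)
    gy = sum(y for _, y in points) / len(points)
    for _ in range(iterations):
        wsum = wx = wy = 0.0
        for x, y in points:
            # Clamp the distance so a point at the estimate cannot divide by zero.
            w = 1.0 / max(math.hypot(x - gx, y - gy), eps)
            wsum += w
            wx += w * x
            wy += w * y
        nx, ny = wx / wsum, wy / wsum
        if math.hypot(nx - gx, ny - gy) < eps:
            return nx, ny
        gx, gy = nx, ny
    return gx, gy


def sentiment_label(point, neutral_band=0.05):
    """Assumed sentiment function: the sign of y, with a neutral band around 0."""
    _, y = point
    if y > neutral_band:
        return "positive"
    if y < -neutral_band:
        return "negative"
    return "neutral"


def tweet_sentiment(term_circles, neutral_band=0.05):
    """Median method: take the median of the per-term Senti-Medians.

    `term_circles` holds one list of (x, y) SentiCircle points per tweet term.
    """
    senti_medians = [geometric_median(pts) for pts in term_circles if pts]
    if not senti_medians:
        return "neutral"
    return sentiment_label(geometric_median(senti_medians), neutral_band)
```

A tweet would be passed in as one list of SentiCircle points per term, for example `tweet_sentiment([[(0.1, 0.5), (0.2, 0.7)], [(0.0, 0.3)]])`. The Pivot method of Equation 7 is not sketched because the slides do not define the impact function H_s.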
