Sentiment Analysis in Twitter with Lightweight Discourse Analysis

Sentiment Analysis in Twitter with
Lightweight Discourse Analysis
-Subhabrata Mukherjee & Pushpak Bhattacharyya
Presented By: Naveen Kumar

Outline
 Introduction to Sentiment Analysis
 Motivation
 Introduction : Approach
 Discourse Relations
◦ Discourse Relations Critical for Sentiment Analysis
◦ Semantic Operators Influencing Discourse Relations
 Algorithm Proposed
◦ Feature Variables
◦ Steps for different semantics operators and discourse relations
 Feature Vector Classification
 Evaluation
◦ Data Set Statistics
◦ Accuracy Measures
 Conclusion
Sentiment Analysis in Twitter with Lightweight
Discourse Analysis 2

What is Sentiment Analysis ?
 Classification on polarity of given text .
 Example : “@abc Good to see your government is working
for common people”
 Approaches : Opinion Mining

Introduction to Sentiment Analysis:
 Usage of Sentiment Analysis Applications : Brand
Monitoring , Business Planning , Risk Analysis and many
others
 Sentiment analysis is much more than simplistically
subtracting the number of “negative” words from the
number of “positive” in a document or message in order to
produce a score.

Introduction to Sentiment Analysis(Contd..) :
Fig : Approaches for designing Sentiment Analysis Application

Introduction to Sentiment Analysis (Contd..):
 Most Common Techniques Used for Sentiment Analysis:
◦ POS : Finding adjectives as they are indicators for opinion
◦ Term Presence and Frequency
◦ Opinion Dictionaries
◦ Negations : ‘not good’ is equivalent to bad
◦ Point-wise Mutual Information

Introduction to Sentiment Analysis (Contd..):
 Data expansion in recent years is one of the key factor
which increased the role of Sentiment Analysis in business.
 Looking ahead , we can see a true social democracy in
different domains which harness the opinion of crowd
rather than few experts

Motivation
▪ Example :
I’m quite excited about Tintin , despite not really liking original
comics.
 This sentence is positive although there is equal number of
positive and negative words.
 “Despite” gives more weight to previous discourse segment.

Motivation
▪ Example :
Think I’ll stay with whole sci-fi shit but this time …. classic movie

 This sentence is positive although there is equal number of
positive and negative words.
 “But” gives more weight to following discourse segment.

Motivation
 Discourse elements in text alters the polarity of text
 Sentiment Analysis classification should also take discourse
elements.
 This paper takes into consideration : connectives, modals,
conditionals and negation in a sentence for polarity
detection
 In this work bag of word model incorporates discourse
information which impacts the sentiment of a sentence

Motivation : Traditional Approach in
Discourse Analysis
• Traditional approaches for discourse analysis are mostly based on
Rhetorical Structure Theory (RTS) proposed by Mann (1988)
• But , most of these theories are focused on structured text.
• Tweet : “@abc I am gonna party 2day ….. wanna join?”
• Limitations:
• Parsers and taggers do fail frequently in unstructured text like
tweets
• Heavy linguistic processes increases the processing time and
slow down the entire application.

Introduction: Approach
 The approach deals with noisy and unstructured data as in
Tweets
 This works well even with the structured reviews and gives
a better accuracy compared to state of the art system.

Introduction: Approach
 Approach in this paper , identifies features for Sentiment
Analysis classification which incorporates discourse
information to give better sentiment classification accuracy
 Features here can include bag of words, along with domain
specific features like emoticons, hashtags, etc.

Discourse Relations
 Coherently Structured Discourse is collection of sentences related to each other.
Coherence Relation Conjunction
Cause-Effect because ; and so
Violated Expectations although; but; while
Condition If…(then) ; as long as ; while
Similarity and ; (and) similarity
Contrast by contrast ; but
Temporal Sequences (and) then ; first , second , ..before;
after ; while
Attribution according to … ; …said ; claim that …
; maintain that … ; stated
Example for example ; for instance
Elaboration also ; futhermore ; in addition
Generalization In general
Table : Contentful Conjunctions used to illustrate Coherence Relations

Discourse Relations Critical for Sentiment
Analysis
1. Violated Expectation and Contrast :
◦ Violating expectation conjunctions oppose or refute the neighboring discourse segment.
◦ Categorized in two : Conj_Fol and Conj_Prev
◦ Example :
 “He wasn’t prepared for the exam but he did pretty well”
 “I loved rides at Disney World despite having acrophobia”
◦ The discourse segment following “but” should be given more weight. In Example 2, the discourse
segment preceding “despite” should be given more weight.
2. Conclusive or Inferential Conjunctions : Set of conjunctions : Conj_infer. The discourse segment
following them should be given more weight.
Eg : @User I wasn’t much satisfied with ur so-called gud phone and subsequently decided to reject it.

Discourse Relations Critical for Sentiment
Analysis (Contd..)
3. Conditionals :
◦ The presence of “if” , tones down the final polarity as it introduces a hypothetical
situation in the context.
◦ Example : “If Micromax improved its battery life , it wud hv been gr8 phone”
4. Other Discourse Relations : Sentences having other discourse conjunctions which
has no contrasting or conflicting information can be handled by bag of word approach.

Semantic Operators Influencing Discourse
Relations
 Example : “You may like the movie despite the bad
reviews”
 Here ‘despite’ gives more weight to ‘like’
 But ‘may’ pulls down the weight as it creates uncertaininty

Relations (Contd..)
Relations Attributes
Conj_Fol but, however, nevertheless , otherwise , yet ,
still , nonetheless
Conj_Prev till , until , despite , in spite , though , although
Conj_Infer therefore , furthermore , consequently , thus ,
as a result , eventually , hence
Conditionals if
Strong_Mod might , could , can , would , may
Weak_Mod should , ought to , need not , shall , will , must
Neg not, neither , never , no , nor
Table : Discourse Relations and Semantic Operators Essential for Sentiment Analysis

Relations (Contd..)
1. Modals :
• Strong_Mod is the set of modals that express a higher degree of
uncertainity in any situation.
•Eg: He may be a rising star
• Weak_Mod is the set of modals that express lesser degree of
uncertainty and more emphasis on
certain events or situations.
Eg: Our civil service should work without interference.

Relations (Contd..)
2. Negation :
 The negation operator (Example: not, neither, nor, nothing
etc.) inverts the sentiment of the word following it.
Eg : I do not like Nokia but I like Samsung.

Algorithm Proposed :
 Discourse Relations and Semantic Operators are used to
create feature vectors
 Let user post R consist of ‘m’ sentences si(i=1…m) where
si consist of n words
wij (i=1…m , j=1..n)

Algorithm Proposed (Cond..): Feature
Variables
 Let A be the set of all discourse relations as described
before.
 Feature Variables :
 fij : Weight of the word in sentence , initialized to 1
 flipij : Indicates weather the polarity of word should be flipped or not
 hypij : Indicates the presence of conditional or strong modal in sentence
 Iterate the algorithm for every sentence :
for i = 1 … m

Algorithm Proposed (Cond..): Step 1
 Deals with conditional and strong modals
 Algo :
 for j =1..n
hypij = 0;
if wij == Conditional OR Strong _Mod :
hypij = 1;
end if
end for

 Deals with Conj_Fol and Conj_Infer
 Algo :
 for j =1..n
fij = 1;
if wij == Conj_Fol OR Conj_Infer :
for k=j+1 .. N and w != A
fik + = 1;
end for
end if
end for

 Deals with Conj_Prev
 Algo :
 for j =1..n
fij = Taken from Step 2;
if wij == Conj_Prev :
for k=1 .. j-1 and w != A
fik + = 1;
end for
end if
end for

 Deals with Negation
 Algo :
 for j =1..n
flipij = 1;
if wij == Neg :
for k=1 .. Neg_Window and w != (Conj_Prev OR Conj_Fol)
flipi,j+k = -1;
end for
end if
end for

Feature Vector Classification
 We devised a lexicon based
system as well as a supervised
system for feature vector
 Lexicon Based Classification :
◦ Here :
 fij : weight of each word in sentence
 flipij : indicates polarity of word should
be flipped or not
 hypij : indicates the presence of
conditional or strong_mod

Feature Vector Classification (Contd..)
 Supervised Classification :
◦ SVM’s are used to classify the set of feature vectors {flipij, wij, fij
and hypij}.
◦ Features used :
 N-grams
 Stop Words
 Feature Weight
 Modal and Conditional Indicator
 Stemming
 Negation
 Emoticons
 POS
 Feature Space

Evaluation : Data Sets Statistics
 Data Set 1 :
◦ 8507 tweets are collected based on a total of around 2000 different entities from
20 different domains.
◦ Four classes - positive, negative, objective-not-spam and objective-spam.
 Data Set 2 :
◦ Hashtags #positive, #joy, #excited, #happy etc. are used to collect tweets
bearing positive sentiment,
◦ Hashtags like #negative, #sad, #depressed, #gloomy, #disappointed etc. are
used to collect negative tweets.
 Data Set 3 :
◦ Travel Review Dataset contains 595 polarity-tagged documents for each of the
positive and negative classes.

Evaluation : Data Sets Statistics
 Base Line System : C-Feel-It (Joshi et al., 2011) Rule Based System
 The only difference between the two systems is the handling of
connectives , modals, conditionals and negation
Manually Annotated DataSet 1
#Positive #Negative #ObjectiveNotSpam #Objective #Total
2548 1209 2757 1993 8507
Auto Annotated DataSet 2
#Positive #Negative Total
7348 7866 15214
Table : Twitter Datasets 1 and 2 Statistics

Evaluation : Accuracy Measures
Table : Accuracy Comparision b/w Base Line System and Discourse System using Lexicon in Data Set 1 and 2
Accuracy Measure for 2 Class Classification : DataSet 1
Base Line System Discourse System
68.58 72.81
57.2 61.31
80.55 84.91

Fig : Accuracy Comparision b/w Base Line System and Discourse System using Lexicon in Data Set 1 and 2
0
10
20
30
40
50
60
70
80
90
Accuracy Measure for 2 Class Classification
: DataSet 1
: DataSet 1
: DataSet 2
BaseLineSystem
Discourse System

Table : Accuracy Comparison b/w Base Line SVM System and Discourse SVM System in Data Set 1 and 2
69.49 70.75
63.11 64.23
91.99 93.01

Fig : Accuracy Comparison b/w Base Line SVM System and Discourse SVM System in Data Set 1 and 2
0
10
20
30
40
50
60
70
80
90
100
: DataSet 1
: DataSet 1
: DataSet 2
BaseLineSystem
Discourse System

•This evaluation is used to determine whether our discourse based approach
performs well for structured text as well
Sentiment Evaluation Criterion Accuracy
Baseline Bag-of-Words Model 69.62
Bag-of-Words Model + Discourse 71.78
Table : Accuracy Comparision between Bag of Word and Siscourse System using Lexicon in DataSet 3

 Supervised Classification Approach under 4 scenarios :
◦ When only unigrams are used
◦ When only sense of unigrams are used
◦ When unigrams are used along with their senses (synset-id’s)
◦ When unigrams are used with senses and discourse features
Discourse Analysis 37Note : IWSD : Iterative Word Sense Disambiguation ((Khapra et al., 2010)
System Accuracy(%)
Only Unigrams 84.9
Only IWSD Sense of Unigrams 85.48
Unigrams + IWSD Sense of Unigrams 86.08
Unigrams + IWSD Sense of Unigrams +
Discourse 88.13
Table : Accuracy Comparison In Dataset 3

Discourse Analysis 38Note : IWSD : Iterative Word Sense Disambiguation ((Khapra et al., 2010)
Figure: Accuracy Comparison In Dataset 3
84.9
85.48
86.08
88.13
83
84
85
86
87
88
89
Unigram IWSD Sense of Unigram Unigram + IWSD Sense of Unigram Unigram + IWSD Sense of Unigram +
Discourse
Accuracy(%)

Conclusion
 Most of the works on unstructured data uses bag of words
approach , ignores discourse markers
 Evaluation on different dataset show that incorporating
discourse information improves the performance by 2-4%
 The approach presented can deal with unstructured data as
it doesn’t need to parse the document.

Questions for You
 What feature variables are used in the algorithm for
incorporating discourse information?
 What is the advantage of the presented approach over
traditional approaches ?

Sentiment Analysis in Twitter with Lightweight Discourse Analysis

In this document

More Related Content

What's hot

Viewers also liked

Similar to Sentiment Analysis in Twitter with Lightweight Discourse Analysis

Recently uploaded

Sentiment Analysis in Twitter with Lightweight Discourse Analysis