Sarcasm And Thwarting
Lekha
Deepali Gupta
Sagar Ahire
{lekha, gdeepali, sagarahire} @ cse.iitb.ac.in
Roadmap
Irony and Sarcasm
An Algorithm for Sarcasm Detection
Thwarting Detection
Irony and Sarcasm
Lekha | 133050002
Verbal Irony
“An irony is a figure of speech which implicitly
displays that the utterance situation was surrounded
by an ironic environment.”

There also exists Situational Irony
Ironic Environment
The speaker holds an expectation E at time t0; by time t1 the
expectation has failed, and the speaker has a negative emotional
attitude toward that failure.
Reasons for Expectation to Fail
Expectation E is caused by an action A:
1. E failed because A failed or could not be
performed because of another action B
2. E failed because A was not performed
Expectation E is not caused by any action:
3. E failed because of an action B
4. E accidentally failed

Types 1 and 3 have victims.
Sarcasm is irony with definite victims and counterfeited emotions.
Properties of an Ironic Environment
An utterance implicitly displays all three conditions of the
ironic environment when it:
1. Alludes to the speaker's expectation E
2. Intentionally violates one of the pragmatic principles
3. Implies the speaker's emotional attitude toward the
failure of E
Irony is recognized if any 2 of these 3 properties are recognized;
the irony itself conveys the third, unidentified property.
Allude to Speaker’s Expectation
Deepali baked a pizza to satisfy her hunger. She placed the pizza on the
table and in the meantime Sagar came and gobbled up the whole pizza.
Deepali said to Sagar:
a. I'm not hungry at all
b. Have you seen my pizza on the table?
c. I'll sleep well on a full stomach.
d. I'm really satisfied to eat the pizza.
e. Did you enjoy eating the pizza?
Violation of Pragmatic Principles
● Sincerity
  ○ You make a statement you believe
  ○ You ask a question whose answer you don’t know
  ○ You offer advice which will benefit the receiver
  ○ You thank when you are really grateful
● Propositional content
  ○ You thank for something that has been done for you
● Preparatory condition for an Offer
  ○ You offer something that you can really give
● Maxim of relevance
● Politeness principle
● Maxim of quantity
Emotional Attitude
● Tone and expressions
● Interjections: “Oh! The weather is so nice”
● The context implies the emotional attitude of the speaker
Semi-Supervised
Recognition of Sarcastic
Sentences
Deepali Gupta | 13305R001
Sarcasm
The activity of saying or writing the opposite of
what you mean, or of speaking in a way
intended to make someone feel stupid or show
that you are angry (Macmillan English
Dictionary)
Sarcasm manifests in other ways...
● “Love the cover” (book)
● “Be sure to save your purchase receipt”
(Smart Phone)
● “Great idea, now try again with a real
product development team” (e-reader)
● “Where am I?” (GPS device)
The Algorithm: Overview
1. Training Set: Sentences manually assigned
scores from 1 to 5, where 5 means clearly
sarcastic and 1 means no sarcasm
2. Create feature vectors from the labelled
sentences
3. Use these feature vectors to build a model
and assign scores to unlabelled examples
Step 1: Preprocessing of Data
1. Replace each appearance of a
product/company/author by generalized
[product], [company], [author], etc.
2. Remove all HTML tags and special symbols
from review text.
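A minimal sketch of these two preprocessing steps. The helper name, the name lists, and the regexes are illustrative assumptions, not taken from the paper:

```python
# Sketch of preprocessing: generalize named entities, strip HTML.
import html
import re

def preprocess(review_text, product_names, company_names):
    text = html.unescape(review_text)                 # decode HTML entities
    text = re.sub(r"<[^>]+>", " ", text)              # remove HTML tags
    for name in product_names:                        # generalize product names
        text = re.sub(re.escape(name), "[product]", text, flags=re.IGNORECASE)
    for name in company_names:                        # generalize company names
        text = re.sub(re.escape(name), "[company]", text, flags=re.IGNORECASE)
    text = re.sub(r"[^\w\s\[\].,!?'\"]", " ", text)   # drop special symbols
    return re.sub(r"\s+", " ", text).strip()          # collapse whitespace

print(preprocess("<b>Garmin</b> GPS &amp; maps", ["GPS"], ["Garmin"]))
# → [company] [product] maps
```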
Step 2: Creating Feature Vectors
Pattern Based Features:
1. Classify words into High Frequency Words (HFWs) and
Content Words (CWs)
All [product], [company] tags and punctuation marks are
HFWs.
2. A pattern is a sequence of HFWs with slots for CWs.
Example: “Garmin does not care about product quality or
customer support” has patterns “[company] does not CW
about CW CW” or “about CW CW or CW CW”, etc.
Pattern Matching
Each pattern yields a feature whose value reflects how well it
matches the sentence:
● 1: Exact Match — the full pattern appears in the sentence
● α: Sparse Match — additional non-matching words can
be inserted between pattern components
● γ·n/N: Incomplete Match — only n of the N pattern
components appear in the sentence, while some
non-matching words can be inserted in
between
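A sketch of the three match grades as feature values. The weights α = γ = 0.1 follow the original SASI paper's settings but should be treated as tunable assumptions here; the greedy in-order count is a simplification:

```python
# Sketch of graded pattern-match feature values.
ALPHA = GAMMA = 0.1

def contiguous(p, s):
    return any(s[i:i + len(p)] == p for i in range(len(s) - len(p) + 1))

def in_order_count(p, s):
    # greedily count pattern components appearing in order in the sentence
    i = 0
    for w in s:
        if i < len(p) and w == p[i]:
            i += 1
    return i

def match_weight(pattern, slotted_sentence):
    p, s = pattern.split(), slotted_sentence.split()
    if contiguous(p, s):
        return 1.0                      # exact match
    n = in_order_count(p, s)
    if n == len(p):
        return ALPHA                    # sparse match: gaps allowed
    if n > 1:
        return GAMMA * n / len(p)       # incomplete match: n of N components
    return 0.0

s = "[company] does not CW about CW CW or CW CW"
print(match_weight("[company] not CW", s))   # → 0.1
```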
Punctuation-Based Features
● Sentence length in words
● Number of “!” characters
● Number of “?” characters
● Number of quotes
● Number of capitalized/all-capital words

Features are normalized to [0, 1] by dividing them by the
maximal observed value.
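These features can be sketched as follows, including the normalization by each feature's maximum observed value (function names and example sentences are illustrative):

```python
# Sketch of the punctuation-based feature vector with [0, 1] normalization.
def punct_features(sentence):
    words = sentence.split()
    return [
        len(words),                               # sentence length in words
        sentence.count("!"),                      # number of "!" characters
        sentence.count("?"),                      # number of "?" characters
        sentence.count('"'),                      # number of quotes
        sum(1 for w in words if w[0].isupper()),  # capitalized words
    ]

def normalize(vectors):
    maxima = [max(col) or 1 for col in zip(*vectors)]  # avoid divide-by-zero
    return [[v / m for v, m in zip(vec, maxima)] for vec in vectors]

vecs = normalize([punct_features(s) for s in
                  ["Great product!!!", "Where am I?", 'I "love" the cover.']])
print(vecs[0])   # → [0.5, 1.0, 0.0, 0.0, 0.5]
```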
Step 3: Data Enrichment
● For each sentence in the training set, issue a
search engine query containing that
sentence
● Assign the same label to the newly extracted
sentences
Step 4: Classification
● Construct feature vectors for each sentence in the
training and test sets
● Compute the Euclidean distance from each test vector v
to the matching vectors in the training set
Let t_i, i = 1..k, be the k training vectors with the lowest Euclidean
distance to v. Then v is assigned a label by a weighted average of
its neighbours' labels, where Count(l), the count of vectors in the
training set with label l, serves as the weight.
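A weighted k-NN sketch of this classification step. Weighting each neighbour's label by its training-set frequency Count(l) is our reading of the slide, not necessarily the paper's exact formula:

```python
# Weighted k-NN sketch: label v by a Count(l)-weighted average of
# the labels of its k nearest training vectors.
import math
from collections import Counter

def classify(v, training, k=5):
    neighbours = sorted(training, key=lambda tv: math.dist(v, tv[0]))[:k]
    counts = Counter(label for _, label in training)   # Count(l)
    total = sum(counts[l] for _, l in neighbours)
    score = sum(counts[l] * l for _, l in neighbours) / total
    return round(score)

training = [([0, 0], 1), ([0, 1], 1), ([1, 0], 1), ([5, 5], 5), ([5, 6], 5)]
print(classify([0.2, 0.2], training, k=3))   # → 1
```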
Star Sentiment Baseline
● From a set of negative reviews (with 1–3
stars), classify sentences with strong positive
sentiment as sarcastic.
● Positive sentiment words are, e.g., “great”,
“best”, “top”, etc.
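This baseline can be sketched in a few lines; the positive word list below is an illustrative sample:

```python
# Star-sentiment baseline sketch: within 1-3 star (negative) reviews,
# flag sentences containing strong positive words as sarcastic.
POSITIVE = {"great", "best", "top", "love", "excellent"}

def baseline_is_sarcastic(sentence, stars):
    if stars > 3:                       # only negative (1-3 star) reviews
        return False
    words = {w.strip('.,!?"').lower() for w in sentence.split()}
    return bool(words & POSITIVE)       # strong positive word present?

print(baseline_is_sarcastic("Best GPS ever!", stars=1))   # → True
```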
Results

Table 1: 5-fold cross-validation results

              Precision  Recall  Accuracy  F-Score
punctuation   0.256      0.312   0.821     0.281
patterns      0.743      0.788   0.943     0.765
pat+punct     0.868      0.763   0.945     0.812
enrich punct  0.4        0.390   0.832     0.395
enrich pat    0.762      0.777   0.937     0.769
all: SASI     0.912      0.756   0.947     0.827

Table 2: Comparison with baseline

          Precision  Recall  False Pos  False Neg  F-score
Baseline  0.5        0.16    0.05       0.44       0.242
SASI      0.766      0.813   0.11       0.12       0.788
Thwarting Detection
Sagar Ahire | 133050073
Thwarting?
“The actors were good, the story was great, the
screenplay was a marvel of perfection and the
music was good too, but the movie couldn’t
hold my attention...”
Detecting Thwarting: The Big Picture
● Ascertain attributes of entity using ontology
● Find sentiment of each attribute in ontology
and the overall entity
● If there is a contrast, conclude thwarting has
occurred
Building the Domain Ontology
1. Identify key features of domain from a
corpus
2. Arrange them in a hierarchy
Notes:
● Very human-intensive
● One-time requirement
An Example Ontology
Movie
● Story Elements
  ○ Main Story
  ○ Dialogues
● Acting
  ○ Characters
● Music
  ○ Songs
  ○ Background Score
Approaches to Detect Thwarting
● Rule-based Approach
● Machine Learning-based Approach
Rule-based Approach
1. Get dependency parse for adjective-noun
dependencies
2. Identify polarities towards all nouns
3. Tag corresponding ontology nodes with
found polarities
4. If a contradiction across levels is found,
conclude that thwarting has taken place
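A toy sketch of the contradiction check in steps 3–4. The ontology grouping below is our reading of the slide's layout, and polarities mirror the example on the next slide (+1 positive, -1 negative):

```python
# Rule-based thwarting check: flag thwarting when the root-level
# polarity contradicts the aggregated polarity of the lower levels.
ONTOLOGY = {
    "movie": ["story elements", "acting", "music"],
    "story elements": ["main story", "dialogues"],
    "acting": ["characters"],
    "music": ["songs", "background score"],
}

def subtree_polarity(node, polarity):
    """Average the polarities found in a node's subtree (leaves carry them)."""
    children = ONTOLOGY.get(node, [])
    if not children:
        return polarity.get(node, 0)
    return sum(subtree_polarity(c, polarity) for c in children) / len(children)

def is_thwarted(root, polarity):
    below = sum(subtree_polarity(c, polarity)
                for c in ONTOLOGY[root]) / len(ONTOLOGY[root])
    return polarity.get(root, 0) * below < 0   # opposite signs: contradiction

polarity = {"movie": -1, "main story": 1, "dialogues": 1, "characters": 1,
            "songs": 1, "background score": -1}
print(is_thwarted("movie", polarity))   # → True
```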
Rule-based Approach: Example
Movie: negative
● Story Elements: positive
  ○ Main Story: positive
  ○ Dialogues: positive
● Acting: positive
  ○ Characters: positive
● Music: positive
  ○ Songs: positive
  ○ Background Score: negative
Machine Learning-based Approach
Proceeds in two phases:
1. Learning weights
2. Classifying documents
Learning Weights: Choices
1. Choices for loss function:
a. Linear loss
b. Hinge loss

2. Choices for percolation across ontology
levels:
a. Complete percolation
b. Controlled percolation
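The two loss-function choices can be illustrated in their generic forms (this is illustrative, not the paper's exact objective): linear loss penalizes the signed score directly, while hinge loss penalizes only margin violations.

```python
# Generic loss functions; y is the true label in {-1, +1},
# score is the model's output.
def linear_loss(y, score):
    return -y * score                    # decreases as the margin grows

def hinge_loss(y, score):
    return max(0.0, 1.0 - y * score)     # zero once the margin exceeds 1

print(hinge_loss(1, 2.0), hinge_loss(1, 0.5))   # → 0.0 0.5
```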
Classification: Features
● Convert the document into a feature vector. Examples:
  ○ Document polarity
  ○ Number of flips of sign
  ○ Longest contiguous subsequence of +ve values
  ○ Longest contiguous subsequence of -ve values
  ○ etc.
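These features can be sketched as computations over a document's sequence of per-segment polarity values (+1 / 0 / -1); the function name and dictionary keys are illustrative:

```python
# Sketch of the document-level thwarting features listed above.
def thwarting_features(polarities):
    nonzero = [p for p in polarities if p != 0]
    flips = sum(1 for a, b in zip(nonzero, nonzero[1:]) if a * b < 0)

    def longest_run(keep):
        best = cur = 0
        for p in polarities:
            cur = cur + 1 if keep(p) else 0
            best = max(best, cur)
        return best

    return {
        "document_polarity": sum(polarities),   # overall polarity
        "sign_flips": flips,                    # number of flips of sign
        "longest_pos_run": longest_run(lambda p: p > 0),
        "longest_neg_run": longest_run(lambda p: p < 0),
    }

print(thwarting_features([1, 1, 1, 1, -1]))
```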
Results
● Random baseline: 50%
● Rule-based approach: 56.3%
● ML-based approach (Linear loss, controlled
percolation): 81%
What’s the catch?
Requires sentiment as input!
● Current System: Document with Sentiment Information →
Thwarted or Not Thwarted
● Ideal System: Document →
Thwarted or Not Thwarted + Document Sentiment
Key Ideas
● Irony indicates the presence of an ironic environment,
with 3 properties
● Recognizing 2 of those 3 properties is enough to
recognize irony
● Sarcasm is irony with victims and counterfeited
emotions
● A semi-supervised, pattern-based algorithm detects
sarcasm well
● Thwarting is the phenomenon of polarity reversal at
a higher level of the ontology compared to the polarity
expressed at the lower levels
● Rule-based and machine-learning-based
approaches have been attempted for thwarting
References
● Akira Utsumi (1996). A unified theory of irony and its
computational formalization. In COLING, 962–967.
● Oren Tsur, Dmitry Davidov, Ari Rappoport (2010).
ICWSM – A Great Catchy Name: Semi-Supervised
Recognition of Sarcastic Sentences in Online Product
Reviews. In ICWSM, AAAI.
● Ankit Ramteke, Akshat Malu, Pushpak Bhattacharyya,
J. Saketha Nath (2013). Detecting Turnarounds in
Sentiment Analysis: Thwarting. In ACL 2013.
Questions?
