Sarcasm Detection:
Achilles Heel of Sentiment Analysis
Anuj Gupta
Independent Researcher
Former Director - ML, Huawei Technologies
“Technical talk right after lunch!!
That is exactly what every speaker wants.
Thanks, Naresh and ODSC team.”
Sarcasm
● Greek: sarkázein (speak bitterly, use of irony to mock)
French: sarcasme
● Nuanced form of language where the speaker often explicitly states the opposite of
what she implies.
● Deliberately meaning the opposite of what is on the surface.
“This talk looks like great fun ;)”
Importance of sarcasm detection
Business Perspective:
● Organizations tap into social media for public opinion on their products &
services and real time customer assistance.
● To assist this, sentiment analysis is a key offering in any and every CRM tool.
● Customers often use sarcasm to express their frustration with
products/services.
● Most sentiment analysis systems (SAS) fail to detect sarcasm and wrongly
infer the sentiment
[Screenshots: Stanford’s sentiment analysis demo; Aylien sentiment analysis demo]
● Both systems were fooled by the word “love”.
● Most SAS lack the sophistication needed to detect sarcasm.
● This places extra burden on customer care teams.
● Owing to the volume and velocity of traffic, the subtlety of language, and background &
cultural differences, agents can miss sarcasm completely.
● Missing or misinterpreting sarcasm = PR disasters for brands.
Research Perspective:
● Much like Q&A, text summarization, and machine translation, sarcasm detection
involves the complexity of language and is believed to be a much harder task.
● Any progress in sarcasm detection is a positive step towards pushing the
boundaries of NLP.
● Only recently have people started to look into it.
● With improvement in our understanding and approaches to sentiment
analysis, researchers started focusing on more difficult cases
○ Aspect based sentiment analysis
○ Sarcasm detection
So, be it from a business or a research perspective, it is worth
investing time and energy in sarcasm detection.
Key Characteristics
● Sarcasm: “a sharp, bitter, or cutting expression or remark; a bitter gibe or taunt”.
● Sarcasm is negative sentiment.
○ You are never sarcastically positive
What makes sarcasm detection difficult?
● It is deliberate: people employ play of language.
● It is subtle: it may be just a word, a phrase, or a punctuation mark here and there.
● Even humans can find it hard to understand.
● Sarcasm is often used on social media platforms like Twitter.
● Sarcasm on Twitter comes with additional challenges: fewer word cues (280
characters), spelling mistakes, acronyms, slang words, ever-evolving vocabulary.
Problem Statement
Business problem: Build a sentiment analysis system capable of handling
sarcasm.
Abstract problem: Given an unlabeled tweet T from user U, a solution should
automatically detect if T is sarcastic or not.
[Flow diagram: text → Sarcasm? If yes → Negative Sentiment; if no → Sentiment Analysis]
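The flow in the problem statement (sarcastic tweets are treated as negative sentiment, everything else goes to regular sentiment analysis) can be sketched roughly as follows. The names `analyze`, `toy_sarcasm_detector` and `toy_sentiment` are hypothetical stand-ins for illustration; they are not the models from the talk.

```python
from typing import Callable


def analyze(text: str,
            is_sarcastic: Callable[[str], bool],
            sentiment_of: Callable[[str], str]) -> str:
    """Route a tweet: sarcastic text is labelled negative, the rest goes to sentiment analysis."""
    if is_sarcastic(text):
        return "negative"
    return sentiment_of(text)


# Hypothetical stand-ins for trained models, for illustration only.
def toy_sarcasm_detector(text: str) -> bool:
    return "#sarcasm" in text.lower()


def toy_sentiment(text: str) -> str:
    return "positive" if "love" in text.lower() else "neutral"


sarcastic = analyze("I love waiting two hours on hold #sarcasm",
                    toy_sarcasm_detector, toy_sentiment)
plain = analyze("I love this phone", toy_sarcasm_detector, toy_sentiment)
print(sarcastic, plain)
```

The sarcasm check runs first so that a word like “love” cannot pull a sarcastic tweet towards a positive label.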
Scope and Assumptions
● Consider the following sarcastic remark: “If Hillary wins, she will surely be pleased to
recall Monica each time she enters Oval office”.
Detecting this requires:
● Anaphora resolution
● Fact extraction
● Logical reasoning
● Such complex cases are beyond the scope of this work.
● Further, we assume all information necessary to detect sarcasm is contained
in the same sentence (Twitter data).
[Detecting sarcasm in paragraphs and articles is a much harder problem]
Dataset
Manually identified sources for sarcasm:
● Hashtags : #sarcasm, #not, #irony
● Handles : @sarcastic_us, @heissarcastic, @SarcasmMsg ….
What is not sarcasm? Everything else.
● For this we also used Twitter datasets for sentiment analysis.
● Because tweets are short (280 characters), all information necessary to detect
sarcasm is contained in the same sentence.
● After cleaning, we were left with ~100K data points, ~50K per class.
● We built test data of 20K data points in a similar fashion, but from a different timeline.
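A minimal sketch of this kind of labelling, assuming each tweet arrives as a text plus an author handle. The function names are hypothetical, and removing the marker hashtags from the text is an assumed cleaning step that the slides do not describe.

```python
import re

# Marker hashtags and handles listed on the slide (lowercased for matching).
SARCASM_HASHTAGS = {"#sarcasm", "#not", "#irony"}
SARCASM_HANDLES = {"@sarcastic_us", "@heissarcastic", "@sarcasmmsg"}

_MARKER_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(h) for h in sorted(SARCASM_HASHTAGS)) + r")\b",
    re.IGNORECASE,
)


def label_tweet(text: str, author: str) -> int:
    """Return 1 if the tweet carries a marker hashtag or comes from a marker handle, else 0."""
    tags = {t.lower() for t in re.findall(r"[#@]\w+", text)}
    if tags & SARCASM_HASHTAGS or author.lower() in SARCASM_HANDLES:
        return 1
    return 0


def strip_markers(text: str) -> str:
    """Drop marker hashtags so a model cannot just memorize the label (assumed step)."""
    return " ".join(_MARKER_RE.sub("", text).split())


print(label_tweet("Great, another Monday #sarcasm", "@someone"))
print(strip_markers("Great, another Monday #sarcasm"))
```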
Literature Survey
● Until very recently, hand coded features were used extensively.
○ Unigrams, bigrams, trigrams, n-grams, dictionary-based lexical features.
○ Pragmatic features such as emoticons, capitalization, punctuation.
○ Presence of a positive sentiment in close proximity of a negative situation phrase as a
feature for sarcasm detection.
○ Features based on frequency (gap between rare and common words)
○ Incongruous: number of times a word is followed by a word of opposite polarity.
○ # positive words, # negative words, length of longest sequences without polarity flip.
[Tsur et al., 2010, González-Ibáñez et al., 2011, Riloff et al., 2013, Buschmeier et al.,
2014, Joshi et al., 2015]
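To make the lexicon-based features concrete, here is a small sketch that counts positive and negative words, polarity flips, and the longest run of tokens without a flip. The two tiny lexicons and the exact definition of a "run" are assumptions for illustration; the cited papers each define their features in their own way.

```python
# Toy polarity lexicons: illustrative assumptions, not the lexicons used in the cited work.
POSITIVE = {"love", "great", "fun", "awesome"}
NEGATIVE = {"pain", "hate", "stuck", "broken"}


def polarity(word: str) -> int:
    """Return +1, -1 or 0 for a single token, ignoring case and edge punctuation."""
    w = word.lower().strip(".,!?;:\"'")
    if w in POSITIVE:
        return 1
    if w in NEGATIVE:
        return -1
    return 0


def lexical_features(text: str) -> dict:
    """Count polar words, polarity flips, and the longest token run without a flip."""
    pols = [polarity(w) for w in text.split()]
    flips = 0
    run = longest = 0
    last = 0  # polarity of the most recent polar word seen
    for p in pols:
        if p != 0 and last != 0 and p != last:
            flips += 1
            run = 0  # the flipping word starts a new run
        run += 1
        longest = max(longest, run)
        if p != 0:
            last = p
    return {
        "n_positive": sum(1 for p in pols if p > 0),
        "n_negative": sum(1 for p in pols if p < 0),
        "n_flips": flips,
        "longest_run_without_flip": longest,
    }


features = lexical_features("I love the pain present in the breakups")
print(features)
```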
● These features are based on certain observations in the dataset. Thus, they
are mostly dataset specific.
● While this can give great performance on the dataset at hand, from a product
point of view one would like to have more robust features.
● Features that are not brittle and generalize to other datasets.
● So people started to apply DL models.
Baseline
● Treated this as a binary classification problem.
● Single-layer RNNs (LSTM, GRU).
● Failed to generalize (F1 score of ~68%).
● Owing to not having enough data, they overfit very quickly.
● A simple CNN did far better (F1 score of ~76%).
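As a reminder of what the F1 figures measure, the sketch below computes F1 for a binary task from true and predicted labels. The label vectors are made-up toy values, not results from the talk.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 1 = sarcastic, 0 = not sarcastic.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(round(score, 3))
```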
Need for stronger signals
Literature on sarcasm detection has typically used three clues:
1. Sentiment
2. Emotion
3. Personality
Let us understand each one of them in detail.
Sentiment
● Most sarcastic sentences show a shift in sentiment
“I love the pain present in the breakups”
(shift in sentiment)
● There is a contradiction between sentiment of “love” and “pain of breakups”.
This is a hallmark of sarcasm.
● Thus, including sentiment clues should help in sarcasm detection.
Traditionally this was done via sentiment lexicons.
○ # negative words, # positive words, # sentiment shifts across adjacent words
● Instead, we use features extracted from a neural network trained for sentiment.
Emotion
● Emotion: feelings such as happiness, anger, jealousy, grief, etc. One can
have many emotions simultaneously. Subjective in nature.
● Sentiment: opinion or mental attitude produced by emotions about something.
This is much more objective.
● Sarcastic sentences are rich in emotions.
● “My stellar programming career: job offer; Ctrl C, Ctrl V; resignation. Repeat”
○ Pain, sadness, anger, disgust, etc.
● Thus, including emotion clues should help.
● We use features extracted from a neural network trained for emotion.
Personality
● There is a body of work that argues that sarcasm is not just a linguistic
phenomenon but also a behavioral one, i.e. what matters is not just what is
being said but also who is saying it.
● That is, sarcasm is user specific: some users have a stronger tendency to be
sarcastic than others*.
● This body of work factors in the history of the user in question to derive
features to model behavioral aspect. For this, they use past tweets of the
user.
● Researchers have shown substantial gains using personality based signals.
* There are systematic studies that establish positive correlation between ability to create & understand sarcasm and higher cognitive ability.
Personality
● However, from the viewpoint of a production system this is very challenging.
● One has to either store features from every user's past timeline,
● or retrieve a user's history at run time and compute features on the fly.
● Given typical volumes, both choices have severe implications for
throughput and resources.
● Hence, we did not use personality-based indicators.
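Our solution in a nutshell
[Architecture diagram: Text is fed to three models. The sentiment model yields sentiment
features, the emotion model yields emotion features, and the baseline model yields features
from the baseline. The three feature sets feed the final classifier (CNN / linear models).]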
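One way to picture the combination step is to concatenate the three feature vectors and score them with a linear model. This is only a sketch: the three extractor functions, the weights, and the bias are hypothetical placeholders, whereas in the talk the features come from trained networks and the final classifier is learned.

```python
import math
from typing import Callable, List, Sequence

Extractor = Callable[[str], List[float]]


def fuse_features(text: str, extractors: Sequence[Extractor]) -> List[float]:
    """Concatenate the feature vectors produced by each upstream model."""
    fused: List[float] = []
    for extract in extractors:
        fused.extend(extract(text))
    return fused


def linear_classifier(features: Sequence[float],
                      weights: Sequence[float],
                      bias: float) -> float:
    """Logistic regression score: probability that the text is sarcastic."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for the sentiment, emotion and baseline feature extractors.
def sentiment_features(text: str) -> List[float]:
    return [1.0 if "love" in text.lower() else 0.0]


def emotion_features(text: str) -> List[float]:
    return [1.0 if "pain" in text.lower() else 0.0]


def baseline_features(text: str) -> List[float]:
    return [float(text.count("!"))]


x = fuse_features("I love the pain present in the breakups",
                  [sentiment_features, emotion_features, baseline_features])
p = linear_classifier(x, weights=[1.5, 1.5, 0.5], bias=-2.0)  # made-up parameters
print(x, round(p, 3))
```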
Details
1. Sentiment model
2. Emotion model
3. Final Classifier
Sentiment Model
● Objective is to build a sentiment model whose penultimate layer will
be used to extract features.
● We built a (standard) CNN for this.
○ Alternating layers of convolution and max pooling
○ Followed by fully connected layers
○ Softmax output
● Text is tokenized (we used the tweet tokenizer by Alan Ritter).
● Embedding layer is initialized using pretrained GloVe word embeddings
for Twitter.
Sentiment Model
● To train this network we used a dataset for sentiment analysis.
○ 3 classes - negative, positive and neutral
○ Public dataset + custom data
● All convolutions are 1D convolutions.
○ Height of the filter varies.
○ Width of the filter is same as that of embedding dimension.
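The filter geometry described here (height varies, width equals the embedding dimension) means each filter slides only along the token axis. A dependency-free sketch, assuming a ReLU activation and max-over-time pooling, with toy 2-dimensional embeddings and hand-picked filter weights:

```python
from typing import List

Matrix = List[List[float]]  # rows = tokens, columns = embedding dimensions


def conv1d_max(sentence: Matrix, kernel: Matrix, bias: float = 0.0) -> float:
    """Slide a (height x emb_dim) filter over token positions, apply ReLU, then max-pool."""
    height = len(kernel)
    responses = []
    for start in range(len(sentence) - height + 1):
        window = sentence[start:start + height]
        total = bias + sum(
            w * x
            for k_row, s_row in zip(kernel, window)
            for w, x in zip(k_row, s_row)
        )
        responses.append(max(0.0, total))  # ReLU (assumed activation)
    return max(responses) if responses else 0.0


# Four tokens with toy 2-dimensional embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Two filters with different heights (2 and 3) but the same width as the embedding.
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
]
pooled = [conv1d_max(sentence, k) for k in filters]
print(pooled)
```

Each filter contributes one pooled value, so the length of `pooled` equals the number of filters regardless of tweet length.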
Emotion Model
● Objective is to build an emotion model whose penultimate layer will be used
to extract features.
● We built a (standard) CNN for this.
○ Alternating layers of convolution and max pooling
○ Followed by fully connected layers
○ Softmax output
● Text is tokenized
● Embedding layer is initialized using pretrained GloVe word embeddings
for Twitter.
Emotion Model
● To train this network we used a dataset for emotion analysis.
○ 6 classes - anger, disgust, surprise, sadness, joy and fear
○ Public dataset + custom data
● All convolutions are 1D convolutions.
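The idea of reusing the penultimate layer can be illustrated with a tiny fully connected head: the hidden activations are returned as features, and the softmax over the six emotion classes is used only during training. All weights below are hand-picked toy values and the layer sizes are assumptions.

```python
import math
from typing import List, Tuple

EMOTIONS = ["anger", "disgust", "surprise", "sadness", "joy", "fear"]


def dense(x: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [b + sum(w * xi for w, xi in zip(row, x)) for row, b in zip(weights, biases)]


def softmax(z: List[float]) -> List[float]:
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def forward(pooled, hidden_w, hidden_b, out_w, out_b) -> Tuple[List[float], List[float]]:
    """Return (penultimate activations, class probabilities)."""
    hidden = [max(0.0, h) for h in dense(pooled, hidden_w, hidden_b)]
    probs = softmax(dense(hidden, out_w, out_b))
    return hidden, probs


# Toy parameters: 2 pooled inputs -> 3 hidden units -> 6 emotion classes.
hidden_w = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
hidden_b = [0.0, 0.0, 0.0]
out_w = [[0.0, 0.0, 0.0] for _ in EMOTIONS]
out_w[EMOTIONS.index("sadness")] = [1.0, 1.0, 0.0]
out_b = [0.0] * len(EMOTIONS)

emotion_feats, probs = forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
print(emotion_feats, EMOTIONS[probs.index(max(probs))])
```

Downstream, only `emotion_feats` would be passed on; the class probabilities are discarded once the network is trained.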
Model Architecture
● Owing to scarcity of data, we kept the networks simple.
● The embedding layer was frozen, not fine-tuned.
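Our solution in a nutshell
[Architecture diagram: Text is fed to three pretrained models. The pretrained sentiment model
yields sentiment features, the pretrained emotion model yields emotion features, and the
pretrained baseline model yields features from the baseline. The three feature sets feed the
final classifier (CNN / linear models).]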
Results
● Test Data came from a different timeline.
● ~20K balanced test set.
Future work
● Train our own word embeddings
● Character n-gram embeddings
● Retry RNNs
● Attention networks
● Collect more data!
○ Collecting the right data for the negative class (not sarcasm) is very important.
○ Adding public datasets of sentiment to the negative class helped us a lot.
● It will be interesting to see the impact of factoring in user behaviour.
Learnings
● Sarcasm detection is an important problem.
● It is difficult:
○ Long-term dependencies
○ Subtle changes of a word or punctuation can flip the polarity
○ Needs facts and external knowledge
● Present sentiment analysis systems are bad at detecting sarcasm.
● Pretrained, sub-task-specific CNNs can work for text as well.
● This is an example of domain knowledge + Deep Learning.
● Data collection strategy is important.
○ Comprehensively collecting what is not sarcasm.
○ Adding public datasets of sentiment to the negative class helped us a lot.
Thank You
Questions?
@anujgupta82
/anujgupta-82

 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
 

Sarcasm Detection: Achilles Heel of sentiment analysis

  • 1. Sarcasm Detection: Achilles Heel of Sentiment Analysis Anuj Gupta Independent Researcher Former Director - ML, Huawei Technologies
  • 2. “Technical talk right after lunch!! That is exactly what every speaker wants. Thanks, Naresh and ODSC team.”
  • 3.
  • 4. Sarcasm ● Greek: sarkázein (speak bitterly, use of irony to mock) French: sarcasme ● Nuanced form of language where the speaker often explicitly states the opposite of what she implies. ● Deliberately meaning the opposite of what is on the surface. “This talk looks like great fun ;)”
  • 5. Importance of sarcasm detection Business Perspective: ● Organizations tap into social media for public opinion on their products & services and for real-time customer assistance. ● To assist with this, sentiment analysis is a key offering in almost every CRM tool. ● Customers often use sarcasm to express their frustration with products/services.
  • 6. ● Most sentiment analysis systems (SAS) fail to detect sarcasm and wrongly infer the sentiment. ● Both systems shown (Stanford’s sentiment analysis demo and the Aylien sentiment analysis demo) were fooled by the word “love”. ● Most SAS lack the sophistication needed to detect sarcasm.
  • 7. ● This places an extra burden on customer care teams. ● Owing to the volume and velocity of traffic, the subtlety of language, and background and cultural differences, agents can miss sarcasm completely. ● Missing or misinterpreting sarcasm can mean PR disasters for brands.
  • 8. Research Perspective: ● Much like QnA, text summarization, and machine translation, sarcasm detection involves the complexity of language and is believed to be a much harder task. ● Any progress in sarcasm detection is a positive step towards pushing the boundaries of NLP. ● Only recently have people started to look into it. ● With improvement in our understanding of and approaches to sentiment analysis, researchers started focusing on more difficult cases ○ Aspect-based sentiment analysis ○ Sarcasm detection So, be it from a business or a research perspective, it is worth investing time and energy in sarcasm detection.
  • 9. Key Characteristics ● Sarcasm: “a sharp, bitter, or cutting expression or remark; a bitter gibe or taunt”. ● Sarcasm is negative sentiment. ○ You are never sarcastically positive. What makes sarcasm detection difficult? ● It is deliberate: people employ play of language. ● It is subtle: it can hinge on just a word, a phrase, or a punctuation mark here and there. ● Even humans can find it hard to understand. ● Sarcasm is often used on social media platforms like Twitter. ● Sarcasm on Twitter comes with additional challenges: fewer word cues (280 characters), spelling mistakes, acronyms, slang words, ever-evolving vocabulary.
  • 10. Problem Statement Business problem: Build a sentiment analysis system capable of handling sarcasm. Abstract problem: Given an unlabeled tweet T from user U, a solution should automatically detect if T is sarcastic or not. [Flowchart: text → Sarcasm? → Yes: Negative sentiment; No: Sentiment analysis]
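The flow on this slide (check for sarcasm first, and fall back to ordinary sentiment analysis otherwise) can be sketched as a small routing function. This is only an illustration: `analyze`, `toy_sarcasm_detector`, and `toy_sentiment` are hypothetical names, and the two toy functions stand in for trained models that the talk does not show.

```python
from typing import Callable


def analyze(text: str,
            is_sarcastic: Callable[[str], bool],
            sentiment: Callable[[str], str]) -> str:
    """Route a text through the sarcasm check before sentiment analysis."""
    if is_sarcastic(text):
        # The talk treats sarcasm as negative sentiment.
        return "negative"
    # Not sarcastic: defer to the regular sentiment analysis system.
    return sentiment(text)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_sarcasm_detector(text: str) -> bool:
    return "#sarcasm" in text.lower()


def toy_sentiment(text: str) -> str:
    return "positive" if "love" in text.lower() else "neutral"


if __name__ == "__main__":
    print(analyze("I love waiting on hold #sarcasm",
                  toy_sarcasm_detector, toy_sentiment))
    print(analyze("I love this phone", toy_sarcasm_detector, toy_sentiment))
```

Keeping the detector and the sentiment system as separate callables means either one can be replaced without touching the routing logic.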
  • 11. Scope and Assumptions ● Consider the following sarcastic remark: “If Hillary wins, she will surely be pleased to recall Monica each time she enters Oval office”. Detecting this requires: ● Anaphora resolution ● Fact extraction ● Logical reasoning ● Such complex cases are beyond the scope of this work. ● Further, we assume all information necessary to detect sarcasm is contained in the same sentence (Twitter data). [Detecting sarcasm in paragraphs and articles is a much harder problem]
  • 12. Dataset Manually identified sources for sarcasm: ● Hashtags: #sarcasm, #not, #irony ● Handles: @sarcastic_us, @heissarcastic, @SarcasmMsg …. What is not sarcasm? Everything else. ● For this we also used Twitter datasets for sentiment analysis. ● Because tweets are short (280 characters), all information necessary to detect sarcasm is taken to be contained in the same sentence. ● After cleaning, we were left with ~100K data points, ~50K per class. ● Built test data of 20K data points in a similar fashion, but from a different timeline.
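The labeling strategy on this slide amounts to distant supervision: a tweet counts as sarcastic if it carries one of the marker hashtags or comes from one of the listed handles. A minimal sketch follows. The function name `label_tweet` is hypothetical, and stripping the marker hashtags from the text is an assumption about what "cleaning" involved, not something the slides state.

```python
import re

SARCASM_TAGS = {"#sarcasm", "#not", "#irony"}
SARCASM_HANDLES = {"@sarcastic_us", "@heissarcastic", "@sarcasmmsg"}

# Matches only the exact marker tags, so "#nothing" is left untouched.
MARKER_RE = re.compile(r"#(?:sarcasm|not|irony)\b", flags=re.IGNORECASE)


def label_tweet(text: str, author: str = "") -> tuple[str, int]:
    """Return (cleaned_text, label) with label 1 = sarcastic, 0 = not."""
    tags = {t.lower() for t in re.findall(r"#\w+", text)}
    sarcastic = bool(tags & SARCASM_TAGS) or author.lower() in SARCASM_HANDLES
    # Assumption: remove the marker hashtags so that a model cannot
    # simply memorize them instead of learning from the wording.
    cleaned = " ".join(MARKER_RE.sub("", text).split())
    return cleaned, int(sarcastic)


if __name__ == "__main__":
    print(label_tweet("Great, another Monday #sarcasm"))
    print(label_tweet("Nice weather today"))
```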
  • 13. Literature Survey ● Until very recently, hand-coded features were used extensively. ○ Unigrams, bigrams, trigrams, n-grams, dictionary-based lexical features. ○ Pragmatic features such as emoticons, capitalization, punctuation. ○ Presence of a positive sentiment in close proximity to a negative situation phrase as a feature for sarcasm detection. ○ Features based on frequency (gap between rare and common words). ○ Incongruity: number of times a word is followed by a word of opposite polarity. ○ Number of positive words, number of negative words, length of the longest sequence without a polarity flip. [Tsur et al., 2010, González-Ibáñez et al., 2011, Riloff et al., 2013, Buschmeier et al., 2014, Joshi et al., 2015]
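To make the lexicon-based features concrete, here is a rough sketch that counts positive words, negative words, polarity flips between consecutive sentiment-bearing words, and the longest run of tokens without a flip. The tiny `POSITIVE` and `NEGATIVE` word sets and the function names are placeholders invented for illustration; the cited papers use real sentiment lexicons and their exact feature definitions may differ.

```python
# Toy lexicons (assumptions); real systems use full sentiment lexicons.
POSITIVE = {"love", "great", "fun", "awesome"}
NEGATIVE = {"pain", "hate", "awful", "breakups"}


def polarity(word: str) -> int:
    """+1 for a positive word, -1 for a negative word, 0 otherwise."""
    w = word.lower().strip(".,!?;:\"'")
    if w in POSITIVE:
        return 1
    if w in NEGATIVE:
        return -1
    return 0


def lexical_features(text: str) -> dict:
    pols = [polarity(w) for w in text.split()]
    signed = [p for p in pols if p != 0]
    # A flip is a sentiment word followed by the next sentiment word
    # of opposite polarity (neutral words in between are skipped).
    flips = sum(1 for a, b in zip(signed, signed[1:]) if a != b)

    longest = run = last = 0
    for p in pols:
        if p != 0 and last != 0 and p != last:
            run = 0  # polarity flipped: start a new run at this token
        run += 1
        if p != 0:
            last = p
        longest = max(longest, run)

    return {
        "n_pos": sum(1 for p in pols if p > 0),
        "n_neg": sum(1 for p in pols if p < 0),
        "flips": flips,
        "longest_no_flip": longest,
    }


if __name__ == "__main__":
    print(lexical_features("I love the pain present in the breakups"))
```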
  • 14. ● These features are based on certain observations in the dataset. Thus, they are mostly dataset specific. ● While this can give great performance on the dataset in hand, from a product point of view one would like to have more robust features: ● Features that are not brittle and generalize to other datasets. ● People started to apply deep learning (DL) models.
  • 15. Baseline ● Treated this as a binary classification problem. ● Single-layer RNNs (LSTM, GRU). ● Failed to generalize (F1 score of ~68%). ● Owing to not having enough data, they overfit very quickly. ● A simple CNN did far better (F1 score of ~76%).
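For reference, the F1 score quoted for the baselines is the harmonic mean of precision and recall. The sketch below computes it for one positive class; the slides do not say which averaging scheme was used, so treating "sarcastic" (label 1) as the positive class is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two true positives, one false positive, one false negative.
    print(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```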
  • 16. Need for stronger signals Literature on sarcasm detection has typically used 3 clues: 1. Sentiment 2. Emotion 3. Personality Let us understand each one of them in detail.
  • 17. Sentiment ● Most sarcastic sentences show a shift in sentiment: “I love the pain present in the breakups” (shift in sentiment) ● There is a contradiction between the sentiment of “love” and “pain of breakups”. This is a hallmark of sarcasm. ● Thus, including sentiment clues should help in sarcasm detection. Traditionally this was done via sentiment lexicons. ○ Number of negative words, number of positive words, number of sentiment shifts across adjacent words ● Instead, we use features extracted from a neural network trained for sentiment.
  • 18. Emotion ● Emotion: feelings such as happiness, anger, jealousy, grief, etc. One can have many emotions simultaneously. Subjective in nature. ● Sentiment: opinion or mental attitude produced by emotions about something. This is much more objective. ● Sarcastic sentences are rich in emotions. ● “My stellar programming career: job offer; Ctrl C, Ctrl V; resignation. Repeat” ● Pain, sadness, anger, disgust, etc. ● Thus, including emotion clues should help. ● We use features extracted from a neural network trained for emotion.
  • 19. Personality ● There is a body of work that argues that sarcasm is not just a linguistic phenomenon but also a behavioral phenomenon, i.e. it is not just about what is being said; who is saying it is also very important. ● That is, sarcasm is user specific: some users have a stronger tendency to be sarcastic than others*. ● This body of work factors in the history of the user in question to derive features that model the behavioral aspect. For this, they use past tweets of the user. ● Researchers have shown substantial gains using personality-based signals. * There are systematic studies that establish a positive correlation between the ability to create & understand sarcasm and higher cognitive ability.
  • 20. Personality ● However, from the viewpoint of a production system this is very challenging. ● One has to either store features from every user's past timeline, ● or retrieve a user's history at run time and compute features on the fly. ● Given typical volumes, both these choices have severe implications for throughput and resources. ● Hence, we did not use personality-based indicators.
  • 21. Our solution in a nutshell [Diagram: Text → Sentiment model → Sentiment features; Text → Emotion model → Emotion features; Text → Baseline model → Features from baseline; all three feature sets → Final classifier (CNN / linear models)]
  • 22. Details 1. Sentiment model 2. Emotion model 3. Final Classifier
  • 23. Sentiment Model ● The objective is to build a sentiment model whose second-to-last (penultimate) layer will be used to extract features. ● We built a (standard) CNN for this. ○ Alternating layers of convolution and max pooling ○ Followed by fully connected layers ○ Softmax output ● Text is tokenized (we used the tweet tokenizer by Alan Ritter). ● The embedding layer is initialized using pretrained GloVe word embeddings for Twitter.
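The idea of reusing the second-to-last layer as a feature extractor can be illustrated with a tiny forward pass in plain Python: the pooled convolution output goes through a fully connected hidden layer (the penultimate layer), then a softmax output layer, and the hidden activations are returned alongside the class probabilities. The function names, the ReLU activation, and the toy weights are assumptions for illustration; the slides do not give layer sizes or activations.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(pooled, w_hidden, b_hidden, w_out, b_out):
    """Return (penultimate_features, class_probabilities)."""
    # Penultimate layer; ReLU is an assumption, not stated in the talk.
    hidden = [max(0.0, v) for v in dense(pooled, w_hidden, b_hidden)]
    probs = softmax(dense(hidden, w_out, b_out))
    return hidden, probs


if __name__ == "__main__":
    feats, probs = forward(
        pooled=[1.0, 2.0],
        w_hidden=[[1, 0], [0, -1]], b_hidden=[0, 0],
        w_out=[[0, 0], [0, 0], [0, 0]], b_out=[0, 0, 0],  # 3 classes
    )
    print(feats, probs)
```

After training on the sentiment task, only `hidden` would be passed on to the sarcasm classifier; the softmax output is needed for training but not for feature extraction.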
  • 24. Sentiment Model ● To train this network we used a dataset for sentiment analysis. ○ 3 classes: negative, positive and neutral ○ Public dataset + custom data ● All convolutions are 1D convolutions. ○ The height of the filter varies. ○ The width of the filter is the same as the embedding dimension.
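A 1D text convolution of this kind slides a filter of height h (number of consecutive tokens) and width d (the embedding dimension) down the sentence, producing one activation per window. The sketch below applies a single filter and then takes the maximum over all positions. The ReLU and the pooling over the whole sentence are assumptions; the slides only say that convolution and max pooling layers alternate.

```python
def conv1d_max_pool(embeddings, filt, bias=0.0):
    """Apply one filter over token embeddings, then max-pool over positions.

    embeddings: list of n token vectors, each of length d.
    filt: list of h rows, each of length d (height h, width d).
    """
    h, n = len(filt), len(embeddings)
    if n < h:
        raise ValueError("sentence is shorter than the filter height")
    activations = []
    for start in range(n - h + 1):
        window = embeddings[start:start + h]
        # Elementwise product of the filter and the window, summed up.
        s = bias + sum(w * x
                       for f_row, x_row in zip(filt, window)
                       for w, x in zip(f_row, x_row))
        activations.append(max(0.0, s))  # ReLU (assumption)
    return max(activations)


if __name__ == "__main__":
    tokens = [[1, 0], [0, 1], [1, 1]]  # 3 tokens, embedding dimension 2
    bigram_filter = [[1, 0], [0, 1]]   # height 2, width 2
    print(conv1d_max_pool(tokens, bigram_filter))
```

Because the filter spans the full embedding width, it can only move along the token axis, which is what makes the convolution one-dimensional. Filters of different heights respond to n-grams of different lengths.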
  • 25.
  • 26. Emotion Model ● Build an emotion model whose second-to-last (penultimate) layer will be used to extract features. ● We built a (standard) CNN for this. ○ Alternating layers of convolution and max pooling ○ Followed by fully connected layers ○ Softmax output ● Text is tokenized. ● The embedding layer is initialized using pretrained GloVe word embeddings for Twitter.
  • 27. Emotion Model ● To train this network we used a dataset for emotion analysis. ○ 6 classes: anger, disgust, surprise, sadness, joy and fear ○ Public dataset + custom data ● All convolutions are 1D convolutions.
  • 28. Model Architecture ● Owing to the scarcity of data, we kept the networks simple. ● The embedding layer was frozen, not fine-tuned.
  • 29. Our solution in a nutshell [Diagram: Text → Pretrained sentiment model → Sentiment features; Text → Pretrained emotion model → Emotion features; Text → Pretrained baseline model → Features from baseline; all three feature sets → Final classifier (CNN / linear models)]
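The final stage combines the features extracted from the three pretrained models and feeds them to a classifier. One simple reading is sketched below: concatenate the three feature vectors and score them with a logistic-regression style linear model. Concatenation as the fusion step, the function names `fuse` and `predict_sarcasm`, and the toy numbers are all assumptions; the slides only name "CNN / linear models" as the final classifier.

```python
import math


def fuse(sentiment_feats, emotion_feats, baseline_feats):
    """Concatenate the three feature vectors into one input vector."""
    return list(sentiment_feats) + list(emotion_feats) + list(baseline_feats)


def predict_sarcasm(features, weights, bias=0.0):
    """Linear model with a sigmoid: probability that the text is sarcastic."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    x = fuse([0.5, 0.1], [0.2], [0.9])  # toy feature values
    print(len(x), predict_sarcasm(x, weights=[1.0, 1.0, 1.0, 1.0]))
```

In this arrangement the three upstream models stay fixed and only the final classifier's weights are learned on the sarcasm data, which keeps the number of parameters trained on the small sarcasm dataset low.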
  • 30. Results ● Test data came from a different timeline. ● ~20K balanced test set.
  • 31. Future work ● Train your own word embeddings ● Character n-gram embeddings ● Retry RNNs ● Attention networks ● Collect more data! ○ Collecting the right data for the negative class (not sarcasm) is very important. ○ Adding public sentiment datasets to the negative class helped us a lot. ● It will be interesting to see the impact of factoring in user behaviour.
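As a pointer for the character n-gram item, the sketch below shows how a token can be split into overlapping character n-grams, which is the usual first step before learning embeddings for them. Misspelled words still share most of their n-grams with the correct spelling, which is relevant given the spelling noise on Twitter. The function name `char_ngrams`, the boundary marker "#", and the default n = 3 are illustrative choices, not details from the talk.

```python
def char_ngrams(token: str, n: int = 3, pad: str = "#") -> list[str]:
    """Split a token into overlapping character n-grams with boundary marks."""
    padded = pad + token + pad
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    print(char_ngrams("love"))
    # A misspelling shares some n-grams with the intended word.
    print(set(char_ngrams("stellar")) & set(char_ngrams("steller")))
```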
  • 32. Learnings ● Sarcasm detection is an important problem. ● It is difficult: ○ Long-term dependencies ○ Subtle changes of a word or punctuation can flip the polarity ○ Needs facts and external knowledge ● Present sentiment analysis systems are bad at detecting sarcasm. ● Pretrained, sub-task-specific CNNs can work for text as well. ● This is an example of domain knowledge + deep learning. ● Data collection strategy is important ○ Comprehensively collecting what is not sarcasm. ○ Adding public sentiment datasets to the negative class helped us a lot.