Tagging & Detecting Sarcasm
David M. Boyhan
dboyhan2@Illinois.edu
University of Illinois – Urbana Champaign
CS410 – DSO
December 2, 2016
Project Goals
• To create a browser‐based tool to easily tag sarcastic comments for 
development and refinement of a tagged corpus
• To use an NLP toolset to detect sarcastic comments
Web‐Based Tagging
• Develop a JavaScript bookmarklet or a Chrome extension that would 
allow users to highlight specific text, select the bookmarklet or 
extension and then have that tagged text (together with associated 
content, such as originating URL) automatically appended to a text 
file.
• This portion of the project turned out to be significantly more 
complex than anticipated. Further, because this feature is not a core 
focus of CS410, I ultimately decided to table it and concentrate on 
work‐arounds and existing tagged corpora.
Web‐Based Tagging – Issues Encountered
• Using JavaScript to copy text into an OS’s clipboard is not particularly difficult.
• Unfortunately, automatically copying data from an OS clipboard into a text file is 
significantly more difficult to do in an OS agnostic fashion, particularly when using 
tools such as JS. 
• The reason for this becomes fairly apparent when you look at the security 
features built into JS. JS is specifically architected to “sandbox” as many features 
as possible within the browser and thereby protect the host OS. Accordingly, 
writing a file to the OS (even one as ostensibly innocuous as a text file) is made 
*intentionally* difficult by JS. 
• There are OS‐specific work‐arounds that automatically monitor the clipboard and 
copy its contents to files. Similarly, it is relatively easy to use JS to pass 
clipboard contents along to another service (thus maintaining the sandbox). 
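One way to picture the "another service" work‐around is a small helper on the receiving side that appends each tagged selection, together with its originating URL, to a text file. The sketch below is a minimal illustration and not the tool built for this project: the function name `append_tag`, the JSON Lines layout, and the field names are all assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_tag(corpus_path, text, url, label="sarcastic"):
    """Append one tagged selection to a JSON Lines file and return the record.

    The record layout (text, url, label, tagged_at) is hypothetical.
    """
    record = {
        "text": text,
        "url": url,
        "label": label,
        "tagged_at": datetime.now(timezone.utc).isoformat(),
    }
    # Open in append mode so earlier tags are never overwritten.
    with Path(corpus_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Because the file is written outside the browser, the JS side only has to hand over the highlighted text and the page URL, which stays within the sandbox.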
Core Project ‐ Sarcasm Detection
Need …
Existing Products & Solutions
Found at 
spotter.com
Not much info on it.
Formal Definition of Sarcasm
• Sarcasm is “a sharp, bitter, or cutting expression or remark; a bitter gibe or 
taunt.” Sarcasm may employ ambivalence, although sarcasm is not 
necessarily ironic. “The distinctive quality of sarcasm is present in the 
spoken word and manifested chiefly by vocal inflections.” The sarcastic 
content of a statement will be dependent upon the context in which it 
appears. (From Wikipedia entry on “Sarcasm”)
• Typically this involves making a comment (verbal or written) in a manner 
(inflection, context, etc.) where the intent is to communicate the exact 
opposite of the literal meaning of the comment. 
• For example, “Nice job” when someone has failed to do something well.
NLP Definition of Sarcasm
• Sarcasm can be easily thought of, from a natural language processing 
perspective, as:
False Positive (or Negative) on Sentiment Polarity Analysis
False Negative is less common
• In other words, if a sentiment analysis was run that was not intended 
to detect sarcasm, False Positives would most likely be sarcastic.
• “I love waiting in line.” (Most common form.)
• “I hate receiving thoughtful presents.” (Less common.)
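The "false positive on polarity" framing can be sketched in a few lines: score the literal wording, then flag any comment whose literal polarity is the opposite of the intended one. This is a toy illustration under stated assumptions. The word lists, the function names, and the hand‐assigned "intended" labels are all hypothetical, and a real sentiment analyzer would replace `literal_polarity`.

```python
import re

# Toy lexicon, assumed purely for illustration; not a real sentiment resource.
POSITIVE_WORDS = {"love", "like", "great", "excellent", "nice"}
NEGATIVE_WORDS = {"hate", "dislike", "awful", "terrible", "bad"}


def literal_polarity(text):
    """Score the literal wording only, ignoring context and intent."""
    words = re.findall(r"[a-z]+", text.lower())
    score = (sum(w in POSITIVE_WORDS for w in words)
             - sum(w in NEGATIVE_WORDS for w in words))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sarcasm_candidates(labelled):
    """Return texts whose literal polarity is the opposite of the intended one."""
    opposite = {"positive": "negative", "negative": "positive"}
    return [text for text, intended in labelled
            if opposite.get(literal_polarity(text)) == intended]


examples = [
    ("I love waiting in line.", "negative"),               # literal positive
    ("I hate receiving thoughtful presents.", "positive"),  # literal negative
    ("I love my dog.", "positive"),                         # sincere
]
print(sarcasm_candidates(examples))
```

Only the first two comments are flagged; the sincere one is not, because its literal and intended polarity agree.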
Detecting Sarcasm is Difficult 
(Even for Human Beings)
• Detecting sarcasm is a uniquely difficult problem, not just for NLP but 
for individuals, even if they are native speakers of the language.
• In a 2005 study, only 56% of participants correctly identified sarcastic 
vs. non‐sarcastic comments when sent as a message.
• In a 2006 survey, 55% of respondents incorrectly believed they were 
providing an example of a sarcastic comment.
• The 2005 study found that when the same messages were 
transmitted through a voice recording, the recipient interpreted the 
emotion correctly 73% of the time (consistent with senders’ 
expectations.)
Barriers to Detecting Sarcasm
• Context – the identity, beliefs and demographics of the communicator 
are essential to correctly identifying sarcasm – a 10‐year‐old boy’s 
views on the latest Transformers movie are likely to be very different 
from those of a 45‐year‐old woman. This would be essential in 
determining whether the phrase “I love the Transformers” is sarcastic 
or not.
• Emphasis (and/or Inflection) – In the absence of Context, 
communicators will often use emphasis or inflection in the message 
to communicate sarcasm. This is often lost in written 
communications, especially if formatting and punctuation are 
removed.
Sarcasm Detection Relies upon Features 
Frequently Lost in NLP Processing
• Sarcasm detection frequently relies upon features typically “thrown out” as a part 
of NLP pre‐processing. 
• For example, punctuation and even extra spacing can be a significant hint in 
detecting sarcasm.
• For example:
• “Oh, yes, he did an excellent job.”
Vs
• “Oh yes … he did an ‘excellent’ job …”
• Note that the only difference is in the presence of punctuation.
• If you further add capitalization and other forms of emphasis, such as 
underlining, italics, and bolding, all typically discarded as a part of pre‐
processing, parsing and indexing, you can see there is a significant loss of 
features. 
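To make the loss concrete, the sketch below counts a few surface cues before normalization and then shows that a typical lowercase‐and‐strip step makes the two example sentences identical. The feature names and the `normalize` step are assumptions chosen for illustration, not the preprocessing actually used in this project.

```python
import re


def surface_features(text):
    """Count punctuation and emphasis cues that normalization usually discards."""
    return {
        "ellipses": len(re.findall(r"\.{3,}|…", text)),
        "quoted_words": len(re.findall(r"[‘'\"“]\w+[’'\"”]", text)),
        "all_caps_words": len(re.findall(r"\b[A-Z]{2,}\b", text)),
        "exclamations": text.count("!"),
    }


def normalize(text):
    """A common aggressive preprocessing step: lowercase, keep only word tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


plain = "Oh, yes, he did an excellent job."
marked = "Oh yes … he did an ‘excellent’ job …"

print(surface_features(plain))
print(surface_features(marked))
print(normalize(plain) == normalize(marked))
```

After `normalize`, both sentences reduce to the same token string, so any classifier that sees only the normalized text cannot tell them apart.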
Research and Corpus
• Without a web‐tagging tool, it was necessary to either collect 
sarcastic comments in a more manual fashion, generate them “by 
hand” or find a tagged research corpus.
• Fortunately, there were two research corpora to work with:
• UC Santa Cruz – Sarcasm Corpus (derived from multiple websites containing 
statement and response and tagged using Amazon’s Mechanical Turk)
• Fordham University – Sarcasm Corpus (derived from Amazon reviews, both ironic 
and regular)
Weka for Modelling and Testing
Results …
Naive Bayes
Correctly Classified:   2424  51.66%
Incorrectly Classified: 2268  48.34%
               TP Rate  FP Rate  Precision  Recall  F‐Measure
Sarcastic      0.613    0.58     0.514      0.613   0.559
Not Sarcastic  0.42     0.387    0.521      0.42    0.465
Weighted Avg.  0.517    0.483    0.517      0.517   0.512
KNN
Correctly Classified:   2431  51.81%
Incorrectly Classified: 2261  48.19%
               TP Rate  FP Rate  Precision  Recall  F‐Measure
Sarcastic      0.422    0.385    0.522      0.422   0.467
Not Sarcastic  0.615    0.578    0.515      0.615   0.561
Weighted Avg.  0.518    0.482    0.519      0.518   0.514
SVM
Correctly Classified:   2356  50.21%
Incorrectly Classified: 2336  49.79%
               TP Rate  FP Rate  Precision  Recall  F‐Measure
Sarcastic      0.405    0.401    0.503      0.405   0.449
Not Sarcastic  0.599    0.595    0.502      0.599   0.546
Weighted Avg.  0.502    0.498    0.502      0.502   0.497
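For readers who want to check how the reported figures relate to one another, the sketch below recomputes accuracy from the classified counts and the F‐measure from precision and recall, using the standard definitions. The function names are illustrative; the numbers are taken from the tables above.

```python
def accuracy(correct, incorrect):
    """Fraction of instances classified correctly."""
    return correct / (correct + incorrect)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Counts and per-class figures as reported on the result slides.
reported = {
    "Naive Bayes": {"correct": 2424, "incorrect": 2268, "p": 0.514, "r": 0.613},
    "KNN": {"correct": 2431, "incorrect": 2261, "p": 0.522, "r": 0.422},
    "SVM": {"correct": 2356, "incorrect": 2336, "p": 0.503, "r": 0.405},
}

for name, row in reported.items():
    acc = accuracy(row["correct"], row["incorrect"])
    f1 = f_measure(row["p"], row["r"])
    print(f"{name}: accuracy={acc:.2%}, sarcastic F-measure={f1:.3f}")
```

All three models were evaluated on the same 4,692 instances, and each accuracy sits within two percentage points of 50%.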
Results are Surprisingly Poor
Little better than coin flipping …
Efforts to Improve Results
• Limiting the Corpus to only “Response”
• Limiting Corpus to only “Comment”
• Removing Stop Words from Corpus
• Retaining all punctuation in Corpus
• Using Multiple Models
Finally, Re‐evaluating the Tags
• Given poor results, decided to re‐examine the tagging of corpus
Results of Examining Corpus
• Disagreed with 50% of the tags on 10% sample of the overall corpus 
(500 tags reviewed)
• Not surprising that modelling didn’t work
• Very consistent with research on the ability of humans to identify 
sarcasm
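Disagreement between the corpus tags and a reviewer can be quantified with raw agreement and, as a chance‐corrected alternative, Cohen's kappa. The slides report only the 50% disagreement figure; kappa is added here as a standard measure, not as something computed in the project. The label values and function names below are assumptions.

```python
from collections import Counter


def observed_agreement(labels_a, labels_b):
    """Fraction of items on which the two label sequences agree."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and the same length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    p_observed = observed_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(counts_a[label] * counts_b[label]
                     for label in set(counts_a) | set(counts_b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels: the reviewer disagrees with half of the corpus tags.
corpus_tags = ["sarcastic", "sarcastic", "not", "not"]
review_tags = ["sarcastic", "not", "sarcastic", "not"]
print(observed_agreement(corpus_tags, review_tags))
print(cohens_kappa(corpus_tags, review_tags))
```

With two balanced classes, 50% raw agreement corresponds to a kappa of zero, meaning the two sets of tags agree no more often than chance would predict.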
Limited Research & Best Article Title
• ACM Searches:
• Several thousand on sentiment detection and analysis
• Fewer than 10 on detecting sarcasm
• “Detecting Sarcasm in Text: An Obvious Solution to a Trivial Problem” 
(Stanford)
Next Steps
• Better Corpus …
• Create a Corpus …
• META modelling 
Thank you …
Dave Boyhan
