Evaluation Datasets for Twitter Sentiment Analysis
A survey and a new dataset, the STS-Gold

Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani
Knowledge Media Institute, The Open University,
Milton Keynes, United Kingdom

1st Workshop on Emotion and Sentiment in Social and
Expressive Media: Approaches and perspectives from AI
Outline
• Definition & Background
• Evaluation Datasets for Twitter Sentiment Analysis
• STS-Gold
• Comparative Study
• Conclusion
Sentiment Analysis – Definition
“Sentiment analysis is the task of identifying
positive and negative opinions, emotions and
evaluations in text”

• “The main dish was delicious” → Positive
• “It is a Syrian dish” → Neutral
• “The main dish was salty and horrible” → Negative
Twitter Sentiment Analysis (Background)
• Sentiment Approaches: Supervised, Unsupervised, Hybrid
• Sentiment Levels: Tweet-level, Phrase-level, Entity-level
• Sentiment Tasks: Subjectivity, Polarity, Sentiment Strength, Emotion/Mood
Evaluation Datasets for Twitter Sentiment Analysis
Dataset characteristics:
• SA Level
• SA Task
• No. of Tweets
• Construction & Annotation
• Vocabulary Size
• Class Distribution
• Sparsity
Evaluation Datasets – Overview

Dataset                                         SA Level           SA Task                   Annotation/Agreement
Stanford Twitter Corpus (STS)                   Tweet              Subjectivity              Manual/UD
Health Care Reform (HCR)                        Tweet/Target       Subjectivity              Manual/UD
Obama-McCain Debate (OMD)                       Tweet              Polarity*                 Manual/α=0.655
Sentiment Strength Twitter Dataset (SS-Tweet)   Tweet              Strength/Subjectivity**   Manual/α≈0.56
Sanders Twitter Dataset                         Tweet              Subjectivity              Manual/UD
Dialogue Earth Twitter Corpus (WAB, GASP)       Tweet/Target       Subjectivity              Manual/UD
SemEval-2013 Dataset                            Tweet/Expression   Subjectivity              Manual/UD
What is Missing?
• Details about the annotation methodology (STS, HCR, Sanders)
• Entity-level Sentiment Evaluation:
  – Most works focus on assessing the performance of sentiment
    classifiers at the tweet level (STS, OMD, SS-Tweet, Sanders)
  – Datasets that allow for sentiment evaluation at the entity level
    assign similar sentiment labels to the tweet and the entities
    within it (HCR, WAB, GASP)
• Enables the evaluation at both the entity and tweet levels
• Tweets and entities are annotated independently
• Contains 58 Entities & 3000 Tweets
Data Collection (STS-Gold)
STS Corpus: 180K Tweets
→ Entity-Extraction (Alchemy API)
→ Identify Frequent Concepts
→ Select Top & Mid Frequent Entities: 28 Entities
→ Select 100 Tweet/Entity: 2800 Tweets
→ +200 tweets: 3000 Tweets
→ Entity-Extraction: 147 Entities
STS-Gold
[Figure: example entities and their concept types]
Entities shown: Obama, Taylor Swift, Vegas, YouTube, Facebook, London, LeBron, Oprah, Seattle, McDonalds, Starbucks, Sydney, iPod, iPhone, Lakers, England, Cavas, US, Xbox, PSP, Headache, NASA, UN, Brazil, Flu, Cancer, Fever
Concept types shown: City, Person, Company, Technology, Organization, Country, Health Condition
Data Annotation
• 3000 Tweets, 147 Entities
• Tweenator.com
• Sentiment Classes: Positive, Negative, Neutral, Mixed, Other

Filtering
• STS-Gold: 2205 Tweets, 58 Entities

Inter-annotation Agreement
• Tweet: α=0.765
• Entity: α1=0.416, α2=0.964
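The agreement scores above are reported as α values. As a minimal sketch, assuming α refers to Krippendorff's alpha over nominal labels (the slides do not spell out the metric), the computation can be illustrated as follows. The function name and the toy annotations are hypothetical and are not taken from STS-Gold.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists: one inner list of labels per annotated item.
    Items with fewer than two labels carry no agreement information and are skipped.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Every ordered pair of labels from different annotators, weighted by 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        return 1.0  # only one label value was ever used
    return 1 - observed / expected


# Hypothetical toy annotations: two annotators, four tweets.
toy = [["pos", "pos"], ["pos", "pos"], ["neg", "neg"], ["pos", "neg"]]
alpha = krippendorff_alpha_nominal(toy)
```

A value of 1 means perfect agreement, 0 means agreement at chance level, and negative values indicate systematic disagreement.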
Comparative Study
• Vocabulary Size
• Number of Tweets
• Data Sparsity
• Classification Performance
  – Polarity Classification
  – Naïve Bayes & Maximum Entropy
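To make the polarity classification step concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing. It is an illustration only, not the authors' experimental setup: the whitespace tokenisation, the function names, and the two training sentences (borrowed from the definition slide) are assumptions, and Maximum Entropy is not covered.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Collect class and word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()  # assumption: plain whitespace tokenisation
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, for illustration only.
model = train_naive_bayes([
    ("The main dish was delicious", "positive"),
    ("The main dish was salty and horrible", "negative"),
])
prediction = classify(model, "delicious dish")
```

With only two training sentences the model is far too small to be useful; the point is the shape of the computation, in which each class is scored by its prior plus the smoothed log-likelihood of the observed words.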
Comparative Study.1
Vocabulary Size vs. No. of Tweets
- There is a high correlation between the vocabulary size and the number of
tweets (ρ = 0.95)
- However, increasing the number of tweets does not always increase the
vocabulary size (OMD)
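The second point can be seen on a toy corpus: a new tweet only enlarges the vocabulary if it contains a word not seen before. The sketch below assumes lowercase whitespace tokenisation, and the three tweets are invented for illustration rather than taken from OMD.

```python
def vocabulary_size(tweets):
    """Number of distinct lowercase whitespace-separated tokens in the tweets."""
    return len({token for tweet in tweets for token in tweet.lower().split()})


# Hypothetical tweets on a single narrow topic.
tweets = ["obama won the debate", "mccain won the debate", "obama won"]

# Vocabulary size after the first k tweets, for k = 1..3.
sizes = [vocabulary_size(tweets[:k]) for k in range(1, len(tweets) + 1)]
```

Here the third tweet adds no new word, so the vocabulary stops growing even though the number of tweets increases. This is the kind of behaviour one might expect from a corpus focused on one topic.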
Comparative Study.2
Data Sparsity
- Dataset sparsity is an important factor that affects the performance of
machine learning classifiers [17]. According to Saif et al., tweet data are sparser
than other types of data (e.g., movie review data) due to the large number of
infrequent words in tweets.
- Twitter datasets are generally very sparse.
- Increasing either the number of tweets or the vocabulary size increases the sparsity
degree of the dataset:
  - ρno_of_tweets = 0.71
  - ρvocabulary_size = 0.77
- The sparsity degree of a given dataset is calculated as [13]:

\[ S_d = 1 - \frac{\sum_{i=1}^{n} N_i}{n \times |V|} \]

where N_i is the number of distinct words in tweet i, n is the number of tweets in
the dataset and |V| the vocabulary size.

The TweetNLP tokenizer can be downloaded from http: … TweetNLP/
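The sparsity degree can be computed directly from its definition, S_d = 1 − (Σ N_i) / (n × |V|). The sketch below is a minimal illustration that assumes lowercase whitespace tokenisation instead of the TweetNLP tokenizer; the function name is hypothetical and the three example sentences come from the definition slide.

```python
def sparsity_degree(tweets):
    """S_d = 1 - (sum of distinct words per tweet) / (n * |V|)."""
    token_sets = [set(tweet.lower().split()) for tweet in tweets]
    vocabulary = set().union(*token_sets)
    n = len(token_sets)
    if n == 0 or not vocabulary:
        raise ValueError("need at least one non-empty tweet")
    distinct_total = sum(len(tokens) for tokens in token_sets)  # sum of N_i
    return 1 - distinct_total / (n * len(vocabulary))


examples = [
    "The main dish was delicious",
    "It is a Syrian dish",
    "The main dish was salty and horrible",
]
s_d = sparsity_degree(examples)
```

For these three sentences the N_i are 5, 5 and 7, and the vocabulary has 12 words, so S_d = 1 − 17/36 ≈ 0.53. Real Twitter datasets have far larger vocabularies relative to tweet length, which pushes S_d close to 1.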
Comparative Study.3
Classification Performance vs. Dataset Sparsity (1)
According to Makrehchi et al. (2008) and Saif et al. (2012): in a given dataset the
classification performance and the sparsity degree are negatively correlated, i.e.,
increasing the dataset sparsity hinders the classification performance.

[Figure from M. Makrehchi and M.S. Kamel: Fig. 2. Classifier performance as a function of sparsity: (a) Rocchio, and (b) SVM. x-axis: Average Sparsity; y-axis: Average Classifier Performance; series: Industry Sectors, 20 newsgroups, Reuters]
Comparative Study.3
Classification Performance vs. Dataset Sparsity (2)
- No correlation between the classification performance and the sparsity degree
across the datasets (ρacc = −0.06, ρf1 = 0.23)
- The sparsity-performance correlation is intrinsic, meaning that it might exist within
the dataset itself, but not necessarily across the datasets.
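The difference between a within-dataset and an across-dataset correlation can be illustrated with a small sketch. It assumes ρ denotes the Pearson correlation coefficient, and every number below is hypothetical, chosen only to show the effect; none of it comes from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (sparsity, accuracy) measurements on subsamples of three datasets.
datasets = {
    "A": ([0.990, 0.992, 0.994], [0.80, 0.78, 0.76]),
    "B": ([0.996, 0.997, 0.998], [0.85, 0.83, 0.81]),
    "C": ([0.984, 0.986, 0.988], [0.82, 0.80, 0.78]),
}

# Within each dataset, accuracy drops as sparsity rises.
within = {name: pearson(s, a) for name, (s, a) in datasets.items()}

# Across datasets, compare one (mean sparsity, mean accuracy) point per dataset.
mean_sparsity = [sum(s) / len(s) for s, _ in datasets.values()]
mean_accuracy = [sum(a) / len(a) for _, a in datasets.values()]
across = pearson(mean_sparsity, mean_accuracy)
```

In this toy setup every within-dataset correlation is strongly negative, while the across-dataset correlation is not. Other factors that differ between datasets can mask the relationship that holds inside each one.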
Conclusion
• Current datasets to evaluate Twitter sentiment classifiers:
  – Focus on the tweet level.
  – Assign similar sentiment labels to the tweets and the entities within them.
• STS-Gold allows for sentiment evaluation at both the tweet and the entity levels.
• A correlation between the vocabulary size and the number of tweets does not
  always exist.
• The sparsity-performance correlation is intrinsic, i.e., it only exists within the
  dataset itself, but not across the different datasets.

Thank You
Email: hassan.saif@open.ac.uk
Twitter: hrsaif
Website: tweenator.com

Evaluation Datasets for Twitter Sentiment Analysis: A survey and a new dataset, the STS-Gold

  • 1. Evaluation Datasets for Twitter Sentiment Analysis A survey and a new dataset, the STS-Gold Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom 1st Workshop on Emotion and Sentiment in Social and Expressive Media Approaches and perspectives from AI
  • 2. • Definition & Background • Evaluation Datasets for Twitter Sentiment Analysis • STS-Gold Outline • Comparative Study • Conclusion
  • 3. Sentiment Analysis – Definition Sentiment Analysis “Sentiment analysis is the task of identifying positive and negative opinions, emotions and evaluations in text” The main dish was delicious It is a Syrian dish Positive Neutral The main dish was salty and horrible Negative 3
  • 5. Evaluation Datasets for Twitter Sentiment Analysis. Aspects reviewed for each dataset: SA Level, SA Task, No. of Tweets, Construction & Annotation, Vocabulary Size, Class Distribution, Sparsity
  • 6. Evaluation Datasets – Overview
      Dataset | SA Level | SA Task | Annotation/Agreement
      Stanford Twitter Corpus (STS) | Tweet | Subjectivity | Manual/UD
      Health Care Reform (HCR) | Tweet/Target | Subjectivity | Manual/UD
      Obama-McCain Debate (OMD) | Tweet | Polarity* | Manual/α=0.655
      Sentiment Strength Twitter Dataset (SS-Tweet) | Tweet | Strength/Subjectivity** | Manual/α≈0.56
      Sanders Twitter Dataset | Tweet | Subjectivity | Manual/UD
      Dialogue Earth Twitter Corpus (WAB, GASP) | Tweet/Target | Subjectivity | Manual/UD
      SemEval-2013 Dataset | Tweet/Expression | Subjectivity | Manual/UD
  • 7. What is Missing? • Details about the annotation methodology (STS, HCR, Sanders) • Entity-level sentiment evaluation: – Most works focus on assessing the performance of sentiment classifiers at the tweet level (STS, OMD, SS-Tweet, Sanders) – Datasets that allow for sentiment evaluation at the entity level assign similar sentiment labels to the tweet and the entities within it (HCR, WAB, GASP)
  • 8. STS-Gold • Enables the evaluation at both the entity and tweet levels • Tweets and entities are annotated independently • Contains 58 Entities & 3000 Tweets
  • 9. Data Collection: STS Corpus (180K Tweets) → Entity-Extraction (Alchemy API) → Identify Frequent Concepts → Select 28 Entities (Top & Mid Frequent Entities) → Select 100 Tweet/Entity (2800 Tweets) → +200 tweets (3000 Tweets) → Entity-Extraction (147 Entities) → STS-Gold
  • 11. Data Annotation (Tweenator.com): 3000 Tweets, 147 Entities. Sentiment Classes: Positive, Negative, Neutral, Mixed, Other. Inter-annotator Agreement: Tweet α=0.765; Entity α1=0.416, α2=0.964. Filtering: 2205 Tweets, 58 Entities (STS-Gold).
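The agreement scores on the annotation slide are reported as α without naming the coefficient. Assuming α denotes Krippendorff's alpha for nominal labels (an assumption, not something the slides state), the following minimal sketch shows how such a score could be computed from per-item label lists. The function name and the toy labels are hypothetical and are not taken from the STS-Gold annotations.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists; each inner list holds the labels that
    different annotators assigned to one item (tweet or entity).
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # an item with a single label carries no agreement information
        # Every ordered pair of labels from different annotators counts 1/(m-1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item labelled by two annotators")

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is used")
    return 1.0 - observed / expected


# Toy, made-up annotations from two hypothetical annotators.
toy_units = [
    ["positive", "positive"],
    ["positive", "negative"],
    ["negative", "negative"],
    ["negative", "negative"],
]
toy_alpha = krippendorff_alpha_nominal(toy_units)
```

One disagreement out of four items already pulls the toy score well below 1, which illustrates why entity-level agreement can differ sharply between annotation rounds.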
  • 12. Comparative Study • Vocabulary Size • Number of Tweets • Data Sparsity • Classification Performance – Polarity Classification – Naïve Bayes & Maximum Entropy
  • 13. Comparative Study.1 Vocabulary Size vs. No. of Tweets - There exists a high correlation between the vocabulary size and the number of tweets (ρ = 0.95) - However, increasing the number of tweets does not always lead to a larger vocabulary size (OMD).
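The slide reports ρ = 0.95 without saying which correlation coefficient was used. As a rough sketch, assuming a Pearson correlation (an assumption), the relation between dataset size and vocabulary size could be measured as below. The helper names and all numbers are hypothetical placeholders, not the statistics of the surveyed datasets.

```python
import math


def vocabulary_size(tokenized_tweets):
    """Number of distinct tokens across all tweets of one dataset."""
    return len({token for tweet in tokenized_tweets for token in tweet})


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-dataset statistics. The third dataset has more tweets but a
# smaller vocabulary than the second, mirroring the exception noted for OMD.
num_tweets = [500, 1000, 2000, 4000]
vocab_sizes = [1500, 2600, 2400, 7000]
rho = pearson(num_tweets, vocab_sizes)
```

A single dataset that breaks the trend lowers the coefficient only slightly, so a high overall correlation and an exception such as OMD can coexist.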
  • 14. Comparative Study.2 Data Sparsity - Data sparsity is an important factor that affects machine learning classifiers [17]. According to Saif et al., tweets are sparser than other types of data (e.g., movie review data). - Twitter datasets are generally very sparse. - Increasing either the number of tweets or the vocabulary size increases the sparsity degree of the dataset: ρ_no_of_tweets = 0.71, ρ_vocabulary_size = 0.77. - The sparsity degree of a given dataset is calculated as [13]: $S_d = 1 - \frac{\sum_{i}^{n} N_i}{n \times |V|}$, where $N_i$ is the number of distinct words in tweet $i$, $n$ the number of tweets in the dataset and $|V|$ the vocabulary size. (Footnote 9: The TweetNLP tokenizer can be downloaded from http:…TweetNLP/)
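The sparsity degree on this slide is S_d = 1 − (Σ_i N_i) / (n × |V|), with N_i the number of distinct words in tweet i, n the number of tweets and |V| the vocabulary size. A minimal sketch of that computation follows. The function name is hypothetical, and plain whitespace splitting stands in for a real Twitter tokenizer, which is an assumption. The toy input is the three example sentences from the definition slide, lowercased.

```python
def sparsity_degree(tokenized_tweets):
    """Return S_d = 1 - sum_i(N_i) / (n * |V|) for a list of token lists."""
    n = len(tokenized_tweets)
    distinct_per_tweet = [set(tokens) for tokens in tokenized_tweets]
    vocabulary = set().union(*distinct_per_tweet)
    if n == 0 or not vocabulary:
        raise ValueError("need at least one non-empty tweet")
    # N_i is the number of distinct words in tweet i.
    total_distinct = sum(len(words) for words in distinct_per_tweet)
    return 1.0 - total_distinct / (n * len(vocabulary))


# Whitespace splitting is a stand-in for a proper tweet tokenizer.
toy_tweets = [
    "the main dish was delicious",
    "it is a syrian dish",
    "the main dish was salty and horrible",
]
toy_tokens = [tweet.split() for tweet in toy_tweets]
toy_sparsity = sparsity_degree(toy_tokens)
```

Here the three sentences contain 5, 5 and 7 distinct words over a vocabulary of 12, so S_d = 1 − 17/36 ≈ 0.53. Real Twitter datasets have far larger vocabularies relative to tweet length, which pushes the value close to 1.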
  • 15. Comparative Study.3 Classification Performance vs. Dataset Sparsity (1) According to Makrehchi et al (2008) and Saif et al (2012): in a given dataset the classification performance and the sparsity degree are negatively correlated, i.e., increasing the dataset sparsity hinders the classification performance. [Figure: Fig. 2 of M. Makrehchi and M.S. Kamel, "Classifier performance as a function of sparsity: (a) Rocchio, and (b) SVM"; x-axis: Average Sparsity; y-axis: Average Classifier Performance; series: Industry Sectors, 20 newsgroups, Reuters]
  • 16. Comparative Study.3 Classification Performance vs. Dataset Sparsity (2) - No correlation between the classification performance and the sparsity degree across the datasets (ρacc = −0.06, ρf1 = 0.23). - The sparsity-performance correlation is intrinsic, meaning that it might exist within the dataset itself, but not necessarily across the datasets.
  • 17. Conclusion • Current datasets to evaluate Twitter sentiment classifiers: – Focus on the tweet level. – Assign similar sentiment labels to the tweets and the entities within them. • STS-Gold allows for sentiment evaluation at both the tweet and the entity levels. • A correlation between the vocabulary size and the number of tweets does not always exist. • The sparsity-performance correlation is intrinsic, i.e., it only exists within the dataset itself, but not across the different datasets.
  • 18. Thank You Email: hassan.saif@open.ac.uk Twitter: hrsaif Website: tweenator.com