
A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts

Cataldo Musto, Assistant Professor (Ricercatore a Tempo Determinato) at Università degli Studi di Bari

DART 2014 presentation, co-located with AI*IA 2014, the Italian Conference on Artificial Intelligence, Pisa (Italy)

DART 2014 
8th International Workshop on 
Information Filtering and Retrieval 
Pisa (Italy) 
December 10, 2014 
A comparison of lexicon-based 
approaches for Sentiment Analysis 
of microblog posts 
Cataldo Musto, Giovanni Semeraro, Marco Polignano 
(Università degli Studi di Bari ‘Aldo Moro’, Italy - SWAP Research Group)
Outline 
• Background 
• Sentiment Analysis 
• Lexicon-based approaches 
• Methodology 
• State-of-the-art lexicons 
• Experiments 
• Conclusions 
Cataldo Musto, Giovanni Semeraro, Marco Polignano 
A comparison of lexicon-based approaches for sentiment analysis of microblog posts. DART 2014 Workshop, Pisa (Italy), 10.12.2014
Background 
One minute on the Web 
Information Overload 
Background 
Information Overload 
Obstacle or Opportunity? 
Opportunities 
(Social) Content Analytics 
Insight: aggregate raw human-generated data to obtain valuable people-based findings 
My sample product research idea for you!
 
Q1 Memory Fabric Forum: SMART CXL Product Lineup
Q1 Memory Fabric Forum: SMART CXL Product LineupQ1 Memory Fabric Forum: SMART CXL Product Lineup
Q1 Memory Fabric Forum: SMART CXL Product Lineup
 
Dynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringDynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineering
 
Power of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfPower of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdf
 
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap"Running Open-Source LLM models on Kubernetes",  Volodymyr Tsap
"Running Open-Source LLM models on Kubernetes", Volodymyr Tsap
 
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
 
From eSIMs to iSIMs: It’s Inside the Manufacturing
From eSIMs to iSIMs: It’s Inside the ManufacturingFrom eSIMs to iSIMs: It’s Inside the Manufacturing
From eSIMs to iSIMs: It’s Inside the Manufacturing
 
Heltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdfHeltun_HE-RS01_User_Manual_B9AH.pdf
Heltun_HE-RS01_User_Manual_B9AH.pdf
 
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
 
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17
 
Enhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for PartnersEnhancing SaaS Performance: A Hands-on Workshop for Partners
Enhancing SaaS Performance: A Hands-on Workshop for Partners
 
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaBuilding Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
 
Z-Wave Fan coil Thermostat Heltun_HE-HT01_User_Manual.pdf
Z-Wave Fan coil Thermostat Heltun_HE-HT01_User_Manual.pdfZ-Wave Fan coil Thermostat Heltun_HE-HT01_User_Manual.pdf
Z-Wave Fan coil Thermostat Heltun_HE-HT01_User_Manual.pdf
 
Microsoft Azure News - Feb 2024
Microsoft Azure News - Feb 2024Microsoft Azure News - Feb 2024
Microsoft Azure News - Feb 2024
 
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdfQuinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
Quinto Z-Wave Heltun_HE-RS01_User_Manual_B9AH.pdf
 

A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts

  • 1. DART 2014, 8th International Workshop on Information Filtering and Retrieval, Pisa (Italy), December 10, 2014. A comparison of lexicon-based approaches for Sentiment Analysis of microblog posts. Cataldo Musto, Giovanni Semeraro, Marco Polignano (Università degli Studi di Bari ‘Aldo Moro’, Italy - SWAP Research Group)
  • 2. Outline: Background; Sentiment Analysis; Lexicon-based approaches; Methodology; State-of-the-art lexicons; Experiments; Conclusions. Cataldo Musto, Giovanni Semeraro, Marco Polignano. A comparison of lexicon-based approaches for sentiment analysis of microblog posts. DART 2014 Workshop, Pisa (Italy), 10.12.2014
  • 3. Background: One minute on the Web
  • 4. Background: One minute on the Web. Information Overload
  • 5. Background: Information Overload. Obstacle or Opportunity?
  • 6. Opportunities: (Social) Content Analytics. Insight: to aggregate raw human-generated data to get valuable people-based findings
  • 9. Social Content Analytics: Applications. Real-time polls; Social CRM; Online brand monitoring. All these applications share a common denominator: they all need a methodology to automatically associate an opinion and/or a polarity to each piece of content. Solution: Sentiment Analysis
  • 11. Sentiment Analysis: Definition. “It is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes” (*). We will focus on the polarity detection task. (*) Pang, Bo, and Lillian Lee. "Opinion mining and sentiment analysis." Foundations and Trends in Information Retrieval, 2008.
  • 12. Sentiment Analysis: State of the art. Supervised approaches (Machine Learning-based); Unsupervised approaches (Lexicon-based)
  • 13. Sentiment Analysis: Supervised approaches. Learn a classification model relying on labeled examples. [Figure: labeled examples ("Man", "Dog") and an unlabeled instance ("?")]
  • 14. Sentiment Analysis: Unsupervised approaches. Rely on external lexical resources that associate a polarity score to each term (e.g. frustration: - -, joy: +++). Sentiment of the content depends on the sentiment of the terms which compose it.
  • 16. Sentiment Analysis: Supervised vs Unsupervised. Supervised: pro: higher accuracy (*) (**); con: pre-labeled examples are needed. Unsupervised: pro: no training; con: accuracy depends on lexical resources; several lexical resources are available. We focus on lexicon-based approaches. (*) Nakov, Preslav, et al. "Semeval-2013 task 2: Sentiment analysis in Twitter." Proceedings of SemEval 2013. (**) Rosenthal, Sara, et al. "Semeval-2014 task 9: Sentiment analysis in Twitter." Proceedings of SemEval 2014.
  • 17. Contributions. 1. We propose a novel unsupervised lexicon-based approach for sentiment analysis. 2. We provide a comparison of lexical resources for sentiment analysis of microblog posts.
  • 22. Methodology: Lexicon-based approach. Insight: the polarity of a textual content (e.g. a microblog post) depends on the polarity of the microphrases which compose it. A microphrase is built whenever a splitting cue is found in the text. Conjunctions, adverbs and punctuation marks are used as splitting cues. Example: “I don’t like this food, it’s terrible” is split into m1 = “I don’t like this food” and m2 = “it’s terrible”, with the comma as splitting cue.
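The microphrase splitting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides name the categories of splitting cues (conjunctions, adverbs, punctuation) but not the actual cue words, so `CUE_WORDS`, `PUNCTUATION` and `split_microphrases` are hypothetical names with an invented cue list.

```python
import re

# Hypothetical cue list: the slides give only the categories of cues,
# so these particular words and symbols are assumptions.
CUE_WORDS = {"but", "and", "or", "however", "although"}
PUNCTUATION = ",.;:!?"


def split_microphrases(text):
    """Split a text into microphrases at punctuation marks and cue words."""
    cue_pattern = "|".join(sorted(CUE_WORDS))
    pattern = rf"[{re.escape(PUNCTUATION)}]+|\b(?:{cue_pattern})\b"
    parts = re.split(pattern, text, flags=re.IGNORECASE)
    # Drop empty fragments left over between adjacent cues.
    return [part.strip() for part in parts if part.strip()]


print(split_microphrases("I don't like this food, it's terrible"))
```

On the example from the slides, the comma acts as the only splitting cue, so the text is divided into the two microphrases m1 and m2.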
  • 23. Methodology: Lexicon-based approach. Insight: the polarity of a textual content (e.g. a microblog post) depends on the polarity of the microphrases which compose it. $pol(T) = \sum_{i=1}^{k} pol(m_i)$, where the Tweet $T = \{m_1, \dots, m_k\}$ is a set of microphrases.
  • 24. Methodology: Lexicon-based approach. Insight: the polarity of a microphrase depends on the polarity of the terms which compose it. $pol(T) = \sum_{i=1}^{k} pol(m_i)$ and $pol(m_i) = \sum_{j=1}^{n} score(t_j)$, where the Tweet $T = \{m_1, \dots, m_k\}$ is a set of microphrases and each microphrase $m_i = \{t_1, \dots, t_n\}$ is a set of terms.
  • 25. Methodology: Four variants proposed. Basic: $pol(T) = \sum_{i=1}^{k} pol(m_i)$, $pol(m_i) = \sum_{j=1}^{n} score(t_j)$
  • 26. Methodology: Four variants proposed. Normalized: $pol(T) = \sum_{i=1}^{k} pol(m_i)$, $pol(m_i) = \frac{\sum_{j=1}^{n} score(t_j)}{|m_i|}$. The score of each microphrase is normalized according to its length.
  • 27. Methodology: Four variants proposed. Emphasized: $pol(T) = \sum_{i=1}^{k} pol(m_i)$, $pol(m_i) = \sum_{j=1}^{n} score(t_j) \cdot w(t_j)$. Specific categories are provided with a higher weight; categories = adverbs, verbs, adjectives and valence shifters (intensifiers and downtoners). Several weights have been evaluated.
  • 28. Methodology: Four variants proposed. Normalized-Emphasized (combination): $pol(T) = \sum_{i=1}^{k} pol(m_i)$, $pol(m_i) = \frac{\sum_{j=1}^{n} score(t_j) \cdot w(t_j)}{|m_i|}$
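The four variants differ only in whether each term score is multiplied by a weight w(t_j) and whether the microphrase sum is divided by |m_i|. The sketch below illustrates this under explicit assumptions: |m_i| is taken as the number of terms in the microphrase, terms missing from the lexicon score 0, and the lexicon entries and weights are toy values invented for the example (the slides do not report the weights that were evaluated).

```python
def microphrase_polarity(terms, lexicon, weights=None, normalize=False):
    """Polarity of one microphrase, given as a list of terms."""
    total = 0.0
    for term in terms:
        # Emphasis: terms without an explicit weight count once.
        weight = weights.get(term, 1.0) if weights else 1.0
        total += lexicon.get(term, 0.0) * weight
    if normalize and terms:
        total /= len(terms)  # assumption: |m_i| = number of terms
    return total


def tweet_polarity(microphrases, lexicon, weights=None, normalize=False):
    """pol(T): sum of the polarities of the microphrases of a Tweet."""
    return sum(
        microphrase_polarity(m, lexicon, weights, normalize)
        for m in microphrases
    )


# Toy lexicon and weights, invented for illustration only.
LEXICON = {"love": 0.5, "terrible": -0.8}
WEIGHTS = {"terrible": 1.5}
TWEET = [["i", "love", "this", "place"], ["food", "is", "terrible"]]

basic = tweet_polarity(TWEET, LEXICON)
normalized = tweet_polarity(TWEET, LEXICON, normalize=True)
emphasized = tweet_polarity(TWEET, LEXICON, weights=WEIGHTS)
norm_emph = tweet_polarity(TWEET, LEXICON, weights=WEIGHTS, normalize=True)
print(basic, normalized, emphasized, norm_emph)
```

In a fuller pipeline the weights would come from part-of-speech tags rather than a per-term dictionary; the dictionary is used here only to keep the sketch self-contained.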
  • 30. Methodology: We have a problem. All four variants (Basic, Normalized, Emphasized, Normalized-Emphasized) rely on $score(t_j)$. How to calculate $score(t_j)$?
  • 31. Solution
  • 32. Lexical Resources: State of the art. We evaluated four state-of-the-art resources for sentiment analysis: SentiWordNet (http://sentiwordnet.isti.cnr.it), WordNet Affect (http://wndomains.fbk.eu/wnaffect.html), SenticNet (http://sentic.net), MPQA (http://mpqa.cs.pitt.edu)
  • 33. Lexical Resources: SentiWordNet (*). Each WordNet synset is provided with three different sentiment scores (positivity, negativity, objectivity). (*) Baccianella, Stefano, Andrea Esuli, and Fabrizio Sebastiani. "SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining." LREC. Vol. 10. 2010.
  • 34. Lexical Resources: WordNet Affect (*). WordNet extension. Affective-related synsets are mapped with an A-Label, e.g. euphoria → positive-emotion, illness → physical state. (*) Strapparava, Carlo, and Alessandro Valitutti. "WordNet Affect: an Affective Extension of WordNet." LREC. Vol. 4. 2004.
  • 35. Lexical Resources: SenticNet (*). Inspired by the Hourglass of Emotions model. Each term is represented on the basis of the intensity of four basic emotional dimensions (sensitivity, aptitude, attention, pleasantness). The activation level of each dimension defines 16 basic emotions. (*) Cambria, Erik, Daniel Olsher, and Dheeraj Rajagopal. "SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis." Twenty-eighth AAAI conference on artificial intelligence. 2014.
  • 36. Lexical Resources: SenticNet (*). According to the triggered emotions, each term is provided with an aggregated polarity score.
  • 37. Lexical Resources: SenticNet (*). SenticNet models a sentiment score for some bigrams and trigrams as well!
  • 38. Lexical Resources: MPQA (*). Each term is (manually) provided with a discrete sentiment score: +1 positive, 0 neutral, -1 negative. (*) Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. "Recognizing contextual polarity in phrase-level sentiment analysis." Proceedings of the conference on human language technology and empirical methods in natural language processing. Association for Computational Linguistics, 2005.
  • 39. Lexical Resources: Comparison. Coverage (terms) per resource: SentiWordNet 117,659; WordNet Affect 200; SenticNet 14,000; MPQA 8,222.
  • 41. Lexical Resources: Score calculation, SentiWordNet. Given a term $t_j$, $score(t_j)$ is the mean of the sentiment scores of all the possible synsets of $t_j$. Example: $score(good) = \frac{0.75 + 0 + 1 + 1}{4} \approx 0.687$
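The SentiWordNet rule is a plain average over synsets, which the following sketch reproduces for the example on the slide. The per-synset scores are taken from the slide as given; how each synset's single score is derived from its positivity and negativity values is not stated in the slides, so that step is left out. The function name `term_score` and the fallback of 0.0 for a term with no synsets are assumptions.

```python
from statistics import mean


def term_score(synset_scores):
    """Mean of the per-synset sentiment scores of a term (0.0 if it has none)."""
    return mean(synset_scores) if synset_scores else 0.0


# Per-synset scores of "good" as listed on the slide.
good_synset_scores = [0.75, 0.0, 1.0, 1.0]
print(term_score(good_synset_scores))
```

The exact mean is 2.75 / 4 = 0.6875, which the slide reports as 0.687.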
  • 42. Lexical Resources: Score calculation, WordNet Affect. Given a term $t_j$, the WordNet Affect hierarchy is climbed until an A-Label which occurs in SentiWordNet is found; $t_j$ inherits the sentiment score of that A-Label. Example: $score(good) = score(benevolence) = 0.339$
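The hierarchy climb can be illustrated with a toy lookup. Everything in this sketch is hypothetical except the pair good / benevolence and the score 0.339, which come from the slide: the dictionaries stand in for WordNet Affect and SentiWordNet, and the second chain (cheerful, cheerfulness, joy, 0.5) is invented only to show a climb of more than one step.

```python
# Toy stand-ins for the real resources (assumptions, not actual data).
A_LABEL_OF = {"good": "benevolence", "cheerful": "cheerfulness"}
A_LABEL_PARENT = {"cheerfulness": "joy", "joy": "positive-emotion"}
SENTIWORDNET_SCORE = {"benevolence": 0.339, "joy": 0.5}


def wordnet_affect_score(term):
    """Climb the A-Label hierarchy until a label with a known score is found."""
    label = A_LABEL_OF.get(term)
    while label is not None:
        if label in SENTIWORDNET_SCORE:
            return SENTIWORDNET_SCORE[label]
        label = A_LABEL_PARENT.get(label)  # move one level up
    return 0.0  # assumption: terms without a scored A-Label are neutral


print(wordnet_affect_score("good"))
```

Returning 0.0 for uncovered terms is a design choice of this sketch; it keeps such terms from affecting the sums used in the polarity formulas.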
  • 43. Lexical Resources: Score calculation, SenticNet. Given a term $t_j$, the SenticNet APIs are queried and the sentiment score is extracted. Example: $score(good) = 0.883$
  • 44. Lexical Resources: Score calculation, MPQA. Given a term $t_j$, the MPQA Lexicon is queried and the sentiment score is extracted. Example: $score(good) = 1$
  • 45. Methodology
  • 46. Experimental Evaluation: Research Hypothesis. 1. How do the different versions of the algorithm perform with respect to state-of-the-art datasets? 2. What is the best lexical resource to detect the polarity of microblog posts?
  • 47. Experimental Evaluation: Description of the datasets. SemEval-2013: 14,435 Tweets; 8,180 training; 3,255 test; classes: Positive, Negative, Neutral. STS Dataset: 1,600,000 Tweets; only 359 test; classes: Positive, Negative.
  • 48. Experimental Evaluation: Statistics about coverage (SemEval-2013-Test / STS-Test). Vocabulary size: 18,309 / 6,711. SentiWordNet: 4,314 / 883. WordNet-Affect: 149 / 48. MPQA: 897 / 224. SenticNet: 1,497 / 326.
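The coverage counts are easier to compare as fractions of each test vocabulary. The sketch below computes those ratios from the figures on the slide; it assumes that each count is the number of vocabulary terms found in the lexicon, which the slide title suggests but does not state explicitly. The names `VOCABULARY`, `COVERED` and `coverage_ratio` are introduced here for illustration.

```python
# Figures copied from the coverage slide.
VOCABULARY = {"SemEval-2013-Test": 18309, "STS-Test": 6711}
COVERED = {
    "SentiWordNet": {"SemEval-2013-Test": 4314, "STS-Test": 883},
    "WordNet-Affect": {"SemEval-2013-Test": 149, "STS-Test": 48},
    "MPQA": {"SemEval-2013-Test": 897, "STS-Test": 224},
    "SenticNet": {"SemEval-2013-Test": 1497, "STS-Test": 326},
}


def coverage_ratio(lexicon, dataset):
    """Fraction of the dataset vocabulary that appears in the lexicon."""
    return COVERED[lexicon][dataset] / VOCABULARY[dataset]


for lexicon in COVERED:
    for dataset in VOCABULARY:
        print(f"{lexicon:15s} {dataset:18s} {coverage_ratio(lexicon, dataset):.2%}")
```

Under this reading, SentiWordNet covers the largest share of both vocabularies and WordNet-Affect the smallest.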
  • 49. Experiment 1: Intra-Lexicons evaluation
  • 50. Experiment 1, SemEval :: SentiWordNet. Accuracy: Basic 57.67; Normalized 58.1; Emphasized 58.65; Norm-Emph 58.99. Norm vs norm+emph: significant (p < 0.0001). Emphasis and Normalization improve the accuracy.
  • 51. Experiment 1, SemEval :: WordNet Affect. Accuracy: Basic 53.92; Normalized 55.05; Emphasized 53.95; Norm-Emph 55.08. Not significant. Emphasis and Normalization improve the accuracy.
  • 52. Experiment 1, SemEval :: MPQA. Accuracy: Basic 58.03; Normalized 57.97; Emphasized 58.25; Norm-Emph 58.1. Not significant. Emphasis improves the accuracy. Normalization doesn’t.
  • 53. Experiment 1, SemEval :: SenticNet. Accuracy: Basic 48.69; Normalized 47.25; Emphasized 48.29; Norm-Emph 48.08. Norm vs norm+emph: significant (p < 0.0001). No improvement.
  • 54. Experiment 1: General Outcomes (SentiWordNet, WordNet Affect, MPQA, SenticNet). 1. Emphasis leads to improvements (7 out of 8 comparisons). 2. Normalization doesn’t (1 out of 4 comparisons).
  • 55. Experiment 1, STS :: SentiWordNet. Accuracy: Basic 71.87; Normalized 72.42; Emphasized 71.31; Norm-Emph 71.59. Not significant gaps. Normalization improves the accuracy. Emphasis doesn’t.
  • 56. Experiment 1, STS :: WordNet Affect. Accuracy: Basic 62.95; Normalized 62.67; Emphasized 62.96; Norm-Emph 62.95. Not significant gaps. Emphasis improves the accuracy. Normalization doesn’t.
  • 57. Experiment 1, STS :: MPQA. Accuracy: Basic 69.54; Normalized 70.75; Emphasized 69.92; Norm-Emph 70.76. Not significant gaps. Both Emphasis and Normalization improve the accuracy.
  • 58. Experiment 1, STS :: SenticNet. Accuracy: Basic 74.37; Normalized 74.65; Emphasized 74.65; Norm-Emph 73.82. Not significant. Normalization improves the accuracy. Emphasis doesn’t.
  • 59. Experiment 1: General Outcomes (SentiWordNet, WordNet Affect, MPQA, SenticNet). 1. Mixed behavior (normalization typically improves, emphasis doesn’t). 2. Little statistical significance (small dataset).
  • 60. Experiment 2: Inter-Lexicons evaluation
  • 61. Experiment 2: Comparison between lexicons. Accuracy (SemEval-2013 / STS): SentiWordNet 58.99 / 72.42; SenticNet 48.69 / 74.65; WordNet-Affect 55.08 / 62.96; MPQA 58.25 / 70.76.
  • 62. Experiment 2: Comparison between lexicons. SentiWordNet is the best-performing configuration on SemEval data.
  • 63. Experiment 2: Comparison between lexicons. MPQA performs well on SemEval data.
  • 64. Experiment 2: Comparison between lexicons. SenticNet behaves inconsistently: worst on SemEval, best on STS.
  • 65. Experiment 2: Comparison between lexicons. Reason: SenticNet can hardly classify neutral Tweets (threshold learning?).
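The remark about neutral Tweets points at the step that turns a polarity score into a class label. A simple way to picture it is a symmetric threshold around zero, as in the sketch below. This is an assumption made for illustration: the slides do not describe the decision rule that was used, and the function name `classify` and the value 0.1 are hypothetical.

```python
def classify(polarity, threshold=0.1):
    """Map a polarity score to a class label using a symmetric neutral band."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


for score in (0.5, 0.05, -0.5):
    print(score, classify(score))
```

With a threshold of 0.0, almost no Tweet falls in the neutral band, which matters on a three-class dataset such as SemEval-2013 but not on a two-class one such as STS; learning the threshold from data is what the slide appears to hint at.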
  • 66. Experiment 2: Comparison between lexicons. SentiWordNet and MPQA confirm their performance on STS.
  • 67. Experiment 2: Comparison between lexicons. Poor coverage negatively influences WordNet-Affect performance.
  • 68. Experiment 2: Statistical Analysis. [Same accuracy bar chart as slide 63, annotated per dataset with the best lexicon and the p-value of each gap. Annotations as extracted, with the bar mapping lost: best, p < 0,0001, p < 0,001, p < 0,50, p < 0,42, best, p < 0,0001, p < 0,11. Legend: significant gap / not significant gap (markers lost in extraction)]
  • 69. Experiment 2: Conclusions. [Same annotated accuracy bar chart as slide 68, with the best-performing lexicons highlighted (markers lost in extraction)]
  • 70. Conclusions
  • 71. Lessons Learned: an investigation into the effectiveness of lexical resources for polarity classification of microblog posts. Comparison of 4 state-of-the-art resources: SentiWordNet, SenticNet, MPQA, WordNet-Affect. Evaluation research question: what is the impact of each lexical resource on the task of polarity classification? (1) MPQA and SentiWordNet typically outperform the other resources (an interesting result, given the smaller coverage of MPQA). (2) SenticNet's behavior deserves deeper investigation.
  • 72. Future Research: (1) evaluation against different datasets and with more lexical resources; (2) better tuning of parameters (classification threshold), integration of more complex syntactic structures, merging of lexical resources; (3) integration of the algorithm in a recommendation framework, to exploit sentiment-based information to model user interests.
  • 73. Questions? Cataldo Musto, Ph.D. (cataldo.musto@uniba.it)
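
Slides 65 and 72 mention a classification threshold that could be learned or better tuned. The sketch below illustrates one way such a threshold might work; it is not the authors' algorithm. The toy lexicon, the averaging of word scores, and the names `tweet_score`, `classify` and `learn_threshold` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon (word -> polarity score in [-1, 1]).
# A real run would load SentiWordNet, SenticNet, MPQA or WordNet-Affect instead.
TOY_LEXICON: Dict[str, float] = {
    "good": 0.7,
    "great": 0.9,
    "bad": -0.7,
    "awful": -0.9,
}


def tweet_score(tokens: Sequence[str], lexicon: Dict[str, float]) -> float:
    """Average the scores of the tokens found in the lexicon (0.0 if none)."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def classify(tokens: Sequence[str], lexicon: Dict[str, float],
             threshold: float = 0.2) -> str:
    """Three-class polarity: scores within [-threshold, threshold] are neutral."""
    score = tweet_score(tokens, lexicon)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def learn_threshold(labelled: List[Tuple[Sequence[str], str]],
                    lexicon: Dict[str, float],
                    candidates: Sequence[float]) -> float:
    """Return the candidate threshold with the highest accuracy on labelled data."""
    def accuracy(th: float) -> float:
        hits = sum(classify(tokens, lexicon, th) == label
                   for tokens, label in labelled)
        return hits / len(labelled)

    return max(candidates, key=accuracy)
```

With a threshold that is too small, weakly polar tweets are pushed into the positive or negative class, which is one plausible reading of why a lexicon would struggle with neutral tweets. Picking the threshold on held-out labelled data, as `learn_threshold` does, is a simple form of the tuning listed under future research.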