Resources for Sentiment Analysis
Seminar Presentation
Sagar Ahire
133050073
IIT Bombay
02 May, 2014
Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 1 / 48
Roadmap
1 Introduction
2 Sentiwordnet
3 SO-CAL
4 Wordnet-Affect
5 Indian-Language Sentiwordnets
6 Conclusions
Introduction
Overview
An overview of today’s presentation:
This presentation covers lexical resources for sentiment analysis.
Four resources are covered, each using a different approach for
representation and creation:
Sentiwordnet, created automatically, with 3 graded scores per synset
SO-CAL, created manually, with a graded score per word
Wordnet-Affect, created semi-automatically, with affect information for
each synset
Indian-Language Sentiwordnet, created by projecting the English
Sentiwordnet
Sentiment Analysis
Sentiment Analysis: Determining the opinion expressed in a text
Approaches:
Classifier-based
Lexicon-based
Why Lexicon-based Approach?
The classifier-based approach has the following drawbacks:
Domain Specificity (Example: Movie reviews mentioning ‘writer’,
‘plot’, etc.) [Bro01]
Lack of Context (Example: ‘good’ vs ‘not good’ vs ‘not very good’)
The lexicon-based approach aims at solving these problems.
Sentiment Lexicons
A sentiment lexicon is a database whose entries have the form
(lexical unit, sentiment).
Choices for lexical unit:
Word
Word sense
Phrase, etc.
Choices for sentiment:
Fixed categorization into ‘positive’ and ‘negative’
Graded sets like ‘strongly positive’, ‘mildly positive’, ‘neutral’, ‘mildly
negative’, ‘strongly negative’
Score in an interval like [0, 1] or [−1, +1]
Approaches for Creation
Manual
Automatic
Sentiwordnet
Introduction to Sentiwordnet
Sentiwordnet [ES06] is an automatically generated sentiment lexicon made
using Wordnet. Its salient features are:
High coverage
Support for graded sentiment labels
Support for both sentiment classification and subjectivity detection
Structure of Sentiwordnet
Sentiwordnet = Wordnet + Sentiment Information.
Each synset s is given three sentiment scores:
Positive score Pos(s)
Negative score Neg(s)
Objective score Obj(s)
Pos(s) + Neg(s) + Obj(s) = 1
Example Synset
beautiful: Pos = 0.75, Neg = 0.00, Obj = 0.25
Source: http://sentiwordnet.isti.cnr.it/search.php?q=beautiful
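As a minimal sketch, the three-score representation can be modelled as a small record. The class name SynsetScores and its two helper methods are hypothetical and not part of Sentiwordnet; the values are those given for 'beautiful'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Hypothetical container for the scores of one synset."""
    pos: float
    neg: float

    @property
    def obj(self) -> float:
        # Obj(s) follows from the constraint Pos(s) + Neg(s) + Obj(s) = 1.
        return 1.0 - self.pos - self.neg

    def subjectivity(self) -> float:
        # Illustrative measure (an assumption): how much of the mass is not objective.
        return self.pos + self.neg

    def polarity(self) -> float:
        # Illustrative measure (an assumption): positive minus negative score.
        return self.pos - self.neg


beautiful = SynsetScores(pos=0.75, neg=0.00)
print(beautiful.obj, beautiful.subjectivity(), beautiful.polarity())
```

Storing only Pos and Neg and deriving Obj keeps the sum-to-one constraint true by construction.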
Creation Steps
The top-level steps in the algorithm to create Sentiwordnet are as follows:
1 Selection of seed set
2 Expansion using Wordnet’s semantic relations
3 Training of a team of ternary classifiers
4 Classification of each Wordnet synset using the classifiers
SO-CAL
Introduction to SO-CAL
SO-CAL is a system that uses a manually-constructed lexicon. Its salient
features are:
Highly detailed lexicon
Graded sentiment label
Low coverage, but high accuracy
Features Used
SO-CAL distinguishes several kinds of features and treats each kind
differently in the lexicon. They are:
Adjectives
Nouns, Verbs, Adverbs and Multiwords
Intensifiers and Downtoners
Negation
Irrealis Blocking
Structure of SO-CAL
Sentiment scoring:
Words are scored in [−5, +5]
Intensifiers and negation further act upon these scores
Examples
good: +3
monstrosity: −5
masterpiece: +5
Wordnet-Affect
Introduction to Wordnet-Affect
Wordnet-Affect [SV04] is a semi-automatically generated sentiment lexicon
made using Wordnet. It associates affective information with each
synset. Its salient features are:
Highly detailed
Ability to handle sentiment differently depending on emotion
Structure of Wordnet-Affect
Wordnet-Affect = Wordnet + Affect Information.
Affect is represented using the following:
An a-label which represents the emotion,
The valency which indicates the sentiment.
The a-labels form a tree of emotions rooted at a single root node, with
each leaf node corresponding to a synset.
The valency can be any of positive, negative, neutral or ambiguous.
root
  mental-state
    cognitive-state
    affective-state
      mood
      emotion
        positive-emotion
          joy
            elation
          love
            worship
        negative-emotion
          sadness
            melancholy
          shame
            embarrassment
          . . .
        . . .
  physical-state
  . . .
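A minimal sketch of how such a hierarchy could be queried, assuming the a-label tree is stored as nested dictionaries. The nesting is one reading of the tree on the slide, elided branches are left out, and the names A_LABEL_TREE, path_to and is_under are hypothetical.

```python
# Fragment of the a-label hierarchy as nested dicts (child name -> subtree).
# Only nodes visible on the slide are included; elided branches are omitted.
A_LABEL_TREE = {
    "root": {
        "mental-state": {
            "cognitive-state": {},
            "affective-state": {
                "mood": {},
                "emotion": {
                    "positive-emotion": {
                        "joy": {"elation": {}},
                        "love": {"worship": {}},
                    },
                    "negative-emotion": {
                        "sadness": {"melancholy": {}},
                        "shame": {"embarrassment": {}},
                    },
                },
            },
        },
        "physical-state": {},
    }
}


def path_to(label, tree=A_LABEL_TREE, prefix=()):
    """Return the path from the root to `label`, or None if it is absent."""
    for name, children in tree.items():
        current = prefix + (name,)
        if name == label:
            return current
        found = path_to(label, children, current)
        if found is not None:
            return found
    return None


def is_under(label, ancestor):
    """True if `ancestor` lies on the path from the root to `label`."""
    path = path_to(label)
    return path is not None and ancestor in path


print(path_to("elation"))
print(is_under("melancholy", "negative-emotion"))
```

Walking up the path is what lets an application treat 'elation' as a kind of joy, and joy as a positive emotion.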
Creation Steps
Wordnet-Affect was created using the following steps:
Manual creation of initial resource
Automatic expansion using Wordnet relations
Indian-Language Sentiwordnets
Introduction to Indian-Language Sentiwordnets
Indian-language Sentiwordnets can be created using Wordnet projection
[JRB10]. This approach has the following salient features:
Easy to create once backing resources are available
No duplication of effort
Use of tried-and-tested representations
Creation Steps
The process of projecting a Sentiwordnet has the following steps:
Fetch a synset from the English Sentiwordnet.
Find the corresponding Hindi synset using Indowordnet.
Assign sentiment scores from English synset to Hindi synset.
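The three steps above can be sketched as a single loop. This is an illustration only: the two dictionaries and their ID formats are hypothetical stand-ins for Sentiwordnet and for the cross-lingual links in Indowordnet, and project_sentiwordnet is not a function of either resource.

```python
def project_sentiwordnet(english_scores, english_to_hindi):
    """Copy scores from English synsets to their linked Hindi synsets.

    english_scores:   {english_synset_id: (pos, neg, obj)}
    english_to_hindi: {english_synset_id: hindi_synset_id}
    Both mappings are hypothetical stand-ins for the real resources.
    """
    hindi_scores = {}
    for en_id, scores in english_scores.items():  # step 1: fetch a synset
        hi_id = english_to_hindi.get(en_id)       # step 2: find the Hindi synset
        if hi_id is None:
            continue                              # no link, nothing to project
        hindi_scores[hi_id] = scores              # step 3: assign the scores
    return hindi_scores


english_scores = {"en-001": (0.75, 0.00, 0.25), "en-002": (0.00, 0.625, 0.375)}
english_to_hindi = {"en-001": "hi-042"}
hindi = project_sentiwordnet(english_scores, english_to_hindi)
print(hindi)
```

Synsets without a cross-lingual link are skipped, so the coverage of the projected resource is bounded by the coverage of the links.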
Conclusions
A Comparison of the Resources
Criterion        | SWN        | SO-CAL   | WN-Affect      | IL-SWN
Sentiment        | 3 x [0, 1] | [−5, +5] | Affect         | 3 x [0, 1]
Lexical Unit     | Synset     | Word     | Synset         | Synset
Backing Resource | Wordnet    | None     | Wordnet        | SWN + Indowordnet
Creation         | Automatic  | Manual   | Semi-automatic | Projection
No. of Entries   | 117,000    | 5,000    | 900            | 16,000
Concluding Remarks
To conclude, there are three choices in making a sentiment lexicon:
Creation Approach: Manual, Automatic, Semi-Automatic or
Projection
Lexical Unit: Word, Synset or Higher Representations
Sentiment: Labels, Graded Scores or Affect Information
Concluding Remarks: Creation Approach
Manual Approach          | Automatic Approach
High annotation accuracy | Low annotation accuracy
High time investment     | Low time investment
More details supported   | Fewer details supported
Concluding Remarks: Lexical Unit
Word                                  | Synset
Unreliable for polysemous words       | Reliable for polysemous words
No pre-processing required            | Requires word sense disambiguation (WSD)
Projection is comparatively difficult | Projection is comparatively easy
Concluding Remarks: Sentiment
Graded scores have been shown to be better than mere labels in general.
Moreover, a graded score resource can always be converted to a
label-based resource.
Affect information can help in specialized circumstances.
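The conversion from graded scores to labels can be sketched as a simple thresholding step. The function name score_to_label and the neutral_band parameter are illustrative assumptions; the example scores are the SO-CAL values quoted in this presentation.

```python
def score_to_label(score, neutral_band=0.0):
    """Map a graded score to a coarse label.

    `neutral_band` is an illustrative threshold, not taken from any resource:
    scores within [-neutral_band, +neutral_band] are labelled neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


graded = {"good": 3, "monstrosity": -5, "masterpiece": 5}
labels = {word: score_to_label(score) for word, score in graded.items()}
print(labels)
```

The reverse direction is not possible: once 'good' and 'masterpiece' are both labelled positive, the difference between +3 and +5 is lost.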
Future Work
Possible directions in the future:
Automatic resources for higher-level lexical units like phrases, trees,
etc.
Manual resources for synsets
Manual lexicons for Indian languages
Techniques for building dynamic resources to incorporate ‘netspeak’
and other slang
References
[Bro01] Julian Brooke, A semantic approach to automatic text sentiment
analysis, M.A. thesis, Stanford University, 2001.
[ES06] Andrea Esuli and Fabrizio Sebastiani, SentiWordNet: A publicly
available lexical resource for opinion mining, Proceedings of the 5th
Conference on Language Resources and Evaluation (LREC-06), 2006,
pp. 417–422.
[Esu08] Andrea Esuli, Automatic generation of lexical resources for opinion
mining: Models, algorithms and applications, Ph.D. thesis, Università
di Pisa, 2008.
[Fel98] Christiane Fellbaum, Wordnet: An electronic lexical database, A
Bradford Book, 1998.
[HM97] Vasileios Hatzivassiloglou and Kathleen R. McKeown, Predicting the
semantic orientation of adjectives, Proceedings of the 35th Annual
Meeting of the Association for Computational Linguistics and Eighth
Conference of the European Chapter of the Association for
Computational Linguistics, Association for Computational Linguistics,
1997, pp. 174–181.
[JRB10] Aditya Joshi, Balamurali A R, and Pushpak Bhattacharyya, A
fall-back strategy for sentiment analysis in Hindi: a case study,
Proceedings of ICON 2010: 8th International Conference on Natural
Language Processing, Macmillan Publishers, India, 2010.
[KMMdR04] Jaap Kamps, Maarten Marx, Robert J. Mokken, and Maarten
de Rijke, Using wordnet to measure semantic orientations of
adjectives, Proceedings of LREC-04, 4th International Conference on
Language Resources and Evaluation, 2004, pp. 1115–1118.
[RW03] Ellen Riloff and Janyce Wiebe, Learning extraction patterns for
subjective expressions, Proceedings of the 2003 Conference on
Empirical Methods in Natural Language Processing, Association for
Computational Linguistics, 2003, pp. 105–112.
[SV04] Carlo Strapparava and Alessandro Valitutti, WordNet-Affect: an
affective extension of WordNet, Proceedings of the 4th International
Conference on Language Resources and Evaluation (LREC-04), 2004,
pp. 1083–1086.
[TL03] Peter D. Turney and Michael L. Littman, Measuring praise and
criticism: Inference of semantic orientation from association, ACM
Transactions on Information Systems 21 (2003), no. 4, 315–346.
Additional Slides
Wordnet
Wordnet [Fel98] is a lexical database organized by word sense. The
fundamental unit of storage is called a synset.
An Example Synset
brilliant, superb: of surpassing excellence
“a brilliant performance”; “a superb actor”
Source: http://wordnetweb.princeton.edu/perl/webwn?s=brilliant
Semantic Relations in Wordnet
Wordnet synsets are linked to each other by relations called semantic
relations. Some of them are:
Antonymy
Meronymy
Hypernymy
Hyponymy
Similar to, etc.
These relations are helpful in creating the training set for classifying
synsets to create Sentiwordnet.
Background
Sentiment Classification
Early work on automatically detecting the sentiment of a word led to
today’s lexicons. This included:
Use of conjunction-separated adjectives [HM97]
PMI-based Extraction using Web Queries [TL03]
Graph Expansion using Wordnet [KMMdR04]
Classification using Wordnet Glosses [Esu08]
Subjectivity Detection
Work that identifies whether a term is indeed subjective is necessary to
filter out objective words from sentiment classification. This includes:
Adapting Wordnet Glosses to Subjectivity Detection [Esu08]
Bootstrapping Subjective Expressions from a Corpus [RW03]
Structure of SO-CAL
Adjectives
Adjectives were collected from a 500-document corpus and annotated with
a sentiment score from −5 to +5.
Examples
good: +3
sleazy: −3
Nouns, Verbs, Adverbs, Multiwords
This was extended to other parts of speech and multiword expressions, for
a total of about 5,000 words.
Examples
monstrosity: −5
masterpiece: +5
inspire: +2
funny: +2 vs. act funny: −1
Intensifiers and Downtoners
Intensifiers are words that increase sentiment intensity, while downtoners
are words that reduce it. Examples are ‘extraordinarily’ (an intensifier)
and ‘somewhat’ (a downtoner).
Intensifiers and downtoners are modeled as percentage modifiers.
Examples
slightly: −50%
extraordinarily: +50%
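A minimal sketch of percentage modifiers, assuming the modifier scales the score of the word it modifies by (1 + percentage). The names MODIFIERS and apply_modifier are hypothetical; the base score good: +3 is the one used elsewhere in this presentation.

```python
# Percentage modifiers from the examples above, stored as fractions.
MODIFIERS = {"slightly": -0.50, "extraordinarily": +0.50}


def apply_modifier(modifier, base_score):
    """Scale a word's score by (1 + percentage) of the given modifier."""
    return base_score * (1.0 + MODIFIERS[modifier])


print(apply_modifier("extraordinarily", 3))  # extraordinarily good
print(apply_modifier("slightly", 3))         # slightly good
```

Because the modifier is multiplicative, it strengthens or weakens a negative word in the same way: under this sketch, 'slightly' halves a score of −3 to −1.5.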
Negation
Negation is modeled as a numeric shift of value 4 towards the opposite
sentiment.
Examples
good: +3 ⇒ not good: −1
atrocious: −5 ⇒ not atrocious: −1
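The shift can be written out directly. This is a sketch that reproduces the two examples above; the names NEGATION_SHIFT and negate are hypothetical, and leaving a zero score unchanged is an assumption.

```python
NEGATION_SHIFT = 4  # shift magnitude stated above


def negate(score):
    """Shift a score by a fixed amount towards the opposite polarity."""
    if score > 0:
        return score - NEGATION_SHIFT
    if score < 0:
        return score + NEGATION_SHIFT
    return score  # assumption: a zero score is left unchanged


print(negate(3))   # not good
print(negate(-5))  # not atrocious
```

Compared with simply flipping the sign, the shift keeps 'not atrocious' mildly negative instead of turning it into a strongly positive +5.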
Irrealis Blocking
An irrealis marker is a word that indicates that the sentiment may not be
reliable because the event hasn’t actually happened. For example, ‘would’,
‘expect’, ‘if’, quotation marks, etc.
Sentences with irrealis markers are ignored for sentiment analysis.
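A minimal sketch of irrealis blocking as a sentence filter. The marker list contains only the examples named above, a real list would be longer, and the tokenisation is deliberately naive; all names here are hypothetical.

```python
import re

# Markers named above; a real system would use a longer list (assumption).
IRREALIS_WORDS = {"would", "expect", "if"}
QUOTE_CHARS = {'"', "\u201c", "\u201d"}


def has_irrealis(sentence):
    """True if the sentence contains an irrealis word or quotation marks."""
    tokens = set(re.findall(r"[a-z']+", sentence.lower()))
    has_quote = any(ch in sentence for ch in QUOTE_CHARS)
    return bool(tokens & IRREALIS_WORDS) or has_quote


def sentences_to_score(sentences):
    """Keep only the sentences that sentiment scoring should consider."""
    return [s for s in sentences if not has_irrealis(s)]


reviews = [
    "The plot is good.",
    "I would expect a good plot.",
    "If only it were good.",
]
print(sentences_to_score(reviews))
```

Matching whole tokens avoids false hits on words that merely contain a marker, such as 'iffy'.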
Sentiwordnet Creation
Seed Set
Two seed sets are created:
Lp for positive synsets
Ln for negative synsets
Each synset representation consists of:
The terms
The definition
The sample phrases
Explicit indication of negation
Wordnet Expansion
Relations of Wordnet used for expansion:
Direct antonymy
Similarity
Derived from
Pertains to
Attribute
Also see
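A rough sketch of seed expansion over a relation graph. It assumes that direct antonymy moves a synset to the opposite seed set, that the other relations preserve polarity, and that synsets landing in both sets are dropped as ambiguous; none of these details are stated above. The toy graph uses plain words in place of synset identifiers, and expand_seeds is a hypothetical name.

```python
def expand_seeds(positive, negative, same_polarity, antonyms, iterations):
    """Grow the two seed sets over a relation graph for `iterations` rounds.

    same_polarity: {synset: synsets reached by polarity-preserving relations}
    antonyms:      {synset: direct antonyms of that synset}
    """
    pos, neg = set(positive), set(negative)
    for _ in range(iterations):
        new_pos, new_neg = set(pos), set(neg)
        for s in pos:
            new_pos |= same_polarity.get(s, set())
            new_neg |= antonyms.get(s, set())  # antonym of a positive synset
        for s in neg:
            new_neg |= same_polarity.get(s, set())
            new_pos |= antonyms.get(s, set())  # antonym of a negative synset
        # Assumption: anything reached with both polarities is ambiguous, drop it.
        overlap = new_pos & new_neg
        pos, neg = new_pos - overlap, new_neg - overlap
    return pos, neg


same_polarity = {"good": {"nice"}, "nice": {"pleasant"}, "bad": {"awful"}}
antonyms = {"good": {"bad"}, "bad": {"good"}}
pos, neg = expand_seeds({"good"}, set(), same_polarity, antonyms, iterations=2)
print(sorted(pos), sorted(neg))
```

With iterations=0 the seed sets are returned unchanged, which corresponds to training on the seeds alone.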
Classifiers
8 classifiers were created differing in:
No of iterations of expansion (0, 2, 4, 6)
Learning algorithm (SVM, Rocchio)
Each ternary classifier is a combination of 2 binary classifiers:
Positive vs. Not Positive
Negative vs. Not Negative
The results are combined as:
             | Positive  | Not Positive
Negative     | Objective | Negative
Not Negative | Positive  | Objective
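The combination table can be sketched as a small function, followed by one possible way of turning the labels from several ternary classifiers into three scores. Taking each score as the fraction of classifiers that voted for it is an assumption here, and the eight votes are made up for illustration.

```python
from collections import Counter


def ternary_label(is_positive, is_negative):
    """Combine the two binary decisions using the table above."""
    if is_positive and not is_negative:
        return "positive"
    if is_negative and not is_positive:
        return "negative"
    return "objective"  # both classifiers fired, or neither did


def committee_scores(labels):
    """Turn the labels from several ternary classifiers into three scores.

    Assumption: each score is the fraction of classifiers voting for it.
    """
    counts = Counter(labels)
    n = len(labels)
    return {k: counts[k] / n for k in ("positive", "negative", "objective")}


votes = ["positive"] * 6 + ["objective"] * 2  # hypothetical output of 8 classifiers
scores = committee_scores(votes)
print(scores)
```

Under this assumption the three scores always sum to 1 and, with 8 classifiers, move in steps of 0.125.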
MTech Seminar Presentation [IIT-Bombay]

  • 1. Resources for Sentiment Analysis Seminar Presentation Sagar Ahire 133050073 IIT Bombay 02 May, 2014 Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 1 / 48
  • 2. Roadmap 1 Introduction 2 Sentiwordnet 3 SO-CAL 4 Wordnet-Affect 5 Indian-Language Sentiwordnets 6 Conclusions Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 2 / 48
  • 3. Introduction Roadmap: We Are Here 1 Introduction 2 Sentiwordnet 3 SO-CAL 4 Wordnet-Affect 5 Indian-Language Sentiwordnets 6 Conclusions Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 3 / 48
  • 4. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 5. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Four resources are covered, each using a different approach for representation and creation: Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 6. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Four resources are covered, each using a different approach for representation and creation: Sentiwordnet, created automatically, with 3 graded scores per synset Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 7. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Four resources are covered, each using a different approach for representation and creation: Sentiwordnet, created automatically, with 3 graded scores per synset SO-CAL, created manually, with a graded score per word Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 8. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Four resources are covered, each using a different approach for representation and creation: Sentiwordnet, created automatically, with 3 graded scores per synset SO-CAL, created manually, with a graded score per word Wordnet-Affect, created semi-automatically, with affect information for each synset Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 9. Introduction Overview Overview An overview of today’s presentation: This presentation covers lexical resources for sentiment analysis. Four resources are covered, each using a different approach for representation and creation: Sentiwordnet, created automatically, with 3 graded scores per synset SO-CAL, created manually, with a graded score per word Wordnet-Affect, created semi-automatically, with affect information for each synset Indian-Language Sentiwordnet, created by projecting the English Sentiwordnet Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 4 / 48
  • 10. Introduction Sentiment Analysis Sentiment Analysis Sentiment Analysis: Determining the opinion expressed in a text Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 5 / 48
  • 11. Introduction Sentiment Analysis Sentiment Analysis Sentiment Analysis: Determining the opinion expressed in a text Approaches: Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 5 / 48
  • 12. Introduction Sentiment Analysis Sentiment Analysis Sentiment Analysis: Determining the opinion expressed in a text Approaches: Classifier-based Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 5 / 48
  • 13. Introduction Sentiment Analysis Sentiment Analysis Sentiment Analysis: Determining the opinion expressed in a text Approaches: Classifier-based Lexicon-based Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 5 / 48
  • 14. Introduction Sentiment Analysis Why Lexicon-based Approach? The classifier-based approach has the following drawbacks: Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 6 / 48
  • 15. Introduction Sentiment Analysis Why Lexicon-based Approach? The classifier-based approach has the following drawbacks: Domain Specificity (Example: Movie reviews mentioning ‘writer’, ‘plot’, etc.) [Bro01] Lack of Context (Example: ‘good’ vs ‘not good’ vs ‘not very good’) Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 6 / 48
  • 16. Introduction Sentiment Analysis Why Lexicon-based Approach? The classifier-based approach has the following drawbacks: Domain Specificity (Example: Movie reviews mentioning ‘writer’, ‘plot’, etc.) [Bro01] Lack of Context (Example: ‘good’ vs ‘not good’ vs ‘not very good’) The lexicon-based approach aims at solving these problems. Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 6 / 48
  • 17. Introduction Sentiment Lexicons Sentiment Lexicons A sentiment lexicon is a sentiment database for language units of the form (lexical unit, sentiment). Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 7 / 48
  • 18. Introduction Sentiment Lexicons Sentiment Lexicons A sentiment lexicon is a sentiment database for language units of the form (lexical unit, sentiment). Choices for lexical unit: Word Word sense Phrase, etc. Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 7 / 48
  • 19. Introduction Sentiment Lexicons Sentiment Lexicons A sentiment lexicon is a sentiment database for language units of the form (lexical unit, sentiment). Choices for lexical unit: Word Word sense Phrase, etc. Choices for sentiment: Fixed categorization into ‘positive’ and ‘negative’ Graded sets like ‘strongly positive’, ‘mildly positive’, ‘neutral’, ‘mildly negative’, ‘strongly negative’ Score in an interval like [0, 1] or [−1, +1] Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 7 / 48
  • 20. Introduction Sentiment Lexicons Approaches for Creation Manual Automatic Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 8 / 48
  • 21. Sentiwordnet Roadmap: We Are Here 1 Introduction 2 Sentiwordnet 3 SO-CAL 4 Wordnet-Affect 5 Indian-Language Sentiwordnets 6 Conclusions Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 9 / 48
  • 25. Introduction to Sentiwordnet. Sentiwordnet [ES06] is an automatically generated sentiment lexicon built on Wordnet. Its salient features are: high coverage; support for graded sentiment labels; support for both sentiment classification and subjectivity detection.
  • 28. Structure of Sentiwordnet. Sentiwordnet = Wordnet + sentiment information. Each synset s is given three sentiment scores: a positive score Pos(s), a negative score Neg(s) and an objective score Obj(s), with Pos(s) + Neg(s) + Obj(s) = 1. Example: for the synset beautiful, Pos = 0.75, Neg = 0.00, Obj = 0.25 (URL: http://sentiwordnet.isti.cnr.it/search.php?q=beautiful).
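The constraint Pos(s) + Neg(s) + Obj(s) = 1 means only two of the three scores need to be stored. The following is a minimal sketch of one way to represent an entry; the class name `SynsetScores` and its fields are illustrative assumptions, not part of Sentiwordnet's own distribution format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Hypothetical container: Pos and Neg are stored, Obj is derived."""
    pos: float
    neg: float

    def __post_init__(self):
        # Each score lies in [0, 1], and Pos + Neg cannot exceed 1
        # because Obj = 1 - Pos - Neg must also be non-negative.
        if not (0.0 <= self.pos <= 1.0 and 0.0 <= self.neg <= 1.0):
            raise ValueError("scores must lie in [0, 1]")
        if self.pos + self.neg > 1.0:
            raise ValueError("Pos + Neg must not exceed 1")

    @property
    def obj(self) -> float:
        return 1.0 - self.pos - self.neg


# The 'beautiful' example from the slide.
beautiful = SynsetScores(pos=0.75, neg=0.00)
print(beautiful.obj)  # 0.25
```

Deriving Obj rather than storing it keeps the three scores consistent by construction.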
  • 33. Creation Steps. The top-level steps in the algorithm to create Sentiwordnet are: (1) selection of a seed set; (2) expansion using Wordnet’s semantic relations; (3) training of a team of ternary classifiers; (4) classification of each Wordnet synset using the classifiers.
  • 34. Roadmap: We Are Here. Section 3 of 6: SO-CAL.
  • 38. Introduction to SO-CAL. SO-CAL is a system that uses a manually constructed lexicon. Its salient features are: a highly detailed lexicon; graded sentiment labels; low coverage, but high accuracy.
  • 44. Features Used. SO-CAL groups words into several feature types and treats each type differently in the lexicon: adjectives; nouns, verbs, adverbs and multiwords; intensifiers and downtoners; negation; irrealis blocking.
  • 47. Structure of SO-CAL. Sentiment scoring: words are scored in [−5, +5]; intensifiers and negation then act upon these scores. Examples: good: +3; monstrosity: −5; masterpiece: +5.
  • 48. Roadmap: We Are Here. Section 4 of 6: Wordnet-Affect.
  • 50. Introduction to Wordnet-Affect. Wordnet-Affect [SV04] is a semi-automatically generated sentiment lexicon built on Wordnet. It associates affective information with each synset. Its salient features are: a high level of detail; the ability to handle sentiment differently depending on the emotion.
  • 52. Structure of Wordnet-Affect. Wordnet-Affect = Wordnet + affect information. Affect is represented using: an a-label, which represents the emotion; and the valency, which indicates the sentiment.
  • 54. Structure of Wordnet-Affect. The a-labels form a tree of emotions starting at a root node, with each leaf node corresponding to a synset. The valency can be positive, negative, neutral or ambiguous.
  • 55. Structure of Wordnet-Affect (figure: part of the a-label tree):
      root
        mental-state
          cognitive-state
          affective-state
            mood
            emotion
              positive-emotion
                joy
                  elation
                love
                  worship
              negative-emotion
                sadness
                  melancholy
                shame
                  embarrassment
              ...
          ...
        physical-state
        ...
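The a-label tree can be sketched as a nested mapping, with the path from the root giving the chain of increasingly specific emotions for a label. The nesting below is an assumption reconstructed from the flattened figure, only the nodes named on the slide are included, and `path_to` is a hypothetical helper rather than anything provided by Wordnet-Affect.

```python
# Assumed nesting of the nodes named in the figure; elided branches are omitted.
A_LABELS = {
    "root": {
        "mental-state": {
            "cognitive-state": {},
            "affective-state": {
                "mood": {},
                "emotion": {
                    "positive-emotion": {
                        "joy": {"elation": {}},
                        "love": {"worship": {}},
                    },
                    "negative-emotion": {
                        "sadness": {"melancholy": {}},
                        "shame": {"embarrassment": {}},
                    },
                },
            },
        },
        "physical-state": {},
    }
}


def path_to(label, tree=A_LABELS, prefix=()):
    """Return the list of a-labels from the root down to `label`, or None."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == label:
            return list(here)
        found = path_to(label, children, here)
        if found is not None:
            return found
    return None


print(" > ".join(path_to("elation")))
```

Walking up such a path is one way a system could back off from a fine-grained emotion (elation) to a coarser one (joy, positive-emotion) when less detail is needed.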
  • 58. Creation Steps. Wordnet-Affect was created using the following steps: manual creation of an initial resource; automatic expansion using Wordnet relations.
  • 59. Roadmap: We Are Here. Section 5 of 6: Indian-Language Sentiwordnets.
  • 61. Introduction to Indian-Language Sentiwordnets. Indian-language Sentiwordnets can be created using Wordnet projection [JRB10]. This approach has the following salient features: easy to create once the backing resources are available; no duplication of effort; use of tried-and-tested representations.
  • 64. Creation Steps. The process of projecting a Sentiwordnet has the following steps: (1) fetch a synset from the English Sentiwordnet; (2) find the corresponding Hindi synset using Indowordnet; (3) assign the sentiment scores of the English synset to the Hindi synset.
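The three projection steps can be sketched as a dictionary lookup. This is only an illustration under stated assumptions: the synset identifiers, the score tuples and the `en_to_hi` mapping are made-up placeholders, and neither Sentiwordnet nor Indowordnet is actually accessed.

```python
def project_scores(english_swn, en_to_hi):
    """Copy (Pos, Neg, Obj) scores from English synsets to linked Hindi synsets.

    english_swn: dict mapping an English synset id to its score tuple.
    en_to_hi:    dict mapping an English synset id to a Hindi synset id
                 (standing in for the cross-lingual links of Indowordnet).
    """
    hindi_swn = {}
    for en_id, scores in english_swn.items():   # step 1: fetch an English synset
        hi_id = en_to_hi.get(en_id)             # step 2: find the linked Hindi synset
        if hi_id is None:
            continue                            # no link, so nothing to project
        hindi_swn[hi_id] = scores               # step 3: assign the same scores
    return hindi_swn


# Hypothetical ids and scores, for illustration only.
english_swn = {"en-001": (0.75, 0.00, 0.25), "en-002": (0.00, 0.50, 0.50)}
en_to_hi = {"en-001": "hi-101"}

print(project_scores(english_swn, en_to_hi))  # {'hi-101': (0.75, 0.0, 0.25)}
```

Synsets without a cross-lingual link are skipped, so the projected resource can have fewer entries than the English one.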
  • 65. Roadmap: We Are Here. Section 6 of 6: Conclusions.
  • 66. A Comparison of the Resources (SWN = Sentiwordnet, WN-Affect = Wordnet-Affect, IL-SWN = Indian-Language Sentiwordnets):
      Criterion        | SWN        | SO-CAL   | WN-Affect      | IL-SWN
      Sentiment        | 3 x [0, 1] | [−5, +5] | Affect         | 3 x [0, 1]
      Lexical Unit     | Synset     | Word     | Synset         | Synset
      Backing Resource | Wordnet    | None     | Wordnet        | SWN + Indowordnet
      Creation         | Automatic  | Manual   | Semi-automatic | Projection
      No. of Entries   | 117,000    | 5,000    | 900            | 16,000
  • 70. Concluding Remarks. To conclude, there are three choices in making a sentiment lexicon. Creation approach: manual, automatic, semi-automatic or projection. Lexical unit: word, synset or higher representations. Sentiment: labels, graded scores or affect information.
  • 71. Concluding Remarks: Creation Approach. Manual approach: high annotation accuracy, high time investment, more detail supported. Automatic approach: low annotation accuracy, low time investment, less detail supported.
  • 72. Concluding Remarks: Lexical Unit. Word: unreliable for polysemous words; no pre-processing required; projection is comparatively difficult. Synset: reliable for polysemous words; requires word sense disambiguation (WSD); projection is comparatively easy.
  • 73. Concluding Remarks: Sentiment. Graded scores have been shown to be better than mere labels in general. Moreover, a graded-score resource can always be converted to a label-based resource. Affect information can help in specialized circumstances.
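The conversion from graded scores to labels mentioned above can be sketched as a simple thresholding function. The function name `score_to_label` and the `neutral_band` parameter are illustrative assumptions; the slides do not prescribe a particular threshold.

```python
def score_to_label(score, neutral_band=0.0):
    """Map a signed graded score (e.g. on SO-CAL's [-5, +5] scale) to a label.

    Scores within +/- neutral_band of zero are treated as neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Word scores taken from the SO-CAL examples in this deck.
lexicon = {"good": 3, "monstrosity": -5, "masterpiece": 5}
labels = {word: score_to_label(score) for word, score in lexicon.items()}
print(labels)
```

The reverse conversion is not possible without extra information, which is why a graded resource is the more general of the two.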
  • 75. Future Work. Possible directions in the future: automatic resources for higher-level lexical units such as phrases, trees, etc.; manual resources for synsets; manual lexicons for Indian languages; techniques for building dynamic resources that incorporate ‘netspeak’ and other slang.
  • 76. References I:
      Julian Brooke, A semantic approach to automatic text sentiment analysis, M.A. thesis, Stanford University, 2001.
      Andrea Esuli and Fabrizio Sebastiani, SentiWordNet: A publicly available lexical resource for opinion mining, Proceedings of the 5th Conference on Language Resources and Evaluation (LREC-06), 2006, pp. 417–422.
      Andrea Esuli, Automatic generation of lexical resources for opinion mining: Models, algorithms and applications, Ph.D. thesis, Università di Pisa, 2008.
      Christiane Fellbaum, Wordnet: An electronic lexical database, A Bradford Book, 1998.
  • 77. References II:
      Vasileios Hatzivassiloglou and Kathleen R. McKeown, Predicting the semantic orientation of adjectives, Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, 1997, pp. 174–181.
      Aditya Joshi, Balamurali A R, and Pushpak Bhattacharyya, A fall-back strategy for sentiment analysis in Hindi: a case study, Proceedings of ICON 2010: 8th International Conference on Natural Language Processing, Macmillan Publishers, India, 2010.
      Jaap Kamps, Maarten Marx, Robert J. Mokken, and Maarten de Rijke, Using wordnet to measure semantic orientations of adjectives, Proceedings of LREC-04, 4th International Conference on Language Resources and Evaluation, 2004, pp. 1115–1118.
  • 78. References III:
      Ellen Riloff and Janyce Wiebe, Learning extraction patterns for subjective expressions, Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2003, pp. 105–112.
      Carlo Strapparava and Alessandro Valitutti, WordNet-Affect: an affective extension of WordNet, Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC-04), 2004, pp. 1083–1086.
      Peter D. Turney and Michael L. Littman, Measuring praise and criticism: Inference of semantic orientation from association, ACM Transactions on Information Systems 21 (2003), no. 4, 315–346.
  • 80. Additional Slides: Wordnet. Wordnet [Fel98] is a lexical database organized by word sense. The fundamental unit of storage is called a synset. An example synset: brilliant, superb: of surpassing excellence; “a brilliant performance”; “a superb actor” (URL: http://wordnetweb.princeton.edu/perl/webwn?s=brilliant).
  • 83. Additional Slides: Semantic Relations in Wordnet. Wordnet synsets are linked to each other by semantic relations. Some of them are: antonymy, meronymy, hypernymy, hyponymy, similar to, etc. These relations are helpful in creating the training set for classifying synsets to create Sentiwordnet.
  • 88. Additional Slides: Background, Sentiment Classification. Early work on automatically detecting the sentiment of a word led to today’s lexicons. This included: use of conjunction-separated adjectives [HM97]; PMI-based extraction using Web queries [TL03]; graph expansion using Wordnet [KMMdR04]; classification using Wordnet glosses [Esu08].
  • 91. Additional Slides: Background, Subjectivity Detection. Work that identifies whether a term is subjective at all is needed to filter objective words out of sentiment classification. This includes: adapting Wordnet glosses to subjectivity detection [Esu08]; bootstrapping subjective expressions from a corpus [RW03].
  • 92. Additional Slides: Structure of SO-CAL, Adjectives. Adjectives were collected from a 500-document corpus and annotated with a sentiment score from −5 to +5. Examples: good: +3; sleazy: −3.
  • 93. Additional Slides: Structure of SO-CAL, Nouns, Verbs, Adverbs, Multiwords. The adjective annotation was extended to other parts of speech and to multiword expressions, for a total of about 5,000 words. Examples: monstrosity: −5; masterpiece: +5; inspire: +2; funny: +2 vs. act funny: −1.
  • 95. Additional Slides: Structure of SO-CAL, Intensifiers and Downtoners. Intensifiers are words that increase sentiment intensity, while downtoners are words that reduce it; for example, extraordinarily and somewhat. Intensifiers and downtoners are modeled as percentage modifiers. Examples: slightly: −50%; extraordinarily: +50%.
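A percentage modifier can be sketched as a multiplicative adjustment of the word's score. The percentages below are the two examples from the slide; the function name `apply_modifier` and the decision to multiply the base score by (1 + percentage) are this sketch's assumptions about how the modifier is applied.

```python
# Modifier percentages from the slide, expressed as fractions.
MODIFIERS = {"slightly": -0.50, "extraordinarily": +0.50}


def apply_modifier(score, modifier):
    """Scale a word's sentiment score by the modifier's percentage."""
    return score * (1.0 + MODIFIERS[modifier])


good = 3  # SO-CAL score for 'good'
print(apply_modifier(good, "slightly"))         # 1.5
print(apply_modifier(good, "extraordinarily"))  # 4.5
```

Because the adjustment is relative, the same intensifier moves a strongly scored word further than a weakly scored one.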
  • 96. Additional Slides: Structure of SO-CAL, Negation. Negation is modeled as a numeric shift of 4 towards the opposite sentiment. Examples: good: +3 ⇒ not good: −1; atrocious: −5 ⇒ not atrocious: −1.
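The shift-based treatment of negation can be sketched in a few lines. The shift of 4 comes from the slide; the function name `negate` and the handling of a zero score (left unchanged) are assumptions of this sketch.

```python
NEGATION_SHIFT = 4  # shift value stated on the slide


def negate(score):
    """Shift a score by 4 towards the opposite polarity."""
    if score > 0:
        return score - NEGATION_SHIFT
    if score < 0:
        return score + NEGATION_SHIFT
    return 0  # assumption: a neutral score has no opposite polarity to shift to


print(negate(3))   # good: +3 -> not good: -1
print(negate(-5))  # atrocious: -5 -> not atrocious: -1
```

Unlike simply flipping the sign, the shift makes "not atrocious" mildly negative rather than strongly positive, which matches the slide's examples.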
  • 98. Additional Slides: Structure of SO-CAL, Irrealis Blocking. An irrealis marker is a word that indicates that the sentiment may not be reliable because the event has not actually happened. Examples: ‘would’, ‘expect’, ‘if’, quotation marks, etc. Sentences with irrealis markers are ignored for sentiment analysis.
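Irrealis blocking can be sketched as a filter that drops any sentence containing a marker. The marker list is limited to the examples on the slide, and the regex tokenization, exact-word matching and function names are simplifying assumptions, not SO-CAL's actual implementation.

```python
import re

# Only the markers named on the slide; a real list would be longer.
IRREALIS_WORDS = {"would", "expect", "if"}
QUOTE_CHARS = {'"', "\u201c", "\u201d"}  # straight and curly double quotes


def is_irrealis(sentence):
    """True if the sentence contains an irrealis word or a quotation mark."""
    if any(ch in sentence for ch in QUOTE_CHARS):
        return True
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return any(token in IRREALIS_WORDS for token in tokens)


def keep_realis(sentences):
    """Return only the sentences that should be scored for sentiment."""
    return [s for s in sentences if not is_irrealis(s)]


reviews = [
    "The film was good.",
    "I would expect it to be good.",
    "If it were good, I might go.",
]
print(keep_realis(reviews))  # ['The film was good.']
```

Exact-word matching means inflected forms such as "expected" are not caught here; handling them would need lemmatization or a longer marker list.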
  • 100. Additional Slides: Sentiwordnet Creation, Seed Set. Two seed sets are created: Lp for positive synsets and Ln for negative synsets. Each synset representation consists of: the terms; the definition; the sample phrases; an explicit indication of negation.
  • 102. Additional Slides: Sentiwordnet Creation, Wordnet Expansion. Relations of Wordnet used for expansion: direct antonymy; similarity; derived from; pertains to; attribute; also see.
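The expansion of the seed sets Lp and Ln can be sketched as an iterative walk over relation edges. This sketch assumes that direct antonymy sends a synset to the opposite set while the other listed relations keep it in the same set; the slides do not spell this out. The toy graph and the function name `expand` are made up for illustration and do not use Wordnet itself.

```python
def expand(pos_seeds, neg_seeds, preserving, antonyms, iterations):
    """Grow the positive and negative sets for a fixed number of iterations.

    preserving: dict synset -> set of synsets linked by polarity-preserving
                relations (similarity, derived from, pertains to, ...).
    antonyms:   dict synset -> set of direct antonyms (assumed polarity-flipping).
    """
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(iterations):
        new_pos, new_neg = set(pos), set(neg)
        for synset in pos:
            new_pos |= preserving.get(synset, set())
            new_neg |= antonyms.get(synset, set())
        for synset in neg:
            new_neg |= preserving.get(synset, set())
            new_pos |= antonyms.get(synset, set())
        pos, neg = new_pos, new_neg
    return pos, neg


# Toy relation graph over words standing in for synsets.
preserving = {"good": {"fine"}, "bad": {"awful"}}
antonyms = {"good": {"bad"}, "fine": {"poor"}}

lp, ln = expand({"good"}, set(), preserving, antonyms, iterations=2)
print(sorted(lp), sorted(ln))  # ['fine', 'good'] ['awful', 'bad', 'poor']
```

More iterations give larger training sets at the risk of drifting further from the hand-picked seeds, which is why the number of iterations is varied across classifiers.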
  • 105. Additional Slides: Sentiwordnet Creation, Classifiers. 8 classifiers were created (4 × 2), differing in: the number of iterations of expansion (0, 2, 4, 6); the learning algorithm (SVM, Rocchio).
  • 108. Additional Slides: Sentiwordnet Creation, Classifiers. Each ternary classifier is a combination of 2 binary classifiers: Positive vs. Not Positive, and Negative vs. Not Negative. The results are combined as follows: Positive and Not Negative gives Positive; Not Positive and Negative gives Negative; Positive and Negative, or Not Positive and Not Negative, gives Objective.
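The combination rule for the two binary classifiers can be sketched directly. The last part of the sketch additionally assumes that a synset's scores are the fraction of the 8 classifiers voting for each label; this is consistent with the beautiful example (6 of 8 Positive gives 0.75) but is not stated on these slides. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

# The 8 configurations: 4 expansion depths x 2 learning algorithms.
CONFIGS = list(product((0, 2, 4, 6), ("SVM", "Rocchio")))


def ternary(is_positive, is_negative):
    """Combine the two binary decisions into one ternary label."""
    if is_positive and not is_negative:
        return "Positive"
    if is_negative and not is_positive:
        return "Negative"
    return "Objective"  # both classifiers fire, or neither does


def scores_from_votes(labels):
    """Assumed scoring: share of classifiers that chose each label."""
    counts = Counter(labels)
    total = len(labels)
    return {name: counts[name] / total
            for name in ("Positive", "Negative", "Objective")}


# Hypothetical votes of the 8 classifiers for one synset.
votes = ["Positive"] * 6 + ["Objective"] * 2
print(len(CONFIGS))              # 8
print(scores_from_votes(votes))  # {'Positive': 0.75, 'Negative': 0.0, 'Objective': 0.25}
```

Under this assumption, the three scores sum to 1 by construction, since every classifier casts exactly one vote.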