SlideShare a Scribd company logo
Leveraging Sentiment to Compute
Word Similarity
Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak
Bhattacharyya
Dept. of Computer Science and Engineering, IIT Bombay
6th International Global Wordnet Conference
GWC 2011,
Matsue, Japan, Jan, 2012
Motivation
2










Motivation
3
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?







Motivation
4
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set






Motivation
5
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Motivation
6
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Sentiment-Semantic Correlation
7/23/2013Department of Computer Science and Engineering, IIT Bombay
7
Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS
Meaning 0.768 0.803 0.750 0.527 0.759
Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
WordNet Relations used for
Expansion
7/23/2013Department of Computer Science and Engineering, IIT Bombay
8
POS WordNet relations used for expansion
Nouns hypernym, hyponym, nominalization
Verbs nominalization, hypernym, hyponym
Adjectives also see, nominalization, attribute
Adverbs derived
Scoring Formula
7/23/2013Department of Computer Science and Engineering, IIT Bombay
9
 ScoreSD(A) = SWNpos(A)- SWNneg(A)
 ScoreSM(A)= max(SWNpos(A), SWNneg(A))
 ScoreTM(A) =
sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A),
SWNneg(A)))
SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B)))
Where,
glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n)
scorex(Y) = Sentiment score of word Y using scoring function x
x = Scoring function of type SD/SM/TD/TM
Evaluation on Gold Standard Data:
Word Pair Similarity10
Evaluation on Gold Standard Data:
Word Pair Similarity11
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Evaluation on Gold Standard Data:
Word Pair Similarity12
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Evaluation on Gold Standard Data:
Word Pair Similarity13
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Metric Used Overall NOUN VERB ADJECTIVES ADVERBS
LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37
LIN (Lin, 1998) 0.27 0.24 0.00 NA Na
LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA
SenSim (SD) 0.46 0.73 0.55 0.08 0.76
SenSim (SM) 0.50 0.62 0.48 0.06 0.54
SenSim (TD) 0.45 0.73 0.55 0.08 0.59
SenSim (TM) 0.48 0.62 0.48 0.06 0.78
Evaluation on Travel Review Data:
Feature Replacement
7/23/2013Department of Computer Science and Engineering, IIT Bombay
14
Metric Used Accuracy
(%)
PP NP PR NR
Baseline 89.10 91.50 87.07 85.18 91.24
LESK
(Banerjee et al., 2003)
89.36 91.57 87.46 85.68 91.25
LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90
LCH
(Leacock et al., 1998)
89.64 90.48 88.86 86.47 89.63
SenSim (SD) 89.95 91.39 88.65 87.11 90.93
SenSim (SM) 90.06 92.01 88.38 86.67 91.58
SenSim (TD) 90.11 91.68 88.69 86.97 91.23

More Related Content

Similar to Leveraging Sentiment to Compute Word Similarity

Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
Caroline Chua
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
guesta59df6
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
Andre Freitas
 

Similar to Leveraging Sentiment to Compute Word Similarity (20)

Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.ppt
 
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
 
Lac presentation
Lac presentationLac presentation
Lac presentation
 
A Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment AnalysisA Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment Analysis
 
Analyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAnalyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in Python
 
Detecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi LanguageDetecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi Language
 
An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Fyp ca2
Fyp ca2Fyp ca2
Fyp ca2
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology Classes
 
NLP
NLPNLP
NLP
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
 
DETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTDETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENT
 
N01741100102
N01741100102N01741100102
N01741100102
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
Word sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy wordsWord sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy words
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
 

More from Subhabrata Mukherjee

Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Subhabrata Mukherjee
 

More from Subhabrata Mukherjee (18)

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsXtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Fact Checking from Text
Fact Checking from TextFact Checking from Text
Fact Checking from Text
 
OpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product ProfilesOpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product Profiles
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Continuous Experience-aware Language Model
Continuous Experience-aware Language ModelContinuous Experience-aware Language Model
Continuous Experience-aware Language Model
 
Experience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review CommunitiesExperience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review Communities
 
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
 
Leveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesLeveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News Communities
 
People on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health ForumsPeople on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health Forums
 
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
 
Joint Author Sentiment Topic Model
Joint Author Sentiment Topic ModelJoint Author Sentiment Topic Model
Joint Author Sentiment Topic Model
 
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in TwitterTwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
 
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
 
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
 
Feature specific analysis of reviews
Feature specific analysis of reviewsFeature specific analysis of reviews
Feature specific analysis of reviews
 
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
 
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
 

Recently uploaded

Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Peter Udo Diehl
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Ransomware Mallox [EN].pdf
Ransomware         Mallox       [EN].pdfRansomware         Mallox       [EN].pdf
Ransomware Mallox [EN].pdf
 

Leveraging Sentiment to Compute Word Similarity

  • 1. Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global Wordnet Conference GWC 2011, Matsue, Japan, Jan, 2012
  • 3. Motivation 3  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?       
  • 4. Motivation 4  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set      
  • 5. Motivation 5  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 6. Motivation 6  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 7. Sentiment-Semantic Correlation 7/23/2013Department of Computer Science and Engineering, IIT Bombay 7 Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS Meaning 0.768 0.803 0.750 0.527 0.759 Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
  • 8. WordNet Relations used for Expansion 7/23/2013Department of Computer Science and Engineering, IIT Bombay 8 POS WordNet relations used for expansion Nouns hypernym, hyponym, nominalization Verbs nominalization, hypernym, hyponym Adjectives also see, nominalization, attribute Adverbs derived
  • 9. Scoring Formula 7/23/2013Department of Computer Science and Engineering, IIT Bombay 9  ScoreSD(A) = SWNpos(A)- SWNneg(A)  ScoreSM(A)= max(SWNpos(A), SWNneg(A))  ScoreTM(A) = sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A), SWNneg(A))) SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B))) Where, glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n) scorex(Y) = Sentiment score of word Y using scoring function x x = Scoring function of type SD/SM/TD/TM
  • 10. Evaluation on Gold Standard Data: Word Pair Similarity10
  • 11. Evaluation on Gold Standard Data: Word Pair Similarity11 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment
  • 12. Evaluation on Gold Standard Data: Word Pair Similarity12 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient
  • 13. Evaluation on Gold Standard Data: Word Pair Similarity13 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient Metric Used Overall NOUN VERB ADJECTIVES ADVERBS LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37 LIN (Lin, 1998) 0.27 0.24 0.00 NA Na LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA SenSim (SD) 0.46 0.73 0.55 0.08 0.76 SenSim (SM) 0.50 0.62 0.48 0.06 0.54 SenSim (TD) 0.45 0.73 0.55 0.08 0.59 SenSim (TM) 0.48 0.62 0.48 0.06 0.78
  • 14. Evaluation on Travel Review Data: Feature Replacement 7/23/2013Department of Computer Science and Engineering, IIT Bombay 14 Metric Used Accuracy (%) PP NP PR NR Baseline 89.10 91.50 87.07 85.18 91.24 LESK (Banerjee et al., 2003) 89.36 91.57 87.46 85.68 91.25 LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90 LCH (Leacock et al., 1998) 89.64 90.48 88.86 86.47 89.63 SenSim (SD) 89.95 91.39 88.65 87.11 90.93 SenSim (SM) 90.06 92.01 88.38 86.67 91.58 SenSim (TD) 90.11 91.68 88.69 86.97 91.23