Leveraging Sentiment to Compute Word Similarity


Leveraging Sentiment to Compute Word Similarity, Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya, In Proceedings of the 6th International Global Wordnet Conference (GWC 2011), Matsue, Japan, Jan, 2012 (http://www.cse.iitb.ac.in/~pb/papers/gwc12-sense-sa.pdf)


Leveraging Sentiment to Compute Word Similarity
Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya
Dept. of Computer Science and Engineering, IIT Bombay
6th International Global Wordnet Conference (GWC 2011), Matsue, Japan, Jan, 2012

Motivation
• Introduce sentiment as another feature in the semantic similarity measure
• “Among a set of similar word pairs, a pair is more similar if their sentiment content is the same”
• Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight” (give pleasure to or be pleasing to)?
• Useful for replacing an unknown feature in the test set with a similar feature in the training set
• Given a word in a sentence, create its Similarity Vector
– Use Word Sense Disambiguation on context to find its Synset-id
– Create a Gloss Vector (sparse) using its gloss
– Extend gloss using relevant WordNet Relations
– Learn the relations to use for different POS tags and the depth in WordNet hierarchy
– Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
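The first vector-building steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `GLOSSES` dictionary is a hypothetical stand-in for WordNet, the synset ids are invented, the stopword list is arbitrary, and word sense disambiguation is assumed to have already chosen the synset id.

```python
from collections import Counter

# Hypothetical sense inventory standing in for WordNet: synset id -> gloss.
# The ids are invented; the glosses are the ones quoted on the slide.
GLOSSES = {
    "enchant.v.01": "hold spellbound",
    "endear.v.01": "make endearing or lovable",
    "delight.v.01": "give pleasure to or be pleasing to",
}

# Arbitrary stopword list chosen for this sketch.
STOPWORDS = {"to", "or", "be", "a", "the", "of"}


def gloss_vector(synset_id):
    """Return a sparse bag-of-words vector (word -> count) for a synset's gloss."""
    tokens = GLOSSES[synset_id].lower().split()
    return Counter(t for t in tokens if t not in STOPWORDS)


vec = gloss_vector("delight.v.01")
print(dict(vec))
```

Only words that occur in the gloss are stored, which is what makes the vector sparse.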

Sentiment-Semantic Correlation
7/23/2013, Department of Computer Science and Engineering, IIT Bombay

Annotation Strategy    Overall   NOUN    VERB    ADJECTIVES   ADVERBS
Meaning                0.768     0.803   0.750   0.527        0.759
Meaning + Sentiment    0.799     0.750   0.889   0.720        0.844

WordNet Relations used for Expansion

POS          WordNet relations used for expansion
Nouns        hypernym, hyponym, nominalization
Verbs        nominalization, hypernym, hyponym
Adjectives   also see, nominalization, attribute
Adverbs      derived
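One way to picture the expansion step is a breadth-first walk that follows only the relations allowed for the word's part of speech, up to a chosen depth. The sketch below assumes a hypothetical `TOY_GRAPH` in place of WordNet (its synset ids and links are invented for illustration); the relation lists are copied from the table above, and the function name `expand` is an assumption.

```python
# Relations per part of speech, copied from the table above.
RELATIONS_BY_POS = {
    "noun": ["hypernym", "hyponym", "nominalization"],
    "verb": ["nominalization", "hypernym", "hyponym"],
    "adjective": ["also see", "nominalization", "attribute"],
    "adverb": ["derived"],
}

# Hypothetical toy graph: synset id -> {relation: [related synset ids]}.
TOY_GRAPH = {
    "delight.v.01": {"hypernym": ["please.v.01"], "antonym": ["displease.v.01"]},
    "please.v.01": {"hypernym": ["satisfy.v.01"]},
    "satisfy.v.01": {},
    "displease.v.01": {},
}


def expand(synset_id, pos, depth):
    """Collect synsets reachable within `depth` hops over the relations allowed for `pos`."""
    allowed = RELATIONS_BY_POS[pos]
    seen = {synset_id}
    frontier = [synset_id]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation in allowed:
                for neighbour in TOY_GRAPH.get(node, {}).get(relation, []):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        next_frontier.append(neighbour)
        frontier = next_frontier
    return seen


print(sorted(expand("delight.v.01", "verb", depth=1)))
print(sorted(expand("delight.v.01", "verb", depth=2)))
```

The glosses of the collected synsets would then be appended to the original gloss before the vector is built. The antonym link in the toy graph is never followed because it is not in the verb relation list.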

Scoring Formula
• Score_SD(A) = SWN_pos(A) − SWN_neg(A)
• Score_SM(A) = max(SWN_pos(A), SWN_neg(A))
• Score_TM(A) = sign(max(SWN_pos(A), SWN_neg(A))) ∗ (1 + abs(max(SWN_pos(A), SWN_neg(A))))
SenSim_x(A, B) = cosine(glossvec(sense(A)), glossvec(sense(B)))
where
glossvec = 1:score_x(1) 2:score_x(2) … n:score_x(n)
score_x(Y) = sentiment score of word Y using scoring function x
x = scoring function of type SD/SM/TD/TM
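The three scoring functions given on the slide and the cosine over sparse vectors can be sketched as below. This is an illustrative reading of the formulas, not the authors' code: the `SWN` table holds hypothetical (positive, negative) scores rather than real SentiWordNet values, the helper names are assumptions, and the TD variant is left out because the slide does not define it.

```python
import math


def score_sd(pos, neg):
    """Score_SD: difference between positive and negative SentiWordNet scores."""
    return pos - neg


def score_sm(pos, neg):
    """Score_SM: the larger of the two SentiWordNet scores."""
    return max(pos, neg)


def score_tm(pos, neg):
    """Score_TM as written on the slide: sign(m) * (1 + |m|), with m = max(pos, neg)."""
    m = max(pos, neg)
    sign = (m > 0) - (m < 0)
    return sign * (1 + abs(m))


def sentiment_vector(words, swn, score):
    """Sparse vector: word -> sentiment score, for the words present in `swn`."""
    return {w: score(*swn[w]) for w in words if w in swn}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical (pos, neg) scores; real values would come from SentiWordNet.
SWN = {"pleasure": (0.75, 0.0), "lovable": (0.5, 0.0), "pain": (0.0, 0.625)}

a = sentiment_vector(["pleasure", "lovable"], SWN, score_sd)
b = sentiment_vector(["pleasure", "pain"], SWN, score_sd)
print(cosine(a, b))
```

Passing `score_sm` or `score_tm` instead of `score_sd` switches the scoring function x without changing the rest of the computation.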

Evaluation on Gold Standard Data: Word Pair Similarity
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient

Metric Used                    Overall   NOUN    VERB    ADJECTIVES   ADVERBS
LESK (Banerjee et al., 2003)   0.22      0.51    -0.91   0.19         0.37
LIN (Lin, 1998)                0.27      0.24    0.00    NA           NA
LCH (Leacock et al., 1998)     0.36      0.34    0.44    NA           NA
SenSim (SD)                    0.46      0.73    0.55    0.08         0.76
SenSim (SM)                    0.50      0.62    0.48    0.06         0.54
SenSim (TD)                    0.45      0.73    0.55    0.08         0.59
SenSim (TM)                    0.48      0.62    0.48    0.06         0.78
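For readers who want to see how an agreement figure of this kind is computed, here is a small sketch of the Pearson correlation coefficient between human ratings and a metric's scores. The two lists are hypothetical values made up for illustration, not data from the evaluation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: human ratings (1-5) and metric scores for five word pairs.
human = [5, 4, 2, 1, 3]
metric = [0.9, 0.7, 0.3, 0.2, 0.4]
print(round(pearson(human, metric), 3))
```

A value near 1 means the metric ranks word pairs much as the annotators do, a value near 0 means no linear relationship, and a negative value means the metric tends to disagree with them.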

Evaluation on Travel Review Data: Feature Replacement

Metric Used                    Accuracy (%)   PP      NP      PR      NR
Baseline                       89.10          91.50   87.07   85.18   91.24
LESK (Banerjee et al., 2003)   89.36          91.57   87.46   85.68   91.25
LIN (Lin, 1998)                89.27          91.24   87.61   85.85   90.90
LCH (Leacock et al., 1998)     89.64          90.48   88.86   86.47   89.63
SenSim (SD)                    89.95          91.39   88.65   87.11   90.93
SenSim (SM)                    90.06          92.01   88.38   86.67   91.58
SenSim (TD)                    90.11          91.68   88.69   86.97   91.23
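The feature replacement idea (swap a test-set feature that never appeared in training for its most similar training feature) can be sketched as follows. The function name, the `threshold` parameter and the `TOY_SIM` scores are assumptions made for this illustration; in practice the similarity function would be SenSim or one of the baseline metrics in the table.

```python
def replace_unknown_features(test_features, train_vocab, similarity, threshold=0.0):
    """Map each test feature unseen in training to its most similar training feature."""
    replaced = []
    for feature in test_features:
        if feature in train_vocab:
            replaced.append(feature)
            continue
        best = max(train_vocab, key=lambda known: similarity(feature, known))
        if similarity(feature, best) > threshold:
            replaced.append(best)
        else:
            replaced.append(feature)  # nothing similar enough; keep as is
    return replaced


# Hypothetical similarity scores standing in for SenSim or another metric.
TOY_SIM = {("delightful", "pleasant"): 0.8, ("delightful", "awful"): 0.1}


def toy_similarity(a, b):
    """Look up a made-up similarity score; unknown pairs score 0.0."""
    return TOY_SIM.get((a, b), 0.0)


train_vocab = ["pleasant", "awful", "hotel"]
print(replace_unknown_features(["delightful", "hotel"], train_vocab, toy_similarity))
```

A similarity measure that ignores sentiment could map an unknown word to a training feature of opposite polarity, which is the case the sentiment-aware scores are meant to avoid.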


