Sentiment Analysis
Presented by
Aditya Joshi 08305908
Guided by
Prof. Pushpak Bhattacharyya
IIT Bombay
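What is SA & OM? (Sentiment Analysis & Opinion Mining)
• Identify the orientation of opinion in a piece of text
• Can be generalized to a wider set of emotions
• Examples:
– ‘The movie was fabulous!’ (positive)
– ‘The movie stars Mr. X’ (objective, no opinion)
– ‘The movie was horrible!’ (negative)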
Motivation
• Recognizing sentiment comes naturally to a human being.
Can a machine be trained to do it?
• SA aims to extract sentiment-related knowledge, especially from the huge
amount of information on the internet
• More generally, it can be used to understand the opinion expressed in a
set of documents
Tripod of Sentiment Analysis
[Figure: Sentiment Analysis resting on three legs: Cognitive Science, Natural Language Processing and Machine Learning]
Contents:
• Challenges
• Lexical Resources
• Subjectivity detection
• SA Approaches
• Applications
Challenges
• Contrasts with standard text-based categorization
– In text categorization, the mere presence of words is indicative of the category.
– This is not the case with sentiment analysis.
• Domain dependent
– The sentiment of a word is w.r.t. the domain.
– Example: ‘unpredictable’ is negative for the steering of a car, but positive for a movie review.
• Sarcasm
– Sarcasm uses words of one polarity to express the opposite polarity.
– Example: The perfume is so amazing that I suggest you wear it with your windows shut
• Dissatisfied expressions
– The sentences/words that contradict the overall sentiment of the text are in the majority.
– Example: The actors are good, the music is brilliant and appealing. Yet, the movie fails to strike a chord.
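The first, third and fourth challenges can be made concrete with a small sketch. It classifies text by counting polarity words, the way a presence-based categorizer would, and gets both example sentences wrong. The two word lists and the function name are hypothetical and chosen only for this illustration.

```python
# Hypothetical mini-lexicons, chosen only for this illustration.
POSITIVE = {"good", "brilliant", "appealing", "amazing", "fabulous"}
NEGATIVE = {"horrible", "fails", "poor"}


def word_count_polarity(text):
    """Classify by counting polarity words, ignoring order and context."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos == neg:
        return "neutral"
    return "positive" if pos > neg else "negative"


review = ("The actors are good, the music is brilliant and appealing. "
          "Yet, the movie fails to strike a chord.")
sarcasm = ("The perfume is so amazing that I suggest you wear it "
           "with your windows shut")

# Both sentences express a negative opinion, yet the counts say otherwise.
print(word_count_polarity(review))   # three positive words against one negative
print(word_count_polarity(sarcasm))  # one positive word, no negative word
```

Word presence alone is therefore not enough; the later slides add subjectivity detection and syntactic information to address this.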
SentiWordNet
• Lexical resource for sentiment analysis
• Built on top of WordNet synsets
• Attaches sentiment-related information to synsets
Quantifying sentiment
[Figure: triangle that positions a term sense between Positive and Negative (subjective polarity) and Objective (objective polarity)]
• Each term sense has a Positive, a Negative and an Objective score. The
scores sum to one.
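A minimal sketch of this three-score representation follows. Because the scores sum to one, only the positive and negative scores need to be stored and the objective score is the remainder. The class name, the sense identifiers and all numeric values are made up for illustration; they are not real SentiWordNet entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseScore:
    """Sentiment scores for one term sense (values here are made up)."""
    positive: float
    negative: float

    def __post_init__(self):
        if self.positive < 0 or self.negative < 0:
            raise ValueError("scores must be non-negative")
        if self.positive + self.negative > 1:
            raise ValueError("positive + negative must not exceed one")

    @property
    def objective(self):
        # The three scores sum to one, so the objective score is the remainder.
        return 1.0 - self.positive - self.negative


# Hypothetical entries keyed by a sense identifier.
scores = {
    "fabulous#1": SenseScore(positive=0.75, negative=0.0),
    "horrible#1": SenseScore(positive=0.0, negative=0.625),
    "wooden#1": SenseScore(positive=0.0, negative=0.0),
}

for sense, sc in scores.items():
    print(sense, sc.positive, sc.negative, sc.objective)
```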
Building SentiWordNet
• Lp, Ln and Lo are the three seed sets (positive, negative and objective)
• Iteratively expand the seed sets through K steps
• Train a classifier on the expanded sets
Expansion of seed sets
[Figure: the seed sets Lp and Ln growing at each expansion step]
• The sets at the end of the kth step are called Tr(k,p) and Tr(k,n)
• Tr(k,o) is the set of synsets that are in neither Tr(k,p) nor Tr(k,n)
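The expansion step can be sketched as follows, assuming that words related by a similarity relation keep the same orientation and that antonyms take the opposite one. The toy relation tables, the vocabulary and the function name are hypothetical; a real implementation would walk WordNet relations between synsets and would also need a policy for words that end up in both sets.

```python
def expand_seed_sets(pos_seeds, neg_seeds, similar, antonyms, k):
    """Expand positive/negative seed sets for k steps over toy relations.

    similar[w]  -> words assumed to share w's orientation
    antonyms[w] -> words assumed to have the opposite orientation
    """
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_pos, new_neg = set(pos), set(neg)
        for w in pos:
            new_pos.update(similar.get(w, ()))
            new_neg.update(antonyms.get(w, ()))
        for w in neg:
            new_neg.update(similar.get(w, ()))
            new_pos.update(antonyms.get(w, ()))
        pos, neg = new_pos, new_neg
    return pos, neg


# Toy relations standing in for WordNet (hypothetical).
similar = {"good": ["nice"], "nice": ["pleasant"], "bad": ["awful"]}
antonyms = {"good": ["bad"], "nice": ["nasty"]}
vocabulary = {"good", "nice", "pleasant", "bad", "awful", "nasty", "wooden"}

tr_p, tr_n = expand_seed_sets({"good"}, {"bad"}, similar, antonyms, k=2)
tr_o = vocabulary - tr_p - tr_n  # everything in neither expanded set

print(sorted(tr_p), sorted(tr_n), sorted(tr_o))
```

With k=1 the sets are smaller, which matches the observation on the next slide that low values of K trade recall for precision.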
Committee of classifiers
• Train a committee of classifiers of different
types and different K-values for the given
data
• Observations:
– Low values of K give high precision and low
recall
– Accuracy in determining positivity or
negativity, however, remains almost constant
WordNet-Affect
• Similar to SentiWordNet (an earlier work)
• WordNet-Affect: WordNet + annotated
affective concepts in hierarchical order
• Hierarchy called ‘affective domain labels’
– behaviour
– personality
– cognitive state
Subjectivity detection
• Aim: To extract subjective portions of text
• Algorithm used: Minimum cut algorithm
Constructing the graph
• Why graphs?
– To model item-specific and pairwise information independently.
• Nodes and edges?
– Nodes: the sentences of the document, plus a source and a sink
– The source and sink represent the two classes of sentences
– Edges: weighted with either of the two scores
• Individual scores
– Prediction of whether the sentence is subjective or not
– Indsub(si) = [formula not preserved in the extracted text]
• Association scores
– Prediction of whether two sentences should have the same subjectivity level
– T: threshold, the maximum distance up to which sentences may be considered proximal
– f: the decaying function
– i, j: position numbers
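The association score built from T, f, i and j can be sketched as below, assuming the form used by Pang and Lee (2004): the score is f(j - i) times a constant c when the two sentences are at most T positions apart, and zero otherwise. The default values of T and c and the exponential decay are illustrative assumptions, not values taken from the slides.

```python
import math


def assoc(i, j, T=3, c=0.5, f=lambda d: math.exp(1 - d)):
    """Association score between the sentences at positions i and j.

    T, c and the decay function f are illustrative choices.
    """
    d = abs(j - i)
    if d == 0 or d > T:
        return 0.0  # same sentence, or too far apart to be proximal
    return f(d) * c


print(assoc(1, 2))  # adjacent sentences get the strongest link
print(assoc(1, 3))  # the link decays with distance
print(assoc(1, 5))  # beyond the threshold T the link vanishes
```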
Constructing the graph
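• Build an undirected graph G with vertices
{v1, v2, …, vn, s, t} (the sentences, plus the source s and the sink t); vertex vi stands for sentence xi
• Add edges (s, vi), each with weight ind1(xi)
• Add edges (t, vi), each with weight ind2(xi)
• Add edges (vi, vk) with weight assoc(vi, vk)
• Partition cost: cost(C1, C2) = Σ_{x ∈ C1} ind2(x) + Σ_{x ∈ C2} ind1(x) + Σ_{xi ∈ C1, xk ∈ C2} assoc(xi, xk)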
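The cost of a partition is the total weight of the edges it cuts: each sentence pays the individual score of the class it was not assigned to, and every association edge whose endpoints land in different classes adds its weight. The sketch below finds the cheapest partition by brute force over all labelings, which is only feasible for a handful of sentences; an actual system would use a max-flow/min-cut algorithm instead. It assumes ind1 is a subjectivity probability and ind2 = 1 - ind1, and all scores are made up.

```python
from itertools import product


def partition_cost(labels, ind_sub, assoc_scores):
    """Cost of assigning labels[i] in {"sub", "obj"} to each sentence i."""
    cost = 0.0
    for i, label in enumerate(labels):
        # Placing sentence i in one class cuts its edge to the other terminal.
        cost += (1.0 - ind_sub[i]) if label == "sub" else ind_sub[i]
    for (i, k), weight in assoc_scores.items():
        if labels[i] != labels[k]:
            cost += weight  # this association edge crosses the cut
    return cost


def best_partition(ind_sub, assoc_scores):
    """Exhaustive search for the minimum-cost labeling (toy sizes only)."""
    n = len(ind_sub)
    return min(product(("sub", "obj"), repeat=n),
               key=lambda labels: partition_cost(labels, ind_sub, assoc_scores))


# Made-up scores: sentence 1 is undecided on its own (0.5) but is strongly
# associated with the clearly subjective sentence 0.
ind_sub = [0.8, 0.5, 0.1]
assoc_scores = {(0, 1): 1.0, (1, 2): 0.1}

labels = best_partition(ind_sub, assoc_scores)
print(labels, partition_cost(labels, ind_sub, assoc_scores))
```

In this toy case the strong association edge pulls the undecided sentence into the subjective class, which is the behaviour the pairwise scores are meant to provide.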
Example
[Figure: sample cuts on an example sentence graph]
Results (1/2)
[Figure: Document → Subjectivity detector → subjective extract → POLARITY CLASSIFIER; objective sentences are set aside]
• Naïve Bayes, no extraction: 82.8%
• Naïve Bayes, subjective extraction: 86.4%
• Naïve Bayes, ‘flipped experiment’ (classifying the objective extract instead): 71%
Results (2/2)
Approach 1: Using adjectives
• Many adjectives have high sentiment
value
– A ‘beautiful’ bag
– A ‘wooden’ bench
– An ‘embarrassing’ performance
• The idea is to augment the adjectives in WordNet with this polarity
information
Setup
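• Two anchor words (extremes of the polarity spectrum) were chosen: ‘excellent’ and ‘poor’
• The PMI (pointwise mutual information) of each adjective with respect to these
anchor words is calculated
Polarity Score (W) = PMI(W, excellent) – PMI(W, poor)
[Figure: a word linked to the anchors ‘excellent’ and ‘poor’ by its two PMI values]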
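The polarity score can be sketched from co-occurrence counts, using the standard definition PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ). All counts below are invented, and how the original work estimated the probabilities is not stated on the slide; real counts would also need smoothing, since a zero co-occurrence count makes the logarithm undefined.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information, log2( p(x, y) / (p(x) * p(y)) )."""
    return math.log2((count_xy * total) / (count_x * count_y))


def polarity_score(word, counts, cooccur, total):
    """PMI(word, 'excellent') minus PMI(word, 'poor')."""
    with_excellent = pmi(cooccur[(word, "excellent")], counts[word],
                         counts["excellent"], total)
    with_poor = pmi(cooccur[(word, "poor")], counts[word],
                    counts["poor"], total)
    return with_excellent - with_poor


# Invented counts over a hypothetical corpus of 1000 contexts.
total = 1000
counts = {"excellent": 100, "poor": 100, "superb": 50, "shoddy": 50}
cooccur = {
    ("superb", "excellent"): 20, ("superb", "poor"): 5,
    ("shoddy", "excellent"): 5, ("shoddy", "poor"): 20,
}

for word in ("superb", "shoddy"):
    print(word, polarity_score(word, counts, cooccur, total))
```

A positive score means the word keeps company with ‘excellent’ more than chance would predict, relative to ‘poor’; a negative score means the reverse.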
Experimentation
• K-means clustering algorithm used on the
basis of polarity scores
• The clusters contain words with similar
polarities
• These words can be linked using an
‘isopolarity link’ in WordNet
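A minimal sketch of clustering on polarity scores follows, using plain k-means on scalar values. The words, their scores and the initial centroids are hypothetical; words that fall in the same cluster are the candidates for an ‘isopolarity link’.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Plain k-means on scalar polarity scores, starting from given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters


# Hypothetical polarity scores for six adjectives.
scores = {"superb": 2.0, "great": 1.5, "wooden": 0.1,
          "plain": -0.2, "shoddy": -2.0, "awful": -1.6}

centroids, clusters = kmeans_1d(list(scores.values()), centroids=[-1.0, 0.0, 1.0])
for centre, members in zip(centroids, clusters):
    words = [w for w, s in scores.items() if s in members]
    print(round(centre, 2), words)
```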
Results
• Three clusters were seen
• Most words had negative polarity scores
• Obscure words (the ones that are not very common) were removed by
selecting adjectives with a familiarity count of 3
Approach 2: Using Adverb-Adjective Combinations (AACs)
• Calculate sentiment value based on the
effect of adverbs on adjectives
• Linguistic ideas:
– Adverbs of affirmation: certainly
– Adverbs of doubt: possibly
– Strong intensifying adverbs: extremely
– Weak intensifying adverbs: scarcely
– Negation and minimizers: never
Moving towards computation…
• Based on type of adverb, the score of the
resultant AAC will be affected
• Example of an axiom:
• Example : ‘extremely good’ is more
positive than ‘good’
AAC Scoring Algorithms
1. Variable Scoring Algorithm
2. Adjective Priority Scoring Algorithm
3. Adverb First Scoring Algorithm
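The slides name the scoring algorithms without giving their formulas, so the sketch below is only a simplified, hypothetical rule in the spirit of adjective-priority scoring: the adjective score dominates and the adverb shifts it by a fraction r of its own score. It is not the definition from Benamara et al.; the adverb categories follow the linguistic ideas listed earlier, and all scores are made up.

```python
def aac_score(adverb_kind, adverb_score, adjective_score, r=0.35):
    """Score an adverb-adjective combination with a simplified, hypothetical rule.

    adjective_score lies in [-1, 1]; adverb_score lies in [0, 1];
    r weights the adverb relative to the adjective.
    """
    shift = r * adverb_score
    if adverb_kind in ("affirmation", "strong_intensifier"):
        # Push the score further in the adjective's own direction.
        if adjective_score >= 0:
            score = adjective_score + shift
        else:
            score = adjective_score - shift
    elif adverb_kind in ("doubt", "weak_intensifier"):
        # Pull the score back towards neutral without crossing zero.
        if adjective_score >= 0:
            score = max(0.0, adjective_score - shift)
        else:
            score = min(0.0, adjective_score + shift)
    elif adverb_kind == "negation":
        score = -adjective_score
    else:
        score = adjective_score
    return max(-1.0, min(1.0, score))


good = 0.6  # hypothetical adjective score
print(aac_score("strong_intensifier", 0.9, good))  # 'extremely good'
print(aac_score("doubt", 0.5, good))               # 'possibly good'
print(aac_score("negation", 1.0, good))            # 'never good'
```

The rule satisfies the axiom illustrated earlier: ‘extremely good’ scores higher than ‘good’.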
Scoring the sentiment on a topic
• Rel(t): sentences in document d that refer to topic t
• s: a sentence in Rel(t)
• Appl+(s): AACs with positive score in s
• Appl-(s): AACs with negative score in s
• Return strength =
Findings
• APSr (Adjective Priority Scoring) with r = 0.35 worked best (best
correlation with human subjects)
– Adjectives are more important than adverbs in
terms of sentiment
• AACs give better precision and recall as
compared to only adjectives
Approach 3: Subject-based SA
• Examples:
The horse bolted.
The movie lacks a good story.
Lexicon
• Pattern ‘subj. bolt’ has the entry: b VB bolt subj
– subj: argument that receives the sentiment
• Pattern ‘subj. lack obj.’ has the entry: b VB lack obj ~subj
– obj: argument that sends the sentiment
– ~subj: argument that receives the sentiment
Lexicon
• Also allows ‘S+’ characters
• Similar to regular expressions
• E.g. to put S+ to risk
– The favorability of the subject depends on the
favorability of ‘S+’.
Example
The movie lacks a good story.
Lexicon:
– G JJ good obj.
– B VB lack obj ~subj.
Steps:
1) Consider a context window of up to five words
2) Shallow parse the sentence
3) Calculate the sentiment value step by step, based on the lexicon, adding ‘S+’
characters at each step
– ‘The movie lacks a good story.’ becomes ‘The movie lacks S+.’
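The sentiment transfer in this example can be sketched as below. The sketch assumes the clause has already been shallow-parsed into subject, verb lemma and object, and that ‘~’ in a lexicon entry inverts the polarity, as the example suggests. The dictionaries, the ‘have’ entry and the function names are hypothetical and only mimic the slide's lexicon format.

```python
# Hypothetical lexicon in the spirit of the slide's entries.
ADJECTIVES = {"good": +1, "bad": -1}
VERBS = {
    # 'b VB lack obj ~subj': the object's sentiment reaches the subject, inverted
    "lack": {"source": "obj", "target": "subj", "invert": True},
    # hypothetical transfer verb without inversion
    "have": {"source": "obj", "target": "subj", "invert": False},
}


def phrase_polarity(words):
    """Sum the adjective polarities in a phrase (a positive sum plays the role of 'S+')."""
    return sum(ADJECTIVES.get(w.lower(), 0) for w in words)


def subject_sentiment(subj, verb, obj):
    """Return (receiving argument, polarity) for a pre-parsed clause, or None."""
    entry = VERBS.get(verb)
    if entry is None:
        return None  # verb not in the lexicon: no sentiment is assigned
    args = {"subj": subj, "obj": obj}
    polarity = phrase_polarity(args[entry["source"]])
    if entry["invert"]:
        polarity = -polarity
    return " ".join(args[entry["target"]]), polarity


# 'The movie lacks a good story.' parsed by hand into subject / verb lemma / object.
print(subject_sentiment(["The", "movie"], "lack", ["a", "good", "story"]))
print(subject_sentiment(["The", "movie"], "have", ["a", "good", "story"]))
```

Returning nothing for verbs outside the lexicon reflects the trade-off reported in the results: the approach only speaks when a lexicon pattern matches, which favours precision over recall.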
Results
Corpus             Description           Precision   Recall
Benchmark corpus   Mixed statements      94.3%       28%
Open Test corpus   Reviews of a camera   94%         24%
Applications
• Review-related analysis
• Developing ‘hate mail filters’ analogous to
‘spam mail filters’
• Question-answering (Opinion-oriented
questions may involve different treatment)
Conclusion & Future Work
• Lexical resources have been developed to
capture the sentiment-related nature of words
• Subjective extracts improve the accuracy of
sentiment prediction
• Several approaches use algorithms like Naïve
Bayes, clustering, etc. to perform sentiment
analysis
• The cognitive angle to Sentiment Analysis can
be explored in the future
References (1/2)
• Tetsuya Nasukawa, Jeonghee Yi. ‘Sentiment Analysis: Capturing
Favorability Using Natural Language Processing’. In K-CAP ’03, Florida,
pages 1-8. 2003.
• Alekh Agarwal, Pushpak Bhattacharyya. ‘Augmenting WordNet with polarity
information on adjectives’. In K-CAP ’03, Florida, pages 1-8. 2003.
• SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining
Andrea Esuli, Fabrizio Sebastiani
• ‘Machine Learning’, Han and Kamber, 2nd edition, 310-330.
• http://wordnet.princeton.edu
• Farah Benamara, Carmine Cesarano, Antonio Picariello, VS Subrahmanian
et al; ‘Sentiment Analysis: Adjectives and Adverbs are better than Adjectives
Alone’; In ICWSM ’2007 Boulder, CO USA, 2007.
References (2/2)
• Jon M. Kleinberg; ‘Authoritative Sources in a Hyperlinked Environment’ as
IBM Research Report RJ 10076, May 1997, Pgs. 1 – 34.
• www.cs.uah.edu/~jrushing/cs696-summer2004/notes/Ch8Supp.ppt
• Opinion Mining and Sentiment Analysis, Foundations and Trends in
Information Retrieval, B. Pang and L. Lee, Vol. 2, Nos. 1–2 (2008) 1–135,
2008.
• Bo Pang, Lillian Lee; ‘A Sentimental Education: Sentiment Analysis Using
Subjectivity Summarization Based on Minimum Cuts’; Proceedings of the
42nd ACL; pp. 271–278; 2004.
• http://www.cse.iitb.ac.in/~veeranna/ppt/Wordnet-Affect.ppt
