Feature Based Opinion Mining By Gourab Nath Core Faculty – Data Science at Praxis Business School

Feature-Based Opinion Mining
Gourab Nath
Faculty Member, Data Science
Praxis Business School, Bangalore
gourab@praxis.ac.in | 9038333245

Need to purchase
a cellular phone…

“The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla
glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP
f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery
backup is great. The speaker is not good though.”
REVIEW

NON-OPINIONATED PASSAGES: OBJECTIVE SENTENCE

OPINIONATED PASSAGES ON FEATURES: SUBJECTIVE SENTENCE

OPINIONS ON FEATURES

Front camera - extremely clear (+3)
Rear camera - amazing (+4)
Battery backup - great (+5)
Speaker - not good (-2)
FEATURE-BASED SENTIMENT SUMMARY

Select a reasonable number of reviews and design a summary like this:
Camera = {awesome: 5 , great: 15, extremely good: 5, …, bad: 8, poor: 5 }
Display = {beautiful: 25 , lovely: 18, wonderful: 12, …, clear: 7, poor: 11 }
Battery = {fine: 8 , good: 22, fast: 7, …, bad: 9}
Price = {high: 10 , comfortable: 15, … }
SUMMARY

Summary from a Capstone Project at Praxis Business School, Bangalore

The Problem of Sentiment Analysis

Bing Liu
Distinguished Professor
Department of Computer Science
University of Illinois Chicago (UIC)
Minqing Hu
Data Scientist at Signifyd
PhD – Computer Science
University of Illinois Chicago (UIC)

1. M. Hu and B. Liu. Mining Opinion Features in Customer Reviews.
2. M. Hu and B. Liu. Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD Conference on KDD, 2004.
3. M. Hu, B. Liu and J. Cheng. Opinion Observer: Analysing and Comparing Opinions on the Web. Proceedings of WWW, 2005.
4. B. Liu. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing 2 (2010), 627-666.

OBJECT
[Figure: an object drawn as a tree]
Object (root): Cellular Phone
Components: Camera, Battery, Display
Sub-components / attributes: Front Camera; Back Camera / Rear Camera (a feature can be represented by its synonyms); Battery Life, Battery Size, Battery Performance; Size, Quality, Type
Thus, an object can be represented as a tree, hierarchy or taxonomy.

FEATURES
[Figure: features of the object Cellular Phone]
Phone, Camera, Front Camera, Rear Camera, Back Camera, Battery, Battery Life, Battery Size, Battery Performance, Display, Display Size, Display Clarity

FEATURES: EXPLICIT VS IMPLICIT FEATURES
- Explicit feature example:
“The battery life of this phone is too short”
- Implicit feature example:
“The phone doesn’t fit in an usual jeans pocket though.” (implied feature: size of the phone)
“Don’t know why I had spent so much money for the phone” (implied: not value for money)

OPINIONS: EXPLICIT VS IMPLICIT OPINIONS
- Explicit opinion example:
“The display clarity of this phone is amazing!”
- Implicit opinion example:
“The phone doesn’t fit in an usual jeans pocket though.” (a fact which expresses dissatisfaction / disappointment)

1. Identification of Frequent Features
2. Identification of Opinions on each feature
3. Opinion Orientation Identification
4. Infrequent Feature Identification
5. Summary Generation
THE PROCESS FLOW

Step 1: Frequent Feature Mining
Frequent Features
Mining
Opinion Word Extraction
Opinion Orientation
Identification
Infrequent Features
Mining
Summary Generation

POS TAGGING
[Figure: review sentences with POS tags; nouns are marked N]

BINARY REPRESENTATION

BINARY REPRESENTATION: EXAMPLE

ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: 0.7 0.6 0.4 0.4 0.3 0.2 0.3

ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Camera, Front) = 0.4

ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Battery, Front) = 0.2

ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Battery, Life) = 0.4

ASSOCIATION RULES MINING: EXAMPLE
SUPPORT: P(Camera, Buy) = 0.2
Minimum Support Threshold = 0.4 (say)
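The support check can be sketched in a few lines of Python. This is only an illustration under assumptions: the five "sentences" are made-up toy data (each set holds the nouns found in one review sentence), and `itemset_support` and `frequent_itemsets` are hypothetical helper names. A real run would use a proper association-mining implementation instead of this brute-force enumeration.

```python
from itertools import combinations

def itemset_support(transactions, itemset):
    """Fraction of transactions that contain every word in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets whose support meets the threshold."""
    vocab = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            s = itemset_support(transactions, combo)
            if s >= min_support:
                result[combo] = s
    return result

# Hypothetical toy data: the nouns of five review sentences (binary representation).
sentences = [
    {"camera", "front"},
    {"camera", "front", "battery"},
    {"battery", "life"},
    {"battery", "life", "camera"},
    {"camera", "buy"},
]

# Minimum support threshold of 0.4, as on the slide.
frequent = frequent_itemsets(sentences, min_support=0.4)
for words, support in sorted(frequent.items()):
    print(words, support)
```

With this toy data, (camera, front) and (battery, life) reach the threshold and are kept as candidate features, while (battery, front) and (buy, camera) fall below it and are dropped.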

ASSOCIATION RULES MINING: EXPERIMENTAL RESULTS
OnePlus 6 features, extracted from the reviews written on www.amazon.in

FEATURE PRUNING
COMPACTNESS
PRUNING
The method checks features that contain at least 2 words and removes those that are likely to be meaningless.

FEATURE PRUNING: COMPACTNESS PRUNING, EXAMPLE
“The camera quality is really good”: Compact
“I love the quality of the camera”: Compact
“awesome camera and the phone comes with a quality display”: Not Compact
Counter-example:
“The phone has an awesome front camera and a quality display”: Compact; Compact but has no dependency
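A minimal sketch of a compactness check for the feature phrase "camera quality". It assumes the rule from Hu and Liu's paper, which the slides do not spell out: a phrase is compact in a sentence when its adjacent words lie at most 3 word positions apart. The function name `is_compact`, the `max_gap` parameter and the whitespace tokenisation are illustrative choices, not part of the source.

```python
def is_compact(sentence, feature_words, max_gap=3):
    """True if consecutive feature words occur within max_gap positions of each other.

    Assumption: whitespace tokenisation, lower-cased, no punctuation handling.
    """
    tokens = sentence.lower().split()
    positions = []
    for word in feature_words:
        if word not in tokens:
            return False
        positions.append([i for i, t in enumerate(tokens) if t == word])
    # Compare every adjacent pair of feature words using their closest occurrences.
    for left, right in zip(positions, positions[1:]):
        if min(abs(i - j) for i in left for j in right) > max_gap:
            return False
    return True

feature = ["camera", "quality"]
examples = [
    "The camera quality is really good",
    "I love the quality of the camera",
    "awesome camera and the phone comes with a quality display",
    "The phone has an awesome front camera and a quality display",
]
results = [is_compact(s, feature) for s in examples]
print(results)
```

The last sentence passes the distance test even though "quality" describes the display, which is the counter-example's point: word distance alone does not capture dependency.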

FEATURE PRUNING: COMPACTNESS PRUNING, COUNTER-EXAMPLES
“Both the camera and the battery is good”: Compact
“Although good camera but not good battery”: Compact
“lovely camera quality and nice battery”: Compact

However, note:
The features here are separated by conjunctions (which is mostly the case).

FEATURE PRUNING
COMPACTNESS
PRUNING
MODIFICATION

FEATURE PRUNING
EXPERIMENTAL
RESULTS

FEATURE PRUNING
REDUNDANCY
PRUNING
The method checks features that contain a SINGLE word and removes those that are likely to be meaningless.
p-support (pure support): the p-support of a feature f is the number of sentences in which f appears, where these sentences contain no feature phrase that is a superset of f.
Example:
Consider the feature: camera
Consider the other features that contain the word camera: front camera | rear camera | back camera | camera quality.
P-support of camera
= number of reviews in which camera occurred alone and not with any of its supersets
= 100 – (20 + 15 + 23 + 10)
= 32
A feature is considered meaningful if it satisfies the minimum threshold for p-support.
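The p-support count can be sketched as follows. The data structure is an assumption made for illustration: each sentence is represented by the set of feature phrases already matched in it, and `p_support` is a hypothetical helper name. The toy sentences are invented and are unrelated to the counts in the example above.

```python
def p_support(feature, sentence_phrases):
    """Count sentences that mention the feature but none of its superset phrases."""
    words = set(feature.split())
    pure = 0
    for phrases in sentence_phrases:
        # The feature is mentioned if some matched phrase contains all its words.
        mentions = any(words <= set(p.split()) for p in phrases)
        # A superset phrase contains the feature's words plus at least one more.
        in_superset = any(words < set(p.split()) for p in phrases)
        if mentions and not in_superset:
            pure += 1
    return pure

# Hypothetical toy data: feature phrases matched in six sentences.
sentence_phrases = [
    {"camera"},
    {"front camera"},
    {"camera", "battery"},
    {"camera quality"},
    {"rear camera", "battery"},
    {"battery"},
]

camera_p_support = p_support("camera", sentence_phrases)
battery_p_support = p_support("battery", sentence_phrases)
print(camera_p_support, battery_p_support)

# The slide's arithmetic, reproduced as-is:
slide_value = 100 - (20 + 15 + 23 + 10)
```

A single-word feature would then be kept only if its p-support meets whatever minimum threshold is chosen.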

FEATURE PRUNING
EXPERIMENTAL
RESULTS

FREQUENT FEATURE MINING: FLOWCHART
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features

Step 2: Opinion Word Extraction

ADJECTIVES AS OPINION
Mining and Summarizing Customer Reviews
M. Hu and B. Liu
Examples:
“The camera of the phone is good” (adjective: good)
“The display looks dull” (adjective: dull)
“the sound quality of the speaker is fantastic” (adjective: fantastic)
“The phone has some really cool features” (adjective: cool)
This was based on previous research on subjectivity.
The nearest adjective is considered the opinion.
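A minimal sketch of the nearest-adjective rule. To keep it self-contained, the sentences are tagged by hand with Penn Treebank style tags (an assumption; a real pipeline would take them from a POS tagger), and `nearest_adjective` is a hypothetical helper name.

```python
def nearest_adjective(tagged, feature):
    """Return the adjective (tag starting with JJ) closest to the feature word, or None."""
    feature_idx = [i for i, (word, _) in enumerate(tagged) if word.lower() == feature]
    adjective_idx = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("JJ")]
    if not feature_idx or not adjective_idx:
        return None
    best = min(adjective_idx, key=lambda a: min(abs(a - f) for f in feature_idx))
    return tagged[best][0]

# Hand-tagged sentences (hypothetical tagger output).
s1 = [("The", "DT"), ("camera", "NN"), ("of", "IN"), ("the", "DT"),
      ("phone", "NN"), ("is", "VBZ"), ("good", "JJ")]
s2 = [("The", "DT"), ("display", "NN"), ("looks", "VBZ"), ("dull", "JJ")]
s3 = [("The", "DT"), ("headphone", "NN"), ("is", "VBZ"), ("not", "RB"),
      ("working", "VBG")]

opinion_1 = nearest_adjective(s1, "camera")
opinion_2 = nearest_adjective(s2, "display")
opinion_3 = nearest_adjective(s3, "headphone")
print(opinion_1, opinion_2, opinion_3)
```

The third sentence contains no adjective, so the rule returns nothing for it.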

ADJECTIVES AS OPINION: COUNTER-EXAMPLES
Examples:
“The camera of the phone is extremely good”: Adverb + Adjective
“The headphone is not working”: Negation + Verb
“The speaker of the phone is doing great”: Verb + Adjective
“The phone has some nice cool features”: Adjective + Adjective
“The display is not bad”: Negation + Adjective

OPINION EXTRACTION ALGORITHM
Opinion Word/s Extraction:

Step 3: Opinion Orientation Identification

OPINION ORIENTATION
IDENTIFICATION
ONLY ADJECTIVES
Adjective list:
Seed list:

OPINION ORIENTATION IDENTIFICATION: WORDNET
In WordNet, adjectives are organized into bipolar clusters.

OPINION ORIENTATION IDENTIFICATION: WORDNET
Fast = +2
Seed list:
In general, adjectives share the same orientation as their synonyms and the opposite orientation to their antonyms.
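The synonym/antonym rule can be sketched as a simple propagation loop. Everything named here is an assumption for illustration: the tiny `SYNONYMS` and `ANTONYMS` dictionaries stand in for WordNet lookups, the seed list is invented, and orientations are simplified to +1 / -1 rather than graded scores.

```python
# Hypothetical stand-ins for WordNet synonym and antonym lookups.
SYNONYMS = {"great": {"good"}, "awesome": {"great"}, "poor": {"bad"}}
ANTONYMS = {"dull": {"bright"}, "slow": {"fast"}}

def propagate(adjectives, seeds, synonyms, antonyms):
    """Grow the seed list: copy a synonym's orientation, flip an antonym's."""
    known = dict(seeds)
    changed = True
    while changed:  # repeat until a full pass assigns nothing new
        changed = False
        for adj in adjectives:
            if adj in known:
                continue
            for syn in synonyms.get(adj, ()):
                if syn in known:
                    known[adj] = known[syn]
                    changed = True
                    break
            else:  # no synonym with a known orientation; try antonyms
                for ant in antonyms.get(adj, ()):
                    if ant in known:
                        known[adj] = -known[ant]
                        changed = True
                        break
    return known

seeds = {"good": 1, "bad": -1, "fast": 1, "bright": 1}
adjectives = ["awesome", "great", "poor", "dull", "slow", "weird"]
orientation = propagate(adjectives, seeds, SYNONYMS, ANTONYMS)
print(orientation)
```

"awesome" is only resolved on the second pass, after "great" has been added to the known list, which is why the loop repeats. "weird" has no known synonym or antonym and stays unlabelled.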

OPINION ORIENTATION
IDENTIFICATION
ALGORITHM

OPINION ORIENTATION IDENTIFICATION: LIMITATION
Examples:
“The camera of the phone is extremely good”: Adverb + Adjective
“The headphone is not working”: Negation + Verb
“The speaker of the phone is doing great”: Verb + Adjective
“The phone has some nice cool features”: Adjective + Adjective
“The display is not bad”: Negation + Adjective

FLOWCHART TILL NOW!
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features -> Opinion Word Identification -> Opinion Orientation Identification

Step 4: Infrequent Feature Mining

INFREQUENT FEATURE MINING
“The picture is absolutely amazing.”
“The software that comes with it is amazing”
Note: The above two sentences share the same opinion word, ‘amazing’, yet describe different features.

INFREQUENT FEATURE MINING: COUNTER-EXAMPLE
“The delivery guy was amazingly patient”
Shares the same opinion but is not a relevant feature.
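A rough sketch of how infrequent features might be picked up. It assumes the approach described in Hu and Liu's paper rather than anything shown on the slides: in a sentence with an opinion word but no frequent feature, the noun nearest to the opinion word is taken as an infrequent feature. The hand-written POS tags, the word lists and the name `infrequent_features` are all hypothetical.

```python
def infrequent_features(tagged_sentences, frequent, opinion_words):
    """For sentences without a frequent feature, take the noun nearest each opinion word."""
    found = []
    for tagged in tagged_sentences:
        words = [word.lower() for word, _ in tagged]
        if any(word in frequent for word in words):
            continue  # already covered by frequent-feature mining
        noun_idx = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("NN")]
        for o in (i for i, word in enumerate(words) if word in opinion_words):
            if noun_idx:
                nearest = min(noun_idx, key=lambda n: abs(n - o))
                found.append(words[nearest])
    return found

# Hand-tagged sentences (hypothetical tagger output).
tagged_sentences = [
    [("The", "DT"), ("picture", "NN"), ("is", "VBZ"), ("absolutely", "RB"),
     ("amazing", "JJ")],
    [("The", "DT"), ("software", "NN"), ("that", "WDT"), ("comes", "VBZ"),
     ("with", "IN"), ("it", "PRP"), ("is", "VBZ"), ("amazing", "JJ")],
    [("The", "DT"), ("delivery", "NN"), ("guy", "NN"), ("was", "VBD"),
     ("amazingly", "RB"), ("patient", "JJ")],
]
frequent = {"picture", "camera"}          # assumed output of frequent-feature mining
opinion_words = {"amazing", "patient"}    # assumed output of opinion extraction

candidates = infrequent_features(tagged_sentences, frequent, opinion_words)
print(candidates)
```

The rule recovers "software" but also returns "guy" from the delivery sentence, which illustrates the counter-example: a shared opinion word does not guarantee a relevant product feature.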

INFREQUENT FEATURE
MINING
Algorithm

FLOWCHART TILL NOW!
Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features -> Opinion Word Identification -> Opinion Words -> Opinion Orientation Identification
Opinion Words -> Infrequent Feature Identification -> Infrequent Features

Step 5: Summary Generation

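$$\text{Positive} = \text{Positive Opinion Orientations}$$
$$\text{Negative} = -\,\text{Negative Opinion Orientations}$$
$$\text{Positive \%} = \frac{\text{Positive}}{\text{Positive} + \text{Negative}}$$
$$\text{Negative \%} = \frac{\text{Negative}}{\text{Positive} + \text{Negative}}$$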
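The formulas above can be sketched for a single feature as follows. Two assumptions are made: "Positive" is read as the sum of the positive orientation scores and "Negative" as the negated sum of the negative ones, and the percentages are scaled to 0-100. The scores for the camera feature are invented, and `summarize` is a hypothetical helper name.

```python
def summarize(orientations):
    """Return (positive %, negative %) for one feature's opinion orientation scores."""
    positive = sum(o for o in orientations if o > 0)
    negative = -sum(o for o in orientations if o < 0)  # sign flipped, so non-negative
    total = positive + negative
    if total == 0:
        return 0.0, 0.0  # no opinions recorded for this feature
    return 100 * positive / total, 100 * negative / total

# Hypothetical orientation scores collected for the feature "camera".
camera_scores = [3, 4, -2, 1]
positive_pct, negative_pct = summarize(camera_scores)
print(positive_pct, negative_pct)
```

Here Positive = 8 and Negative = 2, so the feature is summarised as 80% positive and 20% negative.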

Review Database -> POS Tagging -> Frequent Feature Identification -> Feature Pruning -> Frequent Features -> Opinion Word Identification -> Opinion Words -> Opinion Orientation Identification
Opinion Words -> Infrequent Feature Identification -> Infrequent Features
Summary Generation
