2. Visiting a new city…
Online opinions: Which hotel to stay at?
3. Visiting a new city…
Online opinions: What attractions to visit?
Without opinions, decision making becomes difficult!
4. ODSS Components
1. Data: a comprehensive set of opinions to support search and analysis capabilities
2. Analysis Tools: tools to help digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: ability to find entities using existing opinions
4. Presentation: putting it all together – an easy way for users to explore the results of the search and analysis components (e.g. organizing and summarizing results)
Focus of existing work: opinion summarization via structured summaries
1. Sentiment summary (e.g. +ve/-ve on a piece of text)
2. Fine-grained sentiment summary (e.g. Battery life: 2 stars; Audio: 1 star)
Not a complete solution to support decision making based on opinions!
5. ODSS Components (repeats the four components above)
Need to address a broader set of problems to enable opinion-driven decision support
6. We need data: a large number of online opinions
Allow users to get a complete and unbiased picture
▪ Opinions are very subjective and can vary a lot
Currently: no study on how to systematically collect opinions from the web
7. We need different analysis tools
To help users analyze & digest opinions
▪ Sentiment trend visualization
▪ fluctuation over time
▪ Aspect-level sentiment summaries
▪ Textual summaries, etc.
Currently: the focus is on structured summarization
8. We need to incorporate search
Allow users to find different items or entities based on existing opinions
This can improve user productivity: it cuts down on the time spent reading a large number of opinions
9. We also need to know how to organize &
present opinions at hand effectively
Aspect level summaries:
▪ How to organize these summaries?
▪ Scores or Visuals (stars)?
▪ Do you show supporting phrases?
Full opinions:
▪ How to allow effective browsing of reviews/opinions?
▪ Don't overwhelm users
10. ODSS Components
1. Data: a comprehensive set of opinions to support opinion-based search & analysis tasks
2. Analysis Tools: tools to help analyze & digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: find items/entities based on existing opinions (e.g. show "clean" hotels only)
4. Presentation: organizing opinions to support effective decision making
11. Focus of the proposed methods:
1. Should be general: works across different domains & possibly content types
2. Should be practical & lightweight: can be integrated into existing applications; can potentially scale up to large amounts of data
13. Currently: no direct way of finding entities based on online opinions
Need to read opinions about different entities to find those that fulfill personal criteria
Time consuming & impairs user productivity!
14. Use existing opinions to rank entities based on
a set of unstructured user preferences
Finding a hotel: “clean rooms, good service”
Finding a restaurant: “authentic food, good ambience”
15. Use the results of existing opinion mining methods:
Find sentiment ratings on different aspects
Rank entities based on the discovered aspect ratings
Problem: not practical!
Costly – must mine large amounts of textual content
Needs prior knowledge of the set of queryable aspects
Most existing methods rely on supervision
▪ E.g. the overall user rating
16. Use existing text retrieval models for ranking entities based on preferences:
Can scale up to large amounts of textual content
Can be tuned and extended
Do not require costly IE or text mining
17. Investigate the use of text retrieval models for Opinion-Based Entity Ranking
Compare 3 state-of-the-art retrieval models:
BM25, PL2, DirichletLM – shown to work best for text retrieval tasks
Which one works best for this ranking task?
Explore some extensions over existing IR models
Can ranking improve with these extensions?
Compile the first test set & propose an evaluation method for this new ranking task
19. Standard retrieval cannot distinguish multiple preferences in a query
E.g. Query: "clean rooms, cheap, good service"
Treated as one long keyword query, but it actually holds 3 preferences
Problem: an entity may score highly by matching one aspect extremely well
To address this problem (see the sketch below):
Score each preference separately – multiple queries
Combine the results of each query – different strategies:
▪ Score combination (works best)
▪ Average rank
▪ Min rank
▪ Max rank
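Below is a minimal, hedged sketch of this idea: split the preference query, score each preference with any retrieval model, and combine per-entity scores (the score-combination strategy). Here `retrieval_score` is only a stand-in for a real model such as BM25, PL2, or DirichletLM, and the hotel data is hypothetical.

```python
# Sketch of Query Aspect Modeling (QAM) with score combination.
from collections import defaultdict

def retrieval_score(entity_reviews: str, query: str) -> float:
    # Placeholder scorer: swap in a real retrieval model (BM25/PL2/DirichletLM).
    return float(sum(entity_reviews.lower().count(w) for w in query.lower().split()))

def qam_rank(entities: dict, preference_query: str) -> list:
    preferences = [p.strip() for p in preference_query.split(",")]
    combined = defaultdict(float)
    for pref in preferences:                    # one query per preference
        for entity, reviews in entities.items():
            combined[entity] += retrieval_score(reviews, pref)  # score combination
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

hotels = {"Hotel A": "very clean rooms but pricey and slow service",
          "Hotel B": "clean rooms, cheap rates, good friendly service"}
print(qam_rank(hotels, "clean rooms, cheap, good service"))  # Hotel B ranks first
```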
20. In standard retrieval, matching an opinion word and matching a topic word are not distinguished
Opinion-Based Entity Ranking: it is important to match the opinion words in the query
▪ opinion words have more variation than topic words
▪ E.g. great: excellent, good, fantastic, terrific…
Intuition (sketched below):
▪ Expand a query with similar opinion words
▪ Helps emphasize the matching of opinions
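A minimal sketch of the expansion idea follows. The synonym table is a hand-made stand-in (the thesis derives similar opinion words differently), so both the mapping and the names are illustrative.

```python
# Sketch of opinion-word query expansion: add similar opinion words so that
# matching of opinions is emphasized during retrieval.
OPINION_SYNONYMS = {
    "great": ["excellent", "good", "fantastic", "terrific"],
    "clean": ["spotless", "tidy", "immaculate"],
}

def expand_query(query: str) -> str:
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        expanded.extend(OPINION_SYNONYMS.get(word, []))  # only opinion words expand
    return " ".join(expanded)

print(expand_query("great clean rooms"))
```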
22. [Bar charts: % improvement in ranking for the Hotels domain (y-axis 0–8%) and the Cars domain (y-axis 0–2.5%), over PL2, LM, and BM25. One series shows the improvement using QAM, the other using QAM + OpinExp. Callout: with QAM alone, any model can be used.]
25. Current methods: focus on generating structured summaries of opinions [Lu et al., 2009; Lerman et al., 2009; …]
[Figure: opinion summary for iPod]
26. We need supporting textual summaries!
To know more, one would have to read many redundant sentences
[Figure: opinion summary for iPod]
27. Summarize the major opinions
What are the major complaints/praise in the text?
Concise
◦ Easily digestible
◦ Viewable on a smaller screen
Readable
◦ Easily understood
28. Extractive summarization: widely studied for years
[Radev et al. 2000; Erkan & Radev, 2004; Mihalcea & Tarau, 2004, …]
Not suitable for generating concise summaries
Biased: with a limit on summary size,
▪ the selected sentences may miss critical information
Verbose: sentences are not shortened
We need more of an abstractive approach
29. 2 Abstractive Summarization Methods
Opinosis
- Graph-based summarization framework
- Relies on structural redundancies in sentences
WebNgram
- Optimization framework based on readability & representativeness scoring
- Phrases generated by combining words in the original text
31. Input: a set of sentences (topic-specific, POS-annotated)
Step 1: Generate a graph representation of the text (the Opinosis-Graph)
[Figure: Opinosis-Graph over the example sentences, with word nodes such as "my", "phone", "calls", "drop", "frequently", "the", "iphone", "great", "device".]
32. Step 2: Find promising paths (candidate summaries) & score the candidates
[Figure: same Opinosis-Graph with two highlighted candidate summaries, "calls drop frequently" and "great device", with scores 3.2 and 2.5.]
33. Step 3: Select the top-scoring candidates as the final summary
Final summary: "The iPhone is a great device, but calls drop frequently."
[Figure: recap of the full pipeline – Step 1: generate the Opinosis-Graph; Step 2: find & score candidate summaries; Step 3: select the top candidates.]
34. Assume 2 sentences about the "call quality of iPhone":
1. My phone calls drop frequently with the iPhone.
2. Great device, but the calls drop too frequently.
35. • One node for each unique word + POS combination
• SID and PID (sentence id : position id) maintained at each node
• Edges indicate the relationship between words in a sentence
[Figure: Opinosis-Graph for the two sentences; e.g. node "drop" carries labels 1:4 and 2:7 (word 4 of sentence 1, word 7 of sentence 2), and the shared node "." carries 1:9 and 2:10.]
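A minimal sketch of Step 1 under simplifying assumptions (plain word tokens stand in for word+POS pairs, and edges only record adjacency):

```python
# Sketch of Opinosis-Graph construction: one node per unique word, each node
# keeping (sentence id, position id) labels; edges link adjacent words.
from collections import defaultdict

def build_opinosis_graph(sentences: list):
    nodes = defaultdict(list)   # word -> [(sid, pid), ...]
    edges = set()               # (word, next_word) adjacency
    for sid, sentence in enumerate(sentences, start=1):
        words = sentence.lower().split()
        for pid, word in enumerate(words, start=1):
            nodes[word].append((sid, pid))
            if pid > 1:
                edges.add((words[pid - 2], word))
    return nodes, edges

nodes, edges = build_opinosis_graph([
    "my phone calls drop frequently with the iphone .",
    "great device , but the calls drop too frequently .",
])
print(nodes["drop"])   # [(1, 4), (2, 7)] -- one shared node across both sentences
```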
41. Two sentences:
Calls drop frequently with the iPhone
Calls drop frequently with the Black Berry
[Figure: merged graph path "calls drop frequently with the", followed by a fan-out to "iphone" and "black berry".]
One common high-redundancy path followed by a high fan-out can be used to merge sentences:
"calls drop frequently with the iphone and black berry"
42. Input: topic-specific sentences from user reviews
Evaluation measure: automatic ROUGE evaluation
45. Use existing words in the original text to generate micropinion summaries: a set of short phrases
Emphasis on 3 aspects:
Compactness – use as few words as possible
Representativeness – reflect the major opinions in the text
Readability – fairly well formed
52. Measure used: the standard Jaccard similarity measure (sketched below)
Why important?
Allows the user to control the amount of redundancy
E.g. a user who wants good coverage of information on a small device can request less redundancy!
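A minimal sketch of this redundancy control, with an illustrative (not thesis-specified) similarity threshold:

```python
# Jaccard similarity between candidate phrases, used to cap redundancy.
def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def filter_redundant(phrases: list, max_sim: float = 0.3) -> list:
    kept = []
    for p in phrases:
        if all(jaccard(p, q) <= max_sim for q in kept):
            kept.append(p)  # keep only phrases dissimilar to those already kept
    return kept

print(filter_redundant(["calls drop frequently", "calls drop often", "great battery life"]))
# -> ['calls drop frequently', 'great battery life']
```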
53. Purpose: measure how well a phrase represents opinions from the original text
2 properties of a highly representative phrase:
1. Its words should be strongly associated in the text
2. Its words should be sufficiently frequent in the text
Captured by a modified pointwise mutual information (PMI) function that adds the frequency of co-occurrence within a window:

$$\mathrm{pmi}'(w_i, w_j) = \log_2 \frac{p(w_i, w_j)\, c(w_i, w_j)}{p(w_i)\, p(w_j)}$$
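A minimal sketch of the modified PMI above; the window handling and tokenization are assumptions made for illustration:

```python
# Modified PMI: c(wi, wj) counts co-occurrences of the pair within a window,
# boosting frequent pairs that plain PMI would treat the same as rare ones.
import math
from collections import Counter

def modified_pmi(tokens: list, wi: str, wj: str, window: int = 3) -> float:
    n = len(tokens)
    unigrams = Counter(tokens)
    cooc = sum(1 for k, t in enumerate(tokens)
               if t == wi and wj in tokens[k + 1 : k + 1 + window])
    if cooc == 0:
        return float("-inf")
    p_i, p_j, p_ij = unigrams[wi] / n, unigrams[wj] / n, cooc / n
    return math.log2((p_ij * cooc) / (p_i * p_j))

text = "calls drop frequently the calls drop too frequently".split()
print(modified_pmi(text, "calls", "drop"))  # strongly associated, frequent pair
```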
54. Purpose: measure the well-formedness of a phrase
Readability scoring:
Use Microsoft's Web N-gram model (publicly available)
Obtain conditional probabilities of phrases
Intuition: a readable phrase occurs more frequently on the web than a non-readable phrase

$$S_{read}(w_k \ldots w_n) = \frac{1}{K} \sum_{q=k}^{n} \log p(w_q \mid w_1 \ldots w_{q-1})$$

(the chain rule computes the joint probability of the phrase in terms of conditional probabilities, averaged over its K words)
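A minimal sketch of this scoring; `cond_prob` is a stand-in for an n-gram language model lookup (the thesis uses Microsoft's Web N-gram service), so its constant return value is purely illustrative:

```python
# Readability score: average log conditional probability of each word given
# its predecessors, i.e. the chain-rule joint probability averaged over K words.
import math

def cond_prob(word: str, history: list) -> float:
    return 0.1  # placeholder: replace with a real n-gram model lookup

def readability(phrase: str) -> float:
    words = phrase.lower().split()
    logp = sum(math.log(cond_prob(w, words[:i])) for i, w in enumerate(words))
    return logp / len(words)  # average over the K words of the phrase

print(readability("great device but calls drop frequently"))
```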
55. Input: user reviews for 330 products (CNET)
Evaluation measure: automatic ROUGE evaluation
56. [Chart: ROUGE-2 recall vs. summary size (5–30 max words) for KEA, Tfidf, Opinosis, and WebNgram. WebNgram performs the best for this task; KEA is slightly better than Tfidf; Tfidf shows the worst performance.]
57. [Same chart as slide 56, shown alongside a sample review screenshot (pros, cons, full review).]
59. No easy way to obtain a comprehensive set of opinions about an entity
Where to get opinions now?
Rely on content providers or crawl a few sources
Problem:
▪ Can result in source-specific bias
▪ Data sparseness for some entities
60. Goal: automatically crawl online reviews for arbitrary entities
E.g. cars, restaurants, doctors
We target online reviews because they represent a big portion of online opinions
61. Focused crawlers: meant to collect pages relevant to a topic
E.g. "Database Systems", "Boston Terror Attack"
Page type is not as important as content: news articles, review pages, forum pages, etc.
Most focused crawlers are supervised: they require large amounts of training data for each topic
Not suitable for review collection on arbitrary entities: needing training data for each entity will not scale up to a large number of entities
62. OpinoFetch: a focused crawler for collecting review pages on arbitrary entities
Unsupervised approach: does not require large amounts of training data
Solves the crawling problem efficiently: uses a special data structure (the FetchGraph) for relevance scoring
63. Input: a set of entities in a domain (e.g. all hotels in a city)
1. Hampton Inn Champaign…
2. I Hotel Conference Center…
3. La Quinta Inn Champaign…
4. Drury Inn
5. …
Step 1: For each entity, obtain an initial set of Candidate Review Pages (CRPs), e.g. via the search "Hampton Inn…Reviews".
64. Step 2: Expand the list of CRPs by exploring links in the neighborhood of the initial CRPs.
[Figure: expanded CRP lists, e.g. tripadvisor.com/Hotel_Review-g36806-d903…, tripadvisor.com/Hotels-g36806-Urbana_Cha…, hamptoninn3.hilton.com/en/hotels/…, tripadvisor.com/ShowUserReviews-g36806-…]
Step 3: Score the CRPs on:
• Entity relevance (Sent)
• Review page relevance (Srev)
Select pages with Srev > σrev and Sent > σent; a sketch of the overall flow follows.
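A skeleton of the three-step flow above, with stand-in stubs; the function names, defaults, and thresholds are illustrative, not the thesis code:

```python
# OpinoFetch flow: search for candidates, expand the neighborhood, then keep
# pages that pass both the review relevance and entity relevance thresholds.
def search_candidate_pages(entity_query: str, n: int = 10) -> list:
    return []  # Step 1: top-n web search results for the entity query (stub)

def expand_neighborhood(urls: list, depth: int = 3) -> list:
    return urls  # Step 2: follow links around the initial CRPs (stub)

def collect_review_pages(entities, s_rev, s_ent, sigma_rev=0.5, sigma_ent=0.5):
    relevant = {}
    for entity in entities:
        eq = f"{entity} reviews"
        crps = expand_neighborhood(search_candidate_pages(eq))
        # Step 3: keep only pages passing both relevance thresholds
        relevant[entity] = [u for u in crps
                            if s_rev(u) > sigma_rev and s_ent(u, eq) > sigma_ent]
    return relevant
```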
65. Use any general web search engine (e.g. Bing/Google), on a per-entity basis
Search engines do partial matching of entities to pages
Pages in the vicinity of the search results are more likely to be related to the entity
Entity Query format: "entity name + brand / address" + "reviews"
E.g. "Hampton Inn Champaign 1200 W University Ave Reviews"
66. Follow the top-N URLs around the vicinity of the search results
Use a URL prioritization strategy:
Bias the crawl path towards entity-related pages
Score each URL based on the similarity between (see the sketch below):
(a) URL and Entity Query, Sim(URL, EQ)
(b) Anchor text and Entity Query, Sim(Anchor, EQ)
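A minimal sketch of the prioritization score; the speaker notes describe an averaged similarity over URL tokens and anchor tokens, and the Jaccard measure and tokenizer here are simplifications:

```python
# Priority of a URL = average similarity of its URL tokens and anchor tokens
# to the entity query, biasing the crawl towards entity-related pages.
import re

def tokens(text: str) -> set:
    return set(re.split(r"[^a-z0-9]+", text.lower())) - {""}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def url_priority(url: str, anchor: str, entity_query: str) -> float:
    eq = tokens(entity_query)
    return (jaccard(tokens(url), eq) + jaccard(tokens(anchor), eq)) / 2

print(url_priority("http://tripadvisor.com/Hotel_Review-Hampton_Inn_Champaign",
                   "Hampton Inn Champaign reviews",
                   "Hampton Inn Champaign 1200 W University Ave Reviews"))
```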
67. To determine if a page is indeed a review page, use a review vocabulary:
a lexicon with the most commonly occurring words within review pages – details in thesis
Idea: score a page based on the number of review-page words it contains:

$$S_{rev}^{raw}(p_i) = \sum_{t \in V} \log_2 c(t, p_i) \cdot wt(t)$$

$$S_{rev}(p_i) = \frac{S_{rev}^{raw}(p_i)}{normalizer}, \qquad S_{rev}(p_i) \in [0, 1]$$
68. The same scoring formulas, annotated:
- t is a term in the review vocabulary V
- c(t, pi): frequency of t in page pi (tf); the log scales down the tf
- wt(t): importance weighting of t in the review vocabulary
- $S_{rev}^{raw}(p_i)$ is the raw review page relevance score
- Normalizing yields the final review page relevance score; the normalizer is needed to set proper thresholds (see the sketch below)
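A minimal sketch of the raw score; the review vocabulary and its weights are illustrative stand-ins for the lexicon described in the thesis, and the +1 inside the log (so a single occurrence still contributes) is a small deviation made so the toy example is visible:

```python
# Raw review-page relevance: sum, over vocabulary terms found in the page,
# of log2(tf) times the term's importance weight.
import math
from collections import Counter

REVIEW_VOCAB = {"review": 1.0, "stars": 0.8, "stayed": 0.6, "rating": 0.9}

def s_rev_raw(page_text: str) -> float:
    tf = Counter(page_text.lower().split())
    return sum(math.log2(tf[t] + 1) * w
               for t, w in REVIEW_VOCAB.items() if tf[t] > 0)

print(s_rev_raw("I stayed here last week Great hotel 5 stars My review clean rooms"))
```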
69. Explored 3 normalization options (sketched below):
SiteMax (SM): the max Srevraw(pi) amongst all pages related to a particular site – normalizes based on site density
EntityMax (EM): the max Srevraw(pi) amongst all pages related to an entity – normalizes based on entity popularity
EntityMax + GlobalMax (GM) or SiteMax + GlobalMax (GM):
▪ to help with cases where SM/EM are unreliable
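A minimal sketch of these options; exactly how EM/SM combine with the global max is not specified on the slide, so the fallback below is one plausible reading, not the thesis formula:

```python
# Normalizers for the raw review relevance score.
def site_max(site_scores: list) -> float:
    return max(site_scores)        # SM: max raw score among a site's pages

def entity_max(entity_scores: list) -> float:
    return max(entity_scores)      # EM: max raw score among an entity's pages

def normalize(raw: float, local_max: float, global_max: float) -> float:
    normalizer = local_max if local_max > 0 else global_max  # GM fallback (assumed)
    return raw / normalizer

print(normalize(3.0, entity_max([3.0, 4.5, 2.1]), global_max=9.0))
```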
70. To determine if a page is about the target entity:
Based on the similarity between a page's URL & the Entity Query
Why does it work?
Most review pages have highly descriptive URLs, and the Entity Query is a detailed description of the entity
The more the URL resembles the query, the more likely it is relevant to the target entity
Similarity measure: Jaccard similarity (see the sketch below)
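A minimal sketch of entity relevance as Jaccard similarity between URL tokens and entity-query tokens; the tokenizer is an assumption:

```python
# Entity relevance: the more the URL resembles the entity query, the higher
# the score.
import re

def jaccard_tokens(a: str, b: str) -> float:
    ta = set(re.split(r"[^a-z0-9]+", a.lower())) - {""}
    tb = set(re.split(r"[^a-z0-9]+", b.lower())) - {""}
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

s_ent = jaccard_tokens("tripadvisor.com/Hotel_Review-Hampton_Inn_Champaign_Urbana",
                       "Hampton Inn Champaign 1200 W University Ave Reviews")
print(round(s_ent, 2))
```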
71. The steps proposed so far can be implemented in a variety of different ways
Our goal: make the crawling framework usable in practice
72. 1. Efficiency:
Allow review collection for a large number of entities
The task should terminate in reasonable time with reasonable accuracy
Problems arise when required information cannot be accessed quickly
▪ E.g. repeated access to the term frequencies of different pages
2. Rich Information Access (RIA):
Allow the client to access information beyond the crawled pages
E.g. get all review pages from the top 10 popular sites for entity X
A database is not suitable: it cannot naturally model such complex relationships, and queries would result in large joins
73. FetchGraph: a heterogeneous graph data structure
Models complex relationships between the different components in a data collection problem
74. [Figure: example FetchGraph. Entity nodes (Hampton Inn Champaign, I-Hotel Conference Center, Drury Inn Champaign) link to page nodes (P1…Pn); page nodes link to site nodes (tripadvisor.com, hotels.com, local.yahoo.com) and, via logical nodes for title (t), URL (u), and content (c), to term nodes (t1…tz) with weighted edges (wt). Other logical nodes include the current query Q and the review vocabulary V.]
75. [Same FetchGraph figure, annotated:]
- Entity nodes: the list of entities on which reviews are required
- Entity → page edges: based on the set of CRPs found for each entity
- Logical nodes (query, review vocabulary): at the core, made up of terms
- Term nodes: one node per unique term
76. Maintain one simple data structure:
Access to various statistics
▪ E.g. the TF of a word in a page = edge weight (content node → term node)
Access to complex relationships and global information
Compact: can be an in-memory data structure
The network can be persisted and accessed later
Client applications can use the network to answer interesting application-related questions
E.g. get all review pages for entity X from the top 10 popular sites
77. [Figure: a page's content node C (a logical node) linked to term nodes with edge weight = TF; the review vocabulary node V linked to term nodes with edge weight = importance weight; outgoing edges denote term ownership.]
To compute Srevraw(pi) (see the sketch below):
- Use the terms present in both the content node and the RV node
- TFs and weights can be obtained from the edges
- Lookup of review vocabulary words within a page is fast
- No need to parse page contents each time a page is encountered
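A minimal sketch of the FetchGraph idea for this computation: once a page is parsed, its content node stores TF edge weights to term nodes, so the raw relevance score becomes a dictionary intersection instead of a re-parse. The class shape and the +1 in the log are illustrative assumptions.

```python
# Content-node and vocabulary-node edges modeled as dictionaries.
import math

class FetchGraph:
    def __init__(self):
        self.content_edges = {}   # page id -> {term: tf} (edge weight = TF)
        self.vocab_edges = {}     # term -> importance wt (edge weight = wt)

    def add_page(self, page_id: str, text: str) -> None:
        tf = {}
        for w in text.lower().split():
            tf[w] = tf.get(w, 0) + 1
        self.content_edges[page_id] = tf      # parse once, keep edge weights

    def s_rev_raw(self, page_id: str) -> float:
        tf = self.content_edges[page_id]
        # only terms owned by both the content node and the vocabulary node
        return sum(math.log2(tf[t] + 1) * w
                   for t, w in self.vocab_edges.items() if t in tf)

g = FetchGraph()
g.vocab_edges = {"review": 1.0, "stars": 0.8}
g.add_page("p1", "my review of this hotel five stars")
print(g.s_rev_raw("p1"))  # fast lookup, no re-parsing of the page
```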
78. [Same FetchGraph figure, now with an Opinion Vocabulary logical node O alongside the current query Q. Annotation: accessing all pages connected to a site node requires the complete graph.]
79. [Same figure. Annotation: accessing all pages connected to an entity node requires the complete graph.]
81. Goal: evaluate accuracy & give insights into efficiency using the FetchGraph
Evaluated in 3 domains: electronics (5 entities), hotels (5), attractions (4)
Only 14 entities – it is expensive to obtain judgments
Gold standard:
For each entity, explore the top 50 Google results & the links around the vicinity of the results (up to depth 3)
3 human judges determine the relevance of the collected links to the entity query (crowdsourcing)
Final judgment: majority voting
82. Baseline: Google search results (deemed relevant to the entity query)
Evaluation measures:
Precision
Recall – an estimate of the coverage of review pages

$$Prec(e_k) = \frac{\#RelPages(e_k)}{\#RetrievedPages(e_k)} \qquad Recall(e_k) = \frac{\#RelPages(e_k)}{\#GoldStdRelPages(e_k)}$$
84. [Chart: recall vs. number of search results (10–50) for Google, OpinoFetch, and OpinoFetchUnnormalized; y-axis 0–0.25. Google's recall is consistently low.]
Search results are not always relevant to the EQ, or are not direct pointers to actual review pages.
85. [Same chart. OpinoFetch's recall keeps improving as more search results are used.]
- A lot of relevant content lies in the vicinity of the search results
- OpinoFetch is able to discover such relevant content
86. [Same chart. OpinoFetch achieves better recall with normalization.]
- Scores are normalized using the special normalizers (e.g. EntityMax / SiteMax)
- This makes it easier to distinguish relevant review pages
89. Avg. execution time with/without FetchGraph

                      With FetchGraph   Without FetchGraph
Srevraw(pi)           0.09 ms           8.60 ms
EntityMax normalizer  0.06 ms           4.40 s

Without FetchGraph: parse page contents each time
With FetchGraph: the page is loaded into memory once; the FetchGraph is then used to compute Srevraw(pi)
90. Avg. execution time with/without FetchGraph

                      With FetchGraph   Without FetchGraph
Srevraw(pi)           ~0.09 ms          ~8.60 ms
EntityMax normalizer  ~0.06 ms          ~4.40 s

Without FetchGraph: sets of pages must be loaded back into memory to find the EntityMax normalizer
With FetchGraph: global information is tracked till the end; only a lookup on the related sets of pages is needed to obtain the EntityMax normalizer
91. Proposed: an unsupervised, practical method for collecting reviews on arbitrary entities
Works with reasonable accuracy without requiring large amounts of training data
Proposed FetchGraph:
Helps with efficient lookup of various statistics
Useful for answering application-related queries
93. Finds & ranks entities based on user preferences
Unstructured opinion preferences – novel
Structured preferences – e.g. price, brand, etc.
Beyond search: support for analysis of entities
Ability to generate textual summaries of reviews
Ability to display tag clouds of reviews
Current version: works in the hotels domain
94. Search: find entities based on unstructured opinion preferences
Search: combine these with structured preferences
Ranking: how well are all preferences matched?
97. Summary with initial reviews:
- 26 reviews in total
- 1–2 sources
Summary with OpinoFetch reviews:
- 135 reviews (8 sources)
- Extracted with a baseline extractor
- Not all reviews were included – filtered based on:
  • the length of the review
  • the subjectivity score of the review
98. Opinion-Based Entity Ranking
Use click-through & query logs to further improve the ranking of entities
▪ Now possible since everything is logged by the demo system
Look into the use of phrasal search for ranking
▪ Limit deviation from the actual query (e.g. "close to university")
▪ Explore "back-off" style scoring – score based on the phrase, then remove the phrase restriction
99. Opinosis
How to scale up to very large amounts of text?
▪ Explore the use of the MapReduce framework
Would this approach work with other types of text?
▪ E.g. tweets, Facebook comments – shorter texts
Opinion Acquisition
Compare OpinoFetch with a supervised crawler
▪ Can it achieve comparable results?
How to improve the recall of OpinoFetch?
▪ To evaluate at a reasonable scale: approximate judgments without relying on humans?
100. References
[Barzilay and Lee 2003] Barzilay, Regina and Lillian Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 16–23, Morristown, NJ, USA.
[DeJong 1982] DeJong, Gerald F. 1982. An overview of the FRUMP system. In Lehnert, Wendy G. and Martin H. Ringle, editors, Strategies for Natural Language Processing, pages 149–176. Lawrence Erlbaum, Hillsdale, NJ.
[Erkan and Radev 2004] Erkan, Güneş and Dragomir R. Radev. 2004. LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res., 22(1):457–479.
[Harabagiu and Lacatusu 2002] Harabagiu, Sanda M. and Finley Lacatusu. 2002. Generating single and multi-document summaries with GISTexter. In Proceedings of the Workshop on Automatic Summarization, pages 30–38.
[Hu and Liu 2004] Hu, Minqing and Bing Liu. 2004. Mining and summarizing customer reviews. In KDD, pages 168–177.
[Jing and McKeown 2000] Jing, Hongyan and Kathleen R. McKeown. 2000. Cut and paste based text summarization. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pages 178–185, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
[Lerman et al. 2009] Lerman, Kevin, Sasha Blair-Goldensohn, and Ryan McDonald. 2009. Sentiment summarization: evaluating and learning user preferences. In 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09).
[Mihalcea and Tarau 2004] Mihalcea, R. and P. Tarau. 2004. TextRank: bringing order into texts. In Proceedings of EMNLP-04, the 2004 Conference on Empirical Methods in Natural Language Processing, July.
[Pang and Lee 2004] Pang, Bo and Lillian Lee. 2004. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the ACL, pages 271–278.
[Pang et al. 2002] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79–86.
[Radev and McKeown 1998] Radev, D.R. and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469–500.
[More in Thesis Report]
Editor's Notes
I would like to thank all of you for being present for my talk despite the time differences. Today I will be presenting my thesis proposal. The title of my thesis is Opinion-Driven Decision Support System.
We use opinions on the web for all sorts of decision-making tasks. For example, when visiting a new city, we use opinions to decide which hotel to stay at, or which attractions to visit. Opinions are essential for decision making: if we are looking for a hotel in NYC, we often read many online opinions to figure out which one to stay at. Without these online opinions, the decision-making task becomes more difficult because you only have limited information to base yourself on; you may only have the price, location, and a description.
Most of the existing work leveraging opinions has focused on summarization of opinions to help users better digest them. This is mainly in the form of structured summaries, such as +ve or -ve on a given piece of text (a sentence, passage, or document), or a more fine-grained summary where you have sentiment ratings on different aspects. This alone is not sufficient, because you need other components to support effective decision making.
We actually need to address a broader set of problems to effectively support such a decision making task.
First of all, we need large amounts of opinions. Since opinions are subjective and can vary quite a bit, we need a large set of them to allow users to get the complete picture. Then we need different analysis tools: we can have sentiment trend visualization, which shows the fluctuation in sentiments over time; then we can have aspect-level summaries, textual summaries, and so on.
Then, we also need to incorporate search so that users can actually find different items and entities utilizing existing opinions. This would actually improve user productivity because it cuts down on the time spent on reading a large number of opinions.
Finally, we also need to know how to present the opinions at hand effectively. For example, if you have aspect-level summaries, you need to understand how to organize them: do you show scores or visuals like star ratings? Do you also need supporting phrases? If you have full opinions, then you need to think about how to allow effective browsing from one passage to the next, so as to not overwhelm users.
With this, I propose a framework called the ODSS, which encompasses data collection, analysis tools, search, and presentation, all of which support a more complete decision-making platform based on opinions. In my thesis I tried to solve some of the problems related to data collection, search capabilities, and analysis.
The focus of the methods proposed in this thesis is (1) to make them very general, so that the approach can work across various domains (cars, electronics) and content types (news, legal docs), and (2) to make these methods practical and lightweight, so that they can be easily applied in practice and can scale up to large amounts of data.
So first we will look at my search-related work on Opinion-Based Entity Ranking, which was published in the IRJ.
So how do you go about solving this ranking problem? This approach is usually not practical. Some of the existing methods rely on some form of supervision, such as the overall user ratings, but this kind of information may not always be available.
So we propose to leverage… So in this paper, what we did was to…
First we will look into some of the extensions that we explored. The first is modeling the aspects in the query. With the standard retrieval method… To improve this, we look into scoring each preference separately and then combining the results.
In standard text retrieval models, matching an opinion word and a standard topic word is not distinguished. But in this ranking task it is important to match opinion words in the user's query. However, opinion words generally have more variation than topic words. So the intuition is that by expanding a user query with additional equivalent opinion words, we can help in emphasizing the matching of opinion words.
So here are the results with the use of QAM & OpinExp in two domains, hotels and cars. The blue bar is the improvement with the use of QAM over standard retrieval; the red bar is the improvement with the use of OpinExp with QAM over standard retrieval. First, you see that with both extensions there is improvement in most cases, and it is especially clear with the use of OpinExp.
Then you see that with just QAM any of the retrieval models can be used – because the improvements are not that different.
But when you pair OpinExp + QAM, it is clear that BM25 is the most effective retrieval model. One reason for this is that BM25 does not over-reward high-frequency words, so an entity is not ranked highly because of matching just one of the words in the query.
Next, we will move to the analysis part, where I looked into abstractive summarization of opinions. For this, I have explored two different approaches.
Most current work in opinion summarization focuses on predicting the aspect-based ratings for an entity. For example, for an iPod, you may predict that the appearance is 5 stars, ease of use is 3 stars, etc.
But the problem is, if you wanted to know more about each of these aspects, you would actually have to read many sentences, from the thousands of reviews, to have your questions answered.
For textual summaries to be useful to users, we first require that the summary actually summarizes the major opinions, is concise so that it's viewable on smaller screens, and of course is reasonably readable.
Extractive methods can miss out information during sentence selection.
So this is how it works at a very high level, but there are many details that are outlined in the paper, such as how to stitch two different subgraphs into one and how we use positional information to find promising paths.
The Opinosis-Graph has 3 unique properties that help generate abstractive summaries; these are the key concepts used in the summarization algorithm.
There are three unique properties of the Opinosis-Graph that help in finding candidate summaries.
If you look at the words "drop" and "frequently" from sentence 2, even though the gap between the words is 2, because there is already a link between "drop" and "frequently", you can leverage this link to find more redundancies.
Here are two new sentences: "calls drop frequently with the iphone" and "calls drop frequently with the black berry". This is the resulting Opinosis-Graph. You can see that there is one high-redundancy path, followed by a high fan-out; the node "the" thus acts like a hub. Such a structure can actually be used to merge sentences into one, such as "calls drop frequently with the iphone and black berry". This kind of structure is easily discoverable using the Opinosis-Graph.
Well formed means well formed according to the language's grammatical rules, and in this work we emphasize 3 different aspects.
We try to capture these three criteria using the following optimization framework. The objective function ensures that the summaries reflect opinions from the original text and are reasonably well formed.
The objective function tries to optimize the representativeness and readability scores. Srep(mi) is the representativeness score and Sread(mi) is the readability score, where mi represents a micropinion.
Srep(mi) is the representativeness score of mi, where mi is a micropinion summary; Sread(mi) is the readability score of mi.
The first threshold constraint, which controls the maximum length of the summary, captures compactness. The constraint that controls the similarity between phrases also captures compactness by minimizing redundancy. Both these thresholds are user-adjustable, e.g., if the user can tolerate more redundancy.
To score the similarity between phrases, we simply use the Jaccard similarity measure.
Then the next is representativeness. In scoring representativeness, we have defined two properties of highly representative phrases: the words in each phrase should be strongly associated within a narrow window in the original text, and the words in each phrase should be sufficiently frequent in the original text. These two properties are captured by a modified pointwise mutual information function.
Readability scoring is to determine how well formed the constructed phrases are. Since the phrases are constructed from seed words, we can have new phrases that may not have occurred in the original text. The intuition is that if a generated phrase occurs frequently on the web, then this phrase is readable.
Moving into the evaluation part….
This graph shows the ROUGE scores of the different summarization methods for different summary lengths. KEA is a supervised keyphrase extraction model; we also have a simple tf-idf-based keyphrase extraction method, and then Opinosis, shown previously. For this task, WebNgram performs the best. Opinosis does not perform as well, most likely because there is a lack of structural redundancies within the full reviews of CNET.
Now we will change gears and move into the new work that I have done for the data collection part. This work is to be submitted to an upcoming conference.
So even though we have an abundance of existing opinions, there is no direct way of finding entities of interest based on opinions.
Since user reviews alone make up a big portion of online opinions, I would like to narrow the focus to crawling online reviews. The goal of this task is to provide an efficient method for crawling reviews, and I would like to do this by focusing on intelligently discovering a set of review-rich pages for a given entity, which would act as seeds to the actual crawler.
Goal of OpinoFetch is to be general enough to work across domains
The input is basically a set of entities on which reviews are required, e.g., all hotels in a particular city. Then for each entity, we find a set of initial candidate review pages, referred to as CRPs. This is done using a general web search engine such as Bing or Google, where the top-n results serve as the initial CRPs.
Then we expand the CRP list by exploring links in the neighborhood of the initial CRPs until a depth limit is met; this is to obtain more potential CRPs. Finally, to collect relevant pages, each CRP is scored in terms of entity relevance and review page relevance, and only those that satisfy the minimum thresholds are retained.
The first step is to find initial candidate review pages. This is done on a per-entity basis using a general web search engine like Bing or Google. The query format used is the entity name, followed by the brand or address, followed by the word "reviews". The intuition is that search engines already index most of the web, so we can leverage this to dig out the relevant review pages instead of trying to crawl the entire web.
The next step is to find more CRPs. This is done by exploring links around the initial CRPs. Different link exploration approaches have been proposed, but this is not our focus. In OpinoFetch, we use URL prioritization, where we follow the top-N URLs in a given page and stop when a certain depth is reached. The prioritization strategy is to bias the crawl path towards entity-related pages; this is achieved with a priority score assigned to each URL, based on the average similarity between the entity query and the URL tokens, and between the entity query and the anchor tokens. The intuition is that the more the anchor and URL resemble the entity query, the more likely the page is relevant to the target entity.
For review page relevance scoring, we use a review vocabulary: a lexicon consisting of the most commonly occurring words within review pages, each weighted by importance (the details of how this lexicon is constructed are outlined in the full thesis). The idea is to score a page based on the number of review-page words occurring in it; the intuition is that if a page has many of the review vocabulary terms, then this page is likely a review page.
Here, t is a term in the review vocabulary, and c(t, pi) is the frequency of t in page pi. The log is used to scale down the tf, otherwise one very frequently occurring word can dominate. wt(t) is the importance of t in the review vocabulary: the more important t is, the higher the weighting. Because the numerator is a sum of weights, this value can become quite large for highly dense review pages, so we need to normalize it to be able to adjust the score threshold.
It is unclear what the best normalizer would be for the raw review relevance score, so we explored 3 options. The first is the SiteMax normalizer, where we use the max review relevance score amongst pages from a given site; the intuition is that if a site is densely populated with reviews, the scores will be high, so non-review pages can get eliminated easily. For EntityMax: if an entity is highly popular, there will be many more reviews on that entity, resulting in a higher Srev(pi) score; if an entity is not so popular, the amount of reviews will be sparse, resulting in a lower Srev(pi) score. A review page of an unpopular entity would still receive a high score, because the maximum Srev(pi) score would not be high; the normalizer thus gets adjusted according to entity popularity. The third option is to combine EntityMax or SiteMax with the global maximum, to help with cases where EM or SM are unreliable.
Most review pages have URLs that are highly descriptive. For example, the URL for iPhone reviews on Amazon contains the name of the item within the URL itself, and the URL for reviews of the Hampton Inn on TripAdvisor contains the name of the hotel and the city. With this, entity relevance scoring is based on how similar the entity query is to the page URL; the intuition is that the more the URL resembles the query, the more likely the page is relevant to the target entity.
There are a number of ways in which we can implement the proposed steps that I just described. However, if we want to serve real applications, we need to think about what will make the approach useful in practice and useful to client applications.
First of all, we need an approach that is efficient, because our goal is to allow review collection for a large number of entities, so the task should terminate in reasonable time with reasonable accuracy. The problem usually happens when we cannot access required information fast, for example when computing term frequencies within a page for relevance scoring. The second thing that would be useful is rich information access, to enable client applications to obtain information beyond just the list of crawled pages, for example to get all review pages for entity X from the top 10 popular sites. With current methods it would be difficult to get such information, so we need a rich information representation to deduce it. Databases are not suitable for this because you will be dealing with complex joins and they do not naturally model complex information; the web graph does not model complex relationships either.
So, this is an example of a FetchGraph. As you can see, each component is represented by a simple node in the graph, and relationships are modeled using edges. You have entity nodes, site nodes, page nodes, term nodes, and logical nodes, which are conceptual nodes.
Entity nodes represent the entities on which reviews are needed, and you can have a very large set of entities (e.g., all hotels in the US); here you only have three. Entity nodes link to a set of page nodes, based on the set of CRPs found for each entity: if p1, p2, and p3 appeared as the CRPs for E1, then there would be edges from E1 to p1, p2, and p3. The pages themselves can be related to one another: for example, if one page is a near duplicate of another, the pages can be linked, or if one page is the parent of another, you can model this relationship. Next, each page can be made up of several components, such as the title, the URL, and the textual contents; this is modeled using logical nodes, which are simply nodes that represent different concepts. Each of these components is made up of terms, modeled via relationships with term nodes; each term node represents a unique word in the entire vocabulary, so you would only have one node per term. The edges to the term nodes can hold term weighting information. Then you can have other logical nodes; in OpinoFetch, we have the query and an opinion vocabulary, whose contents are captured using relationships with term nodes.
The FetchGraph has many uses. First of all, it serves as a simple data representation structure: you don't need a separate index for the terms and all the other components. Then you can access all sorts of information using this one structure. You can get various statistics; for example, you can easily get the tf of a word in a given page by just reading the edge weight connecting the content node to the relevant term node. You can also access complex relationships and global information, because the relationships between the different components are tracked over the course of data collection. Also, since this structure has the potential of being compact, it can easily be made an in-memory data structure. Since the network itself can be persisted and accessed at a later time, the client application can use it to answer interesting and important application-related questions; for example, the client can obtain all review pages for a given entity from the top 10 popular sites.
So now we will look at how to obtain the needed statistics from the FetchGraph. It is very easy to compute the raw review relevance score: the review vocabulary is modeled using a logical node where all outgoing edges to term nodes represent the terms that are part of the vocabulary, and the weights on the edges represent the importance of each term. So to compute the raw review relevance score, we look at the content node of a page and see which of the terms appear in both the content and the review vocabulary. With this, you do not need to parse page contents each time a page is encountered, and the lookup of review vocabulary words within a page is fast.
To compute the similarity between the URL and the entity query, you access the URL node of a page and the corresponding query node. The union of terms can be obtained by looking at all term nodes connected to the URL or query node; for the intersection, you look at term nodes that both are connected to.
The gold standard is the set of valid URLs for a given entity around the vicinity of the search results. Precision is computed as the number of relevant review pages divided by the total number of retrieved pages for the given entity. Recall is computed as the number of relevant pages divided by the number of relevant pages according to the gold standard.
This graph shows the recall achieved by Google, OpinoFetch, and the unnormalized version of OpinoFetch at different search result sizes. We see that the recall achieved by Google is consistently low, even with an increasing number of results, and much lower than OpinoFetch. This shows that a lot of the search results are not necessarily relevant to the entity query or are not direct pointers to review pages, and that there are many more relevant review pages around the vicinity of the search results than what the search engine deems relevant.
Then we see that with OpinoFetch, with an increasing number of search results, the recall actually improves. This shows that there is a lot of relevant content around the vicinity of the search results, and OpinoFetch is able to discover it. The actual recall is higher if you account for near-duplicate pages (where you get penalized if you don't find all versions of a page).
Next, we see that by normalizing the raw review relevance score, the recall is actually better than without normalization. One reason for this is that the scores are normalized using special normalizers like EM or SM (in this case EM), so the scores are normalized according to entity popularity, and it is easier to identify truly relevant review pages from irrelevant ones.
The next question is: what is the best way to normalize the review relevance score? This graph shows the average change in precision over not pruning the crawled pages, using different normalizers. Here we can see that EM+GM gives the best precision and SM gives the lowest. SM is the worst performing most probably because certain sites, like TripAdvisor, have reviews on different types of entities (attractions, hotels, and so on), so using the max score from the site may be unreliable for sparse entities like attractions.
Next, we look at the growth of the FetchGraph. Since we use a single network to track all information, it may seem that the FetchGraph will grow too large too fast. This graph shows the growth of the FetchGraph with respect to the number of pages crawled. It is clear that the FetchGraph's growth is actually linear in the number of pages collected, and this is without any form of optimization or compression; the growth can be further contained with more optimizations. So this shows that it is possible to have this as an in-memory data structure for different data collection problems.
Now we will look into the improvement in efficiency using the FetchGraph. The first row shows the average time to compute the raw relevance score: about 0.085 ms using the FetchGraph and 8.62 ms without it. Without the FetchGraph, you need to load a page into memory each time, parse the page, and then do the score computation; even if the page was already previously encountered, you would still have to load and parse it. With the FetchGraph, a page is loaded into memory only once, and the connections with the review vocabulary are immediately established, so from then on it is very straightforward to compute Srevraw.
Then, to compute the EM normalizer, it takes 0.056 ms using the FetchGraph and 4.39 s without it. This is because without any data structure, you basically would have to load sets of pages back into memory to find the EntityMax score normalizer. With the FetchGraph, however, you track global information, so you just need to do a lookup on the related sets of pages and obtain the max scores from that.
Now I will talk a little about the web demo system that I have developed, called Findilike, which integrates some of the ideas from this thesis. The demo was shown at WWW 2012.
Findilike finds and ranks entities based on a set of user preferences. These can be unstructured opinion preferences, which is the unique part, and also structured preferences such as price, brand, and so on. Beyond search, it supports analysis of entities in terms of textual review summaries and tag cloud visualization of reviews. The current version works in the hotels domain.
So this is the interface of Findilike. This is where you specify the unstructured preferences; in this case it is a search for clean hotels. This is where you specify the structured preferences, such as the distance from a particular location; in this case, the location specified is Universal Studios in LA. This is the ranking of hotels based on the preferences specified, so you have opinion preferences as well as distance.
This shows the tag clouds of reviews.
This shows the textual review summaries. You can see that the summaries are fairly well formed.
I have also updated part of the demo with reviews crawled using the OpinoFetch method. Here is an example: this summary is for the Hampton Inn in Champaign. It was originally based on the initial reviews crawled from 1 or 2 sources, a total of 26 reviews. With the reviews crawled using OpinoFetch, I obtained about 135 reviews from 8 sources, after doing some filtering. I actually wrote a baseline review extractor to extract the individual reviews; the reviews selected were based on the length of the review (not too short) and a subjectivity score.
In terms of future work, with Opinosis, I would like to look into how to scale up the approach to really large amounts of text; I want to explore the use of the MapReduce framework for this. I also would like to see how the approach works on other types of text, such as tweets, Facebook comments, news articles, and patient health records. For the work on Opinion-Based Entity Ranking, I would like to see how to use query logs and click information to further improve the ranking of entities; this is now possible because everything is logged in the demo system. I would also like to look into the use of phrasal search for ranking. I have not had much success with phrase search and I would like to understand why; perhaps I would have to try a back-off type of approach, where the scoring of entities is first based on the phrase and then without the phrase restriction.
For the work on opinion acquisition, I would like to compare the proposed method with a supervised crawler and also see how to further improve recall using just web search engines. To do this at a reasonable scale, I need to think about how to approximate judgments without relying completely on human judges.