Presentation slides for the following paper:
Kazutoshi Umemoto, Takehiro Yamamoto, and Katsumi Tanaka. 2016. ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR '16). ACM, New York, NY, USA, 405-414. DOI: http://dx.doi.org/10.1145/2911451.2911546
ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search
1. ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search
Kazutoshi Umemoto, Takehiro Yamamoto, Katsumi Tanaka
Kyoto University, Japan
{umemoto,tyamamot,tanaka}@dl.kuis.kyoto-u.ac.jp
2. Intrinsically Diverse Search Tasks [1]
A user searches for extensive information covering diverse aspects of a single topic
[1] K. Raman et al. Toward Whole-Session Relevance: Exploring Intrinsic Diversity in Web Search (2013)
[Figure: a searcher asking "What if I continue smoking?", surrounded by aspects of the topic (relaxation, stress release, lose weight, skin aging, dental health, waste of money, high blood pressure, cancer risk) and example queries (smoking effect, smoking cancer, cigarette price)]
Multiple queries are issued to fulfill intrinsically diverse search tasks
3. Decisions in Intrinsically Diverse Tasks
[Figure: a searcher asking "What if I continue smoking?" with the queries "smoking effect" and "smoking cancer"]
Query stopping: Should I stop examining the SERP?
Query selection: What query should I issue next?
Session stopping: Should I stop the task session?
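4. Query Stopping
When to stop examining the SERP of the current query?
[Figure: two SERPs for the query "smoking effect", each with a "Stop" marker; the results are labeled rel, non-rel, non-rel, non-rel in one and very-rel, non-rel, non-rel, non-rel in the other]
Stopping at the wrong point: many wasted clicks, or miss important docs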
5. Query Selection
Which query to use for the next search?
Ineffective query: many already-browsed docs (browsed, browsed, new, browsed), much search effort
Effective query: many novel docs (new, new, browsed, new), little search effort
[Figure: example based on the query "smoking effect"]
6. Session Stopping
When to stop the whole task session?
Low additional outcomes: continuing to search is time-wasting
High additional outcomes: stopping the search misses important info
[Figure: two sessions, each with the current point marked]
7. Issues Raised by Inappropriate Decisions
Query stopping: increase wasted clicks or miss important docs
Query selection: increase in #queries
Session stopping: waste time or miss important info
Q. Why is it difficult to make rational decisions?
A.
• Searchers do not know what aspects exist and how important they are
  Difficult to formulate queries effective for covering diverse aspects
• Searchers cannot guess how much important info is (un)explored
  Difficult to find appropriate timing for query/session stopping
Objective
Help searchers make better decisions on query stopping, query selection, and session stopping
8. ScentBar: Our Proposed Query Suggestion Interface
Visualize the amount of missed information for each query
Missed information (MI): important info that the searcher misses collecting from the SERP
[Figure: query suggestions, each shown with a bar: cigarettes price increase; smoking ruins your looks; smoking benefits; diseases caused by smoking; smoking cancer risk]
Bar length = amount of MI
9. ScentBar: Our Proposed Query Suggestion Interface
[Figure: the same query suggestions with their bars, after browsing some docs through the task session]
How much info is available from the SERP?
How much remains unexplored?
16. Expected Benefits
At beginning of search task: understand info distribution
  "Cancer seems more important than price."
During each search: query stopping, query selection
  "Little MI remains. Let’s change the query."
At end of task session: session stopping
  "Having covered most aspects, I can finish the task."
[Figure: the ScentBar suggestion list at each of the three stages; the first two lists show cigarettes price increase, smoking ruins your looks, smoking benefits, diseases caused by smoking, smoking cancer risk; the last shows cigars vs. cigarettes, smoking ruins your looks, smoking effects on brain, diseases caused by smoking, smoking benefits]
17. Formalization: Missed Information
$\mathrm{MI}_{u,t}(q) = \mathrm{Gain}_t(D_u \cup D_q^k) - \mathrm{Gain}_t(D_u)$
$\mathrm{MI}_{u,t}(q)$: additional gain that can be obtained from unclicked search results
$\mathrm{Gain}_t(D_u \cup D_q^k)$: total gain obtained after browsing all SERP docs
$\mathrm{Gain}_t(D_u)$: gain obtained so far
$t$: search topic, $D_q^k$: top $k$ docs retrieved for query $q$, $D_u$: set of docs browsed by user $u$
Q. How should missed information behave?
A. It should be high when SERP documents
1. cover important aspects
2. are highly relevant
3. cover unexplored aspects
Formalize Gain to satisfy these three properties
18. Formalization: Gain
$\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \cdot \mathrm{Gain}_a(D)$
Importance: expected value of per-aspect gain ($A_t$: a set of aspects, $\Pr(a \mid t)$: aspect importance)
$\mathrm{Gain}_a(D) = \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \cdot \mathrm{Disc}_a(\{d_1, \dots, d_{i-1}\})$
Relevance: sum up document relevance weighted by aspect novelty ($\mathrm{Rel}_a$: per-aspect document relevance, $\mathrm{Disc}_a$: discount function of aspect novelty)
$\mathrm{Disc}_a(D') = \prod_{d_j \in D'} (1 - \mathrm{Rel}_a(d_j))$
Novelty: discount aspects that have mostly been covered by browsed docs
$t$: search topic, $D$ and $D'$: document sets
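The gain and missed-information definitions can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the dictionary layout for Pr(a|t) and Rel_a(d), and the toy numbers are all assumptions made for the example.

```python
from math import prod

def disc(rel_a, covered):
    """Disc_a(D'): product of (1 - Rel_a(d)) over the already-covered docs."""
    return prod(1.0 - rel_a.get(d, 0.0) for d in covered)

def gain_aspect(rel_a, docs):
    """Gain_a(D): relevance of each doc, discounted by the docs before it."""
    return sum(rel_a.get(d, 0.0) * disc(rel_a, docs[:i])
               for i, d in enumerate(docs))

def gain_topic(importance, rel, docs):
    """Gain_t(D): per-aspect gain weighted by aspect importance Pr(a|t)."""
    return sum(p * gain_aspect(rel[a], docs) for a, p in importance.items())

def missed_information(importance, rel, browsed, serp):
    """MI_{u,t}(q) = Gain_t(D_u U D_q^k) - Gain_t(D_u)."""
    unseen = [d for d in serp if d not in browsed]
    return (gain_topic(importance, rel, list(browsed) + unseen)
            - gain_topic(importance, rel, list(browsed)))

# Hypothetical toy data (not from the study): two aspects, three documents.
importance = {"cancer": 0.6, "price": 0.4}          # Pr(a|t)
rel = {"cancer": {"d1": 0.5, "d2": 0.5},            # Rel_a(d), missing = 0
       "price": {"d3": 1.0}}
browsed = ["d1"]                                    # D_u
serp = ["d1", "d2", "d3"]                           # D_q^k

mi = missed_information(importance, rel, browsed, serp)
print(round(mi, 3))
```

With these toy values the gain so far is 0.3 and the gain after browsing the whole SERP is 0.85, so the missed information is about 0.55. Once every SERP document has been browsed, the same call returns 0.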
19. Relation with Intent Aware Metrics
Gain: $\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \prod_{j=1}^{i-1} (1 - \mathrm{Rel}_a(d_j))$
is similar to
ERR-IA [1]: $\mathrm{ERR\text{-}IA}(R_q) = \sum_{a \in A_q} \Pr(a \mid q) \sum_{i=1}^{|R_q|} \frac{\mathrm{Rel}_a(r_i)}{i} \prod_{j=1}^{i-1} (1 - \mathrm{Rel}_a(r_j))$
ERR-IA discounts for lower-ranked docs (the $1/i$ factor)
Gain is a set-wise metric for evaluating the utility of documents separately from the browsing order
[1] O. Chapelle et al. Expected Reciprocal Rank for Graded Relevance (2009)
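The set-wise property can be checked with a short derivation. The following is a worked sketch rather than something stated on the slide; it uses only the per-aspect gain as defined in this deck, with documents $d_1, \dots, d_{|D|}$ taken in any order.

```latex
% Each summand is a difference of two consecutive products:
\mathrm{Rel}_a(d_i) \prod_{j=1}^{i-1} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)
  = \prod_{j=1}^{i-1} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)
  - \prod_{j=1}^{i} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)

% Summing over i = 1, ..., |D| telescopes (the empty product is 1):
\mathrm{Gain}_a(D)
  = \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \prod_{j=1}^{i-1} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)
  = 1 - \prod_{d \in D} \bigl(1 - \mathrm{Rel}_a(d)\bigr)
```

The closed form depends only on which documents are in $D$, not on the order in which they were browsed. In ERR-IA the extra $1/i$ factor prevents this telescoping, so the ranking order matters there.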
20. How to Estimate Missed Information
Need to know values of 3 gain components
1. $A_t$, a set of aspects for search topic $t$
2. $\Pr(a \mid t)$, an importance probability of aspect $a \in A_t$
3. $\mathrm{Rel}_a(d)$, a relevance score of document $d$ to aspect $a$
Estimate these components by a subtopic mining algorithm [1]
[1] K. Tsukuda et al. Estimating Intent Types for Search Result Diversification (2013)
21. User Study
• 24 subjects (within-subject design)
• 4 search topics (from NTCIR INTENT, IMine)
▶ symptoms of diabetes
▶ clinical depression
▶ dress codes for wedding ceremony
▶ dinosaurs
• ClueWeb09-JA as document corpus
Interfaces: without MI (baseline), with MI (proposed)
RQs: How our visualization of missed information affects
1. Query Stopping
2. Query Selection
3. Gain Acquisition Pattern
4. Session Stopping
5. Effort vs. Gain
[Figure: the two interfaces, each listing the suggestions cigarettes price increase, smoking ruins your looks, smoking benefits, diseases caused by smoking, smoking cancer risk; a label reads "medical domain"]
22. Experimental Procedure
Task guideline
You were given the assignment of submitting a thorough report on topic $t$. To fully understand $t$, collect relevant information on this topic from a number of different aspects that you think are important. You may end this search task when you feel there is little important information left.
There is no time limit for each topic
▶ i.e., different participants had different task completion times
▶ To investigate the effect of ScentBar on session stopping
▶ (All participants completed the whole experiment within 2 hours)
[Figure: timeline from Start to Finish (< 2 hours); for each topic, an instruction is followed by the search task on that topic with one of the UIs]
23. Accuracy of Gain Estimation
Measure correlation between estimated/oracle gains at each browsing path: how consistent?
[Figure: timeline for topic $t$ from start task, through browsing doc $d_1$ and then $d_2$, to finish task; the oracle gains $\mathrm{Gain}^*_t(\{d_1\}), \mathrm{Gain}^*_t(\{d_1, d_2\}), \dots$ are compared with the estimated gains $\widehat{\mathrm{Gain}}_t(\{d_1\}), \widehat{\mathrm{Gain}}_t(\{d_1, d_2\}), \dots$]
Topic                              Pearson's r   Spearman's ρ   Kendall's τ   Group
symptoms of diabetes               0.834         0.851          0.683         High Correlation (HC)
clinical depression                0.845         0.860          0.678         High Correlation (HC)
dress codes for wedding ceremony   0.824         0.862          0.713         High Correlation (HC)
dinosaurs                          0.710         0.702          0.529         Low Correlation (LC)
Conduct analyses for All, HC, and LC topics separately to investigate effect of gain estimation accuracy
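For readers who want to reproduce this kind of check, the three correlation coefficients can be computed as sketched below. This is a minimal pure-Python sketch under stated assumptions: the slides do not say which Kendall variant or which tooling was used, so tau-a is an assumption here, and the `oracle` and `estimated` lists are made-up values, not data from the study.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson's r between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)

# Hypothetical gains along one browsing path (not from the study).
oracle = [0.10, 0.25, 0.40, 0.55, 0.70]
estimated = [0.12, 0.30, 0.25, 0.50, 0.80]

print(pearson(oracle, estimated), spearman(oracle, estimated),
      kendall_tau_a(oracle, estimated))
```

In this toy path the estimate swaps the order of the second and third steps, which gives a Spearman's ρ of 0.9 and a Kendall's τ of 0.8.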
24. Analysis 1: Query Stopping
RQ: How does ScentBar affect users’ decisions on when to stop the current search?
Calculate how much oracle MI decreases when the current search ends
[Figure: timeline for topic $t$: start task, issue query $q$ (e.g. "smoking cancer risk"), stop searching with $q$, finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.109     0.138*      0.128     0.186*      0.069     0.071
ScentBar users collected more relevant information at query level
25. Analysis 2: Query Selection
RQ: How does ScentBar affect users’ decisions on which query to use for the next search?
Calculate how much MI remains when the new query is issued
[Figure: timeline for topic $t$: start task, issue query $q$ (e.g. "smoking cancer risk"), stop searching with $q$, finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
MI      0.211     0.241*      0.238     0.298*      0.155     0.162
ScentBar users issued queries having more missed information
26. Analysis 3: Gain Acquisition Pattern
RQ: How does ScentBar affect the temporal change in gain that users acquire through their search process?
Calculate cumulative oracle gain that participants obtained by the minute
[Figure: three panels (All topics, HC topics, LC topic) plotting oracle gain (0 to 1) against elapsed time (0 to 40) for baseline (w/o scent) and proposed (w/ scent)]
• Little difference between interfaces at early stage ($t <$ ? min)
• ScentBar users obtained higher oracle gain at late stage
27. Analysis 4: Session Stopping
RQ: How does ScentBar affect users’ decisions on when to stop their task sessions?
Calculate how much oracle MI decreases when the whole session ends
[Figure: timeline for topic $t$: start task, issue query $q$ (e.g. "smoking cancer risk"), stop searching with $q$, finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.162     0.195*      0.188     0.256*      0.106     0.110
ScentBar users collected more relevant information at session level, too
28. Analysis 5: Effort vs. Gain
RQ: How does ScentBar affect the relationship between the effort that users expend and the gain that they obtain?
Model task completion time and oracle gain via linear regression
[Figure: three panels (All topics, HC topics, LC topic) plotting oracle gain (0 to 1) against task completion time (0 to 40) for baseline (w/o scent) and proposed (w/ scent)]
ScentBar users obtained higher oracle gain per unit time
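A regression of oracle gain on task completion time can be sketched as follows. This is an illustrative ordinary-least-squares fit with an intercept; whether the study's model included an intercept is not stated on the slide, so that choice is an assumption, and the (time, gain) pairs are invented for the example.

```python
def fit_line(times, gains):
    """Ordinary least squares fit of gain = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_g = sum(gains) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (g - mean_g) for t, g in zip(times, gains))
    slope = sxy / sxx
    intercept = mean_g - slope * mean_t
    return slope, intercept

# Hypothetical sessions (completion time, oracle gain); not study data.
times = [10, 20, 30, 40]
baseline_gains = [0.1, 0.2, 0.3, 0.4]
proposed_gains = [0.2, 0.4, 0.6, 0.8]

baseline_slope, baseline_intercept = fit_line(times, baseline_gains)
proposed_slope, proposed_intercept = fit_line(times, proposed_gains)
print(baseline_slope, proposed_slope)
```

The slope can be read as the gain obtained per unit of time, so a steeper line for one interface corresponds to the "higher oracle gain per unit time" finding. In the toy data the proposed slope is twice the baseline slope.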
29. Search Behavior
• ScentBar users issued more suggestion queries for HC topics
• No significant difference in the number of wasted clicks prior to query stopping
                    All topics            HC topics             LC topic
                    baseline  proposed    baseline  proposed    baseline  proposed
%SuggestQueries     0.376     0.480       0.355     0.505*      0.437     0.405
#ClicksLastRel−Q    0.156     0.164       0.165     0.175       0.133     0.142
%SuggestQueries: the fraction of suggestion queries among all issued queries
#ClicksLastRel−Q: the number of irrelevant doc clicks prior to query stopping
30. Findings
• Effects of MI visualization on search outcomes
▶ Highly-accurate MI enabled users to
1. issue more promising queries
2. obtain higher gain at the late stage of sessions
3. obtain higher gain per unit time
▶ Less-accurate MI worsened search performance, probably because users got confused by unreliable MI behavior
• Effects of MI visualization on search behavior
▶ Users interacted with query suggestions more frequently
▶ Users clicked more irrelevant docs just before query stopping (though not statistically significant)
31. Discussion
Q. Why did MI visualization fail to reduce wasted clicks prior to query stopping?
A. Query-level MI visualization might be less informative for making decisions to stop SERP examination
[Figure: the SERP for the query "smoking cancer risk"]
Though some MI remains in this SERP, …
• Which doc should I assess next?
• Is the examination worth the effort?
Improvements in search outcomes are
• mainly due to better decisions on query selection
• not due to better decisions on query stopping
32. How to Better Support Query Stopping?
Possible solutions
▶ Visualize per-aspect relevance for each search result [1],
as well as query-level MI presentation
▶ Re-rank search results with much MI at high positions,
so that optimal stopping points can be found by searchers
who investigate SERPs in a top-down manner
▶ Make users aware of search effort [2] they have expended so far,
to help searchers understand their cost-benefit performance
[1] M. Iwata et al. AspecTiles: Tile-based Visualization of Diversified Web Search Results (2012)
[2] L. Azzopardi and G. Zuccon. An Analysis of Theories of Search and Search Behavior (2015)
These can be integrated into ScentBar
33. Summary: Visualization of Missed Information
ScentBar: query suggestion interface for intrinsically diverse tasks
[Figure: the ScentBar suggestion list (cigarettes price increase, smoking ruins your looks, smoking benefits, diseases caused by smoking, smoking cancer risk) with the searcher's thought: "Much info is still unexplored. I have to keep on searching."]
Missed Information
• Conceptualized as important info that the searcher misses collecting from the SERP
• Formalized as additional gain that can be obtained from unclicked search results
Findings
• ScentBar helped users make better decisions, especially on query selection, and made the search process more efficient when highly accurate MI was presented
• Search performance was worsened when less-accurate MI was presented
Future Work
• Improve gain estimation algorithm (e.g. by modeling topic aspects hierarchically)
• Utilize missed information in different ways (e.g. MI-based query suggestion algorithm)