This is the presentation that was delivered at BCS-IRSG ECIR 2018. This work proposes an extension to the Complex Searcher Model (CSM), enabling us to model user interactions on a Search Engine Results Page (SERP). Grounded in Information Foraging Theory (IFT), we propose a new stopping decision point within the CSM. Results of simulations show that this new stopping decision point improves the realism of searcher models, and suggest that models and measures used in Information Retrieval research need to be u
1. ECIR 2018: Grenoble, France March 27th, 2018
Information Scent,
Searching and Stopping
Modelling SERP Level Stopping Behaviour
David Maxwell and Leif Azzopardi
With thanks to Yashar Moshfeghi
2. Information Foraging Theory
§ Proposed by Pirolli and Card¹, in turn based upon Foraging Theory²
§ Composed of different models: Diet Model, Scent Model, Patch Model
¹ Pirolli and Card (1999), ² Stephens and Krebs (1986)
4. Considering Scent and Patches
Part of Information Foraging Theory – Pirolli and Card (1999)
Does this patch smell good?
§ Yes! Enter the patch, investigate
§ No: move on! Find another patch
5. IFT: Patches and Scent
§ Patches typically modelled as SERPs¹
§ Scent determined by a series of proximal cues presented on the SERP²
[Figure: example SERP with tabs Web, Images, Video, Maps, News, Shopping, More and Search Tools; search box: mclaren mcl33 news; card: McLaren MCL33, Formula One Car]
Result: McLaren: Prospects for 2018 - BBC Sport
www.bbc.co.uk/sport/formula1/43162270
23 Feb 2018 - McLaren have revealed the car they hope will return them to at least relative competitiveness. The MCL33 has switched from…
Result: McLaren 2018 F1 Car: First Pictures of the…
www.motorsport.com/.../video-mclaren-fires-…
16 Feb 2018 - McLaren has successfully fired up its new Renault-powered MCL33 for the first time at its Woking factory.
Proximal cues labelled: Titles, Sources, Freshness, Snippets; More Contemporary: Thumbnails
¹ For example, look at Ong et al. (2017), ² Pirolli & Card (1995)
6. The SERP “Overview”
§ Current models and measures assume that a searcher will examine at least one snippet
§ Combining the patch and scent models, what about a SERP’s initial impression as a whole?
§ Scent shown to affect searcher behaviour¹
[Figure: example SERP for the search “glasgow university”, annotated with question marks]
University of Glasgow :: Glasgow, Scotland, UK
https://www.gla.ac.uk/
The University of Glasgow, Scotland, UK. The University of Glasgow is a major research-led university operating in
University of Strathclyde, Glasgow
https://www.strath.ac.uk/
The University of Strathclyde, located in Glasgow city centre, is a multi-award-winning UK university. We are...
¹ Card et al. (2001)
7. Research Questions
§ Operationalising scent as the performance of
a SERP (from an overview), can we:
RQ1 …obtain improvements in performance?
RQ2 …obtain an approximation of actual
searcher stopping behaviour?
9. Complex Searcher Model
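Examine Topic → Generate Queries → Select Query → Issue Query → View SERP → Examine Snippet
§ Attractive? Yes: Click Document → Assess Document. No: Continue on SERP?
§ Relevant? Yes: Mark Document → Continue on SERP? No: Continue on SERP?
§ Continue on SERP? Yes: Examine Snippet. No: Abandon SERP → Stop Session?
§ Stop Session? Yes: session ends. No: Select Query
§ Select Query, Out of Queries: session ends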
Model adapted from Baskaya et al. (2013) and Thomas et al. (2014)
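10. Complex Searcher Model (Rev. II)
Examine Topic → Generate Queries → Select Query → Issue Query → View SERP → Appears Useful?
§ Appears Useful? Yes: Examine Snippet. No: Abandon SERP → Stop Session?
§ Attractive? Yes: Click Document → Assess Document. No: Continue on SERP?
§ Relevant? Yes: Mark Document → Continue on SERP? No: Continue on SERP?
§ Continue on SERP? Yes: Examine Snippet. No: Abandon SERP → Stop Session?
§ Stop Session? Yes: session ends. No: Select Query
§ Select Query, Out of Queries: session ends
Model adapted from Maxwell and Azzopardi (CIKM 2016)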
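The revised flow above can be sketched as a short simulation loop. This is only an illustrative sketch, not the SimIIR implementation: `Result`, `Session`, `run_session`, the `search` and `serp_appears_useful` callables, and the `max_depth` / `max_queries` cut-offs (standing in for the "Continue on SERP?" and "Stop Session?" decisions) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    doc_id: str
    attractive: bool  # would the snippet entice a click?
    relevant: bool    # would the searcher judge the document relevant?


@dataclass
class Session:
    marked: list = field(default_factory=list)
    serps_abandoned_on_sight: int = 0


def run_session(queries, search, serp_appears_useful, max_depth=10, max_queries=3):
    """Walk the flowchart once; every callable here is a hypothetical stand-in."""
    session = Session()
    for issued, query in enumerate(queries, start=1):   # Select Query, Issue Query
        serp = search(query)                            # View SERP
        if not serp_appears_useful(serp):               # Appears Useful? (new decision point)
            session.serps_abandoned_on_sight += 1       # Abandon SERP without examining snippets
        else:
            for depth, result in enumerate(serp, start=1):        # Examine Snippet
                if result.attractive:                             # Attractive? -> Click, Assess
                    if result.relevant:                           # Relevant?
                        session.marked.append(result.doc_id)      # Mark Document
                if depth >= max_depth:                            # Continue on SERP? (assumed cut-off)
                    break
        if issued >= max_queries:                       # Stop Session? (assumed cut-off)
            break
    return session                                      # also reached when out of queries
```

Passing `lambda serp: True` as `serp_appears_useful` reduces the loop to the earlier model, in which every SERP is entered.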
12. Methodology Basics
§ TREC AQUAINT: Newswire collection
§ TREC QRELS: Robust Track (50 topics)
§ The SimIIR Framework: For conducting simulations of interaction
§ User Study Data: Data extracted from interaction log of a prior user study¹ (SIGIR 2017); n=53, ad-hoc retrieval, time limited
§ Provides Grounding: Extracted interaction probabilities & costs
§ Altering Scent… by increasing lengths of snippets observed
¹ Maxwell et al. (2017)
13. SERP Level Stopping
§ Experimented with different simulated users representing SERP stopping strategies
§ Lower Bound User: Baseline – enter, examine a SERP regardless of scent
§ Upper Bound User: Only enter when SERP offers a good scent
§ Stochastic Users: Three users, utilising P(E|LS) and P(E|HS)
§ Low/high scent threshold: P@10=0.0, as defined by Wu et al. (2014)
Decision point: View SERP → Appears Useful? (Yes / No)
14. SERP Level Stopping
§ All SERPs from study are split into Low Scent (P@10=0.0) and High Scent (P@10>0.0)
§ P(E|LS) = (#SERPs w/click) / (#SERPs), over Low Scent SERPs
§ P(E|HS) = (#SERPs w/click) / (#SERPs), over High Scent SERPs
§ Based on the definition of an abandoned SERP by Hassan and White (2013)
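The two entry probabilities above could be estimated from an interaction log along the following lines. This is a minimal sketch: the log format (one pair of relevance grades and a click flag per SERP) and the function names are assumptions made for illustration, and any grade above 0 is treated as relevant when computing P@10.

```python
def precision_at_10(relevance):
    """P@10 from the relevance grades of the ranked results (grade > 0 counts as relevant)."""
    return sum(1 for grade in relevance[:10] if grade > 0) / 10


def entry_probabilities(serps):
    """serps: iterable of (relevance_grades, had_click) pairs, one per SERP in the log."""
    counts = {"LS": [0, 0], "HS": [0, 0]}  # [SERPs with a click, SERPs]
    for relevance, had_click in serps:
        scent = "HS" if precision_at_10(relevance) > 0.0 else "LS"
        counts[scent][0] += int(had_click)
        counts[scent][1] += 1
    # None marks a scent group with no SERPs in the log
    return {scent: (clicked / total if total else None)
            for scent, (clicked, total) in counts.items()}


# Hypothetical log: four low scent SERPs (one clicked), two high scent SERPs (both clicked)
log = [
    ([0] * 10, True), ([0] * 10, False), ([0] * 10, False), ([0] * 10, False),
    ([1] + [0] * 9, True), ([0, 2] + [0] * 8, True),
]
print(entry_probabilities(log))  # {'LS': 0.25, 'HS': 1.0}
```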
15. SERP Level Stopping
Stochastic Users
User     P(E|LS)  P(E|HS)
Naïve    0.74     0.77
Average  0.34     0.80
Savvy    0.00     0.82
Naïve → Average → Savvy: tends towards a more efficient searcher
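The "Appears Useful?" decision for the five simulated users might be sketched as below. The probabilities are read from the slide as Naïve (0.74, 0.77), Average (0.34, 0.80) and Savvy (0.00, 0.82), which is an interpretation of the flattened slide layout; the function and dictionary names are hypothetical.

```python
import random

# (P(E|LS), P(E|HS)) per stochastic user, as read from the slide
STOCHASTIC_USERS = {
    "naive": (0.74, 0.77),
    "average": (0.34, 0.80),
    "savvy": (0.00, 0.82),
}


def enters_serp(user, p_at_10, rng=random):
    """Return True if the simulated user enters (examines) the SERP."""
    high_scent = p_at_10 > 0.0           # low/high scent threshold: P@10 = 0.0
    if user == "lower":                  # baseline: enter regardless of scent
        return True
    if user == "upper":                  # only enter when the SERP offers a good scent
        return high_scent
    p_low, p_high = STOCHASTIC_USERS[user]
    return rng.random() < (p_high if high_scent else p_low)
```

Passing a seeded `random.Random` instance as `rng` keeps simulation runs repeatable.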
16. Snippet Level Stopping
§ Also considered three established snippet level stopping strategies: SS1, SS2 and SS3¹
§ SS1: Fixed Depth (@3)
§ SS2: Adaptive (@2)
§ SS3: Adaptive (@2)
Decision points: Snippet Attractive? (Yes / No) → Continue on SERP? (No)
¹ Maxwell et al. (2015)
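A rough sketch of how the three "Continue on SERP?" rules could look in code follows. The slide only names the strategies, so the adaptive rules here are assumptions: SS2 is taken to stop once two unattractive snippets have been seen in total, and SS3 once two have been seen in a row. The function name and parameters are hypothetical.

```python
def continue_on_serp(strategy, judgements, fixed_depth=3, tolerance=2):
    """judgements: True (attractive) / False (unattractive) for each snippet seen so far."""
    if strategy == "SS1":
        # Fixed depth: keep going until `fixed_depth` snippets have been examined
        return len(judgements) < fixed_depth
    if strategy == "SS2":
        # Assumed rule: stop after `tolerance` unattractive snippets in total
        return judgements.count(False) < tolerance
    if strategy == "SS3":
        # Assumed rule: stop after `tolerance` unattractive snippets in a row
        run = 0
        for attractive in reversed(judgements):
            if attractive:
                break
            run += 1
        return run < tolerance
    raise ValueError(f"unknown strategy: {strategy}")
```

Under these assumptions, SS2 and SS3 differ only when an attractive snippet separates two unattractive ones: after seeing `[False, True, False]`, SS2 stops and SS3 continues.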
18. RQ1 Examining Performance
[Figure: SS2: Mean Depth per Query vs. CG; x-axis: Mean Depth per Query (0 to 20), y-axis: Mean Cumulative Gain (CG) (0 to 3)]
User                     CG    D/Q
Upper (Good scent only)  2.63  5.52
Savvy (Top 15)           2.45  4.55
Average (53 subjects)    1.75  10.06
Naïve (Bottom 15)        1.19  12.11
Lower (Baseline)         1.21  7
19. RQ1 Examining Performance
§ Good SERP – lots of gain, and fast! Stop comparatively early
§ Bad SERP – slow to acquire gain, stop later
[Figure: Cumulative Gain (CG) over Time, with curves for a Good and a Bad SERP]
20. RQ2 Real-World Comparisons
§ Used the Mean Squared Error to compare user study data against simulation data
§ Replayed a total of 175 queries (over four TREC topics) for this purpose
Query      Q0                   Q1                  Qn
           wildlife extinction  endangered species  ...
RW C.D.    5                    2                   …
Sim. C.D.  6                    1                   …
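As a worked illustration of the comparison above, the Mean Squared Error over paired per-query values can be computed as follows. The sketch assumes that the "RW C.D." and "Sim. C.D." rows hold one value per replayed query, with 5 and 2 belonging to the real-world row and 6 and 1 to the simulated row; the function name is hypothetical.

```python
def mean_squared_error(real, simulated):
    """MSE between paired per-query values from the user study and the simulation."""
    if len(real) != len(simulated) or not real:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((r - s) ** 2 for r, s in zip(real, simulated)) / len(real)


rw_cd = [5, 2]   # Q0, Q1 from the user study (as read from the slide)
sim_cd = [6, 1]  # Q0, Q1 from the simulation (as read from the slide)
print(mean_squared_error(rw_cd, sim_cd))  # 1.0
```

A lower MSE means the simulated user's behaviour sits closer to what the real searchers did on the same queries.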
21. RQ2 Real-World Comparisons
[Figure: SS2: Approximating Stopping Depths; x-axis: Mean Depth per Query (0 to 25), y-axis: Mean Squared Error (MSE) (180 to 400)]
User                     MSE    D/Q
Upper (Good scent only)  210.4  8.28
Savvy (Top 15)           199.3  10.90
Average (53 subjects)    190.1  10.79
Naïve (Bottom 15)        191.5  12.58
Lower (Baseline)         200.6  9.85
Actual D/Q: 10.65
22. Study Conclusions
§ Information scent affects searching and stopping behaviours
§ Adding a SERP stopping decision point is an intuitive contribution
§ Offers improved performance
§ Also leads to better predictions
§ But: only if searchers can differentiate between SERPs offering a good scent and those offering a poor one
23. What does this Mean?
§ We need to rethink the user models that we use to measure user search performance
§ Currently all measures assume that at least the first result summary/snippet is examined – implications for session based measures¹
§ People don't always examine pages in detail, and savvy users quickly ignore poor SERPs
¹ See Kanoulas et al. (SIGIR 2011) for an excellent take on session based measures.
24. Next Steps
§ What cues do searchers look for in particular on a SERP?
§ What cues under what contexts?
§ How can we better render SERPs to help searchers identify these cues?
§ The SimIIR framework now allows you to explore the power of simulation for yourself, reproduce these findings, or develop your own analysis of other searcher behaviours
25. Selected References & Credits
P. Pirolli and S. Card. Information Foraging. Psychological Review 106, pages 643–675, 1999. – Information Foraging Theory Paper.
D. Stephens and J. Krebs. Foraging Theory. 1986. – Foraging Theory book.
K. Ong, K. Järvelin, M. Sanderson and F. Scholer. Using Information Scent to understand mobile and desktop web search behavior. In Proc. 40th ACM SIGIR, pages 295–304, 2017. – Example of information scent usage.
P. Pirolli and S. Card. Information Foraging in information access environments. In Proc. 13th ACM SIGCHI, pages 51–58, 1995. – Example of proximal cues.
S. Card, P. Pirolli, M. Van Der Wege, J. Morrison, R. Reeder, P. Schraedley and J. Boshart. Information scent as a driver of web behavior graphics: Results of a protocol analysis method for web usability. In Proc. 19th ACM SIGCHI, pages 498–505, 2001. – Information scent affecting searcher behaviours.
F. Baskaya, H. Keskustalo and K. Järvelin. Modeling behavioral factors in interactive information retrieval. In Proc. 22nd ACM CIKM, pages 2297–2302, 2013. – Prior user model.
P. Thomas, A. Moffat, P. Bailey and F. Scholer. Modeling decision points in user search behavior. In Proc. 5th IIiX, pages 239–242, 2014. – Prior user model.
D. Maxwell and L. Azzopardi. Agents, Simulated Users and Humans: An Analysis of Performance and Behaviour. In Proc. 25th ACM CIKM, pages 731–740, 2016. – Prior user model.
D. Maxwell, L. Azzopardi and Y. Moshfeghi. A Study of Snippet Length and Informativeness: Behaviour, Performance and User Experience. In Proc. 40th ACM SIGIR, pages 135–144, 2017. – User study with which simulations in this study were grounded.
D. Maxwell, L. Azzopardi, K. Järvelin and H. Keskustalo. Searching and Stopping: An Analysis of Stopping Rules and Strategies. In Proc. 24th ACM CIKM, pages 313–322, 2015. – Query level stopping strategy analysis.
W-C. Wu, D. Kelly and A. Sud. Using information scent and need for cognition to understand online search behavior. In Proc. 37th ACM SIGIR, pages 557–566, 2014. – Information scent.
Image Credits
Flickr: Stuart Heath, Alberto, @sage_solar. Various vector artworks from Freepik.com.