Exploring Session Search

Gene Golovchinsky
FX Palo Alto Laboratory, Inc.
@HCIR_GeneG
Thanks to:
Jeremy Pickens, Abdigani Diriye, Tony Dunnigan
Exploratory search


– Interactive
– Information seeking
– Anomalous state of knowledge
– Evolving information need
– Often recall-oriented
One Query to Rule Them All

No single query satisfies a typical exploratory search
information need


Search strategies involve many queries


Queries return overlapping results
Why we’re here
1. How do we know what’s a session?

2. How do we help people deal with this complex task?

3. How do we evaluate systems and algorithms?
Warning

THIS TALK CONTAINS EXPLICIT
CONTENT
Explicit vs. implicit sessions
Explicit sessions
  1. We ask the person

  2. We infer it from structural aspects
     of the search context
     Task context may provide strong
     organizing cues
    For example, genealogical
    searches are often tied to a
    person in a family tree

What about implicit sessions?
Implicit session detection is based on implicit assumptions

How do we detect a session?
   – Time heuristics
   – Client connection heuristics
   – Query similarity heuristics


What are we assuming?
   – Person works continuously
   – Person does not switch tasks
   – Enough overlap in queries


How good are these assumptions?
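A minimal sketch combining two of these heuristics, a time gap plus query-term overlap; the 30-minute gap and the 0.3 Jaccard threshold are illustrative assumptions, not values from the talk:

    def segment_sessions(events, max_gap_seconds=30 * 60, min_similarity=0.3):
        """Split chronologically ordered (timestamp, query_terms) events
        into sessions using a time-gap heuristic backed by query overlap."""
        def jaccard(a, b):
            union = a | b
            return len(a & b) / len(union) if union else 0.0

        sessions = []
        for timestamp, terms in events:
            if sessions:
                last_time, last_terms = sessions[-1][-1]
                # same session if the gap is short or the queries overlap
                if (timestamp - last_time <= max_gap_seconds
                        or jaccard(set(terms), set(last_terms)) >= min_similarity):
                    sessions[-1].append((timestamp, terms))
                    continue
            sessions.append([(timestamp, terms)])
        return sessions

Note how the sketch bakes in exactly the assumptions listed above: it trusts that short gaps mean continuous work on one task, and that queries in the same session share enough terms to be detected.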
Tradeoffs
Implicit sessions
  Pros
    – No explicit user input required
  Cons
    – Effectiveness relies on precision-oriented information needs and
      inter-query similarity, i.e., on redundancy
    – More difficult to connect recurring or ongoing instances of the
      same information need

Explicit sessions
  Pros
    – Accurate
    – Needed for collaboration
    – Durable over time
  Cons
    – Requires manual input in some cases
Dealing with redundancy
Strategies
  – Ignore it
      The traditional approach
  – Manage redundancy in the UI
      Ancestry.com, Querium
  – Increase diversity through scoring
      Some algorithmic evaluation, but are such interactive systems
      deployed?
Manage redundancy in the UI

COPING WITH REDUNDANCY
Some UI examples
Google
       +1 but no session awareness & no good persistent visual feedback


Bing
       Visible query history but no help with documents


Ancestry.com
       Flags previously saved records for current person


Querium user interface
       Variety of document- and query-centric displays
Ancestry.com: Query overlap
How can we help people make sense of search results?
  What’s new?
  What’s redundant?
  What’s useful?
  What’s not useful?
Querium: Filtering by process metadata

History of interaction
during a search can be
projected onto current
results
Querium: Visualizing re-retrieval
Document-centered
retrieval history can be
projected onto each search
result

Indicates “important”
documents

Indicates new documents
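As a sketch of the general idea (not Querium's actual implementation), projecting document-centered retrieval history onto a result list might look like this, with retrieval_counts and seen_docs accumulated over the session:

    def annotate_results(results, retrieval_counts, seen_docs):
        """Tag each result with its session history so the UI can flag
        new, re-retrieved, and already-viewed documents."""
        return [{
            "doc": doc,
            "times_retrieved": retrieval_counts.get(doc, 0),  # re-retrieval count
            "is_new": doc not in retrieval_counts,            # first time retrieved
            "seen": doc in seen_docs,                         # user has viewed it
        } for doc in results]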
Querium: Query-centric view
Increasing diversity

PREVENTING REDUNDANCY
Some (cor)related metrics

[Diagram: novelty, precision, diversity, and recall arranged around
redundancy; the exact relationship is hard to pin down.]
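One illustrative way to make per-query novelty (and its complement, redundancy) concrete within a session is the fraction of a query's results not retrieved by any earlier query. This operationalization is an assumption of mine, not a definition from the talk:

    def session_novelty(result_lists):
        """For each query in a session, compute the fraction of its
        results not retrieved by any earlier query."""
        seen = set()
        novelty = []
        for results in result_lists:
            new = [doc for doc in results if doc not in seen]
            novelty.append(len(new) / len(results) if results else 0.0)
            seen.update(results)
        return novelty  # per-query redundancy is 1 - novelty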
Increasing diversity with scoring

Pros
  – Can incorporate prior explicit and implicit relevance assessments
  – More focused queries may retrieve more pertinent documents at a
    given cutoff

Cons
  – Relies on accurate assessment of relevance
  – No way to recover “organic” results, so hard for people to
    understand the effect of personalization

[Diagram: the query and session state feed a black-box ranker; the
displayed ranking collects user feedback, which updates session state,
until the user stops.]
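A minimal sketch of what such a session-aware, black-box scorer might look like; the penalty scheme and weights are illustrative assumptions, not Querium's or any deployed system's model:

    from dataclasses import dataclass, field

    @dataclass
    class Session:
        seen_docs: set = field(default_factory=set)
        judged_nonrelevant: set = field(default_factory=set)

    def session_score(doc, query, session, base_score, seen_penalty=0.5):
        """Combine a base retrieval score with session state: suppress
        explicit rejections, demote documents already seen."""
        if doc in session.judged_nonrelevant:
            return 0.0                        # suppress explicit rejections
        score = base_score(doc, query)        # any underlying retrieval model
        if doc in session.seen_docs:
            score *= (1.0 - seen_penalty)     # demote previously seen docs
        return score

Because the adjustment happens inside the ranking function, the unadjusted ordering is never materialized, which is exactly the “no organic results” con above.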
Increasing diversity with post-processing

Pros
  – Can recover “organic” results
  – Supports feedback on incorrect inference (e.g., when the user
    selects a demoted document)
  – Accommodates shifting information needs better
  – Can be applied interactively

Cons
  – Limited document set

[Diagram: the query produces an “organic” ranking; session state drives
a re-ranking step; the displayed ranking collects user feedback, until
the user stops.]
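By contrast, a post-processing sketch re-ranks a copy of the organic list and leaves the original untouched, so the unpersonalized ranking stays recoverable (the demotion scheme is again an illustrative assumption):

    def rerank(organic_ranking, seen_docs, seen_penalty=0.5):
        """Re-rank an organic result list against session state while
        keeping the original list available to the caller."""
        def key(item):
            rank, doc = item
            # push previously seen documents toward the bottom of the list
            offset = seen_penalty * len(organic_ranking) if doc in seen_docs else 0
            return rank + offset
        return [doc for _, doc in sorted(enumerate(organic_ranking), key=key)]

Because the caller still holds the organic list, the UI can toggle between the two rankings, addressing the recoverability con of scoring-time approaches.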
A holistic approach

EVALUATION
Vague generalities
Session-based search must be evaluated as a human-machine system
      Hard to account for real human behavior through simulations only


Recall and precision do not tell the whole story
      Exploratory search is inherently a learning process
      Effort, knowledge gain, frustration, serendipity important


Look at patterns of interaction that led to discovery
      Hard to evaluate marginal contribution of each query due to
      negative results, learning, information need drift
Some thoughts on evaluating algorithms
Small gains in retrieval effectiveness will be swamped by
interaction, good or bad
       Small statistically significant effects are meaningless in practice


Evaluation “in the wild” relies on users for ground truth
      Use post-hoc analysis to test how well algorithms predicted users’ choices


Look at system’s ability to help people recognize useful
documents
       How many times was a document retrieved before it was seen?
       This works for lab and naturalistic studies
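The retrieved-before-seen measure lends itself to simple log analysis. A sketch, assuming a log of (query_index, doc_id) retrieval records and a map from doc_id to the query index at which the user first viewed it; the log format is hypothetical:

    from collections import defaultdict

    def retrievals_before_seen(retrieval_log, first_seen):
        """Count how many times each document was retrieved before the
        user first viewed it."""
        counts = defaultdict(int)
        for query_index, doc_id in retrieval_log:
            if doc_id in first_seen and query_index < first_seen[doc_id]:
                counts[doc_id] += 1
        return dict(counts)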
In closing…
         Information needs evolve
        Queries are approximations
          Knowledge is uncertain


 Design challenge: Help people plan
future actions by understanding the
 present in the context of the past
While I have your attention…
There is a pending proposal to create a StackExchange
site for information retrieval.
      Think of it as Stack Overflow for IR geeks.

      We need more people to vote & promote.

http://area51.stackexchange.com/proposals/39142/information-retrieval-and-search-engines
Do I still have your attention?
IIiX 2012
       August 21-24, 2012, Nijmegen, The Netherlands
       Deadline for papers April 9, 2012

EuroHCIR 2012
       Same place, August 25
       Deadline for papers is June 22, 2012

HCIR 2012: The 6th Symposium on Human Computer Information
Retrieval
       October 4-5, 2012, Boston, Massachusetts, USA
       Submission deadline mid-summer
       Will publish works in progress and archival, full-length papers
Image credits




http://www.flickr.com/photos/torremountain/6831414535/
http://www.flickr.com/photos/bigtallguy/233176326/
http://www.flickr.com/photos/77074420@N00/198347900/
http://www.flickr.com/photos/racatumba/93569705/
http://www.flickr.com/photos/chrisolson/3595815374/
http://www.flickr.com/photos/brymo/2813028454/
http://www.flickr.com/photos/computix/108732248/
http://www.flickr.com/photos/funadium/913303959/
http://www.flickr.com/photos/moriza/189890016/
http://www.flickr.com/photos/uhdigital/6802789537/
Hiding unwanted results

Exploring session search