Dynamic Search and
Beyond
Prof. Grace Hui Yang
InfoSense Group
Department of Computer Science
Georgetown University
huiyang@cs.georgetown.edu
Sep 29, 2018
CCIR 2018 @ Guilin
• Our graduate program focuses on
Information Systems,
Privacy and Security,
and Computer Theory.
• Ph.D., Master’s, Postdocs
• ACM International Conference on Theory of
Information Retrieval (ICTIR)
• Its importance in the IR community
• Acknowledgements to Guangxi Normal University,
CCF, and many old and new friends
Statistical Modeling of
Information Seeking
• Aims to connect user’s information seeking
behaviors with retrieval models
• The ‘dynamics’ in the search process are the
primary elements to be modeled
• I call this set of novel retrieval algorithms “Dynamic
IR Modeling”
Task: Dynamic IR
• The information retrieval task that aims to find
relevant documents for a session of multiple queries.
• It arises when information needs are complex,
vague, and evolving, often containing multiple subtopics
• It cannot be resolved by one-shot ad-hoc
retrieval
• e.g. “Purchasing a home”, “What is the meaning of
life”
E.g., find what city and state Dulles airport is in, what shuttles, ride-sharing vans, and
taxi cabs connect the airport to other cities, what hotels are close to the airport, what
cheap off-airport parking is available, and what metro stops are close to the Dulles
airport.
An Illustration: [figure showing the user, driven by an information need, interacting with the search engine]
Characteristics of Dynamic IR
• Rich interactions
• Query formulation
• Document clicks
• Document examination
• eye movement
• mouse movements
• etc.
Characteristics of Dynamic IR
• Temporal dependency
[Figure: a session unfolds over iterations 1 … n; at iteration i, query q_i yields the ranked documents D_i and the clicked documents C_i, all driven by the underlying information need I, with each iteration depending on the previous ones.]
Characteristics of Dynamic IR
• Aim for a long-term goal
• Great if we can find early what a user
ultimately wants
Reinforcement Learning (RL)
• Fits well in this trial-and-error setting
• It learns from repeated, varied attempts that
continue until success.
• The learner (also known as agent) learns from its dynamic
interactions with the world
• rather than from a labeled dataset as in supervised
learning.
• The stochastic model assumes that the system's current
state depends on the previous state and action in a non-
deterministic manner
Most of Our Work is inspired
by MDPs/POMDPs
○ Based on Markov Decision Process (MDP)
○ States: Queries
! Observable
○ Actions:
! User actions:
○ Add / remove / keep query terms
○ Nicely correspond to our definition of query change
! Search Engine actions:
○ Increase / decrease / keep term weights
○ Rewards:
! nDCG
[Guan, Zhang, and Yang SIGIR 2013]
QUERY CHANGE MODEL
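The reward here is nDCG over the documents returned for the current query. As a concrete reference point, below is a minimal nDCG sketch; it is not code from the original work, and the relevance grades and cut-off k in the example are illustrative.

```python
import math

def dcg(grades):
    """Discounted cumulative gain of graded relevance labels in ranked order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(grades))

def ndcg(ranked_grades, k=10):
    """nDCG@k: DCG of the returned ranking divided by the DCG of the same
    grades sorted in descending order (the ideal ordering)."""
    ideal = dcg(sorted(ranked_grades, reverse=True)[:k])
    return dcg(ranked_grades[:k]) / ideal if ideal > 0 else 0.0

# Relevance grades of the documents returned for the current query, in rank
# order; this value would serve as the search-engine agent's reward.
reward = ndcg([3, 2, 0, 1, 0], k=5)
```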
SEARCH ENGINE AGENT’S ACTIONS
| Term type | ∈ D_{i−1}? | Action | Example |
|-----------|------------|--------|---------|
| q_theme | Y | increase | "pocono mountain" in s6 |
| q_theme | N | increase | "france world cup 98 reaction" in s28: france world cup 98 reaction stock market → france world cup 98 reaction |
| +∆q | Y | decrease | 'policy' in s37: Merck lobbyists → Merck lobbyists US policy |
| +∆q | N | increase | 'US' in s37: Merck lobbyists → Merck lobbyists US policy |
| −∆q | Y | decrease | 'reaction' in s28: france world cup 98 reaction → france world cup 98 |
| −∆q | N | no change | 'legislation' in s32: bollywood legislation → bollywood law |
QUERY CHANGE RETRIEVAL MODEL (QCM)
○ The Bellman equation gives the optimal value for an MDP.
○ The reward function is used as the document relevance score function and is derived by working backwards from the Bellman equation.
○ Its components: the document relevance score, the query transition model, the maximum past relevance, and the current reward/relevance score.
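For reference, the Bellman optimality equation has the standard form below; the second line is only a sketch of how a QCM-style document score can be read backwards from it (current relevance plus a discounted, transition-weighted maximum past relevance), with the exact transition term defined in the cited SIGIR 2013 paper.

```latex
% Bellman optimality equation for an MDP
V^{*}(s) = \max_{a}\Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big]

% Sketch of a QCM-style document score read backwards from the Bellman equation
\mathrm{Score}(q_t, d) = P(q_t \mid d)
  + \gamma\, P(q_t \mid q_{t-1}, D_{t-1}) \max_{d' \in D_{t-1}} \mathrm{Score}(q_{t-1}, d')
```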
CALCULATING THE TRANSITION MODEL
• According to the query change and the search engine actions (applied to the current reward/relevance score):
• Increase weights for theme terms
• Decrease weights for old added terms
• Decrease weights for removed terms
• Increase weights for novel added terms
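A minimal sketch of the weight adjustments listed above, assuming simple additive steps; the real QCM model derives the adjustment strengths from the query-change and transition probabilities, so the step size and function name here are purely illustrative.

```python
def adjust_term_weights(q_prev, q_curr, prev_doc_terms, base=1.0, step=0.2):
    """Weight query terms for the current iteration based on query change.

    q_prev, q_curr: sets of terms of the previous and current query.
    prev_doc_terms: set of terms appearing in the previously retrieved D_{t-1}.
    """
    theme = q_prev & q_curr      # q_theme: terms kept across queries
    added = q_curr - q_prev      # +Delta q: newly added terms
    removed = q_prev - q_curr    # -Delta q: removed terms

    weights = {w: base + step for w in theme}            # increase theme terms
    for w in added:
        # old added terms (already seen in D_{t-1}) get decreased,
        # novel added terms get increased
        weights[w] = base - step if w in prev_doc_terms else base + step
    for w in removed:
        # removed terms seen in D_{t-1} are penalized, otherwise left unchanged
        weights[w] = -step if w in prev_doc_terms else 0.0
    return weights
```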
○ Partially Observable Markov Decision Process
○ Two agents
● Cooperative game
● Joint Optimization
WIN-WIN SEARCH: DUAL-AGENT STOCHASTIC GAME
● Hidden states
● Actions
● Rewards
● Markov
[Luo, Zhang, and Yang SIGIR 2014]
A MARKOV CHAIN OF DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
The four hidden decision-making states (q0 is the initial query):
● SRT: Relevant & Exploitation
● SRR: Relevant & Exploration
● SNRT: Non-Relevant & Exploitation
● SNRR: Non-Relevant & Exploration
Example query changes:
● scooter price ⟶ scooter stores
● collecting old US coins ⟶ selling old US coins
● Philadelphia NYC travel ⟶ Philadelphia NYC train
● Boston tourism ⟶ NYC tourism
HIDDEN DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
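Because these four decision-making states are hidden, a POMDP-style agent tracks a belief (a distribution over the states) and updates it after every observation. The sketch below is the generic Bayesian belief update, not the specific parameterization of the win-win model; the transition table T and observation table O are assumed inputs.

```python
STATES = ["S_RT", "S_RR", "S_NRT", "S_NRR"]  # (non-)relevant x exploitation/exploration

def belief_update(belief, action, obs, T, O):
    """Generic POMDP belief update: b'(s') ~ O(obs | s', a) * sum_s T(s' | s, a) * b(s).

    belief: {state: probability}
    T: {(s, action, s_next): transition probability}
    O: {(s_next, action, obs): observation probability}
    """
    new_belief = {}
    for s_next in STATES:
        predicted = sum(T[(s, action, s_next)] * belief[s] for s in STATES)
        new_belief[s_next] = O[(s_next, action, obs)] * predicted
    z = sum(new_belief.values())
    return {s: p / z for s, p in new_belief.items()} if z > 0 else dict(belief)
```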
Dual Agent Stochastic Game
ACTIONS
! User Action (Au)
○ add query terms (+Δq)
○ remove query terms (-Δq)
○ keep query terms (qtheme)
! Search Engine Action (A_se)
○ Increase / decrease / keep term weights
○ Switch on or off a search technique
○ e.g., to use or not to use query expansion
○ Adjust parameters in search techniques
○ e.g., select the best k for the top-k docs used in PRF
! Message from the user (Σ_u)
○ clicked documents
○ SAT-clicked documents
! Message from the search engine (Σ_se)
○ top k returned documents
Messages are essentially
documents that an agent thinks
are relevant.
[Luo, Zhang, and Yang SIGIR 2014]
REWARDS
! Explicit Rewards:
! nDCG
! Implicit Rewards:
! clicks
[Luo et al, SIGIR 2014, ECIR 2015]
EXPERIMENTS
○ Corpus: ClueWeb09, ClueWeb12, and the TREC DD datasets
○ Query Logs
SEARCH ACCURACY
○ Search accuracy on the TREC 2012 Session Track
◆ Win-win outperforms most retrieval algorithms on TREC 2012.
SEARCH ACCURACY
○ Search accuracy on the TREC 2013 Session Track
◆ Win-win outperforms all retrieval algorithms on TREC 2013.
◆ It is highly effective in session search.
◆ Systems in TREC 2012 perform better than in TREC 2013, because many relevant documents are not included in the ClueWeb12 CatB collection.
SEARCH ACCURACY FOR DIFFERENT
SESSION TYPES
○ TREC 2012 Sessions are classified into:
! Product: Factual / Intellectual
! Goal quality: Specific / Amorphous
| Run | Intellectual | %chg | Amorphous | %chg | Specific | %chg | Factual | %chg |
|-----|--------------|------|-----------|------|----------|------|---------|------|
| TREC best | 0.3369 | 0.00% | 0.3495 | 0.00% | 0.3007 | 0.00% | 0.3138 | 0.00% |
| Nugget | 0.3305 | -1.90% | 0.3397 | -2.80% | 0.2736 | -9.01% | 0.2871 | -8.51% |
| QCM | 0.3870 | 14.87% | 0.3689 | 5.55% | 0.3091 | 2.79% | 0.3066 | -2.29% |
| QCM+DUP | 0.3900 | 15.76% | 0.3692 | 5.64% | 0.3114 | 3.56% | 0.3072 | -2.10% |
- QCM better handles sessions that demonstrate evolution and exploration, because it treats a session as a continuous process, studying the changes across query transitions and modeling the dynamics.
How to design the states,
actions, and rewards
DESIGN OPTIONS
○ Is there a temporal component?
○ States – What changes with each time step?
○ Actions – How does your system change the state?
○ Rewards – How do you measure feedback or
effectiveness in your problem at each time step?
○ Transition Probability – Can you determine this?
! If not, then a model-free approach is more suitable
ECIR’15
… can it be more
efficient?
A Direct Policy Learning
Framework
• Learns a direct mapping from observations to actions by
gradient descent
• Define a history: A chain of events happening in a
session
• the dynamic changes of states, actions, observations,
and rewards in a session
ICTIR’15
Browse Phase
• Actor: the user
• It happens
• after the search results are shown to the user
• before the user starts to write the next query
• Records how the user perceives and examines the
(previously retrieved) search results
ICTIR’15
Decompose a history
Query Phase
• Actor: the user
• It happens
• when the user writes a query
• Assuming the query is created based on
• what has been seen in the browse phase
• the information need
ICTIR’15
Decompose a history
Rank Phase
• Actor: the search engine
• It happens
• after the query is entered
• before the search results are returned
• It is where the search algorithm takes place
Decompose a history
[Equations on the slide: the objective function, the action selection distribution (a softmax function), its gradient, and the resulting ranking function.]
• The softmax originally represents the probability of selecting a
(ranking) action
• In our context, it is the probability of selecting d to be placed at
the top of the ranked list under n3 and θ3 at the t-th iteration
• We then sort the documents by this probability to generate the
ranked document list
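The ranking function described above can be made concrete with a small sketch: each document is scored by θ·φ(d), a softmax turns the scores into selection probabilities, sorting by that probability produces the ranked list, and a REINFORCE-style step then moves θ in the direction of the observed reward. The feature matrix, the learning rate, and the REINFORCE form are illustrative stand-ins; the exact gradient used in the DPL framework is the one defined in the ICTIR 2015 paper.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def rank_documents(theta, doc_features):
    """doc_features: (n_docs, n_features) matrix. Score each document with
    theta, convert scores into selection probabilities, and sort by them."""
    probs = softmax(doc_features @ theta)
    order = np.argsort(-probs)          # descending by selection probability
    return order, probs

def policy_gradient_update(theta, doc_features, chosen, probs, reward, lr=0.1):
    """REINFORCE-style step: grad log pi(chosen) = phi(chosen) - E_pi[phi]."""
    grad_log_pi = doc_features[chosen] - probs @ doc_features
    return theta + lr * reward * grad_log_pi

# Toy usage: 5 candidate documents with 3 features each
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))
theta = np.zeros(3)
order, probs = rank_documents(theta, features)
theta = policy_gradient_update(theta, features, order[0], probs, reward=1.0)
```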
Updates:
Feature function:
Query Features
• Test if a search term w ∈ q_t and w ∈ q_{t−1}
• # of times that a term w occurs in q_1, q_2, …, q_t
Query-Document Features
• Test if a search term w ∈ +∆q_t and w ∈ D_{t−1}
• Test if a document d contains a term w ∈ −∆q_t
• tf.idf score of a document d with respect to q_t
Click Features
• Test if there are SAT-Clicks in D_{t−1}
• # of times a document has been clicked in the current session
• # of seconds a document has been viewed and re-viewed in the current session
Query-Document-Click Features
• Test if q_i leads to SAT-Clicks in D_i, where i = 0 … t−1
Session Features
• position in the current session
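A small sketch of how a few of the listed features could be computed for the current iteration; the exact feature set and any weighting used in the DPL work may differ, and the helper names here are made up for illustration.

```python
import math
from collections import Counter

def query_features(queries):
    """queries: list of token lists q_1 .. q_t (most recent last)."""
    q_t = set(queries[-1])
    q_prev = set(queries[-2]) if len(queries) > 1 else set()
    repeat_counts = Counter(w for q in queries for w in set(q))
    return {
        "terms_kept_from_prev": len(q_t & q_prev),
        "max_term_repetition": max(repeat_counts[w] for w in q_t),
    }

def tfidf_score(doc_tokens, query_tokens, df, n_docs):
    """Simple tf.idf score of a document for the current query q_t."""
    tf = Counter(doc_tokens)
    return sum(tf[w] * math.log((n_docs + 1) / (df.get(w, 0) + 1))
               for w in set(query_tokens))

def click_features(clicked_doc_ids, dwell_seconds, doc_id):
    """Click-based signals for one document in the current session."""
    return {
        "times_clicked": clicked_doc_ids.count(doc_id),
        "seconds_viewed": dwell_seconds.get(doc_id, 0.0),
    }
```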
Efficiency - TREC 2012 Session
• lemur > dpl > qcm > winwin
• dpl achieves a good balance between accuracy and efficiency
• the conclusions are consistent across experiments on the TREC 2012-2014 Session Tracks
DPL
TREC 2012 Session
• dpl achieves a significant improvement over the TREC best run
• We found similar conclusions on the TREC 2013 and 2014 Session Tracks
DPL
TREC DYNAMIC DOMAIN 2015-2017
! The search task focuses on specific
domains
! Over the three years, we explored domains
ranging from the dark web (illicit goods and
Ebola) and polar science to more general
web domains (NYT)
! What is consistent?
○ The participating system is expected to
help the user through interactions & get
their tasks done
○ User’s information need usually consists
of multiple aspects
THE TREC DYNAMIC DOMAIN TASK
FEEDBACK FROM A SIMULATED USER
! https://github.com/trec-dd/trec-dd-jig
DOMAIN USED IN 2017
○ New York Times Annotated Corpus
! Sandhaus, Evan. "The New York Times Annotated Corpus." Linguistic Data
Consortium, Philadelphia 6, no. 12 (2008): e26752.
! Archives of The New York Times spanning 20 years, from January 1, 1987 to June 19, 2007
! Uncompressed size 16 GB
! Over 1.8 million documents
! Over 650,000 article summaries written by library scientists.
! Over 1,500,000 articles manually tagged by library scientists
! Over 275,000 algorithmically-tagged articles that have been hand verified by
professionals
ANNOTATION
○ Create Topic and Relevance Judgement at the same time
! Not by pooling
○ Topic – subtopic – passage – Relevance Judgement
○ The challenge: how to be complete
○ Useful information that the user gains
! Raw relevance score
○ Discounting
! Based on document ranking
! Based on diversity
○ User’s efforts
! Time spent
! Lengths of documents being
viewed
EVALUATION METRICS FOR DYNAMIC SEARCH
○ Most session search metrics fold all those factors into
one overwhelmingly complex formula
○ The optimal value, i.e., the upper bound, of those metrics
varies greatly across search topics
○ In Cranfield-like settings (e.g. TREC), the difference is often
ignored
THE PROBLEM
TOY EXAMPLE
Relevance scores regarding topic-subtopic:

| Doc | Topic 1 (subtopics 1-1, 1-2) | Topic 2 (subtopics 2-1 … 2-5) |
|-----|------------------------------|-------------------------------|
| d1  | 1                            | 4                             |
| d2  | 3                            | 4                             |
| d3  |                              | 4                             |
| d4  |                              | 4                             |
| d5  |                              | 4                             |

| System  | Topic 1 run                    | CT-topic 1 | Topic 2 run           | CT-topic 2 | CT-avg | Normalized CT-avg |
|---------|--------------------------------|------------|-----------------------|------------|--------|-------------------|
| System1 | d1, irrel, irrel, irrel, irrel | 1          | d1, d3, d4, d5, irrel | 16         | 8.5    | 0.596             |
| System2 | d2, irrel, irrel, irrel, irrel | 3          | d1, d3, d4, d5, irrel | 14         | 8.5    | 0.787             |
| Optimal | d1, d2, irrel, irrel, irrel    | 4          | d1, d2, d3, d4, d5    | 17         |        |                   |
○ What is the optimal metric value that a system can
achieve?
! How to get the upper bound for each search topic?
! How does it affect the evaluation conclusions?
○ Variance of different topics
○ Normalization
RESEARCH QUESTIONS
$$\mathrm{Score}' = \sum_{topic} \frac{\mathrm{raw\_score}(topic, s) - \mathrm{lower\_bound}(topic)}{\mathrm{upper\_bound}(topic) - \mathrm{lower\_bound}(topic)}$$
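A sketch of the per-topic min-max normalization expressed by the formula above; topic_bounds is an assumed mapping from each topic to the metric's (lower, upper) bounds for that topic.

```python
def normalized_score(raw_scores, topic_bounds):
    """raw_scores: {topic: raw metric value of the evaluated system}.
    topic_bounds: {topic: (lower_bound, upper_bound) of the metric}."""
    total = 0.0
    for topic, raw in raw_scores.items():
        lo, hi = topic_bounds[topic]
        total += (raw - lo) / (hi - lo) if hi > lo else 0.0
    # Sum over topics, as in the formula; divide by the number of topics
    # if an average per topic is preferred.
    return total
```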
○ Session-DCG (sDCG)
! Järvelin et al. "Discounted cumulated gain based evaluation of multiple-query IR sessions." Advances in Information Retrieval (2008): 4-15.
○ Cube Test (CT)
! Luo et al. "The water filling model and the cube test: multi-dimensional evaluation for professional search." CIKM 2013.
○ Expected Utility (EU)
! Yang and Lad. "Modeling expected utility of multi-session information distillation." ICTIR 2009.
DYNAMIC SEARCH METRICS
$$\mathrm{EU} = \sum_{\pi} P(\pi) \Big( \sum_{(i,j)\in\pi} \sum_{c \in C_{i,j}} \theta_c \, \nu(c,i,j-1) - \lambda \, \mathrm{Cost}(i,j) \Big)$$

$$\mathrm{CT} = \frac{\sum_{i=1}^{m} \sum_{j=1}^{|D_i|} \sum_{c} \theta_c \, rel_c(i,j) \, \nu(c,i,j-1)}{\sum_{i=1}^{m} \sum_{j=1}^{|D_i|} \mathrm{Cost}(i,j)}$$

$$\mathrm{sDCG} = \sum_{i=1}^{m} \sum_{j=1}^{|D_i|} \frac{rel(i,j)}{(1+\log_{b} j)\,(1+\log_{bq} i)}$$

where $i$ indexes the queries in a session, $j$ the rank positions, $c$ the subtopics, $\theta_c$ the subtopic weight, $rel$ the relevance gain, $\nu$ the novelty discount, $\lambda$ the cost penalty, and $\mathrm{Cost}(i,j)$ the user's effort.
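As a concrete reading of the sDCG double sum above, here is a direct implementation; session is a list of per-query gain lists, and the log bases bq (query discount) and b (rank discount) use commonly cited default values, which is an assumption here.

```python
import math

def sdcg(session, bq=4, b=2):
    """session[i] holds the graded relevance of the documents returned for the
    (i+1)-th query of the session, in ranked order."""
    total = 0.0
    for i, gains in enumerate(session, start=1):          # query position in session
        q_discount = 1 + math.log(i, bq)
        for j, rel in enumerate(gains, start=1):          # rank within the query
            total += rel / ((1 + math.log(j, b)) * q_discount)
    return total

# Toy session with two queries
print(sdcg([[3, 1, 0], [2, 2, 0, 1]]))
```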
○ sDCG
○ Cube Test
○ Expected Utility
DECONSTRUCT THE METRICS
Each metric combines the same four components: a Gain term, a Cost term, a Rank discount, and a Novelty discount.
○ sDCG: the gain rel(i,j) is discounted by rank within a query and by query position within the session.
○ Cube Test: the subtopic-weighted, novelty-discounted gain is divided by the cumulated Cost(i,j).
○ Expected Utility: the expected novelty-discounted gain is traded off against Cost(i,j) through the penalty λ.
BOUNDS ON DIFFERENT TOPICS
○ sDCG = discounted gain
○ CT = discounted gain / cost
○ EU = discounted gain − discounted cost
! The optimal value a metric can produce differs greatly from topic to topic, and this difference should not be ignored.
○ Rearrangement Inequality
○ In IR, the Probability Ranking Principle [4]
! The overall effectiveness of an IR system is maximized by ranking documents in descending order of their usefulness
OUR SOLUTION
$$x_1 y_n + x_2 y_{n-1} + \dots + x_n y_1 \;\le\; x_{\sigma(1)} y_1 + x_{\sigma(2)} y_2 + \dots + x_{\sigma(n)} y_n \;\le\; x_1 y_1 + x_2 y_2 + \dots + x_n y_n$$
for $x_1 \le x_2 \le \dots \le x_n$, $y_1 \le y_2 \le \dots \le y_n$, and any permutation $\sigma$.
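Following the rearrangement inequality, the per-topic upper bound of a discounted-gain metric is obtained by pairing the largest gains with the smallest discounts. The sketch below does this for an sDCG-style discount; the session layout (number of queries and results per query) is an assumption the evaluator has to fix.

```python
import math

def sdcg_upper_bound(all_grades, n_queries, results_per_query, bq=4, b=2):
    """Pair the largest relevance grades with the smallest discounts over a
    fixed session layout of n_queries x results_per_query result slots."""
    discounts = sorted(
        (1 + math.log(j, b)) * (1 + math.log(i, bq))
        for i in range(1, n_queries + 1)
        for j in range(1, results_per_query + 1)
    )
    gains = sorted(all_grades, reverse=True)
    return sum(g / d for g, d in zip(gains, discounts))
```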
NORMALIZATION EFFECT
○ sDCG = discounted gain
○ CT = discounted gain / cost
○ EU = discounted gain − λ · discounted cost, with λ = 0.01
! Using the bounds for normalization brings more fairness to the evaluation
Conclusion
• Our main contributions:
• Put the user into the models
• Created a bridge between information seeking / user behavior studies and machine learning
• Yielded a family of new generative retrieval models for complex, dynamic settings
• Able to explain the results
A Few Thoughts
• Information seeking is a Markov Decision Process, not a series of independent searches
• User actions that cost more effort, such as query changes, are stronger signals than clicks
• Search is also a learning process for the user, who evolves as well
• Users and search engines form a partnership to explore the information space
• They influence each other; it is a two-way communication
• Overly complex evaluation metrics might not be appropriate; the complexity should be modelled either in the model or in the metric, but not in both
Look into the future
• Dynamic IR models are good for modeling information seeking
• There is a lot of room to study the user-search engine interaction in a generative way
• The thinking presented here could generate new methods not only in retrieval and evaluation, but also in related fields
• Exciting!!
Thank You!
• Email: huiyang@cs.georgetown.edu
• Group Page: InfoSense at http://infosense.cs.georgetown.edu/
• Dynamic IR Website: http://www.dynamic-ir-modeling.org/
• Book: Dynamic Information Retrieval Modeling
• TREC Dynamic Domain Track: http://trec-dd.org/