ACM SAC 2018, Pau, France
LiMe: Linear Methods for Pseudo-Relevance Feedback
Daniel Valcarce Javier Parapar Álvaro Barreiro
@dvalcarce @jparapar @AlvaroBarreiroG
Information Retrieval Lab
University of A Coruña
Spain
Outline
1. Pseudo-Relevance Feedback
2. Our proposal: LiMe
3. Experiments
4. Conclusions and Future Directions
Pseudo-Relevance Feedback
Pseudo-Relevance Feedback (I)
Pseudo-Relevance Feedback provides an automatic method for query expansion:
First retrieval with the original query
◦ Top retrieved documents are assumed to be relevant (pseudo-relevant set)
Expand the query with terms from the pseudo-relevant set
Second retrieval with the expanded query
◦ The expanded query usually performs better than the original one (see the sketch below)
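As a concrete illustration, the loop above fits in a few lines of Python. This is a minimal sketch, not the authors' implementation; `search` and `expand` are hypothetical placeholders for an arbitrary retrieval system and an arbitrary expansion method (LiMe being one choice for the latter):

```python
# Minimal pseudo-relevance feedback loop (sketch).
# search() and expand() are hypothetical placeholders.

def prf_retrieval(search, expand, query, k=10, depth=1000):
    # 1) First retrieval with the original query.
    first_ranking = search(query, depth=depth)

    # 2) Assume the top-k documents are relevant (pseudo-relevant set).
    pseudo_relevant = first_ranking[:k]

    # 3) Expand the query with terms from the pseudo-relevant set.
    expanded_query = expand(query, pseudo_relevant)

    # 4) Second retrieval with the expanded query.
    return search(expanded_query, depth=depth)
```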
Pseudo-Relevance Feedback (II)
[Diagram: an information need is expressed as a query and sent to the Retrieval System; the top-ranked documents feed a Query Expansion step, and the expanded query is submitted to the Retrieval System again]
Our proposal: LiMe
LiMe
LiMe, as a PRF technique:
models the PRF task as a matrix decomposition problem
employs linear methods to provide a solution
is able to learn inter-term similarities
jointly models the query and the pseudo-relevant set
admits different feature schemes to represent documents and queries
is agnostic to the retrieval model
Notation
Some notation:
A user issues a query Q
The collection C is composed of documents D
V denotes the vocabulary and is formed of terms t
We denote the pseudo-relevant set by F
And the extended pseudo-relevant set by F̄ = {Q} ∪ F
◦ Its cardinality is m = |F̄| = |F| + 1
◦ And its vocabulary V_F̄ has size n = |V_F̄| ≤ |V|
LiMe: Matrix Formulation
Let X ∈ R^{m×n} be the extended pseudo-relevant set matrix, whose rows represent Q, D1, …, Dm−1. We aim to find an inter-term similarity matrix W ∈ R^{n×n}_+ such that:

    X ≈ X × W    s.t. diag(W) = 0, W ≥ 0
LiMe: Feature Schemes (I)
How do we fill matrix X (rows Q, D1, …, Dm−1)?

    x_ij = s(t_j, Q)         if i = 1 and f(t_j, Q) > 0,
           s(t_j, D_{i−1})   if i > 1 and f(t_j, D_{i−1}) > 0,
           0                 otherwise

s(t, D): weighting function of term t in D (or Q)
f(t, D): #occurrences of term t in D (or Q)
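A minimal sketch of how X could be assembled from bags of words, assuming dictionaries mapping terms to frequencies and a pluggable weighting function s (all names are illustrative, not taken from the paper):

```python
import numpy as np

def build_X(query_bow, doc_bows, s):
    """Assemble the extended pseudo-relevant set matrix X (sketch).

    query_bow : dict term -> frequency for the query Q
    doc_bows  : list of dicts, one per pseudo-relevant document D1..Dm-1
    s         : weighting function s(term, bow) -> float (e.g. TF or TF-IDF)
    """
    rows = [query_bow] + doc_bows                       # row 0 is the query
    vocab = sorted({t for bow in rows for t in bow})    # feedback vocabulary V_F̄
    index = {t: j for j, t in enumerate(vocab)}

    X = np.zeros((len(rows), len(vocab)))
    for i, bow in enumerate(rows):
        for term, freq in bow.items():
            if freq > 0:                                # x_ij is non-zero only for occurring terms
                X[i, index[term]] = s(term, bow)
    return X, vocab
```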
LiMe: Feature Schemes (II)
We tested two well-known Information Retrieval weighting functions:

TF:      s_tf(w, D) = 1 + log₂ f(w, D)
TF-IDF:  s_tf-idf(w, D) = (1 + log₂ f(w, D)) × log₂(|C| / df(w))
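The two schemes as Python, assuming `bow` is a term-to-frequency dict for a document (or the query) and that collection statistics df and |C| are available; a sketch of the formulas above, not the original code:

```python
import math

def s_tf(term, bow):
    # TF: 1 + log2 of the within-document frequency.
    # Assumes the term occurs (f > 0), matching the piecewise definition of x_ij.
    return 1.0 + math.log2(bow[term])

def make_s_tfidf(df, num_docs):
    # TF-IDF: the TF weight multiplied by log2(|C| / df(term)).
    def s_tfidf(term, bow):
        return (1.0 + math.log2(bow[term])) * math.log2(num_docs / df[term])
    return s_tfidf
```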
LiMe: Optimization Problem

    W* = argmin_W  (1/2) ‖X − XW‖²_F + β₁ ‖W‖_{1,1} + (β₂/2) ‖W‖²_F
         s.t. diag(W) = 0, W ≥ 0                                        (1)

This is a bound-constrained least squares optimization problem with an elastic net (ℓ1 and ℓ2 regularization) penalty, which decomposes into one independent problem per column:

    w*_{·j} = argmin_{w_{·j}}  (1/2) ‖x_{·j} − X w_{·j}‖²₂ + β₁ ‖w_{·j}‖₁ + (β₂/2) ‖w_{·j}‖²₂
              s.t. w_jj = 0, w_{·j} ≥ 0                                 (2)
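Once column j is removed from the predictors, problem (2) is an ordinary non-negative elastic-net regression, so it can be solved with off-the-shelf tools. Below is a sketch using scikit-learn's ElasticNet (a tooling assumption, not the authors' solver); since ElasticNet divides the squared loss by the number of rows m, dividing objective (2) by m gives the mapping alpha = (β1 + β2)/m and l1_ratio = β1/(β1 + β2) without changing the minimizer:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def fit_W(X, beta1, beta2):
    """Solve problem (2) column by column (sketch; assumes beta1 + beta2 > 0)."""
    m, n = X.shape
    W = np.zeros((n, n))
    alpha = (beta1 + beta2) / m
    l1_ratio = beta1 / (beta1 + beta2)
    for j in range(n):
        others = [c for c in range(n) if c != j]       # drop column j, so w_jj = 0
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                           positive=True,              # enforces w >= 0
                           fit_intercept=False)
        model.fit(X[:, others], X[:, j])               # regress x_.j on the other columns
        W[others, j] = model.coef_
    return W
```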
LiMe: Query Expansion
To expand the original query, we reconstruct the first row of X (the query row) using W*:

    x̂_{1·} = x_{1·} × W*                                               (3)

We compute a probabilistic estimate of a term t_j given the feedback model θ_F:

    p(t_j | θ_F) = x̂_{1j} / Σ_{t_v ∈ V_F̄} x̂_{1v}   if t_j ∈ V_F̄,
                   0                                 otherwise          (4)
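A sketch of equations (3) and (4) in NumPy; the optional truncation to the e highest-scoring terms is my assumption about how the expansion-term count e used in the experiments could be applied:

```python
import numpy as np

def feedback_model(X, W, vocab, num_terms=None):
    """Reconstruct the query row and normalise it into p(t | theta_F) (sketch)."""
    x_hat = X[0, :] @ W                      # equation (3): reconstructed query row
    if num_terms is not None:                # keep only the e largest entries (assumption)
        keep = np.argsort(x_hat)[::-1][:num_terms]
        truncated = np.zeros_like(x_hat)
        truncated[keep] = x_hat[keep]
        x_hat = truncated
    p = x_hat / x_hat.sum()                  # equation (4): normalise over V_F̄
    return dict(zip(vocab, p))
```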
LiMe: Second Retrieval
The second retrieval is performed by interpolating the original query model with the feedback model:

    p(t | θ'_Q) = (1 − α) p(t | θ_Q) + α p(t | θ_F)                     (5)

The hyperparameter α controls the interpolation
This is a standard procedure in state-of-the-art PRF techniques
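Equation (5) as a small helper over term distributions (a sketch; `p_query` and `p_feedback` are assumed to be dicts from terms to probabilities):

```python
def interpolate(p_query, p_feedback, alpha):
    """Mix the original query model with the feedback model (equation 5)."""
    terms = set(p_query) | set(p_feedback)
    return {t: (1 - alpha) * p_query.get(t, 0.0) + alpha * p_feedback.get(t, 0.0)
            for t in terms}
```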
Experiments
State-of-the-art Baselines
Retrieval model:
◦ LM: Language Models (µ = 1000) [Ponte & Croft, SIGIR ’98]
Based on language modelling:
◦ RM3: Relevance-Based Language Models [Lavrenko & Croft, SIGIR ’01]
◦ MEDMM: Maximum-Entropy Divergence Minimisation Models [Lv & Zhai, CIKM ’09]
Based on matrix factorization:
◦ RFMF: Relevance Feedback Matrix Factorisation [Zamani et al., CIKM ’16]
Test Collections

Collection     #docs    Avg doc length   Training topics   Test topics
AP88-89          165k            284.7           51-100       101-150
TREC-678         528k            297.1          301-350       351-400
Robust04         528k             28.3          301-450       601-700
WT10G          1,692k            399.3          451-500       501-550
GOV2          25,205k            647.9          701-750       751-800
Evaluation Metrics
We produce a ranking of 1000 documents per query:
MAP: Mean Average Precision
nDCG: Normalised Discounted Cumulative Gain
RI: Robustness Index = (#topics improved − #topics degraded) / #topics (a small sketch follows below)
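A sketch of the Robustness Index computed from per-topic scores of the baseline run and the feedback run (the dict names are illustrative):

```python
def robustness_index(baseline_ap, feedback_ap):
    """RI = (#topics improved - #topics degraded) / #topics (sketch).

    baseline_ap, feedback_ap: dicts mapping topic id -> average precision.
    """
    improved = sum(feedback_ap[t] > baseline_ap[t] for t in baseline_ap)
    degraded = sum(feedback_ap[t] < baseline_ap[t] for t in baseline_ap)
    return (improved - degraded) / len(baseline_ap)
```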
Results

           Metric   LM       RFMF     MEDMM    RM3      LiMe-TF   LiMe-TF-IDF
AP         MAP      0.2349   0.2774   0.3010   0.3002   0.3062    0.3149
           nDCG     0.5637   0.5749   0.5955   0.6005   0.6003    0.6085
           RI       −        0.42     0.42     0.50     0.38      0.52
TREC       MAP      0.1931   0.2072   0.2327   0.2235   0.2267    0.2357
           nDCG     0.4518   0.4746   0.5115   0.4987   0.5051    0.5198
           RI       −        0.23     0.26     0.40     0.48      0.46
Robust     MAP      0.2914   0.3130   0.3447   0.3488   0.3388    0.3517
           nDCG     0.5830   0.5884   0.6227   0.6251   0.6223    0.6294
           RI       −        0.07     0.32     0.37     0.23      0.37
WT10G      MAP      0.2194   0.2389   0.2472   0.2470   0.2484    0.2476
           nDCG     0.5212   0.5262   0.5324   0.5352   0.5416    0.5398
           RI       −        0.30     0.36     0.20     0.32      0.30
GOV2       MAP      0.3310   0.3580   0.3790   0.3755   0.3776    0.3830
           nDCG     0.6325   0.6453   0.6653   0.6618   0.6656    0.6698
           RI       −        0.42     0.66     0.60     0.68      0.62
Sensitivity of the ℓ1 regularization (β1)
[Plot: MAP as a function of β1 (10⁻⁵ to 10³, log scale) on AP88-89, WT2G, TREC678 and WT10G]
Sensitivity of the ℓ2 regularization (β2)
[Plot: MAP as a function of β2 (0 to 500) on AP88-89, TREC-678, Robust04, WT10G and GOV2]
Sensitivity of the number of pseudo-relevant documents (k)
[Plot: MAP as a function of k (5 to 100) on AP88-89, TREC678, Robust04, WT10G and GOV2]
Sensitivity of the number of terms (e)
[Plot: MAP as a function of e (5 to 100) on AP88-89, TREC678, Robust04, WT10G and GOV2]
Sensitivity of the query interpolation (α)
[Plot: MAP as a function of α (0.0 to 1.0) on AP88-89, TREC678, Robust04, WT10G and GOV2]
Conclusions and Future Directions
Conclusions
LiMe:
is a PRF technique that shows state-of-the-art performance
can be plugged on top of any retrieval model
accepts different feature schemes
models inter-term similarities
Future work
Alternative feature schemes based on:
retrieval features
query logs
Explore the connection with Translation Models, which also rely on inter-term similarities:
learnt from training data [Berger & Lafferty, SIGIR ’99]
based on mutual information [Karimzadehgan & Zhai,
SIGIR ’10]
Thank you!
@dvalcarce
http://www.dc.fi.udc.es/~dvalcarce