Query Modeling Using Non-relevance Information - TREC 2008 Relevance Feedback Track Talk - Edgar Meij (Univ. of Amsterdam) - Presentation Transcript
The University of Amsterdam at the
TREC 2008 Relevance Feedback Track
Query Modeling Using Non-relevance Information
Edgar Meij, W. Weerkamp, J. He, and M. de Rijke
ISLA
University of Amsterdam
http://ilps.science.uva.nl
TREC 2008
Introduction Model Experiments Conclusion
Outline
Introduction
Model
Experiments
Conclusion
Motivation
• Pseudo-relevance feedback approaches generally assume
a term’s non-relevance status is implicitly indicated by its
absence
• How should we interpret explicit non-relevance information
in a generative language modeling setting?
Retrieval Model
• Documents are ranked according to the KL-divergence
between a query model and each document model
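$$
\begin{aligned}
\mathrm{Score}(D,Q) &= -\sum_{t \in V} P(t|\theta_Q) \log \frac{P(t|\theta_Q)}{P(t|\theta_D)} \\
&\stackrel{\mathrm{rank}}{=} \sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_D)
\end{aligned}
$$
The second form drops the term in P(t|θQ) log P(t|θQ), which does not depend on D.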
• Document models are smoothed using a reference corpus
• We use Jelinek-Mercer smoothing
P(t|θD ) = (1 − λD )P(t|D) + λD P(t)
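As a rough illustration of this slide, the following minimal sketch smooths document models with Jelinek-Mercer and ranks documents by the rank-equivalent score, where higher is better. The function names, the toy documents, and the value λD = 0.2 are assumptions made for the example; they are not taken from the talk.

```python
import math
from collections import Counter


def jm_document_model(doc_tokens, collection_probs, lam_d):
    """Jelinek-Mercer smoothing: P(t|θD) = (1 - λD) P(t|D) + λD P(t)."""
    counts = Counter(doc_tokens)
    length = len(doc_tokens)
    return {
        term: (1 - lam_d) * (counts[term] / length) + lam_d * p_t
        for term, p_t in collection_probs.items()
    }


def score(query_model, doc_model):
    """Rank-equivalent score: sum over query terms of P(t|θQ) log P(t|θD)."""
    return sum(
        p_q * math.log(doc_model[term])
        for term, p_q in query_model.items()
        if p_q > 0
    )


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "flood dam water flood".split(),
    "d2": "club time water".split(),
}
all_tokens = [t for tokens in docs.values() for t in tokens]
collection_probs = {t: c / len(all_tokens) for t, c in Counter(all_tokens).items()}

query_model = {"flood": 0.5, "dam": 0.5}
doc_models = {
    name: jm_document_model(tokens, collection_probs, lam_d=0.2)
    for name, tokens in docs.items()
}
ranking = sorted(
    docs, key=lambda name: score(query_model, doc_models[name]), reverse=True
)
print(ranking)
```

Because every query term occurs in the collection and λD > 0, each smoothed probability is strictly positive, so the logarithm is always defined in this sketch.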
Query Modeling
• Assumption: the better the query model reflects the
information need, the better the results
• Baseline: Each query term is equally important and
receives an equal probability mass (set A)
$$P(t|\theta_Q) = P(t|Q) = \frac{c(t,Q)}{|Q|}$$
• Cast pseudo-relevance feedback as query model updating
$$P(t|\theta_Q) = (1 - \lambda_Q) P(t|Q) + \lambda_Q P(t|\hat{\theta}_Q)$$
• Smooth the initial query by adding and (re)weighting terms
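The two query models on this slide can be sketched as follows: a maximum-likelihood baseline over the query terms, and an interpolation of that baseline with an expanded query model. The function names, the expanded model, and λQ = 0.5 are hypothetical values chosen for illustration.

```python
from collections import Counter


def mle_query_model(query_tokens):
    """Baseline: P(t|Q) = c(t, Q) / |Q|."""
    counts = Counter(query_tokens)
    total = len(query_tokens)
    return {term: count / total for term, count in counts.items()}


def interpolate(original, expanded, lam_q):
    """Updated query model: (1 - λQ) P(t|Q) + λQ P(t|expanded model)."""
    terms = set(original) | set(expanded)
    return {
        term: (1 - lam_q) * original.get(term, 0.0) + lam_q * expanded.get(term, 0.0)
        for term in terms
    }


baseline = mle_query_model("johnstown flood".split())
expanded = {"flood": 0.4, "dam": 0.6}  # hypothetical expanded query model
updated = interpolate(baseline, expanded, lam_q=0.5)
print(sorted(updated.items()))
```

With λQ = 0 the updated model reduces to the baseline; with λQ = 1 the original query terms are kept only if the expanded model also contains them.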
(Non) Relevant Models
• Relevant model estimated using interpolated MLE on the
set of relevant documents:
$$
\begin{aligned}
P(t|\theta_R) &= \delta_1 P(t) + (1 - \delta_1) P(t|R) \\
&= \delta_1 P(t) + (1 - \delta_1) \frac{\sum_{D \in R} P(t|D)}{|R|}
\end{aligned}
$$
• Non-relevant model likewise:
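$$
\begin{aligned}
P(t|\theta_{\neg R}) &= \delta_2 P(t) + (1 - \delta_2) P(t|\neg R) \\
&= \delta_2 P(t) + (1 - \delta_2) \frac{\sum_{D \in \neg R} P(t|D)}{|\neg R|}
\end{aligned}
$$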
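Both models share the same form, so one function can estimate either of them. The sketch below averages per-document maximum-likelihood estimates over a set of judged documents and interpolates with the collection model. The names, the toy documents, and the δ values are assumptions for illustration.

```python
from collections import Counter


def doc_mle(tokens):
    """Maximum-likelihood estimate P(t|D) for a single document."""
    counts = Counter(tokens)
    length = len(tokens)
    return {term: count / length for term, count in counts.items()}


def feedback_model(docs, collection_probs, delta):
    """delta * P(t) + (1 - delta) * (1/|docs|) * sum over D of P(t|D)."""
    mles = [doc_mle(tokens) for tokens in docs]
    return {
        term: delta * p_t
        + (1 - delta) * sum(m.get(term, 0.0) for m in mles) / len(mles)
        for term, p_t in collection_probs.items()
    }


# Hypothetical judged documents.
relevant = [["flood", "dam", "flood"], ["flood", "water"]]
non_relevant = [["club", "time"], ["water", "club"]]

all_tokens = [t for doc in relevant + non_relevant for t in doc]
collection_probs = {t: c / len(all_tokens) for t, c in Counter(all_tokens).items()}

theta_r = feedback_model(relevant, collection_probs, delta=0.1)       # δ1
theta_nr = feedback_model(non_relevant, collection_probs, delta=0.1)  # δ2
print(round(theta_r["flood"], 4), round(theta_nr["flood"], 4))
```

Averaging per-document estimates gives every judged document equal weight regardless of its length, which matches the sum over D divided by the set size in the equations above.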
Our Model
In order to arrive at an expanded query model $\hat{\theta}_Q$, we sample
terms proportional to the following:
• Each term is sampled according to the probability of
observing that term in each relevant document
• For each relevant document, adjust the probability mass of
each term by
  • the probability of occurring given the relevant model
  • normalized by its probability given the non-relevant model
Normalized Log-Likelihood Ratio
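$$
\begin{aligned}
\mathrm{NLLR}(D|R) &= H(\theta_D, \theta_{\neg R}) - H(\theta_D, \theta_R) \\
&= \sum_{t \in V} P(t|\theta_D) \log \frac{P(t|\theta_R)}{P(t|\theta_{\neg R})} \\
&= \sum_{t \in V} P(t|\theta_D) \log \frac{(1 - \delta_1) P(t|R) + \delta_1 P(t)}{(1 - \delta_2) P(t|\neg R) + \delta_2 P(t)}
\end{aligned}
$$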
• Measures how much better the relevant model can encode
events from the document model than the non-relevant
model
• A term with a high probability under θR is rewarded; a term
with a high probability under θ¬R is penalized
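A small numeric sketch may help here. It computes the NLLR as a difference of two cross-entropies and as a single sum of log-ratios, and checks that both agree. It assumes the usual definition H(p, q) = −Σ p(t) log q(t); the toy models are hypothetical.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = -sum_t p(t) log q(t)."""
    return -sum(p_t * math.log(q[term]) for term, p_t in p.items() if p_t > 0)


def nllr(doc_model, rel_model, nonrel_model):
    """sum_t P(t|θD) log( P(t|θR) / P(t|θ¬R) )."""
    return sum(
        p_d * math.log(rel_model[term] / nonrel_model[term])
        for term, p_d in doc_model.items()
        if p_d > 0
    )


# Hypothetical smoothed models over a two-term vocabulary.
theta_r = {"flood": 0.8, "club": 0.2}
theta_nr = {"flood": 0.3, "club": 0.7}
doc_a = {"flood": 0.9, "club": 0.1}  # resembles the relevant model
doc_b = {"flood": 0.1, "club": 0.9}  # resembles the non-relevant model

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    as_difference = cross_entropy(doc, theta_nr) - cross_entropy(doc, theta_r)
    print(name, round(nllr(doc, theta_r, theta_nr), 4), round(as_difference, 4))
```

The score is positive for the document that the relevant model encodes better and negative for the other one, so the NLLR is not guaranteed to be non-negative.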
Query Model
• Expanded query part
$$P(t|\hat{\theta}_Q) \propto \sum_{D \in R} P(t|\theta_D) \, P(\theta_D|\theta_R)$$
where
$$P(\theta_D|\theta_R) = \frac{\mathrm{NLLR}(D|R)}{\sum_{D'} \mathrm{NLLR}(D'|R)}$$
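The expanded query part can be sketched as a weighted mixture of the relevant documents' models. This sketch rests on two assumptions that the slide does not state: the normalizing sum runs over the relevant documents, and every NLLR score is positive so that the weights form a distribution. The names and numbers are hypothetical.

```python
def expanded_query_model(doc_models, nllr_scores):
    """P(t|expanded) proportional to sum over relevant D of P(t|θD) * P(θD|θR)."""
    if any(score <= 0 for score in nllr_scores.values()):
        # Assumption of this sketch: only positive NLLR scores are handled.
        raise ValueError("this sketch expects strictly positive NLLR scores")
    total = sum(nllr_scores.values())
    weights = {doc: score / total for doc, score in nllr_scores.items()}

    raw = {}
    for doc, model in doc_models.items():
        for term, p in model.items():
            raw[term] = raw.get(term, 0.0) + p * weights[doc]

    norm = sum(raw.values())
    return {term: value / norm for term, value in raw.items()}


# Hypothetical document models and NLLR scores for two relevant documents.
doc_models = {
    "d1": {"flood": 0.6, "dam": 0.4},
    "d2": {"flood": 0.5, "club": 0.5},
}
nllr_scores = {"d1": 0.75, "d2": 0.25}

theta_hat_q = expanded_query_model(doc_models, nllr_scores)
print(sorted(theta_hat_q.items(), key=lambda item: item[1], reverse=True))
```

Documents that the relevant model explains much better than the non-relevant model contribute more of their term mass to the expanded query.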
Experimental Setup
• Preprocessing
  • Porter stemming
  • Stopwords removed
• Training
  • Optimize MAP on held-out set (odd-numbered topics)
  • Sweep over free parameters
    • λD, λQ
    • δ1 for P(t|θR)
    • δ2 for P(t|θ¬R)
• Submitted runs
  • Used 10 terms with the highest P(t|θQ)
  • met6: non-relevant model estimated from the non-relevant documents
  • met9: substitutes the collection model for the non-relevant model
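Two steps of this setup lend themselves to a short sketch: keeping only the highest-probability terms of a query model, and sweeping a parameter grid to maximize an evaluation score. Renormalizing after truncation, the grid values, and the `toy_map` objective are assumptions for illustration; in a real run the objective would be MAP measured on the training topics.

```python
import itertools


def top_k_terms(query_model, k):
    """Keep the k most probable terms and renormalize (renormalizing is assumed)."""
    top = sorted(query_model.items(), key=lambda item: item[1], reverse=True)[:k]
    norm = sum(p for _, p in top)
    return {term: p / norm for term, p in top}


def grid_sweep(evaluate, grid):
    """Try every parameter combination and return the best one with its score."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = evaluate(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_map(lam_d, lam_q):
    """Hypothetical stand-in for running retrieval and measuring MAP."""
    return -((lam_d - 0.3) ** 2 + (lam_q - 0.5) ** 2)


truncated = top_k_terms({"flood": 0.5, "dam": 0.3, "club": 0.2}, k=2)
best_params, best_score = grid_sweep(
    toy_map, {"lam_d": [0.1, 0.3, 0.5], "lam_q": [0.25, 0.5, 0.75]}
)
print(truncated, best_params)
```

An exhaustive sweep grows multiplicatively with each added parameter, so four parameters (λD, λQ, δ1, δ2) with ten values each already require 10,000 evaluations.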
statMAP
Run   A       B       C       D       E
met6  0.2289  0.2595  0.2750  0.2758  0.2822
met9  0.2289  0.2608  0.2787  0.2777  0.2810
Significance markers in the original slide denote a statistically
significant difference with the previous set at the 0.01 level,
tested using a Wilcoxon test.
31 TREC Terabyte topics
Run   Set  MAP     P5      P10
      A    0.1364  0.2516  0.2452
met6  B    0.1726  0.3161  0.3194
met6  C    0.1682  0.3032  0.2968
met6  D    0.1746  0.3097  0.3065
met6  E    0.1910  0.3935  0.3645
met9  B    0.1769  0.3161  0.3194
met9  C    0.1699  0.3161  0.3032
met9  D    0.1738  0.4000  0.3710
met9  E    0.1959  0.2903  0.2871
Significance markers in the original slide denote a statistically
significant difference with the baseline (set A) at the 0.05 and
0.01 level, respectively.
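For readers less familiar with the measures in this table, the sketch below computes precision at rank k and average precision for one ranked list, using their standard definitions; MAP is the mean of average precision over topics. The ranking and the relevance judgments are hypothetical.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant document."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranked list and judgments for a single topic.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}

p5 = precision_at_k(ranking, relevant, k=5)
ap = average_precision(ranking, relevant)
print(p5, round(ap, 4))
```

Relevant documents that are never retrieved, such as d5 here, still count in the denominator of average precision, which is why the measure rewards recall as well as early precision.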
31 TREC Terabyte topics, set E
<num>814</num>
<title>Johnstown flood</title>
<desc>Provide information about the Johnstown Flood in Johnstown, Pennsylvania
</desc>
Terms shown in the bar chart (axis from 0 to 0.500): flood, johnstown,
dam, club, water, noaa, gov, sir, www, time

          AP      P10
baseline  0.3366  0.3000
met6      0.7853  1.0000
31 TREC Terabyte topics, set E
<num>808</num>
<title>North Korean Counterfeiting</title>
<desc>What information is available on the involvement of the North Korean Government
in counterfeiting of US currency</desc>
Terms shown in the bar chart (axis from 0 to 0.500): north, korean,
counterfeit, korea, state, drug, weapon, countri, nuclear, traffick

          AP      P10
baseline  0.2497  0.6000
met6      0.0096  0.0000
31 TREC Terabyte topics, set E
[Figure: MAP (vertical axis, 0.1900 to 0.2040) as a function of the
number of terms (horizontal axis, 5 to 105)]
Conclusion and Future Work
• Conclusion
  • Modeled (non)relevant documents as separate models and
    created a query model by sampling proportional to the
    NLLR of these models
  • Results improve over baseline
  • Non-relevance information does not help significantly
• Future work
  • Further analysis
  • Compare with other, established RF methods
  • Set/Estimate λQ based on relevance information
    • amount
    • confidence
Questions?
Edgar.Meij@uva.nl
http://www.science.uva.nl/~emeij