Query Modeling Using Non-relevance Information - TREC 2008 Relevance Feedback Track Talk - Edgar Meij (Univ. of Amsterdam)

    Presentation Transcript

    1. The University of Amsterdam at the TREC 2008 Relevance Feedback Track: Query Modeling Using Non-relevance Information
       Edgar Meij, W. Weerkamp, J. He, and M. de Rijke
       ISLA, University of Amsterdam, http://ilps.science.uva.nl
       TREC 2008
    2. Outline
       • Introduction
       • Model
       • Experiments
       • Conclusion
    3. Motivation
       • Pseudo-relevance feedback approaches generally assume that a term’s non-relevance status is implicitly indicated by its absence
       • How should we interpret explicit non-relevance information in a generative language modeling setting?
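    4. Retrieval Model
       • Documents are ranked according to the negative KL-divergence between a query model and each document model:
         $$\mathrm{Score}(D,Q) = -\sum_{t \in V} P(t|\theta_Q) \log \frac{P(t|\theta_Q)}{P(t|\theta_D)} \stackrel{rank}{=} \sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_D)$$
         The dropped term, $\sum_{t \in V} P(t|\theta_Q) \log P(t|\theta_Q)$, depends only on the query and does not affect the ranking.
       • Document models are smoothed using a reference corpus
       • We use Jelinek-Mercer smoothing:
         $$P(t|\theta_D) = (1 - \lambda_D) P(t|D) + \lambda_D P(t)$$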
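The ranking function on the Retrieval Model slide can be illustrated with a minimal Python sketch. It scores documents with the rank-equivalent form, the sum over query terms of P(t|θQ) log P(t|θD), which differs from the negative KL-divergence only by a query-dependent constant. The toy documents, the query, the value `lam_d=0.5`, and all function names are assumptions made for illustration; they do not come from the talk.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood term distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def jm_smooth(doc_tokens, collection_model, lam_d):
    """P(t|theta_D) = (1 - lam_d) * P(t|D) + lam_d * P(t), over the collection vocabulary."""
    p_doc = mle(doc_tokens)
    return {t: (1 - lam_d) * p_doc.get(t, 0.0) + lam_d * p_c
            for t, p_c in collection_model.items()}


def score(query_model, doc_model):
    """Sum over query terms of P(t|theta_Q) * log P(t|theta_D).

    Assumes every query term occurs in the collection vocabulary,
    so that the smoothed document probability is strictly positive.
    """
    return sum(p_q * math.log(doc_model[t]) for t, p_q in query_model.items())


# Hypothetical toy collection (assumed to be stemmed and stopped already).
docs = {
    "d1": ["flood", "dam", "water", "flood"],
    "d2": ["club", "time", "water"],
}
collection_model = mle([t for tokens in docs.values() for t in tokens])
query_model = mle(["flood", "dam"])  # baseline: equal mass per query term

doc_models = {d: jm_smooth(tokens, collection_model, lam_d=0.5)
              for d, tokens in docs.items()}
ranking = sorted(docs, key=lambda d: score(query_model, doc_models[d]), reverse=True)
print(ranking)
```

Smoothing with the collection model keeps every probability positive, so the logarithm is defined even for documents that contain none of the query terms.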
    6. Query Modeling
       • Assumption: the better the query model reflects the information need, the better the results
       • Baseline: each query term is equally important and receives an equal probability mass (set A):
         $$P(t|\theta_Q) = P(t|Q) = \frac{c(t,Q)}{|Q|}$$
       • Cast pseudo-relevance feedback as query model updating:
         $$P(t|\theta_Q) = (1 - \lambda_Q) P(t|Q) + \lambda_Q P(t|\hat{\theta}_Q)$$
       • Smooth the initial query by adding and (re)weighting terms
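As a small illustration of the query model update on the Query Modeling slide, the sketch below builds the baseline model P(t|Q) from term counts and interpolates it with an expanded model using λQ. The expanded distribution, the value `lam_q=0.5`, and the function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def initial_query_model(query_tokens):
    """P(t|Q) = c(t, Q) / |Q|."""
    counts = Counter(query_tokens)
    return {t: c / len(query_tokens) for t, c in counts.items()}


def update_query_model(p_q, p_expanded, lam_q):
    """P(t|theta_Q) = (1 - lam_q) * P(t|Q) + lam_q * P(t|theta_Q_hat)."""
    vocab = set(p_q) | set(p_expanded)
    return {t: (1 - lam_q) * p_q.get(t, 0.0) + lam_q * p_expanded.get(t, 0.0)
            for t in vocab}


p_q = initial_query_model(["johnstown", "flood"])
# Hypothetical expanded model; in the talk it is estimated from feedback documents.
p_expanded = {"flood": 0.5, "dam": 0.3, "water": 0.2}
theta_q = update_query_model(p_q, p_expanded, lam_q=0.5)
print(sorted(theta_q.items(), key=lambda kv: kv[1], reverse=True))
```

With `lam_q=0.5`, "flood" ends up at 0.5, "johnstown" at 0.25, "dam" at 0.15 and "water" at 0.1: original query terms keep part of their mass while new terms are added and existing ones are reweighted.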
    9. (Non) Relevant Models
       • Relevant model estimated using interpolated MLE on the set of relevant documents:
         $$P(t|\theta_R) = \delta_1 P(t) + (1 - \delta_1) P(t|R) = \delta_1 P(t) + (1 - \delta_1) \frac{\sum_{D \in R} P(t|D)}{|R|}$$
       • Non-relevant model likewise:
         $$P(t|\theta_{\neg R}) = \delta_2 P(t) + (1 - \delta_2) P(t|\neg R) = \delta_2 P(t) + (1 - \delta_2) \frac{\sum_{D \in \neg R} P(t|D)}{|\neg R|}$$
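The two estimates on the (Non) Relevant Models slide share one form, so a single helper can sketch both. The example below assumes that P(t|D) is the maximum-likelihood distribution of each judged document; the toy distributions, the value 0.2 used for both δ1 and δ2, and the name `feedback_model` are illustrative assumptions.

```python
def feedback_model(doc_mles, collection_model, delta):
    """delta * P(t) + (1 - delta) * (1/|S|) * sum over D in S of P(t|D).

    doc_mles holds one P(t|D) dictionary per judged document in the set S.
    """
    n = len(doc_mles)
    return {t: delta * p_c + (1 - delta) * sum(d.get(t, 0.0) for d in doc_mles) / n
            for t, p_c in collection_model.items()}


# Hypothetical collection model and judged documents.
collection_model = {"flood": 0.4, "dam": 0.2, "club": 0.2, "time": 0.2}
relevant_docs = [{"flood": 0.5, "dam": 0.5}, {"flood": 1.0}]
non_relevant_docs = [{"club": 0.5, "time": 0.5}]

theta_r = feedback_model(relevant_docs, collection_model, delta=0.2)          # delta_1
theta_not_r = feedback_model(non_relevant_docs, collection_model, delta=0.2)  # delta_2
print(theta_r)
print(theta_not_r)
```

Because both models are interpolated with the collection model, every vocabulary term keeps a non-zero probability in each of them, which matters once their ratio is taken.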
    10. Our Model
        In order to arrive at an expanded query model $\hat{\theta}_Q$, we sample terms proportional to the following:
        • Each term is sampled according to the probability of observing that term in each relevant document
        • For each relevant document, adjust the probability mass of each term by
          • the probability of occurring given the relevant model
          • normalized by its probability given the non-relevant model
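    12. Normalized Log-Likelihood Ratio
        $$\mathrm{NLLR}(D|R) = H(\theta_D, \theta_{\neg R}) - H(\theta_D, \theta_R) = \sum_{t \in V} P(t|\theta_D) \log \frac{P(t|\theta_R)}{P(t|\theta_{\neg R})} = \sum_{t \in V} P(t|\theta_D) \log \frac{(1 - \delta_1) P(t|R) + \delta_1 P(t)}{(1 - \delta_2) P(t|\neg R) + \delta_2 P(t)}$$
        where $H(p, q) = -\sum_{t \in V} p(t) \log q(t)$ is the cross-entropy.
        • Measures how much better the relevant model can encode events from the document model than the non-relevant model
        • If a term has a high probability of occurring in $\theta_R$, it is rewarded; if it has a high probability of occurring in $\theta_{\neg R}$, it is penalized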
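The NLLR can be computed directly from three term distributions. The sketch below also checks numerically that the log-ratio sum equals a difference of two cross-entropies that both take the document model as first argument. All distributions are hypothetical toy values, and the function names are assumptions.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = - sum over t of p(t) * log q(t)."""
    return -sum(p_t * math.log(q[t]) for t, p_t in p.items() if p_t > 0)


def nllr(doc_model, rel_model, nonrel_model):
    """Sum over t of P(t|theta_D) * log(P(t|theta_R) / P(t|theta_notR))."""
    return sum(p_t * math.log(rel_model[t] / nonrel_model[t])
               for t, p_t in doc_model.items() if p_t > 0)


# Hypothetical smoothed models; every term has non-zero probability in both.
rel_model = {"flood": 0.68, "dam": 0.24, "club": 0.04, "time": 0.04}
nonrel_model = {"flood": 0.08, "dam": 0.04, "club": 0.44, "time": 0.44}

flood_doc = {"flood": 0.7, "dam": 0.3}
club_doc = {"club": 0.6, "time": 0.4}

print(nllr(flood_doc, rel_model, nonrel_model))  # positive: closer to the relevant model
print(nllr(club_doc, rel_model, nonrel_model))   # negative: closer to the non-relevant model
```

A document whose terms are more probable under the relevant model receives a positive score, one dominated by non-relevant vocabulary receives a negative score, and a document scored against two identical models receives exactly zero.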
    14. Query Model
        • Expanded query part:
          $$P(t|\hat{\theta}_Q) \propto \sum_{D \in R} P(t|\theta_D) P(\theta_D|\theta_R)$$
          where
          $$P(\theta_D|\theta_R) = \frac{\mathrm{NLLR}(D|R)}{\sum_{D'} \mathrm{NLLR}(D'|R)}$$
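The expanded query part can be sketched as a weighted mixture of the relevant documents' models, with weights given by the normalized NLLR scores. The slides do not say how negative NLLR values are handled, so this sketch simply assumes all scores are non-negative with a positive sum and raises an error otherwise; that choice, the toy inputs, and the function name are assumptions, not the authors' procedure.

```python
def expanded_query_model(rel_doc_models, nllr_scores):
    """P(t|theta_Q_hat) proportional to sum over D in R of P(t|theta_D) * P(theta_D|theta_R)."""
    total = sum(nllr_scores)
    if total <= 0 or any(s < 0 for s in nllr_scores):
        # Assumption of this sketch: negative weights are not supported.
        raise ValueError("expected non-negative NLLR scores with a positive sum")
    weights = [s / total for s in nllr_scores]  # P(theta_D|theta_R)

    raw = {}
    for model, w in zip(rel_doc_models, weights):
        for t, p in model.items():
            raw[t] = raw.get(t, 0.0) + w * p

    norm = sum(raw.values())  # turn the proportionality into a distribution
    return {t: v / norm for t, v in raw.items()}


# Hypothetical document models and NLLR scores for two relevant documents.
rel_doc_models = [{"flood": 0.6, "dam": 0.4}, {"flood": 0.5, "club": 0.5}]
nllr_scores = [1.5, 0.5]

theta_q_hat = expanded_query_model(rel_doc_models, nllr_scores)
print(theta_q_hat)
```

In this toy case the first document carries three quarters of the weight, so "dam" (0.3) ends up well above "club" (0.125), while "flood" (0.575) benefits from occurring in both documents.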
    16. Experimental Setup
        • Preprocessing
          • Porter stemming
          • Stopwords removed
        • Training
          • Optimize MAP on held-out set (odd-numbered topics)
          • Sweep over free parameters
            • $\lambda_D$, $\lambda_Q$
            • $\delta_1$ for $P(t|\theta_R)$
            • $\delta_2$ for $P(t|\theta_{\neg R})$
        • Submitted runs
          • Used 10 terms with the highest $P(t|\theta_Q)$
          • met6: non-relevant model estimated from the non-relevant documents
          • met9: substitutes the collection model for the non-relevant model
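Keeping only the highest-probability terms, as the submitted runs do with 10 terms, might look like the sketch below. Renormalizing the retained terms so that they sum to one is an assumption of this example; the slides do not state whether the truncated model was renormalized. The toy model and the function name are hypothetical.

```python
def top_k_terms(model, k=10):
    """Keep the k most probable terms and renormalize them (renormalization is assumed)."""
    top = sorted(model.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {t: p / total for t, p in top}


# Hypothetical query model with four terms; keep the best two.
model = {"flood": 0.4, "dam": 0.3, "club": 0.2, "time": 0.1}
truncated = top_k_terms(model, k=2)
print(truncated)
```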
    17. statMAP
               A       B       C       D       E
        met6   0.2289  0.2595  0.2750  0.2758  0.2822
        met9   0.2289  0.2608  0.2787  0.2777  0.2810
        A marker (not preserved in this transcript) indicates a statistically significant difference with the previous set at the 0.01 level, tested using a Wilcoxon test.
    18. 31 TREC Terabyte topics
                  MAP     P5      P10
        A         0.1364  0.2516  0.2452
        met6 B    0.1726  0.3161  0.3194
        met6 C    0.1682  0.3032  0.2968
        met6 D    0.1746  0.3097  0.3065
        met6 E    0.1910  0.3935  0.3645
        met9 B    0.1769  0.3161  0.3194
        met9 C    0.1699  0.3161  0.3032
        met9 D    0.1738  0.4000  0.3710
        met9 E    0.1959  0.2903  0.2871
        Markers (not preserved in this transcript) indicate a statistically significant difference with the baseline (set A) at the 0.05 and 0.01 levels, respectively.
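For readers unfamiliar with the measures in this table, the sketch below computes precision at k (P5 and P10 in the table) and plain average precision, whose mean over topics is MAP. It is the textbook definition over fully judged rankings, not the sampled statMAP estimate used for the official track results. The ranking and judgments are hypothetical.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranking[:k] if d in relevant) / k


def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each retrieved relevant document,
    divided by the total number of relevant documents."""
    hits = 0
    total = 0.0
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranking, relevant_set) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranking and relevance judgments for one topic.
ranking = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2", "d9"}

print(precision_at_k(ranking, relevant, 5))
print(average_precision(ranking, relevant))
```

Here two of the three relevant documents are retrieved, at ranks 2 and 4, so the average precision is (1/2 + 2/4) / 3 and the precision at 5 is 2/5.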
    19. 31 TREC Terabyte topics, set E
        <num>814</num> <title>Johnstown flood</title> <desc>Provide information about the Johnstown Flood in Johnstown, Pennsylvania</desc>
                   AP      P10
        baseline   0.3366  0.3000
        met6       0.7853  1.0000
        [Figure: bar chart over the terms flood, johnstown, dam, club, water, noaa, gov, sir, www, time; horizontal axis from 0 to 0.500]
    20. 31 TREC Terabyte topics, set E
        <num>808</num> <title>North Korean Counterfeiting</title> <desc>What information is available on the involvement of the North Korean Government in counterfeiting of US currency</desc>
                   AP      P10
        baseline   0.2497  0.6000
        met6       0.0096  0.0000
        [Figure: bar chart over the terms north, korean, counterfeit, korea, state, drug, weapon, countri, nuclear, traffick; horizontal axis from 0 to 0.500]
    21. 31 TREC Terabyte topics, set E
        [Figure: MAP (vertical axis, 0.1875 to 0.1925) as a function of $\delta_2$ (horizontal axis, 0.10 to 1.00)]
    22. 31 TREC Terabyte topics, set E
        $$P(t|\theta_Q) = (1 - \lambda_Q) P(t|Q) + \lambda_Q P(t|\hat{\theta}_Q)$$
        [Figure: MAP (vertical axis, 0.12 to 0.21) as a function of $\lambda_Q$ (horizontal axis, 0.00 to 1.00)]
    23. 31 TREC Terabyte topics, set E
        [Figure: MAP (vertical axis, 0.1900 to 0.2040) as a function of the number of terms (horizontal axis, 5 to 105)]
    24. Conclusion and Future Work
        • Conclusion
          • Modeled relevant and non-relevant documents as separate models and created a query model by sampling proportional to the NLLR of these models
          • Results improve over the baseline
          • Non-relevance information does not help significantly
        • Future work
          • Further analysis
          • Compare with other, established RF methods
          • Set/estimate $\lambda_Q$ based on relevance information
            • amount
            • confidence
    25. Questions?
        Edgar.Meij@uva.nl
        http://www.science.uva.nl/~emeij

    Edgar Meij, 2 years ago