Information Models for Ad Hoc Information Retrieval, SIGIR 2010

  • Information-Based Models for Ad Hoc IR
    Stéphane Clinchant (1,2), Eric Gaussier (2)
    1: Xerox Research Centre Europe (XRCE)
    2: Laboratoire d’Informatique de Grenoble (LIG), Univ. Grenoble 1
    SIGIR’10, 20 July 2010
  • Overview
    Information Models: Normalization, Probability Distribution, RSV
    Heuristic Constraints: Condition 1, Condition 2, Condition 3, Condition 4
    Burstiness: Phenomenon, Property of Prob. Distributions
  • Informative Content
    Use Shannon’s information to weigh words in documents:
      $\mathrm{Inf}(x) = -\log P(x \mid \Theta_C)$ = Informative Content
    [Figure: $-\log P(X)$ plotted against $P(X)$]
    Deviation from an average behavior:
    - Observation by Harter (70): non-specialty words deviate from a Poisson
    - Informative Content is core to Divergence From Randomness models
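To make the informative-content formula concrete, here is a minimal sketch. The function name `informative_content` and the use of the natural logarithm are assumptions made for illustration; the slide does not fix a log base.

```python
import math

def informative_content(p: float) -> float:
    """Shannon information -log P(x) of an event with probability p (natural log assumed)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p)

# Hypothetical probabilities: an event expected under the corpus model vs. a surprising one.
common = informative_content(0.5)
rare = informative_content(0.01)
```

The less probable an observation is under the corpus model, the larger its informative content, which is the sense in which the score measures deviation from average behavior.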
  • Information-based Model
    Main idea:
    1 Discrete term frequencies $x$ are renormalized into continuous values $t(x)$, to account for differing document lengths
    2 For each term $w$, the values $t(x)$ are assumed to follow a distribution $P$ with parameter $\lambda_w$ on the corpus, i.e. $Tf_w \mid \lambda_w \sim P$
    3 Queries and documents are compared with a surprise measure, a mean information:
      $RSV(q, d) = \sum_{w \in q} -x_w^q \log P(Tf_w > t(x_w^d) \mid \lambda_w)$
  • Outline
    1 Model Properties
      Retrieval Heuristics
      Burstiness Phenomenon
    2 Two Power-Law Instances
      log-logistic model
      smoothed power-law model
    3 Experiments
    4 Extension to PRF
  • Notations
    $x_w^d$: frequency of word $w$ in document $d$ ($x_w^q$: in the query)
    $t_w^d$: normalized term frequency
    $Tf_w$: random variable for the frequency of word $w$
    $l_d$: length of document $d$
    $idf_w$: corpus parameter for word $w$
    $\theta$: model parameter
    Most (ad hoc) IR models can be written as:
      $RSV(q, d) = \sum_{w \in q} f(x_w^q)\, h(x_w^d, l_d, idf_w, \theta)$
    ⇒ What do we know about $h$?
  • Condition 1
    Docs with more occurrences of query terms get higher scores than docs with fewer occurrences:
      $\forall (l, idf, \theta),\ \frac{\partial h(x, l, idf, \theta)}{\partial x} > 0$ (h increases with x)
    [Figure: $h(x)$ against $x$; "Good" h: increasing, "Bad" h: decreasing]
  • Condition 2
    The increase in the retrieval score should be smaller for larger term frequencies. Ex: going from 2 to 4 occurrences should matter more than going from 50 to 52.
      $\forall (l, idf, \theta),\ \frac{\partial^2 h(x, l, idf, \theta)}{\partial x^2} < 0$ (h concave)
    [Figure: $h(x)$ against $x$; "Good" h: concave, difference of scores decreases; "Bad" h: convex, difference of scores increases]
  • Condition 3
    Longer documents, when compared to shorter ones with exactly the same number of occurrences of query terms, should be penalized (they are likely to cover additional topics):
      $\forall (x, idf, \theta),\ \frac{\partial h(x, l, idf, \theta)}{\partial l} < 0$ (h decreasing with l)
  • Condition 4: IDF Effect
    It is important to downweight terms occurring in many documents:
      $\forall (x, l, \theta),\ \frac{\partial h(x, l, idf, \theta)}{\partial idf} > 0$ (IDF Effect)
    [Figure: $h(x)$ against $x$ for IDF=10 and IDF=5; IDF Effect: h(x, IDF=10) > h(x, IDF=5)]
  • Heuristic Constraints
    Condition 1: h increases with x
    Condition 2: h is concave
    Condition 3: h decreases with l
    Condition 4: h increases with idf (IDF Effect)
    Additional conditions in the paper
    ⇒ Analytical reformulation of TFC1, TFC2, LNC1 and TDC: Fang et al., A Formal Study of Information Retrieval Heuristics, SIGIR’04
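The four conditions can be checked numerically for any candidate weighting function. The sketch below uses central finite differences on a toy function `h(x, l, idf) = idf * log(1 + x / l)`, which is a hypothetical example chosen for this illustration and is not one of the models in the talk. The evaluation point and step size are also arbitrary assumptions.

```python
import math

def h(x, l, idf):
    # Hypothetical toy weighting function (not from the slides).
    return idf * math.log(1.0 + x / l)

def check_conditions(h, x, l, idf, eps=1e-2):
    """Approximate the partial derivatives of h at one point with central differences."""
    d_x = (h(x + eps, l, idf) - h(x - eps, l, idf)) / (2 * eps)
    d_xx = (h(x + eps, l, idf) - 2 * h(x, l, idf) + h(x - eps, l, idf)) / eps ** 2
    d_l = (h(x, l + eps, idf) - h(x, l - eps, idf)) / (2 * eps)
    d_idf = (h(x, l, idf + eps) - h(x, l, idf - eps)) / (2 * eps)
    return {
        "condition 1: increasing in x": d_x > 0,
        "condition 2: concave in x": d_xx < 0,
        "condition 3: decreasing in l": d_l < 0,
        "condition 4: increasing in idf": d_idf > 0,
    }

# One arbitrary evaluation point: 3 occurrences, document length 100, idf 2.
results = check_conditions(h, x=3.0, l=100.0, idf=2.0)
```

A check at a single point is only a sanity test; the conditions are stated for all admissible values, so an analytical argument is still needed to establish them.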
  • Burstiness Phenomenon
    We now turn to word frequency distributions:
    Church and Gale [1] showed that a 2-Poisson model yields a poor fit to word frequencies
    A possible explanation: words tend to appear in bursts (burstiness)
    Once a word appears in a document, it is much more likely to appear again
    Recent works on the Dirichlet Compound Multinomial
    ⇒ Which distributions can account for burstiness?
    [1] Poisson Mixtures
  • Burstiness Property of Probability Distribution
    Definition: A distribution $P$ is bursty iff the function $g$ defined by
      $g(x) = P(X \geq x + \varepsilon \mid X \geq x)$
    is a strictly increasing function of $x$ ($\forall \varepsilon > 0$)
    Interpretation: it becomes easier to generate more occurrences
    $g(x)$ strictly increasing
    ⇐⇒ $\Delta = \log g(x)$ strictly increasing
    ⇐⇒ $\Delta = \log P(X \geq x + \varepsilon) - \log P(X \geq x)$ is increasing
    As $\Delta < 0$, the absolute values of successive differences $\Delta$ decrease
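As a worked check of this definition, consider two survival functions chosen here purely for illustration: the log-logistic form $P(X \geq x) = \lambda / (x + \lambda)$ and an exponential $P(X \geq x) = e^{-\mu x}$, with $\lambda, \mu > 0$. Both choices are assumptions of this sketch.

```latex
% Log-logistic survival: g strictly increases with x, so the law is bursty.
g(x) = \frac{P(X \geq x + \varepsilon)}{P(X \geq x)}
     = \frac{\lambda / (x + \varepsilon + \lambda)}{\lambda / (x + \lambda)}
     = \frac{x + \lambda}{x + \varepsilon + \lambda}
     = 1 - \frac{\varepsilon}{x + \varepsilon + \lambda}

% Exponential survival: g does not depend on x, so the law is not bursty.
g(x) = \frac{e^{-\mu (x + \varepsilon)}}{e^{-\mu x}} = e^{-\mu \varepsilon}
```

In the first case the subtracted term shrinks as $x$ grows, so $g$ is strictly increasing for every $\varepsilon > 0$. In the second case the law is memoryless: having seen $x$ occurrences makes further ones neither easier nor harder.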
  • Geometric Interpretation of Burstiness
    [Figure: $\log P(X > x)$ against $x$. $\Delta = \log P(X > x + \varepsilon) - \log P(X > x)$ increases; as $\Delta < 0$, its absolute value decreases]
  • Gaussian(mean=5, std=1) is not bursty
    [Figure: $\log P(X > x)$ against $x$]
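The Gaussian counter-example can be reproduced numerically. The sketch below evaluates $g(x) = P(X > x + \varepsilon) / P(X > x)$ on a grid and tests whether it strictly increases. The grid, the value $\varepsilon = 1$, the helper names and the log-logistic comparison with $\lambda = 0.1$ are all assumptions made for this illustration.

```python
import math

def gaussian_survival(x, mean=5.0, std=1.0):
    """P(X > x) for a Gaussian, via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))

def log_logistic_survival(x, lam=0.1):
    """P(X > x) = lam / (x + lam), used here as an assumed bursty comparison."""
    return lam / (x + lam)

def is_bursty_on_grid(survival, xs, eps=1.0):
    """Return True if g(x) = P(X > x + eps) / P(X > x) strictly increases along xs."""
    g = [survival(x + eps) / survival(x) for x in xs]
    return all(a < b for a, b in zip(g, g[1:]))

xs = [float(x) for x in range(0, 11)]
gaussian_is_bursty = is_bursty_on_grid(gaussian_survival, xs)
log_logistic_is_bursty = is_bursty_on_grid(log_logistic_survival, xs)
```

A grid check with a single $\varepsilon$ can only refute burstiness, not prove it, since the definition quantifies over all $x$ and all $\varepsilon > 0$.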
  • Information Models & Heuristic Constraints
    Models defined by:
      $RSV(q, d) = \sum_{w \in q} x_w^q \underbrace{\left(-\log P(Tf_w > t_w^d \mid \lambda_w)\right)}_{\text{function } h}$   (1)
    Condition 1: h increasing with x
    Condition 3: h penalizes long documents
    Condition 2: h concave
    Theorem: If the distribution $P$ is bursty, then the information model defined with $P$ is concave
    IDF effect and 2 additional conditions depend on the choice of $P$
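One way to see why the theorem holds is sketched below. It assumes that the survival function $S(t) = P(Tf_w > t \mid \lambda_w)$ is continuous (so that $>$ and $\geq$ coincide) and that the normalized frequency $t$ is proportional to $x$; the paper's own proof may proceed differently.

```latex
% Burstiness: for every \varepsilon > 0 the increment of \log S increases with t,
% so comparing the increments starting at t and at t + \varepsilon gives
\log S(t + 2\varepsilon) - \log S(t + \varepsilon) \;>\; \log S(t + \varepsilon) - \log S(t)

% Rearranging yields strict midpoint convexity of \log S:
\log S(t + \varepsilon) \;<\; \tfrac{1}{2} \bigl( \log S(t) + \log S(t + 2\varepsilon) \bigr)

% A continuous midpoint-convex function is convex, hence
h(t) = -\log S(t) \ \text{is concave in } t, \ \text{and therefore in } x \text{ when } t \propto x .
```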
  • Characterization of Information Models
    1 Normalisation of Frequencies
      Increasing in x, decreasing in l
      ex: DFR normalization $t_w^d = x_w^d \log\left(1 + c\, \frac{avg_l}{l_d}\right)$
    2 Probability Distribution
      Continuous and bursty. Support = $[0, +\infty)$
    3 Retrieval Function
      $RSV(q, d) = \sum_{w \in q} -x_w^q \log P(Tf_w > t_w^d \mid \lambda_w) = \sum_{w \in q \cap d} -x_w^q \log P(Tf_w > t_w^d \mid \lambda_w)$
      $\lambda_w = \frac{F_w}{N}$ or $\frac{N_w}{N}$
      where:
      - $F_w$: frequency of $w$ in the corpus
      - $N_w$: document frequency of $w$
      - $N$: number of documents in the collection
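The first and last ingredients above can be sketched in a few lines. The function names, the default `c = 1.0` and the use of the natural logarithm are assumptions for illustration; the slide leaves the log base and the value of `c` open.

```python
import math

def dfr_normalize(x, doc_len, avg_len, c=1.0):
    """DFR normalization t = x * log(1 + c * avg_l / l_d) (natural log assumed)."""
    return x * math.log(1.0 + c * avg_len / doc_len)

def corpus_lambda(count, n_docs):
    """lambda_w = F_w / N or N_w / N, depending on which count is passed in."""
    return count / n_docs

# Same raw frequency in a short, an average and a long document (toy lengths).
t_short = dfr_normalize(3, doc_len=50, avg_len=100)
t_avg = dfr_normalize(3, doc_len=100, avg_len=100)
t_long = dfr_normalize(3, doc_len=400, avg_len=100)
lam = corpus_lambda(10, 1000)
```

With equal raw counts, the shorter document receives the larger normalized frequency, which is how the normalization carries the length penalty of Condition 3 into the score.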
  • Two Power-law Instances
    The log-logistic and smoothed power law models
  • Log-Logistic Model
    Log-Logistic distribution:
      $P(Tf_w > t_w^d \mid \lambda_w) = \frac{\lambda_w}{t_w^d + \lambda_w}$
    The LGD model is defined by
    1 DFR Normalization with parameter $c$
    2 $Tf_w \sim \mathrm{LogLogistic}(\lambda_w = \frac{N_w}{N})$
    3 Ranking Model (as before):
      $RSV(q, d) = \sum_{w \in q \cap d} -x_w^q \log P(Tf_w > t_w^d)$
    Meets all conditions for all parameter values
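The three steps of the LGD model can be assembled into a small scoring function. This is a minimal sketch rather than the authors' implementation: the function name, the dictionary-based inputs, the default `c = 1.0` and the natural logarithm are assumptions, and the collection statistics in the example are made up.

```python
import math

def lgd_score(query_tf, doc_tf, doc_len, avg_len, doc_freq, n_docs, c=1.0):
    """Sum over w in q ∩ d of -x_w^q * log(lambda_w / (t_w^d + lambda_w))."""
    score = 0.0
    for w, xq in query_tf.items():
        xd = doc_tf.get(w, 0)
        if xd == 0:
            continue  # terms absent from the document contribute nothing
        t = xd * math.log(1.0 + c * avg_len / doc_len)  # step 1: DFR normalization
        lam = doc_freq[w] / n_docs                      # step 2: lambda_w = N_w / N
        survival = lam / (t + lam)                      # log-logistic P(Tf_w > t)
        score += -xq * math.log(survival)               # step 3: information
    return score

# Toy collection statistics (hypothetical values).
doc_freq = {"rare": 10, "common": 500}
query = {"rare": 1, "common": 1}
score_rare = lgd_score({"rare": 1}, {"rare": 2}, 100, 100, doc_freq, 1000)
score_common = lgd_score({"common": 1}, {"common": 2}, 100, 100, doc_freq, 1000)
score_both = lgd_score(query, {"rare": 2, "common": 2}, 100, 100, doc_freq, 1000)
```

At equal term frequency, the term that occurs in fewer documents contributes more to the score, which is the IDF effect of Condition 4.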
  • Smoothed Power Law (SPL)
    Distribution on $[0, +\infty)$ with parameter $0 < \lambda < 1$:
      $P(Tf_w > t_w^d \mid \lambda_w) = \frac{\lambda_w^{\frac{t_w^d}{t_w^d + 1}} - \lambda_w}{1 - \lambda_w}$
    IR Model:
    1 DFR Normalization with parameter $c$
    2 $Tf_w \sim \mathrm{SPL}(\lambda_w = \frac{N_w}{N})$
    3 Ranking Model (as before):
      $RSV(q, d) = \sum_{w \in q \cap d} -x_w^q \log P(Tf_w > t_w^d)$
    Meets all conditions
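The SPL survival function is easy to sanity-check numerically. In the sketch below, the function names and the test value $\lambda = 0.5$ are assumptions for illustration.

```python
import math

def spl_survival(t, lam):
    """P(Tf > t | lam) = (lam ** (t / (t + 1)) - lam) / (1 - lam), for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    return (lam ** (t / (t + 1.0)) - lam) / (1.0 - lam)

def spl_information(t, lam):
    """Per-term weight -log P(Tf > t | lam)."""
    return -math.log(spl_survival(t, lam))

# Information at integer normalized frequencies 0..5 for an arbitrary lam.
infos = [spl_information(float(t), 0.5) for t in range(6)]
increments = [b - a for a, b in zip(infos, infos[1:])]
```

The survival equals 1 at $t = 0$ and tends to 0 as $t$ grows, so it is a valid survival function on $[0, +\infty)$. On this grid the information increases while its increments shrink, consistent with Conditions 1 and 2.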
  • Experiments
    Comparison with language models, BM25, DFR models
    Corpus: ROBUST, TREC-3, CLEF03, GIRT, with short (-t) and long (-d) queries
    6 query sets: ROB-d, ROB-t, T3-t, GIRT, CLEF-d, CLEF-t
    Methodology:
    1 Divide each collection into 10 training/test splits
    2 Learn the best parameter ($\mu$, $c$, $k_1$) to optimize MAP or P10 on the training set
    3 Measure MAP or P10 on the 10 splits and test the difference with a t-test
  • Comparison with Dirichlet Smoothing
    Table: LGD and SPL versus LM-Dirichlet after 10 splits; bold indicates significant difference
    MAP   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    DIR   27.1   25.1   41.1   25.6   36.2   48.5
    LGD   27.4   25.0   42.1   24.8   36.8   49.7
    P10   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    DIR   45.6   43.3   68.6   54.0   28.4   33.8
    LGD   46.2   43.5   69.0   54.3   28.6   34.5

    MAP   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    DIR   26.7   25.0   40.9   27.1   36.2   50.2
    SPL   25.6   24.9   42.1   26.8   36.4   46.9
    P10   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    DIR   45.2   43.8   68.2   52.8   27.3   32.8
    SPL   46.6   44.7   70.8   55.3   27.1   32.9
  • Comparison with DFR models
    Table: LGD and SPL versus PL2 after 10 splits; bold indicates significant difference
    MAP   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    PL2   26.2   24.8   40.6   24.9   36.0   47.2
    LGD   27.3   24.7   40.5   24.0   36.2   47.5
    P10   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    PL2   46.4   44.1   68.2   55.0   28.7   33.1
    LGD   46.6   43.2   66.7   53.9   28.5   33.7

    MAP   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    PL2   26.3   25.2   42.8   25.8   37.3   45.7
    SPL   26.3   25.2   42.7   25.3   37.4   44.1
    P10   ROB-d  ROB-t  GIR    T3-t   CL-t   CL-d
    PL2   46.0   45.2   69.3   54.8   26.2   32.7
    SPL   47.0   45.2   69.8   55.4   25.9   32.9
  • Extension to Pseudo Relevance Feedback
    Mean information of the top retrieved documents:
      $\mathrm{Info}_R(w) = \frac{1}{|R|} \sum_{d \in R} -\log P(Tf_w > t_w^d; \lambda_w)$
    Query update:
      $x_w^{q2} = \frac{x_w^q}{\max_w x_w^q} + \beta\, \frac{\mathrm{Info}_R(w)}{\max_w \mathrm{Info}_R(w)}$
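The feedback step can be sketched as follows, assuming the log-logistic survival $\lambda_w / (t + \lambda_w)$ for the information measure. The function names, the representation of each feedback document as a dictionary of already-normalized frequencies, and the toy numbers are all hypothetical.

```python
import math

def log_logistic_information(t, lam):
    """-log P(Tf > t | lam) for the log-logistic survival lam / (t + lam)."""
    return math.log((t + lam) / lam)

def info_r(term, feedback_docs, lam):
    """Mean information of `term` over the feedback set R (absent terms have t = 0)."""
    total = sum(log_logistic_information(doc.get(term, 0.0), lam) for doc in feedback_docs)
    return total / len(feedback_docs)

def update_query(query_tf, info, beta):
    """x_w^{q2} = x_w^q / max_w x_w^q + beta * Info_R(w) / max_w Info_R(w)."""
    max_q = max(query_tf.values())
    max_info = max(info.values())
    terms = set(query_tf) | set(info)
    return {w: query_tf.get(w, 0.0) / max_q + beta * info.get(w, 0.0) / max_info
            for w in terms}

# Two toy feedback documents holding normalized frequencies t_w^d.
feedback_docs = [{"retrieval": 2.0, "model": 1.0}, {"retrieval": 1.0}]
lambdas = {"retrieval": 0.05, "model": 0.2}
info = {w: info_r(w, feedback_docs, lam) for w, lam in lambdas.items()}
new_query = update_query({"retrieval": 1}, info, beta=0.5)
```

Both terms of the update are scaled by their maximum, so $\beta$ directly sets how much weight the most informative feedback term receives relative to the strongest original query term.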
  • Comparison with other PRF Models
    Mixture Model (Zhai)
      R comes from a mixture of a relevant topic model $\theta_w$ and the corpus language model (multinomial distribution)
      Query update: $p(w \mid q2) = \alpha\, p(w \mid q) + (1 - \alpha)\, \theta_w$
    Bo2 Model (Amati)
      Documents in R are merged together. A geometric probability model measures the informative content of a word
      Query update: $x_w^{q2} = \frac{x_w^q}{\max_w x_w^q} + \beta\, \frac{\mathrm{Info}_{Bo2}(w)}{\max_w \mathrm{Info}_{Bo2}(w)}$
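The mixture-model query update is a linear interpolation of two word distributions. The sketch below shows only that interpolation; estimating the topic model $\theta_w$ from R is not covered. The function name and the toy distributions are assumptions.

```python
def mixture_update(query_model, topic_model, alpha):
    """p(w | q2) = alpha * p(w | q) + (1 - alpha) * theta_w over the union vocabulary."""
    terms = set(query_model) | set(topic_model)
    return {w: alpha * query_model.get(w, 0.0) + (1.0 - alpha) * topic_model.get(w, 0.0)
            for w in terms}

# Toy distributions (hypothetical values), each summing to 1.
query_model = {"information": 0.5, "retrieval": 0.5}
topic_model = {"retrieval": 0.6, "model": 0.4}
updated = mixture_update(query_model, topic_model, alpha=0.7)
```

Because both inputs are probability distributions, the result is one as well, in contrast with the max-normalized frequency update used by Bo2 and by the information-based models.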
  • Pseudo Relevance Feedback Experiments
    1 Divide each collection into 10 training/test splits
    2 Learn the best interpolation weight ($\beta$, $\alpha$) to optimize MAP on the training set
    3 Measure MAP on the 10 splits and test the difference with a t-test
    4 Change $|R|$ and the number of terms added to the queries (termCount, TC)
    5 Repeat
  • Table: MAP; bold indicates best performance, ∗ significant difference over LM and Bo2 models
    Model    |R|  TC  ROB-t  GIRT   TREC3-t  CLEF-t
    LM+MIX   5    5   27.5   44.4   30.7     36.6
    INL+Bo2  5    5   26.5   42.0   30.6     37.6
    LGD      5    5   28.3∗  44.3   32.9∗    37.6
    LM+MIX   5    10  28.3   45.7∗  33.6     37.4
    INL+Bo2  5    10  27.5   42.7   32.6     37.5
    LGD      5    10  29.4∗  44.9   35.0∗    40.2∗
    LM+MIX   10   10  28.4   45.5   31.8     37.6
    INL+Bo2  10   10  27.2   43.0   32.3     37.4
    LGD      10   10  30.0∗  46.8∗  35.5∗    38.9
    LM+MIX   10   20  29.0   46.2   33.7     38.2
    INL+Bo2  10   20  27.7   43.5   33.8     37.7
    LGD      10   20  30.3∗  47.6∗  37.4∗    38.6
  • Table: Mean average precision (MAP) of PRF experiments; bold indicates best performance, ∗ significant difference over LM and Bo2 models
    Model  |R|  TC  ROB-t  GIR    T3-t   CL-t
    LGD    5    5   28.3∗  44.3   32.9∗  37.6
    SPL    5    5   28.9∗  45.6∗  32.9∗  39.0∗
    LGD    5    10  29.4∗  44.9   35.0∗  40.2∗
    SPL    5    10  29.6∗  47.0∗  34.6∗  39.5∗
    LGD    10   10  30.0∗  46.8∗  35.5∗  38.9
    SPL    10   10  30.0∗  48.9∗  33.8∗  39.1∗
    LGD    10   20  30.3∗  47.6∗  37.4∗  38.6
    SPL    10   20  29.9∗  50.2∗  34.3   39.7∗
    LGD    20   20  29.5∗  48.9∗  37.2∗  41.0∗
    SPL    20   20  28.8   50.3∗  33.9   39.0∗
  • Conclusion
    Can we design IR models compatible with empirical evidence?
    ⇒ Proposal: Information Models modelling burstiness (better fit to data)
    Analytical characterization of retrieval constraints
    Definition of burstiness for probability distributions
    Information-based models compliant with retrieval constraints
    Bursty distribution ⇒ concave model
    Extension to PRF
    The log-logistic and smoothed power law models
    Performance similar to or better than LM and DFR without PRF, better with PRF
    Questions?
  • Relation with DFR
    DFR models are defined by:
      $RSV(q, d) = \sum_{w \in q \cap d} -x_w^q\, \mathrm{Inf}_2(t_w^d) \log P(t_w^d)$
    We can show that:
    $\mathrm{Inf}_2$ makes DFR models concave (condition 2)
    Without $\mathrm{Inf}_2$, DFR models perform poorly
    Discrete laws are used with continuous values
    2 notions of information (non-homogeneous)
    ⇒ Information Models use continuous laws and a single concept of information