ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

Best Paper Award at ECDL 2010: the 14th European Conference on Research and Advanced Technology for Digital Libraries



  1. ECDL 2010, 6-10 September 2010. Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study. Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert. Damien Palacio - damien.palacio@univ-pau.fr
  2. Outline: 1. Motivation (topical IR → geographic IR; hypothesis: GIRS > IRS) 2. Context and issue (IRS evaluation; current evaluation frameworks are partial) 3. Contribution (a GIRS evaluation framework) 4. Experiments (case study with the PIV GIRS; hypothesis validated) 5. Conclusion and future work
  3. Outline (section divider, same content as slide 2)
  4. 1. Motivation – Why Geographic IR? ➔ Geographic information retrieval ➔ Query = "trip around Glasgow in summer 2010" ➔ Search engines: a topical engine sees only terms ∈ {trip, Glasgow, summer, 2010}; a geographic engine also resolves spatial ∈ {citiesNearGlasgow, ...} and temporal ∈ {21 June .. 22 Sept 2010} interpretations ➔ ≈ 1/6 of queries are geographic queries: Excite (Sanderson et al., 2004), AOL (Gan et al., 2008), Yahoo! (Jones et al., 2008) ➔ A current and realistic issue
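To make the topical/spatial/temporal split concrete, here is a minimal Python sketch of how a query such as "trip around Glasgow in summer 2010" could be decomposed. The gazetteer, the season table and the year pattern are toy stand-ins for the named-entity resources a real GIRS would rely on; they are not part of the system described in the paper.

```python
import re

# Hypothetical, toy resources for illustration only.
GAZETTEER = {"glasgow", "dumbarton"}                 # known place names
SEASONS = {"summer": ("21 June", "22 September")}    # season -> date interval

def decompose(query):
    """Split a query into its topical, spatial and temporal parts."""
    tokens = query.lower().split()
    spatial = [t for t in tokens if t in GAZETTEER]
    temporal = [t for t in tokens if t in SEASONS or re.fullmatch(r"\d{4}", t)]
    topical = [t for t in tokens if t not in spatial and t not in temporal]
    return {"topical": topical, "spatial": spatial, "temporal": temporal}

print(decompose("trip around Glasgow in summer 2010"))
# {'topical': ['trip', 'around', 'in'], 'spatial': ['glasgow'], 'temporal': ['summer', '2010']}
```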
  5. 1. Motivation – Why Geographic IR? A geographic IRS: how does it work? ➔ 3 dimensions to process: spatial, temporal and topical ➔ 1 index per dimension: topical (bag of words, vector space model, ...), spatial (named entity recognition, ...), temporal (named entity recognition, ...)
  6. 1. Motivation – Why Geographic IR? A geographic IRS: how does it work? ➔ Spatial processing (illustration)
  7. 1. Motivation – Why Geographic IR? A geographic IRS: how does it work? ➔ Same three per-dimension indexes as above ➔ Retrieval: usually by filtering (STEWARD, SPIRIT, CITER, …) ➔ Issue: performance of a GIRS vs. a topical IRS ➔ Hypothesis: a geographic IRS is better than a topical IRS
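Slide 7 notes that existing systems usually retrieve by filtering. A minimal sketch of that strategy, assuming each per-dimension index returns a result structure of its own (the data shapes below are illustrative, not those of STEWARD, SPIRIT or CITER):

```python
def filter_retrieval(topical_hits, spatial_hits, temporal_hits):
    """Keep topically ranked documents only if they also satisfy the spatial
    and temporal constraints (filtering, as opposed to score combination).

    topical_hits: {doc_id: score}; spatial_hits, temporal_hits: sets of doc ids.
    """
    kept = {doc: score for doc, score in topical_hits.items()
            if doc in spatial_hits and doc in temporal_hits}
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)

# Example: d2 matches topically and spatially but not temporally, so it is dropped.
print(filter_retrieval({"d1": 0.9, "d2": 0.8}, {"d1", "d2"}, {"d1"}))
# [('d1', 0.9)]
```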
  8. Outline (section divider)
  9.–16. 2. Context and Issue: IRS Partial Evaluation – Evaluating an IR system (progressive build over eight slides) ➔ System = efficiency (computation time, storage needed) + effectiveness (quality); geographic IR literature vs. topical IR literature ➔ Effectiveness evaluation per dimension: topical (TREC, CLEF, ...), temporal (TempEval), spatial (Bucher et al., 2005; GeoCLEF) ➔ The evaluation framework proposed here covers all three dimensions
  17. Outline (section divider)
  18. 3. Proposition – GIRS Evaluation Framework: Evaluation framework for the 3 dimensions (1/2) ➔ Goal: measure GIRS quality ➔ Means: build on the TREC framework (1992–) and its "Cranfield" methodology ➔ Test collection: corpus, ≥ 25 topics, qrels ➔ Measures: P@X, MAP, NDCG, ... [Voorhees, 2007]
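As a reminder, two of the measures named on this slide have the standard TREC-style definitions:

```latex
P@k = \frac{|\{\text{relevant documents among the top } k\}|}{k},
\qquad
\mathrm{MAP} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{R_q} \sum_{k=1}^{n_q} P@k \cdot \mathrm{rel}_q(k)
```

where $Q$ is the set of topics, $R_q$ the number of relevant documents for topic $q$, $n_q$ the length of the ranked list, and $\mathrm{rel}_q(k) \in \{0,1\}$ indicates whether the document at rank $k$ is relevant.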
  19. 3. Proposition – GIRS Evaluation Framework: Evaluation framework for the 3 dimensions (2/2) ➔ TREC framework extension ➔ Test collection: ≥ 25 topics, a corpus covering the 3 dimensions, gradual qrels, plus geographic resources
  20.–21. 3. Proposition – GIRS Evaluation Framework (continued) ➔ Example over the 3 dimensions: topic "trip around Glasgow" vs. a document about a trip + Bob born in Dumbarton; the relevance scale runs from "no dimension satisfied" up to "3 dimensions + global = satisfied topic" ➔ About qrels: relevance(doc, topic) ∈ {0;1;2;3;4} ➔ Principle: "the more satisfied dimensions there are, the better it is" ➔ Gradual-qrels-aware measure: Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002], per topic (NDCG for each topic) and globally (meanNDCG for the system)
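The gradual-qrels-aware measure is NDCG [Järvelin & Kekäläinen, 2002]. In one common formulation, with graded relevance $rel_i \in \{0, \dots, 4\}$ for the document at rank $i$:

```latex
\mathrm{DCG}@p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i + 1)},
\qquad
\mathrm{NDCG}@p = \frac{\mathrm{DCG}@p}{\mathrm{IDCG}@p}
```

where $\mathrm{IDCG}@p$ is the $\mathrm{DCG}@p$ of the ideal, relevance-sorted ranking, so $\mathrm{NDCG}@p \in [0, 1]$; meanNDCG then averages NDCG over all topics of the test collection.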
  22. Outline (section divider)
  23. 4. Experiments – Case Study with the PIV GIRS: the PIV system ➔ Indexing: 1 index per dimension: topical = Terrier IRS [Ounis et al., 2005], spatial = map segmentation into tiles, temporal = timeline segmentation into tiles ➔ Retrieval: one result document list per index, then results combined with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
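A minimal sketch of what "map segmentation into tiles" can look like: a document's spatial footprint (here a bounding box) is indexed under the ids of the fixed-size grid cells it intersects. The grid size and the tile-id scheme are assumptions for illustration, not the exact tiling used by PIV.

```python
import math

def tile_ids(min_lon, min_lat, max_lon, max_lat, cell=0.5):
    """Ids of the fixed-size grid tiles (cell degrees wide) intersecting a bounding box."""
    tiles = []
    for i in range(math.floor(min_lon / cell), math.floor(max_lon / cell) + 1):
        for j in range(math.floor(min_lat / cell), math.floor(max_lat / cell) + 1):
            tiles.append(f"tile_{i}_{j}")
    return tiles

# Example: a rough bounding box around Glasgow (about 4.4-4.1 degrees W, 55.8-55.9 degrees N).
print(tile_ids(-4.4, 55.8, -4.1, 55.9))
# ['tile_-9_111']
```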
  24.–26. 4. Experiments – Case Study with the PIV GIRS: the CombMNZ principle [Fox & Shaw, 1993; Lee, 1997], illustrated step by step over three slides
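A minimal sketch of the CombMNZ principle [Fox & Shaw, 1993; Lee, 1997] illustrated on slides 24–26: each run's scores are normalized, summed per document, and the sum is multiplied by the number of runs that retrieved the document. The min-max normalization and the toy runs below are illustrative assumptions, not necessarily the exact settings used in PIV.

```python
def minmax_normalize(run):
    """Scale one run's scores ({doc_id: score}) into [0, 1]."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}

def comb_mnz(runs):
    """CombMNZ fusion: sum of normalized scores times the number of runs
    in which the document appears."""
    totals, hits = {}, {}
    for run in map(minmax_normalize, runs):
        for doc, score in run.items():
            totals[doc] = totals.get(doc, 0.0) + score
            hits[doc] = hits.get(doc, 0) + 1
    fused = {doc: totals[doc] * hits[doc] for doc in totals}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Example: fuse the topical, spatial and temporal runs of one query.
topical = {"d1": 12.0, "d2": 7.5, "d3": 3.0}
spatial = {"d1": 0.8, "d3": 0.6}
temporal = {"d2": 0.9, "d1": 0.4}
print(comb_mnz([topical, spatial, temporal]))
# [('d1', 6.0), ('d2', 3.0), ('d3', 0.0)]
```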
  27. 4. Experiments – Case Study with the PIV GIRS: the MIDR_2010 collection ➔ Building the qrels: 12 volunteers (thanks!), 31 topics, 5,645 relevance judgments ∈ {0;1;2;3;4} over document paragraphs, with a map for tracking spatial information
  28. 4. Experiments – Hypothesis Validated: analysis of the collected data ➔ IRS evaluation: result lists × qrels, processed with trec_eval (NDCG) ➔ Result: the geographic IRS is the most effective; hypothesis validated
  29. 4. Experiments – Hypothesis Validated: analysis of the collected data ➔ Result: the geographic IRS is the most effective
  30. Outline (section divider)
  31. Conclusions and Future Work (1/2) ➔ An evaluation framework for geographic IR systems: reusable; generalizable to more dimensions (confidence, freshness, ... [Costa Pereira et al., 2009]); no gradual relevance per individual dimension ➔ Case study with the PIV system: creation of a specific test collection (≥ 25 topics); a French-language test collection; limited collection size (number of documents)
  32. Conclusions and Future Work (2/2) ➔ Hypothesis validated: the 3 dimensions improve IR (+66.5%) ➔ Future work: finer-grained analysis (by query); quantify PIV improvements with various index combinations; organize a GIRS evaluation campaign: anyone interested?
  33. ECDL 2010, 6-10 September 2010. Thank you! Damien Palacio - damien.palacio@univ-pau.fr
  34.–35. Spatial Interface
  36.–37. Temporal Interface
  38. Spatial Tiling
