In-depth Exploration of Geotagging Performance

  1. In-depth Exploration of Geotagging Performance using sampling strategies on YFCC100M George Kordopatis-Zilos, Symeon Papadopoulos, Yiannis Kompatsiaris Information Technologies Institute, Thessaloniki, Greece MMCommons Workshop, October 16, 2016 @ Amsterdam, NL
  2. Where is it? Depicted landmark: Eiffel Tower. Location: Paris, Tennessee. The keyword “Tennessee” is very important to correctly place the photo. Source (Wikipedia): er_(Paris,_Tennessee)
  3. Motivation Evaluating multimedia retrieval systems • What do we evaluate? • How? • What decisions do we make based on it? Diagram labels: MM system (black box), Test Collection, Comparison to ground truth, Evaluation measure, Decision
  4. Problem Formulation • Test collection creation → evaluation bias • Performance reduced to a single measure → many nuances of performance are missed • Test problem: Geotagging = predicting the geographic location of a multimedia item based on its content
  5. Example: Evaluating geotagging • Test collection #1: 1M images, 700K located in the US • Assume we use P@1km as the evaluation measure • System 1: almost perfect precision in the US (100%), very poor for the rest of the world (10%) → P@1km = 0.7*100 + 0.3*10 = 73% • System 2: approximately the same precision all over the world (65%) → P@1km = 65% • Test collection #2: 1M images, 500K depicting cats and puppies on white background • Then, for 50% of the collection any prediction is essentially random.
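The arithmetic on this slide can be reproduced with a short sketch. It treats the collection-level P@1km as the share-weighted mean of per-region precision. The function name and the 50/50 "balanced" collection are hypothetical additions for illustration and do not come from the slides.

```python
def blended_precision(shares_and_precisions):
    """Collection-level precision (in %) as the share-weighted mean of
    per-region precision. Each entry is (share of collection, precision in %)."""
    return sum(share * precision for share, precision in shares_and_precisions)


# Test collection #1 from the slide: 70% of items in the US, 30% elsewhere.
system_1 = blended_precision([(0.7, 100.0), (0.3, 10.0)])  # about 73
system_2 = blended_precision([(0.7, 65.0), (0.3, 65.0)])   # about 65

# Hypothetical collection with a 50/50 geographic split (not in the slides):
# the ranking of the two systems flips.
system_1_balanced = blended_precision([(0.5, 100.0), (0.5, 10.0)])  # about 55
system_2_balanced = blended_precision([(0.5, 65.0), (0.5, 65.0)])   # about 65

print(round(system_1, 1), round(system_2, 1))
print(round(system_1_balanced, 1), round(system_2_balanced, 1))
```

Under the US-heavy collection System 1 looks better, while under the assumed balanced collection System 2 does, which is the evaluation bias the slide points at.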
  6. Multimedia Geotagging • Problem of estimating the geographic location of a multimedia item (e.g. Flickr image + metadata) • Variety of approaches: • Text-based: use the text metadata (tags) • Gazetteer-based • Statistical methods (associations between tags & locations) • Visual • Similarity-based (find most similar and use their location) • Model-based (learn visual model of an area) • Hybrid • Combine text and visual
  7. Language Model • Most likely cell: $c_j = \arg\max_i \sum_{k=1}^{N} p(t_k \mid c_i)$ • Tag-cell probability: $p(t \mid c) = \frac{N_u}{N_t}$ We will refer to this as: Base LM (or Basic)
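A minimal sketch of a tag-based language model of this kind follows. Several details are assumptions, because the slide does not define them: N_u is taken to be the number of distinct users who used tag t inside cell c, N_t the sum of those counts over all cells, and the per-tag probabilities are assumed to be aggregated by summation. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def train_tag_cell_probs(items):
    """items: iterable of (user_id, cell_id, tags).
    Returns probs[tag][cell] = N_u / N_t under the assumptions stated above."""
    users = defaultdict(set)  # (tag, cell) -> distinct users
    for user, cell, tags in items:
        for tag in set(tags):
            users[(tag, cell)].add(user)

    totals = defaultdict(int)  # tag -> N_t (assumed: sum of N_u over cells)
    for (tag, _cell), tag_users in users.items():
        totals[tag] += len(tag_users)

    probs = defaultdict(dict)
    for (tag, cell), tag_users in users.items():
        probs[tag][cell] = len(tag_users) / totals[tag]
    return probs


def most_likely_cell(tags, probs):
    """Return the cell with the highest summed tag-cell probability,
    or None if no tag of the item was seen in training."""
    scores = defaultdict(float)
    for tag in tags:
        for cell, p in probs.get(tag, {}).items():
            scores[cell] += p
    return max(scores, key=scores.get) if scores else None


# Toy training data echoing the "Paris, Tennessee" example (hypothetical).
training = [
    ("u1", "paris_fr", ["paris", "eiffel"]),
    ("u2", "paris_fr", ["paris", "eiffel"]),
    ("u3", "paris_tn", ["paris", "eiffel", "tennessee"]),
]
probs = train_tag_cell_probs(training)
without_keyword = most_likely_cell(["paris", "eiffel"], probs)
with_keyword = most_likely_cell(["paris", "eiffel", "tennessee"], probs)
print(without_keyword, with_keyword)
```

In this toy setting the extra tag shifts the estimate from the French cell to the Tennessee cell, which mirrors the role of the keyword on slide 2.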
  8. Language Model Extensions • Feature selection • Discard tags that do not provide any geographical cues • Selection criterion: locality > 0 • Feature weighting • More importance to tags with geographic information • Linear combination of locality and spatial entropy • Multiple grids • Consider two grids: fine and coarse – if the estimate from the fine grid falls within that of the coarse, then use that one • Similarity Search • Out of the selected cell, use lat/lon of most similar item to refine location estimation We will refer to this as: Full LM (or Full)
  9. MediaEval Placing Task • Benchmarking activity in the context of MediaEval • Dataset: • Flickr images and videos (different each year) • Training and test set • Also possible to test systems that use external data • Dataset sizes per edition (training set / test set): • 2015: 4,695,149 / 949,889 • 2014: 5,025,000 / 510,000 • 2013: 8,539,050 / 262,000
  10. Proposed Evaluation Framework • Initial (reference) test collection Dref • Sampling function f: Dref → Dtest • Performance volatility • p(D): performance score achieved in collection D • In our case, we consider two such measures: • P@1km • Median distance error
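The two measures can be sketched as follows, assuming predictions and ground truth are (latitude, longitude) pairs in degrees and that distance is the great-circle (haversine) distance. The helper names, the `sample` argument standing in for the sampling function f, and the toy coordinates are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def evaluate(predictions, ground_truth, sample=None, radius_km=1.0):
    """Compute P@radius and median distance error on the reference
    collection, or on a sampled subset of item ids if `sample` is given."""
    ids = list(ground_truth) if sample is None else list(sample)
    errors = [haversine_km(predictions[i], ground_truth[i]) for i in ids]
    return {
        "P@1km": sum(e <= radius_km for e in errors) / len(errors),
        "median_km": median(errors),
    }


# Toy data (hypothetical): one exact hit, one near miss, one gross error.
truth = {"a": (48.8584, 2.2945), "b": (36.3020, -88.3267), "c": (40.0, 20.0)}
preds = {"a": (48.8584, 2.2945), "b": (48.8566, 2.3522), "c": (40.005, 20.0)}

on_reference = evaluate(preds, truth)
on_sample = evaluate(preds, truth, sample=["a", "c"])
print(on_reference, on_sample)
```

Comparing the scores on the reference collection with those on each sampled collection gives a simple view of how much a system's performance moves with the choice of test set.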
  11. Sampling Strategies A variety of approaches for Placing Task collection: • Geographical Uniform Sampling • User Uniform Sampling • Text-based Sampling • Text Diversity Sampling • Geographically Focused Sampling • Ambiguity-based Sampling • Visual Sampling
  12. Uniform Sampling • Geographic Uniform Sampling • Divide earth surface into square areas of approximately the same size (~10x10km) • Select N items from each area (N=median of items/area) • User Uniform Sampling • Select only one item per user
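Both uniform strategies can be sketched in a few lines. As a simplification, the grid here uses fixed 0.1° latitude/longitude cells, which are roughly 11 km on a side at the equator and get narrower towards the poles, so they only approximate the equal-area squares described on the slide. The item layout (dicts with `lat`, `lon`, `user`) and the function names are hypothetical.

```python
import math
import random
from collections import defaultdict
from statistics import median


def grid_cell(lat, lon, cell_deg=0.1):
    """Map a coordinate to a lat/lon grid cell (simplified, not equal-area)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def geo_uniform_sample(items, cell_deg=0.1, seed=0):
    """Keep at most N items per cell, with N = median number of items per cell."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for item in items:
        cells[grid_cell(item["lat"], item["lon"], cell_deg)].append(item)
    n = max(1, int(median(len(members) for members in cells.values())))
    sample = []
    for members in cells.values():
        sample.extend(rng.sample(members, min(n, len(members))))
    return sample


def user_uniform_sample(items, seed=0):
    """Keep exactly one randomly chosen item per user."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for item in items:
        by_user[item["user"]].append(item)
    return [rng.choice(user_items) for user_items in by_user.values()]


# Toy collection (hypothetical): a dense cell, a sparse cell, a medium cell.
items = (
    [{"lat": 0.01 * k, "lon": 0.01, "user": "u1"} for k in range(1, 6)]
    + [{"lat": 10.05, "lon": 10.05, "user": "u2"}]
    + [{"lat": 20.05, "lon": 20.05, "user": "u3"},
       {"lat": 20.06, "lon": 20.06, "user": "u3"}]
)
geo_sample = geo_uniform_sample(items)
user_sample = user_uniform_sample(items)
print(len(geo_sample), len(user_sample))
```

Cells with fewer than N items contribute everything they have, so the result is capped rather than strictly uniform; that choice is an assumption of this sketch.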
  13. Text Sampling • Text-based Sampling • Select only items with more than M terms (M: median of terms/item) • Text Diversity Sampling • Represent items using bag-of-words • Use MinHash to generate a binary code per BoW vector • Select one item per code (bucket) B
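The two text strategies might look like the sketch below. The slide does not say how the binary code is derived from MinHash, so this version assumes a 1-bit scheme: for each of `num_hashes` seeded hash functions, take the minimum hash over the item's terms and keep its lowest bit. The use of SHA-1 as the hash family, the code length, and the function names are all assumptions.

```python
import hashlib
from statistics import median


def _seeded_hash(term, seed):
    """Deterministic 64-bit hash of a term for a given seed."""
    digest = hashlib.sha1(f"{seed}:{term}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_code(terms, num_hashes=16):
    """Binary code: lowest bit of the min-hash for each seeded hash function."""
    code = 0
    for seed in range(num_hashes):
        min_hash = min(_seeded_hash(term, seed) for term in terms)
        code = (code << 1) | (min_hash & 1)
    return code


def text_based_sample(items):
    """Keep items with more than M terms, M = median number of terms per item."""
    m = median(len(item["tags"]) for item in items)
    return [item for item in items if len(item["tags"]) > m]


def text_diversity_sample(items, num_hashes=16):
    """Keep the first item seen in each MinHash bucket."""
    buckets = {}
    for item in items:
        terms = set(item["tags"])
        if terms:
            buckets.setdefault(minhash_code(terms, num_hashes), item)
    return list(buckets.values())


# Toy items (hypothetical): the first two share the same bag of words.
items = [
    {"id": 1, "tags": ["paris", "eiffel"]},
    {"id": 2, "tags": ["eiffel", "paris"]},
    {"id": 3, "tags": ["amsterdam", "canal", "bike", "bridge", "night"]},
]
rich_text = text_based_sample(items)
duplicates_collapsed = text_diversity_sample(items[:2])
print([item["id"] for item in rich_text], len(duplicates_collapsed))
```

Items with identical term sets always fall in the same bucket, and items with highly overlapping term sets are likely to, so selecting one item per bucket reduces near-duplicate text in the test collection.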
  14. Other Sampling Strategies • Geographically Focused Sampling • Pick items from a selected place (continent/country) • Ambiguity-based Sampling • Select the set of items that are associated with ambiguous place names (or the complementary set) • Ambiguity defined with the help of entropy • Visual Sampling • Select only items associated with a given visual concept • Select only items associated with concepts related to buildings
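One way to make the entropy-based notion of ambiguity concrete is sketched below. It assumes that a tag's ambiguity is the Shannon entropy of its distribution over grid cells, that a tag counts as ambiguous above a fixed threshold, and that an item belongs to the ambiguous set if it carries at least one ambiguous tag. The threshold value, the item layout, and the function names are hypothetical, and the sketch scores every tag rather than only place names.

```python
import math
from collections import Counter, defaultdict


def tag_entropy(cell_counts):
    """Shannon entropy (bits) of a tag's occurrence distribution over cells."""
    total = sum(cell_counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in cell_counts.values() if n > 0)


def ambiguity_split(items, threshold=0.5):
    """Split items into (ambiguous, unambiguous) sets.
    An item is ambiguous if any of its tags has entropy above `threshold`."""
    counts = defaultdict(Counter)
    for item in items:
        for tag in set(item["tags"]):
            counts[tag][item["cell"]] += 1
    ambiguous_tags = {tag for tag, cells in counts.items()
                      if tag_entropy(cells) > threshold}

    ambiguous, unambiguous = [], []
    for item in items:
        target = ambiguous if ambiguous_tags & set(item["tags"]) else unambiguous
        target.append(item)
    return ambiguous, unambiguous


# Toy items (hypothetical): "paris" occurs in two different cells.
items = [
    {"id": 1, "cell": "fr", "tags": ["paris"]},
    {"id": 2, "cell": "fr", "tags": ["paris"]},
    {"id": 3, "cell": "tn", "tags": ["paris", "tennessee"]},
    {"id": 4, "cell": "nl", "tags": ["amsterdam"]},
]
ambiguous, unambiguous = ambiguity_split(items)
print([item["id"] for item in ambiguous], [item["id"] for item in unambiguous])
```

For brevity the entropies are estimated from the sampled items themselves; estimating them from a separate training collection would be the more natural choice in practice.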
  15. Experiments - Setup • Placing Task 2015 dataset: 949,889 images (subset of YFCC100M) • Test four variants of the Language Model method: • Basic-PT: Base LM method trained on PT dataset (≈4.7M geotagged images released by the task organizers) • Full-PT: Full LM method trained on PT dataset • Basic-Y: Base LM method trained on YFCC dataset (≈40M geotagged images of YFCC100M) • Full-Y: Full LM method trained on YFCC dataset
  16. Reference Results
  17. Geographical Uniform Sampling • Initial distribution → uniform distribution: select three items/cell
  18. User Uniform Sampling
  19. Text-based Sampling Select only images with >7 tags/item
  20. Text Diversity Sampling • After MinHash, 478,817 buckets were created.
  21. Geographically Focused Sampling Results of Full-Y
  22. Ambiguity-based Sampling
  23. Visual Sampling
  24. Summary of Results
  25. Thank you! Data/Code: • Get in touch: • George Kordopatis-Zilos: • Symeon Papadopoulos: / @sympap With the support of: