Context Adaptation in Image Search
Presentation about context adaptation in image search, given at the 4th Twente/SIKS workshop (held on the occasion of Robin Aly's PhD defense).

Transcript

  • 1. Context Adaptation in Image Search arjen@acm.org
  • 2. Context Adaptation GOAL: Present different photos to a sports journalist who queries for Beckham than to a glossy-magazine editor issuing the same query
  • 3. IPTC Categories • ACE (arts, culture, entertainment) • CLJ (crime, law & justice) • DIS (disasters & accidents) • EBF (economy, business & finance) • EDU (education) • ENV (environment) • HTH (health) • HUM (human interest) • LAB (labour, work) • LIF (lifestyle & leisure) • POL (politics) • REL (religion) • SCI (science & technology) • SOI (social issues) • SPO (sports) • WAR (unrest, conflicts, war) • WEA (weather)
  • 4. What Context? • Collection context – One “main” IPTC category per image • 96,351 out of 97,760 images in the 100k Belga collection • Note: noisy data, in spite of it being edited content! E.g., we found lifestyle Beckham images annotated as SPO, and even typos in IPTC category assignments! • User context – Classified 813 users into IPTC categories representing their main interest (based on Belga input about the users’ organizations)
  • 5. Filter on IPTC? //image[@IPTC eq "SPO"][about(., Beckham)] • Bad for recall: – Not all images have been assigned IPTC categories • Bad for precision: – Noisy assignment of IPTC categories to images • At least 4 of the top 10 SPO Beckham results do not show Beckham taking part in sporting activities
  • 6. Retrieval Model • Re-rank results based on cluster membership: score(d) = λ·ρ_d(q) + (1−λ) ∑_{c ∈ Clusters} ρ_c(q)·ρ_c(d), where ρ_d(q) estimates P(Q|D), ρ_c(q) estimates P(Q|c), and ρ_c(d) estimates P(D|c) – Modify scores based on the document’s context. Oren Kurland and Lillian Lee, ACM Transactions on Information Systems (TOIS), 27(3), 2009. • Novelty in Vitalas: – Modify scores based on the user’s context • Cluster formation based on user clicks • Cluster selection based on user context
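
The interpolation above can be sketched in a few lines of code. This is a minimal sketch, not the Vitalas implementation: the function name rerank, the data structures, and the assumption that P(Q|D), P(Q|c) and P(D|c) are already available (e.g., as NLLR-style normalised language-model scores) are all hypothetical.

    # Cluster-based re-ranking in the spirit of Kurland & Lee (2009):
    #   score(d) = lambda * P(Q|D) + (1 - lambda) * sum_c P(Q|c) * P(D|c)
    # All probabilities are assumed precomputed; names and structures are illustrative.
    def rerank(results, clusters, lam=0.1):
        """results:  dict doc_id -> P(Q|D) from the initial text run.
        clusters: list of dicts with 'query_score' (P(Q|c)) and
                  'doc_scores' (dict doc_id -> P(D|c))."""
        scores = {}
        for doc_id, p_q_d in results.items():
            cluster_term = sum(c["query_score"] * c["doc_scores"].get(doc_id, 0.0)
                               for c in clusters)
            scores[doc_id] = lam * p_q_d + (1.0 - lam) * cluster_term
        # lam = 0.0 ignores the initial text ranking entirely; lam = 0.1 keeps it
        # as a weak signal, matching the lambda settings explored on later slides.
        return sorted(scores, key=scores.get, reverse=True)
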
  • 7. Retrieval Model • Cluster formation: – IPTC-image categories; forms disjoint clusters – IPTC-user categories of users who clicked the image; gives overlapping clusters • Cluster selection: – {d∈c}: cluster contains document – {u∈c}: cluster/@category corresponds to user's interests
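
A sketch of the two cluster-formation strategies and the context-based selection, under stated assumptions: each image carries at most one IPTC category, the click log is a list of (user, image) pairs, and the user-to-IPTC mapping comes from the Belga classification of the 813 users. All field and function names are hypothetical.

    from collections import defaultdict

    def collection_clusters(images):
        """Disjoint clusters: group images by their single 'main' IPTC category."""
        clusters = defaultdict(set)
        for img in images:
            if img.get("iptc"):               # uncategorised images stay unclustered
                clusters[img["iptc"]].add(img["id"])
        return clusters

    def user_clusters(click_log, user_iptc):
        """Overlapping clusters: an image joins the cluster of the IPTC category
        of every user who clicked it (different users, different categories)."""
        clusters = defaultdict(set)
        for user_id, image_id in click_log:
            clusters[user_iptc[user_id]].add(image_id)
        return clusters

    def select_clusters(clusters, doc_id=None, user_category=None):
        """Cluster selection: keep clusters containing the document ({d in c})
        and/or whose category matches the user's interest ({u in c})."""
        return {cat: members for cat, members in clusters.items()
                if (doc_id is None or doc_id in members)
                and (user_category is None or cat == user_category)}
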
  • 8. Results on Click Prediction

                NDCG D   image clusters, λ =               user clusters, λ =
                         0.0     0.1     0.4     0.7       0.0     0.1     0.4     0.7
        ACE     0.1724   0.1423  0.1741  0.1721  0.1721    0.2070  0.1978  0.1767  0.1747
        EBF     0.5527   0.4744  0.5460  0.5497  0.5504    0.4882  0.5519  0.5509  0.5509
        EDU     0.0145   0.0163  0.0145  0.0145  0.0145    0.0165  0.0167  0.0155  0.0146
        HTH     0.1308   0.1347  0.1308  0.1308  0.1308    0.6342  0.3712  0.1934  0.1414
        HUM     0.1849   0.1612  0.1798  0.1772  0.1849    0.2109  0.2043  0.1776  0.1760
        LAB     0.1331   0.1543  0.1331  0.1331  0.1331    0.2164  0.2339  0.1817  0.1380
        LIF     0.1245   0.0888  0.1234  0.1233  0.1232    0.1894  0.1555  0.1121  0.1253
        POL     0.0723   0.0586  0.0704  0.0717  0.0721    0.1054  0.0990  0.0916  0.0769
        SOI     0.2880   0.1806  0.2883  0.2880  0.2880    0.2964  0.2970  0.2968  0.3008
        SPO     0.1811   0.1801  0.1809  0.1806  0.1807    0.2151  0.2005  0.1839  0.1820

    Related literature on evaluation methodology: Carterette and Jones, NIPS 2007; Carterette, Allan, and Sitaraman, SIGIR 2006.
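
The numbers above are nDCG scores computed against held-out user clicks. A minimal sketch of binary-gain nDCG over clicks follows; it is not the exact evaluation code used here (see Carterette and Jones, NIPS 2007, for the caveats of click-based evaluation), and the function name and cut-off are illustrative.

    import math

    def ndcg_at_k(ranking, clicked, k=10):
        """nDCG of a ranked list of doc ids against a set of clicked doc ids,
        using binary gains (clicked = relevant)."""
        docs = ranking[:k]
        dcg = sum(1.0 / math.log2(rank + 2)          # rank is 0-based
                  for rank, doc in enumerate(docs) if doc in clicked)
        idcg = sum(1.0 / math.log2(rank + 2)
                   for rank in range(min(len(clicked), k)))
        return dcg / idcg if idcg > 0 else 0.0
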
  • 9. No Adaptation “Greece”
  • 10. SPO Adaptation “Greece, collection-based clusters, λ=0.1”
  • 11. SPO Adaptation “Greece, collection-based clusters, λ=0.0”
  • 12. SPO Adaptation “Greece, user-based clusters, λ=0.1”
  • 13. SPO Adaptation “Greece, user-based clusters, λ=0.0”
  • 14. SPO Observations • Re-ranking pushes the sports-related images to the top – No more images about the forest fires – When λ=0.0 the initial retrieval score is not taken into account (the initial text ranking is ignored) • Minimal differences between collection-based and user-based cluster formation – The images archivists label as sports-related largely coincide with those clicked by users with sports-related interests
  • 15. POL Adaptation “Greece, collection-based clusters, λ=0.1”
  • 16. POL Adaptation “Greece, collection-based clusters, λ=0.0”
  • 17. POL Adaptation “Greece, user-based clusters, λ=0.1”
  • 18. POL Adaptation “Greece, user-based clusters, λ=0.0”
  • 19. POL Observations • Re-ranking for a politics context shows a difference in interpretation between the archivist and the user group – Archivists focussed on the actual political rallies etc. – Users focussed on the forest fires
  • 20. ACE Adaptation “Greece, collection-based clusters, λ=0.1”
  • 21. ACE Adaptation “Greece, collection-based clusters, λ=0.0”
  • 22. ACE Observations • Re-ranking for arts, culture and entertainment requires λ=0.0, to ignore the initial ranking and let the right images shine
  • 23. No Adaptation “Beckham”
  • 24. SPO Adaptation “Beckham, collection-based clusters, λ=0.1”
  • 25. SPO Adaptation “Beckham, collection-based clusters, λ=0.0”
  • 26. HUM Adaptation “Beckham, collection-based clusters, λ=0.1”
  • 27. Conclusions so far • Adaptation also retrieves images that have not been assigned an IPTC category, by considering clusters formed from the images clicked by users with the same interests • Alternative cluster formation approaches can be investigated, e.g., using visual features • The method is easily adapted for personalised and/or collaborative search
  • 28. Potential for Personalization • Which queries have the potential to benefit from context adaptation (personalisation)? • The ones for which different users click on different results – This can be studied by looking at the nDCG of one user’s ranking when another user’s clicks are taken as the ideal. Jaime Teevan, Susan T. Dumais and Eric Horvitz. Potential for Personalization. ACM Transactions on Computer-Human Interaction (ToCHI) special issue on Data Mining for Understanding User Needs, 17(1), March 2010. • Novel in Vitalas: compare IPTC-defined user groups (instead of individual users)
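
Following Teevan et al., this potential can be estimated by scoring one group's "ideal" ranking (its own clicked images) with another group's clicks as the ground truth: a low cross-group nDCG means the groups want different results, i.e., high potential. A self-contained sketch, assuming per-group click sets for a single query; all names are illustrative.

    import math
    from itertools import permutations

    def ndcg(ranking, relevant, k=10):
        """Binary-gain nDCG of a ranked list against a set of relevant (clicked) ids."""
        dcg = sum(1.0 / math.log2(i + 2) for i, d in enumerate(ranking[:k]) if d in relevant)
        idcg = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
        return dcg / idcg if idcg > 0 else 0.0

    def potential_for_personalization(group_clicks, k=10):
        """group_clicks: dict IPTC group -> set of image ids clicked for the query.
        Score each group's own clicks (in any order) against every other group's
        clicks; a low mean cross-group nDCG indicates high potential."""
        scores = [ndcg(sorted(group_clicks[a]), group_clicks[b], k)
                  for a, b in permutations(group_clicks, 2)]
        return sum(scores) / len(scores) if scores else 1.0

On this reading, "Dean" and "King Albert II" (high cross-group nDCG) leave little room for adaptation, while "Greece" (low nDCG) leaves a lot, as the next slides show.
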
  • 29. P4P in Belga 100K
  • 30. P4P in Belga 100K – High nDCG, low potential: Dean (0.8067), King Albert II (0.7810) – Low nDCG, high potential: Greece (0.3910)
  • 31. No Adaptation “King Albert II”
  • 32. EBF Adaptation “King Albert II”
  • 33. POL Adaptation “King Albert II”
  • 34. No Adaptation “Dean”
  • 35. ACE Adaptation “Dean, user-based clusters”
  • 36. ACE Adaptation “Dean, collection-based clusters”
  • 37. Dean: Temporal Effect • Log files: “Dean” = “Hurricane Dean” • Still, the query is quite ambiguous: – James Dean – Agyness Dean (a model) – a (university) dean – Dean Dealannoi – Howard Dean – Dean Martin • Context adaptation for “Dean” requires the archivist
  • 38. Future Work • Address various normalization issues – In context adaptation (due to NLLR approximation) – In “potential for personalization”/adaptation • Explore temporal dimension – Combinations of collection and user context? • Explore cross-media cluster-based retrieval – Use visual features in cluster formation
  • 39. See also “CWI” Vitalas demonstrations: http://www.ins.cwi.nl/projects/M4/vitalas/ Collection context instead of user context: http://www.ins.cwi.nl/projects/M4/vitalas/context_adaptation.html Detectors trained by query log: http://olympus.ee.auth.gr/diou/civr2009/