Twente ir-course 20-10-2010

Guest Lecture for MSc Information Retrieval course, October 20th, 2010, University of Twente.

Speaker notes:
  • Explicit/implicit refers to whether a user's action translates into explicit or implicit evidence of relevance. Explicit evidence of relevance is when the user says a document is relevant. Implicit evidence of relevance is when the user does not say that something is relevant, but his actions/behaviour (e.g., clicking, viewing) indicate that to some degree the user finds it relevant. When a Belga user buys an image, he may not have said explicitly that it is relevant, but this action comes very close to that.
  • Belga's logs currently include only implicit evidence, whereas the VITALAS logs will include both implicit and explicit evidence.
  • Maybe “personalisation” should also become context adaptation, so as to be consistent with IP3 and not get confused with the personalisation in WP5.
  • Twente ir-course 20-10-2010

    1. 1. How search logs can help improve future searches Arjen P. de Vries arjen@acm.org
    2. 2. [Diagram: User, Content, and Metadata, connected by Content Indexing, Interactions among users, and Interaction with Content]
    3. 3. [Diagram: User, Content, and Metadata, connected by Content Indexing and Interaction with Content]
    4. 4. (C) 2008, The New York Times Company. Anchor text: “continue reading”
    5. 5. Not much text to get you here... A fan’s hyves page: Kyteman's HipHop Orchestra: www.kyteman.com • Ticket sales, Luxor theatre: May 22nd - Kyteman's hiphop Orchestra - www.kyteman.com • Kluun.nl: The site of Kyteman • Blog Rockin’ Beats: The 21-year-old Kyteman (trumpet player, composer and producer Colin Benders) has worked for 3 years on his debut: The Hermit Sessions. • Jazzenzo: ... a performance by the popular Kyteman’s Hiphop Orkest
    6. 6. ‘Co-creation’ • Social Media: the consumer becomes a co-creator • ‘Data consumption’ traces • In essence: many new sources to play the role of anchor text!
    7. 7. Tweets about blip.tv • E.g.: http://blip.tv/file/2168377 • Amazing • Watching “World’s most realistic 3D city models?” • Google Earth/Maps killer • Ludvig Emgard shows how maps/satellite pics on web is done (learn Google and MS!) • and ~120 more Tweets
    8. 8. [Diagram: information need representation; result representations and clicks; anchor text and web links; relevance feedback] Every search request is metadata! That metadata is useful as expanded content representation, to capture more diverse views on the same content and reduce the vocabulary difference between creators of content, indexers, and users; as a means to adapt retrieval systems to the user context; and even as training data for machine learning of multimedia ‘detectors’!
    9. 9. Types of feedback • Explicit user feedback: images/videos marked as relevant/non-relevant; selected keywords that are added to the query; selected concepts that are added to the query • Implicit user feedback: clicking on retrieved images/videos (click-through data); bookmarking or sharing an image/video; downloading/buying an image/video
    10. 10. Who interacts with the data? • Interactive relevance feedback: the current user in the current search • Personalisation: the current user in logged past searches • Context adaptation: users similar to the current user in logged past searches • Collective knowledge: all users in logged past searches
    11. 11. Applications exploiting feedback • Given a query, rank all images/videos based on past users’ feedback • Given an image/video, rank all images/videos based on past users’ feedback
    12. 12. Applications exploiting feedback • Interactive relevance feedback: modify the query and re-rank, based on the current user's explicit feedback (and the current ranking) • Blind relevance feedback: modify the query and re-rank, based on feedback by past users and the current ranking (a code sketch follows the transcript)
    13. 13. Applications exploiting feedback • Query suggestion: recommend keywords/concepts to support users in interactive query modification (refinement or expansion); a code sketch follows the transcript
    14. 14. ‘Police Sting’: Sting performs with The Police • ‘Elton Diana’: Sting attends Versace memorial service • ‘Led Zeppelin’: Sting performs at Led Zeppelin concert
    15. 15. Exploiting User Logs (FP6 VITALAS T4.2) • Aim: understand the information-searching process of professional users of a picture portal • Method: building, in collaboration with Belga, an increasingly large dataset that contains the log of Belga's users' search interactions; processing, analysing, and investigating the use of this collective knowledge stored in search logs in a variety of tasks
    16. 16. Search logs • Search logs in VITALAS: searches performed by users through Belga's web interface from 22/06/2007 to 12/10/2007 (101 days) • 402,388 tuples <date, time, userid, action>: "SEARCH_PICTURES" (138,275) | "SHOW_PHOTO" (192,168) | "DOWNLOAD_PICTURE" (38,070) | "BROWSE_GALLERIES" (8,878) | "SHOW_GALLERY" (24,352) | "CONNECT_IMAGE_FORUM" (645) • 17,861 unique (‘lightly normalised’) queries • 96,420 clicked images • For comparison, web image search (Craswell and Szummer, 2007): the pruned graph has 1.1 million edges, 505,000 URLs and 202,000 queries • A sketch of loading such log tuples follows the transcript
    17. 17. Search Logs Analysis
    18. 18. Clijsters Henin
    19. 19. What could we learn? • Goals: what do users search for? • User context: how do professionals search image archives, compared to the average user? • Query modifications: how do users reformulate their queries within a search session?
    20. 20. Professionals search longer
    21. 21. Semantic analysis • Most studies investigate search logs at the syntactic (term-based) level • Our idea: map the term occurrences into linked open data (LOD)
    22. 22. Semantic Log Analysis • Method: map queries into the linked data cloud, find ‘abstract’ patterns, and re-use those for query suggestion, e.g.: A and B play-soccer-in-team X; A is-spouse-of B (a code sketch follows the transcript) • Advantages: reduces sparseness of the raw search log data; provides higher-level insights into the data (the right mix of statistics and semantics?); overcomes the query drift problem of thesaurus-based query expansion
    23. 23. Assign Query Types
    24. 24. Detect High-level Relations…
    25. 25. … transformed into modification patterns
    26. 26. Implications • Guide the selection of the ontologies/lexicons/etc. most suited for your user population • Distinguish between successful and unsuccessful queries when making search suggestions • Improve session boundary detection
    27. 27. Finally… a ‘wild idea’ • Image data is seldom annotated adequately, i.e., adequately to support search • Automatic image annotation or ‘concept detection’: supervised machine learning; requires labelled samples as training data, a laborious and expensive task
    28. 28. FP6 VITALAS IP • Phase 1 – collect training data • Select ~500 concepts with the collection owner • Manually select ~1000 positive and negative examples for each concept
    29. 29. How to obtain training data? • Can we use click-through data instead of manually labelled samples? • Advantages: large quantities, no user intervention, collective assessments • Disadvantages: noisy & sparse; queries not based on strict visual criteria
    30. 30. Automatic Image Annotation • Research questions: how to annotate images with concepts using click-through data? How reliable are click-through-data-based annotations? What is the effectiveness of these annotations as training samples for concept classifiers?
    31. 31. Manual annotations (per concept):
                  annotations   positive samples   negative samples
        MEAN      1020.02       89.44              930.57
        MEDIAN    998           30                 970
        STDEV     164.64        132.84             186.21
    32. 32. Manual vs. search-log-based annotations
    33. 33. 1. How to annotate? (1/4) • Use the queries for which images were clicked • Challenges: inherent noise, a gap between queries/captions and concepts (queries describe the content+context of the images to be retrieved; clicked images are retrieved using their captions, i.e. content+context; concept-based annotations are based on visual, content-only criteria) • Sparsity: clicks only cover the part of the collection previously accessed • Mismatch between terms in concept descriptions and queries
    34. 34. How to annotate (2/4) • Basic ‘global’ method (sketched in code after the transcript): given the keywords of a query Q, find the query Q' in the search logs that is most textually similar to Q; find the images I clicked for Q'; find the queries Q'' for which these images have been clicked; rank the queries Q'' based on the number of images clicked for them
    35. 35. How to annotate (3/4) • Exact: images clicked for queries exactly matching the concept name; example: 'traffic' -> 'traffic jam', 'E40', 'vacances', 'transport' • Search-log-based image representations: images represented by all queries for which they have been clicked; retrieval based on language models (smoothing, stemming); example: 'traffic' -> 'infrabel', 'deutsche bahn', 'traffic lights' • Random walks over the click graph (a code sketch follows the transcript); example: 'hurricane' -> 'dean', 'mexico', 'dean haiti', 'dean mexico'
    36. 36. How to annotate (4/4) • Local method: given the keywords of a query Q and its top-ranked images, find the queries Q'' for which these images have been clicked; rank the queries Q'' based on the number of images clicked for them
    37. 37. 2. Reliability • Compare the agreement of click-through-based annotations with manual ones, examining the 111 VITALAS concepts with at least 10 images (for at least one of the methods) in the overlap of clicked and manually annotated images • Levels of agreement vary greatly across concepts • 20% of concepts per method reach agreement of at least 0.8 • What type of concepts can be reliably annotated using click-through data? The defined categories (activities, animals, events, graphics, people, image_theme, objects, setting/scene/site) are not informative • Possible future research on types of concepts: named entities? specific vs. broad?
    38. 38. 3. Effectiveness (1/3) • Train classifiers for each of 25 concepts • Positive samples: images selected by each method • Negative samples: selected by randomly sampling the 100k set, excluding images already selected as positive samples • Low-level visual features FW: texture description (integrated Weibull distribution) extracted from overlapping image regions • Low-level textual features FT: a vocabulary of the most frequent caption terms is built for each concept; each image caption is compared against each concept vocabulary, and a frequency histogram is built for each concept • SVM classifiers with RBF kernel (and cross-validation); a code sketch follows the transcript
    39. 39. 3. Effectiveness (2/3) • Experiment 1 (visual features): training on search-log-based annotations; test set for each concept: manual annotations (~1000 images) • Feasibility study: in most cases, AP considerably higher than the prior
    40. 40. 3. Effectiveness (3/3) • Experiments 2, 3, 4 (visual or textual features): Experiment 2 trains on search-log-based annotations; Experiment 3 on manual + search-log-based annotations; Experiment 4 on manual annotations • Common test set: 56,605 images (a subset of the 100,000-image collection) • The contribution of search-log-based annotations to training is positive, particularly in combination with manual annotations
    41. 41. Example: Soccer • Manually annotated positive samples vs. search-log-based annotated positive samples, and test set results • View results at: http://olympus.ee.auth.gr/~diou/searchlogs/
    42. 42. Paris
    43. 43. or... Paris
    44. 44. Diversity from User Logs • Present the different query variants' clicked images in a clustered view • Merge the different query variants' clicked images in a round-robin fashion into one list (CLEF); a code sketch follows the transcript
    45. 45. ImageCLEF 'Olympics' clusters: Olympic games, Olympic torch, Olympic village, Olympic rings, Olympic flag, Olympic Belgium, Olympic stadium, Other
    46. 46. ImageCLEF 'Tennis'
    47. 47. ImageCLEF Findings • Many queries (>20%) without clicked images • The corpus and the available logs originated from different time frames
    48. 48. ImageCLEF Findings • Best results combine text search in metadata with image click data, for the topic title and each of the cluster titles • Using query variants derived from the logs increases recall by 50-100% • However, this also causes topic drift and reduced early precision
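
The sketch below illustrates the blind relevance feedback idea from slide 12: re-rank the current result list using clicks that past users made for the same query. The log format, the exact-match query lookup, and the interpolation weight alpha are illustrative assumptions, not the system described in the talk.

    # Hedged sketch of blind relevance feedback from past clicks (slide 12).
    from collections import defaultdict

    def rerank_with_past_clicks(query, ranked_images, click_log, alpha=0.7):
        """Combine the original rank score with a click-popularity score.

        ranked_images: list of (image_id, score) from the current ranking, best first.
        click_log: iterable of (query, image_id) pairs from past sessions (assumed format).
        """
        clicks = defaultdict(int)
        for q, img in click_log:
            if q == query:                      # exact query match; could be relaxed to similar queries
                clicks[img] += 1
        max_clicks = max(clicks.values(), default=1)
        rescored = [
            (img, alpha * score + (1 - alpha) * clicks[img] / max_clicks)
            for img, score in ranked_images
        ]
        return sorted(rescored, key=lambda x: x[1], reverse=True)

    # Example: images clicked for 'sting' in past sessions float towards the top.
    log = [("sting", "img2"), ("sting", "img2"), ("sting", "img3")]
    print(rerank_with_past_clicks("sting", [("img1", 0.9), ("img2", 0.8), ("img3", 0.7)], log))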
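
A minimal sketch of the query-suggestion idea from slide 13, assuming the click log is available as (query, image_id) pairs: queries that led to clicks on the same images as the current query are proposed, weighted by the number of shared clicked images.

    # Query suggestion via co-clicked images (illustrative, not the VITALAS implementation).
    from collections import defaultdict, Counter

    def suggest_queries(query, click_log, k=5):
        """click_log: iterable of (query, image_id) pairs; returns up to k related queries."""
        images_per_query = defaultdict(set)
        queries_per_image = defaultdict(set)
        for q, img in click_log:
            images_per_query[q].add(img)
            queries_per_image[img].add(q)
        related = Counter()
        for img in images_per_query[query]:
            for other in queries_per_image[img]:
                if other != query:
                    related[other] += 1        # weight by number of shared clicked images
        return [q for q, _ in related.most_common(k)]

    log = [("police sting", "i1"), ("sting", "i1"), ("sting", "i2"), ("led zeppelin", "i2")]
    print(suggest_queries("sting", log))   # ['police sting', 'led zeppelin']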
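
A sketch of how log tuples like those on slide 16 could be loaded and summarised. The file name, column order, and tab separator are assumptions for illustration; only the <date, time, userid, action> structure comes from the slide.

    # Summarise a Belga-style search log (assumed tab-separated layout).
    from collections import Counter
    import csv

    def summarise_log(path="belga_search_log.tsv"):
        """Count actions and collect normalised queries from a tab-separated log.

        Assumed layout per line: date, time, userid, action, payload,
        where payload holds the query string for SEARCH_PICTURES actions.
        """
        action_counts = Counter()
        queries = Counter()
        with open(path, newline="", encoding="utf-8") as f:
            for date, time, userid, action, payload in csv.reader(f, delimiter="\t"):
                action_counts[action] += 1
                if action == "SEARCH_PICTURES":
                    # 'light normalisation' as hinted on the slide: lowercase, strip, collapse spaces
                    queries[" ".join(payload.lower().split())] += 1
        return action_counts, queries

    if __name__ == "__main__":
        actions, queries = summarise_log()
        print(actions.most_common())
        print(f"{len(queries)} unique (lightly normalised) queries")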
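
The following sketch illustrates the semantic log analysis step from slides 21-25: query terms are mapped onto entities and types so that term-level query modifications can be counted as type-level patterns. The tiny lookup table stands in for a real linked-open-data source and is purely hypothetical.

    # Abstract queries into type-level patterns (toy stand-in for a LOD lookup).
    TOY_KB = {
        "clijsters": ("Kim Clijsters", "TennisPlayer"),
        "henin": ("Justine Henin", "TennisPlayer"),
        "sting": ("Sting", "Musician"),
        "police": ("The Police", "Band"),
    }

    def abstract_query(query):
        """Replace known entity mentions by their types; keep other terms as literals."""
        return tuple(TOY_KB[term][1] if term in TOY_KB else term
                     for term in query.lower().split())

    def modification_pattern(query, next_query):
        """A (very) coarse modification pattern: the pair of abstracted queries in a session."""
        return abstract_query(query), abstract_query(next_query)

    # 'clijsters' -> 'henin' abstracts to a TennisPlayer -> TennisPlayer rewrite, so such
    # rewrites can be counted together instead of as unrelated term-level changes.
    print(modification_pattern("clijsters", "henin"))
    print(modification_pattern("sting", "police"))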
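
A sketch of the 'global' annotation method as listed on slide 34, under the assumption that textual similarity is simple token overlap and that the click log is a list of (query, image_id) pairs; the real method presumably uses a more refined similarity. The local method on slide 36 differs only in starting from the query's own top-ranked images instead of a similar logged query.

    # Global method: concept query -> most similar logged query -> its clicked images
    # -> co-clicked queries ranked by shared images (illustrative only).
    from collections import defaultdict, Counter

    def global_method(concept_query, click_log):
        images_per_query = defaultdict(set)
        queries_per_image = defaultdict(set)
        for q, img in click_log:
            images_per_query[q].add(img)
            queries_per_image[img].add(q)

        # 1. find the logged query Q' most textually similar to Q (here: token overlap)
        tokens = set(concept_query.lower().split())
        q_prime = max(images_per_query,
                      key=lambda q: len(tokens & set(q.lower().split())))

        # 2. images I clicked for Q'
        clicked = images_per_query[q_prime]

        # 3./4. queries Q'' that led to clicks on those images, ranked by shared clicked images
        ranking = Counter()
        for img in clicked:
            for q2 in queries_per_image[img]:
                ranking[q2] += 1
        return q_prime, clicked, ranking.most_common()

    log = [("traffic jam", "i1"), ("traffic jam", "i2"), ("e40", "i1"), ("transport", "i2")]
    print(global_method("traffic", log))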
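
A sketch of a random walk over the bipartite query-image click graph mentioned on slide 35 (in the spirit of Craswell and Szummer, 2007). The self-loop probability, uniform transitions, and number of steps are illustrative choices, not the parameters used in the work.

    # Spread probability mass from a seed query over the click graph.
    from collections import defaultdict

    def click_walk(seed_query, click_log, steps=3, stay=0.1):
        """click_log: iterable of (query, image_id) pairs; nodes are queries and images.
        Returns a dict of node -> probability mass after `steps` walk steps."""
        neighbours = defaultdict(list)
        for q, img in click_log:
            neighbours[q].append(img)
            neighbours[img].append(q)

        prob = {seed_query: 1.0}
        for _ in range(steps):
            nxt = defaultdict(float)
            for node, p in prob.items():
                nxt[node] += stay * p                       # small probability of staying put
                out = neighbours[node]
                for nb in out:
                    nxt[nb] += (1 - stay) * p / len(out)    # uniform step to a clicked neighbour
            prob = dict(nxt)
        return prob

    log = [("hurricane", "i1"), ("dean", "i1"), ("dean mexico", "i1"), ("dean", "i2")]
    print(sorted(click_walk("hurricane", log).items(), key=lambda kv: -kv[1]))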
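
A minimal sketch of the classifier setup on slide 38: an SVM with an RBF kernel and cross-validated parameter selection, trained on positive samples from an annotation method and randomly sampled negatives. The random feature vectors below merely stand in for the Weibull texture (FW) and caption-term (FT) features, and scikit-learn is an assumed tool choice, not necessarily what the project used.

    # RBF-kernel SVM with cross-validated C/gamma, evaluated with average precision (AP).
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV
    from sklearn.metrics import average_precision_score

    rng = np.random.default_rng(0)
    X_pos = rng.normal(1.0, 1.0, size=(100, 20))   # stand-in features for positive samples
    X_neg = rng.normal(0.0, 1.0, size=(400, 20))   # random negatives (as on the slide)
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * len(X_pos) + [0] * len(X_neg))

    grid = GridSearchCV(SVC(kernel="rbf", probability=True),
                        {"C": [1, 10], "gamma": ["scale", 0.1]},
                        cv=3, scoring="average_precision")
    grid.fit(X, y)

    # Evaluate with AP, the measure reported on slides 39-40, on held-out synthetic data.
    X_test = np.vstack([rng.normal(1.0, 1.0, size=(50, 20)),
                        rng.normal(0.0, 1.0, size=(200, 20))])
    y_test = np.array([1] * 50 + [0] * 200)
    scores = grid.predict_proba(X_test)[:, 1]
    print("AP:", average_precision_score(y_test, scores))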
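
A sketch of the round-robin merge from slide 44: the clicked images of the different query variants (such as the 'Olympics' clusters on slide 45) are interleaved into one diversified result list. The input format is an assumption.

    # Interleave per-variant clicked-image lists into one de-duplicated, diversified list.
    from itertools import zip_longest

    def round_robin_merge(variant_results):
        """variant_results: {query_variant: [image_id, ...]} -> one merged list."""
        merged, seen = [], set()
        for round_ in zip_longest(*variant_results.values()):
            for img in round_:
                if img is not None and img not in seen:
                    seen.add(img)
                    merged.append(img)
        return merged

    variants = {
        "olympic torch": ["t1", "t2"],
        "olympic rings": ["r1"],
        "olympic stadium": ["s1", "s2", "s3"],
    }
    print(round_robin_merge(variants))   # ['t1', 'r1', 's1', 't2', 's2', 's3']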
