Search logs from user interactions with image archives can be analyzed and utilized in three ways:
1. To understand user search behavior and how professional users search differently than average users.
2. As training data to automatically annotate images with concepts using similar queries and clicked images, though reliability varies by concept.
3. As additional positive training samples to improve automated image classification systems, especially when combined with manually annotated samples.
We have built an online Movie Recommender System based on the analysis of users' rating histories for several movies, together with their demographic information. We used data from the MovieLens website. Collaborative filtering and matrix factorization techniques were used for the implementation. The end result is a web application that recommends the top 20 movies to a user.
Codebase: http://goo.gl/nM7RMy
Demo Video: http://goo.gl/VgZ2uI
[RIIT 2017] Identifying Grey Sheep Users By The Distribution of User Similarities - Yong Zheng
Yong Zheng, Mayur Agnani, Mili Singh. “Identifying Grey Sheep Users By The Distribution of User Similarities In Collaborative Filtering”. Proceedings of The 6th ACM Conference on Research in Information Technology (RIIT), Rochester, NY, USA, October, 2017
This slide deck provides a survey of research in the field of coreference resolution. We survey 10 significant research papers, provide a detailed description of the problem, and suggest future research directions.
Email Classification - Why Should it Matter to You? - Sherpa Software
In this white paper, learn the basics of email classification: what it is, why it can assist your overall email management strategy, and how to accomplish it.
Exploiting User Comments for Audio-visual Content Indexing and Retrieval (ECIR) - Carsten Eickhoff
State-of-the-art content sharing platforms often require users to assign tags to pieces of media in order to make them easily retrievable. Since this task is sometimes perceived as tedious or boring, annotations can be sparse. Commenting on the other hand is a frequently used means of expressing user opinion towards shared media items. We propose the use of time series analyses in order to infer potential tags and indexing terms for audio-visual content from user comments. In this way, we mitigate the vocabulary gap between queries and document descriptors. Additionally, we show how large-scale encyclopedias such as Wikipedia can aid the task of tag prediction by serving as surrogates for high-coverage natural language vocabulary lists. Our evaluation is conducted on a corpus of several million real-world user comments from the popular video sharing platform YouTube, and demonstrates significant improvements in retrieval performance.
This work together with Wen Li and Arjen P. de Vries has been accepted for full oral presentation at the 35th European Conference on Information Retrieval (ECIR) in Moscow, Russia. The full version of the article is available at: http://link.springer.com/chapter/10.1007/978-3-642-36973-5_4
Better Contextual Suggestions by Applying Domain Knowledge - Arjen de Vries
A talk summarizing the main lessons from the CWI participation in the 2014 TREC Contextual Suggestions track. If you want to suggest tourist locations, use tourist sources. If you want reproducible research results, map these into ClueWeb first.
Looking beyond plain text for document representation in the enterprise - Arjen de Vries
In many real life scenarios, searching for information is not the user's end goal. In this presentation I look into the specific example of corporate strategy and business development in a university setting.
In today's academic institutions, strategic questions are those that relate to dependency on funding instruments, the public-private partnerships that exist (and those that should be extended!), and the match between topic areas addressed by the research staff and those claimed important by policy makers. The professional search tasks encountered to answer questions in this domain are usually addressed by business intelligence (BI) tools, and not by search engines. However, professionals are known to be busy people inspired by their own research interests, and not particularly fond of keeping the customer relationship management (CRM) or knowledge management systems up to date for the organisation's strategic interest. This then results in incomplete and inaccurate data.
Instead of requiring research staff (or their administrative support) to provide this management information, I will illustrate by example how the desired information usually exists already in the documents inherent to the academic work process. Information retrieval could thus play an important role in the computer systems that support the business analytics involved, and could significantly improve the coverage of entities of interest - i.e., to reduce the effort involved in achieving good recall in business analytics. The ranking functionality over the enterprise's (textual) content should however not be an isolated component. Our example setting integrates the information derived from research proposals, research publications and the financial systems, providing an excellent motivation for a more unified approach to structured and unstructured data.
Opening statement at the "Looking forward" panel at the 25 years of TREC celebration event, Nov 15th, 2016.
Webcast to appear within a week: https://www.nist.gov/news-events/events/2016/11/webcast-text-retrieval-conference
Models for Information Retrieval and Recommendation - Arjen de Vries
Online information services personalize the user experience by applying recommender systems to identify the information that is most relevant to the user. The question of how to estimate relevance has been the core concept in the field of information retrieval for many years. Not so surprisingly, then, it turns out that the methods used in online recommendation systems are closely related to the models developed in the information retrieval area. In this lecture, I present a unified approach to information retrieval and collaborative filtering, and demonstrate how this lets us turn a standard information retrieval system into a state-of-the-art recommendation system.
Huygens colloquium at Radboud University Science Faculty.
Effective web search engines (and the commercial success of a few internet giants) depend upon the data collected from the online seeking behaviour of huge numbers of users. Put differently, the high-quality search results we take for granted every day come at the price of reduced privacy.
A personal search engine would not only search the web, but also rich personal data including email, browsing history, documents read and contents of the user’s home directory. Results with so-called "slow search" indicate that the user experience can be improved significantly when the search engine gains access to additional data. However, will we be prepared to give up even more of our privacy, and eventually be prepared to give up control over all that personal information?
My proposal is to mitigate these concerns by developing a new architecture for web search, in which users control the trade-off between search result quality and the privacy risk inherent to sharing usage logs. Under this design, all data of the "personal search engine" (PSE), both web and usage data, resides in its owner's personal digital infrastructure.
Two challenges need to be overcome to turn this into a viable alternative. Can we compensate for the loss of information about searches of large numbers of users? And, can we maintain an up-to-date index in a cost-effective manner? As a solution, I propose to organise personal search engines in a decentralised social network. This serves two goals: the index can be kept up-to-date collaboratively, and usage data may be traded with peers.
Query by Example of Speaker Audio Signals using Power Spectrum and MFCCs - IJECEIAES
"Search engine" is the popular term for an information retrieval (IR) system. Typically, search engines are based on full-text indexing. Moving from text data to multimedia data types makes the information retrieval process more complex, for example when retrieving images or sounds from large databases. This paper introduces the use of language- and text-independent speech as input queries to a large sound database, using a speaker identification algorithm. The method consists of two main processing steps: first, vocal and non-vocal segments are separated; the vocal segments are then used for speaker identification, enabling audio query by speaker voice. For speaker identification and audio query processing, we estimate the similarity of the example signal and the samples in the queried database by calculating the Euclidean distance between the Mel-frequency cepstral coefficients (MFCCs) and the energy spectrum of the acoustic features. Simulations show good performance at a sustainable computational cost, with an average accuracy rate above 90%.
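The core matching step described above (MFCC-style features compared by Euclidean distance) can be sketched as follows. This is a simplified, self-contained illustration, not the paper's implementation: the sample rate, frame size, filter count, and coefficient count are assumed values, and the filterbank is a bare-bones mel approximation.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    inv_mel = lambda m: 700 * (10 ** (m / 2595) - 1)
    pts = inv_mel(np.linspace(mel(0), mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def mfcc_like(signal, sr=16000, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    # Frame the signal, window it, and take the power spectrum
    frames = [signal[i:i + n_fft] * np.hamming(n_fft)
              for i in range(0, len(signal) - n_fft, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    fb = mel_filterbank(n_filters, n_fft, sr)
    energies = np.log(power @ fb.T + 1e-10)
    # DCT-II over the log filterbank energies, keeping n_coeffs coefficients
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return energies @ dct.T

def euclidean_distance(sig_a, sig_b, sr=16000):
    # Compare the mean MFCC vectors of the two signals
    a = mfcc_like(sig_a, sr).mean(axis=0)
    b = mfcc_like(sig_b, sr).mean(axis=0)
    return float(np.linalg.norm(a - b))

# Example: two synthetic tones as stand-ins for real speech signals
sr = 16000
t = np.arange(sr) / sr
tone_a = np.sin(2 * np.pi * 440 * t)
tone_b = np.sin(2 * np.pi * 880 * t)
d_same = euclidean_distance(tone_a, tone_a, sr)
d_diff = euclidean_distance(tone_a, tone_b, sr)
```

In a query-by-example setting, the query signal's distance would be computed against every database entry and the smallest distances returned.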
Improving Semantic Search Using Query Log Analysis - Stuart Wrigley
Despite the attention Semantic Search is continuously gaining, several challenges affecting tool performance and user experience remain unsolved. Among these are: matching user terms with the search space, adopting view-based interfaces in the Open Web, and supporting users while building their queries. This paper proposes an approach to move a step forward towards tackling these challenges by creating models of usage of Linked Data concepts and properties extracted from semantic query logs as a source of collaborative knowledge. We use two sets of query logs from the USEWOD workshops to create our models and show the potential of using them in the mentioned areas.
The diversity and complexity of contents available on the web have dramatically increased in recent years. Multimedia content such as images, videos, maps, and voice recordings has been published more often than before. Document genres have also been diversified, for instance: news, blogs, FAQs, wikis. These diversified information sources are often dealt with in a separated way. For example, in web search, users have to switch between search verticals to access different sources. Recently, there has been a growing interest in finding effective ways to aggregate these information sources so as to hide the complexity of the information spaces from users searching for relevant information. For example, so-called aggregated search, investigated by the major search engine companies, provides search results from several sources in a single result page. Aggregation itself is not a new paradigm; for instance, aggregate operators are common in database technology.
This talk presents the challenges faced by the likes of web search engines and digital libraries in providing the means to aggregate information from several complex information spaces in a way that helps users in their information seeking tasks. It also discusses how other disciplines, including databases, artificial intelligence, and cognitive science, can be brought into building effective and efficient aggregated search systems.
Multimedia content-based retrieval - govintech1
Information retrieval for text and multimedia content has become an important research area. Content-based retrieval in multimedia is a challenging problem, since multimedia data needs detailed interpretation from pixel values. In this presentation, an overview of content-based retrieval is presented along with the different strategies in terms of syntactic and semantic indexing for retrieval. The matching techniques used and learning methods employed are also analyzed.
Discovering Users' Topics of Interest in Recommender Systems - Gabriel Moreira
This talk introduces the main techniques of Recommender Systems and Topic Modeling.
Then, we present a case of how we've combined those techniques to build Smart Canvas (www.smartcanvas.com), a service that allows people to bring, create and curate content relevant to their organization, and also helps to tear down knowledge silos.
We present some of the Smart Canvas features powered by its recommender system, such as:
- Highlight relevant content, explaining to users which of their topics of interest have generated each recommendation.
- Associate tags with users' profiles based on topics discovered from content they have contributed. These tags become searchable, allowing users to find experts or people with specific interests.
- Recommend people with similar interests, explaining which topics bring them together.
We give a deep dive into the design of our large-scale recommendation algorithms, giving special attention to our content-based approach that uses topic modeling techniques (like LDA and NMF) to discover people’s topics of interest from unstructured text, and social-based algorithms using a graph database connecting content, people and teams around topics.
Our typical data pipeline includes the ingestion of millions of user events (using Google PubSub and BigQuery), the batch processing of the models (with PySpark, MLlib, and Scikit-learn), the online recommendations (with Google App Engine, Titan Graph Database and Elasticsearch), and the data-driven evaluation of UX and algorithms through A/B testing experimentation. We also touch on non-functional requirements of a software-as-a-service, like scalability, performance, availability, reliability and multi-tenancy, and how we addressed them in a robust architecture deployed on Google Cloud Platform.
Web Archives and the dream of the Personal Search Engine - Arjen de Vries
Keynote at the 4th Alexandria Workshop organised by Avishek Anand and Wolfgang Nejdl, L3S, Hannover (Germany). I argue that Web Archives should act as a pivot while revisiting the idea of decentralised search.
See also http://alexandria-project.eu/events/4th-int-alexandria-workshop-19-20-october-2017/
Lecture on Information Retrieval and Social Media, given to PhD students in the User-Centred Social Media Summer School, in Duisburg, September 19, 2017.
See also https://www.ucsm.info/events/118-new-frontiers-in-social-media-research-%E2%80%93-international-summer-school-2018
Recommender systems aim to predict the content that a user would like based on observations of the online behaviour of its users. Research in the Information Access group addresses different aspects of this problem, varying from how to measure recommendation results, how recommender systems relate to information retrieval models, and how to build effective recommender systems (note: last Friday, we won the ACM RecSys 2013 News Recommender Systems challenge). We would like to develop a general methodology to diagnose weaknesses and strengths of recommender systems. In this talk, I discuss the initial results of an analysis of the core component of collaborative filtering recommenders: the similarity metric used to find the most similar users (neighbours) that will provide the basis for the recommendation to be made. The purpose is to shed light on the question why certain user similarity metrics have been found to perform better than others. We have studied statistics computed over the distance distribution in the neighbourhood as well as properties of the nearest neighbour graph. The features identified correlate strongly with measured prediction performance - however, we have not yet discovered how to deploy this knowledge to actually improve recommendations made.
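The kind of analysis described (user-user similarity metrics and statistics over the neighbourhood's similarity distribution) can be sketched on a toy ratings matrix. The data, the cosine metric, and the choice of k below are illustrative assumptions, not the talk's actual experimental setup.

```python
import numpy as np

def user_similarity(R, u, v):
    # Cosine similarity over items co-rated by users u and v (0 = unrated)
    mask = (R[u] > 0) & (R[v] > 0)
    if not mask.any():
        return 0.0
    a, b = R[u][mask], R[v][mask]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def neighbourhood_stats(R, u, k=2):
    # Similarities of user u to all other users, then stats over the top-k
    # nearest neighbours -- the distribution the talk proposes to study
    sims = sorted((user_similarity(R, u, v) for v in range(len(R)) if v != u),
                  reverse=True)[:k]
    return float(np.mean(sims)), float(np.std(sims))

# Toy ratings matrix: rows = users, columns = items, 0 = unrated
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]])
mean_sim, std_sim = neighbourhood_stats(R, 0)
```

Features such as `mean_sim` and `std_sim` per user are examples of the neighbourhood statistics that can then be correlated with measured prediction performance.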
Social media sites (referred to by some as Web 2.0) allow their users to interact with each other, for example in collecting and sharing so-called user-generated content - these can be just bookmarks, but also blogs, images, and videos. Social media support co-creation: processes where customers (or users, if you prefer) do not just consume but play an active role in defining and shaping the end product. Famous examples include Six Degrees, LiveJournal, Digg, Epinions, Myspace, Flickr, YouTube, LinkedIn, and Pinterest. Of course, today's internet giants Facebook and Twitter are key new developments. Finally, Wikipedia should not be overlooked - a major resource in many language technologies, including information retrieval!
The second part of the lecture looks into the opportunities for information retrieval research. Social media platforms tend to provide access to user profiles, connections between users, the content these users publish or share, and how they react to each other's content through commenting and rating. Also, the large majority of social media platforms allow their users to categorize content by means of tags (or, in direct communication, through hash-tags), resulting in collaborative ways of information organization known as folksonomies. However, these social media also form a challenge for information retrieval research: the many platforms vary in functionalities, and we have only very little understanding of clearly desirable features like combining tag usage and ratings in content recommendation! A unifying approach based on random walks will be discussed to illustrate how we can answer some of these questions [1], but clearly the area has ample opportunity to leave your own marks.
In the final part of the lecture I will briefly touch upon an even wider range of opportunities, where data derived from social media form a key component to enable new research and insights. I will review a few important results from research centered on Wikipedia, facebook and twitter data, as well as a diverse range of new information sources including the geo- and temporal information derived from images and tweets, product reviews and comments on youtube videos, and how url shorteners may give a view on what is popular on the web.
[1] Maarten Clements, Arjen P. De Vries, and Marcel J. T. Reinders. 2010. The task-dependent effect of tags and ratings on social media access. ACM Trans. Inf. Syst. 28, 4, Article 21 (November 2010), 42 pages. http://doi.acm.org/10.1145/1852102.1852107
Recommendation and Information Retrieval: Two Sides of the Same Coin? - Arjen de Vries
Status update on our current understanding of how collaborative filtering relates far more closely to information retrieval than usually thought. Includes work by Jun Wang and Alejandro Bellogín. This presentation was given at the SIKS PhD student course on computational intelligence, May 24th, 2013.
4. (C) 2008, The New York Times Company
Anchor text:
"continue reading"
5. Not much text to get you here...
A fan's Hyves page: Kyteman's HipHop Orchestra: www.kyteman.com
Ticket sales, Luxor theatre: May 22nd - Kyteman's Hiphop Orchestra - www.kyteman.com
Kluun.nl: Kyteman's site
Blog Rockin' Beats: The 21-year-old Kyteman (trumpet player, composer and producer Colin Benders) has worked for 3 years on his debut: the Hermit Sessions.
Jazzenzo: ... a performance by the popular Kyteman's Hiphop Orkest
6. 'Co-creation'
Social Media: consumer becomes a co-creator
'Data consumption' traces
In essence: many new sources to play the role of anchor text!
7. Tweets about blip.tv
E.g.: http://blip.tv/file/2168377
Amazing
Watching "World's most realistic 3D city models?"
Google Earth/Maps killer
Ludvig Emgard shows how maps/satellite pics on web is done (learn Google and MS!)
... and ~120 more Tweets
10. Types of feedback
Explicit user feedback:
- Images/videos marked as relevant/non-relevant
- Selected keywords that are added to the query
- Selected concepts that are added to the query
Implicit user feedback:
- Clicking on retrieved images/videos (click-through data)
- Bookmarking or sharing an image/video
- Downloading/buying an image/video
11. Who interacts with the data?
- Interactive relevance feedback: current user in current search
- Personalisation: current user in logged past searches
- Context adaptation: users similar to the current user in logged past searches
- Collective knowledge: all users in logged past searches
12. Applications exploiting feedback
- Given a query, rank all images/videos based on past users' feedback
- Given an image/video, rank all images/videos based on past users' feedback
13. Applications exploiting feedback
- Interactive relevance feedback: modify query and re-rank, based on the current user's explicit feedback (and current ranking)
- Blind relevance feedback: modify query and re-rank, based on feedback by past users and the current ranking
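One classic textbook way to implement "modify query and re-rank" from relevance feedback is Rocchio-style query reformulation in a vector space. This is a generic sketch, not necessarily the method used in the deck; the alpha/beta/gamma weights are commonly used defaults, assumed here.

```python
import numpy as np

def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    # Move the query vector toward the centroid of relevant feedback vectors
    # and away from the centroid of non-relevant ones. All vectors are
    # term-weight arrays of equal length.
    q = alpha * query_vec
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        q = q - gamma * np.mean(non_relevant, axis=0)
    return np.clip(q, 0, None)  # negative term weights are usually dropped

# Toy 3-term vocabulary: the user marks one document relevant, one not
q = np.array([1.0, 0.0, 0.0])
rel = np.array([[0.0, 1.0, 0.0]])
nonrel = np.array([[0.0, 0.0, 1.0]])
q_new = rocchio(q, rel, nonrel)
```

The same formula covers blind relevance feedback if `relevant` is filled from past users' clicks or from the current top-ranked results instead of explicit judgements.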
14. Applications exploiting feedback
- Query suggestion: recommend keywords/concepts to support users in interactive query modification (refinement or expansion)
16. 'Police Sting': Sting performs with The Police
'Elton Diana': Sting attends Versace memorial service
'Led Zeppelin': Sting performs at Led Zeppelin concert
17. Exploiting User Logs (FP6 Vitalas T4.2)
Aim: understand the information-searching process of professional users of a picture portal
Method: building, in collaboration with Belga, an increasingly large dataset that contains the log of Belga's users' search interactions; processing, analysing, and investigating the use of this collective knowledge stored in search logs in a variety of tasks
18. Search logs
Search logs in Vitalas:
- Searches performed by users through Belga's web interface from 22/06/2007 to 12/10/2007 (101 days)
- 402,388 tuples <date, time, userid, action>
- "SEARCH_PICTURES" (138,275) | "SHOW_PHOTO" (192,168) | "DOWNLOAD_PICTURE" (38,070) | "BROWSE_GALLERIES" (8,878) | "SHOW_GALLERY" (24,352) | "CONNECT_IMAGE_FORUM" (645)
- 17,861 unique ('lightly normalised') queries
- 96,420 clicked images
For comparison, web image search (Craswell and Szummer, 2007): the pruned graph has 1.1 million edges, 505,000 URLs and 202,000 queries
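Action counts of the kind reported above can be produced from raw tuples with a simple aggregation. A minimal sketch assuming the <date,time,userid,action> format; the sample rows are invented:

```python
from collections import Counter

# Hypothetical rows in the <date,time,userid,action> tuple format
log = [
    ("2007-06-22", "09:14:03", "u17", "SEARCH_PICTURES"),
    ("2007-06-22", "09:14:41", "u17", "SHOW_PHOTO"),
    ("2007-06-22", "09:15:02", "u17", "DOWNLOAD_PICTURE"),
    ("2007-06-23", "11:02:55", "u42", "SEARCH_PICTURES"),
]

# Count occurrences of each action type across the whole log
action_counts = Counter(action for _, _, _, action in log)
# Distinct users seen in the log
unique_users = {userid for _, _, userid, _ in log}
```

The same pattern extends to per-user or per-day aggregates by changing the key used in the Counter.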
21. What could we learn?
Goals:
- What do users search for?
- User context: how do professionals search image archives, when compared to the average user?
- Query modifications: how do users reformulate their queries within a search session?
28. Semantic analysis
Most studies investigate the search logs at the syntactic (term-based) level.
Our idea: map the term occurrences into linked open data (LOD).
29. Semantic Log Analysis
Method: map queries into the linked data cloud, find 'abstract' patterns, and re-use those for query suggestion, e.g.:
- A and B play-soccer-in-team X
- A is-spouse-of B
Advantages:
- Reduces sparseness of the raw search log data
- Provides higher-level insights in the data
- Right mix of statistics and semantics?
- Overcomes the query drift problem of thesaurus-based query expansion
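The mapping step can be illustrated with a toy sketch: link query terms to entities via a small dictionary (standing in for a real linked-open-data lookup) and abstract the query into a relation pattern. The entity and relation names here are invented for illustration.

```python
# Hypothetical entity dictionary standing in for a linked-open-data lookup
ENTITIES = {
    "kompany": ("Person:VincentKompany", "plays-soccer-in-team:Anderlecht"),
    "boussoufa": ("Person:MbarkBoussoufa", "plays-soccer-in-team:Anderlecht"),
}

def abstract_pattern(query):
    # Replace recognised terms by their relation, keeping unknown terms as-is
    parts = []
    for term in query.lower().split():
        ent = ENTITIES.get(term)
        parts.append(ent[1] if ent else term)
    return " ".join(parts)

# Two different player queries collapse onto the same abstract pattern,
# which is how the mapping reduces the sparseness of the raw log
p1 = abstract_pattern("Kompany goal")
p2 = abstract_pattern("Boussoufa goal")
```

Once queries share a pattern, suggestions observed for one instantiation (e.g. one player) can be re-used for another.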
39. Implications
- Guide the selection of ontologies/lexicons/etc. most suited for your user population
- Distinguish between successful and unsuccessful queries when making search suggestions
- Improve session boundary detection
40. Finally… a 'wild idea'
Image data is seldom annotated adequately, i.e., adequately to support search.
Automatic image annotation or 'concept detection': supervised machine learning requires labelled samples as training data, a laborious and expensive task.
41. FP6 Vitalas IP
Phase 1 - collect training data:
- Select ~500 concepts with the collection owner
- Manually select ~1000 positive and negative examples for each concept
42. How to obtain training data?
Can we use click-through data instead of manually labelled samples?
Advantages: large quantities, no user intervention, collective assessments
Disadvantages: noisy & sparse; queries not based on strict visual criteria
43. Automatic Image Annotation
Research questions:
- How to annotate images with concepts using click-through data?
- How reliable are click-through data based annotations?
- What is the effectiveness of these annotations as training samples for concept classifiers?
44. Manual annotations
       annotations per concept | positive samples | negative samples
MEAN   | 1020.02 | 89.44  | 930.57
MEDIAN | 998     | 30     | 970
STDEV  | 164.64  | 132.84 | 186.21
46. 1. How to annotate? (1/4)
Use the queries for which images were clicked
Challenges:
Inherent noise: gap between queries/captions and concepts
queries describe the content+context of images to be retrieved
clicked images retrieved using their captions: content+context
concept-based annotations: based on visual content-only criteria
Sparsity: only cover part of the collection previously accessed
Mismatch between terms in concept descriptions and queries
47. How to annotate (2/4)
Basic ‘global’ method:
Given the keywords of a query Q
Find the query Q' in search logs that is most
textually similar to Q
Find the images I clicked for Q'
Find the queries Q'' for which these images
have been clicked
Rank the queries Q'' based on the number of
images clicked for them
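The four steps of the global method can be sketched in plain Python. The click log and the text-similarity measure (Jaccard overlap on terms) below are simplified toy stand-ins for the real search logs and similarity function:

```python
from collections import Counter

def global_annotate(q_terms, click_log):
    """Sketch of the 'global' method.

    click_log: list of (query, image) click pairs (toy stand-in for real logs).
    """
    queries = {q for q, _ in click_log}

    # 1. Find the logged query Q' most textually similar to Q
    #    (Jaccard overlap on terms, as a simple stand-in).
    def sim(q):
        t = set(q.split())
        return len(t & set(q_terms)) / len(t | set(q_terms))
    q_prime = max(queries, key=sim)

    # 2. Find the images I clicked for Q'.
    images = {img for q, img in click_log if q == q_prime}

    # 3./4. Find the queries Q'' clicked on those images and rank them
    #       by the number of clicked images.
    counts = Counter(q for q, img in click_log if img in images)
    return counts.most_common()

log = [("traffic jam", "img1"), ("traffic jam", "img2"),
       ("e40 highway", "img1"), ("transport", "img2")]
print(global_annotate(["traffic", "jam"], log))
```

The top-ranked queries Q'' then serve as candidate annotations for the concept behind Q.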
48. How to annotate (3/4)
Exact: images clicked for queries exactly matching
the concept name
Example: 'traffic' -> 'traffic jam', 'E40', 'vacances', 'transport'
Search log-based image representations:
Images represented by all queries for which they have been
clicked
Retrieval based on language models (smoothing, stemming)
Example: 'traffic' -> 'infrabel', 'deutsche bahn', 'traffic lights'
Random walks over the click graph
Example: 'hurricane' -> 'dean', 'mexico', 'dean haiti', 'dean
mexico'
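The random-walk idea can be sketched in plain Python on a toy click graph. A walk starting from a query alternates query -> image -> query steps, so queries whose clicked images overlap with the start query's accumulate probability mass. The queries and click counts below are invented for illustration; real click graphs (as in Craswell and Szummer's work) have millions of edges:

```python
# Toy click graph: clicks[q][img] = click count (invented numbers).
queries = ["hurricane", "dean", "mexico"]
clicks = {
    "hurricane": {"i1": 2, "i2": 1},
    "dean":      {"i1": 1, "i2": 2, "i3": 1},
    "mexico":    {"i2": 1, "i3": 2},
}

def step(dist, edges):
    """One random-walk step over weighted edges, row-normalised."""
    out = {}
    for node, p in dist.items():
        total = sum(edges[node].values())
        for nbr, w in edges[node].items():
            out[nbr] = out.get(nbr, 0.0) + p * w / total
    return out

# Image -> query edges: the transpose of the click counts.
rev = {}
for q, imgs in clicks.items():
    for img, w in imgs.items():
        rev.setdefault(img, {})[q] = w

# Start at 'hurricane' and take three query->image->query double steps.
dist = {"hurricane": 1.0}
for _ in range(3):
    dist = step(step(dist, clicks), rev)

# Related queries ranked by walk probability.
ranking = sorted(dist.items(), key=lambda kv: -kv[1])
print(ranking)
```

After a few steps the mass spreads to queries connected through shared clicked images, which is how a walk from 'hurricane' can surface 'dean' or 'dean mexico' as annotation candidates.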
49. How to annotate (4/4)
Local method:
given the keywords of a query Q and its top
ranked images
Find the queries Q'' for which these images have
been clicked
Rank the queries Q'' based on the number of
images clicked for them
50. Compare agreement of click-through-based annotations with manual ones, examining the 111 VITALAS concepts having at least 10 images (for at least one of the methods) in the overlap of clicked and manually annotated images
Levels of agreement vary greatly across concepts
20% of concepts per method reach agreement of at least 0.8
What type of concepts can be reliably
annotated using click-through data?
Defined categories (activities, animals, events, graphics, people, image_theme, objects, setting/scene/site)? Not informative
Possible future research on types of concepts:
named entities?
specific vs. broad?
2. Reliability
51. Train classifiers for each of 25 concepts
Positive samples: images selected by each method
Negative samples: selected by randomly sampling the 100k set, excluding images already selected as positive samples
Low-level visual features (FW): texture description; an integrated Weibull distribution extracted from overlapping image regions
Low-level textual features (FT): a vocabulary of the most frequent caption terms is built for each concept; each image caption is compared against each concept vocabulary; a frequency histogram is built per concept
SVM classifiers with an RBF kernel (and cross-validation)
3. Effectiveness (1/3)
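The textual feature extraction described above can be sketched as follows: build a per-concept vocabulary of the most frequent caption terms, then turn each caption into a frequency histogram over that vocabulary. The captions are invented examples; in the study, such histograms would feed the RBF-kernel SVMs:

```python
from collections import Counter

def concept_vocabulary(captions, k=5):
    """Most frequent terms in the captions of a concept's training images."""
    counts = Counter(t for c in captions for t in c.lower().split())
    return [t for t, _ in counts.most_common(k)]

def caption_histogram(caption, vocab):
    """Frequency histogram of one image caption over a concept vocabulary."""
    terms = Counter(caption.lower().split())
    return [terms[v] for v in vocab]

# Toy captions for a 'soccer' concept (invented for illustration).
vocab = concept_vocabulary(["soccer match goal", "soccer stadium", "goal keeper"])
print(vocab)
print(caption_histogram("Soccer goal in final", vocab))
```

One histogram per concept vocabulary gives a fixed-length textual feature vector per image, directly usable as SVM input.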
52. 3. Effectiveness study (2/3)
•Experiment 1 (visual features):
–training: search-log based annotations
–test set for each concept: manual annotations (~1000 images)
–feasibility study: in most cases, AP considerably higher than the prior
3. Effectiveness (2/3)
53. •Experiments 2,3,4 (visual or textual features):
–Experiment 2 training: search-log based annotations
–Experiment 3 training: manual + search-log based annotations
–Experiment 4 training: manual annotations
–common test set: 56,605 images (subset of the 100,000 collection)
–contribution of search-log based annotations to training is positive
–particularly in combination with manual annotations
3. Effectiveness (3/3)
54. Example: Soccer
Manually annotated positive samples vs. search-log-based positive samples; test set results
View results at:
http://olympus.ee.auth.gr/~diou/searchlogs/
57. Diversity from User Logs
Present different query variants'
clicked images in clustered view
Merge different query variants'
clicked images in a round robin
fashion into one list (CLEF)
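A minimal sketch of the round-robin merging strategy, assuming each query variant contributes a ranked list of clicked images (toy data below): take the top image of each variant in turn, then the second of each, and so on, skipping duplicates:

```python
from itertools import chain, zip_longest

def round_robin_merge(result_lists):
    """Interleave the clicked-image lists of different query variants,
    dropping duplicates while keeping first-seen order."""
    seen, merged = set(), []
    for img in chain.from_iterable(zip_longest(*result_lists)):
        if img is not None and img not in seen:
            seen.add(img)
            merged.append(img)
    return merged

# Each inner list: ranked clicked images of one query variant (toy data).
variants = [["a", "b", "c"], ["b", "d"], ["e"]]
print(round_robin_merge(variants))
# -> ['a', 'b', 'e', 'd', 'c']
```

Interleaving guarantees that every variant is represented near the top of the merged list, which is what promotes diversity in the result page.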
66. ImageCLEF Findings
Many queries (>20%) without
clicked images
Corpus and available logs originated from
different time frames
67. Best results combine text search in
metadata with image click data for
topic title and each of the cluster
titles
Using query variants derived from
the logs increases recall by 50-100%
However, this also introduces topic drift and
reduced early precision
ImageCLEF Findings
Editor's Notes
Explicit/implicit refers to whether a user's action translates into explicit/implicit evidence of relevance. Explicit evidence of relevance is when the user says a document is relevant. Implicit evidence of relevance is when the user may not say that something is relevant, but his actions/behaviour (i.e., clicking, looking at, etc.) indicate that to some degree the user finds it relevant. When a Belga user buys an image, he may not have said explicitly that it is relevant, but this action is very close to that.
Belga's logs currently include only implicit evidence, whereas VITALAS logs will include both implicit and explicit.
Maybe "personalisation" should also become "context adaptation" so as to be consistent with IP3 and not get confused with the personalisation in WP5.