In this paper we explore the performance of a generic, unified framework for the retrieval of relevant and diverse images from social photo collections. The approach allows
for the easy evaluation of different visual and textual image descriptions, clustering algorithms, and similarity metrics. Preliminary results show that the achieved performance depends strongly on the choice of underlying technology and similarity metric.
ceur-ws.org/Vol-1263/mediaeval2014_submission_2.pdf
A Unified Framework for Retrieving Diverse Social Images
MAIA ZAHARIEVA1,2 AND PATRICK SCHWAB1
1Multimedia Information Systems, Faculty of Computer Science, University of Vienna, Austria
2Interactive Media Systems, Institute of Software Technology and Interactive Systems,
Vienna University of Technology, Austria
INTRODUCTION
The large number of publicly shared images leads to a large amount of visually similar data. In traditional search engines these images receive a similar ranking, as the only ranking criterion is an image's relevance to the query. This, in turn, produces redundant search results: users often have to browse through several pages of similar images to get an overview of the underlying dataset.
This is where the MediaEval 2014 Retrieving Diverse Social Images Task [1] and, subsequently, this project come in: we aim to reduce redundancy in search results by introducing diversity as a ranking criterion.
EXPERIMENTS
We conducted a thorough evaluation of the employed feature-metric combinations at the two main stages of our approach: relevance ranking and image clustering.

RANK  FEATURE    COMPARISON METRIC
1     CM3x3      Euclidean
2     TFIDF      Spearman
      SIFT       Euclidean
3     CM         χ²
      LBP        χ²
      LBP3x3     χ²
4     HOG        cosine
      GLRLM      χ²
      GLRLM3x3   χ²
5     CSD        cosine
      CN         correlation
6     CN3x3      Euclidean
Table 1: Feature-metric combination ranking for relevance ranking.

RANK  FEATURE    COMPARISON METRIC
1     CM         χ²
      CM3x3      Euclidean
2     HOG        cosine
3     GLRLM3x3   χ²
      LBP3x3     χ²
4     GLRLM      Euclidean
      LBP        χ²
      SIFT       Euclidean
5     CSD        cosine
      CN         Euclidean
      CN3x3      Euclidean
6     TFIDF      Euclidean
Table 2: Feature-metric combination ranking for image clustering.

RESULTS
• We submitted four different runs for the final evaluation (Table 3 shows the configurations).
• The combination of visual and textual information achieved the best performance.
• However, the clustering recall (CR) remained relatively low due to the large number of irrelevant images.

RUN        RELEVANCE RANKING   IMAGE CLUSTERING
run1 (V)   CM3x3               SIFT
run2 (T)   TFIDF               TFIDF
run3 (VT)  TFIDF               CSD
run5 (V)   CM3x3               CSD
Table 3: The features used in the runs. V – visual, T – textual, VT – visual and textual

RUN    CR@20    P@20     F1@20
run1   0.4426   0.7600   0.5552
run2   0.4132   0.7250   0.5188
run3   0.4484   0.7567   0.5559
run5   0.4369   0.7617   0.5499
Table 4: Evaluation results on the development dataset.

RUN    CR@20    P@20     F1@20
run1   0.3901   0.6646   0.4863
run2   0.3909   0.6809   0.4888
run3   0.3982   0.6732   0.4949
run5   0.3915   0.6752   0.4897
Table 5: Evaluation results on the test dataset.
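The measures reported in Tables 4 and 5 (CR@20, P@20, F1@20) can be sketched for a single query as follows. This is a minimal illustration under assumptions, not the official evaluation tool: the function names and the toy data are hypothetical, and the ground truth is assumed to map each relevant image to one cluster id. The tabulated values are presumably averages over many queries, which would explain why F1@20 in the tables is not exactly the harmonic mean of the tabulated CR@20 and P@20.

```python
def precision_at(ranked, relevant, n=20):
    """Fraction of the top-n results that are relevant."""
    top = ranked[:n]
    return sum(1 for image in top if image in relevant) / n


def cluster_recall_at(ranked, cluster_of, n=20):
    """Fraction of ground-truth clusters represented in the top-n results."""
    total = len(set(cluster_of.values()))
    found = {cluster_of[image] for image in ranked[:n] if image in cluster_of}
    return len(found) / total


def f1(precision, recall):
    """Harmonic mean of precision and cluster recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy query: "x" is irrelevant, "d" was never retrieved.
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 2}  # relevant image -> cluster id
ranked = ["a", "x", "b", "c"]

p = precision_at(ranked, set(cluster_of), n=4)
cr = cluster_recall_at(ranked, cluster_of, n=4)
f = f1(p, cr)
```

In this toy query three of the four returned images are relevant, but they cover only two of the three ground-truth clusters, so precision is higher than cluster recall, the same pattern seen in the tables.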
REFERENCES
[1] B. Ionescu, A. Popescu, M. Lupu, A. L. Gînscâ, and H. Müller. Retrieving diverse social images at MediaEval 2014: Challenges, datasets, and evaluation. In MediaEval 2014 Multimedia Challenge Workshop, 2014.
APPROACH
We employ a multi-stage approach for the retrieval of diverse social images. The workflow passes through the following three main stages:
Relevance ranking of input images.
In this stage we evaluate how relevant each image is with respect to a given query. We compare every image to a set of representative images by computing the distance between their corresponding feature vectors. For the given dataset, we use the provided Wikipedia images as reference images. We then assign each image its smallest distance to any of the reference images as its relevance score, so that a smaller score indicates higher relevance.
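The relevance scoring described above can be sketched as follows. This is a minimal illustration under assumptions: the function names, the toy two-dimensional feature vectors, and the choice of Euclidean distance are hypothetical, and any other feature-metric combination could be plugged in through the `distance` parameter.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relevance_scores(image_features, reference_features, distance=euclidean):
    """Map each image id to its smallest distance to any reference image."""
    return {
        image_id: min(distance(vec, ref) for ref in reference_features)
        for image_id, vec in image_features.items()
    }


def rank_by_relevance(scores):
    """Order image ids from most to least relevant (smallest distance first)."""
    return sorted(scores, key=scores.get)


# Toy feature vectors (hypothetical values, for illustration only).
images = {"img_a": [0.0, 1.0], "img_b": [3.0, 4.0], "img_c": [1.0, 1.0]}
references = [[0.0, 0.0], [1.0, 1.0]]

scores = relevance_scores(images, references)
ranking = rank_by_relevance(scores)
```

Here `img_c` coincides with a reference image and is ranked first, while `img_b` is far from both references and is ranked last.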
Image clustering for diversification.
The image clustering stage aims at the identification of groups of similar images that can be used to diversify the final retrieval results. The clustering algorithm can be chosen freely. For our purposes, Adaptive Hierarchical Clustering (AHC) showed the best performance. It is important to note that the features and the distance metric used in this stage can, and optimally should, differ from the ones employed in the relevance ranking stage.
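To illustrate the clustering stage, the sketch below groups images with a generic bottom-up hierarchical clustering. It is a stand-in under assumptions and not necessarily the AHC variant used in this work: the average-linkage criterion, the distance `threshold` stopping rule, the function names, and the toy data are all hypothetical choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, features, distance):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(distance(features[i], features[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_clusters(features, threshold, distance=euclidean):
    """Merge the two closest clusters until the closest pair exceeds `threshold`."""
    clusters = [[image_id] for image_id in features]
    while len(clusters) > 1:
        best = None  # (linkage distance, index of first cluster, index of second)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], features, distance)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a stays valid
    return clusters


# Toy feature vectors forming two obvious groups (hypothetical values).
features = {
    "img_a": [0.0, 0.0], "img_b": [0.1, 0.0],
    "img_c": [5.0, 5.0], "img_d": [5.0, 5.2],
}
clusters = hierarchical_clusters(features, threshold=1.0)
```

Because the distance function is a parameter, this stage can use a different feature-metric combination than the relevance ranking stage, as recommended above.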
Final image selection.
The final stage combines the results of the previous steps to produce a result set that is both relevant and diverse with respect to the initial image set. We use a round-robin algorithm that chooses the most relevant images from alternating clusters.
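A round-robin selection of this kind can be sketched as follows. The details are assumptions made for illustration: the function name, the `limit` parameter, the order in which clusters are visited (by their most relevant member), and the toy data are hypothetical. Scores are distances, so a smaller value means a more relevant image.

```python
def round_robin_select(clusters, scores, limit):
    """Pick up to `limit` images by cycling over clusters, best image first."""
    # Order the images of each cluster by relevance (smaller score = better).
    queues = [sorted(cluster, key=scores.get) for cluster in clusters if cluster]
    # Visit clusters in order of their most relevant member (an assumption).
    queues.sort(key=lambda queue: scores[queue[0]])
    selected = []
    while queues and len(selected) < limit:
        remaining = []
        for queue in queues:
            if len(selected) == limit:
                break
            selected.append(queue.pop(0))
            if queue:
                remaining.append(queue)  # cluster still has images left
        queues = remaining
    return selected


# Hypothetical clusters and relevance scores, for illustration only.
clusters = [["a", "b", "c"], ["d"], ["e", "f"]]
scores = {"a": 0.1, "b": 0.4, "c": 0.9, "d": 0.2, "e": 0.3, "f": 0.5}

result = round_robin_select(clusters, scores, limit=5)
```

The first pass takes the best image of every cluster, so each cluster is represented before any cluster contributes a second image.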
CONCLUSION
Although there are significant differences in the performance of single features, the top-performing features prove to be highly interchangeable. The achieved results indicate that, for the given datasets, the crucial part of the process is not so much the diversification as the assessment of image relevance.
In general, the achieved results outline the limitations of the available textual (and visual) information in assessing image relevance. A possible approach to improving the results is to consider the occasionally available GPS data and to employ external resources as an additional source of information.
[Figure 1 panel labels: Relevance ranking, Image clustering, Final image selection]
Figure 1: Schematic overview of the proposed approach. Images from: http://google.com/search?q=eiffel+tower, accessed July 2nd 2014
ACKNOWLEDGMENT
This work has been partly funded by the Vienna
Science and Technology Fund (WWTF) through
project ICT12-010.