Content vs Context

IMAGE RETRIEVAL: CONTENT VERSUS CONTEXT

Thijs Westerveld

Teoria e Tecnologia della Comunicazione

Sistemi Informativi Multimediali AA’11-’12

Angelo Oldani

744818

INTRODUCTION

These slides present the paper “Image retrieval: content
versus context” by Thijs Westerveld.

This paper presents a “new” approach to image retrieval
that takes the best from two worlds.

It combines:

•  Image features (content)

•  Collateral text (context)

INDEX

TRADITIONAL IMAGE RETRIEVAL

LATENT SEMANTIC INDEXING

FEATURE EXTRACTION PROCESS

EXPERIMENTS

DISCUSSION

CONTEXT-BASED IMAGE RETRIEVAL

Retrieval can rely on two kinds of text:

•  Annotations that are manually added.

•  Collateral text available with an image.

The similarity between images is then based on the
similarity between the associated texts.

CONTEXT-BASED IMAGE RETRIEVAL

PROBLEMS

•  Synonymy

Different words are used to describe the same subject in
different documents.

•  Ambiguity

The same word can describe different subjects.
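
A minimal sketch of text-based similarity and of the synonymy problem. The slides do not say how text similarity is computed; bag-of-words cosine similarity is an assumption chosen for illustration, and `caption_similarity` is a hypothetical helper name.

```python
from collections import Counter
import math


def caption_similarity(caption_a, caption_b):
    """Cosine similarity between two captions under a bag-of-words model (assumed model)."""
    terms_a = Counter(caption_a.lower().split())
    terms_b = Counter(caption_b.lower().split())
    # Dot product over the terms the two captions share.
    dot = sum(terms_a[t] * terms_b[t] for t in terms_a)
    norm_a = math.sqrt(sum(c * c for c in terms_a.values()))
    norm_b = math.sqrt(sum(c * c for c in terms_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Identical wording gives the maximum score.
same = caption_similarity("red car", "red car")
# Synonymy: the same subject described with different words shares no terms.
synonyms = caption_similarity("car", "automobile")
```

With plain term matching, the two synonymous captions get a similarity of zero, which is the weakness that LSI is meant to address.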

CONTENT-BASED IMAGE RETRIEVAL

Returns the images that are visually most similar.

Similarity is based on a set of low-level image
features such as:

•  colour

•  shape

•  texture

•  …..

CONTENT-BASED IMAGE RETRIEVAL

PROBLEMS

Semantic gap

LATENT SEMANTIC INDEXING (LSI)

LSI is a method that uses co-occurrence statistics of
terms to find the semantics behind a document’s terms.

Documents using similar terms are probably related.

Example terms: RESERVATION, DOUBLE ROOM, SHOWER, BREAKFAST

LATENT SEMANTIC INDEXING (LSI)

According to the paper, no one had previously combined text
and images in the same semantic space using LSI.

The terms from both modalities are listed in one term-document
matrix and the singular value decomposition (SVD) is then
applied, resulting in a semantic space that contains both
visual and textual items.
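
The combined term-document matrix and the SVD step can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the toy matrices, the number of retained dimensions `k`, and the function name `lsi_space` are all hypothetical.

```python
import numpy as np


def lsi_space(term_doc, k):
    """Truncated SVD of a term-document matrix (rows = terms, columns = documents)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # One k-dimensional vector per document, scaled by the singular values.
    doc_vectors = (s_k[:, None] * Vt_k).T
    return U_k, s_k, doc_vectors


# Toy data (made up): 3 textual terms and 2 visual terms over 4 documents.
text_rows = np.array([[1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [1, 0, 1, 0]], dtype=float)
image_rows = np.array([[1, 0, 1, 0],
                       [0, 1, 0, 1]], dtype=float)

# Stack both modalities into a single term-document matrix.
combined = np.vstack([text_rows, image_rows])

U_k, s_k, doc_vectors = lsi_space(combined, k=2)
```

Each row of `U_k` places a term, textual or visual, in the same reduced space as the documents in `doc_vectors`.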

CALCULATING IMAGE TERMS

To use LSI on image content, it is necessary to
define a set of discrete image features that
has the same distribution as the set of textual
terms.

A set of image terms that is as sparse as the set
of textual terms.

CALCULATING IMAGE TERMS

A set of image terms that is the same size
as the set of textual terms.

FEATURE EXTRACTION

Feature extraction should extract the indexing terms from
the documents.

TERMS       TEXTUAL           IMAGE
FEATURES    Image captions    Colours, Textures

SPARSE SET OF IMAGE TERMS

COLOUR FEATURES

The HSV colour space was divided into 18 hues, 3
saturations and 3 values, and two sets of features were
extracted:

•  Histogram for the whole image.

•  Binary value of the most frequent colour for each
block.
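
The two colour features above can be sketched as follows. The 18 x 3 x 3 quantization comes from the slide; the uniform bin boundaries, the 2 x 2 block grid, and all function names are assumptions made for illustration, since the slides do not specify them.

```python
import colorsys
import numpy as np

H_BINS, S_BINS, V_BINS = 18, 3, 3          # from the slide: 18 * 3 * 3 = 162 colour bins
N_BINS = H_BINS * S_BINS * V_BINS


def quantize_hsv(rgb):
    """Map an RGB image (height x width x 3, floats in [0, 1]) to bin indices 0..161."""
    height, width, _ = rgb.shape
    bins = np.empty((height, width), dtype=int)
    for y in range(height):
        for x in range(width):
            hue, sat, val = colorsys.rgb_to_hsv(*rgb[y, x])
            # Uniform bins (assumption); clamp the value 1.0 into the last bin.
            hi = min(int(hue * H_BINS), H_BINS - 1)
            si = min(int(sat * S_BINS), S_BINS - 1)
            vi = min(int(val * V_BINS), V_BINS - 1)
            bins[y, x] = (hi * S_BINS + si) * V_BINS + vi
    return bins


def global_histogram(bins):
    """Colour histogram for the whole image."""
    return np.bincount(bins.ravel(), minlength=N_BINS)


def dominant_colour_per_block(bins, grid=2):
    """Binary terms: flag the most frequent colour bin of each block (grid size assumed)."""
    height, width = bins.shape
    flags = np.zeros((grid, grid, N_BINS), dtype=bool)
    for by in range(grid):
        for bx in range(grid):
            block = bins[by * height // grid:(by + 1) * height // grid,
                         bx * width // grid:(bx + 1) * width // grid]
            flags[by, bx, np.bincount(block.ravel(), minlength=N_BINS).argmax()] = True
    return flags


# Usage on a tiny, uniformly red test image.
red = np.zeros((4, 4, 3))
red[..., 0] = 1.0
red_bins = quantize_hsv(red)
histogram = global_histogram(red_bins)
block_flags = dominant_colour_per_block(red_bins)
```

Because only one bin per block is flagged, the binary feature stays sparse, in line with the goal of matching the sparsity of textual terms.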

TEXTURE FEATURES

Gabor filters were used at 3 different wavelengths and
4 orientations, and the average energy was extracted for
each combination of wavelength and orientation. The average
energy values are quantized into 128 bands, and the values
that fall within the lower 16 bands are disregarded.
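
A rough sketch of the texture feature described above. Only the counts (3 wavelengths, 4 orientations, 128 bands, lower 16 dropped) come from the slide. The specific wavelengths, the Gaussian envelope width, the FFT-based circular convolution, and the normalisation of energies by the per-image peak are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def gabor_kernel(wavelength, theta):
    """Complex Gabor kernel; the envelope width sigma is tied to the wavelength (assumption)."""
    sigma = 0.5 * wavelength
    half = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * x_rot / wavelength)


def average_energy(gray, kernel):
    """Mean squared magnitude of the filter response (circular convolution via FFT)."""
    response = np.fft.ifft2(np.fft.fft2(gray) * np.fft.fft2(kernel, s=gray.shape))
    return float(np.mean(np.abs(response) ** 2))


def texture_terms(gray, wavelengths=(2, 4, 8), n_orient=4, n_bands=128, skip=16):
    """Return (wavelength index, orientation index, band) terms, dropping the lowest bands."""
    thetas = [i * np.pi / n_orient for i in range(n_orient)]
    energies = np.array([[average_energy(gray, gabor_kernel(w, t)) for t in thetas]
                         for w in wavelengths])
    peak = energies.max()
    if peak == 0:
        return []
    # Normalise by the strongest response in this image (assumption), then quantize.
    bands = np.minimum((energies / peak * n_bands).astype(int), n_bands - 1)
    return [(wi, ti, int(b)) for (wi, ti), b in np.ndenumerate(bands) if b >= skip]


# Usage: vertical stripes with a period of 4 pixels.
xs = np.arange(64)
stripes = np.tile(np.cos(2 * np.pi * xs / 4), (64, 1))
terms = texture_terms(stripes)
```

Passing `n_bands=10, skip=2` gives the coarser quantization used for the small set of image terms.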

SPARSE SET OF IMAGE TERMS

TERM FREQUENCIES

              Tot. #terms   Avg. #terms/doc   Ratio
Text                 4283                27   158:1
Image               37752               625    63:1
Combination         42035               598    70:1

SMALL SET OF IMAGE TERMS

COLOUR FEATURES

The HSV colour space was divided into 18 hues, 3
saturations and 3 values, and two sets of features were
extracted:

•  Histogram for each block.

•  Histogram for the whole image.

TEXTURE FEATURES

Gabor filters were used at 3 different wavelengths and
4 orientations, and the average energy was extracted for
each combination of wavelength and orientation. The average
energy values are quantized into 10 bands, and the values
that fall within the lower 2 bands are disregarded.

SMALL SET OF IMAGE TERMS

TERM FREQUENCIES

              Tot. #terms   Avg. #terms/doc   Ratio
Text                 4283                27   158:1
Image                4442              1131     4:1
Combination          8725              1158     8:1

EXPERIMENT

3379 images from the Reformatorisch Dagblad
online archive, together with their
captions.

A set of 20 documents used as queries.

3 indexes (LSI indexing):

•  Visual terms

•  Textual terms

•  Visual and textual terms

Top 100 returned documents
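
Retrieving the top-ranked documents for a query document in an LSI space can be sketched as follows. Cosine similarity and the exclusion of the query document from its own results are assumptions, since the slides do not state the ranking function; `top_k` and the toy vectors are hypothetical.

```python
import numpy as np


def top_k(doc_vectors, query_index, k=100):
    """Rank documents by cosine similarity to the query document and return the top k."""
    norms = np.linalg.norm(doc_vectors, axis=1)
    norms[norms == 0] = 1.0                     # avoid division by zero for empty documents
    unit = doc_vectors / norms[:, None]
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf               # leave the query itself out (assumption)
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order if np.isfinite(scores[i])]


# Toy document vectors in a 2-dimensional LSI space (made up).
vectors = np.array([[1.0, 0.0],
                    [0.9, 0.1],
                    [0.0, 1.0],
                    [-1.0, 0.0]])
ranking = top_k(vectors, query_index=0, k=2)
```

The same function applies to each of the three indexes; only the matrix of document vectors changes.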

EXPERIMENT

RESULTS

•  The small set of image features seems to perform
somewhat better than the sparse set

•  The combined approach for this set of features
outperforms both the image and the text approach for
queries with many relevant documents in the data set.

DISCUSSION

Latent Semantic Indexing can help bridge the semantic
gap

LIMITS

•  The research is based on a very small set of images

•  Text is not available with every image
