Mind the Gap
Another look at the problem of the semantic gap in image retrieval
Multimedia Content Analysis, Management and Retrieval 2006
@ Electronic Imaging 2006
Jonathon S. Hare and Paul H. Lewis
Intelligence, Agents, Multimedia Group
School of Electronics and Computer Science
University of Southampton
{jsh2 | phl}@ecs.soton.ac.uk
&
Peter G.B. Enser and Christine J. Sandom
School of Computing, Mathematical and Information Sciences
University of Brighton
{p.g.b.enser | c.sandom}@bton.ac.uk
Contents
Introduction
Characterising the semantic gap
Characterising user queries
Attacking the gap from below: auto-annotation
Attacking the gap from above: ontologies
Some conclusions and future work
Introduction
In “The Bridging of the Semantic Gap in Visual
Information Retrieval” project we are exploring
how test-bed ontologies combined with content
based techniques and annotation can help meet
the needs of real users in limited domains.
What is the Semantic Gap
in image retrieval?
The gap between information extractable
automatically from the visual data and the
interpretation a user may have for the same data
…typically between low level features and the
image semantics
Characterising the Gap (I)
Much Content Based Image Retrieval uses Query
By Example and operates at the feature vector
level
Sometimes successful
However, often fails due to the gap between
feature vectors and semantics
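To make the feature-vector level concrete, here is a minimal query-by-example sketch in Python: rank a collection by Euclidean distance between feature vectors. The array shapes, random data and choice of distance are illustrative assumptions, not the features or matching used by any particular system.

```python
import numpy as np

def query_by_example(query_vector, database_vectors, k=5):
    """Rank database images by Euclidean distance to the query's feature vector."""
    distances = np.linalg.norm(database_vectors - query_vector, axis=1)
    return np.argsort(distances)[:k]  # indices of the k nearest images

# Hypothetical usage: each row is a low-level feature vector for one image.
database_vectors = np.random.rand(1000, 64)
query_vector = database_vectors[0]
print(query_by_example(query_vector, database_vectors))
```

The gap shows up precisely here: images that are close under such a distance need not share any semantics with the query.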
Characterising the Gap (II)
A hierarchy of levels between media and semantics
[Figure: a hierarchy of levels between raw media and semantics, illustrated with an annotated photograph taken in Yosemite National Park, California]
Semantics: object relationships and more
Object Labels: symbolic names of objects
Objects: prototypical combinations of descriptors
Descriptors: feature vectors (segmented blobs, salient regions, pixel-level histograms, Fourier descriptors, etc.)
Raw Media: images
Characterising the Gap (III)
Of course, it's not that simple...
[Figure: raw media → descriptors → objects → labels (SKY, MOUNTAINS, TREES) → semantics ("Photo of Yosemite valley showing El Capitan and Glacier Point with the Half Dome in the distance"), together with inter-object relationships, sub/super objects and other contextual information]
Analysing the Gap
Instructive to break the gap into two parts...
Analysing the Gap
from descriptors to labels
[Figure: descriptors → objects → labels (SKY, MOUNTAINS, TREES)]
Most current research into bridging the semantic
gap is actually trying to bridge the gap between
descriptors and labels
Analysing the Gap
from labels to semantics
[Figure: labels (SKY, MOUNTAINS, TREES)]
However, user queries are typically formulated in
terms of semantics
[Figure: semantics, e.g. "Photo of Yosemite valley showing El Capitan and Glacier Point with the Half Dome in the distance"]
Users’ queries should be
the driver (I)
Hallmark of a good retrieval system is its ability to
respond to queries posed by a user
We have been collecting numerous real queries
for images from different collections in order to
investigate how the semantic gap needs to be
bridged to answer them
Users’ queries should be
the driver (II)
User queries may specify unique features
A member of parliament with a beard
May involve temporal or spatial facets
A 1950s fridge in the background
Particular significance
Bannister breaking the 4 min mile
The absence of features
George V’s Coronation but no procession or royals
Attacking the Gap from
below: Auto-annotation
Lots of techniques proposed, using different descriptor
morphologies (global, region-based [segmented, salient, ...])
Co-occurrence of keywords and image features
Machine translation
Statistical, maximum entropy, ...
Probabilistic methods
Inference networks, density estimation, ...
Latent-spaces
Keyword propagation
Simple classifiers using low level features
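As an illustration of the simpler end of this spectrum, the sketch below combines keyword propagation with a nearest-neighbour "classifier" over low-level features: an unlabelled image inherits the keywords that occur most often among its visually closest training images. The toy feature vectors, keyword lists and parameter values are assumptions for illustration only.

```python
import numpy as np
from collections import Counter

def propagate_keywords(query_feat, train_feats, train_keywords, k=5, n_annotations=3):
    """Annotate an unlabelled image with the keywords that occur most often
    among its k visually nearest neighbours in the training set."""
    distances = np.linalg.norm(train_feats - query_feat, axis=1)
    neighbours = np.argsort(distances)[:k]
    counts = Counter(kw for i in neighbours for kw in train_keywords[i])
    return [kw for kw, _ in counts.most_common(n_annotations)]

# Hypothetical toy data: 4-dimensional features and keyword lists.
train_feats = np.array([[0.9, 0.1, 0.0, 0.0],
                        [0.8, 0.2, 0.0, 0.0],
                        [0.1, 0.1, 0.8, 0.0]])
train_keywords = [["sky", "mountains"], ["sky", "trees"], ["cable car"]]
print(propagate_keywords(np.array([0.85, 0.15, 0.0, 0.0]), train_feats, train_keywords, k=2))
```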
Attacking the Gap from
below: Semantic-Spaces
Our current approach:
Based on a generalisation of an information retrieval
technique called Cross-Language Latent Semantic
Indexing (CL-LSI)
Uses SVD to factorise a multilingual term-document
matrix into a semantic-space of terms and
documents
Visual terms from low-level features and
keywords
Doesn’t actually assign annotations to the images, but
provides a way to search image collections by keyword
and/or visual features
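A minimal numerical sketch of the idea, assuming a toy term-document matrix whose rows are visual terms plus keywords and whose columns are images; the SVD truncation level and the query folding-in convention shown here are illustrative choices rather than the exact formulation in the paper.

```python
import numpy as np

# Hypothetical vocabulary: rows are visual terms followed by keywords,
# columns are training images (values are term occurrence counts).
terms = ["vterm_0", "vterm_1", "vterm_2", "sky", "trees", "mountains"]
X = np.array([[3, 0, 1],
              [0, 2, 0],
              [1, 1, 4],
              [1, 0, 1],   # keyword rows are only filled for annotated images
              [0, 1, 0],
              [1, 0, 1]], dtype=float)

# Factorise the term-document matrix and keep the top-k singular vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_space = Vt[:k, :].T * s[:k]      # documents positioned in the semantic space

def search(query_terms, top_n=3):
    """Project a bag of query terms (keywords and/or visual terms) into the
    semantic space and rank documents by cosine similarity."""
    q = np.zeros(len(terms))
    for t in query_terms:
        q[terms.index(t)] = 1.0
    q_proj = q @ U[:, :k]            # fold the query into the reduced space
    sims = doc_space @ q_proj / (
        np.linalg.norm(doc_space, axis=1) * np.linalg.norm(q_proj) + 1e-12)
    return np.argsort(-sims)[:top_n]

print(search(["sky", "mountains"]))   # document indices, best match first
```

Because visual terms and keywords share the same space, the same search works for keyword queries, visual queries or a mixture of both, without ever assigning explicit annotations.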
A Simple Semantic-Space
[Figure: a simple semantic-space positioning documents, visual terms and keywords (SKY, TREE, MOUNTAIN, CABLE CAR) in a common space]
Semantic-space
performance
Experimental results promising
Retrieval performance using the approach is
more-or-less on par with the machine-translation
approach (see paper)
However, we only used a global colour
histogram feature (sketched after this slide),
whilst the machine-translation approach used a
large combination of different features
Need to do more research to assess scalability
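For reference, a global colour histogram of the kind mentioned above can be sketched as follows; the channel quantisation and bin count are illustrative assumptions and may not match the descriptor actually used in the experiments.

```python
import numpy as np

def global_colour_histogram(image, bins_per_channel=4):
    """Quantise each RGB channel and build a normalised joint colour histogram.
    `image` is an (H, W, 3) uint8 array; the exact quantisation is illustrative."""
    quantised = (image.astype(int) * bins_per_channel) // 256
    index = (quantised[..., 0] * bins_per_channel + quantised[..., 1]) * bins_per_channel + quantised[..., 2]
    hist = np.bincount(index.ravel(), minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()

# Hypothetical usage with a random image.
image = np.random.randint(0, 256, size=(120, 160, 3), dtype=np.uint8)
print(global_colour_histogram(image).shape)  # (64,)
```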
Attacking the Gap from
above: Ontologies
A popular knowledge representation scheme
Impetus from semantic web activity
A shared conceptualisation of a domain
Can structure and enhance the semantics of the
image and its content
For image retrieval it is useful to consider
ontologies in two parts:
content ontologies and context ontologies
Context Ontology
The SCULPTEUR project
SCULPTEUR was a three-year EU
project that finished last year
Aimed to develop
multimedia handling facilities
for museums
The CIDOC CRM was used to
model contextual information
about art objects
Artifact metadata mapped to
ontology for all museums
Enabled semantic level searches,
combined semantic and content-
based searches and allowed
interoperability between
museums
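A highly simplified sketch of what a combined semantic and content-based search can look like: filter artefacts by ontology-level metadata first, then rank the survivors by low-level feature similarity to an example object. The record structure, field names and feature values below are invented for illustration and are not the SCULPTEUR data model.

```python
import numpy as np

# Hypothetical records: ontology-level metadata plus a feature vector per artefact.
artefacts = [
    {"id": "vase-01", "type": "Vase", "features": np.array([0.20, 0.70, 0.10])},
    {"id": "vase-02", "type": "Vase", "features": np.array([0.25, 0.65, 0.10])},
    {"id": "bust-07", "type": "Bust", "features": np.array([0.80, 0.10, 0.10])},
]

def combined_search(semantic_filter, example_features):
    """Keep only artefacts matching the semantic constraints, then rank the
    survivors by distance from an example object's feature vector."""
    matches = [a for a in artefacts if all(a.get(k) == v for k, v in semantic_filter.items())]
    matches.sort(key=lambda a: np.linalg.norm(a["features"] - example_features))
    return [a["id"] for a in matches]

print(combined_search({"type": "Vase"}, np.array([0.2, 0.7, 0.1])))
```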
Content Ontology
The MIAKT project
MIAKT was a two year UK
(EPSRC) funded project
formed from the AKT and
MIAS IRCs
Aimed to develop software
to support the breast
cancer screening process
Manual delineation of regions
of interest
Automatic feature extraction
from ROI, translation to object
labels and link to ontology
Provides semantic navigation
to images and a platform for
reasoning and decision support
[Figure: fragment of the MIAKT content ontology: a Mammogram contains a graphic Region-of-interest (with an image-id), which has has-descriptor links to Image Descriptors such as Texture, Margin, Shape, ...]
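A toy sketch of the feature-extraction-to-labels step: compute simple descriptors over a manually delineated region of interest and translate them into symbolic values that could populate has-descriptor properties in the ontology. The descriptors, thresholds and label vocabulary here are illustrative assumptions, not the MIAKT feature set.

```python
import numpy as np

def describe_roi(image, roi_mask):
    """Compute toy texture/shape descriptors for a manually delineated region
    of interest and translate them into symbolic labels (thresholds and label
    names are illustrative only)."""
    pixels = image[roi_mask]
    descriptors = {
        "mean-intensity": float(pixels.mean()),
        "texture-variance": float(pixels.var()),
        "area": int(roi_mask.sum()),
    }
    labels = {
        "has-margin": "ill-defined" if descriptors["texture-variance"] > 500 else "well-defined",
        "has-shape": "irregular" if descriptors["area"] < 200 else "round",
    }
    return descriptors, labels

# Hypothetical usage: a synthetic greyscale patch and a circular ROI mask.
image = np.random.randint(0, 256, size=(64, 64))
yy, xx = np.mgrid[:64, :64]
roi_mask = (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
print(describe_roi(image, roi_mask))
```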
Conclusions/Future Work
The semantic gap in image retrieval can be viewed as two parts
The first part, between image descriptors and labels, can be
attacked using auto-annotation and/or semantic spaces
We are looking at the use of ontologies as a way to help
bridge the second part of the gap between labels and
semantics
Some possibilities for future research:
Using the ontology to help structure the semantic space
Developing structured search methodologies that augment
keyword-based search using ontology-based reasoning
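As a flavour of the second possibility, the sketch below expands a keyword query with all sub-concepts reachable in a toy ontology before the expanded terms are handed to a keyword or semantic-space search; the concept hierarchy is invented purely for illustration.

```python
# Hypothetical toy ontology: each concept maps to its direct sub-concepts.
ontology = {
    "building": ["church", "castle"],
    "church": ["cathedral", "chapel"],
    "castle": [],
    "cathedral": [],
    "chapel": [],
}

def expand_query(keywords):
    """Augment a keyword query with all sub-concepts reachable in the ontology,
    so a search for 'building' also retrieves images labelled 'cathedral' etc."""
    expanded, frontier = set(keywords), list(keywords)
    while frontier:
        concept = frontier.pop()
        for child in ontology.get(concept, []):
            if child not in expanded:
                expanded.add(child)
                frontier.append(child)
    return sorted(expanded)

print(expand_query(["building"]))  # ['building', 'castle', 'cathedral', 'chapel', 'church']
```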
