Bridging Intent and Semantic Gaps in Multimedia Search
1. Sketching out Your Search Intent
- Towards bridging intent and semantic gaps in web-scale multimedia search
Xian-Sheng Hua
Lead Researcher, Microsoft Research Asia
June 24 2010 – ACM Multimedia 2010 TPC Meeting – University of Amsterdam
2. Xian-Sheng Hua, MSR Asia
Evolution of CBIR/CBVR
[Figure: timeline of CBIR/CBVR evolution, with time marks 1970~1980', 1990', ~2001, ~2003, and ~2006. Labels: Manual Labeling Based, Academia Prototypes, Conventional CBIR, Direct-Text Based, Commercial Search Engines, (1) Tagging Based, (2) Large-Scale CBIR/CBVR, Annotation Based]
3. Xian-Sheng Hua, MSR Asia
Three Schemes
Example-Based
Annotation-Based
Text-Based
4. Xian-Sheng Hua, MSR Asia
Limitation of Text-Based Search
• High noise
̵ Surrounding text: frequently not related to the visual content
̵ Social tags: also noisy, and a large portion of images have no tags
• Mostly covers only simple semantics
̵ Surrounding text: limited
̵ Social tags: people tend to enter only simple, personal tags
̵ Query association: powerful, but coverage is still limited (non-clicked images will never be well indexed)
A complex example:
Finding Lady Gaga walking on the red carpet in a black dress
5.
6. Xian-Sheng Hua, MSR Asia
Limitation of Annotation-Based Search
• Low accuracy
̵ The accuracy of automatic annotation is still far from satisfactory
̵ Difficult to solve due
• Limited coverage
̵ The coverage of the semantic concepts is still far from the rich content that is contained in an image
̵ Difficult to extend
• High computation
̵ Learning is computationally expensive
̵ Recognition is costly if the number of concepts is large
Still has a long way to go
7. Xian-Sheng Hua, MSR Asia
Limitation of Large-Scale Example-Based Search
• You need an example first
̵ What can I do if I don’t have an initial example?
• Large-scale (semantic) similarity search is still difficult
̵ Though large-scale (near) duplicate detection is good
An example: TinEye
8. Xian-Sheng Hua, MSR Asia
Possible Ways Out?
• Large-scale high-performance annotation?
̵ Model-based? Data-driven? Hybrid? …
• Large-scale manual labeling?
̵ ESP Game? LabelMe? Mechanical Turk? …
• New query interface?
̵ Interactive search?
Or … To combine text, content and interface?
9. Xian-Sheng Hua, MSR Asia
Simple Attempts
• Exemplary attempts based on this idea
̵ Search by dominant color (Google/Bing)
̵ Show similar images (Bing)/Find similar images (Google)
̵ Show more sizes (Bing)
10.
11.
12.
13.
14.
15.
16. Xian-Sheng Hua, MSR Asia
Two Recent Projects
• Sketching out your search intent
̵ Image Search by Color Sketch
̵ Image Search by Concept Sketch
18. Xian-Sheng Hua, MSR Asia
Seems It Is Not New?
• “Query by sketch” was proposed a long time ago
̵ But on small datasets, and it is difficult to scale up
̵ And it seldom works well in practice
̵ Not easy to use – “object sketch”
19.
20. It is based on a SIGGRAPH ’95 paper:
C. Jacobs et al. Fast Multiresolution Image Querying. SIGGRAPH 1995
Works on a small-scale image set and is difficult to scale up (0.44 seconds for 10,000 images)
23. Xian-Sheng Hua, MSR Asia
Our Approach
• Color sketch …
̵ Is not “object sketch”
̵ But only rough color distribution
̵ Supports large-scale data
̵ And uses text at the same time
24.
25.
26. Xian-Sheng Hua, MSR Asia
Our Approach
• Advantages of “Color Sketch”
̵ More representative than dominant color (or text alone)
̵ More robust than “object sketch”
̵ Easy to use compared with “object sketch”
̵ Easy to scale up
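The idea of combining a text query with a rough color distribution can be illustrated with a minimal sketch. This is not the system described in the talk: the toy index, the 2x2 grid of mean colors, and the function names `sketch_distance` and `search` are all assumptions made for illustration. Text is used to select candidates, and the color sketch is used only to rank them.

```python
from math import sqrt

# Hypothetical toy index (an assumption for illustration): each image has
# text tags and a 2x2 grid of mean RGB colors in row-major order
# (top-left, top-right, bottom-left, bottom-right).
INDEX = {
    "beach.jpg": {
        "tags": {"beach", "sea"},
        "grid": [(90, 160, 230), (90, 160, 230), (230, 210, 150), (230, 210, 150)],
    },
    "sunset.jpg": {
        "tags": {"beach", "sunset"},
        "grid": [(240, 120, 40), (240, 120, 40), (40, 30, 60), (40, 30, 60)],
    },
    "forest.jpg": {
        "tags": {"forest"},
        "grid": [(30, 110, 40), (30, 110, 40), (60, 80, 30), (60, 80, 30)],
    },
}


def color_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sketch_distance(sketch, grid):
    """Mean color distance over the cells the user painted.

    Cells left as None are unconstrained and ignored.
    """
    painted = [(s, g) for s, g in zip(sketch, grid) if s is not None]
    if not painted:
        return 0.0
    return sum(color_distance(s, g) for s, g in painted) / len(painted)


def search(text, sketch, index=INDEX):
    """Filter by text tag, then rank candidates by closeness to the sketch."""
    candidates = [name for name, entry in index.items() if text in entry["tags"]]
    return sorted(candidates, key=lambda n: sketch_distance(sketch, index[n]["grid"]))


# "beach" with a blue top half and an unconstrained bottom half.
print(search("beach", [(80, 150, 240), (80, 150, 240), None, None]))
```

Because the sketch constrains only coarse cells, the user never has to draw an object outline, which is the robustness and ease-of-use argument made in the bullets above.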
66. Xian-Sheng Hua, MSR Asia
Summary – Why it Works
• What is difficult
̵ Semantics is difficult
̵ Intent prediction is difficult
• What is easier
̵ Use color to describe images’ content
̵ Use color to describe users’ intent
[Diagram: content ↔ color sketch ↔ intent]
Color sketch is an intermediate representation to connect images’ content and users’ intent.
67. Xian-Sheng Hua, MSR Asia
Limitation of Color Sketch
• Only works well for those semantics that can be described by color distribution
• Only works well when people are able to translate their intent into a color distribution
A more complex example:
A flying butterfly on the top-left of a flower.
69. Xian-Sheng Hua, MSR Asia
Search By Concept Sketch
• A system for image search intents concerning:
̵ The presence of the semantic concepts (semantic constraints)
̵ The spatial layout of the concepts (spatial constraints)
[Figure: a concept map query containing “butterfly” and “flower”, and the image search results of our system]
70. Xian-Sheng Hua, MSR Asia
Framework
Input: a concept map query (A: butterfly at X=0.5, Y=0.5; B: flower at X=0.8, Y=0.8) and a candidate image
(1) Create Candidate Image Set (from the query concepts A: butterfly, B: flower)
(2) Instant Concept Modeling, producing Concept Models (butterfly, flower)
(3) Concept Distribution Estimation, producing Estimated distributions (butterfly, flower)
(4) Intent Interpretation, producing Desired distributions (butterfly, flower)
(5) Spatial distribution comparison, producing the Relevance score
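Steps 4 and 5 of the framework (intent interpretation and spatial distribution comparison) can be illustrated with a small sketch. The slides do not give the formulas, so everything here is an assumption: the desired distribution is modeled as a normalized Gaussian bump around the concept's position, the estimated distribution is taken as a given per-cell score map, and the comparison is a histogram intersection averaged over concepts. The names `desired_distribution`, `overlap`, `relevance`, and `grid_map` are hypothetical.

```python
from math import exp

GRID = 4  # hypothetical grid resolution; cells are stored in row-major order


def desired_distribution(x, y, sigma=0.25, grid=GRID):
    """Assumed form of step 4: a normalized Gaussian bump centered at (x, y).

    Coordinates are in [0, 1], with y increasing downward.
    """
    cells = []
    for row in range(grid):
        for col in range(grid):
            cx, cy = (col + 0.5) / grid, (row + 0.5) / grid
            cells.append(exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2 * sigma ** 2)))
    total = sum(cells)
    return [c / total for c in cells]


def normalize(cells):
    """Scale a non-negative score map so that it sums to 1 (all-zero maps stay zero)."""
    total = sum(cells)
    return [c / total for c in cells] if total > 0 else list(cells)


def overlap(p, q):
    """Histogram intersection of two normalized maps, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(p, q))


def relevance(query, estimated, grid=GRID):
    """Assumed form of step 5: mean overlap between desired and estimated maps."""
    empty = [0.0] * (grid * grid)
    scores = [
        overlap(desired_distribution(x, y, grid=grid), normalize(estimated.get(name, empty)))
        for name, (x, y) in query.items()
    ]
    return sum(scores) / len(scores)


def grid_map(cells_on, grid=GRID):
    """Build a score map with 1.0 in the listed (row, col) cells."""
    cells = [0.0] * (grid * grid)
    for row, col in cells_on:
        cells[row * grid + col] = 1.0
    return cells


# The concept map from the slide: butterfly at the center, flower at bottom-right.
query = {"butterfly": (0.5, 0.5), "flower": (0.8, 0.8)}
center = [(1, 1), (1, 2), (2, 1), (2, 2)]
matching = {"butterfly": grid_map(center), "flower": grid_map([(3, 3)])}
swapped = {"butterfly": grid_map([(3, 3)]), "flower": grid_map(center)}
print(relevance(query, matching) > relevance(query, swapped))
```

In this sketch an image whose concepts sit where the user placed them scores higher than one with the same concepts in swapped positions, which is the behavior the spatial constraint is meant to capture.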
71. Xian-Sheng Hua, MSR Asia
Search Results Comparison
Task: searching for images in which “snoopy appears at the right”
[Figure: results compared for text-based image search (query “snoopy”), “show similar images”, and image search by concept map (query “snoopy”)]
72. Xian-Sheng Hua, MSR Asia
Search Results
Task: searching for images in which “snoopy appears at the top” / “snoopy appears at the left”
73. Xian-Sheng Hua, MSR Asia
Search Results Comparison
Task: searching for images with “a jeep shown above a piece of grass”
[Concept map labels: jeep, grass]
74. Xian-Sheng Hua, MSR Asia
Search Results Comparison
Task: searching for images with “a car parked in front of a house”
[Concept map labels: house, car]
75. Xian-Sheng Hua, MSR Asia
Search Results Comparison
Task: searching for images with “a keyboard and a mouse side by side”
[Concept map labels: keyboard, mouse]
76. Xian-Sheng Hua, MSR Asia
Search Results Comparison
Task: searching for images in which “sky/Colosseum/grass appear from top to bottom”
[Concept map labels: Colosseum, grass]
77. Xian-Sheng Hua, MSR Asia
Advanced Function: Influence Scope Indication
• Allow users to explicitly specify the region occupied by a concept
[Figure: concept maps with “house” and “lawn”, shown with the default influence scope and with user-specified regions]
Searching for a long, narrow lawn at the bottom of the image
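One way to picture an explicitly specified influence scope is as a per-axis spread on the desired distribution of a concept. The sketch below is an assumption, not the formulation used in the talk: it models the scope as an axis-aligned Gaussian with separate horizontal and vertical spreads, and the names `scoped_distribution` and `row_mass` are hypothetical. A wide, flat scope near the bottom edge stands in for the "long narrow lawn" example.

```python
from math import exp

GRID = 8  # hypothetical grid resolution; cells are stored in row-major order


def scoped_distribution(x, y, sigma_x, sigma_y, grid=GRID):
    """Normalized axis-aligned Gaussian centered at (x, y).

    sigma_x and sigma_y play the role of the user-specified influence scope.
    Coordinates are in [0, 1], with y increasing downward.
    """
    cells = []
    for row in range(grid):
        for col in range(grid):
            cx, cy = (col + 0.5) / grid, (row + 0.5) / grid
            exponent = (cx - x) ** 2 / (2 * sigma_x ** 2) + (cy - y) ** 2 / (2 * sigma_y ** 2)
            cells.append(exp(-exponent))
    total = sum(cells)
    return [c / total for c in cells]


def row_mass(dist, row, grid=GRID):
    """Total probability mass that falls in one grid row."""
    return sum(dist[row * grid:(row + 1) * grid])


# Default scope: the same spread in both directions.
default_lawn = scoped_distribution(0.5, 0.9, sigma_x=0.2, sigma_y=0.2)
# User-specified scope: wide horizontally, thin vertically.
narrow_lawn = scoped_distribution(0.5, 0.9, sigma_x=0.6, sigma_y=0.05)

bottom = GRID - 1
print(row_mass(default_lawn, bottom), row_mass(narrow_lawn, bottom))
```

With the thin vertical spread, most of the desired mass lands in the bottom row and is spread across its full width, so images with a strip of lawn along the bottom edge would match better than images with a lawn filling the lower half.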
78. Xian-Sheng Hua, MSR Asia
Advanced Function: Influence Scope Adjustment
[Figure: two concept maps with differently sized influence scopes]
Searching for a small windmill
Searching for a large windmill
79. Xian-Sheng Hua, MSR Asia
Advanced Function: Visual Modeling Adjustment
[Screenshot: concept map with “sky”, “house”, and “lawn”; interface controls: Clear, Search, Advanced Function, Visual Assistor, Previous, Next]
Allow users to indicate or narrow down what a desired concept looks like (visual constraints)
80. Xian-Sheng Hua, MSR Asia
Advanced Function: Visual Modeling Adjustment
[Figure: concept map with “jeep” and the selected visual instances]
Searching for front-view jeeps
Searching for side-view jeeps
81. Xian-Sheng Hua, MSR Asia
Advanced Function: Visual Modeling Adjustment
[Figure: concept map with “butterfly” and “flower” and the selected visual instances]
Searching for a butterfly with a yellow flower
Searching for a butterfly with a red flower
82. Xian-Sheng Hua, MSR Asia
Conclusions
• Bridge the gap between users’ intent and the visual semantics of images by
̵ Sketching out your search intent, and
̵ Combining text and visual content
• Two exemplary approaches introduced
̵ Search by color sketch
̵ Search by concept sketch