Semantically Relevant Visual Dictionary
Ashish Gupta (CVSSP)
University of Surrey
a.gupta@surrey.ac.uk
July 10, 2012
Ashish Gupta (CVSSP) Semantically Relevant Visual Dictionary
Contents
Introduction: Visual Category Recognition
Current practice: Visual Dictionary
Problem: inter-mixed feature vectors
Approach: Over-partition + Co-cluster image-word matrix
Solution: Group estimated categorically related partitions
Experiments
Summary
Visual Category Recognition
Definition
Detect presence of an instance of a
visual category in an image.
Challenges
Several variations in visual category appearance render category
recognition very difficult.
Visual Dictionary
Visual Word
Representative feature vector
(generally centroid) of each
cluster.
Image Histogram
Histogram of assignments of
image feature vectors to visual
words.
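The two definitions above can be sketched in a few lines. This is a minimal illustration, assuming the visual words are already available (for example as cluster centroids) and that assignment is by nearest Euclidean distance; the names `nearest_word` and `image_histogram` and the toy 2-D data are hypothetical and not from the slides.

```python
import math

def nearest_word(descriptor, words):
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(range(len(words)), key=lambda j: math.dist(descriptor, words[j]))

def image_histogram(descriptors, words):
    """Count how many of an image's descriptors are assigned to each visual word."""
    hist = [0] * len(words)
    for d in descriptors:
        hist[nearest_word(d, words)] += 1
    return hist

# Toy 2-D stand-ins for descriptors; real SIFT descriptors are 128-dimensional.
words = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9), (0.2, -0.1)]

hist = image_histogram(descriptors, words)
print(hist)
```

Each image is thus represented by a fixed-length vector whose length equals the dictionary size, regardless of how many descriptors the image produced.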
Problems with Visual Dictionary
Inter-mixed
Categorically dissimilar feature vectors inter-mixed in feature space.
Semantic scatter
Feature vectors pertaining to the same category part are scattered
in feature space.
Inter-mixed Feature Vectors
Categorically equivalent
vectors mapped to naturally
occurring clusters
Easily partitioned to yield
discriminative dictionary
elements
Inter-mixed Feature Vectors
Categorically dissimilar vectors
inter-mixed
Partitioning yields
non-discriminative dictionary
Inter-mixed Feature Vectors
Over-partition feature space into tiny clusters.
Build a dictionary using these tiny clusters.
Semantic Scatter
Small variations across instances of an object part cause the
associated descriptors to scatter in feature space.
Combine visual words which are related and create a visual
topic.
Hypothesis
Semantically related words can be discovered by analysing
image-word distribution.
Visual Topic Dictionary ← Visual Word Dictionary
Co-Clustering
Formulate the image-word matrix as a joint probability distribution $p(X, Y)$.
$C_X : \{x_1, x_2, \ldots, x_m\} \to \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k\}$
$C_Y : \{y_1, y_2, \ldots, y_n\} \to \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_l\}$
The tuple $(C_X, C_Y)$ is referred to as a co-clustering.
Re-ordering the rows and columns of the matrix by cluster assignment
gives rise to blocks, referred to as co-clusters.
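A minimal sketch of this formulation, assuming rows index images (X) and columns index visual words (Y), and that the cluster maps are already given rather than learned. The function names, the toy count matrix, and the assignments `row_map` and `col_map` are hypothetical illustrations.

```python
def joint_distribution(counts):
    """Normalise an image-word count matrix so that all entries sum to 1."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]

def cocluster_blocks(p, row_map, col_map, k, l):
    """Sum p(x, y) over each (row cluster, column cluster) block."""
    blocks = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(p):
        for j, value in enumerate(row):
            blocks[row_map[i]][col_map[j]] += value
    return blocks

# Toy image-word counts: 4 images (rows) by 4 visual words (columns).
counts = [
    [4, 3, 0, 0],
    [3, 4, 0, 1],
    [0, 0, 5, 4],
    [1, 0, 4, 5],
]
row_map = [0, 0, 1, 1]  # assumed row map: image index -> image cluster
col_map = [0, 0, 1, 1]  # assumed column map: word index -> word cluster

p = joint_distribution(counts)
blocks = cocluster_blocks(p, row_map, col_map, k=2, l=2)
for block_row in blocks:
    print([round(v, 3) for v in block_row])
```

With these assignments most of the probability mass falls in the two diagonal blocks, which is the block structure that re-ordering the rows and columns makes visible.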
Co-clustering contd.
Optimal co-clustering minimizes the loss in mutual information
$I(X; Y) - I(\hat{X}; \hat{Y})$, given the number of row ($k$) and column ($l$)
clusters.
For a $(C_X, C_Y)$, the loss in mutual information can be expressed as the
KL-divergence between $p(X, Y)$ and an approximation $q(X, Y)$:
$I(X; Y) - I(\hat{X}; \hat{Y}) = D_{\mathrm{KL}}(p(X, Y) \,\|\, q(X, Y))$
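The identity can be checked numerically on a toy example. The slides do not spell out q; the sketch below assumes the form used in standard information-theoretic co-clustering, q(x, y) = p(x̂, ŷ) p(x | x̂) p(y | ŷ), with fixed, hand-picked cluster maps. All function names and the toy data are hypothetical.

```python
import math

def marginals(p):
    """Row and column marginals of a joint distribution given as a list of rows."""
    return [sum(row) for row in p], [sum(col) for col in zip(*p)]

def mutual_information(p):
    """I(X; Y) in nats for a joint distribution p."""
    px, py = marginals(p)
    return sum(v * math.log(v / (px[i] * py[j]))
               for i, row in enumerate(p)
               for j, v in enumerate(row) if v > 0)

def aggregate(p, row_map, col_map, k, l):
    """Joint distribution over cluster pairs, p(x_hat, y_hat)."""
    blocks = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            blocks[row_map[i]][col_map[j]] += v
    return blocks

def approximation(p, row_map, col_map, k, l):
    """Assumed form: q(x, y) = p(x_hat, y_hat) * p(x | x_hat) * p(y | y_hat)."""
    px, py = marginals(p)
    p_hat = aggregate(p, row_map, col_map, k, l)
    phx, phy = marginals(p_hat)
    return [[p_hat[row_map[i]][col_map[j]]
             * (px[i] / phx[row_map[i]])
             * (py[j] / phy[col_map[j]])
             for j in range(len(py))]
            for i in range(len(px))]

def kl_divergence(p, q):
    """D_KL(p || q) in nats, summed over entries where p is non-zero."""
    return sum(v * math.log(v / q[i][j])
               for i, row in enumerate(p)
               for j, v in enumerate(row) if v > 0)

counts = [[4, 3, 0, 0], [3, 4, 0, 1], [0, 0, 5, 4], [1, 0, 4, 5]]
total = sum(map(sum, counts))
p = [[c / total for c in row] for row in counts]
row_map, col_map = [0, 0, 1, 1], [0, 0, 1, 1]

loss = mutual_information(p) - mutual_information(aggregate(p, row_map, col_map, 2, 2))
kl = kl_divergence(p, approximation(p, row_map, col_map, 2, 2))
print(loss, kl)
```

Under this choice of q the two printed values agree up to floating-point error, so minimizing the KL-divergence over cluster maps is the same as minimizing the loss in mutual information.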
Conceptual view
Image histogram feature vectors in high-dimensional visual words
space are projected to lower dimensional visual topic space.
The distance between feature vectors from the same category is
reduced.
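A minimal sketch of this projection, assuming each visual word has already been assigned to exactly one visual topic. The mapping `word_to_topic`, the function name, and the two toy histograms are hypothetical.

```python
import math

def topic_histogram(word_hist, word_to_topic, n_topics):
    """Project a visual-word histogram onto topics by summing counts per topic."""
    topic_hist = [0] * n_topics
    for word, count in enumerate(word_hist):
        topic_hist[word_to_topic[word]] += count
    return topic_hist

# Six visual words grouped into three topics (assumed grouping).
word_to_topic = [0, 0, 1, 1, 2, 2]

# Two images of the same category whose descriptors landed on different,
# but semantically related, visual words.
img_a = [3, 0, 2, 0, 0, 1]
img_b = [0, 3, 0, 2, 1, 0]

topics_a = topic_histogram(img_a, word_to_topic, 3)
topics_b = topic_histogram(img_b, word_to_topic, 3)

d_word = math.dist(img_a, img_b)
d_topic = math.dist(topics_a, topics_b)
print(d_word, d_topic)
```

In this constructed case the two images are far apart in word space but coincide in topic space, which is the effect the grouping is meant to produce for images of the same category.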
Experiment
Feature descriptor
SIFT : Affine co-variant local image patch descriptor.
Data sets
Scene-15; Pascal VOC 2006; VOC 2007; VOC 2010.
Classifier
k-NN: used to verify whether the mutual distance between categorically
equivalent feature vectors is reduced.
Performance metric
F1-score: harmonic mean of precision and recall. Popularly used in
classification and retrieval communities.
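The performance metric can be written out directly from its definition. This is a small sketch for a single category treated as the positive class; the function name, labels, and predictions are made up for illustration and are not results from the experiments.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one category: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth and classifier output for six images.
y_true = ["car", "car", "car", "bus", "bus", "bus"]
y_pred = ["car", "car", "bus", "car", "bus", "bus"]

score = f1_score(y_true, y_pred, positive="car")
print(score)
```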
Scene-15 Dataset
It has 15 visual categories of natural indoor and outdoor scenes.
Each category has about 200 to 400 images and the entire dataset
has 4485 images.
PASCAL VOC2006 Dataset
It has 10 visual categories with about 175 to 650 images per
category. There are a total of 5304 images.
PASCAL VOC2007 Dataset
It has 20 visual categories. Each category contains between 100 and
2000 images, with 9963 images in all.
PASCAL VOC2010 Dataset
It has 20 visual categories and 300 to 3500 images in each
category. Combines data from VOC2008 and VOC2009.
Dictionary Size
10,000 words → n Topics. Appropriate number of Topics?
Large dictionary becomes category dependent.
Summary
Visual dictionary is limited: unsupervised clustering.
Significant intra-category appearance variation: semantic scatter.
Feature vectors from different visual categories inter-mixed in
feature space.
Visual Topic ← Visual Word: grouping over-partitioned feature
space.
Co-clustering Image-Word distribution: discover optimal grouping
of words with minimal loss in mutual information.
Semantic dimensionality reduction.
Thank you.
Acknowledgement