Content-Based Image Retrieval (CBIR)
Behzad Shomali
What is CBIR?
Content-based image retrieval, also known as query by image content (QBIC) and content-based
visual information retrieval (CBVIR), is the application of computer vision techniques to the image
retrieval problem, that is, the problem of searching for digital images in large databases.
https://en.wikipedia.org/wiki/Content-based_image_retrieval
[Figure: CBIR pipeline. Query image → feature extraction; database image → feature extraction; both feature sets → similarity measurement → retrieved images]
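The pipeline in the figure can be sketched in a few lines. This is only an illustrative toy, not the method of any system named in these slides: the grey-level histogram feature, the cosine similarity measure, and the names `extract_features`, `cosine_similarity`, and `retrieve` are all assumptions made for the example. A real system would typically use learned embeddings and precompute the database features offline.

```python
import math

def extract_features(image, bins=4):
    """Toy feature extractor (assumption): a normalised grey-level histogram.

    `image` is a flat list of pixel intensities in [0, 255].
    """
    hist = [0.0] * bins
    for p in image:
        hist[min(p * bins // 256, bins - 1)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def cosine_similarity(a, b):
    """Similarity measurement between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_image, database, top_k=2):
    """Rank database images by feature similarity to the query image."""
    q = extract_features(query_image)
    scored = [(cosine_similarity(q, extract_features(img)), name)
              for name, img in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical four-pixel "images".
database = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 20, 240, 250],
}
result = retrieve([5, 15, 25, 60], database)
print(result)
```

The query is a dark image, so the dark database image ranks first and the half-dark one second.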
Technologies
● Query by example (QBE)
● Semantic retrieval
● Relevance feedback (human interaction)
● Iterative/machine learning
● Other query methods
https://en.wikipedia.org/wiki/Content-based_image_retrieval
Application in popular search systems
● Google Images
○ Constructing a mathematical model
○ Metadata
● eBay
○ ResNet-50 for category recognition
● SK Planet
○ Inception-v3 as vision encoder
○ RNN multi-class classification
● Alibaba
○ GoogLeNet V1 for category prediction and feature learning
● Pinterest
○ Two-step object detection
https://en.wikipedia.org/wiki/Reverse_image_search
Image Representation and Features
● Extract local and deep features
● Studied AlexNet and VGG
○ Extract feature representations from fc6 and fc8 layers
○ Binarized
○ Hamming distance
● Extract salient color signatures
○ Detect salient regions
○ K-means clustering
○ Store cluster centroids and weights as image signature
[Jing, Yushi, et al. "Visual search at pinterest." Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015]
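The "binarized" and "Hamming distance" steps above can be illustrated with a minimal sketch. The threshold of 0.0 and the function names `binarize` and `hamming_distance` are assumptions for the example; the slides do not say how the features were thresholded.

```python
def binarize(features, threshold=0.0):
    """Map each activation to 1 if it exceeds the threshold, else 0.

    The threshold value is an assumption made for this sketch.
    """
    return [1 if v > threshold else 0 for v in features]

def hamming_distance(a, b):
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical activation vectors standing in for fc-layer features.
query = binarize([0.9, -0.2, 0.0, 1.3])
candidate = binarize([0.4, 0.7, -1.1, 2.0])
distance = hamming_distance(query, candidate)
print(query, candidate, distance)
```

Binary codes are compact to store, and comparing them needs only bit-level operations, which is why this representation suits large image collections.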
Two-step Object Detection and Localization
1. Category classification
2. Object detection
[Jing, Yushi, et al. "Visual search at pinterest." Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015]
[Figure: Input → category classification (Car, Flower, Person) → object detection (c1 ... cn, f1 ... fm, p1 ... pk). Annotation: reduce computational cost]
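One way to read the figure is that the category classifier acts as a gate: only the detectors belonging to the predicted category are run. The sketch below assumes that reading. `two_step_detect`, `make_detector`, and the stand-in classifier are hypothetical names, not part of the cited system.

```python
def two_step_detect(image, classify, detectors_by_category):
    """Step 1: classify the image. Step 2: run only that category's detectors."""
    category = classify(image)
    detections = []
    for detector in detectors_by_category.get(category, []):
        detections.extend(detector(image))
    return category, detections

calls = []  # records which detectors actually ran

def make_detector(name):
    """Build a dummy detector that returns one fixed bounding box."""
    def detector(image):
        calls.append(name)
        return [(name, (0, 0, 10, 10))]
    return detector

detectors = {
    "car": [make_detector("c1"), make_detector("c2")],
    "flower": [make_detector("f1")],
    "person": [make_detector("p1")],
}

# Stand-in classifier that always predicts "flower".
category, detections = two_step_detect("img", lambda img: "flower", detectors)
print(category, detections, calls)
```

Here only `f1` runs; the car and person detectors are skipped, which is where the saving in computation would come from.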
Static Evaluation of Search Relevance
● The dataset contains 1.6 M unique images
○ Two images are assumed to be relevant to each other if they share a label
● Precision@k was computed for several feature types
○ The fc6 layer activations of the generic AlexNet (pre-trained for ILSVRC)
○ The fc6 activations of an AlexNet model fine-tuned to recognize over 3,000 Pinterest product categories
○ The loss3/classifier activations of a generic GoogLeNet
○ The fc6 activations of a generic VGG 16-layer model
[Jing, Yushi, et al. "Visual search at pinterest." Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015]
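The evaluation described above, with relevance defined by shared labels, can be sketched as follows. The function name `precision_at_k` and the example label sets are made up for illustration.

```python
def precision_at_k(query_labels, ranked_labels, k):
    """Fraction of the top-k results that share at least one label with the query."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for labels in ranked_labels[:k] if query_labels & labels)
    return hits / k

# Hypothetical label sets for a query and its ranked results.
query_labels = {"shoes"}
ranked = [{"shoes", "red"}, {"bags"}, {"shoes"}, {"hats"}, {"shoes", "boots"}]

p2 = precision_at_k(query_labels, ranked, 2)
p5 = precision_at_k(query_labels, ranked, 5)
print(p2, p5)
```

In this toy ranking, one of the top two and three of the top five results share a label with the query.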
Precision vs. Recall
[Müller, Henning, et al. "Performance evaluation in content-based image retrieval: overview and proposals." Pattern recognition letters 22.5 (2001): 593-601]
Either value alone contains insufficient information:
● Recall can always be made 1 simply by retrieving all images
● Similarly, precision can be kept high by retrieving only a few images

● P(10), P(30), P(N_R): the precision after the first 10, 30, and N_R documents are retrieved, where N_R is the number of relevant documents
● Mean Average Precision: the mean of the (non-interpolated) average precision
● Recall at .5 precision: recall at the rank where precision drops below .5
● R(1000): recall after 1000 documents are retrieved
● Rank first relevant: the rank of the highest-ranked relevant document
Precision and recall should therefore be used together.
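Several of the measures listed above can be computed from a ranked list of 0/1 relevance flags. The sketch below is illustrative only; the function names and the example ranking are assumptions.

```python
def precision_recall_curve(relevance, total_relevant):
    """Return (precision, recall) after each rank, given 0/1 relevance flags."""
    points, hits = [], 0
    for rank, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((hits / rank, hits / total_relevant))
    return points

def average_precision(relevance, total_relevant):
    """Non-interpolated AP: mean of the precision values at each relevant rank."""
    hits, acc = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            acc += hits / rank
    return acc / total_relevant if total_relevant else 0.0

def rank_first_relevant(relevance):
    """Rank (1-based) of the highest-ranked relevant document, or None."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return rank
    return None

# Hypothetical ranking of a 5-image collection with 2 relevant images.
relevance = [1, 0, 1, 0, 0]
curve = precision_recall_curve(relevance, total_relevant=2)
ap = average_precision(relevance, total_relevant=2)
first = rank_first_relevant(relevance)
print(curve[-1], ap, first)
```

The last point of the curve shows the trade-off: retrieving the whole collection brings recall to 1.0 while precision falls to 0.4. Mean Average Precision is the mean of `average_precision` over a set of queries.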
Relevance of visual search
Table 1 shows p@5 and p@10 performance of these models, along with the average CPU-based
latency of our visual search service, which includes feature extraction for the query image as well as
retrieval.
[Jing, Yushi, et al. "Visual search at pinterest." Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015]
Siamese networks
● Let F(X) be a family of functions with parameters W, assumed to be differentiable with respect to W. A Siamese network seeks a value of W such that the symmetric similarity metric (in effect, a distance between F(X1) and F(X2)) is small if X1 and X2 belong to the same category and large if they belong to different categories.
[Das, Arpita, et al. "Together we stand: Siamese networks for similar question retrieval." Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers). 2016]
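The defining property of a Siamese network is that the same parameters W are applied to both inputs before the outputs are compared. A minimal sketch follows, assuming a single linear layer for F and Euclidean distance as the metric; neither choice comes from the cited paper, and the names `embed` and `siamese_distance` are hypothetical.

```python
import math

def embed(x, W):
    """Shared embedding F(X): one linear layer, y = W x (an assumption)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def siamese_distance(x1, x2, W):
    """Embed both inputs with the same W, then take the Euclidean distance."""
    e1, e2 = embed(x1, W), embed(x2, W)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))

W = [[1.0, 0.0],
     [0.0, 2.0]]

d_same = siamese_distance([1.0, 1.0], [1.0, 1.0], W)
d_diff = siamese_distance([1.0, 1.0], [4.0, 3.0], W)
print(d_same, d_diff)
```

Because both branches share W, the metric is symmetric: swapping the two inputs gives the same value. Training would adjust W so that same-category pairs yield small distances.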
Different loss functions for training a Siamese network
Two commonly used ones are:
● Triplet loss
● Contrastive loss
The main idea of these loss functions is to pull the samples of every class toward one another and push the samples of different classes away from each other.
[Ghojogh, Benyamin, et al. "Fisher discriminant triplet and contrastive losses for training siamese networks." 2020 International Joint Conference on Neural
Networks (IJCNN). IEEE, 2020]
Different loss functions - Triplet loss
The triplet loss uses three samples: an anchor, a neighbor (same class as the anchor), and a distant (different class). Let f(x) be the output (i.e., embedding) of the network for the input x. The triplet loss tries to reduce the distance between the anchor and neighbor embeddings and to increase the distance between the anchor and distant embeddings. Once the anchor-distant distances exceed the anchor-neighbor distances by a margin α ≥ 0, the desired embedding is obtained.
[Ghojogh, Benyamin, et al. "Fisher discriminant triplet and contrastive losses for training siamese networks." 2020 International Joint Conference on Neural
Networks (IJCNN). IEEE, 2020]
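A per-triplet version of this loss can be sketched as a hinge on the difference between the two distances. The use of squared Euclidean distance and the default margin of 1.0 are assumptions for the example, as are the function names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, neighbor, distant, margin=1.0):
    """Zero once the anchor-distant distance exceeds the
    anchor-neighbor distance by at least `margin`."""
    d_neighbor = squared_distance(anchor, neighbor)
    d_distant = squared_distance(anchor, distant)
    return max(0.0, d_neighbor - d_distant + margin)

# Hypothetical 2-D embeddings f(x).
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0])
violated = triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(satisfied, violated)
```

In the first triplet the distant sample is already far enough away, so the loss is zero. In the second, the neighbor and the distant are equally close to the anchor, so the loss equals the margin.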
Different loss functions - Contrastive loss
The contrastive loss uses pairs of samples, which can be either anchor and neighbor or anchor and distant. If the samples are anchor and neighbor, they are pulled towards each other; otherwise, their distance is increased. In other words, the contrastive loss acts like the triplet loss, but handles the anchor-neighbor and anchor-distant pairs one at a time rather than simultaneously. The desired embedding is obtained when the anchor-distant distances exceed the anchor-neighbor distances by a margin of α.
[Ghojogh, Benyamin, et al. "Fisher discriminant triplet and contrastive losses for training siamese networks." 2020 International Joint Conference on Neural
Networks (IJCNN). IEEE, 2020]
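A pairwise sketch of this loss is given below. It assumes squared Euclidean distance and a hinge of the form max(0, α − distance) for anchor-distant pairs. Other formulations exist (for instance, squaring the hinge term), so this should be read as one plausible variant rather than the exact loss of the cited paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def contrastive_loss(x1, x2, is_distant, margin=1.0):
    """Pull anchor-neighbor pairs together; push anchor-distant pairs
    apart until their distance reaches `margin`."""
    d = squared_distance(x1, x2)
    if is_distant:
        return max(0.0, margin - d)
    return d

# Hypothetical 2-D embeddings f(x).
pull = contrastive_loss([0.0, 0.0], [0.0, 2.0], is_distant=False)
push = contrastive_loss([0.0, 0.0], [0.5, 0.0], is_distant=True)
far = contrastive_loss([0.0, 0.0], [3.0, 0.0], is_distant=True)
print(pull, push, far)
```

Unlike the triplet loss, each call sees a single pair, so the pull and push terms are applied to separate pairs rather than within one triplet.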
