SlideShare a Scribd company logo
Eva Mohedano
eva.mohedano@insight-centre.org
Postdoctoral Researcher
Insight-centre for Data Analytics
Dublin City University
Content-based
Image Retrieval
Day 1 Lecture 5
#DLUPC
http://bit.ly/dlcv2018
Overview
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
2
Overview
3
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
The problem: query by example
Given:
● An example query
image that illustrates
the user's information
need
● A very large dataset of
images
Task:
● Rank all images in the
dataset according to
how likely they are to
fulfil the user's
information need
4
Demo by Paula Gomez Duran
Dockerized visualization tool
Overview
6
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
The retrieval pipeline
7
Image RepresentationsQuery
Image
Dataset
Image Matching Ranked List
Similarity score Image
.
.
.
0.98
0.97
0.10
0.01
v = (v1
, …, vn
)
v1
= (v11
, …, v1n
)
vk
= (vk1
, …, vkn
)
...
Euclidean distance
Cosine Similarity
Similarity
Metric .
.
.
The classic SIFT retrieval pipeline
8
v1
= (v11
, …, v1n
)
vk
= (vk1
, …, vkn
)
...
variable number of
feature vectors per image
Bag of Visual
Words
N-Dimensional
feature space
M visual words
(M clusters)
INVERTED FILE
word Image ID
1 1, 12,
2 1, 30, 102
3 10, 12
4 2,3
6 10
...
Large vocabularies (50k-1M)
Very fast!
Typically used with SIFT features
Initial Search
The classic SIFT retrieval pipeline
9
Spatial re-ranking
Re-ranking the top-ranked results using spatial constraints
RAndom SAmple Consensus (RANSAC)
● Estimates an homography between
the query and a dataset image
● Re-rank based on number of inlier
local features
● Improves quality of the initial search
Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, Object retrieval with large vocabularies and fast spatial matching. CVPR 2007
Kalantidis, Y., Tolias, G., Spyrou, E., Mylonas, P. and Avrithis, Visual image retrieval and localization. 7th International Workshop on Content-Based Multimedia Indexing, 2009.
Expensive to compute
Overview
10
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
Off-the-shelf CNN representations
11Razavian et al. CNN features off-the-shelf: an astounding baseline for recognition. CVPR 2014
Babenko et al. Neural codes for image retrieval. ECCV 2014
FC layers as global feature
representation
Off-the-shelf CNN representations
Neural codes for retrieval [1]
● FC7 layer (4096D)
● L2
norm + PCA whitening + L2
norm
● Euclidean distance
● Only better than traditional SIFT approach after fine tuning on similar domain image dataset.
CNN features off-the-shelf: an astounding baseline for recognition [2]
● Extending Babenko’s approach with spatial search
● Several features extracted by image (sliding window approach)
● Really good results but too computationally expensive for practical situations
[1] Babenko et al, Neural codes for image retrieval CVPR 2014
[2] Razavian et al, CNN features off-the-shelf: an astounding baseline for recognition, CVPR 2014
12
Off-the-shelf CNN representations
13
FC layers as global feature
representation
Is there any other way to pool the local information from a convolutional layer?
Pooling method #1: Spatial sum/max pooling
14
sum/max pool conv features across filters
Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015
Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv 2015.
Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Workshops.
Pooling method #1: Spatial sum/max pooling
15
Pooling method #1: Spatial sum/max pooling
16
Sum/max pooling over specific regions
Pooling method #1: Spatial sum/max pooling
Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015
Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Workshops.
Mohedano et al. Saliency Weighted Convolutional Features for Instance Search, arXiv 2017
Apply spatial weighting on the features before
pooling them
Spatial weighting on convolutional activations
17
w11
w12
... w1w
w21
w22
... w2w
wH1
wH2
... wHW
Pooling method #2: Generalized-mean pooling
Radenović, F., Tolias, G. and Chum, O, Fine-tuning CNN image retrieval with no human annotation. TPAMI 2018 18
Max-pooling (MAC)
Average pooling (SPoC)
Generalized-mean pooling (GeM)
pk
can be manually set or learnt
f1
f2
fK
...
...
...
Pooling method #2: Generalized-mean pooling
19Radenović, F., Tolias, G. and Chum, O, Fine-tuning CNN image retrieval with no human annotation. TPAMI 2018
Pooling method #3: Region pooling
20
Regional Maximum Activation of Convolution (R-MAC)
Settings
- Fully convolutional off-the-shelf VGG16
- Pool5
- Spatial Max pooling
- High Resolution images
- Global descriptor based on aggregating region vectors
- Sliding window approach
Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv 2015.
R-MAC
Region1
21
R-MAC
Region1
Region2
22
R-MAC
Region1
Region2
Region3
23
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
24
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
Region vectors 512D l2-PCAw-l2
Region1
Region2
Region3
Region4
...
RegionN
Global
vector
512D
25
Spatial re-ranking
● Only on top 100 ranked results, compare query
with dataset regions
● Re-rank images based on best region score per
image
R-MAC
26
Pooling method #4: Traditional pooling techniques
27
BoW, VLAD encoding of conv features
Ng et al. Exploiting local features from deep networks for image retrieval. CVPR Workshops 2015
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
Pooling method #4: Bags of Local Convolutional Features
28
(336x256)
Resolution
conv5_1 from
VGG16
(42x32)
25K centroids 25K-D vector
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
29
(336x256)
Resolution
conv5_1 from
VGG16
(42x32)
25K centroids 25K-D vector
Pooling method #4: Bags of Local Convolutional Features
30
Paris Buildings 6k Oxford Buildings 5k
TRECVID Instance Search 2013
(subset of 23k frames)
[7] Kalantidis et al. Cross-dimensional Weighting for Aggregated
Deep Convolutional Features. ECCV 2016 Workshops.
Mohedano et al. Bags of Local Convolutional Features for
Scalable Instance Search. ICMR 2016
Pooling method #4: Bags of Local Convolutional Features
31
Pooling method #4: Bags of Local Convolutional Features
Mohedano et al. Saliency Weighted Convolutional Features for Instance Search, arXiv 2017
Overview
32
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
Classification
33
Query: This chair
Results from dataset classified as “chair”
Retrieval
34
Query: This chair
Results from dataset ranked by similarity to the query
What loss function should we use?
35
Learning representations for retrieval
Siamese network: network to learn a function that maps input
patterns into a target space such that L2
norm in the target
space approximates the semantic distance in the input space.
36
Song et al.: Deep metric learning via lifted structured feature embedding. CVPR 2015
Chopra et al. Learning a similarity metric discriminatively, with application to face verification . CVPR 2005
Simo-Serra et al. Discriminative Learning of Deep Convolutional Feature Point Descriptor . ICCV 2015
Margin: m
Hinge loss
Negative pairs: if
nearer than the
margin, pay a linear
penalty
The margin is a
hyperparameter
37
Learning representations for retrieval
Siamese network with triplet loss: loss function minimizes distance between query and positive
and maximizes distance between query and negative
38Schroff et al. FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015
CNN
w w
a p n
L2 embedding space
Triplet Loss
CNN CNN
How do we get the training data?
39
Exploring Image datasets with GPS coordinates (Google Street View Time Machine)
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016 40
Radenović el at., CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016
Gordo et al., Deep Image Retrieval: Learning global representations for image search, CVPR 2016
[1] Schonberger et al. From single image query to detailed 3D reconstruction, CVPR 2015
Automatic data cleaning
Strong baseline (SIFT) +
geometric verification + 3D
camera estimation[1]
Further manual inspection
Further manual inspection
41
Radenović et al, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016
Gordo et al., Deep Image Retrieval: Learning global representations for image search, CVPR 2016
Generating training pairs
Select the hardest negative
- Select triplet which
generates larger loss
- If 3D-geometric verification
model, select images based
on number of inliers
42
Once we have collected a training dataset (with annotations), the way to select the pairs/triplets is
crucial
NetVLAD
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
43
End-to-End learning for retrieval #1
Local features
Vectors dimension D
K cluster centers
Visual Vocabulary
(K vectors with dimension D)
Learnt from data with K-means
Feature Space D-dim
VLAD vector
Sum of all residuals of the
assignments to the cluster 1
d1 d2
d3
d4
Dim D
Dim K x D
VLAD
44
V Matrix VLAD (residuals)
D
k
Hard Assignment
Not differentiable !
j
Soft Assignment
N Local features {Xi
}
D
i
K Clusters features {Ck
}
D
k
0.99
0.98
0.05
0.01
0.00
Cluster k=1
j
j
1
1
0
0
0
Softmax function :)
VLAD
45
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
Consider the positive
example closer to the query
Consider all negative
candidates
With distance inferior to a
certain margin
46
Training based on GPS annotated data
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
NetVLAD
Fine-tuned R-MAC
47
End-to-End learning for retrieval #2
Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
Region vectors
512D
l2-PCAw-l2
Region1
Region2
Region3
Region4
...
RegionN
Global
vector
512D
48
Learning representations for retrieval
49
Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
Learning representations for retrieval
50
Comparison between R-MAC from off-the-shelf network and R-MAC retrained for retrieval
Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
Extra: Ranking refinement
51
Ranking refinement: Query Expansion
New QueryQuery Image
Average descriptors of the top N retrieved images
● Runs at query time
● Boost performance if the initial ranking is good
○ Traditionally applied on spatially verified images (RANSAC)
● Only considers the top NN of the image
Diffusion mechanism to propagate similarities through a pairwise affinity matrix
On Random walks and Diffusion on graphs:
Leonid E. Zhukov, Structural Analysis and Visualization of Networks [slides] [video lecture]
1
2
4
3
0 1 1 1 0
1 0 0 0 1
1 0 0 1 1
1 0 1 0 0
0 1 1 1 1
5
Affinity Matrix
Ranking refinement: Diffusion
Diffusion mechanism to propagate similarities through a pairwise affinity matrix
● Most retrieval approaches using pairwise measures (i.e Euclidean
distance between query vs dataset images) to generate the final rank.
● This has the limitation of ignoring the structure of the underlying
data manifold
Ranking refinement: Diffusion
Zhou et al. Ranking on data manifolds. In Advances in neural information processing systems (pp. 169-176), 2004
Ranking refinement: Diffusion
Iscen, A et al. Efficient diffusion on region manifolds: Recovering small objects with compact cnn representations. CVPR 2017
Overview
57
● What is content-based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
58
Questions?

More Related Content

What's hot

Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
Owin Will
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
Yogendra Tamang
 
fuzzy image processing
fuzzy image processingfuzzy image processing
fuzzy image processing
amalalhait
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
CloudxLab
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
Musa Hawamdah
 
Image enhancement in the spatial domain1
Image enhancement in the spatial domain1Image enhancement in the spatial domain1
Image enhancement in the spatial domain1
shabanam tamboli
 
Harris corner detector and face recognition
Harris corner detector and face recognitionHarris corner detector and face recognition
Harris corner detector and face recognition
Shih Wei Huang
 
Hough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul IslamHough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul Islam
Nazmul Islam
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
Edgar Marca
 
Image compression standards
Image compression standardsImage compression standards
Image compression standards
kirupasuchi1996
 
IMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
IMAGE PROCESSING - MATHANKUMAR.S - VMKVECIMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
IMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
Mathankumar S
 
point operations in image processing
point operations in image processingpoint operations in image processing
point operations in image processing
Ramachendran Logarajah
 
Smoothing in Digital Image Processing
Smoothing in Digital Image ProcessingSmoothing in Digital Image Processing
Smoothing in Digital Image Processing
Pallavi Agarwal
 
Image processing fundamentals
Image processing fundamentalsImage processing fundamentals
Image processing fundamentals
A B Shinde
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Deep Learning in Bio-Medical Imaging
Deep Learning in Bio-Medical ImagingDeep Learning in Bio-Medical Imaging
Deep Learning in Bio-Medical Imaging
Joonhyung Lee
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...
Universitat Politècnica de Catalunya
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
nextlib
 
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
Joonhyung Lee
 
Modeling uncertainty in deep learning
Modeling uncertainty in deep learning Modeling uncertainty in deep learning
Modeling uncertainty in deep learning
Sungjoon Choi
 

What's hot (20)

Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
fuzzy image processing
fuzzy image processingfuzzy image processing
fuzzy image processing
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Image enhancement in the spatial domain1
Image enhancement in the spatial domain1Image enhancement in the spatial domain1
Image enhancement in the spatial domain1
 
Harris corner detector and face recognition
Harris corner detector and face recognitionHarris corner detector and face recognition
Harris corner detector and face recognition
 
Hough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul IslamHough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul Islam
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
 
Image compression standards
Image compression standardsImage compression standards
Image compression standards
 
IMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
IMAGE PROCESSING - MATHANKUMAR.S - VMKVECIMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
IMAGE PROCESSING - MATHANKUMAR.S - VMKVEC
 
point operations in image processing
point operations in image processingpoint operations in image processing
point operations in image processing
 
Smoothing in Digital Image Processing
Smoothing in Digital Image ProcessingSmoothing in Digital Image Processing
Smoothing in Digital Image Processing
 
Image processing fundamentals
Image processing fundamentalsImage processing fundamentals
Image processing fundamentals
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
 
Deep Learning in Bio-Medical Imaging
Deep Learning in Bio-Medical ImagingDeep Learning in Bio-Medical Imaging
Deep Learning in Bio-Medical Imaging
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
 
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...
 
Modeling uncertainty in deep learning
Modeling uncertainty in deep learning Modeling uncertainty in deep learning
Modeling uncertainty in deep learning
 

Similar to Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018

Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
Hiroshi Fukui
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
NAVER Engineering
 
Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
Universitat Politècnica de Catalunya
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
CHENHuiMei
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
United States Air Force Academy
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
CSCJournals
 
Lec11 object-re-id
Lec11 object-re-idLec11 object-re-id
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
Anil Kumar Gupta
 
Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval
Universitat Politècnica de Catalunya
 
[212]big models without big data using domain specific deep networks in data-...
[212]big models without big data using domain specific deep networks in data-...[212]big models without big data using domain specific deep networks in data-...
[212]big models without big data using domain specific deep networks in data-...
NAVER D2
 
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
Deep Learning JP
 
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Symeon Papadopoulos
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Jia-Bin Huang
 
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
multimediaeval
 
Visual Search for Musical Performances and Endoscopic Videos
Visual Search for Musical Performances and Endoscopic VideosVisual Search for Musical Performances and Endoscopic Videos
Visual Search for Musical Performances and Endoscopic Videos
Universitat Politècnica de Catalunya
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Universitat Politècnica de Catalunya
 
ObjectDetection.pptx
ObjectDetection.pptxObjectDetection.pptx
ObjectDetection.pptx
RitikPabbaraju2
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
Fellowship at Vodafone FutureLab
 
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Simone Ercoli
 

Similar to Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018 (20)

Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
 
Lec11 object-re-id
Lec11 object-re-idLec11 object-re-id
Lec11 object-re-id
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval
 
[212]big models without big data using domain specific deep networks in data-...
[212]big models without big data using domain specific deep networks in data-...[212]big models without big data using domain specific deep networks in data-...
[212]big models without big data using domain specific deep networks in data-...
 
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
[DLHacks 実装]Perceptual Adversarial Networks for Image-to-Image Transformation
 
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
 
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
 
Visual Search for Musical Performances and Endoscopic Videos
Visual Search for Musical Performances and Endoscopic VideosVisual Search for Musical Performances and Endoscopic Videos
Visual Search for Musical Performances and Endoscopic Videos
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
ObjectDetection.pptx
ObjectDetection.pptxObjectDetection.pptx
ObjectDetection.pptx
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
Vision and Multimedia Reading Group: DeCAF: a Deep Convolutional Activation F...
 

More from Universitat Politècnica de Catalunya

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Universitat Politècnica de Catalunya
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
Universitat Politècnica de Catalunya
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Universitat Politècnica de Catalunya
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
Universitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Universitat Politècnica de Catalunya
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Universitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
Universitat Politècnica de Catalunya
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
Universitat Politècnica de Catalunya
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
bmucuha
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
Monthly Management report for the Month of May 2024
Monthly Management report for the Month of May 2024Monthly Management report for the Month of May 2024
Monthly Management report for the Month of May 2024
facilitymanager11
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
yuvarajkumar334
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
slg6lamcq
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
wyddcwye1
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 

Recently uploaded (20)

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
Monthly Management report for the Month of May 2024
Monthly Management report for the Month of May 2024Monthly Management report for the Month of May 2024
Monthly Management report for the Month of May 2024
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 

Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018

  • 1. Eva Mohedano eva.mohedano@insight-centre.org Postdoctoral Researcher Insight-centre for Data Analytics Dublin City University Content-based Image Retrieval Day 1 Lecture 5 #DLUPC http://bit.ly/dlcv2018
  • 2. Overview ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval 2
  • 3. Overview 3 ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval
  • 4. The problem: query by example Given: ● An example query image that illustrates the user's information need ● A very large dataset of images Task: ● Rank all images in the dataset according to how likely they are to fulfil the user's information need 4
  • 5. Demo by Paula Gomez Duran Dockerized visualization tool
  • 6. Overview 6 ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval
  • 7. The retrieval pipeline 7 Image RepresentationsQuery Image Dataset Image Matching Ranked List Similarity score Image . . . 0.98 0.97 0.10 0.01 v = (v1 , …, vn ) v1 = (v11 , …, v1n ) vk = (vk1 , …, vkn ) ... Euclidean distance Cosine Similarity Similarity Metric . . .
  • 8. The classic SIFT retrieval pipeline 8 v1 = (v11 , …, v1n ) vk = (vk1 , …, vkn ) ... variable number of feature vectors per image Bag of Visual Words N-Dimensional feature space M visual words (M clusters) INVERTED FILE word Image ID 1 1, 12, 2 1, 30, 102 3 10, 12 4 2,3 6 10 ... Large vocabularies (50k-1M) Very fast! Typically used with SIFT features Initial Search
  • 9. The classic SIFT retrieval pipeline 9 Spatial re-ranking Re-ranking the top-ranked results using spatial constraints RAndom SAmple Consensus (RANSAC) ● Estimates an homography between the query and a dataset image ● Re-rank based on number of inlier local features ● Improves quality of the initial search Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, Object retrieval with large vocabularies and fast spatial matching. CVPR 2007 Kalantidis, Y., Tolias, G., Spyrou, E., Mylonas, P. and Avrithis, Visual image retrieval and localization. 7th International Workshop on Content-Based Multimedia Indexing, 2009. Expensive to compute
  • 10. Overview 10 ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval
  • 11. Off-the-shelf CNN representations 11Razavian et al. CNN features off-the-shelf: an astounding baseline for recognition. CVPR 2014 Babenko et al. Neural codes for image retrieval. ECCV 2014 FC layers as global feature representation
  • 12. Off-the-shelf CNN representations Neural codes for retrieval [1] ● FC7 layer (4096D) ● L2 norm + PCA whitening + L2 norm ● Euclidean distance ● Only better than traditional SIFT approach after fine tuning on similar domain image dataset. CNN features off-the-shelf: an astounding baseline for recognition [2] ● Extending Babenko’s approach with spatial search ● Several features extracted by image (sliding window approach) ● Really good results but too computationally expensive for practical situations [1] Babenko et al, Neural codes for image retrieval CVPR 2014 [2] Razavian et al, CNN features off-the-shelf: an astounding baseline for recognition, CVPR 2014 12
  • 13. Off-the-shelf CNN representations 13 FC layers as global feature representation Is there any other way to pool the local information from a convolutional layer?
  • 14. Pooling method #1: Spatial sum/max pooling 14 sum/max pool conv features across filters Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015 Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv 2015. Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Workshops.
  • 15. Pooling method #1: Spatial sum/max pooling 15
  • 16. Pooling method #1: Spatial sum/max pooling 16 Sum/max pooling over specific regions
  • 17. Pooling method #1: Spatial sum/max pooling Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015 Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Workshops. Mohedano et al. Saliency Weighted Convolutional Features for Instance Search, arXiv 2017 Apply spatial weighting on the features before pooling them Spatial weighting on convolutional activations 17 w11 w12 ... w1w w21 w22 ... w2w wH1 wH2 ... wHW
  • 18. Pooling method #2: Generalized-mean pooling Radenović, F., Tolias, G. and Chum, O, Fine-tuning CNN image retrieval with no human annotation. TPAMI 2018 18 Max-pooling (MAC) Average pooling (SPoC) Generalized-mean pooling (GeM) pk can be manually set or learnt f1 f2 fK ... ... ...
  • 19. Pooling method #2: Generalized-mean pooling 19Radenović, F., Tolias, G. and Chum, O, Fine-tuning CNN image retrieval with no human annotation. TPAMI 2018
  • 20. Pooling method #3: Region pooling 20 Regional Maximum Activation of Convolution (R-MAC) Settings - Fully convolutional off-the-shelf VGG16 - Pool5 - Spatial Max pooling - High Resolution images - Global descriptor based on aggregating region vectors - Sliding window approach Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv 2015.
  • 25. R-MAC Region1 Region2 Region3 Region4 ... RegionN Region vectors 512D l2-PCAw-l2 Region1 Region2 Region3 Region4 ... RegionN Global vector 512D 25 Spatial re-ranking ● Only on top 100 ranked results, compare query with dataset regions ● Re-rank images based on best region score per image
  • 27. Pooling method #4: Traditional pooling techniques 27 BoW, VLAD encoding of conv features Ng et al. Exploiting local features from deep networks for image retrieval. CVPR Workshops 2015 Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
  • 28. Pooling method #4: Bags of Local Convolutional Features 28 (336x256) Resolution conv5_1 from VGG16 (42x32) 25K centroids 25K-D vector Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
  • 29. 29 (336x256) Resolution conv5_1 from VGG16 (42x32) 25K centroids 25K-D vector Pooling method #4: Bags of Local Convolutional Features
  • 30. 30 Paris Buildings 6k Oxford Buildings 5k TRECVID Instance Search 2013 (subset of 23k frames) [7] Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Workshops. Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016 Pooling method #4: Bags of Local Convolutional Features
  • 31. 31 Pooling method #4: Bags of Local Convolutional Features Mohedano et al. Saliency Weighted Convolutional Features for Instance Search, arXiv 2017
  • 32. Overview 32 ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval
  • 33. Classification 33 Query: This chair Results from dataset classified as “chair”
  • 34. Retrieval 34 Query: This chair Results from dataset ranked by similarity to the query
  • 35. What loss function should we use? 35
  • 36. Learning representations for retrieval Siamese network: network to learn a function that maps input patterns into a target space such that L2 norm in the target space approximates the semantic distance in the input space. 36 Song et al.: Deep metric learning via lifted structured feature embedding. CVPR 2015 Chopra et al. Learning a similarity metric discriminatively, with application to face verification . CVPR 2005 Simo-Serra et al. Discriminative Learning of Deep Convolutional Feature Point Descriptor . ICCV 2015
  • 37. Margin: m Hinge loss Negative pairs: if nearer than the margin, pay a linear penalty The margin is a hyperparameter 37
  • 38. Learning representations for retrieval Siamese network with triplet loss: loss function minimizes distance between query and positive and maximizes distance between query and negative 38Schroff et al. FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015 CNN w w a p n L2 embedding space Triplet Loss CNN CNN
  • 39. How do we get the training data? 39
  • 40. Exploring Image datasets with GPS coordinates (Google Street View Time Machine) Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016 40
  • 41. Radenović el at., CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016 Gordo et al., Deep Image Retrieval: Learning global representations for image search, CVPR 2016 [1] Schonberger et al. From single image query to detailed 3D reconstruction, CVPR 2015 Automatic data cleaning Strong baseline (SIFT) + geometric verification + 3D camera estimation[1] Further manual inspection Further manual inspection 41
  • 42. Radenović et al, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016 Gordo et al., Deep Image Retrieval: Learning global representations for image search, CVPR 2016 Generating training pairs Select the hardest negative - Select triplet which generates larger loss - If 3D-geometric verification model, select images based on number of inliers 42 Once we have collected a training dataset (with annotations), the way to select the pairs/triplets is crucial
  • 43. NetVLAD Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016 43 End-to-End learning for retrieval #1
  • 44. Local features Vectors dimension D K cluster centers Visual Vocabulary (K vectors with dimension D) Learnt from data with K-means Feature Space D-dim VLAD vector Sum of all residuals of the assignments to the cluster 1 d1 d2 d3 d4 Dim D Dim K x D VLAD 44
  • 45. V Matrix VLAD (residuals) D k Hard Assignment Not differentiable ! j Soft Assignment N Local features {Xi } D i K Clusters features {Ck } D k 0.99 0.98 0.05 0.01 0.00 Cluster k=1 j j 1 1 0 0 0 Softmax function :) VLAD 45 Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
  • 46. Consider the positive example closer to the query Consider all negative candidates With distance inferior to a certain margin 46 Training based on GPS annotated data Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016 NetVLAD
  • 47. Fine-tuned R-MAC 47 End-to-End learning for retrieval #2 Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
  • 49. Learning representations for retrieval 49 Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
  • 50. Learning representations for retrieval 50 Comparison between R-MAC from off-the-shelf network and R-MAC retrained for retrieval Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
  • 52. Ranking refinement: Query Expansion New QueryQuery Image Average descriptors of the top N retrieved images ● Runs at query time ● Boost performance if the initial ranking is good ○ Traditionally applied on spatially verified images (RANSAC) ● Only considers the top NN of the image
  • 53. Diffusion mechanism to propagate similarities through a pairwise affinity matrix On Random walks and Diffusion on graphs: Leonid E. Zhukov, Structural Analysis and Visualization of Networks [slides] [video lecture] 1 2 4 3 0 1 1 1 0 1 0 0 0 1 1 0 0 1 1 1 0 1 0 0 0 1 1 1 1 5 Affinity Matrix Ranking refinement: Diffusion
  • 54. Diffusion mechanism to propagate similarities through a pairwise affinity matrix ● Most retrieval approaches using pairwise measures (i.e Euclidean distance between query vs dataset images) to generate the final rank. ● This has the limitation of ignoring the structure of the underlying data manifold Ranking refinement: Diffusion
  • 55. Zhou et al. Ranking on data manifolds. In Advances in neural information processing systems (pp. 169-176), 2004 Ranking refinement: Diffusion
  • 56. Iscen, A et al. Efficient diffusion on region manifolds: Recovering small objects with compact cnn representations. CVPR 2017
  • 57. Overview 57 ● What is content-based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval