Deep Image Retrieval: Learning global representations for image search
Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus
Original Slides by Albert Jiménez
Computer Vision Reading Group
[arXiv]
1. Introduction
Image Retrieval
Instance Retrieval + Ranking
[Figure: image query → ranking of retrieved images (1. to 4.)]
Slide credit: Amaia Sal
Related Work
CNN-based retrieval
● CNNs trained for classification tasks
● Features are very robust to intra-class variability
● Lack of robustness to scaling, cropping and image clutter
[Figure: examples labelled "Lamp"]
We are interested in distinguishing between particular objects from the same class!
R-MAC
● Regional Maximum Activation of Convolutions
● Compact feature vectors encode image regions
Giorgos Tolias, Ronan Sicre, Hervé Jégou, Particular object retrieval with integral max-pooling of CNN activations (Submitted to ICLR 2016)
R-MAC
● Regions selected using a rigid grid
● Compute a feature vector per region
● Combine all region feature vectors
○ Dimension → 256 / 512
[Figure: the last ConvNet layer yields K feature maps of size W x H; region grids at different scales are placed over them and the maximum activation is taken in each region]
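The R-MAC steps listed on this slide (rigid grid, one max-pooled vector per region, aggregation into a single vector) can be sketched in NumPy. This is a simplified illustration, not the reference implementation: the non-overlapping s x s grid, the function names and the chosen scales are assumptions, and the per-region PCA-whitening step of the original method is left out.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm."""
    return v / (np.linalg.norm(v) + eps)

def grid_regions(height, width, scales=(1, 2, 3)):
    """Hypothetical rigid grid: at scale s, split the map into s x s cells."""
    regions = []
    for s in scales:
        ys = np.linspace(0, height, s + 1).astype(int)
        xs = np.linspace(0, width, s + 1).astype(int)
        for i in range(s):
            for j in range(s):
                regions.append((ys[i], ys[i + 1], xs[j], xs[j + 1]))
    return regions

def rmac_like(feature_maps, scales=(1, 2, 3)):
    """feature_maps has shape (K, H, W); returns one K-dimensional descriptor."""
    k, h, w = feature_maps.shape
    total = np.zeros(k)
    for y0, y1, x0, x1 in grid_regions(h, w, scales):
        # Maximum activation of each of the K feature maps inside the region.
        region_vec = feature_maps[:, y0:y1, x0:x1].max(axis=(1, 2))
        total += l2_normalize(region_vec)
    # Sum-aggregate the region vectors and normalize the result.
    return l2_normalize(total)

feature_maps = np.random.default_rng(0).random((512, 8, 8))
descriptor = rmac_like(feature_maps)
```

The descriptor length equals the number of feature maps K, independent of the image size, which is what makes the representation compact and of fixed length.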
2. Methodology
1st Contribution
● Three-stream siamese network
● PCA implemented as a shift + fully connected layer
● Optimize weights (CNN + PCA) from R-MAC representation with a triplet loss function
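The second bullet works because a PCA projection is an affine map: subtracting the mean is a shift, and multiplying by the projection matrix is a fully connected layer. A minimal NumPy check under toy assumptions (random data, 8 input dimensions reduced to 4; all names are illustrative):

```python
import numpy as np

def fit_pca(data, dim):
    """Return the mean and the top `dim` principal directions of `data`."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:dim]

def pca_project(x, mean, components):
    """Classic PCA projection: center, then project."""
    return (x - mean) @ components.T

def shift_then_fc(x, shift, weight):
    """Shift layer followed by a bias-free fully connected layer."""
    return (x + shift) @ weight.T

def folded_fc(x, weight, bias):
    """The same map folded into one fully connected layer with a bias."""
    return x @ weight.T + bias

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))
mean, components = fit_pca(data, dim=4)

x = rng.normal(size=(5, 8))
reference = pca_project(x, mean, components)
as_layers = shift_then_fc(x, shift=-mean, weight=components)
as_single_layer = folded_fc(x, weight=components, bias=-components @ mean)
```

Once the projection is written as layers, its shift and weights become ordinary parameters that can be updated by backpropagation together with the CNN.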
1st Contribution
Ranking Loss Function
L(I_q, I^+, I^-) = \frac{1}{2} \max(0, m + \|q - d^+\|^2 - \|q - d^-\|^2)
where:
● m is a scalar that controls the margin
● q, d^+, d^- are the descriptors for the query, positive and negative images
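A minimal NumPy sketch of the ranking loss on this slide, assuming the squared-Euclidean triplet form given in the paper, 1/2 * max(0, m + ||q - d+||^2 - ||q - d-||^2). The function name and the margin value 0.1 are illustrative choices.

```python
import numpy as np

def triplet_ranking_loss(q, d_pos, d_neg, margin=0.1):
    """0.5 * max(0, margin + ||q - d_pos||^2 - ||q - d_neg||^2)."""
    pos_term = np.sum((q - d_pos) ** 2)
    neg_term = np.sum((q - d_neg) ** 2)
    return 0.5 * max(0.0, margin + pos_term - neg_term)

q = np.array([1.0, 0.0])
near = np.array([1.0, 0.0])
far = np.array([0.0, 1.0])

# Positive is closer than the negative by more than the margin: no loss.
loss_satisfied = triplet_ranking_loss(q, d_pos=near, d_neg=far)
# Roles swapped: the triplet violates the margin and is penalized.
loss_violated = triplet_ranking_loss(q, d_pos=far, d_neg=near)
```

Only triplets that violate the margin contribute a gradient, which is why the choice of triplets matters during training.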
2nd Contribution
● Localize regions of interest (ROIs)
● Train a Region Proposal Network with bounding boxes (similar to Faster R-CNN, [arXiv])
In R-MAC → rigid grid; replaced here by a Region Proposal Network
2nd Contribution
RPN in a nutshell
● Predict, for a set of candidate boxes of various sizes and aspect ratios, and at all possible image locations, a score describing how likely each box is to contain an object of interest.
● Simultaneously, for each candidate box, perform regression to improve its location.
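The two bullets above can be illustrated with a NumPy sketch of the candidate boxes and the box regression step. The scores and regression offsets come from learned layers that are not shown here. The stride, sizes and ratios are hypothetical values, ratio is taken as height / width, and the offset parameterization (dx, dy, dw, dh) is the one commonly used with Faster R-CNN, assumed here for illustration.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     sizes=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Candidate boxes (x0, y0, x1, y1) at every feature-map location."""
    shapes = []
    for size in sizes:
        for ratio in ratios:
            # Keep the area at size**2 while changing the aspect ratio.
            shapes.append((size / np.sqrt(ratio), size * np.sqrt(ratio)))
    half = np.array(shapes) / 2.0                       # (A, 2)
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)  # (H*W, 2)
    top_left = centers[:, None, :] - half[None, :, :]
    bottom_right = centers[:, None, :] + half[None, :, :]
    return np.concatenate([top_left, bottom_right], axis=2).reshape(-1, 4)

def apply_deltas(boxes, deltas):
    """Refine boxes with predicted offsets (dx, dy, dw, dh)."""
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    cx = boxes[:, 0] + 0.5 * w + deltas[:, 0] * w
    cy = boxes[:, 1] + 0.5 * h + deltas[:, 1] * h
    w = w * np.exp(deltas[:, 2])
    h = h * np.exp(deltas[:, 3])
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)

anchors = generate_anchors(feat_h=2, feat_w=3)
unchanged = apply_deltas(anchors, np.zeros_like(anchors))
```

With 3 sizes and 3 ratios, each of the 2 x 3 locations receives 9 candidate boxes. Zero offsets leave the boxes where they are.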
Summary
● Able to encode one image into a compact feature vector in a single forward pass
● Images can be compared using the dot product
● Very efficient at test time
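Comparing images with the dot product reduces retrieval to one matrix-vector product followed by a sort. A minimal NumPy sketch, assuming descriptors are L2-normalized so that the dot product equals the cosine similarity (the function names and toy vectors are illustrative):

```python
import numpy as np

def l2_normalize_rows(matrix, eps=1e-12):
    """Scale every row to unit L2 norm."""
    return matrix / (np.linalg.norm(matrix, axis=1, keepdims=True) + eps)

def rank_database(query, database):
    """Return database indices sorted by decreasing similarity to the query."""
    q = query / (np.linalg.norm(query) + 1e-12)
    db = l2_normalize_rows(database)
    scores = db @ q                 # one dot product per database image
    order = np.argsort(-scores)
    return order, scores[order]

database = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([2.0, 0.1])
order, scores = rank_database(query, database)
```

In practice the database descriptors would be normalized once and stored, so only the product and the sort are paid per query.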
3. Experiments
Datasets
● Training: Landmarks dataset, 214k images from 672 landmark sites
● Testing: Oxford 5k, Paris 6k, Oxford 105k, Paris 106k, INRIA Holidays
● Remove all images contained in the Oxford 5k and Paris 6k datasets
○ Landmarks-full: 200k images from 592 landmarks
● Clean the Landmarks dataset (select the most relevant images, discard incorrect ones)
○ SIFT + Hessian-Affine keypoint detector → construct a graph of similar images
○ Landmarks-clean: 52k images from 592 landmarks
Bounding Box Estimation
● RPN trained using automatically estimated bounding box annotations
1. Define the initial bounding box: the minimum rectangle that encloses all matched keypoints
2. For a pair (i, j), predict the bounding box Bj from Bi and an affine transform Aij
3. Update Bj (merge using the geometric mean)
4. Iterate until convergence
[Figure: bounding box projections; initial vs final estimations]
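The four steps of this slide can be sketched in NumPy. This is an illustration under several assumptions that the slide does not specify: boxes are (x0, y0, x1, y1) with positive coordinates, an affine transform is a 2 x 3 matrix, a projected box is the axis-aligned rectangle enclosing the transformed corners, the geometric mean is taken per coordinate, and all boxes are updated synchronously. All names are hypothetical.

```python
import numpy as np

def initial_box(keypoints):
    """Step 1: minimum rectangle enclosing all matched keypoints (N x 2)."""
    pts = np.asarray(keypoints, dtype=float)
    return np.array([pts[:, 0].min(), pts[:, 1].min(),
                     pts[:, 0].max(), pts[:, 1].max()])

def project_box(box, affine):
    """Step 2: map the corners of Bi with Aij and enclose them."""
    x0, y0, x1, y1 = box
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=float)
    moved = corners @ affine[:, :2].T + affine[:, 2]
    return np.array([moved[:, 0].min(), moved[:, 1].min(),
                     moved[:, 0].max(), moved[:, 1].max()])

def refine_boxes(boxes, transforms, max_iters=50, tol=1e-6):
    """Steps 3 and 4: merge with the geometric mean until boxes stop moving."""
    boxes = {k: np.asarray(v, dtype=float) for k, v in boxes.items()}
    for _ in range(max_iters):
        updated = {k: v.copy() for k, v in boxes.items()}
        for (i, j), affine in transforms.items():
            projected = project_box(boxes[i], affine)
            updated[j] = np.sqrt(updated[j] * projected)
        change = max(np.abs(updated[k] - boxes[k]).max() for k in boxes)
        boxes = updated
        if change < tol:
            break
    return boxes

identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
start = {0: initial_box([[10, 10], [50, 50]]),
         1: initial_box([[20, 20], [80, 80]])}
refined = refine_boxes(start, {(0, 1): identity, (1, 0): identity})
```

With identity transforms in both directions, the two boxes settle on the coordinate-wise geometric mean of the initial estimates.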
Experimental Details
● VGG-16 network pre-trained on ImageNet
● Fine-tune with Landmarks dataset
● Select triplets in an efficient manner
○ Forward pass to obtain image representations
○ Select hard negatives (Large loss)
● Dimension of the feature vector = 512
● Evaluation: mean Average Precision (mAP)
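The triplet selection described above (a forward pass to obtain the representations, then keeping negatives with a large loss) can be sketched as follows. The descriptors are assumed to be already computed. The function names, the margin and the top_k cut-off are hypothetical.

```python
import numpy as np

def triplet_losses(q, d_pos, negatives, margin=0.1):
    """Loss of the triplet (q, d_pos, n) for every candidate negative n."""
    pos_term = np.sum((q - d_pos) ** 2)
    neg_terms = np.sum((q - negatives) ** 2, axis=1)
    return 0.5 * np.maximum(0.0, margin + pos_term - neg_terms)

def select_hard_negatives(q, d_pos, negatives, margin=0.1, top_k=2):
    """Indices of the top_k negatives with the largest non-zero loss."""
    losses = triplet_losses(q, d_pos, negatives, margin)
    order = np.argsort(-losses)[:top_k]
    return [int(i) for i in order if losses[i] > 0]

q = np.array([1.0, 0.0])
d_pos = np.array([0.8, 0.6])
negatives = np.array([[0.0, 1.0],    # far from the query: zero loss
                      [1.0, 0.1],    # very close to the query: hardest
                      [0.9, 0.3]])   # close to the query: hard
hard = select_hard_negatives(q, d_pos, negatives)
```

Negatives whose loss is already zero are skipped, since they would not change the weights.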
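For reference, a minimal sketch of mean Average Precision computed from ranked relevance lists (1 = relevant, 0 = not relevant). This is the plain textbook definition and assumes every relevant image appears in the ranked list. The official Oxford and Paris evaluation protocols differ in details, such as the handling of "junk" images, which this sketch ignores.

```python
import numpy as np

def average_precision(ranked_relevance):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    hits = np.cumsum(rel)
    precision_at_k = hits / (np.arange(len(rel)) + 1)
    return float(np.sum(precision_at_k * rel) / rel.sum())

def mean_average_precision(rankings):
    """Average the AP of every query."""
    return float(np.mean([average_precision(r) for r in rankings]))

ap_first = average_precision([1, 0, 1, 0])    # (1/1 + 2/3) / 2
ap_second = average_precision([0, 1])         # (1/2) / 1
m_ap = mean_average_precision([[1, 0, 1, 0], [0, 1]])
```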
1st Experiment
Comparison between the original R-MAC and the authors' implementations
C: Classification network
R: Ranking network (trained with triplets)
2nd Experiment
Comparison between the fixed grid and varying numbers of region proposals
16-32 proposals already outperform rigid grid!
2nd Experiment
[Figures: mAP vs number of triplets; recall vs number of region proposals]
2nd Experiment
Heatmap vs Bounding Box Estimation
Comparison with state of the art
Top Retrieval Results
4. Conclusions
● The authors propose an effective and scalable method for image retrieval that encodes images into compact global signatures that can be compared with the dot product.
● They propose a siamese network architecture trained for the specific task of image retrieval using a ranking (triplet) loss function.
● They demonstrate the benefit of predicting the regions of interest of an image with a Region Proposal Network when encoding it.
Thank You!
Questions?


Editor's Notes

  1. R-MAC aggregates several image regions into a compact feature vector of fixed length and is thus robust to scale and translation. This representation can deal with high resolution images of different aspect ratios and obtains a very competitive accuracy.
  2. Approximate integral max-pooling.