A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval

Recent advances in deep learning have improved performance in image retrieval tasks. The many convolutional layers of a Convolutional Neural Network (CNN) yield a hierarchy of features from the evaluated image. At every step, the extracted patches are smaller and more representative than at the previous levels. Following this idea, this paper introduces a new detector applied to the feature maps extracted from a pre-trained CNN. Specifically, the approach increases the number of features in order to improve the performance of aggregation algorithms such as the widely used VLAD embedding. The proposed approach is tested on four public datasets: Holidays, Oxford5k, Paris6k and UKB.

Published in: Engineering

  1. A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
     Federico Magliani, Tomaso Fontanini, and Andrea Prati
     IMP Lab - University of Parma
     25/10/2018
  2. Agenda
     • Motivations and objectives
     • Related works
     • Proposed approach (DDR)
     • Experimental results
     • Conclusions
  3. Motivations and objectives
     • Recent advances in deep learning have improved performance in image retrieval tasks;
     • Convolutional Neural Networks (CNNs) yield a hierarchy of features from the evaluated image;
     • We introduce a new detector applied to the feature maps;
     • The objective is to increase the number of features in order to boost the performance of aggregation algorithms such as VLAD embedding.
  5. Image retrieval pipeline
     • Extraction of local features
     • Aggregation of the extracted features
     • Retrieval of the most similar image
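
The last stage of the pipeline can be illustrated with a minimal sketch. It assumes that every image has already been reduced to a single global descriptor and that similarity is measured with the cosine between descriptors; the slides do not state the metric, and `l2_normalize` and `rank_database` are hypothetical helper names.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_database(query, database):
    """Return database indices ordered from most to least similar to the query.

    Similarity is the dot product of L2-normalized descriptors (cosine similarity),
    which is an assumption of this sketch.
    """
    q = l2_normalize(query)
    scored = []
    for index, descriptor in enumerate(database):
        d = l2_normalize(descriptor)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, index))
    # Highest similarity first; the sort is stable, so ties keep database order.
    scored.sort(key=lambda pair: -pair[0])
    return [index for _, index in scored]


# Toy 2-D descriptors standing in for aggregated image descriptors.
database = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
ranking = rank_database([0.9, 0.1], database)
```

Here `ranking[0]` is the index of the database image considered most similar to the query.
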
  6. CNN codes
     • Feature vectors extracted from pre-trained networks: InceptionV3 in our case;
     • CNN codes are extracted from the 8th inception pooling layer (mixed8).
  8. Dense-Depth Representation (DDR)
     • VLAD-based embedding works better with dense representations;
     • The features are regrouped into a larger set of features of lower dimensionality;
     • A feature map of size W × H × D is split along D, which maintains the geometrical information;
     • Number of descriptors: from W × H to W × H × D / splitfactor.
  9. Example
     • Feature maps of 8 × 8 × 1280 with splitfactor = 128;
     • Result: 8 × 8 × 10 = 640 descriptors of 128D each.
     [Figure: a W × H × D feature map split along the depth axis into blocks D', D'', D''']
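
The split in this example can be sketched as follows, assuming that the depth axis is cut into contiguous blocks of `split_factor` channels (the slides name the operation but not the exact slicing). The function name `dense_depth_split` and the nested-list layout `[W][H][D]` are illustrative assumptions, not the authors' code.

```python
def dense_depth_split(feature_map, split_factor):
    """Split every spatial cell of a [W][H][D] feature map along its depth.

    Each cell of depth D contributes D / split_factor descriptors of length
    split_factor, so W * H cells become W * H * D / split_factor descriptors.
    Contiguous slicing along the depth is an assumption of this sketch.
    """
    descriptors = []
    for column in feature_map:
        for cell in column:
            depth = len(cell)
            if depth % split_factor != 0:
                raise ValueError("depth must be a multiple of split_factor")
            for start in range(0, depth, split_factor):
                descriptors.append(cell[start:start + split_factor])
    return descriptors


# Synthetic 8 x 8 x 1280 feature map; each value encodes its own flat position.
W, H, D = 8, 8, 1280
feature_map = [
    [[float((w * H + h) * D + d) for d in range(D)] for h in range(H)]
    for w in range(W)
]
descriptors = dense_depth_split(feature_map, 128)
```

With these sizes, `len(descriptors)` is 640 and every descriptor has 128 values, matching the numbers on the slide. For a NumPy array of shape `(W, H, D)` in the default C order, `feature_map.reshape(-1, split_factor)` would yield the same blocks in the same order.
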
  10. locVLAD [1]
      • Evolution of VLAD;
      • Calculated as the mean of two different VLAD descriptors:
        • One computed on the whole image;
        • One computed on the central portion of the image.
      • The most important and useful features in an image are often in its central region;
      • Applied only to the query images.
      [1] Magliani, Federico, Navid Mahmoudian Bidgoli, and Andrea Prati. "A location-aware embedding technique for accurate landmark recognition." 11th International Conference on Distributed Smart Cameras. ACM, 2017.
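
The averaging described on this slide can be sketched in a few lines. This is a simplified illustration under stated assumptions: VLAD is computed as the concatenation of per-centroid residual sums followed by L2 normalization, the codebook `centroids` is given, and the caller decides which descriptors belong to the central portion of the image (the slides do not say how large that portion is). The names `vlad` and `loc_vlad` are hypothetical.

```python
import math


def vlad(descriptors, centroids):
    """Basic VLAD: sum the residuals to the nearest centroid, concatenate, L2-normalize."""
    k, dim = len(centroids), len(centroids[0])
    residuals = [[0.0] * dim for _ in range(k)]
    for desc in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(desc, centroids[i])),
        )
        for j in range(dim):
            residuals[nearest][j] += desc[j] - centroids[nearest][j]
    flat = [value for row in residuals for value in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def loc_vlad(all_descriptors, central_descriptors, centroids):
    """Mean of the VLAD of the whole image and the VLAD of its central portion."""
    whole = vlad(all_descriptors, centroids)
    centre = vlad(central_descriptors, centroids)
    return [(a + b) / 2.0 for a, b in zip(whole, centre)]


# Toy data: two 2-D centroids, three local descriptors, one of them "central".
centroids = [[0.0, 0.0], [10.0, 10.0]]
all_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
central_descriptors = [[1.0, 0.0]]
query_descriptor = loc_vlad(all_descriptors, central_descriptors, centroids)
```

Following the slide, only query images would go through `loc_vlad`; database images would use plain `vlad`.
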
  12. Datasets
      • Holidays: 1491 images in 500 classes; 991 database images and 500 query images, one per class;
      • Oxford5k: 5062 images in 11 classes, with 55 queries (5 per class);
      • Paris6k: 6412 images in 11 classes, with 55 queries (5 per class);
      • UKB: 10200 images in 2550 classes; all images are used as database images and one image per class is used as a query.
  13. Results on Holidays
      Network      Layer        Input image  Encoding  PCA-whit.  mAP
      VGG19        block4_pool  224x224      VLAD      ?          74.33%
      VGG19        block4_pool  550x550      VLAD      ?          75.95%
      VGG19        block4_pool  550x550      VLAD      ?          77.80%
      InceptionV3  mixed_8      450x450      VLAD      ?          81.55%
      InceptionV3  mixed_8      450x450      locVLAD   ?          84.55%
      InceptionV3  mixed_8      562x562      locVLAD   ?          85.43%
      InceptionV3  mixed_8      562x562      locVLAD   ?          85.98%
      InceptionV3  mixed_8      562x662      locVLAD   ?          86.70%
      InceptionV3  mixed_8      562x662      locVLAD   128D       87.38%
      InceptionV3  mixed_8      562x662      locVLAD   128D       85.63%
      InceptionV3  mixed_8      562x662      locVLAD   256D       89.93%
      InceptionV3  mixed_8      562x662      locVLAD   512D       90.46%
      [The slide's table also has DDR and Root square norm. columns; their per-row marks, and the PCA-whit. marks shown as ?, are unreadable in this transcript.]
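
The "Root square norm." column is not explained on the slide. A common reading, assumed here, is signed square-root (power) normalization followed by L2 normalization; which vectors the authors apply it to is not stated. The function name `root_square_normalize` is hypothetical.

```python
import math


def root_square_normalize(vec):
    """Signed square root of every component, then L2 normalization.

    This is the usual 'power normalization with exponent 0.5'; treating it as
    the slide's 'Root square norm.' is an assumption of this sketch.
    """
    rooted = [math.copysign(math.sqrt(abs(x)), x) for x in vec]
    norm = math.sqrt(sum(x * x for x in rooted))
    return [x / norm for x in rooted] if norm > 0 else rooted


normalized = root_square_normalize([4.0, -9.0, 0.0])
```

The square root compresses large components, so a few dominant values have less influence on the similarity between two descriptors.
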
  14. Comparison with the state of the art
      Method       Dimension  Oxford5k  Paris6k  Holidays  UKB (score out of 4)
      VLAD         4096       37.80%    38.60%   55.60%    3.18
      CEVLAD       128        53.00%    -        68.10%    3.093
      FVLAD        128        -         -        62.20%    3.43
      HVLAD        128        -         -        64.00%    3.40
      gVLAD        128        60.00%    -        77.90%    -
      Ng et al.    128        55.80%    58.30%   83.60%    -
      DDR locVLAD  128        57.52%    64.70%   87.38%    3.70
      NetVLAD      512        59.00%    70.20%   82.90%    -
      DDR locVLAD  512        61.46%    71.88%   90.46%    3.76
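
The two kinds of figures in this table can be illustrated with a simplified sketch: mean average precision (mAP) for the percentage columns and, for UKB, the average number of relevant images among the top four results. It assumes binary relevance and ignores the "junk" images handled by the official Oxford5k and Paris6k protocols, so it is not a replacement for the official evaluation scripts. All function names and the toy data are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values measured at the rank of each relevant image."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Average the per-query AP; both arguments are dicts keyed by query id."""
    scores = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(scores) / len(scores)


def ukb_score(rankings, ground_truth):
    """Average count of relevant images in the top 4 results (maximum 4)."""
    counts = [len(set(rankings[q][:4]) & set(ground_truth[q])) for q in rankings]
    return sum(counts) / len(counts)


# Toy rankings for two queries.
rankings = {"q1": ["a", "x", "b", "y"], "q2": ["m", "n", "o", "p"]}
ground_truth = {"q1": ["a", "b"], "q2": ["n"]}
map_value = mean_average_precision(rankings, ground_truth)
ukb_value = ukb_score(rankings, ground_truth)
```

In this toy case the first query has relevant images at ranks 1 and 3 and the second at rank 2, which is what the two metrics summarize.
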
  16. Conclusion and future development
      • DDR and locVLAD outperform the state of the art among VLAD-based descriptors;
      • Small descriptor dimension;
      • Future work will explore a different embedding, such as R-MAC;
      • Fine-tuning could also help to improve the final accuracy.
  17. Thanks for your attention!
      • Questions?
      • Contacts: tomaso.fontanini@studenti.unipr.it
      • Website: implab.ce.unipr.it/?page_id=122
