
Saliency Weighted Convolutional Features for Instance Search

https://imatge-upc.github.io/salbow/
This work explores attention models to weight the contribution of local convolutional representations for the instance search task. We present a retrieval framework based on bags of local convolutional features (BLCF) that benefits from saliency weighting to build an efficient image representation. The use of human visual attention models (saliency) allows significant improvements in retrieval performance without the need to conduct region analysis or spatial verification, and without requiring any feature fine tuning. We investigate the impact of different saliency models, finding that higher performance on saliency benchmarks does not necessarily equate to improved performance when used in instance search tasks. The proposed approach outperforms the state-of-the-art on the challenging INSTRE benchmark by a large margin, and provides similar performance on the Oxford and Paris benchmarks compared to more complex methods that use off-the-shelf representations.
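
The saliency weighting described above can be illustrated with a minimal sketch. It assumes that each local convolutional feature is assigned to its nearest visual word and that the saliency value at that location is added to the word's bin in place of a count of one. The function names, the toy centroids, and the final L2 normalisation are illustrative assumptions, not the authors' implementation (which is available at the URL above).

```python
import math

def nearest_centroid(feature, centroids):
    """Return the index of the centroid closest to `feature` (squared Euclidean)."""
    best, best_d = 0, float("inf")
    for idx, c in enumerate(centroids):
        d = sum((f - x) ** 2 for f, x in zip(feature, c))
        if d < best_d:
            best, best_d = idx, d
    return best

def weighted_bow(features, weights, centroids):
    """Accumulate one weight per local feature into its visual word bin.

    features: local descriptors, one per spatial location (hypothetical toy data)
    weights:  saliency value at each location (all ones gives a plain BoW)
    """
    hist = [0.0] * len(centroids)
    for feat, w in zip(features, weights):
        hist[nearest_centroid(feat, centroids)] += w
    # Assumption: L2-normalise so that a dot product equals cosine similarity.
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

def cosine(a, b):
    """Dot product of two L2-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy vocabulary of 3 visual words and 4 local features (2-D for readability).
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
features = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [5.2, 4.9]]
saliency = [0.0, 1.0, 0.5, 0.0]  # only the two middle locations are salient

plain = weighted_bow(features, [1.0] * len(features), centroids)
salient = weighted_bow(features, saliency, centroids)
print(plain, salient, cosine(plain, salient))
```

In this toy case the weighted vector keeps only the word covered by the salient locations, while the unweighted vector spreads its mass over all three words.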

Published in: Data & Analytics

  1. 1. Saliency Weighted Convolutional Features for Instance Search Eva Mohedano, Kevin McGuinness, Xavier Giro-i-Nieto and Noel E. O’Connor
  2. 2. Contents Instance Search task Motivation Proposed Method Results Conclusions and Future Work
  3. 3. Contents Instance Search task Motivation Proposed Method Results Conclusions and Future Work
  4. 4. Visual Instance Retrieval: visual query (“This dog”), image database, expected outcome
  5. 5. The Classic Retrieval Pipeline. Query image and dataset images → image representations: v = (v1, …, vn) for the query and v1 = (v11, …, v1n), …, vk = (vk1, …, vkn) for the dataset → image matching with a similarity metric (Euclidean distance, cosine similarity) → ranked list ordered by similarity score (0.98, 0.97, …, 0.10, 0.01)
  6. 6. The Classic Retrieval Pipeline: Initial Search. Variable number of feature vectors per image: v1 = (v11, …, v1n), …, vk = (vk1, …, vkn). Bag of Visual Words: N-dimensional feature space, M visual words (M clusters). Inverted file (word → image IDs): 1 → 1, 12; 2 → 1, 30, 102; 3 → 10, 12; 4 → 2, 3; 6 → 10; ... Large vocabularies (50k-1M). Very fast! Typically used with SIFT features.
  7. 7. The Classic Retrieval Pipeline: Spatial re-ranking. Re-ranking the top-ranked results using spatial constraints with RANdom SAmple Consensus (RANSAC) ● Estimates a homography between the query and a dataset image ● Re-ranks based on the number of inlier local features ● Improves the quality of the initial search. Expensive to compute. Philbin, James, Ondrej Chum, Michael Isard, Josef Sivic, and Andrew Zisserman. "Object retrieval with large vocabularies and fast spatial matching." In Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, pp. 1-8. IEEE, 2007.
  8. 8. Contents Instance Search task Motivation Proposed Method Results Conclusions and Future Work
  9. 9. Deep Learning Approaches in CBMI. Zheng, Liang, Yi Yang, and Qi Tian. "SIFT meets CNN: A decade survey of instance retrieval." TPAMI 2018.
  10. 10. Features from pre-trained CNN networks: providing more importance to the center region (content-independent). Convolutional features → Gaussian weighting → sum-pooled features. Babenko, Artem, and Victor Lempitsky. "Aggregating local deep features for image retrieval." ICCV 2015.
  11. 11. Features from pre-trained CNN networks: providing more importance to the most active regions in a convolutional layer (content-dependent). Convolutional features → weighting by the sum across conv channels → sum-pooled features. Kalantidis, Yannis, Clayton Mellina, and Simon Osindero. "Cross-dimensional weighting for aggregated deep convolutional features." ECCV 2016.
  12. 12. Features from pre-trained CNN networks: Regional Maximum Activation of Convolutions (R-MAC). Region 1, Region 2, …, Region N → max-pool → region normalization. Tolias, Giorgos, Ronan Sicre, and Hervé Jégou. "Particular object retrieval with integral max-pooling of CNN activations." ICLR 2016.
  13. 13. Features from pre-trained CNN networks: Regional Maximum Activation of Convolutions (R-MAC) (content-independent). R-MAC spatial weight: fixed set of locations and window scales.
  14. 14. Using human-based saliency models
  15. 15. Saliency weighting for retrieval. Traditionally explored with SIFT-based BoW approaches to: - Prune the number of local descriptors [1] - Weight the contribution of the background [2]. We investigate traditional and data-driven saliency models to weight the contribution of visual words assigned to local convolutional features for the Visual Instance Search task. [1] Awad, Dounia, Vincent Courboulay, and Arnaud Revel. "Saliency filtering of sift detectors: Application to cbir." ACIVS, 2012. [2] de Carvalho Soares, Robson, Ilmerio Reis da Silva, and Denise Guliato. "Spatial locality weighting of features using saliency map with a bag-of-visual-words approach." ICTAI, 2012.
  16. 16. Contents Instance Search task Motivation Proposed Method Results Conclusions and Future Work
  17. 17. General Framework
  18. 18. General Framework
  19. 19. Bag of Local Convolutional Features. Image at 336x256 resolution → conv5_1 from VGG16 (21x16) → 25K centroids (visual vocabulary) → 25K-D Bag of Words vector (sparse feature representation). Mohedano, Eva, Kevin McGuinness, Noel E. O'Connor, Amaia Salvador, Ferran Marqués, and Xavier Giro-i-Nieto. "Bags of local convolutional features for scalable instance search." ICMR 2016.
  20. 20. Masking the relevant region (encoding the query). Image at 336x256 resolution → conv5_1 from VGG16 (21x16) → 25K centroids (visual vocabulary) → assignment maps → 25K-D Bag of Words vector. Mohedano, Eva, Kevin McGuinness, Noel E. O'Connor, Amaia Salvador, Ferran Marqués, and Xavier Giro-i-Nieto. "Bags of local convolutional features for scalable instance search." ICMR 2016.
  21. 21. General Framework. Pan, Junting, Cristian Canton Ferrer, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, and Xavier Giro-i-Nieto. "SalGAN: Visual saliency prediction with generative adversarial networks." arXiv preprint arXiv:1701.01081 (2017).
  22. 22. Different saliency models: Gaussian, Conv features, Itti-Koch, BMS, SalNet, SalGAN, SAM-VGG, SAM-ResNet
  23. 23. General Framework
  24. 24. Encoding relevant areas based on saliency prediction (dataset image). Unweighted BoW: 25K-D BoW vector. Weighted BoW: spatial weighting → 25K-D BoW vector.
  25. 25. Contents Instance Search task Motivation Proposed Method Results Conclusions and Future Work
  26. 26. Effect of different spatial weighting methods: hand-crafted saliency models and deep-learning-based saliency models
  27. 27. Saliency region ‘within’ the instance, which is not beneficial in retrieval datasets based on buildings
  28. 28. Comparison: Sum-pooling vs BLCF ● BLCF is a better baseline (vocabulary learning can be seen as unsupervised domain adaptation) ● Saliency is effective in both the sum-pooling and BLCF approaches on the instance search dataset INSTRE
  29. 29. Comparison with the state-of-the-art. High-dimensional 25,000-D representations with an average of ~200 non-zero entries
  30. 30.
  31. 31. Dockerized visualization tool. Gomez P, Mohedano E, McGuinness K, Giró-i-Nieto X, O'Connor N, “Demonstration of an Open Source Framework for Qualitative Evaluation of CBIR Systems”, ACM Multimedia 2018.
  32. 32. Conclusions ● Demonstrated the applicability of modern saliency models to the instance search task ● Achieved SoA performance on an instance search benchmark (INSTRE) with an off-the-shelf CNN model. Future Work ● Investigate better post-processing for ranking refinement ● Scale the method to large-scale datasets
  33. 33. Thanks for your attention! Questions? Software available @ https://github.com/imatge-upc/salbow
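
The initial search stage of the classic pipeline (slides 5 and 6) can be sketched as an inverted file queried with cosine similarity. This is a toy illustration under stated assumptions: the image identifiers, the sparse `{word_id: count}` layout, and the function names are hypothetical and are not taken from the released software.

```python
import math
from collections import defaultdict

def build_index(bows):
    """Map each visual word to the (image_id, normalised weight) pairs using it.

    bows: dict image_id -> sparse BoW as {word_id: count} (hypothetical layout)
    """
    index = defaultdict(list)
    for image_id, bow in bows.items():
        norm = math.sqrt(sum(v * v for v in bow.values()))
        for word, value in bow.items():
            index[word].append((image_id, value / norm))
    return index

def search(query_bow, index):
    """Rank images by cosine similarity, visiting only the query's words."""
    q_norm = math.sqrt(sum(v * v for v in query_bow.values()))
    scores = defaultdict(float)
    for word, q_value in query_bow.items():
        for image_id, weight in index.get(word, []):
            scores[image_id] += (q_value / q_norm) * weight
    # Highest similarity first; images sharing no word with the query are absent.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

bows = {
    "img_a": {1: 2, 3: 1},
    "img_b": {1: 1, 2: 1},
    "img_c": {4: 3},
}
index = build_index(bows)
ranked = search({1: 1, 3: 1}, index)
print(ranked)
```

Because only the posting lists of the query's words are visited, the cost depends on the number of non-zero entries rather than on the vocabulary size, which is why sparse representations suit this structure.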
