Efficient nearest neighbors search for large scale

Landmark recognition has achieved excellent results on small-scale datasets. In large-scale retrieval, issues that are irrelevant with small amounts of data quickly become fundamental to an efficient retrieval phase. In particular, computational time must be kept as low as possible while retrieval accuracy is preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It drastically reduces query time and outperforms state-of-the-art methods in accuracy for large-scale landmark recognition. This family of algorithms can be applied to different embedding techniques, such as VLAD and R-MAC, obtaining excellent results in very short times on different public datasets: Holidays+Flickr1M, Oxford105k and Paris106k.

Published in: Engineering

  1. 1. Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition
       Federico Magliani, Tomaso Fontanini, and Andrea Prati
       IMP Lab - University of Parma, 22/10/2018
  2. 2. Agenda
       • Motivations
       • Related works
       • Proposed approach (Bag of Indexes)
       • Experimental results
       • Conclusions
  3. 3. Motivations
       • Approximate Nearest Neighbor (ANN) search problem
       • find relevant results among a huge quantity of data
       • trade-off between computational time and memory occupancy
       • applied to image, text and information retrieval
  4. 4. Agenda • Motivations • Related works • Proposed approach (Bag of Indexes) • Experimental results • Conclusions
  5. 5. Related works
       • Permutation Pivots represents image descriptors through permutations of a set of randomly selected reference objects;
       • Locality Sensitive Hashing (LSH) projects points that are close to each other into the same bucket with high probability;
       • Product Quantization (PQ) decomposes the space into a Cartesian product of low-dimensional subspaces and quantizes each subspace separately;
       • FLANN is an open-source library for ANN search and one of the most popular for nearest neighbor matching.
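The LSH bullet above can be made concrete with a minimal sketch. It assumes one common LSH family, the sign of random Gaussian projections; the slides do not say which family is used, and the names `make_hyperplanes` and `lsh_bucket` are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (assumed LSH family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_bucket(vector, hyperplanes):
    """Map a vector to an integer bucket: one bit per hyperplane, set by the sign of the dot product."""
    bucket = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bucket = (bucket << 1) | (1 if dot >= 0 else 0)
    return bucket
```

With 8 hyperplanes every descriptor lands in one of 2^8 = 256 buckets. Vectors separated by a small angle agree on most bits, which is why nearby points tend to share a bucket.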
  6. 6. Agenda • Motivations • Related works • Proposed approach (Bag of Indexes) • Experimental results • Conclusions
  7. 7. Proposed approach: Bag of Indexes (BoI)
       BoI is a multi-index hashing algorithm for the ANN search problem.
       • The database (Db) descriptors are projected through an LSH function, and the index of each signature is saved in the hash tables;
       • For each query, the following process is repeated for every projection:
         1. Project the descriptor.
         2. The indexes found in the buckets closest to the query are added to a ranking list (the BoI) with a weight that depends on the Hamming distance between the query bucket and the analysed bucket.
         3. At the end, the top-N elements are re-ranked according to Euclidean distance.
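The steps on this slide can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes sign-of-random-projection hashing, gives one vote per collision in the query bucket only (no neighboring buckets), and every function name is hypothetical.

```python
import math
import random
from collections import defaultdict


def make_planes(num_bits, dim, rng):
    """Random Gaussian hyperplanes for one hash table (assumed LSH family)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def sign_hash(vector, planes):
    """Integer bucket built from the signs of the projections."""
    bucket = 0
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bucket = (bucket << 1) | (1 if dot >= 0 else 0)
    return bucket


def build_tables(db, all_planes):
    """One hash table per projection: bucket -> list of database indexes."""
    tables = []
    for planes in all_planes:
        table = defaultdict(list)
        for idx, vec in enumerate(db):
            table[sign_hash(vec, planes)].append(idx)
        tables.append(table)
    return tables


def boi_query(query, db, tables, all_planes, top_n):
    """Accumulate votes over all tables, then re-rank the top_n by Euclidean distance."""
    votes = defaultdict(float)  # the "bag of indexes": database index -> accumulated weight
    for table, planes in zip(tables, all_planes):
        for idx in table.get(sign_hash(query, planes), ()):
            votes[idx] += 1.0
    shortlist = sorted(votes, key=votes.get, reverse=True)[:top_n]
    return sorted(shortlist, key=lambda i: math.dist(query, db[i]))
```

Only the `top_n` shortlisted descriptors are compared with the query by exact distance, so the expensive step no longer scales with the database size.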
  8. 8. Proposed approach: Bag of Indexes (BoI)
       [Figure: worked example with L = 3 hash tables. A bar chart shows the accumulated weight (axis 0 to 3) for each image index (1 to 7), with contributions from Hash Table 1, Hash Table 2 and Hash Table 3. Bucket contents shown: Hash Table 1: …, …, {4,6}, 5, {2,3}, …, …; Hash Table 2: …, 7, {5,3}, 1, …, …, …; Hash Table 3: …, …, …, 5, 3, {1,4}, …. The bucket of the query image is marked in each hash table.]
  9. 9. Proposed approach: Bag of Indexes (BoI)
       • Weighting strategy (multi-probe approach):
         $$ w(i, q, l) = \begin{cases} \dfrac{1}{2^{H(i,q)}}, & \text{if } H(i,q) \le l \\ 0, & \text{otherwise} \end{cases} $$
         where i is a generic bucket, q is the query bucket and H(i,q) is the Hamming distance between i and q.
       • Adaptive version: after a predefined number of hash tables, the gap is reduced in order to reduce the computational time.
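A small sketch of the weighting rule, assuming buckets are stored as integer bit strings. The helper names (`hamming`, `weight`, `neighbor_buckets`, `accumulate_votes`) are hypothetical, and enumerating neighbors by flipping bits is one possible way to probe nearby buckets, not necessarily the authors' way.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two buckets encoded as integers."""
    return bin(a ^ b).count("1")


def weight(i, q, l):
    """w(i, q, l) = 1 / 2^H(i, q) if H(i, q) <= l, else 0."""
    h = hamming(i, q)
    return 1.0 / (2 ** h) if h <= l else 0.0


def neighbor_buckets(q, num_bits, l):
    """Yield every bucket within Hamming distance l of q by flipping up to l bits."""
    for r in range(l + 1):
        for positions in combinations(range(num_bits), r):
            bucket = q
            for p in positions:
                bucket ^= 1 << p
            yield bucket


def accumulate_votes(table, q, num_bits, l, votes):
    """Add the weighted votes of one hash table (bucket -> indexes) to the votes dict."""
    for bucket in neighbor_buckets(q, num_bits, l):
        w = weight(bucket, q, l)
        for idx in table.get(bucket, ()):
            votes[idx] = votes.get(idx, 0.0) + w
```

For an 8-bit hash and l = 3, the neighborhood holds 1 + 8 + 28 + 56 = 93 buckets per table, which is why limiting the number of probed buckets matters for query time.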
  10. 10. Linear vs Sublinear reduction
       • linear: the number of neighboring buckets γ is reduced by 2 every 40 hash tables:
         $$ \gamma_i = \begin{cases} \gamma_{i-1} - 2, & \text{if } i \in \{\Delta_1, \dots, k_1 \Delta_1\} \\ \gamma_{i-1}, & \text{otherwise} \end{cases} $$
         with $i \in \{1, \dots, L\}$, $\Delta_1 = 40$ and $k_1 : k_1 \Delta_1 \le L$
       • sublinear: the number of neighboring buckets γ is reduced by 2 every 25 hash tables, but only after the first half of the hash tables:
         $$ \gamma_i = \begin{cases} \gamma_{i-1}, & \text{if } i \le \frac{L}{2} \\ \gamma_{i-1} - 2, & \text{if } i \in \{\frac{L}{2}, \frac{L}{2} + \Delta_2, \dots, \frac{L}{2} + k_2 \Delta_2\} \\ \gamma_{i-1}, & \text{otherwise} \end{cases} $$
         with $i \in \{1, \dots, L\}$, $\Delta_2 = 25$ and $k_2 : L/2 + k_2 \Delta_2 \le L$
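The two schedules can be compared with a short sketch. The function name `gap_schedule`, the clamp at zero and the integer division for L/2 are assumptions made for illustration; the reduction step of 2 and the periods 40 and 25 come from the slide.

```python
def gap_schedule(gamma0, num_tables, mode, delta1=40, delta2=25):
    """Return the gap used at each hash table i = 1..num_tables."""
    gammas = []
    gamma = gamma0
    half = num_tables // 2  # assumption: L/2 rounded down when L is odd
    for i in range(1, num_tables + 1):
        if mode == "linear":
            reduce_now = i % delta1 == 0
        elif mode == "sublinear":
            reduce_now = i >= half and (i - half) % delta2 == 0
        else:
            raise ValueError("mode must be 'linear' or 'sublinear'")
        if reduce_now:
            gamma = max(gamma - 2, 0)  # assumption: the gap never goes below zero
        gammas.append(gamma)
    return gammas
```

With L = 100, the linear schedule reduces the gap at tables 40 and 80, while the sublinear one keeps the full gap for the first half and then reduces it at tables 50, 75 and 100.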
  11. 11. BoI - Parameters config
       Symbol   Name             Value
       δ        hash dimension   2^8 = 256
       L        hash tables      100
       γ_0      initial gap      68
       l        neighbors used   3-neighbors
       -        reduction        sublinear
       ε        re-ranking       top 250 elements
  12. 12. Agenda • Motivations • Related works • Proposed approach (Bag of Indexes) • Experimental results • Conclusions
  13. 13. Datasets
       • Holidays+Flickr1M (1M distractor images + 1491 images: 500 classes, 500 queries);
       • Oxford105k (100k distractor images + 5062 images: 11 classes, 55 queries);
       • Paris106k (100k distractor images + 6412 images: 11 classes, 55 queries);
       • SIFT1M (1M 128D SIFT descriptors, 10k query images; only the top 100 images in the final ranking for each query are evaluated);
       • GIST1M (1M 960D GIST descriptors, 1k query images; only the top 100 images in the final ranking for each query are evaluated).
  14. 14. Evaluation Metrics
       • Different evaluation metrics are used to compare with the state-of-the-art approaches:
       • Recall@R, with R = 1, 10, 100: the fraction of queries for which the 1-nearest neighbor is ranked in the top R positions.
       • mAP (mean Average Precision): the mean over all queries of the Average Precision score, which rewards correct results according to their position in the ranking.
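Both metrics can be written in a few lines. This is a minimal sketch using the textbook definitions; the official evaluation protocols of the datasets may differ in details (for example, the handling of "junk" images in Oxford and Paris), and the function names are hypothetical.

```python
def recall_at_r(rankings, true_nn, r):
    """Fraction of queries whose true nearest neighbor appears in the top r results."""
    hits = sum(1 for ranking, nn in zip(rankings, true_nn) if nn in ranking[:r])
    return hits / len(rankings)


def average_precision(ranking, relevant):
    """Mean of the precision values measured at each position holding a relevant item."""
    hits = 0
    precision_sum = 0.0
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / pos
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """Average of the per-query Average Precision scores."""
    scores = [average_precision(rk, rel) for rk, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)
```

For the ranking [1, 5, 2, 7] with relevant items {1, 2}, the precisions at the two hits are 1/1 and 2/3, so the Average Precision is 5/6.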
  15. 15. Results on Holidays+Flickr1M
       Method                         ε     mAP       Avg retrieval time (msec)
       LSH                            250   86.03%    3103
       Multi-probe LSH                250   86.10%    16706
       Permutations                   250   82.70%    2844
       LOPQ                           250   36.37%    4
       FLANN                          250   83.97%    995
       BoI LSH                        250   78.10%    5
       BoI multi-probe LSH            250   85.16%    12
       BoI adaptive multi-probe LSH   250   85.35%    8
  16. 16. Results on Holidays+Flickr1M
       Method                         ε     mAP       Avg retrieval time (msec)
       Permutations                   10k   85.51%    15640
       LOPQ                           10k   67.22%    72
       FLANN                          10k   85.66%    1004
       BoI adaptive multi-probe LSH   10k   86.09%    16
  17. 17. Results on Oxford105k and Paris106k
       Method                         ε      Oxford105k mAP   Oxford105k avg ret. time (msec)   Paris106k mAP   Paris106k avg ret. time (msec)
       LSH                            2500   80.83%           610                               86.50%          607
       Permutations                   2500   81.89%           240                               88.14%          140
       LOPQ                           2500   71.70%           346                               87.47%          295
       FLANN                          2500   70.33%           2118                              68.93%          2132
       BoI adaptive multi-probe LSH   2500   81.44%           12                                87.90%          13
       Permutations                   10k    82.82%           250                               89.04%          164
       LOPQ                           10k    69.94%           1153                              88.00%          841
       FLANN                          10k    69.37%           2135                              70.73%          2156
       BoI adaptive multi-probe LSH   10k    84.38%           25                                92.31%          26
  18. 18. Results on SIFT1M
       Method                         ε     R=1       R=10      R=100     Avg retrieval time (msec)
       Permutations                   500   94.32%    94.98%    94.98%    16999
       LOPQ                           500   19.93%    44.80%    52.92%    3
       FLANN                          500   54.47%    54.83%    54.83%    16
       BoI adaptive multi-probe LSH   500   93.72%    94.34%    94.34%    22
       LOPQ                           10k   36.34%    80.11%    96.18%    104
       FLANN                          10k   95.06%    95.86%    95.86%    31
       BoI adaptive multi-probe LSH   10k   99.17%    99.85%    99.85%    30
  19. 19. Results on GIST1M
       Method                         ε     R=1       R=10      R=100     Avg retrieval time (msec)
       Permutations                   500   54.80%    55.30%    55.30%    17909
       FLANN                          500   28.30%    28.60%    28.60%    1262
       BoI adaptive multi-probe LSH   500   57.70%    58.20%    58.20%    69
       LOPQ                           10k   75.90%    76.50%    76.50%    1352
       BoI adaptive multi-probe LSH   10k   92.40%    93.40%    93.40%    108
  20. 20. Agenda • Motivations • Related works • Proposed approach (Bag of Indexes) • Experimental results • Conclusions
  21. 21. Conclusions
       • The proposed Bag of Indexes (BoI) adaptive multi-probe LSH is a simple technique for efficiently solving the ANN search problem.
       • BoI can be combined with different hashing/projection functions.
       • Experiments performed on five public datasets, namely Holidays+Flickr1M, Oxford105k, Paris106k, SIFT1M and GIST1M, demonstrate superior recognition accuracy with respect to the state of the art.
  22. 22. Thanks for your attention!
       • Questions?
       • Contacts: tomaso.fontanini@studenti.unipr.it
       • Website: implab.ce.unipr.it/?page_id=122
       • GitHub: github.com/fmaglia/BoI
