Achieving Spatial Adaptivity while Searching for Approximate Nearest Neighbors

Don Sheehy

p is a c-approximate nearest neighbor of q if
d(q, p) < c d(q, NN(q)),
where NN(q) is the nearest neighbor of q among the input points.

Goals
• O(log n) queries
• O(d^{3/2})-approximation
• Spatially Adaptive
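
The definition can be checked directly by brute force. The sketch below is only an illustration: the names `dist` and `is_c_approximate_nn` and the sample points are hypothetical, and NN(q) is found by a linear scan rather than by any data structure from the talk.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_c_approximate_nn(p, q, points, c):
    """Return True if d(q, p) < c * d(q, NN(q)), with NN(q) found by linear scan."""
    nn_dist = min(dist(q, x) for x in points)
    return dist(q, p) < c * nn_dist

# Hypothetical sample input: NN(q) is (0, 0) at distance 1; (3, 0) is at distance 2.
points = [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
q = (1.0, 0.0)
ok_with_c3 = is_c_approximate_nn((3.0, 0.0), q, points, 3)  # 2 < 3
ok_with_c2 = is_c_approximate_nn((3.0, 0.0), q, points, 2)  # 2 < 2 fails (strict)
```

Because the inequality is strict as written on the slide, a point at exactly c times the nearest distance does not qualify.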

Dynamic Finger Property:
Query time is O(log(δ(p, q))), where
• p is the previous query result.
• q is the current query.
• δ(p,q) is the number of input points between p and q.
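
One way to see the dynamic finger property in 1D is galloping (exponential) search outward from the previous result. The sketch below assumes a static sorted Python list standing in for a dynamic 1D structure; `finger_search` is a hypothetical name. It probes at offsets 1, 2, 4, ... from the finger and then binary searches the bracketed range, so the number of comparisons grows with the log of the rank distance between finger and target rather than with log n.

```python
import bisect

def finger_search(keys, finger, target):
    """Return the leftmost index i with keys[i] >= target (same as bisect_left),
    found by galloping outward from index `finger` in the sorted list `keys`."""
    n = len(keys)
    if n == 0:
        return 0
    finger = max(0, min(finger, n - 1))
    if keys[finger] < target:
        # Gallop right: keep the invariant keys[lo] < target.
        lo, step = finger, 1
        hi = finger + step
        while hi < n and keys[hi] < target:
            lo = hi
            step *= 2
            hi = finger + step
        hi = min(hi, n)
        # The answer lies in [lo + 1, hi].
        return bisect.bisect_left(keys, target, lo + 1, hi)
    # Gallop left: keep the invariant keys[hi] >= target.
    hi, step = finger, 1
    lo = finger - step
    while lo >= 0 and keys[lo] >= target:
        hi = lo
        step *= 2
        lo = finger - step
    lo = max(lo, -1)
    # The answer lies in [lo + 1, hi].
    return bisect.bisect_left(keys, target, lo + 1, hi)

keys = [1, 3, 5, 7, 9, 11, 13]
previous = 3                               # index of the previous query result
current = finger_search(keys, previous, 8)  # nearby target, few probes
```

The returned index can serve as the finger for the next query.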

Finger search is pretty well solved in 1D
[pointer machine: Brodal et al.], [RAM: Andersson & Thorup]

Skip lists work.


Previous Work in 2D

• Proximate Point Search [Demaine, Iacono, and Langerman, ’02/’04]

• Proximate Point Location [Iacono and Langerman ’03]

This work.

• O(d^{3/2})-ANN in R^d

• O(d^2 lg(δ(p,q))) queries. (Finger Search!)

• O(dn) space.
• O(d^2 n lg n) preprocessing time

[Figure: a box around p and q, labeled 8d |p-q|.]

δ(p, q) = number of input points in this box.

A classic O(d^{3/2})-ANN Algorithm:
[Bern93, Chan98, Liao01 et al.]

Put the input points in your favorite 1-dimensional data structure, ordered by the Z-order.

Construct d+1 of these data structures, such that all points inserted into the i-th are shifted by i/(d+1) × diameter in every coordinate.

For each query, search all d+1 data structures for a nearest neighbor in the shifted Z-order. Return the one that is closest in R^d.
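
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited authors' code: points have non-negative integer coordinates below 2**bits (so 2**bits plays the role of the diameter), each "1-dimensional data structure" is a sorted Python list, and "nearest neighbor in the shifted Z-order" is taken to mean the predecessor and successor of the query's key. The names `z_key`, `build`, and `query` are hypothetical.

```python
import bisect

def z_key(point, bits):
    """Interleave the low `bits` bits of each coordinate (Morton / Z-order code)."""
    d = len(point)
    key = 0
    for b in range(bits):
        for j, x in enumerate(point):
            key |= ((x >> b) & 1) << (b * d + j)
    return key

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build(points, bits):
    """Build d+1 sorted lists; the i-th stores points shifted by i/(d+1) * 2**bits."""
    d = len(points[0])
    structures = []
    for i in range(d + 1):
        shift = (i << bits) // (d + 1)
        # Shifted coordinates are below 2**(bits+1), so interleave bits+1 bits.
        keyed = sorted((z_key(tuple(x + shift for x in p), bits + 1), p)
                       for p in points)
        structures.append((shift, [k for k, _ in keyed], [p for _, p in keyed]))
    return structures

def query(structures, q, bits):
    """Check the Z-order neighbors of q in every shifted list; keep the closest."""
    best, best_d2 = None, None
    for shift, keys, pts in structures:
        k = z_key(tuple(x + shift for x in q), bits + 1)
        pos = bisect.bisect_left(keys, k)
        for idx in (pos - 1, pos):
            if 0 <= idx < len(pts):
                d2 = dist2(q, pts[idx])
                if best is None or d2 < best_d2:
                    best, best_d2 = pts[idx], d2
    return best

# Hypothetical sample input on a 16 x 16 grid (bits = 4, d = 2).
points = [(1, 2), (5, 6), (12, 3), (7, 7), (14, 14)]
structures = build(points, 4)
answer = query(structures, (6, 6), 4)
```

The returned point is an approximate nearest neighbor only; the sketch makes no attempt to verify the approximation factor.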

Lemma [Chan]: With at least d+1 shifts, at
most 1 is “bad” for each dimension.

So, for at least one shift, x and NN(x) share a small quadtree (QT) cell.

Are we done?

Z-order reduces d dimensions to 1 dimension.

It works for ANN.

Finger search in 1 dimension is solved.

Making it work.

1. Use 2d+1 1D data structures that support finger search.
2. Run all 2d+1 searches in parallel.
3. Stop when d+1 are finished.
4. Return the best answer among the d+1 that finished.
5. Manually update the finger pointers in all 2d+1 structures.
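
Steps 2 to 4 can be simulated on one thread by advancing the searches round-robin, one step at a time. The sketch below is an assumption-laden illustration: each search is modeled as a Python generator that yields once per unit of work and returns its answer, `fake_search` and `run_until_enough` are hypothetical names, and step 5 (updating the finger pointers) is left out.

```python
def fake_search(steps, answer):
    """Stand-in for one finger search: does `steps` units of work, then returns."""
    for _ in range(steps):
        yield
    return answer

def run_until_enough(searches, needed):
    """Advance all searches one step per round until `needed` of them finish.
    Returns {search index: answer} for the searches that finished."""
    active = dict(enumerate(searches))
    finished = {}
    while active and len(finished) < needed:
        for i in list(active):
            try:
                next(active[i])
            except StopIteration as stop:
                finished[i] = stop.value   # generator's return value
                del active[i]
                if len(finished) == needed:
                    break
    return finished

# Hypothetical run with d = 2: 2d+1 = 5 searches, stop once d+1 = 3 finish.
d = 2
costs = [3, 1000, 2, 1000, 4]              # two searches are slow
answers = [(5.0, "a"), (1.0, "b"), (2.0, "c"), (0.5, "d"), (3.0, "e")]
searches = [fake_search(c, a) for c, a in zip(costs, answers)]
finished = run_until_enough(searches, d + 1)
best = min(finished.values())              # smallest (distance, point) pair
```

The two slow searches are abandoned after a handful of steps, so the total work is bounded by 2d+1 times the cost of the (d+1)-th fastest search. The best finished answer here is not the overall best, which is consistent with returning an approximate rather than exact neighbor.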

If q and NN(q) are in a small QT box relative to their distance then the approximation is good.

If p and q are in a small QT box relative to their distance then the runtime is good.

Summary

A new data structure.

• O(d^{3/2})-ANN in R^d

• O(d^2 lg(δ(p,q))) queries. (Finger Search!)

• O(dn) space.
• O(d^2 n lg n) preprocessing time

1. Achieving Spatial Adaptivity in Approximate Nearest Neighbor Search. Jonathan Derryberry, Don Sheehy, Daniel Sleator, and Maverick Woo

33. The placement of the origin matters.

44. We apply the Chan Lemma twice.
• Since at most d shifts are bad for p and q, we can stop after all but d finish and the runtime is good.
• Since d+1 are left over, at least one will be a good approximation.

47. Thank you. Questions?
