SlideShare a Scribd company logo
1 of 6
Nearest Neighbor Search
Hup university
27-02-2018
1
Nearest Neighbor Search
• Nearest neighbor search (NNS), as a form of proximity
search, is the optimization problem of finding the point
in a given set that is closest to a given point.
• Closeness is typically expressed in terms of a
dissimilarity function: the less similar the objects, the
larger the function values.
• Formally, the NN search problem is defined as follows:
given a set S of points in a space M and a query point q
∈ M, find the closest point in S to q.
• A direct generalization of this problem is a k-NN search,
where we need to find the k closest points.
2
Distance Metric
• Most commonly M is a metric space and dissimilarity is
expressed as a distance metric, which is symmetric and
satisfies the triangle inequality.
• Even more common, M is taken to be the d-
dimensional vector space where dissimilarity is
measured using the Euclidean distance, Manhattan
distance or other distance metric.
• However, the dissimilarity function can be arbitrary.
One example are asymmetric Bregman divergences, for
which the triangle inequality does not hold
3
Quality & Usefulness Of The
Algorithms
• Various solutions to the NNS problem have been
proposed. The quality and usefulness of the
algorithms are determined by the time
complexity of queries as well as the space
complexity of any search data structures that
must be maintained.
• The informal observation usually referred to as
the curse of dimensionality states that there is no
general-purpose exact solution for NNS in high-
dimensional Euclidean space using polynomial
preprocessing and polylog arithmic search time.
4
Linear Search
• The simplest solution to the NNS problem is to
compute the distance from the query point to every
other point in the database, keeping track of the "best
so far".
• This algorithm, sometimes referred to as the naive
approach, has a running time of O(dN) where N is
the cardinality of S and d is the dimensionality of M.
• There are no search data structures to maintain, so
linear search has no space complexity beyond the
storage of the database.
• Naive search can, on average, outperform space
partitioning approaches on higher dimensional spaces
5
Space Partitioning
• Several space-partitioning methods have been developed for solving the
NNS problem.
• k-d tree, which iteratively bisects the search space into two regions
containing half of the points of the parent region.
• Queries are performed via traversal of the tree from the root to a leaf by
evaluating the query point at each split.
• Depending on the distance specified in the query, neighboring branches
that might contain hits may also need to be evaluated.
• For constant dimension query time, average complexity is O(log N) in the
case of randomly distributed points, worst case complexity is O(kN^(1-
1/k)) Alternatively the R-tree data structure was designed to support
nearest neighbor search in dynamic context, as it has efficient algorithms
for insertions and deletions such as the R* tree.
• R-trees can yield nearest neighbors not only for Euclidean distance, but
can also be used with other distances.
6

More Related Content

What's hot

Wavelet based image fusion
Wavelet based image fusionWavelet based image fusion
Wavelet based image fusionUmed Paliwal
 
Architecture of data mining system
Architecture of data mining systemArchitecture of data mining system
Architecture of data mining systemramya marichamy
 
4.3 multimedia datamining
4.3 multimedia datamining4.3 multimedia datamining
4.3 multimedia dataminingKrish_ver2
 
Logics for non monotonic reasoning-ai
Logics for non monotonic reasoning-aiLogics for non monotonic reasoning-ai
Logics for non monotonic reasoning-aiShaishavShah8
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationMostafa G. M. Mostafa
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processingPage Maker
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine LearningPavithra Thippanaik
 
Adversarial search
Adversarial searchAdversarial search
Adversarial searchNilu Desai
 
Image Representation & Descriptors
Image Representation & DescriptorsImage Representation & Descriptors
Image Representation & DescriptorsPundrikPatel
 
Evaluating hypothesis
Evaluating  hypothesisEvaluating  hypothesis
Evaluating hypothesisswapnac12
 
Parallel sorting algorithm
Parallel sorting algorithmParallel sorting algorithm
Parallel sorting algorithmRicha Kumari
 
Knowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceKnowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceRamla Sheikh
 

What's hot (20)

Wavelet based image fusion
Wavelet based image fusionWavelet based image fusion
Wavelet based image fusion
 
Dempster shafer theory
Dempster shafer theoryDempster shafer theory
Dempster shafer theory
 
Concept learning
Concept learningConcept learning
Concept learning
 
Architecture of data mining system
Architecture of data mining systemArchitecture of data mining system
Architecture of data mining system
 
4.3 multimedia datamining
4.3 multimedia datamining4.3 multimedia datamining
4.3 multimedia datamining
 
GMM
GMMGMM
GMM
 
Logics for non monotonic reasoning-ai
Logics for non monotonic reasoning-aiLogics for non monotonic reasoning-ai
Logics for non monotonic reasoning-ai
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Applications of paralleL processing
Applications of paralleL processingApplications of paralleL processing
Applications of paralleL processing
 
Planning
PlanningPlanning
Planning
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine Learning
 
Adversarial search
Adversarial searchAdversarial search
Adversarial search
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Image Representation & Descriptors
Image Representation & DescriptorsImage Representation & Descriptors
Image Representation & Descriptors
 
Dbscan algorithom
Dbscan algorithomDbscan algorithom
Dbscan algorithom
 
Principal Component Analysis
Principal Component AnalysisPrincipal Component Analysis
Principal Component Analysis
 
Evaluating hypothesis
Evaluating  hypothesisEvaluating  hypothesis
Evaluating hypothesis
 
Parallel sorting algorithm
Parallel sorting algorithmParallel sorting algorithm
Parallel sorting algorithm
 
Knowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceKnowledge representation In Artificial Intelligence
Knowledge representation In Artificial Intelligence
 
Machine learning clustering
Machine learning clusteringMachine learning clustering
Machine learning clustering
 

Similar to Nearest neighbor search

ODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleKuldeep Jiwani
 
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Maninda Edirisooriya
 
A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...Jungyeol
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersAlbert Y. C. Chen
 
DMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringDMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringPier Luca Lanzi
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringPier Luca Lanzi
 
SPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSSPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSLiemNguyenDuy
 
Pivot Selection Techniques
Pivot Selection TechniquesPivot Selection Techniques
Pivot Selection TechniquesCatarina Moreira
 
Talk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsTalk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsStefan Kühn
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3Nandhini S
 

Similar to Nearest neighbor search (20)

Db Scan
Db ScanDb Scan
Db Scan
 
ODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scale
 
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
 
A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...A Quantitative analysis and performance study for similarity search methods i...
A Quantitative analysis and performance study for similarity search methods i...
 
2.uninformed search
2.uninformed search2.uninformed search
2.uninformed search
 
unit 4 nearest neighbor.ppt
unit 4 nearest neighbor.pptunit 4 nearest neighbor.ppt
unit 4 nearest neighbor.ppt
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
Lecture6 pca
Lecture6 pcaLecture6 pca
Lecture6 pca
 
KNN
KNNKNN
KNN
 
DMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to ClusteringDMTM 2015 - 06 Introduction to Clustering
DMTM 2015 - 06 Introduction to Clustering
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 Clustering
 
DBSCAN (1) (4).pptx
DBSCAN (1) (4).pptxDBSCAN (1) (4).pptx
DBSCAN (1) (4).pptx
 
SPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSSPATIAL POINT PATTERNS
SPATIAL POINT PATTERNS
 
Pivot Selection Techniques
Pivot Selection TechniquesPivot Selection Techniques
Pivot Selection Techniques
 
Lecture7
Lecture7Lecture7
Lecture7
 
Talk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsTalk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and Applications
 
Chap11 slides
Chap11 slidesChap11 slides
Chap11 slides
 
Sudoku
SudokuSudoku
Sudoku
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
 

More from zahraa F.Muhsen (20)

Compute skillspowerpoint lect2
Compute skillspowerpoint lect2Compute skillspowerpoint lect2
Compute skillspowerpoint lect2
 
Computer skills powerpoint intro
Computer skills powerpoint introComputer skills powerpoint intro
Computer skills powerpoint intro
 
Filter
FilterFilter
Filter
 
Table
TableTable
Table
 
List
ListList
List
 
Computer skills excel2010 5
Computer skills excel2010 5Computer skills excel2010 5
Computer skills excel2010 5
 
Computer skills excel2010 data
Computer skills excel2010 dataComputer skills excel2010 data
Computer skills excel2010 data
 
Computer skills excel2010 2
Computer skills excel2010 2Computer skills excel2010 2
Computer skills excel2010 2
 
Computer skills excel2010
Computer skills excel2010Computer skills excel2010
Computer skills excel2010
 
Computer skills excel2010 3
Computer skills excel2010 3Computer skills excel2010 3
Computer skills excel2010 3
 
Computer skills excel2010-4
Computer skills excel2010-4Computer skills excel2010-4
Computer skills excel2010-4
 
Computer skills excel2010
Computer skills excel2010Computer skills excel2010
Computer skills excel2010
 
Genetic algorithms
Genetic algorithmsGenetic algorithms
Genetic algorithms
 
Internet search techniques
Internet search techniquesInternet search techniques
Internet search techniques
 
Search methods
Search methodsSearch methods
Search methods
 
Hash function
Hash functionHash function
Hash function
 
Pigeonhole sort
Pigeonhole sortPigeonhole sort
Pigeonhole sort
 
Gnome sort
Gnome sortGnome sort
Gnome sort
 
Puzzle (1000 coins and 10 bags)
Puzzle (1000 coins and 10 bags)Puzzle (1000 coins and 10 bags)
Puzzle (1000 coins and 10 bags)
 
Php1
Php1Php1
Php1
 

Recently uploaded

RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 

Recently uploaded (20)

RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 

Nearest neighbor search

  • 1. Nearest Neighbor Search Hup university 27-02-2018 1
  • 2. Nearest Neighbor Search • Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest to a given point. • Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. • Formally, the NN search problem is defined as follows: given a set S of points in a space M and a query point q ∈ M, find the closest point in S to q. • A direct generalization of this problem is a k-NN search, where we need to find the k closest points. 2
  • 3. Distance Metric • Most commonly M is a metric space and dissimilarity is expressed as a distance metric, which is symmetric and satisfies the triangle inequality. • Even more common, M is taken to be the d- dimensional vector space where dissimilarity is measured using the Euclidean distance, Manhattan distance or other distance metric. • However, the dissimilarity function can be arbitrary. One example are asymmetric Bregman divergences, for which the triangle inequality does not hold 3
  • 4. Quality & Usefulness Of The Algorithms • Various solutions to the NNS problem have been proposed. The quality and usefulness of the algorithms are determined by the time complexity of queries as well as the space complexity of any search data structures that must be maintained. • The informal observation usually referred to as the curse of dimensionality states that there is no general-purpose exact solution for NNS in high- dimensional Euclidean space using polynomial preprocessing and polylog arithmic search time. 4
  • 5. Linear Search • The simplest solution to the NNS problem is to compute the distance from the query point to every other point in the database, keeping track of the "best so far". • This algorithm, sometimes referred to as the naive approach, has a running time of O(dN) where N is the cardinality of S and d is the dimensionality of M. • There are no search data structures to maintain, so linear search has no space complexity beyond the storage of the database. • Naive search can, on average, outperform space partitioning approaches on higher dimensional spaces 5
  • 6. Space Partitioning • Several space-partitioning methods have been developed for solving the NNS problem. • k-d tree, which iteratively bisects the search space into two regions containing half of the points of the parent region. • Queries are performed via traversal of the tree from the root to a leaf by evaluating the query point at each split. • Depending on the distance specified in the query, neighboring branches that might contain hits may also need to be evaluated. • For constant dimension query time, average complexity is O(log N) in the case of randomly distributed points, worst case complexity is O(kN^(1- 1/k)) Alternatively the R-tree data structure was designed to support nearest neighbor search in dynamic context, as it has efficient algorithms for insertions and deletions such as the R* tree. • R-trees can yield nearest neighbors not only for Euclidean distance, but can also be used with other distances. 6