2. “Goal - Become a Data Scientist”
“A Dream becomes a Goal when action is taken towards its achievement” - Bo Bennett
“The Plan”
“A Goal without a Plan is just a wish”
3. ● Fundamentals of Nearest Neighbour
● Nearest Neighbours for Unsupervised Learning
● Nearest Neighbours for Classification
● Nearest Neighbours for Regression
● Nearest Centroid Classifier
Agenda
4. Fundamentals of Nearest Neighbour
● The principle behind nearest neighbor methods is to find a predefined
number of training samples closest in distance to the new point, and predict
the label from these.
● The number of samples can be a user-defined constant (k-nearest neighbor
learning), or vary based on the local density of points (radius-based neighbor
learning).
● Being a non-parametric method, it is often successful in classification
situations where the decision boundary is very irregular.
● Neighbors-based methods are known as non-generalizing machine learning
methods, since they simply “remember” all of their training data.
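The principle above can be sketched in a few lines of plain NumPy. This is a minimal illustration with a made-up toy dataset (`X_train`, `y_train`, and `knn_predict` are names invented here for the example): the query's label is taken by majority vote over the k training points closest in Euclidean distance.

```python
import numpy as np
from collections import Counter

# Toy 2-D dataset: two well-separated clusters with labels "a" and "b"
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])

def knn_predict(query, k=3):
    # Distance from the query point to every stored training sample
    dists = np.linalg.norm(X_train - query, axis=1)
    # Labels of the k closest samples, then a majority vote
    nearest = y_train[np.argsort(dists)[:k]]
    return Counter(nearest).most_common(1)[0][0]

print(knn_predict(np.array([5.0, 5.5])))   # lands in the "b" cluster
print(knn_predict(np.array([0.2, 0.3])))   # lands in the "a" cluster
```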
5. Algorithms of Nearest Neighbours
● Data structures used to store and search the training data:
● Brute Force
● K-D Tree
● Ball Tree
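In scikit-learn these three backends are selected via the `algorithm` parameter of `NearestNeighbors` (`'brute'`, `'kd_tree'`, `'ball_tree'`). A quick sketch, using a random dataset made up for the example, shows that all three return the same neighbours; only the search strategy differs.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Random 3-D points (fixed seed so the run is reproducible)
X = np.random.RandomState(0).rand(100, 3)
query = X[:1]  # use the first point itself as the query

results = {}
for algo in ("brute", "kd_tree", "ball_tree"):
    nn = NearestNeighbors(n_neighbors=4, algorithm=algo).fit(X)
    _, idx = nn.kneighbors(query)
    results[algo] = tuple(idx[0])
    print(algo, results[algo])
```

Brute force scales poorly with dataset size; the tree structures trade index-building time for faster queries in low-to-moderate dimensions.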
6. Nearest Neighbours Unsupervised Learning
● During fit, it just stores the training data.
● For a query point, it simply finds the k nearest neighbours.
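A minimal sketch with scikit-learn's `NearestNeighbors` (the small dataset is made up for the example): `fit` only stores `X`, and `kneighbors` returns the distances and indices of the k closest stored points for a query.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])

nn = NearestNeighbors(n_neighbors=2).fit(X)   # fit just stores X
distances, indices = nn.kneighbors([[0, 0]])  # 2 nearest points to the origin
print(indices)    # row indices into X of the two closest samples
print(distances)  # their Euclidean distances
```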
7. Nearest Neighbours Classification
● Instance-based learning; it does not construct a generalized model.
● A query point is assigned the data class which has the most representatives
within the nearest neighbors of the point.
● Two major types
● KNeighborsClassifier ( based on a configured k )
● RadiusNeighborsClassifier ( based on a configured radius r )
● Weights can be ‘uniform’ or ‘distance’; ‘distance’ assigns each neighbour a
weight proportional to the inverse of its distance from the query point.
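A short sketch of `KNeighborsClassifier` with `weights='distance'`, on a one-dimensional toy dataset invented for the example: each query is assigned the majority class among its 3 nearest training points.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two clearly separated groups on the number line
X = np.array([[0], [1], [2], [10], [11], [12]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3, weights="distance").fit(X, y)

# Queries near each group pick up that group's class
preds = clf.predict([[1.5], [10.5]])
print(preds)  # → [0 1]
```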
10. Nearest Neighbours Regression
● Data labels are continuous.
● The label assigned to a query point is computed as the mean of the
labels of its nearest neighbors.
● Two types of regressors
● KNeighborsRegressor
● RadiusNeighborsRegressor
● The same weighting options ( ‘uniform’ / ‘distance’ ) apply as in classification.
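A sketch of `KNeighborsRegressor` on a tiny made-up dataset: with `weights='uniform'`, the prediction for a query is simply the mean of its k nearest neighbours' labels.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Labels here happen to equal the input, so means are easy to check
X = np.array([[0], [1], [2], [3]])
y = np.array([0.0, 1.0, 2.0, 3.0])

reg = KNeighborsRegressor(n_neighbors=2, weights="uniform").fit(X, y)

# Query at 1.5: nearest labels are 1.0 and 2.0, so the mean is 1.5
pred = reg.predict([[1.5]])
print(pred)  # → [1.5]
```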