K-NEAREST NEIGHBOURS
Classification Algorithm
Supervised Learning
Instance-based learning method for classifying objects based on the
closest training examples in the feature space.
KNN is a simple algorithm that stores all available cases and classifies
new cases based on a similarity measure.
K = number of nearest neighbours.
If K = 1, assign the class of the single nearest neighbour.
If K > 1, assign the most frequent class among the K nearest neighbours.
When to Consider Nearest Neighbours?
Lots of training data
Fewer than 20 attributes per example
KNN Algorithm
Step by step, how to compute the K-nearest neighbours (KNN) algorithm:
Determine the parameter K = number of nearest neighbours.
Calculate the distance between the query instance and all the
training samples.
Sort the distances and determine the nearest neighbours based on the
K-th minimum distance.
Gather the categories of those nearest neighbours.
Use the simple majority of the categories of the nearest neighbours as
the prediction for the query instance.
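The steps above can be sketched in Python. This is a minimal illustration, not a library implementation: the function name `knn_predict` is an assumption, and Euclidean distance is used as the similarity measure, matching the worked example in this deck.

```python
from collections import Counter
import math

def knn_predict(query, X_train, y_train, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Step 2: Euclidean distance from the query instance to every training sample
    distances = [math.dist(query, x) for x in X_train]
    # Step 3: sort sample indices by distance and keep the k nearest
    nearest = sorted(range(len(X_train)), key=lambda i: distances[i])[:k]
    # Step 4: gather the categories of those neighbours
    labels = [y_train[i] for i in nearest]
    # Step 5: simple majority vote
    return Counter(labels).most_common(1)[0][0]
```

With K = 1 this reduces to copying the label of the single closest sample; with K > 1 ties are broken by `Counter.most_common`'s insertion order, which is one of several reasonable tie-breaking choices.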
Numerical Example of KNN
A student is rated by an internal examiner and an external examiner, and
accordingly the student's result can be Pass or Fail.
Student  X1 (rating by internal examiner)  X2 (rating by external examiner)  Y
S1       7                                 7                                 Pass
S2       7                                 4                                 Pass
S3       3                                 4                                 Fail
S4       1                                 4                                 Fail
S5       3                                 7                                 ?
SOLUTION
Decide the new student's result:
Step 1: Determine the parameter K = number of nearest neighbours.
Suppose K = 3.
Step 2: Calculate the distance between the query instance and all the
training samples.
The coordinates of the query instance are (3, 7).
Step 2 (continued):
x1  x2  Euclidean distance to query instance (3, 7)  Included in 3-NN?
7   7   4.00                                         Yes
7   4   5.00                                         No
3   4   3.00                                         Yes
1   4   3.61                                         Yes
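The distances in this step can be reproduced in a few lines of Python; `math.dist` computes the Euclidean distance between two points.

```python
import math

query = (3, 7)
training = [(7, 7), (7, 4), (3, 4), (1, 4)]

# Euclidean distance from the query instance to each training sample
for x1, x2 in training:
    d = math.dist(query, (x1, x2))
    print(f"({x1}, {x2}) -> {d:.2f}")
# -> (7, 7) -> 4.00
#    (7, 4) -> 5.00
#    (3, 4) -> 3.00
#    (1, 4) -> 3.61
```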
Step 3: Sort the distances, i.e. arrange all the distances above in
non-decreasing order: (3, 3.61, 4, 5).
Step 4: Gather the categories of the nearest neighbours. Select the K = 3
smallest distances from above: (d3, d4, d1) => (3, 3.61, 4)
d3 = (3, 4, Fail); d4 = (1, 4, Fail); d1 = (7, 7, Pass)
Step 5: Take the majority of the categories of the nearest neighbours as
the prediction for the query instance:
k-Pass = 1
k-Fail = 2
k-Pass < k-Fail
So the new student (the test instance) is classified as Fail, because Fail
is the majority category among the neighbours.
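The majority vote in Step 5 can be checked with `collections.Counter`, tallying the categories of the three selected neighbours (d3, d4, d1):

```python
from collections import Counter

# Categories of the 3 nearest neighbours: d3 = Fail, d4 = Fail, d1 = Pass
neighbour_labels = ["Fail", "Fail", "Pass"]

votes = Counter(neighbour_labels)
prediction = votes.most_common(1)[0][0]
print(votes["Pass"], votes["Fail"], prediction)  # 1 2 Fail
```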
Summary
K-Nearest Neighbours (KNN) is a simple yet powerful machine
learning algorithm that offers several advantages.
It is very fast to train because it does not build an explicit model;
instead, it stores the training data and makes predictions based on
similarity at query time.
This allows KNN to learn and represent even complex target functions
effectively.
Another strength is that it does not lose information since all the
training data is retained, making it well-suited for problems where
every detail might be important.