CLASSIFICATION
K-NEAREST NEIGHBOURS (KNN)

Classification Algorithm

Supervised Learning

Instance-based learning method for classifying objects based on the
closest training examples in the feature space.

KNN is a simple algorithm that stores all available cases and classifies
new cases based on a similarity measure.
K-NEAREST NEIGHBOURS

In KNN, K = the number of neighbours considered.

If K = 1, assign the class of the single nearest neighbor.

If K > 1, assign the most frequent class among the K nearest neighbors.

When to consider nearest neighbor?
• Lots of training data
• Fewer than 20 attributes per example
Example of KNN (figure slides)
KNN-Algorithm

Step-by-step procedure for the K-nearest neighbors (KNN) algorithm
(a Python sketch follows this list):
• Determine parameter K = number of nearest neighbors.
• Calculate the distance between the query instance and all the
training samples.
• Sort the distances and determine the nearest neighbors based on the
K-th minimum distance.
• Gather the categories (class labels) of the nearest neighbors.
• Use the simple majority of the categories of the nearest neighbors as
the prediction value for the query instance.
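
The five steps above map almost line-for-line onto code. Below is a
minimal from-scratch sketch; the function name knn_classify and the
choice of Euclidean distance are illustrative assumptions, not part of
the original slides.

```python
import math
from collections import Counter

def knn_classify(query, samples, labels, k=3):
    """Predict the class of `query` by majority vote among its k nearest samples."""
    # Step 2: Euclidean distance from the query instance to every training sample.
    distances = [math.dist(query, s) for s in samples]
    # Step 3: order sample indices by distance and keep the k closest.
    nearest = sorted(range(len(samples)), key=lambda i: distances[i])[:k]
    # Step 4: gather the categories (class labels) of those neighbors.
    neighbor_labels = [labels[i] for i in nearest]
    # Step 5: the most common label wins the majority vote.
    return Counter(neighbor_labels).most_common(1)[0][0]
```

Note that ties (possible when K is even) are broken arbitrarily here; a
production implementation would need an explicit tie-breaking rule.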
Numerical Example of KNN

A student is evaluated by an internal examiner and an external examiner,
and accordingly the student's result can be Pass or Fail.
Student | X1 (rating by internal examiner) | X2 (rating by external examiner) | Y
S1      | 7 | 7 | Pass
S2      | 7 | 4 | Pass
S3      | 3 | 4 | Fail
S4      | 1 | 4 | Fail
S5      | 3 | 7 | ?
SOLUTION

Decide the new student's (S5) result:

Step 1: Determine parameter K = number of nearest neighbors.

Suppose we use K = 3.

Step 2: Calculate the distance between the query instance and all the
training samples.

The coordinates of the query instance (S5) are (3, 7).
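
The distances in the table below are Euclidean. For a query q = (q1, q2)
and a training sample x = (x1, x2):

d(q, x) = sqrt((x1 - q1)^2 + (x2 - q2)^2)

For example, for S1 = (7, 7): d = sqrt((7 - 3)^2 + (7 - 7)^2) = sqrt(16) = 4.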
SOLUTION

Step 2 continued...

x1 | x2 | Euclidean distance to query instance (3, 7) | Included in 3-NN?
7  | 7  | 4    | Yes
7  | 4  | 5    | No
3  | 4  | 3    | Yes
1  | 4  | 3.61 | Yes
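
As a quick check, the same distances can be reproduced in a few lines of
Python (the variable names are illustrative):

```python
import math

samples = {"S1": (7, 7), "S2": (7, 4), "S3": (3, 4), "S4": (1, 4)}
query = (3, 7)  # student S5

# Euclidean distance from each training sample to the query instance.
for name, point in samples.items():
    print(name, round(math.dist(point, query), 2))
# Output: S1 4.0, S2 5.0, S3 3.0, S4 3.61 -- matching the table above.
```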
SOLUTION

Step 3: Sort the distances, i.e., arrange all the distances above in
non-decreasing order: (3, 3.61, 4, 5).

Step 4: Gather the categories of the nearest neighbors. Select the K = 3
smallest distances: (d3, d4, d1) => (3, 3.61, 4).

d3 = (3, 4, Fail)   d4 = (1, 4, Fail)   d1 = (7, 7, Pass)

Step 5: Use the majority category of the nearest neighbors as the
prediction value for the query instance:

k-pass = 1

k-fail = 2

k-pass < k-fail

So the new student (test instance S5) is classified as Fail, because
k-fail is the majority.
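
The same prediction can be reproduced with scikit-learn's
KNeighborsClassifier, which defaults to the Euclidean metric; a minimal
sketch, assuming scikit-learn is installed:

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[7, 7], [7, 4], [3, 4], [1, 4]]       # S1..S4: examiner ratings
y = ["Pass", "Pass", "Fail", "Fail"]       # known results

clf = KNeighborsClassifier(n_neighbors=3)  # K = 3
clf.fit(X, y)

print(clf.predict([[3, 7]]))               # S5 -> ['Fail']
```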
Summary

K-Nearest Neighbors (KNN) is a simple yet powerful machine
learning algorithm that offers several advantages.

It is very fast to train because it does not build an explicit model;
instead, it stores the data and makes predictions based on similarity at
query time.

This allows KNN to learn and represent even complex target functions
effectively.

Another strength is that it does not lose information since all the
training data is retained, making it well-suited for problems where
every detail might be important.
