K-Nearest Neighbor
CSE 4621 | Machine Learning | BScTE | Summer 22-23
Sabbir Ahmed
Assistant Professor, Dept of CSE, IUT
sabbirahmed@iut-dhaka.edu
Nearest Neighbor Classifier
• Memorize all data and labels
• Predict the label of the most similar training image
Distance Metric to compare images
L1 distance:
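The L1 distance itself appeared only as a figure on the slide; the standard definition, which compares two images pixel by pixel and sums the absolute differences, is:

```latex
d_1(I_1, I_2) = \sum_{p} \left| I_1^{p} - I_2^{p} \right|
```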
Nearest Neighbor Classifier
• Memorize training data
• For each test image:
  • Find the nearest training image
  • Return the label of that nearest image

Q: With N examples, how fast is training?
A: O(1)
Q: With N examples, how fast is testing?
A: O(N)
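As a rough illustration of the points above, here is a minimal from-scratch sketch of a nearest neighbor classifier with the L1 distance (the class name, the use of NumPy, and the array shapes are assumptions for this example, not part of the slides; X and y are assumed to be NumPy arrays):

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # "Training" just memorizes the data: O(1) with respect to N.
        self.X_train = X  # shape (N, D): N training images, each flattened to D pixels
        self.y_train = y  # shape (N,): integer labels

    def predict(self, X):
        # Testing compares each query against every training image: O(N) per query.
        y_pred = np.empty(X.shape[0], dtype=self.y_train.dtype)
        for i in range(X.shape[0]):
            # L1 distance from the i-th test image to all training images
            distances = np.sum(np.abs(self.X_train - X[i]), axis=1)
            y_pred[i] = self.y_train[np.argmin(distances)]  # label of the nearest image
        return y_pred
```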
What does this look like?
Nearest Neighbor Decision Boundaries
Nearest neighbors in two dimensions (axes x0 and x1):
• Points are training examples; colors give training labels.
• Background colors give the category a test point x would be assigned.
• The decision boundary is the boundary between two classification regions.
• Decision boundaries can be noisy; they are affected by outliers.
How to smooth out decision boundaries? Use more neighbors!
K-Nearest Neighbors
Instead of copying the label from the single nearest neighbor, take a majority vote from the K closest points (the figures compare K = 1 with K = 3).
• Using more neighbors helps smooth out rough decision boundaries.
• Using more neighbors helps reduce the effect of outliers.
• When K > 1 there can be ties between classes; the tie needs to be broken somehow! (A sketch with one possible tie-break follows below.)
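A minimal extension of the earlier sketch to K neighbors, with majority voting and one possible tie-breaking rule (preferring the class of the closer neighbor); the function name and the choice of L2 distance here are illustrative assumptions:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Distances (L2 / Euclidean) from the query point x to every training point.
    distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
    nearest = np.argsort(distances)[:k]       # indices of the K closest points
    labels = y_train[nearest]                 # their labels, ordered closest-first
    classes, counts = np.unique(labels, return_counts=True)
    tied = classes[counts == counts.max()]    # classes tied for the most votes
    if len(tied) == 1:
        return tied[0]
    # Tie-break: walk the neighbors from closest to farthest and return the
    # first label that belongs to one of the tied classes.
    for label in labels:
        if label in tied:
            return label
```

For example, `knn_predict(X_train, y_train, x_test, k=3)` returns the majority label among the three training points nearest to `x_test`.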
K-Nearest Neighbors: Distance Metric
L1 (Manhattan) distance vs. L2 (Euclidean) distance (both shown with K = 1).
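For reference, the two metrics compared on this slide, written out explicitly:

```latex
d_1(I_1, I_2) = \sum_{p} \left| I_1^{p} - I_2^{p} \right|
\qquad
d_2(I_1, I_2) = \sqrt{\sum_{p} \left( I_1^{p} - I_2^{p} \right)^{2}}
```

L1 depends on the choice of coordinate axes, while L2 does not, which is why the two metrics produce visibly different decision boundaries in the figures.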
K-Nearest Neighbors: Web Demo
http://vision.stanford.edu/teaching/cs231n-demos/knn/
• Interactively move points around and see the decision boundaries change
• Play with L1 vs L2 metrics
• Play with changing the number of training points and the value of K
Hyperparameters
What is the best value of K to use? What is the best distance metric to use?
These are examples of hyperparameters: choices about our learning algorithm that we don’t learn from the training data; instead, we set them at the start of the learning process.
Very problem-dependent. In general we need to try them all and see what works best for our data / task.
Setting Hyperparameters
Idea #1: Choose hyperparameters that work best on the whole dataset.
BAD: K = 1 always works perfectly on the training data.

Idea #2: Split the data into train and test; choose hyperparameters that work best on the test data.
BAD: No idea how the algorithm will perform on new data.
(Your Dataset → train | test)

Idea #3: Split the data into train, val, and test; choose hyperparameters on val and evaluate on test.
Better!
(Your Dataset → train | validation | test)
A small sketch of Idea #3 follows below.
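A minimal sketch of Idea #3, choosing K on the validation split. It reuses the knn_predict sketch from earlier; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def choose_k(X_train, y_train, X_val, y_val, candidate_ks=(1, 3, 5, 7, 9)):
    # Evaluate each candidate K on the held-out validation set and keep the best one.
    best_k, best_acc = None, -1.0
    for k in candidate_ks:
        preds = np.array([knn_predict(X_train, y_train, x, k=k) for x in X_val])
        acc = np.mean(preds == y_val)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc
```

The chosen K is then evaluated once on the test set to estimate performance on new data.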
Setting Hyperparameters
Idea #4: Cross-Validation: Split the data into folds, try each fold as validation, and average the results.
(Your Dataset → fold 1 | fold 2 | fold 3 | fold 4 | fold 5 | test)
Useful for small datasets, but (unfortunately) not used too frequently in deep learning.
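If scikit-learn is available, Idea #4 can be sketched in a few lines (the candidate values of K and the choice of KNeighborsClassifier are illustrative assumptions):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def cross_validate_k(X, y, candidate_ks=(1, 3, 5, 7, 9, 11), folds=5):
    # For each candidate K, run `folds`-fold cross-validation and record mean/std accuracy.
    results = {}
    for k in candidate_ks:
        scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=folds)
        results[k] = (scores.mean(), scores.std())
    return results
```

Plotting the mean and standard deviation per K gives the kind of figure shown on the next slide.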
Setting Hyperparameters
Example of 5-fold cross-validation for the value of K.
Each point is a single outcome; the line goes through the mean, and the bars indicate the standard deviation.
(It seems that K ≈ 7 works best for this data.)
Mathematical explanation of K-Nearest Neighbour
• KNN is a supervised learning algorithm, mostly used for classification: a data point is classified on the basis of how its neighbours are classified.
• KNN stores all available cases and classifies new cases based on a similarity measure.
• K in KNN is a parameter that refers to the number of nearest neighbours to include in the majority voting process.
• How do we choose K?
  • A common rule of thumb is K ≈ √n, where n is the total number of data points; if this gives an even number, add or subtract 1 to make K odd, which helps avoid ties (see the short sketch after this list).
• When to use KNN?
  • KNN is a good fit when the dataset is labelled, noise-free, and fairly small, because KNN is a “lazy learner”: it memorizes the training data and must compare every new point against all of it. Let’s understand the KNN algorithm with the help of an example.
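A tiny sketch of the √n rule of thumb mentioned above (the function name is illustrative):

```python
import math

def heuristic_k(n):
    # Rule of thumb from the slides: K ≈ sqrt(n), nudged to an odd number to reduce ties.
    k = int(round(math.sqrt(n)))
    return k + 1 if k % 2 == 0 else k

print(heuristic_k(100))  # sqrt(100) = 10 (even), so this returns 11
```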
[Slide: a table of people with their age, gender, and favourite sport.]
• Here male is denoted with the numeric value 0 and female with 1.
• Let’s find which class of people Angelina will lie in, given that K = 3 and her age is 5.
• We use the Euclidean distance, d = √((x2 − x1)² + (y2 − y1)²), to find the distance between any two points.
• So let’s find the distance between Ajay and Angelina using the formula
d = √((age2 − age1)² + (gender2 − gender1)²)
d = √((5 − 32)² + (1 − 0)²)
d = √(729 + 1) = √730
d ≈ 27.02
Similarly, we find all the other distances one by one.
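A quick check of this arithmetic (the values are exactly those used on the slide):

```python
import math

# Ajay: age 32, gender 0; Angelina: age 5, gender 1
d = math.sqrt((5 - 32) ** 2 + (1 - 0) ** 2)
print(round(d, 2))  # 27.02
```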
[Slide: a table of the computed distances from Angelina to every other person.]
• With K = 3, the three smallest distances are 9, 10, and 10.5, so the people closest to Angelina are Zaira, Michael and Smith:
Zaira – 9 – cricket
Michael – 10 – cricket
Smith – 10.5 – football
• By majority vote (two of these three prefer cricket), the KNN algorithm places Angelina in the class of people who like cricket. So this is how the KNN algorithm works.
Reference
• https://www.geeksforgeeks.org/mathematical-explanation-of-k-nearest-neighbour/
