K-Nearest Neighbor as a Supervised Classifier
Compiled by: Dr. Kumud Kundu
K-Nearest Neighbor as a Supervised Classification Approach
• KNN is a non-parametric supervised learning technique that assigns a query instance to a category with the help of a labeled training set.
• Non-parametric means that it makes no assumptions about the underlying data distribution.
• To predict for a new instance (x), KNN searches the entire training set for the K most similar cases (neighbors) and summarizes the output variable of those K cases.
• In simple words, it stores all training cases and classifies new cases based on their similarity to them.
Two Phases of Supervised Classification
EUCLIDEAN DISTANCE: SIMILARITY METRIC
[Figure: Point A (x1, y1) and Point B (x2, y2) plotted on x-y axes]
$\text{Euclidean Distance} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
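The formula above can be sketched in Python as follows. The function name euclidean_distance and the two sample points are illustrative assumptions, not part of the original slides.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(a, b)))


# Hypothetical points A (x1, y1) and B (x2, y2)
point_a = (9, 6)
point_b = (10, 7)
print(euclidean_distance(point_a, point_b))  # sqrt(1 + 1), about 1.414
```

Because the sum runs over all coordinates, the same function also works for points with more than two features.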
MANHATTAN DISTANCE or CITY BLOCK DISTANCE: SIMILARITY METRIC
[Figure: Point A (x1, y1) and Point B (x2, y2) plotted on x-y axes]
$\text{Manhattan Distance} = |x_2 - x_1| + |y_2 - y_1|$
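A minimal Python sketch of the Manhattan (city block) distance is shown here. The function name manhattan_distance and the sample points are illustrative assumptions.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(q - p) for p, q in zip(a, b))


# Hypothetical points A (x1, y1) and B (x2, y2)
point_a = (9, 6)
point_b = (10, 7)
print(manhattan_distance(point_a, point_b))  # |10 - 9| + |7 - 6| = 2
```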
K NEAREST NEIGHBORS (KNN): ALGORITHM STEPS
1. Select a value for k (e.g., 1, 2, 3, 10, ...).
2. Calculate the Euclidean distance between the point to be classified and every point in the training data set.
3. Pick the k closest data points (the points with the k smallest distances).
4. Run a majority vote among the selected data points. The dominating class is the winner, and the point is assigned to that class.
5. Repeat if required.
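The algorithm steps can be sketched in plain Python roughly as follows. This is only an illustration: the function name knn_classify, the (features, label) data layout, and the tiny red/blue data set are assumptions made for the example, and ties in the vote are resolved arbitrarily by the order in which labels are counted.

```python
import math
from collections import Counter


def knn_classify(training_set, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    training_set is a list of (features, label) pairs.
    """
    # Step 2: Euclidean distance from the query to every training point
    distances = [(math.dist(features, query), label)
                 for features, label in training_set]
    # Step 3: keep the k points with the smallest distances
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: majority vote among the selected points
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data, for illustration only
training_set = [((0, 0), "red"), ((1, 0), "red"), ((0, 1), "red"),
                ((5, 5), "blue"), ((6, 5), "blue")]

# Step 1: choose k, then classify new points (step 5: repeat as needed)
print(knn_classify(training_set, (1, 1), k=3))
print(knn_classify(training_set, (5, 4), k=3))
```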
Predict Class for Tuple (10, 7)
Apply the KNN Classification Algorithm
Solution with K=3
Feature 1   Feature 2   Class   Euclidean Distance   Rank
1           1           A       10.81665383          8
2           3           A       8.94427191           7
2           4           A       8.544003745          6
5           3           A       6.403124237          5
8           6           B       2.236067977          4
8           8           B       2.236067977          3
9           6           B       1.414213562          2
11          7           B       1                    1
Predict for (10, 7): Predicted Class = B
1. Calculate the Euclidean distance between the point to be classified and every point in the training data set.
2. Pick the k=3 closest data points (the points with the k smallest distances).
3. Run a majority vote among the selected data points. The dominating class is the winner, and the point is assigned to that class.
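The worked example can be reproduced with a short script. The training tuples and the query (10, 7) come from the table; the variable names are illustrative. Note that (8, 6) and (8, 8) are equally far from the query, so which of them gets rank 3 and which gets rank 4 is arbitrary; both belong to class B, so the vote is unaffected.

```python
import math
from collections import Counter

# (Feature 1, Feature 2), Class: the eight training tuples from the table
training = [((1, 1), "A"), ((2, 3), "A"), ((2, 4), "A"), ((5, 3), "A"),
            ((8, 6), "B"), ((8, 8), "B"), ((9, 6), "B"), ((11, 7), "B")]
query = (10, 7)
k = 3

# Distance from the query to every training point, smallest first
ranked = sorted(((math.dist(features, query), features, label)
                 for features, label in training),
                key=lambda row: row[0])

for rank, (distance, features, label) in enumerate(ranked, start=1):
    print(rank, features, label, round(distance, 9))

# Majority vote among the k nearest neighbours
predicted = Counter(label for _, _, label in ranked[:k]).most_common(1)[0][0]
print("Predicted class:", predicted)
```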
How to decide the number of neighbors in KNN?
What are its effects on the classification accuracy?
The number of neighbors (K) in KNN is a hyperparameter that must be chosen when the model is built.
• K strongly affects the classification accuracy of the model: a very small K makes predictions sensitive to noisy points, while a very large K lets distant points from other classes outvote the true neighbors.
• Generally, K is chosen as an odd number if the number of classes is even, which reduces the chance of a tied vote.
• Otherwise, the value of K depends on the nature of the dataset (it is domain dependent) to which it is applied.
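One common way to compare candidate values of K is to hold out each training point in turn and check whether the remaining points classify it correctly (leave-one-out validation). The sketch below applies that idea to the eight training tuples from the K=3 worked example; the function names predict and loo_accuracy are assumptions made for illustration, and leave-one-out is offered as one possible selection technique, not the method prescribed by the slides.

```python
import math
from collections import Counter

# Training tuples from the K=3 worked example: (features, class)
DATA = [((1, 1), "A"), ((2, 3), "A"), ((2, 4), "A"), ((5, 3), "A"),
        ((8, 6), "B"), ((8, 8), "B"), ((9, 6), "B"), ((11, 7), "B")]


def predict(train, query, k):
    """Majority class among the k training points nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(data, k):
    """Fraction of points classified correctly when each is held out in turn."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every point except the held-out one
        hits += predict(rest, features, k) == label
    return hits / len(data)


for k in (1, 3, 5, 7):
    print(k, loo_accuracy(DATA, k))
```

On this small data set, K = 1, 3 and 5 classify every held-out point correctly, while K = 7 misclassifies all of them: each held-out point has only three neighbors of its own class against four of the other class, so the vote always goes the wrong way.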
KNN Implementation in Python
• First, import the KNeighborsClassifier class from the sklearn.neighbors module and create a KNN classifier object, passing the number of neighbors as an argument to KNeighborsClassifier().
from sklearn.neighbors import KNeighborsClassifier
# KNN model with 5 neighbours and Euclidean distance (Minkowski with p = 2) as the similarity metric
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
• Then, fit the model on the training set using fit() and predict on the test set using predict().
# Fitting K-NN to the training set
classifier.fit(X_train, y_train)
# Predicting the test set results
y_pred = classifier.predict(X_test)
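For a complete run, the same calls can be applied to the eight training tuples from the K=3 worked example. This is a sketch that assumes scikit-learn is installed; the contents of X_train, y_train and X_test are taken from that example, and n_neighbors is set to 3 to match it.

```python
from sklearn.neighbors import KNeighborsClassifier

# Training tuples and class labels from the K=3 worked example
X_train = [[1, 1], [2, 3], [2, 4], [5, 3], [8, 6], [8, 8], [9, 6], [11, 7]]
y_train = ["A", "A", "A", "A", "B", "B", "B", "B"]

# The tuple to classify
X_test = [[10, 7]]

# Minkowski distance with p = 2 is the Euclidean distance
classifier = KNeighborsClassifier(n_neighbors=3, metric="minkowski", p=2)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)
print(y_pred[0])
```

The three nearest neighbours of (10, 7) all belong to class B, so the prediction agrees with the hand calculation.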
QUICK CHECK
*Which of the following statements is true for k-NN classifiers?
A) The classification accuracy is better with larger values of k
B) The decision boundary is smoother with smaller values of k
C) The decision boundary is linear
D) k-NN does not require an explicit training step
*The k-NN algorithm does more computation at test time than at training time.
A) TRUE
B) FALSE
*Which of the following statements are true about the k-NN algorithm?
1. k-NN performs much better if all of the data have the same scale
2. k-NN works well with a small number of input variables (p), but struggles when the
number of inputs is very large
3. k-NN makes no assumptions about the functional form of the problem being solved
A) 1 and 2
B) 1 and 3
C) Only 1
D) All of the above
