International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 09 | Sep -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1385
Enhancing Classification Accuracy of K-Nearest Neighbours Algorithm
Using Gain Ratio
Aditya Duneja1, Thendral Puyalnithi2
1School of Computer Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India
2Professor, School of Computer Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – The K-Nearest Neighbors (KNN) technique is among the simplest classification methods in machine learning. KNN finds the k nearest neighbors of a test vector and assigns it a class by majority voting. One of the assumptions in KNN is that the data lies in a feature space, so that measures such as Euclidean distance and Manhattan distance can be applied to it. KNN has applications in computer vision, recommender systems and various other fields. In this paper, a modification to this method is put forward which aims at improving its accuracy. The modified version combines the distance metrics used in K-nearest neighbors with the entropy, or strength, of each attribute: the way the distance between data points is calculated is altered so that the strength of each individual attribute or dimension is taken into consideration. The proposed technique was tested on various UCI datasets, and the accuracies of the original and modified methods are compared.
Key Words: classification, entropy, Euclidean distance,
KNN, dimension
1. Introduction
In machine learning, classification is a supervised learning technique that uses a training set, comprising various instances and their respective classes, to determine the category of a new observation or set of observations. Classification has a wide range of applications in fields such as medical imaging, speech recognition, handwriting recognition and search engines, so high-accuracy classification models are required to make correct predictions in these fields. Examples of classification models are decision trees, Support Vector Machines and the K-nearest neighbor classifier.
The K-nearest neighbor (KNN) classifier is one of the most frequently used classification techniques in machine learning. The algorithm computes the distance between the test observation and every observation in the training set; the distance computed is usually the Euclidean distance, although other distance measures can also be applied. The k nearest observations are then chosen, and the class held by the majority of these k observations is assigned to the test observation. KNN makes no assumptions about the distribution of the data and is therefore non-parametric. No separate training phase is defined for this algorithm, as the whole dataset is used to compute distances for every query.
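The procedure just described can be sketched in a few lines of plain Python (a minimal illustration, not the implementation evaluated in this paper; the function and variable names are ours):

```python
import math
from collections import Counter

def knn_predict(train, labels, test, k):
    """Classify `test` by majority vote among its k nearest training points."""
    # Euclidean distance from the test vector to every training instance
    dists = sorted(
        (math.dist(row, test), lbl) for row, lbl in zip(train, labels)
    )
    # Majority vote over the labels of the k closest neighbours
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]
```

Note that every prediction scans the entire training set, which is exactly the "no separate training phase" property described above.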
KNN has undergone many modifications over the years aimed at improving its accuracy. Ertugrul and Tagluk [1] proposed a new version of KNN in which the neighbors are chosen with respect to the angle between them: even if two neighbors lie at the same distance from the test vector, they are distinguished by the angles they make. Singh et al. [2] devised a way to accelerate the calibration process of KNN using parallel computing. Parvin et al. [3] modified the KNN algorithm by weighting the neighbors of the unknown observation, with the contribution of every neighbor taken to be inversely proportional to its distance from the test vector. Fuzzy K-nearest neighbors was first proposed in 1985 by Keller et al. [4]; fuzzy memberships are assigned to the training data to obtain more accurate results, as every neighbor is given a different weight. In this paper, a new modification to the K-nearest neighbors algorithm is proposed that takes the strength of the attributes into account. Faziludeen et al. [5] proposed evidential K-nearest neighbors (EKNN), in which the classes of all k nearest neighbors of a test sample, not only the majority class, are used to determine its class; the distance between samples measures the weight of the evidence attributed to each class. Some research has also been done on reducing the size of the dataset required for KNN to give effective results: Song et al. [6] proposed an algorithm which removes outlier instances from the dataset and sorts the remaining instances by the difference in output between each instance and its neighbors. Another modification to KNN was put forward by Nguyen et al. [8] in the form of a distance metric learning method, with the resulting problem formulated as a quadratic program. Lin et al. [9] proposed a nearest neighbor classifier that fuses neighborhood information: while computing the distance between samples, the influence of their neighbors is taken into account. Manocha et al. [10] proposed a probabilistic nearest neighbor classifier which uses probabilistic methods to assign class membership, an aspect which KNN lacks. Sarkar [11] improved the accuracy of KNN by
introducing a fuzzy-rough KNN classifier, which combines conventional KNN with fuzzy KNN; an optimal K value is not required for this algorithm to work effectively, and fuzzy ownership values are calculated for each class. Timofte et al. [12] proposed the iterative nearest neighbors algorithm, which combines the power of sparse representation (SR) [13] and locally linear embedding [14] with K-nearest neighbors. A belief-based KNN (BKNN) was introduced by Liu et al. [15] to overcome the difficulty the conventional KNN faces in classifying data points which lie very close to each other; in BKNN, every point has a specific belief value for each class, which helps reduce the classification error. Zhang et al. [16] proposed a multi-label lazy learning KNN (ML-KNN), which evaluates the K nearest neighbors of every instance and uses the maximum a posteriori principle to assign the label of the test instance. Sierra et al. [17] proposed K-nearest neighbor equality (KNNE), which extends the idea of KNN: each class is individually searched for k points, and the class whose k points have the least average distance to the test instance is chosen as the class for that instance.
2. Conventional and the Proposed Methods:
KNN is a non-parametric lazy learning algorithm. The non-parametric property is useful because, in real-world problems, data often does not follow a particular distribution such as the Gaussian distribution. KNN is lazy in the sense that the training phase is minimal: almost all computation is deferred until a test vector must be classified, unlike eager learners such as Support Vector Machines. In the KNN algorithm, the distance between the test vector and every tuple in the training set is computed, and the k nearest neighbors are found by selecting the first k points after sorting the array of distances in increasing order. The distance metric used can vary, but Euclidean distance is the popular choice. The class assigned to the test vector is the one held by the majority of the k points chosen.
The ID3 algorithm defines a method for picking the most suitable attribute on which a decision tree node will be split, using the concept of entropy. The entropy of a dataset is the expected number of bits needed to encode the class label of a tuple; a logarithm to base 2 is used because the information is encoded in bits. Let D be a dataset, C(i) the ith class, and C(i,D) the set of tuples of class C(i) in D. Let |D| and |C(i,D)| denote the number of tuples in D and C(i,D), respectively, and let m denote the number of classes. Info(D) is the average amount of information needed to identify the class label of a tuple in D:
Info(D) = − Σᵢ₌₁ᵐ pᵢ log₂(pᵢ)

where pᵢ is the probability that an arbitrary tuple in D belongs to class C(i), estimated by |C(i,D)| / |D|.
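For concreteness, Info(D) can be computed directly from a list of class labels (an illustrative sketch; the helper name is ours):

```python
import math
from collections import Counter

def entropy(class_labels):
    """Info(D): average bits needed to identify the class of a tuple in D."""
    n = len(class_labels)
    # p_i = |C(i,D)| / |D| for each class i present in D
    return -sum(
        (c / n) * math.log2(c / n) for c in Counter(class_labels).values()
    )
```

A 50/50 split between two classes gives 1 bit; a pure dataset gives 0.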
The higher the entropy, the harder it is to predict the class simply from seeing the input, and the more bits are needed to represent the dataset. The formula above gives the initial entropy of the dataset. To build a decision tree, we need to pick the root attribute while keeping the height of the tree small, so that the tree does not overfit the data by growing to maximum depth. For that, the entropy of each attribute is calculated:
Info_A(D) = Σⱼ₌₁ᵛ (|Dⱼ| / |D|) × Info(Dⱼ)

where Info_A(D) is the entropy, or additional information, required to make a classification after partitioning on attribute A: the weighted sum of the entropies of each of the v partitions Dⱼ of attribute A. The smaller the entropy of the attribute, the more information is gained, since the initial entropy decreases by that amount. The information gain of an attribute is therefore:

Gain(A) = Info(D) − Info_A(D)
The higher the information gain, the stronger the attribute, so the split takes place on the attribute with the highest information gain. The ID3 algorithm has some drawbacks, however: it is biased toward attributes with a large number of partitions, even when they are not stronger than other attributes. C4.5 extends the idea of ID3 and removes this bias by introducing the split information (SplitInfo) of an attribute, which normalizes the information gain so that it is not biased toward such attributes. The normalization divides the information gain of an attribute by its SplitInfo, and the resulting value is known as the gain ratio of the attribute:

GainRatio(A) = Gain(A) / SplitInfo_A(D)

where the SplitInfo of an attribute A is given by:

SplitInfo_A(D) = − Σⱼ₌₁ᵛ (|Dⱼ| / |D|) log₂(|Dⱼ| / |D|)
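Putting Gain(A) and SplitInfo together, the gain ratio of an attribute can be computed as follows (an illustrative sketch under our own naming; the dataset is given as parallel lists of attribute values and class labels):

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D) for a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(attr_values, labels):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D)."""
    n = len(labels)
    # Group the class labels by the value of attribute A (the v partitions)
    partitions = {}
    for v, lbl in zip(attr_values, labels):
        partitions.setdefault(v, []).append(lbl)
    # Info_A(D): weighted sum of the entropies of the v partitions of A
    info_a = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - info_a
    # SplitInfo_A(D) penalises attributes with many partitions
    split_info = -sum(
        len(p) / n * math.log2(len(p) / n) for p in partitions.values()
    )
    return gain / split_info if split_info else 0.0
```

An attribute that perfectly separates the classes gets a gain ratio of 1; an attribute independent of the class gets 0.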
The modification is as follows: while calculating the Euclidean distance between the test vector and each instance in the training set, the difference along each attribute is divided by the gain ratio of that attribute.
Say the dataset has three attributes, the test vector is (test1, test2, test3) and a training instance is (training1, training2, training3). The modified distance is given by:

dist(x, y) = ( Σᵢ ((xᵢ − yᵢ) / GainRatio(i))² )^(1/2)

where i runs from 1 to the number of attributes (or dimensions), x is the test vector, y is the training instance, and GainRatio(i) is the gain ratio of the ith attribute.
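This modified distance is straightforward to implement (a sketch with our own naming; the per-attribute gain ratios are assumed to be precomputed):

```python
import math

def modified_distance(x, y, gain_ratios):
    """Euclidean distance in which the difference along each attribute is
    divided by that attribute's gain ratio before squaring."""
    return math.sqrt(
        sum(((xi - yi) / g) ** 2 for xi, yi, g in zip(x, y, gain_ratios))
    )
```

With all gain ratios equal to 1 this reduces to the ordinary Euclidean distance.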
The intuition behind this method is as follows. When the distances are calculated, the points with the least distance are selected as neighbors (k times). Since the calculated distance is the sum of the squares of the differences in the individual attributes, reducing an attribute's effective contribution to the distance makes a point more likely to be selected as a neighbor. By dividing each attribute's difference by its gain ratio, we take the strength of the attribute into consideration while calculating the distance. A stronger attribute, one whose value lets us say with greater certainty, relative to the other attributes, which class will be assigned to the test example, has a larger gain ratio; so even when the difference along that attribute is large, dividing this larger number (the distance) by a larger number (the gain ratio) reduces the effective distance, and the stronger attributes contribute more to the classification process. This concept is not covered by the original KNN algorithm, where every attribute is given equal weight.
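Combining the weighted distance with the usual majority vote gives the full classifier in sketch form (our own naming; the gain ratios are assumed to be precomputed from the training set):

```python
import math
from collections import Counter

def modified_knn(train, labels, test, k, gain_ratios):
    """KNN in which each attribute's contribution to the distance is scaled
    by 1 / GainRatio(i), so attribute strength influences neighbour choice."""
    def dist(row):
        return math.sqrt(
            sum(((a - b) / g) ** 2 for a, b, g in zip(row, test, gain_ratios))
        )
    # Pick the k training points closest under the modified distance
    nearest = sorted(zip(train, labels), key=lambda rl: dist(rl[0]))[:k]
    # Majority vote, as in conventional KNN
    return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
```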
3. Analysis and Results
The proposed algorithm was implemented in MATLAB and tested on the 16 datasets listed in Table 1.
Table 1: Information about the datasets used for evaluation.
A rule of thumb in machine learning is to pick K near the square root of the number of instances in the training set; in practice this does a good job of separating signal from noise, and that convention is followed here. The algorithm was tested on categorical and mixed (categorical + numerical) datasets. The accuracies in Table 2, given in % for both the modified KNN (Mod. KNN) and conventional KNN, are averages over 5 trials.
Table 2: Accuracies of datasets for conventional KNN and
the modified version of KNN.
Figure 1 illustrates the accuracies on the datasets mentioned above.
Figure 1: Accuracies on 16 datasets for KNN and the modified version of KNN
SNo. | Dataset name                  | No. of rows | No. of features | Type
-----|-------------------------------|-------------|-----------------|------------
1    | Bal. Scale Weight & Distance  | 625         | 4               | Categorical
2    | Car Evaluation                | 1728        | 6               | Categorical
3    | Contraceptive Method Choice   | 1473        | 8               | Categorical
4    | Tic-Tac-Toe Endgame           | 958         | 9               | Categorical
5    | Congressional Voting Records  | 435         | 16              | Categorical
6    | Australian Credit Approval    | 680         | 14              | Mixed
7    | Thyroid Disease               | 1225        | 21              | Mixed
8    | German Credit                 | 1770        | 24              | Mixed
9    | Teaching Assistant Evaluation | 151         | 5               | Mixed
10   | Heart Disease                 | 256         | 13              | Mixed
11   | Poker Hand                    | 1240        | 10              | Mixed
12   | Sleep                         | 1600        | 13              | Mixed
13   | Mushroom                      | 2500        | 21              | Mixed
14   | Credit Card                   | 500         | 15              | Mixed
15   | Shuttle                       | 3600        | 9               | Mixed
16   | Breast Cancer Wisconsin       | 730         | 10              | Mixed
Dataset                        | K  | Mod. KNN (%) | KNN (%)
-------------------------------|----|--------------|--------
Bal. Scale Weight and Distance | 25 | 77.68        | 61.92
Car Evaluation                 | 42 | 69.74        | 69.74
Contraceptive Method Choice    | 39 | 92.77        | 92.77
Tic-Tac-Toe Endgame            | 31 | 75.94        | 64.72
Congressional Voting Records   | 20 | 59.9         | 56.6
Australian Credit Approval     | 26 | 61           | 57.1
Thyroid Disease                | 35 | 94.1         | 94.1
German Credit                  | 42 | 68.7         | 68.4
Teaching Assistant Evaluation  | 12 | 45.6         | 39.7
Heart Disease                  | 16 | 58.3         | 51.4
Poker Hand                     | 35 | 52.3         | 47.7
Sleep                          | 40 | 50           | 49.5
Mushroom                       | 50 | 89           | 89
Credit Card                    | 22 | 70.4         | 68.9
Shuttle                        | 60 | 79.4         | 79.4
Breast Cancer Wisconsin        | 27 | 79.1         | 74.8
4. Conclusion:
Many of the previously discussed modifications to KNN are fairly complex and account for only slight improvements in classification accuracy. The modification made in this paper is intuitive and easy to comprehend, and it improves on the results of conventional KNN, slightly for some datasets and substantially for others. Of the 16 datasets evaluated, 8 showed an increase of more than 5% in classification accuracy and 4 showed an average increase of around 3%. In the worst case, the accuracy of the proposed algorithm on a dataset equals that of conventional KNN. From this analysis we conclude that the accuracy of the modified version of KNN is equal to or greater than that of the original K-nearest neighbors algorithm.
5. References:
[1] Ertugrul, Ö. F., & Tagluk, M. E. (2017). A novel version
of k nearest neighbor: Dependent nearest neighbor.
Applied Soft Computing, 55, 480-490.
[2] Singh, A., Deep, K., & Grover, P. (2017). A novel
approach to accelerate calibration process of a k-nearest
neighbor classifier using GPU. Journal of Parallel and
Distributed Computing, 104, 114-129.
[3] Parvin, H., Alizadeh, H., & Minati, B. (2010). A
modification on k-nearest neighbor classifier. Global
Journal of Computer Science and Technology.
[4] Keller, J. M., Gray, M. R., & Givens, J. A. (1985). A fuzzy
k-nearest neighbor algorithm. IEEE transactions on
systems, man, and cybernetics, (4), 580-585.
[5] Faziludeen, S., & Sankaran, P. (2016). ECG Beat
Classification Using Evidential K-Nearest Neighbours.
Procedia Computer Science, 89, 499-505.
[6] Song, Y., Liang, J., Lu, J., & Zhao, X. (2017). An efficient
instance selection algorithm for k nearest neighbor
regression. Neurocomputing, 251, 26-34.
[7] Bian, H., & Shen, W. (2002). Fuzzy-rough nearest-neighbor classification method: an integrated framework. In Proc. IASTED Internat. Conf. on Applied Informatics, Austria, pp. 160-164.
[8] Nguyen, B., Morell, C., & De Baets, B. (2016). Large-
scale distance metric learning for k-nearest neighbors
regression. Neurocomputing, 214, 805-814.
[9] Lin, Y., Li, J., Lin, M., & Chen, J. (2014). A new nearest
neighbor classifier via fusing neighborhood information.
Neurocomputing, 143, 164-169.
[10] Manocha, S., & Girolami, M. A. (2007). An empirical
analysis of the probabilistic k-nearest neighbour classifier.
Pattern Recognition Letters, 28(13), 1818-1824.
[11] Sarkar, M. (2007). Fuzzy-rough nearest neighbor
algorithms in classification. Fuzzy Sets and Systems,
158(19), 2134-2152.
[12] Timofte, R., & Van Gool, L. (2015). Iterative nearest
neighbors. Pattern Recognition, 48(1), 60-72.
[13] Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311-4322.
[14] Roweis, S. T., & Saul, L. K. (2000). Nonlinear
dimensionality reduction by locally linear
embedding. science, 290(5500), 2323-2326.
[15] Liu, Z. G., Pan, Q., & Dezert, J. (2013). A new belief-
based K-nearest neighbor classification method. Pattern
Recognition, 46(3), 834-844.
[16] Zhang, M. L., & Zhou, Z. H. (2007). ML-KNN: A lazy
learning approach to multi-label learning. Pattern
recognition, 40(7), 2038-2048.
[17] Sierra, B., Lazkano, E., Irigoien, I., Jauregi, E., &
Mendialdua, I. (2011). K Nearest Neighbor Equality: Giving
equal chance to all existing classes. Information Sciences,
181(23), 5158-5168.
[18] Lichman, M. (2013). UCI Machine Learning
Repository [http://archive.ics.uci.edu/ml]. Irvine, CA:
University of California, School of Information and
Computer Science
IRJET- Performance Evaluation of Various Classification Algorithms
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification Algorithms
 
A046010107
A046010107A046010107
A046010107
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 

Enhancing Classification Accuracy of K-Nearest Neighbors Algorithm using Gain Ratio

a training set, comprising various instances and their respective classes, to determine the category of a new observation or a set of new observations. It is a supervised learning technique. Classification has a wide range of applications in fields such as medical imaging, speech recognition, handwriting recognition and search engines, so highly accurate classification models are needed to make correct predictions in these areas. Common classification models include decision trees, Support Vector Machines and the K-nearest neighbor classifier.

The K-nearest neighbor (KNN) classifier is one of the most frequently used classification techniques in machine learning. It computes the distance between the test observation and every observation in the training set; the distance is usually Euclidean, although other measures can also be applied. The k nearest observations are then chosen, and the class held by the majority of these k observations is assigned to the test observation. KNN makes no assumptions about the distribution of the data and is therefore non-parametric. No separate training phase is defined for the algorithm, since the whole dataset is used to compute distances each time.

KNN has undergone many modifications over the years intended to improve its accuracy. Faruk et al. [1] proposed a version of KNN in which the neighbors are chosen with respect to the angle between them: even if two neighbors are equidistant from the test vector, they are considered different with respect to the angles they make. Singh et al. [2] devised a way to accelerate the calibration process of KNN using parallel computing. Parvin et al. [3] modified the KNN algorithm by weighting the neighbors of the unknown observation.
The contribution of every neighbor is considered inversely proportional to its distance from the test vector. Fuzzy K-nearest neighbors was first proposed in 1985 by Keller et al. [4]: fuzzy memberships are assigned to the training data to obtain more accurate results, as every neighbor is given a different weight. In this paper, a new modification to the K-nearest neighbors algorithm is proposed that takes the strength of the attributes into account. Faziludeen et al. [5] proposed evidential K-nearest neighbors (EKNN), in which the classes of all k nearest neighbors of a test sample, not only the majority class, are used to determine the output; the distance between samples measures the weight of the evidence attributed to each class. Some research has also aimed at reducing the size of the dataset required for KNN to give effective results: Song et al. [6] proposed an algorithm that removes outlier instances from the dataset and sorts the remainder by the difference in output between instances and their neighbors. Another modification was put forward by Nguyen et al. [8] in the form of a distance-metric learning method, with the resulting problem formulated as a quadratic program. Lin et al. [9] proposed a nearest-neighbor classifier that combines neighborhood information: while computing the distance between samples, the influence of their neighbors is taken into account. Manocha et al. [10] proposed a probabilistic nearest-neighbor classifier that uses probabilistic methods to assign class membership, an aspect that standard KNN lacks. Sarkar [11] improved the accuracy of KNN by
introducing the fuzzy-rough KNN classifier, which combines conventional KNN with fuzzy KNN; the optimal value of k is not required for the algorithm to work effectively, and fuzzy ownership values are calculated for each class. Timofte et al. [12] proposed the iterative nearest neighbors algorithm, which combines the power of sparse representation (SR) [13] and locally linear embedding [14] with K-nearest neighbors. A belief-based KNN (BKNN) was introduced by Liu et al. [15] to overcome the difficulties conventional KNN faces when classifying data points that lie very close to each other; in BKNN, every point carries a specific belief value for each class, which helps reduce classification error. Zhang et al. [16] proposed a multi-label lazy learning KNN (ML-KNN), which finds the k nearest neighbors of every instance and uses the maximum a posteriori principle to assign the label of the test instance. Sierra et al. [17] proposed K-nearest neighbor equality (KNNE), which extends the idea of KNN: each class is searched individually for its k nearest points, and the class with the least average distance between those k points and the test instance is chosen for that instance.

2. Conventional and the Proposed Methods

KNN is a non-parametric, lazy learning algorithm. The non-parametric property is useful because, in real-world problems, data generally does not follow a particular distribution such as the Gaussian. KNN is lazy in the sense that the training phase is minimal and the full dataset is used at test time, unlike most other machine learning algorithms such as Support Vector Machines.
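As a concrete reference point, the conventional KNN classifier can be sketched in a few lines of Python. This is an illustrative implementation with made-up data, not the authors' code:

```python
import math
from collections import Counter

def euclidean(x, y):
    # Standard Euclidean distance between two feature vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train_X, train_y, test_x, k=3):
    # Distance from the test vector to every tuple in the training set,
    # sorted in increasing order.
    dists = sorted((euclidean(x, test_x), label)
                   for x, label in zip(train_X, train_y))
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train_X = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [7.9, 8.1]]
train_y = ["A", "A", "B", "B"]
print(knn_predict(train_X, train_y, [1.1, 1.0], k=3))  # -> A
```

Because no model is fitted, the full training set must be scanned for every prediction, which is exactly the lazy-learning behavior described above.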
In the KNN algorithm, the distance between the test vector and every tuple in the training set is computed, and the k nearest neighbors are obtained by sorting the distances in increasing order and selecting the first k points. The distance metric can vary, but Euclidean distance is the popular choice. The class assigned to the test vector is the one held by the majority of those k points.

In the ID3 algorithm, a method is defined for picking the most suitable attribute on which to split a decision tree, based on the concept of entropy. The entropy of a dataset is the average number of bits needed to encode the class label of a tuple; a logarithm to base 2 is used because the information is encoded in bits. Let D be a dataset, C_i the i-th class, and C_{i,D} the set of tuples of class C_i in D. Let |D| and |C_{i,D}| denote the number of tuples in D and C_{i,D}, respectively, and let m denote the number of classes. The entropy Info(D) is the average amount of information needed to identify the class label of a tuple in D:

    Info(D) = - Σ_{i=1..m} p_i · log2(p_i)

where p_i is the probability that an arbitrary tuple in D belongs to class C_i, estimated by |C_{i,D}| / |D|. The higher the entropy, the harder it is to predict the class simply by seeing the input, and the more unstable the dataset, since more bits are needed to represent it. The formula above gives the initial entropy of the dataset.

To build a decision tree, we need to pick the root and at the same time keep the height of the tree small; the tree should not overfit the data by growing to maximum depth. For that, the entropy of each attribute is calculated. Suppose attribute A partitions D into v subsets D_1, ..., D_v. Then:

    Info_A(D) = Σ_{j=1..v} (|D_j| / |D|) · Info(D_j)

where Info_A(D) is the entropy of attribute A, i.e. the additional information still required to make a classification after splitting on A, computed as the weighted sum of the entropies of the v partitions of attribute A.
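The two entropy quantities above can be computed directly from class labels and attribute partitions. A minimal sketch, assuming a simple dictionary-per-row dataset layout (the `outlook` attribute and its values are invented for illustration):

```python
import math
from collections import Counter

def info(labels):
    # Info(D) = -sum over the m classes of p_i * log2(p_i).
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_attr(rows, labels, attr):
    # Info_A(D): weighted sum of the entropies of the partitions D_j
    # induced by the distinct values of attribute `attr`.
    n = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    return sum(len(part) / n * info(part) for part in partitions.values())

labels = ["yes", "yes", "no", "no"]
rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rain"}, {"outlook": "rain"}]
print(info(labels))                        # -> 1.0 (two equally likely classes)
print(info_attr(rows, labels, "outlook"))  # -> 0.0 (attribute separates the classes perfectly)
```

In this toy example the attribute yields pure partitions, so the remaining entropy after splitting is zero, which is the best case for an attribute.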
The smaller the entropy of an attribute, the more information is gained, since the initial entropy decreases by that amount. The information gain of an attribute is therefore:

Gain(A) = Entropy(D) - Entropy_A(D)

The higher the information gain, the stronger the attribute, so the partition takes place on the attribute with the highest information gain. However, ID3 has some drawbacks, and C4.5 extends the idea of ID3 to remove them. In ID3, there is a bias towards attributes with a large number of partitions, even when they are no stronger than other attributes. C4.5 removes this bias by introducing the splitinfo of an attribute, which normalizes the information gain so that it is not biased towards such attributes. The normalization divides the information gain of an attribute by its splitinfo, and the resulting value is known as the gain ratio of the attribute:

GainRatio(A) = Gain(A) / SplitInfo(A)

where the splitinfo of an attribute A is given by:

SplitInfo(A) = - Σ_{j=1..v} (|D_j| / |D|) · log2(|D_j| / |D|)

The proposed modification is as follows: while calculating the Euclidean distance between the test vector and each instance in the training set, the difference along each attribute is divided by the gain ratio of that attribute.
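The gain ratio used by the modification can be computed directly from the C4.5 formulas above. This is a sketch with invented helper names; for the numeric attributes in the mixed datasets, values would first have to be discretized, a step omitted here:

```python
import math
from collections import Counter

def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """GainRatio(A) = Gain(A) / SplitInfo(A), as in C4.5."""
    n = len(labels)
    gain = _entropy(labels)                 # start from Entropy(D) ...
    split_info = 0.0
    for v in set(values):
        part = [lab for val, lab in zip(values, labels) if val == v]
        w = len(part) / n
        gain -= w * _entropy(part)          # ... subtract Entropy_A(D)
        split_info -= w * math.log2(w)      # accumulate SplitInfo(A)
    return gain / split_info if split_info else 0.0

labels = ["yes", "yes", "no", "no"]
print(gain_ratio(["s", "s", "r", "r"], labels))  # 1.0 (perfect split)
print(gain_ratio(["x", "x", "x", "x"], labels))  # 0.0 (uninformative)
```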
Say the dataset has 3 attributes, the test vector is (test1, test2, test3), and the training instance is (training1, training2, training3). The modified distance is then given by:

d(x, y) = ( Σ_{i=1..n} ( (x_i - y_i) / GainRatio(i) )^2 )^(1/2)

where i runs from 1 to the number of attributes (or dimensions) n, x is the test vector, y is the training instance, and GainRatio(i) is the gain ratio of the i-th attribute.

The intuition behind this method is the following. When the distance is calculated, the point with the least distance is selected as a neighbor (k times). Since the calculated distance is the sum of squares of the differences along individual attributes, reducing the effective contribution of an attribute makes a point more likely to be selected as a neighbor. By dividing, we take the strength of each attribute into consideration while calculating the distance. Even when the difference along an attribute is large, if that attribute is stronger than the others (that is, knowing its value lets us tell, with maximum certainty relative to the other attributes, which class will be assigned to the test example), then dividing the larger difference by a larger number (a stronger attribute has a larger gain ratio) reduces the effective distance, so the stronger attributes contribute more to the classification. This is not covered in the original KNN algorithm, where every attribute is given equal weight.

3. Analysis and Result

The proposed algorithm was implemented in MATLAB and tested on 16 datasets (Table 1). A rule of thumb in machine learning is to pick k near the square root of the size of the training set.

Table 1: Information about datasets used for evaluation.
SNo.  Dataset name                     No. of rows  No. of features  Type
1     Bal. Scale Weight & Distance     625          4                Categorical
2     Car Evaluation                   1728         6                Categorical
3     Contraceptive Method Choice      1473         8                Categorical
4     Tic-Tac-Toe Endgame              958          9                Categorical
5     Congressional Voting Records     435          16               Categorical
6     Australian Credit Approval       680          14               Mixed
7     Thyroid Disease                  1225         21               Mixed
8     German Credit                    1770         24               Mixed
9     Teaching Assistant Evaluation    151          5                Mixed
10    Heart Disease                    256          13               Mixed
11    Poker Hand                       1240         10               Mixed
12    Sleep                            1600         13               Mixed
13    Mushroom                         2500         21               Mixed
14    Credit Card                      500          15               Mixed
15    Shuttle                          3600         9                Mixed
16    Breast Cancer Wisconsin          730          10               Mixed

In practice, choosing k near the square root of the training-set size does a good job of telling signal from noise, so the value of K for each dataset was taken as the square root of its number of instances. The algorithm was tested on categorical and mixed (categorical + numerical) datasets. Each accuracy reported in Table 2 is an average over 5 trials; accuracies (in %) of the modified KNN (Mod. KNN) and the conventional KNN are given for all 16 datasets.

Table 2: Accuracies of datasets for conventional KNN and the modified version of KNN.

DATASET                          K    Mod. KNN (%)   KNN (%)
Bal. Scale Weight and Distance   25   77.68          61.92
Car Evaluation                   42   69.74          69.74
Contraceptive Method Choice      39   92.77          92.77
Tic-Tac-Toe Endgame              31   75.94          64.72
Congressional Voting Records     20   59.9           56.6
Australian Credit Approval       26   61             57.1
Thyroid Disease                  35   94.1           94.1
German Credit                    42   68.7           68.4
Teaching Assistant Evaluation    12   45.6           39.7
Heart Disease                    16   58.3           51.4
Poker Hand                       35   52.3           47.7
Sleep                            40   50             49.5
Mushroom                         50   89             89
Credit Card                      22   70.4           68.9
Shuttle                          60   79.4           79.4
Breast Cancer Wisconsin          27   79.1           74.8

Given below is an illustration of the accuracies of the datasets mentioned above.

Figure 1: Accuracies of 16 datasets for KNN and the modified version of KNN.
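As a worked sketch of the evaluated classifier — not the authors' MATLAB code; the toy data and per-attribute gain-ratio values here are invented for illustration — the proposed gain-ratio-weighted distance together with the k = sqrt(n) heuristic might look like:

```python
import numpy as np
from collections import Counter

def modified_knn_predict(X_train, y_train, x_test, gain_ratios, k):
    """Proposed rule: the difference along each attribute is divided by
    that attribute's gain ratio before the Euclidean distance is formed."""
    diffs = (np.asarray(X_train, float) - np.asarray(x_test, float)) / gain_ratios
    dists = np.sqrt((diffs ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    # Majority vote among the k nearest (re-weighted) neighbours
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Toy data: 9 instances in two well-separated classes
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2],
     [5.0, 5.0], [5.1, 5.2], [4.9, 5.1], [5.2, 4.8]]
y = ["a", "a", "a", "a", "a", "b", "b", "b", "b"]
gr = np.array([1.0, 0.5])           # hypothetical gain ratios per attribute
k = int(round(np.sqrt(len(X))))     # k = sqrt(training-set size) -> 3
print(modified_knn_predict(X, y, [0.1, 0.1], gr, k))  # -> a
```

Setting every gain ratio to 1.0 recovers the conventional KNN distance, which is one way to compare the two methods on the same data.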
4. Conclusion:

Many of the previously discussed modifications to KNN are fairly complex and yield only slight improvements in classification accuracy. The modification made in this paper is intuitive and easy to comprehend, and it improves on the results of the conventional KNN, slightly for some datasets and markedly for others. Of the 16 datasets evaluated, 8 showed an increase of more than 5% in classification accuracy and 4 showed an average increase of around 3%. In the worst case, the accuracy of the proposed algorithm on a dataset equals that of the conventional KNN. From the above analysis, we conclude that the accuracy of the modified version of KNN is equal to or greater than that of the original K-nearest neighbors algorithm.

5. References:

[1] Ertugrul, Ö. F., & Tagluk, M. E. (2017). A novel version of k nearest neighbor: Dependent nearest neighbor. Applied Soft Computing, 55, 480-490.
[2] Singh, A., Deep, K., & Grover, P. (2017). A novel approach to accelerate calibration process of a k-nearest neighbor classifier using GPU. Journal of Parallel and Distributed Computing, 104, 114-129.
[3] Parvin, H., Alizadeh, H., & Minati, B. (2010). A modification on k-nearest neighbor classifier. Global Journal of Computer Science and Technology.
[4] Keller, J. M., Gray, M. R., & Givens, J. A. (1985). A fuzzy k-nearest neighbor algorithm. IEEE Transactions on Systems, Man, and Cybernetics, (4), 580-585.
[5] Faziludeen, S., & Sankaran, P. (2016). ECG beat classification using evidential K-nearest neighbours. Procedia Computer Science, 89, 499-505.
[6] Song, Y., Liang, J., Lu, J., & Zhao, X. (2017).
An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing, 251, 26-34.
[7] Bian, H., & Shen, W. (2002). Fuzzy-rough nearest-neighbor classification method: An integrated framework. In Proc. IASTED International Conference on Applied Informatics, Austria, pp. 160-164.
[8] Nguyen, B., Morell, C., & De Baets, B. (2016). Large-scale distance metric learning for k-nearest neighbors regression. Neurocomputing, 214, 805-814.
[9] Lin, Y., Li, J., Lin, M., & Chen, J. (2014). A new nearest neighbor classifier via fusing neighborhood information. Neurocomputing, 143, 164-169.
[10] Manocha, S., & Girolami, M. A. (2007). An empirical analysis of the probabilistic k-nearest neighbour classifier. Pattern Recognition Letters, 28(13), 1818-1824.
[11] Sarkar, M. (2007). Fuzzy-rough nearest neighbor algorithms in classification. Fuzzy Sets and Systems, 158(19), 2134-2152.
[12] Timofte, R., & Van Gool, L. (2015). Iterative nearest neighbors. Pattern Recognition, 48(1), 60-72.
[13] Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311-4322.
[14] Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323-2326.
[15] Liu, Z. G., Pan, Q., & Dezert, J. (2013). A new belief-based K-nearest neighbor classification method. Pattern Recognition, 46(3), 834-844.
[16] Zhang, M. L., & Zhou, Z. H. (2007). ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7), 2038-2048.
[17] Sierra, B., Lazkano, E., Irigoien, I., Jauregi, E., & Mendialdua, I. (2011). K nearest neighbor equality: Giving equal chance to all existing classes. Information Sciences, 181(23), 5158-5168.
[18] Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.