
Improved k-means


  1. Improving the accuracy of the K-means clustering algorithm. Kasun Ranga Wijeweera (krw19870829@gmail.com)
  2. This presentation is based on the following research paper: K. A. Abdul Nazeer, M. P. Sebastian, "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm", Proceedings of the World Congress on Engineering 2009, Vol. I, WCE 2009, July 1-3, 2009, London, U.K.
  3. Consider a set of data points, and a set of clusters,
  4. The goal,
  5. Algorithm: k-means
     1. Randomly choose K data items from X as initial centroids.
     2. Repeat
        • Assign each data point to the cluster which has the closest centroid.
        • Calculate new cluster centroids.
        Until the convergence criterion is met.
  6. K-means can get stuck in a local optimum.
  7. Algorithm: selection of initial centroids
     1. Set m = 1;
     2. Compute the distance between each data point and all other data points in the set X;
     3. Find the closest pair of data points in X and form a data point set A[m] (1 <= m <= K) which contains these two data points. Delete these two data points from X;
     4. Find the data point in X that is closest to the data point set A[m]. Add it to A[m] and delete it from X;
     5. Repeat step 4 until the number of data points in A[m] reaches 0.75*(n/K);
  8. Algorithm: selection of initial centroids (continued)
     6. If m < K, then set m = m + 1, find another pair of data points in X between which the distance is the shortest, form another data point set A[m] from them, and delete them from X. Go to step 4;
     7. For each data point set A[m] (1 <= m <= K), find the arithmetic mean of the vectors of the data points in A[m]. These means will be the initial centroids.
  9. Any questions?
  10. Thanks for your attention!
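
The two procedures on slides 5, 7 and 8 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the paper's reference implementation. The function names (`select_initial_centroids`, `kmeans`, `distance`, `mean`) and the `fraction` and `max_iter` parameters are hypothetical. The sketch assumes Euclidean distance, takes n to be the number of points in X, reads "closest to the data point set" as the smallest distance to any member of A[m], and treats "convergence" as the centroids no longer changing. The slides do not specify any of these.

```python
import math

def distance(p, q):
    """Euclidean distance between two equal-length tuples (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean(points):
    """Arithmetic mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def select_initial_centroids(X, k, fraction=0.75):
    """Sketch of slides 7 and 8: grow k subsets A[m], return their means."""
    remaining = list(X)
    target = fraction * len(X) / k  # step 5 threshold: 0.75 * (n / K)
    subsets = []
    for _ in range(k):
        if len(remaining) < 2:
            raise ValueError("not enough points left to form another pair")
        # Steps 3 and 6: closest pair among the points still in X.
        pairs = ((i, j) for i in range(len(remaining))
                 for j in range(i + 1, len(remaining)))
        i, j = min(pairs, key=lambda ij: distance(remaining[ij[0]],
                                                  remaining[ij[1]]))
        subset = [remaining[i], remaining[j]]
        del remaining[j]  # j > i, so delete j first to keep index i valid
        del remaining[i]
        # Steps 4 and 5: keep adding the point nearest to the subset.
        # Assumption: "nearest to the set" = smallest distance to any member.
        while len(subset) < target and remaining:
            nearest = min(range(len(remaining)),
                          key=lambda r: min(distance(remaining[r], a)
                                            for a in subset))
            subset.append(remaining.pop(nearest))
        subsets.append(subset)
    # Step 7: the mean of each subset is an initial centroid.
    return [mean(s) for s in subsets]

def kmeans(X, centroids, max_iter=100):
    """Sketch of slide 5, started from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        # Assign each point to the cluster with the closest centroid.
        clusters = [[] for _ in centroids]
        for p in X:
            c = min(range(len(centroids)),
                    key=lambda idx: distance(p, centroids[idx]))
            clusters[c].append(p)
        # Recompute centroids; an empty cluster keeps its old centroid.
        new_centroids = [mean(pts) if pts else centroids[idx]
                         for idx, pts in enumerate(clusters)]
        if new_centroids == centroids:  # assumed convergence criterion
            break
        centroids = new_centroids
    return centroids, clusters

if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11)]
    init = select_initial_centroids(X, k=2)
    final, _ = kmeans(X, init)
    print(final)  # [(0.5, 0.5), (10.5, 10.5)]
```

Because the selection step involves no random choice, repeated runs on the same data start from the same centroids, unlike step 1 of the basic algorithm on slide 5. The closest-pair search here is a plain scan over all pairs, which is fine for a small illustration but slow on large data sets.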
