Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- K means Clustering by Edureka! 13574 views
- K mean-clustering algorithm by parry prabhu 6409 views
- K Means Algorithm by Amanda Adityaningrum 6278 views
- K-Means Algorithm Example by Kasun Ranga Wijew... 14228 views
- Clustering, k means algorithm by Junyoung Park 722 views
- Kmeans clustering by www.softscients.w... 2183 views

27,964 views

Published on

No Downloads

Total views

27,964

On SlideShare

0

From Embeds

0

Number of Embeds

11

Shares

0

Downloads

1,906

Comments

4

Likes

20

No embeds

No notes for slide

- 1. K-means clustering algorithm Kasun Ranga Wijeweera (krw19870829@gmail.com)
- 2. What is Clustering?• Organizing data into classes such that there is • high intra-class similarity • low inter-class similarity• Finding the class labels and the number of classesdirectly from the data (in contrast to classification).• More informally, finding natural groupings amongobjects.
- 3. What is a natural grouping among these objects?
- 4. What is a natural grouping among these objects? Clustering is subjectiveSimpsons Family School Employees Females Males
- 5. Defining Distance MeasuresDefinition: Let O1 and O2 be two objects from the universeof possible objects. The distance (dissimilarity) betweenO1 and O2 is a real number denoted by D(O1,O2) Kasun Kiosn 0.23 3 342.7
- 6. Consider a Set of Data Points,And a Set of Clusters,
- 7. The Goal,
- 8. Algorithm k-means1. Randomly choose K data items from X as initialcentroids.2. Repeat Assign each data point to the cluster which has the closest centroid. Calculate new cluster centroids. Until the convergence criteria is met.
- 9. The data points
- 10. Initialization
- 11. #Runs = 1
- 12. #Runs = 2
- 13. #Runs = 3
- 14. K-means gets stuck in a local optima
- 15. The data points
- 16. Initialization
- 17. #Runs = 1
- 18. #Runs = 2
- 19. #Runs = 3
- 20. #Runs = 4
- 21. Applications of K-means Method • Optical Character Recognition • Biometrics • Diagnostic Systems • Military Applications
- 22. Comments on the K-Means Method• Strength – Relatively efficient: O(tkn), where n is # objects, k is # clusters, and t is # iterations. Normally, k, t << n. – Often terminates at a local optimum. The global optimum may be found using techniques such as: deterministic annealing and genetic algorithms• Weakness – Applicable only when mean is defined, then what about categorical data? – Need to specify k, the number of clusters, in advance – Unable to handle noisy data and outliers – Not suitable to discover clusters with non-convex shapes
- 23. Any Questions ?
- 24. Thanks for your attention !

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

kya tha ye