Data clustering and clustering techniques, with a focus on k-means algorithms.


- 1. Clustering, K-means variants: clustering techniques and applications
  - Jagdeep Matharu, Brock University, March 18th 2013
  - Running footer: Jagdeep Matharu (Brock University), Clustering - k-means, March 18th 2013
- 2. Clustering Algorithms: Clustering
  1. Grouping together data objects that are similar in some way, according to user-defined criteria.
  2. Cluster: a collection of data objects that are similar to each other.
  3. A form of unsupervised learning.
  4. Data exploration: looking for new patterns or structures in the data.
  5. An optimization problem.
- 3. Clustering Algorithms: Clustering Task
  1. Pattern representation.
  2. Pattern proximity measure: the most important step, since it defines how similar or dissimilar two objects are.
  3. Grouping.
- 4. Clustering Algorithms: Clustering Techniques
  1. Hierarchical algorithms: create a hierarchical decomposition of the data set.
     - Agglomerative: bottom-up approach.
     - Divisive: top-down approach.
  2. Partition algorithms: create a partition and then evaluate it by some criterion, e.g. k-means, k-medoids.
  - Figure 1: Examples of segmentation based on colour or intensity.
- 5. Clustering Algorithms: Hierarchical Clustering Algorithms
  1. Sequential clustering algorithm.
  2. Algorithm:
     - Assign every data point to a separate cluster.
     - Keep merging the most similar pair of data points/clusters until only one cluster remains.
     - After each merge, compute the distances between the new cluster and the old clusters.
  3. Use a distance matrix as the clustering criterion.
  4. Construct nested partitions layer by layer into a tree-like structure.
  5. The resulting tree can be cut to obtain the desired number of clusters.
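
The merge loop on slide 5 can be sketched in a few lines of Python. This is only an illustration, not the presenter's implementation: the slides do not say how the distance between two clusters is measured, so single linkage (distance between the closest members) is assumed here, and the function names and sample points are made up for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(a, b):
    """Cluster-to-cluster distance: closest pair of members (an assumed choice)."""
    return min(dist(p, q) for p in a for q in b)


def agglomerative(points, num_clusters=1):
    """Merge the two closest clusters until `num_clusters` remain.

    Returns the final clusters and the merge history (left, right, distance).
    """
    clusters = [[tuple(p)] for p in points]  # every point starts in its own cluster
    merges = []
    while len(clusters) > num_clusters:
        # Find the most similar pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j], single_linkage(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        # Drop the two old clusters and append the merged one.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters, merges


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, merges = agglomerative(points, num_clusters=2)
```

Stopping at `num_clusters=2` corresponds to cutting the tree at the level that leaves two clusters; the recorded merge distances are what a dendrogram would plot as bar heights.
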
- 6. Hierarchical Clustering Algorithms (cont'd)
  1. The result is a binary tree, or dendrogram.
  2. The height of the bars shows how close two objects are.
- 7-31. Hierarchical Clustering Algorithms: Example (figure-only slides; no text survives in this transcript).
- 32. Hierarchical Clustering Algorithms: Strengths and Weaknesses
  1. Pros:
     - No need to assume the number of clusters in advance.
     - Easy to implement.
  2. Cons:
     - Time and space complexity of $O(n^2)$ for computing the proximity matrix.
     - No objective function is directly minimized.
     - Merging decisions are final and cannot be undone.
- 33. Partition Clustering Algorithms
  1. Overview:
     - Construct a partition of a data set D of n objects into a set of k clusters.
     - The value of k is specified by the user; different values of k result in different cluster output.
     - Find the partition into k clusters that optimizes the chosen partition criterion (error function), e.g. the error sum of squares (SSE).
  2. Combinatorial search can be computationally expensive.
- 34. Partition Clustering Algorithms
  1. k-medoids: uses a medoid (an actual data point) to represent the cluster.
  2. k-means: uses a centroid to represent the cluster.
  3. Variations:
     - Bisecting k-means
     - ISODATA
- 35. Partition Clustering Algorithms: k-means algorithm
  1. Choose k initial centroids (center points).
  2. Each cluster is associated with a centroid.
  3. Each data object is assigned to the closest centroid.
  4. The centroid of each cluster is then updated based on the data objects assigned to the cluster.
  5. Repeat the assignment and update steps until convergence.
  - Figure 2: Algorithm.
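
The five steps on slide 35 map fairly directly onto code. The following is a minimal sketch under stated assumptions: Euclidean distance, initial centroids sampled from the data, an empty cluster keeps its previous centroid, and convergence means the centroids stop changing. The names `kmeans`, `mean` and the sample data are illustrative, not taken from the slides.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step 1: choose k initial centroids
    for _ in range(max_iter):
        # Step 3: assign each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recompute each centroid; an empty cluster keeps its old one (assumption).
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        # Step 5: stop once the centroids no longer move.
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical sample data: two well-separated groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(data, k=2)
```

On data this well separated the two groups are recovered whichever points are drawn as initial centroids; on harder data the result depends on the initial choice.
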
- 36-41. Partition Clustering Algorithms: K-means Example (figure-only slides; no text survives in this transcript).
- 42. Partition Clustering Algorithms: K-means, open questions
  1. What is the size of k?
  2. How to choose the initial centroids?
  3. How to assign points to the closest centroid?
  4. How to evaluate the clusters?
  5. Other issues.
- 43. Partition Clustering Algorithms: Choosing the value of k
  1. k represents the number of clusters required in a partition.
  2. It must be specified beforehand.
  3. There is no rule of thumb for choosing k; it is a matter of trial and error.
  4. Different values of k may lead to different results.
- 44. Partition Clustering Algorithms: Choosing the initial centroids
  1. A key step of the k-means method.
  2. Different initial centroids can produce different results.
- 45. Partition Clustering Algorithms: Example, optimal initial centroids (figure-only slide).
- 46. Partition Clustering Algorithms: Example, sub-optimal initial centroids (figure-only slide).
- 47. Partition Clustering Algorithms: Choosing the initial centroids
  1. Choose the initial centroids randomly. This can lead to poor clustering.
  2. Perform multiple runs, each with randomly chosen initial centroids, and select the set of clusters that gives the best solution.
  3. Take a sample of points and cluster them using a hierarchical clustering technique. k clusters are extracted from the hierarchy, and the centroids of those clusters are used as initial centroids.
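
Option 2 on slide 47 (multiple runs, keep the best) could look like the sketch below. It assumes that "best" means the lowest sum of squared errors and that distances are Euclidean; `kmeans_once`, `best_of_restarts` and the sample data are hypothetical names chosen for this example.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from random initial centroids; returns (sse, centroids, clusters)."""
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centroids[i]))].append(p)
        # Recompute centroids; an empty cluster keeps its previous centroid (assumption).
        updated = [
            tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else centroids[i]
            for i, pts in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(dist(p, centroids[i]) ** 2 for i, pts in enumerate(clusters) for p in pts)
    return sse, centroids, clusters


def best_of_restarts(points, k, restarts=10, seed=0):
    """Run k-means several times and keep the run with the smallest SSE."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[0]), runs


# Hypothetical sample data: three small groups.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (20.0, 0.0), (20.0, 1.0)]
best, runs = best_of_restarts(data, k=3)
best_sse, best_centroids, best_clusters = best
```

Comparing runs by SSE only makes sense for a fixed k, since a larger k lowers the SSE on its own.
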
- 48. Partition Clustering Algorithms: Assigning points to centroids
  1. The goal is to find the closest centroid for each data point.
  2. Assign each data point to its closest centroid.
  3. This requires a proximity measure to calculate distances, e.g. Euclidean distance or Manhattan distance.
  4. A point is assigned to the centroid with the smallest distance.
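
The two proximity measures named on slide 48 are short enough to write out, and doing so shows that the choice of measure can change which centroid is "closest". This is an illustrative sketch; the function names and the sample point and centroids are assumptions made for the example.

```python
def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_centroid(point, centroids, distance=euclidean):
    """Index of the centroid with the smallest distance to `point`."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


# Hypothetical example where the two measures disagree.
point = (0.0, 0.0)
centroids = [(2.5, 2.5), (4.0, 0.0)]
by_euclidean = nearest_centroid(point, centroids, euclidean)  # distances ~3.54 vs 4.0
by_manhattan = nearest_centroid(point, centroids, manhattan)  # distances 5.0 vs 4.0
```
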
- 49. Partition Clustering Algorithms: Cluster Evaluation
  1. The most common measure is the sum of squared errors (SSE).
  2. The goal is to reduce the error.
  3. The error of a data point is its distance to the nearest cluster.
  4. Mathematically: $\mathrm{SSE} = \sum_{i=1}^{K} \sum_{x \in C_i} \mathrm{dist}^2(m_i, x)$
  5. Here dist is the distance from a data point to its cluster, $x$ is a data point in cluster $C_i$, and $m_i$ is the representative point of cluster $C_i$.
  6. Given two clusterings, we choose the one with the smallest error.
  7. One way to reduce SSE is to increase k.
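
As a worked instance of the SSE formula on slide 49, the sketch below evaluates two candidate clusterings of the same four points. It assumes squared Euclidean distance and takes the representative point $m_i$ to be the cluster mean; the function names and data are made up for illustration.

```python
def centroid(points):
    """Coordinate-wise mean, used here as the representative point m_i."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def sse(clusters):
    """Sum over clusters of the squared distances from each point to its centroid."""
    total = 0.0
    for points in clusters:
        m = centroid(points)
        total += sum(sum((a - b) ** 2 for a, b in zip(m, x)) for x in points)
    return total


# Two hypothetical clusterings of the same four points.
good = [[(1.0, 1.0), (1.0, 3.0)], [(8.0, 8.0), (10.0, 8.0)]]
bad = [[(1.0, 1.0), (8.0, 8.0)], [(1.0, 3.0), (10.0, 8.0)]]
# Item 7 taken to the extreme: one cluster per point.
singletons = [[p] for cluster in good for p in cluster]

sse_good = sse(good)              # 4.0
sse_bad = sse(bad)                # 102.0
sse_singletons = sse(singletons)  # 0.0
```

The singleton case shows why SSE alone cannot be used to pick k: it reaches zero when every point is its own cluster.
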
- 50. Partition Clustering Algorithms: k-means, pros and cons
  1. Pros:
     - Easy to implement.
     - Guaranteed to converge, with most of the convergence happening in the first few iterations.
     - Linear complexity, $O(n)$ in the number of data points (per iteration, for fixed k).
  2. Cons:
     - k must be specified in advance.
     - Sensitive to outliers.
     - May yield empty clusters.
- 51. Partition Clustering Algorithms: Bisecting k-means
  1. A variation of the basic k-means method.
  2. Can produce a partitional or a hierarchical clustering.
  3. To obtain K clusters, first split the set of all points into two clusters.
  4. Choose one of the clusters to split again:
     - the largest cluster,
     - the one with the highest SSE, or
     - a choice based on both.
  5. Continue until K clusters have been produced.
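
A rough sketch of the bisecting procedure on slide 51 follows. It assumes the cluster with the highest SSE is the one split at each step (one of the three options listed), uses a plain 2-means split with randomly sampled initial centroids, and assumes Euclidean distance. All names and the sample data are illustrative.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def sse(points):
    """Squared error of a single cluster around its centroid."""
    m = centroid(points)
    return sum(dist(m, x) ** 2 for x in points)


def two_means(points, rng, max_iter=100):
    """Split one cluster in two with basic k-means (k = 2)."""
    centroids = rng.sample(points, 2)
    for _ in range(max_iter):
        halves = ([], [])
        for p in points:
            side = 0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            halves[side].append(p)
        if not halves[0] or not halves[1]:  # degenerate split, e.g. duplicate points
            break
        updated = [centroid(halves[0]), centroid(halves[1])]
        if updated == centroids:
            break
        centroids = updated
    return halves


def bisecting_kmeans(points, k, seed=0):
    rng = random.Random(seed)
    clusters = [list(points)]  # start with everything in one cluster
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break
        target = max(candidates, key=sse)  # assumed rule: split the highest-SSE cluster
        left, right = two_means(target, rng)
        if not left or not right:
            break  # could not split further; fewer than k clusters are returned
        clusters = [c for c in clusters if c is not target] + [left, right]
    return clusters


# Hypothetical sample data: three separated groups of three points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
        (20.0, 0.0), (20.0, 1.0), (21.0, 0.0)]
clusters = bisecting_kmeans(data, k=3)
```

Recording the order of the splits instead of only the final clusters would give the hierarchical view mentioned in item 2.
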
- 52. Partition Clustering Algorithms: ISODATA
  1. Iterative Self-Organizing Data Analysis Technique A.
  2. The number of clusters does not need to be known in advance.
  3. Cluster centers are placed randomly and points are assigned to the closest centroid.
  4. The standard deviation within each cluster and the distance between cluster centers are calculated.
     - Clusters are split if the standard deviation is greater than the user-defined threshold.
     - Clusters are merged if the distance between them is less than the user-defined threshold.
- 53. Partition Clustering Algorithms: Practical example of k-means
  1. Image segmentation using k-means clustering.
  - Figure 3: Examples of segmentation based on colour or intensity.
