Quick Look At Clustering

Published in: Technology

  1. CLUSTERING
  2. Introduction
     • Centroid: the center of a cluster. It can be either a real data point or an imaginary one.
     • Objective function: measures the quality of a clustering (a small value is desirable). It is calculated by summing the squared distances of each point from the centroid of its cluster.
     • Two types of clustering are covered here:
         • k-means clustering
         • Hierarchical clustering
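The objective function described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the helper names `centroid` and `objective` and the sample points are assumptions, and the centroid is taken to be the coordinate-wise mean of the cluster.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points (hypothetical helper)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def objective(clusters):
    """Sum of squared distances of every point from the centroid of its cluster."""
    total = 0.0
    for points in clusters:
        c = centroid(points)
        total += sum(dist(p, c) ** 2 for p in points)
    return total


# Two ways of splitting the same four points into two clusters.
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]

print(objective(tight))  # smaller value: points sit close to their centroids
print(objective(loose))  # larger value: points sit far from their centroids
```

The grouping with the smaller value is the better clustering under this measure.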
  3. k-means Clustering
     It is an exclusive clustering algorithm: each object belongs to exactly one cluster.
     Algorithm:
       1. Select a value for k.
       2. Select k objects arbitrarily and use them as the initial set of k centroids.
       3. Assign each object to the cluster whose centroid is nearest to it.
       4. Recalculate the centroids.
       5. Repeat steps 3 and 4 until the centroids no longer move.
     It may not find the best set of clusters, but it will always terminate.
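The k-means steps above could be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `k_means`, the `seed` and `max_iter` parameters, the use of Euclidean distance, and the handling of an empty cluster (its centroid is simply kept) are choices made here, not details given in the slides.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, clusters) for a list of coordinate tuples."""
    rng = random.Random(seed)
    # Step 2: pick k objects arbitrarily as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(max_iter):
        # Step 3: assign each object to the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recalculate each centroid as the mean of its members.
        # Assumption: an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
for c, members in zip(centroids, clusters):
    print(c, members)
```

Because the starting centroids are arbitrary, different seeds can produce different clusterings on less clearly separated data, which is why the result is not guaranteed to be the best one.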
  4. Agglomerative Hierarchical Clustering
     Algorithm:
       1. Assign each object to its own single-object cluster. Calculate the distance between each pair of clusters (the distance matrix).
       2. Select and merge the closest pair of clusters.
       3. Calculate the distance between this new cluster and the other clusters.
       4. Repeat steps 2 and 3 until all objects are in a single cluster.
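A small sketch of the agglomerative steps above, for illustration only. The names `agglomerate` and `single_link` are hypothetical, the cluster distance is assumed to be the shortest point-to-point Euclidean distance, and instead of maintaining a distance matrix the sketch recomputes all pairwise cluster distances on each pass, which is simpler but slower.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_link(a, b):
    """Shortest distance between any member of a and any member of b."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, cluster_distance=single_link):
    """Merge clusters pairwise until one remains; return the merge history."""
    # Step 1: every object starts in its own single-object cluster.
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        # Step 2: find the closest pair of clusters.
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]))
        d = cluster_distance(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Step 3: distances to the new cluster are recomputed on the next pass.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return merges


# One-dimensional points written as 1-tuples.
merges = agglomerate([(0,), (1,), (5,), (6,), (20,)])
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The returned merge history records which clusters were joined and at what distance, which is the information a dendrogram displays.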
  5. Example (figures not reproduced): the data before clustering and after two passes
  6. Example
     • Hierarchical clustering gives the entire hierarchy of clusters.
     • The end result of hierarchical clustering is a dendrogram (a binary tree).
  7. Distance Measure
     Three ways of calculating the distance between two clusters:
     • Single-link clustering: the shortest distance from any member of one cluster to any member of the other cluster
     • Complete-link clustering: the longest distance from any member of one cluster to any member of the other cluster
     • Average-link clustering: the average distance over all pairs made up of one member from each cluster
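The three distance measures above can be written as short Python functions. This is a sketch assuming Euclidean distance between individual points; the function names and the sample clusters `a` and `b` are illustrative and do not come from the slides.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_link(a, b):
    """Shortest distance between a member of a and a member of b."""
    return min(dist(p, q) for p in a for q in b)


def complete_link(a, b):
    """Longest distance between a member of a and a member of b."""
    return max(dist(p, q) for p in a for q in b)


def average_link(a, b):
    """Average distance over all pairs with one member from each cluster."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


a = [(0, 0), (0, 1)]
b = [(3, 0), (4, 0)]

print("single:  ", single_link(a, b))
print("complete:", complete_link(a, b))
print("average: ", average_link(a, b))
```

For any two clusters the average-link value lies between the single-link and complete-link values, so the choice of measure affects which pair counts as closest at each merge.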
  8. Visit more self-help tutorials
     • Pick a tutorial of your choice and browse through it at your own pace.
     • The tutorials section is free and self-guiding, and does not involve any additional support.
     • Visit us at www.dataminingtools.net
