K Means Clustering of Web Pages based on Tags and Words
Published in: Education


Transcript

  • 1. CLUSTERING USING K-MEANS IMPLEMENTATION
  • 2. OVERVIEW: Introduction, Literature Survey, Implementation Details, Futuristic Scope
  • 3. INTRODUCTION: A cluster is a group of similar data objects. Clustering is a method by which a large data set is grouped into smaller sets (clusters) of similar data. Types of clustering include: hierarchical clustering, partitional clustering, density-based clustering, and distance-based clustering.
  • 4. One of the algorithms for clustering is the K-means algorithm. As the name suggests, we divide the data set into K clusters, where K is a positive integer. First, we choose an initial centroid for each cluster. Each data point is then assigned to its nearest centroid, and each centroid is recomputed as the mean of the points assigned to it. This process continues iteratively until the data is divided into K stable clusters.
  • 5. LITERATURE SURVEY. Clustering: Let us consider an example: grouping balls of three different colours into three different groups.
  • 6. Balls of the same colour are clustered into one group. Types of clustering: hard clustering and soft clustering.
  • 7. Clustering Algorithms: A clustering algorithm attempts to find natural groups of components (or data) based on some similarity. The algorithm finds the centroid of each group of data points. Most algorithms evaluate the distance between a point and the cluster centroids.
  • 8. K-Means Algorithm: It is a distance-based, partitional clustering algorithm. "K" is the number of clusters and is supplied by the user as input. It is an unsupervised algorithm. Each cluster is associated with a centroid. Each point is assigned to the cluster with the closest centroid. The algorithm is iterative in nature.
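The two operations named on this slide, finding a centroid and measuring the distance from a point to each centroid, can be sketched as follows. This is a minimal illustration, assuming 2-D points and Euclidean distance; the sample coordinates and the names `centroid`, `group_a`, `group_b`, and `point` are made up for the example and do not come from the slides.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

# Hypothetical sample data: two small groups of 2-D points.
group_a = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
group_b = [(8.0, 8.0), (9.0, 9.0)]

centroids = [centroid(group_a), centroid(group_b)]

# Distance from one point to every centroid (Euclidean, via math.dist).
point = (3.0, 3.0)
distances = [math.dist(point, c) for c in centroids]

# The point belongs with the group whose centroid is closest.
nearest = distances.index(min(distances))
print(centroids, distances, nearest)
```

Euclidean distance is only one possible choice; the slides do not say which distance measure the authors used.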
  • 9. 1) Select K points as the initial centroids.
       2) Repeat
       3) Form K clusters by assigning all points to the closest centroid.
       4) Recompute the centroid of each cluster.
       5) Until the centroids don’t change.
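The five steps above can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the use of Euclidean distance, and the rule that an empty cluster keeps its old centroid are all assumptions added here.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster tuples of floats into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: select K points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):  # Step 2: repeat
        # Step 3: assign every point to its closest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: recompute each centroid as the mean of its points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 5: stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical data: two visibly separated groups of 2-D points.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
        (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2)
print(centroids, labels)
```

Which cluster receives index 0 depends on the random initial choice, so only the grouping itself is meaningful, not the label numbers.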
  • 10. K-means example, step 1: Pick k=3 initial cluster centers (randomly). [Plot: X and Y axes; centers k1, k2, k3]
  • 11. K-means example, step 2: Assign each point to the closest cluster center. [Plot: X and Y axes; centers k1, k2, k3]
  • 12. K-means example, step 3: Move each cluster center to the mean of each cluster. [Plot: X and Y axes; old and new positions of k1, k2, k3]
  • 13. K-means example, step 4: Reassign points closest to a different new cluster center. Q: Which points are reassigned? [Plot: X and Y axes; centers k1, k2, k3]
  • 14. K-means example, step 4 (continued): A: three points (with animation). [Plot: X and Y axes; centers k1, k2, k3]
  • 15. K-means example, step 5: Re-compute cluster means. [Plot: X and Y axes; centers k1, k2, k3]
  • 16. K-means example, step 6: Move cluster centers to cluster means. [Plot: X and Y axes; centers k1, k2, k3]
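The question asked in step 4, which points are reassigned after the centers move, can be answered in code by comparing the labels before and after each move. The sketch below is only an illustration: it uses k=2 rather than the slides' k=3 to keep the trace short, and the data points, the fixed starting centers, and the helper names `assign` and `move` are hypothetical.

```python
import math

def assign(points, centers):
    """Index of the closest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def move(points, labels, centers):
    """Move each center to the mean of its points (unchanged if it has none)."""
    moved = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        moved.append(tuple(sum(c) / len(members) for c in zip(*members))
                     if members else old)
    return moved

# Hypothetical data and starting centers (fixed here instead of random).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (5.0, 5.0),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers = [(1.0, 1.0), (1.5, 2.0)]

labels = assign(points, centers)   # first assignment to the initial centers
history = []                       # indices of points reassigned per round
while True:
    centers = move(points, labels, centers)
    new_labels = assign(points, centers)
    changed = [i for i, (a, b) in enumerate(zip(labels, new_labels)) if a != b]
    history.append(changed)
    labels = new_labels
    if not changed:                # no point switched cluster: converged
        break

print(history)  # [[1], []]
print(labels)   # [0, 0, 0, 1, 1, 1, 1]
```

With these inputs only the point at index 1 switches cluster after the first move, and the second round changes nothing, so the loop stops.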
  • 17. Advantages: Simple and understandable. Items are automatically assigned to clusters. Disadvantages: The number of clusters, K, must be determined beforehand. We cannot tell which attribute contributes more to the grouping, since every attribute is assumed to have the same weight. Sensitive to outliers.
  • 18. Applications of K-means: Unsupervised learning of neural networks. Pattern recognition. Classification analysis. Artificial intelligence. Image processing. Machine vision. Email filtering. Web page classification.
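The sensitivity to outliers follows from the centroid being a plain mean: a single extreme point can pull it far from the rest of the cluster. A small sketch, with made-up coordinates and a hypothetical helper `mean_point`:

```python
def mean_point(points):
    """Coordinate-wise mean, i.e. the centroid K-means would compute."""
    return tuple(sum(c) / len(points) for c in zip(*points))

# Hypothetical tight cluster plus one made-up extreme value.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
outlier = (50.0, 50.0)

clean_centroid = mean_point(cluster)
skewed_centroid = mean_point(cluster + [outlier])

print(clean_centroid)   # (1.5, 1.5)
print(skewed_centroid)  # (11.2, 11.2)
```

One outlier among five points moves the centroid from inside the cluster to a location where no data lies, which in turn can change which points are assigned to it.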
  • 19. IMPLEMENTATION DETAILS