K-means Clustering
Dr. P. Kuppusamy
Prof / CSE
Clustering Techniques
• Partitioning methods: k-Means [1957, 1967]; k-Medoids (PAM [1990], CLARA [1990], CLARANS [1994]); k-Modes [1998]; Fuzzy c-means [1999]
• Hierarchical methods: Divisive (DIANA [1990]); Agglomerative (AGNES [1990], BIRCH [1996], CURE [1998], ROCK [1999], Chameleon [1999])
• Density-based methods: DBSCAN [1996], STING [1997], CLIQUE [1998], DENCLUE [1998], Wave Cluster [1998], OPTICS [1999]
• Graph-based methods: MST Clustering [1999], OPOSSUM [2000], SNN Similarity Clustering [2001, 2003]
• Model-based clustering: EM Algorithm [1977], COBWEB [1987], ANN Clustering [1982, 1989], AutoClass [1996]
Illustration of the k-Means clustering algorithm

List of objects (attributes), where d1, d2, d3 are the Euclidean distances of each object to the centroids c1, c2, c3 and the last column is the resulting cluster assignment:

x1   x2    d1    d2    d3    cluster
6.8  12.6  4.0   1.1   5.9   2
0.8  9.8   3.0   7.4   10.2  1
1.2  11.6  3.1   6.6   8.5   1
2.8  9.6   1.0   5.6   9.5   1
3.8  9.9   0.0   4.6   8.9   1
4.4  6.5   3.5   6.6   12.1  1
4.8  1.1   8.9   11.5  17.5  1
6.0  19.9  10.2  7.9   1.4   3
6.2  18.5  8.9   6.5   0.0   3
7.6  17.4  8.4   5.2   1.8   3
7.8  12.2  4.6   0.0   6.5   2
6.6  7.7   3.6   4.7   10.8  1
8.2  4.5   7.0   7.7   14.1  1
8.4  6.9   5.5   5.3   11.8  2
9.0  3.4   8.3   8.9   15.4  1
9.6  11.1  5.9   2.1   8.1   2

Initial centroids, chosen randomly:

Centroid  x1   x2
c1        3.8  9.9
c2        7.8  12.2
c3        6.2  18.5

[Scatter plot of the objects in the (x1, x2) plane with the centroids c1, c2, c3 marked, shown again at the distance-calculation and cluster-assigning steps.]
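A quick check of the table above (a sketch, assuming the Euclidean metric used throughout the deck; NumPy only):

```python
import numpy as np

objects = np.array([[6.8, 12.6], [0.8, 9.8], [1.2, 11.6], [2.8, 9.6],
                    [3.8, 9.9], [4.4, 6.5], [4.8, 1.1], [6.0, 19.9],
                    [6.2, 18.5], [7.6, 17.4], [7.8, 12.2], [6.6, 7.7],
                    [8.2, 4.5], [8.4, 6.9], [9.0, 3.4], [9.6, 11.1]])
centroids = np.array([[3.8, 9.9], [7.8, 12.2], [6.2, 18.5]])  # c1, c2, c3

# Euclidean distance of every object to each centroid -> columns d1, d2, d3
d = np.linalg.norm(objects[:, None, :] - centroids[None, :, :], axis=2)
cluster = d.argmin(axis=1) + 1   # nearest centroid, as a 1-based index

print(np.round(d, 1))  # reproduces the d1/d2/d3 columns
print(cluster)         # reproduces the 'cluster' column
```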
k-Means Algorithm
• The k-Means algorithm dates back to Lloyd [1957] and MacQueen [1967]; a widely used variant was proposed by J. A. Hartigan and M. A. Wong [1979].
• Given a set of n distinct objects, k-Means partitions the objects into k clusters such that intra-cluster similarity is high and inter-cluster similarity is low.
• In this algorithm, the user needs to specify k, the number of clusters, in advance.
• Assume the objects are described by numeric attributes.
• Then use a distance metric (Euclidean or Manhattan, both written out below) to create the clusters.
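For two objects $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$ described by $m$ numeric attributes, the two metrics are

$$d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}, \qquad d_{\text{Manhattan}}(x, y) = \sum_{i=1}^{m} \lvert x_i - y_i \rvert.$$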
k-Means Algorithm
• First, k objects are selected at random from the set of n objects; these k objects serve as the centroids of the k clusters.
• Each of the remaining objects is assigned to the closest centroid. The collection of objects assigned to a centroid is called a cluster.
• Next, the centroid of each cluster is updated by computing the mean of the attribute values of its member objects.
• The assignment and update steps are repeated (iterated) until some stopping criterion is reached (e.g., a maximum number of iterations, centroids remaining unchanged, or no reassignments).
k-Means Algorithm
Input: D, a dataset containing n objects; k, the number of clusters
Output: A set of k clusters
Steps:
1. Randomly choose k objects from D as the initial cluster centroids.
2. For each object in D:
   • Compute the distance between the current object and the k cluster centroids.
   • Assign the current object to the cluster whose centroid is closest.
3. Compute the “cluster centers” of each cluster. These become the new cluster centroids.
4. Repeat steps 2 and 3 until the convergence criterion is satisfied.
5. Stop
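A minimal runnable sketch of these steps (NumPy only; the function name and the exact stopping rule are illustrative choices, not from the slides):

```python
import numpy as np

def k_means(D, k, max_iter=100, seed=0):
    """Partition the rows of D (an n x m array) into k clusters."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly choose k objects from D as the initial centroids.
    centroids = D[rng.choice(len(D), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign every object to its closest centroid (Euclidean).
        dists = np.linalg.norm(D[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: the mean of each cluster's members becomes the new centroid
        # (this sketch assumes no cluster ends up empty).
        new_centroids = np.array([D[labels == j].mean(axis=0) for j in range(k)])
        # Step 4: converged once the centroids no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```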
Example
• Problem: There are 4 types of medicines, each with two attributes (weight index and pH). Group these objects into K = 2 clusters of medicines.

Medicine  Weight  pH-Index
A         1       1
B         2       1
C         4       3
D         5       4

[Scatter plot of the medicines A, B, C, D in the (weight, pH) plane.]
Example
• Step 1: Use initial seed (random) points for partitioning:

$$c_1 = A = (1, 1), \qquad c_2 = B = (2, 1)$$

Assign each object to the cluster with the nearest seed point, using the Euclidean distance. For example, for medicine D = (5, 4):

$$d(D, c_1) = \sqrt{(5 - 1)^2 + (4 - 1)^2} = 5$$
$$d(D, c_2) = \sqrt{(5 - 2)^2 + (4 - 1)^2} \approx 4.24$$

Since d(D, c2) < d(D, c1), D joins cluster 2. Doing the same for every object gives cluster 1 = {A} and cluster 2 = {B, C, D}.
Example
• Step 2: Compute new centroids of the current partition. Knowing the members of each cluster, compute the new centroid of each group based on these memberships:

$$c_1 = (1, 1), \qquad c_2 = \left(\frac{2 + 4 + 5}{3}, \frac{1 + 3 + 4}{3}\right) = \left(\frac{11}{3}, \frac{8}{3}\right)$$
Example
• Step 2 (continued): Renew the memberships based on the new centroids. Compute the distance of all objects to the new centroids and assign each object to the nearest one; this gives cluster 1 = {A, B} and cluster 2 = {C, D}.
Example
• Step 3: Repeat the first two steps until convergence. Knowing the members of each cluster, compute the new centroid of each group based on these new memberships:

$$c_1 = \left(\frac{1 + 2}{2}, \frac{1 + 1}{2}\right) = \left(1\tfrac{1}{2},\ 1\right), \qquad c_2 = \left(\frac{4 + 5}{2}, \frac{3 + 4}{2}\right) = \left(4\tfrac{1}{2},\ 3\tfrac{1}{2}\right)$$
Example
• Step 3 (continued): Compute the distance of all objects to the new centroids. No object receives a new assignment, so the membership of each cluster no longer changes and the algorithm stops.
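The final result can be checked with scikit-learn (a sketch assuming scikit-learn is installed; the initial centroids are seeded at A and B exactly as in Step 1):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]])  # medicines A, B, C, D
init = np.array([[1.0, 1.0], [2.0, 1.0]])       # seeds: A and B

km = KMeans(n_clusters=2, init=init, n_init=1).fit(X)
print(km.labels_)           # [0 0 1 1] -> clusters {A, B} and {C, D}
print(km.cluster_centers_)  # [[1.5 1. ] [4.5 3.5]]
```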
How to choose k? – Elbow Method
• A fundamental step for any unsupervised algorithm is to determine the optimal number of clusters into which the data may be grouped.
• [Scatter plot of sample data in the (x1, x2) plane.] From the visualization, the optimal number of clusters appears to be around 3, but visualizing the data alone cannot always give the optimal number of clusters.
• The Elbow Method is one of the most popular methods for determining this optimal value of k.
How to choose k? – Elbow Method
Elbow method (using Distortion):
• Step 1: Distortion is the average of the squared distances from the objects to the center of their assigned cluster (written out below). Typically, the Euclidean distance metric is used.
• Iterate k from 1 to 9 and calculate the distortion for each value of k in the given range.
• Step 2: Build the clustering model and calculate the values of the distortion.
  – Select the value of k at the “elbow”, i.e. the point after which the distortion starts to decrease in a roughly linear fashion.
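In symbols, for $n$ objects $x_j$ and cluster centers $\mu_1, \ldots, \mu_k$:

$$\text{distortion}(k) = \frac{1}{n} \sum_{j=1}^{n} \min_{1 \le c \le k} \lVert x_j - \mu_c \rVert^2$$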
How to choose k? – Elbow Method
Elbow method (using Inertia):
• Step 1: Inertia is the sum of the squared distances of the samples (objects) to their closest cluster center, i.e. inertia = n × distortion.
• Iterate k from 1 to 9 and calculate the inertia for each value of k in the given range.
• Step 2: Build the clustering model and calculate the values of the inertia (a sketch follows).
  – Select the value of k at the “elbow”, i.e. the point after which the inertia starts to decrease in a roughly linear fashion.
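A sketch of the elbow computation (assuming scikit-learn and Matplotlib are available; `X` here is synthetic placeholder data, and `KMeans.inertia_` returns the inertia as defined above):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = np.random.default_rng(0).normal(size=(300, 2))  # placeholder data

ks = range(1, 10)
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)              # sum of squared distances
distortions = [i / len(X) for i in inertias]  # average squared distance

plt.plot(ks, distortions, "o-")
plt.xlabel("k"); plt.ylabel("Distortion")
plt.title("Elbow method")
plt.show()
```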
Limitations of the k-Means algorithm
• Local optimum
  – sensitive to the initial seed points
  – may converge to a local optimum, which can be an unwanted solution (a common mitigation is sketched after this list)
• Need to specify K, the number of clusters, in advance
• Not suitable for discovering clusters with non-convex shapes
• Applicable only when the mean is defined, so it does not handle categorical data (use the k-Modes algorithm instead)
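In practice, the seed sensitivity is usually reduced by running several random restarts and keeping the best run, or by smarter seeding such as k-means++; both are scikit-learn defaults (a sketch, with toy data made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4], [1, 2], [5, 3]])  # toy data

# k-means++ seeding plus 10 random restarts; scikit-learn keeps the run
# with the lowest inertia, reducing the risk of a poor local optimum.
km = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0).fit(X)
print(km.labels_, km.inertia_)
```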
Application
• Colour-Based Image Segmentation Using K-means
Application
[Segmentation result: the “blue” pixels, “white” pixels, and “pink” pixels shown as separate images.]
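A sketch of the colour-based segmentation above (assuming scikit-learn; `image.png` is a placeholder filename):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

img = plt.imread("image.png")[:, :, :3]  # H x W x 3 RGB (placeholder file)
pixels = img.reshape(-1, 3)              # one row per pixel colour

k = 3                                    # e.g. "blue", "white", "pink"
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)

# Recolour every pixel with its cluster centre to visualise the segments.
segmented = km.cluster_centers_[km.labels_].reshape(img.shape)
plt.imshow(segmented); plt.axis("off"); plt.show()
```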
Practice
• dataset = { (5,3), (10,15), (15,12), (24,10), (30,45), (85,70), (71,80), (60,78), (55,52), (80,91) }
• Randomly initialize the cluster centroids c1, c2 as (5, 3) and (10, 15).
• Iteration = 1:
  – Compute the distances d1, d2 from each data point to c1 and c2.
Iteration = 1
x1 x2 d1 d2 cluster
5 3 0 13
10 15 13 0
15 12 13.45 5.83
24 10 20.24 14.86
30 45 48.87 36
85 70 104.35 93
71 80 101.41 89
60 78 93 80
55 52 70 58
80 91 115.52 103.32
Iteration = 1
• Assign the data points to the closest centroid’s cluster
x1 x2 d1 d2 cluster
5 3 0 13 C1
10 15 13 0 C2
15 12 13.45 5.83 C2
24 10 20.24 14.86 C2
30 45 48.87 36 C2
85 70 104.35 93 C2
71 80 101.41 89 C2
60 78 93 80 C2
55 52 70 58 C2
80 91 115.52 103.32 C2
Iteration = 1
• Calculate the new cluster centroids
Iteration = 1
• C1 has only one data point (5,3).
• Mean c1(x1)= (5/1)= 5.
• Mean c1(x2)= (3/1) = 3
• So, new centroid for Cluster1 is again (5,3)
Iteration = 1
• C2 has 9 data points.
• Mean c2(x1)= (10 + 15 + 24 + 30 + 85 + 71 + 60 + 55 + 80) / 9 = 47.77
• Mean c2(x2)= (15 + 12 + 10 + 45 + 70 + 80 + 78 + 52 + 91) / 9 = 50.33
• So, new centroid for Cluster2 is (47.77,50.33)
Iteration = 2
• Compute the distances between the new centroids c1 = (5, 3) and c2 = (47.77, 50.33) and all data points.
x1 x2 d1 d2 cluster
5 3 0 63.79
10 15 13 51.71
15 12 13.45 50.42
24 10 20.24 46.81
30 45 48.87 18.55
85 70 104.35 42.1
71 80 101.41 37.68
60 78 93 30.25
55 52 70 7.42
80 91 115.52 51.89
Iteration = 2
• Assign the data points to the closest centroids.
x1 x2 d1 d2 cluster
5 3 0 63.79 C1
10 15 13 51.71 C1
15 12 13.45 50.42 C1
24 10 20.24 46.81 C1
30 45 48.87 18.55 C2
85 70 104.35 42.1 C2
71 80 101.41 37.68 C2
60 78 93 30.25 C2
55 52 70 7.42 C2
80 91 115.52 51.89 C2
Iteration = 2
• Compute the new cluster centroids
Iteration = 2
• C1 has 4 data points.
• Mean c1(x1) = (5 + 10 + 15 + 24) / 4 = 13.5
• Mean c1(x2) = (3 + 15 + 12 + 10) / 4 = 10.0
• So, new centroid for Cluster1 is (13.5,10.0)
Iteration = 2
• C2 has 6 data points.
• Mean c2(x1)= (30 + 85 + 71 + 60 + 55 + 80) / 6 = 63.5
• Mean c2(x2) = (45 + 70 + 80 + 78 + 52 + 91) / 6 = 69.33
• So, new centroid for Cluster2 is (63.5, 69.33)
Iteration = 3
• Compute the distances between the new centroids c1 = (13.5, 10.0), c2 = (63.5, 69.33) and all data points.
x1 x2 d1 d2 cluster
5 3 11.01 88.44
10 15 6.1 76.24
15 12 2.5 75.09
24 10 10.5 71.27
30 45 38.69 41.4
85 70 93.3 21.51
71 80 90.58 13.04
60 78 85.37 9.34
55 52 59.04 19.3
80 91 104.8 27.3
Iteration = 3
• Assign the data points to the closest centroids.
x1 x2 d1 d2 cluster
5 3 11.01 88.44 C1
10 15 6.1 76.24 C1
15 12 2.5 75.09 C1
24 10 10.5 71.27 C1
30 45 38.69 41.4 C1
85 70 93.3 21.51 C2
71 80 90.58 13.04 C2
60 78 85.37 9.34 C2
55 52 59.04 19.3 C2
80 91 104.8 27.3 C2
Iteration = 3
• Compute the new cluster centroids
Iteration = 3
• C1 has 5 data points.
• Mean c1(x1) = (5 + 10 + 15 + 24 + 30) / 5 = 16.8
• Mean c1(x2) = (3 + 15 + 12 + 10 + 45) / 5 = 17.0
• So, new centroid for Cluster1 is (16.8,17.0)
Iteration = 3
• C2 has 5 data points.
• Mean c2(x1)= (85 + 71 + 60 + 55 + 80) / 5 = 70.2
• Mean c2(x2)= (70 + 80 + 78 + 52 + 91) / 5 = 74.2
• So, new centroid for Cluster2 is (70.2,74.2)
Iteration = 4
• Compute the distances from the new centroids c1 = (16.8, 17.0) and c2 = (70.2, 74.2) to all data points.
x1 x2 d1 d2 cluster
5 3 18.3 96.54
10 15 7.08 84.43
15 12 5.31 83.16
24 10 10.04 79.09
30 45 30.95 49.68
85 70 86.37 15.38
71 80 83.1 5.85
60 78 74.74 10.88
55 52 51.8 26.9
80 91 97.31 19.44
Iteration = 4
• Assign the data points to the closest centroids
x1 x2 d1 d2 cluster
5 3 18.3 96.54 C1
10 15 7.08 84.43 C1
15 12 5.31 83.16 C1
24 10 10.04 79.09 C1
30 45 30.95 49.68 C1
85 70 86.37 15.38 C2
71 80 83.1 5.85 C2
60 78 74.74 10.88 C2
55 52 51.8 26.9 C2
80 91 97.31 19.44 C2
Iteration = 4
• Compute the new cluster centroids.
Iteration = 4
• C1 has 5 data points.
• Mean c1(x1) = (5 + 10 + 15 + 24 + 30) / 5 = 16.8
• Mean c1(x2) = (3 + 15 + 12 + 10 + 45) / 5 = 17.0
• So, new centroid for Cluster1 is (16.8,17.0)
Iteration = 4
• C2 has 5 data points.
• Mean c2(x1)= (85 + 71 + 60 + 55 + 80) / 5 = 70.2
• Mean c2(x2)= (70 + 80 + 78 + 52 + 91) / 5 = 74.2
• So, new centroid for Cluster2 is (70.2,74.2)
Convergence
• The cluster centroids in iteration 3 and iteration 4 are the same, i.e. there is no change.
• This satisfies the convergence criterion: the data points cannot be reassigned further.
• So, stop the process.
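The whole practice run can be reproduced in a few lines (a sketch assuming scikit-learn, seeding the centroids at (5, 3) and (10, 15) as above):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 45],
              [85, 70], [71, 80], [60, 78], [55, 52], [80, 91]])
init = np.array([[5.0, 3.0], [10.0, 15.0]])  # the two initial centroids

km = KMeans(n_clusters=2, init=init, n_init=1).fit(X)
print(km.cluster_centers_)  # [[16.8 17. ] [70.2 74.2]] -- as derived above
print(km.labels_)           # first five points in C1, last five in C2
```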
Reference
• Artificial Intelligence and Machine Learning, Chandra S.S. & H.S. Anand, PHI Publications
• Online materials