K – means cluster analysis.pptx

K – MEANS
CLUSTER
ANALYSIS
SUBMITTED TO:
PROF. SOMEN SAHU
DEPT. OF FES
-AGNIVA PRADHAN
M.F.Sc 2ND SEMESTER
DEPT. OF FNT
M/F/2021/03

 K-means clustering is a simple unsupervised learning algorithm
for solving clustering problems. It partitions a given data set into a
number of clusters, k, which is fixed beforehand. Each cluster is
represented by a point (its centroid), and every observation is
assigned to the nearest centroid. The centroids are then recomputed
from their assigned observations, and the process repeats with the
updated centroids until the assignments stop changing.
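The assign-and-update loop described above can be sketched in a few lines of Python. This is a minimal illustration of the common batch form of K-means, not necessarily the exact procedure used later in these slides. The function name `kmeans`, its parameters, and the toy points are assumptions made for the example.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Batch k-means sketch: assign every point, then recompute every centroid."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:  # nothing moved, so assignments are stable
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy points, not taken from the slides.
toy_points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
toy_labels, toy_centroids = kmeans(toy_points, [(1.0, 1.0), (9.0, 9.0)])
print(toy_labels, toy_centroids)
```

Passing the starting centroids explicitly keeps the sketch deterministic. Practical implementations usually choose them at random or with a seeding heuristic.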

This is a versatile algorithm that can be used for many kinds of
grouping problems. Some example use cases are:
Behavioural segmentation:
Segment by purchase history
Segment by activities on application, website, or platform
Define personas based on interests
Create profiles based on activity monitoring
Inventory categorization:
Group inventory by sales activity
Group inventory by manufacturing metrics

Sorting sensor measurements:
Detect activity types in motion sensors
Group images
Separate audio
Identify groups in health monitoring
Detecting bots or anomalies:
Separate valid activity groups from bots
Group valid activity to clean up outlier detection
In addition, monitoring if a tracked data point switches
between groups over time can be used to detect
meaningful changes in the data.

 Suppose we have the heights and weights of 12 students:

No.   Height of student (cm)   Weight of student (kg)
1     185                      72
2     170                      56
3     168                      60
4     179                      68
5     182                      72
6     188                      77
7     180                      71
8     180                      70
9     183                      84
10    180                      88
11    180                      67
12    177                      76
[Figure: scatter plot of Weight (kg, axis 0 to 100) against Height (cm, axis 165 to 190)]

 Now I need to classify the data points into 2 clusters, named K1 and
K2, using the K-means algorithm.
 Here I am using the centroid concept, i.e. every cluster has an
associated centroid value.
 The centroid is the reference value against which the remaining data
points are clustered.
 Then we need to calculate the distance of each data point from each
centroid.
 Here the distance is the Euclidean distance.
 ED = \sqrt{(X_o - X_C)^2 + (Y_o - Y_C)^2}
 X_o and Y_o are the observed values
 X_C and Y_C are the centroid values
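To make the distance concrete, here is a small Python sketch of the Euclidean distance between an observed point and a centroid. The function name `euclidean_distance` is an assumption for illustration. The sample points are rows 3 and 1 of the student table.

```python
import math

def euclidean_distance(observed, centroid):
    """Straight-line distance between an observed point and a centroid."""
    x_o, y_o = observed
    x_c, y_c = centroid
    # Square the difference on each axis, add them, then take the square root.
    return math.sqrt((x_o - x_c) ** 2 + (y_o - y_c) ** 2)

# Student 3 (168 cm, 60 kg) measured against student 1 (185 cm, 72 kg).
print(round(euclidean_distance((168, 60), (185, 72)), 2))
```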

 I have taken the 1st row as the centroid of K1, i.e. (185, 72), and
the 2nd row as the centroid of K2, i.e. (170, 56).
 Now we need to cluster the data into two clusters by measuring the
Euclidean distance.
 ED for the 3rd row (168, 60):
K1: \sqrt{(168 - 185)^2 + (60 - 72)^2} = 20.81
K2: \sqrt{(168 - 170)^2 + (60 - 56)^2} = 4.47
 The 3rd row is nearer to K2 [ED for K2 < ED for K1], so the 3rd row
goes into K2.
So our new clusters are:
K1 – 1st row
K2 – 2nd and 3rd rows
[Figure: clusters K1 and K2, labelled with centroids (185, 72) and (170, 56)]
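The assignment of the 3rd row can be checked with a short Python sketch: compute the distance to each centroid and keep the cluster with the smallest one. The dictionary layout and variable names are assumptions for illustration. `math.dist` is the standard-library Euclidean distance (Python 3.8 or later).

```python
import math

centroids = {"K1": (185, 72), "K2": (170, 56)}  # rows 1 and 2 as starting centroids
row_3 = (168, 60)

# Distance from row 3 to each centroid, then pick the cluster with the smallest.
distances = {name: math.dist(row_3, c) for name, c in centroids.items()}
nearest = min(distances, key=distances.get)
print(distances, nearest)
```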

 Now we need to recalculate the centroid of K2 [as the 3rd row has
joined K2].
 So the new centroid of K2 = \left(\frac{170+168}{2}, \frac{56+60}{2}\right) = (169, 58)
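The centroid update is just a per-coordinate mean of the cluster's members. A minimal Python sketch, with the helper name `centroid` assumed for illustration:

```python
def centroid(members):
    """Mean of each coordinate over the points currently in a cluster."""
    n = len(members)
    # zip(*members) groups all heights together and all weights together.
    return tuple(sum(axis) / n for axis in zip(*members))

k2_members = [(170, 56), (168, 60)]  # rows 2 and 3
print(centroid(k2_members))
```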
 Now we need to recalculate the ED for the 4th row as before, using
the updated centroids, and continue in the same way through the 12th row.
 Thus, we get the final K1 and K2 clusters as:
 K1: {1,4,5,6,7,8,9,10,11,12}
 K2: {2,3}
[Figure: clusters K1 and K2, labelled (185, 72) and (169, 58)]
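The whole walk-through can be replayed in Python as a check. This is a sketch under one stated assumption: after each row is assigned, the centroid of the receiving cluster (K1 as well as K2) is recomputed as the mean of its members before the next row is processed. The names `students`, `mean_point`, `clusters` and `centroids` are chosen for illustration.

```python
import math

# (height cm, weight kg) for students 1..12, copied from the table.
students = {
    1: (185, 72), 2: (170, 56), 3: (168, 60), 4: (179, 68),
    5: (182, 72), 6: (188, 77), 7: (180, 71), 8: (180, 70),
    9: (183, 84), 10: (180, 88), 11: (180, 67), 12: (177, 76),
}

def mean_point(rows):
    """Centroid of the listed rows: the mean height and mean weight."""
    pts = [students[r] for r in rows]
    return tuple(sum(axis) / len(pts) for axis in zip(*pts))

clusters = {"K1": [1], "K2": [2]}  # rows 1 and 2 seed the two clusters
centroids = {name: mean_point(rows) for name, rows in clusters.items()}

for row in range(3, 13):
    # Assign the row to the cluster whose current centroid is nearest.
    nearest = min(centroids, key=lambda name: math.dist(students[row], centroids[name]))
    clusters[nearest].append(row)
    # Assumption: the receiving cluster's centroid is recomputed immediately.
    centroids[nearest] = mean_point(clusters[nearest])

print(clusters)
print(centroids)
```

Under this assumption the run gives K1 = {1,4,5,6,7,8,9,10,11,12} and K2 = {2,3}. K2 keeps the centroid (169, 58), and the recomputed K1 centroid ends at about (181.4, 74.5).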
