11. Unsupervised Classification:
Cluster seeking by the Maximin Algorithm:
1. Choose any vector Xl as the first cluster centre.
2. The vector Xm that maximises the distance to the first cluster centre becomes the second cluster centre.
3. Given k cluster centres, the vector Xn that maximises the minimum distance to those k centres becomes the (k+1)th cluster centre.
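A minimal Python sketch of this seeking rule follows (illustration only; the function name, the choice of the first point, and the use of squared Euclidean distance are assumptions, not prescribed by the slides):

```python
import numpy as np

def maximin_centres(X, k):
    """Pick k initial cluster centres from the rows of X using the Maximin rule."""
    X = np.asarray(X, dtype=float)
    centres = [X[0]]                                   # step 1: any vector as the first centre
    d_min = np.sum((X - centres[0]) ** 2, axis=1)      # distance of each vector to its nearest centre
    while len(centres) < k:
        idx = np.argmax(d_min)                         # steps 2-3: the vector farthest from its
        centres.append(X[idx])                         # nearest centre becomes the next centre
        d_min = np.minimum(d_min, np.sum((X - X[idx]) ** 2, axis=1))
    return np.array(centres)

# e.g. on the eight points used in the following slides:
X = np.array([[0, 4], [0, 5], [1, 5], [4, 0], [5, 0], [5, 1], [4, 5], [5, 5]])
print(maximin_centres(X, 3))   # (0,4), (5,0), (5,5); the exact result depends on the first point chosen
```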
12. Refinement of Cluster Centres using the K-Means Algorithm
[Figure: 2D feature space, x and y from 0 to 6, showing the example points and the cluster centres C0, C1, C2]
Examples:
#   x   y
1   0   4
2   0   5
3   1   5
4   4   0
5   5   0
6   5   1
7   4   5
8   5   5

Initial Cluster Centres:
C0i = (0,5), C1i = (5,0), C2i = (5,5)
13. Refinement of Cluster Centres using the K-Means Algorithm
2D feature space, 3-cluster problem. K-Means Algorithm, first iteration.

No.  Pi (x,y)  D(Pi,C0)  D(Pi,C1)  D(Pi,C2)  min(D)  k = argmin over k=0,1,2
1    (0,4)
2    (0,5)
3    (1,5)
4    (4,0)
5    (5,0)
6    (5,1)
7    (4,5)
8    (5,5)
(The distance columns, the cluster populations Nk and the refined centres Ckf, for k = 0,1,2, are filled in on the next slide.)

Initial cluster centres: C0i = (0,5), C1i = (5,0), C2i = (5,5)
Refined cluster centres: C0f = (0.33, 4.67), C1f = (4.67, 0.33), C2f = (4.5, 5)
14. Refinement of Cluster Centres using the K-Means Algorithm
2D feature space, 3-cluster problem. K-Means Algorithm, first iteration (distances are squared Euclidean).

No.  Pi (x,y)  D(Pi,C0)  D(Pi,C1)  D(Pi,C2)  min(D)  k = argmin over k=0,1,2
1    (0,4)     1         41        26        1       0
2    (0,5)     0         50        25        0       0
3    (1,5)     1         41        16        1       0
4    (4,0)     41        1         26        1       1
5    (5,0)     50        0         25        0       1
6    (5,1)     41        1         16        1       1
7    (4,5)     16        26        1         1       2
8    (5,5)     25        25        0         0       2

Cluster populations: N0 = 3, N1 = 3, N2 = 2
Initial cluster centres: C0i = (0,5), C1i = (5,0), C2i = (5,5)
Refined cluster centres: C0f = (0.33, 4.67), C1f = (4.67, 0.33), C2f = (4.5, 5)
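The table can be reproduced with a few lines of Python; the sketch below (my own illustration, not from the slides) computes the squared Euclidean distances D, the nearest-centre index k, the populations Nk and the refined centres Ckf for this first iteration:

```python
import numpy as np

points  = np.array([[0, 4], [0, 5], [1, 5], [4, 0], [5, 0], [5, 1], [4, 5], [5, 5]], dtype=float)
centres = np.array([[0, 5], [5, 0], [5, 5]], dtype=float)     # C0i, C1i, C2i

# squared Euclidean distance of every point to every centre: shape (8, 3)
D = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
k = D.argmin(axis=1)                                          # nearest centre per point

for j in range(len(centres)):
    members = points[k == j]
    print(f"N{j} = {len(members)}, C{j}f = {members.mean(axis=0)}")
# N0 = 3, C0f = (0.33, 4.67);  N1 = 3, C1f = (4.67, 0.33);  N2 = 2, C2f = (4.5, 5)
```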
17. Cluster Refinement by the K-Means Algorithm
1. Determine the initial K cluster centres using the Maximin algorithm.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat steps 2 and 3 until the centroids no longer move.
This produces a separation of the objects into groups from which the metric to be minimised can be calculated.
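A minimal iterative sketch of steps 2-4 in Python (my own illustration; it uses squared Euclidean distance and assumes no cluster becomes empty):

```python
import numpy as np

def kmeans(X, centres, max_iter=100):
    """Refine the given initial centres by repeating the assign / recompute steps."""
    X, centres = np.asarray(X, float), np.asarray(centres, float)
    for _ in range(max_iter):
        # step 2: assign every object to the group with the closest centroid
        D = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = D.argmin(axis=1)
        # step 3: recalculate each centroid as the mean of its assigned objects
        new_centres = np.array([X[labels == j].mean(axis=0) for j in range(len(centres))])
        # step 4: stop when the centroids no longer move
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return centres, labels
```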
18. Classification:
1. Supervised Classification
• Training samples have known labels, which makes it possible to estimate the characteristics of each class as a set of parameters representing that class.
• The goal is to assign each pixel vector a label by computing the distance of the pixel vector to each class and choosing the class to which it has minimum distance, using a Euclidean, City Block, Mahalanobis or other distance measure (a sketch follows this list).
2. Unsupervised Classification
• Initially the samples have no labels.
• A clustering technique is employed to partition the n samples of the dataset into k clusters, with each sample belonging to exactly one cluster.
• K-means is a clustering algorithm whose initial cluster centres can be obtained randomly, with a cluster-seeking algorithm such as Maximin, or with a neural network such as a Self-Organising Map.
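To illustrate the supervised case in point 1, the sketch below assigns a pixel vector to the class whose mean is nearest under a chosen distance measure (the class means, covariances and pixel values here are made-up illustration data, not from the slides):

```python
import numpy as np

def minimum_distance_label(x, class_means, class_covs=None, metric="euclidean"):
    """Index of the class whose mean is closest to pixel vector x under the given metric."""
    x = np.asarray(x, dtype=float)
    dists = []
    for i, mu in enumerate(class_means):
        diff = x - np.asarray(mu, dtype=float)
        if metric == "euclidean":
            d = np.sqrt(diff @ diff)
        elif metric == "cityblock":
            d = np.abs(diff).sum()
        elif metric == "mahalanobis":
            d = np.sqrt(diff @ np.linalg.inv(class_covs[i]) @ diff)
        else:
            raise ValueError(f"unknown metric: {metric}")
        dists.append(d)
    return int(np.argmin(dists))

means = [np.array([0.0, 5.0]), np.array([5.0, 0.0])]   # hypothetical class means
covs  = [np.eye(2), np.eye(2)]                         # hypothetical class covariances
print(minimum_distance_label([1.0, 4.0], means, covs, metric="mahalanobis"))   # -> 0
```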
19. ISODATA Algorithms – Extension of K-Means Algorithm
ISODATA – Iterative Self-Organising Data Analysis Technique
[Figure: 2D feature space (x1, x2) showing seven clusters C1 to C7]
Cluster  #Examples  Mean Vector  Variance Vector  Principal Eigenvector v1ᵀ  Principal Eigenvalue
C1       100        (2, 8)       (-0.25, 0.25)
C2       100        (2, 7.5)     (-0.25, 0.25)
C3       100        (2.5, 8)     (-0.25, 0.25)
C4       1000       (5, 5)       (7, 7)           (0.707, 0.707)             10
C5       100        (6, 3)       (0.25, 0.25)
C6       300        (8, 3)       (1, 1)           (1, 0)
C7       20         (9, 6)       (0.12, 0.12)
Need for
a. Splitting
b. Merging
c. Rejecting
d. Refining (K-Means)
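The per-cluster statistics in the table (variance vector, principal eigenvector and eigenvalue) drive the decisions listed above. A sketch of how they could be computed, assuming `members` is an (N, 2) array holding one cluster's samples (the slides give only the resulting statistics, not the raw samples):

```python
import numpy as np

def cluster_statistics(members):
    """Mean vector, variance vector, principal eigenvector and principal eigenvalue of one cluster."""
    members = np.asarray(members, dtype=float)
    mean = members.mean(axis=0)
    var  = members.var(axis=0)                    # per-feature variance vector
    cov  = np.cov(members, rowvar=False)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    return mean, var, eigvecs[:, -1], eigvals[-1] # principal axis and its eigenvalue
```

A cluster whose principal eigenvalue exceeds the splitting threshold (7 units in the later slides) is a candidate for splitting; clusters whose mean vectors lie close together are candidates for merging; clusters with too few members are candidates for rejection.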
20. ISODATA Algorithms – Extension of K-Means Algorithm
Rejecting a Cluster

Before rejection:
Cluster  #Examples  Mean Vector
C1       100        (2, 8)
C2       100        (2, 7.5)
C3       100        (2.5, 8)
C4       1000       (5, 5)
C5       100        (6, 3)
C6       300        (8, 3)
C7       20         (9, 6)

Reject a cluster if its population ≤ 50.

After rejection:
Cluster  #Examples  Mean Vector
C1       100        (2, 8)
C2       100        (2, 7.5)
C3       100        (2.5, 8)
C4       1000       (5, 5)
C5       100        (6, 3)
C6       300        (8, 3)

Rejecting Cluster 7: the examples of Cluster 7 are redistributed to the other clusters. 7 clusters to 6 clusters.
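A sketch of the rejection step, assuming `labels` holds each example's current cluster index and `centres` the current mean vectors (variable names and the reassignment-by-nearest-centre detail are my own assumptions):

```python
import numpy as np

def reject_small_clusters(X, labels, centres, min_population=50):
    """Drop clusters with population <= min_population and reassign their examples."""
    X, centres = np.asarray(X, dtype=float), np.asarray(centres, dtype=float)
    counts = np.bincount(labels, minlength=len(centres))
    kept_centres = centres[counts > min_population]          # surviving clusters only
    # redistribute every example to its nearest surviving centre
    D = ((X[:, None, :] - kept_centres[None, :, :]) ** 2).sum(axis=2)
    return D.argmin(axis=1), kept_centres
```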
24. ISODATA Algorithms – Extension of K-Means Algorithm
Merging of Clusters: D(C1,C2), D(C1,C3), D(C2,C3) < 1, so clusters C1, C2 and C3 are merged into CN1.

Mean of merged cluster CN1 = (100×(2,8) + 100×(2,7.5) + 100×(2.5,8)) / 300 = (2.16, 7.83)
Population of merged cluster CN1 = 100 + 100 + 100 = 300

Before merging:
Cluster  #Examples  Mean Vector
C1       100        (2, 8)
C2       100        (2, 7.5)
C3       100        (2.5, 8)
C4       1000       (5, 5)
C5       100        (6, 3)
C6       300        (8, 3)

After merging:
Cluster  #Examples  Mean Vector
CN1      300        (2.16, 7.83)
C4       1000       (5, 5)
C5       100        (6, 3)
C6       300        (8, 3)

6 clusters to 4 clusters
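The merged mean is simply the population-weighted average of the three means; a short Python check of the arithmetic above (values taken from the table):

```python
import numpy as np

populations = np.array([100, 100, 100])                       # C1, C2, C3
means       = np.array([[2.0, 8.0], [2.0, 7.5], [2.5, 8.0]])  # their mean vectors

merged_population = populations.sum()                          # 300
merged_mean = (populations[:, None] * means).sum(axis=0) / merged_population
print(merged_population, merged_mean)                          # 300 [2.1667 7.8333]
```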
26. ISODATA Algorithms – Extension of K-Means Algorithm
Splitting of Clusters: split a cluster if its principal eigenvalue exceeds 7 units.

Cluster  #Examples  Mean Vector   Variance Vector  Principal Eigenvector v1ᵀ  Principal Eigenvalue λ1
CN1      300        (2.16, 7.83)
C4       1000       (5, 5)        (7, 7)           (0.707, 0.707)             10
C5       100        (6, 3)        (0.25, 0.25)
C6       300        (8, 3)        (1, 1)           (1, 0)

λ1 of C4 exceeds the threshold, so C4 is split into two new clusters C4a and C4b:
Mean of C4a = (5,5) - √λ1 × v1ᵀ = (5,5) - 3.16 × (0.707, 0.707) = (2.76, 2.76)
Mean of C4b = (5,5) + √λ1 × v1ᵀ = (5,5) + 3.16 × (0.707, 0.707) = (7.23, 7.23)
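The split places the two new means one principal standard deviation (√λ1 ≈ 3.16) either side of the old mean along the principal eigenvector; a short Python check of the numbers above:

```python
import numpy as np

mean = np.array([5.0, 5.0])        # mean vector of C4
v1   = np.array([0.707, 0.707])    # principal eigenvector of C4
lam1 = 10.0                        # principal eigenvalue of C4

offset = np.sqrt(lam1) * v1        # one standard deviation along the principal axis
print(mean - offset)               # mean of C4a, approximately (2.76, 2.76)
print(mean + offset)               # mean of C4b, approximately (7.23, 7.23)
```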
27. ISODATA Algorithms – Extension of K-Means Algorithm
Splitting of Clusters: split a cluster if its principal eigenvalue exceeds 7 units.

Before splitting:
Cluster  Mean Vector
CN1      (2.16, 7.83)
C4       (5, 5)
C5       (6, 3)
C6       (8, 3)

After splitting:
Cluster  Mean Vector
CN1      (2.16, 7.83)
C4a      (2.76, 2.76)
C4b      (7.23, 7.23)
C5       (6, 3)
C6       (8, 3)

4 clusters to 5 clusters
28. ISODATA Algorithms – Extension of K-Means Algorithm
K-Means Algorithm – fixed number of clusters
ISODATA Algorithm – the number of clusters varies from iteration to iteration