Background DEC Algorithm Results Conclusion
Clustering?
What is clustering?
The problem of partitioning a collection of objects into groups.
Objects in the same group are similar.
Objects in different groups are dissimilar.
Why clustering?
Useful in finding natural groupings within a data set
Useful in analysis, description and utilization of valuable information
hidden within groups
What is a clustering?
Data set
Let $X = \{X_1, X_2, \ldots, X_n\}$ be a set of $n$ data items, where $X_i \in \mathbb{R}^f$, $i = 1, \ldots, n$; that is, $X_i = (X_{i1}, \ldots, X_{if})$, where the $X_{ij}$ are called features of $X_i$.
Clustering
A clustering of $X$ is then a partition $C = (C_1, \ldots, C_k)$ of $X$ such that:
$\bigcup_{i=1}^{k} C_i = X$,
$C_i \cap C_j = \emptyset \quad \forall i, j \in \{1, 2, \ldots, k\},\ i \neq j$,
$C_i \neq \emptyset \quad \forall i \in \{1, 2, \ldots, k\}$.
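The three partition conditions can be checked mechanically. The following is a minimal sketch, assuming items are referred to by hashable identifiers (for example their indices) and each cluster is a collection of such identifiers; the function name `is_clustering` is hypothetical.

```python
def is_clustering(X, clusters):
    """Return True if `clusters` is a partition of `X` into non-empty groups."""
    items = set(X)
    seen = set()
    for c in clusters:
        c = set(c)
        if not c:          # every C_i must be non-empty
            return False
        if c & seen:       # C_i and C_j must be disjoint for i != j
            return False
        seen |= c
    return seen == items   # the union of all C_i must equal X


if __name__ == "__main__":
    print(is_clustering(range(5), [{0, 1}, {2}, {3, 4}]))
    print(is_clustering(range(5), [{0, 1}, {1, 2}, {3, 4}]))
```

The first call describes a valid clustering of five items; the second fails the disjointness condition because item 1 appears in two groups.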
Clustering Validity Indexes
Used to analyze the quality of a clustering solution on a
quantitative basis.
A good clustering would be one with compact and distinct
clusters.
A validity index helps capture the notion of compactness
and separation in a clustering solution.
Computing a validity index requires both a distance measure and
a clustering.
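PBM Index
$F_{PBM}(C, D) = \frac{1}{K} \times \frac{E}{E_K} \times D_K$
where
$C$ is a clustering, $D$ is the distance function and $K$ is the number of clusters;
$E = \sum_{i=1}^{n} D(X_i, \omega_x)$ is the sum of the distances of all items from the global centroid $\omega_x$;
$E_K = \sum_{k=1}^{K} \sum_{X \in C_k} D(X, \omega_k)$ is the sum of the distances of all items from the centroid of their own cluster;
$D_K = \max_{1 \le i, j \le K,\ i \neq j} \{D(\omega_i, \omega_j)\}$ is the maximum distance between two centroids.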
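The PBM formula can be evaluated directly from its three ingredients. The sketch below follows the formula as written on this slide, assuming Euclidean distance for D, centroids computed as cluster means, and at least two clusters with non-zero spread; the names `pbm_index` and `centroid` are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def pbm_index(X, labels):
    """(1/K) * (E / E_K) * D_K with Euclidean distance (an assumption)."""
    clusters = {}
    for x, k in zip(X, labels):
        clusters.setdefault(k, []).append(x)
    K = len(clusters)
    centers = {k: centroid(pts) for k, pts in clusters.items()}
    global_center = centroid(X)
    # E: distances of all items from the global centroid.
    E = sum(math.dist(x, global_center) for x in X)
    # E_K: distances of all items from the centroid of their own cluster.
    E_K = sum(math.dist(x, centers[k])
              for k, pts in clusters.items() for x in pts)
    # D_K: largest centroid-to-centroid distance (pairs with i == j give 0,
    # so including them does not change the maximum when K >= 2).
    D_K = max(math.dist(a, b)
              for a in centers.values() for b in centers.values())
    return (1.0 / K) * (E / E_K) * D_K


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [10, 0], [10, 1]]
    print(pbm_index(X, [0, 0, 1, 1]))  # compact, well-separated split
    print(pbm_index(X, [0, 1, 0, 1]))  # split that mixes the two groups
```

A compact, well-separated split has small E_K and large D_K, so it scores higher than a split that mixes the two groups.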
Differential Evolution Clustering (DEC)
DEC has two main stages:
Initialization stage
Evolution stage
In the initialization stage, a population of individuals is generated.
In the evolution stage, the population evolves through a number of
cycles.
Each cycle has two phases:
Exploration phase
Exploitation phase
The best individual is returned at the end.
Differential Evolution Clustering (DEC)
DEC Algorithm
D ← Euclidean distance /* distance measure */
P ← Initialization
Repeat /* one cycle */
  while g ≤ number of generations per cycle /* one generation per pass */
    for each individual Ip in P
      Perform crossover to generate offspring Ic
      Local optimize Ic
      Compare Ic and Ip and keep the better individual in P
    end for
  end while
  Perturb P
  D ← MinMax distance [≥ 70% cycles]
Until some criteria are met
Return the best clustering found
Exploration: [< 70% generations]
Exploitation: [≥ 70% generations]
Initialization
Setup
Create a population P of N individuals, where
N/3 individuals have 90%–100% active centroids,
N/3 individuals have up to 70%–80% active centroids, and
N/3 individuals have up to 50% active centroids.
A clustering is then computed by assigning items to their
closest active centroids.
Centroids are recomputed periodically.
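The setup above can be sketched as follows. This is only an illustration under stated assumptions: candidate centroids are sampled from the data, each centroid carries a simple on/off flag, the "up to 50%" group is fixed at exactly 50%, and at least two centroids stay active. None of these details, nor the names `make_individual`, `init_population` and `assign`, come from the slides.

```python
import math
import random


def make_individual(X, k_max, active_fraction, rng):
    """One individual: k_max centroids sampled from X plus an on/off flag each."""
    centroids = rng.sample(X, k_max)
    n_active = max(2, round(active_fraction * k_max))
    active = [i < n_active for i in range(k_max)]
    rng.shuffle(active)  # spread the active flags over random positions
    return {"centroids": centroids, "active": active}


def init_population(X, n_individuals, k_max, rng):
    """Thirds of the population get high, medium and low shares of active centroids."""
    ranges = [(0.9, 1.0), (0.7, 0.8), (0.5, 0.5)]
    population = []
    for i in range(n_individuals):
        lo, hi = ranges[min(3 * i // n_individuals, 2)]
        population.append(make_individual(X, k_max, rng.uniform(lo, hi), rng))
    return population


def assign(X, individual):
    """Label each item with the index of its closest active centroid."""
    active_ids = [i for i, on in enumerate(individual["active"]) if on]
    return [min(active_ids,
                key=lambda i: math.dist(x, individual["centroids"][i]))
            for x in X]


if __name__ == "__main__":
    rng = random.Random(0)
    X = [[float(i), float(i % 3)] for i in range(30)]
    population = init_population(X, 6, 10, rng)
    print([sum(ind["active"]) for ind in population])
    print(assign(X, population[0])[:5])
```

Mixing individuals with many and few active centroids gives the search starting points with different numbers of clusters.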
Evolution stage
Crossover
Idea
For each individual $I_p$ (parent) in the population, an offspring $I_c$ is generated using crossover as follows:
$I_c \leftarrow (\Omega_c, T_c, C_c)$
Randomly select three unique individuals $I_x$, $I_y$, $I_z$ (donors) from the population.
The threshold vector $T_c$ is computed using:
$$t_i^c = \begin{cases} t_i^x + \sigma_t (t_i^y - t_i^z) & \text{if } URandom(0, 1) < \eta \\ t_i^p & \text{otherwise,} \end{cases}$$
where $\eta \in [0, 1]$ is the crossover probability, $\sigma_t$ is the scaling factor and $URandom(0, 1)$ denotes a random number selected from $[0, 1]$.
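The threshold update is a component-wise differential-evolution step. A minimal sketch, assuming threshold vectors are plain Python lists of equal length; excluding the parent from the donor pool is an assumption, and the function names are hypothetical.

```python
import random


def crossover_thresholds(t_p, t_x, t_y, t_z, eta, sigma_t, rng=random):
    """Build the offspring threshold vector T_c component by component."""
    t_c = []
    for tp, tx, ty, tz in zip(t_p, t_x, t_y, t_z):
        if rng.random() < eta:
            # Donor-based value: t_x + sigma_t * (t_y - t_z)
            t_c.append(tx + sigma_t * (ty - tz))
        else:
            # Otherwise inherit the parent's component unchanged.
            t_c.append(tp)
    return t_c


def pick_donors(population_size, parent_index, rng=random):
    """Pick three distinct donor indices (excluding the parent is an assumption)."""
    candidates = [i for i in range(population_size) if i != parent_index]
    return rng.sample(candidates, 3)


if __name__ == "__main__":
    rng = random.Random(42)
    x, y, z = pick_donors(10, parent_index=0, rng=rng)
    print(x, y, z)
    print(crossover_thresholds([0.1, 0.2], [0.5, 0.5], [0.9, 0.3], [0.4, 0.1],
                               eta=0.5, sigma_t=0.5, rng=rng))
```

With η = 0 the offspring copies the parent, and with η = 1 every component comes from the donors, which makes the two extremes easy to check.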
Example of Crossover
[Figure: the parent Ip and the donors Ix, Iy, Iz are each drawn as a row of K centroids with f features per centroid (ω_11 … ω_1f, ω_21 … ω_2f, …, ω_i1 … ω_if, …, ω_K1 … ω_Kf). The offspring Ic has the same layout; in the example its centroids are taken, in order, from Ip; from Ix, Iy, Iz; from Ix, Iy, Iz; and from Ip. Annotation for the exploitation phase: a different σ for each feature f.]
Local Optimization
Objective
Improve the quality of the offspring after crossover.
How
The three techniques implemented for local optimization are:
BreakUp
Breaking up a large cluster into smaller clusters, where a cluster
with more than 40% of the items in the data set is considered large.
Merge
Merging two close clusters to see if fitness can be improved.
Scatter
Redistributing items of a cluster to other clusters to see if fitness
can be improved.
Merge
Objective
The aim is to improve the clustering by recombining clusters that are similar.
How
Find closest clusters Ci and Cj in offspring I.
If on combining Ci and Cj , fitness of I improves, merge and form one
bigger cluster.
Else, search for next closest pair of clusters.
Repeat till k pairs of clusters have been merged, or all pairs have been
considered.
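The merge procedure can be sketched as a greedy loop. This is a sketch, not the implementation behind the slides: it assumes clusters are lists of points, closeness is the Euclidean distance between cluster means, and `fitness` is a caller-supplied function where higher is better (for example a validity index). After each successful merge the pair list is rebuilt, which is a simplification.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def merge_step(clusters, fitness, max_merges):
    """Greedily merge close cluster pairs while each merge improves `fitness`."""
    clusters = [list(c) for c in clusters]
    merges = 0
    improved = True
    while improved and merges < max_merges:
        improved = False
        centers = [centroid(c) for c in clusters]
        pairs = sorted(combinations(range(len(clusters)), 2),
                       key=lambda p: math.dist(centers[p[0]], centers[p[1]]))
        for i, j in pairs:  # closest pair first, then the next closest, ...
            candidate = [c for k, c in enumerate(clusters) if k not in (i, j)]
            candidate.append(clusters[i] + clusters[j])
            if fitness(candidate) > fitness(clusters):
                clusters = candidate
                merges += 1
                improved = True
                break  # cluster indices changed, so rebuild the pair list
    return clusters


if __name__ == "__main__":
    # Toy fitness for illustration only: prefers exactly two clusters.
    toy_fitness = lambda cs: -abs(len(cs) - 2)
    singletons = [[[0, 0]], [[0, 1]], [[10, 0]], [[10, 1]]]
    print(merge_step(singletons, toy_fitness, max_merges=5))
```

The loop stops when the merge budget is used up or when no remaining pair improves the fitness.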
Scatter
Objective
The aim is to get rid of clusters that might affect the fitness adversely.
How
Find the smallest cluster C in I.
If fitness of I improves on redistributing items from C to other clusters,
do it.
Else, search for the next smallest cluster.
Repeat till k clusters have been scattered, or every cluster has been
considered.
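The scatter procedure has the same greedy shape as a merge. In this sketch, items of the dissolved cluster are moved to the remaining cluster with the nearest mean; that redistribution rule is an assumption, since the slide only says items go "to other clusters". `fitness` is again a caller-supplied function where higher is better, and all names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def scatter_step(clusters, fitness, max_scatters):
    """Dissolve small clusters, smallest first, when doing so improves `fitness`."""
    clusters = [list(c) for c in clusters]
    scattered = 0
    improved = True
    while improved and scattered < max_scatters and len(clusters) > 1:
        improved = False
        for i in sorted(range(len(clusters)), key=lambda k: len(clusters[k])):
            rest = [list(c) for k, c in enumerate(clusters) if k != i]
            centers = [centroid(c) for c in rest]
            for x in clusters[i]:  # move each item to the nearest remaining cluster
                nearest = min(range(len(rest)),
                              key=lambda k: math.dist(x, centers[k]))
                rest[nearest].append(x)
            if fitness(rest) > fitness(clusters):
                clusters = rest
                scattered += 1
                improved = True
                break  # sizes changed, so re-rank the clusters
    return clusters


if __name__ == "__main__":
    # Toy fitness for illustration only: prefers exactly two clusters.
    toy_fitness = lambda cs: -abs(len(cs) - 2)
    start = [[[0, 0], [0, 1], [1, 0]], [[10, 0], [10, 1], [11, 0]], [[1, 1]]]
    print(scatter_step(start, toy_fitness, max_scatters=3))
```

Here the singleton cluster is absorbed by its nearest neighbour, and the two larger clusters are left alone because dissolving either would lower the toy fitness.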
Replacement
Compare Ic and Ip and keep the better individual in P.
Perturb population
Idea
In order to prevent premature convergence and avoid getting stuck in local
optima, the population is perturbed at the end of every cycle.
How
Randomly select a number of individuals (not the best) to be modified
from the current population.
Tweak the centroids and threshold values of each selected individual
with a small probability.
Recompute the clustering.
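The perturbation step can be sketched as follows. The slides do not say how values are tweaked, so Gaussian noise, clamping thresholds to [0, 1], and the dictionary layout of an individual are all assumptions made for illustration; `perturb` and its parameters are hypothetical names.

```python
import random


def perturb(population, fitness, n_selected, p_change, scale, rng=random):
    """Randomly nudge some non-best individuals; return the indices touched."""
    best = max(range(len(population)), key=lambda i: fitness(population[i]))
    candidates = [i for i in range(len(population)) if i != best]
    chosen = rng.sample(candidates, min(n_selected, len(candidates)))
    for i in chosen:
        ind = population[i]
        # Each coordinate changes only with probability p_change.
        ind["centroids"] = [
            [v + rng.gauss(0.0, scale) if rng.random() < p_change else v
             for v in c]
            for c in ind["centroids"]
        ]
        ind["thresholds"] = [
            min(1.0, max(0.0, t + rng.gauss(0.0, scale)))
            if rng.random() < p_change else t
            for t in ind["thresholds"]
        ]
        # The clustering of `ind` would be recomputed from its new centroids here.
    return chosen


if __name__ == "__main__":
    pop = [{"centroids": [[0.0, 0.0]], "thresholds": [0.5], "score": s}
           for s in (1, 5, 3)]
    touched = perturb(pop, lambda ind: ind["score"], 2, 0.1, 0.05,
                      random.Random(7))
    print(sorted(touched))
```

Leaving the best individual untouched keeps the best solution found so far while the rest of the population is shaken out of local optima.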
Change Distance Measure
If exploitation phase, D ← MinMax distance.
Termination condition
The cycles repeat until some criteria are met.
Return Best
Return the best clustering found.
Results
Results for DEC were collected to analyze:
the quality of the solution based on different validity
indexes,
the impact of distance measures on the performance of the
algorithm,
different methods for computing clustering,
the effect of minimum number of clusters (Kmin), and
the running time of the algorithm.
Effect of Validity Index
Wine data set: clusters found (rows) against ground-truth classes (columns).

Using DB Index
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (55 items)    51           4            0
Cluster 2 (72 items)    8            64           0
Cluster 3 (51 items)    0            3            48

Using CS Index
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (36 items)    36           0            0
Cluster 2 (86 items)    23           63           0
Cluster 3 (56 items)    0            8            48

Using PBM Index
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (55 items)    49           6            0
Cluster 2 (18 items)    4            13           1
Cluster 3 (105 items)   6            52           47
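One way to compare the three Wine tables numerically is cluster purity: the share of items that fall in the majority class of their cluster. Purity is not used in the slides; it is introduced here only as a reading aid, with the counts copied from the tables.

```python
def purity(table):
    """Fraction of items that belong to the majority class of their cluster."""
    total = sum(sum(row) for row in table)
    return sum(max(row) for row in table) / total


# Rows are clusters, columns are ground-truth classes (Wine, 178 items).
db_table = [[51, 4, 0], [8, 64, 0], [0, 3, 48]]
cs_table = [[36, 0, 0], [23, 63, 0], [0, 8, 48]]
pbm_table = [[49, 6, 0], [4, 13, 1], [6, 52, 47]]

if __name__ == "__main__":
    for name, table in [("DB", db_table), ("CS", cs_table), ("PBM", pbm_table)]:
        print(name, round(purity(table), 3))
```

By this measure the DB clustering matches the ground truth most closely (163 of 178 items), followed by CS (147) and PBM (114).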
Effect of clustering method
Different clusterings can be obtained for a given set of
centroids.
DEC uses the following two methods to recompute
clustering for the final set of centroids, Ω:
The clustering is recomputed using one iteration of
K-means algorithm on Ω (CA1).
Same as above but centroids are recomputed as well
(CA2).
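The two methods can be sketched as one K-means assignment step, optionally followed by a centroid update. This sketch assumes Euclidean distance and leaves the centroid of an empty cluster unchanged; whether CA2 reassigns items again after the update is not stated on the slide, so the sketch does not. The function names `ca1` and `ca2` simply mirror the slide labels.

```python
import math


def ca1(X, centroids):
    """One K-means assignment step: label each item with its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: math.dist(x, centroids[k]))
            for x in X]


def ca2(X, centroids):
    """CA1 followed by recomputing each centroid as the mean of its items."""
    labels = ca1(X, centroids)
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [x for x, label in zip(X, labels) if label == k]
        if members:
            new_centroids.append([sum(col) / len(members)
                                  for col in zip(*members)])
        else:
            new_centroids.append(list(old))  # empty cluster: keep old centroid
    return labels, new_centroids


if __name__ == "__main__":
    X = [[0, 0], [0, 2], [10, 0], [10, 2]]
    omega = [[1, 1], [8, 1]]
    print(ca1(X, omega))
    print(ca2(X, omega))
```

Both methods give the same labels for a given Ω; CA2 additionally moves each centroid to the mean of the items assigned to it, which changes any index computed from the centroids.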
Wine data set: clusters (rows) against ground-truth classes (columns).

GTK: GT Using CA1 (1.14)
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (54 items)    50           3            1
Cluster 2 (66 items)    0            49           17
Cluster 3 (58 items)    9            19           30

Using DEC (1.12)
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (55 items)    51           4            0
Cluster 2 (72 items)    8            64           0
Cluster 3 (51 items)    0            3            48

DEC Using CA1 (1.80)
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (56 items)    50           4            2
Cluster 2 (51 items)    0            43           8
Cluster 3 (71 items)    9            24           38

DEC Using CA2 (0.62)
                        Class 1      Class 2      Class 3
                        (59 items)   (71 items)   (48 items)
Cluster 1 (47 items)    46           1            0
Cluster 2 (67 items)    0            50           17
Cluster 3 (64 items)    13           20           31
Effect of Kmin
[Figure: CS validity index (y-axis, 0 to 4) against the minimum number of clusters required, Kmin (x-axis, 2 to 10). Curves: E.coli-CS, Wine-CS, Glass-CS. Reference levels: Wine-GT, Wine-GTk, Glass-GT, Glass-GTk, E.coli-GT, E.coli-GTk.]
Better fitness with lower values of Kmin.
Worse fitness for the ground truth suggests that external factors might have been used to determine the ground-truth clustering.
Effect of Kmin
[Figure: PBM validity index (y-axis, 0 to 400) against the minimum number of clusters required, Kmin (x-axis, 2 to 10). Curves: E.coli-PBM, Wine-PBM, Glass-PBM. Reference levels: Wine-GT, Wine-GTk, Glass-GT, Glass-GTk, E.coli-GT, E.coli-GTk.]
Better fitness with lower values of Kmin.
Worse fitness for the ground truth suggests that external factors might have been used to determine the ground-truth clustering.
DEC converges to Kmin clusters due to this characteristic of the validity index.
Running Time
The DEC algorithm has a running time of $O(n^2)$.
DEC takes longer to finish for the CS index than for the DB
and PBM indexes.
Example approximate run times of DEC:
size = 200, f ∈ [2, 15]: 200 seconds
size = 300, f > 30: 600 seconds
size = 700, f ∈ [2, 15]: up to 30 minutes
size ≥ 1000, f ∈ [2, 15]: up to 5 hours
DEC vs Existing Algorithms
                            CS                         DB
Data Set   Algorithm        # of clusters   Index      # of clusters   Index
Cancer     DEC(2)           2.00            1.13       2.82            0.58
           ACDE             2.00            0.45       2.05            0.52
           DCPSO            2.25            0.48       2.50            0.57
           GCUK             2.00            0.61       2.50            0.63
           Classical DE     2.25            0.89       2.10            0.51
           Average Link     2.00            0.90       2.00            0.76
Glass      DEC(6)           6.00            0.30       6.00            1.02
           DEC(2)           2.00            0.10       2.00            0.84
           ACDE             6.05            0.33       6.05            1.01
           DCPSO            5.95            0.76       5.95            1.51
           GCUK             5.85            1.47       5.85            1.83
           Classical DE     5.60            0.78       5.60            1.66
           Average Link     6.00            1.02       6.00            1.85
DEC(2) represents the results when Kmin is set to 2.
DEC(k) represents the results when Kmin is set to the known number of clusters (ground truth).
DEC vs Existing Algorithms
                            CS                         DB
Data Set   Algorithm        # of clusters   Index      # of clusters   Index
Iris       DEC(3)           3.00            0.60       3.00            0.56
           DEC(2)           3.00            0.60       2.00            0.42
           ACDE             3.25            0.66       3.05            0.46
           DCPSO            2.23            0.73       2.25            0.69
           GCUK             2.35            0.72       2.30            0.73
           Classical DE     2.50            0.76       2.50            0.58
           Average Link     3.00            0.78       3.00            0.84
Wine       DEC(3)           3.00            0.94       3.00            1.12
           DEC(2)           2.00            0.70       2.00            0.96
           ACDE             3.25            0.92       3.25            3.04
           DCPSO            3.05            1.87       3.05            4.34
           GCUK             2.95            1.58       2.95            5.34
           Classical DE     3.50            1.79       3.50            3.39
           Average Link     3.00            1.89       3.00            5.72