University of Science and Technology Houari Boumediene
The ACO-MEDOIDS
Using Ant Colony Optimization to partition
data into clusters
MOUDJARI Leila
l.moudj11@gmail.com
April 15, 2017
Presentation plan
Introduction
Clustering and related work
What Is Cluster Analysis?
Requirements of clustering
Categorization of Clustering Methods
Clustering and related work
The importance of swarm intelligence and the ACO approach
Ant Colony Optimization
Adaptation of ACO to the medoids problem
ACO-MEDOID algorithm
An ant
The search space
Solution construction
Selecting rule
Fitness function
Pheromone update
The empirical parameters
Conclusion
MOUDJARI Leila | ACO-MEDOIDS
Introduction
Data mining involves several disciplines: database systems,
statistics, machine learning, visualization, information science...
A data mining system can perform several tasks such as
characterization, discrimination, association or correlation
analysis, classification, prediction, clustering, outlier analysis, or
evolution analysis. These tasks can be classified as supervised
or unsupervised. Data clustering is an unsupervised learning task
and one of the most challenging problems in data mining. It is
also classified as an NP-hard problem.
Swarm intelligence is one of the strongest disciplines to have
faced this class of problems, and it remains reliable. We
therefore leaned towards this discipline, as other researchers did.
Over the last few years, many works have been presented in this area.
We mention BAT-CLARA [1], Association Rule Mining Based on
Bat Algorithm [2], MACOC: a medoid-based ACO clustering
algorithm [3], SACOC: a spectral-based ACO clustering
algorithm [4]...
Clustering is a large field, and much work is still needed in
its different areas. We concentrate ours on
partitioning algorithms, precisely on partitioning the dataset into k
clusters, which is also an NP-hard task.
The most well-known and commonly used partitioning methods
are k-means, k-medoids (PAM), and their variations [5],
such as CLARA, CLARANS, and CLAM (a more recent one, from 2011, which
uses a hybrid metaheuristic between VNS and Tabu Search to solve the
problem of k-medoid clustering) [6].
We present here an algorithm for k-medoid clustering based
on an ACO solution search: ACO-medoids. As its name
indicates, the algorithm uses ant colony optimization to
explore the search space for an optimal set of medoids,
and relies on k-medoids for the necessary clustering concepts.
Clustering and related work
What Is Cluster Analysis?
Clustering is an unsupervised learning process: it does
not rely on predefined classes or class-labeled
training examples. It is therefore considered a
form of learning by observation rather than by examples.
It aims to reduce the data size by grouping similar
objects in one cluster. Given a set of data
objects, a clustering algorithm must be capable of
grouping the objects into classes so that a
high intra-group similarity and a low inter-group
similarity are ensured.
The similarity or dissimilarity is assessed via a
distance measure (Euclidean or Manhattan distance;
other distance measures may also be used).
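To make the distance measure concrete, here is a minimal Python sketch of the two distances named above and of a pairwise distance matrix built from them. The function names and the sample points are illustrative assumptions, not part of the original slides.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(points, dist=euclidean):
    """Symmetric n x n matrix M with M[i][j] = dist(points[i], points[j])."""
    n = len(points)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            M[i][j] = M[j][i] = dist(points[i], points[j])
    return M

# Illustrative data: three points on a line through the origin.
points = [(0, 0), (3, 4), (6, 8)]
M_euclid = distance_matrix(points)
M_manhattan = distance_matrix(points, dist=manhattan)
```

A matrix of this shape is what the later algorithms take as the similarity (distance) matrix M.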
Requirements of clustering
Scalability,
Ability to deal with different types of attributes,
Ability to deal with noisy data,
High dimensionality (number of attributes)...
Categorization of Clustering Methods
In general, clustering algorithms can be classified into the following
categories:
Partitioning methods
This class is characterized by a predefined number k of partitions,
where each partition represents a cluster. Each cluster must contain
at least one object, and each object must belong to exactly one group.
The best-known methods are k-means and k-medoids.
Hierarchical methods
These create a hierarchical decomposition of the dataset and can be
classified as either agglomerative (bottom-up) or divisive
(top-down).
Density-based methods
Unlike partitioning methods, these are based on the notion of density
(number of objects or data points) instead of distance. They continue
growing a given cluster as long as the density in the “neighborhood”
exceeds some threshold. DBSCAN and its extension,
OPTICS, are typical density-based methods.
There are also grid-based methods, model-based methods,
constraint-based clustering...
k-means algorithm
Input: k (the number of clusters),
       D (a data set containing n objects).
Output: A set of k clusters.
Begin
    arbitrarily choose k objects from D as the
    initial cluster centers;
    repeat
        (re)assign each object to the cluster to
        which the object is the most similar;
        update the cluster means, i.e., calculate the
        mean value of the objects for each cluster;
    until no change;
End.
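The k-means loop can be sketched in Python as follows. This is a minimal illustration under assumptions of my own: squared Euclidean distance as the similarity, a `max_iter` safety bound, a fixed random seed, and the choice to keep the old center when a cluster becomes empty.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # arbitrary initial cluster centers
    labels = None
    for _ in range(max_iter):
        # (Re)assign each object to the nearest center (squared Euclidean).
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:         # no change: stop
            break
        labels = new_labels
        # Update each center to the mean of the objects assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                  # assumption: keep old center if empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centers

# Illustrative data: two well-separated groups of three points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(data, k=2)
```

Note that the returned centers are means, so they are generally not objects of the data set; this is the property that k-medoids changes.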
k-medoids algorithm
Input: k (the number of clusters),
       D (a data set containing n objects).
Output: A set of k clusters.
Begin
    arbitrarily choose k objects from D as the initial cluster centers;
    repeat
        assign each remaining object to the nearest cluster;
        randomly select a nonrepresentative object, o_rand;
        compute the total cost, S, of swapping a
        representative object, o_j, with o_rand;
        if S < 0 then swap o_j with o_rand to form the new set of k
        medoids;
    until no change;
End.
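A small Python sketch of this swap-based search is given below. It is an illustration, not PAM as published: the bounded number of random swap attempts (`max_tries`) stands in for "until no change", and the function names, the seed and the 1-D sample data are assumptions.

```python
import random

def total_cost(M, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(M[m][j] for m in medoids) for j in range(len(M)))

def k_medoids(M, k, max_tries=200, seed=0):
    """Randomized swap search over medoid sets, given a distance matrix M."""
    rng = random.Random(seed)
    n = len(M)
    medoids = rng.sample(range(n), k)    # arbitrary initial representatives
    cost = total_cost(M, medoids)
    for _ in range(max_tries):
        non_medoids = [o for o in range(n) if o not in medoids]
        if not non_medoids:
            break
        o_rand = rng.choice(non_medoids) # random nonrepresentative object
        pos = rng.randrange(k)           # which representative o_j to replace
        candidate = medoids[:pos] + [o_rand] + medoids[pos + 1:]
        new_cost = total_cost(M, candidate)
        S = new_cost - cost              # total cost of the swap
        if S < 0:                        # accept only improving swaps
            medoids, cost = candidate, new_cost
    return sorted(medoids), cost

# Illustrative 1-D data: two groups centered on 1 and 11.
pts = [0, 1, 2, 10, 11, 12]
M = [[abs(a - b) for b in pts] for a in pts]
medoids, cost = k_medoids(M, k=2)
```

Every call to `total_cost` scans all n objects against all k medoids, which hints at why the method becomes slow on large data sets.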
K-medoids was presented as a solution to some of the flaws of k-means,
such as its sensitivity to outliers and the fact that the centroids are
abstract objects. PAM showed that real objects diminish the error
value. However, it has shortcomings of its own. On large datasets it
falls behind because of the significant amount of time needed to construct
the set of medoids. Researchers have nevertheless tried to improve it,
which is why the clustering field witnessed the birth of its variations:
CLARA, followed by CLARANS and others, as already mentioned. However,
the problem persists: how can we gain in scalability without
losing quality?
In recent years clustering has drawn the attention of the metaheuristic
community, and several works have been presented. One of the
promising optimization methods is ACO.
The importance of swarm intelligence and the
ACO approach
Swarm intelligence
It is well known that we are more effective when we work with
others than when we work in isolation, and this is the core of
swarm intelligence.
Swarm intelligence is based on the collective behavior of
species. Each method results from observing nature and
analyzing intelligent forms of group behavior, and it
simulates the studied behaviors of social insects, animals
and humans. It gained popularity with the rise of artificial
intelligence in the 80s, especially for
combinatorial problems. Such problems are divided into classes:
P (polynomial), NP, NP-complete and NP-hard. For the latter two,
no polynomial-time algorithm is known, and exact methods generally
have an exponential complexity.
Problem ∈ [NP-hard | NP-complete] ==> call 911.
Clustering ∈ [NP-hard] ==> swarm intelligence.
Ant Colony Optimization
ACO has shown its strength when dealing with problems related to
graphs.
It was inspired by a fascination with ants and how they work in
harmony to find food and build a habitat.
They cooperate and help each other by sharing useful
information such as the path to take or to avoid.
They communicate through a substance they release, called
"pheromone", which is a form of stigmergy.
ACO-based algorithms are used widely and across domains,
so ACO has been adapted to several types of problems.
Ant Colony Optimization: the algorithm
ACO algorithm
Begin
    while (stop conditions not met) do
        for k = 1 to Nb-ants do
            build a solution (S_k);
            evaluate (S_k);
            apply online pheromone update;
        end-for;
        determine the best solution of the current iteration;
        apply offline pheromone update;
    end-while;
End.
One of the advantages of applying ACO algorithms to
clustering problems is that ACO performs a global search in the
solution space, which makes it less likely to get trapped in local minima
and thus gives it the potential to find more accurate solutions [7].
The algorithm uses an iterative search strategy to find an
approximately optimal solution, guided by the pheromone trail and a
heuristic.
ACO has been successfully applied to multiple problems.
Works on unsupervised learning have focused on clustering,
showing the potential of ACO-based techniques.
Clustering and ACO
Nevertheless, more work needs to be done, especially on
medoid-based clustering, which, compared to classical
centroid-based techniques, is more efficient. In this area, diverse
algorithms have been proposed, such as:
"An adaptive multi-agent ACO clustering algorithm" in 2005 by
Weijiao Zhang and Chunhuang Liu.
"Classification with cluster-based Bayesian multi-nets using Ant
Colony Optimization" in 2014 by Khalid M. Salama and Alex A.
Freitas.
Also "MACOC: a medoid-based ACO clustering algorithm" in 2014 [3].
Recently, "Medoid-based clustering algorithms using ant
colony optimization" (METACOC and METACOC-K) was
proposed in 2016 (Héctor D. Menéndez, Fernando E. B. Otero,
David Camacho) [7].
...etc.
Adaptation of ACO to the medoids problem
ACO-MEDOID algorithm
"ACO-medoids" searches for the best set of k medoids. It is based on
ant colony optimization and on k-medoids. We start with the
general form of the algorithm.
ACO-medoid algorithm
Input: k (the number of clusters),
       D (a data set containing n objects),
       M (the similarity (distance) matrix).
Output: A set of k clusters.
Begin
    // start by creating the initial population
    foreach ant do
        arbitrarily choose k objects from D as the initial solution of the ant;
    end-foreach;
    while (change and i < Max-Iter) do
        foreach ant do
            build a solution (S_k);
            evaluate (S_k);
            update Abest and Vbest (defined under "An ant");
            apply online pheromone update;
        end-foreach;
        determine the best solution of the current iteration;
        apply offline pheromone update;
        i = i + 1;
    end-while;
End.
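The control flow of this main loop can be sketched in Python as below. This is only an illustration of the loop structure, not the author's implementation: the four callables (`build`, `evaluate`, `online_update`, `offline_update`), the dictionary layout of an ant, and the reading of "change" as "some ant improved its personal best during the last iteration" are all assumptions. The usage part plugs in deliberately trivial stand-ins (random construction, counting updates) just to exercise the loop.

```python
import random

def aco_medoids(n, k, evaluate, build, online_update, offline_update,
                n_ants=5, max_iter=20, seed=0):
    """Skeleton of the ACO-medoid loop; the callables are placeholders."""
    rng = random.Random(seed)
    # Initial population: every ant starts from k arbitrary objects.
    ants = []
    for _ in range(n_ants):
        sol = rng.sample(range(n), k)
        ants.append({"sol": sol, "Abest": sol, "Vbest": evaluate(sol)})
    changed, i = True, 0
    while changed and i < max_iter:
        changed = False
        for ant in ants:
            ant["sol"] = build(ant, rng)
            cost = evaluate(ant["sol"])
            if cost < ant["Vbest"]:              # update Abest and Vbest
                ant["Abest"], ant["Vbest"] = ant["sol"], cost
                changed = True
            online_update(ant["sol"])
        # Best solution of the current iteration drives the offline update.
        it_best = min(ants, key=lambda a: evaluate(a["sol"]))
        offline_update(it_best["sol"], evaluate(it_best["sol"]))
        i += 1
    return min(ants, key=lambda a: a["Vbest"])

# Stand-in components on illustrative 1-D data (all hypothetical).
pts = [0, 1, 2, 10, 11, 12]
M = [[abs(a - b) for b in pts] for a in pts]
calls = {"online": 0, "offline": 0}

def cost_of(sol):
    return sum(min(M[m][j] for m in sol) for j in range(len(M)))

def random_build(ant, rng):
    return rng.sample(range(len(M)), 2)

def count_online(sol):
    calls["online"] += 1

def count_offline(sol, cost):
    calls["offline"] += 1

best = aco_medoids(len(M), 2, cost_of, random_build, count_online, count_offline)
```

In a fuller version the stand-ins would be replaced by the construction, fitness and pheromone procedures described in the following slides.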
An ant
An ant is a virtual agent in the multi-dimensional search
space. It has the following properties:
sol: the current solution of the ant,
Abest: the best solution found so far by the ant,
Vbest: the evaluation (cost) of Abest.
The search space
It includes all potential combinations of objects that can form a set of
medoids (solutions), subject to the similarity/dissimilarity constraint of
clustering. The number of possible solutions depends on k (the
number of clusters). If n objects need to be placed in
k clusters, the number is determined as follows:
For the first medoid we have n possibilities,
for the next one we have n − 1,
for the kth we have n − k + 1 possibilities.
The total number of ordered selections is then
\[
n (n - 1) \cdots (n - k + 1) = \frac{n!}{(n - k)!}
\]
which grows on the order of n^k. Ignoring the order of the medoids still
leaves n! / (k! (n − k)!) distinct sets, far too many to enumerate
exhaustively for realistic n and k.
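The size of the search space can be checked numerically. A minimal sketch, assuming Python 3.8 or later for `math.perm` and `math.comb`; the function names and the sample values of n and k are illustrative.

```python
import math

def ordered_selections(n, k):
    """n * (n - 1) * ... * (n - k + 1): medoids picked one after another."""
    return math.perm(n, k)

def medoid_sets(n, k):
    """Number of distinct sets of k medoids among n objects (order ignored)."""
    return math.comb(n, k)

# Print both counts for a few illustrative problem sizes.
for n, k in [(10, 3), (100, 5), (1000, 10)]:
    print(n, k, ordered_selections(n, k), medoid_sets(n, k))
```

Even for modest sizes the counts become very large, which is the motivation for a guided search instead of exhaustive enumeration.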
Solution construction
To build a solution, an ant has two possible strategies: exploit
or explore. The first is a local-search-based method that helps an ant
improve its solution; the second helps it explore new promising
regions, as shown in the following pseudocode:
Procedure: constructSolution
Input: an ant
Output: an ant
Begin
    // choose a strategy randomly
    S0: a random variable uniformly distributed in [0, 1];
    if S0 <= Sp then
        sol = explore;
        localSearch(sol);
    else
        localSearch(Abest);
    endif;
End.
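The strategy choice can be sketched in Python as follows. The dictionary representation of an ant, the `explore` and `local_search` callables passed as arguments, and the decision to store the local-search result in `sol` in both branches are assumptions made for illustration.

```python
import random

def construct_solution(ant, Sp, explore, local_search, rng=random):
    """Choose explore or exploit at random and store the result in ant['sol']."""
    S0 = rng.random()                          # uniform draw in [0, 1)
    if S0 <= Sp:
        # Diversification: build a fresh solution, then refine it.
        ant["sol"] = local_search(explore())
    else:
        # Intensification: refine the ant's best solution so far.
        ant["sol"] = local_search(ant["Abest"])
    return ant

# Illustrative usage with stand-in procedures (hypothetical).
ant = {"sol": [0, 3], "Abest": [1, 4], "Vbest": 4}
ant = construct_solution(
    ant,
    Sp=0.3,
    explore=lambda: [2, 5],
    local_search=lambda s: list(s),            # identity stand-in
    rng=random.Random(0),
)
```

A larger Sp makes exploration more frequent, a smaller Sp favors refining Abest.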
Procedure: explore
Input: D, the dataset
Output: s
Begin
    s = empty; i = 0;
    while (i < k and D not empty) do
        select o_i from D using the selecting rule (see "Selecting rule");
        append o_i to s;
        eliminate o_i from D;
        i = i + 1;
    endwhile;
End.
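A possible Python rendering of the exploration step is shown below. The `select` callable and its signature (candidates, medoids chosen so far) are hypothetical; here a uniform random choice stands in for the pheromone-based selecting rule.

```python
import random

def explore(D, k, select):
    """Pick up to k distinct objects from D, one at a time, using `select`."""
    remaining = list(D)          # copy, so the caller's dataset is untouched
    s = []
    while len(s) < k and remaining:
        o = select(remaining, s) # hypothetical signature: (candidates, chosen)
        s.append(o)
        remaining.remove(o)      # an object cannot be selected twice
    return s

# Illustrative usage: uniform random selection as a stand-in rule.
rng = random.Random(0)
sol = explore(range(10), 3, lambda candidates, chosen: rng.choice(candidates))
```

Removing each selected object from the candidate list is what guarantees k distinct medoids.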
Procedure: localSearch
Input: the solution to be improved
Output: a solution
Begin
    for j = 1 to lmax do
        for m = 1 to mds (see "The empirical parameters") do
            C: the cluster corresponding to Sol[m];
            choose an object o_rand from C;
            compute the total cost S of swapping the
            representative object Sol[m] with o_rand;
            if S < 0 then swap o_rand with Sol[m] to
            form the new set of k representative objects;
            update clusters;
        endfor;
    endfor;
End.
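The local search can be sketched in Python as follows. This is an illustration under assumptions: objects are assigned to their nearest medoid, the first `mds` medoids are the ones updated (the slides also allow a random choice), and the helper names, the seed and the sample data are mine.

```python
import random

def cost(M, sol):
    """Sum of distances from every object to its nearest medoid."""
    return sum(min(M[m][j] for m in sol) for j in range(len(M)))

def assign(M, sol):
    """clusters[m] lists the objects whose nearest medoid is sol[m]."""
    clusters = [[] for _ in sol]
    for j in range(len(M)):
        nearest = min(range(len(sol)), key=lambda m: M[sol[m]][j])
        clusters[nearest].append(j)
    return clusters

def local_search(M, sol, lmax=10, mds=None, seed=0):
    """Try swapping each medoid with a random member of its own cluster."""
    rng = random.Random(seed)
    sol = list(sol)
    mds = len(sol) if mds is None else mds
    clusters = assign(M, sol)
    for _ in range(lmax):
        for m in range(mds):
            if not clusters[m]:              # can happen with duplicate objects
                continue
            o_rand = rng.choice(clusters[m])
            candidate = sol[:m] + [o_rand] + sol[m + 1:]
            S = cost(M, candidate) - cost(M, sol)
            if S < 0:                        # keep only improving swaps
                sol = candidate
                clusters = assign(M, sol)    # update clusters
    return sol

# Illustrative 1-D data; start from the leftmost object of each group.
pts = [0, 1, 2, 10, 11, 12]
M = [[abs(a - b) for b in pts] for a in pts]
improved = local_search(M, [0, 3], lmax=50)
```

Restricting the candidate to the medoid's own cluster keeps each swap local, in contrast to the k-medoids swap, which draws from all nonrepresentative objects.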
Selecting rule
The selection process tries to find the object in D that is furthest
from the set of objects already chosen as medoids, using the
following formula:
\[
j =
\begin{cases}
\arg\max_{u \in Y} \{T(u)\} & \text{if } q \le q_0 \\
\arg\max_{u \in Y} \{P(u)\} & \text{otherwise}
\end{cases}
\tag{1}
\]
\[
P_u(t) = \frac{T_u(t)}{\sum_{v \in Y} T_v(t)}
\]
where:
T_j is the pheromone amount of the jth object ∈ D,
Y is the set of possible medoids,
P is the probability that data instance j could be selected as a medoid,
q is a random number distributed uniformly in [0, 1],
q0 is an empirical parameter.
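A Python sketch of the selecting rule follows. One assumption should be stated clearly: in the "otherwise" branch the sketch draws an object at random with probability P_u, as in the usual pseudo-random proportional rule of Ant Colony System, whereas the slide writes that branch as a maximum over P. The function name, the dictionary holding the pheromone values, and the sample numbers are illustrative, and pheromone values are assumed to be positive.

```python
import random

def select_medoid(T, Y, q0, rng=random):
    """Choose one object from the candidates Y using pheromone amounts T."""
    q = rng.random()                         # uniform in [0, 1)
    if q <= q0:
        # Greedy branch: the candidate with the most pheromone.
        return max(Y, key=lambda u: T[u])
    # Probabilistic branch: P_u = T_u / sum of T_v over the candidates.
    total = sum(T[v] for v in Y)
    weights = [T[u] / total for u in Y]
    return rng.choices(list(Y), weights=weights, k=1)[0]

# Illustrative pheromone values for four candidate objects.
T = {0: 0.1, 1: 0.6, 2: 0.2, 3: 0.1}
chosen = select_medoid(T, [0, 1, 2, 3], q0=0.7, rng=random.Random(0))
```

A high q0 makes the choice mostly greedy, while a low q0 leaves more room for objects with less pheromone.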
Fitness function
It is used to evaluate a solution: it represents the cost (E_cost) of a
solution. To compare two solutions, we calculate
S = S_new − S_old. If S is negative, then the new solution is better than
the old one.
\[
E_{cost} = \sum_{i=1}^{k} \sum_{j=1}^{C} M[m, j]
\]
where:
M is the distance matrix,
C is the number of objects in cluster_i,
m is the medoid of cluster_i, and j ranges over the objects of cluster_i.
Another possible objective function is the sum of the probabilities P
calculated with the following formula. The aim is to maximize it:
\[
P =
\begin{cases}
1 & \text{if } j = \arg\max_{u \in Y} \{T(u)\} \\
0 & \text{otherwise}
\end{cases}
\qquad \text{if } q \le q_0
\tag{2}
\]
Otherwise, P is calculated as in the selecting rule.
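The cost-based fitness can be illustrated with a short Python sketch. It assumes that each object belongs to the cluster of its nearest medoid, so the double sum reduces to one nearest-medoid distance per object; the function names and the 1-D data are illustrative.

```python
def e_cost(M, medoids):
    """E_cost: each object contributes its distance to the nearest medoid."""
    return sum(min(M[m][j] for m in medoids) for j in range(len(M)))

def swap_delta(M, old, new):
    """S = S_new - S_old; a negative value means `new` is the better set."""
    return e_cost(M, new) - e_cost(M, old)

# Illustrative 1-D data: two groups centered on 1 and 11.
pts = [0, 1, 2, 10, 11, 12]
M = [[abs(a - b) for b in pts] for a in pts]

print(e_cost(M, [0, 3]))              # 6
print(swap_delta(M, [0, 3], [1, 4]))  # -2
```

Here moving both medoids to the middle of their groups lowers the cost from 6 to 4, so S is negative and the new set would be kept.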
Pheromone update
For the pheromone updates we use both the online and the offline
updates, calculated as follows:
Online update
\[
T_i = (1 - \rho)\, T_i(t) + \rho\, \tau_0
\]
where:
ρ is the evaporation rate and also an empirical parameter,
τ_0 is the initial value of pheromone.
Offline update
At the end of each iteration, the offline update is performed: the ant with the
best current solution deposits an amount of pheromone equal to
ΔT_i(t). The update is performed using this formula:
\[
T_i = (1 - \rho)\, T_i(t) + \rho\, \Delta T_i(t)
\]
where:
\[
\Delta T_i(t) =
\begin{cases}
\frac{1}{C} & \text{if the ant uses the object } i \\
0 & \text{otherwise}
\end{cases}
\tag{3}
\]
C: the cost of the ant’s solution (E_cost, see "Fitness function").
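Both updates can be sketched in Python as below. Two points are assumptions rather than statements from the slides: the online update is applied only to the objects used in the ant's solution, and the offline update is applied to every object (with a deposit of zero for unused ones). The list `T` holding one pheromone value per object and the sample numbers are illustrative.

```python
import math

def online_update(T, solution, rho, tau0):
    """Pull the pheromone of each object used by an ant back toward tau0."""
    for i in solution:
        T[i] = (1 - rho) * T[i] + rho * tau0

def offline_update(T, best_solution, best_cost, rho):
    """Evaporate everywhere; the best ant deposits 1 / cost on its medoids."""
    used = set(best_solution)
    for i in range(len(T)):
        delta = 1.0 / best_cost if i in used else 0.0
        T[i] = (1 - rho) * T[i] + rho * delta

# Illustrative run on four objects with rho = 0.5.
T = [1.0, 1.0, 1.0, 1.0]
online_update(T, [0, 2], rho=0.5, tau0=0.2)        # T[0] and T[2] move to 0.6
offline_update(T, [0], best_cost=4.0, rho=0.5)     # only object 0 gets 1/4
assert math.isclose(T[0], 0.425)
```

Since the deposit is 1/C, cheaper solutions reinforce their medoids more strongly.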
The empirical parameters
This section presents the different empirical parameters that need to
be defined in order to improve the solution quality.
Parameter      Role
A              Number of ants
Max-Iter       Number of iterations of the algorithm
lmax           Number of iterations of the local search
Sp in [0,1]    Intensification/diversification rate (the strategy rate)
q0             Selection rate
mds            Number of clusters to be updated (can be equal to k or
               randomly chosen each time in [1, k])
ρ              Evaporation rate
Conclusion
We presented some ideas for using ACO to solve the medoids
problem, through a proposed medoid- and ACO-based clustering
algorithm that we call "ACO-medoids". It is based on the ants’ collective
behavior and on k-medoids for building the clusters. Implementation and
tests are still needed before we can be conclusive about the
algorithm's behavior. However, swarm-based algorithms, including ACO,
have shown that they can reduce the time and space needed to find good
solutions to NP-hard problems. We therefore believe that the algorithm
can provide a near-optimal solution in a finite amount of time.
Thank you!
Bibliography
[1] Yasmine Aboubi, Habiba Drias, and Nadjet Kamel.
BAT-CLARA: BAT-inspired algorithm for clustering large applications.
IFAC-PapersOnLine, 49(12):243–248, 2016.
[2] Kamel Eddine Heraguemi, Nadjet Kamel, and Habiba Drias.
Association rule mining based on bat algorithm.
Journal of Computational and Theoretical Nanoscience,
12(7):1195–1200, 2015.
[3] Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho.
MACOC: a medoid-based ACO clustering algorithm.
DOI: 10.1007/978-3-319-09952-1_11, 2014.
[4] Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho.
SACOC: a spectral-based ACO clustering algorithm.
DOI: 10.1007/978-3-319-10422-5_20, 2014.
[5] Data mining: concepts and techniques (second edition).
Elsevier, 2011.
[6] Q. Nguyen and V. J. Rayward-Smith.

Clustering - K-Means, DBSCANClustering - K-Means, DBSCAN
Clustering - K-Means, DBSCAN
 
Multilevel techniques for the clustering problem
Multilevel techniques for the clustering problemMultilevel techniques for the clustering problem
Multilevel techniques for the clustering problem
 
Knowledge Discovery & Representation
Knowledge Discovery & RepresentationKnowledge Discovery & Representation
Knowledge Discovery & Representation
 
Applications Of Clustering Techniques In Data Mining A Comparative Study
Applications Of Clustering Techniques In Data Mining  A Comparative StudyApplications Of Clustering Techniques In Data Mining  A Comparative Study
Applications Of Clustering Techniques In Data Mining A Comparative Study
 
Clustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfClustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdf
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdf
 
Data Clustering Using Swarm Intelligence Algorithms An Overview
Data Clustering Using  Swarm Intelligence Algorithms  An OverviewData Clustering Using  Swarm Intelligence Algorithms  An Overview
Data Clustering Using Swarm Intelligence Algorithms An Overview
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10
 
Data clustering and optimization techniques
Data clustering and optimization techniquesData clustering and optimization techniques
Data clustering and optimization techniques
 
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERINGA SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b
 

ACO-medoids

  • 1. University of Science and Technology Houari Boumediene. The ACO-MEDOIDS: Using the Ant Colony Optimization for partitioning data into clusters. MOUDJARI Leila, l.moudj11@gmail.com. April 15, 2017
  • 2. Presentation plan: Introduction; Clustering and related work (What Is Cluster Analysis?, Requirements of clustering, Categorization of Clustering Methods, Clustering and related work); The importance of swarm intelligence and the ACO approach (Ant Colony Optimization); Adaptation of ACO to the medoids problem (ACO-MEDOID algorithm, An ant, The search space, Solution construction, Selecting rule, Fitness function, Pheromone update, The empirical parameters); Conclusion. MOUDJARI Leila | ACO-MEDOIDS
  • 6. Introduction. Data mining is used in several disciplines: database systems, statistics, machine learning, visualization, information science... A data mining system can perform several tasks, such as characterization, discrimination, association or correlation analysis, classification, prediction, clustering, outlier analysis, or evolution analysis. These tasks can be classified as supervised or unsupervised. Data clustering is an unsupervised learning task and one of the most challenging problems in data mining; it is also classified as an NP-hard problem. Swarm intelligence is one of the strongest disciplines to have tackled this class of problems, and it remains a reliable choice. We therefore leaned towards it, as other researchers have.
  • 8. Introduction. Over the last few years, many works have been presented in this area, among them BAT-CLARA [1], Association Rule Mining Based on Bat Algorithm [2], MACOC: a medoid-based ACO clustering algorithm [3], SACOC: a spectral-based ACO clustering algorithm [4]...
  • 12. Introduction. Clustering is a large field, and much work may still be needed in its different areas. We concentrate ours on partitioning algorithms, precisely on partitioning the dataset into k clusters, which is also an NP-hard task. The most well-known and commonly used partitioning methods are k-means, k-medoids (PAM), and their variations [5], such as CLARA, CLARANS, and CLAM (a more recent one, from 2011, which uses a hybrid metaheuristic combining VNS and Tabu Search to solve the k-medoid clustering problem) [6]. We hereby present ACO-medoids, an algorithm for k-medoid clustering based on an ACO solution search. As its name indicates, the algorithm uses ant colony optimisation to explore the search space for an optimal set of medoids, and relies on k-medoids for the necessary clustering concepts.
  • 17. Clustering and related work: What Is Cluster Analysis? Clustering is an unsupervised learning process: it does not rely on predefined classes or class-labeled training examples, so it is considered a form of learning by observation rather than by examples. It aims to reduce the data size by grouping similar objects in one cluster. Given a set of data objects, a clustering algorithm must be capable of grouping them into classes so that high intra-group similarity and low inter-group similarity are ensured. Similarity or dissimilarity is assessed via a distance measure (Euclidean or Manhattan distance; other distance measures may also be used).
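The two distance measures named on this slide can be sketched in a few lines. This is only an illustration: the function names `euclidean`, `manhattan`, and `distance_matrix` are hypothetical and do not come from the presentation.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(points, dist=euclidean):
    """Symmetric n x n matrix of pairwise distances between the points."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
```

A matrix built this way could play the role of the distance matrix M that the later slides take as input, although the deck does not say how M is computed.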
  • 22. Clustering and related work: Requirements of clustering. Scalability; ability to deal with different types of attributes; ability to deal with noisy data; high dimensionality (number of attributes)...
  • 25. Categorization of Clustering Methods. In general, these algorithms can be classified into the following categories. Partitioning methods: this class is characterized by a predefined number k of partitions, each of which represents a cluster. Each cluster must contain at least one object, and an object must belong to at most one group. The best-known methods are k-means and k-medoids. Hierarchical methods: these create a hierarchical decomposition of the dataset and can be classified as either agglomerative (bottom-up) or divisive (top-down).
  • 27. Categorization of Clustering Methods. Density-based methods: unlike partitioning methods, these are based on the notion of density (number of objects or data points) instead of distance. They keep growing a given cluster as long as the density in the “neighborhood” exceeds some threshold. DBSCAN and its extension OPTICS are typical density-based methods. There are also grid-based methods, model-based methods, constraint-based clustering...
  • 28. Clustering and related work: k-means algorithm.
       Input: k (the number of clusters), D (a data set containing n objects).
       Output: a set of k clusters.
       Begin
         arbitrarily choose k objects from D as the initial cluster centers;
         repeat
           (re)assign each object to the cluster to which the object is the most similar;
           update the cluster means, i.e., calculate the mean value of the objects for each cluster;
         until no change;
       End.
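The k-means steps above can be sketched as follows. This is a minimal illustration, not the presenter's code: it assumes squared Euclidean distance as the similarity measure, adds a `max_iter` safety cap that the slide does not mention, and leaves a center unchanged if its cluster becomes empty.

```python
import random


def k_means(points, k, max_iter=100, seed=0):
    """Cluster tuples of numbers into k groups; returns (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # arbitrary initial cluster centers
    assignment = None
    for _ in range(max_iter):
        # (Re)assign each object to the closest center (squared Euclidean).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:  # "until no change"
            break
        assignment = new_assignment
        # Update each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # assumption: keep the old center for an empty cluster
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment
```

Note that the returned centers are means, so they are generally not objects of the data set; this is the "abstract centroid" property that k-medoids avoids.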
  • 29. Clustering and related work: k-medoids algorithm.
       Input: k (the number of clusters), D (a data set containing n objects).
       Output: a set of k clusters.
       Begin
         arbitrarily choose k objects from D as the initial cluster centers;
         repeat
           assign each remaining object to the nearest cluster;
           randomly select a nonrepresentative object, o_rand;
           compute the total cost, S, of swapping a representative object, o_j, with o_rand;
           if S < 0 then swap o_j with o_rand to form the new set of k medoids;
         until no change;
       End.
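The swap loop above can be sketched over a precomputed distance matrix. This is a rough sketch under stated assumptions: the representative object o_j to replace is drawn at random, and the "until no change" test is replaced by a fixed budget of `max_tries` attempts. Both choices and all names are hypothetical, since the slide fixes neither.

```python
import random


def total_cost(dist, medoids):
    """Sum, over all objects, of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def k_medoids(dist, k, max_tries=200, seed=0):
    """Randomized swap search; returns (medoid indices, total cost)."""
    rng = random.Random(seed)
    n = len(dist)
    medoids = rng.sample(range(n), k)  # arbitrary initial representatives
    cost = total_cost(dist, medoids)
    for _ in range(max_tries):
        non_medoids = [i for i in range(n) if i not in medoids]
        if not non_medoids:
            break
        o_rand = rng.choice(non_medoids)   # random nonrepresentative object
        j = rng.randrange(k)               # assumption: o_j is picked at random
        candidate = medoids[:j] + [o_rand] + medoids[j + 1:]
        new_cost = total_cost(dist, candidate)
        s = new_cost - cost                # total cost S of the swap
        if s < 0:                          # keep the swap only if it improves
            medoids, cost = candidate, new_cost
    return medoids, cost
```

Every accepted swap strictly lowers the cost, so the returned cost is never worse than that of the initial random set.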
  • 30. Clustering and related work. K-medoids was presented as a solution to some of k-means' flaws, such as its sensitivity to outliers and the fact that its centroids are abstract objects. PAM showed that using real objects as representatives reduces the error value. However, it has shortcomings of its own: on large datasets it loses out because of the significant time needed to construct the set of medoids. Researchers have nevertheless tried to improve it, which is why the clustering field witnessed the birth of its variations: CLARA, followed by CLARANS and the others already mentioned. Yet the problem persists: how can we gain scalability without losing quality? In recent years clustering has drawn the attention of the metaheuristic community, and several works have been presented. One of the promising optimization methods is ACO.
  • 33. The importance of swarm intelligence and the ACO approach: Swarm intelligence. It is well known that we are more effective when we work with others than when we work in isolation, and this is the core of swarm intelligence. Swarm intelligence is based on the collective behavior of species. Each method results from observing nature and analyzing intelligent forms of group behavior, and simulates the studied collective behaviors of insects, animals, and humans. It gained popularity with the surge of artificial intelligence in the 1980s, especially for combinatorial problems. Such problems are divided into classes: P (polynomial), NP, NP-complete, and NP-hard. For the latter two, the known exact algorithms generally have exponential complexity.
  • 34. The importance of swarm intelligence and the ACO approach. Problem ∈ [NP-hard | NP-complete] ==> call 911. Clustering ∈ [NP-hard] ==> swarm intelligence.
  • 35. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization. ACO has shown its strength on graph-related problems. It was inspired by a fascination with ants and how they work in harmony to feed themselves and build a habitat. They cooperate by sharing useful information, such as which path to take or to avoid, and they communicate through a substance they release called "pheromone", a form of stigmergy. ACO-based algorithms are used very widely and in domain-specific ways, so ACO has been adapted to several types of problems.
  • 36. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization, the algorithm.
       ACO algorithm
       Begin
         while (not stop conditions) do
           for k = 1 to Nb-ants do
             build a solution (S_k);
             evaluate (S_k);
             apply online pheromone update;
           end-for;
           determine the best solution of the current iteration;
           apply offline pheromone update;
         end-while;
       End.
  • 37. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization. One of the advantages of applying ACO algorithms to clustering problems is that ACO performs a global search in the solution space, which makes it less likely to get trapped in local minima and thus gives it the potential to find more accurate solutions [7]. The algorithm uses an iterative search strategy to find an approximately optimal solution, guided by the pheromone trail and a heuristic. ACO has been successfully adopted for multiple problems. Works on unsupervised learning have focused on clustering, showing the potential of ACO-based techniques.
  • 38. Clustering and ACO. Nevertheless, more work needs to be done, especially on medoid-based clustering, which is more efficient than classical centroid-based techniques. In this area, diverse algorithms have been proposed, such as "An adaptive multi-agent ACO clustering algorithm" in 2005 by Weijiao Zhang and Chunhuang Liu; "Classification with cluster-based Bayesian multi-nets using Ant Colony Optimization" in 2014 by Khalid M. Salama and Alex A. Freitas; and MACOC: a medoid-based ACO clustering algorithm, also in 2014. More recently, "Medoid-based clustering algorithms using ant colony optimization" (METACOC and METACOC-K) were proposed in 2016 by Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho [7].
  • 41. Adaptation of ACO to the medoids problem: ACO-MEDOID algorithm. "ACO-medoids" searches for the best set of k medoids and is based on ant colony optimisation and k-medoids. We start with the general form of the algorithm.
       ACO-medoid algorithm
       Input: k (the number of clusters), D (a data set containing n objects), M (the similarity (distance) matrix).
       Output: a set of k clusters.
       Begin
         // start by creating the initial population
         foreach ant do
           arbitrarily choose k objects from D as the initial solution of the ant;
         end-foreach;
         while (change or i < Max-Iter) do
           foreach ant do
             build a solution (S_k);
             evaluate (S_k);
             update Abest and Vbest (see "An ant");
             apply online pheromone update;
           end-foreach;
           determine the best solution of the current iteration;
           apply offline pheromone update;
         end-while;
       End.
  • 42. Adaptation of ACO to the medoids problem: An ant. A virtual agent in the multi-dimensional space, which is the search space. It has the following properties:
       sol: the current solution of the ant;
       Abest: the best solution found so far by the ant;
       Vbest: the valuation of Abest.
  • 43. Adaptation of ACO to the medoids problem: The search space. It includes all potential combinations of objects that can form a set of medoids (a solution) while satisfying the similarity/dissimilarity constraint of clustering. The number of possible solutions depends on k (the number of clusters). If n objects need to be placed in k clusters, the number is determined as follows: for the initial object we have n possibilities, for the next one n − 1, and for the kth n − k + 1. The total number of solutions is then n ∗ (n − 1) ∗ ... ∗ (n − k + 1), which grows exponentially with k.
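To get a feel for the size of this search space, the count can be evaluated directly. The sketch below is illustrative only and its function names are hypothetical. It also reports the number of unordered medoid sets, on the assumption that the order in which medoids are picked does not change the resulting clustering.

```python
import math


def ordered_selections(n, k):
    """n * (n - 1) * ... * (n - k + 1): ways to pick k medoids one after another."""
    return math.perm(n, k)


def medoid_sets(n, k):
    """Number of distinct k-element medoid sets when pick order is ignored."""
    return math.comb(n, k)


if __name__ == "__main__":
    for n, k in [(150, 3), (1000, 5), (1000, 10)]:
        print(n, k, ordered_selections(n, k), medoid_sets(n, k))
```

Even the unordered count quickly becomes too large to enumerate, which is the motivation for a heuristic search.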
  • 45. Adaptation of ACO to the medoids problem: Solution construction. To build a solution, an ant has two possible strategies: exploit or explore. The first is a local-search-based method that helps an ant improve its solution; the second helps it explore new promising regions, as the following pseudocode shows.
       Procedure: constructSolution
       Input: an ant. Output: an ant.
       Begin
         // choose a strategy randomly
         S0: a random variable uniformly distributed in [0, 1];
         if S0 <= Sp then
           sol = explore;
           localSearch(sol);
         else
           localSearch(Abest);
         endif;
       End.
  • 46. 22 Adaptation of ACO to the medoids problem: Solution construction.
    Procedure: explore
    Input: D, the data set; Output: s
    Begin
      s = empty; i = 0;
      while (i < k and D not empty) do
        select o_i from D using the selecting rule (slide 24);
        append o_i to s;
        eliminate o_i from D;
        i = i + 1;
      endwhile;
    End.
  • 47. 23 Adaptation of ACO to the medoids problem: Solution construction.
    Procedure: localSearch
    Input: the solution to be improved; Output: a solution
    Begin
      for j = 1 to lmax do
        for m = 1 to mds (slide 27) do
          C = the cluster represented by Sol[m];
          choose a random object o_rand from C;
          compute the total cost S of swapping the representative object Sol[m] with o_rand;
          if S < 0 then swap o_rand with Sol[m] to form the new set of k representative objects;
          update clusters;
        endfor;
      endfor;
    End.
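The swap-based local search can be sketched as follows. This is a minimal illustration under stated assumptions: the distance matrix is a list of lists, each object belongs to the cluster of its nearest medoid, mds is taken equal to k (every medoid is visited on each pass), and the names `cost` and `local_search` are hypothetical.

```python
import random


def cost(dist, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(dist[m][j] for m in medoids) for j in range(len(dist)))


def local_search(dist, medoids, lmax, rng=random):
    """Try random in-cluster swaps and keep those with negative cost difference S."""
    sol = list(medoids)
    for _ in range(lmax):
        for pos in range(len(sol)):
            # Members of the cluster represented by sol[pos], excluding medoids.
            members = [
                j for j in range(len(dist))
                if j not in sol and min(sol, key=lambda m: dist[m][j]) == sol[pos]
            ]
            if not members:
                continue
            o_rand = rng.choice(members)
            candidate = sol[:pos] + [o_rand] + sol[pos + 1:]
            s = cost(dist, candidate) - cost(dist, sol)
            if s < 0:  # the swap lowers the total cost, so accept it
                sol = candidate
    return sol
```

Recomputing the full cost for every candidate swap keeps the sketch short; an incremental cost update would be the natural choice for larger data sets.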
  • 48. 24 Adaptation of ACO to the medoids problem: Selecting rule. The selecting process tries to find the object in D that is furthest from the set of objects already chosen as medoids, using the following formula:
    $$ j = \begin{cases} \arg\max_{u \in Y} \{T(u)\} & \text{if } q \le q_0 \\ \arg\max_{u \in Y} \{P(u)\} & \text{otherwise} \end{cases} \qquad (1) $$
    $$ P_u(t) = \frac{T_u(t)}{\sum_{v \in Y} T_v(t)} $$
    where:
    T_j is the pheromone amount of the jth object in D,
    Y is the set of possible medoids,
    P_j is the probability that data instance j is selected as a medoid,
    q is a random number distributed uniformly in [0, 1],
    q0 is an empirical parameter.
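A possible reading of the selecting rule is sketched below. When q ≤ q0 it returns the candidate with the most pheromone, as in formula (1). In the other branch it samples a candidate with probability P, which is the usual Ant Colony System interpretation; this is an assumption, since formula (1) as written takes the maximum of P. The function name and the list-based pheromone vector are also hypothetical.

```python
import random


def select_medoid(pheromone, candidates, q0, rng=random):
    """Pick one object index from candidates (the set Y) using pheromone levels T."""
    if not candidates:
        raise ValueError("no candidate medoids left")
    q = rng.random()  # uniform in [0, 1)
    if q <= q0:
        # Greedy branch: candidate with the largest pheromone amount.
        return max(candidates, key=lambda u: pheromone[u])
    # Assumed probabilistic branch: P_u = T_u / sum of T_v over candidates.
    # Requires at least one strictly positive pheromone value.
    weights = [pheromone[u] for u in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A larger q0 makes the choice greedier, while a smaller q0 lets objects with less pheromone be picked more often.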
  • 49. 25 Adaptation of ACO to the medoids problem: Fitness function. It is used to evaluate a solution and represents the cost (E_cost) of that solution. To compare two solutions we calculate S = S_new − S_old. If S is negative, the new solution is better than the old one.
    $$ E_{cost} = \sum_{i=1}^{k} \sum_{j=1}^{C_i} M[m_i, j] $$
    where M is the distance matrix, m_i is the medoid of cluster i, and C_i is the number of objects in cluster i.
    Another possible objective function is the sum of the probabilities P calculated with the following formula; the aim is then to maximize it. If q ≤ q0:
    $$ P = \begin{cases} 1 & \text{if } j = \arg\max_{u \in Y} \{T(u)\} \\ 0 & \text{otherwise} \end{cases} \qquad (2) $$
    Otherwise P is calculated as in the selecting rule (slide 24).
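The cost-based fitness can be illustrated on a toy example. The sketch assumes that each object is assigned to its nearest medoid, which is the usual k-medoids convention but is not spelled out on the slide; the helper names are hypothetical.

```python
def assign_to_medoids(dist, medoids):
    """Map each medoid to the list of objects that are closest to it."""
    clusters = {m: [] for m in medoids}
    for j in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[m][j])
        clusters[nearest].append(j)
    return clusters


def total_cost(dist, medoids):
    """E_cost: sum of distances between every object and the medoid of its cluster."""
    clusters = assign_to_medoids(dist, medoids)
    return sum(dist[m][j] for m, members in clusters.items() for j in members)


if __name__ == "__main__":
    points = [0, 1, 2, 10, 11]  # one-dimensional toy data
    dist = [[abs(a - b) for b in points] for a in points]
    s_old = total_cost(dist, [0, 3])  # medoids at values 0 and 10
    s_new = total_cost(dist, [1, 4])  # medoids at values 1 and 11
    print(s_new - s_old)  # negative, so the new solution is better
```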
  • 50. 26 Adaptation of ACO to the medoids problem: Pheromone update. We use both online and offline pheromone updates, calculated as follows.
    Online update:
    $$ T_i(t+1) = (1 - \rho)\, T_i(t) + \rho\, \tau_0 $$
    where ρ is the evaporation rate (also an empirical parameter) and τ_0 is the initial value of the pheromone.
    Offline update: at the end of each iteration, the ant with the best current solution deposits an amount of pheromone equal to ∆T_i(t). The update is performed using this formula:
    $$ T_i(t+1) = (1 - \rho)\, T_i(t) + \rho\, \Delta T_i(t) $$
    $$ \Delta T_i(t) = \begin{cases} \dfrac{1}{C} & \text{if the ant uses object } i \\ 0 & \text{otherwise} \end{cases} \qquad (3) $$
    where C is the cost of the ant's solution (E_cost, slide 25).
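Both updates translate almost directly into code. In this sketch the pheromone vector is a plain list indexed by object, and the offline update is applied to every object, with a zero deposit for objects outside the best solution, following formula (3) literally. The function names are hypothetical.

```python
def online_update(pheromone, i, rho, tau0):
    """Online update for object i: T_i = (1 - rho) * T_i + rho * tau0."""
    pheromone[i] = (1.0 - rho) * pheromone[i] + rho * tau0


def offline_update(pheromone, best_solution, best_cost, rho):
    """Offline update: objects in the best solution receive a deposit of 1 / C."""
    if best_cost <= 0:
        raise ValueError("best_cost must be positive")
    used = set(best_solution)
    for i in range(len(pheromone)):
        delta = 1.0 / best_cost if i in used else 0.0
        pheromone[i] = (1.0 - rho) * pheromone[i] + rho * delta
```

With these formulas, a cheaper best solution (smaller C) deposits more pheromone on its objects, while objects not used by it only evaporate.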
  • 51. 27 Adaptation of ACO to the medoids problem: The empirical parameters. This section presents the empirical parameters that need to be set in order to improve the solution quality.
    Parameter       Role
    A               Number of ants
    Max-Iter        Number of iterations of the algorithm
    lmax            Number of iterations of the local search
    Sp in [0, 1]    Intensification/diversification rate (the strategy rate)
    q0              Selection rate
    mds             Number of clusters to be updated (can be equal to k or randomly chosen each time in [1, k])
    ρ               Evaporation rate
  • 52. 28 Conclusion. We presented some ideas for using ACO to solve the medoids problem, through a proposed medoid- and ACO-based clustering algorithm that we call "ACO-medoids". It relies on the ants' collective behavior and on k-medoids to build the clusters. Implementation and tests are still needed before we can draw conclusions about the algorithm's behavior. However, swarm-based algorithms, including ACO, have been shown to find good solutions to NP-hard problems in reasonable time. We therefore expect the algorithm to provide a good-quality solution in a reasonable amount of time.