University of Science and Technology Houari Boumediene
The ACO-MEDOIDS
Using Ant Colony Optimization for partitioning
data into clusters
MOUDJARI Leila
l.moudj11@gmail.com
April 15, 2017
Presentation plan
Introduction
Clustering and related work
What Is Cluster Analysis?
Requirements of clustering
Categorization of Clustering Methods
Clustering and related work
The importance of swarm intelligence and the ACO approach
Ant Colony Optimization
Adaptation of ACO to the medoids problem
ACO-MEDOID algorithm
An ant
The search space
Solution construction
Selecting rule
Fitness function
Pheromone update
The empirical parameters
Conclusion
MOUDJARI Leila | ACO-MEDOIDS
Introduction
Data mining draws on several disciplines: database systems, statistics, machine learning, visualization, information science, and others.
A data mining system can perform several tasks, such as characterization, discrimination, association or correlation analysis, classification, prediction, clustering, outlier analysis, and evolution analysis. These tasks can be classified as supervised or unsupervised. Data clustering is an unsupervised learning task and one of the most challenging problems in data mining. It is also classified as an NP-hard problem.
Swarm intelligence is one of the strongest disciplines to have tackled this class of problems, and it remains a reliable choice. We therefore leaned towards this discipline, as other researchers have.
In recent years, many works have been presented in this area. We mention BAT-CLARA [1], Association Rule Mining Based on Bat Algorithm [2], MACOC: a medoid-based ACO clustering algorithm [3], and SACOC: a spectral-based ACO clustering algorithm [4], among others.
Clustering is a large field, and much work may still be needed in its different areas. We concentrate on partitioning algorithms, and more precisely on partitioning a dataset into k clusters, which is also an NP-hard task.
The most well-known and commonly used partitioning methods are k-means, k-medoids (PAM), and their variations [5], such as CLARA, CLARANS, and CLAM (a more recent one, from 2011, which uses a hybrid metaheuristic combining VNS and Tabu Search to solve the k-medoid clustering problem) [6].
We present here an algorithm for k-medoid clustering based on an ACO solution search: ACO-medoids. As its name indicates, the algorithm uses Ant Colony Optimization to explore the search space, looking for an optimal set of medoids, and relies on k-medoids for the necessary clustering concepts.
Clustering and related work
What Is Cluster Analysis?
Clustering is an unsupervised learning process: it does not rely on predefined classes or class-labeled training examples. It is therefore considered a form of learning by observation rather than learning by examples.
It aims to reduce the data size by grouping similar objects into one cluster. Given a set of data objects, a clustering algorithm must group them into classes so that intra-group similarity is high and inter-group similarity is low.
Similarity or dissimilarity is assessed via a distance measure (Euclidean or Manhattan distance, or other distance measures).
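As an illustration of the distance measures mentioned above, here is a minimal Python sketch. The function names and the list-of-lists matrix layout are assumptions made for this example, not part of the presentation.

```python
import math

def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_matrix(points, dist=euclidean):
    """Pairwise matrix M with M[i][j] = dist(points[i], points[j])."""
    return [[dist(p, q) for q in points] for p in points]
```

A matrix built this way can play the role of the distance matrix M used later by the medoid-based procedures.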
Requirements of clustering
Scalability,
Ability to deal with different types of attributes,
Ability to deal with noisy data,
Ability to deal with high dimensionality (number of attributes)...
Categorization of Clustering Methods
In general, these algorithms can be classified into the following categories:
Partitioning methods
This class is characterized by a predefined number k of partitions, each of which represents a cluster. Each cluster must contain at least one object, and each object must belong to exactly one cluster. The best-known methods are k-means and k-medoids.
Hierarchical methods
These create a hierarchical decomposition of the dataset. A hierarchical method is either agglomerative (bottom-up) or divisive (top-down).
Density-based methods
Unlike partitioning methods, these are based on the notion of density (number of objects or data points) instead of distance. They keep growing a given cluster as long as the density in its “neighborhood” exceeds some threshold. DBSCAN and its extension, OPTICS, are typical density-based methods.
There are also grid-based methods, model-based methods, constraint-based clustering...
k-means algorithm
Input: k (the number of clusters),
       D (a data set containing n objects).
Output: a set of k clusters.
Begin
    arbitrarily choose k objects from D as the initial cluster centers;
    repeat
        (re)assign each object to the cluster to which the object is
            the most similar;
        update the cluster means, i.e., calculate the mean value of
            the objects for each cluster;
    until no change;
End.
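The k-means steps above can be sketched in Python as follows. This is a minimal illustration, assuming points are tuples of numbers and similarity is squared Euclidean distance; the `max_iter` and `seed` parameters are additions for reproducibility and are not part of the pseudocode.

```python
import random

def k_means(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of numeric tuples; returns (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # arbitrary initial cluster centers
    assignment = None
    for _ in range(max_iter):
        # (re)assign each object to the nearest center (squared Euclidean distance)
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:     # "until no change"
            break
        assignment = new_assignment
        # update each center to the mean of the objects assigned to it
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                      # an empty cluster keeps its old center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment
```

Note that the returned centers are means, so they are generally not objects of the data set. This is the property that k-medoids changes.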
k-medoids algorithm
Input: k (the number of clusters),
       D (a data set containing n objects).
Output: a set of k clusters.
Begin
    arbitrarily choose k objects from D as the initial representative
        objects (medoids);
    repeat
        assign each remaining object to the cluster with the nearest medoid;
        randomly select a non-representative object, o_rand;
        compute the total cost, S, of swapping a representative
            object, o_j, with o_rand;
        if S < 0 then swap o_j with o_rand to form the new set of k
            medoids;
    until no change;
End.
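A minimal Python sketch of the random-swap k-medoids loop above, working on a precomputed distance matrix. Two choices here are assumptions: the fixed budget `max_tries` stands in for "until no change", and the medoid to replace is drawn at random. The function names are hypothetical.

```python
import random

def total_cost(M, medoids):
    """Sum, over all objects, of the distance to the nearest medoid."""
    return sum(min(M[m][j] for m in medoids) for j in range(len(M)))

def k_medoids(M, k, max_tries=500, seed=0):
    """Random-swap k-medoids on a distance matrix M; returns (medoids, cost)."""
    rng = random.Random(seed)
    n = len(M)
    medoids = rng.sample(range(n), k)        # arbitrary initial medoids
    cost = total_cost(M, medoids)
    for _ in range(max_tries):               # assumed budget, see lead-in
        non_medoids = [o for o in range(n) if o not in medoids]
        if not non_medoids:
            break
        o_rand = rng.choice(non_medoids)     # random non-representative object
        pos = rng.randrange(k)               # medoid o_j considered for replacement
        candidate = medoids[:pos] + [o_rand] + medoids[pos + 1:]
        new_cost = total_cost(M, candidate)
        if new_cost - cost < 0:              # S < 0: the swap lowers the total cost
            medoids, cost = candidate, new_cost
    return sorted(medoids), cost
```

Because every candidate swap requires a full cost evaluation, the running time grows quickly with the number of objects, which is the scalability issue discussed next.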
K-medoids was introduced as a solution to some of k-means' flaws, such as its sensitivity to outliers and the fact that its centroids are abstract objects. PAM showed that using real objects as representatives reduces the error value. However, it has shortcomings: on large datasets it suffers from the significant amount of time needed to construct the set of medoids. Researchers have tried to improve it, which is why the clustering field witnessed the birth of its variations: CLARA, followed by CLARANS and others, as already mentioned. However, the problem persists: how can we gain in scalability without losing in quality?
In recent years, clustering has drawn the attention of the metaheuristic community, and several works have been presented. One of the promising optimization methods is ACO.
The importance of swarm intelligence and the
ACO approach
Swarm intelligence
It is well known that we are more effective when we work with others than when we work in isolation, and this is the core of swarm intelligence.
Swarm intelligence is based on the collective behavior of species. Each method results from observing nature and analyzing intelligent forms of group behavior, and simulates the studied collective behavior of insects, animals, or humans. It gained popularity with the boom of artificial intelligence in the 1980s, especially for combinatorial problems. Such problems are divided into classes: P (polynomial), NP, NP-complete, and NP-hard. For the latter two, no polynomial-time algorithm is known, so exact methods generally take exponential time.
Problem ∈ [NP-hard | NP-complete] ==> call 911.
Clustering ∈ [NP-hard] ==> swarm intelligence.
Ant Colony Optimization
ACO has shown its strength on graph-related problems.
It was inspired by a fascination with ants and how they work in harmony to feed themselves and build a habitat.
They cooperate and help each other by sharing useful information, such as the path to take or to avoid.
They communicate through a substance they release, called "pheromone", a form of stigmergy (indirect communication through the environment).
ACO-based algorithms have a wide, domain-dependent range of uses, and have therefore been adapted to several types of problems.
Ant Colony Optimization: the algorithm
ACO algorithm
Begin
    while (stop conditions not met) do
        for k = 1 to Nb-ants do
            build a solution (S_k);
            evaluate (S_k);
            apply online pheromone update;
        end-for;
        determine the best solution of the current iteration;
        apply offline pheromone update;
    end-while;
End.
One of the advantages of applying ACO algorithms to clustering problems is that ACO performs a global search in the solution space, which makes it less likely to get trapped in local minima and thus gives it the potential to find more accurate solutions [7].
The algorithm uses an iterative search strategy to find an approximately optimal solution, guided by the pheromone trail and a heuristic.
ACO has been successfully adapted to many problems. Work on unsupervised learning has focused on clustering, showing the potential of ACO-based techniques.
Clustering and ACO
Nevertheless, more work needs to be done, especially on medoid-based clustering, which can be more effective than classical centroid-based techniques. In this area, various algorithms have been proposed, such as:
"An adaptive multi-agent ACO clustering algorithm" in 2005 by Weijiao Zhang and Chunhuang Liu.
"Classification with cluster-based Bayesian multi-nets using Ant Colony Optimization" in 2014 by Khalid M. Salama and Alex A. Freitas.
"MACOC: a medoid-based ACO clustering algorithm" in 2014 [3].
"Medoid-based clustering algorithms using ant colony optimization" (METACOC and METACOC-K), proposed in 2016 by Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho [7].
...etc.
Adaptation of ACO to the medoids problem
ACO-MEDOID algorithm
"ACO-medoids" searches for the best set of k medoids. It is based on ant colony optimization and k-medoids. We will start with the general form of the algorithm.
ACO-medoids algorithm
Input: k (the number of clusters),
       D (a data set containing n objects),
       M (the similarity (distance) matrix).
Output: a set of k clusters.
Begin
    // start by creating the initial population
    foreach ant do
        arbitrarily choose k objects from D as the initial solution of the ant;
    end-foreach;
    while (change and i < Max-Iter) do
        foreach ant a do
            build a solution (S_a);
            evaluate (S_a);
            update Abest and Vbest (defined under "An ant");
            apply online pheromone update;
        end-foreach;
        determine the best solution of the current iteration;
        apply offline pheromone update;
    end-while;
End.
An ant
An ant is a virtual agent in the multi-dimensional search space. It has the following properties:
sol: the current solution of the ant,
Abest: the best solution found so far by the ant,
Vbest: the valuation of Abest.
The search space
It includes all potential combinations of objects that can form a set of medoids (solutions) satisfying the similarity/dissimilarity constraint of clustering. The number of possible solutions depends on k (the number of clusters). If we have n objects to place in k clusters, the number is determined as follows:
For the first medoid we have n possibilities,
for the next one we have n − 1,
for the kth we have n − k + 1 possibilities.
The total number of ordered selections is then
\[ n \cdot (n-1) \cdots (n-k+1) = \frac{n!}{(n-k)!} \]
Since the order of the medoids does not matter, this corresponds to \( \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \) distinct sets of medoids. This number grows roughly as n^k, i.e., exponentially in k, so exhaustive search quickly becomes impractical.
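To get a feel for how fast the search space grows, the following short Python sketch computes the number of ordered selections n(n − 1)...(n − k + 1) and the number of distinct medoid sets. The function names are hypothetical; `math.perm` and `math.comb` are standard library functions available from Python 3.8.

```python
import math

def ordered_selections(n, k):
    """n * (n - 1) * ... * (n - k + 1): ways to pick k medoids one after another."""
    return math.perm(n, k)

def medoid_sets(n, k):
    """Number of distinct (unordered) sets of k medoids among n objects."""
    return math.comb(n, k)
```

For example, with only n = 100 objects and k = 5 clusters there are already 75,287,520 distinct candidate sets of medoids.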
Solution construction
To build a solution, an ant has two possible strategies: exploit or explore. The first is a local-search-based method that helps the ant improve its solution; the second helps it explore new promising regions, as shown in the following pseudocode:
Procedure: constructSolution
Input: an ant
Output: an ant
Begin
    // choose a strategy randomly
    S0: a random variable uniformly distributed in [0,1];
    if S0 <= Sp then
        // explore: build a new solution, then refine it
        sol = explore();
        localSearch(sol);
    else
        // exploit: refine the best solution found so far by the ant
        localSearch(Abest);
    endif;
End.
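The strategy choice in constructSolution can be sketched in Python as below. The dict-based ant, the `explore` and `local_search` callables passed as arguments, and the decision to store the refined result in `ant["sol"]` in both branches are assumptions made for this illustration.

```python
import random

def construct_solution(ant, s_p, explore, local_search, rng=random):
    """Pick a strategy at random: explore a fresh solution or refine the ant's best."""
    s0 = rng.random()                              # uniform draw in [0, 1)
    if s0 <= s_p:
        ant["sol"] = local_search(explore())       # diversification, then refinement
    else:
        ant["sol"] = local_search(ant["best"])     # intensification around Abest
    return ant
```

With this reading, a larger Sp makes the ants explore more often, and a smaller Sp makes them spend more time refining their best-known solutions.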
Procedure: explore
Input: D, the dataset
Output: s, a solution (a set of k medoids)
Begin
    s = empty;
    i = 0;
    while (i < k and D not empty) do
        select o_i from D using the selecting rule;
        append o_i to s;
        eliminate o_i from D;
        i = i + 1;
    endwhile;
End.
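A minimal Python sketch of the explore procedure. The `select` callable is a hypothetical stand-in for the selecting rule, and the sketch works on a copy of the candidate list so that the caller's dataset is left untouched, which is an implementation choice not stated in the pseudocode.

```python
def explore(candidates, k, select):
    """Build a solution of up to k medoids by repeated selection without replacement."""
    remaining = list(candidates)       # copy: do not modify the caller's dataset
    s = []
    while len(s) < k and remaining:
        o = select(remaining, s)       # the selecting rule picks one remaining object
        s.append(o)
        remaining.remove(o)            # an object can be chosen only once
    return s
```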
Procedure: localSearch
Input: Sol, the solution to be improved
Output: a solution
Begin
    for j = 1 to lmax do
        for m = 1 to mds do    // mds: see "The empirical parameters"
            C: the cluster whose medoid is Sol[m];
            choose a random object o_rand from C;
            compute the total cost, S, of swapping the
                representative object Sol[m] with o_rand;
            if S < 0 then swap o_rand with Sol[m] to
                form the new set of k representative objects;
            update clusters;
        endfor;
    endfor;
End.
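The local search can be sketched in Python as follows, on a precomputed distance matrix M. The helper names are hypothetical, clusters are recomputed on demand instead of being stored (one way to realize "update clusters"), and ties between medoids are broken in favor of the earliest one; all of these are assumptions of the sketch.

```python
import random

def cost(M, medoids):
    """Total distance from every object to its nearest medoid."""
    return sum(min(M[m][j] for m in medoids) for j in range(len(M)))

def cluster_of(M, medoids, medoid):
    """Objects whose nearest medoid is `medoid` (ties go to the earliest medoid)."""
    return [j for j in range(len(M))
            if min(medoids, key=lambda m: M[m][j]) == medoid]

def local_search(M, sol, l_max, mds, rng=random):
    """Try to improve sol by swapping medoids with objects of their own cluster."""
    sol = list(sol)                                  # work on a copy
    for _ in range(l_max):
        for m in range(mds):                         # first mds medoids of the solution
            members = cluster_of(M, sol, sol[m])     # clusters follow accepted swaps
            if not members:
                continue
            o_rand = rng.choice(members)
            candidate = sol[:m] + [o_rand] + sol[m + 1:]
            s = cost(M, candidate) - cost(M, sol)    # total cost of the swap
            if s < 0:                                # accept only improving swaps
                sol = candidate
    return sol
```

Since only improving swaps are accepted, the returned solution is never worse than the one passed in.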
Selecting rule
The selecting process tries to find the object in the selection D that is furthest from the set of objects already chosen as medoids, using the following formula:
\[
j =
\begin{cases}
\arg\max_{u \in Y} \{ T_u(t) \} & \text{if } q \le q_0 \\
\arg\max_{u \in Y} \{ P_u(t) \} & \text{otherwise}
\end{cases}
\qquad (1)
\]
\[
P_u(t) = \frac{T_u(t)}{\sum_{v \in Y} T_v(t)}
\]
where:
T_j is the pheromone amount of the jth object ∈ D,
Y is the set of possible medoids,
P_j is the probability that data instance j is selected as a medoid,
q is a random number distributed uniformly in [0, 1],
q0 is an empirical parameter.
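A minimal Python sketch of the selecting rule, with one interpretation stated as an assumption: because P is the pheromone T normalized over Y, a literal maximum over P would return the same object as a maximum over T. The sketch therefore follows the pseudo-random proportional rule of Ant Colony System and, when q > q0, draws j at random with probability P. The function name and the dict-based pheromone store are hypothetical, and pheromone values are assumed to be positive.

```python
import random

def select_medoid(candidates, pheromone, q0, rng=random):
    """Pick one candidate: greedy on pheromone if q <= q0, else sampled by P."""
    q = rng.random()                                   # uniform draw in [0, 1)
    if q <= q0:
        # exploitation: the candidate with the largest pheromone amount
        return max(candidates, key=lambda u: pheromone[u])
    # biased exploration: P_u = T_u / sum of T_v over the candidates
    weights = [pheromone[u] for u in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With q0 close to 1 the rule is almost always greedy, and with q0 close to 0 it almost always samples according to the pheromone distribution.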
Fitness function
The fitness function is used to evaluate a solution; it represents the cost (E_cost) of the solution. To compare two solutions we calculate S = S_new − S_old, where S_new and S_old are the costs of the new and old solutions. If S is negative, the new solution is better than the old one.
\[
E_{cost} = \sum_{i=1}^{k} \sum_{j=1}^{C_i} M[m_i, j]
\]
where:
M is the distance matrix,
m_i is the medoid of cluster i,
C_i is the number of objects in cluster i, and j ranges over those objects.
Another possible objective function is the sum of the probabilities P calculated with the following formula; the aim is to maximize it.
If q ≤ q0:
\[
P =
\begin{cases}
1 & \text{if } j = \arg\max_{u \in Y} \{ T_u(t) \} \\
0 & \text{otherwise}
\end{cases}
\qquad (2)
\]
otherwise P is calculated as in the selecting rule.
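The cost E_cost and the comparison value S can be sketched in Python as follows. Representing a solution as a dict that maps each medoid index to the list of objects in its cluster is an assumption of this example, as are the function names.

```python
def e_cost(M, clusters):
    """E_cost: sum of distances from each medoid to the members of its cluster.

    `clusters` maps a medoid index to the list of object indices in its cluster.
    """
    return sum(M[m][j] for m, members in clusters.items() for j in members)

def swap_delta(M, old_clusters, new_clusters):
    """S = S_new - S_old; a negative value means the new solution is better."""
    return e_cost(M, new_clusters) - e_cost(M, old_clusters)
```

For six points on a line at 0, 1, 2, 10, 11, 12, moving the medoids from the points at 0 and 10 to the points at 1 and 11 lowers the cost from 6 to 4, so S = −2 and the new solution is kept.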
Pheromone update
For the pheromone updates we use both online and offline updates, calculated as follows.
Online update
\[ T_i(t+1) = (1 - \rho)\, T_i(t) + \rho\, \tau_0 \]
where:
ρ is the evaporation rate, also an empirical parameter,
τ_0 is the initial value of pheromone.
Offline update
At the end of each iteration, the offline update is performed: the ant with the best current solution deposits an amount of pheromone equal to ∆T_i(t). The update is performed using this formula:
\[ T_i(t+1) = (1 - \rho)\, T_i(t) + \rho\, \Delta T_i(t) \]
where:
\[
\Delta T_i(t) =
\begin{cases}
\frac{1}{C} & \text{if the ant uses the object } i \\
0 & \text{otherwise}
\end{cases}
\qquad (3)
\]
C is the cost of the ant's solution (E_cost, see "Fitness function").
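Both pheromone updates can be sketched in Python as below. Storing the pheromone as a list indexed by object, updating it in place, and applying the offline formula to every object (with a zero deposit for objects the best ant does not use) are assumptions of this illustration.

```python
def online_update(T, i, rho, tau0):
    """Online update of object i: T_i = (1 - rho) * T_i + rho * tau0."""
    T[i] = (1 - rho) * T[i] + rho * tau0

def offline_update(T, best_medoids, best_cost, rho):
    """Offline update: the best ant deposits 1 / C on the objects it uses."""
    used = set(best_medoids)
    for i in range(len(T)):
        delta = 1.0 / best_cost if i in used else 0.0
        T[i] = (1 - rho) * T[i] + rho * delta
```

The online update pulls the pheromone of a just-used object back towards its initial value, while the offline update reinforces the objects of the best solution of the iteration and lets the others evaporate.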
The empirical parameters
This section presents the empirical parameters that need to be tuned in order to improve the solution quality.

Parameter      Role
A              Number of ants
Max-Iter       Number of iterations of the algorithm
lmax           Number of iterations of the local search
Sp in [0,1]    Intensification/diversification rate (the strategy rate)
q0             Selection rate
mds            Number of clusters to be updated (can be equal to k or
               randomly chosen each time in [1, k])
ρ              Evaporation rate
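One way to keep these parameters together in an implementation is a small validated configuration object, sketched here in Python. The class and field names are hypothetical, no default values are proposed since the presentation gives none, and the [0, 1] ranges checked for q0 and ρ are assumptions (only Sp is stated to lie in [0, 1]).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AcoMedoidsParams:
    n_ants: int      # A: number of ants
    max_iter: int    # Max-Iter: number of iterations of the algorithm
    l_max: int       # lmax: number of iterations of the local search
    s_p: float       # Sp: strategy (intensification/diversification) rate
    q0: float        # q0: selection rate
    mds: int         # mds: number of clusters to be updated, in [1, k]
    rho: float       # evaporation rate
    k: int           # number of clusters (algorithm input, needed to bound mds)

    def __post_init__(self):
        for name in ("n_ants", "max_iter", "l_max"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")
        for name in ("s_p", "q0", "rho"):       # assumed ranges, see lead-in
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1]")
        if not 1 <= self.mds <= self.k:
            raise ValueError("mds must lie in [1, k]")
```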
Conclusion
We presented some ideas for using ACO to solve the medoids problem, through a proposed medoid- and ACO-based clustering algorithm we call "ACO-medoids". It is based on the ants' collective behavior and on k-medoids for building the clusters. Implementation and tests still need to be done before we can be conclusive about the algorithm's behavior. However, swarm-based algorithms, including ACO, have shown that they can find good solutions to NP-hard problems in reasonable time. We therefore believe that the algorithm can provide near-optimal solutions in a practical amount of time.
Thank you!
Bibliography
[1] Nadjet Kamel, Yasmine Aboubi, Habiba Drias.
BAT-CLARA: Bat-inspired algorithm for clustering large applications.
IFAC-PapersOnLine 49-12, 243–248, 2016.
[2] Habiba Drias, Kamel Eddine Heraguemi, Nadjet Kamel.
Association rule mining based on bat algorithm.
Journal of Computational and Theoretical Nanoscience 12(7):1195–1200, 2015.
[3] Fernando E. B. Otero, Héctor D. Menéndez, David Camacho.
MACOC: a medoid-based ACO clustering algorithm.
DOI: 10.1007/978-3-319-09952-1_11, 2014.
[4] Fernando E. B. Otero, Héctor D. Menéndez, David Camacho.
SACOC: a spectral-based ACO clustering algorithm.
DOI: 10.1007/978-3-319-10422-5_20, 2014.
[5] Data mining: concepts and techniques (second edition).
Elsevier, 2011.
[6] Nguyen, Q. & Rayward-Smith, V.J.

More Related Content

What's hot

Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478IJRAT
 
Introduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIntroduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIJSRD
 
An efficient algorithm for privacy
An efficient algorithm for privacyAn efficient algorithm for privacy
An efficient algorithm for privacyIJDKP
 
An Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringAn Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringIDES Editor
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithmhadifar
 
Classification on multi label dataset using rule mining technique
Classification on multi label dataset using rule mining techniqueClassification on multi label dataset using rule mining technique
Classification on multi label dataset using rule mining techniqueeSAT Publishing House
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Mustafa Sherazi
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...IJCSIS Research Publications
 
Survey on traditional and evolutionary clustering
Survey on traditional and evolutionary clusteringSurvey on traditional and evolutionary clustering
Survey on traditional and evolutionary clusteringeSAT Publishing House
 
Survey on traditional and evolutionary clustering approaches
Survey on traditional and evolutionary clustering approachesSurvey on traditional and evolutionary clustering approaches
Survey on traditional and evolutionary clustering approacheseSAT Journals
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basicHouw Liong The
 
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...IJCSIS Research Publications
 
Chap8 basic cluster_analysis
Chap8 basic cluster_analysisChap8 basic cluster_analysis
Chap8 basic cluster_analysisguru_prasadg
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithmijsrd.com
 
A study on rough set theory based
A study on rough set theory basedA study on rough set theory based
A study on rough set theory basedijaia
 

What's hot (20)

Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478
 
Introduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIntroduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering Ensemble
 
An efficient algorithm for privacy
An efficient algorithm for privacyAn efficient algorithm for privacy
An efficient algorithm for privacy
 
An Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringAn Iterative Improved k-means Clustering
An Iterative Improved k-means Clustering
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithm
 
Classification on multi label dataset using rule mining technique
Classification on multi label dataset using rule mining techniqueClassification on multi label dataset using rule mining technique
Classification on multi label dataset using rule mining technique
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)
 
Chapter8
Chapter8Chapter8
Chapter8
 
A0310112
A0310112A0310112
A0310112
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
 
Survey on traditional and evolutionary clustering
Survey on traditional and evolutionary clusteringSurvey on traditional and evolutionary clustering
Survey on traditional and evolutionary clustering
 
Survey on traditional and evolutionary clustering approaches
Survey on traditional and evolutionary clustering approachesSurvey on traditional and evolutionary clustering approaches
Survey on traditional and evolutionary clustering approaches
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basic
 
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...
A Kernel Approach for Semi-Supervised Clustering Framework for High Dimension...
 
Chap8 basic cluster_analysis
Chap8 basic cluster_analysisChap8 basic cluster_analysis
Chap8 basic cluster_analysis
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
 
Dp33701704
Dp33701704Dp33701704
Dp33701704
 
Bj24390398
Bj24390398Bj24390398
Bj24390398
 
A study on rough set theory based
A study on rough set theory basedA study on rough set theory based
A study on rough set theory based
 

Similar to ACO-medoids

For iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptxFor iiii year students of cse ML-UNIT-V.pptx
For iiii year students of cse ML-UNIT-V.pptxSureshPolisetty2
 
Clustering Algorithms.pptx
Clustering Algorithms.pptxClustering Algorithms.pptx
Clustering Algorithms.pptxIssra'a Almgoter
 
pratik meshram-Unit 5 (contemporary mkt r sch)
pratik meshram-Unit 5 (contemporary mkt r sch)pratik meshram-Unit 5 (contemporary mkt r sch)
pratik meshram-Unit 5 (contemporary mkt r sch)Pratik Meshram
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningNatasha Grant
 
UNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data MiningUNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data MiningNandakumar P
 
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...
Data Mining Concepts and Techniques, Chapter 10. Cluster Analysis: Basic Conc...Salah Amean
 
Cluster analysis (2).docx
Cluster analysis (2).docxCluster analysis (2).docx
Cluster analysis (2).docxYaseenRashid4
 
Multilevel techniques for the clustering problem
Multilevel techniques for the clustering problemMultilevel techniques for the clustering problem
Multilevel techniques for the clustering problemcsandit
 
Knowledge Discovery & Representation
Knowledge Discovery & RepresentationKnowledge Discovery & Representation
Knowledge Discovery & RepresentationDarshan Patil
 
Applications Of Clustering Techniques In Data Mining A Comparative Study
Applications Of Clustering Techniques In Data Mining  A Comparative StudyApplications Of Clustering Techniques In Data Mining  A Comparative Study
Applications Of Clustering Techniques In Data Mining A Comparative StudyFiona Phillips
 
Clustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfClustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfigeabroad
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfSowmyaJyothi3
 
Data Clustering Using Swarm Intelligence Algorithms An Overview
Data Clustering Using  Swarm Intelligence Algorithms  An OverviewData Clustering Using  Swarm Intelligence Algorithms  An Overview
Data Clustering Using Swarm Intelligence Algorithms An OverviewAboul Ella Hassanien
 
Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10Data mining concepts and techniques Chapter 10
Data mining concepts and techniques Chapter 10mqasimsheikh5
 
Data clustering and optimization techniques
Data clustering and optimization techniquesData clustering and optimization techniques
Data clustering and optimization techniquesSpyros Ktenas
 
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERINGA SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERINGijcsa
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1bPRAWEEN KUMAR
 

ACO-medoids

  • 1. University of Science and Technology Houari Boumediene The ACO-MEDOIDS Using the Ant Colony Optimization for partitioning data into clusters MOUDJARI Leila l.moudj11@gmail.com April 15, 2017
  • 2. Presentation plan
    Introduction
    Clustering and related work
      What Is Cluster Analysis?
      Requirements of clustering
      Categorization of Clustering Methods
      Clustering and related work
    The importance of swarm intelligence and the ACO approach
      Ant Colony Optimization
    Adaptation of ACO to the medoids problem
      ACO-MEDOID algorithm
      An ant
      The search space
      Solution construction
      Selecting rule
      Fitness function
      Pheromone update
      The empirical parameters
    Conclusion
    MOUDJARI Leila | ACO-MEDOIDS
  • 6. Introduction
    Data mining is used in several disciplines: database systems, statistics, machine learning, visualization, information science...
    A data mining system can perform several tasks such as characterization, discrimination, association or correlation analysis, classification, prediction, clustering, outlier analysis, or evolution analysis. These tasks can be classified as supervised or unsupervised. Data clustering is an unsupervised learning task and one of the most challenging problems in data mining. It is also classified as an NP-hard problem.
    One of the strongest disciplines to have tackled this class of problems, and one that remains reliable, is swarm intelligence. We therefore leaned towards this discipline, as other researchers did.
  • 8. Introduction
    Over the last years, many works have been presented in this area. Among them are BAT-CLARA [1], Association Rule Mining Based on Bat Algorithm [2], MACOC: a medoid-based ACO clustering algorithm [3], SACOC: a spectral-based ACO clustering algorithm [4]...
  • 12. Introduction
    Clustering is a large field, and much work may still be needed in its different areas. We concentrate ours on partitioning algorithms, precisely on partitioning the dataset into k clusters, which is also an NP-hard task.
    The most well-known and commonly used partitioning methods are k-means, k-medoids (PAM), and their variations [5], such as CLARA, CLARANS, CLAM (a recent one, from 2011, which uses a hybrid metaheuristic combining VNS and Tabu Search to solve the k-medoid clustering problem) [6], etc.
    We hereby present ACO-medoids, an algorithm for k-medoid clustering based on an ACO solution search. As its name indicates, the algorithm uses ant colony optimisation to explore the search space, looking for an optimal set of medoids, and refers to k-medoids for the necessary clustering concepts.
  • 17. Clustering and related work: What Is Cluster Analysis?
    Clustering is an unsupervised learning process: it does not rely on predefined classes or class-labeled training examples. It is therefore considered a form of learning by observation rather than learning by examples.
    It aims to reduce the data size by grouping similar objects in one cluster. Given a set of data objects, a clustering algorithm must be capable of grouping the objects into classes so that high intra-group similarity and low inter-group similarity are ensured.
    The similarity or dissimilarity is assessed via a distance measure (Euclidean or Manhattan distance, although other distance measures may also be used).
  • 22. Clustering and related work: Requirements of clustering
    Scalability,
    Ability to deal with different types of attributes,
    Ability to deal with noisy data,
    High dimensionality (number of attributes)...
  • 25. Categorization of Clustering Methods
    In general, these algorithms can be classified into the following categories:
    Partitioning methods: this class is characterized by a predefined number k of partitions, each partition representing a cluster. Each cluster must contain at least one object, and an object must belong to at most one group. The best-known methods are k-means and k-medoids.
    Hierarchical methods: these create a hierarchical decomposition of the dataset and can be classified as either agglomerative (bottom-up) or divisive (top-down).
  • 27. Categorization of Clustering Methods
    Density-based methods: unlike partitioning methods, these are based on the notion of density (number of objects or data points) instead of distance. They continue growing a given cluster as long as the density in the "neighborhood" exceeds some threshold. DBSCAN and its extension, OPTICS, are typical density-based methods.
    There are also grid-based methods, model-based methods, constraint-based clustering...
  • 28. Clustering and related work: k-means algorithm
    Input: k (the number of clusters), D (a data set containing n objects).
    Output: A set of k clusters.
    Begin
      1. arbitrarily choose k objects from D as the initial cluster centers;
      2. repeat
      3.   (re)assign each object to the cluster to which the object is the most similar;
      4.   update the cluster means, i.e., calculate the mean value of the objects for each cluster;
      5. until no change;
    End.
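The k-means steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from the presentation: the function names, the squared Euclidean distance, the `max_iter` safety cap and the fixed random seed are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Illustrative k-means; max_iter and seed are assumptions of this sketch."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step 1: arbitrary initial centers
    labels = None
    for _ in range(max_iter):
        # step 3: (re)assign each object to the most similar (closest) center
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # step 5: stop when nothing changes
            break
        labels = new_labels
        # step 4: update each center to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(k_means(data, 2))
```

Note that the returned centers are means, so they are generally not objects of the data set, which is the point the k-medoids variant addresses.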
  • 29. Clustering and related work: k-medoids algorithm
    Input: k (the number of clusters), D (a data set containing n objects).
    Output: A set of k clusters.
    Begin
      1. arbitrarily choose k objects from D as the initial cluster centers;
      2. repeat
      3.   assign each remaining object to the nearest cluster;
      4.   randomly select a nonrepresentative object, o_rand;
      5.   compute the total cost, S, of swapping a representative object, o_j, with o_rand;
      6.   if S < 0 then swap o_j with o_rand to form the new set of k medoids;
      7. until no change;
    End.
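As a rough illustration of the swap loop above, the following sketch works on a precomputed distance matrix. It is a sketch under stated assumptions, not the presentation's implementation: the names `assignment_cost` and `k_medoids` are hypothetical, the "until no change" test is replaced by a fixed budget of random swap attempts (`max_tries`), and the representative object to replace is picked at random.

```python
import random


def assignment_cost(dist, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(dist[obj][m] for m in medoids) for obj in range(len(dist)))


def k_medoids(dist, k, max_tries=200, seed=0):
    """Random-swap k-medoids; max_tries and seed are assumptions of this sketch."""
    rng = random.Random(seed)
    n = len(dist)
    medoids = rng.sample(range(n), k)  # step 1: arbitrary initial medoids
    cost = assignment_cost(dist, medoids)
    for _ in range(max_tries):
        candidates = [o for o in range(n) if o not in medoids]
        if not candidates:
            break
        pos = rng.randrange(k)            # representative object o_j to replace
        o_rand = rng.choice(candidates)   # step 4: random nonrepresentative object
        trial = medoids[:pos] + [o_rand] + medoids[pos + 1:]
        trial_cost = assignment_cost(dist, trial)
        s = trial_cost - cost             # step 5: total cost of the swap
        if s < 0:                         # step 6: keep only improving swaps
            medoids, cost = trial, trial_cost
    return sorted(medoids), cost


if __name__ == "__main__":
    values = [0, 1, 2, 10, 11, 12]
    matrix = [[abs(a - b) for b in values] for a in values]
    print(k_medoids(matrix, 2))
```

Recomputing the full cost for every trial swap is the simplest choice but also the expensive one, which is the scalability issue the following slides raise.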
  • 30. Clustering and related work
    K-medoids was presented as a solution to some of the flaws of k-means, such as its sensitivity to outliers and the fact that its centroids are abstract objects. PAM showed that using real objects as representatives reduces the error value. However, it has shortcomings of its own: on large datasets it suffers from the significant amount of time needed to construct the set of medoids. Researchers have nevertheless tried to improve it, which is why the clustering field witnessed the birth of its variations: CLARA, followed by CLARANS and the others already mentioned. The problem persists, however: how can we gain in scalability without losing in quality?
    In recent years, clustering has drawn the attention of the meta-heuristic community, and several works have been presented. One of the promising optimization methods is ACO.
  • 33. The importance of swarm intelligence and the ACO approach: Swarm intelligence
    It is well known that we are more effective when we work with others than when we work in isolation, and this is the core of swarm intelligence.
    Swarm intelligence is based on the collective behavior of species. Each method results from observing nature and analysing intelligent forms of group behavior, and simulates the studied collective behaviors of insects, animals and humans. It gained popularity with the boom of artificial intelligence in the 1980s, especially for combinatorial problems. Such problems are divided into classes: P (polynomial), NP, NP-complete and NP-hard. For the latter two, no polynomial-time algorithm is known, and exact methods generally have exponential complexity.
  • 34. The importance of swarm intelligence and the ACO approach
    Problem ∈ [NP-hard | NP-complete] ==> call 911.
    Clustering ∈ [NP-hard] ==> Swarm intelligence.
  • 35. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization
    ACO showed its strength when dealing with graph-related problems. It was inspired by a fascination with ants and the way they work in harmony to feed themselves and build a habitat. They cooperate and help each other by sharing useful information, such as which path to take or to avoid. They communicate through a substance they release, called "pheromone", which is a form of stigmergy. ACO-based algorithms are used widely and across many domains, and the approach has therefore been adapted to several types of problems.
  • 36. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization, the algorithm
    ACO algorithm
    Begin
      1. While (not stop conditions) do
      2.   for k = 1 to Nb-ants do
      3.   begin
      4.     Build a solution (S_k);
      5.     Evaluate (S_k);
      6.     Apply online pheromone update;
      7.   end-for;
      8.   Determine the best solution of the current iteration;
      9.   Apply offline pheromone update;
      10. end-while;
    End.
  • 37. The importance of swarm intelligence and the ACO approach: Ant Colony Optimization
    One of the advantages of applying ACO algorithms to clustering problems is that ACO performs a global search in the solution space, which is less likely to get trapped in local minima and thus has the potential to find more accurate solutions [7]. The algorithm uses an iterative search strategy to find an approximate optimal solution, guided by the pheromone trail and a heuristic.
    ACO has been successfully adopted for multiple problems. Works on unsupervised learning have focused on clustering, showing the potential of ACO-based techniques.
  • 38. Clustering and ACO
    Nevertheless, more work needs to be done, especially for medoid-based clustering, which is more efficient than classical centroid-based techniques. In this area, diverse algorithms have been proposed, such as:
    "An adaptive multi-agent ACO clustering algorithm" in 2005 by Weijiao Zhang and Chunhuang Liu.
    "Classification with cluster-based Bayesian multi-nets using Ant Colony Optimization" in 2014 by Khalid M. Salama and Alex A. Freitas.
    Also "MACOC: a medoid-based ACO clustering algorithm" in 2014.
    Recently, "Medoid-based clustering algorithms using ant colony optimization" (METACOC and METACOC-K) were proposed in 2016 (Héctor D. Menéndez, Fernando E. B. Otero, David Camacho) [7].
    etc.
  • 41. Adaptation of ACO to the medoids problem: ACO-MEDOID algorithm
    "ACO-medoids" finds the best set of k medoids. It is based on ant colony optimisation and k-medoids. We will start with the general form of the algorithm.
    ACO-medoid algorithm
    Input: k (the number of clusters), D (a data set containing n objects), M (the similarity (distance) matrix).
    Output: A set of k clusters.
    Begin
      // start by creating the initial population
      1. foreach ant do
      2.   arbitrarily choose k objects from D as the initial solution of the ant;
      3. end-foreach;
      4. While (change or i < Max-Iter) do
      5.   foreach ant do
      6.   begin
      7.     Build a solution (S_k);
      8.     Evaluate (S_k);
      9.     Update Abest and Vbest (see the slide "An ant");
      10.    Apply online pheromone update;
      11.  end-foreach;
      12.  Determine the best solution of the current iteration;
      13.  Apply offline pheromone update;
      14. end-while;
    End.
  • 42. Adaptation of ACO to the medoids problem: An ant
    An ant is a virtual agent in the multi-dimensional space, which is the search space. It has the following properties:
    sol: the current solution of the ant,
    Abest: the best solution found so far by the ant,
    Vbest: the valuation of Abest.
  • 43. Adaptation of ACO to the medoids problem: The search space
    It includes all potential combinations of objects that can build a set of medoids (solutions), verifying the similarity/dissimilarity constraint of clustering. The number of these possible solutions depends on k (the number of clusters). If we have n objects that need to be placed in k clusters, the number is determined as follows:
    for the initial object we have n possibilities,
    for the next one we have n − 1,
    for the k-th we have n − k + 1 possibilities.
    The total number of solutions is then equal to n ∗ (n − 1) ∗ ... ∗ (n − k + 1), which grows exponentially with k.
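To get a feel for how fast this count grows, the product can be evaluated directly with the Python standard library. The sketch below is illustrative only; the function names are hypothetical. It also shows the number of distinct medoid sets when the order in which medoids are picked is ignored, which is the binomial coefficient and is smaller by a factor of k!.

```python
import math


def ordered_selections(n, k):
    """n * (n - 1) * ... * (n - k + 1): medoids picked one after another."""
    return math.perm(n, k)


def distinct_medoid_sets(n, k):
    """Number of different sets of k medoids when pick order is ignored."""
    return math.comb(n, k)


if __name__ == "__main__":
    print(ordered_selections(5, 2))    # 5 * 4 = 20
    print(distinct_medoid_sets(5, 2))  # 20 / 2! = 10
    for k in (2, 5, 10):
        print(k, ordered_selections(1000, k))
```

`math.perm` and `math.comb` require Python 3.8 or later.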
  • 45. Adaptation of ACO to the medoids problem: Solution construction
    In order to build a solution, an ant has two possible strategies: exploit or explore. The first is a local-search-based method that helps an ant improve its solution; the second helps it explore new promising regions, as shown in the following pseudocode.
    Procedure: constructSolution
    Input: an ant; Output: an ant
    Begin
      // choose a strategy randomly
      1. S0: a random variable uniformly distributed in [0,1]
      2. if S0 <= Sp then
      3.   sol = explore;
      4.   localSearch(sol);
      5. else
      6.   localSearch(Abest);
      7. endif;
    End.
  • 46. Adaptation of ACO to the medoids problem: Solution construction
    Procedure: explore
    Input: D, the dataset; Output: s
    Begin
      1. s = empty; i = 0;
      2. while (i < k and D not empty) do
      3.   select o_i from D using the selecting rule (see the slide "Selecting rule");
      4.   append o_i to s;
      5.   eliminate o_i from D;
      6.   i = i + 1;
      7. endwhile;
    End.
  • 47. Adaptation of ACO to the medoids problem: Solution construction
    Procedure: localSearch
    Input: the solution to be improved; Output: a solution
    Begin
      1. for j = 1 to lmax do
      2.   for m = 1 to mds (see the slide "The empirical parameters") do
      3.     C: the corresponding cluster;
      4.     choose object o_rand from C;
      5.     compute the total cost S of swapping the representative object Sol[m] with o_rand;
      6.     if S < 0 then swap o_rand with Sol[m] to form the new set of k representative objects;
      7.     update clusters;
      8.   endfor;
      9. endfor;
    End.
  • 48. Adaptation of ACO to the medoids problem: Selecting rule
    The selecting process tries to find the furthest object in the selection D from the set of objects already chosen as medoids, using the following formula:
    \[ j = \begin{cases} \arg\max_{u \in Y} \{T(u)\} & \text{if } q \le q_0 \\ \arg\max_{u \in Y} \{P(u)\} & \text{otherwise} \end{cases} \tag{1} \]
    \[ P_u(t) = \frac{T_u(t)}{\sum_{v \in Y} T_v(t)} \]
    where:
    T_j is the pheromone amount of the j-th object ∈ D,
    Y is the set of possible medoids,
    P is the probability that data instance j could be selected as a medoid,
    q is a random number distributed uniformly in [0, 1],
    q0 is an empirical parameter.
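A possible reading of the selecting rule is sketched below. The function name `select_medoid` and the dictionary representation of the pheromone amounts are hypothetical. One assumption needs stating loudly: in the second branch the sketch samples j at random according to the probabilities P, which is the usual pseudo-random proportional rule of Ant Colony System, whereas the slide writes a maximum there. Since P(u) is T(u) divided by a constant, taking the maximum of P would select the same object as the first branch.

```python
import random


def select_medoid(pheromone, candidates, q0, rng):
    """Pick one object from `candidates` (the set Y).

    pheromone: dict mapping object index -> pheromone amount T(u).
    q0: empirical parameter in [0, 1].
    Assumption: the 'otherwise' branch samples according to P(u)
    instead of taking its maximum (see the lead-in text).
    """
    q = rng.random()
    if q <= q0:
        # exploitation: object with the largest pheromone amount
        return max(candidates, key=lambda u: pheromone[u])
    # exploration: P(u) = T(u) / sum of T(v) over the candidate set
    total = sum(pheromone[u] for u in candidates)
    weights = [pheromone[u] / total for u in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    tau = {0: 0.1, 1: 0.7, 2: 0.2}
    print([select_medoid(tau, [0, 1, 2], q0=0.5, rng=rng) for _ in range(10)])
```

With q0 close to 1 the ants mostly exploit the strongest pheromone; with q0 close to 0 they mostly explore in proportion to it.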
  • 49. 25 Adaptation of ACO to the medoids problem
    Fitness function
    The fitness function evaluates a solution: it represents the cost (E_cost) of that solution. To compare two solutions, we calculate S = S_new − S_old. If S is negative, the new solution is better than the old one.
    $$ E_{cost} = \sum_{i=1}^{k} \sum_{j=1}^{C} M[m, j] $$
    where:
    M is the distance matrix,
    C is the number of objects in cluster i,
    m indexes the medoid of cluster i (as in localSearch).
    Another possible objective function is the sum of the probabilities P calculated with the following formula; the aim is to maximize it.
    If q ≤ q0:
    $$ P = \begin{cases} 1 & \text{if } j = \arg\max_{u \in Y} \{T(u)\} \\ 0 & \text{otherwise} \end{cases} \qquad (2) $$
    Otherwise, P is calculated with the formula on slide 24.
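A small worked example of E_cost and the comparison value S, written as a Python sketch. The six 1-D points, the dict layout (medoid index mapped to the member indices of its cluster), and the function names are assumptions made for illustration.

```python
def e_cost(dist, clusters):
    """Sum, over every cluster, of the distances from its medoid to its members."""
    return sum(dist[m][j] for m, members in clusters.items() for j in members)


def swap_delta(dist, old_clusters, new_clusters):
    """S = S_new - S_old; a negative value means the new solution is better."""
    return e_cost(dist, new_clusters) - e_cost(dist, old_clusters)


# Toy data: six points on a line, distances are absolute differences.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]

old = {0: [0, 1, 2], 3: [3, 4, 5]}  # medoids at the cluster edges
new = {1: [0, 1, 2], 4: [3, 4, 5]}  # medoids at the cluster centres

print(e_cost(dist, old))           # 6
print(e_cost(dist, new))           # 4
print(swap_delta(dist, old, new))  # -2
```

Here S = −2, so the centred medoids would be accepted as the better solution.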
  • 50. 26 Adaptation of ACO to the medoids problem
    Pheromone update
    For the pheromone, we use both online and offline updates, calculated as follows.
    Online update
    $$ T_i = (1 - \rho)\, T_i(t) + \rho\, \tau_0 $$
    where:
    ρ is the evaporation rate, also an empirical parameter,
    τ0 is the initial value of the pheromone.
    Offline update
    The offline update is performed at the end of each iteration: the ant with the best current solution deposits an amount of pheromone equal to ∆T_i(t). The update uses this formula:
    $$ T_i = (1 - \rho)\, T_i(t) + \rho\, \Delta T_i(t) $$
    where:
    $$ \Delta T_i(t) = \begin{cases} \frac{1}{C} & \text{if the ant uses the object } i \\ 0 & \text{otherwise} \end{cases} \qquad (3) $$
    C is the cost of the ant's solution (E_cost, slide 25).
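Both updates can be sketched in a few lines of Python, assuming the pheromone is a list indexed by object and that the offline update is applied to every object (so unused objects only evaporate). The slide does not say whether unused objects are updated, so that choice is an assumption.

```python
def online_update(pheromone, i, rho, tau0):
    """Online rule: T_i = (1 - rho) * T_i + rho * tau0, for the object just used."""
    pheromone[i] = (1 - rho) * pheromone[i] + rho * tau0


def offline_update(pheromone, best_medoids, best_cost, rho):
    """Offline rule (3): the best ant deposits 1 / cost on the objects it used."""
    for i in range(len(pheromone)):
        delta = 1.0 / best_cost if i in best_medoids else 0.0
        pheromone[i] = (1 - rho) * pheromone[i] + rho * delta
```

With ρ = 0.5, a best cost of 4, and all pheromone values at 1.0, an object used by the best ant moves to 0.5 + 0.5 · 0.25 = 0.625, while an unused one drops to 0.5.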
  • 51. 27 Adaptation of ACO to the medoids problem
    The empirical parameters
    This section presents the empirical parameters that need to be defined in order to improve the solution quality.
    Parameter        Role
    A                Number of ants
    Max-Iter         Number of iterations of the algorithm
    lmax             Number of iterations of the local search
    Sp in [0,1]      Intensification/diversification rate (the strategy rate)
    q0               Selection rate
    mds              Number of clusters to be updated (can be equal to k or randomly chosen each time in [1, k])
    ρ                Evaporation rate
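One way to keep these parameters together is a small validated container, sketched below. The class name, the field names, and the range checks on q0 and ρ are assumptions; the slide only states the range of Sp and of mds. No default values are given because the slides do not propose any.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcoMedoidsParams:
    """Hypothetical container for the empirical parameters listed above."""
    n_ants: int     # A
    max_iter: int   # Max-Iter
    l_max: int      # lmax
    sp: float       # strategy rate, in [0, 1]
    q0: float       # selection rate
    mds: int        # clusters updated per local-search pass, in [1, k]
    rho: float      # evaporation rate
    k: int          # number of clusters

    def __post_init__(self):
        for name in ("sp", "q0", "rho"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        if not 1 <= self.mds <= self.k:
            raise ValueError("mds must lie in [1, k]")
        if min(self.n_ants, self.max_iter, self.l_max) < 1:
            raise ValueError("n_ants, max_iter and l_max must be positive")
```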
  • 52. 28 Conclusion
    We presented some ideas for using ACO to solve the medoids problem, through a proposed clustering algorithm based on medoids and ACO that we called "ACO-medoids". It relies on the ants' collective behavior and on k-medoids to build the clusters. Implementation and tests still need to be done before we can be conclusive about the algorithm's behavior. However, swarm-based algorithms, including ACO, have been shown to reduce the time and space cost of tackling NP-hard problems. We therefore believe that the algorithm can provide the optimal solution in a finite amount of time.
  • 54. 29 Bibliography
    [1] Yasmine Aboubi, Habiba Drias, and Nadjet Kamel. BAT-CLARA: bat-inspired algorithm for clustering large applications. IFAC-PapersOnLine 49(12):243–248, 2016.
    [2] Kamel Eddine Heraguemi, Nadjet Kamel, and Habiba Drias. Association rule mining based on bat algorithm. Journal of Computational and Theoretical Nanoscience 12(7):1195–1200, 2015.
    [3] Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho. MACOC: a medoid-based ACO clustering algorithm. DOI: 10.1007/978-3-319-09952-1_11, 2014.
    [4] Héctor D. Menéndez, Fernando E. B. Otero, and David Camacho. SACOC: a spectral-based ACO clustering algorithm. DOI: 10.1007/978-3-319-10422-5_20, 2014.
    [5] Data mining: concepts and techniques (second edition). Elsevier, 2011.
    [6] Q. Nguyen and V. J. Rayward-Smith.