Dynamic clustering algorithm using fuzzy c means




Published in: Technology, Education

J Anuradha, Wrishin Bhattacharya, Tanuja Senapaty
School of Computing Science and Engineering, VIT University, Vellore – 14.

1. Abstract

In this paper we introduce a dynamic clustering algorithm based on the fuzzy c-means (FCM) clustering algorithm. We process several sets of patterns together to find a common structure. The structure is finalized by interchanging prototypes of the given data and by moving the prototypes of the subsequent clusters toward each other. In the regular FCM clustering algorithm, the number of clusters is fixed and defined in advance. If the chosen number of clusters is wrong, the purity of the final clusters degrades. The proposed algorithm overcomes this drawback with a dynamic clustering architecture: it starts with a fixed number of clusters, and over the iterations it automatically increases the number of clusters depending on the nature and type of the data, which increases the purity of the final result. A detailed clustering algorithm is developed on the basis of the standard FCM method and is illustrated by means of numeric examples.

Keywords: cluster, dynamic clustering, objective function, membership function, fuzzy membership.

2. Introduction

A cluster is defined as a group of similar objects. Objects belonging to the same cluster are of the same type, and objects belonging to different clusters are of different types. The clustering problem is to group N patterns into C clusters so that similarity is high within each cluster and low between clusters. The main goal of an objective-function-based clustering algorithm is to determine a partition of the data. The fuzzy c-means algorithm represents each cluster by its centre of gravity.

Let us consider an example. Suppose there is a large amount of client information distributed across different databases. To mine such data, an intelligent approach is to analyze each database locally and then combine the results at a globally abstract level, which also addresses some of the client's security concerns. In this situation, each subpopulation can be clustered locally as a module, which enables faster convergence of the clustering. Finally, after the union of the different data modules, the procedure converges to a stable clustering.

Fuzzy c-means is derived by incorporating fuzzy sets, rough sets and the c-means framework together. Overlapping partitions are handled efficiently by fuzzy c-means membership.

3. Literature Review

Pattern recognition became a field of study in the early 1950s. From the 1960s, fuzzy theory began to be used in pattern recognition and cluster analysis.

Pattern recognition is not the only application of cluster analysis. Depending on the criteria used, cluster analysis can also be applied to social groupings, information retrieval, etc. So, we can say that it is
applicable wherever one wants to classify objects into several categories, a task we commonly encounter. [4]

Data classification is generally done using fuzzy c-means clustering, although this classification needs the number of clusters as input, and optimal convergence depends on the selection of the initial cluster centers. Therefore, the fuzzy c-means method is not suitable for classifying large data sets. [5]

Grouping data with similar characteristics is known as data classification. It is done to enhance human understanding of the data structure and to describe the behaviour of the data by building models. The k-means clustering algorithm, the fuzzy c-means clustering algorithm, neural nets, etc. were developed for this field of study. [8]

Fuzzy c-means clustering classifies the unclassified data after pre-classification has been performed. Y. Lim applied fuzzy c-means clustering to colour image segmentation. He obtained the appropriate threshold values, and from them the number of clusters, using scale-space filtering and first and second derivatives. The image data was then pre-classified on the basis of these threshold values. Finally, fine classification was performed using fuzzy c-means clustering.

Although fuzzy c-means clustering is not generally used for classifying large data sets, Y. G. Jin overcame this problem with a method called subtractive and gravity fuzzy c-means clustering. [7] Subtractive clustering is used to obtain the number of clusters and the cluster centers used for pre-classification. Then, during pre-classification, gravity fuzzy c-means clustering is applied to the unclassified data, which overcomes the deficiency of fuzzy c-means clustering.

The algorithm shows that a data set can be separated into groups. However, every data set has its own characteristics, such as the differences between data nodes, the differences in distance and the different weights of the data nodes, which make it harder to decide how to group the node points so as to obtain a better classification of the data nodes. [6]

The algorithm is also used to separate data into clusters of different sizes using the logic of fuzzy theory. This division depends on various criteria, such as the distance between two data nodes, the choice of centroid and the membership function, which means that the cluster sizes are not known exactly. [1]

4. Implementation

4.1. Fuzzy C-means: [3]

The main concept of FCM is to find a fuzzy pseudo-partition that minimizes the cost function.

Cost function:

$$J = \sum_{j=1}^{m} \sum_{l=1}^{L} v_{jl}^{\delta} \, \lVert y_j - n_l \rVert^2 \quad \text{s.t.} \quad \sum_{l=1}^{L} v_{jl} = 1, \quad j = 1, \dots, m$$

In the above formula,
$y_j$ = feature data to be clustered;
$n_l$ = center of each cluster;
$v_{jl}$ = fuzzy partition corresponding to the feature data;
$m$ = number of feature data;
$L$ = number of clusters;
$\delta$ = exponent to adjust the fuzzy degree ($\delta > 1$).

The updating steps are as follows:

E-step:

$$n_l = \frac{\sum_{j=1}^{m} v_{jl}^{\delta} \, y_j}{\sum_{j=1}^{m} v_{jl}^{\delta}}$$

M-step:

$$v_{jl} = \left[ \sum_{k=1}^{L} \left( \frac{\lVert y_j - n_l \rVert}{\lVert y_j - n_k \rVert} \right)^{2/(\delta - 1)} \right]^{-1}$$
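As an illustration of the two update steps, the following is a minimal pure-Python sketch of standard FCM in the notation of Section 4.1 ($y_j$, $n_l$, $v_{jl}$, $\delta$). It assumes the textbook FCM update rules and a maximum-change stopping test. The function names (`sq_dist`, `update_centers`, `update_partition`, `cost`, `fcm`), the parameters `eps`, `max_iter` and `seed`, and the toy data are hypothetical choices made for this example and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def update_centers(data, v, delta):
    """E-step (paper's naming): n_l is the v_jl^delta-weighted mean of the y_j."""
    num_clusters, dim = len(v[0]), len(data[0])
    centers = []
    for l in range(num_clusters):
        weights = [row[l] ** delta for row in v]
        total = sum(weights)
        centers.append([
            sum(w * y[d] for w, y in zip(weights, data)) / total
            for d in range(dim)
        ])
    return centers


def update_partition(data, centers, delta):
    """M-step (paper's naming): recompute memberships v_jl from distances."""
    power = 1.0 / (delta - 1.0)  # exponent on squared distances, needs delta > 1
    v = []
    for y in data:
        d2 = [sq_dist(y, n) for n in centers]
        if min(d2) == 0.0:
            # The point sits exactly on a center: full membership there.
            hit = d2.index(0.0)
            v.append([1.0 if l == hit else 0.0 for l in range(len(centers))])
            continue
        inv = [(1.0 / d) ** power for d in d2]
        s = sum(inv)
        v.append([x / s for x in inv])  # each row sums to 1
    return v


def cost(data, centers, v, delta):
    """Cost J: sum over j, l of v_jl^delta * ||y_j - n_l||^2."""
    return sum(v[j][l] ** delta * sq_dist(y, n)
               for j, y in enumerate(data)
               for l, n in enumerate(centers))


def fcm(data, num_clusters, delta=2.0, eps=1e-5, max_iter=100, seed=0):
    """Alternate the two steps until memberships change by less than eps."""
    rng = random.Random(seed)
    # V(0): random memberships, each row normalised to sum to 1.
    v = []
    for _ in data:
        row = [rng.random() + 1e-9 for _ in range(num_clusters)]
        s = sum(row)
        v.append([x / s for x in row])
    centers = update_centers(data, v, delta)
    for _ in range(max_iter):
        new_v = update_partition(data, centers, delta)
        change = max(abs(a - b)
                     for r_new, r_old in zip(new_v, v)
                     for a, b in zip(r_new, r_old))
        v = new_v
        centers = update_centers(data, v, delta)
        if change < eps:
            break
    return centers, v


# Toy data (assumed for illustration): two well-separated groups of points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, v = fcm(data, num_clusters=2)
labels = [row.index(max(row)) for row in v]  # dominant cluster per point
```

The sketch keeps the membership matrix as a list of rows, one per data point, so the constraint that each row sums to 1 is easy to check. A library implementation would normally vectorise these loops.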
The E-step computes the new center of each cluster, and the M-step updates the fuzzy partition. The E-step and M-step are repeated, updating the cluster centers $n_l$ and the fuzzy partition $v_{jl}$, until the cost function reaches its minimum value or cannot be reduced any further. This gives the final cluster information.

4.2. The Fuzzy C-means Algorithm: [2]

1. Initialize the matrix V = [v_jx], V(0).
2. At step k: calculate the center vectors centroid(k) = [centroid_z] with V(k).
3. Update V(k) to V(k+1).
4. If ||V(k+1) - V(k)|| < ε then STOP; otherwise return to step 2.

5. Results and Discussion

After implementing this algorithm we obtained the following outputs.

Fig 5.1: Input Feature Vector

The figure above shows the input feature vectors. It shows the clusters formed after running the algorithm, as well as the membership of individual data points in a particular cluster.

Fig 5.2: Input Feature Vector after Iterations

The figure above shows the input feature vectors after iterations.
After several iterations the clusters are clearly formed. Points that were only distantly connected to a particular cluster have moved closer to it.

Fig 5.3: Termination Measure

The figure above shows the termination measure after several iterations.

6. Proposed Algorithm

6.1. The Dynamic Fuzzy C-mean Algorithm:

1. Initialize the matrix V = [v_jx], V(0).
2. At step k: calculate the center vectors centroid(k) = [centroid_z] with V(k).
3. Update V(k) to V(k+1).
4. Find the maximum number of objects closer to y_j, where j = 1 to n.
5. Calculate the distance between every object y_j and the cluster centroid, d(y_j, centroid_z).
6. If d(y_j, centroid_z), where j = 1, 2, ..., c, is higher, then y_j becomes a new centroid and centroid = centroid + 1.
7. If ||V(k+1) - V(k)|| < ε then STOP; otherwise return to step 2.

8. Conclusion

Many algorithms are available for clustering. The fuzzy c-means clustering algorithm is easy and efficient to implement. Our research proposes dynamic clustering of the data set, together with the membership of points across clusters. If the proposed algorithm is implemented, then at every iteration new clusters are formed and new centroids are generated based on the distance between the objects. The intended benefit is a high-performance algorithm that reduces the number of iterations and converges faster than the existing fuzzy c-means algorithm.

9. References

1. Abu-Zanona M.A., El-Zaghmouri B.M., "Fuzzy C-Means Clustering Algorithm Modification and Adaptation for Application", World of Computer Science and Information Technology Journal (WCSIT), ISSN: 2221-0741, Vol. 2, No. 1, pp. 42-45, 2012.
2. Bezdek J.C., "Pattern Recognition with Fuzzy Objective Function Algorithms", Plenum Press, New York, 1981.
3. Gath I., "Unsupervised Optimal Fuzzy Clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, July 1989.
4. Hall L.O., "A Comparison of Neural Network and Fuzzy Clustering Techniques in Segmenting Resonance Images of the Brain", IEEE Transactions on Neural Networks, Vol. 3, No. 5, September 1992.
5. Jin Y.G., Kwon O.S., Kim T.K., "Data Classification Based on Subtractive and Gravity Fuzzy C-Means Clustering", Fuzzy Logic and Intelligent Technologies for Nuclear Science and Industry, Proceedings of the 3rd International FLINS Workshop, Antwerp, Belgium, September 14-16, 1998.
6. Jiang H., "Generalized Fuzzy Clustering Model using Fuzzy C-Means".
7. Lim Y.W., Lee S.U.K., "On the Color Image Segmentation Algorithm Based on the Thresholding and the Fuzzy C-Means Technique", Pattern Recognition, Vol. 23, No. 9, pp. 935-952, 1990.
8. Xie X.L., "A Validity Measure for Fuzzy Clustering".