IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727
Volume 9, Issue 3 (Mar. - Apr. 2013), PP 16-21
www.iosrjournals.org

A hybrid algorithm based on KFCM-HACO-FAPSO for clustering ECG beat

J. Mercy Geraldine (1), P. Kiruthiga (2)
(1) Head of the Department, Department of CSE, Srinivasan Engineering College, Perambalur, Tamilnadu, India
(2) M.E. Student, Department of CSE, Srinivasan Engineering College, Perambalur, Tamilnadu, India

Abstract: Data clustering is an essential technique for web applications and organizations. However, the clustering performance has to be optimized to form usable and efficient data clusters. Many optimization methods have been suggested to improve the clustering performance of fuzzy c-means clustering. The FAPSO and HACO optimization techniques have been proposed to improve the clustering performance. However, these conventional methods suffer from various restrictions, such as trapping in local minima and the lack of prior knowledge about the optimum parameters of the kernel functions. Kernel methods are used in the kernelized fuzzy c-means algorithm to improve the clustering performance of the well-known fuzzy c-means algorithm. This is achieved by mapping the given dataset non-linearly into a higher dimensional space, where the mapped data are more likely to be linearly separable. Data extracted from the MIT-BIH arrhythmia database are applied to the proposed method: domain features are extracted for each type, and training and test sets are formed. The algorithm can be used in various applications, such as web applications and the classification of ECG records.

Keywords - Fuzzy c-means, Kernelized fuzzy c-means, Ant colony optimization, Particle swarm optimization.

I. Introduction

Data clustering is the process of grouping data into classes or clusters such that the data in each cluster share a high degree of similarity while being very dissimilar to the data in other clusters. The ACO and PSO algorithms are among the modern evolutionary algorithms. Pattern recognition methods are useful in mining a dataset, and they can be categorized into two groups according to the learning procedure. Supervised learning requires prior labeling of the training data to create a model of the given dataset. A supervised learning algorithm analyses the given training dataset and creates an output. This output is then compared with the desired output (label) and an error or feedback signal is created. The algorithm then updates itself according to this feedback signal in order to create a model of the given dataset. Once the algorithm has terminated, the obtained model should generalize the training data such that an unknown input pattern given to the model is classified correctly. Unsupervised learning, in contrast, does not need prior labeling. It creates clusters from a given dataset according to a similarity measure, which is usually a distance function. After the clustering process, similar patterns are grouped in the same cluster and dissimilar patterns are grouped in different clusters.

In ECG classification algorithms, a similarity measure is used to measure the distances between the query beat and the templates in the database. The smaller the distance, the more similar the template is to the query. However, in many cases a data point can lie at a location that is almost equally distant from more than one cluster center. In such a situation, fuzzy clustering methods can prevent the misclassification of a query beat by using the membership values of the data points to each cluster.
Based on the above considerations, a number of recent studies have proposed fuzzy clustering algorithms for ECG beat classification. One of these is the fuzzy c-means (FCM) algorithm. After the clustering process, the FCM algorithm gives two outputs: the cluster centers and a fuzzy partition matrix that contains the membership value of each data point to each cluster. The obtained cluster centers or membership values are then used for classification.

A popular clustering algorithm is the hard c-means algorithm, which assigns each data point in a given dataset to exactly one cluster. Such an assignment can be inadequate because some data points can lie almost equally distant from two or more cluster centers. By forcing such a point into exactly one cluster, its similarity to the other clusters is totally ignored. For this reason, fuzzy clustering methods have been proposed. In fuzzy clustering methods a data point can belong to more than one cluster with different degrees of membership, which is especially useful when the clusters overlap.

The fuzzy c-means algorithm is used efficiently in several clustering problems, but it has two major drawbacks that reduce its performance and create the need for optimization. Firstly, it is sensitive to the initial values of the clusters and, secondly, it can easily be trapped in local minima. Therefore, several extensions of the FCM algorithm have been proposed to improve its performance. One of these is the kernelized fuzzy c-means algorithm (KFCM), which uses kernel methods to improve the clustering performance of the fuzzy c-means algorithm.

In KFCM a given dataset is mapped non-linearly to a higher dimensional space by a kernel function. Thus, the newly obtained dataset is more likely to be linearly separable. However, the KFCM algorithm also has the above-mentioned drawbacks. Additionally, there is no prior knowledge about the optimum parameters of the kernel functions.

Paper Organization

This paper is organized as follows: Section 2 introduces related work. Section 3 introduces the proposed method. Section 4 presents clustering with hybrid ant colony optimization and fuzzy adaptive particle swarm optimization. Section 5 presents the experiments of the efficiency evaluation. Section 6 gives conclusions and directions for future work.

II. Related Work

2.1 Fuzzy c-means algorithm

Fuzzy clustering is an important problem and the subject of active research in several real-world applications. The most popular fuzzy clustering technique is fuzzy c-means because it is efficient, straightforward, and easy to implement [4]. However, the fuzzy c-means algorithm is sensitive to initialization and can easily be trapped in local optima. It was later combined with colony optimization to obtain more efficient solutions. Particle swarm optimization (PSO) is a global optimization tool that is used in many optimization problems [10]. A hybrid fuzzy clustering method based on FCM and fuzzy PSO (FPSO) has been proposed that makes use of the qualities of both algorithms. A fuzzy method is used to determine the velocity vector of each particle, and the data are then clustered.
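The fuzzy partition matrix produced by FCM can be illustrated with a short sketch. The code below uses the standard textbook FCM membership update, u_ij = 1 / Σ_k (d_ij / d_ik)^(2/(m-1)), which this paper does not spell out. The function name `fcm_memberships` and the sample data are illustrative assumptions, and the centers are taken as given rather than optimized.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return the fuzzy partition matrix: row i holds the membership of
    point i in every cluster (standard FCM update, fuzzifier m > 1)."""
    exponent = 2.0 / (m - 1.0)
    partition = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a center: give it full membership
            # there (shared equally if several centers coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            partition.append([h / sum(hits) for h in hits])
            continue
        partition.append(
            [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        )
    return partition

# Hypothetical 2-D data: the first point is equidistant from both centers.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships([(2.0, 0.0), (1.0, 0.0)], centers)
```

The equidistant point receives equal membership in both clusters instead of being forced into one, which is the behaviour that motivates fuzzy clustering for borderline beats.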
K-means is one of the most popular hard clustering algorithms. It partitions data objects into k clusters, where the number of clusters, k, is decided in advance according to the purposes of the application. This model is unsuitable for real data sets in which there are no definite boundaries between the clusters [2]. Kernelized fuzzy c-means has been applied to noisy image segmentation to separate the noise from the image. Segmentation divides the image into appropriate points and then clusters the corresponding data into groups.

2.2 Colony Optimization

Particle swarm optimization can be combined with c-means clustering and ant colony optimization, and it can be implemented and applied easily to solve various function optimization problems [9]. Fuzzy c-means clustering is an effective algorithm, but because it selects the initial center points randomly, the iterative process easily settles on a local optimum [12]. The algorithmic flow in PSO starts with a population of particles whose positions represent the potential solutions of the problem and whose velocities are randomly initialized in the search space [3].

At each iteration, the search for the optimal position proceeds by updating the particle velocities and positions, and the fitness value of each particle's position is determined using a fitness function.

Extensions of ant colony optimization (ACO) to continuous domains are used. ACO, which was initially developed as a metaheuristic for combinatorial optimization, can be adapted to continuous optimization without any major conceptual change to its structure [6]. The extended ACO is compared with those algorithms, and an analysis of its efficiency and robustness is presented.

Optimization algorithms inspired by the foraging behavior of ants were initially proposed for solving combinatorial optimization problems (COPs) [8]. Many of these problems, especially those of practical relevance, are NP-hard. In other words, it is strongly believed that it is not possible to find efficient algorithms to solve them optimally.

Combinatorial optimization deals with finding optimal combinations of the available problem components. Hence, the problem must be partitioned into a finite set of components, and the combinatorial optimization algorithm attempts to find their optimal combination or permutation. Many real-world optimization problems can be represented as COPs in a straightforward way [12].

Continuous optimization is hardly a new research field. There exist numerous algorithms, including metaheuristics, that were developed for solving this type of problem [5]. In order to have a proper viewpoint on the performance of ACOR, it is compared not only to other ant-related methods, but also to other metaheuristics used for continuous optimization [7].
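The PSO iteration outlined in Section 2.2 (update each particle's velocity, then its position) can be sketched as follows. This is the common inertia-weight form of the update, which the paper itself does not write out. The function name `pso_step`, the default coefficients, and the simple velocity clamp `v_max` are illustrative assumptions.

```python
import random

def pso_step(positions, velocities, personal_best, neighborhood_best,
             inertia=0.7, c1=1.5, c2=1.5, v_max=1.0, rng=random):
    """Perform one PSO update for every particle and return the new
    (positions, velocities). Vectors are plain lists of floats."""
    new_positions, new_velocities = [], []
    for x, v, p in zip(positions, velocities, personal_best):
        next_x, next_v = [], []
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            vel = (inertia * v[d]
                   + c1 * r1 * (p[d] - x[d])                   # pull toward own best
                   + c2 * r2 * (neighborhood_best[d] - x[d]))  # pull toward neighborhood best
            vel = max(-v_max, min(v_max, vel))                 # keep the step bounded
            next_v.append(vel)
            next_x.append(x[d] + vel)
        new_positions.append(next_x)
        new_velocities.append(next_v)
    return new_positions, new_velocities

# With both learning factors set to zero, only the inertia term remains.
pos, vel = pso_step([[0.0, 0.0]], [[0.4, -4.0]], [[0.0, 0.0]], [0.0, 0.0],
                    inertia=0.5, c1=0.0, c2=0.0, v_max=1.0)
```

In a full optimizer this step would be followed by a fitness evaluation that refreshes `personal_best` and `neighborhood_best` before the next iteration.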
III. The Proposed Method

3.1 Overview of the proposed system

The datasets from the arrhythmia database are encoded. The dataset is preprocessed and normalized, and features are then extracted from it. The extracted features are divided into training sets and test sets. The training sets are used to find the optimum cluster centers and membership degrees using the KFCM, HACO and FAPSO algorithms. FAPSO is used to find the fitness value with which the cluster centers are calculated and the weight vector is evaluated. The cluster centers are then used to classify the test sets, as shown in Figure 1.

3.2 Algorithm used

3.2.1 Kernelized fuzzy c-means algorithm

The kernelized fuzzy c-means algorithm uses the kernel method for clustering. The kernel method was first implemented in the support vector machine. The kernel-based clustering concept is used to process the datasets with the Gaussian kernel function. The search space S is a set of continuous variables Xi, i = 1, ..., n. A solution s ∈ S in which each variable has a value assigned and which satisfies all the constraints in the set is a feasible solution for the continuous optimization variables.

The Gaussian kernel function provides the non-linear mapping of the dataset. The kernelized fuzzy c-means clustering algorithm restates the distance function of the fuzzy c-means algorithm in terms of this kernel.

3.2.2 Hybrid ant colony optimization

Ant colony optimization is used for clustering data; however, in its original form it cannot process continuous variables. To overcome this problem, hybrid versions of ant colony optimization have been proposed.

In the hybrid ant colony optimization technique, the Gaussian kernel function is used in the search space to handle continuous variables.

The search space S is a set of continuous variables Xi, i = 1, ..., n. A solution s ∈ S in which each variable has a value assigned and which satisfies all the constraints in the set Ω is a feasible solution of the given continuous optimization problem (CnOP). A solution s* ∈ S is called a global optimum if and only if f(s*) ≤ f(s) for all s ∈ S. Solving a CnOP requires finding at least one s* ∈ S*, where S* is the set of all global optima.

3.2.3 Fuzzy Adaptive Particle Swarm Optimization

The algorithm works as follows:
a) When the best fitness is found at the end of the run, a low inertia weight and high learning factors are often preferred.
b) When the best fitness stays at one value for a long time, the number of generations with unchanged best fitness is large. The inertia weight should be increased and the learning factors should be decreased.

In the FAPSO algorithm each particle searches for the optimum value while moving through the search space; hence it has a velocity. Each particle remembers the position where it had its best result. A particle has a neighborhood associated with it. A particle knows the fitness of the particles in its neighborhood and uses the position of the one with the best fitness. This position is used to adjust the particle's velocity. In this way the cluster centers are found and the fitness values are obtained.

The parameters used in these algorithms are k (size of the solution archive), q (locality of the search process), ξ (convergence speed), N(µ; σ) (Gaussian function with mean µ and standard deviation σ), α (learning rate) and F (differential evolution coefficient, chosen randomly). The ECG datasets are encoded and then preprocessed to reduce the noise. Each beat is then normalized to 128 points, and the features are extracted. The extracted features are divided into two groups to form the training and test sets. By using the training set and the proposed method, the optimum cluster centers and the corresponding membership degrees are found. These cluster centers and membership degrees are then used to classify ECG beats.
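Section 3.2.1 says that KFCM restates the FCM distance through a Gaussian kernel. A minimal sketch of that restated distance follows. For any kernel with K(x, x) = 1, the squared distance in the mapped space is ||Φ(x) - Φ(v)||² = K(x, x) + K(v, v) - 2K(x, v) = 2(1 - K(x, v)). The exact kernel scaling is not given in the paper, so the form exp(-||x - v||² / (2σ²)) used here is an assumption, as are the function names.

```python
import math

def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v); the 2*sigma^2 scaling is an assumed convention."""
    squared = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared / (2.0 * sigma ** 2))

def kernel_distance_sq(x, v, sigma):
    """Squared distance between x and v in the kernel-induced feature space."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

# Hypothetical feature vectors; sigma = 2.4 is one of the values tried in Section V.
near = kernel_distance_sq((0.0, 0.0), (1.0, 0.0), sigma=2.4)
far = kernel_distance_sq((0.0, 0.0), (3.0, 4.0), sigma=2.4)
```

Because this distance saturates at 2 for far-away points, σ controls how much of the feature space each cluster center effectively covers, which is consistent with the σ sensitivity reported in Section V.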
Figure 1 Architecture of the proposed method.

3.3 Functional procedure

step 1. Generate the solutions
step 2. For each particle, calculate the fitness value
step 3. If the fitness value is better than the best fitness value
step 4. Set the current value as the new best value
step 5. Find, in the particle neighborhood, the particle with the best fitness
step 6. Calculate the particle velocity according to the velocity equation
step 7. Apply the velocity constriction
step 8. Update the particle position according to the position equation
step 9. Sort the solutions according to fitness in descending order
step 10. Calculate the weight vector ω
step 11. Compute the σ values for the selected solutions
step 12. Generate new solutions from the selected solutions by using the computed σ values, and replace the old solutions with the new ones

IV. Clustering With HACO And FAPSO

In order to meaningfully restrict the number of queries that are similar to each other, one alternative is to cluster the queries in the workload based on query similarity. This can be done using a simple K-means clustering method. Using K-means, we cluster m queries into K clusters based on a predefined K and number of iterations.

In this paper the HACO-based kernelized fuzzy c-means algorithm is proposed. The solution set is initially encoded into a higher dimensional space. The solutions are encoded in the Gaussian functions Gi, i = 1, ..., n, where n is the dimension of the problem. After the solutions are added to the archive, they are sorted according to their fitness value. Then, the weight vector is calculated as

$$\omega_l = \frac{1}{qk\sqrt{2\pi}} \, e^{-\frac{(l-1)^2}{2q^2k^2}} \qquad (1)$$

This is the value of the Gaussian function with argument l, mean 1.0 and standard deviation qk.

To update the solution, the Gaussian functions are chosen according to the weight vector of Equation (1), and the chosen functions are then used to update the solution set. The σ values should also be encoded to find the optimum values. The σ values are encoded using Equation (2). The optimum values are generated and, using the weight vector, the fitness is evaluated. The solution set is represented in Table 1.

{ci1, ..., cik} = {si1, ..., sik} (2)

TABLE 1 Solution set

S11  ...  S1i  ...  S1n
S21  ...  S2i  ...  S2n
 .         .         .
Si1  ...  Sii  ...  Sin
 .         .         .
Sk1  ...  Ski  ...  Skn
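The archive handling in steps 9 to 12 and Equation (1) can be sketched as follows. The weights follow Equation (1) directly. The spread formula σ = ξ Σ_e |s_e − s_l| / (k − 1) is not stated in this paper; it is taken from the continuous-domain ACO formulation of Socha and Dorigo [8] and should be read as an assumption about what step 11 computes. All function names are illustrative.

```python
import math
import random

def rank_weights(k, q):
    """Equation (1): weight of the solution ranked l = 1..k in the sorted archive."""
    sd = q * k
    return [math.exp(-((l - 1) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))
            for l in range(1, k + 1)]

def choose_guide(weights, rng=random):
    """Pick an archive index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def sigma_for(archive, l, dim, xi):
    """Assumed spread for dimension dim around archive[l] (after Socha and Dorigo [8])."""
    total = sum(abs(s[dim] - archive[l][dim]) for s in archive)
    return xi * total / (len(archive) - 1)

def sample_solution(archive, weights, xi, rng=random):
    """Build one new solution by sampling each dimension from a Gaussian
    centered on a guide solution chosen by weight."""
    l = choose_guide(weights, rng)
    return [rng.gauss(archive[l][d], sigma_for(archive, l, d, xi))
            for d in range(len(archive[l]))]

# Hypothetical archive of k = 3 one-dimensional solutions, already sorted by fitness.
archive = [[0.0], [1.0], [3.0]]
weights = rank_weights(k=3, q=0.5)
candidate = sample_solution(archive, weights, xi=0.85, rng=random.Random(0))
```

A small q concentrates the weight on the best-ranked solutions, while a larger q spreads the search over the whole archive.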
The FAPSO algorithm is used to find the fitness value. The values needed for clustering are evaluated and updated over several iterations. Finally, the outputs are compared and the best-fitting σ value is obtained.

For the given training set the HACO algorithm is initialized. At each iteration, the fitness of a solution is evaluated, and after a certain number of iterations the optimum cluster centers and σ values are found.

The weight vectors are then calculated from the distance between the cluster centers and the output of the algorithm. This is performed using Equation (3).

$$W = (U^{T}U)^{-1}\,U^{T}\,T \qquad (3)$$

where U represents the fuzzy partition matrix and T represents the target output matrix.

With these weight vectors the test sets are classified. The classification is performed using the fuzzy partition matrix, and the output is used by the classifier.

V. Experimental Evaluation

We have evaluated each proposed model (HACO and FAPSO, KFCM) in isolation, and then compared these models with the combined model for efficiency. We also evaluated the efficiency of our clustering framework.

For the considered training set, the algorithm is initialized with the parameters mentioned above, and the optimum cluster centers and σ values are found. By using the obtained cluster centers and σ values, the weights for the classification stage of the proposed system are computed. The classification performance of the proposed system is then tested on the test samples. Several experiments are performed for a given number of clusters. Classification results for the FCM and KFCM algorithms with 6, 10, 15, 20, 25, 30, 35 and 40 clusters are obtained through FAPSO. The FAPSO algorithm finds the fitness value over several iterations by calculating the particle velocity in the particle neighborhood.

Finally, after the velocities are found, the σ values are updated.
The σ value is kept constant for the Gaussian kernels of the KFCM algorithm, and several experiments are performed with different σ values to obtain the optimum value. Several runs are also performed with different values of the fuzzifier exponent (m) to determine its optimum value. After experimenting with several values, m = 2 is chosen for the FCM and KFCM algorithms. All the results are averages over ten experiments. It is shown that the KFCM algorithm with 15 clusters and σ = 2.4 is superior to the FCM algorithm with a total of 20 clusters. Another set of experiments is performed keeping σ = 1.2 and σ = 4.0 for all clusters and again searching for the optimum centers, but the classification performance of the proposed system decreases. The classification performance of the KFCM algorithm depends strongly on the σ value. If there are not enough clusters, choosing small σ values decreases the classification performance of the KFCM algorithm, so in this case it is necessary to increase the σ value to cover enough area in the feature space. In contrast, if there are enough clusters, selecting large σ values decreases the classification performance. The obtained results confirm this evaluation. Figure 2 shows the performance analysis of the clustering. Therefore, the HACO-based KFCM together with FAPSO performs better than the traditional algorithm.

Ideally, we would have preferred to compare our approach against existing clustering schemes in databases. However, what has been addressed in the literature is clustering with the fuzzy c-means algorithm. Hence, we have compared the proposed clustering method with it to indicate the effectiveness of each method with respect to the other.

5.1 Efficiency Evaluation

The goal of this study was to determine whether our framework can be incorporated into a real-world application. The ECG beats are classified with the proposed method and the efficiency is evaluated.
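The classification stage evaluated here relies on the least-squares weights of Equation (3), W = (UᵀU)⁻¹UᵀT. The sketch below solves the equivalent normal equations (UᵀU)W = UᵀT with a small Gauss-Jordan routine in plain Python. It assumes U has one row per training beat and one column per cluster, and that T is one-hot; the paper does not state these layouts, and the helper names and sample values are hypothetical.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b for square a by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("U^T U is singular; Equation (3) has no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def output_weights(u, t):
    """Equation (3), computed via the normal equations instead of an explicit inverse."""
    ut = transpose(u)
    return solve(matmul(ut, u), matmul(ut, t))

def classify(u_test, w):
    """Assign each test row to the class with the largest output U_test @ W."""
    return [max(range(len(row)), key=row.__getitem__) for row in matmul(u_test, w)]

# Hypothetical memberships of three training beats in two clusters, with one-hot targets.
U = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
W = output_weights(U, T)
labels = classify([[0.8, 0.2], [0.1, 0.9]], W)
```

Solving the normal equations avoids forming the inverse explicitly; for ill-conditioned partition matrices a library least-squares solver would be the more robust choice.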
Figure 2 Performance analysis of clustering

The cluster centers and memberships are processed and the efficient values are chosen. The efficiency of the conventional method and that of the proposed method are compared. The proposed method gives better sensitivity of the cluster centers than the other methods. Clustering with the obtained values gives a better cluster partition. Likewise, all other datasets are clustered with the obtained cluster centers and membership values. Figure 2 presents the efficiency of fuzzy c-means compared with hybrid ant colony optimization, where the hybrid ant colony optimization is based on the kernelized fuzzy c-means algorithm.

VI. Conclusion

In this paper, an optimization method is used to improve the clustering performance of the kernelized fuzzy c-means algorithm. We proposed a combination of two different algorithms, namely hybrid ant colony optimization (HACO) and fuzzy adaptive particle swarm optimization (FAPSO). HACO optimizes both the kernel function parameter and the cluster centers. The proposed algorithm outputs the optimized set of cluster centers that minimizes the objective function of the traditional KFCM algorithm, and it can find application in areas such as web applications and the classification of ECG records. In future work, a neural network may obtain better performance than the other algorithms.

References

[1] Berat Dogan, M. Korurek, “A new ECG beat clustering method based on kernelized fuzzy c-means and hybrid ant colony optimization for continuous domain”, Applied Soft Computing, 2012, pp 3442–3451.
[2] B. Biswal, P.K. Dash, S. Mishra, “A hybrid ant colony optimization technique for power signal pattern classification”, Expert Systems with Applications, May 2011, pp 6368–6375.
[3] Dao-Qiang Zhang, Song-Can Chen, “A novel kernelized fuzzy C-means algorithm with application in medical image segmentation”, Artificial Intelligence in Medicine, September 2004, pp 37–50.
[4] Hesam Izakian, Ajith Abraham, “Fuzzy C-means and fuzzy swarm for fuzzy clustering problem”, Expert Systems with Applications 38 (March (3)) (2011), pp 1835–1838.
[5] T. Ince, S. Kiranyaz, M. Gabbouj, “A generic and robust system for automated patient-specific classification of ECG signals”, IEEE Transactions on Biomedical Engineering 56 (May (5)) (2009), pp 1415–1426.
[6] Jing Xiao, LiangPing Li, “A hybrid ant colony optimization for continuous domains”, Expert Systems with Applications, 2011, pp 11072–11077.
[7] Julia Handl, Bernd Meyer, “Ant-based and swarm-based clustering”, Swarm Intelligence 1 (2) (2007), pp 95–113.
[8] Krzysztof Socha, Marco Dorigo, “Ant colony optimization for continuous domains”, European Journal of Operational Research, 2008, pp 1155–1173.
[9] T. Niknam, B. Amiri, “An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis”, Applied Soft Computing 10 (January (1)) (2010), pp 183–197.
[10] Qiang Niu, Xinjian Huang, “An improved fuzzy C-means clustering algorithm based on PSO”, Journal of Software 6 (5 May) (2011), pp 873–879.
[11] T.A. Runkler, C. Katz, “Fuzzy Clustering by Particle Swarm Optimization”, IEEE International Conference on Fuzzy Systems, 2006, pp 601–608.
[12] L. Wang, Y. Liu, X. Zhao, Y. Xu, “Particle swarm optimization for fuzzy c-means clustering”, The Sixth World Congress on Intelligent Control and Automation (WCICA 2006), vol. 2, 2006, pp 6055–6058.
[13] Yanfang Han, Pengfei Shi, “An improved ant colony algorithm for fuzzy clustering in image segmentation”, Neurocomputing, 2007, pp 665–671.
[14] Yun-Chi Yeh, Wen-June Wang, Che Wun Chiou, “A novel fuzzy c-means method for classifying heartbeat cases from ECG signals”, Measurement 43 (December (10)) (2010), pp 1542–1555.
[15] Taher Niknam, Bahman Bahmani Firouzi, Majid Nayeripour, “An Efficient Hybrid Evolutionary Algorithm for Cluster Analysis”, World Applied Sciences Journal 4 (2), 2008, pp 300–307.