SlideShare a Scribd company logo
1 of 5
Download to read offline
International Journal of Engineering Research and Development
e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com
Volume 5, Issue 12 (February 2013), PP. 24-28

    Automated Clustering in K-Means Using Double Link
                  Cluster Tree (DLCT)
                            R.Ranga Raj1, Dr.M.Punithavalli2,
             1
               Head of the Department Computer Science, Hindusthan College of Arts and Science
                                                Coimbatore
            2
              Director, Department of Computer Applications, Sri Ramakrishna Engineering College
                                                Coimbatore

    Abstract:- Clustering is the process of grouping related documents from the large collection of
    database. The mining of such related documents from the enormous database which are unlabelled is a
    challenging one. To overcome this problem, clustering is used to filter the unlabelled documents from
    the large collection of database. Clustering can be achieved by various algorithms that differ
    significantly in their notion and how to efficiently find them. The Standard K-Means algorithm is a well
    known data mining algorithm which can effectively cluster data in the database. K-mean is a simple
    algorithm that has been adapted to many problem domains. Hence by using k-mean, the initializations of
    number of clusters can be done through manually. In this research paper, a new technique DLCT
    (Double Link Cluster Tree) is merged with the enhanced K-Mean algorithm which helps to makes
    clustering in an efficient manner by without initializing of number of clusters and optimal clusters. The
    result of k-mean with DLCT, which allows automatic determination of number of clusters on any type
    of data such as documents, images etc.
    General Terms Effective Clustering Using DLCT

    Keywords:- Clustering, K-Means Enhanced Approach Algorithm, Double Link Cluster Tree (DLCT),
    unsupervised clustering.

                                         I.      INTRODUCTION
          Clustering algorithm can be categorized based on several cluster method. A cluster is a set of points
such that a point in a cluster is closer to one or more other points in the cluster than to any point not in the
cluster. A good clustering method will produce high quality cluster in which the intra cluster similarity is high
and inter class similarity is low.
          For example, in an image related datasets it is difficult to identify how many clusters are available.
Image clustering which is an important technology for processing image that has been actively researched for a
long period of time. Recently the growth of interest in supervised method makes to improve the way of
representing image sets. Image clustering is the high level description of image content. Nowadays the
grayscale images are very important for analyzing image contents which has the application of satellite images
to medical images. Such analysis becomes very complex. By using certain mathematical approach [1] the
clustered grayscale images are determined with the optimal cluster number. The clustering process which
separates the data into number of segments those are in the form of n-dimensional space. These segmented data
uses a specific function which helps to model the data distribution [8]. Based on intra cluster and inter cluster
distance measure in the mathematical approach which allows the number of clusters to be determined
automatically.
          Cluster will be grow depend on the size of database. In other hand some existing subjects concentrates
on reducing iteration in K-means method [2] during the clustering process so to obtain an optimized cluster
output. These algorithm uses methods such as Genetic algorithm , PSO, Ant Colony Optimization (ACO) Using
Genetics algorithm(GA) and PSO these are the optimization technique, to reduce the no of iteration. The new
unsupervised k-means clustering algorithm [2] can be applied for the any type of datasets such as images,
documents etc. This clustering process depends on the size of the database and concentrates on reducing
iterations in K-Means method [1] to obtain an optimized cluster output.

                                       II.       EXISTING WORK
         Clustering can be achieved by various algorithms that differ significantly in their notion of what
constitutes a cluster and how to efficiently find them. If the cluster analysis is done on image clustering, the
growth of interest in unsupervised method makes to improve the way of representing image sets. There are

                                                       24
Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct)

many clustering algorithms that can be performed on the image but by using k-means algorithm on image will
obtain several segments. The k-mean clustering algorithm [2] will not provide the optimal cluster number and
also makes initialization of number of clusters on the given data set. One way to overcome this problem is the
mathematical approach [1], which is used to specify optimal cluster number with intra and inter cluster distance
measure and finally allows automatic determination of number of clusters.

                                     III.       K-MEANS ALGORITHM
         Now, using K-Means clustering algorithm we can cluster an image to obtain segments. To run this
algorithm, we need to provide the value of K which is nothing but the number of cluster centers.

3.1 Enhanced Approach
The algorithm for Enhanced K-Means is as Follows
Input:
D={d1,d2,d3,……..dn}//Set of n data points
C={c1,c2,…………….ck}//Set of k clusters
Output:
A set of k clusters

Steps:
1. Compute the distance of each data points di(1<=i<=n) to all the centroids cj(1<=j<=k) as d(di,cj).
2. For each data point di, find the closest centroid cj and assign di to cluster j.
3. Set cluster id[i]=j;//j:id of the closest cluster.
4. Set Nearest_Dist[i]=d(di,cj);
5. For each cluster j(1<=j<=k),recalculate the centroid;
6. Repeat;
7. for each data point di;
          7.1 Compute its distance from the centroid of the present nearest cluster;
          7.2 If this distance is less than or equal to the present nearest distance, the data point stay in this cluster;
          Else
          7.2.1 For every centroid cj (1<=j<=k) Compute the distance d(di,cj);
          End for;
          7.2.2 Assign the data point di to the cluster with the nearest centroid cj;
          7.2.3 Set Cluster id[i]=j;
          7.2.4 Set Nearest _Dist[i]=d(di,cj);
          Endfor;
          8. For each cluster j(1<=j<=k);
          Recalculate the centroids;
          Untill the convergence criteria is met.

                                                  IV.        DLCT
         DLCT is a Double Link Cluster tree which is used as the enhancement of Enhanced K-means
algorithm. In certain dataset there will need for algorithm which can cluster the data without any initialization
(i.e.) NO_OF_CLUSTERS.




                              Fig 1: The Diagrammatical representation of the DLCT

       DLCT is an algorithm, which works as a looping frame for Enhanced K-Means algorithm and makes
Enhanced k-means algorithm to cluster the dataset several times.

                                                           25
Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct)

The Algorithm for this method is shown below
Step 1: Get the Database as input for the purpose of processing Let it be DB
Step 2: Initialize the Level to 1 i.e., Level=1;
Step 3: Find mean for the DB let it is A
Step 4: By means of A, separation the DB Dataset is processed in to two groups let the First group be A1tmp
                   which contains
         the value should be minimum to A i.e. less than A. Let the Second group be A2tmp should contain the
value, less than
         A.
Step 5: Now have two clusters A1tmp and A2tmp.
Step 6: Like Step3 find mean for the A1tmp cluster and A2tmp Cluster Let it be B and C
Step 7: Now initialize K-Means Algorithm with Initial Cluster center as B and C and No of Cluster would be 2
Step 8: Begin to cluster the Database DB. When K-means algorithm finished clustering the Database after
                   certain Iterations the
        Outcome will be two clusters they are A1 and A2
Step 9: Increment the Level to 1 (Now Level =2;)
        (So at the level one, acquire 2 power 1 i.e 2 clusters at Level 2 and get 4 clusters vise versa)
Step 10: Now at Level 2 the same steps or procedure were handled for cluster A1 and A2 (Steps from 4 to 9)
          Level inside our algorithm an algorithm
Level 0 1 cluster (probably Original Dataset)
Level 1 2 clusters
Level 2 4 clusters
Level 3 8 clusters and so on;

                                       V.        PROPOSED WORK
          In the traditional K-Means algorithm it has limitations of getting no of cluster centers [4] by means of
its user. This makes K-Means difficult to use, where there is a unpredictable datasets are available. So to solve
this problem we proposed a method named DLCT (Double Link Cluster Tree). In certain dataset there will need
for algorithm which can cluster the data without any initialization.
          During the clustering processes are,
     1. Sometimes the cluster centers of the tree node would be same. The algorithm will merge the two
          clusters which has unique cluster center
     2. Due to data insufficiency some cluster will does not have any cluster element. So that cluster space was
          terminated.
          In the standard algorithm, the usage of K-Means algorithm [2] is allowed for clustering the datasets
only one time, hence the numbers of clusters are given manually. The concept K-Means Enhanced Approach
Algorithm with Double Link Cluster Tree (DLCT) focuses on clustering of documents and images in an
efficient way by without initializing the number of clusters. In DLCT, the K-Means algorithm is used as library
to process the clustering among all the Database types, images always contains the unpredictable amount of
cluster in the Fig 2 and 3. It shows input image before clustering and output image after clustering.




                                      Fig 2: Input image before clustering




                                      Fig 3: Output Image after Clustering

                                                       26
Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct)

Database Contains 200*200 Elements i.e. 40,000 items. Among them the cluster using DLCT is made as,
Cluster1=30307 Elements
Cluster2= 1300 Elements
Cluster3 =865 Elements
Cluster4=1997 Elements
Cluster5= 3167 Elements
Cluster6= 1621 Elements
Cluster7=743 Elements
For example, the Representation of Number of Clusters in the Character Database is shown in Fig 4 and Fig 5.




                                     Fig 4: Number of Character Database




                                     Fig 5: Number of Character Database

                                      VI.          CONCLUSION
          The clustering of datasets with Enhanced K-Mean algorithm [2] and DLCT helps to make clustering in
an efficient way, by without initializing the number of clusters in an unpredictable database. The designing of
Double Link Cluster Tree (DLCT) algorithm is done in such a way is solves the problems in previous
unsupervised methods implemented. When comparing the proposed algorithm with various existing clustering
algorithm, the analysis of clustering process results in the cluster centers. The proposed algorithm is also an
automatic optimization process since the separation of the clusters is done each time for every level of the
process. In which every cluster centers should have a difference of about 60% else those clusters that have less
difference below 60% will be merged and considered as single cluster. This process will be handles to avoid
minimum distanced cluster to be a separate clusters.

                                              REFERENCES
[1].    A Novel Approach for Determination of Optimal Number of Cluster”, Debashis Ganguly.Computer
        Science and Engineering,Department,Heritage Institute of Technology,Anandapur Kolkata – 700107,
        India
[2].    Improving the accuracy and efficiency of K-Mean Clustering Algorithm”, by K.A. Abdul Nazeer, M.P.
        Sebastian. Proceeding of the world congress on Engineering 2009 vol I WCE 2009, July 1-3, 2009,
        London, U.K.
[3].    Clustering Algorithms Based on Volume Criteria” Raghu Krishnapuram and Jongwoo Kim IEEE
        TRANSACTIONS ON FUZZY SYSTEMS, VOL. 8, NO. 2, APRIL 2000.
[4].    An Efficient k-Means Clustering Algorithm: Analysis and Implementation” Tapas Kanungo, Senior
        Member, IEEE, David M. Mount, Member, IEEE, Nathan S. Netanyahu, Member, IEEE, Christine D.
        Piatko, Ruth Silverman, and Angela Y. Wu, Senior Member, IEEE VOL. 24, NO. 7, JULY 2002


                                                      27
Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct)

[5].    A Comparison of Document Clustering Techniques”, Michael Steinbach,George Karypis. Department
        of Computer Science University of Minnesota Technical Report #00-034 steinbac, karypis,
        kumar@cs.umn.edu Vipin Kumar.
[6].    Document Image Segmentation Using Wavelet Scale–Space Features” Mausumi Acharyya and Malay
        K. Kundu, Senior Member, IEEE VOL. 12, NO. 12, DECEMBER 2002
[7].    Document Clustering in Correlation Similarity Measure Space” Taiping Zhang; Yuan Yan Tang; Bin
        Fang; Yong Xiang Knowledge and Data Engineering, IEEE Transactions on Volume: 24 ,,2012
[8].    A review on image segmentation techniques,Pattern Recognition” , N.R. Pal and S.K. Pal, vol. 26, pp.
        1277-1294, 1993.
[9].    A Modified K-Means Algorithm for Circular Invariant Clustering”, Dimitrios Charalampidis, Member,
        IEEE VOL. 27, NO. 12, DECEMBER 2005
[10].   Randomized Clustering Forests for Image Classification” Frank Moosmann, Student Member, IEEE,
        Eric Nowak, Student Member, IEEE, and Frederic Jurie, Member, IEEE Computer Society VOL. 30,
        NO. 9, SEPTEMBER 2008
[11].   Unsupervised Change Detection in Satellite Images Using Principal Component Analysis and k-Means
        Clustering”, Turgay Celik, IEEE GEOSCIENCE VOL. 6, NO. 4, OCTOBER 2009




                                                    28

More Related Content

What's hot

Clustering, k-means clustering
Clustering, k-means clusteringClustering, k-means clustering
Clustering, k-means clusteringMegha Sharma
 
Image classification using neural network
Image classification using neural networkImage classification using neural network
Image classification using neural networkBhavyateja Potineni
 
Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithmijsrd.com
 
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...IJERA Editor
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencleSAT Publishing House
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Alexander Decker
 
A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataAlexander Decker
 
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...cscpconf
 
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...IRJET Journal
 
Novel algorithms for Knowledge discovery from neural networks in Classificat...
Novel algorithms for  Knowledge discovery from neural networks in Classificat...Novel algorithms for  Knowledge discovery from neural networks in Classificat...
Novel algorithms for Knowledge discovery from neural networks in Classificat...Dr.(Mrs).Gethsiyal Augasta
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMLAI2
 

What's hot (17)

Clustering, k-means clustering
Clustering, k-means clusteringClustering, k-means clustering
Clustering, k-means clustering
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clustering
 
K means report
K means reportK means report
K means report
 
Image classification using neural network
Image classification using neural networkImage classification using neural network
Image classification using neural network
 
Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithms
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
 
Lect4
Lect4Lect4
Lect4
 
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...
An Efficient Frame Embedding Using Haar Wavelet Coefficients And Orthogonal C...
 
50120140505013
5012014050501350120140505013
50120140505013
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencl
 
Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...Analysis of mass based and density based clustering techniques on numerical d...
Analysis of mass based and density based clustering techniques on numerical d...
 
A fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming dataA fuzzy clustering algorithm for high dimensional streaming data
A fuzzy clustering algorithm for high dimensional streaming data
 
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
TOWARDS REDUCTION OF DATA FLOW IN A DISTRIBUTED NETWORK USING PRINCIPAL COMPO...
 
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...IRJET-  	  Finding Dominant Color in the Artistic Painting using Data Mining ...
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
 
Novel algorithms for Knowledge discovery from neural networks in Classificat...
Novel algorithms for  Knowledge discovery from neural networks in Classificat...Novel algorithms for  Knowledge discovery from neural networks in Classificat...
Novel algorithms for Knowledge discovery from neural networks in Classificat...
 
Visualization of Crisp and Rough Clustering using MATLAB
Visualization of Crisp and Rough Clustering using MATLABVisualization of Crisp and Rough Clustering using MATLAB
Visualization of Crisp and Rough Clustering using MATLAB
 
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and ArchitecturesMetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
 

Viewers also liked

IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD Editor
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD Editor
 
Experimental performance of flexural creep behavior of ferrocement slab
Experimental performance of flexural creep behavior of ferrocement slabExperimental performance of flexural creep behavior of ferrocement slab
Experimental performance of flexural creep behavior of ferrocement slabeSAT Publishing House
 
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD Editor
 

Viewers also liked (11)

www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
 
Experimental performance of flexural creep behavior of ferrocement slab
Experimental performance of flexural creep behavior of ferrocement slabExperimental performance of flexural creep behavior of ferrocement slab
Experimental performance of flexural creep behavior of ferrocement slab
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
IJERD(www.ijerd.com)International Journal of Engineering Research and Develop...
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 

Similar to Welcome to International Journal of Engineering Research and Development (IJERD)

A comprehensive survey of contemporary
A comprehensive survey of contemporaryA comprehensive survey of contemporary
A comprehensive survey of contemporaryprjpublications
 
K-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log DataK-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log Dataidescitation
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxSaiPragnaKancheti
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxSaiPragnaKancheti
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringIJCSIS Research Publications
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...Madan Golla
 
Introduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIntroduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIJSRD
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAprithan
 
Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2IAEME Publication
 
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTION
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTIONDECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTION
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTIONcscpconf
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyijpla
 
Review of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmReview of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmIRJET Journal
 
Decision Tree Clustering : A Columnstores Tuple Reconstruction
Decision Tree Clustering : A Columnstores Tuple ReconstructionDecision Tree Clustering : A Columnstores Tuple Reconstruction
Decision Tree Clustering : A Columnstores Tuple Reconstructioncsandit
 
Decision tree clustering a columnstores tuple reconstruction
Decision tree clustering  a columnstores tuple reconstructionDecision tree clustering  a columnstores tuple reconstruction
Decision tree clustering a columnstores tuple reconstructioncsandit
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...IOSR Journals
 
Mine Blood Donors Information through Improved K-Means Clustering
Mine Blood Donors Information through Improved K-Means ClusteringMine Blood Donors Information through Improved K-Means Clustering
Mine Blood Donors Information through Improved K-Means Clusteringijcsity
 
Clustering
ClusteringClustering
ClusteringMeme Hei
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningNatasha Grant
 

Similar to Welcome to International Journal of Engineering Research and Development (IJERD) (20)

Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
 
A comprehensive survey of contemporary
A comprehensive survey of contemporaryA comprehensive survey of contemporary
A comprehensive survey of contemporary
 
K-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log DataK-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log Data
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptx
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptx
 
Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means Clustering
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...
 
Introduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering EnsembleIntroduction to Multi-Objective Clustering Ensemble
Introduction to Multi-Objective Clustering Ensemble
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
 
Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2
 
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTION
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTIONDECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTION
DECISION TREE CLUSTERING: A COLUMNSTORES TUPLE RECONSTRUCTION
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracy
 
Review of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmReview of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering Algorithm
 
Decision Tree Clustering : A Columnstores Tuple Reconstruction
Decision Tree Clustering : A Columnstores Tuple ReconstructionDecision Tree Clustering : A Columnstores Tuple Reconstruction
Decision Tree Clustering : A Columnstores Tuple Reconstruction
 
Decision tree clustering a columnstores tuple reconstruction
Decision tree clustering  a columnstores tuple reconstructionDecision tree clustering  a columnstores tuple reconstruction
Decision tree clustering a columnstores tuple reconstruction
 
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizo...
 
Mine Blood Donors Information through Improved K-Means Clustering
Mine Blood Donors Information through Improved K-Means ClusteringMine Blood Donors Information through Improved K-Means Clustering
Mine Blood Donors Information through Improved K-Means Clustering
 
Clustering
ClusteringClustering
Clustering
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
 
Data clustering
Data clustering Data clustering
Data clustering
 

More from IJERD Editor

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksIJERD Editor
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEIJERD Editor
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...IJERD Editor
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’IJERD Editor
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignIJERD Editor
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationIJERD Editor
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...IJERD Editor
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRIJERD Editor
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingIJERD Editor
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string valueIJERD Editor
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...IJERD Editor
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeIJERD Editor
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...IJERD Editor
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraIJERD Editor
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...IJERD Editor
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...IJERD Editor
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingIJERD Editor
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...IJERD Editor
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart GridIJERD Editor
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderIJERD Editor
 

More from IJERD Editor (20)

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACE
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding Design
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and Verification
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive Manufacturing
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string value
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF Dxing
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart Grid
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powder
 

Welcome to International Journal of Engineering Research and Development (IJERD)

  • 1. International Journal of Engineering Research and Development e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com Volume 5, Issue 12 (February 2013), PP. 24-28 Automated Clustering in K-Means Using Double Link Cluster Tree (DLCT) R.Ranga Raj1, Dr.M.Punithavalli2, 1 Head of the Department Computer Science, Hindusthan College of Arts and Science Coimbatore 2 Director, Department of Computer Applications, Sri Ramakrishna Engineering College Coimbatore Abstract:- Clustering is the process of grouping related documents from the large collection of database. The mining of such related documents from the enormous database which are unlabelled is a challenging one. To overcome this problem, clustering is used to filter the unlabelled documents from the large collection of database. Clustering can be achieved by various algorithms that differ significantly in their notion and how to efficiently find them. The Standard K-Means algorithm is a well known data mining algorithm which can effectively cluster data in the database. K-mean is a simple algorithm that has been adapted to many problem domains. Hence by using k-mean, the initializations of number of clusters can be done through manually. In this research paper, a new technique DLCT (Double Link Cluster Tree) is merged with the enhanced K-Mean algorithm which helps to makes clustering in an efficient manner by without initializing of number of clusters and optimal clusters. The result of k-mean with DLCT, which allows automatic determination of number of clusters on any type of data such as documents, images etc. General Terms Effective Clustering Using DLCT Keywords:- Clustering, K-Means Enhanced Approach Algorithm, Double Link Cluster Tree (DLCT), unsupervised clustering. I. INTRODUCTION Clustering algorithm can be categorized based on several cluster method. A cluster is a set of points such that a point in a cluster is closer to one or more other points in the cluster than to any point not in the cluster. A good clustering method will produce high quality cluster in which the intra cluster similarity is high and inter class similarity is low. For example, in an image related datasets it is difficult to identify how many clusters are available. Image clustering which is an important technology for processing image that has been actively researched for a long period of time. Recently the growth of interest in supervised method makes to improve the way of representing image sets. Image clustering is the high level description of image content. Nowadays the grayscale images are very important for analyzing image contents which has the application of satellite images to medical images. Such analysis becomes very complex. By using certain mathematical approach [1] the clustered grayscale images are determined with the optimal cluster number. The clustering process which separates the data into number of segments those are in the form of n-dimensional space. These segmented data uses a specific function which helps to model the data distribution [8]. Based on intra cluster and inter cluster distance measure in the mathematical approach which allows the number of clusters to be determined automatically. Cluster will be grow depend on the size of database. In other hand some existing subjects concentrates on reducing iteration in K-means method [2] during the clustering process so to obtain an optimized cluster output. These algorithm uses methods such as Genetic algorithm , PSO, Ant Colony Optimization (ACO) Using Genetics algorithm(GA) and PSO these are the optimization technique, to reduce the no of iteration. The new unsupervised k-means clustering algorithm [2] can be applied for the any type of datasets such as images, documents etc. This clustering process depends on the size of the database and concentrates on reducing iterations in K-Means method [1] to obtain an optimized cluster output. II. EXISTING WORK Clustering can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. If the cluster analysis is done on image clustering, the growth of interest in unsupervised method makes to improve the way of representing image sets. There are 24
  • 2. Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct) many clustering algorithms that can be performed on the image but by using k-means algorithm on image will obtain several segments. The k-mean clustering algorithm [2] will not provide the optimal cluster number and also makes initialization of number of clusters on the given data set. One way to overcome this problem is the mathematical approach [1], which is used to specify optimal cluster number with intra and inter cluster distance measure and finally allows automatic determination of number of clusters. III. K-MEANS ALGORITHM Now, using K-Means clustering algorithm we can cluster an image to obtain segments. To run this algorithm, we need to provide the value of K which is nothing but the number of cluster centers. 3.1 Enhanced Approach The algorithm for Enhanced K-Means is as Follows Input: D={d1,d2,d3,……..dn}//Set of n data points C={c1,c2,…………….ck}//Set of k clusters Output: A set of k clusters Steps: 1. Compute the distance of each data points di(1<=i<=n) to all the centroids cj(1<=j<=k) as d(di,cj). 2. For each data point di, find the closest centroid cj and assign di to cluster j. 3. Set cluster id[i]=j;//j:id of the closest cluster. 4. Set Nearest_Dist[i]=d(di,cj); 5. For each cluster j(1<=j<=k),recalculate the centroid; 6. Repeat; 7. for each data point di; 7.1 Compute its distance from the centroid of the present nearest cluster; 7.2 If this distance is less than or equal to the present nearest distance, the data point stay in this cluster; Else 7.2.1 For every centroid cj (1<=j<=k) Compute the distance d(di,cj); End for; 7.2.2 Assign the data point di to the cluster with the nearest centroid cj; 7.2.3 Set Cluster id[i]=j; 7.2.4 Set Nearest _Dist[i]=d(di,cj); Endfor; 8. For each cluster j(1<=j<=k); Recalculate the centroids; Untill the convergence criteria is met. IV. DLCT DLCT is a Double Link Cluster tree which is used as the enhancement of Enhanced K-means algorithm. In certain dataset there will need for algorithm which can cluster the data without any initialization (i.e.) NO_OF_CLUSTERS. Fig 1: The Diagrammatical representation of the DLCT DLCT is an algorithm, which works as a looping frame for Enhanced K-Means algorithm and makes Enhanced k-means algorithm to cluster the dataset several times. 25
  • 3. Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct) The Algorithm for this method is shown below Step 1: Get the Database as input for the purpose of processing Let it be DB Step 2: Initialize the Level to 1 i.e., Level=1; Step 3: Find mean for the DB let it is A Step 4: By means of A, separation the DB Dataset is processed in to two groups let the First group be A1tmp which contains the value should be minimum to A i.e. less than A. Let the Second group be A2tmp should contain the value, less than A. Step 5: Now have two clusters A1tmp and A2tmp. Step 6: Like Step3 find mean for the A1tmp cluster and A2tmp Cluster Let it be B and C Step 7: Now initialize K-Means Algorithm with Initial Cluster center as B and C and No of Cluster would be 2 Step 8: Begin to cluster the Database DB. When K-means algorithm finished clustering the Database after certain Iterations the Outcome will be two clusters they are A1 and A2 Step 9: Increment the Level to 1 (Now Level =2;) (So at the level one, acquire 2 power 1 i.e 2 clusters at Level 2 and get 4 clusters vise versa) Step 10: Now at Level 2 the same steps or procedure were handled for cluster A1 and A2 (Steps from 4 to 9) Level inside our algorithm an algorithm Level 0 1 cluster (probably Original Dataset) Level 1 2 clusters Level 2 4 clusters Level 3 8 clusters and so on; V. PROPOSED WORK In the traditional K-Means algorithm it has limitations of getting no of cluster centers [4] by means of its user. This makes K-Means difficult to use, where there is a unpredictable datasets are available. So to solve this problem we proposed a method named DLCT (Double Link Cluster Tree). In certain dataset there will need for algorithm which can cluster the data without any initialization. During the clustering processes are, 1. Sometimes the cluster centers of the tree node would be same. The algorithm will merge the two clusters which has unique cluster center 2. Due to data insufficiency some cluster will does not have any cluster element. So that cluster space was terminated. In the standard algorithm, the usage of K-Means algorithm [2] is allowed for clustering the datasets only one time, hence the numbers of clusters are given manually. The concept K-Means Enhanced Approach Algorithm with Double Link Cluster Tree (DLCT) focuses on clustering of documents and images in an efficient way by without initializing the number of clusters. In DLCT, the K-Means algorithm is used as library to process the clustering among all the Database types, images always contains the unpredictable amount of cluster in the Fig 2 and 3. It shows input image before clustering and output image after clustering. Fig 2: Input image before clustering Fig 3: Output Image after Clustering 26
  • 4. Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct) Database Contains 200*200 Elements i.e. 40,000 items. Among them the cluster using DLCT is made as, Cluster1=30307 Elements Cluster2= 1300 Elements Cluster3 =865 Elements Cluster4=1997 Elements Cluster5= 3167 Elements Cluster6= 1621 Elements Cluster7=743 Elements For example, the Representation of Number of Clusters in the Character Database is shown in Fig 4 and Fig 5. Fig 4: Number of Character Database Fig 5: Number of Character Database VI. CONCLUSION The clustering of datasets with Enhanced K-Mean algorithm [2] and DLCT helps to make clustering in an efficient way, by without initializing the number of clusters in an unpredictable database. The designing of Double Link Cluster Tree (DLCT) algorithm is done in such a way is solves the problems in previous unsupervised methods implemented. When comparing the proposed algorithm with various existing clustering algorithm, the analysis of clustering process results in the cluster centers. The proposed algorithm is also an automatic optimization process since the separation of the clusters is done each time for every level of the process. In which every cluster centers should have a difference of about 60% else those clusters that have less difference below 60% will be merged and considered as single cluster. This process will be handles to avoid minimum distanced cluster to be a separate clusters. REFERENCES [1]. A Novel Approach for Determination of Optimal Number of Cluster”, Debashis Ganguly.Computer Science and Engineering,Department,Heritage Institute of Technology,Anandapur Kolkata – 700107, India [2]. Improving the accuracy and efficiency of K-Mean Clustering Algorithm”, by K.A. Abdul Nazeer, M.P. Sebastian. Proceeding of the world congress on Engineering 2009 vol I WCE 2009, July 1-3, 2009, London, U.K. [3]. Clustering Algorithms Based on Volume Criteria” Raghu Krishnapuram and Jongwoo Kim IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 8, NO. 2, APRIL 2000. [4]. An Efficient k-Means Clustering Algorithm: Analysis and Implementation” Tapas Kanungo, Senior Member, IEEE, David M. Mount, Member, IEEE, Nathan S. Netanyahu, Member, IEEE, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu, Senior Member, IEEE VOL. 24, NO. 7, JULY 2002 27
  • 5. Automated Clusting in K-Means Using Double Link Cluster Tree (Dlct) [5]. A Comparison of Document Clustering Techniques”, Michael Steinbach,George Karypis. Department of Computer Science University of Minnesota Technical Report #00-034 steinbac, karypis, kumar@cs.umn.edu Vipin Kumar. [6]. Document Image Segmentation Using Wavelet Scale–Space Features” Mausumi Acharyya and Malay K. Kundu, Senior Member, IEEE VOL. 12, NO. 12, DECEMBER 2002 [7]. Document Clustering in Correlation Similarity Measure Space” Taiping Zhang; Yuan Yan Tang; Bin Fang; Yong Xiang Knowledge and Data Engineering, IEEE Transactions on Volume: 24 ,,2012 [8]. A review on image segmentation techniques,Pattern Recognition” , N.R. Pal and S.K. Pal, vol. 26, pp. 1277-1294, 1993. [9]. A Modified K-Means Algorithm for Circular Invariant Clustering”, Dimitrios Charalampidis, Member, IEEE VOL. 27, NO. 12, DECEMBER 2005 [10]. Randomized Clustering Forests for Image Classification” Frank Moosmann, Student Member, IEEE, Eric Nowak, Student Member, IEEE, and Frederic Jurie, Member, IEEE Computer Society VOL. 30, NO. 9, SEPTEMBER 2008 [11]. Unsupervised Change Detection in Satellite Images Using Principal Component Analysis and k-Means Clustering”, Turgay Celik, IEEE GEOSCIENCE VOL. 6, NO. 4, OCTOBER 2009 28