Abstract: Clustering deals with grouping similar objects. Unlike classification, clustering groups a set of objects and tries to discover whether some relationship exists among them, whereas in classification a set of predefined classes is known and the task is only to determine which class an object belongs to. Simply put, classification is a supervised learning technique and clustering is an unsupervised one. Clustering techniques apply when there is no class to be predicted but the instances are instead to be divided into natural groups. These clusters presumably reflect some mechanism at work in the domain from which the instances are drawn, a mechanism that causes some instances to bear a stronger resemblance to each other than to the remaining instances. Clustering therefore requires techniques different from classification and association learning methods. Clustering has applications in many fields; in software engineering it supports reverse engineering, software maintenance, and system re-building, and it aims at breaking a larger problem into small, understandable elements. Since clustering has no single prescribed methodology, many approaches are available, both traditional and evolutionary. In this paper various methods of both kinds are described and some of them are compared. Each method has its own advantages and can be chosen according to the needs of the user.
Keywords: Clustering, Classification, Software Engineering, Traditional, Evolutionary.
IJRET : International Journal of Research in Engineering and Technology is an international peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
This document provides an overview of several clustering algorithms. It begins by defining clustering and its importance in data mining. It then categorizes clustering algorithms into four main types: partitional, hierarchical, grid-based, and density-based. For each type, some representative algorithms are described briefly. The document also reviews several popular clustering algorithms like k-means, CLARA, PAM, CLARANS, and BIRCH in more detail. It discusses aspects like the algorithms' time complexity, types of data handled, ability to detect clusters of different shapes, required input parameters, and advantages/disadvantages. Overall, the document aims to guide selection of suitable clustering algorithms for specific applications by surveying their key characteristics.
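The partitional family surveyed above, k-means in particular, can be made concrete with a minimal sketch. This is an illustrative toy implementation, not any surveyed paper's method: the 2-D points and the deterministic initialization from the first k points are assumptions made for the example.

```python
def kmeans(points, k, iters=10):
    """Plain k-means: assign each 2-D point to its nearest centroid,
    recompute each centroid as its cluster mean, and repeat."""
    centroids = points[:k]  # deterministic init, an assumption for this example
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # nearest centroid by squared Euclidean distance
            i = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[i].append(p)
        # recompute centroids; keep the old one if a cluster went empty
        centroids = [(sum(p[0] for p in cl) / len(cl),
                      sum(p[1] for p in cl) / len(cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters

# two well-separated toy blobs
pts = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
cents, cls = kmeans(pts, 2)
```

On these well-separated blobs the algorithm converges after one pass, splitting the six points into two groups of three.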
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING - ijcsa
This document provides a survey of optimization approaches that have been applied to text document clustering. It discusses several clustering algorithms and categorizes them as partitioning methods, hierarchical methods, density-based methods, grid-based methods, model-based methods, frequent pattern-based clustering, and constraint-based clustering. It then describes several soft computing techniques that have been used as optimization approaches for text document clustering, including genetic algorithms, bees algorithms, particle swarm optimization, and ant colony optimization. These optimization techniques perform a global search to improve the quality and efficiency of document clustering algorithms.
Cancer data partitioning with data structure and difficulty independent clust... - IRJET Journal
This document discusses cancer data partitioning using clustering techniques. It begins with an introduction to clustering concepts and different clustering methods like k-means, hierarchical agglomerative clustering, and partitioning methods. It then reviews literature on clustering algorithms and ensemble methods applied to problems like speaker diarization and tumor clustering from gene expression data. The document analyzes issues with existing clustering methodology and proposes a new dynamic ensemble membership selection scheme to support data structure and complexity independent clustering for cancer data partitioning. The method combines partition around medoids clustering with an incremental semi-supervised cluster ensemble framework to improve healthcare data partitioning accuracy.
This document summarizes a research paper on clustering algorithms in data mining. It begins by defining clustering as an unsupervised learning technique that organizes unlabeled data into groups of similar objects. The document then reviews different types of clustering algorithms and methods for evaluating clustering results. Key steps in clustering include feature selection, algorithm selection, and cluster validation to assess how well the derived groups represent the underlying data structure. A variety of clustering algorithms exist and must be chosen based on the problem characteristics.
The document provides a literature review of different clustering techniques. It begins by defining clustering and its applications. It then categorizes and describes several clustering methods including hierarchical (BIRCH, CURE, CHAMELEON), partitioning (k-means, k-medoids), density-based (DBSCAN, OPTICS, DENCLUE), grid-based (CLIQUE, STING, MAFIA), and model-based (RBMN, SOM) methods. For each method, it discusses the algorithm, advantages, disadvantages and time complexity. The document aims to provide an overview of various clustering techniques for classification and comparison.
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
The role of materialized views is becoming vital in today's distributed data warehouses. Materialization is where parts of the data cube are pre-computed. Some real-time distributed architectures maintain materialization transparency, in the sense that users are not aware of the materialization at a node. These architectures usually follow a cache maintenance mechanism in which query results are cached. When a query requesting materialization arrives at a distributed node, the node checks its cache and, if the materialization is available, answers the query. If the materialization is not available, the node forwards the query through the network until a node that can answer with the requested materialization is found. This type of network communication increases the number of query forwardings between nodes. The aim of this paper is to reduce these multiple redirects: we propose a new CB-pattern tree indexing scheme to identify the exact distributed node where the needed materialization is available.
Survey Paper on Clustering Data Streams Based on Shared Density between Micro... - IRJET Journal
This document discusses a survey of clustering data streams based on shared density between micro-clusters. It describes how current reclustering approaches for data stream clustering ignore density information between micro-clusters, which can result in inaccurate cluster assignments. The paper proposes DBSTREAM, a new approach that captures shared density between micro-clusters using a density graph. This density information is then used in the reclustering process to generate final clusters based on actual density between adjacent micro-clusters rather than assumptions about data distribution.
TWO PARTY HIERARICHAL CLUSTERING OVER HORIZONTALLY PARTITIONED DATA SET - IJDKP
This document summarizes a research paper that proposes a two-party hierarchical clustering approach for horizontally partitioned data to enable privacy-preserving data mining. The key points are:
1) The paper presents an approach for applying hierarchical clustering across two parties that hold horizontally partitioned data, with the goal of preserving privacy.
2) Each party independently computes k-cluster centers on their own data and encrypts the distance matrices before sharing. Hierarchical clustering is then applied to merge the clusters.
3) An algorithm is provided for identifying the closest cluster for each data point based on the merged distance matrices.
4) The approach is analyzed and compared to other clustering techniques, demonstrating that it has lower computational complexity.
Iaetsd a survey on one class clustering - Iaetsd Iaetsd
This document presents a new method for performing one-to-many data linkage called the One Class Clustering Tree (OCCT). The OCCT builds a tree structure with inner nodes representing features of the first dataset and leaves representing similar features of the second dataset. It uses splitting criteria and pruning methods to perform the data linkage more accurately than existing indexing techniques. The OCCT approach induces a decision tree using a splitting criterion and performs pre-pruning to determine which branches to trim. It then compares entities to match them between the two datasets and produces a final result.
This document describes an expandable Bayesian network (EBN) approach for 3D object description from multiple images and sensor data. The key points are:
- EBNs can dynamically instantiate network structures at runtime based on the number of input images, allowing the use of a varying number of evidence features.
- EBNs introduce the use of hidden variables to handle correlation of evidence features across images, whereas previous approaches did not properly model this.
- The document presents an application of an EBN for building detection and description from aerial images using multiple views and sensor data. Experimental results showed the EBN approach provided significant performance improvements over other methods.
A Novel Clustering Method for Similarity Measuring in Text Documents - IJMER
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed, online journal. It serves as an international archival forum of scholarly research related to engineering and science education.
Clustering Algorithm Based On Correlation Preserving Indexing - IOSR Journals
Abstract: Fast retrieval of relevant information from databases has always been a significant issue. Different techniques have been developed for this purpose; one of them is data clustering. In this paper data clustering is discussed along with its applications and Correlation Preserving Indexing. We propose a CPI (Correlation Preserving Indexing) algorithm and relate it to structural differences between the data sets.
Keywords: Data Clustering, Data Mining, Clustering techniques, Correlation Preserving Indexing.
This document presents a feature clustering algorithm to reduce the dimensionality of feature vectors for text classification. The algorithm groups words in documents into clusters based on similarity, with each cluster characterized by a membership function. Words not similar to existing clusters form new clusters. This avoids specifying features in advance and the need for trial and error. Experimental results showed the method can classify text faster and with better extracted features than other methods.
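The incremental idea described above (assign a word to a sufficiently similar existing cluster, otherwise start a new one) can be sketched as follows. The cosine-similarity threshold, the toy word vectors, and the running-mean centroid update are illustrative assumptions, not the paper's exact membership function.

```python
import math

def cos(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def incremental_word_clusters(vectors, threshold=0.8):
    """Incremental feature clustering sketch: each word joins the first
    cluster whose centroid it resembles closely enough; otherwise it
    seeds a new cluster, so the number of clusters need not be fixed
    in advance."""
    clusters = []
    for word, v in vectors.items():
        best = next((c for c in clusters
                     if cos(v, c["centroid"]) >= threshold), None)
        if best is None:
            clusters.append({"centroid": list(v), "members": [word]})
        else:
            n = len(best["members"])
            # running-mean update of the cluster centroid
            best["centroid"] = [(ci * n + vi) / (n + 1)
                                for ci, vi in zip(best["centroid"], v)]
            best["members"].append(word)
    return clusters

vecs = {"cat": (1.0, 0.0), "dog": (0.9, 0.1), "stock": (0.0, 1.0)}
cl = incremental_word_clusters(vecs, threshold=0.8)
```

Here "cat" and "dog" fall into one cluster while "stock" starts its own, without the number of clusters being specified in advance.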
This document summarizes a research paper that evaluates cluster quality using a modified density subspace clustering approach. It discusses how density subspace clustering can be used to identify clusters in high-dimensional datasets by detecting density-connected clusters in all subspaces. The proposed approach uses a density subspace clustering algorithm to select attribute subsets and identify the best clusters. It then calculates intra-cluster and inter-cluster distances to evaluate cluster quality and compares the results to other clustering algorithms in terms of accuracy and runtime. Experimental results showed that the proposed method improves clustering quality and performs faster than existing techniques.
This document provides an overview and summary of Pankaj Jajoo's 2008 master's thesis on improving document clustering algorithms. The thesis explores two approaches: 1) preprocessing the graph representation of documents to remove noise before applying standard graph partitioning algorithms, and 2) clustering words first before clustering documents to reduce noise. Experimental results on three datasets show these approaches improve clustering quality over standard K-Means clustering. The thesis provides background on clustering, reviews existing document clustering methods, and describes the two new algorithms and evaluation of their performance.
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach - IJECEIAES
The document presents a new approach called Bat-Cluster (BC) for automated graph clustering. BC combines the Fast Fourier Domain Positioning (FFDP) algorithm and the Bat Algorithm. FFDP positions graph nodes, then Bat Algorithm optimizes clustering by finding configurations that minimize the Davies-Bouldin Index. BC is tested on four benchmark graphs and outperforms Particle Swarm Optimization, Ant Colony Optimization, and Differential Evolution in providing higher clustering precision.
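Since Bat-Cluster scores candidate clusterings by the Davies-Bouldin Index, a small self-contained computation of that index may help; the toy 2-D clusters below are assumptions for illustration, not data from the paper.

```python
import math

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters of 2-D points;
    lower values mean tighter, better-separated clusters."""
    def centroid(c):
        return (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    cents = [centroid(c) for c in clusters]
    # average within-cluster scatter around each centroid
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # worst-case similarity of cluster i to any other cluster
        total += max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                     for j in range(k) if j != i)
    return total / k

tight = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
loose = [[(0, 0), (0, 1), (10, 10)], [(1, 0), (10, 11), (11, 10)]]
```

The well-separated partition scores lower than the mixed one, which is exactly why an optimizer such as the Bat Algorithm can use this index as its objective to minimize.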
Privacy Preserving Clustering on Distorted Data - IOSR Journals
- The document discusses privacy-preserving clustering on distorted data using singular value decomposition (SVD) and sparsified singular value decomposition (SSVD).
- It applies SVD and SSVD to distort a real-world dataset of 100 terrorists with 42 attributes, generating distorted datasets.
- K-means clustering is then performed on the original and distorted datasets for different numbers of clusters (k). The results show that SSVD more effectively groups the data objects into clusters compared to the original and SVD-distorted datasets, while preserving data privacy as measured by various metrics.
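A minimal sketch of the SVD/SSVD distortion step described above, assuming numpy and toy random data rather than the paper's terrorist dataset: truncate to the top-r singular values, then zero near-zero factor entries as an SSVD-style sparsification (the threshold eps is an illustrative assumption).

```python
import numpy as np

def svd_distort(X, r, eps=1e-2):
    """Distort a data matrix by truncated SVD: keep the top-r singular
    values, sparsify the factor matrices by zeroing near-zero entries
    (an SSVD-style step), and reconstruct."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
    U_r = np.where(np.abs(U_r) < eps, 0.0, U_r)    # sparsify left factor
    Vt_r = np.where(np.abs(Vt_r) < eps, 0.0, Vt_r)  # sparsify right factor
    return U_r @ np.diag(s_r) @ Vt_r

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))   # toy stand-in for the record/attribute matrix
Xd = svd_distort(X, r=2)
```

The distorted matrix keeps the same shape but at most rank r, so individual values differ from the originals while the dominant structure that clustering relies on is preserved.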
The document presents an algorithm for k-medoid clustering based on Ant Colony Optimization (ACO) called ACO-MEDOIDS. It first provides background on data mining, clustering, and related clustering algorithms such as k-means and k-medoids. It then describes how ACO is adapted to solve the k-medoid clustering problem by using ants to explore the search space and iteratively update pheromone trails to find an optimal set of medoids, or cluster representative points. The ACO-MEDOIDS algorithm aims to address some limitations of traditional k-medoid clustering.
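To make the objective that ACO-MEDOIDS optimizes concrete, here is a brute-force k-medoids baseline for tiny data sets; the Manhattan distance and toy points are assumptions, and real ACO replaces this exhaustive search with pheromone-guided ants exploring the space of medoid sets.

```python
from itertools import combinations

def k_medoids(points, k):
    """Exhaustive k-medoids for tiny data sets: pick the k data points
    that minimise the total distance from every point to its nearest
    medoid. This defines the objective; it does not scale, which is why
    heuristics such as ACO are used on real data."""
    def d(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan distance
    def cost(meds):
        return sum(min(d(p, m) for m in meds) for p in points)
    return min(combinations(points, k), key=cost)

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
meds = k_medoids(pts, 2)
```

Unlike k-means centroids, the medoids are always actual data points, which makes the method robust to outliers and usable with arbitrary distance functions.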
This document discusses hierarchical clustering and similarity measures for document clustering. It summarizes that hierarchical clustering creates a hierarchical decomposition of data objects through either agglomerative or divisive approaches. The success of clustering depends on the similarity measure used, with traditional measures using a single viewpoint, while multiviewpoint measures use different viewpoints to increase accuracy. The paper then focuses on applying a multiviewpoint similarity measure to hierarchical clustering of documents.
A Soft Set-based Co-occurrence for Clustering Web User Transactions - TELKOMNIKA JOURNAL
This document proposes a soft set-based approach for clustering web user transactions to achieve lower computational complexity and higher clustering purity than previous rough set approaches. Unlike rough set approaches that use similarity, the proposed approach uses a co-occurrence approach based on soft set theory. The soft set representation of web user transactions allows them to be modeled as a binary-valued information system. The approach is evaluated against two previous rough set-based approaches, demonstrating better performance, with substantially lower computational complexity and higher cluster purity.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Abstract
The exponential growth of knowledge on the World Wide Web has underscored the need to develop economical and effective ways of organizing relevant content. In the field of web computing, document clustering plays a vital role and poses an interesting and challenging problem. Document clustering is mainly used for grouping similar documents in a search engine. The web also has a rich and dynamic collection of hyperlink information. Retrieving relevant documents from the internet is a complicated task: based on the user's query, documents are retrieved from various databases to give relevant and additional information for the given query. The documents are already clustered based on keyword extraction and stored in the database. The probabilistic relational approach to web document clustering finds the relation between two linked pages and defines a relational clustering algorithm based on a probabilistic graph representation. In this document clustering, both the content information and the hyperlink structure of each web page are considered, and each document is viewed as a semantic unit. It also provides additional information to the user.
Keywords: Document Clustering, Agglomerative Clustering, Entropy, F-Measure
Mastering Hierarchical Clustering: A Comprehensive Guide - cyberprosocial
In the world of data analysis and machine learning, hierarchical clustering is a really important technique that helps us understand how different pieces of data are related to each other. This article is here to explain hierarchical clustering in a way that’s easy to understand, breaking down its main ideas, how it’s used, and the benefits it brings.
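The bottom-up merging the article describes can be shown with a minimal agglomerative (single-linkage) sketch; the toy 2-D points and the choice of single linkage are assumptions for illustration.

```python
def single_linkage(points, k):
    """Agglomerative clustering sketch: start with every point in its
    own cluster and repeatedly merge the two closest clusters, where
    cluster distance is the distance between their closest members
    (single linkage), until only k clusters remain."""
    def d(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest cluster pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dist = min(d(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
cl = single_linkage(pts, 2)
```

Stopping the merging at different values of k cuts the hierarchy at different levels, which is exactly the dendrogram view that makes hierarchical clustering useful for understanding how data points relate.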
An Analysis On Clustering Algorithms In Data Mining - Gina Rizzo
This document summarizes and analyzes various clustering algorithms used in data mining. It begins with an introduction to clustering and discusses hierarchical and partitional clustering approaches. Specific algorithms discussed include k-means, single linkage, and fuzzy clustering. The document compares techniques on aspects like complexity and sensitivity. It concludes that choosing the best clustering algorithm depends on factors like the data type, size, and desired cluster structure. The ideal is to have criteria to automatically select the appropriate algorithm for a given data set.
Clustering Approach Recommendation System using Agglomerative Algorithm - IRJET Journal
The document discusses clustering approaches and agglomerative algorithms for recommendation systems. Clustering algorithms aim to group similar objects into clusters while maximizing dissimilarity between objects in different clusters. The document reviews various hierarchical and partitioning clustering methods presented in previous literature, and then proposes a new agglomerative clustering approach for recommendation systems that identifies clusters through simple calculations.
This document summarizes a research paper that proposes a new density-based clustering technique called Triangle-Density Based Clustering Technique (TDCT) to efficiently cluster large spatial datasets. TDCT uses a polygon approach where the number of data points inside each triangle of a polygon is calculated to determine triangle densities. Triangle densities are used to identify clusters based on a density confidence threshold. The technique aims to identify clusters of arbitrary shapes and densities while minimizing computational costs. Experimental results demonstrate the technique's superiority in terms of cluster quality and complexity compared to other density-based clustering algorithms.
UNIT - 4: Data Warehousing and Data Mining - Nandakumar P
UNIT-IV
Cluster Analysis: Types of Data in Cluster Analysis – A Categorization of Major Clustering Methods – Partitioning Methods – Hierarchical Methods – Density-Based Methods – Grid-Based Methods – Model-Based Clustering Methods – Clustering High-Dimensional Data – Constraint-Based Cluster Analysis – Outlier Analysis.
Survey Paper on Clustering Data Streams Based on Shared Density between Micro...IRJET Journal
This document discusses a survey of clustering data streams based on shared density between micro-clusters. It describes how current reclustering approaches for data stream clustering ignore density information between micro-clusters, which can result in inaccurate cluster assignments. The paper proposes DBSTREAM, a new approach that captures shared density between micro-clusters using a density graph. This density information is then used in the reclustering process to generate final clusters based on actual density between adjacent micro-clusters rather than assumptions about data distribution.
TWO PARTY HIERARICHAL CLUSTERING OVER HORIZONTALLY PARTITIONED DATA SETIJDKP
This document summarizes a research paper that proposes a two-party hierarchical clustering approach for horizontally partitioned data to enable privacy-preserving data mining. The key points are:
1) The paper presents an approach for applying hierarchical clustering across two parties that hold horizontally partitioned data, with the goal of preserving privacy.
2) Each party independently computes k-cluster centers on their own data and encrypts the distance matrices before sharing. Hierarchical clustering is then applied to merge the clusters.
3) An algorithm is provided for identifying the closest cluster for each data point based on the merged distance matrices.
4) The approach is analyzed and compared to other clustering techniques, demonstrating it has lower computational complexity
Iaetsd a survey on one class clusteringIaetsd Iaetsd
This document presents a new method for performing one-to-many data linkage called the One Class Clustering Tree (OCCT). The OCCT builds a tree structure with inner nodes representing features of the first dataset and leaves representing similar features of the second dataset. It uses splitting criteria and pruning methods to perform the data linkage more accurately than existing indexing techniques. The OCCT approach induces a decision tree using a splitting criteria and performs prepruning to determine which branches to trim. It then compares entities to match them between the two datasets and produces a final result.
This document describes an expandable Bayesian network (EBN) approach for 3D object description from multiple images and sensor data. The key points are:
- EBNs can dynamically instantiate network structures at runtime based on the number of input images, allowing the use of a varying number of evidence features.
- EBNs introduce the use of hidden variables to handle correlation of evidence features across images, whereas previous approaches did not properly model this.
- The document presents an application of an EBN for building detection and description from aerial images using multiple views and sensor data. Experimental results showed the EBN approach provided significant performance improvements over other methods.
A Novel Clustering Method for Similarity Measuring in Text DocumentsIJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
Clustering Algorithm Based On Correlation Preserving IndexingIOSR Journals
Abstract: Fast retrieval of the relevant information from the databases has always been a significant issue.
Different techniques have been developed for this purpose; one of them is Data Clustering. In this paper Data
Clustering is discussed along with the applications of Data Clustering and Correlation Preserving Indexing. We
proposed a CPI (Correlation Preserving Indexing) algorithm and relate it to structural differences between the
data sets.
Keywords: Data Clustering, Data Mining, Clustering techniques, Correlation Preserving Indexing.
This document presents a feature clustering algorithm to reduce the dimensionality of feature vectors for text classification. The algorithm groups words in documents into clusters based on similarity, with each cluster characterized by a membership function. Words not similar to existing clusters form new clusters. This avoids specifying features in advance and the need for trial and error. Experimental results showed the method can classify text faster and with better extracted features than other methods.
This document summarizes a research paper that evaluates cluster quality using a modified density subspace clustering approach. It discusses how density subspace clustering can be used to identify clusters in high-dimensional datasets by detecting density-connected clusters in all subspaces. The proposed approach uses a density subspace clustering algorithm to select attribute subsets and identify the best clusters. It then calculates intra-cluster and inter-cluster distances to evaluate cluster quality and compares the results to other clustering algorithms in terms of accuracy and runtime. Experimental results showed that the proposed method improves clustering quality and performs faster than existing techniques.
This document provides an overview and summary of Pankaj Jajoo's 2008 master's thesis on improving document clustering algorithms. The thesis explores two approaches: 1) preprocessing the graph representation of documents to remove noise before applying standard graph partitioning algorithms, and 2) clustering words first before clustering documents to reduce noise. Experimental results on three datasets show these approaches improve clustering quality over standard K-Means clustering. The thesis provides background on clustering, reviews existing document clustering methods, and describes the two new algorithms and evaluation of their performance.
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach IJECEIAES
The document presents a new approach called Bat-Cluster (BC) for automated graph clustering. BC combines the Fast Fourier Domain Positioning (FFDP) algorithm and the Bat Algorithm. FFDP positions graph nodes, then Bat Algorithm optimizes clustering by finding configurations that minimize the Davies-Bouldin Index. BC is tested on four benchmark graphs and outperforms Particle Swarm Optimization, Ant Colony Optimization, and Differential Evolution in providing higher clustering precision.
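The objective that Bat-Cluster minimizes, the Davies-Bouldin Index, can be computed directly; a sketch follows (plain numpy, using centroid scatter and Euclidean centroid distances, which is the standard formulation but may differ in detail from the paper's).

```python
import numpy as np

def davies_bouldin(points, labels):
    """Davies-Bouldin Index: average over clusters of the worst
    (scatter_i + scatter_j) / centroid_distance ratio.
    Lower is better, which is why the optimizer minimizes it."""
    ids = sorted(set(labels))
    cents = np.array([points[labels == k].mean(axis=0) for k in ids])
    scatter = np.array([np.mean(np.linalg.norm(points[labels == k] - cents[i],
                                               axis=1))
                        for i, k in enumerate(ids)])
    k = len(ids)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j])
                     / np.linalg.norm(cents[i] - cents[j])
                     for j in range(k) if j != i)
    return total / k
```

Tight, well-separated clusters give an index near zero, so a metaheuristic like the Bat Algorithm can score any candidate clustering with this single number.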
Privacy Preserving Clustering on Distorted Data (IOSR Journals)
- The document discusses privacy-preserving clustering on distorted data using singular value decomposition (SVD) and sparsified singular value decomposition (SSVD).
- It applies SVD and SSVD to distort a real-world dataset of 100 terrorists with 42 attributes, generating distorted datasets.
- K-means clustering is then performed on the original and distorted datasets for different numbers of clusters (k). The results show that SSVD more effectively groups the data objects into clusters compared to the original and SVD-distorted datasets, while preserving data privacy as measured by various metrics.
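A minimal sketch of the SVD-based distortion step, assuming numpy; the truncation rank and the sparsification threshold are illustrative parameters, not the paper's settings.

```python
import numpy as np

def ssvd_distort(data, rank, sparsify_eps=0.0):
    """Distort a dataset via truncated SVD: keep only the top `rank`
    singular values, optionally zeroing small entries of U and V
    (the 'sparsified' SSVD variant). The reconstruction preserves the
    dominant cluster structure while hiding exact attribute values."""
    U, s, Vt = np.linalg.svd(data, full_matrices=False)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]
    if sparsify_eps > 0:                 # SSVD: drop small components
        U = np.where(np.abs(U) < sparsify_eps, 0.0, U)
        Vt = np.where(np.abs(Vt) < sparsify_eps, 0.0, Vt)
    return (U * s) @ Vt
```

K-means is then run on the returned matrix instead of the original data, so individual attribute values are never exposed to the clustering step.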
The document presents an algorithm for k-medoid clustering based on Ant Colony Optimization (ACO) called ACO-MEDOIDS. It first provides background on data mining, clustering, and related clustering algorithms such as k-means and k-medoids. It then describes how ACO is adapted to solve the k-medoid clustering problem by using ants to explore the search space and iteratively update pheromone trails to find an optimal set of medoids, or cluster representative points. The ACO-MEDOIDS algorithm aims to address some limitations of traditional k-medoid clustering.
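The core step an ACO ant must score when building a candidate solution is the medoid update: pick the cluster member that minimizes the total distance to all other members. A sketch of just that step follows (plain numpy; the pheromone-trail machinery of ACO-MEDOIDS is not shown).

```python
import numpy as np

def best_medoid(points, idxs):
    """Return the index (into `points`) of the cluster member that
    minimizes the total distance to all other members of the cluster -
    the medoid, i.e. the cluster's representative point."""
    best, best_cost = idxs[0], float("inf")
    for m in idxs:
        cost = sum(np.linalg.norm(points[m] - points[i]) for i in idxs)
        if cost < best_cost:
            best, best_cost = m, cost
    return best
```

Unlike a k-means centroid, the medoid is always an actual data point, which is what makes k-medoids robust to outliers.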
This document discusses hierarchical clustering and similarity measures for document clustering. It summarizes that hierarchical clustering creates a hierarchical decomposition of data objects through either agglomerative or divisive approaches. The success of clustering depends on the similarity measure used, with traditional measures using a single viewpoint, while multiviewpoint measures use different viewpoints to increase accuracy. The paper then focuses on applying a multiviewpoint similarity measure to hierarchical clustering of documents.
A Soft Set-based Co-occurrence for Clustering Web User Transactions (TELKOMNIKA JOURNAL)
This document proposes a soft set-based approach for clustering web user transactions to achieve lower computational complexity and higher clustering purity than previous rough set approaches. Unlike rough set approaches that use similarity, the proposed approach uses a co-occurrence approach based on soft set theory. The soft set representation of web user transactions allows modeling them as a binary-valued information system. The approach is evaluated against two previous rough set-based approaches, demonstrating substantially lower computational complexity and higher cluster purity.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Abstract
The exponential growth of knowledge on the World Wide Web has underscored the need to develop economical and effective ways of organizing relevant content. In the field of web computing, document clustering plays a vital role and poses an interesting and challenging problem. Document clustering is mainly used for grouping similar documents in a search engine. The web also has a rich and dynamic collection of hyperlink information, and retrieving relevant documents from the internet is a complicated task. Based on the user's query, documents are retrieved from various databases to give relevant and additional information for the given query. The documents are already clustered based on keyword extraction and stored in the database. The probabilistic relational approach to web document clustering finds the relation between two linked pages and defines a relational clustering algorithm based on a probabilistic graph representation. In this approach, both the content information and the hyperlink structure of a web page are considered, and each document is viewed as a semantic unit. It also provides additional information to the user.
Keywords: Document Clustering, Agglomerative Clustering, Entropy, F-Measure
Mastering Hierarchical Clustering: A Comprehensive Guide (cyberprosocial)
In the world of data analysis and machine learning, hierarchical clustering is a really important technique that helps us understand how different pieces of data are related to each other. This article is here to explain hierarchical clustering in a way that’s easy to understand, breaking down its main ideas, how it’s used, and the benefits it brings.
An Analysis On Clustering Algorithms In Data Mining (Gina Rizzo)
This document summarizes and analyzes various clustering algorithms used in data mining. It begins with an introduction to clustering and discusses hierarchical and partitional clustering approaches. Specific algorithms discussed include k-means, single linkage, and fuzzy clustering. The document compares techniques on aspects like complexity and sensitivity. It concludes that choosing the best clustering algorithm depends on factors like the data type, size, and desired cluster structure. The ideal is to have criteria to automatically select the appropriate algorithm for a given data set.
Clustering Approach Recommendation System using Agglomerative Algorithm (IRJET Journal)
The document discusses clustering approaches and agglomerative algorithms for recommendation systems. It proposes a new clustering approach based on agglomerative algorithms that uses simple calculations to identify clusters. Clustering algorithms aim to group similar objects into clusters while maximizing dissimilarity between objects in different clusters. The document reviews various hierarchical and partitioning clustering methods and algorithms presented in previous literature. It then proposes using an agglomerative clustering approach for recommendation systems that identifies clusters through simple calculations.
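Agglomerative clustering can itself be expressed with simple calculations; the sketch below is a naive textbook version (every point starts as its own cluster, closest pair merged repeatedly), not the proposal's specific calculation scheme.

```python
import numpy as np

def agglomerative(points, n_clusters, linkage="single"):
    """Naive agglomerative clustering: start with every point in its own
    cluster and repeatedly merge the closest pair of clusters until
    n_clusters remain. Single linkage uses the closest pair of members;
    complete linkage (anything else) uses the farthest pair."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        d = [np.linalg.norm(points[i] - points[j]) for i in a for j in b]
        return min(d) if linkage == "single" else max(d)

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] += clusters.pop(j)   # j > i, so index i stays valid
    return clusters
```

This quadratic-per-merge version only illustrates the idea; practical implementations maintain a distance matrix or priority queue.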
This document summarizes a research paper that proposes a new density-based clustering technique called Triangle-Density Based Clustering Technique (TDCT) to efficiently cluster large spatial datasets. TDCT uses a polygon approach where the number of data points inside each triangle of a polygon is calculated to determine triangle densities. Triangle densities are used to identify clusters based on a density confidence threshold. The technique aims to identify clusters of arbitrary shapes and densities while minimizing computational costs. Experimental results demonstrate the technique's superiority in terms of cluster quality and complexity compared to other density-based clustering algorithms.
UNIT - 4: Data Warehousing and Data Mining (Nandakumar P)
UNIT-IV
Cluster Analysis: Types of Data in Cluster Analysis – A Categorization of Major Clustering Methods – Partitioning Methods – Hierarchical Methods – Density-Based Methods – Grid-Based Methods – Model-Based Clustering Methods – Clustering High-Dimensional Data – Constraint-Based Cluster Analysis – Outlier Analysis.
The document proposes a hierarchical and grid-based clustering method called HGD Cluster for clustering distributed data across decentralized nodes. HGD Cluster uses hierarchical and grid-based clustering techniques to cluster datasets dispersed among large numbers of nodes in a distributed environment. It aims to efficiently perform dynamic distributed clustering and allow nodes to maintain summarized views of datasets when nodes are asynchronous and data is dynamic.
A Hierarchical and Grid Based Clustering Method for Distributed Systems (Hgd ...) (iosrjce)
In distributed peer-to-peer systems, huge amounts of data are dispersed, and grouping data from multiple sources is a tedious task. By applying effective data mining techniques, the clustering of distributed data becomes easier, reducing the hurdles of processing, storage, and transmission costs. To perform dynamic distributed clustering, a fully decentralized clustering method has been proposed. HGD Cluster can cluster a data set dispersed among a large number of nodes in a distributed environment using hierarchical and grid-based clustering techniques, and it applies when nodes are fully asynchronous, decentralized, and subject to churn. The general design principles employed in the proposed algorithm also allow customization for other classes of clustering. It is fully capable of clustering dynamic and distributed data sets; using the algorithm, every node can maintain summarized views of the dataset. The main aim of the proposed system is to customize HGD Cluster to execute the hierarchical and grid-based clustering methods on these summarized views. Coping with dynamic data is made possible by gradually adapting the clustering model.
A Density Based Clustering Technique For Large Spatial Data Using Polygon App... (IOSR Journals)
This document presents a density-based clustering technique called TDCT (Triangle-density based clustering technique) for efficiently clustering large spatial datasets. The technique uses a polygon approach where the number of data points inside each triangle of a polygon is calculated. If the ratio of point densities between two neighboring triangles exceeds a threshold, the triangles are merged into the same cluster. The technique is capable of identifying clusters of arbitrary shapes and densities. Experimental results demonstrate the technique has superior cluster quality and complexity compared to other methods.
Clustering is a data mining technique used to place data elements into related groups. It is the process of partitioning data (or objects) into classes such that the data in one class are more similar to each other than to those in other clusters.
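The most common concrete realization of this idea is k-means (Lloyd's algorithm). A minimal numpy sketch, with random initialization as an assumption:

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    recompute each centroid as the mean of its cluster, and repeat until
    the centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(points[:, None] - centroids[None], axis=2),
            axis=1)
        new = np.array([points[labels == c].mean(axis=0)
                        if np.any(labels == c) else centroids[c]
                        for c in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

K-means converges to a local optimum that depends on the initialization, which is why it is usually run several times with different seeds.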
International Journal of Engineering and Science Invention (IJESI) (inventionjournals)
This document discusses multidimensional clustering methods for data mining and their industrial applications. It begins with an introduction to clustering, including definitions and goals. Popular clustering algorithms are described, such as K-means, fuzzy C-means, hierarchical clustering, and mixture of Gaussians. Distance measures and their importance in clustering are covered. The K-means and fuzzy C-means algorithms are explained in detail. Examples are provided to illustrate fuzzy C-means clustering. Finally, applications of clustering algorithms in fields such as marketing, biology, and earth sciences are mentioned.
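The membership update that distinguishes fuzzy C-means from K-means can be written in one vectorized step. This is a sketch of the standard FCM formula u[k,i] = 1 / sum_j (d(x_k, c_i) / d(x_k, c_j))^(2/(m-1)) using numpy; the fuzzifier m=2 is the usual default, not a value taken from the document.

```python
import numpy as np

def fcm_memberships(points, centroids, m=2.0):
    """One fuzzy C-means membership update: instead of a hard nearest-
    centroid assignment, each point gets a degree of membership in every
    cluster, and each point's memberships sum to 1."""
    d = np.linalg.norm(points[:, None] - centroids[None], axis=2)
    d = np.maximum(d, 1e-12)             # guard against division by zero
    power = 2.0 / (m - 1.0)
    # ratio[k, i, j] = d(x_k, c_i) / d(x_k, c_j); sum over j
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** power, axis=2)
    return u                              # shape (n_points, n_clusters)
```

A full FCM run alternates this update with recomputing centroids as membership-weighted means.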
Certain Investigation on Dynamic Clustering in Dynamic Datamining (ijdmtaiir)
Clustering is the process of grouping a set of objects into classes of similar objects. Dynamic clustering is a new research area concerned with datasets that have dynamic aspects: it requires updating the clusters whenever new records are added to the dataset, which may change the clustering over time. When data is continuously updated and arrives in huge amounts, rescanning the database is not possible in static data mining, but it is possible in the dynamic data mining process. Dynamic data mining occurs when the derived information is kept available for analysis while the environment is dynamic, i.e. many updates occur. Since this has now been established by most researchers, research is moving toward solving the problems of mining dynamic databases. This paper surveys existing work in papers related to dynamic clustering and incremental data clustering.
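The key trick that lets incremental clustering avoid rescanning the database is the running-mean centroid update: a new record adjusts the centroid in O(1). A minimal sketch (a single cluster, not any specific paper's algorithm):

```python
import numpy as np

class IncrementalCluster:
    """Running-mean centroid that absorbs new records one at a time,
    so arriving data updates the cluster without rescanning the stored
    records - the core operation of incremental/dynamic clustering."""
    def __init__(self, first_point):
        self.centroid = np.asarray(first_point, dtype=float)
        self.n = 1

    def add(self, point):
        # mean_new = mean_old + (x - mean_old) / n
        self.n += 1
        self.centroid += (np.asarray(point, float) - self.centroid) / self.n
```

A dynamic clustering scheme combines many such clusters with rules for splitting, merging, or re-seeding them as the data distribution drifts.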
Evaluating the efficiency of rule techniques for file classification (eSAT Journals)
Abstract Text mining refers to the process of deriving high-quality information from text. Also known as knowledge discovery from text (KDT), it deals with the machine-supported analysis of text and is used in areas such as information retrieval, marketing, information extraction, natural language processing, document similarity, and so on. Document similarity is one of the important techniques in text mining, and its first and foremost step is to classify files based on their category. In this research work, various classification rule techniques are used to classify computer files based on their extensions; for example, the extension of a computer file may be pdf, doc, ppt, xls, and so on. There are several rule classifier algorithms, such as decision table, JRip, Ridor, DTNB, NNge, PART, OneR and ZeroR. In this work, three classification algorithms, namely the decision table, DTNB and OneR classifiers, are used to classify computer files by extension. The results produced by these algorithms are analyzed using the performance factors of classification accuracy and error rate. From the experimental results, DTNB proves to be more efficient than the other two techniques. Index Terms: Data mining, Text mining, Classification, Decision table, DTNB, OneR
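Of the three classifiers compared, OneR is the simplest to illustrate: it builds a one-attribute rule by mapping each value of each attribute to its majority class and keeping the attribute with the fewest training errors. A sketch follows (plain Python, with made-up example data; the paper's Weka implementation and datasets are not reproduced here).

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """OneR: for each attribute, map every attribute value to its
    majority class; return the (attribute index, value -> class) rule
    that makes the fewest errors on the training data."""
    best = None
    for a in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[a]][y] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(rule[row[a]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    return best[1], best[2]
```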
IRJET- Customer Segmentation from Massive Customer Transaction Data (IRJET Journal)
This document discusses various methods for customer segmentation through analysis of massive customer transaction data, including K-Means clustering, PAM clustering, agglomerative clustering, divisive clustering, and density-based clustering. It finds that K-Means is the most commonly used partitioning method. The document also reviews related work on customer segmentation and clustering algorithms like CLARA, CLARANS, BIRCH, ROCK, CHAMELEON, CURE, DHCC, DBSCAN, and LOF. It proposes a framework for an online shopping site that would apply these techniques to group customers based on their product preferences in transaction data.
This document compares hierarchical and non-hierarchical clustering algorithms. It summarizes four clustering algorithms: K-Means, K-Medoids, Farthest First Clustering (hierarchical algorithms), and DBSCAN (non-hierarchical algorithm). It describes the methodology of each algorithm and provides pseudocode. It also describes the datasets used to evaluate the performance of the algorithms and the evaluation metrics. The goal is to compare the performance of the clustering methods on different datasets.
Clustering is an unsupervised machine learning technique that groups similar objects together. There are several clustering techniques including partitioning methods like k-means, hierarchical methods like agglomerative and divisive clustering, density-based methods, graph-based methods, and model-based clustering. Clustering is used in applications such as pattern recognition, image processing, bioinformatics, and more to split data into meaningful subgroups.
The document surveys previous work on clustering of time series data. It discusses three main approaches to time series clustering: (1) working directly with raw time series data, (2) extracting features from the raw data and clustering the features, and (3) building models from the raw data and clustering the model parameters. It also summarizes commonly used clustering algorithms, measures for determining time series similarity, and criteria for evaluating clustering results. The document categorizes and discusses past time series clustering research and identifies topics for future work. It aims to provide an overview of the field of time series clustering.
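Among the time-series similarity measures such a survey covers, dynamic time warping (DTW) is the classic example, because unlike Euclidean distance it tolerates local stretching and shifting along the time axis. A minimal dynamic-programming sketch for 1-D series:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D series, via the
    standard O(n*m) dynamic program: D[i, j] is the cost of the best
    alignment of the first i points of `a` with the first j of `b`."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]
```

Note that a series and a time-stretched copy of itself have DTW distance zero, whereas their Euclidean distance would be large; this is exactly the property that makes DTW popular for raw-data time-series clustering.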
Estimation of surface runoff in nallur amanikere watershed using scs cn methodeSAT Journals
Abstract
The development of watershed aims at productive utilization of all the available natural resources in the entire area extending from
ridge line to stream outlet. The per capita availability of land for cultivation has been decreasing over the years. Therefore, water and
the related land resources must be developed, utilized and managed in an integrated and comprehensive manner. Remote sensing and
GIS techniques are being increasingly used for planning, management and development of natural resources. The study area, Nallur
Amanikere watershed geographically lies between 110 38’ and 110 52’ N latitude and 760 30’ and 760 50’ E longitude with an area of
415.68 Sq. km. The thematic layers such as land use/land cover and soil maps were derived from remotely sensed data and overlayed
through ArcGIS software to assign the curve number on polygon wise. The daily rainfall data of six rain gauge stations in and around
the watershed (2001-2011) was used to estimate the daily runoff from the watershed using Soil Conservation Service - Curve Number
(SCS-CN) method. The runoff estimated from the SCS-CN model was then used to know the variation of runoff potential with different
land use/land cover and with different soil conditions.
Keywords: Watershed, Nallur watershed, Surface runoff, Rainfall-Runoff, SCS-CN, Remote Sensing, GIS.
Estimation of morphometric parameters and runoff using rs & gis techniqueseSAT Journals
This document summarizes a study that used remote sensing and GIS techniques to estimate morphometric parameters and runoff for the Yagachi catchment area in India over a 10-year period. Morphometric analysis was conducted to understand the hydrological response at the micro-watershed level. Daily runoff was estimated using the SCS curve number model. The results showed a positive correlation between rainfall and runoff. Land use/land cover changes between 2001-2010 were found to impact estimated runoff amounts. Remote sensing approaches provided an effective means to model runoff for this large, ungauged area.
Effect of variation of plastic hinge length on the results of non linear anal...eSAT Journals
Abstract The nonlinear Static procedure also well known as pushover analysis is method where in monotonically increasing loads are applied to the structure till the structure is unable to resist any further load. It is a popular tool for seismic performance evaluation of existing and new structures. In literature lot of research has been carried out on conventional pushover analysis and after knowing deficiency efforts have been made to improve it. But actual test results to verify the analytically obtained pushover results are rarely available. It has been found that some amount of variation is always expected to exist in seismic demand prediction of pushover analysis. Initial study is carried out by considering user defined hinge properties and default hinge length. Attempt is being made to assess the variation of pushover analysis results by considering user defined hinge properties and various hinge length formulations available in literature and results compared with experimentally obtained results based on test carried out on a G+2 storied RCC framed structure. For the present study two geometric models viz bare frame and rigid frame model is considered and it is found that the results of pushover analysis are very sensitive to geometric model and hinge length adopted. Keywords: Pushover analysis, Base shear, Displacement, hinge length, moment curvature analysis
Effect of use of recycled materials on indirect tensile strength of asphalt c...eSAT Journals
Abstract
Depletion of natural resources and aggregate quarries for the road construction is a serious problem to procure materials. Hence
recycling or reuse of material is beneficial. On emphasizing development in sustainable construction in the present era, recycling of
asphalt pavements is one of the effective and proven rehabilitation processes. For the laboratory investigations reclaimed asphalt
pavement (RAP) from NH-4 and crumb rubber modified binder (CRMB-55) was used. Foundry waste was used as a replacement to
conventional filler. Laboratory tests were conducted on asphalt concrete mixes with 30, 40, 50, and 60 percent replacement with RAP.
These test results were compared with conventional mixes and asphalt concrete mixes with complete binder extracted RAP
aggregates. Mix design was carried out by Marshall Method. The Marshall Tests indicated highest stability values for asphalt
concrete (AC) mixes with 60% RAP. The optimum binder content (OBC) decreased with increased in RAP in AC mixes. The Indirect
Tensile Strength (ITS) for AC mixes with RAP also was found to be higher when compared to conventional AC mixes at 300C.
Keywords: Reclaimed asphalt pavement, Foundry waste, Recycling, Marshall Stability, Indirect tensile strength.
Software Engineering and Project Management - Introduction, Modeling Concepts...Prakhyath Rai
Introduction, Modeling Concepts and Class Modeling: What is Object orientation? What is OO development? OO Themes; Evidence for usefulness of OO development; OO modeling history. Modeling
as Design technique: Modeling, abstraction, The Three models. Class Modeling: Object and Class Concept, Link and associations concepts, Generalization and Inheritance, A sample class model, Navigation of class models, and UML diagrams
Building the Analysis Models: Requirement Analysis, Analysis Model Approaches, Data modeling Concepts, Object Oriented Analysis, Scenario-Based Modeling, Flow-Oriented Modeling, class Based Modeling, Creating a Behavioral Model.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Survey on traditional and evolutionary clustering approaches
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 02 Issue: 12 | Dec-2013, Available @ http://www.ijret.org 442
SURVEY ON TRADITIONAL AND EVOLUTIONARY CLUSTERING
APPROACHES
Christabel Williams1, Bright Gee Varghese R2
1 PG Student, 2 Assistant Professor, Department of Computer Science and Engineering, Karunya University, Tamil Nadu, India
christabel.williams99@gmail.com, brightfsona@yahoo.co.in
Abstract
Clustering deals with grouping similar objects. Unlike classification, clustering tries to group a set of objects and discover whether there are relationships between them, whereas in classification a set of predefined classes is known and it is enough to determine which class an object belongs to. Simply put, classification is a supervised learning technique and clustering is an unsupervised one. Clustering techniques apply when there is no class to be predicted but rather the instances are to be divided into natural groups. These clusters presumably reflect some mechanism at work in the domain from which the instances are drawn, a mechanism that causes some instances to bear a stronger resemblance to each other than they do to the remaining instances. Clustering naturally requires different techniques from the classification and association learning methods. Clustering has many applications in various fields. In software engineering it helps in reverse engineering, software maintenance and in re-building systems, aiming to break a larger problem into small, understandable pieces. Since clustering has no single precise methodology, many methods, both traditional and evolutionary, are available for carrying it out. In this paper the various types of the above-mentioned methods are described and some of them are compared. Each method has its own advantages and can be used according to the needs of the user.
Keywords: Clustering, Classification, Software Engineering, Traditional, Evolutionary.
----------------------------------------------------------------------***------------------------------------------------------------------------
1. INTRODUCTION
Clustering is a technique for finding similarity groups in data, called clusters. It groups data instances that are similar to (near) each other into one cluster and data instances that are very different (far away) from each other into different clusters. Clustering is often called an unsupervised learning task because no class values denoting an a priori grouping of the data instances are given, as is the case in supervised learning. Clustering can be used in software maintenance and re-use: to maintain software, understanding the source code is mandatory, and therefore the source code should be clustered. The main purposes are to understand the system, to recover artifacts and to identify the relationships within the source code. Clustering can be done in many ways and there are several methods for carrying it out. Traditional methods include partitional, hierarchical and density based clustering. There are evolutionary clustering approaches as well, such as relocation, grid based and density based approaches [1].

The partitional approaches include k-means, graph-theoretic clustering, density based clustering and so on. These approaches construct various partitions and evaluate them using some criterion. Evolutionary approaches, on the other hand, are used for solving optimization problems: evolutionary operators and clustering structures are made to converge towards an optimal solution. Both kinds of approaches are useful depending on the circumstances in which they are used. Each has its own pros and cons and can be used effectively according to the user's needs.
Fig.1 Various Clustering approaches
2. VARIOUS CLUSTERING APPROACHES
This section focuses on the two families of clustering approaches: traditional and evolutionary.
2.1 Traditional Approaches
The traditional approaches to clustering fall into three categories: partitional, hierarchical and density based. Since clustering has no single precise notion, many clustering methods are available. According to [2], Fraley and Raftery (1998) divide clustering into two main groups, hierarchical and partitioning methods; beyond these, several further methods can be proposed based on the induction principle.
2.1.1 Hierarchical Methods
In hierarchical clustering, clusters are built by recursively partitioning the instances in either a top-down or a bottom-up fashion.
Agglomerative Hierarchical Clustering: This is a bottom-up approach in which each object initially forms a cluster of its own. These clusters are then merged until the desired cluster structure is obtained. Ying Zhao and George Karypis [3] define clustering criteria that can be used to determine which clusters to merge at each step.

[3] For example, consider an n-document dataset after the first merging step. The solution contains n-1 clusters, since each merging step removes one cluster. The next pair to be merged, leading to a solution with n-2 clusters, is the one that optimizes the selected clustering criterion: each of the (n-1)*(n-2)/2 possible pair-wise merges is evaluated, and the merge giving the best criterion value is chosen, so the criterion function is locally optimized at that particular stage of the algorithm. Since the agglomerative algorithm has computationally expensive steps and this complexity cannot be reduced for an arbitrary clustering criterion, it is not always effective.
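The greedy bottom-up merging described above can be sketched as follows (a minimal illustration, not the authors' code; it uses the single-link distance as the merge criterion, and the Euclidean metric and example data are assumptions):

```python
import math

def euclid(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(points, k):
    """Greedy bottom-up merging: start with singleton clusters, repeatedly
    evaluate every candidate pair of clusters and merge the pair that best
    satisfies the criterion (here: smallest single-link distance), until
    only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single-link: shortest cross-cluster distance
                d = min(euclid(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the winning pair
        del clusters[j]
    return clusters
```

Note the quadratic pair evaluation at every step, which is exactly the computational expense the text refers to.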
Divisive Hierarchical Clustering: [22] Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze note that top-down clustering is conceptually more complex than bottom-up clustering, since it needs a second, flat clustering algorithm as a subroutine. Divisive algorithms have the advantage of being more efficient when a complete hierarchy down to individual document leaves is not required. For a fixed number of top levels, using an efficient flat algorithm such as k-means, top-down algorithms are linear in the number of documents and clusters, so they run much faster than HAC algorithms. Bottom-up methods make clustering decisions based on local patterns without initially taking the global distribution into account, and these early decisions cannot be undone. Top-down clustering benefits from complete information about the global distribution when making top-level partitioning decisions.
The result of a hierarchical clustering is a dendrogram encoding the nested grouping of objects together with the similarity levels at which groups merge. A clustering can be obtained by cutting the dendrogram at the desired similarity level [9] (Lior Rokach and Oded Maimon).
[4] Jain et al. (1999) classify hierarchical clustering further based on the manner in which the similarity between clusters is calculated.
Single-linkage Clustering: Also known as the minimum method or nearest neighbour method. In this method, the distance between two clusters is defined as the shortest distance from any member of one cluster to any member of the other cluster. If the data are described by similarities, the similarity between two clusters is defined as the greatest similarity from any member of one cluster to any member of the other cluster [5] (Sneath and Sokal, 1973). [6] Guha et al. (1998) summarize its main disadvantage as the chaining effect: a few points forming a bridge between two clusters may cause the two clusters to become one.
Complete-link Clustering: Also known as the maximum method or furthest neighbour method. Here the distance between two clusters is defined as the longest distance from a member of one cluster to a member of the other cluster [7] (King, 1967).
Average-link Clustering: Also known as the minimum variance method. Here the distance between two clusters is defined as the average distance from any member of one cluster to any member of the other cluster. Its disadvantage is that it may cause elongated clusters to split and neighboring clusters to merge.
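The three linkage definitions above can be written directly as cross-cluster distance functions (a minimal sketch; the Euclidean metric and the example clusters are illustrative assumptions, not from the paper):

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_link(c1, c2):
    # shortest distance from any member of c1 to any member of c2
    return min(dist(a, b) for a in c1 for b in c2)

def complete_link(c1, c2):
    # longest distance from any member of c1 to any member of c2
    return max(dist(a, b) for a in c1 for b in c2)

def average_link(c1, c2):
    # average distance over all cross-cluster pairs
    return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))
```

Swapping one of these functions for another in the same agglomerative loop is all it takes to switch between the three hierarchical variants.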
The main disadvantage of hierarchical clustering is that these methods can never undo what has been done previously, i.e., merges and splits cannot be backtracked. Moreover, their high complexity entails large I/O costs.
2.1.2 Partitioning Methods
Partitioning methods of clustering are flexible methods based on the iterative relocation of data points between clusters, with the quality of a partition measured by a clustering criterion [8] (Sami Äyrämö and Tommi Kärkkäinen).
[21] Witten and Frank (2005) say that clustering techniques apply when there is no class to be predicted but rather the instances are to be divided into natural groups, and that clustering naturally requires different techniques from the classification and association learning methods. The classic clustering technique is k-means. First, the number of
clusters needed should be specified in a parameter called k. Then k points are chosen at random as cluster centers. Every instance is assigned to its closest cluster center according to the Euclidean distance metric. Next, the mean of the instances in each cluster is calculated; this is the "means" part. These means form the new center values of the corresponding clusters, and the whole process is repeated with the new cluster centers. Iteration continues until the same points are assigned to each cluster in consecutive rounds, indicating that the cluster centers have stabilized and will remain the same thereafter.

This clustering method is simple and effective. It is easy to prove that choosing the cluster center to be the centroid minimizes the total squared distance from each of the cluster's points to its center. Once iteration is over, each point is assigned to its nearest cluster center, so the overall effect is to minimize the total squared distance from all points to their cluster centers. But the minimum is a local one; there is no guarantee that it is the global minimum, and a different initial random choice can produce an entirely different final arrangement. Finding globally optimal clusters is almost always infeasible.
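The k-means procedure just described can be sketched as follows (a minimal illustration, assuming 2-D tuples as points; the seeding strategy and example data are our own, not from the paper):

```python
import random

def kmeans(points, k, seed=0):
    """Plain k-means: pick k random points as initial centers, assign each
    point to its nearest center (squared Euclidean distance), recompute each
    center as the mean of its cluster, and repeat until the assignments stop
    changing. The result is a local minimum of the total squared distance,
    not necessarily the global one."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assignment = None
    while True:
        new_assignment = [
            min(range(k),
                key=lambda c: sum((p[d] - centers[c][d]) ** 2
                                  for d in range(len(p))))
            for p in points
        ]
        if new_assignment == assignment:  # stable: centers will not move again
            return centers, assignment
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
```

Changing `seed` changes the initial random centers and can yield an entirely different final arrangement, which is exactly the local-minimum caveat noted above.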
2.2 Evolutionary Approaches
Clustering lacks a single general objective, and this is the main reason why so many clustering methods exist. Nature offers many concepts and ideas for solving the optimization problems faced by modern human society. Evolutionary computation techniques are general methods for solving optimization problems, and since clustering can be cast as an optimization problem, it can be solved using evolutionary approaches.
[9] Lior Rokach and Oded Maimon note that evolutionary operators and a population of clusterings can be used to converge towards a globally optimal solution. Clustering structures serve as chromosomes, and the evolutionary operators are selection, recombination and mutation. In evolutionary computation a fitness value is calculated for each chromosome, and the chromosomes with good fitness values survive to the next generation. The genetic algorithm (GA) is the most frequently used evolutionary algorithm. Each clustering is associated with a fitness value; since fitness is inversely proportional to the squared error, clusterings with a small squared error have a high fitness value.
In GAs, the selection operator promotes solutions from one generation to the next depending on their fitness: selection is carried out with probabilities derived from the fitness values. Crossover is the most popular recombination operator; it takes a pair of chromosomes as input and produces a pair of offspring. Mutation can also be used. A major problem with GAs, however, is their sensitivity to various parameters such as population size and the crossover and mutation rates. Many researchers have suggested guidelines, but none are universally effective. Better results can be obtained by using hybrid genetic algorithms.
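The ingredients above, a squared-error-based fitness, roulette-wheel selection, crossover and mutation over label-string chromosomes, can be sketched as follows (a minimal illustration under our own assumptions, not any specific author's algorithm; chromosomes are lists with one cluster label per data point):

```python
import random

def fitness(chrom, points, k):
    """Fitness inversely proportional to the total squared error of the
    partition encoded by the chromosome (one cluster label per point)."""
    error = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, chrom) if lab == c]
        if not members:
            continue
        center = tuple(sum(d) / len(members) for d in zip(*members))
        error += sum(sum((x, ) and (x - m) ** 2 for x, m in zip(p, center))
                     for p in members)
    return 1.0 / (1.0 + error)  # smaller squared error -> higher fitness

def select(population, fitnesses, rng):
    # roulette-wheel selection: survival probability proportional to fitness
    return rng.choices(population, weights=fitnesses, k=len(population))

def crossover(a, b, rng):
    # one-point crossover: swap the tails of two parent chromosomes
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chrom, k, rate, rng):
    # reassign each gene (a point's cluster label) with small probability
    return [rng.randrange(k) if rng.random() < rate else g for g in chrom]
```

A partition that matches the natural grouping gets a markedly higher fitness than a mixed-up one, which is what lets selection drive the population toward low-error clusterings.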
2.2.1 Relocation Approaches
Since in evolutionary computation solutions evolve from one generation to the next in an iterative process, clustering can be carried out using relocation methods. The idea of using evolutionary techniques for clustering was first proposed by [10] Krovi in 1991, where a genetic algorithm was used to split the data into two clusters. Here the solutions are strings of integers whose length equals the size of the dataset; comparatively, this algorithm suffers from drawbacks such as redundancy and invalidity. The algorithm maximizes the ratio between the between-cluster sum of squares and the within-cluster sum of squares. This encoding was used by [11] Krishna and Narasimha Murty (1999) to search for a partition with a fixed number of clusters using modified genetic operators: a single k-means iteration is used instead of crossover.
The partition is encoded as a boolean k x n matrix in the approach proposed by [12] Bezdek et al. (1994), which focuses on reducing the sum-of-squared-errors criterion. Experiments were carried out with different distance metrics in order to detect clusters of different shapes. Here the crossover operator simply swaps columns between chromosomes, and mutation randomly changes the cluster assignment of one object.
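The matrix encoding and its operators can be sketched as follows (a minimal illustration of the idea, under our own assumptions; the original paper's exact operators may differ). Entry `[c][i]` is True iff object `i` belongs to cluster `c`, so each column contains exactly one True:

```python
import random

def random_partition_matrix(k, n, rng):
    """Boolean k x n partition matrix: each object (column) is assigned
    to exactly one cluster (row)."""
    m = [[False] * n for _ in range(k)]
    for i in range(n):
        m[rng.randrange(k)][i] = True
    return m

def crossover_columns(a, b, rng):
    """Swap the assignment columns of a randomly chosen tail block of
    objects between two parent matrices."""
    k, n = len(a), len(a[0])
    cut = rng.randrange(1, n)
    for i in range(cut, n):
        for c in range(k):
            a[c][i], b[c][i] = b[c][i], a[c][i]
    return a, b

def mutate_matrix(m, rng):
    """Move one randomly chosen object to a randomly chosen cluster."""
    k, n = len(m), len(m[0])
    i = rng.randrange(n)
    for c in range(k):
        m[c][i] = False
    m[rng.randrange(k)][i] = True
    return m
```

Both operators preserve the one-cluster-per-object invariant, so every offspring is again a valid hard partition.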
[13] Luchian et al. (1994) came up with a new encoding that allows a simultaneous search for the optimal number of clusters and the optimal partition. Partitioning is carried out similarly to k-means: data items are assigned to clusters according to their proximities. The crossover and mutation operators are extended to work with variable-length chromosomes and real encoding. A new Lamarckian operator acting at the gene level, which moves a cluster representative to the mean of the corresponding cluster, is introduced. This can be seen as hybridization with k-means, because the cluster-assignment procedure and the update of the cluster representatives amount to one iteration of the traditional clustering algorithm. However, it leads to premature convergence due to increased selection pressure.
[14] Hall et al. (1999) extended the same algorithm to search in fuzzy partitions with a fixed number of clusters. Cluster elements are represented using Gray coding, and this became one of the most successful approaches in the clustering literature [15] (Maulik and Bandyopadhyay, 2000). These approaches are based on the distances between the data items and the cluster centres. [13] Luchian et al. (1994) proposed a centroid based encoding, which allows continuous optimization. Such encodings are mostly used in PSO based clustering and are also useful for searching for cluster representatives. Their performance is comparatively higher than that of the k-means algorithm; they perform better than, or as well as, k-means provided with the best initial configuration. This is due to their increased exploration capabilities and their weaker dependence
on initialization. [16] Abraham et al. (2007) presented a survey on swarm intelligence techniques in which centroid based encoding and differential evolution were used in both supervised and unsupervised scenarios.
2.2.2 Density Based Approaches:
In density-based approaches to clustering, multi-modal evolutionary algorithms search for cluster centres lying in the dense regions of the feature space [17] (Dumitrescu and Simon, 2003). The correctness of the cluster centroids is measured using Gaussian functions. [18] Nasraoui et al. (2005) note that when the data are distributed according to normal distributions, each cluster is a hyper-ellipsoid associated with a mean and a covariance matrix. The local maxima of a density function are traced using a multi-modal genetic algorithm, with each individual represented as a point in the m-dimensional space in order to identify the centres of the dense regions.
[19] Zaharie et al. (2005) study the use of a differential evolution algorithm in the same context. It is concerned not only with the number of clusters but also with the hyper-ellipsoid scales. The method proposed by [18] Nasraoui et al. (2005) generates only one descriptor per cluster, whereas in this approach a set of descriptors can be associated with the same cluster, providing a reliable identification of clusters in noisy data.
2.2.3 Grid Based Approaches:
[20] Sarafis et al. (2002) proposed a genetic algorithm that searches for a partition of the feature space which also induces a partition of the data set. The algorithm generates rules that build a grid in the space: each individual contains a set of k clustering rules, each corresponding to one cluster; in turn, each rule consists of m genes, and each gene corresponds to an interval over one feature. The method attempts to minimize the squared-error criterion using a flexible fitness function, but the form of this fitness function makes it computationally expensive.
3. CONCLUSIONS
This paper surveyed various techniques for carrying out clustering, covering both traditional and evolutionary approaches. Evolutionary approaches often perform better than traditional ones, but each approach has its own performance profile. [3] Ying Zhao and George Karypis report that for every criterion function, partitional algorithms lead to better clustering results than agglomerative algorithms, which suggests that partitional clustering algorithms are well suited for clustering large document datasets owing not only to their relatively low computational requirements but also to their comparable or even better clustering performance. This review has taken many algorithms into account and reviewed them according to their performance.
ACKNOWLEDGEMENTS
I am highly indebted to Mr. Bright Gee Varghese, Assistant Professor, Department of Computer Science and Engineering, for his guidance, and I would like to thank all the authors of the references, which were very helpful for this survey.
REFERENCES
[1]. Clustering: Evolutionary Approaches Mihaela Elena
Breaban ,Prof. PhD Henri Luchian
[2]. Fraley C. and Raftery A.E., ―How Many Clusters? Which
Clustering Method? Answers Via Model-Based Cluster
Analysis‖, Technical Report No. 329. Department of Statistics
University of Washington, 1998
[3]. Comparison of Agglomerative and Partitional Document
Clustering Algorithms- Ying Zhao and George Karypis.
Department of Computer Science, University of Minnesota,
Minneapolis, MN 55455
[4]. Jain, A.K. Murty, M.N. and Flynn, P.J. Data Clustering: A
Survey. ACM Computing Surveys, Vol. 31, No. 3, September
1999
[5]. Sneath, P., and Sokal, R. Numerical Taxonomy. W.H.
Freeman Co., San Francisco, CA, 1973
[6]. Guha, S., Rastogi, R. and Shim, K. CURE: An efficient
clustering algorithm for large databases. In Proceedings of
ACM SIGMOD International Conference on Management of
Data, pages 73-84, New York, 1998.
[7]. King, B. Step-wise Clustering Procedures, J. Am. Stat.
Assoc. 69, pp. 86-101, 1967.
[8]. ―Introduction to partitioning based clustering methods
with a robust example‖ Sami A¨ yra¨mo¨ Tommi Ka¨rkka¨inen
[9]. ―Clustering Methods‖ Lior Rokach Department of
Industrial Engineering Tel-Aviv University Oded Maimon
Department of Industrial Engineering Tel-Aviv University.
[10]. Ravindra Krovi. Genetic algorithms for. clustering: A
preliminary investigation. In Proceedings of the Twenty-Fifth
Hawaii International Conference on System Sciences, pages
540{544. IEEE Computer Society Press, 1991.
[11]. K. Krishna and M. Narasimha Murty. Genetic k-means
algorithm. Systems, Man, and Cybernetics, Part B:
Cybernetics, IEEE Transactions on, 29(3):433 {439, June
1999. ISSN 1083-4419. doi: 10.1109/3477.764879.
[12]. James C. Bezdek, Srinivas Boggavarapu, Lawrence O.
Hall, and Amine Bensaid. Genetic algorithm guided
clustering. In International Conference on Evolutionary
Computation, pages 34{39, 1994
[13]. Silvia Luchian, Henri Luchian, and Mihai Petriuc.
Evolutionary automated classification. In Proceedings of 1st
Congress on Evolutionary Computation, pages 585{588, 1994
[14]. L. O. Hall, B. Ozyurt, and J. C. Bezdek. Clustering with
a genetically optimized approach. IEEE Transactions on
Evolutionary Computation, 3:103-112, 1999.
[15]. Ujjwal Maulik and Sanghamitra Bandyopadhyay.
Genetic algorithm-based clustering technique. Pattern
Recognition, 33(9):1455-1465, 2000. ISSN 0031-3203.
[16]. Ajith Abraham, Swagatam Das, and Sandip Roy. Swarm
intelligence algorithms for data clustering. Soft Computing for
Knowledge Discovery and Data Mining, Springer Verlag,
pages 279-313, 2007.
[17]. D. Dumitrescu and Károly Simon. Evolutionary prototype
selection. In Proceedings of the International Conference on
Theory and Applications of Mathematics and Informatics
ICTAMI, pages 183-190, 2003.
[18]. Olfa Nasraoui, Elizabeth Leon, and Raghu
Krishnapuram. Unsupervised niche clustering: Discovering an
unknown number of clusters in noisy data sets. In Ashish
Ghosh and Lakhmi Jain, editors, Evolutionary Computation in
Data Mining, volume 163 of Studies in Fuzziness and Soft
Computing, pages 157-188. Springer Berlin Heidelberg, 2005.
[19]. Daniela Zaharie. Density based clustering with crowding
differential evolution. In International Symposium on Symbolic
and Numeric Algorithms for Scientific Computing, pages
343-350, 2005.
[20]. I. Sarafis, A. M. S. Zalzala, and P. W. Trinder. A genetic
rule-based data clustering toolkit. In Proceedings of the 2002
Congress on Evolutionary Computation (CEC '02), Volume 02,
pages 1238-1243, Washington, DC, USA, 2002. IEEE
Computer Society. ISBN 0-7803-7282-4. URL
http://portal.acm.org/citation.cfm?id=1251972.1252396
[21]. Witten, Ian H. and Eibe Frank, "Data Mining: Practical
Machine Learning Tools and Techniques", 2nd Edition, Morgan
Kaufmann, San Francisco, 2005.
[22]. Christopher D. Manning, Prabhakar Raghavan and
Hinrich Schütze, "Introduction to Information Retrieval".
BIOGRAPHIES
Christabel Williams is pursuing an MTech
(Computer Science and Engineering) degree
at Karunya University, Tamil Nadu, India.
She received her BE (Computer Science and
Engineering) degree from C.S.I College of
Engineering, Ketti, The Nilgiris, Tamil Nadu,
India.
Bright Gee Varghese R. completed his BE
(CSE) at Sun College of Engineering and
Technology, Nagercoil, Tamil Nadu, India, in
2003. He began his career as a Lecturer at Sun
College of Engineering and Technology,
Nagercoil, in June 2003. He completed his
M.E. at Vinayaka Mission University,
Salem. He is currently working as an Assistant Professor in
the Computer Science Department at Karunya University and
pursuing a part-time Ph.D. at Karunya University, Tamil Nadu,
India. His main research area is Software Engineering.