This document compares the k-means and grid density clustering algorithms. It summarizes that grid density clustering determines dense grids based on the densities of neighboring grids, and is able to handle different shaped clusters in multi-density environments. The grid density algorithm does not require distance computation and is not dependent on the number of clusters being known in advance like k-means. The document concludes that grid density clustering is better than k-means clustering as it can handle noise and outliers, find arbitrary shaped clusters, and has lower time complexity.
Fuzzy clustering and fuzzy c-means partition cluster analysis and validation ...IJECEIAES
A hard partition clustering algorithm assigns equally distant points to one of the clusters, where each datum has the probability to appear in simultaneous assignment to further clusters. The fuzzy cluster analysis assigns membership coefficients of data points which are equidistant between two clusters so the information directs have a place toward in excess of one cluster in the meantime. For a subset of CiteScore dataset, fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before analysis, clusterability of the dataset was evaluated with Hopkins statistic which resulted in 0.4371, a value < 0.5, indicating that the data is highly clusterable. The optimal clusters were determined using NbClust package, where it is evidenced that 9 various indices proposed 3 cluster solutions as best clusters. Further, appropriate value of fuzziness parameter m was evaluated to determine the distribution of membership values with variation in m from 1 to 2. Coefficient of variation (CV), also known as relative variability was evaluated to study the spread of data. The time complexity of fuzzy clustering (fanny) and fuzzy c-means algorithms were evaluated by keeping data points constant and varying number of clusters.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Fuzzy clustering and fuzzy c-means partition cluster analysis and validation ...IJECEIAES
A hard partition clustering algorithm assigns equally distant points to one of the clusters, where each datum has the probability to appear in simultaneous assignment to further clusters. The fuzzy cluster analysis assigns membership coefficients of data points which are equidistant between two clusters so the information directs have a place toward in excess of one cluster in the meantime. For a subset of CiteScore dataset, fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before analysis, clusterability of the dataset was evaluated with Hopkins statistic which resulted in 0.4371, a value < 0.5, indicating that the data is highly clusterable. The optimal clusters were determined using NbClust package, where it is evidenced that 9 various indices proposed 3 cluster solutions as best clusters. Further, appropriate value of fuzziness parameter m was evaluated to determine the distribution of membership values with variation in m from 1 to 2. Coefficient of variation (CV), also known as relative variability was evaluated to study the spread of data. The time complexity of fuzzy clustering (fanny) and fuzzy c-means algorithms were evaluated by keeping data points constant and varying number of clusters.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSijdkp
Subspace clustering discovers the clusters embedded in multiple, overlapping subspaces of high
dimensional data. Many significant subspace clustering algorithms exist, each having different
characteristics caused by the use of different techniques, assumptions, heuristics used etc. A comprehensive
classification scheme is essential which will consider all such characteristics to divide subspace clustering
approaches in various families. The algorithms belonging to same family will satisfy common
characteristics. Such a categorization will help future developers to better understand the quality criteria to
be used and similar algorithms to be used to compare results with their proposed clustering algorithms. In
this paper, we first proposed the concept of SCAF (Subspace Clustering Algorithms’ Family).
Characteristics of SCAF will be based on the classes such as cluster orientation, overlap of dimensions etc.
As an illustration, we further provided a comprehensive, systematic description and comparison of few
significant algorithms belonging to “Axis parallel, overlapping, density based” SCAF.
Comparison Between Clustering Algorithms for Microarray Data AnalysisIOSR Journals
Currently, there are two techniques used for large-scale gene-expression profiling; microarray and
RNA-Sequence (RNA-Seq).This paper is intended to study and compare different clustering algorithms that used
in microarray data analysis. Microarray is a DNA molecules array which allows multiple hybridization
experiments to be carried out simultaneously and trace expression levels of thousands of genes. It is a highthroughput
technology for gene expression analysis and becomes an effective tool for biomedical research.
Microarray analysis aims to interpret the data produced from experiments on DNA, RNA, and protein
microarrays, which enable researchers to investigate the expression state of a large number of genes. Data
clustering represents the first and main process in microarray data analysis. The k-means, fuzzy c-mean, selforganizing
map, and hierarchical clustering algorithms are under investigation in this paper. These algorithms
are compared based on their clustering model.
DISTRIBUTED COVERAGE AND CONNECTIVITY PRESERVING ALGORITHM WITH SUPPORT OF DI...IJCSEIT Journal
Given a 3D space where should be supervised and a group of mobile sensor actor nodes with limited
sensing and communicating capabilities, this paper aims at proposing a distributed self-deployment
algorithm for agents to cover the space as much as possible by considering non-uniform sensing coverage
degree constraint of environment while preserving connectivity. The problem is formulated as coverage
maximization subject to connectivity and sensing coverage degree constraint. Considering a desired
distance between neighbouring nodes, an error function which depends on pairwise distance between
nodes is described. The maximization is encoded to an error minimization problem that is solved using
gradient descent algorithm and will yield in moving sensors into appropriate positions. Simulation results
are presented in two different conditions that importance of sensing coverage degree support of
environment is very high and is low.
On the High Dimentional Information Processing in Quaternionic Domain and its...IJAAS Team
There are various high dimensional engineering and scientific applications in communication, control, robotics, computer vision, biometrics, etc.; where researchers are facing problem to design an intelligent and robust neural system which can process higher dimensional information efficiently. The conventional real-valued neural networks are tried to solve the problem associated with high dimensional parameters, but the required network structure possesses high complexity and are very time consuming and weak to noise. These networks are also not able to learn magnitude and phase values simultaneously in space. The quaternion is the number, which possesses the magnitude in all four directions and phase information is embedded within it. This paper presents a well generalized learning machine with a quaternionic domain neural network that can finely process magnitude and phase information of high dimension data without any hassle. The learning and generalization capability of the proposed learning machine is presented through a wide spectrum of simulations which demonstrate the significance of the work.
Data mining is utilized to manage huge measure of information which are put in the data ware houses and databases, to discover required information and data. Numerous data mining systems have been proposed, for example, association rules, decision trees, neural systems, clustering, and so on. It has turned into the purpose of consideration from numerous years. A re-known amongst the available data mining strategies is clustering of the dataset. It is the most effective data mining method. It groups the dataset in number of clusters based on certain guidelines that are predefined. It is dependable to discover the connection between the distinctive characteristics of data.
In k-mean clustering algorithm, the function is being selected on the basis of the relevancy of the function for predicting the data and also the Euclidian distance between the centroid of any cluster and the data objects outside the cluster is being computed for the clustering the data points. In this work, author enhanced the Euclidian distance formula to increase the cluster quality.
The problem of accuracy and redundancy of the dissimilar points in the clusters remains in the improved k-means for which new enhanced approach is been proposed which uses the similarity function for checking the similarity level of the point before including it to the cluster.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Among many data clustering approaches available today, mixed data set of numeric and category data
poses a significant challenge due to difficulty of an appropriate choice and employment of
distance/similarity functions for clustering and its verification. Unsupervised learning models for
artificial neural network offers an alternate means for data clustering and analysis. The objective of this
study is to highlight an approach and its associated considerations for mixed data set clustering with
Adaptive Resonance Theory 2 (ART-2) artificial neural network model and subsequent validation of the
clusters with dimensionality reduction using Autoencoder neural network model.
INFLUENCE OF PRIORS OVER MULTITYPED OBJECT IN EVOLUTIONARY CLUSTERINGcscpconf
In recent years, Evolutionary clustering is an evolving research area in data mining. The evolution diagnosis of any homogeneous as well as heterogeneous network will provide an overall view about the network. Applications of evolutionary clustering includes, analyzing, the social networks, information networks, about their structure, properties and behaviors. In this paper, the authors study the problem of influence of priors over multi-typed object in
evolutionary clustering. Priors are defined for each type of object in a heterogeneous information network and experimental results were produced to show how consistency and quality of cluster changes according to the priors
Influence of priors over multityped object in evolutionary clusteringcsandit
In recent years, Evolutionary clustering is an evolving research area in data mining. The
evolution diagnosis of any homogeneous as well as heterogeneous network will provide an
overall view about the network. Applications of evolutionary clustering includes, analyzing, the
social networks, information networks, about their structure, properties and behaviors. In this
paper, the authors study the problem of influence of priors over multi-typed object in
evolutionary clustering. Priors are defined for each type of object in a heterogeneous
information network and experimental results were produced to show how consistency and
quality of cluster changes according to the priors.
K-Means clustering uses an iterative procedure which is very much sensitive and dependent upon the initial centroids. The initial centroids in the k-means clustering are chosen randomly, and hence the clustering also changes with respect to the initial centroids. This paper tries to overcome this problem of random selection of centroids and hence change of clusters with a premeditated selection of initial centroids. We have used the iris, abalone and wine data sets to demonstrate that the proposed method of finding the initial centroids and using the centroids in k-means algorithm improves the clustering performance. The clustering also remains the same in every run as the initial centroids are not randomly selected but through premeditated method.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Introduction to Multi-Objective Clustering EnsembleIJSRD
Association rule mining is a popular and well researched method for discovering interesting relations between variables in large databases. In this paper we introduce the concept of Data mining, Association rule and Multilevel association rule with different algorithm, its advantage and concept of Fuzzy logic and Genetic Algorithm. Multilevel association rules can be mined efficiently using concept hierarchies under a support-confidence framework.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSijdkp
Subspace clustering discovers the clusters embedded in multiple, overlapping subspaces of high
dimensional data. Many significant subspace clustering algorithms exist, each having different
characteristics caused by the use of different techniques, assumptions, heuristics used etc. A comprehensive
classification scheme is essential which will consider all such characteristics to divide subspace clustering
approaches in various families. The algorithms belonging to same family will satisfy common
characteristics. Such a categorization will help future developers to better understand the quality criteria to
be used and similar algorithms to be used to compare results with their proposed clustering algorithms. In
this paper, we first proposed the concept of SCAF (Subspace Clustering Algorithms’ Family).
Characteristics of SCAF will be based on the classes such as cluster orientation, overlap of dimensions etc.
As an illustration, we further provided a comprehensive, systematic description and comparison of few
significant algorithms belonging to “Axis parallel, overlapping, density based” SCAF.
Comparison Between Clustering Algorithms for Microarray Data AnalysisIOSR Journals
Currently, there are two techniques used for large-scale gene-expression profiling; microarray and
RNA-Sequence (RNA-Seq).This paper is intended to study and compare different clustering algorithms that used
in microarray data analysis. Microarray is a DNA molecules array which allows multiple hybridization
experiments to be carried out simultaneously and trace expression levels of thousands of genes. It is a highthroughput
technology for gene expression analysis and becomes an effective tool for biomedical research.
Microarray analysis aims to interpret the data produced from experiments on DNA, RNA, and protein
microarrays, which enable researchers to investigate the expression state of a large number of genes. Data
clustering represents the first and main process in microarray data analysis. The k-means, fuzzy c-mean, selforganizing
map, and hierarchical clustering algorithms are under investigation in this paper. These algorithms
are compared based on their clustering model.
DISTRIBUTED COVERAGE AND CONNECTIVITY PRESERVING ALGORITHM WITH SUPPORT OF DI...IJCSEIT Journal
Given a 3D space where should be supervised and a group of mobile sensor actor nodes with limited
sensing and communicating capabilities, this paper aims at proposing a distributed self-deployment
algorithm for agents to cover the space as much as possible by considering non-uniform sensing coverage
degree constraint of environment while preserving connectivity. The problem is formulated as coverage
maximization subject to connectivity and sensing coverage degree constraint. Considering a desired
distance between neighbouring nodes, an error function which depends on pairwise distance between
nodes is described. The maximization is encoded to an error minimization problem that is solved using
gradient descent algorithm and will yield in moving sensors into appropriate positions. Simulation results
are presented in two different conditions that importance of sensing coverage degree support of
environment is very high and is low.
On the High Dimentional Information Processing in Quaternionic Domain and its...IJAAS Team
There are various high dimensional engineering and scientific applications in communication, control, robotics, computer vision, biometrics, etc.; where researchers are facing problem to design an intelligent and robust neural system which can process higher dimensional information efficiently. The conventional real-valued neural networks are tried to solve the problem associated with high dimensional parameters, but the required network structure possesses high complexity and are very time consuming and weak to noise. These networks are also not able to learn magnitude and phase values simultaneously in space. The quaternion is the number, which possesses the magnitude in all four directions and phase information is embedded within it. This paper presents a well generalized learning machine with a quaternionic domain neural network that can finely process magnitude and phase information of high dimension data without any hassle. The learning and generalization capability of the proposed learning machine is presented through a wide spectrum of simulations which demonstrate the significance of the work.
Data mining is utilized to manage huge measure of information which are put in the data ware houses and databases, to discover required information and data. Numerous data mining systems have been proposed, for example, association rules, decision trees, neural systems, clustering, and so on. It has turned into the purpose of consideration from numerous years. A re-known amongst the available data mining strategies is clustering of the dataset. It is the most effective data mining method. It groups the dataset in number of clusters based on certain guidelines that are predefined. It is dependable to discover the connection between the distinctive characteristics of data.
In k-mean clustering algorithm, the function is being selected on the basis of the relevancy of the function for predicting the data and also the Euclidian distance between the centroid of any cluster and the data objects outside the cluster is being computed for the clustering the data points. In this work, author enhanced the Euclidian distance formula to increase the cluster quality.
The problem of accuracy and redundancy of the dissimilar points in the clusters remains in the improved k-means for which new enhanced approach is been proposed which uses the similarity function for checking the similarity level of the point before including it to the cluster.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Among many data clustering approaches available today, mixed data set of numeric and category data
poses a significant challenge due to difficulty of an appropriate choice and employment of
distance/similarity functions for clustering and its verification. Unsupervised learning models for
artificial neural network offers an alternate means for data clustering and analysis. The objective of this
study is to highlight an approach and its associated considerations for mixed data set clustering with
Adaptive Resonance Theory 2 (ART-2) artificial neural network model and subsequent validation of the
clusters with dimensionality reduction using Autoencoder neural network model.
INFLUENCE OF PRIORS OVER MULTITYPED OBJECT IN EVOLUTIONARY CLUSTERINGcscpconf
In recent years, Evolutionary clustering is an evolving research area in data mining. The evolution diagnosis of any homogeneous as well as heterogeneous network will provide an overall view about the network. Applications of evolutionary clustering includes, analyzing, the social networks, information networks, about their structure, properties and behaviors. In this paper, the authors study the problem of influence of priors over multi-typed object in
evolutionary clustering. Priors are defined for each type of object in a heterogeneous information network and experimental results were produced to show how consistency and quality of cluster changes according to the priors
Influence of priors over multityped object in evolutionary clusteringcsandit
In recent years, Evolutionary clustering is an evolving research area in data mining. The
evolution diagnosis of any homogeneous as well as heterogeneous network will provide an
overall view about the network. Applications of evolutionary clustering includes, analyzing, the
social networks, information networks, about their structure, properties and behaviors. In this
paper, the authors study the problem of influence of priors over multi-typed object in
evolutionary clustering. Priors are defined for each type of object in a heterogeneous
information network and experimental results were produced to show how consistency and
quality of cluster changes according to the priors.
K-Means clustering uses an iterative procedure which is very much sensitive and dependent upon the initial centroids. The initial centroids in the k-means clustering are chosen randomly, and hence the clustering also changes with respect to the initial centroids. This paper tries to overcome this problem of random selection of centroids and hence change of clusters with a premeditated selection of initial centroids. We have used the iris, abalone and wine data sets to demonstrate that the proposed method of finding the initial centroids and using the centroids in k-means algorithm improves the clustering performance. The clustering also remains the same in every run as the initial centroids are not randomly selected but through premeditated method.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Introduction to Multi-Objective Clustering EnsembleIJSRD
Association rule mining is a popular and well researched method for discovering interesting relations between variables in large databases. In this paper we introduce the concept of Data mining, Association rule and Multilevel association rule with different algorithm, its advantage and concept of Fuzzy logic and Genetic Algorithm. Multilevel association rules can be mined efficiently using concept hierarchies under a support-confidence framework.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
How to notice and extend educational values of family trips and school outings. Links to your National Curricula and tips on home education for home schooling parents and teachers.
Pictures of the tiger and the caterpillar in front slide courtesy of anankkml1 and gualberto 107 / FreeDigitalPhotos.net
Data mining is a process to extract information from a huge amount of data and transform it into an
understandable structure. Data mining provides the number of tasks to extract data from large databases such
as Classification, Clustering, Regression, Association rule mining. This paper provides the concept of
Classification. Classification is an important data mining technique based on machine learning which is used to
classify the each item on the bases of features of the item with respect to the predefined set of classes or groups.
This paper summarises various techniques that are implemented for the classification such as k-NN, Decision
Tree, Naïve Bayes, SVM, ANN and RF. The techniques are analyzed and compared on the basis of their
advantages and disadvantages
Ensemble based Distributed K-Modes ClusteringIJERD Editor
Clustering has been recognized as the unsupervised classification of data items into groups. Due to the explosion in the number of autonomous data sources, there is an emergent need for effective approaches in distributed clustering. The distributed clustering algorithm is used to cluster the distributed datasets without gathering all the data in a single site. The K-Means is a popular clustering method owing to its simplicity and speed in clustering large datasets. But it fails to handle directly the datasets with categorical attributes which are generally occurred in real life datasets. Huang proposed the K-Modes clustering algorithm by introducing a new dissimilarity measure to cluster categorical data. This algorithm replaces means of clusters with a frequency based method which updates modes in the clustering process to minimize the cost function. Most of the distributed clustering algorithms found in the literature seek to cluster numerical data. In this paper, a novel Ensemble based Distributed K-Modes clustering algorithm is proposed, which is well suited to handle categorical data sets as well as to perform distributed clustering process in an asynchronous manner. The performance of the proposed algorithm is compared with the existing distributed K-Means clustering algorithms, and K-Modes based Centralized Clustering algorithm. The experiments are carried out for various datasets of UCI machine learning data repository.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
The k- Means clustering algorithm is an old algorithm that has been intensely researched owing to its ease
and simplicity of implementation. Clustering algorithm has a broad attraction and usefulness in
exploratory data analysis. This paper presents results of the experimental study of different approaches to
k- Means clustering, thereby comparing results on different datasets using Original k-Means and other
modified algorithms implemented using MATLAB R2009b. The results are calculated on some performance
measures such as no. of iterations, no. of points misclassified, accuracy, Silhouette validity index and
execution time
Extended pso algorithm for improvement problems k means clustering algorithmIJMIT JOURNAL
The clustering is a without monitoring process and one of the most common data mining techniques. The
purpose of clustering is grouping similar data together in a group, so were most similar to each other in a
cluster and the difference with most other instances in the cluster are. In this paper we focus on clustering
partition k-means, due to ease of implementation and high-speed performance of large data sets, After 30
year it is still very popular among the developed clustering algorithm and then for improvement problem of
placing of k-means algorithm in local optimal, we pose extended PSO algorithm, that its name is ECPSO.
Our new algorithm is able to be cause of exit from local optimal and with high percent produce the
problem’s optimal answer. The probe of results show that mooted algorithm have better performance
regards as other clustering algorithms specially in two index, the carefulness of clustering and the quality
of clustering.
A survey on Efficient Enhanced K-Means Clustering Algorithmijsrd.com
Data mining is the process of using technology to identify patterns and prospects from large amount of information. In Data Mining, Clustering is an important research topic and wide range of unverified classification application. Clustering is technique which divides a data into meaningful groups. K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. In this paper, we present the comparison of different K-means clustering algorithms.
Scalable Rough C-Means clustering using Firefly algorithm..................................................................1
Abhilash Namdev and B.K. Tripathy
Significance of Embedded Systems to IoT................................................................................................. 15
P. R. S. M. Lakshmi, P. Lakshmi Narayanamma and K. Santhi Sri
Cognitive Abilities, Information Literacy Knowledge and Retrieval Skills of Undergraduates: A
Comparison of Public and Private Universities in Nigeria ........................................................................ 24
Janet O. Adekannbi and Testimony Morenike Oluwayinka
Risk Assessment in Constructing Horseshoe Vault Tunnels using Fuzzy Technique................................ 48
Erfan Shafaghat and Mostafa Yousefi Rad
Evaluating the Adoption of Deductive Database Technology in Augmenting Criminal Intelligence in
Zimbabwe: Case of Zimbabwe Republic Police......................................................................................... 68
Mahlangu Gilbert, Furusa Samuel Simbarashe, Chikonye Musafare and Mugoniwa Beauty
Analysis of Petrol Pumps Reachability in Anand District of Gujarat ....................................................... 77
Nidhi Arora
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Extended pso algorithm for improvement problems k means clustering algorithmIJMIT JOURNAL
The clustering is a without monitoring process and one of the most common data mining techniques. The
purpose of clustering is grouping similar data together in a group, so were most similar to each other in a
cluster and the difference with most other instances in the cluster are. In this paper we focus on clustering
partition k-means, due to ease of implementation and high-speed performance of large data sets, After 30
year it is still very popular among the developed clustering algorithm and then for improvement problem of
placing of k-means algorithm in local optimal, we pose extended PSO algorithm, that its name is ECPSO.
Our new algorithm is able to be cause of exit from local optimal and with high percent produce the
problem’s optimal answer. The probe of results show that mooted algorithm have better performance
regards as other clustering algorithms specially in two index, the carefulness of clustering and the quality
of clustering.
Electrically small antennas: The art of miniaturizationEditor IJARCET
We are living in the technological era, were we preferred to have the portable devices rather than unmovable devices. We are isolating our self rom the wires and we are becoming the habitual of wireless world what makes the device portable? I guess physical dimensions (mechanical) of that particular device, but along with this the electrical dimension is of the device is also of great importance. Reducing the physical dimension of the antenna would result in the small antenna but not electrically small antenna. We have different definition for the electrically small antenna but the one which is most appropriate is, where k is the wave number and is equal to and a is the radius of the imaginary sphere circumscribing the maximum dimension of the antenna. As the present day electronic devices progress to diminish in size, technocrats have become increasingly concentrated on electrically small antenna (ESA) designs to reduce the size of the antenna in the overall electronics system. Researchers in many fields, including RF and Microwave, biomedical technology and national intelligence, can benefit from electrically small antennas as long as the performance of the designed ESA meets the system requirement.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
1. ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, Issue 6, June 2013
2143
www.ijarcet.org
Abstract— Clustering is the one of the most important task of
the data mining. Clustering is the unsupervised method to
find the relations between points of dataset into several
groups. It can be done by using various types of techniques
like partitioned, hierarchical, density, and grid. Each of these
has its own advantages and disadvantages. Grid density takes
the advantage of the density and the grid algorithms. In this
paper, the comparison of the k-mean algorithm can be done
with grid density algorithm. Grid density algorithm is better
than the k-mean algorithm in clustering.
Index Terms—Clustering, Data types, K-mean, Grid
density.
I. INTRODUCTION
Clustering is the one of the most important task of the data
mining. Clustering is the unsupervised method to find the
relations between points of dataset into several groups.
Unsupervised method means the user known the input data
but not known the output data but the way the user has to
use for obtaining results is known. Cluster analysis can be
done by Finding similarities between data according to the
characteristics found in the data and grouping similar data
objects into clusters. It is the unsupervised process that is
why no predefined class is present. It has the major two
applications: It is used as a stand-alone tool to get insight into
data distribution; it is a preprocessing step for other
algorithms. It has a rich applications and multidisciplinary
efforts such as Pattern Recognition, Spatial Data Analysis in
which it Create thematic maps in GIS by clustering feature
spaces and detect spatial clusters or for other spatial mining
tasks, Image Processing, Economic Science (especially
market research), Document classification and Cluster
Weblog data to discover groups of similar access
patterns.[3]. Various examples of clustering are marketing,
landuse, insurance, city planning, earthquake studies,
transportation system etc. Generally the user has the large
dataset say DataSet(X) when it is pass through the clustering
algorithm it partitioned the DataSet(X) into P number of
clusters.
As in the figure1, the dataset is passed through the clustering
algorithm, then the dataset can be taken as output in the form
of clusters, clusters are of arbitrary shaped clusters.
Amandeep Kaur Mann, M.Tech Computer Science & Technology,
Pinjab Technical University/ RIMT/ RIMT Institutes of Technology,
Patiala, India.
Navneet Kaur, Assistant Professor Computer Science & Technology,
Pinjab Technical University/ RIMT/ RIMT Institutes of Technology,
Patiala, India.
Figure1: Clustering Process
The quality of a clustering result depends on equally the
similarity calculation used by the method and its
implementation. The quality of a clustering technique is also
calculated by its ability to discover some or all of the
unknown patterns. Dissimilarity/Similarity metric is also
used for this purpose. Similarity is spoken in conditions of a
distance function, usually metric: d(i, j). There is a separate
“quality” function that calculates the “goodness” of a cluster.
The definitions of distance functions are generally very
different for interval-scaled, Boolean, categorical, ordinal
ratio, and vector variables. Weights should be associated
with dissimilar variables based on applications and data
semantics. It is tough to define “similar enough” or “good
enough”, the answer is typically highly subjective. Some of
the requirements of clustering in data mining are scalability,
capability to deal with dissimilar types of attributes,
capability to hold dynamic data, detection of clusters with
arbitrary shape, negligible requirements for domain
knowledge to determine input parameters, capable to deal
with noise and outliers, tactless to order of input records,
High dimensionality, merging of user-specified constraints,
interpretability and usability. A good clustering algorithm
produce high superiority clusters with low intra-cluster
variance and high inter-cluster variance. In other words high
similarity between intra-class and low similarity between
inter-class clusters.
Figure2: Intra-class and Inter-class variance
Different types of data that is used for cluster analysis are [3]:
Grid Density Based Clustering Algorithm
Amandeep Kaur Mann, Navneet Kaur
2. ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, Issue 6, June 2013
2144
1. Interval-valued variables: from the standardize data
calculate the mean absolute deviation:
Where
2. Binary variables: It is the contiguous table for binary
values.
3. Nominal variables: A generalization of the binary variable
in that it can take more than 2 states, e.g., red, yellow, blue,
green.
4. Ordinal variables: An ordinal variable can be discrete or
continuous, Order is important, e.g., rank.
5. Ratio-scaled variable: It is a positive measurement on a
nonlinear scale, approximately at exponential scale, such as
AeBt
or Ae-Bt
.
6. Variables of mixed types: A database may contain all the
six types of variables, symmetric binary, asymmetric binary,
nominal, ordinal, interval and ratio.
There are different algorithms for clustering. Some of the
clustering algorithms require that the number of clusters
should be known earlier to the start of clustering method
others find out the clusters themselves. K-mean known the
number of clusters earlier usually Density-based clustering
algorithms are independent of earlier knowledge of number
of cluster. K-mean algorithm may be useful in situations
where the number of cluster should be determined easily
before the start of the algorithm [1].In this paper, the analysis
of K-mean and Grid density clustering algorithm is done, and
also the explain the features in which they are apart from
each other.
2. K-Mean
The k-means algorithm partitions the dataset into „k‟ subsets.
In this, a cluster is represented by its centroid, which is an
average point also called mean of points within a cluster. This
algorithm works resourcefully only with numerical
attributes. And it can be harmfully affected with the single
outlier. It is the most accepted clustering algorithm that is
used in scientific and industrial applications. It is a way of
cluster analysis which aims to partition “n” observations into
k clusters in which each observation belongs to the cluster
with the nearest mean.
The basic algorithm is very simple as: [5]
1. Select “k” points as initial centroids.
2. Repeat.
3. Form “k” clusters by assigning each point to its closest
centriod.
4. Recompute the centroid of each cluster until centroid does
not change.
Figure 3: Example of K-mean
As in the above figure the working of the k-mean algorithm is
shown. Initially the dataset is present with “n” objects.
According to first step of the algorithm, it arbitrarily chooses
the value of “k” i.e. 2 as initial cluster center. After that,
finding the objects that has the most similar center and put
that objects into one cluster. Then update the cluster mean
value and reassign the objects according to the most similar
center and again update the cluster mean value. This process
is repeated again and again until all the objects are put into
cluster.
K-mean algorithm has the limitation that it cannot handle
noise. It is based on the number of objects present and only
applicable when mean is defined, cannot apply on
categorical data. It is also not able to find the non-convex
shape clusters.
3. Grid Density
It determines dense grids based on densities of their
neighbors. Grid density clustering algorithm is able to handle
different shaped clusters in multi-density environments. Grid
density is defined as number of points mapped to one grid.
The density of a grid is defined by the number of the data
points in the grid and if it is higher than density threshold, it
is considered as a dense unit. A grid is called sporadic when
its density is less than the input argument MinPts. [6]. If the
Eps neighborhood of the grid contains at least a minimum
number, MinPts, of objects, then the grid is called a core
grid. If it is within the Eps neighborhood of a core grid, but it
is not a core grid, then the grid is called a border grid. A
border grid may fall into neighborhood of one or more core
grid. For a given grid, if the Eps neighborhood of the grid
contains less than a minimum number, MinPts, of objects,
and it is not within the Eps neighborhood of any core grids,
then the grid is called a noise grid. [8]. In order to obtain
superior results it is truly essential to set the parameters
accurately. In this, each data record in data stream maps to a
grid and grids are clustered based on their density.
|)|...|||(|1
21 fnffffff
mxmxmxns
.)...
21
1
nff
z
ff
xx(xnm
3. ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, Issue 6, June 2013
2145
www.ijarcet.org
The basic steps of the grid based algorithm are:
1. Creating the grid structure, in other words divide the data
space into a finite number of cells.
2. Calculating the cell density for each cell
3. Sorting of the cells according to their densities.
4. Identifying cluster centers.
5. Traversal of neighbor cells.
Figure 4 - Density-Grid based clustering Framework [7]
It uses a two-phase scheme, which consists of an online
component that processes raw data stream and produces
review statistics and an offline component that uses the
review data to create clusters. The grid density algorithms are
able to handle the noise. In this, the numbers of clusters are
not in known in advance. It is also used for spatial database.
It can handle the arbitrary shaped clusters.
The table1 shows the comparison of k-mean with grid
density algorithms. The table 2 shows the various density
and grid density algorithms also show their time complexity
and the limitations of the respective algorithm. From the
comparison of the two tables, it is clear that grid density
algorithm is better than other clustering algorithms.
Table 1. Comparison of K-mean and Grid density algorithms
Parameters K-mean Grid density
Noise No Yes
Criteria used Partition the
dataset into “k”
clusters
Partition the dataset
based on density
No. of clusters
known in
advance
Yes No
Used for spatial
database
No Yes
Used for high
dimensional
No Yes
Shape of clusters Not suitable for
non-convex
shape clusters
Find the arbitrary
shaped clusters
Type of data Handle numeric
data, not handle
categorical data.
Not such type of any
limitation
4. Related Work in Grid Density and Density
Clustering is a problematical task in Data Mining and
Knowledge Discovery. The nearly everyone well-known
algorithm in this family is DDBSCAN which takes two input
arguments, epsilon and MinPts. Dense regions in dataset are
based on these arguments. A region is dense in the order of a
exact point if in its epsilon neighborhood radius there are at
least MinPts points and these points are called core. If two
core points are neighbors of each other, DBSCAN merges
these two points. Time complexity of algorithm is O(n2
).
OPTICS is another density based clustering algorithms. It
tries to overcome the problem of DBSCAN algorithms by
proposing a supplement ordering for points in dataset. This
visualized illustration of objects helps the user to choose
suitable epsilon value for DBSCAN to achieve a proper
result. But this algorithm still has trouble in datasets with
overlapped clusters. The reason behind this is that OPTICS
is structurally similar to the DBSCAN, and the time
complexity of OPTICS is also similar to DBSCAN.
VDBSCAN has the idea of the algorithm is choosing number
of epsilon value rather than one epsilon The time complexity
of algorithm is O(time complexity of DBSCAN * T), in
which “T” is the number of iterations of algorithm.
LDBSCAN suggested using the concepts of local outlier
factor (LOF) and local reach ability distance (LRD). It
powerfully manipulates the output of clustering but the
algorithm has no direction to select correct values.
GMDBSCAN is a grid-density based algorithm to identify
clusters. It divides the data space into a number of grids. The
computational and time complexity both are high. It works
on local density value.
MSDBSCAN settlement from the concept of local core
distance(lcd). By changing the core point into local core
distance, it enables algorithm to find clusters with different
shapes in multi density environments. The time complexity
of the algorithm is still the drawback of the algorithm like
DBSCAN.
GDCLU is the Grid Density Clustering Algorithm which
concentrates on time complexity. As it is already discussed
in the section2
4. ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, Issue 6, June 2013
2146
Table 2: Comparison of density and grid density algorithms
4. CONCLUSION AND FUTURE SCOPE
Grid density takes the advantage of the density and the grid
algorithms. Grid density is suitable for handling noise and
outliers. It can find the arbitrary shaped clusters used for high
dimensional data. The grid density algorithm does not
require the distance computation. K-mean knows the
number of clusters in advance but the grid density does not.
Grid density algorithm is better than the k-mean algorithm in
clustering. The advantage of grid density method is lower
processing time. Therefore, our aim is to implement the grid
density clustering algorithm for analyze and increase the
speed, efficiency and accuracy of the dataset.
ACKNOWLEDGEMENT
I express my sincere gratitude to my guide Ms. Navneet
Kaur, for her valuable guidance and advice. Also I would like
to thanks the Department of Computer Science &
Engineering of RIMT Institutes near Floating Restaurant,
Sirhind Side, Mandi Gobindgarh-147301, and Punjab, India.
REFERENCES
[1] Mariam Rehman, Syed Atif Mehdi. Comparison Of
Density-Based Clustering Algorithms, research work,
Lahore College for Women University Lahore, Pakistan.
[2]Pasi Fränti. Number of clusters (validation of clustering),
Speech and Image Processing Unit School of Computing
University of Eastern Finland.
[3] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan.
Automatic subspace clustering of high dimensional data
for data mining applications. SIGMOD'98.
[4] K. Mumtaz1 and Dr. K. Duraiswamy. A Novel Density
based improved k-means Clustering Algorithm –
Dbkmeans. (IJCSE) International Journal on Computer
Science and Engineering, Vol. 02, No. 02, pp. 213-218, 2010.
[5] Pradeep Rai, Shubha Singh. A Survey of Clustering
Techniques, International Journal of Computer Applications
(0975 – 8887), Volume 7– No.12, October 2010.
[6] Gholamreza Esfandani, Mohsen Sayyadi. GDCLU: a
new Grid-Density based CLUstring algorithm, ACIS
International Conference on Software Engineering, Artificial
Intelligence, Networking and Parallel/Distributed
Computing, pp 102-107, 2012.
[7] Amineh Amini, Teh Ying Wah, Mahmoud Reza Saybani,
Saeed Reza Aghabozorgi Sahaf Yazdi. A Study of
Density-Grid based Clustering Algorithms on Data
Streams, Eighth International Conference on Fuzzy Systems and
Knowledge Discovery (FSKD), 2011.
[8] Zheng Hua, Wang Zhenxing, Zhang Liancheng, Wang
Qian. Clustering Algorithm Based on Characteristics of
Density Distribution, National Digital Switching System
Engineering & Technological R&D Center, pp431-435, 2010.
[9] D. Gibson, J. Kleinberg, and P. Raghavan. Clustering
categorical data: An approach based on dynamic
systems. VLDB‟98.
[10] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A
density-based algorithm for discovering clusters in large
spatial databases. KDD'96.
[11] B. Borah and D.K. Bhattacharyya. An Improved
Sampling-Based DBSCAN for Large Spatial Databases,
International Conference on Intelligent Sensing and
Information, IEEE, pp. 92-96, 2004.
Algorithm Review
DBSCAN
(Density Based Spatial
Clustering Algorithm in
Noise)
Time complexity is O(n)2
and
not suitable with different
densities.
OPTICS (Ordering Points
To Identify The Clustering
Structure)
Time complexity is also O (n) 2
and has trouble of overlapped
clusters.
VDBSCAN (Varied Density
Based Spatial
Clustering of Applications
with Noise)
Time complexity is O (time
complexity of DBSCAN* T),
where T is the number of
iterations of algorithm. And
choose a number of epsilon
value rather than one value.
LDBSCAN (A local-density
based
spatial clustering algorithm
with noise)
It powerfully manipulates the
output of clustering but the
algorithm has no direction to
select correct values.
GMDBSCAN (Multi-
Density DBSCAN Cluster
Based on Grid)
The computational and time
complexity both are high. It
works on local density value.
P-DBSCAN ( Photo- Density
Based Spatial Clustering
Algorithm in Noise)
It concentrates on clustering
spatial data. It changes the core
point into photo.
MSDBSCAN (Multi Density
Scale
Independent Clustering Algorithm
Based on DBSCAN)
The time complexity of the
algorithm is still the drawback of
this algorithm.
GDCLU (Grid Density
Clustering algorithm)
It generates major clusters by
merging dense grids. The time
complexity of the algorithm is
better than others.
5. ISSN: 2278 – 1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Volume 2, Issue 6, June 2013
2147
www.ijarcet.org
[12] Jiawei Han and Micheline Kamber. Data Mining:
Concepts and Techniques, Morgan Kaufmann Publishers,
pp. 383-463, 2006.
[13] Mihael Ankerst, Markus M.Breunig and Hans-Peter
Kriegel. OPTIC:Ordering points to identify the clustering
structure, ACM SIGMOD International Conference on
Management of Data, pp. 49-60, 1999.
[14] D. Zhou, N. Wu, P. Liue. VDBSCAN: Varied Density
Based Spatial Clustering of Applications with Noise,
International Conference on Service Systems and Service
Management, pp. 1-4, 2007.
[15] M. Yufang, Z. Yan, W. Ping, C. Xiaoyun.
GMDBSCAN: Multi- Density DBSCAN Cluster Based on
Grid, IEEE International Conference on e-Business
Engineering, pp. 780-783, 2008.
[16] G. Esfandani, H. Abolhassani. MSDBSCAN: Multi
Density Scale Independent Clustering Algorithm Based
on DBSCAN, Advanced Data Mining and Application
(ADMA), LNCS, vol. 6440, pp. 202-213, 2010.
Amandeep Kaur Mann pursuing M.Tech in Computer Science &
Technology in RIMT institutes of Engineering & Technology, Mandi
Gobindgarh belongs to Punjab Technical University. And doing the research
work in data mining on grid density algorithm used in Clustering
Navneet Kaur works as Assistant Professor in Computer Science &
Technology in RIMT institutes of Engineering & Technology; Mandi
Gobindgarh belongs to Punjab Technical University. She guides the M.Tech
students in their research work in Data Mining field.