Experimental study of Data clustering using k-Means and modified algorithms (IJDKP)
The k-Means clustering algorithm is an old algorithm that has been intensively researched owing to its ease and simplicity of implementation. Clustering algorithms have broad appeal and usefulness in exploratory data analysis. This paper presents the results of an experimental study of different approaches to k-Means clustering, comparing results on different datasets using the original k-Means and other modified algorithms implemented in MATLAB R2009b. The results are evaluated on several performance measures, such as the number of iterations, the number of points misclassified, accuracy, the Silhouette validity index, and execution time.
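The baseline being benchmarked is Lloyd's original k-Means loop. A minimal sketch of that loop (in Python rather than the paper's MATLAB, on made-up 1-D data) also reports the iteration count that the paper uses as one of its performance measures:

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means on 1-D points; returns (centers, labels, iterations)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # initial centers: k random data points
    labels = [0] * len(points)
    for it in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda c: abs(p - centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
        if new_labels == labels:
            return centers, labels, it   # converged: assignments stopped changing
        labels = new_labels
    return centers, labels, max_iter

data = [1.0, 1.2, 0.8, 8.0, 8.4, 7.6]
centers, labels, iters = kmeans(data, k=2)
```

On this well-separated toy data the two centers settle at the cluster means (1.0 and 8.0) within a few iterations.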
Textual Data Partitioning with Relationship and Discriminative Analysis (IJMTER)
Data partitioning methods are used to partition data values by similarity. Similarity measures are used to estimate transaction relationships. Hierarchical clustering models produce tree-structured results, while partitional clustering produces results in a grid format. Text documents are unstructured data values with high-dimensional attributes. Document clustering groups unlabeled text documents into meaningful clusters. Traditional clustering methods require the cluster count (K) for the document grouping process, and clustering accuracy degrades drastically when an unsuitable cluster count is chosen.
Textual data elements are divided into two types: discriminative words and non-discriminative words. Only discriminative words are useful for grouping documents; the involvement of non-discriminative words confuses the clustering process and leads to a poor clustering solution. A variational inference algorithm is used to infer the document collection structure and the partition of document words at the same time. The Dirichlet Process Mixture (DPM) model is used to partition documents; it exploits both the data likelihood and the clustering property of the Dirichlet Process (DP). The Dirichlet Process Mixture Model for Feature Partition (DPMFP) is used to discover the latent cluster structure based on the DPM model, and DPMFP clustering is performed without requiring the number of clusters as input.
Document labels are used to estimate the discriminative word identification process. Concept relationships are analyzed with ontology support, and a semantic weight model is used for document similarity analysis. The system improves scalability by using labels and concept relations for dimensionality reduction.
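The discriminative versus non-discriminative split can be illustrated with a toy heuristic: a word whose relative frequency concentrates in some documents carries grouping signal, while a word spread evenly over all documents (like a stop word) does not. This is only an illustration of the distinction, not the paper's DPMFP variational inference, and the threshold is an assumption:

```python
def discriminative_words(docs, min_spread=0.5):
    """Toy heuristic: flag a word as discriminative when its relative
    frequency is concentrated in some documents rather than spread
    evenly over all of them. Returns {word: True/False}."""
    vocab = {w for d in docs for w in d}
    flags = {}
    for w in vocab:
        freqs = [d.count(w) / len(d) for d in docs]
        spread = max(freqs) - min(freqs)
        # evenly spread words (like stop words) have a small spread
        flags[w] = spread >= min_spread * max(freqs)
    return flags

docs = [
    ["the", "match", "goal", "goal"],
    ["the", "vote", "election", "ballot"],
]
flags = discriminative_words(docs)
```

Here "goal" and "election" come out discriminative while the evenly distributed "the" does not.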
Particle Swarm Optimization based K-Prototype Clustering Algorithm (IOSR-JCE)
Scalable Rough C-Means clustering using Firefly algorithm
Abhilash Namdev and B. K. Tripathy
Significance of Embedded Systems to IoT
P. R. S. M. Lakshmi, P. Lakshmi Narayanamma and K. Santhi Sri
Cognitive Abilities, Information Literacy Knowledge and Retrieval Skills of Undergraduates: A Comparison of Public and Private Universities in Nigeria
Janet O. Adekannbi and Testimony Morenike Oluwayinka
Risk Assessment in Constructing Horseshoe Vault Tunnels using Fuzzy Technique
Erfan Shafaghat and Mostafa Yousefi Rad
Evaluating the Adoption of Deductive Database Technology in Augmenting Criminal Intelligence in Zimbabwe: Case of Zimbabwe Republic Police
Mahlangu Gilbert, Furusa Samuel Simbarashe, Chikonye Musafare and Mugoniwa Beauty
Analysis of Petrol Pumps Reachability in Anand District of Gujarat
Nidhi Arora
WITH SEMANTICS AND HIDDEN MARKOV MODELS TO AN ADAPTIVE LOG FILE PARSER (ijnlc)
We aim to model an adaptive log file parser. As the content of log files often evolves over time, we establish a dynamic statistical model which learns and adapts processing and parsing rules. First, we limit the amount of unstructured text by clustering log file lines based on their semantics. Next, we take only the most relevant cluster into account and focus on those frequent patterns which lead to the desired output table, similar to Vaarandi [10]. Furthermore, we transform the found frequent patterns and the output stating the parsed table into a Hidden Markov Model (HMM). We use this HMM as a specific yet flexible representation of a pattern for log file parsing to maintain high-quality output. After training our model on one system type and applying it to a different system with slightly different log file patterns, we achieve an accuracy of over 99.99%.
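A Vaarandi-style frequent-pattern pass of the kind the abstract references can be sketched as follows. The log lines, the support threshold, and the "*" wildcard convention are illustrative assumptions, not the paper's exact rules:

```python
from collections import Counter

def frequent_pattern(lines, support=0.6):
    """Vaarandi-style sketch: a token occurring at the same position in at
    least `support` of the lines is kept as a constant of the template;
    every other position becomes a variable slot ('*')."""
    tokenized = [line.split() for line in lines]
    width = max(len(toks) for toks in tokenized)
    template = []
    for i in range(width):
        col = Counter(toks[i] for toks in tokenized if len(toks) > i)
        token, count = col.most_common(1)[0]
        template.append(token if count / len(lines) >= support else "*")
    return " ".join(template)

logs = [
    "sshd accepted password for alice from 10.0.0.5",
    "sshd accepted password for bob from 10.0.0.7",
    "sshd accepted password for carol from 10.0.0.9",
]
pattern = frequent_pattern(logs)
```

The constant tokens form the parsing template, and the "*" slots mark the fields (user, address) that would feed the output table.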
Job Scheduling on the Grid Environment using Max-Min Firefly Algorithm (IJCATR)
Grid computing is the next generation of distributed systems; its goal is to create a powerful, virtual, autonomous computer built from countless heterogeneous resources for the purpose of resource sharing. Scheduling is one of the main steps in exploiting the capabilities of emerging computing systems such as the grid. Scheduling jobs in computational grids is known to be an NP-complete problem because of the heterogeneity of the resources. Grid resources belong to different management domains, and each applies different management policies. Since the nature of the grid is heterogeneous and dynamic, techniques used in traditional systems cannot be applied to grid scheduling, so new methods must be found. This paper proposes a new algorithm which combines the firefly algorithm with the Max-Min algorithm for scheduling jobs on the grid. The firefly algorithm is a recent swarm-based technique inspired by the social behavior of fireflies in nature; the fireflies move through the problem's search space to find optimal or near-optimal solutions. The goals of this paper are to minimize the makespan and the flowtime of completing jobs simultaneously. Experiments and simulation results show that the proposed method is more efficient than the other algorithms compared.
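The two objectives named above can be computed directly from any candidate schedule. A small sketch (machine names and job runtimes are made up for illustration):

```python
def makespan_and_flowtime(schedule):
    """schedule: {machine: [job runtimes in execution order]}.
    Makespan = finish time of the busiest machine;
    flowtime = sum of every job's completion time."""
    finish = {m: sum(jobs) for m, jobs in schedule.items()}
    flow = 0.0
    for jobs in schedule.values():
        t = 0.0
        for runtime in jobs:
            t += runtime
            flow += t        # this job completes at time t on its machine
    return max(finish.values()), flow

sched = {"m1": [3.0, 2.0], "m2": [4.0]}
ms, ft = makespan_and_flowtime(sched)
```

For this schedule the makespan is 5.0 (machine m1 finishes last) and the flowtime is 3 + 5 + 4 = 12.0; a scheduler such as the proposed firefly/Max-Min hybrid searches over assignments to minimize both.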
A survey on Efficient Enhanced K-Means Clustering Algorithm (ijsrd.com)
Data mining is the process of using technology to identify patterns and prospects in large amounts of information. In data mining, clustering is an important research topic with a wide range of unsupervised classification applications. Clustering is a technique which divides data into meaningful groups. K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. In this paper, we present a comparison of different K-means clustering algorithms.
A Novel Approach to Mathematical Concepts in Data Mining (ijdmtaiir)
This paper describes three different fundamental mathematical programming approaches that are relevant to data mining: feature selection, clustering, and robust representation. The paper covers two clustering algorithms, the K-means algorithm and the K-median algorithm. Clustering is illustrated by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for Knowledge Discovery in Databases (KDD). The results of the K-median algorithm are used to identify blood cancer patients in a medical database. K-means clustering is a data mining/machine learning algorithm used to cluster observations into groups of related observations without any prior knowledge of those relationships. The K-means algorithm is one of the simplest clustering techniques and is commonly used in medical imaging, biometrics, and related fields.
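The difference between the two center-update rules, and why the K-median variant can suit noisy medical records, already shows up on a single contaminated cluster (the values below are made up):

```python
import statistics

# One cluster of similar measurements plus a single outlier record.
# The mean center is dragged toward the outlier, while the median
# stays representative of the bulk of the cluster -- the robustness
# property that distinguishes K-median from K-means updates.
cluster = [10.0, 11.0, 12.0, 13.0, 200.0]
mean_center = sum(cluster) / len(cluster)    # used by K-means
median_center = statistics.median(cluster)   # used by K-median
```

Here the mean lands at 49.2, far from every genuine member, while the median stays at 12.0.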
A PSO-Based Subtractive Data Clustering Algorithm (IJORCS)
There is a tremendous proliferation in the amount of information available on the largest shared information source, the World Wide Web. Fast and high-quality clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize the information. Recent studies have shown that partitional clustering algorithms such as the k-means algorithm are the most popular algorithms for clustering large datasets. The major problem with partitional clustering algorithms is that they are sensitive to the selection of the initial partitions and are prone to premature convergence to local optima. Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers for any given set of data. These cluster estimates can be used to initialize iterative optimization-based clustering methods and model identification methods. In this paper, we present a hybrid Subtractive + (PSO) clustering algorithm that performs fast clustering. For comparison purposes, we applied the Subtractive + (PSO) clustering algorithm, PSO, and the Subtractive clustering algorithm to three different datasets. The results illustrate that the Subtractive + (PSO) clustering algorithm generates the most compact clustering results compared to the other algorithms.
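Subtractive clustering's one-pass estimate can be sketched in a few lines (1-D for brevity; the radius ra and the conventional rb = 1.5 * ra follow the common formulation and are not necessarily the paper's settings):

```python
import math

def subtractive_centers(points, n_centers=2, ra=1.5):
    """One-pass subtractive clustering (1-D sketch): each point's potential
    measures how densely other points surround it within radius ra; the
    highest-potential point becomes a center, then the potential around it
    is subtracted (with the conventional rb = 1.5 * ra) before the next pick."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (1.5 * ra) ** 2
    pot = [sum(math.exp(-alpha * (x - y) ** 2) for y in points) for x in points]
    centers = []
    for _ in range(n_centers):
        i = max(range(len(points)), key=lambda j: pot[j])
        c, pc = points[i], pot[i]
        centers.append(c)
        # damp the potential near the chosen center so the next center
        # is found elsewhere
        pot = [p - pc * math.exp(-beta * (x - c) ** 2) for p, x in zip(pot, points)]
    return centers

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers = subtractive_centers(data)
```

The two picked centers (0.1 and 5.1, the densest point of each group) are exactly the kind of estimate the paper feeds into the PSO refinement stage.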
A Combined Approach for Feature Subset Selection and Size Reduction for High ... (IJERA)
Selection of relevant features from a given feature set is one of the important issues in the fields of data mining and classification. In general, a dataset may contain many features, but not all of them are important for a particular analysis or decision, because features may share common information or be completely irrelevant to the processing at hand. This generally happens because of improper selection of features during dataset formation or because of incomplete information about the observed system. In both cases, the data will contain features that merely increase the processing burden, which may ultimately lead to improper outcomes when used for analysis. For these reasons, methods are required to detect and remove such features; hence, in this paper we present an efficient approach that not only removes unimportant features but also reduces the overall size of the dataset. The proposed algorithm uses information theory to compute the information gain of each feature and a minimum spanning tree to group similar features; in addition, fuzzy c-means clustering is used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that the presented algorithm not only reduces the feature set and data length but also improves the performance of the classifier.
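The information-gain step mentioned above is standard. A minimal sketch for categorical features (the toy feature/label vectors are made up):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG(feature) = H(labels) - sum_v p(feature=v) * H(labels | feature=v)."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        sub = [l for f, l in zip(feature, labels) if f == v]
        cond += len(sub) / n * entropy(sub)
    return entropy(labels) - cond

labels = [0, 0, 1, 1]
ig_informative = information_gain(["a", "a", "b", "b"], labels)  # perfectly predictive
ig_useless = information_gain(["a", "b", "a", "b"], labels)      # independent of labels
```

A feature that perfectly predicts the labels scores the full 1 bit of gain, while a feature independent of the labels scores 0, which is how the algorithm ranks candidates for removal.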
Feature selection in high-dimensional datasets is considered a complex and time-consuming problem. To enhance classification accuracy and reduce execution time, Parallel Evolutionary Algorithms (PEAs) can be used. In this paper, we review the most recent works on the use of PEAs for feature selection in large datasets. We have classified the algorithms in these papers into four main classes: Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Scatter Search (SS), and Ant Colony Optimization (ACO). Accuracy is adopted as the measure for comparing the efficiency of these PEAs. It is noticeable that Parallel Genetic Algorithms (PGAs) are the most suitable algorithms for feature selection in large datasets, since they achieve the highest accuracy. On the other hand, we found that parallel ACO is time-consuming and less accurate compared with the other PEAs.
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R... (ijscmc)
Face recognition is one of the most unobtrusive biometric techniques and can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed, with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is high susceptibility to variations in the face owing to factors such as changes in pose, varying illumination, different expressions, the presence of outliers, noise, etc. This paper explores a novel technique for face recognition by classifying face images with an unsupervised learning approach through K-Medoids clustering. The Partitioning Around Medoids (PAM) algorithm has been used to perform K-Medoids clustering of the data. The results suggest increased robustness to noise and outliers in comparison with other clustering methods. Therefore, the technique can also be used to increase the overall robustness of a face recognition system, thereby increasing its invariance and making it a reliably usable biometric modality.
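PAM's defining property, that cluster prototypes are actual data points chosen to minimise total dissimilarity, can be shown with a brute-force sketch on tiny 1-D data (real PAM uses a greedy build-and-swap search instead of full enumeration, and face images would use a distance in feature space rather than absolute difference):

```python
from itertools import combinations

def pam_bruteforce(points, k):
    """Choose the k medoids (actual data points) minimising the total
    distance from every point to its nearest medoid."""
    def cost(medoids):
        return sum(min(abs(p - m) for m in medoids) for p in points)
    return min(combinations(points, k), key=cost)

data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids = pam_bruteforce(data, k=2)
```

Because medoids must be real observations, a noisy or outlying sample cannot drag a prototype off the data the way it drags a k-means centroid, which is the robustness the abstract reports.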
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING (cscpconf)
Data clustering is a process of arranging similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is higher than that among groups. In this paper, a hybrid clustering algorithm based on K-means and K-harmonic means (KHM) is described. The proposed algorithm is tested on five different datasets. The research is focused on fast and accurate clustering, and its performance is compared with the traditional K-means and KHM algorithms. The results obtained from the proposed hybrid algorithm are much better than those of the traditional K-means and KHM algorithms.
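The KHM side of the hybrid replaces the k-means sum of squared distances with a harmonic-average objective, which softens the hard nearest-center assignment. A sketch of that objective (1-D; the small epsilon avoids division by zero and is an implementation convenience, not part of the formulation):

```python
def khm_objective(points, centers, p=2):
    """K-harmonic means performance function:
    KHM(X, C) = sum_i  k / sum_j 1 / ||x_i - c_j||^p,
    the harmonic average of each point's distances to ALL centers,
    minimised in place of the k-means sum of squared distances."""
    k = len(centers)
    total = 0.0
    for x in points:
        denom = sum(1.0 / (abs(x - c) ** p + 1e-12) for c in centers)
        total += k / denom
    return total

good = khm_objective([0.0, 4.0], [0.0, 4.0])   # centers sitting on the points
bad = khm_objective([0.0, 4.0], [2.0, 2.0])    # both centers stuck in the middle
```

Well-placed centers drive the objective toward zero, while a degenerate placement scores much worse, so descent on this objective is less sensitive to initialization than plain k-means.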
A H-K clustering algorithm for high dimensional data using ensemble learning (ijitcs)
Advances to the traditional clustering algorithms address problems such as the curse of dimensionality and the sparsity of data across multiple attributes. The traditional H-K clustering algorithm can resolve the randomness and the a priori choice of the initial centers in the K-means clustering algorithm, but when applied to high-dimensional data it suffers from the dimensional-disaster problem due to high computational complexity. Advanced clustering algorithms such as subspace and ensemble clustering improve the performance of clustering high-dimensional datasets from different aspects and to different extents, yet each of these algorithms improves performance from a single perspective only. The objective of the proposed model is to improve the performance of traditional H-K clustering and overcome its limitations, namely high computational complexity and poor accuracy on high-dimensional data, by combining three different clustering approaches: a subspace clustering algorithm and an ensemble clustering algorithm together with the H-K clustering algorithm.
Extended PSO algorithm for improvement of the k-means clustering algorithm (IJMIT)
Clustering is an unsupervised process and one of the most common data mining techniques. The purpose of clustering is to group similar data together, so that instances within a cluster are as similar as possible to each other and as different as possible from instances in other clusters. In this paper we focus on partitional k-means clustering, which, owing to its ease of implementation and high speed on large data sets, remains very popular among clustering algorithms even after 30 years. To address the problem of the k-means algorithm becoming trapped in local optima, we propose an extended PSO algorithm named ECPSO. Our new algorithm is able to escape local optima and, with high probability, produces the problem's optimal answer. The experimental results show that the proposed algorithm performs better than other clustering algorithms, especially on two indices: clustering accuracy and clustering quality.
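Standard PSO, the basis that ECPSO extends, escapes local optima through the swarm's shared global best. A minimal sketch of plain PSO on a multimodal 1-D function (this is generic PSO, not the paper's ECPSO; the inertia and acceleration parameters are conventional defaults):

```python
import random

def pso_minimise(f, bounds, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Plain PSO: each particle keeps its personal best, the swarm shares a
    global best, and the velocity update
        v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
    pulls particles toward both, letting the swarm leave poor local basins."""
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]                      # personal best positions
    gbest = min(pbest, key=f)          # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = w * vs[i] + c1 * r1 * (pbest[i] - xs[i]) + c2 * r2 * (gbest - xs[i])
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))   # keep inside the bounds
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i]
        gbest = min(pbest, key=f)
    return gbest

# Multimodal test function: a local minimum near x = 1.36 and the
# global minimum near x = -1.48.
f = lambda x: x ** 4 - 4 * x ** 2 + x
best = pso_minimise(f, (-3.0, 3.0))
```

A gradient-style or k-means-style descent started near x = 1.36 would stop at the local minimum, while the swarm settles in the global basin near x = -1.48; in ECPSO the same mechanism drives cluster centers out of the local optima that trap k-means.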
In the recent machine learning community, there is a trend of constructing a nonlinear version of a linear algorithm through the 'kernel method', for example kernel principal component analysis, kernel Fisher discriminant analysis, Support Vector Machines (SVMs), and the current kernel clustering algorithms. Typically, in unsupervised clustering algorithms utilizing the kernel method, a nonlinear mapping is first applied to map the data into a much higher-dimensional feature space, and then clustering is performed. A drawback of these kernel clustering algorithms is that the clustering prototypes reside in the high-dimensional feature space and therefore lack clear, intuitive descriptions unless an additional approximate projection from the feature space back to the data space is performed, as done in the existing literature. In this paper, utilizing the 'kernel method', a novel clustering algorithm founded on the conventional fuzzy c-means algorithm (FCM) is proposed, known as the kernel fuzzy c-means algorithm (KFCM). This method adopts a new kernel-induced metric in the data space to replace the original Euclidean norm, so the cluster prototypes of the fuzzy clustering algorithm still reside in the data space and the clustering results can be interpreted in the original space. This property is exploited for clustering incomplete data. Experiments on synthetic data illustrate that KFCM achieves better clustering performance and is more robust than other variants of FCM for clustering incomplete data.
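With a Gaussian kernel, the kernel-induced metric described above has a closed form, and the prototype c stays in the original data space, which is the interpretability point the abstract makes. A one-line sketch (1-D; sigma is an assumed bandwidth):

```python
import math

def kernel_dist2(x, c, sigma=1.0):
    """Kernel-induced squared distance used with a Gaussian kernel:
    ||phi(x) - phi(c)||^2 = K(x,x) + K(c,c) - 2*K(x,c) = 2*(1 - K(x,c)),
    since K(x,x) = 1 for the Gaussian kernel."""
    k = math.exp(-abs(x - c) ** 2 / (2 * sigma ** 2))
    return 2.0 * (1.0 - k)

d_same = kernel_dist2(1.0, 1.0)     # identical point and prototype
d_near = kernel_dist2(0.0, 1.0)
d_far = kernel_dist2(0.0, 1000.0)   # saturates at 2
```

Unlike the squared Euclidean distance, this metric saturates at 2 for far-away points, which bounds the influence of outliers and missing-value imputations, one plausible reason such kernel-induced metrics behave robustly on incomplete data.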
A Combined Approach for Feature Subset Selection and Size Reduction for High ...IJERA Editor
Selection of relevant features from a given feature set is one of the important issues in data mining and classification. In general, a dataset may contain many features, but not all of them are necessarily important for a particular analysis or decision-making task, because features may share common information or be completely irrelevant to the processing at hand. This usually happens because of improper feature selection during dataset formation or incomplete information about the observed system. In both cases the data will contain features that merely increase the processing burden and may ultimately distort the outcome of the analysis. Methods are therefore required to detect and remove such features, so in this paper we present an efficient approach that not only removes the unimportant features but also reduces the overall size of the dataset. The proposed algorithm uses information theory to compute the information gain of each feature and a minimum spanning tree to group similar features; fuzzy c-means clustering is then used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that it not only reduces the feature set and the data length but also improves the performance of the classifier.
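The information-gain step mentioned in this abstract can be illustrated with a small sketch. This is the generic textbook formulation for discrete features, not the paper's exact implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) of a label sequence, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature, labels):
    """Information gain IG(Y; X) = H(Y) - H(Y | X) for one discrete feature.
    Features with near-zero gain carry no class information and are
    natural candidates for removal."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        sub = [y for x, y in zip(feature, labels) if x == v]
        cond += len(sub) / n * entropy(sub)
    return entropy(labels) - cond
```

Ranking features by this score gives the first filtering stage; grouping by similarity (the MST step) and removing near-duplicate rows would follow.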
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field of Engineering, Science and Technology, covering new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Feature selection in high-dimensional datasets is
considered to be a complex and time-consuming problem. To
enhance the accuracy of classification and reduce the execution
time, Parallel Evolutionary Algorithms (PEAs) can be used. In
this paper, we review the most recent works that apply PEAs to feature selection in large datasets.
We have classified the algorithms in these papers into four main
classes (Genetic Algorithms (GA), Particle Swarm Optimization
(PSO), Scattered Search (SS), and Ant Colony Optimization
(ACO)). The accuracy is adopted as a measure to compare the
efficiency of these PEAs. It is noticeable that the Parallel Genetic
Algorithms (PGAs) are the most suitable algorithms for feature
selection in large datasets; since they achieve the highest accuracy.
On the other hand, we found that the Parallel ACO is time-consuming
and less accurate compared with the other PEAs.
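As a rough illustration of the GA family this survey covers, the sketch below runs a serial genetic algorithm over feature-subset bitstrings. In the surveyed PEAs the fitness evaluations (normally classifier accuracy) would be distributed across parallel workers; the toy fitness function used in the test is an assumption for demonstration only.

```python
import random

def ga_feature_select(n_features, fitness, pop_size=20, gens=60, seed=1):
    """Serial GA over feature-subset bitstrings (1 = feature kept), with
    elitist truncation selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half
        pop = list(elite)
        while len(pop) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]         # one-point crossover
            if rng.random() < 0.3:
                j = rng.randrange(n_features) # bit-flip mutation
                child[j] ^= 1
            pop.append(child)
    return max(pop, key=fitness)
```

Parallelizing this loop is straightforward because the per-individual fitness calls are independent, which is exactly why PGAs scale well on large datasets.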
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R... - ijscmc
Face recognition is one of the most unobtrusive biometric techniques that can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is their high susceptibility to variations in the face owing to factors such as changes in pose, varying illumination, differing expression, and the presence of outliers and noise. This paper explores a novel technique for face recognition by performing classification of the face images using an unsupervised learning approach through K-Medoids clustering. The Partitioning Around Medoids (PAM) algorithm has been used for performing the K-Medoids clustering of the data. The results are suggestive of increased robustness to noise and outliers in comparison to other clustering methods. Therefore, the technique can also be used to increase the overall robustness of a face recognition system, increasing its invariance and making it a reliably usable biometric modality.
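A minimal sketch of PAM's swap phase (with a naive build step) might look like this. A real face-recognition pipeline would cluster feature vectors extracted from the images; 1-D numbers suffice here to show the mechanics.

```python
def pam(points, k, dist):
    """K-Medoids via Partitioning Around Medoids: start from k arbitrary
    medoids, then greedily swap a medoid for a non-medoid whenever the
    swap lowers the total dissimilarity. Medoids are actual data points,
    which is what makes the method robust to outliers."""
    cost = lambda meds: sum(min(dist(p, m) for m in meds) for p in points)
    medoids = list(points[:k])                 # naive build phase
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for p in points:
                if p in medoids:
                    continue
                trial = medoids[:i] + [p] + medoids[i + 1:]
                if cost(trial) < cost(medoids):
                    medoids, improved = trial, True
    return medoids
```

Unlike a k-means centroid, each returned medoid is a real observation, so an outlier cannot drag a cluster representative away from the data.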
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING - cscpconf
Data clustering is a process of arranging similar data into groups. A clustering algorithm
partitions a data set into several groups such that the similarity within a group is greater than
that among groups. In this paper a hybrid clustering algorithm based on K-means and K-harmonic
means (KHM) is described. The proposed algorithm is tested on five different datasets. The research focuses on fast and accurate clustering. Its performance is compared with the traditional K-means and KHM algorithms, and the results obtained from the proposed hybrid algorithm are much better than those of the traditional K-means and KHM algorithms.
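The key difference between the two algorithms being hybridized shows up in their objective functions. The sketch below computes the K-harmonic-means performance function, whose soft harmonic average (unlike the hard minimum in k-means) is what reduces sensitivity to the initial centers; the parameter defaults are illustrative.

```python
def khm_objective(points, centers, p=2.0, eps=1e-12):
    """KHM performance function: sum over points of the harmonic average
    of distances to all centers, KHM(X, C) = sum_i k / sum_j d_ij^(-p).
    Every center influences every point, so a bad initial center still
    receives a gradient pulling it toward the data."""
    k = len(centers)
    total = 0.0
    for x in points:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        total += k / sum(1.0 / max(d, eps) ** p for d in dists)
    return total
```

A hybrid scheme can run k-means for speed while scoring candidate center sets with this softer objective.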
An H-K clustering algorithm for high dimensional data using ensemble learning - ijitcs
Advances made to the traditional clustering algorithms solve various problems such as the curse of dimensionality and the sparsity of data across multiple attributes. The traditional H-K clustering algorithm can resolve the randomness and a priori choice of the initial centers in the K-means clustering algorithm, but when applied to high dimensional data it runs into the dimensional-disaster problem because of its high computational complexity. Advanced clustering algorithms such as subspace and ensemble clustering improve the performance of clustering high dimension datasets from different aspects and to different extents, yet each of these algorithms improves performance from a single perspective only. The objective of the proposed model is to improve the performance of traditional H-K clustering and overcome its limitations, such as high computational complexity and poor accuracy for high dimensional data, by combining three different approaches: a subspace clustering algorithm and an ensemble clustering algorithm together with the H-K clustering algorithm.
Extended PSO algorithm for improvement problems of the k-means clustering algorithm - IJMIT JOURNAL
Clustering is an unsupervised process and one of the most common data mining techniques. The purpose of clustering is to group similar data together, so that instances within a cluster are as similar to each other, and as different from instances in other clusters, as possible. In this paper we focus on partitional k-means clustering because of its ease of implementation and high-speed performance on large data sets; after 30 years it is still very popular among the clustering algorithms developed since. To address the problem of the k-means algorithm becoming trapped in local optima, we propose an extended PSO algorithm, named ECPSO. Our new algorithm is able to escape from local optima and produces the problem's optimal answer with a high success rate. The examination of the results shows that the proposed algorithm performs better than other clustering algorithms, especially on two indices: the accuracy of clustering and the quality of clustering.
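For reference, plain k-means (Lloyd's algorithm), whose local-optimum weakness the proposed ECPSO targets, can be sketched as follows. The `init` parameter is added here so the dependence of the local optimum on the starting centers is explicit; it is not part of any particular paper's formulation.

```python
import random

def kmeans(points, k, init=None, iters=100, seed=0):
    """Lloyd's k-means: assign each point to its nearest center, move each
    center to the mean of its cluster, repeat. Converges quickly, but only
    to a local optimum that depends on the initial centers."""
    d2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    centers = list(init) if init else random.Random(seed).sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: d2(p, centers[j]))].append(p)
        new = [tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centers[j]
               for j, cl in enumerate(clusters)]
        if new == centers:                      # converged (locally)
            break
        centers = new
    return centers
```

PSO-style hybrids keep a swarm of such center sets and let the swarm's global best pull poorly initialized runs out of bad basins, rather than trusting a single random start.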
In the recent machine learning community there is a trend of constructing a nonlinear version of a linear algorithm through the 'kernel method', as in kernel principal component analysis, kernel Fisher discriminant analysis, Support Vector Machines (SVMs), and the recent kernel clustering algorithms. Typically, in unsupervised clustering algorithms that use the kernel method, a nonlinear mapping is applied first to map the data into a much higher-dimensional feature space, and clustering is then performed there. A drawback of these kernel clustering algorithms is that the cluster prototypes reside in the high-dimensional feature space and therefore lack clear, intuitive descriptions unless an additional approximate projection from the feature space back to the data space is used, as has been done in the existing literature. In this paper, using the 'kernel method', a novel clustering algorithm based on the conventional fuzzy c-means algorithm (FCM) is proposed, called the kernel fuzzy c-means algorithm (KFCM). This method adopts a new kernel-induced metric in the data space to replace the original Euclidean norm metric, so the cluster prototypes still reside in the data space and the clustering results can be interpreted in the original space. This property is exploited for clustering incomplete data. Experiments on synthetic data illustrate that KFCM has better clustering performance and is more robust than other variants of FCM for clustering incomplete data.
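The membership update that distinguishes KFCM from plain FCM can be sketched as below. A Gaussian kernel is assumed, as is common for this family; `sigma` and the fuzzifier `m` are illustrative defaults rather than values from the paper.

```python
import math

def kfcm_memberships(points, centers, sigma=1.0, m=2.0):
    """FCM membership update with the kernel-induced distance
    d^2(x, v) = 2 * (1 - K(x, v)) for a Gaussian kernel K. The prototypes
    stay in the input space, so the partition remains interpretable there."""
    K = lambda a, b: math.exp(-sum((x - y) ** 2 for x, y in zip(a, b))
                              / (2 * sigma ** 2))
    U = []
    for x in points:
        d = [max(2.0 * (1.0 - K(x, v)), 1e-12) for v in centers]
        U.append([1.0 / sum((d[j] / d[l]) ** (1.0 / (m - 1)) for l in range(len(d)))
                  for j in range(len(d))])
    return U
```

Swapping the Euclidean distance for the kernel-induced one is the entire change to the membership step; the rest of the FCM alternation is unchanged.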
Performance Comparison of Machine Learning Algorithms - Dinusha Dilanka
In this paper we compare the performance of two classification algorithms. It is useful to differentiate algorithms based on computational performance rather than classification accuracy alone: although classification accuracy between the algorithms is similar, computational performance can differ significantly and affect the final results. The objective of this paper is therefore to perform a comparative analysis of two machine learning algorithms, K-Nearest Neighbor classification and Logistic Regression. We consider a large dataset of 7981 data points and 112 features and examine the performance of the two algorithms on it. The processing time and accuracy of the different machine learning techniques are estimated on the collected dataset, using 60% of the data for training and the remaining 40% for testing. The paper is organized as follows. Section I contains the introduction and background analysis of the research, and Section II the problem statement. Section III briefly describes our application, the data analysis process, the testing environment, and the methodology of our analysis. Section IV comprises the results of the two algorithms. Finally, the paper concludes with a discussion of future research directions that would eliminate the problems in the current research methodology.
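The computational-performance point above is easy to see in code: k-NN has no training phase at all, but every prediction scans the whole training set, whereas logistic regression pays its cost up front in training and predicts in constant time. A minimal k-NN sketch (toy data, not the paper's 7981-point dataset):

```python
def knn_predict(train, query, k=3):
    """k-Nearest-Neighbor classification: find the k training points
    closest to the query and return the majority label. Training is free;
    prediction is O(n) per query, which is where the runtime cost lives."""
    near = sorted(train,
                  key=lambda xy: sum((a - b) ** 2 for a, b in zip(xy[0], query)))[:k]
    votes = [label for _, label in near]
    return max(set(votes), key=votes.count)
```

On 112-dimensional data with thousands of stored points, this per-query scan is exactly the cost that makes the timing comparison interesting.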
Comparison Between Clustering Algorithms for Microarray Data Analysis - IOSR Journals
Currently, there are two techniques used for large-scale gene-expression profiling: microarray and RNA-Sequencing (RNA-Seq). This paper studies and compares different clustering algorithms used in microarray data analysis. A microarray is an array of DNA molecules that allows multiple hybridization experiments to be carried out simultaneously and traces the expression levels of thousands of genes. It is a high-throughput technology for gene expression analysis and has become an effective tool for biomedical research. Microarray analysis aims to interpret the data produced from experiments on DNA, RNA, and protein microarrays, which enable researchers to investigate the expression state of a large number of genes. Data clustering is the first and main process in microarray data analysis. The k-means, fuzzy c-means, self-organizing map, and hierarchical clustering algorithms are investigated in this paper and compared based on their clustering models.
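Of the four algorithms under comparison, hierarchical clustering is the only one that needs no preset centers, which matters when the number of gene-expression groups is unknown. A minimal agglomerative single-linkage sketch (1-D toy data for brevity; real gene-expression rows would be vectors):

```python
def single_linkage(points, k, dist):
    """Agglomerative hierarchical clustering with single linkage: start
    with every point in its own cluster and repeatedly merge the two
    clusters whose closest members are nearest, until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)         # merge the closest pair
    return clusters
```

Stopping the merging at different depths yields the dendrogram levels that make this model attractive for exploring gene groupings.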
Survey on Efficient Techniques of Text Mining - vivatechijri
In the current era, with the advancement of technology, more and more data is available in digital
form. Among which, most of the data (approx. 85%) is in unstructured textual form. So it has become essential to
develop better techniques and algorithms to extract useful and interesting information from this large amount of
textual data. Text mining is the process of extracting useful data from unstructured text. The algorithms used for text mining each have advantages and disadvantages. Moreover, the issues in the field of text mining that affect the accuracy and relevance of the results are identified.
An Efficient Clustering Method for Aggregation on Data Fragments - IJMER
Clustering is an important step in the process of data analysis, with applications in numerous fields. Clustering ensembles have emerged as a powerful technique for combining different clustering results to obtain a quality clustering. Existing clustering aggregation algorithms are applied directly to a large number of data points and become inefficient as that number grows. This project defines an efficient approach to clustering aggregation based on data fragments, where a data fragment is any subset of the data. To increase the efficiency of the proposed approach, clustering aggregation is performed directly on the data fragments under a comparison measure and normalized mutual information measures, and enhanced versions of existing clustering aggregation algorithms (Agglomerative, Furthest, and Local Search) are described that achieve minimal computational complexity while increasing accuracy.
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING - ijcsa
Text Document Clustering is one of the fastest growing research areas because of the availability of a huge amount of information in electronic form. A number of techniques have been proposed for clustering documents in such a way that documents within a cluster have high intra-similarity and low inter-similarity to other clusters. Many document clustering algorithms provide a localized search for effectively navigating, summarizing, and organizing information, whereas a globally optimal solution can be obtained by applying high-speed, high-quality optimization algorithms that search the entire solution space. In this paper, a brief survey of optimization approaches to text document clustering is presented.
The International Journal of Engineering and Science (The IJES) - theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT... - IAEME Publication
Close range photogrammetry network design refers to the process of placing a set of cameras so as to achieve photogrammetric tasks. The main objective of this paper is to find the best locations for two or three camera stations. Genetic algorithm optimization and Particle Swarm Optimization are developed to determine the optimal camera stations for computing the three-dimensional coordinates. In this research, a mathematical model representing genetic algorithm optimization and Particle Swarm Optimization for the close range photogrammetry network is developed. The paper also gives the sequence of field operations and computational steps for this task. A test field is included to reinforce the theoretical aspects.
Similar to Extended PSO algorithm for improvement problems of the k-means clustering algorithm
MULTIMODAL COURSE DESIGN AND IMPLEMENTATION USING LEML AND LMS FOR INSTRUCTIO... - IJMIT JOURNAL
Traditionally, teaching has been centered around classroom delivery. However, the onslaught of the
COVID-19 pandemic has cultivated usage of technology, teaching, and learning methodologies for course
delivery. We investigate and describe different modes of course delivery that maintain the integrity of
teaching and learning. This paper answers the following research questions: 1) Which course delivery methods do our academic institutions use, and why? 2) How can instructors validate the guidelines of the institutions? 3) How should courses be taught to deliver student learning outcomes? Using the Learning Environment
Modeling Language (LEML), we investigate the design and implementation of courses for delivery in the
following environments: face-to-face, online synchronous, asynchronous, hybrid, and hyflex. A good
course design and implementation are key components of instructional alignment. Furthermore, we
demonstrate how to design, implement, and deliver courses in synchronous, asynchronous, and hybrid
modes and describe our proposed enhancements to LEML.
International Journal of Managing Information Technology (IJMIT) ** WJCI IndexedIJMIT JOURNAL
The International Journal of Managing Information Technology (IJMIT) is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of the strategic application of information technology (IT) in organizations. The journal focuses on innovative ideas and best practices in using IT to advance organizations – for-profit, non-profit, and governmental. The goal of this journal is to bring together researchers and practitioners from academia, government, and industry to focus on understanding both how to use IT to support the strategy and goals of the organization and to employ IT in new ways to foster greater collaboration, communication, and information sharing both within the organization and with its stakeholders. The International Journal of Managing Information Technology seeks to establish new collaborations, new best practices, and new theories in these areas.
NOVEL R & D CAPABILITIES AS A RESPONSE TO ESG RISKS- LESSONS FROM AMAZON’S FU... - IJMIT JOURNAL
Environmental, Social, and Governance (ESG) management is essential for transforming corporate
financial performance-oriented business strategies into Finance (F) + ESG optimization strategies to
achieve the Sustainable Development Goals (SDGs).
In this trend, the rise of ESG risks has divided firms into two categories. The former incorporates a growth mindset that creates a passion for learning and urges the firm to improve itself by undertaking Research and Development (R&D)-driven challenges, while the latter, characterized by risk aversion, avoids challenging, highly uncertain R&D activities and seeks more manageable endeavors.
This duality underscores the complexity of corporate R&D strategies in addressing ESG risks and
necessitates the development of novel R&D capabilities for corporate R&D transformation strategies
towards F + ESG optimization.
Building on this premise, this paper conducts an empirical analysis utilizing reliable firm data on ESG risk and brand value, with a focus on 100 global R&D leader firms. It analyzes R&D and actions for ESG risk mitigation, and assesses the development of new functions that fulfill F + ESG optimization through R&D. The analysis also highlights the significance of network externality effects, with a specific focus on Amazon, a leading R&D company, providing insights into the direction for transforming R&D strategies towards F + ESG optimization.
The dynamics of stakeholder engagement in F + ESG optimization are illustrated with the example of Amazon's activities. Through the analysis, it became evident that Amazon's capacity for growth and scalability is accelerating high-level research and development by gaining the trust of stakeholders in the "synergy through R&D-driven ESG risk mitigation."
Finally, as examples of these initiatives, the paper discusses the Climate Pledge led by Amazon and the transformation of Japan's management system.
A REVIEW OF STOCK TREND PREDICTION WITH COMBINATION OF EFFECTIVE MULTI TECHNI... - IJMIT JOURNAL
It is important for investors to understand stock trends and market conditions before trading stocks. Both
these capabilities are very important for an investor in order to obtain maximized profit and minimized
losses. Without this capability, investors will suffer losses due to their ignorance regarding stock trends
and market conditions. Technical analysis helps to understand stock prices behavior with regards to past
trends, the signals given by indicators, and the major turning points of the market price. This paper reviews stock trend prediction using a combination of effective multiple technical indicators to increase investment performance, taking global performance into account, and presents the proposed combined multi-technical-indicator strategy model.
INTRUSION DETECTION SYSTEM USING CUSTOMIZED RULES FOR SNORT - IJMIT JOURNAL
These days the security provided by the computer systems is a big issue as it always has the threats of
cyber-attacks like IP address spoofing, Denial of Service (DOS), token impersonation, etc. The security
provided by the blue team operations tends to be costly if done in large firms as a large number of systems
need to be protected against these attacks. This leads these firms to turn to less costly security
configurations like IDS Suricata and IDS Snort. The main theme of the project is to improve the services provided by Snort, a tool used to create a defense against cyber-attacks such as DDoS attacks, which are carried out on both the physical and network layers. These attacks in turn result in the loss of
extremely important data. The rules defined in this project will result in monitoring traffic, analyzing it,
and taking appropriate action to not only stop the attack but also locate its source IP address. This whole
process uses different tools other than Snort like Wireshark, Wazuh and Splunk. The product of this will
result in not only the detection of the attack but also the source IP address of the machine on which the
attack is initiated and completed. The end product of this research will result in sets of default rules for the
Snort tool which will not only be able to provide better security than its previous versions but also be able
to provide the user with the IP address of the attacker or the person conducting the attack. The system
involves the integration of Wazuh with Snort tool in order to make it more efficient than IDS Suricata
which is another intrusion detection system capable of detecting all these types of attacks as mentioned.
Splunk is another tool used in this project which increases the firewall efficiency to pass the no. of bits to
be scanned and the no. of bits scanned successfully. Wazuh is used in this system as it is the best choice for
traffic monitoring and incident response than any other of its alternatives in the market. Since this system
is used in firms which are known to handle big amounts of data and for this purpose, we use Splunk tool as
it is very efficient in handling big amounts of data. Wireshark is used in this system in order to give the IDS
automation in its capability to capture and report the malicious packets found during the network scan. All
of this gives the IDS the capability of a low-budget automated threat detection system.
MEDIATING AND MODERATING FACTORS AFFECTING READINESS TO IOT APPLICATIONS: THE... - IJMIT JOURNAL
Although IOT seems to be the upcoming trend, it is still in its infancy; especially in the banking industry.
There is a clear gap in literature, as only few studies identify factors affecting readiness to IOT
applications in banks in general, and almost negligible investigations on mediating and moderating
factors. Accordingly, this research aims to investigate the main factors that affect employees’ readiness to
IOT applications, while highlighting the mediating and moderating factors in the Egyptian banking sector.
The importance of Egypt stems from its high population and steady steps taken towards technology
adoption. 479 valid questionnaires were distributed over HR employees in banks. Data collected was
statistically analysed using Regression and SEM. Results showed a significant impact of ‘Security’,
‘Networking’, ‘Software Development’ and ‘Regulations’ on readiness to IOT applications; thus, the readiness acceptance level is high. ‘Security’ and ‘User Intention’ were proven to mediate the relationship
between research variables and readiness to IOT applications, and only a partial moderation role was
proven for ‘Efficiency’. The study contributes to increasing literature on IOT applications in general, and
fills a gap on the Egyptian banking context in particular. Finally, it provides decision makers at banks with
useful guidelines on how to optimally promote IOT applications among employees.
EFFECTIVELY CONNECT ACQUIRED TECHNOLOGY TO INNOVATION OVER A LONG PERIOD - IJMIT JOURNAL
ICT (Information and Communication Technology) companies are facing the dilemma of decreasing productivity despite increasing research and development efforts. M&A (Mergers and Acquisitions) is being considered as a breakthrough solution. Existing research has pointed out that M&A leads to the emergence of new innovations. The purpose of this study was to discuss efficient ways of acquisition and to resolve the dilemma of productivity decline by clarifying how the technology obtained through M&A
leads to the creation of new innovations. Hypothesis 1 was that the technology acquired through M&A is
utilized for innovation creation, Hypothesis 2 was that the acquired technology is utilized over a long
period of time, and Hypothesis 3 was that a long-term utilization has a positive impact on corporate
performance. The results, using sports prosthetics as a case study and using patents as a proxy variable,
confirmed all the hypotheses set. We have revealed that long-term utilization of technology obtained
through M&A is effective for creating new innovations.
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023) - IJMIT JOURNAL
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023) will act as a major forum for the presentation of innovative ideas, approaches, developments, and research projects in the areas of Cloud, Big Data and IoT. It will also serve to facilitate the exchange of information between researchers and industry professionals to discuss the latest issues and advancement in the area of Cloud, Big Data and IoT.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in Cloud, Big Data and IoT.
TRANSFORMING SERVICE OPERATIONS WITH AI: A CASE FOR BUSINESS VALUE - IJMIT JOURNAL
Artificial Intelligence (AI) has rapidly become a critical technology for businesses seeking to improve
efficiency and profitability. One area where AI is proving particularly impactful is in service operations
management, where it is used to create AI-powered service operations (AIServiceOps) that deliver high-value services to customers. AIServiceOps involves the use of AI to automate and optimize various business
processes, such as customer service, sales, marketing, and supply chain management. The rapid
development of Artificial Intelligence has prompted many changes in the field of Information Technology
(IT) Service Operations. IT Service Operations are driven by AI, i.e., AIServiceOps. AI has empowered
new vitality and addressed many challenges in IT Service Operations. However, there is a literature gap on
the Business Value Impact of Artificial intelligence (AI) Powered IT Service Operations. It can help IT
build optimized business resilience by creating value in complex and ever-changing environments as
product organizations move faster than IT can handle. So, this research paper examines how AIServiceOps creates business value and sustainability: essentially, how AIServiceOps liberates IT staff from low-level, repetitive work and traditional IT practices in favor of a continuously optimized process. One of the research objectives is to compare Traditional IT Service Operations with AIServiceOps. This paper
provides the basis for how enterprises can evaluate AIServiceOps and consider it a digital transformation
tool. The paper presents a case study of a company that implemented AI-powered service operations
(AIServiceOps) and analyzes the resulting business outcomes. The study shows that AIServiceOps can
significantly improve service delivery, reduce response times, and increase customer satisfaction.
Furthermore, it demonstrates how AIServiceOps can deliver substantial cost savings, such as reducing
labor costs and minimizing downtime.
DESIGNING A FRAMEWORK FOR ENHANCING THE ONLINE KNOWLEDGE-SHARING BEHAVIOR OF ... (IJMIT JOURNAL)
The main objective of this paper is to identify the factors that influence academic staff's digital knowledge-sharing behaviors in Ethiopian higher education. A structural equation model was used to validate the
research framework using survey data from 210 respondents. The collected data has been analyzed using
Smart PLS software. The results of the study show that trust, self-motivation, and altruism are positively
related to attitude. Contrary to our expectations, knowledge technology negatively affects attitude.
However, reward systems and empowerment by leaders are significantly associated with knowledge-sharing intentions. Knowledge-sharing intention, in turn, was significantly related to digital knowledge-sharing behavior. The contributions of this study are twofold. The framework may serve as a roadmap for
future researchers and managers considering their strategy to enhance digital knowledge sharing in HEI.
The findings will benefit academic staff and university administrations. The study will also help academic
staff enhance their knowledge-sharing practices.
BUILDING RELIABLE CLOUD SYSTEMS THROUGH CHAOS ENGINEERING (IJMIT JOURNAL)
Cloud computing systems need to be reliable so that they can be accessed and used for computing at any
given point in time. The complex nature of cloud systems is the motivation to conduct research in novel
ways of ensuring that cloud systems are built with reliability in mind. In building cloud systems, it is
expected that the cloud system will be able to deal with high demands and unexpected events that affect the
reliability and performance of the system.
In this paper, chaos engineering is considered a heuristic method that can be used to build reliable cloud
systems. Chaos engineering is aimed at exposing weaknesses in systems that are in production. Chaos
engineering will help identify system weaknesses and strengths when a system is exposed to unexpected
knocks and shocks while it is in production.
Chaos engineering allows system developers and administrators to get insights into how the cloud system
will behave when it is exposed to unexpected occurrences.
A REVIEW OF STOCK TREND PREDICTION WITH COMBINATION OF EFFECTIVE MULTI TECHNI... (IJMIT JOURNAL)
It is important for investors to understand stock trends and market conditions before trading stocks. Both
these capabilities are very important for an investor in order to obtain maximized profit and minimized
losses. Without this capability, investors will suffer losses due to their ignorance regarding stock trends
and market conditions. Technical analysis helps to understand stock price behavior with regard to past
trends, the signals given by indicators, and the major turning points of the market price. This paper reviews
stock trend prediction with a combination of effective multi-technical-indicator strategies to increase
investment performance, taking into account the global performance of the proposed combined
multi-technical-indicator strategy model.
NETWORK MEDIA ATTENTION AND GREEN TECHNOLOGY INNOVATION (IJMIT JOURNAL)
This paper provides a novel empirical study of the relationship between network media attention and
green technology innovation and examines how network media attention can ease financing constraints. It
collected data from listed companies in China's heavy pollution industry and performed rigorous
regression analysis, in order to innovatively explore the environmental governance functions of the media.
It found that network media attention significantly promotes green technology innovation. By analyzing the
inner mechanism further, it found that network media attention can promote green innovation by easing
financing constraints. Besides, network media attention has a significant positive impact on green invention
patents while not affecting green utility model patents.
INCLUSIVE ENTREPRENEURSHIP IN HANDLING COMPETING INSTITUTIONAL LOGICS FOR DHI... (IJMIT JOURNAL)
Information System (IS) research advocates employing collaborative and loose-coupling strategies, rather than prescriptive and unilateral Information Technology (IT) governance mechanisms, to address contradictory issues among diversified actors' interests; yet it rarely depicts how managers employ these strategies in Health Information System (HIS) implementation, particularly in resource-constrained settings where IS implementation activities rely heavily on the resources of multiple international organizations. This study explored how managers in a resource-constrained setting employ collaborative IT governance mechanisms in the case of District Health Information System 2 (DHIS2) adoption, using an interpretive case study approach and the concept of institutional logics. The institutional-logics concept was used to identify the major actors' logics underpinning the DHIS2 adoption. The study depicted the importance of high-level officials' distance from the dominant systemic logic in considering new alternatives, and of employing inclusive IT governance mechanisms that separated resources from the system, which facilitated stakeholders' collaboration in DHIS2 adoption based on their capacity and interest.
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM (IJMIT JOURNAL)
With an increasing number of extreme events and complexity, more alarms are being used to monitor
control rooms. Operators in the control rooms need to monitor and analyze these alarms to take suitable
actions to ensure the system's stability and security. Security is the biggest concern in the modern world. It
is important to have rigid surveillance that guarantees protection from any sort of hazard.
Considering security, Closed-Circuit TV (CCTV) cameras are being utilized for surveillance, but these
CCTV cameras require a person for supervision. A human being can tire of supervising at any point in
time, so we need a system that detects automatically. Thus, we came up with a solution using YOLO v5.
We took a data set and used the Roboflow framework to augment the existing images into numerous
variations, creating a grayscale copy, a rotated copy, and a blurred copy of each image to obtain an
enlarged data set. This work mainly focuses on providing a secure environment using CCTV live footage
as a source for detecting weapons. The YOLO algorithm divides an image from the video into a grid
system, and each grid detects objects within itself.
MULTIMODAL COURSE DESIGN AND IMPLEMENTATION USING LEML AND LMS FOR INSTRUCTIO... (IJMIT JOURNAL)
Traditionally, teaching has been centered around classroom delivery. However, the onslaught of the
COVID-19 pandemic has cultivated usage of technology, teaching, and learning methodologies for course
delivery. We investigate and describe different modes of course delivery that maintain the integrity of
teaching and learning. This paper answers the following research questions: 1) Which course delivery
methods do our academic institutions use, and why? 2) How can instructors validate the guidelines of their
institutions? 3) How should courses be taught to provide the student learning outcomes? Using the Learning Environment
Modeling Language (LEML), we investigate the design and implementation of courses for delivery in the
following environments: face-to-face, online synchronous, asynchronous, hybrid, and hyflex. A good
course design and implementation are key components of instructional alignment. Furthermore, we
demonstrate how to design, implement, and deliver courses in synchronous, asynchronous, and hybrid
modes and describe our proposed enhancements to LEML.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams, from the hydrologist's survey of the valley before construction through all of the involved disciplines (fluid dynamics, structural engineering, generation, and mains-frequency regulation) to the transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co-editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx (R&R Consult)
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Hierarchical Digital Twin of a Naval Power System (Kerry Sado)
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf (fxintegritypublishin)
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Immunizing Image Classifiers Against Localized Adversary Attacks (gerogepatton)
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two types of water scarcity: physical water scarcity and economic water scarcity.
Extended PSO algorithm for improvement problems of the k-means clustering algorithm
International Journal of Managing Information Technology (IJMIT) Vol.6, No.3, August 2014
DOI: 10.5121/ijmit.2014.6302
EXTENDED PSO ALGORITHM FOR
IMPROVEMENT PROBLEMS K-MEANS
CLUSTERING ALGORITHM
Maryam Lashkari and Amin Rostami
Department of Computer Engineering, Ferdows Branch, Islamic Azad University, Ferdows, Iran.
ABSTRACT
Clustering is an unsupervised process and one of the most common data mining techniques. The purpose of
clustering is to group similar data together, so that instances within a cluster are as similar as possible to
each other and as different as possible from instances in other clusters. In this paper we focus on partitional
k-means clustering, which, owing to its ease of implementation and high-speed performance on large data
sets, is still very popular among clustering algorithms after 30 years. To improve the problem of the
k-means algorithm settling in local optima, we pose an extended PSO algorithm, named ECPSO. Our new
algorithm is able to escape from local optima and, with a high success rate, produces the problem's optimal
answer. The experimental results show that the proposed algorithm has better performance than other
clustering algorithms, especially on two indexes: the accuracy of the clustering and the quality of the
clustering.
KEYWORDS
Clustering, Data Mining, Extended chaotic particle swarm optimization, K-means algorithm.
1. INTRODUCTION
Nowadays the use of data mining is visible in most sciences, and it is obvious that if a suitable foundation
for the use of this science is not prepared, we will fall behind the progress being achieved. Clustering is
one of the most common data mining tools, used in many areas such as engineering, data mining, medical
science, and social science. Given clustering's many applications, clustering and data mining are necessary
in most fields for further progress. The idea of clustering was first presented in 1935, and today, because
of major advances, many researchers pay attention to it. Clustering is the process of grouping a collection
of unlabeled data so that members within a cluster are as similar as possible to each other and as dissimilar
as possible to the members of other clusters. Thus a clustering is most ideal when intra-cluster similarity is
maximal and inter-cluster similarity is minimal. There are several criteria, such as Euclidean distance and
Hamming distance, for determining the degree of similarity between samples, and each criterion is more
common in particular fields.
The objective function is non-convex and non-linear in most clustering problems [1], so it is possible for
an algorithm to fall into the trap of a local optimum instead of producing the problem's optimal answer.
There are several clustering algorithms, grouped into the following kinds: hierarchical, partitional,
density-based, model-based, and graph-based, each of which is more effective than the others in particular
data environments. In all of these algorithms, researchers try to balance, control, or improve parameters to
make the algorithm more effective with respect to:
- high scalability, - the ability to work with high-dimensional data, - the ability to cluster dynamic data,
- the ability to work with a wide range of problem distances, - minimal need for additional knowledge
about the problem, - suitable handling of noise, and - interpretable clusters.
Partitional clustering is one of the most common and most widely applied families of clustering
algorithms. It divides a data collection into a specified number of partitions, so that the samples in every
partition are as similar as possible to each other and as different as possible from the samples in other
clusters. The k-means algorithm is the most famous clustering algorithm in this field [6], and one of the
favorite center-based clustering algorithms. K-means starts by initializing the cluster centers; each
remaining item is then allocated, according to the Euclidean distance criterion, to the cluster whose center
is nearest. In every iteration of the algorithm, two main phases are performed. First, every item in the data
collection is allocated to the cluster whose center is at the least distance. Then, after the points have been
grouped into k clusters, the new cluster centers are calculated as the average of the samples of each
cluster, and the algorithm repeats. The algorithm terminates when there is no change in the calculated
cluster centers or when a specified number of iterations is reached [9]. The objective function of this
algorithm is the sum of squared errors, which the algorithm tries to minimize, as shown in equation (1),
where x ranges over the samples X and C is the set of cluster centers.
φ = Σ_{x∈X} min_{c∈C} ‖x − c‖²                                               (1)
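The two phases described above (nearest-center assignment, then mean update) can be sketched in a few lines. This is an illustrative implementation, not the authors' code; the function and parameter names are assumptions:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means: assign points to the nearest center, then recompute
    each center as the mean of its assigned points (equation (1))."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Phase 1: allocate each sample to the closest center (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Phase 2: recompute each center as the mean of its cluster
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # stop when centers no longer move
            break
        centers = new_centers
    sse = np.sum((X - centers[labels]) ** 2)  # objective of equation (1)
    return centers, labels, sse
```

On well-separated data the loop typically stops well before the iteration cap, once the centers no longer change between iterations.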
Advantages of the k-means algorithm:
Ease of implementation, high-speed performance, and scalability and efficiency on large data collections.
Disadvantages and problems of the k-means algorithm:
1- The initial cluster centers and the number of clusters are selected by the user, so the clustering
result depends on this initial selection; if the initial conditions are not suitable, it is possible for
the algorithm to fall into the trap of a local optimum.
2- Selecting the optimal number of clusters for a problem is difficult.
3- Because the cluster centers are computed as the average of the cluster samples, the algorithm
handles noise and outlier data poorly.
4- The algorithm cannot be used on data collections for which computing an average is not
definable.
5- It cannot cluster data with different shapes and densities.
In the following we describe techniques for improving these problems; such techniques usually
focus on three issues:
1- Determining how the initial parameters are selected.
2- Altering the basic algorithm.
3- Combining the clustering algorithm with other heuristic algorithms.
We then pose a new solution for the problem of the k-means algorithm's result falling into a local
optimum, and evaluate its validity using three real data collections and several indexes; finally, we give a
brief comparison of the posed techniques. The rest of the article is organized as follows:
1. Related works.
2. Analysis and comparison of the posed algorithms.
3. The proposed method.
4. Simulation.
5. Conclusion.
2. RELATED WORKS
K-medoids clustering algorithm:
This algorithm [1,3] was posed to resolve the k-means algorithm's weak handling of noise, and also to
work in cases where computing an average of the data collection is not definable. The idea posed in this
algorithm is to take the most central sample in every cluster as the cluster's center, rather than selecting
the average of the cluster's data.
Disadvantage: the algorithm's temporal complexity is high, so it is not suitable or efficient for large data
collections. The result of clustering is sensitive to the initial conditions of the algorithm, and determining
the optimal k for a problem is difficult.
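The medoid step described above, taking the most central sample rather than the mean as a cluster's representative, can be sketched as follows (illustrative names, not the authors' code):

```python
import numpy as np

def medoid(cluster):
    """Return the sample of `cluster` with the minimum total Euclidean
    distance to all other samples, i.e. the most central member."""
    # Pairwise distance matrix between all samples in the cluster
    d = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=2)
    return cluster[d.sum(axis=1).argmin()]
```

Because the representative is always an actual sample, this step remains meaningful for data where an average is not definable, at the cost of computing pairwise distances within each cluster.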
CLARA clustering algorithm:
The CLARA algorithm [2] was posed to solve the problem of the k-medoids algorithm on large data sets,
that is, its high temporal complexity. This algorithm solves the temporal-complexity problem of
k-medoids on large data collections, but a problem remains. Suppose n is the total number of samples and
m is the largest number of samples that this clustering method can process in reasonable time. If n >> m,
clustering from several small samples of the data often causes some of the data belonging to the same
groups to be eliminated.
K-modes clustering algorithm:
The k-means algorithm is not suitable for clustering nominal data; for this reason a generalization of
k-means, the k-modes algorithm, was posed. In this algorithm [3], rather than computing an average, the
mode of every cluster is used as the cluster's centroid. The initial parameters are, as in k-means, selected
randomly, so alongside its advantage of being suitable for nominal data, it is possible for the algorithm's
result to fall into a local optimum; it is also suitable only for nominal data and is not efficient for
numerical data.
Particle swarm optimization clustering algorithm:
As said before, one of the problems of the k-means algorithm is that its result falls into the trap of a local
optimum, because the algorithm searches only locally in the problem region. A clustering algorithm based
on the PSO algorithm was posed to eliminate this problem [4]. PSO-based clustering performs better than
the k-means algorithm on data collections of low dimensionality, and it is more likely to reach the global
optimal answer than k-means because it searches the whole problem region. However, using the PSO
algorithm leads to many iterations and slow convergence on high-volume data. For this reason, the two
algorithms are often combined so that they complement each other and cover each other's weaknesses.
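In PSO-based clustering, a particle commonly encodes a full set of k candidate cluster centers, and its fitness is the clustering objective of equation (1). A minimal sketch under that assumption (the flat encoding and names are illustrative, not the authors' design):

```python
import numpy as np

def particle_fitness(particle, X, k):
    """Decode a flat particle vector into k candidate centers and score it
    with the sum-of-squared-errors clustering objective (equation (1))."""
    centers = particle.reshape(k, X.shape[1])
    # Distance from every sample to every candidate center
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.sum(dists.min(axis=1) ** 2)  # lower fitness is better
```

The swarm then moves particles through this (k x d)-dimensional space, so a particle that lands on good centers attracts the rest of the swarm toward them.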
Chaotic particle swarm optimization clustering algorithm:
Two main problems of clustering with the PSO method are convergence to local optima and slow
convergence speed, which the chaotic variant tries to solve using two ideas: chaos theory and an
acceleration strategy. The velocity-update formula in (2) relocates each particle to a new position using
the best answer found by that particle so far (pbest) and the best global solution so far (gbest). Here w is
the inertia coefficient weighting the particle's previous velocity, c1 weights the pull toward the particle's
own best position, and c2 weights the pull toward the best global position [5]:

v_id(t+1) = w · v_id(t) + c1 · r1 · (pbest_id − x_id(t)) + c2 · r2 · (gbest_d − x_id(t))    (2)

In (3), replacing the random numbers r1 and r2 with the chaotic value cr improves the PSO algorithm, as
given:

v_id(t+1) = w · v_id(t) + c1 · cr · (pbest_id − x_id(t)) + c2 · (1 − cr) · (gbest_d − x_id(t))    (3)

Cr(t+1) = k · Cr(t) · (1 − Cr(t))    (4)

In (4), the value Cr is created independently for each round and lies between 0 and 1; it substitutes for
both r1 and r2, and the parameter k is the number of predicted clusters. Using chaos theory in the
generation of the PSO population results in a more diverse algorithm.
Figure 1. Chaos map [5]
As can be seen in Figure 1, chaos theory is applied to achieve a more optimal particle swarm
optimization algorithm. As another change, an acceleration strategy is used to increase the rate of
convergence: in this mode only a number of the best members of the population move toward the target,
not the whole population, which increases the rate of convergence [5].
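The velocity update of equation (3) and the chaos map of equation (4) can be sketched as follows; this is an illustrative fragment, not the authors' implementation, and the names are assumptions:

```python
import numpy as np

def logistic_map(cr, k):
    """One step of the chaos map in equation (4): Cr(t+1) = k * Cr(t) * (1 - Cr(t)),
    used in place of the uniform random numbers of standard PSO."""
    return k * cr * (1.0 - cr)

def chaotic_velocity_update(v, x, pbest, gbest, w, c1, c2, cr):
    """Equation (3): PSO velocity update with the chaotic value cr
    substituted for r1 and (1 - cr) substituted for r2."""
    return w * v + c1 * cr * (pbest - x) + c2 * (1.0 - cr) * (gbest - x)
```

Each round draws a new cr from the map, so the pulls toward pbest and gbest stay coupled through cr and (1 − cr) rather than being two independent random draws.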
Genetic clustering algorithm:
In this algorithm [7], a genetic optimization algorithm is used for better data clustering, so as to escape the
trap of the local optimum in the k-means algorithm. Because evolutionary algorithms such as genetic
algorithms are able to search the answer space globally, using them for clustering decreases the
probability of the algorithm's answer falling into a local optimum, and finally produces a more optimal
answer for clustering.
Ant colony clustering algorithm:
The ant colony clustering algorithm [9] is a population-based heuristic algorithm used for solving
optimization problems such as clustering. It can produce optimal answers with higher speed, even for
clusters with complex shapes, than other heuristic algorithms. This algorithm 1) uses the k-means
algorithm for better data clustering and to reach the global optimal answer with higher probability, and
2) uses the ant colony algorithm for the data clustering process.
K-MICA compound clustering algorithm:
This algorithm [11] is a combination of the imperialist competitive algorithm with the k-means clustering
algorithm. After a primary population is randomly produced, the k-means algorithm is performed on the
available data with distinct numbers. The final cluster centers obtained are then considered as the primary
population, the imperialists, of the imperialist competitive algorithm; clustering is performed over the
data based on the extended imperialist competitive algorithm, and colonies are allocated to suitable
imperialists.
Four hybrid strategies for combining continuous ant colony optimization with the PSO
algorithm for use in the clustering process:
This work [12] posed four hybrid strategies for combining the ACOR and PSO algorithms. The
examinations show that using the hybrid strategies for clustering is considerably better than the
independent use of the k-means, PSO, or ACOR algorithms for the clustering process.
The four hybrid strategies used are:
1: Series combination of the two algorithms PSO and ACOR.
2: Parallel combination of the two algorithms PSO and ACOR.
3: Series combination of the two algorithms with one extended chart of pheromone particles.
4: Substitution of the global best between the two algorithms.
3. ANALYSIS AND COMPARISON OF ALGORITHMS
As seen above, we have posed and checked several strategies and algorithms for eliminating the problems
and challenges of the k-means clustering algorithm. Each of the discussed algorithms has advantages and
disadvantages: some were developed to eliminate the limitations of a previous algorithm, and others are
new strategies for solving the problems of k-means. The challenges of the k-means algorithm are:
1: Sensitivity to noisy data.
2: It is limited to numerical data.
3: The result of the algorithm depends on the initial conditions, and the algorithm can fall into a local
optimum.
4: Lack of suitable clustering for clusters with different shapes and densities.
In the following, Table 1 shows a comparison of the described algorithms from the point of view of
several important parameters. Empty cells indicate that the corresponding parameter is not significant for
that algorithm.
Algorithm Advantages Disadvantages Temporal
complexity
Suitable
for data
sets
Result of
algorithm
Sensitivity
to noise
Kind of
algorithm
search
K-
medidos
Better
manageme
nt of noises
and pert
data it’s
suitable for
data sets
which
evaluation
High temporal
complexity in
large data sets. It
is not suitable
for clusters with
different forms
and density.
Result of
algorithm is
O(k(n-k)2
) Numeric
al
Most
central
members
in each
cluster
Have
not
Local
6. International Journal of Managing Information Technology (IJMIT) Vol.6, No.3, August 2014
22
of average
isn’t
describable
in it.
related to first
algorithm
condition. And
there is high
probability for
placing result of
algorithm in
local optimal.
It’s hard to
determine
optimal (k) for
problem. Utility
of this algorithm
is lesser and it is
implementation
is more complex
rather than k-
means
algorithm.
CLARA This
algorithm
can solve
the
problem of
k-medidos
algorithm,
that is high
temporal
complexity
in large
data sets
and also is
suitable for
massive
International Journal of Managing Information Technology (IJMIT) Vol.6, No.3, August 2014

Algorithm: (continued)
Advantages: … data sets.
Disadvantages: Weak clustering performance; not suitable for clusters with different shapes and densities. The result depends on the initial conditions, with a high probability of ending in a local optimum; it is hard to determine the optimal k for a problem.
Complexity: O(k(40+k)^2 + k(n−k))
Data type: Numerical
Cluster representative: The most central members of each cluster
Optimum: Local

Algorithm: K-modes
Advantages: Suitable for clustering nominal data sets.
Disadvantages: Weak at clustering numerical data; not suitable for clusters with different shapes and densities. The result depends on the initial conditions, with a high probability of ending in a local optimum; it is hard to determine the optimal k for a problem.
Complexity: O(n)
Data type: Nominal
Cluster representative: The mode of each cluster
Optimum: Local

Algorithm: Clustering based on the PSO algorithm
Advantages: A higher probability than k-means of reaching the global optimum and escaping local optima, because it searches the whole problem space.
Disadvantages: PSO needs many iterations and converges slowly on high-volume data, so it suits low-volume data sets. The initial PSO population depends strongly on the problem parameters, which can leave the algorithm in a local optimum.
Complexity: -
Data type: Numerical
Cluster representative: Initial cluster centers for k-means
Optimum: Global

Algorithm: Global chaotic particle swarm optimization (CPSO) clustering
Advantages: Increases population diversity and convergence speed compared with PSO clustering, with a higher probability of reaching the global optimum than PSO clustering.
Disadvantages: Higher computational complexity than PSO.
Complexity: -
Data type: Numerical
Cluster representative: Initial cluster centers for k-means, or correction of the clusters formed by k-means
Optimum: Global

Algorithm: Clustering based on the GA algorithm
Advantages: Escapes the trap of local optima with high probability, with a chance of reaching the globally optimal clustering.
Disadvantages: Low convergence speed and increased computational complexity.
Complexity: -
Data type: Nominal
Cluster representative: Cluster centers
Optimum: Global

Algorithm: Clustering based on the ant colony algorithm
Advantages: Produces the optimal answer faster and for more complex cluster shapes than other algorithms; better data clustering and a higher probability of reaching the global optimum than k-means.
Disadvantages: The algorithm may still fall into a local-optimum trap and fail to produce the optimal answer, because the ants select items randomly and the number of iterations is limited.
Complexity: -
Data type: Numerical
Cluster representative: Optimal cluster centers
Optimum: Global

Algorithm: Compound clustering (PSO + ACO + K-means)
Advantages: Improves the selection of initial conditions for k-means; increases the convergence speed toward the global optimum, with a higher probability of approaching it than the other evolutionary algorithms.
Disadvantages: High computational complexity.
Complexity: -
Data type: Numerical
Cluster representative: Optimal cluster centers
Optimum: Global
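The "cluster representative" column above can be illustrated with a short sketch contrasting the three kinds of representative: the mean (k-means), the most central member (k-medoids), and the per-attribute mode (k-modes). The function names and the Euclidean distance choice are ours, for illustration only.

```python
import numpy as np

def mean_center(points):
    # k-means representative: the arithmetic mean of the cluster members
    return np.mean(points, axis=0)

def medoid(points):
    # k-medoids representative: the member minimizing total distance
    # to all other members of the cluster
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return points[np.argmin(d.sum(axis=1))]

def mode_center(points):
    # k-modes representative: the most frequent value of each attribute,
    # for nominal data encoded as non-negative integers
    return np.array([np.bincount(col).argmax() for col in points.T])
```
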
4. PROPOSED METHOD
4.1. The standard PSO algorithm and its problems
Particle swarm optimization (PSO) is a population-based stochastic search process, modeled after the social behavior of a bird flock. The algorithm maintains a population of particles, where each particle represents a potential solution to an optimization problem. In the context of PSO, a swarm refers to a number of potential solutions to the optimization problem, each of which is called a particle. The aim of PSO is to find the particle position that yields the best evaluation of a given fitness (objective) function. Each particle represents a position in Nd-dimensional space and is "flown" through this multi-dimensional search space, adjusting its position toward both the particle's best position found so far and the best position in its neighborhood. Each particle i maintains the following information:
x_i: the current position of the particle;
v_i: the current velocity of the particle;
y_i: the personal best position of the particle.
Using this notation, a particle's position is adjusted according to

v_i,j(t+1) = w·v_i,j(t) + c1·r1,j(t)·(y_i,j(t) − x_i,j(t)) + c2·r2,j(t)·(y^_j(t) − x_i,j(t))   (5)
x_i(t+1) = x_i(t) + v_i(t+1)   (6)

where w is the inertia weight, c1 and c2 are the acceleration constants, r1,j(t), r2,j(t) ~ U(0,1), and j = 1, …, Nd. The velocity is thus calculated from three contributions: a fraction of the previous velocity; the cognitive component, a function of the particle's distance from its personal best position; and the social component, a function of the particle's distance from the best particle found so far, y^ (i.e. the best of the personal bests).
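The two update rules can be sketched as follows, assuming NumPy arrays of shape (particles, dimensions); the default parameter values and all names are illustrative, not the paper's settings.

```python
import numpy as np

# Minimal sketch of the standard PSO update, equations (5) and (6).
def pso_step(x, v, pbest, gbest, w=0.72, c1=1.49, c2=1.49, rng=None):
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)  # per-dimension random factors r1_j(t)
    r2 = rng.random(x.shape)  # per-dimension random factors r2_j(t)
    # Equation (5): inertia + cognitive + social components
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    # Equation (6): move each particle by its new velocity
    x_new = x + v_new
    return x_new, v_new
```
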
An important issue with the standard PSO algorithm is its fast convergence rate, which can leave the result of the algorithm in a local optimum. Using more sources of information enlarges the search and the dispersion of the algorithm, and so alleviates this problem of PSO. Therefore, in the proposed algorithm, which we briefly call ECPSO (Extended Chaotic Particle Swarm Optimization), we try to increase the utility of PSO by changing the particle movement function. In this equation, the two random functions (rand1, rand2) are determined according to recently proposed strategies based on a chaotic map. Chaotic sequences have a hidden order, unlike purely random numbers, which are disordered, and this change improves the performance of PSO in clustering.
As the velocity equation (5) shows, in the original PSO the new particle velocity is calculated from the particle's personal best position and the global best position over all particles. Including the global best position in the particle velocity causes large jumps when particles move to new positions; it also speeds up the convergence of the PSO algorithm and increases the probability that the algorithm ends in a local optimum.
If this global best happens to be an incorrect, deviant position, the particles deviate strongly in their movement: one misled leader diverts the whole population, the algorithm falls into a local-optimum trap and cannot reach the globally optimal answer. Therefore, to solve this problem, our proposed approach keeps k global bests for the whole population, where k is chosen according to the population size. During the run, the k global bests are updated according to a uniform distribution, and the new particle velocity is calculated from their mean: the velocity depends on the difference between the average of the k global-best positions and the particle's position.
This moderates the particle movement, increases the probability of reaching the global optimum, and decreases the probability of the algorithm falling into a local-optimum trap. We also make c1 and c2 adaptive: they control, respectively, the tendency toward the particle's personal best position and toward the global bests, while w controls the tendency to keep the previous particle velocity. As the algorithm approaches its final iterations, these variables decrease.
The extended velocity equation is:

v_i(t+1) = w·v_i(t) + c1·c·(localbest_i − x_i) + c2·(1−c)·(mean(globalbest_1, globalbest_2, …, globalbest_k) − x_i)   (7)

where c is the coefficient generated by the chaotic map at the current iteration.
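Equation (7) can be sketched as below. The logistic map as the chaotic generator is an assumption for illustration (the paper does not fix the map here), and all function and parameter names are ours.

```python
import numpy as np

def chaotic_next(c, mu=4.0):
    # One step of the logistic map, a common chaotic sequence generator
    # (an assumption: the paper only says "chaos map").
    return mu * c * (1.0 - c)

# Sketch of the ECPSO velocity update, equation (7): the mean of k
# global-best positions pulls each particle, and the chaotic value c
# splits the cognitive and social weights.
def ecpso_velocity(x, v, localbest, globalbests, c, w=0.72, c1=1.49, c2=1.49):
    # globalbests holds the k best positions found so far, shape (k, n_dims)
    g_mean = np.mean(globalbests, axis=0)
    return (w * v
            + c1 * c * (localbest - x)           # pull toward the personal best
            + c2 * (1.0 - c) * (g_mean - x))     # pull toward the mean of k global bests
```
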
5. SIMULATION
The new algorithm was implemented in MATLAB. To evaluate the proposed method against four clustering algorithms (GA, PSO, PSO+K-means, CPSO), we use three real, standard data sets from the UCI repository. Table 2 shows their characteristics.
Table 2. Data sets used

Data set | Number of samples | Number of classes | Number of attributes
iris     | 150 | 3 | 4
seeds    | 210 | 3 | 7
glass    | 214 | 6 | 10
In this article we use four criteria and indexes:
- Number of iterations to reach the termination condition
- Number of fitness-function evaluations
- Clustering accuracy (the purity criterion)
- Index of validity
In the following, we describe each index.
5.1. Number of iterations to reach the termination condition

We use this criterion to evaluate each algorithm's convergence speed.

5.2. Number of fitness-function evaluations

This index counts how many times an algorithm evaluates the fitness function; the larger this count, the higher the algorithm's computational complexity. In other words, with this index we evaluate the computational cost of each clustering algorithm.
5.3. Clustering accuracy with the purity index

This index evaluates how accurately a clustering algorithm groups the data. Its value lies between 0 and 1, and the closer it is to 1, the more accurate and desirable the clustering. Purity is evaluated for every cluster produced by the clustering algorithm; for a single cluster j it is calculated with equation (8):

P(C_j) = (1 / n_j) · max_i(n_i,j)   (8)

where n_i,j is the number of samples of class i assigned to cluster j. That is, for each cluster we take the maximum agreement between the cluster and the classes available in the data collection.
Total purity is evaluated by equation (9):

Purity = Σ_{j=1..m} (n_j / n) · P(C_j)   (9)

In this relation, n_j is the size of cluster j, m is the number of clusters, and n is the number of samples.
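Equations (8) and (9) together can be sketched as below, assuming integer-encoded class and cluster labels; the function name is ours.

```python
import numpy as np

# Purity: the size-weighted average over clusters of the largest
# fraction of a single true class inside each cluster.
def purity(labels_true, labels_pred):
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = labels_true.size
    total = 0.0
    for j in np.unique(labels_pred):
        members = labels_true[labels_pred == j]     # true classes inside cluster j
        n_j = members.size
        max_nij = np.max(np.bincount(members))      # max_i(n_i,j), equation (8)
        total += (n_j / n) * (max_nij / n_j)        # weighted sum, equation (9)
    return total
```
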
5.4. Index of validity

This index evaluates the ratio of the sum of intra-cluster distances to the sum of distances between clusters. Clustering is more desirable when the intra-cluster distances are small and the between-cluster distances are large, so a smaller value of this index indicates better data clustering. The termination condition for all of the mentioned algorithms is a convergence criterion: if, over two consecutive iterations of the clustering, the change in the validity index is less than 0.0001, the algorithm stops, because the clustering has converged.
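The validity index and the termination test can be sketched as below. The exact distance definitions in the paper may differ; this is one common reading (Euclidean distances of points to their center, and pairwise distances between centers), and the names are ours.

```python
import numpy as np

def validity_index(data, labels, centers):
    # Sum of distances of each point to its own cluster center
    intra = 0.0
    for j, c in enumerate(centers):
        pts = data[labels == j]
        intra += np.sum(np.linalg.norm(pts - c, axis=1))
    # Sum of pairwise distances between cluster centers
    inter = 0.0
    k = len(centers)
    for a in range(k):
        for b in range(a + 1, k):
            inter += np.linalg.norm(centers[a] - centers[b])
    return intra / inter   # smaller is better

def converged(prev_index, new_index, tol=1e-4):
    # Stop when the validity index changes by less than 0.0001
    # over two consecutive iterations.
    return abs(new_index - prev_index) < tol
```
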
Table 3. Results of the posed indexes for the aforementioned algorithms, averaged over 10 runs on the iris data set.

Algorithm | Iterations to termination | Fitness evaluations | Clustering accuracy (purity) | Validity index | Difference from the optimal validity
PSO | 5.8 | 23.2 | 0.71 | 25.61 | -6.8
CPSO | 5.6 | 22.4 | 0.75 | 23.74 | -5.01
PSO+K-means | 16.8 | 47.2 | 0.76 | 33.2 | -5.01
GA | 5.1 | 49.8 | 0.71 | 33.2 | -14.44
ECPSO | 6 | 24 | 0.79 | 21.29 | -2.53
As shown in Table 3, the newly proposed algorithm converges to the global optimum faster than the PSO+K-means compound algorithm, and it differs only slightly from the PSO, CPSO and GA algorithms on this criterion. From the point of view of index 2, its computational complexity is much lower than that of the GA and PSO+K-means algorithms. Accuracy and quality are two important factors in the data clustering procedure, because they lead to the discovery of important and exact information in the raw data; in applications of clustering such as medicine and engineering, clustering quality and accuracy have even higher priority. Hence an algorithm that increases accuracy in the clustering procedure is the more suitable one, and from the point of view of this index the proposed algorithm is superior to the other algorithms, with high clustering accuracy.
The algorithm obtains these favorable results because its particles move toward the optimal answer cautiously, with greater accuracy and more suitable speed. Finally, from the point of view of index 4, clustering quality, whose goal is greater compactness and aggregation within each cluster and greater separation between different clusters, the proposed algorithm achieves higher clustering quality than the other algorithms. This dominance is demonstrated by a difference of only 2.53 between its validity index and the optimal value, which is 18.76 for the iris data collection.
Table 4. Results of the posed indexes for the aforementioned algorithms, averaged over 10 runs on the seeds data set.

Algorithm | Iterations to termination | Fitness evaluations | Clustering accuracy (purity) | Validity index | Difference from the optimal validity
PSO | 5.7 | 22.8 | 0.69 | 42.26 | -3.86
CPSO | 5.8 | 23.2 | 0.73 | 39.10 | -0.7
PSO+K-means | 11 | 41 | 0.74 | 42.78 | -4.38
GA | 4.2 | 41.6 | 0.73 | 56.5 | -18.1
ECPSO | 7 | 28 | 0.77 | 37.8 | -0.6
As shown in Table 4, and as in the previous experiment, the proposed algorithm converges to the termination condition (index 1, with a value of 7) faster than the PSO+K-means algorithm, and its computational complexity is again lower than that of the PSO+K-means and GA algorithms. On the clustering accuracy (purity) index, whose importance in clustering was discussed above, it again leads the other algorithms with a value of 0.77. On the clustering quality (validity) index, its difference of 0.6 from the optimal value for the seeds collection, which is 38.40, shows that it clusters the data more suitably and with higher quality than the other algorithms.
Table 5. Results of the posed indexes for the aforementioned algorithms, averaged over 10 runs on the glass data set.

Algorithm | Iterations to termination | Fitness evaluations | Clustering accuracy (purity) | Validity index | Difference from the optimal validity
PSO | 6.8 | 27.2 | 0.75 | 10.6 | -1.73
CPSO | 6.6 | 26.4 | 0.73 | 12.7 | -3.83
PSO+K-means | 33.9 | 63.9 | 0.78 | 7.1 | +1.77
GA | 7.6 | 69.8 | 7.6 | 13.8 | -4.93
ECPSO | 7.1 | 28.4 | 0.80 | 9.8 | -0.93
As shown in Table 5, the new algorithm converges more quickly than the PSO+K-means and GA algorithms, and its computational complexity is lower than both. On the clustering accuracy index, as in the previous experiments, the proposed algorithm attains higher accuracy than the other algorithms.
On the clustering quality (validity) index, the proposed algorithm's difference of 0.93 from the optimal value for the glass data collection, which is 8.87, again makes its clustering the most suitable and of the highest quality among the algorithms. From the results of these experiments we can infer that the proposed algorithm outperforms the other four algorithms, particularly on the clustering accuracy and clustering quality indexes.
6. CONCLUSIONS
Data mining is a helpful technology that has been used and developed alongside the expansion of database technology. It is a current instrument for analyzing and extracting information from large amounts of data, and with it we can discover useful patterns without user intervention. Clustering is one of the most common data mining techniques and has applications in many areas. In this paper we evaluated and discussed some of the techniques and research proposed in recent years to improve the problems of the k-means clustering algorithm. As we saw, some of these techniques improve problems of previously proposed algorithms, while others are new algorithms that address the problems of k-means itself. After reviewing this history of research, we proposed a new algorithm that addresses the problem of the algorithm's result falling into a local optimum, and evaluated it on three real data sets with several validity indexes. We showed that the proposed algorithm achieves high clustering accuracy and quality, so we can use it whenever clustering accuracy and quality matter more than time.