Automatic Unsupervised Data Classification Using the Jaya Evolutionary Algorithm
In this paper we solve an automatic clustering problem by concurrently optimizing multiple objectives: automatic determination of the number of clusters k and a set of cluster validity indices (CVIs). The proposed automatic clustering technique uses the recent Jaya optimization algorithm as its underlying optimization strategy. This evolutionary technique aims to reach a global best solution rather than a local best solution, even on larger datasets. The exploration and exploitation built into the proposed work detect the number of clusters automatically, find the appropriate partitioning present in the data sets, and reach near-optimal values of the CVIs. Twelve datasets of differing complexity are used to validate the performance of the proposed algorithm. The experiments show that the theoretical advantages of multi-objective clustering optimized with evolutionary approaches translate into realistic and scalable performance gains.
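The Jaya update rule the abstract refers to moves each candidate toward the population's current best solution and away from its worst. Below is a minimal Python sketch on a toy sphere objective; the objective, bounds, and parameter values are illustrative assumptions, not taken from the paper:

```python
import random

def jaya_optimize(f, dim=2, pop_size=20, iters=200, lo=-10.0, hi=10.0, seed=0):
    """Minimize f with the Jaya update rule (illustrative sketch)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(iters):
        scores = [f(x) for x in pop]
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        new_pop = []
        for x in pop:
            # Jaya rule: move toward the best solution, away from the worst.
            cand = [xi + rng.random() * (best[j] - abs(xi))
                        - rng.random() * (worst[j] - abs(xi))
                    for j, xi in enumerate(x)]
            cand = [min(max(c, lo), hi) for c in cand]   # keep within bounds
            new_pop.append(cand if f(cand) < f(x) else x)  # greedy acceptance
        pop = new_pop
    return min(pop, key=f)

sphere = lambda x: sum(v * v for v in x)
best = jaya_optimize(sphere)
```

A notable design point of Jaya, visible above, is that it has no algorithm-specific tuning parameters beyond population size and iteration count.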
A Heterogeneous Population-Based Genetic Algorithm for Data Clustering
As a primary data mining method for knowledge discovery, clustering is a technique for partitioning a dataset into groups of similar objects. The most popular data clustering method, k-means, suffers from the drawback of requiring the number of clusters and their initial centers, which must be provided by the user. In the literature, several methods have been proposed, in the form of k-means variants, genetic algorithms, or combinations of the two, for calculating the number of clusters and finding proper cluster centers. However, none of these solutions has provided fully satisfactory results, and determining the number of clusters and the initial centers remains the main challenge in clustering. In this paper we present an approach that automatically generates these parameters to achieve optimal clusters, using a modified genetic algorithm that operates on individuals of varied structure and uses a new crossover operator. Experimental results show that our modified genetic algorithm is a more efficient alternative to the existing approaches.
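A variable-length genetic encoding of this kind can be sketched as follows. The per-cluster penalty, the operators, and the 1-D toy data below are illustrative stand-ins for the paper's actual individual structures and crossover operator:

```python
import random

def fitness(centers, data, penalty=2.0):
    """Within-cluster squared error plus a per-cluster penalty,
    so the GA is not rewarded for adding clusters indefinitely."""
    sse = sum(min((x - c) ** 2 for c in centers) for x in data)
    return sse + penalty * len(centers)

def evolve_centers(data, pop_size=30, gens=100, max_k=6, seed=1):
    rng = random.Random(seed)
    # Variable-length individuals: each is a list of 1..max_k centers.
    pop = [[rng.choice(data) for _ in range(rng.randint(1, max_k))]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda c: fitness(c, data))
        survivors = pop[: pop_size // 2]           # elitism
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = list(rng.choice([a, b]))            # pick a parent skeleton
            if rng.random() < 0.3 and len(child) < max_k:
                child.append(rng.choice(b))             # borrow a center
            if rng.random() < 0.3 and len(child) > 1:
                child.pop(rng.randrange(len(child)))    # drop a center
            i = rng.randrange(len(child))
            child[i] += rng.gauss(0, 0.5)               # perturb one center
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda c: fitness(c, data))

# Two well-separated 1-D groups around 0 and 10.
data = [0.1, -0.2, 0.3, 0.0, 10.1, 9.8, 10.3, 9.9]
best = evolve_centers(data)
```

Because individuals can grow or shrink, the number of clusters is itself subject to evolution, which is the essential idea behind heterogeneous populations here.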
Using Particle Swarm Optimization to Solve Test Function Problems
In this paper, benchmark functions are used to evaluate the particle swarm optimization (PSO) algorithm. The functions used are two-dimensional but were selected with different difficulties and different models. To demonstrate the capability of PSO, it is compared with a genetic algorithm (GA); the two algorithms are compared in terms of objective function values and standard deviation. Multiple runs were performed in Matlab, with parameters chosen appropriately, to obtain convincing results. The suggested algorithm can solve different engineering problems of different dimensions and outperforms the alternative in terms of accuracy and speed of convergence.
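The canonical PSO velocity and position updates being benchmarked can be sketched as below. The Rastrigin test function and the coefficient values are common defaults in the literature, not necessarily the paper's exact setup:

```python
import math
import random

def pso_minimize(f, dim=2, swarm=25, iters=150, lo=-5.0, hi=5.0,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical PSO: inertia weight w, cognitive c1, social c2."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]                 # personal bests
    gbest = min(pbest, key=f)[:]                # global best
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

# Rastrigin in 2-D, a common multimodal benchmark with minimum 0 at the origin.
rastrigin = lambda x: 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v)
                                        for v in x)
result = pso_minimize(rastrigin)
```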
A new model for iris data set classification based on linear support vector m...
Data mining is the process of detecting patterns in large amounts of data, as part of knowledge discovery. Classification is a data analysis task that extracts a model describing the important data classes. One of the outstanding classification methods in data mining is the support vector machine (SVM). It is capable of predicting results and is often more effective than other classification methods. The SVM is a well-known supervised machine learning technique that has been applied successfully to a variety of problems, including regression, classification, and clustering, in diverse domains such as gene expression analysis and web text mining. In this study, we propose a new model for classifying the iris data set that uses an SVM classifier and a genetic algorithm to optimize the C and gamma parameters of a linear SVM; in addition, the principal component analysis (PCA) algorithm is used for feature reduction.
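A GA search over the (C, gamma) pair can be sketched as follows. Training a real SVM is out of scope for a self-contained snippet, so the fitness here is a stand-in surrogate with a known peak; in the paper's setup it would be cross-validated SVM accuracy on the iris data:

```python
import random

def surrogate_accuracy(c, gamma):
    """Placeholder for cross-validated SVM accuracy (a real setup would
    train an SVM here); peaks near C=1.0, gamma=0.1 by construction."""
    return 1.0 / (1.0 + (c - 1.0) ** 2 + 10 * (gamma - 0.1) ** 2)

def ga_tune(fit, gens=60, pop_size=20, seed=0):
    """Real-coded GA over (C, gamma) with blend crossover and mutation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.01, 10.0), rng.uniform(0.001, 1.0))
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda p: -fit(*p))
        elite = pop[: pop_size // 2]               # elitism keeps the best pair
        children = []
        while len(elite) + len(children) < pop_size:
            (c1, g1), (c2, g2) = rng.sample(elite, 2)
            c = 0.5 * (c1 + c2) + rng.gauss(0, 0.2)   # blend crossover + mutation
            g = 0.5 * (g1 + g2) + rng.gauss(0, 0.02)
            children.append((max(c, 1e-3), max(g, 1e-4)))
        pop = elite + children
    return max(pop, key=lambda p: fit(*p))

best_c, best_gamma = ga_tune(surrogate_accuracy)
```

Swapping `surrogate_accuracy` for a real cross-validation score is the only change needed to tune an actual classifier.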
Particle Swarm Optimization based K-Prototype Clustering Algorithm
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
Selection of relevant features from a given feature set is one of the important issues in data mining as well as classification. In general, a dataset may contain many features, but not all of them are important for a particular analysis or decision-making task, because features may share common information or be completely irrelevant to the processing at hand. This generally happens because of improper selection of features during dataset formation or because of incomplete information about the observed system. In both cases, the data will contain features that merely increase the processing burden and may ultimately cause improper outcomes when used for analysis. For these reasons, methods are required to detect and remove such features; in this paper we present an efficient approach that not only removes the unimportant features but also reduces the size of the complete dataset. The proposed algorithm uses information theory to compute the information gain of each feature and a minimum spanning tree to group similar features; fuzzy c-means clustering is then used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that the presented algorithm not only reduces the feature set and data length but also improves classifier performance.
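The information-gain step of such a pipeline can be illustrated directly. Information gain is the entropy of the labels minus the entropy remaining after splitting on a feature; the tiny dataset below is hypothetical:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """Entropy of the labels minus the entropy remaining after
    splitting on the feature's values."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# A feature that perfectly predicts the label vs. one carrying no information.
labels          = ['yes', 'yes', 'no', 'no']
perfect_feature = ['a', 'a', 'b', 'b']
useless_feature = ['a', 'b', 'a', 'b']
```

Here `info_gain(perfect_feature, labels)` is 1.0 bit and `info_gain(useless_feature, labels)` is 0, which is exactly the ranking signal a feature selector would use.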
Engineering Research Publication
Best International Journals, High Impact Journals,
International Journal of Engineering & Technical Research
ISSN : 2321-0869 (O) 2454-4698 (P)
www.erpublication.org
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...
This paper presents a new approach to selecting a reduced number of features in databases. Every database has a given number of features, but it is observed that some of these features can be redundant, can be harmful, and can confuse the classification process. The proposed method first applies a binary-coded genetic algorithm to select a small subset of features. The importance of these features is judged by applying the Naïve Bayes (NB) classification method. The best reduced subset of features, the one with the highest classification accuracy on the given databases, is adopted. The classification accuracy obtained by the proposed method is compared with results reported recently in publications on eight databases. The proposed method performs satisfactorily on these databases and achieves higher classification accuracy with a smaller number of features.
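The binary-coded GA over feature masks can be sketched as below. The stand-in fitness replaces the paper's Naïve Bayes leave-one-out accuracy, and the indices of the "informative" features are an illustrative assumption:

```python
import random

# Hypothetical setup: 6 features, of which only features 0 and 3 are informative.
INFORMATIVE = {0, 3}

def stand_in_accuracy(mask):
    """Placeholder for NB leave-one-out accuracy: rewards keeping the
    informative features, penalizes every extra (noisy) feature kept."""
    kept = {i for i, bit in enumerate(mask) if bit}
    hits = len(kept & INFORMATIVE)
    return 0.5 + 0.2 * hits - 0.05 * len(kept - INFORMATIVE)

def ga_select(n_features=6, pop_size=16, gens=40, seed=3):
    """Binary-coded GA: one-point crossover plus bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=stand_in_accuracy, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_features)] ^= 1   # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=stand_in_accuracy)

best_mask = ga_select()
```

In the paper's pipeline, `stand_in_accuracy` would be replaced by training NB on the masked features and scoring it with leave-one-out cross-validation.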
Data warehouses are structures holding large amounts of data collected from heterogeneous sources for use in decision support systems. Data warehouse analysis identifies hidden, initially unexpected patterns, and this analysis requires substantial memory and computation. Data reduction methods have been proposed to make this analysis easier. In this paper, we present a hybrid approach based on genetic algorithms (GAs) as evolutionary algorithms and multiple correspondence analysis (MCA) as a factor analysis method to perform this reduction. Our approach identifies a reduced subset of dimensions p' from the initial set p, where p' < p, by finding the fact profile closest to a reference. GAs identify the possible subsets, and the chi-square formula of the MCA evaluates the quality of each subset. The study is based on a distance measurement between the reference and n fact profiles extracted from the warehouse.
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...
Feature selection and classification are essential processes when dealing with large data sets that comprise numerous input attributes. Many search methods and classifiers have been used to find the optimal number of attributes. The aim of this paper is to find the optimal set of attributes and improve classification accuracy by adopting an ensemble rule classifiers method. The research process involves two phases: finding the optimal set of attributes, and applying the ensemble classifiers method to the classification task. Results are given in terms of accuracy, the number of selected attributes, and the rules generated. Six datasets were used for the experiments. The final output is an optimal set of attributes combined with the ensemble rule classifiers method. The experimental results on public real datasets demonstrate that the ensemble rule classifiers method consistently improves classification accuracy on the selected datasets. Significant improvement in accuracy and an optimal set of selected attributes are achieved by adopting the ensemble rule classifiers method.
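At its simplest, an ensemble of rule classifiers combines rule-based predictors by majority vote. A minimal sketch; the toy rules and dict-shaped samples below are hypothetical:

```python
from collections import Counter

def majority_vote(rule_classifiers, sample):
    """Combine several rule-based classifiers by simple majority vote."""
    votes = [clf(sample) for clf in rule_classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three toy rule classifiers over a dict-shaped sample (illustrative rules).
rules = [
    lambda s: 'spam' if s['caps'] > 5 else 'ham',
    lambda s: 'spam' if s['links'] > 2 else 'ham',
    lambda s: 'spam' if s['caps'] > 5 and s['links'] > 0 else 'ham',
]
```

Each base rule can be wrong on its own; the vote only fails when a majority of rules err on the same sample, which is the usual argument for ensembles.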
Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural...
Variable selection is an important dimensionality-reduction technique frequently used in data preprocessing for data mining. This paper presents a new variable selection algorithm that combines heuristic variable selection (HVS) and Minimum Redundancy Maximum Relevance (MRMR): we enhance the HVS method by incorporating the MRMR filter. Our algorithm follows a wrapper approach using a multilayer perceptron; we call it the HVS-MRMR wrapper for variable selection. The relevance of a set of variables is measured by a convex combination of the relevance given by the HVS criterion and the MRMR criterion. We evaluate the performance of HVS-MRMR on eight benchmark classification problems. The experimental results show that HVS-MRMR selects fewer variables with higher classification accuracy compared to MRMR, HVS, and no variable selection on most datasets. HVS-MRMR can be applied to various classification problems that require high classification accuracy.
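The MRMR side of the method (relevance to the target minus redundancy with already-chosen variables) can be sketched greedily over discrete features using mutual information. The tiny feature table below is illustrative: f1 is a good but imperfect predictor, f2 duplicates f1, and f3 is weakly relevant but independent of f1:

```python
from math import log2
from collections import Counter

def mutual_info(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(c / n * log2((c / n) / (px[x] / n * py[y] / n))
               for (x, y), c in pxy.items())

def mrmr_select(features, target, k=2):
    """Greedy mRMR: at each step pick the feature with the highest
    relevance to the target minus mean redundancy with those already chosen."""
    chosen, remaining = [], list(features)
    while remaining and len(chosen) < k:
        def score(name):
            rel = mutual_info(features[name], target)
            red = (sum(mutual_info(features[name], features[c]) for c in chosen)
                   / len(chosen)) if chosen else 0.0
            return rel - red
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

target = [0, 0, 0, 1, 1, 1]
features = {'f1': [0, 0, 0, 1, 1, 0],   # predictive, one error
            'f2': [0, 0, 0, 1, 1, 0],   # exact duplicate of f1 (redundant)
            'f3': [0, 1, 0, 1, 0, 1]}   # weak, but independent of f1
picked = mrmr_select(features, target, k=2)
```

The redundancy term is what makes mRMR skip `f2` even though its relevance equals f1's.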
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
Attribute reduction and classification are essential processes when dealing with large data sets that comprise numerous input attributes. Many search methods and classifiers have been used to find the optimal number of attributes. The aim of this paper is to find the optimal set of attributes and improve classification accuracy by adopting an ensemble rule classifiers method. The research process involves two phases: finding the optimal set of attributes, and applying the ensemble classifiers method to the classification task. Results are given in terms of accuracy, the number of selected attributes, and the rules generated. Six datasets were used for the experiments. The final output is an optimal set of attributes combined with the ensemble rule classifiers method. The experimental results on public real datasets demonstrate that the ensemble rule classifiers method consistently improves classification accuracy on the selected datasets. Significant improvement in accuracy and an optimal set of selected attributes are achieved by adopting the ensemble rule classifiers method.
For years, the machine learning community has focused on developing efficient algorithms that produce very accurate classifiers. However, it is often much easier to find several good classifiers based on dataset combination than a single classifier applied to different datasets. The advantages of using classifier-dataset combinations instead of a single classifier are twofold: they lower computational complexity by using simpler models, and they can improve classification accuracy and performance. Most data mining applications are based on pattern matching algorithms, so improving classification performance has a positive impact on the quality of the overall data mining task. Since combination strategies have proved very useful in improving performance, these techniques have become important in applications such as cancer detection, speech technology, and natural language processing. The aim of this paper is to propose a proprietary metric, the Normalized Geometric Index (NGI), based on the latent properties of datasets, for improving the accuracy of data mining tasks.
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensem...
Data analysis plays a prominent role in interpreting various phenomena, and data mining is the process of extracting useful knowledge from extensive data. Building on classical statistical models, data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation conducted with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles combine individual solutions obtained from different clusterings to produce a final high-quality clustering, which is required in many applications; the approach arose from the need for increased robustness, scalability, and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles, and surveys the various techniques and cluster ensemble methods.
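One widely surveyed consensus mechanism is the co-association matrix: count how often each pair of points falls in the same cluster across the base clusterings, then merge pairs above a threshold. A minimal sketch; the threshold and toy partitions are illustrative:

```python
def co_association(partitions, n):
    """Fraction of base clusterings placing each pair of points together."""
    m = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    m[i][j] += 1.0 / len(partitions)
    return m

def consensus_clusters(partitions, n, threshold=0.5):
    """Merge points whose co-association exceeds the threshold."""
    m = co_association(partitions, n)
    parent = list(range(n))                 # simple union-find
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if m[i][j] > threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(n)]

# Three base clusterings of 4 points; points 0,1 and 2,3 mostly co-occur.
parts = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
labels = consensus_clusters(parts, 4)
```

Note that the consensus labels never compare the base clusterings' label values directly, only pairwise co-membership, which is what makes the mechanism label-alignment-free.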
A novel population-based local search for nurse rostering problem
Population-based approaches are regularly better than single-solution (local search) approaches at exploring the search space; their drawback lies in exploiting it. Several hybrid approaches have proven their efficiency across different optimization domains by integrating the strengths of population-based and local search approaches, but hybrid methods have the drawback of increased parameter tuning. Recently, a population-based local search (PB-LS) with fewer parameters than existing approaches was proposed for a university course-timetabling problem and proved effective. The approach employs two operators to intensify and diversify the search: the first is applied to a single solution, while the second is applied to all solutions. This paper investigates the performance of population-based local search on the nurse rostering problem. The INRC2010 database, with a dataset composed of 69 instances, is used to test the performance of PB-LS, and a comparison is made with other existing approaches in the literature. Results show good performance of the proposed approach compared to the others: PB-LS provided the best results in 55 of the 69 instances used in the experiments.
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...
Data mining is indispensable for business organizations: it extracts useful information from the huge volume of stored data, which can be used in managerial decision making to survive in a competitive environment. Owing to day-to-day advancements in information and communication technology, the data collected from e-commerce and e-governance are mostly high-dimensional, while data mining prefers small datasets to high-dimensional ones. Feature selection is an important dimensionality-reduction technique. The subsets selected in subsequent iterations of feature selection should be the same or similar even under small perturbations of the dataset; this property is called selection stability, and it has recently become an important topic in the research community. Selection stability has been quantified by various measures. This paper analyses the choice of a suitable search method and stability measure for feature selection algorithms, as well as the influence of the characteristics of the dataset, since the choice of the best approach is highly problem-dependent.
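One common way to quantify selection stability is the average pairwise Jaccard similarity of the feature subsets chosen across perturbed runs; a score of 1.0 means identical selections every time. A minimal sketch with hypothetical subsets:

```python
def jaccard_stability(subsets):
    """Average pairwise Jaccard similarity of feature subsets selected
    across runs: 1.0 means identical selections every time."""
    pairs, total = 0, 0.0
    for i in range(len(subsets)):
        for j in range(i + 1, len(subsets)):
            a, b = set(subsets[i]), set(subsets[j])
            total += len(a & b) / len(a | b)   # |intersection| / |union|
            pairs += 1
    return total / pairs

# Subsets chosen by a selector on three perturbed versions of a dataset.
runs = [{'f1', 'f2', 'f3'}, {'f1', 'f2', 'f4'}, {'f1', 'f2', 'f3'}]
score = jaccard_stability(runs)
```

Here two of the three pairs agree only partially (Jaccard 0.5) and one pair agrees fully, so the stability score is 2/3.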
An Automatic Clustering Technique for Optimal Clusters
This paper proposes a simple, automatic, and efficient clustering algorithm, namely Automatic Merging for Optimal Clusters (AMOC), which aims to generate nearly optimal clusters for a given dataset automatically. AMOC extends standard k-means with a two-phase iterative procedure that combines certain validation techniques in order to find optimal clusters by automatically merging clusters. Experiments on both synthetic and real data have shown that the proposed algorithm finds nearly optimal clustering structures in terms of number of clusters, compactness, and separation.
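The merging phase of an AMOC-style procedure can be sketched as greedy agglomeration guarded by a validity test: repeatedly merge the cheapest pair of clusters until a merge would hurt compactness. The stopping rule and 1-D toy clusters below are illustrative stand-ins for the paper's validation techniques:

```python
def sse(cluster):
    """Within-cluster sum of squared errors around the mean."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)

def merge_for_optimal_k(clusters, max_increase=1.0):
    """Greedily merge the cheapest pair of clusters while the merge keeps
    the added within-cluster error below max_increase (a stand-in
    validity test)."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                cost = sse(merged) - sse(clusters[i]) - sse(clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        if cost > max_increase:
            break                   # merging would hurt compactness; stop
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters

# Over-segmented 1-D data: four initial clusters that really form two groups.
initial = [[0.0, 0.2], [0.4], [10.0, 10.1], [10.4]]
final = merge_for_optimal_k(initial)
```

Starting from deliberately too many clusters and merging downward is what lets the procedure discover the number of clusters rather than require it as input.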
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
This paper presents a hybrid data mining approach based on supervised and unsupervised learning to identify the closest data patterns in a database. The technique achieves a high accuracy rate with minimal complexity. The proposed algorithm is compared with traditional clustering and classification algorithms and is also implemented on multidimensional datasets. The implementation results show better prediction accuracy and reliability.
Diagnosis of health conditions is a very challenging task for every human being, because life is directly related to health. Data-mining-based classification is one of the important applications of data classification. In this research work, we have used various classification techniques to classify thyroid data; CART gives the highest accuracy, 99.47%, as the best model. Feature selection plays a very important role in making a model computationally efficient and increasing its performance. This work focuses on the Info Gain and Gain Ratio feature selection techniques to remove irrelevant features from the original data set and improve the computational performance of the model. We applied both feature selection techniques to the best model, i.e., CART. Our proposed CART-Info Gain and CART-Gain Ratio give 99.47% and 99.20% accuracy with 25 and 3 features, respectively.
Applying the big bang-big crunch metaheuristic to large-sized operational pro...
In this study, we investigate the capability of the big bang-big crunch metaheuristic (BBBC) for managing operational problems, including combinatorial optimization problems. The BBBC is inspired by the evolution theory of the universe in physics and astronomy and has two main phases: the big bang and the big crunch. The big bang phase creates a population of random initial solutions, while in the big crunch phase these solutions are shrunk into one elite solution represented by a mass center. This study looks into the BBBC's effectiveness on assignment and scheduling problems, where it was enhanced by incorporating an elite pool of diverse, high-quality solutions; a simple descent heuristic as a local search method; implicit recombination; Euclidean distance; a dynamic population size; and elitism strategies. These strategies provide a balanced search over a diverse, good-quality population. The investigation compares the proposed BBBC with similar metaheuristics on three classes of combinatorial optimization problems, namely quadratic assignment, bin packing, and job shop scheduling, where the incorporated strategies have a strong impact on the BBBC's performance. Experiments showed that the BBBC maintains a good balance between diversity and quality, produces high-quality solutions, and outperforms comparable metaheuristics (e.g., swarm intelligence and evolutionary algorithms) reported in the literature.
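The two BBBC phases can be sketched on a toy continuous objective. The inverse-fitness weighting, the shrinking scatter schedule, and the parameters below are one common formulation of the basic metaheuristic, not the enhanced combinatorial variant the study proposes:

```python
import random

def bbbc_minimize(f, dim=2, pop_size=30, iters=100, lo=-5.0, hi=5.0, seed=0):
    """Big bang phase scatters candidates around the current mass center;
    big crunch contracts them to a fitness-weighted mass center, with a
    scatter radius that shrinks each cycle."""
    rng = random.Random(seed)
    center = [rng.uniform(lo, hi) for _ in range(dim)]
    best = center[:]
    for t in range(1, iters + 1):
        radius = (hi - lo) / t                      # scatter shrinks each cycle
        # Big bang: scatter a fresh population around the mass center.
        pop = [[min(max(center[d] + rng.gauss(0, radius), lo), hi)
                for d in range(dim)] for _ in range(pop_size)]
        # Big crunch: weight each candidate by inverse fitness (minimization).
        weights = [1.0 / (1e-9 + f(p)) for p in pop]
        total = sum(weights)
        center = [sum(w * p[d] for w, p in zip(weights, pop)) / total
                  for d in range(dim)]
        cand = min(pop, key=f)
        if f(cand) < f(best):
            best = cand[:]                          # elitism: keep best-so-far
    return best

sphere = lambda x: sum(v * v for v in x)
solution = bbbc_minimize(sphere)
```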
The pertinent single-attribute-based classifier for small datasets classific...
Classifying a dataset using machine learning algorithms can be a big challenge when the target is a small dataset. The OneR classifier can be used in such cases due to its simplicity and efficiency. In this paper, we reveal the power of a single attribute by introducing the pertinent single-attribute-based heterogeneity-ratio classifier (SAB-HR), which uses a pertinent attribute to classify small datasets. SAB-HR's feature selection method uses the heterogeneity ratio (H-Ratio) measure to identify the most homogeneous attribute among the attributes in the set. Our empirical results on 12 benchmark datasets from the UCI machine learning repository show that the SAB-HR classifier significantly outperforms the classical OneR classifier on small datasets. In addition, using the H-Ratio as a feature selection criterion for selecting the single attribute is more effective than traditional criteria such as Information Gain (IG) and Gain Ratio (GR).
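The classical OneR baseline that SAB-HR is compared against builds one rule set per attribute (each attribute value maps to its majority class) and keeps the single attribute whose rules err least on the training data. A minimal sketch; the toy weather-style rows are hypothetical:

```python
from collections import Counter, defaultdict

def one_r(rows, labels, attribute):
    """Build a OneR rule set for one attribute: map each of its values to
    the majority class, and measure the rule set's training accuracy."""
    by_value = defaultdict(list)
    for row, label in zip(rows, labels):
        by_value[row[attribute]].append(label)
    rule = {v: Counter(ls).most_common(1)[0][0] for v, ls in by_value.items()}
    correct = sum(rule[row[attribute]] == label
                  for row, label in zip(rows, labels))
    return rule, correct / len(rows)

def best_single_attribute(rows, labels):
    """OneR classifier: keep the single attribute whose rules err least."""
    return max(rows[0].keys(), key=lambda a: one_r(rows, labels, a)[1])

# Toy dataset: 'outlook' predicts the label perfectly, 'windy' does not.
rows = [{'outlook': 'sunny', 'windy': 'yes'},
        {'outlook': 'sunny', 'windy': 'no'},
        {'outlook': 'rainy', 'windy': 'yes'},
        {'outlook': 'rainy', 'windy': 'no'}]
labels = ['play', 'play', 'stay', 'stay']
```

SAB-HR keeps this one-attribute structure but, per the abstract, swaps the error-rate selection criterion for the H-Ratio measure.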
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...IJERA Editor
This paper presents a new approach to select reduced number of features in databases. Every database has a
given number of features but it is observed that some of these features can be redundant and can be harmful as
well as confuse the process of classification. The proposed method first applies a binary coded genetic algorithm
to select a small subset of features. The importance of these features is judged by applying Naïve Bayes (NB)
method of classification. The best reduced subset of features which has high classification accuracy on given
databases is adopted. The classification accuracy obtained by proposed method is compared with that reported
recently in publications on eight databases. It is noted that proposed method performs satisfactory on these
databases and achieves higher classification accuracy but with smaller number of feature
Data warehouses are structures holding large amounts of data collected from heterogeneous sources, to be
used in a decision support system. Data warehouse analysis identifies hidden, initially unexpected
patterns, and this analysis requires great memory and computation cost. Data reduction methods have been
proposed to make the analysis easier. In this paper, we present a hybrid approach based on Genetic
Algorithms (GA) as the evolutionary algorithm and Multiple Correspondence Analysis (MCA) as the factor
analysis method to conduct this reduction. Our approach identifies a reduced subset of dimensions p' from
the initial set p, where p' < p, and finds the fact profile that is closest to a reference. The GA
identifies the candidate subsets, and the chi-squared (Khi²) formula of the MCA evaluates the quality of
each subset. The study is based on a distance measure between the reference and the n fact profiles
extracted from the warehouse.
New Feature Selection Model Based Ensemble Rule Classifiers Method for Datase...ijaia
Feature selection and classification are essential processes in dealing with large data sets that
comprise a large number of input attributes. Many search methods and classifiers have been used to find
the optimal set of attributes. The aim of this paper is to find the optimal set of attributes and improve
classification accuracy by adopting an ensemble rule classifiers method. The research process involves
two phases: finding the optimal set of attributes, and applying the ensemble classifiers method for the
classification task. Results are reported in terms of percentage accuracy, number of selected attributes,
and rules generated. Six datasets were used for the experiment. The final output is an optimal set of
attributes combined with the ensemble rule classifiers method. The experimental results, conducted on
public real datasets, demonstrate that the ensemble rule classifiers method consistently improves
classification accuracy on the selected datasets. Significant improvement in accuracy and an optimal set
of selected attributes are achieved by adopting the ensemble rule classifiers method.
Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural...IJECEIAES
Variable selection is an important technique for reducing the dimensionality of data, frequently used in preprocessing for data mining. This paper presents a new variable selection algorithm that uses Heuristic Variable Selection (HVS) and Minimum Redundancy Maximum Relevance (MRMR). We enhance the HVS method for variable selection by incorporating the MRMR filter. Our algorithm is based on a wrapper approach using a multi-layer perceptron; we call it the HVS-MRMR wrapper for variable selection. The relevance of a set of variables is measured by a convex combination of the relevance given by the HVS criterion and the MRMR criterion. We evaluate the performance of HVS-MRMR on eight benchmark classification problems. The experimental results show that HVS-MRMR selects fewer variables with higher classification accuracy compared to MRMR, HVS, and no variable selection on most datasets. HVS-MRMR can be applied to various classification problems that require high classification accuracy.
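The convex-combination scoring described in the abstract can be sketched in a few lines. The per-feature HVS and MRMR scores below are invented placeholders; only the combination rule itself reflects what the abstract describes.

```python
# Hypothetical per-feature scores standing in for the HVS and MRMR criteria;
# in the actual method these come from the trained perceptron and the MRMR filter.
hvs = {'f1': 0.9, 'f2': 0.4, 'f3': 0.7}
mrmr = {'f1': 0.5, 'f2': 0.8, 'f3': 0.6}

def combined_relevance(alpha=0.6):
    """Convex combination of the two criteria: alpha*HVS + (1-alpha)*MRMR."""
    return {f: alpha * hvs[f] + (1 - alpha) * mrmr[f] for f in hvs}

scores = combined_relevance()
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Varying `alpha` trades off the wrapper's network-based criterion against the filter's redundancy control.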
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...csandit
For years, the machine learning community has focused on developing efficient algorithms that produce
very accurate classifiers. However, it is often much easier to find several good classifiers through
classifier-dataset combination than a single classifier applied to different datasets. The advantages of
using classifier-dataset combinations instead of a single classifier are twofold: they lower the
computational complexity by using simpler models, and they can improve classification accuracy and
performance. Most data mining applications are based on pattern matching algorithms, so improving
classification performance has a positive impact on the quality of the overall data mining task. Since
combination strategies have proved very useful in improving performance, these techniques have become
important in applications such as cancer detection, speech technology, and natural language processing.
The aim of this paper is to propose a proprietary metric, the Normalized Geometric Index (NGI), based on
the latent properties of datasets, for improving the accuracy of data mining tasks.
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensem...IJECEIAES
Data analysis plays a prominent role in interpreting various phenomena. Data mining is the process of extracting useful knowledge from extensive data. Based upon classical statistical prototypes, the data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles are a mélange of individual solutions obtained from different clusterings, combined to produce a final high-quality clustering, which is required in many applications. The method arises in the perspective of increasing robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions included in cluster ensembles, surveying and analyzing the various techniques and cluster ensemble methods.
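One widely surveyed consensus mechanism in this area is the co-association (evidence accumulation) matrix, which records how often each pair of points is clustered together across the base clusterings. A minimal sketch, with illustrative labelings:

```python
def co_association(labelings, n):
    """Fraction of base clusterings that place each pair of points together."""
    m = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    m[i][j] += 1 / len(labelings)
    return m

# Three toy base clusterings of four points (labels are arbitrary ids).
labelings = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
m = co_association(labelings, 4)
print(m[0][1], m[1][2], m[2][3])
```

The resulting matrix can then be fed to any similarity-based clusterer to produce the final consensus partition.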
A novel population-based local search for nurse rostering problem IJECEIAES
Population-based approaches are regularly better than single-solution (local search) approaches at exploring the search space, but they are weaker at exploiting it. Several hybrid approaches have proven their efficiency across different optimization domains by integrating the strengths of population and local search approaches; however, hybrid methods increase the burden of parameter tuning. Recently, a population-based local search (PB-LS) with fewer parameters than existing approaches was proposed for the university course-timetabling problem and proved effective. The approach employs two operators to intensify and diversify the search: the first is applied to a single solution, while the second is applied to all solutions. This paper investigates the performance of population-based local search on the nurse rostering problem. The INRC2010 database, with a dataset composed of 69 instances, is used to test the performance of PB-LS, and a comparison is made against other existing approaches in the literature. Results show good performance of the proposed approach: population-based local search provided the best results in 55 of the 69 instances used in the experiments.
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C...ijcsit
Data mining is indispensable for business organizations, extracting useful information from huge volumes of stored data for the managerial decision making that helps them survive the competition. Due to day-to-day advancements in information and communication technology, the data collected from e-commerce and e-governance are mostly high-dimensional, whereas data mining prefers small datasets to high-dimensional ones. Feature selection is an important dimensionality reduction technique. The subsets selected in subsequent iterations of feature selection should be the same, or similar, even under small perturbations of the dataset; this property is called selection stability, and it has recently become an important topic in the research community. Selection stability has been quantified by various measures. This paper analyses the choice of a suitable search method and stability measure for feature selection algorithms, and also the influence of dataset characteristics, since the choice of the best approach is highly problem dependent.
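A selection-stability measure of the kind compared in such studies can be as simple as the average pairwise Jaccard similarity between the feature subsets chosen on perturbed copies of the data. A minimal sketch (the subsets below are illustrative, not from the paper):

```python
def jaccard(a, b):
    """Set overlap: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def stability(subsets):
    """Average pairwise Jaccard similarity over subsets selected on
    perturbed versions of the dataset; 1.0 means perfectly stable."""
    pairs = [(i, j) for i in range(len(subsets))
             for j in range(i + 1, len(subsets))]
    return sum(jaccard(subsets[i], subsets[j]) for i, j in pairs) / len(pairs)

# Feature subsets selected on three hypothetical perturbations of one dataset.
runs = [{'f1', 'f2', 'f3'}, {'f1', 'f2', 'f4'}, {'f1', 'f2', 'f3'}]
print(stability(runs))
```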
An Automatic Clustering Technique for Optimal ClustersIJCSEA Journal
This paper proposes a simple, automatic and efficient clustering algorithm, namely, Automatic Merging for Optimal Clusters (AMOC) which aims to generate nearly optimal clusters for the given datasets automatically. The AMOC is an extension to standard k-means with a two phase iterative procedure combining certain validation techniques in order to find optimal clusters with automation of merging of clusters. Experiments on both synthetic and real data have proved that the proposed algorithm finds nearly optimal clustering structures in terms of number of clusters, compactness and separation.
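The merge idea behind AMOC can be illustrated on one-dimensional data: over-cluster with k-means, then greedily merge the closest clusters. This sketch substitutes a plain centroid-distance threshold for AMOC's validation techniques, so it only gestures at the algorithm; data, init, and threshold are ours.

```python
def kmeans(points, k, iters=10):
    """Plain 1-D k-means with a deterministic spread-out initialisation."""
    cents = points[::max(1, len(points) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: abs(p - cents[i]))].append(p)
        cents = [sum(g) / len(g) if g else cents[i]
                 for i, g in enumerate(groups)]
    return cents, groups

def merge_close(cents, groups, threshold=1.0):
    """Greedily merge the closest centroid pair until all are far apart.
    (AMOC instead merges based on cluster validity tests.)"""
    cents, groups = list(cents), [list(g) for g in groups]
    while len(cents) > 1:
        pairs = [(i, j) for i in range(len(cents))
                 for j in range(i + 1, len(cents))]
        i, j = min(pairs, key=lambda p: abs(cents[p[0]] - cents[p[1]]))
        if abs(cents[i] - cents[j]) > threshold:
            break
        groups[i] += groups.pop(j)
        cents.pop(j)
        if groups[i]:
            cents[i] = sum(groups[i]) / len(groups[i])
    return cents, groups

# Two well-separated blobs; start with too many clusters and merge down.
data = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
cents, groups = kmeans(data, k=4)
cents, groups = merge_close(cents, groups)
print(len(cents), cents)
```

Starting from k = 4, the two sub-clusters inside each blob merge, recovering the natural k = 2 automatically.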
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSEditor IJCATR
This paper presents a hybrid data mining approach based on supervised and unsupervised learning to identify the closest data patterns in the database. The technique achieves the maximum accuracy rate with minimal complexity. The proposed algorithm is compared with traditional clustering and classification algorithms and is also implemented with multidimensional datasets. The implementation results show better prediction accuracy and reliability.
Diagnosis of health condition is a very challenging task for every human being, because life is directly
related to health. Data mining based classification is one of the important applications for the
classification of data. In this research work, we have used various classification techniques for the
classification of thyroid data; CART gives the highest accuracy, 99.47%, as the best model. Feature
selection plays a very important role in making models computationally efficient and increasing their
performance. This research work focuses on the Info Gain and Gain Ratio feature selection techniques to
remove irrelevant features from the original data set and computationally increase the performance of the
model. We applied both feature selection techniques to the best model, i.e. CART. Our proposed CART-Info
Gain and CART-Gain Ratio give 99.47% and 99.20% accuracy with 25 and 3 features, respectively.
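The Info Gain criterion used for feature selection above is the entropy reduction obtained by splitting on an attribute. A self-contained sketch (the toy rows and labels are ours):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a class-label list, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Entropy reduction from partitioning the rows on one attribute."""
    n = len(labels)
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(y)
    remainder = sum(len(part) / n * entropy(part)
                    for part in by_value.values())
    return entropy(labels) - remainder

# Attribute 0 perfectly separates the classes; attribute 1 is uninformative.
rows = [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')]
labels = ['pos', 'pos', 'neg', 'neg']
print(info_gain(rows, labels, 0), info_gain(rows, labels, 1))
```

Gain Ratio divides this quantity by the split's own entropy, which penalises attributes with many distinct values.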
Applying the big bang-big crunch metaheuristic to large-sized operational pro...IJECEIAES
In this study, we investigate the capability of the big bang-big crunch metaheuristic (BBBC) for managing operational problems, including combinatorial optimization problems. The BBBC is a product of the evolution theory of the universe in physics and astronomy. Its two main phases are the big bang and the big crunch: the big bang phase creates a population of random initial solutions, while the big crunch phase shrinks these solutions into one elite solution, represented by a mass center. This study examines the BBBC's effectiveness on assignment and scheduling problems. It was enhanced by incorporating an elite pool of diverse, high-quality solutions; a simple descent heuristic as a local search method; implicit recombination; Euclidean distance; dynamic population size; and elitism strategies. Together these strategies provide a balanced search over a diverse, good-quality population. The investigation compares the proposed BBBC with similar metaheuristics on three classes of combinatorial optimization problems, namely quadratic assignment, bin packing, and job shop scheduling, where the incorporated strategies have a great impact on the BBBC's performance. Experiments showed that the BBBC maintains a good balance between diversity and quality, produces high-quality solutions, and outperforms comparable metaheuristics (e.g. swarm intelligence and evolutionary algorithms) reported in the literature.
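The two BBBC phases can be sketched on a toy continuous objective (the sphere function). This is a bare simplification of the metaheuristic, without the elite pool, local search, or other enhancements the study adds; all parameters are illustrative.

```python
import random

random.seed(0)

def sphere(x):
    """Toy objective to minimise: f(x) = Σ x_i², optimum at the origin."""
    return sum(v * v for v in x)

def bbbc(dim=2, pop=30, iters=50):
    center = [random.uniform(-5, 5) for _ in range(dim)]
    for t in range(1, iters + 1):
        # Big bang: scatter candidates around the current mass centre,
        # with a spread that shrinks as iterations progress.
        cands = [[c + random.gauss(0, 1) * 5 / t for c in center]
                 for _ in range(pop)]
        # Big crunch: contract to the fitness-weighted mass centre
        # (better candidates get larger weight).
        weights = [1 / (1e-9 + sphere(x)) for x in cands]
        total = sum(weights)
        center = [sum(w * x[d] for w, x in zip(weights, cands)) / total
                  for d in range(dim)]
    return center

best = bbbc()
print(best, sphere(best))
```

The shrinking spread plays the same exploration-to-exploitation role that cooling schedules play in other metaheuristics.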
The pertinent single-attribute-based classifier for small datasets classific...IJECEIAES
Classifying a dataset using machine learning algorithms can be a big challenge when the target is a small dataset. The OneR classifier can be used in such cases due to its simplicity and efficiency. In this paper, we reveal the power of a single attribute by introducing the pertinent single-attribute-based heterogeneity-ratio classifier (SAB-HR), which uses a pertinent attribute to classify small datasets. SAB-HR uses a feature selection method based on the Heterogeneity-Ratio (H-Ratio) measure to identify the most homogeneous attribute among the attributes in the set. Our empirical results on 12 benchmark datasets from the UCI machine learning repository show that the SAB-HR classifier significantly outperforms the classical OneR classifier on small datasets. In addition, using the H-Ratio as the feature selection criterion for choosing the single attribute is more effective than traditional criteria such as Information Gain (IG) and Gain Ratio (GR).
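The baseline OneR classifier that SAB-HR is compared against picks the single attribute whose value-to-majority-class rule makes the fewest training errors. A minimal sketch on a toy dataset of our own:

```python
from collections import Counter, defaultdict

# Toy dataset: each row is (attribute values..., class label).
DATA = [
    ('sunny', 'hot', 'no'), ('sunny', 'mild', 'no'),
    ('rainy', 'hot', 'yes'), ('rainy', 'mild', 'yes'),
    ('sunny', 'cool', 'no'), ('rainy', 'cool', 'yes'),
]

def one_r(data):
    """Return (attribute index, rule, errors) for the single attribute
    whose value -> majority-class rule errs least on the training data."""
    n_attrs = len(data[0]) - 1
    best = None
    for a in range(n_attrs):
        counts = defaultdict(Counter)
        for row in data:
            counts[row[a]][row[-1]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(1 for row in data if rule[row[a]] != row[-1])
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best

attr, rule, errors = one_r(DATA)
print(attr, rule, errors)   # attribute 0 predicts the class perfectly here
```

SAB-HR replaces this error-count criterion with the H-Ratio measure when choosing the attribute, which is where the reported gains come from.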
Fault detection and diagnosis of high speed switching devices in power invertereSAT Journals
Abstract
Power-electronic inverters are major components in industry. A fault diagnostics framework, composed of a pattern recognition system with machine learning technology as its integral part, is utilized to detect failures of different switches and trace multiple types of faults in an inverter. From a hardware point of view, the power-electronic inverter can be considered the weakest link; hence, this work focuses on detecting faults and classifying which switches in the inverter cause them. Diagnosis can help avoid unplanned breakdowns and make it possible to run an emergency operation in case of a fault. On the basis of the theoretical foundations of the electronic power inverter, a simulation model has been developed to simulate the healthy condition and all single-switch open-circuit faults. The generated signal is processed using the Discrete Wavelet Transform (DWT) and Fuzzy Inference Logic (FIL). A smart and accurate classification of faults is obtained using the simulation results, tested over a wide operating domain and various load conditions.
Keywords: Fault Diagnosis, DWT, Fuzzy Logic, Artificial Intelligence (AI).
Intelligent Fault Identification System for Transmission Lines Using Artifici...IOSR Journals
Transmission and distribution lines are vital links between generating units and consumers. Because they
are exposed to the atmosphere, the chance of a fault occurring on a transmission line is very high, and
faults must be dealt with immediately to minimize the damage they cause. This paper focuses on detecting
faults on electric power transmission lines using artificial neural networks. A feed-forward neural
network trained with the back-propagation algorithm is employed. An analysis of neural networks with
varying numbers of hidden layers and neurons per hidden layer is provided to validate the choice of
network at each step. The developed neural network is capable of detecting single-line-to-ground and
double-line-to-ground faults on all three phases. Simulation in MATLAB Simulink demonstrates that the
artificial neural network based method is efficient in detecting faults on transmission lines and
achieves satisfactory performance. A 300 km, 25 kV transmission line is used to validate the proposed
fault detection system. A hardware implementation of the neural network is done on a TMS320C6713.
89 jan-roger linna - 7032576 - capillary heating control and fault detectio...Mello_Patent_Registry
Jan-Roger Linna, John Paul Mello - Capillary Heating Control and Fault Detection System and Methodology for Fuel System in an Internal Combustion Engine
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...IAEME Publication
Close-range photogrammetry network design refers to the process of placing a set of cameras in order to
achieve photogrammetric tasks. The main objective of this paper is to find the best locations for two or
three camera stations. Genetic algorithm optimization and Particle Swarm Optimization are developed to
determine the optimal camera stations for computing three-dimensional coordinates. In this research, a
mathematical model representing genetic algorithm optimization and Particle Swarm Optimization for the
close-range photogrammetry network is developed. The paper also gives the sequence of field operations
and computational steps for this task. A test field is included to reinforce the theoretical aspects.
Evaluation of a hybrid method for constructing multiple SVM kernelsinfopapers
Dana Simian, Florin Stoica, Evaluation of a hybrid method for constructing multiple SVM kernels, Recent Advances in Computers, Proceedings of the 13th WSEAS International Conference on Computers, Recent Advances in Computer Engineering Series, WSEAS Press, Rodos, Greece, July 23-25, 2009, ISSN: 1790-5109, ISBN: 978-960-474-099-4, pp. 619-623
Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka
In this paper we compare the performance of two classification algorithms. It is useful to differentiate
algorithms based on computational performance rather than classification accuracy alone: although the
classification accuracy of the algorithms is similar, their computational performance can differ
significantly and can affect the final results. The objective of this paper is therefore to perform a
comparative analysis of two machine learning algorithms, K-Nearest Neighbor classification and Logistic
Regression. We consider a large dataset of 7981 data points and 112 features and examine the performance
of the two algorithms. The processing time and accuracy of the different machine learning techniques are
estimated on the collected data set, using 60% for training and the remaining 40% for testing. The paper
is organized as follows. Section I includes the introduction and background analysis of the research;
Section II presents the problem statement; Section III briefly describes our application, the data
analysis process, the testing environment, and the methodology of our analysis; Section IV comprises the
results of the two algorithms. Finally, the paper concludes with a discussion of future research
directions for eliminating the problems in the current methodology.
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...cscpconf
Optimization problems are predominantly solved using computational intelligence, and one issue that can
be addressed in this context is attribute subset selection evaluation. This paper presents a
computational intelligence technique for solving this optimization problem using a proposed model called
Modified Genetic Search Algorithms (MGSA), which avoids bad local search spaces using merit and scaled
fitness variables, detecting and deleting bad candidate chromosomes and thereby reducing the number of
individual chromosomes in the search space and the iterations of subsequent generations. The paper also
shows that Rotation Forest ensembles are useful in the feature selection method: the base classifier is
multinomial logistic regression integrated with Haar wavelets as the projection filter, reproducing the
rank of each feature with 10-fold cross-validation. The paper discusses the main findings, explores the
combination of MGSA for optimization with Naïve Bayes classification, and concludes with the promising
results of the proposed model. The result obtained using the proposed MGSA model is validated
mathematically using Principal Component Analysis. The goal is to improve the accuracy and quality of
breast cancer diagnosis with robust machine learning algorithms. Compared to other works in the
literature survey, the experimental results achieved in this paper are better, with statistical
inference.
Trust Enhanced Role Based Access Control Using Genetic Algorithm IJECEIAES
Improvements in technological innovations have become a boon for business organizations, firms, institutions, etc. System applications are being developed for organizations whether small-scale or large-scale. Taking into consideration the hierarchical nature of large organizations, security is an important factor which needs to be taken into account. For any healthcare organization, maintaining the confidentiality and integrity of the patients’ records is of utmost importance while ensuring that they are only available to the authorized personnel. The paper discusses the technique of Role-Based Access Control (RBAC) and its different aspects. The paper also suggests a trust enhanced model of RBAC implemented with selection and mutation only ‘Genetic Algorithm’. A practical scenario involving healthcare organization has also been considered. A model has been developed to consider the policies of different health departments and how it affects the permissions of a particular role. The purpose of the algorithm is to allocate tasks for every employee in an automated manner and ensures that they are not over-burdened with the work assigned. In addition, the trust records of the employees ensure that malicious users do not gain access to confidential patient data.
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...cscpconf
Improving the accuracy of supervised classification algorithms in biomedical applications, especially
CADx, is an active area of research. This paper proposes the construction of a rotation forest (RF)
ensemble using 20 learners over two clinical datasets, lymphography and backache. We propose a new
feature selection strategy based on support vector machines optimized by particle swarm optimization,
producing a relevant, minimal feature subset for obtaining higher ensemble accuracy. We quantitatively
analyzed 20 base learners over the two datasets, carried out the experiments with a 10-fold
cross-validation, leave-one-out strategy, and evaluated the performance of the 20 classifiers using the
metrics accuracy (acc), kappa value (K), root mean square error (RMSE), and area under the receiver
operating characteristic curve (ROC). The base classifiers achieved average accuracies of 79.96% and
81.71% for the lymphography and backache datasets respectively, while the RF ensembles produced average
accuracies of 83.72% and 85.77% for the respective diseases. The paper presents promising results using
RF ensembles and provides a new direction towards the construction of reliable and robust medical
diagnosis systems.
A general frame for building optimal multiple SVM kernelsinfopapers
Dana Simian, Florin Stoica, A General Frame for Building Optimal Multiple SVM Kernels, Large-Scale Scientific Computing, Lecture Notes in Computer Science, 2012, Volume 7116/2012, 256-263, DOI: 10.1007/978-3-642-29843-1_29
ON THE PREDICTION ACCURACIES OF THREE MOST KNOWN REGULARIZERS : RIDGE REGRESS...ijaia
This paper reports intensive empirical experiments on 13 datasets to understand the regularization effectiveness of ridge regression, the lasso estimate, and elastic net regularization. Given the diversity of the datasets used, the study offers a deep understanding of how datasets affect the prediction accuracy of each regularization method for a given problem. The results show that datasets play a crucial role in the performance of a regularization method, and that
prediction accuracy depends heavily on the nature of the sampled datasets.
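The shrinkage behaviour such a study measures is easiest to see in the one-feature closed form of ridge regression, where the slope is Σxy / (Σx² + λ) and shrinks toward zero as λ grows. A minimal sketch on made-up data:

```python
def ridge_1d(x, y, lam):
    """Closed-form ridge slope for one centred feature:
    w = Σ x·y / (Σ x² + λ); λ = 0 recovers ordinary least squares."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -2.0, 0.1, 2.0, 3.9]   # roughly y = 2x

for lam in (0.0, 1.0, 10.0):
    # λ = 0 gives the OLS slope 2.0; λ = 10 shrinks it to 1.0.
    print(lam, round(ridge_1d(x, y, lam), 3))
```

The lasso and elastic net differ in penalising |w| (alone or mixed with w²), which can drive coefficients exactly to zero rather than merely shrinking them.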
PROGRAM TEST DATA GENERATION FOR BRANCH COVERAGE WITH GENETIC ALGORITHM: COMP...cscpconf
In search-based test data generation, the problem of test data generation is reduced to one of function
minimization or maximization. Traditionally, for branch testing, test data generation has been formulated
as a minimization problem. In this paper we define an alternate maximization formulation and
experimentally compare it with the minimization formulation. We use a genetic algorithm as the search
technique and, in addition to the usual genetic algorithm operators, we also employ the path prefix
strategy as a branch ordering strategy, together with memory and elitism. Results indicate that there is no significant difference in the performance or coverage obtained through the two approaches, and either could be used in test data generation when coupled with the path prefix strategy, memory, and elitism.
This paper presents a set of methods that use a genetic algorithm for automatic test data generation in
software testing. For several years researchers have proposed methods for generating test data, each with
different drawbacks. In this paper, we present various Genetic Algorithm (GA) based test methods, with
different parameters, to automate structure-oriented test data generation on the basis of internal
program structure. The factors discovered are used in evaluating the fitness function of the genetic
algorithm for selecting the best possible test method. These methods take the test populations as input
and then evaluate the test cases for that program. This integration helps improve the overall performance
of the genetic algorithm in search space exploration and exploitation, with a better convergence rate.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Mission to Decommission: Importance of Decommissioning Products to Increase E...
T180203125133
1. IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 2, Ver. III (Mar-Apr. 2016), PP 125-133
www.iosrjournals.org
DOI: 10.9790/0661-180203125133 www.iosrjournals.org 125 | Page
A Method of View Materialization Using Genetic Algorithm
Debabrata Datta 1, Kashi Nath Dey 2, Payel Saha 3, Sayan Mukhopadhyay 4, Sourav Kumar Mitra 5
1, 3, 4, 5 (Department of Computer Science, St. Xavier’s College (Kolkata), India)
2 (Department of Computer Science and Engineering, University of Calcutta (Kolkata), India)
Abstract: A data warehouse is a very large database system that collects, summarizes and stores data from multiple remote and heterogeneous information sources. Data warehouses are used to support decision making operations in intelligent systems. Decision making queries are complex in nature and take a large amount of time when they are run against a large data warehouse. To reduce the response time, materialized views can be used. Since all possible views cannot be materialized due to space constraints, an optimal subset of views must be selected for materialization. An approach to selecting such a subset of views using Genetic Algorithms is proposed in this paper. This approach computes the top-T views from a multidimensional lattice and calculates the fitness value of each of them.
Keywords: Data Warehouse; Genetic Algorithms; Fitness Function; Materialized View; Multidimensional Lattice
I. Introduction
Data warehouses are very large database systems that collect, summarize, and store data from multiple remote and heterogeneous information sources for the purpose of supporting decision making. The queries for decision making are usually analytical and complex in nature, and their response time is high when they are processed against a large data warehouse. This query response time can be reduced by materializing views over a data warehouse [1]. Since all views cannot be materialized due to space constraints, an optimal subset of views should be selected for materialization. Materialized views are used in data warehouses instead of ordinary views because they store the result of a query in a separate schema object, which can answer future queries without recomputing the result unless the base tables have been updated; their use therefore decreases the response time of the end user application. In this paper a Genetic Algorithm is proposed for the selection of views for materialization.
II. Related Work
Materialized view selection is the problem of selecting appropriate sets of views for materialization in
the data warehouse such that the cost of evaluating queries is minimized subject to given space constraints.
According to the definition given in [2] view selection is defined as “given a database schema R, storage space
B, and a workload of queries Q, choose a set of views V over R to materialize, whose combined size is at most
B”. It is not feasible to materialize all possible views as the number of possible views is exponential in the
number of dimensions, and for higher dimensions it is not possible to store all possible views within the
available storage space. So, an optimal subset of views should be selected for materialization. There are many
different algorithms to select views for materialization; Genetic Algorithm (GA) is one of them. The majority of research in the area of materialized view selection is focused on greedy-heuristic based techniques. These techniques are unable to select good quality views for higher dimensional data sets because their total view evaluation cost is high. GA is a widely used evolutionary technique suitable for solving complex problems involving the identification of a good set of solutions from within a large search space [3]. Several GA based view selection algorithms have been proposed in the literature [5, 6, 7, 8, 9, 10]. These algorithms aim to select views for higher dimensional data sets, with the key challenge of selecting views of high quality, i.e. low total view evaluation cost (TVEC). In this paper a GA based algorithm is proposed that selects views from a
multidimensional lattice. Each chromosome is represented as a string of views selected for materialization. The
length of each chromosome is T for selecting top-T views for materialization. GA is applied with a pre-defined
crossover and mutation probability and a set of top-T views are generated after a pre-specified number of
generations.
III. Basics Of Genetic Algorithm
Genetic algorithms (GAs) are adaptive heuristic search methods based on the evolutionary ideas of
natural selection and genetics. They are inspired by Darwin’s theory about evolution – “Survival of the fittest.”
They represent an intelligent exploitation of random search used to solve optimization problems. GAs, although
randomized, exploit historical information to direct the search into the region of better performance within the
search space. In solving problems, some solution(s) will be the better than others. The set of all possible
solutions is called search space or state space. Each point in the search space represents one possible solution,
which can be marked by its value. This is known as the fitness value for the problem. GA looks for the best
solution among a number of possible solutions.
Working Principle:
Fig.1. Working principle of Genetic algorithm
1. Initialization: An initial population of chromosomes is usually generated randomly within the search space.
2. Evaluation: Once the population is initialized or an offspring population is created, the fitness values of all
individuals within that population are evaluated using a proper fitness function.
3. Selection: Selection of two parents with higher fitness values for recombination. The main idea is to prefer
better solutions to worse ones.
4. Recombination (or Crossover): Recombination combines parts of two or more parental solutions (or
individuals) to create new, possibly better solutions (offspring).
5. Mutation: Mutation basically alters one or more gene values in a chromosome to maintain genetic diversity
from one generation of a population to the next.
6. Replacement: The offspring population created by selection, recombination and mutation replaces the
original parental population.
7. Repeat steps 2–6 until a terminating condition is met.
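The seven steps above can be condensed into a minimal GA loop. The following Python sketch is illustrative only: the helper names (fitness, random_individual, crossover, mutate) are placeholders for problem-specific operators, not names used in this paper.

```python
import random

def genetic_algorithm(fitness, random_individual, crossover, mutate,
                      pop_size=20, generations=100):
    """Generic GA loop following steps 1-6 above (helper names are ours)."""
    # 1. Initialization: random population within the search space
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):                     # 7. repeat until termination
        # 2. Evaluation: rank individuals by fitness
        scored = sorted(population, key=fitness, reverse=True)
        offspring = []
        while len(offspring) < pop_size:
            # 3. Selection: prefer better solutions (top half here)
            p1, p2 = random.sample(scored[:pop_size // 2], 2)
            # 4. Recombination and 5. Mutation
            offspring.append(mutate(crossover(p1, p2)))
        # 6. Replacement: offspring replace the parental population
        population = offspring
    return max(population, key=fitness)
```

For instance, with a bit-string fitness that counts ones (the classic OneMax toy problem), this loop converges rapidly toward the all-ones string.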
IV. Proposed Method
The proposed GA based method selects the top-T views from a multidimensional lattice as discussed below:
A. The Multidimensional Lattice
Views involved in On-Line Analytical Processing (OLAP) queries can be represented as nodes of a
multidimensional lattice [4, 11]. The multidimensional lattice is a graph with no self-loops or parallel edges.
The root node of the lattice represents the base fact table. All the views in the lattice either directly or indirectly
depend on the root view. A view VA is said to be dependent on another view VB when the queries on VA can be
answered using VB. In the graph, direct dependencies are represented by an edge between two views & indirect
dependencies are discovered transitively.
Fig.2. A 3-Dimensional lattice
In Fig. 2, a 3-dimensional lattice is shown. The index is shown in brackets inside the node & the
frequency of the variables is shown on the top left corner of each node. The root view contains all the variables
A, B & C.
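The dependency structure of such a lattice can be captured with a small adjacency sketch. The Python fragment below is a hypothetical encoding of the 3-dimensional lattice (a name like "AC" stands for the view VAC); it recovers indirect dependencies transitively by walking direct edges upward.

```python
# Direct dependencies in the 3-D lattice: each view -> the parents it depends on
parents = {
    "AB": ["ABC"], "AC": ["ABC"], "BC": ["ABC"],
    "A": ["AB", "AC"], "B": ["AB", "BC"], "C": ["AC", "BC"],
    "NONE": ["A", "B", "C"], "ABC": [],   # ABC is the root (base fact table)
}

def ancestors(view):
    """All views that can answer queries on `view`, directly or transitively."""
    seen, stack = set(), list(parents[view])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen
```

For example, ancestors("NONE") returns every other view, since queries on the fully aggregated view can be answered by any ancestor, while ancestors("ABC") is empty because the root depends on no other view.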
B. Proposed Algorithm
1. [Start] Generate random population of n chromosomes.
2. [Fitness] Use the fitness function f(x) to calculate the fitness value of each chromosome x in the population.
3. [Test] If the end condition is satisfied, i.e., a chromosome with the desired fitness value (or a value close to it) is found in the population & further repetition does not generate a chromosome with a better fitness value, then stop & return the best solution found.
4. [New Population] Create a new population by repeating the following steps until the new population is
complete.
a. [Selection] Select two parent chromosomes from a population according to their fitness value. The higher
the fitness value, the better the chances for it to be selected.
b. [Crossover] With a crossover probability pc, combine the parent chromosomes to generate a new offspring.
c. [Mutation] With a mutation probability pm, mutate the offspring at each locus.
d. [Accepting] Place the offspring in the new population.
5. [Replace] Replace the previous population with the new population & go to step 2.
C. Chromosome Representation
Each chromosome is represented as a string of distinct views. Each distinct gene corresponds to a view.
The chromosome string contains only the index values of those views that are materialized. The chromosome
representation for selecting the top-5 views for materialization is shown below:
Fig.3. Chromosome representation for selecting top-5 views
D. Fitness Function
The fitness function is used to estimate how fit a chromosome is for being selected for crossover. The quality of the solution obtained by the GA based algorithm depends on the correctness & appropriateness of the fitness function. Here, the fitness function is based on the frequency of attributes used in a view. The higher the total frequency of a chromosome, the better its chances of being selected for crossover.
The fitness function is given below:
TFV = Σi=1..x Mat(Vi) + Σj=1..y AncNMat(Vj)
where
TFV = Total Frequency Value
n = total number of genes or views, with n = x + y
x = number of materialized views
y = number of non-materialized views
Mat(Vi) = frequency of attributes for view Vi, where Vi is materialized
AncNMat(Vj) = frequency of the materialized ancestor with maximum frequency among the ancestors of a non-materialized view Vj
For example, if views VAC, VBC, VA, VB & VC are selected for materialization and x = 5 & y = 3, then the TFV computation is as follows:
TFV1 = Mat(VAC) + Mat(VBC) + Mat(VA) + Mat(VB) + Mat(VC) = 70 + 96 + 60 + 40 + 32 = 298
TFV2 = AncNMat(VABC) + AncNMat(VAB) + AncNMat(VNONE) = 0 + 0 + 60 = 60
TFV = TFV1 + TFV2 = 298 + 60 = 358
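This worked example can be reproduced in a few lines. In the sketch below, the attribute frequencies of the five materialized views (70, 96, 60, 40, 32) are taken from the example above; the remaining frequencies are placeholders set to 0, since they do not enter the computation, and the lattice edges follow the 3-dimensional lattice of Fig. 2.

```python
# Direct dependencies in the 3-D lattice: each view -> the parents it depends on
parents = {"AB": ["ABC"], "AC": ["ABC"], "BC": ["ABC"],
           "A": ["AB", "AC"], "B": ["AB", "BC"], "C": ["AC", "BC"],
           "NONE": ["A", "B", "C"], "ABC": []}

def tfv(materialized, freq, parents):
    """TFV = TFV1 (frequencies of materialized views) + TFV2 (for each
    non-materialized view, the max-frequency materialized direct ancestor,
    or 0 when none exists, e.g. for an unmaterialized root)."""
    tfv1 = sum(freq[v] for v in materialized)
    tfv2 = 0
    for v in freq:
        if v not in materialized:
            anc = [freq[p] for p in parents[v] if p in materialized]
            tfv2 += max(anc, default=0)
    return tfv1 + tfv2

# Frequencies of the materialized views are from the worked example;
# the 0 entries are placeholders that never enter the computation.
freq = {"ABC": 0, "AB": 0, "AC": 70, "BC": 96, "A": 60, "B": 40, "C": 32, "NONE": 0}
print(tfv({"AC", "BC", "A", "B", "C"}, freq, parents))  # prints 358 (298 + 60)
```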
E. Fitness Function Algorithm
The fitness function takes the following parameters as its inputs:
1. The adjacency matrix adj[][] for the entire multidimensional lattice,
2. freq[] array which contains the frequencies of attributes for each view,
3. level[] which contains the level of each view in the graph(lattice),
4. The total number of distinct genes (tot_gene),
5. The number of genes to be materialized (mat_gene), and
6. The size of the population popl.
The procedure RAND(1, tot_gene) generates a random number between 1 & tot_gene, where tot_gene is the total number of distinct genes.
The procedure LINEAR(topt, k, i, mat_gene) checks whether the item k is present in the ith chromosome; it returns true if it is, & false otherwise.
The procedure MAX(anc) finds the maximum value in the anc[] array, i.e. the frequency of the materialized ancestor of a non-materialized gene that has the maximum frequency.
The algorithm proceeds in the following way:
1. Generate the initial population & store the chromosomes in the array topt[][].
2. Calculate the total frequency value of the materialized views for each chromosome in the population. This value is termed TFV1.
3. Find the non-materialized views in each chromosome & find the maximum-frequency materialized ancestor for each of those non-materialized views.
4. Calculate the sum of the attribute frequencies of these ancestors for each chromosome. This value is termed TFV2.
5. Add TFV1 & TFV2 for each chromosome to generate the Total Frequency Value (TFV).
Algorithm FITNESS (adj[][], freq[], level[], tot_gene, mat_gene, popl)
begin
  n_mat_gene = tot_gene - mat_gene       // number of non-materialized views
  // Generate the initial population without duplicate genes
  for i = 1 to popl do
    for j = 1 to mat_gene do
      while (true) do                    // loop until a non-duplicate gene is drawn
        flag = false
        rand = RAND(1, tot_gene)
        for k = 1 to j - 1 do
          if (topt[i][k] == rand) then   // gene already present: draw a new number
            flag = true
            break
          end if
        end for
        if (flag == false) then
          topt[i][j] = rand
          break
        end if
      end while
    end for
  end for
  // TFV1: total frequency of the materialized views in each chromosome
  for i = 1 to popl do
    tv1 = 0
    for j = 1 to mat_gene do
      k = topt[i][j]
      tv1 = tv1 + freq[k]
    end for
    tfv1[i] = tv1
  end for
  // Collect the non-materialized views of each chromosome in mma[][]
  for i = 1 to popl do
    k = 1; l = 1
    while (k <= tot_gene and l <= n_mat_gene) do
      flag = LINEAR(topt, k, i, mat_gene)
      if (flag == false) then
        mma[i][l] = k
        l = l + 1
      end if
      k = k + 1
    end while
  end for
  // TFV2: for each non-materialized view, add the frequency of its
  // maximum-frequency ancestor in the previous level (0 for the root)
  for i = 1 to popl do
    tv2 = 0
    for j = 1 to n_mat_gene do
      k = mma[i][j]
      q = 1
      anc[] = {0, 0, ...}                // reset the ancestor-frequency array
      for p = 1 to tot_gene do
        if (level[k] == 1) then          // root view: no ancestors
          break
        else if ((adj[k][p] == 1) and (level[p] == (level[k] - 1))) then
          anc[q] = freq[p]               // p is in the previous level, so it is an ancestor
          q = q + 1
        end if
      end for
      m = MAX(anc)                       // maximum-frequency ancestor of view k
      tv2 = tv2 + m
    end for
    tfv2[i] = tv2
  end for
  // TFV = TFV1 + TFV2 for each chromosome
  for i = 1 to popl do
    TFV[i] = tfv1[i] + tfv2[i]
  end for
end
F. Selection
Selection is the process of choosing individuals from a population for crossover. Fitter individuals are more likely to produce fitter offspring in subsequent generations. However, selecting only the fittest individuals might hinder exploration of the search space, leading to premature convergence. Therefore, individuals need to be selected randomly, with individuals having higher fitness values being more likely to be chosen for crossover. The approach uses the binary tournament selection method, where two individuals are randomly selected from the population and a tournament is conducted between them. A value r is randomly generated between 0 and 1. If r is less than a pre-defined value k (say k = 0.75), taken between 0 and 1, the individual with the higher TFV is selected; otherwise the individual with the lower TFV is selected.
Chromosomes (top-T views)    Fitness value (TFV)
[ 3 6 7 8 2 ] 303
[ 2 5 4 1 8 ] 629
[ 2 3 1 8 6 ] 541
[ 7 4 8 5 1 ] 595
[ 7 3 8 1 6 ] 513
[ 6 2 1 7 4 ] 568
Fig.4. TFV of top-T views in VTopT
Selection Algorithm:
1. Randomly select two individuals from the population.
2. Given: parameter k = 0.75.
3. Choose a random number r between 0 and 1.
4. If r < k, select the fitter of the two individuals.
5. Else, select the less fit of the two individuals.
Randomly generated indexes [i][j] | Tournament between individuals P(i), P(j) | Fitness TFV(P(i)), TFV(P(j)) | Random number r | Individual selected
[2] [3] | [2 3 1 8 6], [7 4 8 5 1] | 541, 585 | 0.9765590312995504 | [2 3 1 8 6]
[5] [0] | [6 2 1 7 4], [3 6 7 8 2] | 303, 568 | 0.7209919519618984 | [3 6 7 8 2]
Fig.5. Selection of top-T views using Binary Tournament Selection
G. Crossover
In crossover, two individuals are randomly selected and recombined using the single point crossover method with crossover probability pc = 0.5. A uniform random number r is generated, and if r ≤ pc, the two randomly selected individuals undergo crossover. Otherwise, the two offspring are simply copies of their parents [12].
Fig.6. Single Point Crossover
Crossover Algorithm:
1. Select two parents randomly for crossover from the population.
2. Generate a random number r.
3. If r <= pc then:
a. Randomly generate a point for crossover.
b. Perform single point crossover between the parents at the randomly generated point.
c. If either of the children contains duplicate genes after crossover, repeat Steps 1-3.
d. Else, if any of the children has already been generated before for the new population, repeat Steps 1-3.
4. Else, do not perform crossover; simply copy the parents as their corresponding children.
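Steps 1-3 of the crossover algorithm can be sketched in Python as below. This is a hedged sketch: the duplicate-gene check of step c is included, while the "already generated" check of step d would need access to the whole new population and is omitted; max_tries is our own bound on retries.

```python
import random

def single_point_crossover(p1, p2, pc=0.5, max_tries=100):
    """With probability pc, recombine at a random cut point; retry the cut if a
    child would contain duplicate view indices, as the algorithm above requires."""
    if random.random() > pc:
        return p1[:], p2[:]              # no crossover: children are parent copies
    for _ in range(max_tries):
        cut = random.randint(1, len(p1) - 1)
        c1 = p1[:cut] + p2[cut:]
        c2 = p2[:cut] + p1[cut:]
        if len(set(c1)) == len(c1) and len(set(c2)) == len(c2):
            return c1, c2                # both children are duplicate-free
    return p1[:], p2[:]                  # no valid cut found: fall back to parents
```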
H. Mutation
Mutation is performed using the random resetting method with a predefined mutation rate pm. A random number rand is generated, & mutation is performed only if rand is less than the mutation rate; otherwise no mutation is performed. So, the actual number of genes to be mutated is not fixed: for a chromosome of length L, on average L*pm genes will be mutated [13]. The method ensures that there are no duplicate genes in the chromosome after mutation.
Fig.7. Mutation
Mutation Algorithm:
1. for i=1 to L do
a. Generate rand
b. if rand < pm then
i. Generate a random gene value
ii. Mutate gene[i] of the selected chromosome with the random gene value generated in the previous
step
iii. if the newly added gene is a duplicate then repeat steps i and ii
2. Repeat Step 1 for all the chromosomes until the end of the population.
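The random resetting mutation above, with its no-duplicates guarantee, can be sketched as follows. The sketch assumes, as in the chromosome representation, that genes are view indices in 1..tot_gene.

```python
import random

def mutate(chromosome, tot_gene, pm=0.05):
    """Random resetting: each gene is replaced, with probability pm, by a view
    index not already present in the chromosome (so no duplicates arise)."""
    genes = chromosome[:]
    for i in range(len(genes)):
        if random.random() < pm:
            # candidate replacement values: indices absent from the chromosome
            choices = [g for g in range(1, tot_gene + 1) if g not in genes]
            if choices:
                genes[i] = random.choice(choices)
    return genes
```

On average L*pm genes of a length-L chromosome are replaced, matching the expectation stated above.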
I. Replacement
After the generation of the new population, we replace the old population with the new one. Then we perform the entire cycle of fitness value calculation, selection, crossover, mutation, and replacement for a pre-specified number of generations.
V. Result & Analysis
The GA based view selection algorithm was implemented using JDK 1.7.0_55 in a Windows 7 (x64) environment. The algorithm was successfully tested for a 3- & a 4-dimensional lattice of views. The result for selecting the top-5 views for materialization from a 3-dimensional lattice is shown in Fig. 8. From the figure it can be observed that the matrix of top-T views contains distinct views in each row of the matrix. Duplicate views
are eliminated since the data warehouse does not require redundant data for recovery. TFV1 is the first part of
the fitness function that calculates the total frequency value for only materialized views. TFV2 is the second part
of the fitness function that calculates the total frequency value for non-materialized views. These two are added
to give the TFV for each chromosome.
Fig.8. Result for selecting top-5 Views from a 3 Dimensional Lattice
The size of the adjacency matrix increases exponentially with the number of lattice dimensions; hence, with the increase in dimensions of the lattice, the complexity increases. When we calculate the attribute frequencies for a non-materialized view, we use the frequency of attributes of the maximum-frequency materialized ancestor of that non-materialized view, because the non-materialized view is dependent on that ancestor. This means that all the queries for the non-materialized view can be answered using its materialized ancestor, without requiring any other views to be materialized. But if the root view itself is not materialized, then we use a frequency value of zero in the calculation, because there are no other views in the lattice which can give answers
to all the queries that can be made to the root view as no other view but the base or root view contains all the
attributes.
A frequency value of zero is also used when a non-materialized view has no ancestors. This is because
only an ancestor view that the child view is dependent on or the child view itself can give answers to all the
queries that can be made to the child view.
Fig.9. Comparison of FA and PA – Fitness vs. Dimensions for different pc's and pm's
Our proposed GA based view selection algorithm using frequency of attribute (FA) and another GA
based view selection algorithm using size of attributes (PA) are compared. The comparison is carried out on the fitness value of the views selected by the two algorithms. The experiments were performed for selecting the top-T views for materialization for dimensions 2 to 5 over 1000 generations. First, the graphs showing the fitness value for different crossover and mutation probabilities for selecting the top-6 views were plotted and compared with the fitness value of selecting the top-6 views using PA. These graphs, plotted for the pairs of crossover and mutation probabilities (0.5, 0.05), (0.6, 0.05), (0.5, 0.1), (0.6, 0.1), (0.55, 0.1), (0.55, 0.05), (0.65, 0.1), (0.65, 0.05), are shown in Fig. 9. The graphs show that FA is able to select fitter views than PA for the different crossover and mutation probabilities. This difference in fitness value becomes significant for higher dimensions. Further, it can be observed from the graphs that, for crossover probability 0.6 and mutation probability 0.05, FA obtains the best fitness value across all dimensions.
VI. Conclusion
Here we have used Genetic Algorithms & a fitness function to evaluate the fitness values of the possible solutions. Genetic Algorithms are comparatively easy to understand & do not require much mathematical knowledge. Analysis of the state of the art of view selection has shown that there is very little work on view selection in distributed databases and data warehouses; one of the challenging directions of future work is addressing the view selection problem in a distributed setting. Randomized algorithms can be applied to complex problems dealing with large or even unlimited search spaces. Using a GA for selecting views enables both exploration and exploitation of the search space. As a result, the views so selected are likely to have a higher TFV. Further, experimental results show that the views selected using the proposed GA-based algorithm have a
comparatively higher frequency value than those selected using PA for the observed crossover and mutation probabilities. That is, the GA based algorithm using frequency is able to select views of comparatively better quality. This in turn results in reduced query response time, enabling efficient decision making. Thus, the use of randomized algorithms such as GAs can be considered in solving large combinatorial problems such as the view selection problem.
References
[1]. Vijay Kumar, T.V., Kumar, S.: Materialized View Selection using Genetic Algorithm, Communications in Computer and
Information Science (CCIS), Volume 306, Springer Verlag, pp. 225-237, 2012.
[2]. Chirkova, R., Halevy, A. Y., Suciu, D.: A formal perspective on the view selection problem, 27th VLDB Conference, Italy, 2001.
[3]. Goldberg, D. E.: Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[4]. Harinarayan, V., Rajaraman, A., Ullman, J. D.: Implementing data cubes efficiently, ACM SIGMOD International Conference on
Management of Data, pages 205-227, 1996.
[5]. Horng, J.T., Chang, Y.J., Liu, B.J., Kao, C.Y.: Materialized view selection using genetic algorithms in a data warehouse system, in
Proceedings of the World Congress on Evolutionary Computation, Washington, pp. 2221–2227, 1999.
[6]. Horng, J.T., Chang, Y.J., Liu, B.J.: Applying evolutionary algorithms to materialized view selection in a data warehouse, Soft
Computing.2003, Vol.7, pp.574–581, 2003.
[7]. Lee, M., Hammer, J.: Speeding Up Materialized View Selection in Data Warehouses Using A Randomized Algorithm, International Journal of Cooperative Information Systems, Vol. 10, No. 3, pp. 327-353, 2001.
[8]. Lin, W.Y., Kuo, I.C.: A genetic selection algorithm for OLAP data cubes, Knowledge and Information Systems 6 (1) pp. 83–102,
2004.
[9]. Wang, Z., Zhang D.: Optimal Genetic View Selection Algorithm Under Space Constraint, In International Journal of Information
Technology, vol. 11, no. 5, pp. 44 - 51, 2005.
[10]. Yu, J. X., Yao, Y. X., Choi, C., Gou, G.: Materialized view selection as constrained evolutionary optimization, in the Journal of
IEEE Transactions on Systems, Man, and Cybernetics - TSMC, vol. 33, no. 4, pp. 458-467, 2003.
[11]. Mohania, M., Samtani, S., Roddick, J. and Kambayshi, Y.: Advances and Research Directions in Data Warehousing Technology,
Australian Journal of Information Systems, Vol 7, No 1; 1999.
[12]. Sastry, K., Goldberg, D., Kendall, G.: Search Methodologies, Introductory Tutorials in Optimization and Decision Support
Techniques, Springer US, 2005.
[13]. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing, Springer-Verlag, pages 42-43, 2003.