Variable selection is an important technique for reducing the dimensionality of data, frequently used in data preprocessing for data mining. This paper presents a new variable selection algorithm that combines heuristic variable selection (HVS) with Minimum Redundancy Maximum Relevance (MRMR). We enhance the HVS method for variable selection by incorporating the MRMR filter. Our algorithm follows a wrapper approach using a multi-layer perceptron; we call it the HVS-MRMR wrapper for variable selection. The relevance of a set of variables is measured by a convex combination of the relevance given by the HVS criterion and the MRMR criterion. This approach selects new relevant variables. We evaluate the performance of HVS-MRMR on eight benchmark classification problems. The experimental results show that HVS-MRMR selects fewer variables with higher classification accuracy than MRMR, HVS, and no variable selection on most datasets. HVS-MRMR can be applied to various classification problems that require high classification accuracy.
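The convex combination described above can be sketched as follows; this is an illustrative reconstruction, not the authors' code, and the `hvs` relevance values (which the paper derives from the trained multi-layer perceptron) are assumed to be supplied:

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information between two discrete sequences, in nats."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def hvs_mrmr_score(candidate, selected, features, labels, hvs, alpha=0.5):
    """Convex combination of HVS relevance and the MRMR score
    (relevance to the class minus mean redundancy with the already
    selected variables) for one candidate variable."""
    relevance = mutual_information(features[candidate], labels)
    redundancy = (sum(mutual_information(features[candidate], features[s])
                      for s in selected) / len(selected)) if selected else 0.0
    return alpha * hvs[candidate] + (1 - alpha) * (relevance - redundancy)
```

With `alpha = 1` the score reduces to pure HVS; with `alpha = 0` it reduces to pure MRMR.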
Constructing a classification model for a particular task is important in machine learning. A classification process involves assigning objects to predefined groups or classes based on a number of observed attributes related to those objects. The artificial neural network is one such classification algorithm and can be used in many application areas. This paper investigates the potential of applying the feed-forward neural network architecture to the classification of medical datasets. The migration-based differential evolution algorithm (MBDE) is chosen and applied to a feed-forward neural network to enhance the learning process, and the network learning is validated in terms of convergence rate and classification accuracy. In this paper, the MBDE algorithm with various migration policies is proposed for classification problems in medical diagnosis.
An Heterogeneous Population-Based Genetic Algorithm for Data Clustering (ijeei-iaes)
As a primary data mining method for knowledge discovery, clustering is a technique for partitioning a dataset into groups of similar objects. The most popular method for data clustering, K-means, suffers from the drawback of requiring the number of clusters and their initial centers to be provided by the user. In the literature, several methods have been proposed, in the form of k-means variants, genetic algorithms, or combinations of the two, for calculating the number of clusters and finding proper cluster centers. However, none of these solutions has provided satisfactory results, and determining the number of clusters and the initial centers is still the main challenge in clustering. In this paper we present an approach that automatically generates these parameters to achieve optimal clusters, using a modified genetic algorithm that operates on varied individual structures and uses a new crossover operator. Experimental results show that our modified genetic algorithm is a more efficient alternative to the existing approaches.
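A minimal sketch of the kind of fitness evaluation such a genetic algorithm could use, where a chromosome is a list of candidate cluster centers and its length encodes the number of clusters; the names and the 1-D setting are illustrative assumptions, not the paper's actual design:

```python
def assign(points, centers):
    """Assign each 1-D point to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: (p - centers[c]) ** 2)
            for p in points]

def fitness(points, chromosome):
    """Chromosome = candidate center list; its length encodes k.
    Lower within-cluster SSE gives higher fitness."""
    labels = assign(points, chromosome)
    sse = sum((p - chromosome[l]) ** 2 for p, l in zip(points, labels))
    return 1.0 / (1.0 + sse)
```

Chromosomes of different lengths can then compete directly, so the number of clusters is evolved rather than fixed in advance.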
The pertinent single-attribute-based classifier for small datasets classification (IJECEIAES)
Classifying a dataset using machine learning algorithms can be a big challenge when the target is a small dataset. The OneR classifier can be used in such cases due to its simplicity and efficiency. In this paper, we reveal the power of a single attribute by introducing the pertinent single-attribute-based heterogeneity-ratio classifier (SAB-HR), which uses a pertinent attribute to classify small datasets. SAB-HR applies a feature selection method that uses the Heterogeneity-Ratio (H-Ratio) measure to identify the most homogeneous attribute among the attributes in the set. Our empirical results on 12 benchmark datasets from the UCI machine learning repository show that the SAB-HR classifier significantly outperforms the classical OneR classifier on small datasets. In addition, using the H-Ratio as the criterion for selecting the single attribute is more effectual than traditional criteria such as Information Gain (IG) and Gain Ratio (GR).
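For context, the classical OneR baseline mentioned above can be sketched as below; SAB-HR differs in that it picks the attribute by the H-Ratio measure (whose formula is not reproduced here) rather than by training error:

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Pick the single attribute whose one-rule makes the fewest
    training errors. rows: list of attribute tuples.
    Returns (best attribute index, value -> predicted class rule)."""
    best = None
    for a in range(len(rows[0])):
        by_value = defaultdict(list)
        for row, y in zip(rows, labels):
            by_value[row[a]].append(y)
        rule, errors = {}, 0
        for value, ys in by_value.items():
            majority, count = Counter(ys).most_common(1)[0]
            rule[value] = majority
            errors += len(ys) - count          # non-majority rows are errors
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    return best[1], best[2]
```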
Feature selection is one of the most fundamental steps in machine learning. It is closely related to dimensionality reduction. A commonly used approach in feature selection is ranking the individual features according to some criterion and then searching for an optimal feature subset based on an evaluation criterion that tests optimality. The objective of this work is to predict more accurately the presence of Learning Disability (LD) in school-aged children using a reduced number of symptoms. For this purpose, a novel hybrid feature selection approach is proposed that integrates a popular rough-set-based feature ranking process with a modified backward feature elimination algorithm. The feature ranking process calculates the significance, or priority, of each symptom of LD according to its contribution in representing the knowledge contained in the dataset. Each symptom's significance value reflects its relative importance in predicting LD among the various cases. Then, by eliminating the least significant features one by one and evaluating the feature subset at each stage of the process, an optimal feature subset is generated. For comparative analysis, and to establish the importance of rough set theory in feature selection, the backward feature elimination algorithm is also combined with two state-of-the-art filter-based feature ranking techniques, viz. information gain and gain ratio. The experimental results show that the proposed feature selection approach outperforms the other two in terms of data reduction. The proposed method also eliminates all the redundant attributes from the LD dataset efficiently, without sacrificing classification performance.
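The elimination loop described above can be sketched generically; `ranking` stands in for the rough-set significance values and `evaluate` for the classifier-accuracy check, both hypothetical names used only for illustration:

```python
def backward_elimination(features, ranking, evaluate, tolerance=0.0):
    """Repeatedly drop the least-significant feature (per `ranking`)
    while accuracy from `evaluate` stays within `tolerance` of the
    full-set baseline."""
    subset = sorted(features, key=lambda f: ranking[f], reverse=True)
    baseline = evaluate(subset)
    while len(subset) > 1:
        candidate = subset[:-1]          # remove the lowest-ranked feature
        if evaluate(candidate) >= baseline - tolerance:
            subset = candidate
        else:
            break                        # accuracy dropped; keep current set
    return subset
```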
A Combined Approach for Feature Subset Selection and Size Reduction for High-Dimensional Data (IJERA Editor)
Selection of relevant features from a given set of features is one of the important issues in the fields of data mining and classification. In general, a dataset may contain many features, but not all of them are important for a particular analysis or decision, because features may share common information or be completely irrelevant to the processing at hand. This generally happens because of improper selection of features during dataset formation or because of limited information about the observed system. In both cases the data will contain features that merely increase the processing burden, which may ultimately lead to improper outcomes when used for analysis. For these reasons, methods are required to detect and remove such features; hence, in this paper we present an efficient approach that not only removes unimportant features but also reduces the overall size of the dataset. The proposed algorithm uses information theory to compute the information gain of each feature and a minimum spanning tree to group similar features; in addition, fuzzy c-means clustering is used to remove similar entries from the dataset. Finally, the algorithm is tested with an SVM classifier on 35 publicly available real-world high-dimensional datasets, and the results show that the presented algorithm not only reduces the feature set and data length but also improves the performance of the classifier.
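The minimum-spanning-tree grouping step can be sketched as follows, assuming a precomputed feature-distance matrix (e.g. one minus absolute correlation); this is an illustrative reconstruction, not the paper's implementation:

```python
def mst_feature_groups(dist, threshold):
    """Build an MST over a feature-distance matrix with Prim's algorithm,
    then cut edges longer than `threshold`; the resulting connected
    components are the groups of similar features."""
    n = len(dist)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        u, v = min(((a, b) for a in in_tree for b in range(n)
                    if b not in in_tree), key=lambda e: dist[e[0]][e[1]])
        edges.append((u, v))
        in_tree.add(v)
    # union-find over the kept (short) edges
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for u, v in edges:
        if dist[u][v] <= threshold:
            parent[find(u)] = find(v)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

One representative per group can then be kept, discarding the rest as redundant.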
A hybrid feature selection based on mutual information and genetic algorithm (IAESIJEECS)
Feature selection aims to choose an optimal subset of features that is necessary and sufficient to improve the generalization performance and the running efficiency of the learning algorithm. To obtain the optimal subset in the feature selection process, a hybrid feature selection method based on mutual information and a genetic algorithm is proposed in this paper. In order to make full use of the advantages of the filter and wrapper models, the algorithm is divided into two phases: the filter phase and the wrapper phase. In the filter phase, the algorithm first uses mutual information to rank the features, providing heuristic information for the subsequent genetic algorithm and accelerating its search process. In the wrapper phase, the algorithm uses the genetic algorithm as the search strategy and searches for the best subset of features, with the performance of the classifier and the dimension of the subset as the evaluation criterion. Experimental results on benchmark datasets show that the proposed algorithm achieves higher classification accuracy and smaller feature dimension, and its running time is less than that of the genetic algorithm alone.
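The filter phase can be sketched as follows: features are ranked by mutual information with the class, and the ranking biases the genetic algorithm's initial population of bitstrings; the bias constants here are illustrative assumptions, not the paper's values:

```python
import math
import random
from collections import Counter

def mi(x, y):
    """Mutual information between two discrete sequences (nats)."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def seed_population(features, labels, pop_size, rng=random.Random(0)):
    """Filter phase: rank features by MI with the class, then bias the
    GA's initial bitstrings so higher-MI features are included more often."""
    scores = [mi(col, labels) for col in features]
    top = max(scores) or 1.0
    probs = [0.2 + 0.6 * s / top for s in scores]   # heuristic inclusion bias
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)]
```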
ON FEATURE SELECTION ALGORITHMS AND FEATURE SELECTION STABILITY MEASURES: A C... (ijcsit)
Data mining is indispensable for business organizations, extracting useful information from the huge volume of stored data that can be used in managerial decision making to survive in the competition. Due to day-to-day advancements in information and communication technology, the data collected from e-commerce and e-governance are mostly high dimensional. Data mining algorithms cope better with small datasets than with high-dimensional ones, and feature selection is an important dimensionality reduction technique. The subsets selected in subsequent iterations of feature selection should be the same or similar even under small perturbations of the dataset; this property is called selection stability, and it has recently become an important topic in the research community. Selection stability has been quantified by various measures. This paper analyses the choice of a suitable search method and stability measure for feature selection algorithms, as well as the influence of the characteristics of the dataset, since the choice of the best approach is highly problem dependent.
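One widely used stability measure of the kind surveyed here is the average pairwise Jaccard similarity between the subsets selected on perturbed copies of the data; a minimal sketch:

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def selection_stability(subsets):
    """Average pairwise Jaccard similarity of the feature subsets
    selected across perturbed runs; 1.0 means perfectly stable."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```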
Multimodal authentication is one of the prime concepts in current real-world applications, and various approaches have been proposed in this area. In this paper, an intuitive strategy is proposed as a framework for providing a more secure key in the biometric security context. Initially, features are extracted through PCA by SVD from the chosen biometric patterns; then key components are extracted using the LU factorization technique, selected with different key sizes, and combined using a convolution kernel method (Exponential Kronecker Product, eKP) as a Context-Sensitive Exponent Associative Memory model (CSEAM). The verification process is carried out in a similar way and verified using the MSE measure. This model gives a better outcome when compared with SVD factorization [1] for feature selection. The process is computed for different key sizes and the results are presented.
MULTI-PARAMETER BASED PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMS (ijcsit)
Diabetes is amongst the most common diseases in India. It affects patients' health and also leads to other chronic diseases. Prediction of diabetes plays a significant role in saving lives and cost, but predicting diabetes in the human body is a challenging task because it depends on several factors. Few studies have reported the performance of classification algorithms beyond accuracy alone; the results in these studies are difficult and complex for medical practitioners to understand, and they also lack visual aids, as they are presented in pure text format. This survey uses the ROC and PRC graphical measures to improve the understanding of results. A detailed parameter-wise discussion of the comparison is also presented, which is lacking in other reported surveys. Execution time, accuracy, TP rate, FP rate, precision, recall, and F-measure are used for comparative analysis, and a confusion matrix is prepared for a quick review of each algorithm. Ten-fold cross-validation is used to estimate the prediction models. Different sets of classification algorithms are analyzed on a diabetes dataset acquired from the UCI repository.
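The parameters listed above follow directly from the binary confusion matrix; a small sketch of the computations:

```python
def metrics(tp, fp, fn, tn):
    """Per-class measures derived from a binary confusion matrix."""
    tp_rate = tp / (tp + fn)            # recall / sensitivity
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {"TP Rate": tp_rate, "FP Rate": fp_rate,
            "Precision": precision, "Recall": tp_rate,
            "F Measure": f_measure}
```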
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensembles (IJECEIAES)
Data analysis plays a prominent role in interpreting various phenomena, and data mining is the process of extracting useful knowledge from extensive data. Building on classical statistical models, the data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation conducted with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles are a blend of individual solutions obtained from different clusterings, combined to produce a final high-quality clustering, as required in a wide range of applications; the method arises from the need to increase robustness, scalability, and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles, and surveys and analyzes the various techniques and cluster ensemble methods.
COMPARISON OF HIERARCHICAL AGGLOMERATIVE ALGORITHMS FOR CLUSTERING MEDICAL DOCUMENTS (ijseajournal)
The extensive amount of data stored in medical documents requires developing methods that help users to find
what they are looking for effectively by organizing large amounts of information into a small number of
meaningful clusters. The produced clusters contain groups of objects which are more similar to each other
than to the members of any other group. Thus, the aim of high-quality document clustering algorithms is to
determine a set of clusters in which the inter-cluster similarity is minimized and intra-cluster similarity is
maximized. The most important feature in many clustering algorithms is treating the clustering problem as
an optimization process, that is, maximizing or minimizing a particular clustering criterion function
defined over the whole clustering solution. The only real difference between agglomerative algorithms is
how they choose which clusters to merge. The main purpose of this paper is to compare different
agglomerative algorithms based on the evaluation of the clusters quality produced by different hierarchical
agglomerative clustering algorithms using different criterion functions for the problem of clustering
medical documents. Our experimental results showed that the agglomerative algorithm that uses I1 as its criterion function for choosing which clusters to merge produced better cluster quality than the other criterion functions in terms of entropy and purity as external measures.
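The external measures used in the comparison, entropy and purity, can be sketched as follows, with each produced cluster represented by the list of true class labels of its documents:

```python
import math
from collections import Counter

def purity(clusters):
    """Fraction of documents belonging to their cluster's majority class
    (higher is better)."""
    n = sum(len(c) for c in clusters)
    return sum(Counter(c).most_common(1)[0][1] for c in clusters) / n

def entropy(clusters):
    """Size-weighted average class entropy of the clusters, in bits
    (lower is better)."""
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        h = -sum((v / len(c)) * math.log2(v / len(c))
                 for v in Counter(c).values())
        total += len(c) / n * h
    return total
```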
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithm (aciijournal)
In this paper we attempt to solve the automatic clustering problem by concurrently optimizing multiple objectives, namely automatic determination of k and a set of cluster validity indices (CVIs). The proposed automatic clustering technique uses the recent optimization algorithm Jaya as its underlying optimization strategy. This evolutionary technique always aims to attain a global best solution rather than a local best, even on larger datasets. The exploration and exploitation imposed on the proposed work enable it to detect the number of clusters automatically, discover the appropriate partitioning present in the data sets, and reach near-optimal values of the CVIs. Twelve datasets of different intricacy are used to validate the performance of the proposed algorithm. The experiments show that the theoretical advantages of multi-objective clustering optimized with evolutionary approaches translate into realistic and scalable performance gains.
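For reference, the core Jaya update moves each candidate toward the current best solution and away from the worst, keeping the move only if it improves the objective; a minimal real-valued sketch (the clustering-specific encoding used in the paper is not reproduced here):

```python
import random

def jaya_step(pop, objective, rng=random.Random(1)):
    """One Jaya iteration (minimization) on real-valued vectors:
    x' = x + r1 * (best - |x|) - r2 * (worst - |x|), accepted greedily."""
    best = min(pop, key=objective)
    worst = max(pop, key=objective)
    new_pop = []
    for x in pop:
        cand = [xi + rng.random() * (bi - abs(xi))
                   - rng.random() * (wi - abs(xi))
                for xi, bi, wi in zip(x, best, worst)]
        # keep the move only if it improves the objective
        new_pop.append(cand if objective(cand) < objective(x) else x)
    return new_pop
```

Jaya is parameter-free apart from population size and iteration count, which is part of its appeal here.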
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study (IJMER)
Feature selection is one of the most common and critical tasks in database classification. It reduces the computational cost by removing insignificant and unwanted features, and consequently makes the diagnosis process accurate and comprehensible. This paper presents a measurement of feature relevance based on fuzzy entropy, tested with a Radial Basis Function (RBF) network, Bagging (Bootstrap Aggregating), Boosting, and Stacking on datasets from various fields. Twenty benchmark datasets available in the UCI Machine Learning Repository and KDD have been used for this work. The accuracy obtained from these classification processes shows that the proposed method is capable of producing good and accurate results with fewer features than the original datasets.
metaSEM: An R Package for Meta-Analysis Using Structural Equation Modelling (Pubrica)
This presentation explains metaSEM, an R package for meta-analysis using structural equation modelling:
1. SEM provides the meta-analytical model used for conducting meta-analysis, which analyses structural relationships; it supports univariate, multivariate, and three-level meta-analysis.
2. Structural equation models (SEMs) are generally optimized and fitted using the OpenMx package.
3. Routine analyses can be run in batch mode, either interactively or non-interactively, from R; a graphical interface such as RStudio is a convenient way for users to interact with the analysis.
Comparative study of various supervised classification methods for analysing def... (eSAT Publishing House)
Published in IJRET: International Journal of Research in Engineering and Technology (eSAT Publishing House).
Automatic Feature Subset Selection using Genetic Algorithm for Clustering (idescitation)
Feature subset selection is the process of selecting a minimal subset of relevant features and is a pre-processing technique for a wide variety of applications. High-dimensional data clustering is a challenging task in data mining, and a reduced set of features helps to make the patterns easier to understand; reduced feature sets are more significant when they are application specific. Almost all existing feature subset selection algorithms are neither automatic nor application specific. This paper attempts to find the feature subset that yields optimal clusters during clustering. The proposed Automatic Feature Subset Selection using Genetic Algorithm (AFSGA) identifies the required features automatically and reduces the computational cost of determining good clusters. The performance of AFSGA is tested on public and synthetic datasets of varying dimensionality, and experimental results show the improved efficacy of the algorithm in terms of cluster quality and computational cost.
Comparison of methods for combination of multiple classifiers that predict b... (IJERA Editor)
Predictive analysis includes techniques from data mining that analyze current and historical data and make predictions about the future. Predictive analytics is used in actuarial science, financial services, retail, travel, healthcare, insurance, pharmaceuticals, marketing, telecommunications, and other fields. Predicting patterns can be treated as a classification problem, and combining different classifiers gives better results. We study and compare three methods for combining multiple classifiers. The naïve Bayesian network performs classification based on conditional probability; it is efficient and easy to interpret, as it assumes that the predictors are independent. Tree augmented naïve Bayes (TAN) performs classification by constructing a maximum weighted spanning tree that maximizes the likelihood of the training data; this tree structure removes the independent-attribute assumption of naïve Bayesian networks. The behavior-knowledge space method works in two phases and can provide very good performance when large and representative data sets are available.
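The behavior-knowledge space method can be sketched as a lookup table indexed by the tuple of individual classifier decisions, each cell storing the most frequent true class observed for that tuple during training:

```python
from collections import Counter, defaultdict

def train_bks(decisions, labels):
    """Phase 1: fill the behavior-knowledge space. One cell per tuple of
    classifier outputs, storing the majority true class for that cell."""
    cells = defaultdict(list)
    for d, y in zip(decisions, labels):
        cells[tuple(d)].append(y)
    return {cell: Counter(ys).most_common(1)[0][0]
            for cell, ys in cells.items()}

def predict_bks(table, decision, default=None):
    """Phase 2: look up the cell for a new decision tuple; fall back to
    `default` for cells never seen in training."""
    return table.get(tuple(decision), default)
```

The need for large, representative training data follows directly: the table has one cell per possible decision tuple, and unfilled cells force the fallback.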
A Mathematical Programming Approach for Selection of Variables in Cluster Analysis (IJRES Journal)
Data clustering is a common technique for statistical data analysis; it is defined as a class of statistical techniques for classifying a set of observations into distinct groups. Cluster analysis seeks to minimize within-group variance and maximize between-group variance. In this study we formulate a mathematical programming model that chooses the most important variables in cluster analysis. A nonlinear binary model is suggested to select the most important variables in clustering a set of data. The idea of the suggested model is to cluster the data by minimizing the distance between observations within groups, with indicator variables used to select the most important variables in the cluster analysis.
QUANTUM CLUSTERING-BASED FEATURE SUBSET SELECTION FOR MAMMOGRAPHIC I... (ijcsit)
In this paper, we present an algorithm for feature selection. This algorithm, labelled QC-FS (Quantum Clustering for Feature Selection), performs the selection in two steps. First, the original feature space is partitioned into groups of similar features using the Quantum Clustering algorithm. Then a representative for each cluster is selected, using similarity measures such as the correlation coefficient (CC) and mutual information (MI): the feature that maximizes this measure is chosen by the algorithm.
Data mining is indispensable for business organizations for extracting useful information from the huge volume of stored data which can be used in managerial decision making to survive in the competition. Due to the day-to-day advancements in information and communication technology, these data collected from ecommerce and e-governance are mostly high dimensional. Data mining prefers small datasets than high dimensional datasets. Feature selection is an important dimensionality reduction technique. The subsets selected in subsequent iterations by feature selection should be same or similar even in case of small perturbations of the dataset and is called as selection stability. It is recently becomes important topic of research community. The selection stability has been measured by various measures. This paper analyses the selection of the suitable search method and stability measure for the feature selection algorithms and also the influence of the characteristics of the dataset as the choice of the best approach is highly problem dependent.
Data mining is indispensable for business organizations: it extracts useful information from the huge
volume of stored data, which can be used in managerial decision making to survive the competition. Due
to day-to-day advancements in information and communication technology, the data collected from e-
commerce and e-governance are mostly high dimensional. Data mining prefers small datasets to high-
dimensional ones. Feature selection is an important dimensionality reduction technique. The subsets
selected in subsequent iterations of feature selection should be the same, or similar, even under small
perturbations of the dataset; this property is called selection stability, and it has recently become an
important topic in the research community. Selection stability has been quantified with various measures.
This paper analyses the choice of a suitable search method and stability measure for feature selection
algorithms, as well as the influence of the characteristics of the dataset, since the best approach is
highly problem dependent.
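One common family of stability measures can be sketched as the average pairwise similarity of the feature subsets selected across repeated runs. The sketch below uses Jaccard similarity; the paper compares several such measures, so this is illustrative rather than the paper's specific choice.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def stability(subsets):
    """Average pairwise Jaccard similarity of the subsets selected
    across runs; 1.0 means perfectly stable selection."""
    pairs = [(i, j) for i in range(len(subsets))
             for j in range(i + 1, len(subsets))]
    return sum(jaccard(subsets[i], subsets[j]) for i, j in pairs) / len(pairs)
```

Feeding in the subsets from each perturbed run of a selector yields a single stability score in [0, 1].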
A Threshold fuzzy entropy based feature selection method applied in various b...IJMER
Large amounts of data have been stored and manipulated using various database
technologies, and processing all attributes for a particular purpose is a difficult task. To avoid such
difficulties, feature selection is applied. In this paper, we collect eight benchmark
datasets from the UCI repository. Feature selection is carried out using a fuzzy entropy based
relevance measure and follows three selection strategies: the mean selection strategy, the half
selection strategy, and a neural network for threshold selection. After the features are selected,
they are evaluated using Radial Basis Function (RBF) network, Stacking, Bagging, AdaBoostM1 and Ant-miner
classification methodologies. The test results show that the neural network threshold selection
strategy works well in selecting features and that the Ant-miner methodology performs best, yielding better
accuracy with the selected features than with the original dataset. The results of this
experiment clearly show the superiority of Ant-miner over the other classifiers. Thus, the proposed Ant-miner
algorithm could be a suitable method for producing good results with fewer features than
the original datasets.
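Two of the three selection strategies named above are simple to state once per-feature relevance scores exist; a sketch under that assumption (the neural-network threshold strategy is omitted, and the score dictionary is hypothetical):

```python
def mean_selection(relevance):
    """Mean selection strategy: keep every feature whose relevance
    score is at least the mean score over all features."""
    threshold = sum(relevance.values()) / len(relevance)
    return {f for f, r in relevance.items() if r >= threshold}

def half_selection(relevance):
    """Half selection strategy: keep the better-scoring half of the features."""
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    return set(ranked[:len(ranked) // 2])
```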
At present, a huge amount of data is generated every minute and transferred frequently. Although
data is sometimes static, most commonly it is dynamic and transactional: newly generated data is
constantly added to the existing data. To discover knowledge from such incremental data, one
approach is to rerun the algorithm on the modified dataset each time, which is time
consuming. Moreover, proper analysis of the data requires an efficient classifier model, whose
objective is to classify unlabeled data into the appropriate classes. This
paper proposes a dimension reduction algorithm that can be applied in a dynamic environment to
generate a reduced attribute set as a dynamic reduct, and an optimization algorithm that uses the
reduct to build the corresponding classification system. The method analyzes new data when it
becomes available and modifies the reduct to fit the entire dataset, from which
interesting optimal classification rule sets are generated. The concepts of discernibility relation,
attribute dependency and attribute significance from Rough Set Theory are integrated for the generation of
the dynamic reduct set, and optimal classification rules are selected using a PSO method, which not only
reduces complexity but also helps achieve higher accuracy of the decision system. The proposed
method has been applied to benchmark datasets collected from the UCI repository; dynamic
reducts are computed and optimal classification rules are generated from them. Experimental
results show the efficiency of the proposed method.
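The rough-set attribute dependency mentioned above has a compact definition: the fraction of objects whose equivalence class under the chosen attributes is consistent on the decision attribute. A minimal sketch, assuming the decision table is held as a list of dictionaries (a layout chosen here for illustration):

```python
def dependency(table, attrs, decision):
    """Rough-set degree of dependency gamma_B(D): the fraction of objects
    whose equivalence class under attributes `attrs` maps to a single
    decision value (the positive region)."""
    classes = {}
    for row in table:
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(row[decision])
    pos = sum(1 for row in table
              if len(classes[tuple(row[a] for a in attrs)]) == 1)
    return pos / len(table)
```

A dependency of 1.0 means the attribute set fully determines the decision, which is the condition a reduct must preserve.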
Multimodal authentication is a prime concept in current real-world applications, and various
approaches have been proposed. In this paper, an intuitive strategy is proposed as a
framework for providing a more secure key in biometric security. Initially, features are
extracted through PCA by SVD from the chosen biometric patterns; key components are then extracted
using the LU factorization technique, selected at different key sizes, and combined
using a convolution kernel method (Exponential Kronecker Product, eKP) within a Context-Sensitive
Exponent Associative Memory model (CSEAM). Verification proceeds in a similar way and is
assessed with the MSE measure. This model gives better outcomes when compared with SVD
factorization [1] as feature selection. The process is computed for different key sizes and the results
are presented.
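The exact eKP operator is not detailed in the abstract; its building block, the ordinary Kronecker product of two key matrices, can be sketched for matrices held as nested lists (a generic illustration, not the authors' construction):

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of lists:
    each entry A[i][j] is replaced by the block A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]
```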
MULTI-PARAMETER BASED PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMSijcsit
Diabetes is amongst the most common diseases in India. It affects patients' health and also leads to
other chronic diseases. Prediction of diabetes plays a significant role in saving lives and cost. Predicting
diabetes in the human body is a challenging task because it depends on several factors. A few studies have reported the performance of classification algorithms in terms of accuracy; their results are difficult for medical practitioners to interpret and lack visual aids, being presented in pure text format. This survey uses ROC and PRC graphical measures to improve the understanding of results, and presents a detailed parameter-wise comparison that other reported surveys lack. Execution time, Accuracy, TP Rate, FP Rate, Precision, Recall and F-Measure are used for comparative analysis, and a Confusion Matrix is prepared for a quick review of each
algorithm. Tenfold cross validation is used for estimating the prediction model. Different sets of
classification algorithms are analyzed on a diabetes dataset acquired from the UCI repository.
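The parameters listed above all derive from the binary confusion matrix; a small sketch of those derivations:

```python
def metrics(tp, fp, fn, tn):
    """Standard parameters derived from a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    tp_rate = tp / (tp + fn)      # a.k.a. recall / sensitivity
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    return {"accuracy": accuracy, "tp_rate": tp_rate, "fp_rate": fp_rate,
            "precision": precision, "recall": tp_rate, "f_measure": f_measure}
```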
Extensive Analysis on Generation and Consensus Mechanisms of Clustering Ensem...IJECEIAES
Data analysis plays a prominent role in interpreting various phenomena, and data mining is the process of extracting useful knowledge from extensive data. Based on classical statistical prototypes, the data can be exploited beyond mere storage and management. Cluster analysis, a primary investigation with little or no prior knowledge, spans research and development across a wide variety of communities. Cluster ensembles are a melange of individual solutions obtained from different clusterings, combined to produce a final high-quality clustering, which is required in many applications; the approach arises from the need for greater robustness, scalability and accuracy. This paper gives a brief overview of the generation methods and consensus functions used in cluster ensembles, surveying and analyzing the various techniques and cluster ensemble methods.
COMPARISON OF HIERARCHICAL AGGLOMERATIVE ALGORITHMS FOR CLUSTERING MEDICAL DO...ijseajournal
The extensive amount of data stored in medical documents requires methods that help users find
what they are looking for effectively, by organizing large amounts of information into a small number of
meaningful clusters. The produced clusters contain groups of objects that are more similar to each other
than to the members of any other group. Thus, the aim of high-quality document clustering algorithms is to
determine a set of clusters in which inter-cluster similarity is minimized and intra-cluster similarity is
maximized. An important feature of many clustering algorithms is that they treat clustering as
an optimization process, maximizing or minimizing a particular clustering criterion function
defined over the whole clustering solution; the only real difference between agglomerative algorithms is
how they choose which clusters to merge. The main purpose of this paper is to compare
agglomerative algorithms by evaluating the quality of the clusters produced by different hierarchical
agglomerative clustering algorithms, using different criterion functions, on the problem of clustering
medical documents. Our experimental results show that the agglomerative algorithm that uses I1 as its
criterion function for choosing which clusters to merge produces better cluster quality than the other
criterion functions in terms of entropy and purity as external measures.
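The two external measures used above have standard definitions; a sketch in which each cluster is given as the list of true class labels of its members:

```python
from math import log
from collections import Counter

def purity(clusters):
    """Purity: fraction of objects carrying the majority class label of
    their cluster (higher is better)."""
    n = sum(len(c) for c in clusters)
    return sum(Counter(c).most_common(1)[0][1] for c in clusters) / n

def entropy(clusters, n_classes):
    """Size-weighted average of per-cluster label entropy, with logs in
    base n_classes (lower is better)."""
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        h = -sum((v / len(c)) * log(v / len(c), n_classes)
                 for v in Counter(c).values())
        total += len(c) / n * h
    return total
```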
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithmaciijournal
In this paper we attempt to solve an automatic clustering problem by concurrently optimizing multiple objectives: automatic determination of k and a set of cluster validity indices (CVIs). The proposed automatic clustering technique uses the recent optimization algorithm Jaya as its underlying optimization stratagem. This evolutionary technique aims to attain a global best solution rather than a local best solution on larger datasets. The exploration and exploitation imposed on the proposed work detect the number of clusters automatically, find an appropriate partitioning of the data sets, and reach near-optimal values on the CVI frontiers. Twelve datasets of differing intricacy are used to validate the performance of the proposed algorithm. The experiments show that the theoretical advantages of multi-objective clustering optimized with evolutionary approaches translate into realistic and scalable performance benefits.
A Threshold Fuzzy Entropy Based Feature Selection: Comparative StudyIJMER
Feature selection is one of the most common and critical tasks in database classification. It
reduces computational cost by removing insignificant and unwanted features, and consequently
makes the diagnosis process accurate and comprehensible. This paper presents a measurement of
feature relevance based on fuzzy entropy, tested with a Radial Basis Function (RBF) network classifier,
Bagging (Bootstrap Aggregating), Boosting and Stacking on datasets from various fields. Twenty
benchmark datasets available in the UCI Machine Learning Repository and KDD have been
used for this work. The accuracy obtained from these classification processes shows that the proposed
method is capable of producing good and accurate results with fewer features than the original
datasets.
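The abstract does not give its exact fuzzy entropy formula; the classical De Luca–Termini fuzzy entropy is the usual starting point for such relevance measures, so a sketch of that quantity is shown here as an assumption:

```python
from math import log

def fuzzy_entropy(memberships):
    """De Luca–Termini fuzzy entropy of a list of membership degrees:
    maximal when memberships are 0.5 (most ambiguous), zero when
    they are crisp 0 or 1."""
    h = 0.0
    for m in memberships:
        if 0.0 < m < 1.0:
            h -= m * log(m) + (1.0 - m) * log(1.0 - m)
    return h
```

A feature whose fuzzified values stay crisp across classes scores low entropy and hence high relevance under measures of this family.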
Metasem: An R Package For Meta-Analysis Using Structural Equation Modelling: ...Pubrica
This presentation explains metaSEM, an R package for meta-analysis using structural equation modelling (SEM):
1. SEM provides the meta-analytic models used to conduct meta-analysis and to analyse structural relationships; the models can be univariate, multivariate, or three-level.
2. The structural equation models are optimized and fitted using the OpenMx package.
3. Routine analyses can be run in batch mode, interactively or non-interactively, from R; a graphical interface such as RStudio is a convenient way for users to interact with the analysis.
Comparative study of various supervised classification methods for analysing def...eSAT Publishing House
Automatic Feature Subset Selection using Genetic Algorithm for Clusteringidescitation
Feature subset selection is the process of selecting a minimal subset of
relevant features and is a preprocessing technique for a wide variety of
applications. High-dimensional data clustering is a challenging task in
data mining. A reduced set of features makes patterns easier to understand,
and is more significant when it is application specific; yet almost all
existing feature subset selection algorithms are neither automatic nor
application specific. This paper attempts to find the feature subset that
yields optimal clusters during clustering. The proposed Automatic Feature
Subset Selection using Genetic Algorithm (AFSGA) identifies the required
features automatically and reduces the computational cost of determining
good clusters. The performance of AFSGA is tested on public and synthetic
datasets of varying dimensionality. Experimental results show the improved
efficacy of the algorithm in terms of cluster optimality and computational
cost.
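The genetic search over feature subsets can be sketched with bitmask chromosomes, truncation selection, one-point crossover and point mutation. This is a generic sketch of the idea, not the authors' exact operators, and the fitness function is supplied by the caller (in AFSGA it would score cluster quality):

```python
import random

def ga_feature_select(n_features, fitness, pop_size=20, generations=30, seed=0):
    """Minimal genetic algorithm over 0/1 feature masks.
    `fitness` scores a tuple of genes; higher is better."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            i = rng.randrange(n_features)       # point mutation
            child[i] = 1 - child[i]
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=fitness)
```

Because the top half of each generation is carried over unchanged, the best mask found never regresses.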
Comparison of methods for combination of multiple classifiers that predict b...IJERA Editor
Predictive analysis includes techniques from data mining that analyze current and historical data and make
predictions about the future. Predictive analytics is used in actuarial science, financial services, retail, travel,
healthcare, insurance, pharmaceuticals, marketing, telecommunications and other fields. Predicting patterns can
be considered a classification problem, and combining different classifiers gives better results. We
study and compare three methods used to combine multiple classifiers. Naïve Bayesian networks perform
classification based on conditional probability; they are simple and easy to interpret, since they assume that the
predictors are independent. Tree augmented naïve Bayes (TAN) constructs a maximum weighted spanning tree
that maximizes the likelihood of the training data to perform classification; this tree structure eliminates the
independence assumption of naïve Bayesian networks. The behavior-knowledge space method works in two
phases and can provide very good performance if large and representative data sets are available.
An overlapping conscious relief-based feature subset selection methodIJECEIAES
Feature selection is a fundamental preprocessing step in various data mining and machine learning tasks. Feature quality is essential for good classification performance and a better data analysis experience. Among feature selection methods, distance-based methods are gaining popularity because of their ability to capture feature interdependency and relevance to the endpoints. However, most distance-based methods only rank the features and ignore class overlapping; features with class-overlapping data are an obstacle during classification. The objective of this work is therefore to propose a method named overlapping conscious MultiSURF (OMsurf) that handles data overlap and selects a subset of informative features while discarding noisy ones. Experimental results over 20 benchmark datasets demonstrate the superiority of OMsurf over six existing state-of-the-art methods.
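The distance-based family that MultiSURF belongs to descends from the basic Relief scheme, which can be sketched briefly: each feature is rewarded for separating an instance from its nearest miss and penalised for differing from its nearest hit. This is the simple ancestor, not OMsurf itself; the numeric list-of-vectors layout is assumed for illustration.

```python
def relief(X, y):
    """Basic Relief feature weights. X is a list of numeric feature
    vectors, y the class labels; higher weight = more discriminative."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    n = len(X)
    w = [0.0] * len(X[0])
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        h = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest hit
        m = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest miss
        for f in range(len(w)):
            w[f] += abs(X[i][f] - X[m][f]) - abs(X[i][f] - X[h][f])
    return w
```

MultiSURF replaces the single nearest hit/miss with an adaptive distance threshold; the weight-update logic is the shared core.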
SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability u...cscpconf
Improving the accuracy of supervised classification algorithms in biomedical applications,
especially CADx, is an active area of research. This paper proposes the construction of rotation
forest (RF) ensembles using 20 learners over two clinical datasets, lymphography and
backache. We propose a new feature selection strategy, based on support vector machines
optimized by particle swarm optimization, to obtain a relevant and minimal feature subset and
thereby higher ensemble accuracy. We quantitatively analyze 20 base learners over the two
datasets, carry out experiments with a 10-fold cross-validation leave-one-out strategy,
and evaluate the 20 classifiers using the performance metrics accuracy
(acc), kappa value (K), root mean square error (RMSE) and area under the receiver operating
characteristic curve (ROC). The base classifiers achieved average accuracies of 79.96% and 81.71%
for the lymphography and backache datasets respectively, while the RF ensembles produced
average accuracies of 83.72% and 85.77% for the respective diseases. The paper presents promising
results using RF ensembles and provides a new direction towards the construction of reliable and robust medical diagnosis systems.
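Of the metrics listed, the kappa value is the least self-explanatory; it measures agreement beyond what chance alone would produce. A sketch of Cohen's kappa from a binary confusion matrix:

```python
def kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for the agreement
    expected by chance from the row/column marginals."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                       # observed agreement
    pe = ((tp + fp) * (tp + fn)
          + (fn + tn) * (fp + tn)) / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)
```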
An Influence of Measurement Scale of Predictor Variable on Logistic Regressio...IJECEIAES
Much real-world decision making is based on binary categories of information: agree or disagree, accept or reject, succeed or fail, and so on. Information of this kind is the output of a classification method, which is the domain of both statistics (e.g. logistic regression) and machine learning (e.g. Learning Vector Quantization, LVQ). The input to a classification method plays a crucial role in the resulting output. This paper investigates the influence of the measurement scale of the input data (interval, ratio, and nominal) on the performance of logistic regression and LVQ in classifying an object. Logistic regression modeling is carried out in several stages until a model that passes the goodness-of-fit test is obtained; LVQ modeling is tested on several codebook sizes and the most optimal LVQ model is selected. The best model from each method is then compared on object classification performance using the hit ratio. Logistic regression yields two models that pass the goodness-of-fit test, with interval- and nominal-scale predictor variables, while LVQ modeling yields three optimal models with different codebooks. On data with interval-scale predictor variables, the two methods perform equally well; both perform poorly when the predictor variables are nominal. On data with ratio-scale predictor variables, LVQ achieves moderate performance, whereas no logistic regression model passing the goodness-of-fit test is obtained. Thus, if the input dataset has interval- or ratio-scale predictor variables, the LVQ method is preferable for modeling object classification.
MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION...cscpconf
Constructing a classification model is important in machine learning for a particular task. A
classification process involves assigning objects into predefined groups or classes based on a
number of observed attributes related to those objects. The artificial neural network is one of the
classification algorithms that can be used in many application areas. This paper investigates
the potential of applying the feed forward neural network architecture for the classification of
medical datasets. Migration based differential evolution algorithm (MBDE) is chosen and
applied to feed forward neural network to enhance the learning process and the network
learning is validated in terms of convergence rate and classification accuracy. In this paper,
MBDE algorithm with various migration policies is proposed for classification problems using
medical diagnosis.
Data mining is a process to extract information from a huge amount of data and transform it into an
understandable structure. Data mining provides a number of tasks for extracting information from large
databases, such as classification, clustering, regression and association rule mining. This paper focuses on
classification, an important data mining technique based on machine learning, used to classify each item
according to its features with respect to a predefined set of classes or groups. The paper summarises
various techniques implemented for classification, such as k-NN, Decision Tree, Naïve Bayes, SVM, ANN
and RF; the techniques are analyzed and compared on the basis of their advantages and disadvantages.
A ROBUST MISSING VALUE IMPUTATION METHOD MIFOIMPUTE FOR INCOMPLETE MOLECULAR ...ijcsa
Missing data imputation is an important research topic in data mining. Large-scale molecular descriptor data may contain missing values (MVs), yet some methods for downstream analyses, including some prediction tools, require a complete descriptor data matrix. We propose and evaluate an iterative imputation method, MiFoImpute, based on a random forest. By averaging over many unpruned regression trees, a random forest intrinsically constitutes a multiple imputation scheme, and using its NRMSE and NMAE estimates we are able to estimate the imputation error. Evaluation is performed on two molecular descriptor datasets generated from a diverse selection of pharmaceutical fields, with artificially introduced missing values ranging from 10% to 30%. The experimental results demonstrate that missing values have a great impact on the effectiveness of imputation techniques and that MiFoImpute is more robust to missing values than the ten imputation methods used as benchmarks. Additionally, MiFoImpute exhibits attractive computational efficiency and can cope with high-dimensional data.
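The two error estimates named above have common definitions, sketched here under the usual conventions (RMSE normalised by the standard deviation of the true values, MAE by their range); the paper may normalise slightly differently:

```python
from math import sqrt

def nrmse(true_vals, imputed):
    """Normalised RMSE of imputed entries: RMSE divided by the
    standard deviation of the true values."""
    n = len(true_vals)
    mse = sum((t - p) ** 2 for t, p in zip(true_vals, imputed)) / n
    mean = sum(true_vals) / n
    var = sum((t - mean) ** 2 for t in true_vals) / n
    return sqrt(mse / var)

def nmae(true_vals, imputed):
    """Normalised MAE: mean absolute error over the true value range."""
    n = len(true_vals)
    mae = sum(abs(t - p) for t, p in zip(true_vals, imputed)) / n
    return mae / (max(true_vals) - min(true_vals))
```

Both equal 0 for perfect imputation, and values near 1 indicate imputations no better than a constant guess.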
The decentralized data fusion approach is one in which features are extracted and processed individually and finally fused to obtain global estimates. This paper presents a decentralized data fusion algorithm using a factor analysis model. Factor analysis is a statistical method used to study the effect and interdependence of various factors within a system. The proposed algorithm fuses accelerometer and gyroscope data in an inertial measurement unit (IMU). Simulations are carried out on the Matlab platform to illustrate the algorithm.
Analysis of machine learning algorithms for character recognition: a case stu...nooriasukmaningtyas
This paper covers work done in handwritten digit recognition and the
various classifiers that have been developed. Methods such as MLP, SVM,
Bayesian networks, and random forests are discussed with their accuracy
and empirically evaluated. Boosted LeNet-4, an ensemble of several
classifiers, has shown the highest efficiency among these methods.
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULEIJCSEA Journal
The generation of effective feature-based rules is essential to the development of any intelligent system. This paper presents an approach that integrates a powerful fuzzy rule generation algorithm with a rough set-assisted feature reduction method to generate diagnostic rules with a certainty factor. The certainty factor of each rule is calculated by considering both the membership value of each linguistic term introduced during fuzzification of the data and the possibility values, due to inconsistent data, generated by rough set theory during rule generation. During knowledge inference in an intelligent system, the certainty factor of each rule plays an important role in finding the appropriate rule to select. Experimental results demonstrate the superiority of our approach.
Abstract: A common challenge for machine learning and data mining tasks is the curse of high dimensionality. Feature selection reduces dimensionality by selecting the relevant and optimal features from a huge dataset. In this work, a clustering and genetic algorithm based feature selection (CLUST-GA-FS) is proposed with three stages: irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithm is analyzed using classification accuracy, precision, recall and error rate. Recently, increasing attention has been given to the stability of feature selection algorithms, which requires that similar subsets of features be selected every time the algorithm is executed on the same dataset. This work analyzes the stability of the algorithm on four publicly available datasets using the stability measures average normal Hamming distance (ANHD), Dice's coefficient, Tanimoto distance, Jaccard's index and the Kuncheva index.
A new model for iris data set classification based on linear support vector m... (IJECEIAES)
Data mining is known as the process of detecting patterns in large amounts of data, as part of knowledge discovery. Classification is a data analysis task that extracts a model describing important data classes. One of the outstanding classification methods in data mining is the support vector machine (SVM). It is capable of predicting outcomes and is often more effective than other classification methods. The SVM is a well-known supervised machine learning technique that has been applied successfully to a variety of problems, including regression, classification, and clustering, in diverse domains such as gene expression and web text mining. In this study, we propose a new model for classifying the iris data set using an SVM classifier, with a genetic algorithm used to optimize the C and gamma parameters of the linear SVM; in addition, the principal component analysis (PCA) algorithm was used for feature reduction.
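The pipeline described in this abstract can be sketched briefly. Since a linear kernel has no gamma parameter, the toy genetic algorithm below tunes only C; the population size, mutation scale, generation budget, and the choice of two PCA components are illustrative assumptions, not values from the paper (scikit-learn is assumed available).

```python
# Sketch: PCA feature reduction, then a linear-kernel SVM whose C parameter
# is tuned by a tiny genetic algorithm (elitist selection + Gaussian mutation).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(log_c):
    # 5-fold CV accuracy of the scaled PCA + linear SVM pipeline for C = 10**log_c
    model = make_pipeline(StandardScaler(), PCA(n_components=2),
                          SVC(kernel="linear", C=10.0 ** log_c))
    return cross_val_score(model, X, y, cv=5).mean()

# Tiny GA over log10(C) in [-3, 3]: keep the better half, mutate to refill.
pop = rng.uniform(-3, 3, size=8)
for _ in range(10):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-4:]]              # elitist selection
    children = parents + rng.normal(0, 0.3, size=4)     # Gaussian mutation
    pop = np.concatenate([parents, np.clip(children, -3, 3)])

best = max(pop, key=fitness)
print(f"best C = {10.0 ** best:.3g}, CV accuracy = {fitness(best):.3f}")
```

A full GA would add crossover and also encode gamma when an RBF kernel is used; this sketch keeps only selection and mutation for brevity.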
Similar to Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural Network Classifier (20)
Bibliometric analysis highlighting the role of women in addressing climate ch... (IJECEIAES)
Fossil fuel consumption has increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have succeeded in playing a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years has been carried out to examine the role of women in addressing climate change. The findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. The results highlight how women have influenced policies and actions related to climate change, point out research gaps, and offer recommendations on how to increase the role of women in addressing climate change and achieving sustainability. This initiative aims to highlight the significance of gender equality and encourage inclusivity in climate change decision-making processes in order to achieve more successful results.
Voltage and frequency control of microgrid in presence of micro-turbine inter... (IJECEIAES)
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
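The droop control mentioned in this abstract follows the conventional P-f / Q-V law; a minimal sketch, with illustrative gains and ratings that are not taken from the paper:

```python
# Conventional P-f / Q-V droop law used for islanded microgrid voltage and
# frequency control. All numeric values below are illustrative assumptions.

F_NOM, V_NOM = 50.0, 400.0      # nominal frequency (Hz) and line voltage (V)
P_RATED, Q_RATED = 10e3, 5e3    # assumed DG ratings (W, var)
M_P = 0.5 / P_RATED             # Hz drop per watt (0.5 Hz at full load)
N_Q = 20.0 / Q_RATED            # volt drop per var (20 V at full load)

def droop_setpoints(p_load, q_load):
    """Return (frequency, voltage) setpoints for the measured load."""
    return F_NOM - M_P * p_load, V_NOM - N_Q * q_load

print(droop_setpoints(0, 0))        # no load -> nominal values
print(droop_setpoints(10e3, 5e3))   # full load -> (49.5 Hz, 380 V)
```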
Enhancing battery system identification: nonlinear autoregressive modeling fo... (IJECEIAES)
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
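A NARX identification step of the kind described can be sketched as follows. A synthetic, mildly nonlinear "battery-like" plant stands in for real measurements, and a small polynomial ridge model plays the nonlinear map; the lag orders, the plant, and the model class are all illustrative assumptions, not the paper's setup.

```python
# NARX-style identification: predict the next output from lagged outputs and
# lagged exogenous inputs through a nonlinear map.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
n = 500
u = rng.normal(0, 1, n)                      # exogenous input (e.g. current)
y = np.zeros(n)                              # output (e.g. voltage)
for t in range(1, n):                        # toy mildly nonlinear plant
    y[t] = 0.8 * y[t - 1] + 0.3 * u[t - 1] - 0.05 * y[t - 1] ** 2

na = nb = 2                                  # output / input lag orders
lag = max(na, nb)
rows = [np.r_[y[t - na:t][::-1], u[t - nb:t][::-1]] for t in range(lag, n)]
X, target = np.array(rows), y[lag:]

model = make_pipeline(PolynomialFeatures(2), Ridge(alpha=1e-3))
model.fit(X[:400], target[:400])             # identify on the first segment
err = np.sqrt(np.mean((model.predict(X[400:]) - target[400:]) ** 2))
print(f"one-step-ahead RMSE on held-out data: {err:.4f}")
```

Because the toy plant is quadratic in the lagged output, a degree-2 polynomial regressor can represent it exactly, so the held-out error is near zero; real battery data would of course behave less cleanly.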
Smart grid deployment: from a bibliometric analysis to a survey (IJECEIAES)
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and have attracted significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ... (IJECEIAES)
One of the problems associated with power systems is the islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis yields overall weights of 24.7% for passive methods, 7.8% for active methods, 5.6% for hybrid methods, 14.5% for remote methods, 26.6% for signal processing-based methods, and 20.8% for computational intelligence-based methods when all criteria are compared together. From these total weights it can be seen that hybrid approaches are the least suitable choice, while signal processing-based methods are the most appropriate islanding detection methods to select and implement in power systems with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
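The AHP weighting step can be sketched with the common geometric-mean approximation of the principal eigenvector; the 3x3 pairwise comparison matrix below (e.g. implementation cost vs. non-detected zone vs. response time) is illustrative and not taken from the paper.

```python
# AHP: turn a reciprocal pairwise comparison matrix into priority weights
# via row geometric means, then check consistency.
import numpy as np

A = np.array([[1.0, 3.0, 5.0],       # reciprocal pairwise judgments
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

w = A.prod(axis=1) ** (1.0 / A.shape[0])   # row geometric means
w /= w.sum()                               # normalized priority weights

lam = (A @ w / w).mean()                   # principal eigenvalue estimate
ci = (lam - A.shape[0]) / (A.shape[0] - 1) # consistency index
print(np.round(w, 3), round(ci, 4))
```

A consistency index well below 0.1 (after dividing by the random index) is the usual acceptance threshold for the judgments.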
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi... (IJECEIAES)
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b... (IJECEIAES)
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
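The classic P&O step that the paper modifies can be sketched as follows; the power-voltage curve is a toy stand-in with its maximum deliberately placed at 20 V, and the step size and iteration budget are illustrative assumptions.

```python
# Classic perturb-and-observe (P&O) MPPT: perturb the operating voltage,
# keep the same direction if power rose, reverse if it fell.

def pv_power(v):
    # Toy concave power-voltage curve with its maximum power point at v = 20.
    return 0.1 * v * (40.0 - v)

def perturb_and_observe(v0=5.0, step=0.5, iters=100):
    v, p_prev, direction = v0, pv_power(v0), +1.0
    for _ in range(iters):
        v += direction * step
        p = pv_power(v)
        if p < p_prev:          # power dropped: reverse the perturbation
            direction = -direction
        p_prev = p
    return v

v_mpp = perturb_and_observe()
print(f"tracked voltage ~ {v_mpp:.1f} V (true MPP at 20.0 V)")
```

The steady-state oscillation around the maximum, visible here as the voltage hopping within one step of 20 V, is exactly the limitation that variable-step and fuzzy-logic variants aim to reduce.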
Adaptive synchronous sliding control for a robot manipulator based on neural ... (IJECEIAES)
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronizes the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is designed; the unknown components of the model are estimated online by a neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de... (IJECEIAES)
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience in embedded system design anywhere and anytime. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting the comprehensive debugging tools needed to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during experiments, by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m... (IJECEIAES)
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba... (IJECEIAES)
Rapidly and remotely monitoring and receiving solar cell system status parameters (solar irradiance, temperature, and humidity) is critical to enhancing system efficiency. Hence, in the present article an improved smart internet of things (IoT) prototype based on an embedded system built around the NodeMCU ESP8266 (ESP-12E) was implemented experimentally. Three different regions in Egypt (the cities of Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live on Ubidots through the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m² respectively during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system make it a strong candidate for monitoring solar cell systems. On the other hand, the obtained solar power radiation results of the three considered regions strongly favor Luxor and Cairo, rather than El-Beheira, as suitable places to build a solar cell power station.
An efficient security framework for intrusion detection and prevention in int... (IJECEIAES)
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threats to data security in IoT operations. Therefore, IoT security systems must monitor the IoT network and restrict unwanted events from occurring in it. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention rely on supervised learning, which requires a large amount of labeled data for training; such labeled datasets are difficult to source in a large network like the IoT. To address this problem, this study introduces an efficient learning mechanism to strengthen IoT security. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset, accomplishing an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ... (IJECEIAES)
This research develops an incubator system that integrates the internet of things (IoT) and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. The data is then sent in real time to the IoT broker Eclipse Mosquitto using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory (LSTM) network method and displayed in a web application through an application programming interface (API) service. The experiments produced 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and the other attributes ranges from 0.23 to 0.48. Next, several experiments were conducted to evaluate the model's predicted values on the test data. The best results are obtained using a two-layer LSTM configuration, each layer with 60 neurons and a lookback setting of 6. This model produces an R² value of 0.934, with a root mean square error (RMSE) of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R² value was evaluated for each attribute used as input, with values between 0.590 and 0.845.
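The "lookback" windowing that feeds an LSTM of this kind can be sketched as follows; the toy temperature trace is an illustrative stand-in for the incubator sensor logs, while the lookback of 6 mirrors the setting described and the shapes follow the usual (samples, timesteps, features) convention.

```python
# Sliding-window ("lookback") construction for LSTM training: each sample is
# the previous `lookback` readings, and the target is the next reading.
import numpy as np

def make_windows(series, lookback=6):
    X = np.array([series[t:t + lookback] for t in range(len(series) - lookback)])
    y = series[lookback:]
    return X[..., np.newaxis], y        # add a feature axis for the LSTM

series = np.linspace(30.0, 37.0, 50)    # toy incubator temperature trace
X, y = make_windows(series, lookback=6)
print(X.shape, y.shape)                 # (44, 6, 1) and (44,)
```

These arrays would then be fed to the two-layer LSTM the abstract mentions; the windowing itself is framework-agnostic.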
A review on internet of things-based stingless bee's honey production with im... (IJECEIAES)
Honey is produced exclusively by honeybees and stingless bees, both of which are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey with a unique flavor profile. A key problem identified is that many stingless bee colonies collapse due to weather, temperature, and environmental conditions. It is critical to understand the relationship between stingless bee honey production and environmental conditions in order to improve honey production. Thus, this paper presents a review of stingless bee honey production and prediction modeling. About 54 previous studies have been analyzed and compared to identify the research gaps, and a framework for modeling the prediction of stingless bee honey is derived. The result presents a comparison and analysis of internet of things (IoT) monitoring systems, honey production estimation, convolutional neural networks (CNNs), and automatic identification methods for bee species. Based on image detection methods, the top three in efficiency are CNN at 98.67%, densely connected convolutional networks with YOLO v3 at 97.7%, and DenseNet201 at 99.81%. This study is significant in assisting researchers to develop a model for predicting stingless bee honey output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero... (IJECEIAES)
The internet of things (IoT) is a revolutionary innovation in many aspects of our society, including interactions, financial activity, and global security, such as the military and battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. This system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using the compact Contiki/Cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on the detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced the packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJ at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbers (IJECEIAES)
In real world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers offer a great tool to model such ambiguity. In the case of hesitation, the complement of a membership value in fuzzy numbers can differ from the non-membership value, which we can model using intuitionistic fuzzy numbers, as they provide flexibility by defining both membership and non-membership functions. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which is a generalization of the polygonal fuzzy numbers found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by an intuitionistic polygonal fuzzy number with n edges. This method is given in a simple tableau formulation and then applied to numerical examples for clarity.
The performance of artificial intelligence in prostate magnetic resonance im... (IJECEIAES)
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networks (IJECEIAES)
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The potential of these characteristics is explored by examining how well time-domain signals work in the detection of epileptic signals using the intracranial Freiburg Hospital (FH) database, the scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) database, and the Temple University Hospital (TUH) EEG database. To test the viability of this approach, two types of experiments were carried out: first, binary classification (preictal, ictal, and postictal, each versus interictal); and second, four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using the CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7%, and the highest accuracy of 94.02% was obtained when comparing the interictal stage to the preictal stage in the TUH EEG database.
Analysis of driving style using self-organizing maps to analyze driver behavior (IJECEIAES)
Modern life is strongly associated with the use of cars, but the increase in acceleration and maneuverability leads to a dangerous driving style for some drivers. Under these conditions, developing a method to track driver behavior is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features about the environment, the driver's behavior, and the operating characteristics of the car, collected using OBD-II. The generated data set is fed to a Kohonen network, where clustering is performed according to driving style and degree of danger. Assigning the driving characteristics to a particular cluster makes it possible to move to the individual indicators of a specific driver and to take individual driving characteristics into account. Applying the method makes it possible to identify potentially dangerous driving styles, which can help prevent accidents.
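The Kohonen-network clustering step can be sketched in a few lines of numpy: the best-matching unit and its grid neighbours are pulled toward each input sample. The grid size, learning schedule, and the synthetic two-cluster "driving feature" data below are all illustrative assumptions.

```python
# Minimal self-organizing map (SOM) trained on synthetic driving features.
import numpy as np

rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0.2, 0.05, (50, 3)),    # "calm" driving cluster
                  rng.normal(0.8, 0.05, (50, 3))])   # "aggressive" cluster

grid = np.array([(i, j) for i in range(4) for j in range(4)])  # 4x4 map
weights = rng.random((16, 3))

for epoch in range(30):
    lr = 0.5 * (1 - epoch / 30)                  # decaying learning rate
    sigma = 2.0 * (1 - epoch / 30) + 0.5         # decaying neighbourhood width
    for x in data[rng.permutation(len(data))]:
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)         # grid distances
        h = np.exp(-d2 / (2 * sigma ** 2))                 # neighbourhood kernel
        weights += lr * h[:, None] * (x - weights)

# After training, the two driving styles should map to different units.
bmu_calm = np.argmin(((weights - data[0]) ** 2).sum(axis=1))
bmu_aggr = np.argmin(((weights - data[-1]) ** 2).sum(axis=1))
print(bmu_calm, bmu_aggr)
```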
Hyperspectral object classification using hybrid spectral-spatial fusion and ... (IJECEIAES)
Because of its spectral-spatial and temporal resolution over greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. The HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, the HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, hyperspectral objects require a lot of time to classify; for this reason, spectral and spatial feature fusion is performed. Existing classification strategies lead to increased misclassification, and existing feature fusion methods are unable to preserve the inherent semantic features of objects. This study addresses these research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining intrinsic object qualities, and proposes a soft-margin kernel for a multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian Pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and the Kappa coefficient.
Immunizing Image Classifiers Against Localized Adversary Attacks (gerogepatton)
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Final project report on grocery store management system.pdf (Kamal Acharya)
In today’s fast-changing business environment, it is extremely important to be able to respond to client needs in the most effective and timely manner, and customers increasingly wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website which retails various grocery products. This project allows users to view the various products available, enables registered users to purchase desired products instantly using the Paytm or UPI payment processors (Instant Pay), and also lets them place orders using the Cash on Delivery (Pay Later) option. The project gives administrators and managers easy access to view orders placed using the Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of technologies must be studied and understood. These include multi-tiered architecture, server- and client-side scripting techniques, implementation technologies, programming languages (such as PHP, HTML, CSS, and JavaScript), and MySQL relational databases. The objective of this project is to develop a basic shopping cart website for consumers and to learn about the technologies used to develop such a website.
This document will discuss each of the underlying technologies used to create and implement an e-commerce website.
Event Management System Vb Net Project Report.pdf (Kamal Acharya)
In the present era, the scope of information technology is growing very fast; no area is untouched by this industry. Its scope has become wider, including business and industry, household business, communication, education, entertainment, science, medicine, engineering, distance learning, weather forecasting, career searching, and so on.
My project, named “Event Management System”, is software that stores and maintains all events coordinated in the college. It also helps to print related reports. My project will help to record the events coordinated by faculty members with their name, event subject, date, and details in an efficient and effective way.
In my system we have to make a system by which a user can record all events coordinated by a particular faculty. In our proposed system some more featured are added which differs it from the existing system such as security.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
ISSN: 2088-8708. IJECE Vol. 7, No. 5, October 2017: 2773-2781
subset selected. The accuracy of the learning algorithms is usually high [7]. Wrapper methods tend to find the most suitable feature subset for the learning algorithm, but they are very computationally expensive. Unlike the wrapper and filter methods, embedded methods incorporate variable selection during the learning process, combining the advantages of the filter and wrapper techniques [8]. The filter approach determines the relevant and redundant variables independently of the classifier, as when using the MRMR criterion alone, so it is not recommended on its own [9]. A filter method could improve variable selection if it took into account how the filtered variables are used by the classifier. The wrapper method evaluates a subset of features by its classification performance using a learning algorithm [10]; heuristic variable selection (HVS) is one example. In this work, we propose to incorporate the MRMR criterion into the ranking scheme of HVS, hybridizing the two via a convex combination of the relevance given by the HVS criterion and the MRMR criterion.
The rest of the paper is organized as follows. Section 2 presents related work. Section 3 presents the research method. Section 4 presents the results of our experimental studies, including the experimental methodology, the experimental results, and the comparison with heuristic variable selection (HVS) and Minimum Redundancy Maximum Relevance (MRMR). Conclusions are drawn in Section 5.
2. RELATED WORKS
Variable selection is generally defined as a search process that finds a subset of "relevant" features among those of the original set [5], [11-13]. The notion of relevance of a subset of variables always depends on the objectives and requirements of the system. The variable selection problem for a classification task can be described as follows: given the original set G of N features, find a subset F consisting of N' relevant features, where N' < N. Selecting the subset F makes it possible to maximize classification performance when constructing learning models.
2.1 The Heuristic Variable Selection
Consider a multilayer perceptron (MLP) [14], a neural network architecture defined by A(I, H, O), where I is the input layer, H the hidden layers, O the output layer, and W the weight matrix. The value w_ij of the connection between two neurons j and i reflects the importance of their relationship. This value can be positive or negative, depending on whether the connection is excitatory (+) or inhibitory (-). Yacoub et al. proposed a variable selection method named heuristic variable selection (HVS) [15]. The HVS criterion focuses on the strength of these connections, quantified by |w_ij|. The partial contribution π(i, j) of hidden neuron j to output i is given by its share of the total connection strength arriving at neuron i (see Figure 1).
Figure 1. The partial contribution of the neuron j connected to neuron i
\pi_{i,j} = \frac{|w_{i,j}|}{\sum_{j'=1}^{N_c} |w_{i,j'}|}    (1)
To estimate the relative contribution of unit j to the final decision of the system, consider the set of N_j units that unit j sends connections to, with partial contributions π_{i,j} (Figure 2):

C_j = \sum_{i=1}^{N_j} \pi_{i,j} \, \delta_i    (2)
Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural … (Ben-Hdech Adil)
where

\delta_i = \begin{cases} 1 & \text{if unit } i \in O \\ C_i & \text{if unit } i \in H \end{cases}    (3)
Figure 2. The relative contribution of the neuron j
The selection algorithm based on the HVS criterion is as follows:
1. Train the MLP until a local minimum is reached.
2. Compute the relevance of each variable according to Equation (2).
3. Sort the variables in ascending order of relevance.
4. Remove the least important variable.
5. Return to step 1, until the last variable.
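For a single-hidden-layer MLP, equations (1)-(3) can be sketched as follows; the array shapes and the names `W_in` and `W_out` are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def partial_contributions(W):
    """Equation (1): pi[i, j] = |w_ij| / sum over j' of |w_ij'|,
    the share of unit j among all connections arriving at unit i."""
    A = np.abs(W)
    return A / A.sum(axis=1, keepdims=True)

def hvs_relevance(W_in, W_out):
    """Equations (2)-(3) for an MLP A(I, H, O) with one hidden layer.
    W_in: (H, I) weights from inputs to hidden units.
    W_out: (O, H) weights from hidden units to outputs."""
    pi_out = partial_contributions(W_out)   # (O, H)
    # C_j for hidden units: delta_i = 1 because the receiving units are outputs
    C_hidden = pi_out.sum(axis=0)           # (H,)
    pi_in = partial_contributions(W_in)     # (H, I)
    # C for input variables: delta_i = C_i because the receiving units are hidden
    return pi_in.T @ C_hidden               # (I,) relevance of each input variable

rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 3))    # 3 input variables, 4 hidden units
W_out = rng.normal(size=(2, 4))   # 2 output units
rel = hvs_relevance(W_in, W_out)
order = np.argsort(rel)           # ascending relevance; order[0] is the removal candidate
```

Sorting `rel` in ascending order gives the removal candidate of step 4 directly.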
2.2. Minimum Redundancy Maximum Relevance
The MRMR (Minimum Redundancy Maximum Relevance) method [16] selects variables that have maximal relevance with respect to the target class while being minimally redundant. In this work, to find a maximally relevant and minimally redundant set of variables, we use the mutual-information-based MRMR criterion. The redundancy and relevance of a variable are given by equations (4) and (5), where I(i, Y) is the mutual information between the class labels Y and variable i; it quantifies the relevance of variable i to the classification. The relevance of variable i is given by:
Rl_i = \frac{1}{|S|^2} \, I(i, Y)    (4)

where the mutual information is

I(i, Y) = \sum_{x \in X_i} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}    (5)

with X_i the set of values taken by variable i.
The redundancy of a variable subset is determined by the mutual information among the variables. The redundancy of variable i with respect to the other variables is given by:

Rd_i = \frac{1}{|S|^2} \sum_{j \in S} I(i, j)    (6)

where S and |S| denote the set of variables and its size, respectively, and I(i, j) is the mutual information between variables i and j. The score of a variable combines these two factors:

Sc_i = \frac{Rl_i}{Rd_i}    (7)
The relevance and redundancy measures can be combined in several ways, but the quotient of relevance by redundancy selects highly relevant variables with little redundancy [17]. After this individual variable evaluation, a sequential search technique is used with a classifier to select the final subset of variables: the classifier evaluates subsets starting with the variable that has the best score, then the best two, and so on, until the subset that minimizes the classification error is found.
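Equations (4)-(7) can be sketched for discretized variables as follows; the function names, the toy data, and the estimation of mutual information from empirical counts are our own assumptions for illustration.

```python
import numpy as np
from collections import Counter

def mutual_information(a, b):
    """Empirical I(a; b) for two discrete sequences, equation (5)."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    return sum((c / n) * np.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr_scores(X, y):
    """Score Sc_i = Rl_i / Rd_i (eq. 7) for each column of X.
    Both relevance (eq. 4) and redundancy (eq. 6) carry the paper's
    1/|S|^2 normalization, which cancels in the quotient."""
    S = X.shape[1]
    rel = np.array([mutual_information(X[:, i], y) for i in range(S)]) / S**2
    red = np.array([sum(mutual_information(X[:, i], X[:, j]) for j in range(S))
                    for i in range(S)]) / S**2
    return rel / red

# Toy dataset: the class label equals column 0, so column 0 should score higher.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [1, 1]])
y = np.array([0, 0, 1, 1, 0, 1])
scores = mrmr_scores(X, y)
```

Note that the redundancy sum includes j = i, so Rd_i is never zero for a non-constant variable (I(i, i) is the entropy of variable i).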
3. RESEARCH METHOD
Using a filter method alone, for example MRMR, may not give the best performance because it operates independently of the classifier, which is not involved in the selection of variables. On the other hand, HVS does not take the redundancy among variables into account. Our objective is to improve HVS variable selection by introducing the MRMR filter to minimize redundancy among the relevant variables. As shown later, this improves classifier performance by trading off the relevance and redundancy of the variables.
In our HVS-MRMR approach, variables are selected by a convex combination of the relevance given by the HVS contributions and the MRMR criterion. For variable i, the ranking measure R_i is given by

R_i = \alpha |C_i| + (1 - \alpha) \, Sc_i    (8)

where the parameter α ∈ [0, 1] determines the trade-off between the HVS and MRMR criteria.
The search strategy is one of the defining properties of variable selection algorithms. There are three strategies: forward selection, backward elimination and stepwise selection. In forward selection, variables are progressively incorporated into larger and larger subsets. In backward elimination, one starts with the set of all variables and progressively eliminates the least promising ones [3]. Our algorithm uses the backward elimination strategy. To better balance the redundancy and relevance of variables, we use the MRMR criterion Sc_i for ranking, together with the HVS criterion |C_i| as the measure of variable relevance.
Algorithm 1 illustrates the HVS-MRMR variable selection method. In each iteration, we identify the least important variable after ranking the variables in the set G. The least significant variable is removed, and the remaining subset goes through the iterative process again; each time a variable is removed, retraining is required. The algorithm removes the variables one by one until the last variable. In the following, S_p denotes a subset of variables and p the number of these variables.
Algorithm 1: HVS-MRMR for variable selection
Begin
  Set α
  Given the set of variables S_p ⊂ G
  Repeat:
    Train the MLP on the training dataset;
    For each i ∈ S_p do
      Compute C_i by equation (2)
      Compute Sc_i by equation (7)
      Compute R_i by equation (8)
    End for
    Select the variable i* = arg min{R_i}
    Update S_{p-1} = S_p \ {i*}
  Until S_p = ∅
End
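The backward elimination loop of Algorithm 1 can be sketched as follows; `train_mlp`, `hvs_contrib` and `mrmr_score` are stand-in callables for the retraining step and for equations (2) and (7), and the toy importance table is invented for illustration only.

```python
def hvs_mrmr_backward_elimination(variables, train_mlp, hvs_contrib,
                                  mrmr_score, alpha=0.5):
    """Backward elimination: repeatedly retrain, rank every remaining
    variable by R_i = alpha*|C_i| + (1-alpha)*Sc_i (eq. 8), and remove
    the variable with the smallest R_i. Returns the removal order."""
    S = list(variables)
    removed = []
    while S:
        model = train_mlp(S)                   # retrain on the current subset
        ranking = {i: alpha * abs(hvs_contrib(model, i))
                      + (1 - alpha) * mrmr_score(S, i) for i in S}
        worst = min(ranking, key=ranking.get)  # least relevant variable
        S.remove(worst)
        removed.append(worst)
    return removed

# Toy stand-ins: relevance read from a fixed (hypothetical) importance table.
importance = {"x1": 0.9, "x2": 0.1, "x3": 0.5}
order = hvs_mrmr_backward_elimination(
    list(importance),
    train_mlp=lambda S: None,
    hvs_contrib=lambda model, i: importance[i],
    mrmr_score=lambda S, i: importance[i],
    alpha=0.6,
)
# least important variables are removed first
```

In the paper's setting, the networks produced at each subset size are then compared with the Fisher test described below to pick the final subset.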
To select a subset S_p of variables, Algorithm 1 generates a set of N neural networks (N being the number of variables) having fewer and fewer variables. The choice of the subset uses a statistical test (Fisher test) and searches among all networks MLP(p) for those that are statistically close to MLP(p*). This principle provides a set of neural networks such that E(p_i) ≈ E(p*). The subset of selected variables is the smallest subset of p_0 variables statistically close to MLP(p*), where E(p) is the error of the MLP for subset S_p:

MLP(p^*) = \arg\min_{MLP(p)} E(p)    (9)

p_0 = \min_i \{p_i\}    (10)
4. RESULTS AND ANALYSIS
To evaluate the performance of HVS-MRMR, we use several real-world classification datasets. Table 1 shows the detailed information on each dataset. Each dataset was partitioned into three sets: a training set, a validation set and a testing set. The training and testing sets were used to train the MLP and to evaluate the classification accuracy of the trained MLP, respectively. The validation set was used to estimate the prediction error of the MLP.
Table 1. Descriptions and partitions of the different classification datasets

Dataset      Variables   Training examples   Validation examples   Testing examples   Classes
Diabetes     8           384                 192                   192                2
Cancer       9           349                 175                   175                2
Glass        9           108                 53                    53                 6
Vehicle      18          242                 211                   211                4
Hepatitis    19          77                  39                    39                 2
Waveform     21          2500                1250                  1250               3
Horse        21          172                 86                    86                 2
Ionosphere   34          175                 88                    88                 2
4.1 Preprocessing
We preprocessed the datasets by rescaling the input variable values between 0 and 1 using a linear normalization function. After normalization, all feature values of these examples lie between zero and one. The normalization formula is:

x_{i,new} = \frac{x_{i,old} - x_{i,min}}{x_{i,max} - x_{i,min}}    (11)

where x_{i,new} and x_{i,old} are the new and old values of the attribute, respectively, and x_{i,max} and x_{i,min} are the maximum and minimum values of variable i.
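Equation (11) can be applied column-wise in a few lines; the function name and the use of NumPy are our own, and the sketch assumes each column is non-constant (x_max > x_min) so the division is well defined.

```python
import numpy as np

def minmax_rescale(X):
    """Equation (11): rescale each column of X linearly to [0, 1]."""
    xmin = X.min(axis=0)
    xmax = X.max(axis=0)
    return (X - xmin) / (xmax - xmin)   # assumes xmax > xmin per column

X = np.array([[2.0, 10.0],
              [4.0, 30.0],
              [6.0, 20.0]])
Xn = minmax_rescale(X)   # each column now spans [0, 1]
```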
Table 2. Performance of HVS-MRMR for different classification datasets, st. dev. given after the ± sign

Dataset      Measure                  Without VS    HVS           MRMR          HVS-MRMR
Diabetes     Mean no. of variables    8.00±0.00     5.90±0.92     6.50±0.63     5.55±1.02
             Acc %                    75.68±1.01    76.32±2.06    76.25±2.01    76.63±3.23
Cancer       Mean no. of variables    9.00±0.00     6.83±1.01     5.60±0.92     6.13±0.98
             Acc %                    97.96±0.88    98.43±1.06    98.17±1.52    98.68±1.88
Glass        Mean no. of variables    9.00±0.00     5.13±0.55     4.90±0.45     4.33±0.65
             Acc %                    74.21±5.62    76.61±4.66    76.73±5.20    77.03±4.95
Vehicle      Mean no. of variables    18.00±0.00    4.51±0.67     5.33±0.64     4.90±0.76
             Acc %                    73.37±4.87    74.35±3.57    74.21±4.03    75.17±3.47
Hepatitis    Mean no. of variables    19.00±0.00    3.73±0.87     5.10±0.76     3.55±0.92
             Acc %                    70.63±3.60    77.65±2.79    76.32±3.43    78.67±3.15
Waveform     Mean no. of variables    21.00±0.00    4.85±0.96     5.50±0.81     4.85±1.04
             Acc %                    85.30±1.23    85.91±2.54    85.27±2.98    84.91±3.13
Horse        Mean no. of variables    21.00±0.00    7.94±1.87     6.93±1.65     6.55±1.45
             Acc %                    84.51±2.52    86.03±2.10    85.27±2.05    86.21±1.95
Ionosphere   Mean no. of variables    34.00±0.00    6.52±2.03     7.11±1.87     6.87±1.97
             Acc %                    94.66±2.03    95.81±1.95    96.27±2.25    96.31±2.98

Mean no. of variables: the average number of variables selected; Acc: the classification accuracy; Without VS: without variable selection.
4.2 Parameter Estimation
In this paragraph we specify some parameters of HVS-MRMR. The initial weights of the MLP were randomly chosen between -1 and 1. The training error threshold for the cancer, diabetes, glass, hepatitis, horse, ionosphere, vehicle and waveform datasets was set to 0.002, 0.003, 0.04, 0.02, 0.003, 0.02, 0.01 and 0.03, respectively. Likewise, the validation error threshold was set to 0.001, 0.002, 0.025, 0.018, 0.001, 0.014, 0.007 and 0.025 for the cancer, diabetes, glass, hepatitis, horse, ionosphere, vehicle and waveform datasets, respectively. The learning rate and initial weight values are parameters of the well-known back-propagation algorithm [18-19]. These values were set following the suggestions of several previous works [20-21] and after some preliminary tests. The α value for HVS-MRMR variable selection was then determined empirically from the set {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8} based on the best tenfold cross-validation performance.
4.3 Results
Table 2 shows the results of HVS-MRMR, MRMR and HVS over 20 independent runs on the eight classification datasets. The classification accuracy (Acc) in Table 2 is the percentage of correct classifications produced by the trained MLP on the testing set of a dataset, and "Mean no. of variables" is the average number of variables selected. It can be observed from Table 2 that HVS-MRMR selected a smaller number of variables on the different benchmark datasets. For example, HVS-MRMR selected on average 6.13 variables from a set of 9 to solve the cancer dataset, and on average 6.87 variables from a set of 34 to solve the Ionosphere dataset. In fact, HVS-MRMR selected a small number of variables even for the datasets with many variables.
Figure 3. Frequency of variables selected by HVS-MRMR for the Cancer dataset
Figure 4. Frequency of variables selected by HVS-MRMR for the Diabetes dataset
Figure 5. Frequency of variables selected by HVS-MRMR for the Glass dataset
To determine the importance of the selected variables, we measured the frequency of each variable. The frequency of a variable i is defined as

f_i = \frac{H_i}{T}    (12)

where H_i is the number of times variable i is selected over all tests and T is the total number of tests.
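Equation (12) amounts to counting how often each variable appears across the selected subsets; a minimal sketch, with the run results invented for illustration.

```python
from collections import Counter

def selection_frequencies(runs, n_variables):
    """Equation (12): f_i = H_i / T, where H_i counts how many of the
    T runs selected variable i (variables are numbered from 1)."""
    T = len(runs)
    counts = Counter(v for subset in runs for v in subset)
    return [counts[i] / T for i in range(1, n_variables + 1)]

# e.g. three independent runs over 4 variables
runs = [{1, 2, 4}, {1, 2}, {2, 3, 4}]
freqs = selection_frequencies(runs, 4)
# variable 2 is selected in every run, so its frequency is 1.0
```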
Figures 3, 4 and 5 show the frequency of the variables selected for the cancer, diabetes and glass datasets, respectively. It can be seen in Figure 3 that HVS-MRMR selected features 1, 2, 6, 7, 8 and 9 of the Cancer dataset very frequently; the selection frequency of these variables is one or nearly one.
Table 3. Comparison of classification accuracy (%) among HVS-MRMR, HVS, MRMR, ADHOC, GPSFSCD, EIR-MLPFS and ANNIGMA-WRAPPER for the cancer, diabetes, glass, hepatitis, horse, ionosphere, vehicle and waveform datasets

Dataset      HVS     MRMR    ADHOC   GPSFSCD   EIR-MLPFS   ANNIGMA-WRAPPER   HVS-MRMR
Diabetes     76.32   76.25   71.2    -         -           77.8              76.63
Cancer       98.43   98.17   -       96.84     89.40       96.5              98.68
Glass        76.61   76.73   70.5    -         44.10       -                 77.03
Vehicle      74.35   74.21   69.6    78.45     74.60       -                 75.17
Hepatitis    77.65   76.32   -       -         -           -                 78.6
Waveform     85.91   85.27   -       -         -           -                 84.91
Horse        86.03   85.27   -       -         -           -                 86.21
Ionosphere   95.81   96.27   -       -         90.60       90.2              96.31

"-" means not available.
It can be observed that our method achieved the best classification accuracy among all the algorithms for five (Cancer, Glass, Hepatitis, Horse and Ionosphere) of the eight datasets. For the remaining three datasets, HVS-MRMR achieved the second best accuracy, while HVS (Waveform), ANNIGMA-WRAPPER (Diabetes) and GPSFSCD (Vehicle) each achieved the best classification accuracy on one dataset.
It can be said that variable selection increases classification accuracy by discarding the irrelevant variables from the original feature set. Variable selection is thus an important task whose purpose is to retain the necessary information and discard the irrelevant variables; otherwise, the performance of the classifiers might decrease.
The efficacy of embedding the MRMR filter in HVS is evidenced by the improved classification performance on the benchmark datasets. The proposed algorithm outperformed the other methods on most of the databases tested and was able to select the relevant variables in each dataset. By analyzing the performance of the variable subsets, we can choose the variables that have a strong relationship with the classification. However, in terms of execution time, our proposed approach consumes more than HVS and MRMR.
5. CONCLUSION
It is very important to remove the redundant and irrelevant variables from the data before applying data mining techniques to analyze the datasets. In this research, we propose a new variable selection method based on the HVS and MRMR criteria, called HVS-MRMR, which integrates filter-based and wrapper-based variable selection procedures to improve classification performance.
We applied HVS-MRMR to eight classification problems. The experimental results show that HVS-MRMR variable selection selected fewer variables with higher classification accuracy compared to MRMR and HVS.
In forthcoming research, we intend to improve this approach to find a better subset of selected variables and to further improve classification accuracy [26-27].
ACKNOWLEDGEMENTS
We express our thanks to the associate editor and the anonymous referees, whose valuable suggestions helped to improve this work significantly, as well as to our colleagues.
REFERENCES
[1] Hinton G. E., Salakhutdinov R. R. "Reducing the Dimensionality of Data with Neural Networks." Science. 2006; 313(5786); 504-507.
[2] DeMers D., Cottrell G. W. "Non-linear Dimensionality Reduction." Advances in Neural Information Processing Systems. 1993; 5; 580-587.
[3] Fodor I. K. "A Survey of Dimension Reduction Techniques." Center for Applied Scientific Computing, Lawrence Livermore National Laboratory. 2002; 9; 1-18.
[4] Wei H.-L., Billings S. A. "Feature Subset Selection and Ranking for Data Dimensionality Reduction." IEEE Transactions on Pattern Analysis and Machine Intelligence. 2007; 29(1); 162-166.
[5] Guyon I., Elisseeff A. "An Introduction to Variable and Feature Selection." Journal of Machine Learning Research. 2003; 3(3); 1157-1182.
[6] Liu H., Motoda H., Yu L. "Feature Selection with Selective Sampling." Proceedings of the Nineteenth International Conference on Machine Learning (ICML). 2002; 395-402.
[7] Dy J. G., Brodley C. E. "Feature Selection for Unsupervised Learning." Journal of Machine Learning Research. 2004; 5; 845-889.
[8] Kannan S. S., Ramaraj N. "A Novel Hybrid Feature Selection via Symmetrical Uncertainty Ranking Based Local Memetic Search Algorithm." Knowledge-Based Systems. 2010; 23(6); 580-585.
[9] Peng H., Long F., Ding C. "Feature Selection Based on Mutual Information Criteria of Max-dependency, Max-relevance, and Min-redundancy." IEEE Transactions on Pattern Analysis and Machine Intelligence. 2005; 27(8); 1226-1238.
[10] Kohavi R., John G. H. "Wrappers for Feature Subset Selection." Artificial Intelligence. 1997; 97(1-2); 273-324.
[11] Chormunge S., Jena S. "Efficient Feature Subset Selection Algorithm for High Dimensional Data." International Journal of Electrical and Computer Engineering. 2016; 6(4); 1880.
[12] Liu C., Wang W., Zhao Q., et al. "A New Feature Selection Method Based on a Validity Index of Feature Subset." Pattern Recognition Letters. 2017.
[13] Sela E. I., Hartati S., Harjoko A., et al. "Feature Selection of the Combination of Porous Trabecular with Anthropometric Features for Osteoporosis Screening." International Journal of Electrical and Computer Engineering. 2015; 5(1); 78.
[14] Rosenblatt F. "Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms." Cornell Aeronautical Laboratory Inc., Buffalo, NY. 1961.
[15] Yacoub M., Bennani Y. "HVS: A Heuristic for Variable Selection in Multilayer Artificial Neural Network Classifier." Intelligent Engineering Systems through Artificial Neural Networks, St. Louis, Missouri. 1997; 7; 527-532.
[16] Peng H., Long F., Ding C. "Feature Selection Based on Mutual Information Criteria of Max-dependency, Max-relevance, and Min-redundancy." IEEE Transactions on Pattern Analysis and Machine Intelligence. 2005; 27(8); 1226-1238.
[17] Ding C., Peng H. "Minimum Redundancy Feature Selection from Microarray Gene Expression Data." Journal of Bioinformatics and Computational Biology. 2005; 3(2); 185-205.
[18] Rumelhart D. E., McClelland J. L., PDP Research Group. "Parallel Distributed Processing." 1986; 1; 443-453.
[19] Yang Y., Hu J., Zhang M. "Predictions on the Development Dimensions of Provincial Tourism Discipline Based on the Artificial Neural Network BP Model." Bulletin of Electrical Engineering and Informatics. 2014; 3(2); 69-76.
[20] Islam M. M., Yao X., Murase K. "A Constructive Algorithm for Training Cooperative Neural Network Ensembles." IEEE Transactions on Neural Networks. 2003; 14(4); 820-834.
[21] Prechelt L., et al. "Proben1: A Set of Neural Network Benchmark Problems and Benchmarking Rules." 1994.
[22] Richeldi M., Lanzi P. L. "ADHOC: A Tool for Performing Effective Feature Selection." Proceedings of the Eighth IEEE International Conference on Tools with Artificial Intelligence. 1996; 102-105.
[23] Muni D. P., Pal N. R., Das J. "Genetic Programming for Simultaneous Feature Selection and Classifier Design." IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 2006; 36(1); 106-117.
[24] Gasca E., Sánchez J. S., Alonso R. "Eliminating Redundancy and Irrelevance Using a New MLP-based Feature Selection Method." Pattern Recognition. 2006; 39(2); 313-315.
[25] Hsu C.-N., Huang H.-J., Schuschel D. "The ANNIGMA-Wrapper Approach to Fast Feature Selection for Neural Nets." IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 2002; 32(2); 207-212.
[26] Ben-Hdech A., Ghanou Y., Abderrahim E. L. "Robust Multi-combination Feature Selection for Microarray Data." Advances in Information Technology: Theory and Application. 2016.
[27] Ghanou Y., Bencheikh G. "Architecture Optimization and Training for the Multilayer Perceptron Using Ant System." IAENG International Journal of Computer Science. 2016; 28; 10.