Informatics in Medicine Unlocked 26 (2021) 100696
gradual association (parallel to the risk factor).
When a dataset contains many features, some of them are not useful and degrade results. The main aim of this research is therefore to use a combined method that improves classification and feature selection, leading to a better diagnosis of heart disease. In this study, the imperialist competitive algorithm, a meta-heuristic approach, is used to optimize the selection of important features for heart disease. This algorithm can provide a more optimal response for feature selection than genetic and other optimization algorithms. After pre-processing, the dataset is split into two sets: a training set (80%) and a testing set (20%). After feature extraction, the selected features are supplied to K-nearest neighbor (KNN), Naïve Bayes, and Support Vector Machine classifiers. The combination of these four methods can therefore improve the results of heart disease diagnosis in several respects; in other words, we are trying to improve classification accuracy in heart disease diagnosis. This combination of K-nearest neighbor (KNN), Naïve Bayes, and Support Vector Machine classifiers has not been proposed before. According to the simulation results section, the proposed method achieves better results than other algorithms and has two advantages: first, it decreases the number of features; second, it increases classification accuracy.
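As a sketch of the pipeline just described, the 80/20 split can be written in a few lines of Python; this is a minimal stand-alone illustration, and the toy features and labels below merely stand in for the pre-processed heart disease data:

```python
import random

def train_test_split(rows, labels, test_ratio=0.2, seed=42):
    """Shuffle sample indices, then hold out the last `test_ratio` fraction for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

# Toy stand-in for the pre-processed heart disease features and labels.
X = [[float(i), float(i % 3)] for i in range(10)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

X_tr, y_tr, X_te, y_te = train_test_split(X, y)
print(len(X_tr), len(X_te))  # 8 2
```

The held-out 20% never influences training, which is what makes the reported test accuracy meaningful.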
Objectives of this research are as follows:
• Data collection from new features about heart disease.
• Prediction and classification of incidence of heart disease using the
proposed method.
• Using new feature selection algorithms for the first time.
• Providing a new combined approach with higher accuracy.
The paper is organized as follows: Section 2 contains a literature review, Section 3 the design parameters, Section 4 the optimization methods, and Section 5 the research gap. The proposed methodology is presented in Section 6, data analysis and classification methods in Sections 7, 8, and 9, and finally the experimental results and conclusion in the last sections.
2. Literature review
Feature selection plays a significant role in any classification task. Swarm algorithms were later suggested and have proven valuable for feature selection. There are several studies on the classification of heart disease in the literature. One of them is the study of hybrid smart modeling schemes for the classification of heart disease by Shao et al. [4]. This paper uses 13 risk factors for heart disease prediction. The study, which differs from existing approaches, proposes a novel hybrid framework that combines various risk factors. The framework contains three methods: Multivariate Adaptive Regression (MAR), Logistic Regression (LR), and an Artificial Neural Network (ANN). Initially, the encoded values of the risk factors are reduced using LR and MAR; the remaining encoded factors are then used to train the ANN. The simulation results show that the hybrid approach outperforms a conventional single-stage neural network [4]. The study of the use of data mining techniques in the prediction of heart diseases by Priyanka et al. compared the performance of Naïve Bayes and decision tree algorithms; the decision tree yielded much more successful results than Naïve Bayes, with accuracy rates of 98.03% versus 82.35% [5]. Yekkala et al. [5] used Particle Swarm Optimization (PSO) in conjunction with ensemble methods (Random Forest, AdaBoost, and Bagged Tree) to predict the results more accurately. The Heart Statlog dataset, taken from the UCI database, has 270 samples and 14 attributes [5]. The data had already been processed, and PSO was used as a feature selection method to remove unnecessary and missing data. The significant features were tested on the ensemble classifier with various performance measures, following these steps: after loading the data collection, PSO was used to select the features; a cleaning technique was applied to remove useless attributes; the remaining powerful features were passed to AdaBoost, Bagging, and Random Forest, and the importance of the selected features was compared with the full feature set; finally, the performance of each algorithm was measured. As a result, Bagged Tree achieved 100%, Random Forest 90.37%, and AdaBoost 88.89%. According to the test results, Yekkala et al. [5] showed that using Bagged Trees with PSO improves learning accuracy in predicting heart disease. Amin et al. [6] present a heart disease prediction model using a genetic algorithm, neural network, Naïve Bayes, Bagged Trees, decision tree, kernel density, and SVM; learning is faster, more stable, and more accurate compared with back-propagation. Risk-factor data of 50 patients were collected, and the hybrid model achieved 96% training accuracy and 89% test accuracy. Amin and his colleagues then developed a system using a hybrid fuzzy and k-nearest neighbor approach to predict heart disease; in another system, a neural network ensemble was used, with an accuracy of 89.01% in the diagnosis of heart disease. The advantage of this hybrid system is that it helps patients reduce cost and time and monitor themselves with medical examinations before heart disease and its side effects develop. The researchers compared the algorithms using confusion matrices and found that J48 recorded the highest accuracy, at 99%. The k-nearest neighbor algorithm is simple, but it can give impressive results; it is a classification method widely used in many fields and appears among the top 10 data mining algorithms [8]. Typically, houses that are close to each other have similar characteristics; we can group them and give them a classification. The algorithm uses the same logic to group elements that are close to each other. There are two basic types of data mining techniques: predictive methods and descriptive methods [7].
• Descriptive methods: these identify the current situation, describe the common properties of the data in the dataset, and emphasize understanding and interpreting the features.
• Predictive methods: these simulate the future by learning from the past. They use data with known results to develop a model that can predict the values of other data.
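The nearest-neighbour intuition described above (nearby "houses" share a label) can be sketched as follows; this is a generic toy implementation, not the classifier configuration used in this paper:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Two "neighbourhoods": points that are close together share a label.
train_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, [1.1, 1.0]))  # A
print(knn_predict(train_X, train_y, [8.1, 8.0]))  # B
```

Because KNN stores the training data and defers all work to query time, it is an example of a predictive method in the taxonomy above: known outcomes are used directly to predict the label of new data.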
The stress risk factor seems to be compounded by the depressive state that follows myocardial infarction. According to some studies, the incidence of depression is higher in patients with CVD than in those without CVD. Several studies agree that after myocardial infarction, a depressive state increases the recurrence risk during the two years following the heart disease event [9]. Various approaches exist to explain the correlation between psychosocial factors (stress, anxiety, and depression) and CVD. These factors increase catecholamine synthesis, with consequences for the different metabolisms, blood pressure, and heart rate [10]. According to Ref. [11], data mining has three main axes: statistics, artificial intelligence (AI, including machine learning), and databases. Although these three axes are well specified, it is difficult to give a single definition of data mining. However, the most common description is probably the one stated in Ref. [11], where it is mentioned that "Data mining is the non-trivial process of extracting information from data that is present implicitly, previously unknown and potentially useful for the user."
This information is present in the data as patterns that are very useful when applied to solve problems in a particular context. Ah. E. Hegazy et al. [12] have highlighted how to improve the basic SSA structure to enhance its accuracy, convergence speed, and reliability. In this research, the authors presented a new control parameter to adjust the current solution, naming the improved algorithm the improved salp swarm algorithm (ISSA). The algorithm is tested on the feature selection task: ISSA is used as a wrapper feature selector, with a k-nearest neighbor classifier serving as the fitness function. In this work, they used 23 UCI datasets to evaluate ISSA's performance and compared it with four other swarm methods. They obtained results superior to the previous ones in terms of feature reduction
S.P. Patro et al.
and classification accuracy. For ISSA, the average classification accuracy is 0.8422. Pei Du et al. [13] have implemented a model for accurate and reliable air pollutant forecasting. Air pollution seriously affects human beings' living environment and can even endanger human lives. To overcome this problem, the researchers proposed a novel hybrid model: a robust data preprocessing method is implemented to decompose the original time series into different modes, comprising low-frequency and high-frequency components. For air pollution series prediction, the researchers tuned the ELM model parameters to achieve high forecasting accuracy and consistency. The authors used two experiments, PM2.5 forecasting and PM10 forecasting, to illustrate the hybrid model's superiority. Liyuan Gao et al. [14] have highlighted that early disease prediction and diagnosis are most important for improving patients' survival, and that it is critically important to recognize the patient's condition and its predictive characteristics. The authors provided a comparative analysis of various machine learning systems; by sampling with replacement, the unit calculates the standard deviation of the data. The researchers placed particular emphasis on analyzing and comparing machine learning strategies to predict breast cancer and heart disease and to identify early high-risk features. The results show that the Bayesian hyperparameter optimization model performs better than random search and grid search. Using the breast cancer diagnosis dataset with an Extreme Gradient Boosting model, they obtained 94.74% accuracy, and on the heart disease dataset, 73.50%. Ahmed A. Abusnaina et al. [15] have highlighted that pattern classification is the most popular application of neural networks, and that training the neural network is the most important step. The researchers noted that the back-propagation algorithm has a low convergence rate, a limitation the Salp Swarm Algorithm (SSA) overcomes; SSA gives good results on optimization problems. In this work, they proposed using SSA to optimize the weight coefficients of neural networks for pattern classification, with datasets from the UCI machine learning repository; the approach adjusts the parameters of the NN connection weights using the SSA algorithm. Zaher Mundher Yaseen et al. [16] have highlighted a classical, non-tuned extreme learning machine (ELM) model; this approach is based on a random procedure that is not efficient at converging to outstanding performance on local problems. In this work, the researchers investigated forecasting of the monthly flow of the Tigris river at Baghdad, using the Salp Swarm Algorithm together with an ELM. They took twenty years of river flow time-series data, and the results were evaluated using graphical presentations and several statistical measures. With the SSA-ELM model, the results showed an acceptable level of improvement, with the absolute metrics achieving 8.4% and 13.1% for RMSE and MAE, respectively. Youness Khourdifi et al. [17] highlighted that machine learning is
one of the key areas for predicting heart disease, and that optimization algorithms have the advantage of dealing with complex non-linear problems with high adaptability and flexibility. In this research, to improve the quality of heart disease classification, a method named Fast Correlation-Based Feature Selection (FCBF) is used to filter redundant features. The researchers then applied various classification algorithms, including support vector machine, k-nearest neighbor, naïve Bayes, and random forest, optimized by particle swarm optimization combined with ant colony optimization. With the proposed mixed approach applied to the heart disease dataset, the optimized models, including FCBF, PSO, and ACO, achieved a maximum classification accuracy of 99.65% with KNN and 99.6% with RF. Jiyang Wang et al. [18]
have proposed that reliable and effective load forecasting is one of the important factors for operational decisions and power system planning. The safety and economical operation of the power system are directly affected by forecasting accuracy, and because of the complexity and instability of power load, forecasting accuracy is a most challenging issue. Hence, the researchers proposed a novel hybrid forecasting system built around an embedded multi-objective module. A detailed review of the salp swarm algorithm and its critical characteristics was provided by Abualigah et al. [19]. SSA is one of the effective meta-heuristic optimization algorithms used for optimization problems; it can be applied in machine learning, wireless networking, engineering design, energy storage, and image processing. The authors presented a comprehensive review of the various SSA types, including the chaotic salp swarm algorithm, hybridizations of the salp swarm algorithm, the binary salp swarm algorithm, and others, and highlighted the algorithm's limitations, notably that SSA has little control over multimodal strategies. Finally, the review notes that SSA offers a few advantages: speed, simplicity, and easy hybridization with other optimization algorithms. Sobhi Ahmed et al. [20] have highlighted the effect of data dimensionality on classification algorithms' performance. With high-dimensional data, many problems arise because the classifier's computational time is high; to avoid this, feature selection is the best solution. The technique aims to reduce the number of features, removing irrelevant, noisy, and redundant data. The authors noted that metaheuristic algorithms are well suited to this type of problem and proposed a chaotic version of the Salp Swarm Algorithm, using four different chaotic maps to control the balance between exploration and exploitation. The researchers used twelve well-known datasets from the UCI data repository and a K-NN classifier as the wrapper feature selection evaluator, dividing each dataset into two parts: 80% for training and 20% for testing. Subrat Kumar Nayak et al. [21] highlighted how to deal with
real-world data, which are more involved; to handle such data, feature selection plays an important role. In this research, the authors presented a filter approach using a multi-objective differential evolution algorithm for feature selection, applied to handle the duplicate and unwanted features of a given dataset. The researchers pursued two objectives: removing redundant features and removing erroneous features, by evaluating their relevance with respect to the remaining features and the class labels. In this work, the feature subsets of 23 benchmark datasets were tested using 10-fold cross-validation with four different well-known classifiers. Yun Bai et al. [22] have highlighted PM2.5 concentration forecasting, which is useful and essential for protecting public health. The authors proposed an ensemble long short-term memory neural network (E-LSTM), implemented in three steps: multimodal feature extraction, multimodal feature learning, and integration. Real datasets were used, collected from environmental monitoring stations in Beijing, China. They developed various LSTMs in different modes within the E-LSTM and compared it with a single LSTM and a feed-forward neural network; the results achieved a mean absolute error of 19.604%, a root mean square error of 12.077, and a correlation coefficient of 0.994. Alani, H.,
et al. [23] highlighted that chronic kidney disease (CKD) leads to high mortality rates and high patient expenditure. CKD may be one of the critical factors leading to heart disease, and it is a significant cause of death among renal transplant patients. This kind of disease is mainly uremia-specific, and its prevalence increases as kidney function declines; the uremic state brings various risk factors, namely hemoglobin abnormalities, abnormal bone and mineral metabolism, and albuminuria. The disease often goes undetected because of a lack of diagnostic screening tools, a lack of the sensitivity and specificity needed to make them reliable, and the need for more RCT-quality evidence to guide intervention. Jacqueline O'Toole et al. [24] noted that if heart disease can be predicted earlier in young adults, the risk from future CVD burden can be reduced. In young adults, CVD risk is mostly discovered when they present with chest pain. This research analyzed lifestyle habits and CVD risks in young adults, using data from 26 young adults aged 39–40 years. The survey shows a low risk of getting heart disease
within ten years due to their age. Half of the young adults were identified with only two or more CVD risk factors, but most of the adults suffered from heart disease due to sedentary lifestyles and overweight. Puja Wieslaw et al. [25] noted that feature selection is the initial step in most knowledge discovery experiments. In this research, a tree-based generational feature selection application is presented for medical data analysis. The basic approach is to estimate the importance of attributes extracted from a given tree structure, with recursive application of the generation of feature sets: selected features are removed from the dataset, a next generation is created with the critical feature set, and the process continues until a crucial feature is no better than a random value. The researchers applied this process to real-world medical datasets, including the Colon dataset and the Lung dataset. In this work, they found that almost all truly relevant features (19 of them) were identified simultaneously, with average accuracy increasing from 0.68 to 0.74; only one relevant feature was not identified as important.
Juan-Jose Beunza et al. [26] highlighted the use of supervised machine learning algorithms for predicting clinical events, assessing their validity and accuracy. In this work, the data were taken from the Framingham heart study, which contains 4240 observations, and the focus was on heart disease risk factors in combination with data mining. The researchers used different machine learning algorithms, along with RapidMiner and R-Studio, to analyze the data. A neural network model in which all the missing values were omitted achieved an AUC of 0.71; using the same data with RapidMiner and support vector machines, they obtained an AUC of 0.75. Sinan Q. Salih et al. [27] highlighted that metaheuristic algorithms are well suited to solving different optimization and engineering problems, although such algorithms have a few issues with their global and local search capabilities. The researchers developed an algorithm called the nomadic people optimizer (NPO), which simulates the nature of nomadic people's movement: how they search for food, how they live over the years, and so on. This research focused on a multi-swarm approach in which several clans exist and each clan finds its best place. The algorithm was validated on 36 unconstrained benchmark functions, and the outcome of the research was a unique solution provided by the NPO algorithm.
Khaled Mohamad Almustafa et al. [28] highlighted that heart disease has become one of the most common diseases, and its early diagnosis is challenging for healthcare providers. In this research, various classifiers were implemented to classify a heart disease dataset and predict heart disease with minimal attributes. They collected the Cleveland and Switzerland dataset, which contains 76 characteristics, with a class attribute, for 1025 patients; of the 76 attributes, only 14 features were used in this work. The researchers applied various algorithms, including k-nearest neighbor, decision tree, Naïve Bayes, SVM, and stochastic gradient descent, for classification and for predicting heart disease cases. Using these classification algorithms, they obtained accuracies of 99.70% for KNN (k = 1), 97.26% for JRip, and 98.04% for the decision tree, respectively. K. Vembandasamy et al.
[29] treated health care as an essential factor in human life; the health-care business is nowadays becoming notable in medical science. The healthcare industry holds a large amount of patient data, and various data mining techniques are applied to these data to detect heart disease in patients. Because data mining techniques alone could not obtain significant test results from the hidden information, the researchers proposed a system using data mining algorithms to classify the data and detect heart disease. In this research, the Naïve Bayes algorithm was used to diagnose heart disease patients, with the experiments implemented in the Weka tool. The proposed Naïve Bayes model correctly classified 74% of the input instances, exhibiting an average precision of 71%, an average recall of 74%, and an F-measure of 71.2%. Vikas Chaurasia et al. [30] discussed heart disease as an
essential factor causing death, with most of these deaths occurring in low- and middle-income countries. The healthcare industry collects enormous amounts of heart disease data, but these data are not well mined to discover the hidden information needed for effective decision-making. In this research, the authors highlighted different knowledge discovery concepts in databases, using various data mining techniques to help medical practitioners make effective decisions. The work's primary aim was to predict the presence of heart disease more accurately with fewer attributes. The researchers took only 11 features and used three classifiers, the J48 decision tree, Naïve Bayes, and a bagging algorithm, to predict patients' diagnoses. The highest accuracy obtained was 85.03% and the lowest 82.31%, with the other algorithm yielding an average accuracy of 84.35%. Yudong Zhang et al. [31] highlighted that particle swarm
optimization is treated as a heuristic global optimization method and is one of the most commonly used optimization techniques. The authors presented a comprehensive investigation of PSO, covering its advances and modifications such as quantum-behaved PSO, chaotic PSO, and fuzzy PSO, and surveyed various applications of PSO in different areas: automation control systems, operations research, communication theory, fuel and energy, and so on. The work is organized around several aspects, including modifications of PSO, extensions of PSO, hybridization of PSO, parallel implementations of PSO, and theoretical analysis of PSO.
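As a generic illustration of the PSO principle covered by this survey, a minimal particle swarm minimizing a simple cost function might look as follows (a textbook sketch, not one of the surveyed variants):

```python
import random

def pso(f, dim=2, n_particles=20, iters=200, seed=1):
    """Minimal particle swarm optimization: each particle is pulled toward its
    own best position and toward the swarm's best position so far."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal bests
    gbest = min(pbest, key=f)[:]                # global best
    w, c1, c2 = 0.7, 1.5, 1.5                   # inertia and pull strengths
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

sphere = lambda x: sum(v * v for v in x)        # toy cost function, minimum at 0
best = pso(sphere)
print(sphere(best))  # a value very close to 0
```

All the variants the survey lists (quantum-behaved, chaotic, fuzzy) modify the velocity update or the random terms in this inner loop while keeping the same personal-best/global-best structure.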
Jianzhou Rizk-Allah et al. [32] highlighted the salp swarm algorithm, a recent meta-heuristic that imitates the behavior of salps while navigating and foraging. This research proposes a new version of SSA named the binary salp swarm algorithm (BSSA). The proposed BSSA is used to compare four different variants of transfer functions for solving several global optimization problems. In addition, nonparametric statistical tests, namely Wilcoxon's rank-sum test at the 5% significance level, were carried out to judge the statistical significance of the results obtained by the different algorithms. In this work, the results of BSSA were better than those of the other algorithms.
Patro, S. P. et al. [33] highlighted the challenge that health care for aging populations poses to social care. Heart disease and chronic illnesses are particularly dangerous for aged people, sometimes leading to a heart attack without any omens, and it is very difficult for doctors to identify the patient's status in time. In this regard, the researchers proposed a model that can address these challenges using remote, real-time patient health data. A framework was proposed for predicting heart disease from major risk factors based on different classifier algorithms, including Naïve Bayes, k-nearest neighbors, support vector machine, Lasso, and Ridge regression; in addition, for data classification the researchers used principal component analysis and linear discriminant analysis. They used an open-source dataset with 14 attributes. After successful implementation, the support vector machine provided 92% accuracy with an F1 score of 85%. Khan, M. A., et al. [34]
discussed IoT applications, including manufacturing, agriculture, healthcare, and others, focusing on wearable devices in health monitoring systems, known as the Internet of Medical Things (IoMT). With its help, clinical data analysis for the early detection of heart disease can reduce the mortality rate. In this research, they investigated the key characteristics of heart disease prediction using machine learning techniques. To improve the prediction accuracy of an IoMT framework designed to diagnose heart disease, they used modified salp swarm optimization with an adaptive neuro-fuzzy inference system. The proposed MSSO-ANFIS prediction model obtains an accuracy of 99.45% with a precision of 96.54%, which is higher than the other approaches. Wang, J., et al. [35] proposed the
coronary arteriography (CAG) approach for the diagnosis of coronary heart disease (CHD). In the healthcare industry, machine learning helps perform feature selection across multiple ML algorithms. In this research, they implemented two-level stacking, with a base level and a meta level; the predictions of the base-level classifiers are selected as the input of the meta level. They used the Z-Alizadeh Sani CHD dataset, consisting of 2020 CAG cases. The model obtained an accuracy, specificity, and sensitivity of 95.3%, 94.44%, and 95.84%, respectively.
The common goal of all the above-mentioned techniques is to classify heart disease using hybrid classification techniques. Much of this research is carried out using only classification and optimization techniques. The proposed approaches aim to achieve the desired results by pairing different optimization techniques with various machine learning algorithms.
In this research, different classifier approaches are proposed that combine several ensemble-based machine learning algorithms to identify redundant features and thereby improve the accuracy and quality of heart disease classification. We present a comparative analysis of heart disease dataset classification using various classification algorithms, all of which are commonly used in similar heart disease research. The classifiers were first used with 10-fold cross-validation; we then study the performance of the Bayesian Optimized Support Vector Machine (BO-SVM) and K-Nearest Neighbors (KNN) classifiers using various training set sizes instead of 10-fold cross-validation. Next, we apply different classifier algorithms: Naïve Bayes (NB), Bayesian Optimized Support Vector Machine (BO-SVM), K-Nearest Neighbors (KNN), and the Salp Swarm Optimized Neural Network (SSA-NN). The main goal of this research is to find the best accuracy for the prediction of heart disease using major risk factors and different classifier algorithms such as BO-SVM and KNN.
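The 10-fold cross-validation protocol mentioned above can be sketched as follows; the fold logic is generic, and the sample count of 303 merely reflects the size of the Cleveland heart disease dataset:

```python
def kfold_indices(n, k=10):
    """Split sample indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(n, k=10):
    """Yield (train_idx, test_idx) pairs: each fold serves as the test set once."""
    folds = kfold_indices(n, k)
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# 303 samples, as in the Cleveland heart disease dataset.
splits = list(cross_validate(303, 10))
print(len(splits))                           # 10
print(len(splits[0][0]), len(splits[0][1]))  # 272 31
```

Averaging a classifier's accuracy over the 10 test folds gives a less optimistic estimate than a single train/test split, which is why both protocols are compared in this work.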
3. Design parameters
Parameters, or design variables, are controlled factors that influence performance. They can be of various natures: geometric dimensions, material properties, structural choices, and so on, and they may be quantitative or qualitative, continuous or discrete. The selection and number of parameters also determine the definition of the optimization problem: many factors increase the search space, but the optimization process then takes longer. Constraints must also be considered, for example a suitable geometrical shape, to ensure the validity of the retained model and its proper functioning.
4. Optimization methods
4.1. Continuous optimization
Continuous optimization is done by two methods, the first linear and the second non-linear.
• Linear optimization in integers studies linear optimization problems in which some or all variables are constrained to take integer values.
• Non-linear optimization covers the general case in which the objective or the constraints (or both) contain non-linear, possibly non-convex, parts.
4.2. Combinatorial optimization
Combinatorial optimization consists of finding the best solution among a finite number of choices; in other words, minimizing a function, with or without constraints, over a finite set of possibilities. When the number of possible combinations grows exponentially with the problem's size, the computation time rapidly becomes critical.
A generalized optimization problem is solved if it consists of finding a solution s* that optimizes the value of the cost function f. Formally, we thus seek s* ∈ S such that

f(s*) ≤ f(s) for all s ∈ S (1)

Such a solution s* is called an optimal solution or a global optimum.
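For a small finite search space, definition (1) can be verified exhaustively; the cost function below is invented purely for illustration:

```python
from itertools import product

# Toy cost function over a finite search space S: binary feature masks of length 4.
def f(s):
    # cost = number of selected features, plus a penalty if feature 0 is off
    return sum(s) + (3 if s[0] == 0 else 0)

S = list(product([0, 1], repeat=4))       # the finite set of possibilities
s_star = min(S, key=f)                    # s* such that f(s*) <= f(s) for all s in S
assert all(f(s_star) <= f(s) for s in S)  # definition (1) holds
print(s_star, f(s_star))  # (1, 0, 0, 0) 1
```

Here the exhaustive scan is feasible only because |S| = 16; with the exponential growth noted above, meta-heuristics such as SSA or PSO replace the full enumeration.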
5. Research gap
Data quality is viewed along several basic dimensions (accuracy, timeliness, relevance, completeness, intelligibility, and reliability), mainly addressing the data's integrity in a particular research project. Missing data values cause errors, and data without a value create ambiguity, since they can be correct or wrong. The importance of data quality lies in the fact that decision-making efficiency depends on it: small improvements along these dimensions can lead to substantial improvements in the information available for decision-making. Hence, it is beneficial for organizations to have proven studies of the selection and evaluation of characteristics of computational learning techniques and to use hybrid technologies that improve the results obtained. Various methods have been created for the analysis of heart disease. Be that as it may, there is always room for improvement, and several systems are still being created to overcome the limitations of the current strategies. There are different data mining techniques for discovering relations between diseases, their symptoms, and prescriptions, although such methods have certain constraints: iteration count, discarding of consistent contentions, higher response time, and so forth. The principal limitation of the back-propagation neural network is that it yields a higher MSE (mean squared error) if the weight values are random (not optimized). Therefore, this research work uses the Salp Swarm algorithm to optimize the neural network's weight values and reduce the MSE; the accuracy of an SSA-optimized neural network is thus higher than that of a neural network alone. Similarly, the support vector machine is optimized by Bayesian optimization. The KNN and Naïve Bayes classification algorithms are also used for comparative analysis of the research work.
6. Proposed methodology
In this research work, we have focused on machine learning. Machine learning is a discipline comprising algorithms that work on empirical data in two ways: first, they identify complex relationships through the data's characteristics, and second, they employ the discovered patterns to make predictions.

Through such algorithms it is possible to find relationships between the observed variables: the machine learns from a data sample (training data) to capture characteristics that are not directly observed, via the underlying probability distribution, and the learned knowledge can then be used to make smarter decisions on new data. Machine learning algorithms are normally classified into different categories depending on the results; among these classifications are supervised learning and unsupervised learning.
When analyzing a large number of variables, we may face problems of high dimensionality. A variety of methods are used to avoid such problems; for example, using a one-step selection method before some other approach can increase the latter's power. The strategies fall into two general categories: (1) dimension-reduction methods and (2) variable-selection methods.
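As a minimal illustration of the variable-selection strategy (a sketch only, not the method used in this paper; the toy data, function names, and the choice of Pearson correlation as the ranking score are ours), features can be ranked by their absolute correlation with the target and the weakest dropped:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(columns, target, keep=2):
    """Keep the `keep` features most correlated (in absolute value) with the target."""
    return sorted(columns, key=lambda name: -abs(pearson(columns[name], target)))[:keep]

# Toy columns: 'bp' tracks the target closely, 'noise' does not.
cols = {"bp": [1, 2, 3, 4], "chol": [2, 1, 4, 3], "noise": [3, 1, 4, 1]}
target = [1, 2, 3, 4]
print(select_features(cols, target))  # ['bp', 'chol']
```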
6.1. Data analysis and encoding
In this work, the Cleveland dataset is used. The data were taken in the form of a matrix of rows and columns, from which we predict heart disease. Several heart disease datasets are available in the UCI repository: Hungarian, Cleveland, and Switzerland. The dataset contains 76 attributes and 303 records, but all published experiments refer to a subset of 14 of them. The target column includes two classes: 1 indicates heart disease, 0 otherwise.
The important risk factors of the dataset are revealed in Table 1. The
table consists of various risk factors and their corresponding values along
with the encoded values in brackets. These encoded values will be used
as input to the proposed framework.
Machine learning methods are dynamic because they usually contain several parameters that need to be optimized for best performance, and optimizing them manually can be tedious and error-prone. Therefore, two of the classification approaches, SVM and the neural network, are optimized by Bayesian Optimization and Salp Swarm Optimization, respectively. Two standalone classification methods, KNN and Naïve Bayes, are also used. These approaches are proposed to define an optimum number of clusters in the analyzed data.
The proposed methodology is shown in Fig. 1.
7. Classification using K-Nearest neighbor
First, three parameters are considered: the sample data, the number of nearest neighbors to select (K), and the point we want to evaluate (X). Subsequently, for each element Xi of the learning set, we assess the distance between the reference point X and Xi and check whether it is less than one of the distances contained in the nearest-neighbors list. If so, the point is added to the list, and if the number of items in the list then exceeds K, the farthest value is simply removed. Fig. 2 illustrates classification with the K-Nearest Neighbor method. The algorithm itself is not very complicated and can be run by brute force if the sample is not too big. However, since we are talking about data mining, the number of individuals to be evaluated is often very large, which is why an optimization structure is needed. There are many types of trees to speed up the search, such as the k-d tree or the ball tree; the ball tree algorithm will be covered later in this report. Here is the pseudo-code representing the algorithm [37].
Phase 1: For heart disease prediction, the dataset is taken from one of the foremost repositories, the UCI Machine Learning Repository, a collection of datasets used to analyze machine learning algorithms.

Phase 2: The data preprocessing step remains the same as for Logistic Regression: cleaning and organizing the raw data for building and training the machine learning models. Generally, data preprocessing for machine learning follows specific steps, such as:
1.1. Import libraries.
1.2. Import the dataset; datasets usually come in CSV format.
1.3. Focus on missing data in the dataset. To identify missing data, we use the Scikit-Learn preprocessing library, which
Table 1
Risk factors and their corresponding encodings [36].

S. No.  Risk factor        Values
1       Sex                Male (1), Female (0)
2       Age (years)        20-34 (-2), 35-50 (-1), 51-60 (0), 61-79 (1), >79 (2)
3       Blood cholesterol  Below 200 mg/dL - Low (-1); 200-239 mg/dL - Normal (0); 240 mg/dL and above - High (1)
4       Blood pressure     Below 120 mm Hg - Low (-1); 120-139 mm Hg - Normal (0); Above 139 mm Hg - High (1)
5       Hereditary         Family member diagnosed with HD - Yes (1), otherwise No (0)
6       Smoking            Yes (1) or No (0)
7       Alcohol intake     Yes (1) or No (0)
8       Physical activity  Low (-1), Normal (0) or High (1)
9       Diabetes           Yes (1) or No (0)
10      Diet               Poor (-1), Normal (0) or Good (1)
11      Obesity            Yes (1) or No (0)
12      Stress             Yes (1) or No (0)
Output  Heart disease      Yes (1) or No (0)
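The encodings of Table 1 can be read as simple lookups. A minimal sketch (the function names and the record below are illustrative, not from the paper):

```python
def encode_age(years):
    """Age bands from Table 1: 20-34 -> -2, 35-50 -> -1, 51-60 -> 0, 61-79 -> 1, >79 -> 2."""
    if years <= 34: return -2
    if years <= 50: return -1
    if years <= 60: return 0
    if years <= 79: return 1
    return 2

def encode_cholesterol(mg_dl):
    """Below 200 -> -1 (low), 200-239 -> 0 (normal), 240 and above -> 1 (high)."""
    if mg_dl < 200: return -1
    if mg_dl < 240: return 0
    return 1

def encode_bp(mm_hg):
    """Below 120 -> -1 (low), 120-139 -> 0 (normal), above 139 -> 1 (high)."""
    if mm_hg < 120: return -1
    if mm_hg < 140: return 0
    return 1

# A raw record becomes a numeric feature vector for the classifiers.
record = {"age": 57, "chol": 250, "bp": 130, "smoking": 1}
features = [encode_age(record["age"]), encode_cholesterol(record["chol"]),
            encode_bp(record["bp"]), record["smoking"]]
print(features)  # [0, 1, 0, 1]
```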
Fig. 1. Proposed methodology.
contains a class called “imputer,” which will help us take care of
the missing data.
1.4. Encoding categorical data.
1.5. Split the dataset into a Training set and Test set.
1.6. Feature scaling. This is the final step of data preprocessing.
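Steps 1.3, 1.5, and 1.6 can be sketched in plain Python (a sketch under our own assumptions: missing values are mean-imputed by hand rather than with Scikit-Learn's imputer class, and the 80/20 split follows the proportions stated in the introduction):

```python
import random

def impute_mean(column):
    """Replace None entries with the mean of the observed values (step 1.3)."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def min_max_scale(column):
    """Scale a numeric column to [0, 1] (step 1.6, feature scaling)."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle and split rows into training and testing sets (step 1.5)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_ratio))
    return rows[:cut], rows[cut:]

chol = impute_mean([200, None, 240, 180])   # None becomes the mean, 206.67
scaled = min_max_scale(chol)
train, test = train_test_split(list(range(10)))
print(len(train), len(test))  # 8 2
```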
Phase 3: Training dataset: here we fit the K-NN classifier to the training data. To do this, we import the KNeighborsClassifier class of the Sklearn Neighbors library; after importing the class, we can create the classifier object of the class.

Phase 4: After evaluating the training dataset, it goes through the K-Nearest Neighbor method for the prediction of the class. We implement the KNN algorithm, which works as follows:
4.1. Load the data.
4.2. Initialize K to your chosen number of neighbors.
4.3. Compare with the actual/desired output. To obtain the predicted class, repeat the process from 1 to the total number of training data points: calculate the distance between the test data and each training point using the most popular distance metric, the Euclidean distance, and sort the indices into an ordered collection. Then pick the first K entries from the sorted array and take the most frequent class among the selected K entries.
4.4. If there is any error, repeat steps 1 to 3; otherwise, return the predicted class.
The class of x is determined from the classes of the examples whose numbers are stored in the KNN list.
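Phase 4 can be sketched from scratch as follows (a minimal sketch with illustrative toy data; the paper itself uses Scikit-Learn's KNeighborsClassifier):

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two feature vectors (step 4.3)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, x, k=3):
    """train: list of (feature_vector, label) pairs; x: the point to classify."""
    # Compute all distances, sort, and take the first K entries...
    nearest = sorted(train, key=lambda pair: euclidean(pair[0], x))[:k]
    # ...then return the most frequent class among them.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy encoded records: (features, heart-disease label)
train = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0),
         ([2, 2], 1), ([2, 3], 1), ([3, 2], 1)]
print(knn_predict(train, [2.5, 2.5], k=3))  # 1
print(knn_predict(train, [0.2, 0.4], k=3))  # 0
```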
8. Classification using Naïve Bayes classifier
The Naïve Bayes technique is based on the theory of probability: conditional probabilities, calculated from frequencies, are used to predict new cases. Fig. 3 shows the Naïve Bayes classifier-based approach.
Let E and F be events. We can express E as:

E = EF ∪ EF^c (2)

That is, for E to occur, either E and F occur, or E occurs and F does not. Because EF and EF^c are mutually exclusive, we have:

P(E) = P(EF) + P(EF^c) = P(E|F)P(F) + P(E|F^c)P(F^c) = P(E|F)P(F) + P(E|F^c)(1 − P(F)) (3)

Equation (3) states that the probability of event E is a weighted average of the conditional probability of E given that F has occurred and the conditional probability of E given that F has not occurred. Each conditional probability receives as much weight as the conditioning event
Fig. 2. Classification using K-Nearest Neighbor.
tends to occur.
Equation (3) can be generalized as follows: suppose that events F1, F2, …, Fn are mutually exclusive and exhaustive, i.e. ∪_{i=1}^{n} F_i = S, where S is the sample space; in other words, precisely one of the events will occur (Fig. 4). One can then write:

E = ∪_{i=1}^{n} EF_i (4)
From the definition of conditional probability, we have:

P(EF_i) = P(E|F_i)P(F_i) (5)

Furthermore, using the fact that the events EF_i, i = 1, …, n, are mutually exclusive, we obtain:

P(E) = Σ_{i=1}^{n} P(EF_i) = Σ_{i=1}^{n} P(E|F_i)P(F_i) (6)
Thus, equation (6) shows how, for given events F1, F2, …, Fn of which one and only one can occur, P(E) can be calculated by conditioning on which F_i occurs. That is, P(E) equals the weighted average of the P(E|F_i), where each term is weighted by the probability of the event on which it is conditioned. Now suppose that E has occurred and we want to determine the probability that event F_j has occurred. By equation (6) we have:
P(F_j|E) = P(EF_j)/P(E) = P(E|F_j)P(F_j) / Σ_{i=1}^{n} P(E|F_i)P(F_i) (7)
Equation (7) is known as the Bayes formula. Thus, we can consider E as evidence for F_j and calculate the probability that F_j has occurred given the evidence, P(F_j|E). Now suppose there is evidence E1, E2, …, Em from multiple sources. Proceeding as in equation (7):

P(F_j|E1E2…Em) = P(E1E2…Em|F_j)P(F_j) / P(E1E2…Em) (8)
The above equation will be used to obtain results.
The assumption that gives rise to the adjective "Naïve" is independence between the variables, which is not always true. However, the approach is efficient in practice, since the relevant knowledge lies in which probabilities are comparatively large rather than in the exact values of the probabilities themselves.
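As a worked instance of equations (6) and (7), with illustrative numbers: let F1 be "has heart disease" with prior P(F1) = 0.3, F2 its complement, and E a positive test with P(E|F1) = 0.9 and P(E|F2) = 0.2:

```python
# Total probability (eq. 6): P(E) = sum_i P(E|Fi) P(Fi)
priors = {"F1": 0.3, "F2": 0.7}          # mutually exclusive and exhaustive
likelihoods = {"F1": 0.9, "F2": 0.2}     # P(E | Fi)

p_e = sum(likelihoods[f] * priors[f] for f in priors)   # 0.27 + 0.14 = 0.41

# Bayes formula (eq. 7): P(Fj | E) = P(E|Fj) P(Fj) / P(E)
posterior_f1 = likelihoods["F1"] * priors["F1"] / p_e
print(round(p_e, 2), round(posterior_f1, 3))  # 0.41 0.659
```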
9. Classification using Bayesian Optimized SVM classifier
Fig. 5 shows the Bayesian optimized SVM classifier-based approach
described in the following subheadings.
Fig. 3. Basic block diagram of the proposed Naïve Bayes method.
Fig. 4. Event E occurs in conjunction with one of the mutually exclusive events
Fj [38].
9.1. Support vector machines (SVM)
Today, multi-class classification is used in many real-world problems, whereas Support Vector Machines were originally designed for binary (±1) problems. The multi-class objective function is expressed as:

min over w_r ∈ H, ε_i^r ∈ R^m, b_r ∈ R:  (1/2) Σ_{r=1}^{M} ‖w_r‖² + (C/m) Σ_{i=1}^{m} Σ_{r≠y_i} ε_i^r (9)

Subject to:

⟨w_{y_i}, x_i⟩ + b_{y_i} ≥ ⟨w_r, x_i⟩ + b_r + 2 − ε_i^r,  ε_i^r ≥ 0 (10)

where r ∈ {1, …, M} \ {y_i} and y_i ∈ [1, …, M] is the multi-class label of the pattern x_i.
In terms of precision, the results obtained with this approach are
comparable to those obtained directly using the one against the rest
method. For practical problems, the choice of approach depends on the
available limitations, and relevant factors include the precision
required, the time available for development, the processing time, and
the classification problem’s nature.
9.2. Bayesian Optimization of support vector machine
Bayesian Optimization (BO)'s main idea is to sequentially construct a surrogate probabilistic model to infer the objective function. Iteratively, new observations are made and the model is updated; reducing its uncertainty allows working with a known and cheaper model, which is used to construct a utility function that determines the next point to evaluate. The different steps of the BO methodology are described below.
First, the a priori model must be chosen over the space of possible functions. Different parametric approaches can be used, such as Beta-Bernoulli bandits or (generalized) linear models, as well as nonparametric models such as Student-t processes or Gaussian processes. Then the following is repeated until a particular stopping criterion is met: the prior and the likelihood of the observations so far are combined to obtain a posterior distribution. This is done using Bayes' theorem, hence the origin of the name.

Recall Bayes' theorem. Let A and B be two events such that the conditional probability P(B|A) is known; then the probability P(A|B) is given by:
P(A|B) = P(B|A)P(A) / P(B) (11)
where P(A) is the a priori probability, P(B|A) is the probability of event B conditional on the occurrence of A, and P(A|B) is the posterior probability. A particular utility function is then maximized on the a posteriori model to determine the next point to evaluate, and the new observation is collected; this repeats until the stopping criterion is met. The primary task here is to resolve feature subset selection together with SVM parameter tuning. Since the standard SVM approach uses a discretization technique for continuous parameters, it suffers information loss and less accurate results. This work therefore discusses an algorithm that can tune the SVM parameters. The algorithm is proposed to optimize two SVM parameters: the weight C and the kernel function. The first parameter, C, identifies the trade-off between misclassifying specific points and correctly classifying the others, while the second parameter, the kernel, is used to tune the SVM and select the feature subset simultaneously.
Fig. 5. Basic block diagram of the proposed BO-SVM method.
9.3. BO-SVM algorithm
In the above algorithm, the roles of the variables are as follows:

i. K holds the solutions received from the records.
ii. M holds the number of models used to generate solutions.
iii. Q holds the algorithm parameter that controls diversification of the search process.
iv. C holds the soft-margin parameter.
v. Y is the kernel-function parameter, called the margin or width parameter.
vi. Finally, the termination condition yields the best values for the SVM parameters (C and Y).
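The BO loop described above can be sketched as follows (a simplified, illustrative sketch: a one-dimensional Gaussian-process surrogate with a lower-confidence-bound utility over a toy stand-in objective, not the full BO-SVM procedure; all function names and constants are ours):

```python
import math

def rbf(a, b, ls=0.5):
    """Squared-exponential kernel."""
    return math.exp(-((a - b) ** 2) / (2 * ls * ls))

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A is small and positive definite here)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_posterior(xs, ys, x, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x, given observations (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, ys)                      # K^-1 y
    k_star = [rbf(a, x) for a in xs]
    mean = sum(k * al for k, al in zip(k_star, alpha))
    v = solve(K, k_star)                      # K^-1 k*
    var = max(1e-12, rbf(x, x) - sum(k * vi for k, vi in zip(k_star, v)))
    return mean, var

def bayes_opt(f, lo, hi, iters=12):
    """Sequentially evaluate f where the surrogate's lower confidence bound is smallest."""
    xs, ys = [lo, hi], [f(lo), f(hi)]
    grid = [lo + (hi - lo) * i / 60 for i in range(61)]
    for _ in range(iters):
        def lcb(x):
            m, v = gp_posterior(xs, ys, x)
            return m - 2 * math.sqrt(v)       # low mean or high uncertainty wins
        x_next = min(grid, key=lcb)
        xs.append(x_next)
        ys.append(f(x_next))
    i = min(range(len(xs)), key=lambda j: ys[j])
    return xs[i], ys[i]

# Toy stand-in for a cross-validated SVM error as a function of one hyperparameter.
objective = lambda c: (c - 1.0) ** 2 + 0.1
x_best, y_best = bayes_opt(objective, -3.0, 3.0)
print(round(y_best, 2))
```

The acquisition step is where BO differs from grid or random search: points are chosen where the surrogate is either promising or still uncertain, so expensive evaluations concentrate near the optimum.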
10. Classification using salp swarm optimized neural network
classifier
Fig. 6 shows the salp swarm optimization of the neural network-
based approach described in the following subheadings.
11. Neural network (NN)
In this work, different neural network structures with one hidden layer are tested, starting from a number of neurons equal to the average of the number of inputs and the number of outputs. The number of neurons in that layer is then gradually increased until the structure most suitable for predicting heart disease is found. The best network structure is selected considering the following evaluation measures, in and out of sample: the RMSE (Root Mean Square Error) and the MAPE (Mean Absolute Percentage Error), calculated using equations (12) and (13).
RMSE = √( (1/n) Σ_{t=1}^{n} (y¹_t − y_t)² ) (12)

MAPE = (100/n) Σ_{t=1}^{n} | (y¹_t − y_t) / y_t | (13)
Here, n is the number of observations, y_t is the real value, and y¹_t is the value estimated by the model. The Salp Swarm Algorithm is used for
Fig. 6. Basic block diagram of the proposed SSA-NN method.
determining bias and weight values.
Fitness Function: The fitness function’s objective is to minimize the
MSE between the target class and the predicted class of the training data.
It is also a function of bias and weight.
min F(w, v) = Σ_{t=1}^{q} [c_t − (w x_t + v)]² (14)
where x_t is the input and c_t is the target output. The Salp Swarm Algorithm is utilized to find optimal bias and weight values so as to minimize the objective function of equation (14).
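Equations (12)-(14) translate directly into code (a sketch; y¹_t is written `pred`, and the linear model of equation (14) is kept one-dimensional for illustration):

```python
import math

def rmse(pred, actual):
    """Eq. (12): root mean square error between estimates and real values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / n)

def mape(pred, actual):
    """Eq. (13): mean absolute percentage error."""
    n = len(actual)
    return (100 / n) * sum(abs((p - a) / a) for p, a in zip(pred, actual))

def mse_fitness(w, v, xs, targets):
    """Eq. (14): squared error of the linear model w*x + v against targets c_t."""
    return sum((c - (w * x + v)) ** 2 for x, c in zip(xs, targets))

pred, actual = [1.0, 2.0, 4.0], [1.0, 2.0, 5.0]
print(round(rmse(pred, actual), 3))                   # 0.577
print(round(mape(pred, actual), 2))                   # 6.67
print(mse_fitness(1.0, 0.0, [1.0, 2.0], [1.0, 2.0]))  # 0.0 (perfect fit)
```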
12. Salp swarm algorithm (SSA)
A salp looks like a barrel-shaped planktic tunicate belonging to the family Salpidae. Its body is very similar in texture to that of a jellyfish, and it moves like a jellyfish, propelling itself by pumping water through its body. Because of this behavior, it is difficult to reach their habitats or to keep these creatures in a laboratory environment, so biological research on them is only a starting point. The main idea of the algorithm concerns the swarming behavior of salps: in deep oceans, salps swarm in a chained structure called a salp chain. To this day the reason for this behavior is unknown, but some researchers believe it achieves better locomotion through fast, coordinated variations and foraging.
A mathematical model of the swarm behavior is designed with the help of salp chains. It divides the population into two parts: followers and leaders. In the salp chain, the leader is always at the front and the others follow. The goal in the search space is a food source, called TF, at which the whole swarm is targeted. The position of the leader salp is updated with respect to the target food source by the following formula:
x¹_j = TF_j + c1((ub_j − lb_j)c2 + lb_j),  if c3 ≥ 0
x¹_j = TF_j − c1((ub_j − lb_j)c2 + lb_j),  if c3 < 0  (15)
Here, x¹_j represents the leader salp's position in the jth dimension, TF_j represents the target food source in the jth dimension, c1, c2, and c3 are random numbers, and ub_j and lb_j are, respectively, the upper and lower boundaries of the jth dimension. The coefficient c1 balances the exploration (global search) and exploitation (local search) phases of the search space, which is why it is the most important parameter of the SSA algorithm; it is written mathematically as [39]:
c1 = 2 e^(−(4m/M)²) (16)
Here, m represents the current iteration, while M represents the total number of iterations (here, M = 100). The coefficients c2 and c3 are random numbers produced uniformly in the range [0, 1]. Each follower salp updates its position according to the following equation [39]:
x^i_j = (1/2)(x^i_j + x^{i−1}_j),  ∀ i ≥ 2 (17)
Equation (17) shows that each follower salp follows the one ahead of it, so that the salps form a chain. Here, x^i_j denotes the position of the ith follower salp in the jth dimension. As in all swarm-based optimization techniques, the starting positions of the salps are generated randomly [39].
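Equations (15)-(17) can be sketched as follows (a minimal sketch minimizing a sphere function; the c3 threshold of 0.5 is a common implementation choice, since the paper's condition c3 ≥ 0 never selects the minus branch for c3 in [0, 1], and the function name, test function, and bounds are ours):

```python
import math
import random

def ssa_minimize(f, dim, lb, ub, n_salps=30, max_iter=100, seed=1):
    """Minimize f over [lb, ub]^dim with a salp chain (one leader, the rest followers)."""
    rng = random.Random(seed)
    # Starting positions are generated randomly, as in all swarm-based techniques.
    salps = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_salps)]
    food = min(salps, key=f)[:]                           # TF, best solution so far
    for m in range(1, max_iter + 1):
        c1 = 2 * math.exp(-((4 * m / max_iter) ** 2))     # eq. (16)
        for i, s in enumerate(salps):
            for j in range(dim):
                if i == 0:                                # leader: eq. (15)
                    c2, c3 = rng.random(), rng.random()
                    step = c1 * ((ub - lb) * c2 + lb)
                    s[j] = food[j] + step if c3 >= 0.5 else food[j] - step
                else:                                     # follower: eq. (17)
                    s[j] = (s[j] + salps[i - 1][j]) / 2
                s[j] = min(ub, max(lb, s[j]))             # keep inside the bounds
        candidate = min(salps, key=f)
        if f(candidate) < f(food):                        # food only ever improves
            food = candidate[:]
    return food, f(food)

sphere = lambda x: sum(v * v for v in x)
best, fitness = ssa_minimize(sphere, dim=3, lb=-5.0, ub=5.0)
print(round(fitness, 4))
```

As c1 decays, the leader's steps around the food source shrink, moving the chain from exploration to exploitation.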
13. Experimental result
Table 2 lists the simulation parameters.
13.1. Evaluation parameters
The formulas shown below are used to calculate accuracy, precision,
and sensitivity.
Accuracy: the percentage of data correctly classified:

Accuracy = (TP + TN) / (TP + TN + FP + FN) (18)

Precision (P): the percentage classified correctly among those predicted positive:

P = TP / (total classified positive) (19)

Sensitivity: the percentage of the positive class correctly identified:

S = TP / (total positive) (20)
13.2. Simulation results for KNN
Fig. 7 shows the confusion matrix for the K-Nearest Neighbor based approach. The matrix describes the performance of the KNN model in terms of target class versus output class values.

Here, TP = 3, TN = 9, FP = 1, FN = 2.
Fig. 7. Confusion matrix for KNN based approach.
Table 2
Simulation parameters.

Salp Swarm Algorithm:   number of swarms = 30; maximum iterations = 100
Neural Network:         feed-forward network with 1 hidden layer of 6 neurons; scaled conjugate training
Bayesian Optimization:  K-10 fold cross-validation
SVM:                    -
Accuracy = (TP + TN)/(TP + TN + FP + FN) = (3 + 9)/(3 + 9 + 1 + 2) = 80%
Precision = TP/(TP + FP) = 3/(3 + 1) = 75%
Sensitivity = TP/(TP + FN) = 3/(3 + 2) = 60%
13.3. Simulation results for Naïve Bayes

Fig. 8 shows the confusion matrix for the Naïve Bayes based approach. The matrix describes the performance of the NB model in terms of target class versus output class values.

Here, TP = 4, TN = 9, FP = 1, FN = 1.

Accuracy = (TP + TN)/(TP + TN + FP + FN) = (4 + 9)/(4 + 9 + 1 + 1) = 86.7%
Precision = TP/(TP + FP) = 4/(4 + 1) = 80%
Sensitivity = TP/(TP + FN) = 4/(4 + 1) = 80%
13.4. Simulation results for SSA-NN
Fig. 9 shows the confusion matrix for the Neural Network based approach. The matrix describes the performance of the Neural Network model in terms of target class versus output class values.

Here, TP = 3, TN = 9, FP = 1, FN = 2.
Accuracy = (TP + TN)/(TP + TN + FP + FN) = (3 + 9)/(3 + 9 + 1 + 2) = 80%
Precision = TP/(TP + FP) = 3/(3 + 1) = 75%
Sensitivity = TP/(TP + FN) = 3/(3 + 2) = 60%
Fig. 10 shows the simulation results for the Neural Network based approach, and Fig. 11 shows the mean squared error performance graph for the Neural Network based approach.

Fig. 12 shows the confusion matrix for the Salp Swarm optimized Neural Network based approach.

Here, TP = 3, TN = 10, FP = 0, FN = 2.
Accuracy = (TP + TN)/(TP + TN + FP + FN) = (3 + 10)/(3 + 10 + 0 + 2) = 86.7%
Precision = TP/(TP + FP) = 3/(3 + 0) = 100%
Sensitivity = TP/(TP + FN) = 3/(3 + 2) = 60%
Fig. 13 shows the simulation results for the Salp Swarm optimized Neural Network based approach, and Fig. 14 shows the mean squared error performance graph for the same approach.
Fig. 9. Confusion matrix for a neural network-based approach.
Fig. 8. Confusion matrix for Naïve Bayes based approach.
Fig. 10. Simulation result.
Fig. 11. Mean squared error performance graph for a neural network-
based approach.
Fig. 12. Confusion matrix for salp swarm optimized neural network-
based approach.
13.5. Simulation results for BO-SVM

The confusion matrices for SVM and BO-SVM are shown in Fig. 15 and Fig. 16, respectively.

For SVM: TP = 4, TN = 8, FP = 2, FN = 1.

Accuracy = (TP + TN)/(TP + TN + FP + FN) = (4 + 8)/(4 + 8 + 2 + 1) = 80%
Precision = TP/(TP + FP) = 4/(4 + 2) = 66.7%
Sensitivity = TP/(TP + FN) = 4/(4 + 1) = 80%
For BO-SVM: TP = 4, TN = 10, FP = 0, FN = 1.
Fig. 13. Output for salp swarm optimized neural network-based approach.
Fig. 14. Mean squared error performance graph for salp swarm optimized
neural network-based approach.
Fig. 15. Confusion matrix for SVM based approach.
Accuracy = (TP + TN)/(TP + TN + FP + FN) = (4 + 10)/(4 + 10 + 0 + 1) = 93.3%
Precision = TP/(TP + FP) = 4/(4 + 0) = 100%
Sensitivity = TP/(TP + FN) = 4/(4 + 1) = 80%
The objective function model for SVM is shown in Fig. 17 and the function evaluation graph in Fig. 18. The classification process was developed using MATLAB R2018a. In this study, four classification techniques were applied in order to compare them and observe which provided greater accuracy and less error in the predictions related to heart disease. Each confusion matrix shows the accuracy, precision, and sensitivity of the particular method applied. The comparative results of each method are presented in Table 3 for better analysis.

Fig. 19 shows a graphical representation of the comparative results for the various classification methods.
14. Conclusion

Classification in data mining ensures a reduction in the problem's size, which shortens learning time and simplifies the learned model; this simplification generally facilitates the interpretation of the model. It also makes it possible to avoid over-learning, improving the accuracy of the prediction and the understanding of the classifier. In this research work, the KNN and Naïve Bayes methods are used standalone for classification, while the Salp Swarm Algorithm optimizes the bias and weight values of the Neural Network, and the weight and kernel function of the SVM are optimized by Bayesian Optimization. It can be observed from the confusion matrix plots that the optimization methods are very useful in heart disease prediction. In this research, the Bayesian optimized SVM-based approach exceeds the other methods with a maximum accuracy of 93.3%.
Fig. 17. Objective function model for SVM.
Table 3
Comparative results for various classification methods.
Proposed Method Accuracy Precision Sensitivity
KNN 80% 75% 60%
Naïve Bayes 86.7% 80% 80%
NN 80% 75% 60%
SSA-NN 86.7% 100% 60%
SVM 80% 66.7% 80%
BO-SVM 93.3% 100% 80%
Fig. 16. Confusion matrix for Bayesian optimized-SVM based approach.
Fig. 18. Minimum objective vs. the number of function evaluations graph.
Fig. 19. Comparative results graph representation for various classifica
tion methods.
14.1. Future scope
1. From a future perspective, it is necessary to formalize alliances and work together with the institutions that collect data at the forefront of knowledge, so that this work can be applied to a real problem at the country level and contribute to our society.
2. This work delivers an application that can support medical personnel in medical decision-making, though it is limited to discrete data variables.
3. In future work, the approach can also be extended to detect heart disease, cancer, arthritis, and other chronic diseases.
4. As the developed system is generalized, it can be used to analyze various datasets in the future.
5. Deep learning algorithms can be used to increase accuracy.
Conflict of interest statement

The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements) or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.
Acknowledgement

Sibo Prasad Patro, Assistant Professor, Dept. of CSE, GIET University, Gunupur - 765022. 14/05/2021.
This research work is particularly concerned with the classification of data. Classification allows us to obtain a prediction model from training data and test data: these data are screened by a classification algorithm that produces a model capable of assigning data to the appropriate classes through a combination of mathematical tools and computer methods. Data analysis in medicine is becoming more frequent, to clarify diagnoses, refine research methods, and plan appropriate equipment supplies according to the importance of the pathologies that appear. To analyze the present data and predict optimal results, we need optimization techniques. This research work aims at a framework for the prediction of heart disease from major risk factors, based on different classifier algorithms such as Naïve Bayes (NB), Bayesian Optimized Support Vector Machine (BO-SVM), K-Nearest Neighbors (KNN), and Salp Swarm Optimized Neural Network (SSA-NN).
References
[1] https://medium.com/analytics-vidhya/heart-disease-prediction-with-ensemble-learning-74d6109beba1.
[2] Felman, A. (2018). Everything you need to know about heart disease. Medical
News Today, https://www.medicalnewstoday.com/articles/237191#types,
accessed date : 05/02/2021.
[3] Thomas J, Princy RT. March. “Human heart disease prediction system using data
mining techniques,”. In: 2016 international conference on circuit, power and
computing technologies (ICCPCT). IEEE; 2016. p. 1–5.
[4] Shao YE, Hou CD, Chiu CC. Hybrid intelligent modeling schemes for heart disease
classification. Appl Soft Comput 2014;14:47–52.
[5] Yekkala I, Dixit S, Jabbar MA. August. Prediction of heart disease using ensemble
learning and Particle Swarm Optimization. In: 2017 international conference on
smart technologies for smart nation (SmartTechCon). IEEE; 2017. p. 691–8.
[6] Amin SU, Agarwal K, Beg R. April. Genetic neural network-based data mining in
the prediction of heart disease using risk factors. In: 2013 IEEE conference on in
formation & communication technologies. IEEE; 2013. p. 1227–31.
[7] Tan PN, Chawla S, Ho CK, Bailey J, editors. Advances in knowledge discovery and
data mining, Part II: 16th Pacific-Asia conference, PAKDD 2012, Kuala Lumpur,
Malaysia, may 29-June 1, 2012, Proceedings, Part II, vol. 7302. Springer; 2012.
[8] Chandel K, Kunwar V, Sabitha S, Choudhury T, Mukherjee S. A comparative study
on thyroid disease detection using K-nearest neighbor and Naive Bayes classifica
tion techniques. CSI Trans. ICT 2016;4(2–4):313–9.
[9] Lépine JP, Briley M. The increasing burden of depression. Neuropsychiatric Dis
Treat 2011;7(Suppl 1):3.
[10] Gielen S, Schuler G, Adams V. Cardiovascular effects of exercise training. K: mo
lecular Marti; 2010.
[11] Marti K. Stochastic optimization methods, vol. 3. Berlin: Springer; 2005.
[12] Hegazy AE, Makhlouf MA, El-Tawel GS. Improved salp swarm algorithm for feature
selection. J. King Saud Univ. Comput. Inf. Sci. 2020;32(3):335–44.
[13] Du P, Wang J, Hao Y, Niu T, Yang W. A novel hybrid model based on a multi-
objective Harris hawks optimization algorithm for daily PM2. 5 and PM10 fore
casting. Appl Soft Comput 2020;96:106620.
[14] Gao L, Ding Y. Disease prediction via Bayesian hyperparameter optimization and
ensemble learning. BMC Res Notes 2020;13:1–6.
[15] Abusnaina AA, Ahmad S, Jarrar R, Mafarja M. June). Training neural networks
using a salp swarm algorithm for pattern classification. In: Proceedings of the 2nd
international conference on future networks and distributed systems; 2018. p. 1–6.
[16] Yaseen ZM, Faris H, Al-Ansari N. Hybridized extreme learning machine model with
salp swarm algorithm: a novel predictive model for hydrological application.
Complexity; 2020. 2020.
[17] Khourdifi Y, Bahaj M. Heart disease prediction and classification using machine
learning algorithms optimized by particle swarm optimization and ant colony
optimization. Int. J. Intell. Eng. Syst. 2019;12(1):242–52.
[18] Wang J, Gao Y, Chen X. A novel hybrid interval prediction approach based on
modified lower upper bound estimation in combination with multi-objective salp
swarm algorithm for short-term load forecasting. Energies 2018;11(6):1561.
[19] Abualigah L, Shehab M, Alshinwan M, Alabool H. Salp swarm algorithm: a
comprehensive survey. Neural Comput Appl 2019:1–21.
[20] Ahmed S, Mafarja M, Faris H, Aljarah I. Feature selection using a salp swarm algorithm with chaos. In: Proceedings of the 2nd international conference on intelligent systems, metaheuristics & swarm intelligence; 2018, March. p. 65–9.
[21] Nayak SK, Rout PK, Jagadev AK, Swarnkar T. Elitism-based multi-objective differential evolution for feature selection: a filter approach with an efficient redundancy measure. J. King Saud Univ. Comput. Inf. Sci. 2020;32(2):174–87.
[22] Bai Y, Zeng B, Li C, Zhang J. An ensemble long short-term memory neural network for hourly PM2.5 concentration forecasting. Chemosphere 2019;222:286–94.
[23] Alani H, Tamimi A, Tamimi N. Cardiovascular co-morbidity in chronic kidney
disease: current knowledge and future research needs. World J Nephrol 2014;3(4):
156.
[24] Gao L, Ding Y. Disease prediction via Bayesian hyperparameter optimization and
ensemble learning. BMC Res Notes 2020;13:1–6.
[25] Wiesław P. Tree-based generational feature selection in medical applications.
Procedia Comput. Sci. 2019;159:2172–8.
[26] Beunza JJ, Puertas E, García-Ovejero E, Villalba G, Condes E, Koleva G, Landecho MF. Comparison of machine learning algorithms for clinical event prediction (risk of coronary heart disease). J Biomed Inf 2019;97:103257.
[27] Salih SQ, Alsewari AA. A new algorithm for normal and large-scale optimization problems: Nomadic People Optimizer. Neural Comput Appl 2020;32(14):10359–86.
[28] Almustafa KM. Prediction of heart disease and classifiers’ sensitivity analysis. BMC Bioinf 2020;21.
[29] Vembandasamy K, Sasipriya R, Deepa E. Heart disease detection using Naive Bayes
algorithm. Int. J. Innov. Sci. Eng. Technol. 2015;2(9):441–4.
[30] Chaurasia V, Pal S. Data mining approach to detect heart diseases. Int J Adv
Comput Sci Inf Technol 2014;2:56–66.
[31] Zhang Y, Wang S, Ji G. A comprehensive survey on particle swarm optimization algorithm and its applications. Math Probl Eng 2015;2015:931256. https://doi.org/10.1155/2015/931256.
[32] Rizk-Allah RM, Hassanien AE, Elhoseny M, Gunasekaran M. A new binary salp
swarm algorithm: development and application for optimization tasks. Neural
Comput Appl 2019;31(5):1641–63.
[33] Patro SP, Padhy N, Chiranjevi D. Ambient assisted living predictive model for
cardiovascular disease prediction using supervised learning. Evol. Intell. 2020:
1–29.
[34] Khan MA, Algarni F. A healthcare monitoring system for the diagnosis of heart
disease in the IoMT cloud environment using MSSO-ANFIS. IEEE Access 2020;8:
122259–69.
[35] Wang J, Liu C, Li L, Li W, Yao L, Li H, Zhang H. A stacking-based model for non-
invasive detection of coronary heart disease. IEEE Access 2020;8:37124–33.
[36] Amin SU, Agarwal K, Beg R. Genetic neural network based data mining in prediction of heart disease using risk factors. In: 2013 IEEE conference on information & communication technologies. IEEE; 2013, April. p. 1227–31.
[37] Khateeb N, Usman M. Efficient heart disease prediction system using K-nearest
neighbor classification technique. In: proceedings of the international conference
on big data and Internet of thing; 2017, December. p. 21–6.
[38] Rish I. An empirical study of the naive Bayes classifier. In: IJCAI 2001 workshop on empirical methods in artificial intelligence, vol. 3(22); 2001, August. p. 41–6.
[39] Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM. Salp Swarm
Algorithm: a bio-inspired optimizer for engineering design problems. Adv Eng
Software 2017;114:163–91.
Sibo Prasad Patro*, Gouri Sankar Nayak, Neelamadhab Padhy
School of Engineering and Technology, Department of Computer Science and Engineering, GIET University, Gunupur-765022, Odisha, India
* Corresponding author.
E-mail addresses: sibofromgiet@giet.edu (S.P. Patro), gsnayakcse@giet.edu (G.S. Nayak), dr.neelamadhab@giet.edu (N. Padhy).