SlideShare a Scribd company logo
International Journal of Informatics and Communication Technology (IJ-ICT)
Vol.8, No.1, April 2019, pp. 56~62
ISSN: 2252-8776, DOI: 10.11591/ijict.v8i1.pp56-62  56
Journal homepage: http://iaescore.com/journals/index.php/IJICT
Predicting heart ailment in patients with varying number of
features using data mining techniques
T R Stella Mary, Shoney Sebastian
Department of Computer Science, Christ University, Taiwan
Article Info ABSTRACT
Article history:
Received Dec 10, 2018
Revised Jan 20, 2019
Accepted Feb 3, 2019
Data mining can be defined as a process of extracting unknown, verifiable
and possibly helpful data from information. Among the various ailments,
heart ailment is one of the primary reason behind death of individuals around
the globe, hence in order to curb this, a detailed analysis is done using Data
Mining. Many a times we limit ourselves with minimal attributes that are
required to predict a patient with heart disease. By doing so we are missing
on a lot of important attributes that are main causes for heart diseases. Hence,
this research aims at considering almost all the important features affecting
heart disease and performs the analysis step by step with minimal to
maximum set of attributes using Data Mining techniques to predict heart
ailments. The various classification methods used are Naïve Bayes classifier,
Random Forest and Random Tree which are applied on three datasets with
different number of attributes but with a common class label. From the
analysis performed, it shows that there is a gradual increase in prediction
accuracies with the increase in the attributes irrespective of the classifiers
used and Naïve Bayes and Random Forest algorithms comparatively
outperforms with these sets of data.
Keywords:
Data mining
Heart ailments
Naïve bayes classifier
Random forest
Random tree
Copyright © 2019 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
T R Stella Mary,
Department of Computer Science,
Christ University, Taiwan
1. INTRODUCTION
Data mining involves the exploration of huge datasets in order to mine unknown, verifiable,
relationships, knowledge and possibly helpful data from information [1-2]. It involves various steps in
converting a raw data into a data with useful information. This complete process involves various steps such
as data cleaning, integration, transformation, selection, mining, evaluation of pattern and representation of
knowledge. This has made data mining find its applications in the fields of healthcare, business, education,
media, weather analysis, bio-informatics and many more.
In healthcare, data mining has gained wide interest and importance and has become the most
efficient means in diagnosing patients with various ailments. Medical data is rich in information but lacks in
terms of tools and technologies to extract the useful patterns and knowledge from them. Almost all the
healthcare industries today all streaming out a lot of huge data as in, data pertaining to patients, diagnosis of
various ailments, patient records in electronic form, data excavated from medical devices etc. This huge
amount of data is processed and analyzed to extract useful patterns which helps in fast diagnosis of ailments
and which even aids in cost reduction.
Among the various ailments, heart ailment is one among the main reason for death of human beings
all around the world in last decade by the WHO. The European public health alliance reported that over 41%
of all deaths were caused by strokes, heart attacks, and other chronic diseases [3]. As indicated by the
American Heart Association, more than 7 million Americans have suffered a heart ailment in their lifetime.
At the point when plaque is worked inside the coronary supply routes, they limit these veins making them
IJ-ICT ISSN: 2252-8776 
Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary)
57
unfit to convey oxygenated blood to the heart muscle causing the outstanding manifestations of CAD, for
example, chest torment (angina) and shortness of breath [4-5]. There are many kinds of cardiovascular
diseases which includes the coronary heart diseases (CHD), stroke, intrinsic heart, fringe vein, rheumatic
heart, provocative heart sickness and hypertensive coronary illness [6-8]. The extreme reasons for heart
illness are tobacco utilization, physical latency, an undesirable eating regimen and customary use of liquor
[9].
Medical diagnosis is an important and complicated task as it must ensure accurate results and need
to be executed precisely. Not all doctors are completely well versed in their various sectors and there is lack
of proper diagnosis. This paves way for and the stream of data mining to ease the tasks and which is capable
of performing the diagnosis and generate the results accurately with less human interference and less
investment both in terms of money and time [10].
In real time there are several tests taken to predict and diagnose patients with heart diseases which in
turn gives the accurate results but turns out to be time consuming and cost effective. Meanwhile in automated
approach people aim at reducing the attributes which in turn reduces the time taken but lacks in accuracy
since an extra care has to be taken to ensure that a patient is suffering from heart ailment providing a detailed
analysis is done in all the perspectives. Keeping this in mind that most of the features affecting heart disease
must be considered, a predictive analysis is done to see how the prediction accuracies vary with very few and
maximum features taken. This paper aims at providing a more efficient approach towards the prediction of
heart ailments in patients with the most important attributes affecting the patients that leads to heart disease
and analyze the data as to how the prediction accuracies fluctuate with minimal and maximal attributes taken
into consideration using the data mining approach.
Numerous studies have been performed with the aim of efficient diagnosis and prediction of heart
ailment. Most of the authors have applied different data mining methods for prediction and have come up
with different probabilities of accuracies for different classifiers.
Randa El-Bialy et al., [11] depicts that heterogeneity datasets gives in heterogeneity data types,
based on this it combines four datasets with same class label namely the Cleveland heart disease, Hungarian
heart disease, V.A. heart disease and Statlog project heart disease with 13 features from UCI repository and
applies C4.5 and Fast Decision tree in order to construct decision tree for each dataset and compare the
accuracies in WEKA. Out of which five common features are selected and are integrated into a new dataset,
on which the two models are applied again, which results in the four most commonly occurring features (ca,
age, cp, thal) with highest information gain. This indicates that these four features are the most important
features for the prediction of heart disease.
Vikas Chaurasia [12] uses three data mining algorithms CART, ID3 and Decision table on the
Cleveland heart disease dataset with 11 attributes out of 13 attributes. Theses 3 classifiers are implemented
on WEKA with 10-fold cross validation. A detailed result in terms of time taken to build the model, correctly
and incorrectly classified instances, error rate, confusion matrices and accuracies are analyzed for all the
three classifiers and finally depicts that CART out performs from the other two classifiers with an accuracy
of 83.49%. Further the importance of each attribute is analyzed with Chi-squared test, Information gain and
Gain ratio test which depicts that cp attributes impacts the output the most.
Chaitrali S. Dangare and Sulabha S. Apte [13] analyses the Cleveland heart disease dataset with 3
classifiers namely the Neural Networks, Decision trees and Naïve Bayes algorithms. To this existing dataset
two more attributes obesity and tobacco are included which makes up for the next dataset. Initially the
classifiers are applied to the first dataset with 13 attributes and the same classifiers are applied to the next
dataset with 15 attributes. Finally, a detailed analysis of their confusion matrices and accuracies are
compared between the two datasets which depicts that for the dataset with 13 attributes Neural Network
outperforms with 99.25% accuracy and for the dataset with 15 attributes again Neural Network outperforms
with 100% accuracy. They conclude that Neural Network algorithm outperforms in both the cases and has
relatively higher accuracy with 15 attributes.
M. Anbarasi et al., [14] objective was to decrease the number of attributes for which they play out
the analysis on the Cleveland heart disease dataset with 13 attributes. Feature subset selection is performed
on the dataset using the Genetic algorithm which results with 6 attributes on which the classifiers are applied.
The classifiers used here are Decision tree, Naïve Bayes and Classification via Clustering. Further while
comparing among all the classifiers with their corresponding confusion matrices and accuracies, Decision
tree outperforms with an accuracy of 99.2%.
Jyoti Rohilla and Preeti Gulia [15] performs the analysis on the Cleveland heart disease dataset with
11 attributes by applying the classifiers on a 10-fold cross validation process. The different classifiers used
are Naïve Bayes, Bagging, ID3, J48, Simple Cart, Logistic Regression and REPTREE algorithms.
Comparatively the Decision tree classifier ID3 outperforms with an accuracy of 88% and with a minimum
 ISSN: 2252-8776
IJ-ICT Vol. 8, No. 1, April 2019 : 56–62
58
error. Hence it concludes that the Decision tree method that is ID3 outperforms when compared to all other
classifiers.
R. Tamilarasi and Dr. R. Porkodi [16] aims at providing a cost effective automated system in
prediction of heart ailment in patients for which the a dataset with 13 attributes are considered. They even
specify the risk factors of heart disease which are smoking, cholesterol, obesity and lack of physical exercise.
To analyze this dataset 6 classifiers namely the Naïve Bayes, KNN algorithm, J48, CART, ANN and SMO
are applied to see which performs comparatively better in terms of correctly and incorrectly classified
instances and the error rate. Hence, it concludes that the K-nearest neighbor algorithm outperforms with an
accuracy of 100%.
Shabad Adam Pattekari and Asma Parveen [17] built a prototype system with a model built with the
classifiers namely Decision trees, Naïve Bayes and Artificial Neural Networks. A real time website was
developed called the Heart Disease Prediction System (HDPS) where in it retrieves the data stored in the
database and predicts if the patient had heart disease using the trained dataset.
M. Akhiljabbar et al., [18] developed an efficient algorithm using the genetic algorithm approach
known as the Associative Classification algorithm in order to predict patients with heart ailment. In this
approach all the attributes are analyzed and measured in terms of the Gini Index which in turn gives
weightage for each of the attributes which helps in determining the most important factors affecting the heart
disease. Then the Z statistics is used to analyze the wellness of the rule obtained. Finally, based on the rules
generated the classifiers are built and the accuracy of the dataset is measured.
2. RESEARCH METHOD
All most all the hospitals today maintain vast amount of healthcare information obtained from
patients, from electronic devices, medical diagnosis etc. This huge amount of data consists a lot of useful
hidden data in them which is very essential in diagnosing various ailments with a minimum number of
medical tests and make the diagnosis more intelligent. Many a times we limit ourselves with minimal
attributes that are required to predict a patient with heart disease. By doing so we are missing on a lot of
important features that are main causes for heart diseases.
Taking this into consideration, an approach is proposed wherein all the attributes responsible for the
heart ailment in patients are considered and analyzed in a step by step procedure to see how the attributes
play a vital role in predicting heart ailment in patients. Hence, the pre-dominant objective is to provide a
more efficient approach towards the prediction of heart ailments with the most important attributes causing
heart ailment and analyze the data, as to how the prediction accuracies fluctuate with minimal and maximal
attributes taken into consideration using the data mining approach.
For which a benchmark heart ailment dataset i.e. Cleveland heart disease dataset along with it two
more minimal datasets with same class label are used. Medical terminologies such as sex, chest pain type,
and age etc., making up for 7 attributes in first dataset, 10 attributes in second dataset and 13 attributes in the
benchmark dataset are used. To improvise the results, step by step approach is followed in increasing the
number of attributes and analyze to see if the prediction accuracies reduced or gradually increased. The data
mining classification methods used to classify the datasets are Decision Trees (Random Forest and Random
Tree) and Naive Bayes are used.
This proposed approach finds its application in predicting the heart ailments in patients as it
considers all the important attributes or features responsible in causing a heart ailment. It reduces the several
tests and time taken to predict and diagnose patients with heart diseases and is also cost effective. The
different data mining classifiers used are Naïve Bayes, Decision trees Random Forest and Random Tree.
2.1. Naïve Bayes
This algorithm has its basis from the Bayes theorem and follows the concept of conditional
independence i.e., it considers an attribute value of a given class to be independent of other attributes values.
The main objective is to maximize the posterior probability in predicting the class [19].
The Bayes theorem is as follows:
Let X={x1, x2, ....., xm} be a set of m attributes. In Bayesian, X is considered as evidence and H be
some hypothesis that the data of X belongs to specific class C. We have to determine P (H|X), the probability
that the hypothesis H holds given the evidence i.e. data sample X. According to Bayes theorem the P (H|X) is
expressed as
P (H|X) = P (X| H) P (H) / P (X)
Naïve Bayes classifier performs relatively better with nominal data but not with numeric data [20].
IJ-ICT ISSN: 2252-8776 
Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary)
59
2.2. Random Tree
It uses the concept of divide and conquer in order to build decision trees [21]. At each node of the
tree, the algorithm takes up a feature that further divides the data into subsets with the help of a splitting
algorithm [22]. Every predicting node leads to a class or decision rule. In order to classify a new item, it
should build a decision tree taking into consideration the existing items in the training set. With the help of
information gain it finds the dependent variable that clearly discriminates other instances. Now among the
possible values of this variable if any of the value turns out to be the target value or leaf node and we can
terminate that branch. Hence we follow this method until we get the decision rules of different combination
of features leads to the target value.
2.3. Random Forest
It is a classifier which consists of n number of decision trees where in each individual result is
depicted as a separate tree. It was derived from the random decision of forest that was proposed by Tin Kam
Ho of Bell Labs in 1995 [23]. This approach does the selection of attributes randomly in order to build the
decision trees but with restricted fluctuations. Multi-classifiers are the after effect of combining a few
independent classifiers and sets of classifiers responsible for broadening the execution have been depicted in
Figure 1 [24].
Figure 1. Architectural diagram of proposed work
3. RESULTS AND ANALYSIS
In Waikato Environment for Knowledge Analysis (WEKA) is used for analyzing and applying the
different classifiers on the datasets because of its excellence in discovering, analysis and predicting efficient
patterns [25]. A detailed analysis is done in Weka wherein Table 1 represents the results obtained by
applying the classifiers on a dataset of 7 attributes. Table 2 depicts the results obtained by applying the
classifiers on a dataset of 10 attributes and Table 3 depicts the results obtained by applying the classifiers on
a dataset of 13 attributes. Figure 2 depicts the gradual increase in the prediction accuracies along with the
increase in the attributes.
Table 1. Accuracies Achieved with 7 Attributes
Evaluation criteria
Classifiers
Naïve Bayes Random Forest Random Tree
Time taken to build model (in sec) 0 0.25 0
Correctly classified instances 71 72 68
Incorrectly classified instances 20 19 23
Accuracy (%) 78.022% 79.1209% 74.7253%
 ISSN: 2252-8776
IJ-ICT Vol. 8, No. 1, April 2019 : 56–62
60
Table 2. Accuracies Achieved with 10 Attributes.
Evaluation criteria
Classifiers
Naïve Bayes Random Forest Random Tree
Time taken to build model (in sec) 0 0.14 0
Correctly classified instances 72 71 67
Incorrectly classified instances 15 16 20
Accuracy (%) 82.7586% 81.6092% 77.0115%
Table 3. Accuracies Achieved with 13 Attributes
Evaluation criteria
Classifiers
Naïve Bayes Random Forest Random Tree
Time taken to build model (in sec) 0 0.14 0
Correctly classified instances 76 79 72
Incorrectly classified instances 15 12 19
Accuracy (%) 83.5165% 86.8132% 79.1209%
Figure 2. Fluctuations in prediction accuracies.
The initial dataset (Dataset-1) consists of 7 attributes making up for 300 records namely age which
is continuous attribute, sex which is categorical attribute, exercise induced agina which is nominal attribute,
depression induced by exercise forming the categorical attribute, slope of peak exercise again a categorical
attribute, number of major vessels a nominal attribute and finally the defect type which is categorical again.
The second dataset (Dataset-2) consists of all the first dataset attributes along with additional 3 more
attributes namely the chest pain type which is a categorical attribute, cholesterol which is continuous and
resting ECG measure again categorical attribute totally making up for 290 records. Dataset-1 and Dataset-2
are collected by taking into consideration the various heart ailment attributes through a survey done with the
patients. Finally, the third dataset Cleveland Heart Disease dataset consists of 303 records with 13 attributes
is used.
Predictable attribute (num) is a common class label for all the three datasets wherein,
Value = 0: depicts that the patient has no heart disease
Value = 1: depicts that the patient has heart disease.
All the dataset are analyzed and the classifiers are applied on a basis of 70% split i.e., 70% of
dataset is used as training data and 30% as testing data.
Step – 1: Pre-processing of datasets.
Dataset-1 with 7 attributes didn’t have any missing values and so does the second dataset. But the
third Dataset-3 had missing values which was rectified by using the Weka filter ReplaceMissingValues.
Since the class variable is the same for all the datasets it is converted from numeric to nominal using the filter
numeric to nominal under the unsupervised attribute filters in order to make few of the classifiers like Naïve
Bayes and Decision trees to be applicable. Further the multi-class classification in the third benchmark
dataset is converted two-class classification. this section, it is explained the results of research and at the
same time is given the comprehensive discussion. Results can be presented in figures, graphs, tables and
others that make the reader understand easily [2], [5].
78.02% 79.12%
74.73%
82.76%
81.61%
77.01%
83.52%
86.81%
79.12%
NAÏVE BAYES RANDOM
FOREST
RANDOM TREE
Fluctuation in
prediction accuracies
Dataset-1 Dataset-2 Dataset-3
IJ-ICT ISSN: 2252-8776 
Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary)
61
Results are analyzed by applying the three classifiers Naïve Bayes, Random Tree and Random
Forest on all the three datasets to see if there are fluctuations in the prediction accuracies after applying them
to build the models.
Step – 2: Apply the 3 classifiers on the dataset with 7 attributes.
Initially, the Dataset-1 with 7 attributes is pre-processed and the class label is converted to nominal
using the filter Numeric to Nominal on which the 3 classifiers are applied. It is evident from Table 1 that the
Random forest algorithm outperforms with 7 attributes.
Step – 3: Apply the 3 classifiers on the dataset with 10 attributes.
Secondly, the Dataset-2 with 10 attributes is pre-processed and the class label is converted to
nominal using the filter Numeric to Nominal on which the 3 classifiers are applied.
It is evident from Table 2 that the Naïve Bayes algorithm outperforms and the accuracies of all the
classifiers are gradually increased with the addition of 3 more attributes chest pain type, cholesterol and ECG
measure to the Dataset-1.
Step – 4: Apply the 3 classifiers on the dataset with 13 attributes.
Finally, the Dataset-3 with 13 attributes is pre-processed and the class label is converted to nominal
using the filter Numeric to Nominal on which the 3 classifiers are applied. It is evident from Table 3 that the
Random Forest algorithm outperforms and the accuracies of all the classifiers are gradually increased with
the addition of 3 more attributes blood pressure, fasting sugar and thalach to the Dataset-2.
From the Table 4 it is evident that with the increase in the number of attributes, the prediction
accuracies have gradually increased irrespective of the classifiers used and the dataset values. Moreover, for
different datasets different classifiers outperforms i.e., in case of Dataset-1 with 7 attributes Random Forest
classifier outperforms with an accuracy of 79.1209%, whereas for Dataset-2 with 10 attributes Naïve Bayes
classifier outperforms with an accuracy of 82.7586% and finally for the Dataset-3 with 13 attributes the
Random Forest attribute outperforms. Hence, it clearly shows that consideration of all the important
attributes or features contribute to a better and accurate prediction of heart diseases in patients.
From the Figure 2 it is evident that there is a gradual increase in the prediction accuracies analyzed
on the three datasets with different number of attributes and values using the above three classifiers. It even
depicts that in case of Dataset-1 Random forest algorithm outperforms, for Dataset-2 Naïve Bayes algorithm
outperforms with an accuracy of 82.75% and for Dataset-3 again Random forest outperforms with an
accuracy of 86.81%.
Table 4. Accuracy Fluctuations among the 3 Datasets
Datasets
Classifiers
Naïve Bayes Random Forest Random Tree
Dataset-1 78.022% 79.1209% 74.7253%
Dataset-2 82.7586% 81.6092% 77.0115%
Dataset-3 83.5165% 86.8132% 79.1209%
4. CONCLUSION
Finally, to conclude from the results and analysis performed in this paper it is evident that each and
every attribute contribute uniquely to the prediction accuracy and hence with the step by step increase in the
heart ailment attributes, the prediction accuracy of heart ailment gradually increases irrespective of the
classifier used to build the model. Adding to this it is also evident that for different dataset with different
values and number of attributes different classifiers out performs each time. Future work can be expanded by
finding the most appropriate features required for prediction of heart ailment and use other data mining
methods such as neural networks, regression etc., for prediction of heart disease.
REFERENCES
[1] Yan, H., J. Zheng, et al. (2003). "Development of a decision support system for heart disease diagnosis using
multilayer perceptron." Proceedings of the 2003 International Symposium on vol.5: pp. V-709- V-712.
[2] Sandhya, J., P. Deepa Shenoy, et al. (2010). "Classification of Neurodegenerative Disorders Based on Major Risk
Factors Employing Machine Learning Techniques." International Journal of Engineering and Technology Vol.2,
No.4.
[3] Jaymin Patel, Prof.TejalUpadhyay and Dr. Samir Patel, "Heart Disease Prediction using Machine Learning and Data
Mining Techniques ", Nirma University, Gujarat, India IJCSC Vol 7,number 1 september 2015-march 2016 pp.129-
137.
[4] Webmd.com. ”Risk factors for heart disease,” retrieved 20/8/2014 from http://www.webmd.com/heart-disease/risk-
factors-heart-disease.
 ISSN: 2252-8776
IJ-ICT Vol. 8, No. 1, April 2019 : 56–62
62
[5] National Heart, Lung, and Blood Institute, ”What is coronary heart disease?”retrieved 20/8/2014 from
http://www.nhlbi.nih.gov/health/healthtopics/ topics/cad/.
[6] Das, R., I. Turkoglu, et al. (2009). "Effective diagnosis of heart disease through neural networks ensembles." Expert
Systems with Applications, Elsevier 36 (2009): 7675–7680.
[7] Panzarasa, S., S. Quaglini, et al. (2010). "Data mining techniques for analyzing stroke care processes." Proceedings
of the 13th World Congress on Medical Informatics.
[8] V. Chaurasia, “Early Prediction of Heart Diseases Using Data Mining,” Caribbean Journal of Science and
Technology, vol. 1, pp. 208–217, 2013.
[9] Srinivas, K.,” Analysis of coronary heart disease and prediction of heart attack in coal mining regions using data
mining techniques”, IEEE Transaction on Computer Science and Education (ICCSE), p(1344 - 1349), 2010.
[10] Ruben, D. C. J. (2009). "Data Mining in Healthcare: Current Applications and Issues."
[11] Randa El-Bialy , Mostafa A. Salamay, Omar H. Karam and M.Essam Khalifa, “Feature Analysis of Coronery Artery
Heart Disease Data Sets”, International Conference on Communication, Management and Information Technology
(ICCMIT 2015), Procedia Computer Science 65 ( 2015 ) 459 – 468
[12] V. Chauraisa and S. Pal, “Data Mining Approach to Detect Heart Diseases”, International Journal of Advanced
Computer Science and Information Technology (IJACSIT),Vol. 2, No. 4,2013, pp 56-66.
[13] Chaitrali S. Dangare and Sulabha S. Apte, “Improved Study of Heart Disease Prediction System using Data Mining
Classification Techniques”, International Journal of Computer Applications (0975 – 888) Volume 47– No.10, June
2012.
[14] M. Anbarasi, E. Anupriya and N.CH.S.N.Iyengar, “Enhanced Prediction of Heart Disease with Feature Subset
Selection using Genetic Algorithm”, International Journal of Engineering Science and Technology Vol. 2(10),
2010, 5370-537.
[15] Jyoti Rohilla and Preeti Gulia, “Analysis of Data Mining Techniques for Diagnosing Heart Disease”, International
Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 7, July 2015,
ISSN: 2277 128X .
[16] R. Tamilarasi and Dr. R. Porkodi, “A Study and Analysis of Disease Prediction Techniques in Data Mining for
Healthcare”, International Journal of Emerging Research in Management &Technology, March 2015, ISSN: 2278-
9359 (Volume-4, Issue-3)
[17] Shadab Adam Pattekari and Asma Parveen, prediction system for heart disease using naïve bayes, International
Journal of Advanced Computer and Mathematical Sciences, 2012.
[18] M.Akhil jabbers, Dr.prirti Chandra, Dr.B.L Deekshatulu, and Heart Disease Prediction System using Associative
Classification and Genetic Algorithm, International Conference on Emerging Trends in Electrical, Electronics and
Communication Technologies, 2012.
[19] M. Bramer, Principles of Data Mining: Springer-Verlag, 2007.
[20] R. Alizadehsani, J. Habibi, B. Bahadorian, H. Mashayekhi, A. Ghandeharioun, and R. Boghrati, et al., "Diagnosis of
coronary arteries stenosis using data mining," J Med Signals Sens, vol. 2, pp. 153-9, Jul 2012.
[21] J.Nahar, T. Imam, K. S. Tickle, and Y.-P. P. Chen, “Computational Intelligence for heart disease diagnosis: A
medical Knowledge driven approach,” Expert Systems with Applications, vol. 40, no. 1, pp. 96–104,2013.
[22] M. Kumari and S. Godara, "Comparative study of data mining classification methodsin cardiovascular disease
prediction," IJCST vol. 2, pp. 304-305, 2011.
[23] Y. E. Shao, C.-D. Hou, and C.-C. Chiu, “Hybrid intelligent modelling, schemes for heart disease classification,”
Applied Soft Computing,vol. 14, pp. 47–52, 2014.
[24] P. V. Ankur Makwana, “Identify the patients at high risk of re-admissionin hospital in the next year,” International
Journal of Science andResearch, vol. 4, pp. 2431–2434, 2015.
[25] A. Aziz, N. Ismail, And F. Ahmad, “Mining Students’ academic Performance.,” Journal of Theoretical & Applied
Information Technology, vol. 53, no. 3, 2013.

More Related Content

What's hot

A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
ijtsrd
 
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
IRJET Journal
 
Ascendable Clarification for Coronary Illness Prediction using Classification...
Ascendable Clarification for Coronary Illness Prediction using Classification...Ascendable Clarification for Coronary Illness Prediction using Classification...
Ascendable Clarification for Coronary Illness Prediction using Classification...
ijtsrd
 
Chronic Kidney Disease Prediction
Chronic Kidney Disease PredictionChronic Kidney Disease Prediction
Chronic Kidney Disease Prediction
Rajandeep Gill
 
Heart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining TechniquesHeart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining Techniques
IJRES Journal
 
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
IRJET- Heart Failure Risk Prediction using Trained Electronic Health RecordIRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
IRJET Journal
 
Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm
Kedar Damkondwar
 
Mining of medical data to identify risk factors of heart disease using freque...
Mining of medical data to identify risk factors of heart disease using freque...Mining of medical data to identify risk factors of heart disease using freque...
Mining of medical data to identify risk factors of heart disease using freque...
IRJET Journal
 
Heart disease prediction
Heart disease predictionHeart disease prediction
Heart disease prediction
Ariful Haque
 
Health care analytics
Health care analyticsHealth care analytics
Health care analytics
Gaurav Dubey
 
Android Based Questionnaires Application for Heart Disease Prediction System
Android Based Questionnaires Application for Heart Disease Prediction SystemAndroid Based Questionnaires Application for Heart Disease Prediction System
Android Based Questionnaires Application for Heart Disease Prediction System
ijtsrd
 
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET Journal
 
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
csitconf
 
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
cscpconf
 
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine LearningIRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
IRJET Journal
 
A Survey on Heart Disease Prediction Techniques
A Survey on Heart Disease Prediction TechniquesA Survey on Heart Disease Prediction Techniques
A Survey on Heart Disease Prediction Techniques
ijtsrd
 
IRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction SystemIRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction System
IRJET Journal
 
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCEPREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
ijaia
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
Sivagowry Shathesh
 
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
IRJET- The Prediction of Heart Disease using Naive Bayes ClassifierIRJET- The Prediction of Heart Disease using Naive Bayes Classifier
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
IRJET Journal
 

What's hot (20)

A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
 
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
 
Ascendable Clarification for Coronary Illness Prediction using Classification...
Ascendable Clarification for Coronary Illness Prediction using Classification...Ascendable Clarification for Coronary Illness Prediction using Classification...
Ascendable Clarification for Coronary Illness Prediction using Classification...
 
Chronic Kidney Disease Prediction
Chronic Kidney Disease PredictionChronic Kidney Disease Prediction
Chronic Kidney Disease Prediction
 
Heart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining TechniquesHeart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining Techniques
 
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
IRJET- Heart Failure Risk Prediction using Trained Electronic Health RecordIRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
 
Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm Heart disease prediction using machine learning algorithm
Heart disease prediction using machine learning algorithm
 
Mining of medical data to identify risk factors of heart disease using freque...
Mining of medical data to identify risk factors of heart disease using freque...Mining of medical data to identify risk factors of heart disease using freque...
Mining of medical data to identify risk factors of heart disease using freque...
 
Heart disease prediction
Heart disease predictionHeart disease prediction
Heart disease prediction
 
Health care analytics
Health care analyticsHealth care analytics
Health care analytics
 
Android Based Questionnaires Application for Heart Disease Prediction System
Android Based Questionnaires Application for Heart Disease Prediction SystemAndroid Based Questionnaires Application for Heart Disease Prediction System
Android Based Questionnaires Application for Heart Disease Prediction System
 
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
IRJET -Improving the Accuracy of the Heart Disease Prediction using Hybrid Ma...
 
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
 
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
Supervised Feature Selection for Diagnosis of Coronary Artery Disease Based o...
 
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine LearningIRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
IRJET - Chronic Kidney Disease Prediction using Data Mining and Machine Learning
 
A Survey on Heart Disease Prediction Techniques
A Survey on Heart Disease Prediction TechniquesA Survey on Heart Disease Prediction Techniques
A Survey on Heart Disease Prediction Techniques
 
IRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction SystemIRJET- Heart Disease Prediction System
IRJET- Heart Disease Prediction System
 
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCEPREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
IRJET- The Prediction of Heart Disease using Naive Bayes ClassifierIRJET- The Prediction of Heart Disease using Naive Bayes Classifier
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
 

Similar to 08. 9804 11737-1-rv edit dhyan

HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNINGHEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
IJDKP
 
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
mlaij
 
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
EVALUATING THE ACCURACY OF CLASSIFICATION  ALGORITHMS FOR DETECTING HEART DIS...EVALUATING THE ACCURACY OF CLASSIFICATION  ALGORITHMS FOR DETECTING HEART DIS...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
mlaij
 
Heart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means ClassifierHeart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means Classifier
IOSR Journals
 
A Heart Disease Prediction Model using Decision Tree
A Heart Disease Prediction Model using Decision TreeA Heart Disease Prediction Model using Decision Tree
A Heart Disease Prediction Model using Decision Tree
IOSR Journals
 
javed_prethesis2608 on predcition of heart disease
javed_prethesis2608 on predcition of heart diseasejaved_prethesis2608 on predcition of heart disease
javed_prethesis2608 on predcition of heart disease
javed75
 
Heart Failure Prediction using Different Machine Learning Techniques
Heart Failure Prediction using Different Machine Learning TechniquesHeart Failure Prediction using Different Machine Learning Techniques
Heart Failure Prediction using Different Machine Learning Techniques
IRJET Journal
 
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET Journal
 
Heart disease prediction by using novel optimization algorithm_ A supervised ...
Heart disease prediction by using novel optimization algorithm_ A supervised ...Heart disease prediction by using novel optimization algorithm_ A supervised ...
Heart disease prediction by using novel optimization algorithm_ A supervised ...
BASMAJUMAASALEHALMOH
 
Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...
IAESIJAI
 
Heart failure prediction based on random forest algorithm using genetic algo...
Heart failure prediction based on random forest algorithm  using genetic algo...Heart failure prediction based on random forest algorithm  using genetic algo...
Heart failure prediction based on random forest algorithm using genetic algo...
International Journal of Reconfigurable and Embedded Systems
 
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET-  	  Role of Different Data Mining Techniques for Predicting Heart DiseaseIRJET-  	  Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET Journal
 
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
ijcsa
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksIAEME Publication
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksIAEME Publication
 
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
BASMAJUMAASALEHALMOH
 
Comparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease Prediction
IRJET Journal
 
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
BRNSSPublicationHubI
 
A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...
BASMAJUMAASALEHALMOH
 
238_heartdisease (1).pdf
238_heartdisease (1).pdf238_heartdisease (1).pdf
238_heartdisease (1).pdf
ArpitGupta563794
 

Similar to 08. 9804 11737-1-rv edit dhyan (20)

HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNINGHEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING
 
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
 
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
EVALUATING THE ACCURACY OF CLASSIFICATION  ALGORITHMS FOR DETECTING HEART DIS...EVALUATING THE ACCURACY OF CLASSIFICATION  ALGORITHMS FOR DETECTING HEART DIS...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
 
Heart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means ClassifierHeart Attack Prediction System Using Fuzzy C Means Classifier
Heart Attack Prediction System Using Fuzzy C Means Classifier
 
A Heart Disease Prediction Model using Decision Tree
A Heart Disease Prediction Model using Decision TreeA Heart Disease Prediction Model using Decision Tree
A Heart Disease Prediction Model using Decision Tree
 
javed_prethesis2608 on predcition of heart disease
javed_prethesis2608 on predcition of heart diseasejaved_prethesis2608 on predcition of heart disease
javed_prethesis2608 on predcition of heart disease
 
Heart Failure Prediction using Different Machine Learning Techniques
Heart Failure Prediction using Different Machine Learning TechniquesHeart Failure Prediction using Different Machine Learning Techniques
Heart Failure Prediction using Different Machine Learning Techniques
 
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
IRJET - Effective Heart Disease Prediction using Distinct Machine Learning Te...
 
Heart disease prediction by using novel optimization algorithm_ A supervised ...
Heart disease prediction by using novel optimization algorithm_ A supervised ...Heart disease prediction by using novel optimization algorithm_ A supervised ...
Heart disease prediction by using novel optimization algorithm_ A supervised ...
 
Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...Machine learning approach for predicting heart and diabetes diseases using da...
Machine learning approach for predicting heart and diabetes diseases using da...
 
Heart failure prediction based on random forest algorithm using genetic algo...
Heart failure prediction based on random forest algorithm  using genetic algo...Heart failure prediction based on random forest algorithm  using genetic algo...
Heart failure prediction based on random forest algorithm using genetic algo...
 
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET-  	  Role of Different Data Mining Techniques for Predicting Heart DiseaseIRJET-  	  Role of Different Data Mining Techniques for Predicting Heart Disease
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
 
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
 
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networksA data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
 
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
Diagnosis of Cardiac Disease Utilizing Machine Learning Techniques and Dense ...
 
Comparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease Prediction
 
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
Performance Evaluation of Data Mining Algorithm on Electronic Health Record o...
 
A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...A hybrid model for heart disease prediction using recurrent neural network an...
A hybrid model for heart disease prediction using recurrent neural network an...
 
238_heartdisease (1).pdf
238_heartdisease (1).pdf238_heartdisease (1).pdf
238_heartdisease (1).pdf
 

More from IAESIJEECS

08 20314 electronic doorbell...
08 20314 electronic doorbell...08 20314 electronic doorbell...
08 20314 electronic doorbell...
IAESIJEECS
 
07 20278 augmented reality...
07 20278 augmented reality...07 20278 augmented reality...
07 20278 augmented reality...
IAESIJEECS
 
06 17443 an neuro fuzzy...
06 17443 an neuro fuzzy...06 17443 an neuro fuzzy...
06 17443 an neuro fuzzy...
IAESIJEECS
 
05 20275 computational solution...
05 20275 computational solution...05 20275 computational solution...
05 20275 computational solution...
IAESIJEECS
 
04 20268 power loss reduction ...
04 20268 power loss reduction ...04 20268 power loss reduction ...
04 20268 power loss reduction ...
IAESIJEECS
 
03 20237 arduino based gas-
03 20237 arduino based gas-03 20237 arduino based gas-
03 20237 arduino based gas-
IAESIJEECS
 
02 20274 improved ichi square...
02 20274 improved ichi square...02 20274 improved ichi square...
02 20274 improved ichi square...
IAESIJEECS
 
01 20264 diminution of real power...
01 20264 diminution of real power...01 20264 diminution of real power...
01 20264 diminution of real power...
IAESIJEECS
 
08 20272 academic insight on application
08 20272 academic insight on application08 20272 academic insight on application
08 20272 academic insight on application
IAESIJEECS
 
07 20252 cloud computing survey
07 20252 cloud computing survey07 20252 cloud computing survey
07 20252 cloud computing survey
IAESIJEECS
 
06 20273 37746-1-ed
06 20273 37746-1-ed06 20273 37746-1-ed
06 20273 37746-1-ed
IAESIJEECS
 
05 20261 real power loss reduction
05 20261 real power loss reduction05 20261 real power loss reduction
05 20261 real power loss reduction
IAESIJEECS
 
04 20259 real power loss
04 20259 real power loss04 20259 real power loss
04 20259 real power loss
IAESIJEECS
 
03 20270 true power loss reduction
03 20270 true power loss reduction03 20270 true power loss reduction
03 20270 true power loss reduction
IAESIJEECS
 
02 15034 neural network
02 15034 neural network02 15034 neural network
02 15034 neural network
IAESIJEECS
 
01 8445 speech enhancement
01 8445 speech enhancement01 8445 speech enhancement
01 8445 speech enhancement
IAESIJEECS
 
08 17079 ijict
08 17079 ijict08 17079 ijict
08 17079 ijict
IAESIJEECS
 
07 20269 ijict
07 20269 ijict07 20269 ijict
07 20269 ijict
IAESIJEECS
 
06 10154 ijict
06 10154 ijict06 10154 ijict
06 10154 ijict
IAESIJEECS
 
05 20255 ijict
05 20255 ijict05 20255 ijict
05 20255 ijict
IAESIJEECS
 

More from IAESIJEECS (20)

08 20314 electronic doorbell...
08 20314 electronic doorbell...08 20314 electronic doorbell...
08 20314 electronic doorbell...
 
07 20278 augmented reality...
07 20278 augmented reality...07 20278 augmented reality...
07 20278 augmented reality...
 
06 17443 an neuro fuzzy...
06 17443 an neuro fuzzy...06 17443 an neuro fuzzy...
06 17443 an neuro fuzzy...
 
05 20275 computational solution...
05 20275 computational solution...05 20275 computational solution...
05 20275 computational solution...
 
04 20268 power loss reduction ...
04 20268 power loss reduction ...04 20268 power loss reduction ...
04 20268 power loss reduction ...
 
03 20237 arduino based gas-
03 20237 arduino based gas-03 20237 arduino based gas-
03 20237 arduino based gas-
 
02 20274 improved ichi square...
02 20274 improved ichi square...02 20274 improved ichi square...
02 20274 improved ichi square...
 
01 20264 diminution of real power...
01 20264 diminution of real power...01 20264 diminution of real power...
01 20264 diminution of real power...
 
08 20272 academic insight on application
08 20272 academic insight on application08 20272 academic insight on application
08 20272 academic insight on application
 
07 20252 cloud computing survey
07 20252 cloud computing survey07 20252 cloud computing survey
07 20252 cloud computing survey
 
06 20273 37746-1-ed
06 20273 37746-1-ed06 20273 37746-1-ed
06 20273 37746-1-ed
 
05 20261 real power loss reduction
05 20261 real power loss reduction05 20261 real power loss reduction
05 20261 real power loss reduction
 
04 20259 real power loss
04 20259 real power loss04 20259 real power loss
04 20259 real power loss
 
03 20270 true power loss reduction
03 20270 true power loss reduction03 20270 true power loss reduction
03 20270 true power loss reduction
 
02 15034 neural network
02 15034 neural network02 15034 neural network
02 15034 neural network
 
01 8445 speech enhancement
01 8445 speech enhancement01 8445 speech enhancement
01 8445 speech enhancement
 
08 17079 ijict
08 17079 ijict08 17079 ijict
08 17079 ijict
 
07 20269 ijict
07 20269 ijict07 20269 ijict
07 20269 ijict
 
06 10154 ijict
06 10154 ijict06 10154 ijict
06 10154 ijict
 
05 20255 ijict
05 20255 ijict05 20255 ijict
05 20255 ijict
 

Recently uploaded

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 

Recently uploaded (20)

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 

08. 9804 11737-1-rv edit dhyan

  • 1. International Journal of Informatics and Communication Technology (IJ-ICT) Vol.8, No.1, April 2019, pp. 56~62 ISSN: 2252-8776, DOI: 10.11591/ijict.v8i1.pp56-62  56 Journal homepage: http://iaescore.com/journals/index.php/IJICT Predicting heart ailment in patients with varying number of features using data mining techniques T R Stella Mary, Shoney Sebastian Department of Computer Science, Christ University, Taiwan Article Info ABSTRACT Article history: Received Dec 10, 2018 Revised Jan 20, 2019 Accepted Feb 3, 2019 Data mining can be defined as a process of extracting unknown, verifiable and possibly helpful data from information. Among the various ailments, heart ailment is one of the primary reason behind death of individuals around the globe, hence in order to curb this, a detailed analysis is done using Data Mining. Many a times we limit ourselves with minimal attributes that are required to predict a patient with heart disease. By doing so we are missing on a lot of important attributes that are main causes for heart diseases. Hence, this research aims at considering almost all the important features affecting heart disease and performs the analysis step by step with minimal to maximum set of attributes using Data Mining techniques to predict heart ailments. The various classification methods used are Naïve Bayes classifier, Random Forest and Random Tree which are applied on three datasets with different number of attributes but with a common class label. From the analysis performed, it shows that there is a gradual increase in prediction accuracies with the increase in the attributes irrespective of the classifiers used and Naïve Bayes and Random Forest algorithms comparatively outperforms with these sets of data. Keywords: Data mining Heart ailments Naïve bayes classifier Random forest Random tree Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved. Corresponding Author: T R Stella Mary, Department of Computer Science, Christ University, Taiwan 1. INTRODUCTION Data mining involves the exploration of huge datasets in order to mine unknown, verifiable, relationships, knowledge and possibly helpful data from information [1-2]. It involves various steps in converting a raw data into a data with useful information. This complete process involves various steps such as data cleaning, integration, transformation, selection, mining, evaluation of pattern and representation of knowledge. This has made data mining find its applications in the fields of healthcare, business, education, media, weather analysis, bio-informatics and many more. In healthcare, data mining has gained wide interest and importance and has become the most efficient means in diagnosing patients with various ailments. Medical data is rich in information but lacks in terms of tools and technologies to extract the useful patterns and knowledge from them. Almost all the healthcare industries today all streaming out a lot of huge data as in, data pertaining to patients, diagnosis of various ailments, patient records in electronic form, data excavated from medical devices etc. This huge amount of data is processed and analyzed to extract useful patterns which helps in fast diagnosis of ailments and which even aids in cost reduction. Among the various ailments, heart ailment is one among the main reason for death of human beings all around the world in last decade by the WHO. The European public health alliance reported that over 41% of all deaths were caused by strokes, heart attacks, and other chronic diseases [3]. As indicated by the American Heart Association, more than 7 million Americans have suffered a heart ailment in their lifetime. At the point when plaque is worked inside the coronary supply routes, they limit these veins making them
  • 2. IJ-ICT ISSN: 2252-8776  Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary) 57 unfit to convey oxygenated blood to the heart muscle causing the outstanding manifestations of CAD, for example, chest torment (angina) and shortness of breath [4-5]. There are many kinds of cardiovascular diseases which includes the coronary heart diseases (CHD), stroke, intrinsic heart, fringe vein, rheumatic heart, provocative heart sickness and hypertensive coronary illness [6-8]. The extreme reasons for heart illness are tobacco utilization, physical latency, an undesirable eating regimen and customary use of liquor [9]. Medical diagnosis is an important and complicated task as it must ensure accurate results and need to be executed precisely. Not all doctors are completely well versed in their various sectors and there is lack of proper diagnosis. This paves way for and the stream of data mining to ease the tasks and which is capable of performing the diagnosis and generate the results accurately with less human interference and less investment both in terms of money and time [10]. In real time there are several tests taken to predict and diagnose patients with heart diseases which in turn gives the accurate results but turns out to be time consuming and cost effective. Meanwhile in automated approach people aim at reducing the attributes which in turn reduces the time taken but lacks in accuracy since an extra care has to be taken to ensure that a patient is suffering from heart ailment providing a detailed analysis is done in all the perspectives. Keeping this in mind that most of the features affecting heart disease must be considered, a predictive analysis is done to see how the prediction accuracies vary with very few and maximum features taken. This paper aims at providing a more efficient approach towards the prediction of heart ailments in patients with the most important attributes affecting the patients that leads to heart disease and analyze the data as to how the prediction accuracies fluctuate with minimal and maximal attributes taken into consideration using the data mining approach. Numerous studies have been performed with the aim of efficient diagnosis and prediction of heart ailment. Most of the authors have applied different data mining methods for prediction and have come up with different probabilities of accuracies for different classifiers. Randa El-Bialy et al., [11] depicts that heterogeneity datasets gives in heterogeneity data types, based on this it combines four datasets with same class label namely the Cleveland heart disease, Hungarian heart disease, V.A. heart disease and Statlog project heart disease with 13 features from UCI repository and applies C4.5 and Fast Decision tree in order to construct decision tree for each dataset and compare the accuracies in WEKA. Out of which five common features are selected and are integrated into a new dataset, on which the two models are applied again, which results in the four most commonly occurring features (ca, age, cp, thal) with highest information gain. This indicates that these four features are the most important features for the prediction of heart disease. Vikas Chaurasia [12] uses three data mining algorithms CART, ID3 and Decision table on the Cleveland heart disease dataset with 11 attributes out of 13 attributes. Theses 3 classifiers are implemented on WEKA with 10-fold cross validation. A detailed result in terms of time taken to build the model, correctly and incorrectly classified instances, error rate, confusion matrices and accuracies are analyzed for all the three classifiers and finally depicts that CART out performs from the other two classifiers with an accuracy of 83.49%. Further the importance of each attribute is analyzed with Chi-squared test, Information gain and Gain ratio test which depicts that cp attributes impacts the output the most. Chaitrali S. Dangare and Sulabha S. Apte [13] analyses the Cleveland heart disease dataset with 3 classifiers namely the Neural Networks, Decision trees and Naïve Bayes algorithms. To this existing dataset two more attributes obesity and tobacco are included which makes up for the next dataset. Initially the classifiers are applied to the first dataset with 13 attributes and the same classifiers are applied to the next dataset with 15 attributes. Finally, a detailed analysis of their confusion matrices and accuracies are compared between the two datasets which depicts that for the dataset with 13 attributes Neural Network outperforms with 99.25% accuracy and for the dataset with 15 attributes again Neural Network outperforms with 100% accuracy. They conclude that Neural Network algorithm outperforms in both the cases and has relatively higher accuracy with 15 attributes. M. Anbarasi et al., [14] objective was to decrease the number of attributes for which they play out the analysis on the Cleveland heart disease dataset with 13 attributes. Feature subset selection is performed on the dataset using the Genetic algorithm which results with 6 attributes on which the classifiers are applied. The classifiers used here are Decision tree, Naïve Bayes and Classification via Clustering. Further while comparing among all the classifiers with their corresponding confusion matrices and accuracies, Decision tree outperforms with an accuracy of 99.2%. Jyoti Rohilla and Preeti Gulia [15] performs the analysis on the Cleveland heart disease dataset with 11 attributes by applying the classifiers on a 10-fold cross validation process. The different classifiers used are Naïve Bayes, Bagging, ID3, J48, Simple Cart, Logistic Regression and REPTREE algorithms. Comparatively the Decision tree classifier ID3 outperforms with an accuracy of 88% and with a minimum
  • 3.  ISSN: 2252-8776 IJ-ICT Vol. 8, No. 1, April 2019 : 56–62 58 error. Hence it concludes that the Decision tree method that is ID3 outperforms when compared to all other classifiers. R. Tamilarasi and Dr. R. Porkodi [16] aims at providing a cost effective automated system in prediction of heart ailment in patients for which the a dataset with 13 attributes are considered. They even specify the risk factors of heart disease which are smoking, cholesterol, obesity and lack of physical exercise. To analyze this dataset 6 classifiers namely the Naïve Bayes, KNN algorithm, J48, CART, ANN and SMO are applied to see which performs comparatively better in terms of correctly and incorrectly classified instances and the error rate. Hence, it concludes that the K-nearest neighbor algorithm outperforms with an accuracy of 100%. Shabad Adam Pattekari and Asma Parveen [17] built a prototype system with a model built with the classifiers namely Decision trees, Naïve Bayes and Artificial Neural Networks. A real time website was developed called the Heart Disease Prediction System (HDPS) where in it retrieves the data stored in the database and predicts if the patient had heart disease using the trained dataset. M. Akhiljabbar et al., [18] developed an efficient algorithm using the genetic algorithm approach known as the Associative Classification algorithm in order to predict patients with heart ailment. In this approach all the attributes are analyzed and measured in terms of the Gini Index which in turn gives weightage for each of the attributes which helps in determining the most important factors affecting the heart disease. Then the Z statistics is used to analyze the wellness of the rule obtained. Finally, based on the rules generated the classifiers are built and the accuracy of the dataset is measured. 2. RESEARCH METHOD All most all the hospitals today maintain vast amount of healthcare information obtained from patients, from electronic devices, medical diagnosis etc. This huge amount of data consists a lot of useful hidden data in them which is very essential in diagnosing various ailments with a minimum number of medical tests and make the diagnosis more intelligent. Many a times we limit ourselves with minimal attributes that are required to predict a patient with heart disease. By doing so we are missing on a lot of important features that are main causes for heart diseases. Taking this into consideration, an approach is proposed wherein all the attributes responsible for the heart ailment in patients are considered and analyzed in a step by step procedure to see how the attributes play a vital role in predicting heart ailment in patients. Hence, the pre-dominant objective is to provide a more efficient approach towards the prediction of heart ailments with the most important attributes causing heart ailment and analyze the data, as to how the prediction accuracies fluctuate with minimal and maximal attributes taken into consideration using the data mining approach. For which a benchmark heart ailment dataset i.e. Cleveland heart disease dataset along with it two more minimal datasets with same class label are used. Medical terminologies such as sex, chest pain type, and age etc., making up for 7 attributes in first dataset, 10 attributes in second dataset and 13 attributes in the benchmark dataset are used. To improvise the results, step by step approach is followed in increasing the number of attributes and analyze to see if the prediction accuracies reduced or gradually increased. The data mining classification methods used to classify the datasets are Decision Trees (Random Forest and Random Tree) and Naive Bayes are used. This proposed approach finds its application in predicting the heart ailments in patients as it considers all the important attributes or features responsible in causing a heart ailment. It reduces the several tests and time taken to predict and diagnose patients with heart diseases and is also cost effective. The different data mining classifiers used are Naïve Bayes, Decision trees Random Forest and Random Tree. 2.1. Naïve Bayes This algorithm has its basis from the Bayes theorem and follows the concept of conditional independence i.e., it considers an attribute value of a given class to be independent of other attributes values. The main objective is to maximize the posterior probability in predicting the class [19]. The Bayes theorem is as follows: Let X={x1, x2, ....., xm} be a set of m attributes. In Bayesian, X is considered as evidence and H be some hypothesis that the data of X belongs to specific class C. We have to determine P (H|X), the probability that the hypothesis H holds given the evidence i.e. data sample X. According to Bayes theorem the P (H|X) is expressed as P (H|X) = P (X| H) P (H) / P (X) Naïve Bayes classifier performs relatively better with nominal data but not with numeric data [20].
  • 4. IJ-ICT ISSN: 2252-8776  Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary) 59 2.2. Random Tree It uses the concept of divide and conquer in order to build decision trees [21]. At each node of the tree, the algorithm takes up a feature that further divides the data into subsets with the help of a splitting algorithm [22]. Every predicting node leads to a class or decision rule. In order to classify a new item, it should build a decision tree taking into consideration the existing items in the training set. With the help of information gain it finds the dependent variable that clearly discriminates other instances. Now among the possible values of this variable if any of the value turns out to be the target value or leaf node and we can terminate that branch. Hence we follow this method until we get the decision rules of different combination of features leads to the target value. 2.3. Random Forest It is a classifier which consists of n number of decision trees where in each individual result is depicted as a separate tree. It was derived from the random decision of forest that was proposed by Tin Kam Ho of Bell Labs in 1995 [23]. This approach does the selection of attributes randomly in order to build the decision trees but with restricted fluctuations. Multi-classifiers are the after effect of combining a few independent classifiers and sets of classifiers responsible for broadening the execution have been depicted in Figure 1 [24]. Figure 1. Architectural diagram of proposed work 3. RESULTS AND ANALYSIS In Waikato Environment for Knowledge Analysis (WEKA) is used for analyzing and applying the different classifiers on the datasets because of its excellence in discovering, analysis and predicting efficient patterns [25]. A detailed analysis is done in Weka wherein Table 1 represents the results obtained by applying the classifiers on a dataset of 7 attributes. Table 2 depicts the results obtained by applying the classifiers on a dataset of 10 attributes and Table 3 depicts the results obtained by applying the classifiers on a dataset of 13 attributes. Figure 2 depicts the gradual increase in the prediction accuracies along with the increase in the attributes. Table 1. Accuracies Achieved with 7 Attributes Evaluation criteria Classifiers Naïve Bayes Random Forest Random Tree Time taken to build model (in sec) 0 0.25 0 Correctly classified instances 71 72 68 Incorrectly classified instances 20 19 23 Accuracy (%) 78.022% 79.1209% 74.7253%
  • 5.  ISSN: 2252-8776 IJ-ICT Vol. 8, No. 1, April 2019 : 56–62 60 Table 2. Accuracies Achieved with 10 Attributes. Evaluation criteria Classifiers Naïve Bayes Random Forest Random Tree Time taken to build model (in sec) 0 0.14 0 Correctly classified instances 72 71 67 Incorrectly classified instances 15 16 20 Accuracy (%) 82.7586% 81.6092% 77.0115% Table 3. Accuracies Achieved with 13 Attributes Evaluation criteria Classifiers Naïve Bayes Random Forest Random Tree Time taken to build model (in sec) 0 0.14 0 Correctly classified instances 76 79 72 Incorrectly classified instances 15 12 19 Accuracy (%) 83.5165% 86.8132% 79.1209% Figure 2. Fluctuations in prediction accuracies. The initial dataset (Dataset-1) consists of 7 attributes making up for 300 records namely age which is continuous attribute, sex which is categorical attribute, exercise induced agina which is nominal attribute, depression induced by exercise forming the categorical attribute, slope of peak exercise again a categorical attribute, number of major vessels a nominal attribute and finally the defect type which is categorical again. The second dataset (Dataset-2) consists of all the first dataset attributes along with additional 3 more attributes namely the chest pain type which is a categorical attribute, cholesterol which is continuous and resting ECG measure again categorical attribute totally making up for 290 records. Dataset-1 and Dataset-2 are collected by taking into consideration the various heart ailment attributes through a survey done with the patients. Finally, the third dataset Cleveland Heart Disease dataset consists of 303 records with 13 attributes is used. Predictable attribute (num) is a common class label for all the three datasets wherein, Value = 0: depicts that the patient has no heart disease Value = 1: depicts that the patient has heart disease. All the dataset are analyzed and the classifiers are applied on a basis of 70% split i.e., 70% of dataset is used as training data and 30% as testing data. Step – 1: Pre-processing of datasets. Dataset-1 with 7 attributes didn’t have any missing values and so does the second dataset. But the third Dataset-3 had missing values which was rectified by using the Weka filter ReplaceMissingValues. Since the class variable is the same for all the datasets it is converted from numeric to nominal using the filter numeric to nominal under the unsupervised attribute filters in order to make few of the classifiers like Naïve Bayes and Decision trees to be applicable. Further the multi-class classification in the third benchmark dataset is converted two-class classification. this section, it is explained the results of research and at the same time is given the comprehensive discussion. Results can be presented in figures, graphs, tables and others that make the reader understand easily [2], [5]. 78.02% 79.12% 74.73% 82.76% 81.61% 77.01% 83.52% 86.81% 79.12% NAÏVE BAYES RANDOM FOREST RANDOM TREE Fluctuation in prediction accuracies Dataset-1 Dataset-2 Dataset-3
  • 6. IJ-ICT ISSN: 2252-8776  Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary) 61 Results are analyzed by applying the three classifiers Naïve Bayes, Random Tree and Random Forest on all the three datasets to see if there are fluctuations in the prediction accuracies after applying them to build the models. Step – 2: Apply the 3 classifiers on the dataset with 7 attributes. Initially, the Dataset-1 with 7 attributes is pre-processed and the class label is converted to nominal using the filter Numeric to Nominal on which the 3 classifiers are applied. It is evident from Table 1 that the Random forest algorithm outperforms with 7 attributes. Step – 3: Apply the 3 classifiers on the dataset with 10 attributes. Secondly, the Dataset-2 with 10 attributes is pre-processed and the class label is converted to nominal using the filter Numeric to Nominal on which the 3 classifiers are applied. It is evident from Table 2 that the Naïve Bayes algorithm outperforms and the accuracies of all the classifiers are gradually increased with the addition of 3 more attributes chest pain type, cholesterol and ECG measure to the Dataset-1. Step – 4: Apply the 3 classifiers on the dataset with 13 attributes. Finally, the Dataset-3 with 13 attributes is pre-processed and the class label is converted to nominal using the filter Numeric to Nominal on which the 3 classifiers are applied. It is evident from Table 3 that the Random Forest algorithm outperforms and the accuracies of all the classifiers are gradually increased with the addition of 3 more attributes blood pressure, fasting sugar and thalach to the Dataset-2. From the Table 4 it is evident that with the increase in the number of attributes, the prediction accuracies have gradually increased irrespective of the classifiers used and the dataset values. Moreover, for different datasets different classifiers outperforms i.e., in case of Dataset-1 with 7 attributes Random Forest classifier outperforms with an accuracy of 79.1209%, whereas for Dataset-2 with 10 attributes Naïve Bayes classifier outperforms with an accuracy of 82.7586% and finally for the Dataset-3 with 13 attributes the Random Forest attribute outperforms. Hence, it clearly shows that consideration of all the important attributes or features contribute to a better and accurate prediction of heart diseases in patients. From the Figure 2 it is evident that there is a gradual increase in the prediction accuracies analyzed on the three datasets with different number of attributes and values using the above three classifiers. It even depicts that in case of Dataset-1 Random forest algorithm outperforms, for Dataset-2 Naïve Bayes algorithm outperforms with an accuracy of 82.75% and for Dataset-3 again Random forest outperforms with an accuracy of 86.81%. Table 4. Accuracy Fluctuations among the 3 Datasets Datasets Classifiers Naïve Bayes Random Forest Random Tree Dataset-1 78.022% 79.1209% 74.7253% Dataset-2 82.7586% 81.6092% 77.0115% Dataset-3 83.5165% 86.8132% 79.1209% 4. CONCLUSION Finally, to conclude from the results and analysis performed in this paper it is evident that each and every attribute contribute uniquely to the prediction accuracy and hence with the step by step increase in the heart ailment attributes, the prediction accuracy of heart ailment gradually increases irrespective of the classifier used to build the model. Adding to this it is also evident that for different dataset with different values and number of attributes different classifiers out performs each time. Future work can be expanded by finding the most appropriate features required for prediction of heart ailment and use other data mining methods such as neural networks, regression etc., for prediction of heart disease. REFERENCES [1] Yan, H., J. Zheng, et al. (2003). "Development of a decision support system for heart disease diagnosis using multilayer perceptron." Proceedings of the 2003 International Symposium on vol.5: pp. V-709- V-712. [2] Sandhya, J., P. Deepa Shenoy, et al. (2010). "Classification of Neurodegenerative Disorders Based on Major Risk Factors Employing Machine Learning Techniques." International Journal of Engineering and Technology Vol.2, No.4. [3] Jaymin Patel, Prof.TejalUpadhyay and Dr. Samir Patel, "Heart Disease Prediction using Machine Learning and Data Mining Techniques ", Nirma University, Gujarat, India IJCSC Vol 7,number 1 september 2015-march 2016 pp.129- 137. [4] Webmd.com. ”Risk factors for heart disease,” retrieved 20/8/2014 from http://www.webmd.com/heart-disease/risk- factors-heart-disease.
  • 7.  ISSN: 2252-8776 IJ-ICT Vol. 8, No. 1, April 2019 : 56–62 62 [5] National Heart, Lung, and Blood Institute, ”What is coronary heart disease?”retrieved 20/8/2014 from http://www.nhlbi.nih.gov/health/healthtopics/ topics/cad/. [6] Das, R., I. Turkoglu, et al. (2009). "Effective diagnosis of heart disease through neural networks ensembles." Expert Systems with Applications, Elsevier 36 (2009): 7675–7680. [7] Panzarasa, S., S. Quaglini, et al. (2010). "Data mining techniques for analyzing stroke care processes." Proceedings of the 13th World Congress on Medical Informatics. [8] V. Chaurasia, “Early Prediction of Heart Diseases Using Data Mining,” Caribbean Journal of Science and Technology, vol. 1, pp. 208–217, 2013. [9] Srinivas, K.,” Analysis of coronary heart disease and prediction of heart attack in coal mining regions using data mining techniques”, IEEE Transaction on Computer Science and Education (ICCSE), p(1344 - 1349), 2010. [10] Ruben, D. C. J. (2009). "Data Mining in Healthcare: Current Applications and Issues." [11] Randa El-Bialy , Mostafa A. Salamay, Omar H. Karam and M.Essam Khalifa, “Feature Analysis of Coronery Artery Heart Disease Data Sets”, International Conference on Communication, Management and Information Technology (ICCMIT 2015), Procedia Computer Science 65 ( 2015 ) 459 – 468 [12] V. Chauraisa and S. Pal, “Data Mining Approach to Detect Heart Diseases”, International Journal of Advanced Computer Science and Information Technology (IJACSIT),Vol. 2, No. 4,2013, pp 56-66. [13] Chaitrali S. Dangare and Sulabha S. Apte, “Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques”, International Journal of Computer Applications (0975 – 888) Volume 47– No.10, June 2012. [14] M. Anbarasi, E. Anupriya and N.CH.S.N.Iyengar, “Enhanced Prediction of Heart Disease with Feature Subset Selection using Genetic Algorithm”, International Journal of Engineering Science and Technology Vol. 2(10), 2010, 5370-537. [15] Jyoti Rohilla and Preeti Gulia, “Analysis of Data Mining Techniques for Diagnosing Heart Disease”, International Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 7, July 2015, ISSN: 2277 128X . [16] R. Tamilarasi and Dr. R. Porkodi, “A Study and Analysis of Disease Prediction Techniques in Data Mining for Healthcare”, International Journal of Emerging Research in Management &Technology, March 2015, ISSN: 2278- 9359 (Volume-4, Issue-3) [17] Shadab Adam Pattekari and Asma Parveen, prediction system for heart disease using naïve bayes, International Journal of Advanced Computer and Mathematical Sciences, 2012. [18] M.Akhil jabbers, Dr.prirti Chandra, Dr.B.L Deekshatulu, and Heart Disease Prediction System using Associative Classification and Genetic Algorithm, International Conference on Emerging Trends in Electrical, Electronics and Communication Technologies, 2012. [19] M. Bramer, Principles of Data Mining: Springer-Verlag, 2007. [20] R. Alizadehsani, J. Habibi, B. Bahadorian, H. Mashayekhi, A. Ghandeharioun, and R. Boghrati, et al., "Diagnosis of coronary arteries stenosis using data mining," J Med Signals Sens, vol. 2, pp. 153-9, Jul 2012. [21] J.Nahar, T. Imam, K. S. Tickle, and Y.-P. P. Chen, “Computational Intelligence for heart disease diagnosis: A medical Knowledge driven approach,” Expert Systems with Applications, vol. 40, no. 1, pp. 96–104,2013. [22] M. Kumari and S. Godara, "Comparative study of data mining classification methodsin cardiovascular disease prediction," IJCST vol. 2, pp. 304-305, 2011. [23] Y. E. Shao, C.-D. Hou, and C.-C. Chiu, “Hybrid intelligent modelling, schemes for heart disease classification,” Applied Soft Computing,vol. 14, pp. 47–52, 2014. [24] P. V. Ankur Makwana, “Identify the patients at high risk of re-admissionin hospital in the next year,” International Journal of Science andResearch, vol. 4, pp. 2431–2434, 2015. [25] A. Aziz, N. Ismail, And F. Ahmad, “Mining Students’ academic Performance.,” Journal of Theoretical & Applied Information Technology, vol. 53, no. 3, 2013.