Predicting Heart Ailment in Patients with Varying number of Features using Da...IJECEIAES
Data mining can be defined as the process of extracting previously unknown, verifiable, and potentially useful information from data. Among the various ailments, heart disease is one of the primary causes of death worldwide; hence, to curb it, a detailed analysis is performed using data mining. Analyses are often limited to the minimal set of attributes required to predict heart disease in a patient, thereby missing many important attributes that are major causes of the disease. This research therefore considers almost all the important features affecting heart disease and performs the analysis step by step, from a minimal to a maximal set of attributes, using data mining techniques to predict heart ailments. The classification methods used are the Naïve Bayes classifier, Random Forest, and Random Tree, applied to three datasets with different numbers of attributes but a common class label. The analysis shows a gradual increase in prediction accuracy as attributes are added, irrespective of the classifier used, with the Naïve Bayes and Random Forest algorithms comparatively outperforming the others on these datasets.
INTEGRATING NAIVE BAYES AND K-MEANS CLUSTERING WITH DIFFERENT INITIAL CENTROI...cscpconf
Heart disease has been the leading cause of death in the world over the past 10 years. Researchers have been using several data mining techniques to help health care professionals diagnose heart disease. Naïve Bayes is one such technique and has shown considerable success. K-means clustering is one of the most popular clustering techniques; however, its results depend strongly on the selection of the initial centroids. This paper demonstrates the effectiveness of an unsupervised learning technique, k-means clustering, in improving a supervised learning technique, Naïve Bayes. It investigates integrating k-means clustering with Naïve Bayes in the diagnosis of heart disease patients, and compares different methods of initial centroid selection for k-means, such as the range, inlier, outlier, random attribute values, and random row methods. The results show that integrating k-means clustering with Naïve Bayes under different initial centroid selections can enhance Naïve Bayes accuracy in diagnosing heart disease patients, and that the two-cluster random row initial centroid selection method achieves higher accuracy than the other initial centroid selection methods, at 84.5%.
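The "random row" initialization evaluated above simply draws k existing records as the starting centroids. A minimal k-means sketch in that spirit (pure Python; the data and names are illustrative, not the paper's implementation):

```python
import random

def kmeans(rows, k, iters=20, seed=0):
    """Plain k-means with 'random row' initial centroid selection:
    the k starting centroids are k distinct records drawn from the data."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    assign = [0] * len(rows)
    for _ in range(iters):
        # assignment step: nearest centroid by squared Euclidean distance
        for i, r in enumerate(rows):
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(r, centroids[c])))
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

# two well-separated groups of 2-D points
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
labels, cents = kmeans(data, k=2)
print(labels)  # the first three points share one label, the last three the other
```

The paper's range, inlier, and outlier methods would replace only the `rng.sample` line; the rest of the loop is unchanged.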
Heart Disease Prediction using Machine Learning Algorithmijtsrd
Nowadays, heart disease has become dangerous to human beings and affects the body severely; it can lead, for example, to blood clotting. Predicting heart disease is a very difficult task in medical science. It has been estimated that 12 million people die every year as a result of heart disease. In this paper, we propose a k-Nearest Neighbors (KNN) approach to improve the accuracy of heart disease diagnosis. We show that KNN achieves better accuracy than the Random Forest algorithm for detecting heart disease and gives a more precise outcome. We used a dataset with 13 attributes and a target attribute, and by applying machine learning we achieved 84% accuracy in heart disease detection. Ravi Kumar Singh | Dr. A Rengarajan "Heart Disease Prediction using Machine Learning Algorithm" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-2, February 2021, URL: https://www.ijtsrd.com/papers/ijtsrd38358.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/38358/heart-disease-prediction-using-machine-learning-algorithm/ravi-kumar-singh
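The KNN idea the paper applies to its 13 clinical attributes reduces to a distance computation and a majority vote. A from-scratch sketch (the two features and their values are invented, not the paper's dataset):

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points,
    using Euclidean distance."""
    dists = sorted(
        (math.dist(x, row), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# toy data: two clinical-looking features (values are made up)
X = [(120, 180), (125, 190), (118, 175), (160, 280), (165, 290), (158, 270)]
y = [0, 0, 0, 1, 1, 1]  # 0 = healthy, 1 = heart disease
print(knn_predict(X, y, (122, 185)))  # → 0
print(knn_predict(X, y, (162, 285)))  # → 1
```

In practice the features would be scaled first, since raw clinical attributes have very different ranges and KNN is distance-based.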
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBaseijtsrd
The early prognosis of cardiovascular diseases can aid decisions about lifestyle changes in high-risk patients and in turn reduce their complications. Research has attempted to pinpoint the most influential factors of heart disease as well as to accurately predict the overall risk using homogeneous data mining techniques. Recent research has delved into amalgamating these techniques using approaches such as hybrid data mining algorithms. This paper proposes a rule-based model that compares the accuracies of applying rules to the individual results of logistic regression on the Cleveland Heart Disease Database in order to present an accurate model for predicting heart disease. K. Sandhya Rani | M. Sai Chaitanya | G. Sai Kiran "A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3, April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd11402.pdf Paper URL: http://www.ijtsrd.com/computer-science/data-miining/11402/a-heart-disease-prediction-model-using-logistic-regression-by-cleveland-database/k-sandhya-rani
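The logistic regression step underlying such a model can be written from scratch in a few lines. This sketch shows only the regression component trained by gradient descent; the paper's rule layer on top of it is not reproduced, and the one-feature data is invented:

```python
import math

def train_logreg(X, y, lr=0.1, epochs=2000):
    """Logistic regression fitted by gradient descent on the log-loss."""
    n_feat = len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - yi                       # dLoss/dz for log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z >= 0 else 0

# linearly separable toy data (a single risk-score-like feature, made up)
X = [(0.1,), (0.2,), (0.3,), (0.7,), (0.8,), (0.9,)]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logreg(X, y)
print([predict(w, b, xi) for xi in X])  # → [0, 0, 0, 1, 1, 1]
```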
prediction of heart disease using machine learning algorithmsINFOGAIN PUBLICATION
The successful application of data mining in highly visible fields like marketing, e-business, and retail has led to its adoption in other sectors and industries, and healthcare is being explored among these areas. There is an abundance of data available within healthcare systems; however, there is a scarcity of effective analysis tools for finding the hidden relationships in that data. This research provides a detailed description of the Naïve Bayes and decision tree classifiers that are applied in our research, particularly in the prediction of heart disease. Experiments were conducted to compare the performance of these predictive data mining techniques on the same dataset, and the results reveal that the decision tree outperforms Bayesian classification.
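For continuous clinical attributes, the Naïve Bayes classifier compared above is usually instantiated as Gaussian Naive Bayes: per class, store a prior plus each feature's mean and variance, and treat features as independent. A hedged from-scratch sketch (the data is invented):

```python
import math

def fit_gnb(X, y):
    """Gaussian Naive Bayes: per class, store the prior and each feature's
    mean/variance; 'naive' because features are treated as independent."""
    stats = {}
    for c in set(y):
        rows = [x for x, yi in zip(X, y) if yi == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        vars_ = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                 for col, m in zip(zip(*rows), means)]
        stats[c] = (len(rows) / len(X), means, vars_)
    return stats

def predict_gnb(stats, x):
    """Pick the class maximizing log prior + Gaussian log-likelihoods."""
    def log_post(c):
        prior, means, vars_ = stats[c]
        ll = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                 for xi, m, v in zip(x, means, vars_))
        return math.log(prior) + ll
    return max(stats, key=log_post)

# toy data: (blood pressure, cholesterol)-style pairs, values made up
X = [(110, 150), (115, 160), (112, 155), (170, 300), (175, 310), (172, 305)]
y = [0, 0, 0, 1, 1, 1]
model = fit_gnb(X, y)
print(predict_gnb(model, (113, 158)))  # → 0
print(predict_gnb(model, (173, 308)))  # → 1
```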
In the healthcare sector, data are enormous and diverse because they contain data of different types, and extracting knowledge from these data is crucial. To obtain that knowledge, data mining techniques may be utilized to mine it by building models from healthcare datasets. At present, the classification of heart disease patients is a demanding research challenge for many researchers. For building a classification model for these patients, we used four different classification algorithms: NaiveBayes, MultilayerPerceptron, RandomForest, and DecisionTable. The intention behind this work is to classify whether a patient tests positive or negative for heart disease, based on diagnostic measurements integrated into the dataset.
A Heart Disease Prediction Model using Logistic Regressionijtsrd
The early prognosis of cardiovascular diseases can aid decisions about lifestyle changes in high-risk patients and in turn reduce their complications. Research has attempted to pinpoint the most influential factors of heart disease as well as to accurately predict the overall risk using homogeneous data mining techniques. Recent research has delved into amalgamating these techniques using approaches such as hybrid data mining algorithms. This paper proposes a rule-based model that compares the accuracies of applying rules to the individual results of logistic regression on the Cleveland Heart Disease Database in order to present an accurate model for predicting heart disease. K. Sandhya Rani | M. Sai Manoj | G. Suguna Mani "A Heart Disease Prediction Model using Logistic Regression" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3, April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd11401.pdf Paper URL: http://www.ijtsrd.com/computer-science/data-miining/11401/a-heart-disease-prediction-model-using-logistic-regression/k-sandhya-rani
Ascendable Clarification for Coronary Illness Prediction using Classification...ijtsrd
Coronary disease is predicted using classification techniques. The data mining tool WEKA has been used to implement the Naïve Bayes classifier. The proposed work aims to enhance the performance of the models: to improve classification accuracy, Naïve Bayes is combined with Bagging and Attribute Selection. Experimental results demonstrated a significant improvement over the existing Naïve Bayes classifier. This approach enhances classification accuracy and reduces computational time. D. Haripriya | Dr. M. Lovelin Ponn Felciah "Ascendable Clarification for Coronary Illness Prediction using Classification Mining and Feature Selection Performances" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26690.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26690/ascendable-clarification-for-coronary-illness-prediction-using-classification-mining-and-feature-selection-performances/d-haripriya
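Bagging, as used above to boost Naïve Bayes, trains several copies of a base classifier on bootstrap resamples and lets them vote. A minimal sketch where a trivial one-feature threshold stump stands in for Naïve Bayes (everything here is illustrative, not the paper's WEKA setup):

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Tiny base learner: threshold the single feature at the midpoint
    between the class means (stands in for Naïve Bayes here)."""
    m0 = sum(x[0] for x, yi in zip(X, y) if yi == 0) / y.count(0)
    m1 = sum(x[0] for x, yi in zip(X, y) if yi == 1) / y.count(1)
    thr = (m0 + m1) / 2
    return lambda x: 1 if x[0] >= thr else 0

def bagged_fit(X, y, n_models=15, seed=0):
    """Bagging: each model is fitted on a bootstrap resample
    (drawn with replacement, same size as the training set)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        while True:
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            ys = [y[i] for i in idx]
            if 0 in ys and 1 in ys:   # ensure both classes in the resample
                break
        models.append(fit_stump([X[i] for i in idx], ys))
    return models

def bagged_predict(models, x):
    """Majority vote over the bagged models."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

X = [(1.0,), (1.2,), (1.1,), (3.0,), (3.2,), (3.1,)]
y = [0, 0, 0, 1, 1, 1]
models = bagged_fit(X, y)
print([bagged_predict(models, xi) for xi in X])  # → [0, 0, 0, 1, 1, 1]
```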
Heart Disease Prediction Using Data Mining TechniquesIJRES Journal
There are huge amounts of data in the medical industry that are not processed properly and hence cannot be used effectively in decision making. Data mining techniques can be used to mine the patterns and relationships in these data. This research has developed a prototype heart disease prediction system using data mining techniques, namely Neural Networks, K-Means Clustering, and Frequent Item Set Generation. Using medical profile attributes such as age, sex, blood pressure, and blood sugar, it can predict the likelihood of patients getting heart disease. It enables significant knowledge, e.g. patterns and relationships between medical factors related to heart disease, to be established. The performance of these techniques is compared through sensitivity, specificity, and accuracy, and it has been observed that Artificial Neural Networks outperform K-Means clustering on all three parameters.
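The three comparison metrics used above come straight from the confusion matrix. A quick reference implementation (the label vectors below are made up):

```python
def confusion_metrics(actual, predicted):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP),
    accuracy = (TP+TN)/total, with 1 as the 'disease present' class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(actual),
    }

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(confusion_metrics(actual, predicted))
# sensitivity 0.75, specificity ≈ 0.833, accuracy 0.8
```

Reporting all three matters in this domain: with imbalanced clinical data a classifier can score high accuracy while missing most sick patients, which sensitivity exposes.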
Android Based Questionnaires Application for Heart Disease Prediction Systemijtsrd
Today, classification techniques in data mining are popular for prediction and data exploration. This Heart Disease Prediction System (HDPS) uses Naïve Bayesian classification, comparing simple probability estimates with Jelinek-Mercer (JM) smoothing. It is implemented as an Android-based application: the user answers the questionnaire and can then view the result in different ways, namely whether heart disease is present or not, together with a prediction level of No, Low, Average, High, or Very High. The system also provides the patient with suggestions such as doctor details and medications. It is further shown that Naïve Bayes enhanced with Jelinek-Mercer smoothing is effective in eliminating noise when predicting heart disease. The system can also calculate classifier accuracy using precision and recall. Nan Yu Hlaing | Phyu Pyar Moe "Android Based Questionnaires Application for Heart Disease Prediction System" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26750.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/26750/android-based-questionnaires-application-for-heart-disease-prediction-system/nan-yu-hlaing
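The Jelinek-Mercer smoothing compared above interpolates a class-conditional probability estimate with the overall (collection) estimate, so an answer value never seen in a class does not zero out the whole Naïve Bayes product. A hedged sketch (λ and the counts are illustrative):

```python
def jm_probability(count_in_class, class_total,
                   count_overall, overall_total, lam=0.7):
    """Jelinek-Mercer smoothing: linearly interpolate the maximum-likelihood
    estimate within a class with the collection-wide estimate, so values
    unseen in a class never receive probability zero."""
    p_ml = count_in_class / class_total
    p_collection = count_overall / overall_total
    return lam * p_ml + (1 - lam) * p_collection

# a symptom never observed in one class still gets probability mass
# from the collection model (all counts are made up)
p = jm_probability(count_in_class=0, class_total=50,
                   count_overall=20, overall_total=200, lam=0.7)
print(round(p, 6))  # → 0.03
```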
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...csitconf
Feature Selection (FS) has become the focus of much research on decision support systems in areas where datasets with a tremendous number of variables are analyzed. In this paper we present a new method for the diagnosis of Coronary Artery Disease (CAD) founded on Genetic Algorithm (GA)-wrapped Naïve Bayes (BN)-based FS. The CAD dataset contains two classes defined by 13 features. In the GA-BN algorithm, the GA generates in each iteration a subset of attributes that is evaluated using the BN classifier in the second step of the selection procedure. The final set of attributes contains the most relevant feature model, which increases the accuracy. The algorithm produces 85.50% classification accuracy in the diagnosis of CAD. The performance of the algorithm is then compared with Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and the C4.5 decision tree algorithm, whose classification accuracies are 83.5%, 83.16%, and 80.85%, respectively. The GA-wrapped BN algorithm is also compared with other FS algorithms. The obtained results show very promising outcomes for the diagnosis of CAD.
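The wrapper loop described above (a GA proposing attribute subsets that a classifier then scores) can be sketched in miniature. Here a nearest-centroid classifier with leave-one-out accuracy stands in for the paper's Naïve Bayes fitness function, and all population sizes, rates, and data are illustrative:

```python
import random

def loo_accuracy(X, y, mask):
    """Fitness: leave-one-out accuracy of a nearest-centroid classifier
    restricted to the features where mask[j] == 1."""
    feats = [j for j, m in enumerate(mask) if m]
    if not feats:
        return 0.0
    hits = 0
    for i in range(len(X)):
        cents = {}
        for c in set(y):
            rows = [[X[k][j] for j in feats]
                    for k in range(len(X)) if k != i and y[k] == c]
            cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
        xi = [X[i][j] for j in feats]
        pred = min(cents, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(xi, cents[c])))
        hits += pred == y[i]
    return hits / len(X)

def ga_select(X, y, pop=8, gens=15, seed=1):
    """Toy GA wrapper: bit-mask chromosomes, elitism, one-point
    crossover between top-ranked parents, per-bit mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    P = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(P, key=lambda m: loo_accuracy(X, y, m), reverse=True)
        P = scored[:2]                          # elitism: keep the best two
        while len(P) < pop:
            a, b = rng.sample(scored[:4], 2)    # parents from the top four
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # mutate
            P.append(child)
    best = max(P, key=lambda m: loo_accuracy(X, y, m))
    return best, loo_accuracy(X, y, best)

# feature 0 separates the classes; features 1 and 2 are noise
X = [(0.1, 5.0, 2.0), (0.2, 1.0, 9.0), (0.3, 7.0, 4.0),
     (0.9, 6.0, 3.0), (1.0, 2.0, 8.0), (1.1, 8.0, 5.0)]
y = [0, 0, 0, 1, 1, 1]
mask, score = ga_select(X, y)
print(mask, score)
```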
A Survey on Heart Disease Prediction Techniquesijtsrd
Heart disease is the main reason for a huge number of deaths in the world over the last few decades and has evolved into the most life-threatening disease. The healthcare industry is rich in information, so there is a need to discover the hidden patterns and trends within it. For this purpose, data mining techniques can be applied to extract knowledge from large sets of data. Many researchers in recent times have been using machine learning techniques for predicting heart-related diseases, as they can predict the disease effectively. Even though machine learning techniques prove effective in assisting decision makers, there is still scope for developing an accurate and efficient system to diagnose and predict heart diseases, thereby easing doctors' work. This paper presents a survey of various techniques used for predicting heart disease and reviews their performance. G. Niranjana | Dr I. Elizabeth Shanthi "A Survey on Heart Disease Prediction Techniques" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-2, February 2021, URL: https://www.ijtsrd.com/papers/ijtsrd38349.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/38349/a-survey-on-heart-disease-prediction-techniques/g-niranjana
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCEijaia
The heart is the most important organ of the human body: it not only circulates oxygen and other vital nutrients through the blood to different parts of the body, aiding metabolic activities, but also removes metabolic wastes. Thus, even a minor problem can affect the whole organism. Hence, researchers are devoting a lot of data analysis work to assisting doctors in predicting heart problems. An analysis of data related to different health problems and the organ's functioning can help predict, with a certain probability, the wellness of this organ. In this paper we have analysed prescription data of patients from different parts of India. Using this data, we have built a model that is trained on it and tries to predict whether a new out-of-sample record carries a probability of heart attack or not. This model can support decision making alongside the doctor to treat the patient well and to create transparency between doctor and patient. On the validation set, it is not only accuracy that the model must account for; the True Positive Rate and False Negative Rate, along with the AUC-ROC, help in building and tuning the algorithm inside the model.
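The AUC-ROC mentioned above equals the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one, so it can be computed directly from pairwise comparisons rather than by tracing the full ROC curve. A small sketch (the scores are made up):

```python
def auc_roc(scores, labels):
    """AUC as the fraction of positive/negative pairs the model orders
    correctly, counting ties as half: the Mann-Whitney formulation."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.2]
labels = [1,   1,   0,   1,    0,   0]
print(auc_roc(scores, labels))  # → 8/9 ≈ 0.889
```

Unlike accuracy, this value does not depend on any classification threshold, which is why it is useful alongside the TPR/FNR trade-off the paper discusses.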
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNINGIJDKP
Heart disease is the most commonly reported disease in the United States among both genders, and according to official statistics about fifty percent of the American population suffers from some form of cardiovascular disease. This paper performs chi-square tests and regression analysis to predict heart disease based on symptoms like chest pain and dizziness. It will help healthcare sectors provide better assistance for patients suffering from heart disease by predicting it in the beginning stage of the disease. A chi-square test is conducted to identify whether there is a relation between chest pain and heart disease cases in the United States by analyzing a heart disease dataset from IEEE Data Port. The test results and analysis show that males in the United States are the most likely to develop heart disease, with symptoms like chest pain, dizziness, shortness of breath, fatigue, and nausea. The tests also identified a weak correlation of 0.5, which shows that people of all ages, including teens, can face heart disease and that its prevalence increases with age. The tests further indicate that 90 percent of the participants facing severe chest pain suffer from heart disease, that the majority of the successfully identified heart disease cases are in males, and that only 10 percent of participants are identified as healthy. The evaluated p-values fall below the statistical threshold of 0.05, which indicates that factors like sex, exercise angina, cholesterol, oldpeak, ST_Slope, obesity, and blood sugar play a significant role in the onset of cardiovascular disease. We tested the dataset with a prediction model built on logistic regression and observed an accuracy of 85.12 percent.
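The chi-square test of independence described above (e.g. chest pain versus heart disease) reduces to comparing observed cell counts against the counts expected under independence. A from-scratch sketch for a 2×2 table (the counts are invented, not the paper's data):

```python
def chi_square(table):
    """Chi-square statistic for a contingency table:
    sum over cells of (observed - expected)^2 / expected, where
    expected = row_total * col_total / grand_total under independence."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    grand = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / grand
            chi2 += (obs - exp) ** 2 / exp
    return chi2

#            disease  no disease
table = [[30, 10],   # chest pain
         [10, 30]]   # no chest pain
print(chi_square(table))  # → 20.0
```

For a 2×2 table there is one degree of freedom, so a statistic of 20.0 is far beyond the 0.05 critical value of 3.84, i.e. the two variables would be judged dependent.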
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...mlaij
The healthcare industry generates enormous amounts of complex clinical data that make disease detection and prediction a complicated process. In medical informatics, making effective and efficient decisions is very important. Data Mining (DM) techniques are mainly used to identify and extract hidden patterns and interesting knowledge to diagnose and predict diseases in medical datasets. Nowadays, heart disease is considered one of the most important problems in the healthcare field, and early diagnosis leads to a reduction in deaths. DM techniques have proven highly effective for predicting and diagnosing heart diseases. This work applies classification algorithms, namely J48, Random Forest, and Naïve Bayes, to a medical heart disease dataset to determine the accuracy of their performance. We also examine the impact of the feature selection method. A comparative analysis was performed to determine the best technique using the Waikato Environment for Knowledge Analysis (Weka) software, version 3.8.6. The performance of the algorithms was evaluated using standard metrics such as accuracy, sensitivity, and specificity. The importance of using classification techniques for heart disease diagnosis is highlighted. We also reduced the number of attributes in the dataset, which showed a significant improvement in prediction accuracy. The results indicate that the best algorithm for predicting heart disease was Random Forest, with an accuracy of 99.24%.
A Heart Disease Prediction Model using Logistic Regressionijtsrd
The early prognosis of cardiovascular diseases can aid in making decisions to lifestyle changes in high risk patients and in turn reduce their complications. Research has attempted to pinpoint the most influential factors of heart disease as well as accurately predict the overall risk using homogenous data mining techniques. Recent research has delved into amalgamating these techniques using approaches such as hybrid data mining algorithms. This paper proposes a rule based model to compare the accuracies of applying rules to the individual results of logistic regression on the Cleveland Heart Disease Database in order to present an accurate model of predicting heart disease. K. Sandhya Rani | M. Sai Manoj | G. Suguna Mani"A Heart Disease Prediction Model using Logistic Regression" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3 , April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd11401.pdf http://www.ijtsrd.com/computer-science/data-miining/11401/a-heart-disease-prediction-model-using-logistic-regression/k-sandhya-rani
Ascendable Clarification for Coronary Illness Prediction using Classification...ijtsrd
Coronary disease is predicted by classification technique. The data mining tool WEKA has been exploited for implementing Naïve Bayes classifier. Proposed work is trapped with a specific end goal to enhance the execution of models. For improving the classification accuracy Naïve Bayes is combined with Bagging and Attribute Selection. Trial results demonstrated a critical change over in the current Naïve Bayes classifier. This approach enhances the classification accuracy and reduces computational time. D. Haripriya | Dr. M. Lovelin Ponn Felciah "Ascendable Clarification for Coronary Illness Prediction using Classification Mining and Feature Selection Performances" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26690.pdfPaper URL: https://www.ijtsrd.com/computer-science/data-miining/26690/ascendable-clarification-for-coronary-illness-prediction-using-classification-mining-and-feature-selection-performances/d-haripriya
Heart Disease Prediction Using Data Mining TechniquesIJRES Journal
There are huge amounts of data in the medical industry which is not processed properly and hence cannot be used effectively in making decisions. We can use data mining techniques to mine these patterns and relationships. This research has developed a prototype Heart Disease Prediction using data mining techniques, namely Neural Network, K-Means Clustering and Frequent Item Set Generation. Using medical profiles such as age, sex, blood pressure and blood sugar it can predict the likelihood patients getting a heart disease. It enables significant knowledge, e.g. patterns, relationships between medical factors related to heart disease to be established. Performance of these techniques is compared through sensitivity, specificity and accuracy. It has been observed that Artificial Neural Networks outperform K Means clustering in all the parameters i.e. Sensitivity, Specificity and Accuracy.
Android Based Questionnaires Application for Heart Disease Prediction Systemijtsrd
Today classification techniques in data mining are most popular to prediction and data exploration. This Heart Disease Prediction System HDPS is using Naive Bayesian Classification with a comparison for simple probability and that of Jelinek Mercer JM Smoothing. It is implemented as an Android based application user must be feedback and answers the questions then can be seen the result as user desired in different ways exactly heart disease is present or not and then with predictions No, Low, Average, High, Very High . And the system will be provided required suggestions such as doctor details and medications to patients could be able. It will be also proved that enhanced Naive Bayes with Jelinek Mercer smoothing technique is also effective to eliminate the noise for prediction the heart disease. This system can also calculate classifier accuracy by using precision and recall. Nan Yu Hlaing | Phyu Pyar Moe "Android Based Questionnaires Application for Heart Disease Prediction System" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26750.pdfPaper URL: https://www.ijtsrd.com/computer-science/data-miining/26750/android-based-questionnaires-application-for-heart-disease-prediction-system/nan-yu-hlaing
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O... (csitconf)
Feature Selection (FS) has become the focus of much research on decision support systems in areas where datasets with a tremendous number of variables are analyzed. In this paper we present a new method for the diagnosis of Coronary Artery Disease (CAD) founded on Genetic Algorithm (GA) wrapped Bayes Naïve (BN) based FS. Basically, the CAD dataset contains two classes defined by 13 features. In the GA-BN algorithm, the GA generates in each iteration a subset of attributes that is evaluated using the BN in the second step of the selection procedure. The final set of attributes contains the most relevant feature model that increases the accuracy. The algorithm produces 85.50% classification accuracy in the diagnosis of CAD. The algorithm is then compared with Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and the C4.5 decision tree algorithm, whose classification accuracies are respectively 83.5%, 83.16% and 80.85%. The GA-wrapped BN algorithm is likewise compared with other FS algorithms. The obtained results show very promising outcomes for the diagnosis of CAD.
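A GA-wrapped Naïve Bayes feature selector of the kind described can be sketched as follows; the 13-feature dataset here is synthetic, and the GA settings (population 20, 15 generations, 10% mutation) are illustrative assumptions, not the paper's configuration:

```python
import random
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a 13-feature, 2-class CAD dataset: only the
# first 4 features carry class signal, the rest are noise.
n, d = 200, 13
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, d))
X[:, :4] += y[:, None] * 1.5

def nb_accuracy(X, y, mask, train_frac=0.7):
    """Hold-out accuracy of a Gaussian Naive Bayes on the masked features."""
    cols = [i for i in range(X.shape[1]) if mask[i]]
    if not cols:
        return 0.0
    Xs = X[:, cols]
    split = int(len(y) * train_frac)
    Xtr, ytr, Xte, yte = Xs[:split], y[:split], Xs[split:], y[split:]
    stats = {}
    for c in (0, 1):
        Xc = Xtr[ytr == c]
        stats[c] = (Xc.mean(0), Xc.std(0) + 1e-9, len(Xc) / len(ytr))
    preds = []
    for x in Xte:
        scores = {c: np.log(p) - 0.5 * np.sum(((x - mu) / sd) ** 2 + 2 * np.log(sd))
                  for c, (mu, sd, p) in stats.items()}
        preds.append(max(scores, key=scores.get))
    return float(np.mean(np.array(preds) == yte))

def ga_select(X, y, pop=20, gens=15, pmut=0.1):
    """GA wrapper: individuals are feature bitmasks, fitness is NB accuracy."""
    random.seed(0)
    popn = [[random.randint(0, 1) for _ in range(X.shape[1])] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=lambda m: nb_accuracy(X, y, m), reverse=True)
        elite = popn[: pop // 2]
        children = []
        while len(elite) + len(children) < pop:
            a, b = random.sample(elite, 2)          # one-point crossover
            cut = random.randrange(1, X.shape[1])
            child = [g ^ (random.random() < pmut)   # bit-flip mutation
                     for g in a[:cut] + b[cut:]]
            children.append(child)
        popn = elite + children
    best = max(popn, key=lambda m: nb_accuracy(X, y, m))
    return best, nb_accuracy(X, y, best)

best_mask, best_acc = ga_select(X, y)
```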
A Survey on Heart Disease Prediction Techniques (ijtsrd)
Heart disease is the main reason for a huge number of deaths in the world over the last few decades and has evolved into the most life-threatening disease. The health care industry is rich in information, so there is a need to discover hidden patterns and trends in it; for this purpose, data mining techniques can be applied to extract knowledge from large sets of data. In recent times, many researchers have been using several machine learning techniques for predicting heart-related diseases, as they can predict the disease effectively. Even though machine learning techniques prove effective in assisting decision makers, there is still scope for developing an accurate and efficient system to diagnose and predict heart diseases, thereby easing doctors' work. This paper presents a survey of various techniques used for predicting heart disease and reviews their performance. G. Niranjana | Dr I. Elizabeth Shanthi "A Survey on Heart Disease Prediction Techniques" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-2, February 2021, URL: https://www.ijtsrd.com/papers/ijtsrd38349.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/38349/a-survey-on-heart-disease-prediction-techniques/g-niranjana
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE (ijaia)
The heart is the most important organ of the human body: it not only circulates oxygen and other vital nutrients through the blood to different parts of the body and supports metabolic activities, but also removes metabolic wastes. Thus, even a minor problem can affect the whole organism. Hence, researchers are directing a lot of data analysis work toward assisting doctors in predicting heart problems, since an analysis of data related to different health problems and the organ's functioning can help in predicting, with a certain probability, the wellness of this organ. In this paper we have analysed the prescribed data of patients from different parts of India. Using this data, we have built a model that is trained on it and tries to predict whether a new out-of-sample record carries a probability of heart attack or not. This model can support decision making alongside the doctor to treat the patient well, creating transparency between the doctor and the patient. On the validation set, it is not only accuracy that the model has to take care of; the True Positive Rate and False Negative Rate, along with the AUC-ROC, help in building and tuning the algorithm inside the model.
HEART DISEASE PREDICTION USING MACHINE LEARNING AND DEEP LEARNING (IJDKP)
Heart disease is the most common disease currently reported in the United States among both genders, and according to official statistics about fifty percent of the American population suffers from some form of cardiovascular disease. This paper performs chi-square tests and linear regression analysis to predict heart disease based on symptoms like chest pain and dizziness. It will help healthcare sectors provide better assistance to patients suffering from heart disease by predicting it at an early stage. A chi-square test is conducted to identify whether there is a relation between chest pain and heart disease cases in the United States by analyzing a heart disease dataset from IEEE Data Port. The test results and analysis show that males in the United States are most likely to develop heart disease with symptoms like chest pain, dizziness, shortness of breath, fatigue, and nausea. The test also identifies a weak correlation of 0.5, which shows that people of all ages, including teens, can face heart disease and that its prevalence increases with age. The tests further indicate that 90 percent of participants facing severe chest pain suffer from heart disease, that the majority of identified heart disease cases are in males, and that only 10 percent of participants are identified as healthy. The evaluated p-values, judged against the statistical threshold of 0.05, lead to the conclusion that factors like sex, exercise angina, cholesterol, old peak, ST_Slope, obesity, and blood sugar play a significant role in the onset of cardiovascular disease. We have tested the dataset with a prediction model built on logistic regression and observed an accuracy of 85.12 percent.
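The chi-square test of independence applied here can be sketched in a few lines; the contingency counts below are illustrative, not taken from the IEEE Data Port dataset:

```python
# Chi-square test of independence on a 2x2 contingency table:
# rows = chest pain (yes/no), columns = heart disease (yes/no).
# The counts are hypothetical, chosen only for illustration.
table = [[90, 10],    # chest pain:    90 with disease, 10 without
         [30, 70]]    # no chest pain: 30 with disease, 70 without

def chi_square(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            exp = rows[i] * cols[j] / total   # expected count under independence
            stat += (obs - exp) ** 2 / exp
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

stat, dof = chi_square(table)
# For dof = 1 the 0.05 critical value is about 3.841; a statistic above
# it means chest pain and heart disease are not independent here.
significant = stat > 3.841
```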
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE... (mlaij)
The healthcare industry generates enormous amounts of complex clinical data that make the prediction of
disease detection a complicated process. In medical informatics, making effective and efficient decisions is
very important. Data Mining (DM) techniques are mainly used to identify and extract hidden patterns and
interesting knowledge to diagnose and predict diseases in medical datasets. Nowadays, heart disease is
considered one of the most important problems in the healthcare field. Therefore, early diagnosis leads to
a reduction in deaths. DM techniques have proven highly effective for predicting and diagnosing heart
diseases. This work applies classification algorithms, namely J48, Random Forest, and Naïve Bayes, to a medical heart disease dataset to assess the accuracy of their performance. We also examine the impact of the feature selection method. A comparative analysis study was performed to determine the best technique using the Waikato Environment for Knowledge Analysis (Weka) software, version 3.8.6. The
performance of the utilized algorithms was evaluated using standard metrics such as accuracy, sensitivity
and specificity. The importance of using classification techniques for heart disease diagnosis has been
highlighted. We also reduced the number of attributes in the dataset, which showed a significant
improvement in prediction accuracy. The results indicate that the best algorithm for predicting heart
disease was Random Forest with an accuracy of 99.24%.
Machine learning approach for predicting heart and diabetes diseases using da... (IAESIJAI)
Environmental changes and food habits expose people to numerous diseases in today's life. Machine learning is a technique that plays a vital role in predicting diseases from collected data. The health sector has plenty of electronic medical data, which helps this technique diagnose various diseases quickly and accurately, and accuracy in medical data analysis has improved as data continues to grow in the medical field. Doctors may have a hard time predicting symptoms accurately. This proposed work utilized Kaggle data to predict and diagnose heart disease and diabetes, which are among the foremost causes of high death rates. The dataset contains target features for the diagnosis of heart disease; this work derives the target variable for diabetes by comparing the patient's blood sugar to normal levels. Blood pressure, body mass index (BMI), and other factors are used to diagnose these diseases and disorders. This work justifies the filter method and principal component analysis for feature selection and extraction. The main aim of this work is to highlight the implementation of three ensemble techniques: Adaptive Boosting, Extreme Gradient Boosting, and Gradient Boosting, with emphasis placed on the accuracy of the results.
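One of the boosting ensembles named above can be illustrated with a minimal AdaBoost over decision stumps; the data is synthetic and the whole sketch is a generic illustration of boosting, not the paper's Kaggle pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic binary-labelled data standing in for a health dataset;
# labels are -1/+1 as is conventional for AdaBoost.
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)

def train_adaboost(X, y, rounds=8):
    """AdaBoost with axis-aligned decision stumps as weak learners."""
    n = len(y)
    w = np.full(n, 1.0 / n)                     # sample weights
    ensemble = []
    for _ in range(rounds):
        best = None
        for f in range(X.shape[1]):             # exhaustive stump search
            for thr in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = np.where(X[:, f] > thr, sign, -sign)
                    err = w[pred != y].sum()    # weighted error of this stump
                    if best is None or err < best[0]:
                        best = (err, f, thr, sign)
        err, f, thr, sign = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = np.where(X[:, f] > thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)          # re-weight toward mistakes
        w /= w.sum()
        ensemble.append((alpha, f, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = np.zeros(len(X))
    for alpha, f, thr, sign in ensemble:
        score += alpha * np.where(X[:, f] > thr, sign, -sign)
    return np.where(score > 0, 1, -1)

model = train_adaboost(X, y)
train_acc = float(np.mean(predict(model, X) == y))
```

Each round re-weights the samples the previous stumps got wrong, which is the common mechanism behind all three boosting variants the abstract lists.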
Heart failure is a disorder or illness in which the heart becomes weak or damaged. To avoid heart failure early on, it is crucial to understand its causes. Two experimental processing steps, based on validation, are applied to a dataset of clinical records related to heart failure. In the first step, testing is done using six different classification algorithms: K-nearest neighbor, neural network, random forest, decision tree, Naïve Bayes, and support vector machine (SVM), with cross-validation employed to conduct the test. According to the results, the random forest algorithm performed better than the other five algorithms. Subsequent testing takes the algorithm with the best accuracy value and tests it again using split validation with varying split ratios and a genetic algorithm as a feature selector. The value generated from testing with genetic algorithm feature selection is better than that of the random forest algorithm alone, with a recorded accuracy of 93.36% in predicting the survival of heart failure patients.
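The two validation schemes this abstract contrasts, k-fold cross-validation and single-split validation, can be sketched as follows; a simple nearest-centroid classifier stands in for the six algorithms, and the clinical records are replaced by synthetic data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the heart-failure clinical records:
# 120 patients, 5 features, one of which carries the class signal.
X = rng.normal(size=(120, 5))
y = rng.integers(0, 2, 120)
X[:, 0] += y * 2.0

def nearest_centroid_acc(Xtr, ytr, Xte, yte):
    """Fit class centroids on the training part, score on the test part."""
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (((Xte - c1) ** 2).sum(1) < ((Xte - c0) ** 2).sum(1)).astype(int)
    return float((pred == yte).mean())

def cross_validate(X, y, k=10):
    """k-fold cross-validation (the first testing step)."""
    folds = np.array_split(np.arange(len(y)), k)
    scores = []
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(nearest_centroid_acc(X[train], y[train],
                                           X[folds[i]], y[folds[i]]))
    return float(np.mean(scores))

def split_validate(X, y, ratio=0.8):
    """Single train/test split with a chosen ratio (the second step)."""
    cut = int(len(y) * ratio)
    return nearest_centroid_acc(X[:cut], y[:cut], X[cut:], y[cut:])

cv_acc = cross_validate(X, y)
split_acc = split_validate(X, y)
```

Cross-validation averages over k held-out folds, which is why it is usually the more stable of the two estimates.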
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI... (ijcsa)
Heart disease is one of the biggest health problems in the world because of the high mortality and morbidity it causes. The use of data mining on medical data has brought valuable and effective achievements and can enhance medical knowledge for making necessary decisions. Data mining plays an important role in medical science in solving health problems and diagnosing ailments in both critical and normal conditions. For this reason, in this paper, data mining techniques are used to diagnose heart disease from a dataset that includes 200 samples from different patients. The techniques used, all run with the Weka tool, include Bagging, AdaBoostM1, Random Forest, Naive Bayes, RBF Network, IBK, and NNge. These techniques are then compared to determine which is more accurate in the diagnosis of heart disease; according to the results, the RBF Network, with an accuracy of 88.2%, is the most accurate classifier for diagnosing heart disease.
In our homes and offices, security has always been a vital issue. Controlling a home security system remotely offers huge advantages, such as arming or disarming alarms, video monitoring, and energy management control, apart from safeguarding the home from intruders. Consider the oldest simple method of security, the mechanical lock system with a key as the authentication element, then its upgrade to a universal type, and now unique codes for the lock. Recent advancement in communication systems has brought tremendous application of communication gadgets into various areas of life. This work is a real-time smart doorbell notification system for home security, as opposed to traditional security methods. It is composed of a doorbell interfaced with a GSM module: pressing the doorbell triggers the GSM module to send an SMS to the house owner, who responds to the guest by pressing a button to open the door; otherwise, a message is displayed to the guest for appropriate action. A keypad is provided for authorized persons to enter a password for door unlocking; if multiple wrong password attempts are made, a message reporting a burglary attempt is sent to the house owner for prompt action. The main benefit of this system is the unique combination of the password and messaging systems, which denies access to any unauthorized person and keeps the owner aware.
Augmented reality, the new-age technology, has widespread applications in every field imaginable. This technology has proven to be an inflection point in numerous verticals, improving lives and performance. In this paper, we explore the various possible applications of Augmented Reality (AR) in the field of medicine. The objective of using AR in medicine, or generally in any field, is that AR helps in motivating the user, making sessions interactive and assisting in faster learning. In this paper, we discuss the applicability of AR in the field of medical diagnosis. Augmented reality technology reinforces remote collaboration, allowing doctors to diagnose patients from a different locality. Additionally, we believe that a much more pronounced effect can be achieved by bringing together the cutting-edge technology of AR and the lifesaving field of medical sciences. AR is a mechanism that could be applied in the learning process too. Similarly, virtual reality could be used in fields where more practical experience is needed, such as driving, sports, and neonatal care training.
Image fusion is a subfield of image processing in which more than one image is fused to create an image where all the objects are in focus. Image fusion is performed for multi-sensor and multi-focus images of the same scene: multi-sensor images are captured by different sensors, whereas multi-focus images are captured by the same sensor. In multi-focus images, the objects in the scene that are closer to the camera are in focus and the farther objects are blurred; conversely, when the farther objects are focused, the closer objects are blurred. To achieve an image in which all the objects are in focus, image fusion is performed either in the spatial domain or in a transformed domain. In recent times, the applications of image processing have grown immensely. Usually, due to the limited depth of field of optical lenses, especially those with greater focal length, it is impossible to obtain a single image in which all the objects are in focus, yet such an image is important for other image processing tasks such as image segmentation, edge detection, stereo matching and image enhancement. Hence, a novel feature-level multi-focus image fusion technique is proposed. The results of extensive experimentation performed to highlight the efficiency and utility of the proposed technique are presented, and the work further compares fuzzy-based image fusion with a neuro-fuzzy fusion technique, along with quality evaluation indices.
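The core idea of multi-focus fusion, keeping, for each region, the source image that is locally sharper, can be sketched with a block-wise variance focus measure; the "scene", the blur model, and the 8x8 block size are all illustrative assumptions, far simpler than the feature-level technique the abstract proposes:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two synthetic "multi-focus" shots of the same 64x64 scene: each copy
# blurs (averages) a different half, as if focused at different depths.
scene = rng.normal(size=(64, 64))

def blur(img):
    out = img.copy()
    for _ in range(3):                      # repeated 5-point averaging
        out = (out + np.roll(out, 1, 0) + np.roll(out, -1, 0)
                   + np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 5
    return out

a = scene.copy(); a[:, 32:] = blur(scene)[:, 32:]   # right half out of focus
b = scene.copy(); b[:, :32] = blur(scene)[:, :32]   # left half out of focus

def fuse(a, b, block=8):
    """Block-wise multi-focus fusion: per block, keep the source whose
    local variance (a simple focus measure) is higher."""
    out = np.empty_like(a)
    for i in range(0, a.shape[0], block):
        for j in range(0, a.shape[1], block):
            pa = a[i:i + block, j:j + block]
            pb = b[i:i + block, j:j + block]
            out[i:i + block, j:j + block] = pa if pa.var() >= pb.var() else pb
    return out

fused = fuse(a, b)
err = float(np.abs(fused - scene).mean())   # how close fusion gets to the scene
```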
Graphs have become the dominant representation for many tasks, as they provide a structure to express the tasks and their corresponding relations. A powerful role of networks/graphs is to bridge local features that exist in vertices as they blossom into patterns that help explain how nodal relations and their edges impact a complex effect rippling through a graph. User clusters are formed as a result of interactions between entities, yet many users can hardly categorize their contacts into groups such as "family", "friends", or "colleagues" today. Thus, analyzing a user's social graph via implicit clusters enables dynamism in contact management. This study seeks to implement this dynamism via a comparative study of a deep neural network and a friend-suggest algorithm. We analyze a user's implicit social graph and seek to automatically create custom contact groups using metrics that classify contacts based on the user's affinity to them. Experimental results demonstrate the importance of both the implicit group relationships and the interaction-based affinity in suggesting friends.
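An interaction-based affinity score of the kind relied on here might look like the following; the contact names, interaction counts, and the 1.5x weighting on received messages are hypothetical choices for illustration, not the study's metric:

```python
from collections import defaultdict

# Illustrative implicit social graph: (sender, receiver, n_interactions).
interactions = [
    ("me", "ana", 12), ("ana", "me", 9),
    ("me", "bob", 3),  ("bob", "me", 1),
    ("me", "cara", 7), ("cara", "me", 8),
    ("me", "dan", 1),
]

def affinity_scores(interactions, user="me"):
    """Score each contact by interaction volume, weighting received
    messages slightly higher than sent ones (a common heuristic)."""
    scores = defaultdict(float)
    for src, dst, n in interactions:
        if src == user:
            scores[dst] += 1.0 * n
        elif dst == user:
            scores[src] += 1.5 * n
    return dict(scores)

def suggest(interactions, user="me", top=2):
    scores = affinity_scores(interactions, user)
    return sorted(scores, key=scores.get, reverse=True)[:top]

suggested = suggest(interactions)
```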
This paper applies the Gryllidae Optimization Algorithm (GOA) to solve the optimal reactive power problem. The proposed GOA approach is based on the chirping characteristics of Gryllidae (crickets). In general, male Gryllidae chirp, though some females do as well; males draw females with the sound they produce and also warn other Gryllidae of danger with it. The hearing organs of Gryllidae are housed in an expansion of their forelegs, through which they orient to the produced fluttering sounds. The proposed GOA has been tested on the standard IEEE 14- and 30-bus test systems, and simulation results show that the projected algorithm reduces the real power loss considerably.
In the wake of the sudden replacement of wood and kerosene by gas cookers for several purposes in Nigeria, gas leakage has caused several damages in our homes and laboratories, among other places. Installation of gas leakage detection devices has been globally encouraged to eliminate accidents related to gas leakage. We present an alternative approach to developing a device that can automatically detect and control gas leakages and also monitor temperature. The system detects leakage of LPG (Liquefied Petroleum Gas) using a gas sensor and then triggers the control system response, which employs a ventilator system, a mobile phone alert and an alarm when the LPG concentration in the air exceeds a certain level. The performance of two gas sensors (MQ5 and MQ6) was tested for a guided decision. Also, when the temperature of the environment poses a danger, an LED indicator, buzzer and LCD (16x2) display are used to indicate temperature and gas leakage status in degrees Celsius and PPM respectively. Attention was given to the response time of the control system, and it was ascertained that this system significantly increases the chances and efficiency of eliminating gas-leakage-related accidents.
The feature selection problem is one of the most important problems in the text and data mining domain. This paper presents a comparative study of feature selection methods for Arabic text classification. Five feature selection methods were selected: ICHI square, CHI square, Information Gain, Mutual Information and Wrapper. They were tested with five classification algorithms: Bayes Net, Naive Bayes, Random Forest, Decision Tree and Artificial Neural Networks. An Arabic data collection consisting of 9055 documents was used, and the methods were compared by four criteria: Precision, Recall, F-measure and time to build the model. The results showed that the improved ICHI feature selection achieved almost all the best results in comparison with the other methods.
In this paper the Gentoo Penguin Algorithm (GPA) is proposed to solve the optimal reactive power problem. In the preliminary population, Gentoo penguins possess heat radiation and attract each other through an absorption coefficient. Penguins move towards other penguins that possess a lower cost (higher heat concentration) of absorption, where cost is defined by heat concentration and distance. A penguin's attraction value is calculated from the amount of heat exchanged between two penguins; heat radiation is modeled as linear, so less heat is received over a longer distance and more heat over a shorter distance. The Gentoo Penguin Algorithm has been tested on the standard IEEE 57-bus test system, and simulation results show the projected algorithm reduces the real power loss considerably.
08 20272 academic insight on application (IAESIJEECS)
This research has thrown up many questions in need of further investigation. An expressive quantitative-qualitative design was adopted, in which a common investigation form was used. A dialogue item was also applied to discover whether the participants felt the media-based approach supplements their learning in academic English writing classes. Data recounting academics' insights toward using Skype as a supporting tool for delivering lessons, based on chosen variables (occupation, year of education, and experience with Skype), revealed no statistically important differences in the use of Skype units due to the medical academics' major. There are, however, statistically important differences in using Skype units: the findings disclosed statistically significant differences due to the practice-with-Skype variable, in favor of academics with no prior Skype practice. Skype as an instructional medium is a positive means of supplying academic medical writing content and assisting education. Academics who do not have enough time to participate in classes feel comfortable using the Skype-based approach in scientific writing, and those who took part in the course claimed that their approval of this medium is due to learning academic, innovative medical writing.
Cloud computing has a sweeping impact on human productivity. Today it is used for computing, storage, prediction and intelligent decision making, among others. Intelligent decision making using machine learning has pushed cloud services to become even faster, more robust and more accurate. Security remains one of the major concerns affecting cloud computing growth; there also exist various research challenges in cloud computing adoption, such as the lack of well-managed service level agreements (SLAs), frequent disconnections, resource scarcity, interoperability, privacy, and reliability. A tremendous amount of work still needs to be done to explore the security challenges arising from widespread cloud deployment using containers. We also discuss the impact of cloud computing and cloud standards. Hence, in this research paper, a detailed survey of cloud computing concepts, architectural principles, key services, and implementation, design and deployment challenges is presented, and important future research directions in the era of machine learning and data science are identified.
A notary is an official authorized to make authentic deeds regarding all deeds, agreements and stipulations required by general rules. Activities carried out at the notary office, such as recording client data and file data, still use traditional systems that tend to be manual. The problem that occurs is inefficiency in data processing and in providing information to clients. Clients have difficulty getting information about the progress of documents being handled at the notary's office and must take the time to come to the office repeatedly to check on a document's progress. The purpose of this study is to make it easier for clients to obtain information about work in progress, and for employees to process incoming documents, by implementing an administrative system. The system was developed with the waterfall development method and uses multi-channel access technology integrated into a website to simplify delivering information to and requesting information from clients via Telegram and an SMS gateway. Clients come to the office only when a system notification via Telegram or SMS indicates that they must appear in person, leading to time savings and avoiding excessive transportation costs. Based on alpha testing, the system functions properly overall, and beta testing, conducted by distributing a feasibility questionnaire to end users, found that 96% of users agree the system is feasible to implement.
In this work the Tundra Wolf Algorithm (TWA) is proposed to solve the optimal reactive power problem. In the projected TWA, in order to prevent the searching agents from being trapped in local optima, convergence toward the global optimum is divided based on two different conditions. The omega tundra wolf is taken as a searching agent as an alternative to following only the first three best candidates; escalating the number of searching agents improves the exploration capability of the tundra wolves over a wide range. The proposed TWA has been tested on the standard IEEE 14- and 30-bus test systems, and simulation results show the proposed algorithm reduces the real power loss effectively.
In this work the Predestination of Particles Wavering Search (PPS) algorithm is applied to solve the optimal reactive power problem. The PPS algorithm is modeled on the motion of particles in the exploration space; normally, particle movement is based on gradient and swarming motion. Particles are permitted to progress at steady velocity in gradient-based movement, but when an outcome is poor compared to the previous one, the particle's velocity is immediately reversed at half the magnitude, which helps reach a local optimal solution; this is expressed as wavering movement. The proposed PPS algorithm is evaluated on the standard IEEE 14-, 30-, 57-, 118- and 300-bus systems, and simulation results show the PPS reduces the power loss efficiently.
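The wavering movement described, steady velocity until the objective worsens, then a half-magnitude reversal, can be sketched on a toy objective; the sphere function, population size, and iteration count are illustrative stand-ins for the actual reactive-power loss computation on an IEEE bus system:

```python
import numpy as np

rng = np.random.default_rng(6)

def sphere(x):
    # Stand-in objective: a simple sphere function replaces the real
    # reactive-power loss evaluation for illustration.
    return float(np.sum(x ** 2))

def pps(f, dim=5, n=20, iters=200):
    """Toy wavering search: particles advance at steady velocity; when a
    move worsens a particle's objective, its velocity is reversed at half
    the magnitude, producing a damped oscillation around good regions."""
    pos = rng.uniform(-5, 5, (n, dim))
    vel = rng.uniform(-1, 1, (n, dim))
    cost = np.array([f(p) for p in pos])
    init_best = float(cost.min())
    best = init_best
    for _ in range(iters):
        pos += vel
        new_cost = np.array([f(p) for p in pos])
        vel[new_cost > cost] *= -0.5   # reverse with half the magnitude
        cost = new_cost
        best = min(best, float(new_cost.min()))
    return best, init_best

best_loss, init_loss = pps(sphere)
```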
In this paper, the Mine Blast Algorithm (MBA) is hybridized with the Harmony Search (HS) algorithm for solving the optimal reactive power dispatch problem. MBA is based on the explosion of landmines and HS on the creative process of musicians; both are hybridized to solve the problem. In MBA, the initial distance of shrapnel pieces is reduced gradually to allow the mine bombs to search the probable global minimum location and amplify the global exploration capability. Harmony Search imitates the music creation process, where musicians adjust their instruments' pitch by searching for the best state of harmony. Hybridizing the Mine Blast Algorithm with the Harmony Search algorithm (MH) improves the search in the solution space effectively: the mine blast algorithm improves exploration and the harmony search algorithm augments exploitation. The proposed algorithm starts with exploration and gradually moves to the exploitation phase. The proposed hybrid MH algorithm has been tested on the standard IEEE 14- and 300-bus test systems, where real power loss is reduced considerably, and then on the IEEE 30-bus system (considering the voltage stability index), where real power loss minimization, voltage deviation minimization, and voltage stability index enhancement are attained.
Artificial Neural Networks have proved their efficiency in a large number of research domains. In this paper, we apply Artificial Neural Networks to Arabic text for language modeling, text generation, and missing-text prediction. On the one hand, we adapt Recurrent Neural Network architectures to model the Arabic language in order to generate correct Arabic sequences. On the other hand, Convolutional Neural Networks are parameterized, based on some specific features of Arabic, to predict missing text in Arabic documents. We demonstrate the power of our adapted models in generating and predicting correct Arabic text compared to the standard model. The models were trained and tested on well-known free Arabic datasets, and the results have been promising, with sufficient accuracy.
In present-day communications, speech signals get contaminated by various sorts of noise that degrade speech quality and adversely impact speech recognition performance. To overcome these issues, a novel approach for speech enhancement using modified Wiener filtering is developed, and power spectrum computation is applied to the degraded signal to obtain the noise characteristics from the noisy spectrum. In the next phase, an MMSE technique is applied, where the Gaussian distribution of each signal, i.e. the original and the noisy signal, is analyzed. The Gaussian distribution provides spectrum estimation and spectral coefficient parameters that can be used for probabilistic model formulation. Moreover, a-priori-SNR computation is incorporated for coefficient updating and noise presence estimation, which operates similarly to conventional VAD. However, the conventional VAD scheme is based on a hard threshold that cannot deliver satisfactory performance, so a soft-decision-based threshold is developed to improve speech enhancement performance. An extensive simulation study is carried out in MATLAB on the NOIZEUS speech database, and a comparative study is presented in which the proposed approach proves better than the existing technique.
Previous research work has highlighted that neuro-signals of Alzheimer’s disease patients are least complex and have low synchronization as compared to that of healthy and normal subjects. The changes in EEG signals of Alzheimer’s subjects start at early stage but are not clinically observed and detected. To detect these abnormalities, three synchrony measures and wavelet-based features have been computed and studied on experimental database. After computing these synchrony measures and wavelet features, it is observed that Phase Synchrony and Coherence based features are able to distinguish between Alzheimer’s disease patients and healthy subjects. Support Vector Machine classifier is used for classification giving 94% accuracy on experimental database used. Combining, these synchrony features and other such relevant features can yield a reliable system for diagnosing the Alzheimer’s disease.
Attenuation correction designed for PET/MR hybrid imaging frameworks along with portion making arrangements used for MR-based radiation treatment remain testing because of lacking high-energy photon weakening data. We present a new method so as to uses the learned nonlinear neighborhood descriptors also highlight coordinating toward foresee pseudo-CT pictures starting T1w along with T2w MRI information. The nonlinear neighborhood descriptors are acquired through anticipating the direct descriptors interested in the nonlinear high-dimensional space utilizing an unequivocal constituent guide also low-position guess through regulated complex regularization. The nearby neighbors of every near descriptor inside the data MR pictures are looked during an obliged spatial extent of the MR pictures among the training dataset. By that point, the pseudo-CT patches are evaluated through k-closest neighbor relapse. The planned procedure designed for pseudo-CT forecast is quantitatively broke downward on top of a dataset comprising of coordinated mind MRI along with CT pictures on or after 13 subjects.
The cognitive radio prototype performance is to alleviate the scarcity of spectral resources for wireless communication through intelligent sensing and quick resource allocation techniques. Secondary users (SU’s) actively obtain the spectrum access opportunity by supporting primary users (PU’s) in cognitive radio networks (CRNs). In present generation, spectrum access is endowed through cooperative communication-based link-level frame-based cooperative (LLC) principle. In this SUs independently act as conveyors for PUs to achieve spectrum access opportunities. Unfortunately, this LLC approach cannot fully exploit spectrum access opportunities to enhance the throughput of CRNs and fails to motivate PUs to join the spectrum sharing processes. Therefore, to overcome this con, network level cooperative (NLC) principle was used, where SUs are integrated mutually to collaborate with PUs session by session, instead of frame based cooperation for spectrum access opportunities. NLC approach has justified the challenges facing in LLC approach. In this paper we make a survey of some models that have been proposed to tackle the problem of LLC. We show the relevant aspects of each model, in order to characterize the parameters that we should take in account to achieve a spectrum access opportunity.
In this paper, the author provides insights and lessons that can be learned from colleagues at American universities about their online education experiences. The literature review and previous studies of online educations gains are explored and summarized in this research. Emerging trends in online education are discussed in detail, and strategies to implement these trends are explained. The author provides several tools and strategies that enable universities to ensure the quality of online education. At the end of this research paper, the researcher provides examples from Arab universities who have successfully implemented online education and expanded their impact on the society. This research provides a strategy and a model that can be used by universities in the Middle East as a roadmap to implement online education in their regions.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
IJ-ICT ISSN: 2252-8776
Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary)
unable to deliver oxygenated blood to the heart muscle, causing the characteristic manifestations of CAD such as chest pain (angina) and shortness of breath [4-5]. Cardiovascular disease takes many forms, including coronary heart disease (CHD), stroke, congenital heart disease, peripheral vascular disease, rheumatic heart disease, inflammatory heart disease and hypertensive heart disease [6-8]. The principal causes of heart illness are tobacco use, physical inactivity, an unhealthy diet and habitual alcohol consumption [9].
Medical diagnosis is an important and complicated task: it must produce accurate results and be executed precisely. Not every physician is fully versed in every specialty, and proper diagnosis can be lacking. This paves the way for data mining, which can ease these tasks by performing diagnosis and generating accurate results with less human intervention and a smaller investment of both money and time [10].
In practice, several clinical tests are performed to predict and diagnose heart disease; these yield accurate results but prove time consuming and costly. Automated approaches, meanwhile, aim to reduce the number of attributes, which shortens the analysis but sacrifices accuracy, since confirming that a patient suffers from a heart ailment demands a detailed analysis from every perspective. Keeping in mind that most of the features affecting heart disease must be considered, a predictive analysis is performed to see how prediction accuracies vary between a minimal and a maximal set of features. This paper therefore aims to provide a more efficient approach to predicting heart ailments using the attributes that most strongly lead to heart disease, and to analyze how the prediction accuracies fluctuate as the number of attributes considered grows from minimal to maximal under a data mining approach.
Numerous studies have been performed with the aim of efficient diagnosis and prediction of heart ailment. Most authors have applied different data mining methods for prediction and have reported different accuracies for different classifiers.
Randa El-Bialy et al. [11] note that heterogeneous datasets give rise to heterogeneous data types. On this basis they combine four datasets sharing the same class label and 13 features from the UCI repository, namely the Cleveland, Hungarian, V.A. and Statlog project heart disease datasets, and apply C4.5 and Fast Decision Tree in WEKA to construct a decision tree for each dataset and compare accuracies. Five common features are then selected and integrated into a new dataset, on which the two models are applied again; this yields the four most frequently occurring features (ca, age, cp, thal) with the highest information gain, indicating that these four are the most important features for the prediction of heart disease.
Vikas Chaurasia [12] uses three data mining algorithms, CART, ID3 and Decision Table, on the Cleveland heart disease dataset with 11 of its 13 attributes. These three classifiers are run in WEKA with 10-fold cross validation. Detailed results covering the time taken to build each model, correctly and incorrectly classified instances, error rate, confusion matrices and accuracies are analyzed for all three classifiers; CART outperforms the other two with an accuracy of 83.49%. The importance of each attribute is further analyzed with the Chi-squared, Information Gain and Gain Ratio tests, which show that the cp attribute impacts the output the most.
Chaitrali S. Dangare and Sulabha S. Apte [13] analyze the Cleveland heart disease dataset with three classifiers: Neural Networks, Decision Trees and Naïve Bayes. Two further attributes, obesity and tobacco, are added to the existing dataset to form a second dataset. The classifiers are first applied to the original dataset with 13 attributes and then to the extended dataset with 15 attributes. A detailed comparison of the confusion matrices and accuracies shows that the Neural Network performs best in both cases, with 99.25% accuracy on 13 attributes and 100% accuracy on 15, thus achieving relatively higher accuracy with 15 attributes.
M. Anbarasi et al. [14] aim to decrease the number of attributes, performing their analysis on the Cleveland heart disease dataset with 13 attributes. Feature subset selection with a Genetic Algorithm reduces the dataset to 6 attributes, to which the classifiers Decision Tree, Naïve Bayes and Classification via Clustering are applied. Comparing the classifiers by their confusion matrices and accuracies, the Decision Tree performs best with an accuracy of 99.2%.
Jyoti Rohilla and Preeti Gulia [15] perform their analysis on the Cleveland heart disease dataset with 11 attributes, applying the classifiers under 10-fold cross validation. The classifiers used are Naïve Bayes, Bagging, ID3, J48, Simple CART, Logistic Regression and REPTree. The decision tree classifier ID3 comparatively outperforms the rest with an accuracy of 88% and the minimum error, leading to the conclusion that ID3 performs best among all the classifiers compared.
IJ-ICT Vol. 8, No. 1, April 2019 : 56–62
R. Tamilarasi and Dr. R. Porkodi [16] aim to provide a cost-effective automated system for predicting heart ailment in patients, considering a dataset with 13 attributes. They also identify risk factors for heart disease: smoking, cholesterol, obesity and lack of physical exercise. Six classifiers, namely Naïve Bayes, KNN, J48, CART, ANN and SMO, are applied to the dataset to see which performs better in terms of correctly and incorrectly classified instances and error rate; the K-nearest neighbor algorithm outperforms the rest with an accuracy of 100%.
Shabad Adam Pattekari and Asma Parveen [17] built a prototype system using models trained with Decision Trees, Naïve Bayes and Artificial Neural Networks. A live website called the Heart Disease Prediction System (HDPS) was developed, which retrieves data stored in a database and predicts whether a patient has heart disease using the trained model.
M. Akhiljabbar et al. [18] developed an efficient algorithm based on a genetic algorithm approach, known as Associative Classification, to predict patients with heart ailment. All attributes are analyzed and weighted in terms of the Gini index, which helps determine the most important factors affecting heart disease. The Z statistic is then used to assess the quality of each rule obtained. Finally, classifiers are built from the generated rules and the accuracy on the dataset is measured.
2. RESEARCH METHOD
Almost all hospitals today maintain vast amounts of healthcare information obtained from patients, electronic devices, medical diagnoses and so on. This huge amount of data contains much useful hidden information that is essential for diagnosing various ailments with a minimum number of medical tests and for making diagnosis more intelligent. Many a time we limit ourselves to the minimal attributes required to predict heart disease in a patient; by doing so we miss many important features that are main causes of heart disease.
Taking this into consideration, an approach is proposed wherein all the attributes responsible for heart ailment in patients are considered and analyzed in a step-by-step procedure to see how the attributes play a vital role in predicting heart ailment. The predominant objective is hence to provide a more efficient approach to predicting heart ailments using the most important attributes causing them, and to analyze how the prediction accuracies fluctuate between minimal and maximal sets of attributes under a data mining approach.
To this end, a benchmark heart ailment dataset, the Cleveland heart disease dataset, is used together with two smaller datasets sharing the same class label. Medical attributes such as sex, chest pain type and age make up 7 attributes in the first dataset, 10 in the second and 13 in the benchmark dataset. The number of attributes is increased step by step to analyze whether the prediction accuracies fall or gradually rise. The data mining classification methods used to classify the datasets are Decision Trees (Random Forest and Random Tree) and Naïve Bayes.
This proposed approach finds application in predicting heart ailments in patients because it considers all the important attributes or features responsible for causing one. It reduces the number of tests and the time taken to predict and diagnose heart disease, and is also cost effective. The data mining classifiers used are Naïve Bayes and the decision trees Random Forest and Random Tree.
2.1. Naïve Bayes
This algorithm is based on Bayes' theorem and assumes conditional independence, i.e., it treats each attribute value of a given class as independent of the other attribute values. The objective is to maximize the posterior probability when predicting the class [19].
Bayes' theorem is as follows. Let X = {x1, x2, ..., xm} be a set of m attributes. In Bayesian terms, X is the evidence and H is the hypothesis that the data sample X belongs to a specific class C. We must determine P(H|X), the probability that the hypothesis H holds given the evidence, i.e., the data sample X. By Bayes' theorem,

P(H|X) = P(X|H) P(H) / P(X)

The Naïve Bayes classifier performs relatively better with nominal data than with numeric data [20].
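As a rough illustration of this posterior-maximization step (a minimal hand-rolled sketch, not the WEKA implementation used in the experiments; the toy attributes and records are invented for the example):

```python
from collections import Counter

def naive_bayes_predict(train, labels, x):
    """Predict the class of x by maximizing the posterior
    P(C|X) proportional to P(C) * product of P(x_i|C),
    assuming conditional independence of the attributes.
    Laplace smoothing avoids zero probabilities for unseen values."""
    classes = Counter(labels)
    n = len(labels)
    best_class, best_score = None, -1.0
    for c, count in classes.items():
        score = count / n                      # prior P(C)
        rows = [r for r, l in zip(train, labels) if l == c]
        for i, value in enumerate(x):
            matches = sum(1 for r in rows if r[i] == value)
            score *= (matches + 1) / (len(rows) + 2)   # smoothed P(x_i|C)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical nominal data: (chest_pain, high_bp) -> class label
train = [("typical", "yes"), ("typical", "yes"),
         ("atypical", "no"), ("none", "no")]
labels = ["sick", "sick", "healthy", "healthy"]
print(naive_bayes_predict(train, labels, ("typical", "yes")))  # -> sick
```

Note that P(X) is omitted: it is the same for every class, so it does not change which class maximizes the posterior.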
4. IJ-ICT ISSN: 2252-8776
Predicting heart ailment in patients with varying number of features using data … (T R Stella Mary)
59
2.2. Random Tree
It uses the divide-and-conquer principle to build decision trees [21]. At each node, the algorithm selects a feature that further divides the data into subsets with the help of a splitting algorithm [22], and every predicting node leads to a class or decision rule. To classify a new item, a decision tree is built from the existing items in the training set. Information gain identifies the variable that best discriminates between instances. If, among the possible values of this variable, any value corresponds to the target value, that branch becomes a leaf node and can be terminated. This method is followed until the decision rules, formed from different combinations of features, lead to the target value.
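The information-gain criterion above can be sketched as follows. This pure-Python fragment with made-up toy rows is only an illustration of the splitting measure, not WEKA's Random Tree splitter:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction achieved by splitting on attribute index attr."""
    n = len(labels)
    remainder = 0.0
    for value, count in Counter(r[attr] for r in rows).items():
        subset = [l for r, l in zip(rows, labels) if r[attr] == value]
        remainder += (count / n) * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical rows (chest_pain, sex); attribute 0 separates classes perfectly.
rows = [("typical", "M"), ("typical", "F"), ("none", "M"), ("none", "F")]
labels = ["sick", "sick", "healthy", "healthy"]
gains = [information_gain(rows, labels, i) for i in range(2)]
best = max(range(2), key=lambda i: gains[i])   # attribute chosen at this node
```

Here `gains[0]` is 1.0 bit (a pure split) while `gains[1]` is 0, so the node would split on chest pain, matching the intuition that the most discriminating variable is chosen first.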
2.3. Random Forest
It is a classifier consisting of n decision trees, where each individual result is produced by a separate tree. It derives from the random decision forests proposed by Tin Kam Ho of Bell Labs in 1995 [23]. The approach selects attributes randomly when building the decision trees, which keeps the variation among trees under control. Multi-classifiers are the result of combining several independent classifiers, and the set of classifiers responsible for improving performance is depicted in Figure 1 [24].
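The ensemble idea can be sketched in miniature. This is a deliberately simplified illustration (one-level trees on hypothetical records, not the actual Random Forest algorithm, which also bootstraps the training data): each tree is built on a randomly chosen attribute, and the forest classifies by majority vote.

```python
import random
from collections import Counter

def train_stump(records, labels, attr):
    """One-level decision tree: majority class for each value of `attr`."""
    buckets = {}
    for r, l in zip(records, labels):
        buckets.setdefault(r[attr], []).append(l)
    return {v: Counter(ls).most_common(1)[0][0] for v, ls in buckets.items()}

def random_forest(records, labels, n_trees=5, seed=0):
    """Forest of stumps, each built on a randomly selected attribute."""
    rng = random.Random(seed)
    attrs = list(records[0])
    return [(a, train_stump(records, labels, a))
            for a in (rng.choice(attrs) for _ in range(n_trees))]

def predict(forest, x):
    """Classify x by majority vote over the individual trees."""
    votes = [tree.get(x[attr], 0) for attr, tree in forest]
    return Counter(votes).most_common(1)[0][0]

records = [{"angina": "yes", "sex": "M"}, {"angina": "no", "sex": "M"},
           {"angina": "yes", "sex": "F"}, {"angina": "no", "sex": "F"}]
labels = [1, 0, 1, 0]
forest = random_forest(records, labels)
```

The random attribute selection makes the individual trees diverse, and the vote averages out their individual errors.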
Figure 1. Architectural diagram of proposed work
3. RESULTS AND ANALYSIS
The Waikato Environment for Knowledge Analysis (WEKA) is used for analyzing the datasets and applying the different classifiers because of its strength in discovering, analyzing and predicting efficient patterns [25]. A detailed analysis is done in Weka: Table 1 presents the results obtained by applying the classifiers on a dataset of 7 attributes, Table 2 presents the results on a dataset of 10 attributes, and Table 3 presents the results on a dataset of 13 attributes. Figure 2 depicts the gradual increase in prediction accuracy as attributes are added.
Table 1. Accuracies Achieved with 7 Attributes

Evaluation criteria               | Naïve Bayes | Random Forest | Random Tree
Time taken to build model (sec)   | 0           | 0.25          | 0
Correctly classified instances    | 71          | 72            | 68
Incorrectly classified instances  | 20          | 19            | 23
Accuracy (%)                      | 78.022      | 79.1209       | 74.7253
Table 2. Accuracies Achieved with 10 Attributes

Evaluation criteria               | Naïve Bayes | Random Forest | Random Tree
Time taken to build model (sec)   | 0           | 0.14          | 0
Correctly classified instances    | 72          | 71            | 67
Incorrectly classified instances  | 15          | 16            | 20
Accuracy (%)                      | 82.7586     | 81.6092       | 77.0115
Table 3. Accuracies Achieved with 13 Attributes

Evaluation criteria               | Naïve Bayes | Random Forest | Random Tree
Time taken to build model (sec)   | 0           | 0.14          | 0
Correctly classified instances    | 76          | 79            | 72
Incorrectly classified instances  | 15          | 12            | 19
Accuracy (%)                      | 83.5165     | 86.8132       | 79.1209
Figure 2. Fluctuations in prediction accuracies.
The initial dataset (Dataset-1) consists of 300 records with 7 attributes: age (continuous), sex (categorical), exercise-induced angina (nominal), depression induced by exercise (categorical), slope of the peak exercise segment (categorical), number of major vessels (nominal) and defect type (categorical). The second dataset (Dataset-2) consists of 290 records with all the Dataset-1 attributes plus 3 more: chest pain type (categorical), cholesterol (continuous) and resting ECG measure (categorical). Dataset-1 and Dataset-2 were collected through a survey of patients, taking into consideration the various heart ailment attributes. Finally, the third dataset, the Cleveland Heart Disease dataset, consists of 303 records with 13 attributes.
The predictable attribute (num) is the common class label for all three datasets, where:
Value = 0: the patient has no heart disease
Value = 1: the patient has heart disease.
All datasets are analyzed and the classifiers are applied with a 70% split, i.e., 70% of each dataset is used as training data and 30% as testing data.
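The 70/30 split can be sketched as follows. This is a generic illustration of a percentage split, not Weka's exact implementation:

```python
import random

def split_70_30(records, seed=42):
    """Shuffle the records and split them 70% training / 30% testing."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# 300 records (the size of Dataset-1) split into 210 training / 90 testing.
train, test = split_70_30(list(range(300)))
```

Shuffling before splitting matters: it prevents any ordering in the collected records from biasing the training or testing portion.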
Step – 1: Pre-processing of datasets.
Dataset-1 with 7 attributes did not have any missing values, and neither did the second dataset. The third dataset, Dataset-3, had missing values, which were rectified using the Weka filter ReplaceMissingValues. Since the class variable is the same for all datasets, it is converted from numeric to nominal using the NumericToNominal filter under the unsupervised attribute filters, so that classifiers such as Naïve Bayes and decision trees can be applied. Further, the multi-class classification in the third benchmark dataset is converted to a two-class classification.
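These pre-processing steps can be sketched in Python. The rows and attribute names below are hypothetical, not the actual Cleveland records; Weka's ReplaceMissingValues substitutes the training-set mean for missing numeric values (and the mode for nominal ones).

```python
# Hypothetical rows with a missing value (None) and a multi-class
# 'num' label (0 = no disease, higher values = increasing severity).
rows = [
    {"age": 63, "ca": None, "num": 0},
    {"age": 67, "ca": 3,    "num": 2},
    {"age": 41, "ca": 0,    "num": 1},
]

# Replace missing numeric values with the attribute mean, mirroring
# Weka's ReplaceMissingValues filter.
present = [r["ca"] for r in rows if r["ca"] is not None]
mean_ca = sum(present) / len(present)
for r in rows:
    if r["ca"] is None:
        r["ca"] = mean_ca

# Collapse the multi-class label to two classes: 0 = absent, 1 = present.
for r in rows:
    r["num"] = 0 if r["num"] == 0 else 1
```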
[Figure 2 (bar chart): prediction accuracies per dataset. Naïve Bayes: 78.02%, 82.76%, 83.52%; Random Forest: 79.12%, 81.61%, 86.81%; Random Tree: 74.73%, 77.01%, 79.12% for Dataset-1, Dataset-2 and Dataset-3 respectively.]
Results are analyzed by applying the three classifiers, Naïve Bayes, Random Tree and Random Forest, on all three datasets to see whether the prediction accuracies fluctuate across the models built.
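The accuracy figures reported in the tables are simply the percentage of correctly classified test instances. As a check, Table 1's Naïve Bayes row (71 correct, 20 incorrect, 91 test instances) reproduces the reported 78.022%:

```python
def accuracy(predictions, truth):
    """Percentage of correctly classified instances."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return 100.0 * correct / len(truth)

# Table 1's Naïve Bayes row: 71 correct out of 91 test instances.
acc = accuracy([1] * 71 + [0] * 20, [1] * 91)
```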
Step – 2: Apply the 3 classifiers on the dataset with 7 attributes.
Initially, Dataset-1 with 7 attributes is pre-processed and the class label is converted to nominal using the NumericToNominal filter, after which the 3 classifiers are applied. It is evident from Table 1 that the Random Forest algorithm outperforms the others with 7 attributes.
Step – 3: Apply the 3 classifiers on the dataset with 10 attributes.
Secondly, Dataset-2 with 10 attributes is pre-processed and the class label is converted to nominal using the NumericToNominal filter, after which the 3 classifiers are applied.
It is evident from Table 2 that the Naïve Bayes algorithm outperforms the others, and the accuracies of all the classifiers gradually increase with the addition of 3 more attributes (chest pain type, cholesterol and ECG measure) to Dataset-1.
Step – 4: Apply the 3 classifiers on the dataset with 13 attributes.
Finally, Dataset-3 with 13 attributes is pre-processed and the class label is converted to nominal using the NumericToNominal filter, after which the 3 classifiers are applied. It is evident from Table 3 that the Random Forest algorithm outperforms the others, and the accuracies of all the classifiers gradually increase with the addition of 3 more attributes (blood pressure, fasting sugar and thalach) to Dataset-2.
From Table 4 it is evident that as the number of attributes increases, the prediction accuracies gradually increase irrespective of the classifier used and the dataset values. Moreover, a different classifier performs best on each dataset: for Dataset-1 with 7 attributes, the Random Forest classifier performs best with an accuracy of 79.1209%; for Dataset-2 with 10 attributes, the Naïve Bayes classifier performs best with an accuracy of 82.7586%; and for Dataset-3 with 13 attributes, the Random Forest classifier again performs best. Hence, it clearly shows that considering all the important attributes or features contributes to a better and more accurate prediction of heart disease in patients.
From Figure 2 it is evident that there is a gradual increase in the prediction accuracies across the three datasets with different numbers of attributes and values using the three classifiers. It also shows that for Dataset-1 the Random Forest algorithm performs best, for Dataset-2 the Naïve Bayes algorithm performs best with an accuracy of 82.75%, and for Dataset-3 Random Forest again performs best with an accuracy of 86.81%.
Table 4. Accuracy Fluctuations among the 3 Datasets

Datasets  | Naïve Bayes | Random Forest | Random Tree
Dataset-1 | 78.022%     | 79.1209%      | 74.7253%
Dataset-2 | 82.7586%    | 81.6092%      | 77.0115%
Dataset-3 | 83.5165%    | 86.8132%      | 79.1209%
4. CONCLUSION
Finally, to conclude from the results and analysis performed in this paper, it is evident that each attribute contributes uniquely to the prediction accuracy; hence, with the step-by-step increase in heart ailment attributes, the prediction accuracy of heart ailment gradually increases irrespective of the classifier used to build the model. It is also evident that, for datasets with different values and numbers of attributes, a different classifier performs best each time. Future work can expand on this by finding the most appropriate features required for the prediction of heart ailment and by using other data mining methods, such as neural networks and regression, for the prediction of heart disease.
REFERENCES
[1] Yan, H., J. Zheng, et al. (2003). "Development of a decision support system for heart disease diagnosis using
multilayer perceptron." Proceedings of the 2003 International Symposium on vol.5: pp. V-709- V-712.
[2] Sandhya, J., P. Deepa Shenoy, et al. (2010). "Classification of Neurodegenerative Disorders Based on Major Risk
Factors Employing Machine Learning Techniques." International Journal of Engineering and Technology Vol.2,
No.4.
[3] Jaymin Patel, Tejal Upadhyay and Samir Patel, "Heart Disease Prediction using Machine Learning and Data Mining Techniques", Nirma University, Gujarat, India, IJCSC, vol. 7, no. 1, September 2015 - March 2016, pp. 129-137.
[4] Webmd.com. ”Risk factors for heart disease,” retrieved 20/8/2014 from http://www.webmd.com/heart-disease/risk-
factors-heart-disease.
[5] National Heart, Lung, and Blood Institute, "What is coronary heart disease?" retrieved 20/8/2014 from http://www.nhlbi.nih.gov/health/healthtopics/topics/cad/.
[6] Das, R., I. Turkoglu, et al. (2009). "Effective diagnosis of heart disease through neural networks ensembles." Expert
Systems with Applications, Elsevier 36 (2009): 7675–7680.
[7] Panzarasa, S., S. Quaglini, et al. (2010). "Data mining techniques for analyzing stroke care processes." Proceedings
of the 13th World Congress on Medical Informatics.
[8] V. Chaurasia, “Early Prediction of Heart Diseases Using Data Mining,” Caribbean Journal of Science and
Technology, vol. 1, pp. 208–217, 2013.
[9] Srinivas, K.,” Analysis of coronary heart disease and prediction of heart attack in coal mining regions using data
mining techniques”, IEEE Transaction on Computer Science and Education (ICCSE), p(1344 - 1349), 2010.
[10] Ruben, D. C. J. (2009). "Data Mining in Healthcare: Current Applications and Issues."
[11] Randa El-Bialy, Mostafa A. Salama, Omar H. Karam and M. Essam Khalifa, "Feature Analysis of Coronary Artery Heart Disease Data Sets", International Conference on Communication, Management and Information Technology (ICCMIT 2015), Procedia Computer Science, vol. 65, 2015, pp. 459-468.
[12] V. Chauraisa and S. Pal, “Data Mining Approach to Detect Heart Diseases”, International Journal of Advanced
Computer Science and Information Technology (IJACSIT),Vol. 2, No. 4,2013, pp 56-66.
[13] Chaitrali S. Dangare and Sulabha S. Apte, “Improved Study of Heart Disease Prediction System using Data Mining
Classification Techniques”, International Journal of Computer Applications (0975 – 888) Volume 47– No.10, June
2012.
[14] M. Anbarasi, E. Anupriya and N.CH.S.N.Iyengar, “Enhanced Prediction of Heart Disease with Feature Subset
Selection using Genetic Algorithm”, International Journal of Engineering Science and Technology Vol. 2(10),
2010, 5370-537.
[15] Jyoti Rohilla and Preeti Gulia, “Analysis of Data Mining Techniques for Diagnosing Heart Disease”, International
Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 7, July 2015,
ISSN: 2277 128X .
[16] R. Tamilarasi and Dr. R. Porkodi, “A Study and Analysis of Disease Prediction Techniques in Data Mining for
Healthcare”, International Journal of Emerging Research in Management &Technology, March 2015, ISSN: 2278-
9359 (Volume-4, Issue-3)
[17] Shadab Adam Pattekari and Asma Parveen, "Prediction System for Heart Disease using Naïve Bayes", International Journal of Advanced Computer and Mathematical Sciences, 2012.
[18] M. Akhil Jabbar, Priti Chandra and B. L. Deekshatulu, "Heart Disease Prediction System using Associative Classification and Genetic Algorithm", International Conference on Emerging Trends in Electrical, Electronics and Communication Technologies, 2012.
[19] M. Bramer, Principles of Data Mining: Springer-Verlag, 2007.
[20] R. Alizadehsani, J. Habibi, B. Bahadorian, H. Mashayekhi, A. Ghandeharioun, and R. Boghrati, et al., "Diagnosis of
coronary arteries stenosis using data mining," J Med Signals Sens, vol. 2, pp. 153-9, Jul 2012.
[21] J.Nahar, T. Imam, K. S. Tickle, and Y.-P. P. Chen, “Computational Intelligence for heart disease diagnosis: A
medical Knowledge driven approach,” Expert Systems with Applications, vol. 40, no. 1, pp. 96–104,2013.
[22] M. Kumari and S. Godara, "Comparative study of data mining classification methods in cardiovascular disease prediction," IJCST, vol. 2, pp. 304-305, 2011.
[23] Y. E. Shao, C.-D. Hou, and C.-C. Chiu, “Hybrid intelligent modelling, schemes for heart disease classification,”
Applied Soft Computing,vol. 14, pp. 47–52, 2014.
[24] P. V. Ankur Makwana, “Identify the patients at high risk of re-admissionin hospital in the next year,” International
Journal of Science andResearch, vol. 4, pp. 2431–2434, 2015.
[25] A. Aziz, N. Ismail, And F. Ahmad, “Mining Students’ academic Performance.,” Journal of Theoretical & Applied
Information Technology, vol. 53, no. 3, 2013.