Yarlagadda, Merla-
Introduction
Both the physical and the psychological characteristics of a person can
affect health. Many types of models can be built to predict the occurrence
of a disease from a set of symptoms. The objective of this research is to
show that the MBR (memory-based reasoning) model appears to be more
effective than the other models compared in this domain, perhaps because MBR
uses k nearest neighbors: it predicts the unknown value for a case from the
k most similar stored cases. Memory-based reasoning may suit clinical data
well because a disease and its causes become known only after the symptoms
of earlier patients with that disease have been observed and analyzed.
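The nearest-neighbor idea described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS Enterprise Miner MBR node: the function name `knn_predict`, the Euclidean distance, the exhaustive search over all stored cases, and the small standardized data set are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Return (predicted_label, share_of_the_k_neighbors_with_that_label)."""
    if not 1 <= k <= len(train_rows):
        raise ValueError("k must be between 1 and the number of stored cases")
    # Distance from the query to every stored case (Euclidean is an assumption).
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    # Keep the k closest cases, ordered by distance only.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote among the k neighbors.
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / k

# Hypothetical, already-standardized cases: (age, systolic BP, cholesterol).
cases = [(-1.0, -0.8, -0.5), (-0.7, -1.1, -0.2), (0.9, 1.2, 0.8),
         (1.1, 0.7, 1.3), (0.8, 1.0, 0.4)]
outcomes = [0, 0, 1, 1, 1]  # 1 = heart disease, 0 = no heart disease

prediction, share = knn_predict(cases, outcomes, (1.0, 0.9, 0.9), k=3)
```

Because distances are sensitive to scale, inputs such as blood pressure and cholesterol would normally be standardized before the search, which is why the example cases are shown already standardized.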
Data Preparation
The data were collected from the UCI Machine Learning Repository. The data
set consists of 89,000 observations and 20 variables, such as triglycerides,
cholesterol, body fat percentage, HDL, LDL, systolic and diastolic BP, % fat
in various body parts, and other factors that contribute to heart disease,
such as smoking and alcohol consumption.
Several imputation techniques, such as tree imputation and synthetic
distribution, were used to replace some of the missing values. The
distribution of each variable was examined, and some variables were
transformed toward a normal distribution using the Transform Variables node
in order to improve the performance of the model.
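The two preparation steps above (filling in missing values, then reshaping a skewed variable) can be illustrated as follows. This is a simplified sketch under stated assumptions: median imputation is swapped in for the tree and synthetic-distribution imputation used in the poster, log(1 + x) stands in for whichever transformation was actually chosen, and the triglyceride values are made up.

```python
import math
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def log_transform(values):
    """Apply log(1 + x) to pull in a long right tail; requires x > -1."""
    return [math.log1p(v) for v in values]

# Hypothetical triglyceride readings (mg/dL) with two missing values.
triglycerides = [90, 110, None, 150, 480, None, 130]
filled = impute_median(triglycerides)
transformed = log_transform(filled)
```

After the transformation the single large reading (480) sits much closer to the rest of the values, which is the effect a normalizing transformation aims for.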
Model building and evaluation
Regression, Neural Network, Decision Tree, and MBR models (RD-Tree and Scan
methods) were compared. Validation average squared error, misclassification
rate, ROC curve, and cumulative lift were used to evaluate the performance
of the models. The RD-Tree MBR model turned out to be the best model, with
a validation average squared error of 0.07, a misclassification rate of
0.58, and a cumulative lift of 1.76.
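For readers unfamiliar with the evaluation statistics named above, the following sketch shows one common way to compute them for a binary target. The function names, the 0.5 classification cutoff, the 20% lift depth, and the example data are assumptions for illustration; the poster does not state the cutoff or depth used, and the numbers here are unrelated to the reported results.

```python
def average_squared_error(y_true, p_hat):
    """Mean squared difference between the 0/1 outcome and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)

def misclassification_rate(y_true, p_hat, cutoff=0.5):
    """Share of cases whose predicted class (p >= cutoff) differs from the outcome."""
    wrong = sum((p >= cutoff) != bool(y) for y, p in zip(y_true, p_hat))
    return wrong / len(y_true)

def cumulative_lift(y_true, p_hat, depth=0.2):
    """Event rate in the top `depth` share of scored cases over the overall rate."""
    base_rate = sum(y_true) / len(y_true)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no events")
    # Rank cases from highest to lowest predicted probability.
    ranked = sorted(zip(p_hat, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(depth * len(ranked)))
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    return top_rate / base_rate

# Hypothetical validation outcomes and predicted probabilities.
y = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
p = [0.9, 0.2, 0.8, 0.6, 0.1, 0.4, 0.3, 0.2, 0.1, 0.05]

ase = average_squared_error(y, p)
mis = misclassification_rate(y, p)
lift = cumulative_lift(y, p, depth=0.2)
```

A cumulative lift above 1 means the cases the model ranks highest contain proportionally more events than the data set as a whole.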
Figure 2. Model building
Discussion:
1) Both MBR methods predicted heart disease from the set of symptoms very
well, because the MBR node keeps the earlier cases and their symptoms in
memory.
2) Rather than relying only on previously used models, it is worth trying
different types of newer models to uncover additional insights.
3) Clinical research organizations should try running MBR models on
clinical data for disease prediction. In the case above, MBR performed best
at predicting the disease from its symptoms, because it stores all cases in
memory and predicts with the k-nearest-neighbors concept.
Data Insights:
Age is the most significant factor for heart disease, together with the
main factors of high blood pressure, smoking, and high cholesterol.
The skinfold measurements of the abdomen and the thigh play a more
significant role than the other skinfold measurements in predicting body
fat percentage, which in turn might lead to heart disease.
The effective use of Memory Based Reasoning model in predicting the illness of
disease using SAS Enterprise Miner 12.1
Krishna Chaitanya Yarlagadda
Data Mining and Reporting Analyst, IQR Consulting, Oklahoma State University (Alumni), Stillwater, OK 74078
Faculty Advisor: Dr. Goutam Chakraborty
Figure 3. Prediction Accuracy Results
Figure 4. ROC curve of different models
Results
Figure 1. Existing and Proposed solution
Acknowledgement :
The authors wish to thank Dr.
Goutam Chakraborty for his
guidance and advice on this
project.
Authors Information:
Krishna Chaitanya Yarlagadda
E-mail: krishna.chaitanya.yarlagadda@okstate.edu
Work Phone: (269)365-1975
Figure 5. Cumulative lift of RD Tree MBR model
