1) The document describes a project using logistic regression analysis to predict the presence of heart disease based on common factors like age, blood pressure, cholesterol, etc.
2) Exploratory data analysis was performed including understanding variables, cleaning data, analyzing variables through summary statistics and visualizations.
3) Logistic regression was used to build a model to predict heart disease presence in a yes/no format based on the factors. The accuracy of the model was then evaluated.
We predict heart disease by taking 14 medical parameters as inputs through two data mining techniques: Decision Tree (faster) and K-Nearest Neighbours (slower). We also visualize the dataset. An output of 1 means a higher chance of heart attack; 0 means a lower chance.
This presentation gives an idea of data preprocessing in the field of data mining. Images, examples and other material are adapted from "Data Mining: Concepts and Techniques" by Jiawei Han, Micheline Kamber and Jian Pei.
HEALTH PREDICTION ANALYSIS USING DATA MINING — Ashish Salve
The healthcare industry relies heavily on assumptions that are then tested and verified through various examinations, and patients must depend on the doctor's knowledge of the topic. We therefore built a system that uses data mining techniques to predict a person's health from various medical test results. The system is currently designed only for heart conditions; for training we used the Statlog (Heart) Data Set from the UCI Machine Learning Repository, which includes attributes such as age, sex, chest pain type, cholesterol, blood sugar and outcomes. Only a few general inputs need to be passed in order to generate a prediction. The prediction results from all algorithms are merged by calculating their mean, and that value is the actual outcome of the prediction process, which runs entirely in the background.
Cardiovascular Disease Prediction Using Machine Learning Approaches.pptx — Taminul Islam
Cardiovascular Disease Prediction Using Machine Learning Approaches.
Presentation for CISES 2023. Presentation Outline.
Introduction
Objectives
Literature review
Data Collection
Methodology
Result
Challenges & Future work
Conclusion
PCA and LDA are dimensionality reduction techniques. PCA transforms variables into uncorrelated principal components while maximizing variance. It is unsupervised. LDA finds axes that maximize separation between classes while minimizing within-class variance. It is supervised and finds axes that separate classes well. The document provides mathematical explanations of how PCA and LDA work including calculating covariance matrices, eigenvalues, eigenvectors, and transformations.
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase — ijtsrd
The early prognosis of cardiovascular disease can aid decisions about lifestyle changes in high-risk patients and in turn reduce their complications. Research has attempted to pinpoint the most influential factors of heart disease as well as accurately predict the overall risk using homogeneous data mining techniques. Recent research has delved into amalgamating these techniques using approaches such as hybrid data mining algorithms. This paper proposes a rule-based model that compares the accuracies of applying rules to the individual results of logistic regression on the Cleveland Heart Disease Database in order to present an accurate model for predicting heart disease. K. Sandhya Rani | M. Sai Chaitanya | G. Sai Kiran, "A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase", published in International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN: 2456-6470, Volume-2, Issue-3, April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd11402.pdf http://www.ijtsrd.com/computer-science/data-miining/11402/a-heart-disease-prediction-model-using-logistic-regression-by-cleveland-database/k-sandhya-rani
Principal Component Analysis (PCA) is a technique used to reduce the dimensionality of data by transforming it to a new coordinate system. It works by finding the principal components - linear combinations of variables with the highest variance - and using those to project the data to a lower dimensional space. PCA is useful for visualizing high-dimensional data, reducing dimensions without much loss of information, and finding patterns. It involves calculating the covariance matrix and solving the eigenvalue problem to determine the principal components.
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM — amiteshg
This document describes using a Naive Bayes classifier to predict the likelihood of heart disease. It discusses how a web-based application would take in a user's medical information and use a trained dataset to compare and retrieve hidden data to diagnose heart disease. The document provides an example of using Bayes' theorem to calculate the probability of breast cancer based on a positive mammogram. It explains the implementation of the Naive Bayes classifier and concludes that the model could help practitioners make accurate clinical decisions to diagnose and treat heart disease.
A major challenge facing healthcare organizations (hospitals, medical centers) is the provision of quality services at affordable costs. Quality service implies diagnosing patients correctly and administering treatments that are effective. Poor clinical decisions can lead to disastrous consequences and are therefore unacceptable. Hospitals must also minimize the cost of clinical tests. They can achieve these results by employing appropriate computer-based information and/or decision support systems.
Most hospitals today employ some sort of hospital information system to manage their healthcare or patient data. These systems are designed to support patient billing, inventory management and the generation of simple statistics. Some hospitals use decision support systems, but they are largely limited. Clinical decisions are often made based on doctors' intuition and experience rather than on the knowledge-rich data hidden in the database. This practice leads to unwanted biases, errors and excessive medical costs, which affect the quality of service provided to patients.
The document discusses various topics related to data analysis, including:
- Data analysis is defined as systematically organizing qualitative data to increase understanding of a phenomenon. It involves coding data and identifying patterns.
- Qualitative data comes in unstructured forms like interviews, observations, diaries and records. Analysis is more intuitive than quantitative analysis and focuses on values, meanings and experiences.
- Data can be measured on nominal, ordinal, interval or ratio scales depending on the properties they satisfy. Nominal data are categorical while ordinal data have a ranking order. Interval and ratio data have equal units of measurement.
- Common types of qualitative data analysis include content analysis, narrative analysis, discourse analysis, framework analysis and grounded theory
PCA projects data onto principal components to reduce dimensionality while retaining most information. It works by (1) zero-centering the data, (2) calculating the covariance matrix to measure joint variability, (3) computing eigenvalues and eigenvectors of the covariance matrix to identify principal components with most variation, and (4) mapping the zero-centered data to a new space using the eigenvectors. This transforms the data onto a new set of orthogonal axes oriented in the directions of maximum variance.
Heart disease prediction using machine learning algorithm — Kedar Damkondwar
The document summarizes a seminar presentation on predicting heart disease using machine learning algorithms. It introduces the problem of heart disease prediction and the motivation to develop an automated system to assist in diagnosis and treatment. It reviews several existing studies that used methods like decision trees, naive Bayes, neural networks, and support vector machines to predict heart disease risk factors. The objectives of the presented model are to develop a predictive system using machine learning techniques to analyze heart data and help reduce medical costs and human biases. The proposed model and applications in medical institutions and hospitals are also discussed.
This document discusses and provides examples of supervised and unsupervised learning. Supervised learning involves using labeled training data to learn relationships between inputs and outputs and make predictions. An example is using data on patients' attributes to predict the likelihood of a heart attack. Unsupervised learning involves discovering hidden patterns in unlabeled data by grouping or clustering items with similar attributes, like grouping fruits by color without labels. The goal of supervised learning is to build models that can make predictions when new examples are presented.
This document discusses principal component analysis (PCA) and its applications in image processing and facial recognition. PCA is a technique used to reduce the dimensionality of data while retaining as much information as possible. It works by transforming a set of correlated variables into a set of linearly uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. The document provides an example of applying PCA to a set of facial images to reduce them to their principal components for analysis and recognition.
This document discusses exploratory data analysis techniques including boxplots and five-number summaries. It explains how to organize and graph data using histograms, frequency polygons, stem-and-leaf plots, and box-and-whisker plots. The five important values used in a boxplot are the minimum, first quartile, median, third quartile, and maximum. An example constructs a boxplot for a stockbroker's daily client numbers over 11 days.
This document discusses exploratory data analysis techniques including summary statistics, visualization methods, and online analytical processing (OLAP). It provides examples using the Iris dataset to illustrate histogram, box plot, scatter plot, contour plot, matrix plot, parallel coordinates, and OLAP cube and slicing/dicing operations. The goal of exploratory data analysis is to better understand data characteristics through pattern recognition and selecting appropriate tools for preprocessing and analysis.
PCA is an unsupervised learning technique used to reduce the dimensionality of large data sets by transforming the data to a new set of variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. PCA is commonly used for applications like dimensionality reduction, data compression, and visualization. The document discusses PCA algorithms and applications of PCA in domains like face recognition, image compression, and noise filtering.
Data Visualization in Exploratory Data Analysis — Eva Durall
This document outlines activities for exploring equity in science education outside the classroom using data visualization. It introduces exploratory data analysis and how data visualization can help generate hypotheses from data. The activities include analyzing an interactive map of science education organizations, and creating visualizations to explore equity indicators like access, diversity, and inclusion. Effective visualization requires defining goals, finding relevant data, and experimenting with different chart types to answer questions arising from the data.
Machine Learning for Disease Prediction — Mustafa Oğuz
A great application field of machine learning is predicting diseases. This presentation introduces preventable diseases and deaths, then examines three diverse papers to explain what has been done in the field and how the technology works, and finishes with future possibilities and enablers of disease prediction technology.
This document provides an overview of exploratory data analysis (EDA). It discusses the key stages of EDA including data requirements, collection, processing, cleaning, exploration, modeling, products, and communication. The stages involve examining available data to discover patterns and relationships. EDA is the first step in data mining projects to understand data without assumptions. The document also outlines the problem definition, data preparation, analysis, and result development and representation steps of EDA. Finally, it discusses different types of data like numeric, categorical, and the importance of understanding data types for analysis.
This document provides an introduction to Bayesian methods. It discusses key Bayesian concepts like priors, likelihoods, and Bayes' theorem. Bayes' theorem states that the posterior probability of a measure is proportional to the prior probability times the likelihood function. The document uses examples to illustrate Bayesian analysis and key principles like the likelihood principle and exchangeability. It also briefly discusses Bayesian pioneers like Bayes, Laplace, and Gauss and computational Bayesian methods.
Heart Disease Identification Method Using Machine Learning in E-healthcare — SUJIT SHIBAPRASAD MAITY
This document describes a student project that aims to develop a machine learning model for heart disease identification and prediction. It discusses existing heart disease diagnosis techniques, identifies the problem and requirements, outlines the proposed algorithm and methodology using supervised learning classification algorithms like K-Nearest Neighbors and logistic regression. Block diagrams and flow charts illustrate the data preprocessing, model training, and web application development steps to classify patients as having heart disease or not and evaluate model performance. The developed system achieves high accuracy for heart disease prediction.
Prediction of cardiovascular disease with machine learning — Pravinkumar Landge
This document discusses using machine learning to predict cardiovascular disease. It begins with an introduction to heart disease and cardiovascular disease. It then discusses the motivation for using machine learning to predict disease given the large amount of healthcare data and multiple risk factors. The document describes the Cleveland Heart Disease dataset that is used, which contains 14 attributes on individuals. It concludes that machine learning techniques are useful for predicting cardiovascular disease outcomes based on risk factor data.
This document summarizes a student's data analysis project that aims to predict heart attack risk factors. The student analyzes a dataset containing information on 303 patients, including demographic data, medical information, and whether they had a heart attack. The objectives are to investigate relationships between attributes and heart disease risk, identify which chest pain types are most associated with heart attacks, and examine the impact of exercise. Preliminary results include checking the data structure, distributions, outliers, and visualizing relationships through histograms, boxplots, and scatter plots. Future work involves further investigating variable relationships, applying advanced statistical techniques, and collecting additional data to better understand and prevent heart disease.
Dive into the forefront of healthcare analytics with our latest project showcase on heart disease classification. Our students at the Boston Institute of Analytics have delved deep into the complexities of heart disease diagnosis using advanced data science and artificial intelligence techniques. Explore the innovative methodologies, insightful findings, and impactful solutions presented in this collection of projects. From predictive modeling to risk assessment, these projects demonstrate the power of data-driven approaches in revolutionizing healthcare. To learn more about our data science and artificial intelligence programs, visit https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/.
Dive into an extensive analysis of heart disease classification, exploring key factors, trends, and predictive models for improved diagnosis and treatment strategies. Visit, https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/ for more
This NLP Project Presentation explores how Natural Language Processing (NLP) and Data Science are revolutionizing the prediction of heart disease. Discover how cutting-edge techniques are being used to analyze textual data, such as patient records and medical reports, to predict the likelihood of heart disease with unprecedented accuracy. For more details on data science Visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
The document analyzes risk factors for coronary heart disease (CHD) using data on men ages 15-64. Hypothesis tests were conducted to compare systolic blood pressure (SBP) and body mass index (BMI) in those with and without CHD. The test found higher average SBP in those with CHD but no difference in average BMI. A multiple linear regression analyzed the correlation between low-density lipoprotein (LDL) cholesterol and age, SBP, and BMI, finding BMI to be most correlated with LDL.
Dive into our students' innovative project leveraging machine learning for heart disease prediction. Discover how advanced analytics and predictive modeling can revolutionize healthcare, providing early detection and personalized interventions for better patient outcomes. To learn more, do check out https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/.
biostat 2.pptx — MrMedicine
The sample consists of 10 women seeking prenatal care. Key statistics about the sample include:
- Sample size is 10
- Mode is 22, 26, and 28 (as each occurs once)
- Median is 26.5
- Range is 25 (43-18)
- Sample mean is 26.6
- Sample variance is 64.4
- Sample standard deviation is 8.03
The document summarizes analyses of two heart disease datasets: LA Heart and Cardiovas. For LA Heart, logistic regression found systolic blood pressure highly predicts heart disease probability, while other factors were less predictive. For Cardiovas, multiple regressions found hemoglobin A1C levels best explained by waist size, age, cholesterol, and blood glucose. Blood glucose was best explained by age, and other factors showed moderate-high significance. Overall, the analyses indicate systolic blood pressure and hemoglobin A1C/blood glucose levels along with associated risk factors provide useful information for understanding heart disease outcomes.
The document provides an overview of cardiovascular disorders and the nursing care of clients with such conditions. It discusses the anatomy and physiology of the heart, epidemiology of cardiovascular diseases, nursing assessment including history, physical exam, diagnostic tests, and monitoring tools. Nursing responsibilities are outlined for various diagnostic tests and procedures like stress tests, echocardiograms, cardiac catheterization, and treatment of dysrhythmias. The goal of nursing care is to comprehensively assess and monitor clients while educating them about risk factor reduction and management of their cardiovascular condition.
Hypertension, also known as high blood pressure, is defined as an average systolic blood pressure above 140 mm Hg or an average diastolic blood pressure above 90 mm Hg based on multiple readings. There are three main types of hypertension: essential or primary hypertension which has no known cause, secondary hypertension caused by other conditions, and pregnancy-induced hypertension. Blood pressure is regulated by both rapid-acting mechanisms like the sympathetic nervous system and baroreceptors, and slower-acting mechanisms like the kidneys and renin-angiotensin system. Uncontrolled hypertension increases the risks of heart disease, stroke, kidney disease and other health issues.
Pro / Con Debate on Central Blood Pressure — magdy elmasry
The Basis : Forward & Reflected Pulse Waves
Central BP - Pro Side of the Argument
Central BP - Con Side of the Argument
Central BP - Consensus on Clinical Application
FDA-cleared devices for central BP and arterial stiffness assessment
Value of measuring central BP in clinical practice
Comparative effect of anti-hypertensive drugs and nitrates on central systolic BP
Isolated systolic hypertension in the young
A Heart Disease Prediction Model using Decision Tree — IOSR Journals
This document presents a heart disease prediction model using decision tree analysis. It selects 14 clinical features from patient data and develops prediction models using the J48 decision tree algorithm with unpruned, pruned, and pruned with reduced error pruning approaches. The results show that the pruned J48 decision tree with reduced error pruning has the highest accuracy at 75.73%, compared to 72.82% for unpruned and 73.79% for pruned. The reduced error pruning approach produces more compact decision rules with fewer extracted rules, improving predictive performance.
PREVENTION OF HEART PROBLEM USING ARTIFICIAL INTELLIGENCE — ijaia
This document discusses building a machine learning model to predict the probability of patients experiencing heart problems based on their medical data. It analyzes data from 1000 patients across India on risk factors like family history, smoking, hypertension, cholesterol levels, blood sugar, obesity, lifestyle, previous bypass surgery, and iron levels. The model aims to help doctors make treatment decisions and minimize false negatives, where the model predicts no problem when one exists. It finds certain risk factors like family history, age over 50, and being male are correlated with higher heart problem rates. The model will be trained on this data to predict new patients' heart problem probability.
This document discusses vital signs and provides detailed information about assessing and interpreting blood pressure. It defines blood pressure and its components, describes the equipment used for measurement including sphygmomanometers and stethoscopes, identifies assessment sites on the body, explains Korotkoff sounds heard during measurement, outlines the procedure for taking a reading, and reviews factors that can affect blood pressure values. Abnormal readings and variations like auscultatory gaps are also addressed.
This document provides an overview and summary of Pulse Dynamics technology, which analyzes arterial pulse waveforms to noninvasively measure hemodynamic parameters like arterial compliance, peripheral resistance, and left ventricular contractility. It discusses clinical studies that have validated the use of Pulse Dynamics to study hypertension, cardiovascular risk factors, heart disease, renal disease, and more. The document also outlines the physics behind Pulse Dynamics methodology and provides sample reports and comments from physicians on their experience using Pulse Dynamics in clinical research and patient care.
This document discusses sudden cardiac death (SCD) in young athletes. SCD is the leading cause of death in exercising young athletes, with estimates of incidence ranging from 1 in a million to as high as 1 in 3,000 for some athlete populations. SCD often results from structural heart abnormalities that may be detected through pre-participation physicals. Pre-participation cardiovascular screening evaluates large athlete populations before participation to reduce the risk of SCD. The document goes on to discuss recommendations and components of pre-participation cardiovascular screening examinations.
Here are the key points about the different types of blood vessels:
- Arteries carry oxygenated blood away from the heart to tissues and organs. They have an outer
tunica externa layer of connective tissue, a middle tunica media layer of smooth muscle, and an
inner tunica intima layer of endothelium. Larger elastic arteries near the heart have more elastic
tissue.
- Capillaries are the microscopic vessels that connect arterioles and venules. They allow for the
exchange of water, oxygen, nutrients, hormones, carbon dioxide and waste between blood and
tissues. Capillaries have a single layer of endothelial cells and connective tissue.
An Ill-identified Classification to Predict Cardiac Disease Using Data Cluste... — ijdmtaiir
The health care industry contains a large amount of health care data with hidden information. This information is useful for making effective decisions, and computer-based data mining techniques are used to extract appropriate results from it. Previously, Neural Networks (NN) were widely used for predicting cardiac disease. In this paper, a Cardiac Disease Prediction System (CDPS) is developed using data clustering. The CDPS system uses 15 parameters to predict the disease, for example BP, obesity, cholesterol, etc. These 15 attributes, such as sex, age and weight, are given as the input. Using the patient's medical record, an ill-defined classification is applied at an early stage to diagnose cardiac disease. Based on the result, patients are advised to keep the sensor for prediction.
Exercise stress echocardiography in patients with aortic stenosis: impact of baseline diastolic dysfunction and functional capacity on mortality and aortic valve replacement
Authors: Andrew N. Rassi, Wael AlJaroudi, Sahar Naderi, M Chadi Alraies, Venu Menon, Leonardo Rodriguez, Richard Grimm, Brian Griffin, Wael A. Jaber
http://www.thecdt.org/article/view/2855
End-to-end pipeline agility - Berlin Buzzwords 2024 — Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data — Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Open Source Contributions to Postgres: The Basics, POSETTE 2024 — ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai... — Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
2. CONTRIBUTION
Kritika Jain and Vanya Vasudeva: coding, presentation and research
Vansh Puri and Anant Goyal: data cleaning
3. Introduction
Heart disease is one of the top leading causes of death, accounting for 17.7 million deaths each year, 31% of all global deaths, as reported by the World Health Organization (2017).
Several pieces of clinical information and several symptoms are found to be related to heart disease, including age, blood pressure, total cholesterol, diabetes and hypertension.
The Heart Disease dataset consists of the above-mentioned information and attributes, summarized and collected from patients.
With the huge amounts of data made available in recent years, the diagnosis of heart disease can be performed automatically, using traditional statistical methods to predict each patient's potential of having heart disease.
4. Aim of analysis
The aim of analyzing this data set is to predict which people suffer from heart disease based on certain common factors.
Since the prediction has to be in a yes/no format, the dependent variable is categorical and can thus be modeled using logistic regression or a random forest.
5. EXPLORATORY DATA ANALYSIS
Exploratory Data Analysis refers to the critical process of performing initial investigations on data so as to discover patterns, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations. It employs a variety of techniques to:
1. maximize insight into a data set;
2. uncover underlying structure;
3. extract important variables;
4. detect outliers and anomalies;
5. test underlying assumptions;
6. develop parsimonious models; and
7. determine optimal factor settings.
6. Components of EDA
The main components of exploring data:
1. Understanding your variables
- Importing the dataset
- Changing the structure
- Summary statistics: calculating mean, median, mode, skewness, kurtosis, variance, etc.
2. Cleaning your dataset
- Removing redundant variables (NA values)
- Variable selection
- Removing outliers using boxplots
3. Analysing the variables
- Visualising data: histograms, bar plots, scatterplots, etc.
- Checking for correlation using corrplots
4. Creating a model
- Using logistic regression to make predictions
- Checking the accuracy of the model
7. What is logistic regression analysis?
Logistic regression is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary) or categorical. Like all regression analyses, logistic regression is a predictive analysis. It is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
8. Major assumptions in binary logistic regression
The dependent variable should be dichotomous in nature (e.g., present vs. absent).
There should be no outliers in the data.
OLS assumptions should be satisfied.
At the center of the logistic regression analysis is the task of estimating the log odds of an event. Mathematically, logistic regression estimates a multiple linear regression function defined as:

log(p_i / (1 - p_i)) = β0 + β1·x_i1 + β2·x_i2 + … + βk·x_ik, for i = 1…n

Regression coefficients give the change in log(odds) of the response for a unit change in the predictor. However, since the relationship between p(X) and X is not a straight line, a unit change in an input feature does not affect the model output directly; it affects the odds ratio.
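Exponentiating both sides makes the odds-ratio interpretation concrete. This short derivation is a standard consequence of the model above and is added here for clarity, not taken from the slides:

```latex
\frac{p_i}{1-p_i} = e^{\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}}
\qquad\Longrightarrow\qquad
\frac{\mathrm{odds}(x_j + 1)}{\mathrm{odds}(x_j)} = e^{\beta_j}
```

So a one-unit increase in predictor x_j multiplies the odds of the event by e^{βj}; the change in probability itself depends on where on the logistic curve the observation sits.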
9. Understanding the variables
age - age in years
sex - (1 = male; 0 = female)
cp - chest pain type
0: Typical angina: chest pain related to decreased blood supply to the heart
1: Atypical angina: chest pain not related to the heart
2: Non-anginal pain: typically esophageal spasms; not heart-related
3: Asymptomatic: chest pain not showing signs of disease
trestbps - resting blood pressure (in mm Hg on admission to the hospital)
above 130-140 is cause for concern
chol - serum cholesterol in mg/dl
above 200 is cause for concern
10. fbs - fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
> 126 mg/dl signals diabetes
restecg - resting electrocardiographic results
0: nothing to note
1: can range from mild symptoms to severe problems; signals a non-normal heartbeat
2: enlarged main pumping chamber of the heart
thalach - maximum heart rate achieved
exang - exercise induced angina (1 = yes; 0 = no)
oldpeak - ST depression: stress of the heart during exercise
an unhealthy heart stresses more
11. slope - the slope of the peak exercise ST segment
0: Upsloping: better heart rate with exercise
1: Flatsloping: typical healthy heart
2: Downsloping: signs of an unhealthy heart
ca - number of major vessels (0-3) colored by fluoroscopy
a colored vessel means the doctor can see the blood passing through
the more blood movement, the better the functioning of the heart (no clots)
thal - thallium stress test result
0, 1: normal
2: fixed defect
3: reversible defect: no proper blood movement when exercising
target - has disease or not (1 = yes, 0 = no)
the predicted attribute
12. Changes made to variables
For simplification, we renamed the variables (a sketch follows the list):
cp = chest_pain_type
trestbps = rest_bp
chol = cholesterol
fbs = fast_bs
restecg = rest_ecg
thalach = max_hr
exang = ex_induced_angina
oldpeak = ST_dep
ca = vessels
thal = defect
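A minimal sketch of this renaming in R with dplyr, assuming the data frame is named `heart` (the deck does not show its code):

```r
library(dplyr)

# Rename the raw UCI column names to the clearer names listed above
heart <- heart %>%
  rename(
    chest_pain_type   = cp,
    rest_bp           = trestbps,
    cholesterol       = chol,
    fast_bs           = fbs,
    rest_ecg          = restecg,
    max_hr            = thalach,
    ex_induced_angina = exang,
    ST_dep            = oldpeak,
    vessels           = ca,
    defect            = thal
  )
```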
13. Other changes to data
The classes of the following variables were converted into factors to accurately define their context: slope, vessels, defect, target, sex, chest_pain_type, fast_bs, rest_ecg and ex_induced_angina.
The age column was converted into an integer.
Neither NA nor missing values were found in the data.
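One way to express these conversions in base R, again assuming the `heart` data frame from the renaming sketch:

```r
# Convert the categorical columns to factors and age to an integer
factor_cols <- c("slope", "vessels", "defect", "target", "sex",
                 "chest_pain_type", "fast_bs", "rest_ecg", "ex_induced_angina")
heart[factor_cols] <- lapply(heart[factor_cols], as.factor)
heart$age <- as.integer(heart$age)

sum(is.na(heart))  # 0 here: neither NA nor missing values were found
```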
14. Summary statistics of data
Summary statistics refers to a quick description of the data, basically including the mean, median, mode, skewness, kurtosis, variance and standard deviation of the variables. Some of these measures apply only to variables that are numeric in nature.
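A sketch of how these summaries can be computed in R; using the moments package for skewness and kurtosis is one common choice, not necessarily the deck's:

```r
library(moments)  # skewness() and kurtosis()

num_cols <- sapply(heart, is.numeric)
summary(heart[, num_cols])           # min, quartiles, median, mean, max
sapply(heart[, num_cols], sd)        # standard deviation
sapply(heart[, num_cols], skewness)  # skewness
sapply(heart[, num_cols], kurtosis)  # kurtosis
```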
15. Removing outliers
To remove the outliers, we use boxplots: we plot the data on a boxplot and then replace the outliers with the mean, median or mode (a sketch follows).
Maximum heart rate (before and after)
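A minimal sketch of the boxplot-based replacement for maximum heart rate, here using the median (the deck lists mean, median or mode as options):

```r
# Before: the boxplot flags points beyond the whiskers as outliers
boxplot(heart$max_hr, main = "Maximum heart rate (before)")

# Replace the flagged outliers with the median
out <- boxplot.stats(heart$max_hr)$out
heart$max_hr[heart$max_hr %in% out] <- median(heart$max_hr, na.rm = TRUE)

# After: the extreme points are gone
boxplot(heart$max_hr, main = "Maximum heart rate (after)")
```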
19. Data visualizations
Data visualization refers to plotting the data in histograms, bar graphs, scatterplots, etc. for an easier and more efficient understanding of the data.
A bar chart or bar graph presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent.
A histogram is a graphical display of data using bars of different heights. It is similar to a bar chart, but a histogram groups numbers into ranges; the height of each bar shows how many values fall into each range.
A scatterplot is a type of data display that shows the relationship between two numerical variables. Each member of the dataset gets plotted as a point whose (x, y) coordinates relate to its values for the two variables.
20. Histograms
To make the histograms, we filtered the data using the dplyr package to get the data for people suffering from heart disease, and then used the hist function to create the histograms (a sketch follows).
The first histogram shows the number of people of a particular age who suffer from heart disease. As you age, so do your blood vessels: they become less flexible, making it harder for blood to move through them easily. We observe that people in the age group of 40-65 are the most likely to suffer from heart disease.
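A sketch of the filter-then-hist step described above, with the `heart` data frame and renamed columns from earlier:

```r
library(dplyr)

# Keep only patients with heart disease (target == 1 also works when target is a factor)
diseased <- filter(heart, target == 1)

# Age distribution among diseased patients
hist(diseased$age,
     main = "Age of patients with heart disease",
     xlab = "Age (years)")
```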
21. This histogram shows the frequency of ST depressions undergone by patients suffering from heart disease. ST depression shows the stress of the heart induced by exercise: the lower the ST depression, the greater the stress of the heart. Heart patients experience relatively higher stress of the heart.
22. This histogram shows that most people with heart disease have a resting blood pressure between 120 and 140 mm Hg. Both very high and very low blood pressure are causes for concern for heart patients. The graph shows that heart patients tend to have high blood pressure; the normal range is 80-120 mm Hg.
23. This histogram shows that most people suffering from heart disease have cholesterol levels between 200 and 240 mg/dl. According to research, a cholesterol level above 200 is a cause for concern.
24. Heart rate is the speed of the heartbeat, measured by the number of contractions (beats) of the heart per minute (bpm). The heart rate can vary according to the body's physical needs, including the need to absorb oxygen. This histogram shows that heart disease patients generally have a maximum heart rate of 160-180 bpm. Research shows that the maximum heart rate should neither fall too low nor rise too high.
25. Scatterplot
The scatterplot shows a downward relationship between age and maximum heart rate achieved: as age increases, the maximum heart rate achieved falls for a diseased person.
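A sketch of this scatterplot, reusing the `diseased` subset from the histogram sketch; the trend line is an added illustration:

```r
# Age vs. maximum heart rate for diseased patients
plot(diseased$age, diseased$max_hr,
     xlab = "Age (years)", ylab = "Maximum heart rate (bpm)", pch = 19)
abline(lm(max_hr ~ age, data = diseased), col = "red")  # downward trend
```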
26. Mixed Corrplot
There should be no high correlations (multicollinearity)
among the predictors. This can be assessed by a
correlation matrix among the predictors.
We created a data set of the numeric variables and calculated the correlations amongst them. We then used a mixed corrplot to visualize the correlation matrix, as sketched below. Since the correlation between every pair of variables is less than 0.90, these variables can be used in the model.
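A sketch of that step with the corrplot package; corrplot.mixed() draws numbers in one triangle and circles in the other:

```r
library(corrplot)

num_vars <- heart[sapply(heart, is.numeric)]   # numeric variables only
M <- cor(num_vars)
corrplot.mixed(M, lower = "number", upper = "circle")
max(abs(M[upper.tri(M)]))  # stays below 0.90, so no problematic multicollinearity
```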
27. Corrplot
INSIGHTS:
As age increases, cholesterol, stress of heart during exercise and resting bp
also increase. On the other hand, maximum heart rate falls with old age.
As cholesterol increases, stress of heart during exercise and resting bp
increase, while maximum heart rate falls.
As ST depression rises, i.e. the stress of the heart falls, resting bp rises. Resting bp also has a negative relation with maximum heart rate. The degree of correlation between all variables is very small; however, age and maximum heart rate show a slightly higher correlation, and ST depression and maximum heart rate show similar results.
28. Bar chart
We observe that the number of males is higher than the number of females in our dataset. We initially assumed that the data is biased; however, we later realized that a larger proportion of the male population going for heart check-ups implies a higher degree of risk for males.
29. Creating a model
We use LOGISTIC REGRESSION because the dependent variable, in our case,
target (whether a person suffers from heart disease or not) is categorical in nature.
1: person suffers from heart disease
0: person does not suffer from heart disease
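In R this is a glm() with a binomial family (logit link). A sketch of the general form, fitted on the training partition that is created on the next slide:

```r
# Logistic regression: the binomial family with its default logit link.
fit_all <- glm(Target ~ ., data = train, family = binomial)
summary(fit_all)
```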
30. Division into train data and test data
We use the “CARET package” to divide the entire dataset into 2 parts:
Train: this part of the dataset is used for model building. Analysis is done for this
dataset and an appropriate model is built according to requirements.
Test: this part of the dataset is used to test the model. The output on this dataset is compared with the actual output, and from this the accuracy of the model can be assessed.
The function createDataPartition is used to split the data into 60% for the training set and 40% for the testing set, as sketched below.
We do this to predict the dependent variable target.
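A sketch of the split, assuming the heart data frame; the seed is hypothetical and only makes the partition reproducible:

```r
library(caret)

set.seed(123)                                             # hypothetical seed
idx   <- createDataPartition(heart$Target, p = 0.6, list = FALSE)
train <- heart[idx, ]                                     # 60% for model building
test  <- heart[-idx, ]                                    # 40% for testing the model
```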
31. Variable selection
Method 1: BACKWARD SELECTION
Backward selection (or backward elimination) starts with all predictors in the model, iteratively removes the least contributive predictors, and stops when you have a model where all predictors are statistically significant.
Using backward selection, the following variables are found to be statistically significant in our analysis; hence, we build a model from them (a sketch of this step follows the list):
• Sex
• Chest pain type
• Rest bp
• Ex induced angina
• ST depression
• Slope
• Vessels
• Defect
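The deck does not show its exact call, but one common way to run backward elimination in R is step(), which drops predictors by AIC rather than by p-value; a strictly p-value-based variant would instead refit after removing the least significant term each time. A sketch:

```r
full   <- glm(Target ~ ., data = train, family = binomial)
model1 <- step(full, direction = "backward", trace = FALSE)  # backward elimination
summary(model1)   # the retained predictors form Model 1
```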
33. Checking for multicollinearity
The variables should not be correlated amongst themselves in order to have better accuracy. We check the variance inflation factor (VIF) in order to check for multicollinearity. If the VIF of a variable is greater than 5, it indicates a possibility of multicollinearity; therefore we will remove any variable that has a VIF > 5, as sketched below. Note, however, that VIF can only be checked for the numeric predictors.
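A sketch using the car package on model1 from the backward-elimination step; for models containing factor predictors, car reports generalized VIFs:

```r
library(car)

vif(model1)  # values above 5 would signal possible multicollinearity
```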
34. Since none of the variables has a VIF > 5, we use all of them.
35. Method 2: random forest
Random forest, as its name implies, consists of a large number of individual decision trees that operate as an ensemble. Each individual tree in the random forest outputs a class prediction, and the class with the most votes becomes our model's prediction.
36. We applied random forest on our entire dataset in order to get the variable importance plot, as sketched below. The variable importance plot shows the mean decrease accuracy, which represents how much the accuracy of the model drops when each variable is removed. The higher the mean decrease accuracy or mean decrease Gini score, the more important the variable is in the model. Having the lowest mean decrease accuracy, resting bp, fasting blood sugar and resting ecg were eliminated from our model.
(Figure: variable importance plot)
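A sketch of this step with the randomForest package; importance = TRUE is needed for the mean decrease accuracy to be computed:

```r
library(randomForest)

set.seed(123)                                             # hypothetical seed
rf <- randomForest(Target ~ ., data = heart, importance = TRUE)
varImpPlot(rf)  # mean decrease accuracy and mean decrease Gini per variable
```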
37. Model 2:
With the help of the mean decrease accuracy, we build a logistic model using: age, sex, chest pain type, cholesterol, maximum heart rate, exercise induced angina, ST depression, slope, vessels and defect.
38. Accuracy and ROC curve
We compare the actual and predicted values of both models, and the results are as follows:
Model 1: Accuracy = 85.925%, Area under the curve (AUC) = 87.41%
Model 2: Accuracy = 79.33%, Area under the curve (AUC) = 86.52%
39. Result
Accuracy = correct predictions / total predictions.
The ROC analysis shows the area under the curve: the greater the area, the more reliable the model. Since both the accuracy and the area under the curve of Model 1 are higher, it is the better fit for analysis. A sketch of the evaluation follows.
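A sketch of the evaluation for Model 1 with the ROCR package, using a 0.5 probability cut-off on the test partition:

```r
library(ROCR)

probs <- predict(model1, newdata = test, type = "response")  # predicted probabilities
preds <- ifelse(probs > 0.5, 1, 0)
mean(preds == test$Target)                                   # accuracy = correct / total

pred_obj <- prediction(probs, test$Target)
plot(performance(pred_obj, "tpr", "fpr"))                    # ROC curve
performance(pred_obj, "auc")@y.values[[1]]                   # area under the curve
```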
40. Confusion matrix
Accuracy: the proportion of the total number of predictions that were correct, (TP + TN) / (TP + TN + FP + FN).
Positive Predictive Value or Precision: the proportion of predicted positive cases that were correct, TP / (TP + FP).
Negative Predictive Value: the proportion of predicted negative cases that were correct, TN / (TN + FN).
Sensitivity or Recall or True Positive Rate: the proportion of actual positive cases which are correctly identified, TP / (TP + FN).
Specificity: the proportion of actual negative cases which are correctly identified, TN / (TN + FP).
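All of these measures can be obtained in one call with caret's confusionMatrix(); a sketch, treating "1" (disease present) as the positive class:

```r
library(caret)

probs <- predict(model1, newdata = test, type = "response")
preds <- factor(ifelse(probs > 0.5, 1, 0), levels = c(0, 1))
confusionMatrix(preds, factor(test$Target, levels = c(0, 1)), positive = "1")
```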
42. Testing the significance of regressors
Null Hypothesis (H0): the explanatory variable does not affect the dependent variable.
Alternative Hypothesis (Ha): the explanatory variable affects the dependent variable.
We check the summary of the model built with the help of backward elimination in order to inspect the p-values of its coefficients. If p < 0.05 (5% level of significance), the null hypothesis is rejected. The p-values of sex, chest pain type, ST depression and vessels are less than 0.05; therefore, these variables should be kept in our final model.
43. Model 3:
The variables used in this model are also among the highest-rated variables as per the variable importance plot from the random forest. Therefore, with the help of the previous two models, we build our final model with the variables sex, chest pain type, ST depression and vessels. All four variables are highly significant with respect to their p-values.
45. Checking heteroscedasticity
We plot the fitted values against the residuals of the logistic model, as sketched below. Since the scatterplot is not funnel shaped, this model is free from heteroscedasticity.
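A sketch of that diagnostic for the fitted logistic model (model1 used here as an example):

```r
plot(fitted(model1), residuals(model1, type = "deviance"),
     xlab = "Fitted values", ylab = "Deviance residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)   # a funnel shape here would indicate heteroscedasticity
```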
47. Conclusion
This dataset is old and small by today's standards. However, it has allowed us
to create a simple model and use machine learning explainability tools and
techniques to peek inside.
At the start, we hypothesised, using research knowledge, that factors such as cholesterol, age and fasting blood sugar would be major factors in the model. However, this dataset didn't show that; instead, the number of major vessels and aspects of the ECG results dominated.
48. Insights
Sex: we observe that a larger proportion of the male population appears in heart check-ups; thus males are more prone to heart disease.
Chest pain type: chest pain types have a major role in predicting heart disease, as patients suffering from angina pains have a higher probability of suffering from heart disease. Typical angina pain is a major indicator of heart disease.
ST depression: the stress of the heart during exercise causes a higher risk of heart disease; thus, the higher the ST depression, the lower the risk.
Vessels: vessels refer to the fluoroscopy test done as an indicator of blood flow through the vessels. The darker the color of the fluoroscopy, the lower the risk of heart disease.