This document summarizes a research paper that proposes an ensemble learning approach for predicting early hospital mortality of intensive care unit (ICU) patients. It reviews previous work on mortality prediction, which had limitations in handling missing data, handling imbalanced data, and the lack of early prediction. The proposed method applies data mining techniques (Random Forest, Decision Trees, Naive Bayes and PART) to clinical variables collected in the first 6 hours of an ICU stay. Experiments using 10-fold cross-validation evaluate and compare these techniques against existing scoring models. The results show the ensemble approach achieves higher accuracy than previous methods in early hospital mortality prediction.
Early Hospital Mortality Prediction Using Ensemble Learning
1. A review on
Early Hospital Mortality Prediction of Intensive Care Unit
Patients Using an Ensemble Learning Approach
Authors: Awad, A., Bader-El-Den, M., McNicholas, J., & Briggs, J.
By
Reza Sadeghi
Sadeghi.2@wright.edu
Instructor
Dr. Romine
Nov 2017
2. Outline
• The importance of early mortality prediction
• Previous work pros and cons
• The proposed methods
• Experiments
• Conclusion
3. Early mortality prediction usage
• A decision support system for intensivists
• More accurate and quicker decisions bring greater benefit to patients and to
health care resources
• Improves the quality and cost of treatment
• ICU patients are closely monitored using electronic equipment
4. Previous work pros and cons
Score based models
- rely on panels of experts or statistical models
- low discrimination power
- many of the required attributes are not always available at ICU admission
- only a few models are designed for early mortality
Customized models
- perform better than traditional scores [1]
- refined for use within specified geographical areas
- patients may suffer from heterogeneous diseases, so selecting the proper
model is a hard job
Data mining models
- extract features from Electronic Medical Records
- higher performance than the previous methods
- descriptive modelling, as it explains hidden clinical implications
- the results of different algorithms on different data sets are not constant;
they depend on the population of interest, the variables measured and the
outcome being tested
5. Data mining models issues
Missing data types:
1. missing completely at random (MCAR)
2. missing at random (MAR)
3. missing not at random (MNAR) [2]
Missing data handling:
1. interpretation
2. ignoring incomplete records from the dataset
3. substituting the missing value with the mean or
mode value of each attribute
(the "ReplaceMissingValues" filter in Weka)
4. predicting the missing value with a learning algorithm,
such as Multiple Imputation or EMImputation [3]
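Option 3 (mean substitution) can be sketched in a few lines of plain Python; the numbers below are illustrative toy values, not MIMIC II data.

```python
def impute_mean(rows):
    """Replace missing values (None) in each column with that column's
    mean over the observed entries -- a sketch of what the
    "ReplaceMissingValues" filter in Weka does for numeric attributes."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]

# Toy example: temperature and heart rate columns with two gaps.
data = [[36.6, 80], [None, 90], [37.4, None]]
print(impute_mean(data))
```

Weka's filter uses the mode for nominal attributes; the sketch above covers only the numeric case.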
Imbalanced data handling:
1. re-sampling [4,5]: undersampling and
oversampling
2. making the classifier 'cost sensitive' [46]
3. hybrid methods
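Oversampling can be sketched with a simplified, SMOTE-like interpolation. Note that real SMOTE interpolates toward k-nearest neighbours rather than random pairs, so this is only an illustration of the idea of inserting synthetic minority samples.

```python
import random

def smote_like(minority, n_new, seed=0):
    """Create n_new synthetic minority samples by linearly interpolating
    between randomly chosen pairs of existing minority samples
    (a simplification of SMOTE, which uses k-nearest neighbours)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)     # pick a random pair
        t = rng.random()                   # interpolation factor in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic

minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
print(len(smote_like(minority, 4)))  # → 4
```

Each synthetic point lies on the segment between two real minority samples, so it stays inside the region the minority class already occupies.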
6. The structure of experiments
Data set:
MIMIC II
Age of patients:
patients at the age of 16 or above with a single ICU stay
Place of patients:
in either the Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery
Recovery Unit (CSRU)
Early hospital mortality prediction:
defined as predicting death inside the hospital using both the chart and lab-test
variables collected in the first 6 hours
Handling missing data:
attributes with coverage below 10% were ignored
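The 10% coverage cut-off can be sketched as a simple filter over record dictionaries; the attribute names below are hypothetical, not the paper's variable list.

```python
def drop_low_coverage(records, min_coverage=0.10):
    """Keep only attributes that are observed (non-None) in at least
    min_coverage of the records -- a sketch of the rule of ignoring
    attributes with coverage below 10%."""
    n = len(records)
    keys = {k for r in records for k in r}
    kept = {k for k in keys
            if sum(r.get(k) is not None for r in records) / n >= min_coverage}
    return [{k: v for k, v in r.items() if k in kept} for r in records]

# "rare_lab" is never measured here, so it falls below the 10% threshold.
records = [{"hr": 80.0, "rare_lab": None} for _ in range(10)]
print(sorted(drop_low_coverage(records)[0]))  # → ['hr']
```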
7. Selected attributes
Selected DM techniques:
Random Forest, Decision Trees, Naive Bayes, PART and ANN
Experiment strategy:
10 times 10-fold cross-validation
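The "10 times 10-fold" strategy can be sketched as repeated shuffled fold splits, with each repetition driven by a different seed.

```python
import random

def ten_fold_indices(n, seed=0):
    """Shuffle indices 0..n-1 and deal them into 10 folds; each fold in
    turn is held out as the test set while the other nine form the
    training set. Repeating with 10 different seeds gives the
    10 times 10-fold cross-validation used in the experiments."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::10] for i in range(10)]

folds = ten_fold_indices(100)
print(len(folds), len(folds[0]))  # → 10 10
```

The folds are disjoint and together cover every record, so each record is tested exactly once per repetition.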
The Photo originated from
https://articles.mercola.com/sites/articles/archive/2010/03/13/hospitals-now-kill-48000-in-us-per-year-up-nearly-500-percent.aspx
The Photo originated from https://www.statnews.com/2016/09/07/hospital-icu-modernize/
[1]-> [11]
1. interpretation: if an individual patient's record has multiple entries missing, this may be because the patient
was regarded as less sick than others and so was not prioritized. Equally, the patient may have been
regarded as extremely sick and died before much could be done. Distinguishing these cases is
not simple in the absence of other information.
[2] -> [42]
[3] -> [44]
[4,5] -> [42, 45]
6 hours: this interval was chosen after consulting several intensivists and considering gaps in the literature
They selected 33 chart attributes and 25 lab-tests from the initially identified attributes with high coverage.
Attributes with higher coverage were considered, resulting in a total of 20 unique variables; 29 if we count
maximums and minimums.
In experiment A, the dataset contained all 20 unique variables (29 if counting maximums and minimums)
In experiment B, the dataset contained only the eight unique vital signs variables (10 if counting maximums and minimums)
the mean (rep1)
EMImputation algorithm (rep2)
We conducted five different experiments. In experiments A and B, we evaluated each of the four data
mining algorithms on each of six versions of the dataset (second column of Table II):
1. original dataset (original),
2. dataset modified by applying the Synthetic Minority Oversampling Technique (SMOTE) [48],
an oversampling technique that increases the size of the minority class by inserting
synthetic data (original+smote),
3. dataset after replacing missing values with the mean (rep1),
4. dataset after replacing missing values with mean and then applying SMOTE (rep1+smote),
5. datasets after replacing missing values using the EMImputation algorithm (rep2),
6. dataset after replacing missing values using EMImputation algorithm and applying SMOTE (rep2+smote).
In the top 5, 10 and 15 experiments, we evaluated the algorithms on four versions of the dataset:
1. dataset with eliminated records and the 20 unique variables (original),
2. dataset with eliminated records and the 20 unique variables and then applying smote (original+smote),
3. dataset with eliminated records and the top filtered ranked variables only (filtered top, 5, 10 and 15),
and
4. dataset with eliminated records and the top filtered ranked variables only and then applying smote
(filter+smote).
Graphical model of the roles of different accuracy parameters
In general, in the top 5, 10 and 15 experiment categories, the models developed with the original attributes
(without any filtering) performed better than those with filtering. In the experiments without filtering, top
10 (original) and (original+smote) performed best (AUROC = 0.89 ± 0.02), followed by top 5 (original)
(AUROC = 0.86 ± 0.02). As for the filtered experiments, top 10 (filter) and (filter+smote) also performed
best (AUROC = 0.87 ± 0.03), followed by top 15 (filter) and (filter+smote) (AUROC = 0.82 ± 0.04).
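For reference, the AUROC values reported above can be computed from predicted risk scores and true outcomes using the pairwise (Mann-Whitney) formulation; the labels and scores below are toy values, not results from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case (label 1) receives a higher score than a randomly
    chosen negative case (label 0), counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # → 0.75
```

Averaging this value over the cross-validation runs yields the mean-and-spread figures quoted above.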