Medical Appointment No Shows
Presented by:
Team 2: Kayla Reinhart, Medha Tiwary, Janelle Manuel, Anvitha Ananth
Introduction & Problem
● About 30% of patients (roughly 100,000) at a public sector primary care medical facility in Brazil
missed their scheduled appointments between 5/2013 and 12/2015
● On average, a primary care visit costs about $200, which translates to approximately $20M in
missed revenue for the practice over 2.5 years
● Beyond missed revenue, assuming that each missed appointment costs $10 to accommodate,
no-shows could cost the facility nearly $1M in salaries and operational costs
● Aside from direct revenue costs, no-shows significantly affect care delivery, cost of care, and
resource planning. Delayed testing can put patients in danger, and missed screenings can
delay disease detection. Reducing no-show rates can help lower costs and
improve the quality of health care delivery.
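The dollar figures above follow from simple multiplication. A quick check, using only the numbers quoted on this slide (the variable names are ours, chosen for illustration):

```python
# Back-of-the-envelope check of the cost estimates quoted on the slide.
missed_appointments = 100_000     # "about 100,000" missed appointments
revenue_per_visit = 200           # average primary care visit, in dollars
handling_cost_per_no_show = 10    # assumed cost to accommodate one no-show

missed_revenue = missed_appointments * revenue_per_visit
handling_cost = missed_appointments * handling_cost_per_no_show

print(f"Missed revenue: ${missed_revenue:,}")
print(f"Handling cost:  ${handling_cost:,}")
```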
Objectives
● Evaluate the cause of missed
appointments and the impact of the input
features on no-shows
● Predict whether a patient is going to miss
an appointment
● Determine which factors have the largest
impact on no-show status
● Help doctors identify traits of patients who
are more likely to miss an appointment
and give recommendations on how to
combat high no-show rates
About the Dataset
- 300,000 observations
- 15 variables
(characteristics)
- Coded Gender, Day of
the Week, and Status
as numeric values
Input | Description | Details
Age | Patient’s Age | 0-95
Gender | Patient’s Gender | F = 1, M = 2
Appointment Registration | Date/Time Appointment was Made | Time and Date Stamp Provided
Appointment Data | Date/Time of Appointment | Time and Date Stamp Provided
Day of the Week | Day of the Week of Appointment | 1 = Sunday … 7 = Saturday
Status | Patient Showed or Didn’t Show | 0 = Show Up, 1 = No Show
Diabetes | Patient is a Diabetic | 0 = No, 1 = Yes
Alcoholism | Patient is an Alcoholic | 0 = No, 1 = Yes
Hypertension | Patient is Hypertensive | 0 = No, 1 = Yes
Handicap | Patient is Handicapped | 0 = No, 1 = Yes
Smoker | Patient is a Smoker | 0 = No, 1 = Yes
Welfare | Patient Receives Government Assistance | 0 = No, 1 = Yes
Tuberculosis | Patient had Tuberculosis | 0 = No, 1 = Yes
SMS Reminder | Patient was Sent a Text Reminder | 0 = No, 1 = Yes
Waiting Time | Duration (in days) Between Date Appt was Made and Date of Appt | Amount in negative days
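The Waiting Time input is described as a duration in days, stored as a negative amount. One way such a value could be derived from the two timestamps is sketched below. The timestamp format and the function name are assumptions made for illustration; the deck does not show how the field was actually computed.

```python
from datetime import datetime

def waiting_time_days(registered: str, appointment: str) -> int:
    """Registration date minus appointment date, in whole days.

    The result is zero or negative when the appointment was booked
    on or before the day it takes place.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format
    made = datetime.strptime(registered, fmt).date()
    due = datetime.strptime(appointment, fmt).date()
    return (made - due).days

# Hypothetical appointment booked in mid-December for mid-January.
print(waiting_time_days("2014-12-16T14:46:25Z", "2015-01-14T00:00:00Z"))
```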
Evaluating the Data
- Less than 3% of the observations contained missing values; these observations were
removed.
- The Age input had improbable values; we treated these as outliers and removed
them from the data set (194 observations, <1%)
- The “STATUS” target showed a clear class imbalance:
  - Majority Class: Status = 0 (Show), 70% of the data
  - Minority Class: Status = 1 (No Show), 30% of the data
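The cleaning steps and the class-balance check described above can be sketched as follows. The records are made up, the field names are hypothetical, and the 0-95 cutoff is an assumption borrowed from the Age range listed in the dataset table; the slides do not state which ages were considered improbable.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 34, "status": 0},
    {"age": None, "status": 1},  # missing value, dropped
    {"age": 113, "status": 0},   # improbable age, dropped
    {"age": 7, "status": 1},
    {"age": 61, "status": 0},
    {"age": 45, "status": 0},
]

def is_clean(row, min_age=0, max_age=95):
    """Keep rows with no missing fields and an age inside the assumed range."""
    if any(value is None for value in row.values()):
        return False
    return min_age <= row["age"] <= max_age

clean = [r for r in rows if is_clean(r)]
counts = Counter(r["status"] for r in clean)
shares = {status: n / len(clean) for status, n in counts.items()}
print(len(clean), shares)
```

A large gap between the two shares is the signal that accuracy alone will be a poor yardstick for any model trained on the data.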
Preparing the Data
- Removed the Appointment Registration
and Appointment Data features, since
these values were summarized in the
Waiting Time feature.
- Using regression, we found that Age,
Alcoholism, Hypertension, SMS
Reminder, Day of the Week, and
Waiting Time were significant variables.
These are the variables we decided to
use as inputs for our models.
Class Imbalance Strategy #1: SMOTE
- Performed SMOTE to bring the training dataset to 60,000 observations and correct
the class imbalance.
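For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class points by interpolating between an existing minority point and one of its nearest minority neighbors. The sketch below is a minimal pure-Python illustration of that idea, not the tool or settings used for this project (the slides do not name them). The function name, the toy data and the choice of k are all assumptions.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Toy minority class with two features, e.g. (age, waiting time); values are made up.
no_shows = [(25, -30), (31, -12), (47, -45), (52, -8), (38, -21), (29, -60)]
new_points = smote(no_shows, n_new=4, k=3)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority points, oversampling this way adds no information that the original features did not already carry.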
Class Imbalance Strategy #1: SMOTE
- Implemented a Random Forest model and
scored the data with a randomly sampled
15,000-observation test data set.
- The error matrix showed that even after
using SMOTE to correct for the class
imbalance, the model was unable to
predict any of the no-shows in the test
data.
- The model's roughly 70% accuracy is misleading: it comes
entirely from predicting that every patient shows up
Error matrix (rows = actual, columns = predicted):
Actual \ Predicted | 0 | 1
0 | 10480 | 0
1 | 4520 | 0
Error Rate: 30%
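The error matrix can be turned into the usual summary metrics to see why accuracy alone is misleading here. The sketch below uses only the four counts from the slide; the variable names follow the common TN/FP/FN/TP convention with "no-show" treated as the positive class.

```python
# Error matrix from the slide: rows are actual classes, columns are predictions.
tn, fp = 10480, 0   # actual 0 (show)
fn, tp = 4520, 0    # actual 1 (no-show)

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
error_rate = (fp + fn) / total
recall = tp / (tp + fn) if (tp + fn) else 0.0
# Precision is undefined when nothing is predicted positive; reported as None.
precision = tp / (tp + fp) if (tp + fp) else None

print(f"accuracy={accuracy:.3f} error={error_rate:.3f} "
      f"recall={recall:.3f} precision={precision}")
```

A recall of zero means the model never flags a no-show, so it is equivalent to always predicting the majority class.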
Class Imbalance Strategy #2: Cost-Sensitive Learning
- Randomly sampled 60,000 observations, or
20% of the dataset
- Used a 70/0/30 partition
- Applied a loss matrix of 0, 20, 50, 0
- The error rate increased, but the number of correctly
predicted no-shows increased as well
- In this case, recall is high (.73) but precision is
low (.33), which drives down the F1 measure (.46)
and makes the 52% accuracy rate
unacceptable
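Cost-sensitive learning can be read as a change of decision threshold. The sketch below is an illustration only: the slide does not say which cell of the loss matrix maps to which error, so it assumes that a missed no-show (false negative) costs 50 and a false alarm (false positive) costs 20. The function name and the probabilities are hypothetical.

```python
def flag_no_show(p_no_show, cost_fn=50.0, cost_fp=20.0):
    """Return True when predicting 'no-show' has the lower expected loss."""
    # Predicting "show" is wrong with probability p_no_show (a missed no-show).
    loss_if_predict_show = p_no_show * cost_fn
    # Predicting "no-show" is wrong with probability 1 - p_no_show (a false alarm).
    loss_if_predict_no_show = (1 - p_no_show) * cost_fp
    return loss_if_predict_no_show < loss_if_predict_show

# Equivalent probability cutoff under the assumed costs.
threshold = 20.0 / (20.0 + 50.0)

for p in (0.2, 0.3, 0.5):
    print(p, flag_no_show(p))
```

Under these assumed costs the cutoff drops from 0.5 to roughly 0.29, so more patients are flagged as likely no-shows. That pattern is consistent with the higher recall and lower precision reported on this slide.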
Class Imbalance Strategy #3: Ignore It
- Performed Random
Forest on the 60,000-observation sample
with a 70/0/30
partition
- Accuracy improved, but
recall (.06) and F1 (.10)
dropped sharply compared with Strategy #2,
while precision (.40) remained low
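The F1 measure quoted for Strategies #2 and #3 is the harmonic mean of precision and recall. A short check using the rounded figures from the slides (the function name is ours):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

cost_sensitive = f1_score(0.33, 0.73)    # Strategy #2 figures
ignore_imbalance = f1_score(0.40, 0.06)  # Strategy #3 figures
print(round(cost_sensitive, 2), round(ignore_imbalance, 2))
```

Computed from the rounded inputs, Strategy #2 comes out near .45, which agrees with the reported .46 to within rounding. Strategy #3 comes out near .10, matching the slide.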
Conclusions and Recommendations
● After trying multiple models with multiple parameter settings, we were unable to find any
features within the given dataset that significantly predicted missed appointments
● We suspect that the inputs provided are not predictive of no-shows
● We recommend that the facility gather additional information about its
patients when appointments are made, such as:
○ Job Type (Full-time, Part-time, unemployed, etc.)
○ New vs. Existing Patient
○ Reason for Appointment
○ Recency of Symptoms
○ Severity of Symptoms
○ Insurance Coverage
○ Distance from Facility
○ Means of Transportation
Thank You
Appendix 1: Multiple regression
Business Analytics with R - Using Data Mining Techniques
