Wisconsin hospital - Healthcare Cost Prediction

The project aims to predict healthcare costs and compare the predictions against actual data from a US survey of hospital costs. The dataset analyzed is a sample dataset used for educational purposes only.

PROJECT NAME:
HEALTHCARE COST ANALYSIS
Trainer Name: Deepti Miyan Gupta
Submitted By: Prasann Prem

BUSINESS PROBLEM:
A nationwide survey of hospital costs conducted by the US Agency for Healthcare consists of hospital records of inpatient samples. This analysis uses the data records of the hospital in Wisconsin, and our main aim is to predict the healthcare cost.

DATA OVERVIEW:
• Dataset dimensions:
  Total number of rows: 151
  Total number of columns: 6
• Dependent variable:
  TOTCHG – Actual total healthcare cost
• Independent variables:
  AGE: Values range from 0 to 17
  FEMALE: ‘1’ if the patient is female, else ‘0’
  LOS: Length of stay, ranging from 0 to 41
  RACE: 6 unique race categories, coded 1 to 6
  APRDRG: All Patient Refined Diagnosis Related Groups
* According to a survey

BUSINESS SOLUTION:
• Since the data has multiple independent variables and the dependent variable (TOTCHG) takes continuous values, we will use the Multiple Linear Regression (MLR) algorithm.
• We will split the dataset in a 70:30 ratio, train the model on 70% of the data, and predict the healthcare cost on the remaining 30%.
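The split-and-fit step described above can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' actual code (the slides mention an `lm` model, so the original was presumably written in R). The rows are synthetic stand-ins for the real records, and the helper names `train_test_split`, `solve`, `fit_mlr`, and `predict` are hypothetical.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(features, target):
    """Ordinary least squares via the normal equations; returns [intercept, slopes...]."""
    X = [[1.0] + list(f) for f in features]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, target)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, features):
    """Apply the fitted coefficients to new feature rows."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], f)) for f in features]

# Synthetic data (an assumption, NOT the survey records): 151 rows of (AGE, LOS, TOTCHG).
rng = random.Random(0)
rows = []
for _ in range(151):
    age = rng.randint(0, 17)
    los = rng.randint(0, 41)
    totchg = 1000 + 50 * age + 700 * los + rng.gauss(0, 100)
    rows.append((age, los, totchg))

train, test = train_test_split(rows)           # 70:30 split
beta = fit_mlr([r[:2] for r in train], [r[2] for r in train])
predtest = predict(beta, [r[:2] for r in test])
print(len(train), len(test))
```

With 151 rows and truncation of the 70% cut-off, this gives 105 training rows and 46 test rows. A statistics library would normally replace the hand-written solver; it is spelled out here only to keep the sketch self-contained.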

MODEL INTERPRETATION:
• Setting the confidence level at 95% (significance level 0.05), we looked for variables with p-value < 0.05 to identify the significant variables.
• We found that only AGE, LOS and APRDRG have p-values less than 0.05, and thus these are our significant variables.
• Therefore, we will rebuild our model using only these significant variables.
• Also, looking at the slope (Estimate) values, we found the following relations between the independent and dependent variables:
1. AGE and LOS have positive slopes: TOTCHG increases as they increase
2. FEMALE, RACE & APRDRG have negative slopes: TOTCHG decreases as they increase
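For reference, the significance check above corresponds to the usual per-coefficient t-test in linear regression. The formulation below is a sketch under two assumptions not stated in the slides: each variable enters the model as a single numeric term (which the single slope reported per variable suggests), and n denotes the number of training rows, so the full model has 6 coefficients including the intercept.

```latex
\[
\mathrm{TOTCHG} = \beta_0 + \beta_1\,\mathrm{AGE} + \beta_2\,\mathrm{FEMALE}
  + \beta_3\,\mathrm{LOS} + \beta_4\,\mathrm{RACE} + \beta_5\,\mathrm{APRDRG} + \varepsilon
\]
\[
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad
p_j = 2\,P\!\left(T_{n-6} > |t_j|\right), \qquad
\text{keep variable } j \text{ if } p_j < 0.05
\]
```

Here T with subscript n-6 is a Student t variable with n-6 degrees of freedom, and the sign of each estimated slope gives the direction of the relation listed above.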

MODEL RE-BUILDING:
• Using only the significant variables from our last model, we re-built the lm model and found that all the variables have p-values less than 0.05.
• Thus, the model built with these variables is our final model for predicting healthcare cost.
• Looking at the slope (Estimate) values, we found the following relations between the independent and dependent variables:
1. AGE and LOS have positive slopes: TOTCHG increases as they increase
2. APRDRG has a negative slope: TOTCHG decreases as it increases
• Also, we got an R-squared value of 0.4434. This means our current data and independent variables explain only 44.34% of the variance in the dependent variable.
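As a reminder of what the R-squared figure measures, here is a minimal sketch of the standard definition, one minus the ratio of residual to total sum of squares. The function name `r_squared` and the toy numbers are illustrative assumptions, not values from the project.

```python
def r_squared(actual, predicted):
    """R-squared = 1 - SS_res / SS_tot, the share of variance explained."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Toy values for illustration only (not project data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(actual, predicted)
print(round(r2, 4))
```

A value of 0.4434 therefore means that a little over half of the variation in TOTCHG is left unexplained by AGE, LOS and APRDRG. The slides do not say whether the figure was computed on the training or the test portion; a model summary would normally report it for the training data.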

PREDICTION RESULTS:
• Based on our final model, we predicted the total healthcare cost (predtest) on the test data and calculated the residuals and the mean squared error (MSE).
• Our MSE turned out to be 3825548.
• Also, we plotted the actual healthcare cost (TOTCHG) against the predicted healthcare cost (predtest) to see the trend between them.
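The error calculation can be sketched as below. Only the figure 3825548 comes from the slides; the function name `mse` and the toy values are assumptions for illustration. Because MSE is expressed in squared cost units, the sketch also takes the square root of the reported value (RMSE) to put the typical error back on the scale of TOTCHG.

```python
import math

def mse(actual, predicted):
    """Mean squared error: the average of the squared residuals."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(r ** 2 for r in residuals) / len(residuals)

# Toy values for illustration only (not project data).
toy_mse = mse([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])

# RMSE implied by the MSE reported in the slides.
reported_mse = 3825548
reported_rmse = math.sqrt(reported_mse)
print(round(reported_rmse, 1))  # about 1955.9, in the units of TOTCHG
```

Whether an error of that size is large depends on the typical magnitude of TOTCHG, which the slides do not report.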

CONCLUSION:
• Our MSE equals 3825548, which is high and indicates low accuracy in our predicted results.
• The plotted graph also shows the gap between the actual (TOTCHG) and predicted (predtest) healthcare cost.
• We also calculated the R-squared value, which turned out to be only 44.34%.
Thus, we can conclude that our current model is insufficient to predict healthcare cost accurately. This is because our dataset is very small and the current variables cannot fully explain our dependent variable. With a richer dataset having more features and information, we are likely to get a better result with the same model.

THANK YOU
Prasann Prem
prasannprem@live.com
+91-8006869552
DATA ANALYST
