© 2016
A Practical Approach to Analyzing Healthcare Data
Chapter 6 – Analyzing the Relationship between Two Variables
Categorical Variables
• Descriptive Statistics
– Contingency tables are used to display and analyze the relationship between two categorical variables
– In the example contingency table of gender by discharge status:
• 20/32 = 62.5% of female patients were discharged home
• 10/24 = 41.7% of male patients were discharged home
• Inferential Statistics
– Is this difference just a random occurrence, or is it evidence of a significant relationship between gender and being discharged home?
– A hypothesis test may be used to answer that question
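As a minimal sketch, the two percentages above can be reproduced from the cell counts. The not-discharged-home counts (12 and 14) are inferred from the totals quoted on the slide, and the dictionary layout and the names `table` and `proportion_home` are illustrative assumptions.

```python
# Counts implied by the slide: 20 of 32 females and 10 of 24 males went home.
# The "not_home" counts are derived here as total minus home.
table = {
    "female": {"home": 20, "not_home": 32 - 20},
    "male": {"home": 10, "not_home": 24 - 10},
}


def proportion_home(row):
    """Share of a row's patients who were discharged home."""
    total = row["home"] + row["not_home"]
    return row["home"] / total


home_rates = {gender: proportion_home(row) for gender, row in table.items()}

for gender, rate in home_rates.items():
    print(f"{gender}: {rate:.1%} discharged home")
```

Comparing row percentages like this is only descriptive. Whether the gap is larger than chance would produce is the question for the hypothesis test.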
Example: Chi-squared Test of Independence
Step and response:
1. Determine the null and alternative hypotheses
– H0: Discharged to Home and Gender are independent
– H1: Discharged to Home and Gender are not independent
2. Set the acceptable type I error or alpha level
– The analyst is willing to accept a 5% chance of rejecting the null hypothesis when it is true. Alpha = 5% or 0.05
3. Select the appropriate test statistic
– Chi-squared
Example: Chi-squared Test of
Independence
• Test statistics typically compare the value observed in the
sample to the null hypothesis value.
• If gender and discharged home were independent, then we would expect the subjects to be distributed among the four cells (male/female x home/not home) in proportion to the row and column totals, with no further pattern.
• In other words, the proportion of males sent home should be
similar to the proportion of the females sent home if the null
hypothesis were indeed true.
• The basis of the chi-squared test statistic is the observed and
expected frequencies in each of the table cells
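Example: Chi-squared Test of Independence
• Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O and E are the observed and expected frequencies in each table cell and the sum runs over all four cells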
Example: Chi-squared Test of
Independence
• Last two steps in hypothesis test:
4. Compare the test statistic to a critical value based on the alpha level and the distribution of the test statistic
5. Reject the null hypothesis if the test statistic is more extreme than the critical value. If not, do not reject the null hypothesis.
• The chi-squared test statistic follows the chi-squared distribution with (r-1)x(c-1) degrees of freedom, where r = rows in the contingency table and c = columns
– The chi-squared distribution is always non-negative
– The degrees of freedom define the shape
– Here the table is 2x2, so there is (2-1)x(2-1) = 1 degree of freedom
• Since alpha was set to 0.05 (5%), reject H0 if the test statistic is greater than 3.841
– χ² = 2.39, which is not greater than 3.841
– Do not reject H0
• Conclusion: The sample data do not provide sufficient evidence to reject H0, so we cannot conclude that there is a significant relationship between gender and the likelihood of being discharged to the home setting
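The five steps can be traced in a short sketch. It assumes the observed counts implied by the earlier percentages (20 and 12 for females, 10 and 14 for males), uses no continuity correction, and takes the critical value 3.841 from the slide rather than computing it. The function name `chi_squared` is illustrative.

```python
# Observed counts: rows = gender (female, male), columns = (home, not home).
# The "not home" counts are implied by the row totals of 32 and 24.
observed = [[20, 12], [10, 14]]


def chi_squared(observed):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: row total x column total / n
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


statistic, dof = chi_squared(observed)
CRITICAL_VALUE = 3.841  # from the slide: alpha = 0.05 with 1 degree of freedom
reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-squared = {statistic:.2f}, d.f. = {dof}, reject H0: {reject_h0}")
```

The expected counts come from the row and column totals, which is why the statistic measures departure from independence and not from equal cell sizes.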
Sensitivity and Specificity
• Measure the accuracy of predictions made by categorical variables
• Used when one categorical variable (e.g., smoking status) is used to predict another categorical variable (e.g., cancer status)
• Sensitivity – the number in the sample with the indicator present and a positive test, divided by the number of those with the indicator present
• Specificity – the number in the sample without the indicator and with a negative test, divided by the number of those without the indicator
Sensitivity/Specificity Example
A health plan wishes to use accessing their patient portal as a predictor of
whether or not a patient will seek care at an emergency room during the year.
That is, they believe that patients that do not access the patient portal are more
likely to experience an ER visit. They collected the following data based on
enrollees during the previous plan year. Calculate the sensitivity and specificity
of patient portal use as a predictor of ER use.
Note that the contingency table is set up so that ‘no’ for patient portal access
and ‘yes’ for ER visit are in cell ‘A’ (upper left hand corner). This is because the
health plan believes that patients that do not use the patient portal are MORE
likely to experience an ER visit.
                         ER Visit During Previous Year?
Patient Portal Access?   Yes    No
No                       30     23
Yes                      15     86
Sensitivity/Specificity Example
                         ER Visit During Previous Year?
Patient Portal Access?   Yes      No
No                       A: 30    B: 23
Yes                      C: 15    D: 86
$\text{Sensitivity} = \frac{A}{A + C} = \frac{30}{30 + 15} = \frac{30}{45} = 0.667 = 66.7\%$
$\text{Specificity} = \frac{D}{D + B} = \frac{86}{86 + 23} = \frac{86}{109} = 0.789 = 78.9\%$
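The same arithmetic can be written as two small functions, shown here as a sketch. The cell labels A to D follow the table above, and the function names are illustrative.

```python
def sensitivity(a, c):
    """Cell A over everyone who had the outcome (A + C)."""
    return a / (a + c)


def specificity(b, d):
    """Cell D over everyone who did not have the outcome (B + D)."""
    return d / (b + d)


# Cells from the slide: A = 30, B = 23, C = 15, D = 86
a, b, c, d = 30, 23, 15, 86
sens = sensitivity(a, c)
spec = specificity(b, d)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Both measures are column-wise ratios: sensitivity uses only the ER-visit "Yes" column and specificity only the "No" column.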
Descriptive Statistics - Correlation
• Pearson’s correlation coefficient (r)
– Measures the linear association between two continuous
variables
• Spearman’s Rho (ρ)
– Measures the monotonic association between two ordinal variables or one ordinal and one continuous variable, computed as the linear association between their ranks
• Correlation between two variables does not imply causation –
only that the two have a relationship or are ‘associated’
• Be aware that correlation measures the linear association of
two variables
– They may be related in a non-linear way that may result in
misleading values for the correlation coefficients
Descriptive Statistics –
Pearson’s Correlation Coefficient
• Used for measuring the linear association between
two continuous variables
• Values from -1 to +1
• Positive value means that both variables
increase/decrease together
– Example: Charges and length of stay
• Negative value means that one variable increases
as the other decreases
– Example: Experience and time to code a medical
record
Descriptive Statistics –
Pearson’s Correlation Coefficient
• Example of negative correlation
– More experienced coders require less time to
code records – in general
Descriptive Statistics –
Pearson’s Correlation Coefficient
• Example of positive correlation
– Longer lengths of stay are associated with higher charges – in general
Descriptive Statistics –
Pearson’s Correlation Coefficient Example
• $r = \frac{65{,}754}{\sqrt{14.80 \times 336{,}460{,}939}} = 0.93$
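A sketch of the calculation follows. It assumes the three numbers on the slide are the sum of cross-products of deviations (65,754), the sum of squared deviations of length of stay (14.80), and the sum of squared deviations of charges (336,460,939). That reading is an assumption, although it is the one that reproduces 0.93. The function `pearson_r` is an illustrative helper for raw data.

```python
import math


def pearson_r(x, y):
    """Pearson's r from paired raw values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Sums quoted on the slide (interpretation assumed, see the note above)
r_slide = 65_754 / math.sqrt(14.80 * 336_460_939)
print(f"r from the slide's sums = {r_slide:.2f}")

# Perfectly linear toy data, so r is 1 up to floating-point rounding
print(pearson_r([1, 2, 3], [2, 4, 6]))
```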
Descriptive Statistics –
Spearman’s Rho Correlation Coefficient
• Used for measuring the monotonic association between two ordinal variables or an ordinal and a continuous variable
• Operates on the ranks for the paired values and not the actual variable
values
– Typically rank ties are broken with average ranks
• Values from -1 to +1
• Positive value means that both variables increase/decrease together
– Example: patient severity level and charges
• Negative value means that one variable increases as the other decreases
– Example: Grade in elementary school and time to run 100 yards
• Same formula as Pearson’s r, but use ranks instead of actual values
• If there are no ties in the ranks, may use $\rho = 1 - \frac{6 \sum D_i^2}{n(n^2 - 1)}$, where $D_i$ is the difference between the ranks of the ith pair of values and n is the sample size
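The rank-based calculation can be sketched as below. The severity and charge values are hypothetical, and the helper names are illustrative. The shortcut that uses the rank differences applies only when there are no ties. With ties, average ranks would be fed into Pearson's formula instead.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_no_ties(x, y):
    """Shortcut formula, valid only when neither variable has tied ranks."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: patient severity level (ordinal) and charges
severity = [1, 2, 3, 4, 5]
charges = [9000, 15000, 12000, 30000, 41000]
rho = spearman_no_ties(severity, charges)
print(f"Spearman's rho = {rho:.2f}")
```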
Inferential Statistics –
T-test for correlations
• Used to test the null hypothesis that the correlation
coefficient is zero
• Same formula for both Pearson’s and Spearman’s
correlation coefficients
• Note that the sample size is in the numerator of the test statistic
• For very large samples, the test may reject the
hypothesis of 0 correlation when the value of the
sample correlation is not practically significant
Inferential Statistics –
T-test for correlations - Example
• Test the hypothesis that the correlation between length of stay and charges in the previous example is different from zero.
• Step 1: State the null and alternative hypotheses
– Ho: r ≤ 0
– Ha: r > 0
– Note: In practice, a one sided test of significance is
used for r. If the sample value is > 0, then the
alternative hypothesis is ‘>0’. If the sample value is
negative, then the alternative hypothesis is ‘<0’.
• Step 2: Set the acceptable alpha level = 0.05
Inferential Statistics –
T-test for correlations - Example
• Step 3: Determine the test statistic and calculate the value
– T-test for correlations
– $t = r \times \sqrt{\frac{n - 2}{1 - r^2}} = 0.93 \times \sqrt{\frac{5 - 2}{1 - 0.93^2}} = 0.93 \times 4.71 = 4.38$
• Step 4: Compare the test statistic to the critical value
– The critical value from the t-distribution with d.f. = n-2 = 3 and alpha = 0.05 is 2.353
– t = 4.38 > 2.353
• Step 5: Reject the null hypothesis since 4.38 > 2.353 and conclude that the correlation between LOS and charge is greater than zero
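The test statistic can be checked with a few lines, as a sketch. It uses r rounded to 0.93 and n = 5 from the example, and it takes the one-sided critical value 2.353 from the slide rather than computing it. The square-root factor alone is about 4.71, and multiplying it by r = 0.93 gives about 4.38.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_stat = correlation_t(0.93, 5)
CRITICAL_VALUE = 2.353  # from the slide: one-sided, alpha = 0.05, 3 d.f.
reject_h0 = t_stat > CRITICAL_VALUE
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Because n - 2 sits under the square root in the numerator, the same r produces a larger t as the sample grows, which is why very large samples can reject H0 for correlations too small to matter in practice.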
Inferential Statistics
Simple Linear Regression
• Used to formulate a functional relationship between two
continuous variables
• A linear function of the independent variable (X) is estimated
to predict values of the dependent variable (Y)
• Slope-intercept form of a line:
– Y = a + bX
– a is the y-intercept
– b is the slope of the line
• If variables are positively correlated, the slope of the line is
positive
• If variables are negatively correlated, the slope of the line is
negative
Inferential Statistics
Simple Linear Regression - Example
• Least squares regression
– Minimizes the sum of the squared vertical distances from each point to the line
– Each vertical distance is called the ‘error’ or ‘residual’
• The least squares line comes as close as possible to all points, but may not actually pass through any of them
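A minimal sketch of a least squares fit follows. The length-of-stay and charge values are hypothetical, since the slide's data points are not reproduced in the text, and `least_squares` is an illustrative name.

```python
def least_squares(x, y):
    """Return (intercept a, slope b) of the least squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: length of stay in days and charges in dollars
los = [1, 2, 3, 4, 5]
charges = [12000, 17000, 21000, 26000, 30000]
intercept, slope = least_squares(los, charges)
print(f"charge = {intercept:.0f} + {slope:.0f} x LOS")
```

None of the five hypothetical points falls exactly on the fitted line, which illustrates that the line is a best compromise and need not pass through any observation.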
Inferential Statistics
Simple Linear Regression - Example
• Slope of line is 4,443
– Interpretation: The expected charge increase for each additional day is
$4,443
• Intercept of line is $7,801
– Interpretation: The expected charge with a zero day stay is $7,801
– Zero stay is not realistic, but intercept gives an estimate of the fixed cost
of admitting a patient while the slope represents the variable cost.
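The interpretation of the slope and intercept can be made concrete with the fitted line from the slide. This is a sketch, and `expected_charge` is an illustrative name.

```python
INTERCEPT = 7801  # dollars, from the slide
SLOPE = 4443      # dollars per additional day, from the slide


def expected_charge(length_of_stay_days):
    """Expected charge from the fitted line Y = a + bX."""
    return INTERCEPT + SLOPE * length_of_stay_days


for days in (0, 1, 3):
    print(f"{days} day(s): ${expected_charge(days):,}")
```

Each extra day adds the slope to the prediction, and a stay of zero days returns the intercept.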
Inferential Statistics
Simple Linear Regression - Example
• Multiple R = Pearson’s r
• R Square = Pearson’s r squared
• R Square estimates the proportion of variance in the dependent variable explained by the independent variable
• The t stat and p-value are for testing whether the intercept and slope are different from zero
• Note: If the p-value is less than alpha, then reject the null hypothesis
Coefficient of Determination
• In simple linear regression (one independent variable)
– Multiple R is the Pearson’s Correlation Coefficient value for
the correlation
– R Square is also called the coefficient of determination
– The coefficient of determination measures the proportion of variance in the dependent variable that is explained by the independent variable
– In our example, 87% of the variance in charge is explained by length of stay
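The 87% figure can be reproduced as a sketch from the three sums in the earlier Pearson example. It assumes those numbers are the cross-product sum and the two sums of squared deviations. Squaring the rounded r = 0.93 gives about 0.86, so the unrounded sums are used here.

```python
# Sums from the Pearson example (their interpretation is an assumption)
sxy = 65_754        # sum of cross-products of deviations
sxx = 14.80         # sum of squared deviations, length of stay
syy = 336_460_939   # sum of squared deviations, charges

r_squared = sxy ** 2 / (sxx * syy)
print(f"R Square = {r_squared:.2f}")
```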
Regression Hypothesis Tests
• Two hypothesis tests are presented in this table
– H0: Intercept = 0 vs H1: Intercept ≠ 0
• P-value = 0.121 is greater than alpha = 0.05, so do not reject H0
• Even though the intercept is not statistically different from zero (do not reject the null hypothesis that it is equal to zero), the intercept is typically kept in the model
– H0: Slope = 0 vs H1: Slope ≠ 0
• P-value = 0.021 is less than alpha = 0.05, so reject H0 and conclude that the slope is not equal to zero
• The interpretation here is that LOS gives us useful information about the charge since the slope of the regression line is non-zero
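The decision rule for both tests can be sketched as a comparison of each p-value with alpha. Alpha = 0.05 is assumed here because it is the level used in the earlier examples. The function name `decision` is illustrative.

```python
ALPHA = 0.05  # assumed, matching the alpha level used in the earlier examples


def decision(p_value, alpha=ALPHA):
    """Reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# p-values reported on the slide for the two regression coefficients
p_values = {"intercept": 0.121, "slope": 0.021}
decisions = {term: decision(p) for term, p in p_values.items()}

for term, outcome in decisions.items():
    print(f"{term}: p = {p_values[term]} -> {outcome}")
```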
Regression Assumptions
• Residuals
– Difference between the actual value of the dependent
variable and the value predicted using the regression
equation
– The vertical (y-axis) distance from an individual point
to the regression line
• Must test the following assumptions regarding the
residuals:
– Independence
– Normally distributed
– Mean of zero
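Residuals can be computed directly from a fitted line, as in the sketch below. The data are hypothetical and the function names are illustrative. For a least squares line that includes an intercept, the residuals average to zero by construction. Independence and normality need separate checks, such as residual plots, which this sketch does not cover.

```python
def fit_line(x, y):
    """Least squares intercept and slope for Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Actual value minus the value predicted by the fitted line."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: length of stay in days and charges in dollars
los = [1, 2, 3, 4, 5]
charges = [12000, 17000, 21000, 26000, 30000]
res = residuals(los, charges)
mean_residual = sum(res) / len(res)
print("residuals:", [round(e) for e in res])
print("mean residual:", round(mean_residual, 6))
```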

More Related Content

What's hot

Basics of Hypothesis testing for Pharmacy
Basics of Hypothesis testing for PharmacyBasics of Hypothesis testing for Pharmacy
Basics of Hypothesis testing for PharmacyParag Shah
 
3. parametric assumptions
3. parametric assumptions3. parametric assumptions
3. parametric assumptionsSteve Saffhill
 
T12 non-parametric tests
T12 non-parametric testsT12 non-parametric tests
T12 non-parametric testskompellark
 
Chi square tests using SPSS
Chi square tests using SPSSChi square tests using SPSS
Chi square tests using SPSSParag Shah
 
Chi square tests using spss
Chi square tests using spssChi square tests using spss
Chi square tests using spssParag Shah
 
Statistical inference: Estimation
Statistical inference: EstimationStatistical inference: Estimation
Statistical inference: EstimationParag Shah
 
Chi square Test Using SPSS
Chi square Test Using SPSSChi square Test Using SPSS
Chi square Test Using SPSSDr Athar Khan
 

What's hot (14)

Basics of Hypothesis testing for Pharmacy
Basics of Hypothesis testing for PharmacyBasics of Hypothesis testing for Pharmacy
Basics of Hypothesis testing for Pharmacy
 
3. parametric assumptions
3. parametric assumptions3. parametric assumptions
3. parametric assumptions
 
Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics Nonparametric and Distribution- Free Statistics
Nonparametric and Distribution- Free Statistics
 
T12 non-parametric tests
T12 non-parametric testsT12 non-parametric tests
T12 non-parametric tests
 
Tests of significance
Tests of significanceTests of significance
Tests of significance
 
Significance test
Significance testSignificance test
Significance test
 
Chi square
Chi squareChi square
Chi square
 
Chi square tests using SPSS
Chi square tests using SPSSChi square tests using SPSS
Chi square tests using SPSS
 
Chi square tests using spss
Chi square tests using spssChi square tests using spss
Chi square tests using spss
 
Statistical inference: Estimation
Statistical inference: EstimationStatistical inference: Estimation
Statistical inference: Estimation
 
Environmental statistics
Environmental statisticsEnvironmental statistics
Environmental statistics
 
Non parametric tests
Non parametric testsNon parametric tests
Non parametric tests
 
Chi square test
Chi square testChi square test
Chi square test
 
Chi square Test Using SPSS
Chi square Test Using SPSSChi square Test Using SPSS
Chi square Test Using SPSS
 

Similar to Hm306 week 5

Chapter 13 Data Analysis Inferential Methods and Analysis of Time Series
Chapter 13 Data Analysis Inferential Methods and Analysis of Time SeriesChapter 13 Data Analysis Inferential Methods and Analysis of Time Series
Chapter 13 Data Analysis Inferential Methods and Analysis of Time SeriesInternational advisers
 
hypothesis teesting
 hypothesis teesting hypothesis teesting
hypothesis teestingkpgandhi
 
Testing of Hypothesis combined with tests.pdf
Testing of Hypothesis combined with tests.pdfTesting of Hypothesis combined with tests.pdf
Testing of Hypothesis combined with tests.pdfRamBk5
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfDikshathawait
 
Non-parametric tests:correlation.pptx
Non-parametric tests:correlation.pptxNon-parametric tests:correlation.pptx
Non-parametric tests:correlation.pptxOshinBhatia2
 
Research method ch09 statistical methods 3 estimation np
Research method ch09 statistical methods 3 estimation npResearch method ch09 statistical methods 3 estimation np
Research method ch09 statistical methods 3 estimation npnaranbatn
 
Statistical data handling
Statistical data handling Statistical data handling
Statistical data handling Rohan Jagdale
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis TestingJeremy Lane
 
Introduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyIntroduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyKern Rocke
 
Chapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraChapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraAADITYA TANTIA
 
Quantitative_analysis.ppt
Quantitative_analysis.pptQuantitative_analysis.ppt
Quantitative_analysis.pptmousaderhem1
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesisJunaid Ijaz
 

Similar to Hm306 week 5 (20)

Statistical analysis in SPSS_
Statistical analysis in SPSS_ Statistical analysis in SPSS_
Statistical analysis in SPSS_
 
Hm306 week 4
Hm306 week 4Hm306 week 4
Hm306 week 4
 
BBA 020
BBA 020BBA 020
BBA 020
 
Chapter 13 Data Analysis Inferential Methods and Analysis of Time Series
Chapter 13 Data Analysis Inferential Methods and Analysis of Time SeriesChapter 13 Data Analysis Inferential Methods and Analysis of Time Series
Chapter 13 Data Analysis Inferential Methods and Analysis of Time Series
 
hypothesis teesting
 hypothesis teesting hypothesis teesting
hypothesis teesting
 
Testing of Hypothesis combined with tests.pdf
Testing of Hypothesis combined with tests.pdfTesting of Hypothesis combined with tests.pdf
Testing of Hypothesis combined with tests.pdf
 
UNIT 5.pptx
UNIT 5.pptxUNIT 5.pptx
UNIT 5.pptx
 
Hypothsis testing
Hypothsis testingHypothsis testing
Hypothsis testing
 
Analysis of Variance
Analysis of VarianceAnalysis of Variance
Analysis of Variance
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdf
 
Non-parametric tests:correlation.pptx
Non-parametric tests:correlation.pptxNon-parametric tests:correlation.pptx
Non-parametric tests:correlation.pptx
 
Research method ch09 statistical methods 3 estimation np
Research method ch09 statistical methods 3 estimation npResearch method ch09 statistical methods 3 estimation np
Research method ch09 statistical methods 3 estimation np
 
Validity andreliability
Validity andreliabilityValidity andreliability
Validity andreliability
 
Statistical data handling
Statistical data handling Statistical data handling
Statistical data handling
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Introduction to Data Management in Human Ecology
Introduction to Data Management in Human EcologyIntroduction to Data Management in Human Ecology
Introduction to Data Management in Human Ecology
 
Chapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research MalhotraChapter 15 Marketing Research Malhotra
Chapter 15 Marketing Research Malhotra
 
DSE-2, ANALYTICAL METHODS.pptx
DSE-2, ANALYTICAL METHODS.pptxDSE-2, ANALYTICAL METHODS.pptx
DSE-2, ANALYTICAL METHODS.pptx
 
Quantitative_analysis.ppt
Quantitative_analysis.pptQuantitative_analysis.ppt
Quantitative_analysis.ppt
 
Estimation and hypothesis
Estimation and hypothesisEstimation and hypothesis
Estimation and hypothesis
 

More from BealCollegeOnline (20)

BA650 Week 3 Chapter 3 "Why Change? contemporary drivers and pressures
BA650 Week 3 Chapter 3 "Why Change? contemporary drivers and pressuresBA650 Week 3 Chapter 3 "Why Change? contemporary drivers and pressures
BA650 Week 3 Chapter 3 "Why Change? contemporary drivers and pressures
 
BIO420 Chapter 25
BIO420 Chapter 25BIO420 Chapter 25
BIO420 Chapter 25
 
BIO420 Chapter 24
BIO420 Chapter 24BIO420 Chapter 24
BIO420 Chapter 24
 
BIO420 Chapter 23
BIO420 Chapter 23BIO420 Chapter 23
BIO420 Chapter 23
 
BIO420 Chapter 20
BIO420 Chapter 20BIO420 Chapter 20
BIO420 Chapter 20
 
BIO420 Chapter 18
BIO420 Chapter 18BIO420 Chapter 18
BIO420 Chapter 18
 
BIO420 Chapter 17
BIO420 Chapter 17BIO420 Chapter 17
BIO420 Chapter 17
 
BIO420 Chapter 16
BIO420 Chapter 16BIO420 Chapter 16
BIO420 Chapter 16
 
BIO420 Chapter 13
BIO420 Chapter 13BIO420 Chapter 13
BIO420 Chapter 13
 
BIO420 Chapter 12
BIO420 Chapter 12BIO420 Chapter 12
BIO420 Chapter 12
 
BIO420 Chapter 09
BIO420 Chapter 09BIO420 Chapter 09
BIO420 Chapter 09
 
BIO420 Chapter 08
BIO420 Chapter 08BIO420 Chapter 08
BIO420 Chapter 08
 
BIO420 Chapter 06
BIO420 Chapter 06BIO420 Chapter 06
BIO420 Chapter 06
 
BIO420 Chapter 05
BIO420 Chapter 05BIO420 Chapter 05
BIO420 Chapter 05
 
BIO420 Chapter 04
BIO420 Chapter 04BIO420 Chapter 04
BIO420 Chapter 04
 
BIO420 Chapter 03
BIO420 Chapter 03BIO420 Chapter 03
BIO420 Chapter 03
 
BIO420 Chapter 01
BIO420 Chapter 01BIO420 Chapter 01
BIO420 Chapter 01
 
BA350 Katz esb 6e_chap018_ppt
BA350 Katz esb 6e_chap018_pptBA350 Katz esb 6e_chap018_ppt
BA350 Katz esb 6e_chap018_ppt
 
BA350 Katz esb 6e_chap017_ppt
BA350 Katz esb 6e_chap017_pptBA350 Katz esb 6e_chap017_ppt
BA350 Katz esb 6e_chap017_ppt
 
BA350 Katz esb 6e_chap016_ppt
BA350 Katz esb 6e_chap016_pptBA350 Katz esb 6e_chap016_ppt
BA350 Katz esb 6e_chap016_ppt
 

Recently uploaded

Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 

Recently uploaded (20)

Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 

Hm306 week 5

  • 1. © 2016© 2016 A Practical Approach to Analyzing Healthcare Data Chapter 6 – Analyzing the Relationship between Two Variables
  • 2. © 2016 Categorical Variables • Descriptive Statistics – Contingency tables – Used to display and analyze the relationship between two categorical variables – Notice in table below: • 20/32 = 62.5% of female patients were discharged home • 10/24 = 41.7% of male patients were discharged home • Inferential Statistics – Is this just a random occurrence or is this evidence that there is a significant relationship between gender and being discharged to home? – An hypothesis test may be used to answer that question
  • 3. © 2016 Example: Chi-squared Test of Independence Step Response 1. Determine the null and alternative hypotheses Ho: Discharged to Home and Gender are independent H1: Discharged to Home and Gender are not independent 2. Set the acceptable type I error or alpha level The analyst is willing to accept a 5% chance or probability of rejecting the null hypothesis when it is true. Alpha = 5% or 0.05 3. Select the appropriate test statistic Chi-squared
  • 4. © 2016 Example: Chi-squared Test of Independence • Test statistics typically compare the value observed in the sample to the null hypothesis value. • If gender and discharged home were independent, then we would expect the distribution of subjects among the four cells (Male/female x home/not home) to be uniform and not have a pattern. • In other words, the proportion of males sent home should be similar to the proportion of the females sent home if the null hypothesis were indeed true. • The basis of the chi-squared test statistic is the observed and expected frequencies in each of the table cells
  • 5. © 2016 Example: Chi-squared Test of Independence
  • 6. © 2016 Example: Chi-squared Test of Independence Test statistic:
  • 7. © 2016 Example: Chi-squared Test of Independence • Last two steps in hypothesis test: 4. Compare the test statistic to a critical value based on the alpha level and the distribution of the test statistic 5. Reject the null hypothesis if the test statistic is more extreme than the critical value. If not, do not reject the null hypothesis. • Chi-squared test statistic follows the Chi-squared distribution with (r-1)x(c-1) degrees of freedom. r = rows in contingency table and c = columns – Chi-squared distribution is always non-negative – Degrees of freedom define the shape • Since alpha was set to be 0.05 (5%), reject H0 if the test statistic is greater than 3.841 – X2 = 2.39 which is not greater than 3.841 – Do not reject H0 • Conclusion: The sample data does not provide sufficient evidence to reject H0 and conclude that there is no significant relationship between gender and the likelihood being discharged to the home setting
  • 8. © 2016 Sensitivity and Specificity • Measures the accuracy of predictions made by categorical variables • When using one categorical variable (smoking status) to predict another categorical variable (cancer status) • Sensitivity – proportion of sample with the indicator present and a positive test divided by the number of those with an indicator present. • Specificity – the proportion of the sample without the indicator and a negative test divided by the number of those without an indicator
  • 9. © 2016 Sensitivity/Specificity Example A health plan wishes to use accessing their patient portal as a predictor of whether or not a patient will seek care at an emergency room during the year. That is, they believe that patients that do not access the patient portal are more likely to experience an ER visit. They collected the following data based on enrollees during the previous plan year. Calculate the sensitivity and specificity of patient portal use as a predictor of ER use. Note that the contingency table is set up so that ‘no’ for patient portal access and ‘yes’ for ER visit are in cell ‘A’ (upper left hand corner). This is because the health plan believes that patients that do not use the patient portal are MORE likely to experience an ER visit. ER Visit During Previous Year? Patient Portal Access? Yes No No 30 23 Yes 15 86
  • 10. © 2016 Sensitivity/Specificity Example ER Visit During Previous Year? Patient Portal Access? Yes No No A: 30 B: 23 Yes C: 15 D: 86 𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = 𝐴 𝐴 + 𝐶 = 30 30 + 15 = 30 45 = 0.667 = 66.7% 𝑆𝑝𝑒𝑐𝑖𝑓𝑖𝑐𝑖𝑡𝑦 = 𝐷 𝐷 + 𝐵 = 86 86 + 23 = 86 109 = 0.789 = 78.9%
  • 11. © 2016 Descriptive Statistics - Correlation • Pearson’s correlation coefficient (r) – Measures the linear association between two continuous variables • Spearman’s Rho (r) – Measures the linear association between two ordinal variables or one ordinal and one continuous variable • Correlation between two variables does not imply causation – only that the two have a relationship or are ‘associated’ • Be aware that correlation measures the linear association of two variables – They may be related in a non-linear way that may result in misleading values for the correlation coefficients
  • 12. © 2016 Descriptive Statistics – Pearson’s Correlation Coefficient • Used for measuring the linear association between two continuous variables • Values from -1 to +1 • Positive value means that both variables increase/decrease together – Example: Charges and length of stay • Negative value means that one variable increases as the other decreases – Example: Experience and time to code a medical record
  • 13. © 2016 Descriptive Statistics – Pearson’s Correlation Coefficient • Example of negative correlation – More experienced coders require less time to code records – in general
  • 14. © 2016 Descriptive Statistics – Pearson’s Correlation Coefficient • Example of positive correlation – Longer lengths of stay result in higher charges – in general
  • 15. © 2016 Descriptive Statistics – Pearson’s Correlation Coefficient Example • $r = \frac{65{,}754}{\sqrt{14.80 \times 336{,}460{,}939}} = 0.93$
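The arithmetic can be checked with a small Python sketch. The slide does not label its three numbers, so the reading used here is an assumption: 65,754 is taken as the sum of cross-products of deviations for length of stay and charges, 14.80 as the sum of squared deviations for length of stay, and 336,460,939 as the sum of squared deviations for charges. The name `pearson_r_from_sums` is illustrative.

```python
import math


def pearson_r_from_sums(sxy, sxx, syy):
    """Pearson's r from deviation sums: r = Sxy / sqrt(Sxx * Syy)."""
    return sxy / math.sqrt(sxx * syy)


# Values from the slide; which sum is which is an assumption (see lead-in).
r = pearson_r_from_sums(sxy=65_754, sxx=14.80, syy=336_460_939)
print(round(r, 2))
```

The square root must cover the product of both sums of squares. Applying it to only one of them would not reproduce the slide's 0.93.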
  • 16. © 2016 Descriptive Statistics – Spearman’s Rho Correlation Coefficient • Used for measuring the association between the ranks of two ordinal variables or an ordinal and a continuous variable • Operates on the ranks for the paired values and not the actual variable values – Typically rank ties are broken with average ranks • Values from -1 to +1 • Positive value means that both variables increase/decrease together – Example: patient severity level and charges • Negative value means that one variable increases as the other decreases – Example: Grade in elementary school and time to run 100 yards • Same formula as Pearson’s r, but use ranks instead of actual values • If there are no ties in the ranks, may use (where $D_i$ is the difference between the ranks of the ith pair of variables and n is the sample size): $\rho = 1 - \frac{6 \sum D_i^2}{n(n^2 - 1)}$
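Both routes to Spearman's rho can be sketched in Python: Pearson's formula applied to average ranks, and the standard no-ties shortcut 1 - 6ΣD²/(n(n² - 1)). The severity and charge values below are hypothetical illustration data, not from the slides, and all function names are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's r computed from deviation sums."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's formula applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_shortcut(x, y):
    """No-ties shortcut: 1 - 6 * sum(D_i^2) / (n * (n^2 - 1))."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical data: severity level (ordinal) and charges (continuous), no ties.
severity = [1, 2, 3, 4, 5]
charges = [5200, 4800, 9100, 12500, 11000]

rho = spearman_rho(severity, charges)
rho_short = spearman_shortcut(severity, charges)
print(rho, rho_short)
```

With no ties the two functions agree. When ties are present, only the rank-based Pearson version should be relied on.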
  • 17. © 2016 Inferential Statistics – T-test for correlations • Used to test the null hypothesis that the correlation coefficient is zero • Same formula for both Pearson’s and Spearman’s correlation coefficients • Note that the sample size is in the numerator of the test statistic • For very large samples, the test may reject the hypothesis of 0 correlation when the value of the sample correlation is not practically significant
  • 18. © 2016 Inferential Statistics – T-test for correlations - Example • Test the hypothesis that the correlation between length of stay and charges in the previous example is different from zero. • Step 1: State the null and alternative hypotheses – Ho: r ≤ 0 – Ha: r > 0 – Note: In practice, a one sided test of significance is used for r. If the sample value is > 0, then the alternative hypothesis is ‘>0’. If the sample value is negative, then the alternative hypothesis is ‘<0’. • Step 2: Set the acceptable alpha level = 0.05
  • 19. © 2016 Inferential Statistics – T-test for correlations - Example • Step 3: Determine the test statistic and calculate the value – T-test for correlations – $t = r \times \sqrt{\frac{n - 2}{1 - r^2}} = 0.93 \times \sqrt{\frac{5 - 2}{1 - 0.93^2}} \approx 4.38$ • Step 4: Compare the test statistic to the critical value – The critical value from the t-distribution with d.f. = n − 2 = 3 and one-sided alpha = 0.05 is 2.353 – t = 4.38 > 2.353 • Step 5: Reject the null hypothesis since 4.38 > 2.353 and conclude that the correlation between LOS and charge is greater than zero
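Steps 3 to 5 can be sketched in Python as follows. The sketch takes r = 0.93, n = 5 and the critical value 2.353 directly from the slide. The function name `correlation_t` is an illustrative assumption.

```python
import math


def correlation_t(r, n):
    """t statistic for testing a zero correlation: t = r * sqrt((n - 2) / (1 - r^2))."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.93, 5            # sample correlation and sample size from the slide
critical_value = 2.353    # t-distribution, d.f. = n - 2 = 3, one-sided alpha = 0.05

t = correlation_t(r, n)
reject_null = t > critical_value
print(f"t = {t:.2f}, reject Ho: {reject_null}")
```

Because n − 2 sits under the square root in the numerator, the same r yields a larger t as the sample grows, which is the practical-significance caution raised on the earlier slide.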
  • 20. © 2016 Inferential Statistics Simple Linear Regression • Used to formulate a functional relationship between two continuous variables • A linear function of the independent variable (X) is estimated to predict values of the dependent variable (Y) • Slope-intercept form of a line: – Y = a + bX – a is the y-intercept – b is the slope of the line • If variables are positively correlated, the slope of the line is positive • If variables are negatively correlated, the slope of the line is negative
  • 21. © 2016 Inferential Statistics Simple Linear Regression - Example • Least squares regression – Minimizes the sum of the squared vertical distances from each point to the line – Vertical distance called the ‘error’ or ‘residual’ • Least squares line provides a line that comes as close as possible to all points, but may not actually intersect with any of them
  • 22. © 2016 Inferential Statistics Simple Linear Regression - Example • Slope of line is 4,443 – Interpretation: The expected charge increase for each additional day is $4,443 • Intercept of line is $7,801 – Interpretation: The expected charge with a zero day stay is $7,801 – Zero stay is not realistic, but intercept gives an estimate of the fixed cost of admitting a patient while the slope represents the variable cost.
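The fitted line can be used for prediction, as in the Python sketch below. The intercept and slope are the slide's values. The function name `predicted_charge` and the 3-day stay are illustrative assumptions.

```python
INTERCEPT = 7_801  # expected charge at a zero-day stay (fixed cost), from the slide
SLOPE = 4_443      # expected charge increase per additional day (variable cost), from the slide


def predicted_charge(los_days):
    """Predicted charge in dollars for a stay of los_days, using Y = a + bX."""
    return INTERCEPT + SLOPE * los_days


charge_3_days = predicted_charge(3)
extra_day = predicted_charge(4) - predicted_charge(3)
print(charge_3_days, extra_day)
```

The difference between any two consecutive days always equals the slope, which is the interpretation the slide gives for $4,443.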
  • 23. © 2016 Inferential Statistics Simple Linear Regression - Example • Multiple R = Pearson’s r • R Square = Pearson’s r squared • R Square estimates the proportion of variance in the dependent variable explained by the independent variable • T stat and p-value test whether the intercept and slope are different from zero • Note: If p-value is less than alpha, then reject null hypothesis
  • 24. © 2016 Coefficient of Determination • In simple linear regression (one independent variable) – Multiple R is the Pearson’s Correlation Coefficient value for the correlation – R Square is also called the coefficient of determination – The coefficient of determination measures the proportion of variance in the dependent variable that is explained by the independent variable – In our example, 87% of the variance in charge is explained by length of stay
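The 87% figure can be traced back to the Pearson example with a short Python sketch. It assumes the three numbers on that slide are the deviation sums Sxy = 65,754, Sxx = 14.80 (length of stay) and Syy = 336,460,939 (charges). That labeling is an inference, not stated in the deck, and `r_squared_from_sums` is an illustrative name.

```python
def r_squared_from_sums(sxy, sxx, syy):
    """Coefficient of determination in simple regression: R^2 = Sxy^2 / (Sxx * Syy)."""
    return sxy ** 2 / (sxx * syy)


# Deviation sums as read from the Pearson example (assumed labeling).
r2 = r_squared_from_sums(sxy=65_754, sxx=14.80, syy=336_460_939)
print(f"R Square = {r2:.3f}")
```

Squaring the rounded r = 0.93 gives about 0.865. Working from the unrounded sums gives a value that rounds to 87%, matching the slide.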
  • 25. © 2016 Regression Hypothesis Tests • Two hypothesis tests are presented in this table – Ho: Intercept = 0 vs H1: Intercept ≠ 0 • P-value = 0.121 > 0.05, so do not reject Ho • Even though the intercept is not statistically different from zero (do not reject the null hypothesis that it is equal to zero), the intercept is typically kept in the model – Ho: Slope = 0 vs H1: Slope ≠ 0 • P-value = 0.021 < 0.05, so reject Ho and conclude that the slope is not equal to zero • The interpretation here is that LOS gives us useful information about the charge since the slope of the regression line is non-zero
  • 26. © 2016 Regression Assumptions • Residuals – Difference between the actual value of the dependent variable and the value predicted using the regression equation – The vertical (y-axis) distance from an individual point to the regression line • Must test the following assumptions regarding the residuals: – Independence – Normally distributed – Mean of zero
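Residuals can be computed with a minimal Python sketch like the one below. The length-of-stay and charge values are hypothetical illustration data, not the deck's data set, and the function names are assumptions.

```python
def fit_line(x, y):
    """Least squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def residuals(x, y, a, b):
    """Actual value minus predicted value for each observation."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: length of stay in days and charge in dollars.
los = [1, 2, 3, 4, 5]
charge = [12_000, 17_500, 20_000, 26_500, 30_000]

a, b = fit_line(los, charge)
res = residuals(los, charge, a, b)
mean_residual = sum(res) / len(res)
print(a, b, res, mean_residual)
```

A least squares line with an intercept produces residuals that average to zero by construction, so that check mainly confirms the fit was computed correctly. Independence and normality are usually examined separately, for example with residual plots.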