Hm306 week 4

© 2016
A Practical Approach to Analyzing
Healthcare Data
Chapter 5 – Analyzing
Continuous Variables

Continuous Variables
• Data elements that represent naturally
numeric values and can take any of infinitely many values
– Interval (no true zero)
– Ratio (true zero)
• Healthcare Examples
– Length of stay
– Charge
– Systolic blood pressure
– Age
– Time to code records

Descriptive Statistics
Measures of Central Tendency
• Mean
– Arithmetic average
– Sum of values divided by the number of
values
• Median
– Middle value
– If even number of values, average of two
middle values
– Less influenced by extreme values or
outliers than the mean
• Mode
– Most frequent value

Measures of Variation
• Range
– Maximum value minus minimum value
• Interquartile range
– Difference between the third and first quartile
• Variance
– Average squared deviation from the mean
– Unit of measure is “squared units”
• Standard deviation
– Square root of the variance
– Unit of measure is same as unit of measure
in sample

Example
• Calculate the mean, median and mode of
the following sample length of stay data:
– 2, 4, 6, 3, 1, 2, 5
• Mean: $\bar{x} = \frac{2+4+6+3+1+2+5}{7} = \frac{23}{7} \approx 3.3$
• Median
– Sort values: 1, 2, 2, 3, 4, 5, 6
– Median or middle value = 3
• Mode
– 2, since it is the most frequent value
– Note: The mode is rarely used for continuous
variables that have many unique values and
is presented here for demonstration
purposes.
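
The three measures above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Sample length-of-stay (LOS) values from the slide
los = [2, 4, 6, 3, 1, 2, 5]

mean_los = statistics.mean(los)      # sum of values / number of values = 23 / 7
median_los = statistics.median(los)  # middle value of the sorted data
mode_los = statistics.mode(los)      # most frequent value

print(round(mean_los, 1), median_los, mode_los)  # prints: 3.3 3 2
```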

Example
• Calculate the range, variance and
standard deviation of the following
sample length of stay data:
– 2, 4, 6, 3, 1, 2, 5
• Range = 6 – 1 = 5
• Sample variance
$s^2 = \frac{(2-3.3)^2 + (4-3.3)^2 + (6-3.3)^2 + (3-3.3)^2 + (1-3.3)^2 + (2-3.3)^2 + (5-3.3)^2}{7-1}$
$s^2 = \frac{(-1.3)^2 + (0.7)^2 + (2.7)^2 + (-0.3)^2 + (-2.3)^2 + (-1.3)^2 + (1.7)^2}{6} \approx 3.2$
• Standard deviation
– $s = \sqrt{3.2} \approx 1.8$
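
The range, sample variance and standard deviation for the same data can be verified as follows. This is a minimal sketch with the standard `statistics` module, whose `variance` and `stdev` functions use the n − 1 denominator shown above; the variable names are illustrative.

```python
import statistics

# Sample length-of-stay (LOS) values from the slide
los = [2, 4, 6, 3, 1, 2, 5]

los_range = max(los) - min(los)        # maximum minus minimum
sample_var = statistics.variance(los)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(los)      # square root of the sample variance

print(los_range, round(sample_var, 1), round(sample_sd, 1))  # prints: 5 3.2 1.8
```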

Review: Hypothesis Testing Steps
1. Determine the null and alternative hypotheses
2. Set the acceptable type I error or alpha level
3. Select the appropriate test statistic
4. Compare the test statistic to a critical value
based on the alpha level and the distribution of
the test statistic
5. Reject the null hypothesis if the test statistic is
more extreme than the critical value. If not, do
not reject the null hypothesis.

Inferential Statistics
One Sample t-test
• One-sample t-test
– Used to test if a population mean is different from a
standard or benchmark
– Test statistic: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, where $\mu_0$ is the null hypothesis value
– Compare to a t-distribution to determine critical value
– May be one sided or two sided
– Anatomy of test statistic:
• Numerator: distance from sample mean to null hypothesis
value
• Denominator: standard error of the sample mean (SEM)

One Sample t-test - Example
Suppose the researcher that collected the length of stay
(LOS) data in the previous examples would like to
determine if the population LOS is longer than a
standard of 3 days.
• Step 1: Determine the null and alternative hypotheses
– Ho: µ ≤ 3
– Ha: µ > 3
• Step 2: Set the acceptable type 1 error rate (AKA
alpha level).
– Set α = 0.05
• Step 3: Select the appropriate test statistic: t-test

One Sample t-test - Example
• Step 3 (cont.)
– Recall from previous slides:
• $\bar{x} = 3.3$
• s = 1.8
• n = 7
• $t = \frac{3.3 - 3}{1.8 / \sqrt{7}} = \frac{0.3}{0.68} = 0.44$
• Step 4: Compare test statistic to critical value.
– T-test statistic critical value comes from the t-
distribution with n-1 degrees of freedom
– T-distribution is symmetric around zero much like
standard normal (bell curve); width is defined by the
degrees of freedom. (see Figure 5.1 in text)

One Sample t-test - Example
• Step 4 (cont.): t = 0.44; df = n – 1 = 7 – 1 = 6; one-sided
test at α = 0.05; critical value = 1.943
• Step 5: Reject the null hypothesis if the test statistic
is more extreme than the critical value. 0.44 is not
greater than 1.943, so do not reject the null hypothesis.
There is not enough evidence to conclude that the
population LOS is longer than the standard.
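
The one-sample t-test steps can be sketched in a few lines. This is an illustrative example, not a full implementation: the function name `one_sample_t` is hypothetical, the inputs are the rounded summary values from the slides, and the critical value 1.943 is copied from the slide rather than computed.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t = (sample mean - null value) / standard error of the mean."""
    sem = s / math.sqrt(n)
    return (xbar - mu0) / sem

# Rounded summary values from the slides
t_stat = one_sample_t(xbar=3.3, mu0=3, s=1.8, n=7)

t_critical = 1.943  # from the slide: one-sided, alpha = 0.05, df = 6
reject_null = t_stat > t_critical

print(round(t_stat, 2), reject_null)  # prints: 0.44 False
```

Using the unrounded mean and standard deviation gives a slightly smaller t statistic (about 0.42), which leads to the same decision.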

Confidence Interval for Population Mean
• Recall that a confidence interval is a range of values
that is likely to cover the true population value with a
pre-defined probability or level of confidence
• A 100(1-α)% confidence interval for the population mean
is centered at the sample mean and has a width that
depends on the confidence level and the standard
error of the mean
– Higher level of confidence requires a wider interval
– Larger sample size results in a narrower interval
– Width of confidence interval is a measure of the
precision of the estimate of the population mean
• A narrower interval is more precise

Confidence Interval for Population Mean
Formulate a 95% confidence interval for the LOS data
presented in the previous example:
• $\bar{x} = 3.3$
• s = 1.8
• n = 7
• 95% CI, so α = 0.05; α/2 = 0.025
• df = 6
• Critical value (table 5.1) = 2.447
• 95% CI: $3.3 \pm 2.447 \times \frac{1.8}{\sqrt{7}} = 3.3 \pm 1.7 = (1.6, 5.0)$
• We are 95% confident that the range 1.6 to 5.0 days
includes the true population mean LOS
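
The interval arithmetic can be reproduced as below. This is a minimal sketch: the helper name `mean_confidence_interval` is hypothetical, and the critical value 2.447 is taken from the slide (table 5.1), not computed.

```python
import math

def mean_confidence_interval(xbar, s, n, t_critical):
    """Return (lower, upper) = xbar +/- t_critical * s / sqrt(n)."""
    margin = t_critical * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Rounded summary values and critical value from the slides (df = 6, alpha/2 = 0.025)
lower, upper = mean_confidence_interval(xbar=3.3, s=1.8, n=7, t_critical=2.447)

print(round(lower, 1), round(upper, 1))  # prints: 1.6 5.0
```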

Paired t-test
• Paired t-test
– Used to compare pre/post test population values or matched
pairs
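– Test statistic: $t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$
• Where d = difference between the pre/post values or the pairs, $\bar{d}$ and $s_d$ = mean and standard deviation of the differences, and $D_0$ = null hypothesis value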
– Compare to a t-distribution to determine critical value
– May be one sided or two sided
– Anatomy of test statistic:
• Numerator: distance from sample mean difference to null hypothesis
value (usually zero)
• Denominator: standard error of the sample mean difference (SEM)

Paired t-test – Example
The transition from ICD-9 to ICD-10
is predicted to cause an increase in
the amount of time required to code
medical records. A pilot study was
conducted using a random sample
of 10 records to determine if the
time required was significantly
different. Each record was coded
using the two coding systems by one
coder. The values are recorded in
the table.
• Step 1: Determine the null and
alternative hypotheses:
– Ho: D = 0
– Ha: D ≠ 0
• Step 2: Set the alpha level: 0.01
ID   ICD-9 Time   ICD-10 Time   d (ICD-10 − ICD-9)
1    10           15            5
2    11           12            1
3    15           10            -5
4    30           36            6
5    5            7             2
6    10           13            3
7    8            5             -3
8    11           19            8
9    21           19            -2
10   18           23            5

Paired t-test – Example
• Step 3: Select the appropriate test statistic: paired t-test
• Step 4: Compare the test statistic to the critical value
– $\bar{d} = 2.00$; $s_d = 4.24$
– $t = \frac{2.00 - 0}{4.24 / \sqrt{10}} = 1.49$
– Compare to t distribution with df = 9, α/2 = 0.005
– 1.49 is not greater than 3.25
• Step 5: Do not reject Ho
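
The paired t-test calculation can be reproduced directly from the coding times in the table. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 3.25 is copied from the slide rather than computed.

```python
import math
import statistics

# Coding times from the table, one pair per record
icd9 = [10, 11, 15, 30, 5, 10, 8, 11, 21, 18]
icd10 = [15, 12, 10, 36, 7, 13, 5, 19, 19, 23]

# d = ICD-10 time minus ICD-9 time for each record
d = [after - before for before, after in zip(icd9, icd10)]

d_bar = statistics.mean(d)   # mean difference
s_d = statistics.stdev(d)    # sample standard deviation of the differences
t_stat = (d_bar - 0) / (s_d / math.sqrt(len(d)))

t_critical = 3.25  # from the slide: two-sided, alpha = 0.01, df = 9
reject_null = abs(t_stat) > t_critical

print(d_bar, round(s_d, 2), round(t_stat, 2), reject_null)  # prints: 2 4.24 1.49 False
```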

Two Sample t-test
• Used to test if two population means are different
• Test statistic is complex
– Denominator is standard error pooled across the two
samples
– Use statistical software to calculate
• Compare to a t-distribution to determine critical value
• May be one sided or two sided
• Anatomy of test statistic:
– Numerator: distance between the two sample means
– Denominator: pooled standard error of the difference
between the two sample means

Two Sample t-test - Example
An analyst wanted to determine if the length of stay was different for
Hip Replacement (MS-DRG 470) patients that were sent home
versus those that were discharged to another setting. A random
sample was selected and the results are presented in the table
below.
• Step 1: State hypotheses:
– Ho: µ1= µ2
– Ha: µ1≠ µ2
• Step 2: Set the alpha level = 0.01
• Step 3: Determine the test statistic
– T-test
– Use Excel Data Analysis ToolPak to calculate
– Run Two sample test for variances to determine if equal variances can be
assumed
Home?   Sample Size   Ave LOS   Std. Dev. LOS
No      94            3.38      0.86
Yes     47            4.87      1.50

• Since F = 0.33 < F Critical = 0.56, reject the null
hypothesis that variances are equal
• Must run two-sample t-test without assuming equal
variances
• Note: The Excel version of this test reports the F statistic
as a variance ratio that can be less than 1, unlike the
Levene’s test found elsewhere. In this version, look for a test
statistic (F) less than the critical value.

• Step 4: Compare test statistic to critical value
– Note: Excel will only give the positive value of the critical value. Recall
that the t-distribution is symmetric, so reject if t stat is < -2.66 OR >
+2.66
– T stat = -6.32 < -2.66
• Step 5: Reject null hypothesis if test statistic is more extreme
than critical value
– Reject the null hypothesis and conclude that patients that are discharged
to home have longer stays than those discharged to another setting
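
The variance ratio and the unequal-variance (Welch) t statistic can be approximated from the summary table alone. This is a rough sketch, assuming the Excel output corresponds to the standard Welch formulas; the function name `welch_t` is hypothetical, and the critical values 0.56 and 2.66 are copied from the slides. Because the table values are rounded, the result (about -6.31) differs slightly from the -6.32 reported by Excel.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and degrees of freedom without assuming equal variances."""
    v1 = sd1 ** 2 / n1  # squared standard error of sample 1
    v2 = sd2 ** 2 / n2  # squared standard error of sample 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from the table (sample 1 = "No", sample 2 = "Yes")
f_ratio = 0.86 ** 2 / 1.50 ** 2  # ratio of the two sample variances
t_stat, df = welch_t(3.38, 0.86, 94, 4.87, 1.50, 47)

t_critical = 2.66  # from the slide: two-sided, alpha = 0.01
reject_null = abs(t_stat) > t_critical

print(round(f_ratio, 2), round(t_stat, 2), reject_null)  # prints: 0.33 -6.31 True
```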

ANOVA
• Used to test if more than two population
means are different
• Test statistic: F-test
– Best to use software to compute
• Compare to an F-distribution to determine
critical value
• Anatomy of test statistic:
– Numerator: variance between comparison
groups
– Denominator: variance within comparison groups

ANOVA - Example
An analyst is studying variation in charges
among AMI patients that are discharged alive
with MCC (MS-DRG 280), CC (MS-DRG
281) and no CC (MS-DRG 282). A sample of
25 patients is selected from each MS-DRG.
The sample statistics are presented in the
table below:

ANOVA - Example
• Step 1: State the hypotheses
– Ho: µ280= µ281= µ282
– Ha: At least two of the population means are unequal
• Step 2: Set the acceptable error level: α=0.05
• Step 3: Determine the appropriate test statistic
– F-test
– Calculate using software
• Step 4: Compare test statistic to critical value
– F = 2.784 < 3.124
• Step 5: Do not reject Ho since F < critical value
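
The F statistic itself is the ratio of between-group to within-group variance. The sketch below shows that calculation for a one-way ANOVA; the function name `one_way_anova_f` is hypothetical, and the three small groups are made-up illustrative numbers, not the MS-DRG charge data (which are not reproduced in these slides).

```python
import statistics

def one_way_anova_f(groups):
    """Return F = (variance between groups) / (variance within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.mean(g) for g in groups]

    # Between-group sum of squares: group size times squared distance to grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared distance of each value to its group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)       # numerator of F
    ms_within = ss_within / (n_total - k)   # denominator of F
    return ms_between / ms_within

# Hypothetical data for illustration only
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = one_way_anova_f(hypothetical_groups)
print(f_stat)  # prints: 3.0
```

The resulting F would then be compared to a critical value from the F-distribution with k − 1 and N − k degrees of freedom, as in Step 4.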

ANOVA - Example
• Distributions of charges for the 3 MS-DRGs overlap
• ANOVA will not find a statistical difference
• ANOVA test is determining if there is more or less
variation between the groups than within the groups
