2. Chi-square and F Distributions
Children of the Normal
School.edhole.com
3. Distributions
• There are many theoretical
distributions, both continuous and
discrete.
• We use 4 of these a lot: z (unit normal),
t, chi-square, and F.
• Z and t are closely related to the
sampling distribution of means; chi-square
and F are closely related to the
sampling distribution of variances.
4. Chi-square Distribution (1)
z score:
$z = \frac{X - \bar{X}}{SD}$; $z = \frac{X - \mu}{\sigma}$; $z = \frac{y - \mu}{\sigma}$
z score squared:
$z^2 = \frac{(y - \mu)^2}{\sigma^2}$
Make it Greek:
$z^2 = \chi^2_{(1)}$
What would its sampling distribution look like?
Minimum value is zero.
There is no maximum; values are unbounded above.
Most values are between zero and 1 (about 68%, the share of z scores between -1 and +1),
and the density is highest near zero.
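A quick simulation can make the shape concrete. The sketch below draws unit-normal z scores with Python's standard `random` module and squares them; the function name, seed, and sample size are illustrative choices, not part of the slides.

```python
import random

def squared_z_draws(n, seed=0):
    """Draw n values of z**2, where z is a unit-normal deviate."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]

draws = squared_z_draws(100_000)
frac_below_one = sum(d < 1.0 for d in draws) / len(draws)
mean_draw = sum(draws) / len(draws)

print(f"smallest draw: {min(draws):.6f}")       # never negative
print(f"share below 1: {frac_below_one:.3f}")   # theory: about 0.683
print(f"mean:          {mean_draw:.3f}")        # theory: 1
```

The share below 1 matches the familiar share of z scores within one standard deviation of the mean, because z squared is below 1 exactly when z is between -1 and +1.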
5. Chi-square (2)
What if we took 2 values of z2 at random and added them?
$z_1^2 = \frac{(y_1 - \mu)^2}{\sigma^2}$; $z_2^2 = \frac{(y_2 - \mu)^2}{\sigma^2}$
$\chi^2_{(2)} = \frac{(y_1 - \mu)^2}{\sigma^2} + \frac{(y_2 - \mu)^2}{\sigma^2} = z_1^2 + z_2^2$
Same minimum and maximum as before, but now the average
should be a bit bigger.
Chi-square is the distribution of a sum of squares.
Each squared deviate is drawn independently from the unit
normal, N(0,1). The shape of the chi-square distribution
depends on the number of squared deviates that are
added together.
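The "sum of squares" definition translates directly into code. This is a minimal sketch, assuming independent unit-normal draws from Python's `random` module; `chi_square_draw` is a hypothetical helper name and the df values are arbitrary.

```python
import random

def chi_square_draw(df, rng):
    """One chi-square value: the sum of df squared unit-normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(1)
n_draws = 50_000
means = {}
for df in (1, 2, 5):
    draws = [chi_square_draw(df, rng) for _ in range(n_draws)]
    means[df] = sum(draws) / n_draws
    # The minimum stays at (nearly) zero while the average grows with df.
    print(f"df={df}: min={min(draws):.4f}, mean={means[df]:.3f}")
```

Adding more squared deviates shifts the whole distribution to the right, which is why the average for two deviates is larger than for one.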
6. Chi-square (3)
The distribution of chi-square depends
on 1 parameter, its degrees of freedom
(df or v). As df gets large, the curve becomes
less skewed and more nearly normal.
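The loss of skew can be checked numerically. The sketch below estimates sample skewness (the third standardized moment) for a few df values; the helper name, seed, and df choices are illustrative. For reference, the theoretical skewness of chi-square is sqrt(8/df), a standard result that the slide does not state.

```python
import random

def sample_skewness(values):
    """Third standardized moment of a sample (divides by n, not n - 1)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(2)
skews = {}
for df in (1, 5, 30):
    # Each chi-square draw is a sum of df squared unit-normal deviates.
    draws = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
             for _ in range(20_000)]
    skews[df] = sample_skewness(draws)
    print(f"df={df:2d}: sample skewness = {skews[df]:.2f}, "
          f"theory = {(8 / df) ** 0.5:.2f}")
```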
7. Chi-square (4)
• The expected value of chi-square is df.
– The mean of the chi-square distribution is its
degrees of freedom.
• The variance of the distribution is
2df.
– If the variance is 2df, the standard deviation must
be sqrt(2df).
• There are tables of chi-square so you can find the critical
values that cut off 5 or 1 percent of the distribution.
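• Chi-square is additive: the sum of independent chi-squares is
chi-square, with df equal to the sum of the df.
$\chi^2_{(v_1 + v_2)} = \chi^2_{(v_1)} + \chi^2_{(v_2)}$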
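The mean, variance, and additivity claims can be checked together by simulation. This is a sketch under stated assumptions: the two chi-squares are independent, and the df values 3 and 4 are arbitrary illustrations.

```python
import random

def chi_square_draws(df, n, rng):
    """n chi-square values, each a sum of df squared unit-normal deviates."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n)]

rng = random.Random(3)
n = 40_000
a = chi_square_draws(3, n, rng)          # v1 = 3
b = chi_square_draws(4, n, rng)          # v2 = 4, independent of a
total = [x + y for x, y in zip(a, b)]    # should behave like chi-square, 7 df

mean_total = sum(total) / n
var_total = sum((t - mean_total) ** 2 for t in total) / n

print(f"mean of sum:     {mean_total:.2f}  (df = 7)")
print(f"variance of sum: {var_total:.2f}  (2df = 14)")
print(f"sd of sum:       {var_total ** 0.5:.2f}  (sqrt(2df) = {14 ** 0.5:.2f})")
```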
8. Distribution of Sample Variance
$s^2 = \frac{\sum (y - \bar{y})^2}{N - 1}$
Sample estimate of population variance
(unbiased).
$\chi^2_{(N-1)} = \frac{(N - 1) s^2}{\sigma^2}$
Multiply variance estimate by N-1 to
get sum of squares. Divide by
population variance to normalize.
Result is a random variable distributed
as chi-square with (N-1) df.
We can use info about the sampling distribution of the
variance estimate to find confidence intervals and
conduct statistical tests.
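To see the (N-1) df result in action, one can repeatedly sample from a normal population and compute the scaled variance each time. The population values below (mean 100, SD 15, N = 10) are hypothetical, chosen only for illustration.

```python
import random

def sample_variance(xs):
    """Unbiased estimate: sum of squared deviations divided by N - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

rng = random.Random(4)
N, mu, sigma = 10, 100.0, 15.0     # hypothetical population and sample size
n_sims = 20_000

stats = []
for _ in range(n_sims):
    sample = [rng.gauss(mu, sigma) for _ in range(N)]
    # Multiply by N - 1 to get the sum of squares, divide by sigma^2.
    stats.append((N - 1) * sample_variance(sample) / sigma ** 2)

mean_stat = sum(stats) / n_sims
var_stat = sum((s - mean_stat) ** 2 for s in stats) / n_sims
print(f"mean:     {mean_stat:.2f}  (df = N - 1 = {N - 1})")
print(f"variance: {var_stat:.2f}  (2df = {2 * (N - 1)})")
```

If the statistic follows chi-square with N-1 df, its mean should be near 9 and its variance near 18 here.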
9. Testing Exact Hypotheses about a Variance
$H_0: \sigma^2 = \sigma_0^2$
Test the null that the population
variance has some specific value. Pick
alpha and rejection region. Then:
$\chi^2_{(N-1)} = \frac{(N - 1) s^2}{\sigma_0^2}$
Plug the hypothesized population
variance and the sample variance into the
equation, along with the sample size
used to estimate the variance. Compare the result
to the chi-square distribution with N-1 df.
10. Example of Exact Test
Test about variance of height of people in inches. Grab 30
people at random and measure height.
$H_0: \sigma^2 \ge 6.25$; $H_1: \sigma^2 < 6.25$.
$N = 30$; $s^2 = 4.55$
Note: 1-tailed test on small side. Set alpha = .01.
$\chi^2_{(29)} = \frac{(29)(4.55)}{6.25} = 21.11$
Mean is 29, so it's on the small side. But for Q = .99
(Q being the area above the tabled value), the
value of chi-square is 14.257. Since 21.11 is not below 14.257,
we cannot reject the null.
$H_0: \sigma^2 = 6.25$; $H_1: \sigma^2 \ne 6.25$.
$N = 30$; $s^2 = 4.55$
Note: 2-tailed with alpha = .01.
Now chi-square with v = 29 and Q = .995 is 13.121, and
with Q = .005 it is 52.336. Since 21.11 lies between them,
the result is not significant either way.
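The arithmetic of this example is easy to script. In the sketch below the function name is hypothetical, and the critical values are copied from the slide's table lookups rather than computed.

```python
def chi_square_statistic(n, sample_var, null_var):
    """(N - 1) * s^2 / sigma_0^2, referred to chi-square with N - 1 df."""
    return (n - 1) * sample_var / null_var

stat = chi_square_statistic(n=30, sample_var=4.55, null_var=6.25)

# Tabled values for v = 29, as quoted on the slide (Q = area above the value).
LOWER_01 = 14.257     # Q = .99:  cuts off the lower 1%
LOWER_005 = 13.121    # Q = .995: cuts off the lower 0.5%
UPPER_005 = 52.336    # Q = .005: cuts off the upper 0.5%

reject_one_tailed = stat < LOWER_01
reject_two_tailed = stat < LOWER_005 or stat > UPPER_005

print(f"chi-square(29) = {stat:.2f}")
print(f"reject, 1-tailed alpha=.01: {reject_one_tailed}")
print(f"reject, 2-tailed alpha=.01: {reject_two_tailed}")
```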
11. Confidence Intervals for the Variance
We use $s^2$ to estimate $\sigma^2$. It can be shown that:
$p\left[\frac{(N - 1) s^2}{\chi^2_{(N-1;\,.025)}} \le \sigma^2 \le \frac{(N - 1) s^2}{\chi^2_{(N-1;\,.975)}}\right] = .95$
Suppose N = 15 and $s^2$ is 10. Then df = 14 and for Q = .025
the value is 26.12. For Q = .975 the value is 5.63.
$p\left[\frac{(14)(10)}{26.12} \le \sigma^2 \le \frac{(14)(10)}{5.63}\right] = .95$
$p[5.36 \le \sigma^2 \le 24.87] = .95$
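The interval can be reproduced in a few lines. This is a minimal sketch; the function and parameter names are hypothetical, and the two chi-square values are taken from the slide rather than computed.

```python
def variance_confidence_interval(n, sample_var, chi_q025, chi_q975):
    """Return (low, high) limits for the population variance.

    chi_q025: tabled chi-square value with area .025 above it (the larger one).
    chi_q975: tabled chi-square value with area .975 above it (the smaller one).
    """
    sum_of_squares = (n - 1) * sample_var
    # Dividing by the larger tabled value gives the lower limit.
    return sum_of_squares / chi_q025, sum_of_squares / chi_q975

low, high = variance_confidence_interval(15, 10.0, chi_q025=26.12, chi_q975=5.63)
print(f"95% CI for the variance: [{low:.2f}, {high:.2f}]")
```

Note that the interval is not symmetric around the estimate of 10, because the chi-square distribution is skewed.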
12. Normality Assumption
• We assume normal distributions to work out
sampling distributions and thus p levels.
• Violations of normality have minor
implications for testing means, especially as
N gets large.
• Violations of normality are more serious for
testing variances. Look at your data before
conducting this test. You can also test for normality.
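A simulation can suggest how much non-normality matters for the variance test. The sketch below is an illustration under assumptions, not a result from the slides: it uses the df = 14 cutoffs 5.63 and 26.12 quoted in the confidence-interval example for a 2-tailed alpha = .05 test, and compares a normal population with an exponential one. Both have true variance 1, so the null is true in each case.

```python
import random

def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def rejection_rate(draw, sigma_sq, n, lower, upper, n_sims, rng):
    """Share of simulated samples whose chi-square statistic falls
    outside [lower, upper] when the null variance is actually true."""
    rejections = 0
    for _ in range(n_sims):
        sample = [draw(rng) for _ in range(n)]
        stat = (n - 1) * sample_variance(sample) / sigma_sq
        if stat < lower or stat > upper:
            rejections += 1
    return rejections / n_sims

rng = random.Random(5)
N, LOWER, UPPER = 15, 5.63, 26.12   # df = 14 cutoffs quoted in the slides

normal_rate = rejection_rate(lambda r: r.gauss(0.0, 1.0), 1.0,
                             N, LOWER, UPPER, 20_000, rng)
expo_rate = rejection_rate(lambda r: r.expovariate(1.0), 1.0,
                           N, LOWER, UPPER, 20_000, rng)

print(f"normal population:      {normal_rate:.3f}  (nominal .05)")
print(f"exponential population: {expo_rate:.3f}")
```

With the skewed population the test rejects a true null far more often than the nominal 5 percent, which is the sense in which the variance test is sensitive to normality.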
13. The F Distribution (1)
• The F distribution is the ratio of two
variance estimates:
$F = \frac{s_1^2}{s_2^2} = \frac{\text{est. } \sigma_1^2}{\text{est. } \sigma_2^2}$
• Also the ratio of two chi-squares, each
divided by its degrees of freedom:
$F = \frac{\chi^2_{(v_1)} / v_1}{\chi^2_{(v_2)} / v_2}$
In our applications, v2 will be larger
than v1 and v2 will be larger than 2.
In such a case, the mean of the F
distribution (expected value) is
v2/(v2 - 2).
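The chi-square ratio definition and the formula for the mean can be checked together. This is a sketch assuming independent numerator and denominator chi-squares; the helper names and the choice v1 = 5, v2 = 20 are illustrative.

```python
import random

def chi_square_draw(df, rng):
    """Sum of df squared unit-normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(v1, v2, rng):
    """Ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(v1, rng) / v1) / (chi_square_draw(v2, rng) / v2)

rng = random.Random(6)
v1, v2 = 5, 20                      # illustrative df, with v2 > v1 and v2 > 2
draws = [f_draw(v1, v2, rng) for _ in range(20_000)]

mean_f = sum(draws) / len(draws)
expected = v2 / (v2 - 2)
print(f"simulated mean of F({v1},{v2}): {mean_f:.3f}")
print(f"v2 / (v2 - 2):                {expected:.3f}")
```

The mean depends only on the denominator df and sits a little above 1, approaching 1 as v2 grows.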
14. F Distribution (2)
• F depends on two parameters: v1 and v2
(df1 and df2). The shape of F changes
with these. Range is 0 to infinity.
Shaped a bit like chi-square.
• F tables show critical values for df in
the numerator and df in the
denominator.
• F tables are 1-tailed; can figure 2-tailed
if you need to (but you usually don’t).
15. Testing Hypotheses about 2 Variances
• Suppose $H_0: \sigma_1^2 \le \sigma_2^2$; $H_1: \sigma_1^2 > \sigma_2^2$
– Note 1-tailed.
• We find $N_1 = 16$; $s_1^2 = 5.8$; $N_2 = 16$; $s_2^2 = 1.7$
• Then df1 = df2 = 15, and
$F = \frac{s_1^2}{s_2^2} = \frac{5.8}{1.7} = 3.41$
Going to the F table with 15
and 15 df, we find that for alpha
= .05 (1-tailed), the critical
value is 2.40. Therefore the
result is significant.
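The test statistic and decision can be reproduced directly, and the tabled critical value can be roughly sanity-checked by simulation. In this sketch the critical value 2.40 is copied from the slide; the simulated cutoff is only an approximation from 20,000 draws, and all names are hypothetical.

```python
import random

f_stat = 5.8 / 1.7                  # larger variance estimate on top
F_CRITICAL = 2.40                   # slide value: alpha = .05, 15 and 15 df
significant = f_stat > F_CRITICAL
print(f"F = {f_stat:.2f}, significant at .05 (1-tailed): {significant}")

def chi_square_draw(df, rng):
    """Sum of df squared unit-normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(v1, v2, rng):
    """Ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(v1, rng) / v1) / (chi_square_draw(v2, rng) / v2)

# Rough check of the table: the 95th percentile of simulated F(15, 15).
rng = random.Random(7)
sims = sorted(f_draw(15, 15, rng) for _ in range(20_000))
simulated_critical = sims[int(0.95 * len(sims))]
print(f"simulated .05 critical value: {simulated_critical:.2f}")
```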
16. A Look Ahead
• The F distribution is used in many
statistical tests:
– Test for equality of variances.
– Tests for differences in means in ANOVA.
– Tests for regression models (slopes
relating one continuous variable to another
like SAT and GPA).
17. Relations among Distributions: the Children of the Normal
• Chi-square is drawn from the normal.
N(0,1) deviates squared and summed.
• F is the ratio of two chi-squares, each
divided by its df. A chi-square divided
by its df is a variance estimate, that is,
a sum of squares divided by degrees of
freedom.
• $F = t^2$. If you square t, you get an F
with 1 df in the numerator.
$t^2_{(v)} = F_{(1, v)}$
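The relation between t and F can be illustrated by building both from normal deviates. This sketch assumes the standard construction of t as a unit-normal deviate divided by the square root of an independent chi-square over its df, which the slides do not spell out. The cutoff 2.228 is the usual tabled 2-tailed .05 critical value of t with 10 df, also not taken from the slides.

```python
import math
import random

def chi_square_draw(df, rng):
    """Sum of df squared unit-normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def t_draw(v, rng):
    """t with v df: z / sqrt(chi-square(v) / v), with z drawn independently."""
    z = rng.gauss(0.0, 1.0)
    return z / math.sqrt(chi_square_draw(v, rng) / v)

def f_draw(v1, v2, rng):
    """Ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(v1, rng) / v1) / (chi_square_draw(v2, rng) / v2)

rng = random.Random(8)
v, n = 10, 40_000
T_CRIT = 2.228            # assumed table value: t, 10 df, 2-tailed .05
cut = T_CRIT ** 2         # about 4.96, the matching cutoff for F(1, 10)

t_sq_tail = sum(t_draw(v, rng) ** 2 > cut for _ in range(n)) / n
f_tail = sum(f_draw(1, v, rng) > cut for _ in range(n)) / n

print(f"share of t^2 above {cut:.2f}:     {t_sq_tail:.3f}")
print(f"share of F(1,{v}) above {cut:.2f}: {f_tail:.3f}")
```

Both shares should land near .05: squaring t folds its two tails into the single upper tail of F, which is why a 2-tailed t test and a 1-tailed F test with 1 numerator df agree.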