2. Questions of goodness of fit have become
increasingly important in modern statistics.
3. Questions of goodness of fit juxtapose complex
observed patterns against hypothesized or
previously observed patterns
to test overall and specific
differences among them.
8. If the difference is small, then the FIT IS GOOD.
For example:
Observed       Hypothesized    Difference
51% Females    50% Females     1%
13. If the difference is BIG, then the FIT IS NOT GOOD.
For example:
Observed       Hypothesized    Difference
50% Females    22% Females     28%
16. Here is an example:
We want to know if a sample we have selected
matches the national percentage of a certain
ethnic group.
Observed (sample): 2% are members of this ethnic group
Hypothesized (population): 10% are members of this ethnic group
Difference: 8%
19. You will use certain statistical methods
to determine if the goodness of fit is
significant or not.
Here is an example:
Problem – The chair of a statistics department
suspects that some of her faculty are more
popular with students than others.
21. There are three sections of introductory stats
that are taught at the same time in the morning
by Professors Cauforek, Kerr, and Rector.
66 students are planning on enrolling in one of
the three classes.
24. What would you expect the number of enrollees
to be in each class if popularity were not an
issue?
          Professor Cauforek   Professor Kerr   Professor Rector
Expected  22                   22               22
These are our expected values (66 students / 3 classes = 22 each).
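The expected counts come from splitting the 66 students evenly across the three sections. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not from the slides.

```python
# Expected enrollment per class if popularity plays no role:
# each professor gets an equal share of the enrolling students.
professors = ["Cauforek", "Kerr", "Rector"]
total_students = 66

expected = {name: total_students / len(professors) for name in professors}
print(expected)
```

If the null hypothesis specified unequal shares instead (for example, different room sizes), the same idea applies with each share multiplied by the total.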
27. Now let’s see what was observed.
The number who enrolled in each class was:
          Professor Cauforek   Professor Kerr   Professor Rector
Observed  31                   25               10
31. We will test the degree to which the observed
data fit the expected enrollments.
          Professor Cauforek   Professor Kerr   Professor Rector
Observed  31                   25               10
Expected  22                   22               22
50. Here is the null hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in the three stats professors’ classes.
53. Now we will compute the χ² value and compare
it with the critical χ² value.
• If the χ² value exceeds the critical value, then
we will reject the null hypothesis.
• If the χ² value DOES NOT exceed the critical
value, then we will fail to reject the null hypothesis.
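The computation itself is not shown in this part of the deck. As a sketch, assuming the standard Pearson chi-square statistic (the sum over classes of (observed − expected)² / expected), the value for the enrollment data can be worked out as below. The function name `chi_square_statistic` is illustrative.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [31, 25, 10]  # Cauforek, Kerr, Rector
expected = [22, 22, 22]  # equal popularity

chi2_value = chi_square_statistic(observed, expected)
print(round(chi2_value, 1))  # 10.6
```

The three terms are 81/22, 9/22, and 144/22, which sum to 234/22, or about 10.6.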
84. As a contrasting example, note what the χ² value
would be if the observed and expected values
were more similar:
          Professor Cauforek   Professor Kerr   Professor Rector
Expected  22                   22               22
Observed  24                   22               20
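To see where the difference comes from, it can help to look at each class's contribution to the statistic. The sketch below assumes the standard (observed − expected)² / expected term per class and compares the contrasting enrollments with the original ones; the helper name `contributions` is illustrative.

```python
def contributions(observed, expected):
    """Per-category terms (O - E)^2 / E of the chi-square statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


expected = [22, 22, 22]
similar = contributions([24, 22, 20], expected)   # contrasting example
original = contributions([31, 25, 10], expected)  # original enrollments

print([round(t, 2) for t in similar], round(sum(similar), 2))
print([round(t, 2) for t in original], round(sum(original), 2))
```

The contrasting case sums to about 0.36, while the original enrollments sum to about 10.6, with Professor Rector's class contributing the largest single term.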
94. So the moral of the story is that the closer the
expected and observed values are to one
another, the smaller the chi-square value and the
greater the goodness of fit (as seen below).
          Professor Cauforek   Professor Kerr   Professor Rector
Expected  22                   22               22
Observed  24                   22               20
χ² = 0.36
97. On the other hand, the farther the expected and
observed values are from one another, the
larger the chi-square value and the poorer the
goodness of fit (as seen below).
          Professor Cauforek   Professor Kerr   Professor Rector
Expected  22                   22               22
Observed  31                   25               10
χ² = 10.6
98. Now we determine if a χ² of 10.6 exceeds the
critical χ² value.
101. To find the critical χ² we first must
determine the degrees of freedom and set
the probability level.
The probability or alpha level is the
probability of a Type I error we are willing to live
with (i.e., the probability of rejecting the null
hypothesis when it is actually true). Generally
this value is 0.05, which is like saying we are
willing to reject a true null hypothesis 5 out of
100 times.
102. Degrees of freedom are calculated by taking the
number of groups and subtracting 1.
(Three groups minus 1 = 2)
110. We now have all of the information we need to
determine the critical χ².
We go to the Chi-Square Distribution Table and
locate the degrees of freedom (df = 2).
And then we locate the probability or alpha level (0.050):
df    0.100    0.050    0.025
1     2.71     3.84     5.02
2     4.61     5.99     7.38
3     6.25     7.82     9.35
4     7.78     9.49     11.14
5     9.24     11.07    12.83
6     10.64    12.59    14.45
7     12.02    14.07    16.01
8     13.36    15.51    17.54
9     14.68    16.92    19.02
…     …        …        …
Where these two values intersect in the table we
find the critical χ² (5.99).
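The table lookup can be sketched in code as a nested dictionary keyed by degrees of freedom and alpha level. Only the first three rows of the table are copied here, and the names `CRITICAL_CHI2` and `critical_value` are illustrative.

```python
# First three rows of the chi-square table (rows: df, columns: alpha).
CRITICAL_CHI2 = {
    1: {0.100: 2.71, 0.050: 3.84, 0.025: 5.02},
    2: {0.100: 4.61, 0.050: 5.99, 0.025: 7.38},
    3: {0.100: 6.25, 0.050: 7.82, 0.025: 9.35},
}


def critical_value(df, alpha):
    """Return the tabulated critical chi-square for the given df and alpha."""
    try:
        return CRITICAL_CHI2[df][alpha]
    except KeyError:
        raise ValueError(f"no table entry for df={df}, alpha={alpha}") from None


groups = 3
df = groups - 1  # degrees of freedom = number of groups minus 1
print(critical_value(df, 0.05))  # 5.99
```

A lookup like this only covers the tabulated combinations; any other df or alpha raises an error rather than guessing a value.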
114. Since the chi-square goodness of fit value (10.6)
exceeds the critical χ² (5.99), we will reject the
null hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in three stats professors’ classes.
There actually is a significant difference.
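The whole decision can be put together in a few lines. This sketch assumes the standard Pearson statistic and takes the critical value 5.99 from the table. The p-value line goes beyond the slides: it relies on the fact that, with exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(−x/2), a shortcut that does not apply to other degrees of freedom.

```python
import math

observed = [31, 25, 10]  # Cauforek, Kerr, Rector
expected = [22, 22, 22]  # equal popularity
alpha = 0.05
critical = 5.99          # from the table: df = 2, alpha = 0.05

# Pearson chi-square statistic: sum of (O - E)^2 / E.
chi2_value = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probability; this closed form is valid ONLY for df = 2.
p_value = math.exp(-chi2_value / 2)

reject_null = chi2_value > critical
print(f"chi2 = {chi2_value:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Comparing the statistic with the critical value and comparing the p-value with alpha lead to the same decision. Where SciPy is available, `scipy.stats.chisquare` would typically be used to get the statistic and p-value for any number of groups.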
117. In summary,
Questions of goodness of fit juxtapose observed
patterns against hypothesized patterns to test
overall and specific differences among them.
Observed Hypothesized Difference