The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if the differences are statistically significant. It provides examples of using the chi-square formula to calculate the test statistic and compare it to critical values to evaluate the null hypothesis that the observed and expected data are from the same distribution. A high chi-square value indicates the observed data does not fit the expected distribution well.
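The goodness-of-fit statistic described above can be computed directly. The counts below are made up for illustration (three categories against a roughly uniform expectation), not data from the original slides.

```python
# Chi-square goodness-of-fit: sum over categories of (observed - expected)^2 / expected.
observed = [45, 35, 20]
expected = [33.3, 33.3, 33.4]  # hypothetical uniform expectation over 3 categories

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # degrees of freedom = number of categories - 1

print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
```

The resulting value would then be compared against the chi-square critical value for the chosen significance level and degrees of freedom.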
1. The document discusses key concepts in inferential statistics including point estimation, interval estimation, hypothesis testing, types of errors, p-values, power, and one-tailed and two-tailed tests.
2. It explains that inferential statistics allows generalization from a sample to a population and includes estimation of parameters and hypothesis testing.
3. Common statistical techniques covered are confidence intervals, which provide a range of values that likely contain the true population parameter, and hypothesis testing, which evaluates theories about populations.
This document discusses measures of central tendency and dispersion. It begins by defining measures of central tendency as statistical measures that describe the position of a distribution. The most commonly used measures of central tendency for a univariate context are the mean, median, and mode. The document then discusses the arithmetic mean in detail, including how to calculate the mean for individual, discrete, and continuous data series using direct and shortcut methods. It also covers the geometric mean and how to calculate it using logarithms for individual, discrete, and continuous data series. Various examples and practice problems are provided.
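The two means discussed above can be computed for an individual series in a few lines; the geometric mean uses the logarithm method the document describes (mean of the logs, then the antilog). The values are illustrative.

```python
import math

values = [2, 4, 8, 16]  # hypothetical individual data series

# Arithmetic mean: sum of values divided by their count.
arithmetic_mean = sum(values) / len(values)

# Geometric mean via logarithms: antilog of the mean of the logs.
geometric_mean = math.exp(sum(math.log(v) for v in values) / len(values))

print(arithmetic_mean, round(geometric_mean, 4))
```

For these values the geometric mean equals (2 × 4 × 8 × 16)^(1/4) ≈ 5.657, below the arithmetic mean of 7.5, as it always is for non-constant positive data.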
ANOVA, ANCOVA, MANOVA, and MANCOVA are statistical analyses used to test differences between groups.
ANOVA tests for differences between 2 or more means and partitions variances into sums of squares between and within groups. ANCOVA controls for additional factors, called covariates, to reduce error and increase power.
MANOVA assesses the effect of independent variables on multiple dependent variables simultaneously, accounting for correlation between variables. It tests for overall differences using a multivariate F value. Univariate follow-ups can then examine differences on each individual dependent variable.
MANCOVA extends MANOVA to include covariates, allowing evaluation of differences in multiple dependent variables while controlling for additional continuous factors.

Multiple regression analysis is a powerful technique used for predicting the unknown value of a variable from the known value of two or more variables.
This presentation explains the concepts of ANOVA, ANCOVA, MANOVA, and MANCOVA. It also covers the procedures for running ANOVA, ANCOVA, and MANOVA in SPSS.
Factor analysis is a statistical technique used to identify underlying factors that explain the pattern of correlations within a set of observed variables. It groups variables that are highly correlated with each other into factors to reduce data dimensionality. The key steps are extracting factors with eigenvalues greater than 1, evaluating factor loadings to interpret the grouping of variables, and rotating factors to maximize interpretability of the results. SPSS output includes correlation coefficients, KMO/Bartlett's tests of sampling adequacy, eigenvalues, communalities, scree plots, and rotated component matrices.
This document discusses point estimation and the criteria for a good point estimator. It defines point estimation, estimators, and estimates. The key criteria for a good point estimator are discussed as unbiasedness, consistency, efficiency, and sufficiency. Unbiasedness means the expected value of the estimator is equal to the true parameter value. Consistency means the estimator approaches the true value as the sample size increases. Efficiency refers to the estimator having the minimum possible variance. Sufficiency means the estimator uses all the information in the sample. Examples are provided for each concept.
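Unbiasedness and consistency of the sample mean, two of the criteria above, can be illustrated with a small simulation. The population parameters here (mean 10, standard deviation 2) and the sample sizes are arbitrary choices for the sketch.

```python
import random

random.seed(42)
true_mean = 10.0  # hypothetical population mean

def sample_mean(n):
    """Draw n values from the population and return their mean (the estimator)."""
    return sum(random.gauss(true_mean, 2.0) for _ in range(n)) / n

# Unbiasedness: the average of many independent estimates is close to the true value.
estimates = [sample_mean(30) for _ in range(2000)]
avg_estimate = sum(estimates) / len(estimates)

# Consistency: a single estimate from a much larger sample lands close to the truth.
big_sample_estimate = sample_mean(100_000)

print(round(avg_estimate, 3), round(big_sample_estimate, 3))
```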
The document discusses correlation analysis and the different types of correlation. It defines correlation as the linear association between two random variables. Correlation can be classified in three main ways:
1) Positive vs negative vs no correlation based on the relationship between two variables as one increases or decreases.
2) Linear vs non-linear correlation based on the shape of the relationship when plotted on a graph.
3) Simple vs multiple vs partial correlation based on the number of variables.
The document also discusses methods for studying correlation including scatter plots, Karl Pearson's coefficient of correlation r, and Spearman's rank correlation coefficient. It provides interpretations of the correlation coefficient r and coefficient of determination r2.
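Both coefficients mentioned above can be computed without any external library, exploiting the fact that Spearman's coefficient is simply Pearson's r applied to the ranks (when there are no ties). The data are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]  # hypothetical paired observations

def pearson_r(a, b):
    """Karl Pearson's coefficient of correlation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def ranks(a):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    r = [0] * len(a)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

r = pearson_r(x, y)
rho = pearson_r(ranks(x), ranks(y))  # Spearman = Pearson on the ranks
r_squared = r ** 2                   # coefficient of determination

print(round(r, 3), round(rho, 3), round(r_squared, 3))
```

r² then reads as the proportion of variance in y accounted for by its linear relationship with x.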
Trade-Based Money Laundering: What Compliance Professionals Need to Know (by Alessa)
WATCH WEBINAR - https://www.caseware.com/alessa/webinars/trade-based-money-laundering/
Hundreds of billions of dollars are laundered every year through trade-based money laundering (TBML). Its sophisticated techniques allow criminals to use legitimate trade to disguise the source of illegal proceeds and transfer value across borders without the use of traditional money movement methods.
Laurie Kelly, CAMS shares her knowledge and experiences gained from 20 years leading the AML, fraud, and sanctions compliance functions for a $145 billion U.S. financial institution that provided extensive trade finance services for global exports of U.S. agricultural products. Attendees learn the fundamentals of foreign trade and trade finance, and why these long-established processes are so vulnerable to TBML.
We break down the most common TBML techniques, including the Black Market Peso Exchange, over & under invoicing, and others, using real world case studies. Finally, we review the red flags for these activities and how to incorporate transaction monitoring, sanctioned/restricted party screening, and enhanced customer due diligence to mitigate TBML risks.
The document discusses Stanley Smith Stevens' theory of measurement scales, which proposes that there are four types of measurement scales - nominal, ordinal, interval, and ratio - that differ in their ability to determine relationships between values and perform mathematical operations. Nominal scales only categorize data, ordinal scales can rank order data, interval scales have equal intervals between values, and ratio scales have a true zero point. Proper selection of a measurement scale depends on research objectives, response types, data properties, and other factors.
This document discusses descriptive statistics and analysis. It provides definitions of key terms like data, variable, statistic, and parameter. It also describes common measures of central tendency like mean, median and mode. Additionally, it covers measures of variability such as range, variance and standard deviation. Various graphical and numerical methods for summarizing and presenting sample data are presented, including tables, charts and distributions.
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y = b0 + b1x + ε, where b0 is the y-intercept, b1 is the slope, and ε is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
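The least-squares line described above has a closed-form solution: b1 is the covariance of x and y divided by the variance of x, and b0 follows from the means. A sketch with made-up data:

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical observations, roughly y = 2x

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Slope: sum of cross-deviations over sum of squared x-deviations.
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x  # intercept passes through the point of means

predictions = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predictions)]

print(f"y = {b0:.3f} + {b1:.3f}x")
```

A property worth checking: the least-squares residuals always sum to zero when an intercept is included.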
The document provides an overview of multiple linear regression (MLR). MLR allows predicting a dependent variable from multiple independent variables. It extends simple linear regression by incorporating additional predictors. Key points covered include: purposes of MLR for explanation and prediction; assumptions of the method; interpreting R-squared values; comparing unstandardized and standardized regression coefficients; and testing the statistical significance of predictors.
This document discusses various measures of central tendency including the mean, median, and mode. It provides definitions and formulas for calculating each measure. The mean is the sum of all values divided by the number of values and is the most widely used measure. The median is the middle value when data is arranged from lowest to highest. The mode is the value that occurs most frequently. Examples are given demonstrating how to calculate each measure for both individual values and grouped data.
The range is the simplest measure of variability, defined as the difference between the highest and lowest values in a data set. It is quick to calculate but does not provide a full picture of the data distribution and can be strongly influenced by outliers. Other measures of variability include the average deviation, which calculates the average amount each score deviates from the mean, and the interquartile range, which is less influenced by outliers than the range. The interquartile range only considers data between the first and third quartiles and ignores half the data points.
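The outlier-sensitivity contrast described above is easy to demonstrate: in the illustrative data below, a single extreme value (99) inflates the range but barely moves the interquartile range.

```python
import statistics

data = [4, 5, 6, 7, 8, 9, 10, 11, 99]  # hypothetical scores with one outlier

data_range = max(data) - min(data)  # simplest measure: highest minus lowest

# statistics.quantiles with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # spread of the middle half of the data

print(data_range, iqr)
```

Here the range is 95 while the IQR stays at 5, reflecting that the IQR ignores the top and bottom quarters of the data.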
Here are the steps I would take to analyze this data using exploratory factor analysis:
1. Check assumptions
- Sample size of 300 is adequate
- Most correlations are between .3 and .8
2. Extract initial factors using principal axis factoring
- Kaiser's criterion suggests 4 factors with eigenvalues > 1
3. Rotate factors orthogonally using varimax rotation
- This will make the factor structure more interpretable
4. Interpret the factors based on which items have strong loadings
- Factor 1 relates to anxiety about learning SPSS
- Factor 2 relates to anxiety about using computers
- Factors 3 and 4 may reflect other aspects of statistics anxiety
5. Compute factor scores if desired, for use in further analyses
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
Credit Card Fraudulent Transaction Detection Research Paper (by Garvit Burad)
A research paper on credit card fraud detection using machine learning techniques such as logistic regression, random forests, and feature engineering, along with methods for handling a highly skewed dataset.
This document defines and provides the formula for calculating mean deviation, which is a measure of variation that uses all the scores in a distribution. It is more reliable than range. Mean deviation is calculated by finding the absolute difference between each score and the mean, summing the absolute differences, and dividing by the number of observations. Two examples of calculating mean deviation for sets of data are provided, along with exercises asking students to find the mean deviation of additional data sets and define standard deviation.
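The mean deviation formula described above translates directly into code. The scores are illustrative.

```python
scores = [4, 6, 8, 10, 12]  # hypothetical distribution

mean = sum(scores) / len(scores)

# Mean deviation: average of the absolute differences from the mean.
mean_deviation = sum(abs(s - mean) for s in scores) / len(scores)

print(mean, mean_deviation)
```

For these scores the absolute deviations are 4, 2, 0, 2, 4, giving a mean deviation of 2.4.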
The document discusses statistical significance, types of errors, and key statistical terms. It defines statistical significance as the strength of evidence needed to reject the null hypothesis, determined before conducting an experiment. There are two types of errors: type I errors reject a true null hypothesis, type II errors accept a false null hypothesis. Key terms discussed include population, parameter, sample, and statistic.
The document discusses the normal distribution, which produces a symmetrical bell-shaped curve. It has two key parameters - the mean and standard deviation. According to the empirical rule, about 68% of values in a normal distribution fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. The normal distribution is commonly used to model naturally occurring phenomena that tend to cluster around an average value, such as heights or test scores.
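The empirical rule quoted above can be checked by simulation. The mean and standard deviation below (100 and 15, roughly an IQ-style scale) are arbitrary choices for the sketch.

```python
import random

random.seed(0)
mu, sigma = 100, 15  # hypothetical population parameters
draws = [random.gauss(mu, sigma) for _ in range(100_000)]

def within(k):
    """Fraction of draws within k standard deviations of the mean."""
    return sum(1 for d in draws if abs(d - mu) <= k * sigma) / len(draws)

# Should land near 0.68, 0.95, and 0.997 respectively.
print(round(within(1), 3), round(within(2), 3), round(within(3), 3))
```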
Residuals represent variation in the data that cannot be explained by the model.
Residual plots are useful for discovering patterns, outliers, or misspecifications of the model. Systematic patterns, where discovered, may suggest how to reformulate the model.
If the residuals exhibit no pattern, then this is a good indication that the model is appropriate for the particular data.
Understanding data types is an important concept in statistics. When designing an experiment, you want to know what type of data you are dealing with, because that determines which statistical analyses, visualizations, and prediction algorithms can be used.
This document discusses multiple regression analysis. It begins by introducing multiple regression as an extension of simple linear regression that allows for modeling relationships between a response variable and multiple explanatory variables. It then covers topics such as examining variable distributions, building regression models, estimating model parameters, and assessing overall model fit and significance of individual predictors. An example demonstrates using multiple regression to build a model for predicting cable television subscribers based on advertising rates, station power, number of local families, and number of competing stations.
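Parameter estimation for a multiple regression model reduces to least squares on a design matrix with an intercept column. The sketch below uses two made-up predictors, not the cable-television example from the slides.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor 1
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])   # hypothetical predictor 2
# Response generated from known coefficients (3, 2, -1) plus small noise.
y = 3.0 + 2.0 * x1 - 1.0 * x2 + np.array([0.1, -0.1, 0.05, -0.05, 0.0, 0.0])

# Design matrix: a leading column of ones estimates the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

b0, b1, b2 = coeffs
print(np.round(coeffs, 2))
```

The fitted coefficients recover values close to the ones used to generate the response, which is the sanity check one would run before interpreting a real model.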
This document discusses the chi-square test and how to calculate expected frequencies. It provides an example of using a chi-square test to analyze observed vs expected frequencies of blood types in children with one parent of type A and one of type B. It also gives another example using a chi-square test to analyze the effectiveness of vaccination in preventing smallpox attacks using a 2x2 contingency table.
Logistic regression allows prediction of discrete outcomes from continuous and discrete variables. It addresses questions like discriminant analysis and multiple regression but without distributional assumptions. There are two main types: binary logistic regression for dichotomous dependent variables, and multinomial logistic regression for variables with more than two categories. Binary logistic regression expresses the log odds of the dependent variable as a function of the independent variables. Logistic regression assesses the effects of multiple explanatory variables on a binary outcome variable. It is useful when the dependent variable is non-parametric, there is no homoscedasticity, or normality and linearity are suspect.
This document provides an overview of descriptive statistics techniques for summarizing categorical and quantitative data. It discusses frequency distributions, measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and methods for visualizing data through charts, graphs, and other displays. The goal of descriptive statistics is to organize and describe the characteristics of data through counts, averages, and other summaries.
Logistic regression is a statistical method used to predict a binary or categorical dependent variable from continuous or categorical independent variables. It generates coefficients to predict the log odds of an outcome being present or absent. The method assumes a linear relationship between the log odds and independent variables. Multinomial logistic regression extends this to dependent variables with more than two categories. An example analyzes high school student program choices using writing scores and socioeconomic status as predictors. The model fits significantly better than an intercept-only model. Increases in writing score decrease the log odds of general versus academic programs.
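The log-odds formulation above can be sketched without any statistics library: a minimal binary logistic regression fit by gradient descent, with the log odds of y = 1 modeled as b0 + b1·x. The data and learning settings are illustrative, not from the document's student-program example.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical predictor
y = [0, 0, 0, 1, 0, 1, 1, 1]                   # binary outcome, likelier at high x

b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(5000):
    grad0 = grad1 = 0.0
    for xi, yi in zip(x, y):
        # Sigmoid of the log odds gives the predicted probability of y = 1.
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        grad0 += (p - yi)
        grad1 += (p - yi) * xi
    b0 -= lr * grad0 / len(x)
    b1 -= lr * grad1 / len(x)

def predict_prob(xi):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))

print(round(b1, 3), round(predict_prob(8.0), 3))
```

A positive fitted b1 means each unit increase in x raises the log odds of the outcome by b1, the interpretation the document applies to writing scores.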
The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if an observed pattern fits a hypothesized or expected pattern. An example is provided of expected student enrollment numbers across 3 statistics professors' classes versus the actual observed enrollments. The chi-square test statistic is calculated by summing the squared differences between observed and expected values divided by the expected values. The calculated chi-square value is then compared to a critical value to determine if the null hypothesis that there is no difference between observed and expected can be rejected. In this example, the calculated chi-square value exceeds the critical value, so the null hypothesis is rejected.
This document provides an overview of chi-square tests, including chi-square goodness of fit and chi-square test of independence. It uses examples from a hypothetical New York City mayoral election poll to demonstrate how to perform each test. The chi-square goodness of fit test determines if the distribution of proportions in a sample fits the expected distribution. The chi-square test of independence determines if there is a relationship between two categorical variables, like voter gender and candidate preference. Both tests use a chi-square calculation and degrees of freedom to obtain a p-value, and Cramér's V can estimate effect size.
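For the test of independence, expected counts come from the row and column totals: expected = (row total × column total) / grand total. A sketch on a made-up 2×2 table (the counts are illustrative, not the poll or vaccination data referenced above):

```python
# 2x2 contingency table, e.g. rows = two groups, columns = two outcomes.
table = [[10, 90],
         [26, 74]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)  # (rows-1) * (cols-1) = 1 here
print(round(chi_sq, 3), df)
```

The statistic is then compared to the chi-square distribution with 1 degree of freedom to obtain the p-value.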
Trade-Based Money Laundering: What Compliance Professionals Need to KnowAlessa
ย
WATCH WEBINAR - https://www.caseware.com/alessa/webinars/trade-based-money-laundering/
Hundreds of billions of dollars are laundered every year through trade-based money laundering (TBML). Its sophisticated techniques allow criminals to use legitimate trade to disguise the source of illegal proceeds and transfer value across borders without the use of traditional money movement methods.
Laurie Kelly, CAMS shares her knowledge and experiences gained from 20 years in leading the AML, fraud, and sanctions compliance functions for a $145 billion U.S. financial institution that provided extensive trade finance services for global exports of U.S. agricultural products. Attendees learn the fundamentals of foreign trade and trade finance, and why these long-established processes make it so vulnerable to TBML.
We break down the most common TBML techniques, including the Black Market Peso Exchange, over & under invoicing, and others, using real world case studies. Finally, we review the red flags for these activities and how to incorporate transaction monitoring, sanctioned/restricted party screening, and enhanced customer due diligence to mitigate TBML risks.
About Alessa, a CaseWare RCM product:
Alessa is a financial crime detection, prevention and management solution offered by CaseWare RCM Inc. With deployments in more than 20 countries in banking, insurance, FinTech, gaming, manufacturing, retail and more, Alessa is the only platform organizations need to identify high-risk activities and stay ahead of compliance. To learn more about how Alessa can help your organization ensure compliance, detect complex fraud schemes, and prevent waste, abuse and misuse, visit us at caseware.com/alessa.
Connect with us online:
Visit the Alessa WEBSITE: https://www.caseware.com/alessa/
Follow Alessa on LINKEDIN: https://www.linkedin.com/caseware-alessa
Follow Alessa on TWITTER: https://twitter.com/casewarealessa
SUBSCRIBE to Alessa on YouTube: http://tiny.cc/Alessa
The document discusses Stanley Smith Stevens' theory of measurement scales, which proposes that there are four types of measurement scales - nominal, ordinal, interval, and ratio - that differ in their ability to determine relationships between values and perform mathematical operations. Nominal scales only categorize data, ordinal scales can rank order data, interval scales have equal intervals between values, and ratio scales have a true zero point. Proper selection of a measurement scale depends on research objectives, response types, data properties, and other factors.
This document discusses descriptive statistics and analysis. It provides definitions of key terms like data, variable, statistic, and parameter. It also describes common measures of central tendency like mean, median and mode. Additionally, it covers measures of variability such as range, variance and standard deviation. Various graphical and numerical methods for summarizing and presenting sample data are presented, including tables, charts and distributions.
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y= b0 + b1x + ฮต, where b0 is the y-intercept, b1 is the slope, and ฮต is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
The document provides an overview of multiple linear regression (MLR). MLR allows predicting a dependent variable from multiple independent variables. It extends simple linear regression by incorporating additional predictors. Key points covered include: purposes of MLR for explanation and prediction; assumptions of the method; interpreting R-squared values; comparing unstandardized and standardized regression coefficients; and testing the statistical significance of predictors.
This document discusses various measures of central tendency including the mean, median, and mode. It provides definitions and formulas for calculating each measure. The mean is the sum of all values divided by the number of values and is the most widely used measure. The median is the middle value when data is arranged from lowest to highest. The mode is the value that occurs most frequently. Examples are given demonstrating how to calculate each measure for both individual values and grouped data.
The range is the simplest measure of variability, defined as the difference between the highest and lowest values in a data set. It is quick to calculate but does not provide a full picture of the data distribution and can be strongly influenced by outliers. Other measures of variability include the average deviation, which calculates the average amount each score deviates from the mean, and the interquartile range, which is less influenced by outliers than the range. The interquartile range only considers data between the first and third quartiles and ignores half the data points.
Here are the steps I would take to analyze this data using exploratory factor analysis:
1. Check assumptions
- Sample size of 300 is adequate
- Most correlations are between .3 and .8
2. Extract initial factors using principal axis factoring
- Kaiser's criterion suggests 4 factors with eigenvalues > 1
3. Rotate factors orthogonally using varimax rotation
- This will make the factor structure more interpretable
4. Interpret the factors based on which items have strong loadings
- Factor 1 relates to anxiety about learning SPSS
- Factor 2 relates to anxiety about using computers
- Factors 3 and 4 may reflect other aspects of statistics anxiety
5. Compute factor scores if desired to use in further
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
ย
Credit Card Fraudulent Transaction Detection Research Paper using Machine Learning technologies like Logistic Regression, Random Forrest, Feature Engineering and various techniques to deal with highly skewed dataset
This document defines and provides the formula for calculating mean deviation, which is a measure of variation that uses all the scores in a distribution. It is more reliable than range. Mean deviation is calculated by finding the absolute difference between each score and the mean, summing the absolute differences, and dividing by the number of observations. Two examples of calculating mean deviation for sets of data are provided, along with exercises asking students to find the mean deviation of additional data sets and define standard deviation.
The document discusses statistical significance, types of errors, and key statistical terms. It defines statistical significance as the strength of evidence needed to reject the null hypothesis, determined before conducting an experiment. There are two types of errors: type I errors reject a true null hypothesis, type II errors accept a false null hypothesis. Key terms discussed include population, parameter, sample, and statistic.
The document discusses the normal distribution, which produces a symmetrical bell-shaped curve. It has two key parameters - the mean and standard deviation. According to the empirical rule, about 68% of values in a normal distribution fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. The normal distribution is commonly used to model naturally occurring phenomena that tend to cluster around an average value, such as heights or test scores.
Residuals represent variation in the data that cannot be explained by the model.
Residual plots useful for discovering patterns, outliers or misspecifications of the model. Systematic patterns discovered may suggest how to reformulate the model.
If the residuals exhibit no pattern, then this is a good indication that the model is appropriate for the particular data.
Understanding data type is an important concept in statistics, when you are designing an experiment, you want to know what type of data you are dealing with, that will decide what type of statistical analysis, visualizations and prediction algorithms could be used.
#data #data types #ai #machine learning #statistics #data science #data analytics #artificial intelligence
This document discusses multiple regression analysis. It begins by introducing multiple regression as an extension of simple linear regression that allows for modeling relationships between a response variable and multiple explanatory variables. It then covers topics such as examining variable distributions, building regression models, estimating model parameters, and assessing overall model fit and significance of individual predictors. An example demonstrates using multiple regression to build a model for predicting cable television subscribers based on advertising rates, station power, number of local families, and number of competing stations.
This document discusses the chi-square test and how to calculate expected frequencies. It provides an example of using a chi-square test to analyze observed vs expected frequencies of blood types in children with one parent of type A and one of type B. It also gives another example using a chi-square test to analyze the effectiveness of vaccination in preventing smallpox attacks using a 2x2 contingency table.
Logistic regression allows prediction of discrete outcomes from continuous and discrete variables. It addresses questions like discriminant analysis and multiple regression but without distributional assumptions. There are two main types: binary logistic regression for dichotomous dependent variables, and multinomial logistic regression for variables with more than two categories. Binary logistic regression expresses the log odds of the dependent variable as a function of the independent variables. Logistic regression assesses the effects of multiple explanatory variables on a binary outcome variable. It is useful when the dependent variable is non-parametric, there is no homoscedasticity, or normality and linearity are suspect.
This document provides an overview of descriptive statistics techniques for summarizing categorical and quantitative data. It discusses frequency distributions, measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and methods for visualizing data through charts, graphs, and other displays. The goal of descriptive statistics is to organize and describe the characteristics of data through counts, averages, and other summaries.
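A minimal sketch of these summaries using Python's standard `statistics` module; the exam scores are hypothetical:

```python
import statistics

# Hypothetical sample of exam scores.
scores = [72, 85, 85, 90, 64, 78, 85, 70]

print(statistics.mean(scores))    # arithmetic mean
print(statistics.median(scores))  # middle value of the sorted data
print(statistics.mode(scores))    # most frequent value
print(statistics.pstdev(scores))  # population standard deviation
print(max(scores) - min(scores))  # range
```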
Logistic regression is a statistical method used to predict a binary or categorical dependent variable from continuous or categorical independent variables. It generates coefficients to predict the log odds of an outcome being present or absent. The method assumes a linear relationship between the log odds and independent variables. Multinomial logistic regression extends this to dependent variables with more than two categories. An example analyzes high school student program choices using writing scores and socioeconomic status as predictors. The model fits significantly better than an intercept-only model. Increases in writing score decrease the log odds of general versus academic programs.
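The log-odds link described above can be sketched in a few lines; the coefficients `b0` and `b1` below are hypothetical, not fitted values from the document:

```python
import math

# The logistic model links probability p to a linear predictor via log odds:
#   log(p / (1 - p)) = b0 + b1 * x
# Inverting the log odds gives the logistic (sigmoid) function.

def log_odds(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Hypothetical coefficients: intercept -3.0, slope 0.1 per unit of x.
b0, b1 = -3.0, 0.1

x = 40
p = sigmoid(b0 + b1 * x)
print(round(p, 3))  # predicted probability of the outcome at x = 40
```

The two functions are inverses: converting a probability to log odds and back recovers the original value.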
The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if an observed pattern fits a hypothesized or expected pattern. An example is provided of expected student enrollment numbers across 3 statistics professors' classes versus the actual observed enrollments. The chi-square test statistic is calculated by summing the squared differences between observed and expected values divided by the expected values. The calculated chi-square value is then compared to a critical value to determine if the null hypothesis that there is no difference between observed and expected can be rejected. In this example, the calculated chi-square value exceeds the critical value, so the null hypothesis is rejected.
This document provides an overview of chi-square tests, including chi-square goodness of fit and chi-square test of independence. It uses examples from a hypothetical New York City mayoral election poll to demonstrate how to perform each test. The chi-square goodness of fit test determines if the distribution of proportions in a sample fits the expected distribution. The chi-square test of independence determines if there is a relationship between two categorical variables, like voter gender and candidate preference. Both tests use a chi-square calculation and degrees of freedom to obtain a p-value, and Cramér's V can estimate effect size.
08 test of hypothesis large sample.ppt (Pooja Sakhla)
The document discusses quantitative methods and hypothesis testing. It covers key concepts like null and alternative hypotheses, types of hypothesis tests, test statistics, rejection regions, and p-values. Examples are provided to illustrate hypothesis testing for population means, proportions, differences between means and proportions. The goal is to introduce the mechanics and general procedure of hypothesis testing and how to report results through p-values.
09 test of hypothesis small sample.ppt (Pooja Sakhla)
The document provides an overview of quantitative methods for small sample inferences. It discusses the student's t-distribution and its properties for small samples from normal populations when the population variance is unknown. It covers small sample inferences about a single population mean, the difference between two population means for independent and paired samples, and inferences about a single population variance and comparing two population variances. Examples are provided to illustrate hypothesis testing techniques for each quantitative method.
Stat 130 chi-square goodness-of-fit test (Aldrin Lozano)
- The chi-square goodness-of-fit test can be used to determine if a frequency distribution fits a specific pattern or theoretical distribution. It compares observed frequencies to expected frequencies.
- To perform the test, the chi-square statistic is calculated as the sum of (O - E)^2 / E over all categories, where O is the observed frequency and E is the expected frequency. This value is then compared to a critical value from the chi-square distribution based on the degrees of freedom.
- If the chi-square statistic exceeds the critical value, the null hypothesis that the observed and expected frequencies are the same is rejected, indicating a poor fit between the observed and expected distributions.
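The (O - E)^2 / E calculation can be sketched as follows; the observed counts are hypothetical, and 5.991 is the standard chi-square critical value for df = 2 at alpha = 0.05:

```python
# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
# The observed counts below are hypothetical; the null hypothesis
# expects equal frequencies across the three categories.
observed = [18, 25, 17]
expected = [20, 20, 20]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 2))  # 1.9

# Critical value for df = 3 - 1 = 2 at alpha = 0.05 is 5.991.
reject_null = chi_square > 5.991
print(reject_null)  # False: the observed data fit the expected pattern
```

Because 1.9 does not exceed 5.991, the null hypothesis is retained: the fit is good.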
Null hypothesis for a chi-square goodness of fit test (Ken Plummer)
The document discusses how to write a null hypothesis for a chi-square goodness of fit test. It provides an example of a poll that surveyed voters in Connecticut on their party affiliation (Republican or Democrat). The expected distribution was 40% Republican and 60% Democrat. The null hypothesis is stated as: Republican and Democrat party affiliations occur with probabilities of 0.4 and 0.6, respectively, in Connecticut.
This document describes how to conduct a chi-square goodness of fit test. The test involves:
1) Stating the null and alternative hypotheses. The null hypothesis specifies the expected probabilities, while the alternative is that at least one expected probability is incorrect.
2) Developing an analysis plan specifying the significance level and test to be used.
3) Analyzing sample data to calculate degrees of freedom, expected frequencies, the test statistic, and p-value.
4) Interpreting the results by comparing the p-value to the significance level and rejecting or failing to reject the null hypothesis. An example problem demonstrates applying the test to determine if observed outcomes match a casino's claimed probabilities.
The document discusses the chi-square test, which is used to determine if an observed frequency distribution differs from an expected theoretical distribution. It can be used as a test of independence to determine if two variables are associated, and as a test of goodness of fit to assess how well an expected distribution fits observed data. The steps of the chi-square test are outlined, including calculating the test statistic, determining degrees of freedom, and comparing the statistic to critical values to determine if the null hypothesis can be rejected. An example of a chi-square test of independence is shown to test if perceptions of fairness of performance evaluation methods are independent of each other.
This document provides information about statistical tests and data analysis presented by Dr. Muhammedirfan H. Momin. It discusses the different types of statistical data, such as qualitative vs quantitative and continuous vs discrete data. It also covers topics like sample data sets, frequency distributions, risk factors for diseases, hypothesis testing, and tests for comparing proportions and means. Specific statistical tests discussed include the z-test and how to calculate test statistics and compare them to critical values to determine statistical significance. Examples are provided to illustrate how to perform these tests to analyze differences between data sets.
Here are the steps to solve this problem:
1. State the hypotheses:
H0: μ = 100
H1: μ ≠ 100
2. The critical values are ±1.96 (two-tailed test, α = 0.05)
3. Compute the test statistic:
z = (140 - 100)/(15/√40) ≈ 40/2.37 ≈ 16.87
4. The test statistic is in the critical region, so reject the null hypothesis.
5. There is strong evidence that the medication affected intelligence since the sample mean is much higher than the population mean.
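As a sketch, the z statistic for the stated values (population mean 100, SD 15, n = 40, sample mean 140) can be recomputed directly:

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked problem above.
mu, sigma, n, xbar = 100, 15, 40, 140

# One-sample z statistic: (sample mean - population mean) / (sigma / sqrt(n)).
z = (xbar - mu) / (sigma / sqrt(n))
print(round(z, 2))

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(p_value < 0.05)  # True -> reject H0 at alpha = 0.05
```

The statistic is far beyond the ±1.96 critical values, so the null hypothesis is rejected, consistent with the conclusion above.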
This document provides an overview of hypothesis testing including:
- Defining null and alternative hypotheses
- Types of errors like Type I and Type II
- Test statistics and significance levels for comparing means, proportions, and standard deviations of one and two populations
- Examples are given for hypothesis tests on population means, proportions, and comparing two population means.
This document provides an overview of hypothesis testing in inferential statistics. It defines a hypothesis as a statement or assumption about relationships between variables or tentative explanations for events. There are two main types of hypotheses: the null hypothesis (H0), which is the default position that is tested, and the alternative hypothesis (Ha or H1). Steps in hypothesis testing include establishing the null and alternative hypotheses, selecting a suitable test of significance or test statistic based on sample characteristics, formulating a decision rule to either accept or reject the null hypothesis based on where the test statistic value falls, and understanding the potential for errors. Key criteria for constructing hypotheses and selecting appropriate statistical tests are also outlined.
The t-test is used to compare the means of two groups and has three main applications:
1) Compare a sample mean to a population mean.
2) Compare the means of two independent samples.
3) Compare the values of one sample at two different time points.
There are two main types: the independent-measures t-test for samples not matched, and the matched-pair t-test for samples in pairs. The t-test assumes normal distributions and equal variances between groups. Examples are provided to demonstrate hypothesis testing for each application.
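The first application (comparing a sample mean to a population mean) can be sketched as follows; the sample values and the hypothesized mean of 50 are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and a hypothesized population mean of 50.
sample = [52, 48, 55, 51, 49, 53, 54, 50]
mu0 = 50

# One-sample t statistic: (sample mean - mu0) / (s / sqrt(n)).
n = len(sample)
t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
df = n - 1
print(round(t, 3), df)

# Compare |t| with the critical value from a t-table
# (two-tailed, alpha = 0.05, df = 7: about 2.365).
print(abs(t) > 2.365)  # False -> fail to reject H0
```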
The hypothesis is usually considered the principal instrument in research and quality control. Its main function is to suggest new experiments and observations; in fact, many experiments are carried out with the deliberate object of testing hypotheses. Decision makers often face situations in which they are interested in testing hypotheses on the basis of available information and then taking decisions on the basis of such testing. In Six Sigma methodology, hypothesis testing is a tool of substance used in the analysis phase of a Six Sigma project so that improvement can be made in the right direction.
This document discusses methods, content, pedagogy, and partnerships related to geoscience education. It outlines the scientific method, examines what is covered in K-12 geoscience standards, and explores formative assessment and collaboration as proven teaching practices. It emphasizes that field experiences for teachers and cross-disciplinary partnerships are important for sustainable education reform.
1) The document discusses modifying Fick's law of diffusion using fractional derivatives to account for memory effects not captured by ordinary derivatives.
2) A scaling similarity approach is used to reduce the fractional PDE to an ODE to find an analytical solution.
3) The solution obtained is a function involving parameters determined from the invariance conditions imposed during the similarity transformation to maintain the form of the original PDE.
On generating functions of biorthogonal polynomials (eSAT Journals)
Abstract: In this paper, we have obtained some novel generating functions (both bilateral and mixed trilateral) involving modified biorthogonal polynomials Y_n^(α)(x; k) by a group-theoretic method. As particular cases, we obtain the corresponding results on generalized Laguerre polynomials. Keywords: AMS-2000 Classification Codes 33C45, 33C47; biorthogonal polynomials; Laguerre polynomials; generating functions.
Module1 flexibility-2-problems - Rajesh sir (SHAMJITH KM)
This document discusses the flexibility method for analyzing structures. It provides the definitions of flexibility and stiffness influence coefficients and describes how to develop flexibility matrices for truss, beam, and frame elements using the physical and energy approaches. It then shows how to assemble the total flexibility matrix of a structure and use it to analyze simple structures like plane trusses, continuous beams, and plane frames. The document includes an example problem of a two-member structure to illustrate the flexibility method steps, such as determining static indeterminacy, developing member and system flexibility matrices, evaluating joint displacements and member end actions.
Using Semantics of Textbook Highlights to Predict Student Comprehension and K... (Sergey Sosnovsky)
The document presents a framework for using the semantics of student textbook highlights to predict comprehension and knowledge retention. It uses semantic embeddings to encode highlighted sentences, compares them to questions, and uses the match scores in a model. It finds that augmenting a baseline model with highlighting features improves predictions of question accuracy, especially for held-out students. A semantic encoding of highlights performed better than a positional encoding. The approach works well across different levels of conceptual difficulty as defined by Bloom's taxonomy.
This document provides a student support material for Class XII Biology prepared by Kendriya Vidyalaya Sangathan, New Delhi. It aims to support students in their exam preparation and revision. The material includes lessons in point form, mind maps, flowcharts, diagrams, crossword puzzles, sample test questions from each chapter, and previous year board exam question papers. It is meant to supplement, not replace, the NCERT textbook. A team of experienced biology teachers coordinated by the Additional Commissioner (Academics) of KVS developed this material keeping the CBSE curriculum and question paper format in mind. The Commissioner of KVS hopes this material will help students perform well in their exams.
This document discusses the direct stiffness method for structural analysis. It begins by introducing the direct stiffness method and its key aspects, including using member stiffness matrices to express actions and displacements at both ends of each member. It then provides examples of applying the direct stiffness method to analyze a plane truss member and plane frame member. This involves deriving the member stiffness matrices in local coordinates, and transforming displacement, load, and stiffness matrices between local and global coordinate systems using rotation matrices.
Root Locus: what root locus is and the concept behind it; the angle and magnitude conditions; rules for constructing the root locus, including its symmetry, its starting and termination points, the angle of asymptotes, the centroid, breakaway points, and the angles of departure and arrival; general steps to draw the root locus; and problems on root locus.
This document contains examples and explanations of key concepts in statistics and probability such as hypothesis testing, significance levels, test statistics, rejection regions, and one-tailed and two-tailed tests. It provides sample problems and questions for students to practice these statistical techniques, including one asking students to interpret the conclusions that can be drawn from a given hypothesis test at the 0.01 level of significance.
The importance of exploring the effect of individual behaviour change techniq... (Tracy Epton)
A recent taxonomy has identified 93 different behaviour change techniques (BCTs). However, the taxonomy does not indicate which of these BCTs are effective. Moreover, as many behaviour change interventions include multiple BCTs, it is difficult to determine which BCT is the 'active ingredient' in successful interventions. One way of identifying effective BCTs is to locate all studies that compare an intervention that uses a chosen BCT with a comparison condition that is identical apart from the BCT of interest. Using meta-analysis it is possible to (a) quantify the effect of individual BCTs, (b) identify the circumstances under which the BCT is most effective, and (c) identify for whom the BCT is most effective. In addition to these practical issues, this method also allows theory relating to the BCT to be tested and highlights gaps in the literature. The example of the BCT of goal setting will be used to illustrate this process.
This document outlines a blended learning plan for a Grade 11 Math 2 class covering statistics and probability topics related to the normal distribution over one week. The plan includes anticipatory sets, instructional inputs, modeling examples, checking for understanding activities, and independent practice to illustrate a normal random variable and its characteristics, identify regions under the normal curve corresponding to different standard normal values, and convert between normal and standard normal variables. Daily activities include both synchronous in-person and asynchronous online components.
1. Experimental design refers to how experiments are structured in order to ensure validity and reliability of results.
2. There are several types of experimental designs including true experimental, quasi-experimental, pre-experimental, ex post facto, and factorial designs.
3. True experimental designs use random assignment and control/experimental groups to establish causation. Quasi-experimental designs lack random assignment so can only suggest relationships between variables. Ex post facto designs study pre-existing groups and cannot prove causation. Factorial designs study effects of multiple independent variables.
About testing the hypothesis of equality of two Bernoulli (Alexander Decker)
This document discusses testing the hypothesis of equality between two Bernoulli regression curves based on two independent samples. It presents a criterion for testing both simple and composite hypotheses about the equality of two regression functions. It establishes the limiting distribution of a statistic measuring the integral square deviation between two kernel-type estimators of the regression functions. It also investigates the consistency and asymptotic power of the test against some close alternatives.
The document is a lecture on partial fraction decomposition. It begins by defining partial fraction decomposition as expressing a rational function as a sum of simpler fractions. It then provides examples of decomposing fractions with non-repeated linear factors, a repeated linear factor, and a fraction with a quadratic factor. The examples show setting up the partial fraction decomposition equation and solving for the coefficients by considering the zeros of the factors.
These slide discuss the extending of the concept of correlation and show it can be used in prediction. The statistical test used is called regression. This is the process of using one variable to predict another when the two are correlated.
The document discusses identifying covariates when examining the effect of independent variables on dependent variables. It provides two examples: examining the effect of gender on handwriting scores while controlling for age, and examining the effect of listening to country music on truck driver drowsiness while controlling for years of trucking experience. A covariate is a variable that can affect the dependent variable, so it must be controlled for to isolate the effect of the independent variable.
The document discusses how covariates, or additional factors, can be used to better understand the effect of an independent variable on a dependent variable. It provides an example where handwriting neatness scores are analyzed by gender, but then age is added as a covariate to see if it influences the relationship between gender and scores. Controlling for the covariate of age through statistical methods allows researchers to determine if gender truly has an independent effect on scores or if age is a confounding factor.
Chi square test of independence (conceptual) (CTLTLA)
This document discusses using the chi square test of independence to determine if two variables are independent of each other, such as whether college admissions decisions are made independently of an applicant's majority/minority status. Questions of independence are questions about bias or relationships between variables. If admissions are independent of status, the proportions of admitted majority and minority students will mirror the proportions in the local population; failure to be independent would indicate bias in admissions.
The document discusses central tendency and skewness. In Demo #1, it explains that the median is the best measure of central tendency for a positively skewed distribution because it is not influenced by outliers. In Demo #2, it states the mode is best for a multimodal distribution because it indicates the most frequent values. Demo #3 explains that if the mean is lower than the median, the distribution is negatively skewed.
Null hypothesis for Pearson correlation (conceptual) (CTLTLA)
The document discusses the null hypothesis for a Pearson correlation. It explains that the null hypothesis states that there is no statistically significant relationship between the independent and dependent variables. It provides examples of writing the null hypothesis for different problems, such as determining the relationship between student ACT scores and GPAs, and between depression scores and a sense of belonging. The null hypothesis would state that there is no significant relationship between the two variables in each case.
Null hypothesis for point biserial (conceptual) (CTLTLA)
The document discusses null hypotheses for point-biserial correlations. It states that with hypothesis testing, a null hypothesis of no effect or relationship is set up. A point-biserial correlation can statistically test the relationship between a dichotomous variable and a continuous variable. It provides a template for writing a null hypothesis as "There is no statistically significant relationship between [variable 1] and [variable 2]." Two examples of null hypotheses are given for relationships between height and college graduation rates, and head circumference and political affiliation.
The document discusses the Pearson Product Moment Correlation, which measures the strength and direction of the linear relationship between two continuous variables. A correlation of +1 means a perfect positive relationship, -1 means a perfect negative relationship, and 0 means no relationship. The document provides examples of how different correlation values would appear when the variables are rank ordered from highest to lowest values.
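A minimal sketch of the Pearson r computation, using hypothetical data that illustrate the +1 and -1 extremes described above:

```python
from math import sqrt

# Hypothetical pairs of continuous variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]       # perfect positive linear relationship
y_neg = [10, 8, 6, 4, 2]   # perfect negative linear relationship

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = sqrt(sum((ai - ma) ** 2 for ai in a))
    sb = sqrt(sum((bi - mb) ** 2 for bi in b))
    return cov / (sa * sb)

print(pearson_r(x, y))      # 1.0
print(pearson_r(x, y_neg))  # -1.0
```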
This document describes how to perform a one-sample z-test to compare a sample proportion to a population proportion. It provides an example where a survey claims 90% of doctors recommend aspirin, and a sample of 100 doctors found 82% recommend aspirin. It outlines calculating the z-statistic to determine if this difference is statistically significant using a 95% confidence level. The z-statistic is calculated to be -1.08, which falls within the acceptable range so the null hypothesis that the population and sample proportions are the same is retained.
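A generic sketch of the one-sample proportion z-test described above; the proportions and sample size below are hypothetical rather than the survey's figures:

```python
from math import sqrt

# Hypothetical numbers: claimed population proportion 0.50,
# observed sample proportion 0.60 in a sample of 200.
p0, p_hat, n = 0.50, 0.60, 200

# Standard error of the proportion under the null hypothesis.
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
print(round(z, 2))  # ~2.83

# Two-tailed decision at the 95% confidence level.
reject = abs(z) > 1.96
print(reject)  # True -> the difference is statistically significant
```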
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum (MJDuyan)
(EPP TLE) (Lesson 1) - Prelims
Discuss the EPP Curriculum in the Philippines:
- Understand the goals and objectives of the Edukasyong Pantahanan at Pangkabuhayan (EPP) curriculum, recognizing its importance in fostering practical life skills and values among students. Students will also be able to identify the key components and subjects covered, such as agriculture, home economics, industrial arts, and information and communication technology.
Explain the Nature and Scope of an Entrepreneur:
-Define entrepreneurship, distinguishing it from general business activities by emphasizing its focus on innovation, risk-taking, and value creation. Students will describe the characteristics and traits of successful entrepreneurs, including their roles and responsibilities, and discuss the broader economic and social impacts of entrepreneurial activities on both local and global scales.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and... (PECB)
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
Walmart Business+ and Spark Good for Nonprofits.pdf (TechSoup)
Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that offers discounts and also streamlines nonprofits' order and expense tracking, saving time and money.
The webinar may also give some examples of how nonprofits can best leverage Walmart Business+.
The event will cover the following:
Walmart Business+ (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a 'Spend Analytics' feature, special discounts, deals, and tax-exempt shopping.
A special TechSoup offer for a free 180-day membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!
How to Make a Field Mandatory in Odoo 17 (Celine George)
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
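A sketch of the two approaches the summary describes, assuming a standard Odoo 17 model; the inherited model and the `x_nickname` field are hypothetical, and the snippet is not runnable outside an Odoo installation:

```python
# Hypothetical example based on the summary above (assumes Odoo 17).
from odoo import fields, models

class ResPartner(models.Model):
    _inherit = "res.partner"

    # required=True in Python makes the field mandatory in ALL views
    # where it appears.
    x_nickname = fields.Char(string="Nickname", required=True)
```

To make the field mandatory only in one particular view, set the attribute in that view's XML instead: `<field name="x_nickname" required="1"/>`.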
Gender and Mental Health - Counselling and Family Therapy Applications and In... (PsychoTech Services)
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
3. Question of Goodness of Fit
Questions of goodness of fit have become
increasingly important in modern statistics.
4. Question of Goodness of Fit
Questions of goodness of fit juxtapose complex
observed patterns against hypothesized or
previously observed patterns to test overall and
specific differences among them.
9. If the difference is small then the FIT IS GOOD.
For example:
Observed       Hypothesized   Difference
51% Females    50% Females    1%
14. If the difference is BIG then the FIT IS NOT GOOD.
For example:
Observed       Hypothesized   Difference
50% Females    22% Females    28%
16. Here is an example:
We want to know if a sample we have selected
matches the national percentage of a certain
ethnic group.
Observed: 2% of the sample is made up of members of this ethnic group.
Hypothesized: 10% of the population is made up of this ethnic group.
Difference: 8%
18. You will use certain statistical methods
to determine if the goodness of fit is
significant or not.
Here is an example:
Problem – The chair of a statistics department
suspects that some of her faculty are more
popular with students than others.
21. There are three sections of introductory stats
that are taught at the same time in the morning
by Professors Cauforek, Kerr, and Rector.
66 students are planning on enrolling in one of
the three classes.
23. What would you expect the number of enrollees
to be in each class if popularity were not an
issue?
Professor Cauforek   Professor Kerr   Professor Rector
22                   22               22
These are our expected values.
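Under the null hypothesis of equal popularity, each class expects an equal share of the 66 students. A minimal sketch of that expectation (variable names are my own):

```python
# Under the null hypothesis, enrollment splits evenly across the sections.
total_students = 66
num_classes = 3

expected_per_class = total_students / num_classes
print(expected_per_class)  # 22.0
```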
27. Now let's see what was observed.
The number who enrolled in each class was:
Professor Cauforek   Professor Kerr   Professor Rector
31                   25               10
32. We will test the degree to which the observed
data (31, 25, 10) fits the expected
enrollments (22, 22, 22).
51. Here is the null-hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in three stats professorsโ classes.
52. Now we will compute the χ² value and compare
it with the critical χ² value.
• If the value exceeds the critical value, then
we will reject the null hypothesis.
• If the value DOES NOT exceed the critical
value, then we will fail to reject the null
hypothesis.
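The χ² statistic is the sum of (observed − expected)² / expected over the groups. A minimal sketch of that computation for this example:

```python
# Chi-square goodness of fit: sum of (O - E)^2 / E over the groups.
observed = [31, 25, 10]
expected = [22, 22, 22]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 1))  # 10.6
```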
85. As a contrasting example, note what the χ²
value would be if the observed and expected
values were more similar:
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   24   22   20
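Running the same sum of (O − E)² / E on these closer values gives a much smaller statistic (the deck does not state this value; it is computed here):

```python
# Same chi-square computation, but with observed values close to expected.
observed = [24, 22, 20]
expected = [22, 22, 22]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))  # 0.36
```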
93. So the moral of the story is that the closer the
expected and observed values are to one
another, the smaller the Chi-square value and the
greater the goodness of fit (as seen below).
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   24   22   20
χ² ≈ 0.4
96. On the other hand, the farther the expected and
observed values are from one another, the
larger the Chi-square value and the poorer the
goodness of fit (as seen below).
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   31   25   10
χ² = 10.6
99. Now we determine if a χ² of 10.6 exceeds the
critical χ².
100. To calculate the critical χ² we first must
determine the degrees of freedom as well as set
the probability level.
The probability or alpha level is the
probability of a Type I error we are willing to live
with (i.e., the probability of being wrong
when we reject the null hypothesis). Generally
this value is 0.05, which is like saying we are
willing to be wrong 5 out of 100 times
before we will reject the null hypothesis.
103. Degrees of freedom are calculated as the
number of groups minus 1.
(Three groups minus 1 = 2)
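For 2 degrees of freedom the χ² distribution has a closed-form CDF, 1 − e^(−x/2), so the critical value at a given alpha is −2·ln(alpha). A small sketch that checks the table lookup without any external library (this shortcut holds only for df = 2):

```python
import math

# For df = 2, the chi-square CDF is 1 - exp(-x/2),
# so the critical value at level alpha is -2 * ln(alpha).
alpha = 0.05
critical = -2 * math.log(alpha)
print(round(critical, 2))  # 5.99
```

This matches the 5.99 entry in the table at df = 2 and alpha = 0.050.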
104. We now have all of the information we need to
determine the critical χ².
We go to the Chi-Square Distribution Table and
locate the degrees of freedom.
And then we locate the probability or alpha level.
Where these two values intersect in the table we
find the critical χ².
df    0.100   0.050   0.025
1     2.71    3.84    5.02
2     4.61    5.99    7.38
3     6.25    7.82    9.35
4     7.78    9.49    11.14
5     9.24    11.07   12.83
6     10.64   12.59   14.45
7     12.02   14.07   16.10
8     13.36   15.51   17.54
9     14.68   16.92   19.20
…     …       …       …
112. Since the chi-square goodness of fit value (10.6)
exceeds the critical χ² (5.99), we will reject the
null hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in three stats professors' classes.
There actually is a significant difference.
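The whole test can be sketched end to end. For df = 2 the p-value (the survival function) has the closed form e^(−χ²/2), so no table or external library is needed (that shortcut is an assumption that holds only for df = 2):

```python
import math

observed = [31, 25, 10]
expected = [22, 22, 22]

# Chi-square statistic: sum of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 2 the chi-square survival function is exp(-x/2).
p_value = math.exp(-chi_sq / 2)

alpha = 0.05
print(round(chi_sq, 1))  # 10.6
print(p_value < alpha)   # True -> reject the null hypothesis
```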
117. In summary,
Questions of goodness of fit juxtapose observed
patterns against hypothesized patterns to test
overall and specific differences among them.
Observed   Hypothesized   Difference