Chi squared test

KARL PEARSON (1857-1936)
British mathematician, ‘father’ of modern statistics and a pioneer of eugenics!

(Pearson’s) Chi-squared (χ²) test
• This test compares measurements relating to
the frequency of individuals in defined
categories e.g. the numbers of white and
purple flowers in a population of pea plants.
• Chi-squared is used to test if the observed
frequency fits the frequency you expected or
predicted.

How do we calculate the expected
frequency?
• You might expect the observed frequency of
your data to match a specific ratio. e.g. a 3:1
ratio of phenotypes in a genetic cross.
• Or you may predict a homogeneous distribution
of individuals in an environment. e.g. numbers
of daisies counted in quadrats on a field.
Note: In some cases you might expect the observed
frequencies to match the expected, in others you
might hope for a difference between them.
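The two ways of arriving at expected frequencies described above can be sketched in Python. This is only an illustration: the function names `expected_from_ratio` and `expected_uniform` and the totals used (400 offspring, 120 daisies in 4 quadrats) are hypothetical and do not come from the slides.

```python
def expected_from_ratio(total, ratio):
    """Split a total count according to a ratio such as (3, 1)."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def expected_uniform(total, n_categories):
    """Spread a total count evenly over n categories (homogeneous distribution)."""
    return [total / n_categories] * n_categories


# Hypothetical totals, chosen only for illustration.
print(expected_from_ratio(400, (3, 1)))  # 3:1 ratio of phenotypes
print(expected_uniform(120, 4))          # daisies spread evenly over 4 quadrats
```

In both cases the expected frequencies add up to the same total as the observed counts, which is what allows the two sets of numbers to be compared.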

Example 1: GENETICS
Comparing the observed frequency of
different types of maize grains with the
expected ratio calculated using a
Punnett square.

The photo shows four different phenotypes for maize grain,
as follows:
Purple & Smooth (A), Purple & Shrunken (B), Yellow &
Smooth (C) and Yellow & Shrunken (D)

The Punnett square below shows the
expected ratio of phenotypes from crosses of
four genotypes of maize.

Gametes   PS     Ps     pS     ps
PS        PPSS   PPSs   PpSS   PpSs
Ps        PPSs   PPss   PpSs   Ppss
pS        PpSS   PpSs   ppSS   ppSs
ps        PpSs   Ppss   ppSs   ppss

A : B : C : D = 9 : 3 : 3 : 1

What is the null hypothesis (H0)?
H0 = there is no statistically significant difference between the observed frequency of maize grains and the expected frequency (the 9:3:3:1 ratio).
HA = there is a significant difference between the observed frequency of maize grains and the expected frequency.
If the value for χ² exceeds the critical value (P = 0.05), then you can reject the null hypothesis.

Calculating χ²
$\chi^2 = \sum \frac{(O - E)^2}{E}$
O = the observed results
E = the expected (or predicted) results
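As a minimal sketch, the formula can be written as a short Python function. The name `chi_squared` and the two-category counts in the usage line are hypothetical, chosen only to show the call.

```python
def chi_squared(observed, expected):
    """Return the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for two categories: (10^2)/50 + (10^2)/50
print(chi_squared([60, 40], [50, 50]))
```

Each category contributes its own (O - E)² / E term, so a large gap in a category with a small expected count adds more to χ² than the same gap in a category with a large expected count.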

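Phenotype   O     E (9:3:3:1)   O-E      (O-E)²    (O-E)²/E
A           271   243.56        27.44    752.82    3.09
B           73    81.19         -8.19    67.04     0.83
C           63    81.19         -18.19   330.79    4.07
D           26    27.06         -1.06    1.13      0.04
Σ           433   433                              χ² = 8.03
Expected values are left unrounded (e.g. 433 × 9/16 = 243.56). Rounding them to whole numbers first would lower χ² to about 7.81.

Compare your calculated value of χ² with the critical value
in your stats table
Our value of χ² = 8.03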
Degrees of freedom = no. of categories - 1 = 3
D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07
Our value for χ² exceeds the critical value, so we can reject the null hypothesis.
There is a significant difference between our expected and observed ratios, i.e. they are a poor fit.
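The maize comparison can be reproduced in a few lines of Python. This is a sketch rather than part of the original slides: the variable names are made up, and the expected counts are kept unrounded, so the value it prints may differ slightly from a hand-worked table that uses rounded figures.

```python
observed = {"A": 271, "B": 73, "C": 63, "D": 26}  # counts from the table
ratio = {"A": 9, "B": 3, "C": 3, "D": 1}          # 9:3:3:1 from the Punnett square

total = sum(observed.values())
parts = sum(ratio.values())
expected = {k: total * ratio[k] / parts for k in observed}  # unrounded

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Critical values at P = 0.05, copied from the table in the slides.
critical_values = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}
critical = critical_values[df]

print(f"chi2 = {chi2:.2f}, df = {df}, critical = {critical}")
print("reject H0" if chi2 > critical else "fail to reject H0")
```

The margin over the critical value is small in this example, which is why the rounding of the expected counts matters here.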

Example 2: ECOLOGY
• One section of a river was trawled and four
species of fish counted and frequencies
recorded.
• The expected frequency is equal numbers of each of the
four fish species in the sample.

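H0 = there is no statistically significant difference between the observed frequency of fish species and the expected frequency.
HA = there is a significant difference between the observed frequency of fish and the expected frequency.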

Species   O    E    O-E   (O-E)²   (O-E)²/E
Rudd      15   10   5     25       2.5
Roach     15   10   5     25       2.5
Dace      4    10   -6    36       3.6
Bream     6    10   -4    16       1.6
Σ         40   40                  χ² = 10.2

Compare your calculated value of χ² with the critical value
in your table of critical values.
Our value of χ² = 10.2
Degrees of freedom = no. of categories - 1 = 3
D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07
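Our value for χ² exceeds the critical value, so we can reject the null hypothesis.
There is a significant difference between our expected and observed frequencies of fish species.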
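The fish example can be checked in the same way. The sketch below (variable names are illustrative) also prints each species' contribution to χ², which shows where the departure from equal numbers comes from.

```python
observed = {"Rudd": 15, "Roach": 15, "Dace": 4, "Bream": 6}  # counts from the table

total = sum(observed.values())
expected_each = total / len(observed)  # equal numbers of each species

# (O - E)^2 / E for every species
contributions = {
    species: (count - expected_each) ** 2 / expected_each
    for species, count in observed.items()
}
chi2 = sum(contributions.values())

for species, value in contributions.items():
    print(f"{species:6s} {value:.1f}")
print(f"chi2 = {chi2:.1f}")
```

Dace contributes the largest single term, because its count is furthest from the expected 10.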

Example 3: ECOLOGY
• Do 2 plant species A and B grow independently
of one another?
• Quadrats taken to see if each plant species is
present or absent
• The expected frequencies are calculated from the
row and column totals, assuming the two species grow independently.

Observed values
                      Species A
                      Present   Absent   Totals
Species B   Present   111       9        120
            Absent    71        43       114
            Totals    182       52       234

Expected Values
                      Species A
                      Present                   Absent                   Totals
Species B   Present   182/234*120 = 93.33       52/234*120 = 26.67       120
            Absent    182/234*114 = 88.67       52/234*114 = 25.33       114
            Totals    182                       52                       234

So…
• $\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

• Null hypothesis:
• If the plants grow independently of each
other, there should be no statistically
significant difference in the proportion of
quadrats containing species A when B is present
compared with when B is absent, and vice versa.
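The expected values for the two plant species can be worked out from the row and column totals, as in the following Python sketch. The slides stop at the expected values, so the χ² figure printed here is this script's own calculation rather than a number taken from the original; the variable names are illustrative.

```python
observed = [[111, 9],   # species B present: A present, A absent
            [71, 43]]   # species B absent:  A present, A absent

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

for row in expected:
    print([round(e, 2) for e in row])
print(f"chi2 = {chi2:.2f}")
```

With 2 rows and 2 columns there is 1 degree of freedom, so the result would be compared with the critical value of 3.84 at P = 0.05.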

Example 4: CONTINGENCY TABLES
You can use contingency tables to calculate
expected frequencies when the relationship
between two quantities is being investigated.
In this example we will look
at the incidence of colour
blindness in both males and
females.

H0 = there is no statistically significant difference between the observed frequency of colour blindness in males and females.
HA = there is a significant difference between the observed frequency of colour blindness in males and females.

Observed frequencies   Males   Females
Colour blind           56      14
Not colour blind       754     536

Expected Cell Frequency = (Row Total x Column Total) / n
e.g. the expected frequency for colour blind males = (56 + 14) x (56 + 754) / 1360 ≈ 42
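The expected cell frequency formula above translates directly into code. In this short sketch the function name `expected_cell` is hypothetical; the numbers are the colour blindness counts from the observed table.

```python
def expected_cell(row_total, column_total, n):
    """Expected count for one cell of a contingency table."""
    return row_total * column_total / n


colour_blind_total = 56 + 14   # row total: all colour blind individuals
male_total = 56 + 754          # column total: all males
n = 56 + 14 + 754 + 536        # everyone in the sample

print(round(expected_cell(colour_blind_total, male_total, n), 2))
```

The unrounded result is slightly below 42, which is the whole-number figure used in the worked example.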

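Observed:            Males   Females
• Colour blind       56      14
• Not colour blind   754     536
Expected:            Males   Females
• Colour blind       42      28
• Not colour blind   768     522
(O – E)² / E:        Males   Females
• Colour blind       4.667   7.000
• Not colour blind   0.255   0.375
$\chi^2 = \sum \frac{(O - E)^2}{E}$ = 4.667 + 7.000 + 0.255 + 0.375 ≈ 12.30

Compare your calculated value of χ² with the critical value
in your table of critical values
Our value of χ² = 12.30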
Degrees of freedom = (2 rows - 1) x (2 cols - 1) = 1
D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07
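Our value for χ² exceeds the critical value, so we can reject the null hypothesis.
There is a significant difference between the expected and observed frequencies.
The fraction of males with colour blindness is greater than that in females. The difference is unlikely to be due to chance alone.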
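The whole contingency-table calculation, including the degrees of freedom, can be wrapped in one function. This is a sketch under stated assumptions: `chi_squared_contingency` is a made-up name, and the expected counts are kept unrounded, so the χ² it prints comes out a little higher than a hand calculation that rounds the expected counts to whole numbers.

```python
def chi_squared_contingency(table):
    """Return (chi2, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # unrounded expected count
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


observed = [[56, 14],     # colour blind: males, females
            [754, 536]]   # not colour blind: males, females

chi2, df = chi_squared_contingency(observed)
males, females = (sum(col) for col in zip(*observed))

print(f"colour blind males:   {observed[0][0] / males:.1%}")
print(f"colour blind females: {observed[0][1] / females:.1%}")
print(f"chi2 = {chi2:.2f}, df = {df}")
```

Either way of handling the expected counts gives a χ² far above the critical value of 3.84 for 1 degree of freedom, so the conclusion does not depend on the rounding.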
