Chi square test
1. Chi-Square Test
• A basic problem in genetics is deciding whether
experimentally determined data are close enough to
what is expected from theory (e.g. Mendel’s law of
segregation).
• For example, you cross two heterozygous
purple-flowered plants and the offspring are 72
purple-flowered and 26 white-flowered plants.
– Is this a 3/4 : 1/4 ratio as predicted by Mendel’s Law?
– Is it close enough?
– How do you decide how close is “close enough”?
– What about 70:30? or 60:40?
2. Goodness of Fit
• Mendel didn’t know how to solve this problem. Shortly
after the rediscovery of his work in 1900, Karl Pearson
developed the chi-square test for this purpose, and
R.A. Fisher later refined it.
• The chi-square test is a goodness of fit test: it answers
the question of how well experimental observations fit
expectations.
• We start with a null hypothesis: a hypothesis for how
many of each type of offspring are expected.
3. Formula
• Know how to use this formula!
• The “χ” is the Greek letter chi
• “∑” means to sum for all types of offspring.
• “obs” is the number of individuals observed
• “exp” is the number of individuals expected if
your hypothesis is true
• Note that you must use the number of
individuals, the counts, NOT proportions or
percentages.
$$\chi^2 = \sum \frac{(\mathrm{obs} - \mathrm{exp})^2}{\mathrm{exp}}$$
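The formula on this slide, chi-square = sum over all offspring types of (obs − exp)² / exp, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation. The function names `chi_square` and `expected_counts` are made up for this example, and the data are the 72 purple and 26 white flowers from slide 1 tested against a 3 : 1 ratio.

```python
def chi_square(observed, expected):
    """Sum of (obs - exp)^2 / exp over all offspring types (counts, not proportions)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(total, ratio):
    """Turn a predicted ratio such as (3, 1) into expected counts for `total` offspring."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


observed = [72, 26]                                 # purple, white (counts from slide 1)
expected = expected_counts(sum(observed), (3, 1))   # 73.5 purple, 24.5 white
flower_chi2 = chi_square(observed, expected)

print(expected, round(flower_chi2, 3))              # [73.5, 24.5] 0.122
```

Note that the expected counts do not have to be whole numbers, but both lists must hold counts, not percentages.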
4. What does the number mean?
• The Chi Square calculation just gives a number.
• What happens if observed equals expected?
• What happens if observed is very different from
expected?
• The smaller the number, the better your observations
fit your hypothesis. The larger the number, the less
believable your hypothesis is: something other than
chance may be causing the results you got.
• Could your explanation (your hypothesis) still be correct
and you just randomly got weird numbers?
• For example, if a couple has 7 girls, is our theory about
sex being determined by X and Y chromosomes still
correct and that is just an unusual couple, or is our
theory wrong?
5. What does the number mean? (continued)
• Could your explanation (your hypothesis) still be correct
and you just randomly got weird numbers?
• For example, if a couple has 8 children, how many
should be girls? Why?
• If a couple has all 8 girls, is the theory about sex being
determined by X and Y chromosomes still correct and
that is just an unusual result for a family, or is the theory
wrong?
• How do you decide if a result is just unusual or your
hypothesis is wrong?
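One way to put numbers on the family example is to assume each birth is independently a girl with probability 1/2 (a simplifying assumption, not something these slides state) and compute how often 8 girls out of 8 would then occur. The sketch below uses exact fractions. The variable names are chosen for this illustration only.

```python
from fractions import Fraction
from math import comb

p_girl = Fraction(1, 2)   # assumption: each birth is a girl with probability 1/2
children = 8

expected_girls = children * p_girl      # 4 girls expected on average
p_all_girls = p_girl ** children        # chance that all 8 are girls

# Probability of each possible number of girls (binomial distribution)
distribution = {k: comb(children, k) * p_girl ** children
                for k in range(children + 1)}

print(expected_girls, p_all_girls, float(p_all_girls))   # 4 1/256 0.00390625
```

Under this assumption an all-girl family of 8 is unusual (about 1 in 256) but not impossible, which is the point of the question.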
6. The Critical Question
• The simple answer is: you can never tell for certain that a
hypothesis is “wrong”, that the result you got was
completely impossible based on the theory you used.
• All we can do is determine whether a result is likely or
unlikely.
• Key point: There are 2 ways of getting a high chi-square
value: an unusual result from the correct theory, or a result
from the wrong theory.
• Example: imagine tossing a coin 100 times. What number
of heads and tails do you expect?
• Is it possible to get 80 heads and 20 tails?
• Is it likely?
• How likely?
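The coin example can be worked out directly. The sketch below assumes a fair coin and independent tosses. It computes the exact probability of 80 or more heads in 100 tosses, and the chi-square value for 80 heads and 20 tails against the expected 50 : 50. The variable names are illustrative only.

```python
from math import comb

tosses, heads = 100, 80

# Exact probability of 80 or more heads in 100 tosses of a fair coin
p_at_least_80 = sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

# Chi-square value for the same result: expected 50 heads and 50 tails
observed = [heads, tosses - heads]
expected = [tosses / 2, tosses / 2]
coin_chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(p_at_least_80, coin_chi2)
```

The result is possible but extremely unlikely, and the chi-square value of 36 is correspondingly large.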
7. Reasonable
• What counts as “likely” or “reasonable” depends on
what we decide.
• For most scientists, the cutoff is 5%: if a
difference at least as large as yours would happen
by chance less than 5% of the time when the
hypothesis is true, the hypothesis is rejected.
• If the difference between the observed results
and the expected results is small enough that it
would happen more than 5% of the time, we “fail
to reject” the null hypothesis.
• Statisticians say “fail to reject” instead of
“accept” (but that’s kind of stupid I think)
• 5% of the time is a probability value of p = 0.05
9. What Column and Row?
• Our choice of p = 0.05 determines which
column we use in the chi-square table
• How do we decide which row to use?
10. Degrees of Freedom
• A critical factor in using the chi-square test is the
degrees of freedom, which is the number of
categories whose counts are free to vary
independently.
• For example, if you have 20 offspring in total,
some male and some female, and I tell you that 12
of them are male, are you free to choose how many
are female, or do you already know?
• Degrees of freedom is simply the number of
types of offspring minus 1.
11. Critical Chi-Square Value
• The critical value for a chi-square test is
the cutoff level:
– if your calculated chi-square value is greater
than the critical value from the table, you
“reject the null hypothesis”.
– if your chi-square value is less than the critical
value, you “fail to reject” the hypothesis (that
is, your data are consistent with your explanation
of how the trait is inherited).
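Putting the pieces together, the whole procedure (expected counts, chi-square value, degrees of freedom, table lookup, decision) can be sketched as follows. The critical values are the standard chi-square table entries for p = 0.05, rounded to three decimals. The names `CRITICAL_05` and `chi_square_decision` are hypothetical, and the three data sets are the 72 : 26, 70 : 30 and 60 : 40 results asked about on slide 1, each tested against a 3 : 1 ratio.

```python
# Critical chi-square values for p = 0.05, keyed by degrees of freedom
# (standard table values, rounded to three decimals).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_decision(observed, ratio):
    """Test observed counts against a predicted ratio at p = 0.05."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1            # types of offspring minus 1
    critical = CRITICAL_05[df]        # row = df, column = p of 0.05
    verdict = "reject" if chi2 > critical else "fail to reject"
    return chi2, df, critical, verdict


for counts in ([72, 26], [70, 30], [60, 40]):
    chi2, df, critical, verdict = chi_square_decision(counts, (3, 1))
    print(counts, round(chi2, 3), df, critical, verdict)
```

With one degree of freedom the cutoff is 3.841, so 72 : 26 (0.122) and 70 : 30 (1.333) are consistent with a 3 : 1 ratio, while 60 : 40 (12.0) is not.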