2. • The χ2 (chi-square) test is a statistical test that measures the association
between two categorical variables.
• Most commonly applied to questionnaire data
from a survey
– Responses are categories such as yes/no [nominal or ordinal]
• If the observed χ2 test statistic is greater than the
critical value from the table, the null hypothesis (H0) is
rejected
– χ2 > critical value at the given df and level of significance;
H0 rejected
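For reference, the test statistic compared against the critical value is conventionally defined as below. The symbols O_i (observed count in cell i) and E_i (expected count in cell i under H0) are notation assumed here, not taken from the slides.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```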
4. Example
• Suppose you observe 100 people to see who
deposits garbage in the can and who litters. You
want to see whether there is a difference between
genders.
• A person can fall under one of the following 4
categories:
– Male, deposits garbage
– Male, litters
– Female, deposits garbage
– Female, litters
          Deposit   Litter
Female         18        7
Male           42       33
7. We can put this nominal data in a 2×2
contingency table (observed counts):
          Deposit   Litter   Total
Female         18        7      25
Male           42       33      75
Total          60       40     100
If the null hypothesis were true, there would be no difference between genders.
So the expected outcome would be
(each expected count = row total × column total ÷ grand total):
          Deposit   Litter   Total
Female         15       10      25
Male           45       30      75
Total          60       40     100
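The expected counts above can be reproduced from the marginal totals alone. The following is a minimal sketch in plain Python; the function name `expected_counts` and the dictionary layout are illustrative choices, not part of the slides.

```python
# Observed counts from the slide: rows = gender, columns = behaviour.
observed = {
    "Female": {"Deposit": 18, "Litter": 7},
    "Male": {"Deposit": 42, "Litter": 33},
}


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {
            col: row_totals[row] * col_totals[col] / grand_total
            for col in col_totals
        }
        for row in table
    }


expected = expected_counts(observed)
for row, cells in expected.items():
    print(row, cells)
```

For example, the Female/Deposit cell is 25 × 60 ÷ 100 = 15, matching the expected table.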
9. Degrees of Freedom [df]
• df = (C − 1)(R − 1)
– C = number of columns
– R = number of rows
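Applied to the 2×2 gender-by-behaviour table from the example (C = 2, R = 2, not counting the Total row and column), the formula works out as:

```latex
\mathrm{df} = (C - 1)(R - 1) = (2 - 1)(2 - 1) = 1
```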
11. • H0 is rejected when the chi-square statistic > critical value
But in our observation,
• Chi-square statistic (2) < critical value (3.841, for df = 1 at the 0.05 level)
– It came out the other way around!
• The null hypothesis is NOT rejected.
– Meaning?
12. We retain the null hypothesis:
• Littering or depositing behavior is
independent of gender
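The statistic of 2 and the decision to retain H0 can be checked with a short calculation. This is a sketch in plain Python under the assumption that the statistic is the usual sum of (observed − expected)² ÷ expected over all cells; the helper name `chi_square_statistic` is illustrative, and the critical value 3.841 is the one quoted on the slide.

```python
# Rows: Female, Male; columns: Deposit, Litter.
observed = [[18, 7], [42, 33]]
expected = [[15, 10], [45, 30]]  # expected counts under H0, from the slide


def chi_square_statistic(obs, exp):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, exp)
        for o, e in zip(obs_row, exp_row)
    )


CRITICAL_VALUE = 3.841  # df = 1 at the 0.05 level, as quoted on the slide

statistic = chi_square_statistic(observed, expected)
reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.3f}, reject H0: {reject_h0}")
# prints: chi-square = 2.000, reject H0: False
```

The four cell contributions are 9/15, 9/10, 9/45 and 9/30, which sum to 2.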
13. • Do you see how to work out the expected outcome?
– Toss a coin 100 times. Observed outcome: H-60, T-40
• What was the expected outcome?
• What is the chi-squared statistic?
• Is the null hypothesis rejected?
• The further the observed values are from
the expected values, the greater the chance of the null
hypothesis being rejected
– Do you agree? How?
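One way to work through the coin questions is sketched below in plain Python. It assumes H0 is a fair coin (so each face is expected in half the tosses), that df for a single categorical variable is the number of categories minus 1, and that the 0.05 level is used, so the critical value is 3.841 as in the earlier example. None of these choices is stated on the slide.

```python
observed = {"H": 60, "T": 40}
n = sum(observed.values())

# Under H0 (fair coin, assumed here) each face is expected in half the tosses.
expected = {face: n / 2 for face in observed}

# Chi-square statistic: sum of (O - E)^2 / E over both faces.
statistic = sum(
    (observed[face] - expected[face]) ** 2 / expected[face] for face in observed
)

df = len(observed) - 1      # categories minus 1 (assumption for this test)
CRITICAL_VALUE = 3.841      # df = 1 at the 0.05 level

reject_h0 = statistic > CRITICAL_VALUE
print(f"expected = {expected}, chi-square = {statistic}, reject H0: {reject_h0}")
```

Here the observed counts sit 10 away from the expected 50 on each side, giving 100/50 + 100/50 = 4, which is above 3.841. With a 55/45 split the same calculation gives 1, which is not.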
14. Key Assumptions in χ2
• Each individual appears in the table only once
• The result for each individual is independent
of all other individuals
• At least 80% of the expected values in the table
should be greater than 5.
• This test is valid only when you have a
reasonable sample size
15. Key Assumptions in χ2
• For a 2×2 table [only 2 categories in each variable]
– χ2 test can be used when the total sample size is > 40
– If the total sample size is 20-40 and the smallest
expected frequency is at least 5, the χ2 test can be used
– Otherwise Fisher’s exact test should be used [SPSS
will automatically give this]
• For all other tables:
– χ2 can be used if no more than 20% of the expected
frequencies are less than 5 and none is less than 1
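The rules of thumb above can be written down as a small decision helper. This is only a sketch: the function name `recommended_test` and its return labels are hypothetical, and the slide does not say what to do for larger tables that fail the rule, so that case is simply flagged.

```python
def recommended_test(expected, total_n):
    """Apply the slide's rules of thumb to a table of expected frequencies.

    expected: list of rows of expected counts (no totals row or column).
    total_n:  total sample size.
    """
    cells = [e for row in expected for e in row]
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)

    if is_2x2:
        if total_n > 40:
            return "chi-square"
        if 20 <= total_n <= 40 and min(cells) >= 5:
            return "chi-square"
        return "fisher-exact"

    # All other tables: at most 20% of cells below 5, and none below 1.
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    if share_below_5 <= 0.20 and min(cells) >= 1:
        return "chi-square"
    return "chi-square-not-advised"  # fallback label assumed, not from the slide


# The littering example: 2x2 table with 100 observations.
print(recommended_test([[15, 10], [45, 30]], total_n=100))
```

For the littering example the total sample size is 100, so the first 2×2 rule already allows the χ2 test.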