This document discusses data analysis and contingency tables. It presents a contingency table analyzing the relationship between gender and household poverty status using 2005 data. A chi-squared test is performed and reveals that gender and poverty status are not independent, meaning knowing one provides some ability to predict the other. However, the test does not indicate the strength of the relationship.
3. Hypotheses are “tested”
Hypotheses are never “proved”
Hypotheses are only “rejected”
Theories are built and verified by testing hypotheses
4.                     Truth
Decision               Ho true         Ho false
Fail to reject Ho
Reject Ho
5.                     Truth
Decision               Ho true         Ho false
Fail to reject Ho                      Error
Reject Ho              Error
6.                     Truth
Decision               Ho true         Ho false
Fail to reject Ho                      Type 2 error
Reject Ho              Type 1 error
7. Traditionally, the probability of a Type 1 error is set at .05
Minimize Type 2 error by increasing sample size

                       Truth
Decision               Ho true         Ho false
Fail to reject Ho                      Type 2 error
Reject Ho              Type 1 error
9. A contingency table is a table of counts.
A two-dimensional contingency table is
formed by classifying subjects by two
variables.
One variable identifies the row categories;
the other variable defines the column
categories.
The combinations of row and column
categories are called cells.
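As a minimal sketch of the definition above, the following Python snippet classifies a handful of subjects by two variables and counts how many fall in each cell. The subject records and the helper name `contingency_table` are made up for illustration; they are not taken from the slides.

```python
from collections import Counter

# Hypothetical subjects, each classified by two variables: (row category, column category).
subjects = [
    ("not in poverty", "male"),
    ("not in poverty", "female"),
    ("in poverty", "female"),
    ("not in poverty", "male"),
    ("in poverty", "male"),
    ("in poverty", "female"),
]


def contingency_table(pairs):
    """Count how many subjects fall in each (row, column) cell."""
    return Counter(pairs)


cells = contingency_table(subjects)
for (row, col), count in sorted(cells.items()):
    print(f"{row:15s} {col:7s} {count}")
```

Each key of `cells` is one cell of the table, and its value is the count for that combination of row and column categories.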
10.          Column 1    Column 2    Total
Row 1        R1; C1      R1; C2      R1 tot
Row 2        R2; C1      R2; C2      R2 tot
Total        C1 tot      C2 tot      Total
12.          Male        Female      Total
Row 1        R1; C1      R1; C2      R1 total
Row 2        R2; C1      R2; C2      R2 total
Total        C1 total    C2 total    Total
13.                              Male        Female      Total
2005 Household Not in Poverty    R1; C1      R1; C2      R1 total
2005 Household In Poverty        R2; C1      R2; C2      R2 total
Total                            C1 total    C2 total    Total
14.                              Male                    Female                    Total
2005 Household Not in Poverty    Males not in poverty    Females not in poverty    R1 total
2005 Household In Poverty        Males in poverty        Females in poverty        R2 total
Total                            C1 total                C2 total                  Total
15.                              Male                    Female                    Total
2005 Household Not in Poverty    Males not in poverty    Females not in poverty    No pov total
2005 Household In Poverty        Males in poverty        Females in poverty        Pov total
Total                            Male total              Female total              Total
16.                              Male                    Female                    Total
2005 Household Not in Poverty    Males not in poverty    Females not in poverty    No pov total
2005 Household In Poverty        Males in poverty        Females in poverty        Pov total
Total                            Male total              Female total              Total
The row totals and the column totals are the marginals
20.                              Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty          443      623     1,066
Total                            3,529    3,662     7,191
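The marginals in this table can be recomputed from the four cell counts. The sketch below does so in Python; the dictionary layout and the function name `marginals` are illustrative choices, while the counts are the ones shown on the slide.

```python
# Observed counts from the slide: rows are poverty status, columns are gender.
observed = {
    "not in poverty": {"male": 3086, "female": 3039},
    "in poverty": {"male": 443, "female": 623},
}


def marginals(table):
    """Return (row totals, column totals, grand total) for a nested-dict table."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total


row_totals, col_totals, grand_total = marginals(observed)
print(row_totals)
print(col_totals)
print(grand_total)
```

The results agree with the totals printed on the slide: 6,125 and 1,066 for the rows, 3,529 and 3,662 for the columns, and 7,191 overall.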
21.                              Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty          443      623     1,066
Total                            3,529    3,662     7,191
Is gender independent of household poverty status?
22.                              Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty          443      623     1,066
Total                            3,529    3,662     7,191
If you know a person’s gender, can you predict poverty status?
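One informal way to approach this question is to compare the share of each gender living in a household in poverty with the overall share. The sketch below is only an illustration using the counts from the table; the helper name `poverty_rate` is hypothetical. If gender carried no information about poverty status, the two gender-specific rates would be about the same as the overall rate.

```python
def poverty_rate(in_poverty, total):
    """Share of a group that lives in a household in poverty."""
    return in_poverty / total


# Counts taken from the table: in-poverty count and column (or grand) total.
male_rate = poverty_rate(443, 3529)
female_rate = poverty_rate(623, 3662)
overall_rate = poverty_rate(1066, 7191)

print(f"male: {male_rate:.1%}")
print(f"female: {female_rate:.1%}")
print(f"overall: {overall_rate:.1%}")
```

The rates differ by gender, which is what motivates the formal test of independence that follows.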
23.                              Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty          443      623     1,066
Total                            3,529    3,662     7,191
If you know a person’s poverty status, can you predict gender?
24.                              Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty          443      623     1,066
Total                            3,529    3,662     7,191
If rows and columns are independent, a cell value should be equal to (row total x column total) ÷ grand total
25.                              Male                       Female    Total
2005 Household Not in Poverty    3,086 (expected: 3,006)    3,039     6,125
2005 Household In Poverty          443                        623     1,066
Total                            3,529                      3,662     7,191
E.g., under independence the first cell should equal (6125 x 3529) ÷ 7191 ≈ 3006, but the observed count is 3086
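The same rule, (row total x column total) ÷ grand total, can be applied to every cell. The following Python sketch does this for the 2 x 2 table; the function name `expected_counts` and the list-of-lists layout are assumptions made for the example.

```python
# Observed counts: rows are (not in poverty, in poverty), columns are (male, female).
observed = [
    [3086, 3039],
    [443, 623],
]


def expected_counts(table):
    """Expected count per cell if rows and columns were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(value, 1) for value in row])
```

The first cell comes out at about 3,006, matching the expected value quoted on the slide.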
26.                              Male                       Female    Total
2005 Household Not in Poverty    3,086 (expected: 3,006)    3,039     6,125
2005 Household In Poverty          443                        623     1,066
Total                            3,529                      3,662     7,191
An expected cell count is a hypothetical count that would occur if there were no relationship between the two variables
27.                              Male                       Female    Total
2005 Household Not in Poverty    3,086 (expected: 3,006)    3,039     6,125
2005 Household In Poverty          443                        623     1,066
Total                            3,529                      3,662     7,191
The chi-squared value is the sum, over all cells, of the squared deviation (observed minus expected) divided by the expected value
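As a sketch of that calculation, the snippet below sums (observed minus expected) squared, divided by expected, over the four cells of the table. The function name `chi_squared` is an illustrative choice, and no continuity correction is applied, which is an assumption since the slides do not say whether one was used.

```python
# Observed counts: rows are (not in poverty, in poverty), columns are (male, female).
observed = [
    [3086, 3039],
    [443, 623],
]


def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


chi2 = chi_squared(observed)
print(round(chi2, 2))
```

For these counts the statistic is roughly 28.3.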
29. Null hypothesis is H0: R x C = 0 (rows and columns are independent)
Alternate hypothesis is H1: R x C ≠ 0 (rows and columns are not independent)
α = .05
Described as a test of independence
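A hedged sketch of the decision step at α = .05 follows. A 2 x 2 table has (2 − 1) x (2 − 1) = 1 degree of freedom, and for 1 degree of freedom the upper-tail chi-squared probability equals erfc(√(χ²/2)), so the standard library is enough. The function names are hypothetical, and the statistic is recomputed here without a continuity correction, which is an assumption.

```python
import math

ALPHA = 0.05  # Type 1 error probability used on the slide

# Observed counts: rows are (not in poverty, in poverty), columns are (male, female).
observed = [
    [3086, 3039],
    [443, 623],
]


def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / grand_total) ** 2
        / (row_totals[i] * col_totals[j] / grand_total)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def p_value_df1(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def reject_null(chi2, alpha=ALPHA):
    """True when the p-value falls below alpha, i.e. H0 is rejected."""
    return p_value_df1(chi2) < alpha


chi2 = chi_squared(observed)
print(f"chi-squared = {chi2:.2f}, p = {p_value_df1(chi2):.2g}, reject H0: {reject_null(chi2)}")
```

The statistic is far above 3.841, the tabulated critical value for 1 degree of freedom at α = .05, so the hypothesis of independence is rejected for these data.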
33. A test of the hypothesis that rows and columns in a table are independent
In our case, a test of the independence of gender and poverty status reveals
• Household poverty status and gender are not independent
• Knowing household poverty status helps predict gender
34. A test of the hypothesis that rows and columns in a table are independent
In our case, a test of the independence of gender and poverty status reveals
• Household poverty status and gender are not independent
• Knowing household poverty status helps predict gender
But how much?