1. What Am I Supposed to Do With Three-Way
Crosstabs? An Introduction to Log Linear Models
ERIC CANEN, M.S.
UNIVERSITY OF WYOMING
WYOMING SURVEY & ANALYSIS CENTER
EVALUATION 2010: EVALUATION QUALITY
SAN ANTONIO, TX
NOVEMBER 13, 2010
2. Situation
What effects?
Community Level
Communities
3. Set Up
Pre-Ordinance and Post-Ordinance
Matched Communities
Variables of Interest
Pre/Post
Parents have favorable attitude toward smoking
Would be seen as cool for smoking
Tried smoking during lifetime
Smoked during past 30 days
Friends smoke
10. Seen As Cool
                                  No or very little    Some chance OR Pretty
                                  chance OR Little     good chance OR Very
                                  chance               good chance              Total
Pre   Ord      Count              11443                1676                     13119
               Expected Count     11308.3              1810.7                   13119.0
               % of Total         52.6%                7.7%                     60.3%
      Non-ord  Count              7324                 1329                     8653
               Expected Count     7458.7               1194.3                   8653.0
               % of Total         33.6%                6.1%                     39.7%
      Total    Count              18767                3005                     21772
               Expected Count     18767.0              3005.0                   21772.0
               % of Total         86.2%                13.8%                    100.0%
Post  Ord      Count              4137                 603                      4740
               Expected Count     4083.0               657.0                    4740.0
               % of Total         50.8%                7.4%                     58.2%
      Non-ord  Count              2879                 526                      3405
               Expected Count     2933.0               472.0                    3405.0
               % of Total         35.3%                6.5%                     41.8%
      Total    Count              7016                 1129                     8145
               Expected Count     7016.0               1129.0                   8145.0
               % of Total         86.1%                13.9%                    100.0%
11. Chi-Square Tests
                                             Asymp. Sig.   Exact Sig.   Exact Sig.
                             Value    df     (2-sided)     (2-sided)    (1-sided)
Pre   Pearson Chi-Square     29.251   1      .000
      Likelihood Ratio       28.976   1      .000
      N of Valid Cases       21772
Post  Pearson Chi-Square     12.336   1      .000
      Likelihood Ratio       12.243   1      .000
      N of Valid Cases       8145
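The expected counts and test statistics in these two tables can be reproduced from the observed counts alone. The following is a minimal Python sketch, not the SPSS procedure used for the slides; the function name `two_way_tests` and the variable names are illustrative assumptions.

```python
import math

def two_way_tests(table):
    """Return (expected counts, Pearson chi-square, likelihood-ratio G2)
    for a two-way table of counts under row/column independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    pairs = [(o, e)
             for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row)]
    pearson = sum((o - e) ** 2 / e for o, e in pairs)
    g2 = 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    return expected, pearson, g2

# Rows: ordinance, non-ordinance. Columns: lower chance, higher chance.
pre = [[11443, 1676], [7324, 1329]]
post = [[4137, 603], [2879, 526]]

for label, table in (("Pre", pre), ("Post", post)):
    expected, pearson, g2 = two_way_tests(table)
    print(label,
          [[round(e, 1) for e in row] for row in expected],
          round(pearson, 3), round(g2, 3))
```

Each 2x2 table has (2 - 1) * (2 - 1) = 1 degree of freedom, which matches the df column above.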
12. Seen As Cool
(The crosstab from slide 10, with the following formulas overlaid.)
Expected Cell Probabilities:
$P(AB) = P(A) \cdot P(B)$
$P(AB \mid C) = P(A \mid C) \cdot P(B \mid C)$
$P(ABC) = P(A) \cdot P(B) \cdot P(C)$
Expected Cell Counts:
$E(n_{ab}) = n \cdot P(AB)$
$E(n_{ab} \mid C) = n \cdot P(AB \mid C)$
$E(n_{abc}) = n \cdot P(ABC)$
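As a worked illustration of the expected-count formulas on this slide, the sketch below computes expected counts for the three-way table under two of the independence assumptions. It assumes A is ordinance status, B is the "seen as cool" response, and C is the pre/post period; that labeling and all variable names are assumptions made for the example. In the conditional case, n is taken to be the total within each period, which is what reproduces the Expected Count rows of the crosstab.

```python
# counts[c][a][b]: c = period (0 pre, 1 post),
# a = community (0 ordinance, 1 non-ordinance),
# b = "seen as cool" response (0 lower chance, 1 higher chance)
counts = [
    [[11443, 1676], [7324, 1329]],
    [[4137, 603], [2879, 526]],
]
n = sum(cell for layer in counts for row in layer for cell in row)

# One-way marginal probabilities P(C), P(A), P(B)
p_c = [sum(map(sum, counts[c])) / n for c in range(2)]
p_a = [sum(sum(counts[c][a]) for c in range(2)) / n for a in range(2)]
p_b = [sum(counts[c][a][b] for c in range(2) for a in range(2)) / n
       for b in range(2)]

# Complete independence: E(n_abc) = n * P(A) * P(B) * P(C)
expected_complete = [[[n * p_a[a] * p_b[b] * p_c[c] for b in range(2)]
                      for a in range(2)] for c in range(2)]

# Independence of A and B within each level of C:
# E(n_ab | C) = n_c * P(A|C) * P(B|C), with n_c the total for period c
expected_within = []
for c in range(2):
    n_c = sum(map(sum, counts[c]))
    row_tot = [sum(counts[c][a]) for a in range(2)]
    col_tot = [counts[c][0][b] + counts[c][1][b] for b in range(2)]
    expected_within.append([[row_tot[a] * col_tot[b] / n_c for b in range(2)]
                            for a in range(2)])

for c, label in enumerate(("Pre", "Post")):
    print(label, "complete:",
          [[round(e, 1) for e in row] for row in expected_complete[c]])
    print(label, "within period:",
          [[round(e, 1) for e in row] for row in expected_within[c]])
```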
14. Analysis
Loglinear Models
Alternative to Crosstabs
Related to ANOVA
Generalized Linear Models
Model Based
Modeling Cell Counts
Relationship between variables
Higher Order Terms
15. Assumptions
Data represent cross tabulated counts
No expected cell counts are zero, and no more than
20% of the cells have expected counts <= 5
If sample size was fixed then the cell counts are
expected to follow a multinomial distribution
If sample size was not fixed then cell counts are
expected to follow a Poisson distribution
Models describe relationships or associations between
variables, as correlation (the r statistic) does
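The expected-count rule of thumb above is easy to check mechanically. The following is a small sketch; the function name `expected_counts_ok` and its default thresholds are illustrative and simply encode the rule as stated on this slide.

```python
def expected_counts_ok(expected, min_count=5, max_share=0.20):
    """Check the rule of thumb: no expected count is zero, and at most
    max_share of the cells have an expected count <= min_count."""
    cells = list(expected)
    if any(e <= 0 for e in cells):
        return False
    small = sum(1 for e in cells if e <= min_count)
    return small / len(cells) <= max_share

# Expected counts from the "Seen As Cool" crosstab (pre and post)
seen_as_cool = [11308.3, 1810.7, 7458.7, 1194.3,
                4083.0, 657.0, 2933.0, 472.0]
print(expected_counts_ok(seen_as_cool))
```

With expected counts this large, the rule is comfortably satisfied; it matters more for sparse tables with many categories.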
16. Program Commands
SAS
Proc CatMod procedure
Proc GenMod procedure
R
loglin() function
glm() function
Stata
poisson command (Poisson regression)
glm command
SPSS/PASW
GENLOG
GENLIN
50. Tips and Tricks
Plot both the observed and expected values for all models
Consider whether you want to work forward
(independent -> block -> partial -> uniform -> saturated)
or backward
(saturated -> uniform -> block -> partial -> independent)
Backward may be quicker
Example of non-significant and inconclusive result
Run Syntax
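To make the forward sequence concrete, the sketch below fits each model to the "seen as cool" counts by iterative proportional fitting and reports the likelihood-ratio statistic G2 with its degrees of freedom. It is a minimal illustration, not the syntax run in the presentation. It assumes the slide's labels correspond to the usual hierarchy: complete independence [A][B][C], block independence [AB][C], partial (conditional) independence [AC][BC], uniform association [AB][AC][BC], and saturated [ABC]. Which pair of variables forms the block and which variable is conditioned on are choices made here for the example, as are all function and variable names.

```python
import math
from itertools import product

def ipf(observed, shape, margins, iterations=50):
    """Fit a hierarchical loglinear model by iterative proportional fitting.
    margins lists the axis tuples whose observed totals the fit reproduces."""
    cells = list(product(*(range(k) for k in shape)))
    fitted = {cell: 1.0 for cell in cells}
    for _ in range(iterations):
        for axes in margins:
            obs_m, fit_m = {}, {}
            for cell in cells:
                key = tuple(cell[i] for i in axes)
                obs_m[key] = obs_m.get(key, 0.0) + observed[cell]
                fit_m[key] = fit_m.get(key, 0.0) + fitted[cell]
            # Rescale fitted cells so this margin matches the observed one
            for cell in cells:
                key = tuple(cell[i] for i in axes)
                fitted[cell] *= obs_m[key] / fit_m[key]
    return fitted

def g2_statistic(observed, fitted):
    """Likelihood-ratio statistic comparing observed and fitted counts."""
    return 2 * sum(o * math.log(o / fitted[cell])
                   for cell, o in observed.items() if o > 0)

# Axes: 0 = A (ordinance, non-ordinance), 1 = B (lower, higher chance),
# 2 = C (pre, post)
observed = {
    (0, 0, 0): 11443, (0, 1, 0): 1676, (1, 0, 0): 7324, (1, 1, 0): 1329,
    (0, 0, 1): 4137, (0, 1, 1): 603, (1, 0, 1): 2879, (1, 1, 1): 526,
}

# (label, fitted margins, residual df for a 2x2x2 table)
models = [
    ("independent", [(0,), (1,), (2,)], 4),
    ("block", [(0, 1), (2,)], 3),
    ("partial", [(0, 2), (1, 2)], 2),
    ("uniform", [(0, 1), (0, 2), (1, 2)], 1),
    ("saturated", [(0, 1, 2)], 0),
]

results = {}
for name, margins, df in models:
    fitted = ipf(observed, (2, 2, 2), margins)
    results[name] = (g2_statistic(observed, fitted), df)
    print(f"{name:12s} G2 = {results[name][0]:9.3f}  df = {df}")
```

Under these choices, the G2 for the partial model equals the sum of the pre and post likelihood-ratio statistics from the Chi-Square Tests slide, since that model assumes independence within each period. Working backward means starting from the saturated row and dropping terms while the fit stays acceptable.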
51. Contact information
Eric Canen
University of Wyoming,
Wyoming Survey & Analysis Center
ecanen@uwyo.edu
Download presentation file and handout from AEA
e-library
Editor's Notes
All the factors are independent.
Example: height and weight are related, but both are independent of hair growth.
Note: when accounting only for the pre/post marginals, the non-ordinance communities seem to have consistently lower percentages of students who have one or more friends who use cigarettes.
Note: the pattern in the last slide reverses when you take into account the community size differences between ordinance and non-ordinance communities and the pre/post count differences. The observed percentage of students who have friends who use cigarettes is then actually higher in the non-ordinance communities. Also notice the change in rates: the non-ordinance communities went from 45% to 42%, while the ordinance communities went from 41% to 33%. Yet the complete independence model would have expected no change from pre to post after accounting for the other two factors.