
Hierarchical Linear Modeling


Published in: Technology

  1. 1. HLM: Hierarchical Linear Modeling Katy Pearce, CRRC Armenia, May 15-16, 2008
  2. 2. Introduction <ul><li>Katy Pearce, current PhD student in Communication at University of California, Santa Barbara. </li></ul><ul><li>Communication is sociology + psychology. </li></ul><ul><li>Studies technology and how cultural characteristics can moderate technology adoption, attitudes, and use. </li></ul>
  3. 3. Introduction <ul><li>Data with nested structures are frequently observed in behavioral/social sciences. </li></ul><ul><li>For example: </li></ul><ul><ul><li>Educational settings: Students are nested within classes; classes are nested within schools. </li></ul></ul><ul><ul><li>Organizational studies: Workers are nested within departments; departments are nested within organizations. </li></ul></ul><ul><ul><li>Cross-cultural research: People are nested within countries. </li></ul></ul><ul><ul><li>But we often ignore these structures. </li></ul></ul>
  4. 4. Example 1 <ul><li>Educational achievement: </li></ul><ul><li>Imagine 5 little boys who are very similar: parental education is the same (low), parental income is the same (low), IQ is the same (low), etc. These 5 boys go to 5 different schools: an excellent school, a very good school, a good school, a poor school, and a very poor school. With HLM we can compare the impact of these different types of schools on the boys’ educational achievement (test scores, grades, etc.). One can imagine that mean parental education, parental income, and IQ are low at the very poor school and high at the excellent school. With HLM we can account for variance at both the individual level and the school-mean level. </li></ul>
  5. 5. But first, a brief review of other statistical techniques <ul><li>ANOVA: 1 IV with 2+ levels -> DV, to compare means among the 2+ groups. These means are compared by analyzing the variance in the DV. </li></ul><ul><li>Linear regression: a linear relationship between two variables, so that one may predict the other. 1 predictor variable -> 1 criterion variable </li></ul><ul><li>Multiple regression: 2+ predictor variables -> 1 criterion variable </li></ul>
  6. 6. Example 2 <ul><li>World Values Survey </li></ul><ul><li>Trust and satisfaction </li></ul><ul><li>Trust and satisfaction with one’s life have been shown to be related. However, it is possible that the “mean” trust level in a society can moderate this relationship. </li></ul><ul><li>L1 (individual): trust generally -> satisfaction with one’s life </li></ul><ul><li>L2 (society): “mean” trust level </li></ul>
  7. 7. First, we need to get the data ready <ul><li>Step 1: prepare the file </li></ul><ul><ul><li>The World Values Survey is too big for the student version of HLM, so let’s take ~10% of the sample and save the file. </li></ul></ul><ul><ul><li>Sort by nation [v2], save the file. </li></ul></ul><ul><ul><li>Aggregate the data: the “break variable” is nation [v2] and the “aggregate variables” are life satisfaction [v81] and take advantage [v26]. Be sure to create a new data file. </li></ul></ul>
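For readers working outside a point-and-click statistics package, the three preparation steps above (sample ~10%, sort by nation, aggregate to nation means) can be sketched in plain Python. This is only an illustration under assumptions: the function name `prepare_hlm_files`, the list-of-dicts row layout, and the `_1` suffix for aggregated means (chosen to echo the `V26_1` name that appears in the HLM output) are not from the slides, and missing values are assumed to have been handled already.

```python
import random
from collections import defaultdict
from statistics import mean


def prepare_hlm_files(rows, group_key="v2", agg_vars=("v81", "v26"),
                      fraction=0.10, seed=0):
    """Return (level1_rows, level2_rows) from a list of dict records.

    level1_rows: a random ~`fraction` sample of `rows`, sorted by `group_key`.
    level2_rows: one record per group holding the mean of each aggregate variable.
    """
    rng = random.Random(seed)              # fixed seed so the sample is repeatable
    k = max(1, round(len(rows) * fraction))
    sample = rng.sample(rows, k)           # step 1: take ~10% of the cases

    level1 = sorted(sample, key=lambda r: r[group_key])  # step 2: sort by nation

    groups = defaultdict(list)             # step 3: aggregate within each nation
    for record in level1:
        groups[record[group_key]].append(record)

    level2 = []
    for group in sorted(groups):
        summary = {group_key: group}
        for var in agg_vars:
            # hypothetical naming: "v26" -> "v26_1" for the group mean
            summary[var + "_1"] = mean(r[var] for r in groups[group])
        level2.append(summary)
    return level1, level2
```

The level-1 list corresponds to the "WVS random file" and the level-2 list to the "WVS aggregate file"; both would still need to be written out in a format the HLM program can read.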
  8. 8. HLM program <ul><li>Step 2: create HLM file </li></ul><ul><li>Open the HLM program </li></ul><ul><li>go to the File menu and select the following options: Make new MDM file... Stat package input </li></ul><ul><li>For the L1 file, open your WVS random file </li></ul><ul><li>For the L2 file, open your WVS aggregate file </li></ul>
  9. 9. HLM program 2 <ul><li>5. Now you must select the variables. In the L1 file, the “ID” is v2 (nation) and the other two variables go in the MDM. In the L2 file, the “ID” is also v2 and the two variables in the MDM are v26 and v81 </li></ul><ul><li>6. Select “yes” for missing data and “delete missing data while making MDM” </li></ul><ul><li>7. Save the file </li></ul><ul><li>8. Click “Make MDM” </li></ul><ul><li>9. Click “Done” </li></ul>
  10. 10. Effects <ul><li>Before we get to the actual data analysis, let’s talk about effects in HLM. </li></ul><ul><li>Fixed effects: the levels of the variable that appear in the study are the only levels the researcher is interested in. </li></ul><ul><li>Random effects: the levels that appear in the study are a subset of all possible levels of the variable, and the researcher wants to generalize to levels not observed. </li></ul><ul><li>For example, let’s say that we set up a school where, in different classrooms, some of the students receive special tutoring and others are in a control group. A fixed effect variable would be which group the student was in: control or treatment; only two groups exist. A random effect variable would be the classroom that the student was in, since the particular classrooms are not of interest in themselves. </li></ul>
  11. 11. HLM analysis – Means as Outcomes <ul><li>9. Let’s start with specifying the L1 model. First we need to tell the program what our DV is, life satisfaction or [v81]. Click on v81 and select “outcome variable.” </li></ul><ul><li>10. Now we need to tell the program what our fixed and random effects are. V26 (trust) is a fixed effect, because we care about it. The intercept and slope are by default random effects. </li></ul><ul><li>11. Repeat for L2. </li></ul><ul><li>12. Click “Run analysis” </li></ul>
  12. 12. Output <ul><li>13. Go to the file menu, click on “View Output” </li></ul><ul><li>They show us the model: </li></ul><ul><li>Summary of the model specified (in equation format) </li></ul><ul><li>--------------------------------------------------- </li></ul><ul><li>Level-1 Model </li></ul><ul><li>Y = B0 + B1*(V26) + R </li></ul><ul><li>Level-2 Model </li></ul><ul><li>B0 = G00 + G01*(V26_1) + U0 </li></ul><ul><li>B1 = G10 + G11*(V26_1) </li></ul>
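It can help to see the two levels as a single equation. Substituting the Level-2 equations into the Level-1 equation gives the combined model below. The symbols are the ones printed by the program; the only additions are subscripts and the explicit product term, and reading V26_1 as the nation mean of V26 is an assumption based on the aggregation step.

```latex
\begin{aligned}
Y &= B_0 + B_1 \cdot V26 + R \\
  &= \left(G_{00} + G_{01} \cdot V26\_1 + U_0\right)
     + \left(G_{10} + G_{11} \cdot V26\_1\right) \cdot V26 + R \\
  &= G_{00} + G_{01} \cdot V26\_1 + G_{10} \cdot V26
     + G_{11} \cdot \left(V26\_1 \times V26\right) + U_0 + R
\end{aligned}
```

In this form, G11 is the coefficient on the cross-level interaction, which is the term that would capture the nation-level mean moderating the individual-level relationship.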
  13. 13. Output 2 <ul><li>Sigma_squared = 82.48620 </li></ul><ul><li>Tau </li></ul><ul><li>INTRCPT1,B0 4.21449 </li></ul><ul><li>Tau (as correlations) </li></ul><ul><li>INTRCPT1,B0 1.000 </li></ul><ul><li>---------------------------------------------------- </li></ul><ul><li>Random level-1 coefficient Reliability estimate </li></ul><ul><li>---------------------------------------------------- </li></ul><ul><li>INTRCPT1, B0 0.845 </li></ul><ul><li>---------------------------------------------------- </li></ul><ul><li>The value of the likelihood function at iteration 5 = -1.747747E+004 </li></ul><ul><li>The outcome variable is V81 </li></ul>
  14. 14. Output 3 <ul><li>Final estimation of fixed effects: </li></ul><table><tr><th>Fixed Effect</th><th>Coefficient</th><th>Standard Error</th><th>T-ratio</th><th>Approx. d.f.</th><th>P-value</th></tr><tr><td colspan="6">For INTRCPT1, B0</td></tr><tr><td>INTRCPT2, G00</td><td>7.652744</td><td>0.870587</td><td>8.790</td><td>38</td><td>0.000</td></tr><tr><td>V26_1, G01</td><td>-0.440045</td><td>0.404599</td><td>-1.088</td><td>38</td><td>0.284</td></tr><tr><td colspan="6">For V26 slope, B1</td></tr><tr><td>INTRCPT2, G10</td><td>0.333436</td><td>0.195697</td><td>1.704</td><td>4806</td><td>0.088</td></tr><tr><td>V26_1, G11</td><td>-0.070027</td><td>0.078756</td><td>-0.889</td><td>4806</td><td>0.374</td></tr></table>
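One way to read the fixed-effects table: each T-ratio is the coefficient divided by its standard error. A minimal sketch that recomputes that column from the values printed above (the helper name `t_ratio` and the dictionary layout are illustrative, not part of the HLM program):

```python
def t_ratio(coefficient, standard_error):
    """T-ratio for a fixed effect: the estimate divided by its standard error."""
    return coefficient / standard_error


# (coefficient, standard error) pairs copied from the fixed-effects output
fixed_effects = {
    "G00": (7.652744, 0.870587),
    "G01": (-0.440045, 0.404599),
    "G10": (0.333436, 0.195697),
    "G11": (-0.070027, 0.078756),
}

t_ratios = {name: t_ratio(coef, se) for name, (coef, se) in fixed_effects.items()}

for name, value in t_ratios.items():
    print(f"{name}: t = {value:.3f}")
```

The printed values should agree with the T-ratio column to three decimals, which is a quick check that the table has been read correctly.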
  15. 15. Output 4 <ul><li>The outcome variable is V81 </li></ul><ul><li>Final estimation of fixed effects (with robust standard errors) </li></ul><table><tr><th>Fixed Effect</th><th>Coefficient</th><th>Standard Error</th><th>T-ratio</th><th>Approx. d.f.</th><th>P-value</th></tr><tr><td colspan="6">For INTRCPT1, B0</td></tr><tr><td>INTRCPT2, G00</td><td>7.652744</td><td>0.670477</td><td>11.414</td><td>38</td><td>0.000</td></tr><tr><td>V26_1, G01</td><td>-0.440045</td><td>0.309190</td><td>-1.423</td><td>38</td><td>0.163</td></tr><tr><td colspan="6">For V26 slope, B1</td></tr><tr><td>INTRCPT2, G10</td><td>0.333436</td><td>0.212963</td><td>1.566</td><td>4806</td><td>0.117</td></tr><tr><td>V26_1, G11</td><td>-0.070027</td><td>0.075376</td><td>-0.929</td><td>4806</td><td>0.353</td></tr></table>
  16. 16. Output 5 <ul><li>Final estimation of variance components: </li></ul><table><tr><th>Random Effect</th><th>Standard Deviation</th><th>Variance Component</th><th>df</th><th>Chi-square</th><th>P-value</th></tr><tr><td>INTRCPT1, U0</td><td>2.05292</td><td>4.21449</td><td>38</td><td>288.90950</td><td>0.000</td></tr><tr><td>level-1, R</td><td>9.08219</td><td>82.48620</td><td></td><td></td><td></td></tr></table><ul><li>Statistics for current covariance components model </li></ul><ul><li>Deviance = 34954.948408 </li></ul><ul><li>Number of estimated parameters = 2 </li></ul>
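The numbers on the output slides are tied together in two simple ways: each standard deviation is the square root of its variance component, and the deviance is -2 times the log-likelihood reported at the final iteration. A small sketch that checks both, using only values printed in the output (variable names are illustrative):

```python
import math

# Values copied from the HLM output slides
tau_00 = 4.21449               # variance component for INTRCPT1, U0
sigma_squared = 82.48620       # variance component for level-1, R
log_likelihood = -1.747747e4   # likelihood function value at iteration 5

sd_u0 = math.sqrt(tau_00)          # compare with the printed 2.05292
sd_r = math.sqrt(sigma_squared)    # compare with the printed 9.08219
deviance = -2 * log_likelihood     # compare with the printed 34954.948408

print(f"SD(U0) = {sd_u0:.5f}, SD(R) = {sd_r:.5f}, deviance = {deviance:.2f}")
```

The deviance matches only approximately because the log-likelihood is printed to seven significant digits.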
  17. 17. What to do with this output? <ul><li>First, we must calculate the intraclass correlation. </li></ul><ul><li>$\rho = \tau_{00} / (\tau_{00} + \sigma^2)$ </li></ul><ul><li>= 4.21449 / (4.21449 + 82.48620) </li></ul><ul><li>= 4.21449 / 86.70069 </li></ul><ul><li>= 0.0486096477 </li></ul><ul><li>This means that ~5% of the variance is at the national level (L2) and ~95% of the variance is at the individual level (L1). </li></ul>
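The same intraclass correlation calculation as a small reusable function, which is handy when trying several variable pairs. The function name is illustrative; the inputs are the Tau and Sigma_squared values from the HLM output.

```python
def intraclass_correlation(tau_00, sigma_squared):
    """Share of total variance that lies between groups (level 2)."""
    total = tau_00 + sigma_squared
    if total <= 0:
        raise ValueError("total variance must be positive")
    return tau_00 / total


# Values from the output above: Tau (INTRCPT1, B0) and Sigma_squared
icc = intraclass_correlation(4.21449, 82.48620)
print(f"L2 (nation) share: {icc:.1%}, L1 (individual) share: {1 - icc:.1%}")
```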
  18. 18. Let’s try some different WVS examples <ul><li>Family important [v4] -> Work important [v8] </li></ul><ul><li>~6% of variance is at the national level. </li></ul><ul><li>Democracy isn’t good [v171] -> Having army rule [v166] </li></ul><ul><li>~57% of the variance is at the national level. </li></ul>
  19. 19. CRRC DI <ul><li>3 countries (AM, AZ, and GE) are technically too few groups to compare, but we can compare regions </li></ul><ul><li>First, Armenia only, sort by quadrant. </li></ul><ul><li>What variables would differ by quadrant? </li></ul><ul><li>English language knowledge level [e9_2] -> political cooperation with U.S. [p15_6] </li></ul><ul><li>3% of variance is at the quadrant level </li></ul>
  20. 20. Your own data <ul><li>Your own data set </li></ul><ul><li>Needs to have 10+ groups </li></ul><ul><li>Continuous or categorical variables, but preferably with a larger scale </li></ul><ul><li>If you don’t have your own data, you’re welcome to use the WVS or CRRC DI. If there is a topic that you’re interested in, get a data set before tomorrow, or give me a sense of your interests and I’ll find one. </li></ul>
  21. 21. Other datasets freely available <ul><li> : archive of thousands of datasets </li></ul><ul><li> : United Nations Statistics </li></ul><ul><li> : World Bank data </li></ul>