Johnny Aqm Presentation

  1. Effect of Number of Categories and Category Boundaries on Recovery of Latent Linear Correlations from Optimally Weighted Categorical Data
     Johnny Lin. Advisor: Peter Bentler. November 19, 2008.
  2. Outline
     Introduction: LINEALS; Forming a Hypothesis
     Method: Description; Simulation; Analysis
     Results: Main Effects; Interactions
  4. Introducing LINEALS: A Method of Optimal Scaling
     Algorithm: an iterative process that minimizes $\sum_{j=1}^{m} \sum_{l=1}^{m} (\eta_{jl}^2 - r_{jl}^2)$, where $\eta_{jl}^2$ is a measure of nonlinearity. Developed by Jan de Leeuw and implemented by Patrick Mair.
     Assumption: that bi-linearization is possible. No assumption of normality.
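The quantity being minimized can be illustrated for a single pair of variables. The sketch below is in Python rather than the R used for the study, and it is not the LINEALS implementation: it only evaluates the gap between the correlation ratio and the squared correlation for fixed category scores, without any iterative rescaling. The function names `correlation_ratio` and `nonlinearity` are hypothetical.

```python
import numpy as np

def correlation_ratio(groups, values):
    """eta^2: share of the variance of `values` explained by the group means."""
    groups = np.asarray(groups)
    values = np.asarray(values, dtype=float)
    grand_mean = values.mean()
    total_ss = ((values - grand_mean) ** 2).sum()
    between_ss = 0.0
    for g in np.unique(groups):
        member = values[groups == g]
        between_ss += member.size * (member.mean() - grand_mean) ** 2
    return between_ss / total_ss

def nonlinearity(x_cat, y_cat, x_scores, y_scores):
    """Sum of (eta^2 - r^2) over both regression directions for one pair.

    x_cat, y_cat: integer category codes (0-based) per observation.
    x_scores, y_scores: numeric score assigned to each category (assumed given).
    """
    x_cat = np.asarray(x_cat)
    y_cat = np.asarray(y_cat)
    x = np.asarray(x_scores, dtype=float)[x_cat]
    y = np.asarray(y_scores, dtype=float)[y_cat]
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    # Y regressed on the categories of X, and X regressed on the categories of Y.
    return (correlation_ratio(x_cat, y) - r2) + (correlation_ratio(y_cat, x) - r2)
```

Because the correlation ratio is never smaller than the squared correlation, the value is zero only when both regressions are linear in the chosen scores; for example, equally spaced scores on perfectly matched categories give zero, while scores such as 0, 1, 4 against 0, 1, 2 give a positive value.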
  5. Plot of LINEALS Transformation
     Criterion: linearize both X on Y and Y on X simultaneously.
     [Figure: LINEALS transformation. Red: X on Y; blue: Y on X.]
  7. Questions to ask
     First, define good recovery as small deviation from true score.
     1. Does LINEALS recover true population correlations better than Pearson for categorical data?
     2. Is the performance of LINEALS robust?
     3. What factors influence good recovery?
  9. Conditions tested: Correlation Type, True Population Correlation, Number of Categories, and Homogeneity
     Condition                            Parameters
     1. Correlation Type (r)              {0=LINEALS, 1=Pearson}
     2. True Population Correlation (P)   {0.3, 0.5, 0.7, 0.9}
     3. Number of Categories (V)          {2, 3, 5, 7, 10}
     4. Homogeneity (h)                   {0=Non-Homogeneous, 1=Homogeneous}
     Total of 80 combinations (2x4x5x2).
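As a quick check on the design size, the conditions can be enumerated directly. This is a Python sketch (the study itself used R), reading the parameter sets as Correlation Type {0, 1}, Population Correlation {0.3, 0.5, 0.7, 0.9}, Number of Categories {2, 3, 5, 7, 10}, and Homogeneity {0, 1}; the variable names are illustrative only.

```python
from itertools import product

# Levels of the four design factors, as read from the slide.
correlation_type = [0, 1]                      # r: 0 = LINEALS, 1 = Pearson
population_correlation = [0.3, 0.5, 0.7, 0.9]  # P
number_of_categories = [2, 3, 5, 7, 10]        # V
homogeneity = [0, 1]                           # h: 0 = non-homogeneous, 1 = homogeneous

# Fully crossed design: every combination of the four factors.
conditions = list(product(correlation_type,
                          population_correlation,
                          number_of_categories,
                          homogeneity))

print(len(conditions))  # 2 x 4 x 5 x 2 = 80
```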
  11. Creating functions in R
      For each combination (total of 80):
      1. Generate 1000 sets of bivariate normal data.
      2. Make "cuts" (homogeneous vs. non-homogeneous).
      3. Run through LINEALS / Pearson.
      4. Calculate deviation of result and true population correlation.
      5. Repeat steps 1-4 twenty-five times.
      Result: total of 2000 deviations (80x25).
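The steps above can be sketched for one condition as follows. This is a Python illustration, not the original R functions, and it covers only the Pearson branch; the LINEALS step is omitted. Several details are assumptions not stated in the slides: "1000 sets" is read as 1000 observations, homogeneous cuts are taken as equal-width intervals on [-2, 2], non-homogeneous cuts are drawn at random on the same range, and both variables share the same cut points. The names `make_cuts` and `pearson_deviation` are hypothetical.

```python
import numpy as np

def make_cuts(n_categories, homogeneous, rng, lo=-2.0, hi=2.0):
    """Return n_categories - 1 interior cut points (assumed scheme)."""
    if homogeneous:
        # Equal-width intervals between lo and hi.
        return np.linspace(lo, hi, n_categories + 1)[1:-1]
    # Unequal widths: random cut points on the same range.
    return np.sort(rng.uniform(lo, hi, size=n_categories - 1))

def pearson_deviation(rho, n_categories, homogeneous, rng, n_obs=1000):
    """One replication: simulate, categorize, and compute |rho| - |rho_hat|."""
    cov = [[1.0, rho], [rho, 1.0]]
    data = rng.multivariate_normal([0.0, 0.0], cov, size=n_obs)  # step 1
    cuts = make_cuts(n_categories, homogeneous, rng)             # step 2
    x_cat = np.digitize(data[:, 0], cuts)
    y_cat = np.digitize(data[:, 1], cuts)
    rho_hat = np.corrcoef(x_cat, y_cat)[0, 1]                    # step 3 (Pearson only)
    return abs(rho) - abs(rho_hat)                               # step 4

# Step 5: twenty-five replications of one condition (P = 0.5, V = 5, h = 1).
rng = np.random.default_rng(2008)
deviations = [pearson_deviation(0.5, 5, True, rng) for _ in range(25)]
```

Running the same loop over all 80 conditions would give the 2000 deviations used as the dependent variable.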
  13. Hierarchical Regression: Description
      DV: deviation of sample correlation from true population correlation, $|\rho_{12}| - |\hat{\rho}_{12}|$
      IVs: main effects and interactions of the four conditions (total of 15)
      Four main effects (h, r, P, V)
      Six 2-way interactions (hr, hP, hV, ...)
      Four 3-way interactions (hrP, hrV, ...)
      One 4-way interaction (hrPV)
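The count of 15 terms follows from taking every non-empty subset of the four factors. A small Python sketch (illustrative names; product terms assume each interaction is entered as the product of its factors, which the slides do not state):

```python
import math
from itertools import combinations

factors = ["h", "r", "P", "V"]

# Every non-empty subset of the four factors, ordered by interaction order.
terms = ["".join(subset)
         for order in range(1, len(factors) + 1)
         for subset in combinations(factors, order)]

def term_values(levels):
    """Value of each term for one design row, as the product of its factors.

    levels: dict mapping factor name -> numeric level, e.g. {"h": 1, "r": 0, ...}.
    """
    return {term: math.prod(levels[f] for f in term) for term in terms}

print(terms)
# ['h', 'r', 'P', 'V', 'hr', 'hP', 'hV', 'rP', 'rV', 'PV',
#  'hrP', 'hrV', 'hPV', 'rPV', 'hrPV']
```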
  14. Hierarchical Regression: Model Selection
      Tested full model against nested models.
      Confirmed with Best Subset Regression.
      Optimal adjusted $R^2$ and Mallows' $C_p$ found with 7-8 parameters.
      [Figure: (a) adjusted $R^2$; (b) Mallows' $C_p$.]
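For reference, the two selection criteria can be written as short functions. This Python sketch uses the standard textbook definitions, not output from the study; the parameter counts for Mallows' C_p include the intercept, and the error variance is estimated from the full model.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def mallows_cp(sse_subset, sse_full, n_obs, p_subset, p_full):
    """Mallows' C_p for a subset model with p_subset parameters (intercept included).

    The error variance is estimated by the full model's mean squared error.
    """
    sigma2_full = sse_full / (n_obs - p_full)
    return sse_subset / sigma2_full - n_obs + 2 * p_subset
```

A subset model with little bias has C_p close to its own parameter count; by construction the full model (here 15 terms plus an intercept, fitted to 2000 deviations) has C_p exactly equal to 16.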
  15. Final Model: SPSS Output
      Coefficients (dependent variable: difference)

                    Unstandardized         Standardized
      Model 1       B        Std. Error    Beta          t         Sig.
      (Constant)    .189     .006                        31.240    .000
      h            -.113     .012          -.620         -9.299    .000
      r             .007     .002           .041          3.054    .002
      V            -.024     .001          -.773        -40.558    .000
      P             .098     .008           .241         12.655    .000
      hV            .013     .002           .487          7.164    .000
      hP            .117     .018           .435          6.392    .000
      hPV          -.017     .003          -.422         -6.326    .000

      Difference between LINEALS and Pearson deviations is .007, controlling for other factors.
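The unstandardized coefficients can be combined into a prediction function. This Python sketch assumes the predictors were entered in the raw units listed for the conditions (h and r as 0/1, P as the correlation value, V as the category count), which the slides do not confirm; `predicted_deviation` is a hypothetical name.

```python
# Unstandardized coefficients (B) from the final model.
COEFFICIENTS = {
    "const": 0.189,
    "h": -0.113,
    "r": 0.007,
    "V": -0.024,
    "P": 0.098,
    "hV": 0.013,
    "hP": 0.117,
    "hPV": -0.017,
}

def predicted_deviation(h, r, P, V):
    """Predicted |rho| - |rho_hat| from the fitted model (assumed raw coding)."""
    b = COEFFICIENTS
    return (b["const"] + b["h"] * h + b["r"] * r + b["V"] * V + b["P"] * P
            + b["hV"] * h * V + b["hP"] * h * P + b["hPV"] * h * P * V)

# Holding everything else fixed, switching r from 0 (LINEALS) to 1 (Pearson)
# changes the prediction by the r coefficient, .007.
gap = predicted_deviation(0, 1, 0.5, 5) - predicted_deviation(0, 0, 0.5, 5)
```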
  17. Plot of Main Effects I
      [Figure: Main Effect of Number of Categories V.]
      [Figure: Main Effect of Population Correlation P.]
  18. Plot of Main Effects II
      [Figure: Main Effect of Homogeneity h.]
      [Figure: Main Effect of Correlation Type r.]
  20. Plot of Significant Interactions
      Note: The significant 3-way interaction hPV is not plotted.
      [Figure: Population Correlation by Levels of Homogeneity (hP).]
      [Figure: Number of Categories by Levels of Homogeneity (hV).]
  21. Interaction of Correlation Type and Number of Categories
      When rV is added to the regression model, the main effect of Correlation Type r goes away. This suggests that the number of categories may contribute to the LINEALS vs. Pearson difference.
      [Figure: Number of Categories by Correlation Type (rV, marginally significant).]
  22. Summary
      1. LINEALS performs slightly better than Pearson under bivariate normal categorizations.
      2. The non-significant interactions with Correlation Type suggest that LINEALS is robust.
      3. Recovery of true population correlations is highly influenced by homogeneity (i.e., the underlying equality of interval widths).
      Future Studies
      How does it compare against polychoric correlations?
      Is the resulting matrix positive definite?
