Confirmatory
Factor Analysis
Professor Patrick Sturgis
Plan
• Measuring concepts using latent variables
• Exploratory Factor Analysis (EFA)
• Confirmatory Factor Analysis (CFA)
• Fixing the scale of latent variables
• Mean structures
• Formative indicators
• Item parcelling
• Higher-order factors
2 step modeling
• ‘SEM is path analysis with latent variables’
• This as a distinction between:
– Measurement of constructs
– Relationships between these constructs
• First step: measure constructs
• Second step: estimate how constructs
are related to one another
Step 1: measurement
• All measurements are made with error
(random and/or systematic)
• We want to isolate ‘true score’ component
of measured variables: X = t + e
• How can we do this?
• Sum items (random error cancels)
• Estimate latent variable model
Exploratory Factor Analysis
• Also called ‘unrestricted’ factor analysis
• Finds factor loadings which best reproduce
correlations between observed variables
• n of factors = n of observed variables
• All variables related to all factors
Exploratory Factor Analysis
• Retain <n factors which ‘explain’ satisfactory
amount of observed variance
• ‘Meaning’ of factors determined by pattern
of loadings
• No unique solution where >1 factor, rotation
used to clarify what each factor measures
Example: Intelligence
Observed Items Factor 1 Factor 2 Factor 3
Math 1 .89 .12 .03
Math 2 .73 -.13 .03
Math 3 .75 .09 -.11
Visual-Spatial 1 -.03 .68 .07
Visual-Spatial 2 .13 .74 -.12
Visual-Spatial 3 -.08 .91 .05
Verbal 1 .23 .17 .88
Verbal 2 .18 .03 .73
Verbal 3 -.03 -.11 .70
9 knowledge quiz items
...Factor 9
Example: Intelligence
Observed Items Factor 1 Factor 2 Factor 3
Math 1 .89 .12 .03
Math 2 .73 -.13 .03
Math 3 .75 .09 -.11
Visual-Spatial 1 -.03 .68 .07
Visual-Spatial 2 .13 .74 -.12
Visual-Spatial 3 -.08 .91 .05
Verbal 1 .23 .17 .88
Verbal 2 .18 .03 .73
Verbal 3 -.03 -.11 .70
9 knowledge quiz items
...Factor 9
Example: Intelligence
Observed Items Factor 1 Factor 2 Factor 3
Math 1 .89 .12 .03
Math 2 .73 -.13 .03
Math 3 .75 .09 -.11
Visual-Spatial 1 -.03 .68 .07
Visual-Spatial 2 .13 .74 -.12
Visual-Spatial 3 -.08 .91 .05
Verbal 1 .23 .17 .88
Verbal 2 .18 .03 .73
Verbal 3 -.03 -.11 .70
9 knowledge quiz items
...Factor 9
Example: Intelligence
Observed Items Factor 1 Factor 2 Factor 3
Math 1 .89 .12 .03
Math 2 .73 -.13 .03
Math 3 .75 .09 -.11
Visual-Spatial 1 -.03 .68 .07
Visual-Spatial 2 .13 .74 -.12
Visual-Spatial 3 -.08 .91 .05
Verbal 1 .23 .17 .88
Verbal 2 .18 .03 .73
Verbal 3 -.03 -.11 .70
9 knowledge quiz items
...Factor 9
Limitations of EFA
• Inductive, atheoretical (Data->Theory)
• Subjective judgement & heuristic rules
• We usually have a theory about how
indicators are related to particular latent
variables (Theory-> Data)
• Be explicit and test this measurement theory
against sample data
Confirmatory Factor Analysis (CFA)
• Also ‘the restricted factor model’
• Specify the measurement model before
looking at the data (the ‘no peeking’ rule!)
• Which indicators measure which factors?
• Which indicators are unrelated to which
factors?
• Are the factors correlated or uncorrelated?
Two Factor, Six Item EFA
Two Factor, Six Item CFA
Parameter Constraints
• CFA applies constraints to parameters
(hence ‘restricted’ factor model)
• Factor loadings are fixed to zero for
indicators that do not measure the factor
• Measurement theory is expressed in the
constraints that we place on the model
• Fixing parameters over-identifies the model,
can test the fit of our a priori model
Scales of latent variables
• A latent variable has no inherent metric, 2
approaches:
1. Constrain variance of latent variable to 1
2. Constrain the factor loading of one item to 1
• (2) makes item the ‘reference item’, other
loadings interpreted relative to reference
item
1. yields a standardised solution
2. generally preferred (more flexible)
Mean Structures
• In conventional SEM, we do not model
means of observed or latent variables
• Interest is in relationships between variables
(correlations, directional paths)
• Sometimes, we are interested in means of
latent variables
e.g. Differences between groups
e.g. Changes over time
Identification of latent means
• observed and latent means introduced by
adding a constant
• This is a variable set to 1 for all cases
• The regression of a variable on a predictor
and a constant, yields the intercept (mean)
of that variable in the unstandardised b
• The mean of an observed variable=total
effect of a constant on that variable
Mean Structures
x y
1
a
b
c
b = mean of x
a+(b*c)=mean of y
Means and identification
• Mean structure models require additional
identification restrictions
• We are estimating more unknown
parameters (the latent means)
• Where we have >1 group, we can fix the
latent mean of one group to zero
• Means of remaining groups are estimated
as differences from reference group
Formative and Reflective Indicators
• CFA assumes latent variable causes the
indicators, arrows point from latent to
indicator
• For some concepts this does not make
sense
e.g. using education, occupation and earnings to
measure ‘socio-economic status’
• We wouldn’t think that manipulating an
individual’s SES would change their
education
Formative Indicators
• For these latent variables, we specify the
indicators as ‘formative’
• This produces a weighted index of the
observed indicators
• Latent variable has no disturbance term
• In the path diagram, the arrows point from
indicator to latent variable
Item Parceling
• A researcher may have a very large number
of indicators for a latent construct
• Here, model complexity can become a
problem for estimation and interpretation
• Items are first combined in ‘parcels’ through
summing scores over item sub-groups
• Assumes unidimensionality of items in a
parcel
Higher Order Factors
• Usually, latent variables measured via
observed indicators
• Can also specify ‘higher order’ latent
variables which are measured by other
latent variables
• Used to test more theories about the
structure of multi-dimensional constructs
e.g. intelligence, personality
Higher-order Factor Model
Summary
• Measuring concepts using latent variables
• Exploratory Factor Analysis (EFA)
• Confirmatory Factor Analysis (CFA)
• Fixing the scale of latent variables
• Mean structures
• Formative indicators
• Item parcelling
• Higher-order factors
for more information contact
www.ncrm.ac.uk

Confirmatory Factor Analysis

  • 1.
  • 2.
    Plan • Measuring conceptsusing latent variables • Exploratory Factor Analysis (EFA) • Confirmatory Factor Analysis (CFA) • Fixing the scale of latent variables • Mean structures • Formative indicators • Item parcelling • Higher-order factors
  • 3.
    2 step modeling •‘SEM is path analysis with latent variables’ • This as a distinction between: – Measurement of constructs – Relationships between these constructs • First step: measure constructs • Second step: estimate how constructs are related to one another
  • 4.
    Step 1: measurement •All measurements are made with error (random and/or systematic) • We want to isolate ‘true score’ component of measured variables: X = t + e • How can we do this? • Sum items (random error cancels) • Estimate latent variable model
  • 5.
    Exploratory Factor Analysis •Also called ‘unrestricted’ factor analysis • Finds factor loadings which best reproduce correlations between observed variables • n of factors = n of observed variables • All variables related to all factors
  • 6.
    Exploratory Factor Analysis •Retain <n factors which ‘explain’ satisfactory amount of observed variance • ‘Meaning’ of factors determined by pattern of loadings • No unique solution where >1 factor, rotation used to clarify what each factor measures
  • 7.
    Example: Intelligence Observed ItemsFactor 1 Factor 2 Factor 3 Math 1 .89 .12 .03 Math 2 .73 -.13 .03 Math 3 .75 .09 -.11 Visual-Spatial 1 -.03 .68 .07 Visual-Spatial 2 .13 .74 -.12 Visual-Spatial 3 -.08 .91 .05 Verbal 1 .23 .17 .88 Verbal 2 .18 .03 .73 Verbal 3 -.03 -.11 .70 9 knowledge quiz items ...Factor 9
  • 8.
    Example: Intelligence Observed ItemsFactor 1 Factor 2 Factor 3 Math 1 .89 .12 .03 Math 2 .73 -.13 .03 Math 3 .75 .09 -.11 Visual-Spatial 1 -.03 .68 .07 Visual-Spatial 2 .13 .74 -.12 Visual-Spatial 3 -.08 .91 .05 Verbal 1 .23 .17 .88 Verbal 2 .18 .03 .73 Verbal 3 -.03 -.11 .70 9 knowledge quiz items ...Factor 9
  • 9.
    Example: Intelligence Observed ItemsFactor 1 Factor 2 Factor 3 Math 1 .89 .12 .03 Math 2 .73 -.13 .03 Math 3 .75 .09 -.11 Visual-Spatial 1 -.03 .68 .07 Visual-Spatial 2 .13 .74 -.12 Visual-Spatial 3 -.08 .91 .05 Verbal 1 .23 .17 .88 Verbal 2 .18 .03 .73 Verbal 3 -.03 -.11 .70 9 knowledge quiz items ...Factor 9
  • 10.
    Example: Intelligence Observed ItemsFactor 1 Factor 2 Factor 3 Math 1 .89 .12 .03 Math 2 .73 -.13 .03 Math 3 .75 .09 -.11 Visual-Spatial 1 -.03 .68 .07 Visual-Spatial 2 .13 .74 -.12 Visual-Spatial 3 -.08 .91 .05 Verbal 1 .23 .17 .88 Verbal 2 .18 .03 .73 Verbal 3 -.03 -.11 .70 9 knowledge quiz items ...Factor 9
  • 11.
    Limitations of EFA •Inductive, atheoretical (Data->Theory) • Subjective judgement & heuristic rules • We usually have a theory about how indicators are related to particular latent variables (Theory-> Data) • Be explicit and test this measurement theory against sample data
  • 12.
    Confirmatory Factor Analysis(CFA) • Also ‘the restricted factor model’ • Specify the measurement model before looking at the data (the ‘no peeking’ rule!) • Which indicators measure which factors? • Which indicators are unrelated to which factors? • Are the factors correlated or uncorrelated?
  • 13.
  • 14.
  • 15.
    Parameter Constraints • CFAapplies constraints to parameters (hence ‘restricted’ factor model) • Factor loadings are fixed to zero for indicators that do not measure the factor • Measurement theory is expressed in the constraints that we place on the model • Fixing parameters over-identifies the model, can test the fit of our a priori model
  • 16.
    Scales of latentvariables • A latent variable has no inherent metric, 2 approaches: 1. Constrain variance of latent variable to 1 2. Constrain the factor loading of one item to 1 • (2) makes item the ‘reference item’, other loadings interpreted relative to reference item 1. yields a standardised solution 2. generally preferred (more flexible)
  • 17.
    Mean Structures • Inconventional SEM, we do not model means of observed or latent variables • Interest is in relationships between variables (correlations, directional paths) • Sometimes, we are interested in means of latent variables e.g. Differences between groups e.g. Changes over time
  • 18.
    Identification of latentmeans • observed and latent means introduced by adding a constant • This is a variable set to 1 for all cases • The regression of a variable on a predictor and a constant, yields the intercept (mean) of that variable in the unstandardised b • The mean of an observed variable=total effect of a constant on that variable
  • 19.
    Mean Structures x y 1 a b c b= mean of x a+(b*c)=mean of y
  • 20.
    Means and identification •Mean structure models require additional identification restrictions • We are estimating more unknown parameters (the latent means) • Where we have >1 group, we can fix the latent mean of one group to zero • Means of remaining groups are estimated as differences from reference group
  • 21.
    Formative and ReflectiveIndicators • CFA assumes latent variable causes the indicators, arrows point from latent to indicator • For some concepts this does not make sense e.g. using education, occupation and earnings to measure ‘socio-economic status’ • We wouldn’t think that manipulating an individual’s SES would change their education
  • 22.
    Formative Indicators • Forthese latent variables, we specify the indicators as ‘formative’ • This produces a weighted index of the observed indicators • Latent variable has no disturbance term • In the path diagram, the arrows point from indicator to latent variable
  • 23.
    Item Parceling • Aresearcher may have a very large number of indicators for a latent construct • Here, model complexity can become a problem for estimation and interpretation • Items are first combined in ‘parcels’ through summing scores over item sub-groups • Assumes unidimensionality of items in a parcel
  • 24.
    Higher Order Factors •Usually, latent variables measured via observed indicators • Can also specify ‘higher order’ latent variables which are measured by other latent variables • Used to test more theories about the structure of multi-dimensional constructs e.g. intelligence, personality
  • 25.
  • 26.
    Summary • Measuring conceptsusing latent variables • Exploratory Factor Analysis (EFA) • Confirmatory Factor Analysis (CFA) • Fixing the scale of latent variables • Mean structures • Formative indicators • Item parcelling • Higher-order factors
  • 27.
    for more informationcontact www.ncrm.ac.uk

Editor's Notes

  • #16 An overidentified model occurs when every parameter is identified and at least one parameter is overidentified (e.g., it can be solved for in more than way--instead of solving for this parameter with one equation, more than one equation will generate this parameter estimate). Typically, most people who use structural equation modeling prefer to work with models that are overidentified. An overidentified model has positive degrees of freedom and may not fit as well as a model which is just identified. Imposing restrictions on the model when we have an overidentified model provides us with a test of our hypotheses, which can then be evaluated using the Chi-square statistic and fit indices. The positive degrees of freedom associated with an overidentified model allows the model to be falsified with a statistical test. When an overidentified model does fit well, then the researcher typically considers the model to be an adequate fit for the data.