Latent variable models impose restrictions on the data that can be formulated as potential "misspecifications": restrictions with a model-based meaning. Examples include zero cross-loadings and local dependencies, as well as “measurement invariance” or “differential item functioning”. If incorrect, such misspecifications can disturb the main purpose of the latent variable analysis, seriously so in some cases.
Recently, I proposed a way to evaluate whether a particular analysis at hand is such a case.
To do this, I define a measure, the EPC-interest, based on the likelihood of the restricted model, which approximates the change in the parameters of interest if the misspecification were freed. The main idea is to examine the EPC-interest and free those misspecifications that are “important” while ignoring those that are not. I have implemented the EPC-interest in the lavaan software for structural equation modeling and in the Latent Gold software for latent class analysis.
This approach can resolve several problems and inconsistencies in the current practice of model fit evaluation in latent variable analysis, which I illustrate using analyses from the “measurement invariance” literature and from item response theory.
Detecting local dependence in latent class models - Daniel Oberski
Latent class (mixture) models are often used in a wide range of fields. These models assume that the observed variables are independent given the latent classes: local independence. What if this assumption does not hold?
ESRA2015 course: Latent Class Analysis for Survey Research - Daniel Oberski
Slides for a 3-hour short course I gave at the European Survey Research Association's 2015 meeting in Reykjavík, Iceland.
This course gives a short introduction to Latent Class Analysis (LCA) for survey methodologists. R code and some Latent GOLD input are also provided.
The R code and data for the examples can be found at http://daob.nl/wp-content/uploads/2015/07/ESRA-LCA-analyses-data.zip
Colloquium talk on modal sense classification using a convolutional neural ne... - Ana Marasović
Modal sense classification (MSC) is a special case of sense disambiguation relevant for distinguishing facts from hypotheses and speculations, or apprehended, planned and desired states of affairs. Prior approaches showed that even with carefully designed semantic feature sets, the models have difficulties beating the majority sense baseline in cases of difficult sense distinctions and when applying the models to heterogeneous text genres. Another drawback of former approaches is that feature implementation depends heavily on external language-specific resources such as dependency or constituency parse trees and lexical databases such as WordNet or CELEX. To alleviate manual crafting of the features and to obtain a model that is easily portable to novel languages, we propose to cast MSC as a sentence classification task with a fixed sense inventory in a convolutional neural network (CNN) architecture. Our performance study shows that the CNN is an appropriate model for MSC, and its special properties motivate us to investigate it as a formal framework for general word sense disambiguation tasks.
Computer Generated Items, Within-Template Variation, and the Impact on the Pa... - Quinn Lathrop
Computer Generated Items, Within-Template Variation, and the Impact on the Parameters of Response Models.
Master's thesis talk related to Lathrop, Q.N., Cheng, Y. Item Cloning Variation and the Impact on the Parameters of Response Models. Psychometrika 82, 245–263 (2017). https://doi.org/10.1007/s11336-016-9513-1
Rsqrd AI - ML Interpretability: Beyond Feature Importance - Alessya Visnjic
In this talk, Javier Antorán discusses the importance of uncertainty when it comes to ML interpretability. He offers a new uncertainty-based interpretability technique called CLUE and compares it to existing model interpretability techniques in two usability studies. Javier is a Ph.D. student at the University of Cambridge. His research interests include Bayesian deep learning, uncertainty in machine learning, representation learning, and information theory.
Errors of Artificial Intelligence, their Correction and Simplicity Revolution... - Alexander Gorban
We review and analyse biological, physical, and mathematical problems at the core of the fundamental question: how can a high-dimensional brain organise reliable and fast learning in a high-dimensional world of data by simple tools?
Two critical applications are reviewed: one-shot correction of errors in artificial intellectual systems and the emergence of static and associative memories in ensembles of single neurons. Error correctors should be simple; they should not damage the existing skills of the system; and they should allow fast non-iterative learning and correction of new mistakes without destroying the previous fixes. All these demands can be satisfied by new tools based on the concentration of measure phenomena and stochastic separation theory.
In a few words, the stochastic separation theorems state that for essentially high-dimensional distributions, a random point can be separated from a random set by Fisher's linear discriminant with high probability. The number of points in this set can grow exponentially with dimension. Different versions of the stochastic separation theorems use different definitions of `random sets' and `essentially high-dimensional distributions', but the essence of these definitions is simple: sets with very small (vanishing) volume should not have high probability, even for large dimension.
The talk is based on the work: A.N. Gorban, V.A. Makarov, I.Y. Tyukin, The unreasonable effectiveness of small neural ensembles in high-dimensional brain. Physics of Life Reviews, 2019, https://doi.org/10.1016/j.plrev.2018.09.005
Automatic Differentiation and SciML in Reality: What can go wrong, and what t... - Chris Rackauckas
How does automatic differentiation work, what happens when you apply it to equation solvers, and how can it go wrong? This talk is all about the details of how scientific machine learning (SciML) works. It goes into detail as to how neural networks are trained in the context of equation solvers, along with the numerical issues that can arise in the differentiation processes.
https://sciml.ai/
Functional specialization in human cognition: a large-scale neuroimaging init... - Ana Luísa Pinho
Linking brain systems and mental functions requires accurate descriptions of behavioral tasks and fine demarcations of brain regions. Functional Magnetic Resonance Imaging (fMRI) has contributed to the investigation of brain regions involved in a variety of cognitive processes. However, to date, no data collection has systematically addressed the functional mapping of cognitive mechanisms at a fine spatial scale. The Individual Brain Charting (IBC) project provides a high-resolution multi-task fMRI dataset that intends to provide the objective basis toward a comprehensive functional atlas of the human brain. The data refer to a permanent cohort performing many different tasks. The large amount of task-fMRI data on the same subjects yields a precise mapping of the underlying functions, free from both inter-subject and inter-site variability. The first release of the IBC dataset consists of data acquired from thirteen participants during performance of a dozen tasks. Raw data from this release are publicly available in the OpenNeuro repository, and derived statistical maps can be found in NeuroVault [1]. These maps reveal a successful cognitive encoding of many psychological domains in large areas of the human brain. Indeed, main findings of the original studies were replicated at higher resolution. Our results thus provide a comprehensive revision of the neural correlates underlying behavior, highlighting nonetheless the spatial variability of functional signatures between participants. In addition, this dataset supports investigations using alternative approaches to group-level analysis of task-specific studies. For instance, such a rich task-wise dataset can be applied to mega-analytic encoding models towards the development of a brain-atlasing framework, by systematically mapping functional signatures associated with the cognitive components of the tasks.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Invited talk, Journées Nationales du GDR GPL 2024
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx - MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation, makes them the most convenient, least labor-intensive live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larvae. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represent another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
More Related Content
Similar to A measure to evaluate latent variable model fit by sensitivity analysis
BREEDING METHODS FOR DISEASE RESISTANCE.pptx - RASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes.
Nucleophilic Addition of carbonyl compounds.pptx - SSR02
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
What are greenhouse gasses and how many gasses are there to affect the Earth - moosaasad1975
What are greenhouse gases, how do they affect the Earth and its environment, what is the future of the environment and the Earth, and how are the weather and the climate affected?
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
ESR spectroscopy in liquid food and beverages.pptx - PRIYANKA PATEL
With an increasing population, people need to rely on packaged foodstuffs. Packaging of food materials requires the preservation of food. There are various methods for treating food to preserve it, and irradiation treatment is one of them. It is the most common and most harmless method of food preservation, as it does not alter the necessary micronutrients of food materials. Although irradiated food does not cause any harm to human health, quality assessment of the food is still required to provide consumers with the necessary information about it. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during its processing. The ESR spin-trapping technique is useful for the detection of highly unstable radicals in food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin-trapping technique.
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr... - Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
The Evolution of Science Education PraxiLabs’ Vision- Presentation (2).pdf - mediapraxi
The rise of virtual labs has been a key tool in universities and schools, enhancing active learning and student engagement.
💥 Let’s dive into the future of science and shed light on PraxiLabs’ crucial role in transforming this field!
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... - University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
A measure to evaluate latent variable model fit by sensitivity analysis
1. A measure to evaluate latent variable model fit by sensitivity analysis
Daniel Oberski
Department of Methodology and Statistics
Dept of Statistics, Leiden University
Latent variable model fit by sensitivity analysis Daniel Oberski
2. Latent variable models
What do they assume and what are they good for?
5. Example
Goal: estimate false positives and false negatives in four diagnostic tests for C. trachomatis infection:
y1 Ligase chain reaction (LCR) test (Yes/No);
y2 Polymerase chain reaction (PCR) test (Yes/No);
y3 DNA probe test (DNAP) (Yes/No);
y4 Culture (CULT) (Yes/No).
Tool: 2-latent-class model (diseased or non-diseased).
(Original data from Dendukuri et al. 2009)
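As a concrete illustration of the kind of model used here, the following is a minimal EM sketch of a two-class latent class model with four binary indicators under local independence. This is not the software used in the talk; the data are simulated, and all parameter values below are hypothetical, chosen only for illustration.

```python
import numpy as np

# Simulate four binary "diagnostic test" indicators from a 2-class model
# (hypothetical response probabilities, not the C. trachomatis estimates).
rng = np.random.default_rng(1)
n, J, K = 500, 4, 2
true_p = np.array([[0.90, 0.85, 0.80, 0.70],   # P(y_j = 1 | diseased)
                   [0.05, 0.10, 0.10, 0.05]])  # P(y_j = 1 | non-diseased)
diseased = rng.random(n) < 0.3
y = (rng.random((n, J)) < true_p[np.where(diseased, 0, 1)]).astype(float)

pi = np.full(K, 1.0 / K)                 # class sizes
p = rng.uniform(0.3, 0.7, size=(K, J))   # conditional response probabilities
for _ in range(200):
    # E-step: posterior class memberships, using local independence
    logf = np.log(pi) + y @ np.log(p).T + (1 - y) @ np.log(1 - p).T
    post = np.exp(logf - logf.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update class sizes and response probabilities
    pi = post.mean(axis=0)
    p = np.clip((post.T @ y) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)

print(np.round(np.sort(pi), 2))
print(np.round(p, 2))
```

With well-separated classes as above, the recovered class sizes and response probabilities should be close to the simulation values, up to label switching.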
6. Assume:
[Path diagram: latent variable ξ with locally independent indicators y1, y2, ..., yJ]
But really:
[Path diagram: ξ with indicators y1, y2, ..., yJ, now with local dependence among them]
What difference does it make for the goal: false positives and false negatives? (simulation by Van Smeden et al., submitted)
9. Example
Goal: estimate gender differences in ”valuing Stimulation”:
(1) Very much like me; (2) Like me; (3) Somewhat like me; (4) A little like me; (5) Not like me; (6) Not like me at all.
impadv (S)he looks for adventures and likes to take risks. (S)he wants to have an exciting life.
impdiff (S)he likes surprises and is always looking for new things to do. (S)he thinks it is important to do lots of different things in life.
Tool: structural equation model for European Social Survey data (n = 18519 men and 16740 women).
(Original study by Schwartz et al. 2005)
10. Assume:
[Path diagram: latent variable ξ with indicators y1, y2, ..., yJ and covariate x affecting ξ only]
But really (?):
[Path diagram: the same model, with x also directly affecting an indicator]
What difference does it make for the goal: true gender differences in values? (re-analysis of data by Oberski 2014)
[Figure: latent mean difference estimates ± 2 s.e. (roughly -0.2 to 0.2; positive = men value more, negative = women value more) for the "Human value" factors AC, PO, ST, SD, HE, CO, TR, SE, UN, BE, under two models: scalar invariance and free intercept 'Adventure']
11. PROBLEM
The original authors found that the conditional independence model fit the data ”approximately” (p. 1013)...
”Chi-square deteriorated significantly, \(\Delta\chi^2(19) = 3313\), p < .001, but CFI did not change. Change in chi-square is highly sensitive with large sample sizes and complex models. The other indices suggested that scalar invariance might be accepted (CFI = .88, RMSEA = .04, CI = .039–.040, PCLOSE = 1.0).”
... but unfortunately this ”acceptable” misspecification could reverse their conclusions!
12. Numbers that indicate how well the model fits the data
• Likelihood ratio vs. saturated model
• Information-based criteria: AIC, BIC, CAIC, ...
• Bivariate residuals (Maydeu-Olivares & Joe 2005; Oberski, Van Kollenburg & Vermunt 2013)
• Score/Lagrange multiplier tests, “modification index”, “expected parameter change” (EPC) (Saris, Satorra & Sörbom 1989; Oberski & Vermunt 2013; Oberski & Vermunt accepted)
“Fit indices”:
• RMSEA: \(\sqrt{\frac{\chi^2/df - 1}{N - 1}}\)
• CFI: \(\left[ (\chi^2_{\text{null}} - df_{\text{null}}) - (\chi^2 - df) \right] / (\chi^2_{\text{null}} - df_{\text{null}})\)
• Lots of others: TLI, NFI, NNFI, RFI, IFI, RNI, RMR, SRMR, GFI, AGFI, MFI, ECVI, ...
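The two index formulas on this slide can be computed directly from chi-square summaries. A small sketch; the chi-square values, degrees of freedom, and sample size below are made up for illustration, not taken from the talk:

```python
import math

def rmsea(chi2, df, n):
    # RMSEA = sqrt(((chi2 / df) - 1) / (n - 1)), floored at zero
    return math.sqrt(max(chi2 / df - 1.0, 0.0) / (n - 1))

def cfi(chi2, df, chi2_null, df_null):
    # CFI = [(chi2_null - df_null) - (chi2 - df)] / (chi2_null - df_null)
    return ((chi2_null - df_null) - (chi2 - df)) / (chi2_null - df_null)

print(round(rmsea(350, 100, 1000), 3))     # 0.05
print(round(cfi(350, 100, 2000, 120), 3))  # 0.867
```

Note that both indices rescale the same chi-square statistic; as the quoted passage on slide 11 shows, they can nonetheless lead to different accept/reject conclusions.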
13. What is the problem?
• We do latent variable modeling with a goal in mind.
• But the latent variable model might be misspecified.
• The appropriate question: ”will that affect my goal?”
• The actual question: ”do the data fit the model in the population?” (LR) or ”are the model and the data far apart relative to model complexity?” (RMSEA etc.)
What is the solution?
Evaluate directly what effect possible misspecifications have on the goal of the analysis.
14. How to evaluate directly what effect possible misspecifications have on the goal of the analysis.
15. Two ideas to evaluate the effect of misspecifications
1 Try out all possible models with misspecifications, calculate the estimates of interest under these models, and evaluate whether these are substantively different.
Advantage: does the job.
Disadvantage: there may be too many alternative models. Also: are applied researchers really going to do this?
2 Use the EPC-interest: the expected change in the free parameters.
Advantage: does the job without the need to estimate any alternative models.
Disadvantage: is an approximation (though a reasonable one).
16. EPC-interest applied to Stimulation example
• After fitting the full scalar invariance model,
• Effect size estimate of sex difference in Stimulation is +0.214
(s.e. 0.0139).
• But EPC-interest of equal ”Adventure” item intercept is
-0.243.
• So the EPC-interest suggests the conclusion can be reversed
by freeing a misspecified scalar invariance restriction.
• The actual change when freeing this intercept is very close to
the EPC-interest: -0.235.
18. • Let’s say there is a restricted model whose purpose is to
estimate its parameters, θ, or some linear function of them
such as a subselection, Pθ.
• We could parameterize these restrictions as ψ = 0.
For example: ψ could be the direct effect of gender on
“Adventure”, or the loglinear dependence between DNA tests.
• The maximum likelihood estimates are then
θ̂ = arg max_θ L(θ, ψ = 0).
Question: How much would θ̂ change if we freed ψ?
19. How much would θ̂ change if we freed ψ?
The trick is to consider the estimate of θ we would get under ψ ≠ 0;
that is, θ̃ = arg max_θ L(θ, ψ).
As it turns out, we don’t actually need θ̃, since
θ̃ − θ̂ = Ĥ_θθ⁻¹ Ĥ_θψ D⁻¹ [ ∂L(θ, ψ)/∂ψ |_{θ=θ̂} ] + O(δ′δ),
where H is a Hessian, D = Ĥ_ψψ − Ĥ′_θψ Ĥ_θθ⁻¹ Ĥ_θψ, and δ is the
“overall wrongness” of the model, (ψ′, θ′ − θ̂′)′.
20. How much would θ̂ change if we freed ψ?
Dropping the approximation term (assuming the model
parameters are not “too far” from the truth), we get the
approximation
EPC-interest = −P Ĥ_θθ⁻¹ Ĥ_θψ EPC-self ≈ −P Ĥ_θθ⁻¹ Ĥ_θψ (ψ − ψ̂)
For those of you familiar with Structural Equation Modeling (or
who attended my 2013 MBC2 talk), “EPC-self” is the usual “expected
parameter change” for the fixed parameter vector, i.e. the size of
the misspecification.
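The approximation above needs only quantities available from the restricted fit: the Hessian blocks and the score for ψ. A minimal numpy sketch with made-up Hessian blocks (no real data involved) looks like this:

```python
import numpy as np

# Made-up Hessian blocks of the log-likelihood at the restricted
# estimate theta-hat (purely illustrative numbers).
H_tt = np.array([[4.0, 1.0],
                 [1.0, 3.0]])   # block for theta
H_tp = np.array([[0.5],
                 [0.2]])        # cross block theta x psi
H_pp = np.array([[2.0]])        # block for psi
score_psi = np.array([0.8])     # dL/dpsi evaluated at theta-hat

# D = H_psipsi - H'_thetapsi H^{-1}_thetatheta H_thetapsi
D = H_pp - H_tp.T @ np.linalg.solve(H_tt, H_tp)

# EPC-self: expected change in the fixed parameter psi itself
epc_self = np.linalg.solve(D, score_psi)

# EPC-interest = -P H^{-1}_thetatheta H_thetapsi EPC-self;
# here P = I, i.e. all of theta is "of interest"
P = np.eye(2)
epc_interest = -P @ np.linalg.solve(H_tt, H_tp) @ epc_self

print(epc_self)      # expected change in psi if freed
print(epc_interest)  # expected change in theta-hat if psi is freed
```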
21. Monte Carlo simulation: the EPC-interest is a good
approximation to the actual change in the parameters of
interest when freeing an equality restriction.
Average over 200 replications:
∆ν1   ng    EPC-self   ∆α̂      ∆α̂ bias   EPC-interest   EPC-interest bias
0.1    50   0.064      0.240   -0.040    -0.034          0.005
0.3    50   0.213      0.313   -0.113    -0.113         -0.001
0.8    50   0.657      0.505   -0.305    -0.401         -0.096
0.1   100   0.058      0.231   -0.031    -0.031          0.000
0.3   100   0.203      0.323   -0.123    -0.109          0.014
0.8   100   0.619      0.492   -0.292    -0.370         -0.077
0.1   500   0.063      0.233   -0.033    -0.033          0.000
0.3   500   0.208      0.307   -0.107    -0.112         -0.005
0.8   500   0.598      0.501   -0.301    -0.349         -0.048
23. Ranking data in 48 WVS countries
Option # M/P Value wording
Set A
1. M A high level of economic growth
2. M Making sure this country has strong defense forces
3. P Seeing that people have more say about how things are done at
their jobs and in their communities
4. P Trying to make our cities and countryside more beautiful
Set B
1. M Maintaining order in the nation
2. P Giving people more say in important government decisions
3. M Fighting rising prices
4. P Protecting freedom of speech
Set C
1. M A stable economy
2. P Progress toward a less impersonal and more humane society
3. P Progress toward a society in which ideas count more than money
4. M The fight against crime
24. Figure: Graphical representation of the multilevel latent class regression
model for (post)materialism measured by three partial ranking tasks.
Observed variables are shown in rectangles while unobserved (“latent”)
variables are shown in ellipses.
25. Latent class ranking model with 4 choices
Each ranking set, for example set A:
P(A1ic = a1, A2ic = a2 | Xic = x) = [ω_{a1,x} / Σ_k ω_{k,x}] · [ω_{a2,x} / Σ_{k≠a1} ω_{k,x}],
where ω_{k,x} is the “utility” of object k for respondents in class x.
Multilevel structure to account for the countries, using group class
variable G:
P(Xic = x | Z1ic = z1, Z2ic = z2, Gc = g) =
exp(α_x + γ_{1x} z1 + γ_{2x} z2 + β_{gx}) / Σ_t exp(α_t + γ_{1t} z1 + γ_{2t} z2 + β_{gt})
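The sequential ranking probability above is easy to compute directly; here is a small Python sketch (the function name and utility values are invented for illustration):

```python
def partial_ranking_prob(utilities, first, second):
    """P(first choice, second choice) under the sequential-logit
    ranking model: pick `first` from all options, then `second`
    from the remaining ones, each proportional to its utility."""
    total = sum(utilities.values())
    p_first = utilities[first] / total
    p_second = utilities[second] / (total - utilities[first])
    return p_first * p_second

# Made-up utilities omega_kx for the four Set A options in one class
omega = {1: 4.0, 2: 1.0, 3: 2.0, 4: 1.0}
print(partial_ranking_prob(omega, first=1, second=3))  # (4/8)*(2/4) = 0.25
```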
26. Multilevel latent class model w/ covariates for rankings
L(θ) = P(A1, A2, B1, B2, C1, C2 | Z1, Z2)
= ∏_{c=1}^{C} Σ_G P(Gc) ∏_{i=1}^{nc} Σ_X P(Xic | Z1ic, Z2ic, Gc) ×
P(A1ic, A2ic | Xic) P(B1ic, B2ic | Xic) P(C1ic, C2ic | Xic)
Goal: estimate γ (especially its sign).
Possible problem: violations of scalar and metric
measurement invariance (DIF), parameterized respectively as
τ∗ and λ∗.
Solution: see whether these matter for the sign of γ.
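The structure of this likelihood (a sum over group classes of a product over respondents, each a sum over individual classes) can be sketched for a single country's contribution; all probabilities below are randomly generated placeholders, not model estimates:

```python
import numpy as np

# Sketch of one country's likelihood contribution, with made-up
# dimensions: 2 group classes G, 3 individual classes X, 5 respondents.
rng = np.random.default_rng(0)

p_G = np.array([0.6, 0.4])  # P(G_c = g)
# P(X_ic = x | covariates, G_c = g): shape (n_c, G, X), rows sum to 1
p_X_given = rng.dirichlet(np.ones(3), size=(5, 2))
# P(all three ranking tasks of respondent i | X_ic = x): shape (n_c, X)
p_resp_given_X = rng.uniform(0.01, 0.2, size=(5, 3))

# Inner sum over X for each respondent and group class -> shape (n_c, G)
inner = np.einsum('igx,ix->ig', p_X_given, p_resp_given_X)
# Product over respondents within the country, then sum over G
L_country = np.sum(p_G * np.prod(inner, axis=0))
print(L_country)  # a small positive probability
```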
27. Table: Full invariance multilevel latent class model: parameter estimates
of interest with standard errors (columns 3 and 4), as well as expected
change in these parameters measured by the EPC-interest when
freeing each of six sets of possible misspecifications (columns 5–10).
EPC-interest for...
                         τ∗_jkg, ranking task       λ∗_jkxg, ranking task
                Est.    s.e.      1       2       3        1        2
Class 1 GDP    -0.035  (0.007)  -0.013   0.021  -0.002    0.073    0.252
Class 2 GDP    -0.198  (0.012)  -0.018  -0.035   0.015   -0.163   -0.058
Class 1 Women   0.013  (0.001)  -0.006   0.002   0.000   -0.003    0.029
Class 2 Women  -0.037  (0.001)   0.007  -0.003   0.002   -0.006   -0.013
28. Table: Partially invariant multilevel latent class model: parameter
estimates of interest with standard errors (columns 3 and 4), as well as
expected change in these parameters measured by the EPC-interest
when freeing each of four sets of remaining possible misspecifications
(columns 5–7 and 10).
EPC-interest for non-invariance of...
                         τ∗_kg, ranking task       λ∗_kxg, ranking task
                Est.    s.e.      1       2       3        3
Class 1 GDP    -0.127  (0.008)  -0.015  -0.003   0.002    0.097
Class 2 GDP     0.057  (0.011)  -0.043  -0.013   0.002    0.161
Class 1 Women   0.008  (0.001)  -0.002   0.000   0.002    0.001
Class 2 Women   0.020  (0.001)  -0.007  -0.001   0.002    0.007
29. Figure: Estimated probability of each class (“Materialist”,
“Postmaterialist”, “Mixed”) at the minimum and maximum of each
covariate of interest (% women in parliament; GDP per capita)
under the final model. [Plot omitted.]
30. Figure: Posterior class probabilities (“Materialist”,
“Postmaterialist”, “Mixed”) for each country, plotted against
% women in parliament and against ln(GDP per capita); points
labeled with ISO country codes. [Plots omitted.]
31. What has been gained by using EPC-interest:
I am fairly confident here that there truly is “approximate
measurement invariance”, in the sense that any violations of
measurement invariance do not bias the primary conclusions.
I think attaining this goal is the main purpose of model fit
evaluation.
33. Conclusion
• Latent variable modeling is often performed for a purpose;
• Model fit evaluation should then ask whether violations of
assumptions disturb that purpose.
• The EPC-interest was introduced to look into this;
• It evaluates the change in the parameter(s) of interest that
would result if a restriction parameterizing a potential
violation of assumptions were freed.
34. Implemented in SEM software lavaan for R:
Oberski (2014). Evaluating Sensitivity of Parameters of Interest to Measurement
Invariance in Latent Variable Models. Political Analysis, 22 (1).
Implemented in LCA software Latent Gold:
Oberski, Vermunt & Moors (submitted). Evaluating measurement invariance in
categorical data latent variable models with the EPC-interest.
Oberski & Vermunt (2014). A model-based approach to goodness-of-fit
evaluation in item response theory. Measurement, 11, 117–122.
Nagelkerke, Oberski, & Vermunt (accepted). ”Goodness-of-fit of Multilevel
Latent Class Models for Categorical Data”. Sociological Methodology.
Oberski & Vermunt (conditionally accepted). ”The Expected Parameter Change
(EPC) for Local Dependence Assessment in Binary Data Latent Class
Models”. Psychometrika.
35. Thank you for your attention!
Daniel Oberski
doberski@uvt.nl
See http://daob.nl/publications for full texts & code
36. SEM regression coefficient example
European Sociological Review 2008, 24(5), 583–599
38. SEM regression coefficient example
EPC-interest statistics of at least 0.1 in absolute value with
respect to the latent variable regression coefficients.
Metric invariance (loading) restriction “Conditions → Work skills” in...
                                  Slovenia   France   Hungary   Ireland
EPC-interest w.r.t.:
Conditions → Self-transcendence    -0.073    -0.092   -0.067     0.073
Conditions → Conservation           0.144     0.139    0.123    -0.113
SEPC-self                           0.610     0.692    0.759    -0.514
39. SEM regression coefficient example
What has been gained by using EPC-interest
• Full metric invariance model: ”close fit”;
• EPC-interest still detects threats to cross-country
comparisons of regression coefficients;
• MI and EPC-self do not detect these particular
misspecifications;
• MI and EPC-self detect other misspecifications;
• Looking at EPC-interest reveals that these do not affect the
cross-country comparisons of regression coefficients.