This document discusses generalized linear mixed models (GLMMs), including:
- GLMMs combine linear predictors, exponential family distributions, link functions, and random effects.
- Examples of GLMM applications include ecology, genomics, education, psychology, and epidemiology.
- Estimation methods for GLMMs include penalized quasi-likelihood (PQL), the Laplace approximation, Gauss-Hermite quadrature, and Bayesian/Monte Carlo methods. PQL is widely used but biased when samples per unit are small (e.g. low counts or binary data).
Definitions
Estimation
Inference
Challenges and open questions
References
Generalized linear mixed models: overview and open questions
Ben Bolker
McMaster University, Mathematics & Statistics and Biology
12 November 2013
(Generalized) linear mixed models
(G)LMMs: a statistical modeling framework incorporating:
- Linear combinations of categorical and continuous predictors, and interactions
- Response distributions in the exponential family (binomial, Poisson, and extensions)
- Any smooth, monotonic link function (e.g. logistic, exponential models)
- Flexible combinations of blocking factors (clustering; random effects)
Applications in ecology, neurobiology, behaviour, epidemiology, real estate, ...
Examples
- ecology: survival, predation, etc. (experimental plots)
- genomics: presence/absence of polymorphisms, gene expression (individuals)
- educational assessment: student scores (students × teachers)
- psychology/sensometrics: decisions, responses to stimuli (individuals)
- epidemiology: disease prevalence (postal codes, provinces, countries)
Coral protection by symbionts
[Figure: number of blocks by number of predation events, for each symbiont treatment (none, shrimp, crabs, both).]
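Technical definition
$$
\begin{aligned}
\underbrace{Y_i}_{\text{response}} &\sim \overbrace{\text{Distr}}^{\text{conditional distribution}}\big(\underbrace{g^{-1}(\eta_i)}_{\text{inverse link function}},\ \underbrace{\phi}_{\text{scale parameter}}\big) \\
\underbrace{\eta}_{\text{linear predictor}} &= \underbrace{X\beta}_{\text{fixed effects}} + \underbrace{Zb}_{\text{random effects}} \\
\underbrace{b}_{\text{conditional modes}} &\sim \text{MVN}\big(0,\ \underbrace{\Sigma(\theta)}_{\text{variance-covariance matrix}}\big)
\end{aligned}
$$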
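As one concrete reading of the technical definition, the following is a minimal sketch that simulates binary responses from a logistic GLMM with a single random intercept per group. The function name `simulate_glmm`, the parameter values, and the single uniform predictor are illustrative assumptions, not taken from the talk.

```python
import math
import random

def simulate_glmm(n_groups=20, n_per_group=10, beta0=-1.0, beta1=0.8,
                  sigma_b=1.0, seed=1):
    """Simulate (group, x, y) rows from a random-intercept logistic GLMM."""
    rng = random.Random(seed)
    rows = []
    for g in range(n_groups):
        b_g = rng.gauss(0.0, sigma_b)          # random effect: b ~ N(0, sigma_b^2)
        for _ in range(n_per_group):
            x = rng.uniform(-1.0, 1.0)         # one continuous predictor (assumed)
            eta = beta0 + beta1 * x + b_g      # linear predictor: X beta + Z b
            mu = 1.0 / (1.0 + math.exp(-eta))  # inverse logit link: g^{-1}(eta)
            y = 1 if rng.random() < mu else 0  # Y ~ Bernoulli(mu)
            rows.append((g, x, y))
    return rows

data = simulate_glmm()
print(len(data), sum(y for _, _, y in data))
```

Here Z is just the group indicator matrix and Σ(θ) reduces to the scalar variance `sigma_b ** 2`; a Bernoulli response has no free scale parameter φ.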
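Overview
Maximum likelihood estimation
$$
\underbrace{L(Y_i \mid \theta, \beta)}_{\text{likelihood}} = \int \cdots \int \underbrace{L(Y_i \mid \beta, b)}_{\text{data} \mid \text{random effects}} \times \underbrace{L(b \mid \Sigma(\theta))}_{\text{random effects}} \; db
$$
Best fit is a compromise between two components (consistency of data with β and random effects, consistency of random effects with RE distribution)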
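To make the integral concrete, here is a sketch for the simplest case: one cluster of binary responses with a single random intercept, integrated by brute force with the midpoint rule. The function names, the data vector `y`, and the integration settings are assumptions chosen for illustration.

```python
import math

def cond_lik(y, beta0, b):
    """L(Y | beta, b): Bernoulli likelihood of one cluster given its random effect."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + b)))
    return math.prod(p if yi == 1 else 1.0 - p for yi in y)

def re_density(b, sigma):
    """L(b | sigma^2): normal density of the random intercept."""
    return math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_lik(y, beta0, sigma, half_width=8.0, n_steps=4000):
    """Integrate cond_lik * re_density over b using the midpoint rule."""
    lo = -half_width * sigma
    h = 2.0 * half_width * sigma / n_steps
    total = 0.0
    for k in range(n_steps):
        b = lo + (k + 0.5) * h
        total += cond_lik(y, beta0, b) * re_density(b, sigma)
    return total * h

y = [1, 1, 0, 1, 1]
print(marginal_lik(y, beta0=0.0, sigma=1.0))
```

The integrand is the product of the two components named above: values of b that fit the data well but are implausible under the random-effects distribution contribute little, and vice versa. Brute-force integration like this only scales to one or two random effects per cluster.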
Integrated (marginal) likelihood
[Figure: scaled probability versus conditional mode value (u), showing L(x|b, β), L(b|σ²), and their product L_prod.]
Methods
Estimation methods
- deterministic (precision vs. computational cost): penalized quasi-likelihood, Laplace approximation, adaptive Gauss-Hermite quadrature (Breslow, 2004) ...
- stochastic (Monte Carlo): frequentist and Bayesian (Booth and Hobert, 1999; Ponciano et al., 2009; Sung, 2007)
Penalized quasi-likelihood (PQL)
- alternate steps of estimating GLM using known RE variances to calculate weights; estimate LMMs given GLM fit (Breslow, 2004)
- flexible (allows spatial/temporal correlations, crossed REs)
- biased for small unit samples (e.g. counts < 5, binary or low-survival data)
- widely used: SAS PROC GLIMMIX, R MASS:glmmPQL: in ≈ 90% of small-unit-sample cases
- descendants: higher-order PQL, hierarchical GLM
Breslow (2004) on PQL
As usual when software for complicated statistical
inference procedures is broadly disseminated, there is
potential for abuse and misinterpretation. In spite of the
fact that PQL was initially advertised as a procedure for
approximate inference in GLMMs, and its tendency to
give seriously biased estimates of variance components
and a fortiori regression parameters with binary outcome
data was emphasized in multiple publications [5, 6, 24],
some statisticians seemed to ignore these warnings and to
think of PQL as synonymous with GLMM.
Laplace approximation
- for given β, θ (RE parameters), find conditional modes by penalized iteratively reweighted least squares; then use second-order Taylor expansion around the conditional modes
- more accurate than PQL
- reasonably fast and flexible
- lme4:glmer, glmmML, glmmADMB, R2ADMB (AD Model Builder)
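The two steps of the Laplace approximation can be sketched for a single cluster with one random intercept and binary responses. This is an illustrative assumption-laden toy, not how lme4 or the other packages are implemented: the function names are hypothetical, and plain Newton iterations stand in for penalized iteratively reweighted least squares (for a scalar random effect the two amount to the same update).

```python
import math

def log_joint(b, y, beta0, sigma):
    """log of L(Y | beta, b) * L(b | sigma^2) for one cluster."""
    eta = beta0 + b
    loglik = sum(yi * eta - math.log1p(math.exp(eta)) for yi in y)
    logdens = -0.5 * (b / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    return loglik + logdens

def conditional_mode(y, beta0, sigma, tol=1e-10, max_iter=100):
    """Step 1: Newton iterations for the value of b that maximises log_joint."""
    n, s, b = len(y), sum(y), 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-(beta0 + b)))
        grad = (s - n * p) - b / sigma ** 2           # first derivative in b
        hess = -n * p * (1.0 - p) - 1.0 / sigma ** 2  # second derivative (always < 0)
        step = grad / hess
        b -= step
        if abs(step) < tol:
            break
    return b

def laplace_marginal(y, beta0, sigma):
    """Step 2: second-order Taylor expansion of log_joint around the mode."""
    b_hat = conditional_mode(y, beta0, sigma)
    p = 1.0 / (1.0 + math.exp(-(beta0 + b_hat)))
    curvature = len(y) * p * (1.0 - p) + 1.0 / sigma ** 2  # minus the second derivative
    return math.exp(log_joint(b_hat, y, beta0, sigma)) * math.sqrt(2.0 * math.pi / curvature)

y = [1, 1, 0, 1, 1]
print(conditional_mode(y, 0.0, 1.0), laplace_marginal(y, 0.0, 1.0))
```

The approximation replaces the integrand by a Gaussian with the same mode and curvature, so it is exact when the integrand is already Gaussian and degrades as the integrand becomes skewed (few observations per cluster, extreme probabilities).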
Gauss-Hermite quadrature (GHQ)
- as above, but compute additional terms in the integral (typically 8, but often up to 20)
- most accurate
- slowest, hence not flexible (2-3 RE at most, maybe only 1)
- lme4:glmer, glmmML, repeated
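A minimal sketch of non-adaptive Gauss-Hermite quadrature for one cluster with a single random intercept follows. It uses the standard 5-point nodes and weights for the weight function exp(-x²), hard-coded to keep the example dependency-free; five points is fewer than the 8 to 20 mentioned above and is chosen only for brevity. The function names and data are hypothetical.

```python
import math

# Standard 5-point Gauss-Hermite rule for integrals of f(x) * exp(-x^2)
GH_NODES = [-2.020182870456086, -0.958572464613819, 0.0,
            0.958572464613819, 2.020182870456086]
GH_WEIGHTS = [0.019953242059046, 0.393619323152241, 0.945308720482942,
              0.393619323152241, 0.019953242059046]

def cond_lik(y, beta0, b):
    """L(Y | beta, b): Bernoulli likelihood of one cluster given its random effect."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + b)))
    return math.prod(p if yi == 1 else 1.0 - p for yi in y)

def ghq_marginal(y, beta0, sigma):
    """Approximate the integral of cond_lik(b) * Normal(b; 0, sigma^2) over b.

    The substitution b = sqrt(2) * sigma * x turns the normal density into
    exp(-x^2) / sqrt(pi), which matches the Gauss-Hermite weight function.
    """
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        total += w * cond_lik(y, beta0, math.sqrt(2.0) * sigma * x)
    return total / math.sqrt(math.pi)

y = [1, 1, 0, 1, 1]
print(ghq_marginal(y, beta0=0.0, sigma=1.0))
```

With q random effects per cluster the rule needs a grid of (number of nodes)^q evaluations, which is why quadrature is restricted to very few random effects.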
Adaptive vs. non-adaptive GHQ
Adaptive GHQ is more expensive at a given number of quadrature points n, but makes up for it in accuracy
Stochastic approaches
- Mostly Bayesian (Bayesian computation handles high-dimensional integration)
- various flavours: Gibbs sampling, MCMC, MCEM, etc.
- generally slower but more flexible
- simplifies many inferential problems
- must specify priors, assess convergence/error
- specialized: bernor, glmmAK, MCMCglmm (Hadfield, 2010), INLA
- general: glmmBUGS, R2WinBUGS, BRugs (WinBUGS/OpenBUGS), R2jags, rjags (JAGS)
Estimation: example (McKeon et al., 2012)
[Figure: estimated log-odds of predation for the Symbiont, Crab vs. Shrimp, and Added symbiont effects, compared across five fitting methods: GLM (fixed), GLM (pooled), PQL, Laplace, AGQ.]
Wald tests
- Wald tests (e.g. typical results of summary)
- based on information matrix
- assume quadratic log-likelihood surface
- exact for regular linear models; only asymptotically OK for GLM(M)s
- computationally cheap
- approximation is sometimes awful (Hauck-Donner effect)
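The Hauck-Donner effect can be seen in a toy case much simpler than a GLMM: a single binomial proportion tested against p = 0.5 on the logit scale. This stand-in model and the function name `wald_and_lrt` are assumptions made for illustration.

```python
import math

def wald_and_lrt(successes, n, p0=0.5):
    """Wald z (logit scale) and likelihood ratio statistic for H0: p = p0."""
    p_hat = successes / n

    def logit(p):
        return math.log(p / (1.0 - p))

    def loglik(p):
        return successes * math.log(p) + (n - successes) * math.log(1.0 - p)

    # Standard error of logit(p_hat) from the information matrix
    se = 1.0 / math.sqrt(n * p_hat * (1.0 - p_hat))
    wald_z = (logit(p_hat) - logit(p0)) / se
    lrt = 2.0 * (loglik(p_hat) - loglik(p0))
    return wald_z, lrt

for s in (70, 90, 99):
    z, lrt = wald_and_lrt(s, 100)
    print(s, round(z, 2), round(lrt, 1))
```

Moving from 90 to 99 successes out of 100 is stronger evidence against p = 0.5, and the likelihood ratio statistic grows accordingly, but the Wald z shrinks: the standard error on the logit scale inflates faster than the estimate as the log-likelihood surface becomes less quadratic.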
Likelihood ratio tests
- better, but still have to deal with two finite-size problems:
  - when scale parameter is free (Gamma, etc.), deviance is ∼ F rather than ∼ χ², with poorly defined denominator df
  - in GLM(M) case, numerator is only asymptotically χ² anyway
- Bartlett corrections (Cordeiro and Ferrari, 1998; Cordeiro et al., 1994), higher-order asymptotics: cond [neither extended to GLMMs!]
- Profile confidence intervals: moderately difficult/fragile
Parametric bootstrapping
- fit null model to data
- simulate data from null model
- fit null and working model, compute likelihood difference
- repeat to estimate null distribution
- should be OK but ??? not well tested (assumes estimated parameters are sufficiently good)
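The steps above can be sketched with a deliberately simple stand-in for a GLMM: two binomial groups, with a null model of one shared probability and a working model of one probability per group. Both models have closed-form fits, which keeps the example short; the function names, the group sizes, and the (exceed + 1) / (n_sim + 1) form of the p value are assumptions of this sketch.

```python
import math
import random

def binom_loglik(k, n, p):
    """Binomial log-likelihood up to a constant, safe when p is 0 or 1."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll

def lr_stat(k1, n1, k2, n2):
    """Fit null and working models; return 2 * log-likelihood difference and null fit."""
    p_null = (k1 + k2) / (n1 + n2)  # null model: one shared probability
    ll_null = binom_loglik(k1, n1, p_null) + binom_loglik(k2, n2, p_null)
    ll_work = binom_loglik(k1, n1, k1 / n1) + binom_loglik(k2, n2, k2 / n2)
    return 2.0 * (ll_work - ll_null), p_null

def parametric_bootstrap_p(k1, n1, k2, n2, n_sim=2000, seed=1):
    rng = random.Random(seed)
    observed, p_null = lr_stat(k1, n1, k2, n2)   # fit null model to data
    exceed = 0
    for _ in range(n_sim):                        # repeat to estimate null distribution
        s1 = sum(rng.random() < p_null for _ in range(n1))  # simulate from null model
        s2 = sum(rng.random() < p_null for _ in range(n2))
        sim_stat, _ = lr_stat(s1, n1, s2, n2)     # refit both, likelihood difference
        if sim_stat >= observed - 1e-12:          # small tolerance for rounding
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

print(parametric_bootstrap_p(3, 20, 11, 20))
```

For a real GLMM the two closed-form fits would be replaced by full model fits, which is what makes the procedure expensive; the simulation step also inherits any error in the estimated null parameters, as the last bullet warns.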
Parametric bootstrap results
[Figure: inferred p value versus true p value, in four panels: H2S, Anoxia, Osm, Cu.]
Bayesian approaches
- Provided that we have a good sample from the posterior distribution (Markov chains have converged etc. etc.) we get most of the inferences we want for free by summarizing the marginal posteriors
- Model selection is still an open question: reversible-jump MCMC, deviance information criterion
Acknowledgments
- lme4: Doug Bates, Martin Mächler, Steve Walker
- Data: Adrian Stier (UBC/OSU), Sea McKeon (Smithsonian), David Julian (UF), Jada-Simone White (Univ Hawai'i)
- NSERC (Discovery), SHARCnet
References
Booth, J.G. and Hobert, J.P., 1999. Journal of the Royal Statistical Society. Series B, 61(1):265-285. doi:10.1111/1467-9868.00176.
Breslow, N.E., 2004. In D.Y. Lin and P.J. Heagerty, editors, Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pages 1-22. Springer. ISBN 0387208623.
Cordeiro, G.M. and Ferrari, S.L.P., 1998. Journal of Statistical Planning and Inference, 71(1-2):261-269. ISSN 0378-3758. doi:10.1016/S0378-3758(98)00005-6.
Cordeiro, G.M., Paula, G.A., and Botter, D.A., 1994. International Statistical Review / Revue Internationale de Statistique, 62(2):257-274. ISSN 03067734. doi:10.2307/1403512.
Hadfield, J.D., 2010. Journal of Statistical Software, 33(2):1-22. ISSN 1548-7660.
McKeon, C.S., Stier, A., et al., 2012. Oecologia, 169(4):1095-1103. ISSN 0029-8549. doi:10.1007/s00442-012-2275-2.
Pinheiro, J.C. and Bates, D.M., 1996. Statistics and Computing, 6(3):289-296. doi:10.1007/BF00140873.
Ponciano, J.M., Taper, M.L., et al., 2009. Ecology, 90(2):356-362. ISSN 0012-9658.
Sung, Y.J., 2007. The Annals of Statistics, 35(3):990-1011. ISSN 0090-5364. doi:10.1214/009053606000001389.