Precursors           GLMMs                Results                   Conclusions                   References




             Open-source tools for estimation and inference
                using generalized linear mixed models

                                      Ben Bolker

                                   McMaster University
                    Departments of Mathematics & Statistics and Biology


                                      3 July 2011




Ben Bolker                           McMaster University Departments of Mathematics & Statistics and Biology
Open-source GLMMs




Outline
       1 Precursors
           Definitions
           Examples
           Challenges
       2 GLMMs
           Estimation
           Inference
       3 Results
           Coral symbionts
           Glycera
           Arabidopsis
       4 Conclusions
           Conclusions
Definitions


       Fixed effects (FE): Predictors where interest is in specific levels
       Random effects (RE): Predictors where interest is in distribution
                  rather than levels (blocks) [5]
       Mixed models: Statistical models with both FEs and REs
       Linear mixed models: Linear effects, normal responses, normal REs
       Generalized linear models: Linearizable effects, exponential-family
                     responses
       Generalized linear mixed models: GLMMs = LMMs + GLMs, i.e.
                     linearizable effects, exponential-family responses,
                     normal REs (on linearized scale)



GLMMs




             Distributions from exponential family
             (Poisson, binomial, Gaussian, Gamma, NegBinom(k), . . . )
             Means = linear functions of predictors
             on scale of link function (identity, log, logit, . . . )
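
A minimal sketch of what "linear on the scale of the link function" means in practice. The dictionary `LINKS` and the helper `mean_from_predictor` are hypothetical names introduced here for illustration; they are not taken from any of the packages discussed in the talk.

```python
import math

# (link, inverse link) pairs for three common link functions
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
}

def mean_from_predictor(eta, link="log"):
    """Map a linear-predictor value eta back to the mean via the inverse link."""
    _, inverse = LINKS[link]
    return inverse(eta)

# The same linear predictor implies a different mean under each link
eta = 0.4
means = {name: mean_from_predictor(eta, name) for name in LINKS}
```

The model is linear in `eta`; the inverse link then guarantees that the mean lands in the valid range for the response (positive for log, between 0 and 1 for logit).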




GLMMs (cont.)


             Linear predictor:
                                        $\eta = X\beta + Zu$
             Random effects:
                                      $u \sim \mathrm{MVN}(0, \Sigma)$
             Response:

                              $Y \sim D\left(g^{-1}(\eta), \phi\right)$          ($\phi$ often $\equiv 1$)
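
The three pieces of the model can be made concrete by simulating from it. The following is a sketch only, assuming a Poisson response, a log link, one covariate, and a single random intercept per block; the function names, parameter values, and seed are all hypothetical choices for illustration.

```python
import math
import random

def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate by Knuth's multiplication method (fine for modest mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_glmm(n_blocks=8, n_per_block=25, beta=(0.5, 0.8), sigma_u=0.6, seed=42):
    """Simulate (block, x, y) rows from a Poisson GLMM with a random intercept."""
    rng = random.Random(seed)
    rows = []
    for block in range(n_blocks):
        u = rng.gauss(0.0, sigma_u)            # random effect: u ~ N(0, sigma_u^2)
        for _ in range(n_per_block):
            x = rng.uniform(-1.0, 1.0)
            eta = beta[0] + beta[1] * x + u    # linear predictor: eta = X beta + Z u
            mu = math.exp(eta)                 # inverse link: g^{-1}(eta) for the log link
            rows.append((block, x, poisson_draw(rng, mu)))  # response: Y ~ Poisson(mu)
    return rows

rows = simulate_glmm()
```

Here Z is just the block-indicator matrix, so every observation in a block shares the same `u`.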




Marginal likelihood


       Likelihood (Prob(data|parameters)): requires integrating over
       possible values of REs to get marginal likelihood, e.g.:
             likelihood of the $i$th obs. in block $j$ is $L(x_{ij} \mid \theta_j, \sigma_w^2)$
             likelihood of a particular block mean $\theta_j$ is $L(\theta_j \mid 0, \sigma_b^2)$
             marginal likelihood is
                 $\int L(x_{ij} \mid \theta_j, \sigma_w^2) \, L(\theta_j \mid 0, \sigma_b^2) \, d\theta_j$
       Balance (dispersion of RE around 0) with (dispersion of data
       conditional on RE)
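
For the all-normal case the integral has a closed form, which makes it a convenient check on a numerical version. The sketch below assumes a single observation with normal within-block and between-block distributions; the values of `x`, `sigma_w`, and `sigma_b` are made up for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def marginal_likelihood(x, sigma_w, sigma_b, n_grid=2001, width=10.0):
    """Integrate L(x | theta, sigma_w^2) * L(theta | 0, sigma_b^2) over theta (trapezoid rule)."""
    lo, hi = -width * sigma_b, width * sigma_b
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        theta = lo + i * h
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * normal_pdf(x, theta, sigma_w) * normal_pdf(theta, 0.0, sigma_b)
    return total * h

x, sigma_w, sigma_b = 1.3, 0.8, 1.5
numeric = marginal_likelihood(x, sigma_w, sigma_b)
# Closed form for the normal-normal case: x ~ N(0, sigma_w^2 + sigma_b^2)
closed_form = normal_pdf(x, 0.0, math.sqrt(sigma_w ** 2 + sigma_b ** 2))
```

With non-normal responses no such closed form exists, which is why the estimation methods in the next section all amount to different ways of approximating this integral.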




Bayesian solution?



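       Bayesians should not feel smug: they are stuck with the
       normalizing constant

             $\mathrm{Posterior}(\beta, \theta, \Sigma) = \dfrac{\mathrm{Prior}(\beta, \theta, \Sigma) \, L(x_{ij} \mid \beta, \theta) \, L(\theta \mid \Sigma)}{\iiint (\ldots) \, d\beta \, d\theta \, d\Sigma}$          (!!)

       and similar issues with marginal posteriors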




Examples


Coral protection by symbionts

       [Figure: number of blocks (y-axis, 0–10) by symbiont treatment
       (x-axis: none, shrimp, crabs, both), labeled by number of
       predation events (0, 1, 2)]


Environmental stress: Glycera cell survival
       [Figure: cell survival (scale 0.0–1.0) by H2S (x-axis: 0, 0.03, 0.1,
       0.32) and Copper (y-axis: 0, 33.3, 66.6, 133.3); panels: Anoxia and
       Normoxia, each at Osm = 12.8, 22.4, 32, 41.6, 51.2]


Arabidopsis response to fertilization & clipping
       [Figure: Log(1+fruit set) (y-axis, 0–5) for unclipped vs. clipped
       plants; panel: nutrient (1, 8), color: genotype]



Challenges


Data challenges: estimation



             Small # RE levels (<5–6) [modes at zero]
             Crossed REs [unusual setup]
             Spatial/temporal correlation structure (“R-side” effects)
             Overdispersion
             Unusual distributions (Gamma, negative binomial . . . )




Data challenges: computation




             Large n (of course)
             Multiple REs (dimensionality)
             Crossed REs




Inference




             Any departures from classical LMMs
             Small N (<40)
             Small n
             Inference on components of Σ (boundary effects, df)




RE examples


             Coral symbionts: simple experimental blocks, RE affects
             intercept (overall probability of predation in block)
             Glycera: applied to cells from 10 individuals, RE again affects
             intercept (cell survival prob.)
             Arabidopsis: region (3 levels, treated as fixed) / population /
             genotype: affects intercept (overall fruit set) as well as
             treatment effects (nutrients, herbivory, interaction)




Estimation


Penalized quasi-likelihood (PQL)


             alternate steps of estimating GLM using known RE variances
             to calculate weights; estimate LMMs given GLM fit [2]
             flexible (e.g. spatial/temporal correlations)
             biased for small unit samples (e.g. counts < 5, binary or
             low-survival data)
             widely used: SAS PROC GLIMMIX, R MASS::glmmPQL
             marginal models: generalized estimating equations
             (geepack, geese)



Laplace approximation


             approximate marginal likelihood
             for given β, θ find conditional modes by penalized iteratively
             reweighted least squares (PIRLS); then use second-order Taylor
             expansion around the conditional modes
             more accurate than PQL
             reasonably fast and flexible
             lme4::glmer, glmmML, glmmADMB, R2ADMB (AD Model Builder)
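
The two steps (find the conditional mode, then expand to second order around it) can be sketched for the simplest possible case. This is an illustration under stated assumptions, not how any of the listed packages implement it: one block, a Poisson response with log link, a fixed intercept `BETA0`, and a scalar random intercept with standard deviation `SIGMA`. The data and parameter values are hypothetical, and plain Newton iteration stands in for PIRLS.

```python
import math

Y = [3, 5, 2, 4, 6]        # hypothetical counts from a single block
BETA0, SIGMA = 1.0, 0.5    # assumed fixed intercept and RE standard deviation

def log_joint(u):
    """log[ L(y | beta0, u) * N(u | 0, sigma^2) ] for a Poisson model with log link."""
    eta = BETA0 + u
    loglik = sum(y * eta - math.exp(eta) - math.lgamma(y + 1) for y in Y)
    logprior = -0.5 * (u / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2.0 * math.pi))
    return loglik + logprior

def neg_hessian(u):
    """Negative second derivative of log_joint with respect to u."""
    return len(Y) * math.exp(BETA0 + u) + 1.0 / SIGMA ** 2

def conditional_mode(n_iter=50):
    """Newton iterations for the conditional mode of u (scalar stand-in for PIRLS)."""
    u = 0.0
    for _ in range(n_iter):
        grad = sum(Y) - len(Y) * math.exp(BETA0 + u) - u / SIGMA ** 2
        u += grad / neg_hessian(u)
    return u

def laplace_log_marginal():
    """Second-order expansion around the mode gives a Gaussian integral in closed form."""
    u_hat = conditional_mode()
    return log_joint(u_hat) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(neg_hessian(u_hat))

def brute_force_log_marginal(lo=-4.0, hi=4.0, n=8001):
    """Trapezoid-rule reference value for the same integral."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_joint(lo + i * h))
    return math.log(total * h)

laplace = laplace_log_marginal()
reference = brute_force_log_marginal()
```

Comparing `laplace` with `reference` shows how close the approximation is for this toy problem; with several random effects the brute-force integral becomes impractical, while the Laplace step only needs the mode and the curvature there.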




(adaptive) Gauss-Hermite quadrature (AGHQ)



             as above, but compute additional terms in the integral
             (typically 8, but often up to 20)
             most accurate
             slowest, hence not flexible (2–3 RE at most, maybe only 1)
             lme4::glmer, glmmML, gamlss.mx::gamlssNP, repeated
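
A sketch of the adaptive version for a scalar random intercept, assuming NumPy is available for the Gauss-Hermite nodes (`numpy.polynomial.hermite.hermgauss`). The toy model (one block of hypothetical Poisson counts, fixed intercept `BETA0`, RE standard deviation `SIGMA`) and all function names are illustrative assumptions. The quadrature nodes are centred at the conditional mode and scaled by the curvature there, so a single node reproduces the Laplace approximation and further nodes refine it.

```python
import math
from numpy.polynomial.hermite import hermgauss

Y = [3, 5, 2, 4, 6]        # hypothetical counts from a single block
BETA0, SIGMA = 1.0, 0.5    # assumed fixed intercept and RE standard deviation

def log_joint(u):
    """log[ L(y | beta0, u) * N(u | 0, sigma^2) ] for a Poisson model with log link."""
    eta = BETA0 + u
    loglik = sum(y * eta - math.exp(eta) - math.lgamma(y + 1) for y in Y)
    logprior = -0.5 * (u / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2.0 * math.pi))
    return loglik + logprior

def curvature(u):
    """Negative second derivative of log_joint with respect to u."""
    return len(Y) * math.exp(BETA0 + u) + 1.0 / SIGMA ** 2

def mode_and_scale(n_iter=50):
    """Conditional mode of u (Newton iterations) and the matching Gaussian scale."""
    u = 0.0
    for _ in range(n_iter):
        grad = sum(Y) - len(Y) * math.exp(BETA0 + u) - u / SIGMA ** 2
        u += grad / curvature(u)
    return u, 1.0 / math.sqrt(curvature(u))

def aghq_log_marginal(n_nodes):
    """Adaptive Gauss-Hermite estimate of log of the integral of exp(log_joint(u)) du."""
    u_hat, scale = mode_and_scale()
    nodes, weights = hermgauss(n_nodes)   # rule for integrals against exp(-x^2)
    total = 0.0
    for x, w in zip(nodes, weights):
        u = u_hat + math.sqrt(2.0) * scale * x      # recentre and rescale the node
        total += w * math.exp(log_joint(u) + x * x) # undo the exp(-x^2) weight
    return math.log(math.sqrt(2.0) * scale * total)

estimates = {n: aghq_log_marginal(n) for n in (1, 3, 8, 20)}
```

The cost is one integrand evaluation per node per random effect, so with d random effects a full grid needs n^d evaluations, which is the reason the method is restricted to a few REs.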




Variations




             Hierarchical GLMs (hglm, HGLMMM)
             Monte Carlo methods: MCEM [1], MCMLE (bernor) [18],
             sequential MC (pomp), data cloning (dclone)




Bayesian approaches


             Monte Carlo approaches: MCMC (Gibbs sampling,
             Metropolis-Hastings, etc.)
             slow but flexible
             makes marginal inference easy
             must specify priors, assess convergence
             specialized: glmmAK, MCMCglmm [9], INLA
             general: BUGS (glmmBUGS, R2WinBUGS, BRugs, WinBUGS,
             OpenBUGS, R2jags, rjags, JAGS)
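
A minimal random-walk Metropolis sketch, to show why MCMC sidesteps the normalizing constant: only differences of the unnormalized log posterior are ever used. It is a toy under stated assumptions, sampling a single random intercept `u` with the fixed intercept `BETA0` and RE standard deviation `SIGMA` held at assumed values; a real analysis would sample all parameters and put priors on them. The data, step size, and seed are hypothetical.

```python
import math
import random

Y = [3, 5, 2, 4, 6]        # hypothetical counts from a single block
BETA0, SIGMA = 1.0, 0.5    # assumed known here; normally these get priors too

def log_posterior(u):
    """Unnormalized log posterior of u: Poisson log-likelihood plus normal log-density."""
    eta = BETA0 + u
    return sum(y * eta - math.exp(eta) for y in Y) - 0.5 * (u / SIGMA) ** 2

def metropolis(n_samples=20000, step=0.3, burn_in=2000, seed=7):
    """Random-walk Metropolis sampler; returns post-burn-in draws and the acceptance rate."""
    rng = random.Random(seed)
    u, logp = 0.0, log_posterior(0.0)
    draws, accepted = [], 0
    for i in range(n_samples + burn_in):
        proposal = u + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            u, logp = proposal, logp_new
            accepted += 1
        if i >= burn_in:
            draws.append(u)
    return draws, accepted / (n_samples + burn_in)

draws, acceptance_rate = metropolis()
posterior_mean = sum(draws) / len(draws)
```

Marginal summaries then come directly from `draws` (means, quantiles), but the burn-in length and step size are tuning choices, and convergence still has to be checked, e.g. by running several chains from different starting values.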



Extensions


       Overdispersion Variance > expected from statistical model
                        Quasi-likelihood approaches: MASS:glmmPQL
                        Extended distributions (e.g. negative binomial):
                        glmmADMB, gamlss.mx:gamlssNP
                        Observation-level random effects (e.g.
                        lognormal-Poisson): lme4
       Zero-inflation Overabundance of zeros in a discrete distribution
                        zero-inflated models: glmmADMB, MCMCglmm
                        hurdle models: MCMCglmm
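A rough check for overdispersion is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below uses base R on simulated data and an ordinary Poisson GLM without random effects; the data-generating values are made up purely for illustration.

```r
set.seed(101)
n <- 200
x <- runif(n)
mu <- exp(1 + x)
# Negative binomial counts: variance = mu + mu^2 / size, larger than Poisson
y <- rnbinom(n, mu = mu, size = 1)

fit <- glm(y ~ x, family = poisson)
# Pearson dispersion statistic: close to 1 if the Poisson variance is adequate
disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
disp
```

Values well above 1 suggest that one of the remedies listed above (quasi-likelihood, an extended distribution, or an observation-level random effect) is worth considering.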





Inference




Wald tests/CIs



             Easy (e.g. the typical output of summary): assume a quadratic
             log-likelihood surface, based on the information matrix at the MLE
             always approximate, sometimes awful (Hauck-Donner effect)
             often bad for variance estimates
             available from most direct-maximization packages
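As a minimal sketch of the Wald calculation, the base R example below builds 95% intervals from the estimates and the inverse information matrix. It uses a plain logistic GLM on simulated data as a stand-in; for an lme4 fit the analogous quantities would come from fixef() and vcov().

```r
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(-0.5 + x))   # simulated binary responses
fit <- glm(y ~ x, family = binomial)

est <- coef(fit)
se <- sqrt(diag(vcov(fit)))             # from the inverse information matrix at the MLE
z <- qnorm(0.975)
wald <- cbind(lower = est - z * se, upper = est + z * se)
wald
```

These intervals are symmetric around the estimate by construction, which is one reason they can be poor for variance parameters.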






Likelihood ratio tests/profile confidence intervals


             Model comparison is relatively easy
             Profiling is expensive, and not (yet) available... (lme4a for
             LMMs)
             in the GLM(M) case, the numerator is only asymptotically χ2 anyway:
             Bartlett corrections [3, 4], higher-order asymptotics: cond
             [neither extended to GLMMs!]
             OK if N − n, N ≫ 40?






Conditional F tests


             What if the scale parameter (φ) is estimated
             (e.g. Gaussian, Gamma, quasi-likelihood)?
             In classical LMMs, −2 log L ∼ F(ν1, ν2)
             For non-classical LMMs (unbalanced, crossed, R-side) or
             GLMMs, ν2 is poorly defined:
             Kenward-Roger, Satterthwaite approximations [12, 16]
             unimplemented except in SAS (partially in Genstat)






Tests/CIs of variances [boundary problems]



             LRT depends on null hypothesis being within the parameter’s
             feasible range [6, 13]
             violated e.g. by $H_0: \sigma^2 = 0$
             In simple cases null distribution is a mixture of $\chi^2$
             distributions (e.g. $0.5\chi^2_0 + 0.5\chi^2_1$: emdbook:dchibarsq)
             simulation-based testing: RLRsim
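For the simple 50:50 mixture case, the p value can be computed directly in base R. The function name pchibarsq_upper is hypothetical (it is not the emdbook function), and the sketch assumes a single variance parameter tested against zero.

```r
# Upper-tail probability of an LRT statistic under 0.5*chi^2_0 + 0.5*chi^2_1,
# where chi^2_0 is a point mass at zero.
pchibarsq_upper <- function(lrt) {
  ifelse(lrt <= 0, 1, 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE))
}

lrt <- 2.71   # illustrative value of the likelihood ratio statistic
c(naive = pchisq(lrt, df = 1, lower.tail = FALSE),
  mixture = pchibarsq_upper(lrt))
```

For any positive statistic the mixture p value is half the naive one, so ignoring the boundary makes the test conservative.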






Information-theoretic approaches


             Above issues apply, but are less well understood [7, 8]
             AIC is asymptotic
             “corrected” AIC (AICc) [10] derived for linear models, widely
             used but not tested elsewhere [14]
             For comparing models with different REs,
             or for AICc, what is p? conditional AIC (cAIC) [8, 19]
             (level-of-focus issue: see also the Deviance Information Criterion, DIC [17])
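For reference, the usual small-sample correction is AICc = AIC + 2k(k + 1)/(n − k − 1), with k estimated parameters and n observations. The base R sketch below applies it to an ordinary linear model, the setting in which it was derived; the function name aicc is hypothetical, and for mixed models the right values of k and n are exactly the open question raised above.

```r
# Small-sample corrected AIC for a fitted model with a logLik method
aicc <- function(fit, n = nobs(fit)) {
  k <- attr(logLik(fit), "df")            # number of estimated parameters
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

fit <- lm(dist ~ speed, data = cars)      # built-in data set, n = 50, k = 3
c(AIC = AIC(fit), AICc = aicc(fit))
```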






Bootstrapping

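             1. fit null model to data
             2. simulate “data” from null model
             3. fit null and working model, compute likelihood difference
             4. repeat to estimate null distribution

                 confidence intervals?
                 simulate/refit methods; bootMer in lme4a (LMMs only!)
       library(lme4)  # simulate(), refit() and logLik() methods for fitted mixed models
       # One parametric-bootstrap replicate: twice the log-likelihood difference under m0
       pboot <- function(m0, m1) {
         s <- simulate(m0)[[1]]   # one response vector simulated from the null model
         as.numeric(2 * (logLik(refit(m1, s)) - logLik(refit(m0, s))))
       }
       # null distribution of the statistic (fm2: fitted null model, fm1: fitted working model)
       nulldist <- replicate(1000, pboot(fm2, fm1))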
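The last step of the recipe, turning the simulated likelihood differences into a p value, can be sketched in base R. The function name boot_pvalue is hypothetical, and the chi-square draws are only a stand-in for the output of the bootstrap loop; the "+ 1" terms count the observed statistic as one of the replicates, a common convention.

```r
# Proportion of null replicates at least as extreme as the observed statistic
boot_pvalue <- function(obs, nulldist) {
  (1 + sum(nulldist >= obs)) / (1 + length(nulldist))
}

set.seed(2011)
nulldist <- rchisq(999, df = 1)   # stand-in for simulated likelihood differences
boot_pvalue(3.84, nulldist)
```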



Bayesian inference



             Marginal highest posterior density intervals (or quantiles)
             Computationally “free” with results of stochastic Bayesian
             computation
             Easily extended to prediction intervals etc.
             Post hoc Markov chain Monte Carlo sampling available for
             some packages (glmmADMB, R2ADMB, eventually lme4a)
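Given a vector of posterior draws for one parameter, a marginal highest posterior density interval is the shortest interval containing the requested fraction of draws. The base R sketch below assumes a unimodal posterior; the function name hpd_interval is hypothetical and the gamma draws stand in for real MCMC output.

```r
hpd_interval <- function(samples, prob = 0.95) {
  s <- sort(samples)
  n <- length(s)
  m <- ceiling(prob * n)                 # number of draws the interval must contain
  starts <- seq_len(n - m + 1)
  widths <- s[starts + m - 1] - s[starts]
  best <- which.min(widths)              # shortest window containing m draws
  c(lower = s[best], upper = s[best + m - 1])
}

set.seed(42)
draws <- rgamma(5000, shape = 2, rate = 1)   # skewed stand-in for posterior samples
rbind(hpd = hpd_interval(draws),
      quantile = quantile(draws, c(0.025, 0.975)))
```

For a skewed posterior like this one, the HPD interval is shifted toward the mode relative to the equal-tailed quantile interval.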






Bottom line



             Large data: computation slow (maximization methods
             fastest), inference easy (asymptotics)
             Bayesian computation slow, inference easy (posterior samples)
             Small data: computation fast
                    RE variances may be poorly estimated/set to zero (upcoming:
                    penalty/prior term in blmer within arm)
                    inference tricky, may need bootstrapping







Coral symbionts


Coral symbionts: comparison of results

             [Figure: regression estimates (x-axis from −6 to 2) for the terms Added symbiont, Crab vs. Shrimp, and Symbiont, compared across fitting methods: GLM (fixed), GLM (pooled), PQL, Laplace, AGQ.]







Glycera


Glycera fit comparisons

             [Figure: effect on survival (logit scale, x-axis from −60 to 60) for the main effects Osm, Cu, H2S, and Anoxia and all of their two-, three-, and four-way interactions, compared across fits: MCMCglmm, glmer(OD:2), glmer(OD), glmmML, glmer.]



Glycera: MCMCglmm fit

             [Figure: MCMCglmm estimates of the effect on survival (x-axis from −20 to 30) for the main effects (Osm, Cu, H2S, Oxygen), the 2-way interactions, the 3-way interactions, and the 4-way term Osm : Cu : H2S : Oxygen.]



Parametric bootstrap results
             [Figure: inferred p value against true p value (both axes on a log scale from 0.001 to 0.5), in four panels (Osm, Cu, H2S, Anoxia), with one series per variable: normal, t7, t14.]





Arabidopsis


Arabidopsis: AIC comparison of RE models

             [Figure: ∆AIC (x-axis from 0 to 6) for the random-effects models nointeract, int(popu), and every combination of int/nut/clip(gen) X int/nut/clip(popu).]




Arabidopsis: fits with and without nutrient(genotype)

             [Figure: regression estimates (x-axis from −1.0 to 1.5) for nutrient8:amdclipped, statusTransplant, statusPetri.Plate, rack2, amdclipped, and nutrient8, with two estimates per term (fits with and without nutrient(genotype)).]







Conclusions


Primary tools



              lme4: multiple/crossed REs, (profiling): fast
              MCMCglmm: Bayesian, very flexible
              glmmADMB: negative binomial, zero-inflated etc.
              Flexible tools:
                    AD Model Builder (and interfaces)
                    BUGS/JAGS (and interfaces)
                    INLA [15]






Outlook


              Computation: faster algorithms, parallel computation
              Inference: mostly computational?
              Implementation: extensions (e.g. L1-penalized approaches [11]),
              consistency (profile, simulate, predict)
              Benefits & costs of staying within the GLMM framework
              Benefits & costs of diversity
       More info: http://glmm.wikidot.com





Acknowledgements



              Data: Josh Banta and Massimo Pigliucci (Arabidopsis);
              Adrian Stier and Sea McKeon (coral symbionts); Courtney
              Kagan, Jocelynn Ortega, David Julian (Glycera);
              Co-authors: Mollie Brooks, Connie Clark, Shane Geange, John
              Poulsen, Hank Stevens, Jada White




References

        [1] Booth JG & Hobert JP, 1999. Journal of the Royal Statistical Society. Series B, 61(1):265–285. doi:10.1111/1467-9868.00176. URL http://links.jstor.org/sici?sici=1369-7412(1999)61%3A1%3C265%3AMGLMML%3E2.0.CO%3B2-C.
        [2] Breslow NE, 2004. In DY Lin & PJ Heagerty, eds., Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pp. 1–22. Springer. ISBN 0387208623.
        [3] Cordeiro GM & Ferrari SLP, Aug. 1998. Journal of Statistical Planning and Inference, 71(1-2):261–269. ISSN 0378-3758. doi:10.1016/S0378-3758(98)00005-6. URL http://www.sciencedirect.com/science/article/B6V0M-3V5CVRT-M/2/190f68a684dd08c569a7836ff59568e4.
        [4] Cordeiro GM, Paula GA, & Botter DA, 1994. International Statistical Review / Revue Internationale de Statistique, 62(2):257–274. ISSN 03067734. doi:10.2307/1403512. URL http://www.jstor.org/stable/1403512.
        [5] Gelman A, 2005. Annals of Statistics, 33(1):1–53. doi:10.1214/009053604000001048.
        [6] Goldman N & Whelan S, 2000. Molecular Biology and Evolution, 17(6):975–978.
        [7] Greven S, 2008. Non-Standard Problems in Inference for Additive and Linear Mixed Models. Cuvillier Verlag, Göttingen, Germany. ISBN 3867274916. URL http://www.cuvillier.de/flycms/en/html/30/-UickI3zKPS,3cEY=/Buchdetails.html?SID=wVZnpL8f0fbc.
        [8] Greven S & Kneib T, 2010. Biometrika, 97(4):773–789. URL http://www.bepress.com/jhubiostat/paper202/.
        [9] Hadfield JD, Feb. 2010. Journal of Statistical Software, 33(2):1–22. ISSN 1548-7660. URL http://www.jstatsoft.org/v33/i02.
       [10] Hurvich CM & Tsai C, Jun. 1989. Biometrika, 76(2):297–307. doi:10.1093/biomet/76.2.297. URL http://biomet.oxfordjournals.org/content/76/2/297.abstract.
       [11] Jiang J, Aug. 2008. The Annals of Statistics, 36(4):1669–1692. ISSN 0090-5364. doi:10.1214/07-AOS517. URL http://projecteuclid.org/euclid.aos/1216237296.
       [12] Kenward MG & Roger JH, 1997. Biometrics, 53(3):983–997.
       [13] Molenberghs G & Verbeke G, 2007. The American Statistician, 61(1):22–27. doi:10.1198/000313007X171322.
       [14] Richards SA, 2005. Ecology, 86(10):2805–2814. doi:10.1890/05-0074.
       [15] Rue H, Martino S, & Chopin N, 2009. Journal of the Royal Statistical Society, Series B, 71(2):319–392.
       [16] Schaalje G, McBride J, & Fellingham G, 2002. Journal of Agricultural, Biological & Environmental Statistics, 7(14):512–524. URL http://www.ingentaconnect.com/content/asa/jabes/2002/00000007/00000004/art00004.
       [17] Spiegelhalter DJ, Best N et al., 2002. Journal of the Royal Statistical Society B, 64:583–640.
       [18] Sung YJ, Jul. 2007. The Annals of Statistics, 35(3):990–1011. ISSN 0090-5364. doi:10.1214/009053606000001389. URL http://projecteuclid.org/euclid.aos/1185303995. Mathematical Reviews number (MathSciNet): MR2341695; Zentralblatt MATH identifier: 1124.62009.
       [19] Vaida F & Blanchard S, Jun. 2005. Biometrika, 92(2):351–370. doi:10.1093/biomet/92.2.351. URL http://biomet.oxfordjournals.org/cgi/content/abstract/92/2/351.






Extras



              Spatial and temporal correlation (R-side effects):
              MASS:glmmPQL (sort of), GLMMarp, INLA;
              WinBUGS, AD Model Builder
              Additive models: amer, gamm4, mgcv
              Penalized methods [11]




Montpellier
 
Davis eco-evo virulence
Davis eco-evo virulenceDavis eco-evo virulence
Davis eco-evo virulence
 
Google lme4
Google lme4Google lme4
Google lme4
 
intro to knitr with RStudio
intro to knitr with RStudiointro to knitr with RStudio
intro to knitr with RStudio
 
Disease-induced extinction
Disease-induced extinctionDisease-induced extinction
Disease-induced extinction
 
unmarked individuals: Guelph
unmarked individuals: Guelphunmarked individuals: Guelph
unmarked individuals: Guelph
 
GLMs and extensions in R
GLMs and extensions in RGLMs and extensions in R
GLMs and extensions in R
 

Recently uploaded

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Recently uploaded (20)

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Open source GLMM tools: Concordia

  • 1. Precursors GLMMs Results Conclusions References Open-source tools for estimation and inference using generalized linear mixed models Ben Bolker McMaster University Departments of Mathematics & Statistics and Biology 3 July 2011 Ben Bolker McMaster University Departments of Mathematics & Statistics and Biology Open-source GLMMs
  • 2. Outline
      1 Precursors: Definitions, Examples, Challenges
      2 GLMMs: Estimation, Inference
      3 Results: Coral symbionts, Glycera, Arabidopsis
      4 Conclusions
  • 3. Outline (section: Precursors, Definitions)
  • 4. Definitions
      Fixed effects (FE): predictors where interest is in specific levels
      Random effects (RE): predictors where interest is in the distribution rather than the levels (blocks) [5]
      Mixed models: statistical models with both FEs and REs
      Linear mixed models (LMMs): linear effects, normal responses, normal REs
      Generalized linear models (GLMs): linearizable effects, exponential-family responses
      Generalized linear mixed models (GLMMs): GLMMs = LMMs + GLMs, with normal REs on the linearized scale
  • 6. GLMMs
      Distributions from the exponential family (Poisson, binomial, Gaussian, Gamma, NegBinom(k), ...)
      Means = linear functions of predictors on the scale of the link function (identity, log, logit, ...)
  • 7. GLMMs (cont.)
      Linear predictor: $\eta = X\beta + Zu$
      Random effects: $u \sim \mathrm{MVN}(0, \Sigma)$
      Response: $Y \sim D(g^{-1}(\eta), \phi)$ ($\phi$ often $\equiv 1$)
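As a rough illustration of the three lines on this slide, the following base-R sketch simulates data from a Poisson GLMM with a log link and a single random intercept. The block count, coefficient values, and variable names are assumptions chosen for illustration; none of them come from the talk.

```r
# Sketch: simulate from a Poisson GLMM with a log link and a random intercept.
# All sizes and parameter values below are illustrative assumptions.
set.seed(101)
n_block <- 10                      # number of RE levels (blocks)
n_per   <- 5                       # observations per block
block   <- factor(rep(seq_len(n_block), each = n_per))
x       <- rnorm(n_block * n_per)  # one continuous predictor

X     <- model.matrix(~ x)          # fixed-effect design matrix
Z     <- model.matrix(~ block - 1)  # random-effect indicator matrix
beta  <- c(0.5, 0.3)                # fixed-effect coefficients
sigma <- 0.7                        # RE standard deviation (Sigma = sigma^2 * I)

u   <- rnorm(n_block, mean = 0, sd = sigma)  # u ~ MVN(0, Sigma)
eta <- drop(X %*% beta + Z %*% u)            # eta = X beta + Z u
mu  <- exp(eta)                              # inverse link g^{-1}(eta)
y   <- rpois(length(mu), lambda = mu)        # Y ~ Poisson(mu), phi = 1
```

Each row of Z has a single 1 marking the block an observation belongs to, so Z %*% u simply adds that block's random intercept to the linear predictor.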
  • 8. Marginal likelihood
      Likelihood (Prob(data|parameters)): requires integrating over possible values of the REs to get the marginal likelihood, e.g.:
        likelihood of the $i$th obs. in block $j$ is $L(x_{ij} \mid \theta_j, \sigma_w^2)$
        likelihood of a particular block mean $\theta_j$ is $L(\theta_j \mid 0, \sigma_b^2)$
        marginal likelihood is $\int L(x_{ij} \mid \theta_j, \sigma_w^2)\, L(\theta_j \mid 0, \sigma_b^2)\, d\theta_j$
      Balance (dispersion of RE around 0) with (dispersion of data conditional on RE)
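The marginal-likelihood integral can be checked numerically in the one case where it has a closed form. The sketch below is a minimal base-R example for a single Gaussian observation with a Gaussian block mean; the values of x_ij, sigma_w, and sigma_b are assumptions made up for illustration.

```r
# Sketch: marginal likelihood of one observation in a Gaussian
# random-intercept model, by numerical integration over the block mean.
# The values of x_ij, sigma_w and sigma_b are illustrative assumptions.
x_ij    <- 1.2   # observed value
sigma_w <- 1.0   # within-block SD
sigma_b <- 0.5   # between-block SD

# Integrand: L(x_ij | theta_j, sigma_w^2) * L(theta_j | 0, sigma_b^2)
integrand <- function(theta) {
  dnorm(x_ij, mean = theta, sd = sigma_w) * dnorm(theta, mean = 0, sd = sigma_b)
}

marg_numeric <- integrate(integrand, lower = -Inf, upper = Inf)$value

# In this all-Gaussian case the integral has a closed form:
# x_ij ~ Normal(0, sigma_w^2 + sigma_b^2)
marg_exact <- dnorm(x_ij, mean = 0, sd = sqrt(sigma_w^2 + sigma_b^2))
```

For non-Gaussian responses no such closed form exists in general, which is why the estimation methods in the next section approximate the integral instead.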
  • 10. Bayesian solution?
      Bayesians should not feel smug: they are stuck with the normalizing constant (the denominator, marked "(!!)" on the slide)
      $\mathrm{Posterior}(\beta, \theta, \Sigma) = \dfrac{\mathrm{Prior}(\beta, \theta, \Sigma)\, L(x_{ij} \mid \beta, \theta)\, L(\theta \mid \Sigma)}{\iiint (\ldots)\, d\beta\, d\theta\, d\Sigma}$
      and similar issues with marginal posteriors
  • 11. Outline (section: Precursors, Examples)
  • 12. Coral protection by symbionts
      [Figure: number of blocks (y-axis) by symbiont treatment (none, shrimp, crabs, both), broken down by number of predation events]
  • 13. Environmental stress: Glycera cell survival
      [Figure: cell survival (0.0 to 1.0) by H2S (0, 0.03, 0.1, 0.32) and Copper (0, 33.3, 66.6, 133.3); panels: Anoxia and Normoxia, each at Osm = 12.8, 22.4, 32, 41.6, 51.2]
  • 14. Arabidopsis response to fertilization & clipping
      [Figure: Log(1+fruit set) for unclipped vs. clipped plants; panel: nutrient (1, 8), color: genotype]
  • 15. Outline (section: Precursors, Challenges)
  • 16. Data challenges: estimation
      Small # RE levels (<5–6) [modes at zero]
      Crossed REs [unusual setup]
      Spatial/temporal correlation structure ("R-side" effects)
      Overdispersion
      Unusual distributions (Gamma, negative binomial, ...)
  • 17. Data challenges: computation
      Large n (of course)
      Multiple REs (dimensionality)
      Crossed REs
  • 18. Inference
      Any departures from classical LMMs
      Small N (<40)
      Small n
      Inference on components of Σ (boundary effects, df)
  • 19. RE examples
      Coral symbionts: simple experimental blocks; RE affects the intercept (overall probability of predation in block)
      Glycera: treatments applied to cells from 10 individuals; RE again affects the intercept (cell survival probability)
      Arabidopsis: region (3 levels, treated as fixed) / population / genotype; RE affects the intercept (overall fruit set) as well as treatment effects (nutrients, herbivory, interaction)
  • 20. Outline (section: GLMMs, Estimation)
  • 21. Penalized quasi-likelihood (PQL)
      Alternate between two steps: estimate a GLM using known RE variances to calculate weights; estimate an LMM given the GLM fit [2]
      Flexible (e.g. spatial/temporal correlations)
      Biased for small unit samples (e.g. counts < 5, binary or low-survival data)
      Widely used: SAS PROC GLIMMIX, R MASS::glmmPQL
      Marginal models: generalized estimating equations (geepack, geese)
  • 25. Laplace approximation
      Approximate the marginal likelihood for given β, θ: find the conditional modes by penalized, iteratively reweighted least squares, then use a second-order Taylor expansion around the conditional modes
      More accurate than PQL
      Reasonably fast and flexible
      Software: lme4::glmer, glmmML, glmmADMB, R2ADMB (AD Model Builder)
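The idea behind the Laplace approximation can be sketched for a single block with one random intercept. The example below is a minimal base-R illustration under assumed data and parameter values; it finds the conditional mode with the general-purpose optimize() rather than penalized iteratively reweighted least squares, so it shows the approximation itself and not how lme4 or any other package implements it.

```r
# Sketch: Laplace approximation to the marginal likelihood of one block in a
# Poisson random-intercept model. Data and parameter values are assumptions.
y     <- c(2, 3, 1, 4)   # counts in one block
beta0 <- log(2)          # fixed intercept on the log scale
sigma <- 0.5             # RE standard deviation

# Log of the integrand: conditional log-likelihood plus RE log-density
h <- function(u) {
  sum(dpois(y, lambda = exp(beta0 + u), log = TRUE)) +
    dnorm(u, mean = 0, sd = sigma, log = TRUE)
}

# Conditional mode of the random effect (1-D, so optimize() suffices here)
u_hat <- optimize(h, interval = c(-5, 5), maximum = TRUE)$maximum

# Second derivative of h at the mode (analytic for this model)
h2 <- -length(y) * exp(beta0 + u_hat) - 1 / sigma^2

# Second-order Taylor expansion of h around u_hat gives a Gaussian integral
laplace <- exp(h(u_hat)) * sqrt(2 * pi / -h2)

# Brute-force reference value by numerical integration
exact <- integrate(function(u) sapply(u, function(ui) exp(h(ui))),
                   lower = -5, upper = 5)$value
```

Comparing laplace with exact gives a feel for the approximation error in a small block; with many random effects the brute-force integral becomes impractical, which is where the approximation earns its keep.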
  • 26. (Adaptive) Gauss-Hermite quadrature (AGHQ)
      As above, but compute additional terms in the integral (typically 8, but often up to 20)
      Most accurate
      Slowest, hence not flexible (2–3 REs at most, maybe only 1)
      Software: lme4::glmer, glmmML, gamlss.mx::gamlssNP, repeated
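A minimal base-R sketch of adaptive Gauss-Hermite quadrature for a one-block Poisson random-intercept integral follows. The data, parameter values, and the helper names gauss_hermite and aghq are assumptions for illustration; the nodes and weights come from the standard Golub-Welsch eigenvalue construction rather than from any GLMM package. With a single node the rule reduces to the Laplace approximation.

```r
# Sketch: adaptive Gauss-Hermite quadrature for a one-block Poisson
# random-intercept integral. Data and parameter values are assumptions.
y <- c(2, 3, 1, 4); beta0 <- log(2); sigma <- 0.5

h <- function(u) {   # log integrand
  sum(dpois(y, exp(beta0 + u), log = TRUE)) + dnorm(u, 0, sigma, log = TRUE)
}

# Gauss-Hermite nodes/weights for weight function exp(-x^2), computed from
# the eigen-decomposition of the Jacobi matrix (Golub-Welsch)
gauss_hermite <- function(n) {
  J <- matrix(0, n, n)
  if (n > 1) {
    off <- sqrt(seq_len(n - 1) / 2)
    J[cbind(1:(n - 1), 2:n)] <- off
    J[cbind(2:n, 1:(n - 1))] <- off
  }
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

aghq <- function(n) {
  # Centre and scale the nodes using the conditional mode and curvature
  u_hat <- optimize(h, c(-5, 5), maximum = TRUE)$maximum
  s_hat <- 1 / sqrt(length(y) * exp(beta0 + u_hat) + 1 / sigma^2)
  gh <- gauss_hermite(n)
  u  <- u_hat + sqrt(2) * s_hat * gh$nodes
  sqrt(2) * s_hat * sum(gh$weights * exp(gh$nodes^2) * exp(sapply(u, h)))
}

exact   <- integrate(function(u) sapply(u, function(ui) exp(h(ui))), -5, 5)$value
rel_err <- sapply(c(1, 3, 8), function(n) abs(aghq(n) / exact - 1))
```

The vector rel_err holds the relative error for 1, 3, and 8 nodes. The cost grows quickly with the number of random effects, because the nodes must be laid out on a grid in every dimension.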
  • 27. Variations
      Hierarchical GLMs (hglm, HGLMMM)
      Monte Carlo methods: MCEM [1], MCMLE (bernor) [18], sequential MC (pomp), data cloning (dclone)
  • 28. Bayesian approaches
      Monte Carlo approaches: MCMC (Gibbs sampling, Metropolis-Hastings, etc.)
      Slow but flexible
      Makes marginal inference easy
      Must specify priors, assess convergence
      Specialized: glmmAK, MCMCglmm [9], INLA
      General: BUGS (glmmBUGS, R2WinBUGS, BRugs, WinBUGS, OpenBUGS, R2jags, rjags, JAGS)
  • 29. Extensions
      Overdispersion: variance > expected from statistical model
        Quasi-likelihood approaches: MASS::glmmPQL
        Extended distributions (e.g. negative binomial): glmmADMB, gamlss.mx::gamlssNP
        Observation-level random effects (e.g. lognormal-Poisson): lme4
      Zero-inflation: overabundance of zeros in a discrete distribution
        Zero-inflated models: glmmADMB, MCMCglmm
        Hurdle models: MCMCglmm
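A common first check for overdispersion is the ratio of the summed squared Pearson residuals to the residual degrees of freedom. The sketch below applies it in base R to a plain Poisson GLM fitted to simulated negative binomial counts; the data and parameter values are assumptions, and for a GLMM the residual degrees of freedom are less clearly defined, so the ratio is only a rough guide there.

```r
# Sketch: rough overdispersion check for a Poisson GLM via the ratio of the
# summed squared Pearson residuals to the residual degrees of freedom.
# Simulated data and parameter values are illustrative assumptions.
set.seed(42)
n <- 200
x <- rnorm(n)
# Negative binomial counts: variance = mu + mu^2 / size, so overdispersed
y <- rnbinom(n, mu = exp(1 + 0.3 * x), size = 1)

fit <- glm(y ~ x, family = poisson)
pearson_chisq <- sum(residuals(fit, type = "pearson")^2)
disp_ratio    <- pearson_chisq / df.residual(fit)   # near 1 if no overdispersion
```

A ratio well above 1 suggests that one of the approaches listed on this slide (quasi-likelihood, an extended distribution, or an observation-level random effect) may be needed.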
  • 30. Outline (section: GLMMs, Inference)
  • 31. Wald tests/CIs
      Easy (e.g. typical results of summary): assume a quadratic surface, based on the information matrix at the MLE
      Always approximate, sometimes awful (Hauck-Donner effect)
      Often bad for variance estimates
      Available from most direct-maximization packages
  • 32. Likelihood ratio tests/profile confidence intervals
      Model comparison is relatively easy
      Profiling is expensive, and not (yet) available ... (lme4a for LMMs)
      In the GLM(M) case, the numerator is only asymptotically $\chi^2$ anyway: Bartlett corrections [3;4], higher-order asymptotics: cond [neither extended to GLMMs!]
      OK if N − n and N are larger than about 40?
  • 33. Conditional F tests
      What if the scale parameter ($\phi$) is estimated (e.g. Gaussian, Gamma, quasi-likelihood)?
      In classical LMMs, $-2 \log L \sim F(\nu_1, \nu_2)$
      For non-classical LMMs (unbalanced, crossed, R-side) or GLMMs, $\nu_2$ is poorly defined: Kenward-Roger, Satterthwaite approximations [12;16], unimplemented except in SAS (partially in Genstat)
  • 34. Tests/CIs of variances [boundary problems]
      LRT depends on the null hypothesis being within the parameter's feasible range [6;13]: violated e.g. by $H_0: \sigma^2 = 0$
      In simple cases the null distribution is a mixture of $\chi^2$ distributions (e.g. $0.5\chi^2_0 + 0.5\chi^2_1$: emdbook::dchibarsq)
      Simulation-based testing: RLRsim
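For the simple 50:50 mixture case, the boundary correction amounts to halving the naive p-value. The base-R sketch below uses an assumed likelihood ratio statistic; it relies only on pchisq and does not use emdbook or RLRsim.

```r
# Sketch: p-value for a likelihood ratio test of H0: sigma^2 = 0 under the
# 50:50 mixture of chi-square(0) and chi-square(1). The statistic is assumed.
lrt <- qchisq(0.95, df = 1)   # illustrative LRT statistic (about 3.84)

# Naive p-value, treating the null distribution as chi-square(1)
p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)

# chi-square(0) is a point mass at zero, so it contributes nothing for lrt > 0
p_mixture <- if (lrt > 0) 0.5 * p_naive else 1
```

The naive test is therefore conservative in this case: a statistic that sits exactly at the naive 5% threshold has a mixture p-value of 2.5%.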
  • 35. Information-theoretic approaches
      Above issues apply, but are less well understood [7;8]
      AIC is asymptotic
      "Corrected" AIC (AICc) [10] was derived for linear models; widely used but not tested elsewhere [14]
      For comparing models with different REs, or for AICc, what is p?
      Conditional AIC (cAIC) [8;19] (level-of-focus issue: see also the Deviance Information Criterion, DIC [17])
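For reference, the usual small-sample correction is AICc = AIC + 2p(p+1)/(n − p − 1). The sketch below computes it in base R for a linear model fitted to the built-in cars data; the helper name aicc is hypothetical, and taking p from the "df" attribute of logLik() is an assumption about how to count parameters, which is exactly the ambiguity the slide raises for models with random effects.

```r
# Sketch: small-sample corrected AIC, AICc = AIC + 2p(p+1)/(n - p - 1),
# for a linear model fit to the built-in 'cars' data. Taking p from the
# "df" attribute of logLik() is an assumption about how to count parameters.
fit <- lm(dist ~ speed, data = cars)

aicc <- function(model) {
  p <- attr(logLik(model), "df")   # parameters counted by logLik()
  n <- nobs(model)                 # number of observations
  AIC(model) + 2 * p * (p + 1) / (n - p - 1)
}

aicc_fit <- aicc(fit)
```

Here p = 3 (intercept, slope, residual variance) and n = 50, so the correction term is 24/46; it vanishes as n grows relative to p.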
  • 36. Bootstrapping
      1 fit null model to data
      2 simulate "data" from null model
      3 fit null and working model, compute likelihood difference
      4 repeat to estimate null distribution
      Confidence intervals? simulate/refit methods; bootMer in lme4a (LMMs only!)
      > pboot <- function(m0, m1) {
          # m0: null model, m1: working model
          s <- simulate(m0)    # step 2: simulate a response from the null model
          # step 3: refit both models to the simulated response
          2 * (logLik(refit(m1, s)) - logLik(refit(m0, s)))
        }
      > replicate(1000, pboot(fm2, fm1))    # step 4: null distribution
  • 37. Bayesian inference
      Marginal highest posterior density intervals (or quantiles)
      Computationally "free" with results of stochastic Bayesian computation
      Easily extended to prediction intervals etc.
      Post hoc Markov chain Monte Carlo sampling available for some packages (glmmADMB, R2ADMB, eventually lme4a)
  • 38. Bottom line
      Large data: computation slow (maximization methods fastest), inference easy (asymptotics)
      Bayesian computation slow, inference easy (posterior samples)
      Small data: computation fast
        RE variances may be poorly estimated/set to zero (upcoming: penalty/prior term in blmer within arm)
        inference tricky, may need bootstrapping
  • 39. Coral symbionts: comparison of results
      [Figure: regression estimates (axis −6 to 2) for Symbiont, Crab vs. Shrimp, and Added symbiont, compared across GLM (fixed), GLM (pooled), PQL, Laplace, and AGQ fits]
  • 40. Glycera fit comparisons
      [Figure: effect on survival (logit scale, −60 to 60) for Osm, Cu, H2S, Anoxia and their two-, three-, and four-way interactions, compared across glmer, glmmML, glmer(OD), glmer(OD:2), and MCMCglmm fits]
  • 41. Glycera: MCMCglmm fit
      [Figure: effect on survival (axis −20 to 30) for main effects (Osm, Cu, H2S, Oxygen), 2-way and 3-way interactions, and Osm:Cu:H2S:Oxygen]
  • 42. Parametric bootstrap results
      [Figure: inferred p value vs. true p value (both 0.001 to 0.5); panels: Osm, Cu, H2S, Anoxia; variable: normal, t7, t14]
  • 43. Arabidopsis: AIC comparison of RE models
      [Figure: ∆AIC (0 to 6) for RE models: nointeract; int(popu); int(gen) X int(popu); int(gen) X nut(popu); int(gen) X clip(popu); nut(gen) X int(popu); nut(gen) X nut(popu); nut(gen) X clip(popu); clip(gen) X int(popu); clip(gen) X nut(popu); clip(gen) X clip(popu)]
  • 44. Arabidopsis: fits with and without nutrient(genotype)
      [Figure: regression estimates (axis −1.0 to 1.5) for nutrient8:amdclipped, statusTransplant, statusPetri.Plate, rack2, amdclipped, nutrient8]
  • 45. Conclusions: Primary tools
      lme4: multiple/crossed REs, (profiling): fast
      MCMCglmm: Bayesian, very flexible
      glmmADMB: negative binomial, zero-inflated etc.
      Flexible tools:
        AD Model Builder (and interfaces)
        BUGS/JAGS (and interfaces)
        INLA [15]
  • 46. Conclusions: Outlook
      Computation: faster algorithms, parallel computation
      Inference: mostly computational?
      Implementation: extensions (e.g. L1-penalized approaches [11]), consistency (profile, simulate, predict)
      Benefits & costs of staying within the GLMM framework
      Benefits & costs of diversity
      More info: http://glmm.wikidot.com
  • 47. Conclusions: Acknowledgements
      Data: Josh Banta and Massimo Pigliucci (Arabidopsis); Adrian Stier and Sea McKeon (coral symbionts); Courtney Kagan, Jocelynn Ortega, David Julian (Glycera)
      Co-authors: Mollie Brooks, Connie Clark, Shane Geange, John Poulsen, Hank Stevens, Jada White
  • 48. References
      [1] Booth JG & Hobert JP, 1999. Journal of the Royal Statistical Society, Series B, 61(1):265–285. doi:10.1111/1467-9868.00176. URL http://links.jstor.org/sici?sici=1369-7412(1999)61%3A1%3C265%3AMGLMML%3E2.0.CO%3B2-C.
      [2] Breslow NE, 2004. In DY Lin & PJ Heagerty, eds., Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pp. 1–22. Springer. ISBN 0387208623.
      [3] Cordeiro GM & Ferrari SLP, Aug. 1998. Journal of Statistical Planning and Inference, 71(1-2):261–269. ISSN 0378-3758. doi:10.1016/S0378-3758(98)00005-6. URL http://www.sciencedirect.com/science/article/B6V0M-3V5CVRT-M/2/190f68a684dd08c569a7836ff59568e4.
      [4] Cordeiro GM, Paula GA, & Botter DA, 1994. International Statistical Review / Revue Internationale de Statistique, 62(2):257–274. ISSN 03067734. doi:10.2307/1403512. URL http://www.jstor.org/stable/1403512.
      [5] Gelman A, 2005. Annals of Statistics, 33(1):1–53. doi:10.1214/009053604000001048.
      [6] Goldman N & Whelan S, 2000. Molecular Biology and Evolution, 17(6):975–978.
      [7] Greven S, 2008. Non-Standard Problems in Inference for Additive and Linear Mixed Models. Cuvillier Verlag, Göttingen, Germany. ISBN 3867274916. URL http://www.cuvillier.de/flycms/en/html/30/-UickI3zKPS,3cEY=/Buchdetails.html?SID=wVZnpL8f0fbc.
      [8] Greven S & Kneib T, 2010. Biometrika, 97(4):773–789. URL http://www.bepress.com/jhubiostat/paper202/.
      [9] Hadfield JD, Feb. 2010. Journal of Statistical Software, 33(2):1–22. ISSN 1548-7660. URL http://www.jstatsoft.org/v33/i02.
      [10] Hurvich CM & Tsai C, Jun. 1989. Biometrika, 76(2):297–307. doi:10.1093/biomet/76.2.297. URL http://biomet.oxfordjournals.org/content/76/2/297.abstract.
      [11] Jiang J, Aug. 2008. The Annals of Statistics, 36(4):1669–1692. ISSN 0090-5364. doi:10.1214/07-AOS517. URL http://projecteuclid.org/euclid.aos/1216237296.
      [12] Kenward MG & Roger JH, 1997. Biometrics, 53(3):983–997.
      [13] Molenberghs G & Verbeke G, 2007. The American Statistician, 61(1):22–27. doi:10.1198/000313007X171322.
      [14] Richards SA, 2005. Ecology, 86(10):2805–2814. doi:10.1890/05-0074.
      [15] Rue H, Martino S, & Chopin N, 2009. Journal of the Royal Statistical Society, Series B, 71(2):319–392.
  • 49. References (continued)
      [16] Schaalje G, McBride J, & Fellingham G, 2002. Journal of Agricultural, Biological & Environmental Statistics, 7(4):512–524. URL http://www.ingentaconnect.com/content/asa/jabes/2002/00000007/00000004/art00004.
      [17] Spiegelhalter DJ, Best N et al., 2002. Journal of the Royal Statistical Society, Series B, 64:583–640.
      [18] Sung YJ, Jul. 2007. The Annals of Statistics, 35(3):990–1011. ISSN 0090-5364. doi:10.1214/009053606000001389. URL http://projecteuclid.org/euclid.aos/1185303995. Mathematical Reviews number (MathSciNet): MR2341695; Zentralblatt MATH identifier: 1124.62009.
      [19] Vaida F & Blanchard S, Jun. 2005. Biometrika, 92(2):351–370. doi:10.1093/biomet/92.2.351. URL http://biomet.oxfordjournals.org/cgi/content/abstract/92/2/351.
  • 50. Conclusions: Extras
      Spatial and temporal correlation (R-side effects): MASS:glmmPQL (sort of), GLMMarp, INLA; WinBUGS, AD Model Builder
      Additive models: amer, gamm4, mgcv
      Penalized methods [11]