Trondheim glmm

Talk on GLMMs at Trondheim

Published in: Technology, Education

  1. Generalized linear mixed models for ecologists: coping with non-normal, spatially and temporally correlated data. Ben Bolker, McMaster University, Departments of Mathematics & Statistics and Biology. 30 August 2011. Sections: Precursors, GLMMs, Results, Conclusions, References.
  2. Outline
     1. Precursors: Examples; Definitions; ANOVA vs. (G)LMMs
     2. GLMMs: Estimation; Inference
     3. Results: Coral symbionts; Glycera; Arabidopsis
     4. Conclusions
  4. Examples: Coral protection by symbionts
     [Figure: number of predation events and number of blocks, by symbiont treatment (none, shrimp, crabs, both).]
  5. Examples: Environmental stress: Glycera cell survival
     [Figure: panels for Anoxia and Normoxia at Osm = 12.8, 22.4, 32, 41.6, 51.2; axes: H2S (0, 0.03, 0.1, 0.32) and Copper (0, 33.3, 66.6, 133.3); scale from 0.0 to 1.0.]
  6. Examples: Arabidopsis response to fertilization & clipping
     [Figure: Log(1 + fruit set) for unclipped vs. clipped plants; panels: nutrient 1 and nutrient 8; color: genotype.]
  7. Definitions: Generalized linear models (GLMs)
     • non-normal data: binary, binomial, count (Poisson/negative binomial)
     • non-linearity: log/exponential, logit/logistic, via a link function L
     • flexibility via linear predictor: L(response) = a + b_i + cx ...
     • stable, robust, fast, easy to use
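As a minimal sketch of the linear-predictor idea on this slide, the following fits a Poisson GLM with a log link in base R. The data are simulated and every name and number (`trt`, `x`, the coefficients) is an illustrative assumption, not taken from the talk.

```r
# Simulated count data; all names and values here are illustrative assumptions.
set.seed(1)
n <- 200
x <- runif(n)                                  # continuous covariate
trt <- factor(rep(c("a", "b"), each = n / 2))  # two-level treatment
eta <- 0.5 + 0.7 * (trt == "b") + 1.2 * x      # linear predictor on the log scale
y <- rpois(n, lambda = exp(eta))               # Poisson response

# Poisson GLM with log link: log(E[y]) = a + b_i + c * x
fit <- glm(y ~ trt + x, family = poisson(link = "log"))
coef(fit)       # estimates on the link (log) scale
exp(coef(fit))  # multiplicative effects on the response scale
```

Exponentiating the coefficients turns additive effects on the log scale into multiplicative effects on the expected count, which is usually the easier quantity to report.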
  8. Definitions: Random vs. fixed effects
     • Fixed effects (FE): interested in specific levels (“treatments”)
     • Random effects (RE) [2]: interested in distribution (“blocks”)
       • Experimental
       • Temporal, spatial
       • Genera, species, genotypes
       • Individuals (“repeated measures”)
     • inference on population of blocks
     • (blocks randomly selected?)
     • (large number of blocks [> 5–7]?)
  12. ANOVA vs. (G)LMMs: Mixed models: classical approach
     • traditional approach to non-independence
     • nested, randomized block, split-plot ...
     • sum-of-squares decomposition/ANOVA: figure out treatment SSQ/df, error SSQ/df [3]
  13. ANOVA vs. (G)LMMs: You can use an ANOVA if ...
     • data are normal (or can be transformed)
     • responses are linear
     • design is (nearly) balanced
     • simple design (single or nested REs); not crossed REs, e.g. year effects that apply across all spatial blocks
     • no spatial or temporal correlation within blocks
  14. ANOVA vs. (G)LMMs: “Modern” mixed models
     • Data still normal(izable), linear, but unbalanced/crossed/correlated
     • Balance (dispersion of observations around block mean) with (dispersion of block means around overall average)
     • Good for large, messy data ... and when variation is interesting
  15. ANOVA vs. (G)LMMs: Shrinkage (Arabidopsis)
     [Figure: Arabidopsis block estimates; mean(log) fruit set by genotype.]
  16. ANOVA vs. (G)LMMs: Shrinkage (sparrows)
     [Figure: heterozygosity vs. log(harmonic mean pop size), by island: Hestmannøy, Sleneset, Gjerøy, Indre Kvarøy, Husøy, Selvær, Ytre Kvarøy, Aldra, Myken, Lovund, Onøy, Nesøy, Lurøy, Sundøy.]
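The shrinkage in these figures comes from balancing within-block dispersion against among-block dispersion. A rough base-R sketch of that balance follows. It assumes the two standard deviations (`tau`, `sigma`) are known and uses the plain grand mean as the overall average, both simplifications made only for illustration. A real mixed-model fit estimates these quantities from the data.

```r
# Hand-computed partial pooling for a random-intercept model (illustrative only).
set.seed(2)
n_blocks <- 12
n_per <- sample(2:15, n_blocks, replace = TRUE)  # unbalanced block sizes
tau <- 1.0     # among-block SD (assumed known for this sketch)
sigma <- 2.0   # within-block SD (assumed known for this sketch)

block <- factor(rep(seq_len(n_blocks), times = n_per))
b <- rnorm(n_blocks, mean = 0, sd = tau)         # true block effects
y <- 3 + b[as.integer(block)] + rnorm(length(block), mean = 0, sd = sigma)

raw_mean <- as.numeric(tapply(y, block, mean))   # unpooled block means
grand <- mean(y)                                 # simplification: plain grand mean
w <- tau^2 / (tau^2 + sigma^2 / n_per)           # weight on each block's own mean
shrunk <- grand + w * (raw_mean - grand)         # shrunken block estimates

data.frame(n = n_per, raw = raw_mean, shrunk = shrunk, weight = w)
```

Blocks with few observations get a small weight and are pulled strongly toward the overall average. Well-sampled blocks stay close to their own means.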
  17. ANOVA vs. (G)LMMs: GLMMs
     • Data not normal(izable), nonlinear
     • Standard distributions (Poisson, binomial etc.)
     • Specific forms of nonlinearity (exponential, logistic etc.)
     • Conceptually very similar to LMMs, but harder
  18. ANOVA vs. (G)LMMs: Challenges
     • Small # RE levels (<5–6)
     • Big data (> 1000 observations)
     • Spatial/temporal correlation structure (in GLMMs)
     • Unusual distributions of data (in GLMMs)
  20. Estimation: Penalized quasi-likelihood (PQL) [1]
     • flexible (e.g. handles spatial/temporal correlations)
     • least accurate: biased for small samples (low counts per block)
     • SAS PROC GLIMMIX, R MASS:glmmPQL
  21. Estimation: Laplace and Gauss-Hermite quadrature
     • more accurate than PQL: speed/accuracy tradeoff
     • lme4:glmer, glmmML, glmmADMB, R2ADMB (AD Model Builder), gamlss.mx:gamlssNP, repeated
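To make the Laplace idea concrete, the sketch below approximates one block's marginal likelihood in a Poisson random-intercept model and compares it with brute-force numerical integration. The counts, intercept and random-effect SD are made-up assumptions. `integrate()` is used only as a reference value; it is not Gauss-Hermite quadrature, and none of this reflects how the packages listed above are implemented internally.

```r
# Marginal likelihood of one block: integral over u of p(y | u) * p(u),
# with y_i ~ Poisson(exp(b0 + u)) and u ~ Normal(0, sigma^2).
y <- c(2, 0, 3, 1)   # counts in one block (made-up values)
b0 <- 0              # fixed intercept on the log scale (assumed)
sigma <- 1           # random-intercept SD (assumed)

# Log of the integrand, vectorized over u so integrate() can call it
log_integrand <- function(u) {
  sapply(u, function(ui) {
    sum(dpois(y, lambda = exp(b0 + ui), log = TRUE)) +
      dnorm(ui, mean = 0, sd = sigma, log = TRUE)
  })
}

# Reference value by adaptive numerical integration
exact <- integrate(function(u) exp(log_integrand(u)),
                   lower = -10, upper = 10, rel.tol = 1e-8)$value

# Laplace: second-order expansion of the log integrand around its mode
u_hat <- optimize(log_integrand, interval = c(-5, 5), maximum = TRUE)$maximum
curv <- length(y) * exp(b0 + u_hat) + 1 / sigma^2  # minus the second derivative at u_hat
laplace <- exp(log_integrand(u_hat)) * sqrt(2 * pi / curv)

c(exact = exact, laplace = laplace, ratio = laplace / exact)
```

The Laplace value uses a single evaluation point (the mode), which is why it is fast. Quadrature with more points trades speed for accuracy.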
  22. Estimation: Bayesian approaches
     • usually slow but flexible
     • best confidence intervals
     • must specify priors, assess convergence
     • specialized: glmmAK, MCMCglmm [6], INLA
     • general: BUGS (glmmBUGS, R2WinBUGS, BRugs, WinBUGS, OpenBUGS, R2jags, rjags, JAGS)
  23. Estimation: Extensions
     • Overdispersion: variance > expected from statistical model
       • quasi-likelihood: MASS:glmmPQL
       • overdispersed distributions (e.g. negative binomial): glmmADMB, gamlss.mx:gamlssNP
       • observation-level random effects (e.g. lognormal-Poisson): lme4, MCMCglmm
     • Zero-inflation: overabundance of zeros in a discrete distribution
       • zero-inflated models: glmmADMB, MCMCglmm
       • hurdle models: MCMCglmm
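A quick way to see overdispersion is to compare the Pearson chi-square statistic with the residual degrees of freedom. The sketch below does this in base R for a plain (non-mixed) Poisson GLM fitted to simulated negative binomial counts. The simulation settings are assumptions chosen only to produce visible overdispersion.

```r
# Overdispersion check on a Poisson GLM (fixed effects only, simulated data).
set.seed(3)
n <- 300
x <- rnorm(n)
mu <- exp(1 + 0.5 * x)
y <- rnbinom(n, mu = mu, size = 1)   # negative binomial: variance = mu + mu^2 / size

fit <- glm(y ~ x, family = poisson)

# Pearson chi-square / residual df; close to 1 if the Poisson variance holds
phi <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
phi

# Quasi-likelihood version: same point estimates, standard errors scaled by sqrt(phi)
qfit <- glm(y ~ x, family = quasipoisson)
summary(qfit)$dispersion
```

A ratio well above 1 suggests that one of the remedies listed on the slide (quasi-likelihood, an overdispersed distribution, or observation-level random effects) is worth considering.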
  24. Inference: Wald tests/CIs
     • Widely available (e.g. summary())
     • Assume data set is large/well-behaved
     • Always approximate, sometimes awful; bad for variance estimates
  25. Inference: Likelihood ratio tests
     • Compare models (easy)
     • Confidence intervals: expensive and rarely available (lme4a for LMMs)
     • Asymptotic assumption
       • LMMs: F tests; estimate “equivalent” denominator df? approximations [8, 13]: doBy:KRmodcomp
       • don’t really know what to do for GLMMs
       • OK if number of obs ≫ number of parameters and large # of blocks ...
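The difference between a Wald test and a likelihood ratio test can be illustrated on an ordinary logistic GLM in base R. This sketch uses simulated data and fixed effects only, so it shows the mechanics of the two tests rather than the mixed-model complications (denominator df, boundary effects) raised on the slides.

```r
# Wald vs. likelihood ratio test for one coefficient (simulated binary data).
set.seed(4)
n <- 60
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * x))

m0 <- glm(y ~ 1, family = binomial)   # null model
m1 <- glm(y ~ x, family = binomial)   # working model

# Wald: estimate +/- z * SE, based on curvature at the maximum
est <- coef(summary(m1))["x", ]
wald_ci <- est[["Estimate"]] + c(-1, 1) * qnorm(0.975) * est[["Std. Error"]]
wald_p <- est[["Pr(>|z|)"]]

# Likelihood ratio: twice the log-likelihood difference vs. chi-square(1)
lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
lrt_p <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

c(wald_p = wald_p, lrt_p = lrt_p)
```

With small samples the two p-values can differ noticeably, since both rest on asymptotic approximations.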
  26. Inference: Information-theoretic approaches
     • Above issues apply, but less well understood [4, 5, 7, 11]: AIC is asymptotic too
     • For comparing models with different REs, or for AICc, what is p?
     • “Level of focus” issue: what are you trying to predict? [5, 14, 15] (cAIC)
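For reference, the small-sample correction behind AICc is usually written as below, where $\hat{L}$ is the maximized likelihood, $n$ the number of observations and $p$ the number of estimated parameters. This is the standard textbook form (cf. [7]) rather than something stated on the slide. The slide's question is what to use for $p$ once random effects are present.

```latex
\mathrm{AIC} = -2\log\hat{L} + 2p, \qquad
\mathrm{AIC}_c = \mathrm{AIC} + \frac{2p(p+1)}{n - p - 1}
```

For a random effect with $q$ levels, $p$ could plausibly count anything from 1 (the variance parameter alone) up to $q$ (one per level). Which count is appropriate depends on the level of focus.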
  27. Inference: Bootstrapping
     1. fit null model to data
     2. simulate “data” from null model
     3. fit null and working model, compute likelihood difference
     4. repeat to estimate null distribution
     • simulate/refit methods; bootMer in lme4a (LMMs only!), doBy:PBModComp, or “by hand”:

         library(lme4)  # provides refit() and the simulate()/logLik() methods for fitted mixed models
         # m0: fitted null model, m1: fitted working model
         pboot <- function(m0, m1) {
           s <- simulate(m0)[[1]]  # one simulated response vector from the null model
           2 * (logLik(refit(m1, s)) - logLik(refit(m0, s)))  # likelihood-ratio statistic
         }
         replicate(1000, pboot(fm2, fm1))  # fm2: null fit, fm1: working fit
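The same four steps can be run end to end in base R, including the final conversion of the null distribution into a p-value. The sketch below uses a fixed-effects Poisson GLM on simulated data so that it runs without any mixed-model package. The helper name `pboot_glm`, the data and the choice of 200 replicates are all illustrative assumptions.

```r
# Parametric bootstrap of a likelihood ratio statistic (base R, simulated data).
set.seed(5)
n <- 80
d <- data.frame(x = rnorm(n))
d$y <- rpois(n, lambda = exp(0.3 + 0.4 * d$x))

m0 <- glm(y ~ 1, family = poisson, data = d)   # step 1: null model
m1 <- glm(y ~ x, family = poisson, data = d)   # working model
obs <- as.numeric(2 * (logLik(m1) - logLik(m0)))  # observed statistic

pboot_glm <- function() {
  d_sim <- d
  d_sim$y <- simulate(m0)[[1]]          # step 2: simulate a response from the null
  f0 <- update(m0, data = d_sim)        # step 3: refit both models to simulated data
  f1 <- update(m1, data = d_sim)
  as.numeric(2 * (logLik(f1) - logLik(f0)))
}

null_dist <- replicate(200, pboot_glm())  # step 4: estimate the null distribution

# Bootstrap p-value, with +1 so the estimate is never exactly zero
p_boot <- (sum(null_dist >= obs) + 1) / (length(null_dist) + 1)
p_boot
```

The number of replicates bounds the smallest p-value that can be reported, so more replicates are needed when small p-values matter.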
  28. Inference: Bayesian inference
     • CIs, prediction intervals etc. computationally “free” after estimation
     • Post hoc MCMC sampling (glmmADMB, R2ADMB, lme4:MCMCsamp)
  29. Inference: Bottom line
     • Large data: computation slow, inference easy
     • Bayesian: computation slow, inference easy
     • Small data: computation fast
       • Problems with zero variance (blme), correlations = ±1
       • Bootstrapping for inference?
  31. Coral symbionts: comparison of results
     [Figure: regression estimates (axis from −6 to 2) for Added symbiont, Crab vs. Shrimp, and Symbiont; methods compared: GLM (fixed), GLM (pooled), PQL, Laplace, AGQ.]
  32. Glycera: fit comparisons
     [Figure: effect on survival (logit scale, −60 to 60) for the main effects Osm, Cu, H2S, Anoxia and their two-, three- and four-way interactions; methods compared: MCMCglmm, glmer(OD:2), glmer(OD), glmmML, glmer.]
  33. Glycera: parametric bootstrap results
     [Figure: inferred p value vs. true p value (0.001 to 0.5) in panels for Osm, Cu, H2S and Anoxia; reference distributions: normal, t7, t14.]
  34. Arabidopsis: results
     [Figure: regression estimates (axis from −1.0 to 1.0) for statusTransplant, statusPetri.Plate, rack2, nutrient8:amdclipped, amdclipped, nutrient8.]
  36. What about space and/or time?
     • if in blocks, no problem (crossed random effects) [10]
     • test residuals, try to fail to reject the null hypothesis of no autocorrelation
     • if normal (LMM), corStruct in lme, spdep
     • otherwise ... spatcounts, geoRglm, geoBUGS, ... ???
     • big data [9]
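One simple way to test residuals for temporal autocorrelation is a Ljung-Box test on the residuals of the fitted model. The sketch below uses base R only, with a simulated time series whose errors are independent by construction. The lag of 5 and the linear trend model are arbitrary assumptions, and a spatial analogue would need a different tool.

```r
# Residual autocorrelation check for a temporal trend model (simulated data).
set.seed(6)
n <- 100
time_idx <- seq_len(n)
y <- 2 + 0.05 * time_idx + rnorm(n)   # errors are independent by construction

fit <- lm(y ~ time_idx)
res <- residuals(fit)

# Ljung-Box test: null hypothesis is no autocorrelation up to the given lag
bt <- Box.test(res, lag = 5, type = "Ljung-Box")
bt$p.value

# Lag-1 sample autocorrelation of the residuals
lag1 <- acf(res, plot = FALSE)$acf[2]
lag1
```

A large p-value here means the test found no evidence of autocorrelation. It does not show that none exists, which is the caution behind the slide's "try to fail to reject" wording.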
  37. Primary tools
     • Special-purpose:
       • lme4: multiple/crossed REs, (profiling): fast
       • MCMCglmm: Bayesian, fairly flexible
       • glmmADMB: negative binomial, zero-inflated etc.
     • General-purpose:
       • AD Model Builder (and interfaces)
       • BUGS/JAGS (and interfaces)
       • INLA [12]
     • Tools are getting better, but still not easy!
     • Info: http://glmm.wikidot.com
     • Slides: http://www.slideshare.net/bbolker
  38. Acknowledgements
     • Funding: NSF, NSERC, NCEAS
     • Data: Josh Banta and Massimo Pigliucci (Arabidopsis); Adrian Stier and Seabird McKeon (coral symbionts); Courtney Kagan, Jocelynn Ortega, David Julian (Glycera)
     • Co-authors: Mollie Brooks, Connie Clark, Shane Geange, John Poulsen, Hank Stevens, Jada White
  39. References
     [1] Breslow NE, 2004. In DY Lin & PJ Heagerty, eds., Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pp. 1–22. Springer. ISBN 0387208623.
     [2] Gelman A, 2005. Annals of Statistics, 33(1):1–53. doi:10.1214/009053604000001048.
     [3] Gotelli NJ & Ellison AM, 2004. A Primer of Ecological Statistics. Sinauer, Sunderland, MA.
     [4] Greven S, 2008. Non-Standard Problems in Inference for Additive and Linear Mixed Models. Cuvillier Verlag, Göttingen, Germany. ISBN 3867274916. URL http://www.cuvillier.de/flycms/en/html/30/-UickI3zKPS,3cEY=/Buchdetails.html?SID=wVZnpL8f0fbc.
     [5] Greven S & Kneib T, 2010. Biometrika, 97(4):773–789. URL http://www.bepress.com/jhubiostat/paper202/.
     [6] Hadfield JD, Feb. 2010. Journal of Statistical Software, 33(2):1–22. ISSN 1548-7660. URL http://www.jstatsoft.org/v33/i02.
     [7] Hurvich CM & Tsai CL, Jun. 1989. Biometrika, 76(2):297–307. doi:10.1093/biomet/76.2.297. URL http://biomet.oxfordjournals.org/content/76/2/297.abstract.
     [8] Kenward MG & Roger JH, 1997. Biometrics, 53(3):983–997.
     [9] Latimer AM, Banerjee S et al., 2009. Ecology Letters, 12(2):144–154.
     [10] Ozgul A, Oli MK et al., Apr. 2009. Ecological Applications: A Publication of the Ecological Society of America, 19(3):786–798. ISSN 1051-0761. URL http://www.ncbi.nlm.nih.gov/pubmed/19425439. PMID: 19425439.
     [11] Richards SA, 2005. Ecology, 86(10):2805–2814. doi:10.1890/05-0074.
     [12] Rue H, Martino S, & Chopin N, 2009. Journal of the Royal Statistical Society, Series B, 71(2):319–392.
     [13] Schaalje G, McBride J, & Fellingham G, 2002. Journal of Agricultural, Biological & Environmental Statistics, 7(14):512–524. URL http://www.ingentaconnect.com/content/asa/jabes/2002/00000007/00000004/art00004.
     [14] Spiegelhalter DJ, Best N et al., 2002. Journal of the Royal Statistical Society B, 64:583–640.
     [15] Vaida F & Blanchard S, Jun. 2005. Biometrika, 92(2):351–370. doi:10.1093/biomet/92.2.351. URL http://biomet.oxfordjournals.org/cgi/content/abstract/92/2/351.
  40. Extras
     • Spatial and temporal correlation (R-side effects): MASS:glmmPQL (sort of), GLMMarp, INLA; WinBUGS, AD Model Builder
     • Additive models: amer, gamm4, mgcv, lmeSplines
     • Ordinal models: ordinal
     • Population genetics: pedigreemm, kinship
     • Survival: coxme, kinship, phmm
