Threads 2013

  1. Sections: Definitions, Statistics, Computation, Sociological, Conclusions, References
     General-purpose tools for generalized linear mixed models
     Ben Bolker, McMaster University, Mathematics & Statistics and Biology
     13 September 2013
     Ben Bolker, GLMMs
  2. Outline
       1. Definitions and context
       2. Statistical challenges
       3. Computational challenges
       4. Sociological challenges
       5. Conclusions
  4. Generalized linear mixed models
     GLMMs: a statistical modeling framework incorporating:
       • Linear combinations of categorical and continuous predictors, and interactions
       • Response distributions in the exponential family (binomial, Poisson, and extensions)
       • Any smooth, monotonic link function (e.g. logistic, exponential models)
       • Flexible combinations of blocking factors (clustering; random effects)
     Applications in ecology, neurobiology, behaviour, epidemiology, real estate, ...
  8. Technical definition
       $Y_i \sim \mathrm{Distr}(g^{-1}(\eta_i), \phi)$, where $Y_i$ is the response, $\mathrm{Distr}$ the conditional distribution, $g^{-1}$ the inverse link function, and $\phi$ the scale parameter
       $\eta = X\beta + Zb$, where $\eta$ is the linear predictor, $X\beta$ the fixed effects, and $Zb$ the random effects
       $b \sim \mathrm{MVN}(0, \Sigma(\theta))$, where $b$ are the conditional modes and $\Sigma(\theta)$ is the variance-covariance matrix
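The three lines of the technical definition can be read as a recipe for simulating data. The following is a minimal sketch, assuming a Bernoulli response with a logit link, a single continuous predictor, one random intercept per cluster, and Σ(θ) = θ²I. The function name `simulate_glmm` and all parameter values are illustrative assumptions, not taken from the talk.

```python
import math
import random


def simulate_glmm(n_clusters, n_per_cluster, beta, theta, seed=1):
    """Simulate Bernoulli responses from a random-intercept logistic GLMM.

    Assumed model: eta = beta[0] + beta[1] * x + b[cluster],
    with b[cluster] ~ Normal(0, theta^2) drawn independently per cluster.
    Returns a list of (cluster, x, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        b = rng.gauss(0.0, theta)              # random effect for this cluster
        for _ in range(n_per_cluster):
            x = rng.uniform(-1.0, 1.0)         # one continuous predictor
            eta = beta[0] + beta[1] * x + b    # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))  # inverse link g^{-1}(eta)
            y = 1 if rng.random() < mu else 0  # draw from the conditional distribution
            rows.append((cluster, x, y))
    return rows


data = simulate_glmm(n_clusters=10, n_per_cluster=20, beta=(-0.5, 1.0), theta=1.0)
```

Observations within a cluster share the same draw of `b`, which is what induces the within-cluster correlation that a plain GLM ignores.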
  10. Estimation
      Maximum likelihood estimation:
        $L(Y_i \mid \theta, \beta) = \int \cdots \int L(Y_i \mid \theta, \beta') \times L(\beta' \mid \Sigma(\theta)) \, d\beta'$
        (left-hand side: likelihood; first factor: data | random effects; second factor: random effects)
        • Deterministic (precision vs. computational cost): penalized quasi-likelihood, Laplace approximation, adaptive Gauss-Hermite quadrature (Breslow, 2004), ...
        • Monte Carlo: frequentist and Bayesian (Booth and Hobert, 1999; Ponciano et al., 2009; Sung, 2007)
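To make the precision vs. cost trade-off concrete, here is a minimal sketch of the Laplace approximation to the integral over the random effects, for the simplest possible case: one cluster of Bernoulli outcomes with a logit link and a single normally distributed random intercept. The reference value comes from a brute-force midpoint rule on a fine grid, used here only as a stand-in for a more accurate method (it is not adaptive Gauss-Hermite quadrature). All function names and the example data are assumptions made for illustration.

```python
import math


def log_integrand(b, y, beta0, theta):
    """log[ L(y | b) * Normal(b; 0, theta^2) ] for one cluster."""
    eta = beta0 + b
    loglik = sum(yi * eta - math.log1p(math.exp(eta)) for yi in y)
    logprior = -0.5 * math.log(2.0 * math.pi * theta ** 2) - b ** 2 / (2.0 * theta ** 2)
    return loglik + logprior


def laplace_loglik(y, beta0, theta, n_iter=25):
    """Laplace approximation: expand the log integrand to second order at its mode."""
    n, total = len(y), sum(y)
    b = 0.0
    for _ in range(n_iter):                        # Newton iterations for the mode
        mu = 1.0 / (1.0 + math.exp(-(beta0 + b)))
        grad = total - n * mu - b / theta ** 2
        hess = -n * mu * (1.0 - mu) - 1.0 / theta ** 2
        b -= grad / hess
    mu = 1.0 / (1.0 + math.exp(-(beta0 + b)))
    hess = -n * mu * (1.0 - mu) - 1.0 / theta ** 2  # curvature at the mode
    return log_integrand(b, y, beta0, theta) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-hess)


def brute_force_loglik(y, beta0, theta, half_width=8.0, n_steps=4000):
    """Midpoint-rule integration over [-half_width*theta, half_width*theta]."""
    lo = -half_width * theta
    step = 2.0 * half_width * theta / n_steps
    total = sum(
        math.exp(log_integrand(lo + (k + 0.5) * step, y, beta0, theta))
        for k in range(n_steps)
    )
    return math.log(total * step)


y = [1] * 7 + [0] * 3                              # hypothetical cluster: 7 successes in 10 trials
approx = laplace_loglik(y, beta0=0.0, theta=1.0)
reference = brute_force_loglik(y, beta0=0.0, theta=1.0)
```

The Laplace version needs only a handful of Newton steps per cluster, while the grid evaluates the integrand thousands of times; the two log-likelihoods agree closely here but not exactly.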
  11. Estimation: example (McKeon et al., 2012)
      [Figure: log-odds of predation (axis from -6 to 2) for three effects (Symbiont; Crab vs. Shrimp; Added symbiont), estimated by five methods: GLM (fixed), GLM (pooled), PQL, Laplace, AGQ]
  12. Inference
      Big problem. Inferential tools: either asymptotic or taken from classical linear models
        • boundary solutions (Stram and Lee, 1994)
        • the great p-value/degrees of freedom debate
        • small numbers of clusters
        • solutions: computational and/or Bayesian (parametric bootstrap, MCMC)
      [Figure: inferred p value vs. true p value (axes from about 0.02 to 0.08); panels: Osm, Cu, H2S, Anoxia]
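As a small illustration of the boundary problem, consider the simplest case of a likelihood ratio test of whether a single variance component is zero. The null value sits on the edge of the parameter space, and the usual asymptotic reference distribution for this case is a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom, rather than a plain chi-square with one degree of freedom. The sketch below assumes exactly that setting; the function names are hypothetical, and the test statistic is assumed to have been computed elsewhere.

```python
import math


def chi2_1_sf(x):
    """Upper tail probability of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def naive_lrt_pvalue(lrt_stat):
    """p-value that ignores the boundary and uses chi-square(1) directly."""
    return 1.0 if lrt_stat <= 0.0 else chi2_1_sf(lrt_stat)


def boundary_lrt_pvalue(lrt_stat):
    """p-value under the 50:50 mixture of chi-square(0) and chi-square(1)."""
    return 1.0 if lrt_stat <= 0.0 else 0.5 * chi2_1_sf(lrt_stat)
```

In this setting the naive p-value is twice the mixture p-value for any positive statistic, so ignoring the boundary makes the test conservative.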
  14. Sparse matrix algorithms
        • repeated decomposition of large, sparse matrices (especially Z)
        • fill-reducing permutation to improve sparsity pattern
        • further improvements possible: better matrix representation, parallelization?
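Why the ordering matters can be seen on a toy example. The sketch below builds a symmetric positive definite "arrow" matrix (a diagonal plus one dense row and column), factors it with a plain dense Cholesky routine, and counts the nonzeros in the factor for two orderings. This is only an illustration of fill-in under stated assumptions: the matrix, the function names, and the hand-picked permutation are all hypothetical, and real sparse solvers choose the permutation with heuristics and never form dense factors.

```python
import math


def cholesky(a):
    """Dense lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def arrow_matrix(n, dense_first):
    """Diagonal matrix plus one dense row/column, placed first or last."""
    hub = 0 if dense_first else n - 1
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = float(n)                 # diagonally dominant, hence positive definite
        if i != hub:
            a[i][hub] = a[hub][i] = 1.0
    return a


def count_nonzeros(l, tol=1e-12):
    """Number of entries of the factor that are not (numerically) zero."""
    return sum(1 for row in l for value in row if abs(value) > tol)


n = 6
fill_dense_first = count_nonzeros(cholesky(arrow_matrix(n, dense_first=True)))
fill_dense_last = count_nonzeros(cholesky(arrow_matrix(n, dense_first=False)))
```

With the dense row and column ordered first, the factor's lower triangle fills in completely; moving them last, which is a symmetric permutation of the same matrix, produces no fill beyond the original pattern.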
  15. Bounded optimization
        • Parameterize variance-covariance matrix Σ(θ) (Pinheiro and Bates, 1996)
        • Positive definite or only semi-definite?
        • Disadvantages of transforming to unconstrain
        • (Disadvantages of boundary solutions)
      [Figure: deviance plotted in two panels, "raw" and "log"]
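One way to transform to an unconstrained scale is a log-Cholesky parameterization, in which Σ = LLᵀ and the diagonal of L is stored on the log scale. The sketch below shows this for a 2×2 matrix only; the function names and the ordering of the elements of `theta` are assumptions made for illustration, not the layout used by any particular package.

```python
import math


def sigma_from_log_cholesky(theta):
    """Map unconstrained theta = (log L11, L21, log L22) to Sigma = L L^T (2x2)."""
    l11 = math.exp(theta[0])   # diagonal entries are forced to be positive
    l21 = theta[1]             # the off-diagonal entry is unconstrained
    l22 = math.exp(theta[2])
    return [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]


def is_positive_definite_2x2(s):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return s[0][0] > 0.0 and s[0][0] * s[1][1] - s[0][1] ** 2 > 0.0


sigma = sigma_from_log_cholesky((0.0, 0.5, -1.0))
```

Every finite `theta` gives a positive definite Σ, so an unconstrained optimizer can be used. The cost is that a variance of exactly zero is only reached as a log-scale parameter goes to minus infinity, so a semi-definite solution on the boundary cannot be represented by any finite `theta`.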
  17. Sociological issues
        • The curse of neophilia
        • Wide user base: "As usual when software for complicated statistical inference procedures is broadly disseminated, there is potential for abuse and misinterpretation." (Breslow, 2004)
        • What if there is no good answer? "Do no harm" vs. "better me than someone else"
        • Diagnostics and warning messages
        • End users vs. downstream developers
  19. Next steps
        • Alternative platforms/languages
        • Flexible correlation structures: spatial, temporal, phylogenetic, ...
        • Improved MCMC methods?
        • Simulation tests of inferential tools (sigh)
  20. Is it science?
      "Science is what we understand well enough to explain to a computer. Art is everything else we do." (Donald Knuth)
      [Figure: articles per month (axis from 10 to 50) by date (2006 to 2012); key: glmm, lme4]
  21. Acknowledgments
        • lme4: Doug Bates, Martin Mächler, Steve Walker
        • Data: Adrian Stier (UBC/OSU), Sea McKeon (Smithsonian), David Julian (UF)
        • NSERC (Discovery)
        • SHARCnet
  22. References
        • Booth, J.G. and Hobert, J.P., 1999. Journal of the Royal Statistical Society. Series B, 61(1):265-285. doi:10.1111/1467-9868.00176.
        • Breslow, N.E., 2004. In D.Y. Lin and P.J. Heagerty, editors, Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pages 1-22. Springer. ISBN 0387208623.
        • McKeon, C.S., Stier, A., et al., 2012. Oecologia, 169(4):1095-1103. ISSN 0029-8549. doi:10.1007/s00442-012-2275-2.
        • Pinheiro, J.C. and Bates, D.M., 1996. Statistics and Computing, 6(3):289-296. doi:10.1007/BF00140873.
        • Ponciano, J.M., Taper, M.L., et al., 2009. Ecology, 90(2):356-362. ISSN 0012-9658.
        • Stram, D.O. and Lee, J.W., 1994. Biometrics, 50(4):1171-1177.
        • Sung, Y.J., 2007. The Annals of Statistics, 35(3):990-1011. ISSN 0090-5364. doi:10.1214/009053606000001389.
