1. Overview New methods War stories Conclusions References
Statistical machismo and common sense
Ben Bolker, McMaster University
Departments of Mathematics & Statistics and Biology
ISEC
July 2014
Outline
1 Overview
Statistical machismo
Why do ecological statistics?
2 New methods
Desiderata
Statisticians and software
3 War stories
4 Conclusions
Acknowledgements
People: Steve Walker, Mollie Brooks, Mike McCoy
Support: NSERC Discovery grant
Statistical machismo
blog post by Brian McGill on Dynamic Ecology
≈ method ageism (M. Brewer)
criticizes unnecessarily fancy statistics, e.g.
Bonferroni corrections
phylogenetic corrections
spatial regression
estimation of detectability
Bayesian methods
Also cf. Murtaugh (2007; 2009; 2014)
Statistical machismo (2)
Slippery slope: what is good enough? (GLM vs ANOVA)
McGill: "And when the p-values are 0.0000001 and unlikely to change . . ."
Criticizing dogma, not methods per se
Does this apply to statistical ecologists?
Are we enabling bad practice?
Caveat: researcher/teacher vs. researcher/consultant niche
Why do ecological statistics?
Intellectual satisfaction
Novelty
Seek elegant/rigorous solutions
Mathematical or computational challenges
Solve scientific and societal problems
Knuth: "The important thing, once you have enough to eat and a nice house, is what you can do for others, what you can contribute to the enterprise as a whole"
Basic science
Hypothesis testing (broad sense)
Links to ecological theory
Applied science
Prediction
Decision analysis
Technology transfer
(where possible)
Career advancement
get a job/get tenure/get funded
satisfy colleagues
be either useful or rigorous!
avoid too much collaboration
curse of novelty
2 New methods
Desiderata
Robustness
less reliance on assumptions
(Box: "all models are wrong . . .")
computational robustness
handle crappy (small, noisy, missing . . . ) data
robust to user error
examples: sandwich estimators; M-estimators;
design-based statistics; permutation tests
tradeoffs: complexity, loss of interpretability, increased bias, decreased efficiency
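As one concrete instance of the permutation tests listed above, here is a minimal sketch of an exact two-sided permutation test for a difference in means. The function name and the toy data are hypothetical and do not come from the talk; it enumerates every relabelling, so it is only practical for very small samples.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(indices_a):
        # Absolute difference in means when these indices are labelled "A".
        sum_a = sum(pooled[i] for i in indices_a)
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(range(n_a))
    # Enumerate every way of labelling n_a of the pooled values as group A.
    splits = list(combinations(range(len(pooled)), n_a))
    # Small tolerance so ties with the observed value count as "as extreme".
    as_extreme = sum(abs_mean_diff(s) >= observed - 1e-12 for s in splits)
    return as_extreme / len(splits)


# Hypothetical toy data: two clearly separated groups of four values each.
p_value = exact_permutation_p_value([1, 2, 3, 4], [10, 11, 12, 13])
print(p_value)
```

The test assumes only that group labels are exchangeable under the null hypothesis, which is the sense in which it relies less on distributional assumptions. The cost is the tradeoff named above: enumeration grows combinatorially, so larger samples need random resampling instead.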
Speed and scalability
just get a bigger computer, or a cluster
quantitative becomes qualitative
modern methods (ensembles, resampling, MCMC . . . )
increase need for speed
(but usually trivially parallelizable)
scalability
(parallel implementations; computational complexity)
examples: ensemble-based approaches; kernel-based methods;
sparse matrix computation; map/reduce
tradeoffs: complexity, lack of generality
Interpretability (Warton and Hui, 2010)
connection to mechanistic/theoretical models
. . . or to statistical paradigms (e.g. linear models)
maybe a hard sell
examples: Bayesian statistics (P(H|D) vs. P(D|H));
GLMs
counterargument: Breiman (2001)
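The P(H|D) vs. P(D|H) distinction can be made concrete with Bayes' rule. The sketch below assumes a single hypothesis H and its complement; the function name and the numbers are hypothetical, chosen only for illustration.

```python
def posterior_probability(prior_h, p_data_given_h, p_data_given_not_h):
    """Return P(H|D) from P(H), P(D|H) and P(D|not H) via Bayes' rule."""
    # P(D) by the law of total probability over H and not-H.
    evidence = prior_h * p_data_given_h + (1 - prior_h) * p_data_given_not_h
    return prior_h * p_data_given_h / evidence


# Hypothetical numbers: the data are four times likelier under H than under
# not-H, but H starts with a prior probability of only 0.1.
posterior = posterior_probability(prior_h=0.1,
                                  p_data_given_h=0.8,
                                  p_data_given_not_h=0.2)
print(posterior)
```

With these inputs P(D|H) is 0.8 while P(H|D) is well below one half, which shows why the two quantities should not be read interchangeably.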
Increased correctness (O'Hara and Kotze, 2010; Warton and Hui, 2010)
decrease bias/type I error; improve coverage
how big is bias generally? how much does it matter?
unpopular with ecologists!
especially if it lowers power/makes effects disappear . . .
Statistical power/efficiency
squeezing more out of data is always good
. . . but how much?
maybe irrelevant if there's enough data (econometrics, e.g.
Angrist and Pischke (2009))
examples: GLMs vs. least-squares; random vs. fixed effects
tradeoffs: complexity, model-dependence/loss of robustness
Handle new kinds of data / solve new problems
Hard to argue with this one!
Most often, methods for combining different characteristics:
spatial/temporal/phylog. correlation
non-Normal data
missing data
regularization/smoothing/dimension limitation
etc. ...
Ease of use
general, flexible frameworks (maybe)
(or domain-specific solutions)
interfaces
examples: MaxEnt; information-theoretic approaches
counterexamples: BUGS/JAGS, AD Model Builder?
Statisticians and software
User friendliness/interfaces
how hard should you try to make your methods usable?
(Dynamic Ecology blog post)
what kind of interface?
equations
pseudocode
code blobs
package
GUI
Technology transfer and software engineering
Technology transfer might be the most useful we can be
Boring? Unrewarding? New tech has to come from somewhere
statistical software now commoditized and often free
good statistical software takes a lot of time and effort
(and few users want to pay for it)
eating dog food/developer blindness
building software exposes new issues:
Knuth: Science is what we understand well enough to explain
to a computer. Art is everything else we do
Responsibilities
Do you have a responsibility to
Fix bugs?
Provide features?
Advise users?
Does the "no liability" clause of free software licenses
(e.g. GPL §16) absolve us of moral responsibility?
3 War stories
Spatial moment equations
general theoretical framework for ecological dynamics of
spatial point processes (Ovaskainen et al., 2014)
used mostly to understand qualitative ecological dynamics
framework for connecting spatial theory and data?
in particular, should be able to deconvolve effects of (e.g.) habitat and habitat preference
$$N(x) = \int E(y - x)\, H(y)\, dy = (E * H)(x)$$
then
$$\tilde{H}(\omega) = \frac{C_{EN}}{C_{EE}}$$
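A minimal, noise-free sketch of the deconvolution idea on a discrete periodic (circular) transect follows. It makes several assumptions that are not in the slides: E and N are treated as observed and H as the unknown; C_EN is taken as the product of the discrete Fourier transforms of E and N, and C_EE as the squared modulus of the transform of E. Those conventions match the E(y − x) form of the integral for real-valued E. All function names and data are hypothetical.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform (fine for short sequences)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * x / n)
                for x, v in enumerate(values))
            for k in range(n)]


def inverse_dft(coeffs):
    """Inverse of dft()."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * k * x / n)
                for k, c in enumerate(coeffs)) / n
            for x in range(n)]


def forward_model(E, H):
    """Discrete, periodic version of N(x) = sum over y of E(y - x) H(y)."""
    n = len(E)
    return [sum(E[(y - x) % n] * H[y] for y in range(n)) for x in range(n)]


def deconvolve(E, N):
    """Recover H from E and N as a ratio of spectra (assumed convention)."""
    E_hat, N_hat = dft(E), dft(N)
    C_EN = [e * m for e, m in zip(E_hat, N_hat)]   # cross-spectrum of E and N
    C_EE = [abs(e) ** 2 for e in E_hat]            # power spectrum of E
    H_hat = [c / p for c, p in zip(C_EN, C_EE)]    # requires C_EE > 0
    return [h.real for h in inverse_dft(H_hat)]


# Hypothetical data on a ring of 8 sites.
E = [4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
H_true = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
N = forward_model(E, H_true)
H_recovered = deconvolve(E, N)
print([round(h, 6) for h in H_recovered])
```

The division fails wherever the power spectrum of E is zero or near zero, and real data add noise, so a practical version would need some regularization. This sketch deliberately ignores those details.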
Deconvolution: example
[Figure: curves plotted against distance; labels: habitat preference, habitat, correlations, preference, population]
so far just a gleam in my eye
many details (non-Normality, nonlinear response) . . .
Estimating growth autocorrelation in unmarked individuals
(Brooks et al., 2013)
uncorrelated growth: σ²(t) linear in t
correlated growth: σ²(t) quadratic in t
straightforward estimation
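The linear-versus-quadratic contrast can be checked with a small calculation. The sketch below assumes a deliberately simplified structure: size is the sum of t growth increments, each with variance sigma2, with one common correlation rho between every pair of increments. This is an illustrative assumption, not necessarily the model of Brooks et al. (2013), and the function name is hypothetical.

```python
def size_variance(t, sigma2, rho):
    """Variance of the sum of t increments with common pairwise correlation rho.

    Assumes 0 <= rho <= 1 and equal increment variance sigma2.
    """
    # t variance terms plus t * (t - 1) ordered pairs of covariance terms.
    return sigma2 * (t + rho * t * (t - 1))


for t in (1, 2, 4, 8):
    uncorrelated = size_variance(t, sigma2=1.0, rho=0.0)  # grows like t
    correlated = size_variance(t, sigma2=1.0, rho=1.0)    # grows like t**2
    print(t, uncorrelated, correlated)
```

Intermediate values of rho give a mix of linear and quadratic terms, so under this simplified model the curvature of the among-individual variance over time carries the information about growth autocorrelation.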
Growth autocorrelation (cont.)
power analysis: require 500 individuals @ 25% autocorrelation, 100 individuals @ 50% autocorrelation
available data: 550 individuals
so far unused (cf. Lavine et al. (2002))
Mixed stock estimation
(Bolker et al., 2003, 2007; Okuyama and Bolker, 2005)
Bayesian mixed stock analysis, following Pella and Masuda (2001)
unconditional likelihood: account for sampling error in sources
better confidence intervals
many-to-many methods (Chen et al., 2010)
widely used . . .
[Figure: points labelled AS, AV, CR, FL, GB, GG, MX, BR, SU, TR, BAH, BAR, FLF, CBG, MOC, NIC, BRF]
A cautionary example
Mismatch between a simple Gibbs sampler and the equivalent WinBUGS implementation
Bug (??) in relatively widely used software; haven't found time to diagnose/fix it!
[Figure: density plots on a 0.0 to 1.0 axis, panels labelled NWFL and SOFL; legend: tmcmc, wbugs, wbugsL, wbugsLL]
Mixed models
(generalized) linear mixed models
very useful framework: GLMs (exponential family) + random effects
linear algebra magic by D. Bates, M. Maechler
technology transfer
amazingly popular!
[Figure: citations by year, 1995 to 2015; legend: GLMM paper, all others]
Conclusions
Can't be too gloomy . . . trickle-down process does work
But we should at least recognize when we are amusing
ourselves, and when we are doing science
Can we formalize these ideas?
Is it worth it?