
# Combining Data in Species Distribution Models

Using point process models to combine different data types for species distribution models.

Slides for talk at ISEC 2014, presented on the 3rd July

Published in: Science, Technology

### Combining Data in Species Distribution Models

1. Combining Data in Species Distribution Models. Bob O’Hara¹, Petr Keil², Walter Jetz². ¹BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany (bobohara); ²Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA.
2. Motivation: Map Of Life, www.mol.org/
3. The Problem: different data sources. GBIF; expert range maps; eBird and similar citizen science efforts; organised surveys (BBS, BMSs).
4. Point Process Models: point process representation of the actual distribution; continuous space models; build different sampling models on top.
5. Point Processes: Model. Intensity $\rho(\xi)$ at point $\xi$. Assume covariates (features?) $X(\xi)$ and a random field $\nu(\xi)$, with $\log \rho(\xi) = \eta(\xi) = \beta X(\xi) + \nu(\xi)$. Then, for an area $A$, $P(N(A) = r) = \frac{\lambda(A)^r e^{-\lambda(A)}}{r!}$, where $\lambda(A) = \int_A e^{\eta(\xi)} \, d\xi$.
6. In practice... Constrained refined Delaunay triangulation. Approximate $\lambda(A)$ numerically: select some integration points and sum over those, $\lambda(A) \approx \sum_{s=1}^{N} |A(s)| \, e^{\eta(s)}$.
7. Some Data Types: abundance (e.g. point counts); presence/absence (surveys, areal lists); point observations (museum archives, citizen science observations); expert range maps.
8. Abundance: assume a small area $A$, so that $\eta(\xi)$ is constant, and observation for a time $t$. Then $n(A, t) \sim \mathrm{Po}(e^{\mu_A(A, t)})$ with $\mu_A(A, t) = \eta(A) + \log|A| + \log t + \log p$, where $p$ is the probability of observing each individual. We don’t know all of $|A|$, $t$ and $p$, so estimate an intercept. Can also add a sampling model to $\log p$.
9. Presence/Absence for ’points’: as $n(A, t) \sim \mathrm{Po}(e^{\mu_I(A, t)})$, $\mathrm{cloglog}\,\Pr(n(A, t) > 0) = \mu_I(A, t)$, with $\mu_I(A, t)$ of the same form as $\mu_A(A, t)$ before. Again, can make $\log|A| + \log t + \log p$ an intercept.
10. Presence only: point process. Log Gaussian Cox process. The likelihood is a Poisson GLM (but with non-integer response).
11. Areal Presence/absence: if an area is large enough, we can’t assume constant covariates, so $\Pr(n(A) > 0) = 1 - e^{-\int_A e^{\eta(\xi)} \, d\xi}$. In practice this is calculated as $1 - e^{-\sum_s |A(s)| \, e^{\eta(s)}}$, which causes problems with the fitting.
12. Expert Range Maps: not the same as areal presence. Instead, use distance to range as a covariate; within the range, this is 0. Have to estimate the slope for outside the range. Use informative priors to force the slope to be negative. [Figure: intensity (0 to 1) against space (1d, 0 to 100), with the species' range marked.]
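13. Put these together with INLA. Quicker than MCMC.

    ```r
    library(INLA)

    # SolTim.formula and stk.all (the combined inla.stack) are built
    # beforehand; they are not shown on the slide.
    # Two likelihoods are fitted jointly: Poisson with a log link and
    # binomial with a cloglog link.
    SolTim.res <- inla(SolTim.formula,
                       family = c('poisson', 'binomial'),
                       data = inla.stack.data(stk.all),
                       control.family = list(list(link = "log"),
                                             list(link = "cloglog")),
                       control.predictor = list(A = inla.stack.A(stk.all)),
                       Ntrials = 1,
                       E = inla.stack.data(stk.all)$e,
                       verbose = FALSE)
    ```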
14. The Solitary Tinamou. Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)
15. Data: expert range; 2 point processes (49 points); 28 parks. [Figure: map with legend Whole Region, Expert range, Park (absent), Park (present), eBird, GBIF.]
16. A Fitted Model:

    | Parameter   | mean  | sd   | mode  |
    |-------------|-------|------|-------|
    | Intercept   | -0.30 | 0.09 | -0.30 |
    | b.PP        | 1.37  | 0.40 | 1.37  |
    | b.GBIF      | 1.43  | 0.26 | 1.43  |
    | Forest      | -0.03 | 0.04 | -0.03 |
    | NPP         | 0.15  | 0.05 | 0.15  |
    | Altitude    | -0.02 | 0.04 | -0.02 |
    | DistToRange | -0.01 | 0.02 | -0.01 |
17. Predicted Distribution. [Figure: map of the predicted distribution, scale from −0.10 to 0.25; legend: Whole Region, Expert range, Park (absent), Park (present), eBird, GBIF.]
18. Individual Data Types. [Figure: one panel per data source, each with its own scale: Expert Range (−10 to 0), GBIF (−0.060 to −0.048), eBird (−0.060 to −0.048), Parks (−10 to 0), all data (−0.10 to 0.25).]
19. Summary: parks and expert range seem to drive the distribution. NPP is the main covariate, not forest or altitude.
20. What Next: Multiple species (already being done elsewhere). Estimate sampling biases. More data: point counts (have it working). Can we estimate absolute probability of presence? Distance sampling? Mark-recapture? Scaling issues (in time and space).
21. Not the final answer... http://www.gocomics.com/nonsequitur/2014/06/24
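
The numerical approximation of $\lambda(A)$ on slides 5 and 6 can be illustrated with a minimal base-R sketch. It is not the talk's implementation: the talk uses integration points from a Delaunay triangulation, whereas this sketch assumes a one-dimensional area $A = [0, 1]$, equal-width cells, and a made-up linear predictor `eta`.

```r
# Hypothetical 1-D example: a linear predictor eta(s) on the area A = [0, 1].
# The form of eta is an assumption for illustration only.
eta <- function(s) 1 + 2 * s

# Integration points: midpoints of N equal cells, each with weight |A(s)| = 1/N.
N <- 200
s <- (seq_len(N) - 0.5) / N
w <- rep(1 / N, N)

# lambda(A) is approximated by the weighted sum of exp(eta) over the points.
lambda.approx <- sum(w * exp(eta(s)))

# Reference value from base R's adaptive quadrature.
lambda.exact <- integrate(function(x) exp(eta(x)), lower = 0, upper = 1)$value

# Poisson probabilities for the number of individuals in A.
p.three   <- dpois(3, lambda.approx)     # P(N(A) = 3)
p.present <- 1 - exp(-lambda.approx)     # P(N(A) > 0), the areal presence case
```

The same weighted sum appears inside the exponent of the areal presence/absence probability on slide 11.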
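
The link between the abundance model (slide 8) and the presence/absence model (slide 9) can be checked numerically. The sketch below uses base R only; the values of `mu` are arbitrary assumptions, and `cloglog` and `inv.cloglog` are helper names introduced here for illustration.

```r
# Complementary log-log link and its inverse (hypothetical helper names).
cloglog     <- function(p) log(-log(1 - p))
inv.cloglog <- function(x) 1 - exp(-exp(x))

# Example linear predictors (assumed values).
mu <- c(-3, -1, 0, 0.5)

# Probability of at least one individual when n ~ Po(exp(mu)).
p.present <- ppois(0, exp(mu), lower.tail = FALSE)

# This equals the inverse cloglog of mu, so cloglog(p.present) recovers mu.
same.prob <- isTRUE(all.equal(p.present, inv.cloglog(mu)))
same.mu   <- isTRUE(all.equal(cloglog(p.present), mu))

# The same inverse link as supplied by base R's binomial family.
same.family <- isTRUE(all.equal(binomial(link = "cloglog")$linkinv(mu),
                                inv.cloglog(mu)))
```

Because the two likelihoods share the same linear predictor, counts and presence/absence records can inform the same intensity surface, differing only in their intercepts.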
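
The distance-to-range covariate on slide 12 can be sketched in one dimension. The range limits (40 to 60) and the slope value below are assumptions chosen to resemble the slide's sketch; in the talk the slope is estimated, with an informative prior keeping it negative, rather than fixed as it is here.

```r
# 1-D space from 0 to 100; the range limits are assumed for illustration.
space    <- 0:100
range.lo <- 40
range.hi <- 60

# Distance to the range: 0 inside it, distance to the nearest edge outside it.
dist.to.range <- pmax(range.lo - space, 0, space - range.hi)

# Log-linear effect on intensity with a negative slope (assumed value).
b.dist    <- -0.1
intensity <- exp(b.dist * dist.to.range)
```

Inside the range the covariate contributes nothing, so the other covariates and the random field determine the intensity there. Outside the range, the intensity decays with distance.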