Combining Data in Species Distribution Models

Using point process models to combine different data types for species distribution models.

Slides for talk at ISEC 2014, presented on the 3rd July

Published in: Science, Technology

  1. Combining Data in Species Distribution Models
     Bob O’Hara (1), Petr Keil (2), Walter Jetz (2)
     (1) BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany (bobohara)
     (2) Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
  2. Motivation
     Map Of Life: www.mol.org/
  3. The Problem
     Different data sources:
     - GBIF
     - expert range maps
     - eBird and similar citizen science efforts
     - organised surveys (BBS, BMSs)
  4. Point Process Models
     - Point process representation of actual distribution
     - Continuous space models
     - Build different sampling models on top
  5. Point Processes: Model
     Intensity $\rho(\xi)$ at point $\xi$. Assume covariates (features?) $X(\xi)$ and a random field $\nu(\xi)$:
     $$\log \rho(\xi) = \eta(\xi) = \beta X(\xi) + \nu(\xi)$$
     Then, for an area $A$,
     $$P(N(A) = r) = \frac{\lambda(A)^{r} e^{-\lambda(A)}}{r!} \quad \text{where} \quad \lambda(A) = \int_A e^{\eta(s)}\,ds$$
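The two formulas on this slide can be checked numerically in one dimension. The sketch below assumes a made-up covariate $X(\xi) = \xi$, made-up coefficients, and no random field ($\nu = 0$); none of these values come from the talk.

```r
# Hypothetical 1-d example: eta(xi) = beta0 + beta1 * xi, with nu = 0.
beta0 <- 0.5   # assumed intercept
beta1 <- -1.0  # assumed slope on the covariate X(xi) = xi
eta <- function(xi) beta0 + beta1 * xi

# lambda(A) is the integral of exp(eta) over the area A = [0, 2].
lambda_A <- integrate(function(s) exp(eta(s)), lower = 0, upper = 2)$value

# N(A) is Poisson with mean lambda(A), so P(N(A) = r) is a Poisson probability.
r <- 0:3
p_r <- dpois(r, lambda_A)
print(round(lambda_A, 4))
print(round(p_r, 4))
```

For this particular choice of $\eta$ the integral has the closed form $e^{0.5}(1 - e^{-2})$, which gives a quick check on the numerical result.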
  6. In practice...
     Constrained refined Delaunay triangulation.
     $$\lambda(A) \approx \sum_{s=1}^{N} |A(s)|\, e^{\eta(s)}$$
     Approximate $\lambda(A)$ numerically: select some integration points, and sum over those.
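The weighted sum over integration points can be sketched as follows. This is a one-dimensional stand-in for the triangulated mesh: equal-width cells replace the triangles, and the linear predictor is an assumed function chosen so that the exact integral is known.

```r
# Stand-in for the mesh: N equal cells on A = [0, 2], one integration point per cell.
eta <- function(xi) 0.5 - xi             # assumed linear predictor (illustrative only)
N <- 200
edges <- seq(0, 2, length.out = N + 1)
s <- (edges[-1] + edges[-(N + 1)]) / 2   # integration points (cell midpoints)
area <- diff(edges)                      # |A(s)|: size of the cell around each point

lambda_approx <- sum(area * exp(eta(s)))
lambda_exact <- exp(0.5) * (1 - exp(-2)) # closed form for this particular eta
print(c(approx = lambda_approx, exact = lambda_exact))
```

Increasing `N` tightens the approximation, at the cost of more points to evaluate.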
  7. Some Data Types
     - Abundance: e.g. point counts
     - Presence/absence: surveys, areal lists
     - Point observations: museum archives, citizen science observations
     - Expert range maps
  8. Abundance
     Assume a small area $A$, so that $\eta(\xi)$ is constant, and observation for a time $t$. Then
     $$n(A, t) \sim \mathrm{Po}\left(e^{\mu_A(A, t)}\right) \quad \text{with} \quad \mu_A(A, t) = \eta(A) + \log(|A|) + \log(t) + \log(p)$$
     where $p$ is the probability of observing each individual.
     - We don't know all of $|A|$, $t$ and $p$, so estimate an intercept.
     - Can also add a sampling model to $\log(p)$.
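A small simulation shows what "estimate an intercept" means here. It is only a sketch: the covariate, the coefficients and the values of $|A|$, $t$ and $p$ are all invented, and a plain Poisson GLM stands in for the full model.

```r
set.seed(1)
n_sites <- 500
x <- rnorm(n_sites)                  # assumed site-level covariate
eta <- 1.0 + 0.5 * x                 # assumed log intensity at each site
area <- 0.2; t_obs <- 2; p <- 0.6    # constants unknown to the analyst (made up)
mu <- eta + log(area) + log(t_obs) + log(p)
counts <- rpois(n_sites, exp(mu))

fit <- glm(counts ~ x, family = poisson(link = "log"))
# The fitted intercept estimates 1.0 + log(area * t_obs * p), not 1.0 itself;
# the slope on x is unaffected by the unknown constants.
print(coef(fit))
print(1.0 + log(area * t_obs * p))
```

The unknown constants shift only the intercept, so the covariate effect can still be compared across data sets that each get their own intercept.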
  9. Presence/Absence for 'points'
     As $n(A, t) \sim \mathrm{Po}\left(e^{\mu_I(A, t)}\right)$,
     $$\operatorname{cloglog} \Pr(n(A, t) > 0) = \mu_I(A, t)$$
     with $\mu_I(A, t)$ defined as before. Again, can make $\log(|A|) + \log(t) + \log(p)$ an intercept.
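The link between the Poisson count and the complementary log-log scale can be verified directly. In this sketch the values of the linear predictor are arbitrary, and `cloglog` is a helper defined here for illustration.

```r
# Complementary log-log transform of a probability q.
cloglog <- function(q) log(-log(1 - q))

mu <- c(-2, -0.5, 0, 1)              # assumed values of the linear predictor
p_present <- 1 - dpois(0, exp(mu))   # Pr(n > 0) when n ~ Po(exp(mu))
print(cbind(mu, p_present, cloglog = cloglog(p_present)))
```

The last column reproduces `mu`, which is why a binomial model with a cloglog link shares its linear predictor with the Poisson abundance model.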
  10. Presence only: point process
      - log Gaussian Cox Process
      - Likelihood is a Poisson GLM (but with non-integer response)
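One way to see where the Poisson GLM form comes from is the standard log-likelihood of a Poisson point process, conditional on the random field. The notation $\xi_1, \dots, \xi_n$ for the observed points is introduced here for illustration, and the second line uses the same integration-point approximation as for $\lambda(A)$.

```latex
\log L = \sum_{i=1}^{n} \eta(\xi_i) - \int_A e^{\eta(\xi)}\,d\xi
       \approx \sum_{i=1}^{n} \eta(\xi_i) - \sum_{s=1}^{N} |A(s)|\, e^{\eta(s)}
```

Up to terms that do not depend on $\eta$, this has the same shape as a Poisson log-likelihood in which the integration points carry the weights $|A(s)|$, so it can be fitted with Poisson GLM machinery.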
  11. Areal Presence/absence
      If an area is large enough, we can't assume constant covariates, so
      $$\Pr(n(A) > 0) = 1 - \exp\left(-\int_A e^{\eta(\xi)}\,d\xi\right)$$
      In practice this is calculated as
      $$1 - \exp\left(-\sum_s |A(s)|\, e^{\eta(s)}\right)$$
      which causes problems with the fitting.
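The effect of dropping the constant-covariate assumption can be illustrated in one dimension. The linear predictor and the area below are assumptions chosen for the sketch, not values from the talk.

```r
eta <- function(xi) -1 + 1.5 * xi       # assumed linear predictor varying across the area
edges <- seq(0, 2, length.out = 101)    # large area A = [0, 2] split into 100 cells
s <- (edges[-1] + edges[-length(edges)]) / 2
area <- diff(edges)

lambda_A <- sum(area * exp(eta(s)))
p_areal <- 1 - exp(-lambda_A)           # integrates the intensity over A

# Treating eta as constant at the centre of A gives a different answer.
p_const <- 1 - exp(-2 * exp(eta(1)))
print(c(areal = p_areal, constant = p_const))
```

Because the exponential is convex, averaging the intensity over the area gives a larger expected count than evaluating it at the centre, so the constant-covariate shortcut understates the probability of presence in this example.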
  12. Expert Range Maps
      Not the same as areal presence. Instead, use distance to range as a covariate:
      - within the range, this is 0
      - have to estimate the slope for outside the range
      - use informative priors to force the slope to be negative
      [Figure: intensity (0 to 1) against space (1d, 0 to 100), with the species' range marked.]
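The distance-to-range covariate can be sketched in one dimension, mirroring the axes of the figure. The range limits and the slope are hypothetical values picked for illustration.

```r
# 1-d space from 0 to 100, with a hypothetical species' range covering [30, 60].
space <- seq(0, 100, by = 1)
range_lo <- 30; range_hi <- 60

# Distance to range: 0 inside the range, distance to the nearest edge outside it.
dist_to_range <- pmax(range_lo - space, 0, space - range_hi)

b_dist <- -0.1                          # assumed negative slope (illustrative value)
intensity <- exp(b_dist * dist_to_range)
print(intensity[space %in% c(0, 30, 45, 60, 100)])
```

With a negative slope the relative intensity is 1 everywhere inside the range and decays with distance outside it, rather than dropping to zero at the range boundary.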
  13. Put these together with INLA
      Quicker than MCMC.

          library(INLA)
          # SolTim.formula and stk.all (the stack of all data sets) are built elsewhere; not shown on the slide.
          SolTim.res <- inla(SolTim.formula,
              family = c('poisson', 'binomial'),  # Poisson and binomial likelihoods
              data = inla.stack.data(stk.all),
              control.family = list(list(link = "log"), list(link = "cloglog")),
              control.predictor = list(A = inla.stack.A(stk.all)),
              Ntrials = 1, E = inla.stack.data(stk.all)$e, verbose = FALSE)
  14. The Solitary Tinamou
      Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)
  15. Data
      [Figure legend: whole region; expert range; park, absent; park, present; eBird; GBIF.]
      - expert range
      - 2 point processes (49 points)
      - 28 parks
  16. A Fitted Model

                     mean    sd   mode
      Intercept     -0.30  0.09  -0.30
      b.PP           1.37  0.40   1.37
      b.GBIF         1.43  0.26   1.43
      Forest        -0.03  0.04  -0.03
      NPP            0.15  0.05   0.15
      Altitude      -0.02  0.04  -0.02
      DistToRange   -0.01  0.02  -0.01
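A rough way to read the table is to turn each row into an approximate 95% interval. The sketch below assumes each posterior is close to normal, so that mean ± 1.96 sd is a reasonable interval; that assumption is added here and is not stated in the slides.

```r
# Posterior summaries copied from the fitted-model table.
fit <- data.frame(
  term = c("Intercept", "b.PP", "b.GBIF", "Forest", "NPP", "Altitude", "DistToRange"),
  mean = c(-0.30, 1.37, 1.43, -0.03, 0.15, -0.02, -0.01),
  sd   = c(0.09, 0.40, 0.26, 0.04, 0.05, 0.04, 0.02)
)

# Normal-approximation intervals (an assumption, see the note above).
fit$lower <- fit$mean - 1.96 * fit$sd
fit$upper <- fit$mean + 1.96 * fit$sd
fit$excludes_zero <- fit$lower > 0 | fit$upper < 0
print(fit)
```

Under this approximation, NPP is the only environmental covariate whose interval excludes zero.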
  17. Predicted Distribution
      [Figure: predicted distribution, scale from −0.10 to 0.25. Legend: whole region; expert range; park, absent; park, present; eBird; GBIF.]
  18. Individual Data Types
      [Figure: one panel per data type, with scales: Expert Range, −10 to 0; GBIF, −0.060 to −0.048; eBird, −0.060 to −0.048; Parks, −10 to 0; all data, −0.10 to 0.25.]
  19. Summary
      - Parks and expert range seem to drive distribution
      - NPP is main covariate, not forest or altitude
  20. What Next
      - Multiple species: already being done elsewhere; estimate sampling biases
      - More data: point counts (have it working)
      - Can we estimate absolute probability of presence? Distance sampling? Mark-recapture?
      - Scaling issues (in time and space)
  21. Not the final answer...
      http://www.gocomics.com/nonsequitur/2014/06/24
