Predictive Habitat Distribution Models, Leire Ibaibarriaga

Key lecture for the EURO-BASIN Training Workshop on Introduction to Statistical Modelling for Habitat Model Development, 26-28 Oct, AZTI-Tecnalia, Pasaia, Spain (www.euro-basin.eu)

  1. EURO-BASIN, www.euro-basin.eu. Introduction to Statistical Modelling Tools for Habitat Models Development, 26-28th Oct 2011
  2. OUTLINE
      • Why model?
      • Habitat models
      • Model properties
      • Steps for modelling
      • What about data?
  3. WHY MODEL?
      • “All models are wrong, some models are useful” (G. Box)
      • Models are how we understand the world:
        – We see the world through models
        – We learn about the world using formal descriptions
      • Model types:
        – Static vs dynamic
        – Explanatory vs predictive
        – Deterministic vs stochastic
        – Discrete vs continuous
  4. HABITAT MODELS
      • Habitat models focus on how environmental factors control the distribution of species and communities.
      • Multiple applications:
        – Biogeography, impacts of global change, management, conservation, ecology, …
      • New conceptual and operational advances driven by the growth in computing power, e.g. GIS, remote sensing, new (computer-intensive) statistical modelling tools, etc.
  5. MODEL PROPERTIES
      Some desirable model properties:
      • Parsimony (Occam’s razor): “All things being equal, the simplest solution tends to be the best one”
      • Tractability: easy to analyse
      • Conceptually insightful: reveals fundamental properties
      • Generalizability: can be applied to other situations/species/…
      • Empirical consistency: consistent with the available data
      • Falsifiability: can be tested by observations
      • Predictive precision
  6. MODEL PROPERTIES
      [Figure: predictive habitat distribution models. Levins (1966); Sharpe (1990); Guisan and Zimmermann (2000)]
  7. MODEL PROPERTIES
      [Figure labels: COMPLEXITY, GENERALITY]
      The more complex model is not necessarily the best…
  8. STEPS FOR MODELLING
      1) Conceptual phase
      2) Model formulation
      3) Model calibration
      4) Spatial predictions
      5) Model evaluation
      6) Model applicability
  9. STEPS FOR MODELLING
      [Figure: Guisan and Zimmermann (2000)]
  10. Step 1: Conceptual phase
      • Some sort of theoretical model should be in mind before a statistical model is even considered
      • This phase includes:
        – Literature review
        – Define an up-to-date conceptual model
        – Set multiple hypotheses
        – Assess available and missing data
        – Identify an appropriate sampling strategy for new data
        – Choose an appropriate spatio-temporal resolution and geographic extent
        – Identify the most appropriate statistical methods for the other phases
  11. STEPS FOR MODELLING
      [Figure: Guisan and Zimmermann (2000)]
  12. Step 2: Model formulation
      • The model depends on the type of response variable and its associated probability distribution

        Distribution         Examples
        Gaussian             Biomass
        Poisson              Individual counts
        Negative Binomial    Individual counts
        Multinomial          Communities
        Binomial             Presence/absence
  13. Step 2: Model formulation
      [Figure: Guisan and Zimmermann (2000)]
  14. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Figure: scatter plot of y against x; x from 0 to 10, y from 0 to 50. © AZTI-Tecnalia, oct-11]
  15. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Figure: scatter plot of y against x; x from 0 to 10, y from 0 to 50]
  16. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Figure: scatter plot of y against x; x from 0.0 to 1.0, y from -5 to 10]
  17. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Figure: scatter plot of y against x; x from 0.0 to 1.0, y from -5 to 10]
  18. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Equation figure; label: LINK FUNCTION]
      • The response variable y can follow distributions such as Normal, Binomial, Poisson, Gamma, etc.
      McCullagh and Nelder (1989); Dobson (2008)
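
The role of a link function can be illustrated with a small sketch: a Poisson regression in which the log of the expected count is a linear function of one predictor. This is only an illustration under stated assumptions, not the lecture's own code. The data (`temps`, `counts`) are invented, and the fit uses plain Newton-Raphson on the Poisson log-likelihood rather than any particular statistical package.

```python
import math

def fit_poisson_glm(x, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        # Inverse link: the linear predictor is mapped back to the mean by exp()
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a 2 x 2 symmetric matrix
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical data: individual counts at ten stations along a temperature gradient
temps = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
counts = [1, 1, 2, 2, 4, 4, 7, 9, 12, 17]

b0, b1 = fit_poisson_glm(temps, counts)
fitted = [math.exp(b0 + b1 * t) for t in temps]  # always positive, as counts require
```

The log link keeps every fitted mean positive, which a straight-line fit on the raw counts would not guarantee.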
  19. REGRESSION ANALYSIS (Step 2: Model formulation)
      [Equation figure; labels: LINK FUNCTION, SMOOTHS]
      • The response variable y can follow distributions such as Normal, Binomial, Poisson, Gamma, etc.
      Hastie and Tibshirani (1990); Wood (2006)
  20. REGRESSION ANALYSIS (Step 2: Model formulation)
      • Linear model (LM)
      • Additive model (AM)
      • Generalized linear model (GLM)
      • Generalized additive model (GAM)
  21. REGRESSION ANALYSIS (Step 2: Model formulation)
      Other regression models:
      • Mixed models: LMs, GLMs and GAMs including random effect terms. Useful for meta-analysis.
      • Quantile regression: the quantiles are modelled instead of the mean. Useful for finding limiting factors.
      • Segmented regression: the model changes depending on a partition of the explanatory variable. Useful for detecting regime changes.
      • Spatial autocorrelation and autoregressive models
  23. CLASSIFICATION TECHNIQUES (Step 2: Model formulation)
      • Classification is the placement of species and/or sample units into groups based on the environmental variables
      • Many techniques included: classification decision tree, regression decision tree, rule-based classification, maximum-likelihood classification
      • Mainly two groups:
        – Supervised classification: a training data set is required (groups are known beforehand)
        – Unsupervised classification: groups are unknown and need to be defined, as in cluster analysis
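
As a minimal sketch of supervised classification, the block below fits a one-split classification tree (a "decision stump") to presence/absence labels using a single environmental variable. The depth values and labels are invented for illustration, and a real classification tree would search over several variables and make repeated splits.

```python
def fit_stump(values, labels):
    """Find the threshold on one variable that minimises training misclassification.

    Returns (threshold, label_above, errors), where label_above is the class
    predicted for values above the threshold (the other class is predicted below).
    """
    ordered = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = None
    for threshold in candidates:
        for label_above in (0, 1):
            errors = sum(
                (label_above if v > threshold else 1 - label_above) != lab
                for v, lab in zip(values, labels)
            )
            if best is None or errors < best[2]:
                best = (threshold, label_above, errors)
    return best

def predict(threshold, label_above, value):
    """Classify a new sample unit with the fitted stump."""
    return label_above if value > threshold else 1 - label_above

# Hypothetical training set: bottom depth (m) and presence (1) / absence (0)
depth = [10, 20, 35, 50, 80, 120, 150, 200]
presence = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, label_above, errors = fit_stump(depth, presence)
```

Because the groups are known beforehand, this is the supervised case; an unsupervised method would have to derive the groups from the depth values alone.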
  25. ENVIRONMENTAL ENVELOPES (Step 2: Model formulation)
      • The environmental envelope of a species is defined as the set of environments within which it is believed that the species can persist (Walker and Cocks, 1991)
      • Examples of models:
        – BIOCLIM: minimal rectilinear envelopes based on classification trees
        – HABITAT: convex polytope envelopes based on classification trees
        – DOMAIN: based on multivariate distance metrics
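
The idea of a rectilinear envelope can be sketched as a box defined by the minimum and maximum of each environmental variable over the sites where the species was observed. This is a simplified illustration of the concept, not the BIOCLIM implementation; the variable names and values are invented.

```python
def fit_envelope(presence_sites):
    """Return {variable: (min, max)} over the sites where the species is present."""
    variables = presence_sites[0].keys()
    return {
        v: (min(site[v] for site in presence_sites),
            max(site[v] for site in presence_sites))
        for v in variables
    }

def inside_envelope(envelope, site):
    """A site is inside the envelope only if every variable lies within its range."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())

# Hypothetical presence records with two environmental variables
presence_sites = [
    {"temperature": 12.0, "salinity": 35.1},
    {"temperature": 14.5, "salinity": 35.4},
    {"temperature": 13.2, "salinity": 34.9},
]

envelope = fit_envelope(presence_sites)
suitable = inside_envelope(envelope, {"temperature": 13.0, "salinity": 35.0})
unsuitable = inside_envelope(envelope, {"temperature": 16.0, "salinity": 35.0})
```

Only presence records are needed here, which is one reason envelope approaches are used when reliable absence data are missing.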
  26. ORDINATION TECHNIQUES (Step 2: Model formulation)
      • Ordination is the arrangement or ‘ordering’ of species and/or sample units along gradients
      • Usually applied to community data matrices (row: species, column: samples, value: abundance)
  27. ORDINATION TECHNIQUES (Step 2: Model formulation)
      • Indirect gradient analysis (no environmental data used)
        – Distance-based approaches: Polar ordination, Principal Coordinates Analysis, Nonmetric Multidimensional Scaling
        – Eigenanalysis-based approaches:
          · Linear model: Principal Components Analysis
          · Unimodal model: Correspondence Analysis, Detrended Correspondence Analysis
      • Direct gradient analysis (environmental data used)
        – Linear model: Redundancy Analysis
        – Unimodal model: Canonical Correspondence Analysis, Detrended Canonical Correspondence Analysis
      ter Braak and Prentice (1988)
  28. NEURAL NETWORKS (Step 2: Model formulation)
      • Models inspired by the human brain (an interconnected group of neurons)
      • They define a non-linear function that is decomposed as a weighted sum of functions, each of which can be decomposed further in the same way, and so on. The result is a complex non-parametric model (a black box?)
      • Adjusted by varying parameters, connection weights, or specifics of the architecture such as the number of neurons or their connectivity
      • Few examples available yet
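
The "weighted sum of functions" structure can be made concrete with the forward pass of a network with one hidden layer. This is a sketch only: the weights below are arbitrary numbers chosen for illustration, whereas in practice they would be adjusted to fit the data, and the two inputs are assumed to be standardised environmental variables.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a one-hidden-layer network for a single sample."""
    # Each hidden neuron applies a non-linear function to a weighted sum of the inputs
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output is again a non-linear function of a weighted sum of the hidden values
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hypothetical standardised inputs (e.g. temperature, salinity) and arbitrary weights
sample = [0.5, -1.0]
hidden_weights = [[1.0, -2.0], [0.5, 0.5]]
hidden_biases = [0.0, 0.0]
output_weights = [1.5, -1.0]
output_bias = 0.0

probability = forward(sample, hidden_weights, hidden_biases, output_weights, output_bias)
```

With a logistic output the result can be read as a probability of presence, but the individual weights have no direct ecological interpretation, which is the "black box" concern raised on the slide.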
  29. STEPS FOR MODELLING
      [Figure: Guisan and Zimmermann (2000)]
  30. Step 3: Model calibration
      • It includes model fitting (finding the values of the unknown parameters that give the best agreement between the data and the model outputs) and model selection (deciding which explanatory variables to include)
      • Points to take into account:
        – Use of predictors that are ecologically relevant: direct vs indirect (proxy) variables
        – Correlation between explanatory variables
      • Each method has its own diagnostic tools according to its assumptions, e.g. the residual deviance in regression models
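
One simple check for correlation between explanatory variables is to compute pairwise Pearson correlations before fitting and flag strongly related pairs. The sketch below uses invented predictor values, and the 0.7 cut-off is an arbitrary illustrative choice, not a recommendation from the lecture.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlated_pairs(predictors, threshold=0.7):
    """List (name1, name2, r) for predictor pairs with |r| at or above the threshold."""
    names = sorted(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(predictors[first], predictors[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged

# Hypothetical predictors measured at six stations
predictors = {
    "temperature": [10, 12, 14, 16, 18, 20],
    "depth": [200, 160, 120, 80, 40, 0],
    "chlorophyll": [1.2, 0.8, 1.5, 0.9, 1.4, 1.0],
}

flagged = correlated_pairs(predictors)
```

In this invented example temperature and depth carry the same information, so including both would make their individual effects hard to separate.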
  31. STEPS FOR MODELLING
      [Figure: Guisan and Zimmermann (2000)]
  32. Step 4: Spatial predictions
      • Spatial predictions can be made on the data set used for calibration or on new data sets. Care must be taken if predictions are made on a new data set that contains new combinations of the explanatory variables or values outside the range covered by the calibration data set
      • GIS tools are very often used, but many statistical models are still not implemented in a GIS environment
  33. STEPS FOR MODELLING
      [Figure: Guisan and Zimmermann (2000)]
  34. Step 5: Model evaluation
      • The aim is to evaluate the predictive power of a model
      • If only one data set is available (the one already used for calibration): bootstrap, cross-validation, jackknife
      • If other data sets are available (independent of the calibration data set), predicted and observed values are compared using:
        – the same goodness-of-fit measure as used for model calibration
        – any other measure of association
      • The data sets for calibration and evaluation are called the training and evaluation data sets, respectively. Sometimes the original single data set is split in two (split-sample approach)
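
Cross-validation with a single data set can be sketched as follows: the data are divided into k folds, the model is fitted on all folds but one, and the held-out fold is used to measure prediction error. The example assumes a simple straight-line model and invented data; folds are assigned by position for brevity, whereas in practice the assignment would usually be randomised.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def k_fold_rmse(x, y, k=5):
    """Root mean squared prediction error estimated by k-fold cross-validation."""
    n = len(x)
    squared_errors = []
    for fold in range(k):
        # Observation i belongs to fold i % k; that fold is held out for evaluation
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors.extend(
            (y[i] - (intercept + slope * x[i])) ** 2 for i in held_out
        )
    return math.sqrt(sum(squared_errors) / n)

# Hypothetical data: a response measured along an environmental gradient
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2.1, 4.9, 8.2, 10.8, 14.1, 17.2, 19.8, 23.1, 26.0, 28.9]

cv_rmse = k_fold_rmse(x, y, k=5)
```

Every observation is predicted by a model that never saw it, so the resulting error is a less optimistic estimate of predictive power than the fit to the calibration data.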
  35. STEPS FOR MODELLING
      [Figure, label APPLICABILITY: Guisan and Zimmermann (2000)]
  36. Step 6: Model applicability
      • It refers to the domain over which a validated model can be properly used
      • Potential uses (Decoursey, 1992):
        – Screening
        – Research
        – Planning, monitoring and assessment
  37. WHAT ABOUT DATA?
      • Data is even more important than the model itself.
      • Usually from multiple sources: surveys (continuous, stations, vertical profiles), remote sensing, circulation models, …
      • The scale of the response and of the environmental variables might not be the same, so a common scale unit needs to be defined. Sometimes interpolation is needed, which can introduce additional uncertainty
      • Simple exploratory statistics and figures can be very useful before even starting to think about any model. They also help to spot errors in the data.
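
Bringing an environmental variable onto the scale of the response can be sketched with one-dimensional linear interpolation, for example estimating temperature at a sampling position that lies between two measured positions. The positions and temperatures below are invented, and real applications often need two- or three-dimensional interpolation with an estimate of the added uncertainty.

```python
def interpolate(xs, ys, x_new):
    """Linearly interpolate y at x_new from points (xs, ys), with xs sorted ascending."""
    if not xs[0] <= x_new <= xs[-1]:
        # Refuse to extrapolate beyond the measured range
        raise ValueError("x_new lies outside the range of the measurements")
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x_new <= x1:
            return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)

# Hypothetical temperature measurements (degrees C) at three positions (km along a transect)
positions = [0, 10, 20]
temperatures = [15.0, 13.0, 9.0]

# Estimated temperature at a biological sampling station located at km 5
temperature_at_station = interpolate(positions, temperatures, 5)
```

The function raises an error outside the measured range instead of extrapolating, in line with the caution about predictions beyond the range of the available data.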
