Los Angeles R users group - July 12 2011 - Part 1
    Presentation Transcript

    • Using R for multilevel modeling of salmon habitat. Yasmin Lucero, Statistical Consultant; Kelly Burnett, PNW Research Station, USFS; Kelly Christiansen, PNW Research Station, USFS; E. Ashley Steel, PNW Research Station, USFS; Eli Holmes, NW Fisheries Science Center, NOAA. Acknowledgements: NRC-RAP, National Academy of Sciences; ISEMP Monitoring Program, NOAA
    • Outline: background on fish ecology and the data; background on multilevel modeling; demo of lme4 package in R
    • The big goal: measure the effect of stream habitat quality on fish survival. [Photo: schooling juvenile coho salmon, by David Wolman]
    • Land area affected by Endangered Species Act listings of salmon & steelhead (February 2008): 28 distinct population segments (6 endangered, 22 threatened); 176,000 sq. miles in Washington, Oregon, Idaho & California study area; 61% of Washington’s land area, 55% of Oregon’s, 26% of Idaho’s, & 32% of California’s
    • The data: ~266 study sites; Oregon coastal region; juvenile coho salmon habitat; sparsely sampled, longitudinal study design; 12 year time series; 35 data layers; ~100 landscape level variates; ~22 habitat level variates
    • Abundance increases over time due to variation in ocean conditions (i.e. external to our analysis). [Figure: two panels titled coho.obs; axes are fs.coho.obs against year (1998 to 2008) and coefficient against fs.year (1998 to 2009)]
    • Sparsely sampled longitudinal data: only fish data has a time component; year effects are exogenous; landscape data everywhere; habitat data some places; fish data some places; not always the same places. [Figure: fs.coho.obs against year, one panel per site ID. Figure legend: mean density of coho at 16 frequently visited sites for 1998–2009]
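The sampling pattern on this slide (landscape data for every site, habitat and fish data for only some, and not always the same ones) can be sketched in base R as three small tables joined on a site key. This is only an illustration of the data layout: the site IDs, years, column names, and values below are hypothetical and do not come from the study.

```r
# Hypothetical tables; site IDs, years, column names and values are illustrative only.
landscape <- data.frame(
  site      = c("s1", "s2", "s3", "s4"),
  elev_mean = c(120, 340, 85, 410)        # available for every site
)
habitat <- data.frame(
  site     = c("s1", "s3"),
  gradient = c(0.02, 0.05)                # surveyed at some sites only
)
fish <- data.frame(
  site    = c("s1", "s1", "s2", "s4"),
  year    = c(1998, 2001, 1999, 2005),    # irregular visit years
  present = c(TRUE, FALSE, TRUE, FALSE)
)

# Start from the fish observations (the only layer with a time component),
# then attach site-level covariates. all.x = TRUE keeps every fish row and
# leaves NA where a site has no habitat survey.
obs <- merge(fish, landscape, by = "site", all.x = TRUE)
obs <- merge(obs, habitat, by = "site", all.x = TRUE)
print(obs)

# Rows usable in a model that needs both landscape and habitat predictors
complete <- obs[complete.cases(obs), ]
print(nrow(complete))
```

Starting from the fish table keeps one row per observation, which is the shape a mixed model with site and year effects expects. Sites with no fish visits drop out of the join.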
    • How the landscape data is acquired: GIS map layers are summarized across the area surrounding each study site
    • Habitat level data is collected by survey visits: labor intensive to collect, therefore less abundant. Variables: gradient, pool density, debris flow rates, drainage area, channel width, etc. [Photo labels: high structure: rocks and woody debris; shallow, highly channelized]
    • Multilevel structure for two reasons: (1) longitudinal sampling design; (2) varying scales of predictors (landscape, habitat, fish)
    • Generalized linear mixed models (aka hierarchical, multilevel, or random effects models). Canonical example: school test scores, with classes grouped within schools and schools within a state: student_score ~ class_average + school_average + state_average
    • Nested distributions, one per level: state level predictors, $\mathrm{Norm}(0, \sigma^2_{\mathrm{state}})$; school level predictors (school 1 to school 4), $\mathrm{Norm}(\mu_{\mathrm{state1}}, \sigma^2_{\mathrm{school}})$; class level predictors (class 1 to class 4), $\mathrm{Norm}(\mu_{\mathrm{school1}}, \sigma^2_{\mathrm{class}})$; student level predictors (student 1 to student 4), $\mathrm{Norm}(\mu_{\mathrm{class3}}, \sigma^2_{\mathrm{student}})$
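The nesting on this slide, where each level's mean is drawn from a normal distribution centered on its parent's mean, can be made concrete with a small base R simulation. This is a sketch under assumed values: the standard deviations, group counts, and seed are arbitrary choices, not numbers from the talk. Note that `rnorm` takes a standard deviation, whereas the slide writes the variance.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Assumed standard deviations at each level (illustrative values only)
sd_state <- 1.0; sd_school <- 0.8; sd_class <- 0.5; sd_student <- 2.0
n_school <- 4; n_class <- 4; n_student <- 4

# State mean, then school means centered on the state mean
mu_state  <- rnorm(1, mean = 0, sd = sd_state)
mu_school <- rnorm(n_school, mean = mu_state, sd = sd_school)

# One row per class, each class mean centered on its school's mean
classes <- expand.grid(class = seq_len(n_class), school = seq_len(n_school))
classes$mu_class <- rnorm(nrow(classes),
                          mean = mu_school[classes$school], sd = sd_class)

# One row per student, each score centered on its class's mean
idx <- rep(seq_len(nrow(classes)), each = n_student)
students <- classes[idx, ]
students$score <- rnorm(nrow(students),
                        mean = students$mu_class, sd = sd_student)

# School averages recovered from the simulated scores
print(aggregate(score ~ school, data = students, FUN = mean))
```

Fitting a multilevel model runs this process in reverse: given only the scores and the grouping labels, it estimates the variance contributed by each level.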
    • Our model structure is not so complicated: global; then sites (site 1, site 2, site 3, site 4) with landscape level predictors; then observations (obs 1, obs 2, obs 3, obs 4) with habitat level predictors & year effects
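    • Modeling presence/absence of fish: logistic mixed model with site and year effects. Observation level: $\mathrm{logit}(\Pr\{y_i = 1\}) = \beta_{\mathrm{year}} x_y + \beta_{h1} x_{h1} + \beta_{h2} x_{h2} + \dots + \alpha_{\mathrm{site}}$, where $\beta_{\mathrm{year}} x_y$ are the year effects (annotated $\gamma * \mathrm{year}$ on the slide), the $\beta_{h} x_{h}$ terms are the habitat level predictors, and $\alpha_{\mathrm{site}}$ are the site effects. Site level: $\alpha_{\mathrm{site}} \sim \mathrm{Norm}(\beta_{l1} x_{l1} + \beta_{l2} x_{l2} + \dots, \sigma^2_{\mathrm{site}})$, where the $\beta_{l} x_{l}$ terms are the landscape level predictors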
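As a rough illustration of this model structure, the sketch below simulates presence/absence data from a two-level logistic model and, if the lme4 package happens to be installed, fits it with `glmer`. Everything numeric here is an assumption made for the example: the coefficients, sample sizes, and variable names (`x_h`, `x_l`) are hypothetical, and the design is balanced (every site visited every year), unlike the sparse sampling in the study.

```r
set.seed(42)  # arbitrary seed

n_site <- 50; n_year <- 6

# Site level: one landscape predictor sets the mean of the site effect
x_l   <- rnorm(n_site)
alpha <- rnorm(n_site, mean = 0.8 * x_l, sd = 1.0)      # alpha_site

# Observation level: one habitat predictor plus a year effect
d <- expand.grid(site = seq_len(n_site), year = seq_len(n_year))
d$x_h <- rnorm(nrow(d))
d$x_l <- x_l[d$site]
beta_year <- seq(-0.5, 0.5, length.out = n_year)        # assumed year effects

# Linear predictor on the logit scale, then Bernoulli draws
eta <- beta_year[d$year] + 1.2 * d$x_h + alpha[d$site]
d$y <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Fit only if lme4 is available. The landscape predictor enters as an
# ordinary fixed effect; combined with the random intercept (1 | site)
# this is the same model as putting x_l in the mean of alpha_site.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(y ~ factor(year) + x_h + x_l + (1 | site),
                     data = d, family = binomial)
  print(lme4::fixef(fit))
}
```

Writing the site-level regression as a fixed effect plus a random intercept is the usual way to express a group-level predictor in lme4 formula syntax; the estimated random-intercept variance then corresponds to the residual site variance.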
    • Fit a lot of models, some predictors rose to the top. Best predictors: gradient, debris level, drainage area, mean elevation. [Figure: AIC (roughly 1100 to 1300) against logLik (roughly −620 to −520) for the fitted models, labelled with names from m1 to m34]
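The AIC against logLik comparison on this slide can be reproduced in miniature with base R. The sketch below uses plain `glm` logistic regressions on simulated data as a stand-in for the mixed models in the talk; the predictors, coefficients, and the list name `models` are hypothetical. It also checks the identity AIC = 2k − 2·logLik, where k is the number of estimated parameters.

```r
set.seed(7)  # arbitrary seed

n  <- 400
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)    # x3 is pure noise
y  <- rbinom(n, size = 1, prob = plogis(-0.3 + 1.0 * x1 + 0.5 * x2))
d  <- data.frame(y, x1, x2, x3)

# A small family of nested candidate models
models <- list(
  m1 = glm(y ~ x1,           data = d, family = binomial),
  m2 = glm(y ~ x1 + x2,      data = d, family = binomial),
  m3 = glm(y ~ x1 + x2 + x3, data = d, family = binomial)
)

# One row per model: parameter count, log-likelihood, and AIC
tab <- data.frame(
  model  = names(models),
  k      = sapply(models, function(m) attr(logLik(m), "df")),
  logLik = sapply(models, function(m) as.numeric(logLik(m))),
  AIC    = sapply(models, AIC)
)
print(tab)

# AIC = 2k - 2*logLik: extra parameters must buy enough likelihood to pay off
stopifnot(isTRUE(all.equal(tab$AIC, 2 * tab$k - 2 * tab$logLik)))
```

Because of this identity, models with the same number of parameters fall on a common line in an AIC against logLik plot, so vertical offsets between clusters reflect model size rather than fit.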
    • Overall model performance is strong at some things, weak at others. [Figure: fitted probabilities, fitted(models.ls$m24), plotted separately for absence and presence; histogram of fitted probabilities (count against fitted probability)]
    • Another look at model fit: some heavy outliers. [Figure: p/a of coho obs (data) against fitted, with a legend for years 1998 to 2009; formula text as it appears on the slide: ~ pa.obss.year + (fs.grad.rs + fs.cfs.down.rs + fs.vol.len.rs + el.mean.rs | catchment]
    • Conclusions: site matters; we can explain about half of the variation in why site matters with 4-5 predictors; habitat data is more valuable than landscape data; a small number of predictions are very wrong, and we can’t seem to improve them
    • Thanks. yasmin.lucero@gmail.com
    • Model predicted probabilities given presence/absence, with and without site effects. [Figure: two panels, m0 and m1; Pr{coho present} against coho presence (FALSE, TRUE)]