RPG iEvoBio 2010 Keynote

Keynote talk slides, inaugural iEvoBio meeting, Portland 2010.

RPG iEvoBio 2010 Keynote Presentation Transcript

  • Biodiversity Discovery and Documentation in the Information and Attention Age
    Presented by: Rob Guralnick
    Authors: Rob Guralnick and Andrew Hill
    Contributors: Meredith Lane, Dan Janies, Walter Jetz, and lots of other folks.
    Funding support: Global Biodiversity Information Facility, National Biological
    Information Infrastructure, Defense Advanced Research Projects Agency,
    National Science Foundation.
    #ievobio
  • WHAT IS BIODIVERSITY DISCOVERY AND DOCUMENTATION?
    ACTION: Discovering and documenting new units of biodiversity
    IMPEDIMENT: Linnean shortfall (too few taxonomists, antiquated and laborious process)
    ACTION: Discovering and documenting distributions of lineages
    IMPEDIMENT: Wallacean shortfall (very coarse resolution, scattered data, no integration)
    ACTION: Discovering and documenting relationships among lineages
    IMPEDIMENT: Darwinian shortfall (trees scattered in literature, no “mother of all trees”)
    ACTION: Discovering and documenting lineage traits from genomes to phenotype
    IMPEDIMENT: Multiple repositories storing genetic and phenotypic data that do not communicate well. Phenotypic knowledge-bases lag behind.
  • Pace of Species Description and Documentation for 2008
    Approx. 1.922 million named species (all taxa)
    ~4-30 million undiscovered
    From the State of Observed Species report, http://species.asu.edu/files/SOS2010.pdf
  • Pace of Species Description and Documentation for 2008
    Assuming a relatively conservative number (e.g., 10 million undescribed species), it will take another 360 years to discover and document them at our current pace. Why is discovery and documentation so slow?
    Taxonomists proceed in the same manner today as they did one hundred years ago.
    Few products are generated along the way. This also means the process is vulnerable (to anything from the loss of computers to the loss of taxonomists themselves).
    Discovery and documentation are coupled.
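The 360-year figure on this slide is a division of the backlog of undescribed species by the annual description rate. The sketch below reproduces that arithmetic. The rate is back-calculated from the slide's own numbers (10 million species over 360 years); it is an assumption for illustration, not a figure quoted from the State of Observed Species report.

```python
def years_to_describe(undescribed_species: float, species_per_year: float) -> float:
    """Years needed to describe a backlog at a constant annual rate."""
    if species_per_year <= 0:
        raise ValueError("species_per_year must be positive")
    return undescribed_species / species_per_year


# Assumption: rate implied by the slide (10 million species in 360 years),
# roughly 27,800 species per year. The slides do not state this rate directly.
IMPLIED_RATE = 10_000_000 / 360

if __name__ == "__main__":
    # Low end, slide's working figure, and high end of the 4-30 million range.
    for backlog in (4_000_000, 10_000_000, 30_000_000):
        years = years_to_describe(backlog, IMPLIED_RATE)
        print(f"{backlog:>12,} undescribed species: about {years:,.0f} years")
```

At that assumed rate, the low and high ends of the 4 to 30 million range work out to roughly 144 and 1,080 years.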
  • State and Scale of Knowledge in Environmental Sciences
    Slide from Walter Jetz (thanks Walter)
  • A view of the world at different resolutions
    90 meter resolution SRTM elevation data for a portion of Colorado
    100 times as coarse
    1000 times as coarse
  • Distribution Knowledge Is Scattered
    Microtus montanus
    • Points are from GBIF data portal
    • Expert opinion range map from IUCN Red List
    • IUCN also lists some habitat preferences (cropland, meadows, mountain valleys)
  • Documenting our biodiversity matters because it is under increasing threat.
  • “Overall, we are locked into a race. We must hurry to acquire the knowledge on which a wise policy of conservation and development can be based for centuries to come.”
    - E. O. Wilson
  • HOW DO WE DO THIS?
    DEVELOP KNOWLEDGE-BASES OF SPECIES DISTRIBUTIONS AND SPECIES RELATIONSHIPS.
    PROVIDE MEANS TO INTEGRATE ACROSS THESE KNOWLEDGE-BASES
    PROVIDE TOOLS TO RAPIDLY AND EASILY EXPLORE THESE DATA ACROSS SPACE AND TIME
    MAKE THIS A COMMUNITY EFFORT – LEVERAGE COMMUNITY SOURCING
  • Phylo-, Biodiversity and Ecological Informatics
    Analytical Methods: means to summarize data & select hypotheses
    Growing Toolbox
    Application Services: automated workflow for biodiversity science
    Tools: encoding analytical methods
    Initial Research Questions
    Raw global data: lineage, occurrence, environmental
    New Research Questions
    Concepts and ideas
    Processed global data: species, distributions, new envir. layers
    Growing data and information repositories
  • Growing data (repositories, formats)
    Growing Toolbox
    Concepts and ideas
    Paup, Phyml/Raxml, MrBayes, Beast, Mesquite, etc.
    Tree of Life
    Population Genetics
    GenBank, TreeBase (Nexus/Newick/PhyloXML, etc.)
    TCS/NCA, MsBayes, BayesSCC, Structure, etc.
    Inference-based
    Satellites (Modis/GOES/Landsat, etc.)
    Satellite image repositories, Worldclim, PRISM, PMIP (erdapp, netCDF, GIS formats)
    Earth Surface
    Satellites; historical, current in-situ, GCMs, etc.
    Climate
    Infrared Imaging Spectrometer, etc.
    Ecosystem fluxes
    Instrument-based raw
    Statistical/inferential
    TaxonX, automated species name extraction
    ITIS, Catalog of Life, Zoobank, Zookeys, etc.
    Species named
    Lucid, Ontologies, RDF
    Species traits
    Species distributions
    Morphbank, TraitNet, etc.
    GIS, habitat suitability models, SDMs/ENMs, Survey Gap, etc.
    GBIF, VertNet, OBIS (species occurrence), Map of Life, IUCN
    Observations and model-based
  • The Interconnected Nature of Biodiversity Ideas, Outputs, Repositories
    From Peterson et al., in press, Systematics and Biodiversity
  • DECOUPLING SPECIES DISCOVERY AND DOCUMENTATION
    (OR GET IT OUT THERE FOR OTHERS TO USE AND REPURPOSE)
    (OR CLAIM NEW BIODIVERSITY, PROVISIONALLY, BEFORE FORMAL PUBLICATION)
    Repositories
    Community sourcing
    Publish step 1
    Generate new data from specimens
    GenBank
    Scratchpads
    Morphbank
    LifeDesks
    Comparative analyses
    TreeBase
    Link new unit of biodiversity onto tree of life (claim discovery)
    Publish step 2
    Formal publication (documentation)
  • TAKE HOME MESSAGE 1:
    We need to use the web as a collaborative work environment for biodiversity knowledge generation
    We need to claim knowledge of the existence of new species before all of the formal steps to document it are complete
    We need to publish new data about species soon after generation and prior to publication
  • What about monitoring an evolving Earth System?
    Tracking the spread of disease lineages with known important mutations through time & space
    Questions:
    • How are drug-resistant strains of H5N1 circulating around the globe?
    • How did drug resistance arise in the H5N1 population?
    • Are mutations that give rise to drug resistance in H5N1 under positive selection?
    • Can we provide ways for researchers and the general public to track this spread in near real time?
    Hosts and strains of avian influenza A
    Viral structure
  • Methods:
    • Collect public genome data for H5N1 avian influenza (676 full genomes).
    • Use tools for more efficient alignment and phylogenetic analysis of data.
    • Test whether mutations on the M2 gene (L26I, V27A/I, A30S, S31N) that provide resistance to adamantanes (a class of drugs used to treat influenza A) are under positive selection, purifying selection or are neutral (across the full sampled population of H5N1 influenza A).
    • Make Google Earth™ visualizations available.
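As a small illustration of the third step, the sketch below scans an M2 protein sequence for the four adamantane-resistance substitutions named on the slide. It is not the authors' pipeline. The function names are hypothetical, and `toy_sequence` builds a placeholder string (not a real M2 sequence) purely so the example can run.

```python
from typing import Dict, List, Optional

# Substitutions listed on the slide: L26I, V27A/I, A30S, S31N.
# Keys are 1-based protein positions; values are (wild type, resistant residues).
RESISTANCE_SITES = {
    26: ("L", {"I"}),
    27: ("V", {"A", "I"}),
    30: ("A", {"S"}),
    31: ("S", {"N"}),
}


def find_resistance_mutations(m2_protein: str) -> List[str]:
    """Return labels such as 'S31N' for resistance residues found in the sequence."""
    hits = []
    for pos, (wild_type, resistant) in sorted(RESISTANCE_SITES.items()):
        if pos > len(m2_protein):
            continue  # sequence too short to cover this site
        residue = m2_protein[pos - 1]  # convert 1-based position to 0-based index
        if residue in resistant:
            hits.append(f"{wild_type}{pos}{residue}")
    return hits


def toy_sequence(substitutions: Optional[Dict[int, str]] = None) -> str:
    """Hypothetical placeholder: 'X' everywhere except the four sites of interest."""
    residues = ["X"] * 40
    for pos, (wild_type, _) in RESISTANCE_SITES.items():
        residues[pos - 1] = wild_type
    for pos, residue in (substitutions or {}).items():
        residues[pos - 1] = residue
    return "".join(residues)


if __name__ == "__main__":
    print(find_resistance_mutations(toy_sequence()))                  # no substitutions
    print(find_resistance_mutations(toy_sequence({27: "A", 31: "N"})))
```

In practice the input would come from aligned protein translations of the collected genomes, so that position numbers line up across strains.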
  • Global View of Spread of H5N1 (blue branches are lineages with mutation for higher transmissibility among mammals)
  • Resistant mutant found at position 31 of the M2 protein – colored red below
    Altitude of node X = a + [(n − 1) × b]
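The slide does not define a, b or n. One plausible reading, used in the sketch below, is that a is the base altitude given to tips, b is the altitude added per level, and n is the node's level counted upward from the tips (a tip has n = 1). These readings, the `Node` class and the example values are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Minimal tree node; a node with no children is a tip."""
    name: str
    children: List["Node"] = field(default_factory=list)


def level(node: Node) -> int:
    """Assumed meaning of n: 1 for a tip, else 1 + the deepest child's level."""
    if not node.children:
        return 1
    return 1 + max(level(child) for child in node.children)


def altitude(node: Node, a: float, b: float) -> float:
    """Altitude of node X = a + [(n - 1) * b], with n taken from level()."""
    return a + (level(node) - 1) * b


if __name__ == "__main__":
    # Toy tree ((A, B), C) with hypothetical altitudes in meters.
    inner = Node("AB", [Node("A"), Node("B")])
    root = Node("root", [inner, Node("C")])
    for node in (inner.children[0], inner, root):
        print(node.name, altitude(node, a=10_000, b=50_000))
```

Under this reading, tips sit at the base altitude and deeper ancestors float higher, which keeps older branches visually above younger ones in a globe viewer.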
  • dN/dS measurements across the M2 protein. High dN/dS ratios (>1) suggest that more non-synonymous substitutions are occurring than expected and are therefore likely being maintained in the population.
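The interpretation of the ratio can be written as a short sketch: a ratio above 1 points to positive selection, below 1 to purifying selection, and near 1 to neutrality. The function names, the tolerance and the example rates below are hypothetical. A real analysis would estimate the rates from a codon alignment and test significance statistically rather than apply a fixed cutoff.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of non-synonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def classify_selection(omega: float, tolerance: float = 0.05) -> str:
    """Label a dN/dS ratio; the tolerance band around 1 is an arbitrary assumption."""
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "purifying"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical per-site rates, not values from the talk.
    for dn, ds in ((0.03, 0.01), (0.01, 0.05), (0.02, 0.02)):
        omega = dn_ds_ratio(dn, ds)
        print(f"dN={dn}, dS={ds}: {classify_selection(omega)}")
```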
  • Table 2. Amantadine use in chicken farms in Northern China in 1 year (from October 2004 to September 2005)
    From He, 2007, Antiviral Research
  • So What Did We Find Out?
    • Drug resistance to adamantanes is under positive selection for at least some mutations (S31N and V27A/I).
    • Drug-resistant lineages can spread quickly across the globe.
    • Emergence of drug resistance has been through mutation, not recombination and hitch-hiking (results not shown).
    • Effectively treating a potential H5N1 pandemic depends on continued monitoring of the evolution and spread of resistance to adamantanes and oseltamivir (Tamiflu).
  • TAKE HOME MESSAGES 2:
    • It is possible to develop observing systems not just of species but of evolving lineages.
    • These monitoring or observing systems can provide a unique view into evolution, selection and adaptation.
    • Such systems are essential for more accurate forecasting.
    • Developing such a system means creating automated workflows.
  • WHAT ABOUT ALLOWING OTHERS TO MAKE THEIR OWN GEOPHYLOGENY?
    http://geophylo.appspot.com/
    Hill and Guralnick, in press, Ecography
    Google App Engine application
  • GeoPhylo Engine: written in Python, open source, and deployed on Google App Engine.
    Advantages of cloud-based deployment:
    • Scalable (near infinite computation resources)
    • All versioning kept intact so developers can easily link to latest and greatest
    • Storage of persistent KMLs for users who want to share and modify their KMLs.
    • Easily deployable as a web service
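To make the KML output concrete, the sketch below writes one tree branch as a KML `LineString` with absolute altitudes, so that a globe viewer can draw it above the surface. This is not GeoPhylo Engine code. The function names and coordinates are hypothetical, and only standard KML 2.2 elements are used.

```python
from xml.sax.saxutils import escape


def branch_placemark(name, start, end):
    """Build a Placemark for one branch; start and end are (lon, lat, altitude_m)."""
    # KML coordinates are written as lon,lat,alt tuples separated by spaces.
    coords = " ".join(f"{lon},{lat},{alt}" for lon, lat, alt in (start, end))
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        "<LineString>"
        "<altitudeMode>absolute</altitudeMode>"
        f"<coordinates>{coords}</coordinates>"
        "</LineString>"
        "</Placemark>"
    )


def kml_document(placemarks):
    """Wrap Placemark strings in a minimal KML 2.2 document."""
    body = "".join(placemarks)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<kml xmlns="http://www.opengis.net/kml/2.2">'
        f"<Document>{body}</Document>"
        "</kml>"
    )


if __name__ == "__main__":
    # Hypothetical branch from an internal node (high altitude) down to a tip.
    branch = branch_placemark(
        "ancestor to tip A",
        start=(-105.27, 40.01, 60000),
        end=(-122.68, 45.52, 10000),
    )
    print(kml_document([branch]))
```

Saving the printed text to a `.kml` file gives something a KML viewer can open; a persistent copy of such a file is what users would share and modify.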
  • TAKE HOME MESSAGES 3:
    Geophylogenies provide rich visualizations of multidimensional data that can be examined at multiple spatial (and temporal) scales
    Such visualizations may appeal beyond our community of evolutionary biologists to the broader scientific and policy community
    Automated approaches and workbench-oriented tools allow updated, community-driven content to be generated
    Our ultimate goal should be an ever-growing “mother of all trees” from which we can attach new “twigs” as we discover them.
  • Can We Really Track Distributions of Lineages Through Space and Time?
  • Map of Life Will:
    • Provide expert opinion range maps for almost all terrestrial vertebrates (and means to accumulate more maps for other taxa)
    • Provide means for the community to annotate those maps
    • Assemble point occurrences, habitat preference data and environmental data (e.g. climate, landcover, soil, etc.)
    • Provide a modeling approach to generate much finer scale distribution models (on the order of a kilometer resolution)
  • Overlaying expert opinion maps and model outputs
    • Common data model for range maps
    • Web-services based for sharing maps
    • Focus on improvement through both modeling and community involvement
    Map of Life Connections
  • Integrating phylogenetic and distributional data in Google Earth™
  • Workflows combining phylogenetic approaches, conservation status and species occurrence
  • TAKE HOME MESSAGES 4:
    Map of Life fills a critical gap in our global biodiversity knowledge by integrating different sources of species distribution data into high-resolution range maps for community use.
    The ultimate goal is to integrate such species distribution knowledge with knowledge about relationships among species and conservation knowledge
    Such integration, at global scale, and across large taxonomic groups, is the next step forward
  • Relational
    Modeling
    Patterns
    Predictions
  • Community Sourcing and the Attention Age
    At the heart of the message here today is also a challenge:
    The vision here suggests that data publishing and “sharing” is as important as academic “kudos”
    Can we act for the collective good of our community and, by so doing, see gains for all?
    Let's change our model of credit!