Delivering biodiversity knowledge in the information age

Presented via Google Hangouts to the Hellenic Botanical Society, Thessaloniki, Greece, 3 Oct. 2013.


Usage Rights

CC Attribution-NonCommercial-ShareAlike License


    Presentation Transcript

    • Vince Smith Delivering biodiversity knowledge in the information age Hellenic Botanical Society Thessaloniki, Greece, 3-6 Oct. 2013
    • Overview 1. Background – biodiversity data diversity • An introduction to me (lice to data infrastructures) • The problem (integrating biodiversity research) 2. Example tools to manage biodiversity data • Scratchpads (a platform to manage data) • Biodiversity Data Journal (incentives to work digitally) • eMonocot (aggregating data across communities) 3. Big community challenges – three examples • Social issues (openness) • Data issues (mobilizing existing data) • Synthetic issues (modeling data) 4. Next steps • Toward an integrated view for H2020 (strategy)
    • 1. Background
    • Lice to data infrastructures (1997-2004) Systematics (circa 1998) - No high level keys - Poor high level taxonomy - Just one phylogeny - Few living experts! Circa 5,000 spp. Mammals & birds 12,000 associations 15,000 potential hosts
    • Lousy data infrastructures (circa 2004) LouseBASE, Glasgow version at: http://darwin.zoology.gla.ac.uk/~rpage/LouseBase/2/ Specimens Images (SID) http://darwin.zoology.gla.ac.uk/~SID/ Literature PHPBib http://myphpbib.sourceforge.net/ Lab Notebook http://www2.flmnh.ufl.edu/pdb/ Host-Parasite Checklists http://www2.flmnh.ufl.edu/adb/
    • The problem – integrating biodiversity research (2004>) How do we join up these activities? How do we use this as a tool? Species conservation & protected areas Impacts of human development Biodiversity & human health Impacts of climate change Food, farming & biofuels Invasive alien species What infrastructures do we need? (technologies, tools, standards…) What processes do we need? (Modelling, workflows…) What data do we need? (Genes, localities…)
    • 2. Biodiversity data tools - Scratchpads - Biodiversity Data Journal - eMonocot
    • Scratchpads – a space for your data • Hosted websites for biodiversity data • Virtual research environments • Completely open access & open source • Modular & flexible • Running since 2007 • Making taxonomy digital, open & linked http://scratchpads.eu
    • Scratchpads – a space for your data Taxa Projects Regions Societies 544 Scratchpad Communities by 6,644 active registered users covering 91,631 taxa in 535,317 pages. 81 paper citations in 2012. In total more than 1,300,000 visitors. http://scratchpads.eu
    • Biodiversity Data Journal – incentivising data publishing • New, Open Access data journal • Linked to Scratchpads via Publication Module • Supports the life cycle of a manuscript • Writing, submission, review, publication & dissemination, all in one place • Structured, reusable, standardised data • Launched in Sept 2013 with 24 articles http://biodiversitydatajournal.com
    • Biodiversity Data Journal – easy manuscript assembly Structured data Review, publish, cite & disseminate EOL Dryad GBIF Wiki Species-Id PubMed Plazi Select, describe & annotate data Publication module http://biodiversitydatajournal.com
    • eMonocot – aggregating data across communities • Online resource for monocot plants • Collaboration between Kew, Oxford University and NHM • Data to be open and usable by other scientists http://e-monocot.org
    • eMonocot – aggregating data across communities • Linking monocot communities • Identification, checklist & taxonomic data for: - 275,000 taxa - 8,300 images - 15 identification keys - 3 phylogenies • A sustainable digital portal • A source of data for analysis http://e-monocot.org
    • 3. Example challenges - Social issues (openness) - Data issues (mobilising existing data) - Synthetic issues (modelling)
    • Social challenges: openness E. Archambault et al., Proportion of Open Access Peer-Reviewed Papers at the European and World Levels--2004-2011, June 2013, Science-Metrix Inc. “One-half of all papers are now freely available within a year or two of publication” “A piece of data or content is open if anyone is free to use, reuse, and redistribute it - subject, at most, to the requirement to attribute and/or share-alike.” http://opendefinition.org/ Many kinds of openness: • Open Access • Open Data • Open Science • Open Source • Sharing data is a foundation for our activities • Normal practice in some communities (molecular) • Mandated by some funders & governments Need to continue to incentivise openness
    • Data challenges: mobilising existing data Collections • 1.5-3B specimens in collections worldwide • Fragmented efforts / need coordination Biodiversity literature • >300M pages, BHL scanned 41M to date • Copyright post-1923 & article metadata Informatics challenges • Automation & annotation • Storage & persistence • Business models to sustain activity Collections, literature & metadata How can we quickly, efficiently and cost-effectively mobilise biological data at scale? Bibliography of Life (RefFinder & RefBank) BHL literature NHM Digitisation
    • Synthetic challenges: Modeling the biosphere Conceptually has many potential uses • Identifying trends • Explaining patterns • Making predictions • Real time alerts - when data contradicts current knowledge • The ultimate policy tool Major informatics challenges • Technically very difficult (many years off) • Needs effective prototypes & platforms • Some first steps e.g. Local Ecological Footprint Tool Nature 2013, doi:10.1038/493295a Reasoning across large, linked biodiversity datasets A clear, singular, long-term vision, which biodiversity data can contribute to
    • 4. Next steps - Further reading - H2020 Opportunities
    • A strategic view: community informatics challenges GBIF GBIC Report (Sept. 2013) Biodiv. Inf. Challenges (April, 2013) Grand Challenges for Biodiversity Informatics (integrating activities for H2020)