Scratchpad virtual research environments: sharing, linking and publishing biodiversity data the ViBRANT way
Vince Smith¹, Dave Roberts¹ & Lyubomir Penev²
1. Natural History Museum, London
2. Pensoft Publishers, Sofia, Bulgaria
Our informatics grand challenge…
“Link together evolutionary data… by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses”

Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi:10.1016/j.tree.2011.11.001

This requires data, information & knowledge to be…
• Digital (not printed paper)
• Openly accessible (not behind barriers)
• Linked-up (not in silos)
Most of our output is not digital, open or linked
• 15-20k new spp. described annually (2M total) [1]
• 30k nomenclatural acts (12M total) [1]
• 20k phylogenies (750k total) [2]
• 31k taxa sequenced (360k taxa total) [3]
• 800k BioMed papers (40M total pp. of taxonomy) [4]
• Countless specimens, images, maps, keys…
Typically generated by small communities for “local” research projects

Figures from [1] Zhang, Zootaxa 2011 4, 1-4; [2] Web-of-Science; [3] GenBank and [4] PubMed.
ViBRANT
Virtual Biodiversity

A website for you & your community

Your data → Magic → Your web site

SEVENTH FRAMEWORK PROGRAMME
e-infrastructure

What are Scratchpads?
• Hosted websites for biodiversity data
• Virtual research & publication platform
• Completely open access & open source
• Modular & flexible

What Scratchpads are not!
• A single biodiversity database
• Restricted thematically, geographically or taxonomically
• A tool just for taxonomists
• Owned or controlled by anyone other than the data creator

How are Scratchpads funded?
[Funding timeline figure: 2007, 2011, 2014; funder logos include ViBRANT and the Seventh Framework Programme]

Scratchpads: biodiversity online

• Taxonomy & Literature: lice, mosquitos, freeloader flies, ... (rapid upload and management of names, synonyms & bibliographic data)
• Characters, Phylogeny & Specimens: termites, bryozoa, ... (character matrices exporting to SDD and Nexus format, phylogenies, specimen records & maps)
• eJournals: European Mosquito Bulletin, Phasmid Studies, ... (submission, review & dissemination of articles)
• Image Galleries: dragon trees, nannofossils, cockroaches, fungi, polychaetes, ... (rapid upload, annotation & display of images)
• Taxon descriptions & Publications: freeloader flies, fungus gnats, ... (publication of Scratchpad data in the ZooKeys journal and export to Encyclopedia of Life)
• Societies, Organisations & Projects: ICZN, GBIF, Sampled Red List Index for Plants, Global Plants Initiative ... (space for data collection, services, discussion & organisation)

[Chart: Sites, Users and Active Users by year, 2007 to 2012; the users axis runs to 7000 and the sites axis to 400; annotations mark ViBRANT and Scratchpads 2]

ViBRANT Goals
Vision: Connecting the people, data & science of biodiversity
Position: Open & sustainable development of a federated network of biodiversity informatics infrastructures
Mission: Facilitate the mobilisation, sharing, reuse and publication of biodiversity data
http://vbrant.eu

[Diagram of ViBRANT work areas. Labels: Training & outreach; Support services; Controlled vocabulary; Networking; Training; Standards; Mobilisation; Sociology; Data aggregation; Field recording; GBIF integration; Citizen science; Data standards; Visualisation; Scratchpads; Virtual Research Environment; Scratchpad hosting; Phylogeny tools; Bioclimatic modelling; Software integration; Identification tools; Matrix data editor; Data publishing; Service; Manuscript publishing; Sustainability; Communal literature; Research; Literature mark up; Architecture; Literature; Data mining]

[Diagram of data formats. Labels: Taxonomic Concept Schema XML; Nexus; Newick; CSV/tab; Excel file; EoL Transfer schema (SPM); SDD, Lucid, DwCA, Nexus; XML; CSV, XLS, RDF; Microsoft Word .DOC, TXT]

What can Scratchpads do?


• Taxon pages (generated from tagged content)
• Distribution maps (from specimens and TDWG regional distributions - Brummitt, 2001)
• Specimen records
• Bibliography management
• Images, video and sound (bulk import)
• Excel spreadsheet import
• Tabular data editing & character matrices
• Custom content
• User management
• Custom webforms
• Analytics
• Darwin Core Archive export (links to eMonocot Portal and EOL)
• EOL data import (taxonomy, species information)
• GBIF Map integration
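
The list above mentions character matrices, and the talk elsewhere names Nexus as one of the formats in play. As a rough illustration of what such a matrix looks like once written out as a Nexus DATA block, here is a minimal sketch. It is not Scratchpads code: the function name `to_nexus`, the binary symbol set and the character states in the usage note are all assumptions made up for the example.

```python
def to_nexus(matrix: dict[str, str]) -> str:
    """Serialise a {taxon: character-state string} mapping as a Nexus DATA block.

    Hypothetical helper for illustration only; assumes binary characters
    coded as '0'/'1', with '?' for missing data and '-' for gaps.
    """
    if not matrix:
        raise ValueError("matrix is empty")

    # Every taxon must be scored for the same number of characters.
    lengths = {len(states) for states in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must have the same number of characters")
    nchar = lengths.pop()

    # Unquoted Nexus taxon labels cannot contain spaces, so use underscores.
    rows = {name.replace(" ", "_"): states for name, states in matrix.items()}
    width = max(len(name) for name in rows)

    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(rows)} NCHAR={nchar};",
        '  FORMAT DATATYPE=STANDARD MISSING=? GAP=- SYMBOLS="01";',
        "  MATRIX",
    ]
    for name, states in rows.items():
        lines.append(f"    {name.ljust(width)}  {states}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"
```

For instance, `print(to_nexus({"Pediculus humanus": "010", "Pthirus pubis": "1?1"}))` writes a two-taxon, three-character block; the states shown are invented placeholders, not real scorings.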

http://www.comber.hcmr.gr


Oxford Batch Operations Engine
https://oboe.oerc.ox.ac.uk/

BDJ
The Biodiversity Data Journal
Making small data big!

Biodiversity Data Journal
A peer-reviewed open-access journal
ISSN 1314-2828 (online) ISSN 1314-2836 (print)
Editor-in-Chief: Vincent Smith, Natural History Museum, London, UK
http://www.pensoft.net/biodiversitydata

Publication workflow:
1. Define the publication
2. Enter metadata
3. Select taxa & content
4. Organise manuscript
5. Submit to journal

[Journal cover and workflow diagram. Labels: Articles; Bibliographies; Occurrence; Taxon treatments; Taxon names; Plazi; IPNI. Cover text as extracted: “Launched to accelerate biodiversity data journal”, “1t 2011”]


Acknowledgements
• Scratchpad technical development
  - Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton & Katherine Boulton
• Scratchpad outreach
  - Laurence Livermore & Dimitris Koureas
• eMonocot
  - Paul Wilkin & the Kew team, Charles Godfray & the Oxford team
• ViBRANT
  - Vince Smith, Dave Roberts & Lucy Reeve
• Our 7,000+ users


Thank you for your attention.

Any questions?
e-mail: enquiries@vbrant.eu
e-mail: scratchpad@nhm.ac.uk

http://vbrant.eu
http://scratchpads.eu

Roberts leiden110213
