20140317 pi b_nmbe_journal_club

Towards an (European) Open
Biodiversity Knowledge
Management System
Donat Agosti (Plazi, Bern)
March 17, 2014
Berne, Journal Club @ NMBE

El Bulli: Cooking in Progress (2011) Ferran Adrià (Actor), Gereon Wetzel (Director)

The cook (Ferran Adrià) wants to know which seafood he can
expect for his kitchen, and when.
He assumes that phenological data are open and
accessible to anyone.
He has a question and needs to know: What seafood
at what time?
His goal is to provide a service based on the use of
observation data, i.e. to treat you (and make some
money).

The fishmonger knows when what seafood is
available.
He considers his knowledge of seafood phenology
as his asset to make money.
His goal is to make money with knowledge based
on observation records and understanding the
characteristics of seafood.

What do YOU want to know?
How do YOU expect to get to your information?

• What are the main online resources you use?
• Do you maintain your own digital library?
• Do you participate in an online project, e.g.
Scratchpads, a catalogue or a digital archive, and
make your data accessible?
• … ?

What does this mean?
Meredith Lane, e-biosphere Conference, London 2009

Hardisty, Nature 502, 171 (2013)
BUT: predictive ecology has substantial data needs
Harfoot, BIH2013, Rome, 2013
The big question
What is the future of the biological world?
Imagine if we could:
…Predict community level dynamics of ecosystems at
scales from local to global, based on the ecology and
biology of all individual organisms

Decentralized biodiversity infrastructure
Plants
3,400 Herbaria worldwide
10,000 Associate curators and specialists
350,000,000 specimens in collections
180,000,000 specimens digitized
2,000,000,000 specimens including animals
Source: gbif.org; http://sciweb.nybg.org/science2/IndexHerbariorum.asp

200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year
Biodiversity libraries
BUT: The data are hidden
Incomplete digitization
Publications are unstructured
Collections are incomplete
Data are not linked
Most data are not open

Nationaal Herbarium Nederland collection on GBIF
Source: http://www.gbif.org/dataset/7b33b040-f762-11e1-a439-00145eb45e9a
One collection’s view of the world

Another collection’s view of the world
http://www.gbif.org/dataset/82b0f51c-f762-11e1-a439-00145eb45e9a

What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud

Names as information tags in life sciences
Names
Characteristics
Publications
Genes
Collections
Specimens
Distribution

The enhanced and linked treatments, extracted, stored on Plazi.org, and served in
a human readable form, are linked to the underlying data: Fisher & Smith, 2008,
PLoS ONE.

Towards an (European) Open
Biodiversity Knowledge
Management System

Coordination and Policy Development in Preparation for a
European Open Biodiversity Knowledge Management
System
Supported by the European Commission through its FP7 research funding programme
pro-iBiosphere

Create digital objects
+ Identifiers and resolvers
+ Open Access
+ Adequate infrastructure
+ Sustainable and permanent infrastructure
+ Reliable services for partners in research projects and society
Seamless Global Virtual Research Knowledge Management System
(European Open Biodiversity Knowledge Management System)
Biodiversity Knowledge Management System

Impact
The envisaged European Open Biodiversity Management System will:
• Support reliable and permanent open access to digital biodiversity
records
• Create identifiers and link biodiversity literature, collections, digital
objects, genes, etc.
• Ensure global interoperability and sharing of biodiversity data,
information and knowledge
• Provide new services in support of open science
• Provide the ground for modelling the biosphere
• Develop data policies to harness the potential of open access

Convert data into machine
readable data

Text
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name> Bohn &amp; Verhaagh
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.
1.30, SI 137, PW 0.73, ML 0.38. Mandible oute
to a sharp apical tooth, the apex parallel to
(Holotype with material in mandibles, so mand
$ described below from paratypes.) Median cly
....
</tax:p>
</tax:div>
</tax:treatment>
Enhanced and linked text
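
Once a treatment is marked up like this, a program can read the name parts directly instead of guessing them from running text. The following is a minimal sketch using Python's standard library. The element names follow the fragment above, but the two namespace URIs are placeholders invented for this example (the slide does not show them), and the sample document is a shortened, well-formed stand-in for the truncated fragment.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: assumptions for this sketch only,
# the slide does not show the real ones.
NS = {
    "tax": "http://example.org/ns/taxonx",
    "dc": "http://example.org/ns/darwincore",
}

# Shortened, well-formed stand-in for the fragment on the slide.
SAMPLE = """<tax:treatment xmlns:tax="http://example.org/ns/taxonx"
               xmlns:dc="http://example.org/ns/darwincore">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>"""


def extract_name(xml_text):
    """Return the structured name details of one marked-up treatment."""
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "source": xid.get("source"),
        "identifier": xid.get("identifier"),
    }


if __name__ == "__main__":
    print(extract_name(SAMPLE))
```

The same lookup on unstructured text would need name-recognition heuristics; with the markup in place it is a plain path query.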

Treatment
A publication or section of a publication documenting the
features or distribution of a related group of organisms
(called a “taxon”, plural “taxa”) in ways adhering to highly
formalized conventions.
http://terms.tdwg.org/wiki/tp:taxon-treatment
Catapano, 2010.

Treatments
[Diagram: a series of taxon treatments (X-us b-us; X-us b-us n. sp.; X-us c-us), each consisting of a citation, a description and materials citations.]

Treatments and references
[Diagram: the same treatments (citation, description, materials citations) shown alongside the articles that contain them; each article has a title and bibliographic references, and one of the articles is Systema naturae.]

Treatments can be cited, like
publications, with stable identifiers.

http://treatment.plazi.org/id/31F96F41E3E002BD88985A4F3A20E45A
Best practices for stable URIs:
http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs
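
A stable identifier is only useful if software can recognise and rebuild it reliably. The sketch below checks and constructs treatment URIs of the form shown above. It assumes that every identifier is 32 upper-case hexadecimal characters, which is inferred from the single example on this slide and not from any published specification; the function names are made up for illustration.

```python
import re

# Assumption: identifiers are 32 upper-case hexadecimal characters,
# inferred only from the single example URI on the slide.
TREATMENT_URI = re.compile(r"http://treatment\.plazi\.org/id/([0-9A-F]{32})")


def treatment_id(uri):
    """Return the identifier part of a treatment URI, or None if it does not match."""
    match = TREATMENT_URI.fullmatch(uri.strip())
    return match.group(1) if match else None


def treatment_uri(identifier):
    """Build the citable URI for an identifier, rejecting malformed input."""
    uri = "http://treatment.plazi.org/id/" + identifier.upper()
    if treatment_id(uri) is None:
        raise ValueError(f"not a valid treatment identifier: {identifier!r}")
    return uri


if __name__ == "__main__":
    example = "http://treatment.plazi.org/id/31F96F41E3E002BD88985A4F3A20E45A"
    print(treatment_id(example))
    print(treatment_uri("31f96f41e3e002bd88985a4f3a20e45a"))
```

Nothing here contacts the server; the point is only that a fixed URI pattern lets a citation be validated and normalised before it is stored or linked.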

Jeremy Miller, Work in Progress

Names can be linked automatically

Automated registration
[Workflow diagram: manuscript submission, peer review, manuscript accepted, XML query and XML response, article published, XML article metadata, taxon name available/valid (effectively published).]

Penestomus egazini Miller, Haddad & Griswold, 2010
Progress
Treatments (% complete): 4/4 (100%)
Data summary
Specimen records: 41
[Pie chart: adult female, adult male, other; segment labels 51%, 2%, 46%]
Specimen collections
Institutions: 3
Muséum National d'Histoire Naturelle, Paris
California Academy of Sciences, San Francisco
Albany Museum, Grahamstown
[Pie chart: segment labels 2%, 5%, 76%, 20%]
Distribution
Countries: Lesotho, South Africa
Georeferenced materials citations
Export species materials citations (DwC)
Export treatment materials citations (DwC)
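
A dashboard like this can be derived from the materials citations once they are structured records. The following is a rough sketch, not the dashboard's actual code: it writes records to a CSV whose column headers are Darwin Core term names and computes percentage shares per field. The four sample rows are invented for illustration and do not reproduce the 41 specimen records summarised above.

```python
import csv
import io
from collections import Counter

# Darwin Core term names used as column headers.
DWC_FIELDS = ["scientificName", "country", "institutionCode", "sex"]

# Invented rows for illustration only; not the real specimen data.
SAMPLE = [
    {"scientificName": "Penestomus egazini", "country": "South Africa",
     "institutionCode": "CAS", "sex": "female"},
    {"scientificName": "Penestomus egazini", "country": "South Africa",
     "institutionCode": "CAS", "sex": "male"},
    {"scientificName": "Penestomus egazini", "country": "Lesotho",
     "institutionCode": "MNHN", "sex": "female"},
    {"scientificName": "Penestomus egazini", "country": "South Africa",
     "institutionCode": "CAS", "sex": "female"},
]


def to_dwc_csv(records):
    """Serialise records as CSV text with Darwin Core column headers."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({field: rec.get(field, "") for field in DWC_FIELDS})
    return buf.getvalue()


def share_by(records, field):
    """Return the percentage of records per value of `field`, rounded to whole numbers."""
    counts = Counter(rec.get(field, "") for rec in records)
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


if __name__ == "__main__":
    print(to_dwc_csv(SAMPLE))
    print(share_by(SAMPLE, "country"))
```

Using shared term names as headers is what lets an aggregator read the export without a custom mapping for each source.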

American Museum of Natural History
Data summary
Materials citations 2004-2013: 111,364
[Bar chart: Materials Citations Records by Researcher; series: Other, Donat Agosti, David Grimaldi, Toby Schuh, James Carpenter, Norman Platnick; axis from 0 to 20,000 records]

Zootaxa
Data summary
Materials citations 2004-2013: 11,476
[Bar chart: Materials Citations Records by Institution; series: Other; Muséum National d'Histoire Naturelle, Paris; Natural History Museum, London; Museum of Comparative Zoology; Smithsonian Institution; American Museum of Natural History; axis from 0 to 2,500 records]

Better:
Create data as machine readable
data

Legacy and new taxonomic literature
[Workflow diagram with two routes to a unified marked-up final output (taxon treatments, keys, images, localities):
Prospective publishing: automated submission and peer review; TaxPub XML schema; Pensoft mark-up tool.
Historical literature: TaxonX schema; Plazi's GoldenGATE editor.
The marked-up publications (PDF, HTML and XML) feed archiving, wikis (Species-ID, Wikispecies, Wikipedia), indexing (IPNI, ZooBank, MycoBank, GNA), aggregators (EOL, GBIF), electronic archives and data centers, and content management systems and repositories (e.g., Plazi, EOL, GBIF, Scratchpads, EDIT), and through them the end users.]

http://biodiversitydatajournal.com/articles.php?id=995

Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire
body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)

Before antbase.org, Harvard's Museum of
Comparative Zoology could claim to be the only
location with a complete set of ant systematics
publications from 1758 to the present.

Through antbase.org's
digital library, access
to this body of
literature is worldwide,
and it is actively used
(>10,000 visits in a
single month).

Bouchout Declaration, 2014
Implementation by the
Swiss National Science
Foundation, 2007
Berlin Declaration, 2003

• The free and open use of content, services and other digital resources
about biodiversity;
• Licenses that grant all users a free, irrevocable, world-wide right to
copy, use, distribute, transmit and display the work publicly as well as
to build on the work and to make derivative works, subject to proper
attribution consistent with community practices;
• Policy developments that will foster free and open access to biodiversity
data;
• Tracking the use of information to ensure that sources and suppliers of
data are assigned credit for their contributions;
• An agreed infrastructure, standards and protocols to improve access to
and use of open data;
Bouchout Declaration, 2014 (1)

• Registers for content and services to allow discovery, access and use of
open data;
• Persistent, dereferenceable identifiers for data objects and physical
objects such as specimens, images and taxonomic treatments;
• Linking data using agreed vocabularies, both within and beyond
biodiversity, that enable participation in the Linked Open Data Cloud;
• Dialogue coordinated by the leading signatories to refine the concept,
priorities and technical requirements of Open Biodiversity Knowledge
Management.
• A sustainable Open Biodiversity Knowledge Management that is
attentive to scientific, sociological, legal, and financial aspects.
Bouchout Declaration, 2014 (2)

Reduce costs – future publishing

Don’t waste money:
Focus on Open Access enhanced
linked publications – not PDF only

Founded in 2008
Swiss-based NGO with members in
Switzerland, Germany, Bulgaria, the US and
Iran
Research-based think tank with the
mission to promote open access to
scientific content
Five pillars: legal advice,
technical innovations and solutions,
maintenance of a treatment repository
and Biowikifarm, consultancy, advocacy

Modify copyright legislation to better
serve scientific needs

TaxPub                      TaxonX
DTD                         Schema
Prospective publications    Legacy publications
Constrained                 Loose
Derivative of JATS          Independent
Self-contained              Allows import of other schemas

Plazi Search and Retrieval Server: access to data
[Diagram: the server offers the data, including Darwin Core Archives, to you, both as a human reader and as a machine.]

Plazi GmbH founded in 2012 as a
service SME owned by Plazi

Funding from public donors, e.g. the EU,
corporate and private

Funding:
EU
EU-BON
pro-iBiosphere
Private sector
In-kind
Voluntary work

Clients are global

Consultancies and Services:
Consulting publishers on how to
produce semantically enhanced XML
output (e.g. EJT, Zootaxa, Smithsonian
Institution)
Service to mark up literature

http://plazi.org
Thank you very much!
Donat Agosti
agosti@plazi.org
This project is funded under the European Union's Seventh Framework
Programme (FP7/2007-2013) under grant agreement №312848.
