Biodiversity Informatics
David P. Shorthouse, Université de Montréal
© Mr.checker (CC-SA 3.0 Unported)
What is biodiversity informatics?
How are biodiversity data used?
How are biodiversity data made available?
What are the key challenges?
What are its organizations?
Where can I go for more?
Bioinformatics
focused on the *omics
Biodiversity Informatics
interoperability of scientific
names, classifications
History of “Biodiversity Informatics”
John S. Whiting
Canadian Biodiversity
Informatics Consortium (1993)
Johnson, N. F. 2007. Biodiversity
informatics. Annu. Rev. Entomol. 52:421-438.
DOI 10.1146/annurev.ento.52.110405.091259
Who, What,
Where, When?
How are biodiversity data used?
Chapman, A. D. 2005. Uses of Primary
Species-Occurrence Data, version 1.0.
Report for the Global Biodiversity
Information Facility, Copenhagen.
http://www.gbif.org/resources/2834
1 Taxonomy: research, indices, floras/faunas, field
guides, phylogenies
2 Biogeography: distributional atlases, species
distribution modeling, species decline
3 Life Histories and Phenologies
4 Endangered, Migratory, and Invasive Species
5 Impact of Climate Change
6 Ecology, Evolution and Genetics: habitat
loss, ecosystem function
7 Environmental Planning: impact assessments
Uses of Primary Occurrence Data
8 Conservation Planning: rapid biodiversity
assessments, identifying priority areas, reserve
selection, sustainable use
9 Health and Public Safety: disease and disease
vectors, bioterrorism, biosafety, parasitology
10 Bioprospecting
11 Border Control and Wildlife Trade
12 Education and Public Outreach
13 Ecotourism
14 Society and Politics: data repatriation
15 Recreational activities
DOI 10.7717/peerj.11
DOI 10.1038/nature12872
How are biodiversity data made available?
The Process
Collect
Prepare
Digitize
Standardize
Publish
Collect
© Ainsley Seago
Prepare
Creating a long-term voucher
for scientific research
Specimen label
Primary biodiversity data
What, when, where & who
Digitize
Recording specimen information
in a digital format
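A minimal sketch of what digitizing one specimen label might produce, written in Python. The class name, field names and label values are all hypothetical; they only illustrate the what, when, where and who of primary biodiversity data.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class SpecimenRecord:
    """One digitized specimen label: what, when, where and who."""
    scientific_name: str  # what
    collected_on: date    # when
    locality: str         # where (verbatim label text)
    latitude: float       # where (decimal degrees)
    longitude: float      # where (decimal degrees)
    collector: str        # who


# Hypothetical label transcription, for illustration only.
record = SpecimenRecord(
    scientific_name="Pardosa moesta Banks, 1892",
    collected_on=date(2013, 6, 15),
    locality="Example Bog, 5 km N of Sampletown",
    latitude=45.5,
    longitude=-73.6,
    collector="A. Collector",
)

print(asdict(record))
```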
Standardize
Different database systems
Different formats
Different languages
Darwin Core
A common biodiversity
information language
bit.ly/DarwinCore
175 terms
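To illustrate how a shared vocabulary bridges different databases and languages, the sketch below renames local column names to Darwin Core terms. The local column names and sample rows are hypothetical; the target terms (scientificName, eventDate, decimalLatitude, decimalLongitude, recordedBy, catalogNumber) are Darwin Core terms.

```python
# Hypothetical local column names (English and French databases),
# each mapped to a Darwin Core term.
FIELD_MAP = {
    "species": "scientificName",
    "nom_scientifique": "scientificName",
    "date_collected": "eventDate",
    "date_de_recolte": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collector": "recordedBy",
    "recolteur": "recordedBy",
    "catalog_no": "catalogNumber",
}


def to_darwin_core(row):
    """Rename known local fields to Darwin Core terms; drop unmapped ones."""
    return {FIELD_MAP[key]: value for key, value in row.items() if key in FIELD_MAP}


english_row = {
    "species": "Pardosa moesta",
    "date_collected": "2013-06-15",
    "collector": "A. Collector",
}
french_row = {
    "nom_scientifique": "Pardosa moesta",
    "date_de_recolte": "2013-06-15",
    "recolteur": "A. Collector",
}

# Two differently labelled sources become directly comparable.
print(to_darwin_core(english_row) == to_darwin_core(french_row))
```

Once both sources use the same terms, records can be merged, compared or published without per-database special cases.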
Darwin Core Archive
A common biodiversity
information format
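A rough sketch of the packaging idea: a Darwin Core Archive is a zip file holding a delimited core data file plus a meta.xml descriptor that says which Darwin Core term each column holds. The file name occurrence.txt, the column choice and the sample row are assumptions for illustration. Real archives normally also carry a dataset metadata document, which is omitted here.

```python
import csv
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"
COLUMNS = ["occurrenceID", "scientificName", "eventDate", "recordedBy"]

META_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"
        ignoreHeaderLines="1" rowType="{dwc}Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
{fields}
  </core>
</archive>
"""


def build_meta_xml(columns):
    """Describe the core file so a consumer knows what each column means."""
    fields = "\n".join(
        '    <field index="{}" term="{}{}"/>'.format(i, DWC, name)
        for i, name in enumerate(columns)
        if i > 0  # column 0 is declared as the record id in the template
    )
    return META_TEMPLATE.format(dwc=DWC, fields=fields)


def build_archive(rows):
    """Return the bytes of a zip holding occurrence.txt and meta.xml."""
    data = io.StringIO()
    writer = csv.writer(data, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)  # header row, skipped via ignoreHeaderLines="1"
    for row in rows:
        writer.writerow([row[column] for column in COLUMNS])
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", data.getvalue())
        archive.writestr("meta.xml", build_meta_xml(COLUMNS))
    return buffer.getvalue()


# Hypothetical record, for illustration only.
rows = [{
    "occurrenceID": "example-0001",
    "scientificName": "Pardosa moesta Banks, 1892",
    "eventDate": "2013-06-15",
    "recordedBy": "A. Collector",
}]
archive_bytes = build_archive(rows)

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(sorted(archive.namelist()))
```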
Publish
Make available online
GBIF Integrated Publishing Toolkit (IPT)
What Other Kinds of Data?
Images
Observations
Phylogenetic Trees
Graphs
Unstructured texts
Taxonomic lists
What are the key challenges?
Scientific Names
DOI 10.1007/11530084_8
Homonyms
the same name for different taxa
Synonyms
different names for the same taxon
Variant representations
orthography, spelling,
differences in authority
DOI 10.1016/j.tree.2010.09.004
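The three name problems above can be sketched with simple lookups. This is a toy illustration, not how any real name service works. "Aus bus" and "Aus cus" are placeholder names, the lookup tables are hypothetical, and the genus name Ficus is used as a homonym because it is applied to both a plant genus and an animal genus.

```python
import re

# Illustrative lookup tables, not real reference data.
SYNONYMS = {"Aus cus": "Aus bus"}              # synonym -> accepted name
HOMONYMS = {"Ficus": ["Plantae", "Animalia"]}  # one name, several taxa


def canonical(name):
    """Reduce a name string to 'Genus epithet', ignoring authorship variants."""
    words = re.sub(r"\s+", " ", name.strip()).split(" ")
    genus = words[0].capitalize()
    # A lowercase second word is treated as the species epithet.
    if len(words) > 1 and words[1].islower():
        return genus + " " + words[1]
    return genus


def resolve(name):
    """Map a verbatim name to an accepted name and flag genus-level homonyms."""
    key = canonical(name)
    accepted = SYNONYMS.get(key, key)
    genus = accepted.split(" ")[0]
    return {
        "input": name,
        "accepted": accepted,
        "ambiguous_in": HOMONYMS.get(genus, []),
    }


# Variant representations collapse to one key.
print(canonical("Pardosa moesta Banks, 1892"))
print(canonical("Pardosa  moesta (Banks)"))
# A synonym resolves to its accepted name.
print(resolve("Aus cus Smith")["accepted"])
# A homonym needs extra context (here, the kingdom) to be disambiguated.
print(resolve("Ficus L.")["ambiguous_in"])
```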
Globally Unique Identifiers
Data Quality and Fitness-for-Use
Giving Credit for Participation &
Metrics of Success
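As a small illustration of the data quality challenge listed above, the sketch below flags common problems in one occurrence record. The function name and the chosen checks are assumptions. Whether a flagged record is still fit for use depends on the question being asked of it.

```python
from datetime import date


def quality_flags(record):
    """Return a list of problems found in one Darwin Core style record."""
    flags = []
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if lat is None or lon is None:
        flags.append("missing coordinates")
    else:
        if not -90 <= lat <= 90:
            flags.append("latitude out of range")
        if not -180 <= lon <= 180:
            flags.append("longitude out of range")
        if lat == 0 and lon == 0:
            flags.append("coordinates are 0,0")
    try:
        collected = date.fromisoformat(record.get("eventDate", ""))
        if collected > date.today():
            flags.append("eventDate in the future")
    except (TypeError, ValueError):
        flags.append("eventDate missing or not ISO 8601 (YYYY-MM-DD)")
    return flags


# Hypothetical records, for illustration only.
good = {"decimalLatitude": 45.5, "decimalLongitude": -73.6, "eventDate": "2013-06-15"}
bad = {"decimalLatitude": 145.5, "decimalLongitude": -73.6, "eventDate": "15/06/2013"}

print(quality_flags(good))
print(quality_flags(bad))
```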
What are (a few of) the Biodiversity
Informatics organizations?
*.globalnames.org
Edit
http://gnite.org
Index
http://gni.*
Atomize
…{
  genus: { epitheton: "Pardosa" },
  species: {
    basionymAuthorTeam: {
      year: "1892",
      authorTeam: "Banks",
      author: ["Banks"]
    },
    epitheton: "moesta",
    authorship: "Banks, 1892"
  }
}…
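To show what atomizing involves, here is a naive regular-expression sketch that splits a simple "Genus species Author, Year" string into a structure resembling the fragment above. It is not the Global Names parser. The pattern and function name are hypothetical, and it treats the authorship as a single author, so it covers only the simplest names.

```python
import re

# Matches only simple binomials such as "Pardosa moesta Banks, 1892".
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<species>[a-z]+)\s+"
    r"(?P<author>[^,\d]+?),?\s+"
    r"(?P<year>\d{4})$"
)


def atomize(name):
    """Split a simple name string into its parts, or return None."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        return None
    author = match.group("author").strip()
    year = match.group("year")
    return {
        "genus": {"epitheton": match.group("genus")},
        "species": {
            "epitheton": match.group("species"),
            "authorship": "{}, {}".format(author, year),
            "basionymAuthorTeam": {
                "authorTeam": author,
                "author": [author],  # simplification: one author only
                "year": year,
            },
        },
    }


parsed = atomize("Pardosa moesta Banks, 1892")
print(parsed["genus"]["epitheton"], parsed["species"]["epitheton"])
```

Real names include subgenera, infraspecific ranks, author teams, parenthesized basionym authors and hybrids, which is why a dedicated parser is needed in practice.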
Resolve
http://resolver.*
Find
http://gnrd.*
Global Names
What about Canadian Organizations?
Federal Biodiversity Information Partnership
Canadian Biodiversity Information Facility
OBIS Canada
canadensys.net
Academic
11 universities, 5 botanical
gardens & 2 museums
35+ researchers
30 collections
Plants, insects and fungi
Canadensys Headquarters
Université de Montréal
Biodiversity Centre
13 million specimens
2 out of 3 are insects
Goal
Mobilize 3 million specimen
records (20%)
Download
Per dataset
Not very flexible
Checklists
Data about taxa (vs specimens)
also supported by
DwC-A, GBIF & IPT
VASCAN
Database of Vascular Plants of Canada
data.canadensys.net/vascan
Biological Survey of Canada
The Biota of Canada
http://www.biologicalsurvey.ca
Data license
Allow data to be used
bit.ly/cc0-for-data
Where can I go for more?
Social Venues
TAXACOM
TDWG
Canadensys Google Group
iDigBio
ECN-L
GitHub
Twitter
What Skills/Technologies
Might I Need?
Web programming: HTML5, CSS
Relational databases: PostgreSQL/PostGIS,
MySQL
NoSQL data stores: Neo4j, CouchDB
Programming languages: R, Python, Ruby, Java,
JavaScript
Creativity with data: dynamic visualizations
Biodiversity Informatics
Commercialization
iekho.com
Branché
www.canadensys.net
@canadensys
@dpsSpiders
david.shorthouse@umontreal.ca
David P. Shorthouse

2014.04.01 Shorthouse REDM400