Biodiversity Informatics



David P. Shorthouse, Université de Montréal
Trochosa terricola Thorell, 1856   Trochosa ruricola (De Geer, 1778)
What is biodiversity informatics?
How are biodiversity data used?
How are biodiversity data made available?
What are the key challenges?
What are its organizations?
Where can I go for more?
Bioinformatics
focused on the *omics
Biodiversity Informatics
interoperability of scientific names,
          classifications
History of “Biodiversity Informatics”

                           Canadian Biodiversity
                      Informatics Consortium (1993)


    John S. Whiting
Johnson, Norm F. 2007. Biodiversity informatics. Annu Rev Entomol. 52:421-38.




DOI 10.1146/annurev.ento.52.110405.091259
Who, What,
Where, When?
http://www.simplemappr.net
How are biodiversity data used?
Chapman, A. D. 2005. Uses of Primary
Species-Occurrence Data, version 1.0.
Report for the Global Biodiversity
Information Facility, Copenhagen.
Uses of Primary Occurrence Data
1 Taxonomy: research, indices, floras/faunas, field guides,
  phylogenies
2 Biogeography: distributional atlases, species distribution
  modeling, species decline
3 Life Histories and Phenologies
4 Endangered, Migratory, and Invasive Species
5 Impact of Climate Change
6 Ecology, Evolution and Genetics: habitat loss, ecosystem
  function
7 Environmental Planning: impact assessments
8 Conservation Planning: rapid biodiversity assessments,
  identifying priority areas, reserve selection, sustainable use
Uses of Primary Occurrence Data
9 Health and Public Safety: disease and disease
   vectors, bioterrorism, biosafety, parasitology
10 Bioprospecting
11 Border Control and Wildlife Trade
12 Education and Public Outreach
13 Ecotourism
14 Society and Politics: data repatriation
15 Recreational activities
DOI 10.7717/peerj.11
Dr. Jeremy Kerr, University of Ottawa
How are biodiversity data made available?
The Process

 Collect
 Prepare
 Digitize
 Standardize
 Publish
Collect
Why do we collect specimens?
Prepare
Creating a long-term voucher
   for scientific research
Specimen label
Primary biodiversity data
What, when, where & who
What?
Scientific name & classification
       •   Anemone narcissiflora
       •   Anemone parviflora
       •   Anemone richardsonii
       •   Arabis lyrata
       •   Caltha leptosepala
       •   Campanula lasiocarpa
       •   Cardamine umbellata
       •   Carex aquatilis
       •   Carex capillaris
       •   Carex enanderi
       •   Carex gynocrates
       •   Carex podocarpa
       •   Carex vaginata
       •   Claytonia sarmentosa
       •   Corydalis pauciflora
       •   Dodecatheon frigidum
       •   Draba crassifolia
       •   Dryas integrifolia
       •   Epilobium anagallidifolium
       •   Epilobium latifolium
       •   Equisetum variegatum
       •   Eriophorum angustifolium
       •   Eriophorum brachyantherum
When?
Date -> trends
Where?
Locality, elevation & habitat
Georeferencing
Locality description -> Coordinates
Who?
Collector -> history
Locked in paper format
    Not easily accessible
Digitize
Recording specimen information
       in a digital format
Standardize
Different database systems
     Different formats
    Different languages
Darwin Core
A common biodiversity
 information language
   bit.ly/DarwinCore
175 terms
Darwin Core Archive
  A common biodiversity
    information format
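A minimal sketch of what "a common language" can look like in practice: one specimen label expressed with Darwin Core term names and written out as a tab-delimited table, the kind of data file a Darwin Core Archive bundles alongside a descriptor. The term names are standard Darwin Core; every value, the identifier, and the helper name `to_tab_delimited` are invented for illustration.

```python
import csv
import io

# Hypothetical specimen label transcribed into Darwin Core terms.
# Term names are real Darwin Core terms; all values are made up for this sketch.
record = {
    "occurrenceID": "urn:example:specimen:0001",   # assumed identifier scheme
    "basisOfRecord": "PreservedSpecimen",
    "scientificName": "Anemone parviflora",        # what
    "eventDate": "1987-07-14",                     # when (ISO 8601)
    "country": "Canada",                           # where
    "locality": "alpine meadow (hypothetical locality)",
    "decimalLatitude": "61.25",
    "decimalLongitude": "-138.50",
    "recordedBy": "A. Collector",                  # who
}


def to_tab_delimited(records):
    """Serialize a list of dicts as a tab-delimited table with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


occurrence_txt = to_tab_delimited([record])
print(occurrence_txt)
```

Because every publisher uses the same column names, a harvester can read such a file without knowing anything about the database that produced it.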
Simplify
Database -> Standardize & publish (TAPIR protocol or Darwin Core Archive) -> Harvest -> GBIF’s central index -> User
Publish
         Make available online
GBIF Integrated Publishing Toolkit (IPT)
What are the key challenges?
DOI 10.1016/j.tree.2010.09.004
DOI 10.1007/11530084_8
Homonyms
same name for many taxa
Synonyms
  different names for the same taxon
Variant representations
   orthography, spelling,
differences in authority
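A rough sketch of how variant representations might be collapsed before names are compared. It assumes plain binomials whose first two tokens are the genus and species epithet; the function names `canonical_form` and `group_variants` are hypothetical. It does not address homonyms or synonyms, which need a curated reference list rather than string cleanup.

```python
from collections import defaultdict


def canonical_form(name_string):
    """Return 'Genus epithet', dropping authorship, extra spaces, and case noise.

    Assumption: the first two whitespace-separated tokens are the genus and
    the species epithet. Subgenera, hybrids, and infraspecific ranks are
    not handled by this sketch.
    """
    tokens = name_string.split()
    if len(tokens) < 2:
        raise ValueError(f"not a binomial: {name_string!r}")
    genus, epithet = tokens[0], tokens[1]
    return f"{genus.capitalize()} {epithet.lower()}"


def group_variants(name_strings):
    """Group raw name strings under their canonical form."""
    groups = defaultdict(list)
    for raw in name_strings:
        groups[canonical_form(raw)].append(raw)
    return dict(groups)


variants = [
    "Trochosa ruricola (De Geer, 1778)",
    "Trochosa ruricola (DeGeer 1778)",   # differences in authority
    "trochosa  ruricola",                # case and spacing
    "Trochosa terricola Thorell, 1856",
]
groups = group_variants(variants)
for canonical, raw_names in groups.items():
    print(canonical, "<-", raw_names)
```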
What are (a few of) the Biodiversity Informatics organizations?
Global Names
  Find: http://gnrd.*
  Atomize:
      …{
        genus: { epitheton: "Pardosa" },
        species: {
          basionymAuthorTeam: {
            year: "1892",
            authorTeam: "Banks",
            author: ["Banks"] },
          epitheton: "moesta",
          authorship: "Banks, 1892"
        }
      }…
  Index: http://gni.*
  Resolve: http://resolver.*
  Edit: http://gnite.org
  *.globalnames.org
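To make the "Atomize" step concrete, here is a toy sketch that splits one name string into a structure shaped like the one shown above. It is not the Global Names parser: the regular expression and the function name `atomize` are assumptions for illustration, and it only handles the simple form "Genus species Author, Year" with a single, unparenthesized author.

```python
import re

# Assumed pattern for "Genus species Author, Year" only.
NAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<species>[a-z-]+)\s+"
    r"(?P<author>[^,()]+),\s*"
    r"(?P<year>\d{4})$"
)


def atomize(name_string):
    """Split a simple binomial with authorship into its parts."""
    match = NAME_PATTERN.match(name_string.strip())
    if match is None:
        raise ValueError(f"unsupported name string: {name_string!r}")
    author = match.group("author").strip()
    year = match.group("year")
    return {
        "genus": {"epitheton": match.group("genus")},
        "species": {
            "epitheton": match.group("species"),
            "authorship": f"{author}, {year}",
            "basionymAuthorTeam": {
                "authorTeam": author,
                "author": [author],
                "year": year,
            },
        },
    }


parsed = atomize("Pardosa moesta Banks, 1892")
print(parsed)
```

Real name strings are far messier (parenthesized authors, author teams, infraspecific ranks), which is why a dedicated parser is used rather than a single pattern like this one.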
Applying Global Names Tools

[Bar chart: # names found (y-axis, 0 to 80) for Data Packages vs. Published PDF; t220 = 3.68, p = 0.0003]
What about Canadian Organizations?
  Federal Biodiversity Information Partnership
  Canadian Biodiversity Information Facility
  OBIS Canada
canadensys.net
A network
Of people and collections
Academic
11 universities, 5 botanical
  gardens & 2 museums
Canadensys Headquarters
Université de Montréal
Biodiversity Centre
35+ researchers
 Mainly systematists
30 collections
Plants, insects and fungi
13 mil. specimens
 2 out of 3 are insects
Goal
Mobilize 3 million specimen
  records (20%) by 2013
Carole Sinou
Data Publication Support Professional
Download
  Per dataset
Not very flexible
Checklists
Data about taxa (vs specimens)
   Now also supported by
      DwC-A, GBIF & IPT
VASCAN
Database of Vascular Plants of Canada
    data.canadensys.net/vascan
Data license
Allow data to be used
  bit.ly/cc0-for-data
Where can I go for more?
Biodiversity Informatics
  Commercialization
What is biodiversity informatics?
How are biodiversity data used?
How are biodiversity data made available?
What are the key challenges?
What are its organizations?
Where can I go for more?
Thanks!
        www.canadensys.net
           @canadensys
           @dpsSpiders
  david.shorthouse@umontreal.ca


David P. Shorthouse

Introduction to Biodiversity Informatics

Editor's Notes

  • #5 It aids in sequencing and annotating genomes and their observed mutations. It aids in the development of biological and gene ontologies to organize and query biological data. It plays a role in the analysis of gene and protein expression and regulation.
  • #6 Computerized handling of any biodiversity information. It typically builds on a foundation of taxonomic, biogeographic, or ecological information stored in digital form, which, with the application of modern computer techniques, can yield new ways to view and analyse existing information, as well as predictive models for information that does not yet exist.
  • #7 Coined by John Whiting in 1992 to cover the activities of an entity known as the Canadian Biodiversity Informatics Consortium, a group involved with fusing basic biodiversity information with environmental economics and geospatial information in the form of GPS and GIS. In Whiting's words: "I coined the term as part of a title for what was at first a loose affiliation between about five agencies (including myself, a firm specializing in GPS, a GIS firm, a firm specializing in database management, a firm specializing in environmental economics, and a representative of the Canadian Museum of Nature)." The term has since lost any obligate connection with the GPS/GIS world and is now associated with the computerized management of any aspects of biodiversity information.
  • #10 Biodiversity Informatics – 2004