RETRIEVING USEFUL INFORMATION FROM CONNECTED SPECIMEN- AND DATA COLLECTIONS

Rutger Vos, 22 March 2012
Conceptual design of future databases: sense and nonsense, how to proceed jointly?

Outline
   NCB Naturalis
   Collections of physical and digital objects
   Examples of research and services
   Linking specimens and data
   Future developments
   Conclusions




Specimen- and data collections, Rutger Vos
NCB Naturalis
   Netherlands Centre for Biodiversity and national natural history museum
   37 million physical objects
   In the global top 5 of natural history museums

Biological specimen collections

 Natural history museums, which evolved from cabinets of curiosities, played an important role in the emergence of professional biological disciplines and research programs. Particularly in the 19th century, scientists began to use their natural history collections as teaching tools for advanced students and the basis for their own morphological research.

Biological data

   Research on biological collections generates many kinds of publicly available data:
      Global molecular databases (NCBI)
      Global Biodiversity Information Facility (GBIF)
      Barcode of Life Data System (BOLD)
      Domain-specific databases (e.g. TreeBASE)

Big data?

   Large data sets of various types:
      NGS sequence data
      GIS occurrence data
      Digitization
      Identification keys




NCB services and research

 Research
   Terrestrial and marine zoology, geology and botany
   Fundamental and applied research
   Significant NGS applications

 Services
   Advise customs on traded endangered species
   Identify birds from plane crashes
   Identify hardwoods
   Identify gemstones


Example: orchid genomics

   NCB Naturalis and BGI scientists are mapping the first fully sequenced orchid genome (Erycina pusilla)
   Study of developmental genes coding for floral shape, symmetry, scent and senescence
   Many genes found to have horticultural applications


Example: DNA barcoding of traditional Chinese medicine (TCM)

   Orchids have long been used in China and are now also increasingly popular in Europe
   They require identification to ensure they do not contain:
      legally protected wild species
      species other than those mentioned on the label (= adulteration)
      life-threatening poisons, in the case of toxic substitutes

Example: the biodiversity crisis
   1400 modeled species distributions (red = loss; green = gain)
   [Figure: three map panels showing 2050 minus 2010 = change (2050-2010)]

Example: snake venom and medicine

   NCB Naturalis scientists are mapping the King Cobra genome
   Studying its evolution in broader comparative context
   Many proteins in venom might have medical applications

Linking physical and digital objects

   Physical objects are linked to digital data along various axes:
      Specimen identifiers
      Georeferences
      Classification
      Characters
      Literature



Links between data and specimens

   Primary voucher identifier is: institution:collection:specimen
   Several databases use these for cross-referencing
   Unfortunately not (yet, universally) resolvable*

 * http://iphylo.blogspot.com/
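
The institution:collection:specimen pattern above can be illustrated with a minimal sketch that splits such an identifier into its three parts. The names `VoucherId` and `parse_voucher` and the sample value `XYZ:Herp:12345` are hypothetical and chosen only for illustration; real databases may write the triplet with other separators or conventions.

```python
from typing import NamedTuple


class VoucherId(NamedTuple):
    """The three parts of an institution:collection:specimen identifier."""
    institution: str
    collection: str
    specimen: str


def parse_voucher(text: str) -> VoucherId:
    """Split a colon-separated voucher identifier into its three parts."""
    parts = [p.strip() for p in text.split(":")]
    # Reject anything that does not have exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected institution:collection:specimen, got {text!r}")
    return VoucherId(*parts)


# Hypothetical identifier, made up for illustration only.
voucher = parse_voucher("XYZ:Herp:12345")
print(voucher.institution, voucher.collection, voucher.specimen)
```

Parsing gives a structured key for cross-referencing, but it does not make the identifier resolvable: nothing in the triplet says where the specimen record can be fetched.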



The future

    Globally unique, resolvable identifiers
    Resolution results in standards-compliant open data
    Data discoverable to all by its links to other data
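
One possible reading of "globally unique, resolvable" is an HTTP URI built from the parts of a voucher identifier. The sketch below assumes a resolver base URL, `https://resolver.example.org/specimen/`, which is purely hypothetical, as are the function name `to_resolvable_uri` and the sample values; the slides do not prescribe any particular scheme.

```python
from urllib.parse import quote

# Hypothetical resolver base URL; no such service is implied by the slides.
RESOLVER_BASE = "https://resolver.example.org/specimen/"


def to_resolvable_uri(institution: str, collection: str, specimen: str) -> str:
    """Build one HTTP URI from the three parts of a voucher identifier."""
    # Percent-encode each part so spaces or slashes cannot break the path.
    path = "/".join(
        quote(part, safe="") for part in (institution, collection, specimen)
    )
    return RESOLVER_BASE + path


# Hypothetical values, made up for illustration only.
print(to_resolvable_uri("XYZ", "Herp", "12345"))
```

The design point is that the identifier itself says where to look, so any database holding it can link to the specimen record without a private lookup table.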

Conclusions

   Stakeholders in neighbouring domains need to identify where physical and digital objects can be linked usefully
   Stakeholders need to engage in community processes for standards development and adoption to enable data sharing
   Complexity needs to be managed collaboratively




Acknowledgements
     Thank you:
        For your attention
        To our gracious hosts today
        To the organizers of this visit

     謝謝 (thank you)!
