EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
 


Visit to the NI Vavilov Institute for Plant Industry (VIR) in April 2010. Installation of the GBIF IPT toolkit for data publishing as a test upgrade for the EURISCO data infrastructure of European genebanks.

Usage Rights

CC Attribution License

  • IMAGE: http://blog.tapirtype.com/cartoons/ [Creative Commons License: http://creativecommons.org/licenses/by-nc-sa/3.0/us/]
  • Darwin core
  • Technology overview – IPT role
  • PPT: Nick King and Vishwas Chavan, Albuquerque, 2-7 Aug. 2009

Presentation Transcript

  • Web service demo for EURISCO
    GBIF Tools and Darwin Core extension for germplasm
    N.I. Vavilov Research Institute of Plant Industry (VIR), April 26th – 29th 2010, St Petersburg, Russian Federation
    Dag Endresen, Jonas Nordling, Nordic Genetic Resources Center (NordGen)
  • Topics for this session
    • Web service installations for EURISCO
    • Overview of the current project
    • Darwin Core and the extension for germplasm
    • GBIF informatics tools
    • Integrated Publishing Toolkit (IPT)
    • Distributed datasets
  • Possible Upgraded PGR Network Model
    • The gene bank dataset is shared from the holding gene bank.
    • The National Inventory (NI) endorses all national gene banks (and eventually individual accessions) for EURISCO.
    • ECPGR Crop databases can access passport data from EURISCO and additional crop specific data from the genebank IPT interface.
    • Standard data sharing tools ensure that the genebank dataset is available to other relevant decentralized thematic, regional or global networks.
  • Objectives of the EURISCO demo project
    • Evaluate the GBIF decentralized architecture
    • Install the IPT at 8 genebanks in Europe that, as far as possible, are also EURISCO/ECPGR partners.
    • Test the registration of IPT installations through the GBIF registry, the Global Biodiversity Resources Discovery System (GBRDS).
    • Test the Harvesting and Indexing Toolkit (HIT) installation for the EURISCO platform (Bioversity HQ, Rome).
    • Project runs until 20 December 2010.
  • 2010 : IPT installations for EURISCO
    • EURISCO
    • NordGen (Nordic)
    • Bioversity-Montpellier (France)
    • IPK Gatersleben (Germany)
    • BLE (Germany)
    • WUR CGN (The Netherlands)
    • CRI (Czech Republic)
    • VIR (Russian Federation)
    • SeedNET (Balkan)
    • Baltic (Estonia, Latvia, Lithuania)
  • 2005 : BioCASE demo
    http://chm.grinfo.net/
  • Potential of the GBIF technology
    Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
    The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (TDWG, GBIF).
    http://data.gbif.org/datasets/network/2
  • Darwin Core
    The purpose of DwC terms is to facilitate data sharing
    • a well-defined standard core vocabulary
    • a flexible framework to maximize re-usability
    The Darwin Core can be extended by adding new terms to share additional information.
    Approved as a TDWG standard in 2009
    “The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information.”
    http://rs.tdwg.org/dwc/
  • DwC extension for germplasm
    DwC Germplasm : DRAFT 0.1 : August 26, 2009
    • “MCPD in Darwin Core”
    • Maintained by gene banks worldwide
    • Additional terms to describe germplasm samples
    • Includes the new terms for crop trait experiments developed as part of the European EPGRIS3 project
    • Includes a few additional terms for new international crop treaty regulations
    http://code.google.com/p/darwincore-germplasm
    http://rs.nordgen.org/dwc
  • Mapping of DwC-G terms to the MCPD descriptors (EURISCO data exchange format)
  • Mapping of DwC-G terms to the MCPD descriptors (continued)
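The descriptor mapping on these two slides survives only as slide titles in this transcript. As a rough illustration of what such a mapping involves, the sketch below converts a small MCPD-style record to Darwin Core term names. The term pairs are an illustrative subset chosen here, not the project's DwC-G mapping table. The function names and sample values are hypothetical, and the coordinate parser assumes the MCPD degree-minute-second encoding with a trailing hemisphere letter and hyphens for missing minutes or seconds.

```python
# Illustrative subset only (an assumption, NOT the mapping table from the slides).
MCPD_TO_DWC = {
    "INSTCODE": "institutionCode",
    "ACCENUMB": "catalogNumber",
    "COLLNUMB": "recordNumber",
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "SPAUTHOR": "scientificNameAuthorship",
    "COLLSITE": "locality",
}


def mcpd_coordinate_to_decimal(value):
    """Convert e.g. '5956--N' (latitude) or '0301500E' (longitude) to decimal degrees.

    Assumes DDMMSS + N/S for latitude and DDDMMSS + E/W for longitude,
    with '--' standing in for missing minutes or seconds.
    """
    value = value.strip()
    if not value:
        raise ValueError("empty coordinate")
    hemisphere = value[-1].upper()
    digits = value[:-1]
    if hemisphere in ("N", "S") and len(digits) == 6:
        degree_length = 2
    elif hemisphere in ("E", "W") and len(digits) == 7:
        degree_length = 3
    else:
        raise ValueError(f"unrecognised coordinate: {value!r}")

    def part(text):
        # A run of hyphens means the value was not recorded; treat it as zero.
        return 0 if set(text) == {"-"} else int(text)

    degrees = int(digits[:degree_length])
    minutes = part(digits[degree_length:degree_length + 2])
    seconds = part(digits[degree_length + 2:degree_length + 4])
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in ("S", "W") else decimal


def to_darwin_core(record):
    """Rename mapped descriptors and convert coordinates; unmapped keys are dropped."""
    result = {}
    for key, value in record.items():
        if key in MCPD_TO_DWC:
            result[MCPD_TO_DWC[key]] = value
        elif key == "LATITUDE":
            result["decimalLatitude"] = mcpd_coordinate_to_decimal(value)
        elif key == "LONGITUDE":
            result["decimalLongitude"] = mcpd_coordinate_to_decimal(value)
    return result


if __name__ == "__main__":
    # Hypothetical sample record; the accession number is made up.
    sample = {
        "INSTCODE": "RUS001",
        "ACCENUMB": "k-00001",
        "GENUS": "Triticum",
        "SPECIES": "aestivum",
        "LATITUDE": "5956--N",
        "LONGITUDE": "0301500E",
    }
    print(to_darwin_core(sample))
```

Keeping the rename table as plain data separates the vocabulary mapping from value conversions such as coordinates, which need real parsing rather than a new name.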
  • MCPD -> ABCD 2.06 (2004) for BioCASE
    National Inventory Code
    Institute Code
    Accession Number
    Collecting Number
    Collecting Institute Code
    Genus
    Species
    Species Authority
    „Subtaxa“
    „Subtaxa“ Authority
    Common Crop Name
    Accession Name
    Acquisition Date
    Donor Institute Code
    Donor Accession Number
    Other Identification (Number) associated with the accession
    Location of Safety Duplicates
    Type of Germplasm Storage
    Remarks
    Decoded Collecting Institute
    Decoded Breeding Institute
    Decoded Donor Institute
    Decoded Safety Duplication Location
    Accession URL
    Country of Origin
    Location of Collection Site
    Latitude of CS
    Longitude of CS
    Elevation of CS
    Collecting Date of Sample
    Breeding Institute Code
    Biological Status of Accession
    Ancestral Data
    Collecting/Acquisition Source
    Helmut Knüpffer
    IPK Gatersleben
    http://www.ecpgr.cgiar.org/epgris/Tech_papers/EURISCO_Descriptors.pdf
    Walter Berendsohn
    BGBM
  • GBIF Informatics Suite
    • GBIF tools to empower decentralized thematic or regional networks
    • Darwin Core extension for germplasm makes these tools usable for crop gene banks.
  • Integrated Publishing Toolkit (IPT)
    A tool for data publishers.
    A simple mechanism to share primary biodiversity data following the Darwin Core standard.
    Open source, Java based web application.
    Provides a local tool for data quality assessment, etc.
    • Embeds its own database
    • Multilingual
    • Has a user management feature based on roles, which allows for multiple data managers to share a common instance
    • Manages multiple data sources
    • Several upload options: relational database management systems or data files
    • Public web interface allows for data browsing and full text search
    • Customised detail pages
  • The IPT user interface includes the germplasm extension
  • XML interface includes the germplasm extension
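The screenshot for this slide is not part of the transcript. As a minimal sketch of how core and extension terms can sit side by side in one XML record, the example below serialises a record with two namespaces. The Darwin Core namespace is the published one. The germplasm namespace URI, the `record` wrapper element, the `biologicalStatus` term name and the sample values are placeholders assumed for illustration, and the layout is not the IPT's actual response format.

```python
import xml.etree.ElementTree as ET

DWC_NS = "http://rs.tdwg.org/dwc/terms/"
# Hypothetical placeholder; the real extension namespace is not given in the slides.
GERMPLASM_NS = "http://example.org/dwc/germplasm/"


def build_record(core_terms, extension_terms):
    """Serialise one record, keeping core and extension terms in separate namespaces."""
    ET.register_namespace("dwc", DWC_NS)
    ET.register_namespace("dwcg", GERMPLASM_NS)
    record = ET.Element("record")  # wrapper element name is an assumption
    for term, value in core_terms.items():
        ET.SubElement(record, f"{{{DWC_NS}}}{term}").text = value
    for term, value in extension_terms.items():
        ET.SubElement(record, f"{{{GERMPLASM_NS}}}{term}").text = value
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_record(
        {"institutionCode": "RUS001", "catalogNumber": "k-00001"},  # made-up accession
        {"biologicalStatus": "300"},  # hypothetical extension term and value
    )
    print(xml_text)
```

Because the extension lives in its own namespace, a consumer that only understands the core vocabulary can ignore the extra elements and still read the record.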
  • European ECPGR Crop Databases
    European EURISCO Catalog
    VIR (RUS001)
    Passport data
    Global Crop Registries
    VIR (RUS001)
    Crop departments
  • Same dataset available from multiple information systems...
    ?!
    VIR Crop dataset
    ECPGR Crop Databases
    VIR (RUS001)
    Passport data
    EURISCO
    Global Crop Registries
  • Resolvable persistent identifiers can direct the user to the publisher of the primary dataset (official original dataset)
    VIR Crop dataset
    ECPGR Crop Databases
    VIR (RUS001)
    Passport data
    EURISCO
    Global Crop Registries
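The idea on the last two slides can be sketched in a few lines: when every copy of a record carries the same persistent identifier, copies found in different information systems collapse to one record, and the identifier can be looked up to find the publisher of the primary dataset. The field names, identifier strings and publisher URL below are all hypothetical, and the in-memory registry stands in for whatever resolution service would actually be used.

```python
from collections import defaultdict


def group_by_identifier(records):
    """Map each persistent identifier to the systems that hold a copy of it."""
    systems_by_id = defaultdict(list)
    for record in records:
        systems_by_id[record["identifier"]].append(record["system"])
    return dict(systems_by_id)


def primary_publisher(identifier, registry):
    """Look up the publisher of the primary dataset; None if the identifier is unknown."""
    return registry.get(identifier)


if __name__ == "__main__":
    # Hypothetical copies of the same accession seen in three systems.
    copies = [
        {"identifier": "urn:lsid:example.org:accession:RUS001-1", "system": "EURISCO"},
        {"identifier": "urn:lsid:example.org:accession:RUS001-1", "system": "ECPGR Crop Databases"},
        {"identifier": "urn:lsid:example.org:accession:RUS001-1", "system": "Global Crop Registries"},
    ]
    # Hypothetical registry: identifier -> endpoint of the primary publisher.
    registry = {"urn:lsid:example.org:accession:RUS001-1": "http://example.org/vir/ipt"}

    for identifier, systems in group_by_identifier(copies).items():
        print(identifier, "held by", systems, "published at", primary_publisher(identifier, registry))
```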
  • Persistent Identifier
    The Persistent Identifier (PI) is a digital name tag
    Also called a Globally Unique Identifier (GUID)
    The Life Science Identifier (LSID) is one example
    The Digital Object Identifier (DOI) is another example
    The Persistent Identifier concept provides for the naming and identification of data resources stored in multiple, distributed data stores.
    Effective identification of data objects is essential for linking the world’s biodiversity data.
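To make the two examples concrete, the sketch below splits an LSID into its parts and turns a DOI into a resolvable address. It assumes the usual LSID layout `urn:lsid:<authority>:<namespace>:<object>[:<revision>]` and the public doi.org resolver. The `LSID` type, the function names and both sample identifiers are hypothetical and do not refer to real records.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its components."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


def doi_to_url(doi):
    """Build the address at which the doi.org resolver redirects to the object."""
    return "https://doi.org/" + doi.strip()


if __name__ == "__main__":
    # Both identifiers below are made up for illustration.
    print(parse_lsid("urn:lsid:example.org:accession:RUS001-12345"))
    print(doi_to_url("10.1234/example-accession"))
```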
  • Moving towards…global integration of information
    Genebank datasets
    Spatial data
    Threatened species
    Crop standards
    Migratory species
    Legislation and regulations etc.
    Crop collections in Europe
    Global crop system
    Global crop collections
  • Special thanks to:
    • GBIF, Global Biodiversity Information Facility http://www.gbif.org
    • TDWG, Biodiversity Information Standards http://www.tdwg.org
    • Bioversity International
    http://www.bioversityinternational.org
    Things can happen in a band, or any type of collaboration, that would not otherwise happen. (Jim Coleman, Musician)