Prototype germplasm data portal (2006)


Prototype Germplasm Data Portal, predecessor for the ALIS-Global of the GIGA project. Presentation for the Nordic Gene Bank board meeting on 4th December 2006.

Published in: Technology
  • There are more than 6 million ex situ accessions of agricultural and horticultural crops conserved worldwide by genebanks (seed banks) according to FAO.
  • The passport data from most genebank datasets worldwide are indexed by EURISCO (European genebanks), SINGER (international CGIAR organizations) or USDA-GRIN (USDA ARS National Germplasm Repositories). Summary metadata on the datasets are collected and indexed by the FAO WIEWS database, the World Information and Early Warning System (WIEWS) on Plant Genetic Resources for Food and Agriculture (PGRFA). GRIN Canada. This is a modification of a slide by Samy Gaiji, from the presentation “Information Networking - Challenges for the Plant Genetic Resources Communities”, 2004.
  • Arnica montana. SPIMED Medicinal plants. Photographer Katarina Wdelsbäck (NGB Picture Archive, image 003907)
  • Samy to provide more input...? Release in December 2006? With approximately 2.2 million accessions indexed. Helmut suggests describing the CHM (Clearing House Mechanism).
  • Potato plants in greenhouse. Priekuli, Latvia. Photographer Dag Terje Endresen (NGB Picture Archive, image 002421)
  • Temporary “UDDI” Registry
  • Illustration: Corn earworm pupae that will be used to produce control parasites for release in the field. Photo by Scott Bauer.
      * UBIF is an attempt to define a common foundation for several TDWG/GBIF standards like SDD (see SDD WIKI), ABCD (see ABCD content schema homepage) or TaxonConceptNames (see Taxonomic Concept Transfer Schema WIKI).
      * Unified Biosciences Information Framework (UBIF): an XML schema for data exchange and integration across knowledge domains. The schema has been designed for biological data, but is applicable to other knowledge areas as well. It is based on work of the TDWG SDD and ABCD subgroups and is currently jointly authored by the SDD, ABCD and TaxonName subgroups and by GBIF (Global Biodiversity Information Facility). The framework may be used without changes for new schemata; no registration is necessary.
      * Complex Types are part of the UBIF infrastructure (TDWG common complex types for several schemas: ABCD, SDD, TCS, Linnean Core, etc.)
  • GCP_Passport v 1.03. The GCP Passport 1.03 descriptor standard is based on the MCPD and ABCD standards and is implemented for the PyWrapper/BioCASE data exchange software. A mapping for automatic “upgrade” between ABCD 2.06 and GCP_Passport_1.03 is also included in the PyWrapper/BioCASE software. The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity, advanced genomic science, and comparative biology to develop tools and technologies that enable plant breeders in the developing world to produce better crop varieties for resource-poor farmers.
  • Photo: Field bean from Boreal, accession NGB11518, 2005-03-05, Dag Endresen
  • Solanum tuberosum L. Potato. Light sprout. Photographer NGB (NGB Picture Archive, image 001289).
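
The notes above state that GCP_Passport builds on MCPD and ABCD and that PyWrapper/BioCASE includes a mapping between the schemas. As a rough illustration of what such a field mapping amounts to, the Python sketch below translates a few MCPD passport descriptors into ABCD 2.06 concept paths. The function name `map_record`, the selection of fields, the placeholder institute code `XXX000` and the exact concept paths are assumptions made for this example; they are not taken from the PyWrapper/BioCASE configuration.

```python
# Illustrative sketch only: the field selection and the ABCD 2.06 concept
# paths below are assumptions for this example, not the PyWrapper/BioCASE mapping.
ABCD_UNIT = "/DataSets/DataSet/Units/Unit"

# One-to-one mappings from MCPD descriptors to (assumed) ABCD concept paths.
MCPD_TO_ABCD = {
    "INSTCODE": ABCD_UNIT + "/SourceInstitutionID",
    "ACCENUMB": ABCD_UNIT + "/UnitID",
    "ORIGCTY": ABCD_UNIT + "/Gathering/Country/ISO3166Code",
}

# MCPD stores genus and species separately; here they are joined into one name.
FULL_NAME_PATH = (
    ABCD_UNIT
    + "/Identifications/Identification/Result/TaxonIdentified"
    + "/ScientificName/FullScientificNameString"
)


def map_record(record):
    """Return a dict of ABCD concept path -> value for one MCPD record.

    Empty or missing MCPD fields are skipped.
    """
    mapped = {}
    for field, path in MCPD_TO_ABCD.items():
        value = str(record.get(field, "")).strip()
        if value:
            mapped[path] = value

    name_parts = [
        str(record.get(key, "")).strip() for key in ("GENUS", "SPECIES")
    ]
    full_name = " ".join(part for part in name_parts if part)
    if full_name:
        mapped[FULL_NAME_PATH] = full_name
    return mapped


# Example record for the field bean accession mentioned in the notes.
# The institute code is a placeholder, and the country of origin is left empty.
example = {
    "INSTCODE": "XXX000",
    "ACCENUMB": "NGB11518",
    "GENUS": "Vicia",
    "SPECIES": "faba",
    "ORIGCTY": "",
}

for path, value in map_record(example).items():
    print(path, "=", value)
```

A plain dictionary keeps the mapping easy to review and extend; a real provider configuration would cover many more descriptors and handle repeated elements.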

    1. 1. Exchange of germplasm datasets with PyWrapper/BioCASE. GAIN, Global Accession Information Network. December 4, 2006, NGB Board Meeting 2006, Alnarp, Sweden. Dag Endresen, Nordic Gene Bank
    2. 2. TOPICS
          • Genetic resources
          • Data exchange
          • Information network
          • Outlook
    3. 3. Germplasm data, seed genebanks
          • Germplasm genebanks are biodiversity collections.
          • Collection level data: metadata about genebank institutes and the germplasm collections they hold.
          • Unit level data: the unit level data for germplasm collections are the accessions. Genebank accessions share many properties and attributes with other biodiversity specimens.
          • Descriptive data (phenotype): the germplasm accessions are further described by descriptive characterization and evaluation data.
    4. 4. Germplasm catalogues
          • Most genebank datasets are indexed by three major germplasm catalogues.
          • EURISCO is the data catalogue of the European genebanks (997 631 accessions).
          • SINGER is the portal to the international CGIAR collections (689 349 accessions).
          • USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (475 178 accessions).
          • All three catalogues are now published in GBIF.
    5. 5. Data warehouse model
    6. 6. The present data flow from genebanks to EURISCO and ECCDBs
    7. 7. Decentralized model
    8. 8. Decentralized model (diagram labels)
          • EURISCO (Data Portal Europe): Nordic Gene Bank (Northern Europe), IPK Gatersleben (Germany), IHAR (Poland), WUR CGN (Netherlands), (Other European gene banks...)
          • SINGER (Data Portal for CGIAR): (CGIAR International Future Harvest gene banks...)
          • USDA GRIN (Data Portal USA): (USDA ARS National Germplasm Repositories...)
          • GBIF (Global Biodiversity Data Portal)
          • GAIN (Global germplasm Data Portal)
          • Svalbard International Seed Vault (Safe Backup)
          • Other labels: USER, Internet, MCPD (on the data links)
    9. 9. Germplasm data indexing tools
          • We have recently built data indexing methodologies for access to germplasm data with BioCASE/PyWrapper.
          • The plan is to use these to build a germplasm Clearing House Mechanism (GAIN).
          • Development is in close cooperation with GBIF, which itself indexes basic biodiversity data using a similar approach.
    10. 14. Decentralized data network with web services
    11. 15. Germplasm data exchange with PyWrapper/BioCASE
          • GBIF technology was demonstrated to IPGRI, FAO, CGIAR centres and genebanks (2005) and has been widely adopted for PGR information networks.
          • In the spring of 2004 the first European genebanks joined GBIF as data providers.
          • In 2005 USDA-GRIN joined GBIF.
          • In 2006 both SINGER and EURISCO joined GBIF.
          • The germplasm datasets worldwide are compatible with the MCPD data standard.
          • Sharing of germplasm datasets with GBIF was rather straightforward after mapping the MCPD data standard to ABCD 2.06.
    12. 16. Germplasm BioCASE entry points
    13. 17. Taxonomic Databases Working Group
          • Darwin Core 2: “Element definitions designed to support the sharing and integration of primary biodiversity data”.
          • Access to Biological Collection Data (ABCD) 2.06: “An evolving comprehensive standard for the access to and exchange of data about specimens and observations (a.k.a. primary biodiversity data)”.
    14. 18. PGR sub-unit of ABCD 2.06
    15. 19. Generation Challenge Programme, GCP_Passport_1.03
          • The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity to produce better crop varieties for resource-poor farmers.
          • The GCP Passport data exchange schema was developed in the context of the GCP (Generation Challenge Programme).
    16. 20. GCP_Passport Upgrade to ABCD
    17. 21. Work in progress
          • Globally Unique Identifiers, GUID (LSID, Life Science Identifiers)
          • Biodiversity informatics workflow tools (BioMOBY and Taverna)
    18. 22. Outlook
          • The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.
          • Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
          • Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
          • Users from the biodiversity community (who may not be aware of the existence of relevant material in genebanks) will find genebank material in GBIF, e.g. of crop wild relatives, along with data on the same species from herbaria, botanical gardens and floristic observations.
    19. 23. Special thanks to
          • IPGRI (Bioversity International)
          • GCP, The Generation Challenge Programme
          • GBIF, Global Biodiversity Information Facility
          • BioCASE, The Biological Collection Access Service for Europe
          • TDWG, Taxonomic Databases Working Group
    20. 24. Thank you for listening!
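
The accession counts quoted for the three catalogues on slide 4 can be checked against the figure of approximately 2.2 million indexed accessions mentioned in the notes. The short Python sketch below adds them up. It assumes that the 2.2 million figure refers to these three catalogues and that accessions are not listed in more than one of them; neither is confirmed by the slides, and the function names are made up for this example.

```python
# Accession counts as quoted on slide 4 of the presentation.
ACCESSIONS = {
    "EURISCO": 997_631,
    "SINGER": 689_349,
    "USDA-GRIN": 475_178,
}


def total_accessions(counts):
    """Sum the accession counts of all catalogues (no de-duplication)."""
    return sum(counts.values())


def in_millions(n, digits=1):
    """Express a count in millions, rounded to the given number of digits."""
    return round(n / 1_000_000, digits)


total = total_accessions(ACCESSIONS)
print(f"Total: {total:,} accessions (about {in_millions(total)} million)")

# Share of each catalogue in the combined index.
for name, count in ACCESSIONS.items():
    print(f"{name}: {count / total:.1%}")
```

The sum is 2,162,158 accessions, which rounds to 2.2 million and is consistent with the figure in the notes.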