Sharing of germplasm data sets at the TDWG 2006 conference


Data exchange for germplasm data sets with PyWrapper/BioCASE. TDWG 2006 conference, 16th October 2006, St. Louis. Dag Endresen, Johan Bäckman, Helmut Knüpffer, Samy Gaiji.

  • References for the data standards:
    * IPGRI descriptor lists (119 descriptor lists, 2005).
    * Multi-Crop Passport Descriptors (MCPD): FAO (Food and Agriculture Organization of the United Nations) and IPGRI (International Plant Genetic Resources Institute). This is a revised version (December 2001) of the 1997 MCPD list.
    * UPOV: the International Union for the Protection of New Varieties of Plants (French: Union internationale pour la protection des obtentions végétales) is an intergovernmental organization with headquarters in Geneva, Switzerland.
    * COMECON: the Council for Mutual Economic Assistance (COMECON / Comecon / CMEA / CEMA), 1949-1991, was an economic organisation of communist states and a kind of Eastern European equivalent to the European Economic Community. The military counterpart to the Comecon was the Warsaw Pact.
    * FAO World Information and Early Warning System (WIEWS).
    * 19 plant use categories, based on categories developed for the Working Group on Taxonomic Databases (TDWG) (Cook, Frances E.M., 1995. Economic Botany: Data Collection Standard. Royal Botanic Gardens Kew).
    * The mapping of MCPD to ABCD was started in 2004 by Helmut Knüpffer and Walter Berendsohn, and continued by Javier de la Torre and Dag Terje Filip Endresen in 2005.
  • GCP_Passport v 1.03: The GCP Passport 1.03 descriptor standard is based on the MCPD and ABCD standards and is implemented for the PyWrapper/BioCASE data exchange software. A mapping for automatic “upgrade” between ABCD 2.06 and GCP_Passport_1.03 is also included in the PyWrapper/BioCASE software. The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity, advanced genomic science, and comparative biology to develop tools and technologies that enable plant breeders in the developing world to produce better crop varieties for resource-poor farmers.
  • Photo: Field bean from Boreal, accession NGB11518, 2005-03-05, Dag Endresen
  • There are more than 6 million ex situ accessions of agricultural and horticultural crops conserved worldwide by genebanks (seed banks) according to FAO.
  • The passport data from most genebank datasets worldwide are indexed by EURISCO (European genebanks), SINGER (international CGIAR organizations) or USDA-GRIN (USDA ARS National Germplasm Repositories). Summary metadata on the datasets are collected and indexed by the FAO WIEWS database (World Information and Early Warning System on Plant Genetic Resources for Food and Agriculture, PGRFA). See also: GRIN Canada. This is a modification of a slide by Samy Gaiji, from the presentation “Information Networking - Challenges for the Plant Genetic Resources Communities”, 2004.
  • Potato plants in greenhouse. Priekuli, Latvia. Photographer Dag Terje Endresen (NGB Picture Archive, image 002421)
  • Temporary “UDDI” Registry
  • Samy to provide more input...? Release in December 2006? With approximately 2.2 million accessions indexed. Helmut suggests describing the CHM.
  • Solanum tuberosum L. Potato. Light sprout. Photographer NGB (NGB Picture Archive, image 001289).

    1. Exchange of germplasm datasets with PyWrapper/BioCASE
       October 16, 2006. TDWG Annual Meeting 2006, Missouri Botanical Garden, St. Louis, Missouri, U.S.A.
       Dag Endresen, Nordic Gene Bank; Johan Bäckman, Nordic Gene Bank; Helmut Knüpffer, IPK Gatersleben; Samy Gaiji, IPGRI, Bioversity International
    2. TOPICS
       • Genetic resources
       • Data standards
       • Data exchange
       • Information network
       • Outlook
    3. Germplasm data, seed genebanks
       • Germplasm genebanks are biodiversity collections.
       • Collection level data: metadata about genebank institutes and the germplasm collections they hold.
       • Unit level data: the unit level data for germplasm collections are the accessions. Genebank accessions share many properties and attributes with other biodiversity specimens.
    4. Germplasm Data Standards
    5. IPGRI Crop Descriptors
       • The IPGRI crop descriptors are developed to standardize characterization and evaluation data (called "descriptive data" in the TDWG context).
       • The MCPD (Multi-Crop Passport Descriptors) list is designed to standardize "passport data" across crops. It is compatible with the IPGRI crop-specific descriptor lists and the FAO World Information and Early Warning System (WIEWS), and serves as a basis for data exchange.
       • The MCPD descriptor list was made fully compatible with ABCD 2.06.
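
To make the idea of standardized passport data concrete, here is a minimal Python sketch of a few MCPD-style fields for one accession. The descriptor names INSTCODE, ACCENUMB, GENUS, SPECIES and ORIGCTY are taken from the MCPD list; the class name, the helper methods and all sample values are assumptions made for illustration and are not part of the standard or of PyWrapper/BioCASE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PassportRecord:
    """A small, illustrative subset of MCPD passport descriptors."""

    instcode: str                  # INSTCODE: code of the holding institute
    accenumb: str                  # ACCENUMB: accession number
    genus: str                     # GENUS
    species: Optional[str] = None  # SPECIES
    origcty: Optional[str] = None  # ORIGCTY: country of origin code

    def accession_key(self) -> str:
        """Combine institute code and accession number into one key.

        This key format is an assumption of the sketch, not an MCPD rule.
        """
        return f"{self.instcode}:{self.accenumb}"

    def scientific_name(self) -> str:
        """Join the name parts that are present."""
        return " ".join(part for part in (self.genus, self.species) if part)


# Hypothetical sample values, invented for this example.
record = PassportRecord(
    instcode="XXX001",
    accenumb="ACC0001",
    genus="Solanum",
    species="tuberosum",
)
print(record.accession_key(), record.scientific_name())
```

Because every genebank fills the same named fields, records from different institutes can be compared and merged without crop-specific handling.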
    6. Generation Challenge Programme, GCP_Passport_1.03
       • The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity to produce better crop varieties for resource-poor farmers.
       • In the context of the GCP (Generation Challenge Programme), the GCP Passport data exchange schema was developed.
    7. GCP_Passport Upgrade to ABCD
    8. PGR sub-unit of ABCD 2.06
    9. Germplasm Data Catalogues
    10. Germplasm catalogues
       • Most genebank datasets are indexed by three major germplasm catalogues.
       • EURISCO is the data catalogue of the European genebanks (836 725 accessions).
       • SINGER is the portal to the international CGIAR collections (442 635 accessions).
       • USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (464 586 accessions).
       • All three catalogues are published in GBIF.
    11. Data warehouse model
    12. Decentralized data network with web services
    13. Germplasm data exchange with PyWrapper/BioCASE
       • GBIF technology was demonstrated to IPGRI, FAO, CGIAR centres and genebanks in 2005 and has been widely adopted for PGR information networks.
       • In the spring of 2004 the first European genebanks joined GBIF as data providers.
       • In 2005 USDA-GRIN joined GBIF.
       • In 2006 both SINGER and EURISCO joined GBIF.
       • The germplasm datasets worldwide are compatible with the MCPD data standard.
       • Sharing of germplasm datasets with GBIF was rather straightforward after the MCPD data standard had been mapped to ABCD 2.06.
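
The mapping step mentioned on this slide can be pictured as a lookup table from MCPD descriptor names to target element paths. The sketch below is only an illustration of that idea: the ABCD-style paths are abbreviated assumptions, the function name `map_record` is hypothetical, and the authoritative mapping is the one configured in PyWrapper/BioCASE, which is not reproduced here.

```python
from typing import Dict, List, Tuple

# Illustrative lookup table. The target paths are abbreviated assumptions
# for this sketch, NOT the mapping shipped with PyWrapper/BioCASE.
MCPD_TO_ABCD: Dict[str, str] = {
    "INSTCODE": "Unit/SourceInstitutionID",
    "ACCENUMB": "Unit/UnitID",
    "GENUS": "Unit/Identification/GenusOrMonomial",
    "SPECIES": "Unit/Identification/FirstEpithet",
    "ORIGCTY": "Unit/Gathering/Country/ISO3166Code",
}


def map_record(mcpd_row: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Translate one flat MCPD row into path/value pairs.

    Returns the mapped values and the names of fields that have a value
    but no entry in the lookup table, so that gaps stay visible.
    """
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for field, value in mcpd_row.items():
        if value in (None, ""):
            continue  # empty descriptors are simply skipped
        path = MCPD_TO_ABCD.get(field)
        if path is None:
            unmapped.append(field)
        else:
            mapped[path] = value
    return mapped, unmapped


# Hypothetical row, invented for this example.
row = {"INSTCODE": "XXX001", "ACCENUMB": "ACC0001", "GENUS": "Solanum",
       "SPECIES": "", "REMARKS": "example note"}
mapped, unmapped = map_record(row)
print(mapped)
print("not mapped:", unmapped)
```

Once such a table exists, any dataset that already follows MCPD can be exposed through the same wrapper configuration, which is one way to read the slide's "rather straightforward".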
    14. Germplasm BioCASE entry points
    15. Decentralized model
       [Diagram. Labels: EURISCO (data portal, Europe); Nordic Gene Bank (Northern Europe); IPK Gatersleben (Germany); IHAR (Poland); WUR CGN (Netherlands); other European gene banks; SINGER (data portal for CGIAR); CGIAR International Future Harvest gene banks; USDA GRIN (data portal, USA); USDA ARS National Germplasm Repositories; GBIF (global data portal); global germplasm data portal; user; Internet. Connections are labelled MCPD.]
    16. Germplasm data indexing
       • The genebanks are building data indexing methodologies for access to global germplasm data.
       • It is planned to build a “Clearing House Mechanism” for germplasm.
       • This data portal is developed in cooperation with GBIF, which is also harvesting global biodiversity data using a similar approach.
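
The harvesting approach described on this slide can be sketched as a loop over providers that builds one central index. This is a minimal, assumed design and not the portal's actual implementation: the providers are in-memory stand-ins rather than real web services, and the names `harvest`, `search_genus`, `provider_a` and `provider_b` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]
Provider = Callable[[], Iterable[Record]]
Index = Dict[Tuple[str, str], Dict[str, str]]


def harvest(providers: Dict[str, Provider]) -> Index:
    """Visit every provider and index its records by institute and accession.

    Only a few fields are copied into the index; the full record stays with
    the provider. If two providers report the same key, the first one wins.
    """
    index: Index = {}
    for name, fetch in providers.items():
        for rec in fetch():
            key = (rec["INSTCODE"], rec["ACCENUMB"])
            index.setdefault(key, {"provider": name, "GENUS": rec.get("GENUS", "")})
    return index


def search_genus(index: Index, genus: str) -> List[Tuple[str, str]]:
    """Return the keys of all indexed accessions of one genus."""
    wanted = genus.lower()
    return sorted(k for k, entry in index.items() if entry["GENUS"].lower() == wanted)


# In-memory stand-ins for remote providers; all values are invented.
providers: Dict[str, Provider] = {
    "provider_a": lambda: [
        {"INSTCODE": "XXX001", "ACCENUMB": "ACC0001", "GENUS": "Solanum"},
        {"INSTCODE": "XXX001", "ACCENUMB": "ACC0002", "GENUS": "Vicia"},
    ],
    "provider_b": lambda: [
        {"INSTCODE": "XXX001", "ACCENUMB": "ACC0001", "GENUS": "Solanum"},
        {"INSTCODE": "YYY002", "ACCENUMB": "B-17", "GENUS": "Solanum"},
    ],
}

index = harvest(providers)
print(search_genus(index, "solanum"))
```

Keeping only a thin index centrally and leaving full records at the source is the design choice that separates this model from a data warehouse.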
    17. Work in progress
       • Global Unique Identifiers, GUID (LSID, Life Science Identifiers)
       • Biodiversity informatics workflow tools (BioMOBY and Taverna)
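
As a small illustration of what an LSID looks like, the sketch below splits one into its parts, following the general form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. The function and class names are hypothetical, the validation is deliberately simplified, and the example identifier is invented; it does not refer to a real accession or resolver.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into authority, namespace, object and revision.

    Simplified check: only the number of parts and the 'urn:lsid' prefix
    are verified, and no part before the revision may be empty.
    """
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


# Invented identifier, for illustration only.
example = parse_lsid("urn:lsid:example.org:accessions:ACC0001")
print(example)
```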
    18. Outlook
       • The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.
       • Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
       • Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
       • Users from the biodiversity community (who may not be aware of the existence of relevant material in genebanks) will find in GBIF genebank material of, e.g., crop wild relatives, along with data of the same species from herbaria, botanical gardens and floristic observations.
    19. Special thanks to
       • GBIF, Global Biodiversity Information Facility
       • BioCASE, The Biological Collection Access Service for Europe
       • TDWG, Taxonomic Databases Working Group
       • GCP, The Generation Challenge Programme
    20. Thanks for listening!