Global Information Systems for Plant Genetic Resources, SeedNet training course (2008)
SEEDNet training course at NordGen, Alnarp, Sweden, May 28, 2008.

Usage Rights

CC Attribution License

  • Presentation of TDWG and GBIF 10/10/09
  • Some keywords; the main topics of the talk
  • http://wwwdev.ngb.se/epgris3/index.php/Main_Page
  • FAO WIEWS
  • Samy to provide more input...? Release in December 2006? With approximately 2.2 million accessions indexed. Helmut suggests describing CHM.
  • There are more than 6 million ex situ accessions of agricultural and horticultural crops conserved worldwide by genebanks (seed banks) according to FAO.
  • Photo: Field bean from Boreal, accession NGB11518, 2005-03-05, Dag Endresen [http://r142b.ngb.se/ngb/2005-03--the-making-of-seeds-pictures/index.php?offset=19&size=medium&stp=1]
  • IPGRI Descriptors lists (119 descriptor lists, 2005) [http://www.ipgri.cgiar.org/system/page.asp?frame=programmes/inibap/home.htm]
  • MCPD [http://www.ipgri.cgiar.org/publications/pdf/333.pdf]
  • UPOV: The International Union for the Protection of New Varieties of Plants (French: Union internationale pour la protection des obtentions végétales) is an intergovernmental organization with headquarters in Geneva, Switzerland. [http://www.upov.int/] [http://en.wikipedia.org/wiki/UPOV]
  • COMECON: The Council for Mutual Economic Assistance (COMECON / Comecon / CMEA / CEMA), 1949-1991, was an economic organisation of communist states and a kind of Eastern European equivalent to the European Economic Community. The military counterpart to the Comecon was the Warsaw Pact. [http://en.wikipedia.org/wiki/Comecon]
  • Multi-crop Passport Descriptors (MCPD) [http://www.ipgri.cgiar.org/publications/pdf/124.pdf]. FAO (Food and Agricultural Organization of the United Nations) and IPGRI (International Plant Genetic Resources Institute). This is a revised version (December 2001) of the 1997 MCPD List.
  • FAO World Information and Early Warning System (WIEWS) [http://apps3.fao.org/wiews/]
  • 19 Plant Uses Categories based on categories developed for the Working Group on Taxonomic Databases (TDWG) (Cook, Frances E.M., 1995. Economic Botany: Data Collection Standard. Royal Botanic Gardens Kew). [http://www.ecpgr.cgiar.org/epgris/Training/MCPD-1998.doc]
  • The mapping of MCPD to ABCD was started in 2004 by Helmut Knüpffer and Walter Berendsohn, and continued by Javier de la Torre and Dag Terje Filip Endresen in 2005. [http://ww3.bgbm.org/MCDPH] [http://www.bgbm.org/TDWG/CODATA/Schema/Mappings/EURISCO-2-ABCD.pdf]
  • Illustration: Corn earworm pupae that will be used to produce control parasites for release in the field. Photo by Scott Bauer. [http://www.ars.usda.gov/is/graphics/photos/k5554-2.htm]
  • UBIF is an attempt to define a common foundation for several TDWG/GBIF standards like SDD (see SDD WIKI), ABCD (see ABCD content schema homepage) or TaxonConceptNames (see Taxonomic Concept Transfer Schema WIKI).
  • Unified Biosciences Information Framework (UBIF): XML schema for data exchange and integration across knowledge domains. The schema has been designed for biological data, but is applicable to other knowledge areas as well. It is based on work of the TDWG SDD and ABCD subgroups and is currently jointly authored by the SDD, ABCD and TaxonName subgroups and by GBIF (Global Biodiversity Information Facility). The framework may be used without changes for new schemata; no registration is necessary.
  • Complex Types are part of the UBIF infrastructure (TDWG common complex types for several schemas: ABCD, SDD, TCS, Linnean Core, etc.)
  • OMG-LSR, Object Management Group - Life Science Research [http://www.omg.org/lsr/]
  • [http://www.biocase.org/index.shtml]
  • http://www.biocase.org/whats_biocase/unit_net.shtml
  • The passport data from most genebank datasets worldwide are indexed by EURISCO (European genebanks), SINGER (International CGIAR organizations) or USDA-GRIN (USDA ARS National Germplasm Repositories). Summary metadata on the datasets are collected and indexed by FAO WIEWS, the World Information and Early Warning System on Plant Genetic Resources for Food and Agriculture (PGRFA) [http://apps3.fao.org/wiews/wiews.jsp]. GRIN Canada [http://pgrc3.agr.gc.ca/index_e.html]. This is a modification of a slide from Samy Gaiji, from the presentation "Information Networking - Challenges for the Plant Genetic Resources Communities", 2004.
  • Arnica montana SPIMED Medicinal plants, Photographer Katarina Wdelsbäck (NGB Picture Archive, image 003907)
  • Solanum tuberosum L. Potato. Light sprout. Photographer NGB (NGB Picture Archive, image 001289).
  • Image: Field of wheat at Alnarp. Photographer Dag Terje Endresen (NGB Picture Archive, image 002981). Image: Spider in a spiderweb. Image: Dag Terje Filip Endresen in Benin. Image: Michael Mackay.
  • Image: Rheum x hybridum Murray Rhubarb (2004). Photographer Gitte K. Björn (NGB Picture Archive, image 003683). Image: Rheum x hybridum Murray Rhubarb (2004). Photographer Gitte K. Björn (NGB Picture Archive, image 003714). Image: Brassica nigra (L.) W. D. J. Koch Black Mustard. Photographer Dag Terje Endresen (NGB Picture Archive, image 003840) [ http://www.nordgen.org/sesto/index.php?scp=ngb&thm=pictures ]

Global Information Systems for Plant Genetic Resources, SeedNet training course (2008) Presentation Transcript

  • 1. Cover slide Documentation of Genetic Resources Global Information Systems SEEDNet Training Course May 28, 2008 NordGen, Alnarp Dag Terje Filip Endresen Nordic Genetic Resource Center/ Bioversity International
  • 2. TOPICS
    • Documentation of genetic resources:
    • Information Systems
    • Data standards
    • Data exchange
    • Distributed data network
  • 3.
    • Global
    • PGR
    • Information
    • Systems
  • 4. SEEDNet data portal
    • SEEDNet, South East European Development Network on Plant Genetic Resources was established in 2004.
    • [ http://seednet.geminova.net/ ]
      • Albania
      • Bulgaria
      • Croatia
      • Federation of Bosnia and Herzegovina
      • Kosovo
      • Macedonia
      • Moldova
      • Montenegro
      • Republika Srpska
      • Romania
      • Serbia
      • Slovenia
  • 5. ECPGR (AEGIS, ECCDB, EURISCO)
    • A European Genebank Integrated System (AEGIS)
    • Sharing of responsibilities (Most Appropriate Accession; commonly agreed quality standards for ex situ conservation).
    • Conservation of the genetically unique and important accessions for Europe and making them available for breeding and research.
    • Four model crops: Allium, Avena, Brassica and Prunus species.
    • Membership in AEGIS is open to all European countries (ECPGR).
    • EURISCO and the Central Crop Databases play a key role in the information management.
  • 6. ECPGR Central Crop Databases
  • 7. EURISCO [ http://eurisco.ecpgr.org/ ]
    • EURISCO data catalogue of the European genebanks (more than 1 000 000 accessions from 35 European countries)
    • EURISCO holds accession level data on 1 300 genera and 8 500 species.
    • EURISCO was released in September 2003 as a result of the EU funded EPGRIS project.
    • EURISCO is hosted by Bioversity International on behalf of the ECPGR.
  • 8. EURISCO (draft, new layout)
  • 9. Data flow from genebanks to EURISCO and ECCDBs
  • 10. EPGRIS3 [ http://www.epgris3.eu/ ]
    • EPGRIS3 is a voluntary, self-funded follow-up to the EU-funded EPGRIS project.
    • EPGRIS3 aims to improve the data exchange of European genebank datasets and to further develop the IT infrastructure on genetic resources in Europe.
  • 11. EPGRIS3 Wiki Environment
    • An EPGRIS3 Wiki environment is hosted by NordGen. Please register and contribute to the discussions. [ http://wwwdev.ngb.se/epgris3/ ]
    • Please contact one of the EPGRIS3 contact persons if you want to contribute to the EPGRIS3 project.
    • [ http://www.epgris3.eu/EPGRIS3contacts.htm ]
  • 12. SINGER [ http://singer.grinfo.net/ ]
    • The System-wide Information Network for Genetic Resources (SINGER).
    • More than 650 000 accessions from the 12 international CGIAR organizations.
    • SINGER is hosted by Bioversity International on behalf of the CGIAR.
  • 13. CGIAR [ http://www.cgiar.org/ ]
    • AVRDC - The World Vegetable Center
    • Bioversity - Bioversity International
    • CIAT - Centro Internacional de Agricultura Tropical
    • CIMMYT - Centro Internacional de Mejoramiento de Maiz y Trigo
    • CIP - Centro Internacional de la Papa
    • ICARDA - International Center for Agricultural Research in the Dry Areas
    • ICRAF - The World Agroforestry Centre
    • ICRISAT - International Crops Research Institute for the Semi-Arid Tropics
    • IITA - International Institute of Tropical Agriculture
    • ILRI - International Livestock Research Institute
    • IRRI - International Rice Research Institute
    • WARDA - The Africa Rice Center
  • 14. GCP [ http://www.generationcp.org/ ]
    • Generation Challenge Programme (GCP).
    • The GCP Mission: To use advanced genomics science and plant genetic diversity to overcome complex agricultural bottlenecks that condemn millions of the world’s neediest people to a future of poverty and hunger.
    • The GCP Vision: A future where plant breeders have the tools to breed crops in marginal environments with greater efficiency and accuracy for the benefit of the resource-poor farmers and their families.
  • 15. NordGen [ http://www.nordgen.org/ ]
    • The Nordic Genetic Resource Center (NordGen) was established in January 2008.
    • NordGen replaces the former institute Nordic Gene Bank (NGB) established in 1979.
    • NordGen is the joint regional genetic resource center for all the 5 Nordic countries: Denmark, Finland, Iceland, Norway and Sweden.
    • NordGen reports to the Nordic Council of Ministers [ http://www.norden.org ].
    • The mandate of NordGen is the conservation and utilization of Nordic genetic resources.
  • 16. Regional Programs on Genetic Resources
    • SEEDNet, South East European Development Network on Plant Genetic Resources was established in 2004. [ http://seednet.geminova.net/ ]
    • SADC, Southern African Development Community program on genetic resources was started in 1989. [ http://www.spgrc.org/ ]
    • USDA GRIN, Germplasm Resources Information Network of the US. [ http://www.ars-grin.gov/ ]
    • … and more
  • 17.
    • GBIF
    • Global Biodiversity Information Facility
  • 18. GBIF Data Portal [ http://data.gbif.org/ ]
  • 19. GBIF PGR Network 2 [ http://data.gbif.org/datasets/network/2 ]
  • 20. GBIF NordGen [ http://data.gbif.org/ ]
  • 21. GBIF SINGER [ http://data.gbif.org/ ]
  • 22. GBIF USDA
  • 23. Germplasm catalogues
    • The three large germplasm catalogues are indexed by the GBIF data portal
    • EURISCO is the data catalogue of the European genebanks (more than 1 000 000 accessions)
    • SINGER is the portal to the international CGIAR collections (more than 650 000 accessions)
    • USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (more than 400 000 accessions)
  • 24.
    • FAO
    • WIEWS
  • 25. FAO WIEWS [ http://apps3.fao.org/wiews/ ]
  • 26. FAO WIEWS, GPA [ http://www.pgrfa.org/gpa/ ] Leipzig Declaration 1996, 150 countries [ http://www.globalplanofaction.org/ ]
  • 27.
    • Data Standards
  • 28. Crop Descriptors
    • The crop descriptor lists from Bioversity International provide global standards for characterization and evaluation data on crop genetic resources.
    • The MCPD (Multi Crop Passport Descriptor List) provides a global standard for "passport data" across the crops.
    • The MCPD descriptor list is compatible with the TDWG standard: ABCD 2.06.
  • 29. Accession level, Data Standards
    • Multi Crop Passport (MCPD)
    • [http://www.bioversityinternational.org/publications/pubfile.asp?id_pub=124]
    • Darwin Core (DwC v2)
    • [http://wiki.tdwg.org/twiki/bin/view/DarwinCore/]
    • Access to Biological Collection Data (ABCD 2.06)
    • [http://wiki.tdwg.org/twiki/bin/view/ABCD]
    • Generation Challenge Programme (GCP Passport v1.05)
    • [ http://gcpcr.grinfo.net/include/webservices/schema-documentation.php]
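As a rough illustration of how passport data recorded under one of these standards can be carried over to another, the sketch below renames a few MCPD descriptors to Darwin Core-style element names. The passport record is invented, and the mapping table is an assumption made for this example only; it is not the official MCPD to Darwin Core or MCPD to ABCD mapping.

```python
# Hypothetical mapping from MCPD descriptor names to Darwin Core-style
# element names. Illustrative only; consult the published mappings.
MCPD_TO_DWC = {
    "INSTCODE": "InstitutionCode",
    "ACCENUMB": "CatalogNumber",
    "GENUS": "Genus",
    "SPECIES": "SpecificEpithet",
    "ORIGCTY": "Country",
}


def mcpd_to_dwc(record):
    """Rename the mapped MCPD fields; skip empty and unmapped fields."""
    return {
        dwc_name: record[mcpd_name]
        for mcpd_name, dwc_name in MCPD_TO_DWC.items()
        if record.get(mcpd_name)
    }


# Invented passport record, for demonstration only.
passport = {
    "INSTCODE": "XXX001",
    "ACCENUMB": "ACC0001",
    "GENUS": "Avena",
    "SPECIES": "sativa",
    "ORIGCTY": "SWE",
    "REMARKS": "",
}
converted = mcpd_to_dwc(passport)
```

A real conversion would also need to handle differences in value encoding (for example country codes versus country names), which this sketch ignores.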
  • 30. W3C :: RDF
    • Resource Description Framework
    • Scenario: You have a dataset of genebank accessions with pointers to the source datasets of the holding genebanks. You produce phenotypic evaluation data on accessions in this dataset, and you find evaluation data from other sources on some of the same accessions. Some of the evaluation data were produced in areas with different day length, rainfall and soils. Some of the accessions in your dataset originate from areas of higher population density, while others originate from more natural habitats. Unfortunately, most of these sources of information are located on different web sites, and it is difficult to bring the information together.
    • You would need to go through more or less the same process as other researchers in many domains of gathering heterogeneous data from multiple sources, combining and analysing it. This is the challenge that faces the web as a whole and is being addressed by the Semantic Web project.
    • RDF can help you relate information from different sources.
    • An RDF triple looks like this: subject-predicate-object
    • <rdf:Description rdf:about="http://www.example.org/index.html">
    • <dc:creator>John Smith</dc:creator>
    • </rdf:Description>
    [Keyword cloud from the slide: RDF, OWL, ontology, semantic web, knowledge representation, LSID, GUID, accession number, unitID, Full Scientific Name, INSTCODE, landrace, traditional cultivar, ABCD, SDD, plant genetic resources, germplasm, agricultural traits, Aegilops]
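To make the subject-predicate-object idea concrete, here is a minimal sketch that reads the RDF/XML fragment from this slide and prints it as a triple. It uses only the Python standard library and handles just this simple shape (an rdf:Description with literal-valued properties); the wrapping rdf:RDF element and the namespace declarations are added here as assumptions so that the fragment parses, and a dedicated RDF library would be the usual choice for real data.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's fragment, wrapped in rdf:RDF with namespace declarations.
RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>John Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for literal properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; join them
            # into one predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```

Read as a sentence, the triple says that the resource http://www.example.org/index.html has the creator "John Smith".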
  • 31. W3C :: LSID
    • Life Science IDentifiers
    • LSID is a digital name tag.
    • LSIDs are GUIDs, Globally Unique Identifiers.
    • [http://lsid.sourceforge.net/]
    • Structure: urn:lsid:authority:namespace:object:revision
    • Example (fictitious): urn:lsid:eurisco.org:accession:H451269
    • The LSID concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data stores.
    • LSID defines a simple, common way to identify and access biologically significant data.
    • LSID provides a naming standard to support interoperability.
    • Developed by OMG-LSR and W3C, implemented by IBM.
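The LSID structure given on this slide (urn:lsid, then authority, namespace, object and an optional revision, separated by colons) can be split into its parts with a few lines of code. The following is a minimal sketch under that assumption; the `Lsid` type and the `parse_lsid` function are names invented for this example, and the identifier is the fictitious one from the slide.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:5]):
        raise ValueError(f"empty LSID component in {text!r}")
    # The revision is optional; treat a missing or empty one as None.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(parts[2], parts[3], parts[4], revision)


lsid = parse_lsid("urn:lsid:eurisco.org:accession:H451269")
print(lsid.authority, lsid.namespace, lsid.object_id)
```

Keeping the authority and namespace separate from the object identifier is what lets the same local accession number be used by different institutions without collisions.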
  • 32.
    • Biodiversity data exchange tools
  • 33. Data Provider Software
    • PyWrapper v3, based on the BioCASE Python software.
    • [ http://www.pywrapper.org/ ]
    • [ http://www.biocase.org/ ]
    • DiGIR, Distributed Generic Information Retrieval. [ http://digir.net ]
    • TapirLink [ http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink ]
    • TapirDotNet [ http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirDotNET ]
  • 34. Distributed BioCASE/PyWrapper network
  • 35. Example of a service request
    • All exchanged data is formatted with XML tags.
  • 36. Example of a service response
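The request and response shown on these two slides are not reproduced in the transcript, so the sketch below only illustrates the general idea of exchanging XML-tagged messages with a data provider. The element and attribute names (`request`, `filter`, `equals`, `accession` and so on) are hypothetical and are not the actual BioCASE, DiGIR or TAPIR protocol schema; the sample response is invented.

```python
import xml.etree.ElementTree as ET


def build_request(genus, limit=10):
    """Build a schematic XML search request (hypothetical element names)."""
    request = ET.Element("request")
    ET.SubElement(request, "type").text = "search"
    query_filter = ET.SubElement(request, "filter")
    ET.SubElement(query_filter, "equals", {"field": "GENUS"}).text = genus
    ET.SubElement(request, "limit").text = str(limit)
    return ET.tostring(request, encoding="unicode")


def read_response(xml_text):
    """Turn each <accession> element of a response into a plain dict."""
    root = ET.fromstring(xml_text)
    return [
        {field.tag: field.text for field in accession}
        for accession in root.iter("accession")
    ]


# Invented response, standing in for what a provider might send back.
SAMPLE_RESPONSE = """
<response>
  <accession><ACCENUMB>ACC0001</ACCENUMB><GENUS>Avena</GENUS></accession>
  <accession><ACCENUMB>ACC0002</ACCENUMB><GENUS>Avena</GENUS></accession>
</response>
"""

print(build_request("Avena"))
print(read_response(SAMPLE_RESPONSE))
```

In a real network the request would be sent over HTTP to the provider's wrapper software, and both messages would follow the schema of the protocol in use.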
  • 37.
    • Data portal and decentralized data networks with web services
  • 38. Data warehouse model
  • 39. Decentralized network
    • EURISCO (Europe)
    • NordGen (Northern Europe)
    • IPK Gatersleben (Germany)
    • IHAR (Poland)
    • (Other European gene banks...)
    • SINGER (CGIAR)
    • (CGIAR International Future Harvest gene banks...)
    • USDA GRIN (USA)
    • (USDA ARS National Germplasm Repositories...)
    • WUR CGN (Netherlands)
    • GBIF (Global Biodiversity Information Facility)
    • USER
    • ALIS (Accession Level Information System)
    • Web Services
    • MCPD
    • Svalbard Global Seed Vault (Safe Backup)
    • SEEDNET Countries
  • 40. Germplasm data indexing tools
    • We have recently built data indexing tools for access to gene bank datasets provided with the BioCASE/PyWrapper.
    • The plan is to use these tools to build a Global Accession Level Information System (ALIS).
    • This is done in cooperation with GBIF, which indexes basic biodiversity data using a similar approach.
    • [ http://chm.grinfo.net/ ]
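The indexing approach described here, where a central index is built from records published by many providers, can be sketched as follows. This is a simplified illustration and not the actual ALIS or GBIF indexing code: the provider names, the sample records and the `harvest` and `search_by_genus` functions are hypothetical, and the providers are plain Python functions standing in for web service calls.

```python
def harvest(providers):
    """Collect records from every provider into one central index.

    Records are keyed by (INSTCODE, ACCENUMB) so that each accession
    appears once, with a pointer back to the provider that holds it.
    """
    index = {}
    for provider_name, fetch_records in providers.items():
        for record in fetch_records():
            key = (record["INSTCODE"], record["ACCENUMB"])
            index[key] = {"provider": provider_name, "GENUS": record.get("GENUS")}
    return index


def search_by_genus(index, genus):
    """Return the sorted accession keys in the index for one genus."""
    return sorted(key for key, entry in index.items() if entry["GENUS"] == genus)


# Invented providers; in practice each would query a remote data provider.
providers = {
    "genebank_a": lambda: [
        {"INSTCODE": "AAA001", "ACCENUMB": "1", "GENUS": "Avena"},
        {"INSTCODE": "AAA001", "ACCENUMB": "2", "GENUS": "Brassica"},
    ],
    "genebank_b": lambda: [
        {"INSTCODE": "BBB001", "ACCENUMB": "1", "GENUS": "Avena"},
    ],
}

index = harvest(providers)
print(search_by_genus(index, "Avena"))
```

The index holds only a few searchable fields plus the pointer to the provider; the full accession data stays with the holding genebank, which is the main difference from a data warehouse.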
  • 41. [ http://wwwdev.ngb.se/portal/ ]
  • 42. Crop Wild Relatives [ http://www.cropwildrelatives.org ]
    • National datasets (ARM, LKA, BOL, MDG, UZB) are shared with the central CWR data index.
    • The national datasets, as well as access to other international datasets (EURISCO, SINGER), are provided from the CWR data portal.
  • 43. Taxonomy level metadata
    • The Taxon and Country pages provide access to the relevant external datasets.
  • 44. Country level metadata
  • 45. Outlook
    • The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.
    • Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
    • Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
    • New data portals for a specific crop, a regional thematic network or a similar subset of the global biodiversity datasets can be established with relatively little effort. This requires only that all the relevant datasets are provided through GBIF-compatible web services (like the BioCASE PyWrapper).
  • 46. Special thanks to:
    • Bioversity International [http://www.bioversityinternational.org]
    • GBIF, Global Biodiversity Information Facility [http://www.gbif.org]
    • BioCASE, The Biological Collection Access Service for Europe [http://www.biocase.org]
    • TDWG, Taxonomic Databases Working Group [http://www.tdwg.org]
  • 47. Thank you for listening!
  • 48. The data portal application
    • [ http://wwwdev.ngb.se/portal/]
    • The data portal application is tested to work well with MS Windows, Mac OSX, Linux, FreeBSD...
    • The data portal is developed for the PostgreSQL database, but works well with many different database systems, through the ADODB database abstraction library.
    • The data portal is developed for UNICODE
    • ضاإطقكغب שּׁשׁﭻﭗﭼﱠ אָבּדּוּ
    • The data portal is Open Source and licensed under GPL 2.
    • The data portal is developed with the PHP5 programming language, with some maintenance scripts developed with Perl.
  • 49. TDWG :: SDD
    • Structured Descriptive Data
    • In taxonomy, descriptive data takes a number of very different forms.
    • Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon (or occasionally of an individual specimen). They may be simple, short and written in plain language (if used for a popular field guide), or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment.
    • The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
    • Hagedorn, G.; Thiele, K.; Morris, R. & Heidorn, P. B. 2005. The Structured Descriptive Data (SDD) w3c-xml-schema, version 1.0. http://www.tdwg.org/standards/116/ . [Last retrieved 05-May-2007]
    • [ http://www.tdwg.org/standards/116/ ]