EURISCO demo installations of IPT, at GBIF EU Nodes meeting in Alicante (11 March 2010)


Regional GBIF Nodes meeting for Europe, March 2010. Presentation of current activities at the NordGen node: implementation of the GBIF IPT for genebanks in Europe, upgrading selected genebanks from the BioCASE publishing toolkit to the IPT. This is the first step of a larger implementation scheduled to start in 2011 as part of the EuroGeneBank application, pending an EU funding decision. Keywords: NordGen, IPT, EURISCO

  • Contract overview
  • Darwin core
  • Technology overview – IPT role
  • Java: DarwinCore Archive (DwC-A): (DBMS):
  • TAPIR (2004 ->); ICIS (International Crop Information System): http://www.icis.cgiar.org; BioMOBY (2001); DiGIR (2002, not active): http://digir.sourceforge.net; BioCASE (2003, PyWrapper v2): http://www.biocase.org; EURISCO (2003, tab delimited text): http://eurisco.ecpgr.org; PyWrapper 3 (2006, not active): http://trac.pywrapper.org; TapirLink (2007); Provider Toolkit (2009)

    1. 1. European GBIF Nodes Meeting 2010, March 10th-12th Alicante, Spain<br />Dag Endresen, Nordic Genetic Resources Center, NordGen<br />GBIF IPT installations for EURISCO<br />GBIF Tools and Darwin Core extension for germplasm<br />Cartoon by Sasha Kopf (Creative Commons)<br />
    2. 2. Topics for this session<br /><ul><li>GBIF IPT installation for EURISCO
    3. 3. Overview of the project
    4. 4. Darwin Core extension for germplasm
    5. 5. GBIF informatics tools
    6. 6. Integrated Publishing Toolkit (IPT)
    7. 7. IPT installations for EURISCO
    8. 8. Possible PGR network model</li></li></ul><li>Darwin Core extension for Germplasm (presented at TDWG 2009)<br />Opened up the use of new GBIF technology in the gene banking world<br />Proposal to implement GBIF technology as a test in the European gene banking community<br />
    9. 9. From the contract between NordGen and GBIF:<br />“... a feasibility study aimed at demonstrating the practical implementation of the GBIF decentralised architecture strategy and in particular in the context of the EURISCO Network.”<br />“... focused on the adoption of the IPT by selected gene banks in Europe, the publishing of richer content using the Darwin Core germplasm extension and the indexing of these published resources by the EURISCO platform.”<br />“... implemented in the context of EURISCO and therefore in close collaboration with the EURISCO Coordinator.”<br />
    10. 10. GBIF Informatics Suite<br /><ul><li>GBIF tools to empower decentralized thematic or regional networks
    11. 11. Darwin Core extension for germplasm makes these tools usable for crop gene banks.</li></li></ul><li>Darwin Core<br />The purpose of DwC terms is to facilitate data sharing<br /><ul><li> a well-defined standard core vocabulary
    12. 12. a flexible framework to maximize re-usability </li></ul>The Darwin Core can be extended by adding new terms to share additional information.<br />TDWG standard 2009<br />“The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information.”<br /><br />
    13. 13. DwC star schema model<br />
    14. 14. DwC extension for Germplasm<br />DwC Germplasm : DRAFT 0.1 : August 26, 2009<br /><ul><li> “MCPD in Darwin Core”
    15. 15. Maintained by gene banks worldwide
    16. 16. Additional terms to describe germplasm samples
    17. 17. Includes the new terms for crop trait experiments developed as part of the European EPGRIS3 project
    18. 18. Includes a few additional terms for new international crop treaty regulations</li></ul><br />
    19. 19. DwC Germplasm (1)<br />
    20. 20. DwC Germplasm (2)<br />
    21. 21. DwC Germplasm (3)<br />
    22. 22. DwC Germplasm (4)<br />
    23. 23. DwC Germplasm (5)<br />
    24. 24. DwC Germplasm (6)<br />GermplasmDistribution<br />Perhaps add new terms to facilitate the reporting of germplasm distribution for the ITPGRFA (International Treaty on Plant Genetic Resources for Food and Agriculture)<br />GermplasmManagement<br />The Millennium Seed Bank (Kew) has contributed feedback to the DwC-G modeling and proposed to include a number of seed management descriptors.<br /><ul><li> Seed processing terms
    25. 25. Seed cleaning
    26. 26. Seed germination testing</li></ul>ConservationStatus<br />Suggested by ENSCONET - threat status for populations in situ<br />
    27. 27. Mapping of DwC-G terms to the MCPD descriptors<br />
    28. 28. Mapping of DwC-G terms to the MCPD descriptors (continued)<br />
    29. 29. MCPD -> ABCD 2.06 (2004)<br />National Inventory Code<br />Institute Code<br />Accession Number<br />Collecting Number<br />Collecting Institute Code<br />Genus<br />Species<br />Species Authority<br />„Subtaxa“<br />„Subtaxa“ Authority<br />Common Crop Name<br />Accession Name<br />Acquisition Date<br />Country of Origin<br />Location of Collection Site<br />Latitude of CS<br />Longitude of CS<br />Elevation of CS<br />Collecting Date of Sample<br />Breeding Institute Code<br />Biological Status of Accession<br />Ancestral Data<br />Collecting/Acquisition Source<br />Donor Institute Code<br />Donor Accession Number<br />Other Identification (Number) associated with the accession<br />Location of Safety Duplicates<br />Type of Germplasm Storage<br />Remarks<br />Decoded Collecting Institute<br />Decoded Breeding Institute<br />Decoded Donor Institute<br />Decoded Safety Duplication Location<br />Accession URL<br />Helmut Knüpffer<br />IPK Gatersleben<br />Descriptors marked red did not match the earlier versions of ABCD<br /> ABCD was extended by a PGR section [W. Berendsohn, H. Knüpffer]<br />Walter Berendsohn<br />BGBM<br /><br />
    30. 30. Home:<br />Primary developers: Markus Döring, Tim Robertson, John Wieczorek<br />Source code: Java <br />Released: 2009<br />DEMO at<br />Genebank Example at<br />
    31. 31. Integrated Publishing Toolkit (IPT)<br />A tool in support of data publishers.<br />A simple and straightforward mechanism to share primary biodiversity data following the Darwin Core standard.<br />Open source, Java based web application.<br />Provides a local tool for data quality assessment.<br />
    32. 32. GBIF Integrated Publishing Toolkit (IPT)<br /><ul><li>Java 1.5 or higher is required
    33. 33. Apache Tomcat is recommended (1 GB RAM+)
    34. 34. GBIF IPT is provided as a WAR archive (for easy deployment)
    35. 35. GeoServer is included for web mapping (OGC Compliant, WFS, WMS, etc)
    36. 36. H2 Embedded Java Database (with JDBC interface and web console)
    37. 37. Hibernate (object relational mapping)</li></li></ul><li>IPT Interfaces<br /><ul><li>REST XML
    38. 38. TAPIR
    39. 39. DwC Archive
    40. 40. OGC (WFS, WMS, Web Mapping)
    41. 41. EML (Ecological Metadata Language)</li></li></ul><li>Darwin Core Archive (DwC-A)<br /><ul><li>DwC-A publishes DwC records including extensions
    42. 42. Simple text based format
    43. 43. Zipped single file archive</li></ul>Germplasm.txt<br /><br />
    44. 44. IPT service from NordGen at<br />Alternatives:<br />-------<br /><ul><li> TAPIR (2004 ->)</li></ul>-------<br /><ul><li>DiGIR (PHP, 2001-2006)
    45. 45. TapirLink (PHP, 2007 ->)</li></ul>-------<br /><ul><li>BioCASE (Python, 2001-2008)
    46. 46. PyWrapper3 (2006-2008)</li></ul>-------<br /><ul><li> EURISCO (tab-delimited, 2003) </li></ul>-------<br /><ul><li> ICIS (Java, 1996 ->)</li></ul>-------<br /><ul><li>BioMOBY (Perl, 2001 ->)</li></li></ul><li><ul><li> Embeds its own database
    47. 47. Multilingual
    48. 48. Has a user management feature based on roles, which allows for multiple data managers to share a common instance
    49. 49. Manages multiple data sources
    50. 50. Several upload options: relational database management systems or data files
    51. 51. Public web interface allows for data browsing and full text search
    52. 52. Customised detail pages</li></li></ul><li>GBIF IPT<br />GBIF IPT implements the Darwin Core Standard and provides an interface for building extensions to the core Darwin Core terms.<br />The draft germplasm extension is one example of how to extend the Darwin Core terms for the GBIF IPT.<br />
    53. 53. The IPT user interface includes the germplasm extension<br />
    54. 54. XML interface includes the germplasm extension<br />
    55. 55. The Harvesting and Indexing Toolkit (HIT)<br />Addresses the need of node managers to aggregate indexes of published primary biodiversity data.<br />Aims to reduce the complexity of heterogeneous networks of data publishers by shielding the end user from the differences between protocols.<br />
    56. 56. Global Biodiversity Resources Discovery System (GBRDS)<br />A Yellow Pages-style directory of biodiversity resources.<br />The IPT and HIT instances installed in the course of this project will be registered in the GBRDS.<br />Any biodiversity organisation should be able to register its resources and services in the GBRDS and contribute to the discovery services.<br />
    57. 57. Objectives of the European genebank project<br /><ul><li>Evaluate the GBIF decentralized architecture
    58. 58. Upgrade the Integrated Publishing Toolkit (IPT) with the genebank extension and develop associated documentation.
    59. 59. Install and test the IPT in various genebanks in Europe that, as far as possible, are also EURISCO/ECPGR partners.
    60. 60. Test the registration of IPT installations through the GBIF Global Biodiversity Resources Discovery System (GBRDS).
    61. 61. Test the Harvesting and Indexing Toolkit (HIT) installation for the EURISCO platform.
    62. 62. Install an IPT instance on the EURISCO platform and synchronize it with the GBIF central index.
    63. 63. Project runs until 20 December 2010.</li></li></ul><li>IPT deployment in Europe<br /><ul><li>NordGen in Sweden covering 5 countries (Denmark, Sweden, Finland, Norway and Iceland)
    64. 64. EURISCO / Bioversity-HQ (Italy)
    65. 65. Bioversity-Montpellier (France)
    66. 66. IPK Gatersleben (Germany)
    67. 67. WUR CGN (The Netherlands)
    68. 68. CRI (Czech Republic)
    69. 69. VIR (Russia)
    70. 70. Balkan countries (Albania, Bosnia, Croatia, Macedonia, Serbia, Romania)
    71. 71. Baltic countries (Estonia, Latvia, Lithuania)</li></li></ul><li>Possible PGR Network model<br /><ul><li>The gene bank dataset is shared from the holding gene bank.
    72. 72. The National Inventory (NI) endorses all national gene banks (and eventually individual accessions) for EURISCO.
    73. 73. ECPGR Crop databases can access passport data from EURISCO and additional crop specific data from the genebank IPT interface.
    74. 74. Standard data sharing tools ensure that the genebank dataset is available to other relevant decentralized thematic, regional or global networks.</li></li></ul><li>Potential of GBIF technology<br />Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.<br />The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (TDWG, GBIF).<br /><br />
    75. 75. Special thanks to:<br /><ul><li>GBIF, Global Biodiversity Information Facility
    76. 76. TDWG, Biodiversity Information Standards
    77. 77. BioCASE, The Biological Collection Access Service for Europe. </li></ul><br /><ul><li>Bioversity International </li></ul><br />Things can happen in a band, or any type of collaboration, that would not otherwise happen. (Jim Coleman, Musician)<br />
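
The Darwin Core Archive slide describes DwC-A as a simple text-based format, zipped into a single file, that carries core records plus extensions such as Germplasm.txt. A minimal sketch of that layout in Python follows. It assumes the star-schema convention of a tab-delimited core file, one extension file keyed on the core id, and a meta.xml descriptor. The germplasm namespace, the extension term names, the file names, and the sample record are placeholders invented for illustration; the slides do not give the real draft term URIs. This is not how the IPT itself builds archives.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

DWC_NS = "http://rs.tdwg.org/dwc/terms/"
TEXT_NS = "http://rs.tdwg.org/dwc/text/"
# Placeholder namespace: the real germplasm extension URIs are not in the slides.
GERMPLASM_NS = "http://example.org/germplasm/terms/"

CORE_TERMS = ["institutionCode", "catalogNumber", "genus", "specificEpithet"]
EXT_TERMS = ["biologicalStatus", "storageType"]  # hypothetical extension terms


def to_tsv(header, rows):
    """Serialize a header and rows as tab-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def describe_file(archive, tag, id_tag, row_type, location, namespace, terms):
    """Add a <core> or <extension> descriptor to the meta.xml tree."""
    node = ET.SubElement(archive, tag, {
        "encoding": "UTF-8",
        "fieldsTerminatedBy": "\\t",   # literal backslash-t, as written in meta.xml
        "linesTerminatedBy": "\\n",
        "ignoreHeaderLines": "1",
        "rowType": row_type,
    })
    files = ET.SubElement(node, "files")
    ET.SubElement(files, "location").text = location
    # Column 0 holds the record id (core) or the core id (extension).
    ET.SubElement(node, id_tag, {"index": "0"})
    for index, term in enumerate(terms, start=1):
        ET.SubElement(node, "field", {"index": str(index), "term": namespace + term})


def build_meta():
    """Return meta.xml as bytes, describing the core and one extension."""
    archive = ET.Element("archive", {"xmlns": TEXT_NS})
    describe_file(archive, "core", "id", DWC_NS + "Occurrence",
                  "occurrence.txt", DWC_NS, CORE_TERMS)
    describe_file(archive, "extension", "coreid", GERMPLASM_NS + "GermplasmSample",
                  "germplasm.txt", GERMPLASM_NS, EXT_TERMS)
    return ET.tostring(archive, encoding="utf-8")


def build_archive(core_rows, ext_rows):
    """Zip the core file, the extension file and meta.xml into one archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("occurrence.txt", to_tsv(["id"] + CORE_TERMS, core_rows))
        zf.writestr("germplasm.txt", to_tsv(["coreid"] + EXT_TERMS, ext_rows))
        zf.writestr("meta.xml", build_meta())
    return buf.getvalue()


if __name__ == "__main__":
    # Invented sample accession, for illustration only.
    data = build_archive(
        [["1", "DEMO", "ACC-0001", "Hordeum", "vulgare"]],
        [["1", "landrace", "seed collection"]],
    )
    with open("demo-dwca.zip", "wb") as handle:
        handle.write(data)
```

Keeping the extension in its own file, linked only through the core id column, is what lets a publisher add germplasm-specific columns without changing the core Darwin Core record layout.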