The agINFRA Germplasm Working Group


Published on

Presentation about the agINFRA Germplasm Working Group. Presented during Session 1 of the 1st International e-Conference on Germplasm Data Interoperability.

Published in: Education, Technology
  • Heterogeneous data types and formats,
  • OAI-PMH harvesting is not an option in the case of germplasm data

    1. 1. The Germplasm Working Group Dr. Vassilis Protonotarios Agricultural Biotechnologist, PhD Agro-Know Technologies, Greece e-Conference on Germplasm Data Interoperability Session 1: “The vision of Linked Germplasm Data”
    2. 2. Structure of the presentation 1. Background – About the agINFRA project – Issues related to data sharing 2. The Germplasm Working Group – Objectives – Wiki – Link with RDA 3. The next steps
    3. 3. Background
    4. 4. The agINFRA project • A project funded under the FP7 program of the EC • Consortium with expertise on – Technology / infrastructures – Data / data management Combined to facilitate agricultural data sharing More info at:
    5. 5. The agINFRA project • Aims to enhance the interoperability between the agricultural data sources – Data sharing by • Metadata aggregation & linking data • Design and deploy the linked ag-data framework – Methodology for linking data – Provide the infrastructure needed • Both cloud- and grid-based services • Tools, APIs etc.
    6. 6. agINFRA major data types: Bibliographic, Agri Statistics & Economics, Raw data, Profiles, Educational, Germplasm, Soil data, Other?
    7. 7. agINFRA major data sources (data type: data providers) • Bibliographic: FAO AGRIS, CASDD (CAAS) • Educational: Organic.Edunet, Green Learning Network, LAFLOR • Germplasm: Chinese Crop Germplasm Information System (CAAS), Italian National Germplasm Database (CRA) • Soil data: Italian National Center for Soil Mapping • Statistical: FAOSTAT, CountrySTAT • Researchers’ profiles, organizations & events: AGRIVIVO
    8. 8. Focusing on germplasm [Diagram, data flow: local databases (Italian university, Italian research center, Chinese research center); national databases (Italian, Chinese); aggregators (GENESYS, EURISCO, GBIF)]
    9. 9. Focusing on germplasm [Diagram: aggregators (GENESYS, EURISCO); national databases (Italian, Chinese); local databases (Italian university, Italian research center, Chinese research center)]
    10. 10. The issue? • Heterogeneity! – Data types – Data formats – Data management workflows – Standards used – Metadata exposure options – … • Lack of connectivity with other data sources
    11. 11. The Germplasm Working Group
    12. 12. The Germplasm Working Group • Created in the context of the agINFRA project • Initially included agINFRA stakeholders – now expanded to host all stakeholders • The group is NOT a group of experts on germplasm data!
    13. 13. The scope of the Germplasm WG • Aims to enable/enhance interoperability between germplasm databases – By developing the services for • exchanging their data and • delivering their data to other partners • Focusing on three actions: 1. IDENTIFY 2. ORGANIZE 3. PROPOSE
    14. 14. Germplasm WG objectives • IDENTIFY: collect all information related to germplasm data – People/groups – Namespaces (metadata, KOS) – Standards – Workflows – Events • ORGANIZE: engage all stakeholders & available resources, analyze existing standards, facilitate collaboration • PROPOSE: linked data framework to connect data sources – facilitate data sharing between germplasm data sources
    15. 15. Germplasm related information metadata schemas Working groups in germplasm Events (for connecting stakeholders) KOS (ontologies, thesauri, vocabularies etc.) data management workflows Data exposure capabilities
    16. 16. Germplasm related information KOS (ontologies, thesauri, vocabularies etc.) Working groups in germplasm metadata schemas Events (for connecting stakeholders) data management workflows Data exposure capabilities
    17. 17. Proposed methodology 1. Analyze metadata schemas & KOSs used to describe germplasm resources 2. Define attributes & vocabularies that can be used to expose germplasm resources in linked data format. 3. Provide a set of recommendations for the exposure of germplasm resources as linked data 4. Embed the recommendations in the data infrastructure of agINFRA – to allow the exposure of germplasm resources as LOD.
    18. 18. The Germplasm WG wiki • Central point of reference • Freely accessible (no login required)
    19. 19. Information available so far • Vision • Activities • Outcomes • Participants • Next steps • Useful resources – Data sources – Standards – Services – Stakeholders • Events
    20. 20. Key outcomes of the group • Dossier on Germplasm Information: – Major programs – Major information systems and services – agINFRA germplasm data sources (CGRIS & CRA) – Core standards for germplasm information – Plant nomenclature, taxonomies and ontologies – Plant genomic resources – Related references and links • Freely available from the Germplasm Group wiki
    21. 21. Existing participants
    22. 22. Our wish list (tentative list) Reusing experiences from …and working closely with
    23. 23. Connection with RDA • RDA: Research Data Alliance • Aims to “accelerate and facilitate research data sharing and exchange” • Structure: – Interest Groups: Cover wider topics – Working Groups: Working on focused topics
    24. 24. Connection with RDA • Representation of agINFRA Germplasm WG in – 1st RDA Plenary Meeting (March 2013, Gothenburg, Sweden) – 2nd RDA Plenary Meeting (September 2013, Washington D.C., USA) • Suggestion for a Germplasm WG in RDA
    25. 25. Link between WG and RDA Groups
    26. 26. Link between WG and RDA Groups (columns: agINFRA WG, RDA IG/WG) • Interactions with data providers • Collection of large-scale data • Collection of requirements • Two (2) case studies • Development of Best Practices • Analysis of existing standards • Collection of requirements • Definition of data management workflows • Interaction with other IGs/WGs (e.g. metadata, LD) • Application in more cases • Wider exposure of outcomes • Development & adaptation of tools and services • Development of Best Practices • Development of Best Practices
    27. 27. The next steps
    28. 28. Towards the linking of germplasm data sources 1. Definition and application of the linked data for the agINFRA germplasm data sources 2. Recording and documentation of the process 3. Identification of issues 4. Suggestion for solutions to these issues 5. Fine-tuning of workflow 6. Development of Best Practices
    29. 29. …and more next steps • Update the existing analysis with new data • Collect new user requirements • (re)define the mappings between metadata schemas and KOSs • Fine-tune the linked data approach
    30. 30. Source: Contact me:
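
The proposed methodology on slide 17 (analyze the schemas in use, define attributes and vocabularies, then expose germplasm resources as linked data) and the schema mappings mentioned on slide 29 can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not something stated in the presentation: the field names follow the style of multi-crop passport descriptors, the choice of Darwin Core terms as target properties is only one possible mapping, and the `example.org` namespaces, the `CROSSWALK` table, the function names and the sample record are all hypothetical.

```python
from urllib.parse import quote

# Darwin Core term namespace (a real vocabulary, chosen here only as an example target).
DWC = "http://rs.tdwg.org/dwc/terms/"
# Hypothetical namespaces, used where no real URI is assumed.
EX = "http://example.org/vocab/"
BASE = "http://example.org/accession/"

# Assumed crosswalk: source passport field -> property URI.
CROSSWALK = {
    "ACCENUMB": DWC + "catalogNumber",
    "GENUS": DWC + "genus",
    "SPECIES": DWC + "specificEpithet",
    "ORIGCTY": EX + "countryOfOrigin",  # hypothetical property
}


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record):
    """Turn one passport record (a dict of strings) into a list of N-Triples lines."""
    # The subject URI is built from the holding institute code and accession number.
    subject = "<{}{}/{}>".format(
        BASE,
        quote(record["INSTCODE"], safe=""),
        quote(record["ACCENUMB"], safe=""),
    )
    lines = []
    for field, prop in CROSSWALK.items():
        value = record.get(field)
        if value:  # skip missing or empty fields
            lines.append('{} <{}> "{}" .'.format(subject, prop, escape_literal(value)))
    return lines


# Made-up sample record; the empty ORIGCTY value is skipped on output.
sample = {
    "INSTCODE": "XXX001",
    "ACCENUMB": "A-0001",
    "GENUS": "Triticum",
    "SPECIES": "aestivum",
    "ORIGCTY": "",
}

for line in record_to_ntriples(sample):
    print(line)
```

Keeping the crosswalk as plain data, separate from the serialization code, means a different source schema would only need a different mapping table, which is roughly what redefining the mappings between metadata schemas and KOSs would involve in a sketch like this.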