Data exchange alternatives, GIGA TAG (2009)

GIGA TAG meeting at Bioversity International, Rome, Italy 18th May 2009. Data exchange alternatives for the Global Information on Germplasm Accessions (GIGA) project. Dag Endresen (Bioversity/NordGen).

Slide notes

  • IMAGE: http://blog.tapirtype.com/cartoons/ [Creative Commons License: http://creativecommons.org/licenses/by-nc-sa/3.0/us/]
  • http://www.bigfoto.com/miscellaneous/photos-05/index.htm
  • Photo: PICT0173.jpg Sub-section from Whale Safari to Kaikoura New Zealand. Photo Dag Terje Filip Endresen, October 2004.
  • http://www.tdwg.org
  • For more details, see: GBIF NODES meeting 2007 in Amsterdam. Agenda 09 Technical Training session - TAPIR/PyWrapper3: http://circa.gbif.net/Public/irc/gbif/nodes/library?l=/meetings/2007_10_amsterdam/tapir_pywrapper3/_EN_1.0_&a=i
  • IMAGE source: http://commons.wikimedia.org/wiki/Image:Handshake_(Workshop_Cologne_%2706).jpeg; Copyright: GNU Public Licence
  • http://en.wikipedia.org/wiki/Darwin_Core http://rs.tdwg.org/dwc/terms/index.htm http://code.google.com/p/darwincore/ http://code.google.com/p/darwincore/source/browse/#svn/trunk/xsd/profiles/germplasm
  • http://wwwdev.ngb.se/portal/index.php?scope=demo http://chm.grinfo.net/index.php?app=data_providers
  • Image source: University of Ottawa, Distributed Computing Research Group: http://www.genie.uottawa.ca/research/rsrch_site.php?lang=e&id=90 (Google Images). See also: http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing

Transcript

  • 1. Data exchange alternatives
    Global Information on Germplasm Accessions (GIGA, ALIS)
    2nd GIGA Technical Advisory Group Meeting
    Dag Terje Filip Endresen, Nordic Genetic Resources Center, NordGen (Sweden)
  • 2. Data exchange
    Cartoon by Sasha Kopf (Creative Commons)
  • 3. Data Exchange Format
    MCPD (1997)
    Multi Crop Passport Descriptors
    Darwin Core (2001) **
    New version up for revision at TDWG2009
    http://rs.tdwg.org/dwc/index.htm
    ABCD (2001)
    Access to Biological Collections Data
    http://wiki.tdwg.org/twiki/bin/view/ABCD
    GCP Passport (2005)
    http://www.generationcp.org
    Ontology (including all above)
    perhaps develop a new GIGA ontology
  • 4. Data Provider Software
    BioMOBY (2001)
    http://biomoby.org
    DiGIR (2002, not active)
    http://digir.sourceforge.net
    BioCASE (2003, PyWrapper v2)
    http://www.biocase.org
    EURISCO (2003, tab delimited text)
    http://eurisco.ecpgr.org
    PyWrapper 3 (2006, not active)
    http://trac.pywrapper.org
    TapirLink (2007)
    http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink
    GBIF Provider Toolkit (2009) **
    http://code.google.com/p/gbif-providertoolkit
  • 5. Data Harvest Infrastructure
    GIGA Registry (UDDI)
    New GIGA registry for germplasm dataset?
    ICIS and CropForge tools
    http://cropwiki.irri.org/icis/
    https://cropforge.org/
    GBIF data portal and registry**
    http://data.gbif.org
    gbrds.gbif.org (registry)
    GBIF Indexing Toolkit (2009)
    http://code.google.com/p/gbif-indexingtoolkit/
  • 6. Data Provider Software
  • 7. EURISCO tab delimited upload
    http://eurisco.ecpgr.org
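A minimal sketch of what reading a tab-delimited passport file could look like, assuming the first row carries MCPD descriptor names as column headers. The column selection, the institute code `XXX001`, and the accession number are made up for illustration; the actual EURISCO upload layout is not shown on the slide and may differ.

```python
import csv
import io

# Hypothetical sample: a header row of MCPD descriptor names, then one
# accession. All values are invented for illustration.
SAMPLE = (
    "INSTCODE\tACCENUMB\tGENUS\tSPECIES\n"
    "XXX001\tACC 1\tHordeum\tvulgare\n"
)


def read_passport_rows(text):
    """Parse tab-delimited text into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


rows = read_passport_rows(SAMPLE)
for row in rows:
    print(row["INSTCODE"], row["ACCENUMB"], row["GENUS"], row["SPECIES"])
```

Keying each row by descriptor name rather than by column position keeps the reader independent of column order, which tends to vary between providers.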
  • 8. BioMOBY
    The BioMOBY project was initiated in 2001 (in Saskatchewan, Canada).
    Two branches: web services (MOBY-S) and semantic web (S-MOBY).
    MOBY ontology-aware registry for discovery of both data and services.
    Works well with TAPIR and BioCASE.
    GCP has selected BioMOBY as its main web service technology.
    http://biomoby.org
  • 9. BioCASE 2.5
    The BioCASE provider software is a product of the EU-funded BioCASE project (2001-2004).
    Developed at BGBM in Berlin.
    Last updated in April 2008, with support for Python version 2.5.
    Data formats include: ABCD 2.06, Darwin Core, GCP_Passport, MCPD.
    http://www.biocase.org
  • 10. BioCASE 2.5
    Configuration
    • Add datasource (dsa)
    • Database connection
    • Database table structure
    • Mapping of data model to standard schema
  • 11. TAPIR
    TAPIR - TDWG Access Protocol for Information Retrieval.
    Work on a unified protocol, named TAPIR, started during the 2004 TDWG meeting in Christchurch, NZ.
    TAPIR is based on the protocols of two data provider software packages, BioCASE and DiGIR.
  • 12. PyWrapper3
    Home: http://trac.pywrapper.org/
    Primary developers: Markus Döring, Javier de la Torre
    Source code: Python
    14/07/2008 - Development stalled
    "We are sorry to inform you that development of the TAPIR branch of PyWrapper has been stalled. The latest 3.1 alpha version is not stable and not recommended for production!" (Message from the home page)
    PyWrapper is tested and verified to work fine with Windows, Mac OS X and Linux.
  • 13. Web configuration tool
    PyWrapper graphical web-based configuration tool
  • 14. TapirLink 0.6.1
    Home: http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink
    Primary developers: Renato De Giovanni, Dave Vieglais
    Download: http://sourceforge.net/project/showfiles.php?group_id=38190
    Source code: PHP
    Test resource with client form:
    http://localhost/tapirlink/tapir_client.php
    The XML client form is very useful for understanding exactly how the wrapper software works!
  • 15. GBIF IPT
    Home: http://code.google.com/p/gbif-providertoolkit/
    Primary developers: Markus Döring, Tim Robertson
    Download: http://code.google.com/p/gbif-providertoolkit/downloads/list
    Source code: Java
    DEMO at http://atlas.nordgen.org/ipt/
  • 16. GBIF IPT
    • The GBIF IPT is an open-source, Java-based web application that connects and serves three types of biodiversity data: primary taxon occurrence data, taxon checklists and general resource metadata.
    • The data registered in the IPT is connected to the GBIF distributed network and made available for public consultation and use.
    • Designed to transfer large numbers of records, and to decentralize and speed up the indexing of biodiversity occurrence datasets.
    • IPT also provides data publishers with a local tool for data quality assessment.
    • The data publisher can easily monitor data access and use.
  • 17. GBIF IPT
  • 18. GBIF IPT
  • 19. IPT
  • 20. GBIF IPT
  • 21. Web service interface
  • 22. Example TAPIR service SEARCH request
  • 23. Example TAPIR service SEARCH response
  • 24. Example of OAI-PMH service request
    Request types:
    Identify
    ListMetadataFormats
    ListSets
    GetRecord
    ListIdentifiers
    ListRecords
    http://an.oa.org/OAI-script?verb=GetRecord&identifier=oai:arXiv.org:hep-th/9901001&metadataPrefix=oai_dc
    OAI-PMH requests are submitted using either the HTTP GET or POST methods.
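An OAI-PMH GetRecord request like the one on this slide can also be assembled programmatically. The sketch below builds the GET form of the request with Python's standard library. The helper name `build_oai_request` is hypothetical, and the base URL is the placeholder from the slide, not a live endpoint.

```python
from urllib.parse import urlencode

# The six request types (verbs) listed on the slide.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "GetRecord",
    "ListIdentifiers",
    "ListRecords",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the URL for an OAI-PMH GET request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    params = {"verb": verb}
    params.update(arguments)
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    return base_url + "?" + urlencode(params)


url = build_oai_request(
    "http://an.oa.org/OAI-script",
    "GetRecord",
    identifier="oai:arXiv.org:hep-th/9901001",
    metadataPrefix="oai_dc",
)
print(url)
```

The printed URL differs from the slide only in that `:` and `/` inside the identifier are percent-encoded; a server decodes them back to the same value.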
  • 25. Example of OAI-PMH service RESPONSE
    OAI-PMH responses are returned as HTTP responses, with the Content-Type set to text/xml.
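As a rough illustration of consuming an OAI-PMH response, the sketch below parses a hand-written, heavily shortened GetRecord response and pulls out the record identifier. The sample XML (including its dates) is invented for this example and omits the metadata payload; only the `http://www.openarchives.org/OAI/2.0/` namespace and the element names come from the OAI-PMH specification. The function name `record_identifiers` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented, shortened sample response; a real one also carries <metadata>.
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2009-05-18T12:00:00Z</responseDate>
  <request verb="GetRecord">http://an.oa.org/OAI-script</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv.org:hep-th/9901001</identifier>
        <datestamp>1999-01-01</datestamp>
      </header>
    </record>
  </GetRecord>
</OAI-PMH>"""


def record_identifiers(xml_bytes):
    """Return the identifier of every record header in the response."""
    root = ET.fromstring(xml_bytes)
    headers = root.findall(".//oai:header/oai:identifier", OAI_NS)
    return [element.text for element in headers]


identifiers = record_identifiers(SAMPLE_RESPONSE)
print(identifiers)
```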
  • 26. GBIF PGR Network 2
    [http://data.gbif.org/datasets/network/2]
  • 27. Darwin Core
    A new version of Darwin Core is up for public review.
    http://rs.tdwg.org/dwc/terms/index.htm
    TDWG 2009, Montpellier, November 9-13
    DRAFT Germplasm extension
    http://code.google.com/p/darwincore/source/browse/#svn/trunk/xsd/profiles/germplasm
    RDF, LSID, ontology friendly
  • 28. Outlook
    The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (GBIF, TDWG).
    Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
    Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
    New data portals and tools for a specific crop, a regional or thematic network, or a similar subset of the total global biodiversity datasets can be established with rather little effort!
    Adapted from a slide by Helmut Knüpffer (IPK Gatersleben)
  • 29. Special thanks to
    • Bioversity International http://www.bioversityinternational.org
    • GBIF, Global Biodiversity Information Facility http://www.gbif.org
    • BioCASE, The Biological Collection Access Service for Europe http://www.biocase.org
    • TDWG, Biodiversity Information Standards http://www.tdwg.org
  • 30. Data portal example (2006)
  • 31. http://wwwdev.ngb.se/portal/index.php?scope=demo
  • 32-35. (no transcript text)
  • 36. Data Harvest
  • 37. GBIF GBRDS
    http://gbrds.gbif.org
  • 38. GBIF GBRDS
    http://gbrds.gbif.org
  • 39. Fallacies of Distributed Computing
    The network is reliable.
    Latency is zero.
    Bandwidth is infinite.
    The network is secure.
    Topology doesn't change.
    There is one administrator.
    Transport cost is zero.
    The network is homogeneous.
    This list of fallacies came about at Sun Microsystems around 1994.
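One practical consequence of the first fallacy for a data harvester: requests to remote providers will sometimes fail, so calls are usually wrapped in a bounded retry. The following is a generic sketch, not part of any GBIF or TAPIR toolkit; `call_with_retries` and `flaky_harvest` are hypothetical names, and the simulated failure stands in for a real network call.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on OSError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except OSError as error:  # includes ConnectionError and timeouts
            last_error = error
            if attempt < attempts - 1:
                # Wait 0.5 s, 1 s, 2 s, ... between attempts by default.
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Simulated provider that fails twice before answering.
calls = {"count": 0}


def flaky_harvest():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("provider unreachable")
    return "harvested"


# Record the delays instead of actually sleeping, to keep the demo fast.
delays = []
result = call_with_retries(flaky_harvest, sleep=delays.append)
print(result, calls["count"], delays)
```

Passing the sleep function as a parameter keeps the example fast and makes the backoff schedule easy to inspect.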