A short presentation of EPGRIS,
                  at the N.I. Vavilov Institute of Plant Industry
                  by Dag Terje Endresen, The Nordic Gene Bank.




The three-year EU-funded project EPGRIS (European Plant Genetic Resources
Information Infra-Structure) aims to produce a European PGR Search Catalog (EURISCO)
containing passport data for collections maintained ex situ in Europe. The project started
in October 2000. The participating countries are divided into four regions: north, south,
east and west. Region 1 includes Denmark, Estonia, Finland, Germany, Iceland,
Latvia, Lithuania, Norway, Poland, Russia and Sweden. The Nordic Gene Bank (NGB) is the
coordinator for region 1.

Today there are several European search catalogs for different crops within the
ECP/GR networks. One of the motivations behind the EPGRIS project is to simplify
the data flow. The data flows for the Central Crop Data Bases (CCDB) and for
EURISCO are illustrated in figure 1 and figure 2.




Figure 1. Information flow for the central crop databases today. Each gene bank uploads data to many
Central Crop Data Bases (CCDB). Each Central Crop Data Base receives data sets from many gene
banks.

Figure 2. Information flow with EURISCO. National inventories hold all PGR data sets of the national gene
banks and upload them directly to the multi-crop search catalog EURISCO.



Each region is expected to run its own regional process to produce its
subset of the passport data to be included in EURISCO. The methods of collecting
passport data may differ from country to country. Some gene banks do not have a
permanent internet connection and must actively upload their data sets, while others are
permanently online and can make their data sets available online. EPGRIS will use the
IPGRI/FAO Multi-Crop Passport Descriptor List (MCPDL) for data exchange. Some
gene banks will transform the data from their documentation system into the MCPDL
themselves; others will prefer to make their ‘raw’ local data set available (together with
a decoding table) and leave it to the regional coordinator to transform the data into the
MCPDL format. The goal, however, is an automatic procedure in which the
national inventories present their data directly to EURISCO in the standardised
format.
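
The transformation from a ‘raw’ local data set plus decoding table into MCPDL descriptors could look roughly like the following Python sketch. The local field names, the decoding table contents and the institute code XXX001 are hypothetical placeholders; only the descriptor names (INSTCODE, ACCENUMB, GENUS, SPECIES, ORIGCTY) are taken from the descriptor list.

```python
# Hypothetical mapping from local field names to MCPDL descriptor names.
FIELD_MAP = {
    "inst": "INSTCODE",
    "acc_no": "ACCENUMB",
    "genus": "GENUS",
    "species": "SPECIES",
    "country": "ORIGCTY",
}

# Hypothetical decoding table supplied with the raw data:
# local value -> standard value, per local field.
DECODING_TABLE = {
    "country": {"Sverige": "SWE", "Norge": "NOR", "Suomi": "FIN"},
}


def to_mcpdl(raw_record):
    """Map one raw local record (a dict) to MCPDL descriptor names."""
    out = {}
    for local_name, descriptor in FIELD_MAP.items():
        value = raw_record.get(local_name, "").strip()
        # Decode the local value if the field has a decoding table;
        # values without an entry are passed through unchanged.
        value = DECODING_TABLE.get(local_name, {}).get(value, value)
        out[descriptor] = value
    return out


raw = {"inst": "XXX001", "acc_no": "1234", "genus": "Hordeum",
       "species": "vulgare", "country": "Sverige"}
record = to_mcpdl(raw)
print(record)
```

Keeping the field map and the decoding table as plain data means a regional coordinator could maintain one pair per gene bank without changing the code.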

EPGRIS will initially use only flat ASCII files for file exchange. For the Nordic
countries, and especially for Russia, ASCII is insufficient, because national characters
have to be transliterated. For the collection of passport data in region 1 we have
agreed to accept different character sets, as long as the character set used is specified.
The data sets will then be combined and stored in the Unicode character set before
being transliterated and submitted to EURISCO. When EPGRIS later solves the character
set ‘problem’, region 1 will in this way already have routines in place to provide
the correct characters.
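
As a minimal sketch of why the declared character set matters, the Python example below decodes two submissions using the character set each sender declared and stores the combined result as UTF-8. The sample values and the choice of KOI8-R and Latin-1 are assumptions for illustration only.

```python
def decode_submission(raw_bytes, declared_charset):
    """Decode a submitted data set using the character set its sender declared."""
    return raw_bytes.decode(declared_charset)


# Two hypothetical submissions, each in a different 8-bit character set.
russian = "Пшеница".encode("koi8_r")
swedish = "Värmland".encode("latin_1")

combined = [
    decode_submission(russian, "koi8_r"),
    decode_submission(swedish, "latin_1"),
]

# Once decoded, every value is a Unicode string, so the combined
# data set can be stored in a single encoding such as UTF-8.
stored = "\n".join(combined).encode("utf-8")
print(stored.decode("utf-8"))
```

Without the declared character set the bytes are ambiguous: the same byte values decode to different letters under KOI8-R and Latin-1.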

The flat text file is the preferred format for data exchange. In region 1 it has been agreed
to temporarily accept other file formats as well, such as dbf and Excel, to get ‘fresher’ and
more complete data faster. Many gene banks in EPGRIS region 1 still keep their
original data in these formats, and producing a flat text file would in these cases mean
manual work. NGB will produce automatic or semi-automatic scripts for the
transformation to text files encoded in Unicode (and ASCII) to reduce the manual work
involved.
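
A script of this kind might, as a rough sketch, write records to a tab-delimited text file as shown below. Reading dbf or Excel files is not shown; the records are assumed to be available already as Python dictionaries, and the column selection and sample values are hypothetical. Because a stray tab or line break inside a field would break a flat file, the sketch replaces them with spaces before writing.

```python
import io

# Hypothetical column selection, using MCPDL descriptor names.
COLUMNS = ["INSTCODE", "ACCENUMB", "GENUS", "SPECIES"]


def clean(value):
    """Collapse every run of whitespace (tabs and line breaks included) to one space."""
    return " ".join(str(value).split())


def write_flat_file(records, stream):
    """Write records (dicts) as tab-delimited text with a header row."""
    stream.write("\t".join(COLUMNS) + "\n")
    for rec in records:
        stream.write("\t".join(clean(rec.get(col, "")) for col in COLUMNS) + "\n")


# An in-memory stream stands in for a file opened with encoding="utf-8".
buffer = io.StringIO()
write_flat_file(
    [{"INSTCODE": "XXX001", "ACCENUMB": "12\t34",
      "GENUS": "Hordeum\r\n", "SPECIES": "vulgare"}],
    buffer,
)
text = buffer.getvalue()
print(text)
```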

The flat ASCII text file for data exchange is very sensitive to ‘illegal’ characters. One
misplaced carriage return or tab can ruin the record, or possibly even all the data
below the ‘illegal’ character. One solution (or improvement), suggested by Germany, is
to use XML (Extensible Markup Language). XML is becoming one of the
standard data exchange formats on the internet and will probably eventually be the
most common exchange format for EURISCO as well. XML is designed to make it easy
to interchange structured documents over the internet. For gene bank passport data, an
accession could in XML terminology be an ‘entity’ or ‘object’, and the descriptors its
‘attributes’ or ‘properties’. To allow the computer to check the structure of a data set, a
Document Type Definition (DTD) could declare each of the permitted entities,
elements and attributes, and the relationships between them.
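
To illustrate the idea, the Python sketch below writes one accession as XML and reads it back. The element layout (one child element per descriptor) and the DTD text are hypothetical, not a format agreed for EURISCO, and the institute code is a placeholder. The standard-library parser used here does not validate against a DTD; that step would need a validating parser and is not shown.

```python
import xml.etree.ElementTree as ET

DESCRIPTORS = ("INSTCODE", "ACCENUMB", "GENUS", "SPECIES")

# Hypothetical DTD: a list of accessions, each with one element per descriptor.
DTD = """\
<!ELEMENT accessions (accession*)>
<!ELEMENT accession (INSTCODE, ACCENUMB, GENUS, SPECIES)>
<!ELEMENT INSTCODE (#PCDATA)>
<!ELEMENT ACCENUMB (#PCDATA)>
<!ELEMENT GENUS (#PCDATA)>
<!ELEMENT SPECIES (#PCDATA)>
"""


def accession_to_xml(record):
    """Build an <accession> element with one child element per descriptor."""
    accession = ET.Element("accession")
    for name in DESCRIPTORS:
        ET.SubElement(accession, name).text = record[name]
    return accession


root = ET.Element("accessions")
root.append(accession_to_xml({
    "INSTCODE": "XXX001",
    "ACCENUMB": "12\t34 <old>",   # a tab and markup characters inside a value
    "GENUS": "Hordeum",
    "SPECIES": "vulgare",
}))

xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)
print(xml_text)
```

Because each value is delimited by its own tags and markup characters are escaped on output, the tab inside the accession number no longer shifts the remaining fields.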

The common search catalog (EURISCO) will be an important tool for improving
collaboration between European gene banks. It will be easier to compare and combine
data sets for analysis, to extract interesting PGR material or to find possible
duplicates. A search catalog of passport data is also a good starting point for adding
more descriptive data to the common ‘data pool’, or for linking from the search
catalog to more detailed information provided online by the individual gene banks or
national inventories. Another interesting task could be to link EURISCO to other large
PGR search catalogs such as GRIN or SINGER.
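
Once passport data from many gene banks share one format, a first screening for possible duplicates can be quite simple. The Python sketch below groups accessions by a normalised genus and accession name; the matching rule and all sample records are assumptions for illustration, and a real screening would need more careful criteria.

```python
from collections import defaultdict


def normalise(name):
    """Lower-case a name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def candidate_duplicates(records):
    """Return groups of (INSTCODE, ACCENUMB) that share genus and accession name."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalise(rec["GENUS"]), normalise(rec["ACCENAME"]))
        groups[key].append((rec["INSTCODE"], rec["ACCENUMB"]))
    # Only keys held by more than one accession are candidate duplicates.
    return [holders for holders in groups.values() if len(holders) > 1]


# Hypothetical records from two institutes.
records = [
    {"INSTCODE": "AAA001", "ACCENUMB": "1", "GENUS": "Hordeum", "ACCENAME": "Example Barley"},
    {"INSTCODE": "BBB002", "ACCENUMB": "77", "GENUS": "hordeum", "ACCENAME": "example-barley"},
    {"INSTCODE": "AAA001", "ACCENUMB": "2", "GENUS": "Avena", "ACCENAME": "Example Oat"},
]
duplicates = candidate_duplicates(records)
print(duplicates)
```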




Figure 3. A demo of the search catalog has been published at http://ipgri.singer.cgiar.org



A first demo of the EURISCO search catalog was prepared for the sub-regional
meetings and may still be accessed at http://ipgri.singer.cgiar.org (figure 3). NGB has
an online example of how the transliteration of characters might work
at http://www.ngb.se/epgris/epgris_test2.php. For the page to work, your browser must
support Unicode for the ’No translitteration’ option and Latin 1 for
the ’Latin-1’ option. MS Windows 2000, MS Windows NT 4, MS Windows XP and
Linux systems all support Unicode. Even older systems such as MS Windows 95
and MS Windows 98 will show the text properly with Unicode-aware software such
as Netscape, Internet Explorer or Outlook Express, to name a few. You also need a
font that contains all the characters. A modern computer system today will
lack only the most exotic characters, and all the European languages should be
displayed correctly. See figure 4 and figure 5. The example uses the UTF-8 Unicode
encoding. Another page shows online export of the complete NGB
ex situ collection, or a selection of it, to the IPGRI/FAO Multi-Crop Passport Descriptors. See figure 6.
For more information on the EPGRIS project, see the home page at
http://www.ecpgr.cgiar.org/epgris/
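
The two display options can be sketched in Python as follows. This is not the code behind the NGB demo page; the option names are hypothetical, and the Cyrillic-to-Latin table is a small illustrative fragment rather than a standard transliteration scheme.

```python
# Illustrative, partial table (an assumption, not a standard scheme).
CYRILLIC_TO_LATIN = {"Р": "R", "о": "o", "ж": "zh", "ь": "'"}


def to_latin1(text):
    """Return text in which every character can be encoded as Latin-1."""
    out = []
    for ch in text:
        try:
            ch.encode("latin_1")   # representable in Latin-1, keep unchanged
            out.append(ch)
        except UnicodeEncodeError:
            # Transliterate via the table; unknown characters become '?'.
            out.append(CYRILLIC_TO_LATIN.get(ch, "?"))
    return "".join(out)


def render(text, option):
    """Encode text for output according to the chosen display option."""
    if option == "unicode":        # no transliteration, UTF-8 output
        return text.encode("utf-8")
    if option == "latin-1":        # transliterate first, then Latin-1 output
        return to_latin1(text).encode("latin_1")
    raise ValueError(f"unknown option: {option}")


name = "Рожь, Åland"
print(render(name, "unicode").decode("utf-8"))
print(render(name, "latin-1").decode("latin_1"))
```

Nordic letters such as Å are part of Latin-1 and pass through unchanged; only characters outside Latin-1, such as Cyrillic, need transliteration.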




Figure 4. An example of a search result presented in Unicode without transliteration.
http://www.ngb.se/epgris/epgris_test2.php




Figure 5. An example of a search result presented with transliteration to Latin 1.
http://www.ngb.se/epgris/epgris_test2.php

Figure 6. The Nordic Gene Bank has an online tool for exporting all or a selection of the NGB collection
to the Multi-Crop Passport Descriptor List format at http://www.ngb.se/epgris/mcpdl/




References:
* EPGRIS home page, http://www.ecpgr.cgiar.org/epgris
* EURISCO DEMO Web site, http://ipgri.singer.cgiar.org
* EPGRIS Demo at the Nordic Gene Bank, http://www.ngb.se/epgris
* SINGER, http://www.singer.cgiar.org
* GRIN, http://www.ars-grin.gov/npgs
* EPGRIS PowerPoint presentation (unpublished, filename: EPGRIS.ppt, 26 July
2001, 419 kB)
Data Exchange Model Of EPGRIS, seminar at the Vavilov Institute in St Petersburg (12 March 2002)
