EDF2013: Selected Talk Søren Roug: Reportnet – a Case Study

Selected Talk by Søren Roug, European Environment Agency, at the European Data Forum 2013, 10 April 2013 in Dublin, Ireland: Reportnet – a Case Study

Transcript of "EDF2013: Selected Talk Søren Roug: Reportnet – a Case Study"

  1. Reportnet – a Case Study
     Søren Roug, European Environment Agency
  2. What is Reportnet?
     • 14 year old system to receive data from countries
     • About 400 reporting obligations
     • Is also an archive with persistent URLs
     • Countries upload (mainly) XML files with a web browser to a repository
     • http://cdr.eionet.europa.eu/
     • Online web questionnaire using XForms for small data amounts
     • Field validation
  3. Automatic QA
     • Countries can run automatic QA on XML files before final delivery
     • Uses the XQuery language
     • Rules selected by the schema identifier in the XML file
     • Feedback is HTML
     • Very popular
     • Improves quality of deliveries
     • Decreases mundane work for humans
     • Clients very imaginative in devising new rules
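
The rule-selection step on this slide can be illustrated with a small sketch. Reportnet itself uses XQuery for its rules; the Python below is only a stand-in to show the idea of picking rules by the schema identifier declared in the XML file and returning HTML feedback. The schema URL, the `station` element, and the rule `check_station_codes` are hypothetical and not taken from the talk.

```python
import xml.etree.ElementTree as ET
from html import escape

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def schema_identifier(xml_text):
    """Return the schema location declared on the root element, or None."""
    root = ET.fromstring(xml_text)
    # Prefer xsi:noNamespaceSchemaLocation, fall back to xsi:schemaLocation.
    location = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if location is None:
        pairs = (root.get(f"{{{XSI}}}schemaLocation") or "").split()
        # schemaLocation holds "namespace location" pairs; take the first location.
        location = pairs[1] if len(pairs) >= 2 else None
    return location

def check_station_codes(root):
    """Hypothetical rule: every <station> needs a non-empty code attribute."""
    return [f"station #{i} has no code"
            for i, station in enumerate(root.iter("station"), start=1)
            if not station.get("code")]

# Hypothetical registry: schema identifier -> list of rule functions.
RULES = {"http://example.org/schemas/stations.xsd": [check_station_codes]}

def run_qa(xml_text):
    """Run the rules registered for the file's schema and return HTML feedback."""
    rules = RULES.get(schema_identifier(xml_text))
    if not rules:
        return "<p>No QA rules registered for this schema.</p>"
    root = ET.fromstring(xml_text)
    messages = [msg for rule in rules for msg in rule(root)]
    if not messages:
        return "<p>No errors found.</p>"
    return "<ul>" + "".join(f"<li>{escape(m)}</li>" for m in messages) + "</ul>"

SAMPLE = """<stations xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://example.org/schemas/stations.xsd">
  <station code="DK001"/>
  <station/>
</stations>"""

print(run_qa(SAMPLE))
```

Keeping the registry keyed by schema identifier means a new reporting obligation only needs a new entry, which fits the observation that clients keep devising new rules.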
  4. Automatic QA popularity problem
     • Need for checking against large code lists
     • Need for comparison with previous delivery
     • Not necessarily same schema
     • XQuery and XForms can load code lists in XML format into memory
     • But not on this scale
     • Idea: Load the data into a database and implement a ReST web service
     • Very heterogeneous data
  5. Solution: Conversion to RDF and import
     • Very easy for code lists
     • When a delivery is made it is automatically converted to RDF and imported into the triple store
     • Can be done with XSL-T
     • SPARQL is the ReST web service
     • New problem: RDF modelling
     • New problem: Small code lists
     • New opportunity: linking of vocabularies
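
A rough sketch of the conversion step may help here. The talk says the conversion can be done with XSL-T; the Python below is a substitute that shows the same mapping idea (one record element becomes a subject, each child element becomes a triple) and emits N-Triples lines. The namespaces `BASE` and `VOCAB`, the `station` records, and the endpoint URL are all assumptions made for illustration. The second function shows how a SPARQL endpoint can act as the web service, since the SPARQL Protocol accepts a query as the `query` parameter of an HTTP GET.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

BASE = "http://example.org/delivery/"   # hypothetical subject namespace
VOCAB = "http://example.org/schema#"    # hypothetical property namespace

def literal(value):
    """Quote a string as a plain N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def xml_to_ntriples(xml_text, record_tag, id_attr):
    """Map each record element to a subject and each non-empty child to a triple."""
    triples = []
    for record in ET.fromstring(xml_text).iter(record_tag):
        record_id = record.get(id_attr)
        if not record_id:
            continue  # no identifier, so no stable subject URI can be built
        subject = f"<{BASE}{quote(record_id, safe='')}>"
        for child in record:
            text = (child.text or "").strip()
            if text:
                triples.append(f"{subject} <{VOCAB}{child.tag}> {literal(text)} .")
    return triples

def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL; the query travels in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"

SAMPLE = """<stations>
  <station code="DK001"><name>Roskilde</name><country>DK</country></station>
</stations>"""

for line in xml_to_ntriples(SAMPLE, "station", "code"):
    print(line)

print(sparql_get_url("http://example.org/sparql", "ASK { ?s ?p ?o }"))
```

A mechanical mapping like this is where the "RDF modelling" problem from the slide shows up: the choice of subject URIs and property names decides whether deliveries from different schemas can later be compared or linked.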
  6. New possibilities – new ideas
     • What can we do when the deliveries are in the triple store?
     • More QA?
     • Auto-correction?
     • Linking the data?
     • Extract a reference dataset?
     • Inference?
     • Post-harvest SPARQL scripts to insert, replace or delete triples
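
As an illustration of what a post-harvest script might look like, the sketch below holds a SPARQL 1.1 Update that replaces one literal value with another and wraps it in an HTTP request. The graph name, the `ex:country` property, the "UK" to "GB" correction, and the endpoint URL are invented for the example; the talk does not describe the actual scripts. The request is only prepared, not sent.

```python
from urllib.request import Request

# Hypothetical vocabulary, graph name and correction, for illustration only.
NORMALISE_COUNTRY = """
PREFIX ex: <http://example.org/schema#>
WITH <http://example.org/graph/deliveries>
DELETE { ?station ex:country "UK" }
INSERT { ?station ex:country "GB" }
WHERE  { ?station ex:country "UK" }
"""

def update_request(endpoint, update):
    """Prepare a SPARQL 1.1 Update as a direct POST (the request is not sent here)."""
    return Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )

req = update_request("http://example.org/sparql-update", NORMALISE_COUNTRY)
print(req.get_method(), req.full_url)
```

The DELETE/INSERT/WHERE form covers all three operations named on the slide: dropping the INSERT clause gives a delete, dropping the DELETE clause gives an insert, and using both gives a replace.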
  7. Reportnet pipeline
     [Diagram; labels: XML file, SPARQL QA, RDF, XSL-T, Reportnet triple store, Import, XQuery QA, Semantic data service, Repository]
  8. Post delivery pipeline
     [Diagram; labels: Reportnet triple store, Create ref. data, Semantic data service, Eurostat data, World Bank, Website, Charts]
  9. Visualisation
  10. Visualisation