
Christophe Gueret: Publish Web data - an interactive session


Christophe Gueret (DANS, VU) “Publish Web data - an interactive session"
Presentation at the KnoweScape workshop "Evolution and variation of classification systems" March 4-5, 2015 Amsterdam



  1. 1. Data Archiving and Networked Services Publishing data on the Web Christophe Guéret (@cgueret) Evolution and variation of classification systems March 4-5, 2015 Amsterdam
  2. 2. Publishing data on the Web ● It's easy! Everybody does it! – … in very different ways :-/ ● Several issues, even if not so big: – Several competing standards and formats – Data hard to compare across sources – Lack of documentation – Limited capacity to assess trust – Missing dialog publisher ↔ consumer – etc.
  3. 3. http:// ● W3C working group “Data on the Web best practices” ● Part of the Data Activity – Also in this activity: Working group for CSV on the Web ● Chartered until July 2016, running since January 2014 ● Focus on defining best practices for publishing and using open data via the Web – Agnostic to technologies – Scope: government data, research data, cultural heritage data
  4. 4. Goals ● Publish a set of best practices for publishing and consuming open data – and a supporting list of use-cases and requirements ● Publish a vocabulary for quality and granularity description ● Publish a vocabulary for data usage description
  5. 5. Feb 24 headlines: first draft!!!
  6. 6. which means ... We need your feedback on the work done so far! © DonkeyHotey, Flickr
  7. 7. Plans for the remaining time ● Go quickly through the best practices ● Split up in groups of 3 or 4 persons ● Each group reviews the BPs and says what is missing, what should be deleted, what should be added, … – Write everything on post-its! ● We collect and cluster the input. This will be reported back to the group on Friday
  8. 8. The 27 best practices © James Perkins, Flickr
  9. 9. http://
  10. 10. Derived from http://
  11. 11. Grouped in topics (1/4) ● Metadata – What kind of metadata should be considered when describing data on the Web? – How can metadata be provided in a machine readable way? ● Data Identification – How can unique identifiers be provided for data resources? – How should URIs be designed and managed for persistence? ● Data Formats – What kind of data formats should be considered when publishing data on the Web? (List based on )
  12. 12. Grouped in topics (2/4) ● Data Vocabularies – How can existing vocabularies be used to provide semantic interoperability? – How can a new vocabulary be designed if needed? ● Data Licenses – How can data licenses be made machine readable? – How can license information about data published on the Web be provided/gathered? ● Data Provenance – How can data provenance information about data published on the Web be provided/gathered? (List based on )
  13. 13. Grouped in topics (3/4) ● Data Quality – How can data quality information about data on the Web be provided/gathered? ● Sensitive Data – How can data be published without infringing a person's right to privacy or an organization's security? ● Data Access – What kind of data access should be considered when publishing data on the Web? – What requirements should be taken into account when deciding how to make data available on the Web? (List based on )
  14. 14. Grouped in topics (4/4) ● Data Versions – How can different versions of a dataset be tracked and managed? ● Data Preservation – How can publishers decide when and how data on the Web should be archived? ● Feedback – How can user feedback about data consumed from the Web be gathered? (List based on )
  15. 15. Hands-on ! © Ian Collins, Flickr
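
As a starting point for the hands-on discussion, the questions on slides 11 and 12 about providing metadata and license information "in a machine readable way" can be made concrete with a small sketch. The Python below is only an illustration, not something taken from the slides or from the working group's draft: it assumes JSON-LD as the serialization and the DCAT and Dublin Core vocabularies as the terms, and the function name `describe_dataset` and all `example.org` URIs are hypothetical.

```python
import json


def describe_dataset(uri, title, license_uri, download_url, media_type):
    """Return a JSON-LD dictionary describing one dataset.

    Assumption: DCAT (dcat:) and Dublin Core terms (dct:) are used as the
    vocabularies; other vocabularies would work just as well.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": uri,  # identifier of the dataset itself
        "@type": "dcat:Dataset",
        "dct:title": title,
        # The license is given as a URI, so software can compare it
        # without parsing free text.
        "dct:license": {"@id": license_uri},
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": download_url},
            "dcat:mediaType": media_type,
        },
    }


# Hypothetical dataset, for illustration only.
metadata = describe_dataset(
    uri="http://example.org/dataset/classifications",
    title="Example classification data",
    license_uri="https://creativecommons.org/licenses/by/4.0/",
    download_url="http://example.org/dataset/classifications.csv",
    media_type="text/csv",
)

print(json.dumps(metadata, indent=2))
```

Because the license and the download location are URIs rather than free text, a consumer can check them programmatically, which is one possible answer to the "machine readable" questions raised in the topic list.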