
EDF2012: The Web of Data and its Five Stars



Published in: Technology


  1. 1. The Web of Data and its Five Stars Richard Cyganiak, DERI, NUI Galway @cygri 6 June 2012 Realising and Exploiting the EU data cloud European Data Forum, Copenhagen, Denmark
  2. 2. Generating insight from data •  Today, data is abundant •  New middlemen find new ways of getting data to the end user •  Supply and demand for data are higher than ever •  The analyst's problem is no longer a lack of relevant data, but: •  Understanding data •  Assessing applicability •  Getting it into the right form for use •  Similar problems inside and outside of the firewall
  3. 3. From the Web to the Web of Data
  4. 4. Tim Berners-Lee’s 5-star plan for an open web of data ★ Make data available on the Web under an open license ★★ Make it available as structured data ★★★ Use a non-proprietary format ★★★★ Use URIs to identify things ★★★★★ Link your data to other people’s data to provide context
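As a rough illustration of how the five stars build on each other, the sketch below scores a dataset description. It assumes the stars are cumulative (a star only counts once all earlier ones are earned); the `DatasetDescription` class and its field names are hypothetical and not part of the talk.

```python
from dataclasses import dataclass


@dataclass
class DatasetDescription:
    """Hypothetical description of a published dataset (field names are assumptions)."""
    on_web_open_license: bool      # 1 star
    structured: bool               # 2 stars
    non_proprietary_format: bool   # 3 stars
    uses_uris: bool                # 4 stars
    links_to_other_data: bool      # 5 stars


def star_rating(d: DatasetDescription) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        d.on_web_open_license,
        d.structured,
        d.non_proprietary_format,
        d.uses_uris,
        d.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed spreadsheet in a proprietary format: structured, but not yet 3 stars.
excel_sheet = DatasetDescription(True, True, False, False, False)
print(star_rating(excel_sheet))
```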
  5. 5. The 0th star •  Data catalog with good metadata •  Make your data findable
  6. 6. Data on the Web, Open License ★
  7. 7. Open Data
  8. 8. Government data catalogs
  9. 9. Open vs. Closed: Data used to be closed by default. In the future, it will be open by default.
  10. 10. Is open data just for governments?
  11. 11. Good reasons against opening data •  Privacy •  Competitive advantage •  Producing data and charging for it as a business model •  Can't get a license from upstream
  12. 12. Business models Scott Brinker,
  13. 13. Data licenses
  14. 14. Structured Data ★★
  15. 15. Enabling re-use •  Delivering data to end users in different forms •  Combining data with other data •  3rd party analysis of data
  16. 16. Formats in government data •  Good for re-use: MS Excel, CSV, XML, JSON, Microdata •  Not so good for re-use: Pure websites, MS Word •  Bad for re-use: PDF •  Really bad for re-use: Only charts/maps without numbers
  17. 17. Symptom: Screenscraping
  18. 18. Non-Proprietary Formats ★★★
  19. 19. Specialist formats •  Specialist tools often have specialist formats •  Few people have the tools •  Expensive •  Difficult to re-use •  (Geospatial tools, statistics packages, etc.)
  20. 20. Non-proprietary formats, open standards •  CSV (dead simple) •  XML •  JSON •  RDF (good for 4+5 stars) •  OGC web services •  OAI-ORE web services
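One practical payoff of open formats such as CSV and JSON is that re-use needs nothing beyond a standard library. The sketch below reads a small CSV text and re-publishes it as JSON; the sample rows are made up purely for illustration.

```python
import csv
import io
import json


def csv_to_records(text: str) -> list:
    """Parse CSV text (first row is the header) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample data, not taken from any real dataset.
sample_csv = "id,name\n1,Alpha\n2,Beta\n"

records = csv_to_records(sample_csv)
as_json = json.dumps(records, indent=2)
print(as_json)
```

Note that `csv.DictReader` returns every value as a string; converting numeric columns is left to the consumer.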
  21. 21. Use URIs as Identifiers ★★★★
  25. 25. Turning local identifiers into URIs: Why? •  Make them globally unique •  Clarify authority •  Make them resolvable •  Make them linkable
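A minimal sketch of turning local identifiers into URIs: the local ID is percent-encoded and appended to a base namespace that the publisher controls. The base `http://data.example.org/id/` and the `mint_uri` helper are hypothetical; a real deployment would use a domain the publisher actually owns, which is what clarifies authority and makes the URI resolvable.

```python
from urllib.parse import quote

# Hypothetical namespace owned by the data publisher.
BASE = "http://data.example.org/id/"


def mint_uri(kind: str, local_id, base: str = BASE) -> str:
    """Build a globally unique HTTP URI from an entity type and a local identifier."""
    # safe="" also encodes "/" so a local ID can never add extra path segments.
    return f"{base}{quote(kind, safe='')}/{quote(str(local_id), safe='')}"


print(mint_uri("contractor", "AB 123/7"))
```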
  26. 26. The schema level: By using URIs, connections that existed only in people's minds can be put explicitly into the data model.
  27. 27. Include Links to Other Data ★★★★★
  28. 28. Hyperlinks are the soul of the Web. The Web of Data is no different.
  29. 29. Data links Central Contractor Registration (CCR) Geonames
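A data link between two datasets can be written down as a single RDF statement whose subject lives in one dataset and whose object lives in another. The sketch below serializes such a statement in N-Triples syntax. All three URIs are hypothetical placeholders on example domains; they are not real CCR or Geonames identifiers.

```python
def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement whose three terms are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical URIs: a contractor record in one dataset linked to a place in another.
link = ntriple(
    "http://data.example.org/id/contractor/42",
    "http://data.example.org/def/locatedIn",
    "http://places.example.com/id/1234",
)
print(link)
```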
  30. 30. Linked Data Principles 1.  Use URIs to name things (not only documents, but also people, locations, concepts, etc.) 2.  To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs 3.  When someone looks up a URI, provide useful information (structured data in RDF, SPARQL). 4.  Include links to other URIs allowing agents to discover more things
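Principles 2 and 3 can be sketched as an ordinary HTTP GET that asks for RDF through the `Accept` header. The code below uses only Python's standard `urllib`; the URI is a hypothetical placeholder, the choice of Turtle (`text/turtle`) as the requested serialization is an assumption, and whether a server honours the request depends on the publisher.

```python
import urllib.request


def build_lookup_request(uri: str, accept: str = "text/turtle") -> urllib.request.Request:
    """Prepare a GET request for a URI, asking for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": accept}, method="GET")


def fetch_description(uri: str, accept: str = "text/turtle", timeout: float = 10.0) -> str:
    """Look up a URI over HTTP and return the response body as text (needs network access)."""
    request = build_lookup_request(uri, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_description() would.
req = build_lookup_request("http://data.example.org/id/contractor/42")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Links found in the returned description (principle 4) can then be looked up the same way.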
  31. 31. Summary •  In the future, data will be open by default, unless good reason not to •  Emergence of a web of data •  “Five-star plan” for getting there, dataset by dataset •  2 stars: re-usable data! •  3 stars: open standards! •  4+5 stars: connect the silos!
  32. 32. Thank You! @cygri