The Web of Data and its Five Stars
Richard Cyganiak, DERI, NUI Galway (@cygri)
6 June 2012
Realising and Exploiting the EU data cloud
European Data Forum, Copenhagen, Denmark

Generating insight from data
• Today, data is abundant
• New middlemen find new ways of getting data to the end user
• Supply of and demand for data are higher than ever
• The analyst's problem is no longer a lack of relevant data, but:
  • Understanding data
  • Assessing applicability
  • Getting it into the right form for use
• Similar problems inside and outside of the firewall

Tim Berners-Lee’s 5-star plan for an open web of data
★ Make data available on the Web under an open license
★★ Make it available as structured data
★★★ Use a non-proprietary format
★★★★ Use URIs to identify things
★★★★★ Link your data to other people’s data to provide context

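
Each star builds on the ones before it, so a dataset's rating can be read as the number of consecutive criteria it meets, counting from the first. The following is a minimal sketch of that reading; the `Dataset` fields and the `star_rating` function are hypothetical names introduced here for illustration and are not part of the plan itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative only)."""
    open_license: bool            # 1 star: on the Web under an open license
    structured: bool              # 2 stars: available as structured data
    non_proprietary_format: bool  # 3 stars: non-proprietary format
    uses_uris: bool               # 4 stars: URIs identify things
    links_to_other_data: bool     # 5 stars: linked to other people's data


def star_rating(dataset: Dataset) -> int:
    """Count consecutive criteria met, stopping at the first one that fails."""
    criteria = [
        dataset.open_license,
        dataset.structured,
        dataset.non_proprietary_format,
        dataset.uses_uris,
        dataset.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed spreadsheet in a proprietary format stops at two stars,
# even if it happens to contain URIs.
spreadsheet = Dataset(True, True, False, True, False)
print(star_rating(spreadsheet))
```

Stopping at the first unmet criterion is a design choice of this sketch: it treats the stars as a ladder rather than as five independent checkboxes.
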
The 0th star
• Data catalog with good metadata
• Make your data findable

Enabling re-use
• Delivering data to end users in different forms
• Combining data with other data
• 3rd party analysis of data

Formats in government data
• Good for re-use: MS Excel, CSV, XML, JSON, Microdata
• Not so good for re-use: Pure websites, MS Word
• Bad for re-use: PDF
• Really bad for re-use: Only charts/maps without numbers

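
To illustrate why the structured formats rank as good for re-use, here is a minimal sketch that reads a CSV table and delivers it in a different form (JSON) using only the Python standard library. The sample rows and the function names are invented for illustration; nothing comparable is possible with a chart that carries no numbers.

```python
import csv
import io
import json

# Hypothetical sample data, invented purely for illustration.
CSV_TEXT = "region,year,population\nNorth,2011,1200\nSouth,2011,3400\n"


def csv_to_records(text: str) -> list[dict]:
    """Parse CSV text into a list of row dictionaries keyed by column name."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


def records_to_json(records: list[dict]) -> str:
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(records, indent=2)


records = csv_to_records(CSV_TEXT)
print(records_to_json(records))
```

Note that `csv.DictReader` returns every value as a string; converting `year` and `population` to numbers would be a further step that depends on knowing what the columns mean.
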
Hyperlinks are the soul of the Web. The Web of Data is no different.
Data links
[Figure: data links between Central Contractor Registration (CCR) and Geonames]

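
A data link of this kind can be written down as an RDF statement whose subject lives in one dataset and whose object lives in another. The sketch below emits two such statements in N-Triples syntax. All URIs are assumptions made for illustration: the contractor URI and the `locatedIn` predicate are hypothetical, and the Geonames URI follows the `http://sws.geonames.org/<id>/` pattern but uses a placeholder identifier rather than a real place.

```python
# Real RDF Schema property for human-readable names.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical URIs, invented for illustration only.
LOCATED_IN = "http://example.org/vocab/locatedIn"
CONTRACTOR = "http://example.org/ccr/contractor/12345"
PLACE = "http://sws.geonames.org/0000000/"  # placeholder id, not a real place


def uri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


lines = [
    triple(uri(CONTRACTOR), uri(RDFS_LABEL), literal("Example Contractor Inc.")),
    # The link: a thing in one dataset points at a thing in another dataset.
    triple(uri(CONTRACTOR), uri(LOCATED_IN), uri(PLACE)),
]
print("\n".join(lines))
```

The second statement is the data link: anyone holding the contractor record can follow the object URI to learn more about the place.
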
Linked Data Principles
1. Use URIs to name things (not only documents, but also people, locations, concepts, etc.)
2. To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs
3. When someone looks up a URI, provide useful information (structured data in RDF, SPARQL).
4. Include links to other URIs allowing agents to discover more things
http://www.w3.org/DesignIssues/LinkedData.html

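
Principles 2 and 3 can be sketched from the client's side: because the name is an HTTP URI, a machine agent can look it up with an ordinary HTTP request and ask for structured data. The sketch below only builds such a request and does not send it, since the URI is a hypothetical placeholder. Asking for Turtle through the `Accept` header is one common convention and an assumption of this example, not something the principles mandate; `build_lookup` is likewise an illustrative name.

```python
import urllib.request


def build_lookup(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare a GET request that asks the server for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


# Hypothetical URI naming a thing (a contractor), not a document.
request = build_lookup("http://example.org/ccr/contractor/12345")

print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# To dereference a real URI, pass the request to urllib.request.urlopen().
# That step is omitted here because the URI above is only a placeholder.
```

Principle 4 then closes the loop: the data that comes back contains further URIs, each of which can be looked up in the same way.
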
Summary
• In the future, data will be open by default, unless there is good reason for it not to be
• Emergence of a web of data
• “Five-star plan” for getting there, dataset by dataset
  • 2 stars: re-usable data!
  • 3 stars: open standards!
  • 4+5 stars: connect the silos!