Linked Data life cycles
Dr. Michael Hausenblas, Linked Data Research Centre, DERI, NUI Galway
July 2011
What is a dataspace?
  • Heterogeneous data sources
  • Distributed environment - proximity
  • Find and consume data
  • Update data
What is a DSSP and why does it matter?
  • DSSP == Dataspace Support Platform
  • Participants & relationships
  • Services: Catalog & Browse, Search & Query, Index, Discovery
  • The Linked Data ecosystem is an open & standards-based real-world DSSP
Data management solutions
  • Based on [Franklin:SIGMOD05]
Linked Data principles*
  • Use URIs to identify the “things” in your data
  • Use HTTP URIs so people and machines can look them up (on the Web)
  • When a URI is looked up, return a description of the thing
  • Include links to related things
  * http://www.w3.org/DesignIssues/LinkedData.html
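The principles can be illustrated with a minimal sketch. It assumes a toy in-memory "Web" instead of real HTTP lookups; the example.org and example.com URIs, the DESCRIPTIONS table, and the look_up helper are all hypothetical and only serve the illustration.

```python
# Toy stand-in for the Web: maps an HTTP URI to (predicate, object) pairs.
# All URIs under example.org / example.com are hypothetical.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

DESCRIPTIONS = {
    "http://example.org/id/galway": [
        (RDFS_LABEL, '"Galway"'),
        # A link to a related thing published by someone else.
        (OWL_SAME_AS, "<http://example.com/place/galway>"),
    ],
}


def look_up(uri):
    """Return the description of `uri` as a list of N-Triples lines."""
    # Only HTTP(S) URIs can be looked up on the Web.
    if not uri.startswith(("http://", "https://")):
        raise ValueError("expected an HTTP URI: " + uri)
    # Looking up the URI returns a description of the thing it identifies.
    return [f"<{uri}> <{p}> {o} ." for p, o in DESCRIPTIONS.get(uri, [])]


for line in look_up("http://example.org/id/galway"):
    print(line)
```

On the real Web the lookup would be an HTTP GET on the URI; the dictionary merely keeps the sketch self-contained.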
Linked Open Data cloud
  http://lod-cloud.net/
Linked Open Data cloud stats
  • Triples distribution
  • Links distribution
  http://lod-cloud.net/state/
The Challenge
  • Classical data management approaches assume complete control over schema, data, and data generation
The Web: distributed & open, hence lacks control
Requires a new model of life cycles
Linked Data life cycles
  [Figure: life-cycle diagram labelled with opendata.ie, LOD cloud, Neologism, DataCube, prefix.cc, Google Refine, RDB2RDF, VoID, DCAT, Sindice, CKAN, LATC 24/7, duke, Sig.ma, school explorer, data-gov.ie]
Linked Data life cycles: data awareness
Hans Rosling: ‘database hugging disorder’
  http://thinkquarterly.co.uk/01-data/a-data-state-of-mind/
TimBL’s 5-star plan for open data*
  ★ Make your data available on the Web under an open license
  ★★ Make it available as structured data (Excel sheet instead of image scan of a table)
  ★★★ Use a non-proprietary format (CSV file instead of an Excel sheet)
  ★★★★ Use Linked Data format (URIs to identify things, RDF to represent data)
  ★★★★★ Link your data to other people’s data to provide context
  * http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
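The step from three to five stars can be sketched in a few lines. This is a minimal illustration, not the procedure from the talk: the CSV content, the BASE namespace, and the helper names csv_to_ntriples and link_to are assumptions, and literal values are not escaped as a real converter would need to do.

```python
import csv
import io

# Three-star data: a non-proprietary CSV file (content is made up).
CSV_DATA = "id,name\n1,Galway\n"

# Hypothetical namespace used to mint URIs for the rows.
BASE = "http://example.org/place/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def csv_to_ntriples(text, base=BASE):
    """Four stars: give each row a URI and express its name as an RDF triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = base + row["id"]
        name = row["name"]  # not escaped; fine for this simple example only
        triples.append(f'<{subject}> <{RDFS_LABEL}> "{name}" .')
    return triples


def link_to(subject_uri, other_uri):
    """Five stars: state that our thing is the same as one in another dataset."""
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{other_uri}> ."


for line in csv_to_ntriples(CSV_DATA):
    print(line)
print(link_to(BASE + "1", "http://example.com/other/galway"))
```

The output is N-Triples, chosen here only because it can be produced with plain string formatting; any RDF serialisation would satisfy the fourth star.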
Linked Data life cycles: modeling
http://linked-statistics.org/datacube/
http://vocab.deri.ie
http://neologism.deri.ie
http://schema.rdfs.org
Linked Data life cycles: publishing
Publishing
  http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/
Linked Data life cycles: discovery
Discovery
  • Model for dataset description: VoID vocabulary
  • Users in industry and governments
  • Published as W3C Note: http://www.w3.org/TR/void
  • Significant uptake in research
Describing Datasets
  • General dataset metadata
  • Access metadata
  • Structural metadata
  • Describing linksets
  • Deployment and discovery of voiD files
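A rough sketch of what such a description might look like, generated as Turtle from Python. The VoID and Dublin Core terms used (void:Dataset, void:Linkset, void:sparqlEndpoint, void:triples, void:vocabulary, void:target, void:linkPredicate, dcterms:title) come from those vocabularies; the function names and every URI passed in are hypothetical placeholders.

```python
PREFIXES = "\n".join([
    "@prefix void: <http://rdfs.org/ns/void#> .",
    "@prefix dcterms: <http://purl.org/dc/terms/> .",
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
])


def dataset_description(uri, title, sparql_endpoint, triples, vocabulary):
    """Describe one dataset: general, access, and structural metadata."""
    return "\n".join([
        f"<{uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',                  # general metadata
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",  # access metadata
        f"    void:triples {triples} ;",                   # structural metadata
        f"    void:vocabulary <{vocabulary}> .",
    ])


def linkset_description(uri, source, target):
    """Describe a linkset of owl:sameAs links between two datasets."""
    return "\n".join([
        f"<{uri}> a void:Linkset ;",
        f"    void:target <{source}> ;",
        f"    void:target <{target}> ;",
        "    void:linkPredicate owl:sameAs .",
    ])


# Hypothetical dataset and linkset URIs, for illustration only.
print(PREFIXES)
print(dataset_description(
    "http://example.org/dataset/schools", "Schools", 
    "http://example.org/sparql", 1000, "http://purl.org/linked-data/cube#"))
print(linkset_description(
    "http://example.org/linkset/schools-places",
    "http://example.org/dataset/schools", "http://example.com/dataset/places"))
```

Building the Turtle by string formatting keeps the sketch dependency-free; an RDF library would be the safer choice once titles need escaping.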
Linked Data life cycles: integration
Why go for the 5th star?
  • Central Contractor Registration (CCR)
  • Geonames
  http://webofdata.wordpress.com/2011/05/22/why-we-link/
Pay-as-you-go integration
  [Figure labels: Fix; Overall Data Integration Effort]
  http://latc-project.eu/
Linked Data life cycles: use cases
Use case: eGov Ireland
  • Fingal County Council: raising awareness re open data and demonstrating its value. ODC2011 submission http://planning-apps.opendata.ie
  • Local Government Management Agency (former LGCSB): advancing access to Open Data for Local Authorities; LD pilot for Management Service Indicators across Local Authorities
  • Central Statistics Office, dissemination group: boot-strapping data-gov.ie with statistical data; school explorer - pilot
  • Enterprise Ireland: National Cross Industry Working Group on Open Data
School explorer
Linked Data life cycles
Challenges
  • Schema mapping, matching, alignment [Hausenblas:DBKDA10]
  • Write-enable the LD world [Berners-Lee:DERITR09]
  • Authentication and authorisation in a distributed setup: http://www.w3.org/2005/Incubator/webid/
  • REST-alignment of Linked Data [Wilde:WEWST09]
  • Dataset dynamics [Umbrich:LDOW10]
References
  [Franklin:SIGMOD05] M. J. Franklin, A. Y. Halevy, and D. Maier. From databases to dataspaces: a new abstraction for information management. SIGMOD Record, 34(4):27–33, 2005.
  [Berners-Lee:DERITR09] T. Berners-Lee, R. Cyganiak, M. Hausenblas, J. Presbrey, O. Seneviratne, and O. Ureche. On Integration Issues of Site-Specific APIs into the Web of Data. DERI Technical Report, 2009.
  [Hausenblas:DBKDA10] M. Hausenblas and M. Karnstedt. Understanding Linked Open Data as a Web-Scale Database. Second International Conference on Advances in Databases, Knowledge, and Data Applications, 2010.
  [Wilde:WEWST09] E. Wilde and M. Hausenblas. RESTful SPARQL? You Name It! Aligning SPARQL with REST and Resource Orientation. Fourth Workshop on Emerging Web Services Technology at European Conference on Web Services, Eindhoven, The Netherlands, 2009.
  [Umbrich:LDOW10] J. Umbrich, M. Hausenblas, A. Hogan, A. Polleres, and S. Decker. Towards Dataset Dynamics: Change Frequency of Linked Open Data Sources. Third International Workshop on Linked Data on the Web at 19th International World Wide Web Conference, Raleigh, North Carolina, USA, 2010.

See also ...
  • The Linked Open Data cloud: http://lod-cloud.net
  • Linked Data core specifications: http://linkeddata-specs.info
  • Enabling cross-boundary access to data sources: http://enable-cors.org
  • Linked Open Data 5-star deployment scheme: http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
