Linked data life cycles



Existing data management approaches assume control over schema, data and data generation, which is not the case in open, decentralised environments such as the Web. Without that control, social processes are needed to create 'ordo ab chao' (order out of chaos), and hence a new life cycle model is necessary.

Based on our experience in Linked Data publishing and consumption over the past years, we have identified the parties involved and the fundamental phases, which together give rise to a multitude of so-called Linked Data life cycles.

If you want to hear me speak to the slides, you might want to check out the following videos on YouTube:

Part 1:

Part 2:

Part 3:

Published in: Technology, Education
  • Thanks for your feedback, very much appreciated!

    Re slide 4: MDM, yes, makes sense.

    Re slide 8: *what* existing life cycles :)

    Re slide 18: you might have noticed that I'm only using DERI stuff for demonstration purposes, but yes, true ;)

    Re slide 25: for example, co-reference systems incl. RKBExplorer, or what we do in the LATC 24/7 platform

    FYI: I'll soon publish a Technical Report describing the life cycles in detail.
  • good summary!

    my 2 cents:

    slide 4: you may include master data management systems and data stores, which are 'behind' other business applications today

    re. Slide 8: I don't think it requires a completely new model of life cycles, it is more about how to add linked data contents to existing life cycles and which additional new steps are needed to put necessary attention to linked data specifics.

    re. Slide 18: at publishing you may add tools which work as a constant facade 4 DBMS/applications as opposed to one-time conversion tools such as Refine.

    Re. slide 25: what do you mean with third-party-effort?
  • DataCube
  • Linked data life cycles

    1. Linked Data life cycles
       Dr. Michael Hausenblas, Linked Data Research Centre, DERI, NUI Galway
       July 2011
    2. What is a dataspace?
       • Heterogeneous data sources
       • Distributed environment - proximity
       • Find and consume data
       • Update data
    3. What is a DSSP and why does it matter?
       • DSSP == Dataspace Support Platform
       • Participants & relationships
       • Services
       • Catalog & Browse
       • Search & Query
       • Index
       • Discovery
       • Linked Data ecosystem is an open & standards-based real-world DSSP
    4. Data management solutions
       Based on [Franklin:SIGMOD05]
    5. Linked Data principles*
       • Use URIs to identify the “things” in your data
       • Use HTTP URIs so people and machines can look them up (on the Web)
       • When a URI is looked up, return a description of the thing
       • Include links to related things
       *
    6. Linked Open Data cloud
    7. Linked Open Data cloud stats
       • triples distribution
       • links distribution
    8. The Challenge
       • Classical data management approaches assume complete control over schema, data, and data generation
       • The Web is distributed & open, so it lacks control
       • Requires a new model of life cycles
    9. Linked Data life cycles
       Diagram labels: LOD cloud, Neologism, DataCube, Google Refine, RDB2RDF, VoID, DCAT, Sindice, CKAN, LATC 24/7, duke, school explorer
    11. Linked Data life cycles: data awareness
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    12. Hans Rosling: ‘database hugging disorder’
    13. TimBL’s 5-star plan for open data*
        ★ Make your data available on the Web under an open license
        ★★ Make it available as structured data (Excel sheet instead of image scan of a table)
        ★★★ Use a non-proprietary format (CSV file instead of an Excel sheet)
        ★★★★ Use Linked Data format (URIs to identify things, RDF to represent data)
        ★★★★★ Link your data to other people’s data to provide context
        *
    14. (no text on slide)
    15. Linked Data life cycles: modeling
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    16. (no text on slide)
    17. (no text on slide)
    18. (no text on slide)
    19. Linked Data life cycles: publishing
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    20. Publishing
    21. Linked Data life cycles: discovery
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    22. Discovery
        • Model for dataset description: VoID vocabulary
        • Users in industry and governments
        • Published as W3C Note
        • Significant uptake in research
    23. Describing Datasets
        • General dataset metadata
        • Access metadata
        • Structural metadata
        • Describing linksets
        • Deployment and discovery of voiD files
    24. Linked Data life cycles: integration
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    25. Why go for the 5th star?
        • Central Contractor Registration (CCR)
        • Geonames
    26. Pay-as-you-go integration
        Fix Overall Data Integration Effort
    27. Linked Data life cycles: use cases
        (same diagram and tool labels as the Linked Data life cycles overview slide above)
    28. Use case: eGov Ireland
        • Fingal County Council
        • Raising awareness re open data and demonstrating its value.
        • ODC2011 submission
        • Local Government Management Agency (former LGCSB)
        • Advancing access to Open Data for Local Authorities
        • LD pilot for Management Service Indicators across Local Authorities
        • Central Statistics Office, dissemination group
        • Boot-strapping with statistical data.
        • school explorer - pilot
        • Enterprise Ireland: National Cross Industry Working Group on Open Data
    29. School explorer
    30. Linked Data life cycles
        (overview diagram repeated, with the same tool labels as above)
    31. Challenges
        • Schema mapping, matching, alignment [Hausenblas:DBKDA10]
        • Write-enable the LD world [Berners-Lee:DERITR09]
        • Authentication and authorisation in a distributed setup
        • REST-alignment of Linked Data [Wilde:WEWST09]
        • Dataset dynamics [Umbrich:LDOW10]
    32. References
        [Franklin:SIGMOD05] M. J. Franklin, A. Y. Halevy, and D. Maier. From databases to dataspaces: a new abstraction for information management. SIGMOD Record, 34(4):27–33, 2005.
        [Berners-Lee:DERITR09] T. Berners-Lee, R. Cyganiak, M. Hausenblas, J. Presbrey, O. Seneviratne, and O. Ureche. On Integration Issues of Site-Specific APIs into the Web of Data. DERI Technical Report, 2009.
        [Hausenblas:DBKDA10] M. Hausenblas and M. Karnstedt. Understanding Linked Open Data as a Web-Scale Database. Second International Conference on Advances in Databases, Knowledge, and Data Applications, 2010.
        [Wilde:WEWST09] E. Wilde and M. Hausenblas. RESTful SPARQL? You Name It! Aligning SPARQL with REST and Resource Orientation. Fourth Workshop on Emerging Web Services Technology at European Conference on Web Services, Eindhoven, The Netherlands, 2009.
        [Umbrich:LDOW10] J. Umbrich, M. Hausenblas, A. Hogan, A. Polleres, and S. Decker. Towards Dataset Dynamics: Change Frequency of Linked Open Data Sources. Third International Workshop on Linked Data on the Web at 19th International World Wide Web Conference, Raleigh, North Carolina, USA, 2010.
    33. See also ...
        • The Linked Open Data cloud
        • Linked Data core specifications
        • Enabling cross-boundary access to data sources
        • Linked Open Data 5-star deployment scheme
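The four Linked Data principles listed in the transcript (identify things with HTTP URIs, return a description on lookup, include links) can be illustrated with a small sketch. It is not taken from the slides: the `WEB` dictionary is a hypothetical in-memory stand-in for real HTTP lookups, and the `example.org` URIs and the `ex:locatedIn` predicate are invented for illustration.

```python
from collections import deque

# Hypothetical stand-in for the Web: each HTTP URI maps to the description
# (predicate, object pairs) a server would return when the URI is looked up.
WEB = {
    "http://example.org/id/galway": [
        ("rdfs:label", "Galway"),
        ("ex:locatedIn", "http://example.org/id/ireland"),
    ],
    "http://example.org/id/ireland": [
        ("rdfs:label", "Ireland"),
    ],
}


def look_up(uri):
    """Principle 3: looking up a URI returns a description of the thing."""
    return WEB.get(uri, [])


def is_http_uri(value):
    """Principle 2: only HTTP(S) URIs can be looked up on the Web."""
    return isinstance(value, str) and value.startswith(("http://", "https://"))


def follow_links(start_uri):
    """Principle 4: discover related things by following links, breadth-first."""
    seen = {start_uri}
    queue = deque([start_uri])
    order = []
    while queue:
        uri = queue.popleft()
        order.append(uri)
        for _predicate, obj in look_up(uri):
            if is_http_uri(obj) and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return order


visited = follow_links("http://example.org/id/galway")
print(visited)
```

In a real deployment `look_up` would issue an HTTP GET and parse the returned RDF; the traversal logic would stay the same.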
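The 5-star plan on the TimBL slide is cumulative: each star presupposes the ones before it. A minimal sketch of that rule follows, assuming a hypothetical `Dataset` record whose field names are invented here and are not part of the scheme itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags, one per star of the 5-star plan, in order.
    on_web_open_license: bool
    structured: bool
    non_proprietary_format: bool
    uses_uris_and_rdf: bool
    links_to_other_data: bool


def star_rating(ds):
    """Count stars in order and stop at the first criterion that is not met."""
    criteria = [
        ds.on_web_open_license,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris_and_rdf,
        ds.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Examples mirroring the slide: a scanned table, an Excel sheet, a CSV file.
scanned_image = Dataset(True, False, False, False, False)
excel_sheet = Dataset(True, True, False, False, False)
csv_file = Dataset(True, True, True, False, False)
print(star_rating(scanned_image), star_rating(excel_sheet), star_rating(csv_file))
```

Stopping at the first unmet criterion is a design choice made for this sketch: it keeps a dataset from claiming a later star (say, links) without the earlier ones.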
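The kinds of metadata named on the Describing Datasets slide (general, access, structural, linksets) map onto VoID terms such as `void:Dataset`, `void:sparqlEndpoint`, `void:triples` and `void:Linkset`. The following sketch assembles a small VoID description in Turtle as plain text. The function name, all `example.org` URIs and the triple count are assumptions made for illustration, and the title is not escaped, so it must not contain quotes.

```python
def void_description(dataset_uri, title, sparql_endpoint, triples,
                     linkset_uri, target_uri, link_predicate):
    """Build a minimal VoID description (Turtle) for a dataset and one linkset."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
        # General, access and structural metadata about the dataset.
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f"    void:triples {triples} .",
        "",
        # A linkset describes links between this dataset and another one.
        f"<{linkset_uri}> a void:Linkset ;",
        f"    void:target <{dataset_uri}>, <{target_uri}> ;",
        f"    void:linkPredicate {link_predicate} .",
    ]
    return "\n".join(lines)


turtle = void_description(
    dataset_uri="http://example.org/dataset/schools",  # hypothetical
    title="Example schools dataset",
    sparql_endpoint="http://example.org/sparql",       # hypothetical
    triples=1000,                                      # made-up count
    linkset_uri="http://example.org/linkset/schools-geo",
    target_uri="http://example.org/dataset/geo",
    link_predicate="owl:sameAs",
)
print(turtle)
```

A production setup would more likely use an RDF library to serialise the description; string assembly is used here only to keep the sketch dependency-free.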