Linked (Open) Data - But what does it buy me?

  1. Linked (Open) Data: But what does it buy me? Rinke Hoekstra, VU University Amsterdam / University of Amsterdam, rinke.hoekstra@vu.nl. "Linked (Open) Data - But what does it buy me?" by Rinke Hoekstra, licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Monday 11 March 2013
  3. http://www.youtube.com/watch?v=ga1aSJXCFe0
  5. http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html
  6. Linked Open Data
  7. Linked Open Data (texts taken from http://5stardata.info)
  8. Why people go “Meh” • Data needs to be converted to RDF • Data needs to be published on the Web • An open license is required even for a single ★ (Image: Pacific Barreleye, http://imgur.com/gallery/Mzyb5; it can rotate its eyes forwards or upwards to look through its transparent head at prey above)
  9. Why people go “Meh” • Data needs to be converted to RDF • Data needs to be published on the Web • An open license is required even for a single ★ • What if people draw incorrect conclusions from my data?
  10. Why people go “Meh” • Data needs to be converted to RDF • Data needs to be published on the Web • An open license is required even for a single ★ • What if people draw incorrect conclusions from my data? • What if journalists draw incorrect conclusions from my data?
  11. Why people go “Meh” • Data needs to be converted to RDF • Data needs to be published on the Web • An open license is required even for a single ★ • What if people draw incorrect conclusions from my data? • What if journalists draw incorrect conclusions from my data? • What if combining data results in privacy infringement?
  12. ... but LOD is just asking for more!
  13. ... how can I sell this internally?
  15. Linked Open Data
  16. Linked Data: Six Ingredients • The missing ★ • Repeatable Transformation • Choose your Grain Size • Mix ‘n Mash • Contextualize! • Lower the Threshold
  17. 1 The missing ★
  19. 1 The missing ★ • http://give.everything/a/URI • Guessable • Version information • Version agnostic • HTTPs URIs only please! (or resolver + URN)
  20. Messy Data • http://wetten.overheid.nl/BWBIdService/BWBIdList.xml.zip • NB: The problem with the XML processing instruction was reported and fixed, but returned some weeks later
  21. Example: Juriconnect • 1.0:c:BWBR0005416&artikel=6 vs. http://wetten.overheid.nl/cgi-bin/deeplink/law1/bwbid=BWBR0005416/article=6/date=2005-01-14 vs. http://wetten.overheid.nl/BWBR0005416/TitelII698946/HoofdstukII/Artikel16/geldigheidsdatum_14-01-2005 • Existing identification standard: Juriconnect • URN-like... but no naming server (cf. Digital Object Identifiers) • Named elements do not carry an identifier • No explicit version information, only contextual
  22. Levels of Identification (IFLA FRBR levels of a bibliographic entity) • Work: the regulation • Expression (realizes a Work): a version of the regulation • Manifestation (embodies an Expression): an XML version of the regulation • Item (exemplifies a Manifestation): the XML version of the regulation on my hard disk
  23. Transparent = Guessable • Hierarchical information (work): http://doc.metalex.eu/id/BWBR0011823/hoofdstuk/1/artikel/1 and http://doc.metalex.eu/id/BWBR0011823/artikel/1 • Version and language (expression): http://doc.metalex.eu/id/BWBR0011823/hoofdstuk/1/artikel/1/nl/2010-09-01 • Format information (manifestation): http://doc.metalex.eu/doc/BWBR0011823/hoofdstuk/1/artikel/1/nl/2010-09-01/data.xml
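
The URI patterns on slide 23 can be sketched as three small builder functions, one per FRBR level. This is a minimal sketch that only reproduces the patterns visible on the slide; the function names and the `hierarchy` argument are assumptions made for illustration, not part of any MetaLex tooling.

```python
from typing import Sequence, Tuple

BASE = "http://doc.metalex.eu"  # host as shown on the slide


def _path(bwb_id: str, hierarchy: Sequence[Tuple[str, str]]) -> str:
    """Join the regulation id with its (element type, number) pairs."""
    parts = [bwb_id]
    for kind, number in hierarchy:
        parts.extend([kind, str(number)])
    return "/".join(parts)


def work_uri(bwb_id: str, hierarchy: Sequence[Tuple[str, str]]) -> str:
    """Work level: hierarchical information only."""
    return f"{BASE}/id/{_path(bwb_id, hierarchy)}"


def expression_uri(bwb_id: str, hierarchy: Sequence[Tuple[str, str]],
                   lang: str, date: str) -> str:
    """Expression level: the work URI plus language and version date."""
    return f"{work_uri(bwb_id, hierarchy)}/{lang}/{date}"


def manifestation_uri(bwb_id: str, hierarchy: Sequence[Tuple[str, str]],
                      lang: str, date: str, filename: str = "data.xml") -> str:
    """Manifestation level: /doc/ instead of /id/, plus a file name."""
    return f"{BASE}/doc/{_path(bwb_id, hierarchy)}/{lang}/{date}/{filename}"


if __name__ == "__main__":
    article = [("hoofdstuk", "1"), ("artikel", "1")]
    print(work_uri("BWBR0011823", article))
    print(expression_uri("BWBR0011823", article, "nl", "2010-09-01"))
    print(manifestation_uri("BWBR0011823", article, "nl", "2010-09-01"))
```

Because every level is derived from the one above it, a client that knows one URI can guess the others, which is the point of "transparent" identifiers.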
  24. Versioning Issues • URIs don’t carry semantics... • Detect changes: which element versions are the same ... and which versions are different? • Example: Art. 44, paragraph 4 (2011-03-26) vs. Art. 44, paragraph 4 (2011-04-05), from Besluit prudentiële regels Wft, BWBR0020420
  25. Opaque Identifiers • Content information • Unique SHA1 hash of text • Example: http://doc.metalex.eu/BWBR0011823/hoofdstuk/1/artikel/34b0cee26ee5138c74aa2c62caf2c117d3c616e9 • [Diagram labels: “vermogen van de erflater” (the testator's estate), dcterms:subject; SW, Chapter I, Article 10 (2011-01-01); SW, Chapter I, Article 10 (2011-10-12); owl:sameAs; SHA1 8738ef273ea4dbc73]
  26. Opaque Identifiers • Content information • Unique SHA1 hash of text • [Diagram labels as on slide 25, now with two owl:sameAs links]
  27. Opaque Identifiers • Content information • Unique SHA1 hash of text • [Diagram labels as on slide 25, now with two dcterms:subject links and three owl:sameAs links]
  28. Opaque Identifiers • Content information • Unique SHA1 hash of text • [Diagram labels as on slide 25, with one dcterms:subject link, two owl:sameAs links and two SHA1 hashes: 8738ef273ea4dbc73 and a433f53273c78a56f2]
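
The idea behind the opaque identifiers can be sketched in a few lines: hash the text of an element and use the hash as the last URI segment, so two versions with identical text end up with the same identifier. This is a sketch under stated assumptions. The whitespace normalisation step and all function names are assumptions for illustration; the slides only say that a SHA1 hash of the text is used.

```python
import hashlib


def normalise(text: str) -> str:
    """Collapse runs of whitespace (an assumed normalisation step)."""
    return " ".join(text.split())


def content_hash(text: str) -> str:
    """SHA1 hex digest of the normalised text."""
    return hashlib.sha1(normalise(text).encode("utf-8")).hexdigest()


def opaque_uri(parent_uri: str, text: str) -> str:
    """Append the content hash to the URI of the containing element."""
    return f"{parent_uri}/{content_hash(text)}"


def same_content(text_a: str, text_b: str) -> bool:
    """True when two element versions carry the same text."""
    return content_hash(text_a) == content_hash(text_b)


if __name__ == "__main__":
    parent = "http://doc.metalex.eu/BWBR0011823/hoofdstuk/1/artikel"
    # Placeholder strings, not real legal text.
    v1 = "Placeholder provision text, unchanged."
    v2 = "Placeholder  provision text,\nunchanged."
    v3 = "Placeholder provision text, amended."
    print(opaque_uri(parent, v1))
    print(same_content(v1, v2))  # same text apart from whitespace
    print(same_content(v1, v3))  # text differs
```

When two versions hash to the same value they can be linked with owl:sameAs, and annotations such as dcterms:subject can be carried over; when the hashes differ, the versions need a closer look.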
  29. Network Analysis
  30. 2 Repeatable Transformation • Transformation should be part of routine ... • ... manageable and scalable ... • ... repeatable ... • http://www.w3.org/TR/prov-overview/
  31. 2 Repeatable Transformation • Linked Data will not be the official source anytime soon • Provenance is key • Transformation should be part of routine ... • ... manageable and scalable ... • ... repeatable ... • http://www.w3.org/TR/prov-overview/
  33. LODStats: http://stats.lod2.eu
  34. 40.745.554.078 Triples!
  35. 40.745.554.078 Triples! (1.6 Billion) (I tried to check the latest figures, but http://stats.lod2.eu was down)
  36. 3 Choose your Grain Size • The document is the traditional grain size (Dublin Core) • Linked data allows for deep links into data • Cost versus usefulness • Are you the right party to provide detailed descriptions? (Image: http://creatingandeducating.blogspot.nl/2011/11/blog-post.html)
  37. RDF Report Card • [Figure labels: Report Card Categories; Low Detail, High Detail; Structure, Metadata, Scope, Internals] (RDF Report Card by Leigh Dodds, talk at Semtech Biz London, 2011, http://slideshare.net/ldodds)
  38. 4 Mix ‘n Mash • Multiple vocabularies won’t bite • Multiple identifiers won’t bite • Choose what’s useful for you... • ... then map to others! (Image © David Sykes 2009, all rights reserved)
  39. 4 Mix ‘n Mash • Multiple vocabularies won’t bite • Multiple identifiers won’t bite • Choose what’s useful for you... • ... then map to others! • Good news: the bulk has already been done for you! (Image © David Sykes 2009, all rights reserved)
  40. Semantically-Interlinked Online Communities
  42. Example: Provenance • [Graph diagram] • Nodes: the expression (version) URI of a regulation, http://doc.metalex.eu/id/BWBR0017869/2009-10-23; the process that generated the expression, http://doc.metalex.eu/id/process/BWBR0017869/2009-10-23; the creation event of the regulation, http://doc.metalex.eu/id/event/BWBR0017869/2009-10-23; the date at which the expression was created, http://doc.metalex.eu/id/date/2009-10-23, with value "2009-10-23"^^xsd:date • Classes shown: ml:BibliographicExpression, opmv:Artifact, opmv:Process, sem:Event, ml:LegislativeModification, time:Instant, ml:Date, sem:Time • Properties shown: rdf:type, rdf:value, opmv:wasGeneratedBy, opmv:wasGeneratedAt, ml:resultOf, ml:date, sem:hasTime, sem:hasTimeStamp, sem:eventType, sem:timeType, time:hasEnd, time:inXSDDateTime
  43. 5 Contextualize! • Information is not always compatible • Make explicit in which context the information holds ... • ... and who stated the information, why and how. (Flat Earth and Square Earth idea courtesy of Szymon Klarman)
  44. [Graph diagram] • Named graphs: <http://example.com/workbook1/sheet1> and <http://example.com/workbook1/sheet1/corrected> • Provenance labels: :curation20120126, rdf:type provo:Activity, provo:wasGeneratedBy, provo:hadAgent :RinkeHoekstra, provo:startedAt, provo:endedAt, time:inXSDDateTime "20120126T08:30:00" and "20120126T09:00:00" • Data labels: d2s:populationSize "1"^^xsd:int and "11"^^xsd:int, d2s:censusYear "1889"^^xsd:int, d2s:birthYears :1875--1874, d2s:gemeente :Assendelft, d2s:ageGroup :14-15, d2s:dimension :14--15_1875--1874, blank nodes _:x, _:a, _:b • Namespaces don’t mean anything • Use named graphs to compartmentalize metadata • Add provenance information about groups of statements
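
A minimal sketch of the named-graph idea from slide 44: the corrected data goes into its own graph, and a separate graph holds provenance statements about that graph as a whole. The quads are written by hand in N-Quads syntax so the example needs no RDF library. Several things here are assumptions for illustration: the provenance graph name, the cell URI, the `d2s` vocabulary URI, and the use of property names from the final W3C PROV-O vocabulary (`prov:wasAssociatedWith`, `prov:startedAtTime`, `prov:endedAtTime`), which may differ from the draft-era `provo:` names on the slide.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.com/"  # example namespace, as on the slide


def iri(value: str) -> str:
    return f"<{value}>"


def literal(value: str, datatype: str) -> str:
    return f'"{value}"^^<{XSD}{datatype}>'


def quad(s: str, p: str, o: str, g: str) -> str:
    """One N-Quads statement: subject, predicate, object, graph."""
    return f"{s} {p} {o} {g} ."


def build_quads() -> list:
    corrected = iri(EX + "workbook1/sheet1/corrected")  # graph holding the data
    meta = iri(EX + "workbook1/provenance")             # hypothetical metadata graph
    activity = iri(EX + "curation20120126")
    agent = iri(EX + "RinkeHoekstra")
    cell = iri(EX + "workbook1/sheet1/cell/x")          # hypothetical cell URI
    return [
        # A data statement, stored inside the 'corrected' graph.
        quad(cell, iri(EX + "d2s/populationSize"), literal("11", "int"), corrected),
        # Provenance about the whole 'corrected' graph, stored in 'meta'.
        quad(corrected, iri(PROV + "wasGeneratedBy"), activity, meta),
        quad(activity, iri(RDF_TYPE), iri(PROV + "Activity"), meta),
        quad(activity, iri(PROV + "wasAssociatedWith"), agent, meta),
        quad(activity, iri(PROV + "startedAtTime"),
             literal("2012-01-26T08:30:00", "dateTime"), meta),
        quad(activity, iri(PROV + "endedAtTime"),
             literal("2012-01-26T09:00:00", "dateTime"), meta),
    ]


if __name__ == "__main__":
    print("\n".join(build_quads()))
```

Because the graph URI is itself the subject of the provenance statements, who curated a group of statements, and when, is recorded once rather than per triple.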
  45. Compliance • Regulation A • Art 12 • Art 14, paragraph 3, 2nd sentence
  46. Compliance • [State diagram; labels: start, State Name, entry/action, do/activity, exit/action, action, State, event/action(arguments), end] • Regulation A • Art 12 • Art 14, paragraph 3, 2nd sentence
  50. Compliance • [State diagram as on slide 46] • Regulation A • Art 12 • Art 14, paragraph 3, 2nd sentence (shown twice)
  51. Compliance • [State diagram as on slide 46] • Regulation A • Art 12 • Art 14, paragraph 3, 2nd sentence (shown twice) • Dates shown: (01-01-2011) (04-02-2011) (11-06-2008) (01-07-2011)
  52. Contextual Annotation • [Diagram: the dcterms:subject annotation “vermogen van de erflater” (the testator's estate) shown at successively finer levels: Successiewet (SW); SW, Chapter I; SW, Chapter I, Article 10; SW, Art. 10, sentence 1] • No nice background because Google Image search only returned boring images
  53. 6 Lower the Threshold • Integrate Linked Data production into everyday tools • Allow tools to do the work for you • Use a built-in reward model (Image courtesy of http://themaisonette.net)
  54. 6 Lower the Threshold • Linked Data allows you to trace usage! • Integrate Linked Data production into everyday tools • Allow tools to do the work for you • Use a built-in reward model (Image courtesy of http://themaisonette.net)
  55. Wrap Legacy Systems: http://www.w3.org/TR/r2rml/
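
What "wrapping" a legacy system amounts to can be illustrated with a toy example: rows stay in the relational database, and a thin layer exposes them as triples on demand. R2RML itself is a declarative mapping language expressed in RDF, so the Python below is not R2RML; it is only a sketch of the row-to-triple idea. The table, its columns, the data and all URIs are made up for illustration.

```python
import sqlite3

BASE = "http://example.com/"  # hypothetical namespace
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def escape(value: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(conn: sqlite3.Connection) -> list:
    """Map each row of the (made-up) municipality table to two triples."""
    triples = []
    query = "SELECT id, name, population FROM municipality ORDER BY id"
    for row_id, name, population in conn.execute(query):
        subject = f"<{BASE}municipality/{row_id}>"
        triples.append(f'{subject} <{BASE}vocab/name> "{escape(name)}" .')
        triples.append(
            f'{subject} <{BASE}vocab/population> "{population}"^^<{XSD_INTEGER}> .'
        )
    return triples


def demo() -> list:
    """Build an in-memory 'legacy' database and expose it as triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE municipality (id INTEGER PRIMARY KEY, name TEXT, population INTEGER)"
    )
    conn.execute("INSERT INTO municipality VALUES (1, 'Exampletown', 1234)")
    return rows_to_ntriples(conn)


if __name__ == "__main__":
    print("\n".join(demo()))
```

The legacy system remains the official source; re-running the mapping after every change keeps the transformation repeatable.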
  57. Idea: use reward mechanisms of Web 2.0
  58. http://linkitup.data2semantics.org • Lightweight Web Application • Interface to API of existing data repositories • Enrich metadata by linking to Linked Data resources • Provide annotation services for data files • Plugin based architecture • Publish RDF metadata as new data publication
  59. recoprov • Reconstruct provenance using Dropbox file edit history • [Graph figure with numbered nodes] (Sara Magliacane and Paul Groth)
  60. plsheet: Automatic analysis of workflow in spreadsheets • How are results calculated (1)? Analyse dependencies between cells in complex spreadsheets (Martine de Vos, Jan Wielemaker and Willem van Hage)
  61. plsheet • Reconstruct and explain the workflow of computations (Martine de Vos, Jan Wielemaker and Willem van Hage)
  62. TabLinker • Semi-automatic RDF converter for eccentric spreadsheets • http://www.cedar-project.nl (Albert Merono-Penuela, Rinke Hoekstra, Laurens Rietveld, Christophe Gueret)
  64. Linked Data: Six Ingredients • The missing ★ • Repeatable Transformation • Choose your Grain Size • Mix ‘n Mash • Contextualize! • Lower the Threshold
  65. Linked Open Data ... be sure to use it internally too! • The missing ★ • Repeatable Transformation • Choose your Grain Size • Mix ‘n Mash • Contextualize! • Lower the Threshold
