
Machine-Interpretable Dataset and Service Descriptions for Heterogeneous Data Access and Retrieval

Aligning vocabularies that describe datasets and services with RML-based mapping descriptions to automate and facilitate the generation of RDF datasets.



  1. 1. Machine-Interpretable Dataset and Service Descriptions for Heterogeneous Data Access & Retrieval Anastasia Dimou, Ruben Verborgh, Miel Vander Sande, Erik Mannens, Rik Van de Walle Anastasia.Dimou@UGent.be @natadimou Ghent University – iMinds – Multimedia Lab http://RML.io
  2. 2. Semantic Web enabled applications rely on data represented as Linked Open Data
  3. 3. Linked Open Data describe domain-level knowledge that is understandable by both humans and machines
  4. 4. Resource Description Framework (RDF) is the prevalent data model for describing Linked Open Data
  5. 5. Resource Description Framework (RDF): subject, predicate, object
  6. 6. ex:1 (“Anastasia Dimou”) ex:works ex:MMLab
  7. 7. ex:1 (“Anastasia Dimou”) ex:works ex:MMLab; ex:2 (“Ruben Verborgh”) ex:works ex:MMLab
  8. 8. ex:1 (“Anastasia Dimou”) ex:works ex:MMLab; ex:2 (“Ruben Verborgh”) ex:works ex:MMLab; ex:3 (“Miel Vander Sande”) ex:works ex:MMLab
  9. 9. ex:1 (“Anastasia Dimou”) ex:works ex:MMLab; ex:2 (“Ruben Verborgh”) ex:works ex:MMLab; ex:3 (“Miel Vander Sande”) ex:works ex:MMLab; ex:MMLab ex:located ex:Ghent
  10. 10. Sets of triples of a dataset have repetitive patterns: ex:{id}, ex:{lab}, “{firstname} {surname}”; ex:{lab} ex:located ex:{city}
  11. 11. Sets of triples of a dataset have repetitive patterns: ex:{id}, ex:{lab}, “{firstname} {surname}”; ex:{lab} ex:located ex:{city}. Triple-oriented mapping languages formalize patterns into rules to map data to RDF.
  12. 12. RDF Mapping Language (RML): maps any data to RDF; uniform, integrable, interoperable, extensible; extends the W3C-recommended R2RML. http://RML.io Reference: A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, and R. Van de Walle. RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), 2014.
  13. 13. RML describes rules to map any structured data to RDF. RML supports any data, independently of which structure and format they have, where they originally reside, and how they are accessed & retrieved.
  14. 14. Data access and retrieval are performed manually and remain hard-coded.
  15. 15. Mapping data: any data to RDF with RML. Specifying data: which data form a data input; how to reference data input extracts. Accessing & retrieving data: data input from original source(s).
  16. 16. Mapping data: any data to RDF with RML. Specifying data: which data form a data input; how to reference data input extracts. Accessing & retrieving data: data input from original source(s).
  17. 17. RDF Mapping Language (RML), @prefix rr: <http://www.w3.org/ns/r2rml#>. Rules: rr:constant ex:located; rr:template “http://ex.com/{lab}”; rr:template “http://ex.com/{city}”; rr:template “http://ex.com/{id}”; rr:template “http://ex.com/{lab}”; rr:template “{firstname} {surname}” with rr:termType rr:Literal
  18. 18. RDF Mapping Language (RML): <#TriplesMap> with Subject Map, Predicate Map, Object Map
  19. 19. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{lab}”; rr:template “http://ex.com/{city}”. <#ResearcherMap>: rr:template “http://ex.com/{lab}”; rr:template “http://ex.com/{lab}”; rr:template “{firstname} {surname}” with rr:termType rr:Literal
  20. 20. Mapping data: data to RDF with RML. Specifying data: which data form a data input; how to reference data input extracts. Accessing & retrieving data: data input from original source(s).
  21. 21. RDF Mapping Language (RML): Triples Map with Subject Map and Predicate Object Map (Predicate Map, Object Map)
  22. 22. RDF Mapping Language (RML): Triples Map with Logical Source, Subject Map, and Predicate Object Map (Predicate Map, Object Map)
  23. 23. Support data in heterogeneous structures: tabular-structured, hierarchical-structured, (semi-)structured, …
  24. 24. Support data in heterogeneous structures and formats: tabular-structured (tables in DBs or CSV files, …); hierarchical-structured (JSON or XML, …); (semi-)structured (HTML, …); …
  25. 25. Support tabular-structured data. Table columns: id, firstname, surname, lab. Rows: (1, Anastasia, Dimou, MMLab); (2, Ruben, Verborgh, MMLab); (3, Miel, Vander Sande, MMLab). <#ResearcherMap>: rr:template “http://ex.com/{id}”; rr:template “http://ex.com/{lab}”; rr:template “{firstname} {surname}” with rr:termType rr:Literal
  26. 26. Support hierarchical-structured data. Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  27. 27. How to reference data extracts? Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  28. 28. RDF Mapping Language (RML): Triples Map with Logical Source (Reference Formulation), Subject Map, and Predicate Object Map (Predicate Map, Object Map)
  29. 29. Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: ql:XPath. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  30. 30. How to iterate over the data? Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: ql:XPath. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  31. 31. RDF Mapping Language (RML): Triples Map with Logical Source (Reference Formulation, iterator), Subject Map, and Predicate Object Map (Predicate Map, Object Map)
  32. 32. Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: ql:XPath; “/labs/lab”. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  33. 33. Mapping data: data to RDF with RML. Specifying data: which data form a data source; how to reference data extracts. Accessing & retrieving data: data from their original sources.
  34. 34. RML Processor: Input data (three inputs) → Mapping module (with Map doc) → Output RDF
  35. 35. RML Processor: Data source and Access interface (three of each) → Retrieval module (with Source description) → Input data (three inputs) → Mapping module (with Map doc) → Output RDF
  36. 36. Where does this data originally come from? RML Processor: Data source and Access interface (three of each) → Retrieval module (with Source description) → Input data (three inputs) → Mapping module (with Map doc) → Output RDF
  37. 37. Support different locations and access interfaces: local file(s); database connectivity; Web source(s); RDF source(s)
  38. 38. Dataset and service vocabularies, which advertise in a machine-interpretable fashion how to access the underlying data, can also be used in combination with RML to retrieve the data input to be mapped from its original source.
  39. 39. Support different locations and access interfaces: local file(s); database connectivity: D2RQ; Web source(s) (Web API/service): DCAT, CSVW, Hydra, VOiD (Dataset); RDF source(s): VOiD (Endpoint), SPARQL-SD
  40. 40. RDF Mapping Language (RML): Triples Map with Logical Source (Source, Reference Formulation, iterator), Subject Map, and Predicate Object Map (Predicate Map, Object Map)
  41. 41. Where does this data originally come from? Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: _:Source; ql:XPath; “/labs/lab”. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  42. 42. Support local file(s). RML Processor: file.xml → Retrieval module → XML data → Mapping module (with Map doc) → Output RDF
  43. 43. Support local file(s). Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: “file.xml”; ql:XPath; “/labs/lab”. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rr:template “http://ex.com/{/labs/lab/location/city}”
  44. 44. Support file(s) published on the Web. RML Processor: file.xml (Web API, DCAT) → Retrieval module (with Source description) → XML data → Mapping module (with Map doc) → Output RDF
  45. 45. Support dataset on the Web (DCAT). Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: _:Source; ql:XPath; “/labs/lab”. _:Source: dcat:Dataset; dcat:distribution, a dcat:Distribution; dcat:downloadURL <http://ex.com/file.xml>
  46. 46. Support data derived from a Web API. RML Processor: file.xml (Web API, DCAT) and Data repo (Web API, Hydra) → Retrieval module (with Source description) → XML data, JSON data → Mapping module (with Map doc) → Output RDF
  47. 47. Support data from a Web API (Hydra). Input: <labs> <lab> <short>MMLab</short> <title>Multimedia Lab</title> <location> <city>Ghent</city> </location> </lab> <lab> … </lab> … </labs>. <#Lab Logical Source>: _:Source; ql:XPath; “/labs/lab”. _:Source: hydra:IriTemplate; hydra:template “http://ex.com/lab?name={labName}”
  48. 48. RML Processor: file.xml (Web API, DCAT), Data repo (Web API, Hydra), and Database (JDBC, D2RQ) → Retrieval module (with Source description) → XML data, JSON data, tabular data → Mapping module (with Map doc) → Output RDF
  49. 49. Support tabular-structured data. Table columns: id, firstname, surname, lab. Rows: (1, Anastasia, Dimou, MMLab); (2, Ruben, Verborgh, MMLab); (3, Miel, Vander Sande, MMLab). <#DB Logical Source>: rr:SQL2008; “…”; _:Source; “SELECT …”. <#ResearcherMap>: rr:template “http://ex.com/{id}”; rr:template “http://ex.com/{lab}”; rr:template “{firstname} {surname}” with rr:termType rr:Literal
  50. 50. Support tabular-structured data. <#DB Logical Source>: rr:SQL2008; “…”; _:Source; “SELECT …”. _:Source: d2rq:Database; “…”; “…”; “…”; “…”. <#ResearcherMap>: rr:template “http://ex.com/{id}”; rr:template “http://ex.com/{lab}”; rr:template “{firstname} {surname}” with rr:termType rr:Literal
  51. 51. RML Processor: file.xml (Web API, DCAT), Data repo (Web API, Hydra), Database (JDBC, D2RQ), and Triple store (SPARQL) → Retrieval module (with Source description) → XML data, JSON data, tabular data → Mapping module (with Map doc) → Output RDF
  52. 52. Object defined in existing RDF source(s): ex:{lab} ex:located dbpedia:{city} instead of ex:{lab} ex:located ex:{city}
  53. 53. <#Lab Logical Source>: _:Source; ql:XPath; “/labs/lab”. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rml:reference “{/…/city}” with rr:termType rr:IRI. <#Dbpedia Logical Source>: DBpedia; “SELECT …”; ql:XPath; “/…/result”. <#DBpediaMap>. Result: ex:{lab} ex:located dbpedia:{city}
  54. 54. <#Lab Logical Source>: _:Source; ql:XPath; “/labs/lab”. <#LabMap>: rr:constant ex:located; rr:template “http://ex.com/{/labs/lab/short}”; rml:reference “{/…/city}” with rr:termType rr:IRI. <#Dbpedia Logical Source>: DBpedia; “SELECT …”; ql:XPath; “/…/result”. <#DBpediaMap>. Result: ex:{lab} ex:located dbpedia:{city}
  55. 55. RML Editor (http://RML.io/RMLeditor)
  56. 56. Mapping data: any data to RDF with RML. Specifying data: which data form a data input; how to reference data input extracts. Accessing & retrieving data: data input from original source(s).
  57. 57. Data access, retrieval and mapping descriptions are machine-interpretable. A granular, robust solution based on RML, which further automates and facilitates the generation of RDF representations.
  58. 58. RML.io Questions? Anastasia Dimou @natadimou
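The mapping step shown on the <#LabMap> slides (iterate over each <lab>, then expand templates per iteration) can be sketched in Python. This is a minimal illustration under stated assumptions, not how an RML processor is implemented: LAB_MAP, expand and map_labs are hypothetical names, the mapping is a plain dictionary rather than an RML document, and references are written relative to each iterated <lab> (short, location/city) instead of the absolute paths on the slides, because the standard-library ElementTree supports only a subset of XPath.

```python
import re
import xml.etree.ElementTree as ET

# Sample input taken from the slides (the elided second <lab> is left out).
XML_INPUT = """
<labs>
  <lab>
    <short>MMLab</short>
    <title>Multimedia Lab</title>
    <location><city>Ghent</city></location>
  </lab>
</labs>
""".strip()

# Hypothetical, simplified stand-in for <#LabMap> and its logical source.
LAB_MAP = {
    "iterator": "lab",  # plays the role of the XPath iterator "/labs/lab"
    "subject": "http://ex.com/{short}",
    "predicate": "ex:located",  # corresponds to rr:constant ex:located
    "object": "http://ex.com/{location/city}",
}


def expand(template, node):
    """Replace each {reference} with the text it selects below `node`."""
    return re.sub(
        r"\{([^}]+)\}",
        lambda m: node.findtext(m.group(1), default=""),
        template,
    )


def map_labs(xml_text, mapping):
    """Apply the mapping once per iterated element; return the triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for node in root.findall(mapping["iterator"]):
        triples.append((
            expand(mapping["subject"], node),
            mapping["predicate"],
            expand(mapping["object"], node),
        ))
    return triples


triples = map_labs(XML_INPUT, LAB_MAP)
for triple in triples:
    print(triple)
```

The iterator decides how many times the rules fire, and the references inside each template decide which extract of the current iteration is used, which is the split between iterator and reference formulation that the slides introduce.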

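The retrieval side (deciding from a source description which access interface to use) can be sketched in the same way. This is a hedged illustration only: resolve_source and the dictionary layout are hypothetical, the keys merely borrow property names from the vocabularies listed on the slides (dcat:downloadURL, hydra:template, d2rq:jdbcDSN, sd:endpoint), and the function only reports which interface and location would be used rather than fetching anything.

```python
import re
from urllib.parse import quote


def resolve_source(source, variables=None):
    """Return (interface, location) for a simplified source description.

    `source` is either a plain file name or a dict keyed by vocabulary terms.
    `variables` supplies values for Hydra IRI template placeholders.
    """
    variables = variables or {}
    if isinstance(source, str):
        # Local file, as in the "file.xml" slide.
        return ("file", source)
    if "dcat:downloadURL" in source:
        # Dataset published on the Web and described with DCAT.
        return ("http", source["dcat:downloadURL"])
    if "hydra:template" in source:
        # Web API described with a Hydra IRI template: fill in {placeholders}.
        url = re.sub(
            r"\{(\w+)\}",
            lambda m: quote(str(variables[m.group(1)]), safe=""),
            source["hydra:template"],
        )
        return ("http", url)
    if "d2rq:jdbcDSN" in source:
        # Database described with D2RQ.
        return ("jdbc", source["d2rq:jdbcDSN"])
    if "sd:endpoint" in source:
        # RDF source exposed through a SPARQL endpoint.
        return ("sparql", source["sd:endpoint"])
    raise ValueError("unsupported source description")


print(resolve_source("file.xml"))
print(resolve_source(
    {"hydra:template": "http://ex.com/lab?name={labName}"},
    {"labName": "MMLab"},
))
```

Keeping this dispatch separate from the mapping rules mirrors the split between the Retrieval module and the Mapping module in the RML Processor slides: the same mapping can be reused when only the source description changes.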