Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
RDF Mapping Language (RML)
A Generic Language for Integrated
RDF Mappings of Heterogeneous Data
Anastasia Dimou, Miel Vand...
The five stars of the Linked Open Data scheme
are approached as a set of
consecutive steps
… and are applied to a single input source every time
Limitations of current solutions
The semantic representation of each mapped resource is
Independently defined
disregarding...
Need for a well-considered policy regarding
mapping and primary interlinking of data
in the context of a certain
knowledge...
No mapping formalization
exists that defines how to map
heterogeneous sources into RDF
using integrated and
interoperable ...
Relational Database to RDF (R2RML W3C)
R2RML mappings R2RML processor
Data OWNER / PUBLISHER
defines
RDF
DB
Mapping heterogeneous resources to RDF
R2RML mappings R2RML processor
Data OWNER / PUBLISHER
defines
RDF
DB CSV
RDF
Mapping heterogeneous resources to RDF
R2RML mappings R2RML processor
Data OWNER / PUBLISHER
defines
RDF
DB CSV XML
RDF RDF
Current limitation:
mapping data on a per-source & per-format basis
R2RML mappings R2RML processor
Data OWNER / PUBLISHER
...
The mappings are tied to the implementations
not interoperable across different implementations
No uniform way to describe...
Uniform way for integrated mapping
of heterogeneous sources
Mappings definitions? processor
Data OWNER / PUBLISHER
defines...
R2RML mapping definition
Table Name
Triples
Map
Logical Table
Subject Map
Predicate-Object Map
1 Subject Map
0 or more
Pre...
R2RML mapping definition
Table Name
Triples
Map
Logical Table
Subject Map
Predicate-Object Map
Predicate-Object Map
Predic...
From R2RML to a generic mapping language
Object Map
Predicate
Map
Subject Map
Term Map
template
constant
column
column
RDF...
R2RML Mapping
<#ProductMapping>
rr:logicalTable [
rr:tableName “Suitcase" ];
rr:subjectMap [
rr:template "http://ex.com/{S...
from R2RML to a generic mapping language
R2RML Generic mapping language
Logical Table Logical Source
(CSV, XML, JSON)
Tabl...
References to values of heterogeneous resources
<PendingOrders>...
<Order id="398">
<Product>
<Id>AE5982</Id>
<Name>Samson...
from R2RML to a generic mapping language
R2RML R2RML
Logical Table Logical Source
(CSV, XML, JSON)
Table Name Source name ...
Mapping XML files
<PendingOrders>…
<Order id="398">
<Product>
<Id>AE5982</Id>
<Name>Samsonite DeLux 45</Name>
</Product>
<...
Mapping JSON files
{ ... ,
“ProductInStock” :
{ “ID”: "567",
“Name”: “Samsonite DeLux 45”,
“type”: “suitcase”
}, ...
}
ex:...
RDF Mapping Language (RML)
Source
Triples Map
Logical Source
Subject Map
Predicate-Object
Map
Predicate
Map
Object Map
Ter...
{ ... "Performance" :
{ "Perf_ID": "567",
"Location": {
"lat": "51.043611" ,
"long": "3.717222"}
},
... }
<Events> ...
<Ex...
{ ... "Performance" :
{ "Perf_ID": "567",
"Location": {
"lat": "51.043611" ,
"long": "3.717222“ } } ,
... }
<Events> ...
<...
{ ... "Performance" :
{ "Perf_ID": "567",
"Venue": {
"Name": "STAM",
"Venue_ID": "78" },
"Location": {
"long": "3.717222",...
{ ... "Performance" :
{ "Perf_ID": "567",
"Venue": {
"Name": "STAM",
"Venue_ID": "78" },
... }
<Events> ...
<Exhibition id...
Avoid redefining and replicating URI patterns
Uniquely define the URI patterns that
generates a resource and refer to its ...
Address the mappings definition in a generic way
scale over the input data extracts.
Distinct and not interdependent
refer...
Limitations:
Mapping of data on a per-source and per-format basis
Mapping definitions are tied to the implementation
Lack ...
RDF Mapping Language (RML)
generic language for mapping
heterogeneous resources into RDF
in an integrate and interoperable...
Upcoming SlideShare
Loading in …5
×

A Generic Language for Integrated RDF Mappings of Heterogeneous Data

691 views

Published on

Despite the significant number of existing tools, incorporating data from multiple sources and di different formats into the Linked Open Data cloud remains complicated. No mapping formalization exists to define how to map such heterogeneous sources into RDF in an integrated and interoperable fashion.
This paper introduces the RML mapping language, a generic language based on an extension over R2RML, the W3C standard for mapping relational databases into RDF. Broadening R2RML's scope, the language becomes source-agnostic and extensible, while facilitating the definition of mappings of multiple heterogeneous sources. This leads to higher integrity within datasets and richer interlinking among resources.

Published in: Technology, Education
  • Be the first to comment

  • Be the first to like this

A Generic Language for Integrated RDF Mappings of Heterogeneous Data

  1. 1. RDF Mapping Language (RML) A Generic Language for Integrated RDF Mappings of Heterogeneous Data Anastasia Dimou, Miel Vander Sande, Pieter Colpaert, Ruben Verborgh, Erik Mannens and Rik Van de Walle Ghent University – iMinds – Multimedia Lab http://semweb.mmlab.be/rml LDOW14, WWW14 Seoul, Korea, 8th April 2014
  2. 2. The five stars of the Linked Open Data scheme are approached as a set of consecutive steps
  3. 3. … and are applied to a single input source every time
  4. 4. Limitations of current solutions The semantic representation of each mapped resource is Independently defined disregarding its possible prior definitions and its links to other resources Manual aligned to its prior appearances (if possible) by reconstructing the same URIs Not linked to other resources links are defined after the data are mapped and published
  5. 5. Need for a well-considered policy regarding mapping and primary interlinking of data in the context of a certain knowledge domain
  6. 6. No mapping formalization exists that defines how to map heterogeneous sources into RDF using integrated and interoperable mappings.
  7. 7. Relational Database to RDF (R2RML W3C) R2RML mappings R2RML processor Data OWNER / PUBLISHER defines RDF DB
  8. 8. Mapping heterogeneous resources to RDF R2RML mappings R2RML processor Data OWNER / PUBLISHER defines RDF DB CSV RDF
  9. 9. Mapping heterogeneous resources to RDF R2RML mappings R2RML processor Data OWNER / PUBLISHER defines RDF DB CSV XML RDF RDF
  10. 10. Current limitation: mapping data on a per-source & per-format basis R2RML mappings R2RML processor Data OWNER / PUBLISHER defines RDF DB CSV JSONXML RDF RDF RDF
  11. 11. The mappings are tied to the implementations not interoperable across different implementations No uniform way to describe mappings of heterogeneous resources that describe complementarily the same domain Mapping definitions are not reused for data in the same or different formats Further limitation: lack of uniform and interoperable solutions
  12. 12. Uniform way for integrated mapping of heterogeneous sources Mappings definitions? processor Data OWNER / PUBLISHER defines RDF DB CSV JSONXML
  13. 13. R2RML mapping definition Table Name Triples Map Logical Table Subject Map Predicate-Object Map 1 Subject Map 0 or more Predicate-Object Maps Predicate-Object Map Predicate-Object Map Predicate Map Object Map
  14. 14. R2RML mapping definition Table Name Triples Map Logical Table Subject Map Predicate-Object Map Predicate-Object Map Predicate-Object Map Predicate Map Object Map
  15. 15. From R2RML to a generic mapping language Object Map Predicate Map Subject Map Term Map template constant column column RDF Term : a URI, a literal, a blank node
  16. 16. R2RML Mapping <#ProductMapping> rr:logicalTable [ rr:tableName “Suitcase" ]; rr:subjectMap [ rr:template "http://ex.com/{Suitcase}"; rr:class ex:Person ]; rr:predicateObjectMap [ rr:predicate rdfs:label; rr:objectMap “Name” ]. ex:567 a schema:Product; rdfs:label “Samsonite DeLux 45”. Suitcase Name 567 Samsonite DeLux 45
  17. 17. from R2RML to a generic mapping language R2RML Generic mapping language Logical Table Logical Source (CSV, XML, JSON) Table Name Source name / URI Column ??? per row iteration ???
  18. 18. References to values of heterogeneous resources <PendingOrders>... <Order id="398"> <Product> <Id>AE5982</Id> <Name>Samsonite DeLux 45</Name> </Product> </Order>... <PendingOrders> { ... , “ProductInStock” : { “ID”: "567", “Name”: “Samsonite DeLux 45”, “type”: “suitcase”, }, ... } XPath for XML Reference: “Order@Id” Iterator: “/PendingOrders /Order” JSONPath for JSON Reference: “$. ProductInStock.ID” Iterator: “$.ProductInStock”
  19. 19. from R2RML to a generic mapping language R2RML R2RML Logical Table Logical Source (CSV, XML, JSON) Table Name Source name / URI Column Reference (defined Reference Formulation) per row iteration defined Iterator
  20. 20. Mapping XML files <PendingOrders>… <Order id="398"> <Product> <Id>AE5982</Id> <Name>Samsonite DeLux 45</Name> </Product> </Order> … </PendingOrders> <#OrdersMapping> rml:logicalSource [ rml:source “orders.xml"; rml:referenceFormulation ql:XPath; rml:iterator “/PendingOrders/Order/Product” ]; rr:subjectMap [ rr:template http://ex.com/{Id}; rr:class schema:Product ]; rr:predicateObjectMap [ rr:predicate rdfs:label ; rr:object “Product/Name” ] . ex:AE5982 a schema:Product ; rdfs:label “Samsonite DeLux 45”.
  21. 21. Mapping JSON files { ... , “ProductInStock” : { “ID”: "567", “Name”: “Samsonite DeLux 45”, “type”: “suitcase” }, ... } ex:567 a schema:Product ; rdfs:label “Samsonite DeLux 45” . <#ProductInStockMapping> rml:logicalSource [ rml:source “stock.json"; rml:referenceFormulation ql:JSONPath; rml:iterator “$.ProductInStock” ]; rr:subjectMap [ rr:template http://ex.com/{ID}; rr:class schema:Product ]; rr:predicateObjectMap [ rr:predicate rdfs:label ; rr:object “Name” ] .
  22. 22. RDF Mapping Language (RML) Source Triples Map Logical Source Subject Map Predicate-Object Map Predicate Map Object Map Term Map template constant reference Iterator Reference Formulation Referencing Object Map Triples Map Join Condition Parent column Child column
  23. 23. { ... "Performance" : { "Perf_ID": "567", "Location": { "lat": "51.043611" , "long": "3.717222"} }, ... } <Events> ... <Exhibition id="398"> <Location> <lat>51.043611</lat> <long>3.717222</long> </Location> </Exhibition> ... </Events> Robust cross-references <#PerformancesMapping> rr:subjectMap [ rr:template “http://ex.com/{Perf_ID}”]; rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:parentTriplesMap <#LocationMapping> ] ]. <#EventsMapping> rr:subjectMap [ rr:template "http://ex.com/{@id}" ]; rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:parentTriplesMap <#LocationMapping> ] ];
  24. 24. { ... "Performance" : { "Perf_ID": "567", "Location": { "lat": "51.043611" , "long": "3.717222“ } } , ... } <Events> ... <Exhibition id="398"> <Location> <lat>51.076891</lat> <long>3.717222</long> </Location> </Exhibition> ... ... </Events> Robust cross-references <#LocationMapping> rr:subjectMap [ rr:template "http://ex.com/{lat},{long}"]; rr:predicateObjectMap [ rr:predicate ex:long; rr:objectMap [ rml:reference "long" ] ]; rr:predicateObjectMap [ rr:predicate ex:lat; rr:objectMap [ rml:reference "lat" ] ] . ex:567 ex:location ex:51.043611, 3.717222 ex:398 ex:location ex:51.076891, 3.717222 ex:51.043611, 3.717222 ex:lat ex:3.717222 ex:long ex:51.043611.
  25. 25. { ... "Performance" : { "Perf_ID": "567", "Venue": { "Name": "STAM", "Venue_ID": "78" }, "Location": { "long": "3.717222", "lat": "51.043611" } } , ... } Primary Interlinking <#PerformancesMapping> rr:subjectMap [ rr:template “http://ex.com/{Perf_ID}”]; rr:predicateObjectMap [ rr:predicate ex:venue; rr:objectMap [ rr:parentTriplesMap <#VenueMapping> ] ]. <#VenueMapping> rml:logicalSource [ rml:source "http://ex.com/performances.json"; rml:referenceFormulation ql:JSONPath; rml:iterator "$.Performance.Venue.[*]" ]; rr:subjectMap [ rr:template "http://ex.com/{Venue_ID}"; rr:class ex:Venue ]. .
  26. 26. { ... "Performance" : { "Perf_ID": "567", "Venue": { "Name": "STAM", "Venue_ID": "78" }, ... } <Events> ... <Exhibition id="398"> <Venue>STAM</Venue> </Exhibition> ... ... </Events> Primary Interlinking ex:567 ex:venue ex:78. ex:398 ex:venue ex:78. <#EventsMapping> rr:subjectMap [ rr:template "http://ex.com/{@id}" ]; rr:predicateObjectMap [ rr:predicate ex:venue; rr:objectMap [ rr:parentTriplesMap <#VenueMapping>; rr:joinCondition [ rr:child "$.Performance.Venue.Name"; rr:parent "/Events/Exhibition/Venue" ] ] ] .
  27. 27. Avoid redefining and replicating URI patterns Uniquely define the URI patterns that generates a resource and refer to its definition Modifications to the patterns or data values are propagated to every other reference of the resource Links between resources in different inputs are defined already on mapping level New mappings are automatically aligning Robust cross-references and primary interlinking
  28. 28. Address the mappings definition in a generic way scale over the input data extracts. Distinct and not interdependent references to the data extracts and the mappings Proof: CSS3 selectors to map HTML documents enrich the aforementioned data with data from and Extensibility and Scalability
  29. 29. Limitations: Mapping of data on a per-source and per-format basis Mapping definitions are tied to the implementation Lack of Mapping definitions’ reuse RDF Mapping Language (RML): Uniform and interoperable mapping definitions Robust cross-references and interlinking Scalable mapping language Conclusions: Addressed Limitation
  30. 30. RDF Mapping Language (RML) generic language for mapping heterogeneous resources into RDF in an integrate and interoperable fashion RML: http://semweb.mmlab.be/rml RML Processor: https://github.com/mmlab/RMLProcessor Contact us Anastasia Dimou anastasia.dimou@ugent.be @natadimou Miel Vander Sande miel.vandersande@ugent.be @Miel_vds

×