Mapping Hierarchical Sources into RDF using the RML Mapping Language

Incorporating structured data in the Linked Data cloud is still complicated, despite the numerous existing tools. In particular, hierarchical structured data (e.g., JSON) are underrepresented, due to their processing complexity. A uniform mapping formalisation for data in different formats, which would enable reuse and exchange between tools and applied data, is missing. This paper describes a novel approach of mapping heterogeneous and hierarchical data sources into RDF using the RML mapping language, an extension of R2RML (the W3C standard for mapping relational databases into RDF). To facilitate those mappings, we present a toolset for producing RML mapping files using the Karma data modelling tool, and for consuming them using a prototype RML processor. A use case shows how RML facilitates the mapping rules’ definition and execution to map several heterogeneous sources.

http://rml.io
https://github.com/mmlab/RMLProcessor


  1. Mapping Hierarchical Sources into RDF using the RML Mapping Language
      Anastasia Dimou¹, Miel Vander Sande¹, Jason Slepicka², Pedro Szekely², Erik Mannens¹, Craig Knoblock², Rik Van de Walle¹
      ¹ Ghent University – iMinds – Multimedia Lab
      ² University of Southern California – Information Sciences Institute – Department of Computer Science
      http://rml.io
      IEEE ICSC 2014, Newport Beach, California, 18 June 2014
  2. Most of the data that we would like to be able to query as Linked Open Data exists in formats other than RDF.
  3. There are over 11,000 APIs according to ProgrammableWeb.org, only 74 of which return results in RDF, but more than 5,000 return results in JSON or XML.
  4. Many languages, tools and approaches have been proposed to convert data from relational databases to RDF.
  5. Relational Database to RDF (R2RML, W3C). Diagram: the data owner / publisher defines R2RML mappings; an R2RML processor applies them to the DB to produce RDF.
  6. Diagram: the data owner / publisher defines R2RML mappings and an R2RML processor turns the DB into RDF; CSV, JSON and XML sources are each converted to RDF separately.
  7. Problems:
      lack of uniform definitions to describe mapping rules for heterogeneous sources;
      lack of interoperable definitions that would allow the re-use of mapping rules across different implementations;
      lack of reusable definitions that would allow the re-use of mapping rules for representing data in the same or different formats.
  8. Mapping data on a per-source and per-format basis, or on a case-specific basis, versus a uniform way of defining mappings for heterogeneous sources that can be re-used across data in the same or different formats and be interoperable across different implementations.
  9. Diagram (repeated): the data owner / publisher defines R2RML mappings and an R2RML processor turns the DB into RDF; CSV, JSON and XML sources are each converted to RDF separately.
  10. Diagram: the data owner / publisher defines a single set of mapping definitions; one processor converts DB, CSV, JSON and XML sources (any format) to RDF.
  11. RDF Mapping Language (RML): a generic, scalable mapping language for mapping heterogeneous resources into RDF in an integrable and interoperable fashion; a superset of the W3C-standardized R2RML mapping language. http://semweb.mmlab.be/ns/rml
  12. Relational Database to RDF Mapping Language (R2RML)
  13. R2RML mapping document: Triples Map, Logical Table, Table Name
      Table ARTISTS:
      NAME | BIRTH_DATE | DEATH_DATE
      Robert Theodore McCall | 1919-12-23 | 2010-02-26
      Ronald Anderson | 1929-12-06 |
      Mapping:
      <#ArtistMapping> rr:logicalTable [ rr:tableName "ARTISTS" ].
  14. R2RML mapping definition: a Triples Map consists of a Logical Table (Table Name), a Subject Map, and one or more Predicate-Object Maps, each with a Predicate Map and an Object Map.
  15. R2RML mapping document: Triples Map, Subject Map
      Table ARTISTS:
      NAME | BIRTH_DATE | DEATH_DATE
      Robert Theodore McCall | 1919-12-23 | 2010-02-26
      Ronald Anderson | 1929-12-06 |
      Mapping:
      <#ArtistMapping> rr:subjectMap [ rr:template "http://ex.com/{NAME}" ; rr:class ex:Person ];
      Result:
      <http://ex.com/Robert+Theodore+McCall> a ex:Person
  16. R2RML mapping document: Predicate-Object Map, Predicate Map, Object Map
      Table ARTISTS:
      NAME | BIRTH_DATE | DEATH_DATE
      Robert Theodore McCall | 1919-12-23 | 2010-02-26
      Ronald Anderson | 1929-12-06 |
      Mapping:
      <#ArtistMapping> rr:predicateObjectMap [ rr:predicate ex:birth_date ; rr:objectMap [ rr:column "BIRTH_DATE" ] ];
      Result:
      <http://ex.com/Robert+Theodore+McCall> ex:birth_date "1919-12-23"
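
The row-by-row behaviour of the R2RML example above can be sketched in a few lines of Python. This is only an illustration, not an R2RML processor: the `ARTISTS` list, `expand_template` and `map_row` are hypothetical names, `None` stands in for a SQL NULL, and `quote_plus` is used only because it reproduces the `+`-encoded IRIs shown on the slides (a conforming processor may percent-encode differently).

```python
from urllib.parse import quote_plus

# Rows of the ARTISTS table from the slides; None stands in for a SQL NULL.
ARTISTS = [
    {"NAME": "Robert Theodore McCall", "BIRTH_DATE": "1919-12-23", "DEATH_DATE": "2010-02-26"},
    {"NAME": "Ronald Anderson", "BIRTH_DATE": "1929-12-06", "DEATH_DATE": None},
]


def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the row's URL-encoded value."""
    result = template
    for column, value in row.items():
        placeholder = "{" + column + "}"
        if placeholder in result:
            if value is None:
                return None  # a NULL used in the template yields no term
            result = result.replace(placeholder, quote_plus(value))
    return result


def map_row(row):
    """Apply the subject map and the birth-date predicate-object map to one row."""
    subject = expand_template("http://ex.com/{NAME}", row)
    if subject is None:
        return []
    triples = [(subject, "rdf:type", "ex:Person")]  # rr:class ex:Person
    if row["BIRTH_DATE"] is not None:               # rr:column "BIRTH_DATE"
        triples.append((subject, "ex:birth_date", row["BIRTH_DATE"]))
    return triples


# Each row is processed independently of every other row.
triples = [triple for row in ARTISTS for triple in map_row(row)]
for s, p, o in triples:
    print(s, p, o)
```

Because each row is handled on its own and every column reference returns a single value, no iteration pattern has to be declared; this is the assumption that stops holding for hierarchical sources.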
  17. RDF Mapping Language (RML)
  18. RDF Mapping Language (RML): mapping hierarchical sources to RDF; dealing with hierarchy and heterogeneity.
  19. R2RML assumptions:
      each row is a self-contained unit that can be processed independently;
      the columns in each row can be referred to unambiguously;
      for each reference to a column in a single row, a unique value is returned.
  20. What RML needs instead:
      an explicit reference to the iteration pattern (R2RML: each row is a self-contained unit that can be processed independently);
      an abstract reference to the input data (R2RML: the columns in each row can be referred to unambiguously);
      more than one triple per Predicate-Object Map (R2RML: for each reference to a column in a single row, a unique value is returned).
  21. RDF Mapping Language (RML) for hierarchical sources
  22. Example sources.
      artworks.JSON:
      [ ...
        { "Title": "Apollo 11 Crew", "Artist": "Ronald Anderson", "Ref": "NPG_70_36",
          "Sitter": [ { "Name": "Neil Armstrong", "Birth Date": "1930-08-05" }, { "Name": "Buzz Aldrin", "Birth Date": "1930-01-20" }, { "Name": "Michael Collins" } ],
          "DateOfWork": "1969" },
        { "Title": "Neil Armstrong", "Artist": "Robert Theodore McCall", "Ref": "S_NPG_2010_51",
          "Sitter": [ { "Name": "Neil Armstrong" } ],
          "DateOfWork": "2009" },
        ... ]
      artists.XML:
      <Artists> ...
        <Artist> <Name>Robert Theodore McCall</Name> <Birth_Date>1919-12-23</Birth_Date> <Death_Date>2010-02-26</Death_Date> </Artist>
        <Artist> <Name>Ronald Anderson</Name> <Birth_Date>1929-12-06</Birth_Date> <Death_Date/> </Artist>
        ... </Artists>
  23. Specifying the input data
      R2RML: database; RML: file, API, …
      R2RML: Logical Table (rr:logicalTable); RML: Logical Source (rml:logicalSource)
      R2RML: Table Name (rr:tableName); RML: source (rml:source)
  24. Triples Map, Logical Source, source
      <#ArtworkMapping> rml:logicalSource [ rml:source "http://ex.com/artworks.json" ].
      <#ArtistMapping> rml:logicalSource [ rml:source "artists.xml" ].
  25. Referring to the input data
      R2RML: databases; RML: XML or JSON or CSV or …
      R2RML: (SQL); RML: XPath/XQuery or JSONPath or RFC 4180 or …
      R2RML: (rr:sqlQuery); RML: rml:referenceFormulation
  26. Triples Map, Logical Source, source, Reference Formulation
      <#ArtworkMapping> rml:logicalSource [ rml:source "http://ex.com/artworks.json" ; rml:referenceFormulation ql:JSONPath ].
      <#ArtistMapping> rml:logicalSource [ rml:source "artists.xml" ; rml:referenceFormulation ql:XPath ].
  27. Iterating over the input data
      R2RML: per row; RML: ?
      R2RML: implicit, nothing to declare; RML: rml:iterator
  28. Iterator over artists.xml
      <#ArtistMapping> rml:logicalSource [ rml:source "artists.xml" ; rml:referenceFormulation ql:XPath ; rml:iterator "/Artists/Artist" ].
      Input:
      <Artists> ...
        <Artist> <Name>Robert Theodore McCall</Name> <Birth_Date>1919-12-23</Birth_Date> <Death_Date>2010-02-26</Death_Date> </Artist>
        <Artist> <Name>Ronald Anderson</Name> <Birth_Date>1929-12-06</Birth_Date> <Death_Date/> </Artist>
        ... </Artists>
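
What the `/Artists/Artist` iterator selects can be illustrated with Python's standard `xml.etree.ElementTree`. This is a minimal sketch under stated assumptions: `iterate` is a hypothetical helper, it handles only simple slash-separated element paths (ElementTree supports a subset of XPath and rejects absolute paths on an element), and child references such as `Name` are read relative to each selected `<Artist>` element.

```python
import xml.etree.ElementTree as ET

# The artists.xml excerpt from the slides, without the elided parts.
ARTISTS_XML = """<Artists>
  <Artist>
    <Name>Robert Theodore McCall</Name>
    <Birth_Date>1919-12-23</Birth_Date>
    <Death_Date>2010-02-26</Death_Date>
  </Artist>
  <Artist>
    <Name>Ronald Anderson</Name>
    <Birth_Date>1929-12-06</Birth_Date>
    <Death_Date/>
  </Artist>
</Artists>"""


def iterate(xml_text, iterator):
    """Return the elements selected by a simple absolute path such as /Artists/Artist."""
    root = ET.fromstring(xml_text)
    steps = iterator.strip("/").split("/")
    if steps[0] != root.tag:
        return []
    if len(steps) == 1:
        return [root]
    # ElementTree paths are relative to the element, so drop the root step.
    return root.findall("/".join(steps[1:]))


artists = iterate(ARTISTS_XML, "/Artists/Artist")
names = [artist.findtext("Name") for artist in artists]
# findtext returns "" for an empty element; treat that as a missing value.
death_dates = [artist.findtext("Death_Date") or None for artist in artists]
print(names, death_dates)
```

Each selected `<Artist>` element plays the role that a single row plays in R2RML.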
  29. Iterators over artworks.json
      Input:
      [ ...
        { "Title": "Apollo 11 Crew", "Artist": "Ronald Anderson", "Ref": "NPG_70_36",
          "Sitter": [ { "Name": "Neil Armstrong", "Birth Date": "1930-08-05" }, { "Name": "Buzz Aldrin", "Birth Date": "1930-01-20" }, { "Name": "Michael Collins" } ],
          "DateOfWork": "1969" },
        { "Title": "Neil Armstrong", "Artist": "Robert Theodore McCall", "Ref": "S_NPG_2010_51",
          "Sitter": [ { "Name": "Neil Armstrong" } ],
          "DateOfWork": "2009" },
        ... ]
      <#ArtworkMapping> rml:logicalSource [ rml:source "http://ex.com/artworks.json" ; rml:referenceFormulation ql:JSONPath ; rml:iterator "$.[*]" ].
      <#SitterMapping> rml:logicalSource [ rml:source "http://ex.com/artworks.json" ; rml:referenceFormulation ql:JSONPath ; rml:iterator "$.[*].Sitter" ].
  30. Referring to the extracts of the input data, explicitly and implicitly
      R2RML: column name; RML: XML element or JSON object or …
      R2RML: rr:column; RML: rml:reference
  31. Referring to XML elements
      <#ArtistMapping>
        rml:logicalSource [ rml:source "http://ex.com/artists.xml" ; rml:referenceFormulation ql:XPath ; rml:iterator "/Artists/Artist" ] ;
        rr:subjectMap [ rr:template "http://ex.com/{Name}" ] ;
        rr:predicateObjectMap [ rr:predicate ex:death_date ; rr:objectMap [ rml:reference "/Artists/Artist/Death_Date" ] ].
      Input:
      <Artists> ...
        <Artist> <Name>Robert Theodore McCall</Name> <Birth_Date>1919-12-23</Birth_Date> <Death_Date>2010-02-26</Death_Date> </Artist>
        <Artist> <Name>Ronald Anderson</Name> <Birth_Date>1929-12-06</Birth_Date> <Death_Date/> </Artist>
        ... </Artists>
      Result:
      <http://ex.com/Robert+Theodore+McCall> ex:death_date "2010-02-26".
  32. Referring to JSON values (artworks)
      Input:
      [ ...
        { "Title": "Apollo 11 Crew", "Artist": "Ronald Anderson", "Ref": "NPG_70_36",
          "Sitter": [ { "Name": "Neil Armstrong", "Birth Date": "1930-08-05" }, { "Name": "Buzz Aldrin", "Birth Date": "1930-01-20" }, { "Name": "Michael Collins" } ],
          "DateOfWork": "1969" },
        { "Title": "Neil Armstrong", "Artist": "Robert Theodore McCall", "Ref": "S_NPG_2010_51",
          "Sitter": [ { "Name": "Neil Armstrong" } ],
          "DateOfWork": "2009" },
        ... ]
      <#ArtworkMapping>
        rml:logicalSource [ rml:source "http://ex.com/artworks.json" ; rml:referenceFormulation ql:JSONPath ; rml:iterator "$.[*]" ] ;
        rr:subjectMap [ rr:template "http://ex.com/{Ref}" ] ;
        rr:predicateObjectMap [ rr:predicate rdfs:label ; rr:objectMap [ rml:reference "$.[*].Title" ] ].
      Result:
      <http://ex.com/NPG_70_36> rdfs:label "Apollo 11 Crew".
  33. Referring to JSON values (sitters)
      Input:
      [ ...
        { "Title": "Apollo 11 Crew", "Artist": "Ronald Anderson", "Ref": "NPG_70_36",
          "Sitter": [ { "Name": "Neil Armstrong", "Birth Date": "1930-08-05" }, { "Name": "Buzz Aldrin", "Birth Date": "1930-01-20" }, { "Name": "Michael Collins" } ],
          "DateOfWork": "1969" },
        { "Title": "Neil Armstrong", "Artist": "Robert Theodore McCall", "Ref": "S_NPG_2010_51",
          "Sitter": [ { "Name": "Neil Armstrong" } ],
          "DateOfWork": "2009" },
        ... ]
      <#SitterMapping>
        rml:logicalSource [ rml:source "http://ex.com/artworks.json" ; rml:referenceFormulation ql:JSONPath ; rml:iterator "$.[*].Sitter" ] ;
        rr:subjectMap [ rr:template "http://ex.com/{Name}" ] ;
        rr:predicateObjectMap [ rr:predicate ex:birth_date ; rr:objectMap [ rml:reference "$.[*].Sitter.Birth Date" ] ].
      Result:
      <http://ex.com/Neil+Armstrong> ex:birth_date "1930-08-05".
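
The two JSON mappings can be approximated with the standard `json` module. The sketch below does not evaluate JSONPath: plain Python loops stand in for the two iterators, and it assumes that every element of a `Sitter` array is one iteration, which is how the expected triple on the slide reads. `artwork_triples` and `sitter_triples` are hypothetical names, and `quote_plus` is used only to reproduce the `+`-encoded IRIs shown on the slides.

```python
import json
from urllib.parse import quote_plus

# The artworks.json excerpt from the slides, without the elided parts.
ARTWORKS_JSON = """
[
  {"Title": "Apollo 11 Crew", "Artist": "Ronald Anderson", "Ref": "NPG_70_36",
   "Sitter": [
     {"Name": "Neil Armstrong", "Birth Date": "1930-08-05"},
     {"Name": "Buzz Aldrin", "Birth Date": "1930-01-20"},
     {"Name": "Michael Collins"}
   ],
   "DateOfWork": "1969"},
  {"Title": "Neil Armstrong", "Artist": "Robert Theodore McCall", "Ref": "S_NPG_2010_51",
   "Sitter": [{"Name": "Neil Armstrong"}],
   "DateOfWork": "2009"}
]
"""


def artwork_triples(artworks):
    """One iteration per artwork object: subject from Ref, label from Title."""
    triples = []
    for artwork in artworks:
        subject = "http://ex.com/" + quote_plus(artwork["Ref"])
        triples.append((subject, "rdfs:label", artwork["Title"]))
    return triples


def sitter_triples(artworks):
    """One iteration per sitter object nested inside each artwork."""
    triples = []
    for artwork in artworks:
        for sitter in artwork["Sitter"]:
            birth_date = sitter.get("Birth Date")
            if birth_date is None:
                continue  # a missing value produces no triple
            subject = "http://ex.com/" + quote_plus(sitter["Name"])
            triples.append((subject, "ex:birth_date", birth_date))
    return triples


artworks = json.loads(ARTWORKS_JSON)
for s, p, o in artwork_triples(artworks) + sitter_triples(artworks):
    print(s, p, o)
```

Sitters without a `Birth Date` key simply contribute no `ex:birth_date` triple, so the nested structure is flattened without inventing values.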
  34. RDF Mapping Language (RML) overview: a Triples Map has a Logical Source (Source, Iterator, Reference Formulation), a Subject Map, and Predicate-Object Maps; each Predicate-Object Map has a Predicate Map and either an Object Map or a Referencing Object Map (parent Triples Map, Join Condition with Parent column and Child column); a Term Map is a template, a constant or a reference.
  35. RDF Mapping Language (RML): editing mappings with Karma. http://www.isi.edu/integration/karma/
  36. RDF Mapping Language (RML): processing
  37. Mapping-driven processing: processing driven by the mapping module. Data-driven processing: processing driven by the extraction module.
  38. RML Processor: Extraction Module and Mapping Module
  39. Mapping Hierarchical Sources into RDF using the RML mapping language
      RML: http://rml.io
      RML Namespace: http://semweb.mmlab.be/ns/rml
      RML Processor: https://github.com/mmlab/RMLProcessor
      Contact us:
      Anastasia Dimou, anastasia.dimou@ugent.be, @natadimou
      Miel Vander Sande, miel.vandersande@ugent.be, @Miel_vds
