1
Translation of Relational and
Non-Relational Databases
into RDF with xR2RML
F. Michel, L. Djimenou, C. Faron-Zucker, J. Montagnat
I3S lab, CNRS, Univ. Nice Sophia
2
Web-scale data integration
▪ Web of data → publication/interlinking of open datasets
  • Goal: publish heterogeneous data in a common format (RDF)
▪ Driven by data integration initiatives, e.g.:
  • Linking Open Data, 1015 datasets
  • W3C Data Activity
  • BIO2RDF, 35 datasets
  • Neuroscience Information Framework (12598 registry entries)
[Figure: Linked Datasets as of Aug. 30th 2014. (c) R. Cyganiak and A. Jentzsch]
(Data: Apr. 2015)
3
Web-scale data integration
▪ Need to access data from the Deep Web [1]
  • Structured/unstructured data, hardly indexed by search engines, hardly linked with other data sources
▪ Exponential data growth goes on
  • Various types of DBs: RDB, NoSQL, NewSQL, native XML, LDAP directory, OODB...
  • Heterogeneous data models and query capabilities
[1] B. He, M. Patel, Z. Zhang, and K. C.-C. Chang. Accessing the deep web. Communications of the ACM, 50(5):94–101, 2007
4
Web-scale data integration
To enrich the web of data with
existing and new data being created
ever faster...
... we need standardized approaches
to enable the translation of
heterogeneous data sources to RDF
5
Agenda
▪ Previous works
▪ Background: R2RML and RML
▪ Description of xR2RML
▪ Evaluation and perspectives
7
Previous works
▪ Much work achieved on RDBs: D2RQ, Virtuoso, R2RML (W3C)…
  • Goals: generic RDB-to-RDF, OBDA, ontology learning, schema mapping…
  • Methods: direct mapping vs. domain-specific, materialization vs. SPARQL-to-SQL query rewriting
▪ XML: using either XPath (RML), XQuery (XSPARQL, SPARQL2XQuery) or XSLT (Scissor-Lift), XSD-to-OWL (SPARQL2XQuery)
▪ CSV/TSV/Spreadsheets: CSV on the Web (W3C WG)
▪ JSON: using JSONPath (RML)
▪ Integration frameworks: DataLift, RML, Asio Tool Suite…
8
Previous works
▪ Existing approaches map specific types of databases or specific data formats to RDF
▪ Each comes with its own mapping language or UI
▪ Supporting a new system (data model and QL) is not straightforward
No unified mapping language that applies equally to the most common databases (RDB, NoSQL, XML, LDAP, OO…)
Supporting a new data model and/or QL → develop a DB connector, but no change in the mapping language
10
R2RML – RDB To RDF Mapping Language
▪ W3C recommendation, 2012
▪ Goals:
  • Describe mappings of relational entities to RDF
  • Reuse of existing ontologies
  • Operationalization not addressed
▪ How: TriplesMaps (TM) define how to generate RDF triples
  • 1 logical table → rows to process
  • 1 subject map → subject IRIs
  • N (predicate map, object map) pairs
  • 1 optional graph map → graph IRIs
▪ An R2RML mapping is an RDF graph
11
R2RML – RDB To RDF Mapping Language

Table Study (Centre_Id is a foreign key to Centre.Id):
Id | Acronym | Centre_Id
10 | CAC2010 | 4

Table Centre:
Id | Name    | address
4  | Pasteur | ...

R2RML mapping graph:
<#Centre> a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Centre" ];
    rr:subjectMap [
        rr:class ex:Centre;
        rr:template "http://example.org/centre#{Name}";
    ].

<#Study> a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Study" ];
    rr:subjectMap [
        rr:class ex:Study;
        rr:template "http://example.org/study#{Id}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:hasName;
        rr:objectMap [ rr:column "Acronym" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:locatedIn;
        rr:objectMap [
            rr:parentTriplesMap <#Centre>;
            rr:joinCondition [
                rr:child "Centre_Id";
                rr:parent "Id";
            ];
        ];
    ].

Produced RDF:
<http://example.org/centre#Pasteur> a ex:Centre.
<http://example.org/study#10> a ex:Study;
    ex:hasName "CAC2010";
    ex:locatedIn <http://example.org/centre#Pasteur>.
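To make the processing model concrete, here is a minimal Python sketch of what an R2RML processor conceptually does with the two triples maps above. It is not how any particular engine is implemented: the row dictionaries, the function names and the tuple representation of triples are all assumptions made for illustration, and the join is a naive nested loop.

```python
# Illustrative rows copied from the Study and Centre tables of the slide.
CENTRES = [{"Id": 4, "Name": "Pasteur"}]
STUDIES = [{"Id": 10, "Acronym": "CAC2010", "Centre_Id": 4}]


def centre_iri(row):
    # Mirrors rr:template "http://example.org/centre#{Name}"
    return f"<http://example.org/centre#{row['Name']}>"


def study_iri(row):
    # Mirrors rr:template "http://example.org/study#{Id}"
    return f"<http://example.org/study#{row['Id']}>"


def translate(studies, centres):
    """Return the generated triples as (subject, predicate, object) tuples."""
    triples = []
    for centre in centres:
        triples.append((centre_iri(centre), "a", "ex:Centre"))
    for study in studies:
        subject = study_iri(study)
        triples.append((subject, "a", "ex:Study"))
        triples.append((subject, "ex:hasName", f'"{study["Acronym"]}"'))
        # rr:joinCondition: child column Centre_Id equals parent column Id
        for centre in centres:
            if study["Centre_Id"] == centre["Id"]:
                triples.append((subject, "ex:locatedIn", centre_iri(centre)))
    return triples


if __name__ == "__main__":
    for s, p, o in translate(STUDIES, CENTRES):
        print(f"{s} {p} {o} .")
```

Each triples map becomes a loop over the rows of its logical table, and the referencing object map becomes a join whose result is the subject IRI of the parent triples map.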
12
RML extensions to R2RML

XML document:
<centres>
  <centre Id="4">
    <name>Pasteur</name>
  </centre>
  <centre Id="6">
    <name>Pontchaillou</name>
  </centre>
</centres>

RML mapping graph:
<#Centre>
    rml:logicalSource [
        rml:source "http://example.org/Centres.xml";
        rml:referenceFormulation ql:XPath;
        rml:iterator "/centres/centre";
    ];
    rr:subjectMap [
        rr:class ex:Centre;
        rr:template "http://example.org/centre#{//centre/@Id}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:hasName;
        rr:objectMap [ rml:reference "//centre/name" ];
    ].

Advantages:
  • Extends to CSV, JSON, XML sources
  • Map several sources simultaneously
Limitations:
  • Fixed list of reference formulations
  • No distinction between reference formulation and query language
  • No RDF collections
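The iterator-plus-reference idea can be sketched in a few lines of Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under assumptions: the XML is inlined rather than fetched from the URL, the function name is hypothetical, and references are evaluated relative to each iterated `<centre>` element, which simplifies the XPath expressions shown on the slide.

```python
import xml.etree.ElementTree as ET

# Inline copy of the XML document from the slide (attribute written as Id="...").
XML_DOC = """<centres>
  <centre Id="4"><name>Pasteur</name></centre>
  <centre Id="6"><name>Pontchaillou</name></centre>
</centres>"""


def translate_centres(xml_text):
    """Return (subject, predicate, object) tuples, two per <centre> element."""
    root = ET.fromstring(xml_text)
    triples = []
    # Iterator /centres/centre: root is <centres>, so visit its <centre> children.
    for centre in root.findall("centre"):
        # Subject template uses the Id attribute of the current element.
        subject = f"<http://example.org/centre#{centre.get('Id')}>"
        triples.append((subject, "a", "ex:Centre"))
        # Object reference: text of the <name> child of the current element.
        triples.append((subject, "ex:hasName", f'"{centre.findtext("name")}"'))
    return triples


if __name__ == "__main__":
    for s, p, o in translate_centres(XML_DOC):
        print(f"{s} {p} {o} .")
```

The iterator selects the units to process (the counterpart of rows in a logical table), and each reference is then resolved once per unit.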
14
xR2RML - Overall picture
Flexible language to describe mappings from most common types of DB to RDF.
Extends R2RML and leverages RML extensions.
[Figure: xR2RML translation engine, xR2RML mapping description, source database (native QL), domain ontologies; edge labels: "refers to", "uses"]
15
xR2RML: Logical source

XML database supporting XQuery:
<centres>
  <centre Id="4">
    <name>Pasteur</name>
  </centre>
  <centre Id="6">
    <name>Pontchaillou</name>
  </centre>
</centres>

xR2RML mapping graph:
<#Centre>
    xrr:logicalSource [
        xrr:query '''for $x in doc("centres.xml")/centres/centre
                     where ... return $x''';
    ];

rr: R2RML vocabulary
xrr: xR2RML vocabulary
17
xR2RML: Data element references

XML database supporting XQuery:
<centres>
  <centre Id="4">
    <name>Pasteur</name>
  </centre>
  <centre Id="6">
    <name>Pontchaillou</name>
  </centre>
</centres>

xR2RML mapping graph:
<#Centre>
    xrr:logicalSource [
        xrr:query '''for $x in doc("centres.xml")/centres/centre
                     where ... return $x''';
    ];
    rr:subjectMap [
        rr:class ex:Centre;
        rr:template "http://example.org/centre#{//centre/@Id}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:hasName;
        rr:objectMap [ xrr:reference "//centre/name" ];
    ];

rr: R2RML vocabulary
xrr: xR2RML vocabulary

xR2RML engine usage guidelines
Types of DB          | xrr:query            | xrr:reference, rr:template
RDB, column stores   | SQL, CQL, HQL        | Column name
Native XML DB        | XQuery               | XPath
NoSQL document store | Proprietary JS-based | JSONPath
SPARQL endpoint      | SPARQL               | Variable name, column name (s, p, o)
Neo4J (graph DB)     | Cypher               | Column name (s, p, o)
LDAP directory       | LDAP query           | Attribute name
...                  | ...                  | ...
18
xR2RML: multiple values vs. RDF list/container
Mapping case: link the study with the centres it involves

{ "studyid": 10,
  "acronym": "CAC2010",
  "centres": [
    { "centreid": 4, "name": "Pasteur" },
    { "centreid": 6, "name": "Pontchaillou" }
  ]
}

Multiple values:
<http://example.org/study#10> ex:involves "Pasteur".
<http://example.org/study#10> ex:involves "Pontchaillou".

RDF list:
<http://example.org/study#10>
    ex:involvesCenters ( "Pasteur" "Pontchaillou" ).
19
xR2RML: multiple values vs. RDF list/container
Mapping case: link the study with the centres it involves

{ "studyid": 10,
  "acronym": "CAC2010",
  "centres": [
    { "centreid": 4, "name": "Pasteur" },
    { "centreid": 6, "name": "Pontchaillou" }
  ]
}

rr:objectMap [
    xrr:reference "$.centres.*.name";
    rr:termType xrr:RdfList;
];

R2RML term types: rr:IRI, rr:Literal, rr:BlankNode
xR2RML term types: xrr:RdfList, xrr:RdfSeq, xrr:RdfBag, xrr:RdfAlt
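The difference between the two outcomes can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the engine's code: the JSONPath expression `$.centres.*.name` is replaced by a hand-written list comprehension, the function names are hypothetical, and the output is plain Turtle text.

```python
# The study document from the slide.
STUDY = {
    "studyid": 10,
    "acronym": "CAC2010",
    "centres": [
        {"centreid": 4, "name": "Pasteur"},
        {"centreid": 6, "name": "Pontchaillou"},
    ],
}


def centre_names(doc):
    # Hand-written equivalent of the JSONPath expression $.centres.*.name
    return [centre["name"] for centre in doc["centres"]]


def as_multiple_values(doc):
    """Default behaviour: one triple per value returned by the reference."""
    subject = f"<http://example.org/study#{doc['studyid']}>"
    return [f'{subject} ex:involves "{name}" .' for name in centre_names(doc)]


def as_rdf_list(doc):
    """Term type xrr:RdfList: all values grouped in a single RDF collection."""
    subject = f"<http://example.org/study#{doc['studyid']}>"
    members = " ".join(f'"{name}"' for name in centre_names(doc))
    return f"{subject} ex:involvesCenters ( {members} ) ."


if __name__ == "__main__":
    print("\n".join(as_multiple_values(STUDY)))
    print(as_rdf_list(STUDY))
```

The same reference yields the same values in both cases; only the term type decides whether they become separate triples or members of one collection.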
20
xR2RML: nested collections
From structured values (XML, JSON...): nested collections and key-value associations...
... to RDF: generate nested lists/containers, qualify members (data type, language tag...)

E.g.: produce a list of lists of strings
rr:objectMap [
    xrr:reference "...";
    rr:termType xrr:RdfList;
    xrr:nestedTermMap [
        xrr:reference "...";
        rr:termType xrr:RdfList;
        xrr:nestedTermMap [
            rr:datatype xsd:string;
        ];
    ];
];

(
  ( "John"^^xsd:string "Bob"^^xsd:string )
  ( "Ted"^^xsd:string "Mark"^^xsd:string )
)
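The recursive nature of nested term maps can be sketched as a small recursive serializer in Python. This is an illustration only: the function name `to_turtle` is hypothetical, Python lists stand in for the nested source values, and every leaf is given the same datatype, as in the innermost nested term map of the example.

```python
def to_turtle(value, datatype="xsd:string"):
    """Serialize nested Python lists as nested Turtle collections.

    Lists play the role of term maps with term type xrr:RdfList; any other
    value is a leaf, rendered as a literal qualified with the given datatype.
    """
    if isinstance(value, list):
        inner = " ".join(to_turtle(member, datatype) for member in value)
        return f"( {inner} )"
    return f'"{value}"^^{datatype}'


if __name__ == "__main__":
    print(to_turtle([["John", "Bob"], ["Ted", "Mark"]]))
```

Each level of nesting in the data corresponds to one level of `xrr:nestedTermMap` in the mapping, with the datatype applied only at the innermost level.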
22
xR2RML: cross-references

MongoDB database:
Collection "studies":
{ "studyid": 10,
  "acronym": "CAC2010",
  "centres": [ 4, 6 ]
}
Collection "centres":
{ "centreid": 4,
  "name": "Pasteur" },
{ "centreid": 6,
  "name": "Pontchaillou" }

xR2RML mapping graph:
<#Centre>
    xrr:logicalSource [ ... ]; rr:subjectMap [ ... ].
<#Study>
    xrr:logicalSource [ ... ]; rr:subjectMap [ ... ];
    rr:predicateObjectMap [
        rr:predicate ex:involvesSeq;
        rr:objectMap [
            rr:parentTriplesMap <#Centre>;
            rr:joinCondition [
                rr:child "$.centres.*";
                rr:parent "$.centreid";
            ];
            rr:termType xrr:RdfSeq;
        ];
    ].

Join query pushed to the DB if supported, performed by the xR2RML engine otherwise

Produced RDF:
<http://example.org/study#10> ex:involvesSeq
    [ a rdf:Seq;
      rdf:_1 <http://example.org/centre#Pasteur>;
      rdf:_2 <http://example.org/centre#Pontchaillou>; ].
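When the database cannot evaluate the join itself, the engine has to do it. The Python sketch below shows one way such an engine-side join could work; it is a hypothetical illustration, not the Morph-xR2RML implementation. The function name is invented, the two JSONPath expressions are hard-coded as dictionary lookups, and the centre IRI is assumed to be built from the centre name, as the produced RDF suggests.

```python
# Documents from the two MongoDB collections shown on the slide.
STUDIES = [{"studyid": 10, "acronym": "CAC2010", "centres": [4, 6]}]
CENTRES = [
    {"centreid": 4, "name": "Pasteur"},
    {"centreid": 6, "name": "Pontchaillou"},
]


def involves_seq(study, centres):
    """Join child values ($.centres.*) with parent values ($.centreid)
    and render the matching centres as members of an rdf:Seq."""
    # Index parent documents by the value of the parent reference.
    by_id = {centre["centreid"]: centre for centre in centres}
    lines = [
        f"<http://example.org/study#{study['studyid']}> ex:involvesSeq",
        "  [ a rdf:Seq;",
    ]
    position = 0
    # Child values are visited in document order, which fixes the rdf:_n order.
    for centre_id in study["centres"]:
        parent = by_id.get(centre_id)
        if parent is None:
            continue  # no matching parent document: no member is generated
        position += 1
        lines.append(
            f"    rdf:_{position} <http://example.org/centre#{parent['name']}>;"
        )
    lines.append("  ] .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(involves_seq(STUDIES[0], CENTRES))
```

Indexing the parent collection once keeps the join linear in the number of child values, which matters when the engine rather than the database performs it.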
24
xR2RML: content with mixed formats
Data with mixed content
Relational table "STAFF", column "Name" contains JSON data:
... | Name                     | ...
... | {                        | ...
    |   "FirstName": "Bob",    |
    |   "LastName": "Smith"    |
    | }                        |

xR2RML mapping graph:
<#Centre>
    xrr:logicalSource [
        xrr:sourceName "STAFF";
    ];
    ...
    rr:predicateObjectMap [
        rr:predicate ex:first-name;
        rr:objectMap [
            xrr:reference "Column(Name)/JSONPath($.FirstName)" ];
    ];

Data format | Syntax path constructor
Row         | Column(), CSV(), TSV()
XML         | XPath()
JSON        | JSONPath()
...         | ...
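A mixed-format reference is evaluated step by step, each path constructor consuming the output of the previous one. The Python sketch below illustrates this for the two constructors used in the example. It rests on several assumptions: the regular expression, the function name `evaluate` and the row dictionary are hypothetical, and `JSONPath()` is reduced to simple `$.field.field` paths instead of a full JSONPath implementation.

```python
import json
import re

# One row of the STAFF table; the Name column holds a JSON string.
ROW = {"Name": '{"FirstName": "Bob", "LastName": "Smith"}'}

# Matches one path constructor such as Column(Name) or JSONPath($.FirstName).
STEP = re.compile(r"(\w+)\((.*?)\)")


def evaluate(reference, row):
    """Apply each path constructor of the reference in turn, left to right."""
    value = row
    for constructor, argument in STEP.findall(reference):
        if constructor == "Column":
            value = value[argument]
        elif constructor == "JSONPath":
            # Simplification: only dotted field paths are handled here.
            document = json.loads(value)
            keys = [key for key in argument.lstrip("$").split(".") if key]
            for key in keys:
                document = document[key]
            value = document
        else:
            raise ValueError(f"unsupported path constructor: {constructor}")
    return value


if __name__ == "__main__":
    print(evaluate("Column(Name)/JSONPath($.FirstName)", ROW))
```

Supporting another format would amount to adding one more branch, for example for `XPath()`, without changing how references are chained.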
26
Use case in Digital Humanities
▪ Use case: study the history and transmission of zoological knowledge across historical periods
▪ TAXREF taxonomical reference
  • Designed to support studies in Conservation Biology, enriched with bioarchaeological taxa
  • Maintained by the French National Museum of Natural History
  • ~450,000 terms, CSV/JSON/XML
27
Use case in Digital Humanities
▪ Ongoing work [2]: construction of a SKOS¹ thesaurus based on TAXREF
  • Import of TAXREF/JSON into MongoDB
  • Use of Morph-xR2RML, the prototype implementation of xR2RML, to convert the MongoDB data to RDF
  • Make alignments with existing well-adopted ontologies (e.g. NCBI Taxonomic Classification, GeoNames...)
    • Static alignments at mapping design time
    • Using automatic alignment methods
¹ SKOS: Simple Knowledge Organization System, W3C RDF-based standard to represent controlled vocabularies, taxonomies and thesauri. It bridges the gap between existing KOS and the Semantic Web and Linked Data.
28
Perspectives
▪ Ongoing discussion about the use of xR2RML to support ecology and agronomy studies
  • Large phenotype databases
▪ Consider the query rewriting approach to support large datasets
▪ How to write xR2RML mappings
  • Automatic xR2RML mapping generation from data schema (XSD/DTD, JSON schema, JSON-LD...)
  • Schema mapping
  • Schema discovery
29
Conclusions
▪ The data deluge keeps growing ever faster
▪ Data is stored in many kinds of DBs
▪ xR2RML:
  • Flexible language to map most common types of database to RDF
  • Supports various data models and query languages
  • Rich features: RDF collections/containers, joins, content with mixed formats
▪ Applied to the construction of a SKOS thesaurus of TAXREF, a taxonomical reference
Contacts:
Franck Michel
Johan Montagnat
Catherine Faron-Zucker
[2] C. Callou, F. Michel, C. Faron-Zucker, C. Martin, J. Montagnat. Towards a Shared Reference Thesaurus for
Studies on History of Zoology, Archaeozoology and Conservation Biology. In SW4SH workshop, ESWC’15.
[3] F. Michel, L. Djimenou, C. Faron-Zucker, and J. Montagnat. xR2RML: Non-Relational Databases to RDF
Mapping Language. Research report. ISRN I3S/RR 2014-04-FR. http://hal.archives-ouvertes.fr/hal-01066663
https://github.com/frmichel/morph-xr2rml/

SCHISTOSOMA HEAMATOBIUM life cycle  .pdfSCHISTOSOMA HEAMATOBIUM life cycle  .pdf
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
 
NuGOweek 2024 full programme - hosted by Ghent University
NuGOweek 2024 full programme - hosted by Ghent UniversityNuGOweek 2024 full programme - hosted by Ghent University
NuGOweek 2024 full programme - hosted by Ghent University
 
Lubrication System in forced feed system
Lubrication System in forced feed systemLubrication System in forced feed system
Lubrication System in forced feed system
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 

Translation of Relational and Non-Relational Databases into RDF with xR2RML

  • 1. Translation of Relational and Non-Relational Databases into RDF with xR2RML. F. Michel, L. Djimenou, C. Faron-Zucker, J. Montagnat. I3S lab, CNRS, Univ. Nice Sophia
  • 2. Web-scale data integration. Web of data → publication/interlinking of open datasets. Goal: publish heterogeneous data in a common format (RDF). Driven by data integration initiatives, e.g.: Linking Open Data (1015 datasets); W3C Data Activity; BIO2RDF (35 datasets); Neuroscience Information Framework (12598 registry entries). (Data: Apr. 2015.) Figure: Linked Datasets as of Aug. 30th 2014, (c) R. Cyganiak and A. Jentzsch.
  • 3. Web-scale data integration. Need to access data from the Deep Web [1]: structured/unstructured data, hardly indexed by search engines, hardly linked with other data sources. Exponential data growth goes on: various types of DBs (RDB, NoSQL, NewSQL, native XML, LDAP directory, OODB...), with heterogeneous data models and query capabilities. [1] B. He, M. Patel, Z. Zhang, and K. C.-C. Chang. Accessing the deep web. Communications of the ACM, 50(5):94–101, 2007.
  • 4. Web-scale data integration. To enrich the web of data with existing and new data being created ever faster, we need standardized approaches to enable the translation of heterogeneous data sources to RDF.
  • 5. Agenda. (1) Previous works. (2) Background: R2RML and RML. (3) Description of xR2RML. (4) Evaluation and perspectives.
  • 7. Previous works. Much work achieved on RDBs: D2RQ, Virtuoso, R2RML (W3C)… Goals: generic RDB-to-RDF, OBDA, ontology learning, schema mapping… Methods: direct mapping vs. domain-specific mapping; materialization vs. SPARQL-to-SQL query rewriting. XML: using either XPath (RML), XQuery (XSPARQL, SPARQL2XQuery) or XSLT (Scissor-Lift), XSD-to-OWL (SPARQL2XQuery). CSV/TSV/spreadsheets: CSV on the Web (W3C WG). JSON: using JSONPath (RML). Integration frameworks: DataLift, RML, Asio Tool Suite…
  • 8. Previous works. Existing approaches map specific types of databases or specific data formats to RDF. Each comes with its own mapping language or UI. Supporting a new system (data model and query language) is not straightforward. No unified mapping language applies equally to the most common databases (RDB, NoSQL, XML, LDAP, OO…). What is needed: supporting a new data model and/or query language → develop a DB connector, with no change in the mapping language.
  • 10. R2RML: RDB to RDF Mapping Language. W3C recommendation, 2012. Goals: describe mappings of relational entities to RDF; reuse of existing ontologies; operationalization not addressed. How: TriplesMaps (TM) define how to generate RDF triples: 1 logical table → rows to process; 1 subject map → subject IRIs; N (predicate map, object map) couples; 1 optional graph map → graph IRIs. An R2RML mapping is an RDF graph.
  • 11. R2RML: RDB to RDF Mapping Language
    Table Study (Centre_Id is a foreign key to Centre.Id):
      Id | Acronym | Centre_Id
      10 | CAC2010 | 4
    Table Centre:
      Id | Name    | address
      4  | Pasteur | ...
    R2RML mapping graph:
      <#Centre> a rr:TriplesMap;
        rr:logicalTable [ rr:tableName "Centre" ];
        rr:subjectMap [
          rr:class ex:Centre;
          rr:template "http://example.org/centre#{Name}";
        ].
      <#Study> a rr:TriplesMap;
        rr:logicalTable [ rr:tableName "Study" ];
        rr:subjectMap [
          rr:class ex:Study;
          rr:template "http://example.org/study#{Id}";
        ];
        rr:predicateObjectMap [
          rr:predicate ex:hasName;
          rr:objectMap [ rr:column "Acronym" ];
        ];
        rr:predicateObjectMap [
          rr:predicate ex:locatedIn;
          rr:objectMap [
            rr:parentTriplesMap <#Centre>;
            rr:joinCondition [ rr:child "Centre_Id"; rr:parent "Id"; ];
          ];
        ].
    Produced RDF:
      <http://example.org/centre#Pasteur> a ex:Centre.
      <http://example.org/study#10> a ex:Study;
        ex:hasName "CAC2010";
        ex:locatedIn <http://example.org/centre#Pasteur>.
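The way the two triples maps of slide 11 turn rows into triples can be sketched in a few lines of Python. This is only an illustration under assumptions, not how an R2RML processor is implemented: the rows are held in memory, the helper names `expand` and `translate` are hypothetical, and IRI escaping of template values is ignored.

```python
import re

# Toy rows mirroring the Study and Centre tables of slide 11.
STUDY = [{"Id": 10, "Acronym": "CAC2010", "Centre_Id": 4}]
CENTRE = [{"Id": 4, "Name": "Pasteur"}]


def expand(template, row):
    """Replace each {Column} placeholder with the row's value (no IRI escaping)."""
    return re.sub(r"\{([^}]+)\}", lambda m: str(row[m.group(1)]), template)


def translate(studies, centres):
    """Return (subject, predicate, object) triples for the two triples maps."""
    triples = []
    for c in centres:
        subj = expand("http://example.org/centre#{Name}", c)
        triples.append((subj, "rdf:type", "ex:Centre"))
    for s in studies:
        subj = expand("http://example.org/study#{Id}", s)
        triples.append((subj, "rdf:type", "ex:Study"))
        triples.append((subj, "ex:hasName", s["Acronym"]))
        # Referencing object map: join child column Centre_Id with parent column Id.
        for c in centres:
            if s["Centre_Id"] == c["Id"]:
                obj = expand("http://example.org/centre#{Name}", c)
                triples.append((subj, "ex:locatedIn", obj))
    return triples


TRIPLES = translate(STUDY, CENTRE)
for triple in TRIPLES:
    print(triple)
```

The nested loop stands in for the join condition; a real engine would typically delegate that join to the relational database as a SQL query.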
  • 12. RML extensions to R2RML
    XML document:
      <centres>
        <centre Id="4"> <name>Pasteur</name> </centre>
        <centre Id="6"> <name>Pontchaillou</name> </centre>
      </centres>
    RML mapping graph:
      <#Centre>
        rml:logicalSource [
          rml:source "http://example.org/Centres.xml";
          rml:referenceFormulation ql:XPath;
          rml:iterator "/centres/centre";
        ];
        rr:subjectMap [
          rr:class ex:Centre;
          rr:template "http://example.org/centre#{//centre/@Id}";
        ];
        rr:predicateObjectMap [
          rr:predicate ex:hasName;
          rr:objectMap [ rml:reference "//centre/name" ];
        ].
    Advantages: extends to CSV, JSON and XML sources; maps several sources simultaneously.
    Limitations: fixed list of reference formulations; no distinction between reference formulation and query language; no RDF collections.
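As a rough illustration of what the iterator and references of slide 12 do, the following Python sketch walks the XML document with the standard library. It is an assumption-laden approximation: the attribute is written `Id` (an attribute name cannot start with `@` in XML), the function name `map_centres` is hypothetical, and ElementTree's limited path support replaces a full XPath engine.

```python
import xml.etree.ElementTree as ET

XML_DOC = """
<centres>
  <centre Id="4"><name>Pasteur</name></centre>
  <centre Id="6"><name>Pontchaillou</name></centre>
</centres>
"""


def map_centres(xml_text):
    """Apply the Centre triples map to every element matched by the iterator."""
    root = ET.fromstring(xml_text)
    triples = []
    # Iterator /centres/centre: the root is <centres>, so visit its <centre> children.
    for centre in root.findall("centre"):
        # Template http://example.org/centre#{@Id}
        subj = "http://example.org/centre#" + centre.get("Id")
        triples.append((subj, "rdf:type", "ex:Centre"))
        # Reference to the <name> child element
        triples.append((subj, "ex:hasName", centre.findtext("name")))
    return triples


CENTRE_TRIPLES = map_centres(XML_DOC)
for triple in CENTRE_TRIPLES:
    print(triple)
```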
  • 14. xR2RML: overall picture. Flexible language to describe mappings from most common types of DB to RDF. Extends R2RML and leverages RML extensions. Diagram elements: xR2RML translation engine, xR2RML mapping description, native query language, source database, domain ontologies (edge labels: "refers to", "uses").
  • 15. xR2RML: Logical source (rr: R2RML vocabulary; xrr: xR2RML vocabulary)
    XML database supporting XQuery:
      <centres>
        <centre Id="4"> <name>Pasteur</name> </centre>
        <centre Id="6"> <name>Pontchaillou</name> </centre>
      </centres>
    xR2RML mapping graph:
      <#Centre>
        xrr:logicalSource [
          xrr:query """for $x in doc("centres.xml")/centres/centre where ... return $x""";
        ].
  • 16. xR2RML: Data element references (rr: R2RML vocabulary; xrr: xR2RML vocabulary)
    XML database supporting XQuery:
      <centres>
        <centre Id="4"> <name>Pasteur</name> </centre>
        <centre Id="6"> <name>Pontchaillou</name> </centre>
      </centres>
    xR2RML mapping graph:
      <#Centre>
        xrr:logicalSource [
          xrr:query """for $x in doc("centres.xml")/centres/centre where ... return $x""";
        ];
        rr:subjectMap [
          rr:class ex:Centre;
          rr:template "http://example.org/centre#{//centre/@Id}";
        ];
        rr:predicateObjectMap [
          rr:predicate ex:hasName;
          rr:objectMap [ xrr:reference "//centre/name" ];
        ].
  • 17. xR2RML: Data element references (continued). Same XML database and xR2RML mapping graph as slide 16. xR2RML engine usage guidelines:
      Types of DB        | xrr:query            | xrr:reference, rr:template
      RDB, column stores | SQL, CQL, HQL        | Column name
      Native XML DB      | XQuery               | XPath
      NoSQL doc. store   | Proprietary JS-based | JSONPath
      SPARQL endpoint    | SPARQL               | Variable name, column name (s, p, o)
      Neo4J (graph DB)   | Cypher               | Column name (s, p, o)
      LDAP directory     | LDAP query           | Attribute name
      ...                | ...                  | ...
  • 18. xR2RML: multiple values vs. RDF list/container. Mapping case: link the study with the centres it involves.
    JSON document:
      { "studyid": 10, "acronym": "CAC2010",
        "centres": [ { "centreid": 4, "name": "Pasteur" }, { "centreid": 6, "name": "Pontchaillou" } ] }
    Multiple values:
      <http://example.org/study#10> ex:involves "Pasteur".
      <http://example.org/study#10> ex:involves "Pontchaillou".
    RDF list:
      <http://example.org/study#10> ex:involvesCenters ( "Pasteur" "Pontchaillou" ).
  • 19. xR2RML: multiple values vs. RDF list/container (continued). Mapping case: link the study with the centres it involves. Same JSON document as slide 18.
    Object map producing an RDF list:
      rr:objectMap [
        xrr:reference "$.centres.*.name";
        rr:termType xrr:RdfList;
      ];
    R2RML term types: rr:IRI, rr:Literal, rr:BlankNode
    xR2RML term types: xrr:RdfList, xrr:RdfSeq, xrr:RdfBag, xrr:RdfAlt
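The difference between the two outcomes of slides 18 and 19 can be made concrete with a small Python sketch. Everything here is an assumption for illustration: `eval_path` handles only a tiny JSONPath subset (`$`, `.field`, `.*`), and `object_terms` is a hypothetical helper that emits Turtle strings, not the behaviour of any actual xR2RML engine.

```python
DOC = {
    "studyid": 10,
    "acronym": "CAC2010",
    "centres": [
        {"centreid": 4, "name": "Pasteur"},
        {"centreid": 6, "name": "Pontchaillou"},
    ],
}


def eval_path(doc, path):
    """Evaluate a tiny JSONPath subset: '$', '.field' and '.*' steps only."""
    nodes = [doc]
    for step in path.lstrip("$").strip(".").split("."):
        if not step:
            continue
        matched = []
        for node in nodes:
            if step == "*":
                if isinstance(node, dict):
                    matched.extend(node.values())
                elif isinstance(node, list):
                    matched.extend(node)
            elif isinstance(node, dict) and step in node:
                matched.append(node[step])
        nodes = matched
    return nodes


def object_terms(subject, predicate, values, term_type="rr:Literal"):
    """Serialize values as Turtle: one triple per value, or a single RDF list."""
    literals = ['"%s"' % v for v in values]
    if term_type == "xrr:RdfList":
        return ["%s %s ( %s ) ." % (subject, predicate, " ".join(literals))]
    return ["%s %s %s ." % (subject, predicate, lit) for lit in literals]


NAMES = eval_path(DOC, "$.centres.*.name")
STUDY_IRI = "<http://example.org/study#%d>" % DOC["studyid"]
AS_VALUES = object_terms(STUDY_IRI, "ex:involves", NAMES)
AS_LIST = object_terms(STUDY_IRI, "ex:involvesCenters", NAMES, "xrr:RdfList")
print("\n".join(AS_VALUES + AS_LIST))
```

The same reference yields the same values in both cases; only the term type decides whether they become separate triples or members of one collection.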
  • 20. xR2RML: nested collections. From structured values (XML, JSON...) with nested collections and key-value associations, to RDF: generate nested lists/containers and qualify members (data type, language tag...).
    E.g.: produce a list of lists of strings:
      rr:objectMap [
        xrr:reference "...";
        rr:termType xrr:RdfList;
        xrr:nestedTermMap [
          xrr:reference "...";
          rr:termType xrr:RdfList;
          xrr:nestedTermMap [ rr:datatype xsd:string; ];
        ];
      ];
    Produced RDF:
      ( ( "John"^^xsd:string "Bob"^^xsd:string ) ( "Ted"^^xsd:string "Mark"^^xsd:string ) )
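The recursive nature of nested term maps can be sketched as a recursive serializer. This is a minimal sketch, assuming the nested values are already available as Python lists; the function name `to_collection` is hypothetical and every leaf is typed with the same datatype, as in the innermost nested term map of slide 20.

```python
def to_collection(value, datatype="xsd:string"):
    """Serialize nested Python lists as nested Turtle collections of typed literals."""
    if isinstance(value, list):
        inner = " ".join(to_collection(v, datatype) for v in value)
        return "( %s )" % inner
    return '"%s"^^%s' % (value, datatype)


NESTED = to_collection([["John", "Bob"], ["Ted", "Mark"]])
print(NESTED)
```

Each level of list nesting corresponds to one `xrr:nestedTermMap` with term type `xrr:RdfList`; the leaf case corresponds to the term map that only sets `rr:datatype`.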
  • 21. xR2RML: cross-references
    MongoDB database:
      Collection "studies": { "studyid": 10, "acronym": "CAC2010", "centres": [ 4, 6 ] }
      Collection "centres": { "centreid": 4, "name": "Pasteur" }, { "centreid": 6, "name": "Pontchaillou" }
    xR2RML mapping graph:
      <#Centre>
        xrr:logicalSource [ ... ];
        rr:subjectMap [ ... ].
      <#Study>
        xrr:logicalSource [ ... ];
        rr:subjectMap [ ... ];
        rr:predicateObjectMap [
          rr:predicate ex:involvesSeq;
          rr:objectMap [
            rr:parentTriplesMap <#Centre>;
            rr:joinCondition [ rr:child "$.centres.*"; rr:parent "$.centreid"; ];
            rr:termType xrr:RdfSeq;
          ];
        ].
    Produced RDF:
      <http://example.org/study#10> ex:involvesSeq [
        a rdf:Seq;
        rdf:_1 <http://example.org/centre#Pasteur>;
        rdf:_2 <http://example.org/centre#Pontchaillou>;
      ].
  • 22. xR2RML: cross-references (continued). Same MongoDB database, xR2RML mapping graph and produced RDF as slide 21. The join query is pushed to the DB if supported, and performed by the xR2RML engine otherwise.
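When the database cannot evaluate the join itself, the engine has to do it. The Python sketch below shows one way such an in-engine join could look for the cross-reference example; it is an assumption, not the Morph-xR2RML implementation. The function name `involves_seq` is hypothetical, and the centre IRI template (based on the centre name) is inferred from the produced RDF because the subject map is elided on the slide.

```python
STUDIES = [{"studyid": 10, "acronym": "CAC2010", "centres": [4, 6]}]
CENTRES = [
    {"centreid": 4, "name": "Pasteur"},
    {"centreid": 6, "name": "Pontchaillou"},
]


def involves_seq(study, centres):
    """Join the child values $.centres.* with the parent $.centreid, keeping order."""
    by_id = {c["centreid"]: c for c in centres}  # index the parent side once
    members = []
    for centre_id in study["centres"]:
        parent = by_id.get(centre_id)
        if parent is None:
            continue  # no matching parent document, so no member is generated
        # Assumed subject template of <#Centre>, inferred from the produced RDF.
        iri = "http://example.org/centre#" + parent["name"]
        members.append(("rdf:_%d" % (len(members) + 1), iri))
    return members


SEQ = involves_seq(STUDIES[0], CENTRES)
for prop, iri in SEQ:
    print(prop, "<%s>" % iri)
```

Indexing the parent collection in a dictionary keeps the join linear in the number of documents, which matters once the engine, rather than the database, carries the cost.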
  • 23. xR2RML: content with mixed formats
    Data with mixed content: relational table "STAFF", whose column "Name" contains JSON data:
      ... | Name                                         | ...
      ... | { "FirstName": "Bob", "LastName": "Smith" } | ...
    xR2RML mapping graph:
      <#Centre>
        xrr:logicalSource [ xrr:sourceName "STAFF"; ];
        ...
        rr:predicateObjectMap [
          rr:predicate ex:first-name;
          rr:objectMap [ xrr:reference "Column(Name)/JSONPath($.FirstName)" ];
        ].
  • 24. xR2RML: content with mixed formats (continued). Same STAFF table and xR2RML mapping graph as slide 23. Syntax path constructors by data format:
      Data format | Syntax path constructor
      Row         | Column(), CSV(), TSV()
      XML         | XPath()
      JSON        | JSONPath()
      ...         | ...
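A mixed-syntax reference such as `Column(Name)/JSONPath($.FirstName)` is evaluated step by step: each path constructor is applied to the result of the previous one. The following Python sketch illustrates that idea under clear assumptions: `parse_mixed` and `evaluate` are hypothetical names, only the `Column` and `JSONPath` constructors are handled, and the JSONPath step supports plain `$.a.b` paths only.

```python
import json
import re

# One row of the STAFF table; the Name column holds a JSON document as text.
ROW = {"Name": '{"FirstName": "Bob", "LastName": "Smith"}'}


def parse_mixed(reference):
    """Split 'Column(Name)/JSONPath($.FirstName)' into (constructor, expression) steps."""
    return re.findall(r"(\w+)\(([^)]*)\)", reference)


def evaluate(row, reference):
    """Apply each path constructor in turn to the value produced by the previous one."""
    value = row
    for constructor, expr in parse_mixed(reference):
        if constructor == "Column":
            value = value[expr]
        elif constructor == "JSONPath":
            # Only simple '$.a.b' paths are handled in this sketch.
            if isinstance(value, str):
                value = json.loads(value)
            for field in expr.lstrip("$").strip(".").split("."):
                if field:
                    value = value[field]
        else:
            raise ValueError("unsupported path constructor: " + constructor)
    return value


FIRST_NAME = evaluate(ROW, "Column(Name)/JSONPath($.FirstName)")
print(FIRST_NAME)
```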
  • 26. Use case in Digital Humanities. Use case: study the history and transmission of zoological knowledge along historical periods. TAXREF taxonomical reference: designed to support studies in Conservation Biology, enriched with bioarchaeological taxa; maintained by the French National Museum of Natural History; ~450,000 terms, available as CSV/JSON/XML.
  • 27. Use case in Digital Humanities. Ongoing work [2]: construction of a SKOS (1) thesaurus based on TAXREF. Import of TAXREF/JSON into MongoDB. Use of Morph-xR2RML, the prototype implementation of xR2RML, to convert the MongoDB data to RDF. Alignments with existing well-adopted ontologies (e.g. NCBI Taxonomic Classification, GeoNames...): static alignments at mapping design time; automatic alignment methods. (1) SKOS: Simple Knowledge Organization System, a W3C RDF-based standard to represent controlled vocabularies, taxonomies and thesauri. It bridges the gap between existing knowledge organization systems (KOS) and the Semantic Web and Linked Data.
  • 28. Perspectives. Ongoing discussion about the use of xR2RML to support ecology and agronomic studies: large phenotype databases. Consider the query rewriting approach to support large datasets. How to write xR2RML mappings: automatic xR2RML mapping generation from data schema (XSD/DTD, JSON schema, JSON-LD...); schema mapping; schema discovery.
  • 29. Conclusions. The data deluge keeps growing ever faster. Data is stored in many kinds of DBs. xR2RML: flexible language to map most common types of database to RDF; supports various data models and query languages; rich features: RDF collections/containers, joins, content with mixed formats. Applied to the construction of a SKOS thesaurus of TAXREF, a taxonomical reference.
  • 30. Contacts: Franck Michel, Johan Montagnat, Catherine Faron-Zucker. Code: https://github.com/frmichel/morph-xr2rml/
    [2] C. Callou, F. Michel, C. Faron-Zucker, C. Martin, J. Montagnat. Towards a Shared Reference Thesaurus for Studies on History of Zoology, Archaeozoology and Conservation Biology. In SW4SH workshop, ESWC'15.
    [3] F. Michel, L. Djimenou, C. Faron-Zucker, and J. Montagnat. xR2RML: Non-Relational Databases to RDF Mapping Language. Research report. ISRN I3S/RR 2014-04-FR. http://hal.archives-ouvertes.fr/hal-01066663