On Mapping Relational Databases to RDF and SHACL
Ratan Bahadur Thapa
PhD Candidate - SIRIUS & IFI
Mapping of Relational Data to RDF
Direct Mapping
Sequeda et al.'s Direct Mapping
Constraint rewriting T for Direct Mapping
Properties of Rewriting T
Research Question
R2RML: RDB to RDF Mapping Language
Constraint rewriting Γ for simple RDB to RDF Mapping
Simple RDB to RDF Mapping
Example of Rewriting Γ
Properties of Rewriting Γ
Final Remarks
2nd April 2023 1 / 21
Mapping M of Relational Data to RDF
Standardized by RDB2RDF working group (W3C)
Direct Mapping, i.e., Default and Automatic
R2RML: RDB to RDF Mapping Language
Available Tools
D2R, Virtuoso, Morph, r2rml4net, db2triples, ultrawrap, Quest
Commercial tools such as Virtuoso and Oracle SW
Properties of M
M is a data mapping
M translates database instances into RDF triples
M¹ is monotone: if database instances D ⊆ D′, then M(D) ⊆ M(D′)
¹ R2RML with monotonic source queries
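The monotonicity property can be illustrated with a minimal sketch. The mapping function, the `ex:` prefix and the second tuple ("E02", "Ola") are hypothetical and not taken from the slides; the point is only that a mapping built tuple by tuple never loses triples when the instance grows.

```python
def mapping(instance):
    """Map a set of (U_ID, Name) tuples to a set of RDF-like triples.

    Hypothetical toy mapping: each tuple contributes its triples
    independently of all other tuples, which makes the mapping monotone.
    """
    graph = set()
    for u_id, name in instance:
        subject = f"ex:User/{u_id}"
        graph.add((subject, "rdf:type", "ex:User"))
        graph.add((subject, "ex:name", name))
    return graph


d_small = {("E01", "Ida")}
d_large = d_small | {("E02", "Ola")}  # invented extra tuple, so D ⊆ D′

# Monotonicity on this pair of instances: M(D) ⊆ M(D′)
is_monotone_here = mapping(d_small) <= mapping(d_large)
print(is_monotone_here)
```

This checks a single pair of instances only; it does not prove monotonicity for a real R2RML mapping, which depends on the source queries being monotone.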
Mapping M of Relational Data to RDF
Relational Database −→ RDF, via the W3C Mapping
Quality assurance and validation?
INPUT: Database Schema and Instance D
Key constraints: PKs and FKs
Other constraints: nullability, uniqueness and data types
OUTPUT: RDF Graph
Primary descriptors?
Does not explicitly differentiate between data and schema
Constraint-less, i.e., RDF syntax cannot express constraints
Challenges with "RDF without schema and constraint descriptions":
Understandability and usability
Verifying compliance of a dataset w.r.t. certain requirements or policies
Detecting metadata errors, etc.
Direct Mapping M Engine
M is a Fixed Set of Mapping Rules
Generates IRI identifiers for table names, columns and foreign keys
Generates identifiers for tuples: IRI if PK exists, otherwise Blank nodes
Produces triples: Table (tuples), Literal (attributes), Reference (FKs)
Example table User:
U_ID  Name  Position
E01   Ida   Post Doc
Table triples: for every tuple of a table
<baseIRI/User#U_ID=E01> rdf:type <baseIRI/User> .
Literal triples: for every attribute of a table
<baseIRI/User#U_ID=E01> <baseIRI/User#Name> "Ida" .
Reference triples: for every FK attribute of a table (if any exist)
IRI for table: <baseIRI/User>
IRI for tuples: <baseIRI/User#U_ID=E01>
IRI for columns: <baseIRI/User#Name>
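The table and literal rules above can be sketched in a few lines. This is a simplified illustration that follows the IRI pattern shown on the slide; the base IRI `http://example.org/` and the function name `direct_map_row` are assumptions, and blank nodes for tables without a PK as well as reference triples are left out.

```python
BASE = "http://example.org/"  # hypothetical base IRI, not from the slides


def direct_map_row(table, pk, row):
    """Return the table triple and the literal triples for one tuple with a PK."""
    subject = f"<{BASE}{table}#{pk}={row[pk]}>"
    # Table triple: the tuple is an instance of the table's class.
    triples = [(subject, "rdf:type", f"<{BASE}{table}>")]
    # Literal triples: one per non-null attribute of the tuple.
    for column, value in row.items():
        if value is None:
            continue  # null attributes produce no triple
        triples.append((subject, f"<{BASE}{table}#{column}>", f'"{value}"'))
    return triples


user_triples = direct_map_row(
    "User", "U_ID", {"U_ID": "E01", "Name": "Ida", "Position": "Post Doc"}
)
for triple in user_triples:
    print(" ".join(triple), ".")
```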
Sequeda et al.'s² Direct Mapping M
Extends the W3C Direct Mapping M with a binary table rule and OWL axioms
Contains "OWL rules" that translate the vocabulary V identified by the direct mapping rules Ms into OWL axioms
Mapping rules Mi translate the database instance into RDF
[Diagram: key constraints Σ, schema R, instance D; mapping rules Ms and Mi; vocabulary V; OWL rules; graph G; OWL axioms]
² On Directly Mapping Relational Databases to RDF and OWL, WWW 2012
Properties of Sequeda et al.'s Direct Mapping M
M is information preserving, query preserving, and monotone
M is not semantics preserving, i.e., it does not hold for every schema R, every set σ of PKs and FKs on R, and every instance D of R that
D ⊨ σ ⇐⇒ M(D) ⊨ OWL axioms
The non-monotone extension M_extended is semantics preserving, but it relies on DB instances and artificial RDF triples to trigger unsatisfiability of the OWL axioms
No monotone M is semantics preserving (Sequeda et al., Thm. 3).
Constraint Rewriting⁴ T for Direct Mapping M
Extends the monotone Direct Mapping M with SHACL³ constraints
T translates the vocabulary V identified by the mapping rules Ms, together with the SQL constraints Σ (keys, not nullable and uniqueness), into sets of SHACL shapes
[Diagram: Σ with data constraints δ and key constraints σ; schema R, instance D; mapping rules Ms and Mi; vocabulary V; shapes S; graph G]
³ Shapes Constraint Language for describing RDF, W3C Recommendation since 2017.
⁴ A Source-to-Target Constraint Rewriting for Direct Mapping, ISWC 2021.
Properties of Rewriting T
T is constraint preserving, i.e., there exists a mapping N s.t.
N(T(V, Σ)) = (V, Σ).
T is not semantics preserving, i.e., it does not hold for every schema R, every set Σ of SQL constraints on R, and every instance D of R that
D ⊨ Σ ⇐⇒ M(D) ⊨ T(V, Σ)
Example:
Table User, column U_ID:
U_ID
U01
U01
Null
RDF triples:
:User/U_ID=U01 rdf:type :User .
:User/U_ID=U01 :User/U_ID "U01" .
SHACL shape (:User is a class, :User/U_ID is a datatype property):
:User a sh:NodeShape, rdfs:Class;
  sh:property [ sh:path :User/U_ID;
    sh:nodeKind sh:Literal;
    sh:maxCount 1; sh:minCount 1;
    sh:datatype xsd:integer ];
  un:uniqueValuesForClass [ un:unqProp :User/U_ID;
    un:unqForClass :User ].
T fails to be semantics preserving because M
generates RDF terms from the active domain of the database, i.e., it ignores nulls, and
generates the IRI of a tuple as a function of its PK values, so tuples with duplicate PK values map to a single IRI.
T is weakly semantics preserving, i.e.,
D ⊨ Σ ⇐⇒ M(D) ⊨ T(V, Σ)
for all DB instances D that satisfy their key constraints σ.
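The failure of semantics preservation can be replayed on the three-row User example. The sketch below is a simplification under stated assumptions: the function names are hypothetical, rows with a null key are simply skipped, and only the "exactly one value" part of the shape (sh:minCount 1, sh:maxCount 1) is checked, not the datatype or uniqueness component.

```python
rows = [{"U_ID": "U01"}, {"U_ID": "U01"}, {"U_ID": None}]


def satisfies_primary_key(rows, pk):
    """PK constraint on the source: no nulls and no duplicate values."""
    values = [row[pk] for row in rows]
    return None not in values and len(values) == len(set(values))


def direct_map(rows, table, pk):
    """Toy direct mapping: nulls are ignored, the tuple IRI depends only on the PK value."""
    graph = set()
    for row in rows:
        if row[pk] is None:
            continue  # nulls are outside the active domain
        subject = f":{table}/{pk}={row[pk]}"
        graph.add((subject, "rdf:type", f":{table}"))
        graph.add((subject, f":{table}/{pk}", row[pk]))
    return graph


def satisfies_exactly_one(graph, cls, prop):
    """Every instance of cls has exactly one value for prop."""
    members = {s for (s, p, o) in graph if p == "rdf:type" and o == cls}
    return all(
        sum(1 for (s, p, _) in graph if s == node and p == prop) == 1
        for node in members
    )


graph = direct_map(rows, "User", "U_ID")
db_ok = satisfies_primary_key(rows, "U_ID")
graph_ok = satisfies_exactly_one(graph, ":User", ":User/U_ID")
print(db_ok, graph_ok)
```

The source instance violates its key constraint, yet the generated graph passes the cardinality check, because the duplicate collapses into one IRI and the null produces no triple.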
Research Question
The constraint rewriting T for the monotone Direct Mapping M is not semantics preserving if relational data violating key constraints are considered.
Besides the weak semantics translation between SQL constraints and SHACL for Direct Mapping:
Does there exist any other strong⁵ semantics correspondence?
D ⊨ Σ =⇒ M(D) ⊨ T(V, Σ), where T(V, Σ) is maximal?
"Maximal" means that any other SHACL constraint is either not implied by the source constraints Σ w.r.t. the mapping M, or subsumed by the maximally implied set T(V, Σ) of SHACL shapes.
What would be the definition of such a constraint rewriting T?
⁵ One-to-one semantics translation between SQL constraints and SHACL
R2RML: RDB to RDF Mapping Language
An RDB to RDF mapping M is a finite set of assertions of the form
"Query −→ Triple Patterns"
Example:
select S_id from student −→ ⟨iri1(S_id), rdf:type, Student⟩.
select C_id from course −→ ⟨iri2(C_id), rdf:type, Course⟩.
select S_id, C_id from student, course where student.Code = course.C_id −→ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩.
Schema and instance (student.Code is an FK to course.C_id):
create table course (C_id varchar primary key, Title varchar unique);
create table student (S_id integer primary key, Name varchar, Code varchar not null references course(C_id));
student:
S_id  Name  Code
011   Ida   CS40
012   NULL  CS20
course:
C_id  Title
CS40  Logic
CS20  Database
CS50  Data Eng
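The "Query −→ Triple Pattern" reading can be sketched with an in-memory SQLite database. This is an illustration, not an R2RML processor: the IRI templates `iri1` and `iri2` and the `ex:` prefix are assumed, and S_id is stored as an integer, so 011 becomes 11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table course (C_id varchar primary key, Title varchar unique);
create table student (S_id integer primary key, Name varchar,
                      Code varchar not null references course(C_id));
insert into course values ('CS40', 'Logic'), ('CS20', 'Database'), ('CS50', 'Data Eng');
insert into student values (11, 'Ida', 'CS40'), (12, NULL, 'CS20');
""")


def iri1(s_id):
    return f"ex:student/{s_id}"  # hypothetical IRI template for students


def iri2(c_id):
    return f"ex:course/{c_id}"  # hypothetical IRI template for courses


# Each assertion pairs a source query with a function that builds one triple per row.
assertions = [
    ("select S_id from student",
     lambda r: (iri1(r[0]), "rdf:type", "ex:Student")),
    ("select C_id from course",
     lambda r: (iri2(r[0]), "rdf:type", "ex:Course")),
    ("select S_id, C_id from student, course where student.Code = course.C_id",
     lambda r: (iri1(r[0]), "ex:enrolledFor", iri2(r[1]))),
]

graph = set()
for query, template in assertions:
    for row in conn.execute(query):
        graph.add(template(row))

for triple in sorted(graph):
    print(triple)
```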
Constraint Rewriting⁶ T for Simple RDB to RDF Mapping M⁷
A maximal semantics preserving rewriting T for a simple mapping M (written Γ in the remaining slides)
T : Q −→ P(S), where
Q is the set of all pairs (M, Σ) s.t.
M is a simple RDB-to-RDF mapping
Σ is a set of SQL constraints, i.e., keys and others
S is the set of all SHACL shapes
P(S) is the set of maximal sets of SHACL shapes
⁶ Mapping Relational Database Constraints to SHACL, ISWC 2022.
⁷ Simplifying M further yields the Direct Mapping; therefore, the results for T also apply to the Direct Mapping.
Simple RDB to RDF Mapping M
A simple mapping M is a finite set of assertions of the form Q −→ ψ, where
Q is an SP or SPJ query over a relational source D, called the source query, s.t.
the selections considered are those that filter out nulls, and
the joins considered are equality joins along foreign keys;
ψ is a graph triple pattern.
Example:
π_{S_id} σ_{¬isNull(S_id)}(student) −→ ⟨iri1(S_id), rdf:type, Student⟩.
π_{C_id} σ_{¬isNull(C_id)}(course) −→ ⟨iri2(C_id), rdf:type, Course⟩.
π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) −→ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
The Rewriting Γ
Rewriting steps:
Let Q −→ ψ be a mapping assertion defined on a schema R with constraints Σ.
Then Γ computes:
1. Σ|Q, i.e., Σ propagated to att(Q)
2. Σ|Q ⊩ σ_{X→Y}, where X, Y ⊆ att(Q), i.e., the Σ-implied data dependencies σ⁸ on the view projected by Q
3 SHACL constraint on scheme(ψ) based on Σ|Q ⊩ σ and
⁸ Data dependencies that also apply to databases with nulls
[Diagram: the rewriting Γ takes the mapping M and the constraints Σ and yields the SHACL constraints; M maps the instance D to the RDF graph M(D)]
Example of Γ: Inputs R, Σ and M
Schema R with Σ:
create table course (C_id varchar primary key, Title varchar unique);
create table student (S_id integer primary key, Name varchar, Code varchar not null references course(C_id));
student:
S_id  Name  Code
011   Ida   CS40
012   NULL  CS20
course:
C_id  Title
CS40  Logic
CS20  Database
CS50  Data Eng
Simple mapping M:
π_{S_id} σ_{¬isNull(S_id)}(student) −→ ⟨iri1(S_id), rdf:type, Student⟩.
π_{C_id} σ_{¬isNull(C_id)}(course) −→ ⟨iri2(C_id), rdf:type, Course⟩.
π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) −→ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
Example of Γ: Computing att(Q), Σ|Q and Σ|Q ⊩ σ
π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) −→ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
Steps 1-2 of Γ:
att(Q1) = {S_id, Code} and {UNQ(S_id), NN(S_id), NN(Code)} ⊆ Σ|Q1, hence Σ|Q1 ⊩ FD_{S_id→Code}
att(Q2) = {C_id} and {UNQ(C_id), NN(C_id)} ⊆ Σ|Q2, hence Σ|Q2 ⊩ UFD_{C_id→C_id}
att(Q) = {S_id, C_id} and FK(Code, student, C_id, course) ∈ Σ|Q1 ∩ Σ|Q2, hence Σ|Q ⊩ FD_{S_id→C_id},
since Σ|Q1 ⊩ FD_{S_id→Code}, and Σ|Q2 ⊩ UFD_{C_id→C_id} implies Σ|Q2 ⊩ FD_{C_id→C_id}.
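As a sanity check, the derived dependency FD S_id→C_id can be tested on the view that Q produces for the example instance. The helper `holds_fd` is a hypothetical name, and the check covers this one instance only; the derivation above establishes the dependency for every instance satisfying Σ.

```python
student = [
    {"S_id": 11, "Name": "Ida", "Code": "CS40"},
    {"S_id": 12, "Name": None, "Code": "CS20"},
]
course = [
    {"C_id": "CS40", "Title": "Logic"},
    {"C_id": "CS20", "Title": "Database"},
    {"C_id": "CS50", "Title": "Data Eng"},
]

# Q1 and Q2: selections that filter out nulls.
q1 = [s for s in student if s["S_id"] is not None and s["Code"] is not None]
q2 = [c for c in course if c["C_id"] is not None]

# Q: equality join along the foreign key, projected on S_id and C_id.
view = [
    {"S_id": s["S_id"], "C_id": c["C_id"]}
    for s in q1
    for c in q2
    if s["Code"] == c["C_id"]
]


def holds_fd(rows, lhs, rhs):
    """Check the functional dependency lhs -> rhs on a list of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs values
    return True


fd_holds = holds_fd(view, ["S_id"], ["C_id"])
print(view, fd_holds)
```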
Example of Γ: Computing Γ(M, Σ)
π_{S_id} σ_{¬isNull(S_id)}(student) −→ ⟨iri1(S_id), rdf:type, Student⟩.
π_{C_id} σ_{¬isNull(C_id)}(course) −→ ⟨iri2(C_id), rdf:type, Course⟩.
π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) −→ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩.
Steps 1-2 of Γ (Σ|Q ⊩ σ): Σ|Q1 ⊩ FD_{S_id→Code}, Σ|Q2 ⊩ UFD_{C_id→C_id}, Σ|Q ⊩ FD_{S_id→C_id}
Step 3 of Γ: shapes Γ(M, Σ) with implicit class-based targets
⟨Student, τ_Student, φ_Student⟩, since m_Student ∈ M, s.t.
(≥0 enrolledFor.Course) ∈ φ_Student, since M, ι(m, M) = A
(≤0 enrolledFor.¬Course) ∈ φ_Student, since M is simple
(=1 enrolledFor.Course) ∈ φ_Student, since Σ|Q ⊩ FD_{S_id→C_id}
⟨Course, τ_Course, φ_Course⟩, since m_Course ∈ M, s.t.
(≥0 enrolledFor⁻.Student) ∈ φ_Course, since M, ι(m, M) = A
(≤0 enrolledFor⁻.¬Student) ∈ φ_Course, since M is simple
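The cardinality constraints in the two shapes can be read as plain checks over a set of triples. The sketch below is an assumption-laden illustration, not a SHACL validator: the node names are invented, the graph is the one the example mapping would produce, and the trivially true (≥0 …) constraints are omitted.

```python
TYPE = "rdf:type"

# Graph M(D) for the example instance, with hypothetical short node names.
graph = {
    ("s/11", TYPE, "Student"), ("s/12", TYPE, "Student"),
    ("c/CS40", TYPE, "Course"), ("c/CS20", TYPE, "Course"), ("c/CS50", TYPE, "Course"),
    ("s/11", "enrolledFor", "c/CS40"), ("s/12", "enrolledFor", "c/CS20"),
}


def instances(g, cls):
    return {s for (s, p, o) in g if p == TYPE and o == cls}


def student_shape_holds(g):
    """(=1 enrolledFor.Course) and (<=0 enrolledFor.not Course) for every Student."""
    courses = instances(g, "Course")
    for node in instances(g, "Student"):
        targets = [o for (s, p, o) in g if s == node and p == "enrolledFor"]
        in_course = [o for o in targets if o in courses]
        if len(in_course) != 1 or len(in_course) != len(targets):
            return False
    return True


def course_shape_holds(g):
    """(<=0 inverse enrolledFor.not Student) for every Course."""
    students = instances(g, "Student")
    for node in instances(g, "Course"):
        sources = [s for (s, p, o) in g if o == node and p == "enrolledFor"]
        if any(s not in students for s in sources):
            return False
    return True


print(student_shape_holds(graph), course_shape_holds(graph))
```

Adding a second enrolledFor edge for one student, or an edge from a node that is not a Student, makes the corresponding check fail.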
Properties of Γ
Γ is semantics preserving:
for a mapping set M defined over a relational schema R with source constraints Σ, and an arbitrary instance D of R,
D ⊨ Σ =⇒ M(D) ⊨ Γ(M, Σ).
Γ is maximal semantics preserving:
for every set S of shapes with sch(S) ⊆ sch(M) s.t. Σ ⊨_M S, meaning that
∀D. (D ⊨ Σ =⇒ M(D) ⊨ S),
we have ∀G. (G ⊨ Γ(M, Σ) =⇒ G ⊨ S).
Γ is monotone:
for every two mapping sets M1 ⊆ M2 defined on a schema R with Σ,
∀G. (G ⊨ Γ(M2, Σ) =⇒ G ⊨ Γ(M1, Σ)).
Final Remarks
The constraint rewriting Γ extends an RDB to RDF mapping M with SHACL, where M is a simple RDB to RDF mapping.
Γ is maximal semantics preserving and monotone.
Work in progress and future goals:
Extension of Γ beyond simple R2RML, i.e., to expressive R2RML with monotonic source queries
Extension of Γ in an OBDA platform
SPARQL query simplification and optimization with Γ
Semantics preserving SQL-to-SPARQL translation with Γ
References
Ratan Bahadur Thapa and Martin Giese
A Source-to-Target Constraint Rewriting for Direct Mapping.
International Semantic Web Conference, 21–38, 2021, Springer.
Ratan Bahadur Thapa and Martin Giese
Mapping Relational Database Constraints to SHACL.
International Semantic Web Conference, 214–230, 2022, Springer.
Juan F. Sequeda, Marcelo Arenas and Daniel P. Miranker
On Directly Mapping Relational Databases to RDF and OWL.
Proc. 21st Intl. Conf. on World Wide Web, 649–658, 2012, ACM.
Marcelo Arenas, Alexandre Bertails, Eric Prud'hommeaux and Juan F. Sequeda
A Direct Mapping of Relational Data to RDF.
W3C Recommendation, 2012.
Souripriya Das, Seema Sundara and Richard Cyganiak
R2RML: RDB to RDF Mapping Language.
W3C Recommendation, 2012.
Holger Knublauch and Dimitris Kontokostas
Shapes Constraint Language (SHACL).
W3C Recommendation, 2017.

  • 1. On Mapping Relational Databases to RDF and SHACL Ratan Bahadur Thapa PhD Candidate - SIRIUS & IFI
  • 2. Outline Mapping of Relational Data to RDF Direct Mapping Sequeda et. al.’s Direct Mapping Constraint rewriting T for Direct Mapping Properties of Rewriting T Research Question R2RML: RDB to RDF Mapping Language Constraint rewriting Γ for simple RDB to RDF Mapping Simple RDB to RDF Mapping Exam. of Rewriting Γ Properties of Rewriting Γ Final Remarks References 2nd April 2023 1 / 21
  • 3. Mapping M of Relational Data to RDF Standardized by RDB2RDF working group (W3C) Direct Mapping, i.e., Default and Automatic R2RML: RDB to RDF Mapping Language Available Tools D2R, Virtuoso, Morph, r2rml4net, db2triples, ultrawrap, Quest Commercial such as Virtuoso, Oracle SW Properties of M M is data mapping M translates database instances into RDF triples M 1 is monotone If database instances D ⊆ D′ then M(D) ⊆ M(D′ ) 1 R2RML with Monotonic Source Query 2nd April 2023 2 / 21
  • 4. Mapping M of Relational Data to RDF 2nd April 2023 3 / 21 Relational Database RDF W3C Mapping Quality assurance and validation? INPUT Database Schema and Instance D Key Constraints: PKs and FKs Other Constraints: Nullability, Uniqueness and Data types OUTPUT RDF Graph Primary descriptors? Does not explicitly differentiate between data and schema Constraint-less, i.e., rdf syntax cannot express constraints Challenges with "RDF without schema and constraint descriptions": Understandability and usability Verifying compliance of a dataset w.r.t. certain requirement or policies Detecting metadata errors etc
  • 5. Direct Mapping M Engine M is a Fixed Set of Mapping Rules Generates IRI identifiers for table names, columns and foreign keys Generates identifiers for tuples: IRI if PK exists, otherwise Blank nodes Produce Triples: Table (tuples), Literal (attributes), Reference (FKs) 2nd April 2023 4 / 21 Table triples: for every tuples of tables <baseIRI/User#U_ID=E01 > rdf:type <baseIRI/User> . Literal triples: for every attributes of table <baseIRI/User#U_ID=E01> <baseIRI/User#Name> "Ida" . Reference triples: for every FK attributes of table (if exists) U_ID Name Position E01 Ida Post Doc User <baseIRI/User> IRI for table <baseIRI/User#Name> <baseIRI/User#U_ID=E01> IRI for tuples IRI for columns
  • 6. Sequeda et. al.2 Direct Mapping M Extend W3C Direct Mapping M with Binary table rule and OWL axioms Contain "OWL rules" that translate vocabularies V identifies by direct mapping rules Ms into OWL axioms Mapping rules Mi translate database instance into RDF Key constraints Σ Schema R Instance D Ms V Mi OWL rules Graph G OWL axioms 2 On Directly Mapping Relational Databases to RDF and OWL, WWW 2012 2nd April 2023 5 / 21
  • 7. Properties of Sequeda et. al.’s Direct Mapping M Information and Query Preserving, and Monotone M is not semantics Preserving i.e., for every R and σ set of PKs and FKs on R, it is not the case that D ⊨ σ ⇐⇒ M(D) ⊨ OWL axioms Non-monotonic Mextended is semantics Preserving, i.e., relies on DB instances and artificial RDF triples to trigger unsatisfiability of OWL axioms No monotone M is semantics preserving, Sequeda et. al.[Them. 3]. 2nd April 2023 6 / 21
  • 8. Constraint Rewriting 4 T for Direct Mapping M Extend Monotone Direct Mapping M with SHACL 3 Constraints T translates vocabularies V identifies by mapping rules Ms and SQL (keys, not nullable and uniqueness) constraints Σ into sets of SHACL shapes Data constraints δ Key constraints σ Σ Schema R Instance D Ms V Γ Mi Shapes S Graph G 3 Shapes Constraint Language for describing RDF, W3C rec since 2017. 4 A Souce-to-Target Constraint Rewriting for Direct Mapping, ISWC 2021. 2nd April 2023 7 / 21
  • 9. Properties of Rewriting T T is constraint preserving, i.e., there exist mapping N s.t. N(T (V, Σ)) = (V, Σ). T is not semantics Preserving i.e., for every R and Σ set of SQL constraints on R, it is not the case that D ⊨ Σ ⇐⇒ M(D) ⊨ T (V, Σ) 2nd April 2023 8 / 21
  • 10. Properties of Rewriting T T is constraint preserving T is not semantics Preserving i.e., for every R and Σ set of SQL constraints on R, it is not the case that D ⊨ Σ ⇐⇒ M(D) ⊨ T (V, Σ) Example: 2nd April 2023 9 / 21 U_ID U01 U01 Null User :User/U_ID=U01 rdf:type :User . :User/U_ID=U01 :User/U_ID "U01" . RDF Triples: :User a sh:NodeShape, rdfs:Class; sh:property [ sh:path :User/U_ID; sh:nodeKind. sh:Literal; sh:maxCount 1; sh:minCount 1; sh:datatype xsd:integer ]; un:uniqueValuesForClass [un:unqProp :User/U_ID; un:unqForClass :User ]. Class Datatype Property
  • 11. Properties of Rewriting T
      ◦ T is constraint preserving, but not semantics preserving: D ⊨ Σ ⟺ M(D) ⊨ T(V, Σ) does not hold for every R and every set Σ of SQL constraints on R, because M:
          ▪ generates RDF terms from the active domain of the database, i.e., ignores nulls;
          ▪ generates IRIs for tuples with a rule that is an injective mapping on PK values, so tuples with duplicate PK values receive a single IRI.
      ◦ T is weakly semantics preserving, i.e., D ⊨ Σ ⟺ M(D) ⊨ T(V, Σ) for all DB instances D that satisfy their key constraints σ.
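The two reasons given on slide 11 can be seen on the User table from slide 10. The following is a minimal sketch, not the W3C algorithm: the base IRI, the helper names and the decision to skip rows with a null key are assumptions made for illustration. It shows that a key-violating instance and a key-respecting one can yield the same graph, so no check on the graph alone can tell them apart.

```python
# Hypothetical mini direct mapping for a single-column table User(U_ID).
# BASE and both helper names are assumptions, not part of the slides.
BASE = "http://example.org/"


def direct_map_user(rows):
    """Map rows of User(U_ID) to a set of RDF triples (as Python tuples)."""
    triples = set()
    for (u_id,) in rows:
        if u_id is None:
            # Only the active domain is mapped: a null key yields no terms.
            continue
        subject = f"{BASE}User/U_ID={u_id}"
        triples.add((subject, "rdf:type", f"{BASE}User"))
        triples.add((subject, f"{BASE}User/U_ID", u_id))
    return triples


def has_exactly_one_key_value(graph):
    """Check the minCount 1 / maxCount 1 part of the shape for every User."""
    users = {s for (s, p, o) in graph if p == "rdf:type"}
    return all(
        len({o for (s, p, o) in graph
             if s == u and p == f"{BASE}User/U_ID"}) == 1
        for u in users
    )


# Instance from slide 10: it violates the primary key (duplicate and null).
violating = [("U01",), ("U01",), (None,)]
# A key-respecting instance with the same non-null value.
valid = [("U01",)]

graph_violating = direct_map_user(violating)
graph_valid = direct_map_user(valid)

print(graph_violating == graph_valid)
print(has_exactly_one_key_value(graph_violating))
```

Because the triples are collected in a set, the duplicate row contributes nothing new, which mirrors the "single IRI" argument above.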
  • 12. Research Question
      ◦ The constraint rewriting T for the monotone Direct Mapping M is not semantics preserving once relational data violating key constraints are considered.
      ◦ Besides the weak semantics translation between SQL constraints and SHACL for Direct Mapping, does any other strong⁵ semantics correspondence exist?
      ◦ D ⊨ Σ ⟹ M(D) ⊨ T(V, Σ), where T(V, Σ) is maximal?
      ◦ "Maximal" means that any other SHACL constraint is either not implied by the source constraints Σ w.r.t. the mapping M, or subsumed by the maximally implied set T(V, Σ) of SHACL shapes.
      ◦ What would be the definition of such a constraint rewriting T?
      ⁵ One-to-one semantics translation between SQL constraints and SHACL.
  • 13. R2RML: RDB to RDF Mapping Language
      ◦ An RDB to RDF mapping M is a finite set of assertions of the form "Query ⟶ Triple Pattern".
      ◦ Example:
          select S_id from student ⟶ ⟨iri1(S_id), rdf:type, Student⟩.
          select C_id from course ⟶ ⟨iri2(C_id), rdf:type, Course⟩.
          select S_id, C_id from student, course where student.Code = course.C_id ⟶ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩.
      ◦ Schema:
          create table course (C_id varchar primary key, Title varchar unique);
          create table student (S_id integer primary key, Name varchar, Code varchar not null references course(C_id));
      ◦ Instance (student.Code is a foreign key to course.C_id):
          student: S_id | Name | Code
                   011  | Ida  | CS40
                   012  |      | CS20
          course:  C_id | Title
                   CS40 | Logic
                   CS20 | Database
                   CS50 | Data Eng
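As a rough illustration of how a set of "Query ⟶ Triple Pattern" assertions produces a graph, the sketch below runs the three example queries with Python's built-in sqlite3 module. It is not an R2RML processor: the IRI templates behind iri1 and iri2, the `apply_mapping` helper and the NULL used for the blank Name of student 012 are assumptions, and the integer column stores 011 and 012 as 11 and 12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table course (C_id varchar primary key, Title varchar unique);
    create table student (S_id integer primary key, Name varchar,
                          Code varchar not null references course(C_id));
    insert into course values ('CS40', 'Logic'), ('CS20', 'Database'),
                              ('CS50', 'Data Eng');
    -- The slide leaves the Name of student 012 blank; NULL is assumed here.
    insert into student values (11, 'Ida', 'CS40'), (12, NULL, 'CS20');
""")


def iri1(s_id):
    """Hypothetical IRI template for students."""
    return f"http://example.org/student/{s_id}"


def iri2(c_id):
    """Hypothetical IRI template for courses."""
    return f"http://example.org/course/{c_id}"


# Each assertion pairs a source query with a function building one triple.
MAPPING = [
    ("select S_id from student",
     lambda r: (iri1(r[0]), "rdf:type", "Student")),
    ("select C_id from course",
     lambda r: (iri2(r[0]), "rdf:type", "Course")),
    ("select S_id, C_id from student, course "
     "where student.Code = course.C_id",
     lambda r: (iri1(r[0]), "enrolledFor", iri2(r[1]))),
]


def apply_mapping(connection, mapping):
    """Evaluate every source query and instantiate its triple pattern."""
    graph = set()
    for query, pattern in mapping:
        for row in connection.execute(query):
            graph.add(pattern(row))
    return graph


graph = apply_mapping(conn, MAPPING)
for triple in sorted(graph):
    print(triple)
```

Course CS50 gets a type triple but no incoming enrolledFor edge, since no student row references it.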
  • 14. Constraint Rewriting⁶ T for Simple RDB to RDF Mapping M⁷
      ◦ A maximal semantics preserving rewriting T for a simple mapping M:
          T : Q ⟶ P(S), where
          ▪ Q is the set of all pairs (M, Σ) s.t. M is a simple RDB-to-RDF mapping and Σ is a set of SQL constraints, i.e., keys and others;
          ▪ S is the set of all SHACL shapes;
          ▪ P(S) consists of the maximal sets of SHACL shapes.
      ⁶ Mapping Relational Database Constraints to SHACL, ISWC 2022.
      ⁷ Simplifying M further yields Direct Mapping; the results for T therefore also apply to Direct Mapping.
  • 15. Simple RDB to RDF Mapping M
      ◦ A simple mapping M is a finite set of assertions of the form Q ⟶ ψ, where
          ▪ Q is an SP or SPJ query over a relational source D, called the source query, s.t. the selections considered are those that filter out nulls and the joins considered are equality joins along foreign keys;
          ▪ ψ is a graph triple pattern.
      ◦ Example:
          π_{S_id} σ_{¬isNull(S_id)}(student) ⟶ ⟨iri1(S_id), rdf:type, Student⟩.
          π_{C_id} σ_{¬isNull(C_id)}(course) ⟶ ⟨iri2(C_id), rdf:type, Course⟩.
          π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) ⟶ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
          where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
  • 16. The Rewriting Γ
      ◦ Rewriting steps. Let Q ⟶ ψ be a mapping defined on a schema R with constraints Σ. Then Γ computes:
          1. Σ|Q, i.e., Σ propagated to att(Q);
          2. Σ|Q ⊩ σ_{X→Y}, where X, Y ⊆ att(Q), i.e., the Σ-implied data dependencies σ⁸ on the view projected by Q;
          3. the SHACL constraints on scheme(ψ), based on Σ|Q ⊩ σ and the mappings.
      ◦ [Diagram; labels: database (R, Σ, D), mapping M, constraints Σ, instance D, rewriting Γ, RDF graph M(D), SHACL constraint Γ(M, Σ)]
      ⁸ Data dependencies that also apply to databases with nulls.
  • 17. Example of Γ: Inputs R, Σ and M
      ◦ Schema R with Σ (Defn.):
          create table course (C_id varchar primary key, Title varchar unique);
          create table student (S_id integer primary key, Name varchar, Code varchar not null references course(C_id));
      ◦ Instance (student.Code is a foreign key to course.C_id):
          student: S_id | Name | Code
                   011  | Ida  | CS40
                   012  |      | CS20
          course:  C_id | Title
                   CS40 | Logic
                   CS20 | Database
                   CS50 | Data Eng
      ◦ Simple mapping M (Defn.):
          π_{S_id} σ_{¬isNull(S_id)}(student) ⟶ ⟨iri1(S_id), rdf:type, Student⟩.
          π_{C_id} σ_{¬isNull(C_id)}(course) ⟶ ⟨iri2(C_id), rdf:type, Course⟩.
          π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) ⟶ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
          where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
  • 18. Example of Γ: Computing att(Q), Σ|Q and Σ|Q ⊩ σ
      ◦ Mapping assertion:
          π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) ⟶ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩,
          where Q1 = σ_{¬isNull(S_id)∧¬isNull(Code)}(student) and Q2 = σ_{¬isNull(C_id)}(course).
      ◦ Steps 1-2 of Γ:
          ▪ att(Q1) = {S_id, Code} and {UNQ(S_id), NN(S_id), NN(Code)} ⊆ Σ|Q1, hence Σ|Q1 ⊩ FD_{S_id→Code};
          ▪ att(Q2) = {C_id} and {UNQ(C_id), NN(C_id)} ⊆ Σ|Q2, hence Σ|Q2 ⊩ UFD_{C_id→C_id};
          ▪ att(Q) = {S_id, C_id} and FK(Code, student, C_id, course) ∈ Σ|Q1 ∩ Σ|Q2, hence Σ|Q ⊩ FD_{S_id→C_id}, since Σ|Q1 ⊩ FD_{S_id→Code} and since Σ ⊩ UFD_{C_id→C_id} implies Σ ⊩ FD_{C_id→C_id}.
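The derived dependency FD_{S_id→C_id} on the view projected by Q can be sanity-checked on a concrete instance. The sketch below is only an instance-level check, not the implication procedure of Γ: passing it shows the absence of a counterexample in this data, nothing more. The `fd_holds` helper, the NULL Name for student 012 and the extra row 13 are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.executescript("""
    create table course (C_id varchar primary key, Title varchar unique);
    create table student (S_id integer primary key, Name varchar,
                          Code varchar not null references course(C_id));
    insert into course values ('CS40', 'Logic'), ('CS20', 'Database'),
                              ('CS50', 'Data Eng');
    -- Rows 11 and 12 follow the slides (Name of 12 assumed NULL);
    -- row 13 is hypothetical, added so that C_id -> S_id visibly fails.
    insert into student values (11, 'Ida', 'CS40'), (12, NULL, 'CS20'),
                               (13, NULL, 'CS40');
""")

# SQL rendering of the source query Q: null-filtering selections plus an
# equality join along the foreign key.
VIEW_Q = """
    select S_id, C_id
    from student join course on student.Code = course.C_id
    where S_id is not null and C_id is not null
"""


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight, else returns the stored one.
        if seen.setdefault(key, val) != val:
            return False
    return True


view = conn.execute(VIEW_Q).fetchall()
print(fd_holds(view, ["S_id"], ["C_id"]))  # each student has one course
print(fd_holds(view, ["C_id"], ["S_id"]))  # a course may have many students
```

Only the first direction is implied by Σ in the example, which is why step 3 can emit an exact cardinality for enrolledFor on students but not for its inverse on courses.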
  • 19. Example of Γ: Computing Γ(M, Σ)
      ◦ Simple mapping M:
          π_{S_id} σ_{¬isNull(S_id)}(student) ⟶ ⟨iri1(S_id), rdf:type, Student⟩.
          π_{C_id} σ_{¬isNull(C_id)}(course) ⟶ ⟨iri2(C_id), rdf:type, Course⟩.
          π_{S_id,C_id} σ_{¬isNull(S_id)∧¬isNull(C_id)}(Q1 ⋈_{Code=C_id} Q2) ⟶ ⟨iri1(S_id), enrolledFor, iri2(C_id)⟩.
      ◦ Steps 1-2 of Γ (Defn. Σ|Q ⊩ σ): Σ|Q1 ⊩ FD_{S_id→Code}, Σ|Q2 ⊩ UFD_{C_id→C_id}, Σ|Q ⊩ FD_{S_id→C_id}.
      ◦ Step 3 of Γ (Defn. shapes Γ(M, Σ) with implicit class-based target):
          ▪ ⟨Student, τ_Student, φ_Student⟩, since m_Student ∈ M, s.t.
              (≥0 enrolledFor.Course) ∈ φ_Student, since M, ι(m, M) = A;
              (≤0 enrolledFor.¬Course) ∈ φ_Student, since M is simple;
              (=1 enrolledFor.Course) ∈ φ_Student, since Σ|Q ⊩ FD_{S_id→C_id}.
          ▪ ⟨Course, τ_Course, φ_Course⟩, since m_Course ∈ M, s.t.
              (≥0 enrolledFor⁻.Student) ∈ φ_Course, since M, ι(m, M) = A;
              (≤0 enrolledFor⁻.¬Student) ∈ φ_Course, since M is simple.
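To make the shapes of step 3 concrete, the sketch below checks the non-trivial ones against a small graph written as Python tuples. It is not a SHACL validator: the node names, the `count_qualified` and `conforms` helpers, and the reading of (≤0 p.¬C) as "no p-value outside class C" are assumptions chosen for illustration. The (≥0 ...) constraints hold trivially and are not checked.

```python
def count_qualified(graph, node, prop, cls, inverse=False, negate=False):
    """Count prop-values of node that are typed cls (or not, if negate).

    With inverse=True the prop-predecessors of node are counted instead.
    """
    if inverse:
        values = {s for (s, p, o) in graph if p == prop and o == node}
    else:
        values = {o for (s, p, o) in graph if p == prop and s == node}
    typed = {v for v in values if (v, "rdf:type", cls) in graph}
    return len(values - typed) if negate else len(typed)


def conforms(graph):
    """Check (=1 enrolledFor.Course), (<=0 enrolledFor.not Course) on every
    Student and (<=0 enrolledFor^-.not Student) on every Course."""
    students = {s for (s, p, o) in graph
                if p == "rdf:type" and o == "Student"}
    courses = {s for (s, p, o) in graph
               if p == "rdf:type" and o == "Course"}
    students_ok = all(
        count_qualified(graph, n, "enrolledFor", "Course") == 1
        and count_qualified(graph, n, "enrolledFor", "Course",
                            negate=True) == 0
        for n in students
    )
    courses_ok = all(
        count_qualified(graph, n, "enrolledFor", "Student",
                        inverse=True, negate=True) == 0
        for n in courses
    )
    return students_ok and courses_ok


# Graph corresponding to the example instance (node names are hypothetical).
G = {
    ("s11", "rdf:type", "Student"), ("s12", "rdf:type", "Student"),
    ("cCS40", "rdf:type", "Course"), ("cCS20", "rdf:type", "Course"),
    ("cCS50", "rdf:type", "Course"),
    ("s11", "enrolledFor", "cCS40"), ("s12", "enrolledFor", "cCS20"),
}
# A second course for s11 breaks the exact-cardinality constraint.
G_bad = G | {("s11", "enrolledFor", "cCS20")}

print(conforms(G))
print(conforms(G_bad))
```

The second graph could not be produced from an instance satisfying Σ, because S_id is a key of student and Code is a single non-null column.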
  • 20. Properties of Γ
      ◦ Γ is semantics preserving, i.e., for a mapping set M defined over a relational schema R with source constraints Σ, and an arbitrary instance D of R: D ⊨ Σ ⟹ M(D) ⊨ Γ(M, Σ).
      ◦ Γ is maximal semantics preserving, i.e., for every S with sch(S) ⊆ sch(M) and Σ ⊨_M S, where Σ ⊨_M S means that ∀D. (D ⊨ Σ ⟹ M(D) ⊨ S), we have ∀G. (G ⊨ Γ(M, Σ) ⟹ G ⊨ S).
      ◦ Γ is monotone, i.e., for every two mapping sets M1 ⊆ M2 defined on a schema R with Σ: ∀G. (G ⊨ Γ(M2, Σ) ⟹ G ⊨ Γ(M1, Σ)).
  • 21. Final Remarks
      ◦ The constraint rewriting Γ extends an RDB to RDF mapping M with SHACL, where M is a simple RDB to RDF mapping.
      ◦ Γ is maximal semantics preserving and monotone.
      ◦ Work in progress and future goals:
          ▪ extension of Γ beyond simple R2RML, i.e., to expressive R2RML with monotonic source queries;
          ▪ extension of Γ in an OBDA platform;
          ▪ SPARQL query simplification and optimization with Γ;
          ▪ semantics preserving SQL-to-SPARQL translation with Γ.
  • 22. References
      ◦ Ratan Bahadur Thapa and Martin Giese. A Source-to-Target Constraint Rewriting for Direct Mapping. International Semantic Web Conference, 21–38, 2021, Springer.
      ◦ Ratan Bahadur Thapa and Martin Giese. Mapping Relational Database Constraints to SHACL. International Semantic Web Conference, 214–230, 2022, Springer.
      ◦ Juan F. Sequeda, Marcelo Arenas and Daniel P. Miranker. On Directly Mapping Relational Databases to RDF and OWL. Proc. 21st Intl. Conf. on World Wide Web, 649–658, 2012, ACM.
      ◦ Marcelo Arenas, Alexandre Bertails, Eric Prud'hommeaux and Juan F. Sequeda. A Direct Mapping of Relational Data to RDF. W3C Recommendation, 2012.
      ◦ Souripriya Das, Seema Sundara and Richard Cyganiak. R2RML: RDB to RDF Mapping Language. W3C Recommendation, 2012.
      ◦ Holger Knublauch and Dimitris Kontokostas. Shapes Constraint Language (SHACL). W3C Recommendation, 2017.