Relational Databases, RDF Graphs and
Constraints
Ratan Bahadur Thapa
PhD candidate at IFI/SIRIUS
University of Oslo
April 2, 2023
Plan
▶ Relational Databases
▶ RDF graph
▶ Relational-to-RDF Mappings
▶ Direct Mapping
▶ Constraint Rewriting for Direct Mapping
▶ Constraint Rewriting for R2RML
▶ SPARQL Query Optimization With SHACL
▶ Open Questions?
Relational Database
▶ E. F. Codd, "A Relational Model of Data for Large Shared Data Banks", IBM, 1970.
▶ First commercial implementation of SQL: Oracle V2, June 1979 (SQL was standardized in 1986 as SQL-86).
▶ Closed World Assumption: what is not known to be true is assumed to be false.
▶ E.g., consider the constraint ∀x. PhD(x) → Employee(x) in the relational model.
▶ In SQL DDL (i.e., schema + constraints):
create table Employee (E_id int not null, primary key (E_id));
create table PhD (P_id int not null, primary key (P_id),
    foreign key (P_id) references Employee (E_id));
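The effect of the foreign key above can be tried out with a small sketch. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in database; the helper names `build_db` and `try_insert_phd` are hypothetical and exist only for this illustration.

```python
import sqlite3

def build_db() -> sqlite3.Connection:
    """Create the Employee/PhD schema from the slide in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Employee (E_id INTEGER NOT NULL, PRIMARY KEY (E_id))")
    conn.execute(
        "CREATE TABLE PhD (P_id INTEGER NOT NULL, PRIMARY KEY (P_id), "
        "FOREIGN KEY (P_id) REFERENCES Employee (E_id))"
    )
    return conn

def try_insert_phd(conn: sqlite3.Connection, p_id: int) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO PhD (P_id) VALUES (?)", (p_id,))
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_db()
conn.execute("INSERT INTO Employee (E_id) VALUES (1)")
accepted = try_insert_phd(conn, 1)   # 1 is an employee, so PhD(1) is allowed
rejected = try_insert_phd(conn, 2)   # 2 is not an employee, so the FK blocks it
```

In this sketch every PhD row must already be an Employee row, which is the relational reading of ∀x. PhD(x) → Employee(x).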
RDF Graph
▶ Composed of triples "(Subject, Predicate, Object)", e.g.,
dbp:Norway dbp-ont:Capital dbp:Oslo .
dbp:Oslo dbp-ont:population "673469"^^xsd:integer .
▶ The syntax does not explicitly differentiate between data and schema
▶ The syntax cannot express constraints
▶ SHACL: a language for validating RDF graphs against a set of conditions.
  ▶ makes the Closed-World Assumption
  ▶ "Schema + Constraint" language for RDF
  ▶ W3C Recommendation since 2017
▶ E.g., shape := (name, target definition, constraint definition)
:EmployeeNode a sh:NodeShape ;
    sh:targetClass :Employee ;
    sh:property [ sh:path :hasAddress ;
                  sh:nodeKind sh:Literal ;
                  sh:maxCount 1 ; sh:minCount 1 ;
                  sh:datatype xsd:string ] ;
    sh:property [ sh:path :hasAddress ;
                  dash:uniqueValueForClass :Employee ] .
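What the :EmployeeNode shape demands of a graph can be spelled out by hand. The sketch below is not a SHACL engine: it hard-codes the four conditions of this one shape (minCount, maxCount, datatype, and uniqueness of the value within the class) over a toy triple list. The `Literal` type, the sample data and the function name `check_employee_shape` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple

class Literal(NamedTuple):
    """A toy RDF literal; IRIs are represented as plain strings."""
    lexical: str
    datatype: str = "xsd:string"

# Hypothetical sample graph as (subject, predicate, object) triples.
GRAPH = [
    (":Ida", "rdf:type", ":Employee"),
    (":Ida", ":hasAddress", Literal("Oslo")),
    (":Ingrid", "rdf:type", ":Employee"),
    (":Ingrid", ":hasAddress", Literal("Bergen")),
]

def check_employee_shape(triples):
    """Return violation messages for the hand-coded :EmployeeNode conditions."""
    targets = {s for s, p, o in triples if p == "rdf:type" and o == ":Employee"}
    addresses = defaultdict(list)
    for s, p, o in triples:
        if p == ":hasAddress" and s in targets:
            addresses[s].append(o)

    violations = []
    owners = defaultdict(set)  # address value -> employees that carry it
    for node in sorted(targets):
        values = addresses[node]
        if len(values) < 1:
            violations.append(f"{node}: sh:minCount 1 violated")
        if len(values) > 1:
            violations.append(f"{node}: sh:maxCount 1 violated")
        for value in values:
            if not (isinstance(value, Literal) and value.datatype == "xsd:string"):
                violations.append(f"{node}: sh:datatype xsd:string violated")
            owners[value].add(node)
    for value, nodes in owners.items():
        if len(nodes) > 1:
            violations.append(f"{sorted(nodes)}: dash:uniqueValueForClass violated")
    return violations

report = check_employee_shape(GRAPH)  # empty list: the sample graph conforms
```

The closed-world flavour shows in the first check: an employee with no recorded address counts as a violation rather than as missing information.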
Relational-to-RDF Mapping: Direct Mapping and R2RML
[Diagram: Relational Database → W3C Mapping → RDF. Quality assurance and validation?]
INPUT
▶ Database schema and instance D
▶ Key constraints: PKs and FKs
▶ Other constraints: nullability, uniqueness and data types
OUTPUT
▶ RDF graph
▶ Primary descriptors?
▶ Does not explicitly differentiate between data and schema
▶ Constraint-less, i.e., RDF syntax cannot express constraints
Challenges with the "RDB2RDF graph":
▶ Understandability and usability
▶ Verifying compliance of a dataset w.r.t. certain requirements or policies
▶ Detecting metadata errors, query optimization, etc.
Direct Mapping M of Sequeda et al. [3]
Integrates the W3C Direct Mapping [1] with the SQL-schema-to-OWL mapping [2].
[1] Arenas et al., A Direct Mapping of Relational Data to RDF, W3C Recommendation, 2012.
[2] Tirmizi et al., Translating SQL Applications to the Semantic Web, DEXA 2008.
[3] Sequeda et al., On Directly Mapping Relational Databases to RDF and OWL, WWW 2012.
Properties of Sequeda et al.'s Direct Mapping M
▶ M is not semantics preserving, i.e., it is not the case that, for every schema R, every set σ of PKs and FKs on R, and every instance D of R,
D ⊨ σ ⇐⇒ M(D) ⊨ OWL axioms
E.g., the table Employee

E_ID | Name     | Position
E01  | Ida      | Post Doc
E01  | Cathrine | PhD

is mapped to the graph

:E01 rdf:type :baseIRI/Employee .
:E01 :baseIRI/Employee#Name "Ida", "Cathrine" .
:E01 :baseIRI/Employee#Position "Post Doc", "PhD" .

Assuming E_ID is the primary key, the two rows violate it, yet both rows are mapped to the single node :E01.
▶ No monotone M is semantics preserving (Sequeda et al., Thm. 3).
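The Employee example can be replayed with a much simplified direct mapping. The sketch below is only an approximation of the mapping on the slide: the functions `direct_map` and `violates_primary_key` are hypothetical, E_ID is assumed to be the primary key, and the IRI scheme is copied from the example graph rather than from the W3C specification.

```python
# The two rows from the slide; both carry the same key value E01.
ROWS = [
    {"E_ID": "E01", "Name": "Ida", "Position": "Post Doc"},
    {"E_ID": "E01", "Name": "Cathrine", "Position": "PhD"},
]

def direct_map(table, pk, rows, base=":baseIRI/"):
    """Map each row to triples; the subject is derived from the key value only."""
    triples = set()
    for row in rows:
        subject = f":{row[pk]}"
        triples.add((subject, "rdf:type", f"{base}{table}"))
        for column, value in row.items():
            if column != pk and value is not None:
                triples.add((subject, f"{base}{table}#{column}", value))
    return triples

def violates_primary_key(pk, rows):
    """True if two rows share a key value."""
    keys = [row[pk] for row in rows]
    return len(keys) != len(set(keys))

graph = direct_map("Employee", "E_ID", ROWS)
subjects = {s for s, _, _ in graph}
pk_broken = violates_primary_key("E_ID", ROWS)
```

Here `pk_broken` is true on the relational side, while the graph simply contains one node with two names and two positions, so nothing in the plain triples records that a key was violated.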
Constraint Rewriting Γ for Direct Mapping M [4]
Extend the Direct Mapping M with rules Γ that rewrite the SQL schema and constraints into SHACL.
[Diagram: constraints Σ (key constraints σ and data constraints δ), schema R and instance D on the relational side; Ms, V, Γ and Mi as the mapping components; shapes S and graph G on the RDF side.]
[4] A Source-to-Target Constraint Rewriting for Direct Mapping, ISWC 2021.
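To give a feel for what rewriting SQL constraints into SHACL can look like, here is a hypothetical sketch. The rules in it are assumptions modelled on the :EmployeeNode shape shown earlier (NOT NULL becomes sh:minCount 1, a single-valued column becomes sh:maxCount 1, UNIQUE becomes dash:uniqueValueForClass); they are not the rules Γ of the cited paper, and the names `Column`, `rewrite_column`, `to_turtle` and the datatype table are invented for the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    not_null: bool = False
    unique: bool = False

# Assumed datatype correspondence, for illustration only.
SQL_TO_XSD = {"int": "xsd:integer", "varchar": "xsd:string"}

def rewrite_column(table, col):
    """Rewrite one column's constraints into (SHACL predicate, value) pairs."""
    parts = [
        ("sh:path", f":{table}#{col.name}"),
        ("sh:nodeKind", "sh:Literal"),
        ("sh:datatype", SQL_TO_XSD[col.sql_type]),
        # One value per row; this matches one value per node only if rows
        # and nodes correspond one-to-one, i.e. if the key constraints hold.
        ("sh:maxCount", "1"),
    ]
    if col.not_null:
        parts.append(("sh:minCount", "1"))
    if col.unique:
        parts.append(("dash:uniqueValueForClass", f":{table}"))
    return parts

def to_turtle(table, columns):
    """Render a node shape for a table with at least one column."""
    head = f":{table}Node a sh:NodeShape ;\n    sh:targetClass :{table} ;\n"
    props = []
    for col in columns:
        body = " ; ".join(f"{p} {v}" for p, v in rewrite_column(table, col))
        props.append(f"    sh:property [ {body} ]")
    return head + " ;\n".join(props) + " ."

shape = to_turtle("Employee", [Column("Address", "varchar", not_null=True, unique=True)])
```

The comment on sh:maxCount hints at why a guarantee of this kind is naturally stated only for instances that satisfy their key constraints.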
Properties of Rewriting Γ
▶ Γ is weakly semantics preserving, i.e.,
D ⊨ Σ ⇐⇒ M(D) ⊨ Γ(V, Σ),
for all DB instances D that satisfy their key constraints σ.
Question?
Beyond the weak semantic correspondence between SQL constraints and SHACL, can we define a constraint rewriting Γ such that
D ⊨ Σ ⟹ M(D) ⊨ Γ(V, Σ),
where Γ(V, Σ) is maximal?
Constraint Rewriting Γ [5] for simple R2RML M [6]
A simple mapping M is a finite set of assertions of the form
Q → ψ,
where
▶ Q is an SP or SPJ query over a relational source D that
  ▶ filters out nulls
  ▶ performs equality joins along foreign keys
▶ ψ is a triple pattern
[5] Mapping Relational Database Constraints to SHACL, ISWC 2022.
[6] Das et al., R2RML: RDB to RDF Mapping Language, W3C Recommendation, 2012.
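A single mapping assertion Q → ψ can be pictured as a query whose answer rows are poured into a triple template. The sketch below assumes SQLite as the relational source; the table contents, the IRI scheme `:Employee/<id>` and the helper names `psi` and `apply_mapping` are made up for the illustration and are not R2RML syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (E_id INTEGER PRIMARY KEY, Address TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "Oslo"), (2, None), (3, "Bergen")],
)

# Q: a select-project (SP) query that filters out nulls.
Q = "SELECT E_id, Address FROM Employee WHERE Address IS NOT NULL ORDER BY E_id"

def psi(e_id, address):
    """psi: the triple pattern instantiated with one answer row of Q."""
    return (f":Employee/{e_id}", ":hasAddress", address)

def apply_mapping(connection, query, template):
    """Evaluate Q over the source and instantiate psi once per answer row."""
    return [template(*row) for row in connection.execute(query)]

triples = apply_mapping(conn, Q, psi)
```

Employee 2 has no address, so the null filter in Q keeps it from producing a triple at all.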
Constraint Rewriting Γ for simple R2RML M
[Diagram: a database (R, Σ, D) with constraints Σ and instance D; the mapping M takes D to the RDF graph M(D); the rewriting Γ takes M and Σ to the SHACL constraints Γ(M, Σ).]
Rewriting steps:
▶ Let Q → ψ be a mapping defined on a schema R with constraints Σ. Then Γ computes:
1. Σ|Q, i.e., Σ propagated to att(Q)
2. Σ|Q ⊩ σ_{X→Y}, where X, Y ⊆ att(Q), i.e., the Σ-implied data dependencies σ [7] on the view projected by Q
3. SHACL constraints on scheme(ψ), based on Σ|Q ⊩ σ and the mappings
[7] Data dependencies that also apply to databases with nulls.
Properties of Γ
▶ Γ is maximally semantics preserving, i.e., for every S with sch(S) ⊆ sch(M) such that Σ ⊨_M S, meaning that
∀D. (D ⊨ Σ ⟹ M(D) ⊨ S),
we have
∀G. (G ⊨ Γ(M, Σ) ⟹ G ⊨ S).
Next
SPARQL query optimization with SHACL
SPARQL query optimization with SHACL [8]
In short, we aim to find optimal S-equivalent queries Q′ of the original query Q, where
Q′ ≡_S Q iff ∀G. (G ⊨ S ⟹ Q′^G = Q^G).
We propose a set of query rewriting rules that, based on SHACL guarantees,
1. reduce OPTIONAL to JOIN patterns
2. remove JOIN patterns
3. eliminate the DIST (DISTINCT) operator, etc.
[8] Manuscript, 2023.
Example
Consider the RDF graph below, which conforms to the SHACL shape s that follows it, both written in Turtle syntax:
:Ida a :Employee ;
    :hasID "001"^^xsd:int ;
    :hasAddress "Oslo" .
:Ingrid a :Employee ;
    :hasID "002"^^xsd:int ;
    :hasAddress "Bergen" .

:EmployeeNode a sh:NodeShape ;
    sh:targetClass :Employee ;
    sh:property [ sh:path :hasAddress ;
                  sh:nodeKind sh:Literal ;
                  sh:maxCount 1 ; sh:minCount 1 ;
                  sh:datatype xsd:string ] ;
    sh:property [ sh:path :hasAddress ;
                  dash:uniqueValueForClass :Employee ] .
An Example of Query Rewriting
Assume a SPARQL query
Dist(Proj_{xy}(Opt_⊤(Employee(x), hasAddress(x, y))))
over a graph G that satisfies shape s. Then:
▶ Since G ⊨ ∀x. Employee(x) → ∃y. hasAddress(x, y) (resp., G ⊨ ∀x∀y∀y′. hasAddress(x, y) ∧ hasAddress(x, y′) → y = y′), the query rewrites to
Dist(Proj_{xy}(Join(Employee(x), hasAddress(x, y)))).
▶ Since G ⊨ ∀x. Employee(x) → ∃y. hasAddress(x, y) and G ⊨ ∀x∀x′∀y. hasAddress(x, y) ∧ hasAddress(x′, y) → x = x′, it rewrites further to
Proj_{xy}(Join(Employee(x), hasAddress(x, y))).
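The three queries above can be compared on the example graph with a toy evaluator. This is a sketch under simplifying assumptions: solution mappings are Python dicts, only the operators used in the example are implemented, the filter ⊤ of Opt is left implicit, and all function names are invented for the illustration.

```python
GRAPH = {
    (":Ida", "rdf:type", ":Employee"),
    (":Ida", ":hasAddress", "Oslo"),
    (":Ingrid", "rdf:type", ":Employee"),
    (":Ingrid", ":hasAddress", "Bergen"),
}

def employee(graph):
    """Solutions of Employee(x)."""
    return [{"x": s} for s, p, o in sorted(graph) if p == "rdf:type" and o == ":Employee"]

def has_address(graph):
    """Solutions of hasAddress(x, y)."""
    return [{"x": s, "y": o} for s, p, o in sorted(graph) if p == ":hasAddress"]

def compatible(m1, m2):
    """Two mappings are compatible if they agree on their shared variables."""
    return all(m2[k] == v for k, v in m1.items() if k in m2)

def join(left, right):
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def opt(left, right):
    """Left outer join: keep a left solution even when nothing on the right matches."""
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out

def proj(variables, solutions):
    return [tuple(m.get(v) for v in variables) for m in solutions]

def dist(rows):
    return list(dict.fromkeys(rows))  # drop duplicates, keep first occurrences

def run_all(graph):
    e, h = employee(graph), has_address(graph)
    original = dist(proj(("x", "y"), opt(e, h)))
    with_join = dist(proj(("x", "y"), join(e, h)))
    without_dist = proj(("x", "y"), join(e, h))
    return original, with_join, without_dist

original, with_join, without_dist = run_all(GRAPH)

# A graph that breaks sh:minCount 1: :Kari is an employee without an address.
broken = GRAPH | {(":Kari", "rdf:type", ":Employee")}
broken_original, broken_join, _ = run_all(broken)
```

On the conforming graph all three queries return the same rows. On the `broken` graph the Opt version keeps :Kari with an unbound address while the Join version drops her, which is why the rewriting is only claimed for graphs that satisfy the shape.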
Property of Query Rewriting Rules: Confluent Reduction
A SPARQL query is a graph pattern P defined by the grammar
P := B | Filter_F(P) | Union(P1, P2) | Join(P1, P2) | Minus(P1, P2) | Diff_F(P1, P2) | Opt_F(P1, P2) | Proj_L(P) | Dist(P)
A shape target τ_s and a constraint ϕ_s are expressions defined by the grammar
τ_s := sh:targetClass C | sh:targetSubjectOf P | sh:targetObjectOf P
ϕ_s := ≥_n α. β | ≤_n α. β | ▷τs α | α1 = α2 | ϕ_s ∧ ϕ_s
β := ⊤ | C | s′ | ¬β
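A fragment of the pattern grammar can be written down as a small syntax tree, with one rewriting rule applied to it. The sketch is hypothetical: it covers only the operators of the running example, drops the filter subscript of Opt, and its single rule (turn Opt into Join when the shape guarantees at least one value) and the `MIN_COUNT` table are assumptions, not the rule set of the manuscript.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Atom:
    predicate: str
    args: Tuple[str, ...]

@dataclass(frozen=True)
class Join:
    left: "Pattern"
    right: "Pattern"

@dataclass(frozen=True)
class Opt:
    left: "Pattern"
    right: "Pattern"

@dataclass(frozen=True)
class Proj:
    variables: Tuple[str, ...]
    pattern: "Pattern"

@dataclass(frozen=True)
class Dist:
    pattern: "Pattern"

Pattern = Union[Atom, Join, Opt, Proj, Dist]

# Assumed summary of the shape: (class, path) -> guaranteed sh:minCount.
MIN_COUNT = {("Employee", "hasAddress"): 1}

def rewrite(p, min_count):
    """Bottom-up pass that replaces Opt by Join where the shape allows it."""
    if isinstance(p, Opt):
        left, right = rewrite(p.left, min_count), rewrite(p.right, min_count)
        guaranteed = (
            isinstance(left, Atom) and isinstance(right, Atom)
            and len(left.args) == 1 and len(right.args) == 2
            and left.args[0] == right.args[0]
            and min_count.get((left.predicate, right.predicate), 0) >= 1
        )
        return Join(left, right) if guaranteed else Opt(left, right)
    if isinstance(p, Join):
        return Join(rewrite(p.left, min_count), rewrite(p.right, min_count))
    if isinstance(p, Proj):
        return Proj(p.variables, rewrite(p.pattern, min_count))
    if isinstance(p, Dist):
        return Dist(rewrite(p.pattern, min_count))
    return p  # atoms are left unchanged

query = Dist(Proj(("x", "y"), Opt(Atom("Employee", ("x",)), Atom("hasAddress", ("x", "y")))))
rewritten = rewrite(query, MIN_COUNT)
```

Passing an empty table instead of `MIN_COUNT` leaves the query untouched, since without the shape there is no guarantee to justify the rule.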
Future Work?
▶ Constraint rewriting for expressive ontology-based (or bootstrap-based) mapping patterns
▶ Optimization of SPARQL path queries
▶ Optimization of ontology-mediated query rewriting
