A Source-to-Target Constraint rewriting for Direct Mapping
The 20th International Semantic Web Conference 2021
Ratan Bahadur Thapa
ratanbt@ifi.uio.no
W3C Direct Mapping M
M is a fixed set of mapping rules
Ratan Bahadur Thapa ratanbt@ifi.uio.no
A Source-to-Target Constraint rewriting for Direct Mapping 2nd April 2023 1 / 13
Relational Database → W3C Direct Mapping → RDF Graph
Input:
Database schema and Instance
Primary Key
Foreign Key
Output:
RDF Graph
Completely Automatic
M is a data mapping
M directly translates database instances into RDF triples
M is monotone
If database instances D ⊆ D′, then M(D) ⊆ M(D′)
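The monotonicity property can be illustrated with a toy stand-in for M. This is only a sketch: the function `direct_map`, the second row, and the triple layout are assumptions made for illustration, not the W3C rule set itself.

```python
def direct_map(table, pk, rows, base="baseIRI/"):
    """Toy stand-in for M: turn rows (dicts) of one table into a set of triples."""
    triples = set()
    for row in rows:
        subject = f"{base}{table}#{pk}={row[pk]}"
        triples.add((subject, "rdf:type", f"{base}{table}"))
        for column, value in row.items():
            if value is not None:  # NULLs produce no literal triple
                triples.add((subject, f"{base}{table}#{column}", str(value)))
    return triples


# D is contained in D_prime (the second row is made up for illustration).
D = [{"U_ID": "E01", "Name": "Ida"}]
D_prime = D + [{"U_ID": "E02", "Name": "Cathrine"}]

G = direct_map("User", "U_ID", D)
G_prime = direct_map("User", "U_ID", D_prime)

# Set inclusion on triples mirrors M(D) ⊆ M(D′).
print(G <= G_prime)
```

Because each row contributes its triples independently of the other rows, adding rows can only add triples, which is what the inclusion check shows for this pair of instances.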
Direct Mapping M Engine
IRI identifiers for tables, columns and foreign keys
Identifiers for tuples: IRI (if PK exists), otherwise Blank nodes
Triples: Table (tuples), Literal (attributes), Reference (FKs)
Example table User:
U_ID   Name   Position
E01    Ida    Post Doc
Table triples: one for every tuple of a table
<baseIRI/User#U_ID=E01> rdf:type <baseIRI/User> .
Literal triples: one for every attribute of a tuple
<baseIRI/User#U_ID=E01> <baseIRI/User#Name> "Ida" .
Reference triples: one for every FK of a table (if any exist)
IRI for table: <baseIRI/User>
IRI for columns: <baseIRI/User#Name>
IRI for tuples: <baseIRI/User#U_ID=E01>
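The table and literal triples for the User example can be generated with a few lines of Python. This is a minimal sketch under stated assumptions: `tuple_id` and `row_triples` are hypothetical helper names, `baseIRI/` is the placeholder used on the slides, and the blank-node label scheme is invented for illustration.

```python
BASE = "baseIRI/"  # placeholder base IRI, written as on the slides


def tuple_id(table, pk, row, index):
    """IRI built from the primary key if there is one, else a blank node."""
    if pk is not None:
        return f"<{BASE}{table}#{pk}={row[pk]}>"
    return f"_:{table}{index}"  # blank-node label scheme is an assumption


def row_triples(table, pk, row, index=0):
    """Return the table triple and the literal triples for one tuple."""
    subject = tuple_id(table, pk, row, index)
    triples = [f"{subject} rdf:type <{BASE}{table}> ."]
    for column, value in row.items():
        if value is not None:  # NULL attributes yield no literal triple
            triples.append(f'{subject} <{BASE}{table}#{column}> "{value}" .')
    return triples


user_row = {"U_ID": "E01", "Name": "Ida", "Position": "Post Doc"}
triples = row_triples("User", "U_ID", user_row)
for line in triples:
    print(line)
```

Passing `pk=None` switches the subject to a blank node, which corresponds to the case of a table without a primary key.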
Sequeda et al. Direct Mapping DM
Integrates the W3C Direct Mapping M and the SQL-schema-to-OWL mapping [1]
M_s rules identify (RDFS) vocabularies V
M_i rules translate database instances into RDF
OWL rules translate vocabularies V into OWL axioms
Diagram: schema R with key constraints Σ → M_s → vocabularies V → OWL rules → OWL axioms; instance D → M_i → graph G
[1] Tirmizi et al., Translating SQL Applications to the Semantic Web, DEXA 2008
DM
Example database (FKs: EmpProg.Emp → Employee.E_ID, EmpProg.Prog → Program.P_ID, Program.Proj → Project.B_ID):
Employee
E_ID   Name   Position
E01    Ida    Post Doc
Program
P_ID   Name                   Proj
P02    Semantic Integration   B01
Project
B_ID   Name
B01    PeTWIN
EmpProg (Emp, Prog)
Emp    Prog
E01    P02
Class: non-binary table
Object Property: binary table, or foreign key of a non-binary table
Data type Property: attribute of a non-binary table
Reference triples: via binary tables
<baseIRI/Employee#E_ID=E01> <baseIRI/EmpProg#Emp,Proj,E_ID,B_ID> <baseIRI/Program#P_ID=P02> .
Where (class IRIs),
<baseIRI/Employee#E_ID=E01> rdf:type <baseIRI/Employee> .
<baseIRI/Program#P_ID=P02> rdf:type <baseIRI/Program> .
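The distinction between binary and non-binary tables can be sketched as follows. Everything here is an assumption made for illustration: the `SCHEMA` dictionary is a hypothetical schema description, "binary" is simplified to "exactly two columns, both foreign keys", and the predicate naming (FK columns followed by the referenced columns) is a choice of this sketch that is not meant to reproduce the IRI printed on the slide.

```python
BASE = "baseIRI/"  # placeholder base IRI, written as on the slides

# Hypothetical schema description:
# table -> {column: (referenced_table, referenced_column) or None}
SCHEMA = {
    "Employee": {"E_ID": None, "Name": None, "Position": None},
    "Project": {"B_ID": None, "Name": None},
    "Program": {"P_ID": None, "Name": None, "Proj": ("Project", "B_ID")},
    "EmpProg": {"Emp": ("Employee", "E_ID"), "Prog": ("Program", "P_ID")},
}


def is_binary_table(table):
    """Simplified test: exactly two columns, and both are foreign keys."""
    columns = SCHEMA[table]
    return len(columns) == 2 and all(fk is not None for fk in columns.values())


def tuple_iri(table, key_column, value):
    return f"<{BASE}{table}#{key_column}={value}>"


def binary_reference_triple(table, row):
    """Link the two referenced tuples with one object-property triple."""
    (c1, (t1, k1)), (c2, (t2, k2)) = SCHEMA[table].items()
    predicate = f"<{BASE}{table}#{c1},{c2},{k1},{k2}>"  # naming is an assumption
    return f"{tuple_iri(t1, k1, row[c1])} {predicate} {tuple_iri(t2, k2, row[c2])} ."


# Non-binary tables become classes; the binary table becomes an object property.
classes = [t for t in SCHEMA if not is_binary_table(t)]
triple = binary_reference_triple("EmpProg", {"Emp": "E01", "Prog": "P02"})
print(classes)
print(triple)
```

The binary table contributes no class and no tuple identifier of its own; its single row only connects the Employee and Program tuples it references.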
Properties of DM
Information Preserving
Query Preserving
Monotone
not Semantics Preserving
i.e., it is not the case that, for every schema R and every set σ of PKs and FKs on R,
D ⊨ σ ⟺ G ⊨ OWL axioms.
Counter Example: D ⊭ σ, but G ⊨ OWL axioms.
No monotone extension of DM is semantics preserving (Sequeda et al. [Thm. 3]).
Employee
E_ID   Name       Position
E01    Ida        Post Doc
E01    Cathrine   PhD
Resulting graph: a single node :E01 with
rdf:type :baseIRI/Employee
:baseIRI/Employee#Name "Ida", "Cathrine"
:baseIRI/Employee#Position "Post Doc", "PhD"
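The counterexample can be replayed in a few lines. This is a sketch under assumptions: `satisfies_pk` and `direct_map` are hypothetical helpers, and the code only shows that the two tuples collapse into one node; it does not run an OWL reasoner.

```python
def satisfies_pk(rows, pk):
    """A primary key must be non-NULL and unique across all rows."""
    keys = [row[pk] for row in rows]
    return None not in keys and len(keys) == len(set(keys))


def direct_map(table, pk, rows, base=":baseIRI/"):
    """Toy direct mapping: the tuple identifier is derived from the PK value."""
    graph = set()
    for row in rows:
        node = f"{base}{table}#{pk}={row[pk]}"
        graph.add((node, "rdf:type", f"{base}{table}"))
        for column, value in row.items():
            graph.add((node, f"{base}{table}#{column}", value))
    return graph


employee = [
    {"E_ID": "E01", "Name": "Ida", "Position": "Post Doc"},
    {"E_ID": "E01", "Name": "Cathrine", "Position": "PhD"},
]

graph = direct_map("Employee", "E_ID", employee)
subjects = {s for s, _, _ in graph}
names = sorted(o for _, p, o in graph if p.endswith("#Name"))

print(satisfies_pk(employee, "E_ID"))  # the database violates its PK
print(len(subjects), names)            # one node carrying both names
```

Since both rows share the key value E01, they receive the same identifier, so the duplicate key is no longer visible in the graph as two distinct individuals.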
Our Goal
Constraint rewriting Γ for monotone direct mapping
Relational Database → Direct Mapping → RDF Graph; Integrity Constraints → ?
INPUT
Database Schema and Instance
Key Constraints: PKs and FKs
Data Constraints: Nullability, Uniqueness and Data types
OUTPUT
RDF Graph
SHACL Description
The rewriting Γ
Γ is a set of Datalog rules.
σ contains a PK for each relation in the schema R
for every PK ∈ σ, the corresponding UNQ and NN constraints are in δ
Diagram: Σ consists of key constraints σ and data constraints δ; schema R → M_s → vocabularies V; (V, δ) → Γ → shapes S; instance D → M_i → graph G
Properties of Γ
Γ is constraint preserving,
i.e., there exists a mapping N s.t. N(Γ(V, δ)) = (V, δ).
Γ is not semantics preserving,
i.e., D ⊨ Σ ⟺ G ⊨ S does not hold.
Counter Example:
User
U_ID
U01
U01
Null
RDF Triples:
:User/U_ID=U01 rdf:type :User .
:User/U_ID=U01 :User/U_ID U01 .
SHACL shape (:User is a class, :User/U_ID is a datatype property):
:User a sh:NodeShape, rdfs:Class;
    sh:property [ sh:path :User/U_ID;
        sh:nodeKind sh:Literal;
        sh:maxCount 1; sh:minCount 1;
        sh:datatype xsd:integer ];
    un:uniqueValuesForClass [ un:unqProp :User/U_ID;
        un:unqForClass :User ].
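The User counterexample can be checked with a hand-rolled test that mimics the cardinality and uniqueness parts of the shape. This is a sketch, not a SHACL engine: `conforms` and `direct_map_skip_null` are hypothetical helpers, the datatype and node-kind checks are left out, and the row with a NULL key is assumed to contribute no triples, matching the two triples listed in the example.

```python
def satisfies_key(rows, pk):
    """Key constraint on the database side: non-NULL and unique."""
    keys = [row[pk] for row in rows]
    return None not in keys and len(keys) == len(set(keys))


def direct_map_skip_null(table, pk, rows):
    """Toy mapping; assumes rows with a NULL key produce no triples."""
    graph = set()
    for row in rows:
        if row[pk] is None:
            continue
        node = f":{table}/{pk}={row[pk]}"
        graph.add((node, "rdf:type", f":{table}"))
        graph.add((node, f":{table}/{pk}", row[pk]))
    return graph


def conforms(graph, cls, prop):
    """Mimic minCount 1, maxCount 1 and uniqueness of prop within cls."""
    nodes = {s for s, p, o in graph if p == "rdf:type" and o == cls}
    values = {n: [o for s, p, o in graph if s == n and p == prop] for n in nodes}
    exactly_one = all(len(v) == 1 for v in values.values())
    flat = [o for v in values.values() for o in v]
    return exactly_one and len(flat) == len(set(flat))


users = [{"U_ID": "U01"}, {"U_ID": "U01"}, {"U_ID": None}]
graph = direct_map_skip_null("User", "U_ID", users)

print(satisfies_key(users, "U_ID"), conforms(graph, ":User", ":User/U_ID"))
```

The database violates its key constraint while the graph passes the check, because the duplicate rows collapse into one node and the NULL row leaves no trace. For an instance that does satisfy its key, such as rows U01 and U02, both checks agree.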
Γ is weakly semantics preserving, i.e.,
D ⊨ Σ ⟺ G ⊨ S
for all DB instances D that satisfy their key constraints σ.
Discussion
Rewriting Γ extends direct mapping with SHACL constraints.
Limitations:
Γ is not semantics preserving when the following are considered:
relation schemas without PKs, and
databases violating the key constraints
Open question:
Maximal semantics preserving rewriting Γ
Conclusion
We have extended direct mapping M with the constraint rewriting Γ:
M is monotone, information preserving and query preserving.
Γ is constraint preserving and weakly semantics preserving.
Future goal:
Constraint rewriting for R2RML
References
Juan F. Sequeda, Marcelo Arenas, and Daniel P. Miranker. On Directly Mapping Relational Databases to RDF and OWL. Proc. 21st Intl. Conf. on World Wide Web, pp. 649-658, ACM, 2012.
Marcelo Arenas, Alexandre Bertails, Eric Prud'hommeaux, and Juan Sequeda. A Direct Mapping of Relational Data to RDF. W3C Recommendation, 2012.
Holger Knublauch and Dimitris Kontokostas. Shapes Constraint Language (SHACL). W3C Recommendation, 2017.