SlideShare a Scribd company logo
1 of 40
Download to read offline
Cooperative Techniques for SPARQL Query
Relaxation in RDF Databases
G´eraud FOKOU St´ephane JEAN Allel HADJALI
Micka¨el BARON
LIAS/ENSMA and University of Poitiers, France
ESWC 2015, June 2015, Portoroz, Slovenia
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 1/21
Motivation
A lot of Knowledge Bases (KBs) available on the web
Academic projects: YAGO, NELL, DBPedia, ...
Commercial projects: Google (Knowledge Vault), ...
Difficulties to query KBs
Incomplete and Heterogeneous Data
Large size and complex schema semantics for end-users
Correct formulation of queries returning the intended result
...
⇒ failing RDF queries: queries that return an empty set of answers
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
Motivation
A lot of Knowledge Bases (KBs) available on the web
Academic projects: YAGO, NELL, DBPedia, ...
Commercial projects: Google (Knowledge Vault), ...
Difficulties to query KBs
Incomplete and Heterogeneous Data
Large size and complex schema semantics for end-users
Correct formulation of queries returning the intended result
...
⇒ failing RDF queries: queries that return an empty set of answers
The need for RDF query relaxation techniques
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
Related Work
Proposed relaxation operators [Hurtado08, Fokou14, ...]
OPTIONAL of SPARQL
Classes/Properties Hierarchies (Student → Person)
Replacing constants by variables
Removing triple patterns
...
⇒ On which triple patterns?
Proposed relaxation process [Huang12, ...]
Based on similarity measure Sim(Q, relax(Q))
Execution from the most to the least similar relaxed queries
⇒ blind relaxation of the query (long execution time)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
Related Work
Proposed relaxation operators [Hurtado08, Fokou14, ...]
OPTIONAL of SPARQL
Classes/Properties Hierarchies (Student → Person)
Replacing constants by variables
Removing triple patterns
...
⇒ On which triple patterns?
Proposed relaxation process [Huang12, ...]
Based on similarity measure Sim(Q, relax(Q))
Execution from the most to the least similar relaxed queries
⇒ blind relaxation of the query (long execution time)
The need to know the causes of failure of an RDF query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
Minimal Failing Subquery (MFS)
Definition
Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4
Q* is an MFS of Q iff:
Q* is a failing subquery of Q
No subquery of Q* is a failing query
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
MFSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
Minimal Failing Subquery (MFS)
Definition
Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4
Q* is an MFS of Q iff:
Q* is a failing subquery of Q
No subquery of Q* is a failing query
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
MFSs
The need to find maximal non failing subquery ⇒ notion of XSS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
MaXimal Succeeding Subquery (XSS)
Definition
Q* is an XSS of Q iff:
Q* is a successful subquery of Q
Q* is not included in any successful subquery of Q
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
XSSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
MaXimal Succeeding Subquery (XSS)
Definition
Q* is an XSS of Q iff:
Q* is a successful subquery of Q
Q* is not included in any successful subquery of Q
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
XSSs
Problem: Computation of MFSs and XSSs of a failing RDF query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 6/21
LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
LBA Approach: 2) Computing Potential XSSs
Second step:
All queries that include Q∗ are failing
⇒ they can be neither MFS nor XSS
Continue with the largest subqueries of Q that do not include
Q∗. If they are successful, they are XSSs.
⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗}
Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q∗
pxss(Q, Q∗
)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
LBA Approach: 2) Computing Potential XSSs
Second step:
All queries that include Q∗ are failing
⇒ they can be neither MFS nor XSS
Continue with the largest subqueries of Q that do not include
Q∗. If they are successful, they are XSSs.
⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗}
Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q∗
pxss(Q, Q∗
)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 10/21
MBA Approach: Overview
Inspired by the work of Jannach in Recommender Systems
Based on the computation of a Matrix
one row by potential solution µ
(a mapping satisfying at least one triple pattern)
one column by triple pattern
M[µ][ti ] = 1 ⇔ µ satisfies ti (else 0)
t
s p o
Smith type Professor
Smith research SW
Smith teacherOf DB
Jane type Lecturer
Jane research DB
Jane teacherOf DB
SELECT ?p ?a WHERE {
?p age ?a (t1)
?p type Lecturer (t2)
?p research SW (t3)
?p teacherOf DB } (t4)
?p ?a t1 t2 t3 t4
Smith null 0 0 1 1
Jane null 0 1 0 1
xss(Q) = {t2 ∧ t4, t3 ∧ t4}
mfs(Q) = {t1, t2 ∧ t3}
RDF triples
The query Q
The relaxed matrix of Q The MFSs and XSSs of Q
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 11/21
MBA Approach: Relaxed Matrix Computation
Execution of each triple pattern ti
Computation of t1
∗ t2... ∗ tn
where t1
∗
t2 = t1 ∪ t2 ∪ t1 t2
the mappings that satisfies t1 or t2 or a combination of t1, t2
Problem: subqueries of Q can lead to Cartesian Product
⇒ Matrix of a huge size (exceeds the size of main memory)
Not the case of Star-shaped queries
(a shared join variable in the subject position)
SELECT ?p ?a WHERE {
?p age ?a (t1)
?p type Lecturer (t2)
?p research SW (t3)
?p teacherOf DB } (t4)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 12/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
MBA Approach: Optimized Relaxed Matrix Computation
Algorithm 1Q
if the RDF database is implemented as a triple table: T(s, p, o)
a single SQL query can be used to compute the matrix
roughly the translation of t1 t2... tn
select coalesce(t1.s , t2.s, t3.s, t4.s),
case when t1.s is null then 0 else 1 end as t1,
case when t2.s is null then 0 else 1 end as t2,
case when t3.s is null then 0 else 1 end as t3,
case when t4.s is null then 0 else 1 end as t4
from t1 full outer join t2 on t1.s = t2.s
full outer join t3 on coalesce(t1.s , t2.s) = t3.s
full outer join t4 on coalesce(t1.s , t2.s, t3.s) = t4.s
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 14/21
MBA Approach: MFSs and XSSs computation
XSSs computation
1) Compute the skyline of the relaxed matrix
Many solutions: block nested loop, sort filter skyline, ...
2) Retrieve the queries that correspond to skyline rows
They are the XSSs
?p ?a t1 t2 t3 t4
Smith null 0 0 1 1
Jane null 0 1 0 1
⇒ xss(Q) = {t2 ∧ t4, t3 ∧ t4}
MFSs computation
The matrix provides a simple way to know if a query is failing
t2 ∧ t3 is failing ⇔ col(t2) AND col(t3) = 0
0
The matrix can be used as an index for optimizing LBA
LBA without any database query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 15/21
Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 16/21
Experimental Setup
Java implementation of LBA and MBA on top of Jena TDB
http://www.lias-lab.fr/forge/projects/qars
Dataset: LUBM20 (≈ 300 MB) and LUBM100 (≈ 1 GB)
Queries: 7 failing queries ranging 1 to 15 triple patterns (TP)
Hardware: Intel Core i7 CPU and 8GB RAM
Implementation of the relaxed matrix as a set of Bitmap
Relaxed matrix size and computation time:
LUBM20 LUBM100
Q3 Q5 Q7 Q3 Q5 Q7
Computation time with NQ (in sec) 8.6 8.6 8.6 42.6 43.4 44.6
Computation time with 1Q (in sec) 6.1 6.3 6.8 30.4 34.6 38.5
Size (in KB) 293 400 335 1385 1912 1590
Number of rows (in K) 430 430 430 2149 2149 2149
The algorithm 1Q is 25% faster than NQ
The size of the matrix is small thanks to bitmap compression
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 17/21
Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
10
100
1000
10000
100000
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
DFS does not scale for medium size queries (≈ 9TP)
Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
ISHMAEL: adaptation of Godfrey’algorithm, LBA
10
100
1000
10000
100000
1e+006
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
ISHMAEL
LBA
LBA/ISHMAEL scale till queries ≈ 11 TP
Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
ISHMAEL: adaptation of Godfrey’algorithm, LBA
MBA+M: computation of the matrix + LBA algorithm
MBA-M: same as MBA+M but without computing the matrix
1
10
100
1000
10000
100000
1e+006
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
ISHMAEL
LBA
MBA+M
MBA-M
MBA is only useful for Star-shaped queries
MBA+M only useful for large queries, MBA-M scales well
Experiments: Performance as the number of TP scales
The number of TP plays a important role in the performance
Experiment: decompose Q7 in 15 subqueries (1 to 15TP)
Execute each such subquery
1
10
100
1000
10000
100000
1e+006
2 4 6 8 10 12 14
Number of Triple Patterns
LUBM100
DFS
ISMAEL
LBA
MBA+M
MBA-M
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 19/21
⇒ Confirm our previous results
Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 20/21
Conclusion and Perspectives
Study of 2 cooperative techniques for RDF query relaxation
MFS: a cause of query failure
XSS: a relaxed query with a maximum of TP
Contribution: 2 solutions to find MFSs/XSSs of RDF queries
LBA is smart exploration of the subquery lattice
MBA is based on a matrix used as a bitmap index
LBA >> baseline algorithms for queries with more than 5 TPs
MBA is only interesting for large star-shaped queries
if the matrix has been precomputed, MBA runs in milliseconds
even for large queries (>15TP)
Perspectives:
Optimization of LBA using heuristics (obvious MFS)
Optimization of MBA for other kinds of RDF queries
Definition of query relaxation strategies based on MFSs/XSSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 21/21

More Related Content

What's hot

The discrete quartic spline interpolation over non uniform mesh
The discrete  quartic spline interpolation over non uniform meshThe discrete  quartic spline interpolation over non uniform mesh
The discrete quartic spline interpolation over non uniform meshAlexander Decker
 
Goldberg-Coxeter construction for 3- or 4-valent plane maps
Goldberg-Coxeter construction for 3- or 4-valent plane mapsGoldberg-Coxeter construction for 3- or 4-valent plane maps
Goldberg-Coxeter construction for 3- or 4-valent plane mapsMathieu Dutour Sikiric
 
11.a seasonal arima model for nigerian gross domestic product
11.a seasonal arima model for nigerian gross domestic product11.a seasonal arima model for nigerian gross domestic product
11.a seasonal arima model for nigerian gross domestic productAlexander Decker
 
11.[1 11]a seasonal arima model for nigerian gross domestic product
11.[1 11]a seasonal arima model for nigerian gross domestic product11.[1 11]a seasonal arima model for nigerian gross domestic product
11.[1 11]a seasonal arima model for nigerian gross domestic productAlexander Decker
 
CS Fundamentals: Scalability and Memory
CS Fundamentals: Scalability and MemoryCS Fundamentals: Scalability and Memory
CS Fundamentals: Scalability and MemoryHaseeb Qureshi
 
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)Federal University of Technology of Parana
 

What's hot (8)

The discrete quartic spline interpolation over non uniform mesh
The discrete  quartic spline interpolation over non uniform meshThe discrete  quartic spline interpolation over non uniform mesh
The discrete quartic spline interpolation over non uniform mesh
 
A Note on TopicRNN
A Note on TopicRNNA Note on TopicRNN
A Note on TopicRNN
 
Goldberg-Coxeter construction for 3- or 4-valent plane maps
Goldberg-Coxeter construction for 3- or 4-valent plane mapsGoldberg-Coxeter construction for 3- or 4-valent plane maps
Goldberg-Coxeter construction for 3- or 4-valent plane maps
 
11.a seasonal arima model for nigerian gross domestic product
11.a seasonal arima model for nigerian gross domestic product11.a seasonal arima model for nigerian gross domestic product
11.a seasonal arima model for nigerian gross domestic product
 
11.[1 11]a seasonal arima model for nigerian gross domestic product
11.[1 11]a seasonal arima model for nigerian gross domestic product11.[1 11]a seasonal arima model for nigerian gross domestic product
11.[1 11]a seasonal arima model for nigerian gross domestic product
 
CS Fundamentals: Scalability and Memory
CS Fundamentals: Scalability and MemoryCS Fundamentals: Scalability and Memory
CS Fundamentals: Scalability and Memory
 
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)
Slides Workshopon Explainable Logic-Based Knowledge Representation (XLoKR 2020)
 
Saarbruecken
SaarbrueckenSaarbruecken
Saarbruecken
 

Recently uploaded

HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Cooperative Techniques for SPARQL Query Relaxation in RDF Databases: ESWC 2015

  • 1. Cooperative Techniques for SPARQL Query Relaxation in RDF Databases G´eraud FOKOU St´ephane JEAN Allel HADJALI Micka¨el BARON LIAS/ENSMA and University of Poitiers, France ESWC 2015, June 2015, Portoroz, Slovenia G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 1/21
  • 2. Motivation A lot of Knowledge Bases (KBs) available on the web Academic projects: YAGO, NELL, DBPedia, ... Commercial projects: Google (Knowledge Vault), ... Difficulties to query KBs Incomplete and Heterogeneous Data Large size and complex schema semantics for end-users Correct formulation of queries returning the intended result ... ⇒ failing RDF queries: queries that return an empty set of answers G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
  • 3. Motivation A lot of Knowledge Bases (KBs) available on the web Academic projects: YAGO, NELL, DBPedia, ... Commercial projects: Google (Knowledge Vault), ... Difficulties to query KBs Incomplete and Heterogeneous Data Large size and complex schema semantics for end-users Correct formulation of queries returning the intended result ... ⇒ failing RDF queries: queries that return an empty set of answers The need for RDF query relaxation techniques G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
  • 4. Related Work Proposed relaxation operators [Hurtado08, Fokou14, ...] OPTIONAL of SPARQL Classes/Properties Hierarchies (Student → Person) Replacing constants by variables Removing triple patterns ... ⇒ On which triple patterns? Proposed relaxation process [Huang12, ...] Based on similarity measure Sim(Q, relax(Q)) Execution from the most to the least similar relaxed queries ⇒ blind relaxation of the query (long execution time) G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
  • 5. Related Work Proposed relaxation operators [Hurtado08, Fokou14, ...] OPTIONAL of SPARQL Classes/Properties Hierarchies (Student → Person) Replacing constants by variables Removing triple patterns ... ⇒ On which triple patterns? Proposed relaxation process [Huang12, ...] Based on similarity measure Sim(Q, relax(Q)) Execution from the most to the least similar relaxed queries ⇒ blind relaxation of the query (long execution time) The need to know the causes of failure of an RDF query G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
  • 6. Minimal Failing Subquery (MFS) Definition Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4 Q* is an MFS of Q iff: Q* is a failing subquery of Q No subquery of Q* is a failing query ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q : failing query Q : successful query MFSs G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
  • 7. Minimal Failing Subquery (MFS) Definition Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4 Q* is an MFS of Q iff: Q* is a failing subquery of Q No subquery of Q* is a failing query ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q : failing query Q : successful query MFSs The need to find maximal non failing subquery ⇒ notion of XSS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
  • 8. MaXimal Succeeding Subquery (XSS) Definition Q* is an XSS of Q iff: Q* is a successful subquery of Q Q* is not included in any successful subquery of Q ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q : failing query Q : successful query XSSs G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
  • 9. MaXimal Succeeding Subquery (XSS) Definition Q* is an XSS of Q iff: Q* is a successful subquery of Q Q* is not included in any successful subquery of Q ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q : failing query Q : successful query XSSs Problem: Computation of MFSs and XSSs of a failing RDF query G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
  • 10. Outline • Problem Statement • Lattice-Based Approach (LBA) • Matrice-Based Approach (MBA) • Experimental results • Conclusion and Perspectives G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 6/21
  • 11. LBA Approach: 1) Finding an MFS Inspired by the work of Godfrey in RDBMSs [Godfrey97] A 3 steps procedure First step: Q∗ ← FindAnMFS(Q) • remove iteratively each triple pattern ti of Q (query Q ) • if Q ∧ Q∗ is successful, ti is an element of Q∗ (Proposition) • otherwise Q∗ is a subquery of Q ∧ Q∗ ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q = t2 ∧ t3 ∧ t4 Q ∧ Q∗ = t2 ∧ t3 ∧ t4 ⇒ Q∗ = ∅ remove t1 t1 /∈ MFS Q = t3 ∧ t4 Q ∧ Q∗ = t3 ∧ t4 ⇒ Q∗ = t2 remove t2 t2 ∈ MFS Q = t4 Q ∧ Q∗ = t2 ∧ t4 ⇒ Q∗ = t2 ∧ t3 remove t3 t3 ∈ MFS Q = ∅ Q ∧ Q∗ = t2 ∧ t3 ⇒ Q∗ = t2 ∧ t3 remove t4 t4 /∈ MFS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
  • 12. LBA Approach: 1) Finding an MFS Inspired by the work of Godfrey in RDBMSs [Godfrey97] A 3 steps procedure First step: Q∗ ← FindAnMFS(Q) • remove iteratively each triple pattern ti of Q (query Q ) • if Q ∧ Q∗ is successful, ti is an element of Q∗ (Proposition) • otherwise Q∗ is a subquery of Q ∧ Q∗ ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q = t2 ∧ t3 ∧ t4 Q ∧ Q∗ = t2 ∧ t3 ∧ t4 ⇒ Q∗ = ∅ remove t1 t1 /∈ MFS Q = t3 ∧ t4 Q ∧ Q∗ = t3 ∧ t4 ⇒ Q∗ = t2 remove t2 t2 ∈ MFS Q = t4 Q ∧ Q∗ = t2 ∧ t4 ⇒ Q∗ = t2 ∧ t3 remove t3 t3 ∈ MFS Q = ∅ Q ∧ Q∗ = t2 ∧ t3 ⇒ Q∗ = t2 ∧ t3 remove t4 t4 /∈ MFS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
  • 13. LBA Approach: 1) Finding an MFS Inspired by the work of Godfrey in RDBMSs [Godfrey97] A 3 steps procedure First step: Q∗ ← FindAnMFS(Q) • remove iteratively each triple pattern ti of Q (query Q ) • if Q ∧ Q∗ is successful, ti is an element of Q∗ (Proposition) • otherwise Q∗ is a subquery of Q ∧ Q∗ ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q = t2 ∧ t3 ∧ t4 Q ∧ Q∗ = t2 ∧ t3 ∧ t4 ⇒ Q∗ = ∅ remove t1 t1 /∈ MFS Q = t3 ∧ t4 Q ∧ Q∗ = t3 ∧ t4 ⇒ Q∗ = t2 remove t2 t2 ∈ MFS Q = t4 Q ∧ Q∗ = t2 ∧ t4 ⇒ Q∗ = t2 ∧ t3 remove t3 t3 ∈ MFS Q = ∅ Q ∧ Q∗ = t2 ∧ t3 ⇒ Q∗ = t2 ∧ t3 remove t4 t4 /∈ MFS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
  • 14. LBA Approach: 1) Finding an MFS Inspired by the work of Godfrey in RDBMSs [Godfrey97] A 3 steps procedure First step: Q∗ ← FindAnMFS(Q) • remove iteratively each triple pattern ti of Q (query Q ) • if Q ∧ Q∗ is successful, ti is an element of Q∗ (Proposition) • otherwise Q∗ is a subquery of Q ∧ Q∗ ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q = t2 ∧ t3 ∧ t4 Q ∧ Q∗ = t2 ∧ t3 ∧ t4 ⇒ Q∗ = ∅ remove t1 t1 /∈ MFS Q = t3 ∧ t4 Q ∧ Q∗ = t3 ∧ t4 ⇒ Q∗ = t2 remove t2 t2 ∈ MFS Q = t4 Q ∧ Q∗ = t2 ∧ t4 ⇒ Q∗ = t2 ∧ t3 remove t3 t3 ∈ MFS Q = ∅ Q ∧ Q∗ = t2 ∧ t3 ⇒ Q∗ = t2 ∧ t3 remove t4 t4 /∈ MFS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
  • 15. LBA Approach: 1) Finding an MFS Inspired by the work of Godfrey in RDBMSs [Godfrey97] A 3 steps procedure First step: Q∗ ← FindAnMFS(Q) • remove iteratively each triple pattern ti of Q (query Q ) • if Q ∧ Q∗ is successful, ti is an element of Q∗ (Proposition) • otherwise Q∗ is a subquery of Q ∧ Q∗ ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q = t2 ∧ t3 ∧ t4 Q ∧ Q∗ = t2 ∧ t3 ∧ t4 ⇒ Q∗ = ∅ remove t1 t1 /∈ MFS Q = t3 ∧ t4 Q ∧ Q∗ = t3 ∧ t4 ⇒ Q∗ = t2 remove t2 t2 ∈ MFS Q = t4 Q ∧ Q∗ = t2 ∧ t4 ⇒ Q∗ = t2 ∧ t3 remove t3 t3 ∈ MFS Q = ∅ Q ∧ Q∗ = t2 ∧ t3 ⇒ Q∗ = t2 ∧ t3 remove t4 t4 /∈ MFS G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
  • 16. LBA Approach: 2) Computing Potential XSSs Second step: All queries that include Q∗ are failing ⇒ they can be neither MFS nor XSS Continue with the largest subqueries of Q that do not include Q∗. If they are successful, they are XSSs. ⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗} Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q∗ pxss(Q, Q∗ ) G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
  • 17. LBA Approach: 2) Computing Potential XSSs Second step: All queries that include Q∗ are failing ⇒ they can be neither MFS nor XSS Continue with the largest subqueries of Q that do not include Q∗. If they are successful, they are XSSs. ⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗} Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} ∅ t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 Q∗ pxss(Q, Q∗ ) G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
  • 18. LBA Approach: 3) Finding all MFSs and XSSs Third step: Na¨ıve approach Execute each PXSS PQ If PQ is successful, this is an XSS Otherwise apply recursively the LBA algorithm on PQ Problem: the same MFS can be found several times Solution: incremental computation of the PXSSs that do not include the identified MFSs ∅ t1t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 MFSs XSSs pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} mfs(Q) = {t2 ∧ t3} xss(Q) = ∅ t1 ∧ t2 ∧ t3 ∧ t4 Q∗ = t2 ∧ t3 pxss = {t3 ∧ t4, t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = ∅ t1 ∧ t3 ∧ t4 Q∗∗ = t1 pxss = {t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4} t3 ∧ t4 pxss = ∅ mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4, t2 ∧ t4} t2 ∧ t4 G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
  • 19. LBA Approach: 3) Finding all MFSs and XSSs Third step: Na¨ıve approach Execute each PXSS PQ If PQ is successful, this is an XSS Otherwise apply recursively the LBA algorithm on PQ Problem: the same MFS can be found several times Solution: incremental computation of the PXSSs that do not include the identified MFSs ∅ t1t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 MFSs XSSs pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} mfs(Q) = {t2 ∧ t3} xss(Q) = ∅ t1 ∧ t2 ∧ t3 ∧ t4 Q∗ = t2 ∧ t3 pxss = {t3 ∧ t4, t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = ∅ t1 ∧ t3 ∧ t4 Q∗∗ = t1 pxss = {t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4} t3 ∧ t4 pxss = ∅ mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4, t2 ∧ t4} t2 ∧ t4 G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
  • 20. LBA Approach: 3) Finding all MFSs and XSSs Third step: Na¨ıve approach Execute each PXSS PQ If PQ is successful, this is an XSS Otherwise apply recursively the LBA algorithm on PQ Problem: the same MFS can be found several times Solution: incremental computation of the PXSSs that do not include the identified MFSs ∅ t1t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 MFSs XSSs pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} mfs(Q) = {t2 ∧ t3} xss(Q) = ∅ t1 ∧ t2 ∧ t3 ∧ t4 Q∗ = t2 ∧ t3 pxss = {t3 ∧ t4, t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = ∅ t1 ∧ t3 ∧ t4 Q∗∗ = t1 pxss = {t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4} t3 ∧ t4 pxss = ∅ mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4, t2 ∧ t4} t2 ∧ t4 G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
  • 21. LBA Approach: 3) Finding all MFSs and XSSs Third step: Na¨ıve approach Execute each PXSS PQ If PQ is successful, this is an XSS Otherwise apply recursively the LBA algorithm on PQ Problem: the same MFS can be found several times Solution: incremental computation of the PXSSs that do not include the identified MFSs ∅ t1t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 MFSs XSSs pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} mfs(Q) = {t2 ∧ t3} xss(Q) = ∅ t1 ∧ t2 ∧ t3 ∧ t4 Q∗ = t2 ∧ t3 pxss = {t3 ∧ t4, t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = ∅ t1 ∧ t3 ∧ t4 Q∗∗ = t1 pxss = {t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4} t3 ∧ t4 pxss = ∅ mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4, t2 ∧ t4} t2 ∧ t4 G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
  • 22. LBA Approach: 3) Finding all MFSs and XSSs Third step: Na¨ıve approach Execute each PXSS PQ If PQ is successful, this is an XSS Otherwise apply recursively the LBA algorithm on PQ Problem: the same MFS can be found several times Solution: incremental computation of the PXSSs that do not include the identified MFSs ∅ t1t1 t2 t3 t4 t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4 t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4 t1 ∧ t2 ∧ t3 ∧ t4 MFSs XSSs pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4} mfs(Q) = {t2 ∧ t3} xss(Q) = ∅ t1 ∧ t2 ∧ t3 ∧ t4 Q∗ = t2 ∧ t3 pxss = {t3 ∧ t4, t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = ∅ t1 ∧ t3 ∧ t4 Q∗∗ = t1 pxss = {t2 ∧ t4} mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4} t3 ∧ t4 pxss = ∅ mfs(Q) = {t2 ∧ t3, t1} xss(Q) = {t3 ∧ t4, t2 ∧ t4} t2 ∧ t4 G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
  • 23. Outline • Problem Statement • Lattice-Based Approach (LBA) • Matrice-Based Approach (MBA) • Experimental results • Conclusion and Perspectives G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 10/21
  • 24. MBA Approach: Overview Inspired by the work of Jannach in Recommender Systems Based on the computation of a Matrix one row by potential solution µ (a mapping satisfying at least one triple pattern) one column by triple pattern M[µ][ti ] = 1 ⇔ µ satisfies ti (else 0) t s p o Smith type Professor Smith research SW Smith teacherOf DB Jane type Lecturer Jane research DB Jane teacherOf DB SELECT ?p ?a WHERE { ?p age ?a (t1) ?p type Lecturer (t2) ?p research SW (t3) ?p teacherOf DB } (t4) ?p ?a t1 t2 t3 t4 Smith null 0 0 1 1 Jane null 0 1 0 1 xss(Q) = {t2 ∧ t4, t3 ∧ t4} mfs(Q) = {t1, t2 ∧ t3} RDF triples The query Q The relaxed matrix of Q The MFSs and XSSs of Q G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 11/21
  • 25. MBA Approach: Relaxed Matrix Computation Execution of each triple pattern ti Computation of t1 ∗ t2... ∗ tn where t1 ∗ t2 = t1 ∪ t2 ∪ t1 t2 the mappings that satisfies t1 or t2 or a combination of t1, t2 Problem: subqueries of Q can lead to Cartesian Product ⇒ Matrix of a huge size (exceeds the size of main memory) Not the case of Star-shaped queries (a shared join variable in the subject position) SELECT ?p ?a WHERE { ?p age ?a (t1) ?p type Lecturer (t2) ?p research SW (t3) ?p teacherOf DB } (t4) G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 12/21
  • 26. MBA Approach: Optimized Relaxed Matrix Computation Algorithm NQ execute each triple patternti for each value v of the join variable if v is already in the matrix, complete the row with a 1 for ti else add a new row in the matrix with only a 1 for ti ?p ?a t1 t2 t3 t4 t1 = ∅ ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 t2 = Jane ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 Smith null 0 0 1 0 t3 = Smith ?p ?a t1 t2 t3 t4 Jane null 0 1 0 1 Smith null 0 0 1 1 t4 = Smith, Jane G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
  • 27. MBA Approach: Optimized Relaxed Matrix Computation Algorithm NQ execute each triple patternti for each value v of the join variable if v is already in the matrix, complete the row with a 1 for ti else add a new row in the matrix with only a 1 for ti ?p ?a t1 t2 t3 t4 t1 = ∅ ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 t2 = Jane ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 Smith null 0 0 1 0 t3 = Smith ?p ?a t1 t2 t3 t4 Jane null 0 1 0 1 Smith null 0 0 1 1 t4 = Smith, Jane G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
  • 28. MBA Approach: Optimized Relaxed Matrix Computation Algorithm NQ execute each triple patternti for each value v of the join variable if v is already in the matrix, complete the row with a 1 for ti else add a new row in the matrix with only a 1 for ti ?p ?a t1 t2 t3 t4 t1 = ∅ ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 t2 = Jane ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 Smith null 0 0 1 0 t3 = Smith ?p ?a t1 t2 t3 t4 Jane null 0 1 0 1 Smith null 0 0 1 1 t4 = Smith, Jane G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
  • 29. MBA Approach: Optimized Relaxed Matrix Computation Algorithm NQ execute each triple patternti for each value v of the join variable if v is already in the matrix, complete the row with a 1 for ti else add a new row in the matrix with only a 1 for ti ?p ?a t1 t2 t3 t4 t1 = ∅ ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 t2 = Jane ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 Smith null 0 0 1 0 t3 = Smith ?p ?a t1 t2 t3 t4 Jane null 0 1 0 1 Smith null 0 0 1 1 t4 = Smith, Jane G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
  • 30. MBA Approach: Optimized Relaxed Matrix Computation Algorithm NQ execute each triple patternti for each value v of the join variable if v is already in the matrix, complete the row with a 1 for ti else add a new row in the matrix with only a 1 for ti ?p ?a t1 t2 t3 t4 t1 = ∅ ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 t2 = Jane ?p ?a t1 t2 t3 t4 Jane null 0 1 0 0 Smith null 0 0 1 0 t3 = Smith ?p ?a t1 t2 t3 t4 Jane null 0 1 0 1 Smith null 0 0 1 1 t4 = Smith, Jane G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
  • 31. MBA Approach: Optimized Relaxed Matrix Computation Algorithm 1Q if the RDF database is implemented as a triple table: T(s, p, o) a single SQL query can be used to compute the matrix roughly the translation of t1 t2... tn select coalesce(t1.s , t2.s, t3.s, t4.s), case when t1.s is null then 0 else 1 end as t1, case when t2.s is null then 0 else 1 end as t2, case when t3.s is null then 0 else 1 end as t3, case when t4.s is null then 0 else 1 end as t4 from t1 full outer join t2 on t1.s = t2.s full outer join t3 on coalesce(t1.s , t2.s) = t3.s full outer join t4 on coalesce(t1.s , t2.s, t3.s) = t4.s G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 14/21
  • 32. MBA Approach: MFSs and XSSs computation XSSs computation 1) Compute the skyline of the relaxed matrix Many solutions: block nested loop, sort filter skyline, ... 2) Retrieve the queries that correspond to skyline rows They are the XSSs ?p ?a t1 t2 t3 t4 Smith null 0 0 1 1 Jane null 0 1 0 1 ⇒ xss(Q) = {t2 ∧ t4, t3 ∧ t4} MFSs computation The matrix provides a simple way to know if a query is failing t2 ∧ t3 is failing ⇔ col(t2) AND col(t3) = 0 0 The matrix can be used as an index for optimizing LBA LBA without any database query G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 15/21
  • 33. Outline • Problem Statement • Lattice-Based Approach (LBA) • Matrice-Based Approach (MBA) • Experimental results • Conclusion and Perspectives G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 16/21
  • 34. Experimental Setup Java implementation of LBA and MBA on top of Jena TDB http://www.lias-lab.fr/forge/projects/qars Dataset: LUBM20 (≈ 300 MB) and LUBM100 (≈ 1 GB) Queries: 7 failing queries ranging 1 to 15 triple patterns (TP) Hardware: Intel Core i7 CPU and 8GB RAM Implementation of the relaxed matrix as a set of Bitmap Relaxed matrix size and computation time: LUBM20 LUBM100 Q3 Q5 Q7 Q3 Q5 Q7 Computation time with NQ (in sec) 8.6 8.6 8.6 42.6 43.4 44.6 Computation time with 1Q (in sec) 6.1 6.3 6.8 30.4 34.6 38.5 Size (in KB) 293 400 335 1385 1912 1590 Number of rows (in K) 430 430 430 2149 2149 2149 The algorithm 1Q is 25% faster than NQ The size of the matrix is small thanks to bitmap compression G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 17/21
  • 35. Experiments: XSS and MFS Computation Time G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21 DFS: Depth-First Search algorithm (baseline method) 10 100 1000 10000 100000 Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15) QueryTime(ms) Queries with Number of Triple pattern (TP) LUBM100 DFS DFS does not scale for medium size queries (≈ 9TP)
  • 36. Experiments: XSS and MFS Computation Time G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21 DFS: Depth-First Search algorithm (baseline method) ISHMAEL: adaptation of Godfrey’algorithm, LBA 10 100 1000 10000 100000 1e+006 Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15) QueryTime(ms) Queries with Number of Triple pattern (TP) LUBM100 DFS ISHMAEL LBA LBA/ISHMAEL scale till queries ≈ 11 TP
  • 37. Experiments: XSS and MFS Computation Time G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21 DFS: Depth-First Search algorithm (baseline method) ISHMAEL: adaptation of Godfrey’algorithm, LBA MBA+M: computation of the matrix + LBA algorithm MBA-M: same as MBA+M but without computing the matrix 1 10 100 1000 10000 100000 1e+006 Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15) QueryTime(ms) Queries with Number of Triple pattern (TP) LUBM100 DFS ISHMAEL LBA MBA+M MBA-M MBA is only useful for Star-shaped queries MBA+M only useful for large queries, MBA-M scales well
  • 38. Experiments: Performance as the number of TP scales The number of TP plays a important role in the performance Experiment: decompose Q7 in 15 subqueries (1 to 15TP) Execute each such subquery 1 10 100 1000 10000 100000 1e+006 2 4 6 8 10 12 14 Number of Triple Patterns LUBM100 DFS ISMAEL LBA MBA+M MBA-M G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 19/21 ⇒ Confirm our previous results
  • 39. Outline • Problem Statement • Lattice-Based Approach (LBA) • Matrice-Based Approach (MBA) • Experimental results • Conclusion and Perspectives G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 20/21
  • 40. Conclusion and Perspectives Study of 2 cooperative techniques for RDF query relaxation MFS: a cause of query failure XSS: a relaxed query with a maximum of TP Contribution: 2 solutions to find MFSs/XSSs of RDF queries LBA is smart exploration of the subquery lattice MBA is based on a matrix used as a bitmap index LBA >> baseline algorithms for queries with more than 5 TPs MBA is only interesting for large star-shaped queries if the matrix has been precomputed, MBA runs in milliseconds even for large queries (>15TP) Perspectives: Optimization of LBA using heuristics (obvious MFS) Optimization of MBA for other kinds of RDF queries Definition of query relaxation strategies based on MFSs/XSSs G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 21/21