Federated Query Formulation and Processing through BioFed

Semantic Web Solutions For Large-Scale Biomedical Data Analytics (SEWEBMEDA)
Workshop at ESWC2017, Portoroz, Slovenia
May 28th, 2017
Ali Hasnain, Syeda Sana E Zainab, Dure Zehra, Qaiser Mehmood, Muhammad Saleem and Dietrich Rebholz-Schuhmann

OUTLINE
1. Introduction
2. BioFed query processing
 • Source selection
 • Query re-writing
3. Evaluation
4. BioFed demo

INTRODUCTION
 • Linked, decentralized and distributed architecture
 • 9,960 datasets
 • ~150B triples
 • Complex information needs
 • Need for federated queries

INTRODUCTION: EXAMPLE
Return the party membership and news pages about all US presidents.
 • One source: US presidents and their party memberships
 • Another source: US presidents and news pages
 • Computing the results requires data from both sources

BIOFED: QUERY PROCESSING
BioFed Engine components: Parse Query, Source Selection (with Road Map), SERVICE Annotation, Federator Optimizer, Integrator
1. Parse Query: get individual triple patterns
2. Source Selection: identify relevant sources
3. SERVICE Annotation: rewrite query, i.e., add SPARQL SERVICE clauses
4. Federator Optimizer: generate optimized query execution plan
5. Execute sub-queries
6. Integrator: integrate sub-query results

BIOFED: SOURCE SELECTION
Two-step triple pattern-wise source selection:
1. Road Map lookup for the predicate of each triple pattern
 • Select the sources that contain the predicate
 • Select all sources if the predicate is unbound
2. If the subject or object of the triple pattern is bound
 • Send a SPARQL ASK query for the complete triple pattern to each source selected in step 1
 • Prune the sources that return false for the SPARQL ASK query
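The two steps can be sketched in a few lines of Python. This is a minimal illustration, not BioFed's implementation: the function name `select_sources`, the dictionary-based Road Map, and the `toy_ask` stand-in for a remote ASK request are all assumptions made for the example. The Road Map entries mirror the source assignments shown on the following slides.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# A triple pattern is (subject, predicate, object); None marks an unbound position.
TriplePattern = Tuple[Optional[str], Optional[str], Optional[str]]


def select_sources(
    pattern: TriplePattern,
    road_map: Dict[str, Set[str]],
    all_sources: Set[str],
    ask: Callable[[str, TriplePattern], bool],
) -> Set[str]:
    """Two-step source selection for a single triple pattern (hypothetical helper)."""
    subject, predicate, obj = pattern

    # Step 1: Road Map lookup on the predicate; an unbound predicate selects all sources.
    if predicate is None:
        candidates = set(all_sources)
    else:
        candidates = set(road_map.get(predicate, set()))

    # Step 2: only when subject or object is bound, prune sources whose ASK returns false.
    if subject is not None or obj is not None:
        candidates = {source for source in candidates if ask(source, pattern)}
    return candidates


# Assumed toy data: S1..S4 are placeholder source names.
ALL_SOURCES = {"S1", "S2", "S3", "S4"}
ROAD_MAP = {
    "rdf:type": {"S1", "S2", "S3", "S4"},
    "nyt:topicPage": {"S4"},
    "owl:sameAs": {"S1", "S2", "S4"},
}


def toy_ask(source: str, pattern: TriplePattern) -> bool:
    # Stand-in for a remote SPARQL ASK request: here only S1 answers true for TP1.
    return source == "S1" and pattern == (None, "rdf:type", "dbpedia:President")


tp1 = (None, "rdf:type", "dbpedia:President")  # object bound, so ASK pruning applies
tp4 = (None, "nyt:topicPage", None)            # nothing bound, Road Map lookup only
tp5 = (None, "owl:sameAs", None)               # nothing bound, Road Map lookup only
```

Passing the ASK call in as a function keeps the selection logic independent of how endpoints are actually contacted, so the same sketch works with a real HTTP client or with a stub.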

FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President . # TP1
?president dbpedia:nationality dbpedia:United_States . # TP2
?president dbpedia:party ?party . # TP3
?x nyt:topicPage ?page . # TP4
?x owl:sameAs ?president . # TP5
}
Source Selection Algorithm
Triple pattern-wise source selection
Sources: S1 = DBpedia RDF, S2 = KEGG RDF, S3 = ChEBI RDF, S4 = NYT RDF
Step 1: Road Map lookup for rdf:type
TP1 = S1 S2 S3 S4

TP1 = S1 S2 S3 S4
Step 2: Prune step 1 sources using SPARQL ASK queries
ASK { ?president rdf:type dbpedia:President }
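As a rough illustration of step 2, the ASK query for a triple pattern can be assembled as a string before it is sent to each candidate source. The helper name `build_ask_query` is hypothetical, and the IRI bound to the `dbpedia:` prefix below is an assumption; the slides do not show the prefix declarations.

```python
from typing import Dict


def build_ask_query(subject: str, predicate: str, obj: str,
                    prefixes: Dict[str, str]) -> str:
    """Render one triple pattern as a SPARQL ASK query (hypothetical helper)."""
    # Emit one PREFIX declaration per entry so the query is self-contained.
    header = "".join(f"PREFIX {name}: <{iri}>\n" for name, iri in prefixes.items())
    return f"{header}ASK {{ {subject} {predicate} {obj} }}"


PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    # Assumed binding for the dbpedia: prefix used on the slides.
    "dbpedia": "http://dbpedia.org/ontology/",
}

ask_tp1 = build_ask_query("?president", "rdf:type", "dbpedia:President", PREFIXES)
```

A source that answers false to this query is dropped from the candidate set for TP1.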

Result of step 2: TP1 = S1

MOTIVATION: SOURCE SELECTION
Relevant sources per triple pattern (S1 = DBpedia RDF, S2 = KEGG RDF, S3 = ChEBI RDF, S4 = NYT RDF):
TP1 = S1
TP2 = S1
TP3 = S1
TP4 = S4
TP5 = S1 S2 S4

BIOFED: QUERY RE-WRITING
SPARQL 1.0 to SPARQL 1.1 conversion
TP1 = S1
TP2 = S1
TP3 = S1
TP4 = S4
TP5 = S1 S2 S4
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President . # TP1
?president dbpedia:nationality dbpedia:United_States . # TP2
?president dbpedia:party ?party . # TP3
?x nyt:topicPage ?page . # TP4
?x owl:sameAs ?president . # TP5
}

 • Combine triple patterns that share the same single relevant source
SELECT ?president ?party ?page
WHERE {
SERVICE <S1> {
?president rdf:type dbpedia:President . # TP1
?president dbpedia:nationality dbpedia:United_States . # TP2
?president dbpedia:party ?party . # TP3
}
SERVICE <S4> { ?x nyt:topicPage ?page . } # TP4
?x owl:sameAs ?president . # TP5
}

 • Combine triple patterns that share the same single relevant source
 • Use UNION and SERVICE for triple patterns with more than one relevant source
SELECT ?president ?party ?page
WHERE {
SERVICE <S1> {
?president rdf:type dbpedia:President . # TP1
?president dbpedia:nationality dbpedia:United_States . # TP2
?president dbpedia:party ?party . # TP3
}
SERVICE <S4> { ?x nyt:topicPage ?page . } # TP4
{ SERVICE <S1> { ?x owl:sameAs ?president . } } # TP5
UNION
{ SERVICE <S2> { ?x owl:sameAs ?president . } } # TP5
UNION
{ SERVICE <S4> { ?x owl:sameAs ?president . } } # TP5
}
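The two rewriting rules can be sketched as a small Python function that takes the source selection result and emits a SPARQL 1.1 query. This is an illustrative sketch under stated assumptions, not BioFed's code: `rewrite_query` is a hypothetical name, triple patterns are passed as plain strings, and S1, S2 and S4 stand in for real endpoint URLs as on the slides.

```python
from typing import Dict, List, Tuple


def rewrite_query(select_clause: str,
                  patterns: List[Tuple[str, List[str]]]) -> str:
    """Add SERVICE clauses to a query given (triple pattern, relevant sources) pairs."""
    exclusive: Dict[str, List[str]] = {}       # source -> patterns only it can answer
    shared: List[Tuple[str, List[str]]] = []   # patterns with several relevant sources

    for text, sources in patterns:
        if not sources:
            raise ValueError(f"no relevant source for pattern: {text}")
        if len(sources) == 1:
            exclusive.setdefault(sources[0], []).append(text)
        else:
            shared.append((text, sources))

    body: List[str] = []
    # Rule 1: group patterns that share the same single relevant source.
    for source, texts in exclusive.items():
        body.append(f"  SERVICE <{source}> {{ " + " ".join(texts) + " }")
    # Rule 2: one SERVICE branch per relevant source, joined with UNION.
    for text, sources in shared:
        branches = [f"{{ SERVICE <{s}> {{ {text} }} }}" for s in sources]
        body.append("  " + " UNION ".join(branches))

    return f"{select_clause}\nWHERE {{\n" + "\n".join(body) + "\n}"


# Source selection result from the slides; S1, S2, S4 are placeholder endpoints.
PATTERNS = [
    ("?president rdf:type dbpedia:President .", ["S1"]),
    ("?president dbpedia:nationality dbpedia:United_States .", ["S1"]),
    ("?president dbpedia:party ?party .", ["S1"]),
    ("?x nyt:topicPage ?page .", ["S4"]),
    ("?x owl:sameAs ?president .", ["S1", "S2", "S4"]),
]

rewritten = rewrite_query("SELECT ?president ?party ?page", PATTERNS)
```

Grouping the single-source patterns first means each such endpoint is contacted with one sub-query rather than one request per triple pattern.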

COMPARISON ON
LARGERDFBENCH

BIOFED DEMO
http://vmurq09.deri.ie:8007/
