Semantic Web Solutions For Large-Scale Biomedical Data Analytics (SEWEBMEDA)
Workshop at ESWC2017, Portoroz, Slovenia
May 28th, 2017
Federated Query Formulation and Processing through BioFed
Ali Hasnain, Syeda Sana E Zainab, Dure Zehra, Qaiser Mehmood, Muhammad Saleem and Dietrich Rebholz-Schuhmann
OUTLINE
1. Introduction
2. BioFed query processing
   • Source selection
   • Query re-writing
3. Evaluation
4. BioFed demo
INTRODUCTION
• Linked, decentralized and distributed architecture
• 9,960 datasets
• ~150B triples
• Complex information needs
• Need for federated queries
INTRODUCTION: EXAMPLE
Return the party membership and news pages about all US presidents.
• One source: party memberships, US presidents
• Another source: US presidents, news pages
• Computing the results requires data from both sources
BIOFED: QUERY PROCESSING
[Figure: BioFed engine architecture]
• Parse Query: get individual triple patterns
• Source Selection: identify relevant sources (using the Road Map)
• SERVICE Annotation: rewrite query, i.e., add SPARQL SERVICE clauses
• Federator Optimizer: generate optimized query execution plan
• Execute sub-queries
• Integrator: integrate sub-query results
BIOFED: SOURCE SELECTION
Two-step, triple pattern-wise source selection:
1. Road Map lookup for the predicate of each triple pattern
   • Select the sources that contain the predicate
   • Select all sources if the predicate is unbound
2. If the subject or object of the triple pattern is bound
   • Send a SPARQL ASK query for the complete triple pattern to each source selected in step 1
   • Prune the sources that return false for the SPARQL ASK query
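The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_sources`, the dictionary-based Road Map, the `fake_ask` stand-in for a remote ASK call, and the toy data are all hypothetical and are not taken from the BioFed implementation.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# (subject, predicate, object); None marks an unbound position (a variable).
TriplePattern = Tuple[Optional[str], Optional[str], Optional[str]]


def select_sources(
    pattern: TriplePattern,
    road_map: Dict[str, Set[str]],
    all_sources: Set[str],
    ask: Callable[[str, TriplePattern], bool],
) -> List[str]:
    """Two-step, triple pattern-wise source selection (illustrative only)."""
    subject, predicate, obj = pattern

    # Step 1: Road Map lookup on the predicate.
    if predicate is None:
        candidates = set(all_sources)  # unbound predicate: keep every source
    else:
        candidates = set(road_map.get(predicate, set()))

    # Step 2: only when subject or object is bound, prune with ASK.
    if subject is not None or obj is not None:
        candidates = {s for s in candidates if ask(s, pattern)}

    return sorted(candidates)


# Toy data, assumed for illustration (not the real Road Map).
ALL_SOURCES = {"S1", "S2", "S3", "S4"}
ROAD_MAP = {
    "rdf:type": {"S1", "S2", "S3", "S4"},
    "owl:sameAs": {"S1", "S2", "S4"},
    "nyt:topicPage": {"S4"},
}


def fake_ask(source: str, pattern: TriplePattern) -> bool:
    # Stand-in for a remote SPARQL ASK call: pretend only S1 holds
    # triples whose object is dbpedia:President.
    return source == "S1" and pattern[2] == "dbpedia:President"


tp1 = (None, "rdf:type", "dbpedia:President")
tp5 = (None, "owl:sameAs", None)
print(select_sources(tp1, ROAD_MAP, ALL_SOURCES, fake_ask))  # ['S1']
print(select_sources(tp5, ROAD_MAP, ALL_SOURCES, fake_ask))  # ['S1', 'S2', 'S4']
```

Passing the ASK call in as a function keeps the sketch runnable without network access; a real engine would issue the query against each candidate endpoint instead.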
BIOFED: SOURCE SELECTION
FedBench (LD3): Return for all US presidents their party membership and news pages about them.
SELECT ?president ?party ?page
WHERE {
  ?president rdf:type dbpedia:President . # TP1
  ?president dbpedia:nationality dbpedia:United_States . # TP2
  ?president dbpedia:party ?party . # TP3
  ?x nyt:topicPage ?page . # TP4
  ?x owl:sameAs ?president . # TP5
}
Sources: S1 = DBpedia RDF, S2 = KEGG RDF, S3 = ChEBI RDF, S4 = NYT RDF
Source Selection Algorithm: triple pattern-wise source selection
Step 1: Road Map lookup for rdf:type
TP1 = {S1, S2, S3, S4}
BIOFED: SOURCE SELECTION
Step 1 candidates for TP1: {S1, S2, S3, S4}
Step 2: Prune step 1 sources using SPARQL ASK queries
ASK { ?president rdf:type dbpedia:President }
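A minimal sketch of the ASK pruning step, assuming the standard SPARQL 1.1 Protocol (query passed as the `query` parameter of a GET request) and the SPARQL JSON results format, in which an ASK answer carries a `boolean` member. The helper names and the `http://example.org/sparql` endpoint are placeholders, not part of BioFed; no network call is made here.

```python
import json
from urllib.parse import urlencode


def build_ask_query(subject: str, predicate: str, obj: str) -> str:
    """Wrap one complete triple pattern in a SPARQL ASK query.

    Prefixed names such as rdf:type still need matching PREFIX
    declarations before a real endpoint will accept the query.
    """
    return f"ASK {{ {subject} {predicate} {obj} }}"


def ask_request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL carrying the query string."""
    return endpoint + "?" + urlencode({"query": query})


def parse_ask_response(body: str) -> bool:
    """Read the 'boolean' member of a SPARQL JSON results document."""
    return bool(json.loads(body)["boolean"])


query = build_ask_query("?president", "rdf:type", "dbpedia:President")
# Placeholder endpoint, assumed for illustration only.
url = ask_request_url("http://example.org/sparql", query)
# A source answering false would be pruned from the candidate set.
keep_source = parse_ask_response('{"head": {}, "boolean": false}')
```

A source is kept only when the parsed answer is true, which matches the pruning rule stated for step 2.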
BIOFED: SOURCE SELECTION
Triple pattern-wise source selection:
TP1 = {S1}
MOTIVATION: SOURCE SELECTION
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}
MOTIVATION: SOURCE SELECTION
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}
MOTIVATION: SOURCE SELECTION
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}
MOTIVATION: SOURCE SELECTION
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}
BIOFED: QUERY RE-WRITING
SPARQL 1.0 to SPARQL 1.1 conversion
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}
SELECT ?president ?party ?page
WHERE {
  ?president rdf:type dbpedia:President . # TP1
  ?president dbpedia:nationality dbpedia:United_States . # TP2
  ?president dbpedia:party ?party . # TP3
  ?x nyt:topicPage ?page . # TP4
  ?x owl:sameAs ?president . # TP5
}
BIOFED: QUERY RE-WRITING
SPARQL 1.0 to SPARQL 1.1 conversion
• Combine triple patterns that share the same single relevant source
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}
SELECT ?president ?party ?page
WHERE {
  SERVICE <S1> {
    ?president rdf:type dbpedia:President . # TP1
    ?president dbpedia:nationality dbpedia:United_States . # TP2
    ?president dbpedia:party ?party . # TP3
  }
  SERVICE <S4> { ?x nyt:topicPage ?page . } # TP4
  ?x owl:sameAs ?president . # TP5
}
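The grouping rule can be sketched as follows. This is a hedged illustration, not BioFed's code: `group_exclusive_patterns` and `render_service` are hypothetical names, triple patterns are plain strings, and `<S1>`/`<S4>` are the same placeholder endpoint names used on the slide.

```python
from typing import Dict, List, Tuple


def group_exclusive_patterns(
    patterns: List[str], selected: List[List[str]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group patterns whose selected source list has exactly one entry.

    Returns (groups, remaining): groups maps a source to its patterns in
    query order; remaining holds every pattern with zero or several sources.
    """
    groups: Dict[str, List[str]] = {}
    remaining: List[str] = []
    for pattern, sources in zip(patterns, selected):
        if len(sources) == 1:
            groups.setdefault(sources[0], []).append(pattern)
        else:
            remaining.append(pattern)
    return groups, remaining


def render_service(source: str, patterns: List[str]) -> str:
    """Render one SERVICE clause holding all patterns for a source."""
    return f"SERVICE <{source}> {{ {' '.join(patterns)} }}"


PATTERNS = [
    "?president rdf:type dbpedia:President .",
    "?president dbpedia:nationality dbpedia:United_States .",
    "?president dbpedia:party ?party .",
    "?x nyt:topicPage ?page .",
    "?x owl:sameAs ?president .",
]
# Source selection result from the slides: TP1-TP3 -> S1, TP4 -> S4,
# TP5 -> S1, S2, S4.
SELECTED = [["S1"], ["S1"], ["S1"], ["S4"], ["S1", "S2", "S4"]]

groups, remaining = group_exclusive_patterns(PATTERNS, SELECTED)
for source, grouped in groups.items():
    print(render_service(source, grouped))
```

Grouping by source means the three DBpedia patterns travel to S1 as one sub-query, so their join is evaluated remotely rather than by the federator.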
BIOFED: QUERY RE-WRITING
SPARQL 1.0 to SPARQL 1.1 conversion
• Combine triple patterns that share the same single relevant source
• Use UNION and SERVICE for triple patterns with more than one relevant source
Triple pattern-wise source selection:
TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}
SELECT ?president ?party ?page
WHERE {
  SERVICE <S1> {
    ?president rdf:type dbpedia:President . # TP1
    ?president dbpedia:nationality dbpedia:United_States . # TP2
    ?president dbpedia:party ?party . # TP3
  }
  SERVICE <S4> { ?x nyt:topicPage ?page . } # TP4
  { SERVICE <S1> { ?x owl:sameAs ?president . } } # TP5
  UNION
  { SERVICE <S2> { ?x owl:sameAs ?president . } } # TP5
  UNION
  { SERVICE <S4> { ?x owl:sameAs ?president . } } # TP5
}
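For a triple pattern with several relevant sources, the UNION-of-SERVICE rewriting could be generated along these lines. The function name `union_of_services` is an assumption made for illustration, and the source names are the slide's placeholders rather than real endpoint IRIs.

```python
from typing import List


def union_of_services(pattern: str, sources: List[str]) -> str:
    """Send one triple pattern to every relevant source and UNION the results."""
    if not sources:
        raise ValueError("a triple pattern needs at least one relevant source")
    # Each branch is a group graph pattern wrapping a single SERVICE clause.
    branches = [f"{{ SERVICE <{s}> {{ {pattern} }} }}" for s in sources]
    return " UNION ".join(branches)


tp5_rewritten = union_of_services("?x owl:sameAs ?president .", ["S1", "S2", "S4"])
print(tp5_rewritten)
```

With a single source the function returns one braced SERVICE group and no UNION keyword, which is still valid inside a WHERE clause.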
COMPARISON ON LARGERDFBENCH
[Figure: comparison on LargeRDFBench (two slides)]
http://vmurq09.deri.ie:8007/
THANK YOU
20

More Related Content

What's hot

Semantic Web
Semantic WebSemantic Web
Semantic Webhardchiu
 
1 bioline & t space or2013 final
1 bioline & t space or2013 final1 bioline & t space or2013 final
1 bioline & t space or2013 finalKellliBee
 
Annotations as Linked Data with Fedora4 and Triannon
Annotations as Linked Data with Fedora4 and TriannonAnnotations as Linked Data with Fedora4 and Triannon
Annotations as Linked Data with Fedora4 and TriannonRobert Sanderson
 
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...Alasdair Gray
 
Taking TL-2 Online: A Linked Data Resource
Taking TL-2 Online: A Linked Data ResourceTaking TL-2 Online: A Linked Data Resource
Taking TL-2 Online: A Linked Data ResourceMartin Kalfatovic
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDFNarni Rajesh
 
Rdf In A Nutshell V1
Rdf In A Nutshell V1Rdf In A Nutshell V1
Rdf In A Nutshell V1Fabien Gandon
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataFabien Gandon
 
Building a Linked Open Data Set
Building a Linked Open Data SetBuilding a Linked Open Data Set
Building a Linked Open Data SetJoel Richard
 
Search Data
Search DataSearch Data
Search DataWarawut
 
Supporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life SciencesSupporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life SciencesAlasdair Gray
 
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...Fariz Darari
 
when the link makes sense
when the link makes sensewhen the link makes sense
when the link makes senseFabien Gandon
 
In grammars we trust: LeadMine, a knowledge driven solution
In grammars we trust: LeadMine, a knowledge driven solutionIn grammars we trust: LeadMine, a knowledge driven solution
In grammars we trust: LeadMine, a knowledge driven solutionNextMove Software
 

What's hot (19)

Semantic Web
Semantic WebSemantic Web
Semantic Web
 
Data in RDF
Data in RDFData in RDF
Data in RDF
 
1 bioline & t space or2013 final
1 bioline & t space or2013 final1 bioline & t space or2013 final
1 bioline & t space or2013 final
 
Ontologies in RDF-S/OWL
Ontologies in RDF-S/OWLOntologies in RDF-S/OWL
Ontologies in RDF-S/OWL
 
Annotations as Linked Data with Fedora4 and Triannon
Annotations as Linked Data with Fedora4 and TriannonAnnotations as Linked Data with Fedora4 and Triannon
Annotations as Linked Data with Fedora4 and Triannon
 
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
 
RDF Transformations
RDF TransformationsRDF Transformations
RDF Transformations
 
Taking TL-2 Online: A Linked Data Resource
Taking TL-2 Online: A Linked Data ResourceTaking TL-2 Online: A Linked Data Resource
Taking TL-2 Online: A Linked Data Resource
 
Name That Graph !
Name That Graph !Name That Graph !
Name That Graph !
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDF
 
Rdf In A Nutshell V1
Rdf In A Nutshell V1Rdf In A Nutshell V1
Rdf In A Nutshell V1
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked Data
 
Building a Linked Open Data Set
Building a Linked Open Data SetBuilding a Linked Open Data Set
Building a Linked Open Data Set
 
Search Data
Search DataSearch Data
Search Data
 
Supporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life SciencesSupporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life Sciences
 
Querying Bio2RDF data
Querying Bio2RDF dataQuerying Bio2RDF data
Querying Bio2RDF data
 
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...
Poster - Completeness Statements about RDF Data Sources and Their Use for Qu...
 
when the link makes sense
when the link makes sensewhen the link makes sense
when the link makes sense
 
In grammars we trust: LeadMine, a knowledge driven solution
In grammars we trust: LeadMine, a knowledge driven solutionIn grammars we trust: LeadMine, a knowledge driven solution
In grammars we trust: LeadMine, a knowledge driven solution
 

Similar to Federated Query Formulation and Processing through BioFed

2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod GmodJun Zhao
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialAdonisDamian
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02eswcsummerschool
 
Processing Life Science Data at Scale - using Semantic Web Technologies
Processing Life Science Data at Scale - using Semantic Web TechnologiesProcessing Life Science Data at Scale - using Semantic Web Technologies
Processing Life Science Data at Scale - using Semantic Web TechnologiesSyed Muhammad Ali Hasnain
 
ParlBench: a SPARQL-benchmark for electronic publishing applications.
ParlBench: a SPARQL-benchmark for electronic publishing applications.ParlBench: a SPARQL-benchmark for electronic publishing applications.
ParlBench: a SPARQL-benchmark for electronic publishing applications.Tatiana Tarasova
 
Sustainable queryable access to Linked Data
Sustainable queryable access to Linked DataSustainable queryable access to Linked Data
Sustainable queryable access to Linked DataRuben Verborgh
 
The Lonesome LOD Cloud
The Lonesome LOD CloudThe Lonesome LOD Cloud
The Lonesome LOD CloudRuben Verborgh
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013François Belleau
 
List.MID: A MIDI-Based Benchmark for RDF Lists
List.MID: A MIDI-Based Benchmark for RDF ListsList.MID: A MIDI-Based Benchmark for RDF Lists
List.MID: A MIDI-Based Benchmark for RDF ListsAlbert Meroño-Peñuela
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutMediaMixerCommunity
 
SPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesSPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesBasil Ell
 
2010 06 ipaw_prv
2010 06 ipaw_prv2010 06 ipaw_prv
2010 06 ipaw_prvJun Zhao
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelabCAMELIA BOBAN
 
MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates
MULDER: Querying the Linked Data Web by Bridging RDF Molecule TemplatesMULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates
MULDER: Querying the Linked Data Web by Bridging RDF Molecule TemplatesKemele M. Endris
 
Querying data on the Web – client or server?
Querying data on the Web – client or server?Querying data on the Web – client or server?
Querying data on the Web – client or server?Ruben Verborgh
 

Similar to Federated Query Formulation and Processing through BioFed (20)

2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod Gmod
 
Querying Linked Data
Querying Linked DataQuerying Linked Data
Querying Linked Data
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorial
 
Sparql
SparqlSparql
Sparql
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02
 
Processing Life Science Data at Scale - using Semantic Web Technologies
Processing Life Science Data at Scale - using Semantic Web TechnologiesProcessing Life Science Data at Scale - using Semantic Web Technologies
Processing Life Science Data at Scale - using Semantic Web Technologies
 
Linked Data Fragments
Linked Data FragmentsLinked Data Fragments
Linked Data Fragments
 
ParlBench: a SPARQL-benchmark for electronic publishing applications.
ParlBench: a SPARQL-benchmark for electronic publishing applications.ParlBench: a SPARQL-benchmark for electronic publishing applications.
ParlBench: a SPARQL-benchmark for electronic publishing applications.
 
Sustainable queryable access to Linked Data
Sustainable queryable access to Linked DataSustainable queryable access to Linked Data
Sustainable queryable access to Linked Data
 
The Lonesome LOD Cloud
The Lonesome LOD CloudThe Lonesome LOD Cloud
The Lonesome LOD Cloud
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
List.MID: A MIDI-Based Benchmark for RDF Lists
List.MID: A MIDI-Based Benchmark for RDF ListsList.MID: A MIDI-Based Benchmark for RDF Lists
List.MID: A MIDI-Based Benchmark for RDF Lists
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playout
 
SPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queriesSPARTIQULATION - Verbalizing SPARQL queries
SPARTIQULATION - Verbalizing SPARQL queries
 
2010 06 ipaw_prv
2010 06 ipaw_prv2010 06 ipaw_prv
2010 06 ipaw_prv
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelab
 
MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates
MULDER: Querying the Linked Data Web by Bridging RDF Molecule TemplatesMULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates
MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates
 
Querying data on the Web – client or server?
Querying data on the Web – client or server?Querying data on the Web – client or server?
Querying data on the Web – client or server?
 
Democratizing Big Semantic Data management
Democratizing Big Semantic Data managementDemocratizing Big Semantic Data management
Democratizing Big Semantic Data management
 
inteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access FrameworkinteSearch: An Intelligent Linked Data Information Access Framework
inteSearch: An Intelligent Linked Data Information Access Framework
 

More from Syed Muhammad Ali Hasnain

Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Syed Muhammad Ali Hasnain
 
SHARP: Harmonizing cross-workflow Provenance
SHARP: Harmonizing cross-workflow ProvenanceSHARP: Harmonizing cross-workflow Provenance
SHARP: Harmonizing cross-workflow ProvenanceSyed Muhammad Ali Hasnain
 
SHARP: Harmonizing Galaxy and Taverna workflow provenance
SHARP: Harmonizing Galaxy and Taverna workflow provenanceSHARP: Harmonizing Galaxy and Taverna workflow provenance
SHARP: Harmonizing Galaxy and Taverna workflow provenanceSyed Muhammad Ali Hasnain
 
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...Syed Muhammad Ali Hasnain
 
An Approach for Discovering and Exploring Semantic Relationships between Genes
An Approach for Discovering and Exploring Semantic Relationships between GenesAn Approach for Discovering and Exploring Semantic Relationships between Genes
An Approach for Discovering and Exploring Semantic Relationships between GenesSyed Muhammad Ali Hasnain
 
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudA Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudSyed Muhammad Ali Hasnain
 
Improving discovery in Life Sciences Linked Open Data Cloud
Improving discovery in Life Sciences Linked Open Data CloudImproving discovery in Life Sciences Linked Open Data Cloud
Improving discovery in Life Sciences Linked Open Data CloudSyed Muhammad Ali Hasnain
 
Knowledge Processing with Big Data and Semantic Web Technologies
Knowledge Processing with Big Data and  Semantic Web TechnologiesKnowledge Processing with Big Data and  Semantic Web Technologies
Knowledge Processing with Big Data and Semantic Web TechnologiesSyed Muhammad Ali Hasnain
 
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionFedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionSyed Muhammad Ali Hasnain
 

More from Syed Muhammad Ali Hasnain (10)

Fair data vs 5 star open data final
Fair data vs 5 star open data finalFair data vs 5 star open data final
Fair data vs 5 star open data final
 
Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...Quantifying the content of biomedical semantic resources as a core for drug d...
Quantifying the content of biomedical semantic resources as a core for drug d...
 
SHARP: Harmonizing cross-workflow Provenance
SHARP: Harmonizing cross-workflow ProvenanceSHARP: Harmonizing cross-workflow Provenance
SHARP: Harmonizing cross-workflow Provenance
 
SHARP: Harmonizing Galaxy and Taverna workflow provenance
SHARP: Harmonizing Galaxy and Taverna workflow provenanceSHARP: Harmonizing Galaxy and Taverna workflow provenance
SHARP: Harmonizing Galaxy and Taverna workflow provenance
 
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Doc...
 
An Approach for Discovering and Exploring Semantic Relationships between Genes
An Approach for Discovering and Exploring Semantic Relationships between GenesAn Approach for Discovering and Exploring Semantic Relationships between Genes
An Approach for Discovering and Exploring Semantic Relationships between Genes
 
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data CloudA Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud
 
Improving discovery in Life Sciences Linked Open Data Cloud
Improving discovery in Life Sciences Linked Open Data CloudImproving discovery in Life Sciences Linked Open Data Cloud
Improving discovery in Life Sciences Linked Open Data Cloud
 
Knowledge Processing with Big Data and Semantic Web Technologies
Knowledge Processing with Big Data and  Semantic Web TechnologiesKnowledge Processing with Big Data and  Semantic Web Technologies
Knowledge Processing with Big Data and Semantic Web Technologies
 
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and ExecutionFedViz: A Visual Interface for SPARQL Queries Formulation and Execution
FedViz: A Visual Interface for SPARQL Queries Formulation and Execution
 

Recently uploaded

TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsTransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsSérgio Sacani
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.Cherry
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxDiariAli
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCherry
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Cherry
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry Areesha Ahmad
 
Plasmid: types, structure and functions.
Plasmid: types, structure and functions.Plasmid: types, structure and functions.
Plasmid: types, structure and functions.Cherry
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspectsmuralinath2
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...Scintica Instrumentation
 
Concept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdfConcept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdfCherry
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
Pteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecyclePteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecycleCherry
 
COMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demeritsCOMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demeritsCherry
 
FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.FS P2 COMBO MSTA LAST PUSH past exam papers.
FS P2 COMBO MSTA LAST PUSH past exam papers.takadzanijustinmaime
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Cherry
 
Efficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationEfficient spin-up of Earth System Models usingsequence acceleration
Efficient spin-up of Earth System Models usingsequence accelerationSérgio Sacani
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationAreesha Ahmad
 

Recently uploaded (20)

TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsTransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 

Federated Query Formulation and Processing through BioFed

  • 1. Semantic Web Solutions For Large-Scale Biomedical Data Analytics (SEWEBMEDA), Workshop at ESWC2017, Portoroz, Slovenia, May 28th, 2017. Federated Query Formulation and Processing through BioFed. Ali Hasnain, Syeda Sana E Zainab, Dure Zehra, Qaiser Mehmood, Muhammad Saleem and Dietrich Rebholz-Schuhmann
  • 2. OUTLINE: 1. Introduction; 2. BioFed query processing (source selection, query re-writing); 3. Evaluation; 4. BioFed demo
  • 3. INTRODUCTION: Linked, decentralized and distributed architecture; 9,960 datasets; ~150B triples; complex information needs; need for federated queries
  • 4. INTRODUCTION: EXAMPLE. Return the party membership and news pages about all US presidents. One source holds party memberships and US presidents; the other holds US presidents and news pages. Computing the results requires data from both sources.
  • 5. BIOFED: QUERY PROCESSING. [Architecture diagram of the BioFed Engine. Components: Parse Query, Source Selection, Road Map, SERVICE Annotation, Federator, Optimizer, Integrator. Labelled steps: get individual triple patterns; identify relevant sources; rewrite query, i.e., add SPARQL SERVICE clauses; generate optimized query execution plan; execute sub-queries; integrate sub-query results.]
  • 6. BIOFED: SOURCE SELECTION. Two-step, triple pattern-wise source selection: (1) Road Map lookup for the predicate of each triple pattern: select the sources that contain the predicate, or all sources if the predicate is unbound. (2) If the subject or object of the triple pattern is bound: send a SPARQL ASK query for the complete triple pattern to each source selected in step 1, and prune the sources that return false.
  • 7. BIOFED: SOURCE SELECTION. FedBench (LD3): Return for all US presidents their party membership and news pages about them.
      SELECT ?president ?party ?page
      WHERE {
        ?president rdf:type dbpedia:President .                  # TP1
        ?president dbpedia:nationality dbpedia:United_States .   # TP2
        ?president dbpedia:party ?party .                        # TP3
        ?x nyt:topicPage ?page .                                 # TP4
        ?x owl:sameAs ?president .                               # TP5
      }
      Sources: S1 = DBpedia RDF, S2 = KEGG RDF, S3 = ChEBI RDF, S4 = NYT RDF.
      Source selection algorithm (triple pattern-wise), step 1: Road Map lookup for rdf:type gives TP1 = {S1, S2, S3, S4}.
  • 8. BIOFED: SOURCE SELECTION. Same FedBench (LD3) query and sources S1 to S4 as on slide 7. Step 2: prune the step 1 sources for TP1 = {S1, S2, S3, S4} by sending each of them a SPARQL ASK query: ASK { ?president rdf:type dbpedia:President }
  • 9. BIOFED: SOURCE SELECTION. Same query and sources as on slide 7. Triple pattern-wise source selection after pruning: TP1 = {S1}.
  • 10. MOTIVATION: SOURCE SELECTION. Same query and sources as on slide 7. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}.
  • 11. MOTIVATION: SOURCE SELECTION. Same query and sources as on slide 7. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}.
  • 12. MOTIVATION: SOURCE SELECTION. Same query and sources as on slide 7. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}.
  • 13. MOTIVATION: SOURCE SELECTION. Same query and sources as on slide 7. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}.
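The two-step selection walked through on slides 6 to 13 can be sketched in Python as follows. This is a minimal illustration, not BioFed's implementation: the `select_sources` function, the in-memory `road_map` dictionary, and the `toy_ask` stand-in for a remote SPARQL ASK request are all assumptions made for the example, and the toy data is chosen only to reproduce the source sets shown on slide 13.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# A triple pattern is (subject, predicate, object); None marks an unbound variable.
TriplePattern = Tuple[Optional[str], Optional[str], Optional[str]]


def select_sources(
    patterns: List[TriplePattern],
    road_map: Dict[str, Set[str]],
    all_sources: Set[str],
    ask: Callable[[str, TriplePattern], bool],
) -> List[Set[str]]:
    """Return the set of relevant sources for each triple pattern."""
    selected = []
    for s, p, o in patterns:
        # Step 1: road map lookup on the predicate; an unbound predicate selects all sources.
        candidates = set(all_sources) if p is None else set(road_map.get(p, set()))
        # Step 2: if subject or object is bound, keep only sources whose ASK probe is true.
        if s is not None or o is not None:
            candidates = {src for src in candidates if ask(src, (s, p, o))}
        selected.append(candidates)
    return selected


# Toy data (assumed for illustration): the LD3 triple patterns TP1..TP5.
patterns: List[TriplePattern] = [
    (None, "rdf:type", "dbpedia:President"),
    (None, "dbpedia:nationality", "dbpedia:United_States"),
    (None, "dbpedia:party", None),
    (None, "nyt:topicPage", None),
    (None, "owl:sameAs", None),
]
# Hypothetical road map: predicate -> sources that contain it.
road_map = {
    "rdf:type": {"S1", "S2", "S3", "S4"},
    "dbpedia:nationality": {"S1"},
    "dbpedia:party": {"S1"},
    "nyt:topicPage": {"S4"},
    "owl:sameAs": {"S1", "S2", "S4"},
}
# Stand-in for remote ASK requests: the (source, pattern) pairs that would answer true.
answers_true = {("S1", patterns[0]), ("S1", patterns[1])}


def toy_ask(source: str, pattern: TriplePattern) -> bool:
    return (source, pattern) in answers_true


result = select_sources(patterns, road_map, {"S1", "S2", "S3", "S4"}, toy_ask)
for i, srcs in enumerate(result, start=1):
    print(f"TP{i} = {sorted(srcs)}")
```

In this sketch only TP1 and TP2 trigger ASK probes, because they are the only patterns with a bound subject or object; TP3 to TP5 keep whatever the road map lookup returns.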
  • 14. BIOFED: QUERY RE-WRITING. SPARQL 1.0 to SPARQL 1.1 conversion. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}.
      SELECT ?president ?party ?page
      WHERE {
        ?president rdf:type dbpedia:President .                  # TP1
        ?president dbpedia:nationality dbpedia:United_States .   # TP2
        ?president dbpedia:party ?party .                        # TP3
        ?x nyt:topicPage ?page .                                 # TP4
        ?x owl:sameAs ?president .                               # TP5
      }
  • 15. BIOFED: QUERY RE-WRITING. SPARQL 1.0 to SPARQL 1.1 conversion. Rule: combine triple patterns that share the same single relevant source. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}.
      SELECT ?president ?party ?page
      WHERE {
        SERVICE <S1> {
          ?president rdf:type dbpedia:President .                  # TP1
          ?president dbpedia:nationality dbpedia:United_States .   # TP2
          ?president dbpedia:party ?party .                        # TP3
        }
        SERVICE <S4> { ?x nyt:topicPage ?page . }                  # TP4
        ?x owl:sameAs ?president .                                 # TP5
      }
  • 16. BIOFED: QUERY RE-WRITING. SPARQL 1.0 to SPARQL 1.1 conversion. Rules: combine triple patterns that share the same single relevant source; use UNION and SERVICE for triple patterns with more than one relevant source. Triple pattern-wise source selection: TP1 = {S1}, TP2 = {S1}, TP3 = {S1}, TP4 = {S4}, TP5 = {S1, S2, S4}.
      SELECT ?president ?party ?page
      WHERE {
        SERVICE <S1> {
          ?president rdf:type dbpedia:President .                  # TP1
          ?president dbpedia:nationality dbpedia:United_States .   # TP2
          ?president dbpedia:party ?party .                        # TP3
        }
        SERVICE <S4> { ?x nyt:topicPage ?page . }                  # TP4
        { SERVICE <S1> { ?x owl:sameAs ?president . } }            # TP5
        UNION { SERVICE <S2> { ?x owl:sameAs ?president . } }      # TP5
        UNION { SERVICE <S4> { ?x owl:sameAs ?president . } }      # TP5
      }
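The two rewriting rules on slides 15 and 16 can be sketched in Python as follows. This is only an illustration under stated assumptions, not BioFed's code: the `rewrite_query` function and its string-based inputs are hypothetical, and `S1`, `S2`, `S4` are the slide's placeholders, which in practice would be SPARQL endpoint URLs.

```python
from typing import Dict, List, Tuple


def rewrite_query(projection: str, patterns: List[str], sources: List[List[str]]) -> str:
    """Wrap triple patterns in SERVICE clauses according to their relevant sources."""
    exclusive: Dict[str, List[str]] = {}      # source -> patterns only that source answers
    shared: List[Tuple[str, List[str]]] = []  # patterns with more than one relevant source
    for tp, srcs in zip(patterns, sources):
        if not srcs:
            raise ValueError(f"no relevant source for pattern: {tp}")
        if len(srcs) == 1:
            # Rule 1: group patterns that share the same single relevant source.
            exclusive.setdefault(srcs[0], []).append(tp)
        else:
            shared.append((tp, srcs))

    lines = [f"SELECT {projection}", "WHERE {"]
    for src, tps in exclusive.items():
        lines.append(f"  SERVICE <{src}> {{")
        lines.extend(f"    {tp}" for tp in tps)
        lines.append("  }")
    for tp, srcs in shared:
        # Rule 2: one SERVICE block per relevant source, combined with UNION.
        branches = [f"{{ SERVICE <{src}> {{ {tp} }} }}" for src in srcs]
        lines.append("  " + "\n  UNION ".join(branches))
    lines.append("}")
    return "\n".join(lines)


# The LD3 triple patterns and the source sets from slide 14 (placeholder source names).
rewritten = rewrite_query(
    "?president ?party ?page",
    [
        "?president rdf:type dbpedia:President .",
        "?president dbpedia:nationality dbpedia:United_States .",
        "?president dbpedia:party ?party .",
        "?x nyt:topicPage ?page .",
        "?x owl:sameAs ?president .",
    ],
    [["S1"], ["S1"], ["S1"], ["S4"], ["S1", "S2", "S4"]],
)
print(rewritten)
```

Grouping the single-source patterns first means TP1 to TP3 reach S1 as one sub-query instead of three, which is the point of the first rule; the UNION branches cover TP5 because any of its three sources may hold matching owl:sameAs links.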