SlideShare a Scribd company logo
1 of 1
Download to read offline
Querying Distributed RDF Graphs: The Effects of Partitioning 
Anthony Potter Boris Motik Ian Horrocks 
Challenges 
Create a distributed, cloud-based DBMS for large 
RDF graphs. The two main challenges are: 
• How to partition data across a cluster 
• How to answer queries over a cluster 
LUBM: Percentage of Local Answers 
System Q2 Q8 Q9 Q11 Q12 Qc 
RDFox 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 
SHAPE 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 
Hash 0.44% 4.96% 0.23% 5.80% 0.00% 0.04% 
SP2B: Percentage of Local Answers 
System Q4 Q5 Q6 Q7 Q8 
RDFox 95.95% 73.00% 99.90% 92.41% 91.45% 
SHAPE 95.23% 9.72% 100.00% 41.97% 73.72% 
Hash 0.01% 0.77% 0.25% 0.08% 0.26% 
Storage Overhead 
RDFox SHAPE Hash 
LUBM 3.60% 84.23% 0.00% 
SP2B 0.60% 38.63% 0.00% 
RDF Graph Partitioning 
Evaluation 
Wildcard * 
Conclusions 
Our approach: 
• Greater percentage of local 
answers compared to 
competitors 
• Minimal storage overhead 
• Recognises local answers 
Future Work 
• Develop efficient query 
answering scheme with 
wildcards 
• Implement distributed 
DBMS on top of RDFox 
Local Answer 
A local answer can be 
evaluated fully on a 
single machine: 
• No network 
communication 
• Fast evaluation Subject Hashing Partition element 1 Graph-based 
Partition element 2 
RDFox: Our approach 
SHAPE: Semantic hash partitioning 
with n-hop duplication 
Hash: Subject hashing 
Aims 
• Maximise number of 
local answers to 
common queries 
• Recognise local 
answers 
• Minimise storage 
overhead 
• Many approaches to partitioning: 
§ Subject hashing 
§ Graph-based 
§ Semantic hashing 
• Partitioning scheme affects the 
number of local answers 
• We propose a novel graph-based 
partitioning scheme: 
§ Minimal storage overhead 
§ Common (star) queries fully 
local 
Introduction of a new 
wildcard resource: 
• Represents all external 
resources 
• Tracks connections 
between partition 
elements 
Used in our novel query 
answering scheme: 
• Reduces the number of 
intermediate answers 
• Recognises local 
answers

More Related Content

Viewers also liked

Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...DBOnto
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paperDBOnto
 
ROSeAnn Presentation
ROSeAnn PresentationROSeAnn Presentation
ROSeAnn PresentationDBOnto
 
SemFacet Poster
SemFacet PosterSemFacet Poster
SemFacet PosterDBOnto
 
Aggregating Semantic Annotators Paper
Aggregating Semantic Annotators PaperAggregating Semantic Annotators Paper
Aggregating Semantic Annotators PaperDBOnto
 
PDQ Poster
PDQ PosterPDQ Poster
PDQ PosterDBOnto
 
PAGOdA paper
PAGOdA paperPAGOdA paper
PAGOdA paperDBOnto
 
PAGOdA Presentation
PAGOdA PresentationPAGOdA Presentation
PAGOdA PresentationDBOnto
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paperDBOnto
 
Optique - poster
Optique - posterOptique - poster
Optique - posterDBOnto
 
DIADEM: domain-centric intelligent automated data extraction methodology Pres...
DIADEM: domain-centric intelligent automated data extraction methodology Pres...DIADEM: domain-centric intelligent automated data extraction methodology Pres...
DIADEM: domain-centric intelligent automated data extraction methodology Pres...DBOnto
 
Welcome by Ian Horrocks
Welcome by Ian HorrocksWelcome by Ian Horrocks
Welcome by Ian HorrocksDBOnto
 
Parallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox PresentationParallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox PresentationDBOnto
 
Query Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperQuery Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperDBOnto
 

Viewers also liked (14)

Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ...
 
SemFacet paper
SemFacet paperSemFacet paper
SemFacet paper
 
ROSeAnn Presentation
ROSeAnn PresentationROSeAnn Presentation
ROSeAnn Presentation
 
SemFacet Poster
SemFacet PosterSemFacet Poster
SemFacet Poster
 
Aggregating Semantic Annotators Paper
Aggregating Semantic Annotators PaperAggregating Semantic Annotators Paper
Aggregating Semantic Annotators Paper
 
PDQ Poster
PDQ PosterPDQ Poster
PDQ Poster
 
PAGOdA paper
PAGOdA paperPAGOdA paper
PAGOdA paper
 
PAGOdA Presentation
PAGOdA PresentationPAGOdA Presentation
PAGOdA Presentation
 
Sem facet paper
Sem facet paperSem facet paper
Sem facet paper
 
Optique - poster
Optique - posterOptique - poster
Optique - poster
 
DIADEM: domain-centric intelligent automated data extraction methodology Pres...
DIADEM: domain-centric intelligent automated data extraction methodology Pres...DIADEM: domain-centric intelligent automated data extraction methodology Pres...
DIADEM: domain-centric intelligent automated data extraction methodology Pres...
 
Welcome by Ian Horrocks
Welcome by Ian HorrocksWelcome by Ian Horrocks
Welcome by Ian Horrocks
 
Parallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox PresentationParallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox Presentation
 
Query Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning PaperQuery Distributed RDF Graphs: The Effects of Partitioning Paper
Query Distributed RDF Graphs: The Effects of Partitioning Paper
 

Recently uploaded

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Querying Distributed RDF Graphs: The Effects of Partitioning Poster

  • 1. Querying Distributed RDF Graphs: The Effects of Partitioning Anthony Potter Boris Motik Ian Horrocks Challenges Create a distributed, cloud-based DBMS for large RDF graphs. The two main challenges are: • How to partition data across a cluster • How to answer queries over a cluster LUBM: Percentage of Local Answers System Q2 Q8 Q9 Q11 Q12 Qc RDFox 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% SHAPE 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% Hash 0.44% 4.96% 0.23% 5.80% 0.00% 0.04% SP2B: Percentage of Local Answers System Q4 Q5 Q6 Q7 Q8 RDFox 95.95% 73.00% 99.90% 92.41% 91.45% SHAPE 95.23% 9.72% 100.00% 41.97% 73.72% Hash 0.01% 0.77% 0.25% 0.08% 0.26% Storage Overhead RDFox SHAPE Hash LUBM 3.60% 84.23% 0.00% SP2B 0.60% 38.63% 0.00% RDF Graph Partitioning Evaluation Wildcard * Conclusions Our approach: • Greater percentage of local answers compared to competitors • Minimal storage overhead • Recognises local answers Future Work • Develop efficient query answering scheme with wildcards • Implement distributed DBMS on top of RDFox Local Answer A local answer can be evaluated fully on a single machine: • No network communication • Fast evaluation Subject Hashing Partition element 1 Graph-based Partition element 2 RDFox: Our approach SHAPE: Semantic hash partitioning with n-hop duplication Hash: Subject hashing Aims • Maximise number of local answers to common queries • Recognise local answers • Minimise storage overhead • Many approaches to partitioning: § Subject hashing § Graph-based § Semantic hashing • Partitioning scheme affects the number of local answers • We propose a novel graph-based partitioning scheme: § Minimal storage overhead § Common (star) queries fully local Introduction of a new wildcard resource: • Represents all external resources • Tracks connections between partition elements Used in our novel query answering scheme: • Reduces the number of intermediate answers • Recognises local answers