A short intro to PDQ: Proof-driven Querying
Michael Benedikt 
with Julien Leblay, Efi Tsamoura, and Michael Vanden Boom
Background 
DBOnto: Semantics for a better world 
Exploit semantics of data: 
within a single source, among distributed sources, across data models 
• Enable new applications 
• Deliver better performance for current data-intensive tasks 
• Diminish effort in integrating complex data sources
Background 
Dimensions of Semantic Data 
Completeness 
of Sources/Source Access Model 
Target 
Implementation 
Data model 
for queries and constraints
Background 
Semantic Data Technology 
Completeness of Sources/ 
Source Access Model 
Target 
Implementation 
Data model 
for queries and constraints 
Semantic Web 
• RDF data model, description logic constraints 
• Inherently incomplete sources 
• Certain answer semantics 
• Wide range of target implementations
Background 
Semantic Data Technology 
Target 
Implementation 
Data model 
for queries and constraints 
Completeness of Sources/ 
Source Access Model 
Query Optimization 
with Constraints 
• Relational data model and constraints 
• Complete information 
• Access via lookup indices in sources 
• Compile to plan language of DBMS
Background 
Semantic Data Technology 
Target 
Implementation 
Data model 
for queries and constraints 
Completeness of Sources/ 
Source Access Model 
Query Optimization 
with Constraints via Reformulation 
• Relational data model and constraints 
• Complete sources 
• Compile to query language (e.g. SQL)
Background 
Semantic Data Technology 
Target 
Implementation 
Data model 
for queries and constraints 
Completeness of Sources/ 
Source Access Model 
Query Rewriting 
with Exact Views 
• Relational sources and constraints 
• Base data may not be accessible 
• Can still look for exact answers to queries 
• Compile to query language (e.g. SQL)
Background 
Semantic Data Technology 
Target 
Implementation 
Data model 
for queries and constraints 
Completeness of Sources/ 
Source Access Model 
Federated Querying Over Web-based 
Sources 
• Model sources and constraints relationally 
• Complete information on subset of sources 
• Distributed sources with mix of access regimes 
• Compile to middleware plan
Background 
Long-term PDQ vision 
Completeness 
of Sources/Source Access Model 
Target 
Implementation 
Data model 
for queries and constraints 
PDQ
Functionality 
PDQ: what it is today 
System for answering queries Q in the presence of semantic relationships and 
access restrictions on sources 
Targets: 
• Relational data model and constraints
• Sufficient accessible information assumption: there is sufficient accessible data to obtain the exact answers to the query Q
• Compilation into a “static plan” (reformulation, physical plan, middleware plan)
Unified framework for:
• Query Optimization/Reformulation with Constraints
• Querying with Materialized Views
• Federated Querying with Complete Information
Functionality 
PDQ: what it is 
Inputs to the PDQ planner:
• Metadata, including a description D of access to sources and integrity constraints C
• Cost information (e.g. cost function on plans)
• Query Q
Output of the PDQ planner:
• Pbest: plan using the access model described by D with minimal cost, giving the exact answer to Q for databases satisfying constraints C
PDQ runtime: executes plans on top of Web-based or local data sources
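To make the notion of a plan over restricted sources concrete, here is a minimal sketch in Python. It is not PDQ code: the relations `Udirectory` and `Profinfo`, their contents, and the function names are all hypothetical. The assumed access model D is that `Udirectory` can be scanned freely, while `Profinfo` can only be called with an employee id as input. The assumed constraint C is that every employee id in `Profinfo` also appears in `Udirectory`, which is what makes the plan return the exact answer.

```python
# Hypothetical sources (illustration only, not part of PDQ).
UDIRECTORY = [("e1", "Smith"), ("e2", "Jones")]           # (eid, lname)
PROFINFO = {                                              # keyed by eid
    "e1": [("e1", "office12", "Smith")],
    "e2": [("e2", "office07", "Jones")],
}

def access_udirectory():
    """Free access: returns every tuple of Udirectory."""
    return list(UDIRECTORY)

def access_profinfo(eid):
    """Restricted access: an eid must be supplied as input."""
    return PROFINFO.get(eid, [])

def run_plan(lname):
    """Static plan for Q: find (eid, office) of professors with a given last name.

    Step 1: scan Udirectory to obtain candidate eids.
    Step 2: feed each eid into the Profinfo lookup.
    Step 3: keep the tuples whose last name matches.
    """
    answers = set()
    for eid, _ in access_udirectory():
        for e, office, name in access_profinfo(eid):
            if name == lname:
                answers.add((e, office))
    return answers
```

Calling `run_plan("Smith")` on the toy data returns the single matching pair. The plan never scans `Profinfo` directly, so it respects the assumed access restrictions.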
Under the hood 
PDQ: how it works (sort of) 
Key observation: Under the sufficient accessible information assumption 
on Q, C, D there is always a “static plan” (e.g. relational algebra query) PQ 
that can be run to answer Q 
We can find such a PQ by looking for a “proof that there is sufficient 
information to answer Q”. 
• First main component: procedures to turn “proofs of answerability” into plans 
• Proof-to-plan procedure works for extremely rich class of integrity constraints 
• Adaptable to different target implementations (SQL query, physical plan, distributed plan…) 
• These “proof-to-plan” procedures are coupled with a reasoning system 
for finding the proofs of answerability. 
• Plug-in architecture: Chase procedure, Tableau-based FO theorem-prover, …
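The link between a proof of answerability and a plan can be illustrated with a toy forward-chaining procedure. The sketch below is an assumption-laden simplification, not the PDQ reasoner: it hard-codes one hypothetical inclusion constraint, treats the query as a single frozen fact, and records which access methods fire. The order of fired accesses is the skeleton of a plan.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessMethod:
    relation: str
    input_positions: tuple  # positions that must be known before calling

def chase_inclusion(facts):
    """Apply the one hypothetical constraint Profinfo(e, o, l) -> Udirectory(e, l)."""
    derived = set(facts)
    for rel, args in facts:
        if rel == "Profinfo":
            derived.add(("Udirectory", (args[0], args[2])))
    return derived

def find_proof(facts, methods, goal):
    """Fire access methods until the goal fact is accessed.

    Returns the list of relations accessed, in order, or None if no proof exists.
    """
    accessible_values = set()
    accessed = set()
    proof = []
    changed = True
    while changed and goal not in accessed:
        changed = False
        for m in methods:
            for fact in sorted(facts):
                rel, args = fact
                if rel != m.relation or fact in accessed:
                    continue
                # An access can fire only if all its input values are already known.
                if all(args[i] in accessible_values for i in m.input_positions):
                    accessed.add(fact)
                    accessible_values.update(args)
                    proof.append(m.relation)
                    changed = True
    return proof if goal in accessed else None

# Canonical database of the query: one frozen Profinfo fact.
QUERY_FACT = ("Profinfo", ("e", "o", "Smith"))
METHODS = [
    AccessMethod("Profinfo", (0,)),   # needs the eid as input
    AccessMethod("Udirectory", ()),   # free access
]

with_constraint = find_proof(chase_inclusion({QUERY_FACT}), METHODS, QUERY_FACT)
without_constraint = find_proof({QUERY_FACT}, METHODS, QUERY_FACT)
```

With the constraint applied, the proof accesses `Udirectory` first and then `Profinfo`, which reads directly as a plan. Without the constraint no access can fire, so no proof (and hence no plan) is found.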
Under the hood 
PDQ: how it works in a bit more detail 
Inputs to the PDQ planner:
• Metadata, including a description D of access to sources and integrity constraints C
• Query Q
• Cost information (e.g. cost function on plans)
Components of the PDQ planner:
• Reasoning system for finding “proofs of answerability”
• Proof-to-Plan conversion
Under the hood 
PDQ: how it works, still more 
We can find a static plan PQ getting the exact answer to Q by looking for a 
“proof that Q is answerable” and then applying a proof-to-plan procedure. 
Last component – search strategy: we can find a good PQ by searching 
for a proof that 
1. witnesses that Q is answerable 
2. generates a low-cost plan 
Search is directed by proof goal and cost
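One way to picture a search directed by both proof goal and cost is a best-first exploration of partial proofs, ordered by the accumulated cost of the accesses used so far. The sketch below is only an illustration under stated assumptions: the facts, access-method names, and per-access costs are hypothetical, and the uniform-cost strategy shown is a generic technique, not necessarily the strategy PDQ implements.

```python
import heapq
from itertools import count

def cheapest_proof(facts, methods, goal):
    """Uniform-cost search over sets of accessed facts.

    methods: list of (name, relation, input_positions, cost).
    Returns (total_cost, [method names]) or None if the goal is unreachable.
    """
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(0, next(tie), frozenset(), [])]
    seen = set()
    while frontier:
        cost, _, accessed, proof = heapq.heappop(frontier)
        if goal in accessed:
            return cost, proof
        if accessed in seen:
            continue
        seen.add(accessed)
        known = {v for _, args in accessed for v in args}
        for name, relation, inputs, step_cost in methods:
            for fact in facts:
                rel, args = fact
                if (rel == relation and fact not in accessed
                        and all(args[i] in known for i in inputs)):
                    heapq.heappush(
                        frontier,
                        (cost + step_cost, next(tie), accessed | {fact}, proof + [name]),
                    )
    return None

# Hypothetical chased facts and access methods with made-up costs.
GOAL = ("Profinfo", ("e", "o", "Smith"))
FACTS = {GOAL, ("Udirectory", ("e", "Smith")), ("Staff", ("e",))}
METHODS = [
    ("scan_udirectory", "Udirectory", (), 10),
    ("scan_staff", "Staff", (), 2),
    ("lookup_profinfo", "Profinfo", (0,), 1),
]

best = cheapest_proof(FACTS, METHODS, GOAL)
```

Both `scan_udirectory` and `scan_staff` make the employee id available, so two proofs exist. The search returns the one through the cheaper scan, which is the sense in which a single search can serve both goals listed above.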
Under the hood 
PDQ architecture
Status 
PDQ today and tomorrow 
• Theoretical basis given in PODS 2014 paper 
• Demonstration implemented over web services in VLDB 2014 
• Implementation generates SQL reformulation over relational sources (run on top 
of Postgres) 
Moving forward: 
•Pilot project beginning Oct 2014 to explore “native implementation” of PDQ on top 
of the plan language of the LogicBlox DBMS 
•Large EPSRC-funded project 2015-2020 to explore diverse uses of PDQ
Status 
PDQ today and tomorrow 
Completeness 
of Sources/Source Access Model 
Target 
Implementation 
Data model 
for queries and constraints 
PDQ 
2014 
PDQ 
2020
Next Steps 
PDQ: Next Steps 
• More info at http://cs.ox.ac.uk/pdq 
• See the demo!

PDQ: Proof-driven Querying presentation

  • 1. A short intro to PDQ: Proof-driven Querying Michael Benedikt with Julien Leblay, Efi Tsamoura, and Michael Vanden Boom
  • 2. Background DBOnto: Semantics for a better world Exploit semantics of data: within a single source, among distributed sources, across data models • Enable new applications • Deliver better performance for current data-intensive tasks • Diminish effort in integrating complex data sources
  • 3. Background Dimensions of Semantic Data Completeness of Sources/Source Access Model Target Implementation Data model for queries and constraints
  • 4. Background Dimensions of Semantic Data Completeness of Sources/Source Access Model Target Implementation Data model for queries and constraints
  • 5. Background Semantic Data Technology Completeness of Sources/ Source Access Model Target Implementation Data model for queries and constraints Semantic Web • RDF data model, description logic constraints • Inherently incomplete sources • Certain answer semantics • Wide range of target implementations
  • 6. Background Semantic Data Technology Target Implementation Data model for queries and constraints Completeness of Sources/ Source Access Model Query Optimization with Constraints • Relational data model and constraints • Complete information • Access via lookup indices in sources • Compile to plan language of DBMS
  • 7. Background Semantic Data Technology Target Implementation Data model for queries and constraints Completeness of Sources/ Source Access Model Query Optimization with Constraints via Reformulation • Relational data model and constraints • Complete sources • Compile to query language (e.g. SQL)
  • 8. Background Semantic Data Technology Target Implementation Data model for queries and constraints Completeness of Sources/ Source Access Model Query Rewriting with Exact Views • Relational sources and constraints • Base data may not be accessible • Can still look for exact answers to queries • Compile to query language (e.g. SQL)
  • 9. Background Semantic Data Technology Target Implementation Data model for queries and constraints Completeness of Sources/ Source Access Model Federated Querying Over Web-based Sources • Model sources and constraints relationally • Complete information on subset of sources • Distributed sources with mix of access regimes • Compile to middleware plan
  • 10. Background Long-term PDQ vision Completeness of Sources/Source Access Model Target Implementation Data model for queries and constraints PDQ
  • 11. Functionality PDQ: what it is today System for answering queries Q in the presence of semantic relationships and access restrictions on sources. Targets: • Relational data model and constraints • Sufficient accessible information assumption: there is sufficient accessible data to obtain the exact answers to the query Q • Compilation into a “static plan” (reformulation, physical plan, middleware plan) Unified framework for: • Query Optimization/Reformulation with Constraints • Querying with Materialized Views • Federated Querying with Complete Information
  • 12. Functionality PDQ: what it is Inputs to the PDQ planner: metadata, including • D, a description of access to sources • integrity constraints C; cost information (e.g. cost function on plans); query Q. Output: P_best, a plan using the access model described by D with minimal cost giving the exact answer to Q for databases satisfying constraints C. PDQ runtime: executes plans on top of Web-based or local data sources
  • 13. Under the hood PDQ: how it works (sort of) Key observation: under the sufficient accessible information assumption on Q, C, D there is always a “static plan” (e.g. relational algebra query) P_Q that can be run to answer Q. We can find such a P_Q by looking for a “proof that there is sufficient information to answer Q”. • First main component: procedures to turn “proofs of answerability” into plans • Proof-to-plan procedure works for extremely rich class of integrity constraints • Adaptable to different target implementations (SQL query, physical plan, distributed plan…) • These “proof-to-plan” procedures are coupled with a reasoning system for finding the proofs of answerability. • Plug-in architecture: Chase procedure, Tableau-based FO theorem-prover, …
  • 14. Under the hood PDQ: how it works in a bit more detail Inputs: metadata, including • D, a description of access to sources • integrity constraints C; query Q; cost information (e.g. cost function on plans). PDQ planner: reasoning system for finding “proofs of answerability”, coupled with proof-to-plan conversion
  • 15. Under the hood PDQ: how it works, still more We can find a static plan P_Q getting the exact answer to Q by looking for a “proof that Q is answerable” and then applying a proof-to-plan procedure. Last component, the search strategy: we can find a good P_Q by searching for a proof that 1. witnesses that Q is answerable 2. generates a low-cost plan. Search is directed by proof goal and cost
  • 16. Under the hood PDQ architecture
  • 17. Status PDQ today and tomorrow • Theoretical basis given in PODS 2014 paper • Demonstration implemented over web services in VLDB 2014 • Implementation generates SQL reformulation over relational sources (run on top of Postgres) Moving forward: • Pilot project beginning Oct 2014 to explore “native implementation” of PDQ on top of the plan language of the LogicBlox DBMS • Large EPSRC-funded project 2015-2020 to explore diverse uses of PDQ
  • 18. Status PDQ today and tomorrow Completeness of Sources/Source Access Model Target Implementation Data model for queries and constraints PDQ 2014 PDQ 2020
  • 19. Next Steps PDQ: Next Steps • More info at http://cs.ox.ac.uk/pdq • See the demo!
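
The idea on slides 13 and 15, that a plan can be read off a proof of answerability and that the search for the proof is steered by cost, can be illustrated with a toy sketch. This is not PDQ's algorithm or API. It assumes a much simpler setting: no integrity constraints, a conjunctive query given as a list of atoms, and access methods that only state which argument positions must be bound before a call. The relation names `Directory` and `Phone`, the classes `Access` and `Atom`, the function `best_plan`, and all cost figures are hypothetical. Here a "proof" is just an order in which every atom becomes accessible, and the plan is the matching sequence of access calls.

```python
from typing import NamedTuple


class Access(NamedTuple):
    relation: str
    inputs: tuple      # argument positions that must be bound before the call
    cost: float        # hypothetical per-call cost


class Atom(NamedTuple):
    relation: str
    args: tuple        # variable names, one per argument position


def best_plan(atoms, methods, bound=frozenset()):
    """Exhaustively search access orders; return (cost, plan) or None.

    Each step of a successful order shows that one more atom is reachable
    through some access method, so the order doubles as a (toy) proof that
    the query is answerable and as the plan itself.
    """
    best = None

    def search(remaining, known, plan, cost):
        nonlocal best
        if best is not None and cost >= best[0]:
            return  # cost-directed pruning: cannot beat the best plan so far
        if not remaining:
            best = (cost, list(plan))
            return
        for atom in remaining:
            for method in methods:
                if method.relation != atom.relation:
                    continue
                # The method is usable only if all its input positions are bound.
                if all(atom.args[i] in known for i in method.inputs):
                    plan.append((method, atom))
                    search(remaining - {atom}, known | set(atom.args),
                           plan, cost + method.cost)
                    plan.pop()

    search(frozenset(atoms), frozenset(bound), [], 0.0)
    return best


# Hypothetical schema: Directory can be scanned freely; Phone is cheap when
# looked up by office (position 0) and expensive when scanned in full.
methods = [
    Access("Directory", (), 5.0),
    Access("Phone", (0,), 1.0),
    Access("Phone", (), 10.0),
]
query = [Atom("Directory", ("n", "o")), Atom("Phone", ("o", "p"))]

result = best_plan(query, methods)
# With no free access at all, no order exists and the query is not answerable.
unanswerable = best_plan([Atom("Phone", ("o", "p"))], [Access("Phone", (0,), 1.0)])
```

In this sketch the cheapest order scans `Directory` first and then uses the bound office value to look up `Phone`, rather than scanning `Phone` in full. Handling integrity constraints, which is where the chase or a theorem prover comes in on slide 13, is deliberately left out.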