Efficient Semantic Querying of Relational Databases with
Resolution
Alexandre Riazanov
RuleML Initiative
CSWWS, May 24, 2009
Semantic Querying of RDB
Main scenario:
• One or more relational databases (RDB). Tables are conceptually treated
as sets (conjunctions) of ground logical assertions:
Table UNIV_DB_TAKES_COURSE        Corresponding ground assertions
STUDENT   COURSE
s1        c1                      UNIV_DB_TAKES_COURSE(s1, c1)
s2        c2                      UNIV_DB_TAKES_COURSE(s2, c2)
...       ...                     ...
• Knowledge bases for the domains: rule sets (Derivational RuleML) and
taxonomies/ontologies (RDFS/OWL):
graduateStudent(X) :- takesCourse(X, C), graduateCourse(C)
• Semantic mapping for RDB schemas:
takesCourse(X, C) :- univ_db_takes_course(X, C)
• User formulates logical queries that have to be answered modulo the KB:
?- graduateStudent(S), memberOf(S, D), suborganization(D, 'UBC').
S = 'JohnSmith', D = 'PsychCS' ;
S = 'MaryTaylor', D = 'MathStatPhys' ;
...
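The scenario above can be sketched in a few lines of Python. This is only an illustration of how a table, the semantic mapping and one KB rule fit together; the rows and the contents of graduate_course are made up, and a real system would reason over the KB rather than evaluate the rules directly.

```python
# Toy data: every row below is hypothetical and chosen only for illustration.
# An RDB table is treated as a set of ground facts.
univ_db_takes_course = {("s1", "c1"), ("s2", "c2")}
graduate_course = {"c1"}  # assumption: supplied by some other table or axiom

# Semantic mapping: takesCourse(X, C) :- univ_db_takes_course(X, C)
takes_course = set(univ_db_takes_course)

# KB rule: graduateStudent(X) :- takesCourse(X, C), graduateCourse(C)
graduate_student = {x for (x, c) in takes_course if c in graduate_course}

# Query: ?- graduateStudent(S)
for s in sorted(graduate_student):
    print("S =", s)
```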
Main Applications of Semantic Querying
Non-programmer interface to RDB:
• financial analysts, biomedical researchers, criminal investigators, patent
examiners, etc., etc., don’t (want to) know RDB programming!
But they are usually able to formulate complex queries in terms of their
domains.
• a step towards querying in NL: logic is closer to English than SQL
• hundreds of thousands of potential end users
Web-scale semantic search:
• precise answers to queries; answers justified by reasoning with ontology
axioms and rules
• probably the main prize in the game: millions of potential end users
• semantic querying of RDB may be a large step in this direction
Outline of the Proposed Technique
• Use a variant of the classical resolution calculus to answer queries.
This allows the use of very expressive knowledge bases and query languages.
• Use resolution in a smart way: find answers in bulk, i.e., find schematic
answers covering whole sets of concrete solutions.
• Convert schematic answers into SQL to leverage the efficiency of highly
optimised RDBMS.
• The technique can be characterised as incremental query rewriting: a single
query can give rise to a potentially infinite set of SQL queries whose union
covers all answers.
——————————————————————————
Very expressive querying at reasonable cost.
Resolution
Main principle:
$$\frac{C \lor A \qquad \neg B \lor D}{(C \lor D)\theta} \quad \text{where } \theta = \mathrm{mgu}(A, B)$$
Procedure: Given a set of clauses, exhaustively apply rules until an empty
clause (contradiction) is found, in which case the original set is inconsistent.
Well studied: powerful and elegant theory, > 40 years of research.
Efficient implementation techniques exist: data structures,
heuristic-driven algorithms, term indexing techniques, etc. Efficient
(general-purpose) implementations: E, Gandalf, Otter, SPASS, Vampire, ...
Ontology-oriented optimisations: chain resolution.
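A minimal sketch of the resolution rule in Python, under assumptions of this example only: variables are strings starting with an uppercase letter, constants are other strings, compound terms and atoms are tuples (functor, arg, ...), and a clause is a list of (is_positive, atom) pairs whose variables are already renamed apart. It is not how the provers listed above are implemented; they rely on term indexing and far better data structures.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings in substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    # Occurs check: does variable v appear in term t under s?
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s=None):
    # Return a most general unifier extending s, or None if there is none.
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    # Apply substitution s to term t.
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(a, s) for a in t[1:])
    return t

def resolve(c1, c2):
    # Yield every binary resolvent (C v D)theta of two variable-disjoint clauses.
    for i, (p1, a) in enumerate(c1):
        for j, (p2, b) in enumerate(c2):
            if p1 != p2:
                s = unify(a, b)
                if s is not None:
                    rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
                    yield [(p, subst(t, s)) for p, t in rest]

# pers(X) v ~grStud(X) resolved with the fact grStud(si)
rule = [(True, ("pers", "X")), (False, ("grStud", "X"))]
fact = [(True, ("grStud", "si"))]
resolvents = list(resolve(rule, fact))
print(resolvents)
```

The sketch resolves on complementary literals in either clause, whereas the rule above is written with the positive literal in the first premise; the two are equivalent up to the order of the premises.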
Query Answering with Resolution
Query: find X. stud(X)   (read as a theorem to prove: ∃X. stud(X))
Input clauses:
  rule pers(X) :- grStud(X)                          pers(X) ∨ ¬grStud(X)
  rule course(X) :- grCourse(X)                      course(X) ∨ ¬grCourse(X)
  rule stud(X) :- takesC(X, C), course(C), pers(X)   stud(X) ∨ ¬takesC(X, Y) ∨ ¬course(Y) ∨ ¬pers(X)
  axiom GrStud ⊑ ∃takesC.GrCourse                    ¬grStud(X) ∨ takesC(X, sk0(X))
                                                     ¬grStud(X) ∨ grCourse(sk0(X))
  DB fact                                            grStud(si)
  negated query                                      ¬stud(X) | ¬answer(X)
Derivation (each clause is a resolvent of the two clauses named after "from"):
  takesC(si, sk0(si))                        from grStud(si) and ¬grStud(X) ∨ takesC(X, sk0(X))
  stud(si) ∨ ¬course(sk0(si)) ∨ ¬pers(si)    from takesC(si, sk0(si)) and the stud rule
  grCourse(sk0(si))                          from grStud(si) and ¬grStud(X) ∨ grCourse(sk0(X))
  course(sk0(si))                            from grCourse(sk0(si)) and the course rule
  stud(si) ∨ ¬pers(si)                       from course(sk0(si)) and stud(si) ∨ ¬course(sk0(si)) ∨ ¬pers(si)
  pers(si)                                   from grStud(si) and the pers rule
  stud(si)                                   from pers(si) and stud(si) ∨ ¬pers(si)
  □ | ¬answer(si)                            from stud(si) and the negated query
Inefficient: reasoning on a per-answer basis!
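Schematic Answer Derivation
Notation: ". . . | gr_stud(X)" means "provided X is in the DB table gr_stud".
Input clauses:
  DB abstraction   grStud(X) | gr_stud(X)
  KB clauses       pers(X) ∨ ¬grStud(X)
                   course(X) ∨ ¬grCourse(X)
                   stud(X) ∨ ¬takesC(X, Y) ∨ ¬course(Y) ∨ ¬pers(X)
                   ¬grStud(X) ∨ takesC(X, sk0(X))
                   ¬grStud(X) ∨ grCourse(sk0(X))
  negated query    ¬stud(X) | ¬answer(X)
Derivation (each clause is a resolvent of the two clauses named after "from"):
  takesC(X, sk0(X)) | gr_stud(X)                        from the DB abstraction and ¬grStud(X) ∨ takesC(X, sk0(X))
  stud(X) ∨ ¬course(sk0(X)) ∨ ¬pers(X) | gr_stud(X)     from takesC(X, sk0(X)) | gr_stud(X) and the stud clause
  grCourse(sk0(X)) | gr_stud(X)                         from the DB abstraction and ¬grStud(X) ∨ grCourse(sk0(X))
  course(sk0(X)) | gr_stud(X)                           from grCourse(sk0(X)) | gr_stud(X) and course(X) ∨ ¬grCourse(X)
  ¬pers(X) ∨ stud(X) | gr_stud(X), gr_stud(X)           from the course(sk0(X)) and stud(X) ∨ ¬course(sk0(X)) ∨ ¬pers(X) clauses
  pers(X) | gr_stud(X)                                  from the DB abstraction and pers(X) ∨ ¬grStud(X)
  stud(X) | gr_stud(X), gr_stud(X), gr_stud(X)          from pers(X) | gr_stud(X) and the ¬pers(X) ∨ stud(X) clause
  □ | ¬answer(X), gr_stud(X), gr_stud(X), gr_stud(X)    from the stud(X) clause and the negated query
  □ | ¬answer(X), gr_stud(X)                            after merging the duplicate constraints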
Transformation to SQL
Query: ?- stud(S), memberOf(S, "http://www.ubc.ca").
Schematic answer:
□ | ¬answer(S),
    takes_course(S, C),
    gr_course(C),
    member_of(S, D),
    suborganization(D, "http://www.ubc.ca")
SQL query for this schematic answer:
SELECT takes_course.subject AS S
FROM takes_course, gr_course, member_of, suborganization
WHERE takes_course.object = gr_course.instance
  AND takes_course.subject = member_of.subject
  AND member_of.object = suborganization.subject
  AND suborganization.object = 'http://www.ubc.ca'
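The translation step can be sketched as follows. This is a hypothetical helper, not the prototype's SQL generator: the function name, the table aliases t0, t1, ... and the columns dictionary are assumptions of this example, and variables are again recognised by an initial uppercase letter. Each constraint atom becomes a FROM item, a repeated variable becomes a join condition, and a constant becomes an equality test.

```python
def schematic_answer_to_sql(answer_vars, atoms, columns):
    """Build one SQL query from the DB constraints of a schematic answer.

    answer_vars: variables of the answer literal, e.g. ["S"]
    atoms:       list of (table, args) constraint atoms
    columns:     table name -> tuple of column names (assumed schema)
    """
    first_seen = {}  # variable -> "alias.column" of its first occurrence
    from_items, conditions = [], []
    for n, (table, args) in enumerate(atoms):
        alias = f"t{n}"
        from_items.append(f"{table} AS {alias}")
        for col, arg in zip(columns[table], args):
            ref = f"{alias}.{col}"
            if arg[:1].isupper():  # variable (assumed naming convention)
                if arg in first_seen:
                    conditions.append(f"{first_seen[arg]} = {ref}")
                else:
                    first_seen[arg] = ref
            else:  # constant: compare with a quoted SQL string literal
                conditions.append(f"{ref} = '" + arg.replace("'", "''") + "'")
    select = ", ".join(f"{first_seen[v]} AS {v}" for v in answer_vars)
    sql = f"SELECT {select} FROM {', '.join(from_items)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql

# Assumed column names, taken from the SQL query shown on this slide.
columns = {
    "takes_course": ("subject", "object"),
    "gr_course": ("instance",),
    "member_of": ("subject", "object"),
    "suborganization": ("subject", "object"),
}
atoms = [
    ("takes_course", ("S", "C")),
    ("gr_course", ("C",)),
    ("member_of", ("S", "D")),
    ("suborganization", ("D", "http://www.ubc.ca")),
]
sql = schematic_answer_to_sql(["S"], atoms, columns)
print(sql)
```

Aliasing every atom keeps the query valid when the same table occurs more than once in a schematic answer, which the unaliased form above would not allow.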
Soundness and Completeness w.r.t. Answers
• The method is sound: schematic answers cover only legitimate concrete
answers, as long as the underlying resolution calculus is sound.
• Completeness concept: for every existing concrete answer, we must be able
to find a schematic answer covering it.
• At the calculus level: complete with reasonable requirements on the
calculus, if resolution or hyperresolution is used (details in the paper).
• Complete with subsumption? Intuitively OK, requires careful formalisation.
• Complete superposition-based calculus for equality? Completeness proof
seems possible, requires careful research (soundness already established).
Work in Progress
• Prototype exists: the Vampire reasoner extended for query answering over
DB abstractions + simple SQL generator. Currently supports TPTP,
OWL, Derby and MySQL. Next target – monotonic Derivational RuleML
(up to fologeq).
• Some experiments have been done, mostly with LUBM: all queries are
answered on a large instance with an unoptimised instance store.
Next step: case study with a real-life RDB design and RuleML.
• A lot more theoretical work is possible: completeness of superposition for
KBs with equality, completeness with standard search-space reduction
techniques, termination.
Semantic Web Indexing
• Indexing problem: avoid downloading SW documents that are not (known
to be) relevant to the query. Keyword-based indexing does not work:
relevant data may have no URIs/data literals in common with the query.
Query: ?- animal(X), ofBrightColour(X).
Relevant document: elephant(jumbo). pink(jumbo).
• Solution idea: index documents by their abstractions. Download only those
documents whose abstractions contribute to a schematic answer.
Indexes: I1 = elephant(X) | elephant(X), I2 = pink(X) | pink(X)
One of the schematic answers: □ | ¬answer(X), elephant(X), pink(X),
derived from I1 and I2.
⇒ download all documents indexed with I1 or I2.
• Just a conceptual framework, many optimisations are possible.
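A rough Python sketch of the indexing idea, assuming the simplest possible abstraction: a document is indexed by the predicate symbols of the facts it contains. The document names and facts are invented for the example, and a practical index would need finer abstractions.

```python
from collections import defaultdict

def build_index(documents):
    # Map each predicate symbol to the set of documents asserting facts with it.
    index = defaultdict(set)
    for name, facts in documents.items():
        for predicate, _args in facts:
            index[predicate].add(name)
    return index

def documents_to_download(index, constraint_predicates):
    # Collect documents whose abstractions occur in a schematic answer.
    needed = set()
    for predicate in constraint_predicates:
        needed |= index.get(predicate, set())
    return needed

# Hypothetical Semantic Web documents.
documents = {
    "zoo.rdf": [("elephant", ("jumbo",)), ("pink", ("jumbo",))],
    "cars.rdf": [("car", ("herbie",))],
}
index = build_index(documents)

# Schematic answer: [] | ~answer(X), elephant(X), pink(X)
to_fetch = documents_to_download(index, ["elephant", "pink"])
print(sorted(to_fetch))
```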
Related Work
Protégé-related project (Stanford Medical Informatics). Querying
biomedical databases with OWL and SWRL. Using SQL to extract potentially
relevant data. Using a rule engine to do the inference.
XSTONE project (Tallinn U. of Tech.) Very expressive query language –
FOL with extensions. Relevant data is loaded into a resolution-based prover for
inference. Pilot use in financial sector, as a part of a decision support system.
OntoGrate (U. of Oregon + Yale) Close to my approach: negative
hyperresolution on Horn clauses is used to rewrite logical queries into SQL.
Too ad hoc, missing a sound theoretical foundation (e.g., no completeness
considerations).
Several projects leveraging the simplicity of inexpressive DL: E.g.,
traditional Chinese medicine DB integration project in Hangzhou (70 real
DB!). Some form of query rewriting with SQL in the back end. Also, SHER
(IBM) based on OWL-DL ABox summarisation (another form of abstraction).
Case studies: trial cohort selection (RDB were pre-translated to OWL).
Summary
Hot real-life problem: deductive query answering over RDB.
The target is scalability: high expressiveness without performance sacrifice.
Original idea: incremental query rewriting with resolution and constraints.
Encouraging preliminary results: some formal basis, a prototype,
preliminary case studies.