We extend RDF with the ability to represent property values that exist, but are unknown or partially known, using constraints. Following ideas from the incomplete information literature, we develop a semantics for this extension of RDF, called RDFi, and study SPARQL query evaluation in this framework.
In this paper, we propose the problem of implementing an efficient query processing system for incomplete temporal and geospatial information in RDFi as a challenge to the SSTD community.
1. Incomplete Information in RDF
Charalampos Nikolaou and Manolis Koubarakis
charnik@di.uoa.gr
koubarak@di.uoa.gr
Department of Informatics and Telecommunications
National and Kapodistrian University of Athens
Web Reasoning and Rule Systems (RR) 2013
July 27, 2013
2. Outline
Motivation
Previous work
The RDFi framework
SPARQL query evaluation over RDFi databases
An algorithm for certain answer computation
Preliminary complexity results
Conclusions and future work
C. Nikolaou and M. Koubarakis
–
Incomplete Information in RDF
2/36
3. Motivation
Incomplete information is an important issue in many research
areas: relational databases, knowledge representation and the
semantic web.
Incomplete information arises in many practical settings (e.g.,
sensor data). RDF is often used to represent such data.
Even if initial information is complete, incomplete information
arises later on (e.g., relational view updates, data integration,
data exchange).
Although there is much work recently on incomplete
information in XML, not much has been done for incomplete
information in RDF.
5. Previous work
Relational
Relations extended to tables with various models of
incompleteness [Imielinski/Lipski ’84]
Complexity results for the associated decision problems
[Abiteboul/Kanellakis/Grahne ’91]
Dependencies and updates [Grahne ’91]
XML
Dynamic enrichment of incomplete information
[Abiteboul/Segoufin/Vianu ’01,’06]
General models of incompleteness, query answering, and
computational complexity [Barceló/Libkin/Poggi/Sirangelo ’09,’10]
7. Previous work (cont’d)
RDF
Blank nodes as existential variables in the RDF standard
SPARQL query evaluation under certain answer semantics
(Open World Assumption) [Arenas/Pérez ’11]
Anonymous timestamps in general temporal RDF graphs
[Gutierrez/Hurtado/Vaisman ’05]
General temporal RDF graphs with temporal constraints
[Hurtado/Vaisman ’06]
RDFi : It captures incomplete information for property values
using constraints. It is for RDF what the c-tables model is for the
relational model.
10. RDFi in a nutshell
Extension of RDF for capturing incomplete information for
property values that exist but are unknown or partially known
Partial knowledge captured by constraints using an appropriate
constraint language L interpreted over a fixed structure M_L
Syntax
RDF graphs extended to RDFi databases: pair (G , φ)
G : RDF graph with a new kind of literals, called e-literals
φ: quantifier-free formula of L
Semantics
Possible world semantics as in [Imielinski/Lipski ’84] and
[Grahne ’91]
14. Constraint languages L
Examples
ECL
Equality constraints interpreted over an infinite domain: x EQ y, x EQ c
Blank nodes as existential variables
diPCL/dePCL
Difference constraints of the form x − y ≤ c interpreted over the integers or rationals
Incomplete temporal information [Koubarakis ’94]
TCL
Topological constraints of non-empty, regular closed subsets of topological space
Six binary predicates: DC, EC, PO, EQ, TPP, NTPP
PCL
TCL plus constant symbols representing polygons in Q², e.g., r NTPP "x − y ≥ 0 ∧ x ≤ 1 ∧ y ≥ 0"
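As a rough illustration of what reasoning with difference constraints involves: a conjunction of constraints x − y ≤ c is satisfiable over the rationals (and over the integers when every c is an integer) exactly when its constraint graph has no negative-weight cycle, which Bellman-Ford relaxation can detect. The Python sketch below shows this standard technique. The function name and the variables `start` and `end` are hypothetical and do not come from the slides.

```python
def difference_constraints_satisfiable(variables, constraints):
    """Decide a conjunction of constraints (x, y, c), each meaning x - y <= c.

    Returns (True, solution) or (False, None). Each constraint is an edge
    y -> x with weight c; starting every distance at 0 plays the role of a
    virtual source connected to all variables.
    """
    dist = {v: 0 for v in variables}
    # Without a negative cycle the distances settle within len(variables) passes.
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, c in constraints:
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return True, dist
    return False, None  # still changing: a negative cycle exists


# Hypothetical temporal example: an event lasting between 10 and 30 time units.
ok, solution = difference_constraints_satisfiable(
    ["start", "end"],
    [("end", "start", 30), ("start", "end", -10)],
)

# Requiring a duration of at least 40 contradicts the upper bound of 30.
bad, _ = difference_constraints_satisfiable(
    ["start", "end"],
    [("end", "start", 30), ("start", "end", -40)],
)
```

When the system is satisfiable, the returned distances are themselves one satisfying assignment.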
16. RDFi : Vocabulary
RDF: I (IRIs), B (blank nodes), L (literals), M (datatype map)
RDFi: I, B, L, C (literals), U (e-literals), M, A (datatypes)
Correspondence with the constraint language L: C are its constants, U its variables, A its set of sorts
M_L interprets the constants of L in agreement with function L2V of M
19. RDFi : Syntax
Subject: I, B
Predicate: I
Object: I, B, L, C, U
Condition: θ
I: IRIs
B: blank nodes
L: literals
C: constants of L
U: e-literals
Definition
(s, p, o) ∈ (I ∪ B) × I × (I ∪ B ∪ L ∪ C ∪ U) is called an e-triple
If t is an e-triple and θ a conjunction of L-constraints, then
the pair (t, θ) is called a conditional triple
A set of conditional triples is called a conditional graph
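The definitions above can be mirrored in a small data model. The Python sketch below is illustrative only: the class names `ELiteral`, `Constraint` and `ConditionalTriple`, the helper `is_well_formed`, and the use of plain strings for IRIs are assumptions made for the example and are not part of RDFi.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ELiteral:
    """An element of U: a value that exists but is unknown or partially known."""
    name: str


@dataclass(frozen=True)
class Constraint:
    """One atomic L-constraint, kept abstract as (left, predicate, right)."""
    left: object
    predicate: str
    right: object


@dataclass(frozen=True)
class ConditionalTriple:
    """A pair (t, theta): an e-triple plus a conjunction of L-constraints."""
    triple: Tuple[object, object, object]
    condition: FrozenSet[Constraint] = frozenset()  # empty conjunction = true


def is_well_formed(ct):
    """E-literals are allowed only in the object position of an e-triple."""
    s, p, _ = ct.triple
    return not isinstance(s, ELiteral) and not isinstance(p, ELiteral)


r1 = ELiteral("_R1")

# A conditional graph is simply a set of conditional triples.
graph = frozenset({
    ConditionalTriple(("fire1", "type", "Fire")),
    ConditionalTriple(("fire1", "occuredIn", r1)),
})
```

Making the classes frozen keeps them hashable, so a conditional graph can be an ordinary set.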
20. RDFi : Syntax (cont’d)
Definition
An RDFi database D is a pair D = (G , φ) where G is a conditional
graph and φ a Boolean combination of L-constraints (global
constraint)
Example
hotspot1 type Hotspot .
fire1 type Fire .
hotspot1 correspondsTo fire1 .
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
[Figure: rectangle P in the x-y plane, spanning x from 6 to 23 and y from 8 to 19]
22. RDFi : Semantics
Rep maps an RDFi database D to a set of RDF graphs G1, G2, ...
Definition
A valuation v is a function from U to C assigning to each e-literal from U a constant from C
Definition
Let G be a conditional graph and v a valuation. Then v(G) denotes the RDF graph
{v(t) | (t, θ) ∈ G and ML |= v(θ)}
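As a rough illustration of v(G), the sketch below applies a valuation to a conditional graph. It assumes a deliberately tiny constraint language in which every atom is an equality (EQ) between e-literals or constants, and in which ML interprets each constant as itself; the function name apply_valuation and the data layout are hypothetical.

```python
def apply_valuation(graph, v):
    """Return v(G) for a conditional graph.

    graph: iterable of ((s, p, o), theta), where theta is a list of
           (left, right) pairs read as the conjunction of left EQ right.
    v:     dict mapping each e-literal to a constant.
    """
    def val(term):
        # Constants are left unchanged; e-literals are replaced.
        return v.get(term, term)

    result = set()
    for (s, p, o), theta in graph:
        # Keep the triple only if the valuated condition holds.
        if all(val(a) == val(b) for a, b in theta):
            result.add((s, p, val(o)))
    return result


G = [
    (("fire1", "occuredIn", "_R1"), []),                 # condition "true"
    (("fire1", "type", "Fire"), [("_R1", "regionA")]),   # holds only if _R1 EQ regionA
]
g1 = apply_valuation(G, {"_R1": "regionA"})
g2 = apply_valuation(G, {"_R1": "regionB"})
```

Under the first valuation both triples survive; under the second, the triple guarded by the equality is dropped.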
23. RDFi : Semantics (cont’d)
From RDFi databases to sets of RDF graphs
An RDFi database D = (G , φ) corresponds to the following set of
RDF graphs:
Rep(D) = {H | there exists valuation v and RDF graph H such that ML |= v(φ) and H ⊇ v(G)}
Relation ⊇ captures the OWA semantics
An RDFi database corresponds to an infinite number of RDF graphs
27. Question
How can we evaluate a query q over an RDFi database D (compute ⟦q⟧_D)?
Semantic definition
⟦q⟧_Rep(D) = {⟦q⟧_G | G ∈ Rep(D)}
In practice?
Start with SPARQL algebra of [Pérez/Arenas/Gutierrez ’06] with set semantics
Define SPARQL query evaluation for RDFi databases
29. From mappings to e-mappings...
{?F → fire1, ?S → "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2"}
{?F → fire1, ?S → R1}
32. ... to conditional mappings
({?F → fire1, ?S → "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2"}, true)
({?F → fire1, ?S → R1}, R1 EQ "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2")
36. From compatible mappings to possibly compatible mappings
Join of conditional mappings
({?F → fire1, ?S → R1}, R1 EQ "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2") ⋈ ({?S → R2}, true)
=
({?F → fire1, ?S → R1}, true ∧ R1 EQ R2 ∧ R1 EQ "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2")
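The join shown on this slide can be sketched as follows. The sketch rests on an assumption that is not spelled out in the slides: two conditional mappings are treated as possibly compatible when, for every shared variable, the two values are identical or at least one of them is an e-literal. The function name join and the list-of-atoms encoding of conditions (the empty list standing for true) are hypothetical.

```python
def join(m1, m2, e_literals):
    """Join two conditional mappings (mu, theta); return None if incompatible.

    mu is a dict from variables to terms, theta a list of constraint
    atoms read as a conjunction (empty list = true).
    """
    (mu1, theta1), (mu2, theta2) = m1, m2
    extra = []
    for x in sorted(mu1.keys() & mu2.keys()):
        a, b = mu1[x], mu2[x]
        if a == b:
            continue
        if a in e_literals or b in e_literals:
            # The values may still denote the same thing: record the
            # requirement as an extra EQ constraint.
            extra.append((a, "EQ", b))
        else:
            return None  # two different known values can never agree
    merged = {**mu2, **mu1}  # shared variables keep the value from mu1
    return merged, theta1 + theta2 + extra


BOX = "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2"
E = {"R1", "R2"}
m1 = ({"?F": "fire1", "?S": "R1"}, [("R1", "EQ", BOX)])
m2 = ({"?S": "R2"}, [])
joined = join(m1, m2, E)
clash = join(m1, ({"?F": "fire2"}, []), E)
```

For the slide's example, joined binds ?F to fire1 and ?S to R1 under the condition R1 EQ BOX ∧ R1 EQ R2, while the second call fails because fire1 and fire2 are distinct known values.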
37. Operations on conditional mappings
Let Ω1 and Ω2 be sets of conditional mappings. We can define the operation of:
Join (Ω1 ⋈ Ω2)
Union (Ω1 ∪ Ω2)
Difference (Ω1 \ Ω2)
Left-outer join (Ω1 ⟕ Ω2)
41. Graph pattern evaluation
If D is an RDFi database and P a graph pattern, the evaluation of P over D is defined recursively:
base case:
P is the triple pattern t
recursion:
P is (P1 AND P2) → ⟦P1⟧_D ⋈ ⟦P2⟧_D
P is (P1 UNION P2) → ⟦P1⟧_D ∪ ⟦P2⟧_D
P is (P1 OPT P2) → ⟦P1⟧_D ⟕ ⟦P2⟧_D
P is (P1 FILTER R), where R is a conjunction of L-constraints
43. Triple pattern evaluation (case 1)
Example
Database D
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Query q
?F occuredIn ?R
Answer (set of conditional mappings)
⟦q⟧_D = { ({?F → fire1, ?R → R1}, true) }
45. Triple pattern evaluation (case 2)
Example
Database D
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Query q
?F occuredIn "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2"
Answer (set of conditional mappings)
⟦q⟧_D = { ({?F → fire1}, R1 EQ "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2") }
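The two cases of triple pattern evaluation can be sketched in one function. This is a simplified reading of the examples, not the authors' definition: a variable binds to whatever term it meets (case 1), and a constant of L in the pattern that meets an e-literal in the data is matched conditionally by adding an EQ constraint (case 2). The function name, the string encoding of terms, and the explicit e_literals and constants sets are assumptions; the e-literal is written _R1 throughout.

```python
def eval_triple_pattern(pattern, graph, e_literals, constants):
    """Evaluate a triple pattern over a conditional graph.

    pattern: (s, p, o) where strings starting with "?" are variables.
    graph:   iterable of ((s, p, o), theta), theta a list of atoms.
    Returns a list of conditional mappings (mu, condition).
    """
    answers = []
    for triple, theta in graph:
        mu, cond, matched = {}, list(theta), True
        for pat, term in zip(pattern, triple):
            if pat.startswith("?"):
                # Case 1: bind the variable; a repeated variable must
                # meet the same term each time.
                if mu.setdefault(pat, term) != term:
                    matched = False
            elif pat == term:
                pass
            elif pat in constants and term in e_literals:
                # Case 2: the unknown value may equal the constant.
                cond.append((term, "EQ", pat))
            else:
                matched = False
            if not matched:
                break
        if matched:
            answers.append((mu, cond))
    return answers


BOX = "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2"
D = [(("fire1", "occuredIn", "_R1"), [])]
case1 = eval_triple_pattern(("?F", "occuredIn", "?R"), D, {"_R1"}, {BOX})
case2 = eval_triple_pattern(("?F", "occuredIn", BOX), D, {"_R1"}, {BOX})
```

The global constraint of the database is not consulted here; in both examples it only becomes relevant later, when certain answers are computed.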
47. Evaluation of FILTER graph patterns
Example
Database D
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Query q
?F occuredIn ?R .
FILTER (?R NTPP "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2")
Answer
⟦q⟧_D = { ({?F → fire1, ?R → R1}, R1 NTPP "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2") }
49. SELECT queries
Example
Database D
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Query q
SELECT ?F
WHERE {
?F occuredIn ?R .
FILTER (?R NTPP "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2")}
Answer (set of conditional mappings)
⟦q⟧_D = { ({?F → fire1}, R1 NTPP "x ≥ 1 ∧ x ≤ 2 ∧ y ≥ 1 ∧ y ≤ 2") }
52. CONSTRUCT queries
Example
Database D
fire1 occuredIn _R1 .
R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Query q
CONSTRUCT { ?F type Fire }
WHERE {
?F occuredIn ?R
}
Answer (RDFi database) D′ = (G′, φ)
G′: fire1 type Fire .
φ: R1 NTPP "x ≥ 6 ∧ x ≤ 23 ∧ y ≥ 8 ∧ y ≤ 19"
Closure property
59. Correctness of SPARQL query evaluation for RDFi
Does query evaluation compute the correct answer (the answer agrees with the semantic definition)?
The following diagram should commute. Does it?
[Diagram: Rep takes D to 𝒢 and ⟦q⟧_D to ⟦q⟧_𝒢; ⟦q⟧ takes D to ⟦q⟧_D and 𝒢 to ⟦q⟧_𝒢]
OR
Rep(⟦q⟧_D) = ⟦q⟧_Rep(D)
62. Certain answer to the rescue
Definition
The certain answer to query q over a set of RDF graphs 𝒢 is the set
⋂{⟦q⟧_G | G ∈ 𝒢}
Using the notion of certain answer we can relax the earlier equality requirement to one that uses Q-equivalence.
Definition
Let Q be a fragment of SPARQL. Two sets of RDF graphs 𝒢, ℋ will be Q-equivalent (denoted by 𝒢 ≡Q ℋ) if they give the same certain answer to every query q ∈ Q:
⋂{⟦q⟧_G | G ∈ 𝒢} = ⋂{⟦q⟧_H | H ∈ ℋ}
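For a finite collection of graphs, the certain answer is just the intersection of the individual answers. The sketch below illustrates this under simplifying assumptions: a query is modelled as a plain Python function from a graph (a set of triples) to a set of triples, and the collection is finite, whereas Rep(D) is infinite in general. The names certain_answer and about_s are hypothetical.

```python
def certain_answer(query, graphs):
    """Intersect the answers of `query` over a non-empty, finite
    collection of RDF graphs (each graph is a set of triples)."""
    answers = [frozenset(query(g)) for g in graphs]
    if not answers:
        raise ValueError("the collection of graphs must be non-empty")
    return frozenset.intersection(*answers)


def about_s(graph):
    # Stand-in for CONSTRUCT { s ?p ?o } WHERE { s ?p ?o }.
    return {(s, p, o) for (s, p, o) in graph if s == "s"}


worlds = [
    {("s", "p", "o")},
    {("s", "p", "o"), ("s", "b", "c")},
    {("s", "p", "o"), ("c", "d", "e")},
]
cert = certain_answer(about_s, worlds)
```

Only the triple that appears in the answer over every possible graph survives, which is the sense in which the answer is certain.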
64. Representation system
Let
𝒟 be the set of all RDFi databases
𝒢 be the set of all RDF graphs
Rep : 𝒟 → 𝒢 be a function determining the set of possible RDF graphs corresponding to an RDFi database, and
Q be a fragment of SPARQL
⟨𝒟, Rep, Q⟩ is a representation system if for all D ∈ 𝒟 and all q ∈ Q, there exists an RDFi database ⟦q⟧_D such that
Rep(⟦q⟧_D) ≡Q ⟦q⟧_Rep(D)
Are there interesting fragments Q of SPARQL that lead to a representation system?
65. Representation systems for RDFi
Theorem
The following fragments of SPARQL can give us representation systems for RDFi (with D and Rep as defined):
Q^C_AUF: CONSTRUCT queries using only AND, UNION, and FILTER graph patterns, and without blank nodes in their templates
Q^C_WD: CONSTRUCT queries using only well-designed graph patterns, and without blank nodes in their templates
Well-designed graph patterns [Pérez/Arenas/Gutierrez ’06]
AND, FILTER, OPT fragment
P FILTER R: safe
P1 OPT P2: variables in P2 are properly scoped
66. Representation systems for RDFi (cont’d)
Monotonicity
Definition
A fragment Q of SPARQL is monotone if for every q ∈ Q and RDF graphs G and H such that G ⊆ H, it holds that ⟦q⟧_G ⊆ ⟦q⟧_H.
Proposition [Arenas/Pérez ’11]
The fragment of SPARQL corresponding to AND, UNION, and FILTER graph patterns is monotone.
The fragment of SPARQL corresponding to well-designed graph patterns is weakly-monotone (⊑).
Proposition
Fragments Q^C_AUF and Q^C_WD are monotone.
67. Computing certain answers
Representation systems guarantee correctness of query evaluation for RDFi and SPARQL
Query evaluation computes an RDFi database ⟦q⟧_D = D′ = (G′, φ)
How could we compute the certain answer ⋂ Rep(⟦q⟧_D)?
Rep(⟦q⟧_D) is infinite!
68. Computing certain answers (cont’d)
Theorem
For D = (G, φ) and q from Q^C_AUF or Q^C_WD, the certain answer of q over D can be computed as follows:
i) compute ⟦q⟧_D = Dq = (Gq, φ),
ii) compute the RDFi database (Hq, φ) = ((Dq)^EQ)∗, and
iii) return the set of RDF triples
{(s, p, o) | ((s, p, o), θ) ∈ Hq such that φ |= θ and o ∉ U}
70. The certainty problem
CERT(q, H, D)
Input
An RDF graph H, a CONSTRUCT query q, and an RDFi database D
Question
Does H belong to the certain answer of q over D?
H ⊆ ⋂ ⟦q⟧_Rep(D) ?
We study the data complexity of CERT(q, H, D)
H and D are part of the input
q is fixed
71. Deciding the certainty problem
Theorem
CERT(q, H, D) is equivalent to deciding whether the formula
⋀_{t ∈ H} (∀ l)(φ(l) ⊃ Θ(t, q, D, l))
is true
l is the vector of all e-literals in D
Θ(t, q, D, l) is of the form θ1 ∨ · · · ∨ θk, where θi is a conjunction of L-constraints
73. Computational complexity
Problem: CERT(q, H, D)
L | data complexity
ECL/diPCL/dePCL/RCL | coNP-complete
TCL/PCL (RCC-5) | EXPTIME
Problem | combined complexity | data complexity
SPARQL | PSPACE-complete | LOGSPACE
SPARQL_AUF | NP-complete | LOGSPACE
SPARQL_WD | coNP-complete | LOGSPACE
74. Conclusions
RDFi framework
Modeling of incomplete information for property values
Formal semantics through possible worlds semantics
SPARQL query evaluation and certain answer semantics
Two representation systems for RDFi and SPARQL
Algorithm for certain answer computation
Preliminary complexity analysis
75. Future work
More general models of incomplete information (subject,
predicate)
More refined complexity results
Scalable implementation when L expresses topological
constraints with/without constants (TCL/PCL)
Connection with query processing for the topology vocabulary
extension of GeoSPARQL
Probabilistic extension to RDFi
Data integration theory for linked data (only practice exists so
far)
Connection to geospatial OBDA using DL logics
77. Constraint languages L
Properties of L
Many-sorted first-order language
Interpreted over a fixed (intended) structure ML
EQ: distinguished equality predicate
L-constraints: quantifier-free formulae of L
Weakly closed under negation: the negation of every atomic
L-constraint is equivalent to a disjunction of L-constraints
79. Correctness of SPARQL query evaluation for RDFi
An easy negative example
Example (classical RDF - OWA)
D
s p o .
q
CONSTRUCT { s ?p ?o }
WHERE { s ?p ?o }
Then, ⟦q⟧_D = D
85. Correctness of SPARQL query evaluation for RDFi (cont’d)
An easy negative example
Example
Let us compare the set of graphs represented by ⟦q⟧_D with ⟦q⟧_Rep(D)
Rep(⟦q⟧_D) = { {(s, p, o)}, {(s, p, o), (c, d, e)}, {(s, p, o), (s, b, c)}, ··· }
⟦q⟧_Rep(D) = { {(s, p, o)}, {(s, p, o), (s, b, c)}, ··· }
There is no g ∈ ⟦q⟧_Rep(D) containing the triple (c, d, e)!
This would work if RDF made the CWA
We know this already from the relational case [Imielinski/Lipski ’84]
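The negative example can be replayed on a small, finite sample of Rep(D). This is only an illustration: Rep(D) is infinite, so the three supersets of D below are a hand-picked sample, and the CONSTRUCT query is modelled by the hypothetical Python function construct_q.

```python
def construct_q(graph):
    # Stand-in for CONSTRUCT { s ?p ?o } WHERE { s ?p ?o }.
    return {(s, p, o) for (s, p, o) in graph if s == "s"}


D = {("s", "p", "o")}

# A few graphs H ⊇ D, i.e. a finite sample of Rep(D) under the OWA.
rep_sample = [D | extra for extra in (set(), {("c", "d", "e")}, {("s", "b", "c")})]

# Since the query returns D itself on D, Rep of the answer is Rep(D).
q_of_D = construct_q(D)
rep_of_answer = rep_sample

# Evaluating q in each possible graph gives the semantic answer.
answer_over_rep = [construct_q(h) for h in rep_sample]

in_rep_of_answer = any(("c", "d", "e") in g for g in rep_of_answer)
in_answer_over_rep = any(("c", "d", "e") in g for g in answer_over_rep)
```

The triple (c, d, e) shows up on one side only, so the two sets of graphs differ, yet intersecting either side yields the same graph {(s, p, o)}, which is why comparing certain answers is the more useful notion of agreement.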
87. Computing certain answers
Definitions
Definition (EQ-completion)
The EQ-completed form of D = (G, φ), denoted by D^EQ = (G^EQ, φ), is obtained from D by replacing all e-literals l ∈ U appearing in G by the constant c ∈ C such that φ |= l EQ c
Definition (Normalization)
The normalized form of D is the RDFi database D∗ = (G∗, φ) where G∗ is the set
{(t, θ) | (t, θi) ∈ G for all i = 1 . . . n, and θ is ⋁i θi}
Example: G = {(t, θ1), (t, θ2), (t′, θ′)} gives G∗ = {(t, θ1 ∨ θ2), (t′, θ′)}
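Normalization amounts to grouping conditional triples by their e-triple and taking the disjunction of the grouped conditions. A minimal sketch, assuming conditions are opaque values and a disjunction is represented as a tuple of its disjuncts; the function name normalize is hypothetical.

```python
def normalize(graph):
    """Merge conditional triples that share the same e-triple.

    graph: iterable of (t, theta) pairs.
    Returns a list of (t, disjuncts), where disjuncts is the tuple
    (theta_1, ..., theta_n) read as theta_1 ∨ ... ∨ theta_n.
    """
    grouped = {}
    for t, theta in graph:
        bucket = grouped.setdefault(t, [])
        if theta not in bucket:  # repeating a disjunct changes nothing
            bucket.append(theta)
    return [(t, tuple(thetas)) for t, thetas in grouped.items()]


G = [("t", "θ1"), ("t", "θ2"), ("t'", "θ'")]
G_star = normalize(G)
```

On the slide's example this yields one entry for t with the disjuncts θ1 and θ2, and one entry for t′ with θ′ alone.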