This document proposes cooperative techniques for relaxing SPARQL queries over RDF databases. It discusses existing approaches that relax queries by applying operators like OPTIONAL or replacing constants with variables, but notes these are "blind" relaxations that don't consider why the original query failed.
The document introduces the concepts of a Minimal Failing Subquery (MFS) and Maximal Succeeding Subquery (XSS) to identify the specific patterns causing a query to fail or succeed. It then presents a Lattice-Based Approach (LBA) that uses a three-step procedure to find an MFS for a failing query and identify potential XSSs, inspired by previous work on relational databases. The LBA aims to
A Note on Hessen berg of Trapezoidal Fuzzy Number MatricesIOSRJM
This document discusses Hessenberg matrices of trapezoidal fuzzy number matrices. It begins with background on fuzzy sets and fuzzy numbers, including definitions of trapezoidal fuzzy numbers and operations on them. It then defines trapezoidal fuzzy matrices and operations on those matrices, such as addition, subtraction, and multiplication. The main topic of the document is introduced - Hessenberg matrices of trapezoidal fuzzy numbers. Some properties of Hessenberg trapezoidal fuzzy matrices are presented. The document provides necessary preliminaries on fuzzy sets and numbers before defining the key concepts and discussing properties.
Some Continued Mock Theta Functions from Ramanujan’s Lost Notebook (IV)paperpublications3
Abstract: Ramanujan’s lost notebook contains many results on mock theta functions. In particular, the lost notebook contains eight identities for tenth order mock theta functions. Previously many authors proved the first six of Ramanujan’s tenth order mock theta function identities. It is the purpose of this paper to prove the seventh and eighth identities of Ramanujan’s tenth order mock theta function identities which are expressed by mock theta functions and also a definite integral. The properties of modular forms are used for the proofs of theta function identities and L. J. Mordell’s transformation formula for the definite integral.Keywords: Mock Theta Functions from Ramanujan’s Lost Notebook.
Title: Some Continued Mock Theta Functions from Ramanujan’s Lost Notebook (IV)
Author: MOHAMMADI BEGUM JEELANI SHAIKH
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Paper Publications
Some Properties of Determinant of Trapezoidal Fuzzy Number MatricesIJMERJOURNAL
ABSTRACT: The fuzzy set theory has been applied in many fields such as management, engineering, matrices and so on. In this paper, some elementary operations on proposed trapezoidal fuzzy numbers (TrFNs) are defined. We also defined some operations on trapezoidal fuzzy matrices (TrFMs). The notion of Determinant of trapezoidal fuzzy matrices are introduced and discussed. Some of their relevant properties have also been verified.
This document describes the divide-and-conquer algorithm design strategy and provides examples of its use. It discusses:
1) The divide-and-conquer strategy involves dividing a problem into smaller subproblems, solving those subproblems recursively, and combining the solutions.
2) Examples where divide-and-conquer is applied include sorting algorithms like mergesort and quicksort, binary tree traversals, binary search, and multiplying large integers.
3) Analysis techniques like recurrence relations and the master theorem are introduced for analyzing divide-and-conquer algorithms' run times. Specific algorithms like mergesort, quicksort, and binary search are analyzed in more detail.
The document discusses skyline queries and algorithms for computing skylines. It begins by defining skylines and providing examples. It then discusses two main options for implementing skyline queries: extending SQL with a skyline operator, or building skylines on top of relational databases using nested SQL queries. The document focuses on algorithms for computing skylines, including the block-nested loops algorithm, divide and conquer algorithm, and algorithms using B-trees and R-trees. It also discusses combining skyline queries with joins and top-N queries. Experimental results show divide and conquer performs best for multidimensional skylines while block-nested loops is better for small skylines.
This document discusses efficient implementation of cryptographic pairings. It begins by introducing pairings and their properties like bilinearity. It describes the hard computational problems that pairings are based on and how suitable elliptic curves and algorithms like the Tate pairing can be used to implement pairings securely. The document then discusses optimizations to the basic Tate pairing algorithm as well as other pairing-friendly curves and pairings like the Ate pairing. It also covers efficient arithmetic in extension fields and techniques for fast final exponentiation.
This document summarizes the derivation of an evidence lower bound (ELBO) for latent LSTM allocation, a model that uses an LSTM to determine topic assignments in a topic modeling framework. It expresses the ELBO as terms related to the variational posterior distributions over topics and topics proportions, the generative process of words given topics, and the LSTM's prediction of topic assignments. It also describes how to optimize the ELBO with respect to the variational and LSTM parameters through gradient ascent.
A Note on Hessen berg of Trapezoidal Fuzzy Number MatricesIOSRJM
This document discusses Hessenberg matrices of trapezoidal fuzzy number matrices. It begins with background on fuzzy sets and fuzzy numbers, including definitions of trapezoidal fuzzy numbers and operations on them. It then defines trapezoidal fuzzy matrices and operations on those matrices, such as addition, subtraction, and multiplication. The main topic of the document is introduced - Hessenberg matrices of trapezoidal fuzzy numbers. Some properties of Hessenberg trapezoidal fuzzy matrices are presented. The document provides necessary preliminaries on fuzzy sets and numbers before defining the key concepts and discussing properties.
Some Continued Mock Theta Functions from Ramanujan’s Lost Notebook (IV)paperpublications3
Abstract: Ramanujan’s lost notebook contains many results on mock theta functions. In particular, the lost notebook contains eight identities for tenth order mock theta functions. Previously many authors proved the first six of Ramanujan’s tenth order mock theta function identities. It is the purpose of this paper to prove the seventh and eighth identities of Ramanujan’s tenth order mock theta function identities which are expressed by mock theta functions and also a definite integral. The properties of modular forms are used for the proofs of theta function identities and L. J. Mordell’s transformation formula for the definite integral.Keywords: Mock Theta Functions from Ramanujan’s Lost Notebook.
Title: Some Continued Mock Theta Functions from Ramanujan’s Lost Notebook (IV)
Author: MOHAMMADI BEGUM JEELANI SHAIKH
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Paper Publications
Some Properties of Determinant of Trapezoidal Fuzzy Number MatricesIJMERJOURNAL
ABSTRACT: The fuzzy set theory has been applied in many fields such as management, engineering, matrices and so on. In this paper, some elementary operations on proposed trapezoidal fuzzy numbers (TrFNs) are defined. We also defined some operations on trapezoidal fuzzy matrices (TrFMs). The notion of Determinant of trapezoidal fuzzy matrices are introduced and discussed. Some of their relevant properties have also been verified.
This document describes the divide-and-conquer algorithm design strategy and provides examples of its use. It discusses:
1) The divide-and-conquer strategy involves dividing a problem into smaller subproblems, solving those subproblems recursively, and combining the solutions.
2) Examples where divide-and-conquer is applied include sorting algorithms like mergesort and quicksort, binary tree traversals, binary search, and multiplying large integers.
3) Analysis techniques like recurrence relations and the master theorem are introduced for analyzing divide-and-conquer algorithms' run times. Specific algorithms like mergesort, quicksort, and binary search are analyzed in more detail.
The document discusses skyline queries and algorithms for computing skylines. It begins by defining skylines and providing examples. It then discusses two main options for implementing skyline queries: extending SQL with a skyline operator, or building skylines on top of relational databases using nested SQL queries. The document focuses on algorithms for computing skylines, including the block-nested loops algorithm, divide and conquer algorithm, and algorithms using B-trees and R-trees. It also discusses combining skyline queries with joins and top-N queries. Experimental results show divide and conquer performs best for multidimensional skylines while block-nested loops is better for small skylines.
This document discusses efficient implementation of cryptographic pairings. It begins by introducing pairings and their properties like bilinearity. It describes the hard computational problems that pairings are based on and how suitable elliptic curves and algorithms like the Tate pairing can be used to implement pairings securely. The document then discusses optimizations to the basic Tate pairing algorithm as well as other pairing-friendly curves and pairings like the Ate pairing. It also covers efficient arithmetic in extension fields and techniques for fast final exponentiation.
This document summarizes the derivation of an evidence lower bound (ELBO) for latent LSTM allocation, a model that uses an LSTM to determine topic assignments in a topic modeling framework. It expresses the ELBO as terms related to the variational posterior distributions over topics and topics proportions, the generative process of words given topics, and the LSTM's prediction of topic assignments. It also describes how to optimize the ELBO with respect to the variational and LSTM parameters through gradient ascent.
The discrete quartic spline interpolation over non uniform meshAlexander Decker
The document presents a study investigating discrete quartic spline interpolation over non-uniform meshes. It begins with background on discrete splines and defines deficient splines. The paper then develops a new function for deficient quartic spline interpolation and obtains existence, uniqueness, and convergence properties of the deficient discrete quartic spline interpolation matching given function values at two interior points and first differences at midpoints, with boundary conditions on the function. It presents the analysis that leads to a system of equations and proves that for any mesh spacing h>0, there exists a unique deficient discrete quartic spline satisfying the given conditions.
TopicRNN is a generative model for documents that:
1. Draws a topic vector from a standard normal distribution and uses it to generate words in a document.
2. Computes a lower bound on the log marginal likelihood of words and stop word indicators.
3. Approximates the expected values in the lower bound using samples from an inference network that models the approximate posterior distribution over topics.
The Goldberg-Coxeter construction takes two integers (k,l) a 3-or 4-valent plane graph and returns a 3- or 4-valent plane graph. This construction is useful in virus study, numerical analysis, architecture, chemistry and of course mathematics.
Here we consider the zigzags and central circuits of 3- or 4-valent plane graph. It turns out that we can define an algebraic construction of (k,l)-product that allows to find the length of the zigzags and central circuits in a compact way. All possible lengths of zigzags are determined by this (k,l)-product and the normal structure of the automorphism group allows to find them for some congruence conditions.
11.a seasonal arima model for nigerian gross domestic productAlexander Decker
This document summarizes a study that developed a seasonal ARIMA model for Nigerian Gross Domestic Product (NGDP) data from 1980 to 2007. The researchers took seasonal and non-seasonal differences of the NGDP time series to remove trends and seasonality. Analysis of the differenced series' autocorrelation function revealed a seasonal ARIMA(0,0,1)x(1,0,0)4 model provided the best fit. Estimation of this model resulted in a good fit between the actual and fitted values, and the residuals showed no remaining autocorrelation, indicating an adequate model.
11.[1 11]a seasonal arima model for nigerian gross domestic productAlexander Decker
This document discusses time series analysis and seasonal ARIMA modeling of Nigerian Gross Domestic Product (NGDP) data from 1980 to 2007. The author performs the following steps: (1) determines the orders of differencing (d, D) and model parameters (p, P, q, Q) needed based on analyzing autocorrelations and partial autocorrelations; (2) estimates the seasonal ARIMA model parameters using nonlinear iterative techniques in EViews software; (3) checks the goodness-of-fit of the fitted model by analyzing the autocorrelations of the residuals. The results show that the seasonal ARIMA model adequately captures the autocorrelation structure in the NGDP time series data.
The document discusses applying counterexample guided abstraction refinement (CEGAR) to verifying properties of Petri nets. It summarizes using the Petri net state equation to represent reachable markings as solutions to a system of linear equations. It then describes using CEGAR to iteratively check solutions and refine the abstraction by adding increments when solutions are found to be infeasible. The approach is implemented in a tool called Sara which shows better performance than other tools on verification problems involving large Petri nets and parameterized systems.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
The discrete quartic spline interpolation over non uniform meshAlexander Decker
The document presents a study investigating discrete quartic spline interpolation over non-uniform meshes. It begins with background on discrete splines and defines deficient splines. The paper then develops a new function for deficient quartic spline interpolation and obtains existence, uniqueness, and convergence properties of the deficient discrete quartic spline interpolation matching given function values at two interior points and first differences at midpoints, with boundary conditions on the function. It presents the analysis that leads to a system of equations and proves that for any mesh spacing h>0, there exists a unique deficient discrete quartic spline satisfying the given conditions.
TopicRNN is a generative model for documents that:
1. Draws a topic vector from a standard normal distribution and uses it to generate words in a document.
2. Computes a lower bound on the log marginal likelihood of words and stop word indicators.
3. Approximates the expected values in the lower bound using samples from an inference network that models the approximate posterior distribution over topics.
The Goldberg-Coxeter construction takes two integers (k,l) a 3-or 4-valent plane graph and returns a 3- or 4-valent plane graph. This construction is useful in virus study, numerical analysis, architecture, chemistry and of course mathematics.
Here we consider the zigzags and central circuits of 3- or 4-valent plane graph. It turns out that we can define an algebraic construction of (k,l)-product that allows to find the length of the zigzags and central circuits in a compact way. All possible lengths of zigzags are determined by this (k,l)-product and the normal structure of the automorphism group allows to find them for some congruence conditions.
11.a seasonal arima model for nigerian gross domestic productAlexander Decker
This document summarizes a study that developed a seasonal ARIMA model for Nigerian Gross Domestic Product (NGDP) data from 1980 to 2007. The researchers took seasonal and non-seasonal differences of the NGDP time series to remove trends and seasonality. Analysis of the differenced series' autocorrelation function revealed a seasonal ARIMA(0,0,1)x(1,0,0)4 model provided the best fit. Estimation of this model resulted in a good fit between the actual and fitted values, and the residuals showed no remaining autocorrelation, indicating an adequate model.
11.[1 11]a seasonal arima model for nigerian gross domestic productAlexander Decker
This document discusses time series analysis and seasonal ARIMA modeling of Nigerian Gross Domestic Product (NGDP) data from 1980 to 2007. The author performs the following steps: (1) determines the orders of differencing (d, D) and model parameters (p, P, q, Q) needed based on analyzing autocorrelations and partial autocorrelations; (2) estimates the seasonal ARIMA model parameters using nonlinear iterative techniques in EViews software; (3) checks the goodness-of-fit of the fitted model by analyzing the autocorrelations of the residuals. The results show that the seasonal ARIMA model adequately captures the autocorrelation structure in the NGDP time series data.
The document discusses applying counterexample guided abstraction refinement (CEGAR) to verifying properties of Petri nets. It summarizes using the Petri net state equation to represent reachable markings as solutions to a system of linear equations. It then describes using CEGAR to iteratively check solutions and refine the abstraction by adding increments when solutions are found to be infeasible. The approach is implemented in a tool called Sara which shows better performance than other tools on verification problems involving large Petri nets and parameterized systems.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Cooperative Techniques for SPARQL Query Relaxation in RDF Databases: ESWC 2015
1. Cooperative Techniques for SPARQL Query
Relaxation in RDF Databases
G´eraud FOKOU St´ephane JEAN Allel HADJALI
Micka¨el BARON
LIAS/ENSMA and University of Poitiers, France
ESWC 2015, June 2015, Portoroz, Slovenia
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 1/21
2. Motivation
A lot of Knowledge Bases (KBs) available on the web
Academic projects: YAGO, NELL, DBPedia, ...
Commercial projects: Google (Knowledge Vault), ...
Difficulties to query KBs
Incomplete and Heterogeneous Data
Large size and complex schema semantics for end-users
Correct formulation of queries returning the intended result
...
⇒ failing RDF queries: queries that return an empty set of answers
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
3. Motivation
A lot of Knowledge Bases (KBs) available on the web
Academic projects: YAGO, NELL, DBPedia, ...
Commercial projects: Google (Knowledge Vault), ...
Difficulties to query KBs
Incomplete and Heterogeneous Data
Large size and complex schema semantics for end-users
Correct formulation of queries returning the intended result
...
⇒ failing RDF queries: queries that return an empty set of answers
The need for RDF query relaxation techniques
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 2/21
4. Related Work
Proposed relaxation operators [Hurtado08, Fokou14, ...]
OPTIONAL of SPARQL
Classes/Properties Hierarchies (Student → Person)
Replacing constants by variables
Removing triple patterns
...
⇒ On which triple patterns?
Proposed relaxation process [Huang12, ...]
Based on similarity measure Sim(Q, relax(Q))
Execution from the most to the least similar relaxed queries
⇒ blind relaxation of the query (long execution time)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
5. Related Work
Proposed relaxation operators [Hurtado08, Fokou14, ...]
OPTIONAL of SPARQL
Classes/Properties Hierarchies (Student → Person)
Replacing constants by variables
Removing triple patterns
...
⇒ On which triple patterns?
Proposed relaxation process [Huang12, ...]
Based on similarity measure Sim(Q, relax(Q))
Execution from the most to the least similar relaxed queries
⇒ blind relaxation of the query (long execution time)
The need to know the causes of failure of an RDF query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 3/21
6. Minimal Failing Subquery (MFS)
Definition
Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4
Q* is an MFS of Q iff:
Q* is a failing subquery of Q
No subquery of Q* is a failing query
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
MFSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
7. Minimal Failing Subquery (MFS)
Definition
Let Q be a conjunction of triple patterns Q = t1 ∧ t2 ∧ t3 ∧ t4
Q* is an MFS of Q iff:
Q* is a failing subquery of Q
No subquery of Q* is a failing query
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
MFSs
The need to find maximal non failing subquery ⇒ notion of XSS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 4/21
8. MaXimal Succeeding Subquery (XSS)
Definition
Q* is an XSS of Q iff:
Q* is a successful subquery of Q
Q* is not included in any successful subquery of Q
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
XSSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
9. MaXimal Succeeding Subquery (XSS)
Definition
Q* is an XSS of Q iff:
Q* is a successful subquery of Q
Q* is not included in any successful subquery of Q
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q : failing query
Q : successful query
XSSs
Problem: Computation of MFSs and XSSs of a failing RDF query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 5/21
10. Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 6/21
11. LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
12. LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
13. LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
14. LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
15. LBA Approach: 1) Finding an MFS
Inspired by the work of Godfrey in RDBMSs [Godfrey97]
A 3 steps procedure
First step: Q∗ ← FindAnMFS(Q)
• remove iteratively each triple pattern ti of Q (query Q )
• if Q ∧ Q∗
is successful, ti is an element of Q∗
(Proposition)
• otherwise Q∗
is a subquery of Q ∧ Q∗
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q = t2 ∧ t3 ∧ t4
Q ∧ Q∗ = t2 ∧ t3 ∧ t4
⇒ Q∗ = ∅
remove t1
t1 /∈ MFS
Q = t3 ∧ t4
Q ∧ Q∗ = t3 ∧ t4
⇒ Q∗ = t2
remove t2
t2 ∈ MFS
Q = t4
Q ∧ Q∗ = t2 ∧ t4
⇒ Q∗ = t2 ∧ t3
remove t3
t3 ∈ MFS
Q = ∅
Q ∧ Q∗ = t2 ∧ t3
⇒ Q∗ = t2 ∧ t3
remove t4
t4 /∈ MFS
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 7/21
16. LBA Approach: 2) Computing Potential XSSs
Second step:
All queries that include Q∗ are failing
⇒ they can be neither MFS nor XSS
Continue with the largest subqueries of Q that do not include
Q∗. If they are successful, they are XSSs.
⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗}
Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q∗
pxss(Q, Q∗
)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
17. LBA Approach: 2) Computing Potential XSSs
Second step:
All queries that include Q∗ are failing
⇒ they can be neither MFS nor XSS
Continue with the largest subqueries of Q that do not include
Q∗. If they are successful, they are XSSs.
⇒ potential XSSs (PXSS): pxss(Q, Q∗) = {Q − ti | ti ∈ Q∗}
Q∗ = t2 ∧ t3 ⇒ pxss(Q, Q∗) = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
∅
t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4 t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
Q∗
pxss(Q, Q∗
)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 8/21
18. LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
19. LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
20. LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
21. LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
22. LBA Approach: 3) Finding all MFSs and XSSs
Third step:
Na¨ıve approach
Execute each PXSS PQ
If PQ is successful, this is an XSS
Otherwise apply recursively the LBA algorithm on PQ
Problem: the same MFS can be found several times
Solution: incremental computation of the PXSSs that do not
include the identified MFSs
∅
t1t1 t2 t3 t4
t1 ∧ t2 t1 ∧ t3 t1 ∧ t4 t2 ∧ t3t2 ∧ t3 t2 ∧ t4t2 ∧ t4t2 ∧ t4 t3 ∧ t4t3 ∧ t4t3 ∧ t4
t1 ∧ t2 ∧ t3 t1 ∧ t2 ∧ t4t1 ∧ t2 ∧ t4 t1 ∧ t3 ∧ t4t1 ∧ t3 ∧ t4 t2 ∧ t3 ∧ t4
t1 ∧ t2 ∧ t3 ∧ t4
MFSs XSSs
pxss = {t1 ∧ t3 ∧ t4, t1 ∧ t2 ∧ t4}
mfs(Q) = {t2 ∧ t3}
xss(Q) = ∅
t1 ∧ t2 ∧ t3 ∧ t4
Q∗ = t2 ∧ t3
pxss = {t3 ∧ t4, t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = ∅
t1 ∧ t3 ∧ t4
Q∗∗ = t1
pxss = {t2 ∧ t4}
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4}
t3 ∧ t4
pxss = ∅
mfs(Q) = {t2 ∧ t3, t1}
xss(Q) = {t3 ∧ t4, t2 ∧ t4}
t2 ∧ t4
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 9/21
23. Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 10/21
24. MBA Approach: Overview
Inspired by the work of Jannach in Recommender Systems
Based on the computation of a Matrix
one row by potential solution µ
(a mapping satisfying at least one triple pattern)
one column by triple pattern
M[µ][ti ] = 1 ⇔ µ satisfies ti (else 0)
t
s p o
Smith type Professor
Smith research SW
Smith teacherOf DB
Jane type Lecturer
Jane research DB
Jane teacherOf DB
SELECT ?p ?a WHERE {
?p age ?a (t1)
?p type Lecturer (t2)
?p research SW (t3)
?p teacherOf DB } (t4)
?p ?a t1 t2 t3 t4
Smith null 0 0 1 1
Jane null 0 1 0 1
xss(Q) = {t2 ∧ t4, t3 ∧ t4}
mfs(Q) = {t1, t2 ∧ t3}
RDF triples
The query Q
The relaxed matrix of Q The MFSs and XSSs of Q
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 11/21
25. MBA Approach: Relaxed Matrix Computation
Execution of each triple pattern ti
Computation of t1
∗ t2... ∗ tn
where t1
∗
t2 = t1 ∪ t2 ∪ t1 t2
the mappings that satisfies t1 or t2 or a combination of t1, t2
Problem: subqueries of Q can lead to Cartesian Product
⇒ Matrix of a huge size (exceeds the size of main memory)
Not the case of Star-shaped queries
(a shared join variable in the subject position)
SELECT ?p ?a WHERE {
?p age ?a (t1)
?p type Lecturer (t2)
?p research SW (t3)
?p teacherOf DB } (t4)
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 12/21
26. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
27. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
28. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
29. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
30. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm NQ
execute each triple patternti
for each value v of the join variable
if v is already in the matrix, complete the row with a 1 for ti
else add a new row in the matrix with only a 1 for ti
?p ?a t1 t2 t3 t4
t1 = ∅
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
t2 = Jane
?p ?a t1 t2 t3 t4
Jane null 0 1 0 0
Smith null 0 0 1 0
t3 = Smith
?p ?a t1 t2 t3 t4
Jane null 0 1 0 1
Smith null 0 0 1 1
t4 = Smith, Jane
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 13/21
31. MBA Approach: Optimized Relaxed Matrix Computation
Algorithm 1Q
if the RDF database is implemented as a triple table: T(s, p, o)
a single SQL query can be used to compute the matrix
roughly the translation of t1 t2... tn
select coalesce(t1.s , t2.s, t3.s, t4.s),
case when t1.s is null then 0 else 1 end as t1,
case when t2.s is null then 0 else 1 end as t2,
case when t3.s is null then 0 else 1 end as t3,
case when t4.s is null then 0 else 1 end as t4
from t1 full outer join t2 on t1.s = t2.s
full outer join t3 on coalesce(t1.s , t2.s) = t3.s
full outer join t4 on coalesce(t1.s , t2.s, t3.s) = t4.s
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 14/21
32. MBA Approach: MFSs and XSSs computation
XSSs computation
1) Compute the skyline of the relaxed matrix
Many solutions: block nested loop, sort filter skyline, ...
2) Retrieve the queries that correspond to skyline rows
They are the XSSs
?p ?a t1 t2 t3 t4
Smith null 0 0 1 1
Jane null 0 1 0 1
⇒ xss(Q) = {t2 ∧ t4, t3 ∧ t4}
MFSs computation
The matrix provides a simple way to know if a query is failing
t2 ∧ t3 is failing ⇔ col(t2) AND col(t3) = 0
0
The matrix can be used as an index for optimizing LBA
LBA without any database query
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 15/21
33. Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 16/21
34. Experimental Setup
Java implementation of LBA and MBA on top of Jena TDB
http://www.lias-lab.fr/forge/projects/qars
Dataset: LUBM20 (≈ 300 MB) and LUBM100 (≈ 1 GB)
Queries: 7 failing queries ranging 1 to 15 triple patterns (TP)
Hardware: Intel Core i7 CPU and 8GB RAM
Implementation of the relaxed matrix as a set of Bitmap
Relaxed matrix size and computation time:
LUBM20 LUBM100
Q3 Q5 Q7 Q3 Q5 Q7
Computation time with NQ (in sec) 8.6 8.6 8.6 42.6 43.4 44.6
Computation time with 1Q (in sec) 6.1 6.3 6.8 30.4 34.6 38.5
Size (in KB) 293 400 335 1385 1912 1590
Number of rows (in K) 430 430 430 2149 2149 2149
The algorithm 1Q is 25% faster than NQ
The size of the matrix is small thanks to bitmap compression
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 17/21
35. Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
10
100
1000
10000
100000
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
DFS does not scale for medium size queries (≈ 9TP)
36. Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
ISHMAEL: adaptation of Godfrey’algorithm, LBA
10
100
1000
10000
100000
1e+006
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
ISHMAEL
LBA
LBA/ISHMAEL scale till queries ≈ 11 TP
37. Experiments: XSS and MFS Computation Time
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 18/21
DFS: Depth-First Search algorithm (baseline method)
ISHMAEL: adaptation of Godfrey’algorithm, LBA
MBA+M: computation of the matrix + LBA algorithm
MBA-M: same as MBA+M but without computing the matrix
1
10
100
1000
10000
100000
1e+006
Q1(1) Q2(5) Q3(7) Q4(9) Q5(11) Q6(13) Q7(15)
QueryTime(ms)
Queries with Number of Triple pattern (TP)
LUBM100
DFS
ISHMAEL
LBA
MBA+M
MBA-M
MBA is only useful for Star-shaped queries
MBA+M only useful for large queries, MBA-M scales well
38. Experiments: Performance as the number of TP scales
The number of TP plays a important role in the performance
Experiment: decompose Q7 in 15 subqueries (1 to 15TP)
Execute each such subquery
1
10
100
1000
10000
100000
1e+006
2 4 6 8 10 12 14
Number of Triple Patterns
LUBM100
DFS
ISMAEL
LBA
MBA+M
MBA-M
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 19/21
⇒ Confirm our previous results
39. Outline
• Problem Statement
• Lattice-Based Approach (LBA)
• Matrice-Based Approach (MBA)
• Experimental results
• Conclusion and Perspectives
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 20/21
40. Conclusion and Perspectives
Study of 2 cooperative techniques for RDF query relaxation
MFS: a cause of query failure
XSS: a relaxed query with a maximum of TP
Contribution: 2 solutions to find MFSs/XSSs of RDF queries
LBA is smart exploration of the subquery lattice
MBA is based on a matrix used as a bitmap index
LBA >> baseline algorithms for queries with more than 5 TPs
MBA is only interesting for large star-shaped queries
if the matrix has been precomputed, MBA runs in milliseconds
even for large queries (>15TP)
Perspectives:
Optimization of LBA using heuristics (obvious MFS)
Optimization of MBA for other kinds of RDF queries
Definition of query relaxation strategies based on MFSs/XSSs
G. FOKOU, S. JEAN, A. HADJALI, M. BARON Cooperative Techniques for SPARQL Query Relaxation 21/21