We propose a set of optimizations that can be applied to a given SPARQL query, and that guarantee that the optimized query has the same answers under bag semantics as the original query, provided that the queried RDF graph validates certain SHACL constraints. We prove the correctness of these optimizations and show how they can be propagated to larger queries while preserving answers. Further, we prove the confluence of rewritings that employ these optimizations, guaranteeing convergence to the same optimized query regardless of the rewriting order.
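The flavor of these optimizations can be shown with a tiny self-contained sketch (plain Python, triples as tuples; the graph, shape, and queries below are invented for illustration and are not the paper's actual rewriting rules): if a SHACL constraint guarantees that every subject has at most one `schema:email`, a DISTINCT over the projected subjects can be dropped without changing the bag of answers.

```python
# Toy sketch, NOT the paper's algorithm: a SHACL max-count constraint
# licensing the removal of DISTINCT under bag semantics.
# All IRIs and data below are invented for illustration.

EMAIL = "schema:email"

graph = {  # an RDF graph as a set of (subject, predicate, object) triples
    ("ex:alice", EMAIL, "alice@example.org"),
    ("ex:bob",   EMAIL, "bob@example.org"),
}

def validates_max_count_1(g, predicate):
    """SHACL-style sh:maxCount 1: no subject has two values for predicate."""
    subjects = [s for (s, p, o) in g if p == predicate]
    return len(subjects) == len(set(subjects))

def q(g):
    """Bag of answers to: SELECT ?p WHERE { ?p schema:email ?e }"""
    return sorted(s for (s, p, o) in g if p == EMAIL)

def q_distinct(g):
    """The same query with DISTINCT."""
    return sorted(set(q(g)))

# The graph validates the constraint, so the DISTINCT-free (cheaper)
# query returns the same bag of answers as the original.
assert validates_max_count_1(graph, EMAIL)
assert q(graph) == q_distinct(graph) == ["ex:alice", "ex:bob"]
```

Adding a second email for `ex:alice` would break both the constraint and the equivalence at once, which is exactly why the rewriting is only sound on graphs that validate the shape.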
This document discusses relational databases, RDF graphs, and constraints. It covers:
- Relational databases and their use of constraints like primary keys
- RDF graphs and their lack of explicit schema/constraints
- Mappings from relational databases to RDF graphs using direct mapping and R2RML
- Approaches to rewrite database constraints to SHACL constraints to validate the mapped RDF graph
- Opportunities to optimize SPARQL queries using inferred constraints from the SHACL shapes
Two graph data models: RDF and Property Graphs (andyseaborne)
This document provides an overview of two graph data models: RDF and Property Graphs. It describes the key components of each model, including triples for RDF and nodes/edges/properties for Property Graphs. It also discusses Apache projects that work with each model like Apache Jena for RDF and Apache TinkerPop, Spark, Giraph and Flink for Property Graphs. Finally, it notes that while the models have different focuses, they could potentially share technologies like storage and query capabilities.
This document provides an outline for a WWW 2012 tutorial on schema mapping with SPARQL 1.1. The outline includes sections on why data integration is important, schema mapping, translating RDF data with SPARQL 1.1, and common mapping patterns. Mapping patterns discussed include simple renaming, structural patterns like renaming based on property existence or value, value transformation using SPARQL functions, and aggregation. The tutorial aims to show how SPARQL 1.1 can be used to express executable mappings between different data schemas and representations.
RSP-QL*: Querying Data-Level Annotations in RDF Streams (keski)
This document proposes an extension to RSP-QL called RSP-QL* that allows querying of statement-level annotations in RDF streams. RSP-QL* uses the RDF* model, which allows embedding RDF triples as the subject or object of other triples. This provides an efficient way to represent statement-level metadata in RDF. The semantics of RSP-QL are extended to support RSP-QL* patterns, which can include basic graph patterns, named graphs, windows and other operators. Future work includes adding more functionality to the RDF* model, prototyping an implementation, and evaluating performance.
This document provides an introduction and examples for SHACL (Shapes Constraint Language), a W3C recommendation for validating RDF graphs. It defines key SHACL concepts like shapes, targets, and constraint components. An example shape validates nodes with a schema:name and schema:email property. Constraints like minCount, maxCount, datatype, nodeKind, and logical operators like and/or are demonstrated. The document is an informative tutorial for learning SHACL through examples.
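The constraint components listed in that summary can be mimicked in miniature (plain Python; shape, data, and property names are invented, and a real SHACL engine does far more):

```python
# Illustrative only: hand-rolled checks mimicking the SHACL constraint
# components mentioned above (sh:minCount, sh:maxCount, sh:datatype).
# All data and the shape itself are invented for this sketch.

graph = {
    ("ex:alice", "schema:name", "Alice"),
    ("ex:alice", "schema:email", "alice@example.org"),
    ("ex:bob", "schema:name", "Bob"),  # no schema:email
}

def values(g, node, prop):
    return [o for (s, p, o) in g if s == node and p == prop]

def conforms(g, node):
    """Shape: exactly one string-valued schema:name and schema:email."""
    for prop in ("schema:name", "schema:email"):
        vs = values(g, node, prop)
        if len(vs) != 1:  # sh:minCount 1 ; sh:maxCount 1
            return False
        if not all(isinstance(v, str) for v in vs):  # roughly sh:datatype xsd:string
            return False
    return True

assert conforms(graph, "ex:alice")
assert not conforms(graph, "ex:bob")  # missing email violates the minCount
```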
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac... (Databricks)
This document discusses property graphs and how they are represented and queried using Morpheus, a graph query engine for Apache Spark.
Morpheus allows querying property graphs using Cypher and represents property graphs using DataFrames, with node and relationship data stored in tables. It integrates with various data sources and supports federated queries across multiple property graphs. The document provides examples of loading property graph data from sources like JSON, SQL databases and Neo4j, creating graph projections, running analytical queries, and recommending businesses based on graph algorithms.
SPARQL 1.1 introduced several new features including:
- Updated versions of the SPARQL Query and Protocol specifications
- A SPARQL Update language for modifying RDF graphs
- A protocol for managing RDF graphs over HTTP
- Service descriptions for describing SPARQL endpoints
- Basic federated query capabilities
- Other minor features and extensions
The document discusses Semantic Web technologies including RDF, SPARQL and ontologies. It provides:
1) An introduction to the Semantic Web vision of machines being able to understand and respond to complex requests based on meaning. This requires information to be semantically structured.
2) A brief overview of key concepts in RDF including triples, nodes, blank nodes, and predefined RDF structures like bags and lists.
3) An explanation of the SPARQL query language, which is similar to SQL but interrogates the Semantic Web. SPARQL clauses like SELECT, CONSTRUCT, DESCRIBE and ASK are covered.
4) A discussion of ontological representations including R
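The SPARQL query forms named in that outline (SELECT, ASK) can be sketched in a few lines over a toy in-memory triple set (data and helper names invented for illustration):

```python
# Sketch only: SELECT-like and ASK-like evaluation over invented data,
# with None acting as a wildcard in a triple pattern.

graph = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
}

def select_objects(g, subject, predicate):
    """Like: SELECT ?o WHERE { <subject> <predicate> ?o }"""
    return sorted(o for (s, p, o) in g if s == subject and p == predicate)

def ask(g, pattern):
    """Like ASK: does any triple match the pattern (None = wildcard)?"""
    return any(all(q is None or q == t for q, t in zip(pattern, triple))
               for triple in g)

assert select_objects(graph, "ex:alice", "foaf:knows") == ["ex:bob"]
assert ask(graph, (None, "foaf:knows", "ex:carol"))   # someone knows carol
assert not ask(graph, ("ex:carol", None, None))       # carol states nothing
```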
The document provides an overview of validation of RDF data using the SHACL (Shapes Constraint Language) recommendation. It begins with background on RDF and then discusses why validation of RDF data is important. It introduces key SHACL concepts like shapes, constraints, targets, and property shapes. Examples are provided to illustrate node shapes, value type constraints, cardinality constraints, logical constraints, and property pair constraints. The document serves as an introduction to validating RDF data using the SHACL language.
The document discusses RDF Shapes, which are used to describe and validate RDF data. It provides examples of using ShEx and SHACL to define shapes for RDF graphs and validate instance data against those shapes. Key points covered include the differences between ShEx and SHACL, such as ShEx focusing on defining structures while SHACL adds target declarations, and how both can be used to generate validation reports.
ShEx is a language for validating RDF data. It allows defining shapes that specify constraints on nodes and triples. ShEx expressions can be used to validate if RDF graphs conform to the defined shapes. The ShEx language is inspired by languages like RelaxNG and provides different serialization formats like ShExC, ShExJ, and ShExR. There are open-source implementations of ShEx validators in languages like JavaScript, Scala, Ruby, Python, and Java. ShEx provides a concise way to define RDF shapes and validate instance data against those shapes.
Federation and Navigation in SPARQL 1.1 (net2-project)
This document discusses new features in SPARQL 1.1, including federation using the SERVICE operator and navigation using property paths. It provides an overview of the basics of SPARQL and the syntax and semantics of SPARQL 1.0 queries before explaining federation, which allows querying multiple datasets, and navigation, which allows navigating RDF graphs using regular expressions to match properties. It also discusses the evaluation procedures and complexity of these new features.
Comparison of features between ShEx (Shape Expressions) and SHACL (Shapes Constraint Language)
Changelog:
11/06/2017
- Removed slides about compositionality
31/05/2017
- Added slide 30 about validation report
- Added slide 32 about stems
- Changed slides 7 and 8 adapting compact syntax to new operator .
23/05/2017:
- Slide 14: Fixed typos in sh:entailment and rdfs:range
21/05/2017:
- Slide 8. Changed the example to be an IRI and a datatype
- Added "typically" in slide 9
- Slide 10: Removed the phrase "Target declarations can be problematic when reusing/importing shapes"
and created slide 27 to talk about reusability
- Added slide 11 to talk about the differences in triggering validation
- Created slide 14 to talk about inference
- Renamed slide 15 as "Inference and triggering mechanism"
- Added slides 27 and 28 to talk about reusability
- Added slide 29 to talk about annotations
18/05/2017
- Slide 9 now includes an example using the ShEx RDF vocabulary
- Slide 10 now says that target declarations are optional
- Slide 13 now says that some RDF Schema terms have special treatment in SHACL
- Example in slide 18 now uses sh:or instead of sh:and
- Added slides 22, 23 and 24 which show some features supported by SHACL but not supported by ShEx (property pair constraints, uniqueLang and owl:imports)
This document introduces SPARQL, the SPARQL query language used to retrieve and manipulate RDF data. It provides an example SPARQL query to return full names from a sample RDF graph. It then describes what a SPARQL Service Description is, which is a vocabulary for discovering and describing SPARQL services and endpoints. It outlines several properties and classes used in SPARQL Service Descriptions.
This document provides an overview of the Semantic Web, RDF, SPARQL, and triplestores. It discusses how RDF structures and links data using subject-predicate-object triples. SPARQL is introduced as a standard query language for retrieving and manipulating data stored in RDF format. Popular triplestore implementations like Apache Jena and applications of linked data like DBPedia are also summarized.
Validating and Describing Linked Data Portals using RDF Shape Expressions (Jose Emilio Labra Gayo)
Presentation at 1st Linked Data Quality Workshop, Leipzig, 2nd Sept. 2014
Author: Jose Emilio Labra Gayo
Applies Shapes Expressions to validate the WebIndex linked data portal
Introduction to Spark Datasets - Functional and relational together at last (Holden Karau)
Spark Datasets are an evolution of Spark DataFrames which allow us to work with both functional and relational transformations on big data with the speed of Spark.
This document provides an overview of SPARQL, the SPARQL Query Language. It begins by explaining that SPARQL is an RDF query language designed to query graphs of RDF data. It then describes some key aspects of SPARQL including that it is based on matching graph patterns against RDF graphs, supports basic graph patterns through triple patterns, and allows for implicit and explicit joins. The document provides examples of SPARQL queries and discusses features like select-from-where structure, blank nodes, and group patterns.
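The "implicit join" idea from that overview, where two triple patterns share a variable, can be sketched directly (invented data; a real engine would of course use indexes rather than nested loops):

```python
# Sketch only: an implicit join between two triple patterns sharing ?x,
# as in: SELECT ?x ?n WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?n }
# Data is invented for illustration.

graph = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", "Bob"),
}

def eval_bgp(g):
    results = []
    for (s1, p1, x) in g:
        if s1 == "ex:alice" and p1 == "foaf:knows":
            for (s2, p2, n) in g:
                if s2 == x and p2 == "foaf:name":  # join on the shared ?x
                    results.append((x, n))
    return sorted(results)

assert eval_bgp(graph) == [("ex:bob", "Bob")]
```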
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z... (Data Con LA)
Data transformation has traditionally required expertise in specialized data platforms and typically been restricted to the domain of IT. A domain specific language (DSL) separates the user’s intent from a specific implementation, while maintaining expressivity. A user interface can be used to produce these expressions, in the form of suggestions, without requiring the user to manually write code. This higher level interaction, aided by transformation previews and suggestion ranking allows domain experts such as data scientists and business analysts to wrangle data while leveraging the optimal processing framework for the data at hand.
Alpine academy apache spark series #1 introduction to cluster computing wit... (Holden Karau)
Alpine academy apache spark series #1 introduction to cluster computing with python & a wee bit of scala. This is the first in the series and is aimed at the intro level, the next one will cover MLLib & ML.
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs (Josef Petrák)
The document discusses the Semantic Web and RDF data formats. It provides an overview of RDF syntaxes like RDF/XML, N3, N-Triples, RDF/JSON, and RDFa. It also discusses software APIs for working with RDF data in languages like Java, PHP, and Ruby. The document outlines handling RDF data using statement-centric, resource-centric, and ontology-centric models, as well as named graphs. It provides examples of reading RDF data from files and querying RDF data using SPARQL.
This document provides an overview of RDF, RDFS, and OWL, which are graph data models used to represent data on the Semantic Web. It describes the core components of RDF, including URIs, triples, and data types. It also explains how RDF graphs can be represented in N-Triples format or XML. Additionally, it covers RDF Schema (RDFS) and how it adds a type system to RDF through classes, subclasses, domains, and ranges of properties. The document concludes by noting some limitations of RDF and RDFS in modeling complex constraints and relationships.
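The domain/range typing described there is essentially a small inference rule; a sketch of rdfs:domain inference (vocabulary invented, corresponding roughly to RDFS entailment rule rdfs2):

```python
# Sketch only: if a property has an rdfs:domain, every subject using that
# property is inferred to be an instance of the domain class (rule rdfs2).
# Vocabulary and data are invented for illustration.

schema = {("ex:teaches", "rdfs:domain", "ex:Teacher")}
data = {("ex:ann", "ex:teaches", "ex:math101")}

def infer_domain_types(schema_triples, data_triples):
    domains = {s: o for (s, p, o) in schema_triples if p == "rdfs:domain"}
    inferred = set()
    for (s, p, o) in data_triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))
    return inferred

assert infer_domain_types(schema, data) == {("ex:ann", "rdf:type", "ex:Teacher")}
```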
Inductive Triple Graphs: A purely functional approach to represent RDF (Jose Emilio Labra Gayo)
Slides of my presentation on 3rd International Workshop on Graph Structures for Knowledge Representation, part of the International Joint Conference on Artificial Intelligence, Beijing, China. 4 August 2013
SparkR: Enabling Interactive Data Science at Scale on Hadoop (DataWorks Summit)
SparkR enables interactive data science at scale on Hadoop by providing an R interface to Apache Spark. Some key points:
- SparkR allows users to manipulate distributed datasets (RDDs) using familiar R operations like map, filter, reduceByKey.
- It integrates R and Spark by running R code on Spark executors via JNI, allowing R scripts to process large datasets in parallel.
- Examples show how to do tasks like word count and logistic regression on Spark using R code, demonstrating the ability to scale R for data science on big data.
"SPARQL Cheat Sheet" is a short collection of slides intended to act as a guide to SPARQL developers. It includes the syntax and structure of SPARQL queries, common SPARQL prefixes and functions, and help with RDF datasets.
The "SPARQL Cheat Sheet" is intended to accompany the SPARQL By Example slides available at http://www.cambridgesemantics.com/2008/09/sparql-by-example/ .
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers (akankshawande)
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
More Related Content
Similar to Optimizing SPARQL Queries with SHACL.pdf
Infrastructure Challenges in Scaling RAG with Custom AI models (Zilliz)
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
CAKE: Sharing Slices of Confidential Data on BlockchainClaudio Di Ciccio
Presented at the CAiSE 2024 Forum, Intelligent Information Systems, June 6th, Limassol, Cyprus.
Synopsis: Cooperative information systems typically involve various entities in a collaborative process within a distributed environment. Blockchain technology offers a mechanism for automating such processes, even when only partial trust exists among participants. The data stored on the blockchain is replicated across all nodes in the network, ensuring accessibility to all participants. While this aspect facilitates traceability, integrity, and persistence, it poses challenges for adopting public blockchains in enterprise settings due to confidentiality issues. In this paper, we present a software tool named Control Access via Key Encryption (CAKE), designed to ensure data confidentiality in scenarios involving public blockchains. After outlining its core components and functionalities, we showcase the application of CAKE in the context of a real-world cyber-security project within the logistics domain.
Paper: https://doi.org/10.1007/978-3-031-61000-4_16
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
OpenID AuthZEN Interop Read Out - AuthorizationDavid Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfTechgropse Pvt.Ltd.
In this blog post, we'll delve into the intersection of AI and app development in Saudi Arabia, focusing on the food delivery sector. We'll explore how AI is revolutionizing the way Saudi consumers order food, how restaurants manage their operations, and how delivery partners navigate the bustling streets of cities like Riyadh, Jeddah, and Dammam. Through real-world case studies, we'll showcase how leading Saudi food delivery apps are leveraging AI to redefine convenience, personalization, and efficiency.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
3. RDF
▶ Standard for web data
▶ W3C Rec. since 1999
▶ RDF 1.0, 2004 https://www.w3.org/TR/rdf-primer/
▶ RDF 1.1, 2014 https://www.w3.org/TR/rdf11-concepts/
▶ W3C working draft for RDF 1.2, 2023 https://www.w3.org/TR/rdf12-concepts/
4. RDF Syntax
▶ IRIs to reference resources on web
▶ Statements as nodes and arcs in a graph, in the form of triples
"(Subject, Predicate, Object)". E.g., "Mona Lisa has a creator whose
value is Leonardo Da Vinci":
Subject:   https://en.wikipedia.org/wiki/Mona_Lisa
Predicate: http://purl.org/dc/terms/creator
Object:    https://en.wikipedia.org/wiki/Leonardo_da_Vinci
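Concretely, a triple can be modeled as a plain 3-tuple and a graph as a set of such tuples; the following sketch uses plain Python (no RDF library) with the example IRIs above:

```python
# A minimal sketch (plain Python, no RDF library): the Mona Lisa statement
# as a (subject, predicate, object) triple, and an RDF graph as a set of them.
triple = (
    "https://en.wikipedia.org/wiki/Mona_Lisa",          # subject
    "http://purl.org/dc/terms/creator",                 # predicate
    "https://en.wikipedia.org/wiki/Leonardo_da_Vinci",  # object
)
graph = {triple}  # an RDF graph is a set of triples
subject, predicate, obj = triple
print(subject.rsplit("/", 1)[-1])  # → Mona_Lisa
```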
7. RDF: Constraints?
W3C defines RDF as an "assertional logic", where each triple expresses a simple proposition.
▶ This logical framework imposes a strict monotonic discipline on the language, preventing the expression of closed-world assumptions, local default preferences, and other commonly used non-monotonic constructs.
8. SHACL
▶ Constraint language for RDF
▶ W3C Rec. since July 2017
Other constraint languages:
▶ SPIN - SPARQL Syntax, (2009) 2011
https://www.w3.org/submissions/2011/SUBM-spin-sparql-20110222/
▶ IBM Resource Shape 2.0, 2014 https://www.w3.org/submissions/shapes/
▶ Shape Expressions Language 2.0, 2017, http://shex.io/shex-semantics-20170713/
9. SHACL
▶ relies on the notion of "shapes"
e.g.,
:EmployeeNode a sh:NodeShape ;
    sh:targetClass :Employee ;
    sh:property [ sh:path :hasAddress ;
        sh:nodeKind sh:Literal ;
        sh:maxCount 1 ; sh:minCount 1 ;
        sh:datatype xsd:string ] ;
    sh:property [ sh:path :hasAddress ;
        dash:uniqueValueForClass :Employee ] .
10. SHACL Shape
▶ relies on the notion of "shapes"
e.g.,
:EmployeeNode a sh:NodeShape ;              # shape name
    sh:targetClass :Employee ;              # target definition
    sh:property [ sh:path :hasAddress ;     # constraint definitions
        sh:nodeKind sh:Literal ;
        sh:maxCount 1 ;
        sh:minCount 1 ;
        sh:datatype xsd:string ] ;
    sh:property [ sh:path :hasAddress ;
        dash:uniqueValueForClass :Employee ] .
11. SHACL: Constraint Validation
Consider an RDF graph and a SHACL shape, both written in Turtle syntax:
:Ida a :Employee;
:hasID "001"^^xsd:int;
:hasAddress "Oslo".
:Ingrid a :Employee;
:hasID "002"^^xsd:int;
:hasAddress "Bergen".
:EmployeeNode a sh:NodeShape;
sh:targetClass :Employee;
sh:property [ sh:path :hasAddress;
sh:nodeKind sh:Literal;
sh:maxCount 1; sh:minCount 1;
sh:datatype xsd:string ];
sh:property [ sh:path :hasAddress;
dash:uniqueValueForClass
:Employee ].
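Checking these two constraints can be sketched in a few lines of plain Python (a toy validator, not a SHACL engine; the helper names are ours):

```python
# Toy conformance check for :EmployeeNode (a sketch, not a SHACL engine;
# all function names are assumptions). The graph is a set of (s, p, o) triples.
graph = {
    (":Ida", "a", ":Employee"),
    (":Ida", ":hasID", '"001"^^xsd:int'),
    (":Ida", ":hasAddress", '"Oslo"'),
    (":Ingrid", "a", ":Employee"),
    (":Ingrid", ":hasID", '"002"^^xsd:int'),
    (":Ingrid", ":hasAddress", '"Bergen"'),
}

def values(g, node, path):
    """All objects reachable from node via the given property path."""
    return [o for (s, p, o) in g if s == node and p == path]

def validate_employee_node(g):
    employees = {s for (s, p, o) in g if p == "a" and o == ":Employee"}
    # sh:minCount 1 / sh:maxCount 1: exactly one :hasAddress per employee
    if any(len(values(g, e, ":hasAddress")) != 1 for e in employees):
        return False
    # dash:uniqueValueForClass: no two employees share an address
    addresses = [values(g, e, ":hasAddress")[0] for e in employees]
    return len(addresses) == len(set(addresses))

print(validate_employee_node(graph))  # → True: both employees conform
```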
18. SHACL: Abstract Syntax
Let S, C and P be countably infinite and mutually disjoint sets of shape, class and property names.
Shape targets τs and constraints ϕs are expressions defined by the grammar
τs := sh:targetClass C | sh:targetSubjectsOf P | sh:targetObjectsOf P
ϕs := ≥n α. β | ≤n α. β | ▷τs α | α1 = α2 | ϕs ∧ ϕs
β := ⊤ | C | s′ | ¬β
where α, α1, α2 ∈ P ∪ {P− | P ∈ P}, C ∈ C and s, s′ ∈ S.
19. SHACL: Abstract Syntax
Shape targets τs and constraints ϕs are expressions defined by the grammar:
τs := sh:targetClass C | sh:targetSubjectsOf P | sh:targetObjectsOf P
τs := C | P | P− (i.e., short syntax)
ϕs := ≥n α. β | ≤n α. β | ▷τs α | α1 = α2 | ϕs ∧ ϕs
β := ⊤ | C | s′ | ¬β
A shape in abstract syntax:
⟨Employee, τEmployee, ϕEmployee⟩ with τEmployee = :Employee and
ϕEmployee = (=1 hasAddress. ⊤) ∧ (▷τEmployee hasAddress).
20. SHACL
Shape targets τs and constraints ϕs are expressions defined by the grammar:
τs := C | P | P− (i.e., short syntax)
ϕs := ≥n α. β | ≤n α. β | ▷τs α | α1 = α2 | ϕs ∧ ϕs
β := ⊤ | C | s′ | ¬β
A shape in abstract syntax:
⟨Employee, τEmployee, ϕEmployee⟩ with τEmployee = :Employee and
ϕEmployee = (=1 hasAddress. ⊤) ∧ (▷τEmployee hasAddress).
Once the context is clear, we simply write:
⟨Employee, :Employee, (=1 hasAddress. ⊤) ∧ (▷τEmployee hasAddress)⟩
21. SPARQL
▶ Query language for RDF
▶ W3C Rec. since January 2008
▶ SPARQL 1.1, 2013 https://www.w3.org/TR/sparql11-query/
▶ W3C working draft for SPARQL 1.2, 2023 https://www.w3.org/TR/sparql12-update/
▶ W3C community draft for RDF∗ and SPARQL∗, 2021 https://www.w3.org/2021/12/rdf-star.html
22. SPARQL: Query Variables?
▶ For queries we need variables; SPARQL variables are bound to RDF terms
▶ E.g., ?title, ?author, ?published
▶ As in SQL, variables are queried via a SELECT statement
▶ E.g., SELECT ?title ?author ?published
A SELECT statement returns the query result as a table:
?title             | ?author               | ?published
Games of no chance | Richard J. Nowakowski | 1999
Calculated Bets    | Steven S. Skiena      | 2001
▶ Bag Semantics
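The following toy sketch (assumed data, plain Python) shows why the result is a bag: projecting a variable away can leave duplicate rows, which SPARQL keeps unless DISTINCT is used:

```python
# Toy illustration of bag semantics (data assumed): solution mappings are
# rows; projecting away ?person leaves duplicate rows, and SPARQL keeps
# them unless DISTINCT is applied.
solutions = [
    {"person": ":Ida", "city": "Oslo"},
    {"person": ":Nils", "city": "Oslo"},
]
cities = [{"city": m["city"]} for m in solutions]  # SELECT ?city ...
print(cities)                            # two identical rows: a bag, not a set
print(len({m["city"] for m in cities}))  # 1 — what SELECT DISTINCT would return
```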
25. SPARQL Algebra
SPARQL query is a graph pattern P defined by the grammar
P := B | FilterF (P) | Union(P1, P2) | Join(P1, P2) | Minus(P1, P2)
| DiffF (P1, P2) | OptF (P1, P2) | ProjL(P) | Dist(P)
26. SPARQL Algebra
SPARQL query is a graph pattern P defined by the grammar
P := B | FilterF (P) | Union(P1, P2) | Join(P1, P2) | Minus(P1, P2)
| DiffF (P1, P2) | OptF (P1, P2) | ProjL(P) | Dist(P)
E.g., consider a nested SPARQL query that retrieves the names of employees and their office addresses:
SELECT ?y ?z WHERE { ?x :hasName ?y
  { SELECT ?x ?z WHERE { ?x :hasOffice ?y . ?y :hasAddress ?z } } }
In SPARQL algebra:
Projyz (Join(hasName(x, y), Projxz (Join(hasOffice(x, y), hasAddress(y, z)))))
27. SPARQL Algebra
E.g., consider a nested SPARQL query that retrieves the names of employees and their office addresses:
SELECT ?y ?z WHERE { ?x :hasName ?y
  { SELECT ?x ?z WHERE { ?x :hasOffice ?y . ?y :hasAddress ?z } } }
In SPARQL algebra:
Projyz (Join(hasName(x, y), Projxz (Join(hasOffice(x, y), hasAddress(y, z)))))
Upon simplification (where possible; juxtaposition denotes Join, and the inner ?y is renamed to n), we get:
Projyz (hasName(x, y) hasOffice(x, n) hasAddress(n, z)) .
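A minimal evaluator for this algebra fragment can be sketched in plain Python (the toy graph and helper names are assumptions), representing solution mappings as dicts and bags as lists:

```python
# Sketch of SPARQL-algebra evaluation over a toy graph: triple patterns
# yield solution mappings (dicts), Join merges compatible mappings, and
# Proj restricts them to a variable list, keeping duplicates (bags).
graph = [
    (":Ida", ":hasName", "Ida"),
    (":Ida", ":hasOffice", ":o1"),
    (":o1", ":hasAddress", "Oslo"),
]

def pattern(g, pred, x, y):
    """Evaluate the triple pattern ?x pred ?y as a bag of mappings."""
    return [{x: s, y: o} for (s, p, o) in g if p == pred]

def join(left, right):
    """Join two bags: merge mappings that agree on their shared variables."""
    return [{**m1, **m2} for m1 in left for m2 in right
            if all(m1[v] == m2[v] for v in m1.keys() & m2.keys())]

def proj(bag, vars_):
    return [{v: m[v] for v in vars_ if v in m} for m in bag]

# Projyz(Join(hasName(x,y), Projxz(Join(hasOffice(x,y), hasAddress(y,z)))))
inner = proj(join(pattern(graph, ":hasOffice", "x", "y"),
                  pattern(graph, ":hasAddress", "y", "z")), ["x", "z"])
result = proj(join(pattern(graph, ":hasName", "x", "y"), inner), ["y", "z"])
print(result)  # → [{'y': 'Ida', 'z': 'Oslo'}]
```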
28. SPARQL Algebra: some notions on query evaluation?
The semantics of graph patterns is defined in terms of (solution) mappings: partial functions
µ : V → T with (possibly empty) dom(µ),
where T = I ∪ B ∪ L is the set of RDF terms and V is a countably infinite set of variables disjoint from T.
29. SPARQL Algebra: some notions on query evaluation?
Partial functions µ : V → T. Let
▶ µ|L be the restriction of mapping µ to L ⊆ V
▶ µ|L̄ be the restriction of mapping µ to V ∖ L
Evaluation of a SPARQL query Q over an RDF graph G, denoted by Q^G, returns a multiset (i.e., bag) of mappings.
▶ Q^G|X̄ is the multiset of mappings µ ∈ Q^G restricted to V ∖ X, i.e.,
  |µ, Q^G|X̄| = Σ_{µ′ : µ = µ′|X̄} |µ′, Q^G|
  where |µ, M| denotes the multiplicity of µ in the multiset M
▶ The support of the multiset Q^G, denoted sup(Q^G), is
  sup(Q^G) = {µ | |µ, Q^G| > 0}
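These definitions can be sketched with `collections.Counter` (mappings encoded as sorted tuples of variable–term pairs; a toy multiset, not a SPARQL engine):

```python
from collections import Counter

# Sketch: a query answer Q^G as a multiset of solution mappings; restricting
# to the complement of X sums the multiplicities of mappings that collapse.
QG = Counter({
    (("x", ":Ida"), ("y", "Oslo")): 1,
    (("x", ":Bob"), ("y", "Oslo")): 2,
})

def restrict_away(bag, X):
    """Compute Q^G|X̄: drop the variables in X, adding up multiplicities."""
    out = Counter()
    for mapping, mult in bag.items():
        out[tuple((v, t) for (v, t) in mapping if v not in X)] += mult
    return out

restricted = restrict_away(QG, {"x"})
print(restricted[(("y", "Oslo"),)])  # 1 + 2 = 3

# Support: the set of mappings with multiplicity > 0
support = {m for m, mult in restricted.items() if mult > 0}
print(support)  # {(('y', 'Oslo'),)}
```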
31. Optimization: Problem Statement
Let S be a set of SHACL shapes, and Q a SPARQL query.
Our goal is to find optimal S-equivalent queries Q′ of the original query Q, where
Q ≡S Q′ iff ∀G. G |= S implies Q^G = Q′^G
32. Optimization: Equivalences
Let U and V be two graph patterns, and S a set of SHACL shapes.
U ≡S V iff ∀G. G |= S implies U^G = V^G
U ≡S,y V iff ∀G. G |= S implies U^G|ȳ = V^G|ȳ
U ≅S,y V iff ∀G. G |= S implies sup(U^G|ȳ) = sup(V^G|ȳ)
33. Optimization: Rewriting Rules
Let U and V be two graph patterns, and S a set of SHACL shapes.
U ≡S V iff ∀G. G |= S implies U^G = V^G
U ≡S,y V iff ∀G. G |= S implies U^G|ȳ = V^G|ȳ
U ≅S,y V iff ∀G. G |= S implies sup(U^G|ȳ) = sup(V^G|ȳ)
We then propose a set of query rewriting rules based on these equivalences that:
1. reduce Optional patterns to Join patterns
2. remove redundant Join patterns
3. eliminate the Dist operator, etc.
34. An Example of Query Rewriting
Consider a SPARQL query,
Dist(Projxy (Opt⊤(Employee(x), hasAddress(x, y))))
over graph G,
:Ida a :Employee;
:hasID "001"^^xsd:int;
:hasAddress "Oslo".
:Yacob a :Employee;
. . .
:Nils a :Employee;
. . .
:Ingrid a :Employee;
:hasID "002"^^xsd:int;
:hasAddress "Bergen".
. . .
35. An Example of Query Rewriting
Consider a SPARQL query,
Dist(Projxy (Opt⊤(Employee(x), hasAddress(x, y))))
over graph G,
:Ida a :Employee;
:hasID "001"^^xsd:int;
:hasAddress "Oslo".
:Yacob a :Employee;
. . .
:Ingrid a :Employee;
:hasID "002"^^xsd:int;
:hasAddress "Bergen".
. . .
Assume G satisfies the shape
⟨Employee, :Employee, (=1 hasAddress. ⊤) ∧ (▷τEmployee hasAddress)⟩
36. An Example of Query Rewriting
Consider the query over G,
Dist(Projxy (Opt⊤(Employee(x), hasAddress(x, y)))) .
▶ Since G satisfies ϕEmployee = (=1 hasAddress. ⊤), the Opt pattern can be reduced to a Join pattern:
Dist(Projxy (Join(Employee(x), hasAddress(x, y)))).
▶ Since G satisfies ϕEmployee = (▷τEmployee hasAddress), Dist can be removed:
Projxy (Join(Employee(x), hasAddress(x, y))).
37. An Example of Query Rewriting
Consider the query over G,
Dist(Projxy (Opt⊤(Employee(x), hasAddress(x, y)))) .
▶ Since G satisfies ϕEmployee = (=1 hasAddress. ⊤), the Opt pattern can be reduced to a Join pattern:
Dist(Projxy (Join(Employee(x), hasAddress(x, y)))).
▶ Since G satisfies ϕEmployee = (▷τEmployee hasAddress), Dist can be removed:
Projxy (Join(Employee(x), hasAddress(x, y))).
"≡S Equivalent Queries"
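Both rewrites can be sanity-checked on a toy graph satisfying the shape (a plain-Python sketch; the evaluator and the data are our assumptions, not the paper's implementation):

```python
# Toy check: over a graph where every :Employee has exactly one unique
# :hasAddress, Dist(Opt(...)) and plain Join(...) return the same bag.
graph = [
    (":Ida", "a", ":Employee"), (":Ida", ":hasAddress", "Oslo"),
    (":Ingrid", "a", ":Employee"), (":Ingrid", ":hasAddress", "Bergen"),
]

def cls(c):
    return [{"x": s} for (s, p, o) in graph if p == "a" and o == c]

def prop(p_, x, y):
    return [{x: s, y: o} for (s, p, o) in graph if p == p_]

def join(l, r):
    return [{**a, **b} for a in l for b in r
            if all(a[v] == b[v] for v in a.keys() & b.keys())]

def opt(l, r):
    """OPTIONAL: joined mapping when one exists, else the left mapping alone."""
    out = []
    for a in l:
        matches = [{**a, **b} for b in r
                   if all(a[v] == b[v] for v in a.keys() & b.keys())]
        out.extend(matches if matches else [a])
    return out

def dist(bag):
    seen, out = set(), []
    for m in bag:
        key = tuple(sorted(m.items()))
        if key not in seen:
            seen.add(key)
            out.append(m)
    return out

# Projxy is the identity here, since every mapping binds only x and y.
original  = dist(opt(cls(":Employee"), prop(":hasAddress", "x", "y")))
rewritten = join(cls(":Employee"), prop(":hasAddress", "x", "y"))
assert sorted(map(str, original)) == sorted(map(str, rewritten))
print("equivalent on this graph")
```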
38. Optimization: Example of Rewriting Rules
Lemma
Let ⟨s, τs, ϕs⟩ ∈ S with (≥n P. ⊤) ∈ ϕs s.t. n ≥ 1, and P a graph pattern s.t. T ◀ P. If y ∉ var(P), then
1. OptF (P, P(x, y)) ≡S FilterF (Join(P, P(x, y)))
2. Join(P, P(x, y)) ≅S,y P
where T = C(x) if τs = C; T = R(x, z) if τs = ∃R; T = R−(x, z) if τs = ∃R−.
Corollary
Let ⟨s, τs, ϕs⟩ ∈ S with (≥n P. ⊤) ∈ ϕs s.t. n ≥ 1, and P a graph pattern s.t. T ◀ P. If y ∉ var(P ∪ F), then
1. FilterF (Join(P, P(x, y))) ≅S,y FilterF (P)
2. OptF (P, P(x, y)) ≅S,y FilterF (P)
39. Optimization: Example of Rewriting Rules
T = C(x) if τs = C; T = R(x, z) if τs = ∃R; T = R−(x, z) if τs = ∃R−.
Corollary
Let ⟨s, τs, ϕs⟩ ∈ S with (≥n P. ⊤) ∈ ϕs s.t. n ≥ 1, and P a graph pattern s.t. T ◀ P. If y ∉ var(P ∪ F), then
FilterF (Join(P, P(x, y))) ≅S,y FilterF (P) .
Consider the query Q over G |= ⟨Student, :Student, (≥n hasAddress. ⊤)⟩:
Q = Dist(Projxy (Filterregex(y,"Smith")(Join(Student(x) lastName(x, y), hasAddress(x, z)))))
40. Optimization: Example of Rewriting Rules
Corollary
Let ⟨s, τs, ϕs⟩ ∈ S with (≥n P. ⊤) ∈ ϕs s.t. n ≥ 1, and P a graph pattern s.t. T ◀ P. If y ∉ var(P ∪ F), then
FilterF (Join(P, P(x, y))) ≅S,y FilterF (P) .
Consider the query Q over G |= ⟨Student, :Student, (≥n hasAddress. ⊤)⟩:
Q = Dist(Projxy (Filterregex(y,"Smith")(Join(Student(x) lastName(x, y), hasAddress(x, z)))))
Then, by the Corollary, we can reduce the query Q to:
Dist(Projxy (Filterregex(y,"Smith")(Student(x) lastName(x, y))))
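On a toy graph where every :Student has at least one address (a sketch with assumed data and helper names), the corollary's ≅S,y claim can be observed directly: once the fresh variable z is projected away, the supports coincide even though multiplicities differ:

```python
import re

# Sketch of the Corollary on a toy graph where every :Student has at least
# one :hasAddress value: after projecting z away, the extra join changes
# multiplicities at most, not the set of answers.
graph = [
    (":Ann", "a", ":Student"), (":Ann", ":lastName", "Smith"),
    (":Ann", ":hasAddress", "Oslo"), (":Ann", ":hasAddress", "Bergen"),
]

def cls(c):
    return [{"x": s} for (s, p, o) in graph if p == "a" and o == c]

def prop(p_, x, y):
    return [{x: s, y: o} for (s, p, o) in graph if p == p_]

def join(l, r):
    return [{**a, **b} for a in l for b in r
            if all(a[v] == b[v] for v in a.keys() & b.keys())]

def filt(bag):  # Filter_regex(y, "Smith")
    return [m for m in bag if re.search("Smith", m["y"])]

def support_wo_z(bag):  # support of the bag restricted away from z
    return {tuple(sorted((v, t) for v, t in m.items() if v != "z")) for m in bag}

base = join(cls(":Student"), prop(":lastName", "x", "y"))
with_join = filt(join(base, prop(":hasAddress", "x", "z")))  # two copies of Ann
without   = filt(base)                                       # one copy
print(support_wo_z(with_join) == support_wo_z(without))  # → True
```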
41. Property of Query Rewriting Rules
▶ Propagation to Larger Queries
▶ Confluent Reduction
42. Property of Query Rewriting Rules: Propagation
Definition
Let Q be a SPARQL query, and let P and U be two graph patterns. Then, we write U ◁∼ Q if Dist(ProjX (P)) ⊴ Q and U ⊴ P.
Theorem
Let Q be a SPARQL query and S a SHACL document. Let U and V be two graph patterns. Then,
1. Q ≡S Q_{U↦V} if U ≡S V
2. ProjX (Q) ≡S ProjX (Q)_{U↦V} if U ⊴ Q, U ≡S,y V and y ∉ var(ProjX (Q) ∖ U)
3. Q ≡S Q_{U↦V} if U ◁∼ Q, U ≅S,y V and y ∉ var(Q ∖ U)
43. Property of Query Rewriting Rules: Confluent Reduction
Consider the SPARQL query
Dist(Projxy (employeeID(x, y) hiredBy(x, k) insuredBy(x, z)))
over a graph G s.t. G |= ⟨∃employeeID, τ∃employeeID, ϕ∃employeeID⟩ with
{(▷∃employeeID employeeID), (=1 insuredBy. ⊤), (hiredBy = insuredBy)} ⊆ ϕ∃employeeID .
44. Property of Query Rewriting Rules: Confluent Reduction
Consider the SPARQL query
Dist(Projxy (employeeID(x, y) hiredBy(x, k) insuredBy(x, z)))
over a graph G s.t. G |= ⟨∃employeeID, τ∃employeeID, ϕ∃employeeID⟩ with
{(▷∃employeeID employeeID), (=1 insuredBy. ⊤), (hiredBy = insuredBy)} ⊆ ϕ∃employeeID
Subsequently,
(=1 insuredBy. ⊤) ∧ (hiredBy = insuredBy) −→ (=1 hiredBy. ⊤)
(=1 hiredBy. ⊤) −→ (≥1 hiredBy. ⊤)
(=1 insuredBy. ⊤) −→ (≥1 insuredBy. ⊤)
We need to take care of all explicit and implicit rewriting rules.
45. Property of Query Rewriting Rules: Confluent Reduction
Consider the SPARQL query
Dist(Projxy (employeeID(x, y) hiredBy(x, k) insuredBy(x, z)))
over a graph G s.t. G |= ⟨∃employeeID, τ∃employeeID, ϕ∃employeeID⟩ with
{(▷∃employeeID employeeID), (=1 insuredBy. ⊤), (hiredBy = insuredBy)} ⊆ ϕ∃employeeID
Then, the query is subject to the following rewriting rules:
1. ≅S,y "Join" optimization based on (≥1 insuredBy. ⊤)
2. ≅S,y "Join" optimization based on (≥1 hiredBy. ⊤)
3. ≅S,y "Join" optimization based on (hiredBy = insuredBy)
4. ≡S,y "Join" optimization based on (=1 insuredBy. ⊤)
5. ≡S,y "Join" optimization based on (=1 hiredBy. ⊤)
6. ≡S "Dist" optimization based on (▷∃employeeID employeeID)
46. Property of Query Rewriting Rules: Confluent Reduction
Consider the SPARQL query
Dist(Projxy (employeeID(x, y) hiredBy(x, k) insuredBy(x, z)))
over a graph G s.t. G |= ⟨∃employeeID, τ∃employeeID, ϕ∃employeeID⟩.
Then, the query is subject to the following rewriting rules:
1. ≅S,y "Join" optimization based on (≥1 insuredBy. ⊤)
2. ≅S,y "Join" optimization based on (≥1 hiredBy. ⊤)
3. ≅S,y "Join" optimization based on (hiredBy = insuredBy)
4. ≡S,y "Join" optimization based on (=1 insuredBy. ⊤)
5. ≡S,y "Join" optimization based on (=1 hiredBy. ⊤)
6. ≡S "Dist" optimization based on (▷∃employeeID employeeID)
Regardless of the sequence in which these rewrites are applied, we will get:
Projxy (employeeID(x, y))
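A symbolic toy check of confluence (the rule functions below are our own simplifications, not the paper's rewriting system): applying the join and Dist eliminations in two different orders reaches the same normal form:

```python
# Toy confluence check: model the query body as a list of atoms plus a
# Dist flag, each rewrite as a function, and compare two application orders.
query = {"atoms": ["employeeID(x,y)", "hiredBy(x,k)", "insuredBy(x,z)"],
         "dist": True}

def drop_insuredBy(q):  # join elimination licensed by (=1 insuredBy. ⊤)
    q = dict(q)
    q["atoms"] = [a for a in q["atoms"] if not a.startswith("insuredBy")]
    return q

def drop_hiredBy(q):    # join elimination licensed by (=1 hiredBy. ⊤)
    q = dict(q)
    q["atoms"] = [a for a in q["atoms"] if not a.startswith("hiredBy")]
    return q

def drop_dist(q):       # Dist elimination licensed by (▷∃employeeID employeeID)
    q = dict(q)
    q["dist"] = False
    return q

order1 = drop_dist(drop_hiredBy(drop_insuredBy(query)))
order2 = drop_insuredBy(drop_dist(drop_hiredBy(query)))
print(order1 == order2)  # → True: both reach Projxy(employeeID(x,y))
```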
47. Property of Query Rewriting Rules: Confluent Reduction
... As the rewriting optimizations are generalized in the form of lemmas and their consequences, we state the confluence results as follows:
Theorem
Query rewriting defined by Lemmas 1 to 6 is a confluent reduction.
Theorem
Query rewriting defined by Lemmas 1 to 7 is a confluent reduction iff, in Lemma 7,
ϕ′ = ⊤ if P = T, and
ϕ′ = ⋀_{i=1}^{n} (=1 Pi . ⊤) if P = (T P1(x, z1) . . . Pi (x, zi ) . . . Pn(x, zn)).
48. Other or future work?
▶ Extension to SPARQL Property Path Queries
▶ Optimization of Ontology-Mediated Query Answering