The document discusses new improvements to the parser and optimizer in MySQL 5.7. Key points include:
1) The parser and optimizer were refactored for improved maintainability and stability. Parsing was separated from optimization and execution.
2) The cost model was improved with better record estimation for joins, configurable cost constants, and additional explain output.
3) A new query rewrite plugin allows rewriting queries without changing application code.
This presentation focuses on query optimization in MySQL from a developer's perspective. Developers should care about application performance, which includes optimizing SQL queries. It shows the execution plan in MySQL and explains its different formats: tabular, TREE, and JSON/visual explain plans. Optimizer features such as optimizer hints and histograms are covered, along with newer features from the latest releases: hash joins, the TREE explain plan, and EXPLAIN ANALYZE. Real examples of slow queries are included, with their optimization explained.
Introduction into MySQL Query Tuning for Dev[Op]s – Sveta Smirnova
Percona Live Online 2021 talk: https://www.percona.com/resources/videos/introduction-mysql-query-tuning-for-devops
In this talk I will show how to get started with MySQL query tuning. I will give a short introduction to physical table structure and demonstrate how it may influence query execution time.
Then we will discuss basic query tuning instruments and techniques, mainly the EXPLAIN command with its latest variations. You will learn how to understand its output and how to rewrite queries or change table structure to achieve better performance.
Optimizer features in recent releases of other databases – Sergey Petrunya
The document summarizes several recent optimizer features introduced in MySQL 8.0 and PostgreSQL versions:
- MySQL 8.0 introduced an iterator-based executor, hash joins, EXPLAIN ANALYZE, and optimizations for anti-joins, semi-joins, and subqueries.
- PostgreSQL improved query parallelism, added multi-column statistics, parallel index creation, and optimized non-recursive common table expressions.
- Both databases have focused on join algorithms, statistics gathering, and parallel query processing to improve performance. MySQL continues to adopt features from other databases in recent releases.
- Common Table Expressions (CTEs) allow for temporary results to be stored and reused within the same SQL statement, similar to derived tables or views.
- CTEs can be non-recursive or recursive. Non-recursive CTEs are optimized by merging into joins or pushing conditions down, while recursive CTEs compute results through iterative steps until a fixed point is reached.
- The document discusses optimizations for non-recursive CTEs in MariaDB and provides examples of using CTEs for common queries involving things like hierarchical or network data.
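The recursive-CTE idea above (iterating until a fixed point) can be sketched with Python's bundled sqlite3 module; the `emp` table and its data are hypothetical, used only to walk a small hierarchy:

```python
import sqlite3

# In-memory database with a small hypothetical org hierarchy.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (id INTEGER, name TEXT, manager_id INTEGER)")
con.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    (1, "ceo", None),
    (2, "cto", 1),
    (3, "dev", 2),
])

# Recursive CTE: start from the root, then repeatedly join the CTE's
# previous results back to the table until no new rows are produced.
rows = con.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM emp WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM emp e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()
print(rows)  # [('ceo', 0), ('cto', 1), ('dev', 2)]
```

The same shape handles network data: replace `manager_id` with an edge table and the iteration becomes a reachability query.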
MySQL is a relational database management system. It provides tools for managing data, including creating, querying, updating and deleting data in databases. Some key features include:
- Creating, altering and dropping databases, tables, indexes, users and more.
- Inserting, selecting, updating and deleting data with SQL statements.
- Backup and restore capabilities using mysqldump to backup entire databases or tables.
- Security features including user accounts and privileges to control access.
- Performance optimization using indexes, partitioning, query tuning and more.
- Data types for different kinds of data like numbers, dates, text, JSON and more.
This document introduces windowing functions in Firebird 3. It explains that windowing functions allow calculations across sets of rows and provide access to values from related rows in the same table. It describes the syntax of window functions in Firebird and how they use windows to define partitions of rows and sort orders. Examples show how aggregate functions can be used as window functions to calculate moving and cumulative aggregates over window partitions.
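The cumulative-aggregate pattern described for Firebird is standard SQL and can be sketched portably with sqlite3 (window functions require SQLite 3.25+, which recent Python builds bundle); the `sales` table is hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("east", 10), ("east", 20), ("west", 5), ("west", 15)])

# SUM() used as a window function: a running total per partition,
# without collapsing rows the way GROUP BY would.
rows = con.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY amount) AS running
    FROM sales
    ORDER BY region, amount
""").fetchall()
print(rows)  # [('east', 10, 10), ('east', 20, 30), ('west', 5, 5), ('west', 15, 20)]
```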
This document provides a summary of MySQL indexes and how to use the EXPLAIN statement to analyze query performance. It defines what indexes are, the different types of indexes like B-tree, hash, and full-text indexes. It also explains concepts like compound indexes, covering indexes, and partial indexes. The document demonstrates how to use the EXPLAIN statement to view and understand a query execution plan, including analyzing the possible and actual indexes used, join types, number of rows examined, and index usage. It provides examples of interpreting EXPLAIN output and analyzing performance bottlenecks.
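The before/after effect of an index on an execution plan can be demonstrated with sqlite3's `EXPLAIN QUERY PLAN` as a portable stand-in for MySQL's EXPLAIN (table and index names here are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(i, "x") for i in range(100)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # the detail column describes each plan step.
    return [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT b FROM t WHERE a = 42"
before = plan(query)  # without an index: a full table scan
con.execute("CREATE INDEX idx_t_a ON t(a)")
after = plan(query)   # with the index: an indexed search on idx_t_a
print(before, after)
```

The "rows examined" intuition from the document maps directly onto the difference: the scan touches every row, the indexed search only the matches.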
Using histograms to provide better query performance in MariaDB. Histograms capture the distribution of values in columns to help the query optimizer select better execution plans. The optimizer needs statistics on data distributions to estimate query costs accurately. Histograms are not enabled by default but can be collected using ANALYZE TABLE with the PERSISTENT option. Making histograms available improves the performance of queries that have selective filters or ordering on non-indexed columns.
What SQL functionality was added in the past year or so. The presentation covers default expressions, functional key parts, lateral derived tables, CHECK constraints, JSON and spatial improvements. Also some other small SQL and other improvements.
Oracle Advanced SQL and Analytic Functions – Zohar Elkayam
Even though DBAs and developers write SQL queries every day, advanced SQL techniques such as multi-dimensional aggregation and analytic functions still remain relatively unknown. In this session, we will explore some common real-world uses for analytic functions and understand how to take advantage of this great and useful tool. We will deep-dive into ranking based on values and groups, understand aggregation over multiple dimensions without a GROUP BY, see how to do inter-row calculations, and much more.
These are the slides presented at Kscope17 on June 28, 2017.
EXPLAIN ANALYZE is a new query profiling tool first released in MySQL 8.0.18. This presentation covers how this new feature works, both on the surface and on the inside, and how you can use it to better understand your queries, to improve them and make them go faster.
This presentation is for everyone who has ever had to understand why a query is executed slower than anticipated, and for everyone who wants to learn more about query plans and query execution in MySQL.
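Under the hood, EXPLAIN ANALYZE works by instrumenting each plan iterator with row counters and timers. A hypothetical Python sketch of that instrumentation idea (the class and plan shape are illustrative, not MySQL's actual code):

```python
import time

class ProfiledIterator:
    """Wraps a plan node's row stream and records the actual row count
    and elapsed time -- the instrumentation idea behind EXPLAIN ANALYZE."""
    def __init__(self, name, source):
        self.name, self.source = name, source
        self.rows, self.elapsed = 0, 0.0

    def __iter__(self):
        start = time.perf_counter()
        for row in self.source:
            self.rows += 1
            yield row
        self.elapsed = time.perf_counter() - start

# A toy two-node plan: a table scan feeding a filter.
scan = ProfiledIterator("scan", iter(range(1000)))
flt = ProfiledIterator("filter", (r for r in scan if r % 10 == 0))
result = list(flt)

for node in (flt, scan):
    print(f"{node.name}: actual rows={node.rows}")
```

Comparing each node's actual row count against the optimizer's estimate is exactly how EXPLAIN ANALYZE output reveals where a plan went wrong.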
There is a maxim: 'If you can do it in SQL, use SQL.' But sometimes even Oracle's very powerful flavor of SQL is not enough, and you need more, such as loops, conditions, etc.
Since Oracle 9i, Oracle has offered the ability to build PL/SQL code and include it in the FROM clause of your query. How? By formatting the output of a PL/SQL function so that it looks like a table, i.e. with records of values (rows with columns). This puts all the power of both PL/SQL and SQL at your disposal within your SQL statement.
This approach offers yet another advantage: the code in the PL/SQL function is executed only once, not for every row (function in the WHERE clause) or for every row in the result (function in the SELECT).
If you can do it in SQL, use SQL
Firebird 3 includes many new SQL features such as the full MERGE statement syntax from SQL 2008, window functions, regular expression support in SUBSTRING, a native BOOLEAN datatype, and improvements to cursor stability for data modification queries. Procedural SQL is also enhanced with the ability to create SQL functions, subroutines, external functions/procedures/triggers, and exception handling improvements. Additional new features include DDL changes, enhanced security options, and monitoring capabilities.
Using Optimizer Hints to Improve MySQL Query Performance – oysteing
The document discusses using optimizer hints in MySQL to improve query performance. It covers index hints to influence which indexes the optimizer uses, join order hints to control join order, and subquery hints. New optimizer hints introduced in MySQL 5.7 and 8.0 are also presented, including hints for join strategies, materialized intermediate results, and query block naming. Examples are provided to illustrate how hints can be used and their behavior.
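MySQL's `/*+ ... */` hint syntax needs a running server to demonstrate; as a portable stand-in for the index-hint idea, SQLite's `INDEXED BY` clause forces the planner onto a specific index (table and index names here are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(i, i) for i in range(100)])
con.execute("CREATE INDEX idx_a ON t(a)")
con.execute("CREATE INDEX idx_a_b ON t(a, b)")

def plan(sql):
    return [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql)]

# Force the composite index regardless of what the planner would pick,
# analogous in spirit to a MySQL index hint like USE INDEX / FORCE INDEX.
hinted = plan("SELECT b FROM t INDEXED BY idx_a_b WHERE a = 7")
print(hinted)
```

As with MySQL hints, the trade-off is the same: the hint wins today, but it also pins the plan if the data distribution later changes.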
The optimizer trace provides a detailed log of the actions taken by the query optimizer. It traces the major stages of query optimization including join preparation, join optimization, and join execution. During join optimization, it records steps like condition processing, determining table dependencies, estimating rows for plans, considering different execution plans, and choosing the best join order. The trace helps understand why certain query plans are chosen and catch differences in plans that may occur due to factors like database version changes.
The MySQL Query Optimizer Explained Through Optimizer Trace – oysteing
The document discusses the MySQL query optimizer. It begins by explaining how the optimizer works, including analyzing statistics, determining optimal join orders and access methods. It then describes how the optimizer trace can provide insight into why a particular execution plan was selected. The remainder of the document provides details on the various phases the optimizer goes through, including logical transformations, cost-based optimizations like range analysis and join order selection.
SQL Macros - Game Changing Feature for SQL Developers? – Andrej Pashchenko
SQL Macros are functions that return SQL statements as text. When called in a SQL statement, the returned SQL text is parsed and optimized rather than executing the function at runtime. This avoids context switches to PL/SQL and allows the optimizer to see the full SQL. Table SQL macros can be called in the FROM clause and act like views or inline queries, except they allow parameters to make the views polymorphic. Scalar parameters in the returned SQL text are substituted like bind variables to make the macros more reusable and flexible.
Exploring Advanced SQL Techniques Using Analytic Functions – Zohar Elkayam
Session I presented at BGOUG in June 2016.
Even though DBAs and developers write SQL queries every day, advanced SQL techniques such as multi-dimensional aggregation and analytic functions still remain relatively unknown. In this session, we will explore some common real-world uses for analytic functions and understand how to take advantage of this great and useful tool. We will deep-dive into ranking based on values and groups, understand aggregation over multiple dimensions without a GROUP BY, see how to do inter-row calculations, and much more.
Together we will see how we can unleash the power of analytics using Oracle 11g best practices and Oracle 12c new features.
Introduction to MySQL Query Tuning for Dev[Op]s – Sveta Smirnova
To get data, we query the database. MySQL does its best to return requested bytes as fast as possible. However, it needs human help to identify what is important and should be accessed in the first place.
Smartly written queries can significantly outperform automatically generated ones. Indexes and optimizer statistics, not limited to histograms, can greatly increase query speed.
In this session, I will demonstrate by example how MySQL query performance can be improved. I will focus on techniques accessible to Developers and DevOps rather than those usually used by Database Administrators. At the end, I will present troubleshooting tools that will help you identify why your queries do not perform well, so you can apply the knowledge from the beginning of the session to improve them.
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013 – Jaime Crespo
Tutorial delivered at Percona MySQL Conference Live London 2013.
It doesn't matter what new SSD technologies appear or what the latest breakthroughs in flushing algorithms are: the number one cause of slow MySQL applications is a poor execution plan for SQL queries. While the latest GA version provided a huge number of transparent optimizations (especially for JOINs and subqueries), it is still the developer's responsibility to take advantage of all the new MySQL 5.6 features.
In this tutorial we will present attendees with a sample PHP application with bad response time. Through practical examples, we will suggest step-by-step strategies to improve its performance, including:
* Checking MySQL & InnoDB configuration
* Internal (performance_schema) and external tools for profiling (pt-query-digest)
* New EXPLAIN tools
* Simple and multiple column indexing
* Covering index technique
* Index condition pushdown
* Batch key access
* Subquery optimization
This document discusses cube, rollup and materialized views in Oracle databases. It provides an overview of how cube and rollup extend the GROUP BY clause to automatically calculate subtotals and totals. It also discusses how materialized views can store the results of a query to improve performance for frequent or complex queries. The document includes examples demonstrating how to use cube, rollup, and materialized views.
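What ROLLUP computes can be made concrete by emulating `GROUP BY ROLLUP(region)` with a UNION ALL, sketched here in sqlite3 (which lacks the ROLLUP syntax) over a hypothetical `sales` table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("east", 10), ("east", 20), ("west", 5)])

# ROLLUP(region) produces the per-region subtotals plus one extra
# grand-total row, conventionally marked by region = NULL.
rows = con.execute("""
    SELECT region, SUM(amount) FROM sales GROUP BY region
    UNION ALL
    SELECT NULL, SUM(amount) FROM sales
""").fetchall()
print(rows)
```

CUBE generalizes this further: it adds subtotal rows for every combination of the grouping columns, not just the trailing ones.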
The document summarizes new features in the query optimizer in MariaDB 10.4, including:
1) An optimizer trace that provides insight into the query planning process.
2) Using sampling for histogram collection during ANALYZE TABLE to improve performance.
3) Rowid filtering that pushes qualifying conditions into joins to filter out non-matching rows earlier.
4) Updated default settings that make better use of statistics and condition selectivity.
The document provides an overview of the MySQL query optimizer. It discusses how the optimizer performs logical transformations, cost-based optimizations, analyzes access methods, and optimizes join orders. The goal of the optimizer is to produce a query execution plan that uses the least resources. It considers factors like I/O and CPU costs to select optimal table access methods, join orders, and other optimizations to minimize the cost of executing the query.
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE – Norvald Ryeng
This presentation focuses on two of the new features in MySQL 8.0.18: hash joins and EXPLAIN ANALYZE. It covers how these features work, both on the surface and on the inside, and how you can use them to improve your queries and make them go faster.
Both features are the result of major refactoring of how the MySQL executor works. In addition to explaining and demonstrating the features themselves, the presentation looks at how the investment in a new iterator based executor prepares MySQL for a future with faster queries, greater plan flexibility and even more SQL features.
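The hash join added in 8.0.18 follows the classic build/probe scheme, which can be sketched in a few lines of Python (the customer/order data is hypothetical; real implementations also spill to disk when the build side exceeds memory):

```python
def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Classic in-memory hash join: hash the smaller (build) input on its
    join key, then stream the probe input and emit matching pairs."""
    table = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            yield match, row

customers = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [{"cust": 1, "total": 9}, {"cust": 1, "total": 5},
          {"cust": 3, "total": 7}]
joined = list(hash_join(customers, orders, "id", "cust"))
print(joined)
```

Unlike the nested-loop strategy it replaces for equi-joins without a usable index, each input is read only once, which is where the speedup comes from.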
The document discusses how the Oracle optimizer can sometimes choose suboptimal execution plans, leading to performance deterioration. It presents a scenario where the same query runs much slower when bind variables are used. The document then shows how SQL profiles can be used to enforce a better execution plan. It argues that manually creating profiles is not ideal for 24/7 environments. The document proposes using machine learning for outlier detection to identify performance issues and then automatically generate SQL profiles to address the issues. Code examples are provided for outlier detection and generating profiles through the Oracle API to allow automating the process.
The document discusses a study comparing Oracle's SQL optimizer with query execution on Hive. It aims to understand how the SQL optimizer works in Oracle by generating query plans with EXPLAIN and comparing performance against the same queries executed on Hive. Various query types, including single relations, joins, aggregates, and subqueries, are run on both Oracle and Hive, and their plans and performance are analyzed to understand how each system optimizes and executes queries efficiently.
Presto is an open-source distributed SQL query engine for interactive analytics. It uses a connector architecture to query data across different data sources and formats in the same query. Presto's query planning and execution involves scanning data sources, optimizing query plans, distributing queries across workers, and aggregating results. Understanding Presto's query plans helps optimize queries and troubleshoot performance issues.
This document provides an overview of DbFit, which is a tool for database unit testing. It discusses what DbFit is, its key features for testing databases, how it works by connecting Fit tables to database fixtures, and examples of different DbFit fixtures like Query, Insert, Update, Execute, and ExecuteProcedure. Installation steps are also covered along with using a Wiki to write test scripts.
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai (Databricks)
Catalyst is becoming one of the most important components of Apache Spark, as it underpins all the major new APIs in Spark 2.0 and later versions, from DataFrames and Datasets to Streaming. At its core, Catalyst is a general library for manipulating trees.
In this talk, Yin explores a modular compiler frontend for Spark based on this library that includes a query analyzer, optimizer, and an execution planner. Yin offers a deep dive into Spark SQL’s Catalyst optimizer, introducing the core concepts of Catalyst and demonstrating how developers can extend it. You’ll leave with a deeper understanding of how Spark analyzes, optimizes, and plans a user’s query.
The document discusses benchmarking tools for databases and introduces a new benchmarking tool called Firehose. It describes some issues with existing tools like Mongoimport and YCSB in that they do not adequately model real-world workloads. Firehose is presented as a new multi-threaded benchmarking tool that aims to have high relevance to real applications. It measures performance metrics like operation durations and supports features like configurable load levels and integration with monitoring systems.
This is a presentation from Oracle Week 2016 (Israel), an updated version of last year's talk with new 12cR2 features and a demo.
In the agenda:
Aggregative and advanced grouping options
Analytic functions, ranking and pagination
Hierarchical and recursive queries
Regular Expressions
Oracle 12c new rows pattern matching
XML and JSON handling with SQL
Oracle 12c (12.1 + 12.2) new features
SQL Developer Command Line tool
Understand the Query Plan to Optimize Performance with EXPLAIN and EXPLAIN AN... (EDB)
What do you do, when you have to deal with poor database and query performance in PostgreSQL and there is no one around to help? Let us introduce you to important commands in PostgreSQL - EXPLAIN and EXPLAIN ANALYZE. Knowing how to use these 'tools' will help you identify query performance bottlenecks and opportunities. It provides a query plan detailing what approach the planner took to execute the statement provided.
Attend this webinar to learn:
- What are EXPLAIN and EXPLAIN ANALYZE in PostgreSQL?
- How do they help?
- Know planner tuning parameters
Paper presented by Arno Brinkman https://www.abvisie.nl/firebird/
Global topics
- Introduction
- At which point does the optimizer do its work
- Optimizer steps
- Index
TechDays 2016 - Developing websites using asp.net core mvc6 and entity framew... (Fons Sonnemans)
This session uses demos and sample code to show the new capabilities of ASP.NET MVC6 and Entity Framework Core 1.0: how to generate the model classes, controllers, and views; what you can and should adjust afterwards; the options for validation and rendering; how to apply the new TagHelpers cleverly and build your own; and how to use async controllers and views to improve scalability and, in some cases, performance.
LINQ (Language Integrated Query) is Microsoft's technology that allows querying of data from various sources using a common language syntax. It provides a uniform programming model to query and manipulate data regardless of data source. LINQ queries can be executed against in-memory collections, databases, XML documents, and other data sources. The document discusses various LINQ concepts such as LINQ query expressions, deferred execution, implicit typing with var, extension methods of the Enumerable class, and LINQ to Objects for querying arrays and generic/non-generic collections.
This document provides an overview of the MySQL query optimizer. It discusses the main phases of the optimizer including logical transformations, cost-based optimizations, analyzing access methods, join ordering, and plan refinements. Logical transformations prepare the query for cost-based optimization by simplifying conditions. Cost-based optimizations select the optimal join order and access methods to minimize resources used. Access methods analyzed include table scans, index scans, and ref access. The join optimizer searches for the best join order. Plan refinements include sort avoidance and index condition pushdown.
Slides from the talk "RxSwift to Combine" presented at the Let'Swift 2019 conference. The code referenced in the slides is linked below.
[BringMyOwnBeer]
• RxSwift/RxCocoa: https://github.com/fimuxd/BringMyOwnBeer-
• Combine/SwiftUI: https://github.com/fimuxd/BringMyOwnBeer-Combine
Oracle Week 2015 presentation (Presented on November 15, 2015)
Agenda:
Aggregative and advanced grouping options
Analytic functions, ranking and pagination
Hierarchical and recursive queries
Oracle 12c new rows pattern matching feature
XML and JSON handling with SQL
Regular Expressions
SQLcl – a new replacement tool for SQL*Plus from Oracle
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
This document discusses top hurdles for Flux beginners and provides solutions. It covers 10 hurdles: 1) Overlooking UI tools for writing Flux, 2) Misunderstanding annotated CSVs, 3) Data layout design leading to high cardinality, 4) Storing wrong data in InfluxDB, 5) Confusion about projecting multiple aggregations, 6) Unfamiliarity with common Flux packages and functions, 7) Unawareness of performance optimizations, 8) Improper use of schema mutations, 9) Knowing when to write custom tasks, and 10) Not understanding when and why to downsample data. For each hurdle, it provides explanations and code examples to illustrate solutions.
Apache Spark in your likeness - low and high level customization (Bartosz Konieczny)
User-defined features and session extensions in Apache Spark allow for high-level and low-level customization. High-level customization includes user-defined types (UDT), user-defined functions (UDF), and user-defined aggregate functions (UDAF). Low-level customization involves extensions to the analyzer rules, optimizations, physical execution, and more. The document provides examples of UDT, UDF, and UDAF and discusses how they allow incorporating custom logic into Spark applications similar to stored procedures in databases.
Optimizing the Catalyst Optimizer for Complex Plans (Databricks)
For more than 6 years, Workday has been building various analytics products powered by Apache Spark. At the core of each product offering, customers use our UI to create data prep pipelines, which are then compiled to DataFrames and executed by Spark under the hood. As we built out our products, however, we started to notice places where vanilla Spark is not suitable for our workloads. For example, because our Spark plans are programmatically generated, they tend to be very complex, and often result in tens of thousands of operators. Another common issue is having case statements with thousands of branches, or worse, nested expressions containing such case statements.
With the right combination of these traits, the final DataFrame can easily take Catalyst hours to compile and optimize – that is, if it doesn’t first cause the driver JVM to run out of memory.
In this talk, we discuss how we addressed some of our pain points regarding complex pipelines. Topics covered include memory-efficient plan logging, using common subexpression elimination to remove redundant subplans, rewriting Spark’s constraint propagation mechanism to avoid exponential growth of filter constraints, as well as other performance enhancements made to Catalyst rules.
We then apply these changes to several production pipelines, showcasing the reduction of time spent in Catalyst, and list out ideas for further improvements. Finally, we share tips on how you too can better handle complex Spark plans.
- The new Odoo 8 API makes the model API more object-oriented and Pythonic by introducing recordsets, which replace browse records and browse lists. Recordsets can be treated as both collections of records and individual records.
- Methods are decorated to support both the old and new APIs. Decorators like @api.multi and @api.one control whether methods operate on single records or multiple records.
- The environment object encapsulates the database cursor, user ID, and context to make them implicit rather than explicit parameters of methods.
- Fields are implemented as Python descriptors to provide getter and setter behavior. Computed fields must assign values using a compute method.
SQL Performance Solutions: Refactor Mercilessly, Index Wisely (Enkitec)
The document discusses techniques for refactoring SQL queries to improve performance, including rewriting queries to filter data earlier, correcting improper outer joins, avoiding duplicate predicates and tables, and breaking up OR clauses. It also explains how to test SQL performance by naming queries, collecting statistics, and reviewing execution plans and monitoring reports. The speaker will cover common situations that call for refactoring SQL and how to transform queries using techniques like view merging, filter push-down, and join factorization.
This document provides information about new features and improvements in MySQL 8.0. It discusses enhancements to JSON functionality including new functions and indexing support. It also summarizes added functionality for GIS, UUIDs, common table expressions, window functions, and other query optimizations. The document notes that MySQL 8.0 uses utf8mb4 as the default character set for improved Unicode support and performance.
Similar to Nested subqueries and subquery chaining in openCypher (20)
The document discusses learning timed automata using Cypher queries. It provides background on automaton learning, including representing systems as state machines and learning their behavior through observation and experimentation. The goal is to develop a hypothesis model that matches the internal operations of the system under learning. The document outlines the basic automaton learning process and describes an algorithm that repeats splitting inconsistent states, merging similar states, and coloring states to finalize the learned automaton. It also discusses some interesting queries that could be used as part of the learning process in Cypher, such as selecting longest paths and handling inconsistencies that can be transitive when merging states.
This document summarizes Hannes Voigt's presentation on graph abstraction. It discusses matching patterns to bind variables, and using those variables to construct new graph elements through production patterns. It provides examples of simple graph construction and aggregation through grouping. The goal is to introduce an intuitive way to abstract and construct new subgraphs through pattern matching and variable bindings.
This document discusses Cypher for Gremlin, which provides the ability to run Cypher queries against Gremlin databases. It covers the Gremlin and Cypher query languages, examples of translating Cypher to Gremlin, and the Java APIs for client-side and server-side translation of Cypher to Gremlin.
Academic research on graph processing: connecting recent findings to industri... (openCypher)
The document discusses graph processing and querying in the context of industrial technologies and academic research. It provides an overview of different types of graph queries, including OLTP, analytics, OLAP, local queries and global queries. It also describes benchmarks for evaluating graph databases, including the LDBC Social Network Benchmark and Graphalytics Benchmark. The document discusses techniques for efficiently evaluating graph queries, including using local search to match patterns in a graph.
2. Subqueries are self-contained queries running within the scope of
an outer query
<outer-query>
{
<inner-query>
}
<next-outer-query> (continued)
2
What?
3. 1. Read-only nested subqueries
2. Read/Write updating subqueries
3. Set operations
4. Chained subqueries
Not in this talk: Subqueries in expressions (scalar, lists, existential)
3
What?
4. CIP2017-04-20 - Query combinators for set operations
https://github.com/opencypher/openCypher/pull/227
CIP2016-06-22 - Nested, updating, and chained subqueries
https://github.com/opencypher/openCypher/pull/100
+ CIPs for subqueries in expressions not covered by this talk
4
Relevant CIPs
6. Subqueries enable:
• Composition of query pipelines
• Programmatic query composition
• Post-processing of results
• Multiple write actions for each record
6
Why?
8. <inner-query>
Any complete, read-only query (ending in "... RETURN") enclosed within { }
May be uncorrelated or correlated
(<inner-query> may use variables from the outer query)
Read-only subqueries may be nested at an arbitrary depth
8
Preliminaries
9. Regular subqueries
MATCH { <inner-query> }
Optional subqueries
OPTIONAL MATCH { <inner-query> }
Mandatory subqueries
MANDATORY MATCH { <inner-query> }
9
Variants
10. <inner-query> is evaluated for each record from <outer-query>
Variables var_outer from <outer-query> are visible to <inner-query>
Variables var_inner are introduced and returned by <inner-query>
(as output records)
Variables from var_outer and var_inner are available to <next-outer-query>
(as result records)
10
Semantics
Semantics
11. <inner-query> may omit variables from var_outer
But omitted variables are re-added to the final result records
<inner-query> may shadow and thus alter any variable in var_outer
But the CIP recommends a warning if <inner-query>:
• Discards a variable v_i in var_outer
• Re-introduces v_i
11
Semantics
12. result records available to the <next-outer-query>:
All output records returned by the <inner-query>,
amended with the missing variables from var_outer
12
Semantics: Regular MATCH
13. result records available to the <next-outer-query>:
All output records returned by the <inner-query>,
amended with the missing variables from var_outer
Generate an error if no output records are returned by <inner-query>
13
Semantics: Mandatory MATCH
14. result records available to the <next-outer-query>:
{
All output records returned by the <inner-query>
(if there is at least one of these), or a single record with the same
fields as the output records, where every v_i in var_inner is set to null
}
and amended with the missing variables from var_outer
14
Semantics: Optional MATCH
15. MATCH {
MATCH ...
RETURN *
UNION
MATCH ...
RETURN *
}
WITH *
WHERE ...
RETURN *
15
Example: Post-UNION processing
16. MATCH (u:User {id: $userId})
MATCH (f:Farm {id: $farmId})-[:IS_IN]->(country:Country)
MATCH {
MATCH (u)-[:LIKES]->(b:Brand)-[:PRODUCES]->(p:Lawnmower)
RETURN b.name AS name
UNION
MATCH (u)-[:LIKES]->(b:Brand)-[:PRODUCES]->(v:Vehicle)
WHERE v.leftHandDrive = country.leftHandDrive
RETURN b.name AS name
}
RETURN f, name
16
Example: Correlated subquery
18. <updating-query>
Can be any updating query, and must not end with RETURN
(No data is returned)
18
Preliminaries
19. <updating-query>
May be uncorrelated or correlated
(<updating-query> may use variables from the outer query)
Updating subqueries may be nested at an arbitrary depth
Read-only nested subqueries may not contain updating subqueries
FOREACH becomes obsolete and could be removed from Cypher
19
Preliminaries
20. Simple updating subqueries
DO { <updating-query> }
Conditionally-updating subqueries
DO
[WHEN <predicate> THEN { <updating-query> }+
[ELSE { <updating-query> }]
END
20
Variants
21. <updating-query> is run for each incoming record
Executing DO does not affect the cardinality
Input records are passed on to <next-outer-query>
Conditional DO:
This happens whether or not the incoming record is eligible
for processing by <updating-query>
21
Semantics
22. A query can end with a DO subquery in the same way as an
updating query currently can end with any update clause
All variables var_inner introduced by <updating-query> are suppressed by DO
DO can be nested within another DO
22
Semantics
23. DO
[WHEN <predicate> THEN { <updating-query> }+
[ELSE { <updating-query> }]
END
Conditions in WHEN evaluated in order
<updating-query> evaluated for the first true condition, falling
through to ELSE if no true conditions found
Otherwise, no updates will occur
23
Semantics: conditional variant
24. Old version:
MATCH (r:Root)
FOREACH(x IN range(1, 10) |
MERGE (c:Child {id: x})
MERGE (r)-[:PARENT]->(c)
)
24
Example: FOREACH vs DO
New version:
MATCH (r:Root)
UNWIND range(1, 10) AS x
DO {
MERGE (c:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
25. MATCH (r:Root)
UNWIND range(1, 10) AS x
DO WHEN x % 2 = 1 THEN {
MERGE (c:Odd:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
ELSE {
MERGE (c:Even:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
END
25
Example: Conditional DO
29. • Rule: <left-hand-query> and <right-hand-query>
return same variables (fields) in same order
• UNION set union
• INTERSECT set intersection
• EXCEPT set difference
• EXCLUSIVE UNION symmetric set difference
Set operation semantics
29
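As a hedged sketch of the proposed combinators, reusing the Brand schema from the earlier correlated-subquery example, INTERSECT keeps only the rows produced by both sides:

```cypher
MATCH (b:Brand)-[:PRODUCES]->(:Lawnmower)
RETURN b.name AS name
INTERSECT
MATCH (b:Brand)-[:PRODUCES]->(:Vehicle)
RETURN b.name AS name
// names of brands that produce both lawnmowers and vehicles,
// with duplicates removed (set semantics)
```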
32. Multiset operations
(counts for a record occurring n times in the left input and k times in the right input)
• UNION ALL n + k
• INTERSECT ALL min(n, k)
• EXCEPT ALL max(0, n - k)
• UNION MAX max(n, k)
• EXCLUSIVE UNION MAX max(n, k) - min(n, k)
33. • Rule: <left-hand-query> and <right-hand-query>
return same variables (fields) in same order
• UNION ALL multiset union
• INTERSECT ALL multiset intersection
• EXCEPT ALL multiset difference (0-bounded)
Multiset operation semantics
33
34. • Rule: <left-hand-query> and <right-hand-query>
return same variables (fields) in same order
• UNION MAX max-bounded multiset union
(largest number of duplicates from either input query)
• EXCLUSIVE UNION MAX exclusive union
(excess duplicates from either input query over the other)
Multiset operation semantics
34
35. All set operations are left-associative
Q1 UNION Q2 INTERSECT Q3
is equivalent to
((Q1 UNION Q2) INTERSECT Q3)
35
Using multiple set operations
36. • OTHERWISE [ALL] computes the logical choice between
<left-hand-query> and <right-hand-query>
• Becomes <left-hand-query> if non-empty,
<right-hand-query> otherwise
• OTHERWISE does not preserve duplicates
• OTHERWISE ALL preserves duplicates
• Rule: <left-hand-query> and <right-hand-query>
return same variables (fields) in same order
OTHERWISE [ALL]
36
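An illustrative sketch under the proposed syntax (the preferred flag is invented): return a user's preferred brands, falling back to all liked brands when none are marked preferred:

```cypher
MATCH (u:User)-[l:LIKES]->(b:Brand)
WHERE l.preferred = true
RETURN b.name AS name
OTHERWISE
MATCH (u:User)-[:LIKES]->(b:Brand)
RETURN b.name AS name
// left-hand result if non-empty, otherwise the right-hand result
```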
37. • CROSS computes the cross product (cartesian product) between
<left-hand-query> and <right-hand-query>
• Rule: <left-hand-query> and <right-hand-query>
must return different variables
CROSS
37
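A minimal sketch of CROSS under the proposed syntax (the Size and Color labels are invented); every record from the left query is combined with every record from the right:

```cypher
MATCH (s:Size)
RETURN s.label AS size
CROSS
MATCH (c:Color)
RETURN c.label AS color
// one output record per (size, color) combination
```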
39. Chain arbitrary queries:
<query1> composedWith <query2> composedWith <query3> ...
Returned variables from <queryN> are passed on to <queryN+1>
39
What?
41. MATCH {
MATCH {
MATCH {
...
// death by curly
...
}
}
}
41
What won't work
● Requires deep nesting
● Would require putting updating queries
inside read-only queries
(instead of after them)
42. MATCH ...RETURN ...
UNION
MATCH ...RETURN ...
WITH
...
MATCH ...RETURN ...
42
Our proposal
Recast WITH after RETURN as a query combinator
(similar to UNION)
● Linear flow of subqueries
● Post-union processing
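Under this proposal, the post-UNION processing pattern from slide 15 can be written as a linear chain (reusing the invented Brand schema from the earlier example):

```cypher
MATCH (b:Brand)-[:PRODUCES]->(:Lawnmower)
RETURN b.name AS name
UNION
MATCH (b:Brand)-[:PRODUCES]->(:Vehicle)
RETURN b.name AS name
WITH name
WHERE name STARTS WITH 'A'
RETURN count(*) AS numberOfBrands
```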
46. WITH ... <query>
Declares expected input variables
Useful for programmatic composition: result.cypher("WITH a")
THEN ... <query>
Drops incoming records
(Reduces cardinality to single empty record)
46
WITH and THEN at start of query
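A sketch of THEN under the proposed semantics (labels invented): because THEN drops the incoming records, the second query runs exactly once, regardless of how many records the first produced:

```cypher
MATCH (p:Person)
RETURN count(p) AS people
THEN
MATCH (c:City)
RETURN count(c) AS cities
// the people count is discarded; only cities reaches the final result
```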
47. Like set operations, chained subqueries are left-associative!
<query1> WITH ... <query2> THEN <query3> ...
is equivalent to
(((<query1> WITH ... <query2>) THEN <query3>) ...)
47
Complex chained queries
48. <queryN> cannot contain set operations or query combinators
But it can contain nested subqueries again!
MATCH { ... } RETURN ...
UNION
MATCH { ... } RETURN ...
WITH ...
RETURN ...
48
Complex chained queries
Multiple Graphs adds the ability to return graphs in addition to variables
Set operations are defined over sets of records
Set operations may also be defined for graphs
=> This extends to Cypher with support for multiple graphs
Chaining is about passing variables to the follow-up query
=> This extends to Cypher with support for multiple graphs
50
Extension to multiple graphs
51. Adding subqueries to Cypher will massively increase its expressivity
Chained subqueries provide a powerful composition mechanism
Subqueries extend naturally to Cypher with support for multiple graphs
51
Summary