This presentation focuses on MySQL query optimization from a developer's perspective. Developers should care about application performance, and that includes optimizing SQL queries. It shows the execution plan in MySQL and explains its different formats: tabular, TREE, and JSON/visual explain plans. Optimizer features like optimizer hints and histograms are covered, as well as newer features from the latest releases such as hash joins, the TREE explain format, and EXPLAIN ANALYZE. Real examples of slow queries are included and their optimization is explained.
How to Analyze and Tune MySQL Queries for Better Performance, by oysteing
The document discusses techniques for optimizing MySQL queries for better performance. It covers topics like cost-based query optimization in MySQL, selecting optimal data access methods like indexes, the join optimizer, subquery optimizations, and tools for monitoring and analyzing queries. The presentation agenda includes introductions to index selection, join optimization, subquery optimizations, ordering and aggregation, and influencing the optimizer. Examples are provided to illustrate index selection, ref access analysis, and the range optimizer.
This document discusses various strategies for optimizing MySQL queries and indexes, including:
- Using the slow query log and EXPLAIN statement to analyze slow queries.
- Avoiding correlated subqueries and issues in older MySQL versions.
- Choosing indexes based on selectivity and covering common queries.
- Identifying and addressing full table scans and duplicate indexes.
- Understanding the different join types and selecting optimal indexes.
- The document discusses advanced techniques for optimizing MySQL queries, including topics like temporary tables, file sorting, order optimizations, and calculated fields.
- It provides examples of using indexes and index optimizations, explaining concepts like index types, index usage, key lengths, and covering indexes.
- One example shows how to optimize a query involving a calculated year() expression by rewriting the query to use a range on the date field instead.
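The YEAR() rewrite mentioned in the last bullet can be sketched with a small, self-contained example. This is a hedged illustration using SQLite (with strftime() standing in for MySQL's YEAR()); the table and column names are invented, but the sargability principle is the same in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT)")
conn.execute("CREATE INDEX idx_order_date ON orders (order_date)")
conn.executemany(
    "INSERT INTO orders (order_date) VALUES (?)",
    [("2022-11-30",), ("2023-03-15",), ("2023-12-31",), ("2024-01-01",)],
)

# Non-sargable: a function call on the column prevents index use.
slow = "SELECT id FROM orders WHERE strftime('%Y', order_date) = '2023'"
# Sargable rewrite: an explicit range on the raw column can use the index.
fast = ("SELECT id FROM orders "
        "WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the plan description in column 3.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

print(plan(slow))  # full table SCAN
print(plan(fast))  # index SEARCH on idx_order_date
```

Both queries return the same rows, but only the rewritten one is answered with an index range scan.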
The MySQL Query Optimizer Explained Through Optimizer Trace, by oysteing
The document discusses the MySQL query optimizer. It begins by explaining how the optimizer works, including analyzing statistics, determining optimal join orders and access methods. It then describes how the optimizer trace can provide insight into why a particular execution plan was selected. The remainder of the document provides details on the various phases the optimizer goes through, including logical transformations, cost-based optimizations like range analysis and join order selection.
This document discusses techniques for efficient pagination over large datasets using MySQL. The typical solution of using LIMIT and OFFSET can degrade performance as the offset increases. The document proposes using additional criteria like a "last seen" value combined with ordering to retrieve the next page without an offset. This allows fetching the next results set using an index scan. Testing showed a 6x improvement in query throughput over using large offsets. Additional enhancements like secondary indexes and caching are discussed to further optimize pagination.
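The "last seen" technique described above can be sketched as follows. This is a minimal illustration with SQLite and an invented table; the same keyset idea applies unchanged to MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)",
                 [(f"post {i}",) for i in range(1, 101)])

PAGE = 10

# Offset pagination: the server still reads and discards the skipped rows.
page3_offset = conn.execute(
    "SELECT id, title FROM posts ORDER BY id LIMIT ? OFFSET ?",
    (PAGE, 2 * PAGE)).fetchall()

# Keyset ("last seen") pagination: seek directly past the last row of page 2,
# which the index on id answers without scanning the skipped rows.
last_seen_id = 20
page3_keyset = conn.execute(
    "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
    (last_seen_id, PAGE)).fetchall()

print(page3_keyset[0], page3_keyset[-1])
```

Both approaches return the identical page; the keyset version just starts at the right place in the index instead of counting past the offset.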
How to Take Advantage of Optimizer Improvements in MySQL 8.0, by Norvald Ryeng
MySQL 8.0 introduces several improvements to the query optimizer that may give improved performance for your queries. This presentation looks at what kind of queries the different improvements apply to, and the focus is on what you can do to get the most out of the optimizer improvements. The main topics are changes to the optimizer cost model, histograms, and new optimizer hints, but other improvements to how MySQL executes queries are also covered. The presentation includes many practical examples of how you can get a significant speedup for your MySQL queries.
The document discusses the Performance Schema in MySQL. It provides an overview of what the Performance Schema is and how it can be used to monitor events within a MySQL server. It also describes how to configure the Performance Schema by setting up actors, objects, instruments, consumers and threads to control what is monitored. Finally, it explains how to initialize the Performance Schema by truncating existing summary tables before collecting new performance data.
MySQL Indexing: Improving Query Performance Using Index (Covering Index), by Hemant Kumar Singh
The document discusses improving query performance in databases using indexes. It explains what indexes are and the different types of indexes including column, composite, and covering indexes. It provides examples of how to create indexes on single and multiple columns and how the order of columns matters. The document also discusses factors that affect database performance and guidelines for index usage and size optimization.
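As a minimal sketch of the covering-index idea (SQLite shown here with an invented schema; MySQL behaves analogously and reports "Using index" in its EXPLAIN output when a query is covered):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE users (
    id INTEGER PRIMARY KEY, country TEXT, email TEXT, bio TEXT)""")
conn.executemany(
    "INSERT INTO users (country, email, bio) VALUES (?, ?, ?)",
    [("NO", "a@example.com", "..."), ("SE", "b@example.com", "..."),
     ("NO", "c@example.com", "...")])

# A composite index on (country, email) contains every column this query
# touches, so the table itself never has to be read.
conn.execute("CREATE INDEX idx_country_email ON users (country, email)")
query = "SELECT email FROM users WHERE country = 'NO'"

detail = " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + query))
print(detail)  # ... USING COVERING INDEX idx_country_email ...
```

Note that column order matters: the filtered column (country) leads the index, and the selected column (email) rides along to make it covering.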
Query Optimization with MySQL 5.7 and MariaDB 10: Even Newer Tricks, by Jaime Crespo
Tutorial delivered at Percona Live London 2014, where we explore new features and techniques for faster queries with MySQL 5.6 and 5.7 and MariaDB 10, including the newest options in MySQL 5.7.5 and MariaDB 10.1.
Download here the virtual machine with the example database: http://dbahire.com/pluk14
Update: WordPress has a workaround for STRICT mode: https://core.trac.wordpress.org/ticket/26847
Using Optimizer Hints to Improve MySQL Query Performance, by oysteing
The document discusses using optimizer hints in MySQL to improve query performance. It covers index hints to influence which indexes the optimizer uses, join order hints to control join order, and subquery hints. New optimizer hints introduced in MySQL 5.7 and 8.0 are also presented, including hints for join strategies, materialized intermediate results, and query block naming. Examples are provided to illustrate how hints can be used and their behavior.
This document provides tips for tuning a MySQL database to optimize performance. It discusses why tuning is important for cost effectiveness, performance, and competitive advantage. It outlines who should be involved in tuning, including application designers, developers, DBAs, and system administrators. The document covers what can be tuned, such as applications, database structures, and hardware, and provides best practices for when and how much to tune a database. Specific tuning techniques are discussed for various areas including application development, database design, server configuration, and storage engine optimizations.
The document provides guidance on understanding and optimizing database performance. It emphasizes the importance of properly designing schemas, normalizing data, using appropriate data types, and creating useful indexes. Explain plans should be used to test queries and identify optimization opportunities like adding missing indexes. Overall, the document encourages developers to view the database as a collaborative "friend" rather than an enemy, by understanding its capabilities and limitations.
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013, by Jaime Crespo
Tutorial delivered at Percona MySQL Conference Live London 2013.
It doesn't matter what new SSD technologies appear or what the latest breakthroughs in flushing algorithms are: the number one cause of slow MySQL applications is a poor execution plan for SQL queries. While the latest GA version provides a huge amount of transparent optimizations, especially for JOINs and subqueries, it is still the developer's responsibility to take advantage of all the new MySQL 5.6 features.
In this tutorial we will present attendees with a sample PHP application with bad response time. Through practical examples, we will suggest step-by-step strategies to improve its performance, including:
* Checking MySQL & InnoDB configuration
* Internal (performance_schema) and external tools for profiling (pt-query-digest)
* New EXPLAIN tools
* Simple and multiple column indexing
* Covering index technique
* Index condition pushdown
* Batch key access
* Subquery optimization
Performance Schema is a powerful diagnostic instrument for:
- Query performance
- Complicated locking issues
- Memory leaks
- Resource usage
- Problematic behavior, caused by inappropriate settings
- More
It comes with hundreds of options that allow precise tuning of what to instrument. More than 100 consumers store the collected data.
In this tutorial, we will try out all the important instruments. We will provide a test environment and a few typical problems that could hardly be solved without Performance Schema. You will not only learn how to collect and use this information but also gain hands-on experience with it.
Tutorial at Percona Live Austin 2019
How to Manage Scale-Out Environments with MariaDB MaxScale, by MariaDB plc
MaxScale is a database proxy that provides load balancing, connection pooling, and replication capabilities for MariaDB and MySQL databases. It can be used to scale databases horizontally across multiple servers for increased performance and availability. The document provides an overview of MaxScale concepts and capabilities such as routing, filtering, and security features, and how it can be used for operational tasks like query caching, logging, and data streaming. It also includes instructions on setting up MaxScale with a basic example of configuring read/write splitting between master and slave database servers.
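A read/write-splitting setup like the one described can be sketched in MaxScale's INI-style configuration. This is a hedged sketch following the MaxScale 2.x tutorial layout; the server addresses, section names, and credentials are placeholders, not values from the document:

```ini
[server1]
type=server
address=192.0.2.10
port=3306
protocol=MariaDBBackend

[server2]
type=server
address=192.0.2.11
port=3306
protocol=MariaDBBackend

[Read-Write-Service]
type=service
router=readwritesplit
servers=server1,server2
user=maxscale_user
password=maxscale_pw

[Read-Write-Listener]
type=listener
service=Read-Write-Service
protocol=MariaDBClient
port=4006
```

Clients connect to port 4006; the readwritesplit router sends writes to the master and distributes reads across the slaves.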
This one is about advanced indexing in PostgreSQL. It guides you through basic concepts as well as advanced techniques to speed up the database.
All important PostgreSQL index types are explained: btree, gin, gist, sp-gist, and hash.
Regular expression indexes and LIKE queries are also covered.
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE, by Norvald Ryeng
This presentation focuses on two of the new features in MySQL 8.0.18: hash joins and EXPLAIN ANALYZE. It covers how these features work, both on the surface and on the inside, and how you can use them to improve your queries and make them go faster.
Both features are the result of major refactoring of how the MySQL executor works. In addition to explaining and demonstrating the features themselves, the presentation looks at how the investment in a new iterator based executor prepares MySQL for a future with faster queries, greater plan flexibility and even more SQL features.
Adrian Hardy's slides from PHPNW08
Once you have your query returning the correct results, speed becomes an important factor. Speed can either be an issue from the outset, or can creep in as your dataset grows. Understanding the EXPLAIN command is essential to helping you solve and even anticipate slow queries.
Associated video: http://blip.tv/file/1791781
[Pgday.Seoul 2019] Distributed Databases Using Citus, by PgDay.Seoul
This document summarizes how to set up and use Citus, an open-source PostgreSQL-based distributed database. It explains how to install Citus, add worker nodes, create distributed tables, and use features like reference tables to perform distributed queries across the cluster.
MySQL Administrator
Basic course
- MySQL overview
- MySQL installation / configuration
- MySQL architecture - MySQL storage engines
- MySQL administration
- MySQL backup / recovery
- MySQL monitoring
Advanced course
- MySQL optimization
- MariaDB / Percona
- MySQL HA (High Availability)
- MySQL troubleshooting
NeoClova
http://neoclova.co.kr/
The document is an introduction to the MySQL 8.0 optimizer guide. It includes a safe harbor statement noting that the guide outlines Oracle's general product direction but not commitments. The agenda lists 25 topics to be covered related to query optimization, diagnostic commands, examples from the "World Schema" sample database, and a companion website with more details.
MySQL Indexing - Best Practices for MySQL 5.6, by MYXPLAIN
This document provides an overview of MySQL indexing best practices. It discusses the types of indexes in MySQL, how indexes work, and how to optimize queries through proper index selection and configuration. The presentation emphasizes understanding how MySQL utilizes indexes to speed up queries through techniques like lookups, sorting, avoiding full table scans, and join optimizations. It also covers new capabilities in MySQL 5.6 like index condition pushdown that provide more flexible index usage.
Window functions enable calculations across partitions of rows in a result set. This document discusses window function syntax, types of window functions available in MySQL 8.0 like RANK(), DENSE_RANK(), ROW_NUMBER(), and provides examples of queries using window functions to analyze and summarize data in partitions.
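The window functions named above can be illustrated with a short example. The sample data is invented; SQLite syntax is used for the demo, and this particular query (including the named WINDOW clause) is valid in MySQL 8.0 as well:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 100), ("north", 300), ("north", 300),
                  ("south", 50), ("south", 200)])

# RANK leaves gaps after ties, DENSE_RANK does not, ROW_NUMBER always
# numbers rows 1..n within each partition.
rows = conn.execute("""
    SELECT region, amount,
           RANK()       OVER w AS rnk,
           DENSE_RANK() OVER w AS dense_rnk,
           ROW_NUMBER() OVER w AS rn
    FROM sales
    WINDOW w AS (PARTITION BY region ORDER BY amount DESC)
    ORDER BY region, rn
""").fetchall()
for r in rows:
    print(r)
```

Note how the two tied 300s in "north" share RANK 1 and DENSE_RANK 1, while the 100 gets RANK 3 but DENSE_RANK 2.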
Abstract: A histogram represents the frequency distribution of data, and it can help the optimizer choose a better execution plan. Both MariaDB and MySQL support histograms; though they serve a common purpose, they are implemented differently, and each has its own control knobs. Our consultants (Monu Mahto and Madhavan) explain how histograms can be used in production MySQL and MariaDB databases and how they help bring down execution time.
This document provides information about partitioning in MySQL 5.1 presented by Sarah Sproehnle and Giuseppe Maxia. It discusses partitioning types (range, hash, list, key), partitioning expressions, partitioning pruning, benchmarking partitions, partitioning with different storage engines, partitioning by dates, and optimizing queries on partitioned tables.
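Partition pruning, one of the topics listed, can be sketched in plain Python. This is a toy model, not MySQL's implementation: each RANGE partition is defined by a VALUES LESS THAN bound, and the optimizer only touches partitions whose interval can overlap the WHERE clause:

```python
# Toy model of RANGE partition pruning: each partition holds rows with
# key < upper_bound, like PARTITION BY RANGE (YEAR(d)) VALUES LESS THAN (...).
partitions = {"p2021": 2022, "p2022": 2023, "p2023": 2024, "pmax": float("inf")}

def prune(lo, hi):
    """Return the partitions that may contain keys in [lo, hi]."""
    chosen, prev_bound = [], float("-inf")
    for name, upper in partitions.items():
        # This partition covers [prev_bound, upper); keep it only if that
        # interval overlaps the query's [lo, hi] range.
        if lo < upper and hi >= prev_bound:
            chosen.append(name)
        prev_bound = upper
    return chosen

print(prune(2022, 2022))  # only p2022 needs to be scanned
print(prune(2021, 2023))  # p2021, p2022, p2023
```

A query like `WHERE YEAR(d) = 2022` thus reads one partition instead of the whole table, which is the benchmarked win the slides discuss.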
My talk for the "MySQL, MariaDB and Friends" devroom at FOSDEM on February 2, 2019.
Born in 2010 in MySQL 5.5.3 as "a feature for monitoring server execution at a low level" and grown in the 5.6 era with performance fixes and DBA-facing features, Performance Schema is a mature tool in MySQL 5.7, used by humans and by more and more monitoring products. It has become more popular over the years. In this talk I will give an overview of Performance Schema, focusing on its tuning, performance, and usability.
Performance Schema helps to troubleshoot query performance, complicated locking issues, memory leaks, resource usage, problematic behavior caused by inappropriate settings, and much more. It comes with hundreds of options that allow precise tuning of what to instrument. More than 100 consumers store the collected data.
Performance Schema is a potent tool, and a very complicated one at the same time. It does not affect performance in most cases, but it can slow down the server dramatically if configured without care. It collects a lot of data, and sometimes this data is hard to read.
This talk will start with an introduction to how Performance Schema is designed, so you will understand why it slows down the server in some cases and does not affect your queries in others. Then we will discuss which information you can retrieve from Performance Schema and how to do it effectively.
I will cover its companion sys schema and graphical monitoring tools.
The webinar will review a multi-layered framework for PostgreSQL security, with a deeper focus on limiting access to the database and data, as well as securing the data.
Using the popular AAA (Authentication, Authorization, Auditing) framework we will cover:
- Best practices for authentication (trust, certificate, MD5, SCRAM, etc.).
- Advanced approaches, such as password profiles.
- A deep dive into authorization and data access control for roles, database objects (tables, etc.), view usage, row-level security, and data redaction.
- Auditing, encryption, and SQL injection attack prevention.
Note: this session is delivered in German
Speaker:
Borys Neselovskyi, Sales Engineer, EDB
Modern Query Optimisation Features in MySQL 8, by Mydbops
The slides cover MySQL 8 (a huge leap forward): its indexing capabilities, execution plan enhancements, optimizer improvements, and many other current query tuning features.
Just about anyone can write a basic SQL query for a table. Not everyone can write a good query, though; that takes practice and knowing how to understand what the optimizer is doing with the query. Learn the basics of query optimization so you keep your application engaging the user rather than showing a progress bar while they wait on the database.
Applied Partitioning And Scaling Your Database System PresentationRichard Crowley
This document discusses applying partitioning to MySQL OLTP applications. It begins by reviewing classic uses of partitioning in data warehousing applications. It then explains how partitioning can benefit OLTP workloads by reducing seek and scan set sizes and limiting transaction durations. The document provides an example of applying hash partitioning to tables in a retail store application based on store ID. It demonstrates how to expand and merge partitions as stores are added or closed. Performance tests on the partitioned tables show improvements in query times compared to non-partitioned tables. In summary, the document argues that partitioning provides a way to scale MySQL databases for OLTP workloads while reducing maintenance costs.
In this first of a series of presentations, we'll overview the differences between SQL and PL/SQL, and the first steps in optimization, as understanding RULE vs. COST, and how to slash 90% response time in data extractions running in SQL*Plus.
The document discusses querying data in a distributed database system using SELECT statements. It explains the basic syntax of SELECT, including how to specify the columns, tables, and conditions to retrieve the desired data. It also covers additional clauses like ORDER BY, LIMIT, and DISTINCT that allow sorting, pagination, and removing duplicates from result sets. Aggregate functions like COUNT, MIN, MAX, and GROUP_CONCAT are described for summarizing and combining column values.
The document discusses MySQL data manipulation commands. It provides examples of using SELECT statements to retrieve data from tables based on specified criteria, INSERT statements to add new data to tables, UPDATE statements to modify existing data in tables, and the basic syntax for these commands. It also reviews naming conventions and some best practices for working with tables in MySQL.
The document discusses various Python libraries used for data science tasks. It describes NumPy for numerical computing, SciPy for algorithms, Pandas for data structures and analysis, Scikit-Learn for machine learning, Matplotlib for visualization, and Seaborn which builds on Matplotlib. It also provides examples of loading data frames in Pandas, exploring and manipulating data, grouping and aggregating data, filtering, sorting, and handling missing values.
Introduction to MySQL Query Tuning for Dev[Op]sSveta Smirnova
To get data, we query the database. MySQL does its best to return requested bytes as fast as possible. However, it needs human help to identify what is important and should be accessed in the first place.
Queries, written smartly, can significantly outperform automatically generated ones. Indexes and Optimizer statistics, not limited to the Histograms only, help to increase the speed of the query a lot.
In this session, I will demonstrate by examples of how MySQL query performance can be improved. I will focus on techniques, accessible by Developers and DevOps rather on those which are usually used by Database Administrators. In the end, I will present troubleshooting tools which will help you to identify why your queries do not perform. Then you could use the knowledge from the beginning of the session to improve them.
Query Optimization with MySQL 5.6: Old and New TricksMYXPLAIN
The document discusses query optimization techniques for MySQL 5.6, including both established techniques and new features in 5.6. It provides an overview of tools for profiling queries such as EXPLAIN, the slow query log, and the performance schema. It also covers indexing strategies like compound indexes and index condition pushdown.
New optimizer features in MariaDB releases before 10.12Sergey Petrunya
The document discusses new optimizer features in recent and upcoming MariaDB releases. MariaDB 10.8 introduced JSON histograms and support for reverse-ordered indexes. JSON produced by the optimizer is now valid and processible. MariaDB 10.9 added SHOW EXPLAIN FORMAT=JSON and SHOW ANALYZE can return partial results. MariaDB 10.10 enabled table elimination for derived tables and improved optimization of many-table joins. Future releases will optimize queries on stored procedures and show optimizer timing in EXPLAIN FORMAT=JSON.
The document provides 15 pro-tips for MySQL users, beginning with basic tips like using the correct MySQL version and understanding EXPLAIN plans, and progressing to more advanced tips involving backups, queries, indexing, and database design. Key recommendations include knowing important configuration settings, backing up at the table level, avoiding SELECT * queries, using triggers for consistency, and separating volatile and non-volatile data.
MySQL 5.7 Tutorial Dutch PHP Conference 2015Dave Stokes
MySQL 5.7 is the latest version of the MySQL database. It includes new features such as support for JSON as a native data type with functions for manipulating JSON documents. Security has also been improved with secure defaults, password rotation/expiration controls, and SSL encryption enabled by default for the C client library. The release candidate for 5.7 was released in April 2015 and includes patches, contributions, and enhancements over previous versions.
MySQL 5.7. Tutorial - Dutch PHP Conference 2015Dave Stokes
MySQL 5.7 is the latest version of the MySQL database. It includes new features such as support for JSON as a native data type with functions for manipulating JSON documents. Security has also been improved with secure defaults, password rotation/expiration controls, and SSL encryption enabled by default for the C client library. The release candidate for 5.7 was released in April 2015 and includes patches, contributions, and enhancements over previous versions.
Data Science, Statistical Analysis and R... Learn what those mean, how they can help you find answers to your questions and complement the existing toolsets and processes you are currently using to make sense of data. We will explore R and the RStudio development environment, installing and using R packages, basic and essential data structures and data types, plotting graphics, manipulating data frames and how to connect R and SQL Server.
This document discusses MySQL indexing best practices. It explains that indexes can improve query performance but also waste space and time if unnecessary. The right balance must be found to achieve fast queries using an optimal index set. It provides examples of how to create indexes and how they can be used to optimize WHERE, JOIN, ORDER BY, and other clauses. It also discusses avoiding unnecessary or redundant indexes.
What SQL functionality was added in the past year or so. The presentation covers default expressions, functional key parts, lateral derived tables, CHECK constraints, JSON and spatial improvements. Also some other small SQL and other improvements.
MariaDB Server 10.3 - Temporale Daten und neues zur DB-KompatibilitätMariaDB plc
MariaDB Server 10.3 (RC) introduces enhancements for temporal data support, database compatibility, performance, flexibility, and scalability. Key features include system versioned tables for querying historical data, PL/SQL compatibility for stored functions, sequences, intersect and except operators, and user-defined aggregate functions. The Spider storage engine is also updated.
MariaDB Server 10.3 provides enhancements for temporal data support, database compatibility, and performance. Key features include:
- System versioned tables to store and query historical data at different points in time.
- Improved Oracle compatibility with features like PL/SQL parsing, packages for stored functions, sequences, and additional data types.
- Performance improvements such as adding instant columns for InnoDB and statement-based lock wait timeouts.
- Other new features include user-defined aggregate functions, compressed columns, and proxy protocol support.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
2. Who am I?
• Software Development Manager & Team Leader @ Codix
• Using MySQL since 3.23.x
• Building MySQL Server and related products for Slackware Linux for more than 14 years (check SlackPack)
• Free and open source software enthusiast
• Formula 1 and Le Mans 24 fan
• gdsotirov @
3. Agenda
• Query optimization
• What is an explain plan?
• How to explain queries in MySQL
• Understanding tabular explain plans
• Optimization features (indexes, hints, histograms)
• New optimization features (TREE, hash join, EXPLAIN ANALYZE)
• Using visual explain plans
4. Query optimization
• SQL queries express requests to the database
• The database parses, optimizes and executes the query
• The optimizer should choose the most efficient way to execute it:
• using all available information about tables, columns, indexes, etc.
• evaluating alternative plans
• For us (database practitioners), optimizing (or tuning) queries means to
• ensure the use of efficient access paths;
• ensure the use of an optimal join order;
• rewrite the query to help the optimizer choose a better plan; and
• even reorganize (normalize) the schema.
• The optimizer provides information about how it intends to execute a
query through the explain plan
For automatic query optimization see EverSQL
5. What is an explain plan?
• The explain plan (or execution plan) shows the steps the MySQL
optimizer would take to execute a query
• Includes information about:
• access paths;
• used indexes and partitions;
• joins between tables;
• order of operations; and
• extra details.
• It helps you understand whether indexes are missing or unused, whether
joins are done in the optimal order and, generally, why queries are slow
6. Why should we as developers care?
• We (should) know the schema (or should we?)
• We know how the application(s) query data (i.e. the access paths),
as we are familiar with the implemented functionalities
• We know what data is stored in tables as it comes from the
application(s)
• We should care about the performance of the application(s), which may
depend on the performance of SQL queries
• We do not need to depend on others (e.g. DBAs) if we can
diagnose and fix a slow query ourselves
7. How to explain queries - EXPLAIN syntax
• The general syntax is:
{EXPLAIN | DESCRIBE | DESC}
[FORMAT = {TRADITIONAL | JSON | TREE}]
{statement | FOR CONNECTION con_id}
• The statement can be SELECT, DELETE, INSERT, REPLACE or UPDATE
(before 5.6.3, only SELECT)
• TREE is a new format, available since 8.0.16 GA (2019-04-25)
• Can explain the currently executing query for a connection (since 5.7.2)
• Requires the SELECT privilege for tables and views, plus the SHOW VIEW
privilege for views
• DESCRIBE is a synonym for EXPLAIN, but is mostly used for getting the
table structure
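As a sketch of the syntax above (the queries are illustrative; `emp` is the table used in the examples later in this deck):

```sql
-- Explain the same query in each format
EXPLAIN SELECT * FROM emp WHERE job = 'CLERK';
EXPLAIN FORMAT=JSON SELECT * FROM emp WHERE job = 'CLERK';
EXPLAIN FORMAT=TREE SELECT * FROM emp WHERE job = 'CLERK';  -- 8.0.16+

-- Explain the statement currently running in connection 42 (5.7.2+)
EXPLAIN FOR CONNECTION 42;

-- DESCRIBE without a statement shows the table structure instead
DESC emp;
```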
10. Traditional explain plan Example
EXPLAIN
SELECT E.ename, E.job, D.dname
FROM dept D,
emp E
WHERE E.deptno = D.deptno
AND E.job = 'CLERK';
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
| id | select_type | table || type | possible_keys | key || ref | rows | filtered | Extra |
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
| 1 | SIMPLE | D || ALL | PRIMARY | NULL || NULL | 4 | 100 | NULL |
| 1 | SIMPLE | E || ref | fk_deptno | fk_deptno || dept_emp.D.deptno | 250003 | 10 | Using where |
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
2 rows in set, 1 warning (0.0006 sec)
Note (code 1003): /* select#1 */ select `dept_emp`.`e`.`ename` AS `ename`,`dept_emp`.`e`.`job` AS
`job`,`dept_emp`.`d`.`dname` AS `dname` from `dept_emp`.`dept` `d` join `dept_emp`.`emp` `e` where
((`dept_emp`.`e`.`deptno` = `dept_emp`.`d`.`deptno`) and (`dept_emp`.`e`.`job` = 'CLERK'))
rows to join from E = 250 003 * 10% = 25 000.3 (per department)
11. Understanding tabular explain plans 1/5
• id: the sequential number of the SELECT in
the query
• select_type: possible values include:
+----+-------------+
| id | select_type |
+----+-------------+
| 1 | SIMPLE |
| 1 | SIMPLE |
+----+-------------+
Value Meaning
SIMPLE no unions or subqueries
PRIMARY outermost SELECT
[DEPENDENT] UNION [dependent on outer query] second or later SELECT in a union
UNION RESULT result of a union
[DEPENDENT] SUBQUERY [dependent on outer query] first SELECT in subquery
[DEPENDENT] DERIVED [dependent on another table] derived table
MATERIALIZED materialized subquery
UNCACHEABLE [SUBQUERY|UNION] a subquery/union that must be re-evaluated for each row of the outer query
12. Understanding tabular explain plans 2/5
• table: the name of a table, union, subquery or derived table. Could be:
• table alias (or name)
• <unionM,N> - union between rows with ids M and N;
• <subqueryN> - result of materialized subquery from row N;
• <derivedN> - result of derived table from row N.
• partitions: NULL or the names of matched partitions
• type: join (access) type
+-------+------------+------+
| table | partitions | type |
+-------+------------+------+
| D | NULL | ALL |
| E | NULL | ref |
+-------+------------+------+
13. Join (access) types
(ordered from best at the top to worst at the bottom)
Value Meaning
system for tables with just one row
const for tables matching at most one row (by constant value)
eq_ref for 1:1 relations (primary keys or UNIQUE NOT NULL indexes)
ref for 1:N relations (non-unique indexes)
ref_or_null like ref, but searches also NULL values
fulltext join using full text index
index_merge using index merge optimization (merge multiple ranges)
unique_subquery for some IN subqueries returning primary key values
index_subquery same as previous, but for non-unique indexes
range for comparison operators (e.g. >, <, >=, <=), BETWEEN, IN, LIKE
index same as ALL, but only the index tree is scanned
ALL all rows (i.e. full table scan)
14. Understanding tabular explain plans 3/5
• possible_keys: NULL or names of indexes that could be used
• key: actually used index (may be other than listed in possible_keys)
• key_len: the length of the used key parts in bytes (e.g. at least 4 bytes
for INT, 5 if it is nullable; summed for composite indexes)
• ref: columns or constants used to compare to indexes
• rows: estimated number of rows to be retrieved by the chosen
access path
+---------------+-----------+---------+-------------------+--------+
| possible_keys | key | key_len | ref | rows |
+---------------+-----------+---------+-------------------+--------+
| PRIMARY | NULL | NULL | NULL | 4 |
| fk_deptno | fk_deptno | 5 | dept_emp.D.deptno | 250003 |
+---------------+-----------+---------+-------------------+--------+
15. Understanding tabular explain plans 4/5
• filtered: estimated percentage of filtered rows (before
5.7.3 EXPLAIN EXTENDED was needed)
• rows x filtered gives the estimated number of rows to be joined
• filter estimates are based on range estimates (e.g. for dates), index
statistics or hardcoded selectivity factors (see Selinger):
+----------+
| filtered |
+----------+
| 100 |
| 10 |
+----------+
Predicate Selectivity Filtered
Equality (=) 0.1 10.00 %
Comparison (>, <, >=, <=) 1/3 ≈ 0.33 33.33 %
BETWEEN (also LIKE) 1/9 ≈ 0.11 (it’s ¼ by Selinger!) 11.11 %
(pred1) AND (pred2) SEL(pred1) * SEL(pred2)
(pred1) OR (pred2) SEL(pred1) + SEL(pred2) - SEL(pred1) * SEL(pred2)
NOT (pred) 1 – SEL(pred)
16. Understanding tabular explain plans 5/5
• Extra information:
• NULL; or
• Plan isn't ready yet – when explaining query in a named connection;
• Recursive – indicates recursive CTE;
• Rematerialize – for dependent lateral derived tables;
• Using filesort – extra pass for sorting (i.e. when no index could be used);
• Using index – only index tree is scanned;
• Using index condition – WHERE conditions pushed down to storage engine;
• Using join buffer (Block Nested Loop) – BNL algorithm;
• Using join buffer (Batched Key Access) – BKA algorithm;
• Using temporary – when temporary table is used (e.g. GROUP/ORDER BY);
• Using where – selected rows are restricted by a WHERE clause;
• many more.
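Some of these Extra values can be provoked with simple queries (a sketch against the `emp` table from the examples; the exact plans depend on version, indexes and statistics):

```sql
EXPLAIN SELECT * FROM emp WHERE sal > 1000;          -- Using where
EXPLAIN SELECT * FROM emp ORDER BY sal;              -- Using filesort, if sal is not indexed
EXPLAIN SELECT deptno FROM emp;                      -- Using index, if deptno is indexed
EXPLAIN SELECT job, COUNT(*) FROM emp GROUP BY job;  -- may show Using temporary
```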
17. Indexes
• Indexes improve efficiency of queries by providing faster access to data
• Proper indexing reduces query response time and improves scalability
• As developers we need to index, because we know the access paths of the
application(s)
• MySQL supports B-Tree, hash, full text and spatial indexes (R-Tree)
• B-Tree indexes are ordered, ascending by default (descending indexes
are supported since MySQL 8.0.1 DMR)
• Indexes can be based on multiple columns (composite), functional (since
8.0.13 GA) or multi-valued (since 8.0.17 GA)
• Indexes can also be invisible (since 8.0.0 DMR)
• Use ANALYZE TABLE regularly to update index cardinality statistics
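The index flavors above can be sketched as DDL on the `emp` table (the index names, and the `hiredate` column, are assumptions for illustration):

```sql
CREATE INDEX idx_job      ON emp (job);               -- plain B-Tree index
CREATE INDEX idx_dept_job ON emp (deptno, job DESC);  -- composite, with a descending part (8.0.1+)
CREATE INDEX idx_hireyear ON emp ((YEAR(hiredate)));  -- functional index (8.0.13+)
ALTER TABLE emp ALTER INDEX idx_job INVISIBLE;        -- hide the index from the optimizer (8.0.0+)
ANALYZE TABLE emp;                                    -- refresh cardinality statistics
```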
18. Optimizer hints
• Fine control over optimizer execution plans of individual statements
• Statement level hints first appeared in MySQL 5.7.7 RC (2015-04-08)
• Extended with new hints in 8.0 for join order, resource group, etc.
• Syntax is similar to Oracle – e.g. /*+ JOIN_ORDER(...) */
• The system variable optimizer_switch can also be used, but at
global or session level
• The necessary “evil” for when the optimizer cannot itself choose the best
execution plan
• Several scope levels: query block, table, join order, index, subquery and
global (e.g. MAX_EXECUTION_TIME, RESOURCE_GROUP, SET_VAR)
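The different scope levels can be sketched like this (the hint names are real; the queries are made up):

```sql
-- Global-scope hints: cap runtime and change a variable for this statement only
SELECT /*+ MAX_EXECUTION_TIME(1000)
           SET_VAR(optimizer_switch = 'index_merge=off') */
       E.ename
FROM emp E;

-- The same switch applied for the whole session via the system variable
SET SESSION optimizer_switch = 'index_merge=off';
```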
19. Optimizer hints Example
SELECT
E.ename, E.job, D.dname
FROM dept D,
emp E
WHERE E.deptno = D.deptno
AND E.job = 'PRESIDENT';
SELECT /*+ JOIN_ORDER(E, D) */
E.ename, E.job, D.dname
FROM dept D,
emp E
WHERE E.deptno = D.deptno
AND E.job = 'PRESIDENT';
+----+-------+------+-----------+--------+----------+
| id | table | type | key | rows | filtered |
+----+-------+------+-----------+--------+----------+
| 1 | D | ALL | NULL | 4 | 100 |
| 1 | E | ref | fk_deptno | 250003 | 10 |
+----+-------+------+-----------+--------+----------+
Execution time: ≈ 2.1 sec
+----+-------+--------+---------+---------+----------+
| id | table | type | key | rows | filtered |
+----+-------+--------+---------+---------+----------+
| 1 | E | ALL | NULL | 1000014 | 9.995684 |
| 1 | D | eq_ref | PRIMARY | 1 | 100 |
+----+-------+--------+---------+---------+----------+
Execution time: ≈ 0.3 sec
20. Histograms
• Since MySQL 8.0.3 RC (released 2017-09-21)
• Statistical information about distribution of values in a column
• Values are grouped in buckets (maximum 1024)
• Cumulative frequency is automatically calculated for each bucket
• For large data sets sampling of the data is used, but it needs memory
(see histogram_generation_max_mem_size variable)
• Sampling requires a full table scan, but will be improved for InnoDB
in 8.0.19 (probably to be released in January 2020)
• Sampling is not deterministic!
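Histograms are managed through ANALYZE TABLE (a sketch; the column choices are assumptions based on the `emp` table used in this deck):

```sql
-- Build (or rebuild) histograms on one or more columns
ANALYZE TABLE emp UPDATE HISTOGRAM ON job, sal WITH 32 BUCKETS;

-- Give sampling more memory before building on a large table
SET histogram_generation_max_mem_size = 64 * 1024 * 1024;

-- Remove a histogram that is no longer useful
ANALYZE TABLE emp DROP HISTOGRAM ON sal;
```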
21. Histogram types
Singleton
• Single value per bucket
• Bucket stores value and cumulative
frequency
• Useful for estimation of equality and
range conditions
Equi-height
• Multiple values per bucket
• Bucket stores min and max inclusive
values, cumulative frequency and
number of distinct values
• Frequent values in separate buckets
• Most useful for range conditions
[Charts: a singleton histogram with per-value frequencies for the values 5, 4, 3, 1, 2, and an equi-height histogram with frequencies for the buckets [1,7], 8, [9,12], [13,19], [20,25]]
22. Histograms continued
• Used by optimizer for estimating join cost
• Help the optimizer to make better row estimates
• Useful for columns that are NOT the first column of any index, but are
used in WHERE clauses for joins or IN subqueries
• Best for columns with:
• low cardinality;
• uneven distribution; and
• distribution that does not vary much over time
• Not useful for columns with constantly increasing values (e.g. dates,
counters, etc.)
23. Histogram Example 1/4 – creation and meta
ANALYZE TABLE emp
UPDATE HISTOGRAM ON job
WITH 5 BUCKETS;
{"buckets": [
["base64:type254:QU5BTFlTVA==", 0.44427638528762128],
["base64:type254:Q0xFUks=" , 0.7765828956408041],
["base64:type254:TUFOQUdFUg==", 0.8882609755557034],
["base64:type254:U0FMRVNNQU4=", 1.0]
],
"data-type": "string",
"null-values": 0.0,
"collation-id": 33,
"last-updated": "2019-10-17 07:33:42.007222",
"sampling-rate": 0.10810659461578348,
"histogram-type": "singleton",
"number-of-buckets-specified": 5
}
SELECT JSON_PRETTY(`histogram`)
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS CS
WHERE CS.`schema_name` = 'dept_emp'
AND CS.`table_name` = 'emp'
AND CS.`column_name` = 'job';
25. Histogram Example 3/4 - frequencies
SELECT HG.val, ROUND(HG.freq, 3) cfreq,
ROUND(HG.freq - LAG(HG.freq, 1, 0) OVER (), 3) freq
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS CS,
JSON_TABLE(`histogram`->'$.buckets', '$[*]'
COLUMNS(val VARCHAR(10) PATH '$[0]',
freq DOUBLE PATH '$[1]')) HG
WHERE CS.`schema_name` = 'dept_emp'
AND CS.`table_name` = 'emp'
AND CS.`column_name` = 'job';
+-----------+-------+-------+
| val | cfreq | freq |
+-----------+-------+-------+
| ANALYST | 0.444 | 0.444 |
| CLERK | 0.777 | 0.333 |
| MANAGER | 0.889 | 0.111 |
| PRESIDENT | 0.889 | 0 |
| SALESMAN | 1 | 0.111 |
+-----------+-------+-------+
5 rows in set (0.0009 sec)
26. Histogram Example 4/4 – query plan effect
SELECT E.ename, E.job, D.dname
FROM dept D,
emp E
WHERE E.deptno = D.deptno
AND E.job = 'CLERK';
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
| id | select_type | table || type | possible_keys | key || ref | rows | filtered | Extra |
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
| 1 | SIMPLE | D || ALL | PRIMARY | NULL || NULL | 4 | 100 | NULL |
| 1 | SIMPLE | E || ref | fk_deptno | fk_deptno || dept_emp.D.deptno | 250003 | 33.32733 | Using where |
+----+-------------+-------++------+---------------+-----------++-------------------+--------+----------+-------------+
rows to join from E = 250 003 * 33.32733154296875 % ≈ 83 319 (per department)
AND E.job = 'PRESIDENT';
+----+-------------+-------++--------+---------------+---------++-------------------+---------+----------+-------------+
| id | select_type | table || type | possible_keys | key || ref | rows | filtered | Extra |
+----+-------------+-------++--------+---------------+---------++-------------------+---------+----------+-------------+
| 1 | SIMPLE | E || ALL | fk_deptno | NULL || NULL | 1000014 | 0.000099 | Using where |
| 1 | SIMPLE | D || eq_ref | PRIMARY | PRIMARY || dept_emp.E.deptno | 1 | 100 | NULL |
+----+-------------+-------++--------+---------------+---------++-------------------+---------+----------+-------------+
rows to join from E = 1 000 014 * 0.00009999860048992559 % ≈ 1
27. Histograms vs Indexes
Histograms
• Use less disk space
• Updated only on demand
• For low cardinality columns
• Create needs only backup lock
(permits DML)
Indexes
• Use more disk space
• Updated with each DML
• For any cardinality columns
• Create needs metadata lock
(no DML permitted)
28. TREE explain plan
• Appeared in MySQL 8.0.16 GA (released 2019-04-25) and improved
with 8.0.18 GA (released 2019-10-14)
• Still under development and considered “experimental”
• Displays operations (iterators) nested as a tree
• Helps to better understand the order of execution of operations
• For access path operations includes (since 8.0.18 GA) information
about:
• estimated execution cost
• estimated number of returned rows
See TREE explain format in MySQL 8.0.16
29. TREE explain plan Example 1/2
EXPLAIN FORMAT=TRADITIONAL
SELECT D.dname, LDT.min_sal, LDT.avg_sal, LDT.max_sal
FROM dept D,
LATERAL (SELECT MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
FROM emp E
WHERE E.deptno = D.deptno
) AS LDT;
+----+-------------------+------------++------++-----------++-------------------+------+----------+----------------------------+
| id | select_type | table || type || key || ref | rows | filtered | Extra |
+----+-------------------+------------++------++-----------++-------------------+------+----------+----------------------------+
| 1 | PRIMARY | D || ALL || NULL || NULL | 4| 100 | Rematerialize (<derived2>) |
| 1 | PRIMARY | <derived2> || ALL || NULL || NULL | 2| 100 | NULL |
| 2 | DEPENDENT DERIVED | E || ref || fk_deptno || dept_emp.D.deptno |250003| 100 | NULL |
+----+-------------------+------------++------++-----------++-------------------+------+----------+----------------------------+
3 rows in set, 2 warnings (0.0010 sec)
Note (code 1276): Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1
Note (code 1003): /* select#1 */ select `dept_emp`.`d`.`dname` AS `dname`,`ldt`.`min_sal` AS ...
30. TREE explain plan Example 2/2
EXPLAIN FORMAT=TREE
SELECT D.dname, LDT.min_sal, LDT.avg_sal, LDT.max_sal
FROM dept D,
LATERAL (SELECT MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
FROM emp E
WHERE E.deptno = D.deptno
) AS LDT;
+----------------------------------------------------------------------+
| EXPLAIN |
+----------------------------------------------------------------------+
| -> Nested loop inner join
-> Invalidate materialized tables (row from D) (cost=0.65 rows=4)
-> Table scan on D (cost=0.65 rows=4)
-> Table scan on LDT
-> Materialize (invalidate on row from D)
-> Aggregate: min(e.sal), avg(e.sal), max(e.sal)
-> Index lookup on E using fk_deptno (deptno=d.deptno)
(cost=36095.62 rows=332359) |
+----------------------------------------------------------------------+ 30
31. Hash join optimization
• New in MySQL 8.0.18 GA (released 2019-10-14)
• Before that, nested loop was the only join strategy,
with the Nested-Loop Join (NLJ), Block Nested Loop (BNL)
and Batched Key Access (BKA) algorithms
• Hashes the smaller table and uses it to look up
rows from the other table
• Uses xxHash for extremely fast in-memory hashing
• Ideally the hashing is done entirely in memory,
but it can also spill to disk (less efficient)
• You may need to adjust join buffer size (see
join_buffer_size)
[Diagram: rows from Table 1 and Table 2 are hashed with xxHash64 into the join buffer; matching hash values produce the result]
See WL#2241 31
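The join buffer sizing mentioned above can be checked and adjusted per session. A minimal sketch (`join_buffer_size` is a real MySQL system variable; the 64 MB value is only an illustrative choice):

```sql
-- Check the current join buffer size
SELECT @@join_buffer_size;

-- Give the hash table more room for this session only, so that
-- hashing is more likely to stay entirely in memory
SET SESSION join_buffer_size = 64 * 1024 * 1024;  -- 64 MB, illustrative
```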
32. Hash join continued
• Used automatically for any join with an equality condition
when the join cannot use indexes
• Would also work for Cartesian joins (i.e. joins without a
join condition)
• Unfortunately visible only in the TREE explain format
• New hints for forcing hash join or NL - HASH_JOIN or
NO_HASH_JOIN
• Also on global or session level with hash_join=on|off in
optimizer_switch system variable
32
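The hints and the `optimizer_switch` flag above can be combined with the deck's own emp/job_sal example. A sketch using the hint syntax documented for MySQL 8.0.18:

```sql
-- Force hash join for one query via an optimizer hint
EXPLAIN FORMAT=TREE
SELECT /*+ HASH_JOIN(E, JS) */ E.ename, JS.sal_min, JS.sal_max
FROM emp E, job_sal JS
WHERE E.job = JS.job;

-- Or turn hash join off for the whole session
SET SESSION optimizer_switch = 'hash_join=off';
```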
34. Hash join Example 2/2
EXPLAIN FORMAT=TREE
SELECT E.ename, E.sal, JS.sal_min, JS.sal_max
FROM emp E,
job_sal JS
WHERE E.job = JS.job
AND E.sal NOT BETWEEN JS.sal_min AND JS.sal_max;
+---------------------------------------------------------------------------------------+
| EXPLAIN |
+---------------------------------------------------------------------------------------+
| -> Filter: (E.sal not between JS.sal_min and JS.sal_max) (cost=499211.71 rows=442901)
-> Inner hash join (E.job = JS.job) (cost=499211.71 rows=442901)
-> Table scan on E (cost=1962.39 rows=996514)
-> Hash
-> Table scan on JS (cost=0.75 rows=5) |
+---------------------------------------------------------------------------------------+
1 row in set (0.0011 sec)
34
35. Block Nested Loop (BNL) vs Hash join
Block Nested Loop (BNL)
• Block NL query ran for about
1.30 sec for 1M employees
• For equality and non-equality
joins
• For smaller result sets BNL
should be just fine
Hash join
• Hash join query ran for about
0.9 sec for 1M employees
• For equality joins only
• For large result sets hash join
should be faster
35
36. EXPLAIN ANALYZE
• New in MySQL 8.0.18 GA (released 2019-10-14)
• Actually executes the query and provides timings from the execution
• Useful for comparing optimizer’s estimations to actual execution
• Output is in TREE format only (hopefully just for now – see WL#4168)
• In addition to TREE output provides also information about:
• time to return first row
• time to return all rows
• number of returned rows
• number of loops
• Only for SELECT statements. Cannot be used with FOR CONNECTION
See also EXPLAIN for PostgreSQL 36
37. EXPLAIN ANALYZE Example 1
+--------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------+
| -> Filter: (e.sal not between js.sal_min and js.sal_max) (cost=499211.71 rows=442901)
(actual time=0.098..778.486 rows=915166 loops=1)
-> Inner hash join (e.job = js.job) (cost=499211.71 rows=442901)
(actual time=0.089..568.473 rows=1000014 loops=1)
-> Table scan on E (cost=1962.39 rows=996514)
(actual time=0.025..288.830 rows=1000014 loops=1)
-> Hash
-> Table scan on JS (cost=0.75 rows=5) (actual time=0.041..0.048 rows=5 loops=1)|
+--------------------------------------------------------------------------------------------+
1 row in set (0.8240 sec)
EXPLAIN ANALYZE
SELECT E.ename, E.sal, JS.sal_min, JS.sal_max
FROM emp E,
job_sal JS
WHERE E.job = JS.job
AND E.sal NOT BETWEEN JS.sal_min AND JS.sal_max; Note: Actual times are in milliseconds!
37
38. EXPLAIN ANALYZE Example 2
EXPLAIN ANALYZE
SELECT E.ename, E.job, D.dname
FROM dept D,
emp E
WHERE E.deptno = D.deptno
AND E.job = 'CLERK';
+------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------+
| -> Nested loop inner join (cost=112394.29 rows=333278)
(actual time=5.904..3281.472 rows=333278 loops=1)
-> Table scan on D (cost=1.40 rows=4)
(actual time=4.659..4.666 rows=4 loops=1)
-> Filter: (e.job = 'CLERK') (cost=5180.86 rows=83319)
(actual time=1.016..811.246 rows=83320 loops=4)
-> Index lookup on E using fk_deptno (deptno=d.deptno) (cost=5180.86 rows=250004)
(actual time=1.013..786.799 rows=250004 loops=4) |
+------------------------------------------------------------------------------------------+
1 row in set (3.3051 sec)
first row -> 5.904 ≈ 5.675 = 4.659 + 1.016
all rows -> 3 281.472 ≈ 3 249.65 = 4.666 + 4 * 811.246
Note: Actual times are aggregated between loops!
38
39. Problems with TREE and EXPLAIN ANALYZE
• Cannot explain queries that are not supported by the iterator
executor (shows just “not executable by iterator executor”)
• Will not explain SELECT COUNT(*) FROM table queries (shows
just “Count rows in table”)
• Does not compute select list subqueries (see bug 97296) [FIXED]
• Not integrated with MySQL Workbench (see bug 97282)
• Does not print units on timings (see bug 97492)
39
41. Additional information by JSON/visual explain
• Used columns – list of columns either read or written
• Used key parts – list of used key parts
• Rows produced per join – estimated rows after join
• Cost estimates (since 5.7.2) are split into:
• query cost – total cost of query (or subquery) block
• sort cost (CPU) – cost of the first sorting operation
• read cost (IO) – the cost of reading data from table
• eval cost (CPU) – the cost of condition evaluation
• prefix cost (CPU) – the cost of joining tables
• data read – estimated amount of data processed (rows x
record width)
See WL#6510 and MySQL 5.7.2 Release notes 41
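The JSON fields listed above can be inspected directly. A minimal sketch reusing the deck's emp table; the comment summarizes the cost-related keys from the slide, with the output otherwise abbreviated:

```sql
EXPLAIN FORMAT=JSON
SELECT E.ename, E.sal
FROM emp E
WHERE E.job = 'CLERK';

-- The JSON output contains, among others:
--   "cost_info": { "query_cost": ... } at the query block level,
--   per-table "read_cost", "eval_cost", "prefix_cost", "data_read_per_join",
--   plus "used_columns" and "used_key_parts"
```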
42. Benefits from visual explain plans
• Help to quickly understand where the problem is
• easily spot bad access paths by box color:
• missing index(es);
• wrong or insufficient join conditions
• easily spot where most rows are generated by the thickness of lines:
• bad access path and/or no filter;
• involuntary cartesian joins;
• wrong join order
• Cost calculations are not fully documented, so they are hard to understand
• basically cost values are related to the number of blocks read (IO) and rows
processed (CPU)
• costs reflect the work done by the database
42
43. Visual explain plans Example 1 1/2
MySQL 5.5 ≈ 9.5 sec
Option A: Add WHERE
condition in the subquery
MySQL 5.5/8.0 ≈ 1.2 sec 43
44. Visual explain plans Example 1 2/2
MySQL 5.5 ≈ 0.015 sec
MySQL 8.0 ≈ 0.015 sec
Option B: Use derived
table instead of subquery
in WHERE
Option C: Use MySQL 8 ;-)
44
47. Visual explain plans Example 2 3/3
Option A: Use JOIN_ORDER hint
Option B: Re-create multi-column index on two instead of
three columns to improve selectivity (one of the columns in
the index had low cardinality)
Option C: Option B + remove table msg_progs
Execution time: ≈ 4 sec 47
48. Summary
• Use EXPLAIN to examine and tune query plan
• Use EXPLAIN ANALYZE to profile query execution
• Use different types of indexes and histograms properly
• There are no strict rules to follow for optimizing a query
• So be creative and evaluate different options
48
49. References
• MySQL Reference Manual (and section Optimization in particular)
• Use the index, Luke! and Modern SQL sites by Markus Winand
• MySQL EXPLAIN Explained by Øystein Grøvlen
• Histogram statistics in MySQL & Hash join in MySQL 8 by Erik Frøseth
• MySQL Explain Example by Tomer Shay
• MySQL EXPLAIN ANALYZE by Norvald H. Ryeng
• Using Explain Analyze in MySQL 8 from Percona blog
• My blog posts on MySQL
49
Fourth-generation programming languages (4GLs) provide a higher level of abstraction. They form a subset of domain-specific languages (e.g. for databases, reporting, GUI and web development) and are also declarative languages.
“SQL language is perhaps the most successful fourth-generation programming language (4GL).” Markus Winand
MySQL’s optimizer is cost based (CBO)
DESCRIBE | DESC is provided for compatibility with Oracle
Difference between DEPENDENT and UNCACHEABLE is that in the first case the subquery is re-evaluated only once for each set of different values from the outer context, while in the second case the subquery is re-evaluated for each row of the outer context
Select type could also be DELETE, INSERT or UPDATE for non-SELECT statements.
Always use aliases when joining tables!
Index may be other than listed in possible_keys, because possible_keys lists indexes suitable for access (looking up rows), but a covering index could be used
Using index condition is for Index Condition Pushdown (ICP), not Engine condition pushdown possible only for NDB.
„Searching in a database index is like searching in a printed telephone directory.” Markus Winand
No need to index all columns of a table (i.e. avoid over-indexing). For concatenated indexes, the most important thing is to choose the column order properly.
Before MySQL 5.7.7 optimizer adjustments only possible through optimizer_switch system variable.
In some cases hints are the necessary evil
Default value for histogram_generation_max_mem_size is 20 000 000 (i.e. 19 MB)
Cumulative frequency – it is the sum (or running total) of all the frequencies up to the current point in the data set.
Not deterministic - Sampling considers different data each time.
MySQL automatically chooses histogram type considering number of distinct values and buckets specified
Data variation over time – histogram needs to be updated
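Updating a histogram after the data changes is a single statement. A sketch using the deck's emp table (the column choice is illustrative):

```sql
-- Build (or rebuild) a histogram with 32 buckets on a skewed column
ANALYZE TABLE emp UPDATE HISTOGRAM ON sal WITH 32 BUCKETS;

-- Drop it if the column no longer benefits (e.g. ever-increasing dates)
ANALYZE TABLE emp DROP HISTOGRAM ON sal;
```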
For increasing dates better use index
For string values, a maximum of 42 characters from the beginning are considered
Total rows for CLERK = 83 319 * 4 = 333 276
DML – Data Manipulation Language (e.g. INSERT, UPDATE, DELETE)
Cost and row estimates since MySQL 8.0.18
Join buffer size (join_buffer_size) is 256 KB by default in MySQL 8
Limit the columns selected from the hashed table
Hash joins won't benefit from indexes on the join condition, but will certainly benefit from indexes on columns used in independent conditions (e.g. sale_date > '2019-10-01')
Hash join may not succeed if disk is used and open_files_limit is reached
Pipelined means returning the first results without processing all of the input data.
A profiling tool
Actual time for loops is average over all loops. Total time is time * loops
Times add up, but are not exactly equal
Used columns is useful for identifying candidates for covering indexes. Used key parts is useful for checking composite indexes.