Dives into how MySQL indexes work under the hood, and provides strategies for efficiently indexing your data to reduce query times.
Presented at Western Slope Tech Meetup in Montrose, CO 3/29/17
3. InnoDB table format
• Use InnoDB by default
• InnoDB vs MyISAM pros:
• Row locking, allows concurrent writes
• ACID
• Non-blocking backups
• Better data recovery after a crash
• InnoDB cons:
• Lack of instant row count
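The engine choice above can be made explicit in the table definition. A minimal sketch (the `orders` table and its columns are illustrative, not from the deck):

```sql
-- InnoDB is the default engine in modern MySQL, but stating it avoids surprises
CREATE TABLE orders (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  created_at DATETIME NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Check which engine an existing table uses
SHOW TABLE STATUS LIKE 'orders';
```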
4. What about MyISAM?
• Never use MyISAM for concurrent applications in real time
• Possible use cases for MyISAM:
• No writes
• Batch writes only
• Mostly full-scan selects
6. Table segmentation
• For very big tables or tables that will grow forever, find some criteria to segment data into smaller tables
• For instance:
• one table per month for log recording
• one table per country for users, so you can shard them and put them on different servers closer to the users
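The per-month idea can be sketched like this (table and column names are hypothetical; native MySQL partitioning is an alternative to separate tables):

```sql
-- One log table per month; old months can be archived or dropped cheaply
CREATE TABLE log_2017_03 (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  logged_at DATETIME NOT NULL,
  message TEXT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Next month's table cloned from the same definition
CREATE TABLE log_2017_04 LIKE log_2017_03;

-- Dropping a whole month is near-instant, unlike a DELETE on one huge table
DROP TABLE log_2017_03;
```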
7. Table specialisation
• Don’t use the same table for heterogeneous records that don’t share the same fields; it will increase table size and affect performance when using indexes
• For instance:
• In a database for classified ads for homes, cars and jobs, use one table for each type, because they don’t share fields like number of rooms, engine power or salary.
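The classified-ads example might look like this (schemas are illustrative): each specialised table carries only the fields its type needs, so type-specific columns never bloat the others.

```sql
-- Each ad type gets its own table with only the fields it needs
CREATE TABLE ads_homes (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(200) NOT NULL,
  price DECIMAL(12,2) NOT NULL,
  num_rooms TINYINT UNSIGNED NOT NULL,        -- only meaningful for homes
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE ads_cars (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  title VARCHAR(200) NOT NULL,
  price DECIMAL(12,2) NOT NULL,
  engine_power_hp SMALLINT UNSIGNED NOT NULL, -- only meaningful for cars
  PRIMARY KEY (id)
) ENGINE=InnoDB;
```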
9. Clustered index
[Diagram: rows stored physically in primary-key order in the clustered index; each secondary-key entry holds the secondary-key columns plus the primary-key value that locates the row data]
10. Clustered index
• Always define a primary key; if there’s no
natural PK for the table, define an auto-increment PK
• Records are physically ordered in the table by the PK
• Accessing a row by the PK is the fastest way,
because the row data is on the same page the
index search leads to
• Don’t include fields in the PK that could be modified
after insertion: updating the PK deletes the record and
re-inserts it at the new position, affecting
performance
11. Secondary indexes
• All indexes other than the clustered index are known as secondary
indexes
• Each record in a secondary index contains the primary key columns
for the row, as well as the columns specified for the secondary index
• InnoDB uses this primary key value to search for the row in the
clustered index
• You can take advantage of this design for paging by PK in selects
that use a secondary index
• If the primary key is long, the secondary indexes use more space,
so it is better to have a small auto-increment PK and define a
unique key with the fields that would form the natural PK
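The "small surrogate PK plus unique natural key" pattern can be sketched like this. The deck is about MySQL/InnoDB; sqlite3 is used here only so the sketch runs, and the table name is illustrative.

```python
import sqlite3

# Short auto-increment surrogate PK (the value InnoDB copies into every
# secondary index) plus a UNIQUE constraint enforcing the natural key.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stats (
        i_id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- small PK
        s_query TEXT NOT NULL,
        d_date  TEXT NOT NULL,
        UNIQUE (s_query, d_date)                    -- natural key, enforced separately
    )""")
conn.execute("INSERT INTO stats (s_query, d_date) VALUES ('account executive', '2009-09-28')")
try:
    conn.execute("INSERT INTO stats (s_query, d_date) VALUES ('account executive', '2009-09-28')")
    outcome = "duplicate accepted"
except sqlite3.IntegrityError:
    outcome = "duplicate natural key rejected"
print(outcome)  # duplicate natural key rejected
```

The UNIQUE key still protects data integrity, while every secondary index only has to carry the short integer id.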
15. Pros of indexes
1. Filter: access only the records you need, without
considering more records than necessary. Applies
to SELECT, UPDATE, REPLACE and DELETE
2. Sort/group: avoid using temporary tables
3. Cover: if all fields in a SELECT are included in the
index used (regardless of their order), data is retrieved
directly from the index, saving extra reads
16. Cons of indexes
1. Writes: the more indexes a table has, the slower
writes are going to be
2. Size: each index is going to increase total table
size
17. Rules for using indexes
• Only one index is used per table in a query
• When using “OR” in a WHERE condition, the query works as several different
queries, and each one can use its own index
• Index fields are used in order from left to right, always beginning with the first
one
• It’s not mandatory to use all fields in an index, but every field used must be
consecutive
• Fields in the WHERE condition go first, then GROUP BY, and ORDER BY last
• The order of fields within the WHERE condition doesn’t matter
• A range condition in WHERE, or using GROUP BY or ORDER BY, prevents
using the next fields in the index
18. Rules for using indexes
field use order
You can’t skip previous fields if you want to filter using the index for
rightmost fields
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT * FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND d_date = '2014-10-10'
AND fk_i_id_tbl_vertical = 1;
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT * FROM table
WHERE d_date = '2014-10-10'
AND fk_i_id_tbl_vertical = 1;
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT * FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND fk_i_id_tbl_vertical = 1;
19. Rules for using indexes
field order in ranges, groups and sorts
None of these examples can filter using the index for the field
fk_i_id_tbl_vertical
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT * FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND d_date < '2014-10-10'
AND fk_i_id_tbl_vertical = 1;
SELECT * FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND fk_i_id_tbl_vertical = 1
GROUP BY d_date;
SELECT * FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND fk_i_id_tbl_vertical = 1
ORDER BY d_date;
20. Rules for using indexes
covering indexes and field order
This query uses the covering index for all the requested fields and uses
the index to filter on the first field, but it can’t filter on the later fields
because the WHERE condition skips the second field in the index
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT fk_c_id_tbl_countries, d_date, fk_i_id_tbl_vertical FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND fk_i_id_tbl_vertical = 1;
This query uses the covering index for all the requested fields, but can’t
use it for filtering because the WHERE condition skips the first field in
the index
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT fk_c_id_tbl_countries, d_date, fk_i_id_tbl_vertical FROM table
WHERE d_date < '2014-10-10'
AND fk_i_id_tbl_vertical = 1;
21. Rules for using indexes
covering indexes and non-indexed fields
You can’t use the covering index optimization if any of the
requested fields is not included in the index
KEY `country_date_vertical` (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)
SELECT fk_c_id_tbl_countries, d_date, fk_i_id_tbl_vertical, s_query FROM table
WHERE fk_c_id_tbl_countries = 'es'
AND d_date = '2014-10-10'
AND fk_i_id_tbl_vertical = 1;
23. Duplicated indexes
• Avoid duplicating fields across different indexes; it
hurts write performance and increases table
size
• Think about the most frequent uses of the table, so
you can design the table itself and order the fields
in indexes smartly to avoid duplicated fields
24. Promote covering indexes
• Consider adding frequently requested fields at
the end of an index, even if they aren’t used in
WHERE, GROUP BY or ORDER BY
• If most queries on a table are optimised to use
covering indexes, that is the most important
performance boost you can get
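A small runnable check of the idea above: appending a frequently selected column to an index makes queries on it covering. The deck's "Using index" is MySQL EXPLAIN output; here sqlite3 is used so the sketch runs, and it reports "USING COVERING INDEX" in EXPLAIN QUERY PLAN instead. Table and index names are made up.

```python
import sqlite3

# num_ads is only in the SELECT list, never filtered on; appending it to the
# index lets the query be answered from the index alone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads (country TEXT, d_date TEXT, num_ads INTEGER)")
conn.execute("CREATE INDEX country_date_num ON ads (country, d_date, num_ads)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT num_ads FROM ads "
    "WHERE country = 'es' AND d_date = '2014-10-10'").fetchone()
print(plan[-1])  # mentions COVERING INDEX country_date_num
```

Without the trailing `num_ads` column, the same query would have to visit the table rows after the index lookup.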
25. Index cardinality
• Cardinality is how many unique values an index has
• The higher the cardinality, the more efficiently an index
filters records
• MySQL maintains approximate statistics about
cardinality
• Each MySQL version gives different cardinality
values, and they can become outdated under high
write load
26. Index cardinality
mysql> analyze table users;
mysql> show index from users;
+-------+------------+----------+-----+-------------+-------------+
| Table | Non_unique | Key_name | Seq | Column_name | Cardinality |
+-------+------------+----------+-----+-------------+-------------+
| users |          0 | PRIMARY  |   1 | id          |     9728224 |
| users |          1 | age_sex  |   1 | age         |         192 |
| users |          1 | age_sex  |   2 | sex         |         406 |
| users |          1 | sex_age  |   1 | sex         |           2 |
| users |          1 | sex_age  |   2 | age         |         406 |
| users |          1 | name     |   1 | name        |       38149 |
| users |          1 | active   |   1 | active      |           2 |
+-------+------------+----------+-----+-------------+-------------+
27. Cardinality and distribution
            Cardinality   Cardinality   count(distinct)   Distribution            Filter efficiency
            version 5.5   version 5.6
PRIMARY     10,000,267    9,728,224     10,000,000        unique                  optimum
name        18,868        38,149        10,001            ±0.01% each             very good
age         18            192           101               ±1% each                good
sex         18            2             2                 50% each                fair
active      18            2             2                 '0': 0.1%, '1': 99.9%   '0' very good, '1' very bad
30. EXPLAIN fields
id Id of the SELECT, if there is more than one
select_type Query type (SIMPLE, UNION, SUBQUERY…)
table Table name
type Record search strategy
possible_keys Indexes to consider
key Index used
key_len Used length of the index
ref Fields matched with the index
rows Approximate number of records to consider
filtered Percentage of filtered records by WHERE
Extra Additional info
31. Search strategies
from most to least optimal
• const: just one record, by primary key or unique index
• eq_ref: just one record for each other record in a JOIN
• ref: multiple records filtering by index
• index_merge: multiple records filtering by more than one index
• range: multiple records filtering by range using index
• index: read all records in the index file (index scan)
• ALL: read all records in the data file (full scan)
32. Information in the “Extra” field
• Using where: records are filtered after reading them, using the
WHERE condition (an index wasn’t able to filter all of them)
• Using index: all field data is read directly from the index, without
accessing the data file (covering index)
• Using where, Using index: as with “Using where”, records are
filtered after reading them, but data comes from an index
• Using filesort: extra step to sort after filtering records, when
records were not read in order from an index, using a temporary file
• Using temporary: temporary tables are needed to complete some
steps and satisfy the query (in memory or disk)
33. Guide to understand EXPLAIN
• “Using index” boosts query performance (covering index),
especially with lots of results
• Avoid “Using filesort” and “Using temporary”, especially with lots
of results
• “Using where” in queries of type “ALL” or “index” means
it is not possible to discard any record directly from the index;
they are filtered while reading all the records
• Watch how many records are going to be read approximately,
according to the field “rows”
• Watch the effective used length of the index, according to the field
“key_len”
34. Just one record by Primary Key
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats
-> WHERE s_query = 'account executive'
-> AND fk_i_id_tbl_type_dates = 1
-> AND d_date = '2009-09-28'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: const
possible_keys: PRIMARY,s_query_i_num_ads_salary_d_date,d_date
key: PRIMARY
key_len: 774
ref: const,const,const
rows: 1
35. Just one record by UNIQUE KEY
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats_NEW
-> WHERE s_query = 'account executive'
-> AND fk_i_id_tbl_type_dates = 1
-> AND d_date = '2009-09-28'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats_NEW
type: const
possible_keys: unique_key,s_query_i_num_ads_salary_d_date,d_date
key: unique_key
key_len: 774
ref: const,const,const
rows: 1
36. JOIN of just one record
mysql> EXPLAIN SELECT * FROM tbl_users LEFT JOIN tbl_countries
-> ON tbl_users.fk_c_id_tbl_countries = tbl_countries.c_id\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tbl_users
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20986861
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: tbl_countries
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 6
ref: trovit_global.tbl_users.fk_c_id_tbl_countries
rows: 1
37. Many records using a partial
Unique Key
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats_NEW
-> WHERE s_query = 'account executive'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats_NEW
type: ref
possible_keys: unique_key,s_query_i_num_ads_salary_d_date
key: unique_key
key_len: 767
ref: const
rows: 1073
38. Many records using index
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats
-> WHERE d_date = '2012-10-15'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: ref
possible_keys: d_date
key: d_date
key_len: 3
ref: const
rows: 10908
39. OR condition and indexes
mysql> EXPLAIN SELECT a FROM t
-> WHERE b = 1 OR c = 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
type: index_merge
possible_keys: b,c,b_a
key: b,c
key_len: 4,4
ref: NULL
rows: 4863128
Extra: Using union(b,c); Using where
40. Range query using index
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats_NEW
-> WHERE s_query = 'account executive'
-> AND fk_i_id_tbl_type_dates = 1
-> AND d_date < '2011-04-25'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats_NEW
type: range
possible_keys: unique_key,s_query_i_num_ads_salary_d_date,d_date
key: unique_key
key_len: 774
ref: NULL
rows: 539
41. Condition is not using index,
but using covering index
mysql> EXPLAIN SELECT s_query FROM jobs_tbl_trovit_stats
-> WHERE i_num_ads_salary = 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: index
possible_keys: NULL
key: s_query_i_num_ads_salary_d_date
key_len: 775
ref: NULL
rows: 5543853
Extra: Using where; Using index
42. Full scan query, filter condition
doesn’t use index
mysql> EXPLAIN SELECT i_num_ads FROM jobs_tbl_trovit_stats
-> WHERE i_num_ads_salary = 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5543853
Extra: Using where
43. Full scan query, retrieve all
records
mysql> EXPLAIN SELECT i_num_ads FROM jobs_tbl_trovit_stats\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5543853
Extra: NULL
45.
KEY `a_b_c_d` (`a`,`b`,`c`,`d`) + id
mysql> EXPLAIN SELECT a, b, d FROM t
-> WHERE a = 1 ORDER BY id\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ref
possible_keys: a,a_b,a_b_c_d
key: a_b_c_d
key_len: 4
ref: const
rows: 199704
Extra: Using where; Using index; Using filesort
200000 rows in set (0.28 sec)
The importance of “using index”
the query optimizer chooses wisely and favours the performance boost of
“using index”, even if it’s forced into “using filesort”
46.
KEY `a` (`a`) + id
mysql> EXPLAIN SELECT a, b, d FROM t
-> FORCE KEY (a)
-> WHERE a = 1 ORDER BY id\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ref
possible_keys: a
key: a
key_len: 4
ref: const
rows: 199704
Extra: Using where
200000 rows in set (0.53 sec)
The importance of “using index”
if we force an index that we think would be better for performance (in this case the
bad extra “using filesort” goes away), we might be mistaken and the query could be slower
47.
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats
-> ORDER BY d_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5543853
Extra: Using filesort
5066959 rows in set (3 min 27.96 sec)
But sometimes query optimizer is wrong
a full scan query: the optimizer prefers to read only the data file and do a filesort,
instead of using an index to first read the keys in order and then access the data file to retrieve the fields
48.
mysql> EXPLAIN SELECT * FROM jobs_tbl_trovit_stats
-> FORCE KEY (d_date)
-> ORDER BY d_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: jobs_tbl_trovit_stats
type: index
possible_keys: NULL
key: d_date
key_len: 3
ref: NULL
rows: 5543853
5066959 rows in set (18.11 sec)
But sometimes query optimizer is wrong
the same query, but this time an index scan: with millions of records it is much better to force
an index that satisfies the sort order, even if we must do a second read
from the data file to retrieve the fields of every record
50. Denormalize dates
don’t use DATE_FORMAT in WHERE, GROUP BY or ORDER BY
mysql> EXPLAIN SELECT SUM(f_revenue_in_euros*f_revenue_share/100) AS revenue,
-> DATE_FORMAT(d_date, "%Y-%m") FROM tbl_publishers_stats
-> WHERE fk_i_id_tbl_publishers = '1658'
-> GROUP BY YEAR(d_date), MONTH(d_date)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tbl_publishers_stats
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 323826
Extra: Using where; Using temporary; Using filesort
51. Denormalize dates
use specific indexed fields for year, month, day, etc
mysql> EXPLAIN SELECT SUM(f_revenue_in_euros*f_revenue_share/100) AS revenue,
-> i_year, i_month FROM tbl_publishers_stats
-> WHERE fk_i_id_tbl_publishers = '1658'
-> GROUP BY i_year, i_month\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tbl_publishers_stats_NEW
type: ref
possible_keys: publisher_year_month
key: publisher_year_month
key_len: 5
ref: const
rows: 236
Extra: Using where
52. Avoid using “*”
• Avoid using “*” in the field list of a SELECT
• Put only the fields you need in order to save unneeded disk reads
• Big extra optimisation if all the fields belong to the index used in
the query (covering index)
• Exception to this rule: count(*) must always be used
instead of count(field_name):
• To avoid causing confusion to the query optimiser
• Avoids future errors if the query is modified and the field
inside the count() function is no longer indexed
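There is one more reason, beyond the slide's optimiser argument, to standardise on count(*): count(field_name) silently skips NULLs. This is standard SQL semantics, shown with sqlite3 so the sketch is runnable; the table name is made up.

```python
import sqlite3

# count(*) counts rows; count(column) counts only non-NULL values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (None,), (3,)])
star = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
col = conn.execute("SELECT COUNT(x) FROM t").fetchone()[0]
print(star, col)  # 3 2
```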
53. Page queries
• Avoid running any long-executing query
• Long queries block the execution of MySQL internal
maintenance processes, like purging; the history list can
grow by several gigabytes and never shrink
again
• Long INSERT, UPDATE and DELETE statements, in addition to the
above, delay replication to slaves
• Use LIMIT in queries of any type to page work into smaller
and faster blocks
54. Using LIMIT the right way
• LIMIT filters final results of the query, only after
WHERE, GROUP BY and ORDER BY are
processed
• Beware of GROUP BY and ORDER BY “using
filesort”: all records are file-sorted
before LIMIT can take effect
• Even when using an index, the LIMIT
<offset>,<row_count> syntax runs slower
and slower as the offset increases
55. Using LIMIT the right way
always avoid “using filesort”
mysql> EXPLAIN SELECT * FROM homes_tbl_my_searches
-> ORDER BY s_where
-> LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: homes_tbl_my_searches
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 746475
Extra: Using filesort
10 rows in set (4.00 sec)
56. Using LIMIT the right way
always avoid “using filesort”
mysql> EXPLAIN SELECT * FROM homes_tbl_my_searches
-> ORDER BY s_what
-> LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: homes_tbl_my_searches
type: index
possible_keys: NULL
key: what_where
key_len: NULL
ref: NULL
rows: 10
10 rows in set (0.00 sec)
57. # Slower as offset increments
mysql> SELECT i_id FROM jobs_tbl_trovit_stats_NEW
-> LIMIT 5000000,100;
100 rows in set (0.92 sec)
# Always fast
mysql> SELECT i_id FROM jobs_tbl_trovit_stats_NEW
-> WHERE i_id > $last_id
-> LIMIT 100;
100 rows in set (0.00 sec)
Using LIMIT the right way
• When paging, avoid LIMIT <offset>,<row_count>
it’s better to filter using a primary or unique key
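The keyset paging shown on the slide above can be sketched as a runnable loop. sqlite3 stands in for MySQL here, and table/column names are illustrative.

```python
import sqlite3

# Filter on the primary key instead of OFFSET, so every page is an index
# seek no matter how deep you page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (i_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO stats (i_id) VALUES (?)",
                 [(i,) for i in range(1, 501)])

def page_after(last_id, size=100):
    rows = conn.execute(
        "SELECT i_id FROM stats WHERE i_id > ? ORDER BY i_id LIMIT ?",
        (last_id, size)).fetchall()
    return [r[0] for r in rows]

first = page_after(0)           # ids 1..100
second = page_after(first[-1])  # ids 101..200, no offset scan
print(second[0], second[-1])    # 101 200
```

The caller only has to remember the last id it saw, which is exactly the `$last_id` placeholder in the slide's query.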
58. LIMIT for long UPDATE and DELETE
• Divide long UPDATE and DELETE queries into several
shorter executions using LIMIT
• If possible, use indexed fields to find records
# Crontab to purge old records
do
mysql> DELETE FROM homes_tbl_my_searches
    -> WHERE dt_date < '2013-12-31'
    -> LIMIT 1000;
while (rows affected > 0)
# One-time UPDATE, not worth creating an index just for this
do
mysql> UPDATE homes_tbl_my_searches SET i_active = 0
    -> WHERE i_active != 0 AND s_what = 'offensive stopword'
    -> LIMIT 1000;
while (rows changed > 0)
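The purge loop above can be sketched as runnable code. MySQL supports `DELETE ... LIMIT` directly; sqlite3 is used here so the sketch runs, picking each batch via rowid instead. Table and column names are made up.

```python
import sqlite3

# Delete old rows in batches of 1000, committing between rounds so no single
# statement holds locks (or blocks replication) for long.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_searches (dt_date TEXT)")
conn.executemany("INSERT INTO my_searches (dt_date) VALUES (?)",
                 [("2013-06-15",)] * 2500)

while True:
    cur = conn.execute(
        "DELETE FROM my_searches WHERE rowid IN ("
        "  SELECT rowid FROM my_searches WHERE dt_date < '2013-12-31' LIMIT 1000)")
    conn.commit()          # release locks after every batch
    if cur.rowcount == 0:  # stop once a batch deletes nothing
        break

left = conn.execute("SELECT COUNT(*) FROM my_searches").fetchone()[0]
print(left)  # 0
```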