The document summarizes several case studies on optimizing large SQL queries. Case study 1 examines a query that runs slowly due to a column used as a bitmap being compared with a bitwise operator rather than being indexed directly. Case study 2 shows a query that could use indexes but data type mismatches prevent it. Case study 3 involves a dependent subquery that is rewritten as a left join to better use indexes.
Slicing in Python is a feature that enables accessing parts of sequences like strings, tuples, and lists. You can also use them to modify or delete the items of mutable sequences such as lists. Slices can also be applied on third-party objects like NumPy arrays, as well as Pandas series and data frames.
Slicing enables writing clean, concise, and readable code.
This article shows how to access, modify, and delete items with indices and slices, as well as how to use the built-in class slice().
For more course tutorials visit
uophelp.com is now newtonhelp.com
www.newtonhelp.com
For what reason do commentators need to think about the internal controls of the affiliation? What are some key segments of inside control? Which are the most fundamental? In what way will the overseer need to modify the survey program if the inside controls are viewed as inadequate to help organization articulations?
Slicing in Python is a feature that enables accessing parts of sequences like strings, tuples, and lists. You can also use them to modify or delete the items of mutable sequences such as lists. Slices can also be applied on third-party objects like NumPy arrays, as well as Pandas series and data frames.
Slicing enables writing clean, concise, and readable code.
This article shows how to access, modify, and delete items with indices and slices, as well as how to use the built-in class slice().
For more course tutorials visit
uophelp.com is now newtonhelp.com
www.newtonhelp.com
For what reason do commentators need to think about the internal controls of the affiliation? What are some key segments of inside control? Which are the most fundamental? In what way will the overseer need to modify the survey program if the inside controls are viewed as inadequate to help organization articulations?
For over 30 years our products have delivered reliable energy efficient roof savings with maintenance solutions to commercial/industrial roofs. Now used on all continents, these all elastomeric roof leak repair compounds and ENERGY STAR® certified coating maintenance systems are proven most effective for all climates.
Rechtsupdate 2016 - Community Camp Berlin 2016 - #ccb16Thomas Schwenke
Präsentation zu meiner Rechts-Session beim CommunityCamp 2016 mit wichtigen Rechtsentwicklungen und Rechtsnews (ergänzt um weiterführende Links). https://communitycampberlin.tixxt.com/
Python is an interpreted, object-oriented programming language similar to PERL, that has gained popularity because of its clear syntax and readability.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.
For over 30 years our products have delivered reliable energy efficient roof savings with maintenance solutions to commercial/industrial roofs. Now used on all continents, these all elastomeric roof leak repair compounds and ENERGY STAR® certified coating maintenance systems are proven most effective for all climates.
Rechtsupdate 2016 - Community Camp Berlin 2016 - #ccb16Thomas Schwenke
Präsentation zu meiner Rechts-Session beim CommunityCamp 2016 mit wichtigen Rechtsentwicklungen und Rechtsnews (ergänzt um weiterführende Links). https://communitycampberlin.tixxt.com/
Python is an interpreted, object-oriented programming language similar to PERL, that has gained popularity because of its clear syntax and readability.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.
Python is an interpreted, object-oriented programming language similar to PERL, that has gained popularity because of its clear syntax and readability.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.
Discover the power of Recursive SQL and query transformation with Informix da...Ajay Gupte
This presentation will provide an overview of the Recursive SQL with the CONNECT BY clause feature. We will provide examples of typical practical database problems and describe in detail how they can be solved with recursive SQL. The problems discussed include for bill of materials, obtaining the number of employees for each manager in a particular sub-organization, converting linked dimension hierarchies in a star schema to fixed dimension hierarchies, tracking packages, and generating test data. This presentation compares the new solutions with traditional solutions of these problems and discusses the advantages and disadvantages of the various methods. This presentation will also discuss the query transformation techniques with Informix 12.10 features which will focus on how query blocks are moved between different levels and optimized using examples and diagrams. Users will learn how to analyze complex examples based on various Informix 12.10 features. Examples included in this session are query block movement, table re-ordering, complex ANSI joins, sub-queries, derived tables, views, connect by, OLAP functions, setops cases.
This session hold data about SQL in general and specifically on :
- string gotchas
- string functions
- string aggregations
- basic NLP
- spaghetti queries
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
2. Agenda
• Showing the title slide
• Going through this agenda
• Prerequisites
• Case study #1
• Case study #2
• Case study #3
• Case study #4
www.percona.com
3. This talk is not about
●
SQL basics
●
Reading explain
●
Indexing basics
www.percona.com
5. #1: Query
●
Query runs in the SELECT t.id, ts.meta1, lt.meta2
FROM things t
10 seconds range INNER JOIN things_stuff ts
ON t.fk_things_stuff_id = ts.id
INNER JOIN yet_an_other_table yat
●
According to ON t.fk_yat_id = yat.id
WHERE t.active = 1
explain output it AND t.type != 'guest'
AND t.title like '%the%'
AND LOWER(t.username) like '%e%'
can use some AND t.grants & 1 != 0
AND t.attribute1=3;
indexes
●
Mostly single-
column indexes
were defined
www.percona.com
6. #1: Query
●
Rows examined in SELECT t.id, ts.meta1, lt.meta2
FROM things t
the 100k range INNER JOIN things_stuff ts
ON t.fk_things_stuff_id = ts.id
INNER JOIN yet_an_other_table yat
●
Rows returned is ON t.fk_yat_id = yat.id
WHERE t.active = 1
single digit AND t.type != 'guest'
AND t.title like '%the%'
AND LOWER(t.username) like '%e%'
●
This means AND t.grants & 1 != 0
AND t.attribute1=3;
something is
selective, but most
likely not indexed
www.percona.com
7. #1: issue identification
********************** 1. row **********************
id: 1
select_type: SIMPLE
table: t This is the issue, the rest
type: const of the tables are just some
key: idx_active metadata fetching (type is eq_ref)
key_len: 4
rows: 332873
Extra: Using where
********************** 2. row **********************
id: 1
select_type: SIMPLE
table: ts
type: eq_ref
key: PRIMARY
key_len: 4
rows: 1
Extra:
********************** 3. row *****************
type: eq_ref
rows: 1
www.percona.com
8. #1: finding the filtering clause
● In the original query, filtering in the
WHERE clause are all for table t (things)
● Find out if any of them are selective alone
(it's possible that they are not, but some
combination of them will be selective)
SELECT COUNT(t.id) FROM things t WHERE t.active=1;
+----------+
| COUNT(1) |
+----------+
| 168520 |
+----------+
1 row in set (10.15 sec)
www.percona.com
9. #1: finding the filtering clause
SELECT COUNT(t.id) FROM things t WHERE t.status!='guest';
+----------+
| COUNT(1) |
+----------+
| 284582 |
+----------+
1 row in set (10.13 sec)
SELECT COUNT(t.id) FROM things t WHERE t.title LIKE '%the%';
+----------+
| COUNT(1) |
+----------+
| 173895 |
+----------+
1 row in set (10.12 sec)
SELECT COUNT(t.id) FROM things t WHERE lower(t.username) like '%e%';
+----------+
| COUNT(1) |
+----------+
| 190345 |
+----------+
1 row in set (10.15 sec)
www.percona.com
10. #1: finding the filtering clause
SELECT COUNT(t.id) FROM things t WHERE t.grants & 1 != 0;
+----------+
| COUNT(1) |
+----------+
| 4 |
+----------+
1 row in set (10.13 sec)
● This is selective
● In the original table definition the grants
column has an index on it, but it's not used
(the index on the not too selective active
column is used instead)
● Data type of grants in TINYINT
www.percona.com
11. #1: reason for slowness
SELECT COUNT(t.id) FROM things t WHERE t.grants & 1 != 0;
+----------+
| COUNT(1) |
+----------+
| 4 |
+----------+
1 row in set (10.13 sec)
● The & is bitwise AND operator
● On the left hand side of comparison it's
not the column itself, which is indexed,
but some kind of function of the column
is compared, which is not indexed.
www.percona.com
12. #1: TINYINT column used as
bitmap
● The grants in this case used as 1 raw byte
(8 bits), 1 is a mask we compare to
● A common case for this is storing
privileges Does some document / user / page
has privilege “H”?
1 0 1 0 0 0 0 1
&
Privilege “A” Privilege “H”
0 0 0 0 0 0 0 1
The result is non-0 when the
appropriate privilege bit is 1. 0 0 0 0 0 0 0 1
www.percona.com
13. #1: solution
● The solution itself is very space efficient,
but the data is not accessible.
● If only privilege “H” is checked frequently,
just create a privilege “H” column, which is
indexed and store data redundantly
● If other checks are frequent, use separate
columns for each privilege
● Other antipatterns in the query, which are
not making it slower
www.percona.com
15. #2: Query
●
Query runs in the SELECT ta.*,
tb.attr1,
10 seconds range tb.attr2,
tb.attr3,
tb.attr4,
●
Rows examined is tc.attr1
FROM table_a ta,
much higher than table_b tb,
table_c tc
WHERE tb.ID = ta.tb_id
rows returned AND tc.attr1 = ta.attr1
AND tc.attr2 = 1
AND ta.attr3 = 'non_selective'
AND ta.attr4 = 715988
AND ta.attr5 BETWEEN 1 AND 10
ORDER BY ta.attr6 ASC
www.percona.com
17. #2: Show indexes and data
types
mysql> SHOW INDEX FROM table_a WHERE Key_name = "attr4" OR Key_name = "attr3"G
*************************** 1. row ***************************
Table: table_a
Key_name: attr4
Seq_in_index: 1
Column_name: attr4
Cardinality: 523608
*************************** 2. row ***************************
Table: table_a
Key_name: attr3
Seq_in_index: 1
Column_name: attr4
Cardinality: 21
2 rows in set (0.01 sec)
`attr3` varchar(200) DEFAULT NULL,
`attr4` varchar(200) DEFAULT NULL
www.percona.com
18. #2: solution
SELECT ta.*, SELECT ta.*,
tb.attr1, tb.attr1,
tb.attr2, tb.attr2,
tn.attr3, tn.attr3,
tb.attr4, tb.attr4,
tc.attr1 tc.attr1
FROM table_a ta, FROM table_a ta,
table_b tb, table_b tb,
table_c tc table_c tc
WHERE tb.ID = ta.tb_id WHERE tb.ID = ta.tb_id
AND tc.attr1 = ta.attr1 AND tc.attr1 = ta.attr1
AND tc.attr2 = 1 AND tc.attr2 = 1
AND ta.attr3 = 'non_selective' AND ta.attr3 = 'non_selective'
AND ta.attr4 = 715988 AND ta.attr4 = '715988'
AND ta.attr5 BETWEEN 1 AND 10 AND ta.attr5 BETWEEN 1 AND 10
ORDER BY a.date_recorded ASC ORDER BY a.date_recorded ASC
After this change, the right index is used (changing column type could be better)
www.percona.com
19. #2: conclusion
● Always quote strings, never quote
numbers
● Similar thing can happen when there is a
data type mismatch between joined tables
● For example an id is in a INT column is
one table, VARCHAR(x) in the other,
join won't use indexes on them
www.percona.com
21. #3: Query
SELECT * FROM things WHERE ((thing_id=87459234 AND thing_flag1 = 1)
OR (thing_other_thing_id=87459234 AND thing_flag2 = 1)) AND
(thing_id=87459234 OR other_thing_id=87459234) AND (thing_id <>
3846571 AND other_thing_id <> 3846571) AND (thing_prop1 IS NOT NULL)
AND (thing_rangedate >= '2012-12-01') OR (thing_rangedate_min >
'2012-12-03' AND thing_rangedate_max < '2012-12-04')
AND (NOT EXISTS (SELECT 1 FROM thing_stuffs WHERE fk_things_id =
things.id AND important_num BETWEEN 1 and 4 AND other_range_value
between 1 and 3));
If a query is big enough, always start with formatting the query.
www.percona.com
22. #3: Query
SELECT *
FROM things
First issue is clear from WHERE
explain output: ((thing_id=87459234 AND thing_flag1 = 1)
DEPENDENT SUBQUERY OR (thing_other_thing_id=87459234 AND thing_flag2 = 1))
AND (thing_id=87459234 OR other_thing_id=87459234)
AND (thing_id <> 3846571 AND other_thing_id <> 3846571)
AND (thing_prop1 IS NOT NULL)
AND (thing_rangedate >= '2012-12-01')
OR (thing_rangedate_min > '2012-12-03'
AND thing_rangedate_max < '2012-12-04')
AND (NOT EXISTS
(SELECT 1 FROM thing_stuffs
WHERE fk_things_id = things.id
AND important_num BETWEEN 1 and 4
AND other_range_value BETWEEN 1 and 3));
www.percona.com
23. #3: Explain
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: things
type: ref
possible_keys: idx_thing_id,idx_other_thing_id,idx_prop1
key: idx_thing_id
key_len: 4
rows: 23917933
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: thing_stuffs
type: subquery
possible_keys: PRIMARY,fk_things_id
key: PRIMARY
key_len: 4
rows: 1
Extra: Using where
2 rows in set (0.01 sec)
www.percona.com
24. #3: Rewrite DEPENDENT
SUBQUERY
SELECT *
FROM things
WHERE
AND (NOT EXISTS
(SELECT 1 FROM thing_stuffs
WHERE fk_things_id = things.id
AND important_num BETWEEN 1 and 4
AND other_range_value BETWEEN 1 and 3));
SELECT t.*
FROM things t
LEFT JOIN thing_stuffs ts ON ts.fk_things_id = things.id
WHERE ts.fk_things_id IS NULL
AND ts.important_num BETWEEN 1 AND 4
AND ts.other_range_value BETWEEN 1 AND 3
www.percona.com
25. #3: Explain after rewrite
*************************** 1. row ***************************
id: 1
select_type: SIMPLE Composite index on important_num
table: ts and other_range_value
type: ref
possible_keys: idx_important_num_other_range
key: idx_important_num_other_range
key_len: 4
ref: const
rows: 93854
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: ts.fk_things_id
rows: 5
Extra: Using where
2 rows in set (0.01 sec)
www.percona.com
26. #3: Examining further
Still more rows examined than returned.
possible_keys: idx_important_num_other_range
key: idx_important_num_other_range
key_len: 4
rows: 93854
The index is a composite index, however only important_num is used
mysql> SELECT COUNT(1) FROM thing_stuffs WHERE important_num BETWEEN 1 AND 4;
+----------+
| count(1) |
+----------+
| 87520 |
+----------+
1 row in set (0.15 sec)
mysql> SELECT COUNT(1) FROM thing_stuffs WHERE other_range_value BETWEEN 1 AND 3;
+----------+
| count(1) |
+----------+
| 134526 |
+----------+
1 row in set (0.17 sec)
www.percona.com
27. #3: Examining further
mysql> SELECT COUNT(1) FROM thing_stuff WHERE important_num BETWEEN 1 AND 4
AND other_range_value BETWEEN 1 AND 3;
+----------+
| count(1) |
+----------+
| 125 |
+----------+
1 row in set (0.15 sec)
● Together they are selective
● Both of them are range conditions, so they
can't be used together in a composite
index
● Unless we rewrite
www.percona.com
28. #3: Second rewrite
SELECT COUNT(1) from thing_stuff WHERE important_num IN (1,2,3,4) AND
other_range_value IN (1,2,3);
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ts
type: ref
key: idx_important_num_other_range
key_len: 8
ref: const
rows: 125
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ref
key: PRIMARY
key_len: 4
ref: ts.fk_things_id
rows: 5
Extra: Using where
2 rows in set (0.01 sec)
www.percona.com
29. #3: Using it in the original
query
The query is still slow. SELECT t.*
FROM things t
LEFT JOIN thing_stuffs ts ON ts.fk_things_id = t.id
A lot more rows examined WHERE ((t.thing_id=87459234 AND t.thing_flag1 = 1)
Than returned. OR (t.other_thing_id=87459234 AND t.thing_flag2 = 1))
AND (t.thing_id=87459234 OR t.other_thing_id=87459234)
AND (t.thing_id <> 3846571
Checking explain the AND t.other_thing_id <> 3846571)
relevant part is that the AND (t.thing_prop1 IS NOT NULL)
query uses an index merge. AND (t.thing_rangedate >= '2012-12-01')
OR (t.thing_rangedate_min > '2012-12-03'
AND t.thing_rangedate_max < '2012-12-04')
AND ts.important_num IN (1,2,3,4)
AND ts.other_range_value IN (1,2,3)
AND ts.fk_things_id IS NULL;
Extra: Using union(idx_thing_id,idx_other_thing_id); Using where
www.percona.com
30. #3: Relevant part
SELECT * FROM things
WHERE (thing_id=87459234 OR other_thing_id=87459234)
AND thing_id <> 3846571 AND other_thing_id <> 3846571;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: things
type: index_merge
possible_keys: idx_thing_id,idx_other_thing_id
key: idx_thing_id,idx_other_thing_id
key_len: 4,4
ref: NULL
rows: 59834
Extra: Using union(idx_thing_id,idx_other_thing_id); Using where
1 row in set (0.00 sec)
www.percona.com
31. #3: Rewritten query
● Index merge is an expensive SELECT *
FROM things
operation WHERE thing_id=87459234
AND other_thing_id<>3846571
● Especially if indexes are large UNION
SELECT *
FROM things
● UNION always create an on-disk WHERE other_thing_id=87459234
temp table AND thing_id<>3846571;
● Just turning off index merge
could be enough for a significant
performance advantage
mysql> select @@optimizer_switchG
*************************** 1. row ***************************
@@optimizer_switch:
index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_interse
ction=on,engine_condition_pushdown=on
1 row in set (0.00 sec)
www.percona.com
32. #3: Putting it together
SELECT t.* FROM things t
LEFT JOIN things_stuff ts
ON ts.fk_things_id = t.id
WHERE t.thing_id=87459234
AND t.other_thing_id<>3846571
AND t.thing_flag1=1
AND t.thing_prop IS NOT NULL SELECT t.* FROM things t
AND (t.thing_rangedate_min >= '2012-12-01') LEFT JOIN t.things_stuff ts
AND ((t.thing_rangedate_min <= '2012-12-03' ON ts.fk_things_id = t.id
AND t.thing_rangedate_max > '2012-12-03') WHERE t.other_thing_id=87459234
OR (t.thing_rangedate_min < '2012-12-04' AND t.thing_id<>3846571
AND t.thing_rangedate_max >= '2012-12-04') AND t.thing_flag2=1
OR (t.thing_rangedate_min > '2012-12-03' AND t.thing_prop1 IS NOT NULL
AND t.thing_rangerate_max < '2012-12-04')) AND (t.thing_rangedate_min >= '2012-12-01')
AND ts.fk_things_id IS NULL AND ((t.thing_rangedate_min <= '2012-12-03'
AND ts.important_num IN (1,2,3,4) AND t.thing_rangedate_max > '2012-12-03')
AND ts.other_range_value IN (1,2,3) OR (t.thing_rangedate_min < '2012-12-04'
UNION AND t.thing_rangedate_max >= '2012-12-04')
OR (t.thing_rangedate_min > '2012-12-03'
AND t.thing_rangedate_max < '2012-12-04'))
AND ts.fk_things_id IS NULL
AND ts.important_num IN (1,2,3,4)
AND ts.other_range_value IN (1,2,3);
www.percona.com
34. #4: Summary
● Very large, 20 table join
● Explain looks ok
● Query runs for almost a hour
● Handler status variables are way off
www.percona.com
35. #4: Query
SELECT tmain.id, tmain.date,
CONCAT_WS("_", COALESCE(m1.col1, 'n/a'), COALESCE(m2.col2, 'n/a')
/* a lot more columns selected */)
FROM main_table as tmain ah
LEFT JOIN
(metadata_table1 m1 JOIN metadata_table2 m2 ON m1.attr1 = m2.attr1 AND m1.const = 'some text')
ON tmain.id = m1.tmain_id AND tmain.important_flag=1
AND ((m1.attr11='a') OR (m2.attr12 IS NOT NULL) AND 3>=m1.attr13 AND 5<m2.attr14)
OR m1.attr15 IS NOT NULL AND m2.attr16 BETWEEN 1 AND 10
LEFT JOIN
(metadata_table3 m3 JOIN metadata_table4 m4 ON m3.attr1 = m4.attr1 AND m1.const = 'some text')
ON tmain.id = m4.tmain_id AND tmain.important_flag=1
AND ((m3.attr11='a') OR (m4.attr12 IS NOT NULL) AND 3>=m3.attr13 AND 5<m4.attr14)
OR m3.attr15 IS NOT NULL AND m4.attr16 BETWEEN 11 AND 20
LEFT JOIN
(metadata_table5 m5 JOIN metadata_table2 m6 ON m5.attr1 = m6.attr1 AND m1.const = 'some text')
ON tmain.id = m5.tmain_id AND tmain.important_flag=1
AND ((m5.attr11='a') OR (m6.attr12 IS NOT NULL) AND 3>=m5.attr13 AND 5<m6.attr14)
OR m1.attr15 IS NOT NULL AND m2.attr16 BETWEEN 21 AND 30
LEFT JOIN
(metadata_table1 m7 JOIN metadata_table2 m8 ON m7.attr1 = m8.attr1 AND m7.const = 'some text')
ON tmain.id = m7.tmain_id AND tmain.important_flag=1
AND ((m7.attr11='a') OR (m8.attr12 IS NOT NULL) AND 3>=m7.attr13 AND 5<m8.attr14)
OR m7.attr15 IS NOT NULL AND m8.attr16 BETWEEN 31 AND 40
/* 6 more joins exactly like the previous ones... */
WHERE tmain.important_flag = 1;
www.percona.com
37. #4: Handler operations
mysql> show status like 'Handler%';
+----------------------------+-----------+
Run FLUSH STATUS before the query | Variable_name | Value |
+----------------------------+-----------+
to reset the counters. | Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
FLUSH STATUS; | Handler_prepare | 0 |
QUERY; | Handler_read_first | 0 |
| Handler_read_key | 61341 |
SHOW STATUS LIKE 'Handler%'; | Handler_read_last | 0 |
| Handler_read_next | 198008370 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-----------+
www.percona.com
38. #4: Problematic part
LEFT JOIN
(metadata_table1 m7 JOIN metadata_table2 m8 ON m7.attr1 = m8.attr1
AND m7.const = 'some text')
ON tmain.id = m7.tmain_id AND tmain.important_flag=1;
SELECT AVG(LENGTH(m1.somefield))
FROM metadata_table1 m1
JOIN metadata_table2 m2
ON m1.attr1 = m2.attr1
AND m1.const='some text'
+----+-------------+----------+--------+-------+-------------+
| id | select_type | table | type | rows | Extra |
+----+-------------+----------+--------+-------+-------------+
| 1 | SIMPLE | m1 | ref | 19764 | |
| 1 | SIMPLE | m2 | eq_ref | 102 | Using index |
+----+-------------+----------+--------+-------+-------------+
MySQL executes the join like this. One “problematic part takes roughly 3 minutes, so
It adds up.
www.percona.com
39. #4: Rewrite
SELECT tmain.id
FROM main_table tmain
LEFT JOIN metadata_table1 m1 ON tmain.id=m1.tmain_id
LEFT JOIN metadata_table2 m2 ON m1.attr1 = m2.attr1
AND m1.const = 'some text'
WHERE tmain.important_flag = 1;
● In order to leave out () LEFT JOIN has to be used to
find non-matching elements.
● Runtime for query is 0.2 sec instead of 3 minutes.
+----+-------------+----------+--------+------+-------------+
| id | select_type | table | type | rows | Extra |
+----+-------------+----------+--------+------+-------------+
| 1 | SIMPLE | tmain | ref | 1786 | |
| 1 | SIMPLE | m1 | ref | 4 | |
| 1 | SIMPLE | m2 | eq_ref | 1 | Using index |
+----+-------------+----------+--------+------+-------------+
www.percona.com
40. #4: Similarly large queries
● After applying the tuning to the original query, I
expected it t execute in a few seconds.
● It executed in 4 minutes instead.
● Used Handler operations looked good.
● Took a look at explain, and explain took 4 minutes
also.
● This means the optimizer spends time in
constructing the query plan.
● Solution: STRAIGHT_JOIN and/or
optimizer_search_depth
www.percona.com
41. Annual Percona Live
MySQL Conference and Expo
Visit:
http://www.percona.com/live/mysql-conference-2013/
www.percona.com