This document discusses Parallel Query, a feature of POLARDB for MySQL that allows queries to run in parallel across multiple CPU cores for improved performance. It begins with an introduction to Parallel Query and how it works, then discusses how to use Parallel Query, how it is implemented internally, examples of performance improvements seen, and some current limitations and plans for future work.
2. Agenda
• What is Parallel Query
• How to use Parallel Query
• Parallel Query Internals
• Parallel Query Performance
• Future Work
3. What is Parallel Query?
Parallel Query is an innovative method from Alibaba Cloud for
accelerating MySQL queries.
• Traditionally, a MySQL query runs on a single thread and cannot take
advantage of multiple cores on modern processors.
• Parallel Query takes advantage of modern processors to distribute
work across many or all available cores:
• 8 parallel threads can be up to 8 times faster.
• 32 parallel threads can be up to 32 times faster
4. Why Parallel Query?
• 2003: CPUs stopped getting
faster.
• 2004-2019: focus shifted to more
cores and sockets.
• PQ lets MySQL take advantage
of the last 15 years of progress.
5. How to Use Parallel Query
Parallel Query runs against your existing InnoDB data.
• No data extraction to another system is required.
• No query modifications are required.
Parallel Query within InnoDB (no extraction needed) is an amazing
feature exclusive to Alibaba Cloud
6. Query with Parallelism
SELECT count(*) FROM production.product;
Serial execution plan:
• Stream Aggregate: for each row returned by the index scan, perform the
aggregation.
For the above query, the Stream Aggregate operator counts the rows it receives
from the Index Scan operator.
[Figure: SQL client served by Thread 1 (Scan, Count); 1 active thread, 63 idle threads.]
7. Parallel Execution Plan
With 64 parallel threads, each thread does < 2% of the work.
[Figure: Threads 1 to 64 each run Scan, Count; a Sum step combines their counts for the SQL client.]
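The parallel plan above can be sketched in a few lines of Python. This is only an illustration of the scan, count, and sum structure, not POLARDB's implementation: the names `count_chunk` and `parallel_count` and the in-memory `product` list are assumptions made for the example, and Python threads are used to show the shape of the plan rather than to gain real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def count_chunk(rows, start, end):
    """Worker step: scan one slice of the table and count its rows."""
    return sum(1 for _ in rows[start:end])

def parallel_count(rows, workers=4):
    """Split the scan into contiguous slices, count each slice, then sum."""
    n = len(rows)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(lambda b: count_chunk(rows, *b), bounds))
    # The final "Sum" step of the plan: add up the per-worker counts.
    return sum(partial_counts)

# Hypothetical stand-in for production.product.
product = [{"id": i} for i in range(1000)]
total = parallel_count(product, workers=8)
```

Each worker returns a single number, so the final step only has to add as many values as there are workers.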
8. How Parallel Query Works
1. The parallel coordinator can split a table or index scan into equal-size
pieces.
2. Each worker can execute part of the query plan.
3. The gather stream operator is responsible for collecting the
intermediate results from the workers.
9. How Parallel Query Works
• Each worker writes results to its own buffer
  - threads run without interruption
• Pointers are passed for the Merge step
  - optimized method to hand off data
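The coordinator, worker, and gather roles described on these two slides can be sketched as follows. This is a minimal sketch under stated assumptions: `split_equal`, `worker`, `gather`, and `run_parallel` are hypothetical names, the table is a Python list, and passing list references stands in for the pointer hand-off mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

def split_equal(n_rows, n_workers):
    """Coordinator step: split [0, n_rows) into n_workers near-equal ranges."""
    base, extra = divmod(n_rows, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def worker(table, bounds, predicate):
    """Worker step: scan one range and write matches to a private buffer."""
    lo, hi = bounds
    return [row for row in table[lo:hi] if predicate(row)]

def gather(buffers):
    """Gather step: stream rows out of each worker buffer in turn."""
    for buf in buffers:
        yield from buf

def run_parallel(table, predicate, n_workers=4):
    ranges = split_equal(len(table), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Only references to the buffers are handed to the gather step.
        buffers = list(pool.map(lambda r: worker(table, r, predicate), ranges))
    return list(gather(buffers))

even_rows = run_parallel(list(range(10)), lambda x: x % 2 == 0, n_workers=4)
```

Because every worker owns its buffer, no locking is needed while the scan runs; coordination happens only when the buffers are handed over.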
10. Parallel Query Internals
Parallel Query uses multiple methods to distribute work among the parallel
threads, including:
• Parallel sequential scan: the data pages of the table are divided
among the cooperating threads.
• Parallel index operation: each cooperating thread reads a single index
block and scans and returns all records referenced by that block; other
threads can at the same time return records from a different index
page. The results of a parallel btree scan are returned in sorted order within
each worker thread.
15. Partitioning
• The server will normally request 100 partitions per worker thread
• “Fast” workers may process more partitions than “slow” workers
• Partitions of more equal size
• When finished with one partition, a worker may be automatically
attached to a new partition.
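The partitioning scheme above resembles a shared work queue: the scan is cut into many more partitions than workers, and each worker pulls the next partition as soon as it finishes one. The sketch below illustrates that idea only; `run_with_partitions` is a hypothetical helper, and the default of 100 partitions per worker is taken from the slide.

```python
import queue
import threading

def run_with_partitions(rows, n_workers=4, partitions_per_worker=100):
    """Count rows using many small partitions pulled from a shared queue."""
    n_parts = max(1, n_workers * partitions_per_worker)
    size = max(1, -(-len(rows) // n_parts))  # rows per partition (ceiling)
    work = queue.Queue()
    for lo in range(0, len(rows), size):
        work.put((lo, min(lo + size, len(rows))))

    counts = [0] * n_workers      # rows counted by each worker
    parts_done = [0] * n_workers  # partitions processed by each worker

    def worker(wid):
        while True:
            try:
                lo, hi = work.get_nowait()  # attach to the next free partition
            except queue.Empty:
                return
            counts[wid] += len(rows[lo:hi])  # stand-in for the real scan
            parts_done[wid] += 1

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), parts_done

total_rows, parts_per_worker = run_with_partitions(list(range(10_000)))
```

The per-worker partition counts in `parts_per_worker` need not be equal: a worker that runs faster simply takes more partitions, which is the load-balancing effect the slide describes.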
16. Parallel Query SORT
SELECT col1, col2, col3 FROM t1 ORDER BY 1,2;
1. Parallel data access (table scan or index scan).
2. Parallel ORDER BY on the data handled by each worker.
3. Final merge sort of the results, which are returned to the client.
[Figure: Threads 1 to 64 each run Scan, Sort (parallel threads run a local sort); a Merge Sort step combines the sorted results for the SQL client.]
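The three steps of the parallel sort can be sketched as local sorts followed by a k-way merge. This is an illustrative sketch, not the server's code: `parallel_order_by` is a hypothetical name, `t1` is a small in-memory list of `(col1, col2, col3)` tuples, and `heapq.merge` stands in for the final merge sort.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_order_by(rows, key, n_workers=4):
    """Sort each chunk in a worker, then merge the sorted runs."""
    step = max(1, -(-len(rows) // n_workers))  # ceiling division
    chunks = [rows[i:i + step] for i in range(0, len(rows), step)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Steps 1 and 2: each worker scans its chunk and sorts it locally.
        sorted_runs = list(pool.map(lambda c: sorted(c, key=key), chunks))
    # Step 3: final merge of the already-sorted runs.
    return list(heapq.merge(*sorted_runs, key=key))

# Hypothetical rows of t1 as (col1, col2, col3).
t1 = [(3, "b", 1.0), (1, "z", 2.0), (2, "a", 3.0), (1, "a", 4.0), (3, "a", 5.0)]
ordered = parallel_order_by(t1, key=lambda r: (r[0], r[1]), n_workers=2)
```

The merge step only compares the heads of the sorted runs, so it is much cheaper than sorting the full result again.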
17. Parallel Query GROUP BY
SELECT col1, col2, SUM(col3) FROM t1 GROUP BY 1,2;
1. Parallel data access (table scan or index scan).
2. Parallel GROUP BY on the data handled by each worker.
3. Final merge of the local GROUP BY results, which are returned to the client.
DISTINCT operation will be similar to GROUP BY.
[Figure: Threads 1 to 64 each run Scan, Group (parallel threads run a local group); a Merge Groups step combines the results for the SQL client.]
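The parallel GROUP BY follows the same pattern with partial aggregates: each worker builds its own map of group keys to sums, and the final step merges those maps. The sketch below assumes rows of `(col1, col2, col3)` tuples; `local_group` and `parallel_group_by` are hypothetical names.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def local_group(chunk):
    """Worker step: SUM(col3) per (col1, col2) within one chunk."""
    sums = defaultdict(int)
    for col1, col2, col3 in chunk:
        sums[(col1, col2)] += col3
    return sums

def parallel_group_by(rows, n_workers=4):
    step = max(1, -(-len(rows) // n_workers))  # ceiling division
    chunks = [rows[i:i + step] for i in range(0, len(rows), step)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_group, chunks))
    # Final merge: add up the partial sums that share a group key.
    merged = defaultdict(int)
    for partial in partials:
        for group_key, value in partial.items():
            merged[group_key] += value
    return dict(merged)

rows = [("a", 1, 10), ("b", 1, 5), ("a", 1, 7), ("a", 2, 1), ("b", 1, 5)]
groups = parallel_group_by(rows, n_workers=2)
```

This works because SUM can be computed from partial sums; an aggregate such as AVG would need each worker to carry both a sum and a count.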
18. Parallel Query Nested-Loops JOIN
SELECT * FROM t1 JOIN t3 ON t1.id = t3.id;
1. Parallel data access (table scan or index scan) of the driving
table.
2. Parallel join of the local data handled by each worker.
3. Final merge of the results, which are returned to the client.
[Figure: Threads 1 to 64 each run Scan, Join (parallel scan and join); a Merge step combines the results for the SQL client.]
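A sketch of the parallel nested-loops join: only the driving table `t1` is split among workers, and each worker probes `t3` for the rows in its own slice. The dictionary built by `build_index` is an assumption standing in for an index lookup on `t3.id`; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def build_index(t3):
    """Stand-in for an index on t3.id: map id -> list of matching rows."""
    index = {}
    for row in t3:
        index.setdefault(row["id"], []).append(row)
    return index

def join_chunk(chunk, t3_index):
    """Worker step: for each driving row, loop over its matches in t3."""
    out = []
    for outer in chunk:
        for inner in t3_index.get(outer["id"], []):
            out.append((outer, inner))
    return out

def parallel_nl_join(t1, t3, n_workers=4):
    t3_index = build_index(t3)
    step = max(1, -(-len(t1) // n_workers))  # ceiling division
    chunks = [t1[i:i + step] for i in range(0, len(t1), step)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: join_chunk(c, t3_index), chunks))
    # Final merge: concatenate the per-worker join results.
    return [pair for part in parts for pair in part]

t1 = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
t3 = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 3, "b": "r"}]
joined = parallel_nl_join(t1, t3, n_workers=2)
```

The inner table is shared read-only by all workers, which matches the limitation listed later that only the driving table of a nested-loops join is scanned in parallel.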
19. Parallel Query Usage
• To enable parallel execution for a session:
set max_parallel_degree = n
At most n worker threads will be used.
• MySQL may still decide not to use parallel execution. If so, parallel
execution can be forced with
set force_parallel_mode = on
20. Parallel Query Usage: Hint
• To force parallel query execution for a single query:
SELECT /*+ PARALLEL() */ * FROM ...
• To force the use of a specific number of worker threads, n :
SELECT /*+ PARALLEL(n) */ * FROM ...
21. Parallel Query Usage: EXPLAIN
mysql> EXPLAIN SELECT SUM(l_quantity) FROM lineitem where l_returnflag = 'A';
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+----------------------------------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra                                  |
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+----------------------------------------+
|  1 | SIMPLE      | <gather2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 5938499 |    10.00 | NULL                                   |
|  2 | SIMPLE      | lineitem  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  742312 |    10.00 | Parallel scan (8 workers); Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+----------------------------------------+
2 rows in set, 1 warning (0.00 sec)
22. Parallel Query Performance
Parallel Query delivers near-perfect
linear acceleration for DBT3 Query 6:
select sum(l_extendedprice * l_discount) as revenue
from lineitem where l_shipdate >= date '1994-01-01'
and l_shipdate < date '1995-01-01'
and l_discount between 0.06 - 0.01 and 0.06 + 0.01
and l_quantity < 24
Tested at 30, 60, 120, and 240 million rows.
Examples:
89 seconds to 3.4 seconds.
177 seconds to 6.3 seconds.
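The speedup implied by these two examples is simple arithmetic: serial time divided by parallel time. The short sketch below only recomputes it from the figures quoted above; `speedup` is a hypothetical helper, and the deck does not state how many threads were used in these runs.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup factor: how many times faster the parallel run is."""
    return serial_seconds / parallel_seconds

# (serial seconds, parallel seconds) taken from the examples above.
examples = [(89.0, 3.4), (177.0, 6.3)]
for serial, parallel in examples:
    print(f"{serial:g}s -> {parallel:g}s: {speedup(serial, parallel):.1f}x faster")
# prints:
# 89s -> 3.4s: 26.2x faster
# 177s -> 6.3s: 28.1x faster
```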
24. Why do users care about linear scalability?
Users care about
• Business growth. DB must
deliver stable performance as
business grows
• Faster decisions. Faster
analysis driving faster action.
Faster: 85 seconds to 6 seconds.
Stable: 22.6 seconds; 2x data size - 21.6 seconds; 4x data size - 21.6 seconds.
27. Parallel Query – Current Limitations
Parallel Query currently supports only:
• SELECT queries
• Parallel scan on driving table of nested-loops join
• InnoDB
Parallel Query does not currently run the following in parallel:
• JSON
• GIS
• UDFs
• Full text indexes
• Subqueries & CTEs
• Window functions
• WITH ROLLUP
• Procedures
• SELECT … FOR UPDATE etc.
• SERIALIZABLE isolation level
28. Parallel Query – Future Work
• Support [...] plans with nested [...]
• Support subqueries
• Performance optimizations, tuning of existing functionality
• [...] diagnostics support
• Exchange operators [...] gather operations
• Parallel hash join
• Modify query plans to support more efficient parallelization
• A rewritten optimizer that takes parallelization into account
• Distributed query execution