MySQL 5.6 Verification Report

MySQL 5.6: Investigation and Verification of Data Access Algorithms
Hironori Miura
February 23, 2012

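1. Introduction
MySQL 5.6, the next major release, adds four algorithms for scanning table data: Index Condition Pushdown, Multi-Range Read (mrr), Block Nested Loop, and Batched Key Access. This document investigates and verifies these four algorithms and presents examples of where they can be applied in real workloads.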
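2. Test Environment
For this verification, MySQL 5.6.4 was installed from the following RPM packages obtained from the MySQL site.
MySQL-client-5.6.4_m7-1.linux2.6.x86_64.rpm
MySQL-server-5.6.4_m7-1.linux2.6.x86_64.rpm
MySQL-shared-5.6.4_m7-1.linux2.6.x86_64.rpm
After installation, the database instance was configured through my.cnf with the settings below and then started. Parameters that are not listed were left at their default values.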
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
key_buffer = 128M
max_allowed_packet = 16M
table_cache = 2048
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 1M
myisam_sort_buffer_size = 1M
thread_cache_size = 8
query_cache_size = 0
max_connections = 100
thread_concurrency = 8
log_error = /var/lib/mysql/5.6.4_1.err
log-bin = /var/lib/mysql/5.6.4_1-bin
binlog_format = mixed
expire_logs_days = 3
server-id = 1
slow_query_log
slow_query_log_file = /var/lib/mysql/5.6.4_1-slow.log
log_output = FILE
long_query_time = 1
character-set-server = utf8
skip-character-set-client-handshake
old-password
# innodb
innodb_buffer_pool_size = 1024M
innodb_log_file_size = 256M
innodb_log_buffer_size = 16M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 5
innodb_file_format = Barracuda
innodb_flush_method = O_DIRECT
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_file_per_table
skip_innodb_doublewrite

3. Preliminaries
3-1 Target storage engine
The access algorithms examined here are supported by several storage engines, but unless stated otherwise everything in this document assumes InnoDB.
3-2 Controlling the access algorithms
Index Condition Pushdown, Multi-Range Read (mrr), Block Nested Loop, and Batched Key Access are each controlled through the optimizer_switch parameter. It is set with set [global|session] optimizer_switch = expression, and the current setting is checked with show [global|session] variables.
Checking and changing the optimizer_switch value
mysql> show variables like 'optimizer_switch'\G
*************************** 1. row ***************************
Variable_name: optimizer_switch
Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off
1 row in set (0.00 sec)
mysql> set optimizer_switch='index_condition_pushdown=off';
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'optimizer_switch'\G
*************************** 1. row ***************************
Variable_name: optimizer_switch
Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=off,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off
1 row in set (0.00 sec)
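The session output above shows that setting a single flag leaves every other flag in optimizer_switch untouched. The following Python sketch models only that string handling; the helper names parse_switch and apply_switch are hypothetical and the code does not talk to a MySQL server.

```python
from collections import OrderedDict


def parse_switch(value):
    """Split 'a=on,b=off' into an ordered mapping of flag name -> state."""
    flags = OrderedDict()
    for item in value.split(","):
        name, state = item.strip().split("=")
        flags[name] = state
    return flags


def apply_switch(current, update):
    """Return a new value string in which only the flags named in `update` change."""
    flags = parse_switch(current)
    for name, state in parse_switch(update).items():
        if name not in flags:
            raise KeyError("unknown optimizer_switch flag: " + name)
        flags[name] = state
    return ",".join(name + "=" + state for name, state in flags.items())


# Value copied from the first SHOW VARIABLES output above.
current = (
    "index_merge=on,index_merge_union=on,index_merge_sort_union=on,"
    "index_merge_intersection=on,engine_condition_pushdown=on,"
    "index_condition_pushdown=on,mrr=on,mrr_cost_based=on,"
    "block_nested_loop=on,batched_key_access=off"
)
updated = apply_switch(current, "index_condition_pushdown=off")
print(updated)
```

Only index_condition_pushdown changes in the printed value, which matches the second SHOW VARIABLES output in the session.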

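4. Index Condition Pushdown
4-1 Algorithm overview
Index Condition Pushdown (ICP) is an optimization for retrieving table records through an index. In an ordinary index lookup, an index is chosen when the columns named in the search condition (the WHERE clause) match columns contained in the index, and the table records are then accessed through it. With an index built on two or more columns (a multi-column index), the lookup normally uses the index columns, starting from the leading one, for as long as they match the columns named in the condition, and the table record is read afterwards when necessary. Under this behavior there were cases where, depending on the search condition, a column named in the condition was contained in the index and yet the condition was still evaluated against the column in the table record after the index lookup. ICP optimizes exactly this case.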
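As a rough illustration of the difference, the following Python sketch models a secondary index on (col1, col2) as a sorted list of entries that carry the primary key, and counts how many table rows have to be read for a condition of the form "col1 equals a constant and col2 contains a substring". The data, the function names and the row_reads counter are assumptions made for this sketch; it is a toy model, not the server's implementation.

```python
# Toy table keyed by primary key, and a secondary index on (col1, col2).
table = {
    1: {"col1": 1, "col2": "xax"},
    2: {"col1": 1, "col2": "bbb"},
    3: {"col1": 1, "col2": "cac"},
    4: {"col1": 2, "col2": "aaa"},
}
index = sorted((row["col1"], row["col2"], pk) for pk, row in table.items())


def lookup_without_icp(col1, needle):
    """Use only col1 in the index; evaluate the col2 condition on the table row."""
    row_reads, result = 0, []
    for c1, _c2, pk in index:
        if c1 != col1:
            continue
        row = table[pk]              # every col1 match costs a table row read
        row_reads += 1
        if needle in row["col2"]:
            result.append(pk)
    return result, row_reads


def lookup_with_icp(col1, needle):
    """Evaluate the col2 condition on the index entry before touching the table."""
    row_reads, result = 0, []
    for c1, c2, pk in index:
        if c1 != col1 or needle not in c2:
            continue                 # rejected using the index entry alone
        _row = table[pk]             # only qualifying entries cost a row read
        row_reads += 1
        result.append(pk)
    return result, row_reads


print(lookup_without_icp(1, "a"))
print(lookup_with_icp(1, "a"))
```

Both functions return the same primary keys; only the number of table row reads differs, which is the saving ICP aims for.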
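Consider the following case, assuming that index IDX1 (col1, col2) has been created.
Select * From hoge
Where col1 = 1
And col2 like '%a%'
A conventional index lookup finishes the index search using only the value of col1, and then decides whether each row qualifies by checking the col2 condition against the col2 value in the table record.
With ICP, the col2 condition is evaluated against the index column as well, not only col1.
4-2 Test data for performance verification
The following data was used to compare performance with and without ICP.
Total number of rows: 1,048,576
File size: about 200 Mbyte
Table definition
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`other_id` int(11) NOT NULL,
`col1` int(11) DEFAULT NULL,
`col2` int(11) DEFAULT NULL, -- missing from the source listing; assumed from KEY `col2` and key_len 10 in section 5
`col3` varchar(15) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `other_id` (`other_id`),
KEY `col2` (`col2`),
KEY `multi_idx1` (`col1`,`col2`,`col3`)
) ENGINE=InnoDB AUTO_INCREMENT=1114104 DEFAULT CHARSET=utf8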

other_id is numbered from 1 to 720874 (with gaps), with at most two rows per value (524288 values).
col1 is distributed as follows.
Value Rows
1 2
2 38
3 342
4 1938
5 7752
6 23256
7 54264
8 100776
9 151164
10 184756
11 184756
12 151164
13 100776
14 54264
15 23256
16 7752
17 1938
18 342
19 38
20 2
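The col1 distribution above can be cross-checked against the stated total of 1,048,576 rows with a few lines of Python. The counts are copied from the table; the observation that each count equals twice a binomial coefficient is not stated in the source and is included only as a consistency check.

```python
from math import comb

# col1 value -> row count, copied from the distribution table above.
col1_counts = {
    1: 2, 2: 38, 3: 342, 4: 1938, 5: 7752, 6: 23256, 7: 54264,
    8: 100776, 9: 151164, 10: 184756, 11: 184756, 12: 151164,
    13: 100776, 14: 54264, 15: 23256, 16: 7752, 17: 1938, 18: 342,
    19: 38, 20: 2,
}

total = sum(col1_counts.values())
print(total)

# Share of rows matched by col1 = 8, the population used in section 4-3.
share_col1_8 = col1_counts[8] / total
print(round(share_col1_8, 4))

# Observation, not from the source: every count is 2 * C(19, value - 1).
binomial_shape = all(n == 2 * comb(19, v - 1) for v, n in col1_counts.items())
print(binomial_shape)
```

The sum agrees with the total row count given in 4-2, and col1 = 8 selects a little under 10% of the table.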

col2 is distributed as follows.
Value Rows
0 105052
110 4
120 30
130 190
140 777
150 2264
160 5501
170 10287
180 15156
190 18646
200 18600
210 15004
220 10121
230 5464
240 2343
250 802
260 396
270 33
280 792
300 2282
320 5477
330 4
340 10139
360 15369
380 18403
390 194
400 18539
420 15890
440 10259
450 2396
460 5382
480 7681
500 747
510 9928
520 361
540 15148
Value Rows
550 6
560 784
570 18667
600 20743
630 15053
640 5433
650 197
660 10062
680 10099
690 5350
700 772
720 17538
750 3112
760 18534
770 7
780 356
800 23997
810 24
840 15774
850 10146
870 1
880 10052
900 17532
910 208
920 5438
950 18407
960 7756
980 774
990 3
1000 19248
1020 9753
1040 373
1050 17400
1080 15132
1100 10003
1120 6093
Value Rows
1040 18372
1050 5311
1080 195
1100 10127
1120 23002
1140 763
1150 31153
1170 5423
1190 174
1200 9880
1250 18441
1260 2351
1280 9885
1300 5310
1320 18350
1330 22655
1350 1
1360 15207
1380 767
1400 18369
1440 10198
1450 10156
1470 209
1500 18529
1520 5392
1530 15032
1540 17652
1560 18487
1600 789
1610 9982
1620 18599
1680 211
1710 5488
1750 15258
1760 2301
1800 4
Value Rows
1820 211
1840 5488
1890 15258
1920 2301
1960 4
1980 10062
2000 795
2070 5536
2080 195
2160 2467
2240 2
2250 769
2340 203
2430 32
2520 4

For col3, the following SQL was used to create 38098 distinct five-character combinations of the letters a through j.
update test set col3 = 'aaaaa' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'bbbbb' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'ccccc' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'ddddd' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'eeeee' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'fffff' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'ggggg' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'hhhhh' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'iiiii' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = 'jjjjj' where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'a'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'b'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'c'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'d'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'e'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'f'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'g'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'h'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'i'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat(concat (substr(col3,1,1),'j'),substr(col3,3,3)) where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'a') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'b') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'c') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'d') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'e') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'f') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'g') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'h') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'i') where mod(id,10) = truncate(rand()*10,0);
update test set col3 = concat (substr(col3,1,4),'j') where mod(id,10) = truncate(rand()*10,0);

4-3 Performance verification
The following SQL was run with and without ICP, and the results were compared. The "?" in the col3 pattern was replaced with arbitrary strings, starting from a. From the data distribution above, the number of rows matching col1=8 (the population for this test) is 100776.
select * from test where col1 = 8 and col3 like '%?%';
The execution plans produced for the above SQL with and without ICP are in all cases as follows.
ICP
id: 1
select_type: SIMPLE
table: test
type: ref
possible_keys: multi_idx1
key: multi_idx1
key_len: 5
ref: const
rows: 244860
Extra: Using index condition
Non-ICP
id: 1
select_type: SIMPLE
table: test
type: ref
possible_keys: multi_idx1
key: multi_idx1
key_len: 5
ref: const
rows: 244860
Extra: Using where
Before each SQL execution, the InnoDB buffer cache and the OS page cache were cleared with the following commands, so that each statement ran with no cached data.
# service mysql restart
# sync
# sysctl -w vm.drop_caches=3
# sysctl -w vm.drop_caches=0

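The results of running the SQL are as follows. Four measurements were taken with strings starting with a, and likewise four each for b, c, and d, for 16 runs in total.
Value of ?   ICP                   Non-ICP
             Rows   Time (sec)     Rows   Time (sec)
a            11426  17.33          11426  6.47
ab           1103   5.90           1103   6.61
abc          50     1.18           50     6.48
abcd         2      0.93           2      6.54
b            16050  18.61          16050  6.53
bc           1406   6.78           1406   6.44
bcd          54     1.25           54     6.44
bcde         empty  0.86           empty  6.44
c            17458  18.46          17458  6.46
cd           1204   6.30           1204   6.45
cde          49     1.17           49     6.50
cdef         1      0.83           1      6.47
d            15037  18.47          15037  6.49
de           1439   6.92           1439   6.53
def          78     1.35           78     6.42
defg         3      0.86           3      6.57
From these results, with ICP the search time falls as the number of matching rows falls. Without ICP, the search time does not follow the number of matching rows and stays roughly constant. When many rows match, non-ICP is faster than ICP; the fewer rows match, the more ICP comes out ahead.
4-4 Hypothesis
We hypothesized that adding further columns to the current table, and thereby increasing the number of table-record pages reached from the index, would create conditions better suited to ICP. To that end, four columns col4, col5, col6, and col7 (all varchar(255)) were added and filled with 255 characters made up of the digits 0 through 9. OPTIMIZE was then run. The data distribution and row count did not change, but the file size grew from about 200 Mbyte to 1.7 Gbyte. The verification in 4-3 was then repeated in this state.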

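The results of running the SQL are as follows. As in 4-3, four measurements were taken with strings starting with a, and likewise four each for b, c, and d, for 16 runs in total.
Value of ?   ICP                   Non-ICP
             Rows   Time (sec)     Rows   Time (sec)
a            11426  63.56          11426  90.42
ab           1103   9.46           1103   91.01
abc          50     1.77           50     90.60
abcd         2      1.15           2      90.69
b            16050  82.59          16050  90.61
bc           1406   11.92          1406   92.99
bcd          54     1.82           54     90.66
bcde         empty  1.07           empty  93.47
c            17458  87.08          17458  92.84
cd           1204   10.18          1204   91.11
cde          49     1.82           49     90.66
cdef         1      1.05           1      93.27
d            15037  77.91          15037  90.82
de           1439   11.78          1439   95.78
def          78     2.02           78     90.55
defg         3      1.11           3      93.95
As in 4-3, with ICP the search time again fell as the number of matching rows fell. In some steps, however, the drop was more extreme than in 4-3 (a→ab, b→bc, c→cd, d→de). Without ICP, the search time again stayed constant regardless of the number of matching rows, as in 4-3. This time ICP was faster than non-ICP even in the cases where many rows matched.
4-5 Summary and application to production workloads
We consider this a very effective feature. Table data pages that might have turned out to be unnecessary for the result were previously loaded into the buffer cache; ICP reduces this by confining the work to index pages, which improves memory efficiency across the whole system and improves search performance along with it. It should be applicable to any workload with a wide variety of search requirements. For SQL that is called frequently in OLTP workloads, however, the existing COVERING INDEX strategy should continue to be used. ICP fits SQL that is infrequent in OLTP but must still achieve a certain response time, so that it does not degrade the response of the frequently called SQL. Index design from now on will require selecting columns with an understanding of this algorithm.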
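The crossover described in 4-3 and 4-4 can be checked directly from the measured times. The sketch below copies the rows for the a-series patterns from the two result tables and computes the ratio of non-ICP time to ICP time; the function name speedups is hypothetical, and a ratio above 1 means ICP was faster.

```python
# pattern -> (rows matched, ICP seconds, non-ICP seconds), copied from the tables.
results_4_3 = {
    "a": (11426, 17.33, 6.47),
    "ab": (1103, 5.90, 6.61),
    "abc": (50, 1.18, 6.48),
    "abcd": (2, 0.93, 6.54),
}
results_4_4 = {
    "a": (11426, 63.56, 90.42),
    "ab": (1103, 9.46, 91.01),
    "abc": (50, 1.77, 90.60),
    "abcd": (2, 1.15, 90.69),
}


def speedups(results):
    """Map each pattern to non-ICP time divided by ICP time."""
    return {
        pattern: non_icp / icp
        for pattern, (_rows, icp, non_icp) in results.items()
    }


for label, results in (("4-3", results_4_3), ("4-4", results_4_4)):
    for pattern, ratio in speedups(results).items():
        print(label, pattern, results[pattern][0], round(ratio, 2))
```

With the 200 Mbyte table (4-3) the ratio is below 1 only for the least selective pattern; with the 1.7 Gbyte table (4-4) it is above 1 for every pattern listed here.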

4-6 Aside
To investigate why ICP was slower than non-ICP in the 4-3 results, a profile of the SQL (col3 like '%a%') was taken.
ICP profile result
+----------------------+-----------+-----------------------+---------------+-------------+
| Status | Duration | Source_function | Source_file | Source_line |
+----------------------+-----------+-----------------------+---------------+-------------+
| starting | 0.003464 | NULL | NULL | NULL |
| checking permissions | 0.000025 | check_access | sql_parse.cc | 4939 |
| Opening tables | 0.000026 | open_tables | sql_base.cc | 4974 |
| System lock | 0.000012 | mysql_lock_tables | lock.cc | 304 |
| init | 0.000902 | mysql_select | sql_select.cc | 3963 |
| optimizing | 0.000379 | optimize | sql_select.cc | 2058 |
| statistics | 0.040249 | optimize | sql_select.cc | 2278 |
| preparing | 0.000047 | optimize | sql_select.cc | 2302 |
| executing | 0.000005 | exec | sql_select.cc | 3074 |
| Sending data | 17.216521 | execute | sql_select.cc | 3664 |
| end | 0.000012 | mysql_select | sql_select.cc | 3993 |
| query end | 0.000006 | mysql_execute_command | sql_parse.cc | 4657 |
| closing tables | 0.000015 | mysql_execute_command | sql_parse.cc | 4705 |
| freeing items | 0.000029 | mysql_parse | sql_parse.cc | 5881 |
| logging slow query | 0.026895 | log_slow_statement | sql_parse.cc | 1665 |
| cleaning up | 0.000007 | dispatch_command | sql_parse.cc | 1606 |
+----------------------+-----------+-----------------------+---------------+-------------+
Non-ICP profile result
+----------------------+----------+-----------------------+---------------+-------------+
| Status | Duration | Source_function | Source_file | Source_line |
+----------------------+----------+-----------------------+---------------+-------------+
| starting | 0.031773 | NULL | NULL | NULL |
| checking permissions | 0.000018 | check_access | sql_parse.cc | 4939 |
| Opening tables | 0.000032 | open_tables | sql_base.cc | 4974 |
| System lock | 0.000014 | mysql_lock_tables | lock.cc | 304 |
| init | 0.008593 | mysql_select | sql_select.cc | 3963 |
| optimizing | 0.000953 | optimize | sql_select.cc | 2058 |
| statistics | 0.056910 | optimize | sql_select.cc | 2278 |
| preparing | 0.000031 | optimize | sql_select.cc | 2302 |
| executing | 0.000005 | exec | sql_select.cc | 3074 |
| Sending data | 8.583325 | execute | sql_select.cc | 3664 |
| end | 0.000012 | mysql_select | sql_select.cc | 3993 |
| query end | 0.000006 | mysql_execute_command | sql_parse.cc | 4657 |
| closing tables | 0.000015 | mysql_execute_command | sql_parse.cc | 4705 |
| freeing items | 0.000033 | mysql_parse | sql_parse.cc | 5881 |
| logging slow query | 0.026897 | log_slow_statement | sql_parse.cc | 1665 |
| cleaning up | 0.000007 | dispatch_command | sql_parse.cc | 1606 |
+----------------------+----------+-----------------------+---------------+-------------+
In both cases the time is spent in the Sending data stage (identified from the info message at line 3664 of sql_select.cc); that much was established. We intend to continue analyzing sql_select.cc.
Excerpt around line 3664 of sql_select.cc (where the information log is emitted)
3655 /* XXX: When can we have here thd->is_error() not zero? */
3656 if (thd->is_error())
3657 {
3658 error= thd->is_error();
3659 DBUG_VOID_RETURN;
3660 }
3661 having= tmp_having;
3662 fields= curr_fields_list;
3663
3664 THD_STAGE_INFO(thd, stage_sending_data);
3665 DBUG_PRINT("info", ("%s", thd->proc_info));
3666 result->send_result_set_metadata((procedure ? procedure_fields_list :
3667 *curr_fields_list),
3668 Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF);
3669 error= do_select(this, curr_fields_list, NULL, procedure);
3670 thd->limit_found_rows= send_records;

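5. Multi-Range Read (mrr)
Multi-Range Read (mrr) optimizes which pages (disk) are touched when table data is accessed through a secondary index. When a secondary index lookup yields multiple records to fetch, the order of the secondary index has no relation to the order of the table records, so the pages (disk) are accessed in no particular order. MRR sorts the records to be fetched into their storage order (by PK for InnoDB, by rowid for MyISAM) and accesses the table records in that sorted order, which improves page (disk) access efficiency. Whether the optimizer selects this algorithm depends, with mrr=on, on mrr_cost_based and read_rnd_buffer_size. In this verification, mrr=on and mrr_cost_based=off were set so that mrr would be selected.
The performance test data reuses the data created in 4-2 and 4-4.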
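The effect of sorting before fetching can be sketched with a toy model in Python. Rows are assumed to be stored in pages in primary-key order, and a change of page between two consecutive fetches is used as a stand-in for a random I/O. ROWS_PER_PAGE, the batch size buffer_rows and the random key sample are all assumptions of this sketch, not measurements from the verification.

```python
import random

ROWS_PER_PAGE = 100  # hypothetical number of rows stored per page


def page_of(pk):
    """Page number holding the row, assuming rows are stored in PK order."""
    return pk // ROWS_PER_PAGE


def count_page_switches(pks):
    """Count how often consecutive fetches land on a different page."""
    switches, last_page = 0, None
    for pk in pks:
        page = page_of(pk)
        if page != last_page:
            switches += 1
            last_page = page
    return switches


def mrr_order(pks, buffer_rows):
    """Collect keys in batches of `buffer_rows` and sort each batch by PK."""
    ordered = []
    for start in range(0, len(pks), buffer_rows):
        ordered.extend(sorted(pks[start:start + buffer_rows]))
    return ordered


random.seed(0)
# Primary keys in the order a secondary index scan would return them.
pks_in_index_order = random.sample(range(100_000), 5_000)

plain_switches = count_page_switches(pks_in_index_order)
mrr_switches = count_page_switches(mrr_order(pks_in_index_order, buffer_rows=1_000))
print(plain_switches, mrr_switches)
```

The same rows are fetched either way; only the order changes, and with it the number of page switches. A larger batch lets more keys that share a page be fetched together, which is the role the buffer plays in this model.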
5-3 Performance verification
Before the performance verification, the influence of read_rnd_buffer_size was checked. In short, we could not confirm that changing read_rnd_buffer_size alters whether the optimizer selects mrr. read_rnd_buffer_size is therefore kept at the initially configured 1 Mbyte. As in 4-3, the InnoDB buffer cache and the OS page cache were cleared before each SQL execution, so that each statement ran with no cached data.
With read_rnd_buffer_size at 1 Mbyte, the following SQL was checked while decrementing "?" from 14.
select * from test where col1 = ? and col2 > 100 and col2 < 3500;
Value of ?   explain rows   explain extra
14           88646          Using index condition; Using MRR
13           204954         Using index condition; Using MRR
12           278784         Using index condition
Based on these results, "?" was fixed at 12 and read_rnd_buffer_size was increased.
Size(Mbyte) explain rows explain extra

Given the above, using read_rnd_buffer_size to make the optimizer select mrr was not pursued in this verification.
The following SQL was run in four configurations and the results were compared: mrr with index condition pushdown (case a), mrr without it (case b), non-mrr with index condition pushdown (case c), and non-mrr without it (case d). The "?" for col2 was set to values decreasing from 2500 in steps of 100. From the data distribution above, the number of rows matching col1=13 (the population for this test) is 100776.
select * from test where col1 = 13 and col2 > 100 and col2 < ?;
The execution plans for (a), (b), (c), and (d) are as follows.
(a) Using index condition; Using MRR
id: 1
select_type: SIMPLE
table: test
type: range
possible_keys: col2,multi_idx1
key: multi_idx1
key_len: 10
ref: NULL
rows: 204954
Extra: Using index condition; Using MRR
(b) Using where; Using MRR
id: 1
select_type: SIMPLE
table: test
type: range
key: multi_idx1
key_len: 10
ref: NULL
rows: 204954
Extra: Using where; Using MRR
(c) Using index condition
id: 1
select_type: SIMPLE
table: test
type: range
key: multi_idx1
key_len: 10
ref: NULL
rows: 204954
Extra: Using index condition
(d) Using where
id: 1
select_type: SIMPLE
table: test
type: range
key: multi_idx1
key_len: 10
ref: NULL
rows: 204954
Extra: Using where
The number of rows returned and the search times (sec) for the four patterns are as follows.
col2   Rows    (a)     (b)     (c)      (d)
2500   90566   16.44   16.23   174.14   169.53
2400   90566   16.52   16.39   172.37   169.44
2300   90566   16.44   16.46   172.08   172.32
2200   90566   16.45   16.39   172.81   172.04
2100   90566   16.35   16.38   169.71   171.85
2000   90566   17.08   16.26   169.87   169.18
1900   80504   16.54   16.19   164.61   162.19
1800   80504   16.54   16.04   164.74   162.82
1700   70522   15.94   16.98   156.12   153.33
1600   70522   15.86   15.80   156.64   156.00
1500   60366   15.37   15.43   143.58   143.61
1400   60366   15.34   15.41   145.82   143.38
1300   50486   14.87   14.76   131.27   131.99
1200   50486   14.78   14.85   131.31   131.99
1100   40483   15.69   14.09   119.63   122.69
1000   40483   14.18   16.11   117.21   117.09
900    40483   14.18   14.33   117.35   122.42
800    30435   13.28   13.55   100.81   99.02
700    30435   13.39   13.32   98.66    100.19
600    20375   12.50   12.40   77.90    74.54
500    20375   12.55   12.52   74.67    74.70
400    10117   11.29   11.21   42.03    44.12
300    10117   11.20   11.23   42.01    42.12
200    Empty   0.16    0.16    0.18     0.15
In this verification there was a large difference in search time between mrr and non-mrr, and mrr was the faster of the two. In all four patterns the search time also decreased, gradually, as the number of rows returned decreased.
Further verification is needed, but at this point no disadvantage of mrr was found and its benefit was visible in these tests, so enabling mrr is worthwhile, also in connection with BKA described later. There is no way to steer the optimizer toward mrr when writing SQL, but the instance can be configured with mrr_cost_based=off so that the optimizer selects mrr.
6. Block Nested Loop (BNL)
In MySQL 5.6 the Block Nested Loop (BNL) algorithm is implemented as support for outer join operations. When the join column of the inner table referenced from the driving table has no index, rows from the driving table are first cached, and once the cache is full the referenced inner table is searched. The number of repetitions is determined by the size of the cache (join_buffer_size).
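The buffering described above can be sketched in Python as a left outer join in which the inner table is scanned once per buffered block of driving rows rather than once per driving row. The function name, the buffer_rows parameter (standing in for join_buffer_size, which is really a byte size) and the sample data are assumptions of this sketch.

```python
def block_nested_loop_left_join(outer_rows, inner_rows, key, buffer_rows):
    """Left outer join; the inner table is fully scanned once per buffered block."""
    result, inner_scans = [], 0
    for start in range(0, len(outer_rows), buffer_rows):
        block = outer_rows[start:start + buffer_rows]   # fill the join buffer
        matched = [False] * len(block)
        inner_scans += 1
        for inner in inner_rows:                        # one scan for the whole block
            for i, outer in enumerate(block):
                if outer[key] == inner[key]:
                    result.append((outer, inner))
                    matched[i] = True
        # Outer join: driving rows without a match are kept with no inner row.
        result.extend((outer, None) for outer, hit in zip(block, matched) if not hit)
    return result, inner_scans


# Hypothetical sample data: 10 driving rows, 3 inner rows, no index on col1.
driving = [{"id": i, "col1": i % 5} for i in range(10)]
inner_table = [{"col1": 0, "v": "x"}, {"col1": 1, "v": "y"}, {"col1": 2, "v": "z"}]

joined, scans = block_nested_loop_left_join(driving, inner_table, "col1", buffer_rows=4)
print(len(joined), scans)
```

With a buffer of 4 rows the 10 driving rows need 3 scans of the inner table, where a row-at-a-time nested loop would need 10.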
The performance test data reuses the data created in 4-2 and 4-4, and the table outer_tb below was created as the table for the outer join.
Table definition
CREATE TABLE `outer_tb` (
PRIMARY KEY (`id`)
Table data
+----+------+--------+
| id | col1 | col2 |
+----+------+--------+
| 1 | 1 | 2 |
| 2 | 2 | 38 |
| 3 | 3 | 342 |
| 4 | 4 | 1938 |
| 5 | 5 | 7752 |
| 6 | 6 | 23256 |
| 7 | 7 | 54264 |
| 8 | 8 | 100776 |
| 9 | 9 | 151164 |
| 10 | 10 | 184756 |
| 11 | 11 | 184756 |
| 12 | 12 | 151164 |
| 13 | 13 | 100776 |
| 14 | 14 | 54264 |
| 15 | 15 | 23256 |
| 16 | 16 | 7752 |
| 17 | 17 | 1938 |
| 18 | 18 | 342 |
| 19 | 19 | 38 |
| 20 | 20 | 2 |
+----+------+--------+
6-3 Performance verification
Before the performance verification, the effect of increasing and decreasing join_buffer_size was checked. In short, no performance difference attributable to the value of join_buffer_size could be confirmed. The default of 131072 byte is therefore used for comparing Block Nested Loop with non-Block Nested Loop. All tests here used the following SQL. As in 4-3, the InnoDB buffer cache and the OS page cache were cleared before each SQL execution, so that each statement ran with no cached data.
select * from test a left outer join outer_tb b on (a.col1 = b.col1);
join_buffer_size results with Block Nested Loop
join_buffer_size     Search time
1024 byte            22.03
1024*1024 byte       19.32
8*1024*1024 byte     28.00
Execution plan with Block Nested Loop
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1048595
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20
Extra: Using where; Using join buffer (Block Nested Loop)
Execution plan without Block Nested Loop
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1048595
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 20
Extra: Using where

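Performance results
        Block Nested Loop   Non-Block Nested Loop
Run 1   17.34               16.51
Run 2   17.39               16.45
Run 3   17.74               15.90
Non-Block Nested Loop was faster, though only slightly. This may be an effect of the data distribution used here, so verification will continue.
We do not expect a large benefit in production workloads. It is also important to avoid, as far as possible, writing SQL that relies on this algorithm. The reason is that its use cases are basically unchanged from Using Join Buffer, the access algorithm that was used before the Block Nested Loop implementation when the inner table of a Nested Loop Join had no index. In production workloads, not causing a Full Scan is the best policy: it avoids caching unnecessary data, keeps cache efficiency high, and prevents a momentary Full Scan from increasing disk I/O and unexpectedly degrading the performance of other SQL.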
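7. Batched Key Access (BKA)
Batched Key Access (BKA) is an algorithm for join operations: when the inner table is accessed from the driving table, a certain number of rows from the driving table are buffered, and the buffered rows are then used to access the inner table through mrr. For the optimizer to select BKA, mrr must therefore be enabled (optimizer_switch='mrr=on,mrr_cost_based=off'). BKA is not enabled in MySQL's default configuration, so in addition to enabling mrr as described above, BKA itself must be enabled (optimizer_switch='batched_key_access=on'). As with BNL, the buffer size is controlled by join_buffer_size.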
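The following Python sketch shows the shape of the batching described above: driving rows are buffered, their join keys are collected and handed over in sorted order, and the inner table is probed through an index once per distinct key in the batch. Sorting the join keys stands in for the mrr step, which in the server orders the row accesses; this is a simplification, and the function names, buffer_rows and the sample parent and child rows are assumptions of this sketch.

```python
from collections import defaultdict


def build_index(rows, key):
    """Hypothetical stand-in for an index on the inner table's join column."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def batched_key_access_join(driving_rows, inner_index, driving_key, buffer_rows):
    """Inner join that probes the inner index once per distinct key per batch."""
    result, batches = [], 0
    for start in range(0, len(driving_rows), buffer_rows):
        block = driving_rows[start:start + buffer_rows]  # fill the join buffer
        batches += 1
        by_key = defaultdict(list)
        for row in block:
            by_key[row[driving_key]].append(row)
        for k in sorted(by_key):                         # keys passed on in sorted order
            for inner in inner_index.get(k, []):
                for outer in by_key[k]:
                    result.append((outer, inner))
    return result, batches


# Hypothetical parent rows (driving side) and child rows (inner side).
parents = [{"id": i, "col1": i % 3} for i in range(1, 7)]
children = [{"id": 100 + i, "other_id": (i % 6) + 1} for i in range(12)]

child_index = build_index(children, "other_id")
selected_parents = [p for p in parents if p["col1"] == 1]
joined_rows, batch_count = batched_key_access_join(
    selected_parents, child_index, "id", buffer_rows=1
)
print(len(joined_rows), batch_count)
```

A larger buffer_rows means fewer batches and more keys sorted together, which is the lever that join_buffer_size provides in this model.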
The performance test data reuses the data created in 4-2 and 4-4, and the table other_tb below was created as the table to join. other_tb and test are structured like a typical order and order-detail pair, with other_tb.id and test.other_id in a parent-child relationship. No constraint is defined for it, however. other_tb has 524288 rows.
Table definition
CREATE TABLE `other_tb` (
`col2` varchar(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `col1` (`col1`),
KEY `col2` (`col2`)
7-3 Performance verification
BKA and non-BKA performance was compared using the following SQL, with "?" set to 8 through 11. As in 4-3, the InnoDB buffer cache and the OS page cache were cleared before each SQL execution, so that each statement ran with no cached data.
select * from test_3 a join test b on (a.id = b.other_id) where a.col1 =?;
Execution plan with BKA
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: PRIMARY,col1
key: col1
key_len: 5
ref: const
rows: 92946
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: other_id
key: other_id
key_len: 4
ref: test.a.id
rows: 1
Extra: Using join buffer (Batched Key Access)

Execution plan without BKA
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: PRIMARY,col1
key: col1
key_len: 5
ref: const
rows: 92946
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: other_id
key: other_id
key_len: 4
ref: test.a.id
rows: 1
Extra:
The number of rows returned and the search times for the above SQL with and without BKA were as follows.
col1   Rows     BKA     Non-BKA
8      100766   18.80   21.15
9      151164   21.20   23.68
10     184756   22.63   24.24
11     184756   22.50   24.98
The results show BKA to be slightly faster. We then checked, with BKA and col1=11, whether the search time varies with join_buffer_size.
join_buffer_size         Search time
131072 byte (default)    22.50
1024*1024 byte           20.80
8*1024*1024 byte         20.56
16*1024*1024 byte        20.64
32*1024*1024 byte        20.45
Increasing join_buffer_size had an effect, but at 1024*1024 (1 Mbyte) and above the effect was the same.
Unlike BNL, BKA is effective when the join accesses the inner table through an index, and in this verification it gave better search times than the ordinary Nested Loop Join, so we consider enabling BKA worthwhile. In production workloads, a sufficient benefit can be expected for log tables in a parent-child relationship like the scenario tested here. Further verification of an appropriate join_buffer_size (which can be set per session) should help maximize the benefit of BKA.
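8. Conclusion
None of the four new data access algorithms examined here can be steered at the level of how SQL is written; they are algorithms that depend on the configuration of the database system. Nevertheless, verifying their characteristics and the situations in which they apply makes it possible to compensate for weaknesses of the existing data access algorithms, and should lead to better ways of writing SQL. We intend to continue investigating these algorithms, including revisiting the small number of test runs, the distribution of the test data, and the test scenarios.
9. References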
MySQL5.6.4 RPM download,Source Code download
http://dev.mysql.com/downloads/mysql/#downloads
MySQL5.6 Reference Manual 7.13.4 Index Condition Pushdown
http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html
MySQL5.6 Reference Manual 7.13.10 Multi-Range Read Optimization
http://dev.mysql.com/doc/refman/5.6/en/mrr-optimization.html
MySQL5.6 Reference Manual 7.13.11.2 Block Nested-Loop Algorithm for Outer Joins
http://dev.mysql.com/doc/refman/5.6/en/bnl-optimization.html
MySQL5.6 Reference Manual 7.13.11.3 Batched Key Access Joins
http://dev.mysql.com/doc/refman/5.6/en/bka-optimization.html
