FOSDEM 2012




                               MariaDB 5.3 query optimizer
                                 Taking the dolphin to where
                                 he's never been before



                                                                    Sergey Petrunya
                                                                    MariaDB


Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What is MySQL currently

 MySQL is
 ● “World's most popular database”

 ● Used for

    ● Websites

    ● OLTP applications




 at the same time
    ● “a toy database”

    ● “cannot handle complex queries”




                                                                    2   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
When does one use MySQL

 Web apps
 ● Mostly reads


 ● Point or narrow-range select queries:

           ● SELECT * FROM web_pages WHERE key=...
           ● SELECT * FROM email_archive WHERE date BETWEEN ...


           ● SELECT * FROM forum_posts

             WHERE thread_id=34213 ORDER BY post_date DESC LIMIT 10

 OLTP (Online Transaction Processing) applications
 ● Same as above but more writes and ACID requirements

           ● SELECT balance FROM users WHERE user_id=...
           ● UPDATE user_id




                                                                    3   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
When does one not use MySQL

  Decision support / analytics / reporting
  ● Database is larger

            ●   think “current state” data → “full history” data

  ●    Queries shuffle through more data
            ●   “get data for order X” → “get biggest orders in the last month”

  ●    Queries are frequently complex
            ● “get last N posts” → “get distribution of posts by time of the day”
            ● “get orders for item X made today” → “which fraction of those who

              ordered item X also ordered item Y” ?




                                                                    4             14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What exactly is not working

Reasons MySQL is poor at decision support/analytics
●   Large datasets
         ●   Reading lots of records from disk requires special disk access
             strategies

●   Complex queries
         ●   Insufficient subquery optimizations
                ● On many levels

         ●   Insufficient support for big joins




                                                                    5         14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What exactly isn't working?

Let's try running an analytics-type query
● Take DBT-3 (ad-hoc, decision-support benchmark)


● Load the data for scale=30 (75 GB)

● And try some query:

   “average price of item ordered in a certain month”
  select avg(lineitem.l_extendedprice)
  from orders, lineitem
  where
    lineitem.l_orderkey=orders.o_orderkey and
    orders.o_orderdate between date '1992-07-01' and
                               date '1992-07-31';




                                                                    6   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What exactly isn't working? (2)
     select avg(lineitem.l_extendedprice)
     from orders, lineitem
     where
       lineitem.l_orderkey=orders.o_orderkey and
       orders.o_orderdate between date '1992-07-01' and
                                  date '1992-07-31';

id      select_type       table           type         possible_keys           key          key_len   ref           rows      Extra
1       SIMPLE            orders          range        PRIMARY,                i_o_orderdate 4        NULL          1165090   Using where;
                                                       i_o_orderdate                                                          Using index
1       SIMPLE            lineitem ref                 PRIMARY,                PRIMARY      4         orders.o_or   1
                                                       i_l_orderkey,                                  derkey
                                                       i_l_orderkey_quantity


     ●    Query time: 45 min
               ●   Why?
     ●    Let's explore

                                                                                                               7              14:16:02
     Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What is the problem?
●   Check “iostat -x”
avg­cpu:  %user   %nice %system %iowait  %steal   %idle
           2.01    0.00    2.26   23.62    0.00   72.11

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
sda               0.00     0.00  229.00    0.00  3952.00     0.00




                                                                    avgrq­sz avgqu­sz   await r_await w_await  svctm  %util
                                                                       34.52     0.95    4.15    4.15    0.00   4.15  95.00


●   IO-bound load
●   229 reqests/sec
●   4.15 msec average
●   It's a 7200 RPM hdd, which gives 8 msec disk seek.
●   Not random disk seeks but close.

                                                                                                       8           14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What is the problem (2)

  ●   Check “SHOW ENGINE INNODB STATUS”
    ...
    --------
    FILE I/O
    --------
    ...

    1 pending preads, 0 pending pwrites
    206.36 reads/s, 16384 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s


●     Possible solutions:
        ● Get more RAM

        ● Get an SSD


●     These are ok to speedup OLTP workloads
●     Speeding up analytics this way is going to be costly!

                                                                    9    14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
MySQL/MariaDB solution

 Improved disk access strategies
 ● Multi-Range Read


 ● Batched Key Access




Multi Range Read
● Access table records in disk order (MySQL, MariaDB)


● Enumerate index entries in index order (MariaDB)




Batched Key Access
● Group ref/eq_ref accesses together


● Submit them to storage engine (e.g. InnoDB) as batch,

         ●   So that Multi Range Read can do its job

                                                                    10   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Let's try the query with MRR/BKA

 ●   Enable MRR/BKA
       set        optimizer_switch='mrr=on,mrr_sort_keys=on';
       set        join_cache_level=6;
       set        join_buffer_size=1024*1024*32;
       set        join_buffer_space_limit=1024*1024*32;

●    Re-run the query
       select avg(lineitem.l_extendedprice)
       from orders, lineitem
       where
         lineitem.l_orderkey=orders.o_orderkey and
         orders.o_orderdate between date '1992-07-01' and
                                    date '1992-07-31';
●    Query time: 3 min 48 sec
          ●   Was: 45 min, 11.8x speedup

                                                                    11   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Let's try the query with MRR/BKA
 ●   Explain is almost as before
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orders
         type: range
possible_keys: PRIMARY,i_o_orderdate
          key: i_o_orderdate
      key_len: 4
          ref: NULL
         rows: 1165090
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: lineitem
         type: ref
possible_keys: PRIMARY,i_l_orderkey,i_l_orderkey_quantity
          key: i_l_orderkey
      key_len: 4
          ref: dbt3sf30.orders.o_orderkey
         rows: 1
        Extra: Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan
2 rows in set (0.00 sec)

                                                                    12        14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Check what is doing
●   Check “iostat -x”
avg-cpu:           %user            %nice %system %iowait                   %steal   %idle
                   15.13             0.00    2.82    9.74                     0.00   72.31

Device:                         rrqm/s             wrqm/s     r/s              w/s    rkB/s      wkB/s avgrq-sz avgqu-sz
sda                               0.00               0.00 1936.00             0.00 88112.00       0.00    91.02     0.45




                                                                    await r_await w_await    svctm   %util
                                                                     0.23    0.23    0.00     0.23   44.20


●   Got some CPU load
●   svctm down to 0.23 ms (random seeks are 8ms)
●   SHOW ENGINE INNODB STATUS
 ...
 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 10450.55 reads/s




                                                                                                         13         14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Use systemtap to look at io patterns

bash# stap deviceseeks.stp -c "sleep 60"

                               Regular                              Batched Key Access




                                                                            14      14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Data from a bigger benchmark

 ● 4 GB RAM box
 ● DBT-3 scale=10 (30GB dataset)




                                                                      Before       After
                                                         avg         3.24 hrs    5.91 min
                                                    median           0.32 hrs    4.48 min
                                                        max         25.97 hrs*   22.92 min


 ● Can do joins we couldn't before!
 ● NOTE: not with default settings

           ●   Will publish special “big” configuration on http://kb.askmonty.org

                                                                                             15   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
What exactly is not working

Reasons MySQL is poor at decision support/analytics
●   Large datasets
         ●   Reading lots of records from disk requires special disk access
             strategies

●   Complex queries
         ●   Insufficient subquery optimizations
                ● On many levels

         ●   Insufficient support for big joins




                                                                    16        14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subquery optimizations

 Status in MySQL 5.x, MariaDB 5.2
 ●   One execution strategy for every kind of subquery
           ●   If it is appropriate for your case – OK
           ●   If it is not – query is too slow to be usable
                     ●   10x, 100x, 1000x slower
 ●   There are cases when even EXPLAIN is very slow
           ● FROM subuqeries
           ● .. and other less-obvious cases



 ●   General public reaction
           ●   “Don't use subqueries”


                                                                    17   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subquery handling in MySQL 5.x

                                                                    ●   One strategy
                                                                        for every kind
                                                                        of subquery




                                                                          18      14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subquery handling in MySQL 5.6

                                                                    ●   Optimizations
                                                                        for a couple of
                                                                        cases




                                                                          19      14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subquery handling in MySQL 6.0

                                                                    ●   But look, this is
                                                                        what was in
                                                                        MySQL 6.0
                                                                        alpha




                                                                          20       14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subquery handling in MariaDB 5.3

                                                                    ●   And we still
                                                                        had to add this
                                                                        to provide
                                                                        enough
                                                                        coverage




                                                                          21      14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Subqueries in MariaDB 5.3

 Outside view
 ● Execution can be 10x, 100x, 1000x faster than before


 ● EXPLAIN is always instant




 A bit of detail
 ● Competitive amount of strategies

 ● Optimizer quality: can expect what you expect from join

   optimizer
 ● There is @@optimizer_switch flag for every new

   optimization
 ● Batched Key Access supported in important cases




                                                                    22   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Where to look for more detail




                                                      http://kb.askmonty.org




                                                                               23   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
Thanks




                                                                    Q&A




                                                                          24   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
25   14:16:02
Notice: MySQL is a registered trademark of Sun Microsystems, Inc.

Fosdem2012 mariadb-5.3-query-optimizer-r2

  • 1.
    FOSDEM 2012 MariaDB 5.3 query optimizer Taking the dolphin to where he's never been before Sergey Petrunya MariaDB Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 2.
    What is MySQLcurrently MySQL is ● “World's most popular database” ● Used for ● Websites ● OLTP applications at the same time ● “a toy database” ● “cannot handle complex queries” 2 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 3.
    When does oneuse MySQL Web apps ● Mostly reads ● Point or narrow-range select queries: ● SELECT * FROM web_pages WHERE key=... ● SELECT * FROM email_archive WHERE date BETWEEN ... ● SELECT * FROM forum_posts WHERE thread_id=34213 ORDER BY post_date DESC LIMIT 10 OLTP (Online Transaction Processing) applications ● Same as above but more writes and ACID requirements ● SELECT balance FROM users WHERE user_id=... ● UPDATE user_id 3 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 4.
    When does onenot use MySQL Decision support / analytics / reporting ● Database is larger ● think “current state” data → “full history” data ● Queries shuffle through more data ● “get data for order X” → “get biggest orders in the last month” ● Queries are frequently complex ● “get last N posts” → “get distribution of posts by time of the day” ● “get orders for item X made today” → “which fraction of those who ordered item X also ordered item Y” ? 4 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 5.
    What exactly isnot working Reasons MySQL is poor at decision support/analytics ● Large datasets ● Reading lots of records from disk requires special disk access strategies ● Complex queries ● Insufficient subquery optimizations ● On many levels ● Insufficient support for big joins 5 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 6.
    What exactly isn'tworking? Let's try running an analytics-type query ● Take DBT-3 (ad-hoc, decision-support benchmark) ● Load the data for scale=30 (75 GB) ● And try some query: “average price of item ordered in a certain month” select avg(lineitem.l_extendedprice) from orders, lineitem where lineitem.l_orderkey=orders.o_orderkey and orders.o_orderdate between date '1992-07-01' and date '1992-07-31'; 6 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 7.
    What exactly isn'tworking? (2) select avg(lineitem.l_extendedprice) from orders, lineitem where lineitem.l_orderkey=orders.o_orderkey and orders.o_orderdate between date '1992-07-01' and date '1992-07-31'; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE orders range PRIMARY, i_o_orderdate 4 NULL 1165090 Using where; i_o_orderdate Using index 1 SIMPLE lineitem ref PRIMARY, PRIMARY 4 orders.o_or 1 i_l_orderkey, derkey i_l_orderkey_quantity ● Query time: 45 min ● Why? ● Let's explore 7 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 8.
    What is theproblem? ● Check “iostat -x” avg­cpu:  %user   %nice %system %iowait  %steal   %idle            2.01    0.00    2.26   23.62    0.00   72.11 Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s sda               0.00     0.00  229.00    0.00  3952.00     0.00 avgrq­sz avgqu­sz   await r_await w_await  svctm  %util    34.52     0.95    4.15    4.15    0.00   4.15  95.00 ● IO-bound load ● 229 reqests/sec ● 4.15 msec average ● It's a 7200 RPM hdd, which gives 8 msec disk seek. ● Not random disk seeks but close. 8 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 9.
    What is theproblem (2) ● Check “SHOW ENGINE INNODB STATUS” ... -------- FILE I/O -------- ... 1 pending preads, 0 pending pwrites 206.36 reads/s, 16384 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s ● Possible solutions: ● Get more RAM ● Get an SSD ● These are ok to speedup OLTP workloads ● Speeding up analytics this way is going to be costly! 9 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 10.
    MySQL/MariaDB solution Improveddisk access strategies ● Multi-Range Read ● Batched Key Access Multi Range Read ● Access table records in disk order (MySQL, MariaDB) ● Enumerate index entries in index order (MariaDB) Batched Key Access ● Group ref/eq_ref accesses together ● Submit them to storage engine (e.g. InnoDB) as batch, ● So that Multi Range Read can do its job 10 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 11.
    Let's try thequery with MRR/BKA ● Enable MRR/BKA set optimizer_switch='mrr=on,mrr_sort_keys=on'; set join_cache_level=6; set join_buffer_size=1024*1024*32; set join_buffer_space_limit=1024*1024*32; ● Re-run the query select avg(lineitem.l_extendedprice) from orders, lineitem where lineitem.l_orderkey=orders.o_orderkey and orders.o_orderdate between date '1992-07-01' and date '1992-07-31'; ● Query time: 3 min 48 sec ● Was: 45 min, 11.8x speedup 11 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 12.
    Let's try thequery with MRR/BKA ● Explain is almost as before *************************** 1. row *************************** id: 1 select_type: SIMPLE table: orders type: range possible_keys: PRIMARY,i_o_orderdate key: i_o_orderdate key_len: 4 ref: NULL rows: 1165090 Extra: Using where; Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: lineitem type: ref possible_keys: PRIMARY,i_l_orderkey,i_l_orderkey_quantity key: i_l_orderkey key_len: 4 ref: dbt3sf30.orders.o_orderkey rows: 1 Extra: Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan 2 rows in set (0.00 sec) 12 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 13.
    Check what isdoing ● Check “iostat -x” avg-cpu: %user %nice %system %iowait %steal %idle 15.13 0.00 2.82 9.74 0.00 72.31 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz sda 0.00 0.00 1936.00 0.00 88112.00 0.00 91.02 0.45 await r_await w_await svctm %util 0.23 0.23 0.00 0.23 44.20 ● Got some CPU load ● svctm down to 0.23 ms (random seeks are 8ms) ● SHOW ENGINE INNODB STATUS ... 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 10450.55 reads/s 13 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 14.
    Use systemtap tolook at io patterns bash# stap deviceseeks.stp -c "sleep 60" Regular Batched Key Access 14 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 15.
    Data from abigger benchmark ● 4 GB RAM box ● DBT-3 scale=10 (30GB dataset) Before After avg 3.24 hrs 5.91 min median 0.32 hrs 4.48 min max 25.97 hrs* 22.92 min ● Can do joins we couldn't before! ● NOTE: not with default settings ● Will publish special “big” configuration on http://kb.askmonty.org 15 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 16.
    What exactly isnot working Reasons MySQL is poor at decision support/analytics ● Large datasets ● Reading lots of records from disk requires special disk access strategies ● Complex queries ● Insufficient subquery optimizations ● On many levels ● Insufficient support for big joins 16 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 17.
    Subquery optimizations Statusin MySQL 5.x, MariaDB 5.2 ● One execution strategy for every kind of subquery ● If it is appropriate for your case – OK ● If it is not – query is too slow to be usable ● 10x, 100x, 1000x slower ● There are cases when even EXPLAIN is very slow ● FROM subuqeries ● .. and other less-obvious cases ● General public reaction ● “Don't use subqueries” 17 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 18.
    Subquery handling inMySQL 5.x ● One strategy for every kind of subquery 18 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 19.
    Subquery handling inMySQL 5.6 ● Optimizations for a couple of cases 19 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 20.
    Subquery handling inMySQL 6.0 ● But look, this is what was in MySQL 6.0 alpha 20 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 21.
    Subquery handling inMariaDB 5.3 ● And we still had to add this to provide enough coverage 21 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 22.
    Subqueries in MariaDB5.3 Outside view ● Execution can be 10x, 100x, 1000x faster than before ● EXPLAIN is always instant A bit of detail ● Competitive amount of strategies ● Optimizer quality: can expect what you expect from join optimizer ● There is @@optimizer_switch flag for every new optimization ● Batched Key Access supported in important cases 22 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 23.
    Where to lookfor more detail http://kb.askmonty.org 23 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 24.
    Thanks Q&A 24 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.
  • 25.
    25 14:16:02 Notice: MySQL is a registered trademark of Sun Microsystems, Inc.