Inno Db Performance And Usability Patches

2,356 views
2,274 views

Published on

Published in: Technology
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,356
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
51
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

Inno Db Performance And Usability Patches

  1. 1. InnoDB Performance and Usability Patches MySQL CE 2009 Vadim Tkachenko, Ewen Fortune Percona Inc MySQLPerformanceBlog.com
  2. 2. Who are we ? •  Vadim Tkachenko –  Co-Founder of Percona Inc •  Lead of R&D department –  Co-Author MySQLPerformanceBlog.com –  Co-Author “High Performance MySQL” 2nd edition book •  Ewen Fortune –  Consultant, Percona Inc •  Special Thanks Yasufumi Kinoshita –  Not here, but author of most patches
  3. 3. What is this talk about? •  Patches made by Percona for InnoDB Storage Engine •  Two main focuses –  Performance improvement patches –  “Usability” patches •  Make InnoDB a bit more friendly  •  World changed since time of Pentium 100MHz and 8MB of RAM –  But many such assumptions still in InnoDB code
  4. 4. Why we do it •  Most requirements and changes come from practical work with customers •  We need InnoDB fully utilizing modern hardware today –  16 cores –  RAIDs –  SSD / FusionIO / other storage technologies •  InnoDB team is “conservative” in making improvements in this area
  5. 5. Future •  Why patches ? Why it can’t be included in InnoDB ? –  We are often asked about, but actually question is to InnoDB team •  (empty space due to uncertainty of MySQL future in Oracle) •  Anyway we will continue our work
  6. 6. Versions •  5.0 –  Set of patches –  SHOW PATCHES to see full list •  5.1 –  Storage engine XtraDB –  Based on InnoDB + patches, not real competitor of InnoDB, but drop-in enhanced version
  7. 7. Performance Patches
  8. 8. Scalability •  Enhanced read_write locks –  Improves InnoDB scalability on systems with 8-16 cores –  Similar on Google implementation, InnoDB-plugin-1.0.3 –  Our implementation is alternative •  Topic to research which one is better •  InnoDB-plugin may be preferred, InnoDB team made hard job porting it to many platforms –  And now in 5.4 •  Split buffer_pool mutex even more –  Additional split of buffer_pool mutex to 5.0.33
  9. 9. IO patches •  InnoDB IO patches –  Part similar to Google’s InnoDB IO patches, but again alternative  –  Several parts – some of them now in 5.4
  10. 10. IO – multiple threads •  Read_io_threads –  Number of threads for reads requests (by default 1) –  Not really useful as used only for read-ahead requests •  Write_io_threads –  Number of threads for write requests (by default 1) –  This is one you may want to use on system with multiple disks •  Io_capacity –  Amount of IO operations per second InnoDB assumes server can do (by default 100, which is not right assumptions for modern systems)
  11. 11. IO – Adaptive checkpoint •  InnoDB flushing of dirty buffer_pool pages may be intensive •  Lack of free pages may be controlled by innodb_max_dirty_pages_pct •  Flushing at the moment of checkpoint is not controllable, intensive and may hurt
  12. 12. Adaptive checkpointing InnoDB default behavior, hiccups during buffer_pool flushing
  13. 13. Adaptive checkpoint •  What we do: •  Flush pages more intensive –  the closer checkpoint the more intensive
  14. 14. Adaptive_checkpoint •  Adaptive_checkpoint=1
  15. 15. IO Control of Insert buffer •  Ibuf_max_size – maximal size of insert buffer (by default can be half of buffer_pool) •  Ibuf_accel_rate – IO rate for background thread, works in pair with io_capacity
  16. 16. IO – multiple pages •  Read_ahead = (both | linear | random) –  Control to use or not internal InnoDB read-ahead logic •  Flush_neighbor_pages = (yes|no) –  By default InnoDB also writes neighborhoods of flushing pages •  All these operations were made for disks with expensive (in time sense) random reads – may be not needed for SSD / FusionIO / other devices with cheap random reads
  17. 17. Extra rollback segments •  By default InnoDB uses single segment protected by mutex •  Sensitive in intensive parallel insert load
  18. 18. Fix group commit •  “Broken” in 5.0 –  Problem appears on slow disks with enabled binary-logs
  19. 19. Benchmark •  Tpcc-like workload •  100 Warehouses (about 10GB of data) •  Buffer_pool=5GB •  System: Dell PowerEdge R900, RAID 10 on 8 disks, RAM 32GB –  O_DIRECT for InnoDB, xfs filesystem, mounted with nobarrier •  5.0.77 vs 5.0.77-percona –  Had no chance to test 5.4 yet
  20. 20. Benchmark
  21. 21. Usability patches
  22. 22. Microslow •  InnoDB part InnoDB_IO_r_ops: 1 InnoDB_IO_r_bytes: 16384 InnoDB_IO_r_wait: 0.028487 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000 # InnoDB_pages_distinct: 5
  23. 23. Limit data dictionary •  Problem: –  Data dictionary entry of once opened table kept in memory forever (or while DELETE table) –  Is not problem for regular usage (100-1000 tables) –  Is problem for instances with 10K+ tables •  10GB+ of memory just allocated for datadictionary entries •  Our solution: –  LRU based datadictionary entries –  Remove from memory oldest entries if limit reached
  24. 24. IO access pattern Show pattern of pages on disk accessed •  mysql> select INDEX_ID,TABLE_NAME,INDEX_NAME,sum(N_READ),sum(N_WRITE) from INFORMATION_SCHEMA.INNODB_ALL_PAGE_IO group •  by INDEX_ID; •  +------------+------------------+-------------------+-------------+--------------+ •  | INDEX_ID | TABLE_NAME | INDEX_NAME | sum(N_READ) | sum(N_WRITE) | •  +------------+------------------+-------------------+-------------+--------------+ •  | 30 | tpcc/item | PRIMARY | 547 | 0| •  | 32 | tpcc/district | PRIMARY | 1| 1| •  | 36 | tpcc/history | GEN_CLUST_INDEX | 11 | 5| •  | 37 | tpcc/history | fkey_history_1 | 166 | 163 | •  | 38 | tpcc/history | fkey_history_2 | 37 | 30 | •  | 39 | tpcc/new_orders | PRIMARY | 76 | 76 | •  | 43 | tpcc/order_line | PRIMARY | 218 | 189 | •  | 44 | tpcc/order_line | fkey_order_line_2 | 1040 | 1040 | •  | 46 | tpcc/stock | PRIMARY | 3137 | 1764 | •  | 47 | tpcc/stock | fkey_stock_2 | 269 | 0| •  | 48 | tpcc/customer | PRIMARY | 960 | 580 | •  | 49 | tpcc/customer | idx_customer | 171 | 0| •  | 50 | tpcc/orders | PRIMARY | 94 | 70 | •  | 51 | tpcc/orders | idx_orders | 142 | 129 |
  25. 25. Show buffer pool content •  What is in buffer_pool select space,offset, RECORDS, DATASIZE, INDEX_NAME,TABLE_SCHEMA,TABLE_NAME from information_schema.INNODB_BUFFER_POOL_CONTENT limit 10; +-------+---------+---------+----------+------------+--------------+-------------+ | space | offset | RECORDS | DATASIZE | INDEX_NAME | TABLE_SCHEMA | TABLE_NAME | +-------+---------+---------+----------+------------+--------------+-------------+ | 1584 | 640643 | 9| 10312 | PRIMARY | art104 | article104 | | 1648 | 2100 | 135 | 15226 | PRIMARY | art114 | author114 | | 1492 | 4507 | 158 | 15130 | PRIMARY | art87 | author87 | | 1406 | 17498 | 141 | 16056 | img_status | art52 | img_out52 | | 1466 | 47632 | 49 | 15140 | PRIMARY | art62 | img_out62 | | 1470 | 1395457 | 24 | 14769 | PRIMARY | art84 | article84 | | 1460 | 16025 | 62 | 15174 | PRIMARY | art61 | img_out61 | | 1458 | 560956 | 20 | 14977 | PRIMARY | art61 | article61 | | 1466 | 67953 | 56 | 15182 | PRIMARY | art62 | img_out62 | | 1621 | 162962 | 46 | 15134 | PRIMARY | art110 | link_out110 | +-------+---------+---------+----------+------------+--------------+-------------+
  26. 26. Show memory usage •  Extended information about memory consuming ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 328830416; in additional pool allocated 2117120 + Internal hash tables (constant factor + variable factor) + Adaptive hash index 4839388 (4425628 + 413760) + Page hash 138716 + Dictionary cache 3383508 (3320220 + 63288) + File system 41848 (41336 + 512) + Lock system 332788 (332468 + 320) + Recovery system 0 (0 + 0) + Threads 41900 (41348 + 552) Buffer pool size 16384 + Buffer pool size, bytes 268435456 Free buffers 12396
  27. 27. Show locks held •  ---TRANSACTION 0 163390, ACTIVE 0 sec, process no 15571, OS thread id 1159485776 inserting •  mysql tables in use 1, locked 1 •  7 lock struct(s), heap size 1216, undo log entries 4 •  MySQL thread id 15, query id 15455 127.0.0.1 root update •  INSERT INTO history(h_c_d_id, h_c_w_id, h_c_id, h_d_id, h_w_id, h_date, h_amount, h_data) VALUES(?, ?, ?, ?, ?, ?, •  ?, ?) •  Trx read view will not see trx with id >= 0 163391, sees < 0 163086 •  TABLE LOCK table `test/warehouse` trx id 0 163390 lock mode IX •  RECORD LOCKS space id 10 page no 3 n bits 168 index `PRIMARY` of table `test/warehouse` trx id 0 163390 lock_mode X •  locks rec but not gap •  TABLE LOCK table `test/district` trx id 0 163390 lock mode IX •  RECORD LOCKS space id 18 page no 7 n bits 216 index `PRIMARY` of table `test/district` trx id 0 163390 lock_mode X locks rec but not gap •  TABLE LOCK table `test/customer` trx id 0 163390 lock mode IX •  RECORD LOCKS space id 19 page no 57918 n bits 96 index `PRIMARY` of table `test/customer` trx id 0 163390 lock_mode X locks rec but not gap •  TABLE LOCK table `test/history` trx id 0 163390 lock mode IX
  28. 28. Extra undo slots •  By default 1024 slots to store transaction undo information, that may limit count of concurrent transactions to 512 •  We increase to 4072 –  Only on 5.1 XtraDB –  Use it only if you need, breaks compatibility with InnoDB
  29. 29. TransactionalReplication •  Similar to Google’s patch •  Information in relay-log.info is not consistent with InnoDB state. –  When server crash MySQL will repeat several transaction •  You are lucky if replication fails on “Duplicate key error” •  In worst case you will have several transactions executed twice •  Our solution: store information of binary log name and position and relay-log name and position in InnoDB transactional log file
  30. 30. Plans •  Still hunt performance improvements •  Operations tasks: –  Fast recovery •  There is reported bug http://bugs.mysql.com/bug.php?id=29847 –  Preload table / index into buffer_pool. –  Copy single .ibd table from one server to different –  Open InnoDB tables in parallel •  Currently serialized –  Different improvements on statistics •  Some patches already published (not by us)
  31. 31. To finalize •  Most of patches is not rocket science –  Could be developed or included in official tree long time ago •  Even more, for some patches we just only uncommented few lines of code –  Expect most of them in MariaDB 5.1
  32. 32. Questions ? •  Thank you for coming!

×