Optimizing InnoDB bufferpool usage

Presentation at the 2012 FOSDEM MySQL devroom



Optimizing your InnoDB buffer pool usage
Steve Hardy

Why optimize buffer pool usage?
  - It decreases I/O load
How?
  - More RAM
  - Page compression
  - Less (smaller) data
  - Rearrange data
OVERSIMPLIFICATION WARNING
Do not complain about inaccuracies; I'm just trying to make a general point.
Let's take a look at why a RAM cache works
  - Let's say our buffer pool is 20% of our total DB size
  - Let's say accesses to cached pages take no time at all
  - Let's say all data is accessed equally often
  - Let's look at the data access over a certain period of time, say 1 hour
Total gain from the cache: 20% of the accesses are cached, so we're 20% faster than when we have no cache at all.
Fortunately, data access isn't really like that
  - Page accesses are more concentrated in some places than others
  - We want to cache the most-accessed places
  - Let's say accesses follow a 1/x curve
Total gain from the cache: 60% of accesses are in cached pages, a 60% reduction in I/O, so 2.5 times faster than without a cache.
So now what?
  - The trick to optimizing your buffer pool usage is to make the access histogram look more like a slope than a flat line.
A slope following 1/1 gives a 20% gain with 20% cache size (1.2x faster).
A slope following 1/x gives a 60% gain with 20% cache size (2.5x faster).
A slope following 1/x^2 gives a 92% gain with 20% cache size (12.5x faster).
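The three speedup figures above follow directly from the simplifying assumption that cached accesses are free: run time is proportional to the miss rate, so the speedup is its reciprocal (the slide rounds 1.25x to 1.2x):

```latex
\text{speedup} = \frac{1}{1 - \text{hit rate}}, \qquad
\frac{1}{1 - 0.20} = 1.25,\quad
\frac{1}{1 - 0.60} = 2.5,\quad
\frac{1}{1 - 0.92} = 12.5
```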
Inspecting your buffer pool
  - Get hold of MariaDB or Percona Server (stock MySQL is incapable of displaying these stats; maybe 5.6?)
  - See what pages you're storing in the buffer pool:

    select * from information_schema.INNODB_BUFFER_POOL_PAGES_INDEX;

    +----------+----------+---------+--------+-----------+--------+-------------+----------+-------+-----+--------------+-----------+------------+
    | index_id | space_id | page_no | n_recs | data_size | hashed | access_time | modified | dirty | old | lru_position | fix_count | flush_type |
    +----------+----------+---------+--------+-----------+--------+-------------+----------+-------+-----+--------------+-----------+------------+
    |      945 |        0 |  115129 |    290 |     15080 |      0 |  1213040390 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |   32820 |    549 |     15965 |      0 |  1213040370 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |  112322 |    366 |     15006 |      0 |  1213040370 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |   32831 |    506 |     15961 |      0 |  1213040349 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |  111817 |    350 |     15050 |      0 |  1213040332 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |   49176 |    535 |     15959 |      0 |  1213040307 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |  113198 |    318 |     15030 |      0 |  1213040299 |        0 |     0 |   0 |            0 |         0 |          0 |
    |      945 |        0 |   32828 |    533 |     15966 |      0 |  1213040284 |        0 |     0 |   0 |            0 |         0 |          0 |
    +----------+----------+---------+--------+-----------+--------+-------------+----------+-------+-----+--------------+-----------+------------+
Inspecting your buffer pool
  - index_id: reference to information_schema.INNODB_SYS_INDEXES
  - space_id: tablespace (ibdata file, or file number with innodb_file_per_table)
  - page_no: page number (unique)
  - n_recs: number of records on the page
  - data_size: size of the data on the InnoDB page
  - access_time: epoch timestamp of the last access
  - modified (1/0): something was modified in the page since it was loaded
  - dirty (1/0): not flushed to disk yet
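Since access_time is an epoch timestamp, the same view can give a rough per-index "hotness" ranking. A sketch against the MariaDB/Percona information_schema table mentioned above; the 5-minute threshold is an arbitrary choice of mine, not from the talk:

```sql
-- Count pages touched in the last 5 minutes, grouped per index (sketch).
SELECT bp.index_id, COUNT(*) AS hot_pages
FROM information_schema.INNODB_BUFFER_POOL_PAGES_INDEX AS bp
WHERE bp.access_time > UNIX_TIMESTAMP() - 300
GROUP BY bp.index_id
ORDER BY hot_pages DESC;
```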
Inspecting your buffer pool
  - See what pages you're storing in the buffer pool per table/index:

    select count(*) as pages, ta.SCHEMA as db, ta.NAME as tab, ind.NAME as ind
    FROM innodb_buffer_pool_pages_index AS bp
    JOIN innodb_sys_indexes AS ind ON bp.index_id = ind.id
    JOIN innodb_sys_tables AS ta ON ind.table_id = ta.id
    GROUP BY bp.index_id;

    +-------+----------------+-------------+-----------+
    | pages | db             | tab         | ind       |
    +-------+----------------+-------------+-----------+
    |     1 |                | SYS_FOREIGN | FOR_IND   |
    |     1 |                | SYS_FOREIGN | REF_IND   |
    |     1 | zarafa_db9     | tproperties | PRIMARY   |
    |     1 | zarafa_db9     | tproperties | hi        |
    |     4 | zarafa_db9     | properties  | PRIMARY   |
    |     1 | zarafa_db9     | syncs       | sync_time |
    |     1 | zarafa_indexer | stores      | PRIMARY   |
    |     1 | zarafa_indexer | stores      | id        |
    |    14 | zarafa_indexer | docwords_2  | PRIMARY   |
    |    17 | zarafa_indexer | docwords    | PRIMARY   |
    |    17 | zarafa_indexer | words       | PRIMARY   |
    |     1 | zarafa_indexer | updates     | PRIMARY   |
    |     1 | zarafa_indexer | updates     | doc       |
    |     1 | zarafa_indexer | sourcekeys  | PRIMARY   |
    +-------+----------------+-------------+-----------+
How do I know something is 'wrong'?
  - More of an art than an exact science, but here are some clues:
    - 50% of your buffer pool is used by a table that contains 20% of your data
    - 50% of your buffer pool is used by a table that you thought wouldn't really impact performance
Okay, so how do I fix it?
  - Application strategies:
    - Increase record 'density': make your records smaller. This gives you more records per page, and therefore fewer pages for the same data.
    - Remove indexes if you can use others almost as efficiently.
    - Increase record 'locality': make sure that records that are accessed around the same time have a higher chance of being on the same page.
  - Other, non-application strategies:
    - Use page compression (may double or quadruple the number of pages you can hold in RAM), but it isn't a sure win, since pages need to be held decompressed in RAM as well.
    - Buy more RAM ;)
    - Try BKA, batched key access (MariaDB 5.3), if you have this problem even within one query.
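Page compression, as mentioned above, is a per-table setting. A sketch for the 5.1/5.5-era servers the talk targets (it assumes innodb_file_per_table=1 and innodb_file_format=Barracuda; the 8 KB block size and table name are just examples):

```sql
-- Store this table compressed. Note that pages are kept both compressed
-- and decompressed in the buffer pool, so this is not a guaranteed win.
ALTER TABLE messages ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
```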
Locality?
  - If you're reading a range in an index, locality is high by definition.
  - Things get less efficient when you join to some other table: each record will generate a single page read for the linked record.
  - There are some enemies: random IDs (like GUIDs), and auto_increment keys that are accessed in some order other than the original insertion order.

E.g. table messages (id INTEGER auto_increment, userid INTEGER, data varchar(255), PRIMARY KEY (id));
This table is now read most optimally in the order the messages were received. In practice, that is almost never what you need (except maybe during backup).
Increasing locality: example
  - A probably more interesting case: reading all the emails of a user:

    SELECT data FROM emails WHERE userid=10;

  - The naive approach would be to use a secondary index:

    table messages (id INTEGER auto_increment, userid INTEGER, data varchar(255), PRIMARY KEY (id), KEY user (userid));

  - Fast range read in the index (1 page), but a slow record lookup for each row (e.g. 100 pages): 101 pages in the buffer pool.
Increasing locality: example
  - A better idea is to pack records that are accessed together more closely, using InnoDB's clustered index:

    table messages (id INTEGER auto_increment, userid INTEGER, data varchar(255), PRIMARY KEY (userid, id), UNIQUE KEY user (id));

  - Fast range read over the records themselves (~5 pages): 5 pages in the buffer pool.
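The two layouts from the slides, written out as full DDL (a sketch; the table names and NOT NULL/ENGINE spellings are mine):

```sql
-- Naive layout: rows are clustered by insertion order (PRIMARY KEY (id)),
-- so WHERE userid=10 does a cheap range scan on the `user` index plus one
-- primary-key page read per matching row.
CREATE TABLE messages_naive (
  id INTEGER NOT NULL AUTO_INCREMENT,
  userid INTEGER NOT NULL,
  data VARCHAR(255),
  PRIMARY KEY (id),
  KEY user (userid)
) ENGINE=InnoDB;

-- Clustered layout: all rows of one user are adjacent in the clustered
-- index, so WHERE userid=10 is a single short range scan over the records.
-- AUTO_INCREMENT stays valid because id is the first column of the UNIQUE
-- KEY, which MySQL requires for auto-increment columns.
CREATE TABLE messages_clustered (
  id INTEGER NOT NULL AUTO_INCREMENT,
  userid INTEGER NOT NULL,
  data VARCHAR(255),
  PRIMARY KEY (userid, id),
  UNIQUE KEY user (id)
) ENGINE=InnoDB;
```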
More locality: introducing redundant information
  - In Zarafa, we have 'summary' information on emails (subject, from, to, x-mailer, etc.)
  - (we have databases of up to 1 TB in a single table, for 1000s of users)

  Hierarchy (id, parentid):
    4,1  5,1  6,1  7,1
  Properties (id, type, value), PRIMARY KEY (id, type):
    4, 'subject', 'hello'
    4, 'from', 'steve@zarafa.com'
    4, 'x-mailer', 'zarafa'
    5, 'subject', 'hello again'
    etc.
Why this is slow
  - E.g. sort by subject (get all subjects of folder X):

    SELECT value FROM properties JOIN hierarchy ON properties.id = hierarchy.id
    WHERE hierarchy.parent=X AND properties.type=Y;

    +----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+
    | id | select_type | table      | type   | possible_keys  | key     | key_len | ref                    | rows | Extra |
    +----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+
    |  1 | SIMPLE      | hierarchy  | ref    | PRIMARY,parent | parent  | 4       | const                  |    3 |       |
    |  1 | SIMPLE      | properties | eq_ref | PRIMARY        | PRIMARY | 8       | bla.hierarchy.id,const |    1 |       |
    +----+-------------+------------+--------+----------------+---------+---------+------------------------+------+-------+

  - Result: one random access per email; for 10000 emails, worst case 50 seconds.
Introduce a little garbage for locality
  - The data is redundant: if you lost it, you could regenerate it.
  - Normally regarded as a bad idea.
  - I don't care.
  - Add 'parentid' into properties:

  Hierarchy (id, parentid):
    4,1  5,1  6,1  7,1
  Properties (id, garbage, type, value), PRIMARY KEY (id, garbage, type):
    4, 1, 'subject', 'hello'
    4, 1, 'from', 'steve@zarafa.com'
    4, 1, 'x-mailer', 'zarafa'
    5, 1, 'subject', 'hello again'
    etc.
What's the catch?
  - Writes are reads: random writes cause random reads into the buffer pool.
  - If you are write-heavy, this will not work, since increasing read-locality normally decreases write-locality.
What's wrong with GUIDs?
  - You know, like {0999C37B-9F73-42bb-BA57-B88940FDD686}
  - Random inserts
  - Almost certainly random lookups
  - Each lookup or write will load a single page into the buffer pool, for the sole purpose of reading or writing a single record
  - Better idea: use a fixed GUID + counter, for example (at least inserts will then go to the 'same' place)
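A sketch of the "fixed GUID + counter" idea; the table names, column types, and how the counter is maintained are my assumptions, not from the talk:

```sql
-- Fully random GUID key: every insert and lookup lands on a random page.
CREATE TABLE objects_random (
  guid CHAR(36) NOT NULL,          -- e.g. '0999C37B-9F73-42bb-BA57-B88940FDD686'
  data VARCHAR(255),
  PRIMARY KEY (guid)
) ENGINE=InnoDB;

-- Fixed prefix + increasing counter: the random part is constant per
-- installation, so consecutive inserts land next to each other in the index.
CREATE TABLE objects_sequential (
  guid_prefix BINARY(16) NOT NULL, -- fixed, generated once per installation
  counter BIGINT NOT NULL,         -- monotonically increasing, app-maintained
  data VARCHAR(255),
  PRIMARY KEY (guid_prefix, counter)
) ENGINE=InnoDB;
```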
Questions?
