Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
MySQL Optimization Tips and Tricks from the MySQL class
Topics of Discussion <ul><li>General tips/tricks
Data Types
Indexes
General Server Settings
MyISAM vs InnoDB
MyISAM
InnoDB </li></ul>
Next Topic General Tips and Tricks
General tips and tricks <ul><li>This all applies to 5.0, 5.1 – may not apply to 4.1
Recommend 5.1 for new installations
Version 4.0 EOL'd, 4.1 likely to be EOL'd by end of year
Circular replication unofficially discouraged </li></ul>
Rolling upgrades <ul><li>Start (for example) with master and slave using 4.1
Take slave offline and upgrade to 5.0
Bring slave back online and allow it to sync
Take both down, put slave into place as master
Upgrade old master to 5.0, put into place as slave
Repeat for upgrade to 5.1
Ok to have slave at higher rev than master.  Inverse is  not  ok.
Not helpful if replicating mysql.* tables </li></ul>
Client safety trick <ul><li>i-am-a-dummy in .my.cnf, under [client] </li><ul><li>Prevents updates/deletes without a where
Only 1000 rows returned from any query, unless limit by is specified </li></ul></ul>
Faster Dumping/Loading of Tables Store and load data in tab delimited format <ul><li>SELECT * FROM table_name INTO OUTFILE...
LOAD DATA INFILE '/path/to/file' INTO TABLE table_name </li></ul>
MySQL Proxy <ul><li>Technically alpha, but used by MySQL Enterprise
GPLd – works with non-enterprise versions
Supports 5.0 and 5.1
Alter connections, outgoing queries, or returning resultsets
Lua interface
Enterprise version has Query Analyzer based on Proxy </li></ul>
Next Topic Data Types
Help Choosing Data Types <ul><li>Requires a table to already be populated
Do a:  SELECT <field(s) or *> FROM table PROCEDURE ANALYSE()
Won't work if the table is too big (takes too much memory) </li></ul>
Misconceptions <ul><li>enums and sets are well behaved now (and more efficient for lookups than char/varchar(1))
Using smaller int size for primary key is an lookup speed win (not just data size difference) </li></ul>
Other considerations <ul><li>Always use  not null  if possible.  Helps lookup speed  and  storage space
Static width tables (char vs varchar, no text types) are faster for lookups ... only for MyISAM
Store ips as unsigned int </li><ul><li>Use inet_ntoa/inet_aton to access from libs
IPv4 only
Saves 11 bytes per row
Numeric (vs string) lookups – much faster </li></ul></ul>
Data compression <ul><li>Archive table engine
Myisampack </li></ul>
Archive table engine <ul><li>Supports insert/select, not update/delete
Better compression than myisampack
Don't have to go offline to convert ( ALTER TABLE table_name ENGINE=ARCHIVE )
Full table scans – no indexes supported
Have to make sure it's supported/active with  SHOW ENGINES .  Not activated by default in MySQL 5.0 </li></ul>
myisampack <ul><li>Supported for MyISAM tables only
Have to lock table completely (better to shut down db)
Run shell command to pack, then another to fix indexes
Table becomes read only
Indexes still work </li></ul>
Next Topic Indexes
Order matters <ul><li>Index on col_a, col_b, col_c
Will  not  be used by:  SELECT ... WHERE col_b = x, col_c = x
Will  be used by:  SELECT ... WHERE col_a = x, col_b = x  (partial index)
Will  be used by:  SELECT ... WHERE col_a = x order by col_b  (partial index) </li></ul>
Upcoming SlideShare
Loading in …5
×

MySQL Optimization

7,665 views

Published on

Presentation given as a redux of a MySQL training course which was then given to the staff at Dyn Inc.

Published in: Technology
  • Be the first to comment

MySQL Optimization

  1. 1. MySQL Optimization Tips and Tricks from the MySQL class
  2. 2. Topics of Discussion <ul><li>General tips/tricks
  3. 3. Data Types
  4. 4. Indexes
  5. 5. General Server Settings
  6. 6. MyISAM vs InnoDB
  7. 7. MyISAM
  8. 8. InnoDB </li></ul>
  9. 9. Next Topic General Tips and Tricks
  10. 10. General tips and tricks <ul><li>This all applies to 5.0, 5.1 – may not apply to 4.1
  11. 11. Recommend 5.1 for new installations
  12. 12. Version 4.0 EOL'd, 4.1 likely to be EOL'd by end of year
  13. 13. Circular replication unofficially discouraged </li></ul>
  14. 14. Rolling upgrades <ul><li>Start (for example) with master and slave using 4.1
  15. 15. Take slave offline and upgrade to 5.0
  16. 16. Bring slave back online and allow it to sync
  17. 17. Take both down, put slave into place as master
  18. 18. Upgrade old master to 5.0, put into place as slave
  19. 19. Repeat for upgrade to 5.1
  20. 20. Ok to have slave at higher rev than master. Inverse is not ok.
  21. 21. Not helpful if replicating mysql.* tables </li></ul>
  22. 22. Client safety trick <ul><li>i-am-a-dummy in .my.cnf, under [client] </li><ul><li>Prevents updates/deletes without a where
  23. 23. Only 1000 rows returned from any query, unless limit by is specified </li></ul></ul>
  24. 24. Faster Dumping/Loading of Tables Store and load data in tab delimited format <ul><li>SELECT * FROM table_name INTO OUTFILE '/path/to/file'
  25. 25. LOAD DATA INFILE '/path/to/file' INTO TABLE table_name </li></ul>
  26. 26. MySQL Proxy <ul><li>Technically alpha, but used by MySQL Enterprise
  27. 27. GPLd – works with non-enterprise versions
  28. 28. Supports 5.0 and 5.1
  29. 29. Alter connections, outgoing queries, or returning resultsets
  30. 30. Lua interface
  31. 31. Enterprise version has Query Analyzer based on Proxy </li></ul>
  32. 32. Next Topic Data Types
  33. 33. Help Choosing Data Types <ul><li>Requires a table to already be populated
  34. 34. Do a: SELECT <field(s) or *> FROM table PROCEDURE ANALYSE()
  35. 35. Won't work if the table is too big (takes too much memory) </li></ul>
  36. 36. Misconceptions <ul><li>enums and sets are well behaved now (and more efficient for lookups than char/varchar(1))
  37. 37. Using smaller int size for primary key is an lookup speed win (not just data size difference) </li></ul>
  38. 38. Other considerations <ul><li>Always use not null if possible. Helps lookup speed and storage space
  39. 39. Static width tables (char vs varchar, no text types) are faster for lookups ... only for MyISAM
  40. 40. Store ips as unsigned int </li><ul><li>Use inet_ntoa/inet_aton to access from libs
  41. 41. IPv4 only
  42. 42. Saves 11 bytes per row
  43. 43. Numeric (vs string) lookups – much faster </li></ul></ul>
  44. 44. Data compression <ul><li>Archive table engine
  45. 45. Myisampack </li></ul>
  46. 46. Archive table engine <ul><li>Supports insert/select, not update/delete
  47. 47. Better compression than myisampack
  48. 48. Don't have to go offline to convert ( ALTER TABLE table_name ENGINE=ARCHIVE )
  49. 49. Full table scans – no indexes supported
  50. 50. Have to make sure it's supported/active with SHOW ENGINES . Not activated by default in MySQL 5.0 </li></ul>
  51. 51. myisampack <ul><li>Supported for MyISAM tables only
  52. 52. Have to lock table completely (better to shut down db)
  53. 53. Run shell command to pack, then another to fix indexes
  54. 54. Table becomes read only
  55. 55. Indexes still work </li></ul>
  56. 56. Next Topic Indexes
  57. 57. Order matters <ul><li>Index on col_a, col_b, col_c
  58. 58. Will not be used by: SELECT ... WHERE col_b = x, col_c = x
  59. 59. Will be used by: SELECT ... WHERE col_a = x, col_b = x (partial index)
  60. 60. Will be used by: SELECT ... WHERE col_a = x order by col_b (partial index) </li></ul>
  61. 61. Covering indexes <ul><li>Happens when an index covers all fields selected as well as all fields in where clause
  62. 62. Doesn't have to retrieve data – it's in the index
  63. 63. If ”using index” shows up in ”Extra” field of an explain, we're using a covering index </li></ul>
  64. 64. Index merge optimization <ul><li>Query against a single table can potentially use multiple indexes
  65. 65. Examples: </li><ul><li>SELECT * FROM tbl_name WHERE key1 = 10 OR key2 = 20;
  66. 66. SELECT * FROM tbl_name WHERE (key1 = 10 OR key2 = 20) AND non_key=30; </li></ul><li>In ”Explain” output, shows up as ”index_merge” in the ”type” column </li></ul>
  67. 67. Next Topic General Server and Client settings
  68. 68. Open Tables <ul><li>show global status like 'open%tabl%' </li><ul><li>open_tables – tables currently open
  69. 69. opened_tables – how many times tables have been opened </li></ul><li>If opened_tables is always increasing, should probably consider upping the table_open_cache
  70. 70. show open tables </li></ul>
  71. 71. Query Cache <ul><li>Independent of storage engine
  72. 72. Hash key is literal md5sum of unaltered query so case, order, spacing are all significant
  73. 73. Use of user functions or stored procedures mean no qcache entry
  74. 74. Functions like NOW(), RAND(), etc don't go in the qcache
  75. 75. Prepared statements do not go in qcache </li></ul>
  76. 76. Qcache Settings <ul><li>Query cache defaults to 'on' (query_cache_type var) but with a 0 size (query_cache_size var) – so no qcache by default!
  77. 77. According to MySQL trainer, query_cache_size > 128MB typically starts making qcache not as useful
  78. 78. show global variables like 'query%'
  79. 79. show global status like 'qc%' </li></ul>
  80. 80. Qcache Settings cont'd <ul><li>query_cache_limit (var) – maximum size of single result set in qcache
  81. 81. query_cache_min_res_unit (var) – minimum block size for qcache results
  82. 82. qcache_free_blocks (status) – Lots means fragmented query cache
  83. 83. qcache_lowmem_prunes (status) – how many times queries had to be removed from cache because qcache was running out of memory </li></ul>
  84. 84. Optimizing Qcache <ul><li>flush query cache </li><ul><li>”defrags” query cache
  85. 85. Does not empty query cache
  86. 86. Can take a few seconds to run </li></ul><li>reset query cache </li><ul><li>Empties the current query cache </li></ul></ul>
  87. 87. Calculating qcache usage <ul><li>Compare these two: </li><ul><li>show global status like 'qcache_hits'
  88. 88. show global status like 'com_select' </li></ul><li>Qcache hit rate is qcache_hits/(qcache_hits+com_select) </li></ul>
  89. 89. Thread caching <ul><li>Each client gets a thread, there is some minor overhead to creating a thread
  90. 90. show global status like 'thread%'
  91. 91. If threads_created variable increasing quickly, try upping thread_cache_size until it is increasing slowly or not at all </li></ul>
  92. 92. max_connections <ul><li>Default is 100
  93. 93. 500 – 1000 is reasonable for dedicated system
  94. 94. Each connection requires file descriptor, memory for sort buffer ,join buffer, read buffer, etc </li></ul>
  95. 95. Sort Buffer <ul><li>Per client connection
  96. 96. sort_buffer_size – memory allocated per connection (full size always allocated by each connection)
  97. 97. Used by 'order by' and 'group by' second passes
  98. 98. If data is too big for sort buffer, a disk based sort is used </li></ul>
  99. 99. Sort Buffer cont'd <ul><li>Things to watch to help decide on proper sort_buffer_size: </li><ul><li>sort_merge_passes (status): number of merge passes performed – a higher number indicates needing to up sort buffer
  100. 100. sort_range (var) – displays sorts done using ranges
  101. 101. sort_rows (var) – the number of sorted rows
  102. 102. sort_scan (var) – number of sorts done by scanning the table (bigger is bad ) </li></ul></ul>
  103. 103. Join/Read buffers <ul><li>join_buffer_size (var) – size of memory allocated when doing joins that don't use indexes. A well designed SQL database makes very little use of this
  104. 104. read_buffer_size (var) – Buffer for caching row data during full table scans. Again, well designed databases will not use this much. </li></ul>
  105. 105. Tmp tables <ul><li>Used whenever an explain mentions ”using tmp” in the extras column, often used for group by clauses, etc
  106. 106. show global status like '%tmp%tab%' (gives an idea how often these go to disk instead of staying in memory)
  107. 107. tmp_table_size (var) – max size of data for memory based tmp tables. Bigger tmp tables will go to disk </li></ul>
  108. 108. Next Topic MyISAM vs InnoDB
  109. 109. MyISAM Pros/Cons Pros <ul><li>Fast in many scenarios
  110. 110. Smaller data sizes
  111. 111. Easy to backup </li></ul>Cons <ul><li>Data more easily corrupted
  112. 112. No ACID compliance
  113. 113. Cache only stores indexes
  114. 114. Table locks </li></ul>
  115. 115. InnoDB Pros/Cons Pros <ul><li>Row locking
  116. 116. ACID compliance
  117. 117. Cache holds indexes and data </li></ul>Cons <ul><li>Typically a little slower
  118. 118. More difficult to back up
  119. 119. More disk space </li></ul>
  120. 120. Index comparison MyISAM <ul><li>B-Tree (links to data from nodes as well as from leaves)
  121. 121. No auto balance
  122. 122. Optimize table will rebalance tree
  123. 123. No good way to know when to rebalance </li></ul>InnoDB <ul><li>B+Tree (links to data from leaves only)
  124. 124. Auto balances tree
  125. 125. Additional automatic hash index on primary key </li></ul>
  126. 126. Good uses for MyISAM Tables <ul><li>Logging apps
  127. 127. Read only/mostly apps
  128. 128. Bulk data loads, data crunching
  129. 129. Read/wrote with low concurrency
  130. 130. Data warehouses </li></ul>
  131. 131. Good uses for InnoDB <ul><li>If reliability is required
  132. 132. If consistent crash recovery and repair times are important
  133. 133. If there is a performance benefit from the database caching writes itself (esp if there are lots of updates in small periods of time)
  134. 134. High concurrency apps
  135. 135. Where foreign keys or transactions are required </li></ul>
  136. 136. Concurrent MyISAM/InnoDB use <ul><li>Making use of both MyISAM and InnoDB in a single database is non-optimal
  137. 137. They can't share cache space
  138. 138. The query optimizer can get confused on joins between tables of different engine types </li></ul>
  139. 139. Next Topic MyISAM Optimization
  140. 140. Inserts into MyISAM <ul><li>Fixed row size vs variable row size
  141. 141. Data stored in insertion order, except: </li><ul><li>When there's deleted data, leaving empty blocks big enough to insert new row
  142. 142. When using the concurrent insert facility of MyISAM </li></ul><li>For faster population, set session var bulk_insert_buffer_size (defaults to 8MB), and do inserts using multiple values in single insert query. Also works with: LOAD DATA INFILE ... </li></ul>
  143. 143. Concurrent inserts <ul><li>Table locks normally only allow 1 writer at a time
  144. 144. Global var concurrent_insert. Values: </li><ul><li>0 – (Off) Tells the server to not allow any concurrent inserts
  145. 145. 1 – (Default) Enables concurrent insert for MyISAM tables that don't have holes
  146. 146. 2 – Enable concurrent inserts for all MyISAM tables. If the table has a hole and is in use by another thread the new row will be inserted at the end of the table. If the table is not in use, it obtains a normal WRITE lock and inserts the new row into the hole. </li></ul></ul>
  147. 147. Key Cache (aka Key Buffer) <ul><li>Caches index data
  148. 148. Defaults to a mere 16MB, ideal size would be the sum of 'du *.MYI'
  149. 149. show global status like 'key_read_requests' – number of requests for index data
  150. 150. show global status like 'key_reads' – how many requests had to go to disk
  151. 151. Hit rate is (requests - reads) / requests – want this to be > 0.95 ideally </li></ul>
  152. 152. Key Cache cont'd <ul><li>Make sure you keep at least a small (non zero) key_buffer_size, even if using primarily InnoDB because system tables are all MyISAM </li></ul>
  153. 153. Multiple Key Caches <ul><li>Reduces key cache lock contention between threads
  154. 154. Can specify which indexes go into which caches
  155. 155. All caches share the total key_buffer_size
  156. 156. Can preload indexes into a specific cache </li></ul>
  157. 157. Recommended Strategy <ul><li>For busy databases, the following is recommended: </li><ul><li>”static data” cache: about 20% of total size for tables that are never (or rarely) modified
  158. 158. ”heavy update” cache: about 20% of total size for tables that are frequently modified and/or updated
  159. 159. ”default” cache: remaining 60% to use for everything else </li></ul><li>Use 'init_file' mysqld param to point at sql file that sets up these caches, and preloads any indexes into them </li></ul>
  160. 160. Midpoint Insertion Strategy <ul><li>A cache is broken into hot/warm subchains: </li><ul><li>Initial entry: Index is read from the table and put into the end of the warm subchain. After a certain number of hits (currently 3), it gets promoted to the hot subchain.
  161. 161. Promotion: A block starts at the end of the hot subchain and is circulated through. If a block stays at the beginning of the subchain long enough, it gets demoted. Determined by key_cache_age_threshold
  162. 162. Demoted indexes are first candidates for being evicted from cache entirely </li></ul></ul>
  163. 163. Midpoint Insertion Cont'd <ul><li>key_cache_division_limit is minimum percent of index that is warm
  164. 164. Defaults to 100 (no hot subchain) </li></ul>
  165. 165. Pet Idea An idea I'd had was to keep indexes on a ram disk, providing the data set was small enough. Talking with the instructor in the class, he thought it would work. Downside is that you have to rebuild indexes after every reboot. For each table, have to: <ul><ul><li>repair table <table_name> use_frm;
  166. 166. check table <table_name> </li></ul></ul>
  167. 167. Next Topic InnoDB Optimization
  168. 168. Things to Know about InnoDB <ul><li>Performance improved significantly between 4.1 and 5.0
  169. 169. Do not let InnoDB run out of disk space. ”Bad things” will happen.
  170. 170. Too many foreign keys can slow down a database. What's an example of ”too many”? 400 tables with 1600 constraints.
  171. 171. Optimize table only really needs to be run (at most) once a month on tables with lots of deletes. This is because B+Trees auto rebalance. </li></ul>
  172. 172. Buffer Pool
  173. 173. Buffer Pool Setting <ul><li>Buffer pool caches indexes as well as data
  174. 174. Ideally (if InnoDB exclusively) want to set innodb_buffer_pool_size to 60% to 80% of memory
  175. 175. Don't set it so large that you start swapping </li></ul>
  176. 176. Log buffer <ul><li>Used to buffer writes to the InnoDB log files (used for repairing innodb tables on crash)
  177. 177. innodb_log_buffer_size – how much data can be buffered from a single transaction before writing to disk </li><ul><li>Defaults to 1MB
  178. 178. Set as high as 8MB if you have large transactions to reduce disk I/O </li></ul></ul>
  179. 179. InnoDB disk writes <ul><li>Log buffer writes to disk at COMMIT or at checkpoint
  180. 180. Buffer pool writes to disk at checkpoints
  181. 181. innodb_flush_log_at_trx_commit </li><ul><li>0: Logs flushed about 1/second
  182. 182. 1: Commit initiates flush
  183. 183. 2: log_buffer written to file, but only fsync'd every second </li></ul></ul>
  184. 184. Log File Size <ul><li>Bigger means more history, so more stability
  185. 185. Bigger means less checkpoint file/io
  186. 186. Bigger means longer recovery time on crash
  187. 187. According to optimization instructor: </li><ul><li>64 MB log file – recovery time ”a few minutes”
  188. 188. 1 GB log file – recovery time ”an hour or so” </li></ul></ul>
  189. 189. Indexes <ul><li>B+Trees where index page is (default) 16KB
  190. 190. InnoDB tries to leave 1/16 of page free for future inserts and updates
  191. 191. Insertion into table in primary key order helps this
  192. 192. Random order insertion can lead to pages only ½ full (innefficient)
  193. 193. If fill factor is less than ½, InnoDB tries to contract the index tree. </li></ul>
  194. 194. InnoDB best practices <ul><li>Use short, integer primary keys
  195. 195. Load/Insert data in primary key order
  196. 196. Increase log file size – if log file is full, all changes to buffer pool have to get written to disk immediately
  197. 197. Avoid large rollbacks
  198. 198. Avoid mass inserts – insert data in chunks
  199. 199. Use prefix keys if applicable – InnoDB does not do key compression like MyISAM </li></ul>
  200. 200. One last thing about InnoDB <ul><li>innodb_additional_mem_pool_size </li><ul><li>Size of the dictionary cache for meta data about tables. Defaults to 1MB, almost never needs to be more than 8MB </li></ul></ul>
  201. 201. MySQL Optimization The End

×