Percona Server with XtraDB
Morgan Tocker
morgan@percona.com
Percona Server and the XtraDB Storage Engine


http://www.ossez.com/forum.php?mod=viewthread&tid=26856&fromuid=426
(Source: OSSEZ)


  1. 1. Percona Server with XtraDB Morgan Tocker morgan@percona.com 1
  2. 2. Show of Hands
      ★ InnoDB?
      ★ MyISAM?
      ★ XtraDB?
      ★ MySQL 4.x?
      ★ MySQL 5.0.x?
      ★ MySQL 5.1?
      ★ MySQL 5.5?
      ★ Something else?
  3. 3. Agenda for Today ★ 1. Introduction ★ 2. “How InnoDB Works”. ★ 3. Percona Server with XtraDB. 3
  6. 6. InnoDB History
      ★ 1994 - First line of code written
      ★ 1999 - InnoDB “complete”.
      ★ 2001 - First alpha of working with MySQL.
      ★ ...
      ★ 2005 - MySQL 5.0 released with COMPACT row format.
      ★ ...
      ★ 2008 - InnoDB Plugin Announced
      ★ 2010 - InnoDB Plugin 1.1 Announced (to ship with 5.5)
      Long delays while MySQL 5.0 and 5.1 were being released.
  7. 7. InnoDB Limitations (as at 5.0)
      ★ Slow crash recovery process.
      ★ Not enough diagnostic information, particularly around threads that write data/sync.
      ★ Only one buffer pool. No QoS of mapping tables to buffer pool or pinning indexes/content to prevent eviction.
      ★ Poor multi-CPU scalability.
      ★ Broken group commit support.
      ★ No way to see contents of buffer pool.
      ★ No way to limit the memory-resident data dictionary size.
      ★ No features for warming big buffer pools on server start.
      ★ The adaptive hash does not suit all workloads.
      ★ Not able to take advantage of a more powerful IO system that can sustain multiple concurrent threads.
      ★ No real ability to configure tablespaces - just two limited options.
      ★ Page flushing is not aggressive enough, early enough leading up to checkpoints.
      ★ Insert buffer shows weakness - can be up to 1/2 the buffer pool size - and doesn’t make active attempts to be more aggressive at contracting when reaching limit.
      ★ IO read ahead assumptions have no configuration options / ability to disable.
      ★ Limited number of undo segments limits concurrent transactions to 1024.
      (Legend: marked items = patch available in some form.)
  8. 8. InnoDB Limitations (as at 5.0)
      ★ Can’t move tables between servers.
      ★ Slow statistics not available in slow query log.
      ★ Replication is not transactional.
      ★ No way to force checkpoint.
      ★ Can’t cluster on an index other than Primary key.
      ★ Opening tables is serialized by LOCK_open mutex.
      ★ No way to freeze checkpoint/flushing activity.
      ★ auto_increment scalability is very bad.
      ★ No parallel query execution plans.
      ★ Adding files to a table space must be done via configuration file, not online.
      ★ Statistics sampling is done by 10 random dives - limited control over resampling.
      ★ Index statistics don’t persist on restart and are recalculated each time.
      ★ Can’t change page sizes without recompile. Not possible to have multiple page sizes.
      ★ Diagnostics - Can’t see a history of deadlocks.
      ★ Can’t control page fill factor.
      (Legend: marked items = patch available in some form.)
  9. 9. InnoDB Limitations (as at 5.0)
      ★ InnoDB pages have checksums - a very helpful feature to detect silent corruption. The problem is there’s 2 checksums and there may be benefit from being able to change the algorithm.
      ★ Further improvements possible to IO. InnoDB’s emulated async IO may not be required. Newer system calls like fallocate/fadvise may lead to improvements.
      ★ No memory manager or effective way to limit memory use. This is both true for MySQL and the overhead consumed with InnoDB meta data.
      ★ Insert buffer does not assist for delete operations.
      ★ Dropping an index recreates the whole table.
      ★ Indexes can not be added online.
      ★ InnoDB per page memory/storage overhead could probably be reduced.
      ★ There are no features to compress/pack indexes.
      ★ There is no support for additional index algorithms (such as hash or bitmap).
      (Legend: marked items = patch available in some form.)
  10. 10. What’s the plugin?
      ★ Until recently, the InnoDB version has been tied closely to the MySQL release.
      ★ MySQL 5.1’s pluggable storage engine API -
        ✦ Developers have increased freedom to make improvements independent of MySQL.
      ★ Important Note: The default InnoDB storage engine in MySQL 5.1 is not the InnoDB plugin version.
        ✦ But the plugin version has been declared GA.
  11. 11. Enabling the InnoDB Plugin
      ★ MySQL considers the plugin 1.0 GA from 5.1.46. It is included, but not enabled in most downloads:
        ✦ [mysqld]
          ignore-builtin-innodb
          plugin-load=innodb=ha_innodb_plugin.so
      ★ See: http://dev.mysql.com/doc/refman/5.1/en/innodb.html
  12. 12. The plugin also brings ★ New features! ✦ CPU scalability, fast index creation, buffer pool tablescan resistance, fast crash recovery... ★ It is the main focus of new InnoDB development. 11
  13. 13. Why do I tell you this long story?
      ★ XtraDB is a fork of the InnoDB plugin.
        ✦ i.e. it inherits all [plugin] features.
      ★ “Fork” doesn’t necessarily mean the same as it used to.
        ✦ Think of it like aftermarket enhancements you can make to your vehicle.
        ✦ Percona rebases XtraDB against new plugin releases.
        ✦ The on-disk format in XtraDB does not change by default, i.e. you can switch InnoDB<->XtraDB all day long.
  14. 14. Release Model
      ★ Short release cycle. Changes are mostly incremental enhancements / minor features.
      ★ Rebases against new MySQL releases.
        ✦ i.e. the version number should be read as:
          Percona-Server-server-51-5.1.47-rel11.0.47.rhel5.x86_64.rpm
          (labels on the slide: Major MySQL Version, Minor MySQL Version, XtraDB Release Version)
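As a rough illustration of reading such a package name, the sketch below pulls the three labeled parts out of the file name shown on the slide. The mapping of labels to fields ("51" as the major MySQL series, "5.1.47" as the MySQL version, "rel11.0" as the XtraDB release) is an assumption drawn from the slide's labels, and `parse_package_name` is a hypothetical helper, not a Percona tool.

```python
import re

# Assumed layout, inferred from the slide's labels (not an official naming spec):
#   Percona-Server-server-<series>-<mysql version>-rel<xtradb release>...
PACKAGE_RE = re.compile(
    r"^Percona-Server-server-(?P<series>\d+)"  # e.g. "51" for the MySQL 5.1 series
    r"-(?P<mysql>\d+\.\d+\.\d+)"               # MySQL version the build is rebased on
    r"-rel(?P<release>\d+\.\d+)"               # XtraDB release version
)


def parse_package_name(filename):
    """Hypothetical helper: split a package file name into its version parts."""
    match = PACKAGE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized package name: {filename}")
    return match.groupdict()


info = parse_package_name(
    "Percona-Server-server-51-5.1.47-rel11.0.47.rhel5.x86_64.rpm"
)
print(info)
```

For the slide's example this yields series "51", MySQL version "5.1.47" and release "11.0".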
  15. 15. Releases History
      ★ Historical average of a new release every 1-2 months:
        Release-1       Dec 2008
        Release-2       Dec 2008
        Release-3       Feb 2009
        Release-4       April 2009
        Release-5       April 2009
        Release-6       July 2009
        Release-7       August 2009
        Release-8       Oct 2009
        Release-9       Jan 2010
        Release-9.1     March 2010
        Release-10      April 2010
        Release-10.1    May 2010
        Release-11*     June 2010
        Release-11.1    June 2010
        Release-11.2    July 2010
        Release-11.3    September 2010
        Release-12**    September 2010
        Release-11.4    September 2010
        Release-12.1    October 2010
        Release-11.5    October 2010
        Release-11.6    November 2010
        Release-12.3    December 2010
        Release-11.7    December 2010
        Release-12.4    December 2010
        Release-12.5    January 2011
  16. 16. So what changes? ★ Most of the enhancements fall into two different categories: ✦ Performance Improvements ✦ Operational / Usability Features 15
  17. 17. We’ll get to explaining these changes in just a second... (First I need to explain how InnoDB works).
  18. 18. Agenda for Today ★ 1. Introduction ★ 2. “How InnoDB Works”. ★ 3. Percona Server with XtraDB. 17
  19. 19. “Numbers everyone should know”
      L1 cache reference                            0.5 ns
      Branch mispredict                               5 ns
      L2 cache reference                              7 ns
      Mutex lock/unlock                              25 ns
      Main memory reference                         100 ns
      Compress 1K bytes with Zippy                3,000 ns
      Send 2K bytes over 1 Gbps network          20,000 ns
      Read 1 MB sequentially from memory        250,000 ns
      Round trip within same datacenter         500,000 ns
      Disk seek                              10,000,000 ns
      Read 1 MB sequentially from disk       20,000,000 ns
      Send packet CA->Netherlands->CA       150,000,000 ns
      See: http://www.linux-mag.com/cache/7589/1.html and Google http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
  20. 20. About Disks.
      ★ 10,000,000 ns = 10ms = 100 operations/second.
        ✦ This is about the average for a 7200RPM drive.
      ★ The actual time has dramatic variation.
        ✦ The variation is because disks are mechanical.
        ✦ We can write much faster sequentially than randomly.
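The arithmetic behind "10ms = 100 operations/second" can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes operations run strictly one after another, and the latencies are the round figures from the table above.

```python
NS_PER_SECOND = 1_000_000_000


def operations_per_second(latency_ns):
    """How many strictly serial operations of this latency fit into one second."""
    return NS_PER_SECOND / latency_ns


disk_seek_ns = 10_000_000   # "Disk seek" from the table above
memory_ref_ns = 100         # "Main memory reference" from the table above

disk_ops = operations_per_second(disk_seek_ns)
memory_ops = operations_per_second(memory_ref_ns)
print(disk_ops)     # 100.0
print(memory_ops)   # 10000000.0
```

The five orders of magnitude between the two results are the reason the rest of the talk concentrates on avoiding random disk IO.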
  21. 21. [default] Everything is buffered!
      ★ When you write to a file, here’s what happens in the Operating System:
        ✦ Blocks as written: 9, 10, 1, 4, 200, 5.
        ✦ Blocks as buffered by the OS: 1, 4, 5, 9, 10, 200
      What happens to this buffer if we lose power?
  22. 22. The OS provides a way!
      ★ $ man fsync
        Synopsis
          #include <unistd.h>
          int fsync(int fd);
          int fdatasync(int fd);
        Description
          fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) where that file resides. The call blocks until the device reports that the transfer has completed. It also flushes metadata information associated with the file (see stat(2)).
      Hint: MyISAM just writes to the OS buffer and has no durability.
      http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
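A minimal sketch of what "write, then fsync" looks like from application code, using Python's `os.fsync`, which wraps the system call described above. The helper name `durable_write` and the file name are made up for illustration; this is not how InnoDB itself is implemented.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes, returning only after the OS reports them flushed to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)   # may be a partial write
            view = view[written:]
        os.fsync(fd)                       # without this, data may sit in the OS buffer
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.log")   # hypothetical file name
    durable_write(target, b"committed\n")
    with open(target, "rb") as fh:
        content = fh.read()
```

Reading the file back proves nothing about durability, since the read is served from the same OS buffer. The guarantee comes only from `fsync` returning.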
  23. 23. Knowing this: ★ InnoDB wants to try and reduce random IO. ★ It can not (safely) rely on the operating system’s write buffering and be ACID compliant. ✦ .. and InnoDB algorithms have to compensate. 22
  24. 24. Basic Operation (High Level)
      SELECT * FROM City WHERE CountryCode='AUS'
      [Diagram components: Log Files, Buffer Pool, Tablespace]
  30. 30. Basic Operation (cont.)
      UPDATE City SET name = 'Morgansville' WHERE name = 'Brisbane' AND CountryCode='AUS'
      [Diagram components: Log Files, Buffer Pool, Tablespace]
  34. 34. Basic Operation (cont.)
      UPDATE City SET name = 'Morgansville' WHERE name = 'Brisbane' AND CountryCode='AUS'
      [Diagram components: Log Files, Buffer Pool, Tablespace; the change is drawn as a block labeled 01010]
  38. 38. Why don’t we update?
      ★ This is an optimization!
        ✦ The log file IO is sequential and much cheaper than live updates.
        ✦ The IO for the eventual updates to the tablespace can be optimized as well.
      ★ Provided that you saved enough to recover, this shouldn’t matter, should it?
  39. 39. More on Logs...
      ★ Logs are only used during recovery.
        ✦ Not even read when we need to write down dirty pages!
      ★ To figure out which pages need to be evicted we have two lists - the flush list and the LRU.
      ★ Log activities are all assigned an LSN (log sequence number).
  40. 40. Log Files, Checkpoints, etc. ★ Most database systems work this way: ✦ In Oracle the transaction logs are called “Redo Logs”. ★ The background process of syncing dirty pages is normally referred to as a “Checkpoint”. ★ InnoDB has fuzzy checkpointing. 27
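The commit, checkpoint and recovery flow from the last few slides can be sketched as a toy model. `ToyEngine` and its file names are invented for illustration, and the model is heavily simplified: real InnoDB logs page-level changes, checkpoints fuzzily, and replays only from the last checkpoint LSN, whereas this sketch replays the whole log.

```python
import os
import tempfile


class ToyEngine:
    """Toy redo log plus buffer pool; illustrative only, not InnoDB's real format."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "redo.log")
        self.data_path = os.path.join(directory, "tablespace.txt")
        self.buffer_pool = {}   # key -> value, standing in for in-memory pages
        self.dirty = set()      # keys changed since the last checkpoint
        self.lsn = 0            # log sequence number of the last record

    def commit(self, key, value):
        # 1. Append the change to the log (sequential IO) and fsync it.
        self.lsn += 1
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"{self.lsn}\t{key}\t{value}\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Change only the in-memory copy; the tablespace is not touched yet.
        self.buffer_pool[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Background work: write dirty entries to the tablespace, then sync.
        on_disk = self._read_tablespace()
        for key in self.dirty:
            on_disk[key] = self.buffer_pool[key]
        with open(self.data_path, "w", encoding="utf-8") as data:
            for key, value in sorted(on_disk.items()):
                data.write(f"{key}\t{value}\n")
            data.flush()
            os.fsync(data.fileno())
        self.dirty.clear()

    def _read_tablespace(self):
        if not os.path.exists(self.data_path):
            return {}
        with open(self.data_path, encoding="utf-8") as data:
            return dict(line.rstrip("\n").split("\t") for line in data)

    def recover(self):
        # After a crash: start from the tablespace and replay the log on top.
        state = self._read_tablespace()
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    _lsn, key, value = line.rstrip("\n").split("\t")
                    state[key] = value
        return state


with tempfile.TemporaryDirectory() as tmp:
    engine = ToyEngine(tmp)
    engine.commit("Brisbane", "Morgansville")
    tablespace_before = engine._read_tablespace()   # still empty: no checkpoint yet
    recovered = ToyEngine(tmp).recover()            # simulated restart after a crash
```

Even though the tablespace was never updated, the simulated restart recovers the committed change from the log alone, which is the point of "Nothing else has to be done" on commit.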
  41. 41. Log Writing
      ★ You can increase innodb_log_file_size. This will allow InnoDB to “smooth out” background IO for longer.
        ✦ Tip: Optionally you could change innodb_log_files_in_group as well. Be aware that your effective log file size is innodb_log_file_size * innodb_log_files_in_group
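The multiplication in the tip is easy to get wrong when only one of the two settings is changed, so here is a small worked example. The 256 MiB and 2-file values are hypothetical settings chosen for illustration, not recommendations from the slides.

```python
MIB = 1024 * 1024


def effective_log_bytes(innodb_log_file_size, innodb_log_files_in_group):
    """Total redo log space: size of one file times the number of files in the group."""
    return innodb_log_file_size * innodb_log_files_in_group


# Hypothetical settings, for illustration only.
total = effective_log_bytes(256 * MIB, 2)
print(total // MIB)   # 512
```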
  42. 42. Log Writing (cont.)
      ★ You can also change innodb_flush_log_at_trx_commit to 0 or 2 to reduce the durability requirements of this write.
        ✦ Requires less flushing - particularly helpful on systems without writeback caches!
      ★ innodb_log_buffer_size may also help buffer changes for longer before writing to the logs.
        ✦ Very workload dependent - tends to be more helpful for writing big TEXT/BLOB changes.
  43. 43. In summary ★ On commit, the log has to be flushed to guarantee changes. ✦ Nothing else has to be done. ★ What visibility do we have into these operations? ★ How do we decide how much background work to do per second? ★ What happens if we fall behind in background work? 30
  44. 44. Agenda for Today ★ 1. Introduction ★ 2. “How InnoDB Works”. ★ 3. Percona Server with XtraDB. 31
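  45. 45. Terminology
      +--------------------------------------------+------------+----------------------------+---------+
      | Oracle Product                             | License    | Percona Equivalent Product | License |
      +--------------------------------------------+------------+----------------------------+---------+
      | MySQL Server                               | GPL        | Percona Server             | GPL     |
      | The InnoDB Storage Engine (Plugin edition) | GPL        | The XtraDB Storage Engine  | GPL     |
      | InnoDB Hot Backup                          | Commercial | XtraBackup                 | GPL     |
      +--------------------------------------------+------------+----------------------------+---------+
      (GPL = Completely free for you to use. Support not included.)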
  46. 46. Some changes come free ★ Some of the features from the InnoDB plugin... 33
  47. 47. Fast Index Creation (inherited via Plugin)
      ★ In built-in InnoDB, simple statements require the table to be completely rebuilt:
        ✦ ALTER TABLE my_table ADD INDEX my_idx (col1);
        ✦ CREATE INDEX my_idx ON my_table (col1);
        ✦ ALTER TABLE my_table DROP INDEX my_idx;
      ★ In InnoDB plugin, these statements just recreate the index, provided that:
        ✦ The index is not the primary key.
        ✦ The index does not use the UTF-8 character set (BUG #33650)
  48. 48. Fast Index Creation (cont.)
      ★ Fast index creation only requires a READ LOCK.
      ★ When the fast indexes are created, they are first pre-sorted, and then inserted in order.
        ✦ This results in a better index fill factor, and a faster index insertion.
  49. 49. IO scalability (inherited via Plugin)
      ★ --innodb_io_capacity - Set the number of IO operations per second the server is capable of to influence background thread algorithms (default 200).
        ✦ 100 IOPS is about the capability of a single 7200RPM disk.
      ★ --innodb_read_io_threads and --innodb_write_io_threads - Using multiple IO threads will often lead to better performance on bigger raid systems.
  50. 50. CPU Scalability (inherited via Plugin)
      ★ InnoDB doesn’t perform so well on systems with a lot of CPUs/cores.
      ★ 3-4 main patches:
        ✦ Faster locking on Linux, Windows and Solaris.
        ✦ Option to disable InnoDB internal memory allocator.
        ✦ Improvements to thread concurrency settings.
        ✦ Using the PAUSE instruction in InnoDB spin loops.
      ★ Most of these changes are transparent!
  51. 51. What’s a Mutex? 38 Ima Server Thread #1 Thread #2 Thread #3 Thread #4 4 Connections
  52. 52. What’s a Mutex? (cont.) 39 Ima Server Thread #1 Thread #2 Thread #3 Thread #4 4 Connections 4-1 = 3 4-1 = 3 X X
  53. 53. What’s a Mutex? (cont.) 40 Ima Server Thread #3 Thread #4 3 Connections
  54. 54. Mutexes become hotspots ★ The longer the mutex is held, the more likely you can hold up other tasks - and reduce CPU scalability: 41 CPUs in use
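The mutex slides show two threads both computing "4-1 = 3" against a shared connection counter, which loses one of the updates. A minimal sketch of the fix, with a lock around the read-modify-write, might look like this. `ConnectionCounter` is a made-up class for illustration and has nothing to do with the actual server code.

```python
import threading


class ConnectionCounter:
    """Shared counter whose read-modify-write is protected by a mutex."""

    def __init__(self, connections):
        self.connections = connections
        self._mutex = threading.Lock()

    def disconnect(self):
        # Only one thread at a time may read the counter and write it back.
        # Without the lock, two threads could both read 4 and both store 3.
        with self._mutex:
            current = self.connections
            self.connections = current - 1


counter = ConnectionCounter(4)
threads = [threading.Thread(target=counter.disconnect) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(counter.connections)   # 2
```

The cost is the one named on the "hotspots" slide: while one thread holds the lock every other thread that needs it waits, so the shorter the critical section, the better the CPU scalability.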
  55. 55. Adaptive Flushing (first invented in Percona Server)
      ★ Handle background work more aggressively as log space runs out.
      * Adaptive Checkpointing also available in Percona Server
      http://www.mysqlperformanceblog.com/2008/11/13/adaptive-checkpointing/
  56. 56. Fast Crash Recovery (first invented in Percona Server)
      ★ Crash recovery in InnoDB can be very slow. From MySQL BUG #29847:
        [28 Oct 2008 21:40] James Day
        One reported effect of this performance limitation is that a system with 24GB buffer pool size could only recover 10% after 2 hour. With a 4G buffer pool and innodb_flush_method=O_DIRECT removed the system recovered completely in 30 minutes.
        Partial workarounds.
        1. During recovery, temporarily reduce innodb_buffer_pool_size to force InnoDB to flush pages from the flush list. A value of 4G is likely to be reasonable.
        2. During recovery, temporarily remove O_DIRECT so that the operating system can cache changes during recovery.
  57. 57. Performance Improvements
      ★ Improved Buffer Pool Scalability
      ★ Improved IO Path + adaptive checkpointing
      ★ Improved Rollback Segment Scalability*
      ★ Separate purge thread
      ★ Data dictionary memory consumption controls
      ★ Increased number of undo slots*
      ★ Faster page checksums*
      ★ Support for different page sizes*
      ★ Insert buffer controls
      ★ Completely disable the query cache.
      ★ Remove excess fcntl calls
      ★ Per session configuration of innodb_flush_log_at_trx_commit
      ★ Separate location of double write buffer*
      ★ Strip comments before using query cache
      ★ Transaction logs larger than 4G supported.
      * Changes on disk format (not backwards compatible)
  58. 58. Improved Buffer Pool Scalability
      ★ Additional patches to what is already available in the InnoDB Plugin.
        ✦ Splits the buffer pool mutex into smaller mutexes:
          • Flush list mutex
          • LRU mutex
          • Free mutex
          • Hash mutex
  59. 59. Data Dictionary control
      ★ Once an InnoDB table is opened it is never freed from the in-memory data dictionary (which is unlimited in size).
      ★ XtraDB introduces:
        ✦ --innodb_dict_size_limit - a configuration item in bytes.
        ✦ innodb_dict_tables - a status variable for the number of entries in the cache.
  60. 60. Undo Slots
      ★ In the built-in InnoDB, the number of undo slots is limited to 1024.
        ✦ This means the number of open transactions is limited to 1023. See: http://bugs.mysql.com/bug.php?id=26590
        ✦ Some statements require 2 undo slots.
      ★ In XtraDB, this is expanded to 4072 with --innodb_extra_undoslots=1.
      Warning: This is binary format incompatible!
  61. 61. Rollback Segments
      ★ In XtraDB, it’s also possible to have more than one rollback segment.
        ✦ Each segment contains undo slots.
      ★ Configuration is --innodb-extra-rsegments=N
      ★ This has the added effect of reducing mutex contention on the rollback segment:
        ✦ “Mutex at 0x1b3e3e78 created file trx/trx0rseg.c line 167”
      http://www.percona.com/docs/wiki/percona-xtradb:patch:innodb_extra_rseg
      http://www.mysqlperformanceblog.com/2009/10/14/tuning-for-heavy-writing-workloads/
      Warning: This is binary format incompatible!
  62. 62. Fast Checksums ★ The InnoDB page checksum computation is slower than it needs to be. ★ XtraDB has the option to use a faster checksum format. 49 Warning: This is binary format incompatible!
  63. 63. Different Page Sizes ★ XtraDB now has support for different page sizes - 4K, 8K, 16K. 50 Warning: This is binary format incompatible!
  64. 64. Separate Purge Thread
      ★ Cleans up a long history list length faster:
      See: http://www.mysqlperformanceblog.com/2009/10/14/tuning-for-heavy-writing-workloads/
  65. 65. Usability Enhancements
      ★ Show contents of the buffer pool
      ★ Import / export of innodb_file_per_table tables
      ★ Import / export of buffer pool contents
      ★ Transactional Replication
      ★ Better handling of corrupted tables
      ★ Store buffer pool in shared memory segment
      ★ Save index statistics between restarts
      ★ Advise in processlist when waiting on Query cache mutex.
      ★ Improved slow query log
      ★ User / Index / Table statistics
      ★ Disable automatic statistics regeneration
      ★ Show data dictionary
      ★ Deadlock counter
      ★ Show Temporary Tables
      ★ Log connection errors
      ★ Retain query response time distribution.
  66. 66. Show Buffer Pool Contents
      mysql> SELECT d.*,round(100*cnt*16384/(data_length+index_length),2) fit
             FROM (SELECT schema_name,table_name,count(*) cnt,sum(dirty),sum(hashed)
                   FROM INNODB_BUFFER_POOL_PAGES_INDEX
                   GROUP BY schema_name,table_name
                   ORDER BY cnt DESC LIMIT 20) d
             JOIN TABLES ON (TABLES.table_schema=d.schema_name AND TABLES.table_name=d.table_name);
      +-------------+---------------------+---------+------------+-------------+--------+
      | schema_name | table_name          | cnt     | sum(dirty) | sum(hashed) | fit    |
      +-------------+---------------------+---------+------------+-------------+--------+
      | db          | table1              | 1699133 |      13296 |      385841 |  87.49 |
      | db          | table2              | 1173272 |      17399 |       11099 |  98.42 |
      | db          | table3              |  916641 |       7849 |       15316 |  94.77 |
      | db          | table4              |   86999 |       1555 |       75554 |  87.42 |
      | db          | table5              |   32701 |       7997 |       30082 |  91.61 |
      | db          | table6              |   31990 |       4495 |       25681 | 102.97 |
      | db          | table7              |       1 |          0 |           0 | 100.00 |
      +-------------+---------------------+---------+------------+-------------+--------+
      7 rows in set (26.45 sec)
      Source: http://www.mysqlperformanceblog.com/2010/03/26/tables-fit-buffer-poo/
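The `fit` column is plain arithmetic: cached pages times the 16384-byte page size, as a percentage of the table's data plus index length. A small sketch of the same formula follows. The function name and the sample inputs are illustrative assumptions, since the slide does not show `data_length` or `index_length` for its tables.

```python
PAGE_SIZE = 16384   # page size in bytes, as hard-coded in the query above


def buffer_pool_fit(cached_pages, data_length, index_length):
    """Percentage of a table (data + indexes) currently held in the buffer pool."""
    return round(100 * cached_pages * PAGE_SIZE / (data_length + index_length), 2)


# Illustrative inputs only; the slide does not list the table sizes.
one_page_table = buffer_pool_fit(1, 16384, 0)
half_cached = buffer_pool_fit(50, 100 * 16384, 0)
print(one_page_table)   # 100.0
print(half_cached)      # 50.0
```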
  67. 67. Save Buffer Pool Contents
      ★ Export the contents of the buffer pool to a file called ‘ib_lru_dump’ in the data directory:
        ✦ SELECT * FROM information_schema.XTRADB_ADMIN_COMMAND /*! XTRA_LRU_DUMP*/;
      ★ Restore from ib_lru_dump:
        ✦ SELECT * FROM information_schema.XTRADB_ADMIN_COMMAND /*! XTRA_LRU_RESTORE*/;
      Note: Not the actual contents - it takes 8 bytes to remember the address of a 16K page.
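The note about 8 bytes per 16K page gives a quick way to estimate how large the dump file will be. The sketch below assumes a completely full buffer pool and no file header or other overhead, neither of which the slides specify, and uses the 24GB pool size mentioned earlier in the crash recovery quote.

```python
PAGE_SIZE = 16 * 1024    # bytes per page
BYTES_PER_ENTRY = 8      # bytes needed to remember one page address (from the note above)
GIB = 1024 ** 3


def lru_dump_bytes(buffer_pool_bytes):
    """Estimated dump size, assuming a full pool and no file overhead."""
    pages = buffer_pool_bytes // PAGE_SIZE
    return pages * BYTES_PER_ENTRY


estimate = lru_dump_bytes(24 * GIB)
print(estimate)                    # 12582912
print(estimate // (1024 * 1024))   # 12
```

So remembering what was in a 24 GiB buffer pool takes roughly 12 MiB on disk under these assumptions.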
  68. 68. Transactional Replication ★ More resilience from slaves crashing - XtraDB no longer relies on the relay-log.info file. ✦ Instead log coordination is stored in InnoDB tables internally. 55
  69. 69. Import/Export tables
      ★ Because --innodb-file-per-table still has information (data dictionary, undo) in the global tablespace, you can’t just back it up by itself.
      ★ With a new setting, --innodb_expand_import=1, this is no longer the case.
      ★ Tip: The import/export still has to be done with XtraBackup.
      Documentation available here: http://www.percona.com/docs/wiki/percona-xtradb:patch:innodb_expand_import
  70. 70. Better Handling of Corrupt Tables ★ Instead of crashing the server when a table is discovered as corrupt, just mark the table as corrupt and continue. 57
  71. 71. The Slow Query Log
      Percona Server:
      $ mysql -e "SET GLOBAL log_slow_verbosity = 'full';"
      $ tail /var/log/mysql.slow
      ..
      # Time: 100924 13:58:47
      # User@Host: root[root] @ localhost []
      # Thread_id: 10 Schema: imdb Last_errno: 0 Killed: 0
      # Query_time: 399.563977 Lock_time: 0.000110 Rows_sent: 1 Rows_examined: 46313608 Rows_affected: 0 Rows_read: 1
      # Bytes_sent: 131 Tmp_tables: 1 Tmp_disk_tables: 1 Tmp_table_sizes: 25194923
      # InnoDB_trx_id: 1403
      # QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: Yes
      # Filesort: Yes Filesort_on_disk: Yes Merge_passes: 5
      # InnoDB_IO_r_ops: 1064749 InnoDB_IO_r_bytes: 17444847616 InnoDB_IO_r_wait: 26.935662
      # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
      # InnoDB_pages_distinct: 65329
      SET timestamp=1285336727;
      select STRAIGHT_JOIN count(*) as c, person_id FROM cast_info FORCE INDEX(person_id) INNER JOIN title ON (cast_info.movie_id=title.id) WHERE title.kind_id = 1 GROUP BY cast_info.person_id ORDER by c DESC LIMIT 1;

      MySQL Server:
      $ tail /var/log/mysql.slow
      ..
      # Time: 100924 13:58:47
      # User@Host: root[root] @ localhost []
      # Query_time: 399.563977 Lock_time: 0.000110 Rows_sent: 1 Rows_examined: 46313608
      SET timestamp=1285336727;
      select STRAIGHT_JOIN count(*) as c, person_id FROM cast_info FORCE INDEX(person_id) INNER JOIN title ON (cast_info.movie_id=title.id) WHERE title.kind_id = 1 GROUP BY cast_info.person_id ORDER by c DESC LIMIT 1;
  72. 72. User Statistics
      Percona Server:
      mysql> SET GLOBAL userstat_running = 1;
      mysql> SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME
             FROM information_schema.statistics `s`
             LEFT JOIN information_schema.index_statistics `IS`
               ON (s.TABLE_SCHEMA = `IS`.TABLE_SCHEMA AND s.TABLE_NAME = `IS`.TABLE_NAME AND s.INDEX_NAME = `IS`.INDEX_NAME)
             WHERE `IS`.TABLE_SCHEMA IS NULL;
      +--------------+---------------------------+-----------------+
      | TABLE_SCHEMA | TABLE_NAME                | INDEX_NAME      |
      +--------------+---------------------------+-----------------+
      | art100       | article100                | ext_key         |
      | art100       | article100                | site_id         |
      | art100       | article100                | hash            |
      | art100       | article100                | forum_id_2      |
      | art100       | article100                | published       |
      | art100       | article100                | inserted        |
      | art100       | article100                | site_id_2       |
      | art100       | author100                 | PRIMARY         |
      | art100       | author100                 | site_id         |
      ...
      +--------------+---------------------------+-----------------+
      1150 rows in set (1 min 44.23 sec)

      MySQL Server: (Not Possible)
  73. 73. (Related) Xtrabackup Features
      ★ Report on fragmentation of indexes:
      ★ $ xtrabackup --stats --tables=art.link* --datadir=/mnt/data/mysql/
        ...
        table: art/link_out104, index: PRIMARY, space id: 12, root page 3
        leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%
        ...
      http://www.mysqlperformanceblog.com/2009/09/14/statistics-of-innodb-tables-and-indexes-available-in-xtrabackup/
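The `data/pages=91%` figure in the sample output can be reproduced from the other two numbers on the same line, assuming the default 16384-byte page size and that the percentage is truncated rather than rounded. Both are assumptions that happen to match the output shown, not documented xtrabackup behavior.

```python
PAGE_SIZE = 16384   # assumed default page size in bytes


def leaf_fill_percent(data_bytes, pages):
    """Share of the leaf pages' total capacity occupied by data, truncated to a whole percent."""
    return 100 * data_bytes // (pages * PAGE_SIZE)


fill = leaf_fill_percent(7492026403, 497839)   # values from the sample output above
print(fill)   # 91
```

A value well below 100% on a large index suggests fragmentation, which is the use case the slide names for this report.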
  74. 74. The End ★ Questions? 61
