cPanelCon 2014: InnoDB Anatomy

Presented at cPanel Conference 2015 in Houston, TX.

Published in: Technology

  1. 1. InnoDB Anatomy
  2. 2. The InnoDB Engine Introduction to InnoDB • Currently the default in MySQL (as of 5.5) • Referential/Structural Integrity • Consistent data • Transactional
  3. 3. InnoDB Anatomy – ACID Compliance: ATOMICITY
      InnoDB is atomic in that its transactions have only two possible outcomes: complete fully, or fail completely.
      • SUCCESS (COMMIT): All changes committed. Changes Applied.
      • FAILURE (ROLLBACK): All changes rolled back. Unchanged Data.
  4. 4. Introduction to InnoDB Structure: CONSISTENCY
      • Data stays consistent before, during, and after a transaction.
      • No conflict of “versions”
      • Successful transactions end with a commit. InnoDB maintains optimism.
      Diagram: Valid State, then Work performed, then Still a valid state
  5. 5. Introduction to InnoDB Structure: ISOLATION
      • Transactions cannot interact with each other
      • Adjustable level of isolation.*
      • Row-level locking
      * Isolation level changed via the transaction-isolation configuration option.
  6. 6. Introduction to InnoDB Structure: DURABILITY
      • Atomic transactions keep data durable.
      • Changes are permanent once committed.
      • Doublewrite buffer helps to recover from crashes that occur during page writes.
  7. 7. Explore the InnoDB structure within the file system, and at its lower levels, to find out how it can affect database operations. InnoDB Anatomy The Goal
  8. 8. Understanding the Physical Structure of Data in InnoDB
  9. 9. InnoDB Anatomy – Physical Structure: Physical File Structure
      DataDirectory (Default: /var/lib/mysql)
        ibdata1        System Tablespace
        ib_logfile0    Redo/Transaction Log File
        ib_logfile1    Redo/Transaction Log File
        Database Folder
          table.ibd    Tablespace File
          table.frm    Format File
  10. 10. InnoDB Anatomy – Physical Structure
      InnoDB        File System
      Tablespace    Partition
      Page          Disk Block
      Extent        Allocation Unit
      Segment       File
      Inode         Inode
  11. 11. InnoDB Anatomy – Physical Structure: Tablespace (IBD File)
      Diagram labels: File Header, Insert Buffer Bitmap, First Inode Page Number, User Records (Index Pages)…
  12. 12. InnoDB Anatomy – Physical Structure: PAGE (16384 bytes)
      FIL Header - 38 bytes
      …
      FIL Trailer - 8 bytes
  13. 13. InnoDB Anatomy – Physical Structure: EXTENT
      1MB = Page 1 … Page 64
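The page and extent sizes on these slides reduce to simple arithmetic. Below is a minimal Python sketch, assuming the default 16384-byte page and 64 pages per extent shown above, and assuming pages are numbered from zero within the file. The function names are illustrative and not part of InnoDB.

```python
PAGE_SIZE = 16384                            # bytes per page (default, per the slides)
PAGES_PER_EXTENT = 64                        # pages per extent (per the slides)
EXTENT_SIZE = PAGE_SIZE * PAGES_PER_EXTENT   # 64 * 16384 bytes = 1 MB


def page_offset(page_no):
    """Byte offset of a page from the start of the tablespace file."""
    return page_no * PAGE_SIZE


def extent_of(page_no):
    """Zero-based index of the extent that holds the given page."""
    return page_no // PAGES_PER_EXTENT
```

For example, `page_offset(6)` gives the starting byte of page 6, the same value the demonstration slides compute with `expr`.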
  14. 14. InnoDB Anatomy – Physical Structure: SEGMENT
      Extent, Extent, …
  15. 15. InnoDB Anatomy – Physical Structure: Inode
      File segment ID
      Extent Data (Free/Partial/Full Listing)
  16. 16. InnoDB Anatomy – Physical Structure
      Diagram: a Tablespace containing Segments (4 shown), which in turn contain Extents (7 shown)
  17. 17. InnoDB Anatomy – System Tablespace
      SYSTEM TABLESPACE (ibdata File)
        Undo Logs: Rollback Segment, Rollback Segment
        Data Dictionary: SYS_TABLES, SYS_INDEXES, SYS_COLUMNS, SYS_FIELDS
        Doublewrite Buffer: Block 1 (64 Pages), Block 2 (64 Pages)
        Change Buffer: Insert Buffering, Update Buffering, Purge Buffering
        Undo Log Space: Rollback Segment …
  18. 18. InnoDB Anatomy – System Tablespace: How can you use this?
      Separate your Undo Logs (5.6+ only)
      Options: innodb_undo_logs, innodb_undo_tablespaces, innodb_undo_directory
      I/O to the undo logs is random rather than sequential, unlike some other areas of InnoDB. It therefore makes sense to move your undo log tablespaces out of the system tablespace and onto a disk that handles random reads and writes more effectively, such as an SSD.
      Use the Information Schema, or innodb_table_monitor, to view the Data Dictionary table data
      In 5.6, the information_schema database contains INNODB_SYS* tables that allow you to view data dictionary information directly in MySQL. Alternatively, you can create a table called “innodb_table_monitor” to dump the data dictionary into the MySQL error log.
  19. 19. InnoDB Anatomy – System Tablespace: How can you use this?
      Do you need the Doublewrite buffer?
      Option: innodb_doublewrite
      With the Doublewrite buffer enabled, there is a 5-10% impact on I/O. If you operate on a transactional file system, you can disable it to avoid this impact.
      Customizing Change Buffering for your Workload
      Option: innodb_change_buffering
      Change buffering, by default in 5.5+, encompasses insert, update, and delete buffering. If your workload consists almost entirely of one type of operation, it can make sense to limit this to only that type of buffering.
  20. 20. InnoDB in Memory and on Disk
      Memory: Buffer Pool, Insert Buffer, Log Buffer, Additional Memory
      Disk: System Tablespace, Doublewrite Buffer, Transaction Log Files, Insert Buffer, Undo Logs, Rollback Segment, Data Dictionary, Tablespace Files
      Other diagram labels: Undo Buffering Indexing Thread Processing
  21. 21. InnoDB Anatomy – Pages: Page Headers/Trailers
      FIL Header (38)
      Name                      Byte Length  Offset  Description
      FIL_PAGE_SPACE            4            0       Checksum since 4.0.14 (space ID in earlier versions)
      FIL_PAGE_OFFSET           4            4       Page Number
      FIL_PAGE_PREV             4            8       Previous Page (in key order)
      FIL_PAGE_NEXT             4            12      Next Page (in key order)
      FIL_PAGE_LSN              8            16      LSN of page’s latest log record
      FIL_PAGE_TYPE             2            24      Page Type
      FIL_PAGE_FILE_FLUSH_LSN   8            26      Flushed-up-to LSN (only in space ID 0, page 0)
      FIL_PAGE_ARCH_LOG_NO      4            34      Space ID since 4.1 (latest archived log number in earlier versions)
      FIL Trailer (8)
      Name                      Byte Length  Offset  Description
      FIL_PAGE_END_LSN          8            16376   First 4 bytes: old-style checksum; last 4 bytes: low 4 bytes of FIL_PAGE_LSN
      Source: storage/innobase/include/fil0fil.h
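The header layout above maps directly onto a fixed-width binary record. Here is a minimal Python sketch that unpacks the 38-byte FIL header from the start of a page, assuming big-endian integers (consistent with the byte order used in the demonstration slides). The lowercase field names mirror the table's names, and the sample page is synthetic, with made-up values rather than data from a real tablespace.

```python
import struct
from collections import namedtuple

# Field widths from the table: 4, 4, 4, 4, 8, 2, 8, 4 bytes = 38 bytes.
FIL_HEADER = struct.Struct(">IIIIQHQI")

FilHeader = namedtuple("FilHeader", [
    "fil_page_space", "fil_page_offset", "fil_page_prev", "fil_page_next",
    "fil_page_lsn", "fil_page_type", "fil_page_file_flush_lsn",
    "fil_page_arch_log_no",
])


def parse_fil_header(page):
    """Unpack the FIL header from the first 38 bytes of a page."""
    return FilHeader(*FIL_HEADER.unpack_from(page, 0))


# Synthetic 16384-byte page with made-up header values, for illustration only.
sample_page = bytearray(16384)
FIL_HEADER.pack_into(sample_page, 0, 0x7A11750D, 6, 5, 7, 123456, 17855, 0, 42)
header = parse_fil_header(sample_page)
```

In practice the `page` argument would be 16384 bytes read from an .ibd file at the page's starting offset.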
  22. 22. InnoDB Anatomy – Demonstration: Changing values directly. How can you use this?
      At the byte level, these values can be changed directly in many situations to “trick” InnoDB in one way or another. One good example is getting around a page checksum failure: you can change the stored checksums to match the calculated checksums, bypassing the crash and often giving you sufficient access to your records.
      Error log output:
        InnoDB: Page checksum 2047964429, prior-to-4.0.14-form checksum 4196043695
        InnoDB: stored checksum 1873408413, prior-to-4.0.14-form stored checksum 1946395024
      Convert the calculated checksums to hex:
        # printf '%X\n' 2047964429; printf '%X\n' 4196043695
        7A11750D    (primary calculated checksum)
        FA1A8BAF    (“old-style” calculated checksum)
      Find the starting byte offset of the page (example: page 6):
        # expr 16384 \* 6
        98304
      Write the primary calculated checksum over the stored value at the start of page 6:
        # printf '\x7A\x11\x75\x0D' | dd of=table.ibd bs=1 seek=98304 count=4 conv=notrunc
      Write the “old-style” calculated checksum over the stored value in the FIL trailer of page 6 (98304 + 16376 = 114680):
        # printf '\xFA\x1A\x8B\xAF' | dd of=table.ibd bs=1 seek=114680 count=4 conv=notrunc
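The same patch can be sketched in Python against an in-memory copy of the tablespace bytes, which avoids shell quoting pitfalls. This is an illustrative sketch only: `patch_checksums` is a hypothetical helper, the image is a blank stand-in rather than a real table.ibd, and writing the old-style value into the first four bytes of the FIL trailer (offset 16376 within the page) is this sketch's reading of the header/trailer layout.

```python
import struct

PAGE_SIZE = 16384
TRAILER_OFFSET = PAGE_SIZE - 8   # FIL trailer starts at byte 16376 of the page


def patch_checksums(data, page_no, new_checksum, old_checksum):
    """Overwrite both stored checksums of one page inside a bytearray."""
    start = page_no * PAGE_SIZE
    if start + PAGE_SIZE > len(data):
        raise ValueError("page %d is beyond the end of the data" % page_no)
    # Primary checksum: first 4 bytes of the page, big-endian.
    struct.pack_into(">I", data, start, new_checksum)
    # Old-style checksum: first 4 bytes of the 8-byte trailer.
    struct.pack_into(">I", data, start + TRAILER_OFFSET, old_checksum)


# Blank 7-page stand-in for a tablespace image (pages 0 through 6).
image = bytearray(7 * PAGE_SIZE)
patch_checksums(image, 6, 2047964429, 4196043695)
```

On real data, it would be prudent to load a backup copy of the .ibd file into the bytearray, patch it, and write the result to a new file, not edit the original in place.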
  23. 23. InnoDB Anatomy – Redo Logs: Structure
      The Redo Logs
      • Stored in 2 files by default (ib_logfile0/1)
      • Treated as single file
      • Circular buffer
      The Logical Log File
      • Log blocks are 512 bytes
      • Each block contains checkpoint data
      Diagram: ib_logfile0 and ib_logfile1 shown as a sequence of LOG BLOCKs, each laid out as Header (12), Log Records, Trailer (4)
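The block sizes above imply how much of each 512-byte block is available for log records. Here is a small Python sketch of that arithmetic, assuming records simply fill consecutive blocks. `blocks_needed` is an illustrative helper, not an InnoDB function.

```python
LOG_BLOCK_SIZE = 512      # bytes per log block (per the slide)
LOG_BLOCK_HEADER = 12     # header bytes per block
LOG_BLOCK_TRAILER = 4     # trailer bytes per block
LOG_BLOCK_PAYLOAD = LOG_BLOCK_SIZE - LOG_BLOCK_HEADER - LOG_BLOCK_TRAILER


def blocks_needed(record_bytes):
    """Number of log blocks required to hold the given bytes of log records."""
    # Ceiling division using integers only.
    return -(-record_bytes // LOG_BLOCK_PAYLOAD)
```

Each block therefore carries 496 bytes of log records, with the remaining 16 bytes used by the header and trailer.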
  24. 24. InnoDB Anatomy – Redo Logs: How can you use this?
      Optimized log file size
      Option: innodb_log_file_size
      A larger size means less checkpoint flushing is required, reducing I/O impact. Balance this against the longer crash recovery time that larger logs can require (less of an issue in 5.6).
      General Formula: (LSN 60 seconds later - Current LSN) * 60 / 1024 / 1024, which estimates the MB of redo log written per hour.
      Optimized log buffer size
      Option: innodb_log_buffer_size
      The log buffer allows transactions to move forward without having to write the log to disk before commit. An increased size allows larger transactions to run without requiring writes to disk before a commit is performed.
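The general formula can be worked through in a few lines of Python. This sketch takes two LSN readings 60 seconds apart and extrapolates to megabytes of redo written per hour. `redo_mb_per_hour` is a hypothetical helper, and the readings would need to be collected separately (for example, from the server's InnoDB status output).

```python
def redo_mb_per_hour(lsn_start, lsn_after_60s):
    """Estimate MB of redo log written per hour from two LSNs taken 60 s apart."""
    bytes_per_minute = lsn_after_60s - lsn_start
    if bytes_per_minute < 0:
        raise ValueError("the later LSN must not be smaller than the earlier one")
    # 60 minutes per hour, then bytes to KB to MB.
    return bytes_per_minute * 60 / 1024 / 1024
```

For instance, if the LSN advances by 1048576 bytes in one minute, the estimate is 60 MB per hour.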
  25. 25. InnoDB Anatomy – Index Pages: INDEX Pages - B+Tree Structure
      • Efficient method of storing data on disk in a tree format.
      • Actual records are stored in leaf pages (level 0).
      • The root page sits at the top of the tree structure.
      • Non-leaf pages contain only keys and pointers to pages on the level below.
      Diagram: Level 2: Root; Level 1: Non-Leaf, Non-Leaf; Level 0: Leaf, Leaf, Leaf
  26. 26. InnoDB Anatomy – Index Pages: B+Tree Structure – Basic Index Example
      Root Node: Customer IDs 1-500
        Non-Leaf 1-250
          Leaf Node 1-10
          Leaf Node 11-20
          …
        Non-Leaf 251-500
          Leaf Node 251-260
          Leaf Node 261-270
          …
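The customer ID example can be modeled with a toy in-memory tree to show how a lookup descends from the root to a leaf. This is a simplified sketch using Python dictionaries as stand-ins for pages, not InnoDB's on-disk format. The 10-IDs-per-leaf split follows the slide, and all function names are illustrative.

```python
from bisect import bisect_right


def make_leaf(lo, hi):
    """Leaf page (level 0) holding the keys lo..hi."""
    return {"level": 0, "keys": list(range(lo, hi + 1))}


def min_key(page):
    """Smallest key reachable from a page."""
    return page["keys"][0] if page["level"] == 0 else page["min_keys"][0]


def make_node(level, children):
    """Non-leaf page: one minimum key plus one child pointer per child."""
    return {"level": level,
            "min_keys": [min_key(c) for c in children],
            "children": children}


def find_leaf(page, key):
    """Descend from the given page to the leaf whose range covers key."""
    path = [page["level"]]
    while page["level"] > 0:
        # Pick the last child whose minimum key is <= the search key.
        i = max(bisect_right(page["min_keys"], key) - 1, 0)
        page = page["children"][i]
        path.append(page["level"])
    return page, path


leaves = [make_leaf(lo, lo + 9) for lo in range(1, 501, 10)]   # 50 leaf pages
root = make_node(2, [make_node(1, leaves[:25]), make_node(1, leaves[25:])])
leaf, path = find_leaf(root, 257)
```

Looking up customer ID 257 visits the root (level 2), the 251-500 non-leaf page (level 1), and finally the 251-260 leaf page (level 0).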
  27. 27. InnoDB Anatomy – Index Pages: INDEX Pages
      • Records are not physically stored in key order
      • User data “grows down”
      • Page directory “grows up”
      Page layout, top to bottom: FIL Header (38), INDEX Header (36), FSEG Header (20), System Records (26), User Data …, EMPTY, … Page Directory, FIL Trailer (8)
  28. 28. InnoDB Anatomy – Index Pages: INDEX Page Header (after FIL Header)
      Name               Byte Length  Offset   Description
      PAGE_N_DIR_SLOTS   2            38 + 0   Number of slots in page directory
      PAGE_HEAP_TOP      2            38 + 2   Pointer to record heap top
      PAGE_N_HEAP        2            38 + 4   Number of records in heap
      PAGE_FREE          2            38 + 6   Pointer to start of page’s free-record list
      PAGE_GARBAGE       2            38 + 8   Number of bytes in “deleted” records
      PAGE_LAST_INSERT   2            38 + 10  Pointer to last inserted record, or NULL if this has been reset, e.g. by a delete
      PAGE_DIRECTION     2            38 + 12  Last insert direction: PAGE_LEFT, PAGE_RIGHT, …
      PAGE_N_DIRECTION   2            38 + 14  Consecutive inserts in the same direction
      PAGE_N_RECS        2            38 + 16  Number of user records on the page
      PAGE_MAX_TRX_ID    8            38 + 18  Highest ID of a transaction that may have modified a record on the page
      PAGE_LEVEL         2            38 + 26  Level of node in index tree
      PAGE_INDEX_ID      8            38 + 28  ID of the index the page belongs to
      Source: storage/innobase/include/page0page.h
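These fields can be unpacked in one step once the page bytes are in memory. Below is a minimal Python sketch, assuming big-endian integers and a header that starts right after the 38-byte FIL header. The last field is read as 8 bytes here, an assumption that makes the fields total the 36-byte INDEX header size given on the page-layout slide. The sample page is synthetic, with made-up values.

```python
import struct
from collections import namedtuple

FIL_HEADER_SIZE = 38
# Nine 2-byte fields, one 8-byte field, one 2-byte field, one 8-byte field.
INDEX_HEADER = struct.Struct(">9HQHQ")

IndexHeader = namedtuple("IndexHeader", [
    "n_dir_slots", "heap_top", "n_heap", "free", "garbage", "last_insert",
    "direction", "n_direction", "n_recs", "max_trx_id", "level", "index_id",
])


def parse_index_header(page):
    """Unpack the INDEX page header that follows the FIL header."""
    return IndexHeader(*INDEX_HEADER.unpack_from(page, FIL_HEADER_SIZE))


# Synthetic page with made-up header values, for illustration only.
sample_page = bytearray(16384)
INDEX_HEADER.pack_into(sample_page, FIL_HEADER_SIZE,
                       2, 120, 2, 0, 0, 0, 5, 0, 0, 0, 1, 25)
index_header = parse_index_header(sample_page)
```

With this layout, `index_header.level` is read from bytes 64 and 65 of the page, matching the 38 + 26 offset in the table.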
  29. 29. InnoDB Anatomy – Demonstration: Determining page level on an INDEX page
      Working directory: /var/lib/mysql/testdb/
      First, find your page’s start byte:
        # expr 16384 \* 3
        49152
      The offset of the PAGE_LEVEL value is 26 bytes after the FIL Header (38):
        # expr 49152 + 38 + 26
        49216
      The byte length is 2:
        # xxd -ps -s 49216 -l 2 customer.ibd
        0001
      Page Level: 1
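The same lookup can be sketched in Python by seeking to the computed offset in any seekable binary stream. `read_page_level` is an illustrative helper, and the example runs against a synthetic four-page buffer instead of a real customer.ibd.

```python
import io
import struct

PAGE_SIZE = 16384
FIL_HEADER_SIZE = 38
PAGE_LEVEL_OFFSET = 26    # offset of PAGE_LEVEL after the FIL header


def read_page_level(stream, page_no):
    """Read the 2-byte, big-endian PAGE_LEVEL of the given page."""
    stream.seek(page_no * PAGE_SIZE + FIL_HEADER_SIZE + PAGE_LEVEL_OFFSET)
    raw = stream.read(2)
    if len(raw) != 2:
        raise ValueError("page %d is beyond the end of the file" % page_no)
    return struct.unpack(">H", raw)[0]


# Synthetic stand-in for a tablespace file: four blank pages, page 3 at level 1.
fake = bytearray(4 * PAGE_SIZE)
struct.pack_into(">H", fake, 3 * PAGE_SIZE + FIL_HEADER_SIZE + PAGE_LEVEL_OFFSET, 1)
level = read_page_level(io.BytesIO(bytes(fake)), 3)
```

For a real file, the stream would come from `open(path, "rb")`, ideally on a copy or with the server stopped so the pages on disk are not changing underneath the read.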
  30. 30. Additional Resources
  31. 31. Conclusion: Sources and Thanks
      • Jeremy Cole
        • http://blog.jcole.us/
        • https://github.com/jeremycole/
      • Percona
        • http://www.percona.com/files/percona-live/justin-innodb-internals.pdf
      • MySQL Internals Documentation & Source
        • http://dev.mysql.com/doc/internals/en/innodb.html
        • https://launchpad.net/mysql
