SlideShare a Scribd company logo
InnoDB: Space Management
Mijin An
meeeeejin@gmail.com
Five Types of Tablespaces
•The storage area for the doublewrite buffer and the
change buffer (ibdata1)
•System Tablespace
•The storage area for the user defined table and index
data (.ibd)
•File-Per-Table Tablespace
•General Tablespace
•The storage area for undo logs (undo_001)
•Undo Tablespace
•The storage area for user/optimizer-created temporary
tables and rollback segments for them (.ibt, ibtmp1)
•Temporary Tablespace
System Tablespace (ibdata1)
• innodb_file_per_table = OFF
•The storage area for the doublewrite buffer and the change buffer
•The files are logically concatenated and there is no striping in use
•You cannot define where within the system tablespace your tables are allocated
•Truncating or dropping a table only frees space and it can be reused for InnoDB data
(i.e., not returned to the OS, as system tablespace data files never shrink)
System Tablespace
(ibdata1)
Doublewrite Buffer
Change Buffer
File-Per-Table Tablespace (.ibd)
• innodb_file_per_table = ON (default)
•Each newly created table is stored in a separate tablespace file with a extension .ibd
•There is less fragmentation within the disk file
•The data of a single table and its indexes reside in one .ibd file
•When a table is truncated or dropped, the freed space is returned to the OS
•This slides will focus on the space management of file-per-tablespace t1.ibd
t2.ibd
innodb_file_per_table=ON
General Tablespace (.ibd)
•Shared tablespaces created using CREATE TABLESPACE syntax
•The data from different tables and their indexes reside in one .ibd file
•They can be created outside of the MySQL data directory
•Capable of holding multiple tables
•Support tables of all row formats
CREATE TABLESPACE tablespace_name
[ADD DATAFILE 'file_name']
[FILE_BLOCK_SIZE = value]
[ENGINE [=] engine_name]
1
2
3
4
ts1.ibd
t3 t4 T5
ts2.ibd
t6 t7 t8
•Two default undo tablespaces are created when the MySQL instance is initialized
•Since MySQL 8.0.14, additional undo table spaces can be created at runtime
•The initial size of an undo tablespace data file depends on the innodb_page_size value
Undo Tablespace (undo_00x)
undo_003.ibu
(user-defined)
undo_001
(system)
undo_004.ibu
(user-defined)
undo_002
(system)
Page Size Undo Tablespace File Size
4 KB 7 MiB
8 KB 8 MiB
16 KB 10 MiB
32 KB 20 MiB
64 KB 40 MiB
•Session Temporary Tablespace (#innodb_temp/temp_x.ibt)
•Store user-created temporary tables and internal temporary tables created by the optimizer
•A pool of 10 temporary tablespaces (5 pages in size) is created when the server is started
•When a session disconnects, its temporary table spaces are truncated and released back to
the pool
•Global Temporary Tablespace (ibtmp1)
•Store rollback segments for changes made to user-created temporary tables
•The initial file size is slightly larger than 12MB
•Removed on normal shutdown and recreated each time the server is started
Temporary Tablespace (.ibt, ibtmp1)
ibtmp1
(global)
temp_1.ibt
(session)
temp_2.ibt
(session)
temp_3.ibt
(session)
.ibd Files
•Example: Create a table test.t1
•Table and index data related to the table t1 will reside in the above file
•For a file-per-table tablespace, the tablespace name is the same as that of the file/table name
•Tablespaces are identified with a unique ID (i.e., tablespace ID)
CREATE TABLE test.t1 (c INT) engine=InnoDB;
$ cd PATH_TO_DATA_DIR/test
$ ls
t1.ibd
t1.ibd
Pages
•Each tablespace consists of number of fixed size pages (default: innodb_page_size=16KB)
•There are different type of pages to server different purpose
TABLESPACE FILE
Page 0
Page 1
…
Page N
Extents
•The pages are grouped into extents of size 1MB for pages up to 16KB in size
•For a page size of 32KB, extent size is 2MB (for 64KB page, 4MB extent)
TABLESPACE FILE
Extent 1
Extent 1 Page 0
Extent 1 Page 1
…
Extent 1 Page 63
… …
Extent N
Extent N Page 0
Extent N Page 1
…
Extent N Page 63
Segments
•The files inside a tablespace are called segments in InnoDB
•Within a file-per-table tablespace:
•Table data: in one segment / Index: in its own segment
•The system tablespace contains many different segments:
•It can hold many tables and their associated indexes
•Rollback segments is separated from the system tablespace since MySQL 8.0
•When a segment needs more room, it is extended by one extent (1MB) at a time:
•InnoDB can add up to 4 extents at a time to ensure good sequentiality of data
Basic Page Overview
•Every page has a 38-byte FIL (file) header and 8-byte FIL trailer
• The space ID and page number is stored in the header
FIL Header (38)
Other headers and page data,
depending on page type
FIL Trailer (8)
Checksum (4)
Offset (Page Number) (4)
Previous Page (4)
Next Page (4)
LSN for last page modification (8)
Page Type (2)
Flush LSN (0 except space 0 page 0) (8)
Space ID (4)
Old-style Checksum (4)
Low 32 bits of LSN (4)
0
38
16376
16384
0
4
16
12
26
24
8
38
34
16376
16380
16384
#define FIL_PAGE_SPACE_OR_CHKSUM 0
...
#define FIL_PAGE_OFFSET 4
...
#define FIL_PAGE_PREV 8
...
#define FIL_PAGE_NEXT 12
...
#define FIL_PAGE_LSN 16
...
#define FIL_PAGE_TYPE 24
...
#define FIL_PAGE_FILE_FLUSH_LSN 26
...
constexpr ulint FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID = 34;
/** alias for space id */
#define FIL_PAGE_SPACE_ID FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID
...
constexpr ulint FIL_PAGE_DATA = 38;
Basic Page Overview: FIL Header
•include/fil0types.h
Basic Page Overview: FIL Trailer
•include/fil0types.h
/** File page trailer */
/** the low 4 bytes of this are used to store the page checksum, the
last 4 bytes should be identical to the last 4 bytes of FIL_PAGE_LSN */
constexpr ulint FIL_PAGE_END_LSN_OLD_CHKSUM = 8;
/** size of the page trailer */
constexpr ulint FIL_PAGE_DATA_END = 8;
Space File Overview
•A space file is just a concatenation of many pages
•The first page (page 0) in a space is ALWAYS an FSP_HDR (file space header) page
•An FSP_HDR page only store bookkeeping info for 256 extents (= 16,384 pages = 256 MiB)
•Additional space must be reserved for bookkeeping info in the form of an XDES page
FSP_HDR: Filespace Header / Extent Descriptor
IBUF_BITMAP: Change Buffer Bookkeeping
depending on page type
INODE: Index Node Information
…
XDES: Extent descriptor for new 16,384 pages
IBUF_BITMAP: IBUF Bitmap for new 16,384 pages
…
XDES: Extent descriptor for new 16,384 pages
IBUF_BITMAP: IBUF Bitmap for new 16,384 pages
…
0
16 KiB
32 KiB
48 KiB
256 MiB
512 MiB
ibdata1 File Overview
•The system tablesapce (space 0) is special in InnoDB
•To store a wide range of information critical to InnoDB’s operation
FSP_HDR: Filespace Header / Extent Descriptor
IBUF_BITMAP: Change Buffer Bookkeeping
INODE: Index Node Information
SYS: Change Buffer Header
INDEX: Change Buffer Root
TRX_SYS: Transaction System Header
SYS: First Rollback Segment
SYS: Data Dictionary Header
…
Double Write Buffer Block 1 (64 pages)
Double Write Buffer Block 2 (64 pages)
…
0
16 KiB
32 KiB
48 KiB
64 MiB
Page 64
Page 128
Page 192
.ibd File Overview
•File-Per-Table tablespace
•The .ibd file created for each table has the typical space file structure
FSP_HDR: Filespace Header / Extent Descriptor
IBUF_BITMAP: Change Buffer Bookkeeping
INODE: Index Node Information
INDEX: Root page of the primary index
INDEX: Root page of the secondary index
INDEX: Node pages …
INDEX: Leaf pages …
ALLOCATED: Reserved but unused pages …
…
0
16 KiB
32 KiB
48 KiB
64 MiB
List Base Node
•Lists are fairly generic structure that allows linking multiple related structures together
•The base node is stored only once in some high level structure (e.g., FSP header)
•include/fut0lst.h
•include/fut0lst.ic
List Length (4)
Page Number of The First List Node (4)
Offset of The First List Node (2)
Page Number of The Last List Node (4)
Offset of The Last List Node(2)
N
N+4
N+8
N+10
N+14
N+16
typedef byte flst_base_node_t;
/* The physical size of a list base node in bytes */
constexpr ulint FLST_BASE_NODE_SIZE = 4 + 2 * FIL_ADDR_SIZE;
#define FLST_LEN 0 /* 32-bit list length field */
#define FLST_FIRST 4 /* 6-byte address of the first element of the list */
#define FLST_LAST (4 + FIL_ADDR_SIZE) /* 6-byte address of the last element of the list */
6
List Node
•The list node stores previous and next pointers
•include/fut0lst.h
•include/fut0lst.ic
Page Number of The Previous List Node (4)
Offset of The Previous List Node(2)
Page Number of The Next List Node (4)
Offset of The Next List Node (2)
N
N+4
N+6
N+10
N+12
typedef byte flst_node_t;
/* The physical size of a list node in bytes */
constexpr ulint FLST_NODE_SIZE = 2 * FIL_ADDR_SIZE;
#define FLST_PREV 0 /* 6-byte address of the previous list element */
#define FLST_NEXT FIL_ADDR_SIZE /* 6-byte address of the next */
Header Page
•There is no separate storage for metadata information of tablespace
•It is stored in the same file in header page (always page 0)
FIL Header (38)
FSP Header (112)
XDES Entry 0 (pages 0 - 63) (40)
XDES Entry 1 (pages 64 - 127) (40)
XDES Entry 2 (pages 128 - 191) (40)
XDES Entry 3 (pages 192 - 255) (40)
…
XDES Entry 255 (pages 16320 - 16383) (40)
Empty Space (5986)
FIL Trailer (8)
0
38
150
190
230
270
10390
10350
310
16384
16376
Space ID (4)
Unused (4)
Highest Page Number in File (Size) (4)
Highest Page Number Initialized (Free Limit) (4)
Space Flags (4)
Number of Pages Used in FREE_FRAG List (4)
List Base Node for FREE List (16)
List Base Node for FREE_FRAG List (16)
List Base Node for FULL_FRAG List (16)
Next Unused Segment ID (8)
List Base Node for FULL_INODES List (16)
List Base Node for FREE_INODES List (16)
0
4
16
12
24
20
8
56
40
72
80
96
112
FSP Header
Types of List Base node in FSP Header
• FREE List
•Base node pointer of the linked-list of extents which are free to be allocated
•An extent from this list could be allocated to a File Segment or FREE_FRAG List
• FREE_FRAG List
•Base node pointer of the linked-list of extents which have at least one free page to be
allocated
• FULL_FRAG List
•Base node pointer of the linked-list of extents which have no free page left to be allocated
List Base Node for FREE List (16)
List Base Node for FREE_FRAG List (16)
List Base Node for FULL_FRAG List (16)
24
56
40
72
•include/fsp0fsp.h
/** Offset of the space header within a file page */
#define FSP_HEADER_OFFSET FIL_PAGE_DAT
...
#define FSP_SPACE_ID 0
#define FSP_NOT_USED 4
#define FSP_SIZE 8
#define FSP_FREE_LIMIT 12
#define FSP_SPACE_FLAGS 16
#define FSP_FRAG_N_USED 20
#define FSP_FREE 24
#define FSP_FREE_FRAG (24 + FLST_BASE_NODE_SIZE)
#define FSP_FULL_FRAG (24 + 2 * FLST_BASE_NODE_SIZE)
#define FSP_SEG_ID (24 + 3 * FLST_BASE_NODE_SIZE)
#define FSP_SEG_INODES_FULL (32 + 3 * FLST_BASE_NODE_SIZE)
#define FSP_SEG_INODES_FREE (32 + 4 * FLST_BASE_NODE_SIZE)
...
/* File space header size */
#define FSP_HEADER_SIZE (32 + 5 * FLST_BASE_NODE_SIZE)
...
/** Offset of the descriptor array on a descriptor page */
#define XDES_ARR_OFFSET (FSP_HEADER_OFFSET + FSP_HEADER_SIZE)
16
FSP Header
XDES Page
•XDES pages store metadata information related to pages which belongs to an Extent
•Once the number of total extents is greater than what an XDES page can track:
•a new XDES page is allocated which will be used to track the next set of extents
•NOTE: For first set of extents, header page is used to store XDES entries
FIL Header (38)
zero-filled (112)
XDES Entry 0 (pages 0 - 63) (40)
XDES Entry 1 (pages 64 - 127) (40)
XDES Entry 2 (pages 128 - 191) (40)
XDES Entry 3 (pages 192 - 255) (40)
…
XDES Entry 255 (pages 16320 - 16383) (40)
Empty Space (5986)
FIL Trailer (8)
0
38
150
190
230
270
10390
10350
310
16384
16376
Pages Covered by One XDES Page
•For ease of implementation, number of pages covered by one XDES page entries is equal to
page size
Page Size Extent Size Pages in An Extent
XDES Entry
Size
Pages Covered in An
XDES Page
XDES Entries in An
XDES Page
4 K 1 M 256 88 B 4096 (4K) 16
8 K 1 M 128 56 B 8192 (8K) 64
16 K 1 M 64 40 B 16384 (16K) 256
32 K 2 M 64 40 B 32768 (32K) 512
64 K 4 M 64 40 B 65536 (64K) 1024
XDES Entry
•To keep track of which extents are in use, and which pages within each extent are in use
•List Node for XDES List: Pointers to previous and next extents in a doubly-linked extent
descriptor list
•State: The current state of the extent (FREE, FREE_FRAG, FULL_FRAG, FSEG)
•Page State Bitmap: A bitmap of 2 bits per page in the extent (64 * 2 = 128 bits = 16 bytes)
•The first bit indicates whether the page is free
•The second bit indicates whether the page is clean, but this bit is currently not used
File Segment ID (8)
List Node for XDES List (12)
State (4)
Page Stage Bitmap (16)
N
N+8
N+20
N+24
N+40
XDES Entry
•include/fsp0fsp.h
File Segment ID (8)
List node for XDES list (12)
State (4)
Page Stage Bitmap (16)
N
N+8
N+20
N+24
N+40
/*-------------------------------------*/
#define XDES_ID 
0 /* The identifier of the segment 
to which this extent belongs */
#define XDES_FLST_NODE 
8 /* The list node data structure 
for the descriptors */
#define XDES_STATE (FLST_NODE_SIZE + 8)
/* contains state information
of the extent */
#define XDES_BITMAP (FLST_NODE_SIZE + 12)
/* Descriptor bitmap of the pages
in the extent */
/*-------------------------------------*/
#define XDES_BITS_PER_PAGE 2 /* How many bits are there per page */
#define XDES_FREE_BIT 
0 /* Index of the bit which tells if 
the page is free */
#define XDES_CLEAN_BIT 
1 /* NOTE: currently not used! 
Index of the bit which tells if 
there are old versions of tuples 
on the page */
12
Three Types of State
• FREE_FRAG
•Extents with free pages remaining that are allocated to be used in “fragments”, having
Individual pages allocated to different purposes rather than allocating the entire extent
• FULL_FRAG
•Exactly like FREE_FRAG but for extents with no free pages remaining
•Extents are moved from FREE_FRAG to FULL_FRAG when they become full, and moved back
to FREE_FRAG if a page is released so that they are no longer full
• FREE
•Extents that are completely unused and available to be allocated in whole to some purpose
•A FREE extent could be allocated to a file segment or moved to the FREE_FRAG list for
individual page use
INODE Page Overview
•An INODE entry in InnoDB merely describes a file segment (FSEG)
•Each INODE page contains 85 file segments INODE entries, each of which are 192 bytes
•Also, the page contain a list node which is used in the following lists:
• FREE_INODES: A list of INODE pages which have at least one free file segment INODE entry
• FULL_INODES: A list of INODE pages which have zero free file segment INODE entries
FIL Header (38)
List Node for INODE Page List (12)
INODE 0 (192)
INODE 1 (192)
…
INODE 84 (192)
Empty Space (6)
FIL Trailer (8)
0
38
50
242
434
16178
16384
16376
16370
INODE Page Overview
•include/fsp0fsp.h
N
N+8
N+20
N+24
/* FILE SEGMENT INODE
==================
Segment inode which is created for each segment in a tablespace. NOTE: in
purge we assume that a segment having only one currently used page can be
freed in a few steps, so that the freeing cannot fill the file buffer with
bufferfixed file pages. */
typedef byte fseg_inode_t;
#define FSEG_INODE_PAGE_NODE FSEG_PAGE_DATA
/* the list node for linking
segment inode pages */
#define FSEG_ARR_OFFSET (FSEG_PAGE_DATA + FLST_NODE_SIZE)
•A logical unit which is a collection of pages and extents:
•File segment makes the page management easy
•Once we delete the file segment, we know which all extents and pages are to be freed
File Segment
Extent Extent Extent
Extent Extent Extent
Extent Extent Extent
File Segment ID
…
FREE List
NOT_FULL List
FULL List
…
Fragment Array Entry 0 (4)
…
Fragment Array Entry 31 (4)
INODE Entry
•File Segment ID: The ID of the file segment (FSEG) described by this file segment INODE entry
•Magic Number: A marker that this file segment INODE entry has been properly initialized
•Fragment Array: An array of 32 page numbers of pages allocated individually from extents in
the space’s FREE_FRAG or FULL_FRAG list of “fragment” extents
FSEG ID (8)
Number of Used Pages in NOT_NULL List (4)
List Base Node for FREE List (16)
List Base Node for NOT_FULL List (16)
List Base Node for FULL List (16)
Magic Number = 97937874 (4)
Fragment Array Entry 0 (4)
…
Fragment Array Entry 31 (4)
N
N+8
N+12
N+28
N+44
N+60
N+188
N+68
N+64
N+192
INODE Entry
•include/fsp0fsp.h
N
N+8
N+20
N+24
N+40
#define FSEG_ID 0 /* 8 bytes of segment id */
#define FSEG_NOT_FULL_N_USED 8 /* number of used segment pages in the FSEG_NOT_FULL list */
#define FSEG_FREE 12 /* list of free extents of this segment */
#define FSEG_NOT_FULL (12 + FLST_BASE_NODE_SIZE) /* list of partially free extents */
#define FSEG_FULL (12 + 2 * FLST_BASE_NODE_SIZE) /* list of full extents */
#define FSEG_MAGIC_N (12 + 3 * FLST_BASE_NODE_SIZE) /* magic number used in debugging */
#define FSEG_FRAG_ARR (16 + 3 * FLST_BASE_NODE_SIZE) /* array of individual pages belonging to this
segment in fsp fragment extent lists */
#define FSEG_FRAG_ARR_N_SLOTS (FSP_EXTENT_SIZE / 2) /* number of slots in the array for the
fragment pages */
#define FSEG_FRAG_SLOT_SIZE 4 /* a fragment page slot contains its page number within space,
FIL_NULL means that the slot is not in use */
...
#define FSEG_MAGIC_N_VALUE 97937874
Three Types of List in File Segment
• FREE
•Extents that are completely unused and are allocated to this file segment
• NOT_FULL
•Extents with at least one used page allocated to this file segment
•When the last free page is used the extent is moved to FULL list
• FULL
•Extents with no free pages allocated to this file
•If a page becomes free, the extent is moved to the NOT_FULL list
How Are File Segments Used in Index?
•Two segments are allocated for each index in InnoDB:
•Non leaf page segment: to store non-leaf pages in B tree
•Leaf page segment: to store leaf pages in B tree
•On a page, FSEG HEADER is the place where these two file segments INODE entries
information is stored (detailed later)
Everything Is A Index in InnoDB
•Every table has a primary key
•If the CREATE TABLE dose not specify one, the first non-NULL unique key is used
•If failed, a 48-bit hidden “Row ID” field is automatically added as a primary key
•The row data are stored in the PRIMARY KEY INDEX structure (= clustered key)
•Key: PRIMARY KEY fields, Value: the row data
•Secondary keys are stored in an identical index structure
•Key: KEY fields, Value: the primary key value
INDEX Page Overview
•System Record: InnoDB has two system records in each page, called infimum and supremum
•User Record: The actual data
•Every record has a variable-length header with a “next record” pointer → Singly-linked list
•Page Directory: It contains pointers to some of the records in the page (every 4th to 8th record)
FIL Header (38)
INDEX Header (36)
FSEG Header (20)
System Records (26)
User Records
Free Space
Page Directory
FIL Trailer (8)
0
38
74
94
120
Heap Top
16384
16376
INDEX Page Overview
•include/
#define PAGE_DIR FIL_PAGE_DATA_END
...
#define PAGE_EMPTY_DIR_START (PAGE_DIR + 2 * PAGE_DIR_SLOT_SIZE)
INDEX Header
•The INDEX header in each INDEX page is fixed-length
•It contains many fields related to INDEX pages and record management
Number of Directory Slots (2)
Heap Top Position (2)
Number of Heap Records / Format Flag (2)
First Garbage Record Offset (2)
Garbage Space (2)
Last Insert Position (2)
Page Direction (2)
Number of Inserts in Page Direction (2)
Number of Records (2)
Maximum Transaction ID (8)
Page Level (2)
Index ID (4)
38
40
42
44
46
48
50
52
54
56
64
66
74
INDEX Header
•include/page0types.h
N
N+8
N+20
N+24
N+40
typedef byte page_header_t;
#define PAGE_HEADER FSEG_PAGE_DATA /* index page header starts at this offset */
#define PAGE_N_DIR_SLOTS 0 /* number of slots in page directory */
#define PAGE_HEAP_TOP 2 /* pointer to record heap top */
#define PAGE_N_HEAP 4 /* number of records in the heap */
#define PAGE_FREE 6 /* pointer to start of page free record list */
#define PAGE_GARBAGE 8 /* number of bytes in deleted records */
#define PAGE_LAST_INSERT 10 /* pointer to the last inserted record */
#define PAGE_DIRECTION 12 /* last insert direction: PAGE_LEFT, ... */
#define PAGE_N_DIRECTION 14 /* number of consecutive inserts to the same direction */
#define PAGE_N_RECS 16 /* number of user records on the page */
#define PAGE_MAX_TRX_ID 18 /* highest id of a trx which may have modified a record on the page;
trx_id_t; defined only in secondary indexes and in the insert buffer tree */
...
#define PAGE_LEVEL 26 /* level of the node in an index tree; the leaf level is the level 0. This
field should not be written to after page creation. */
#define PAGE_INDEX_ID 28 /* index id where the page belongs. This field should not be written to
after page creation. */
System Records
•Every INDEX page contains two system records, called infimum and supremum, at fixed
locations (offset 99 & offset 112)
•The structure is as follows (in bits):
Info Flags (4)
Nuber of Records Owned (4)
Order (13)
Record Type (3)
Next Record Offset (16)
“infimum0” (64)
Info Flags (4)
Number of Records Owned (4)
Order (13)
Record Type (3)
Next Record Offset (16)
“supremum” (64)
94
95
97
99
107
108
110
112
120
System Records
•include/page0types.h
N
N+8
N+20
N+24
N+40
#define PAGE_DATA (PAGE_HEADER + 36 + 2 * FSEG_HEADER_SIZE)
/* start of data on the page */
#define PAGE_OLD_INFIMUM (PAGE_DATA + 1 + REC_N_OLD_EXTRA_BYTES)
/* offset of the page infimum record on an old-style page */
#define PAGE_OLD_SUPREMUM (PAGE_DATA + 2 + 2 * REC_N_OLD_EXTRA_BYTES + 8)
/* offset of the page supremum record on an old-style page */
#define PAGE_OLD_SUPREMUM_END (PAGE_OLD_SUPREMUM + 9)
/* offset of the page supremum record end on an old-style page */
#define PAGE_NEW_INFIMUM (PAGE_DATA + REC_N_NEW_EXTRA_BYTES)
/* offset of the page infimum record on a new-style compact page */
#define PAGE_NEW_SUPREMUM (PAGE_DATA + 2 * REC_N_NEW_EXTRA_BYTES + 8)
/* offset of the page supremum record on a new-style compact page */
#define PAGE_NEW_SUPREMUM_END (PAGE_NEW_SUPREMUM + 8)
/* offset of the page supremum record end on a new-style compact page */
FSEG Header
•Each index uses one file segment for leaf pages and one for non-leaf (internal) pages
•This information is stored in the FSEG header structure in the INDEX root page
•All other INDEX pages’ FSEG headers are unused and zero-filled
Leaf Pages Inode Space ID (4)
Leaf Pages Inode Page Number (4)
Leaf Pages Inode Offset (2)
Non-Leaf Inode Space ID (4)
Non-Leaf Inode Page Number (4)
Non-Leaf Inode Offset (2)
74
78
82
84
88
92
94
Index File Segment Structure
How does it work when we CREATE/DROP an index?
•As soon as an index is created, two file segments will be allocated for the index:
•One for leaf pages which will have no page as of now
•One for non-leaf pages which will have only one single page allocate (root page)
•When B-Tree grows:
•New pages are allocated in Fragment Array
•Once demands cross 32 pages, an extent is allocated to segment and is moved to FREE list
•Once pages of this new extent are used, it is moved to NOT_FULL list
•Once all pages of this extent are used, it is moved to FULL list
•Once we drop the index:
•From the root page, we go ahead and mark all the extents in those two file segments as free
Reference
[1] MySQL 8.0 Reference Manul: 15.6 InnoDB On-Disk Structures, MySQL, https://dev.mysql.com/doc/refman/8.0/en/innodb-
tablespace.html
[2] MySQL 8.0 Reference Manul: 15.11 InnoDB Disk I/O and File Space Management, MySQL, https://dev.mysql.com/doc/refman/
8.0/en/innodb-disk-management.html
[3] Innodb, Jeremy Cole, https://blog.jcole.us/innodb/
[4] InnoDB : Tablespace Space Management, Mayank Prasad, MySQL Server Blog, https://mysqlserverteam.com/innodb-
tablespace-space-management/

More Related Content

What's hot

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
oysteing
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
Marco Tusa
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
Adrian Huang
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0
Mydbops
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
shimosawa
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
Jignesh Shah
 
MySQL Hash Table
MySQL Hash TableMySQL Hash Table
MySQL Hash Table
MIJIN AN
 
Secondary Index Search in InnoDB
Secondary Index Search in InnoDBSecondary Index Search in InnoDB
Secondary Index Search in InnoDB
MIJIN AN
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
University of California, Santa Cruz
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
Yoshinori Matsunobu
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performance
MariaDB plc
 
Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
Aleksandr Kuzminsky
 
ASM
ASMASM
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdf
Adrian Huang
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
Adrian Huang
 
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion  An Embeddable Query Engine Written in RustIntroduction to DataFusion  An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Andrew Lamb
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
PGConf APAC
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
Norberto Leite
 
Postgresql Database Administration- Day3
Postgresql Database Administration- Day3Postgresql Database Administration- Day3
Postgresql Database Administration- Day3
PoguttuezhiniVP
 

What's hot (20)

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
 
MySQL Hash Table
MySQL Hash TableMySQL Hash Table
MySQL Hash Table
 
Secondary Index Search in InnoDB
Secondary Index Search in InnoDBSecondary Index Search in InnoDB
Secondary Index Search in InnoDB
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Optimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performanceOptimizing MariaDB for maximum performance
Optimizing MariaDB for maximum performance
 
Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
 
ASM
ASMASM
ASM
 
Physical Memory Management.pdf
Physical Memory Management.pdfPhysical Memory Management.pdf
Physical Memory Management.pdf
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Introduction to DataFusion An Embeddable Query Engine Written in Rust
Introduction to DataFusion  An Embeddable Query Engine Written in RustIntroduction to DataFusion  An Embeddable Query Engine Written in Rust
Introduction to DataFusion An Embeddable Query Engine Written in Rust
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
 
Postgresql Database Administration- Day3
Postgresql Database Administration- Day3Postgresql Database Administration- Day3
Postgresql Database Administration- Day3
 

Similar to MySQL Space Management

cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
Ryan Robson
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
zhaolinjnu
 
InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)
Ontico
 
Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化
YUCHENG HU
 
Memory Management Strategies - III.pdf
Memory Management Strategies - III.pdfMemory Management Strategies - III.pdf
Memory Management Strategies - III.pdf
Harika Pudugosula
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
MySQLConference
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
deathsubte
 
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tablesOpen sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
Arvids Godjuks
 
Lecture storage-buffer
Lecture storage-bufferLecture storage-buffer
Lecture storage-buffer
Klaas Krona
 
InnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter ZaitsevInnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter Zaitsev
Fuenteovejuna
 
[Altibase] 4-1 tablespace concept
[Altibase] 4-1 tablespace concept[Altibase] 4-1 tablespace concept
[Altibase] 4-1 tablespace concept
altistory
 
2011 06-sq lite-forensics
2011 06-sq lite-forensics2011 06-sq lite-forensics
2011 06-sq lite-forensics
viaForensics
 
kerch04.ppt
kerch04.pptkerch04.ppt
kerch04.ppt
KalimuthuVelappan
 
Ppt
PptPpt
Database Sizing
Database SizingDatabase Sizing
Database Sizing
Amin Chowdhury
 
SQLServer Database Structures
SQLServer Database Structures SQLServer Database Structures
SQLServer Database Structures
Antonios Chatzipavlis
 
Working of Volatile and Non-Volatile memory
Working of Volatile and Non-Volatile memoryWorking of Volatile and Non-Volatile memory
Working of Volatile and Non-Volatile memory
Don Caeiro
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
Tony Rogerson
 
Unit-4 swapping.pptx
Unit-4 swapping.pptxUnit-4 swapping.pptx
Unit-4 swapping.pptx
ItechAnand1
 
S3 l7 db2 storage model
S3 l7   db2 storage modelS3 l7   db2 storage model
S3 l7 db2 storage model
Mohammad Khan
 

Similar to MySQL Space Management (20)

cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
 
InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)
 
Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化
 
Memory Management Strategies - III.pdf
Memory Management Strategies - III.pdfMemory Management Strategies - III.pdf
Memory Management Strategies - III.pdf
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tablesOpen sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
 
Lecture storage-buffer
Lecture storage-bufferLecture storage-buffer
Lecture storage-buffer
 
InnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter ZaitsevInnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter Zaitsev
 
[Altibase] 4-1 tablespace concept
[Altibase] 4-1 tablespace concept[Altibase] 4-1 tablespace concept
[Altibase] 4-1 tablespace concept
 
2011 06-sq lite-forensics
2011 06-sq lite-forensics2011 06-sq lite-forensics
2011 06-sq lite-forensics
 
kerch04.ppt
kerch04.pptkerch04.ppt
kerch04.ppt
 
Ppt
PptPpt
Ppt
 
Database Sizing
Database SizingDatabase Sizing
Database Sizing
 
SQLServer Database Structures
SQLServer Database Structures SQLServer Database Structures
SQLServer Database Structures
 
Working of Volatile and Non-Volatile memory
Working of Volatile and Non-Volatile memoryWorking of Volatile and Non-Volatile memory
Working of Volatile and Non-Volatile memory
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
 
Unit-4 swapping.pptx
Unit-4 swapping.pptxUnit-4 swapping.pptx
Unit-4 swapping.pptx
 
S3 l7 db2 storage model
S3 l7   db2 storage modelS3 l7   db2 storage model
S3 l7 db2 storage model
 

Recently uploaded

Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdfSoftware Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
kalichargn70th171
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
ShulagnaSarkar2
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
Zycus
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
Paul Brebner
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
What is Continuous Testing in DevOps - A Definitive Guide.pdf
What is Continuous Testing in DevOps - A Definitive Guide.pdfWhat is Continuous Testing in DevOps - A Definitive Guide.pdf
What is Continuous Testing in DevOps - A Definitive Guide.pdf
kalichargn70th171
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 
42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert
vaishalijagtap12
 
Cost-Effective Strategies For iOS App Development
Cost-Effective Strategies For iOS App DevelopmentCost-Effective Strategies For iOS App Development
Cost-Effective Strategies For iOS App Development
Softradix Technologies
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
dakas1
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
chandangoswami40933
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
Reetu63
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
widenerjobeyrl638
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Vince Scalabrino
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 

Recently uploaded (20)

Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdfSoftware Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 
What is Continuous Testing in DevOps - A Definitive Guide.pdf
What is Continuous Testing in DevOps - A Definitive Guide.pdfWhat is Continuous Testing in DevOps - A Definitive Guide.pdf
What is Continuous Testing in DevOps - A Definitive Guide.pdf
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert
 
Cost-Effective Strategies For iOS App Development
Cost-Effective Strategies For iOS App DevelopmentCost-Effective Strategies For iOS App Development
Cost-Effective Strategies For iOS App Development
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
 

MySQL Space Management

  • 1. InnoDB: Space Management Mijin An meeeeejin@gmail.com
  • 2. Five Types of Tablespaces •The storage area for the doublewrite buffer and the change buffer (ibdata1) •System Tablespace •The storage area for the user defined table and index data (.ibd) •File-Per-Table Tablespace •General Tablespace •The storage area for undo logs (undo_001) •Undo Tablespace •The storage area for user/optimizer-created temporary tables and rollback segments for them (.ibt, ibtmp1) •Temporary Tablespace
  • 3. System Tablespace (ibdata1) • innodb_file_per_table = OFF •The storage area for the doublewrite buffer and the change buffer •The files are logically concatenated and there is no striping in use •You cannot define where within the system tablespace your tables are allocated •Truncating or dropping a table only frees space and it can be reused for InnoDB data (i.e., not returned to the OS, as system tablespace data files never shrink) System Tablespace (ibdata1) Doublewrite Buffer Change Buffer
  • 4. File-Per-Table Tablespace (.ibd) • innodb_file_per_table = ON (default) •Each newly created table is stored in a separate tablespace file with a extension .ibd •There is less fragmentation within the disk file •The data of a single table and its indexes reside in one .ibd file •When a table is truncated or dropped, the freed space is returned to the OS •This slides will focus on the space management of file-per-tablespace t1.ibd t2.ibd innodb_file_per_table=ON
  • 5. General Tablespace (.ibd) •Shared tablespaces created using CREATE TABLESPACE syntax •The data from different tables and their indexes reside in one .ibd file •They can be created outside of the MySQL data directory •Capable of holding multiple tables •Support tables of all row formats CREATE TABLESPACE tablespace_name [ADD DATAFILE 'file_name'] [FILE_BLOCK_SIZE = value] [ENGINE [=] engine_name] 1 2 3 4 ts1.ibd t3 t4 T5 ts2.ibd t6 t7 t8
  • 6. •Two default undo tablespaces are created when the MySQL instance is initialized •Since MySQL 8.0.14, additional undo table spaces can be created at runtime •The initial size of an undo tablespace data file depends on the innodb_page_size value Undo Tablespace (undo_00x) undo_003.ibu (user-defined) undo_001 (system) undo_004.ibu (user-defined) undo_002 (system) Page Size Undo Tablespace File Size 4 KB 7 MiB 8 KB 8 MiB 16 KB 10 MiB 32 KB 20 MiB 64 KB 40 MiB
  • 7. •Session Temporary Tablespace (#innodb_temp/temp_x.ibt) •Store user-created temporary tables and internal temporary tables created by the optimizer •A pool of 10 temporary tablespaces (5 pages in size) is created when the server is started •When a session disconnects, its temporary table spaces are truncated and released back to the pool •Global Temporary Tablespace (ibtmp1) •Store rollback segments for changes made to user-created temporary tables •The initial file size is slightly larger than 12MB •Removed on normal shutdown and recreated each time the server is started Temporary Tablespace (.ibt, ibtmp1) ibtmp1 (global) temp_1.ibt (session) temp_2.ibt (session) temp_3.ibt (session)
  • 8. .ibd Files •Example: Create a table test.t1 •Table and index data related to the table t1 will reside in the above file •For a file-per-table tablespace, the tablespace name is the same as that of the file/table name •Tablespaces are identified with a unique ID (i.e., tablespace ID) CREATE TABLE test.t1 (c INT) engine=InnoDB; $ cd PATH_TO_DATA_DIR/test $ ls t1.ibd t1.ibd
  • 9. Pages •Each tablespace consists of number of fixed size pages (default: innodb_page_size=16KB) •There are different type of pages to server different purpose TABLESPACE FILE Page 0 Page 1 … Page N
  • 10. Extents •The pages are grouped into extents of size 1MB for pages up to 16KB in size •For a page size of 32KB, extent size is 2MB (for 64KB page, 4MB extent) TABLESPACE FILE Extent 1 Extent 1 Page 0 Extent 1 Page 1 … Extent 1 Page 63 … … Extent N Extent N Page 0 Extent N Page 1 … Extent N Page 63
  • 11. Segments •The files inside a tablespace are called segments in InnoDB •Within a file-per-table tablespace: •Table data: in one segment / Index: in its own segment •The system tablespace contains many different segments: •It can hold many tables and their associated indexes •Rollback segments is separated from the system tablespace since MySQL 8.0 •When a segment needs more room, it is extended by one extent (1MB) at a time: •InnoDB can add up to 4 extents at a time to ensure good sequentiality of data
  • 12. Basic Page Overview •Every page has a 38-byte FIL (file) header and 8-byte FIL trailer • The space ID and page number is stored in the header FIL Header (38) Other headers and page data, depending on page type FIL Trailer (8) Checksum (4) Offset (Page Number) (4) Previous Page (4) Next Page (4) LSN for last page modification (8) Page Type (2) Flush LSN (0 except space 0 page 0) (8) Space ID (4) Old-style Checksum (4) Low 32 bits of LSN (4) 0 38 16376 16384 0 4 16 12 26 24 8 38 34 16376 16380 16384
  • 13. #define FIL_PAGE_SPACE_OR_CHKSUM 0 ... #define FIL_PAGE_OFFSET 4 ... #define FIL_PAGE_PREV 8 ... #define FIL_PAGE_NEXT 12 ... #define FIL_PAGE_LSN 16 ... #define FIL_PAGE_TYPE 24 ... #define FIL_PAGE_FILE_FLUSH_LSN 26 ... constexpr ulint FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID = 34; /** alias for space id */ #define FIL_PAGE_SPACE_ID FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID ... constexpr ulint FIL_PAGE_DATA = 38; Basic Page Overview: FIL Header •include/fil0types.h
  • 14. Basic Page Overview: FIL Trailer •include/fil0types.h /** File page trailer */ /** the low 4 bytes of this are used to store the page checksum, the last 4 bytes should be identical to the last 4 bytes of FIL_PAGE_LSN */ constexpr ulint FIL_PAGE_END_LSN_OLD_CHKSUM = 8; /** size of the page trailer */ constexpr ulint FIL_PAGE_DATA_END = 8;
  • 15. Space File Overview •A space file is just a concatenation of many pages •The first page (page 0) in a space is ALWAYS an FSP_HDR (file space header) page •An FSP_HDR page only store bookkeeping info for 256 extents (= 16,384 pages = 256 MiB) •Additional space must be reserved for bookkeeping info in the form of an XDES page FSP_HDR: Filespace Header / Extent Descriptor IBUF_BITMAP: Change Buffer Bookkeeping depending on page type INODE: Index Node Information … XDES: Extent descriptor for new 16,384 pages IBUF_BITMAP: IBUF Bitmap for new 16,384 pages … XDES: Extent descriptor for new 16,384 pages IBUF_BITMAP: IBUF Bitmap for new 16,384 pages … 0 16 KiB 32 KiB 48 KiB 256 MiB 512 MiB
  • 16. ibdata1 File Overview •The system tablesapce (space 0) is special in InnoDB •To store a wide range of information critical to InnoDB’s operation FSP_HDR: Filespace Header / Extent Descriptor IBUF_BITMAP: Change Buffer Bookkeeping INODE: Index Node Information SYS: Change Buffer Header INDEX: Change Buffer Root TRX_SYS: Transaction System Header SYS: First Rollback Segment SYS: Data Dictionary Header … Double Write Buffer Block 1 (64 pages) Double Write Buffer Block 2 (64 pages) … 0 16 KiB 32 KiB 48 KiB 64 MiB Page 64 Page 128 Page 192
  • 17. .ibd File Overview •File-Per-Table tablespace •The .ibd file created for each table has the typical space file structure FSP_HDR: Filespace Header / Extent Descriptor IBUF_BITMAP: Change Buffer Bookkeeping INODE: Index Node Information INDEX: Root page of the primary index INDEX: Root page of the secondary index INDEX: Node pages … INDEX: Leaf pages … ALLOCATED: Reserved but unused pages … … 0 16 KiB 32 KiB 48 KiB 64 MiB
  • 18. List Base Node •Lists are fairly generic structure that allows linking multiple related structures together •The base node is stored only once in some high level structure (e.g., FSP header) •include/fut0lst.h •include/fut0lst.ic List Length (4) Page Number of The First List Node (4) Offset of The First List Node (2) Page Number of The Last List Node (4) Offset of The Last List Node(2) N N+4 N+8 N+10 N+14 N+16 typedef byte flst_base_node_t; /* The physical size of a list base node in bytes */ constexpr ulint FLST_BASE_NODE_SIZE = 4 + 2 * FIL_ADDR_SIZE; #define FLST_LEN 0 /* 32-bit list length field */ #define FLST_FIRST 4 /* 6-byte address of the first element of the list */ #define FLST_LAST (4 + FIL_ADDR_SIZE) /* 6-byte address of the last element of the list */ 6
  • 19. List Node •The list node stores previous and next pointers •include/fut0lst.h •include/fut0lst.ic Page Number of The Previous List Node (4) Offset of The Previous List Node(2) Page Number of The Next List Node (4) Offset of The Next List Node (2) N N+4 N+6 N+10 N+12 typedef byte flst_node_t; /* The physical size of a list node in bytes */ constexpr ulint FLST_NODE_SIZE = 2 * FIL_ADDR_SIZE; #define FLST_PREV 0 /* 6-byte address of the previous list element */ #define FLST_NEXT FIL_ADDR_SIZE /* 6-byte address of the next */
  • 20. Header Page •There is no separate storage for metadata information of tablespace •It is stored in the same file in header page (always page 0) FIL Header (38) FSP Header (112) XDES Entry 0 (pages 0 - 63) (40) XDES Entry 1 (pages 64 - 127) (40) XDES Entry 2 (pages 128 - 191) (40) XDES Entry 3 (pages 192 - 255) (40) … XDES Entry 255 (pages 16320 - 16383) (40) Empty Space (5986) FIL Trailer (8) 0 38 150 190 230 270 10390 10350 310 16384 16376 Space ID (4) Unused (4) Highest Page Number in File (Size) (4) Highest Page Number Initialized (Free Limit) (4) Space Flags (4) Number of Pages Used in FREE_FRAG List (4) List Base Node for FREE List (16) List Base Node for FREE_FRAG List (16) List Base Node for FULL_FRAG List (16) Next Unused Segment ID (8) List Base Node for FULL_INODES List (16) List Base Node for FREE_INODES List (16) 0 4 16 12 24 20 8 56 40 72 80 96 112 FSP Header
  • 21. Types of List Base node in FSP Header • FREE List •Base node pointer of the linked-list of extents which are free to be allocated •An extent from this list could be allocated to a File Segment or FREE_FRAG List • FREE_FRAG List •Base node pointer of the linked-list of extents which have at least one free page to be allocated • FULL_FRAG List •Base node pointer of the linked-list of extents which have no free page left to be allocated List Base Node for FREE List (16) List Base Node for FREE_FRAG List (16) List Base Node for FULL_FRAG List (16) 24 56 40 72
  • 22. •include/fsp0fsp.h /** Offset of the space header within a file page */ #define FSP_HEADER_OFFSET FIL_PAGE_DAT ... #define FSP_SPACE_ID 0 #define FSP_NOT_USED 4 #define FSP_SIZE 8 #define FSP_FREE_LIMIT 12 #define FSP_SPACE_FLAGS 16 #define FSP_FRAG_N_USED 20 #define FSP_FREE 24 #define FSP_FREE_FRAG (24 + FLST_BASE_NODE_SIZE) #define FSP_FULL_FRAG (24 + 2 * FLST_BASE_NODE_SIZE) #define FSP_SEG_ID (24 + 3 * FLST_BASE_NODE_SIZE) #define FSP_SEG_INODES_FULL (32 + 3 * FLST_BASE_NODE_SIZE) #define FSP_SEG_INODES_FREE (32 + 4 * FLST_BASE_NODE_SIZE) ... /* File space header size */ #define FSP_HEADER_SIZE (32 + 5 * FLST_BASE_NODE_SIZE) ... /** Offset of the descriptor array on a descriptor page */ #define XDES_ARR_OFFSET (FSP_HEADER_OFFSET + FSP_HEADER_SIZE) 16 FSP Header
  • 23. XDES Page •XDES pages store metadata information related to pages which belongs to an Extent •Once the number of total extents is greater than what an XDES page can track: •a new XDES page is allocated which will be used to track the next set of extents •NOTE: For first set of extents, header page is used to store XDES entries FIL Header (38) zero-filled (112) XDES Entry 0 (pages 0 - 63) (40) XDES Entry 1 (pages 64 - 127) (40) XDES Entry 2 (pages 128 - 191) (40) XDES Entry 3 (pages 192 - 255) (40) … XDES Entry 255 (pages 16320 - 16383) (40) Empty Space (5986) FIL Trailer (8) 0 38 150 190 230 270 10390 10350 310 16384 16376
  • 24. Pages Covered by One XDES Page •For ease of implementation, number of pages covered by one XDES page entries is equal to page size Page Size Extent Size Pages in An Extent XDES Entry Size Pages Covered in An XDES Page XDES Entries in An XDES Page 4 K 1 M 256 88 B 4096 (4K) 16 8 K 1 M 128 56 B 8192 (8K) 64 16 K 1 M 64 40 B 16384 (16K) 256 32 K 2 M 64 40 B 32768 (32K) 512 64 K 4 M 64 40 B 65536 (64K) 1024
  • 25. XDES Entry •To keep track of which extents are in use, and which pages within each extent are in use •List Node for XDES List: Pointers to previous and next extents in a doubly-linked extent descriptor list •State: The current state of the extent (FREE, FREE_FRAG, FULL_FRAG, FSEG) •Page State Bitmap: A bitmap of 2 bits per page in the extent (64 * 2 = 128 bits = 16 bytes) •The first bit indicates whether the page is free •The second bit indicates whether the page is clean, but this bit is currently not used File Segment ID (8) List Node for XDES List (12) State (4) Page Stage Bitmap (16) N N+8 N+20 N+24 N+40
  • 26. XDES Entry •include/fsp0fsp.h File Segment ID (8) List node for XDES list (12) State (4) Page Stage Bitmap (16) N N+8 N+20 N+24 N+40 /*-------------------------------------*/ #define XDES_ID 0 /* The identifier of the segment to which this extent belongs */ #define XDES_FLST_NODE 8 /* The list node data structure for the descriptors */ #define XDES_STATE (FLST_NODE_SIZE + 8) /* contains state information of the extent */ #define XDES_BITMAP (FLST_NODE_SIZE + 12) /* Descriptor bitmap of the pages in the extent */ /*-------------------------------------*/ #define XDES_BITS_PER_PAGE 2 /* How many bits are there per page */ #define XDES_FREE_BIT 0 /* Index of the bit which tells if the page is free */ #define XDES_CLEAN_BIT 1 /* NOTE: currently not used! Index of the bit which tells if there are old versions of tuples on the page */ 12
  • 27. Three Types of State • FREE_FRAG •Extents with free pages remaining that are allocated to be used in “fragments”, having Individual pages allocated to different purposes rather than allocating the entire extent • FULL_FRAG •Exactly like FREE_FRAG but for extents with no free pages remaining •Extents are moved from FREE_FRAG to FULL_FRAG when they become full, and moved back to FREE_FRAG if a page is released so that they are no longer full • FREE •Extents that are completely unused and available to be allocated in whole to some purpose •A FREE extent could be allocated to a file segment or moved to the FREE_FRAG list for individual page use
  • 28. INODE Page Overview •An INODE entry in InnoDB merely describes a file segment (FSEG) •Each INODE page contains 85 file segments INODE entries, each of which are 192 bytes •Also, the page contain a list node which is used in the following lists: • FREE_INODES: A list of INODE pages which have at least one free file segment INODE entry • FULL_INODES: A list of INODE pages which have zero free file segment INODE entries FIL Header (38) List Node for INODE Page List (12) INODE 0 (192) INODE 1 (192) … INODE 84 (192) Empty Space (6) FIL Trailer (8) 0 38 50 242 434 16178 16384 16376 16370
  • 29. INODE Page Overview •include/fsp0fsp.h N N+8 N+20 N+24 /* FILE SEGMENT INODE ================== Segment inode which is created for each segment in a tablespace. NOTE: in purge we assume that a segment having only one currently used page can be freed in a few steps, so that the freeing cannot fill the file buffer with bufferfixed file pages. */ typedef byte fseg_inode_t; #define FSEG_INODE_PAGE_NODE FSEG_PAGE_DATA /* the list node for linking segment inode pages */ #define FSEG_ARR_OFFSET (FSEG_PAGE_DATA + FLST_NODE_SIZE)
  • 30. •A logical unit which is a collection of pages and extents: •File segment makes the page management easy •Once we delete the file segment, we know which all extents and pages are to be freed File Segment Extent Extent Extent Extent Extent Extent Extent Extent Extent File Segment ID … FREE List NOT_FULL List FULL List … Fragment Array Entry 0 (4) … Fragment Array Entry 31 (4)
  • 31. INODE Entry •File Segment ID: The ID of the file segment (FSEG) described by this file segment INODE entry •Magic Number: A marker that this file segment INODE entry has been properly initialized •Fragment Array: An array of 32 page numbers of pages allocated individually from extents in the space’s FREE_FRAG or FULL_FRAG list of “fragment” extents FSEG ID (8) Number of Used Pages in NOT_NULL List (4) List Base Node for FREE List (16) List Base Node for NOT_FULL List (16) List Base Node for FULL List (16) Magic Number = 97937874 (4) Fragment Array Entry 0 (4) … Fragment Array Entry 31 (4) N N+8 N+12 N+28 N+44 N+60 N+188 N+68 N+64 N+192
  • 32. INODE Entry •include/fsp0fsp.h N N+8 N+20 N+24 N+40 #define FSEG_ID 0 /* 8 bytes of segment id */ #define FSEG_NOT_FULL_N_USED 8 /* number of used segment pages in the FSEG_NOT_FULL list */ #define FSEG_FREE 12 /* list of free extents of this segment */ #define FSEG_NOT_FULL (12 + FLST_BASE_NODE_SIZE) /* list of partially free extents */ #define FSEG_FULL (12 + 2 * FLST_BASE_NODE_SIZE) /* list of full extents */ #define FSEG_MAGIC_N (12 + 3 * FLST_BASE_NODE_SIZE) /* magic number used in debugging */ #define FSEG_FRAG_ARR (16 + 3 * FLST_BASE_NODE_SIZE) /* array of individual pages belonging to this segment in fsp fragment extent lists */ #define FSEG_FRAG_ARR_N_SLOTS (FSP_EXTENT_SIZE / 2) /* number of slots in the array for the fragment pages */ #define FSEG_FRAG_SLOT_SIZE 4 /* a fragment page slot contains its page number within space, FIL_NULL means that the slot is not in use */ ... #define FSEG_MAGIC_N_VALUE 97937874
  • 33. Three Types of List in File Segment • FREE •Extents that are completely unused and are allocated to this file segment • NOT_FULL •Extents with at least one used page allocated to this file segment •When the last free page is used the extent is moved to FULL list • FULL •Extents with no free pages allocated to this file •If a page becomes free, the extent is moved to the NOT_FULL list
  • 34. How Are File Segments Used in Index? •Two segments are allocated for each index in InnoDB: •Non leaf page segment: to store non-leaf pages in B tree •Leaf page segment: to store leaf pages in B tree •On a page, FSEG HEADER is the place where these two file segments INODE entries information is stored (detailed later)
  • 35. Everything Is A Index in InnoDB •Every table has a primary key •If the CREATE TABLE dose not specify one, the first non-NULL unique key is used •If failed, a 48-bit hidden “Row ID” field is automatically added as a primary key •The row data are stored in the PRIMARY KEY INDEX structure (= clustered key) •Key: PRIMARY KEY fields, Value: the row data •Secondary keys are stored in an identical index structure •Key: KEY fields, Value: the primary key value
  • 36. INDEX Page Overview •System Record: InnoDB has two system records in each page, called infimum and supremum •User Record: The actual data •Every record has a variable-length header with a “next record” pointer → Singly-linked list •Page Directory: It contains pointers to some of the records in the page (every 4th to 8th record) FIL Header (38) INDEX Header (36) FSEG Header (20) System Records (26) User Records Free Space Page Directory FIL Trailer (8) 0 38 74 94 120 Heap Top 16384 16376
  • 37. INDEX Page Overview •include/ #define PAGE_DIR FIL_PAGE_DATA_END ... #define PAGE_EMPTY_DIR_START (PAGE_DIR + 2 * PAGE_DIR_SLOT_SIZE)
  • 38. INDEX Header •The INDEX header in each INDEX page is fixed-length •It contains many fields related to INDEX pages and record management Number of Directory Slots (2) Heap Top Position (2) Number of Heap Records / Format Flag (2) First Garbage Record Offset (2) Garbage Space (2) Last Insert Position (2) Page Direction (2) Number of Inserts in Page Direction (2) Number of Records (2) Maximum Transaction ID (8) Page Level (2) Index ID (4) 38 40 42 44 46 48 50 52 54 56 64 66 74
  • 39. INDEX Header •include/page0types.h N N+8 N+20 N+24 N+40 typedef byte page_header_t; #define PAGE_HEADER FSEG_PAGE_DATA /* index page header starts at this offset */ #define PAGE_N_DIR_SLOTS 0 /* number of slots in page directory */ #define PAGE_HEAP_TOP 2 /* pointer to record heap top */ #define PAGE_N_HEAP 4 /* number of records in the heap */ #define PAGE_FREE 6 /* pointer to start of page free record list */ #define PAGE_GARBAGE 8 /* number of bytes in deleted records */ #define PAGE_LAST_INSERT 10 /* pointer to the last inserted record */ #define PAGE_DIRECTION 12 /* last insert direction: PAGE_LEFT, ... */ #define PAGE_N_DIRECTION 14 /* number of consecutive inserts to the same direction */ #define PAGE_N_RECS 16 /* number of user records on the page */ #define PAGE_MAX_TRX_ID 18 /* highest id of a trx which may have modified a record on the page; trx_id_t; defined only in secondary indexes and in the insert buffer tree */ ... #define PAGE_LEVEL 26 /* level of the node in an index tree; the leaf level is the level 0. This field should not be written to after page creation. */ #define PAGE_INDEX_ID 28 /* index id where the page belongs. This field should not be written to after page creation. */
  • 40. System Records •Every INDEX page contains two system records, called infimum and supremum, at fixed locations (offset 99 & offset 112) •The structure is as follows (in bits): Info Flags (4) Nuber of Records Owned (4) Order (13) Record Type (3) Next Record Offset (16) “infimum0” (64) Info Flags (4) Number of Records Owned (4) Order (13) Record Type (3) Next Record Offset (16) “supremum” (64) 94 95 97 99 107 108 110 112 120
  • 41. System Records •include/page0types.h N N+8 N+20 N+24 N+40 #define PAGE_DATA (PAGE_HEADER + 36 + 2 * FSEG_HEADER_SIZE) /* start of data on the page */ #define PAGE_OLD_INFIMUM (PAGE_DATA + 1 + REC_N_OLD_EXTRA_BYTES) /* offset of the page infimum record on an old-style page */ #define PAGE_OLD_SUPREMUM (PAGE_DATA + 2 + 2 * REC_N_OLD_EXTRA_BYTES + 8) /* offset of the page supremum record on an old-style page */ #define PAGE_OLD_SUPREMUM_END (PAGE_OLD_SUPREMUM + 9) /* offset of the page supremum record end on an old-style page */ #define PAGE_NEW_INFIMUM (PAGE_DATA + REC_N_NEW_EXTRA_BYTES) /* offset of the page infimum record on a new-style compact page */ #define PAGE_NEW_SUPREMUM (PAGE_DATA + 2 * REC_N_NEW_EXTRA_BYTES + 8) /* offset of the page supremum record on a new-style compact page */ #define PAGE_NEW_SUPREMUM_END (PAGE_NEW_SUPREMUM + 8) /* offset of the page supremum record end on a new-style compact page */
  • 42. FSEG Header •Each index uses one file segment for leaf pages and one for non-leaf (internal) pages •This information is stored in the FSEG header structure in the INDEX root page •All other INDEX pages’ FSEG headers are unused and zero-filled Leaf Pages Inode Space ID (4) Leaf Pages Inode Page Number (4) Leaf Pages Inode Offset (2) Non-Leaf Inode Space ID (4) Non-Leaf Inode Page Number (4) Non-Leaf Inode Offset (2) 74 78 82 84 88 92 94
  • 43. Index File Segment Structure
  • 44. How does it work when we CREATE/DROP an index? •As soon as an index is created, two file segments will be allocated for the index: •One for leaf pages which will have no page as of now •One for non-leaf pages which will have only one single page allocate (root page) •When B-Tree grows: •New pages are allocated in Fragment Array •Once demands cross 32 pages, an extent is allocated to segment and is moved to FREE list •Once pages of this new extent are used, it is moved to NOT_FULL list •Once all pages of this extent are used, it is moved to FULL list •Once we drop the index: •From the root page, we go ahead and mark all the extents in those two file segments as free
  • 45. Reference [1] MySQL 8.0 Reference Manul: 15.6 InnoDB On-Disk Structures, MySQL, https://dev.mysql.com/doc/refman/8.0/en/innodb- tablespace.html [2] MySQL 8.0 Reference Manul: 15.11 InnoDB Disk I/O and File Space Management, MySQL, https://dev.mysql.com/doc/refman/ 8.0/en/innodb-disk-management.html [3] Innodb, Jeremy Cole, https://blog.jcole.us/innodb/ [4] InnoDB : Tablespace Space Management, Mayank Prasad, MySQL Server Blog, https://mysqlserverteam.com/innodb- tablespace-space-management/