Recovery of lost or corrupted InnoDB tables
Yet another reason to use InnoDB
OpenSQL Camp 2010, Sankt Augustin
Aleksandr.Kuzminsky@Percona.com
Percona Inc.
http://MySQLPerformanceBlog.com
Agenda
1. InnoDB format overview
2. Internal system tables SYS_INDEXES and SYS_TABLES
3. InnoDB Primary and Secondary keys
4. Typical failure scenarios
5. InnoDB recovery tool
Three things are certain:
Death, taxes and lost data.
Guess which has occurred?
1. InnoDB format overview
How MySQL stores data in InnoDB
1. A tablespace (ibdata1)
• System tablespace (data dictionary, undo, insert buffer, etc.)
• PRIMARY indices (PK + data)
• SECONDARY indices (SK + PK). If the key is (f1, f2), it is stored as (f1, f2, PK)
2. File per table (.ibd)
• PRIMARY index
• SECONDARY indices
3. InnoDB page size is 16k (uncompressed)
4. Every index is identified by index_id
How MySQL stores data in InnoDB
Page identifier: index_id
mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb;
Error log:
TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;
created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;
DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
root page 271, appr.key vals 1, leaf pages 1, size pages 1
FIELDS: id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at
InnoDB page format
FIL HEADER
PAGE_HEADER
INFIMUM + SUPREMUM RECORDS
USER RECORDS
FREE SPACE
Page Directory
Fil Trailer
InnoDB page format
Fil Header
Name                     Size  Remarks
FIL_PAGE_SPACE           4     ID of the space the page is in
FIL_PAGE_OFFSET          4     ordinal page number from start of space
FIL_PAGE_PREV            4     offset of previous page in key order
FIL_PAGE_NEXT            4     offset of next page in key order
FIL_PAGE_LSN             8     log serial number of page's latest log record
FIL_PAGE_TYPE            2     currently defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
FIL_PAGE_FILE_FLUSH_LSN  8     "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
FIL_PAGE_ARCH_LOG_NO     4     the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log)
Data is stored in pages of type FIL_PAGE_INDEX.
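The Fil Header layout above can be decoded with a few lines of Python. This is only a sketch: it assumes the fields are packed back to back in the listed order (38 bytes in total) and stored big-endian, and it takes the numeric value of FIL_PAGE_INDEX (0x45BF) from InnoDB's fil0fil.h rather than from the slides. The names `FilHeader`, `parse_fil_header` and `is_index_page` are made up for this illustration.

```python
import struct
from collections import namedtuple

FIL_HEADER_SIZE = 38      # 4 + 4 + 4 + 4 + 8 + 2 + 8 + 4, per the table above
FIL_PAGE_INDEX = 0x45BF   # assumption: value taken from InnoDB's fil0fil.h

FilHeader = namedtuple(
    "FilHeader",
    "space offset prev_page next_page lsn page_type flush_lsn arch_log_no",
)


def parse_fil_header(page: bytes) -> FilHeader:
    """Decode the first 38 bytes of a page (integers assumed big-endian)."""
    return FilHeader(*struct.unpack(">IIIIQHQI", page[:FIL_HEADER_SIZE]))


def is_index_page(page: bytes) -> bool:
    """True if the page type says the page holds index (data) records."""
    return parse_fil_header(page).page_type == FIL_PAGE_INDEX
```

Filtering on `is_index_page` first is a cheap way to skip undo, inode and insert buffer pages before looking for rows.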
InnoDB page format
Page Header
Name               Size  Remarks
PAGE_N_DIR_SLOTS   2     number of directory slots in the Page Directory part; initial value = 2
PAGE_HEAP_TOP      2     record pointer to first record in heap
PAGE_N_HEAP        2     number of heap records; initial value = 2
PAGE_FREE          2     record pointer to first free record
PAGE_GARBAGE       2     "number of bytes in deleted records"
PAGE_LAST_INSERT   2     record pointer to the last inserted record
PAGE_DIRECTION     2     either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
PAGE_N_DIRECTION   2     number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
PAGE_N_RECS        2     number of user records
PAGE_MAX_TRX_ID    8     the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes)
PAGE_LEVEL         2     level within the index (0 for a leaf page)
PAGE_INDEX_ID      8     identifier of the index the page belongs to
PAGE_BTR_SEG_LEAF  10    "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
PAGE_BTR_SEG_TOP   10    "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)
PAGE_INDEX_ID is the index_id.
The highest bit of PAGE_N_HEAP is the row format (1 - COMPACT, 0 - REDUNDANT).
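As an illustration of how index_id is read from a page, the sketch below unpacks the Page Header fields up to PAGE_INDEX_ID. It assumes the Page Header starts right after the 38-byte Fil Header and that all integers are big-endian; `PageHeader` and `parse_page_header` are hypothetical names, not part of any tool mentioned here.

```python
import struct
from collections import namedtuple

PAGE_HEADER_OFFSET = 38   # assumption: Page Header directly follows the Fil Header

PageHeader = namedtuple(
    "PageHeader",
    "n_dir_slots heap_top n_heap free garbage last_insert "
    "direction n_direction n_recs max_trx_id level index_id",
)


def parse_page_header(page: bytes) -> PageHeader:
    """Decode the Page Header up to and including PAGE_INDEX_ID.

    Nine 2-byte fields, then PAGE_MAX_TRX_ID (8), PAGE_LEVEL (2) and
    PAGE_INDEX_ID (8). The two 10-byte segment headers are skipped.
    """
    return PageHeader(*struct.unpack_from(">9HQHQ", page, PAGE_HEADER_OFFSET))
```

With this layout PAGE_INDEX_ID sits at byte 66 of the page, and `level == 0` identifies leaf pages, which are the ones that hold full rows in a PRIMARY index.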
How to check row format?
• The highest bit of PAGE_N_HEAP in the page header
• 0 stands for REDUNDANT, 1 for COMPACT
• dc -e "2o `hexdump -C pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}'
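The same check can be written without a shell pipeline. The sketch below assumes PAGE_N_HEAP sits at byte 42 of the page (38 bytes of Fil Header plus the two 2-byte fields before it) and is stored big-endian; `row_format` is an illustrative name.

```python
PAGE_N_HEAP_OFFSET = 42   # assumption: 38-byte Fil Header + 4 bytes into the Page Header


def row_format(page: bytes) -> str:
    """Return the row format encoded in the highest bit of PAGE_N_HEAP."""
    n_heap = int.from_bytes(page[PAGE_N_HEAP_OFFSET:PAGE_N_HEAP_OFFSET + 2], "big")
    return "COMPACT" if n_heap & 0x8000 else "REDUNDANT"
```

Usage would look like `row_format(open("pagefile", "rb").read())`, where `pagefile` is a single 16k page.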
Rows in an InnoDB page
• Rows in a single page form a linked list
• The first record is INFIMUM
• The last record is SUPREMUM
• Sorted by primary key
[next | infimum] -> [next | 100 | data...] -> [next | 101 | data...] -> [next | 102 | data...] -> [next | 103 | data...] -> [0 | supremum]
Records are saved in insert order
insert into t1 values(10, 'aaa');
insert into t1 values(30, 'ccc');
insert into t1 values(20, 'bbb');
JG....................N<E.......
................................
.............................2..
...infimum......supremum......6.
........)....2..aaa.............
...*....2..ccc.... ...........+.
...2..bbb.......................
................................
Row format
EXAMPLE:
CREATE TABLE `t1` (
`ID` int(11) unsigned NOT NULL,
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Name                 Size
Field Start Offsets  (F*1) or (F*2) bytes
Extra Bytes          6 bytes (5 bytes if COMPACT format)
Field Contents       depends on content
InnoDB page format (REDUNDANT)
Extra bytes
Name             Size     Description
record_status    2 bits   _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM
deleted_flag     1 bit    1 if record is deleted
min_rec_flag     1 bit    1 if record is predefined minimum record
n_owned          4 bits   number of records owned by this record
heap_no          13 bits  record's order number in heap of index page
n_fields         10 bits  number of fields in this record, 1 to 1023
1byte_offs_flag  1 bit    1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag)
next             16 bits  pointer to next record in page
InnoDB page format (COMPACT)
Extra bytes
Name                                       Size, bits  Description
record_status, deleted_flag, min_rec_flag  4           4 bits used to delete-mark a record, and mark a predefined minimum record in alphabetical order
n_owned                                    4           the number of records owned by this record (this term is explained in page0page.h)
heap_no                                    13          the order number of this record in the heap of the index page
record type                                3           000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved
next                                       16          a relative pointer to the next record in the page
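To make the bit layout concrete, here is a sketch that decodes the 5 COMPACT extra bytes in the order the table lists them. The masks for the delete mark (0x20) and the minimum-record flag (0x10) within the first byte are an assumption taken from InnoDB's rem0rec.h, not from the slides, and `decode_compact_extra` is an illustrative name.

```python
from collections import namedtuple

CompactExtra = namedtuple(
    "CompactExtra", "deleted min_rec n_owned heap_no rec_type next_rel"
)


def decode_compact_extra(extra: bytes) -> CompactExtra:
    """Decode the 5 extra bytes that precede a COMPACT record."""
    if len(extra) != 5:
        raise ValueError("COMPACT extra bytes are exactly 5 bytes long")
    heap_and_type = int.from_bytes(extra[1:3], "big")
    return CompactExtra(
        deleted=bool(extra[0] & 0x20),    # assumption: delete-mark bit
        min_rec=bool(extra[0] & 0x10),    # assumption: minimum-record bit
        n_owned=extra[0] & 0x0F,          # low 4 bits of the first byte
        heap_no=heap_and_type >> 3,       # upper 13 bits
        rec_type=heap_and_type & 0x07,    # 0=conventional, 1=node pointer, 2=infimum, 3=supremum
        next_rel=int.from_bytes(extra[3:5], "big", signed=True),  # relative, may be negative
    )
```

The `deleted` flag is what makes undelete possible: a delete-marked record is still physically in the page until it is purged.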
REDUNDANT
A row: (10, 'abcdef', 20)
Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20)
Field sizes in bytes: 4, 6, 7, 6, 4
Record layout: Field Offsets | Extra 6 bytes | Fields
Extra 6 bytes: record_status, deleted_flag, min_rec_flag, n_owned, heap_no, n_fields, 1byte_offs_flag, next
Fields: 0x00 00 00 0A | ... | ... | abcdef | 0x80 00 00 14
COMPACT
A row: (10, 'abcdef', 20)
Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20)
Record layout: Field Offsets (6) | NULLS | Extra 5 bytes | Fields
NULLS: a bit per NULL-able field
Extra 5 bytes: ... next
Fields: 0x00 00 00 0A | ... | ... | abcdef | 0x80 00 00 14
Data types
INT types (fixed-size)
String types
• VARCHAR(x) – variable-size
• CHAR(x) – fixed-size, variable-size if UTF-8
DECIMAL
• Stored in strings before 5.0.3, variable in size
• Binary format after 5.0.3, fixed-size.
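The row examples earlier show the unsigned ID 10 stored as 0x00 00 00 0A and the signed N_FIELDS 20 stored as 0x80 00 00 14. A minimal sketch of decoding such fixed-size integers, assuming big-endian storage with the sign bit inverted for signed types (`decode_int` is an illustrative name):

```python
def decode_int(raw: bytes, signed: bool = True) -> int:
    """Decode a fixed-size InnoDB integer field.

    Assumption: the value is big-endian, and signed types are stored with
    the highest bit flipped so that byte-wise comparison sorts correctly.
    """
    value = int.from_bytes(raw, "big")
    if signed:
        value -= 1 << (8 * len(raw) - 1)   # undo the flipped sign bit
    return value
```

For example, `decode_int(bytes.fromhex("80000014"))` gives 20 and `decode_int(bytes.fromhex("0000000A"), signed=False)` gives 10, matching the two values in the slides.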
BLOB and other long fields
• Field length (the so-called offset) is one or two bytes long
• Page size is 16k
• If record size < (UNIV_PAGE_SIZE/2-200) == ~7k
– the record is stored internally (in a PK page)
• Otherwise – 768 bytes internally, the rest in an external page
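The threshold works out as follows. This sketch simply restates the rule from the bullets as code for a single long field; it is a simplification, and `split_long_field` is a hypothetical helper, not something the recovery tool provides.

```python
UNIV_PAGE_SIZE = 16 * 1024
INLINE_LIMIT = UNIV_PAGE_SIZE // 2 - 200   # 8192 - 200 = 7992 bytes, i.e. ~7k
LOCAL_PREFIX = 768                         # bytes kept in the PK page


def split_long_field(record_size: int, field_size: int) -> tuple:
    """Return (bytes stored in the PK page, bytes stored externally)."""
    if record_size < INLINE_LIMIT:
        return field_size, 0
    local = min(LOCAL_PREFIX, field_size)
    return local, field_size - local
```

For recovery this means a long BLOB value cannot be rebuilt from the PK page alone: only its first 768 bytes are there.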
2. Internal system tables SYS_INDEXES and SYS_TABLES
Why are SYS_* tables needed?
• Correspondence “table name” -> “index_id”
• Storage for other internal information
How MySQL stores data in InnoDB
SYS_TABLES and SYS_INDEXES
Always REDUNDANT format!
SYS_INDEXES (index_id = 0-3):
CREATE TABLE `SYS_INDEXES` (
`TABLE_ID` bigint(20) unsigned NOT NULL default '0',
`ID` bigint(20) unsigned NOT NULL default '0',
`NAME` varchar(120) default NULL,
`N_FIELDS` int(10) unsigned default NULL,
`TYPE` int(10) unsigned default NULL,
`SPACE` int(10) unsigned default NULL,
`PAGE_NO` int(10) unsigned default NULL,
PRIMARY KEY (`TABLE_ID`,`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
NAME is PRIMARY, GEN_CLUST_INDEX, or the unique index name
SYS_TABLES (index_id = 0-1):
CREATE TABLE `SYS_TABLES` (
`NAME` varchar(255) NOT NULL default '',
`ID` bigint(20) unsigned NOT NULL default '0',
`N_COLS` int(10) unsigned default NULL,
`TYPE` int(10) unsigned default NULL,
`MIX_ID` bigint(20) unsigned default NULL,
`MIX_LEN` int(10) unsigned default NULL,
`CLUSTER_NAME` varchar(255) default NULL,
`SPACE` int(10) unsigned default NULL,
PRIMARY KEY (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
How MySQL stores data in InnoDB
Example:
SYS_TABLES
NAME ID …
"archive/msg_store" 40 8 1 0 0 NULL 0
"archive/msg_store" 40 8 1 0 0 NULL 0
"archive/msg_store" 40 8 1 0 0 NULL 0
SYS_INDEXES
TABLE_ID ID NAME …
40 196389 "PRIMARY" 2 3 0 21031026
40 196390 "msg_hash" 1 0 0 21031028
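The "table name" -> "index_id" lookup is a two-step join: SYS_TABLES gives the table ID, SYS_INDEXES gives the index IDs for that table ID. A small sketch using only the first columns of the example rows (the list literals and `index_ids_for` are illustrative, not output of any tool):

```python
# (NAME, ID) from the SYS_TABLES example
SYS_TABLES = [("archive/msg_store", 40)]
# (TABLE_ID, ID, NAME) from the SYS_INDEXES example
SYS_INDEXES = [(40, 196389, "PRIMARY"), (40, 196390, "msg_hash")]


def index_ids_for(table_name, sys_tables, sys_indexes):
    """Map index name -> index_id for one table."""
    table_ids = {tid for name, tid in sys_tables if name == table_name}
    return {name: idx for tid, idx, name in sys_indexes if tid in table_ids}
```

Here `index_ids_for("archive/msg_store", SYS_TABLES, SYS_INDEXES)` yields 196389 for PRIMARY, which is the index_id whose pages contain the table's rows.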
3. InnoDB Primary and Secondary keys
Primary key
The table:
CREATE TABLE `t1` (
`ID` int(11),
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`),
KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Fields in the PK:
1. ID
2. DB_TRX_ID
3. DB_ROLL_PTR
4. NAME
5. N_FIELDS
Secondary key
The table:
CREATE TABLE `t1` (
`ID` int(11),
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`),
KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Fields in the SK:
1. NAME
2. ID (the primary key)
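The field order of both index types for `t1` can be derived mechanically. The sketch below is only an illustration of the two lists shown in the slides; it assumes the secondary key columns and the primary key columns do not overlap, and both function names are made up.

```python
def clustered_index_fields(columns, pk):
    """PK columns first, then the two system columns, then the remaining columns."""
    rest = [c for c in columns if c not in pk]
    return list(pk) + ["DB_TRX_ID", "DB_ROLL_PTR"] + rest


def secondary_index_fields(key, pk):
    """Key columns followed by the PK columns (assumes the two sets are disjoint)."""
    return list(key) + list(pk)
```

Because a secondary index record carries only the key and the PK, full rows can be recovered only from the PRIMARY index pages.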
4. Typical failure scenarios
Deleted records
• DELETE FROM table WHERE id = 5;
• Forgotten WHERE clause
Band-aid:
• Stop/kill mysqld ASAP
How is a delete performed?
"row/row0upd.c":
"… /* How is a delete performed? ... The delete is performed by setting the delete bit in the record and substituting the id of the deleting transaction for the original trx id, and substituting a new roll ptr for previous roll ptr. The old trx id and roll ptr are saved in the undo log record.
Thus, no physical changes occur in the index tree structure at the time of the delete. Only when the undo log is purged, the index records will be physically deleted from the index trees. …"
Dropped table/database
• DROP TABLE table;
• DROP DATABASE database;
• Often happens when restoring from SQL dump
• Bad because the .FRM file goes away
• Especially painful when innodb_file_per_table is enabled
Band-aid:
• Stop/kill mysqld ASAP
• Stop IO on the HDD, mount it read-only, or take a raw image
Corrupted InnoDB tablespace
• Hardware failures
• OS or filesystem failures
• InnoDB bugs
• InnoDB tablespace corrupted by other processes
Band-aid:
• Stop mysqld
• Take a copy of InnoDB files
Wrong UPDATE statement
• UPDATE user SET Password = PASSWORD('qwerty') WHERE User='root';
• Again, a forgotten WHERE clause
• Bad because changes are applied to the PRIMARY index immediately
• The old version goes to the UNDO segment
Band-aid:
• Stop/kill mysqld ASAP
5. InnoDB recovery tool
Recovery prerequisites
1. Media
• ibdata1
• *.ibd
• HDD image
2. Table structure
• SQL dump
• *.FRM files
table_defs.h
{ /* int(11) unsigned */
    name: "ID",
    type: FT_UINT,
    fixed_length: 4,
    has_limits: TRUE,
    limits: {
        can_be_null: FALSE,
        uint_min_val: 0,
        uint_max_val: 4294967295ULL
    },
    can_be_null: FALSE
},
{ /* varchar(120) */
    name: "NAME",
    type: FT_CHAR,
    min_length: 0,
    max_length: 120,
    has_limits: TRUE,
    limits: {
        can_be_null: TRUE,
        char_min_len: 0,
        char_max_len: 120,
        char_ascii_only: TRUE
    },
    can_be_null: TRUE
},
• generated by create_defs.pl
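The limits in table_defs.h act as filters: a candidate record is accepted only if every field passes its constraints. The sketch below mirrors the two definitions above in Python to show the idea; it is not how constraints_parser is implemented, and `id_ok`, `name_ok` and `row_ok` are hypothetical names.

```python
ID_LIMITS = {"min": 0, "max": 4294967295}
NAME_LIMITS = {"min_len": 0, "max_len": 120, "ascii_only": True, "can_be_null": True}


def id_ok(value) -> bool:
    """ID is an unsigned int that cannot be NULL and must be within the limits."""
    return isinstance(value, int) and ID_LIMITS["min"] <= value <= ID_LIMITS["max"]


def name_ok(value) -> bool:
    """NAME is a nullable byte string with length and ASCII-only limits."""
    if value is None:
        return NAME_LIMITS["can_be_null"]
    if not NAME_LIMITS["min_len"] <= len(value) <= NAME_LIMITS["max_len"]:
        return False
    return not NAME_LIMITS["ascii_only"] or all(b < 0x80 for b in value)


def row_ok(row) -> bool:
    """Accept a candidate (ID, NAME) pair only if both fields pass their filters."""
    record_id, name = row
    return id_ok(record_id) and name_ok(name)
```

Tighter limits (for example a realistic maximum ID) reject more garbage when scanning corrupted pages.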
How to get CREATE info from .frm files
1. CREATE TABLE t1 (id int) ENGINE=InnoDB;
2. Replace t1.frm with the .frm file whose schema you need
3. Run "SHOW CREATE TABLE t1"
If mysqld crashes:
1. Look at the end of t1.frm in bvi: .ID.NAME.N_FIELDS..
2. *.FRM viewer !TODO
InnoDB recovery tool
http://launchpad.net/percona-innodb-recovery-tool/
1. Written at Percona
2. Contributed to by Percona and the community
3. Supported by Percona
Consists of two major tools:
• page_parser – splits an InnoDB tablespace into 16k pages
• constraints_parser – scans a page and finds good records
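The first step, splitting a tablespace into 16k pages and sorting them by index_id, can be sketched as follows. This is not page_parser's code; it assumes FIL_PAGE_TYPE is at byte 24 and PAGE_INDEX_ID at byte 66 of each page (derived from the header tables earlier), big-endian integers, and 0x45BF as the value of FIL_PAGE_INDEX. `iter_pages` and `group_by_index_id` are illustrative names.

```python
import io
import struct
from collections import defaultdict

PAGE_SIZE = 16 * 1024
FIL_PAGE_TYPE_OFFSET = 24    # assumption: 4 + 4 + 4 + 4 + 8 bytes into the Fil Header
PAGE_INDEX_ID_OFFSET = 66    # assumption: 38-byte Fil Header + 28 bytes of Page Header
FIL_PAGE_INDEX = 0x45BF      # assumption: value from InnoDB's fil0fil.h


def iter_pages(stream):
    """Yield (page_no, page_bytes) for every complete 16k page in the stream."""
    page_no = 0
    while True:
        page = stream.read(PAGE_SIZE)
        if len(page) < PAGE_SIZE:
            break
        yield page_no, page
        page_no += 1


def group_by_index_id(stream):
    """Map index_id -> list of page numbers, looking at index pages only."""
    groups = defaultdict(list)
    for page_no, page in iter_pages(stream):
        (page_type,) = struct.unpack_from(">H", page, FIL_PAGE_TYPE_OFFSET)
        if page_type == FIL_PAGE_INDEX:
            (index_id,) = struct.unpack_from(">Q", page, PAGE_INDEX_ID_OFFSET)
            groups[index_id].append(page_no)
    return dict(groups)
```

A real file would be passed as `group_by_index_id(open("ibdata1", "rb"))`; working on a copy or a raw image keeps the original media untouched.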
InnoDB recovery tool
page_parser
server# ./page_parser -4 -f /var/lib/mysql/ibdata1
Opening file: /var/lib/mysql/ibdata1
Read data from fn=3...
Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page
Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page
Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page
Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page
In recent versions the InnoDB pages are sorted by page type
Page signature check
0{....0...4...4......=..E.......
........<..~...A.......|........
................................
...infimum......supremumf.....qT
M/T/196001834/XXXXX
XXXXXXXXXXX L
X X X X X X X X X
XXXXXXXXXXXXX
1. INFIMUM and SUPREMUM records are in fixed positions
2. Works with corrupted pages
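A signature check of this kind can be sketched in a few lines. The COMPACT positions (0x63 for infimum, 0x70 for supremum) match the offsets printed by the tool later in the talk; the REDUNDANT positions (101 and 116) are an assumption derived from the 6-byte record header. `page_signature` is an illustrative name, not the tool's function.

```python
def page_signature(page: bytes):
    """Guess the row format from where the infimum/supremum strings sit.

    Returns "COMPACT", "REDUNDANT", or None if neither signature is found.
    """
    if page[99:106] == b"infimum" and page[112:120] == b"supremum":
        return "COMPACT"
    if page[101:108] == b"infimum" and page[116:124] == b"supremum":
        return "REDUNDANT"   # assumption: offsets for the old row format
    return None
```

Because the check looks only at two short strings at fixed offsets, it still recognizes a page whose header or checksum is damaged.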
InnoDB recovery tool
constraints_parser
server# ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page
Table structure is defined in "include/table_defs.h"
See HOWTO for details: http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto
Filters inside table_defs.h are very important (if the InnoDB page is corrupted)
Check InnoDB page before reading recs
#./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V
Initializing table definitions...
Processing table: document_type_fieldsets_link
- total fields: 5
- nullable fields: 0
- minimum header size: 5
- minimum rec size: 25
- maximum rec size: 25
Read data from fn=3...
Page id: 12665
Checking a page
Infimum offset: 0x63
Supremum offset: 0x70
Next record at offset: 0x9F (159)
Next record at offset: 0xB0 (176)
Next record at offset: 0x3D95 (15765)
…
Next record at offset: 0x70 (112)
Page is good
1. Check whether the tool can follow all records by their addresses
2. If so, look for a record exactly at the position where a record starts
3. Helps a lot for COMPACT format!
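The "follow all records" check can be sketched for a COMPACT page as below. It assumes the infimum and supremum offsets shown in the output above (0x63 and 0x70) and that the 2 bytes just before each record hold a signed, big-endian pointer relative to the current record, wrapping at the page size. `follow_records` is an illustrative name, not the tool's function.

```python
PAGE_SIZE = 16 * 1024
INFIMUM = 0x63    # offsets as printed in the tool output above
SUPREMUM = 0x70


def follow_records(page: bytes):
    """Walk the record list from infimum to supremum.

    Returns the offsets of the user records in key order, or None if the
    chain loops or leaves the record area, i.e. the page is not "good".
    """
    offsets = []
    seen = {INFIMUM}
    pos = INFIMUM
    while True:
        rel = int.from_bytes(page[pos - 2:pos], "big", signed=True)
        pos = (pos + rel) % PAGE_SIZE
        if pos == SUPREMUM:
            return offsets
        if pos in seen or pos < SUPREMUM:
            return None
        seen.add(pos)
        offsets.append(pos)
```

When the walk succeeds, the returned offsets are exact record starts, so a parser does not have to guess where records begin.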
Import result
t1 1 "browse" 10
t1 2 "dashboard" 20
t1 3 "addFolder" 18
t1 4 "editFolder" 15
mysql> LOAD DATA INFILE '/path/to/datafile'
REPLACE INTO TABLE <table_name>
FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
LINES STARTING BY '<table_name>\t';
Questions ?
Thank you for coming!
• References
• http://www.mysqlperformanceblog.com/
• http://percona.com/
• http://www.slideshare.net/akuzminsky
-43-/43

More Related Content

What's hot

Rdbms day3
Rdbms day3Rdbms day3
Rdbms day3
Nitesh Singh
 
Sq lite module7
Sq lite module7Sq lite module7
Sq lite module7
Highervista
 
Tables in dd
Tables in ddTables in dd
Tables in dd
Ganesh Kumar
 
Marc 21
Marc 21Marc 21
presentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMSpresentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMS
Muhammad Zeeshan
 
Marc
MarcMarc
Diving into MARC
Diving into MARCDiving into MARC
Diving into MARC
Johan Koren
 
Dervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-newDervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-new
govendaagoovenda
 
MARC21
MARC21MARC21
Dervy bis 155 week 3 quiz new
Dervy   bis 155 week 3 quiz newDervy   bis 155 week 3 quiz new
Dervy bis 155 week 3 quiz new
kxipvscsk02
 

What's hot (10)

Rdbms day3
Rdbms day3Rdbms day3
Rdbms day3
 
Sq lite module7
Sq lite module7Sq lite module7
Sq lite module7
 
Tables in dd
Tables in ddTables in dd
Tables in dd
 
Marc 21
Marc 21Marc 21
Marc 21
 
presentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMSpresentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMS
 
Marc
MarcMarc
Marc
 
Diving into MARC
Diving into MARCDiving into MARC
Diving into MARC
 
Dervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-newDervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-new
 
MARC21
MARC21MARC21
MARC21
 
Dervy bis 155 week 3 quiz new
Dervy   bis 155 week 3 quiz newDervy   bis 155 week 3 quiz new
Dervy bis 155 week 3 quiz new
 

Similar to Open sql2010 recovery-of-lost-or-corrupted-innodb-tables

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
Aleksandr Kuzminsky
 
Data recovery talk on PLUK
Data recovery talk on PLUKData recovery talk on PLUK
Data recovery talk on PLUK
Aleksandr Kuzminsky
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
Ryan Robson
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
zhaolinjnu
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space Management
MIJIN AN
 
Plmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_finalPlmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_final
Marco Tusa
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
mysqlops
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
MySQLConference
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdf
ycelgemici1
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
Marco Tusa
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
deathsubte
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB Alchemy
Ryan Robson
 
Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2
Massimo Cenci
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
Titiushko Jazz
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
Titiushko Jazz
 
0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionary
vkyecc1
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
Julien Le Dem
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
Insight Technology, Inc.
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석
INSIGHT FORENSIC
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
guest9912e5
 

Similar to Open sql2010 recovery-of-lost-or-corrupted-innodb-tables (20)

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
 
Data recovery talk on PLUK
Data recovery talk on PLUKData recovery talk on PLUK
Data recovery talk on PLUK
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space Management
 
Plmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_finalPlmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_final
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdf
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB Alchemy
 
Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
 
0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionary
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 

More from Arvids Godjuks

Uc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-moreUc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-more
Arvids Godjuks
 
Alexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiAlexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiArvids Godjuks
 
Zabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructureZabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructure
Arvids Godjuks
 
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
Arvids Godjuks
 
Google Analytics: smart use
Google Analytics: smart useGoogle Analytics: smart use
Google Analytics: smart useArvids Godjuks
 
High Availability Solutions with MySQL
High Availability Solutions with MySQLHigh Availability Solutions with MySQL
High Availability Solutions with MySQL
Arvids Godjuks
 
Detecting logged in user's abnormal activity
Detecting logged in user's abnormal activityDetecting logged in user's abnormal activity
Detecting logged in user's abnormal activity
Arvids Godjuks
 

More from Arvids Godjuks (8)

Uc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-moreUc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-more
 
Alexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiAlexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework Yii
 
Zabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructureZabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructure
 
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
 
System markup
System markupSystem markup
System markup
 
Google Analytics: smart use
Google Analytics: smart useGoogle Analytics: smart use
Google Analytics: smart use
 
High Availability Solutions with MySQL
High Availability Solutions with MySQLHigh Availability Solutions with MySQL
High Availability Solutions with MySQL
 
Detecting logged in user's abnormal activity
Detecting logged in user's abnormal activityDetecting logged in user's abnormal activity
Detecting logged in user's abnormal activity
 

Recently uploaded

Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 

Open sql2010 recovery-of-lost-or-corrupted-innodb-tables

  • 1. Recovery of lost or corrupted InnoDB tables Yet another reason to use InnoDB OpenSQL Camp 2010, Sankt Augustin Aleksandr.Kuzminsky@Percona.com Percona Inc. http://MySQLPerformanceBlog.com
  • 2. Agenda
      1. InnoDB format overview
      2. Internal system tables SYS_INDEXES and SYS_TABLES
      3. InnoDB Primary and Secondary keys
      4. Typical failure scenarios
      5. InnoDB recovery tool
      Three things are certain: death, taxes and lost data. Guess which has occurred?
  • 3. 1. InnoDB format overview
  • 4. How MySQL stores data in InnoDB
      1. A shared tablespace (ibdata1)
         • System tablespace (data dictionary, undo, insert buffer, etc.)
         • PRIMARY indices (PK + data)
         • SECONDARY indices (SK + PK). If the key is (f1, f2), it is stored as (f1, f2, PK)
      2. File per table (.ibd)
         • PRIMARY index
         • SECONDARY indices
      3. InnoDB page size is 16k (uncompressed)
      4. Every index is identified by index_id
  • 5. Recovery of lost or corrupted InnoDB tables -5-/43
  • 6. How MySQL stores data in InnoDB
      mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb
      Error log:
      TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
        COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;
          created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
          DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;
          DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
        INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
          root page 271, appr.key vals 1, leaf pages 1, size pages 1
          FIELDS: id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at
      Page identifier: index_id (the INDEX id, here 0 254)
  • 7. InnoDB page format
      Fil Header
      Page Header
      INFIMUM + SUPREMUM records
      User records
      Free space
      Page Directory
      Fil Trailer
  • 8. InnoDB page format: Fil Header
      Name (size in bytes): remarks
      FIL_PAGE_SPACE (4): ID of the space the page is in
      FIL_PAGE_OFFSET (4): ordinal page number from start of space
      FIL_PAGE_PREV (4): offset of previous page in key order
      FIL_PAGE_NEXT (4): offset of next page in key order
      FIL_PAGE_LSN (8): log serial number of page's latest log record
      FIL_PAGE_TYPE (2): currently defined types are FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
      FIL_PAGE_FILE_FLUSH_LSN (8): "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
      FIL_PAGE_ARCH_LOG_NO (4): the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log)
      Data are stored in pages of type FIL_PAGE_INDEX
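The Fil Header table above can be read straight off a page with a fixed-width unpack. The following is a minimal sketch, assuming big-endian fields laid out in the order and sizes the table lists (38 bytes in total). The names `FilHeader`, `parse_fil_header` and `is_index_page` are made up for this illustration, and the numeric value 17855 for FIL_PAGE_INDEX is an assumption taken from the InnoDB sources, not from the slides.

```python
import struct
from collections import namedtuple

# Field order and sizes follow the Fil Header table: 4+4+4+4+8+2+8+4 = 38 bytes.
# ">" means big-endian with no padding.
FIL_HEADER = struct.Struct(">IIIIQHQI")

# Hypothetical container; field names are shortened versions of the table names.
FilHeader = namedtuple(
    "FilHeader", "space offset prev next lsn page_type flush_lsn arch_log_no"
)

# Assumption: numeric value of FIL_PAGE_INDEX (0x45BF) in the InnoDB sources.
FIL_PAGE_INDEX = 17855


def parse_fil_header(page):
    """Decode the first 38 bytes of a page into a FilHeader tuple."""
    return FilHeader._make(FIL_HEADER.unpack_from(page, 0))


def is_index_page(page):
    """True if the page type says the page holds index (table) data."""
    return parse_fil_header(page).page_type == FIL_PAGE_INDEX
```

Only pages that pass `is_index_page` are candidates for record recovery, since that is where the rows live.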
  • 9. InnoDB page format: Page Header
      Name (size in bytes): remarks
      PAGE_N_DIR_SLOTS (2): number of directory slots in the Page Directory part; initial value = 2
      PAGE_HEAP_TOP (2): record pointer to first record in heap
      PAGE_N_HEAP (2): number of heap records; initial value = 2. The highest bit is the row format (1 = COMPACT, 0 = REDUNDANT)
      PAGE_FREE (2): record pointer to first free record
      PAGE_GARBAGE (2): "number of bytes in deleted records"
      PAGE_LAST_INSERT (2): record pointer to the last inserted record
      PAGE_DIRECTION (2): either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
      PAGE_N_DIRECTION (2): number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
      PAGE_N_RECS (2): number of user records
      PAGE_MAX_TRX_ID (8): the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes)
      PAGE_LEVEL (2): level within the index (0 for a leaf page)
      PAGE_INDEX_ID (8): identifier of the index the page belongs to (index_id)
      PAGE_BTR_SEG_LEAF (10): "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
      PAGE_BTR_SEG_TOP (10): "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)
  • 10. How to check row format?
      • Look at the highest bit of PAGE_N_HEAP in the page header
      • 0 stands for REDUNDANT, 1 for COMPACT
      • dc -e "2o `hexdump -C d pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}'
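The same check can be sketched in a few lines of Python, which avoids the shell quoting. This is an illustration only: it assumes the page header starts right after the 38-byte Fil Header and that its fields are big-endian and laid out in the order of the Page Header table, which puts PAGE_N_HEAP at byte 42 and PAGE_INDEX_ID at byte 66. The function names are hypothetical.

```python
import struct

FIL_HEADER_SIZE = 38                         # sum of the Fil Header field sizes
PAGE_N_HEAP_OFFSET = FIL_HEADER_SIZE + 4     # after PAGE_N_DIR_SLOTS, PAGE_HEAP_TOP
PAGE_INDEX_ID_OFFSET = FIL_HEADER_SIZE + 28  # after PAGE_LEVEL


def row_format(page):
    """Return 'COMPACT' or 'REDUNDANT' from the highest bit of PAGE_N_HEAP."""
    (n_heap,) = struct.unpack_from(">H", page, PAGE_N_HEAP_OFFSET)
    return "COMPACT" if n_heap & 0x8000 else "REDUNDANT"


def index_id(page):
    """Return the 8-byte index id as (high, low) 32-bit halves, e.g. (0, 254)."""
    return struct.unpack_from(">II", page, PAGE_INDEX_ID_OFFSET)
```

Returning the index id as two halves mirrors the way the table monitor prints it ("id 0 254").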
  • 11. Rows in an InnoDB page
      • Rows in a single page form a linked list
      • The first record is INFIMUM
      • The last record is SUPREMUM
      • Sorted by primary key
      [Diagram: infimum → 100 → 101 → 102 → 103 → supremum; every record starts with a "next" pointer followed by its data, and the "next" pointer of supremum is 0]
  • 12. Records are saved in insert order
      insert into t1 values(10, 'aaa');
      insert into t1 values(30, 'ccc');
      insert into t1 values(20, 'bbb');
      Page dump:
      JG....................N<E.......
      ................................
      .............................2..
      ...infimum......supremum......6.
      ........)....2..aaa.............
      ...*....2..ccc.... ...........+.
      ...2..bbb.......................
      ................................
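Because records sit in insert order but are chained in key order, reading a page means following the "next" pointers from infimum to supremum. The sketch below does that for a COMPACT page. It assumes the infimum and supremum origins are at 0x63 and 0x70 (the offsets the recovery tool prints later in this deck), that "next" is a signed 16-bit value stored in the two bytes just before a record's origin, and that it is relative to the current origin. `walk_compact_records` is a made-up name.

```python
import struct

PAGE_SIZE = 16384
INFIMUM = 0x63    # origin of the infimum record on a COMPACT page
SUPREMUM = 0x70   # origin of the supremum record on a COMPACT page


def walk_compact_records(page):
    """Return the origins of user records in key order.

    Raises ValueError if the chain loops or points into the page headers,
    which is a sign that the page is corrupted.
    """
    origins = []
    seen = set()
    cur = INFIMUM
    while True:
        # "next" occupies the last 2 of the 5 extra bytes before the origin.
        (rel,) = struct.unpack_from(">h", page, cur - 2)
        cur = (cur + rel) % PAGE_SIZE
        if cur == SUPREMUM:
            return origins
        if cur in seen or cur < SUPREMUM:
            raise ValueError("broken record chain at offset %d" % cur)
        seen.add(cur)
        origins.append(cur)
```

For the three inserts above, the walk would visit the records in the order 10, 20, 30 even though 'ccc' is stored before 'bbb'.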
  • 13. Row format
      EXAMPLE:
      CREATE TABLE `t1` (
        `ID` int(11) unsigned NOT NULL,
        `NAME` varchar(120),
        `N_FIELDS` int(10),
        PRIMARY KEY (`ID`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      Name: size
      Field Start Offsets: (F*1) or (F*2) bytes
      Extra Bytes: 6 bytes (5 bytes if COMPACT format)
      Field Contents: depends on content
  • 14. InnoDB page format (REDUNDANT): Extra bytes
      Name (size): description
      record_status (2 bits): _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM
      deleted_flag (1 bit): 1 if record is deleted
      min_rec_flag (1 bit): 1 if record is predefined minimum record
      n_owned (4 bits): number of records owned by this record
      heap_no (13 bits): record's order number in heap of index page
      n_fields (10 bits): number of fields in this record, 1 to 1023
      1byte_offs_flag (1 bit): 1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag)
      next (16 bits): pointer to next record in page
  • 15. InnoDB page format (COMPACT): Extra bytes
      Name (size in bits): description
      record_status, deleted_flag, min_rec_flag (4): 4 bits used to delete-mark a record and to mark a predefined minimum record in alphabetical order
      n_owned (4): the number of records owned by this record (this term is explained in page0page.h)
      heap_no (13): the order number of this record in the heap of the index page
      record type (3): 000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved
      next (16): a relative pointer to the next record in the page
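The five COMPACT extra bytes can be unpacked as one byte, one 16-bit word and one signed 16-bit word. This is a sketch under assumptions: the fields are packed in the order of the table, with the flag bits in the high nibble of the first byte, and the exact flag positions (0x20 for the delete mark, 0x10 for the minimum-record mark) are taken from the InnoDB sources rather than from the slides. `RecHeader` and `decode_compact_header` are hypothetical names.

```python
import struct
from collections import namedtuple

RecHeader = namedtuple(
    "RecHeader", "deleted min_rec n_owned heap_no rec_type next_rel"
)

REC_TYPES = {0: "conventional", 1: "node pointer", 2: "infimum", 3: "supremum"}


def decode_compact_header(page, origin):
    """Decode the 5 extra bytes that precede a COMPACT record's origin."""
    bits, heap_and_type, next_rel = struct.unpack_from(">BHh", page, origin - 5)
    return RecHeader(
        deleted=bool(bits & 0x20),   # assumption: delete-mark bit position
        min_rec=bool(bits & 0x10),   # assumption: minimum-record bit position
        n_owned=bits & 0x0F,         # low 4 bits of the first byte
        heap_no=heap_and_type >> 3,  # upper 13 bits
        rec_type=REC_TYPES.get(heap_and_type & 0x07, "reserved"),
        next_rel=next_rel,           # relative pointer to the next record
    )
```

A recovery pass for deleted rows would keep the records whose `deleted` flag is set, since a DELETE only marks them until purge runs.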
  • 16. REDUNDANT
      A row: (10, 'abcdef', 20)
      Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20), with field sizes of 4, 6, 7, 6 and 4 bytes
      Layout: Field Offsets | Extra 6 bytes (record_status, deleted_flag, min_rec_flag, n_owned, heap_no, n_fields, 1byte_offs_flag, next) | Fields
      Fields: 0x00 00 00 0A, ..., ..., abcdef, 0x80 00 00 14
  • 17. COMPACT
      A row: (10, 'abcdef', 20)
      Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20)
      Layout: Field Offsets (here 6, the length of 'abcdef') | NULLS (a bit per NULL-able field) | Extra 5 bytes (ending with next) | Fields
      Fields: 0x00 00 00 0A, ..., ..., abcdef, 0x80 00 00 14
  • 18. Data types
      INT types (fixed-size)
      String types
      • VARCHAR(x): variable-size
      • CHAR(x): fixed-size, variable-size if UTF-8
      DECIMAL
      • Stored as strings before 5.0.3, variable in size
      • Binary format since 5.0.3, fixed-size
  • 19. BLOB and other long fields
      • Field length (the so-called offset) is one or two bytes long
      • Page size is 16k
      • If record size < (UNIV_PAGE_SIZE/2 - 200), which is 7992 bytes for 16k pages, the record is stored internally (in a PK page)
      • Otherwise: 768 bytes internally, the rest in an external page
  • 20. 2. Internal system tables SYS_INDEXES and SYS_TABLES
  • 21. Why are SYS_* tables needed?
      • Correspondence "table name" -> "index_id"
      • Storage for other internal information
  • 22. How MySQL stores data in InnoDB: SYS_TABLES and SYS_INDEXES
      Always REDUNDANT format!
      SYS_INDEXES (index_id = 0-3):
      CREATE TABLE `SYS_INDEXES` (
        `TABLE_ID` bigint(20) unsigned NOT NULL default '0',
        `ID` bigint(20) unsigned NOT NULL default '0',
        `NAME` varchar(120) default NULL,
        `N_FIELDS` int(10) unsigned default NULL,
        `TYPE` int(10) unsigned default NULL,
        `SPACE` int(10) unsigned default NULL,
        `PAGE_NO` int(10) unsigned default NULL,
        PRIMARY KEY (`TABLE_ID`,`ID`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      NAME is PRIMARY, GEN_CLUSTER_ID or a unique index name
      SYS_TABLES (index_id = 0-1):
      CREATE TABLE `SYS_TABLES` (
        `NAME` varchar(255) NOT NULL default '',
        `ID` bigint(20) unsigned NOT NULL default '0',
        `N_COLS` int(10) unsigned default NULL,
        `TYPE` int(10) unsigned default NULL,
        `MIX_ID` bigint(20) unsigned default NULL,
        `MIX_LEN` int(10) unsigned default NULL,
        `CLUSTER_NAME` varchar(255) default NULL,
        `SPACE` int(10) unsigned default NULL,
        PRIMARY KEY (`NAME`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
  • 23. How MySQL stores data in InnoDB: example
      SYS_TABLES
      NAME                 ID  …
      "archive/msg_store"  40  8  1  0  0  NULL  0
      "archive/msg_store"  40  8  1  0  0  NULL  0
      "archive/msg_store"  40  8  1  0  0  NULL  0
      SYS_INDEXES
      TABLE_ID  ID      NAME        …
      40        196389  "PRIMARY"   2  3  0  21031026
      40        196390  "msg_hash"  1  0  0  21031028
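The "table name" to "index_id" correspondence is a two-step lookup: SYS_TABLES gives the table ID, and SYS_INDEXES gives the index IDs for that table ID. A minimal sketch using the example rows above, with the dictionary rows and the function name `primary_index_id` invented for this illustration:

```python
# Rows recovered from the dictionary, reduced to the columns the lookup needs.
SYS_TABLES = [
    {"NAME": "archive/msg_store", "ID": 40},
]
SYS_INDEXES = [
    {"TABLE_ID": 40, "ID": 196389, "NAME": "PRIMARY"},
    {"TABLE_ID": 40, "ID": 196390, "NAME": "msg_hash"},
]


def primary_index_id(table_name, sys_tables=SYS_TABLES, sys_indexes=SYS_INDEXES):
    """Map 'db/table' to the index_id of its PRIMARY index, or None."""
    table_ids = {row["ID"] for row in sys_tables if row["NAME"] == table_name}
    for row in sys_indexes:
        if row["TABLE_ID"] in table_ids and row["NAME"] == "PRIMARY":
            return row["ID"]
    return None
```

The PRIMARY index is the one to look for because it holds the full rows; the secondary index 196390 only carries the key plus the primary key.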
  • 24. 3. InnoDB Primary and Secondary keys
  • 25. Primary key
      The table:
      CREATE TABLE `t1` (
        `ID` int(11),
        `NAME` varchar(120),
        `N_FIELDS` int(10),
        PRIMARY KEY (`ID`),
        KEY `NAME` (`NAME`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
      Fields in the PK:
      1. ID
      2. DB_TRX_ID
      3. DB_ROLL_PTR
      4. NAME
      5. N_FIELDS
  • 26. Secondary key
      The table: the same `t1` as on the previous slide
      Fields in the SK:
      1. NAME
      2. ID (the primary key)
  • 27. 4. Typical failure scenarios
  • 28. Deleted records
      • DELETE FROM table WHERE id = 5;
      • Forgotten WHERE clause
      Band-aid:
      • Stop/kill mysqld ASAP
  • 29. How is a delete performed?
      "row/row0upd.c": "…/* How is a delete performed?...The delete is performed by setting the delete bit in the record and substituting the id of the deleting transaction for the original trx id, and substituting a new roll ptr for previous roll ptr. The old trx id and roll ptr are saved in the undo log record. Thus, no physical changes occur in the index tree structure at the time of the delete. Only when the undo log is purged, the index records will be physically deleted from the index trees.…"
  • 30. Dropped table/database
      • DROP TABLE table;
      • DROP DATABASE database;
      • Often happens when restoring from an SQL dump
      • Bad because the .FRM file goes away
      • Especially painful with innodb_file_per_table
      Band-aid:
      • Stop/kill mysqld ASAP
      • Stop IO on the HDD, or mount it read-only, or take a raw image
  • 31. Corrupted InnoDB tablespace
      • Hardware failures
      • OS or filesystem failures
      • InnoDB bugs
      • InnoDB tablespace corrupted by other processes
      Band-aid:
      • Stop mysqld
      • Take a copy of the InnoDB files
  • 32. Wrong UPDATE statement
      • UPDATE user SET Password = PASSWORD('qwerty') WHERE User='root';
      • Again, a forgotten WHERE clause
      • Bad because changes are applied in the PRIMARY index immediately
      • The old version goes to the UNDO segment
      Band-aid:
      • Stop/kill mysqld ASAP
  • 33. 5. InnoDB recovery tool
  • 34. Recovery prerequisites
      1. Media
         1. ibdata1
         2. *.ibd
         3. HDD image
      2. Table structure
         1. SQL dump
         2. *.FRM files
  • 35. table_defs.h (generated by create_defs.pl)
      { /* int(11) unsigned */
        name: "ID",
        type: FT_UINT,
        fixed_length: 4,
        has_limits: TRUE,
        limits: {
          can_be_null: FALSE,
          uint_min_val: 0,
          uint_max_val: 4294967295ULL
        },
        can_be_null: FALSE
      },
      { /* varchar(120) */
        name: "NAME",
        type: FT_CHAR,
        min_length: 0,
        max_length: 120,
        has_limits: TRUE,
        limits: {
          can_be_null: TRUE,
          char_min_len: 0,
          char_max_len: 120,
          char_ascii_only: TRUE
        },
        can_be_null: TRUE
      },
  • 36. How to get CREATE info from .frm files
      1. CREATE TABLE t1 (id int) Engine=INNODB;
      2. Replace t1.frm with the .frm file whose schema you need
      3. Run "show create table t1"
      If mysqld crashes:
      1. Look at the end of t1.frm in bvi: .ID.NAME.N_FIELDS..
      2. *.FRM viewer !TODO
  • 37. InnoDB recovery tool
      http://launchpad.net/percona-innodb-recovery-tool/
      1. Written in Percona
      2. Contributed by Percona and community
      3. Supported by Percona
      Consists of two major tools:
      • page_parser: splits an InnoDB tablespace into 16k pages
      • constraints_parser: scans a page and finds good records
  • 38. InnoDB recovery tool: page_parser
      server# ./page_parser -4 -f /var/lib/mysql/ibdata1
      Opening file: /var/lib/mysql/ibdata1
      Read data from fn=3...
      Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page
      Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page
      Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page
      Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page
      In recent versions the InnoDB pages are sorted by page type
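The idea behind this step, cutting the tablespace into 16k pages and grouping them by index, can be sketched as follows. This is not the page_parser source: it is a simplified illustration that keeps only pages whose type is FIL_PAGE_INDEX (value 17855, an assumption from the InnoDB sources), reads the type at byte 24 and the index id at byte 66 as derived from the header tables earlier in the deck, and returns page numbers instead of writing files. `group_pages_by_index` is a made-up name.

```python
import io
import struct
from collections import defaultdict

PAGE_SIZE = 16384
FIL_PAGE_TYPE_OFFSET = 24    # 4+4+4+4+8 bytes into the Fil Header
PAGE_INDEX_ID_OFFSET = 66    # 38-byte Fil Header + 28 bytes of Page Header
FIL_PAGE_INDEX = 17855       # assumption: value from the InnoDB sources


def group_pages_by_index(stream):
    """Read a tablespace stream and map (high, low) index ids to page numbers."""
    groups = defaultdict(list)
    page_no = 0
    while True:
        page = stream.read(PAGE_SIZE)
        if len(page) < PAGE_SIZE:   # end of file or a truncated last page
            break
        (page_type,) = struct.unpack_from(">H", page, FIL_PAGE_TYPE_OFFSET)
        if page_type == FIL_PAGE_INDEX:
            index_id = struct.unpack_from(">II", page, PAGE_INDEX_ID_OFFSET)
            groups[index_id].append(page_no)
        page_no += 1
    return dict(groups)
```

With a real file this would be called as `group_pages_by_index(open(path, "rb"))`; every page of one table's PRIMARY index then ends up under a single key.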
  • 39. Page signature check
      Page dump (row data masked with X):
      0{....0...4...4......=..E. ......
      ........<..~...A.......|.. ......
      .......................... ......
      ...infimum......supremumf. ....qT
      M/T/196001834/XXXXX XXXXXXXXXXX L X X X X X X X X X XXXXXXXXXXXXX
      1. INFIMUM and SUPREMUM records are in fixed positions
      2. Works with corrupted pages
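Since the infimum and supremum records sit at fixed positions, a page can be recognised by looking for the two literal strings, even when its headers are damaged. A minimal sketch of such a check; the COMPACT offsets (99 and 112) match the 0x63 and 0x70 printed by the tool later in this deck, while the REDUNDANT offsets (101 and 116) are an assumption taken from the InnoDB sources. The function name is hypothetical.

```python
def format_by_signature(page):
    """Guess the row format from the fixed infimum/supremum positions.

    Returns 'COMPACT', 'REDUNDANT', or None if the page carries no signature.
    """
    if page[99:106] == b"infimum" and page[112:120] == b"supremum":
        return "COMPACT"
    # Assumption: REDUNDANT origins, taken from the InnoDB sources.
    if page[101:108] == b"infimum" and page[116:124] == b"supremum":
        return "REDUNDANT"
    return None
```

A check like this does not depend on the Fil Header or the page type field, so it also works on raw disk images where page boundaries have to be guessed.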
  • 40. InnoDB recovery tool: constraints_parser
      server# ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page
      Table structure is defined in "include/table_defs.h"
      See the HOWTO for details: http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto
      Filters inside table_defs.h are very important (if an InnoDB page is corrupted)
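Why the filters matter: on a corrupted page almost any byte sequence can be decoded as a "row", so the declared limits are what separates real records from garbage. The sketch below shows the idea with the two fields from the table_defs.h example. It is an illustration in Python, not the tool's C code; the dictionary layout and `row_passes_filters` are invented here.

```python
# Limits mirroring the table_defs.h example for the ID and NAME fields.
FIELD_LIMITS = {
    "ID": {"type": "uint", "min": 0, "max": 4294967295, "can_be_null": False},
    "NAME": {"type": "char", "min_len": 0, "max_len": 120,
             "ascii_only": True, "can_be_null": True},
}


def row_passes_filters(row, limits=FIELD_LIMITS):
    """Return True if every field of a candidate row respects its limits."""
    for name, lim in limits.items():
        value = row.get(name)
        if value is None:
            if not lim["can_be_null"]:
                return False
            continue
        if lim["type"] == "uint":
            if not isinstance(value, int) or not lim["min"] <= value <= lim["max"]:
                return False
        elif lim["type"] == "char":
            if not lim["min_len"] <= len(value) <= lim["max_len"]:
                return False
            if lim["ascii_only"] and not value.isascii():
                return False
    return True
```

The tighter the limits (for example a realistic maximum ID instead of the full unsigned range), the fewer false records survive.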
  • 41. Check InnoDB page before reading recs
      # ./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V
      Initializing table definitions...
      Processing table: document_type_fieldsets_link
      - total fields: 5
      - nullable fields: 0
      - minimum header size: 5
      - minimum rec size: 25
      - maximum rec size: 25
      Read data from fn=3...
      Page id: 12665
      Checking a page
      Infimum offset: 0x63
      Supremum offset: 0x70
      Next record at offset: 0x9F (159)
      Next record at offset: 0xB0 (176)
      Next record at offset: 0x3D95 (15765)
      …
      Next record at offset: 0x70 (112)
      Page is good
      1. Check if the tool can follow all records by addresses
      2. If so, find a record exactly at the position where the record is
      3. Helps a lot for COMPACT format!
  • 42. Import result
      t1 1 "browse" 10
      t1 2 "dashboard" 20
      t1 3 "addFolder" 18
      t1 4 "editFolder" 15
      mysql> LOAD DATA INFILE '/path/to/datafile' REPLACE INTO TABLE <table_name> FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY '<table_name>\t';
  • 43. Questions? Thank you for coming!
      References
      • http://www.mysqlperformanceblog.com/
      • http://percona.com/
      • http://www.slideshare.net/akuzminsky