FOSDEM InnoDB prefetch

Presentation at 2011 FOSDEM on InnoDB I/O prefetching

Transcript

  • 1. FOSDEM 2011: InnoDB prefetch. Steve Hardy, Zarafa (s.hardy@zarafa.com)
  • 2. Who are you?
      - Zarafa Groupware
          - Open-source groupware (email, calendaring, tasks, etc.)
          - Exchange replacement, MAPI compatible
          - Web Access
          - iPhone / Android native sync
          - MySQL as storage system
          - Single servers serving > 4000 concurrent users, 1 TB database (excluding attachment data), > 10 billion rows
          - In the Fedora and Ubuntu repositories
  • 3. What did you do?
      - Improvement on the InnoDB I/O subsystem
      - Better utilizes disk resources
      - Patch, based on MySQL 5.5
  • 4. [Figure only; no text in the transcript]
  • 5. How fast is it?
      - mysqldump > /dev/null
      - 1.5 GB fragmented table
      - 4x 10k RPM SAS disks, RAID-0, 128k stripe
      - [Charts: Time (minutes), annotated "5x faster"; IOPS (r), annotated "5x faster"]
  • 6. How does it work?
      - Storage engine gets a request from MySQL to get data, e.g.:
          - Records between primary key 100 to 10000, and 12000 to 13000
          - All records from the beginning of the table
          - Single record with primary key 101
      - Data in InnoDB is on 16k pages
          - Each page has a sorted, sequential list of records
          - Each page has a pointer to the next page
          - A tree provides quick access to the pages (the tree itself is also stored in pages)
  • 7. What do these pages look like?
      create table test (a integer, b varchar(1), PRIMARY KEY (a));
      insert into test values (100,'A'),(101,'B'),(102,'C'),(103,'D'), …
      - Page 1 (key → page): 100 → 20, 104 → 22, 108 → 21
      - Page 20: 100 A, 101 B, 102 C, 103 D; Next: 22
      - Page 22: 104 E, 105 F, 106 G, 107 H; Next: 21
      - Page 21: 108 I, 109 J, 110 K, 111 L; Next: -
  • 8. Real-time simulation
      SELECT * FROM test WHERE a >= 102 AND a < 110
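The page layout from slides 6 to 8 can be modelled in a few lines of Python. This is only an illustrative sketch, not InnoDB code: `Page`, `LEAVES`, `ROOT` and `range_scan` are hypothetical names, and the page numbers and records are the ones from the example table. It shows which pages a scan for keys 102 to 109 has to touch, and in what order.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    records: list              # sorted (key, value) pairs
    next_page: Optional[int]   # page number of the next leaf, None at the end


# Leaf pages from the slide, keyed by page number.
LEAVES = {
    20: Page([(100, "A"), (101, "B"), (102, "C"), (103, "D")], 22),
    22: Page([(104, "E"), (105, "F"), (106, "G"), (107, "H")], 21),
    21: Page([(108, "I"), (109, "J"), (110, "K"), (111, "L")], None),
}

# Page 1 from the slide: (first key, leaf page number), sorted by key.
ROOT = [(100, 20), (104, 22), (108, 21)]


def range_scan(low, high):
    """Return (rows, pages_read) for low <= key < high."""
    # Use the tree page to find the leaf that may contain `low`.
    keys = [key for key, _ in ROOT]
    idx = max(bisect_right(keys, low) - 1, 0)
    page_no = ROOT[idx][1]

    rows, pages_read = [], []
    # Follow the next-page pointers until a key reaches `high`.
    while page_no is not None:
        page = LEAVES[page_no]
        pages_read.append(page_no)
        for key, value in page.records:
            if key >= high:
                return rows, pages_read
            if key >= low:
                rows.append((key, value))
        page_no = page.next_page
    return rows, pages_read


rows, pages_read = range_scan(102, 110)
```

In this model the scan visits pages 20, 22 and 21 in that order: logically sequential, but not sequential on disk.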
  • 9. Will my server be 5x faster?
      - In short: no
      - Depends on data access patterns and data write patterns
      - Existing prefetch methods
          - Disk / RAID read-ahead
              - Entire stripe could be read (in the order of 128K)
              - Disk has readahead logic (depends on firmware, etc.)
          - FS readahead
              - Pro: works well with sequential pages
              - Con: doesn't work at all with even slightly non-sequential pages
              - Con: disabled with O_DIRECT
  • 10. Will my server be 5x faster?
      - Existing prefetch methods
          - InnoDB linear readahead
              - Pro: works well with sequential pages
              - Con: doesn't work at all with largely non-sequential pages
          - InnoDB random readahead
              - Pro: works with random reads in a single extent
              - Con: extents are not that big
              - Con: removed in 5.5 (didn't work very well)
  • 11. Example 1: unfragmented table
      - Pages (read order): 104 105 106 107 108 109
      - Filesystem readahead kicks in after two pages; maximum disk throughput is achieved.
      - NOTE: with O_DIRECT, readahead doesn't kick in
  • 12. Example 2: fragmented table
      - Pages (read order): 118 106 21 213 52 13
      - Due to random access, no readahead is started; pages are requested from disk one by one.
  • 13. Example 3: 'holy' table (pages with holes between them)
      - Pages (read order): 100 102 104 106 108 110
      - Due to non-sequential access, no readahead is started; pages are requested from disk one by one, even though reading pages 100-110 sequentially would have been quicker.
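The three examples can be compared with a small Python sketch. It assumes a deliberately simplified trigger rule taken from Example 1 (readahead starts once two pages in a row are sequential); real filesystem and InnoDB readahead heuristics are more involved. The function names are hypothetical.

```python
def readahead_starts_at(read_order, trigger=2):
    """Return the index of the read at which readahead would start, or None.

    Simplified model: readahead starts once `trigger` pages in a row have
    been read, each page number exactly one higher than the previous one.
    """
    run = 1
    for i in range(1, len(read_order)):
        if read_order[i] == read_order[i - 1] + 1:
            run += 1
        else:
            run = 1
        if run >= trigger:
            return i
    return None


def sequential_span(read_order):
    """Number of pages in one sequential read covering every requested page."""
    return max(read_order) - min(read_order) + 1


unfragmented = [104, 105, 106, 107, 108, 109]   # Example 1
fragmented = [118, 106, 21, 213, 52, 13]        # Example 2
holey = [100, 102, 104, 106, 108, 110]          # Example 3

starts = {
    "unfragmented": readahead_starts_at(unfragmented),
    "fragmented": readahead_starts_at(fragmented),
    "holey": readahead_starts_at(holey),
}
```

Under this rule only the unfragmented table ever triggers readahead. For the table in Example 3, `sequential_span(holey)` is 11 pages, so one sequential read of 11 pages would replace 6 separate random requests.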
  • 14. Is my table fragmented?
      - Hard to see
          - http://developer.zarafa.com/InnoInfo
          - Not really 100% release quality (OK, it's alpha …)
          - Gives you an idea of fragmentation
          - In practice, almost all tables are fragmented to some extent
          - Out-of-order inserts make it very bad
          - If your mysqldump does ~200 IOPS, your table is fragmented
          - (tip: iostat -dx 1, look at r/s)
      - You can 'fix' it by running OPTIMIZE TABLE
          - Same as dump/re-import
          - Large downtime
          - Currently no online defragmentation (nor planned, as far as I know?)
          - Even after OPTIMIZE TABLE you can have a 'holy' table
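To put the ~200 IOPS figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that every read request fetches exactly one 16 KiB page, that nothing is served from cache, and that the 1.5 GB benchmark table is 1.5 GiB; `dump_estimate` is a hypothetical helper, not part of any tool.

```python
import math

PAGE_SIZE = 16 * 1024  # InnoDB page size from the talk, in bytes


def dump_estimate(table_bytes, iops, page_size=PAGE_SIZE):
    """Return (pages, seconds, bytes_per_second) for a one-page-per-request dump."""
    pages = math.ceil(table_bytes / page_size)
    seconds = pages / iops
    bytes_per_second = iops * page_size
    return pages, seconds, bytes_per_second


# 1.5 GiB table read at 200 single-page reads per second.
pages, seconds, bytes_per_second = dump_estimate(int(1.5 * 1024**3), 200)
```

Under these assumptions the dump reads 98,304 pages in about 490 seconds (roughly 8 minutes) at about 3 MiB/s, far below what four striped SAS disks can deliver sequentially.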
  • 15. How to use it
      - SET GLOBAL innodb_prefetch_pages = X
          - 0 <= X <= 1024
          - Default 32 pages (in benchmark 4x faster)
          - Too low a value: not much prefetched
          - Too high a value: possibly loading too many pages for future use
      - SHOW GLOBAL STATUS LIKE 'innodb_prefetch_%'
          - innodb_prefetches: number of prefetch batches done
          - innodb_prefetch_pages: total number of pages prefetched
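The trade-off between too low and too high a value can be illustrated with a toy model in Python. This is not how the patch schedules I/O; it only assumes that pages are requested in batches of `batch_size`, that a batch never reads past the end of the range, that a value of 0 means one request per page, and that the scan may stop early (for example because of a LIMIT clause). `prefetch_cost` and its parameters are hypothetical names.

```python
import math


def prefetch_cost(pages_needed, pages_in_range, batch_size):
    """Return (batches, pages_fetched, pages_wasted) for one range scan.

    Toy model: the scan consumes `pages_needed` of the `pages_in_range`
    pages in the range, fetching a new batch whenever it runs out of
    prefetched pages.
    """
    if batch_size <= 0:
        # Assumed meaning of 0: no prefetch, one request per page.
        return pages_needed, pages_needed, 0
    batches = math.ceil(pages_needed / batch_size)
    pages_fetched = min(batches * batch_size, pages_in_range)
    return batches, pages_fetched, pages_fetched - pages_needed


full_scan_small_batch = prefetch_cost(1000, 1000, 2)     # many small batches
full_scan_default = prefetch_cost(1000, 1000, 32)        # fewer, larger batches
limited_scan_default = prefetch_cost(10, 1000, 32)       # early stop, some waste
limited_scan_max = prefetch_cost(10, 1000, 1024)         # early stop, much waste
```

In this model a full scan benefits from larger batches (500 batches at size 2, 32 batches at size 32), while a scan that stops after 10 pages wastes 22 pages at size 32 and 990 pages at size 1024.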
  • 16. Other changes in the patch
      - records_in_range(): reduce the number of pages read
          - Does estimation of number of pages
          - Used for query optimization
          - Query optimization that uses more I/O than the actual query is a bad thing
          - Patch brings down precision, but cuts I/O by up to 80%
          - Keeps precision within an order of magnitude (d < 0.1)
      - Disable query optimization and row estimation when there is no choice
          - If you only have one query plan, we don't have to estimate
          - If you FORCE INDEX, seriously, stop preparing, just GO GO GO
  • 17. Summary
      - Patch to improve parallel data access in InnoDB
      - Improves performance mainly for:
          - Multi-disk RAID systems
          - Fragmented on-disk data
          - Queries requesting ranges of logically sequential data
  • 18. Summary
      - Potentially degraded performance for:
          - LIMIT clauses (prefetch overshoots end of records)
          - Other queries that stop processing rows before the end of a range
      - Doesn't help for:
          - CPU usage
          - SSDs (due to low random-access seek time)
          - ... but it doesn't hurt either
  • 19. What now?
      - Is it compatible with existing databases / MySQL 5.1?
          - Data storage is unchanged
          - MySQL 5.1 backport should be possible, but it has no native AIO support
      - Where can I get the patch?
          - http://www.zarafa.com/wiki/index.php/MySQL_5.5_InnoDB_prefetch_patch
  • 20. What now?
      - Do you have more benchmarks?
          - Not for now; try it out and get back to me
      - Can I use it in production?
          - No. It passes tests now, but needs work / proof
      - When are you going to finish it?
          - When someone with InnoDB knowledge tells me it's right/better/ok
