Why 4k?

Published in: Technology

  1. Why 4K?
     October 2, 2012
     George Wilson, george.wilson@delphix.com
  2. Why 4K? (Not Y4K)
     ● This is not the next Millennium bug!
     Delphix Proprietary and Confidential
  3. Storage History
     ● 1998: IBM publishes a paper proposing an increase of disk sector size from 512B to 4K
     ● 2000: The IDEMA (International Disk Drive Equipment and Materials Association) 4K committee is formed
     ● 2005: ZFS is released in OpenSolaris with support for block sizes ranging from 512B to 128K
     ● 2005: 512B emulation mode is proposed, later known as AF 512e
     ● 2006: ZFS adds large sector support
     ● 2009: Advanced Format is approved as the naming convention for 4K sectors
     ● 2011: All hard drive manufacturers start to ship AF 512e drives
  4. Advanced Format Drives
     ● Two flavors of Advanced Format drives
       ○ AF 512e - Advanced Format 512B Emulation (today)
       ○ AF 4Kn - Advanced Format 4K Native (future, 2012?)
  5. Advanced Format 512e (AF 512e)
     ● Maps 8 512B logical blocks into 1 physical 4K block
     ● Provides an emulation layer for compatibility
     [Diagram: 512B logical blocks 0-7 mapped onto 4K physical block #0]
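The 8-to-1 mapping on this slide can be sketched as simple arithmetic. This is only an illustration of the address math, not drive firmware; the names `LOGICAL_BLOCK`, `PHYSICAL_BLOCK`, and `locate` are made up for the example.

```python
LOGICAL_BLOCK = 512     # bytes per emulated logical block
PHYSICAL_BLOCK = 4096   # bytes per physical sector on the media


def locate(lba):
    """Map a 512B logical block address to (physical block, byte offset)."""
    return divmod(lba * LOGICAL_BLOCK, PHYSICAL_BLOCK)


if __name__ == "__main__":
    # Logical blocks 0-7 share physical block 0; logical block 8 starts block 1.
    for lba in (0, 7, 8):
        print(lba, locate(lba))
```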
  6. Predicting Future Problems
     "Access on a 512-byte basis would continue to be supported, but performance would be inferior to that in which access is done on a 4096-byte basis, and might well be inferior to that of previous drives with 512-byte native block size."
     -- "Large Block Size" by Paul Hodges and David Cheng, 1998
  7. Problems in a 4K World
     ● Lies
       ○ AF 512e drives lie about their physical block size
       ○ LUNs from storage vendors lie about their physical block size
     ● Misaligned I/O
       ○ Proper partitioning
       ○ Some AF 512e drives provide an XP jumper (an XP partition starts on sector 63, which is not 4K aligned)
     ● Read-modify-write
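The alignment problem with a partition that starts on sector 63 comes down to one modulo check. The sketch below assumes 512B logical sectors and 4K physical blocks; `is_4k_aligned` is a hypothetical helper, and the start sectors 64 and 2048 are example values chosen for comparison, not taken from the slides.

```python
SECTOR = 512            # bytes per logical sector
PHYSICAL_BLOCK = 4096   # bytes per physical block


def is_4k_aligned(start_sector):
    """True if a partition starting at this 512B sector begins on a 4K boundary."""
    return (start_sector * SECTOR) % PHYSICAL_BLOCK == 0


if __name__ == "__main__":
    for start in (63, 64, 2048):
        print(start, is_4k_aligned(start))
```

Sector 63 lands 3584 bytes into a physical block, so every 4K filesystem block inside such a partition straddles two physical blocks.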
  8. Sub-block Reads
     [Diagram: "Read 512B Block" targets one of the 512B logical blocks 0-7 in 4K physical block #1; the drive "Must Read 4K Block", that is, the whole physical block]
  9. Sub-block Writes (Read-modify-write)
     [Diagram, three steps over 512B logical blocks 0-7 in 4K physical block #0:]
     ● Logical Block: Read 512B
     ● Physical Block: Read 4K
     ● Physical Block: Write 4K
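The read-modify-write cycle can be modeled with a toy in-memory disk that counts physical operations. This is a minimal sketch of the idea, assuming a drive with no caching; the class `Emulated512eDisk` and its methods are invented for illustration and do not correspond to any real driver or firmware interface.

```python
LOGICAL_BLOCK = 512
PHYSICAL_BLOCK = 4096


class Emulated512eDisk:
    """Toy model of a 512e drive: 4K media exposed as 512B logical blocks."""

    def __init__(self, physical_blocks):
        self.media = bytearray(physical_blocks * PHYSICAL_BLOCK)
        self.physical_reads = 0
        self.physical_writes = 0

    def _read_physical(self, block):
        self.physical_reads += 1
        start = block * PHYSICAL_BLOCK
        return bytearray(self.media[start:start + PHYSICAL_BLOCK])

    def _write_physical(self, block, data):
        self.physical_writes += 1
        start = block * PHYSICAL_BLOCK
        self.media[start:start + PHYSICAL_BLOCK] = data

    def read_logical(self, lba):
        # A 512B read still costs a full 4K physical read.
        block, offset = divmod(lba * LOGICAL_BLOCK, PHYSICAL_BLOCK)
        return bytes(self._read_physical(block)[offset:offset + LOGICAL_BLOCK])

    def write_logical(self, lba, data):
        # A 512B write is a read-modify-write of the enclosing 4K block.
        assert len(data) == LOGICAL_BLOCK
        block, offset = divmod(lba * LOGICAL_BLOCK, PHYSICAL_BLOCK)
        buf = self._read_physical(block)            # read 4K
        buf[offset:offset + LOGICAL_BLOCK] = data   # modify 512B
        self._write_physical(block, buf)            # write 4K


if __name__ == "__main__":
    disk = Emulated512eDisk(physical_blocks=2)
    disk.write_logical(3, b"\xab" * LOGICAL_BLOCK)
    print(disk.physical_reads, disk.physical_writes)
```

One 512B logical write costs one 4K physical read plus one 4K physical write, which is the overhead the slide illustrates.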
  10. Misaligned 4K Writes
      [Diagram, three steps over logical blocks 0-9 spanning 4K physical block #0 and 4K physical block #1:]
      ● Logical Block: Read 4K
      ● Physical Block: Read 2 4K
      ● Physical Block: Write 2 4K
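Why a misaligned 4K write touches two physical blocks can be checked by counting the blocks an I/O overlaps. This is a sketch under the assumption of 4K physical blocks and byte-addressed offsets; `physical_blocks_touched` is a hypothetical helper.

```python
PHYSICAL_BLOCK = 4096


def physical_blocks_touched(offset, length):
    """Count the 4K physical blocks overlapped by an I/O at a byte offset."""
    first = offset // PHYSICAL_BLOCK
    last = (offset + length - 1) // PHYSICAL_BLOCK
    return last - first + 1


if __name__ == "__main__":
    print(physical_blocks_touched(0, 4096))         # aligned 4K write
    print(physical_blocks_touched(512, 4096))       # shifted by one 512B block
    print(physical_blocks_touched(63 * 512, 4096))  # first 4K of an XP-style partition
```

Each physical block that is only partly covered needs its own read-modify-write, so the misaligned case doubles the physical work for the same logical write.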
  11. Solutions (sort of)
      ● Override the lies from the device
        ○ FreeBSD, Illumos, and Linux have all implemented a way to override the discovered sector size
        ○ FreeBSD
          ■ use gnop to create a 4K device
        ○ Illumos
          ■ add an override to sd.conf:
            sd-config-list = "VENDOR PRODUCT", "physical-block-size:4096";
        ○ Linux
          ■ zpool create -o ashift=12 tank <device>
  12. Drawbacks of 4K and ZFS
      ● Reduced compression ratio
        ○ Blocks less than 4K mean 0% compression
        ○ An 8K block can only achieve 50% compression
      ● Migrating drives from 512B to 4K
      ● Inefficient metadata allocation
        ○ Some metadata is allocated in 4K chunks and will no longer get compressed
      ● Improper accounting of compressed sizes in datasets
      ● RAID-Z and 4K -- not recommended
      ● Configuring root pools to use 4K
        ○ Grub support?
      ● Fewer uberblocks
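The compression figures on this slide follow from rounding every allocation up to a whole sector. The sketch below is a simplified model that assumes compressed data always occupies at least one sector of `1 << ashift` bytes and ignores everything else the real ZFS allocator does; `allocated_size` and `best_case_savings` are hypothetical names.

```python
def allocated_size(nbytes, ashift):
    """Round a byte count up to a whole number of (1 << ashift)-byte sectors."""
    sector = 1 << ashift
    return -(-nbytes // sector) * sector


def best_case_savings(logical_bytes, ashift):
    """Best possible space saving for a block, as a fraction of its
    uncompressed allocation, if compression shrinks it to one sector."""
    uncompressed = allocated_size(logical_bytes, ashift)
    smallest = allocated_size(1, ashift)
    return 1 - smallest / uncompressed


if __name__ == "__main__":
    for size in (2048, 4096, 8192, 131072):
        print(size, best_case_savings(size, ashift=12), best_case_savings(size, ashift=9))
```

Under this model a 4K block on 4K sectors (ashift=12) can never save space, and an 8K block tops out at one half, while the same blocks on 512B sectors (ashift=9) have far more room to shrink.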
  13. Q&A / Beer?
  14. ZFS Day
      October 2, 2012
      George Wilson, george.wilson@delphix.com
