State of the Art Thin Provisioning
  Stephen Foskett
  stephen@fosketts.net
  twitter.com/sfoskett
Storage Is Supposed To Be Getting Cheaper!
  • Disk cost is dropping rapidly
  • $250 buys:
      • 1994: 2 GB
      • 1999: 20 GB
      • 2004: 200 GB
      • 2009: 2000 GB
  • But enterprise storage costs keep rising!
Where Is The Cost?
  • Hardware and software make up a small percentage of total enterprise storage spending…
  • …and hard disk drive capacity makes up a small percentage of that!
  • Data center/environmental, administrative personnel, maintenance, and data protection are much bigger
  • The biggest opportunity is inefficiency, but this has always been hard to tackle
Over-Allocation and Under-Utilization
  • Conventional storage provisioning is grossly inefficient
  [Figure: capacity layers, labeled Raw Disk Capacity Purchased, Usable Protected Storage, Capacity Allocated to Servers, Requested Capacity, Used by Files, Required Capacity]
Thin Provisioning Simplified!
  [Figure: traditional storage provisioning compared with thin storage provisioning; labels: Allocated but unused, Free for allocation, Actually Used, Used]
Thin Provisioning: Potentially Problematic
  • Storage is commonly over-allocated to servers
  • Some arrays can “thinly” provision just the capacity that actually contains data
      • 500 GB request for new project, but only 2 GB of initial data is written: the array allocates only 2 GB and expands as data is written
  • What’s not to love?
      • Oops: we provisioned a petabyte and ran out of storage
      • Chunk sizes and formatting conflicts
      • Can it thin unprovision?
      • Can it replicate to and from thin provisioned volumes?
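
The 500 GB request and the "ran out of storage" risk can be sketched with a toy model. This is a minimal illustration, not any vendor's implementation: the names `ThinPool`, `PoolExhausted`, `provision`, and `write`, and the 1 GB chunk size, are all assumptions made up for the example.

```python
class PoolExhausted(Exception):
    """Raised when a write needs a new chunk but the pool has none left."""


class ThinPool:
    """Toy thin-provisioning pool (hypothetical model, 1 GB chunks by default)."""

    def __init__(self, physical_gb, chunk_gb=1):
        self.physical_gb = physical_gb
        self.chunk_gb = chunk_gb
        self.free_chunks = physical_gb // chunk_gb
        self.volumes = {}  # name -> {"provisioned_gb": int, "chunks": set}

    def provision(self, name, size_gb):
        # Provisioning only records a promise; no physical chunk is consumed.
        self.volumes[name] = {"provisioned_gb": size_gb, "chunks": set()}

    def write(self, name, offset_gb):
        vol = self.volumes[name]
        if not 0 <= offset_gb < vol["provisioned_gb"]:
            raise ValueError("write beyond provisioned size")
        chunk = offset_gb // self.chunk_gb
        if chunk not in vol["chunks"]:
            # First write to this chunk: allocate it now, if we still can.
            if self.free_chunks == 0:
                raise PoolExhausted(name)
            self.free_chunks -= 1
            vol["chunks"].add(chunk)

    def allocated_gb(self, name):
        return len(self.volumes[name]["chunks"]) * self.chunk_gb

    def overcommit_ratio(self):
        promised = sum(v["provisioned_gb"] for v in self.volumes.values())
        return promised / self.physical_gb


if __name__ == "__main__":
    pool = ThinPool(physical_gb=100)
    pool.provision("project", 500)      # 500 GB promised from a 100 GB pool
    pool.write("project", 0)
    pool.write("project", 1)
    print(pool.allocated_gb("project"), "GB allocated,",
          pool.overcommit_ratio(), "x overcommitted")
```

In this sketch the server sees 500 GB while the pool has given up only 2 GB. The failure appears only when writes reach the 101st distinct chunk, which is why monitoring matters more than the provisioning step itself.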
Are You Solving a Technical or Business Issue?
Ever Play the “Telephone” Game?
  • Each layer obscures the ones above and below it
  • SNIA Shared Storage Model:
      IV   Application
      III  File/Record Layer (File System, Database)
      IIc  Block Aggregation: Host
      IIb  Block Aggregation: Network
      IIa  Block Aggregation: Device
      I    Storage Devices
It’s (Relatively) Easy to Allocate on Write
  • As applications write data, capacity is allocated
  • File system write requests pass through to storage systems, so they can wait to allocate as requested
  [Figure: File System above Storage]
But What About De-Allocate on Delete?
  • When data is deleted, capacity should be freed up
  • Most file systems don’t send a consistent “de-allocate” message to storage, so many thin systems get fatter over time
  [Figure: File System above Storage]
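
The "fatter over time" effect can be sketched with a toy file system sitting on a thin volume. Everything here is hypothetical (`ThinVolume`, `ToyFileSystem`, `churn`, the `sends_deallocate` flag), and the toy file system never reuses freed blocks, which is a deliberate simplification that exaggerates the growth.

```python
class ThinVolume:
    """Toy thin volume: tracks which blocks currently hold physical capacity."""

    def __init__(self):
        self.allocated = set()

    def write(self, block):
        self.allocated.add(block)       # allocate on write

    def unmap(self, block):
        self.allocated.discard(block)   # de-allocate, but only if told to


class ToyFileSystem:
    """Toy file system; assumption: freed blocks are never reused."""

    def __init__(self, volume, sends_deallocate):
        self.volume = volume
        self.sends_deallocate = sends_deallocate
        self.files = {}
        self.next_block = 0

    def create(self, name, nblocks):
        blocks = list(range(self.next_block, self.next_block + nblocks))
        self.next_block += nblocks
        for b in blocks:
            self.volume.write(b)
        self.files[name] = blocks

    def delete(self, name):
        blocks = self.files.pop(name)   # the file system forgets the file...
        if self.sends_deallocate:
            for b in blocks:            # ...but storage only learns of it here
                self.volume.unmap(b)


def churn(fs, rounds, nblocks):
    """Create and immediately delete `rounds` temporary files."""
    for i in range(rounds):
        fs.create(f"tmp{i}", nblocks)
        fs.delete(f"tmp{i}")


if __name__ == "__main__":
    silent = ToyFileSystem(ThinVolume(), sends_deallocate=False)
    talkative = ToyFileSystem(ThinVolume(), sends_deallocate=True)
    churn(silent, rounds=10, nblocks=8)
    churn(talkative, rounds=10, nblocks=8)
    print(len(silent.volume.allocated), len(talkative.volume.allocated))
```

Both file systems end up holding no files, yet only the one that passes the delete down to storage leaves the volume thin.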
Two Approaches To Thin
Server Smarts: Metadata Monitoring
  • File system/VM combos can handle thin provisioning on their own
      • ZFS, Veritas Volume Manager, VMware VMFS
  • Arrays can “watch” an operating system allocate and de-allocate storage
      • Perilous! Known file systems and volume formats only!
      • Data Robotics Drobo supports FAT32, NTFS, HFS+
  [Figure: File System above Storage; Drobo watches the file allocation table for deletes]
Storage Smarts: Zero Page Reclaim
  • Storage arrays watch for “pages” containing all zeros and simply don’t write them
      • IBM XIV, 3PAR, NetApp (with dedupe), HDS, EMC V-Max
  • Some storage vendors rely on utilities to reclaim
      • NetApp SnapDrive for Windows 5.0
      • Compellent Free Space Recovery
      • Veritas Storage Foundation Thin Reclamation
  • Can also force it with sdelete
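
A minimal sketch of the zero page idea follows, assuming a hypothetical 4096-byte page and made-up names (`PAGE_SIZE`, `is_zero_page`, `ZeroReclaimVolume`). Real arrays use their own page sizes and may detect zeros inline or in a background scan; this shows only the inline case.

```python
PAGE_SIZE = 4096  # hypothetical page size, chosen for illustration only


def is_zero_page(page):
    """True if every byte in the page is zero."""
    return not any(page)


class ZeroReclaimVolume:
    """Toy volume that never stores all-zero pages."""

    def __init__(self):
        self.pages = {}  # page number -> bytes; absent means unallocated

    def write(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("writes must be exactly one page")
        if is_zero_page(data):
            # Don't store zeros; if the page was allocated, reclaim it.
            self.pages.pop(page_no, None)
        else:
            self.pages[page_no] = bytes(data)

    def read(self, page_no):
        # Unallocated pages read back as zeros, so the host sees no difference.
        return self.pages.get(page_no, bytes(PAGE_SIZE))

    def allocated_pages(self):
        return len(self.pages)


if __name__ == "__main__":
    vol = ZeroReclaimVolume()
    vol.write(0, b"\x01" * PAGE_SIZE)
    vol.write(1, bytes(PAGE_SIZE))      # all zeros: never stored
    vol.write(0, bytes(PAGE_SIZE))      # zeroing a used page reclaims it
    print(vol.allocated_pages())
```

The host must actually write the zeros for this to work, which is the role of tools such as sdelete mentioned above.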
Zero Page Reclaim: Pros and Cons
  • Pro:
      • Straightforward to implement in storage
      • Some implementation: VMware eagerzeroedthick
  • Con:
      • Requires application/OS/file system to actually have written all zeros; most just ignore unused space rather than zeroing
      • Most implementations are page-based
      • Drives more I/O
      • VMware thin/thick don’t work
The Lingo: WRITE_SAME
  • Facilitates zero page reclaim
  • “Write this block 1,000,000 times”
  • Pro:
      • Conserves I/O operations
      • Popular with array vendors
      • Exists and is even implemented (a little)
  • Con:
      • Depends on file system layer intelligence
      • Still introduces extra I/O
      • Could be very, very bad in a thin-unaware array
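
The trade-off can be sketched with a toy array that counts commands. This does not model the real SCSI command encoding; `ToyArray`, its `thin_aware` flag, and the 512-byte block are assumptions for illustration.

```python
BLOCK = 512  # hypothetical block size in bytes


class ToyArray:
    """Toy array: counts host commands and tracks physically stored blocks."""

    def __init__(self, thin_aware):
        self.thin_aware = thin_aware
        self.blocks = {}    # lba -> data physically stored
        self.commands = 0   # commands received from the host

    def write(self, lba, data):
        self.commands += 1
        self.blocks[lba] = data

    def write_same(self, lba, count, data):
        # One command describes the whole range: "write this block N times".
        self.commands += 1
        zero = not any(data)
        for i in range(lba, lba + count):
            if self.thin_aware and zero:
                self.blocks.pop(i, None)   # treat zeros as de-allocation
            else:
                self.blocks[i] = data      # thin-unaware: really stores them


if __name__ == "__main__":
    zeros = bytes(BLOCK)
    aware = ToyArray(thin_aware=True)
    unaware = ToyArray(thin_aware=False)
    for array in (aware, unaware):
        array.write_same(0, 1000, zeros)
        print(array.commands, "command,", len(array.blocks), "blocks stored")
```

The same single command leaves the thin-aware array empty and fills the thin-unaware one, which is the "very, very bad" case from the slide.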
The Bridge: Veritas Thin API
  • Thin Reclamation API can communicate de-allocation to arrays by zeroing using WRITE_SAME/UNMAP
      • Introduced in 5.0 (UNIX) and 5.1 (Windows)
      • Supports 3PAR, EMC CLARiiON CX4, HDS USPV/VM, HP XP20k/24k, IBM XIV
      • Will also support Compellent, EMC Symmetrix DMX, Fujitsu Eternus, HP EVA, HDS AMS, IBM DS8k, NetApp
  • SmartMove copies only allocated blocks
      • Supports any/all storage systems
      • Works with thin-capable arrays
      • Speeds up migrations in all cases
What About TRIM?
  • TRIM (ATA) and TRIM/UNMAP/PUNCH (SCSI) can inform storage that a block is no longer needed
  • Designed for SSD architecture:
      • Cells grouped into 4 kB pages and 512 kB blocks
      • Only empty pages can be written to
      • Writing to empty pages is quick!
      • Writing to used pages requires a block erase
      • Read-erase-write is slow(er)
  • OS support for TRIM:
      • Windows 7 & Server 2008 R2
      • Linux 2.6.33, OpenSolaris, FreeBSD 9
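
The read-erase-write cost, and how TRIM reduces it, can be sketched using the 4 kB page and 512 kB block figures from the slide. This is a deliberately simplified model: the `FlashBlock` class is hypothetical, and real SSD controllers remap writes to other blocks rather than erasing in place.

```python
PAGE_KB = 4
BLOCK_KB = 512
PAGES_PER_BLOCK = BLOCK_KB // PAGE_KB   # 128 pages share one erase block


class FlashBlock:
    """Toy erase block: pages can only be written when empty."""

    def __init__(self):
        self.programmed = [False] * PAGES_PER_BLOCK  # written since last erase
        self.valid = [False] * PAGES_PER_BLOCK       # data the host still needs
        self.erases = 0
        self.pages_copied = 0

    def write(self, page):
        if self.programmed[page]:
            # Read-erase-write: save the other valid pages, erase the whole
            # block, then write the survivors back before the new data.
            survivors = [i for i, v in enumerate(self.valid) if v and i != page]
            self.erases += 1
            self.pages_copied += len(survivors)
            self.programmed = [False] * PAGES_PER_BLOCK
            for i in survivors:
                self.programmed[i] = True
        self.programmed[page] = True   # writing to an empty page is quick
        self.valid[page] = True

    def trim(self, page):
        # The host says this page is no longer needed, so it need not survive
        # the next erase.
        self.valid[page] = False


if __name__ == "__main__":
    block = FlashBlock()
    for p in range(PAGES_PER_BLOCK):
        block.write(p)
    for p in range(64, PAGES_PER_BLOCK):
        block.trim(p)                  # half the block is deleted data
    block.write(0)                     # overwrite forces one erase
    print(block.erases, "erase,", block.pages_copied, "pages copied back")
```

Without the `trim` calls the same overwrite would copy back 127 pages instead of 63, and the trimmed pages would not come back empty and quick to write.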
TRIM Isn’t For Thin
  • Not really a thin-provisioning command but could play one on TV
  • NetApp proposed a hole punching standard to INCITS T10 committee
  • HDS and EMC prefer UNMAP bit
  • A similar NetApp approach uses NFS and a Windows file system redirect
More Obstacles!
Granularity (Page Sizes)
  • Large page: no thin provisioning
  • Small page: thin even with fragmentation
Processing and Scheduling
  [Figure labels: Intensive, Ineffective]
Fragmentation Kills Thin Provisioning
  • Fragmented file system spans thin pages
  • Defragmented file system allows thin provisioning
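
How page size and fragmentation interact can be worked through with a little arithmetic. The figures below (4 kB file system blocks, a 4 MB volume, 16 kB and 256 kB pages) and the function names are assumptions picked to keep the numbers small.

```python
def pages_touched(used_blocks, block_kb, page_kb):
    """Number of distinct array pages that contain at least one used block."""
    return len({(b * block_kb) // page_kb for b in used_blocks})


def allocated_kb(used_blocks, block_kb, page_kb):
    """Physical capacity the array must allocate for the given used blocks."""
    return pages_touched(used_blocks, block_kb, page_kb) * page_kb


if __name__ == "__main__":
    BLOCK_KB = 4
    contiguous = range(64)              # 256 kB of data packed together
    fragmented = range(0, 1024, 16)     # same 256 kB spread across 4 MB

    for page_kb in (16, 256):
        print(page_kb, "kB pages:",
              allocated_kb(contiguous, BLOCK_KB, page_kb), "kB contiguous,",
              allocated_kb(fragmented, BLOCK_KB, page_kb), "kB fragmented")
```

In this example the same 256 kB of data costs 256 kB when contiguous at either page size, but when fragmented it costs 1024 kB with small pages and the full 4096 kB volume with large pages, at which point nothing is thin any more.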
The Performance Crunch
  • How high can we drive utilization without killing performance?
Stephen’s Dream
  • Thin provisioning could be awesome, provided it is integrated at all levels of the stack
      • Smart applications that don’t spew data everywhere
      • Smart file systems and volume managers that communicate what is and isn’t used
      • Smart virtualization layers that don’t obscure usage
      • Smart storage systems that act on all of this information with granularity and without falling over dead
      • Smart monitoring systems to tie everything together and head off disaster
Thank You!
  Stephen Foskett
  stephen@fosketts.net
  twitter.com/sfoskett
  +1(508)451-9532
  FoskettServices.com
  blog.fosketts.net
  GestaltIT.com
