Shingled disk

Published in: Technology
  1. Shingled Disk
  2. The Big Data Storage. Storage: magnetic disks. High storage density. Current: 400GB/in² to 550GB/in², with a 30-50% increase per year. It is reaching its physical limit, the "superparamagnetic limit". Predicted limit: around 1TB/in².
  3. Review: Hard disk
  4. Track
  5. Conventionally Written Track. Tracks do not overlap. Track width w (e.g. w = 25nm). Guard gaps between tracks (e.g. g = 5nm). The bottleneck is the width of the written track: current read heads can work on much narrower tracks, but it is hard to write narrower tracks.
  6. Shingled Disk: Overlapping Tracks. A wider track is written (e.g. w = 70nm). Shingled writing overlaps tracks. The remaining residual track can be much narrower (e.g. r = 10nm).
  7. Characteristics. Higher density without significant hardware change: 2-3 times the conventional disk density. Supports random reads but only sequential writes: a single write destroys the next k tracks; typically k = 4 to 8. Can we do better than a "tape with random read support"?
  8. Two High-Level Strategies. (1) Mask the operational difference of a shingled disk: a drop-in replacement for current disks that uses the standard block interface. (2) Use a specialized file system with little or no hardware masking: more flexibility in data layout and block management, and increased knowledge at the file system layer.
  9. Strategy One: Masking the Operational Difference. Synergy with SSDs, where block erasure is slow: an SSD uses a Flash Translation Layer (FTL); a shingled disk uses a Shingled Translation Layer (STL), which translates from Virtual Block Address to Logical Block Address on disk. How to perform a random write: one extreme is read-modify-write; the other extreme is to remap the physical location of the written data. Benefit: no change for user and system; a "drop-in" replacement for the current system.
  10. Strategy One: Masking the Operational Difference. Drawbacks: experience with SSDs indicates the performance will be hard to predict. Reverse engineering on SSDs to achieve higher-level goals. A sophisticated STL could be expensive. Data stored at contiguous Virtual Block Addresses could be far apart on disk, e.g. a database table with frequent edits, or concurrent downloads of movies. A large NVRAM (as cache) might mitigate the problem.
  11. Virtual Block Address Translation. Need to quickly translate a Virtual Block Address to a Logical Block Address. The translation table could be very large. With 2TB capacity and 8 bytes per entry: a 4KB block size gives a 4GB translation table; a 512-byte block size gives a 32GB translation table. Some B+-tree type structure.
  12. Strategy Two: Specialized System. Simple Shingled Translation Layer: random updates use read-modify-write; a TRIM command tells the hardware that overwriting subsequent tracks is fine; support for formatting some part of the disk unshingled. More sophisticated system software: avoid writing to the middle of a band; conceptualize writing as appending to a log; perform the necessary data remapping and garbage collection.
  13. Design Issues for Shingled Write Disk
  14. Band Abstraction. Store the bulk of data in bands. A band is a collection of b contiguous tracks, with a buffer of k tracks at the end of each band. Bands do not interfere with each other. More flexible.
  15. Proportional Capacity Loss. c = 1 - b/(b+k) is the proportional capacity loss. With k = 5 and a target of c < 0.1, we need b > 45. Each band then holds 67.5MB. Reasonable for a modern LFS.
  16. Band Usage. (1) Only write complete bands: each band contains a segment of a log-structured file system; assumes data is buffered in NVRAM. (2) Only append to bands: less efficient. (3) Circular log inside each band: consume data from the head and append data to the tail; requires an additional k-track gap between head and tail. (4) Flexible band size: neighboring bands can be joined; not suitable for a general-purpose SWD, listed just for completeness.
  17. Reserved Space for Random Update. Option 1: NVRAM. Option 2: Random Access Zone (RAZ): every track is followed by k unused tracks; the density of the RAZ is lower than that of a current disk.
  18. How Large a Random Access Zone Can We Have? Assume that, without a RAZ, the capacity of the shingled disk is 2.3 times that of a conventional disk. If we want to guarantee L = 2 times the capacity of a conventional disk, then with k = 5 the RAZ is 3.75% of total storage capacity.
  19. Trade-offs for the Two Options. Reserved space for random access. Option 1: NVRAM: faster, but more expensive (costs 10 times as much as RAZ). Option 2: Random Access Zone (RAZ): uses some part of the disk as the Random Access Zone; cheaper but slower. Trade-offs would be interesting.
  20. Usage of NVRAM. (1) Buffering data for writing bands: be careful about the limited number of write-erase cycles of flash memory. (2) Use NVRAM to store metadata: metadata tends to have a higher amount of activity; in a Write Anywhere File System, NVRAM could be used to maintain the log of file system activities. (3) Store recently created objects: by temporal locality, a block or object created long ago is less likely to be updated; if data is first written to NVRAM, we can also get better placement of data on disk.
  21. Number of Logs. A log-structured file system is assumed here. What is the benefit of having more than a single log? (1) Separation between metadata and data, e.g. access time. (2) Allocating files for more efficient read access later, e.g. when downloading several movies at the same time: with only one log, all the movie objects would be interspersed, which is inefficient for reads.
  22. Workload for General-Purpose Personal Usage
  23. Workloads Evaluation. Rate of block updates: if few blocks are updated frequently, there is less need for a Random Access Zone / NVRAM, and the shingled disk is more usable as a replacement for a conventional disk. Evaluated workloads: (1) general-purpose personal usage for 1 month; (2) specialized workload: video editing for 3 hours; (3) specialized workload: music library management, with negligible block updates, which is not surprising.
  24. Some Points From the Workloads. (1) Identifying hot blocks is important, since the volume of hot blocks is small enough to be held in the Random Access Zone / NVRAM. (2) Larger block sizes reduce the accuracy of identifying hot blocks, but not significantly. (3) A file system that distinguishes metadata from user data would be helpful.
  25. Workload for Disk Arrays
  26. Shingled Disk Arrays. Can the shingled disk be used in a server environment? It would probably be part of a disk array. It could have writes originating from different sources. Two impacts on workload: data striping and workload interleaving. Replay workloads against a simulated drive, using a log-structured writing scheme to perform in-band updates.
  27. Logical Arrangement of Blocks
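
The translation-table sizes quoted on slide 11 can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a flat table with one 8-byte entry per block and binary units (2TB = 2 × 2^40 bytes); the function name `translation_table_bytes` is hypothetical and not taken from the slides.

```python
TIB = 2**40  # assumption: "2T" on the slide is read as 2 * 2**40 bytes
GIB = 2**30


def translation_table_bytes(capacity_bytes: int, block_bytes: int,
                            entry_bytes: int = 8) -> int:
    """Size of a flat table holding one entry per block (hypothetical helper)."""
    num_blocks = capacity_bytes // block_bytes
    return num_blocks * entry_bytes


if __name__ == "__main__":
    for block in (4096, 512):
        size = translation_table_bytes(2 * TIB, block)
        print(f"block size {block} B -> table of {size // GIB} GiB")
```

Under these assumptions the 4KB case yields 4GiB and the 512-byte case yields 32GiB, matching the slide; a tree-structured index such as the B+-tree mentioned there would trade some lookup cost for a smaller resident footprint.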
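
The band-size bound on slide 15 follows from c = 1 - b/(b+k) = k/(b+k). The sketch below searches for the smallest band that keeps the loss strictly under a target, using exact fractions to avoid rounding at the boundary. The function names are hypothetical, and the 1.5MB track size is an assumption inferred from the slide's own figures (67.5MB / 45 tracks), not something the slides state directly.

```python
from fractions import Fraction


def capacity_loss(b: int, k: int) -> Fraction:
    """Proportional capacity loss c = 1 - b/(b+k) = k/(b+k)."""
    return Fraction(k, b + k)


def min_band_tracks(k: int, max_loss: Fraction) -> int:
    """Smallest number of data tracks b with capacity_loss(b, k) < max_loss."""
    b = 1
    while capacity_loss(b, k) >= max_loss:
        b += 1
    return b


# Assumption: track size inferred from the slide (67.5MB per 45-track band).
TRACK_MB = 1.5

if __name__ == "__main__":
    k = 5
    print("loss at b = 45:", capacity_loss(45, k))
    print("smallest b with loss < 10%:", min_band_tracks(k, Fraction(1, 10)))
    print("data in a 45-track band (MB):", 45 * TRACK_MB)
```

At b = 45 the loss is exactly 10%, so the strict bound c < 0.1 needs b > 45, which is how the slide states it.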
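
The write behaviour from slide 7 (a write destroys the next k tracks) and the read-modify-write update from slides 9 and 12 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a band is a Python list of tracks, destruction is clipped at the end of the band (standing in for the k-track buffer of slide 14), and the class and method names are hypothetical.

```python
class ShingledBand:
    """Toy model of one band of b tracks on a shingled disk."""

    def __init__(self, b: int, k: int):
        self.b = b
        self.k = k
        self.tracks = [None] * b  # None marks a track holding no valid data

    def raw_write(self, i: int, data) -> None:
        """Write track i; the wide write head destroys the next k tracks."""
        self.tracks[i] = data
        for j in range(i + 1, min(i + self.k + 1, self.b)):
            self.tracks[j] = None

    def read(self, i: int):
        """Random reads are unaffected by shingling."""
        return self.tracks[i]

    def read_modify_write(self, i: int, data) -> None:
        """Update track i while preserving the tracks that follow it."""
        tail = self.tracks[i + 1:]       # read the rest of the band first
        self.raw_write(i, data)          # this clobbers up to k later tracks
        for j, old in enumerate(tail, start=i + 1):
            if old is not None:
                self.raw_write(j, old)   # rewrite the tail in track order


if __name__ == "__main__":
    band = ShingledBand(b=6, k=2)
    for i in range(6):                   # sequential fill is safe
        band.raw_write(i, f"t{i}")
    band.read_modify_write(2, "new")
    print(band.tracks)
```

The cost of `read_modify_write` grows with the distance from the updated track to the end of the band, which is one reason the slides favour appending to a log over in-place updates.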
