  1. 1. Endurance Enhancement of Flash-Memory Storage Systems: An Efficient Static Wear Leveling Design. Yuan-Hao Chang, Jen-Wei Hsieh, and Tei-Wei Kuo. Embedded Systems and Wireless Networking Laboratory, Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
  2. 2. Outline <ul><li>Introduction </li></ul><ul><li>System Architecture </li></ul><ul><li>Motivation </li></ul><ul><li>An Efficient Static Wear Leveling Mechanism </li></ul><ul><li>Performance Evaluation </li></ul><ul><li>Conclusion </li></ul>
  3. 3. Introduction - Why Flash Memory <ul><li>Diversified Application Domains </li></ul><ul><ul><li>Portable Storage Devices </li></ul></ul><ul><ul><li>Consumer Electronics </li></ul></ul><ul><ul><li>Industrial Applications </li></ul></ul><ul><li>SoC and Hybrid Devices </li></ul><ul><ul><li>Critical System Components </li></ul></ul>
  4. 4. Introduction - Characteristics of Flash Memory [Figure: a 64MB SLC flash memory with blocks numbered 0 to 511, each holding pages numbered 0 to 63; each page is 2KB + 64 bytes.] Page: basic write-operation unit. Block: basic erase-operation unit.
  5. 5. System Architecture [Figure: system architecture; example devices: SD, xD, MemoryStick, SmartMedia, CompactFlash.] * FTL: Flash Translation Layer; MTD: Memory Technology Device
  6. 6. Policies – FTL <ul><li>FTL adopts a page-level address translation mechanism . </li></ul><ul><ul><li>Main problem: Large memory space requirement </li></ul></ul>
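To give a feel for the memory cost of page-level translation, here is a rough back-of-the-envelope sketch. The 4-byte table entry and the helper name `page_map_bytes` are assumptions for illustration only; the slides do not state an entry size.

```python
def page_map_bytes(capacity_bytes, page_bytes=2048, entry_bytes=4):
    """RAM needed for a page-level mapping table: one entry per flash page.

    entry_bytes=4 is an assumed entry size, not a figure from the slides.
    """
    num_pages = capacity_bytes // page_bytes
    return num_pages * entry_bytes


GIB = 2 ** 30
# A 1GB device with 2KB pages has 524,288 pages, so the table needs 2 MiB of RAM
# under the assumed 4-byte entry.
table_bytes = page_map_bytes(1 * GIB)
```

Under these assumptions the table grows linearly with capacity, which is the scalability concern the slide points at.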
  7. 7. Policies – NFTL <ul><li>Each logical address under NFTL is divided into a virtual block address and a block offset . </li></ul><ul><ul><li>e.g., LBA=1011 => virtual block address (VBA) = 1011 / 8 = 126 and block offset = 1011 % 8 = 3 </li></ul></ul>[Figure: the NFTL address translation table (in main memory) holds the entry (9,23) for VBA=126: primary block address 9 and replacement block address 23. To write data to LBA=1011 (VBA=126, block offset=3): if the page at that offset has already been used, write to the first free page of the replacement block.]
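The address split and the write rule on this slide can be sketched as follows. This is a minimal illustration, not the NFTL implementation: the names `split_lba` and `nftl_write` are hypothetical, blocks are modeled as Python lists with `None` for a free page, and 8 pages per block is taken from the slide's example.

```python
PAGES_PER_BLOCK = 8  # value used in the slide's example


def split_lba(lba, pages_per_block=PAGES_PER_BLOCK):
    """Return (virtual block address, block offset) for a logical block address."""
    return divmod(lba, pages_per_block)


def nftl_write(primary, replacement, offset, data):
    """Write `data` for `offset`, following the rule on the slide.

    If the page at `offset` in the primary block is free, use it.
    Otherwise append to the first free page of the replacement block,
    remembering which offset the data belongs to.
    """
    if primary[offset] is None:
        primary[offset] = data
        return ("primary", offset)
    for i, page in enumerate(replacement):
        if page is None:
            replacement[i] = (offset, data)
            return ("replacement", i)
    raise RuntimeError("replacement block is full; blocks must be merged first")


vba, offset = split_lba(1011)          # 126 and 3, as on the slide
primary = [None] * PAGES_PER_BLOCK      # stands in for primary block 9
replacement = [None] * PAGES_PER_BLOCK  # stands in for replacement block 23
first = nftl_write(primary, replacement, offset, "v1")   # lands in the primary block
second = nftl_write(primary, replacement, offset, "v2")  # page used, goes to replacement
```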
  8. 8. Outline <ul><li>Introduction </li></ul><ul><li>System Architecture </li></ul><ul><li>Motivation </li></ul><ul><li>An Efficient Static Wear Leveling Mechanism </li></ul><ul><li>Performance Evaluation </li></ul><ul><li>Conclusion </li></ul>
  9. 9. Motivation <ul><li>A limited bound on erase cycles </li></ul><ul><ul><li>SLC : 100,000 </li></ul></ul><ul><ul><li>MLC x2 : 10,000 </li></ul></ul><ul><li>Applications with frequent access </li></ul><ul><ul><li>Disk cache (e.g., Robson) </li></ul></ul><ul><ul><li>Solid State Disks (SSD) </li></ul></ul>
  10. 10. Motivation – Why Static Wear Leveling: No Wear Leveling [Plot: number of write/erase cycles per PBA with 75% static data; axes: physical block addresses (PBA) and erase cycles.]
  11. 11. Motivation – Why Static Wear Leveling: Dynamic Wear Leveling [Plot: number of write/erase cycles per PBA with 75% static data; axes: physical block addresses (PBA) and erase cycles.]
  12. 12. Motivation – Why Static Wear Leveling: Perfect Static Wear Leveling [Plot: number of write/erase cycles per PBA with 75% static data; axes: physical block addresses (PBA) and erase cycles.]
  13. 13. Motivation – Why Is Static Wear Leveling So Difficult? <ul><li>An intuitive SWL </li></ul><ul><li>Problems: </li></ul><ul><ul><li>Block erase </li></ul></ul><ul><ul><li>Live-page copying </li></ul></ul><ul><ul><li>RAM space </li></ul></ul><ul><ul><li>CPU time </li></ul></ul>[Figure: a 16-block flash memory (blocks 0 to 15) with a per-block erase counter. The legend marks free blocks, dead blocks, blocks that contain (some) valid data, the index used in the selection of a victim block, and the index to the selected free block. Steps shown: update data in block 3; write new data to block 4; GC starts and finds a victim block; update block 15; GC starts.]
  14. 14. Motivation – Comparison [Plots: erase cycles per physical block address (PBA) under four schemes: No Wear Leveling, Dynamic Wear Leveling, Intuitive SWL, and Perfect SWL.]
  15. 15. Outline <ul><li>Introduction </li></ul><ul><li>System Architecture </li></ul><ul><li>Motivation </li></ul><ul><li>An Efficient Static Wear Leveling Mechanism </li></ul><ul><li>Performance Evaluation </li></ul><ul><li>Conclusion </li></ul>
  16. 16. Design Considerations in Static Wear Leveling <ul><li>Scalability Issues </li></ul><ul><ul><li>The limitation on RAM space </li></ul></ul><ul><ul><li>The limitation on computing power </li></ul></ul><ul><li>Compatibility Issue </li></ul><ul><ul><li>Compatibility with FTL and NFTL </li></ul></ul>
  17. 17. An Efficient Static Wear Leveling Mechanism <ul><li>A modular design for compatibility considerations </li></ul><ul><li>An SWL mechanism </li></ul><ul><ul><li>Block Erasing Table (BET) </li></ul></ul><ul><ul><ul><li>bit flags </li></ul></ul></ul><ul><ul><li>SW Leveler </li></ul></ul><ul><ul><ul><li>SWL-Procedure </li></ul></ul></ul><ul><ul><ul><li>SWL-Update </li></ul></ul></ul>
  18. 18. The Block Erasing Table (BET) <ul><li>A bit array: each bit represents 2^k consecutive blocks. </li></ul><ul><ul><li>Small k – in favor of hot-cold data separation </li></ul></ul><ul><ul><li>Large k – in favor of small RAM space </li></ul></ul>[Figure: a 16-block flash memory (blocks 0 to 15) and its BET for k=0 and k=2. A flag set to 1 marks a block that has been erased in the current resetting interval; an index points to the block that the Cleaner wants to erase. f_cnt: the number of 1's in the BET. e_cnt: the total number of block erases done since the BET was reset. The animation starts both examples at e_cnt=0, f_cnt=0 and steps the counters up to e_cnt=3, f_cnt=2 in one example and e_cnt=4, f_cnt=2 in the other.]
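A minimal sketch of such a bit array, assuming a packed `bytearray` and the hypothetical names `BlockErasingTable` and `record_erase` (the slides name the structure and the two counters but show no code):

```python
class BlockErasingTable:
    """Bit array with one flag per 2**k consecutive blocks (illustrative sketch)."""

    def __init__(self, num_blocks, k):
        self.k = k
        self.num_flags = (num_blocks + (1 << k) - 1) >> k  # ceil(num_blocks / 2**k)
        self.bits = bytearray((self.num_flags + 7) // 8)    # flags packed 8 per byte
        self.ecnt = 0  # block erases since the last reset
        self.fcnt = 0  # number of flags currently set to 1

    def flag_index(self, block):
        """Map a physical block address to the flag covering its block set."""
        return block >> self.k

    def is_set(self, idx):
        return bool(self.bits[idx >> 3] & (1 << (idx & 7)))

    def record_erase(self, block):
        """Count one erase and set the flag of the erased block's set."""
        self.ecnt += 1
        idx = self.flag_index(block)
        if not self.is_set(idx):
            self.bits[idx >> 3] |= 1 << (idx & 7)
            self.fcnt += 1

    def reset(self):
        """Clear all flags and both counters (start of a new resetting interval)."""
        for i in range(len(self.bits)):
            self.bits[i] = 0
        self.ecnt = 0
        self.fcnt = 0


coarse = BlockErasingTable(num_blocks=16, k=2)  # 4 flags, one per 4 blocks
for block in (5, 9, 6, 10):                     # arbitrary erase sequence
    coarse.record_erase(block)
```

With k=2, blocks 5 and 6 share one flag and blocks 9 and 10 share another, so four erases set only two flags. That is the trade-off the slide describes: a larger k saves RAM but blurs which individual blocks were erased.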
  19. 19. The SW Leveler <ul><li>An unevenness level (e_cnt / f_cnt) >= T triggers the SW Leveler. </li></ul><ul><li>The BET is reset when all flags are set. </li></ul><ul><li>Legend: f_cnt is the number of 1's in the BET; e_cnt is the total number of block erases since the BET was reset; T is a threshold (T=1000 in this example); k=2. </li></ul><ul><li>Example (16 blocks, numbered 0 to 15): </li></ul><ul><ul><li>e_cnt goes from 1998 to 1999 to 2000 with f_cnt=2. Since 2000 / 2 = 1000 >= 1000, the SW Leveler triggers the Cleaner to do garbage collection on the selected block set. </li></ul></ul><ul><ul><li>The Cleaner (1) copies the valid data of the selected block set to a free area, (2) erases the blocks in the selected block set, and (3) informs the Allocator to update the address mapping between LBA and PBA. Afterwards e_cnt=2004 and f_cnt=3. </li></ul></ul><ul><ul><li>After a period of time, the total erase count reaches 2998, then 3000 with f_cnt=3. Since 3000 / 3 = 1000 >= 1000, the SW Leveler triggers the Cleaner again. Afterwards e_cnt=3004 and f_cnt=4. </li></ul></ul><ul><ul><li>After a period of time, the total erase count reaches 3999, then 4000 with f_cnt=4. Since 4000 / 4 = 1000 >= 1000 but all flags in the BET are 1, the BET is reset: e_cnt=0, f_cnt=0, and the selection index is reset to a randomly selected block set (flag). </li></ul></ul>
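The trigger, scan, and reset behavior walked through on this slide can be sketched as below. This is an illustrative reading of the slide, not the authors' code: the class name `SWLeveler`, the method names `update` and `procedure`, and the `erase_block_set` callback (standing in for the Cleaner) are assumptions, and the flags are kept in a plain list for clarity.

```python
import random


class SWLeveler:
    """Sketch of the SW Leveler: trigger on e_cnt / f_cnt >= T, reset when the BET is full."""

    def __init__(self, num_flags, threshold, erase_block_set, rng=None):
        self.flags = [0] * num_flags            # the BET, one entry per block set
        self.threshold = threshold              # T on the slide
        self.ecnt = 0                           # erases since the last BET reset
        self.fcnt = 0                           # number of flags set to 1
        self.findex = 0                         # cyclic scan position in the BET
        self.erase_block_set = erase_block_set  # hypothetical Cleaner callback
        self.rng = rng or random.Random()

    def update(self, flag_index):
        """Record one block erase in the block set covered by flag_index."""
        self.ecnt += 1
        if not self.flags[flag_index]:
            self.flags[flag_index] = 1
            self.fcnt += 1

    def procedure(self):
        """Force erases of not-yet-erased block sets while wear is too uneven."""
        while self.fcnt > 0 and self.ecnt / self.fcnt >= self.threshold:
            if self.fcnt >= len(self.flags):
                # Every block set was erased in this interval: start a new one.
                self.flags = [0] * len(self.flags)
                self.ecnt = 0
                self.fcnt = 0
                self.findex = self.rng.randrange(len(self.flags))
                return
            # Cyclic scan for a block set whose flag is still 0.
            while self.flags[self.findex]:
                self.findex = (self.findex + 1) % len(self.flags)
            # The Cleaner copies valid data out, erases the set, and is
            # expected to call update() once per erased block.
            self.erase_block_set(self.findex)
            self.findex = (self.findex + 1) % len(self.flags)
```

In use, the callback would erase the 2^k blocks of the chosen set and call `update` for each one. With four flags, T=1000, and a callback that reports four erases per set, the counters follow the slide's sequence: 2000/2 triggers and leaves e_cnt=2004, f_cnt=3.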
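  20. 20. Main-Memory Requirements: BET size for MLC x2 flash (1 page = 2 KB, 1 block = 128 pages)

      |     | 512MB | 1GB  | 2GB   | 4GB   | 8GB   |
      |-----|-------|------|-------|-------|-------|
      | k=0 | 256B  | 512B | 1024B | 2048B | 4096B |
      | k=1 | 128B  | 256B | 512B  | 1024B | 2048B |
      | k=2 | 64B   | 128B | 256B  | 512B  | 1024B |
      | k=3 | 32B   | 64B  | 128B  | 256B  | 512B  |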
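The figures in this table follow directly from the geometry: one flag bit per 2^k blocks, eight flags per byte. A short sketch that reproduces them (the function name `bet_size_bytes` is hypothetical; the page and block sizes are the ones stated on the slide):

```python
def bet_size_bytes(capacity_bytes, k, page_bytes=2048, pages_per_block=128):
    """Bytes of RAM for a BET with one flag bit per 2**k blocks."""
    num_blocks = capacity_bytes // (page_bytes * pages_per_block)
    num_flags = -(-num_blocks // (1 << k))  # ceiling division
    return -(-num_flags // 8)               # 8 flags per byte, rounded up


MIB, GIB = 2 ** 20, 2 ** 30
capacities = [512 * MIB, 1 * GIB, 2 * GIB, 4 * GIB, 8 * GIB]
# One row per k, one column per capacity, in the same order as the table.
table = [[bet_size_bytes(c, k) for c in capacities] for k in range(4)]
```

For example, an 8GB device has 32,768 blocks of 256KB, so k=0 needs 32,768 bits, which is 4096 bytes.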
  21. 21. Outline <ul><li>Introduction </li></ul><ul><li>System Architecture </li></ul><ul><li>Motivation </li></ul><ul><li>An Efficient Static Wear Leveling Mechanism </li></ul><ul><li>Performance Evaluation </li></ul><ul><li>Conclusion </li></ul>
  22. 22. Performance Evaluation <ul><li>Metrics </li></ul><ul><ul><li>Endurance Enhancement (First Failure Time) </li></ul></ul><ul><ul><li>Additional Overheads (Extra Block Erases) </li></ul></ul><ul><li>Experiment Setup </li></ul><ul><ul><li>The FTL Layer </li></ul></ul><ul><ul><ul><li>FTL (Flash Translation Layer protocol) </li></ul></ul></ul><ul><ul><ul><li>NFTL (NAND Flash Translation Layer protocol) </li></ul></ul></ul><ul><ul><li>Traces: </li></ul></ul><ul><ul><ul><li>A mobile PC with a 20GB hard disk </li></ul></ul></ul><ul><ul><ul><li>Duration: one month </li></ul></ul></ul><ul><ul><ul><li>Daily activities </li></ul></ul></ul><ul><ul><ul><li>The average number of operations: 1.82 writes/sec, 1.97 reads/sec </li></ul></ul></ul><ul><ul><ul><li>The percentage of accessed LBAs: 36.62% </li></ul></ul></ul><ul><ul><li>Flash Memory: 1GB MLCx2 </li></ul></ul>
  23. 23. Endurance Improvement - The First Failure Time When k=3 and T=100, the endurance is improved by 100.2% . When k=0 and T=100, the endurance is improved by 87.5% .
  24. 24. Additional Overhead - Extra Block Erases Extra block erases are less than 3%. Extra block erases are less than 2%.
  25. 25. Conclusion <ul><li>An Efficient Static Wear Leveling Mechanism </li></ul><ul><ul><li>Small Memory Space Requirement </li></ul></ul><ul><ul><ul><li>No more than 4KB for 8GB flash memory </li></ul></ul></ul><ul><ul><li>Efficient Implementation </li></ul></ul><ul><ul><ul><li>An adjustable house-keeping data structure </li></ul></ul></ul><ul><ul><ul><li>A cyclic-queue scanning algorithm </li></ul></ul></ul><ul><li>Performance Evaluation </li></ul><ul><ul><li>Improvement Ratio of the Endurance: </li></ul></ul><ul><ul><ul><li>The ratio is more than 50% with proper settings of T and k </li></ul></ul></ul><ul><ul><li>Extra Overhead: </li></ul></ul><ul><ul><ul><li>Extra block erases are less than 3% with proper settings </li></ul></ul></ul><ul><ul><ul><li>There is limited overhead in live-page copying </li></ul></ul></ul>
  26. 26. Conclusion <ul><li>Future work </li></ul><ul><ul><li>Endurance and Reliability Problems due to Manufacturing and Capacity Improvements </li></ul></ul><ul><ul><ul><li>Read/Write Disturbance Problems </li></ul></ul></ul><ul><ul><li>System Components </li></ul></ul><ul><ul><ul><li>Memory Hierarchy, Devices, Adaptors, etc. </li></ul></ul></ul><ul><ul><li>Benchmark Designs </li></ul></ul><ul><ul><ul><li>Targets on Different Designs </li></ul></ul></ul>[Figure: a selected flash cell with its control gate, drain, and source, and the current I_DS.]
  27. 27. <ul><li>~ Thank You ~ </li></ul>
