μ -FTL: A Memory-Efficient Flash Translation Layer Supporting Multiple Mapping Granularities Yong-Goo Lee Computer Archite...
Introduction Flash Memory
Advantages of NAND Flash Memory <ul><li>Non-volatility </li></ul><ul><li>Short read/write latency </li></ul><ul><li>Low po...
NAND Flash Memory Characteristics <ul><li>A unit of NAND flash memory operation </li></ul><ul><ul><li>Page: a read/write u...
Flash Translation Layer <ul><li>Legacy disk-based host systems cannot be directly used with NAND flash memory </li></ul><u...
Motivation RAM usage
Main Ideas of  μ -FTL
Workload Characteristics <ul><li>Multiple mapping granularities </li></ul><ul><ul><li>Reducing both the RAM usage and the ...
μ -Tree [D. Kang, EMSOFT 2007] <ul><li>Handling any insertion/deletion/update operation of the tree with only one flash wr...
Address Mapping  <ul><li>μ-FTL implements multiple mapping granularities </li></ul><ul><ul><li>Extent :  A variable-sized ...
Main Ideas of  μ -FTL
Garbage Collection <ul><li>Victim selection policy </li></ul><ul><ul><li>Having the largest # of invalid pages </li></ul><...
Bitmap Information <ul><li>Bitmap information is too large to be stored in RAM </li></ul><ul><li>We store the bitmap infor...
Bitmap Cache <ul><li>The  μ -Tree cache is polluted by bitmap entries </li></ul><ul><ul><li>The cache hit ratio of extents...
Main Ideas of  μ -FTL
Logical Address Space Partitioning <ul><li>Separating hot data from cold data has a beneficial effect on the performance o...
Main Ideas of  μ -FTL
μ -FTL Design Summary Write Bitmap Cache μ -Tree Cache Partition 1 Partition 2 Partition 3 Partition 4 μ -Tree  Free block...
Evaluation Methodology <ul><li>Trace-driven simulators for several FTLs </li></ul><ul><ul><li>Block-mapped: the log block,...
The RAM Usage <ul><li>Per-block invalid page counter  </li></ul><ul><li>The  μ -Tree cache size </li></ul><ul><li>The bitm...
The Comparison of FTL Performance GENERAL (32GB) SYSMARK (40GB)  56.8% 89.7%
The Effect of Bitmap Cache Size WEB (32GB) PIC (8GB) 52.5% 33.3% μ -Tree  cache overhead
The Effect of Partitioning WEB (32GB) GENERAL (32GB) 23.6% 39.6%
Conclusion <ul><li>μ -FTL  </li></ul><ul><ul><li>Small and fixed RAM usage </li></ul></ul><ul><ul><ul><li>As small as bloc...
<ul><li>Thank you for listening </li></ul>
Upcoming SlideShare
Loading in …5
×

μ-FTL: A Memory-Efficient Flash Translation Layer Supporting ...

1,739 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,739
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
40
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

μ-FTL: A Memory-Efficient Flash Translation Layer Supporting ...

  1. 1. μ -FTL: A Memory-Efficient Flash Translation Layer Supporting Multiple Mapping Granularities Yong-Goo Lee Computer Architecture Laboratory Korea Advanced Institute of Science and Technology (KAIST) South Korea
  2. 2. Introduction Flash Memory
  3. 3. Advantages of NAND Flash Memory <ul><li>Non-volatility </li></ul><ul><li>Short read/write latency </li></ul><ul><li>Low power consumption </li></ul><ul><li>Small size and light weight </li></ul><ul><li>Solid state reliability </li></ul>
  4. 4. NAND Flash Memory Characteristics <ul><li>A unit of NAND flash memory operation </li></ul><ul><ul><li>Page: a read/write unit (4KB) </li></ul></ul><ul><ul><li>Block: an erase unit (512KB) </li></ul></ul><ul><li>Erase-before-write </li></ul><ul><ul><li>A page once written cannot be overwritten until it is erased </li></ul></ul>page A A’ NAND flash memory block
  5. 5. Flash Translation Layer <ul><li>Legacy disk-based host systems cannot be directly used with NAND flash memory </li></ul><ul><li>FTL translates a logical address used by the host into a physical address in NAND flash memory </li></ul><ul><li>Two major factors affecting the performance of FTL </li></ul><ul><ul><li>Address mapping mechanism </li></ul></ul><ul><ul><li>Garbage collection policy </li></ul></ul>FTL NAND Flash memory Host system (file system, device driver) Logical address Physical address
  6. 6. Motivation RAM usage
  7. 7. Main Ideas of μ -FTL
  8. 8. Workload Characteristics <ul><li>Multiple mapping granularities </li></ul><ul><ul><li>Reducing both the RAM usage and the overhead </li></ul></ul><ul><ul><li>Using μ -Tree as a main data-structure </li></ul></ul>Characteristic (write requests) Example Appropriate mapping granularity Small & Random File system meta data Temporary Internet files Fine-grained Large & Sequential Multimedia files Installed applications Coarse-grained
  9. 9. μ -Tree [D. Kang, EMSOFT 2007] <ul><li>Handling any insertion/deletion/update operation of the tree with only one flash write operation </li></ul><ul><ul><li>By storing multiple nodes from the leaf to the root in a single page </li></ul></ul>LRU head μ -Tree cache  cache miss  cache full (dirty pages only) A B C D E F A’ C’ F’ μ -Tree block A B C D E F A B C D E F F’ A B C C’ D E F F’ A A’ B C C’ D E F F’
  10. 10. Address Mapping <ul><li>μ-FTL implements multiple mapping granularities </li></ul><ul><ul><li>Extent : A variable-sized mapping entry (a pair of key and record) </li></ul></ul>Logical blocks Physical blocks 5 0 4 5 6 100, 4 105, 1 104, 1 106, 2 Key Record μ -FTL mapping example <ul><ul><li>The implementation of mapping example </li></ul></ul>100 101 102 103 104 105 106 107 0 1 2 3 4 5 6 7
  11. 11. Main Ideas of μ -FTL
  12. 12. Garbage Collection <ul><li>Victim selection policy </li></ul><ul><ul><li>Having the largest # of invalid pages </li></ul></ul><ul><li>Garbage collection information </li></ul><ul><ul><li>Per-block invalid page counter </li></ul></ul><ul><ul><ul><li>Maintaining the number of invalid pages in each physical block </li></ul></ul></ul><ul><ul><li>Bitmap information </li></ul></ul><ul><ul><ul><li>Distinguishing valid pages from invalid pages </li></ul></ul></ul>
  13. 13. Bitmap Information <ul><li>Bitmap information is too large to be stored in RAM </li></ul><ul><li>We store the bitmap information in μ -tree </li></ul>5 25 0 4 5 6 100, 4 105, 1 104, 1 106, 2 Physical blocks 100 101 102 103 104 105 106 107 25 26 (0001) 2 (1111) 2 Key (bitmap) Record 1 1 25 26 Key (mapping) 1
  14. 14. Bitmap Cache <ul><li>The μ -Tree cache is polluted by bitmap entries </li></ul><ul><ul><li>The cache hit ratio of extents severely impaired </li></ul></ul><ul><li>Bitmap cache </li></ul><ul><ul><li>A hash table structure for buffering bitmap updates </li></ul></ul><ul><ul><li>Invalid extent </li></ul></ul><ul><ul><ul><li>Physically adjacent invalid pages </li></ul></ul></ul>(Physical Page Number, Length) Pointer Invalid Extent
  15. 15. Main Ideas of μ -FTL
  16. 16. Logical Address Space Partitioning <ul><li>Separating hot data from cold data has a beneficial effect on the performance of an FTL </li></ul><ul><li>Each part of the logical address space exhibits different degree of “hotness” </li></ul><ul><li>Update block </li></ul><ul><ul><li>being charged for receiving data from incoming write requests </li></ul></ul>A global update block Many update blocks for each partition Logical address space Logical address space
  17. 17. Main Ideas of μ -FTL
  18. 18. μ -FTL Design Summary Write Bitmap Cache μ -Tree Cache Partition 1 Partition 2 Partition 3 Partition 4 μ -Tree Free block list Data Extent update Bitmap update Bitmap update flush Cache full Cache miss
  19. 19. Evaluation Methodology <ul><li>Trace-driven simulators for several FTLs </li></ul><ul><ul><li>Block-mapped: the log block, FAST, the super block </li></ul></ul><ul><ul><li>Page-mapped: DAC </li></ul></ul><ul><li>6 traces </li></ul><ul><ul><li>3 multimedia traces: PIC(8GB), MP3(8GB), MOV(8GB) </li></ul></ul><ul><ul><li>3 PC traces : WEB(32GB), GENERAL(32GB), SYSMARK(40GB) </li></ul></ul><ul><li>The standard configurations </li></ul>Partition Size Bitmap cache size The portion of extra blocks 256MB (512 logical blocks) 4KB for multimedia traces 32KB for PC traces 3%
  20. 20. The RAM Usage <ul><li>Per-block invalid page counter </li></ul><ul><li>The μ -Tree cache size </li></ul><ul><li>The bitmap cache size </li></ul>Block-mapped FTL μ-FTL (A+B+C) Page-mapped FTL 8GB 64KB 16KB+44KB+4KB = 64KB 8MB 32GB 256KB 64KB+160KB+32KB = 256KB 32MB 40GB 320KB 80KB+208KB+32KB = 320KB 40MB
  21. 21. The Comparison of FTL Performance GENERAL (32GB) SYSMARK (40GB) 56.8% 89.7%
  22. 22. The Effect of Bitmap Cache Size WEB (32GB) PIC (8GB) 52.5% 33.3% μ -Tree cache overhead
  23. 23. The Effect of Partitioning WEB (32GB) GENERAL (32GB) 23.6% 39.6%
  24. 24. Conclusion <ul><li>μ -FTL </li></ul><ul><ul><li>Small and fixed RAM usage </li></ul></ul><ul><ul><ul><li>As small as block-mapped FTLs </li></ul></ul></ul><ul><ul><li>Low management overhead </li></ul></ul><ul><ul><ul><li>As low as DAC (a page-mapped FTL) </li></ul></ul></ul>
  25. 25. <ul><li>Thank you for listening </li></ul>

×