An Efficient B-Tree Layer for Flash-Memory Storage Systems

  1. 1. An Efficient B-Tree Layer for Flash-Memory Storage Systems. Chin-Hsien Wu, Li-Pin Chang, and Tei-Wei Kuo. The 9th International Conference on Real-Time Computing Systems and Applications (RTCSA 2003), Tainan, Taiwan, 2003. Presenter: 안성용
  2. 2. Introduction <ul><li>Flash Memory </li></ul><ul><ul><li>considered an alternative to hard disks </li></ul></ul><ul><li>B-tree </li></ul><ul><ul><li>one of the most popular index structures because of its scalability and efficiency </li></ul></ul><ul><li>Objective </li></ul><ul><ul><li>We propose a module on top of a traditional FTL to handle the intensive byte-wise operations caused by B-tree access </li></ul></ul><ul><ul><li>BFTL (efficient B-Tree layer for flash-memory storage systems) </li></ul></ul>
  3. 3. Background <ul><li>Flash Memory </li></ul><ul><ul><li>native file-system approach </li></ul></ul><ul><ul><ul><li>directly manage raw flash memory </li></ul></ul></ul><ul><ul><ul><li>JFFS/JFFS2, LFM, YAFFS </li></ul></ul></ul><ul><ul><li>block-device emulation approach </li></ul></ul><ul><ul><ul><li>provide a transparent block-device emulation </li></ul></ul></ul><ul><ul><ul><li>FTL/FTL-Lite </li></ul></ul></ul>
  4. 4. Background <ul><li>Flash Memory (Cont’d) </li></ul><ul><ul><li>out-place update </li></ul></ul><ul><ul><ul><li>a page cannot be overwritten unless it is erased first </li></ul></ul></ul><ul><ul><li>garbage collection </li></ul></ul><ul><ul><ul><li>to recycle dead pages scattered over blocks </li></ul></ul></ul><ul><ul><li>wear-leveling </li></ul></ul><ul><ul><ul><li>each block can be erased only about 1 million (10^6) times </li></ul></ul></ul>
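The out-place update rule above can be illustrated with a toy model. This is a minimal sketch, not an FTL implementation: `TinyFlash` and its fields are hypothetical names, and erasure, garbage collection, and wear-leveling are deliberately left out.

```python
from typing import Dict, List, Optional, Set


class TinyFlash:
    """Toy flash model: a page is written once and never overwritten in place."""

    def __init__(self, num_pages: int):
        self.pages: List[Optional[bytes]] = [None] * num_pages
        self.live: Dict[int, int] = {}  # logical address -> physical page
        self.dead: Set[int] = set()     # physical pages holding stale data
        self.next_free = 0

    def write(self, lba: int, data: bytes) -> int:
        """Write `data` for `lba` to a fresh page and return that page number."""
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free page left; garbage collection is needed")
        if lba in self.live:
            # The old copy is not overwritten; it simply becomes a dead page.
            self.dead.add(self.live[lba])
        page = self.next_free
        self.pages[page] = data
        self.live[lba] = page
        self.next_free += 1
        return page

    def read(self, lba: int) -> Optional[bytes]:
        return self.pages[self.live[lba]]
```

Writing the same logical address twice consumes two physical pages and leaves the first one dead, which is what garbage collection later has to recycle.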
  5. 5. Background <ul><li>B-tree </li></ul><ul><ul><li>Each node in the tree must have at most m and at least ⌈m/2⌉ subtrees </li></ul></ul><ul><ul><li>All leaf nodes must be at the same level </li></ul></ul><ul><ul><li>The number of keys in a node is one less than its number of subtrees: at least ⌈m/2⌉-1 and at most m-1 </li></ul></ul>Internal node: an ordered list of keys and link pointers. Leaf node: key values and record pointers.
  6. 6. Problem Definition <ul><li>On hard disk </li></ul><ul><ul><li>The size of a B-Tree node is usually set to a size that the underlying block device handles efficiently. </li></ul></ul><ul><ul><li>To insert, delete, and re-balance B-Trees, B-Tree nodes are fetched from the hard disk and then written back to the original location. </li></ul></ul><ul><ul><li>Such operations are very efficient for hard disks. </li></ul></ul><ul><li>On Flash Memory </li></ul><ul><ul><li>Updating (or writing) data over flash memory is a very complicated and expensive operation. </li></ul></ul><ul><ul><li>Because out-place update is adopted, a whole page (512 B) containing the new version of the data must be written to flash memory, even for a small change. </li></ul></ul>
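The B-tree properties listed on this slide can be checked mechanically. The following is a minimal sketch, assuming an order-m B-tree whose root is exempt from the minimum-occupancy rule (the usual convention, not stated on the slide); `Node` and `check_btree` are hypothetical names, and ordering of keys across different nodes is not checked.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List, Set


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def check_btree(root: Node, m: int) -> bool:
    """Return True if the tree satisfies the order-m structural properties."""
    leaf_depths: Set[int] = set()

    def visit(node: Node, depth: int, is_root: bool) -> bool:
        # Keys within a node are sorted and number at most m - 1.
        if node.keys != sorted(node.keys) or len(node.keys) > m - 1:
            return False
        # Every non-root node holds at least ceil(m/2) - 1 keys.
        if not is_root and len(node.keys) < ceil(m / 2) - 1:
            return False
        if not node.children:
            leaf_depths.add(depth)
            return True
        # An internal node has exactly one more subtree than keys.
        if len(node.children) != len(node.keys) + 1:
            return False
        return all(visit(child, depth + 1, False) for child in node.children)

    # All leaves must sit at the same level.
    return visit(root, 0, True) and len(leaf_depths) == 1
```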
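A short piece of arithmetic makes the cost concrete. This sketch only assumes the 512-byte page from the slide; the 16-byte change used in the usage note is a hypothetical figure chosen for illustration, and `write_amplification` is a hypothetical helper name.

```python
PAGE_SIZE = 512  # bytes per page, as given on the slide


def write_amplification(changed_bytes: int, page_size: int = PAGE_SIZE) -> float:
    """Bytes physically written per byte logically changed for one node update."""
    if not 0 < changed_bytes <= page_size:
        raise ValueError("changed_bytes must be between 1 and page_size")
    # Out-place update rewrites the whole page no matter how small the change.
    return page_size / changed_bytes
```

For example, `write_amplification(16)` evaluates to 32.0: changing 16 bytes of a node still costs a full 512-byte page write.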
  7. 7. The Design and Implementation of BFTL <ul><li>Overview </li></ul><ul><ul><li>BFTL sits between the application layer and the block-device emulated by FTL </li></ul></ul>
  8. 8. The Design and Implementation of BFTL <ul><li>reservation buffer </li></ul><ul><ul><li>temporarily holds the newly generated records (dirty records) </li></ul></ul><ul><ul><li>record deletions are handled by adding “invalidation records” to the reservation buffer. </li></ul></ul><ul><ul><li>the dirty records should be flushed to flash memory in a timely manner </li></ul></ul><ul><li>index units </li></ul><ul><ul><li>When dirty records are flushed, BFTL constructs an “index unit” for each dirty record </li></ul></ul><ul><ul><li>an index unit reflects a modification of the corresponding B-Tree node </li></ul></ul><ul><ul><li>fields: data_ptr, parent_node, primary_key, left_ptr, right_ptr, identifier, op_flag </li></ul></ul><ul><ul><li>Many index units are packed into a few sectors to reduce the number of pages physically written. </li></ul></ul><ul><ul><li>As a result, the index units of one B-Tree node may reside in different sectors. </li></ul></ul>
  9. 9. The Design and Implementation of BFTL <ul><li>index units (Cont’d) </li></ul><ul><ul><li>A B-Tree node can be logically constructed by collecting and parsing all of its relevant index units </li></ul></ul><ul><ul><li>A node translation table is adopted to handle the collection of index units. </li></ul></ul>
  10. 10. The Design and Implementation of BFTL <ul><li>The Node Translation Table </li></ul><ul><ul><li>maps a B-Tree node to the collection of LBAs where its related index units reside. </li></ul></ul><ul><ul><li>can be rebuilt by scanning the flash memory when the system is powered up </li></ul></ul>
  11. 11. The Design and Implementation of BFTL <ul><li>The Node Translation Table </li></ul><ul><ul><li>system parameter C </li></ul></ul><ul><ul><ul><li>long lists degrade performance and add space overhead </li></ul></ul></ul><ul><ul><ul><li>C controls the maximum length of the lists of the node translation table </li></ul></ul></ul><ul><ul><ul><li>When the length of a list grows beyond C, the list will be compacted. </li></ul></ul></ul><ul><ul><li>To compact a list, </li></ul></ul><ul><ul><ul><li>all related index units are collected into RAM and then written back to flash memory using the smallest possible number of sectors. </li></ul></ul></ul>
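To illustrate how a node can be logically reconstructed from scattered index units, here is a minimal sketch. The field names come from the slides, but the types, the reading of `identifier` as the id of the B-Tree node a unit modifies, and the `"insert"`/`"delete"` encoding of `op_flag` are all assumptions made for this example; `rebuild_node` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class IndexUnit:
    # Field names follow the slide; types and meanings are assumptions.
    data_ptr: int      # location of the data record
    parent_node: int   # parent of the B-Tree node
    primary_key: int
    left_ptr: int
    right_ptr: int
    identifier: int    # assumed: id of the B-Tree node this unit modifies
    op_flag: str       # assumed encoding: "insert" or "delete"


def rebuild_node(node_id: int, sectors: Iterable[List[IndexUnit]]) -> List[int]:
    """Replay the index units of `node_id` in write order; return its sorted keys."""
    live: Dict[int, IndexUnit] = {}
    for sector in sectors:
        for unit in sector:
            if unit.identifier != node_id:
                continue  # this sector also holds units of other nodes
            if unit.op_flag == "insert":
                live[unit.primary_key] = unit
            elif unit.op_flag == "delete":
                live.pop(unit.primary_key, None)
    return sorted(live)
```

Because one sector mixes units from several nodes, the reader has to filter by node and replay the operations in order, which is the extra read cost the node translation table is meant to bound.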
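The list-length rule can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: `NodeTranslationTable` is a hypothetical class, and `compact` is a hypothetical callback that stands in for "collect the node's index units into RAM, rewrite them into as few sectors as possible, and return the new LBAs".

```python
from collections import defaultdict
from typing import Callable, Dict, List


class NodeTranslationTable:
    """Maps a B-Tree node id to the LBAs that hold its index units."""

    def __init__(self, max_len: int,
                 compact: Callable[[int, List[int]], List[int]]):
        self.max_len = max_len  # the system parameter C
        self.compact = compact  # hypothetical rewrite step, returns new LBAs
        self.lists: Dict[int, List[int]] = defaultdict(list)

    def add(self, node_id: int, lba: int) -> None:
        """Record that `lba` now holds index units of `node_id`."""
        lbas = self.lists[node_id]
        if lba not in lbas:
            lbas.append(lba)
        # Once the list grows beyond C, compact it.
        if len(lbas) > self.max_len:
            self.lists[node_id] = self.compact(node_id, lbas)

    def lookup(self, node_id: int) -> List[int]:
        return list(self.lists.get(node_id, []))
```

A smaller C keeps lookups cheap (fewer sectors to read per node) at the price of more frequent compaction writes.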
  12. 12. The Design and Implementation of BFTL
  13. 13. The Design and Implementation of BFTL <ul><li>The Commit Policy </li></ul><ul><ul><li>how to pack index units into as few sectors as possible </li></ul></ul><ul><ul><ul><li>many index units should be packed together to further reduce the number of sectors needed </li></ul></ul></ul><ul><ul><ul><li>at the same time, the index units of one B-Tree node should not be scattered over many sectors </li></ul></ul></ul><ul><ul><li>The problem of packing index units into sectors is NP-Hard. </li></ul></ul><ul><ul><ul><li>The intractability of the problem can be shown by a reduction from the Bin-Packing problem </li></ul></ul></ul>
  14. 14. System Analysis <ul><li>Suppose </li></ul><ul><ul><li>n records are to be inserted. </li></ul></ul><ul><ul><li>Let a B-Tree node fit in a sector (provided by FTL). </li></ul></ul><ul><ul><li>Let H denote the current height of the B-Tree. </li></ul></ul><ul><ul><li>Let N_split denote the number of nodes that are split while handling the insertions. </li></ul></ul><ul><ul><li>Let C denote the maximum length of the lists. </li></ul></ul><ul><li>number of sectors read </li></ul><ul><li>number of sectors written </li></ul>
  15. 15. Performance Evaluation <ul><li>Experiment Setup </li></ul><ul><ul><li>4MB NAND flash memory </li></ul></ul><ul><ul><li>greedy block-recycling policy </li></ul></ul><ul><ul><li>fanout of B-tree: 21 </li></ul></ul><ul><ul><li>a B-Tree node fits in a sector. </li></ul></ul><ul><ul><li>reservation buffer size: 60 records </li></ul></ul><ul><ul><li>maximum length of the lists (C): 3 </li></ul></ul><ul><ul><li>a small number of B-Tree nodes in the top levels were cached in RAM </li></ul></ul><ul><ul><li>ratio rs: controls the value distribution of the inserted keys </li></ul></ul>
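Since exact packing is intractable, a practical commit policy has to rely on a heuristic. The sketch below uses a first-fit-decreasing greedy rule, a standard bin-packing approximation; it is shown only as an illustration and is not claimed to be the commit policy the authors implemented. `first_fit_pack` and the `capacity` parameter (index units per sector) are hypothetical.

```python
from typing import Dict, List, Tuple


def first_fit_pack(units_per_node: Dict[int, int],
                   capacity: int) -> List[List[Tuple[int, int]]]:
    """Pack per-node groups of index units into sectors of `capacity` units.

    Returns a list of sectors; each sector is a list of (node_id, unit_count).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    sectors: List[List[Tuple[int, int]]] = []
    free: List[int] = []  # remaining room in each sector

    # Larger groups are placed first (first-fit decreasing).
    for node_id, count in sorted(units_per_node.items(), key=lambda kv: -kv[1]):
        # A group that exceeds one sector fills whole sectors first.
        while count > capacity:
            sectors.append([(node_id, capacity)])
            free.append(0)
            count -= capacity
        if count == 0:
            continue
        # Keep the rest of the group together in the first sector with room.
        for i, room in enumerate(free):
            if room >= count:
                sectors[i].append((node_id, count))
                free[i] -= count
                break
        else:
            sectors.append([(node_id, count)])
            free.append(capacity - count)
    return sectors
```

Treating each node's units as one group serves both goals from the slide: sectors are filled densely, and a node's units are split across sectors only when the group cannot fit in one.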
  16. 16. Performance Evaluation <ul><li>Performance of B-Tree Index Structures Creation </li></ul><ul><ul><li>insert 24000 records </li></ul></ul>
  17. 17. Performance Evaluation <ul><li>Performance of B-Tree Index Structures Creation (Cont’d) </li></ul>
  18. 18. Performance Evaluation <ul><li>Performance of B-Tree Index Structures Maintenance </li></ul><ul><ul><li>24,000 operations </li></ul></ul><ul><ul><li>we varied the ratio of the number of deletions to the number of insertions (50/50, 40/60, 30/70, 20/80, 10/90) </li></ul></ul>
  19. 19. Performance Evaluation <ul><li>Size of the reservation buffer </li></ul><ul><li>Energy Consumption </li></ul>
  20. 20. Conclusion <ul><li>Conclusion </li></ul><ul><ul><li>Flash-memory storage systems are very suitable for embedded systems </li></ul></ul><ul><ul><li>We propose a methodology and a layer design to support B-Tree index structures over flash memory. </li></ul></ul><ul><ul><li>BFTL reduces the amount of redundant data written to flash memory. </li></ul></ul><ul><li>Future Work </li></ul><ul><ul><li>How to manage data records and their index structures over huge flash memory </li></ul></ul>
