FlashDB: Dynamic Self-tuning Database for NAND Flash



  1. 1. FlashDB: Dynamic Self-tuning Database for NAND Flash. ACM/IEEE IPSN 2007. Suman Nath, Aman Kansal, MS. 2007-07-12. Speaker: Yunjung Yoo
  2. 2. FlashDB? <ul><li>A self-tuning database optimized for sensor networks using NAND flash storage. </li></ul><ul><li>Existing databases for flash are not optimized for all types of flash devices or for all workloads </li></ul><ul><li>Formalizes the self-tuning nature of an index as a two-state task system </li></ul><ul><li>Proposes an online algorithm that is 3-competitive with the theoretical optimum </li></ul><ul><li>Provides a framework to determine the optimal size of an index node, minimizing energy and latency </li></ul><ul><li>Proposes optimizations to further improve the performance of the index </li></ul>
  3. 3. Background – Flash <ul><li>NOR vs. NAND </li></ul><ul><li>Page / Block / Wear-leveling </li></ul><ul><li>Packages : Compact Flash (CF) card, Secure Digital (SD) card, mini SD card, micro SD card, USB sticks, … </li></ul><ul><li>Flash Translation Layer (FTL) : disk-like interface </li></ul><ul><li>Read/Write Costs </li></ul>
  4. 4. Background – Flash <ul><li>Access Pattern </li></ul><ul><ul><li>Energy and time to read or write pages follow a linear model </li></ul></ul><ul><ul><li>(Fixed access cost + per-page access cost) </li></ul></ul><ul><ul><ul><li>Accessing a large # of pages at once may be beneficial </li></ul></ul></ul><ul><ul><ul><li>Re-writing to the same page address is slower than writing to a new page address in sequential order. </li></ul></ul></ul><ul><li>Workload Properties </li></ul><ul><ul><li>Read-write Ratio </li></ul></ul><ul><ul><ul><li>Different parts of the index can have different read-write ratios </li></ul></ul></ul><ul><ul><li>Data Pattern </li></ul></ul><ul><ul><ul><li>Random data vs. correlated data </li></ul></ul></ul><ul><ul><ul><li>The pattern may not be known a priori or may change over time </li></ul></ul></ul>
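The linear access-cost model on this slide can be sketched as follows. This is only an illustration: the function name and the constants are placeholders in arbitrary units, not measurements from the paper.

```python
def access_cost(num_pages: int, fixed_cost: float, per_page_cost: float) -> float:
    """Linear model: one fixed cost per access plus a cost for each page touched."""
    if num_pages <= 0:
        return 0.0
    return fixed_cost + per_page_cost * num_pages


# Hypothetical constants (arbitrary units), chosen only for illustration.
FIXED = 10.0
PER_PAGE = 2.0

# Reading 8 pages in a single access pays the fixed cost once ...
one_batched_access = access_cost(8, FIXED, PER_PAGE)
# ... while 8 separate one-page accesses pay it 8 times.
eight_single_accesses = 8 * access_cost(1, FIXED, PER_PAGE)
```

Under these assumed constants the batched access is cheaper, which is the sense in which accessing many pages at once may be beneficial.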
  5. 5. Background – Indexing Methods(1) <ul><li>B+-tree </li></ul><ul><ul><li>Popular indexing data structure used in various incarnations in databases </li></ul></ul><ul><ul><li>Supports powerful queries and efficient operations </li></ul></ul><ul><ul><li>A balanced tree </li></ul></ul><ul><ul><li>Each node : </li></ul></ul><ul><ul><ul><li>⟨p<sub>0</sub>, k<sub>0</sub>, p<sub>1</sub>, k<sub>1</sub>, … , p<sub>d</sub>⟩, where the p<sub>i</sub> are pointers and the k<sub>i</sub> are keys </li></ul></ul></ul><ul><ul><ul><li>Stored over multiple consecutive flash pages </li></ul></ul></ul><ul><ul><li>Searching for a key, insertion, and deletion in a B+-tree </li></ul></ul>
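As a rough illustration of the node layout and key search described on this slide, here is a minimal in-memory sketch. The `Node` class, its field names, and the routing convention (keys in [k_{i-1}, k_i) follow pointer p_i) are assumptions made for this example, not taken from the paper.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    # Internal nodes hold len(keys) + 1 child pointers; leaves hold none.
    children: List["Node"] = field(default_factory=list)
    # Leaves hold one value per key.
    values: List[str] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def search(root: Node, key: int) -> Optional[str]:
    """Walk from the root to a leaf, then look the key up in that leaf."""
    node = root
    while not node.is_leaf:
        # Keys below k_0 follow p_0; keys in [k_{i-1}, k_i) follow p_i.
        node = node.children[bisect_right(node.keys, key)]
    for k, v in zip(node.keys, node.values):
        if k == key:
            return v
    return None


leaf_a = Node(keys=[1, 5], values=["a", "b"])
leaf_b = Node(keys=[10, 12], values=["c", "d"])
leaf_c = Node(keys=[20, 30], values=["e", "f"])
root = Node(keys=[10, 20], children=[leaf_a, leaf_b, leaf_c])
```

Each step of the loop corresponds to reading one index node, so the number of node reads per search equals the height of the tree.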
  6. 6. Background – Indexing Methods(2) <ul><li>B+-tree(Disk) </li></ul><ul><ul><li>Code portability (existing implementations for disk can be run) </li></ul></ul><ul><ul><li>Updates are expensive </li></ul></ul><ul><ul><li>Unsuitable for write-intensive workloads </li></ul></ul><ul><li>B+-tree(Log) </li></ul><ul><ul><li>Small update cost </li></ul></ul><ul><ul><li>Reading many log entries is expensive </li></ul></ul><ul><ul><li>Inspired by log-structured file systems </li></ul></ul><ul><ul><li>Avoids the high update cost of B+-tree(Disk) </li></ul></ul><ul><ul><li>Organizes the index as transaction logs </li></ul></ul>
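The trade-off between the two designs can be sketched with two toy node classes that count the work they do. The class names and counters are hypothetical and only model the idea (rewrite the whole node on update vs. append a log entry and replay the log on read); they are not the paper's implementation.

```python
from typing import Dict, List, Tuple


class DiskModeNode:
    """Node kept as one unit: every update rewrites the whole node."""

    def __init__(self) -> None:
        self.entries: Dict[int, str] = {}
        self.node_writes = 0  # full-node rewrites
        self.node_reads = 0   # full-node reads

    def update(self, key: int, value: str) -> None:
        self.entries[key] = value
        self.node_writes += 1

    def read(self) -> Dict[int, str]:
        self.node_reads += 1
        return dict(self.entries)


class LogModeNode:
    """Node kept as a log: updates append, reads replay every entry."""

    def __init__(self) -> None:
        self.log: List[Tuple[int, str]] = []
        self.entries_appended = 0
        self.entries_scanned = 0

    def update(self, key: int, value: str) -> None:
        self.log.append((key, value))
        self.entries_appended += 1

    def read(self) -> Dict[int, str]:
        entries: Dict[int, str] = {}
        for key, value in self.log:
            entries[key] = value  # later entries override earlier ones
            self.entries_scanned += 1
        return entries


disk_node, log_node = DiskModeNode(), LogModeNode()
for k, v in [(1, "a"), (2, "b"), (1, "c")]:
    disk_node.update(k, v)
    log_node.update(k, v)
disk_view, log_view = disk_node.read(), log_node.read()
```

Both nodes end up with the same logical content; they differ in where the cost lands (on updates for the disk-style node, on reads for the log-style node).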
  7. 7. The Need to Self-tune <ul><li>Which of the two B+-tree designs should one use? </li></ul><ul><ul><li>Depends on the workload and device characteristics </li></ul></ul><ul><li>What if the workload or flash device used is unknown or can change over time? </li></ul><ul><ul><li>The index should be able to tune its structure at a granularity finer than the entire index </li></ul></ul><ul><li>Each node should be organized independently, depending on the workload it experiences </li></ul>
  8. 8. B+-tree(ST) <ul><li>Self-tuning B+-tree designed for NAND flash </li></ul><ul><li>Each index node is in one of two modes : Log or Disk </li></ul><ul><li>Create / Read / Update </li></ul>
  9. 9. FlashDB Architecture <ul><li>Logical Storage: </li></ul><ul><ul><li>Logical sector address abstraction </li></ul></ul><ul><ul><li>ReadSector / WriteSector </li></ul></ul><ul><ul><li>Alloc / Free </li></ul></ul><ul><li>Garbage Collection: </li></ul><ul><ul><li>Cleans dirty pages </li></ul></ul><ul><li>Log Buffer: </li></ul><ul><ul><li>Holds up to one sector </li></ul></ul><ul><ul><li>Used only by Log-mode nodes </li></ul></ul><ul><ul><li>Avoids expensive small writes </li></ul></ul><ul><li>Node Translation Table: </li></ul><ul><ul><li>Maps logical nodes to their current modes and physical representations </li></ul></ul><ul><ul><li>Disk mode : sector address </li></ul></ul><ul><ul><li>Log mode : linked list of addresses </li></ul></ul>
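One way to picture the Node Translation Table is the sketch below. The class and method names are hypothetical, and a Python list stands in for the linked list of addresses mentioned on the slide.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Mode(Enum):
    DISK = "disk"
    LOG = "log"


@dataclass
class NttEntry:
    mode: Mode
    # Disk mode: exactly one sector address.
    # Log mode: addresses of the sectors holding this node's log entries.
    sector_addrs: List[int] = field(default_factory=list)


class NodeTranslationTable:
    """Maps a logical node id to its current mode and physical location(s)."""

    def __init__(self) -> None:
        self._entries: Dict[int, NttEntry] = {}

    def set_disk(self, node_id: int, sector_addr: int) -> None:
        self._entries[node_id] = NttEntry(Mode.DISK, [sector_addr])

    def set_log(self, node_id: int) -> None:
        self._entries[node_id] = NttEntry(Mode.LOG, [])

    def add_log_sector(self, node_id: int, sector_addr: int) -> None:
        entry = self._entries[node_id]
        if entry.mode is not Mode.LOG:
            raise ValueError("node is not in Log mode")
        entry.sector_addrs.append(sector_addr)

    def lookup(self, node_id: int) -> NttEntry:
        return self._entries[node_id]


ntt = NodeTranslationTable()
ntt.set_disk(1, 40)        # node 1 lives in sector 40
ntt.set_log(2)             # node 2 is a Log-mode node
ntt.add_log_sector(2, 7)   # its log entries are spread over sectors 7 and 19
ntt.add_log_sector(2, 19)
```

Reading a Log-mode node would then mean visiting every address in its list, which is why long lists make reads expensive.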
  10. 10. Self-tuning Issues(1) <ul><li>The mode must be decided and updated carefully </li></ul><ul><li>The size of an index node must be chosen optimally </li></ul><ul><li>Mode Switching Algorithm </li></ul><ul><ul><li>Switching between modes incurs costs </li></ul></ul><ul><ul><li>Decisions are made dynamically by an online switching algorithm </li></ul></ul><ul><ul><li>The algorithm is run independently for each node of the tree </li></ul></ul>
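The flavor of a per-node online switching rule can be sketched as below. This is a generic rent-or-buy style heuristic written for illustration: the class, its parameters, the cost numbers, and the clamping of the counter at zero are all assumptions, and it is not claimed to be the paper's SWITCHMODE algorithm or to carry its competitive guarantee.

```python
from dataclasses import dataclass


@dataclass
class ModeSwitcher:
    switch_cost: float   # hypothetical cost of switching modes
    mode: str = "log"
    excess: float = 0.0  # extra cost accumulated by staying in the current mode
    switches: int = 0

    def observe(self, cost_in_log: float, cost_in_disk: float) -> str:
        """Record one operation's cost in each mode; switch when staying has cost too much."""
        current = cost_in_log if self.mode == "log" else cost_in_disk
        other = cost_in_disk if self.mode == "log" else cost_in_log
        # Assumption of this sketch: savings are not banked below zero.
        self.excess = max(0.0, self.excess + current - other)
        if self.excess >= self.switch_cost:
            self.mode = "disk" if self.mode == "log" else "log"
            self.excess = 0.0
            self.switches += 1
        return self.mode


switcher = ModeSwitcher(switch_cost=10.0)
# Three reads: cheaper on a Disk-mode node (hypothetical costs).
modes_after_reads = [switcher.observe(cost_in_log=5.0, cost_in_disk=1.0) for _ in range(3)]
# Two writes: cheaper on a Log-mode node (hypothetical costs).
modes_after_writes = [switcher.observe(cost_in_log=1.0, cost_in_disk=6.0) for _ in range(2)]
```

One `ModeSwitcher` instance per index node matches the slide's point that the decision is made independently for each node.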
  11. 11. Self-tuning Issues(2) <ul><li>Optimal Node Size </li></ul><ul><ul><li>Uses a utility-cost analysis to suggest the optimal node size </li></ul></ul><ul><ul><li>Bigger nodes? </li></ul></ul><ul><ul><ul><li>Reduce the # of nodes that need to be read to get from the root to the target leaf node </li></ul></ul></ul><ul><ul><ul><li>Increase the cost of retrieving an individual node </li></ul></ul></ul><ul><ul><li>The utility/cost ratio is maximized when the node size is as small as possible </li></ul></ul><ul><ul><li>The smallest granularity of R/W in flash is a page </li></ul></ul>
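A toy version of the utility-cost comparison is sketched below. The utility function (log2 of the number of entries in a node, i.e. how much one node read narrows the search), the linear cost, and all constants are assumptions for illustration and are not taken from the paper.

```python
import math

# Hypothetical constants, chosen only for illustration.
ENTRIES_PER_PAGE = 64
FIXED_COST = 1.0
PER_PAGE_COST = 1.0


def utility(pages: int) -> float:
    """Assumed utility: bits of search space resolved by reading one node."""
    return math.log2(ENTRIES_PER_PAGE * pages)


def cost(pages: int) -> float:
    """Assumed linear cost of reading a node that spans `pages` pages."""
    return FIXED_COST + PER_PAGE_COST * pages


ratios = {pages: utility(pages) / cost(pages) for pages in (1, 2, 4, 8)}
best_size = max(ratios, key=ratios.get)
```

With these placeholder numbers the ratio is highest for a one-page node, in line with the slide; the outcome depends on the assumed constants.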
  12. 12. Optimizations to B+-tree(ST) <ul><li>Log Compaction </li></ul><ul><ul><li>Log entries are clustered into fewer sectors, plus a semantic compaction mechanism </li></ul></ul><ul><li>Log Garbage Collection </li></ul><ul><ul><li>Reclaims dirty log entries </li></ul></ul><ul><ul><li>Activated when the flash is low on available space </li></ul></ul><ul><li>Bigger Log Buffer </li></ul><ul><ul><li>Writing a large amount of data at a time has a smaller cost </li></ul></ul><ul><li>Checkpoint and Rollback </li></ul><ul><ul><li>Checkpoint captures the state of an index </li></ul></ul><ul><ul><li>Rollback returns to the previously checkpointed state </li></ul></ul><ul><ul><li>Protects against software bugs, hardware glitches, energy depletion, and other faults </li></ul></ul>
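One plausible reading of semantic compaction is that log entries which cancel each other out are dropped. The sketch below assumes that reading; the operation names `ADD` and `DELETE` and the entry format are hypothetical.

```python
from typing import List, Tuple

LogEntry = Tuple[str, int]  # (operation, key); operation names are hypothetical


def compact(log: List[LogEntry]) -> List[LogEntry]:
    """Drop each ADD that is later undone by a DELETE of the same key."""
    kept: List[LogEntry] = []
    for op, key in log:
        if op == "DELETE" and ("ADD", key) in kept:
            # Remove the most recent matching ADD and skip this DELETE.
            idx = len(kept) - 1 - kept[::-1].index(("ADD", key))
            del kept[idx]
        else:
            kept.append((op, key))
    return kept


original_log = [("ADD", 5), ("ADD", 7), ("DELETE", 5), ("DELETE", 3), ("ADD", 9)]
compacted_log = compact(original_log)
```

A `DELETE` with no matching `ADD` in the log is kept, since it may refer to a key written before the log started.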
  13. 13. Experimental Evaluation <ul><li>When is Indexing Useful? </li></ul><ul><li>Tunability of B+-tree(ST) </li></ul><ul><li>Memory Footprint </li></ul><ul><li>Performance of SWITCHMODE </li></ul><ul><li>Flash Devices </li></ul><ul><ul><li>FLASHCHIP : Samsung K9K1G08R0B (128MB) flash chip </li></ul></ul><ul><ul><li>CAPSULE : Toshiba flash chip interfaced to a mote </li></ul></ul><ul><ul><li>CF : Sandisk CF card (512MB) </li></ul></ul><ul><ul><li>SD : Kingston mini SD card (512MB) </li></ul></ul><ul><li>Workload : LABDATA, RANDOM, SEQUENTIAL </li></ul>
  14. 14. When is Indexing Useful? <ul><li>Building an index is useful when the # of queries on the archived data is expected to exceed 1% of the total # of data items. </li></ul>
  15. 15. Tunability of B+-tree(ST). Figures: energy with different storage devices (LABDATA); energy with different workloads (FLASHCHIP)
  16. 16. Memory Footprint / SWITCHMODE. Figures: memory footprint of the NTT in B+-tree(Log) and B+-tree(ST); the node switching algorithm is within a factor of 1.3 of the optimal algorithm
  17. 17. Conclusion <ul><li>FlashDB is a database design for flash-based storage in sensor networks </li></ul><ul><li>The proposed self-tuning design adapts to various combinations of system parameters; it not only matches the best performance of the existing methods but also outperforms the specialized methods in most regions. </li></ul>