Presentation hybrid store-ewsn-2013

  1. HybridStore: An Efficient Data Management System for Hybrid Flash-based Sensor Devices
       Baobing Wang and John S. Baras
       Department of Electrical and Computer Engineering, Institute for Systems Research
       University of Maryland, College Park, USA
       briankw@umd.edu
       10th European Conference on Wireless Sensor Networks (EWSN), February 14, 2013
       Brian (UMD@USA) HybridStore February 14, 2013 1 / 15
  3. Motivation: In-situ Data Storage on Sensor Motes
       • Centralized data collection wastes energy (e.g., TinyDB)
       • LoCal project [1]: 455 nodes, > 900M readings/year
       • Only aggregated data are required: average noise level, peak power consumption, usage pattern
       • Sensors store data locally: sensor database
       • Flash memory: high capacity, energy efficient
       Figure: Per-byte cost: storage, computation and communication [Mathur’06]
       [1] http://local.cs.berkeley.edu/
  4. Motivation: Design Challenges
       • Unlike magnetic disks, flash memories do not support in-place updates
       • NOR flash: byte-oriented, random-accessible, low capacity
       • NAND flash: page-oriented, high capacity, more energy-efficient
       • Random writes are 100× more expensive than sequential writes
       • Very limited RAM: 4KB to 10KB
  5. Related Work: Flash-based Storage Systems
       • Only time-window queries: TL-Tree [Li’12], FlashLog [Nath’09]
       • Large RAM footprint: FlashDB [Nath’07], LA-Tree [Agrawal’09]
       • Antelope [Tsiftes’11]: NOR flash only, discrete values
       • MicroHash [Lin’06]: long chain of partial pages, extensive page reads and writes, complex failure recovery
       • No efficient support for joint queries; global index
  6. Contributions
       HybridStore Interface
         • insert(float key, void* record, uint8_t length)
         • select(uint32_t t1, uint32_t t2, float k1, float k2)
       HybridStore Features
         • All NAND pages are fully occupied and written purely sequentially
         • In-place updates and out-of-place writes are completely avoided
         • Processes typical joint queries efficiently, even on large-scale datasets
         • Data aging without overhead
         • Sensor-friendly: 16.5KB ROM and 3.2KB RAM in TinyOS 2.1
       Potential Applications
         • Storage layer abstraction: Squirrel [Mottola’10]
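To make the semantics of the two interface calls concrete, here is a minimal in-memory reference model in Python. It is only a sketch of what a joint query returns, not the flash-based implementation: the class name `ReferenceStore`, the logical clock used for timestamps, and the use of `bytes` in place of the `(void* record, uint8_t length)` pair are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    timestamp: int   # assumed: assigned by the store at insert time
    key: float       # sensor value used as the search key
    record: bytes    # opaque payload (stands in for void* record + length)


class ReferenceStore:
    """In-memory stand-in for insert/select (no flash, no indexes)."""

    def __init__(self) -> None:
        self._log: List[Reading] = []
        self._clock = 0  # hypothetical logical clock

    def insert(self, key: float, record: bytes) -> None:
        # Readings are only ever appended, mirroring sequential writes.
        self._log.append(Reading(self._clock, key, record))
        self._clock += 1

    def select(self, t1: int, t2: int, k1: float, k2: float) -> List[Reading]:
        # Joint query: time window [t1, t2] AND key range [k1, k2].
        return [r for r in self._log
                if t1 <= r.timestamp <= t2 and k1 <= r.key <= k2]


store = ReferenceStore()
for value in [21.5, 22.0, 30.25, 19.0]:
    store.insert(value, b"payload")

hits = store.select(0, 2, 20.0, 25.0)
print([r.key for r in hits])
```

A linear scan like this is exactly what the index structures on the following slides are meant to avoid; the model is useful mainly as an oracle to test an indexed implementation against.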
  9. HybridStore: Overview
       • Partition the data stream into segments
       • Create an in-segment index for each segment
       • Create an inter-segment index to organize segments
       • Benefits: skip unnecessary segments, small index per segment
  10. HybridStore: Index Management
       • Inter-segment skip list: entries (addr, tmin); locates segments within [t1, t2]
       • In-segment β-Tree: locates records within [k1, k2]
       • In-segment Bloom filter: checks the existence of key values if k1 = k2
       Figure: inter-segment skip list running from Header to NULL
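The inter-segment lookup can be sketched as follows. Because segments are created in time order, their `tmin` values are increasing, so this sketch substitutes a sorted Python list with binary search for the skip list; that substitution, the function name `locate_segments`, and the example addresses are assumptions for illustration only.

```python
from bisect import bisect_right
from typing import List, Tuple

# (addr, tmin) per segment, in creation order, so tmin is increasing.
SegmentEntry = Tuple[int, int]


def locate_segments(entries: List[SegmentEntry], t1: int, t2: int) -> List[int]:
    """Return addresses of segments that may hold readings in [t1, t2]."""
    if not entries or t1 > t2:
        return []
    tmins = [tmin for _, tmin in entries]
    # The last segment starting at or before t1 may still cover t1.
    first = max(bisect_right(tmins, t1) - 1, 0)
    # Segments starting after t2 cannot contain matches.
    last = bisect_right(tmins, t2)
    return [addr for addr, _ in entries[first:last]]


index = [(0x0000, 0), (0x4000, 100), (0x8000, 200), (0xC000, 300)]
print(locate_segments(index, 150, 250))
```

Only the start time of each segment is stored, so the end of a segment is implied by the `tmin` of its successor; that is why the search steps back one entry from the first `tmin` greater than `t1`.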
  12. HybridStore: In-segment Index
       • In-segment β-Tree: locates records within [k1, k2]
       • Binary tree: lowK, highK, left, right, splitK
       • Prediction-based bucket splitting: compute splitK
       Figure: example β-Tree with node ranges [-60, 120]; [-60, 30], (30, 120]; (30, 75], (75, 120]; (82, 84]; (75, 97.5], (97.5, 120]; (75, 86.25], (86.25, 97.5]
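A rough sketch of a binary range tree with bucket splitting is shown below, using the node fields listed on the slide. The slide does not say how the prediction-based `splitK` is computed, so this sketch simply splits at the midpoint of the bucket's range; that rule, the in-memory buckets, and `BUCKET_CAPACITY` are assumptions, and the real structure keeps its buckets on flash.

```python
from typing import List, Optional


class Node:
    def __init__(self, low_k: float, high_k: float) -> None:
        self.low_k, self.high_k = low_k, high_k
        self.split_k: Optional[float] = None   # set once the bucket is split
        self.left: Optional["Node"] = None     # keys <= split_k
        self.right: Optional["Node"] = None    # keys > split_k
        self.keys: List[float] = []            # bucket contents (leaf only)


BUCKET_CAPACITY = 4  # hypothetical; real buckets would live in flash


def insert(node: Node, key: float) -> None:
    while node.split_k is not None:            # descend to the leaf bucket
        node = node.left if key <= node.split_k else node.right
    node.keys.append(key)
    if len(node.keys) > BUCKET_CAPACITY:
        # Assumption: midpoint split stands in for the predicted splitK.
        node.split_k = (node.low_k + node.high_k) / 2
        node.left = Node(node.low_k, node.split_k)
        node.right = Node(node.split_k, node.high_k)
        for k in node.keys:
            (node.left if k <= node.split_k else node.right).keys.append(k)
        node.keys = []


def range_query(node: Node, k1: float, k2: float) -> List[float]:
    if node.split_k is None:                   # leaf: filter the bucket
        return sorted(k for k in node.keys if k1 <= k <= k2)
    hits: List[float] = []
    if k1 <= node.split_k:                     # query overlaps the left range
        hits += range_query(node.left, k1, k2)
    if k2 > node.split_k:                      # query overlaps the right range
        hits += range_query(node.right, k1, k2)
    return hits


root = Node(-60, 120)
for key in [10, 80, 83, 90, 100, 110, 82]:
    insert(root, key)
print(range_query(root, 82, 95))
```

With these inputs the midpoint rule produces split points 30, 75, and 97.5, so only the subtrees whose ranges overlap [82, 95] are visited.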
  13. HybridStore: In-segment Index
       • In-segment Bloom filter: checks the existence of key values if k1 = k2
       • v bits, q hash functions, representing n items: $p = \left(1 - \left(1 - \frac{1}{v}\right)^{qn}\right)^{q}$
       • Must be maintained in RAM: NOR flash is byte-oriented
       • If q = 3, n = 4096, p ≈ 3.06%, then v = 32768 (i.e., 4KB)
       • Horizontal partition: fixed small Bloom filter sections (e.g., 256B)
       • Vertical partition: group fragments with the same offset in the same NAND page
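The numbers on this slide can be checked directly from the standard Bloom filter false-positive formula, p = (1 - (1 - 1/v)^(qn))^q. The short Python sketch below evaluates it for the slide's parameters; the function name is an assumption, and the section count is plain arithmetic on the 4KB filter and the 256B example section size.

```python
def false_positive_rate(v: int, q: int, n: int) -> float:
    """p = (1 - (1 - 1/v)^(q*n))^q for v bits, q hash functions, n items."""
    return (1.0 - (1.0 - 1.0 / v) ** (q * n)) ** q


V_BITS, Q_HASHES, N_ITEMS = 32768, 3, 4096

p = false_positive_rate(V_BITS, Q_HASHES, N_ITEMS)
filter_bytes = V_BITS // 8            # 32768 bits expressed in bytes
sections = filter_bytes // 256        # number of 256-byte sections

print(f"p = {p:.2%}, filter = {filter_bytes} B, sections = {sections}")
```

A 4KB filter per segment would exceed the 3.2KB RAM footprint reported for the whole system, which is the motivation for splitting the filter into small fixed-size sections.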
  16. HybridStore: Storage Hierarchy
       • NOR flash: circular array, fixed segment size
       • NAND flash: circular array, logical segment (multiple erase blocks)
       • Index structure: updated in a NOR segment, copied to the NAND segment later
       • Header: [T1, T2], [K1, K2], dataAddr, idxAddr, bfAddr, skipList
       Figure: (a) Storage Hierarchy across RAM, NOR, and NAND; (b) NAND Segment Structure (header page, readings, Bloom filter, tree)
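The circular-array layout can be illustrated with a small ring of segment slots. This is a sketch under stated assumptions: `SegmentHeader` mirrors the header fields on the slide but omits `skipList`, the class `SegmentRing` and its slot count are hypothetical, and the sketch tracks headers in memory rather than erase blocks on flash.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SegmentHeader:
    t_range: Tuple[int, int]        # [T1, T2]
    k_range: Tuple[float, float]    # [K1, K2]
    data_addr: int
    idx_addr: int
    bf_addr: int


class SegmentRing:
    """Fixed number of segment slots reused in circular order."""

    def __init__(self, slots: int) -> None:
        self._slots: List[Optional[SegmentHeader]] = [None] * slots
        self._next = 0  # slot that the next segment will occupy

    def append(self, header: SegmentHeader) -> Optional[SegmentHeader]:
        evicted = self._slots[self._next]      # oldest segment, if the ring is full
        self._slots[self._next] = header
        self._next = (self._next + 1) % len(self._slots)
        return evicted

    def live(self) -> List[SegmentHeader]:
        # Walk the ring starting at the oldest slot.
        n = len(self._slots)
        ordered = [self._slots[(self._next + i) % n] for i in range(n)]
        return [h for h in ordered if h is not None]


ring = SegmentRing(slots=3)
evicted = None
for i in range(4):
    header = SegmentHeader((i * 100, i * 100 + 99), (-60.0, 120.0), i, i, i)
    evicted = ring.append(header)

print(evicted.t_range, [h.t_range for h in ring.live()])
```

Writing a new segment into the slot of the oldest one reclaims space without touching any other segment, which is what makes the circular layout attractive on flash.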
  17. HybridStore: Operations
       Insertion
         • Update the β-Tree: allocate a new bucket if necessary
         • Update the Bloom filter buffer: flush it out to NOR flash if necessary
         • When the NOR segment is full: copy it to the NAND segment, update the skip list, start a new segment
       Querying (t1, t2, k1, k2): index structures used
                     t1 = t2      t1 < t2
         k1 = k2     skip list    skip list + Bloom filter + β-Tree
         k1 < k2     skip list    skip list + β-Tree
         • Skip a segment if [K1, K2] ∩ [k1, k2] = ∅
       Data Aging: delete the oldest NAND segment
         • No need to update any pointer
         • No need to move any data page
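The query table on this slide amounts to a small decision function. The Python sketch below encodes it, together with a segment-level pruning check. The function names are hypothetical, and the pruning rule is an assumption: it treats a segment as skippable when its key range [K1, K2] and the query range [k1, k2] do not overlap.

```python
from typing import List, Tuple


def plan(t1: int, t2: int, k1: float, k2: float) -> List[str]:
    """Index structures consulted, following the slide's 2x2 table."""
    if t1 == t2:
        # Single timestamp: the skip list alone, for either key case.
        return ["skip list"]
    if k1 == k2:
        # Equality query over a time window: Bloom filter screens segments first.
        return ["skip list", "Bloom filter", "beta-Tree"]
    return ["skip list", "beta-Tree"]


def can_skip(seg_keys: Tuple[float, float], k1: float, k2: float) -> bool:
    """Assumed rule: skip when [K1, K2] and [k1, k2] are disjoint."""
    seg_low, seg_high = seg_keys
    return seg_high < k1 or seg_low > k2


print(plan(0, 1000, 25.0, 25.0))
print(can_skip((10.0, 20.0), 25.0, 30.0))
```

The Bloom filter only appears in the equality case because it answers membership for a single key and cannot help with a range.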
  20. HybridStore: Implementation and Evaluation
       • TinyOS implementation: 16.5KB ROM, 3.2KB RAM
       • Trace-driven simulation: over 2.6 million weather records in 5 years
       • Insertion: 13% ∼ 18% improvement
       Figure: Performance per insertion, β-Tree vs. static tree, for NOR flash segment sizes of 64, 128, and 256 KB: (a) Latency (ms), (b) Energy (µJ), (c) Space Overhead (%)
  21. HybridStore: Value-based Equality Query
       • Key detection: 26.18ms and 1.5mJ over 0.5 million readings
       • Nonexistent keys: more than 3× improvement
       Figure: Impact of Bloom filter for nonexistent keys, over time ranges from 1 day to 1 year: (a) Latency (ms), (b) Energy (mJ). Series: β-Tree (64KB), β-Tree (128KB), β-Tree (256KB), β-Tree (64KB w/o BF), Static (128KB)
  22. HybridStore: Full Query
       • Retrieve 120K readings in 11.08 seconds from 0.5 million records
       • [SenSys ’11]: over 20 seconds to get 50% from 50,000 records
       Figure: HybridStore performance per query of full queries, over time ranges from 1 day to 1 year: (a) Total latency per query (s), (b) Total energy per query (mJ). Series: 1, 3, 5, 7, and 9 degree
  23. Conclusion and Future Work
       Conclusion
         • HybridStore: efficient, light-weight, and sensor-friendly
         • Processes typical joint queries efficiently
         • Processes large-scale datasets efficiently
       Future Work [2]
         • Failure recovery mechanism
         • Distributed database system based on HybridStore
         • Testbed experiments
       [2] B. Wang and J. S. Baras. HybridDB: An Efficient Database System Supporting Incremental epsilon-Approximate Querying for Storage-Centric Sensor Networks. Submitted to the ACM Transactions on Sensor Networks, 2013, pp. 1–35
