Presentation hybrid store-ewsn-2013

HybridStore: An Efficient Data Management System for
Hybrid Flash-based Sensor Devices

Baobing Wang and John S. Baras

Department of Electrical and Computer Engineering
Institute for Systems Research
University of Maryland, College Park, USA
briankw@umd.edu

10th European Conference on Wireless Sensor Networks (EWSN)

February 14, 2013


Motivation

In-situ Data Storage on Sensor Motes
Centralized data collection wastes energy (e.g., TinyDB)
LoCal project¹: 455 nodes, > 900M readings/year
Only aggregated data are required: average noise level, peak power
consumption, usage pattern
Sensors store data locally: sensor database
Flash memory: high capacity, energy efficient

Figure: Per-byte cost: storage, computation and communication [Mathur’06]

¹ http://local.cs.berkeley.edu/

Motivation

Design Challenges
Unlike magnetic disks, no in-place updates on flash memories
NOR flash: byte-oriented, random-accessible, low capacity
NAND flash: page-oriented, high capacity, more energy-efficient
Random writes are 100× more expensive than sequential writes
Very limited RAM: 4KB to 10KB


Related Work

Flash-based Storage Systems
Only time-window queries: TL-Tree [Li’12], FlashLog [Nath’09]
Large RAM footprint: FlashDB [Nath’07], LA-Tree [Agrawal’09]
Antelope [Tsiftes’11]: NOR flash only, discrete values
MicroHash [Lin’06]: long chain of partial pages, extensive page reads
and writes, complex failure recovery
No efficient support for joint queries; global index


Contributions

HybridStore Interface
insert(float key, void* record, uint8_t length)
select(uint32_t t1, uint32_t t2, float k1, float k2)

HybridStore Features
All NAND pages are fully occupied and written purely sequentially
In-place updates and out-of-place writes are completely avoided
Process typical joint queries efficiently, even on large-scale datasets
Data aging without overhead
Sensor-friendly: 16.5KB ROM and 3.2KB RAM in TinyOS 2.1

Potential Applications
Storage layer abstraction: Squirrel [Mottola’10]


HybridStore: Overview

Partition the data stream into segments
Create an in-segment index for each segment
Create an inter-segment index to organize segments
Benefits: skip unnecessary segments, small index per segment


HybridStore: Index Management

Inter-segment skip list (addr, tmin): locate segments within [t1, t2]
In-segment β-Tree: locate records within [k1, k2]
In-segment Bloom filter: check the existence of a key value if k1 = k2
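
As a rough illustration of the inter-segment lookup, the sketch below returns the segments that may hold records in [t1, t2], given each segment's tmin. It swaps the skip list for Python's `bisect` over a sorted list (an assumption made for brevity), and it assumes segments are consecutive in time. The names `SegmentRef` and `segments_in_window` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SegmentRef:
    addr: int   # flash address of the segment (illustrative value)
    tmin: int   # timestamp of the first record in the segment


def segments_in_window(refs, t1, t2):
    """Return the segments that may hold records with timestamps in [t1, t2].

    `refs` must be sorted by tmin. Segment i is assumed to cover
    [tmin_i, tmin_{i+1}), so the first candidate is the last segment
    whose tmin is <= t1.
    """
    if not refs or t1 > t2:
        return []
    tmins = [r.tmin for r in refs]
    first = max(bisect_right(tmins, t1) - 1, 0)
    last = bisect_right(tmins, t2)   # one past the last segment with tmin <= t2
    return refs[first:last]


refs = [SegmentRef(addr=i, tmin=100 * i) for i in range(4)]
window = segments_in_window(refs, 150, 250)   # segments starting at 100 and 200
```

A skip list gives the same answer without holding every tmin in RAM at once, which matters on a mote with a few kilobytes of memory.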


HybridStore: In-segment Index

In-segment β-Tree: locate records within [k1, k2]
Binary tree: lowK, highK, left, right, splitK
Prediction-based bucket splitting: compute splitK

Figure: example β-Tree (the figure also shows the interval (82, 84])
[-60, 120]
  [-60, 30]
  (30, 120]
    (30, 75]
    (75, 120]
      (75, 97.5]
        (75, 86.25]
        (86.25, 97.5]
      (97.5, 120]
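
A minimal sketch of such a bucket tree follows. The slide does not give the prediction-based rule for splitK, so the code assumes midpoint splitting as a stand-in (the intervals in the example figure happen to be midpoint splits). `Node`, `split` and `buckets_in_range` are hypothetical names; the field names follow the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    lowK: float
    highK: float
    splitK: Optional[float] = None   # set only on internal nodes
    left: Optional["Node"] = None    # covers keys <= splitK
    right: Optional["Node"] = None   # covers keys > splitK


def split(node, splitK=None):
    """Turn a leaf bucket into an internal node with two child buckets.

    Midpoint splitting is an assumption used when no splitK is supplied.
    """
    if splitK is None:
        splitK = (node.lowK + node.highK) / 2
    node.splitK = splitK
    node.left = Node(node.lowK, splitK)
    node.right = Node(splitK, node.highK)
    return node.left, node.right


def buckets_in_range(node, k1, k2):
    """Collect the leaf buckets whose key interval overlaps [k1, k2]."""
    if node.splitK is None:
        return [node]
    found = []
    if k1 <= node.splitK:
        found += buckets_in_range(node.left, k1, k2)
    if k2 > node.splitK:
        found += buckets_in_range(node.right, k1, k2)
    return found


# Rebuild the example tree from the figure.
root = Node(-60, 120)
_, r = split(root)   # [-60, 30], (30, 120]
_, r = split(r)      # (30, 75], (75, 120]
l, _ = split(r)      # (75, 97.5], (97.5, 120]
split(l)             # (75, 86.25], (86.25, 97.5]
hits = buckets_in_range(root, 82, 84)   # a single bucket: (75, 86.25]
```

Only the buckets returned by `buckets_in_range` need to be read, so a narrow key range touches a small part of the segment.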



In-segment Bloom filter: v bits, q hash functions, representing n items; false-positive rate $p = \left(1 - \left(1 - \frac{1}{v}\right)^{qn}\right)^{q}$
Must be maintained in RAM: NOR flash is byte-oriented
If q = 3, n = 4096, p ≈ 3.06%, then v = 32768 (i.e., 4KB)
Horizontal partition: fixed small Bloom filter sections (e.g., 256B)
Vertical partition: group fragments with the same offset in the same
NAND page
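
The sizing figures above can be checked numerically. The sketch below evaluates the standard Bloom filter false-positive rate for v bits, q hash functions and n items. The last calculation assumes each 256B section represents 4096 / 16 = 256 items; that is an assumption about how sections are sized, not something the slide states.

```python
def false_positive_rate(v, q, n):
    """p = (1 - (1 - 1/v)**(q*n))**q for v bits, q hash functions, n items."""
    return (1.0 - (1.0 - 1.0 / v) ** (q * n)) ** q


# Whole-segment filter from the slide: q = 3, n = 4096, v = 32768 bits.
p = false_positive_rate(v=32768, q=3, n=4096)
filter_bytes = 32768 // 8           # 4096 bytes, i.e. 4KB
sections = filter_bytes // 256      # how many 256B sections fit in 4KB

# Assumed sizing: one 256B section (2048 bits) holding 256 items.
p_section = false_positive_rate(v=256 * 8, q=3, n=4096 // sections)

print(f"p = {p:.2%}, sections = {sections}, p per section = {p_section:.2%}")
```

Under that assumption the bits-per-item ratio is unchanged, so a small section kept in RAM gives about the same false-positive rate as the full 4KB filter.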


HybridStore: Storage Hierarchy

NOR flash: circular array, fixed segment size
NAND flash: circular array, logical segment (multiple erase blocks)
Index structure: updated in a NOR segment, copied to the NAND
segment later
Header: [T1, T2], [K1, K2], dataAddr, idxAddr, bfAddr, skipList

Figure: (a) Storage Hierarchy (RAM: skip list header, Bloom filter buffer, write buffer, read buffer; NOR segments: Bloom filter, adaptive binary tree; NAND segments); (b) NAND Segment Structure (readings, Bloom filter, tree, header page)
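
To make the circular-array layout concrete, here is a small in-memory model. It is only a sketch: the header field names come from the slide, but their types, the slot count, and the `CircularSegments` class are assumptions, and a real device would erase flash blocks rather than overwrite a Python list.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Header:
    T: Tuple[int, int]        # [T1, T2]: time span of the segment
    K: Tuple[float, float]    # [K1, K2]: key span of the segment
    dataAddr: int
    idxAddr: int
    bfAddr: int
    skipList: List[int] = field(default_factory=list)


class CircularSegments:
    """A fixed number of segment slots, reused as a ring."""

    def __init__(self, slots):
        self.slots: List[Optional[Header]] = [None] * slots
        self.next = 0     # slot that receives the next segment
        self.count = 0    # number of live segments

    def append(self, header):
        """Store a new segment; once the ring is full, reuse the oldest slot."""
        evicted = self.slots[self.next]
        self.slots[self.next] = header
        self.next = (self.next + 1) % len(self.slots)
        self.count = min(self.count + 1, len(self.slots))
        return evicted

    def oldest_first(self):
        """List the live segment headers from oldest to newest."""
        start = (self.next - self.count) % len(self.slots)
        return [self.slots[(start + i) % len(self.slots)]
                for i in range(self.count)]


ring = CircularSegments(slots=3)
for i in range(5):
    ring.append(Header(T=(10 * i, 10 * i + 9), K=(0.0, 1.0),
                       dataAddr=i, idxAddr=i, bfAddr=i))
live = [h.T[0] for h in ring.oldest_first()]   # the two oldest segments are gone
```

Because the oldest segment always sits in the slot about to be reused, dropping it requires no pointer updates in the remaining segments.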


HybridStore: Operations

Insertion
Update the β-Tree: allocate a new bucket if necessary
Update the Bloom filter buffer: flush it out to NOR flash if necessary
NOR segment is full: copy to the NAND segment, update the skip list,
start a new segment
Querying: t1, t2, k1, k2
            t1 = t2      t1 < t2
k1 = k2     skip list    skip list + Bloom filter + β-Tree
k1 < k2     skip list    skip list + β-Tree
Skip a segment if [K1, K2] ∩ [k1, k2] = ∅
Data Aging: delete the oldest NAND segment
No need to update any pointer
No need to move any data page
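
The query table on this slide (rows: key condition, columns: time condition) can be read as a small dispatch rule. The sketch below encodes that reading; `index_plan` is a hypothetical helper, not part of the HybridStore interface.

```python
def index_plan(t1, t2, k1, k2):
    """Return the index structures consulted for a query, per the slide's table."""
    if t1 > t2 or k1 > k2:
        raise ValueError("expected t1 <= t2 and k1 <= k2")
    plan = ["skip list"]              # always used to locate segments by time
    if t1 < t2:
        if k1 == k2:
            plan.append("Bloom filter")   # cheap existence check for a single key
        plan.append("β-Tree")             # locate records by key inside a segment
    return plan


point_in_time = index_plan(5, 5, 1.0, 2.0)
equality = index_plan(1, 9, 3.0, 3.0)
key_range = index_plan(1, 9, 3.0, 4.0)
```

For an equality query the Bloom filter comes first, so segments that cannot contain the key are rejected before the β-Tree is read.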


HybridStore: Implementation and Evaluation

TinyOS implementation: 16.5KB ROM, 3.2KB RAM
Trace-driven simulation: over 2.6 million weather records in 5 years
Insertion: 13% to 18% improvement

Figure: Performance per insertion, β-Tree vs. static tree, for NOR flash segment sizes of 64, 128 and 256 KB: (a) Latency, Time (ms); (b) Energy (µJ); (c) Space Overhead (%)


HybridStore: Value-based Equality Query

Key detection: 26.18ms and 1.5mJ over 0.5 million readings
Nonexistent keys: more than 3× improvement

Figure: Impact of Bloom filter for nonexistent keys over time ranges from 1 day to 1 year: (a) Latency, Time (ms); (b) Energy (mJ). Series: β-Tree (64KB), β-Tree (128KB), β-Tree (256KB), β-Tree (64KB w/o BF), Static (128KB)


HybridStore: Full Query

Retrieve 120K readings in 11.08 seconds from 0.5 million records
[SenSys ’11]: over 20 seconds to get 50% from 50,000 records

Figure: HybridStore performance per query of full queries over time ranges from 1 day to 1 year: (a) Total latency per query, Time (s); (b) Total energy per query, Energy (mJ). Series: 1, 3, 5, 7 and 9 degree


Conclusion and Future Work

Conclusion
HybridStore: efficient, light-weight, and sensor-friendly
Process typical joint queries efficiently
Process large-scale dataset efficiently
Future Work²
Failure recovery mechanism
Distributed database system based on HybridStore
Testbed experiments

² B. Wang and J. S. Baras. HybridDB: An Efficient Database System Supporting
Incremental epsilon-Approximate Querying for Storage-Centric Sensor Networks.
Submitted to the ACM Transactions on Sensor Networks, 2013, pp. 1–35
