Slide notes
  • How storage trends have changed the manner in which data management should be performed.
  • For instance, with the same seismic data, scientists are usually interested in reprocessing the data to reveal phenomena that were previously unobserved.
  • Start off by making a case for increased emphasis on storage from a technology-trends perspective.
  • Physical-layer challenges.
  • Querying challenges.
  • Consider the simple example of a value-based index, say a B+ tree constructed over a stream of sensor data. The difficulty arises when you want to delete a portion of the data, say the first N pages that contain data. This triggers a number of updates to the B+ tree, but since in-place updates cannot be performed on flash, it results in a large number of data copy and block erase operations, which is highly energy intensive. Instead, we use an indexed storage mechanism, where we divide the data into partitions, build the index structure on each partition, and pack the data and corresponding index together at block boundaries. This enables much simpler erase operations, since a partition can be discarded without triggering updates.
  • Compare with other systems that address similar problems – some are large-scale distributed infrastructures for the web, some are systems that exploit spatio-temporal correlation for query processing or compression. No other system combines distributed behavior with the ability to exploit spatio-temporal correlation…set up the uniqueness of our system clearly. Explain web caches better.

Transcript

  • 1. Re-thinking Data Management for Storage-Centric Sensor Networks. Deepak Ganesan, University of Massachusetts Amherst. With: Yanlei Diao, Gaurav Mathur, Prashant Shenoy.
  • 2. Sensor Network Data Management
    • Live Data Management : Queries on current or recent data.
      • Applications:
        • Real-time feeds/queries: Weather, Fire, Volcano
        • Detection and Notification: Intruder, Vehicle
      • Techniques:
        • Push-down Filters/Triggers: TinyDB, Cougar, Diffusion, …
        • Acquisitional Query Processing: BBQ, PRESTO, …
    • Archival Data Management : Querying or Mining of past data
      • Applications:
        • Scientific Analysis of past events: Weather, Seismic, …
        • Historical trends: Traffic analysis, habitat monitoring
    Our focus is on designing an efficient archival data management architecture for sensor networks
  • 3. Archival Querying in Sensor Networks
    • Data Gathering with centralized archival query processing
      • Efficient for low rate, small volume sensors such as weather sensors (temp, humidity, …).
      • Inefficient energy-wise for “rich” sensor data (acoustic, video, high-rate vibration).
    [Figure: sensor data is collected via lossless aggregation and shipped over the Internet to a gateway DBMS.]
  • 4. Archival Querying in Sensor Networks
    • Store data locally at sensors and push queries into the sensor network
      • Flash memory is attractive in terms of energy efficiency, cost, and capacity.
      • Limited capabilities of sensor platforms.
    [Figure: queries are pushed from an Internet gateway to sensors that archive acoustic and image streams locally in flash memory.]
  • 5. Technology Trends in Storage. [Chart: energy cost (uJ/byte) of communication radios (CC1000, CC2420) versus flash storage (Telos STM NOR, Atmel NOR, Micron NAND 128MB) across generations of sensor platforms.]
  • 6. Outline
    • Case for Storage-centric Sensor Networks
    • Challenges in a Storage-centric Sensor Database
    • StonesDB Architecture
    • Local Database Architecture
    • Distributed Database Architecture
    • Conclusion
  • 7. Optimize for Flash and RAM Constraints
    • Flash Memory Constraints
      • Data cannot be over-written, only erased
      • Pages can often only be erased in blocks (16-64KB)
      • Unlike magnetic disks, cannot modify in-place
    • Challenges:
      • Memory : Minimize use of memory for flash database.
      • Energy : Organize data on flash to minimize read/write/erase operations
      • Aging : Need to efficiently delete old data items when storage is insufficient.
    [Figure: updating data on flash: 1. load the block into memory, 2. modify it in memory, 3. erase the block and save the modified block back. Erase blocks are ~16-64 KB, while mote RAM is only ~4-10 KB. A code sketch of this cycle follows.]
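    The read-modify-erase-write cycle above can be made concrete with a minimal C sketch (not from the talk). It assumes a hypothetical block-device driver exposing flash_read_block, flash_erase_block, and flash_write_block; the point is that rewriting even a few bytes in place costs a whole-block read, erase, and write, and needs a block-sized RAM buffer that a mote may not have.

    /* Minimal sketch of an in-place update on NAND-style flash.
     * The flash_* functions are hypothetical placeholders for a driver. */
    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE (16u * 1024u)            /* erase-block size, ~16-64 KB */

    extern void flash_read_block (uint32_t blk, uint8_t *buf);
    extern void flash_erase_block(uint32_t blk);
    extern void flash_write_block(uint32_t blk, const uint8_t *buf);

    /* Overwrite `len` bytes at `offset` inside an already-written block. */
    void flash_update_in_place(uint32_t blk, uint32_t offset,
                               const uint8_t *data, uint32_t len)
    {
        static uint8_t buf[BLOCK_SIZE];         /* larger than a mote's 4-10 KB of RAM */

        flash_read_block(blk, buf);             /* 1. load block into memory */
        memcpy(buf + offset, data, len);        /* 2. modify in memory       */
        flash_erase_block(blk);                 /* 3. erase block ...        */
        flash_write_block(blk, buf);            /*    ... and save it back   */
    }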
  • 8. Support Rich Archival Querying Capability
    • SQL-style Queries : min, max, count, average, median, top-k, contour, track, etc.
    • Similarity Search : Was a bird matching signature S observed last week?
    • Classification Queries : What types of vehicles (truck, car, tank, …) were observed in the field in the last month?
    • Signal Processing : Perform an FFT to find the mode of the vibration signal between time <t1,t2>?
  • 9. StonesDB Goals
    • Our goal is to design a distributed sensor database for archival data management that:
      • Supports energy-efficient sensor data storage, indexing, and aging by optimizing for flash memories.
      • Supports energy-efficient processing of SQL-type queries, as well as data mining and search queries.
      • Is configurable to heterogeneous sensor platforms with different memory and processing constraints.
  • 10. StonesDB Architecture
  • 11. Example: Indexing in StonesDB
    • Naïve Design:
      • Consider a value-based index on entire stream
      • Deletion/Aging of data triggers in-place updates involving energy-intensive block read/write/erase operations.
  • 12. Indexed Storage
    • StonesDB Design:
      • Split data stream into partitions and build index on each partition. Age partitions as a whole cheaply.
    [Figure: each partition's data and index are packed together at flash block boundaries. A code sketch of this layout follows.]
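    The partitioned design can be illustrated with a short C sketch. This is an illustration of the idea, not StonesDB's actual on-flash format: a hypothetical partition header records which erase blocks the partition occupies and carries a tiny per-partition index (here just a value range and a time range), so aging reduces to erasing the partition's blocks without rewriting index entries anywhere else.

    /* Sketch of partitioned, indexed storage (not the actual StonesDB layout).
     * Each partition spans whole erase blocks and carries its own small index,
     * so old data is aged by erasing the partition, never by rewriting indexes. */
    #include <stdint.h>

    typedef struct {
        uint32_t first_block;           /* first erase block of the partition   */
        uint32_t nblocks;               /* partition always spans whole blocks  */
        uint32_t start_time, end_time;  /* time range covered by the partition  */
        int16_t  min_value, max_value;  /* tiny per-partition value index       */
    } partition_hdr_t;

    extern void flash_erase_block(uint32_t blk);   /* hypothetical driver call */

    /* Age a whole partition: erase its blocks; nothing else needs updating. */
    void age_partition(const partition_hdr_t *p)
    {
        for (uint32_t b = 0; b < p->nblocks; b++)
            flash_erase_block(p->first_block + b);
    }

    /* A value query only descends into partitions whose index can match. */
    int partition_may_contain(const partition_hdr_t *p, int16_t v)
    {
        return v >= p->min_value && v <= p->max_value;
    }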
  • 13. Outline
    • Case for Storage-centric Sensor Networks
    • Challenges in a Storage-centric Sensor Database
    • StonesDB Architecture
    • Local Database Architecture
    • Distributed Database Architecture
    • Conclusion
  • 14. StonesDB: Data Mining Queries. Similarity Search : Was a bird matching signature S observed last week? [Figure: the proxy maintains a cache of image summaries.]
  • 15. StonesDB: System Operation. Similarity Search : Was a bird matching signature S observed last week? [Figure: query processing is split between the proxy's cached summaries and the sensor's local query engine with partitioned access methods. A sketch of one possible split follows.]
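    How such a split might look for the similarity-search example is sketched below. This is only an illustration, not the StonesDB query path: cached_summaries, summary_distance, and query_sensor_archive are hypothetical functions, and the idea that the proxy screens its cached summaries before pushing the query down to the sensor is one plausible reading of the "intelligent split" between tiers described on the next slide.

    /* Illustration only: a plausible proxy-side flow for "was a bird matching
     * signature S observed last week?".  The proxy screens its cached,
     * lower-fidelity summaries first and only pushes the energy-expensive
     * query down to the sensor's archive when a summary looks like a match.
     * All functions below are hypothetical. */
    #include <stddef.h>

    typedef struct { float features[16]; } signature_t;   /* compact image summary */

    extern size_t cached_summaries(const signature_t **out);   /* proxy cache     */
    extern float  summary_distance(const signature_t *a, const signature_t *b);
    extern int    query_sensor_archive(const signature_t *s);  /* push to sensor  */

    /* Returns 1 if a match is found, 0 otherwise. */
    int similarity_search(const signature_t *s, float threshold)
    {
        const signature_t *cache;
        size_t n = cached_summaries(&cache);

        for (size_t i = 0; i < n; i++)
            if (summary_distance(&cache[i], s) < threshold)
                /* Possible match: only now spend energy querying the sensor. */
                return query_sensor_archive(s);

        return 0;   /* no cached summary was close; never contact the sensor */
    }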
  • 16. Research Issues
    • Local Database Layer
      • Impact of RAM limitations on storage organization
      • Energy-optimized indexing and aging.
      • New cost models for self-tuning energy-efficient sensor databases.
    • Distributed Database Layer
      • Intelligent split of query processing between proxy and sensor tiers
      • Adaptively tuning quality of data cached at sensor proxy based on query needs
  • 17. The End. STONES : STOrage-centric Networked Embedded Systems. http://sensors.cs.umass.edu/projects/stones
  • 18. Sensor Data Management Taxonomy: Timeline vs Prior Knowledge. [Figure: a taxonomy organized by the timeline of data being processed (current, recent, past) and the type of data processing (querying vs. mining); pushdown filters (TinyDB, Cougar, …) and acquisitional query processing (BBQ, …) address querying of current and recent data, while search/mining on archived sensor data addresses past data.]
  • 19. Technology Trends in Sensor Platforms
    • Cyclops Camera+ Mica2 Mote
      • 128 x 128 resolution images
      • 4 KB RAM, 10 MHz microcontroller
    • OmniVision Camera + iMote2
      • 128 x 128 resolution images
      • 64KB - 32MB RAM, 10 MHz microcontroller
    Spectrum of sensing devices with different power, capability, and resource constraints.