My tutorial on SensorDB Design Issues at SIGMOD 2007
  • Size of a PDA, cell phone, or button. Processor board: low-power microcontroller, limited RAM and ROM, wireless transceiver, antenna, battery. Sensor board: temperature, humidity, acceleration, light, pressure, and others. Usually deployed in harsh environments and in large quantities.
  • Write embedded code to handle the hardware, networking, scheduling, and query processing. Compile into binary. Inject into the nodes through the sink. Issue SQL-like queries.
  • ATmega128 (Atmel): 128 KB programmable flash ROM, 4 KB RAM, 16 MHz, 2.7-5.5 V operation. MSP430 (TI): 16-bit, up to 16 KB ROM, 8 MIPS, 1.8-3.6 V. ARM/THUMB: OKI Semiconductor.
  • If devices can change frequency at runtime and communicate at different frequencies, this is called "frequency multiplexing", as in walkie-talkies. Two devices that want to communicate must use the same frequency. Some types of TV antennas receive unidirectional (or simply "directional") signals. Directional antennas can pull in signals from greater distances. Because they "see" in only one direction, they are resistant to noise and "multipath distortion" (a problem encountered when an antenna receives reflections of the desired signal). Because multi-directional antennas "see" in many directions, they are more likely to pick up noise, interference, and multipath distortion.
  • Energy conservation: turn off the radio when it is not needed. Bandwidth allocation: allocate communication channels to neighbor nodes. Time synchronization is usually done in the MAC layer, but some networks may not need it; ad-hoc networks of laptops or PDAs, for example, may not synchronize their clocks. Time sync makes the time on each node the same, so that the timestamp on a packet from any node is useful. Another purpose of time sync is to make the nodes return packets within each epoch: the nodes should know when the epoch starts and when it ends. Collision detection and handling can be included in bandwidth allocation. There is some scheduling mechanism in the kernel.
  • Hardware components: UART, radio, sensor. System components include AM (Active Messages) and task execution. Lib components: queue, debugger, random. Interface components include the attribute interface, message sending interface, timer interface, etc.
  • Dynamic code update means the nodes do not need to reinstall all code; reinstalling causes the node to stop or reset. Without priorities, one task cannot preempt another. Single thread, single application.
  • An example of a networked embedded system is an information sharing system for cars, which shares information among cars about highway status, accidents, fires, rain, road quality ahead, etc. Contiki doesn't support dynamic loading.
  • With direct calls, the called module cannot be changed after compilation. With a jump table it can, because the called module does not need to be compiled together with the caller, much like a DLL (dynamic link library).
  • The small code size covers the kernel, scheduler, and network stack, but not the application. Together they occupy less than 500 bytes of RAM and about 14 KB of flash.
  • The application exists as binary code, just like TinyOS application code, which is compiled from TinyOS plus the application's NesC program. It may contain accidental program errors. Note that naturalization is not code interpretation: it is performed only at load time. After naturalization, the naturalized program is considered safe and is executed without monitoring or interpretation. This load-time approach makes most of the software overhead a one-time cost, so it executes much faster than approaches based on interpretation. Changes made by naturalization: memory changes, to enable virtual memory; and code changes, to stop malicious programs that try to enter the OS kernel and change something.
  • Most Java virtual machines run from RAM on a PC. Because loading code into RAM is costly, most embedded devices run code from ROM.
  • TinyOS adopts such a CSMA protocol for wireless communication.
  • The gradients are shown as the arrows in Step 2. They are just the links for the paths from the source to the destination. Normally, after Step 2, query results start flowing towards the destination along multiple paths. During transmission of these query results, the network “reinforces” one, or a small number of these paths. In Step 3 of this slide, the source sensor node reinforces one path by selecting the gradient that has been more reliable than others.
  • If node 3 finds that the link between 1 and 3 is not reliable (the packet loss rate exceeds a system-defined threshold), node 3 may switch to another neighbor with a more reliable link, even though its hop count may increase.
  • When a RERR occurs, the source needs to reconstruct a route to the destination. The process is the same as the discovery process shown earlier in the figure.

Presentation Transcript

  • System Design Issues in Sensor Databases
    • Qiong Luo and Hejun Wu
    • Department of Computer Science and Engineering, The Hong Kong University of Science & Technology
    • http://www.cse.ust.hk/
  • Wireless Sensor Networks (WSNs)
    • Limited on-node resources
    • Multi-hop communication
    Energy efficiency is the most crucial performance factor.
  • In-Network Sensor Query Processing (Sensor Databases, SensorDBs)
    • Example query issued at the sink: SELECT temperature FROM sensors WHERE temperature > 900 SAMPLE INTERVAL 60s
    • Components on each node in the figure: query operators (σ, π, α), Scheduler, Sensing & Networking (one node shows Networking only)
    • Example aggregation query: SELECT avg(light) FROM sensors SAMPLE INTERVAL 60s
      • Node readings: light: 1000, light: 500, light: 300
      • Partial averages shown: 300/1, 1000/1, (1000+500)/2
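The partial averages in the slide above can be read as (sum, count) states that are merged on the way to the sink. The following is a minimal Python sketch of that idea, not TinyDB or Cougar code. The node names and the routing tree are hypothetical; only the three light readings come from the slide.

```python
from typing import Dict, List, Tuple

# Partial state for AVG is a (sum, count) pair, so states can be merged in any order.
Partial = Tuple[float, int]

def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial AVG states."""
    return (a[0] + b[0], a[1] + b[1])

def aggregate(node: str, children: Dict[str, List[str]],
              reading: Dict[str, float]) -> Partial:
    """Return the partial AVG state that `node` would send to its parent."""
    # A node without a reading (here, the sink) contributes nothing itself.
    state: Partial = (reading[node], 1) if node in reading else (0.0, 0)
    for child in children.get(node, []):
        state = merge(state, aggregate(child, children, reading))
    return state

# Hypothetical routing tree: a and b report to the sink, c reports to b.
children = {"sink": ["a", "b"], "b": ["c"]}
reading = {"a": 300.0, "b": 500.0, "c": 1000.0}

total, count = aggregate("sink", children, reading)
print("average light:", total / count)
```

Each node sends one small state instead of forwarding every reading, which is the saving that in-network aggregation aims for.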
  • Two Representative SensorDBs
    • Cougar [BGS01, YG03]
      • Model sensor network data as sequences
      • Declarative query interface with UDFs
      • Cross-layer optimization in later versions
    • TinyDB [MF+02, 03]
      • Declarative query interface
      • Efficient and extensible framework
      • Open-source implementation on real nodes
  • Advantages of Sensor Databases
    • Flexibility
      • Declarative SQL style queries
      • Dynamic query injection and removal
    • Efficiency
      • Cross-layer optimization
        • E.g., in-network filtering and aggregation
  • Challenges in Sensor Databases
    • Dynamic data streams
    • Hardware resource limitations
      • Limited per-node computing power and storage
      • Unreliable wireless communication
      • Battery power supply
    • Complex, networked, embedded software
      • Blurred boundaries between components
      • Plenty of cross-layer optimization opportunities
  • Focus of this Tutorial
    • System design issues in sensor databases
      • Software architecture
      • Operating system support
      • Media Access Control (MAC)
      • Routing
      • Scheduling
    These issues often dominate the overall performance.
  • Outline
    • Introduction
    • WSN hardware
      • Computing, sensing, communication, and power supply
    • Software architecture
    • Operating system support
    • MAC protocols
    • Routing
    • Scheduling
    • Summary and future directions
  • Current Sensor Node Hardware: Computing and Storage
    • Low-power microcontroller (CPU of a node)
      • E.g., ATmega128 (MICA series), MSP430 (Telos series), ARM/THUMB (XYZ sensor), and the latest 180 MHz ARM920 (SunSPOT).
    • Limited memory
      • RAM
        • ≤ 10KB SRAM (Static RAM)
      • ROM
        • Usually ≤ 1MB flash memory
  • Current Sensor Node Hardware: Sensing and Radio
    • Sensing devices
      • Electronic, mechanical, bio-chemical, …
    • Radio transceiver
      • Fixed radio frequency
      • Omnidirectional radio signal
      • Transmission rate ≤ 200 kbps
      • Transmission range ≤ 50 meters
  • Current Sensor Node Hardware: Power Supply and Consumption
    • Power supply
      • Batteries, usually ≤ 2000 mAh
    • Electric currents in a node
      • Sleep 15-20 µA
      • Radio on
        • Idle 20-25 mA, compute 25-30 mA
      • Radio off
        • Idle 1-5 mA, compute 5-10 mA
    Sleeping is the most effective means to save energy.
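To see why sleeping matters so much, the currents on this slide can be turned into a rough lifetime estimate. This is a back-of-the-envelope Python sketch that assumes an ideal battery (no self-discharge, full capacity usable) and picks 25 mA for the active state and 20 µA for sleep from the ranges above; the function names and the 1% duty cycle are illustrative.

```python
def average_current_ma(duty_cycle: float, active_ma: float, sleep_ua: float) -> float:
    """Average current in mA when the node is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    # Sleep current is given in microamps, so convert it to milliamps.
    return duty_cycle * active_ma + (1.0 - duty_cycle) * (sleep_ua / 1000.0)

def lifetime_hours(capacity_mah: float, duty_cycle: float,
                   active_ma: float = 25.0, sleep_ua: float = 20.0) -> float:
    """Estimated lifetime in hours of an ideal battery at the given duty cycle."""
    return capacity_mah / average_current_ma(duty_cycle, active_ma, sleep_ua)

always_on = lifetime_hours(2000.0, 1.0)
one_percent = lifetime_hours(2000.0, 0.01)
print(f"always on: {always_on:.0f} h")
print(f"1% duty cycle: {one_percent / 24:.0f} days")
```

Under these assumptions an always-on node lasts a few days, while a node that is active 1% of the time lasts most of a year.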
  • Outline
    • Introduction
    • WSN hardware
    • Software architecture
    • Operating system support
    • MAC protocols
    • Routing
    • Scheduling
    • Summary and future directions
  • Common Software Architecture of Sensor Databases
    Figure: layered architecture (surviving labels: Scheduling, Operating System Kernel)
    Boundaries between components in a sensorDB are blurred.
  • Outline
    • Introduction
    • WSN hardware
    • Common SensorDB software architecture
    • Operating system support
      • Hardware management
      • Application code development and deployment
    • MAC protocols
    • Routing
    • Scheduling
    • Summary and future directions
  • TinyOS (http://www.tinyos.net/)
    • De facto OS for sensor nodes
      • Early research effort
      • Open source development
      • Wide presence in commercial products
      • Component-based architecture
        • Adaptive to hardware changes
        • Lightweight for various applications
      • Event-driven processing
        • Responsive to sensor signals and radio messages
  • TinyOS Application
    Figure: Runnable image of a TinyOS application
    • Application and TinyOS startup ("Main")
    • Kernel: core system components, lib components, TinyOS interface components (commands, events)
    • Abstraction: hardware manipulation components
    • Hardware: mote main board, sensor devices
    A TinyOS application is compiled with TinyOS components.
  • Some Limitations of TinyOS
    • Static code and memory
      • No virtual memory
      • No dynamic memory allocation
      • No dynamic code update
      • Task execution without priorities
    • Single thread
    Figure: TinyOS memory allocation (regions: Global, Free, Stack)
  • Contiki [DGV04]
    • Multi-threading
    • Lightweight program loading
    • Lightweight communication stacks
      • uIP
        • A micro-version of RFC-compliant TCP/IP
      • Rime
        • A lightweight communication stack for low-power radio
    Figure: Multi-threading in Contiki (a core and threads #1, #2, #3, …, #n)
  • SOS [HK+05]
    • Dynamic module loading
      • Allows incremental update of binary code
    • Runtime safety mechanisms
      • Memory monitoring
      • Watchdog
        • Restart when system hangs
    Figure: SOS kernel and Modules #1, #2, …, #N connected through a jump table, which maps a system function call to the actual function to call (steps labeled 1 and 2)
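The jump-table indirection on the SOS slide can be illustrated with a small Python sketch. This is not SOS code: the `Kernel` class, the slot name, and the two filter functions are hypothetical, and a real node would store function addresses in a table rather than Python callables.

```python
from typing import Callable, Dict

class Kernel:
    """Toy kernel that dispatches every call through a jump table."""

    def __init__(self) -> None:
        self._jump_table: Dict[str, Callable[[int], int]] = {}

    def load_module(self, slot: str, func: Callable[[int], int]) -> None:
        # Loading or replacing a module only rewrites one table entry.
        self._jump_table[slot] = func

    def call(self, slot: str, arg: int) -> int:
        # Callers name the slot, never the function itself.
        return self._jump_table[slot](arg)

def filter_v1(value: int) -> int:
    return value          # pass the reading through unchanged

def filter_v2(value: int) -> int:
    return min(value, 900)  # updated module: clamp the reading

kernel = Kernel()
kernel.load_module("filter", filter_v1)
before = kernel.call("filter", 1000)
kernel.load_module("filter", filter_v2)  # incremental update, callers untouched
after = kernel.call("filter", 1000)
print(before, after)
```

The caller's code is identical before and after the update, which is the property that lets a module be replaced without recompiling the rest of the image.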
  • MANTIS [BC+05]
    • Multi-threading
    • Remote testing
    • Scheduler for duty-cycle sleeping
    • Small code size
      • Uses less than 500B RAM and 14KB flash memory
    Figure: MANTIS architecture (user threads #1 … #n, MANTIS system API, network stack, command server, kernel / scheduler, communication layer, device driver, sensor node hardware)
  • t-kernel [GS06]
    • OS protection
      • Separates OS/app space
    • Virtual memory
      • Extends the limited SRAM
    • Preemptive scheduling
      • Allows priorities
    • Fault tolerance
      • Prevents system hang-up from application errors
    Figure: Running an app in t-kernel (load application binary code → naturalization → run)
  • On-Node Virtual Machines
    • SunSPOT
      • A compact Java language
      • Java VM directly runs in on-node flash memory
    • SwissQM [MAK07]
      • Combines a powerful gateway with a virtual machine at the sensors
        • Query Machine (QM)
    SunSPOT: http://www.sunspotworld.com/
    Figure: SwissQM Query Machine (bytecode interpreter, QM programs, sensors, operand stack, query synopsis, transmission buffer)
  • Declarative Sensor Networks [CP+07]
    • Snlog language
      • Datalog-like, declarative
      • Suitable for polynomial-time programs
      • Useful in a variety of apps
    • Snlog compiler
      • Translates Snlog into NesC
    • Runtime system
      • Components of user provided rules
      • No on-node interpreter
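    Figure: DSN compilation flow. An Snlog program passes through the Snlog compiler (Snlog front-end, execution planner, NesC backend), which uses NesC templates and DSN runtime components to produce a generated NesC program; the NesC compiler turns it into the binary code to be executed.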
  • Summary on OS Support
    • Support app. development and deployment
      • Programming interfaces
      • Code compilation and generation
      • Runtime loading and modification
    • Provide hardware resource management
      • Sensor signals, radio messages
      • Memory allocation and virtualization
      • Scheduling and system safety
  • OS and Sensor Databases
    • Desirable OS features for sensor databases
      • Multiple applications
      • Multi-threading
      • Virtual memory
      • Priority scheduling
      • Reliability and fault tolerance
  • Outline
    • Introduction
    • WSN hardware
    • Common SensorDB software architecture
    • Operating system support
    • MAC protocols
      • CSMA, STEM, S-MAC, and T-MAC
    • Routing
    • Scheduling
    • Summary and future directions
  • CSMA (Carrier Sense Multiple Access)
    • Random delay before transmission attempt
    • A node must keep listening idly until its communication is done
    • Wireless collision remains a major problem
      • Reason: no effective coordination between nodes
    Figure: Sender 1 and Sender 2 transmit at the same time, causing a collision
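The effect of the random delay can be sketched with a simplified slotted model in Python. The assumptions are loud ones: every sender hears every other sender (no hidden terminals), each picks a backoff slot uniformly from a contention window, and the earliest sender wins unless another sender picked the same slot. The function names and window sizes are illustrative and not taken from any specific CSMA implementation.

```python
import random
from typing import List, Optional

def contend(backoffs: List[int]) -> Optional[int]:
    """Return the index of the sender that wins the channel, or None on collision."""
    earliest = min(backoffs)
    winners = [i for i, b in enumerate(backoffs) if b == earliest]
    # A unique earliest sender transmits; the others sense the carrier and defer.
    return winners[0] if len(winners) == 1 else None

def collision_rate(senders: int, window: int,
                   rounds: int = 10_000, seed: int = 0) -> float:
    """Estimate how often a contention round ends in a collision."""
    rng = random.Random(seed)
    collisions = 0
    for _ in range(rounds):
        backoffs = [rng.randrange(window) for _ in range(senders)]
        if contend(backoffs) is None:
            collisions += 1
    return collisions / rounds

print("window 4: ", collision_rate(senders=2, window=4))
print("window 32:", collision_rate(senders=2, window=32))
```

A larger window lowers the collision rate but lengthens the idle listening before each transmission, which is the energy cost noted on the slide.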
  • Sparse Topology and Energy Management (STEM) [STS02]
    • Periodic wake up and listen
    • Sleep when there is no packet to send or receive
    Figure: Power over time for a STEM node (periodic listening, transmitting, sleeping)
  • S-MAC (Sensor-MAC) [YHE02]
    • Schedules nodes to periodically sleep
    • Coordinates the sleeping time of neighbors for reliable transmission
    Figure: S-MAC sender and receiver timeline (Listen period with SYNC and RTS, followed by Sleep)
  • T-MAC [DL03]
    • Contention based protocol
    • Dynamically ends an active period
      • Adapts to the needs for computation and communication
    Figure: Active and sleep times in S-MAC vs. T-MAC; a T-MAC node ends its active time after the timeout TA ("Aha, no more to do! zzz~")
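The way T-MAC ends an active period early can be sketched as a timeout that is renewed by every activity event. This Python sketch assumes a single node, event times already known, and a timeout parameter named `ta` after the TA label in the figure; the function name and the numbers in the example are hypothetical.

```python
from typing import Iterable

def tmac_sleep_time(frame_start: float, ta: float, events: Iterable[float]) -> float:
    """Time at which the node sleeps: the first moment with no activity for `ta`."""
    deadline = frame_start + ta
    for t in sorted(events):
        if t < frame_start:
            continue          # event belongs to an earlier frame
        if t > deadline:
            break             # the timeout fired before this event arrived
        deadline = t + ta     # activity renews the timeout
    return deadline

# An idle frame ends after one timeout; a busy frame stays active longer.
print(tmac_sleep_time(frame_start=0, ta=15, events=[]))
print(tmac_sleep_time(frame_start=0, ta=15, events=[5, 12, 40]))
```

With a fixed active time, as in S-MAC, both frames would stay awake equally long; here the idle frame goes to sleep as soon as the timeout expires.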
  • Summary on MAC
    • Important for performance
      • Communication quality
        • Signal errors
        • Noise
      • Communication energy
        • Sleeping nodes
        • Retransmission
      • Communication delay
        • Negotiation for channels
        • Wireless signal transmission delay
    Significant to data quality, energy efficiency and response time in query processing!
  • MAC and Sensor Databases
    • MAC behavior of sensor databases
      • Mostly converge-cast
      • Periodic data flows
    • Opportunities of sensor databases for MAC
      • Sleep scheduling that suits the data flows
  • Outline
    • Introduction
    • WSN hardware
    • Common SensorDB software architecture
    • Operating system support
    • MAC protocols
    • Routing
      • MintRoute, TinyAODV, and Directed Diffusion
    • Scheduling
    • Summary and future directions
  • Location-Based Routing
    • Requires location information
      • Usually finds the shortest, reliable path using location information of each node
        • Energy aware
    • Suitable for queries with spatial predicates
      • Can route queries to some specific regions
    Figure: Shortest path from source S to destination D, with the transmission range of S marked
  • Flooding
    • Every node broadcasts received data
      • Broadcast can be reduced by hop count
    • Advantage
      • Simple
    • Problems
      • Message implosion
        • Many duplicates
      • Resource inefficiency
        • Most nodes busy
        • No sleeping
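A hop-limited flood can be sketched in a few lines of Python. Two things here are assumptions rather than content from the slide: the example topology is made up, and the sketch suppresses duplicates (each node rebroadcasts a message only once), which plain flooding as described above does not do.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

def flood(graph: Dict[str, List[str]], source: str,
          max_hops: int) -> Tuple[Set[str], int]:
    """Return (nodes reached, number of broadcasts) for a hop-limited flood."""
    reached = {source}
    broadcasts = 0
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops >= max_hops:
            continue                      # hop budget used up: do not rebroadcast
        broadcasts += 1                   # one local broadcast heard by all neighbors
        for neighbor in graph[node]:
            if neighbor not in reached:   # duplicate suppression (an assumption)
                reached.add(neighbor)
                queue.append((neighbor, hops + 1))
    return reached, broadcasts

# Hypothetical chain topology: a - b - c - d
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(flood(chain, "a", max_hops=2))
print(flood(chain, "a", max_hops=10))
```

Even with duplicate suppression, every reached node within the hop budget broadcasts once, so no node on the way can sleep through a flood.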
  • Directed Diffusion [IGE00]
    • Queries are defined as interests.
    • Sink nodes post interests.
    • Source nodes generate sensory data.
    • Source nodes select reliable and efficient routes to the sink nodes to forward data.
  • Illustration of Directed Diffusion
    • Step 1: Sink node propagates Interest (query)
    • Step 2: Set up gradients
    • Step 3: Reinforce one path
    Figure: Sink node and source node shown at each of the three steps
  • MintRoute [WTC03]
    • Implemented in TinyOS
      • Can be used in TinyDB
    • More than simple shortest-path routing
      • Monitors link connectivity
      • Decides a route based on both link quality and distance
    Figure: Nodes 1, 2, and 3 shown twice (before and after a route change); one link is marked unreliable, with a high loss rate
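Choosing a route from both link quality and distance can be sketched as a parent selection step. This Python sketch is not MintRoute's actual algorithm: it assumes each candidate neighbor advertises its own cost to the sink, uses the expected number of transmissions (1 / delivery ratio) as the per-hop cost, and drops links below a hypothetical `min_quality` threshold. All names and numbers are illustrative.

```python
from typing import Dict, Optional

def choose_parent(candidates: Dict[str, Dict[str, float]],
                  min_quality: float = 0.5) -> Optional[str]:
    """Pick the neighbor with the lowest estimated cost to the sink.

    `candidates` maps neighbor -> {"link_quality": delivery ratio in (0, 1],
                                   "path_cost": the neighbor's own cost to the sink}.
    """
    best, best_cost = None, float("inf")
    for name, info in candidates.items():
        quality = info["link_quality"]
        if quality < min_quality:
            continue  # link considered unreliable
        cost = info["path_cost"] + 1.0 / quality  # expected transmissions on this hop
        if cost < best_cost:
            best, best_cost = name, cost
    return best

# Node 3's view: node 1 is closer to the sink but the link to it is lossy.
neighbors_of_3 = {
    "node1": {"link_quality": 0.3, "path_cost": 0.0},
    "node2": {"link_quality": 0.9, "path_cost": 1.0},
}
print(choose_parent(neighbors_of_3))
```

In this example node 3 accepts one extra hop through node 2 because the direct link would need more retransmissions on average.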
  • TinyAODV (Tiny Ad-hoc On-Demand Distance Vector)
    • Builds paths only when needed
    • Uses sequence number in RREQ to avoid cycles
    Figure: Route discovery between Source and Destination (RREQ, RREP, DATA, and a RERR on a broken link marked X)
    RREQ: Route Request; RREP: Route Reply; RERR: Route Error
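On-demand route discovery can be sketched as a flood of route requests that leaves reverse pointers, followed by a reply that walks those pointers back. This Python sketch is a simplification and not TinyAODV code: it assumes symmetric links, models the whole network as one adjacency map, and leaves out sequence numbers and timers. The topology is hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

def discover_route(graph: Dict[str, List[str]], source: str,
                   dest: str) -> Optional[List[str]]:
    """Flood an RREQ from `source`; return the path an RREP would confirm, or None."""
    reverse: Dict[str, Optional[str]] = {source: None}  # node -> where the RREQ came from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            break
        for neighbor in graph.get(node, []):
            if neighbor not in reverse:   # each node handles a given RREQ only once
                reverse[neighbor] = node
                queue.append(neighbor)
    if dest not in reverse:
        return None
    path: List[str] = []
    node: Optional[str] = dest
    while node is not None:               # the RREP walks the reverse pointers
        path.append(node)
        node = reverse[node]
    return path[::-1]

net = {"S": ["A", "B"], "A": ["S", "D"], "B": ["S", "C"],
       "C": ["B", "D"], "D": ["A", "C"]}
print(discover_route(net, "S", "D"))

# After a RERR for the broken link A-D, the source simply runs discovery again.
net_after_rerr = {"S": ["A", "B"], "A": ["S"], "B": ["S", "C"],
                  "C": ["B", "D"], "D": ["C"]}
print(discover_route(net_after_rerr, "S", "D"))
```

No routes are kept for destinations nobody talks to, which is the point of building paths only when needed.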
  • Summary on Routing
    • Focus of current routing protocols
      • Efficient forwarding
        • Shortest path or least retransmission
      • Load balancing
        • Avoid hot spots of heavy traffic
    • Open issues
      • Reliability
        • Node failure
        • Noise
      • Communication delay
        • Find the path of the minimal delay
  • Routing and Sensor Databases
    • Routing characteristics of sensor databases
      • Mainly converge-casting
      • Not all nodes satisfy a query all the time.
    • Opportunities of sensorDBs for routing
      • Data flow aware routing
        • Busy nodes get better routes.
        • Busy queries get better routes.
      • Query type aware routing
        • Aggregation, duplicate-sensitivity, join
  • Outline
    • Introduction
    • WSN hardware
    • Common SensorDB software architecture
    • Operating system support
    • MAC protocols
    • Routing
    • Scheduling
      • FPS, Sichitiu’s Scheme, and DCS
    • Summary and future directions
  • Goal of Scheduling
    • Communication efficiency and reliability
      • Coordinate nodes in communication
        • Wireless collisions among neighbors
        • No receiving on sleeping nodes
    Figure: A sender transmits ("Sending… Done!") while the receiver sleeps ("zzz…"), so the data is lost
  • Centralized Scheduling
    • The base station specifies schedules for all nodes.
      • The base station must be aware of the workload and the network topology
    • Hard to scale
    • Hard to adapt to changes
    Figure: The sink distributes a schedule to the nodes
  • Distributed Scheduling
    • Scheduling in TinyDB (query layer)
      • A node stays active for 4 seconds and sleeps for the rest of each sample interval.
    • Sichitiu’s Scheduling Scheme [Sic04]
      • Schedules at the MAC and routing layers
      • Sets up both routing paths and schedules
        • Schedule construction is time consuming and unreliable because it needs the sink to confirm.
  • FPS (Flexible Power Scheduling) [HDB04]
    • Routing-layer distributed scheduling
    • A parent node assigns transmission slots to its children to avoid collision between siblings.
    • Collisions among non-sibling neighbors are possible.
    Figure: Time divided into slots (transmit, compute, sleep, …)
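The idea that a parent hands out non-overlapping transmission slots to its children can be sketched as follows. This Python sketch is only an illustration and does not reproduce the FPS protocol from [HDB04]: it assumes the parent already knows how many slots each child needs, and the function and child names are hypothetical.

```python
from typing import Dict, List

def assign_slots(children_demand: Dict[str, int],
                 slots_per_cycle: int) -> Dict[str, List[int]]:
    """Give each child as many distinct slots as it requested, in order."""
    if sum(children_demand.values()) > slots_per_cycle:
        raise ValueError("demand exceeds the slots available in one cycle")
    schedule: Dict[str, List[int]] = {}
    next_slot = 0
    for child, demand in children_demand.items():
        # Consecutive slots, never shared with a sibling.
        schedule[child] = list(range(next_slot, next_slot + demand))
        next_slot += demand
    return schedule

schedule = assign_slots({"child1": 2, "child2": 1}, slots_per_cycle=8)
print(schedule)
```

Siblings cannot collide under such a schedule, but two children of different parents may still be given the same slot, which is the limitation noted on the slide.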
  • DCS (Distributed Cross-Layer Scheduling) [WLX06]
    • Slot based
    • Takes query processing cycles into account
      • Receiving, computing, transmission, and sleep
    • Not only parents assign schedules to children, but neighbors also negotiate.
      • Able to avoid the collisions at the receiving nodes
      • Attempts to assign consecutive transmission slots to each node
  • DCS Components
    • Routing layer: Route Maintenance, Route Selection
    • Query layer: Selection / Projection / Join / Aggregation
    • MAC layer: Transmission / Receiving, Collision Detection
    • Scheduling module: Schedule Construction, Time Synchronization, Query Scheduling, Schedule Execution
  • Slot in a Schedule
    • A slot is a time period of fixed length.
      • Transmission, Sleeping, PL/R (Processing, Listening / Receiving), and Q/M (Query injection / route Maintenance)
    • Slot number s at t:
      • The length of a slot is ls.
      • The schedule start time is t0.
      • A sample interval has m slots.
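The formula for the slot number did not survive in this transcript. Given the three definitions above, one plausible reading is s = ⌊(t − t0) / ls⌋ mod m, that is, count whole slots since the schedule started and wrap around at the end of each sample interval. The Python sketch below implements that assumed reading; it is not taken from [WLX06], and the function name and example values are hypothetical.

```python
import math

def slot_number(t: float, t0: float, slot_len: float, m: int) -> int:
    """Slot index at time t, assuming s = floor((t - t0) / slot_len) mod m."""
    if slot_len <= 0 or m <= 0:
        raise ValueError("slot_len and m must be positive")
    return math.floor((t - t0) / slot_len) % m

# A schedule that starts at t0 = 0 with 4 slots of 0.5 s per sample interval.
print(slot_number(1.2, t0=0.0, slot_len=0.5, m=4))
print(slot_number(2.0, t0=0.0, slot_len=0.5, m=4))
```

Under this reading every node can compute its current slot locally, provided the clocks are synchronized and all nodes agree on t0.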
  • An Example Schedule
    Figure: Routing tree (Sink at hop 0, Node 1 at hop 1, leaf Nodes 2 and 3 at hop 2) with each node's slots over time. Legend: Transmission, PL/R, Q/M, Sleeping, Active (sink only).
  • Energy Efficiency
    • DCS achieves 50-60% energy saving.
  • Summary on Scheduling
    • Scheduling is done on one or more layers.
    • Scheduling is crucial for performance.
      • Communication reliability
        • Coordination of nodes
      • Energy consumption
        • Sleep scheduling
      • Response time
        • Different transmission timings of neighboring nodes result in different delays.
  • Scheduling and Sensor Databases
    • SensorDBs are complex to schedule.
    • Opportunities in scheduling for sensorDBs
      • On-node multi-query scheduling
        • Limited resources
        • Changing sensor environments
      • Query-aware transmission scheduling
      • Interaction between scheduling and query execution
  • Outline
    • Introduction
    • WSN hardware
    • Common SensorDB software architecture
    • Operating system support
    • MAC protocols
    • Routing
    • Scheduling
    • Summary and future directions
  • Tutorial Summary
    • System design issues have a significant impact on the overall performance of sensor databases.
    • A holistic sensor database system requires considerations on all layers – from OS kernel, MAC, routing to query processing.
    • Cross-layer design is necessary, especially in scheduling.
  • Future Directions
    • Multi-query processing in sensor networks
      • Query optimization
        • Sharing, caching, and pipelining
      • Scheduling
        • Queries, operators, and transmission
    • In-network joins among different nodes
      • Fine-grained scheduling for node cooperation
    • Query processing in multi-sink networks
    Cross-layer design is necessary for an efficient, holistic sensor database.
  • References: Sensor Databases and Runtime Support
    • [BGS01] Philippe Bonnet, Johannes Gehrke, and Praveen Seshadri. Towards Sensor Database Systems. MDM, 2001.
    • [MF+02] Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks. OSDI, 2002.
    • [MF+03] Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. The Design of an Acquisitional Query Processor for Sensor Networks. SIGMOD, 2003.
    • [YG03] Yong Yao and Johannes Gehrke. Query Processing for Sensor Networks. CIDR, 2003.
    • [MAK07] Rene Muller, Gustavo Alonso, and Donald Kossmann. SwissQM: Next Generation Data Processing in Sensor Networks. CIDR, 2007.
    • [CP+07] David Chu, Lucian Popa, Arsalan Tavakoli, Joseph M. Hellerstein, Philip Levis, Scott Shenker, and Ion Stoica. The Design and Implementation of A Declarative Sensor Network System. Submitted for publication, 2007.
  • References: OS Support
    • [DGV04] Adam Dunkels, Björn Grönvall, and Thiemo Voigt. Contiki - A Lightweight and Flexible Operating System for Tiny Networked Sensors. The 29th Annual IEEE Conference on Local Computer Networks, 2004.
    • [HK+05] Chih-Chieh Han, Ram Kumar, Roy Shea, Eddie Kohler, and Mani Srivastava. A Dynamic Operating System for Sensor Nodes. International Conference on Mobile Systems, Applications, and Services, 2005.
    • [BC+05] Shah Bhatti, James Carlson, Hui Dai, Jing Deng, Jeff Rose, Anmol Sheth, Brian Shucker, Charles Gruenwald, Adam Torgerson, and Richard Han. MANTIS OS: An Embedded Multithreaded Operating System For Wireless Micro Sensor Platforms. ACM/Kluwer Mobile Networks and Applications (MONET), Special Issue on Wireless Sensor Networks, vol. 10, no. 4, pp. 563–579, Aug 2005.
    • [GS06] Lin Gu and John A. Stankovic. t-kernel: Providing Reliable OS Support for Wireless Sensor Networks. SenSys, 2006.
  • References: MAC and Routing
    • [STS02] Curt Schurgers, Vlasios Tsiatsis, and Mani B. Srivastava. STEM: Topology Management for Energy Efficient Sensor Networks. IEEE Aerospace Conference, 2002.
    • [YHE02] Wei Ye, John Heidemann, and Deborah Estrin. An Energy-Efficient MAC Protocol for Wireless Sensor Networks. INFOCOM, 2002.
    • [DL03] Tijs van Dam and Koen Langendoen. An Adaptive Energy-Efficient MAC Protocol for Wireless Sensor Networks. SenSys, 2003.
    • [IGE00] Chalermek Intanagonwiwat, Ramesh Govindan, and Deborah Estrin. Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks. MobiCom, 2000.
    • [WTC03] Alec Woo, Terence Tong, and David Culler. Taming the Underlying Challenges of Reliable Multihop Routing in Sensor Networks. SenSys, 2003.
  • References: Scheduling
    • [HDB04] Barbara Hohlt, Lance Doherty, and Eric Brewer. Flexible Power Scheduling for Sensor Networks. IPSN, 2004.
    • [Sic04] Mihail L. Sichitiu. Cross-Layer Scheduling for Power Efficiency in Wireless Sensor Networks. INFOCOM, 2004.
    • [WLX06] Hejun Wu, Qiong Luo, and Wenwei Xue. Distributed Cross-Layer Scheduling for In-Network Sensor Query Processing. PerCom, 2006.