This document presents the Cougar approach to in-network query processing in sensor networks. It introduces the concept of a database abstraction layer that allows users to interact with sensor networks using declarative queries. This abstraction layer optimizes queries for efficient in-network processing to reduce energy consumption. The presentation outlines key components of the architecture like the query proxy layer and query optimizer. It also discusses several open research problems in building such a system, including aggregation, query languages, optimization, catalog management, and multi-query optimization.
3. Yong Yao
• Software Engineer at Google
• Ph.D., Computer Science
Cornell University (2000 – 2007)
• Research Interests
– Databases
– Sensor Networks
– Distributed Systems
4. Johannes Gehrke
• University Professor
Department of Computer Science
Cornell University
• Research Interests
– Scalability in Computer Games and Simulations
– Data Privacy
– Data Mining
5. Motivation
• “Database Abstraction Layer” for Sensor
Networks
• Most popular sensor data management
middleware
• Introduces Database Abstraction Layer Concept
• Cited by 1185 (source: Google Scholar)
[Figure: number of citations per year]
24. Query Optimizer
• Generates “Query Processing Plans”
• Refers to
– Catalog Information
– Query Specification
• Specifies
– Data Flow between sensors
– Computation Plan
• Finally, plan is disseminated to all sensors
35. QP for Non-Leader Node
• In-network Aggregation
• Network Interface
– AVG Temperature = 35 °C, Contributor Count = 1
– AVG Temperature = 36 °C, Contributor Count = 1
• Sensor Scan
– Temperature = 38 °C
37. In-Network Aggregation
• Inputs
– AVG Temperature = 35 °C, Contributor Count = 1
– AVG Temperature = 36 °C, Contributor Count = 1
– Temperature = 38 °C
• Total Temperature = 35*1 + 36*1 + 38 = 109
• No of Contributors = 3
• AVG Temperature = 109 / 3 = 36.33
• Result
– AVG Temperature = 36.33 °C, Contributor Count = 3
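The merge step on this slide can be sketched in Python. This is a minimal illustration, not Cougar's actual implementation: the `PartialAvg` type and the `merge` function are hypothetical names introduced here. The sketch assumes each partial result carries an average together with its contributor count, so that averages are combined by weighting each one with its count, and that a raw sensor reading counts as one contributor.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PartialAvg:
    """Hypothetical partial aggregate: an average and the number of readings behind it."""
    avg: float
    count: int


def merge(partials: Iterable[PartialAvg]) -> PartialAvg:
    """Combine partial averages by weighting each average with its contributor count."""
    items = list(partials)
    total = sum(p.avg * p.count for p in items)
    count = sum(p.count for p in items)
    if count == 0:
        raise ValueError("no contributors to aggregate")
    return PartialAvg(avg=total / count, count=count)


# Values from the slide: two partial results received over the network
# plus the node's own sensor reading (treated as one contributor).
received = [PartialAvg(35.0, 1), PartialAvg(36.0, 1)]
own_reading = PartialAvg(38.0, 1)

result = merge(received + [own_reading])
print(f"AVG = {result.avg:.2f}, contributors = {result.count}")
```

Carrying the count alongside the average is what makes the merge correct: averaging the averages directly would give the wrong answer whenever the partial results cover different numbers of sensors.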
38. QP for Non-Leader Node
• Sensor Scan
• Network Interface
• In-network Aggregation
• Towards the Leader
– AVG Temperature = 36.33 °C, Contributor Count = 3
40. QP for Leader Node
• Network Interface
– Partially aggregated results (sent towards the leader)
• Aggregate Operator (AVG)
– Average Value
• Select
– AVG > threshold
42. QP for Leader Node
• Leader Node
– AVG Temperature = 39 °C, Contributor Count = 2
– AVG Temperature = 36.33 °C, Contributor Count = 3
43. QP for Leader Node
• Network Interface
– Partially aggregated results (sent towards the leader)
– AVG Temperature = 39 °C, Contributor Count = 2
– AVG Temperature = 36.33 °C, Contributor Count = 3
• Aggregate Operator (AVG)
– Average Value
• Select
– AVG > threshold
44. Aggregate Operator
• Inputs
– AVG Temperature = 39 °C, Contributor Count = 2
– AVG Temperature = 36.33 °C, Contributor Count = 3
• Total Temperature = 39*2 + 36.33*3 = 186.99
• No of Contributors = 5
• AVG Temperature = 186.99 / 5 = 37.40
• Result
– AVG Temperature = 37.40 °C
45. QP for Leader Node
• Network Interface
– Partially aggregated results (sent towards the leader)
• Aggregate Operator (AVG)
– Average Value: AVG Temperature = 37.40 °C
• Select
– AVG > threshold
46. QP for Leader Node
• Query: “Notify when the average temperature exceeds 35 °C”
• Network Interface
– Partially aggregated results (sent towards the leader)
• Aggregate Operator (AVG)
– Average Value
• Select
– AVG > threshold, Threshold = 35 °C
– AVG Temperature = 37.40 °C
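The leader-node plan from the preceding slides (aggregate the partial results, then apply the selection) can be sketched as follows. This is a hedged illustration rather than the system's real code: the function names `aggregate_avg` and `select`, and the tuple representation of a partial result, are assumptions made for this example. The input values and the 35 °C threshold come from the slides.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a partial result: (average, contributor_count).
Partial = Tuple[float, int]


def aggregate_avg(partials: List[Partial]) -> float:
    """Aggregate Operator (AVG): count-weighted mean of the partial averages."""
    total = sum(avg * count for avg, count in partials)
    contributors = sum(count for _, count in partials)
    if contributors == 0:
        raise ValueError("no contributors")
    return total / contributors


def select(avg: float, threshold: float) -> Optional[float]:
    """Select: pass the average on only if it exceeds the threshold."""
    return avg if avg > threshold else None


# Threshold from the query "Notify when the average temperature exceeds 35 degrees Celsius".
THRESHOLD_C = 35.0
# Partially aggregated results from the slides.
partials = [(39.0, 2), (36.33, 3)]

leader_avg = aggregate_avg(partials)
notification = select(leader_avg, THRESHOLD_C)
if notification is not None:
    print(f"Notify: average temperature {notification:.2f} exceeds {THRESHOLD_C:.0f}")
```

With the slide values the weighted average is 186.99 / 5, which rounds to 37.40, so the selection passes and the user would be notified.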
52. Data Delivery
“How should the data be delivered from
source nodes to the leader?”
– Send all data to leader?
– Should intermediate nodes participate?
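One way to reason about the two options is to count radio transmissions under each. The sketch below is only an illustrative cost model under stated assumptions: the routing tree `PARENT` is hypothetical, every node holds one reading, each hop costs one message, and a partial aggregate (as for AVG) fits in a single message. None of these names or numbers come from the paper.

```python
from typing import Dict, Optional

# Hypothetical routing tree: each node maps to its parent; the leader has no parent.
PARENT: Dict[str, Optional[str]] = {
    "leader": None,
    "a": "leader",
    "b": "leader",
    "c": "a",
    "d": "a",
    "e": "c",
}


def hops_to_leader(node: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of hops from a node up to the leader."""
    hops = 0
    current = parent[node]
    while current is not None:
        hops += 1
        current = parent[current]
    return hops


def direct_delivery_messages(parent: Dict[str, Optional[str]]) -> int:
    """Send all data to the leader: every reading is forwarded unchanged, hop by hop."""
    return sum(hops_to_leader(node, parent) for node in parent)


def in_network_messages(parent: Dict[str, Optional[str]]) -> int:
    """Intermediate nodes participate: each non-leader node sends one partial aggregate."""
    return sum(1 for p in parent.values() if p is not None)


direct = direct_delivery_messages(PARENT)
aggregated = in_network_messages(PARENT)
print(f"direct delivery: {direct} messages, in-network aggregation: {aggregated} messages")
```

In this toy tree, direct delivery needs 9 transmissions while in-network aggregation needs 5; the gap widens as the tree gets deeper, which is why having intermediate nodes participate can save energy.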
54. 3. Query Optimization
• Cost of query plan has changed
• Energy should be the focus
• Reactive to changes in catalog information
– Changes in topology
– Power level at sensor nodes
55. 4. Catalog Management
• Maintained at the server
• Provides Meta Data about the network
• Question: What is the best way to maintain
the catalog?
56. 5. Multi-Query Optimization
• Occurs when the WSN is shared
• Users may pose similar queries
• Share common data among the users
57. Conclusion
• Interacting with a WSN is made easy
• Database Abstraction layer provides
– Friendly Interface
– Efficient scheme to reduce energy consumption
• Research problems need to be carefully
addressed
58. My Views on the Paper
• Presents a concept
• Easy-to-understand
• Flow of the paper sometimes confuses the
reader
I think all of you are aware of the concept “Database Abstraction Layer for Sensor Networks”.
As you can see in the citation-count vs. year graph, it was most cited in 2008.
Having this motivation in mind, let’s move on.
Linking: Despite these limitations, WSNs are successfully used in a wide variety of application domains.
At the time of writing this paper, the authors perceived the future of
Seeing this future, the authors identified two facts that motivated them to come up with the Database Approach for WSNs.
The first reason is
As the popularity of WSNs grows, they may be used by technically expert users as well as non-expert users.
To cater to such a diverse user group, it would be useful if middleware could provide an abstract view of the WSN that hides its messy underlying details.
By giving users a declarative query interface, they can issue queries without even knowing how the data is generated in the sensor network, how it is processed, and how the answers are computed.
Link to the next slide: Motivated by these facts, and as a solution to these issues, the Database Abstraction Layer for WSNs was introduced.
The Database Abstraction Layer allows users to issue SQL-like queries to retrieve data from the WSN. This layer hides all the complex and messy details of the underlying WSN by giving users the feeling that they are using a traditional database management system. Further,
The architecture of the Database Abstraction Layer spans two main regions:
Gateway node
WSN
Note that here the gateway node is excluded from the WSN and it is considered as an external entity.
The Query Optimizer generates “Query Processing Plans” upon receiving a query from a user.
The query plan for a non-leader node has 3 components: Sensor Scan, In-network Aggregation, and Network Interface.
The resulting aggregate value from the Aggregate Operator step will be passed to the next step: Selection
Most popular computation and communication pattern for WSN
This is the same operation that we considered in the example.
To support aggregation we have to address two research problems:
Leader Selection
Data Delivery
With the introduction of such a layer,
However, several research problems have consequently arisen that need to be addressed to achieve the full potential of a DB layer.