Semantics in Sensor Networks


Invited talk at the FIS2009 (Future Internet Symposium) Workshop on Semantics and Future Internet. September 1st, 2009

Published in: Technology

  1. 1. Semantics in Sensor Networks Workshop on Semantics and Future Internet Berlin, 1 Sep 2009 Oscar Corcho Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid [email_address] Phone: 34.91.3366605 Fax: 34.91.3524819
  2. 2. Sensor Networks <ul><li>Increasing availability of cheap, robust, deployable sensors as ubiquitous information sources </li></ul><ul><li>Dynamic and reactive, but noisy, and unstructured data streams </li></ul>Source: Antonis Deligiannakis
  3. 3. Parts of a Sensor <ul><li>Sensing equipment (sensor and data acquisition boards) </li></ul><ul><ul><li>Internal (“built-in”) vs external sensing capabilities </li></ul></ul><ul><li>CPU </li></ul><ul><li>Memory </li></ul><ul><li>Battery </li></ul><ul><li>Radio to transmit/receive data from other sensors </li></ul>Source: Antonis Deligiannakis
  4. 4. Some examples of sensors and sensor parts <ul><li>Maximizing network lifetime is the main target </li></ul><ul><ul><li>Cost-effective only if sensor networks last long </li></ul></ul><ul><ul><li>Sensor-based applications without power constraints are much easier to handle </li></ul></ul>Source: Antonis Deligiannakis <table><tr><th></th><th>Passive RFID tag</th><th>Berkeley Mica2</th><th>Stargate (Intel PXA255 CPU)</th><th>Constraint</th></tr><tr><td>Battery</td><td>--</td><td>2 AA</td><td>Li-Ion</td><td>Conserve to increase network lifetime</td></tr><tr><td>CPU</td><td>--</td><td>7.38 MHz</td><td>400 MHz</td><td>Computationally cheap algorithms</td></tr><tr><td>Memory</td><td>1 Kb</td><td>4 KB SRAM, 512 KB EEPROM</td><td>Up to 256 MB flash</td><td>Algorithms with low memory requirements</td></tr><tr><td>Radio</td><td>A few feet</td><td>300 meters</td><td>Depends on radio model</td><td>Transmission range, bandwidth (bits/sec)</td></tr></table>
  5. 5. The Sensor Web <ul><li>Sensor networks may be networked, mostly wireless, hence global and integrated </li></ul><ul><li>Universal, web-based access to sensor data </li></ul><ul><li>Each network with some kind of authority and administration </li></ul><ul><li>Sensor networks vs robust networks </li></ul>Source: Adapted from Alan Smeaton’s invited talk at ESWC2009
  6. 6. Sensor Web: Is this part of the Web/Internet? Source: SemsorGrid4Env consortium
  7. 7. You haven’t done sensor networks research until... <ul><li>You have fallen in the river/mud/glacier/... </li></ul><ul><li>You have felt the excitement of data arriving. And uncertainty when it stops. </li></ul><ul><li>The data has had an impact. </li></ul><ul><li>Methodologically, this research must be conducted in the wild. </li></ul><ul><ul><li>Well, yes and no: there is a need for all types of research (pure, basic, applied, cross-discipline, etc.) </li></ul></ul><ul><ul><li>But the first three bullets have to be kept in mind all the time. </li></ul></ul>Source: Adapted from Dave de Roure
  8. 8. Energy Constraints <ul><li>Battery capacity increases by only 3-5% per year </li></ul><ul><ul><li>CPU speed increases much faster </li></ul></ul><ul><ul><ul><li>However, energy per CPU instruction decreases </li></ul></ul></ul><ul><li>Some applications: unattended deployment </li></ul><ul><ul><li>E.g., disaster scenarios, military environments… </li></ul></ul><ul><ul><li>Often hard or impossible to replace batteries </li></ul></ul><ul><li>Maximizing network lifetime is the main target </li></ul><ul><ul><li>Cost-effective only if sensor networks last long </li></ul></ul><ul><ul><li>Applications with sensors without power constraints are much easier to handle </li></ul></ul>
  9. 9. Sources of Energy Drain <ul><li>CPU computations </li></ul><ul><li>Measurements from sensing equipment (cost depends on what you sense) </li></ul><ul><li>Very small energy consumption in sleep mode </li></ul><ul><li>Radio is the main source of energy drain </li></ul><ul><ul><li>cost(transmitting) > cost(receiving) ≥ cost(idle listening) </li></ul></ul><ul><ul><li>Popular goal: reduce #transmitted bits </li></ul></ul><ul><ul><li>Synchronization + communication protocols equally important </li></ul></ul><ul><ul><ul><li>I.e., cost of transmitting K bits depends on duty cycle (percentage of time sensor is awake to listen for data) </li></ul></ul></ul><ul><ul><ul><li>Idle listening for too long is extremely costly </li></ul></ul></ul>
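A rough sketch of the radio cost ordering on this slide. The power figures and the function name `daily_radio_energy` are illustrative assumptions, not values from the talk; they are chosen only so that transmitting costs more than receiving, and receiving costs at least as much as idle listening. The point is to show how strongly the duty cycle affects the total.

```python
# Illustrative radio power draw in milliwatts; values are assumptions chosen
# only to respect cost(transmit) > cost(receive) >= cost(idle listening).
POWER_MW = {"transmit": 60.0, "receive": 30.0, "idle": 30.0, "sleep": 0.03}


def daily_radio_energy(duty_cycle, tx_seconds, rx_seconds):
    """Return radio energy per day in joules.

    duty_cycle: fraction of the day the radio is awake (0..1).
    tx_seconds, rx_seconds: time per day spent transmitting and receiving.
    Awake time not spent transmitting or receiving is idle listening.
    """
    day = 24 * 60 * 60
    awake = duty_cycle * day
    if tx_seconds + rx_seconds > awake:
        raise ValueError("radio must be awake while transmitting/receiving")
    idle = awake - tx_seconds - rx_seconds
    asleep = day - awake
    millijoules = (
        POWER_MW["transmit"] * tx_seconds
        + POWER_MW["receive"] * rx_seconds
        + POWER_MW["idle"] * idle
        + POWER_MW["sleep"] * asleep
    )
    return millijoules / 1000.0


# Same traffic, different duty cycles: almost all the difference is idle listening.
always_on = daily_radio_energy(1.0, tx_seconds=60, rx_seconds=60)
one_percent = daily_radio_energy(0.01, tx_seconds=60, rx_seconds=60)
```

With identical traffic, the always-on radio spends nearly all of its energy on idle listening, which is why schedules that let nodes sleep matter as much as reducing transmitted bits.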
  10. 10. Assumptions and Goals in Subsequent Algorithms <ul><li>Research emphasis on more constrained environments </li></ul><ul><ul><li>Wireless communication, short transmission ranges </li></ul></ul><ul><ul><ul><li>One or more base stations with increased capabilities may exist </li></ul></ul></ul><ul><ul><ul><li>Candidates for gateways to the semantic sensor web </li></ul></ul></ul><ul><ul><li>Energy limitations (battery powered sensors) </li></ul></ul><ul><li>Goal of algorithms: </li></ul><ul><ul><li>Preserve energy </li></ul></ul><ul><ul><li>Organize sensors and their schedules </li></ul></ul><ul><ul><ul><li>Good schedules allow sensors to power down their radios/CPUs and go into a sleep mode </li></ul></ul></ul><ul><ul><li>Reduce size of transmitted data </li></ul></ul><ul><li>Processing (esp. aggregation) focuses on numeric measurements </li></ul><ul><li>Implication of having a strict schedule on when to collect data: the base station knows when quantities are collected </li></ul><ul><ul><li>Such metadata may not even need to be transmitted </li></ul></ul>
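The last point on this slide can be sketched as follows: if the acquisition schedule is fixed and known, the base station can rebuild timestamps itself. This is a minimal illustration, assuming synchronized clocks, in-order delivery and no lost readings; the function name `reconstruct_timestamps` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def reconstruct_timestamps(epoch_start, period, num_readings):
    """Rebuild acquisition times at the base station from the agreed schedule.

    The node sends only the values, in order; reading i is assumed to have
    been taken at epoch_start + i * period.
    """
    return [epoch_start + i * period for i in range(num_readings)]


readings = [12, 14, 13]  # payload as received: no timestamps attached
times = reconstruct_timestamps(
    datetime(2009, 9, 1, 10, 0), timedelta(minutes=15), len(readings)
)
stamped = list(zip(times, readings))
```

In a real deployment, lost packets would break the index-to-time mapping, so a sequence number or occasional timestamp would probably still be needed.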
  11. 11. Who are the end users of sensor networks? Source: Dave de Roure The climate change expert, or a simple citizen
  12. 12. And what do these users want? <ul><li>Long lived sensors </li></ul><ul><li>Real time readings and prediction </li></ul><ul><li>Integration across sensors and sensor sources </li></ul><ul><li>Events </li></ul>
  13. 13. But why is it worth falling in mud? <ul><li>Sensor networks address an important set of “grand challenge” Computer Science issues including: </li></ul><ul><ul><li>Scale, scalable </li></ul></ul><ul><ul><li>Autonomic behaviour versus control </li></ul></ul><ul><ul><li>Persistent, heterogeneous, evolving </li></ul></ul><ul><ul><li>Holistic approach including information systems </li></ul></ul><ul><ul><li>Deployment challenge </li></ul></ul><ul><ul><li>Some mobile devices </li></ul></ul>Source: Dave de Roure
  14. 14. A set of challenges in sensor data management <ul><li>Provisioning </li></ul><ul><ul><li>Complexity of acquisition: distributed sources, data volumes, uncertainty, data quality, incompleteness </li></ul></ul><ul><ul><li>Pre-processing incoming data: calibration on instruments (specific), lack of re-grid, calibration, gap-filling features </li></ul></ul><ul><ul><li>Tools for data ingestion needed: generic, customizable, provide estimates, uncertainty degree, etc. </li></ul></ul><ul><li>Spatial/temporal </li></ul><ul><li>Analysis, modeling </li></ul><ul><ul><li>Discovery: identify sources, metadata </li></ul></ul><ul><ul><li>Data quality: gaps, faulty data, loss, estimates </li></ul></ul><ul><ul><li>Analysis models </li></ul></ul><ul><ul><li>Republish analytic results and computations </li></ul></ul><ul><ul><li>Workflows for data stream processing </li></ul></ul>Source: Data Management in the WorldWide Sensor Web. Balazinska et al. IEEE Pervasive Computing, 2007
  15. 15. A set of challenges in sensor data management <ul><li>Interoperability </li></ul><ul><ul><li>Data aggregation/integration </li></ul></ul><ul><li>Uncertainty, data quality </li></ul><ul><ul><li>Noise, failures, measurement errors, confidence, trust </li></ul></ul><ul><li>Distributed processing </li></ul><ul><ul><ul><li>High volume, time critical </li></ul></ul></ul><ul><ul><ul><li>Fault-tolerance </li></ul></ul></ul><ul><ul><ul><li>Load management  </li></ul></ul></ul><ul><ul><ul><li>Stream processing features </li></ul></ul></ul><ul><ul><ul><li>Continuous queries </li></ul></ul></ul><ul><ul><ul><li>Live & historical data </li></ul></ul></ul>Source: Data Management in the WorldWide Sensor Web. Balazinska et al. IEEE Pervasive Computing, 2007
  16. 16. A semantic perspective on these challenges <ul><li>Sensor data querying and (pre-)processing </li></ul><ul><ul><li>Data heterogeneity </li></ul></ul><ul><ul><li>Data quality </li></ul></ul><ul><ul><li>New inference capabilities required to deal with sensor information </li></ul></ul><ul><li>Sensor data model representation and management </li></ul><ul><ul><li>For data publication, integration and discovery </li></ul></ul><ul><ul><li>Bridging between sensor data and ontological representations for data integration </li></ul></ul><ul><ul><li>Ontologies: Observations and measurements, time series, etc. </li></ul></ul><ul><ul><li>Event models </li></ul></ul><ul><li>User interaction with sensor data </li></ul>
  17. 17. Final Discussion: Hot Topics and Open Problems
  18. 18. Challenges. A 1000-foot architectural perspective
  19. 19. Challenge 1: Querying and (pre-)processing <ul><li>For a model of surface water drainage, every 15 minutes, and within 24 hours of their being taken, we wish to obtain time-correlated measurements of the river depth now and the rainfall at the top of the hill 15 minutes before, provided that it is now raining less at the river than it was at the hilltop and that the rainfall at the hilltop was above 5 mm. </li></ul><ul><li>Assume that: </li></ul><ul><ul><ul><li>sink (0) </li></ul></ul></ul><ul><ul><ul><li>river (rain : int, depth : int) at sites (5, 6, 7, 9) </li></ul></ul></ul><ul><ul><ul><li>hilltop (rain : int) at sites (4) </li></ul></ul></ul><ul><li>Then: </li></ul><ul><ul><ul><li>SELECT RSTREAM </li></ul></ul></ul><ul><ul><ul><li>river.time, hilltop.rain, river.depth </li></ul></ul></ul><ul><ul><ul><li>FROM river[NOW], </li></ul></ul></ul><ul><ul><ul><li>hilltop[AT NOW-15 MINUTES] </li></ul></ul></ul><ul><ul><ul><li>WHERE hilltop.rain > 5 </li></ul></ul></ul><ul><ul><ul><li>AND river.rain < hilltop.rain </li></ul></ul></ul><ul><ul><li>ACQUISITION RATE=EVERY 15 MINUTES, MAX DELIVERY TIME=24 HOURS </li></ul></ul>[Figure: sensor network with sites 0 to 9] Source: SemsorGrid4Env Kick-Off, Madrid, 29-30 Sep 2008
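To make the query's meaning concrete, here is a minimal centralized sketch of one evaluation step in Python. It is not how SNEE executes the query (that happens in-network), and it ignores the acquisition rate and delivery time settings; the function name `evaluate` and the sample readings are assumptions for illustration.

```python
from datetime import datetime, timedelta


def evaluate(river, hilltop, now, lag=timedelta(minutes=15)):
    """Emulate one evaluation of the slide's query at time `now`.

    river:   list of dicts with keys time, rain, depth (river[NOW]).
    hilltop: list of dicts with keys time, rain (hilltop[AT NOW-15 MINUTES]).
    Returns (river.time, hilltop.rain, river.depth) tuples.
    """
    river_now = [r for r in river if r["time"] == now]
    hill_then = [h for h in hilltop if h["time"] == now - lag]
    return [
        (r["time"], h["rain"], r["depth"])
        for r in river_now
        for h in hill_then
        if h["rain"] > 5 and r["rain"] < h["rain"]
    ]


t = datetime(2008, 9, 29, 12, 0)
river = [{"time": t, "rain": 3, "depth": 40}, {"time": t, "rain": 9, "depth": 42}]
hilltop = [{"time": t - timedelta(minutes=15), "rain": 8}]
result = evaluate(river, hilltop, t)
```

Only the first river reading qualifies: the hilltop rainfall 15 minutes earlier (8) exceeds 5, and the river rainfall (3) is lower than it. The second river reading (rain 9) fails the second condition.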
  20. 20. SNEE as a Decision-Making Sequence <ul><li>Is the query well-formed and well-typed? </li></ul><ul><li>Which algebraic translation of the query is, heuristically, most efficient? </li></ul><ul><li>Which algorithms will reduce the evaluation cost? </li></ul><ul><li>Which physical routes should be used to transport tuples from sources to sink? </li></ul><ul><li>Given the logical data flow paths, are there sites where the amount of data flowing through is getting smaller? </li></ul><ul><li>Given the transport paths, which sites should get which fragments, given that acquisition, (some) processing and delivery must be carried out in specific places? </li></ul><ul><li>Given the space and time estimation models, when should each fragment in each site execute, when should communication take place, and when should the node go to sleep? </li></ul><ul><li>Given a target language, how to generate node-specific source code that correctly executes the distributed computation specified in the agenda? </li></ul>[Figure: SNEE compilation pipeline. Inputs: <query, QoS-expectations> and <schemas, description(node,network), cost parameters>. Steps: (1) parsing/type checking, (2) translation/rewriting, (3) algorithm assignment, (4) routing, (5) partitioning, (6) where-scheduling, (7) when-scheduling, (8) code generation, grouped into a single-site phase and a multi-site phase. Intermediate forms: abstract-syntactic tree, logical-algebraic form, physical-algebraic form (PAF), routing tree (RT), fragmented-algebraic form, distributed-algebraic form (DAF), agenda. Output: nesC code for nodes <N 1 , …, N m >.]
  21. 21. Challenge 1: Querying and (pre-)processing <ul><li>Collecting all data ( SELECT * queries) </li></ul><ul><ul><li>Entire network or a subregion, periodically or frequently </li></ul></ul><ul><li>Collect aggregates of data </li></ul><ul><ul><li>Report AVG, SUM, MAX quantities in an area/network </li></ul></ul><ul><li>Data Reduction based on user-specified data quality </li></ul><ul><ul><li>In all types of queries: (historical, aggregate…) </li></ul></ul><ul><ul><li>Minimize bandwidth based on quality (or the dual problem) </li></ul></ul><ul><li>Detecting Outliers </li></ul><ul><ul><li>“Strange” readings: Interesting phenomenon or malfunction? </li></ul></ul><ul><li>Joins </li></ul><ul><ul><li>Report information based on combined readings of sensors (e.g., report when a lion is close to a deer) </li></ul></ul><ul><ul><li>Harder to optimize. Naïve solution of sending potential joining tuples (or projected attributes of them) to base station is often not far from best case </li></ul></ul>Source: Antonis Deligiannakis
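One common way to compute an aggregate such as AVG with few transmitted bits is to merge partial results along the routing tree, so that each node forwards a single (sum, count) pair rather than every raw reading. The slides do not prescribe this technique; the sketch below is an assumption-laden illustration, and the tree, readings and function name `aggregate_avg` are hypothetical.

```python
def aggregate_avg(tree, readings, node):
    """Return (sum, count) for the subtree rooted at `node`.

    tree:     dict mapping each node id to a list of its children.
    readings: dict mapping node id to its local reading (nodes that only
              relay, such as the sink, are absent).
    Each node forwards one (sum, count) pair instead of all raw readings.
    """
    total, count = 0.0, 0
    if node in readings:
        total, count = float(readings[node]), 1
    for child in tree.get(node, []):
        child_total, child_count = aggregate_avg(tree, readings, child)
        total += child_total
        count += child_count
    return total, count


# Hypothetical routing tree: sink 0 with two branches.
tree = {0: [1, 2], 1: [3, 4], 2: [5]}
readings = {1: 20.0, 2: 22.0, 3: 19.0, 4: 21.0, 5: 23.0}
total, count = aggregate_avg(tree, readings, 0)
average = total / count
```

SUM and MAX work the same way with a different merge step; AVG needs the (sum, count) pair because averages of averages are not correct when subtrees differ in size.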
  22. 22. Challenge 1: Querying and (pre-)processing <ul><li>Requires additional information ( metadata ) for collected data </li></ul><ul><ul><li>Location/orientation of sensor, time, authority, measured quantities, units, errors etc </li></ul></ul><ul><ul><ul><li>Some of them are static, some may change (time, location…) </li></ul></ul></ul><ul><ul><li>Additional info may significantly impact volume of transmitted data </li></ul></ul><ul><ul><li>Query execution still needs to be optimized within each network </li></ul></ul><ul><li>Tradeoff between the logical correctness of results and statistical models </li></ul><ul><ul><li>Deliver the right information at the right time </li></ul></ul>
  23. 23. Challenge 2: Sensor Data Modelling and Management <ul><li>The “easy” part </li></ul><ul><ul><li>Agree on a network of sensor network ontologies </li></ul></ul><ul><ul><li>Use these ontologies to annotate SensorML readings </li></ul></ul><ul><li>Use registries to publish available sensor networks </li></ul>
  24. 24. Challenge 2: Sensor Data Modelling and Management <ul><li>Data publication and integration </li></ul><ul><li>SensorLocation:stored (id:int, locx:int, locy:int) </li></ul><ul><li>TreeSensor:sensed (id:int, ts:time, smoke:boolean, temperature:float, relHumidity:float) </li></ul><ul><li>SoilSensor:sensed (id:int, ts:time, moisture:float) </li></ul><ul><li>WindSensor:sensed (id:int, ts:time, speed:float, direction:float) </li></ul><ul><li>RainGauge:sensed (id:int, ts:time, level:float) </li></ul>Streaming SPARQL, C-SPARQL, etc. Linked Stream Data (ISWC2009 semantic sensor workshop),-2.5225/1
  25. 25. Simple Query <ul><li>SNEEql </li></ul><ul><li>RSTREAM SELECT id, speed, direction </li></ul><ul><li>FROM wind[NOW]; </li></ul><ul><li>Streaming SPARQL </li></ul><ul><li>PREFIX fire: <> </li></ul><ul><li>SELECT ?sensor ?speed ?direction </li></ul><ul><li>FROM STREAM <http://…/SensorReadings.rdf> WINDOW RANGE 1 MS SLIDE 1 MS </li></ul><ul><li>WHERE { </li></ul><ul><li>?sensor a fire:WindSensor; </li></ul><ul><li>fire:hasMeasurements ?WindSpeed, ?WindDirection. </li></ul><ul><li>?WindSpeed a fire:WindSpeedMeasurement; </li></ul><ul><li>fire:hasSpeedValue ?speed; </li></ul><ul><li>fire:hasTimestampValue ?wsTime. </li></ul><ul><li>?WindDirection a fire:WindDirectionMeasurement; </li></ul><ul><li>fire:hasDirectionValue ?direction; </li></ul><ul><li>fire:hasTimestampValue ?dirTime. </li></ul><ul><li>FILTER (?wsTime = ?dirTime) </li></ul><ul><li>} </li></ul><ul><li>C-SPARQL </li></ul><ul><li>REGISTER QUERY WindSpeedAndDirection AS </li></ul><ul><li>PREFIX fire: <> </li></ul><ul><li>SELECT ?sensor ?speed ?direction </li></ul><ul><li>FROM STREAM <http://…/SensorReadings.rdf> [RANGE 1 MSEC SLIDE 1 MSEC] </li></ul><ul><li>WHERE { … </li></ul>Semantically Integrating Streaming and Stored Data
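The gap between the relational SNEEql view and the RDF view queried by Streaming SPARQL can be illustrated by mapping one wind tuple (id, ts, speed, direction) to the triple pattern used in the WHERE clause above. This is only a sketch: the `fire:` terms are taken from the slide, but the namespace behind the prefix is elided there and is left unexpanded here, and the node naming scheme and the function name `wind_tuple_to_triples` are hypothetical.

```python
def wind_tuple_to_triples(sensor_id, ts, speed, direction):
    """Map one WindSensor tuple (id, ts, speed, direction) to triples.

    Terms are kept as prefixed names because the slide elides the namespace
    behind `fire:`. Node naming (`sensor/<id>`, `_:ws...`) is hypothetical.
    """
    sensor = f"sensor/{sensor_id}"
    ws = f"_:ws_{sensor_id}_{ts}"
    wd = f"_:wd_{sensor_id}_{ts}"
    return [
        (sensor, "rdf:type", "fire:WindSensor"),
        (sensor, "fire:hasMeasurements", ws),
        (sensor, "fire:hasMeasurements", wd),
        (ws, "rdf:type", "fire:WindSpeedMeasurement"),
        (ws, "fire:hasSpeedValue", speed),
        (ws, "fire:hasTimestampValue", ts),
        (wd, "rdf:type", "fire:WindDirectionMeasurement"),
        (wd, "fire:hasDirectionValue", direction),
        (wd, "fire:hasTimestampValue", ts),
    ]


triples = wind_tuple_to_triples(7, 1251799200, 4.2, 270.0)
```

One four-column tuple becomes nine triples, which hints at why the RDF queries are so much longer than the SNEEql one and why generating RDF streams from relational streams is listed among the challenges.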
  26. 26. Challenge 3: User Interaction with Sensor Data Source: SemsorGrid4Env consortium
  27. 27. Vision (after some iterations, and more to come) Source: RWI Working Group on IoT: Networked Knowledge. Time horizons: Before 2010, 2010-2015, 2015-2020, Beyond 2020 (labelled Today, Incremental, Incremental-Visionary, Visionary). Interoperability: <ul><li>Middleware </li></ul><ul><li>Sensor ontologies </li></ul><ul><li>Intra-network cross-layer integration and optimization </li></ul><ul><li>Sensor Internet </li></ul><ul><li>Inter-network cross-layer integration and optimization </li></ul>Information & Context: <ul><li>Relational database integration </li></ul><ul><li>Sensor network data warehouses </li></ul><ul><li>Stream aggregation </li></ul><ul><li>Query processing and reasoning on sensor networks </li></ul><ul><li>Event modelling </li></ul><ul><li>Database-stream integration </li></ul><ul><li>Sensor actuation (in-network processing) </li></ul><ul><li>QoS models </li></ul><ul><li>QoS-based information integration of DB and streams </li></ul>Discovery: <ul><li>Centralised non-semantic registries ( </li></ul><ul><li>Semantic discovery of sensors and sensor data </li></ul><ul><li>Distributed registries </li></ul><ul><li>Sensor network location transparency </li></ul>Identity & Trust & Privacy: <ul><li>RFID tags </li></ul><ul><li>No privacy management </li></ul><ul><li>URIs </li></ul><ul><li>User-centric privacy and policies </li></ul><ul><li>Virtual sensor networks through dynamic policies </li></ul>Provenance: <ul><li>Data provenance (where, what and who) </li></ul><ul><li>Data transformation processes (how) </li></ul><ul><li>Process and problem solving understanding (why) </li></ul><ul><li>Problem solving interpretation and explanation </li></ul>
  28. 28. Another list of R&D challenges <ul><li>Short-term </li></ul><ul><ul><li>Sensor network ontologies </li></ul></ul><ul><ul><li>Spatio-temporal RDF queries </li></ul></ul><ul><ul><li>RDF stream generation (RDB2RDF tools can be “easily” adapted, e.g., R2O/D2R/etc.) </li></ul></ul><ul><ul><li>RDF/DB/HTML/Maps Mashup support </li></ul></ul><ul><li>Medium-term </li></ul><ul><ul><li>Semantic query-based access to sensor networks </li></ul></ul><ul><ul><ul><li>In-network processing: real-time SPARQL to [CQL, SNEEql, etc.] </li></ul></ul></ul><ul><ul><ul><li>(Ontology-based) relational data and data stream integration </li></ul></ul></ul><ul><ul><ul><li>Pay-as-you-go techniques for query planning </li></ul></ul></ul><ul><ul><li>Distributed RDF-based query processing and integration </li></ul></ul><ul><ul><li>Usability of mashup-supporting technology </li></ul></ul>
  29. 29. Semantics in Sensor Networks Workshop on Semantics and Future Internet Berlin, 1 Sep 2009 Oscar Corcho Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid [email_address] Phone: 34.91.3366605 Fax: 34.91.3524819
  30. 30. Real World: Where do we talk about this? <ul><li>Join us at </li></ul><ul><ul><li>W3C SSN Incubator Group </li></ul></ul><ul><ul><li>OGC Sensor Web Enablement WG </li></ul></ul><ul><li>A bunch of (mainly research-oriented) workshops </li></ul><ul><ul><li>ESWC2009 workshops on stream reasoning and semantic sensor networks </li></ul></ul><ul><ul><li>ISWC2009 workshop on semantic sensor networks </li></ul></ul><ul><ul><li>Future Internet Symposium series </li></ul></ul>
  31. 31. SemsorGrid4Env: Objectives <ul><li>Development of an integrated information space where new sensor networks can be easily discovered and integrated with existing ones and possibly other data sources (e.g., historical databases). </li></ul><ul><li>Rapid development of flexible and user-centric decision support systems that use data from multiple autonomous, independently deployed sensor networks and other applications. </li></ul>Start date: 01/09/2009. Duration: 36 months.
  32. 32. SemsorGrid4Env: use cases <ul><ul><li>Fire Risk Monitoring and Warning in a specific area of Castilla y León </li></ul></ul><ul><ul><li>Coastal and Estuarine Flood Warning in Southern UK. </li></ul></ul>
  33. 33. SemsorGrid4Env: Technologies and Expected Results <ul><li>Distributed RDF-based query processing and integration </li></ul><ul><ul><li>Extensions of OGSA-DQP and WS-DAI-RDF </li></ul></ul><ul><ul><li>Archival data and data streams ontology-based integration </li></ul></ul><ul><li>Sensor network ontologies </li></ul><ul><li>Spatio-temporal RDF queries </li></ul><ul><li>Query-based access to sensor networks (in-network processing) </li></ul><ul><ul><li>SNEEql </li></ul></ul><ul><ul><li>Outlier detection algorithms </li></ul></ul><ul><li>Mashup support </li></ul>