Internet of Things: Concepts and Technologies


Internet of Things: Concepts and Technologies, Institute for Communication Systems
Faculty of Engineering and Physical Sciences
University of Surrey

Published in: Education

  1. 1. 1 Internet of Things: Concepts and Technologies Payam Barnaghi Institute for Communication Systems Faculty of Engineering and Physical Sciences University of Surrey
  2. 2. 2 Internet of Things Source: P. Barnaghi et al., "Digital Technology Adoption in the Smart Built Environment", IET Sector Technical Briefing, The Institution of Engineering and Technology (IET), I. Borthwick (editor), March 2015.
  3. 3. 3 Wireless Sensor Networks (WSN) − Sensor Networks consist of nodes with different capabilities. − Large number of heterogeneous sensor nodes − Spread over a physical location − It includes physical sensing, data processing and networking − In ad-hoc networks, sensors can join and leave due to mobility, failure etc. − Data can be processed in-network, or it can be directly communicated to the endpoints.
  4. 4. 4 Wireless Sensor Networks (WSN) [Figure labels: Sink node, Gateway, Core network (e.g. Internet), Gateway, End-user, Computer services] - The networks typically run low-power devices - Each node consists of one or more sensors, which could be of different types (or actuators)
  5. 5. 5 Types of nodes − Sensor nodes − Low power − Consist of sensing device, memory, processor and radio − Resource-constrained − Sink nodes − Another sensor node or a different wireless node − Normally more powerful/better resources − Gateway − A more powerful node − Connection to core network − Could include service representation, cache/storage, discovery and other functions
  6. 6. 6 Types of applications − Event detection − Reporting occurrences of events − Reporting abnormalities and changes − Could require collaboration of other nearby or remote nodes − Event definition and classification is an issue − Periodic measurements − Sensors periodically measure and report the observation and measurement data − Reporting period is application dependent − Approximation and pattern detection − Sending messages along the boundaries of patterns in both space/time − Tracking − When the source of an event is mobile − Sending event updates with location information
  7. 7. 7 Requirements and challenges − Types of services − In conventional communication network the target is moving bits from one place to another − In WSN moving the data is not the actual goal. − It is expected to provide meaningful information/actions.
  8. 8. 8 Type of Services in WSN Sink node Gateway Core network e.g. Internet End-user Data Sender Data Receiver A sample data communication in conventional networks A sample data communication in WSN Fire! Some bits 01100011100
  9. 9. 9 Requirements and challenges – Cont’d − Quality of Service − In conventional networks, QoS requirements usually come from multimedia-type applications; e.g. delay, bandwidth (and often for large data) − Here the transmitted data is small (but in large networks there could be a large number of small data transmissions). − QoS requirements depend on the application − Latency could be an important factor for time-sensitive applications or actuation control − The packet delivery metric could be insufficient − What is more relevant is the Quality of Information that can be extracted.
  10. 10. 10 Requirements and challenges – Cont’d − Fault tolerance − The nodes can get damaged, run out of power, the wireless communication between two nodes can be interrupted, etc. − To tolerate node failures, redundant deployments can be necessary. − Lifetime − The nodes could have a limited energy supply; − Sometimes replacing the energy sources is not practical (e.g. underwater deployment, large/remote field deployments). − Energy efficient operation can be a necessity.
  11. 11. 11 Requirements and challenges – Cont’d − Scalability − A WSN can consist of a large number of nodes − The employed architectures and protocols should scale to these numbers. − Wide range of densities − Density of the network can vary − Different applications can have different node densities − Density does not need to be homogeneous across the entire network, and the network should adapt to such variations.
  12. 12. 12 Requirements and challenges – Cont’d − Programmability − Nodes should be flexible and their tasks could change − The programmes should also be changeable during operation. − Maintainability − A WSN and its environment can change; − The system should be adaptable to the changes. − The operational parameters can change to choose different trade-offs (e.g. to provide lower quality when energy efficiency is more important)
  13. 13. 13 Required mechanisms − Multi-hop wireless communications − Communication over long distances can require intermediary nodes as relay (instead of using high transmission power for long range communications). − Energy-efficient operation − To support long lifetime − Energy efficient communication/dissemination of information − Energy efficient determination of a requested information − Auto-configuration − Self-xxx functionalities − Tolerating node failures − Integrating new nodes
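The multi-hop argument above can be illustrated with a minimal sketch. It assumes the common simplified path-loss model in which the radio energy needed to cover a hop grows as distance raised to an exponent `alpha`; the model, the value `alpha = 3.0` and the function names are assumptions for illustration and are not taken from the slides.

```python
def transmit_energy(distance_m: float, alpha: float = 3.0) -> float:
    """Relative radio energy for one hop, assuming energy ~ distance ** alpha."""
    return distance_m ** alpha


def multihop_energy(total_distance_m: float, hops: int, alpha: float = 3.0) -> float:
    """Relative energy when the path is split into `hops` equal-length hops."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return hops * transmit_energy(total_distance_m / hops, alpha)


direct = multihop_energy(100.0, hops=1)
relayed = multihop_energy(100.0, hops=4)
print(f"direct: {direct:.0f}  four hops: {relayed:.0f}  ratio: {direct / relayed:.1f}x")
```

In practice the saving is smaller than this model suggests, because every relay also spends energy on receiving and on its own electronics.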
  14. 14. 14 Required mechanisms (cont’d) − Collaboration and in-network processing − In some applications a single sensor node is not able to handle the given task or provide the requested information. − Instead of sending the information from various sources to an external network/node, the information can be processed in the network itself. − e.g. data aggregation, summarisation and then propagating the processed data with reduced size (hence improving energy efficiency by reducing the amount of data to be transmitted). − Data-centric − Conventional networks often focus on sending data between two specific nodes, each equipped with an address. − Here what is important is the data (the observations and measurements), not the node that provides it.
  15. 15. 15 Architectures (hardware and software) − Hardware components − Examples of nodes − Energy characteristics − Operating systems and run-time environments
  16. 16. 16 Sensor node- overview Power supply Communication device Controller Sensors/ Actuators Memory
  17. 17. 17 Example: Radiation Sensor Board (Libelium) Source: Wireless Sensor Networks to Control Radiation Levels, David Gascón, Marcos Yarza, Libelium, April 2011. Waspmote
  18. 18. 18 Sensors and sensor nodes − Active & Passive Sensors − Energy Efficiency − Processing capabilities − e.g. Intel StrongARM (RISC, 32bit, up to 206MHz) − e.g. Texas Instruments MSP430 (RISC core, 16bit, up to 4MHz) − Network communications − For actual communication both a transmitter and a receiver are required in a sensor node. − Devices that combine these two tasks (transmission and reception) are called transceivers. − Data rate: typically a few tens of kilobits per second
  19. 19. 19 Sensor devices are becoming widely available - Programmable devices - Off-the-shelf gadgets/tools
  20. 20. 20 Radio-frequency identification (RFID) − Active Tags and Passive Tags − Applications: supply chain, inventory tracking, tools collection, etc. − Limitations: − Technology − Reading range − Physical limitations − Interference − Security and Privacy
  21. 21. 21 Some of the existing solutions for sensor nodes
  22. 22. 22 Energy consumption of the nodes − Batteries have small capacity and recharging could be complex (if not impossible) in some cases. − The main consumers of energy are: the controller, the radio, to some extent the memory and, depending on the type, the sensor(s). − A controller can be in one of several states: “active”, “idle” and “sleep” − A radio modem could turn its transmitter, receiver, or both on or off, − Sensors and memory can also be turned on and off.
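A rough sketch of why the sleep state matters for lifetime: the average current is the duty-cycle-weighted mix of the active and sleep currents. All numbers below (20 mA active, 0.02 mA asleep, a 2000 mAh battery, a 1% duty cycle) are hypothetical values chosen for illustration, not measurements from the slides.

```python
def average_current_ma(active_ma: float, sleep_ma: float, duty_cycle: float) -> float:
    """Average current when the node is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma


def lifetime_days(battery_mah: float, average_ma: float) -> float:
    """Idealised battery lifetime (ignores self-discharge and voltage drop)."""
    return battery_mah / average_ma / 24.0


# Hypothetical node: 20 mA when active, 0.02 mA asleep, 2000 mAh battery.
always_on = lifetime_days(2000.0, average_current_ma(20.0, 0.02, duty_cycle=1.0))
duty_cycled = lifetime_days(2000.0, average_current_ma(20.0, 0.02, duty_cycle=0.01))
print(f"always on: {always_on:.1f} days, 1% duty cycle: {duty_cycled:.1f} days")
```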
  23. 23. 23 Power consumption of commercial sensor nodes Source: James M. Gilbert, Farooq Balouchi, "Comparison of Energy Harvesting Systems for Wireless Sensor Networks", International Journal of Automation and Computing, 2008
  24. 24. 24 Comparison of Energy sources Source: UC Berkeley, via C. Edward Chow, Wireless Sensor Network (WSN), University of Colorado
  25. 25. 25 Power consumption: Computation vs. Communication Source: ISI & DARPA PAC/C Program, via C. Edward Chow, Wireless Sensor Network (WSN), University of Colorado
  26. 26. 26 Relationship between computation and communication − Communication is considerably more expensive than computation. − However, the energy required for computation cannot be ignored; − The main principle is to invest in computation within the network wherever applicable (e.g. in-network processing, aggregation) and so reduce communication. − The load of computation should still be considered.
  27. 27. 27 Energy Management Issues − Actuation usually uses more energy − Strategy: using ultra-low-power nodes − Wake-up or command movement of mobile nodes − Communication energy is the next important issue − Strategy: energy-aware data communication − Adapting the instantaneous performance to meet the timing and error rate constraints, while minimizing energy/bit − Processors and sensors usually use less energy Source: C. Edward Chow, Wireless Sensor Network (WSN), University of Colorado.
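The trade-off can be made concrete with a small back-of-the-envelope sketch. The cost ratio used here (transmitting one bit costs as much as 1000 instructions) and the figure of 10 instructions per reading are assumptions for illustration only; real values depend on the radio and the processor.

```python
E_TX_PER_BIT = 1000.0      # assumed energy units per transmitted bit
E_PER_INSTRUCTION = 1.0    # assumed energy units per executed instruction


def send_raw(n_readings: int, bits_per_reading: int = 16) -> float:
    """Energy to transmit every reading unprocessed."""
    return n_readings * bits_per_reading * E_TX_PER_BIT


def aggregate_then_send(n_readings: int, bits_per_reading: int = 16,
                        instructions_per_reading: int = 10) -> float:
    """Energy to compute one summary value locally and transmit only that."""
    compute = n_readings * instructions_per_reading * E_PER_INSTRUCTION
    transmit = bits_per_reading * E_TX_PER_BIT
    return compute + transmit


print("raw:", send_raw(100), "aggregated:", aggregate_then_send(100))
```

The computation term is small under these assumptions but it is not zero, which is why the load of computation still has to be considered.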
  28. 28. 28 Operating Systems and Run-time environments − Embedded operating systems − Virtual machines − Abstracting the hardware specific issues from the users. − Need for energy-efficient execution − The code is more restricted (compared to conventional operating systems) so a full-blown OS is not obviously required. − An appropriate programming model − A clear way to structure a protocol stack − And support for energy management
  29. 29. 29 Embedded Operating Systems − OS running on devices with restricted functionality − In the case of sensor nodes, these devices typically also have limited processing capability − e.g. TinyOS − Restricted to narrow applications − industrial controllers, robots, networking gear, gaming consoles, metering, sensor nodes… − Architecture and purpose of embedded OS changes as the hardware capabilities change (e.g. mobile phones) Source: The Web of Things, Marko Grobelnik, Carolina Fortuna, Jožef Stefan Institute.
  30. 30. 30 TinyOS − “TinyOS is an open source, BSD-licensed operating system designed for low-power wireless devices, such as those used in sensor networks .” − TinyOS applications are developed using nesC − nesC is a dialect of the C language that is optimised for the memory limits of sensor networks.
  31. 31. 31 TinyOS - programming − “TinyOS is completely non-blocking: − it has one stack. − All I/O operations that last longer than a few hundred microseconds are asynchronous and have a callback. − To enable the native compiler to better optimize across call boundaries, TinyOS uses nesC's features to link these callbacks, called events, statically.” Source: TinyOS home page and TinyOS tutorial.
  32. 32. 32 Contiki − Contiki is the open source operating system for the Internet of Things. − It runs on networked embedded systems and wireless sensor networks. − It is designed for microcontrollers with small amounts of memory. A typical Contiki configuration is 2 kilobytes of RAM and 40 kilobytes of ROM. − Contiki provides IP communication, both for IPv4 and IPv6. − It has a fully tested IPv6 stack that, combined with power-efficient radio mechanisms such as ContikiMAC, allows battery-operated devices to participate in IPv6 networking - even routers can run on batteries. − Contiki supports 6LoWPAN header compression, IETF RPL IPv6 routing, and the IETF CoAP application layer protocol.
  33. 33. 33 Beyond conventional sensors − Human as a sensor (citizen sensors) − e.g. tweeting real world data and/or events − Virtual (software) sensors − e.g. Software agents/services generating/representing data [Figure labels: “Road block, A3”, “Suggest a different route”]
  34. 34. 34 Actuators [Figure: example actuators [1] [2] [3], including a stepper motor [1]. Image sources: [1] [2] [3]]
  35. 35. 35 Wireless Sensor Networks (WSN) Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
  36. 36. 36 Gateway Concept
  37. 37. Gateway/Middleware [Figure: layers: Connectivity/Device Association Layer; Data Exchange/Interoperability Layer; Service/Application Layer. Other labels: Bluetooth, WiFi, ZigBee, Cloud, HyperCat, REST API, Proprietary Cloud/Data Services]
  38. 38. 38 Gateway communications
  39. 39. 39 Designing a gateway/node association protocol Source: F. Ganz, P. Barnaghi, F. Carrez, and K. Moessner, Context-aware Management of Wireless Sensor Networks, In: The 5th Int. Conf. on COMmunication System softWAre and middlewaRE, (COMSWARE11),ACM, 2011.
  40. 40. 40 How to say what a sensor is and what it measures? More about this in the coming slides. [Figure labels: Sink node, Gateway]
  41. 41. 41 Distributed WSN- Gateways and directories
  42. 42. 42 What are the main issues? − Heterogeneity − Interoperability − Mobility − Energy efficiency − Scalability − Security
  43. 43. 43 Communication Protocols − Wired − USB, Ethernet − Wireless − Wifi, Bluetooth, ZigBee, IEEE 802.15.x − Single-hop or multi-hop − Sink nodes, cluster heads… − Point-to-Point or Point-to-Multi Point − (Energy) efficient routing
  44. 44. 44 Wireless Communications − Mostly performed in unlicensed bands according to open standards − Standard: IEEE 802.15.4 - Low Rate WPAN − 868/915 MHz bands with transfer rates of 20 and 40 kbit/s, 2450 MHz band with a rate of 250 kbit/s − Technology: ZigBee, WirelessHART − Standard: ISO/IEC 18000-7 (standard for active RFID) − 433 MHz unlicensed spectrum with transfer rates of 200 kbit/s Adapted from: The Web of Things, Marko Grobelnik, Carolina Fortuna, Jožef Stefan Institute.
  45. 45. 45 Wireless Communications - continued − Standard: IEEE 802.15.1 – High Rate WPAN − 2.40 GHz band with transfer rates of 1-24 Mbit/s − Technology: Bluetooth (BT 3.0 Low Energy Mode) − Standard: IEEE 802.11x – WLAN − 2.4, 3.6 and 5 GHz with transfer rates of 15-150 Mbit/s − Technology: Wi-Fi − Licensed bands − Standard: 3GPP – WMAN, WWAN cellular communication − 950 MHz, 1.8 and 2.1 GHz bands with data rates ranging from 20 kbit/s to 7.2 Mbit/s, depending on the release − Technology: GPRS, HSPA Adapted from: The Web of Things, Marko Grobelnik, Carolina Fortuna, Jožef Stefan Institute.
  46. 46. 46 Wireless Communications − Proprietary standards and protocols − Z-Wave – for home automation − 900 MHz band (partly overlaps with 900 MHz cellular) with data rates of 9.6 kbit/s or 40 kbit/s − ANT – for sportsmen and outdoor activity monitoring, owned by Garmin − 2.4 GHz and 1 Mbit/s data rates − Wavenis – for M2M periodic low data rate communication − 868 MHz, 915 MHz, 433 MHz with data rates from 4.8 kbit/s to 100 kbit/s − most Wavenis applications communicate at 19.2 kbit/s. − MiWi, SimpliciTI, Digixxx, … Adapted from: The Web of Things, Marko Grobelnik, Carolina Fortuna, Jožef Stefan Institute.
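To give a feel for these data rates, the sketch below estimates how long a single maximum-size IEEE 802.15.4 frame occupies the channel at each rate listed above. The 127-byte frame size is the maximum physical-layer packet mentioned later in the deck; the calculation ignores preamble, acknowledgements and channel-access delays, so it is a lower bound rather than a measured figure.

```python
# Data rates per band, in kbit/s, as listed on the slide.
RATES_KBITPS = {"868 MHz": 20, "915 MHz": 40, "2450 MHz": 250}


def airtime_ms(frame_bytes: int, rate_kbitps: float) -> float:
    """Time on air in milliseconds: bits divided by kbit/s gives ms."""
    return frame_bytes * 8 / rate_kbitps


for band, rate in RATES_KBITPS.items():
    print(f"{band}: {airtime_ms(127, rate):.3f} ms for a 127-byte frame")
```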
  47. 47. 47 IEEE 802.15.4 WPAN − IEEE standard for WPAN applications − MAC protocol − Single channel at any one time − Combines contention-based and schedule-based schemes − Asymmetric: nodes can assume different roles − It does not define the higher-level layers or interoperability sub-layers − ZigBee is built on this standard − TinyOS stack also uses some items of IEEE 802.15.4 hardware.
  48. 48. 48 ISO/IEC 18000-7 − RFID devices operating in the 433 MHz frequency band. − Provides an air interface implementation for wireless, non-contact information system equipment. − Parameters for active air interface communications at 433 MHz. − Typical applications operate at ranges greater than one meter.
  49. 49. ZigBee − It is intended to be a low-cost, low-power mesh network protocol. − ZigBee operates in the industrial, scientific and medical (ISM) radio bands; − ZigBee’s physical layer and media access control are defined based on the IEEE 802.15.4 standard. − ZigBee nodes can go from sleep to active mode in 30 ms or less, so latency can be low and devices can be responsive, in particular compared to Bluetooth devices, whose wake-up time can be longer (typically around three seconds). [source: Gary Legg, ZigBee: Wireless Technology for Low-Power Sensor Networks]
  50. 50. ZigBee [source: Gary Legg, ZigBee: Wireless Technology for Low-Power Sensor Networks,]
  51. 51. 51 Network protocols − The network (or OSI Layer 3 abstraction) provides an abstraction of the physical world. − Communication protocols − Most IP-based communications are based on IPv4 (and often run via gateway middleware solutions) − IP overhead makes it inefficient for embedded devices with low bit rate and constrained power. − However, IPv6 is increasingly being introduced for embedded devices − 6LoWPAN
  52. 52. IPv6 over Low power Wireless Personal Area Networks (6LoWPAN) − 6LoWPAN typically includes devices that work together to connect the physical environment to real-world applications, e.g., wireless sensors. − Small packet size − the maximum physical layer packet is 127 bytes − 81 octets (81 * 8 bits) for data packets. − Header compression − Fragmentation and reassembly − 6LoWPAN defines a header encoding to support fragmentation when IPv6 datagrams do not fit within a single frame and compresses IPv6 headers to reduce header overhead. − Support for both 16-bit short and IEEE 64-bit extended media access control addresses. − Low bandwidth − Data rates of 250 kbps, 40 kbps, and 20 kbps for each of the currently defined physical layers (2.4 GHz, 915 MHz, and 868 MHz, respectively). Source: Jonathan W. Hui and David E. Culler, IPv6 in Low-Power Wireless Networks, Proceedings of the IEEE (Volume: 98, Issue: 11).
  53. 53. 6LoWPAN − IPv6 requires the link to carry a payload of up to 1280 bytes. − Low-power radio links often do not support such a large payload: an IEEE 802.15.4 frame is at most 127 bytes, leaving only around 80 bytes of payload in the worst case (with extended addressing and full security information). − The IPv6 base header is relatively large at 40 bytes. Source: Jonathan W. Hui and David E. Culler, IPv6 in Low-Power Wireless Networks, Proceedings of the IEEE (Volume: 98, Issue: 11).
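The mismatch between a 1280-byte IPv6 datagram and an 81-octet frame payload can be quantified with a simplified fragment count. The sketch assumes the fragment header sizes of RFC 4944 (4 bytes for the first fragment, 5 bytes for later ones) and its rule that every fragment except the last carries a multiple of 8 bytes; it ignores header compression and the dispatch byte, so treat the result as an approximation. The function name is made up for this example.

```python
FRAG1_HEADER_BYTES = 4  # first-fragment header size (RFC 4944)
FRAGN_HEADER_BYTES = 5  # subsequent-fragment header size (RFC 4944)


def fragments_needed(datagram_bytes: int, frame_payload_bytes: int = 81) -> int:
    """Approximate number of link-layer frames for one uncompressed datagram."""
    if datagram_bytes <= frame_payload_bytes:
        return 1  # fits in a single frame, no fragmentation header needed
    # Every fragment except the last carries a multiple of 8 bytes.
    first = (frame_payload_bytes - FRAG1_HEADER_BYTES) // 8 * 8
    middle = (frame_payload_bytes - FRAGN_HEADER_BYTES) // 8 * 8
    last_max = frame_payload_bytes - FRAGN_HEADER_BYTES
    count, remaining = 1, datagram_bytes - first
    while remaining > last_max:
        remaining -= middle
        count += 1
    return count + 1  # the final fragment carries whatever is left


print(fragments_needed(1280), "frames for a 1280-byte datagram")
print(f"IPv6 base header share of an 81-byte payload: {40 / 81:.0%}")
```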
  54. 54. 54 Using gateway and middleware − It is unlikely that everything will be IP enabled and/or will run an IP protocol stack − Gateway and middleware solutions can interface between low-level sensor island protocols and IP-based networks. − The gateway can also provide other components such as QoS support, caching, and mechanisms to address heterogeneity and interoperability issues.
  55. 55. 55 Gateway and IP networks Gateway
  56. 56. 56 In-network processing − Mobile Ad-hoc Networks are supposed to deliver bits from one end to the other − WSNs, on the other hand, are expected to provide information, not necessarily original bits − Gives additional options − e.g., manipulate or process the data in the network − Main example: aggregation − Applying aggregation functions to obtain, for example, an average value of measurement data − Typical functions: minimum, maximum, average, sum, … − Not amenable functions: median Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
  57. 57. 57 In-network processing - signal processing − Depending on application, more sophisticated processing of data can take place within the network − Example edge detection: locally exchange raw data with neighboring nodes, compute edges, only communicate edge description to far away data sinks − Example tracking/angle detection of signal source: Conceive of sensor nodes as a distributed microphone array, use it to compute the angle of a single source, only communicate this angle, not all the raw data − Exploit temporal and spatial correlation − Observed signals might vary only slowly in time → no need to transmit all data at full rate all the time − Signals of neighboring nodes are often quite similar → only try to transmit differences (details a bit complicated, see later) Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
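A minimal sketch of why minimum, maximum, sum and average are amenable to in-network aggregation: each node only needs to forward a small fixed-size partial state that can be merged at every hop. The `Partial` class and its field names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Fixed-size partial aggregate that a node forwards towards the sink."""
    count: int
    total: float
    minimum: float
    maximum: float

    @staticmethod
    def from_reading(value: float) -> "Partial":
        return Partial(1, value, value, value)

    def merge(self, other: "Partial") -> "Partial":
        """Combine two partial aggregates; order and grouping do not matter."""
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def average(self) -> float:
        return self.total / self.count


# Two intermediate nodes each merge two readings, then the sink merges both.
left = Partial.from_reading(17.0).merge(Partial.from_reading(18.5))
right = Partial.from_reading(16.0).merge(Partial.from_reading(21.5))
root = left.merge(right)
print(root.count, root.minimum, root.maximum, root.average)
```

The median has no such fixed-size partial state: an exact median generally needs all the underlying readings, which is why it is listed as not amenable.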
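One simple way to exploit temporal correlation is a send-on-delta rule: a node transmits a sample only when it differs from the last transmitted value by more than a threshold. This is a stand-in technique chosen for illustration, not necessarily the scheme the source has in mind; the function name, sample values and threshold are assumptions.

```python
from typing import Iterable, Iterator, Tuple


def send_on_delta(samples: Iterable[float], threshold: float) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) only when the value moved more than `threshold`
    away from the last value that was actually transmitted."""
    last_sent = None
    for index, value in enumerate(samples):
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield index, value


readings = [20.0, 20.1, 20.1, 20.2, 21.0, 21.1, 19.5]
sent = list(send_on_delta(readings, threshold=0.5))
print(f"transmitted {len(sent)} of {len(readings)} samples: {sent}")
```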
  58. 58. 58 In-network processing- example Using Symbolic Aggregate Approximation (SAX) SAX Pattern (blue) with word length of 20 and a vocabulary of 10 symbols over the original sensor time-series data (green) Source: P. Barnaghi, F. Ganz, C. Henson, A. Sheth, "Computing Perception from Sensor Data", in Proc. of the IEEE Sensors 2012, Oct. 2012. fggfffhfffffgjhghfff jfhiggfffhfffffgjhgi fggfffhfffffgjhghfff
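A compact sketch of the SAX idea, assuming the standard formulation: z-normalise the series, average it over equal-length segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that split a standard normal distribution into equally probable regions. For simplicity the sketch requires the series length to be a multiple of the word length; this restriction and the function name are assumptions of the example.

```python
import bisect
import string
from statistics import NormalDist, mean, pstdev


def sax(series, word_length: int, alphabet_size: int) -> str:
    """Return the SAX word for `series` (simplified: equal-size segments only)."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    if len(series) == 0 or len(series) % word_length != 0:
        raise ValueError("series length must be a non-zero multiple of word_length")
    mu, sigma = mean(series), pstdev(series)
    # z-normalise; a constant series maps to all zeros
    normalised = [(x - mu) / sigma if sigma else 0.0 for x in series]
    segment = len(series) // word_length
    paa = [mean(normalised[i * segment:(i + 1) * segment]) for i in range(word_length)]
    # Breakpoints cut the standard normal into equally probable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    letters = string.ascii_lowercase[:alphabet_size]
    return "".join(letters[bisect.bisect_right(breakpoints, v)] for v in paa)


print(sax([1, 2, 3, 4, 5, 6, 7, 8], word_length=4, alphabet_size=4))
```

With a word length of 20 and a vocabulary of 10 symbols, as in the slide, each window of sensor data is reduced to 20 letters drawn from a to j.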
  59. 59. 59 Data-centric networking − In typical networks (including ad hoc networks), network transactions are addressed to the identities of specific nodes − A “node-centric” or “address-centric” networking paradigm − In redundantly deployed sensor networks, the specific source of an event, alarm, etc. might not be important − Redundancy: e.g., several nodes can observe the same area − Thus: focus networking transactions on the data directly instead of their senders and transmitters → data-centric networking − Principal design change Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
  60. 60. 60 Implementation options for data-centric networking − Overlay networks & distributed hash tables (DHT) − Hash table: content-addressable memory − Retrieve data from an unknown source, like in peer-to-peer networking – with efficient implementation − Some disparities remain − Static key in DHT, dynamic changes in WSN − DHTs typically ignore issues like hop count or distance between nodes when performing a lookup operation − Publish/subscribe − Different interaction paradigm − Nodes can publish data, can subscribe to any particular kind of data − Once data of a certain type has been published, it is delivered to all subscribers − Subscription and publication are decoupled in time; subscriber and publisher are agnostic of each other (decoupled in identity); Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
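The publish/subscribe interaction can be sketched in a few lines. The `Broker` class below is a hypothetical in-memory illustration; it shows decoupling in identity (publishers and subscribers only know the data type, here a topic string) but does not store data for later subscribers, so it does not demonstrate decoupling in time.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class Broker:
    """Routes published data to subscribers by data type (topic)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, data: Any) -> int:
        """Deliver `data` to every subscriber of `topic`; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(data)
        return len(callbacks)


broker = Broker()
received = []
broker.subscribe("temperature", received.append)
broker.publish("temperature", 17)   # delivered to one subscriber
broker.publish("humidity", 40)      # nobody subscribed, silently dropped
print(received)
```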
  61. 61. 61 Data-centric networking for Internet content
  62. 62. 62 Data-centric networking in WSN − Data in WSN is transient (or at least time dependent) − Spatial feature of data − Quality of Information − In large-scale deployments, there is a large number of small pieces of information (in contrast to conventional data-centric networks that mainly focus on multimedia data) − Data discovery (or resource discovery) is a challenge − Data annotation and description frameworks − e.g. Semantic Sensor Networks - to annotate sensor resources and observation and measurement data.
  63. 63. 63 Other issues/concepts in WSN − Naming and addressing − Time Synchronization − Localisation and positioning − Routing protocols − Mobility − Security/Privacy
  64. 64. 64 Some of the security issues − Identity management − Trade-off between security, usability, and privacy in resource constrained devices. − Encryption and Public Key and/ or Pre-Provisioned Symmetric Keys − Resource constraints and applicability of the existing solutions − Privacy issues − Hijacking
  65. 65. 65 Service interfaces to WSN − Supporting high-level request/response interactions − Asynchronous event notifications − Identifying and accessing data − By location, by observed entity, − By semantically meaningful representations – “Room 35BA01” − Accessibility of in-network processing functions − Defining complex events − Allow to specify Quality of Information requirements (e.g. accuracy & timeliness requirements) − Accessing node/network status information (e.g., battery level) − Security, management functionality, … − There are emerging solutions and standards in this area supported by Semantic Web technologies and Linked-data.
  66. 66. 66 Service interfaces and Web connectivity − WSN nodes are typically resource constrained − Memory and processing limitations − Communication load − Often non-IP or use 6LoWPAN − Using gateway and middleware is a clear solution − Or can the nodes directly connect to the Web and/or support service interfaces?
  67. 67. 67 Constrained Application Protocol (CoAP) − CoAP is a transfer protocol for constrained nodes and networks. − CoAP uses the Representational State Transfer (REST) architecture. − REST makes information available as resources that are identified by URIs. − Applications communicate by exchanging representations of these resources using a transfer protocol such as HTTP. − Clients access server-controlled resources using synchronous request/response mechanisms. − Such as GET, PUT, POST and DELETE. − CoAP uses UDP instead of TCP and has a simple “message layer” for re-transmitting lost packets. − It also uses compression techniques. C. Bormann, A. P. Castellani, Z. Shelby, "CoAP: An Application Protocol for Billions of Tiny Internet Nodes," IEEE Internet Computing, vol. 16, no. 2, pp. 62-67, Feb. 2012, doi:10.1109/MIC.2012.29
  68. 68. 68 Constrained Application Protocol (CoAP) - continued [Figure: example exchange] Client → Server: GET /temperature, Room A; Server → Client: 200 OK, text/plain, 17, Celsius
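To show how compact such a request is on the wire, the sketch below hand-encodes a confirmable CoAP GET following the message layout of RFC 7252: a 4-byte header (version, type, token length, code, message ID) followed by Uri-Path options. It covers only short options (delta and length below 13) and no token or payload; the function name and the message ID are arbitrary choices for this example.

```python
import struct

COAP_VERSION = 1
TYPE_CONFIRMABLE = 0
CODE_GET = 0x01          # request code 0.01
OPTION_URI_PATH = 11     # option number for one URI path segment


def encode_get(path: str, message_id: int) -> bytes:
    """Encode a confirmable GET with no token; short options only."""
    first_byte = (COAP_VERSION << 6) | (TYPE_CONFIRMABLE << 4) | 0  # token length 0
    message = struct.pack("!BBH", first_byte, CODE_GET, message_id)
    previous_option = 0
    for segment in path.split("/"):
        if not segment:
            continue  # skip empty segments such as the leading "/"
        value = segment.encode("utf-8")
        delta = OPTION_URI_PATH - previous_option
        if delta > 12 or len(value) > 12:
            raise ValueError("extended option encoding is not covered by this sketch")
        message += bytes([(delta << 4) | len(value)]) + value
        previous_option = OPTION_URI_PATH
    return message


request = encode_get("/temperature", message_id=0x1234)
print(len(request), "bytes:", request.hex())
```

The whole request fits in 16 bytes, which is far smaller than an equivalent HTTP request line with headers.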
  69. 69. CoAP protocol stack and interactions C. Bormann, A. P. Castellani, Z. Shelby, "CoAP: An Application Protocol for Billions of Tiny Internet Nodes," IEEE Internet Computing, vol. 16, no. 2, pp. 62-67, Feb. 2012, doi:10.1109/MIC.2012.29
  70. 70. 70 Connecting WSN nodes to Internet
  71. 71. 71 Machine-to-Machine Communications (M2M) − What is M2M? − Design of M2M services − Networking technologies and M2M − M2M data communication − M2M Applications − Making Sense of Data − Semantic technologies and Linked-data
  72. 72. 72 Machine-to-Machine − Machine-to-Machine (M2M) communications represent technological solutions and deployments allowing machines, devices or objects to communicate with each other, with no human interaction. [source: EU FP7 Exalted project] Source: ETSI − M2M system – Key features - Support of a huge number of devices - Seamless operability across multiple domains - Autonomous operation - Self organisation - Power efficiency - etc.
  73. 73. 73 M2M Device Domain − M2M Device − A device that runs application(s) using M2M capabilities and network domain functions. − An M2M Device is either connected straight to an Access Network or interfaced to M2M Gateways via an M2M Area Network. − M2M Area Network − An M2M Area Network provides connectivity between M2M Devices and M2M Gateways. − Examples of M2M Area Networks include: Personal Area Network technologies such as IEEE 802.15, SRD, UWB, Zigbee, Bluetooth, etc. or local networks such as PLC, M-BUS, Wireless M-BUS. − M2M Gateways − Equipment using M2M Capabilities to ensure M2M Devices interworking and interconnection to the Network and Application Domain. − The M2M Gateway may also run M2M applications. Source: KAIST KSE, Uichin Lee, M2M and Semantic Sensor Web
  74. 74. 74 M2M Network/Application Domain − Network Service Capabilities − Provide functions that are shared by different applications − Expose functionalities through a set of open interfaces − Use Core Network functionalities and simplify and optimize applications development and deployment whilst hiding network specificities to applications − Examples include: data storage and aggregation, unicast and multicast message delivery, etc. − M2M Applications (Server) − Applications that run the service logic and use service capabilities accessible via open interfaces. Source: KAIST KSE, Uichin Lee, M2M and Semantic Sensor Web
  75. 75. 75 More “Things” are being connected Home/daily-life devices Business and Public infrastructure Health-care …
  76. 76. 76 People Connecting to Things Motion sensor Motion sensor Motion sensor ECG sensor Internet
  77. 77. 77 Things Connecting to Things - Complex and heterogeneous resources and networks
  78. 78. 78 Network connected Things and Devices Image courtesy: CISCO
  79. 79. 79 Networking technologies − Billions of devices, subscribers, trillions of objects, − Seamless connection and integration − Cellular, WiMax, WiFi, Femto − ZigBee, IEEE 802.15.4 WPAN, …
  80. 80. 80 3GPP Long Term Evolution (LTE) − LTE radio access is called Evolved UMTS Terrestrial Radio Access Network (E-UTRAN) − It is expected to substantially improve end-user throughputs and bring significantly improved user experience with full mobility. − Support for IP-based traffic with end-to-end Quality of Service (QoS). − Voice traffic is supported mainly as Voice over IP (VoIP), enabling better integration with other multimedia services. − A new Packet Core, the Evolved Packet Core (EPC) network architecture, to support the E-UTRAN. Source: Long Term Evolution (LTE): A Technical Overview, Technical White Paper, Motorola.
  81. 81. 81 M2M Gateway Client Application M2M Application M2M Area Network M2M Architecture (ETSI) Service Capabilities M2M Core Source: ETSI, via KAIST KSE, Uichin Lee, M2M and Semantic Sensor Web Application domain Network domain M2M device domain
  82. 82. 82 M2M architecture [Figure labels: Remote clients; Communication networks; M2M area networks (Zigbee, Bluetooth, WiFi, …); Satellite (GPS); GPRS, 3G, 4G, …; LAN, xDSL, …; M2M Gateway]
  83. 83. 83 Wide-Area M2M Networks − Access network – connecting devices such as sensors and actuators: − Wired (Cable, xDSL, optical, etc.) − Wireless cellular (GSM, GPRS, EDGE, 3G, LTE, WiMAX, etc.) − Wireless capillary (Zigbee, Bluetooth, RFID, Wi-Fi, etc.) − Gateway/Middleware – connecting the access network to the core network − Network address translation − Local device management − Traffic aggregation − etc. − Core network – connecting to the Internet − IP enabled
  84. 84. 84 Enabling the Internet of Things - Diverse range of applications - Interacting with a large number of devices of various types - Multiple heterogeneous networks - Deluge of data
  85. 85. 85 Key characteristics of M2M − Inexpensive sensors equipped with a radio transceiver for various applications, typically low data rate ~ 10-250 kbps. − Deployed in large numbers (several thousands). − The sensors coordinate to perform the desired task. − The acquired information (periodic or event-based) is reported back to the information processing centre (sink, BS, etc.). − Solutions are application-dependent.
  86. 86. 86 M2M Requirements − A huge number of devices i.e. many active users (~10 times more than H2H users/devices) − Low data rate (small data transmission) − Larger delay tolerance (also depends on application) − Autonomous devices − Battery life for months and years (i.e. minimum energy at a given payload) − Low cost devices and operations − Low mobility − High security Challenge: − Different applications have different requirements
  87. 87. 87 Application requirements in M2M − Smart Grid − Lower power consumption, location tracking, reliability and long maintenance cycles − eHealth − Service reliability, mobility, lower power consumption, lower delays − Automotive − Mobility, location tracking − Smart cities − Reliability, fault tolerance, delay tolerance
  88. 88. 88 Key challenges for M2M − Scarce Resources − Battery, Computational Power, Available Memory, Bandwidth − Extended unattended lifetime − Replacing or recharging the batteries may be impossible − Fault tolerance − Scalability − Node Addressing − Spectrum allocation and interference management − PHY issues, Medium Access Control (MAC) issues, Routing issues, E2E Transport protocols − Security – Authentication, data integrity, robustness to attacks, etc. − Mobility − Topology Control − Data Fusion & Aggregation
  89. 89. 89 “Raw data is both an oxymoron and bad data” Geoff Bowker, 2005 Source: Kate Crawford, "Algorithmic Illusions: Hidden Biases of Big Data", Strata 2013.
  90. 90. 90 From data to actionable information Data Information Knowledge Wisdom? Raw sensory data Structured data (with semantics) Abstractions and perceptions Actionable information
  91. 91. Heterogeneity, multi-modality and volume are among the key issues. We need interoperable and machine-interpretable solutions… 91
  92. 92. Semantics and Data −Data with semantic annotations −Provenance, quality of information −Interpretable formats −Links and interconnections −Background knowledge, domain information −Hypotheses, expert knowledge −Adaptable and context-aware solutions 92
  93. 93. Wireless Sensor (and Actuator) Networks [Figure: sensor nodes, sink node, gateway, core network (e.g. Internet), gateway, end-user and computer services] - The networks typically run low-power devices - Nodes consist of one or more sensors, which could be of different types (or actuators) - Questions annotated on the figure: Operating systems? Services? Protocols? In-node data processing; data aggregation/fusion; inference/processing of IoT data; interoperable/machine-interpretable representations; “Web of Things”
  94. 94. 94 Observation and measurement data- annotation Tags Data formats Location Source:
  95. 95. Observation and measurement data 15, C, 08:15, 51.243057, -0.589444 (value, unit of measurement, time, longitude, latitude) How can we make the data representations more machine-readable and machine-interpretable?
  96. 96. Observation and measurement data 15, C, 08:15, 51.243057, -0.589444 <value> <unit> <Time> <Longitude> <Latitude> What about this? <value>15</value> <unit>C</unit> <time>08:15</time> <longitude>51.243057</longitude> <latitude>-0.589444</latitude>
  97. 97. Extensible Markup Language (XML) − XML is a simple, flexible text format that is used for data representation and annotation. − XML was originally designed for large-scale electronic publishing. − XML plays a key role in the exchange of a wide variety of data on the Web and elsewhere. − It is one of the most widely-used formats for sharing structured information. 97
  98. 98. XML Document Example <?xml version="1.0"?> <measurement> <value>15</value> <unit>C</unit> <time>08:15</time> <longitude>51.243057</longitude> <latitude>-0.589444</latitude> </measurement> Annotations: XML prolog (the XML declaration); root element; XML elements. XML documents MUST be “well formed”.
  99. 99. XML Document Example- with attributes <?xml version="1.0" encoding="ISO-8859-1"?> <measurement> <value type="Decimal">15</value> <unit>C</unit> <time>08:15</time> <longitude>51.243057</longitude> <latitude>-0.589444</latitude> </measurement>
  100. 100. Well Formed XML Documents − A "Well Formed" XML document has correct XML syntax. − XML documents must have a root element − XML elements must have a closing tag − XML tags are case sensitive − XML elements must be properly nested − XML attribute values must be quoted Source: W3Schools.
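The well-formedness rules above can be checked mechanically by any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper names `is_well_formed` and `read_measurement` are illustrative assumptions, and the check says nothing about validity against a schema.

```python
import xml.etree.ElementTree as ET

# The measurement document from the slides.
WELL_FORMED = """<?xml version="1.0"?>
<measurement>
  <value>15</value>
  <unit>C</unit>
  <time>08:15</time>
  <longitude>51.243057</longitude>
  <latitude>-0.58944</latitude>
</measurement>"""

# Same idea, but <unit> is never closed, so the elements are not properly nested.
NOT_WELL_FORMED = "<measurement><value>15</value><unit>C</measurement>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def read_measurement(text: str) -> dict:
    """Flatten the children of the root element into a tag -> text mapping."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(NOT_WELL_FORMED))
    print(read_measurement(WELL_FORMED))
```

Note that the parser happily returns `"15"` and `"C"` as plain strings: it knows the structure, not what the tags mean.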
  101. 101. Validating XML Documents −A "Valid" XML document is a "Well Formed" XML document, which conforms to the structure of the document defined in an XML Schema. −XML Schema defines the structure and a list of defined elements for an XML document. 101
  102. 102. XML Schema- example <xs:element name="measurement"> <xs:complexType> <xs:sequence> <xs:element name="value" type="xs:decimal"/> <xs:element name="unit" type="xs:string"/> <xs:element name="time" type="xs:time"/> <xs:element name="longitude" type="xs:double"/> <xs:element name="latitude" type="xs:double"/> </xs:sequence> </xs:complexType> </xs:element> - XML Schema defines the structure and elements - An XML document then becomes an instantiation of the document defined by the schema.
  103. 103. XML Documents– revisiting the example <?xml version="1.0"?> <measurement> <value>15</value> <unit>C</unit> <time>08:15</time> <longitude>51.243057</longitude> <latitude>-0.589444</latitude> </measurement> “But how about this?” <?xml version="1.0"?> <sensor_data> <reading>15</reading> <u>C</u> <timestamp>08:15</timestamp> <long>51.243057</long> <lat>-0.589444</lat> </sensor_data>
  104. 104. 104 XML − Meaning of XML-Documents is intuitively clear − due to "semantic" Mark-Up − tags are domain-terms − But, computers do not have intuition − tag-names do not provide semantics for machines. − DTDs or XML Schema specify the structure of documents, not the meaning of the document contents − XML lacks a semantic model − has only a "surface model”, i.e. tree Source: Semantic Web, John Davies, BT, 2003.
  105. 105. XML: limitations for semantic markup − XML representation makes no commitment on: − Domain specific ontological vocabulary − Which words shall we use to describe a given set of concepts? − Ontological modelling primitives − How can we combine these concepts, e.g. “car is a-kind-of (subclass-of) vehicle”? → This requires pre-arranged agreement on vocabulary and primitives → Only feasible for closed collaboration − agents in a small & stable community − pages on a small & stable intranet … not for sharable Web resources Source: Semantic Web, John Davies, BT, 2003.
  106. 106. Semantic Web technologies − XML provides a metadata format. − It defines the elements but provides no modelling primitives and does not describe the meaningful relations between different elements. − Semantic technologies can be used to address these issues.
  107. 107. A bit of history − “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation.“ (Tim Berners-Lee et al, 2001) 107 Image source: Miller 2004
  108. 108. Semantics & the IoT − The Semantic Sensor (&Actuator) Web is an extension of the current Web/Internet in which information is given well-defined meaning, better enabling objects, devices and people to work in co-operation and to also enable autonomous interactions between devices and/or objects. 108
  109. 109. Resource Description Framework (RDF) − A W3C standard − Relationships between documents − Consisting of triples or sentences: − <subject, property, object> − <“Sensor”, hasType, “Temperature”> − <“Node01”, hasLocation, “Room_BA_01”> − RDFS extends RDF with standard “ontology vocabulary”: − Class, Property − Type, subClassOf − domain, range
  110. 110. RDF for semantic annotation − RDF provides metadata about resources − Object -> Attribute -> Value triples or − Subject -> Property -> Object − It can be represented in XML − The RDF triples form a graph
  111. 111. RDF Graph [Figure: a Measurement node with five properties: hasValue → xsd:decimal, hasUnit → xsd:string, hasTime → xsd:time, hasLongitude → xsd:double, hasLatitude → xsd:double]
  112. 112. RDF Graph- an instance [Figure: Measurement #0001 with hasValue → 15, hasUnit → C, hasTime → 08:15, hasLongitude → 51.243057, hasLatitude → -0.589444]
  113. 113. RDF/XML <rdf:RDF> <rdf:Description rdf:about="Measurement#0001"> <hasValue>15</hasValue> <hasUnit>C</hasUnit> <hasTime>08:15</hasTime> <hasLongitude>51.243057</hasLongitude> <hasLatitude>-0.589444</hasLatitude> </rdf:Description> </rdf:RDF>
  114. 114. Let’s add a bit more structure (complexity?) [Figure: Measurement with hasValue → xsd:decimal, hasUnit → xsd:string, hasTime → xsd:time, hasLocation → Location; Location with hasLongitude → xsd:double, hasLatitude → xsd:double]
  115. 115. An instance of our model [Figure: Measurement #0001 with hasValue → 15, hasUnit → C, hasTime → 08:15, hasLocation → Location #0126; Location #0126 with hasLongitude → 51.243057, hasLatitude → -0.589444]
  116. 116. RDF: Basic Ideas − Resources − Every resource has a URI (Uniform Resource Identifier) − A URI can be a URL (a web address) or some other kind of identifier − An identifier does not necessarily enable access to a resource − We can think of a resource as an object that we want to describe − Car − Person − Places, etc.
  117. 117. RDF: Basic Ideas − Properties − Properties are a special kind of resource; − Properties describe relations between resources. − For example: “hasLocation”, “hasType”, “hasID”, “startTime”, “deviceID”. − Properties in RDF are also identified by URIs. − This provides a global, unique naming scheme. − For example: − “hasLocation” can be defined as: − URI: − SPARQL is a query language for RDF data. − SPARQL provides capabilities to query RDF graph patterns along with their conjunctions and disjunctions.
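To make the idea of triples and graph-pattern queries concrete, here is a minimal sketch in plain Python. It is not SPARQL and not an RDF library: triples are simple tuples, names starting with "?" are treated as variables, and a query is a conjunction of patterns. The functions `match` and `query` and the resource `Measurement#0002` are hypothetical, introduced only for illustration.

```python
# Triples are (subject, property, object) tuples, following the measurement model
# in the slides. Measurement#0002 is a made-up second resource without a location.
TRIPLES = [
    ("Measurement#0001", "hasValue", "15"),
    ("Measurement#0001", "hasUnit", "C"),
    ("Measurement#0001", "hasTime", "08:15"),
    ("Measurement#0001", "hasLocation", "Location#0126"),
    ("Measurement#0002", "hasValue", "17"),
    ("Measurement#0002", "hasUnit", "C"),
]


def match(pattern, triple, binding):
    """Unify one pattern with one triple; return the extended binding or None."""
    candidate = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if candidate.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return candidate


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


if __name__ == "__main__":
    # "Find measurements that have a value, a unit and a location."
    located = query(
        [("?m", "hasValue", "?v"), ("?m", "hasUnit", "?u"), ("?m", "hasLocation", "?loc")],
        TRIPLES,
    )
    print(located)
```

Because the variable `?m` is shared across the three patterns, only resources that satisfy all of them are returned, which is the essence of matching a graph pattern.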
  118. 118. Ontologies − The term ontology originates from philosophy. In that context it is used as the name of a subfield of philosophy, namely, the study of the nature of existence. − In the Semantic Web: − An ontology is a formal specification of a domain: the concepts in a domain and the relationships between the concepts (and some logical restrictions).
  119. 119. Ontologies and Semantic Web − In general, an ontology describes a set of concepts in a domain. − An ontology consists of a finite list of terms and the relationships between the terms. − The terms denote important concepts (classes of objects) of the domain. − For example, in a university setting, staff members, students, courses, modules, lecture theatres, and schools are some important concepts. 119
  120. 120. Web Ontology Language (OWL) − RDF(S) is useful to describe the concepts and their relationships, but does not solve all possible requirements − Complex applications may want more possibilities: − similarity and/or differences of terms (properties or classes) − construct classes, not just name them − can a program reason about some terms? e.g.: − each «Sensor» resource «A» has at least one «hasLocation» − each «Sensor» resource «A» has maximum one ID − This led to the development of the Web Ontology Language (OWL).
  121. 121. OWL − OWL provides more concepts to express meaning and semantics than XML and RDF(S) − OWL provides more constructs for stating logical expressions such as: Equality, Property Characteristics, Property Restrictions, Restricted Cardinality, Class Intersection, Annotation Properties, Versioning, etc. Source:
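The two example restrictions above (at least one «hasLocation», at most one ID per «Sensor») can be illustrated with a small check over triples. This is only a sketch under strong assumptions: it is a closed-world validation written in plain Python, which is not how an OWL reasoner works (OWL reasoning is open-world and would not treat a missing location as an error). The property names `type` and `hasID`, the node `Node02` and the function `check_sensor_restrictions` are illustrative.

```python
from collections import defaultdict

# Node01 satisfies both restrictions; the hypothetical Node02 violates both.
TRIPLES = [
    ("Node01", "type", "Sensor"),
    ("Node01", "hasLocation", "Room_BA_01"),
    ("Node01", "hasID", "A1"),
    ("Node02", "type", "Sensor"),
    ("Node02", "hasID", "B7"),
    ("Node02", "hasID", "B8"),
]


def check_sensor_restrictions(triples):
    """Return (resource, problem) pairs for Sensors breaking the two cardinality rules."""
    counts = defaultdict(lambda: defaultdict(int))
    sensors = set()
    for subject, prop, obj in triples:
        if prop == "type" and obj == "Sensor":
            sensors.add(subject)
        counts[subject][prop] += 1

    problems = []
    for sensor in sorted(sensors):
        if counts[sensor]["hasLocation"] < 1:
            problems.append((sensor, "needs at least one hasLocation"))
        if counts[sensor]["hasID"] > 1:
            problems.append((sensor, "has more than one hasID"))
    return problems


if __name__ == "__main__":
    for resource, problem in check_sensor_restrictions(TRIPLES):
        print(resource, problem)
```

In OWL the same intent would be stated declaratively as cardinality restrictions on the class, and a reasoner would derive consequences from them rather than run a hand-written loop.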
  122. 122. Ontology engineering − An ontology: classes and properties (also referred to as schema ontology) − Knowledge base: a set of individual instances of classes and their relationships − Steps for developing an ontology: − defining classes in the ontology and arranging the classes in a taxonomic (subclass–superclass) hierarchy − defining properties and describing allowed values and restriction for these properties − Adding instances and individuals
  123. 123. Basic rules for designing ontologies − There is no one correct way to model a domain; there are always possible alternatives. − The best solution almost always depends on the application that you have in mind and the required scope and details. − Ontology development is an iterative process. − The ontologies provide a sharable and extensible form to represent a domain model. − Concepts that you choose in an ontology should be close to physical or logical objects and relationships in your domain of interest (using meaningful nouns and verbs).
  124. 124. A simple methodology 1. Determine the domain and scope of the model for which you want to design your ontology. 2. Consider reusing existing concepts/ontologies; this will help to increase the interoperability of your ontology. 3. Enumerate important terms in the ontology; this will determine the key concepts that need to be defined in the ontology. 4. Define the classes and the class hierarchy; decide on the classes and the parent/child relationships. 5. Define the properties of classes; define the properties that relate the classes. 6. Define features of the properties, if you are going to add restrictions or other OWL-type logical expressions. 7. Define/add instances.
  125. 125. Semantic technologies in the IoT −Applying semantic technologies to IoT can support: −Interoperability −effective data access and integration −resource discovery −reasoning and processing of data −knowledge extraction (for automated decision making and management) 125
  126. 126. 126 Data/Service description frameworks −There are standards such as Sensor Web Enablement (SWE) set developed by the Open Geospatial Consortium that are widely being adopted in industry, government and academia. −While such frameworks provide some interoperability, semantic technologies are increasingly seen as key enabler for integration of IoT data and broader Web information systems.
  127. 127. Revisiting goals of the Internet of Things − A primary goal of interconnecting devices and collecting/processing data from them is to create situation awareness and enable applications, machines, and human users to better understand their surrounding environments. − The understanding of a situation, or context, potentially enables services and applications to make intelligent decisions and to respond to the dynamics of their environments. 127
  128. 128. 128 Semantic technologies − The sensors (and in general “Things”) are increasingly being connected with Web infrastructure. − This can be supported by embedded devices that directly support IP and web-based connection (e.g. 6LoWPAN and CoAP) or devices that are connected via gateway components. − Broadening the IoT to the concept of “Web of Things” − There are already Sensor Web Enablement (SWE) standards developed by the Open Geospatial Consortium that are widely being adopted in industry, government and academia. − While such frameworks provide some interoperability, semantic technologies are increasingly seen as a key enabler for integration of IoT data and broader Web information systems.
  129. 129. 129 129 Observation and measurement data Source: W3C Semantic Sensor Networks, SSN Ontology presentation, Laurent Lefort et al.
  130. 130. 130 Sensor Markup Language (SensorML) Source:
  131. 131. 131 Semantics and sensor data Source: W. Wang, P. Barnaghi, "Semantic Annotation and Reasoning for Sensor Data", In proceedings of the 4th European Conference on Smart Sensing and Context (EuroSSC2009), 2009.
  132. 132. 132 Observation and measurement data- a semantic model P. Barnaghi, S. Meissner, M. Presser, K. Moessner, "Sense and Sens’ability: Semantic Data Modelling for Sensor Networks", In Proc of the ICT Mobile Summit 2009, June 2009.
  133. 133. Semantic modelling − Lightweight: experiences show that a lightweight ontology model that well balances expressiveness and inference complexity is more likely to be widely adopted and reused; also large number of IoT resources and huge amount of data need efficient processing − Compatibility: an ontology needs to be consistent with those well designed, existing ontologies to ensure compatibility wherever possible. − Modularity: modular approach to facilitate ontology evolution, extension and integration with external ontologies. 133
  134. 134. Existing models- SSN Ontology − W3C Semantic Sensor Network Incubator Group’s SSN ontology (mainly for sensors and sensor networks, platforms and systems).
  135. 135. Stimulus-Sensor-Observation - The SSO Ontology Design Pattern developed following the principle of minimal ontological commitments to make it reusable for a variety of application areas. -Introduces a minimal set of classes and relations centered around the notions of stimuli, sensor, and observations. -Defines stimuli as the (only) link to the physical environment. 135
  136. 136. SSN Ontology Modules 136
  137. 137. 137 Basic Structure
  138. 138. 138 SSN Ontology Ontology Link: M. Compton et al, "The SSN Ontology of the W3C Semantic Sensor Network Incubator Group", Journal of Web Semantics, 2012.
  139. 139. 139 139 W3C SSN Ontology makes observations of this type Where it is What it measures units SSN-XG ontologies SSN-XG annotations SSN-XG Ontology Scope
  140. 140. What SSN does not model − Sensor types and models − Networks: communication, topology − Representation of data and units of measurement − Location, mobility or other dynamic behaviours − Control and actuation − …. 140
  141. 141. Web of Things − Integrating real-world data into the Web and providing Web-based interactions with the IoT resources is also often discussed under the umbrella term of “Web of Things” (WoT). − WoT data is not only large in scale and volume, but also continuous, with rich spatiotemporal dependency.
  142. 142. 142 Example: Linked IoT Data Internal location ontology (local) Linked-data location (external)
  143. 143. 143 IoT and Semantics: Challenges and issues
  144. 144. Several ontologies and description models 144
  145. 145. 145 We have good models and description frameworks; The problem is that having good models and developing ontologies is not enough.
  146. 146. 146 Semantic descriptions are intermediary solutions, not the end product. They should be transparent to the end-user and probably to the data producer as well.
  147. 147. A WoT/IoT Framework WSN WSN WSN WSN WSN Network-enabled Devices Semantically annotate data 147 Gateway CoAP HTTP CoAP CoAP HTTP 6LowPAN Semantically annotate data http://mynet1/snodeA23/readTemp? WSN MQTT MQTT Gateway And several other protocols and solutions…
  148. 148. Publishing Semantic annotations − We need a model (ontology) – this is often the easy part for a single application. − Interoperability between the models is a big issue. − Expressiveness vs. complexity is a challenge − How and where to add the semantics − Where to publish and store them − Semantic descriptions for data, streams, devices (resources) and entities that are represented by the devices, and description of the services.
  149. 149. 149 Simplicity can be very useful…
  150. 150. Hyper/CAT 150 Source: Toby Jaffey, HyperCat Consortium, - Servers provide catalogues of resources to clients. - A catalogue is an array of URIs. - Each resource in the catalogue is annotated with metadata (RDF-like triples).
  151. 151. Hyper/CAT model 151 Source: Toby Jaffey, HyperCat Consortium,
  152. 152. 152 Complex models are (sometimes) good for publishing research papers…. But they are often difficult to implement and use in real world products.
  153. 153. What happens afterwards is more important − How to index and query the annotated data − How to make the publication suitable for constrained environments and/or allow them to scale − How to query them (considering the fact that here we are dealing with live data and often reducing the processing time and latency is crucial) − Linking to other sources 153
  154. 154. 154 Data Challenges − Discovery: finding appropriate device and data sources − Access: availability and (open) access to M2M resources and data − Search: querying for data − Integration: dealing with heterogeneous devices, networks and data − Interpretation: translating data to knowledge usable by people and applications − Scalability: dealing with a large number of devices, a myriad of data and the computational complexity of interpreting the data.
  155. 155. 155 Myth and reality − #1: If we create an ontology, our data is interoperable − Reality: there are/could be a number of ontologies for a domain − Ontology mapping − Reference ontologies − Standardisation efforts − #2: Semantic data will make my data machine-understandable and my system will be intelligent. − Reality: it is still metadata; machines don’t understand it but can interpret it. It still needs intelligent processing and reasoning mechanisms to process and interpret the data. − #3: It’s hype! Ontologies and semantic data are too much overhead; we deal with tiny devices in IoT. − Reality: Ontologies are a way to share and agree on a common vocabulary and knowledge; at the same time they are machine-interpretable and represented in interoperable and re-usable forms; − You don’t necessarily need to add semantic metadata at the source; it could be added to the data at a later stage (e.g. in a gateway); − Legacy applications can ignore it or be extended to work with it.
  156. 156. 156 Semantics and Linked-data − The principles in designing linked data are defined as: − using URIs as names for things; − using HTTP URIs to enable people to look up those names; − providing useful RDF information related to URIs that are looked up by machines or people; − including RDF statements that link to other URIs to enable discovery of other related concepts of the Web of Data.
  157. 157. 157 157 Linked-data Linked data is data presented in a better way and in relation to other resources…
  158. 158. 158 Linked Sensor data
  159. 159. 159 Linked Open Data Collectively, the 203 data sets consist of over 25 billion RDF triples, which are interlinked by around 395 million RDF links (September 2010).
  160. 160. 160 Creating and using Linked Sensor Data
  161. 161. 161 Linked sensor data
  162. 162. 162 Smart-Campus Infrastructure User devices Core Network by FI platform USB Ethernet Ethernet 802.15.4 Smart Campus Service platform IoT devices GW devices Bluetooth Bluetooth Wifi, Ethernet Display infrastructure Wifi, Ethernet GW devices Indoor IoT deployments • Smart Transportation • Smart Waste Management • Environmental Monitoring Outdoor IoT deployments • Intelligent offices spaces Sustainable Campus using Internet of Things Technologies Source: A. Gluhak et al, CCSR, University of Surrey, 2012.
  163. 163. 163 IoT Data Access − Publish/Subscribe (long-term/short-term) − Ad-hoc query − The typical types of data request for sensory data: − Query based on − ID (resource/service) – for known resources − Location − Type − Time – requests for fresh data or historical data; − One of the above + a range [+ Unit of Measurement] − Type/Location/Time + a combination of Quality of Information attributes − An entity of interest (a feature of an entity of interest) − Complex data types (e.g. pollution data could be a combination of different types)
  164. 164. Comparing IoT data streams with conventional multimedia streams Source: P. Barnaghi, W. Wang, L. Dong, C. Wang, "A Linked-data Model for Semantic Sensor Streams", in the Proc. of IEEE International Conference on Internet of Things (iThings 2013), August 2013.
  165. 165. 165 Describing IoT Data: An example [Table: Time = UTC; Location = #GeoHash; Type = #Hash; Value = [DataType, Value]; Link to QoI metadata = URI] Ontology for common types
  166. 166. 166 Observation and Measurement Value GeoHash UTC time UTC time (in Java): the time indicated is represented as the distance, measured in milliseconds, of that time from the epoch (00:00:00 GMT on January 1, 1970). Standard XSD data type
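The slide describes the Java convention of representing time as milliseconds since the epoch. A minimal Python sketch of the same representation follows; the helper names `to_epoch_millis` and `from_epoch_millis` are illustrative, and integer `timedelta` arithmetic is used to avoid floating-point rounding.

```python
from datetime import datetime, timedelta, timezone

# The epoch: 00:00:00 GMT on January 1, 1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Milliseconds between the epoch and a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis: int) -> datetime:
    """Inverse of to_epoch_millis, returning a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


if __name__ == "__main__":
    reading_time = datetime(2015, 3, 1, 8, 15, tzinfo=timezone.utc)  # made-up timestamp
    millis = to_epoch_millis(reading_time)
    print(millis, from_epoch_millis(millis).isoformat())
```

Storing the timestamp as a single integer keeps the observation record compact and makes range queries over time a simple numeric comparison.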
  167. 167. 167 GeoHashing − For example, Guildford (lat: 51.235401, long: -0.574600) can be hashed as: gcpe6zjeffgp − It can be used as: − A unique identifier − A way to represent point data as a hash string − It uses Base 32 encoding and bit interleaving − It’s used for geo-tagging (and is a symmetric technique) − Places close to each other will have similar prefixes (string similarity) − Limitations: − Nearby locations can have Geohash codes with no common prefix − Edge case (locations close to each other but on opposite sides of the Equator) − A meridian point (line of longitude)
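The bit interleaving and Base 32 encoding mentioned above can be sketched in a few lines. This is a minimal illustration of the commonly described Geohash scheme (longitude and latitude ranges are halved alternately, starting with longitude, and every five bits become one character); the function name `geohash_encode` is an assumption, not part of the slides.

```python
# Geohash Base 32 alphabet (digits plus lowercase letters without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, length: int = 12) -> str:
    """Encode a latitude/longitude pair as a Geohash string of `length` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lon = True    # interleaving starts with a longitude bit
    while len(chars) < length:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(51.235401, -0.574600))      # Guildford, as on the slide
    print(geohash_encode(51.243057, -0.589444, 6))   # a nearby point
```

Each extra character narrows the cell, which is why nearby points usually share a prefix, and why two points either side of a cell boundary (such as the Equator or a meridian) may share none.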
  168. 168. 168 GeoHash Example Sample locations on a Google Map and their equivalent geohash strings; - close locations have similar prefixes
  169. 169. IoT Data Processing WSN WSN WSN WSN WSN Network-enabled Devices Network-enabled Devices Network services/storage and processing units Data/service access at application level Data collections and processing within the networks Data Discovery Service/ Resource Discovery
  170. 170. Data Aggregation − Computing a smaller representation of a number of data items (or messages) that is extracted from all the individual data items. − For example, computing the min/max or mean of sensor data. − More advanced aggregation solutions could use approximation techniques to transform high-dimensionality data to lower-dimensionality abstractions/representations. − The aggregated data can be smaller in size and can represent patterns/abstractions; so in multi-hop networks, nodes can receive data from other nodes and aggregate them before forwarding them to a sink or gateway. − Or the aggregation can happen on a sink/gateway node.
  171. 171. Aggregation example − Reduce the number of transmitted bits/packets by applying an aggregation function in the network [Figure: multi-hop network with link labels 1 1 3 1 1 6 1 1 1 1 1 1] Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 3, Wiley, 2005.
  172. 172. Efficacy of an aggregation mechanism − Accuracy: difference between the resulting value or representation and the original data − Some solutions can be lossless or lossy depending on the applied techniques. − Completeness: the percentage of all the data items that are included in the computation of the aggregated data. − Latency: delay time to compute and report the aggregated data − Computation footprint; complexity; − Overhead: the main advantage of the aggregation is reducing the size of the data representation; − Aggregation functions can trade off between accuracy, latency and overhead; − Aggregation should happen close to the source.
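As a minimal sketch of in-network aggregation, the following shows how an intermediate node could merge partial aggregates from its children with its own reading, so that one small summary is forwarded instead of every raw value. The `Aggregate` class and `aggregate_at_node` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Aggregate:
    """Partial aggregate that is enough to derive min, max and mean later."""
    count: int
    total: float
    minimum: float
    maximum: float

    @classmethod
    def of(cls, value: float) -> "Aggregate":
        """Aggregate representing a single reading."""
        return cls(1, value, value, value)

    def merge(self, other: "Aggregate") -> "Aggregate":
        """Combine two partial aggregates without access to the raw readings."""
        return Aggregate(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


def aggregate_at_node(own_reading: float, child_aggregates: Iterable[Aggregate]) -> Aggregate:
    """What a forwarding node sends upstream: its reading merged with its children's."""
    result = Aggregate.of(own_reading)
    for child in child_aggregates:
        result = result.merge(child)
    return result


if __name__ == "__main__":
    leaves = [Aggregate.of(19.5), Aggregate.of(22.5)]   # made-up temperature readings
    summary = aggregate_at_node(21.0, leaves)
    print(summary.count, summary.minimum, summary.maximum, summary.mean)
```

Carrying `count` and `total` rather than a running mean is what lets the mean be merged correctly across several hops.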
  173. 173. Publish/Subscribe − Achieved by publish/subscribe paradigm − Idea: Entities can publish data under certain names − Entities can subscribe to updates of such named data − Conceptually: Implemented by a software bus − Software bus stores subscriptions, published data; names used as filters; subscribers notified when values of named data change [Figure: Publisher 1 and Publisher 2 connected through a software bus to Subscriber 1, Subscriber 2 and Subscriber 3] − Variations − Topic-based P/S – inflexible − Content-based P/S – use general predicates over named data Source: Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapter 12, Wiley, 2005.
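The software-bus idea can be sketched as a small in-process, topic-based dispatcher. This is only an illustration of the concept described above (names used as filters, subscribers notified when a named value changes); the class `SoftwareBus` and the topic names are assumptions, and a real deployment would use a networked broker.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Callback = Callable[[str, Any], None]


class SoftwareBus:
    """Toy in-process bus that stores subscriptions and the latest published values."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[Callback]] = defaultdict(list)
        self._values: Dict[str, Any] = {}

    def subscribe(self, name: str, callback: Callback) -> None:
        """Register interest in updates of the data published under `name`."""
        self._subscriptions[name].append(callback)

    def publish(self, name: str, value: Any) -> int:
        """Store `value` under `name` and notify subscribers only if it changed.

        Returns the number of subscribers that were notified.
        """
        if name in self._values and self._values[name] == value:
            return 0
        self._values[name] = value
        callbacks = self._subscriptions.get(name, [])
        for callback in callbacks:
            callback(name, value)
        return len(callbacks)


if __name__ == "__main__":
    bus = SoftwareBus()
    bus.subscribe("room1/temperature", lambda name, value: print(name, value))
    bus.publish("room1/temperature", 21.5)   # subscriber is notified
    bus.publish("room1/temperature", 21.5)   # unchanged value, nobody is notified
```

Publishers and subscribers never reference each other, only the name, which is the decoupling the paradigm is after. A content-based variant would replace the exact name match with a predicate over the published data.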
  174. 174. MQTT Pub/Sub Protocol − MQ Telemetry Transport (MQTT) is a lightweight broker-based publish/subscribe messaging protocol. − MQTT is designed to be open, simple, lightweight and easy to implement. − These characteristics make MQTT ideal for use in constrained environments, for example in IoT: − Where the network is expensive, has low bandwidth or is unreliable − When run on an embedded device with limited processor or memory resources − A small transport overhead (the fixed-length header is just 2 bytes), and protocol exchanges minimised to reduce network traffic − MQTT was developed by Andy Stanford-Clark of IBM and Arlen Nipper of Cirrus Link Solutions. Source: MQTT V3.1 Protocol Specification, IBM.
  175. 175. MQTT − It supports the publish/subscribe message pattern to provide one-to-many message distribution and decoupling of applications − A messaging transport that is agnostic to the content of the payload − The use of TCP/IP to provide basic network connectivity − Three qualities of service for message delivery: − "At most once", where messages are delivered according to the best efforts of the underlying TCP/IP network. Message loss or duplication can occur. − This level could be used, for example, with ambient sensor data where it does not matter if an individual reading is lost as the next one will be published soon after. − "At least once", where messages are assured to arrive but duplicates may occur. − "Exactly once", where messages are assured to arrive exactly once. This level could be used, for example, with billing systems where duplicate or lost messages could lead to incorrect charges being applied. Source: MQTT V3.1 Protocol Specification, IBM.
  176. 176. MQTT Message Format − The message header for each MQTT command message contains a fixed header. − Some messages also require a variable header and a payload. − The format for each part of the message header: − DUP: Duplicate delivery − QoS: Quality of Service − RETAIN: RETAIN flag − This flag is only used on PUBLISH messages. When a client sends a PUBLISH to a server, if the Retain flag is set (1), the server should hold on to the message after it has been delivered to the current subscribers. − This allows new subscribers to instantly receive data with the retained, or Last Known Good, value. Source: MQTT V3.1 Protocol Specification, IBM.
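The header diagram from the slide is not part of this transcript, so the following sketch assumes the fixed-header layout of the MQTT V3.1 specification: the first byte carries the message type in bits 7-4, DUP in bit 3, QoS in bits 2-1 and RETAIN in bit 0, followed by a variable-length "remaining length" field of one to four bytes. The function name `fixed_header` is illustrative.

```python
PUBLISH = 3  # message type code for PUBLISH


def fixed_header(msg_type: int, remaining_length: int,
                 dup: bool = False, qos: int = 0, retain: bool = False) -> bytes:
    """Build an MQTT fixed header: flags byte plus encoded remaining length."""
    if not 0 <= msg_type <= 15:
        raise ValueError("message type must fit in 4 bits")
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1 or 2")
    if not 0 <= remaining_length <= 268_435_455:
        raise ValueError("remaining length out of range")

    first_byte = (msg_type << 4) | (int(dup) << 3) | (qos << 1) | int(retain)
    header = bytearray([first_byte])

    # Remaining length: 7 data bits per byte, top bit set while more bytes follow.
    while True:
        digit = remaining_length % 128
        remaining_length //= 128
        if remaining_length > 0:
            digit |= 0x80
        header.append(digit)
        if remaining_length == 0:
            break
    return bytes(header)


if __name__ == "__main__":
    # A retained PUBLISH at QoS 1 ("at least once") with 10 bytes to follow.
    print(fixed_header(PUBLISH, 10, qos=1, retain=True).hex())
```

For short messages the whole fixed header is two bytes, which is the small transport overhead the protocol is known for.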
  177. 177. Sensor Data as time-series data − The sensor data (or IoT data in general) can be seen as time-series data. − A sensor stream refers to a source that provides sensor data over time. − The data can be sampled/collected at a rate (which can also be variable) and is sent as a series of values. − Over time, there will be a large number of data items collected. − Using time-series processing techniques can help to reduce the size of the data that is communicated; − Let’s remember, communication can consume more energy than computation.
  178. 178. Sensor Data as time-series data − Different representation methods that have been introduced for time-series data can be applied. − The goal is to reduce the dimensionality (and size) of the data, to find patterns, to detect anomalies and to query similar data. − Dimensionality reduction techniques transform a data series with n items to a representation with w items where w < n. − These functions are often lossy, in contrast to normal (lossless) compression, which preserves all the data. − One of these techniques is called Symbolic Aggregate approXimation (SAX). − SAX was originally proposed for symbolic representation of time-series data; it can also be used for symbolic representation of time-series sensor measurements. − The computational footprint of SAX is low, so it can also be used as an in-network processing technique.
  179. 179. 179 In-network processing Using Symbolic Aggregate Approximation (SAX) SAX Pattern (blue) with word length of 20 and a vocabulary of 10 symbols over the original sensor time-series data (green) Source: P. Barnaghi, F. Ganz, C. Henson, A. Sheth, "Computing Perception from Sensor Data", in Proc. of the IEEE Sensors 2012, Oct. 2012. fggfffhfffffgjhghfff jfhiggfffhfffffgjhgi fggfffhfffffgjhghfff
  180. 180. Symbolic Aggregate Approximation (SAX) − SAX transforms time-series data into symbolic string representations. − Symbolic Aggregate approXimation was proposed by Jessica Lin et al. at the University of California, Riverside. − It extends the Piecewise Aggregate Approximation (PAA) approach with a symbolic representation. − The SAX algorithm is interesting for in-network processing in WSN because of its simplicity and low computational complexity. − SAX provides reasonable sensitivity and selectivity in representing the data. − The use of a symbolic representation makes it possible to use several other algorithms and techniques to process/utilise SAX representations, such as hashing, pattern matching, suffix trees etc.
  181. 181. Processing Steps in SAX − SAX transforms a time-series X of length n into a string of arbitrary length w, where typically w << n, using an alphabet A of size a > 2. − The SAX algorithm has two main steps: − Transforming the original time-series into a PAA representation − Converting the PAA intermediate representation into a string. − The string representations can be used for pattern matching, distance measurements, outlier detection, etc.
  182. 182. Piecewise Aggregate Approximation − In PAA, to reduce the time series from n dimensions to w dimensions, the data is divided into w equal-sized “frames”. − The mean value of the data falling within a frame is calculated, and a vector of these values becomes the data-reduced representation. − Before applying PAA, each time series needs to be normalised to achieve a mean of zero and a standard deviation of one. − The reason is to avoid comparing time series with different offsets and amplitudes. Source: Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. 2003. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery (DMKD '03). ACM, New York, NY, USA, 2-11.
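The two preparation steps described above, normalisation and PAA, can be sketched with the Python standard library. This is a minimal illustration, not the authors' implementation: it assumes the population standard deviation (dividing by n) and, for simplicity, requires the series length to be a multiple of w. The function names `z_normalise` and `paa` are illustrative.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def z_normalise(series: Sequence[float]) -> List[float]:
    """Shift and scale a series to zero mean and unit (population) standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be normalised")
    return [(x - mu) / sigma for x in series]


def paa(series: Sequence[float], w: int) -> List[float]:
    """Reduce a series of n values to w frame means (n must be a multiple of w here)."""
    n = len(series)
    if w <= 0 or n % w != 0:
        raise ValueError("this sketch needs len(series) to be a positive multiple of w")
    frame = n // w
    return [fmean(series[i:i + frame]) for i in range(0, n, frame)]


if __name__ == "__main__":
    readings = [2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1]
    z = z_normalise(readings)
    print([round(v, 3) for v in z])
    print([round(v, 3) for v in paa(z, 5)])
```

With ten readings and w = 5, each PAA value is simply the mean of two consecutive normalised readings.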
  183. 183. SAX- normalisation before PAA Timeseries (c): 2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1 Mean (μ): =μ (2+3+4.5+7.6+4+2+2+2+3+1)/10= 3.11 Standard deviation (σ): (2-3.11)2 = 1.2321 (3-3.11)2 = 0.0121 (4.5-3.11)2 = 1.9321 (7.6-3.11)2 = 20.1601 (4-3.11)2 = 0.7921 (2-3.11)2 = 1.2321 (2-3.11)2 = 1.2321 (2-3.11)2 = 1.2321 (3-3.11)2 = 0.0121 (1-3.11)2 = 4.4521 1.2321+0.0121+ 1.9321+ 20.1601+ 0.7921+ 1.2321+ 1.2321+ 1.2321+ 1.2321+ 0.0121+4.4521 = 33.5211 = √ (33.5211/10) = 1.83087683911σ
  184. 184. Normalisation − Timeseries (c): 2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1 − Normalised: z_i = (c_i - μ)/σ, with μ = 3.11 and σ ≈ 1.7969 (square root of the mean squared deviation, 32.2890/10) − z1 = (2-3.11)/1.7969 = -0.618; z2 = (3-3.11)/1.7969 = -0.061; z3 = (4.5-3.11)/1.7969 = 0.774; z4 = (7.6-3.11)/1.7969 = 2.499; z5 = (4-3.11)/1.7969 = 0.495; z6 = (2-3.11)/1.7969 = -0.618; z7 = (2-3.11)/1.7969 = -0.618; z8 = (2-3.11)/1.7969 = -0.618; z9 = (3-3.11)/1.7969 = -0.061; z10 = (1-3.11)/1.7969 = -1.174 − Normalised Timeseries (z): -0.618, -0.061, 0.774, 2.499, 0.495, -0.618, -0.618, -0.618, -0.061, -1.174
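The normalisation arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper name `z_normalise` is hypothetical, and it uses the population standard deviation (dividing by n, as on the slides) rather than the sample standard deviation.

```python
from statistics import fmean, pstdev


def z_normalise(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series, mu)  # population standard deviation: divides by n
    if sigma == 0:
        raise ValueError("a constant series cannot be z-normalised")
    return [(x - mu) / sigma for x in series]


c = [2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1]
z = z_normalise(c)
print("mean:", round(fmean(c), 2), "std:", round(pstdev(c), 4))
print([round(v, 3) for v in z])
```

Running it on the example series is a quick way to confirm the mean, the standard deviation and each normalised value by machine instead of by hand.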
  185. 185. PAA calculation − Timeseries (c): 2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1 − Normalised Timeseries (z): -0.618, -0.061, 0.774, 2.499, 0.495, -0.618, -0.618, -0.618, -0.061, -1.174 − PAA (w=5), the mean of each consecutive pair of normalised values: -0.34, 1.64, -0.06, -0.62, -0.62
  186. 186. PAA to SAX Conversion − Conversion of the PAA representation of a time-series into SAX is based on producing symbols, corresponding to the time-series features, that occur with equal probability. − The SAX developers have shown that time-series which are normalised (zero mean and standard deviation of 1) tend to follow a Normal (Gaussian) distribution. − The SAX method introduces breakpoints that divide the range of PAA values into equiprobable sections and assigns a symbol of the alphabet to each section. − The breakpoints are defined using the Normal inverse cumulative distribution function.
  187. 187. Breakpoints in SAX − “Breakpoints: breakpoints are a sorted list of numbers B = β_1, …, β_{a-1} such that the area under a N(0,1) Gaussian curve from β_i to β_{i+1} = 1/a”. Source: Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. 2003. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery (DMKD '03). ACM, New York, NY, USA, 2-11.
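The breakpoint definition above can be turned into a short computation: the i-th breakpoint is the value below which a fraction i/a of the standard Normal distribution lies. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later); the function name `sax_breakpoints` is an illustrative assumption.

```python
from statistics import NormalDist


def sax_breakpoints(a):
    """Return the a-1 breakpoints that cut N(0,1) into a equiprobable regions."""
    if a < 2:
        raise ValueError("the alphabet size must be at least 2")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Breakpoint i sits at the i/a quantile of the standard Normal distribution.
    return [std_normal.inv_cdf(i / a) for i in range(1, a)]


print([round(b, 2) for b in sax_breakpoints(4)])  # [-0.67, 0.0, 0.67]
```

In practice the breakpoints are usually read from a precomputed lookup table, one row per alphabet size; computing them on demand gives the same values.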
  188. 188. Alphabet representation in SAX − Let’s assume that we have a 4-symbol alphabet: a, b, c, d − As shown in the table in the previous slide, the cut lines for this alphabet (also shown as the thin red lines on the plot below) are { -0.67, 0, 0.67 } Source: JMOTIF Time series mining
  189. 189. SAX Representation − Timeseries (c): 2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1 − Normalised Timeseries (z): -0.618, -0.061, 0.774, 2.499, 0.495, -0.618, -0.618, -0.618, -0.061, -1.174 − PAA (w=5): -0.34, 1.64, -0.06, -0.62, -0.62 − Cut off ranges: {-0.67, 0, 0.67} − Alphabet: a, b, c, d − SAX representation: bdbbb
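The whole pipeline (normalisation, PAA, symbol assignment) fits in one short Python function. This is a hedged sketch rather than a reference implementation: the name `sax` and its parameters are assumptions, n must be a multiple of w, the alphabet is limited to the 26 lowercase letters, and the breakpoints are computed from the Normal inverse CDF instead of being read from a rounded table.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev
from string import ascii_lowercase


def sax(series, w, a):
    """Convert a numeric series into a SAX word of length w over a symbols."""
    n = len(series)
    if w <= 0 or n == 0 or n % w != 0:
        raise ValueError("this sketch needs n to be a positive multiple of w")
    if not 2 <= a <= len(ascii_lowercase):
        raise ValueError("alphabet size must be between 2 and 26")

    # Step 0: z-normalise (population standard deviation, as on the slides).
    mu = fmean(series)
    sigma = pstdev(series, mu)
    if sigma == 0:
        raise ValueError("a constant series cannot be z-normalised")
    z = [(x - mu) / sigma for x in series]

    # Step 1: PAA, the mean of each of the w equal-sized frames.
    frame = n // w
    means = [sum(z[i:i + frame]) / frame for i in range(0, n, frame)]

    # Step 2: map each frame mean to the symbol of the region it falls in.
    breaks = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    return "".join(ascii_lowercase[bisect_right(breaks, m)] for m in means)


print(sax([2, 3, 4.5, 7.6, 4, 2, 2, 2, 3, 1], w=5, a=4))
```

`bisect_right` counts how many breakpoints lie at or below a frame mean, which is exactly the index of the symbol for that region.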
  190. 190. Features of the SAX technique − SAX divides time-series data into equal segments and assigns a symbol to each segment, which yields a string representation of the series. − The SAX patterns form the lower-level abstractions that are used to create the higher-level interpretation of the underlying data. − The string representation produced by SAX makes it possible to compare patterns using a specific type of string similarity function.
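One string similarity function defined for SAX words is MINDIST, from the Lin et al. paper cited on the earlier slides; the slides do not say which function they have in mind, so treating MINDIST as the intended one is an assumption. The sketch below follows the paper's definition: identical or adjacent symbols contribute zero, and other pairs contribute the gap between the breakpoints that separate them. The function name and signature are illustrative.

```python
from math import sqrt
from statistics import NormalDist
from string import ascii_lowercase


def mindist(word1, word2, n, a):
    """MINDIST between two SAX words of equal length w.

    n is the length of the original time series, a the alphabet size.
    """
    if len(word1) != len(word2) or not word1:
        raise ValueError("SAX words must be non-empty and of equal length")
    w = len(word1)
    breaks = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    total = 0.0
    for p, q in zip(word1, word2):
        lo, hi = sorted((ascii_lowercase.index(p), ascii_lowercase.index(q)))
        if hi >= a:
            raise ValueError("symbol outside the alphabet of size a")
        if hi - lo > 1:  # identical or adjacent symbols have distance 0
            total += (breaks[hi - 1] - breaks[lo]) ** 2
    return sqrt(n / w) * sqrt(total)


print(mindist("bdbbb", "bdbbc", n=10, a=4))  # 0.0: the words differ only by adjacent symbols
print(round(mindist("aaaaa", "ddddd", n=10, a=4), 3))
```

According to the paper, MINDIST lower-bounds the Euclidean distance between the normalised original series, which is what makes comparisons on the compact strings meaningful.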
  191. 191. A sample data processing framework [Figure: raw sensor data streams (or annotated data) from a PIR sensor, a light sensor and a temperature sensor are converted into SAX patterns. Intelligent processing, supported by domain knowledge, turns the patterns into observations and low-level thematic abstractions (Attendance, Phone, Hot Temperature, Cold Temperature, Bright, Day-time, Night-time). These are combined with temporal and spatial data extracted from descriptions (e.g. Office room BA0121), and intelligent processing/reasoning produces high-level abstractions, i.e. high-level information/knowledge (e.g. On going meeting, Window has been left open).]
  192. 192. Summary − Wireless Sensor Networks − Communication − Networks − Middleware/gateway − Service/application layer − 6LoWPAN and CoAP − Machine-to-machine communication − M2M architecture − M2M networks − Applications − Semantic technologies − Machine-interpretable data for automated processing − Modelling and annotation − Linked-data
  193. 193. Some related books
  194. 194. Links and further reading − ETSI, Machine to Machine Communications − Machine-to-Machine Communications, OECD Library − Internet of Things, ITU − IoT Comic Book − W3C Semantic Sensor Networks
  195. 195. Acknowledgments − Protocols and Architectures for Wireless Sensor Networks, Holger Karl, Andreas Willig, Wiley, 2005. − Dave Evans, The Internet of Things: How the Next Evolution of the Internet Is Changing Everything, Cisco, April 2011. − The Web of Things, Marko Grobelnik, Carolina Fortuna, Jožef Stefan Institute, Slovenia.
  196. 196. Acknowledgements − Some parts of the content are adapted from: − Holger Karl, Andreas Willig, Protocols and Architectures for Wireless Sensor Networks, chapters 3 and 12, Wiley, 2005. − Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. 2003. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery (DMKD '03). ACM, New York, NY, USA, 2-11. − JMOTIF Time series mining
  197. 197. Q&A