This document provides an overview of a data-driven approach for Internet of Things (IoT) applications. It discusses IoT data sources, semantic data modeling methods, data storage and retrieval frameworks, and case studies of IoT application domains. The key points covered are:
- IoT generates data from physical sensors, mobile devices, and social networks that needs to be modeled and processed into useful information.
- A semantic approach to data modeling uses ontologies to represent IoT resources, entities, services, and observation data in a machine-interpretable way.
- An indexing, storage, and retrieval framework maps sensor data to an ontology schema, spatially indexes the data using geohashes, and stores time-
EDF2014: BIG - NESSI Networking Session: Edward Curry, National University of...European Data Forum
BIG - NESSI Networking Session, Talk by Edward Curry, National University of Ireland Galway at the European Data Forum 2014, 20 March 2014 in Athens, Greece: The Big Data Value Chain.
EDF2014: BIG - NESSI Networking Session: Edward Curry, National University of...European Data Forum
BIG - NESSI Networking Session, Talk by Edward Curry, National University of Ireland Galway at the European Data Forum 2014, 20 March 2014 in Athens, Greece: The Big Data Value Chain.
Towards an INSPIREd e-reporting & INSPIRE priority datasets in SlovakiaMartin Tuchyna
SKI contribution for the Workshop "Priority list of datasets for eReporting" (https://inspire.ec.europa.eu/events/conferences/inspire_2017/submissions/222.html ) held during the INSPIRE 2017 conference (http://inspire.ec.europa.eu/conference2017).
Everything Counts in Large Amounts: Measuring the Impact of Usage Activity in...OpenAIRE
OpenAIRE usage statistics session slides at #DI4R2017. 1 Dec. 2017 - by Dimitris Pierrakos, Athena Research Center and Jochen Schirrwagen, Bielefeld University.
These slides were used at the first Aarhus Follower Group meet-up for the EU-funded project IoTCrawler. They entail an introduction to the project aswell as a more in depth presentation of the difference between web search and Internet of Things (IoT) search an the development of Internet of Things. Furthermore some of the scenarios from the project are presented.
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...BigData_Europe
Where we are and are going for Big Data in OpenScience
Keynote talk at the Big Data Europe SC6 Workshop on 11.9.2017 in Amsterdam co-located with SEMANTiCS2017: The perspective of European official statistics by Fernando Reis, Task-Force Big Data, European Commission (Eurostat).
Introductory talk on the usage of Linked Data for official statistics, given at the ESS (Linked) Open Data Workshop 2017, in Malta, January 2017.
In this introductory talk we will discuss the main foundations for the application of Linked Data principles into official statistics. We will briefly introduce what Linked Data is, as well as the main principles, languages and technologies behind it (URIs, RDF, SPARQL). We will also discuss about the different formats in which data can be made available on the Web (e.g., RDF Turtle, JSON-LD, CSV on the Web). We will then move into providing a detailed presentation, with step by step examples based on existing Linked Statistical Data sources, of the W3C recommendation RDF DataCube, which is the basis for the dissemination of statistical data as Linked Data. Finally, we will provide some examples of applications, and the opportunities that this approach offers for the development of the proofs of concepts selected by Eurostat and to be discussed during the meeting.
Snap4City November 2019 Course: Smart City IOT Data Ingestion Interoperabilit...Paolo Nesi
• Data Ingestion Capabilities
• Data Ingestion Strategy
• Setting Up the Road Graph on Knowledge Base
• Data Set Load via Data Gate (plus how to load triples into Knowledge base)
• Data Ingestion and Transformation via ETL Processes
• Data Ingestion via IOT Brokers
• IOT Network: recall of basic concepts
• IOT Directory
• IOT Devices and IOT Brokers Registration
• Data Ingestion via IOT Applications
• Data Ingestion from API, External Services, Custom MicroServices
• Data Ingestion via Web Scraping
• Data Streams from Smart City API, participatory
• Data Streams from Mobile Devices
• Data Streams from Dashboards
• GIS Data Import and Export
• Social Media data collection and exploitation
• Acknowledgements
Ontology Building vs Data Harvesting and Cleaning for Smart-city ServicesPaolo Nesi
Presently, a very large number of public and private data sets are available around the local governments. In most cases, they are not semantically interoperable and a huge human effort is needed to create integrated ontologies and knowledge base for smart city. Smart City ontology is not yet standardized, and a lot of research work is needed to identify models that can easily support the data reconciliation, the management of the complexity and reasoning. In this paper, a system for data ingestion and reconciliation of smart cities related aspects as road graph, services available on the roads, traffic sensors etc., is proposed. The system allows managing a big volume of data coming from a variety of sources considering both static and dynamic data. These data are mapped to smart-city ontology and stored into an RDF-Store where they are available for applications via SPARQL queries to provide new services to the users. The paper presents the process adopted to produce the ontology and the knowledge base and the mechanisms adopted for the verification, reconciliation and validation. Some examples about the possible usage of the coherent knowledge base produced are also offered and are accessible from the RDF-Store and related services. The article also presented the work performed about reconciliation algorithms and their comparative assessment and selection. Keywords Smart city, knowledge base construction, reconciliation, validation and verification of knowledge base, smart city ontology, linked open graph.
A Web of Things Based Eco-System for Urban Computing - Towards Smarter CitiesAndreas Kamilaris
Environmental awareness and knowledge may help people to take more informed decisions in their everyday lives, ensuring their health and safety. The Web of Things enables embedded sensors to become easily deployed in urban areas for environmental monitoring such as air quality, electromagnetism, radiation, etc. In this presentation, we propose an eco-system for urban computing which combines the concept of the Web of Things, together with big data analysis and event processing, towards the vision of smarter cities that offer real-time information to their habitants about the urban environment. We touch upon near real-time web-based discovery of sensory services, citizen participation, semantic technologies and mobile computing, helping people to take more informed everyday decisions when interacting with their urban landscape. We then present a case study where we demonstrate the feasibility and usefulness of this eco-system to the everyday lives of citizens.
This research has been supported by the P-SPHERE project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skodowska-Curie grant agreement No 665919.
Semantic Interoperability Issues and Approaches in the IoT.est Projectiotest
P Barnaghi, Semantic Interoperability Issues and Approaches in the IoT.est Project, at the IERC AC4 Semantic interoperability Workshop (during the IoT-week 2012), Venice, Italy, 19 June 2012
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...Edward Curry
The Real-time Linked Dataspace (RLD) is an enabling platform for data management for intelligent systems within smart environments that combines the pay-as-you-go paradigm of dataspaces, linked data, and knowledge graphs with entity-centric real-time query capabilities.
The RLD contains all the relevant information within a data ecosystem including things, sensors, and data sources and has the responsibility for managing the relationships among these participants.
It manages sources without presuming a pre-existing semantic integration among them using specialised dataspace support services for loose administrative proximity and semantic integration for event and stream systems. Support services leverage approximate and best-effort techniques and operate under a 5 star model for “pay-as-you-go” incremental data management.
Towards an INSPIREd e-reporting & INSPIRE priority datasets in SlovakiaMartin Tuchyna
SKI contribution for the Workshop "Priority list of datasets for eReporting" (https://inspire.ec.europa.eu/events/conferences/inspire_2017/submissions/222.html ) held during the INSPIRE 2017 conference (http://inspire.ec.europa.eu/conference2017).
Everything Counts in Large Amounts: Measuring the Impact of Usage Activity in...OpenAIRE
OpenAIRE usage statistics session slides at #DI4R2017. 1 Dec. 2017 - by Dimitris Pierrakos, Athena Research Center and Jochen Schirrwagen, Bielefeld University.
These slides were used at the first Aarhus Follower Group meet-up for the EU-funded project IoTCrawler. They entail an introduction to the project aswell as a more in depth presentation of the difference between web search and Internet of Things (IoT) search an the development of Internet of Things. Furthermore some of the scenarios from the project are presented.
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...BigData_Europe
Where we are and are going for Big Data in OpenScience
Keynote talk at the Big Data Europe SC6 Workshop on 11.9.2017 in Amsterdam co-located with SEMANTiCS2017: The perspective of European official statistics by Fernando Reis, Task-Force Big Data, European Commission (Eurostat).
Introductory talk on the usage of Linked Data for official statistics, given at the ESS (Linked) Open Data Workshop 2017, in Malta, January 2017.
In this introductory talk we will discuss the main foundations for the application of Linked Data principles into official statistics. We will briefly introduce what Linked Data is, as well as the main principles, languages and technologies behind it (URIs, RDF, SPARQL). We will also discuss about the different formats in which data can be made available on the Web (e.g., RDF Turtle, JSON-LD, CSV on the Web). We will then move into providing a detailed presentation, with step by step examples based on existing Linked Statistical Data sources, of the W3C recommendation RDF DataCube, which is the basis for the dissemination of statistical data as Linked Data. Finally, we will provide some examples of applications, and the opportunities that this approach offers for the development of the proofs of concepts selected by Eurostat and to be discussed during the meeting.
Snap4City November 2019 Course: Smart City IOT Data Ingestion Interoperabilit...Paolo Nesi
• Data Ingestion Capabilities
• Data Ingestion Strategy
• Setting Up the Road Graph on Knowledge Base
• Data Set Load via Data Gate (plus how to load triples into Knowledge base)
• Data Ingestion and Transformation via ETL Processes
• Data Ingestion via IOT Brokers
• IOT Network: recall of basic concepts
• IOT Directory
• IOT Devices and IOT Brokers Registration
• Data Ingestion via IOT Applications
• Data Ingestion from API, External Services, Custom MicroServices
• Data Ingestion via Web Scraping
• Data Streams from Smart City API, participatory
• Data Streams from Mobile Devices
• Data Streams from Dashboards
• GIS Data Import and Export
• Social Media data collection and exploitation
• Acknowledgements
Ontology Building vs Data Harvesting and Cleaning for Smart-city ServicesPaolo Nesi
Presently, a very large number of public and private data sets are available around the local governments. In most cases, they are not semantically interoperable and a huge human effort is needed to create integrated ontologies and knowledge base for smart city. Smart City ontology is not yet standardized, and a lot of research work is needed to identify models that can easily support the data reconciliation, the management of the complexity and reasoning. In this paper, a system for data ingestion and reconciliation of smart cities related aspects as road graph, services available on the roads, traffic sensors etc., is proposed. The system allows managing a big volume of data coming from a variety of sources considering both static and dynamic data. These data are mapped to smart-city ontology and stored into an RDF-Store where they are available for applications via SPARQL queries to provide new services to the users. The paper presents the process adopted to produce the ontology and the knowledge base and the mechanisms adopted for the verification, reconciliation and validation. Some examples about the possible usage of the coherent knowledge base produced are also offered and are accessible from the RDF-Store and related services. The article also presented the work performed about reconciliation algorithms and their comparative assessment and selection. Keywords Smart city, knowledge base construction, reconciliation, validation and verification of knowledge base, smart city ontology, linked open graph.
A Web of Things Based Eco-System for Urban Computing - Towards Smarter CitiesAndreas Kamilaris
Environmental awareness and knowledge may help people to take more informed decisions in their everyday lives, ensuring their health and safety. The Web of Things enables embedded sensors to become easily deployed in urban areas for environmental monitoring such as air quality, electromagnetism, radiation, etc. In this presentation, we propose an eco-system for urban computing which combines the concept of the Web of Things, together with big data analysis and event processing, towards the vision of smarter cities that offer real-time information to their habitants about the urban environment. We touch upon near real-time web-based discovery of sensory services, citizen participation, semantic technologies and mobile computing, helping people to take more informed everyday decisions when interacting with their urban landscape. We then present a case study where we demonstrate the feasibility and usefulness of this eco-system to the everyday lives of citizens.
This research has been supported by the P-SPHERE project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skodowska-Curie grant agreement No 665919.
Semantic Interoperability Issues and Approaches in the IoT.est Projectiotest
P Barnaghi, Semantic Interoperability Issues and Approaches in the IoT.est Project, at the IERC AC4 Semantic interoperability Workshop (during the IoT-week 2012), Venice, Italy, 19 June 2012
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...Edward Curry
The Real-time Linked Dataspace (RLD) is an enabling platform for data management for intelligent systems within smart environments that combines the pay-as-you-go paradigm of dataspaces, linked data, and knowledge graphs with entity-centric real-time query capabilities.
The RLD contains all the relevant information within a data ecosystem including things, sensors, and data sources and has the responsibility for managing the relationships among these participants.
It manages sources without presuming a pre-existing semantic integration among them using specialised dataspace support services for loose administrative proximity and semantic integration for event and stream systems. Support services leverage approximate and best-effort techniques and operate under a 5 star model for “pay-as-you-go” incremental data management.
Toward Semantic Data Stream - Technologies and ApplicationsRaja Chiky
Massive data stream processing is a scientific challenge and an industrial concern. But with the current volumes of data streams , their number and variety, current techniques are not able to meet the requirements of applications. The Semantic Web tools , through the RDF for example, allow to address the problem of heterogeneous data. Thus, the data stream are converted to semantic data stream by using RDF triples extended with a timestamp. To be able to query , filter, or reason semantic data streams, the query language SPARQL must be extended to include concepts such as windowing , based on what has been done in Data Stream Management Systems. In this talk, I will present recent work on the semantic data stream management , particularly extensions made on SPARQL language and associated benchmarks.
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
Watch here: https://bit.ly/3719Bi7
Advanced data science techniques, like machine learning, have proven an extremely useful tool to derive valuable insights from existing data. Platforms like Spark, and complex libraries for R, Python and Scala put advanced techniques at the fingertips of the data scientists. However, these data scientists spent most of their time looking for the right data and massaging it into a usable format. Data virtualization offers a new alternative to address these issues in a more efficient and agile way.
Attend this webinar and learn:
-How data virtualization can accelerate data acquisition and massaging, providing the data scientist with a powerful tool to complement their practice
- How popular tools from the data science ecosystem: Spark, Python, Zeppelin, Jupyter, etc. integrate with Denodo
- How you can use the Denodo Platform with large data volumes in an efficient way
-About the success McCormick has had as a result of seasoning the Machine Learning and Blockchain Landscape with data virtualization
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfGetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source ) Copilot?
How can we build one?
Architecture and evaluation
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
A Data-driven Approach for Internet of Things Applications: Methods and Case Studies
1. A Data-driven Approach for Internet of Things
Applications: Methods and Case Studies
Wednesday, 11 April 2018 1
Suparna De
30th October, 2017
University of Granada, Spain
VII SEMINARIOS DE FORMACIÓN PARA LA INVESTIGACIÓN EN TIC
2. Outline
– Internet of Things: an Introduction
– Data Processing pipeline
• Data Sources
• Data Modelling: a Semantic Approach
• Data Search and Retrieval
• Data Analysis - Reasoning Methods
– Case Studies
• IoT Application Domains
– Open Research
Wednesday, 11 April 2018 2
4. – Term coined by Kevin Ashton in 1999.
– Interconnection of objects to computers with self-configuring capabilities
– Main enablers:
• sensors and actuators embedded in
physical objects
• RFID and sensor technology enable
computers to observe, identify and
understand the world
– Drivers:
• things-to-things communications
• integration of things data with applications
Wednesday, 11 April 2018 4
5. Low-cost Sensors are becoming prevalent
Wednesday, 11 April 2018
5
Environment sensors
Utility consumption sensors
Dynamic Tags
6. More parts of life are getting connected…
Wednesday, 11 April 2018
6
Cities
Public transport
Consumer goods
Smart Homes
Image courtesy: 1. Exigent Networks; www.exigentnetworks.ie
2. SmartCitiesCouncil
8. From IoT to the Web of Things (WoT)
– Connecting “Things” to the Web for:
• access
• description and discovery
• resource directories
• security
– Typical connectivity solutions:
• Constrained Application Protocol (CoAP)
• Lightweight HTTP
Wednesday, 11 April 2018 8
10. IoT: the case for a Data Perspective
– Abstractions of high-dimensional, high-volume data generated by
heterogeneous devices
– Associate data with context information
– Data fusion through application of data analytics and reasoning techniques
– Example Applications:
• Analyse road, environment (pollution) conditions with real-time location
information (proximity) to recommend events and venues
• Adjust traffic signal timings based on vehicle and cyclist arrival data
• Efficient waste management using FMCG (fast moving consumer
packaged goods) lifecyle information from smart tags
Wednesday, 11 April 2018 10
11. Challenges in IoT realization
– Many devices do not speak the same
language and cannot exchange data across
different gateways and smart hubs
– Things’ data may have a defined structure in a known
format, e.g. JSON, CSV, XML
– But data models adopted are different and not compatible
– Different units and context representation
Interoperability Challenge
Wednesday, 11 April 2018 11
12. The Data Lifecycle
– Data Collection
• Identification and connection to data sources
• Physical and social world data sources
• Data virtualisation
– Data modelling
– Schema adaptors
– Data Management
• Data indexing
• Metadata, data storage and retrieval
– Data Processing
• Missing data estimation
• Redundancy filtering, pre-sorting…
– Data Analysis
• Reasoning mechanisms
• Data fusion
– Applications
Wednesday, 11 April 2018 12
14. Data Source taxonomy
– Physical sensor deployments
• Fixed Sensing
• Mobile sensor nodes
– Mobile crowd sensing
• Participatory sensing
• Online social networks
Wednesday, 11 April 2018 14
Data Sources
Physical Sensor
Deployments
Mobile Crowd
Sensing
Participatory
Sensing
Online Social
Networks
Fixed Sensing Mobile Sensing
15. Physical Sensor Deployments
– Fixed Sensing
• Fixed installations – static location configuration of deployed sensors
• O&M data ~ continuous time series, resolution dependent upon sampling rate
• Typical deployments in urban areas, smart homes, ITS solutions
• Structured data, typically in JSON/CSV/XML formats
• Examples:
• London Air Quality Network (www.londonair.org.uk/London)
– Air pollution sensors: CO, NO2, O3, PM10, PM2.5, SO2
– Data sampled every 15 minutes
– 4 location types: roadside, suburban, urban background, industrial
– APIs for accessing data in XML or JSON; historical downloads in CSV
• Smart Santander (http://maps.smartsantander.eu/)
– Environmental monitoring: temperature, CO, noise, light
– Traffic monitoring: traffic volume, road occupancy, vehicle speed, queue length
– Agriculture monitoring: moisture, temperature, humidity…
• Water Distribution Networks…
Wednesday, 11 April 2018 15
16. Physical Sensor Deployments (2)
– Mobile Sensing
• O&M data ~ typically frequently updated, timestamped and structured (FUTS) data
• Each sampling data point associated with a distinct location tag
• Data typically accessible in a known structured format
• Not obtained at successive, equally-spaced time points
• Typical deployments for urban monitoring
• Opportunity for large-scale environmental monitoring
• Structured data, typically in JSON/CSV/XML formats
• Examples:
– Smart Santander
» http://maps.smartsantander.eu/
» Sensors attached to public
vehicles
» Environmental monitoring:
» temperature, CO, noise, light
– Madrid
» Pollen sensors on public buses
Wednesday, 11 April 2018 16
Image courtesy: Smart Santander testbed; http://maps.smartsantander.eu
17. Mobile crowd sensing
– Participatory Sensing
• Smartphone accompanied citizens forming sensing networks for local
knowledge gathering
• Involves explicit participation
• Made possible through dedicated apps or hardware carried by citizens
• Examples:
– Congested road and traffic incident detection
» Arduino boards in cars: speed and position of the car
» Environmental monitoring: temperature, CO, noise, light
– Melbourne: noise heatmaps of city
» Complement noise data from fixed sensors, perceptions of noise
and urban sounds
Wednesday, 11 April 2018 17
18. Mobile crowd sensing (2)
– Online social networks
• Immediacy of social network messages: rich source of city-information
• Data amenable to mining
• Can provide semantic context to physical sensing data
• Examples:
– Twitter
» 140 character tweets
» Wide adoption – 500 million users worldwide
» Both streaming and RESTful API
» User perception of pollution, representation term mining for traffic incidents
– Foursquare
» Location-based social network
» Users check-in to a venue
» Data: time, type, user details, venue details (name, location, category)
» Modality and format allows direct manipulation through statistical methods and
integration with (numeric) physical sensing data
Wednesday, 11 April 2018 18
20. Semantics for IoT Resources, data
– Semantics: machine-processible metadata (tagging)
– Semantic languages
• Web Ontology Language (OWL), Resource Description Framework (RDF)
– Structured, common platform for Things and data representation [Data Modelling]
– Achieving: ontologies for IoT “Things”
• Resource model
– Gateway, sensors, processing resources
• Entity model
– Physical world objects
– Features of interest for each entity
• Service model
– IoT services and interfaces
• Observation and Measurment (O&M) data
– Machine interpretation of relationships and hierarchies
Wednesday, 11 April 2018 20
Image ref: Jara et al. Semantic web of things: An analysis of the application semantics for the iot
moving towards the iot convergence. Int. J. Web Grid Serv., 10(2/3):244–272, April 2014
21. Role of Metadata
– Semantic tagging
– Machine-interpretable data annotation and resource descriptions
– Re-usable ontologies
– Resource description framework(s)
– Structured data, structured query
Wednesday, 11 April 2018 21
22. Metadata and Semantics
– To describe:
• Content
• Context
• Resources
• Entities and features of interest
– To create:
• Perception
• Situation awareness
– To support
• Automated processes for management of resources and decision
making
Wednesday, 11 April 2018 22
29. IoT Service model Instance
Wednesday, 11 April 2018 29
Refer to publication [3] for details
30. IoT Models: Summary
– Distinct repositories for metadata and
data
– Metadata
• Less frequency of update
– Data
• frequently updated, timestamped
and structured
• not obtained at successive,
equally-spaced time points
Wednesday, 11 April 2018 30
Refer to publication [4] for a survey of WoT ontologies
32. Data Indexing, Storage, and Retrieval Framework
Wednesday, 11 April 2018 32
– Data Parser
• Parse and transform from
JSON, CSV, or other data
– Data Record Mapping
• Map parsed data to
format defined by
Ontology Schema
– Spatial Indexing Component
• Geohash-Grid Tree
• Index spatial parameters
of data records
– Time-Series Database
(InfluxDB)
• Store O&M data
DataCollection
Smart Objects
Gateway
Virtual Object
Schema
Data Parser
JSON/CSV
Scripts
DataManagement
Data Record
Mapping
O&M
Time-Series
Database
Indexing
Component
Geohash
Generation
Refer to publication [6] for details.
33. Data Indexing, Storage, and Retrieval Framework
Wednesday, 11 April 2018 33
– Data Records
• a data record is a 5-tuple of the
form [<object-id>, [<tag-
key>=<tag-value>…(0..n)],
[<field-key>=<field-
value>…(1..n)], <geohash>,
<unix-nano-timestamp>]
– Considered Query
• A query asking for observation
and measurement (O&M) data
based on spatial and temporal
constraints
36. Spatial Index: Geohash-Grid Tree
– Node
• Geohash string
• Spatial grid (2 pairs of {latitude, longitude})
• A list of stored VO_IDs (leaf node)
– Features
• Unbalanced tree
• Fixed height
• Insertion without changing existing tree structure
• No need to split node
Wednesday, 11 April 2018 36
37. Data Retrieval Components and Steps
Wednesday, 11 April 2018 37
– Query Interface
• Get query
– Query Analysis
• Analyze query
• Send request with spatial constraints to
indexing component to get matched IDs
– Query Rewriting
• Rewrite query with matched IDs and
other constraints
– Results Refinement
• Refine returned data from database into
a proper format for display on query
interface
38. Spatial Index: Performance measurements
Wednesday, 11 April 2018 38
Geohash-Grid Tree Comparing to R Tree
– Insertion does not change existing tree structure
– Fast indexing creation time
– Better query response time in dense areas
0.000
1.000
2.000
3.000
4.000
5.000
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
ResponseTime(logms)
Number of Indexed Items
Range Query at Dense Area
R-Tree Geohash-Grid Tree
0.000
0.005
0.010
0.015
0.020
0.025
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
ResponseTime(logms)
Number of Indexed Items
Point Query at Dense Area
R-Tree Geohash-Grid Tree
0
50
100
150
200
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
InsertionTime(milliseconds)
Number of Data Records in Each Dataset
Tree Creation Time For Each Dataset (milliseconds)
R-Tree Geohash-Grid-Tree
39. Spatial Indexing and Retrieval: Performance measurements
Wednesday, 11 April 2018 39
0
1
2
3
4
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
QueryResponseTime(logms)
Number of Data Records
Geo-coordinates_as_Field Geohash-Grid Tree R-Tree Geohash_as_Tag
0
0.5
1
1.5
2
2.5 1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
QueryResponseTime(logms)
Number of Data Records
Geocoordinates_as_Field GeohashGridTree R Tree Geohash_as_Tag
0
0.5
1
1.5
2
2.5
3
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
QueryResponseTime(logms)
Number of Data Records
Geocoordinates_as_Field GeohashGridTree R Tree Geohash_as_Tag
0
1
2
3
4
1,000
5,000
10,000
15,000
20,000
25,000
30,000
35,000
40,000
45,000
50,000
55,000
60,000
65,000
70,000
75,000
80,000
85,000
90,000
95,000
100,000
QueryResponseTime(logms)
Number of Data Records
Geocoordinates_as_Field GeohashGridTree R Tree Geohash_as_Tag
Long time
period
Short time
period
Sparse area Dense area
41. Case Study I: Smart Campus – IoT
testbed
Wednesday, 11 April 2018 41
42. Semantic Reasoning for Association Analysis
Wednesday, 11 April 2018 42
– Associations along thematic-spatial-temporal axes
• Thematic (feature) match – utilising domain
ontologies that capture virtual entity’s attributes
and IoT Service’s input/output parameter
• Spatial match – utilising location ontologies that
model logical locations with properties such as
‘contains’
• Temporal match – utilising temporal aspects of
entities which have a temporal aspect, such as
meeting rooms with the IoT Service’s
observation_schedule
– Dynamic association inference through
• Rules that incrementally reason on feature, spatial aspects
and time
– Provision for semantic queries on derived
associations
Spatial, temporal and
thematic association
Domain ontologies and
Rules
Entity of Interest
IoT Service
Refer to publication [2] for details.
43. Smart Campus
Wednesday, 11 April 2018 43
– Driven with semantic models describing
• Entities
• Campus buildings, floors and rooms
• Location model capturing indoor locations
• Rooms, corridoors etc. with their
proximity, containment relationships
• Resources
• Temperature, light sensors
• IoT services
• Access interface to IoT resources and
their O&M data
– Dynamic thematic-spatial-temporal association
inference
Processing and Storage
Association Manager
Resources
Semantic
Models
Triple Store
Association
rules
Knowledge
sharing
rules
Knowledge propagation
Semantic
resource
repository
Association
repository
Selected
Nodes
Shared association results
Triple Store update
Rule Manager
Results
Dispatcher
Rule Engine
Geolocation
Mapper
Refer to publication [2] for details.
45. Smart Campus: Online platform
Wednesday, 11 April 2018 45
Ref: T. Elsaleh et al. Sense2Web Linked Data Platform; http://iot.ee.surrey.ac.uk/s2w/. Details in reference [1]
47. Case Study II: Recycling FMCG
Wednesday, 11 April 2018 47
48. Semantic modelling of Smart tags
Wednesday, 11 April 2018 48
– Smart tags for:
– Ice cream packaging
• QR code
– Unique identifier for each ice cream
• Time temperature indicator (TTI) label
• indicator of the quality of the ice cream
based on its cold chain
• irreversible thermochromic ink that will
change its colour after exposure to a
temperature above -10 C° for more than
30 minutes
49. Semantic modelling of Smart tags
Wednesday, 11 April 2018 49
– Representation of both
• printed electronics (e.g. NFC) and
• passive printed 2D tags such as datamatrix and QR codes (encoded using
dynamic inks)
– that can be read by scanners
– Semantic model for SmartTags capturing:
• reactions to chemical/physical conditions [e.g. Inks could be thermochromic,
hydrochromic or fluorescent visible/invisible, i.e. they ‘reactTo’
temperature/humidity/light within defined ranges]
• reaction state (reversible/permanent)
• status (activated/expired/not-expired)
• links to required decoding mechanism
• links to recorded measurements, including context data (location, time etc.)
50. Semantic Analysis for FMCG
Wednesday, 11 April 2018 50
– System provides point-of-recycling information
for every consumer packaged good (CPG)
– Allows tracking of the FMCG lifecycle
– Enables cold chain visible quality indicator
– Semantic modelling can enable:
• Detection of erroneous QR code scan
measurements
• Generic, enhanced recommendations on
(nearest and relevant) recycling points
51. Case Study III: Cyber-Physical-Social
(CPS) data analysis, fusion
Wednesday, 11 April 2018 51
52. CPS Data Fusion
Wednesday, 11 April 2018 52
– Social data source:
• Recorded Foursquare check-ins in Patras,
Greece for 3 months
– 100 Days between July to September 2012
– Created a grid of “listening posts” that sampled
foursquare API every 30 minutes
– Each listening post queries API to retrieve
names of businesses within their range, current
check-ins and total check-ins. From this data
the number of users who checked-in within the
last 30 minutes can be calculated
– 282 venues recorded of which 249 checked into
– Average of 145.82 Check-ins per day
– Physical world data: Traffic and Air Quality
• Network of 29 stations where measurements
taken
• Blue stations where traffic measurements taken
over single 24-hour period, Pink stations where
traffic measurements taken over 7 day period
• X marks the air quality monitoring station.
• Air pollution measurements provided by the
public repository of the Hellenic Ministry of
Environment, Energy and Climate Change
Refer to publication [7] for details.
53. CPS Data Fusion
Wednesday, 11 April 2018 53
Correlated Air Quality (NO and CO) and Traffic data during a single 24 hour period (2
stations were over 1 week) with Foursquare check-in data
Correlation found between diurnal Foursquare check-ins, traffic volume, NOx and CO
pollutants
54. CPS Data Fusion
Wednesday, 11 April 2018 54
– Multilevel categories of check-ins used
to better understand peoples diurnal
patterns
• was used to compare activity
popularity, activity times and
compare venues
• only a few venues take up the
majority of checkins (Top 20% of
venues take up 69.2% of checkins).
55. CPS Data Analysis: City Rhythms
Wednesday, 11 April 2018 55
– Foursquare check-ins recorded in
London and New York
• >2 months in 2016; 39414 check-ins
(London) and 98726 (NY)
• Unique users:
– NY – 7363
– London – 4417
• Venues:
– NY – 25100
– London – 10853
Acknowledgement: Alex Grace, “Analysing Social Network Data to Reveal
City Demographics: Mining Social Networks For A City’s Typical Diurnal
Movement Patterns”, BEng final year project report, University of Surrey, 2017. Heatmap of Foursquare venues in London
56. CPS Data Analysis: City Rhythms
Wednesday, 11 April 2018 56
5125
5131
5137
5143
5149
5155
5161
5167
5173
0
500
1000
1500
2000
2500
3000
-55-53-51-49-47-45-43-41-39-37-35-33-31-29-27-25-23-21-19-17-15-13-11 -9 -7 -5 -3 -1 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
CoordinateLatitude(x100)
UniqueUserCount
Coordinate Longitude (x100)
London Unique Users
0-500 500-1000 1000-1500 1500-2000 2000-2500 2500-3000
Acknowledgement: Alex Grace, “Analysing Social
Network Data to Reveal City Demographics: Mining
Social Networks For A City’s Typical Diurnal
Movement Patterns”, BEng final year project report,
University of Surrey, 2017.
58. Wednesday, 11 April 2018 58
Smart thermostat
Smart lighting
Assisted Living
Noise heatmaps
Characterisation of urban
neighbourhoods
Event/venue recommendations
Event detection
Electric vehicle charging
Smart traffic lights
Smart parking meters
Individual trip planning
Pollution mapping and
monitoring
Sentiment analysis –
environment conditions
Refer to publication [5] for details.
59. Open Research Issues
Wednesday, 11 April 2018 59
– Cross-space data fusion:
• Mobile crowd sensed and physical sensor data are multimodal and in
different scales of measure
• Physical sensor data in interval or ratio scale
• Open datasets in nominal scale (qualitative classifications)
– Need intelligent methods to convert mobile crowd sensed data into ratio
scale for efficient integration with physical sensor data
– Reasoning methods: typically deductive
• Extend to probabilistic reasoning to handle uncertain situations
• Learning in dynamic or evolving environments
– Detect changes in environment to trigger adaptive strategies
60. Selected Publications
Wednesday, 11 April 2018 60
1. De, S.; Elsaleh, T.; Barnaghi, P.; Meissner, S. An Internet of Things Platform for Real-World and Digital Objects. Journal
of Scalable Computing: Practice and Experience 2012, 13, 45-57.
2. De, S.; Christophe, B.; Moessner, K. Semantic Enablers for Dynamic Digital-Physical Object Associations in a Federated
Node Architecture for the Internet of Things. Ad Hoc Networks 2014, 18, 102-120.
3. Wang, W.; De, S.; Cassar, G.; Moessner, K. An experimental study on geospatial indexing for sensor service discovery.
Expert Systems with Applications 2015, 42, 3528-3538.
4. De, S.; Zhou, Y.; K., M. Ontologies and context modeling for the Web of Things. In Managing the Web of Things:
Linking the Real World to the Web, 1 ed.; Sheng M, Q.Y., Yao L, Benatallah B, Ed. Morgan Kaufmann: Burlington,
Massachusetts, 2017.
5. De, S.; Zhou, Y.; Larizgoitia Abad, I.; Moessner, K. Cyber–Physical–Social Frameworks for Urban Big Data Systems: A
Survey. Applied Sciences 2017, 7, 1017.
6. Zhou, Y.; De, S.; Wang, W.; Moessner, K.; Palaniswami, M. Spatial Indexing for Data Searching in Mobile Sensing
Environments. Sensors 2017, 17, 1427.
7. Komninos, A.; Stefanis, V.; Plessas, A.; Besharat, J. Capturing Urban Dynamics with Scarce Check-In Data. Pervasive
Computing, IEEE 2013, 12, 20-28.
61. Contact
Wednesday, 11 April 2018 61
Dr. Suparna De
Senior Research Fellow
Institute for Communication Systems (ICS)
University of Surrey
Guildford, UK
Email: s.de@surrey.ac.uk
Website: https://www.surrey.ac.uk/people/suparna-de