3. Contextualise Sensors with Linked Data
dB, km, µPa?
dB values in water have a different reference level than in air
4. Q1. How to model it?
—> Is it worth it?
dB, km, µPa?
Yes, because:
Q1. Contextualised Model for
Q2. Cross-Network Communication
Q3. Relevancy Prediction
Q4. Enriched Web Content
Q5. Network Adaptability
5. Research Questions
Q1. How to model Linked Sensor Data for:
Q2. Cross-Network Communication
Q3. Relevancy Prediction
Q4. Web Data Quality
Q5. Network Adaptability
6. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q1. How can contextual information
be used to enrich sensor data?
7. Linked Sensor Data Model
Ontology Modularisation:
Context
Network Components
Energy Conservation
8. Linked Sensor Data Model
Application
Ontology
Domain Ontology
Task Ontology
Upper Ontology
Ontology Aligning
Inheritance & Reuse
Dolce+DnS Ultralite (DUL)
W3C Semantic Sensor Network (SSN)
Our Ontology
Quantities, Units, Dimensions
and Data Types (QUDT)
9. Linked Sensor Data Model
Dolce+DnS Ultralite (DUL)
W3C Semantic Sensor Network (SSN)
Provenance (PROV)
Event Model-F (EVENT)
Unified Code for Units of Measure (UCUM)
Friend Of A Friend (FOAF)
Measurement Unit (MUO)
Online Presence (OPO)
Review Vocabulary (REV)
Quantities, Units, Dimensions
and Data Types (QUDT)
Ontology Aligning
Inheritance & Reuse
10. Linked Sensor Data Model
spt:Agent
spt:Activity
ssn:Device
ssn:Sensor
EventParticipation
ssn:Stimulus
spt:Place
11. Linked Sensor Data Model
OWL Full
Property characteristics: Symmetric, Transitive, Inverse, Equivalent
spt:containedIn: room A → floor 2 → house H
Asserted, Inferred, Direct Relations
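The spt:containedIn chain above (room A inside floor 2 inside house H) is what an OWL reasoner derives from the property's transitivity. A toy sketch of that inference, with made-up resource names and no real reasoner:

```python
# Toy sketch of the transitive inference an OWL reasoner performs for a
# transitive property such as spt:containedIn. Facts are illustrative only.
def transitive_closure(pairs):
    """All (a, b) derivable from the asserted pairs by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

asserted = {("roomA", "floor2"), ("floor2", "houseH")}
inferred = transitive_closure(asserted) - asserted
print(inferred)  # {('roomA', 'houseH')}
```

In the actual model this derivation is left to the OWL semantics of spt:containedIn rather than computed by hand.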
13. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q1. How can contextual information
be used to enrich sensor data?
Q2. How can sensors communicate
across different platforms without ad-hoc solutions?
14.
Dolce+DnS Ultralite (DUL)
W3C Semantic Sensor Network (SSN)
Provenance (PROV)
Event Model-F (EVENT)
Unified Code for Units of Measure (UCUM)
Friend Of A Friend (FOAF)
Measurement Unit (MUO)
Online Presence (OPO)
Review Vocabulary (REV)
Quantities, Units, Dimensions
and Data Types (QUDT)
Which ontologies?
Which links? How to enable inference?
How to enable cross-communication?
Non-experts
Average users
19. LD4S: Usability Evaluation
2. Uptake, Usability, Utility
User feedback
Tot. Participants: 38
1% had previously
interacted with sensors
Survey
GUI Usable and clear
API to be improved
Applicability to be made explicit
20. LD4S: Utility Evaluation
Time of usage
# Data accessed
# Data transmitted
Amount
Type
Uniqueness
Location
Quality
Time Sensitivity
Relevance
Web Service Resources Linked Data output
Purpose to be made more explicit
Relevancy of links to the purpose to be improved
Highlight importance of network/context metadata
21. LD4S: Uptake Evaluation
Unique accesses
Tot # accesses
Per day (over the 30-day period)
2. Uptake, Usability, Utility
User feedback
Satisfying for pilot evaluation
Project-driven modelling: positive feedback from partners
To be repeated over a longer time period
22. LD4S: Performance Evaluation
Threshold = # requests / response time (sec)
compared to
Payload size sent + received by LD4S
3. Implementation Quality
Performance
Throughput decreases as payload increases, but not exponentially
Improvable by implementing a cache
23. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q2. How can sensors communicate
across different platforms without ad-hoc solutions?
Q3. How to identify which sensors are more
relevant sources of information to define a
specific small scope of interest?
26. Relevancy Prediction in Activity Logging
Algorithm:
1. From DataHub: sensors sharing Location & Time; activated sensors
2. EasyESA Similarity(X, Y)
3. Add to Distance Matrix
4. Clustering
Sensors in the same cluster are relevant for the same activity: Activity = Cluster
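Steps 2 and 3 of the algorithm above can be sketched as follows; the similarity scores stand in for EasyESA responses and are made-up values:

```python
# Sketch of steps 2-3: turn pairwise semantic similarities into a symmetric
# distance matrix ready for clustering. Similarity values are illustrative;
# in the thesis they come from EasyESA.
def distance_matrix(sensors, similarity):
    """distance = 1 - similarity, so semantically similar sensors are close."""
    n = len(sensors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - similarity(sensors[i], sensors[j])
            dist[i][j] = dist[j][i] = d
    return dist

# Hypothetical similarity scores for three features of interest:
scores = {frozenset({"fridge", "stove"}): 0.75,
          frozenset({"fridge", "tv"}): 0.5,
          frozenset({"stove", "tv"}): 0.5}
sim = lambda x, y: scores[frozenset({x, y})]
D = distance_matrix(["fridge", "stove", "tv"], sim)
print(D[0][1], D[0][2])  # 0.25 0.5
```

The resulting matrix is what the hierarchical clustering step (step 4) consumes.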
27. Relevancy Prediction: Distributional Semantics
Term frequency / inverse document frequency model. Each row of T corresponds to a term occurring in the document collection d_1, ..., d_n; each entry corresponds to the TF-IDF value of term t_i in document d_j:
T[i, j] = tf(t_i, d_j) × log(n / df_i)
where tf(t_i, d_j) is the term frequency of term t_i in document d_j, n is the total number of documents, and df_i is the number of documents containing term t_i.
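The TF-IDF weighting above can be sketched in a few lines; the toy documents below are illustrative only:

```python
# Minimal TF-IDF sketch matching the formula above:
# T[i, j] = tf(t_i, d_j) * log(n / df_i), with n total documents and
# df_i the number of documents containing term t_i.
import math

def tfidf(docs):
    n = len(docs)
    vocab = sorted({t for d in docs for t in d})
    # document frequency of each term
    df = {t: sum(t in d for d in docs) for t in vocab}
    return {(t, j): d.count(t) * math.log(n / df[t])
            for j, d in enumerate(docs) for t in vocab}

docs = [["sensor", "kitchen", "fridge"],
        ["sensor", "bathroom"],
        ["kitchen", "stove", "stove"]]
T = tfidf(docs)
# "stove" is rare (1 of 3 docs) and frequent in doc 2, so it outweighs
# the common term "sensor" (2 of 3 docs):
print(T[("stove", 2)] > T[("sensor", 0)])  # True
```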
28. Relevancy Prediction:
Hierarchical Clustering
Unweighted Pair Group Method
with Arithmetic mean (UPGMA)
Weighted Pair Group Method
with Arithmetic mean (WPGMA)
Farthest Point or Voorhees (VH)
Reflection
of Semantic
Distribution
Reflection of
Structural
Subdivision
Reflection of
Centrality
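A pure-Python sketch of the agglomerative step behind these three methods: "average" linkage approximates UPGMA, and "complete" is the farthest-point (Voorhees) criterion. This is an illustration under toy data, not the thesis implementation:

```python
# Minimal agglomerative clustering over a precomputed distance matrix.
def agglomerative(dist, n_clusters, linkage="average"):
    """dist: symmetric matrix (list of lists) of pairwise distances."""
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        pairs = [dist[i][j] for i in a for j in b]
        if linkage == "complete":       # farthest point (Voorhees)
            return max(pairs)
        return sum(pairs) / len(pairs)  # unweighted average (UPGMA-like)

    while len(clusters) > n_clusters:
        # find the closest pair of clusters and merge them
        x, y = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Toy distance matrix: sensors 0 and 1 are close, 2 and 3 are close.
D = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
print(agglomerative(D, 2))  # [[0, 1], [2, 3]]
```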
29. Predicting Sensor Relevancy for ADLs Logging
The clusters are compared with the sensors manually annotated as part of such activity logging. These annotations and readings are taken from the public dataset MITes [Tapia et al., 2004] and were collected during live experiment settings. We pre-processed this dataset (i.e., CSV files of sensor readings and metadata about both sensors and activities) to form HTTP PUT requests to the LD4S API for annotating and storing the data, as in Listing 5.7. Based on such comparison, the overall accuracy and precision of our system are calculated when applying either of the clustering algorithms UPGMA, WPGMA or VH.
PUT ld4s:device/2_99

payload: {'observed_property': 'switch',
          'location-name': ['Kitchen'],
          'foi': ['Fridge']}

headers: {'Content-type': 'application/json',
          'Accept': 'application/x-turtle'}

Listing 5.1: HTTP PUT request forwarded to the LD4S RESTful API.
DataHub (see Section 5.5) was then queried for all the sensor datasets available, thus returning a JSON list of details of these datasets such as their ID, title, tags, license and endpoint URIs. The system filters only those datasets that either have no license or an open license.
Table 5.1: Activities labelled in the MITes dataset (number of examples per class).

Activity              | Subject 1 | Subject 2
Preparing dinner      | 8         | 14
Preparing lunch       | 17        | 20
Listening to music    | -         | 18
Taking medication     | -         | 14
Toileting             | 85        | 40
Preparing breakfast   | 14        | 18
Washing dishes        | 7         | 21
Preparing a snack     | 14        | 16
Watching TV           | -         | 15
Bathing               | 18        | -
Going out to work     | 12        | -
Dressing              | 24        | -
Grooming              | 37        | -
Preparing a beverage  | 15        | -
Doing laundry         | 19        | -
Cleaning              | 8         | -
The highest semantic similarity value calculated was 1.0 for the pair ⟨switch, tv⟩, followed by 0.00036 for the pair ⟨switch, jewelry box⟩.
Relevancy Prediction:
Evaluation Data
27 FoIs → 351 similarity pairs
We considered the worst case in which only one of the sensors sharing the same location at the same time range has recently sensed a change in status for the current ongoing activity, while all the other nearby ones which will likely do so in the near future must be predicted. In this case, given n sensors, the amount of pairs to check for semantic relatedness is the binomial coefficient as in Equation 5.10. In our case, since sensors are grouped by different features of interest, there are 27 different types of sensors and 351 similarity pairs.
C(n, 2) = n! / (2! (n − 2)!)    (5.10)
Even though the binomial coefficient grows quickly, it only depends on the number of features of interest rather than on the amount of actually deployed sensors. At the same time, the amount of ICOs is expected to grow but the amount of types of sensors is not, since there is only so much in the real world that can be measured by sensors. Our method then is not expected to hinder the system from scaling.
Worst case scenario:
only one of the sensors sharing the same location at the
same time range has recently sensed a change in status for
the current ongoing activity
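Equation 5.10 in numbers: with 27 features of interest there are C(27, 2) = 351 unordered pairs to compare, as the slide states. A minimal check:

```python
# Number of unordered sensor pairs to check for semantic relatedness,
# per Equation 5.10: C(n, 2) = n! / (2! * (n - 2)!).
import math

def pairs_to_check(n):
    return math.comb(n, 2)

print(pairs_to_check(27))   # 351, matching the slide
print(pairs_to_check(216))  # 23220 pairs at the largest evaluated FoI count
```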
30. We compared the predicted clusters with the MITes annotations (i.e., the actual class). Consequently, we considered a 2-class classification problem, i.e., whether the sensors actually part of the same activity are clustered in the same cluster. As a result, a separate confusion matrix (Table 5.2) is computed for each of the annotated activities.

Table 5.2: Confusion matrix displaying the number of true positives, true negatives, false positives and false negatives for a 2-class classification problem.

Predicted \ Actual | Class 1 | Class 2
Class 1            | TP11    | FP12
Class 2            | FN21    | TN22

From such settings, we calculated precision and overall accuracy:

Precision = TP11 / (TP11 + FP12)    (5.11)
Accuracy = (TP11 + TN22) / (TP11 + TN22 + FP12 + FN21)    (5.12)
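Equations 5.11 and 5.12 in code, applied to hypothetical counts (the values below are illustrative, not the thesis results):

```python
# Precision and overall accuracy from a 2-class confusion matrix,
# per Equations 5.11-5.12. Counts are made-up example values.
def precision(tp, fp):
    return tp / (tp + fp)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

tp, fp, fn, tn = 80, 10, 15, 95  # hypothetical counts for one activity
print(round(precision(tp, fp), 3))         # 0.889
print(round(accuracy(tp, tn, fp, fn), 3))  # 0.875
```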
Relevancy Prediction:
Evaluation: Precision
[Bar chart: precision (%) of the activity clustering for WPGMA, UPGMA and VH across the activities Dressing, Cleaning, Toileting, Laundry, Dinner, WashingUp, Snack and Lunch; y-axis 0–100%.]
31. Relevancy Prediction:
Evaluation: Accuracy
[Bar chart: accuracy (%) of the activity clustering for WPGMA, UPGMA and VH across the activities Dressing, Cleaning, Toileting, Laundry, Dinner, WashingUp, Snack and Lunch; y-axis 0–80%.]
32. Relevancy Prediction:
Hierarchical Clustering
Unweighted Pair Group Method
with Arithmetic mean (UPGMA)
Weighted Pair Group Method
with Arithmetic mean (WPGMA)
Farthest Point or Voorhees (VH)
Reflection
of Semantic
Distribution
Reflection of
Structural
Subdivision
Reflection of
Centrality
33. Relevancy Prediction:
Evaluation: Comparison with SoTA
[Figure 5.6: Comparison between accuracy percentages achieved by the clustering algorithms for some of the activities: Dressing, Cleaning, Toileting, Laundry, Dinner, WashingUp, Snack, Lunch.]
Table 5.3: Comparison between the experiment setup and results for our own approach and the previous closest research efforts.

                 | Kwon et al. | Wyatt et al. | Ours
# Sensors        | 3           | 100          | 200
# Activities     | 5           | 26           | 16
Collection Time  | 50 mins     | 360 mins     | 2 weeks
Goal             | AR          | AI           | RSP
Algorithms       | HIER        | HMM          | UH
Precision        | 79%         | 70%          | 89%
Accuracy         | -           | 52%          | 69%
Our results are relevant, as our system improved the accuracy by 32% and the precision by 5% with respect to such previous efforts from the state of the art.
Increase of 32% accuracy
and 5% precision
34. Relevancy Prediction:
Evaluation: Performance
[Plot: Time Complexity Growth; time (msec) against number of Features of Interest (FoIs): 27, 54, 81, 112, 135, 162, 189, 216.]
HTTP PUT requests: 3ms
Overall Execution: 18ms
Dataset Discovery on DataHub: 3ms
(20 datasets)
LD4S SPARQL response: 246ms
ESA: 14ms (351 similarity pairs)
Easy-ESA response: 9ms
Highest time cost = 1 min 26 sec for comparing 216 FoIs
Possibility of updating sensors similarities at run-time
CoRE devices (RAM 4 kB and ROM 128 kB): pre-compute offline clustering
35. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q3. How to identify which sensors
are more relevant sources of
information to define a specific small
scope of interest?
Q4. How can contextualised sensors
improve the quality of traditional
Web content?
39. Enriched Web Content: G-Sensing
Bridging the gap between the Web and real places
Algorithm:
1. From DataHub: sensors sharing Location & Time
2. Extract Google Search results representing real places
3. Live data fetching
4. Result dictionary update
41. Enriched Web Content: Evaluation Deployment
DataHub
Clinic
Clinic
Clinic
30
sensors
1 km
LD4S
PUT <JSON sensor metadata>
G-Sensing
Google Places
3.692 locations
1.455 (39.4%) have a
website
Query: Acupuncture Galway Salthill
42. Enriched Web Content: Evaluation Coverage
How much of the area defined by the virtual locations overlaps with the city of Galway, within radius r = 150 m?
Coverage percentage as we
vary the vicinity radius
Added value of our approach for integrating live data into physical locations' websites.
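The coverage question above (how much of the city falls within radius r of a virtual location) can be sketched as a Monte Carlo estimate. The function name, coordinates and box size below are illustrative assumptions, not the thesis implementation:

```python
# Monte Carlo sketch: estimate the fraction of a bounding box covered by
# circles of radius r centred on the virtual locations. Planar metres are
# used instead of geographic coordinates for simplicity.
import math
import random

def coverage(locations, bbox, r, samples=20000, seed=42):
    """Fraction of bbox = (min_x, min_y, max_x, max_y) lying within
    distance r of at least one location."""
    random.seed(seed)
    min_x, min_y, max_x, max_y = bbox
    hits = 0
    for _ in range(samples):
        px = random.uniform(min_x, max_x)
        py = random.uniform(min_y, max_y)
        if any(math.hypot(px - x, py - y) <= r for x, y in locations):
            hits += 1
    return hits / samples

# One virtual location in the middle of a 1000 m x 1000 m box, r = 150 m:
# the true coverage is pi * 150^2 / 1000^2, roughly 7%.
print(coverage([(500.0, 500.0)], (0.0, 0.0, 1000.0, 1000.0), 150.0))
```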
43. Enriched Web Content: Evaluation Distribution
• We divided the area of Galway into squares with different side lengths l and counted the number of virtual locations within each square.
• The number of virtual locations per square and their respective frequency shows a power-law relationship: while most squares only contain a small set of locations, a few squares contain a very large number of locations (e.g., city centres, business parks).
44. Enriched Web Content: Evaluation Performance
• Google search result page: ~145 KB
• After enabling G-Sensing: ~175 KB (~20% increase)
• At browser start-up: query to DataHub for data source discovery: 3 ms
○ 20 sensor datasets discovered
○ 3 sensor datasets have an open license + expose a SPARQL endpoint
○ 1 sensor dataset’s SPARQL endpoint was accessible (LD4S): 246 ms
G-Sensing does not impede a user's browsing experience
Bandwidth Overhead
Response Time
45. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q4. How can contextualised sensors
improve the quality of traditional
Web content?
Q5. How can contextualised
sensors improve the adaptability of
mobile constrained and
heterogeneous sensor networks?
47. Outline
1. Linked Sensor Data Model [Q1]
2. LD4Sensors Web Service [Q2]
3. Sensor Relevancy Prediction [Q3]
4. Enriched Web Content [Q4]
5. Network Adaptability [Q5]
6. Research Answers
7. Lessons Learned and Future Work
Core Research and Results
Conclusion
Q5. How can contextualised
sensors improve the adaptability of
mobile constrained and
heterogeneous sensor networks?
49. Future Work
• Filtering
• of links according to the LD4S resource rating/review system
• of sensor data injected into Google Search results according to the user's preferences and context
• Extending
• derive labels of activities beyond the per-activity sensor clustering
• sensor data injected into any Web page and content
• sensor data sources extended to include, e.g., TripAdvisor and other user-generated content
• collect users' feedback on auto-derived annotations for incremental learning
• Evaluation
• Long-term large-scale user study to gather insights into how users really use
the current functionalities offered by LD4S
• Other areas of research
• Sensor-triggered data can feed back to Linguistic Linked Data knowledge