WOTS2E: A Search Engine for a
Semantic Web of Things
Unit for Reasoning, Querying and Stream Processing
Insight Centre for Data Analytics,
National University of Ireland, Galway
Andreas Kamilaris, Semih Yumusak, Ali Intizar
World Forum – IoT 2016
Reston, VA, USA- December 12-14, 2016
https://www.w3.org/WoT/images/iot.png
Web of Things
• Designed to connect
“things” to the Web
• A combination of
• Approaches
• Software Architectures
• Interfaces
• Increase Interoperability
among IoT platforms
• Mitigate Silo Architecture
• Avoid Multiple and Conflicting
Standards
• Global and Easy Discovery of
Devices
Why do we need the Web of Things?
• A few of the emerging WoT
platforms
• SOCRADES
• ThingWorx
• SpitFire
• Evrythng
• Open.Sen.se
• WoTKit
• Auto WoT
• Xively
Web of Things Platforms
• Can we improve the discoverability of the Web of Things?
• Can we use semantic technologies to improve device
discoverability?
• Are there any datasets produced by WoT devices available as
Open Data on the Web?
• Can we create a global and distributed index for search and
discovery of WoT devices?
Our Motivation
Discovery
• Machines need to automatically discover devices/things and their
descriptions
• Global repositories
• Indexing Things and their description
• Semantic Annotation to describe things
• SPARQL queries and data endpoints
• Discover devices on the fly (Late Binding)
Slide Source: ISWC 2016, Tutorial on Semantic Web Meets IoT/WoT (Soumya Kanti Datta)
Repository Based Discovery
Slide Source: ISWC 2016, Tutorial on Semantic Web Meets IoT/WoT (Soumya Kanti Datta)
Device Discovery Mechanisms
• Spatial Search
– BLE beacon based things
• Network Based Search
– mDNS, multicast CoAP
• Device Registration Directories
– CoRE resource directory, XMPP IoT directory, HyperCat
• Meta-Data Discovery
– CoRE Link Format
• Semantic Search and Discovery Techniques
– Offers high richness in search queries
• E.g. “search for all light bulbs in my house”
Slide Source: ISWC 2016, Tutorial on Semantic Web Meets IoT/WoT (Soumya Kanti Datta)
Semantic Search & Discovery: Key Challenges
• Optimal Data Source Discovery
• Streams are everywhere
• Multiple data streams can answer the same query
• Optimal data stream selection
• Catering for user-defined constraints and preferences
• On-Demand Stream Federation
• Automated composition of primitive data streams to
answer complex queries
• Adaptation
• Data source properties can change over time
• Make sure selected sources remain “optimal”
throughout life cycle of the query
Stream Discovery, Federation and Adaptation
• Stream Discovery
– Data interoperability:
• Semantic descriptions (ontologies and annotations)
– Interface interoperability:
• Streams as event services (service discovery)
• Stream Federation
– Efficient processing of complicated event logics
• Data Stream Management Systems
• Complex Event Processing
Enabling technologies: Semantic Web, Service-Oriented Architectures, DSMS and CEP
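Semantic Description
• A sensor service description is annotated as:
s_desc = (t_d, g, q_d, P_d, FoI_d, f_d)
– t_d: type
– g: grounding
– q_d: QoS
– P_d: observed properties
– FoI_d: features of interest
– f_d: P_d → FoI_d (maps each observed property to its feature of interest)
• Similarly, a sensor service request is annotated as:
s_r = (t_r, P_r, FoI_r, f_r, pref, C)
– t_r: type
– P_r: requested properties
– FoI_r: features of interest
– f_r: P_r → FoI_r
– pref, C: NFP (non-functional property) preferences and constraints
– A request carries no grounding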
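The two tuples can be sketched as plain data structures. The Python below is an illustration only: the class names, field names, and the matching rule (same type, every requested property observed on the same feature of interest, every QoS constraint satisfied) are assumptions made for this sketch, not the matching algorithm used by the authors.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceDescription:
    """s_desc = (t_d, g, q_d, P_d, FoI_d, f_d); field names are illustrative."""
    type: str            # t_d
    grounding: str       # g, e.g. the URL where the service can be reached
    qos: dict            # q_d, e.g. {"latency_ms": 200}
    property_foi: dict   # f_d: observed property (P_d) -> feature of interest (FoI_d)


@dataclass
class ServiceRequest:
    """s_r = (t_r, P_r, FoI_r, f_r, pref, C); a request has no grounding."""
    type: str                                         # t_r
    property_foi: dict                                # f_r: P_r -> FoI_r
    preferences: dict = field(default_factory=dict)   # pref (not used in this sketch)
    constraints: dict = field(default_factory=dict)   # C: QoS name -> maximum allowed


def matches(desc: ServiceDescription, req: ServiceRequest) -> bool:
    """Hypothetical matching rule, assumed for illustration."""
    if desc.type != req.type:
        return False
    # Every requested property must be observed on the requested feature of interest.
    for prop, foi in req.property_foi.items():
        if desc.property_foi.get(prop) != foi:
            return False
    # Every constraint is treated as an upper bound on a QoS value.
    for name, limit in req.constraints.items():
        if name not in desc.qos or desc.qos[name] > limit:
            return False
    return True
```

Preferences would typically be used to rank the descriptions that pass this filter; that ranking step is left out here.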
Middleware for Stream Discovery and Federation
[Figure: ACEIS architecture diagram. Labels shown: Semantic Annotation; ACEIS Core; Resource Management; Application Interface; Knowledge Base; QoI/QoS; Stream Description; Data Mgmt, Indexing, Caching; User Input; Event Request; Data Federation; Resource Discovery; Event Service Composer; Composition Plan; Subscription Manager; Query Transformer; Query Engine; Query Results; Constraint Validation; Constraint Violation; Adaptation Manager; Data Store; IoT Data Stream; Social Data Stream]
Web of Things Discovery
• Optimal Data Source Discovery
• Web of Things Search Space is
Global
• Across the whole Web
• Indexing
• Geo-Spatial Mapping
• Movable Objects/Things
• Require Frequent Updates in
Indexes
Problem Statement
Linked Open Data Cloud
WOTS2E: Overview
• A search engine to
discover semantic
meta-descriptions of things
• Crawls the Web to
discover Linked Data
Sources
• Analyzes Linked Data
sources to identify
relevant WoT devices
WOTS2E: Overview
• Maintain a registry of
devices for discovery
• Support application
requests and provide the
details needed to interact
with the devices
WOTS2E: Operations
• Discovery of Linked Data Endpoints
– Web crawlers continuously scan the
Web for Linked Data endpoints,
at a frequency of one scan per day
• Examination of Discovered Linked
Data Endpoints
– Query endpoints for IoT/WoT relevant
ontologies
– Explore popular dataset descriptions, such
as VoID, SPARQL-SD
• Analysis of Linked Data Endpoints
and WoT Device Discovery
– Through SPARQL queries,
VoID/SPARQL-SD files, use of open APIs
• Recording of WoT Devices and
Services Discovered
– Service type, location, time, features,
interaction types, restrictions etc.
WOTS2E: Implementation/Analysis
Common Patterns
<meta name="Keywords" content="OpenLink Virtuoso Sparql">
Virtuoso SPARQL Query Editor
OpenLink Software
<label for="debug">Strict checking of void variables</label>
<a href="http://www.openlinksw.com/virtuoso">
<a href="/isparql">iSPARQL</a>
<label for="query">Query text</label>
• SPARQL Endpoint Listed at Datahub.io
• Common Patterns Identification
• SPARQL endpoint discovery
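As a rough illustration of how such common patterns could flag a page as a likely SPARQL endpoint front end, the Python sketch below counts how many marker strings occur in a page's HTML. The marker list is taken from the patterns on this slide; the function names and the threshold of two hits are assumptions made for this sketch, not the rule used by WOTS2E or SpEnD.

```python
# Marker strings from the slide, lower-cased for case-insensitive matching.
COMMON_PATTERNS = [
    "openlink virtuoso sparql",
    "virtuoso sparql query editor",
    "strict checking of void variables",
    "/isparql",
    "query text",
]


def pattern_hits(html: str) -> list:
    """Return the marker strings that occur in the page source."""
    page = html.lower()
    return [p for p in COMMON_PATTERNS if p in page]


def is_candidate_endpoint(html: str, min_hits: int = 2) -> bool:
    """Flag a page as a candidate endpoint; min_hits is an assumed threshold."""
    return len(pattern_hits(html)) >= min_hits


sample = (
    "<title>Virtuoso SPARQL Query Editor</title>"
    '<label for="query">Query text</label>'
)
print(pattern_hits(sample))
print(is_candidate_endpoint(sample))
```

A page flagged this way is only a candidate; it still has to be probed with an actual SPARQL query before it counts as a valid endpoint.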
WOTS2E: Implementation/Analysis
• Discovered patterns are used
as an input to our web
crawlers, in order to search the
web for available SPARQL
endpoints.
• For web crawling, we used a
meta-crawling service called
SpEnD.
• SpEnD exploits the search
functionality available over
popular search engines to
accelerate the performance of
web crawling.
WOTS2E: Implementation/Analysis
• To analyze the URLs retrieved, the Jena Framework was
used to send SPARQL queries to the candidate endpoints,
checking whether they are valid SPARQL endpoints or not.
• All valid SPARQL endpoints were then examined to determine
whether they contain information related to IoT/WoT, i.e.
whether they use relevant ontologies.
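# Validity check: any working endpoint should answer this query
SELECT DISTINCT ?Concept WHERE {[] a ?Concept} LIMIT 100
# Relevance check: look for IRIs that mention the SSN ontology
SELECT * WHERE {
{?s ?p ?o} UNION {GRAPH ?g {?s ?p ?o}}
FILTER isIRI(?o) FILTER regex(str(?o), "/SSN")
}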
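The validity check can be sketched in a few lines. The slides state that the Jena Framework was used; the Python below is a stand-in that uses only the standard library, sends the first probe query as an HTTP GET with a `query` parameter (as defined by the SPARQL 1.1 Protocol), and accepts the endpoint if the reply parses as SPARQL JSON results. The function names, the 10-second timeout, and the choice to accept only JSON replies are assumptions made for this sketch.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PROBE_QUERY = "SELECT DISTINCT ?Concept WHERE {[] a ?Concept} LIMIT 100"


def build_probe_url(endpoint: str, query: str = PROBE_QUERY) -> str:
    """Append the query as a URL-encoded 'query' parameter."""
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})


def is_sparql_json(body: str) -> bool:
    """True if the body looks like a SPARQL JSON results document."""
    try:
        doc = json.loads(body)
    except ValueError:
        return False
    return (
        isinstance(doc, dict)
        and "head" in doc
        and ("results" in doc or "boolean" in doc)
    )


def probe(endpoint: str, timeout: float = 10.0) -> bool:
    """Send the probe query; any network or parsing failure counts as invalid."""
    request = Request(
        build_probe_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return False
    return is_sparql_json(body)
```

Endpoints that answer only in XML would be rejected by this sketch, so a fuller check would also parse the SPARQL XML results format.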
• After labeling a SPARQL endpoint as related to IoT/WoT, the next step was to
analyze it to discover which devices/services are available through it:
– VoID file adapted to reveal information about WoT devices and services
– Extend SSN to include discovery and description information through some
ontology that describes web services
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix : <http://example.org/> .   # placeholder namespace, not from the slides
:ExampleDataset a void:Dataset;
void:subset :ExampleSensor .
:ExampleSensor a void:Dataset;
dcterms:title "WoT Example Sensor";
dcterms:description "http://../sens.wadl";
dcterms:contributor "Insight Centre";
dcterms:source "140.203.154.11" .
WOTS2E: Implementation/Analysis
WOTS2E: Implementation/Analysis
• Information about SPARQL endpoints, devices/services
discovered and sensor meta-data information was stored
as RDF triples on a Virtuoso RDF store, installed on
WOTS2E.
• Use of the IoT ontology
PREFIX iot: <http://purl.org/IoT/iot#>
PREFIX ssn: <http://purl.oclc.org/../ssn#>
SELECT ?sm ?device ?service
WHERE {
?sm a iot:smart_entity .
?sm iot:has_part_device ?device .
?device ssn:observes ?service .
?service a iot:Temperature .
}
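To make the join performed by this query concrete, the Python sketch below evaluates the same four triple patterns over a small in-memory list of triples. It is not how Virtuoso executes the query; the `ex:` resources and the `iot:Humidity` class in the sample data are hypothetical and exist only for illustration.

```python
RDF_TYPE = "rdf:type"  # the SPARQL keyword 'a' is shorthand for rdf:type


def find_temperature_services(triples):
    """Return (sm, device, service) bindings for the four patterns of the query."""
    store = set(triples)
    results = []
    for sm, p1, o1 in store:
        # ?sm a iot:smart_entity
        if (p1, o1) != (RDF_TYPE, "iot:smart_entity"):
            continue
        for s2, p2, device in store:
            # ?sm iot:has_part_device ?device
            if s2 != sm or p2 != "iot:has_part_device":
                continue
            for s3, p3, service in store:
                # ?device ssn:observes ?service . ?service a iot:Temperature
                if (
                    s3 == device
                    and p3 == "ssn:observes"
                    and (service, RDF_TYPE, "iot:Temperature") in store
                ):
                    results.append((sm, device, service))
    return sorted(results)


# Hypothetical sample data: one smart entity with one device observing two things.
sample_triples = [
    ("ex:room1", RDF_TYPE, "iot:smart_entity"),
    ("ex:room1", "iot:has_part_device", "ex:dev1"),
    ("ex:dev1", "ssn:observes", "ex:temp1"),
    ("ex:temp1", RDF_TYPE, "iot:Temperature"),
    ("ex:dev1", "ssn:observes", "ex:hum1"),
    ("ex:hum1", RDF_TYPE, "iot:Humidity"),
]
print(find_temperature_services(sample_triples))
```

Only the temperature observation survives the last pattern, so the humidity service is filtered out even though the same device observes it.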
WOTS2E: Evaluation
• The SpEnD (meta-)crawling service ran for 24 hours
• Using the common patterns for SPARQL endpoints
• Relevant URLs from the Bing, Yahoo, Google, Baidu, and
Yandex search engines.
• Comparison of discovered endpoints with Datahub.
Source     Active   Inactive   Total
WOTS2E        638        640    1278
Datahub       258        296     554
WOTS2E: Evaluation
• We examined the 638 active SPARQL endpoints discovered, one by one,
for relevance to IoT/WoT
Ontology                               Number of Endpoints
SSN                                    13
DBPedia                                13
SmartBuilding                          3
DogOnt                                 2
DUL                                    2
km4city                                2
OpenEI                                 2
RDFS, SKOS                             4
Fan Fpai, Fiemser, IoT, PROV, SAREF    5 (one endpoint per ontology)
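As a quick arithmetic check, the per-ontology counts in the table can be summed and compared with the 638 active endpoints. The short Python sketch below does this under the assumption that each endpoint is counted under exactly one row; the variable names are illustrative.

```python
# Endpoint counts copied from the table; grouped rows are kept as single entries.
endpoints_per_ontology = {
    "SSN": 13,
    "DBPedia": 13,
    "SmartBuilding": 3,
    "DogOnt": 2,
    "DUL": 2,
    "km4city": 2,
    "OpenEI": 2,
    "RDFS, SKOS": 4,
    "Fan Fpai, Fiemser, IoT, PROV, SAREF": 5,
}
ACTIVE_ENDPOINTS = 638

relevant = sum(endpoints_per_ontology.values())
share_percent = 100 * relevant / ACTIVE_ENDPOINTS
print(relevant, round(share_percent, 1))
```

The total of 46 endpoints, about 7.2% of the active ones, agrees with the figure quoted in the Conclusion.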
WOTS2E: Evaluation
• IoT/WoT-specific triples from the endpoints
Ontology         Number of Triples
SSN              1,433,248
DUL              182
km4city          56
Fiemser          50
OpenIoT          44
SmartBuilding    36
DogOnt           24
SAREF            4
Fan Fpai         2
Conclusion
• Semantic Search and Discovery is essential for Web of
Things
• Currently only a handful of available SPARQL endpoints
(46, 7.2%) seem to relate to IoT/WoT.
• Lack of available metadata
• Need for standardization of discovery mechanisms
• Our method proposes a way to achieve discovery on the
SWoT seamlessly and with minimum effort.
• WOTS2E can support applications looking for on-the-fly
discovery and integration of devices
Future Work
• Improve Search mechanism by designing good
vocabularies/ontologies and descriptions for IoT/WoT
devices, services and data.
• A user-friendly WOTS2E website that lets users
incrementally access the discovered lists of services in a
well-organized way.
• From meta-crawling to efficient (classical) web crawling.
• Contribute to the standardization efforts on the WoT (W3C
WoT IG, OGC Sensor Web Interface for IoT SWG),
promoting WOTS2E as a viable solution for a SWoT search
engine.
