From Data Platforms to Dataspaces:
Enabling Data Ecosystems for Intelligent Systems
Edward Curry
Insight @ NUI Galway
edward.curry@nuigalway.ie
Open Access Book
Contents
Part I: Fundamentals and Concepts
Part II: Data Support Services
Part III: Stream and Event Processing Services
Part IV: Intelligent Systems and Applications
Part V: Future Directions
Team
http://dataspaces.info
Part I: Fundamentals and Concepts
Data Driven Innovations
Digital Twins: A digital replica of physical
assets (car), processes (value-chain), systems,
or physical environments (building). The
digital representation (i.e. simulation
modelling or data-driven models) provided by
the digital twin can be analysed to optimise
the operation of the “physical twin”.
Physical-Cyber-Social (PCS): A computing
paradigm that supports a richer human
experience with a holistic data-rich view of
the smart environment that integrates,
correlates, interprets, and provides
contextually relevant abstractions to humans.
Mass Personalisation: More human-centric
thinking in the design of systems where users
have growing expectations for highly
personalised digital services for the “Market
of One”.
Data Network Effects: As more systems/users
join and contribute data to the smart
environment, a “network effect” can take
place, resulting in the overall data available
becoming more valuable.
[Figure: Digital Twins. Labels: Real World, Digital World; Sensors, Actuators; Observe, Orient, Decide, Act; Physical Twin (asset-centric); Digital Twin (system-centric).]
Connected Intelligent Systems
Value Chains in Data Ecosystems
Data Management Challenges
• Pay-as-you-go Data Integration, Accessibility, and Sharing
– Standard data syntax, semantics, and linkage: Facilitate integration and sharing, ideally with open standards
and non-proprietary approaches.
– Single-point data discoverability and accessibility: Allow the organisation and access to datasets and
metadata through a single location.
– Incremental data management: Enable a low barrier to entry and a pay-as-you-go paradigm to minimise
costs.
• Secure Access Control: Support data access rights to preserve the security of data and privacy of
users in the smart environment.
• Real-time Data Processing and Historical Querying
– Real-time data processing: Including ingestion, aggregation, and pattern detection within event streams
originating from sensors and things in the smart environment.
– Unified querying of real-time data and historical data: Provide applications and end-users with a holistic
queryable state of the smart environment at a latency suitable for user interaction.
• Entity-centric Data Views
– Entity management: The storage, linkage, curation, and retrieval of entity data, such as users, zones, and
locations.
– Event enrichment: Enhancement of sensor/things streams with contextual data (e.g. entities) to make the
stream data more encapsulated and useful in downstream processing.
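The event enrichment challenge above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory entity catalogue keyed by sensor identifier; the names `Entity`, `ENTITIES`, and `enrich`, and the sample values, are invented for illustration and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Contextual record for a managed entity (hypothetical shape)."""
    entity_id: str
    kind: str
    zone: str


# Hypothetical entity catalogue keyed by the sensor that observes the entity.
ENTITIES = {
    "meter-17": Entity("tap-kitchen-1", "tap", "kitchen"),
}


def enrich(event: dict, entities: dict) -> dict:
    """Return a copy of the event with entity context attached when known."""
    enriched = dict(event)
    entity = entities.get(event.get("sensor_id"))
    if entity is not None:
        enriched["entity"] = {
            "id": entity.entity_id,
            "kind": entity.kind,
            "zone": entity.zone,
        }
    return enriched


raw = {"sensor_id": "meter-17", "litres": 2.5}
enriched = enrich(raw, ENTITIES)
```

Events from unknown sensors pass through unchanged, so downstream processing can start before every sensor has been linked to an entity.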
The “gold mining” metaphor applied to data processing
Traditional Approaches to Data Integration
[Figure: long-tail chart. Axes and labels: Frequency of use (Popularity/Use), Low to High; Number of data sources, entities, attributes; Cost of administration & semantic integration using traditional approaches, Low to High.]
Data is Key to AI…Data Platforms will Fuel AI Decisions
• Data Generation and Analysis (including IoT)
• Data Platforms (Access and Portability)
• AI and Decision Platforms
IoT-Enablement
A Data Sharing Layer is needed…
• Layer 1 - Communication and Sensing: IPv6, Wi-Fi, RFID, CoAP, AVB, etc.
• Layer 2 - Middleware: Peer-to-Peer, Events, Pub/Sub, SOA, SDN, etc.
• Layer 3 - Data: Schema, Entities, Catalog, Sharing, Access/Control, etc.
• Layer 4 - Intelligent Apps, Analytics, and Users
Other figure labels: Datasets; Things / Sensors; Contextual Data Sources (including legacy systems); Predictive Analytics; Situation Awareness; Decision Support; Digital Twin; Machine Learning; Users.
Adapted from: L. Atzori, A. Iera, and G. Morabito, “The
Internet of Things: A survey,” Comput. Networks, vol. 54,
no. 15, pp. 2787–2805, Oct. 2010.
Cost of Data Management Solutions
Administrative Proximity:
– With close control many assumptions
can hold concerning guarantees such
as data quality and consistency.
– Far control refers to a loosely coupled
environment and a lack of
coordination on the data sources.
Semantic Integration
– Degree to which data schemas are
matched up (types, attributes, and
names).
– All data conform to an agreed-upon
schema vs. no schema information.
This dimension is relevant to how
much semantically rich querying can
be done.
Halevy, A., Franklin, M. and Maier, D. 2006. Principles of dataspace
systems. 25th ACM SIGMOD-SIGACT-SIGART symposium on Principles of
database systems - PODS ’06 (New York, New York, USA, 2006), 1–9.
(Real-time Linked) Dataspace
Principles (adapted from Halevy et al.):
• Must deal with many different formats of
streams and events.
• Does not subsume the stream and event
processing engines; they still provide
individual access via their native interfaces.
• Queries are provided on a best-effort
and approximate basis.
• Must provide pathways to improve the
integration among the data sources,
including streams and events, in a
pay-as-you-go fashion.
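The best-effort principle above can be sketched as a query that asks every participating source and returns whatever answers arrive, along with the names of the sources that could not respond. This is a minimal illustration, not the RLD query service; `best_effort_query`, `csv_source`, and `offline_source` are hypothetical names with invented data.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A source is any callable that takes a keyword and yields matching records.
Source = Callable[[str], Iterable[dict]]


def best_effort_query(keyword: str, sources: Dict[str, Source]) -> Tuple[List[dict], List[str]]:
    """Query every source; skip the ones that fail instead of aborting."""
    results: List[dict] = []
    skipped: List[str] = []
    for name, search in sources.items():
        try:
            results.extend(search(keyword))
        except Exception:
            # A failing source makes the answer approximate, not invalid.
            skipped.append(name)
    return results, skipped


def csv_source(keyword: str) -> List[dict]:
    rows = [{"name": "water meter A"}, {"name": "energy meter B"}]
    return [row for row in rows if keyword in row["name"]]


def offline_source(keyword: str) -> List[dict]:
    raise ConnectionError("source unavailable")


results, skipped = best_effort_query(
    "water", {"csv": csv_source, "legacy": offline_source}
)
```

Returning the skipped sources alongside the results lets a caller judge how approximate the answer is.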
Dataspace
“Dataspaces are not a data integration
approach; rather, they are more of a data
co-existence approach. The goal of dataspace
support is to provide base functionality over
all data sources, regardless of how integrated
they are.” (Halevy, A., Franklin, M. and Maier, D. 2006.)
Real-time Linked Dataspace (RLD)
Enabling platform for data management for
intelligent systems within smart environments
that combines the pay-as-you-go paradigm of
dataspaces, linked data, and knowledge
graphs with entity-centric real-time query
capabilities.
Approximate and Best Effort Approaches
[Figure: long-tail chart. Axes and labels: Frequency of use (Popularity/Use), Low to High; Number of data sources, entities, attributes; Cost of administration & semantic integration using traditional approaches, Low to High; region marked for approximate & best-effort approaches.]
Architecture of Real-time Linked Dataspace
• Support Platform: Responsible for providing
the functionalities and services essential for
managing the dataspace.
• Things / Sensors: Produce real-time data
streams that need to be processed & managed.
• Data Sources: Available in a wide variety of
formats and accessible through different
systems interfaces.
• Managed Entities: Actively managed entities
including their relationship to participating
things, data sources, and other entities.
• Intelligent Applications, Analytics, & Users:
Leverage RLD data and services to provide
data analytics, decision support tools, user
interfaces, and data visualisations.
Pay-as-you-Go Tiered Data Model
• Provides flexibility by reducing
the initial cost and barriers to
joining the dataspace.
• Specialisation of the 5 star
scheme defined by
Tim Berners-Lee.
• Over time the level of integration
with the support services can be
improved in an incremental
manner on an as-needed basis.
• The more investment made to
integrate with the support
services, the better the integration
achievable in the dataspace.
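The incremental, as-needed improvement described above can be sketched as moving a data source up one tier at a time. The slide text does not list the RLD tiers themselves, so this sketch assumes Tim Berners-Lee's original 5 star open data levels as a stand-in; `Stars` and `upgrade` are hypothetical names.

```python
from enum import IntEnum


class Stars(IntEnum):
    """Berners-Lee's 5 star open data levels (stand-in for the RLD tiers)."""
    ON_THE_WEB = 1        # available online under an open licence
    MACHINE_READABLE = 2  # published as structured data
    OPEN_FORMAT = 3       # uses a non-proprietary format
    URIS = 4              # uses URIs to identify things
    LINKED = 5            # links to other data for context


def upgrade(current: Stars) -> Stars:
    """Move a source up one tier, stopping at the top tier."""
    return Stars(min(current + 1, Stars.LINKED))


level = Stars.ON_THE_WEB   # a source joins at the lowest-cost tier
level = upgrade(level)     # later investment raises its integration level
```

Because each step is optional, a source can stay at a low tier indefinitely and still take part in the dataspace.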
Service Tiers for Support Services
Part II: Data Support Services
Part III: Stream and Event Processing Services
Data Self-Management
Techniques for:
• Self-Configuration
• Self-Healing
• Self-Optimizing
Automatic Source
Selection
• Source Selection
• Source Replacement
• Model Selection
• Model Training
• Parameterization
Entity Data Management and Humans in the Loop
Enables Users in the Smart
Environment to participate in
data management tasks
• Collection & Enrichment
• Mapping & Matching
• Operator Evaluation
• Feedback & Refinement
• Citizen Actuation
Key HIL Challenges
• Task Specification (simplicity)
• Interaction Mechanism
• Task Assignment (Geospatial,
expertise)
Semantic Approximation Matching of Streams
Challenges
• Heterogeneity in Event
Semantics (000s of schemas)
• Heterogeneity in Processing
Rules (000s of rules tied to
schemas)
Approx. Semantic Event Matcher
• Sub-symbolic Distributional
Event Semantics
• Enables pay-as-you-go event
matching for data streams
• Replaced 48,000 exact rules with
100 approximate rules with
around 85% accuracy
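The general idea of matching events by semantic similarity rather than exact string equality can be shown with a toy sketch. It is not the matcher from the slides: the three-dimensional vectors are invented by hand, whereas a distributional model would learn them from a corpus, and `approx_match` and the 0.9 threshold are assumptions for illustration.

```python
import math

# Invented toy vectors; a real system would learn these from a text corpus.
VECTORS = {
    "energy": (0.9, 0.1, 0.0),
    "power": (0.8, 0.2, 0.1),
    "water": (0.1, 0.9, 0.1),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def approx_match(rule_term, event_term, threshold=0.9):
    """Match when the terms are close in vector space; fall back to
    exact comparison for terms without a vector."""
    if rule_term not in VECTORS or event_term not in VECTORS:
        return rule_term == event_term
    return cosine(VECTORS[rule_term], VECTORS[event_term]) >= threshold


events = ["power", "water", "energy"]
matched = [e for e in events if approx_match("energy", e)]
```

One approximate rule on "energy" here also accepts "power" events, which is how a small set of approximate rules can stand in for many exact ones, at the price of some misclassification.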
Part IV: Intelligent Systems and Applications
Pilot sites: location, target users, and infrastructure

Airport: Linate Airport, Milan, Italy
– Target users: Corporate users; ~9.5 million passengers; Utilities management; Maintenance staff; Environmental managers
– Infrastructure: Safety critical; 10 km water network; Multiple buildings; Water meters; Energy meters; Legacy systems

Office: Insight, Galway, Ireland
– Target users: 130 staff; Office consumers; Operations managers; Utility providers; Building managers
– Infrastructure: 2190 m² space; 22 offices + 160 open plan spaces; Conference room; 4 meeting rooms; 3 kitchens; Data centre; 30 person café; Energy meters

Home: Houses, Thermi, Greece
– Target users: Domestic consumers (adults, young adults and children); Utility providers
– Infrastructure: 10 households; Typical variety of domestic settings including kitchen, showers, baths, living room, bedrooms, and garden; Water meters

Mixed Use: Engineering, NUI Galway
– Target users: Mixed/Public consumers; Building managers; 100 staff; 1000 students (ages 18 to 24)
– Infrastructure: Water meters; Energy meters; Rainwater harvesting; Café; Weather station; Wet labs; Showers

School: Coláiste na Coiribe, Ireland
– Target users: Mixed/Public consumers; School management; Maintenance staff; 500 students (ages 12 to 18); 40 teachers
– Infrastructure: Water meters; Energy meters; Rainwater harvesting
Smart Water and Energy Management Pilots
• Smart Airport: Milan Linate, Italy
• Smart Office: Galway, Ireland
• Smart Homes: Municipality of Thermi, Greece
• Mixed Use: Galway, Ireland
• Smart School: CnaC School in Galway, Ireland
Different target users must be addressed: Corporate Staff, Passengers, Operational Staff, Researchers, Application Developers, Data Scientists, Building Managers, University Students, Families, Teaching Staff, School Students.
IoT-enabled Digital Twins and Intelligent Applications
[Figure: applications built on the Real-time Linked Dataspace, arranged around the “OODA” loop (Observe, Orient, Decide, Act).]
• Inputs: Datasets, Things / Sensors
• Support services: Entity Management Service, Catalog & Access Control Service, Search & Query Service, Entity-Centric Real-Time Query Service, Complex Event Processing Service, Human Task Service
• Applications: Personal Dashboard, Public Dashboards, Decision Analytics and Machine Learning, Notifications, Apps, Alerts, Digital Twin
Example Applications
• Interactive Public Displays
• Alerts and Notifications
• Personalised Dashboards
Experiences and Lessons Learnt from Dataspaces
• Developers need education on stream processing and approximate results
• Incremental data management can support agile software development
• Build the business case for data-driven innovation
• Integration with legacy data is a significant cost in smart environments
• The 5 star pay-as-you-go model simplified communication with non-technical
users
• A secure canonical source for entity data simplifies application development
• Data quality with things and sensors is challenging in an operational
environment
• Working with three pipelines adds overhead (LAMBDA + Entity Layer)
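The overhead of working with three pipelines can be made concrete with a small sketch. It assumes, as an interpretation the slide does not spell out, that the three pipelines are a batch view and a real-time view (the two Lambda-style paths) plus an entity store; all names and values are invented for illustration.

```python
# Hypothetical per-sensor water usage in litres.
batch_view = {"meter-17": 120.0}                  # computed by the last batch run
speed_view = {"meter-17": 2.5, "meter-18": 1.0}   # events since that batch run
entity_store = {"meter-17": "kitchen tap", "meter-18": "garden hose"}


def unified_usage(batch: dict, speed: dict, entities: dict) -> dict:
    """Merge historical and real-time totals, labelled by entity."""
    totals = {}
    for sensor_id in set(batch) | set(speed):
        # Fall back to the raw sensor id when no entity is linked yet.
        label = entities.get(sensor_id, sensor_id)
        totals[label] = batch.get(sensor_id, 0.0) + speed.get(sensor_id, 0.0)
    return totals


usage = unified_usage(batch_view, speed_view, entity_store)
```

Every query has to touch all three stores and reconcile their keys, which is where the extra development and operational effort comes from.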
Part V: Future Directions
Large-scale Decentralised Support Services
• Enhanced Support Services
• Scaling Entity Management
• Maintenance and Operation Cost
Multimedia/Knowledge-Intensive Event
Processing
• Support Services for Multimedia Data
• Placement of Multimedia Data and
Workloads
• Adaptive Training of Classifiers
• Complex Multimedia Event Processing
Trusted Data Sharing
• Trusted Platforms
• Usage Control
• Personal/Industrial Dataspaces
Ecosystem Governance and Economic
Models
• Decentralised Data Governance
• Economic Models
Incremental Intelligent Systems
Engineering Cognitive Adaptability
• Pay-as-you-go Systems
• Cognitive Adaptability
Towards Human-centric Systems
• Explainable Artificial Intelligence
and Data Provenance
• Human-in-the-loop
Some final thoughts on
Impacts, Influence, and Future Funding
Data Sharing Spaces – Position Paper
Key Recommendations
Create the conditions for the
development of a trusted European
data sharing framework
Incorporate data sharing at the core
of the data lifecycle to enable greater
access to data.
Provide supportive measures for
European businesses to safely
embrace new technologies, practices
and policies.
Assemble a European-wide digital
skills strategy to equip the workforce
for the new data economy.
A European Strategy for Data
BDVA Meeting
26 February 2020
Yvo Volman
Head of Unit G1 - Data Policy and Innovation
DG CNECT, European Commission
European Strategy for Data
A common European data space, a single market for data
• Data can flow within the EU and across sectors
• European rules and values are fully respected
• Rules for access and use of data are fair, practical and clear, and clear data governance mechanisms are in place
• Availability of high quality data to create and innovate
Common European data spaces
• Rich pool of data (varying degree of accessibility)
• Free flow of data across sectors and countries
• Full respect of GDPR
• Horizontal framework for data governance and data access
• Sectors: Health, Industrial & Manufacturing, Agriculture, Finance, Mobility, Green Deal, Energy, Public Administration, Skills
• Supporting elements:
– Technical tools for data pooling and sharing
– Standards & interoperability (technical, semantic)
– Sectoral Data Governance (contracts, licenses, access rights, usage rights)
– IT capacity, including cloud storage, processing and services

Econ3060_Screen Time and Success_ final_GroupProject.pdf
blueshagoo1
 
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Marlon Dumas
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
Vineet
 
8 things to know before you start to code in 2024
8 things to know before you start to code in 20248 things to know before you start to code in 2024
8 things to know before you start to code in 2024
ArianaRamos54
 

Recently uploaded (20)

一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
 
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
一比一原版格里菲斯大学毕业证(Griffith毕业证书)学历如何办理
 
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理 原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
 
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
 
一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
Sid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.pptSid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.ppt
 
Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
 
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
 
8 things to know before you start to code in 2024
8 things to know before you start to code in 20248 things to know before you start to code in 2024
8 things to know before you start to code in 2024
 

From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent Systems

  • 1. From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent Systems Edward Curry Insight @ NUI Galway edward.curry@nuigalway.ie
  • 2. Open Access Book
      Contents:
      – Part I: Fundamentals and Concepts
      – Part II: Data Support Services
      – Part III: Stream and Event Processing Services
      – Part IV: Intelligent Systems and Applications
      – Part V: Future Directions
      Team
      Web: http://dataspaces.info
  • 3. Part I: Fundamentals and Concepts
  • 4. Data Driven Innovations
      – Digital Twins: A digital replica of physical assets (car), processes (value-chain), systems, or physical environments (building). The digital representation (i.e. simulation modelling or data-driven models) provided by the digital twin can be analysed to optimise the operation of the “physical twin”.
      – Physical-Cyber-Social (PCS): A computing paradigm that supports a richer human experience with a holistic data-rich view of the smart environment that integrates, correlates, interprets, and provides contextually relevant abstractions to humans.
      – Mass Personalisation: More human-centric thinking in the design of systems where users have growing expectations for highly personalised digital services for the “Market of One”.
      – Data Network Effects: As more systems/users join and contribute data to the smart environment, a “network effect” can take place, resulting in the overall data available becoming more valuable.
  • 5. Digital Twins [Figure: the real world (physical twin, asset-centric) and the digital world (digital twin, system-centric), connected through sensors and actuators in an Observe, Orient, Decide, Act loop.]
  • 8. Data Management Challenges
      • Pay-as-you-go Data Integration, Accessibility, and Sharing
        – Standard data syntax, semantics, and linkage: Facilitate integration and sharing, ideally with open standards and non-proprietary approaches.
        – Single-point data discoverability and accessibility: Allow the organisation of, and access to, datasets and metadata through a single location.
        – Incremental data management: Enable a low barrier to entry and a pay-as-you-go paradigm to minimise costs.
      • Secure Access Control: Support data access rights to preserve the security of data and privacy of users in the smart environment.
      • Real-time Data Processing and Historical Querying
        – Real-time data processing: Including ingestion, aggregation, and pattern detection within event streams originating from sensors and things in the smart environment.
        – Unified querying of real-time data and historical data: Provide applications and end-users with a holistic queryable state of the smart environment at a latency suitable for user interaction.
      • Entity-centric Data Views
        – Entity management: The storage, linkage, curation, and retrieval of entity data, such as users, zones, and locations.
        – Event enrichment: Enhancement of sensor/thing streams with contextual data (e.g. entities) to make the stream data more encapsulated and useful in downstream processing.
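The event enrichment idea on this slide can be illustrated with a minimal Python sketch. It is not taken from the slides: the `Entity` shape, the `ENTITIES` store, the `SENSOR_TO_ENTITY` links, and the sensor and zone names are all hypothetical, chosen only to show a raw sensor event being joined with entity context before downstream processing.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Entity:
    """A managed entity such as a user, zone, or location (hypothetical shape)."""
    entity_id: str
    kind: str
    attributes: dict[str, Any] = field(default_factory=dict)


# Hypothetical entity store and sensor-to-entity links, for illustration only.
ENTITIES = {
    "zone:kitchen-1": Entity("zone:kitchen-1", "zone", {"floor": 1, "building": "office"}),
}
SENSOR_TO_ENTITY = {"water-meter-17": "zone:kitchen-1"}


def enrich(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event with the linked entity's context attached.

    Events from sensors with no known entity pass through unchanged (best effort).
    """
    entity_id = SENSOR_TO_ENTITY.get(event.get("sensor_id"))
    entity = ENTITIES.get(entity_id) if entity_id else None
    if entity is None:
        return dict(event)
    context = {"id": entity.entity_id, "kind": entity.kind, **entity.attributes}
    return {**event, "entity": context}


raw = {"sensor_id": "water-meter-17", "litres": 4.2}
enriched = enrich(raw)
```

Returning a new dictionary rather than mutating the input keeps the raw stream intact, so the same event can still be routed to consumers that do not need the entity context.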
  • 9. The “gold mining” metaphor applied to data processing
  • 10. Traditional Approaches to Data Integration [Figure. Labels: frequency of use (high to low); popularity/use; number of data sources, entities, attributes; cost of administration & semantic integration using traditional approaches.]
  • 11. Data is Key to AI… Data Platforms will Fuel AI Decisions Data Generation and Analysis (including IoT) Data Platforms (Access and Portability) AI and Decision Platforms
  • 12. IoT-Enablement
      A Data Sharing Layer is needed…
      – Layer 1 - Communication and Sensing: IPv6, Wi-Fi, RFID, CoAP, AVB, etc.
      – Layer 2 - Middleware: Peer-to-Peer, Events, Pub/Sub, SOA, SDN, etc.
      – Layer 3 - Data: Schema, Entities, Catalog, Sharing, Access/Control, etc.
      – Layer 4 - Intelligent Apps, Analytics, and Users
      Figure labels: Datasets; Things / Sensors; Contextual Data Sources (including legacy systems); Predictive Analytics; Situation Awareness; Decision Support; Digital Twin; Machine Learning; Users.
      Adapted from: L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A survey,” Comput. Networks, vol. 54, no. 15, pp. 2787–2805, Oct. 2010.
  • 13. Cost of Data Management Solutions
      Administrative Proximity:
      – With close control, many assumptions can hold concerning guarantees such as data quality and consistency.
      – Far control refers to a loosely coupled environment and a lack of coordination on the data sources.
      Semantic Integration:
      – Degree to which data schemas are matched up (types, attributes, and names).
      – All data conform to an agreed-upon schema vs. no schema information. This dimension is relevant to how much semantically rich querying can be done.
      Source: Halevy, A., Franklin, M. and Maier, D. 2006. Principles of dataspace systems. 25th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS ’06 (New York, New York, USA, 2006), 1–9.
  • 14. (Real-time Linked) Dataspace Principles (adapted from Halevy et al.):
      • Must deal with many different formats of streams and events.
      • Does not subsume the stream and event processing engines; they still provide individual access via their native interfaces.
      • Queries are provided on a best-effort and approximate basis.
      • Must provide pathways to improve the integration among the data sources, including streams and events, in a pay-as-you-go fashion.
      Dataspace: “Dataspaces are not a data integration approach; rather, they are more of a data co-existence approach. The goal of dataspace support is to provide base functionality over all data sources, regardless of how integrated they are.” (Halevy, A., Franklin, M. and Maier, D. 2006.)
      Real-time Linked Dataspace (RLD): An enabling platform for data management for intelligent systems within smart environments that combines the pay-as-you-go paradigm of dataspaces, linked data, and knowledge graphs with entity-centric real-time query capabilities.
  • 15. Approximate and Best Effort Approaches [Figure. Labels: frequency of use (high to low); popularity/use; number of data sources, entities, attributes; cost of administration & semantic integration using traditional approaches; approximate & best-effort approaches.]
  • 16. Architecture of Real-time Linked Dataspace
      • Support Platform: Responsible for providing the functionalities and services essential for managing the dataspace.
      • Things / Sensors: Produce real-time data streams that need to be processed and managed.
      • Data Sources: Available in a wide variety of formats and accessible through different system interfaces.
      • Managed Entities: Actively managed entities, including their relationships to participating things, data sources, and other entities.
      • Intelligent Applications, Analytics, & Users: Leverage the RLD's data and services to provide data analytics, decision support tools, user interfaces, and data visualisations.
  • 17. Pay-as-you-Go Tiered Data Model
      • Provides flexibility by reducing the initial cost and barriers to joining the dataspace.
      • Specialisation of the 5 star scheme defined by Tim Berners-Lee.
      • Over time, the level of integration with the support services can be improved incrementally on an as-needed basis.
      • The more investment made to integrate with the support services, the better the integration achievable in the dataspace.
  • 19. Part II: Data Support Services
  • 20. Part III: Stream and Event Processing Services
  • 21. Data Self-Management
      Techniques for:
      • Self-Configuration
      • Self-Healing
      • Self-Optimizing
      Automatic Source Selection:
      • Source Selection
      • Source Replacement
      • Model Selection
      • Model Training
      • Parameterization
  • 22. Entity Data Management and Humans in the Loop
      Enables users in the smart environment to participate in data management tasks:
      • Collection & Enrichment
      • Mapping & Matching
      • Operator Evaluation
      • Feedback & Refinement
      • Citizen Actuation
      Key HIL Challenges:
      • Task Specification (simplicity)
      • Interaction Mechanism
      • Task Assignment (geospatial, expertise)
  • 23. Semantic Approximation Matching of Streams
      Challenges:
      • Heterogeneity in event semantics (thousands of schemas)
      • Heterogeneity in processing rules (thousands of rules tied to schemas)
      Approx. Semantic Event Matcher:
      • Sub-symbolic distributional event semantics
      • Enables pay-as-you-go event matching for data streams
      • Replaced 48,000 exact rules with 100 approximate rules with around 85% accuracy
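To make the contrast between exact and approximate rules concrete, here is a small Python sketch of similarity-based matching. It is an illustration only and not the matcher described on the slide: the `VECTORS` table is a hand-made stand-in for a distributional semantics model, and the 0.9 threshold is an arbitrary assumption.

```python
import math

# Toy, hand-made term vectors standing in for a distributional model (assumption).
VECTORS = {
    "energy": (0.9, 0.1, 0.0),
    "power": (0.8, 0.2, 0.1),
    "water": (0.1, 0.9, 0.1),
    "parking": (0.0, 0.1, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def term_similarity(t1, t2):
    """Identical terms score 1.0; unknown terms score 0.0; otherwise use cosine."""
    if t1 == t2:
        return 1.0
    v1, v2 = VECTORS.get(t1), VECTORS.get(t2)
    if v1 is None or v2 is None:
        return 0.0
    return cosine(v1, v2)


def matches(rule_terms, event_terms, threshold=0.9):
    """True when every rule term has a sufficiently similar term in the event."""
    return all(
        max((term_similarity(r, e) for e in event_terms), default=0.0) >= threshold
        for r in rule_terms
    )


# One approximate rule on "energy" also accepts events that say "power".
hit = matches(["energy"], ["power", "usage"])
miss = matches(["energy"], ["water"])
```

An exact matcher would need a separate rule for each vocabulary variant, which is where the reduction in rule count reported on the slide comes from; the cost is that matches are approximate rather than guaranteed.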
  • 24. Part IV: Intelligent Systems and Applications: Smart Water and Energy Management Pilots
      Location:
      – Airport: Linate Airport, Milan, Italy
      – Office: Insight, Galway, Ireland
      – Home: Houses, Thermi, Greece
      – Mixed Use: Engineering, NUI Galway
      – School: Coláiste na Coiribe, Ireland
      Target users: • Corporate users • ~9.5 million passengers • Utilities management • Maintenance staff • Environmental managers • 130 staff • Office consumers • Operations managers • Utility providers • Building managers • Domestic consumers (adults, young adults and children) • Utility providers • Mixed/Public consumers • Building managers • 100 staff • 1000 students (ages 18 to 24) • Mixed/Public consumers • School management • Maintenance staff • 500 students (ages 12 to 18) • 40 teachers
      Infrastructure: • Safety critical • 10 km water network • Multiple buildings • Water meters • Energy meters • Legacy systems • 2190 m² space • 22 offices + 160 open plan spaces • Conference room • 4 meeting rooms • 3 kitchens • Data centre • 30 person café • Energy meters • 10 households • Typical variety of domestic settings including kitchen, showers, baths, living room, bedrooms, and garden • Water meters • Water meters • Energy meters • Rainwater harvesting • Café • Weather station • Wet labs • Showers • Water meters • Energy meters • Rainwater harvesting
  • 25. Need to target different Target Users
      Pilots: Smart Airport (Milan Linate, Italy); Smart Office (Galway, Ireland); Smart Homes (Municipality of Thermi, Greece); Mixed Use (Galway, Ireland); Smart School (CnaC School in Galway, Ireland).
      User groups: Corporate Staff; Passengers; Operational Staff; Researchers; Application Developers; Data Scientist; Families; Building Manager; University Students; Teaching Staff; School Students.
  • 26. IoT-enabled Digital Twins and Intelligent Applications [Figure: an “OODA” loop (Observe, Orient, Decide, Act) built on the Real-time Linked Dataspace. Labelled components: Things / Sensors; Datasets; Entity Management Service; Catalog & Access Control Service; Search & Query Service; Entity-Centric Real-Time Query Service; Complex Event Processing Service; Human Task Service; Digital Twin; Decision Analytics and Machine Learning; Personal Dashboard; Public Dashboards; Apps; Alerts; Notifications.]
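The “OODA” loop named on this slide can be sketched as four small Python functions chained into one step. This is a toy illustration under stated assumptions, not the architecture in the figure: the function names, the baseline comparison, the `alert_ratio` of 1.5, and the notification text are all hypothetical, and `baseline` is assumed to be greater than zero.

```python
from statistics import mean


def observe(stream):
    """Observe: collect the latest readings, skipping missing values."""
    return [reading for reading in stream if reading is not None]


def orient(readings, baseline):
    """Orient: put the readings in context by comparing them with a baseline."""
    average = mean(readings) if readings else 0.0
    return {"average": average, "ratio": average / baseline}


def decide(situation, alert_ratio=1.5):
    """Decide: choose an action from the oriented situation."""
    return "send_alert" if situation["ratio"] >= alert_ratio else "no_action"


def act(decision, notifications):
    """Act: carry out the decision, here by queuing a notification."""
    if decision == "send_alert":
        notifications.append("Water use is well above the usual level")
    return notifications


def ooda_step(stream, baseline, notifications):
    """Run one pass of the loop over a batch of readings."""
    return act(decide(orient(observe(stream), baseline)), notifications)


alerts = ooda_step([12.0, None, 18.0], baseline=8.0, notifications=[])
quiet = ooda_step([8.0, 9.0], baseline=8.0, notifications=[])
```

Keeping each stage as a separate function mirrors the way the figure assigns stages to different services, so any one stage could be swapped out without touching the others.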
  • 27. Example Applications: Interactive Public Displays; Alerts and Notifications; Personalised Dashboards
  • 28. Experiences and Lessons Learnt from Dataspaces
      • Developers need education on stream processing and approximate results
      • Incremental data management can support agile software development
      • Build the business case for data-driven innovation
      • Integration with legacy data is a significant cost in smart environments
      • The 5 star pay-as-you-go model simplified communication with non-technical users
      • A secure canonical source for entity data simplifies application development
      • Data quality with things and sensors is challenging in an operational environment
      • Working with three pipelines adds overhead (LAMBDA + Entity Layer)
  • 29. Part V: Future Directions
      Large-scale Decentralised Support Services
      • Enhanced Supported Services
      • Scaling Entity Management
      • Maintenance and Operation Cost
      Multimedia/Knowledge-Intensive Event Processing
      • Support Services for Multimedia Data
      • Placement of Multimedia Data and Workloads
      • Adaptive Training of Classifiers
      • Complex Multimedia Event Processing
      Trusted Data Sharing
      • Trusted Platforms
      • Usage Control
      • Personal/Industrial Dataspaces
      Ecosystem Governance and Economic Models
      • Decentralised Data Governance
      • Economic Models
      Incremental Intelligent Systems Engineering Cognitive Adaptability
      • Pay-as-you-go Systems
      • Cognitive Adaptability
      Towards Human-centric Systems
      • Explainable Artificial Intelligence and Data Provenance
      • Human-in-the-loop
  • 30. Some final thoughts on Impacts, Influence, and Future Funding
  • 31. Data Sharing Spaces – Position Paper
      Key Recommendations:
      • Create the conditions for the development of a trusted European data sharing framework.
      • Incorporate data sharing at the core of the data lifecycle to enable greater access to data.
      • Provide supportive measures for European businesses to safely embrace new technologies, practices and policies.
      • Assemble a European-wide digital skills strategy to equip the workforce for the new data economy.
  • 32. A European Strategy for Data BDVA Meeting 26 February 2020 Yvo Volman Head of Unit G1 - Data Policy and Innovation DG CNECT, European Commission
  • 33. European Strategy for Data
      • Data can flow within the EU and across sectors
      • European rules and values are fully respected
      • Rules for access and use of data are fair, practical and clear, and clear data governance mechanisms are in place
      • A common European data space, a single market for data
      • Availability of high quality data to create and innovate
  • 34. Common European data spaces
      • Rich pool of data (varying degree of accessibility)
      • Free flow of data across sectors and countries
      • Full respect of GDPR
      Data spaces: Health; Industrial & Manufacturing; Agriculture; Finance; Mobility; Green Deal; Energy; Public Administration; Skills
      – Technical tools for data pooling and sharing
      – Standards & interoperability (technical, semantic)
      – Sectoral Data Governance (contracts, licenses, access rights, usage rights)
      – IT capacity, including cloud storage, processing and services
      Horizontal framework for data governance and data access