SlideShare a Scribd company logo
1 of 34
On the need for a W3C
community group on RDF
Stream Processing
ISWC2013 Workshop on Ordering and Reasoning,
Sydney, 22/10/2013

Oscar Corcho
ocorcho@fi.upm.es, ocorcho@localidata.com
@ocorcho
http://www.slideshare.net/ocorcho/
Disclaimer…

This presentation expresses my view but not necessarily the one from
the rest of the group (although I hope that it is similar)

<<Texto libre: proyecto, speaker, etc.>>

2
Acknowledgements
• All those that I have “stolen” slides, material and
ideas from
•
•
•
•
•

Emanuele Della Valle
Daniele Dell’Aglio
Marco Balduini
Jean Paul Calbimonte
And many others who
have already started
contributing…

<<Texto libre: proyecto, speaker, etc.>>

3
Why setting up a community group?
In RDF Stream models
(timestamps, events, time
intervals, triple-based, graph-based …)

In RDF Stream query languages
(windows, stream selection,
CEP-based operators, …)

Heterogeneity

In implementations
(RDF native, query rewriting,
continuous query registration,
scalability, static vs streaming data…)

<<Texto libre: proyecto, speaker, etc.>>

4

In operational semantics
(tick, window content, report)
You may think that we do not like heterogeneity…

<<Texto libre: proyecto, speaker, etc.>>

5
But at least I love it…
• However, we need to tell people what to expect with
each system, and smooth differences when they
are not crucial……

<<Texto libre: proyecto, speaker, etc.>>

6
The solution…
• Let’s create a W3C community group…

•
•
•
•
•

To understand better those differences
The requirements on which we are based
And explain to others
…
And maybe get some “recommendation” out

<<Texto libre: proyecto, speaker, etc.>>

7
The W3C RDF Stream Processing Comm. Group
• http://www.w3.org/community/rsp/

<<Texto libre: proyecto, speaker, etc.>>

8
W3C RSP Community Group mission
“The mission of the RDF Stream Processing
Community Group (RSP) is to define a common model
for producing, transmitting and continuously querying
RDF Streams. This includes extensions to both RDF
and SPARQL for representing streaming data, as well
as their semantics. Moreover this work envisions an
ecosystem of streaming and static RDF data sources
whose data can be combined through standard models,
languages and protocols. Complementary to related
work in the area of databases, this Community Group
looks at the dynamic properties of graph-based data,
i.e., graphs that are produced over time and which may
change their shape and data over time.”

<<Texto libre: proyecto, speaker, etc.>>

9
Use cases
• We have started collecting them

• And I hope that by the end of my talk you will
consider contributing some more…
<<Texto libre: proyecto, speaker, etc.>>

10
A template to describe use cases (I)
•

Streaming Information
•
•

•
•

•
•

Type: Environmental data: temperatures, pressures, salinity, acidity, fluid
velocities etc,
Nature:
• Relational Stream: yes
• Text stream: no
Origin: Data is produced by sensors in oil wells and on oil and gas
platforms equipments. Each oil platform has an average of 400.000.
Frequency of update:
• from sub-second to minutes
• In triples/minute: [10000-10] t/min
Quality: It varies, due to instrument/sensor issues
Management /access
• Technology in use: Dedicated (relational and proprietary) stores
• Problems: The ability of users to access data from different sources is
limited by an insufficient description of the context
• Means of improvement: Add context (metadata) to the data so it
become meaningful and use reasoning techniques to process that
metadata

<<Texto libre: proyecto, speaker, etc.>>

11
A template to describe use cases (II)
•

[optional] Static Information required to interpret the streaming
information
•
•
•

•
•

Type: Topology of the sensor network, position of each sensor, the
descriptions of the oil platform
Origin: Oil and gas production operations
Dimension:
• 100s of MB as PostGIS dump
• In triples: 10^8
Quality: Good
Management / access
• Technology in use: RDBMS, proprietary technologies
• Available Ontologies and Vocabularies: Reference Semantic Model
(RSM), based on ISO 15926

<<Texto libre: proyecto, speaker, etc.>>

12
A tale of four heterogeneities
ISWC2013 Workshop on Ordering and Reasoning,
Sydney, 22/10/2013

Oscar Corcho
ocorcho@fi.upm.es, ocorcho@localidata.com
@ocorcho
http://www.slideshare.net/ocorcho/
Heterogeneity #1: Representing RDF Streams

<<Texto libre: proyecto, speaker, etc.>>

14
What is an RDF stream?
• Several possibilities:
• An RDF stream is an infinite sequence of timestamped
events (triples or graphs), where timestamps are nondecreasing
…
<eventi,ti >
<eventi+1,ti+1 >
<eventi+2,ti+2 >
…
• An RDF stream is an infinite sequence of triple occurrences
<<s,p,o>,tα,tω> where <s,p,o> is an RDF triple and tα and tω
are the start and end of the interval

• How are timestamps assigned?
Some examples…
• What would be the best/possible RDF stream
representation for the following types of problems?
• Does Alice meet Bob before Carl?
• Who does Carl meet first?
:alice :isWith :bob

:alice :isWith :carl

e1

:diana :isWith :carl

:bob :isWith :diana

e2

e3

e4

• How many people has Alice met in the last 5m?
• Does Diana meet Bob and then Carl within 5m?
1

3

6

9

t

• Which are the meetings the last less than 5m?
• Which are the meetings with conflicts?

:alice :isWith :bob

:alice :isWith :carl

:bob :isWith :diana

:diana :isWith :carl

e4

e2
e1
<<Texto libre: proyecto, speaker, etc.>>

e3
16
Data types for semantic streams - Summary
•

Multiple notions of RDF stream proposed
• Ordered sequence (implicit timestamp)
• One timestamp per triple (point in time semantics)
• Two timestamps per triple (interval base semantics)

•

Comparison between existing approaches
System

Time model

# of timestamps

INSTANS

triple

Implicit

0

C-SPARQL

triple

Point in time

1

SPARQLstream

triple

Point in time

1

CQELS

triple

Point in time

1

Sparkwave

triple

Point in time

1

Streaming Linked Data

RDF graph

Point in time

1

ETALIS

•

Data item

triple

Interval

2

More investigation is required to agree on an RDF stream model
17
Heterogeneity #2: RDF Stream processors

<<Texto libre: proyecto, speaker, etc.>>

18
Existing RDF Stream Processing systems
• C-SPARQL: RDF Store + Stream processor
• Combined architecture
C-SPARQL
query

sta

translator

tic

stre

amin

RDF Store

g

Stream
processor

continuous
results

• CQELS: Implemented from scratch. Focus on performance
• Native + adaptive joins for static-data and streaming data
CQELS
query

Native RSP

continuous
results

• CQELS-Cloud: Reusing Storm
• Paper presentation on Thursday
CQELS
query

Storm
topology

continuous
results
Existing RSP systems
• EP-SPARQL: Complex-event detection
• SEQ, EQUALS operators
EP-SPARQL
query

translator

Prolog
engine

continuous
results

• SPARQLStream: Ontology-based stream query
answering
• Virtual RDF views, using R2RML mappings
• SPARQL stream queries over the original data streams.
SPARQLStream
query

rewriter

DSMS/CEP

R2RML mappings

• Instans: RETE-based evaluation

continuous
results
Query languages for semantic streams - Summary

• Different architectural choices
• It is not clear when each choice is best for which type of use
case
• Wrappers over existing systems
• C-SPARQL, ETALIS, SPARQLstream , CQELS-Cloud
• Better reliability and maintainability?
• Native implementations
• CQELS, Streaming Linked Data, INSTANS
• Better scalability: optimizations that are not possible
in other systems

• Different operational semantics
• See later

21
Heterogeneity #3: Querying RDF Streams

<<Texto libre: proyecto, speaker, etc.>>

22
Querying data streams (from CQL to SPARQL-X)
stream-to-relation (S2R)

Relation
s

Streams
infinite
unbounded
bag

…
<s,τ>
…

relation-to-relation (R2R)

relation-to-stream (R2S)

Stream

<s1>
<s2>
<s3>

finite
bag

Relati on R(t)

Mapping: T  R

S2R Window operators

RDF
Streams

SPARQL operators

RDF

R2S operators
Output: relation
• Case 1: the output is a set of timestamped mappings
a … ?b… [t1]
a … ?b…

SELECT ?a ?b …
FROM ….
WHERE ….

queries

CONSTRUCT {?a :prop ?b }
FROM ….
WHERE ….

a … ?b… [t3]
a … ?b… [t5]

RS
P

a … ?b… [t7]

bindings
 <… :prop … > [t1]
 <… :prop … >
 <… :prop … > [t3]
 <… :prop … > [t5]
 <… :prop … > [t7]

triples
Output: stream
• Case 2: the output is a stream
• R2S operators
CONSTRUCT RSTREAM {?a :prop ?b }
FROM ….
WHERE ….

query

RS
P

stream
… 
<… :prop … > [t1]
 <… :prop … > [t1]
<… :prop … > [t3]
<… :prop … > [t5]
< …:prop … > [t7]
…



ISTREAM: stream out data in the last step that wasn’t on the previous step



DSTREAM: stream out data in the previous step that isn’t in the last step



RSTREAM: stream out all data in the last step
Other operators
• Sequence operators and CEP world
e4

S

e1

e2

e3

1

3

6

Sequence

9

Simultaneous

 SEQ: joins eti,tf and e’ti’,tf’ if e’ occurs after e
 EQUALS: joins eti,tf and e’ti’,tf’ if they occur simultaneously
 OPTIONALSEQ, OPTIONALEQUALS: Optional join variants
Query languages for semantic streams - Summary
•

Comparison between existing approaches

System

S2R

R2R

Time-aware

R2S

INSTANS

Based on
time events

SPARQL
update

Based on time events

Ins only

C-SPARQL
Engine

Logical and
triple-based

SPARQL 1.1
query

timestamp function

Batch only

SPARQLstream

Logical and
triple-based

SPARQL 1.1
query

no

Ins, batch,
del

CQELS

Logical and
triple-based

SPARQL 1.1
query

no

Ins only

Sparkwave

Logical

SPARQL 1.0

no

Ins only

Streaming Linked
Data

Logical and
graph-based

SPARQL 1.1

no

Batch only

ETALIS

no

SPARQL 1.0

• Is it time to converge on a
27

SEQ, PAR, AND, OR,
DURING, STARTS,
standard? NOT,
EQUALS,
MEETS, FINISHES

Ins only
Query languages for semantic streams - Issues

• Different syntax for S2R operator
• Semantics of query languages is similar, but not
identical
• Lack of R2S operator in some cases
• Different support for time-aware operators

28
Classification of existing systems
Heterogeneity #4: Operational Semantics

<<Texto libre: proyecto, speaker, etc.>>

30
Operational Semantics

Where are both alice and bob in the last 5s?
hall
:hall
sIn :
:i
isIn
e
:
:alic
:bob

S

e
:alic

hen
:kitc
:isIn

S1

S2

S3

S4

1

3

6

:bob

hen
:kitc
:isIn

9

System 1:
System 2:

:hall [5]
:hall [3]

t

:kitchen [10]
:kitchen [9]

Both correct?
ISWC 2013 evaluation track for "On Correctness in RDF stream
processor benchmarking" by Daniele Dell’Aglio, Jean-Paul
Calbimonte, Marco Balduini, Oscar Corcho and Emanuele Della Valle
Conclusions…

<<Texto libre: proyecto, speaker, etc.>>

32
Next steps in the community group…
• Agree on an RDF model?
•
•
•
•

Metamodel?
Timestamps in graphs?
Timestamp intervals
Compatibility with normal (static) RDF

• Additional operators for SPARQL?
• Windows (not only time based?)
• CEP operators
• Semantics

• Go Web
• Volatile URIs
• Serialization: terse, compact
• Protocols: HTTP, Websockets?
On the need for a W3C
community group on RDF
Stream Processing
ISWC2013 Workshop on Ordering and Reasoning,
Sydney, 22/10/2013

Oscar Corcho
ocorcho@fi.upm.es, ocorcho@localidata.com
@ocorcho
http://www.slideshare.net/ocorcho/

More Related Content

What's hot

Heaven: A Framework for Systematic Comparative Research Approach for RSP Engines
Heaven: A Framework for Systematic Comparative Research Approach for RSP EnginesHeaven: A Framework for Systematic Comparative Research Approach for RSP Engines
Heaven: A Framework for Systematic Comparative Research Approach for RSP EnginesRiccardo Tommasini
 
A Hierarchical approach towards Efficient and Expressive Stream Reasoning
A Hierarchical approach towards Efficient and Expressive Stream ReasoningA Hierarchical approach towards Efficient and Expressive Stream Reasoning
A Hierarchical approach towards Efficient and Expressive Stream ReasoningRiccardo Tommasini
 
RSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF StreamsRSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF Streamskeski
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
Information-Rich Programming in F# with Semantic Data
Information-Rich Programming in F# with Semantic DataInformation-Rich Programming in F# with Semantic Data
Information-Rich Programming in F# with Semantic DataSteffen Staab
 
Introduction to SparkR
Introduction to SparkRIntroduction to SparkR
Introduction to SparkRKien Dang
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXrhatr
 
Introduction to Spark R with R studio - Mr. Pragith
Introduction to Spark R with R studio - Mr. Pragith Introduction to Spark R with R studio - Mr. Pragith
Introduction to Spark R with R studio - Mr. Pragith Sigmoid
 
EKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsEKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsRuben Taelman
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC
 
First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmInfoFarm
 
LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataOlaf Hartig
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.darach
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsMaxime Lefrançois
 
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...Ademar Crotti Junior
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of MusicLars Albertsson
 

What's hot (20)

Heaven: A Framework for Systematic Comparative Research Approach for RSP Engines
Heaven: A Framework for Systematic Comparative Research Approach for RSP EnginesHeaven: A Framework for Systematic Comparative Research Approach for RSP Engines
Heaven: A Framework for Systematic Comparative Research Approach for RSP Engines
 
A Hierarchical approach towards Efficient and Expressive Stream Reasoning
A Hierarchical approach towards Efficient and Expressive Stream ReasoningA Hierarchical approach towards Efficient and Expressive Stream Reasoning
A Hierarchical approach towards Efficient and Expressive Stream Reasoning
 
RSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF StreamsRSP-QL*: Querying Data-Level Annotations in RDF Streams
RSP-QL*: Querying Data-Level Annotations in RDF Streams
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
LD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and toolsLD4KD 2015 - Demos and tools
LD4KD 2015 - Demos and tools
 
Information-Rich Programming in F# with Semantic Data
Information-Rich Programming in F# with Semantic DataInformation-Rich Programming in F# with Semantic Data
Information-Rich Programming in F# with Semantic Data
 
Shebanq gniezno
Shebanq gnieznoShebanq gniezno
Shebanq gniezno
 
Introduction to SparkR
Introduction to SparkRIntroduction to SparkR
Introduction to SparkR
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 
Introduction to Spark R with R studio - Mr. Pragith
Introduction to Spark R with R studio - Mr. Pragith Introduction to Spark R with R studio - Mr. Pragith
Introduction to Spark R with R studio - Mr. Pragith
 
EKAW - Triple Pattern Fragments
EKAW - Triple Pattern FragmentsEKAW - Triple Pattern Fragments
EKAW - Triple Pattern Fragments
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data Model
 
First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithm
 
LDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked DataLDQL: A Query Language for the Web of Linked Data
LDQL: A Query Language for the Web of Linked Data
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
Efficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data StreamsEfficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data Streams
 
Big Data, Mob Scale.
Big Data, Mob Scale.Big Data, Mob Scale.
Big Data, Mob Scale.
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developments
 
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...
Incorporating Functions in Mappings to Facilitate the Uplift of CSV Files int...
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of Music
 

Similar to W3C RDF Stream Processing Group Discusses Heterogeneity

RDF Stream Processing Models (SR4LD2013)
RDF Stream Processing Models (SR4LD2013)RDF Stream Processing Models (SR4LD2013)
RDF Stream Processing Models (SR4LD2013)Daniele Dell'Aglio
 
RDF Stream Processing and the role of Semantics
RDF Stream Processing and the role of SemanticsRDF Stream Processing and the role of Semantics
RDF Stream Processing and the role of SemanticsJean-Paul Calbimonte
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkC4Media
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBaseCarol McDonald
 
Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access finalJean-Paul Calbimonte
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...eswcsummerschool
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit
 
LarKC Tutorial at ISWC 2009 - Introduction
LarKC Tutorial at ISWC 2009 - IntroductionLarKC Tutorial at ISWC 2009 - Introduction
LarKC Tutorial at ISWC 2009 - IntroductionLarKC
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsAlejandro Llaves
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsAlejandro Llaves
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Streaming Day - an overview of Stream Reasoning
Streaming Day - an overview of Stream ReasoningStreaming Day - an overview of Stream Reasoning
Streaming Day - an overview of Stream ReasoningRiccardo Tommasini
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Raja Chiky
 
Mining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataMining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataMOVING Project
 
Mining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataMining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataAnsgar Scherp
 
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...NETWAYS
 

Similar to W3C RDF Stream Processing Group Discusses Heterogeneity (20)

RDF Stream Processing Models (SR4LD2013)
RDF Stream Processing Models (SR4LD2013)RDF Stream Processing Models (SR4LD2013)
RDF Stream Processing Models (SR4LD2013)
 
RDF Stream Processing and the role of Semantics
RDF Stream Processing and the role of SemanticsRDF Stream Processing and the role of Semantics
RDF Stream Processing and the role of Semantics
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache Spark
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
 
Enabling ontology based streaming data access final
Enabling ontology based streaming data access finalEnabling ontology based streaming data access final
Enabling ontology based streaming data access final
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer Agarwal
 
LarKC Tutorial at ISWC 2009 - Introduction
LarKC Tutorial at ISWC 2009 - IntroductionLarKC Tutorial at ISWC 2009 - Introduction
LarKC Tutorial at ISWC 2009 - Introduction
 
Serial-War
Serial-WarSerial-War
Serial-War
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Towards efficient processing of RDF data streams
Towards efficient processing of RDF data streamsTowards efficient processing of RDF data streams
Towards efficient processing of RDF data streams
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Streaming Day - an overview of Stream Reasoning
Streaming Day - an overview of Stream ReasoningStreaming Day - an overview of Stream Reasoning
Streaming Day - an overview of Stream Reasoning
 
Timbuctoo 2 EASY
Timbuctoo 2 EASYTimbuctoo 2 EASY
Timbuctoo 2 EASY
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
Polyraptor
PolyraptorPolyraptor
Polyraptor
 
Mining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataMining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open Data
 
Mining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open DataMining and Managing Large-scale Linked Open Data
Mining and Managing Large-scale Linked Open Data
 
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
 

More from Oscar Corcho

Organisational Interoperability in Practice at Universidad Politécnica de Madrid
Organisational Interoperability in Practice at Universidad Politécnica de MadridOrganisational Interoperability in Practice at Universidad Politécnica de Madrid
Organisational Interoperability in Practice at Universidad Politécnica de MadridOscar Corcho
 
Introducción a los Datos Abiertos - Open Data Day 2020
Introducción a los Datos Abiertos - Open Data Day 2020Introducción a los Datos Abiertos - Open Data Day 2020
Introducción a los Datos Abiertos - Open Data Day 2020Oscar Corcho
 
Open Data (and Software, and other Research Artefacts) - A proper management
Open Data (and Software, and other Research Artefacts) -A proper managementOpen Data (and Software, and other Research Artefacts) -A proper management
Open Data (and Software, and other Research Artefacts) - A proper management Oscar Corcho
 
Adiós a los ficheros, hola a los grafos de conocimientos estadísticos
Adiós a los ficheros, hola a los grafos de conocimientos estadísticosAdiós a los ficheros, hola a los grafos de conocimientos estadísticos
Adiós a los ficheros, hola a los grafos de conocimientos estadísticosOscar Corcho
 
Ontology Engineering at Scale for Open City Data Sharing
Ontology Engineering at Scale for Open City Data SharingOntology Engineering at Scale for Open City Data Sharing
Ontology Engineering at Scale for Open City Data SharingOscar Corcho
 
Situación de las iniciativas de Open Data internacionales (y algunas recomen...
Situación de las iniciativas de Open Data internacionales (y algunas recomen...Situación de las iniciativas de Open Data internacionales (y algunas recomen...
Situación de las iniciativas de Open Data internacionales (y algunas recomen...Oscar Corcho
 
STARS4ALL - Contaminación Lumínica
STARS4ALL - Contaminación LumínicaSTARS4ALL - Contaminación Lumínica
STARS4ALL - Contaminación LumínicaOscar Corcho
 
Towards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceTowards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceOscar Corcho
 
Publishing Linked Statistical Data: Aragón, a case study
Publishing Linked Statistical Data: Aragón, a case studyPublishing Linked Statistical Data: Aragón, a case study
Publishing Linked Statistical Data: Aragón, a case studyOscar Corcho
 
An initial analysis of topic-based similarity among scientific documents base...
An initial analysis of topic-based similarity among scientific documents base...An initial analysis of topic-based similarity among scientific documents base...
An initial analysis of topic-based similarity among scientific documents base...Oscar Corcho
 
Linked Statistical Data 101
Linked Statistical Data 101Linked Statistical Data 101
Linked Statistical Data 101Oscar Corcho
 
Aplicando los principios de Linked Data en AEMET
Aplicando los principios de Linked Data en AEMETAplicando los principios de Linked Data en AEMET
Aplicando los principios de Linked Data en AEMET Oscar Corcho
 
Ojo Al Data 100 - Call for sharing session at IODC 2016
Ojo Al Data 100 - Call for sharing session at IODC 2016Ojo Al Data 100 - Call for sharing session at IODC 2016
Ojo Al Data 100 - Call for sharing session at IODC 2016Oscar Corcho
 
Educando sobre datos abiertos: desde el colegio a la universidad
Educando sobre datos abiertos: desde el colegio a la universidadEducando sobre datos abiertos: desde el colegio a la universidad
Educando sobre datos abiertos: desde el colegio a la universidadOscar Corcho
 
STARS4ALL general presentation at ALAN2016
STARS4ALL general presentation at ALAN2016STARS4ALL general presentation at ALAN2016
STARS4ALL general presentation at ALAN2016Oscar Corcho
 
Generación de datos estadísticos enlazados del Instituto Aragonés de Estadística
Generación de datos estadísticos enlazados del Instituto Aragonés de EstadísticaGeneración de datos estadísticos enlazados del Instituto Aragonés de Estadística
Generación de datos estadísticos enlazados del Instituto Aragonés de EstadísticaOscar Corcho
 
Presentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesPresentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesOscar Corcho
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Oscar Corcho
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Oscar Corcho
 
Slow-cooked data and APIs in the world of Big Data: the view from a city per...
Slow-cooked data and APIs in the world of Big Data: the view from a city per...Slow-cooked data and APIs in the world of Big Data: the view from a city per...
Slow-cooked data and APIs in the world of Big Data: the view from a city per...Oscar Corcho
 

More from Oscar Corcho (20)

Organisational Interoperability in Practice at Universidad Politécnica de Madrid
Organisational Interoperability in Practice at Universidad Politécnica de MadridOrganisational Interoperability in Practice at Universidad Politécnica de Madrid
Organisational Interoperability in Practice at Universidad Politécnica de Madrid
 
Introducción a los Datos Abiertos - Open Data Day 2020
Introducción a los Datos Abiertos - Open Data Day 2020Introducción a los Datos Abiertos - Open Data Day 2020
Introducción a los Datos Abiertos - Open Data Day 2020
 
Open Data (and Software, and other Research Artefacts) - A proper management
Open Data (and Software, and other Research Artefacts) -A proper managementOpen Data (and Software, and other Research Artefacts) -A proper management
Open Data (and Software, and other Research Artefacts) - A proper management
 
Adiós a los ficheros, hola a los grafos de conocimientos estadísticos
Adiós a los ficheros, hola a los grafos de conocimientos estadísticosAdiós a los ficheros, hola a los grafos de conocimientos estadísticos
Adiós a los ficheros, hola a los grafos de conocimientos estadísticos
 
Ontology Engineering at Scale for Open City Data Sharing
Ontology Engineering at Scale for Open City Data SharingOntology Engineering at Scale for Open City Data Sharing
Ontology Engineering at Scale for Open City Data Sharing
 
Situación de las iniciativas de Open Data internacionales (y algunas recomen...
Situación de las iniciativas de Open Data internacionales (y algunas recomen...Situación de las iniciativas de Open Data internacionales (y algunas recomen...
Situación de las iniciativas de Open Data internacionales (y algunas recomen...
 
STARS4ALL - Contaminación Lumínica
STARS4ALL - Contaminación LumínicaSTARS4ALL - Contaminación Lumínica
STARS4ALL - Contaminación Lumínica
 
Towards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experienceTowards Reproducible Science: a few building blocks from my personal experience
Towards Reproducible Science: a few building blocks from my personal experience
 
Publishing Linked Statistical Data: Aragón, a case study
Publishing Linked Statistical Data: Aragón, a case studyPublishing Linked Statistical Data: Aragón, a case study
Publishing Linked Statistical Data: Aragón, a case study
 
An initial analysis of topic-based similarity among scientific documents base...
An initial analysis of topic-based similarity among scientific documents base...An initial analysis of topic-based similarity among scientific documents base...
An initial analysis of topic-based similarity among scientific documents base...
 
Linked Statistical Data 101
Linked Statistical Data 101Linked Statistical Data 101
Linked Statistical Data 101
 
Aplicando los principios de Linked Data en AEMET
Aplicando los principios de Linked Data en AEMETAplicando los principios de Linked Data en AEMET
Aplicando los principios de Linked Data en AEMET
 
Ojo Al Data 100 - Call for sharing session at IODC 2016
Ojo Al Data 100 - Call for sharing session at IODC 2016Ojo Al Data 100 - Call for sharing session at IODC 2016
Ojo Al Data 100 - Call for sharing session at IODC 2016
 
Educando sobre datos abiertos: desde el colegio a la universidad
Educando sobre datos abiertos: desde el colegio a la universidadEducando sobre datos abiertos: desde el colegio a la universidad
Educando sobre datos abiertos: desde el colegio a la universidad
 
STARS4ALL general presentation at ALAN2016
STARS4ALL general presentation at ALAN2016STARS4ALL general presentation at ALAN2016
STARS4ALL general presentation at ALAN2016
 
Generación de datos estadísticos enlazados del Instituto Aragonés de Estadística
Generación de datos estadísticos enlazados del Instituto Aragonés de EstadísticaGeneración de datos estadísticos enlazados del Instituto Aragonés de Estadística
Generación de datos estadísticos enlazados del Instituto Aragonés de Estadística
 
Presentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart CitiesPresentación de la red de excelencia de Open Data y Smart Cities
Presentación de la red de excelencia de Open Data y Smart Cities
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?
 
Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?Linked Statistical Data: does it actually pay off?
Linked Statistical Data: does it actually pay off?
 
Slow-cooked data and APIs in the world of Big Data: the view from a city per...
Slow-cooked data and APIs in the world of Big Data: the view from a city per...Slow-cooked data and APIs in the world of Big Data: the view from a city per...
Slow-cooked data and APIs in the world of Big Data: the view from a city per...
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

W3C RDF Stream Processing Group Discusses Heterogeneity

  • 1. On the need for a W3C community group on RDF Stream Processing ISWC2013 Workshop on Ordering and Reasoning, Sydney, 22/10/2013 Oscar Corcho ocorcho@fi.upm.es, ocorcho@localidata.com @ocorcho http://www.slideshare.net/ocorcho/
  • 2. Disclaimer… This presentation expresses my view but not necessarily the one from the rest of the group (although I hope that it is similar) <<Texto libre: proyecto, speaker, etc.>> 2
  • 3. Acknowledgements • All those that I have “stolen” slides, material and ideas from • • • • • Emanuele Della Valle Daniele Dell’Aglio Marco Balduini Jean Paul Calbimonte And many others who have already started contributing… <<Texto libre: proyecto, speaker, etc.>> 3
  • 4. Why setting up a community group? In RDF Stream models (timestamps, events, time intervals, triple-based, graph-based …) In RDF Stream query languages (windows, stream selection, CEP-based operators, …) Heterogeneity In implementations (RDF native, query rewriting, continuous query registration, scalability, static vs streaming data…) <<Texto libre: proyecto, speaker, etc.>> 4 In operational semantics (tick, window content, report)
  • 5. You may think that we do not like heterogeneity… <<Texto libre: proyecto, speaker, etc.>> 5
  • 6. But at least I love it… • However, we need to tell people what to expect with each system, and smooth differences when they are not crucial…… <<Texto libre: proyecto, speaker, etc.>> 6
  • 7. The solution… • Let’s create a W3C community group… • • • • • To understand better those differences The requirements on which we are based And explain to others … And maybe get some “recommendation” out <<Texto libre: proyecto, speaker, etc.>> 7
  • 8. The W3C RDF Stream Processing Comm. Group • http://www.w3.org/community/rsp/ <<Texto libre: proyecto, speaker, etc.>> 8
  • 9. W3C RSP Community Group mission “The mission of the RDF Stream Processing Community Group (RSP) is to define a common model for producing, transmitting and continuously querying RDF Streams. This includes extensions to both RDF and SPARQL for representing streaming data, as well as their semantics. Moreover this work envisions an ecosystem of streaming and static RDF data sources whose data can be combined through standard models, languages and protocols. Complementary to related work in the area of databases, this Community Group looks at the dynamic properties of graph-based data, i.e., graphs that are produced over time and which may change their shape and data over time.” <<Texto libre: proyecto, speaker, etc.>> 9
  • 10. Use cases • We have started collecting them • And I hope that by the end of my talk you will consider contributing some more… <<Texto libre: proyecto, speaker, etc.>> 10
  • 11. A template to describe use cases (I) • Streaming Information • • • • • • Type: Environmental data: temperatures, pressures, salinity, acidity, fluid velocities etc, Nature: • Relational Stream: yes • Text stream: no Origin: Data is produced by sensors in oil wells and on oil and gas platforms equipments. Each oil platform has an average of 400.000. Frequency of update: • from sub-second to minutes • In triples/minute: [10000-10] t/min Quality: It varies, due to instrument/sensor issues Management /access • Technology in use: Dedicated (relational and proprietary) stores • Problems: The ability of users to access data from different sources is limited by an insufficient description of the context • Means of improvement: Add context (metadata) to the data so it become meaningful and use reasoning techniques to process that metadata <<Texto libre: proyecto, speaker, etc.>> 11
  • 12. A template to describe use cases (II) • [optional] Static Information required to interpret the streaming information • • • • • Type: Topology of the sensor network, position of each sensor, the descriptions of the oil platform Origin: Oil and gas production operations Dimension: • 100s of MB as PostGIS dump • In triples: 10^8 Quality: Good Management / access • Technology in use: RDBMS, proprietary technologies • Available Ontologies and Vocabularies: Reference Semantic Model (RSM), based on ISO 15926 <<Texto libre: proyecto, speaker, etc.>> 12
  • 13. A tale of four heterogeneities ISWC2013 Workshop on Ordering and Reasoning, Sydney, 22/10/2013 Oscar Corcho ocorcho@fi.upm.es, ocorcho@localidata.com @ocorcho http://www.slideshare.net/ocorcho/
  • 14. Heterogeneity #1: Representing RDF Streams <<Texto libre: proyecto, speaker, etc.>> 14
  • 15. What is an RDF stream? • Several possibilities: • An RDF stream is an infinite sequence of timestamped events (triples or graphs), where timestamps are nondecreasing … <eventi,ti > <eventi+1,ti+1 > <eventi+2,ti+2 > … • An RDF stream is an infinite sequence of triple occurrences <<s,p,o>,tα,tω> where <s,p,o> is an RDF triple and tα and tω are the start and end of the interval • How are timestamps assigned?
  • 16. Some examples… • What would be the best/possible RDF stream representation for the following types of problems? • Does Alice meet Bob before Carl? • Who does Carl meet first? :alice :isWith :bob :alice :isWith :carl e1 :diana :isWith :carl :bob :isWith :diana e2 e3 e4 • How many people has Alice met in the last 5m? • Does Diana meet Bob and then Carl within 5m? 1 3 6 9 t • Which are the meetings the last less than 5m? • Which are the meetings with conflicts? :alice :isWith :bob :alice :isWith :carl :bob :isWith :diana :diana :isWith :carl e4 e2 e1 <<Texto libre: proyecto, speaker, etc.>> e3 16
  • 17. Data types for semantic streams - Summary • Multiple notions of RDF stream proposed • Ordered sequence (implicit timestamp) • One timestamp per triple (point in time semantics) • Two timestamps per triple (interval base semantics) • Comparison between existing approaches System Time model # of timestamps INSTANS triple Implicit 0 C-SPARQL triple Point in time 1 SPARQLstream triple Point in time 1 CQELS triple Point in time 1 Sparkwave triple Point in time 1 Streaming Linked Data RDF graph Point in time 1 ETALIS • Data item triple Interval 2 More investigation is required to agree on an RDF stream model 17
  • 18. Heterogeneity #2: RDF Stream processors <<Texto libre: proyecto, speaker, etc.>> 18
  • 19. Existing RDF Stream Processing systems • C-SPARQL: RDF Store + Stream processor • Combined architecture C-SPARQL query sta translator tic stre amin RDF Store g Stream processor continuous results • CQELS: Implemented from scratch. Focus on performance • Native + adaptive joins for static-data and streaming data CQELS query Native RSP continuous results • CQELS-Cloud: Reusing Storm • Paper presentation on Thursday CQELS query Storm topology continuous results
  • 20. Existing RSP systems • EP-SPARQL: Complex-event detection • SEQ, EQUALS operators EP-SPARQL query translator Prolog engine continuous results • SPARQLStream: Ontology-based stream query answering • Virtual RDF views, using R2RML mappings • SPARQL stream queries over the original data streams. SPARQLStream query rewriter DSMS/CEP R2RML mappings • Instans: RETE-based evaluation continuous results
  • 21. Query languages for semantic streams - Summary • Different architectural choices • It is not clear when each choice is best for which type of use case • Wrappers over existing systems • C-SPARQL, ETALIS, SPARQLstream , CQELS-Cloud • Better reliability and maintainability? • Native implementations • CQELS, Streaming Linked Data, INSTANS • Better scalability: optimizations that are not possible in other systems • Different operational semantics • See later 21
  • 22. Heterogeneity #3: Querying RDF Streams <<Texto libre: proyecto, speaker, etc.>> 22
  • 23. Querying data streams (from CQL to SPARQL-X) stream-to-relation (S2R) Relation s Streams infinite unbounded bag … <s,τ> … relation-to-relation (R2R) relation-to-stream (R2S) Stream <s1> <s2> <s3> finite bag Relati on R(t) Mapping: T  R S2R Window operators RDF Streams SPARQL operators RDF R2S operators
  • 24. Output: relation • Case 1: the output is a set of timestamped mappings a … ?b… [t1] a … ?b… SELECT ?a ?b … FROM …. WHERE …. queries CONSTRUCT {?a :prop ?b } FROM …. WHERE …. a … ?b… [t3] a … ?b… [t5] RS P a … ?b… [t7] bindings  <… :prop … > [t1]  <… :prop … >  <… :prop … > [t3]  <… :prop … > [t5]  <… :prop … > [t7] triples
  • 25. Output: stream • Case 2: the output is a stream • R2S operators CONSTRUCT RSTREAM {?a :prop ?b } FROM …. WHERE …. query RS P stream …  <… :prop … > [t1]  <… :prop … > [t1] <… :prop … > [t3] <… :prop … > [t5] < …:prop … > [t7] …  ISTREAM: stream out data in the last step that wasn’t on the previous step  DSTREAM: stream out data in the previous step that isn’t in the last step  RSTREAM: stream out all data in the last step
  • 26. Other operators • Sequence operators and CEP world e4 S e1 e2 e3 1 3 6 Sequence 9 Simultaneous  SEQ: joins eti,tf and e’ti’,tf’ if e’ occurs after e  EQUALS: joins eti,tf and e’ti’,tf’ if they occur simultaneously  OPTIONALSEQ, OPTIONALEQUALS: Optional join variants
  • 27. Query languages for semantic streams - Summary • Comparison between existing approaches System S2R R2R Time-aware R2S INSTANS Based on time events SPARQL update Based on time events Ins only C-SPARQL Engine Logical and triple-based SPARQL 1.1 query timestamp function Batch only SPARQLstream Logical and triple-based SPARQL 1.1 query no Ins, batch, del CQELS Logical and triple-based SPARQL 1.1 query no Ins only Sparkwave Logical SPARQL 1.0 no Ins only Streaming Linked Data Logical and graph-based SPARQL 1.1 no Batch only ETALIS no SPARQL 1.0 • Is it time to converge on a 27 SEQ, PAR, AND, OR, DURING, STARTS, standard? NOT, EQUALS, MEETS, FINISHES Ins only
  • 28. Query languages for semantic streams - Issues • Different syntax for S2R operator • Semantics of query languages is similar, but not identical • Lack of R2S operator in some cases • Different support for time-aware operators 28
  • 30. Heterogeneity #4: Operational Semantics <<Texto libre: proyecto, speaker, etc.>> 30
  • 31. Operational Semantics Where are both alice and bob in the last 5s? hall :hall sIn : :i isIn e : :alic :bob S e :alic hen :kitc :isIn S1 S2 S3 S4 1 3 6 :bob hen :kitc :isIn 9 System 1: System 2: :hall [5] :hall [3] t :kitchen [10] :kitchen [9] Both correct? ISWC 2013 evaluation track for "On Correctness in RDF stream processor benchmarking" by Daniele Dell’Aglio, Jean-Paul Calbimonte, Marco Balduini, Oscar Corcho and Emanuele Della Valle
  • 33. Next steps in the community group… • Agree on an RDF model? • • • • Metamodel? Timestamps in graphs? Timestamp intervals Compatibility with normal (static) RDF • Additional operators for SPARQL? • Windows (not only time based?) • CEP operators • Semantics • Go Web • Volatile URIs • Serialization: terse, compact • Protocols: HTTP, Websockets?
  • 34. On the need for a W3C community group on RDF Stream Processing ISWC2013 Workshop on Ordering and Reasoning, Sydney, 22/10/2013 Oscar Corcho ocorcho@fi.upm.es, ocorcho@localidata.com @ocorcho http://www.slideshare.net/ocorcho/

Editor's Notes

  1. {}