Stream Reasoning
For Linked Data
M. Balduini, J-P Calbimonte, O. Corcho,
D. Dell'Aglio, E. Della Valle, and J.Z. Pan
http:...
Share, Remix, Reuse — Legally
 This work is licensed under the Creative Commons
Attribution 3.0 Unported License.
 Your ...
Outline
 Continuous RDF model extensions
•

RDF Streams, timestamps

 Continuous extensions of SPARQL
•
•

Continuous ev...
Continuous extensions of RDF
 As you know, “RDF is a standard model for data interchange on the
Web” (http://www.w3.org/R...
Data items
 With data item we can refer to:
1. A triple

<:alice :isWith :bob>
2. A graph

<:alice :posts :p>
<:p :who :b...
Data items and time
 Do we need to associate the time to data items?
•

It depends on what we want to achieve (see next!)...
Application time
 A timestamp is a temporal identifier associated to a data item
 The application time is a set of one o...
Missing application time
:alice :isWith :bob

:bob :isWith :diana

:alice :isWith :carl

S

e1

e2

:diana :isWith :carl

...
Application time: one timestamp
:alice :isWith :bob
:bob :isWith :diana
:alice :isWith :carl
:diana :isWith :carl

S

e1

...
Application time: two timestamps
:alice :isWith :bob

:bob :isWith :diana
:alice :isWith :carl
:diana :isWith :carl

e4

e...
Classification of existing systems
Triple
No timestamp

Instans

One timestamp

C-SPARQL
CQELS
SPARQLstream

Two timestamp...
Our assumptions
:alice :isWith :bob
:bob :isWith :diana
:alice :isWith :carl
:diana :isWith :carl

S

e1

e2

e3

e4

1

3...
Let’s process the RDF streams!
 DSMS and CEP worlds suggest different techniques and approaches
to process data streams

...
System time
 Stream processors can elaborate data streams exploiting the
timestamps associated to the events

 When a sy...
RDF stream
 An RDF stream is an infinite sequence of timestamped events
(triples or graphs)

…
<eventi,ti >
<eventi+1,ti+...
Querying data streams
 CQL model

stream-to-relation

Streams

infinite
unbounded
bag

…
<s,τ>
…

relation-to-relation

R...
Querying RDF data streams
 CQL model

S2R Window operators

RDF
Streams

SPARQL operators

RDF
Mappings
R2S operators

Ab...
Time-based Windows
 Who are both alice and carl meeting?

S

e1

e2

e3

1

3

6

:bob

S

e4

e5
9

t

:diana

e1

e2

e...
R2R operators
 SPARQL operators
•
•
•
•
•

Graph pattern matching
JOIN
OPTIONAL JOIN
SELECTION
UNION
S2R Window operators...
SPARQL: a quick recap

http://streamreasoning.org/sr4ld2013

20
Output: relation
 Case 1: the output is a set of timestamped mappings
a … ?b… [t1]
a … ?b…

SELECT ?a ?b …
FROM ….
W...
Output: stream
 Case 2: the output is a stream
 R2S operators
CONSTRUCT RSTREAM {?a :prop ?b }
FROM ….
WHERE ….

query

...
Other operators
 Sequence operators and CEP world
e4

S
e1

e2

e3

1

3

6

Sequence

9

Simultaneous

 SEQ: joins eti,...
Existing RSP systems
 C-SPARQL: RDF Store + Stream processor
•

Combined architecture

RDF Store

C-SPARQL
query

transla...
Existing RSP systems
 EP-SPARQL: Complex-event detection
•

SEQ, EQUALS operators

EP-SPARQL
query

Prolog
engine

transl...
Query languages syntax
SELECT ?sensor
FROM NAMED STREAM <http://www.cwi.nl/SRBench/observations> [NOW-3 HOURS SLIDE 10
MIN...
Time function

Sequence,
Co-ocurrence

R2S operator

Triple window

Time window

Aggregates

Union, Join,
Optional,
Filter...
RDF Stream Processors

Can we compare these RSPs?
Do RSPs behave the same?
Do we get the same results form RSPs?

http://s...
Operational Semantics
Where are both alice and bob in the last 5s?

S

S1

S2

S3

S4

1

3

6

9

System 1:
System 2:

:h...
SECRET Model: understand operational
semantics
t0: When does the
windowing start?
(internal window param)

REPORT: When is...
SECRET model classification

CQELS
Report Content-change

Tick Tuple-driven
Empty relation No
notification

C-SPARQL

SPAR...
Benchmarks and comparing
http://www.w3.org/wiki/SRBench

C-SPARQL
SPARQLStream
CQELS
Not exhaustive!

http://streamreasoni...
Functional Evaluation

System

Q1

Q2

Q3

Q4

Q5

Q6 Q7

Q8

Q9

Q10 Q11 Q12

G

G,IF SD

SPARQLStream

PP

A

G

G

CQEL...
A lot to do…
 Agree on an RDF model?
•
•
•
•

Metamodel?
Timestamps in graphs?
Timestamp intervals
Compatibility with nor...
References


Arasu, A., Babu, S., Widom, J.: The CQL continuous query language : semantic
foundations. The VLDB Journal 1...
Stream Reasoning
For Linked Data
M. Balduini, J-P Calbimonte, O. Corcho,
D. Dell'Aglio, E. Della Valle, and J.Z. Pan
http:...
Upcoming SlideShare
Loading in …5
×

RDF Stream Processing Models (SR4LD2013)

507 views

Published on

Presentation on RDF Stream Processing models given at the SR4LD tutorial (ISWC 2013) -- updated version at: http://www.slideshare.net/dellaglio/rsp2014-01rspmodelsss

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
507
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
18
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

RDF Stream Processing Models (SR4LD2013)

  1. 1. Stream Reasoning For Linked Data M. Balduini, J-P Calbimonte, O. Corcho, D. Dell'Aglio, E. Della Valle, and J.Z. Pan http://streamreasoning.org/sr4ld2013 RDF stream processing models Daniele Dell’Aglio, daniele.dellaglio@polimi.it Jean-Paul Cabilmonte, jp.calbimonte@upm.es
  2. 2. Share, Remix, Reuse — Legally  This work is licensed under the Creative Commons Attribution 3.0 Unported License.  Your are free: to Share — to copy, distribute and transmit the work to Remix — to adapt the work  Under the following conditions Attribution — You must attribute the work by inserting – “[source http://streamreasoning.org/sr4ld2013]” at the end of each reused slide – a credits slide stating - These slides are partially based on “Streaming Reasoning for Linked Data 2013” by M. Balduini, J-P Calbimonte, O. Corcho, D. Dell'Aglio, E. Della Valle, and J.Z. Pan http://streamreasoning.org/sr4ld2013  To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ http://streamreasoning.org/sr4ld2013 2
  3. 3. Outline  Continuous RDF model extensions • RDF Streams, timestamps  Continuous extensions of SPARQL • • Continuous evaluation Additional operators  Overview of existing systems • • Implemented operators Different evaluation approaches http://streamreasoning.org/sr4ld2013 3
  4. 4. Continuous extensions of RDF  As you know, “RDF is a standard model for data interchange on the Web” (http://www.w3.org/RDF/) <sub1 pred1 obj1> <sub2 pred2 obj2>  We want to extend RDF to model data streams  A data stream is an (infinite) ordered sequence of data items  A data item is a self-consumable informative unit http://streamreasoning.org/sr4ld2013 4
  5. 5. Data items  With data item we can refer to: 1. A triple <:alice :isWith :bob> 2. A graph <:alice :posts :p> <:p :who :bob> :graph1 <:p :where :redRoom> http://streamreasoning.org/sr4ld2013 5
  6. 6. Data items and time  Do we need to associate the time to data items? • It depends on what we want to achieve (see next!)  If yes, how to take into account the time? • • • Time should not (but could) be part of the schema Time should not be accessible through the query language Time as object would require a lot of reification  How to extend the RDF model to take into account the time? http://streamreasoning.org/sr4ld2013 6
  7. 7. Application time  A timestamp is a temporal identifier associated to a data item  The application time is a set of one or more timestamps associated to the data item  Two data items can have the same application time • Contemporaneity  Who does assign the application time to an event? • The one that generates the data stream! http://streamreasoning.org/sr4ld2013 7
  8. 8. Missing application time :alice :isWith :bob :bob :isWith :diana :alice :isWith :carl S e1 e2 :diana :isWith :carl e3 e4  A RDF stream without timestamp is an ordered sequence of data items  The order can be exploited to perform queries • • Does Alice meet Bob before Carl? Who does Carl meet first? http://streamreasoning.org/sr4ld2013 8
  9. 9. Application time: one timestamp :alice :isWith :bob :bob :isWith :diana :alice :isWith :carl :diana :isWith :carl S e1 e2 e3 e4 1 3 6 9 t  One timestamp: the time on which the data item occurs  We can start to compose queries taking into account the time • • How many people has Alice met in the last 5m? Does Diana meet Bob and then Carl within 5m? http://streamreasoning.org/sr4ld2013 9
  10. 10. Application time: two timestamps :alice :isWith :bob :bob :isWith :diana :alice :isWith :carl :diana :isWith :carl e4 e2 S e1 1 3 e3 6 9 t  Two timestamps: the time range on which the data item is valid (from, to]  It is possible to write even more complex constraints: • • Which are the meetings the last less than 5m? Which are the meetings with conflicts? http://streamreasoning.org/sr4ld2013 10
  11. 11. Classification of existing systems Triple No timestamp Instans One timestamp C-SPARQL CQELS SPARQLstream Two timestamps Graph EP-SPARQL/Etalis http://streamreasoning.org/sr4ld2013 SLD 11
  12. 12. Our assumptions :alice :isWith :bob :bob :isWith :diana :alice :isWith :carl :diana :isWith :carl S e1 e2 e3 e4 1 3 6 9 t  In the following we will consider the following setting • • • A RDF triple is an event Application time: single timestamp System time = application time <:alice :isWith:bob>:[1] <:alice :isWith:carl>:[3] <:bob :isWith :diana>:[6] ... http://streamreasoning.org/sr4ld2013 12
  13. 13. Let’s process the RDF streams!  DSMS and CEP worlds suggest different techniques and approaches to process data streams  We focus on the CQL/STREAM model http://streamreasoning.org/sr4ld2013 13
  14. 14. System time  Stream processors can elaborate data streams exploiting the timestamps associated to the events  When a system receives an event, it could have the need of associating a timestamp • This is the system time  The system time is an internal value, it does not exit from the system!  The system time must be unique  Can application and system time coincide? • • It depends Approximation http://streamreasoning.org/sr4ld2013 14
  15. 15. RDF stream  An RDF stream is an infinite sequence of timestamped events (triples or graphs) … <eventi,ti > <eventi+1,ti+1 > <eventi+2,ti+2 > …  The (application) timestamps must be non-decreasing ti <= ti+1 http://streamreasoning.org/sr4ld2013 15
  16. 16. Querying data streams  CQL model stream-to-relation Streams infinite unbounded bag … <s,τ> … relation-to-relation Relations relation-to-stream Stream http://streamreasoning.org/sr4ld2013 <s1> <s2> <s3> finite bag Relation R(t) Mapping: T  R 16
  17. 17. Querying RDF data streams  CQL model S2R Window operators RDF Streams SPARQL operators RDF Mappings R2S operators Abstract query processing model http://streamreasoning.org/sr4ld2013 17
  18. 18. Time-based Windows  Who are both alice and carl meeting? S e1 e2 e3 1 3 6 :bob S e4 e5 9 t :diana e1 e2 e3 1 3 6 :bob http://streamreasoning.org/sr4ld2013 e4 e5 9 Windows + slides t 18
  19. 19. R2R operators  SPARQL operators • • • • • Graph pattern matching JOIN OPTIONAL JOIN SELECTION UNION S2R Window operators SPARQL operators RDF Mappings RDF Streams R2S operators http://streamreasoning.org/sr4ld2013 19
  20. 20. SPARQL: a quick recap http://streamreasoning.org/sr4ld2013 20
  21. 21. Output: relation  Case 1: the output is a set of timestamped mappings a … ?b… [t1] a … ?b… SELECT ?a ?b … FROM …. WHERE …. queries a … ?b… [t3] a … ?b… [t5] RSP CONSTRUCT {?a :prop ?b } FROM …. WHERE …. a … ?b… [t7] bindings <… :prop … > [t1] <… :prop … > <… :prop … > [t3] <… :prop … > [t5] <… :prop … > [t7] triples http://streamreasoning.org/sr4ld2013 21
  22. 22. Output: stream  Case 2: the output is a stream  R2S operators CONSTRUCT RSTREAM {?a :prop ?b } FROM …. WHERE …. query RSP stream … <… :prop <… :prop <… :prop <… :prop < …:prop … … … … … … > > > > > [t1] [t1] [t3] [t5] [t7]  R2S operators:  ISTREAM: stream out data in the last step that wasn’t on the previous step  DSTREAM: stream out data in the previous step that isn’t in the last step  RSTREAM: stream out all data in the last step http://streamreasoning.org/sr4ld2013 22
  23. 23. Other operators  Sequence operators and CEP world e4 S e1 e2 e3 1 3 6 Sequence 9 Simultaneous  SEQ: joins eti,tf and e’ti’,tf’ if e’ occurs after e  EQUALS: joins eti,tf and e’ti’,tf’ if they occur simultaneously  OPTIONALSEQ, OPTIONALEQUALS: Optional join variants http://streamreasoning.org/sr4ld2013 23
  24. 24. Existing RSP systems  C-SPARQL: RDF Store + Stream processor • Combined architecture RDF Store C-SPARQL query translator Stream processor continuous results  CQELS: Implemented from scratch. Focus on performance • Native + adaptive joins for static-data and streaming data CQELS query Native RSP continuous results Disclaimer: oversimplified descriptions http://streamreasoning.org/sr4ld2013 24
  25. 25. Existing RSP systems  EP-SPARQL: Complex-event detection • SEQ, EQUALS operators EP-SPARQL query Prolog engine translator continuous results  SPARQLStream: Ontology-based stream query answering • • Virtual RDF views, using R2RML mappings SPARQL stream queries over the original data streams. SPARQLStream query rewriter DSMS/CEP continuous results R2RML mappings  Instans: RETE-based evaluation Disclaimer: oversimplified descriptions http://streamreasoning.org/sr4ld2013 25
  26. 26. Query languages syntax SELECT ?sensor FROM NAMED STREAM <http://www.cwi.nl/SRBench/observations> [NOW-3 HOURS SLIDE 10 MINUTES] WHERE { Stream ?observation om-owl:procedure ?sensor ; om-owl:observedProperty weather:WindSpeed ; om-owl:result [ om-owl:floatValue ?value ] . } GROUP BY ?sensor HAVING ( AVG(?value) >= "74"^^xsd:float ) SPARQL SELECT ?sensor FROM STREAM <http://www.cwi.nl/SRBench/observations> [RANGE 1h STEP 10m] WHERE { ?observation om-owl:procedure ?sensor ; om-owl:observedProperty weather:WindSpeed ; om-owl:result [ om-owl:floatValue ?value ] . } GROUP BY ?sensor HAVING ( AVG(?value) >= "74"^^xsd:float ) C-SPARQL SELECT ?sensor WHERE { STREAM <http://www.cwi.nl/SRBench/observations> [RANGE 10800s SLIDE 600s] { ?observation om-owl:procedure ?sensor ; om-owl:observedProperty weather:WindSpeed ; om-owl:result [ om-owl:floatValue ?value ] .} } GROUP BY ?sensor HAVING ( AVG(?value) >= "74"^^xsd:float ) CQELS http://streamreasoning.org/sr4ld2013 26
  27. 27. Time function Sequence, Co-ocurrence R2S operator Triple window Time window Aggregates Union, Join, Optional, Filter Continuous execution Model Classification of existing systems TASPARQL TA-RDF ✗ ✔ Limited ✗ ✗ ✗ ✗ ✗ tSPARQL tRDF ✗ ✔ ✗ ✗ ✗ ✗ ✗ ✗ Streaming SPARQL RDF Stream ✔ ✔ ✗ ✔ ✔ ✗ ✗ ✗ C-SPARQL RDF Stream ✔ ✔ ✔ ✔ ✔ ✗ ✗ ✔ CQELS RDF Stream ✔ ✔ ✔ ✔ ✔ ✗ ✗ ✗ SPARQLStr eam (Virtual) RDF Stream ✔ ✔ ✔ ✔ ✗ ✔ ✗ ✗ EPSPARQL RDF Stream ✔ ✔ ✔ ✗ ✗ ✗ ✔ ✗ Instans RDF ✔ ✔ ✔ ✗ ✗ ✗ ✗ ✗ Disclaimer: other features may be missing http://streamreasoning.org/sr4ld2013 27
  28. 28. RDF Stream Processors Can we compare these RSPs? Do RSPs behave the same? Do we get the same results form RSPs? http://streamreasoning.org/sr4ld2013 28
  29. 29. Operational Semantics Where are both alice and bob in the last 5s? S S1 S2 S3 S4 1 3 6 9 System 1: System 2: :hall [5] :hall [3] Both correct? t :kitchen [10] :kitchen [10] Find out more later this week on the ISWC Evaluation Track! Thursday at noon! http://streamreasoning.org/sr4ld2013 29
  30. 30. SECRET Model: understand operational semantics t0: When does the windowing start? (internal window param) REPORT: When is the window content made available to the R2R operator? WINDOW Non-empty content, Content-change, CONTENT: Which Window-close, Periodic stream elements are in the window? R2R operator ω β TICK: When the data stream are inserted in the window? Triple-based vs graph-based S W(ω,β) S1 S3 S2 S4 S6 S5 S7 S9 S11 S8 S10 S12 t http://streamreasoning.org/sr4ld2013 30
  31. 31. SECRET model classification CQELS Report Content-change Tick Tuple-driven Empty relation No notification C-SPARQL SPARQLstream Window-close Non-empty content Window-close Non-empty content Tuple-driven Tuple-driven Yes No  Characterize non-window-based RSPs?  Multiple streams?, reasoning?, linking with static data? http://streamreasoning.org/sr4ld2013 31
  32. 32. Benchmarks and comparing http://www.w3.org/wiki/SRBench C-SPARQL SPARQLStream CQELS Not exhaustive! http://streamreasoning.org/sr4ld2013 32
  33. 33. Functional Evaluation System Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 G G,IF SD SPARQLStream PP A G G CQELS PP A D/N IF PP PP PP PP PP PP C-SPARQL PP A D IF PP PP PP PP PP PP SD Q13 Q14 Q15 Q16 Q17 PP,SD PP,SD PP,SD PP,SD PP,SD PP,SD Ask Dstream Group by and aggregations IF expression Negation Property Path Static Dataset http://streamreasoning.org/sr4ld2013 33
  34. 34. A lot to do…  Agree on an RDF model? • • • • Metamodel? Timestamps in graphs? Timestamp intervals Compatibility with normal (static) RDF  Additional operators for SPARQL? • • • Windows (not only time based?) CEP operators Semantics  Go Web • • • Volatile URIs Serialization: terse, compact Protocols: HTTP, Websockets? http://streamreasoning.org/sr4ld2013 34
  35. 35. References  Arasu, A., Babu, S., Widom, J.: The CQL continuous query language : semantic foundations. The VLDB Journal 15(2) (2006) 121–142  Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: C-SPARQL: A continuous query language for RDF data streams. IJSC 4(1) (2010) 3–25  Botan, I., Derakhshan, R., Dindar, N., Haas, L., Miller, R.J., Tatbul, N.: Secret:A model for analysis of the execution semantics of stream processing systems. PVLDB 3(1) (2010) 232–243  Calbimonte, J.P., Jeung, H., Corcho, O., Aberer, K.: Enabling Query Technologies for the Semantic Sensor Web. IJSWIS 8(1) (2012) 43–63  Le-Phuoc, D., Dao-Tran, M., Xavier Parreira, J., Hauswirth, M.: A native and adaptive approach for unified processing of linked streams and linked data. In: ISWC. (2011) 370–388  Anicic, D., Fodor, P., Rudolph, S., Stojanovic, N.: EP-SPARQL: a unified language for event processing and stream reasoning. In: WWW. (2011) 635–644 http://streamreasoning.org/sr4ld2013 35
  36. 36. Stream Reasoning For Linked Data M. Balduini, J-P Calbimonte, O. Corcho, D. Dell'Aglio, E. Della Valle, and J.Z. Pan http://streamreasoning.org/sr4ld2013 RDF stream processing models Daniele Dell’Aglio, daniele.dellaglio@polimi.it Jean-Paul Cabilmonte, jp.calbimonte@upm.es

×