The presentation I gave at WU Vienna on 18/2/2016. I discuss the problem of unifying existing solutions for processing semantic streams, with a particular focus on the ones that perform continuous query answering over RDF streams.
Stephan Ewen - Stream Processing as a Foundational Paradigm and Apache Flink'... - Ververica
Stream Processing is emerging as a popular paradigm for data processing architectures, because it handles the continuous nature of most data and computation and gets rid of artificial boundaries and delays.
The fact that stream processing is gaining rapid adoption is also due to more powerful and maturing technology (much of it open source at the ASF) that has solved many of the hard technical challenges.
We discuss Apache Flink's approach to high performance stream processing with state, strong consistency, low latency, and sophisticated handling of time. With such building blocks, Apache Flink can handle classes of problems previously considered out of reach for stream processing. We also take a sneak peek at the next steps for Flink.
Days In Green (DIG): Forecasting the life of a healthy service - Arun Kejariwal
This document describes Twitter's Days In Green (DIG) methodology for forecasting how long a service will remain healthy before it exceeds a predefined capacity threshold. It involves collecting time series data on a service's key performance metric, detecting anomalies and breakouts, fitting an ARIMA model to capture trends and seasonality, and forecasting the number of days before the threshold is breached to determine capacity needs. The methodology has been deployed at Twitter to help plan capacity for hundreds of services and detect those nearing disaster recovery thresholds.
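DIG itself fits an ARIMA model to the metric; as a hedged sketch of the underlying idea, the fragment below substitutes a plain linear trend and solves for the day the fitted line crosses the threshold (function and variable names are illustrative, not Twitter's actual API):

```python
# Simplified sketch of the DIG idea: given a capacity metric's daily
# history, estimate the days remaining until it crosses a threshold.
# DIG fits an ARIMA model; a plain least-squares trend stands in here.

def days_in_green(history, threshold):
    """history: one observation per day; returns days until the fitted
    trend crosses `threshold`, or None if the trend is flat/declining."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # metric not growing: no breach forecast
    # Solve intercept + slope * t = threshold for t, relative to today.
    t_breach = (threshold - intercept) / slope
    return max(0.0, t_breach - (n - 1))

# A service at 50% capacity growing ~2 points/day breaches 80% in 8 more days.
usage = [50 + 2 * d for d in range(8)]
print(days_in_green(usage, 80.0))  # → 8.0
```

The same shape carries over when the linear fit is replaced by an ARIMA forecast: the "days in green" is simply the first forecast horizon at which the predicted metric breaches the threshold.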
This document discusses tools and methodologies for analyzing and resolving questions about why certain fire units were or were not dispatched to emergency responses. It provides examples of common questions, such as why an address was incorrect or why a closer unit was not chosen, and the steps and analyses that can be used to determine the root causes and answer the questions. These include checking unit locations, response times, road networks and speeds, and identifying potential issues with AVL systems, call entry processes, or CAD recommendations.
Ontology based top-k query answering over massive, heterogeneous, and dynamic... - Daniele Dell'Aglio
This document discusses ontology-based top-k continuous query answering over streaming data from multiple heterogeneous sources. It aims to investigate how ontologies and top-k queries can improve continuous query processing by exploiting ordering. The research will analyze state of the art solutions, define an evaluation framework, and assess the effects on correctness and performance of techniques that integrate stream reasoning and top-k queries. Preliminary results include an extension of an RDF stream processor testbench and a case study on real-time social media analytics.
Augmented Participation to Live Events through Social Network Content Enrichm... - Daniele Dell'Aglio
The document describes ECSTASYS, a system that captures social media content related to live events and enriches it to provide more context and value for event attendees. ECSTASYS retrieves tweets about an event, filters irrelevant ones, identifies event-related entities, associates tweets with specific event sub-topics, and visualizes the information organized by event. It uses a knowledge base derived from event schedules and ontologies to link tweets to the correct event components to provide a more holistic view of the complex live event through social media.
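As a hedged sketch of that pipeline's shape (function names, keyword matching and sub-topic linking below are illustrative stand-ins for ECSTASYS's actual knowledge-base-driven techniques):

```python
# Toy version of the ECSTASYS stages: filter irrelevant tweets, link the
# rest to an event sub-topic, and group them for visualization.

def relevant(tweet, event_keywords):
    # Stand-in for the relevance filter (real system uses richer features).
    return any(k in tweet.lower() for k in event_keywords)

def link_to_subtopic(tweet, subtopics):
    # Naive entity spotting: first sub-topic whose label occurs in the tweet.
    for name, labels in subtopics.items():
        if any(label in tweet.lower() for label in labels):
            return name
    return "general"

def enrich(tweets, event_keywords, subtopics):
    by_subtopic = {}
    for t in tweets:
        if relevant(t, event_keywords):
            by_subtopic.setdefault(link_to_subtopic(t, subtopics), []).append(t)
    return by_subtopic

tweets = ["Great keynote at #conf!", "Lunch was ok",
          "Demo session is packed #conf"]
out = enrich(tweets, ["#conf"], {"keynote": ["keynote"], "demos": ["demo"]})
print(out)
# → {'keynote': ['Great keynote at #conf!'], 'demos': ['Demo session is packed #conf']}
```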
This document reports the results of unit root tests on several time series variables. It presents the Augmented Dickey-Fuller (ADF) test statistic for each variable and lags of each variable, and compares these to critical values at various significance levels to test the null hypothesis of each variable having a unit root. The tests were run on variables including GM1, GM2, GMB, GM1ISL, GM2ISL, GMBISL, GCPI, GCREDIT, GLIKUID, GCREDIT ISL, and GLIKUID ISL.
XSPARQL is a query language that allows querying of both XML and RDF data sources simultaneously. It extends the syntax of XQuery with a SPARQL-for clause to query RDF data and a CONSTRUCT clause to produce RDF output. XSPARQL 1.1 supports SPARQL 1.1 operators like aggregation, federation, negation and property paths. It also allows processing of JSON files. The XSPARQL evaluator takes an XSPARQL query, rewrites it, optimizes it, and executes it using XQuery and SPARQL engines to retrieve and combine data from different sources into a unified XML or RDF answer.
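As a hedged illustration of the SPARQL-for clause described above (the syntax is approximated from the XSPARQL proposal; the file name and prefix are placeholders), a query interleaving XQuery and SPARQL might look like:

```
prefix foaf: <http://xmlns.com/foaf/0.1/>

<people>{
  (: SPARQL-for clause: iterate over the solutions of a graph pattern :)
  for $person $name from <relations.rdf>
  where { $person foaf:name $name }
  return <person uri="{$person}">{$name}</person>
}</people>
```

The evaluator would rewrite the SPARQL-for clause into calls to a SPARQL engine and feed the resulting bindings back into the surrounding XQuery expression.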
Our environment consists of living and non-living components that interact in complex ways. Humans rely on healthy ecosystems, but our activities have disrupted natural cycles and caused pollution. Key issues include climate change from greenhouse gas emissions, which risks a runaway warming effect. While scientists agree human activity contributes to current warming trends, fully predicting climate impacts remains challenging due to its complexity. Maintaining sustainable resource use requires awareness of our footprint on ecological systems.
The document discusses logging frameworks and describes key concepts such as declaration and naming of loggers, different logging levels, appenders for output, and layouts. It uses Log4j as an example logging framework and shows how to configure loggers, levels, and appenders in the properties file. Code examples are provided to illustrate logger declaration and usage.
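For concreteness, a minimal Log4j 1.x properties configuration of the kind described (the logger and package names are illustrative) might look like:

```properties
# Root logger at INFO, routed to the "stdout" appender
log4j.rootLogger=INFO, stdout

# Console appender with a pattern layout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c - %m%n

# Per-package logger with a more verbose level (package name is illustrative)
log4j.logger.com.example.dao=DEBUG
```

In code, a logger is conventionally declared once per class, e.g. `private static final Logger log = Logger.getLogger(MyClass.class);`, and messages are emitted through level methods such as `log.debug(...)` and `log.error(...)`.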
This document discusses an empirical study of RDF stream processing systems. The study aimed to understand why different systems can produce different outputs for the same inputs. Through experiments, the study found that differences could be explained by parameters like the starting time (t0) of windows in continuous queries. A more detailed model called SECRET was then developed to describe stream processing and help predict system outputs. This led to the CSR-bench benchmark for evaluating and comparing RDF stream reasoning systems.
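The t0 effect the study identifies can be illustrated with a small sketch: the same sliding-window specification over the same stream yields different window contents when the first window opens at a different instant (a simplified model, not any particular engine's implementation):

```python
# Time-based sliding windows of a given width and slide, parameterized
# by t0, the opening time of the first window.

def windows(events, width, slide, t0, until):
    """events: list of (timestamp, item). Returns the contents of each
    window [start, start + width) with start = t0, t0 + slide, ..."""
    result = []
    start = t0
    while start + width <= until:
        result.append([item for ts, item in events
                       if start <= ts < start + width])
        start += slide
    return result

stream = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
print(windows(stream, 4, 2, t0=0, until=8))
# → [['a', 'b', 'c'], ['b', 'c', 'd', 'e'], ['d', 'e', 'f']]
print(windows(stream, 4, 2, t0=1, until=8))
# → [['a', 'b', 'c', 'd'], ['c', 'd', 'e', 'f']]
```

Two engines that agree on width and slide but pick different t0 values will report different answers for the same query, which is exactly the kind of divergence SECRET was built to explain.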
Galaxy Blimps LLC was established in 2000 and has expanded its operations since 2006 through various milestones and certifications. The company provides aerial broadcasting, movie and TV production, aerial surveillance, military, university research, and other applications through its fleet of blimps like the HD60 and HD75 blimps. Galaxy Blimps sees future potential for unmanned airships in heavy cargo lift, high altitude radar platforms, atmospheric research, and communications relay.
This document discusses correctness in benchmarking RDF stream processors. It proposes a common model for the operational semantics of these systems called CSR and an extension to an existing benchmark called CSR-bench that focuses on correctness. CSR-bench includes an oracle to automatically validate correctness and a test suite. Experiments with three systems showed incorrect behaviors related to window initialization, slide parameters, window contents and timestamps. The work aims to improve understanding and assessment of these systems through a shared test environment.
This document provides an overview of multi-viewing screen installations and includes case studies of corporate boardrooms, family rooms, and dual-purpose installations. It discusses the benefits of designing multi-purpose rooms with increased functionality using a single space instead of two and the appeal of unique designs. Reasons for commercial installations include simultaneous screens for video conferencing, data, webinars and news monitoring as well as options for single-screen presentations based on group size and monitoring markets and news from executive offices.
Attachment C Company Overview Catalog Of Services - GalaxyBlimps
Galaxy Blimps LLC is a provider of unmanned aircraft systems for defense, civil, and commercial applications. They have over 40 years of combined experience in various unmanned platforms. Their services include pre-program planning, flight services, design and engineering, production, system documentation, training, payload integration, best practices, and staffing assistance. They have experience supporting defense programs and agencies like DHS, DoJ, and FBI.
The document discusses augmented Dickey-Fuller (ADF) tests for detecting unit roots in time series data. It presents the three possible forms of the ADF test regression and describes how to determine the appropriate model. A procedure is outlined for selecting between models and testing whether time series contain deterministic trends, constants, or unit roots. The document also provides instructions for performing ADF tests in EViews software, including specifying the test regression, lag length, and interpreting the test results.
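As a hedged sketch of how the test statistic is constructed (the constant-only form of the regression, without the Dickey-Fuller critical-value tables that EViews supplies automatically): regress the first difference on a constant, the lagged level, and lagged differences, and read off the t-statistic on the lagged-level coefficient.

```python
# ADF regression (constant, no trend): Δy_t = α + ρ·y_{t-1} + Σ γ_i·Δy_{t-i} + ε_t.
# The test statistic is the t-ratio of ρ̂; very negative values reject a unit root.
import numpy as np

def adf_stat(y, lags=1):
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows, target = [], []
    for t in range(lags, len(dy)):
        # Design row: constant, lagged level, then `lags` lagged differences.
        rows.append([1.0, y[t]] + [dy[t - i] for i in range(1, lags + 1)])
        target.append(dy[t])
    X, z = np.array(rows), np.array(target)
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ b
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return b[1] / np.sqrt(cov[1, 1])  # t-stat on the y_{t-1} coefficient

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=500))  # has a unit root
stationary = rng.normal(size=500)              # does not
print(adf_stat(random_walk), adf_stat(stationary))
```

The stationary series produces a strongly negative statistic (well below the roughly -2.9 critical value at 5%), while the random walk's statistic stays near zero, so the unit-root null is not rejected for it.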
The document discusses revision control systems and their main concepts and operations. It describes how revision control allows for backup of files, sharing of work, and cooperative development. The key operations covered are checkout, commit, update, and revert. It also discusses branches, tags, and distributed version control systems.
The document discusses unit testing and the JUnit framework. It defines unit testing as testing individual units or modules of code in isolation to determine if they work as expected. JUnit is introduced as a unit testing framework for Java. Key concepts covered include test cases, test fixtures, test suites, annotations for setup and teardown like @Before and @After, and best practices for test-driven development. Examples are provided of writing test cases using JUnit to test a TreeNode class and its methods.
This document discusses solid waste management issues in India. It notes that rapid urbanization, neglect by authorities, and public apathy have led to a garbage crisis. To address this, authorities must implement proper waste management systems as per regulations by treating waste via composting, anaerobic digestion, or other technologies. The document outlines several waste treatment options and recommends that vermicomposting is suitable for individual homes, composting is best for medium capacities, and anaerobic digestion is appropriate for large volumes of waste. Effective waste management requires proper collection, transportation, treatment, disposal and public awareness.
Brief report about the contents of the Stream Reasoning workshop at ISWC 2016. Additional info about the event is available at: http://streamreasoning.org/events/sr2016
RSEP-QL: A Query Model to Capture Event Pattern Matching in RDF Stream Proces... - Daniele Dell'Aglio
This document proposes a query model called RSEP-QL to capture event pattern matching in RDF stream processing languages. It presents RSEP-QL's data model of RDF streams and windows, basic operators like EVENT and SEQ, and evaluation semantics. The goal is to provide a reference model for comparing different RSP query languages and studying related problems in a standardized way.
The talk I gave at the Stream Reasoning workshop at TU Berlin on December 8. I give an overview of RSEP-QL and how it can capture and formalise the behaviour of existing RSP engines, e.g. C-SPARQL, EP-SPARQL, CQELS and SPARQLstream.
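The SEQ operator mentioned above can be sketched informally: it joins the matches of two patterns whenever the first match strictly precedes the second (a toy illustration of the idea, not RSEP-QL's formal evaluation semantics):

```python
# Toy SEQ: combine matches of pattern A with matches of pattern B that
# occur strictly later. Each match is a (timestamp, binding) pair.

def seq(matches_a, matches_b):
    return [((ta, a), (tb, b))
            for ta, a in matches_a
            for tb, b in matches_b
            if ta < tb]

door_open = [(1, "door42")]
motion = [(0, "hall"), (3, "hall")]
print(seq(door_open, motion))  # → [((1, 'door42'), (3, 'hall'))]
```

In RSEP-QL the operator is additionally scoped by windows over the RDF stream, so only event pairs falling in the window under evaluation are combined.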
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E... - Riccardo Tommasini
This master's thesis describes the Heaven framework for enabling systematic comparative research of RDF stream processing engines. The research question is whether a test stand, using existing queries, datasets and metrics, can support such comparative research. The author developed Heaven as an open-source test stand and used it to evaluate four baseline RDF stream processing engines. Heaven provides extensible components, methods for layered analysis, and tools to identify patterns and enable visual comparison of engine performance. The thesis concludes that Heaven supports comparative research of RDF stream processing engines and identifies opportunities for further development and for evaluating additional engines.
SDRule-L: Managing Semantically Rich Business Decision Processes - Christophe Debruyne
The Semantic Decision Rule Language (SDRule-L) is an extension of the Object-Role Modelling (ORM) language, one of the most popular fact-based graphical modelling languages for designing information systems. In this paper, we discuss how SDRule-L models can be formalized, analysed and applied in a business context. An SDRule-L model may contain static rules (e.g., data constraints) and dynamic rules (e.g., sequences of events). A reasoning engine is created for detecting inconsistencies. When an SDRule-L model is used to manage linked data, a feasible approach is to align SDRule-L with Semantic Web languages such as OWL. To achieve this, we propose to map dynamic rules into a combination of static rules and queries for detecting anomalies. We illustrate a model reification algorithm for automatically transforming SDRule-L models that contain dynamic rules into models containing only static rules, which can be formalized in Description Logic.
Published as: Yan Tang Demey, Christophe Debruyne: SDRule-L: Managing Semantically Rich Business Decision Processes. EC-Web 2013: 59-67
Keynote: Building and Operating A Serverless Streaming Runtime for Apache Bea... - Flink Forward
Apache Beam is Flink’s sibling in the Apache family of stream processing frameworks. The Beam and Flink teams work closely together on advancing what is possible in stream processing, including streaming SQL extensions and code interoperability on both platforms.
Beam was originally developed at Google as the amalgamation of its internal batch and streaming frameworks to power the exabyte-scale data processing for Gmail, YouTube and Ads. It now powers the fully managed, serverless Google Cloud Dataflow service, and is also available to run in other public clouds and on-premises when deployed in portability mode on Apache Flink, Spark, Samza and other runners. Users regularly run distributed data processing jobs on Beam spanning tens of thousands of CPU cores and processing millions of events per second.
In this session, Sergei Sokolenko, Cloud Dataflow product manager, and Reuven Lax, the founding member of the Dataflow and Beam team, will share Google’s learnings from building and operating a global stream processing infrastructure shared by thousands of customers, including:
safe deployment to dozens of geographic locations,
resource autoscaling to minimize processing costs,
separating compute and state storage for better scaling behavior,
dynamic work rebalancing of work items away from overutilized worker nodes,
offering a throughput-optimized batch processing capability with the same API as streaming,
grouping and joining of hundreds of terabytes in a hybrid in-memory/on-disk file system,
integrating with the Google Cloud security ecosystem, and other lessons.
Customers benefit from these advances through faster execution of jobs, resource savings, and a fully managed data processing environment that runs in the Cloud and removes the need to manage infrastructure.
Foundations of streaming SQL: stream & table theory - DataWorks Summit
The document provides an overview of streaming SQL and time-varying relations. It discusses:
1) How relations evolve over time in streaming SQL, with data divided into time intervals. This allows querying the relation at any point in time.
2) The closure properties of relational algebra still apply to time-varying relations. Operations like filtering and grouping can be performed on intervals of the relation.
3) Streaming SQL extends classic SQL to handle continuous queries over streaming data, represented as time-varying relations divided into time-based intervals.
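The snapshot view in points 1) to 3) can be sketched in a few lines: a time-varying relation is a function from time to an ordinary relation, and any classic query lifts to it pointwise, which is the closure property the talk relies on (a toy model, not an actual streaming SQL engine):

```python
# A time-varying relation as a function t -> relation, built from a log
# of timestamped insertions; classic queries lift to it snapshot-by-snapshot.

def time_varying(events):
    """events: list of (timestamp, row). Returns a function t -> relation,
    the list of rows inserted at or before t."""
    def at(t):
        return [row for ts, row in events if ts <= t]
    return at

def lift(query):
    """Lift a classic relation-to-relation query to time-varying relations."""
    return lambda tvr: (lambda t: query(tvr(t)))

scores = time_varying([(1, ("alice", 3)), (2, ("bob", 5)), (4, ("alice", 4))])
total = lift(lambda rel: sum(points for _, points in rel))
print(total(scores)(1), total(scores)(2), total(scores)(4))  # → 3 8 12
```

Querying the lifted relation at successive instants is exactly what a continuous query does; a streaming engine just avoids recomputing each snapshot from scratch.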
Distributed systems often replicate data across multiple servers for performance and reliability. To keep replicas consistent, conflicting operations like write-write must be ordered the same everywhere. However, guaranteeing a global order is costly and hurts scalability. Weaker consistency models address this by relaxing the consistency requirements. Client-centric models like monotonic reads and writes ensure users see their own updates regardless of which server they access.
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Cloud Dataflow frees you from operational tasks like resource management and performance optimization.
OLAP Basics and Fundamentals by Bharat Kalia Bharat Kalia
The document discusses online analytical processing (OLAP) and the need for OLAP capabilities beyond basic data analysis. It describes how OLAP uses multidimensional data models and pre-computed aggregates to provide fast and interactive analysis of data across multiple dimensions. Different approaches for implementing OLAP like ROLAP, MOLAP, and hybrid systems are covered.
Hybrid Fuzzy Sliding Mode Controller for Timedelay Systemijaia
This document describes a hybrid fuzzy sliding mode controller for time-delay systems. It begins by introducing time-delay systems and discussing sliding mode control as a suitable technique. It then presents the design of a sliding surface for the error function of a nonlinear time-delay system. Next, it describes constructing a fuzzy logic controller by designing a fuzzy rule base using the generated error signals. Simulation results found the proposed scheme to be robust even with perturbed system parameters. The aim is to develop an effective control algorithm for highly unstable nonlinear systems such as aerospace systems.
The document proposes an approach for efficient semantically enriched complex event processing and pattern matching. It introduces a temporally annotated RDF named graph data model to represent event streams. A query model is presented that decomposes queries into subqueries over event patterns. These subqueries are rewritten and processed in parallel and distributed manner. The proposed approach aims to integrate stream reasoning and complex event processing by supporting new operators for RDF graph patterns and allowing the use of techniques like NFA and EDG for pattern matching.
The document summarizes computational aspects of vehicle routing problems. It discusses time complexity and space complexity, and how they are measured as functions of problem size. It provides examples of calculating complexity for different algorithms. It also discusses common data structures for representing routes, including array lists, doubly linked lists, and their pros and cons for different operations. The document outlines Java code examples for comparing route representation using these data structures.
Multi-Perspective Comparison of Business Processes Variants Based on Event LogsMarlon Dumas
This document presents a method for multi-perspective comparison of business process variants based on event logs. The method involves constructing perspective graphs from different abstractions of event logs to analyze processes from different perspectives based on event attributes. Differential perspective graphs are then used to identify statistically significant differences between two event logs, representing different process variants. The method was experimentally applied to compare differences between divisions in an IT incident handling process using various abstractions and observations. The experiments revealed differences in activity statuses, control flows between countries, and control flow frequencies over time between the divisions.
The document discusses various software metrics that can be used to measure attributes of software products and processes. It describes metrics for size (e.g. lines of code), complexity (e.g. cyclomatic complexity), quality (e.g. defects per KLOC), design (e.g. coupling and cohesion), and object-oriented software (e.g. weighted methods per class). The goals of metrics include estimating costs, evaluating quality, and improving processes and products.
IMaRS - Incremental Materialization for RDF Streams (SR4LD2013)Daniele Dell'Aglio
This document discusses incremental materialization for RDF streams (IMaRS). IMaRS is an approach for incremental reasoning over sliding windows of RDF streams. It avoids recomputing the entire materialization when the window slides by tracking expiration times and computing only the changes (additions and removals) needed for the new materialization. The maintenance is done through execution of a logic program that uses contexts to build the delta sets for updating the materialization incrementally as new data enters the window.
Principles in Data Stream Processing | Matthias J Sax, ConfluentHostedbyConfluent
Data stream processing is, for many of us, a new paradigm with which you process data and build applications. In this talk, we will take you on a journey through the theoretical foundations of stream processing and discuss the underlying principles and unique problems that need to be addressed. What actually is a data stream anyway? And how do I use it? How do streams relate to application state and when do I use the one or the other?
ksqlDB and Kafka Streams are both, at their core, designed to help build stream processing applications, and we will explain how stream processing principles are reflected in the design of each system and what trade-offs were chosen (and - more importantly! - why). Finally, we take a look into the future at how the stream processing space, and in particular ksqlDB and Kafka Streams, may evolve over the next few years, as we outline extensions and improvements to the underlying conceptual model. So, bring your thinking hats and notepads and prepare to learn WHY these systems are the way they are!
The document discusses software metrics and regression testing. It defines software metrics as quantitative methods for assessing software quality. Metrics can be measured through techniques applied to the software development lifecycle and products. This allows for providing meaningful management information and improving processes. Types of metrics discussed include size, control flow, and data metrics. Regression testing involves re-running existing test suites on modified programs to identify new issues. The document also provides examples of measuring complexity for a sample program using metrics like lines of code, cyclomatic complexity, and Halstead metrics to analyze effort.
Spark Summit EU talk by Herman van HovellSpark Summit
This document provides a summary of Apache Spark's Catalyst optimizer:
- Catalyst optimizes user programs expressed using SQL, DataFrames, or Datasets by automatically finding the most efficient execution plan. It represents programs as trees and applies transformations to these trees.
- The key steps in Catalyst are analysis, logical optimization, physical planning, and code generation. Analysis resolves the logical plan using the catalog. Logical optimization applies rule-based transformations. Physical planning selects physical operators and ensures requirements are met.
- Transformations are implemented using partial functions that match and replace patterns in trees. Multiple rules are combined using a rule executor. Strategies transform the logical plan to physical operators. The planner also
On Relevant Query Answering over Streaming and Distributed DataShima Zahmatkesh
This document discusses optimizing query evaluation over streaming and distributed data to continuously obtain relevant results while maintaining system reactiveness. It proposes approaches for queries with filter clauses and top-k queries. For queries with filters, maintenance policies like Filter Update Policy and combined policies improve performance. For top-k queries, the Super-MTK+N list and Top-k+N algorithm handle changes to distributed data. The AcquaTop framework applies different maintenance policies. Experimental results show the approaches achieve more relevant and accurate results than the state-of-the-art.
On the need to include functional testing in RDF stream engine benchmarks Emanuele Della Valle
The document discusses the need to include functional testing in benchmarks for RDF stream engines to verify correctness of results. It presents an example query and shows that different engines can produce different correct results due to variations in operational semantics. This highlights the importance of modeling engine semantics and developing an "oracle" for benchmarks to check results against expected output. The conclusions advocate extending existing benchmarks with correctness testing while also measuring performance metrics like throughput.
Similar to On Unified Stream Reasoning - The RDF Stream Processing realm (20)
This document discusses methods for distributed stream consistency checking against a conceptual model. It presents the problem of ensuring streaming data complies with an ontology model while dealing with noise and large volumes. Two methods - NTM and LN - are proposed and evaluated. The LN method models the negative inclusion axioms in the ontology as a pipeline of bolts, reducing the load on individual bolts compared to NTM and improving performance up to 300%. Future work is discussed around more expressive languages, inconsistency repair, and implementation on other stream processing engines.
The presentation I gave at Linköping University about web stream processing. I discuss two problems: (i) exchanging data streams on the web, and (ii) combining streams and contextual quasi-static data on the web
Triplewave: a step towards RDF Stream Processing on the WebDaniele Dell'Aglio
The slides of my talk at INSIGHT Centre for Data Analytics (in NUI Galway) where I presented TripleWave (http://streamreasoning.github.io/TripleWave/), an open-source framework to create and publish streams of RDF data.
The document provides an overview of RDF stream processing, including:
- Extending the RDF data model to represent RDF streams and associate application times to data items
- Modeling continuous query evaluation over RDF streams using the CQL/STREAM model of mapping streams to relations and using sliding windows
- How existing systems extend CQL with operators for mapping between RDF streams and relations and for evaluating continuous SPARQL queries over windows of streaming RDF data.
This document presents a survey of temporal extensions of description logics (DLs) conducted by Daniele Dell'Aglio, Fariz Darari and Davide Lanti. It begins with an overview and outline of the topics that will be covered, including a running example to model how to become a doctor. The paper then surveys existing solutions for extending DLs with temporal aspects, including state-change based DLs, temporal DLs with an internal approach, point-based temporal DLs and interval-based temporal DLs. It concludes with a discussion of current hot topics and future directions for research on temporal extensions of DLs.
Presentation on RDF Stream Processing models given at the SR4LD tutorial (ISWC 2013) -- updated version at: http://www.slideshare.net/dellaglio/rsp2014-01rspmodelsss
Maven is a build automation tool that favors convention over configuration. It utilizes a project object model (POM) file that defines project coordinates, dependencies, plugins, and repositories. Maven projects follow a standard directory structure and use lifecycles made up of phases to execute goals like compiling, testing, packaging, and deploying. It retrieves dependencies and plugins from repositories, caching artifacts locally for reuse.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
In this talk we will discuss DDoS protection tools and best practices, discuss network architectures, and see what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
The Department of Veteran Affairs (VA) invited Taylor Paschal, Knowledge & Information Management Consultant at Enterprise Knowledge, to speak at a Knowledge Management Lunch and Learn hosted on June 12, 2024. All Office of Administration staff were invited to attend and received professional development credit for participating in the voluntary event.
The objectives of the Lunch and Learn presentation were to:
- Review what KM ‘is’ and ‘isn’t’
- Understand the value of KM and the benefits of engaging
- Define and reflect on your “what’s in it for me?”
- Share actionable ways you can participate in Knowledge Capture & Transfer
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready, and whose client base keeps growing, making scaling and performance aspects life-and-death questions. The system has Redis, MongoDB, and stream processing based on ksqlDB. In this talk, we will first analyze scaling approaches and then select the proper ones for our system.
ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol, based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
What is an RPA CoE? Session 1 – CoE VisionDianaGray10
In the first session, we will review the organization's vision and how this has an impact on the CoE structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations for seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, some going as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
"Scaling RAG Applications to serve millions of users", Kevin GoedeckeFwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
What is an RPA CoE? Session 2 – CoE RolesDianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
On Unified Stream Reasoning - The RDF Stream Processing realm
1. Daniele Dell’Aglio
On Unified Stream Reasoning
The RDF Stream Processing realm
Daniele Dell’Aglio
WU Vienna, 18/02/2016
2. Daniele Dell’Aglio
Problem setting
Real-time integration of huge volumes of dynamic data from heterogeneous sources
– Traffic Prediction
– Social media analytics
– Personalised services
3. Daniele Dell’Aglio
Stream Reasoning
Stream Reasoning (SR): inference over streams of data
– Stream and Event Processing: real-time processing of highly dynamic data
• Aggregations, filters
• Complex event detection
– Reasoning
• Access and integration of heterogeneous data
• Make hidden information explicit
5. Daniele Dell’Aglio
The initial problem (1)
Where are Alice and Bob, when they are together?
Let’s consider a tumbling window W(ω=β=5)
Let’s execute the experiment 4 times
Execution | 1st answer | 2nd answer
1 | :hall [6] | :kitchen [11]
2 | :hall [5] | :kitchen [10]
3 | :hall [6] | :kitchen [11]
4 | - [7] | - [12]
[Timeline figure: the stream S with items S1–S4 at t = 1, 3, 6, 9, carrying the triples {:alice :isIn :hall}, {:bob :isIn :hall}, {:alice :isIn :kitchen}, {:bob :isIn :kitchen}; the window is annotated with its width and slide.]
Which is the correct answer?
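The alignment issue above can be reproduced in a few lines. The sketch below is not from the talk; the data shapes and helper names are illustrative assumptions. It runs a tumbling window of width 5 over the example stream and shows that shifting where the first window starts changes the answers, matching executions 1 and 4 of the table.

```python
# Minimal sketch (illustrative, not from the talk): a tumbling window of
# width 5 over a timestamped stream, showing that the answer depends on
# where the windows are aligned on the time axis.
stream = [
    (1, (":alice", ":isIn", ":hall")),
    (3, (":bob", ":isIn", ":hall")),
    (6, (":alice", ":isIn", ":kitchen")),
    (9, (":bob", ":isIn", ":kitchen")),
]

def together(window_items):
    """Where are Alice and Bob when both are in the same place?"""
    places = {}
    for _, (subj, _, place) in window_items:
        places.setdefault(place, set()).add(subj)
    return [p for p, who in places.items() if {":alice", ":bob"} <= who]

def tumbling_windows(stream, width, t0, t_end):
    """Yield (close_time, items) for tumbling windows [o, o+width) from t0."""
    o = t0
    while o < t_end:
        yield o + width, [(t, d) for (t, d) in stream if o <= t < o + width]
        o += width

# Same query, different window alignments, different answers:
# t0=1 -> [(6, [':hall']), (11, [':kitchen'])]; t0=2 -> [(7, []), (12, [])]
for t0 in (1, 2):
    print(t0, [(c, together(i)) for c, i in tumbling_windows(stream, 5, t0, 11)])
```

The query semantics is identical in both runs; only the (usually unspecified) window alignment differs, which is exactly why different engines can legitimately disagree.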
6. Daniele Dell’Aglio
The initial problem (2)
Which system behaves in the correct way?
System 1:
Execution | 1st answer | 2nd answer
1 | :hall [6] | :kitchen [11]
2 | :hall [5] | :kitchen [10]
3 | :hall [6] | :kitchen [11]
4 | - [7] | - [12]
System 2:
Execution | 1st answer | 2nd answer
1 | :hall [3] | :kitchen [9]
2 | No answers
3 | :hall [3] | :kitchen [9]
4 | No answers
[Timeline figure: the stream S with items S1–S4 at t = 1, 3, 6, 9, carrying the triples {:alice :isIn :hall}, {:bob :isIn :hall}, {:alice :isIn :kitchen}, {:bob :isIn :kitchen}.]
7. Daniele Dell’Aglio
Problem
How to unify current Stream Reasoning techniques?
Why do we need it?
• Comparison and contrast
• Interoperability
• Study RDF Stream Processing related problems
• Standard RSP query language
8. Daniele Dell’Aglio
[Architecture figure: Applications sit on top of RSEP-QL; RSEP-QL adds event pattern detection operators to RSP-QL, a model to express continuous queries, with BGP evaluation over streams and BGP evaluation over the background knowledge; the inputs are streams, an ontology and background data. The entailment regimes require an ontology and provide more answers w.r.t. both RSP-QL and RSEP-QL (not part of today's talk).]
Contribution – RSEP-QL
A comprehensive model that formally defines the semantics of RDF Stream Processing engines
9. Daniele Dell’Aglio
From SPARQL…
[Figure: a SPARQL query Q = (E, DS, QF) enters through a query interface; the evaluator applies the algebraic expression E over the data layer, i.e. the dataset DS of RDF graphs, and the result formatter QF produces the answer Ans(Q).]
10. Daniele Dell’Aglio
…to RSEP-QL
[Figure: the SPARQL query Q = (E, DS, QF) is extended step by step: the dataset DS becomes a streaming dataset SDS fed by RDF graphs and RDF streams, giving Q = (E, SDS, QF); the evaluator becomes a continuous evaluator driven by a set of evaluation time instants ET, giving Q = (E, SDS, ET, QF); finally the expression E becomes a streaming expression SE, giving Q = (SE, SDS, ET, QF).]
17. Daniele Dell’Aglio
From SPARQL dataset to RSEP-QL Streaming Dataset
[Figure: a SPARQL dataset contains fixed RDF graphs (G, H). In an RSP-QL dataset a time-varying graph is a function G: T → R, where T ⊆ ℕ is the set of time instants and R the set of RDF graphs; its value G(t1) at an instant t1 is an instantaneous graph. Streams S1, S2, … enter the dataset through window operators 𝕎(S).]
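The time-varying graph of the figure can be sketched as a plain function from time instants to RDF graphs. The representation below is an assumption (cumulative triple insertions, triples as tuples) made only to illustrate G: T → R and the instantaneous graph G(t):

```python
# Sketch (assumed representation, not from the talk): a time-varying graph G
# as a function from time instants T ⊆ ℕ to RDF graphs (sets of triples);
# G(t) is the instantaneous graph at instant t.
def time_varying_graph(updates):
    """updates: list of (time, triple) insertions; G(t) holds all triples
    inserted at or before t (a simplifying cumulative assumption)."""
    def G(t):
        return frozenset(triple for ts, triple in updates if ts <= t)
    return G

G = time_varying_graph([
    (1, (":alice", ":isIn", ":hall")),
    (6, (":alice", ":isIn", ":kitchen")),
])
print(G(3))  # instantaneous graph at t=3: only the first triple
```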
18. Daniele Dell’Aglio
Evaluation
The SPARQL evaluation function is defined as ⟦P⟧_DS(G)
The RSEP-QL evaluation function extends the SPARQL one by introducing the evaluation time instant: ⟦P⟧^t_SDS(A)
SPARQL operators are straightforwardly extended to the new evaluation function
Example: JOIN
⟦JOIN(P1, P2)⟧^t_SDS(A) = ⟦P1⟧^t_SDS(A) ⨝ ⟦P2⟧^t_SDS(A)
19. Daniele Dell’Aglio
Instantaneous evaluation
The main difference is in the BGP evaluation:
⟦BGP⟧^t_SDS(A) = ⟦BGP⟧_SDS(A,t)
SDS(A,t) is:
– SDS(G,t) = SDS(G(t)) if A is a time-varying graph G
– SDS(𝕎(S),t) = SDS(m(𝕎(S,t))) if A is from a sliding window 𝕎
– SDS(𝕃(S),t) = SDS(m(𝕃(S,t))) if A is from a landmark window 𝕃
where m denotes a merge function:
m(𝕎(S,t)) = ⋃_{(d_i, t_i) ∈ 𝕎(S,t)} d_i
– it takes as input a window content, i.e. a sequence of timestamped RDF graphs
– and produces an RDF graph
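A minimal sketch of the merge function m above, assuming a window content represented as a list of (RDF graph, timestamp) pairs with graphs as sets of triples:

```python
# Sketch of the merge function m from the slide: the window content is a
# sequence of timestamped RDF graphs (d_i, t_i); m unions them into a single
# RDF graph so that plain BGP evaluation can run over it. (Assumed shapes.)
def merge(window_content):
    """window_content: iterable of (graph, timestamp); returns the union graph."""
    merged = set()
    for graph, _t in window_content:
        merged |= graph
    return merged

w = [
    ({(":alice", ":isIn", ":hall")}, 1),
    ({(":bob", ":isIn", ":hall")}, 3),
]
print(merge(w))  # union of the two timestamped graphs
```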
20. Daniele Dell’Aglio
Continuous evaluation
For each evaluation time t ∈ ET: ⟦SE⟧^t_SDS(A)
– The continuous evaluation is a sequence of instantaneous evaluations
It is not always possible to compute ET a priori
– It can be data dependent
– ET is expressed through a Report Policy
A Report Policy is a set of conditions associated with one or more window operators in the SDS
– Initially defined in SECRET for Stream Processing engines
21. Daniele Dell’Aglio
Continuous evaluation – Report Policies
Report Policy examples:
– P Periodic: the window reports only at regular intervals
– WC Window Close: the window reports if the active window closes
– CC Content Change: the window reports if the content changes
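The three policies can be phrased as simple predicates. The helper names below are hypothetical; they only illustrate how a report policy induces the set ET of evaluation time instants:

```python
# Sketch (hypothetical helper names) of the three report policies above,
# each deciding whether a window operator reports at time t.
def periodic(t, interval):
    """P: report only at regular intervals."""
    return t % interval == 0

def window_close(t, window_open, width):
    """WC: report when the active window [window_open, window_open+width) closes."""
    return t == window_open + width

def content_change(current_content, previous_content):
    """CC: report when the window content changed since the last evaluation."""
    return current_content != previous_content

# ET, the set of evaluation time instants, is induced by the chosen policy:
ET_periodic = [t for t in range(1, 16) if periodic(t, 5)]
print(ET_periodic)  # [5, 10, 15]
```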
22. Daniele Dell’Aglio
Event Processing – Basic Event Pattern
Support for Complex Event Processing operators
The minimal element is the Basic Event Pattern: EVENT_w P
Intuitively, the Basic Graph Pattern P should match against one stream item of the window identified by w
BEPs can be combined through complex operators
• SEQ, LAST, EVERY
Example:
EVENT_w1 P1 SEQ EVERY EVENT_w2 P2
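The intuition behind BEPs and SEQ can be sketched in code. This is a deliberate simplification (a single window, triples as tuples, variables written as "?x" strings), not RSEP-QL's full semantics: a BEP matches a triple pattern against one stream item at a time, and SEQ pairs a left match that ends strictly before a right match begins, producing event mappings (μ, t1, t2):

```python
# Simplified sketch (assumed semantics, not the formal RSEP-QL definition).
def match_item(pattern, triple):
    """Match a triple pattern (variables start with '?') against one triple."""
    mu = {}
    for p, v in zip(pattern, triple):
        if p.startswith("?"):
            if mu.get(p, v) != v:
                return None
            mu[p] = v
        elif p != v:
            return None
    return mu

def bep(pattern, window_items):
    """EVENT_w P: the pattern matches against one stream item of the window."""
    for t, triple in window_items:
        mu = match_item(pattern, triple)
        if mu is not None:
            yield mu, t, t

def seq(left, right):
    """E1 SEQ E2: a left match strictly before a right match, with compatible
    mappings; yields merged event mappings (mu, t1, t2)."""
    out = []
    for mu1, a1, b1 in left:
        for mu2, a2, b2 in right:
            if b1 < a2 and all(mu1.get(k, v) == v for k, v in mu2.items()):
                out.append(({**mu1, **mu2}, a1, b2))
    return out

window = [(11, (":alice", ":isIn", ":hall")),
          (13, (":alice", ":isIn", ":kitchen"))]
matches = seq(list(bep((":alice", ":isIn", "?x"), window)),
              list(bep((":alice", ":isIn", "?y"), window)))
print(matches)  # one match: ?x=:hall before ?y=:kitchen, interval [11, 13]
```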
23. Daniele Dell’Aglio
Event Processing – Evaluation semantics
Formally, we use a new evaluation function ⦅·⦆^t_(o,c)
• t is the evaluation time instant
• (o, c) is an additional window identifying the portion of the data in which the event may happen
Event pattern evaluation produces event mappings (μ, t1, t2)
• μ is a solution mapping
• t1 and t2 denote the time interval justifying μ
24. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT_w1 P1 SEQ EVERY EVENT_w2 P2 is
[Timeline figure: stream items S1–S12 over t = 10…16, with EVENT_w1 P1 and EVERY EVENT_w2 P2 evaluated over their respective windows; no match yet.]
26. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT_w1 P1 SEQ EVERY EVENT_w2 P2 is
[Timeline figure: the same stream; a first result pairs S1 with S10, justified by the interval [11, 13].]
27. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT_w1 P1 SEQ EVERY EVENT_w2 P2 is
[Timeline figure: the same stream; a second result pairs S1 with S12, justified by the interval [11, 15].]
28. Daniele Dell’Aglio
Event Processing – MATCH graph pattern
Event patterns are enclosed in MATCH graph patterns
• Event mappings exist only in the context of event patterns
• The evaluation of a MATCH graph pattern produces a bag of solution mappings:
⟦MATCH E⟧^t_SDS(A) = {μ | (μ, t1, t2) ∈ ⦅E⦆^t_(0,t)}
It is possible to combine the MATCH graph pattern with other SPARQL graph patterns
32. Daniele Dell’Aglio
What’s next?
An RSEP-QL query language
• W3C RSP CG ongoing activities
Implementations
• Yet another RSP engine
• A framework to let existing RSP engines interoperate
Streams are getting popular – applications want more and more sophisticated features
• Different timestamps, out-of-order arrivals
• Inductive reasoning to cope with noise
• Permanent storage of portions of data (raw or inferred)
33. Daniele Dell’Aglio
Conclusions
The dynamics introduced by the continuous query evaluation process are not yet fully understood
• Not fully captured by existing models
• RSEP-QL captures those dynamics
• All of them? Let's find out!
We need to push implementations and applications on use cases
• To understand which helpful operators are missing
• To find new, unexpected behaviours
34. Daniele Dell’Aglio
People I am grateful to...
Emanuele Della Valle
and:
Marco Balduini
Jean-Paul Calbimonte
Oscar Corcho
Minh Dao-Tran
Danh Le Phuoc
Freddy Lecue
35. Daniele Dell’Aglio
... without forgetting you!
Thank you! Questions?
On Unified Stream Reasoning
The RDF Stream Processing realm
Daniele Dell’Aglio
daniele.dellaglio@polimi.it
http://dellaglio.org