Heaven: A Framework for Systematic Comparative Research Approach for RSP Engines

DEIB - Politecnico di Milano
Riccardo Tommasini, Emanuele Della Valle,
Marco Balduini and Daniele Dell’Aglio

ESWC - 2016 - Riccardo Tommasini - @rictomm - DEIB Polimi
Agenda
• Introduction
• Motivation
• Heaven [Contribution]
• Requirements Analysis
• Test Stand Architecture
• Baselines
• Conclusion and Future Works

It’s a Streaming World

Challenges
Challenges              IFP   SW
vast                    ✓     ✓
heterogeneous           ✖     ✓
complex domain models   ✖     ✓
rapidly changing        ✓     ✖
reactivity              ✓     ✖
expressiveness          ✖     ✓

Stream Reasoning
Logical real time reasoning on multiple,
heterogeneous, gigantic and inevitably noisy data
streams.
-- E. Della Valle, S. Ceri, F. van Harmelen and H.
Stuckenschmidt, 2010

Stream Reasoning Vision
Challenges              IFP   SW    SR
vast                    ✓     ✓     ✓
heterogeneous           ✖     ✓     ✓
complex domain models   ✖     ✓     ✓
rapidly changing        ✓     ✖     ✓
reactivity              ✓     ✖     ✓
expressiveness          ✖     ✓     ✓

RDF Stream Processing (RSP)
Challenges              IFP   SW    RSP
vast                    ✓     ✓     ✓
heterogeneous           ✖     ✓     ✓
complex domain models   ✖     ✓     ≈
rapidly changing        ✓     ✖     ✓
reactivity              ✓     ✖     ✓
expressiveness          ✖     ✓     ≈

RSP state-of-the-art
C-SPARQL Engine
SparkWave
MorphStream
IMaRS
CQELS
DynamiTE
EP-SPARQL
A growing number of solutions

RSP Benchmarking State-of-the-art
RSP Engine: C-SPARQL Engine, CQELS, SPARQLstream, SparkWAVE, DynamiTE
C-SPARQL Engine ≡ ✔ ✔ ✔
CQELS ✔ ≡
SPARQLstream ✔ ≡
SparkWAVE ✔ ≡
DynamiTE ≡
No systematic comparison

State of the art RSP Benchmarking
Benchmark   Data Streams & Ontologies   Queries   Metrics
SRBench     ✔                           ✔         Feasibility
LSBench     ✔                           ✔         Feasibility, Throughput
CSRBench    ✔                           ✔         Feasibility, Throughput, Correctness
CityBench   ✔                           ✔         Feasibility, Throughput, Memory
No absolute winner

Domain Specific Benchmark
The goal of a domain specific benchmark is to
foster technological progress by guaranteeing a
fair assessment. 
- Jim Gray, The Benchmark Handbook
for Database and Transaction Systems, 1993

A Well-Known Hypothesis
The incremental maintenance of the
materialisation is faster than full re-materialisation
of ontological entailment when content changes
are small enough (e.g. no more than 10%).

Uncomfortable Truths in RSP Benchmarking
[Figure: memory (MB) and latency (ms) of the Naive and Incremental baselines, by window cardinality (# triples)]

Uncomfortable Truths in RSP Benchmarking
[Figure: memory (MB) and latency (ms) of the Naive and Incremental baselines, by ABox cardinality (# triples)]

Analysis
A. Qualitatively, is there a solution that always
outperforms the others? 
B. If no dominant solution can be found, when
does a solution work better than another one? 
C. Quantitatively, is there a solution that
distinguishes itself from the others?  
D. Why does a solution perform better than another
solution under a certain experimental
condition?
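
Questions A and B can be made concrete with a minimal sketch. It assumes each solution is summarised by one measurement per experimental condition and that lower is better (as for latency). The function names and the numbers are illustrative assumptions, not results from the experiments.

```python
from typing import Dict, List

# Assumption: one measurement per experimental condition, lower is better.
Measurements = Dict[str, float]


def dominates(a: Measurements, b: Measurements) -> bool:
    """True if `a` is never worse than `b` and strictly better at least once."""
    conditions = a.keys() & b.keys()
    return (all(a[c] <= b[c] for c in conditions)
            and any(a[c] < b[c] for c in conditions))


def wins(a: Measurements, b: Measurements) -> List[str]:
    """Conditions under which `a` is strictly better than `b` (question B)."""
    return sorted(c for c in a.keys() & b.keys() if a[c] < b[c])


# Made-up latencies (ms) for two hypothetical solutions.
naive = {"small window": 2.0, "large window": 9.0}
incremental = {"small window": 3.0, "large window": 5.0}
```

With these made-up numbers neither solution dominates the other, so the analysis moves on to question B and `wins` lists the conditions where each one is ahead.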

Comparative Research
• It is natively case driven:
  • It considers cases as a combination of known properties
  • It defines analysis guidelines through baselines
• It is extensively used to analyse complex systems
• It provides layered frameworks to
  • systematically examine cases
  • identify similarities/differences, enabling us to gain more
    insights.

Research Question
Can we enable a systematic comparative
research approach (SCRA) for RSP
engines?

Heaven
• A set of requirements to satisfy. 
• An architecture for an RSP engine Test Stand. 
• Two baseline RSP engine architectures 
• A proof-of-concept implementation (open
source)

Requirement Analysis
An Experimental Environment guarantees
Comparability
Repeatability
Reproducibility
From their definitions we elicit the
requirements our framework has to satisfy.

Comparability related requirements
[R1] RSP engine agnostic, i.e. independent from
the tested RSP engine.
[R2] Independent from the measured key
performance indicators (KPIs), i.e., the KPIs set
has to be extensible.
[R3] Identify baseline RSP engines, i.e., the
minimal meaningful approaches to realise an RSP
engine.

Reproducibility related requirements
[R4] Data independent, i.e. allowing the usage
of any data stream and any static data.
[R5] Query independent, i.e. allowing the usage
of any query from users’ domains of interest.

Repeatability related requirements
[R6] Minimise the experimental error, i.e., it has
to affect the RSP engine evaluation as little as
possible and in a predictable way.

RSP Experiment Design
E is the RSP engine used as subject in the experiment;
T is an ontology and any data not subject to change
during the experiment;
D is the description of the input data streams;
Q is the set of continuous queries registered into the
engine;
K is the set of key performance indicators (KPIs) to
collect.
The result of the execution of an experiment is a
Report R that captures the engine dynamics.
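
The experiment tuple can be sketched as a small data structure. This is only an illustration: the class names, field names and the `record` method are assumptions made for this sketch, not Heaven's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Experiment:
    """Hypothetical container for the experiment tuple (E, T, D, Q, K)."""
    engine: str                 # E: the RSP engine under test
    static_data: str            # T: ontology and data that do not change
    streams: str                # D: description of the input data streams
    queries: Tuple[str, ...]    # Q: continuous queries registered into the engine
    kpis: Tuple[str, ...]       # K: key performance indicators to collect


@dataclass
class Report:
    """Hypothetical report R: the samples collected for each KPI."""
    experiment: Experiment
    samples: Dict[str, List[float]] = field(default_factory=dict)

    def record(self, kpi: str, value: float) -> None:
        # Only KPIs declared in the experiment may be recorded.
        if kpi not in self.experiment.kpis:
            raise KeyError(f"{kpi!r} is not among the experiment's KPIs")
        self.samples.setdefault(kpi, []).append(value)
```

Keeping the experiment immutable and separate from its report means the same experiment description can be executed several times, each run producing its own report.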

Test Stands (from aerospace engineering)
• Experimental environment
• Systematic evaluation of complex systems
• Black-box evaluation of complex systems

Heaven Test Stand Architecture
[Figure: the test stand runs an experiment ⟨E, D, T, Q, K⟩. Components: Streamer, RSP Engine (with input and output interfaces), Receiver, ResultCollector.]
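
A minimal sketch of the black-box idea behind the test stand follows. It assumes the engine can be modelled as a plain callable from one input event to its results, and it folds the Streamer and Receiver roles into a single loop. All names and signatures are hypothetical and do not reproduce Heaven's implementation.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
# Assumption: the engine is a black box mapping one input event to its results.
Engine = Callable[[List[Triple]], List[Triple]]


class ResultCollector:
    """Stores one row of KPI values per processed event."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, float]] = []

    def collect(self, event_id: int, latency_ms: float, n_results: int) -> None:
        self.rows.append(
            {"event": event_id, "latency_ms": latency_ms, "results": n_results}
        )


def run_test_stand(stream: Iterable[List[Triple]],
                   engine: Engine,
                   collector: ResultCollector) -> None:
    for event_id, event in enumerate(stream):   # Streamer role: feed one event at a time
        start = time.perf_counter()
        results = engine(event)                 # engine is reached only through its interface
        latency_ms = (time.perf_counter() - start) * 1000.0
        # Receiver role: hand the observed output over to the collector
        collector.collect(event_id, latency_ms, len(results))
```

Because the loop never looks inside `engine`, any engine that fits the assumed interface can be plugged in, which is the property requirement R1 asks for.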


RSP Baselines
Simplified RSP engine cases that combine known
properties, i.e. minimal meaningful approaches to
realise an RSP engine.
Pipeline of a Data Stream Management System
(DSMS) and a Reasoner.

DSMS Background: Window
Tumbling Window: W_t ∩ W_{t+1} = ∅
Sliding Window: |W_t ∩ W_{t+1}| = δ > 0
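
The difference between the two window types can be sketched with count-based windows. This is an assumption made for brevity (a DSMS may also use time-based windows), and the `windows` helper is hypothetical.

```python
from typing import List, Sequence, TypeVar

Item = TypeVar("Item")


def windows(stream: Sequence[Item], width: int, slide: int) -> List[List[Item]]:
    """Count-based windows of `width` items, opened every `slide` items."""
    if width <= 0 or slide <= 0:
        raise ValueError("width and slide must be positive")
    return [list(stream[i:i + width])
            for i in range(0, len(stream) - width + 1, slide)]


stream = list(range(6))
tumbling = windows(stream, width=2, slide=2)  # slide == width: no overlap
sliding = windows(stream, width=3, slide=1)   # slide < width: overlap of width - slide items
```

Here consecutive tumbling windows share no items, while consecutive sliding windows share `width - slide` items.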

RSP Baselines
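[Figure: the two baseline pipelines. Naive: RDF Stream → DSMS → Reasoner → RDF Stream, reasoning over the whole active window. Incremental: RDF Stream → DSMS → Incremental Reasoner → RDF Stream, driven by the additions (Δ+) and deletions (Δ-). Legend: Input Triple, Inferred Triple.]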
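
The contrast between the two baselines can be sketched as follows. The sketch makes strong simplifying assumptions that are not taken from the slides: a single subclass rule stands in for the entailment regime, the class hierarchy is static and already transitively closed, and incremental maintenance is done with per-triple support counts. Counting is sufficient here only because every inferred triple depends on exactly one input triple; it is not presented as the technique used by the actual baselines.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
TYPE = "rdf:type"
# Assumption: class -> all of its superclasses (static, transitively closed).
Hierarchy = Dict[str, Set[str]]


def entail(triple: Triple, superclasses: Hierarchy) -> Set[Triple]:
    """The triple itself plus the types implied by the class hierarchy."""
    s, p, o = triple
    out = {triple}
    if p == TYPE:
        out |= {(s, TYPE, sup) for sup in superclasses.get(o, set())}
    return out


def naive(window: Iterable[Triple], superclasses: Hierarchy) -> Set[Triple]:
    """Full re-materialisation of the active window."""
    result: Set[Triple] = set()
    for t in window:
        result |= entail(t, superclasses)
    return result


class Incremental:
    """Maintains the materialisation from the deltas Δ+ and Δ-."""

    def __init__(self, superclasses: Hierarchy) -> None:
        self.superclasses = superclasses
        self.support: Counter = Counter()  # triple -> number of derivations

    def update(self, added: Iterable[Triple], removed: Iterable[Triple]) -> Set[Triple]:
        for t in added:                      # Δ+
            self.support.update(entail(t, self.superclasses))
        for t in removed:                    # Δ-
            self.support.subtract(entail(t, self.superclasses))
        self.support = +self.support         # drop triples with no support left
        return set(self.support)
```

Both approaches should return the same materialisation for the same window; what differs is how much work each does when the window content changes, which is what the hypothesis on incremental maintenance is about.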

RSP Baselines
• ρDF entailment regime
• They exploit absolute time, i.e. their internal
clock can be externally controlled.
  • Ensures result correctness even when
    overloaded
  • Allows the latency of query responses
    (responsiveness) to be calculated

Example of Dynamics Comparison
Incremental Baseline

Conclusion
Top-down hypothesis verification, even when an
RSP engine is extremely simple (i.e. the baselines), is
not straightforward.
There is a growing need for comparative analysis.
Heaven enables the systematic execution of
experiments, paving the way for comparative
investigations.

Future Works
Systematic analysis of existing solutions
A web-based environment where users can:
• choose one of the existing benchmarks (datasets,
queries)
• design experiments
• run them and consult the results online
• compare the results against the baselines or
existing integrated RSP engines.

Thank You!

Contacts
@rictomm
RiccardoTommasini+
tomma156
riccardo.tommasini@polimi.it
riccardotommasini
