SlideShare a Scribd company logo
1 of 21
Download to read offline
Query Rewriting in RDF
Stream Processing
Jean-Paul Calbimonte – Jose Mora – Oscar Corcho
LSIR EPFL
OEG UPM
ESWC
Heraklion, Greece. June 2016
@jpcik
2
Query Rewriting in
RDF Stream Processing
RDF Streams
3
s p o
Triple
s p o t
Time-stamped triple
s p o
s p o
s p o
t
Time-stamped graph
Gi-1 Gi Gi+1 Gi+2 Gi+3
RDF stream
RSPS
W(S,t)
Windowed Queries in RSP
4
:obs1 rdf:type ssn:Observation .
:obs2 rdf:type ssn:Observation .
:obs3 rdf:type ssn:Observation .
qG(x) ← ssn:Observation(x)
:obs1,:obs2 ,:obs3
qW(S,t)(x) ← ssn:Observation(x)
:obs1,:obs2 [5]
:obs3,:obs4 [10]
:obs5,:obs6 [15]
g1[3]
[4]
[7]
[10]
[11]
[13]
:obs1 rdf:type ssn:Observation .
g2 :obs2 rdf:type ssn:Observation .
g3 :obs3 rdf:type ssn:Observation .
g4 :obs4 rdf:type ssn:Observation .
g5 :obs5 rdf:type ssn:Observation .
g6 :obs6 rdf:type ssn:Observation .
[1]
S
G
graph
W
W size=5
qW(S,t)(x) ← ssn:observedBy(x,y)
No answers for this query:
no matches in the stream
RSP Query Languages
5
SELECT (MAX(?temp) AS ?maxtemp) ?sensor
WHERE {
STREAM :stream [RANGE 1h] {
?obs ssn:observationResult ?result;
ssn:observedProperty cf-property:air_temperature;
ssn:observedBy ?sensor.
?result ssn:hasValue ?obsValue.
?obsValue qu:numericalValue ?temp.
}
} GROUP BY ?sensor
SELECT ?obs WHERE {
STREAM :stream [RANGE 5s]
{?obs rdf:type ssn:Observation. }
}
CQELS
C-SPARQL
SPARQLStream
EP-SPARQL
RSP languages
Example:
Get the maximum air
temperature
observed in the last
hour, grouped by
sensor.
6
Query Rewriting in
RDF Stream Processing
Query Rewriting
7
q’’(x) ← HumiditySensor(x)
Query
Rewriting
q(x) ← Sensor(x)
q’(x) ← Sensor(x)
Sensor
HumiditySensor
TBox
Original Query
Rewritten UCQ
ABox
Ontology expressiveness: e.g. RDFS, ELHIO, etc.
OBDA systems, query rewriters: Prexto, Requiem, Kyrie, etc.
Explosion of rewritings, complexity of queries
Semantic Sensor Network Ontology
8
ssn:Sensor
ssn:Platform
ssn:FeatureOfInterest
ssn:Deployment
ssn:Property
cf-prop:air_temperature
ssn:observes
ssn:onPlatform
dul:Place
dul:hasLocation
ssn:SensingDevicessn:inDeployment
ssn:MeasurementCapability
ssn:MeasurementProperty
geo:lat, geo:lng
xsd:double
ssn:hasMeasurementProperty
ssn:Accuracy
ssn:ofFeature
aws:TemperatureSensor
aws:Thermistor
ssn:Latency
dim:Temperature
qu:QuantityKind
cf-prop:soil_temperature
cf-feat:Wind
cf-feat:Surface
cf-
feat:Medium
cf-feat:air
cf-feat:soil
dim:VelocityOrSpeed
cf-prop:wind_speed
cf-prop:rainfall_rate
aws:CapacitiveBead …
…
…
Rewriting in RSP
9
:obs1,:obs2 [5]
:obs3,:obs4 [10]
:obs5,:obs6 [15]
g1[3]
[4]
[7]
[10]
[11]
[13]
:obs1 rdf:type ssn:Observation .
g2 :obs2 rdf:type ssn:Observation .
g3 :obs3 rdf:type ssn:Observation .
g4 :obs4 rdf:type ssn:Observation .
g5 :obs5 rdf:type ssn:Observation .
g6 :obs6 rdf:type ssn:Observation .
[1]
q’’W(S,t)(x) ← ssn:Observation(x)
Query
Rewriting
qW(S,t)(x) ← ssn:observedBy(x,y)
q’W(S,t)(x) ← ssn:observedBy(x,y)
Original Query
Rewritten UCQ
ssn:Observation ⊑ ∃ssn:observedBy
S
W(S,t)
Rewriting in RSP: Semantics
10
𝑞: 𝑞ℎ 𝒙 ← 𝑝1 ∧ ⋯ ∧ 𝑝 𝑛(𝒙 𝑛)
𝑐𝑒𝑟𝑡 𝑞, 𝒪 = 𝜶 | 𝑞 ∪ 𝒯 ∪ 𝒜 ⊨ 𝑞ℎ(𝜶)
𝑐𝑒𝑟𝑡 𝑞, 𝒪 = 𝜶 | 𝑞′ ∪ 𝒜 ⊨ 𝑞ℎ(𝜶)
Stream: 𝒜 0 , 𝒜 1 , 𝒜 2 , 𝒜 3 … 𝒜 𝑡𝑖 , …
𝒜 𝑤
𝑐𝑒𝑟𝑡 𝑞 𝑤, 𝒯, 𝒜 𝑤 = 𝜶 | 𝑞 𝑤 ∪ 𝒯 ∪ 𝒜 𝑤 ⊨ 𝑞ℎ(𝜶)
𝑐𝑒𝑟𝑡 𝑞 𝑤, 𝒯, 𝒜 𝑤 = 𝜶 | 𝑞′ 𝑤 ∪ 𝒜 𝑤 ⊨ 𝑞ℎ(𝜶)
Query
Certain
answers
Ontology
rewritten
query
get rid of TBox
Streaming Aboxes
Window
Evaluated only for a window
Now for RDF streams:
Implementation: StreamQR
11
kyrie
rewriter
CQELS
Ontology
TBOX
StreamQR
query
registration
continuous
answers
CQELS
query
CQELS
UCQ
RDF Stream
Github StreamQR: https://github.com/jpcik/streapler/tree/streamQR
StreamQR: Rewriting Steps
12
preprocess
separate BGP
& context
Ontology
TBOX
Datalog
query
CQELS
query
Rewritten
CQELS
UCQ
remove
functional
terms
remove
auxiliary
predicates
expand
syntactic
transformation
UCQ
CQ
(BGP)
StreamQR in Action
13
met:TemperatureObservation ⊑ ∃ssn:observedBy.aws:TemperatureSensor
met:AirTemperatureObservation ⊑ met:TemperatureObservation
met:ThermistorObservation ⊑ met:TemperatureObservation
aws:Thermistor ⊑ aws:TemperatureSensor
aws:CapacitiveBead ⊑ aws:TemperatureSensor
TBox
CONSTRUCT { ?o a :ObservedTemperature. }
WHERE {
STREAM :stream1 [RANGE 10ms] {
?o ssn:observedBy ?t .
?t a aws:TemperatureSensor. }
}
Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1)
kyrie
rewriter
CQELS
Query
CQ extraction & syntactic transformation
StreamQR in Action
14
Q(?0) <- met:TemperatureObservation(?0)
Q(?0) <- met:AirTemperatureObservation(?0)
Q(?0) <- met:ThermistorObservation(?0)
Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:Thermistor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:CapacitiveBead(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1)
met:TemperatureObservation ⊑ ∃ssn:observedBy.aws:TemperatureSensor
met:AirTemperatureObservation ⊑ met:TemperatureObservation
met:ThermistorObservation ⊑ met:TemperatureObservation
aws:Thermistor ⊑ aws:TemperatureSensor
aws:CapacitiveBead ⊑ aws:TemperatureSensor
TBox
Query expansion
Rewritten UCQ
StreamQR in Action
15
Q(?0) <- met:TemperatureObservation(?0)
Q(?0) <- met:AirTemperatureObservation(?0)
Q(?0) <- met:ThermistorObservation(?0)
Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:Thermistor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:CapacitiveBead(?1), ssn:observedBy(?0,?1)
PREFIX ssn: <http://purl.oclc.org/NET/ssnx/ssn#>
PREFIX aws: <http://purl.oclc.org/NET/ssnx/meteo/aws#>
PREFIX met: <http://purl.org/env/meteo#>
CONSTRUCT { ?o a :ObservedTemperature. }
WHERE { STREAM :stream1 [RANGE 10ms] {
{ ?o a met:TemperatureObservation } UNION
{ ?o a met:AirTemperatureObservation } UNION
{ ?o a met:ThermistorObservation } UNION
{ ?s a aws:TemperatureSensor . ?o ssn:observedBy ?s } UNION
{ ?s a aws:Thermistor . ?o ssn:observedBy ?s } UNION
{ ?s a aws:CapacitiveBead . ?o ssn:observedBy ?s }
}
CQELS rewritten
query: syntactic
transformation
Rewritten UCQ
Comparison: Rewriting vs No-rewriting
16
0
10000
20000
30000
40000
50000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
0
5000
10000
15000
20000
25000
30000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
0
5000
10000
15000
20000
25000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
• Throughput under different input data loads
• Modified version of the SRBench benchmark queries
• Ontology based on AWS, which extends SSN-O
• Compared the throughput of StreamQR with CQELS (no rewiriting)
• Tested under different input rates, ranging from 10 to 100K triples/s
• Under three different load conditions: such that only:
• 10%, 50% and 90% of the input data matches the continuous query
10% of matches 90% of matches50% of matches
Reaches optimal t-put
up to this point
Varying input distributions
17
0
5000
10000
15000
20000
25000
30000
35000
40000
10 100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
10% match
20% match
50% match
80% match
• The more matches, the more time the engine spends on evaluation
• Up to 10K triples/s., StreamQR is capable to keep up with the input
• Beyond that, the throughput degrades until it reaches a limit
Changning que input
stream triple distribution
• Experiments under different input loads
• Varying distribution of the types of triples
• 10, 20, 50, 80 and 90% of the triples match the query.
Throughput for different queries
18
100
1000
10000
100000
100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
q4
q6
q1
q5
q2
q3
q7
q10
Different queries prducing
from 2 to 180 rewritings
• Throughput decreases for queries that produce more rewrittings
• Optimizations: existing techniques used in query rewriting and OBDA
• Pruning queries that may not match any input.
• For around 1K triples/s, it still reaches maximum throughput.
• Different queries -> UCQ with a different number of sub-queries.
• 9 distinct queries that produce from 2 to over 180 sub-queries
Comparing with TrOwl
19
100
1000
10000
100000
100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
StreamQR
Trowl-no-reclassify
Trowl-only-add
Trowl-add-remove
• Removal operation known to be expensive in incremental materialization
• StreamQR sustains better throughput under fast input rates
• Under lower input rates both are able to reach maximum throughput.
• TrOWL Provides incremental reasoning for ABoxes
• Three settings:
• TrOWL without performing any reasoning (noreclassify)
• Activating the reasoning, but allowing only additions
• Including removals
Conclusions
20
• Query answering over ontologies for RDF stream processors
• Novel approach that combines query rewriting techniques and an
RSP engines
• Implemented StreamQR: incorporates the kyrie into an existing RSP
• Still efficient in terms of throughput, for a large range of scenarios
• Plan to study other criteria such as correctness
• Exploring different expressiveness
• Find a good balance between efficiency and complexity.
• Research in approaches that combine rewriting and incremental
materialization
Thanks a lot!
¿Tienes preguntas?
Jean-Paul Calbimonte
LSIR EPFL
@jpcik

More Related Content

What's hot

Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016Daniele Dell'Aglio
 
Querying the Web of Data with XSPARQL 1.1
Querying the Web of Data with XSPARQL 1.1Querying the Web of Data with XSPARQL 1.1
Querying the Web of Data with XSPARQL 1.1Daniele Dell'Aglio
 
Introduction to SparkR
Introduction to SparkRIntroduction to SparkR
Introduction to SparkRKien Dang
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC
 
Toward Semantic Sensor Data Archives on the Web
Toward Semantic Sensor Data Archives on the WebToward Semantic Sensor Data Archives on the Web
Toward Semantic Sensor Data Archives on the WebJean-Paul Calbimonte
 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudAnsgar Scherp
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF SourcesMore Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF SourcesAndré Valdestilhas
 
On correctness in RDF stream processor benchmarking
On correctness in RDF stream processor benchmarkingOn correctness in RDF stream processor benchmarking
On correctness in RDF stream processor benchmarkingDaniele Dell'Aglio
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingKostas Tzoumas
 
RDFox Poster
RDFox PosterRDFox Poster
RDFox PosterDBOnto
 
First Flink Bay Area meetup
First Flink Bay Area meetupFirst Flink Bay Area meetup
First Flink Bay Area meetupKostas Tzoumas
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeNational Institute of Informatics
 
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...Data Structures and Performance for Scientific Computing with Hadoop and Dumb...
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...Austin Benson
 
Toying with spark
Toying with sparkToying with spark
Toying with sparkRaymond Tay
 
Go and Uber’s time series database m3
Go and Uber’s time series database m3Go and Uber’s time series database m3
Go and Uber’s time series database m3Rob Skillington
 
SRAdb Bioconductor Package Overview
SRAdb Bioconductor Package OverviewSRAdb Bioconductor Package Overview
SRAdb Bioconductor Package OverviewSean Davis
 
Machine Learning and GraphX
Machine Learning and GraphXMachine Learning and GraphX
Machine Learning and GraphXAndy Petrella
 
Rdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationRdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationINRIA-OAK
 

What's hot (20)

Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016Summary of the Stream Reasoning workshop at ISWC 2016
Summary of the Stream Reasoning workshop at ISWC 2016
 
Querying the Web of Data with XSPARQL 1.1
Querying the Web of Data with XSPARQL 1.1Querying the Web of Data with XSPARQL 1.1
Querying the Web of Data with XSPARQL 1.1
 
Introduction to SparkR
Introduction to SparkRIntroduction to SparkR
Introduction to SparkR
 
LarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data ModelLarKC Tutorial at ISWC 2009 - Data Model
LarKC Tutorial at ISWC 2009 - Data Model
 
Toward Semantic Sensor Data Archives on the Web
Toward Semantic Sensor Data Archives on the WebToward Semantic Sensor Data Archives on the Web
Toward Semantic Sensor Data Archives on the Web
 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF SourcesMore Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
 
On correctness in RDF stream processor benchmarking
On correctness in RDF stream processor benchmarkingOn correctness in RDF stream processor benchmarking
On correctness in RDF stream processor benchmarking
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 
RDFox Poster
RDFox PosterRDFox Poster
RDFox Poster
 
First Flink Bay Area meetup
First Flink Bay Area meetupFirst Flink Bay Area meetup
First Flink Bay Area meetup
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
 
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...Data Structures and Performance for Scientific Computing with Hadoop and Dumb...
Data Structures and Performance for Scientific Computing with Hadoop and Dumb...
 
Toying with spark
Toying with sparkToying with spark
Toying with spark
 
Go and Uber’s time series database m3
Go and Uber’s time series database m3Go and Uber’s time series database m3
Go and Uber’s time series database m3
 
SRAdb Bioconductor Package Overview
SRAdb Bioconductor Package OverviewSRAdb Bioconductor Package Overview
SRAdb Bioconductor Package Overview
 
Machine Learning and GraphX
Machine Learning and GraphXMachine Learning and GraphX
Machine Learning and GraphX
 
Rdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationRdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimation
 
Querying Incomplete Geospatial Information in RDF
Querying Incomplete Geospatial Information in RDFQuerying Incomplete Geospatial Information in RDF
Querying Incomplete Geospatial Information in RDF
 

Similar to Query Rewriting in RDF Stream Processing

On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Thomas Graf
 
Adventures in RDS Load Testing
Adventures in RDS Load TestingAdventures in RDS Load Testing
Adventures in RDS Load TestingMike Harnish
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusInfluxData
 
Paper Study - Demand-Driven Computation of Interprocedural Data Flow
Paper Study - Demand-Driven Computation of Interprocedural Data FlowPaper Study - Demand-Driven Computation of Interprocedural Data Flow
Paper Study - Demand-Driven Computation of Interprocedural Data FlowMin-Yih Hsu
 
Optimizing with persistent data structures (LLVM Cauldron 2016)
Optimizing with persistent data structures (LLVM Cauldron 2016)Optimizing with persistent data structures (LLVM Cauldron 2016)
Optimizing with persistent data structures (LLVM Cauldron 2016)Igalia
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsKonrad Malawski
 
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...Junho Suh
 
Fast and Scalable NUMA-based Thread Parallel Breadth-first Search
Fast and Scalable NUMA-based Thread Parallel Breadth-first SearchFast and Scalable NUMA-based Thread Parallel Breadth-first Search
Fast and Scalable NUMA-based Thread Parallel Breadth-first SearchYuichiro Yasui
 
Carrier recovery and clock recovery for qpsk demodulation
Carrier recovery and clock recovery for qpsk demodulationCarrier recovery and clock recovery for qpsk demodulation
Carrier recovery and clock recovery for qpsk demodulationeSAT Publishing House
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Usatyuk Vasiliy
 
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Databricks
 
Correctness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLCorrectness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLNicolas Poggi
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
McGill Ozone Contactor Design Project
McGill Ozone Contactor Design ProjectMcGill Ozone Contactor Design Project
McGill Ozone Contactor Design ProjectNicholas Mead-Fox
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Mariano Rodriguez-Muro
 

Similar to Query Rewriting in RDF Stream Processing (20)

On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
Taking Security Groups to Ludicrous Speed with OVS (OpenStack Summit 2015)
 
Adventures in RDS Load Testing
Adventures in RDS Load TestingAdventures in RDS Load Testing
Adventures in RDS Load Testing
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
 
Paper Study - Demand-Driven Computation of Interprocedural Data Flow
Paper Study - Demand-Driven Computation of Interprocedural Data FlowPaper Study - Demand-Driven Computation of Interprocedural Data Flow
Paper Study - Demand-Driven Computation of Interprocedural Data Flow
 
defense
defensedefense
defense
 
Optimizing with persistent data structures (LLVM Cauldron 2016)
Optimizing with persistent data structures (LLVM Cauldron 2016)Optimizing with persistent data structures (LLVM Cauldron 2016)
Optimizing with persistent data structures (LLVM Cauldron 2016)
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
 
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...
Opensample: A Low-latency, Sampling-based Measurement Platform for Software D...
 
Fast and Scalable NUMA-based Thread Parallel Breadth-first Search
Fast and Scalable NUMA-based Thread Parallel Breadth-first SearchFast and Scalable NUMA-based Thread Parallel Breadth-first Search
Fast and Scalable NUMA-based Thread Parallel Breadth-first Search
 
Carrier recovery and clock recovery for qpsk demodulation
Carrier recovery and clock recovery for qpsk demodulationCarrier recovery and clock recovery for qpsk demodulation
Carrier recovery and clock recovery for qpsk demodulation
 
Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...Cycle’s topological optimizations and the iterative decoding problem on gener...
Cycle’s topological optimizations and the iterative decoding problem on gener...
 
3.01.Lists.pptx
3.01.Lists.pptx3.01.Lists.pptx
3.01.Lists.pptx
 
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
Correctness and Performance of Apache Spark SQL with Bogdan Ghit and Nicolas ...
 
Correctness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQLCorrectness and Performance of Apache Spark SQL
Correctness and Performance of Apache Spark SQL
 
Use of Solid Core Chromatography for the Analysis of Pharmaceutical Compounds
Use of Solid Core Chromatography for the Analysis of Pharmaceutical CompoundsUse of Solid Core Chromatography for the Analysis of Pharmaceutical Compounds
Use of Solid Core Chromatography for the Analysis of Pharmaceutical Compounds
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Ofdma 1
Ofdma 1Ofdma 1
Ofdma 1
 
McGill Ozone Contactor Design Project
McGill Ozone Contactor Design ProjectMcGill Ozone Contactor Design Project
McGill Ozone Contactor Design Project
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
 

More from Jean-Paul Calbimonte

Towards Collaborative Creativity in Persuasive Multi-agent Systems
Towards Collaborative Creativity in Persuasive Multi-agent SystemsTowards Collaborative Creativity in Persuasive Multi-agent Systems
Towards Collaborative Creativity in Persuasive Multi-agent SystemsJean-Paul Calbimonte
 
A Platform for Difficulty Assessment and Recommendation of Hiking Trails
A Platform for Difficulty Assessment andRecommendation of Hiking TrailsA Platform for Difficulty Assessment andRecommendation of Hiking Trails
A Platform for Difficulty Assessment and Recommendation of Hiking TrailsJean-Paul Calbimonte
 
Decentralized Management of Patient Profiles and Trajectories through Semanti...
Decentralized Management of Patient Profiles and Trajectories through Semanti...Decentralized Management of Patient Profiles and Trajectories through Semanti...
Decentralized Management of Patient Profiles and Trajectories through Semanti...Jean-Paul Calbimonte
 
Personal Data Privacy Semantics in Multi-Agent Systems Interactions
Personal Data Privacy Semantics in Multi-Agent Systems InteractionsPersonal Data Privacy Semantics in Multi-Agent Systems Interactions
Personal Data Privacy Semantics in Multi-Agent Systems InteractionsJean-Paul Calbimonte
 
SanTour: Personalized Recommendation of Hiking Trails to Health Pro files
SanTour: Personalized Recommendation of Hiking Trails to Health ProfilesSanTour: Personalized Recommendation of Hiking Trails to Health Profiles
SanTour: Personalized Recommendation of Hiking Trails to Health Pro filesJean-Paul Calbimonte
 
Multi-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data NotificationsMulti-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data NotificationsJean-Paul Calbimonte
 
The MedRed Ontology for Representing Clinical Data Acquisition Metadata
The MedRed Ontology for Representing Clinical Data Acquisition MetadataThe MedRed Ontology for Representing Clinical Data Acquisition Metadata
The MedRed Ontology for Representing Clinical Data Acquisition MetadataJean-Paul Calbimonte
 
Fundamentos de Scala (Scala Basics) (español) Catecbol
Fundamentos de Scala (Scala Basics) (español) CatecbolFundamentos de Scala (Scala Basics) (español) Catecbol
Fundamentos de Scala (Scala Basics) (español) CatecbolJean-Paul Calbimonte
 
Detection of hypoglycemic events through wearable sensors
Detection of hypoglycemic events through wearable sensorsDetection of hypoglycemic events through wearable sensors
Detection of hypoglycemic events through wearable sensorsJean-Paul Calbimonte
 
The Schema Editor of OpenIoT for Semantic Sensor Networks
The Schema Editor of OpenIoT for Semantic Sensor NetworksThe Schema Editor of OpenIoT for Semantic Sensor Networks
The Schema Editor of OpenIoT for Semantic Sensor NetworksJean-Paul Calbimonte
 
Scala Programming for Semantic Web Developers ESWC Semdev2015
Scala Programming for Semantic Web Developers ESWC Semdev2015Scala Programming for Semantic Web Developers ESWC Semdev2015
Scala Programming for Semantic Web Developers ESWC Semdev2015Jean-Paul Calbimonte
 
XGSN: An Open-source Semantic Sensing Middleware for the Web of Things
XGSN: An Open-source Semantic Sensing Middleware for the Web of ThingsXGSN: An Open-source Semantic Sensing Middleware for the Web of Things
XGSN: An Open-source Semantic Sensing Middleware for the Web of ThingsJean-Paul Calbimonte
 
GSN Global Sensor Networks for Environmental Data Management
GSN Global Sensor Networks for Environmental Data ManagementGSN Global Sensor Networks for Environmental Data Management
GSN Global Sensor Networks for Environmental Data ManagementJean-Paul Calbimonte
 
SSN2013 Demo: tablet based visualization of transport data with SPARQLStream
SSN2013 Demo: tablet based visualization of transport data with SPARQLStreamSSN2013 Demo: tablet based visualization of transport data with SPARQLStream
SSN2013 Demo: tablet based visualization of transport data with SPARQLStreamJean-Paul Calbimonte
 
Tutorial Stream Reasoning SPARQLstream and Morph-streams
Tutorial Stream Reasoning SPARQLstream and Morph-streamsTutorial Stream Reasoning SPARQLstream and Morph-streams
Tutorial Stream Reasoning SPARQLstream and Morph-streamsJean-Paul Calbimonte
 

More from Jean-Paul Calbimonte (20)

Towards Collaborative Creativity in Persuasive Multi-agent Systems
Towards Collaborative Creativity in Persuasive Multi-agent SystemsTowards Collaborative Creativity in Persuasive Multi-agent Systems
Towards Collaborative Creativity in Persuasive Multi-agent Systems
 
A Platform for Difficulty Assessment and Recommendation of Hiking Trails
A Platform for Difficulty Assessment andRecommendation of Hiking TrailsA Platform for Difficulty Assessment andRecommendation of Hiking Trails
A Platform for Difficulty Assessment and Recommendation of Hiking Trails
 
Stream reasoning agents
Stream reasoning agentsStream reasoning agents
Stream reasoning agents
 
Decentralized Management of Patient Profiles and Trajectories through Semanti...
Decentralized Management of Patient Profiles and Trajectories through Semanti...Decentralized Management of Patient Profiles and Trajectories through Semanti...
Decentralized Management of Patient Profiles and Trajectories through Semanti...
 
Personal Data Privacy Semantics in Multi-Agent Systems Interactions
Personal Data Privacy Semantics in Multi-Agent Systems InteractionsPersonal Data Privacy Semantics in Multi-Agent Systems Interactions
Personal Data Privacy Semantics in Multi-Agent Systems Interactions
 
RDF data validation 2017 SHACL
RDF data validation 2017 SHACLRDF data validation 2017 SHACL
RDF data validation 2017 SHACL
 
SanTour: Personalized Recommendation of Hiking Trails to Health Pro files
SanTour: Personalized Recommendation of Hiking Trails to Health ProfilesSanTour: Personalized Recommendation of Hiking Trails to Health Profiles
SanTour: Personalized Recommendation of Hiking Trails to Health Pro files
 
Multi-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data NotificationsMulti-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data Notifications
 
The MedRed Ontology for Representing Clinical Data Acquisition Metadata
The MedRed Ontology for Representing Clinical Data Acquisition MetadataThe MedRed Ontology for Representing Clinical Data Acquisition Metadata
The MedRed Ontology for Representing Clinical Data Acquisition Metadata
 
Fundamentos de Scala (Scala Basics) (español) Catecbol
Fundamentos de Scala (Scala Basics) (español) CatecbolFundamentos de Scala (Scala Basics) (español) Catecbol
Fundamentos de Scala (Scala Basics) (español) Catecbol
 
Detection of hypoglycemic events through wearable sensors
Detection of hypoglycemic events through wearable sensorsDetection of hypoglycemic events through wearable sensors
Detection of hypoglycemic events through wearable sensors
 
The Schema Editor of OpenIoT for Semantic Sensor Networks
The Schema Editor of OpenIoT for Semantic Sensor NetworksThe Schema Editor of OpenIoT for Semantic Sensor Networks
The Schema Editor of OpenIoT for Semantic Sensor Networks
 
Scala Programming for Semantic Web Developers ESWC Semdev2015
Scala Programming for Semantic Web Developers ESWC Semdev2015Scala Programming for Semantic Web Developers ESWC Semdev2015
Scala Programming for Semantic Web Developers ESWC Semdev2015
 
Streams of RDF Events Derive2015
Streams of RDF Events Derive2015Streams of RDF Events Derive2015
Streams of RDF Events Derive2015
 
XGSN: An Open-source Semantic Sensing Middleware for the Web of Things
XGSN: An Open-source Semantic Sensing Middleware for the Web of ThingsXGSN: An Open-source Semantic Sensing Middleware for the Web of Things
XGSN: An Open-source Semantic Sensing Middleware for the Web of Things
 
X-GSN in OpenIoT SummerSchool
X-GSN in OpenIoT SummerSchoolX-GSN in OpenIoT SummerSchool
X-GSN in OpenIoT SummerSchool
 
GSN Global Sensor Networks for Environmental Data Management
GSN Global Sensor Networks for Environmental Data ManagementGSN Global Sensor Networks for Environmental Data Management
GSN Global Sensor Networks for Environmental Data Management
 
SSN2013 Demo: tablet based visualization of transport data with SPARQLStream
SSN2013 Demo: tablet based visualization of transport data with SPARQLStreamSSN2013 Demo: tablet based visualization of transport data with SPARQLStream
SSN2013 Demo: tablet based visualization of transport data with SPARQLStream
 
Tutorial Stream Reasoning SPARQLstream and Morph-streams
Tutorial Stream Reasoning SPARQLstream and Morph-streamsTutorial Stream Reasoning SPARQLstream and Morph-streams
Tutorial Stream Reasoning SPARQLstream and Morph-streams
 
SPARQLstream and Morph-streams
SPARQLstream and Morph-streamsSPARQLstream and Morph-streams
SPARQLstream and Morph-streams
 

Query Rewriting in RDF Stream Processing

  • 1. Query Rewriting in RDF Stream Processing Jean-Paul Calbimonte – Jose Mora – Oscar Corcho LSIR EPFL OEG UPM ESWC Heraklion, Greece. June 2016 @jpcik
  • 2. 2 Query Rewriting in RDF Stream Processing
  • 3. RDF Streams 3 s p o Triple s p o t Time-stamped triple s p o s p o s p o t Time-stamped graph Gi-1 Gi Gi+1 Gi+2 Gi+3 RDF stream RSPS W(S,t)
  • 4. Windowed Queries in RSP 4 :obs1 rdf:type ssn:Observation . :obs2 rdf:type ssn:Observation . :obs3 rdf:type ssn:Observation . qG(x) ← ssn:Observation(x) :obs1,:obs2 ,:obs3 qW(S,t)(x) ← ssn:Observation(x) :obs1,:obs2 [5] :obs3,:obs4 [10] :obs5,:obs6 [15] g1[3] [4] [7] [10] [11] [13] :obs1 rdf:type ssn:Observation . g2 :obs2 rdf:type ssn:Observation . g3 :obs3 rdf:type ssn:Observation . g4 :obs4 rdf:type ssn:Observation . g5 :obs5 rdf:type ssn:Observation . g6 :obs6 rdf:type ssn:Observation . [1] S G graph W W size=5 qW(S,t)(x) ← ssn:observedBy(x,y) No answers for this query: no matches in the stream
  • 5. RSP Query Languages 5 SELECT (MAX(?temp) AS ?maxtemp) ?sensor WHERE { STREAM :stream [RANGE 1h] { ?obs ssn:observationResult ?result; ssn:observedProperty cf-property:air_temperature; ssn:observedBy ?sensor. ?result ssn:hasValue ?obsValue. ?obsValue qu:numericalValue ?temp. } } GROUP BY ?sensor SELECT ?obs WHERE { STREAM :stream [RANGE 5s] {?obs rdf:type ssn:Observation. } } CQELS C-SPARQL SPARQLStream EP-SPARQL RSP languages Example: Get the maximum air temperature observed in the last hour, grouped by sensor.
  • 6. 6 Query Rewriting in RDF Stream Processing
  • 7. Query Rewriting 7 q’’(x) ← HumiditySensor(x) Query Rewriting q(x) ← Sensor(x) q’(x) ← Sensor(x) Sensor HumiditySensor TBox Original Query Rewritten UCQ ABox Ontology expressiveness: e.g. RDFS, ELHIO, etc. OBDA systems, query rewriters: Prexto, Requiem, Kyrie, etc. Explosion of rewritings, complexity of queries
  • 8. Semantic Sensor Network Ontology 8 ssn:Sensor ssn:Platform ssn:FeatureOfInterest ssn:Deployment ssn:Property cf-prop:air_temperature ssn:observes ssn:onPlatform dul:Place dul:hasLocation ssn:SensingDevicessn:inDeployment ssn:MeasurementCapability ssn:MeasurementProperty geo:lat, geo:lng xsd:double ssn:hasMeasurementProperty ssn:Accuracy ssn:ofFeature aws:TemperatureSensor aws:Thermistor ssn:Latency dim:Temperature qu:QuantityKind cf-prop:soil_temperature cf-feat:Wind cf-feat:Surface cf- feat:Medium cf-feat:air cf-feat:soil dim:VelocityOrSpeed cf-prop:wind_speed cf-prop:rainfall_rate aws:CapacitiveBead … … …
  • 9. Rewriting in RSP 9 :obs1,:obs2 [5] :obs3,:obs4 [10] :obs5,:obs6 [15] g1[3] [4] [7] [10] [11] [13] :obs1 rdf:type ssn:Observation . g2 :obs2 rdf:type ssn:Observation . g3 :obs3 rdf:type ssn:Observation . g4 :obs4 rdf:type ssn:Observation . g5 :obs5 rdf:type ssn:Observation . g6 :obs6 rdf:type ssn:Observation . [1] q’’W(S,t)(x) ← ssn:Observation(x) Query Rewriting qW(S,t)(x) ← ssn:observedBy(x,y) q’W(S,t)(x) ← ssn:observedBy(x,y) Original Query Rewritten UCQ ssn:Observation ⊑ ∃ssn:observedBy S W(S,t)
  • 10. Rewriting in RSP: Semantics 10 𝑞: 𝑞ℎ 𝒙 ← 𝑝1 ∧ ⋯ ∧ 𝑝 𝑛(𝒙 𝑛) 𝑐𝑒𝑟𝑡 𝑞, 𝒪 = 𝜶 | 𝑞 ∪ 𝒯 ∪ 𝒜 ⊨ 𝑞ℎ(𝜶) 𝑐𝑒𝑟𝑡 𝑞, 𝒪 = 𝜶 | 𝑞′ ∪ 𝒜 ⊨ 𝑞ℎ(𝜶) Stream: 𝒜 0 , 𝒜 1 , 𝒜 2 , 𝒜 3 … 𝒜 𝑡𝑖 , … 𝒜 𝑤 𝑐𝑒𝑟𝑡 𝑞 𝑤, 𝒯, 𝒜 𝑤 = 𝜶 | 𝑞 𝑤 ∪ 𝒯 ∪ 𝒜 𝑤 ⊨ 𝑞ℎ(𝜶) 𝑐𝑒𝑟𝑡 𝑞 𝑤, 𝒯, 𝒜 𝑤 = 𝜶 | 𝑞′ 𝑤 ∪ 𝒜 𝑤 ⊨ 𝑞ℎ(𝜶) Query Certain answers Ontology rewritten query get rid of TBox Streaming Aboxes Window Evaluated only for a window Now for RDF streams:
  • 12. StreamQR: Rewriting Steps 12 preprocess separate BGP & context Ontology TBOX Datalog query CQELS query Rewritten CQELS UCQ remove functional terms remove auxiliary predicates expand syntactic transformation UCQ CQ (BGP)
  • 13. StreamQR in Action 13 met:TemperatureObservation ⊑ ∃ssn:observedBy.aws:TemperatureSensor met:AirTemperatureObservation ⊑ met:TemperatureObservation met:ThermistorObservation ⊑ met:TemperatureObservation aws:Thermistor ⊑ aws:TemperatureSensor aws:CapacitiveBead ⊑ aws:TemperatureSensor TBox CONSTRUCT { ?o a :ObservedTemperature. } WHERE { STREAM :stream1 [RANGE 10ms] { ?o ssn:observedBy ?t . ?t a aws:TemperatureSensor. } } Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1) kyrie rewriter CQELS Query CQ extraction & syntactic transformation
  • 14. StreamQR in Action 14 Q(?0) <- met:TemperatureObservation(?0) Q(?0) <- met:AirTemperatureObservation(?0) Q(?0) <- met:ThermistorObservation(?0) Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1) Q(?0) <- aws:Thermistor(?1), ssn:observedBy(?0,?1) Q(?0) <- aws:CapacitiveBead(?1), ssn:observedBy(?0,?1) Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1) met:TemperatureObservation ⊑ ∃ssn:observedBy.aws:TemperatureSensor met:AirTemperatureObservation ⊑ met:TemperatureObservation met:ThermistorObservation ⊑ met:TemperatureObservation aws:Thermistor ⊑ aws:TemperatureSensor aws:CapacitiveBead ⊑ aws:TemperatureSensor TBox Query expansion Rewritten UCQ
  • 15. StreamQR in Action 15 Q(?0) <- met:TemperatureObservation(?0) Q(?0) <- met:AirTemperatureObservation(?0) Q(?0) <- met:ThermistorObservation(?0) Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1) Q(?0) <- aws:Thermistor(?1), ssn:observedBy(?0,?1) Q(?0) <- aws:CapacitiveBead(?1), ssn:observedBy(?0,?1) PREFIX ssn: <http://purl.oclc.org/NET/ssnx/ssn#> PREFIX aws: <http://purl.oclc.org/NET/ssnx/meteo/aws#> PREFIX met: <http://purl.org/env/meteo#> CONSTRUCT { ?o a :ObservedTemperature. } WHERE { STREAM :stream1 [RANGE 10ms] { { ?o a met:TemperatureObservation } UNION { ?o a met:AirTemperatureObservation } UNION { ?o a met:ThermistorObservation } UNION { ?s a aws:TemperatureSensor . ?o ssn:observedBy ?s } UNION { ?s a aws:Thermistor . ?o ssn:observedBy ?s } UNION { ?s a aws:CapacitiveBead . ?o ssn:observedBy ?s } } CQELS rewritten query: syntactic transformation Rewritten UCQ
  • 16. Comparison: Rewriting vs No-rewriting 16 0 10000 20000 30000 40000 50000 10 100 1000 10000 100000 throughput[triple/s] input rate [triple/s] no rewriting rewriting 0 5000 10000 15000 20000 25000 30000 10 100 1000 10000 100000 throughput[triple/s] input rate [triple/s] no rewriting rewriting 0 5000 10000 15000 20000 25000 10 100 1000 10000 100000 throughput[triple/s] input rate [triple/s] no rewriting rewriting • Throughput under different input data loads • Modified version of the SRBench benchmark queries • Ontology based on AWS, which extends SSN-O • Compared the throughput of StreamQR with CQELS (no rewiriting) • Tested under different input rates, ranging from 10 to 100K triples/s • Under three different load conditions: such that only: • 10%, 50% and 90% of the input data matches the continuous query 10% of matches 90% of matches50% of matches Reaches optimal t-put up to this point
  • 17. Varying input distributions 17 0 5000 10000 15000 20000 25000 30000 35000 40000 10 100 1000 10000 100000 throughput[triple/s] input rate[triple/s] 10% match 20% match 50% match 80% match • The more matches, the more time the engine spends on evaluation • Up to 10K triples/s., StreamQR is capable to keep up with the input • Beyond that, the throughput degrades until it reaches a limit Changning que input stream triple distribution • Experiments under different input loads • Varying distribution of the types of triples • 10, 20, 50, 80 and 90% of the triples match the query.
  • 18. Throughput for different queries 18 100 1000 10000 100000 100 1000 10000 100000 throughput[triple/s] input rate[triple/s] q4 q6 q1 q5 q2 q3 q7 q10 Different queries prducing from 2 to 180 rewritings • Throughput decreases for queries that produce more rewrittings • Optimizations: existing techniques used in query rewriting and OBDA • Pruning queries that may not match any input. • For around 1K triples/s, it still reaches maximum throughput. • Different queries -> UCQ with a different number of sub-queries. • 9 distinct queries that produce from 2 to over 180 sub-queries
  • 19. Comparing with TrOwl 19 100 1000 10000 100000 100 1000 10000 100000 throughput[triple/s] input rate[triple/s] StreamQR Trowl-no-reclassify Trowl-only-add Trowl-add-remove • Removal operation known to be expensive in incremental materialization • StreamQR sustains better throughput under fast input rates • Under lower input rates both are able to reach maximum throughput. • TrOWL Provides incremental reasoning for ABoxes • Three settings: • TrOWL without performing any reasoning (noreclassify) • Activating the reasoning, but allowing only additions • Including removals
  • 20. Conclusions 20 • Query answering over ontologies for RDF stream processors • Novel approach that combines query rewriting techniques and an RSP engines • Implemented StreamQR: incorporates the kyrie into an existing RSP • Still efficient in terms of throughput, for a large range of scenarios • Plan to study other criteria such as correctness • Exploring different expressiveness • Find a good balance between efficiency and complexity. • Research in approaches that combine rewriting and incremental materialization
  • 21. Thanks a lot! ¿Tienes preguntas? Jean-Paul Calbimonte LSIR EPFL @jpcik