Katraj ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For S...
ย
Query Rewriting in RDF Stream Processing
1. Query Rewriting in RDF
Stream Processing
Jean-Paul Calbimonte โ Jose Mora โ Oscar Corcho
LSIR EPFL
OEG UPM
ESWC
Heraklion, Greece. June 2016
@jpcik
3. RDF Streams
3
s p o
Triple
s p o t
Time-stamped triple
s p o
s p o
s p o
t
Time-stamped graph
Gi-1 Gi Gi+1 Gi+2 Gi+3
RDF stream
RSPS
W(S,t)
4. Windowed Queries in RSP
4
:obs1 rdf:type ssn:Observation .
:obs2 rdf:type ssn:Observation .
:obs3 rdf:type ssn:Observation .
qG(x) โ ssn:Observation(x)
:obs1,:obs2 ,:obs3
qW(S,t)(x) โ ssn:Observation(x)
:obs1,:obs2 [5]
:obs3,:obs4 [10]
:obs5,:obs6 [15]
g1[3]
[4]
[7]
[10]
[11]
[13]
:obs1 rdf:type ssn:Observation .
g2 :obs2 rdf:type ssn:Observation .
g3 :obs3 rdf:type ssn:Observation .
g4 :obs4 rdf:type ssn:Observation .
g5 :obs5 rdf:type ssn:Observation .
g6 :obs6 rdf:type ssn:Observation .
[1]
S
G
graph
W
W size=5
qW(S,t)(x) โ ssn:observedBy(x,y)
No answers for this query:
no matches in the stream
5. RSP Query Languages
5
SELECT (MAX(?temp) AS ?maxtemp) ?sensor
WHERE {
STREAM :stream [RANGE 1h] {
?obs ssn:observationResult ?result;
ssn:observedProperty cf-property:air_temperature;
ssn:observedBy ?sensor.
?result ssn:hasValue ?obsValue.
?obsValue qu:numericalValue ?temp.
}
} GROUP BY ?sensor
SELECT ?obs WHERE {
STREAM :stream [RANGE 5s]
{?obs rdf:type ssn:Observation. }
}
CQELS
C-SPARQL
SPARQLStream
EP-SPARQL
RSP languages
Example:
Get the maximum air
temperature
observed in the last
hour, grouped by
sensor.
15. StreamQR in Action
15
Q(?0) <- met:TemperatureObservation(?0)
Q(?0) <- met:AirTemperatureObservation(?0)
Q(?0) <- met:ThermistorObservation(?0)
Q(?0) <- aws:TemperatureSensor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:Thermistor(?1), ssn:observedBy(?0,?1)
Q(?0) <- aws:CapacitiveBead(?1), ssn:observedBy(?0,?1)
PREFIX ssn: <http://purl.oclc.org/NET/ssnx/ssn#>
PREFIX aws: <http://purl.oclc.org/NET/ssnx/meteo/aws#>
PREFIX met: <http://purl.org/env/meteo#>
CONSTRUCT { ?o a :ObservedTemperature. }
WHERE { STREAM :stream1 [RANGE 10ms] {
{ ?o a met:TemperatureObservation } UNION
{ ?o a met:AirTemperatureObservation } UNION
{ ?o a met:ThermistorObservation } UNION
{ ?s a aws:TemperatureSensor . ?o ssn:observedBy ?s } UNION
{ ?s a aws:Thermistor . ?o ssn:observedBy ?s } UNION
{ ?s a aws:CapacitiveBead . ?o ssn:observedBy ?s }
}
CQELS rewritten
query: syntactic
transformation
Rewritten UCQ
16. Comparison: Rewriting vs No-rewriting
16
0
10000
20000
30000
40000
50000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
0
5000
10000
15000
20000
25000
30000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
0
5000
10000
15000
20000
25000
10 100 1000 10000 100000
throughput[triple/s]
input rate [triple/s]
no rewriting
rewriting
โข Throughput under different input data loads
โข Modified version of the SRBench benchmark queries
โข Ontology based on AWS, which extends SSN-O
โข Compared the throughput of StreamQR with CQELS (no rewiriting)
โข Tested under different input rates, ranging from 10 to 100K triples/s
โข Under three different load conditions: such that only:
โข 10%, 50% and 90% of the input data matches the continuous query
10% of matches 90% of matches50% of matches
Reaches optimal t-put
up to this point
17. Varying input distributions
17
0
5000
10000
15000
20000
25000
30000
35000
40000
10 100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
10% match
20% match
50% match
80% match
โข The more matches, the more time the engine spends on evaluation
โข Up to 10K triples/s., StreamQR is capable to keep up with the input
โข Beyond that, the throughput degrades until it reaches a limit
Changning que input
stream triple distribution
โข Experiments under different input loads
โข Varying distribution of the types of triples
โข 10, 20, 50, 80 and 90% of the triples match the query.
18. Throughput for different queries
18
100
1000
10000
100000
100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
q4
q6
q1
q5
q2
q3
q7
q10
Different queries prducing
from 2 to 180 rewritings
โข Throughput decreases for queries that produce more rewrittings
โข Optimizations: existing techniques used in query rewriting and OBDA
โข Pruning queries that may not match any input.
โข For around 1K triples/s, it still reaches maximum throughput.
โข Different queries -> UCQ with a different number of sub-queries.
โข 9 distinct queries that produce from 2 to over 180 sub-queries
19. Comparing with TrOwl
19
100
1000
10000
100000
100 1000 10000 100000
throughput[triple/s]
input rate[triple/s]
StreamQR
Trowl-no-reclassify
Trowl-only-add
Trowl-add-remove
โข Removal operation known to be expensive in incremental materialization
โข StreamQR sustains better throughput under fast input rates
โข Under lower input rates both are able to reach maximum throughput.
โข TrOWL Provides incremental reasoning for ABoxes
โข Three settings:
โข TrOWL without performing any reasoning (noreclassify)
โข Activating the reasoning, but allowing only additions
โข Including removals
20. Conclusions
20
โข Query answering over ontologies for RDF stream processors
โข Novel approach that combines query rewriting techniques and an
RSP engines
โข Implemented StreamQR: incorporates the kyrie into an existing RSP
โข Still efficient in terms of throughput, for a large range of scenarios
โข Plan to study other criteria such as correctness
โข Exploring different expressiveness
โข Find a good balance between efficiency and complexity.
โข Research in approaches that combine rewriting and incremental
materialization