RDF Streams and Continuous SPARQL (C-SPARQL)

Riccardo Tommasini - DEIB Politecnico di Milano

Published in: Data & Analytics

  1. BigData Semantic Approach to Big Data and Event Processing: RDF Streams and Continuous SPARQL (C-SPARQL). Riccardo Tommasini, PhD Student, Politecnico di Milano, riccardo.tommasini@polimi.it
  2. Agenda
     • Introduction
     • Running example
     • C-SPARQL language
       – Query and Stream Registration
       – FROM STREAM Clause
       – TimeStamp Function
       – Query Chaining
       – Accessing background Information
       – Stream reasoning support
     • C-SPARQL Engine
     • Resources
  3. Memo: SPARQL
  4. Where C-SPARQL Extends SPARQL
  5. C-SPARQL language
     • C-SPARQL is an extension of SPARQL 1.1
     • C-SPARQL queries
       – Change the semantics of the SPARQL 1.1 query forms from one-time to continuous (each result is instantaneous)
       – Add the STREAM form to the SPARQL 1.1 query forms
       – Add the FROM STREAM clause to the SPARQL 1.1 dataset clauses
       – Add a built-in function to access the timestamp of a triple
  6. Running Example
     [Figure: input streams, a window, a registered continuous query, and streams of answers. Legend: R = RFID, 4 = Foursquare, f = Facebook.]
  7. Example data in the streams
     • Three ways to learn who is where

       Sensor     Room     Person  Time-stamp
       RedSensor  RedRoom  Alice   T1
       …          …        …       …

       Person  ChecksIn  Time-stamp
       Bob     BlueRoom  T2
       …       …         …

       Person  IsIn     With   Time-stamp
       Carl    null     Bob    T2
       David   RedRoom  Elena  T3
       …       …        …      …
  8. Visualize the running example
     [Figure: the rooms BlueRoom and RedRoom, the sensors RedSensor and BlueSensor, and the people Alice, David, Bob, and Carl, with "is with" links between people. Legend: R = RFID, 4 = Foursquare, f = Facebook.]
  9. What do we want to achieve?
     • Obtain answers that are not explicit in the data, but that can be inferred from the data and a conceptual integrated model

       Sensor     Room     Person  Time-stamp
       RedSensor  RedRoom  Alice   T1

       Person  ChecksIn  Time-stamp
       Bob     BlueRoom  T2

       Person  IsIn     With   Time-stamp
       Carl    null     Bob    T2
       David   RedRoom  Alice  T3

       Carl is in the BlueRoom at T2
       David is in the RedRoom at T3

     http://emanueledellavalle.org
  10. Running Example – Data Model
      [Figure: data model diagram. Classes: Observation, Sensor, Person, Post, Room. Properties: observes, posts, who, where, discusses, isWith, isIn, isConnectedTo. Relations between them: subClassOf (twice) and subPropOf. The diagram separates streaming information from background information.]
  11. Background Information
      • The Ontology
        – http://www.streamreasoning.org/ontologies/sr4ld2014-onto.rdf
      • The Static Instances
        – :Alice a :Person .
        – :Bob a :Person .
        – :Carl a :Person .
        – :David a :Person .
        – :Elena a :Person .
        – :RedRoom a :Room .
        – :BlueRoom a :Room .
        – :RedRoom :isConnectedTo :BlueRoom .
        – :RedSensor a :Sensor .
        – :BlueSensor a :Sensor .
  12. (Virtual) RDF streams
      • RDF Stream Data Type
        – Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp
      • Timestamps are not required to be unique, but they must be non-decreasing
      • E.g.,

        Streamed RDF graph                                          Time-stamp  Stream
        :RedSensor :observes [ :who :Alice; :where :RedRoom ] .     T1          rfid
        :Bob :posts [ :who :Bob ; :where :BlueRoom ] .              T2          fs
        :Carl :posts [ :who :Carl , :Bob ] .                        T2          fb
        :David :posts [ :who :David , :Elena ; :where :RedRoom]     T3          fb
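The data type above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain string tuples, the blank nodes of the example are flattened into direct `:isIn` / `:isWith` triples, and T1..T3 are replaced by the integers 1..3. The function name `is_valid_rdf_stream` is hypothetical and not part of any C-SPARQL API.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]
StreamItem = Tuple[Triple, int]  # (RDF triple, timestamp)


def is_valid_rdf_stream(items: List[StreamItem]) -> bool:
    """Return True when timestamps never decrease along the sequence."""
    return all(earlier[1] <= later[1] for earlier, later in zip(items, items[1:]))


# Assumption: simplified, flattened version of the slide's example data.
stream = [
    ((":Alice", ":isIn", ":RedRoom"), 1),
    ((":Bob", ":isIn", ":BlueRoom"), 2),
    ((":Carl", ":isWith", ":Bob"), 2),  # equal timestamps are allowed
    ((":David", ":isIn", ":RedRoom"), 3),
]
```

Equal timestamps pass the check, while any step backwards in time makes the sequence invalid.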
  13. C-SPARQL language
      • Features illustrated in the rest of this session
        – register continuous queries
          • QUERY form
          • STREAM form
        – identify relevant information in one or more RDF streams
        – derive and aggregate information from one or more RDF streams
        – join or merge RDF streams
        – feed results of one C-SPARQL query to a subsequent C-SPARQL query
        – access background RDF graphs
        – answer using rule-based encodings of OWL fragments
  14. Query and Stream Registration
  15. Query and Stream Registration
      • All C-SPARQL queries over RDF streams are continuous
        – Registered through the REGISTER statement
      • The output of queries is in the form of
        – Instantaneous tables of variable bindings
        – Instantaneous RDF graphs
        – RDF streams
      • Only queries in the CONSTRUCT form can be registered as generators of RDF streams
      • Composability:
        – Query results registered as streams can feed other registered queries just like every other RDF stream
  16. Query registration - Example
      • Using the social stream fb, who is where?

        REGISTER QUERY QWhoIsWhereOnFb AS
        PREFIX : <http://…/sr4ld2014-onto#>
        SELECT ?room ?person
        FROM STREAM <http://…/fb> [RANGE 1m STEP 10s]
        WHERE {
          ?person1 :posts [ :who ?person ; :where ?room ] .
        }

      • The resulting variable bindings have to be interpreted as instantaneous: they expire as soon as the query is recomputed
  17. Stream registration - Example
      • Results of a C-SPARQL query can be streamed out for downstream queries

        REGISTER STREAM SWhoIsWhereOnFb AS
        PREFIX : <http://…/sr4ld2014-onto#>
        CONSTRUCT { ?person :isIn ?room }
        FROM STREAM <http://…/fb> [RANGE 1m STEP 10s]
        WHERE {
          ?person1 :posts [ :who ?person ; :where ?room ] .
        }

      • The resulting RDF triples are streamed out on an RDF stream
  18. Stream Registration - Notes
      • The output is constructed in the format of an RDF stream.
      • Every query execution may produce anything from zero triples up to an entire graph.
      • The timestamp always depends on the query execution time only; it is not taken from the triples that match the patterns in the WHERE clause.
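The timestamping rule in these notes can be illustrated with a tiny Python sketch. It is not engine code: `stream_out` is a hypothetical helper, and triples and timestamps are simplified to string tuples and integers.

```python
def stream_out(constructed_triples, execution_time):
    """Stamp every constructed triple with the query execution time.

    The timestamps of the input triples that matched the WHERE clause
    play no role here, which mirrors the note above.
    """
    return [(triple, execution_time) for triple in constructed_triples]


# Assumption: two triples built by one execution of a CONSTRUCT query at time 70.
output = stream_out(
    [(":David", ":isIn", ":RedRoom"), (":Elena", ":isIn", ":RedRoom")],
    execution_time=70,
)
```

An execution that constructs nothing simply emits an empty list, which corresponds to the "zero triples" case.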
  19. FROM STREAM Clause
  20. FROM STREAM Clause
      • FROM STREAM clauses are similar to SPARQL datasets
        – They identify RDF stream data sources
        – They represent windows over an RDF stream
      • They define the RDF triples available for querying and filtering.
  21. FROM STREAM Clause - Windows
      • physical: a given number of triples
      • logical: a variable number of triples which occur during a given time interval (e.g., 1 hour)
        – Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes)
        – Tumbling: they are advanced by exactly their time interval
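A rough Python sketch of the three window kinds follows. It assumes items are `(name, timestamp)` pairs with integer seconds, and that a logical window closing at time `c` with range `r` covers the interval `(c - r, c]`. The exact boundary convention of the engine is not stated in the slides, so treat that choice, and all function names, as assumptions.

```python
def physical_window(items, size):
    """Physical window: the last `size` items, regardless of time."""
    return items[-size:] if size > 0 else []


def logical_window(items, close_time, range_s):
    """Logical window: items whose timestamp falls in (close_time - range_s, close_time]."""
    open_time = close_time - range_s
    return [item for item in items if open_time < item[1] <= close_time]


def evaluate_windows(items, range_s, step_s, first_close, last_close):
    """Contents of successive logical windows; step_s == range_s gives tumbling windows."""
    return [
        logical_window(items, close, range_s)
        for close in range(first_close, last_close + 1, step_s)
    ]


items = [("a", 5), ("b", 15), ("c", 25), ("d", 35)]
sliding = evaluate_windows(items, range_s=20, step_s=10, first_close=20, last_close=40)
tumbling = evaluate_windows(items, range_s=20, step_s=20, first_close=20, last_close=40)
```

With a step smaller than the range, consecutive sliding windows overlap and share items; with a step equal to the range, every item lands in exactly one tumbling window.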
  22. FROM STREAM Clause - Windows
      • Grammar of the FROM STREAM clause
  23. FROM STREAM Clause - Example
      • Using the social stream fb, how many people are in the same room? Count on a window of 1 minute that slides every 10 seconds

        REGISTER QUERY HowManyPoepleAreInTheSameRoom AS
        PREFIX : <http://…/sr4ld2014-onto#>
        SELECT ?room (COUNT(DISTINCT ?person) as ?n)
        FROM STREAM <http://…/fb> [RANGE 1m STEP 10s]
        WHERE {
          ?person1 :posts [ :who ?person ; :where ?room ] .
        }
        GROUP BY ?room
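What the aggregation computes on one window can be sketched in Python. The sketch assumes the window content has already been reduced to `(person, room)` pairs, one per `:who` / `:where` combination; the sample data and the function name are illustrative only.

```python
from collections import defaultdict


def count_people_per_room(window):
    """Equivalent of COUNT(DISTINCT ?person) ... GROUP BY ?room on one window."""
    people_by_room = defaultdict(set)
    for person, room in window:
        people_by_room[room].add(person)  # a set keeps each person once
    return {room: len(people) for room, people in people_by_room.items()}


# Assumption: hypothetical content of a single 1-minute window.
window = [
    (":David", ":RedRoom"),
    (":Elena", ":RedRoom"),
    (":David", ":RedRoom"),  # repeated post, counted once
    (":Bob", ":BlueRoom"),
]
counts = count_people_per_room(window)
```

In the continuous setting this computation would be repeated on the new window content at every step, here every 10 seconds.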
  24. C-SPARQL reports only snapshots
      [Figure: incoming timestamped RDF triples d1, d2, d3 on a timeline from t to t+80, observed through a time window [RANGE 40s STEP 10s].]

      Time   Window content
      t+40   d1
      t+50   d1, d2
      t+60   d1, d2
      t+70   d1, d2, d3
      t+80   d2, d3
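The snapshot table can be reproduced with a short Python simulation. The arrival times of d1, d2, and d3 are not readable from the transcript, so the values below (t+35, t+45, t+65) are assumptions chosen to be consistent with the table. The window is assumed to cover `(close - 40, close]`.

```python
def snapshots(arrivals, range_s, step_s, first_report, last_report):
    """Map each report time to the names of the items inside the window at that time."""
    result = {}
    for close in range(first_report, last_report + 1, step_s):
        result[close] = [
            name for name, ts in arrivals if close - range_s < ts <= close
        ]
    return result


# Assumption: hypothetical arrival offsets (seconds after t) for d1, d2, d3.
arrivals = [("d1", 35), ("d2", 45), ("d3", 65)]
table = snapshots(arrivals, range_s=40, step_s=10, first_report=40, last_report=80)
```

Nothing is reported between two report times: a triple that arrives at t+45 only becomes visible in the snapshot taken at t+50.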
  25. Multiple FROM STREAM Clause - Example
      • Who is where, using the social streams fb and fs?

        REGISTER QUERY WhoIsWhereOnFbAndFs AS
        PREFIX : <http://…/rsp2014-onto#>
        SELECT ?room ?person
        FROM STREAM <http://…/fb> [RANGE 1m STEP 10s]
        FROM STREAM <http://…/fs> [RANGE 1m STEP 10s]
        WHERE {
          ?person1 :posts [ :who ?person ; :where ?room ] .
        }
  26. Query Chaining
      • Who is where, using the streams output by the C-SPARQL queries that process the Foursquare and the Facebook streams?

        REGISTER STREAM isInWith AS
        PREFIX : <http://…/rsp2014-onto#>
        CONSTRUCT { ?person :isIn ?room }
        FROM STREAM <http://…/IsInFs> [RANGE 10s STEP 1s]
        FROM STREAM <http://…/IsWithFb> [RANGE 10s STEP 1s]
        WHERE {
          ?person :isWith ?person1 .
          ?person1 :isIn ?room .
          FILTER (?person1 != ?person)
        }
  27. Query Chaining
      • A C-SPARQL query Q1 registered using the STREAM clause streams results on an RDF stream
      • A downstream C-SPARQL query Q2 can open a window on the RDF stream of Q1 using the FROM STREAM clause
      • E.g.,
        [Figure: the Foursquare stream feeds the "Is in on Foursquare" query and the Facebook stream feeds the "Is with on Facebook" query; the streams they produce feed the "Is in across Facebook and Foursquare" query.]

        Input triples:
          :Bob :posts [ :who :Bob ; :where :BlueRoom ] .
          :Carl :posts [ :who :Carl , :Bob ] .
        Intermediate triples:
          :Bob :isIn :BlueRoom .
          :Carl :isWith :Bob .
        Output triple:
          :Carl :isIn :BlueRoom .
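The chain in this example can be mimicked with three plain Python functions, one per query. This is a sketch under assumptions: windows are ignored, posts are dictionaries with hypothetical keys `poster`, `who`, and `where`, and the logic of the two upstream queries is inferred from the example triples, since the slides do not show their text. Only the last function follows a query that appears in the deck.

```python
def is_in_query(posts):
    """Upstream query 1 (assumed logic): everyone named in a post is in the posted room."""
    return [
        (who, ":isIn", post["where"])
        for post in posts
        if post.get("where")
        for who in post["who"]
    ]


def is_with_query(posts):
    """Upstream query 2 (assumed logic): the poster is with every other person named."""
    return [
        (post["poster"], ":isWith", who)
        for post in posts
        for who in post["who"]
        if who != post["poster"]
    ]


def is_in_with_query(is_in_triples, is_with_triples):
    """Downstream query: ?person :isWith ?person1 . ?person1 :isIn ?room ."""
    derived = []
    for person, _, person1 in is_with_triples:
        for subject, _, room in is_in_triples:
            if subject == person1 and person1 != person:
                derived.append((person, ":isIn", room))
    return derived


fs_posts = [{"poster": ":Bob", "who": [":Bob"], "where": ":BlueRoom"}]
fb_posts = [{"poster": ":Carl", "who": [":Carl", ":Bob"], "where": None}]

is_in = is_in_query(fs_posts)
is_with = is_with_query(fb_posts)
result = is_in_with_query(is_in, is_with)
```

Each function consumes only the output of the previous stage, which is the composability property the slide describes.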
  28. What do we want to achieve?
      • Obtain answers that are not explicit in the data, but that can be inferred from the data and a conceptual integrated model

        Sensor     Room     Person  Time-stamp
        RedSensor  RedRoom  Alice   T1

        Person  ChecksIn  Time-stamp
        Bob     BlueRoom  T2

        Person  IsIn     With   Time-stamp
        Carl    null     Bob    T2
        David   RedRoom  Alice  T3

        Carl is in the BlueRoom at T2
        David is in the RedRoom at T3 ✗

      http://emanueledellavalle.org
  29. C-SPARQL queries and reasoning - example
      • Memo: posts is a sub-property of observes
      • Data

        RDF graph                                                 Time-stamp  Stream
        :RedSensor :observes [ :who :Alice; :where :RedRoom ] .   t1          sensors
        :Bob :posts [ :who :Bob ; :where :RedRoom] .              t2          fs

      • Query under RDFS entailment regime

        REGISTER QUERY QueryUnderRDFSEntailmentRegime AS
        PREFIX : <http://…/sr4ld2014-onto#>
        SELECT ?x ?room ?person
        FROM STREAM <http://…/fs> [RANGE 1m STEP 10s]
        FROM STREAM <http://…/sensors> [RANGE 1m STEP 10s]
        WHERE {
          ?x :observes [ :who ?person ; :where ?room ] .
        }

      • Results at t1 + 10s

        ?x          ?room     ?person
        :RedSensor  :RedRoom  :Alice
        :Bob        :RedRoom  :Bob
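Why the second result row appears can be shown with a small Python sketch of the sub-property rule: if `p` is a sub-property of `q`, every triple `s p o` also entails `s q o`. The sketch assumes an acyclic sub-property map and flattens each blank node into a `(who, where)` tuple. It is not how the engine or the Jena rule reasoner is implemented.

```python
# Assumption: sub-property hierarchy taken from the memo above; must be acyclic.
SUB_PROPERTY_OF = {":posts": ":observes"}


def with_super_properties(triples, sub_property_of):
    """Add the triples entailed by the sub-property hierarchy."""
    inferred = set(triples)
    for s, p, o in triples:
        parent = sub_property_of.get(p)
        while parent is not None:  # walk up the hierarchy
            inferred.add((s, parent, o))
            parent = sub_property_of.get(parent)
    return inferred


def observers(triples):
    """Rows (?x, ?room, ?person) for the pattern ?x :observes [ :who ?person ; :where ?room ]."""
    return sorted((s, o[1], o[0]) for s, p, o in triples if p == ":observes")


# Each object is a flattened blank node: (who, where).
asserted = [
    (":RedSensor", ":observes", (":Alice", ":RedRoom")),
    (":Bob", ":posts", (":Bob", ":RedRoom")),
]
plain_rows = observers(asserted)
entailed_rows = observers(with_super_properties(asserted, SUB_PROPERTY_OF))
```

Without entailment only the sensor observation matches; with it, Bob's post is also returned as an observation.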
  30. TimeStamp Function
  31. TimeStamp Function – Syntax and Semantics
      • The timestamp of a triple can be bound to a variable using a timestamp() function
      • Syntax
        – timestamp(variable|IRI|bn, variable|IRI, variable|IRI|bn|literal)
      • Semantics

        Triple                                    Result of evaluation
        It is not in the window                   Type Error
        It appears once in the window             Timestamp of triple
        It appears multiple times in the window   The timestamp of the most recent triple
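The three cases of the semantics table can be written as one small Python function. This is a sketch only: the window is assumed to be a list of `(triple, timestamp)` pairs, "most recent" is taken to mean the largest timestamp, and Python's `TypeError` stands in for the type error of the table.

```python
def timestamp(window, triple):
    """Timestamp of `triple` in `window`, following the semantics table above."""
    matches = [ts for candidate, ts in window if candidate == triple]
    if not matches:
        # Case 1: the triple is not in the window.
        raise TypeError("triple is not in the window")
    # Cases 2 and 3: a single occurrence, or the most recent of several.
    return max(matches)


# Assumption: hypothetical window content with integer timestamps.
window = [
    ((":Alice", ":isIn", ":RedRoom"), 1),
    ((":Bob", ":isIn", ":RedRoom"), 2),
    ((":Alice", ":isIn", ":RedRoom"), 5),
]
```

For instance, `timestamp(window, (":Alice", ":isIn", ":RedRoom"))` picks the later of Alice's two occurrences.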
  32. TimeStamp Function - Example
      • Who is “following” whom?

        REGISTER QUERY FindFollowers AS
        PREFIX f: <http://larkc.eu/csparql/sparql/jena/ext#>
        PREFIX : <http://…/sr4ld2014-onto#>
        SELECT ?someOne ?someOneElse ?room
        FROM STREAM <http://…/isIn> [RANGE 1m STEP 10s]
        WHERE {
          ?someOne :isIn ?room .
          ?someOneElse :isIn ?room .
          FILTER (?someOne != ?someOneElse)
          FILTER (f:timestamp(?someOne, :isIn, ?room) < f:timestamp(?someOneElse, :isIn, ?room))
        }
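The logic of this query on a single window can be approximated in Python. The sketch assumes a window of `(triple, timestamp)` pairs with integer timestamps and keeps, for every person and room, the most recent `:isIn` timestamp; the function name and the sample data are hypothetical.

```python
def find_followers(window):
    """Rows (someOne, someOneElse, room) where someOneElse entered the room later."""
    latest = {}
    for (person, predicate, room), ts in window:
        if predicate == ":isIn":
            key = (person, room)
            latest[key] = max(ts, latest.get(key, ts))  # most recent occurrence
    rows = []
    for (some_one, room), t1 in latest.items():
        for (some_one_else, other_room), t2 in latest.items():
            if room == other_room and some_one != some_one_else and t1 < t2:
                rows.append((some_one, some_one_else, room))
    return sorted(rows)


# Assumption: hypothetical content of one window on the isIn stream.
window = [
    ((":Alice", ":isIn", ":RedRoom"), 1),
    ((":Carl", ":isIn", ":BlueRoom"), 2),
    ((":Bob", ":isIn", ":RedRoom"), 3),
]
followers = find_followers(window)
```

Because only the most recent timestamp per person and room is kept, a later re-entry by Alice would reverse the direction of the "following" relation.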
  33. Accessing background Information
      • C-SPARQL lets a query also be evaluated against RDF graphs, which are named using FROM clauses.
      • E.g., Where else can Alice go?

        REGISTER QUERY WhereElseCanAliceGo AS
        PREFIX : <http://…/sr4ld2014-onto#>
        SELECT ?room
        FROM STREAM <http://…/isIn> [RANGE 10m STEP 10m]
        FROM <http://…/bgInfo>
        WHERE {
          :Alice :isIn ?someRoom .
          ?someRoom :isConnectedTo ?room .
        }

        In the FROM clause, <http://…/bgInfo> is the IRI identifying the graph containing the background information.
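The join between streaming and background data in this query can be sketched in Python. It assumes the window content and the background graph are both plain lists of string triples; the function name is hypothetical.

```python
def where_else_can_go(person, window_triples, background_triples):
    """Rooms connected to the room(s) where `person` currently is."""
    current_rooms = {
        o for s, p, o in window_triples if s == person and p == ":isIn"
    }
    return sorted(
        o
        for s, p, o in background_triples
        if p == ":isConnectedTo" and s in current_rooms
    )


# Streaming side: assumed content of the current window on the isIn stream.
window_triples = [(":Alice", ":isIn", ":RedRoom")]
# Static side: a fragment of the background information listed earlier.
background_triples = [
    (":RedRoom", ":isConnectedTo", ":BlueRoom"),
    (":Alice", "a", ":Person"),
]
rooms = where_else_can_go(":Alice", window_triples, background_triples)
```

The streaming triple supplies the current room and the static graph supplies the connectivity, so neither source alone could answer the question.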
  34. C-SPARQL Engine Architecture
      • Simple, modular architecture
      • It relies entirely on existing technologies
      • Integration of
        – DSMSs (Esper) and
        – SPARQL engines (Jena-ARQ)
  35. C-SPARQL Engine Features at a glance 1/3
      • In-memory RDF stream Processing
        – Continuous queries, filtering, aggregations, joins, sub-queries via C-SPARQL
        – Push based
        – Reactive
      • C-SPARQL Engine 0.9.5 supports
        – SPARQL 1.1 (tested with http://www.w3.org/wiki/SRBench)
        – query chaining
        – background RDF graph access and update (via SPARQL 1.1 Update)
        – naïve stream reasoning (via Jena Generic Rule Reasoner)
        – time-aware matching via timestamp function
  36. C-SPARQL Engine Features at a glance 2/3
      • Extensible Middleware
        – Runtime management of
          • RDF streams
          • C-SPARQL queries
          • Result listeners
        – API driven
      • Quick start available
        – C-SPARQL Engine
          • http://streamreasoning.org/download/csparqlreadytogopack
      • Source code is released as open source under Apache 2.0
        – C-SPARQL Engine
          • https://github.com/streamreasoning/CSPARQL-engine
          • https://github.com/streamreasoning/CSPARQL-ReadyToGoPack
  37. C-SPARQL Engine Features at a glance 3/3
      • Known limitations
        – large background data and timestamp function can spoil performance
        – no support for named graphs and named streams
        – no support for multiple windows on the same stream
        – triple-based windows are deprecated
  38. Read more
      C-SPARQL semantics
      • D. F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: C-SPARQL: a Continuous Query Language for RDF Data Streams. Int. J. Semantic Computing 4(1): 3-25 (2010)
      Most recent syntax
      • D. F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)
      Applications
      • M. Balduini, I. Celino, D. Dell’Aglio, E. Della Valle, Y. Huang, T. Lee, S.-H. Kim, V. Tresp: BOTTARI: An augmented reality mobile application to deliver personalized and location-based recommendations by continuous analysis of social media streams. J. Web Sem. 16: 33-41 (2012)
      • M. Balduini, E. Della Valle, D. Dell’Aglio, M. Tsytsarau, T. Palpanas, C. Confalonieri: Social Listening of City Scale Events Using the Streaming Linked Data Framework. International Semantic Web Conference (2) 2013: 1-16
  39. Join the RSP community
  40. What's next
      • RDF Stream Processing Language
        – Proposed semantics
          • D. Dell'Aglio, E. Della Valle, J.-P. Calbimonte, Ó. Corcho: RSP-QL Semantics: A Unifying Query Model to Explain Heterogeneity of RDF Stream Processing Systems. Int. J. Semantic Web Inf. Syst. 10(4): 17-44 (2014)
        – Proposed syntax
          • D. Dell'Aglio, E. Della Valle, J.-P. Calbimonte, Ó. Corcho: Towards A Unified Language for RDF Stream Query Processing. ESWC 2015 Workshops Post Proceedings.
      • RDF Stream Processing Services
        – Source code is released as open source under Apache 2.0
          • https://github.com/streamreasoning/rsp-services-csparql
          • https://github.com/streamreasoning/rsp-services-api
          • https://github.com/streamreasoning/rsp-services-client-example
      http://emanueledellavalle.org
  41. Credits
      • These slides are partially based on “Streaming Reasoning for Linked Data 2014” by M. Balduini, J-P Calbimonte, O. Corcho, D. Dell'Aglio, and E. Della Valle, http://streamreasoning.org/events/sr4ld2014
  42. Semantic Approach to Big Data and Event Processing. Thank you! Any questions? Riccardo Tommasini, PhD Student, Politecnico di Milano, riccardo.tommasini@polimi.it