ReferencesOther groups’ papers[1] Darko Anicic, Paul Fodor, Sebastian Rudolph, Nenad Stojanovic: EP-SPARQL: a    unified l...
ReferencesBackground papers   [Henzinger98] Henzinger, M. R. & Raghavan, P. (1998). Computing on    data streams. Systems...
Thank You! Questions?                                Much More to Come!                                  Keep an eye on   ...

Challenges, Approaches, and Solutions in Stream Reasoning

The presentation on Stream Reasoning that I gave at Semantic Days 2012 (https://www.posccaesar.org/wiki/PCA/SemanticDays2012). Its main goal is to give the most up-to-date, comprehensive view of Stream Reasoning.

Published in: Education, Technology

Transcript of "Challenges, Approaches, and Solutions in Stream Reasoning"

  1. Challenges, Approaches, and Solutions in Stream Reasoning
      http://streamreasoning.org
      Emanuele Della Valle, DEI - Politecnico di Milano
      emanuele.dellavalle@polimi.it, http://emanueledellavalle.org
  2. Agenda
      • Motivation
      • Concept
      • Achievements
      • Applications
      • Conclusions
      Stavanger, 2012-5-9. Emanuele Della Valle - visit http://streamreasoning.org
  3. Motivation: It's a Streaming World! [IEEE-IS2009] (1/4)
      • Oil operations
      • Traffic
      • Financial markets
      • Social networks
      • Generate data streams!
  4. Motivation: It's a Streaming World! [IEEE-IS2009] (2/4)
      • … and we want to analyse data streams in real time
      • If a well is in the process of drowning, how much time do I have, given its historical behavior?
      • Is public transportation where the people are?
      • Can we detect any intra-day correlation clusters among stock exchanges?
      • Who is driving the discussion about the top 10 emerging topics?
  5. Motivation: It's a Streaming World! [IEEE-IS2009] (3/4)
      • e.g., Real Time Rome (mobile network data streams)
      [Figure: afternoon and evening snapshots for a normal day and an exceptional day; source: http://senseable.mit.edu/realtimerome/]
  6. Motivation: It's a Streaming World! [IEEE-IS2009] (4/4)
      • e.g., Pulse of the Nation (social media streams)
      [Figure: snapshots at 12:00 and 23:00, with a "happier" scale; source: http://www.ccs.neu.edu/home/amislove/twittermood/]
  7. Motivation: What are data streams anyway?
      • Formally:
        – Data streams are unbounded sequences of time-varying data elements
      • Less formally:
        – an (almost) "continuous" flow of information
        – with the recent information being more relevant, as it describes the current state of a dynamic system
  8. Motivation: The continuous nature of streams
      • The nature of streams requires a paradigmatic change*
        – from persistent data
          • to be stored and queried on demand
          • a.k.a. one-time semantics
        – to transient data
          • to be consumed on the fly by continuous queries
          • a.k.a. continuous semantics
      * This paradigmatic change first arose in the DB community [Henzinger98]
  9. Motivation – The continuous nature of streams: Continuous Semantics
      • Continuous queries are registered over streams that, in most cases, are observed through windows
      [Figure: input streams are observed through a window by a registered continuous query, which produces streams of answers]
  10. Motivation – The continuous nature of streams: Tools exist [Cugola2011]
      • Types
        – Data Stream Management Systems
        – Complex Event Processors
      • Research prototypes
        – Amazon/Cougar (Cornell): sensors
        – Aurora (Brown/MIT): sensor monitoring, dataflow
        – Gigascope (AT&T Labs): network monitoring
        – Hancock (AT&T): telecom streams
        – Niagara (OGI/Wisconsin): Internet DBs & XML
        – OpenCQ (Georgia): triggers, view maintenance
        – Stream (Stanford): general-purpose DSMS
        – Stream Mill (UCLA): power & extensibility
        – Tapestry (Xerox): publish/subscribe filtering
        – Telegraph (Berkeley): adaptive engine for sensors
        – Tribeca (Bellcore): network monitoring
      • High-tech startups
        – Streambase, Coral8, Apama, Truviso
      • Major DBMS vendors are all adding stream extensions as well
        – IBM InfoSphere Streams
        – Microsoft StreamInsight
        – Oracle CEP
  11. Motivation: New Requirements → New Challenges
      Typical Requirements
      • Processing streams                     → Continuous semantics
      • Large datasets                         → Scalable processing
      • Reactivity                             → Real-time systems
      • Fine-grained information access        → Powerful query languages
      • Modeling complex application domains   → Rich ontology languages
  12. Motivation: Are DSMS/CEP ready to address them?
      [Table: the typical requirements and matching challenges of slide 11, with a DSMS/CEP column whose per-row marks are not preserved in this transcript]
  13. Motivation: Is the Semantic Web ready to address them?
      • The Semantic Web, the Web of Data, is doing fine
        – RDF, RDF Schema, SPARQL, OWL, RIF
        – well-understood theory
        – rapid increase in scalability
      • BUT it pretends that the world is static or, at best, has a low change rate in both change volume and change frequency
        – ontology versioning
        – belief revision
        – time stamps on named graphs
      • It sticks to the traditional one-time semantics
  14. Motivation: New Requirements → New Challenges
      [Table: the typical requirements and matching challenges of slide 11, with a Semantic Web column whose per-row marks are not preserved in this transcript]
  15. Motivation: New Requirements call for Stream Reasoning
      [Table: the typical requirements and matching challenges of slide 11, labelled "Stream Reasoning"]
  16. Concept: Stream Reasoning Definition [IEEE-IS2010]
      • Making sense
        – in real time
        – of multiple, heterogeneous, gigantic and inevitably noisy data streams
        – in order to support the decision process of extremely large numbers of concurrent users
      • Note: making sense of streams necessarily requires processing them against rich background knowledge, an unsolved problem in databases
  17. Concept: Research Challenges
      • Relation with DSMSs and CEPs
        – Just as RDF relates to database systems?
      • Data types and query languages for semantic streams
        – Just RDF and SPARQL but with continuous semantics?
      • Reasoning on Streams
        – Theory
        – Efficiency
        – Scalability
      • Dealing with incomplete & noisy data
        – Even more than on the current Web of Data
      • Distributed and parallel processing
        – Streams are parallel in nature, …
      • Engineering Stream Reasoning Applications
        – Development Environment
        – Integration with other technologies
        – Benchmarks
  18. Achievements
      • RDF Streams
        – Notion defined
      • C-SPARQL
        – Syntax and semantics defined as a SPARQL extension
        – Engine designed and implemented
      • Experiments with C-SPARQL under simple RDF entailment regimes
        – window-based selection of C-SPARQL outperforms the standard FILTER-based selection
        – algebraic optimizations of C-SPARQL queries are possible
        – complex events can be detected using a network of C-SPARQL queries at high throughputs
      • Experiments with C-SPARQL under RDFS++ entailment regimes
        – efficient incremental updates of deductive closures investigated
        – our approach outperforms the state of the art when updates come as a stream
      • Streaming Linked Data Framework prototyped
  20. Memo: Sensor Network Ontology [figure]
  21. Memo: Sensor Network Ontology [Figure legend: streaming part, static part]
  22. Achievements: RDF Stream [WWW2009,EDBT2010,IJSC2010]
      • RDF Stream Data Type
        – Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp
        – Timestamps are not required to be unique, but they must be non-decreasing
      • E.g.,
          (< :s1 ssn:generatedObservation :o1 >, 2010-02-12T13:34:41)
          (< :o1 a weather:SnowfallObservation >, 2010-02-12T13:34:41)
          (< :s1 om-owl:generatedObservation :o2 >, 2010-02-12T13:36:28)
          (< :o2 a weather:WindSpeedObservation >, 2010-02-12T13:36:28)
          (< :o2 ssn:result :a1 >, 2010-02-12T13:36:28)
          (< :a1 ssn:floatValue "35.4"^^xsd:float >, 2010-02-12T13:36:28)
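The RDF stream data type above can be illustrated with a minimal Python sketch. It is not part of the C-SPARQL engine: the names `TimestampedTriple` and `checked_stream` are hypothetical, and triples are kept as plain strings. The sketch only shows the one constraint the slide states, namely that timestamps may repeat but must never decrease.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass(frozen=True)
class TimestampedTriple:
    """One element of an RDF stream: a triple paired with its timestamp."""
    triple: Triple
    timestamp: datetime


def checked_stream(items: Iterable[TimestampedTriple]) -> Iterator[TimestampedTriple]:
    """Yield items in order, rejecting any timestamp lower than the previous one."""
    previous = None
    for item in items:
        if previous is not None and item.timestamp < previous:
            raise ValueError(f"timestamp decreased at {item.triple}")
        previous = item.timestamp
        yield item


# The six pairs shown on the slide (equal timestamps are allowed).
t1 = datetime(2010, 2, 12, 13, 34, 41)
t2 = datetime(2010, 2, 12, 13, 36, 28)
example = [
    TimestampedTriple((":s1", "ssn:generatedObservation", ":o1"), t1),
    TimestampedTriple((":o1", "a", "weather:SnowfallObservation"), t1),
    TimestampedTriple((":s1", "om-owl:generatedObservation", ":o2"), t2),
    TimestampedTriple((":o2", "a", "weather:WindSpeedObservation"), t2),
    TimestampedTriple((":o2", "ssn:result", ":a1"), t2),
    TimestampedTriple((":a1", "ssn:floatValue", '"35.4"^^xsd:float'), t2),
]

accepted = list(checked_stream(example))
```

Feeding the same pairs in reverse order raises `ValueError`, because the timestamps would then decrease.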
  24. Memo: SPARQL [figure]
  25. Achievements: Where C-SPARQL Extends SPARQL [figure]
  26. Achievements: An Example of C-SPARQL Query
      Which sensors have observed winds above 50 mph in the last half an hour? What is the observed average wind speed?
          REGISTER STREAM AvgWindSpeed AS
          CONSTRUCT { ?sens w:avgWindSpeed ?avgWindSpeed }
          FROM STREAM <.../streams/ssnmeteostream> [RANGE 1h STEP 1h]
          WHERE {
            SELECT ?sens (AVG(?v) AS ?avgWindSpeed)
            WHERE {
              ?sens om-owl:generatedObservation ?o .
              ?o a weather:WindSpeedObservation .
              ?o om-owl:result ?r .
              ?r om-owl:floatValue ?v .
            }
            GROUP BY ?sens
            HAVING (?avgWindSpeed > 5)
          }
  27. Achievements: An Example of C-SPARQL Query (annotated)
      The query of slide 26, with its C-SPARQL-specific parts called out:
      • Query registration (for continuous execution): REGISTER STREAM AvgWindSpeed AS
      • RDF Stream added as a new output format
      • FROM STREAM clause: FROM STREAM <.../streams/ssnmeteostream>
      • WINDOW: [RANGE 1h STEP 1h]
      • SPARQL 1.1 features
        – sub-queries
        – aggregates
  29. Achievements: FROM STREAM Clause - Types of Window
      • physical: a given number of triples
      • logical: a variable number of triples which occur during a given time interval (e.g., 1 hour)
        – Sliding: they are progressively advanced by a given STEP (e.g., 5 minutes)
        – Tumbling: they are advanced by exactly their time interval
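The window types above can be sketched in a few lines of Python. This is an illustration only, not the C-SPARQL engine's code: the function names are hypothetical, timestamps are plain integers (minutes), and the half-open interval `[close - width, close)` is an assumed boundary convention that the slides do not specify.

```python
from typing import List, Sequence, Tuple

Timed = Tuple[int, str]  # (timestamp in minutes, element); illustrative only


def physical_window(stream: Sequence[Timed], size: int) -> List[Timed]:
    """Physical window: the last `size` elements, however long they took to arrive."""
    return list(stream[-size:]) if size > 0 else []


def logical_windows(stream: Sequence[Timed], width: int, step: int,
                    start: int, end: int) -> List[List[Timed]]:
    """Logical windows of `width` minutes, advanced by `step` minutes.

    step < width gives sliding windows; step == width gives tumbling windows.
    Each window covers the half-open interval [close - width, close).
    """
    windows = []
    close = start + width
    while close <= end:
        windows.append([(ts, x) for ts, x in stream if close - width <= ts < close])
        close += step
    return windows


events = [(5, "a"), (20, "b"), (50, "c"), (65, "d"), (110, "e"), (125, "f")]

tumbling = logical_windows(events, width=60, step=60, start=0, end=180)
sliding = logical_windows(events, width=60, step=30, start=0, end=180)
last_two = physical_window(events, 2)
```

With tumbling windows every element falls into exactly one window, while with sliding windows an element such as `(65, "d")` is seen by two consecutive windows.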
  30. Achievements: Efficiency of Evaluation [IEEE-IS2010]
      • window-based selection of C-SPARQL outperforms the standard FILTER-based selection
  32. Achievements: Algebraic optimizations of C-SPARQL [EDBT2010]
      • Several transformations can be applied to the algebraic representation of C-SPARQL
      • some recall well-known results from classical relational optimization
        – push of FILTERs and projections
      • some are more specific to the domain of streams
        – push of aggregates
  33. Achievements: Algebraic optimizations of C-SPARQL [EDBT2010]
      • Push of filters and projections
      [Figure: execution time (ms, 0 to 125) vs. window size (10 to 100000) for four settings: None, Static Only, Streaming Only, Both]
  35. Achievements: Complex Event Detection as stream compositions
      • e.g., continuous detection of blizzards by analyzing multiple streams of data generated by weather sensors spread across a continental area
      Blizzard: a severe snowstorm with high winds and low visibility lasting at least three hours
  36. Achievements: Complex Event Detection as stream compositions
      [Figure: network of C-SPARQL queries (Q)]
      • Linked Sensor Data → Adapter
      • Q [1 HOUR] [TUMBLING]: Count Snow Fall
      • Q [1 HOUR] [TUMBLING]: AVG Wind Speed
      • Q [1 HOUR] [TUMBLING]: AVG Temp
      • Q [3 HOURS] [STEP 1 HOUR]: Blizzard
      Legend: Q = C-SPARQL Query
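The composition in this query network can be sketched in Python as two stages: three tumbling one-hour aggregates feeding a three-hour window that advances by one hour. Everything here is an assumption made for illustration, including the class and function names, the thresholds `min_wind` and `max_temp`, and the rule that all three hours must qualify; the actual C-SPARQL queries are not shown in the slides.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Observation:
    hour: int      # index of the one-hour tumbling window the reading falls into
    kind: str      # "snow", "wind" or "temp"
    value: float


@dataclass
class HourlySummary:
    hour: int
    snow_count: int
    avg_wind: float
    avg_temp: float


def hourly_summaries(observations: List[Observation], hours: int) -> List[HourlySummary]:
    """Stage 1: the three [1 HOUR] [TUMBLING] queries (count snow, avg wind, avg temp)."""
    summaries = []
    for h in range(hours):
        in_hour = [o for o in observations if o.hour == h]
        winds = [o.value for o in in_hour if o.kind == "wind"]
        temps = [o.value for o in in_hour if o.kind == "temp"]
        summaries.append(HourlySummary(
            hour=h,
            snow_count=sum(1 for o in in_hour if o.kind == "snow"),
            # NaN when no reading exists, so the comparisons below evaluate to False
            avg_wind=mean(winds) if winds else math.nan,
            avg_temp=mean(temps) if temps else math.nan,
        ))
    return summaries


def blizzard_hours(summaries: List[HourlySummary],
                   min_wind: float = 35.0, max_temp: float = 0.0) -> List[int]:
    """Stage 2: the [3 HOURS] [STEP 1 HOUR] query over the stage-1 output."""
    alerts = []
    for i in range(2, len(summaries)):
        window = summaries[i - 2:i + 1]
        if all(s.snow_count > 0 and s.avg_wind >= min_wind and s.avg_temp <= max_temp
               for s in window):
            alerts.append(summaries[i].hour)
    return alerts


# Five hours of synthetic readings: no snow in hour 0, weak wind in hour 4.
readings = []
for h in range(5):
    readings.append(Observation(h, "wind", 10.0 if h == 4 else 40.0))
    readings.append(Observation(h, "temp", -5.0))
    if h >= 1:
        readings.append(Observation(h, "snow", 1.0))

alerts = blizzard_hours(hourly_summaries(readings, hours=5))
```

Only the window covering hours 1 to 3 satisfies all three conditions, so the single alert is raised at hour 3.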
  37. Achievements: Complex Event Detection as stream compositions
      snowfall + strong winds + low temp → blizzards
      [Figure: the query network of slide 36, replicated]
      Live demonstration at http://streamreasoning.org/demos/blizzard-detection
  38. Achievements: High Throughputs [JWS2012a] [figure]
  40. Achievements: Where's the Reasoning?
      Example: can we measure the impact of a tweet?
      Twitter allows two traceable ways of discussing a tweet:
      • reply: a user replies to a tweet of another user (a reply always retweets the original tweet)
      • retweet: a user propagates an interesting tweet to his/her followers
      [Figure: example timeline from 50 min ago to now, with tweets t1 to t8 linked by reply and retweet edges]
  41. Achievements: Example of C-SPARQL and Reasoning 1/2
      What impact have I been creating with my tweets in the last hour? Let's count them …
          REGISTER STREAM OpinionSpreading COMPUTED EVERY 30s AS
          SELECT ?tweet (count(?tweet) AS ?impact)
          FROM STREAM <http://ex.org> [RANGE 60m STEP 10m]
          WHERE {
            :reply rdfs:subPropertyOf :discuss .
            :retweet rdfs:subPropertyOf :discuss .
            :t1 sr:discuss ?tweet .
            :discuss a owl:TransitiveProperty .
          }
      [Figure: the tweet graph of slide 40, where every reply and retweet edge is also a discuss edge; the resulting count is 7]
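The reasoning this query relies on can be sketched in Python: `reply` and `retweet` are both sub-properties of `discuss`, and `discuss` is transitive, so the impact of `t1` is the number of tweets reachable from it. The edge list below is hypothetical, since the exact edges of the figure are not recoverable from the transcript; it is chosen so that the count matches the 7 shown on the slide. The function name `discussed_from` is likewise an assumption.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Statement = Tuple[str, str, str]  # (source tweet, property, target tweet)


def discussed_from(root: str, statements: Iterable[Statement]) -> Set[str]:
    """Tweets that `root` is linked to by the transitive `discuss` property.

    Every reply or retweet statement counts as a discuss statement
    (rdfs:subPropertyOf), and transitivity is computed as graph reachability.
    """
    successors: Dict[str, Set[str]] = {}
    for source, prop, target in statements:
        if prop in ("reply", "retweet"):
            successors.setdefault(source, set()).add(target)

    reached: Set[str] = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt != root and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


# Hypothetical edges among the eight tweets t1..t8 of the example.
edges = [
    ("t1", "reply", "t2"), ("t1", "retweet", "t3"),
    ("t2", "reply", "t4"), ("t3", "reply", "t5"), ("t3", "retweet", "t6"),
    ("t4", "reply", "t7"), ("t5", "reply", "t8"),
]

impact = len(discussed_from("t1", edges))
```

Without the transitivity axiom only the two tweets directly linked to `t1` would be counted; with it, all seven are.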
  42. Achievements: Our approach [ESWC2010]
      • The algorithm
        1. deletes all triples (asserted or inferred) that have just expired,
        2. computes the entailments derived by the inserts,
        3. annotates each entailed triple with an expiration time, and
        4. eliminates from the current state all copies of derived triples except the one with the highest timestamp.
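The four steps above can be sketched in Python for the special case of a single transitive property. This is a simplified illustration, not the algorithm of [ESWC2010] as implemented: the function name `slide_window` is hypothetical, triples are reduced to (subject, object) edges, and giving an inferred edge the earliest expiration time among its premises is an assumption about how step 3 assigns expiration times.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]    # (subject, object) of one transitive property
State = Dict[Edge, int]   # edge -> expiration time


def slide_window(state: State, inserts: State, now: int) -> State:
    """Update a materialization when the window slides to time `now`."""
    # Step 1: delete all triples (asserted or inferred) that have expired.
    current = {edge: exp for edge, exp in state.items() if exp > now}

    frontier = dict(inserts)
    while frontier:
        # Step 4: among copies of the same triple keep the one expiring last.
        changed: State = {}
        for edge, exp in frontier.items():
            if exp > current.get(edge, now):
                current[edge] = exp
                changed[edge] = exp

        # Steps 2 and 3: derive new edges from the changed ones and annotate
        # each with the earliest expiration time among its two premises.
        frontier = {}
        for (a, b), exp_ab in changed.items():
            for (c, d), exp_cd in current.items():
                exp = min(exp_ab, exp_cd)
                if b == c and a != d:      # (a,b) + (b,d) entails (a,d)
                    frontier[(a, d)] = max(exp, frontier.get((a, d), exp))
                if d == a and c != b:      # (c,a) + (a,b) entails (c,b)
                    frontier[(c, b)] = max(exp, frontier.get((c, b), exp))
    return current


# A 60-minute window: an edge arriving at minute t expires at minute t + 60.
s0 = slide_window({}, {("t1", "t2"): 60}, now=0)
s1 = slide_window(s0, {("t2", "t4"): 70}, now=10)
s2 = slide_window(s1, {}, now=60)
```

In `s1` the inferred edge `("t1", "t4")` expires at 60, together with its premise `("t1", "t2")`, so at minute 60 both disappear in step 1 without recomputing the closure from scratch.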
  43. Implementation: Comparative Evaluation on Materialization
      • baseline: re-computing the materialization from scratch
      • state of the art [Ceri1994,Volz2005]
      • our approach [ESWC2010]
      [Figure: time (ms, logarithmic axis from 10 to 10000) vs. % of the materialization changed when the window slides (0% to 20%), comparing incremental-volz and incremental-stream]
  44. Achievements: Comparative Evaluation on Query Answering
      • comparison of the average time needed to answer a C-SPARQL query using
        – a backward reasoner
        – the naive approach of re-computing the materialization
        – our approach
      Average time (ms):
                           backward reasoning   naive approach   incremental-stream
        query                    5.82                1.61               1.61
        materialization          0                  15.91               0.28
      (the naive approach and incremental-stream are the forward-reasoning approaches)
  46. Achievements: Streaming Linked Data Framework [JWS2012a]
      Features
      • Accessing raw data streams from C-SPARQL
      • Publishing streams and C-SPARQL query results as Linked Data
      • Connecting C-SPARQL queries in a network
      • Recording and replaying portions of streams
      • Supporting fast prototyping of applications
  47. Applications: Location Based Social Media Analytics [JWS2012b] [figure]
  48. Applications: Location Based Social Media Analytics [JWS2012b]
      Live demonstration at http://streamreasoning.org/demos/bottari
  49. Applications: Blizzard Detection [JWS2012a] [figure]
  50. Applications: Blizzard Detection [JWS2012a]
      Live demonstration at http://streamreasoning.org/demos/blizzard-detection
  51. Applications: Hurricane Detection [JWS2012a] [figure]
  52. Applications: Hurricane Detection [JWS2012a]
      Live demonstration at http://streamreasoning.org/demos/hurricane-detection
  53. You can try C-SPARQL out!
      • A working prototype is available for download in a "ready to go pack": http://streamreasoning.org/download
      • The Streaming Linked Data Framework will be released soon too; ask me directly for a pre-release version.
  54. Conclusions: Research Challenges vs. Achievements
      • Relation with DSMSs and CEPs
        – Notion of RDF stream :-| alternative solutions can be investigated
      • Data types and query languages for semantic streams
        – C-SPARQL :-D work in progress in FZI&AIFB [1,2], DERI [3], UPM [4]
      • Reasoning on Streams
        – Theory :-(
        – Efficiency :-) work in progress in ISTI-Innsbruck [5]
        – Scalability :-| work in progress in IBM&VUA [6]
      • Dealing with incomplete & noisy data
        – Even more than on the current Web of Data :-( some initial joint work with SIEMENS only [IEEE-IS2010]
      • Distributed and parallel processing
        – Streams are parallel in nature, … :-| work in progress in IBM&VUA [6]
      • Engineering Stream Reasoning Applications
        – Development Environment :-) work in progress in UPM [7]
        – Integration with other technologies :-)
        – Benchmarks :-P work in progress in Planet Data [8]
  55. Credits
      • Politecnico di Milano’s colleagues
          – Prof. Stefano Ceri, who had the initial intuition about the value of introducing data streams to the Semantic Web community
          – Marco Balduini, Davide Barbieri, Daniele Braga, Stefano Ceri and Michael Grossniklaus, who helped conceive the C-SPARQL Engine and the Streaming Linked Data Framework
          – once again Davide Barbieri, who engineered most of the C-SPARQL Engine as part of his PhD
          – once again Marco Balduini, who engineered most of the Streaming Linked Data Framework as part of his M.Sc. thesis
      • Politecnico di Milano’s Master students who assisted in the design and development of the prototypes
          – Mirko Bratomi and Marco Regaldo
      • Colleagues who helped in conceiving, designing, and prototyping the applications
          – CEFRIEL: Irene Celino and Daniele Dell’Aglio
          – SIEMENS: Yi Huang and Volker Tresp
          – Saltlux: Seonho Kim and Tony Lee
  56. References: My papers
      [IEEE-IS2009] E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel: It’s a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
      [EDBT2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010
      [WWW2009] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062
      [SIGMODRec2010] D.F. Barbieri, D. Braga, S. Ceri, M. Grossniklaus: Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)
      [IEEE-IS2010] D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems, 30 Aug. 2010
      [JWS2012a] E. Della Valle, M. Balduini: SLD: a Framework for Streaming Linked Data. JWS, 2012. Under review
      [JWS2012b] M. Balduini, I. Celino, E. Della Valle, D. Dell’Aglio, Y. Huang, T. Lee, S. Kim, V. Tresp: BOTTARI: an Augmented Reality Mobile Application to deliver Personalized and Location-based Recommendations by Continuous Analysis of Social Media Streams. JWS, 2012. To appear
      [ESWC2010] D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Incremental Reasoning on Streams and Rich Background Knowledge. ESWC 2010
  57. References: Other groups’ papers
      [1] Darko Anicic, Paul Fodor, Sebastian Rudolph, Nenad Stojanovic: EP-SPARQL: a unified language for event processing and stream reasoning. WWW 2011: 635-644
      [2] Danh Le Phuoc, Minh Dao-Tran, Josiane Xavier Parreira, Manfred Hauswirth: A Native and Adaptive Approach for Unified Processing of Linked Streams and Linked Data. International Semantic Web Conference (1) 2011: 370-388
      [3] D. Anicic, S. Rudolph, P. Fodor, N. Stojanovic: Real-Time Complex Event Recognition and Reasoning - a Logic Programming Approach. Applied Artificial Intelligence 26(1-2): 6-57 (2012)
      [4] Jean-Paul Calbimonte, Óscar Corcho, Alasdair J. G. Gray: Enabling Ontology-Based Access to Streaming Data Sources. ISWC (1) 2010: 96-111
      [5] S. Komazec, D. Cerri: Towards Efficient Schema-Enhanced Pattern Matching over RDF Data Streams. First International Workshop on Ordering and Reasoning (OrdRing2011)
      [6] Jesper Hoeksema, Spyros Kotoulas: High-performance Distributed Stream Reasoning using S4. First International Workshop on Ordering and Reasoning (OrdRing2011)
      [7] A.J.G. Gray, R. Garcia-Castro, K. Kyzirakos, M. Karpathiotakis, J. Calbimonte, K.R. Page, J. Sadler, A. Frazer, I. Galpin, A.A.A. Fernandes, N.W. Paton, O. Corcho, M. Koubarakis, D. De Roure, K. Martinez, A. Gómez-Pérez: A Semantically Enabled Service Architecture for Mashups over Streaming and Stored Data. ESWC (2) 2011: 300-314
      [8] PlanetData: D1.2 Benchmarking RDF Storage Engines. http://wiki.planet-data.eu/web/D1.2
  58. References: Background papers
      [Henzinger98] M. R. Henzinger, P. Raghavan: Computing on data streams. Systems Research (1998)
      [Ceri1994] Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994)
      [Volz2005] Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)
      [Cugola2011] Alessandro Margara, Gianpaolo Cugola: Processing flows of information: from data stream to complex event processing. DEBS 2011: 359-360
  59. Thank You! Questions? Much More to Come! Keep an eye on http://www.streamreasoning.org