Incremental Reasoning on Streams and Rich Background Knowledge

The presentation I gave at ESWC 2010 in Heraklion, Greece, June 1st, 2010

  1. 1. Incremental Reasoning on Streams and Rich Background Knowledge http://streamreasoning.org Emanuele Della Valle DEI - Politecnico di Milano [email_address] http://emanueledellavalle.org Joint work with: Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, and Michael Grossniklaus
  2. 2. Agenda <ul><li>Motivation </li></ul><ul><li>Background </li></ul><ul><li>Stream Reasoning Concept </li></ul><ul><li>Past Achievements </li></ul><ul><li>Main Contribution </li></ul><ul><li>Retrospective and Conclusions </li></ul>ESWC 2010, Heraklion, Greece, June 1st, 2010
  3. 3. Motivation It's a Streaming World! [IEEE-IS2009] <ul><li>Sensor networks, … </li></ul><ul><li>traffic engineering, … </li></ul><ul><li>social networking, … </li></ul><ul><li>… and many others … </li></ul><ul><li>generate streams! </li></ul>
  4. 4. Motivation Questions People are Asking <ul><li>Given this brand of turbine, what is the expected time to failure when the bearing starts to vibrate as now detected? </li></ul><ul><li>Is a traffic jam going to happen on this highway? And is it then convenient to reallocate travelers based upon the forecast? </li></ul><ul><li>Who are the opinion makers? i.e., the users who are likely to influence the behavior of other users who follow them </li></ul>
  5. 5. Motivation Problem Statement <ul><li>Making sense </li></ul><ul><ul><li>in real time </li></ul></ul><ul><ul><li>of gigantic and inevitably noisy data streams </li></ul></ul><ul><ul><li>in order to support the decision process of extremely large numbers of concurrent users </li></ul></ul>
  6. 6. Background What are data streams anyway? <ul><li>Formally: </li></ul><ul><ul><li>Data streams are unbounded sequences of time-varying data elements </li></ul></ul><ul><li>Less formally: </li></ul><ul><ul><li>an (almost) “continuous” flow of information </li></ul></ul><ul><ul><li>with the recent information being more relevant as it describes the current state of a dynamic system </li></ul></ul>
  7. 7. Background Can the Semantic Web Process Data Streams? <ul><li>The Semantic Web, the Web of Data, is doing fine </li></ul><ul><ul><li>RDF, RDF Schema, SPARQL, OWL, DL </li></ul></ul><ul><ul><li>well understood theory, </li></ul></ul><ul><ul><li>rapid increase in scalability </li></ul></ul><ul><li>BUT it pretends that the world is static or at best has a low change rate, both in change-volume and change-frequency </li></ul><ul><ul><li>ontology versioning </li></ul></ul><ul><ul><li>belief revision </li></ul></ul><ul><ul><li>time stamps on named graphs </li></ul></ul><ul><li>It sticks to the traditional one-time semantics </li></ul>
  8. 8. Background Continuous Semantics <ul><li>Processing data streams in the space of one-time semantics is difficult because of the very nature of the underlying data </li></ul><ul><li>Innovative * assumption: continuous semantics! </li></ul><ul><ul><li>streams can be consumed on the fly rather than being stored forever and </li></ul></ul><ul><ul><li>queries are registered and continuously produce answers </li></ul></ul><ul><li>* This innovation arose in the DB community in the ’90s </li></ul>
  9. 9. Background Stream Processing <ul><li>Continuous queries are registered over streams that are observed through windows </li></ul>[Figure labels: input stream, window, Registered Continuous Query, stream of answers]
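The idea of a registered continuous query observed through a window can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration (the `ContinuousQuery` class, the time-based window with an exclusive lower bound, and the use of `len` as the query); it is not taken from any DSMS mentioned in the slides.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Item = Tuple[float, str]  # (timestamp in seconds, payload)


class ContinuousQuery:
    """Hypothetical registered query: re-evaluated on every arrival."""

    def __init__(self, width: float, query: Callable[[List[str]], object]):
        self.width = width          # window width in seconds
        self.query = query          # function applied to the window content
        self.window: Deque[Item] = deque()

    def push(self, timestamp: float, payload: str) -> object:
        """Consume one stream element and return the current answer."""
        self.window.append((timestamp, payload))
        # Evict elements outside the window (timestamp - width, timestamp].
        while self.window and self.window[0][0] <= timestamp - self.width:
            self.window.popleft()
        return self.query([p for _, p in self.window])


# Register "how many readings arrived in the last 10 seconds?" once,
# then let the stream drive the answers.
count_recent = ContinuousQuery(width=10.0, query=len)
answers = [count_recent.push(t, "reading") for t in (0.0, 4.0, 9.0, 12.0, 30.0)]
```

The stream is consumed on the fly: only the elements still inside the window are retained, and each arrival produces a new element of the answer stream.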
  10. 10. Background Key Optimization in Stream Processing <ul><li>When a continuous query is registered, generate a query execution plan </li></ul><ul><ul><li>The new plan is merged with existing plans </li></ul></ul><ul><ul><li>A global scheduler executes the plans, making the most of the experience gathered with previous queries. </li></ul></ul>
  11. 11. Background Data Stream Management Systems (DSMS) <ul><li>Research Prototypes </li></ul><ul><ul><li>Amazon/Cougar (Cornell) – sensors </li></ul></ul><ul><ul><li>Aurora (Brown/MIT) – sensor monitoring, dataflow </li></ul></ul><ul><ul><li>Gigascope: AT&T Labs – Network Monitoring </li></ul></ul><ul><ul><li>Hancock (AT&T) – Telecom streams </li></ul></ul><ul><ul><li>Niagara (OGI/Wisconsin) – Internet DBs & XML </li></ul></ul><ul><ul><li>OpenCQ (Georgia) – triggers, view maintenance </li></ul></ul><ul><ul><li>Stream (Stanford) – general-purpose DSMS </li></ul></ul><ul><ul><li>Stream Mill (UCLA) - power & extensibility </li></ul></ul><ul><ul><li>Tapestry (Xerox) – publish/subscribe filtering </li></ul></ul><ul><ul><li>Telegraph (Berkeley) – adaptive engine for sensors </li></ul></ul><ul><ul><li>Tribeca (Bellcore) – network monitoring </li></ul></ul><ul><li>High-tech startups </li></ul><ul><ul><li>Streambase, Coral8, Apama, Truviso </li></ul></ul><ul><li>Major DBMS vendors are all adding stream extensions as well </li></ul><ul><ul><li>Oracle http://www.oracle.com/technology/products/dataint/htdocs/streams_fo.html </li></ul></ul><ul><ul><li>DB2 http://www.eweek.com/c/a/Database/IBM-DB2-Turns-25-and-Prepares-for-New-Life/ </li></ul></ul>
  12. 12. Concept Stream Reasoning [IEEE-IS2009,Dagstuhl2010] <ul><li>Idea origination </li></ul><ul><ul><li>Can continuous semantics be ported to reasoning? </li></ul></ul><ul><ul><li>This is an unexplored yet high impact research area! </li></ul></ul><ul><li>Stream Reasoning </li></ul><ul><ul><li>Logical reasoning in real time on gigantic and inevitably noisy data streams in order to support the decision process of extremely large numbers of concurrent users. </li></ul></ul><ul><ul><li>-- S. Ceri, E. Della Valle, F. van Harmelen and H. Stuckenschmidt, 2010 </li></ul></ul><ul><li>Note: making sense of streams necessarily requires processing them against rich background knowledge </li></ul>
  13. 13. Concept Research Challenges (selection) [IEEE-IS2009] <ul><li>Relation with data-stream systems </li></ul><ul><ul><li>Just as RDF relates to database systems? </li></ul></ul><ul><li>Query languages for semantic streams </li></ul><ul><ul><li>Just as SPARQL for RDF but with continuous semantics? </li></ul></ul><ul><li>Reasoning on Streams </li></ul><ul><ul><li>Efficient incremental updates of deductive closures? </li></ul></ul><ul><ul><li>How to combine streams and background knowledge? </li></ul></ul><ul><li>Distributed and parallel processing </li></ul><ul><ul><li>Streams are parallel in nature </li></ul></ul><ul><li>Real-time constraints </li></ul><ul><ul><li>A reasoning task must be completed before the answer becomes useless </li></ul></ul>
  14. 14. Past Achievements Explored Continuous Semantics for SeWeb <ul><li>We gave up one-time semantics in the Semantic Web and explored the benefits provided by continuous semantics when dealing with streams </li></ul><ul><li>We investigated </li></ul><ul><ul><li>RDF streams [WWW2009] </li></ul></ul><ul><ul><ul><li>the natural extension of the RDF data model to the new continuous scenario and </li></ul></ul></ul><ul><ul><li>Continuous SPARQL (or simply C-SPARQL) [WWW2009, EDBT2010] </li></ul></ul><ul><ul><ul><li>A syntactic and semantic extension of SPARQL for querying RDF streams </li></ul></ul></ul>
  15. 15. Past Achievements RDF Stream <ul><li>RDF Stream Data Type </li></ul><ul><ul><li>Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp t </li></ul></ul><ul><ul><ul><li>(<triple>, t) </li></ul></ul></ul><ul><li>E.g., </li></ul><ul><ul><li>(<:ourmaninsa :isIn :Munich>, 2010-05-31T18:34:41) </li></ul></ul><ul><ul><li>(<:MadamMichelle :isIn :SouthAfrica>, 2010-05-31T18:24:28) </li></ul></ul><ul><ul><li>(<:Ayngelina :isIn :Nicaragua>, 2010-05-31T18:19:21) </li></ul></ul>[Callout: “just arrived in”]
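As a rough illustration of the RDF stream data type, the pairs on this slide can be modelled in Python as follows. The names `StreamElement` and `make_stream` are hypothetical, triples are simplified to tuples of strings, and ordering the sequence by non-decreasing timestamp is an assumption about what "ordered" means here.

```python
from datetime import datetime
from typing import List, NamedTuple, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class StreamElement(NamedTuple):
    """One (<triple>, t) pair of an RDF stream."""
    triple: Triple
    timestamp: datetime


def make_stream(pairs: Sequence[Tuple[Triple, str]]) -> List[StreamElement]:
    """Build a stream ordered by timestamp (assumed: oldest first)."""
    elements = [StreamElement(t, datetime.fromisoformat(ts)) for t, ts in pairs]
    return sorted(elements, key=lambda e: e.timestamp)


# The three example pairs from the slide.
stream = make_stream([
    ((":ourmaninsa", ":isIn", ":Munich"), "2010-05-31T18:34:41"),
    ((":MadamMichelle", ":isIn", ":SouthAfrica"), "2010-05-31T18:24:28"),
    ((":Ayngelina", ":isIn", ":Nicaragua"), "2010-05-31T18:19:21"),
])
```

A real implementation would be unbounded and consumed incrementally; a finite list is used here only to keep the sketch short.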
  16. 16. Past Achievements An Example of C-SPARQL Query <ul><li>Who has landed in the USA in the last hour? </li></ul><ul><li>REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS </li></ul><ul><li>PREFIX gno: <http://www.geonames.org/ontology#> </li></ul><ul><li>PREFIX c: <http://www.geonames.org/countries/#> </li></ul><ul><li>PREFIX : <http://example> </li></ul><ul><li>SELECT ?traveller ?place ?type </li></ul><ul><li>FROM <http://sws.geonames.org/nonExistingUSfeatureGraph> </li></ul><ul><li>FROM STREAM <http://someStreamGeneratedFromTwitter> </li></ul><ul><li>[ RANGE 60m STEP 5m ] </li></ul><ul><li>WHERE { </li></ul><ul><li>?traveller :isIn ?place . </li></ul><ul><li>?place gno:inCountry c:US . </li></ul><ul><li>?place gno:featureCode ?type . </li></ul><ul><li>} </li></ul>
  17. 17. Past Achievements An Example of C-SPARQL Query Explained <ul><li>Who has landed in the USA in the last hour? </li></ul><ul><li>REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS </li></ul><ul><li>PREFIX gno: <http://www.geonames.org/ontology#> </li></ul><ul><li>PREFIX c: <http://www.geonames.org/countries/#> </li></ul><ul><li>PREFIX : <http://example> </li></ul><ul><li>SELECT ?traveller ?place ?type </li></ul><ul><li>FROM <http://sws.geonames.org/nonExistingUSfeatureGraph> </li></ul><ul><li>FROM STREAM <http://someStreamGeneratedFromTwitter> </li></ul><ul><li>[ RANGE 60m STEP 5m ] </li></ul><ul><li>WHERE { </li></ul><ul><li>?traveller :isIn ?place . </li></ul><ul><li>?place gno:inCountry c:US . </li></ul><ul><li>?place gno:featureCode ?type . </li></ul><ul><li>} </li></ul>[Callouts: REGISTER QUERY … AS is the query registration (for continuous execution); FROM names an RDF graph, whose triples are combined with the triples from a stream; FROM STREAM is the FROM STREAM clause; [ RANGE 60m STEP 5m ] is the window]
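The effect of the window `[ RANGE 60m STEP 5m ]` and of combining stream triples with a static RDF graph can be sketched in Python. This is not the C-SPARQL engine: the function names, the integer minute timestamps, the half-open window boundary, and the sample data (including the feature code value `P.PPL`) are all assumptions chosen for illustration.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Timestamped = Tuple[Triple, int]  # (triple, arrival time in minutes)


def window(stream: List[Timestamped], now: int, range_min: int = 60) -> List[Triple]:
    """Triples whose timestamp lies in (now - range_min, now]."""
    return [t for t, ts in stream if now - range_min < ts <= now]


def evaluation_times(start: int, end: int, step_min: int = 5) -> range:
    """The query is re-evaluated every step_min minutes."""
    return range(start, end + 1, step_min)


def who_landed(stream: List[Timestamped], background: Set[Triple], now: int):
    """Join the window content with the static graph, as in the WHERE clause."""
    answers = []
    for traveller, pred, place in window(stream, now):
        if pred != ":isIn" or (place, "gno:inCountry", "c:US") not in background:
            continue
        for s, p, feature in background:
            if s == place and p == "gno:featureCode":
                answers.append((traveller, place, feature))
    return answers


# Hypothetical data: one US place in the static graph, two arrivals in the stream.
background = {
    (":NewYork", "gno:inCountry", "c:US"),
    (":NewYork", "gno:featureCode", "P.PPL"),
}
stream = [((":alice", ":isIn", ":NewYork"), 10), ((":bob", ":isIn", ":Munich"), 20)]
results = {now: who_landed(stream, background, now) for now in evaluation_times(60, 75)}
```

With these inputs the arrival at minute 10 is reported at minutes 60 and 65 and drops out of the answers once the window has slid past it.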
  18. 18. Past Achievements C-SPARQL Engine Architecture <ul><li>We implemented a C-SPARQL engine based on the LarKC conceptual framework </li></ul>[Figure labels: RDF Streams, Window (performed by a DSMS), Streamed Input, Select, Window Content, Abstract, RDF Graphs, Reason, Answers Streams]
  19. 19. Main Contribution Achievements vs. Research Challenges <ul><li>Relation with data-stream systems </li></ul><ul><ul><li>Notion of RDF stream [WWW2009] </li></ul></ul><ul><li>Query languages for semantic streams </li></ul><ul><ul><li>C-SPARQL [WWW2009,EDBT2010] </li></ul></ul><ul><li>Reasoning on Streams </li></ul><ul><ul><li>Efficient incremental updates of deductive closures </li></ul></ul><ul><ul><li>How to combine streams and background knowledge </li></ul></ul><ul><li>Distributed and parallel processing </li></ul><ul><ul><li>Streams are parallel in nature </li></ul></ul><ul><li>Real-time constraints </li></ul><ul><ul><li>A reasoning task must be completed before the answer becomes useless </li></ul></ul>[Callout: Contribution of this work]
  20. 20. Main Contribution State-of-the-Art Approach [Ceri1994,Volz2005] <ul><li>Overestimation of deletion : Overestimates deletions by computing all direct consequences of a deletion. </li></ul><ul><li>Rederivation : Prunes those estimated deletions for which alternative derivations (via some other facts in the program) exist. </li></ul><ul><li>Insertion : Adds the new derivations that are consequences of insertions to extensional predicates. </li></ul>
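The three steps above can be sketched in Python for a deliberately tiny setting. The sketch assumes a single hard-coded rule (transitivity over pairs, standing in for one transitive property) and hypothetical function names; it is not the rewriting-rule formulation of [Ceri1994,Volz2005], only an illustration of overestimation, rederivation, and insertion.

```python
from typing import Set, Tuple

Fact = Tuple[str, str]  # (x, y) stands for the triple <x :p y>, with :p transitive


def consequences(facts: Set[Fact], delta: Set[Fact]) -> Set[Fact]:
    """One-step transitivity consequences that use at least one fact of delta."""
    out = set()
    for x, y in delta:
        for a, b in facts:
            if y == a:
                out.add((x, b))
            if b == x:
                out.add((a, y))
    return out


def closure(facts: Set[Fact]) -> Set[Fact]:
    """Materialization from scratch (semi-naive fixpoint)."""
    result, frontier = set(facts), set(facts)
    while frontier:
        frontier = consequences(result, frontier) - result
        result |= frontier
    return result


def maintain(explicit, materialized, deletions, insertions):
    """Return the new explicit facts and the updated materialization."""
    # 1. Overestimation: everything that may depend on a deleted fact.
    over = set(deletions & explicit)
    frontier = set(over)
    while frontier:
        frontier = consequences(materialized, frontier) - over
        over |= frontier
    kept = materialized - over
    # 2. Rederivation: restore overestimated facts that are still explicit
    #    or have an alternative one-step derivation from the kept facts.
    new_explicit = (explicit - deletions) | insertions
    rederived = over & (new_explicit | consequences(kept, kept))
    # 3. Insertion: propagate the consequences of restored and inserted facts.
    return new_explicit, closure(kept | rederived | new_explicit)


explicit = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")}
materialized = closure(explicit)
new_explicit, updated = maintain(explicit, materialized, {("b", "c")}, set())
```

In this run the deletion of `("b", "c")` first overestimates the loss of `("a", "c")` and `("a", "d")`; `("a", "c")` is restored because it is still explicit, and `("a", "d")` is then rederived from it.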
  21. 21. Main Contribution Our approach 1/2 <ul><li>Assumption </li></ul><ul><ul><li>Insertions and deletions are triples respectively entering and exiting the window </li></ul></ul><ul><ul><li>The window size is known </li></ul></ul><ul><li>Therefore </li></ul><ul><ul><li>The time when each triple will expire is known and determined by the window size </li></ul></ul><ul><ul><ul><li>E.g., if the window is 10s long, a triple entering at time t will exit at time t+10s </li></ul></ul></ul><ul><ul><li>Note: all knowledge can be annotated with an expiration time </li></ul></ul><ul><ul><ul><li>i.e., background knowledge is annotated with +∞ </li></ul></ul></ul>
  22. 22. Main Contribution Our approach 2/2 <ul><li>The algorithm </li></ul><ul><ul><li>computes the entailments derived by the inserts, </li></ul></ul><ul><ul><li>annotates each entailed triple with an expiration time, and </li></ul></ul><ul><ul><li>eliminates from the current state all copies of derived triples except the one with the highest timestamp. </li></ul></ul>
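A minimal Python sketch of this idea follows, under several assumptions that the slides do not state: a single transitivity rule over pairs, an entailed triple expiring at the earliest expiration among its premises, and a triple being dropped as soon as the clock reaches its expiration time. The function names are hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple

Fact = Tuple[str, str]     # (x, y) stands for the triple <x :p y>, with :p transitive
State = Dict[Fact, float]  # fact -> expiration time


def insert(state: State, facts: Iterable[Fact], expiration: float) -> None:
    """Add facts with the given expiration and propagate their entailments."""
    frontier: State = {}
    for fact in facts:
        if expiration > state.get(fact, -math.inf):
            state[fact] = expiration
            frontier[fact] = expiration
    while frontier:
        derived: State = {}
        for (x, y), e1 in frontier.items():
            for (a, b), e2 in state.items():
                exp = min(e1, e2)  # assumed: entailment lives as long as its premises
                if y == a:
                    derived[(x, b)] = max(exp, derived.get((x, b), -math.inf))
                if b == x:
                    derived[(a, y)] = max(exp, derived.get((a, y), -math.inf))
        # Keep only the copy with the highest expiration time.
        frontier = {f: e for f, e in derived.items() if e > state.get(f, -math.inf)}
        state.update(frontier)


def expire(state: State, now: float) -> None:
    """Deletions need no reasoning: drop whatever has expired."""
    for fact in [f for f, e in state.items() if e <= now]:
        del state[fact]


WINDOW = 10
state: State = {}
insert(state, [("C", "D")], math.inf)    # background knowledge: expires at +infinity
insert(state, [("A", "B")], 1 + WINDOW)  # enters the window at t=1
insert(state, [("B", "C")], 2 + WINDOW)  # enters the window at t=2
expire(state, now=11)
```

The point of the design is visible in `expire`: because every entailment already carries its expiration time, removing triples that leave the window requires no overestimation or rederivation.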
  23. 23. Main Contribution Our Approach at Work [Figure: step-by-step example with columns “TS”, “Triples in the Window”, and “Entailments in the Window”; each triple is annotated with its expiration time in square brackets]
  24. 24. Main Contribution Comparative Evaluation <ul><li>Hypothesis </li></ul><ul><ul><li>Background knowledge does not change and is materialized </li></ul></ul><ul><ul><li>Changes only take place in the window </li></ul></ul><ul><li>An experiment comparing the time required to compute a new materialization using </li></ul><ul><ul><li>Re-computing from scratch (i.e., 1250 ms in our setting) </li></ul></ul><ul><ul><li>State-of-the-art incremental approach [Volz, 2005] </li></ul></ul><ul><ul><li>Our approach </li></ul></ul><ul><li>Results at increasing % of the triples updated </li></ul>
  25. 25. Retrospective and Conclusions Achievements vs. Research Challenges <ul><li>Relation with data-stream systems </li></ul><ul><ul><li>Notion of RDF stream :-| </li></ul></ul><ul><li>Query languages for semantic streams </li></ul><ul><ul><li>C-SPARQL :-D </li></ul></ul><ul><li>Reasoning on Streams </li></ul><ul><ul><li>Efficient incremental updates of deductive closures </li></ul></ul><ul><ul><ul><li>This paper :-) ... but much more work is needed! </li></ul></ul></ul><ul><ul><li>How to combine streams and background knowledge </li></ul></ul><ul><ul><ul><li>This paper :-| ... but a lot needs to be studied ... </li></ul></ul></ul><ul><li>Distributed and parallel processing </li></ul><ul><ul><li>Future work :-P </li></ul></ul><ul><li>Real-time constraints </li></ul><ul><ul><li>Future work :-P </li></ul></ul>
  26. 26. References (selection) <ul><li>Vision </li></ul><ul><ul><li>[IEEE-IS2009] Emanuele Della Valle, Stefano Ceri, Frank van Harmelen, Dieter Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009) </li></ul></ul><ul><li>Continuous SPARQL (C-SPARQL) </li></ul><ul><ul><li>[EDBT2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Michael Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010 </li></ul></ul><ul><ul><li>[WWW2009] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Michael Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062 </li></ul></ul><ul><li>Stream Reasoning </li></ul><ul><ul><li>[Dagstuhl2010] Heiner Stuckenschmidt, Stefano Ceri, Emanuele Della Valle, Frank van Harmelen: Towards Expressive Stream Reasoning. Proceedings of the Dagstuhl Seminar on Semantic Aspects of Sensor Networks, 2010. </li></ul></ul>
  27. 27. Thank You! Questions? Much More to Come! Keep an eye on http://www.streamreasoning.org
  28. 28. Back-up Slides The Entailment Regime That We Used <ul><li>In the current implementation we support RDF-S++ </li></ul><ul><ul><li>rdf:type </li></ul></ul><ul><ul><li>rdfs:subClassOf </li></ul></ul><ul><ul><li>rdfs:domain and rdfs:range </li></ul></ul><ul><ul><li>rdfs:subPropertyOf </li></ul></ul><ul><ul><li>owl:sameAs </li></ul></ul><ul><ul><li>owl:inverseOf </li></ul></ul><ul><ul><li>owl:TransitiveProperty </li></ul></ul>
  29. 29. Back-up Slides Volz 2005 rewriting rules
  30. 30. Back-up Slides Example of maintenance program <ul><li>Original Rule </li></ul><ul><li>Maintenance Program </li></ul>
  31. 31. Back-up Slides Our rewriting rules
  32. 32. Back-up Slides Example of maintenance program for streams <ul><li>Original Rule </li></ul><ul><li>Maintenance Program </li></ul>
  33. 33. Back-up Slides Simple Stream Reasoner Architecture
  34. 34. Achievements Incremental Reasoning: State-of-the-Art <ul><li>Incremental Maintenance of Materialized Views </li></ul><ul><ul><li>Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994) </li></ul></ul><ul><ul><li>H. A. Kuno, E. A. Rundensteiner: Incremental Maintenance of Materialized Object-Oriented Views in MultiView: Strategies and Performance Evaluation. TKDE 1998 </li></ul></ul><ul><ul><li>Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005) </li></ul></ul><ul><li>Incremental Rule-based Reasoning </li></ul><ul><ul><li>F. Fabret, M. Regnier, E. Simon: An Adaptive Algorithm for Incremental Evaluation of Production Rules in Databases. VLDB 1993 </li></ul></ul><ul><ul><li>B. Berstel: Extending the RETE Algorithm for Event Management. TIME’02 </li></ul></ul><ul><li>Incremental DL Reasoning </li></ul><ul><ul><li>Cuenca-Grau et al.: History Matters: Incremental Ontology Reasoning Using Modules. ISWC 2007. </li></ul></ul><ul><ul><li>Parsia et al.: Towards incremental reasoning through updates in OWL-DL. Reasoning on the Web, Workshop at WWW-2006 </li></ul></ul>
