
ACQUA: Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets

The lightning talk I gave at the "Federated Semantic Data Management" seminar at Dagstuhl

Published in: Data & Analytics

  1. ACQUA: Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets. Emanuele Della Valle, DEIB - Politecnico of Milano, http://emanueledellavalle.org, @manudellavalle. Schloss Dagstuhl, Germany - 26 June 2017
  2. Stream Processing in a Nutshell. Diagram labels: Stream data, Windows, Stream Processing Engine, Results. Register the query once and execute it continuously.
  3. Web Stream Processing. Diagram labels: Web Streams, Windows, Join, Linked Data, Stream Processing Engine, Web, Results. Issues: high latency; rate limits; losing reactiveness.
  4. RDF Stream Processing (RSP) Engine. Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint, RSP engine, Web, Results.
  5. An example. The clothing brand ACME wants to persuade influential social network users to post commercial endorsements. Every minute, give me the IDs of the users who were mentioned on the social network in the last 10 minutes and whose number of followers is greater than 100,000.
       REGISTER STREAM <:InfluencersToContact> AS
       CONSTRUCT { ?user a :influentialUser }
       FROM NAMED WINDOW W ON S [RANGE 10m STEP 1m]
       WHERE {
         WINDOW W { ?user :hasMentions ?mentionsNumber }
         SERVICE BKG { ?user :hasFollowers ?followerCount }
         FILTER (?followerCount > 100000)
       }
  6. Problem Definition. Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint, Local Replica, RSP engine, Web, Results. Data become stale if not refreshed. Define a refresh budget to control reactiveness. Correct vs. approximate answer.
  7. Problem Definition. Diagram labels: RDF Streams, Windows, Join, SPARQL endpoint, Local Replica, Maintenance Policy, RSP engine, Web, Results.
  8. ACQUA approach. Diagram labels: WINDOW clause, Stream data, JOIN, SERVICE clause, Local Replica. Components: Proposer, Ranker, Maintainer (marked as steps 1 to 3 on the slide).
       WSJ: filter out mappings that are not involved in the current evaluation.
       Candidate set (C). Elected set (E): top γ mappings of the Candidate set.
       ACQUA: without FILTER clause. ACQUA.F: with FILTER clause.
       ACQUA [2] policies: RND, LRU, WBM.
       ACQUA.F [3] policies: Filter Update Policy, RND.F, LRU.F, WBM.F.
       ACQUA.F+/* [5] policies: LRU.F+, WBM.F+, WBM.F*.
  9. Where to read about ACQUA
       [1] Soheila Dehghanzadeh, Alessandra Mileo, Daniele Dell'Aglio, Emanuele Della Valle, Shen Gao, Abraham Bernstein: Online View Maintenance for Continuous Query Evaluation. WWW (Companion Volume) 2015: 25-26
       [2] Soheila Dehghanzadeh, Daniele Dell'Aglio, Shen Gao, Emanuele Della Valle, Alessandra Mileo, Abraham Bernstein: Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets. ICWE 2015: 307-325
       [3] Shima Zahmatkesh, Emanuele Della Valle, Daniele Dell'Aglio: When a FILTER Makes the Difference in Continuously Answering SPARQL Queries on Streaming and Quasi-Static Linked Data. ICWE 2016: 299-316
       [4] Shen Gao, Daniele Dell'Aglio, Soheila Dehghanzadeh, Abraham Bernstein, Emanuele Della Valle, Alessandra Mileo: Planning Ahead: Stream-Driven Linked-Data Access Under Update-Budget Constraints. International Semantic Web Conference (1) 2016: 252-270
       [5] Shima Zahmatkesh, Emanuele Della Valle, Daniele Dell'Aglio: Using Rank Aggregation in Continuously Answering SPARQL Queries on Streaming and Quasi-static Linked Data. DEBS 2017: 170-179
  10. Experimental Results. Chart labels: Performance (Worst, Best), Experiment Dimension. For high selectivity, Filter Update Policy is better than WBM. For low selectivity, WBM is better than Filter Update Policy. Comparable to Best Result.
  11. Future work
       • Broaden the class of queries
       • Multiple filtering
       • Filtering condition formulated as a ranking clause
       • Pushing the FILTER clause into the SERVICE clause and considering caching instead of a local replica
       • Study the effect of different trends in the data
  12. Annex: ACQUA in the Stream Reasoning Context
  13. Stream Reasoning in a nutshell. Tame data Variety and Velocity simultaneously. Diagram labels: Traditional, Stream Reasoning.
  14. Stream Reasoning in a nutshell. Tame data Variety and Velocity simultaneously. Diagram labels: Traditional, Stream Reasoning.
  15. Stream Reasoning in a nutshell. Tame data Variety and Velocity simultaneously, without forgetting Volume. Diagram labels: Traditional, Stream Reasoning, ACQUA. What if the analysis also includes data "at rest"? What if the data "at rest" are massive and slowly evolving?
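
The proposer, ranker and maintainer steps of the ACQUA slide, applied to the influencer query from the example slide, could be sketched roughly as below. This is only an illustration under assumptions, not the authors' implementation: all names (`ReplicaEntry`, `propose`, `rank_lru`, `maintain`, `evaluate`) are hypothetical, the window is reduced to a list of mentioned user IDs, LRU is read as "least recently refreshed first", and `budget` plays the role of γ (the size of the elected set).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class ReplicaEntry:
    follower_count: int  # value cached from the remote SERVICE (may be stale)
    last_refreshed: int  # logical time of the last refresh


def propose(window_users: Iterable[str], replica: Dict[str, ReplicaEntry]) -> List[str]:
    # WSJ-style proposer (assumed reading): keep only the replica entries
    # that join with the current window content.
    return [u for u in set(window_users) if u in replica]


def rank_lru(candidates: List[str], replica: Dict[str, ReplicaEntry]) -> List[str]:
    # LRU-style ranker: least recently refreshed first; the user ID breaks ties.
    return sorted(candidates, key=lambda u: (replica[u].last_refreshed, u))


def maintain(elected: List[str], replica: Dict[str, ReplicaEntry],
             fetch: Callable[[str], int], now: int) -> None:
    # Maintainer: refresh the elected entries from the remote source.
    for user in elected:
        replica[user] = ReplicaEntry(fetch(user), now)


def evaluate(window_users: Iterable[str], replica: Dict[str, ReplicaEntry],
             fetch: Callable[[str], int], now: int, budget: int,
             threshold: int = 100_000) -> List[str]:
    candidates = propose(window_users, replica)
    elected = rank_lru(candidates, replica)[:budget]  # top-budget candidates
    maintain(elected, replica, fetch, now)
    # Join the window with the (partly refreshed) replica and apply the FILTER.
    return sorted(u for u in set(window_users)
                  if u in replica and replica[u].follower_count > threshold)


# Hypothetical data: bob's cached follower count is stale.
remote = {"alice": 150_000, "bob": 90_000}
replica = {"alice": ReplicaEntry(80_000, 0), "bob": ReplicaEntry(110_000, 1)}
print(evaluate(["alice", "bob"], replica, remote.__getitem__, now=2, budget=1))
# prints ['alice', 'bob']
```

With a budget of 1 only alice is refreshed, so the stale entry for bob still passes the filter and the answer is approximate. With a budget of 2 both entries are refreshed and the answer is the correct one, `['alice']`.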
