SLD Revolution: A Cheaper, Faster yet more Accurate Streaming Linked Data Framework

Second RSP workshop, co-located with the 17th Extended Semantic Web Conference.

This is joint work by Marco Balduini, Emanuele Della Valle, and Riccardo Tommasini, DEIB, Politecnico di Milano, Milano, Italy.

Published in: Technology
  1. Politecnico di Milano, DEIB. Portoroz - 2017 - Riccardo Tommasini - @rictomm - Politecnico di Milano. Marco Balduini, Riccardo Tommasini, Emanuele Della Valle: A Cheaper, Faster yet more Accurate Streaming Linked Data Framework
  2. RSP is Great!
  3. Why RSP?
      - offers a generic overview of streams and static data
      - enables query answering across heterogeneous sources
      - allows creating and publishing new streams or graphs
  4. The RSP Idea in short
  5. CQL Model: Streams, Relations, Streams-to-Relations, Relations-to-Streams, Relations-to-Relations. The CQL continuous query language - Arvind Arasu · Shivnath Babu · Jennifer Widom, 2006, VLDBJ
  6. RSP-QL Model: RDF Streams, Solution Mappings, S2R operators, R2S operators, R2R operators. The CQL continuous query language - E. Della Valle, S. Ceri, D. Barbieri, D. Braga, A. Campi, 2008, FIS
  7. RSP in a Nutshell on RDF Streams: RDF Stream-to-RDF, RDF-to-RDF (solution mappings), RDF-to-RDF Stream
  8. RSP in Practice with SLD
  9. A Social Media Example: Two Information Needs
      - How many micro-posts occur over time?
      - How often does a hashtag appear in the micro-post stream?
  10. Streaming Linked Data Server [architecture diagram; labels in extraction order: Sources Raw Stream Adapter RDF Stream Bus Publisher Visualizer Recorder Re-player Analyser Decorator]
  11. (no text on this slide)
  12. An Important Optimisation
        REGISTER STREAM sstr AS
        CONSTRUCT { ?id sma:twitterCount ?tc }
        FROM STREAM <social> [RANGE 1m STEP 1m]
        WHERE {
          SELECT (uuid() AS ?id) ?tc
          WHERE {
            SELECT (COUNT(DISTINCT ?mp) AS ?tc)
            WHERE { ?mp a sma:Tweet }
          }
        }
  13. (no text on this slide)
  14. (no text on this slide)
  15. Using C-SPARQL
        REGISTER STREAM countT AS
        CONSTRUCT { ?uid sma:twitterCount ?tot . }
        FROM STREAM <sstr> [RANGE 15m STEP 1m]
        WHERE {
          SELECT (uuid() AS ?uid) (SUM(?tc) AS ?tot)
          WHERE { ?id sma:twitterCount ?tc }
        }
  16. Is RSP always great?
  17. Observations on SLD
      - It is flexible. :)
      - It forces RDF even though query results are often relational. :(
      - It is not optimal, e.g. RSP-QL vs SQL vs path queries.
  18. Revolutionising SLD
  19. A “Lazy” Processing Model on streams
      - Stream operators can be applied to generic data items.
      - QL-specific operators require a particular data type.
      - Perform the data transformation as late as possible.
  20. Generic Programming, an old idea: Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters.
  21. A new Processing Model: Generic Streams<T>, Generic Instantaneous<T>, S2I<T>, I2S<T>, I2I<T>
  22. Lazy Transformation by Generic Programming on streams<T>: stream-to-instantaneous<T>, instantaneous-to-instantaneous<T>, instantaneous-to-stream<T>
  23. RSP in Practice with SLD Revolution
  24. Streaming Linked Data Revolution Server [architecture diagram; labels in extraction order: Sources Stream Sink Receiver Generic Stream Bus Translator Stream Recorder Re-player Processor Decorator]
  25. (no text on this slide)
  26. SLD vs SLD Revolution: Let’s be quantitative
  27. 27. EyE Portoroz - 2017 - Riccardo Tommasini - @rictomm - Politecnico di Milano ESWC It is faster, cheaper yet more accurate than SLD. 27 R² = 0,96413R² = 0,99891 30 300 3000 1 10 100 Median Engine Memory (MB) Median CPU Load (%) SLD SLD Revolution Expon. (SLD) Linear (SLD Revolution)
  28. Discussion & Conclusion
  29. Observations on SLD Revolution
      - It is faster, cheaper yet more accurate than SLD. :)
      - It requires knowledge of EPL, SPARQL, and JSON path queries. :(
      - It is optimised and, thus, not flexible. :(
  30. Open Problems
      - RSP-QL is not always the best solution in terms of cost/performance.
      - Can we identify an optimum?
      - Can we define a cost model for RSP-QL?
  31. Questions?
      - Email: riccardo.tommasini@polimi.it, Twitter: @rictomm
      - Email: marco.balduini@polimi.it, Twitter: @balducci85
      - Image: Pablo Picasso, Les Demoiselles d'Avignon, 1907. Museum of Modern Art (MoMA), New York City, NY, US
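
The two C-SPARQL queries on slides 12 and 15 split the 15-minute count into two stages: count distinct micro-posts in one-minute tumbling windows, then sum those partial counts over a 15-minute window that slides every minute. The following is a minimal Python sketch of that idea, not the authors' implementation. The function names, the (timestamp in seconds, post id) input shape, and the sample data in the usage note are assumptions made for illustration. It also assumes each micro-post carries a single timestamp, so it is counted in exactly one minute.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Set, Tuple


def per_minute_counts(posts: Iterable[Tuple[int, str]]) -> List[Tuple[int, int]]:
    """Stage 1 (cf. [RANGE 1m STEP 1m]): distinct post ids per minute.

    `posts` holds (timestamp_in_seconds, post_id) pairs (assumed shape).
    Returns (minute_index, distinct_count) pairs in ascending minute order.
    Minutes without any post produce no entry.
    """
    seen: Dict[int, Set[str]] = {}
    for ts, post_id in posts:
        seen.setdefault(ts // 60, set()).add(post_id)
    return [(minute, len(ids)) for minute, ids in sorted(seen.items())]


def sliding_sum(counts: Iterable[Tuple[int, int]], width: int = 15) -> List[Tuple[int, int]]:
    """Stage 2 (cf. [RANGE 15m STEP 1m]): sum of the partial counts.

    For each input minute m, sums the counts of minutes m-width+1 .. m.
    A result is emitted only for minutes that appear in the input.
    """
    window: Deque[Tuple[int, int]] = deque()
    total = 0
    out: List[Tuple[int, int]] = []
    for minute, count in counts:
        window.append((minute, count))
        total += count
        # Evict partial counts that fell out of the last `width` minutes.
        while window[0][0] <= minute - width:
            _, old = window.popleft()
            total -= old
        out.append((minute, total))
    return out
```

For example, with posts `[(0, "a"), (10, "b"), (10, "b"), (65, "c"), (900, "d")]` the first stage yields `[(0, 2), (1, 1), (15, 1)]` and the second stage yields `[(0, 2), (1, 3), (15, 2)]`. The second stage only touches one small number per minute instead of every micro-post in the last 15 minutes, which is the point of the optimisation.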

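The lazy, generic processing model of slides 19 to 22 can be illustrated with type-parameterised operators: windowing and aggregation work on items of any type T, and the conversion to an RDF-like triple happens only at the very end. This is a hedged sketch under stated assumptions, not the SLD Revolution API. The names `s2i`, `i2i`, `i2s` mirror the slide labels, while their signatures, the `translate` callback, the `urn:example:` subject, and the sample data are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")  # type of the raw stream items (dict, str, triple, ...)
U = TypeVar("U")  # type of a per-window result
V = TypeVar("V")  # type of the items on the output stream


def s2i(stream: Iterable[Tuple[int, T]], width: int) -> Iterator[Tuple[int, List[T]]]:
    """Stream-to-instantaneous: tumbling windows of `width` time units.

    Assumes (timestamp, item) pairs arrive in timestamp order.
    Yields (window_start, items) for every non-empty window.
    """
    start: Optional[int] = None
    items: List[T] = []
    for ts, item in stream:
        window_start = (ts // width) * width
        if start is not None and window_start != start:
            yield start, items
            items = []
        start = window_start
        items.append(item)
    if start is not None:
        yield start, items


def i2i(windows: Iterable[Tuple[int, List[T]]],
        fn: Callable[[List[T]], U]) -> Iterator[Tuple[int, U]]:
    """Instantaneous-to-instantaneous: apply `fn` to each window's content."""
    for start, items in windows:
        yield start, fn(items)


def i2s(results: Iterable[Tuple[int, U]],
        translate: Callable[[int, U], V]) -> Iterator[V]:
    """Instantaneous-to-stream: emit one output item per window result.

    The data transformation (`translate`) is applied here, as late as possible.
    """
    for start, value in results:
        yield translate(start, value)


def demo() -> List[Tuple[str, str, int]]:
    # Raw items stay plain dicts until the final step (hypothetical data).
    raw = [(5, {"id": "a"}), (30, {"id": "b"}), (70, {"id": "c"})]
    counts = i2i(s2i(raw, width=60), len)
    return list(i2s(
        counts,
        lambda start, n: (f"urn:example:window:{start}", "sma:twitterCount", n),
    ))
```

Calling `demo()` counts two items in the window starting at 0 and one in the window starting at 60, and only those two results are turned into triples. None of the three raw items is ever converted to RDF, which is the saving the lazy model aims for.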