Continuously Updating Query Results over Real-Time Linked Data

Presentation for the MEPDaW workshop @ESWC 2016


  1. Ruben Taelman (@rubensworks), iMinds - Ghent University: Continuously Updating Query Results over Real-Time Linked Data
  2. Dynamic Linked Data. E.g., a thermometer measures every minute: “19,05°C” at 30-05-2016 11:00; “19,06°C” at 30-05-2016 11:01; “19,11°C” at 30-05-2016 11:02; “19,08°C” at 30-05-2016 11:03; … Typically exposed as an RDF stream, i.e. a stream of <RDF triple, timestamp> pairs.
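
A stream of <RDF triple, timestamp> pairs can be sketched in Python as below. The class name `TimestampedTriple` and the `ex:` identifiers are illustrative assumptions that do not appear in the presentation; only the readings and timestamps come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class TimestampedTriple:
    """One element of an RDF stream: a triple plus the time it was observed."""
    triple: Triple
    timestamp: datetime


def thermometer_stream() -> Iterator[TimestampedTriple]:
    """Yield the four readings from the slide, one per minute from 11:00."""
    readings = ["19,05°C", "19,06°C", "19,11°C", "19,08°C"]
    for minute, value in enumerate(readings):
        yield TimestampedTriple(
            # Hypothetical subject and predicate names, for illustration only.
            triple=("ex:thermometer", "ex:measures", value),
            timestamp=datetime(2016, 5, 30, 11, minute),
        )


# "What is the current temperature?" becomes: take the most recent element.
latest = max(thermometer_stream(), key=lambda element: element.timestamp)
print(latest.triple[2], latest.timestamp.isoformat())
```

A continuous query has to repeat this kind of lookup every time a new element arrives, which is the work the following slides propose to move to the client.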
  3. Querying continuous data. Clients send queries to the server, e.g. “What is the current temperature?” The server continuously evaluates the queries → the server does all of the work. This is a cause of low public endpoint availability: half of them have an availability of < 95% (Buil-Aranda 2013) → clients just wait for results.
  4. What if we moved continuous query evaluation to the client? → To lower server load.
  5. Triple Pattern Fragments does this for static data! Triple Pattern Fragments (TPF) (Verborgh 2016): servers can only respond to triple pattern queries; clients need to evaluate queries locally → lowers server complexity. Can we do the same for dynamic data?
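
The division of labour described for TPF can be illustrated with a toy in-memory sketch: the server matches a single triple pattern, and the client joins the results of several patterns itself. This is not the actual TPF HTTP interface (which also involves paging and metadata); `ToyFragmentServer`, `stations_and_songs` and the `m:Station` type are hypothetical names used only for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
# A pattern position set to None acts as a variable.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class ToyFragmentServer:
    """Stands in for a TPF server: it can only match one triple pattern."""

    def __init__(self, triples: List[Triple]) -> None:
        self._triples = triples
        self.requests = 0  # counts how many fragments were requested

    def fragment(self, pattern: Pattern) -> Iterator[Triple]:
        self.requests += 1
        for triple in self._triples:
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                yield triple


def stations_and_songs(server: ToyFragmentServer) -> List[Tuple[str, str]]:
    """Client-side join of: ?station rdf:type m:Station . ?station m:plays ?song ."""
    results = []
    for station, _, _ in server.fragment((None, "rdf:type", "m:Station")):
        # The client, not the server, binds ?station and asks the next pattern.
        for _, _, song in server.fragment((station, "m:plays", None)):
            results.append((station, song))
    return results


server = ToyFragmentServer([
    ("radio:bbc-radio-1", "rdf:type", "m:Station"),
    ("radio:bbc-radio-1", "m:plays", "radio:jauz-netsky-higher"),
    ("radio:other", "m:plays", "radio:some-song"),
])
results = stations_and_songs(server)
print(results, server.requests)
```

The server only ever performs simple pattern lookups, while the join logic runs on the client at the cost of more requests.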
  6. Overview: Dynamic data representation; Query streamer engine; Evaluation
  7. Overview: Dynamic data representation; Query streamer engine; Evaluation
  8. Dynamic data representation. Expose dynamic data through the TPF interface → represent dynamic data in RDF. We annotate dynamic data with the time at which it is valid → the client can derive the time at which the data can change! But how do we annotate data/triples with time?
  9. Annotation methods
       Reification: outdated
       Singleton properties (Nguyen 2014): instantiate predicates
       Graphs: define fourth element in quad
       Implicit graphs: TPF makes triples (de)referenceable
  10. Time labeling types
       Time interval: start and end time of validity; good for maintaining a history of elements
       Expiration time: end time of validity; for when only the latest version is required
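
The difference between the two labeling types can be sketched as follows. The class names, the `value_at` helper and the half-open treatment of intervals (valid from the start time up to, but not including, the end time) are assumptions made for this illustration; the presentation does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TimeInterval:
    """Start and end time of validity."""
    initial: datetime
    final: datetime

    def is_valid_at(self, moment: datetime) -> bool:
        # Assumption: the interval is half-open, [initial, final).
        return self.initial <= moment < self.final


@dataclass(frozen=True)
class ExpirationTime:
    """Only the end time of validity."""
    expires: datetime

    def is_valid_at(self, moment: datetime) -> bool:
        return moment < self.expires


def value_at(history: List[Tuple[str, TimeInterval]],
             moment: datetime) -> Optional[str]:
    """Look up which value was valid at a given moment in a kept history."""
    for value, interval in history:
        if interval.is_valid_at(moment):
            return value
    return None


# Time intervals: older versions stay queryable.
history = [
    ("19,05°C", TimeInterval(datetime(2016, 5, 30, 11, 0), datetime(2016, 5, 30, 11, 1))),
    ("19,06°C", TimeInterval(datetime(2016, 5, 30, 11, 1), datetime(2016, 5, 30, 11, 2))),
]
past_value = value_at(history, datetime(2016, 5, 30, 11, 0, 30))

# Expiration time: only the latest version is kept, with its end of validity.
latest_value, latest_label = "19,06°C", ExpirationTime(datetime(2016, 5, 30, 11, 2))
still_fresh = latest_label.is_valid_at(datetime(2016, 5, 30, 11, 1, 30))
print(past_value, still_fresh)
```

With intervals, the data source grows as versions accumulate; with expiration times, each new version can simply replace the previous one.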
  11. Dynamic data example
       Unannotated triple:
         radio:bbc-radio-1 m:plays radio:jauz-netsky-higher.
       Graph-annotation: [ 9:15, 9:20 ]
         GRAPH _:g1 { radio:bbc-radio-1 m:plays radio:jauz-netsky-higher. }
         _:g1 tmp:interval _:interval_1.
         _:interval_1 tmp:initial "2016-05-30T09:15:00"^^xsd:dateTime.
         _:interval_1 tmp:final "2016-05-30T09:20:00"^^xsd:dateTime.
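
Given such annotations, a client can work out when its current results may change and wait until then before re-evaluating. The sketch below illustrates that idea using the `tmp:final` value from the example; the function names are hypothetical, and this is not the Query Streamer implementation, which the slides do not detail.

```python
from datetime import datetime
from typing import Iterable


def next_evaluation_time(final_times: Iterable[str]) -> datetime:
    """Earliest end of validity among the annotated results in use.

    Expects xsd:dateTime lexical values such as "2016-05-30T09:20:00".
    Raises ValueError if no time annotations are given.
    """
    return min(datetime.fromisoformat(value) for value in final_times)


def seconds_to_wait(now: datetime, final_times: Iterable[str]) -> float:
    """How long the client can wait before its results may be outdated."""
    delay = (next_evaluation_time(final_times) - now).total_seconds()
    return max(0.0, delay)  # already expired: re-evaluate immediately


final_times = ["2016-05-30T09:20:00"]  # tmp:final from the example
wait = seconds_to_wait(datetime(2016, 5, 30, 9, 17, 30), final_times)
print(wait)  # 150.0
```

When several annotated triples contribute to a result, taking the minimum ensures the client re-evaluates as soon as the first of them can change.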
  12. Overview: Dynamic data representation; Query engine; Evaluation
  13. Query streamer engine
  14. Overview: Dynamic data representation; Query streamer engine; Evaluation
  15. Evaluating annotation methods. Measure query execution times for query duration. Query: “All trains with their delay in station X within the next hour”. Frequency: 10 seconds. Clients: 1. Engine: Query streamer. Annotation methods: singleton property, graph, implicit graph. Time labeling types: time interval, expiration time.
  16. Evaluating annotation methods [Figure: results per time labeling type, panels “Time interval” and “Expiration time”]
  17. Evaluating scalability. Measure server CPU usage for an increasing number of clients. Query: “All trains with their delay in station X within the next hour”. Frequency: 10 seconds. Clients: 1 → 200. Engines: Query streamer, C-SPARQL (Barbieri 2012) and CQELS (Le-Phuoc 2011). Annotation method: graph. Time labeling type: expiration time.
  18. Query Streamer has better scalability
  19. Query Streamer moves load from server to client
  20. Overview. Dynamic data representation: annotate dynamic data with time. Query streamer engine: client-side query engine; dynamic data at TPF server. Evaluation: annotation methods; scalability.
  21. Conclusions. Further evaluation: different query types, …? Solve the efficiency problem of time intervals? Promising approach for improved scalability.
