Soheila Dehghanzadeh, Daniele Dell’Aglio, Shen Gao,
Emanuele Della Valle, Alessandra Mileo, Abraham Bernstein
ICWE - 25 June 2015
Outline
● Introduction to Continuous Queries
● Motivating Example
● Problem Description
● Solution
● Experimental Results
● Conclusions
Introduction
RDF Stream Processing engines usually register queries
and execute them in a continuous fashion.
[Figure: an RDF stream generator feeding a registered query]
Time-based sliding window W(ω,β)
[Figure: stream items S1 to S12 on a time axis t; the window has width ω and advances by slide β at each evaluation]
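A minimal Python sketch of a time-based sliding window W(ω,β), assuming ω is the window width and β the slide, both in the same time units as the item timestamps. The function name `sliding_windows`, the half-open interval convention (open, close], and the starting time `t0` are illustrative assumptions, not taken from the slides.

```python
from typing import Iterator, List, Tuple

Item = Tuple[int, str]  # (timestamp, payload); hypothetical stream item shape


def sliding_windows(stream: List[Item], width: int, slide: int,
                    t0: int = 0) -> Iterator[Tuple[int, int, List[str]]]:
    """Yield (open, close, contents) for each evaluation of the window.

    Assumptions: a window covers the half-open interval (open, close],
    the first window closes at t0 + width, and every following window
    is shifted forward by `slide` time units.
    """
    if not stream:
        return
    last_ts = max(ts for ts, _ in stream)
    close = t0 + width
    # Keep producing windows while they can still contain stream items.
    while close - width < last_ts:
        open_ = close - width
        contents = [payload for ts, payload in stream if open_ < ts <= close]
        yield open_, close, contents
        close += slide
```

With width 3 and slide 2, consecutive windows overlap by one time unit, so an item can take part in more than one evaluation.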
Introduction
Complex continuous queries combine data streams with
remote background data.
[Figure: a join between the RDF stream generator and background data behind a SPARQL endpoint]
Motivating Example
Finding Influential Users
Influential user: a user who has more than a given number of
followers and is mentioned more than a given number of times in a given
period (200 seconds).
Follower number: stored in a remote endpoint.
Mention number: computed by processing the stream of messages.
Inspired by Chris Testa's SemTech 2011 talk: http://goo.gl/kLSqGo
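One evaluation of the influential-users query over a single window can be sketched as follows. This is a simplified illustration, not the query from the talk: the function name, the two threshold parameters, and the use of a plain dictionary in place of the remote endpoint are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def influential_users(window_mentions: Iterable[str],
                      followers: Dict[str, int],
                      min_followers: int,
                      min_mentions: int) -> Set[str]:
    """Return the users exceeding both thresholds in one window evaluation.

    window_mentions: user ids mentioned in the current window (stream side).
    followers: follower counts (background side); a dict stands in for
               the remote endpoint in this sketch.
    """
    mention_count = Counter(window_mentions)
    return {
        user
        for user, n in mention_count.items()
        if n > min_mentions and followers.get(user, 0) > min_followers
    }
```

The mention count comes only from the stream, while the follower count comes only from the background data, which is why the query needs a join between the two.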
Investigating the Scenario
Symmetrical hash join
Drawbacks:
• Data access constraints.
• Background data is huge and has to be fetched at every
evaluation, which is slow and wastes computational and financial
resources.
Investigating the Scenario
Nested Loop Join
Drawbacks:
• One invocation for each mapping from the WINDOW
clause evaluation – high number of requests to the server.
• API restrictions (e.g., a limited number of requests over
time).
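The request cost of the nested loop join can be made concrete with a short sketch. The function and parameter names are hypothetical, and `fetch_followers` stands in for whatever remote call the engine would issue; the point is only that the number of invocations equals the number of window mappings.

```python
from typing import Callable, Dict, List, Tuple


def nested_loop_join(window_users: List[str],
                     fetch_followers: Callable[[str], int]
                     ) -> Tuple[Dict[str, int], int]:
    """Join window mappings with remote data, one remote call per mapping.

    Returns the joined result and the number of remote invocations.
    """
    joined: Dict[str, int] = {}
    requests = 0
    for user in window_users:
        joined[user] = fetch_followers(user)  # one invocation per mapping
        requests += 1
    return joined, requests
```

A window with many mappings therefore produces many requests at every evaluation, which is what runs into the API restrictions listed above.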
Investigating the Scenario
Local Views
Challenges:
• Data goes out of date
[Figure: join between the RDF stream and a local view of the background data (SPARQL endpoint)]
Investigating the Scenario
Maintenance processes
Maintenance introduces a trade-off between response quality and
time.
We propose to manage this trade-off by fixing the time dimension
based on query constraints and maximizing the freshness of the response.
[Figure: a maintenance process refreshes the local view from the background data (SPARQL endpoint); freshness decreases between refreshes, giving a cost/quality trade-off]
Problem Description
The maintenance process should identify elements of the local
view that maximize response freshness.
Requirements of The Maintenance Process
1. should satisfy the Quality of Service constraints
on responsiveness and freshness of the answer;
2. should take into account the change rates of the
data elements in the REST API;
3. should consider the dynamicity of the change
rate values;
4. may consider the sliding window operator.
Hypotheses
We formulated the following hypotheses to build the maintenance
process:
HP1: the freshness of the answer can increase by maintaining the part
of the local view involved in the current query evaluation
HP2: the freshness of the answer increases by refreshing the
(possibly) stale local view entries that would remain fresh in a
higher number of evaluations
Solution: WSJ+WBM
[Figure: WSJ (HP1) and WBM (HP2) select entries of the local view for the refresher, which updates them from the background data (BKG) before the join with the window]
Terminology
Best Before Time: the time
that an element will
become stale and is defined
by:
[equation not preserved in the slide text]
[Figure: timeline t with windows W1 to W4, showing mappings from the WINDOW clause, mappings in the LOCAL VIEW, and compatible mappings]
WSJ
WSJ identifies the candidate
set: the possibly stale local
view mappings involved in
the current evaluation.
WSJ analyzes the content of the
current window evaluation
and identifies the
compatible mappings in the
local view.
The possibly stale mappings
are identified by analyzing
the associated best before times.
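The candidate selection described for WSJ can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the `ViewEntry` shape, the function name, and the rule that an entry counts as possibly stale once its best before time is less than or equal to the current time are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class ViewEntry:
    key: str            # join key, e.g. a user id (hypothetical)
    value: int          # cached background value, e.g. a follower count
    best_before: float  # estimated time at which the entry becomes stale


def wsj_candidates(window_keys: Iterable[str],
                   local_view: Dict[str, ViewEntry],
                   now: float) -> List[ViewEntry]:
    """Select local view entries that are (1) compatible with a mapping
    in the current window and (2) possibly stale at time `now`."""
    involved = set(window_keys)
    return [
        entry
        for key, entry in local_view.items()
        if key in involved and entry.best_before <= now
    ]
```

Entries that are stale but not in the current window, or in the window but still fresh, are left out, so the refresh budget is spent only where it can affect the current answer.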
WBM
WBM ranks the candidate set
to determine which
mappings to update.
The ranking is computed
through two values: the
renewed best before time
and the remaining life time.
The top k elements are
selected to be refreshed. The
value k is selected according
to the responsiveness
constraint.
[Figure: timeline with windows W1 to W4; the V column of the table lists 3, 4, 1 for three candidate mappings]
WBM: renewed best before time
When would the mappings
become stale if refreshed
now?
The renewed best before time
V is computed as:
[equation not preserved in the slide text]
[Table: V and L for three candidate mappings: (V=3, L=3), (V=4, L=1), (V=1, L=3); the Score column is not preserved in the slide text]
WBM: remaining life time and score
For how many future
evaluations is the mapping
involved?
The remaining life time L is
computed as:
[equation not preserved in the slide text]
WBM ranks the mappings by
using a score:
Score = min(L, V)
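The ranking step can be sketched as below. The formulas for V and L are not available in the slide text, so this sketch takes both values as given inputs; it also assumes they are expressed in the same unit and that a higher score ranks first. The `Candidate` type, the function name, and the sample values in the usage note are illustrative.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str  # hypothetical label for a mapping
    V: int     # renewed best before time (given as input here)
    L: int     # remaining life time (given as input here)


def wbm_top_k(candidates: List[Candidate], k: int) -> List[Candidate]:
    """Rank candidates by Score = min(L, V), highest first, and keep k.

    Python's sort is stable, so candidates with equal scores keep
    their input order.
    """
    return sorted(candidates, key=lambda c: min(c.L, c.V), reverse=True)[:k]
```

For candidates with (V, L) equal to (3, 3), (4, 1) and (1, 3), the scores are 3, 1 and 1, so with k = 1 only the first one is refreshed. Taking the minimum means a mapping scores low if it would go stale quickly after a refresh or if it leaves the window soon.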
Experiment - Data Collection
1. Streaming API
a. Twitter stream data for mention count
2. Twitter APIs to get number of followers
a. Create snapshots every minute
b. Simulate the change based on each user’s predefined change rate.
[Figure: data sources: the streaming dataset and the snapshots/synthetic data]
Experimental setup
We study our hypotheses using a comparative evaluation with two baselines:
• LRU: use the least recently updated elements for maintenance
• RND: use a random subset of elements for maintenance
Error measure:
• Comparing the differences between consecutive evaluations of the
motivating query against the cache and against the real/synthetic dataset.
HP1: We compared the cumulative staleness with WSJ and without it (i.e.,
GNR) for both baselines.
• GNR: the candidate set is the whole set of view entries.
HP2: We compared the cumulative staleness of WBM and of the
improved baselines.
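The two baseline policies can be sketched as simple selection functions over the candidate set. The function names, the `budget` parameter (the number of entries that may be refreshed per evaluation), and the optional seeded generator are assumptions made for this illustration.

```python
import random
from typing import Dict, List, Optional


def lru_policy(last_update: Dict[str, float], budget: int) -> List[str]:
    """LRU: pick the `budget` entries whose last update is the oldest."""
    return sorted(last_update, key=last_update.get)[:budget]


def rnd_policy(keys: List[str], budget: int,
               rng: Optional[random.Random] = None) -> List[str]:
    """RND: pick a uniformly random subset of at most `budget` entries."""
    rng = rng or random.Random()
    return rng.sample(keys, min(budget, len(keys)))
```

Applied to the whole local view, these correspond to the GNR setting; applied only to the candidate set produced by WSJ, they correspond to the improved baselines.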
HP1: Maintaining the involved entries of the local view maximizes response
accuracy.
[Figure: results on the synthetic and real datasets]
As the update budget increases, WSJ improves more than GNR.
HP2: Maintaining possibly stale entries of the local view that will stay
fresh for a longer time maximizes response accuracy.
[Figure: results on the synthetic and real datasets]
WBM does not improve as much as WBM*, which shows that the error
is caused by wrong estimation of the best before time (BBT). A more accurate
prediction of the BBT is needed.
Conclusions and Future Work
Conclusions:
• We proposed using materialization to optimize the processing of
continuous queries.
• We proposed a policy to maximize freshness under the time
constraint of a continuous query.
• We tested our policy against baseline policies (LRU and Random).
Future Work:
• Extending real continuous query processors with the proposed
approach
• Measuring the time overhead of maintenance
• Investigating more complex queries that have complicated join patterns
between the SERVICE and STREAM clauses
• Dynamically estimating the change rate of users
Soheila Dehghanzadeh, Daniele Dell’Aglio, Shen Gao,
Emanuele Della Valle, Alessandra Mileo, Abraham Bernstein
soheila.dehghanzadeh@insight-centre.org
http://www.slideshare.net/sallyde
Approximate Continuous Query Answering Over Streams and Dynamic Linked Data Sets


Editor's Notes

  1. We motivate this work with a SemTech talk. The problem is very specific; it should be generalized to other cases.
  2. Width: how many time units we consider for one window. Slide: how many time units we slide the window to create the next window. Here we introduce some notions that we will use over the window.
  3. In order to produce the stream of influential users over time, we need to access the mention stream and the followers data from the REST API. A sketch of the query in a continuous query language.
  4. The less we maintain, the faster we can process queries, but how much less? How do we minimize the maintenance? Extension: to consider all users from the stream, if a user doesn’t exist in the local view, we fetch it and use it to replace one of the existing entries in the local view.
  5. Our goal is to minimize the maintenance based on constraints on QoS as the cost function. If an entry stays fresh for a longer time but its life in the window is short, we choose entries that stay longer in the window. Accuracy: (B+D)/(A+B+C+D), where A = false positives, B = true positives, C = false negatives, D = true negatives.
  6. An efficient maintenance process should take into account the change rates of cached data, the dynamics of the change rates, the constraints on quality of service, and the definition of the sliding window in order to maintain the data optimally.
  7. WBM picks the top k based on the time constraints of the query and sends them to the refresher, which maintains the local view only for that particular subset. The maintenance policy is applied online at every evaluation of the sliding window. It uses the content of the current window as well as the statistics of change rates to pick a subset of the local view, which is passed to the maintainer to query the REST API and rewrite the content of only those elements.
  8. Our proposed solution uses the change rates (R1) to identify stale mappings (red, green, blue and pink). It uses the window definition (R4) to identify the involved elements (red, yellow, blue and green). So WSJ only considers the intersection, which is red, green and blue.
  9. To investigate the first hypothesis, we study the effect of including (WSJ) or excluding (GNR) the proposer in the maintenance process, and for the ranker we used the two baselines. WST = no maintenance. BST = the proposer selects just the stale, involved elements from the local view based on the update budget.