SlideShare a Scribd company logo
1 of 28
Download to read offline
WHEN A FILTER MAKES THE DIFFERENCE IN
CONTINUOUSLYANSWERING SPARQL
QUERIES ON STREAMING AND QUASI-STATIC
LINKED DATA
Shima Zahmatkesh, Emanuele Della Valle, and Daniele Dell'Aglio
DEIB - Politecnico of Milano
ICWE 2016 – Lugano
7th June 2016
Outline
• Introduction
• Motivation
• Problem Statement
• State-of-the-art
• Research Question and Hypotheses
• Proposed Solution
• Experimental Results
• Conclusion and future works
!2
Introduction to Stream Processing
Stream Processing
Engine
ResultsWindows
Stream data
Register query once
and execute it
continuously
!3
Time-based sliding window

Sequence of
timestamped events
W(3,1 ,1)
β
ω
widthslide
S3
S4 S5 S6
S7
S8 S9
S10
S11
S
S1
S2
t
1 2 3 4 5 6 7 8 9 10 11 12
W i
W i+1
!4
)
Introduction to RSP EnginesRSPengine
Web
Results
Join
Windows
RDF Streams SPARQL endpoint
- Fetching all background data in each evaluation
- High number of invocation to the endpoint
- Loosing reactiveness of the engine
!5
Introduction to RSP EnginesRSPengine
Web
Results
Join
WindowsRDF Streams SPARQL endpoint
Local
Replica
Data become stale
during time
!6
Introduction to Continuous QueryRSPengine
Web
Results
Join
Local
Replica
Maintenance
Policy
WindowsRDF Streams SPARQL endpoint
!7
Motivation
The cloth brand ACME wants to persuade influential Social
Network users to post commercial endorsements.
Every minute give me the ID of the users that are mentioned on
Social Network in the last 10 minutes whose number of followers is
greater than 100,000.
!8
Motivation
The cloth brand ACME wants to persuade influential Social
Network users to post commercial endorsements.
Every minute give me the ID of the users that are mentioned on
Social Network in the last 10 minutes whose number of followers is
greater than 100,000.
ΩW
?user :hasMentions ?metionsNum
ΩS
?user :hasFollowers ?followerCount
WINDOW
SERVICE
JOIN
FILTER
?followerCount > 100000
!8
Problem Statement
• RSP Engines:
• Fetch all the background data in each evaluation
• Have high number of invocations
• The application may loose its reactiveness.
• Class of query with WINDOW and SERVICE and FILTER
clauses
• Solution:
• Using a local replica of the background data
• Need maintenance policies to keep the data freshness
!9
State-of-the-art
• Acqua: guarantee reactiveness of RSP engines while
maximizing the freshness of the replica
• Used a local replica of the background data
• Proposed maintenance policies
• Used refresh budget γ to limit the number of the access to the
background data and guarantee reactiveness
• Class of query with WINDOW and SERVICE clauses without
FILTER
!10
Acqua framework
!11
WINDOW clause
JOIN Proposer Ranker
MaintainerLocal Replica
2
3
1
SERVICE clause
E
C
RND
LRU
WBM
WSJ
Candidate set
Elected set: top
γ mappings of
Candidate set
Research Question
• Given a class of query with WINDOW and SERVICE and
FILTER clauses
• how can we refresh the local replica in order to
• guarantee reactiveness while
• maximizing the freshness the replica?
!12
Intuition
!13
time
NumberofFollowers
t
Filtering Threshold
User A
User B
User C
User D
Hypotheses
• H.1 the replica can be maintained fresher than when using
Acqua policies, if we first refresh the mappings whose
values are closer to the Filtering Threshold.
• H.2 the replica can be maintained fresher than when using
Acqua policies by first selecting the mappings as in
Hypothesis H.1 and, then, applying the Acqua policies.
!14
Proposed Solution
!15
WINDOW clause
JOIN Proposer Ranker
MaintainerLocal Replica
2
3
1
SERVICE clause
contains
FILTER clause
E
C
WSJ
Candidate set
Elected set
Filter Update Policy
RND.F
LRU.F
WBM.F
Proposed Solution
• Filter Update Policy:
For each mapping in the replica:
• Computes how close is the value associate to the variable of the
mapping to the Filtering Threshold.
• Combination with Acqua:
• RND.F, LRU.F, WBM.F
• Select a subset of Candidate set whose mappings are closer to
Filtering Threshold.
• Apply the Acqua policies over this subset.
!16
Experimental Evaluation (1 of 3)
• Data Sets
• Streaming data: collected from 400 verified users of Twitter for
three hours of tweets
• Background data: collected the number of followers per user, every
minute during the three hours
• Transformation of the background data:
• To control the selectivity of filter
• The time-series of specified percentage of the users crosses the Filtering
Threshold at least once during the experiment.
• To avoid biases, generate 10 different dataset for each specified
percentage
!17
Experimental Evaluation (2 of 3)
• Query
• Contains WINDOW, SERVICE, and FILTER clauses
• Generate correct answer of the query by a Oracle
• Ans(Oracle)
• KPIs
• Use Jaccard distance to measure diversity of the set generated by
the query Ans(Qi) and correct answers Ans(Oraclei)
• Define cumulative Jaccard distance
!18
Experimental Evaluation (3 of 3)
• Two sets of experiments:
• Comparing the Filter Update Policy (.F) with Acqua ones
• Comparing the Filter Update Policy (.F) with the combine policies
(RND.F, LRU.F, and WBM.F)
• Sensitivity of Refresh Budget
• Sensitivity of Selectivity of Filter
!19
How to read result graphs
!20
CumulativeJaccardDistance
Iterations of query execution
Policy A
Policy C
Policy B
Policy A outperform Policy B and C
Result of Experiment 1
!21
Iterations of query execution
Result of Experiment 1
!22
90 80 75 70 60 50
Selectivity
Result of Experiment 2
!23
Iterations of query execution
Filter
Filter
Result of Experiment 2
!24
90 80 75 70 60 50
Selectivity
Conclusion
• Continuous evaluation of a class of queries that joins data
streams and background data with Filtering.
• Get the idea of local replica, maintenance policy, and refresh
budget from the state-of-the-art
• Extend the class of continuous queries
• Propose new maintenance policies
• Filter Update Policy keeps the replica fresher than Acqua
policies in higher selectivity (>60%).
• Combination of policies keep the replica even fresher than the
Filter Update Policy.
!25
Future works
• Broaden the class of queries
• Multiple filtering
• Filtering condition formulated as a ranking clause
• Pushing the FILTER clause into the SERVICE clause and
considering caching instead of local replica
• Study the effect of different trends in the data
!26
Thank you!

Any Question?
When a FILTER makes the difference in
continuously answering SPARQL queries
on streaming and quasi-static Linked Data
Shima Zahmatkesh
shima.zahmatkesh@polimi.it
DEIB - Politecnico of Milano
!27

More Related Content

Similar to When a FILTER makes the di fference in continuously answering SPARQL queries on streaming and quasi-static Linked Data

Relevant Query Answering on Dynamic and Distributed Datasets
Relevant Query Answering on Dynamic and Distributed DatasetsRelevant Query Answering on Dynamic and Distributed Datasets
Relevant Query Answering on Dynamic and Distributed DatasetsShima Zahmatkesh
 
Automated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsAutomated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsSAIL_QU
 
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe EngineElastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe EngineZbigniew Jerzak
 
Change Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language RequirementsChange Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language RequirementsLionel Briand
 
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...Erica Beavers
 
Offsite presentation original
Offsite presentation originalOffsite presentation original
Offsite presentation originalsally.de
 
Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!Lionel Briand
 
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...Emanuel Lacić
 
SplunkLive! Customer Presentation – Directv
SplunkLive! Customer Presentation – DirectvSplunkLive! Customer Presentation – Directv
SplunkLive! Customer Presentation – DirectvSplunk
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...HostedbyConfluent
 
A flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVA flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVIntoTheMinds
 
A Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVA Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVFrancisco Couto
 
Automating Clinical Workflows with the VarSeq Suite
Automating Clinical Workflows with the VarSeq SuiteAutomating Clinical Workflows with the VarSeq Suite
Automating Clinical Workflows with the VarSeq SuiteGolden Helix
 
The art of system and solution testing
The art of system and solution testingThe art of system and solution testing
The art of system and solution testinggaoliang641
 
On the need for applications aware adaptive middleware in real-time RDF data ...
On the need for applications aware adaptive middleware in real-time RDF data ...On the need for applications aware adaptive middleware in real-time RDF data ...
On the need for applications aware adaptive middleware in real-time RDF data ...Zia Ush Shamszaman
 
Automating NGS Gene Panel Analysis Workflows with Golden Helix
Automating NGS Gene Panel Analysis Workflows with Golden HelixAutomating NGS Gene Panel Analysis Workflows with Golden Helix
Automating NGS Gene Panel Analysis Workflows with Golden HelixGolden Helix
 
System and User Aspects of Web Search Latency
System and User Aspects of Web Search LatencySystem and User Aspects of Web Search Latency
System and User Aspects of Web Search LatencyTelefonica Research
 
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...weADAPT
 

Similar to When a FILTER makes the di fference in continuously answering SPARQL queries on streaming and quasi-static Linked Data (20)

Relevant Query Answering on Dynamic and Distributed Datasets
Relevant Query Answering on Dynamic and Distributed DatasetsRelevant Query Answering on Dynamic and Distributed Datasets
Relevant Query Answering on Dynamic and Distributed Datasets
 
Automated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsAutomated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise Applications
 
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe EngineElastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine
Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine
 
Change Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language RequirementsChange Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language Requirements
 
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...
Paris Video Tech - 1st Edition: Dailymotion Améliorer l'expérience utilisateu...
 
Offsite presentation original
Offsite presentation originalOffsite presentation original
Offsite presentation original
 
rerngvit_phd_seminar
rerngvit_phd_seminarrerngvit_phd_seminar
rerngvit_phd_seminar
 
Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!Documented Requirements are not Useless After All!
Documented Requirements are not Useless After All!
 
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ...
 
SplunkLive! Customer Presentation – Directv
SplunkLive! Customer Presentation – DirectvSplunkLive! Customer Presentation – Directv
SplunkLive! Customer Presentation – Directv
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
 
Strel streaming
Strel streamingStrel streaming
Strel streaming
 
A flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVA flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TV
 
A Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVA Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TV
 
Automating Clinical Workflows with the VarSeq Suite
Automating Clinical Workflows with the VarSeq SuiteAutomating Clinical Workflows with the VarSeq Suite
Automating Clinical Workflows with the VarSeq Suite
 
The art of system and solution testing
The art of system and solution testingThe art of system and solution testing
The art of system and solution testing
 
On the need for applications aware adaptive middleware in real-time RDF data ...
On the need for applications aware adaptive middleware in real-time RDF data ...On the need for applications aware adaptive middleware in real-time RDF data ...
On the need for applications aware adaptive middleware in real-time RDF data ...
 
Automating NGS Gene Panel Analysis Workflows with Golden Helix
Automating NGS Gene Panel Analysis Workflows with Golden HelixAutomating NGS Gene Panel Analysis Workflows with Golden Helix
Automating NGS Gene Panel Analysis Workflows with Golden Helix
 
System and User Aspects of Web Search Latency
System and User Aspects of Web Search LatencySystem and User Aspects of Web Search Latency
System and User Aspects of Web Search Latency
 
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...
Sida LEAP Training Lectures #7 and #8: Linking LEAP and WEAP and other advanc...
 

Recently uploaded

An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 

Recently uploaded (20)

An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 

When a FILTER makes the di fference in continuously answering SPARQL queries on streaming and quasi-static Linked Data

  • 1. WHEN A FILTER MAKES THE DIFFERENCE IN CONTINUOUSLYANSWERING SPARQL QUERIES ON STREAMING AND QUASI-STATIC LINKED DATA Shima Zahmatkesh, Emanuele Della Valle, and Daniele Dell'Aglio DEIB - Politecnico of Milano ICWE 2016 – Lugano 7th June 2016
  • 2. Outline • Introduction • Motivation • Problem Statement • State-of-the-art • Research Question and Hypotheses • Proposed Solution • Experimental Results • Conclusion and future works !2
  • 3. Introduction to Stream Processing Stream Processing Engine ResultsWindows Stream data Register query once and execute it continuously !3
  • 4. Time-based sliding window
 Sequence of timestamped events W(3,1 ,1) β ω widthslide S3 S4 S5 S6 S7 S8 S9 S10 S11 S S1 S2 t 1 2 3 4 5 6 7 8 9 10 11 12 W i W i+1 !4 )
  • 5. Introduction to RSP EnginesRSPengine Web Results Join Windows RDF Streams SPARQL endpoint - Fetching all background data in each evaluation - High number of invocation to the endpoint - Loosing reactiveness of the engine !5
  • 6. Introduction to RSP EnginesRSPengine Web Results Join WindowsRDF Streams SPARQL endpoint Local Replica Data become stale during time !6
  • 7. Introduction to Continuous QueryRSPengine Web Results Join Local Replica Maintenance Policy WindowsRDF Streams SPARQL endpoint !7
  • 8. Motivation The cloth brand ACME wants to persuade influential Social Network users to post commercial endorsements. Every minute give me the ID of the users that are mentioned on Social Network in the last 10 minutes whose number of followers is greater than 100,000. !8
  • 9. Motivation The cloth brand ACME wants to persuade influential Social Network users to post commercial endorsements. Every minute give me the ID of the users that are mentioned on Social Network in the last 10 minutes whose number of followers is greater than 100,000. ΩW ?user :hasMentions ?metionsNum ΩS ?user :hasFollowers ?followerCount WINDOW SERVICE JOIN FILTER ?followerCount > 100000 !8
  • 10. Problem Statement • RSP Engines: • Fetch all the background data in each evaluation • Have high number of invocations • The application may loose its reactiveness. • Class of query with WINDOW and SERVICE and FILTER clauses • Solution: • Using a local replica of the background data • Need maintenance policies to keep the data freshness !9
  • 11. State-of-the-art • Acqua: guarantee reactiveness of RSP engines while maximizing the freshness of the replica • Used a local replica of the background data • Proposed maintenance policies • Used refresh budget γ to limit the number of the access to the background data and guarantee reactiveness • Class of query with WINDOW and SERVICE clauses without FILTER !10
  • 12. Acqua framework !11 WINDOW clause JOIN Proposer Ranker MaintainerLocal Replica 2 3 1 SERVICE clause E C RND LRU WBM WSJ Candidate set Elected set: top γ mappings of Candidate set
  • 13. Research Question • Given a class of query with WINDOW and SERVICE and FILTER clauses • how can we refresh the local replica in order to • guarantee reactiveness while • maximizing the freshness the replica? !12
  • 15. Hypotheses • H.1 the replica can be maintained fresher than when using Acqua policies, if we first refresh the mappings whose values are closer to the Filtering Threshold. • H.2 the replica can be maintained fresher than when using Acqua policies by first selecting the mappings as in Hypothesis H.1 and, then, applying the Acqua policies. !14
  • 16. Proposed Solution !15 WINDOW clause JOIN Proposer Ranker MaintainerLocal Replica 2 3 1 SERVICE clause contains FILTER clause E C WSJ Candidate set Elected set Filter Update Policy RND.F LRU.F WBM.F
  • 17. Proposed Solution • Filter Update Policy: For each mapping in the replica: • Computes how close is the value associate to the variable of the mapping to the Filtering Threshold. • Combination with Acqua: • RND.F, LRU.F, WBM.F • Select a subset of Candidate set whose mappings are closer to Filtering Threshold. • Apply the Acqua policies over this subset. !16
  • 18. Experimental Evaluation (1 of 3) • Data Sets • Streaming data: collected from 400 verified users of Twitter for three hours of tweets • Background data: collected the number of followers per user, every minute during the three hours • Transformation of the background data: • To control the selectivity of filter • The time-series of specified percentage of the users crosses the Filtering Threshold at least once during the experiment. • To avoid biases, generate 10 different dataset for each specified percentage !17
  • 19. Experimental Evaluation (2 of 3) • Query • Contains WINDOW, SERVICE, and FILTER clauses • Generate correct answer of the query by a Oracle • Ans(Oracle) • KPIs • Use Jaccard distance to measure diversity of the set generated by the query Ans(Qi) and correct answers Ans(Oraclei) • Define cumulative Jaccard distance !18
  • 20. Experimental Evaluation (3 of 3) • Two sets of experiments: • Comparing the Filter Update Policy (.F) with Acqua ones • Comparing the Filter Update Policy (.F) with the combine policies (RND.F, LRU.F, and WBM.F) • Sensitivity of Refresh Budget • Sensitivity of Selectivity of Filter !19
  • 21. How to read result graphs !20 CumulativeJaccardDistance Iterations of query execution Policy A Policy C Policy B Policy A outperform Policy B and C
  • 22. Result of Experiment 1 !21 Iterations of query execution
  • 23. Result of Experiment 1 !22 90 80 75 70 60 50 Selectivity
  • 24. Result of Experiment 2 !23 Iterations of query execution Filter Filter
  • 25. Result of Experiment 2 !24 90 80 75 70 60 50 Selectivity
  • 26. Conclusion • Continuous evaluation of a class of queries that joins data streams and background data with Filtering. • Get the idea of local replica, maintenance policy, and refresh budget from the state-of-the-art • Extend the class of continuous queries • Propose new maintenance policies • Filter Update Policy keeps the replica fresher than Acqua policies in higher selectivity (>60%). • Combination of policies keep the replica even fresher than the Filter Update Policy. !25
  • 27. Future works • Broaden the class of queries • Multiple filtering • Filtering condition formulated as a ranking clause • Pushing the FILTER clause into the SERVICE clause and considering caching instead of local replica • Study the effect of different trends in the data !26
  • 28. Thank you!
 Any Question? When a FILTER makes the difference in continuously answering SPARQL queries on streaming and quasi-static Linked Data Shima Zahmatkesh shima.zahmatkesh@polimi.it DEIB - Politecnico of Milano !27