Shima Zahmatkesh, Emanuele Della Valle, and Daniele Dell'Aglio
DEIB - Politecnico of Milano
DEBS 2017 – Barcelona, Spain
23 June 2017
Using Rank Aggregation in Continuously
Answering SPARQL Queries on Streaming
and Quasi-static Linked Data
Stream Processing in a Nutshell
[Diagram labels: Stream data, Windows, Stream Processing Engine, Results]
• Register a query once and execute it continuously.
Web Stream Processing
[Diagram: a Stream Processing Engine joins windows over Web Streams with Linked Data from the Web and emits Results]
✓ High Latency
✓ Rate Limits
✓ Losing Reactiveness
RDF Stream Processing (RSP) Engine
[Diagram: the RSP engine joins windows over RDF Streams with data from a SPARQL endpoint on the Web and emits Results]
Motivation
The clothing brand ACME wants to persuade influential social network users to post commercial endorsements.
"Every minute, give me the IDs of the users who were mentioned on the social network in the last 10 minutes and who have more than 100,000 followers."
REGISTER STREAM <:InfluencersToContact> AS
CONSTRUCT { ?user a :influentialUser }
FROM NAMED WINDOW W ON S [RANGE 10m STEP 1m]
WHERE {
  WINDOW W { ?user :hasMentions ?mentionsNumber }
  SERVICE BKG { ?user :hasFollowers ?followerCount }
  FILTER (?followerCount > 100000)
}
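To make the window semantics of the query concrete, here is a minimal Python sketch of what one evaluation computes. It is an illustration under stated assumptions, not how an RSP engine is implemented: timestamps are plain seconds, the window is taken as the half-open interval (now - RANGE, now], the background data is a simple dictionary, and the names `evaluate` and `run` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

WINDOW_RANGE = 10 * 60   # RANGE 10m, in seconds
WINDOW_STEP = 60         # STEP 1m, in seconds
THRESHOLD = 100_000      # FILTER (?followerCount > 100000)


def evaluate(mentions: Iterable[Tuple[int, str]],
             followers: Dict[str, int],
             now: int) -> Set[str]:
    """One evaluation: users mentioned in (now - RANGE, now] that pass the filter."""
    # WINDOW clause: keep only the mentions that fall inside the current window.
    in_window = {user for ts, user in mentions if now - WINDOW_RANGE < ts <= now}
    # SERVICE clause + FILTER: join with the background data, then filter.
    # Assumption: users missing from the background data are treated as 0 followers.
    return {user for user in in_window if followers.get(user, 0) > THRESHOLD}


def run(mentions: List[Tuple[int, str]],
        followers: Dict[str, int],
        start: int,
        end: int) -> List[Tuple[int, Set[str]]]:
    """Re-evaluate the registered query every STEP seconds between start and end."""
    return [(t, evaluate(mentions, followers, t))
            for t in range(start, end + 1, WINDOW_STEP)]


if __name__ == "__main__":
    mentions = [(30, "alice"), (90, "bob"), (700, "carol")]
    followers = {"alice": 120_000, "bob": 30_000, "carol": 250_000}
    for t, users in run(mentions, followers, 600, 720):
        print(t, sorted(users))
```

In this sketch `followers` is read locally on every evaluation; the rest of the talk is about what happens when that data lives behind a remote SPARQL endpoint instead.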
Problem Definition
[Diagram: the RSP engine joins windows over RDF Streams with a Local Replica of the data served by the SPARQL endpoint on the Web, and emits Results. A Maintenance Policy governs the Local Replica.]
• Define a Refresh Budget to control reactiveness.
• Data become stale if not refreshed.
• Correct vs. approximate answer.
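ACQUA, ACQUA.F Frameworks
• ACQUA: without FILTER clause. ACQUA.F: with FILTER clause.
• The WINDOW clause (stream data) is joined (JOIN) with the SERVICE clause, which is answered from the Local Replica.
• Components:
  1. Proposer: builds the Candidate set (C). WSJ: filter out mappings that are not involved in the current evaluation.
  2. Ranker: ranks the Candidate set. Elected set (E): top γ mappings of the Candidate set.
  3. Maintainer: maintains the Local Replica.
• Ranker policies:
  • ACQUA: RND, LRU, WBM
  • ACQUA.F: Filter Update Policy, RND.F, LRU.F, WBM.F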
Soheila Dehghanzadeh, et al., Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets, ICWE 2015.
Shima Zahmatkesh, et al., When a FILTER Makes the Difference in Continuously Answering SPARQL Queries on Streaming and Quasi-Static Linked Data, ICWE 2016.
Emanuele Della Valle, et al., Taming velocity and variety simultaneously in big data with stream reasoning, DEBS 2016.
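The proposer, ranker, and maintainer steps of the framework can be sketched as a single maintenance step in Python. This is a minimal illustration, not the ACQUA implementation: the function name `maintain_replica`, the dictionary-based replica, and the `fetch_remote` callback standing in for a call to the SPARQL endpoint are all assumptions, and the refresh budget plays the role of γ.

```python
from typing import Callable, Dict, List, Set


def maintain_replica(window_users: Set[str],
                     replica: Dict[str, int],
                     rank: Callable[[List[str]], List[str]],
                     fetch_remote: Callable[[str], int],
                     budget: int) -> List[str]:
    """One maintenance step; returns the elected set that was refreshed."""
    # Proposer (WSJ): only mappings involved in the current evaluation are candidates.
    candidates = [user for user in replica if user in window_users]
    # Ranker: order the candidate set with the chosen policy.
    ranked = rank(candidates)
    # Elected set: the top-gamma mappings, where gamma is the refresh budget.
    elected = ranked[:budget]
    # Maintainer: refresh the elected mappings from the remote endpoint.
    for user in elected:
        replica[user] = fetch_remote(user)
    return elected


if __name__ == "__main__":
    replica = {"alice": 120, "bob": 30, "carol": 95, "dave": 10}
    remote = {"alice": 90, "bob": 30, "carol": 105, "dave": 10}
    # Hypothetical ranking policy: values closest to a threshold of 100 first.
    closest_first = lambda users: sorted(users, key=lambda u: abs(replica[u] - 100))
    print(maintain_replica({"alice", "bob", "carol"}, replica,
                           closest_first, remote.__getitem__, budget=2))
    print(replica)
```

Passing the ranking policy as a function keeps the loop unchanged when one policy (RND, LRU, WBM, or a filter-aware variant) is swapped for another.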
Rankers
• LRU
  • Uses the Least-Recently Used (LRU) cache replacement algorithm.
  • The less recently a mapping has been refreshed in a query, the higher its rank.
• Filter Update Policy
  • For each mapping in the replica, computes how close the value associated with the mapping's variable is to the Filtering Threshold used in the FILTER clause.
  • Arranges the mappings in ascending order of that distance.
Filtering Threshold = 100
User    Last Update Time   LRU policy (rank)   #followers   Filter Update (rank)
Alice   8                  1                   120          2
Bob     10                 2                   30           3
Carol   14                 3                   95           1
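The two rankings in the table can be reproduced with a few lines of Python. This is a sketch using only the values shown on the slide; the helper names (`to_ranks`, `lru_ranks`, `filter_update_ranks`) are hypothetical, and closeness to the threshold is assumed to be the absolute difference.

```python
from typing import Dict, List

# Values copied from the example table.
last_update = {"Alice": 8, "Bob": 10, "Carol": 14}
followers = {"Alice": 120, "Bob": 30, "Carol": 95}
FILTER_THRESHOLD = 100


def to_ranks(ordered: List[str]) -> Dict[str, int]:
    """Turn an ordered list of users into 1-based rank positions."""
    return {user: position for position, user in enumerate(ordered, start=1)}


def lru_ranks(last_update: Dict[str, int]) -> Dict[str, int]:
    """LRU: the least recently refreshed mapping comes first."""
    return to_ranks(sorted(last_update, key=lambda u: last_update[u]))


def filter_update_ranks(values: Dict[str, int], threshold: int) -> Dict[str, int]:
    """Filter Update Policy: the value closest to the threshold comes first."""
    # Assumption: closeness is the absolute difference to the threshold.
    return to_ranks(sorted(values, key=lambda u: abs(values[u] - threshold)))


if __name__ == "__main__":
    print(lru_ranks(last_update))
    print(filter_update_ranks(followers, FILTER_THRESHOLD))
```

Carol's follower count (95) is only 5 away from the threshold, so a small change could flip the outcome of the FILTER; that is why she is ranked first by the Filter Update Policy even though she was refreshed most recently.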
Rank Aggregation
• Fairly take into account the opinions of different algorithms.
• Combine the ranking lists obtained from different algorithms by computing an aggregated score.
α = 0.5
List 1            List 2
User    Score     User    Score
Alice   0.8       Bob     0.9
Bob     0.7       David   0.8
Carol   0.5       Alice   0.7
David   0.4       Eve     0.4
Eve     0.1       Carol   0.1
T computed from the first entry of each list: T = 0.5*0.8 + 0.5*0.9 = 0.85
User    Score_agg
Bob     0.8
Alice   0.75
T computed from the second entry of each list (same lists, α = 0.5): T = 0.5*0.7 + 0.5*0.8 = 0.75
User    Score_agg
Bob     0.8
Alice   0.75
David   0.6
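The numbers on the two Rank Aggregation slides can be reproduced with the following Python sketch. Two things are assumptions rather than statements from the slides: the aggregated score is taken to be α·score₁ + (1 − α)·score₂, and the value T is read as a stopping threshold in the style of a threshold-based top-k scan (stop once the k-th best aggregated score seen so far is at least T). The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Scores copied from List 1 and List 2 on the slides.
list1 = {"Alice": 0.8, "Bob": 0.7, "Carol": 0.5, "David": 0.4, "Eve": 0.1}
list2 = {"Bob": 0.9, "David": 0.8, "Alice": 0.7, "Eve": 0.4, "Carol": 0.1}


def aggregate(user: str, alpha: float) -> float:
    """Assumed aggregation formula: weighted sum of the two scores."""
    return round(alpha * list1[user] + (1 - alpha) * list2[user], 6)


def top_k(k: int, alpha: float) -> List[Tuple[str, float]]:
    """Scan both lists in parallel and stop once the top k cannot change."""
    order1 = sorted(list1, key=lambda u: list1[u], reverse=True)
    order2 = sorted(list2, key=lambda u: list2[u], reverse=True)
    seen: Dict[str, float] = {}
    for user1, user2 in zip(order1, order2):
        # Score every user encountered at this depth (looking up both lists).
        for user in (user1, user2):
            seen[user] = aggregate(user, alpha)
        # T: aggregated score of the entries at the current depth of each list.
        threshold = round(alpha * list1[user1] + (1 - alpha) * list2[user2], 6)
        best = sorted(seen.items(), key=lambda item: item[1], reverse=True)[:k]
        if len(best) == k and best[-1][1] >= threshold:
            return best
    return sorted(seen.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    print(top_k(k=2, alpha=0.5))
```

With k = 2 and α = 0.5 the scan sees Alice and Bob first (T = 0.85, so it continues), then David (T = 0.75), at which point both Bob (0.8) and Alice (0.75) reach the threshold and the scan stops without scoring Carol or Eve.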
Experimental Evaluation
• Data Sets
  • Streaming data and realistic background data derived from real Twitter data
• Query
  • Contains WINDOW, SERVICE, and FILTER clauses
  • The correct answer of the query is generated by an Oracle
• KPIs
  • Measure the diversity between the set generated by the query and the correct answers
  • Compute cumulative errors over evaluations
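As an illustration of the KPIs, the sketch below measures the difference between an approximate answer and the Oracle's answer and accumulates it over evaluations. The slide does not name the metric here, so the use of Jaccard distance as the set-diversity measure is an assumption, as are the function names.

```python
from itertools import accumulate
from typing import Iterable, List, Set, Tuple


def jaccard_distance(answer: Set[str], correct: Set[str]) -> float:
    """1 - |intersection| / |union|; 0.0 means the answers are identical."""
    union = answer | correct
    if not union:
        # Both answers are empty, so there is no error.
        return 0.0
    return 1.0 - len(answer & correct) / len(union)


def cumulative_errors(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> List[float]:
    """Running sum of the per-evaluation error over successive evaluations."""
    return list(accumulate(jaccard_distance(a, c) for a, c in pairs))


if __name__ == "__main__":
    evaluations = [
        ({"alice", "bob"}, {"alice", "bob"}),   # exact answer
        ({"alice", "bob"}, {"alice"}),          # one stale mapping passed the filter
        ({"carol"}, {"dave"}),                  # completely wrong answer
    ]
    print(cumulative_errors(evaluations))
```

A cumulative curve makes it easy to compare maintenance policies over a whole run: the flatter the curve, the closer the policy stays to the Oracle.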
Experimental Results
[Figure: performance, on a scale from Best to Worst, across the Experiment Dimension]
• For low selectivity, WBM is better than the Filter Update Policy.
• For high selectivity, the Filter Update Policy is better than WBM.
• The proposed policies are comparable to the best result.
Conclusion
• Problem: continuously evaluating queries over a data stream and background data.
• The experimental results show that the proposed policies reach the same accuracy as the best result achieved without using any assumption.
• They also show that the proposed policies are not sensitive to the value of α used in the rank aggregation formula.
Future Work
• Broaden the class of queries
  • Multiple filtering
  • Filtering condition formulated as a ranking clause
• Push the FILTER clause into the SERVICE clause and consider caching instead of a local replica
• Study the effect of different trends in the data
Thank you!
Any questions?
Shima Zahmatkesh
shima.zahmatkesh@polimi.it
