FFWD - Fast Forward With Degradation

Presentation of the paper "FFWD: latency-aware event stream processing via domain-specific load-shedding policies" at EUC 2016


  1. FFWD: latency-aware event stream processing via domain-specific load-shedding policies
     R. Brondolin, M. Ferroni, M. D. Santambrogio
     2016 IEEE 14th International Conference on Embedded and Ubiquitous Computing (EUC)
  2. Outline
     • Stream processing engines and real-time sentiment analysis
     • Problem definition and proposed solution
     • FFWD design
     • Load-shedding components
     • Experimental evaluation
     • Conclusions and future work
  3. Introduction
     • Stream processing engines (SPEs) are scalable tools that process continuous data streams; they are widely used, for example, in network monitoring and telecommunications
     • Sentiment analysis is the process of determining the emotional tone behind a series of words, in our case Twitter messages
  4. Real-time sentiment analysis
     • Real-time sentiment analysis makes it possible to:
       – Track the sentiment of a topic over time
       – Correlate real-world events and the related sentiment, e.g. the Toyota crisis (2010) [1] and the 2012 US Presidential Election Cycle [2]
       – Track the online evolution of a company's reputation, derive social profiling, and enable enhanced social-marketing strategies
     [1] Bifet Figuerol, Albert Carles, et al. "Detecting sentiment change in Twitter streaming data." Journal of Machine Learning Research: Workshop and Conference Proceedings Series, 2011.
     [2] Wang, Hao, et al. "A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle." Proceedings of the ACL 2012 System Demonstrations, 2012.
  5. Case Study
     • Simple Twitter streaming sentiment analyzer built with Stanford NLP
     • System components: event producer, RabbitMQ queue, event consumer
     • Consumer components: Event Capture, Sentiment Analyzer, Sentiment Aggregator
     • Real-time queue consumption; aggregated metrics (keyword and hashtag sentiment) emitted every second
  6. Problem definition (1)
     • Our sentiment analyzer is a streaming system with a finite queue, an unpredictable arrival rate λ(t), and a limited service rate μ(t)
     [Diagram: queue feeding a server S, with arrival rate λ(t) and service rate μ(t)]
     • If λ(t) stays bounded, then λ(t) ≃ μ(t): the system is stable and the response time is limited
  7. Problem definition (2)
     • Same system: finite queue, unpredictable arrival rate λ(t), limited service rate μ(t)
     • If λ(t) increases too much, λ(t) >> μ(t): the queue starts to fill and the response time increases…
     [Diagram: same queue, now filling up]
  8. Problem definition (2)
     • Same system: finite queue, unpredictable arrival rate λ(t), limited service rate μ(t)
     • … until the system loses its real-time behavior (a small numeric sketch of this effect follows below)
     [Diagram: same queue]
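
A minimal sketch of the queueing argument above (Python, with illustrative numbers; the slides only state the qualitative behavior): once λ(t) exceeds μ(t), the backlog, and with it the response time, grows without bound.

    # Discrete-time sketch of the finite-queue argument (illustrative numbers only).
    def simulate(arrivals_per_s, service_rate, seconds):
        backlog = 0.0
        for t in range(seconds):
            backlog = max(0.0, backlog + arrivals_per_s(t) - service_rate)
            response_time = backlog / service_rate   # time the newest event waits
            print(f"t={t:3d}s  backlog={backlog:7.0f}  R≈{response_time:6.1f}s")

    # Stable phase (λ ≈ μ = 40 evt/s), then a burst at 400 evt/s: R grows by ~9 s per second.
    simulate(lambda t: 40 if t < 5 else 400, service_rate=40, seconds=10)
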
  9. Proposed solution
     • Scale out? Still limited by the available machines
     • What if we try to drop tweets instead?
       – Keep the response time bounded
       – Try to minimize the number of dropped tweets
       – Try to minimize the error between the exact computation and the approximated one
     • Use a probabilistic approach to load shedding, with domain-specific policies to enhance the accuracy of the estimates (see the sketch below)
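
As a rough illustration of the probabilistic approach (our naming; whether FFWD re-weights kept events in exactly this way is not stated on the slide), each event can be dropped with probability p, and the survivors can be weighted by 1/(1 - p) so that weighted aggregates approximate the exact ones:

    import random

    def shed(events, drop_probability):
        """Keep each event with probability 1 - p; weight survivors by 1 / (1 - p)."""
        if drop_probability >= 1.0:
            return []
        weight = 1.0 / (1.0 - drop_probability)
        return [(e, weight) for e in events if random.random() >= drop_probability]
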
 10. Fast Forward With Degradation (FFWD)
     • FFWD adds four components to the baseline pipeline
     [Diagram: Producer feeds input tweets into the real-time queue; Event Capture, Sentiment Analyzer and Sentiment Aggregator consume events and emit accounted and output metrics]
 11. Fast Forward With Degradation (FFWD)
     • FFWD adds four components:
       – Load-shedding filter at the beginning of the pipeline
       – Shedding plan used by the filter
     [Diagram: the filter sits between the Producer and the real-time queue; "ok" events are enqueued, "ko" events are counted and dropped according to the plan's drop probabilities]
 12. Fast Forward With Degradation (FFWD)
     • FFWD adds four components:
       – Load-shedding filter at the beginning of the pipeline
       – Shedding plan used by the filter
       – Domain-specific policy wrapper
     [Diagram: the Policy Wrapper receives stream statistics and writes an updated shedding plan]
 13. Fast Forward With Degradation (FFWD)
     • FFWD adds four components:
       – Load-shedding filter at the beginning of the pipeline
       – Shedding plan used by the filter
       – Domain-specific policy wrapper
       – Application controller to detect load peaks
     [Diagram: the Controller observes λ(t) and R(t) and, given the target response time Rt, provides μ(t+1) to the Policy Wrapper]
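
One way to picture how the four components could be wired around the original pipeline (a sketch with our own class names; the slides only show the block diagram): every second the controller turns the observed arrival rate and response time into a requested throughput, the policy wrapper turns that into a shedding plan, and the filter consults the plan before events reach the real-time queue.

    class FFWDPipeline:
        """Illustrative wiring of FFWD's components (not the paper's code)."""
        def __init__(self, controller, policy_wrapper, ls_filter, consumer):
            self.controller = controller          # computes mu(t+1) from lambda(t), R(t)
            self.policy_wrapper = policy_wrapper  # builds the shedding plan
            self.ls_filter = ls_filter            # drops events according to the plan
            self.consumer = consumer              # Event Capture -> Analyzer -> Aggregator

        def tick(self, arrival_rate, response_time, stream_stats):
            """Called once per second to refresh the shedding plan."""
            requested_mu = self.controller.requested_throughput(arrival_rate, response_time)
            plan = self.policy_wrapper.build_plan(requested_mu, stream_stats)
            self.ls_filter.set_plan(plan)

        def on_event(self, event):
            if self.ls_filter.accept(event):
                self.consumer.process(event)
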
 14. Controller
     • The system S can be characterized by its response time and by the jobs in the system (Little's Law: jobs in the system = arrival rate × response time)
     • A control error is computed from the old (measured) response time and the target response time
     • The requested throughput is derived from the arrival rate and the control error
     • The requested throughput is used by the load-shedding policies to derive the LS probabilities
 15. Controller (same slide, highlighting the old response time and the target response time in the control error)
 16. Controller (same slide, highlighting the requested throughput, the arrival rate and the control error)
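
A minimal controller in this spirit (our formulation; the exact control law is in the paper, not on these slides): estimate the jobs in the system with Little's Law, compare the measured response time with the target, and turn the error into a requested throughput for the next period.

    class LatencyController:
        """Heuristic latency controller (illustrative, not the paper's exact equations)."""
        def __init__(self, target_response_time):
            self.target = target_response_time

        def requested_throughput(self, arrival_rate, response_time):
            jobs_in_system = arrival_rate * response_time          # Little's Law
            error = (response_time - self.target) / self.target    # > 0 when too slow
            # One possible choice: serve new arrivals plus enough extra throughput
            # to drain the estimated backlog within the next target window.
            backlog_drain = max(0.0, error) * jobs_in_system / self.target
            return arrival_rate + backlog_drain
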
 17. Policies
     • Baseline: general drop probability computed from the requested throughput
     [Figure: the Policy Wrapper and equations (6)-(8) from the paper; the baseline drop probability has the form P(X) = 1 - μc(t-1)/μ(t), i.e. drop the share of the requested throughput μ(t) that exceeds the capacity μc(t-1) measured in the previous period]
 18. Policies
     • Baseline: general drop probability computed from the requested throughput
     • Fair: assign to each input class the "same" number of events
       – Saves the metrics of small classes while keeping accurate results on the big ones
     [Same Policy Wrapper figure as the previous slide]
 19. Policies
     • Baseline: general drop probability computed from the requested throughput
     • Fair: assign to each input class the "same" number of events
       – Saves the metrics of small classes while keeping accurate results on the big ones
     • Priority: assign a priority to each input class
       – Divide events depending on the priorities
       – Generalizes the Fair policy
     [Same Policy Wrapper figure as the previous slides]
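
A minimal sketch of how a Fair or Priority policy could turn a per-second budget of events that may be kept (however it is derived from the controller's requested throughput) into per-class drop probabilities; the class names, the weighting scheme and the lack of budget redistribution are our simplifications, not the paper's exact policy:

    def build_plan(budget, class_rates, weights=None):
        """Per-class drop probabilities (illustrative only).

        budget      -- events/s that may be kept overall
        class_rates -- observed arrival rate per input class, events/s
        weights     -- optional per-class priorities; equal weights give the Fair policy
        """
        if weights is None:
            weights = {c: 1.0 for c in class_rates}      # Fair: same share for each class
        total_weight = sum(weights.values())
        plan = {}
        for c, rate in class_rates.items():
            share = budget * weights[c] / total_weight   # events/s granted to class c
            keep = min(1.0, share / rate) if rate > 0 else 1.0
            plan[c] = 1.0 - keep                         # drop probability for class c
        return plan

    # Example: 40 evt/s of capacity across three hashtags with skewed arrival rates.
    print(build_plan(40, {"#mufc": 300, "#lfc": 60, "#afc": 10}))
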
 20. Filter
     • For each event in the system, the filter:
       – looks up the drop probability in the shedding plan using the event's meta-data
       – falls back to the general drop probability if no entry is found
     • If specified, the dropped events are placed in a different (batch) queue for later analysis
     [Diagram: Load Shedding Filter consulting the Shedding Plan; "ok" events go to the real-time queue and Event Capture, "ko" events go to the batch queue]
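
The filter's fast path can be sketched as a dictionary lookup keyed on the event's meta-data, falling back to the general drop probability (a sketch with our own names; the batch-queue hand-off is omitted):

    import random

    class LoadSheddingFilter:
        """Sketch of the per-event drop decision (not the paper's code)."""
        def __init__(self, plan=None, general_p=0.0):
            self.plan = plan or {}          # {meta-data key: drop probability}
            self.general_p = general_p      # used when the key is not in the plan

        def set_plan(self, plan, general_p=0.0):
            self.plan, self.general_p = plan, general_p

        def accept(self, event_key):
            p = self.plan.get(event_key, self.general_p)
            return random.random() >= p     # False means: drop (or route to the batch queue)
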
 21. Evaluation setup
     • Separate tests to understand FFWD's behavior: controller performance; policy and degradation evaluation
     • Dataset: 900K tweets from the 35th week of the Premier League
     • Performed tests:
       – Controller: synthetic and real tweets at various λ(t)
       – Policy: real tweets at various λ(t)
     • Evaluation platform: Intel Core i7-3770 (4 cores @ 3.4 GHz + HT, 8 MB LLC), 8 GB RAM @ 1600 MHz
 22. Controller Performance
     • λ(t) estimation: case A: λ(t) = λ(t-1); case B: λ(t) = avg(λ(t))
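
The two arrival-rate estimators compared on this slide, written out (the averaging window for case B is not specified on the slide, so the whole observed history is used here):

    def lambda_case_a(history):
        return history[-1]                    # case A: lambda(t) = lambda(t-1)

    def lambda_case_b(history):
        return sum(history) / len(history)    # case B: lambda(t) = average of observed lambdas
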
 23. Controller showcase (1)
     • Controller demo (Rt = 5 s): λ(t) increased after 60 s and 240 s
     • Response time:
     [Plot: response time (s) over time (s), measured R against the QoS = 5 s target]
 24. Controller showcase (2)
     • Controller demo (Rt = 5 s): λ(t) increased after 60 s and 240 s
     • Throughput:
     [Plot: events per second over time (s), showing lambda, dropped events and the computed mu]
 25. Degradation Evaluation
     • Real tweets, μc(t) ≃ 40 evt/s
     • Evaluated policies: Baseline, Fair, Priority
     • R = 5 s; λ(t) = 100 evt/s, 200 evt/s, 400 evt/s
     • Error metric: Mean Absolute Percentage Error (MAPE, %), lower is better
     [Plots: MAPE (%) per group (A-D) for the baseline, fair and priority policies at λ(t) = 100, 200 and 400 evt/s]
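
For reference, the error metric used here, written out (MAPE is standard; the grouping of keywords into classes A-D follows the slide):

    def mape(exact, approximate):
        """Mean Absolute Percentage Error, in percent; lower is better."""
        pairs = list(zip(exact, approximate))
        return 100.0 / len(pairs) * sum(abs(e - a) / abs(e) for e, a in pairs)

    # e.g. exact per-keyword sentiment counts vs. the values estimated after shedding
    print(mape([120, 80, 40], [110, 85, 30]))   # ≈ 13.2 %
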
 26. Conclusions and future work
     • We discussed the main challenges of stream processing for real-time sentiment analysis
     • Fast Forward With Degradation (FFWD):
       – Heuristic controller for a bounded response time
       – Pluggable policies for domain-specific load shedding
       – Accurate computation of metrics
       – Simple load-shedding filter for fast dropping
     • Future work:
       – Controller generalization, to cope with other control metrics (e.g. CPU)
       – Predictive modeling of the arrival rate
       – Exploring different fields of application, use cases and policies
 27. Any questions?
