DEBS 2011: Pattern Rewriting for Event Processing Optimization



DEBS 2011 presentation
Pattern Rewriting for event processing optimization by Ella Rabinovich, Opher Etzion and Avigdor Gal

  1. 1. Pattern Rewriting Framework for Event Processing Optimization
     Ella Rabinovich, Opher Etzion, Avigdor Gal
  2. 2. Motivation
     • Previous studies indicate that there is a major performance degradation as application complexity increases.
     • Goal: optimize complex scenarios.
     References:
     • Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases, Volume 13, Issue 2, 2004.
     • Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.
  3. 3. Optimization tools
     • Blackbox optimizations: distribution, parallelism, scheduling, load balancing, load shedding
     • Whitebox optimizations: implementation selection, implementation optimization, pattern rewriting (our focus)
  4. 4. An example of a complex scenario
     Events: E1, E2, E3, …, E15, E16
     • A process has 16 steps that have to be executed in a predefined order; termination of each step creates an event with a status code (SC).
     • The process is reported as committed when the 16 steps have completed in the correct order (sequence pattern) and the pattern assertion is satisfied.
     • The assertion may look like: E1.SC == E2.SC OR E3.SC < 4
     • For this scenario we achieved a more than tenfold decrease in latency, or a more than 20% increase in throughput.
  5. 5. Pattern Rewriting Approach
     • The goal: create an equivalent pattern that provides better performance.
     • Rewriting techniques exist in other domains, such as rule systems and SQL queries.
     • Due to the inherent complexity of event processing patterns, there are some unique challenges.
     • Subsumption of a common logic: seq(E1,E2,E3,E4) and seq(E1,E2,E3,E5,E6) are rewritten as seq(E1,E2,E3), which derives the event DE, followed by seq(DE,E4) and seq(DE,E5,E6).
     • Splitting for parallel execution: all(E1,E2,E3,E4) is rewritten as all(E1,E2), which derives DE1, and all(E3,E4), which derives DE2, followed by all(DE1,DE2).
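The subsumption rewrite on this slide can be sketched in a few lines of Python. This is only an illustration of the structural step: the function name `factor_common_prefix` and the list-of-names representation are assumptions made here, and the sketch ignores pattern assertions and policies, which the following slides show to be the hard part.

```python
def factor_common_prefix(p1, p2, derived="DE"):
    """Factor the shared leading steps of two sequence patterns.

    p1 and p2 are lists of event type names in sequence order.
    Returns (shared, rest1, rest2), or None when fewer than two
    leading steps match and there is nothing worth factoring out.
    """
    shared = []
    for a, b in zip(p1, p2):
        if a != b:
            break
        shared.append(a)
    if len(shared) < 2:
        return None
    k = len(shared)
    # The shared sub-pattern derives `derived`, which then opens both rests.
    return shared, [derived] + p1[k:], [derived] + p2[k:]


rewritten = factor_common_prefix(
    ["E1", "E2", "E3", "E4"],
    ["E1", "E2", "E3", "E5", "E6"],
)
print(rewritten)
# (['E1', 'E2', 'E3'], ['DE', 'E4'], ['DE', 'E5', 'E6'])
```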
  6. 6. Challenges: Assertion Split
     A pattern assertion (PA) is a predicate that an event collection needs to satisfy for the pattern to be matched.
     • Example 1: seq(E1,E2,E3) with PA: E1.SC == E2.SC OR E3.SC < 4, split into seq(E1,E2) with PA’: E1.SC == E2.SC, deriving DE, and seq(DE,E3) with PA’’: E3.SC < 4.
       The direct connection of the two patterns implies an “AND” operator between PA’ and PA’’, so the original “OR” is not preserved by this split.
     • Example 2: seq(E1,E2,E3) with PA: E1.SC == E3.SC AND E2.SC == 0, split into seq(E1,E3) with PA’: E1.SC == E3.SC, deriving DE, and seq(DE,E2) with PA’’: E2.SC == 0.
       The assertion should be separable in terms of its variables.
  7. 7. Assertion Split – Solution
     • Convert the pattern assertion expression into conjunctive normal form (CNF).
     • Identify independent sub-groups of participants by generating the dependency graph of the assertion variables.
     • The maximal number of independent partitions implies the finest granulation of the assertion expression.
     Example (participants E1 to E6):
       Assertion: (E1.SC > E2.SC) AND (E4.SC > E5.SC) AND NOT ((E5.SC == E6.SC) AND (E3.SC == 77))
       CNF:       (E1.SC > E2.SC) AND (E4.SC > E5.SC) AND (NOT(E5.SC == E6.SC) OR NOT(E3.SC == 77))
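One way to picture the dependency-graph step is as a connected-components computation: two participants are linked when they appear in the same CNF clause, and each component is an independent partition. The sketch below assumes the assertion has already been converted to CNF and that each clause arrives with the set of participants it mentions; the function name `independent_partitions` and the union-find implementation are illustrative choices, not taken from the presentation.

```python
from collections import defaultdict


def independent_partitions(clauses):
    """Group CNF clauses whose participants are transitively connected.

    clauses: list of (clause_text, set_of_participant_names).
    Returns a sorted list of (participants, clause_texts) pairs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Participants that share a clause belong to the same partition.
    for _, participants in clauses:
        names = sorted(participants)
        for other in names[1:]:
            union(names[0], other)

    groups = defaultdict(lambda: {"events": set(), "clauses": []})
    for text, participants in clauses:
        if not participants:
            continue  # a clause without participants constrains no split
        root = find(sorted(participants)[0])
        groups[root]["events"] |= set(participants)
        groups[root]["clauses"].append(text)
    return sorted((sorted(g["events"]), g["clauses"]) for g in groups.values())


# The CNF clauses from the slide's example.
cnf = [
    ("E1.SC > E2.SC", {"E1", "E2"}),
    ("E4.SC > E5.SC", {"E4", "E5"}),
    ("NOT(E5.SC == E6.SC) OR NOT(E3.SC == 77)", {"E3", "E5", "E6"}),
]
partitions = independent_partitions(cnf)
for events, texts in partitions:
    print(events, texts)
```

For the slide's example this yields two partitions, {E1, E2} and {E3, E4, E5, E6}: the second and third clauses share E5, so they cannot be evaluated in separate sub-patterns.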
  8. 8. Pattern Matching - Policies
     Pattern: seq(PG, ATM-W) within 10 minutes
     • Instance selection policy. Input: PG1, PG2, ATM-W1. Which of the PG instances is matched with ATM-W1?
     • Cardinality policy. Input: PG1, PG2, ATM-W1, ATM-W2. ATM-W1 gives the first detection; does ATM-W2 give an additional detection?
     • Consumption policy. Input: PG1, PG2, ATM-W1, ATM-W2. After the first detection, are instances consumed?
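The effect of the instance selection policy can be illustrated with a deliberately small matcher for seq(PG, ATM-W). Everything here is an assumption made for illustration: the function `match_seq`, the timestamps in seconds, the reading of "within 10 minutes" as a maximum gap between the two events, and the fact that the matcher stops after one detection (that is, it behaves like a single cardinality policy).

```python
def match_seq(events, selection="first", window_s=600):
    """Match seq(PG, ATM-W) once and return (pg_name, atm_name) or None.

    events: list of (name, event_type, timestamp_s) in arrival order.
    selection: "first" or "last", deciding which PG candidate is used.
    """
    candidates = []
    for name, event_type, ts in events:
        if event_type == "PG":
            candidates.append((name, ts))
        elif event_type == "ATM-W":
            # Keep only PG instances that are still inside the window.
            live = [(n, t) for n, t in candidates if ts - t <= window_s]
            if live:
                chosen = live[0] if selection == "first" else live[-1]
                return chosen[0], name
    return None


stream = [("PG1", "PG", 0), ("PG2", "PG", 120), ("ATM-W1", "ATM-W", 300)]
first_match = match_seq(stream, selection="first")
last_match = match_seq(stream, selection="last")
print(first_match)  # ('PG1', 'ATM-W1')
print(last_match)   # ('PG2', 'ATM-W1')
```

The same input produces different matches under the two policies, which is why the policies have to be carried over carefully when a pattern is rewritten.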
  9. 9. Challenges: Policies Mapping
     • A naïve pattern split that keeps the original policies in the rewritten version will result in incorrect matching:
       original: seq(E1,E2,E3) {single, last, …}
       rewritten: seq(E1,E2) {single, last, …}, deriving DE, followed by seq(DE,E3) {single, last, …}
     [Figure: two timelines with the events e1.1, e2.1 (blood pressure measure), e2.2 (blood pressure measure) and e3.1, with the detection points marked]
  10. 10. Policies Mapping – Solution
      Mapping of policies in the rewritten alternative (f2’ + f2’’), based on the original pattern (f1), where f1 = seq(E1,E2,E3), f2’ = seq(E1,E2) and f2’’ = seq(DE,E3) + pattern assertion extensions.

      policy               original (f1)   rewritten (f2’)   rewritten (f2’’)
      Cardinality          single          unrestricted      single
      Instance selection   last            last              last
      Consumption          -               reuse             -

      policy               original (f1)   rewritten (f2’)   rewritten (f2’’)
      Cardinality          unrestricted    unrestricted      unrestricted
      Instance selection   last            each              last
      Consumption          consume         reuse             consume
  11. 11. Equivalence assurance
      • Denotational semantics approach: an event processing pattern is a function (f) mapping the pattern’s input (participant set, PS) into its output (matching set, MS). We formally demonstrate that for the same PS both alternatives produce the identical MS:
        f1(PS, …) == f2’’( (f2’(PS’, …) ∪ PS’’), …)  ∀ PS
      • Original: seq(E1, …, EN) with PA, Policies; participant set (PS) in, matching set (MS) out.
      • Rewritten: seq(E1, …, EK) with PA’, Policies’, followed by seq(DE, EK+1, …, EN) with PA’’, Policies’’; participant set (PS) in, matching set (MS) out.
  12. 12. Throughput vs. Latency Tradeoff
      • Pattern throughput is the average rate of events the pattern can process.
      • Detecting event latency is the delay between the last input event that causes a pattern detection and the detection itself, which results in the derivation of an output event.
      • Example: seq(E1,E2,E3) produces derived event DE.
        Detecting event latency = DE.detection_time - E3.detection_time
        DE.detection_time: time DE was detected by the system
        E3.detection_time: time E3 arrived at the system
      [Figure: seq(E1, …, EN) compared with its split into seq(E1, …, EN-2) and seq(DE, EN-1, EN); axes: throughput, latency; labels: lazy evaluation, eager evaluation]
  13. 13. Bi-objective Performance Optimization
      • Define a bi-objective performance function
        • Assign a scalar weight to each objective to be optimized
          • Weight of α to pattern throughput (th)
          • Complementary weight (1-α) to the detecting event latency (lt)
        • Minimize the goal function of the form:
          g = α*lt + C*(1-α)*(1/th)
      • Simulation-based approach to select the optimal rewriting alternative (minimizing the goal function g)
        • For a set of rewriting alternatives A = {A1, …, AK}, find
          argmin_Ai (g)
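A minimal sketch of the selection step follows, assuming latency and throughput have already been measured for each alternative (the presentation obtains them by simulation). The goal function is implemented exactly as the formula is written above, with the weight multiplying lt and its complement multiplying C/th. The measurements, the value of the scaling constant C, and the names `goal` and `best_alternative` are all hypothetical.

```python
def goal(lt, th, alpha, c=1.0):
    """g = alpha*lt + C*(1 - alpha)*(1/th), following the slide's formula."""
    return alpha * lt + c * (1.0 - alpha) * (1.0 / th)


def best_alternative(alternatives, alpha, c=1.0):
    """Return the name of the alternative that minimizes g."""
    return min(alternatives, key=lambda name: goal(*alternatives[name], alpha, c))


# Hypothetical measurements: name -> (latency in ms, throughput in events/s).
alternatives = {
    "A1": (200.0, 100.0),
    "A2": (50.0, 150.0),
    "A3": (20.0, 80.0),
}

latency_only = best_alternative(alternatives, alpha=1.0)
throughput_only = best_alternative(alternatives, alpha=0.0)
balanced = best_alternative(alternatives, alpha=0.5, c=10000.0)
print(latency_only, throughput_only, balanced)  # A3 A2 A2
```

With the weight at 1 only latency counts and A3 wins; with the weight at 0 only throughput counts and A2 wins. The constant C matters for intermediate weights because lt and 1/th live on very different scales.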
  14. 14. Experimental Results
      Simulation results for seq(E1, …, E16), split of pairs

      #   rewriting   throughput (event/s)   detected event latency (ms)   note
      1   0 : 8       140                    260                           the base pattern; not in the Pareto frontier
      2   1 : 7       110                    189                           not in the Pareto frontier
      3   2 : 6       142                    174                           not in the Pareto frontier
      4   3 : 5       155                    147                           not in the Pareto frontier
      5   4 : 4       165                    95                            not in the Pareto frontier
      6   5 : 3       172                    63                            Pareto frontier; max throughput
      7   6 : 2       163                    32                            Pareto frontier
      8   7 : 1       95                     15                            Pareto frontier; min latency

      Rows 2 to 8 are rewritten patterns. [Figure labels: lazy, eager]
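The Pareto frontier for the simulation figures on this slide can be recomputed with a short dominance check. The sketch assumes the two objectives are to maximize throughput and minimize latency; the function name `pareto_frontier` and the tuple layout are illustrative.

```python
# (rewriting, throughput in events/s, detected event latency in ms),
# copied from the results on this slide.
results = [
    ("0:8", 140, 260),
    ("1:7", 110, 189),
    ("2:6", 142, 174),
    ("3:5", 155, 147),
    ("4:4", 165, 95),
    ("5:3", 172, 63),
    ("6:2", 163, 32),
    ("7:1", 95, 15),
]


def pareto_frontier(rows):
    """Return the rewritings that no other row dominates.

    A row is dominated when another row is at least as good on both
    objectives and strictly better on at least one of them.
    """
    frontier = []
    for name, th, lt in rows:
        dominated = any(
            o_th >= th and o_lt <= lt and (o_th > th or o_lt < lt)
            for _, o_th, o_lt in rows
        )
        if not dominated:
            frontier.append(name)
    return frontier


frontier = pareto_frontier(results)
print(frontier)  # ['5:3', '6:2', '7:1']
```

The base pattern (0 : 8) is dominated by the 5 : 3 split, which has both higher throughput and lower latency.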
  15. 15. Future Work
      • Pattern rewriting framework for event processing optimization
        • With a more than tenfold improvement in latency between the original pattern and its rewritten alternative
      • Future research and practical activities
        • Investigation of additional rewritings
          • Using patterns of the same type (e.g., for the all pattern)
          • Additional methods for rewriting (e.g., seq using all and filter agents)
        • Elaborating an algorithm for event processing network rewriting
        • Exploring a heuristic-based approach for selecting the rewriting alternative of the sequence pattern