DEBS 2009 Presentation, July 2009
1. A Stratified Approach for Supporting High
Throughput Event Processing Applications
Geetika T. Lakshmanan, IBM T. J. Watson Research Center, gtlakshm@us.ibm.com
Yuri G. Rabinovich, IBM Haifa Research Lab, yurir@il.ibm.com
Opher Etzion, IBM Haifa Research Lab, opher@il.ibm.com
July 2009
2. Outline
Why is scalable event processing an important problem?
Some terms (EPN, EPA)
What has been done already?
Overview of our solution
– Credit-Card Scenario
– Profiling and initial assignment of nodes to strata
– Stratification
– Load Distribution Algorithm
– Algorithm optimizations and support for dynamic changes in the event processing graph
Implementation and Results
Conclusion
3. Our Goal
Devise a generic framework to maximize the overall input (and thus
output) throughput of an event processing application which is
represented as an EPN, given a specific set of resources (cluster
of nodes with varying computational power) and a traffic model.
The framework should be adaptive to changes either in the
configuration or in the traffic model.
[Figure: an event processing network (EPN). Event producers emit events into engines hosting interconnected EPAs, backed by a repository; derived events flow on to event consumers.]
4. Why is this an important problem?
The quantity of events that a single application needs to process is
constantly increasing (e.g., RFID events, massively multiplayer
online games, financial transactions).
Manual partitioning is difficult (due to semantic dependencies
between event processing agents) particularly when it is required
to be adaptive and dynamic.
5. Event Processing Agent
An event processing agent has input and output event channels.
In general it receives a collection of events as input, derives one
or more events as output and emits them on one or more output
channels.
The input channels are partitioned according to a context, which
divides the space of events into semantically relevant partitions.
[Figure: an event processing agent with input and output channels. Its specification combines a context definition with an agent type (filter, transform: translate/aggregate/split/enrich, detect pattern, route) and emits derived events.]
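The context/derivation split described above can be made concrete in a few lines. The following is a minimal hypothetical sketch (not the AMiT API); the class and field names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical minimal model of an event processing agent (EPA):
# input events are partitioned by a context function, and each
# partition is processed independently to derive output events.
@dataclass
class EventProcessingAgent:
    context_of: Callable[[dict], str]           # maps an event to its semantic partition
    derive: Callable[[List[dict]], List[dict]]  # derives output events from a partition's input
    partitions: Dict[str, List[dict]] = field(default_factory=dict)

    def consume(self, event: dict) -> List[dict]:
        """Route the event to its context partition, then run derivation."""
        key = self.context_of(event)
        self.partitions.setdefault(key, []).append(event)
        return self.derive(self.partitions[key])

# Illustrative use: detect the 3rd purchase on the same card.
agent = EventProcessingAgent(
    context_of=lambda e: e["card"],
    derive=lambda evs: [{"type": "pattern"}] if len(evs) == 3 else [],
)
agent.consume({"card": "A"})
agent.consume({"card": "A"})
out = agent.consume({"card": "A"})  # third event in context "A" fires the pattern
```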
6. Related Work
Scalability in event processing
– Scalable event processing infrastructure (e.g. Astrolabe (2003), PIER
(2003), Siena (2000)).
– Controlled input load shedding (Kulkarni et al. (2008)).
– CEP over streams (Wu et al. (2006)).
– More work needs to be done.
Numerous implementations remain centralized due to interdependencies
among event processing agents.
Synergy between stream processing and event processing.
– Distributed stream processing techniques:
• Mehta et al., 1995
• Shah et al., 2003
• Balazinska et al., 2004
• Kumar et al., 2005
• Xing et al., 2005, 2006
• Zhou et al., 2006
• Pietzuch et al., 2006
• Gu et al., 2007
• Lakshmanan et al., 2008
7. Is this a solved problem?
[Figure: the solution landscape, spanning event-at-a-time and set-at-a-time implementations: centralized stream processing implementations; centralized event processing implementations; load distribution algorithms for scalable stream processing (Shah et al., Mehta et al., Gu et al., Xing et al., Zhou et al., Liu et al., ...); and scalable event processing implementations (Astrolabe, PIER, Siena).]
8. Overview of Our Solution
Profiling
– Used to assign agents to nodes in order to maximize throughput
Stratification of EPN
– Splitting the EPN into strata layers
– Based on semantic dependencies between agents
– Distributed implementation design with event proxy to relay
events between strata
Load Distribution
– Distribute load among agents dynamically during runtime and
respect statistical load relationships between nodes
9. Distributed Event Processing Network Architecture
Input: Specification of an Event Processing Application
Output: Stratified EPN (event processing operations mapped to event processing agents)
[Figure: Stratum 1, Stratum 2, and Stratum 3 in sequence; each stratum contains several EP nodes with local DBs, fronted by an event proxy that relays events to the next stratum.]
Event Proxy receives input events and routes them to nodes in a stratum
according to the event context.
Event proxy periodically collects performance statistics per node in a
stratum.
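The proxy's routing requirement is that all events of the same context reach the same node in the stratum. One simple way to realize this is consistent hashing of the context key over the stratum's node list; this is a sketch under that assumption, with illustrative names.

```python
import hashlib

def route(context_key: str, nodes: list) -> str:
    """Deterministically map a context key to one node in the stratum,
    so every event in the same context lands on the same node."""
    digest = int(hashlib.md5(context_key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

nodes = ["ep-node-1", "ep-node-2", "ep-node-3"]
target = route("card-42", nodes)
```

In practice the proxy would also track per-node statistics and remap contexts during load migration, as described on the later slides.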
10. Stratified Event Processing Graph
1. Define the event processing application in the form of an Event
Processing Network Dependency Graph
G=(V,E) (directed edges from event source to event target)
2. Overview of Stratification Algorithm
Create partitions by finding subgraphs that are independent in
the dependency graph.
For each subgraph, construct a network of EPAs.
Push filters to the beginning of the network to filter out irrelevant
events.
Iterate through the graph and identify areas of strict interdependence
(i.e., subgraphs with no connecting edges between them).
For each subgraph, define stratum levels.
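The stratum-level step above amounts to layering the dependency DAG: each agent's stratum is one more than the deepest stratum among its predecessors. This is a sketch of that layering, not the authors' implementation; names are illustrative and the graph is assumed acyclic.

```python
from collections import defaultdict

def stratify(edges):
    """Assign each agent a stratum level from directed edges
    (source agent -> target agent) of the dependency graph."""
    preds = defaultdict(set)
    agents = set()
    for src, dst in edges:
        preds[dst].add(src)
        agents.update((src, dst))
    stratum = {}
    # Agents with no predecessors form stratum 1; peel layer by layer.
    frontier = {a for a in agents if not preds[a]}
    level = 1
    while frontier:
        for a in frontier:
            stratum[a] = level
        frontier = {a for a in agents
                    if a not in stratum and preds[a] <= stratum.keys()}
        level += 1
    return stratum

# Credit-card example: filter -> pattern detector -> action gives 3 strata.
strata = stratify([("Amount>100", "5 Occurrences"),
                   ("5 Occurrences", "Give Discount")])
```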
[Figure: the credit-card dependency graph (purchase and cancel rules, detailed on the next slide), shown at successive steps of the stratification.]
11. Example: Credit Card Scenario
Event Processing Dependency Graph
[Figure: event processing dependency graph for the credit-card scenario. Purchases with Amount > 100 become High Volume Purchase events; more than 5 occurrences within 1 hour triggers Give Discount to Company; a Cancel that follows the Discount produces Discount Canceled. Cancels with Amount > 100 become High Volume Cancel events; more than 3 occurrences within 1 hour triggers Cancel Discount to Company.]
Stratification algorithm
[Figure: the stratified event processing graph. Stratum 1 holds the Amount > 100 filters on Purchase and Cancel events; Stratum 2 holds the occurrence-counting pattern detectors (more than 5 purchases / more than 3 cancels within 1 hour); Stratum 3 holds the resulting discount actions, with the "cancel follows discount" rule downstream of the discount.]
12. Assigning Nodes to Each Stratum
Goal: executing at a user-set percentage of their capacity, nodes in a
stratum can process all of the incoming events at their stratum level in
parallel under peak event traffic conditions.
– Assume agents in a single stratum are replicated on all nodes in
that stratum.
Overall strategy:
– Profiling nodes. Determine maximum event processing capability of available
nodes by observing performance under synthetic workload.
– Compute ratio of events split between nodes for first stratum.
– Determine number of nodes to assign to stratum.
– Repeat for next stratum, and next, until done.
13. Assigning Nodes to Each Stratum
– ti : user-set percentage of node i's capacity
– mi : rate of all incoming events to the stratum (events/sec)
– ri : maximum possible event processing rate of node i (events/sec)
– di : maximum possible derived event production rate of node i (events/sec)
Formulas (stratum n):
Percentage of the event stream directed to node i: ((ti * ri) / mi) * 100
Derived event production rate of the stratum n nodes (the input rate to stratum n+1): mi * (di / ri)
Example: incoming event rate mi = 200,000 ev/sec; processing capacity of node i: ri = 36,000 ev/sec; ti = 0.95.
Percentage of the event stream directed to node i: ((0.95 * 36,000) / 200,000) * 100 = 17.1%. Thus, 6 nodes will be needed in this stratum.
If (di / ri) = 0.5, the total rate of derived events created by the stratum n nodes, and hence the input rate to stratum n+1, is 200,000 * 0.5 = 100,000 events/sec.
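The arithmetic above can be checked with a short script. This is a direct transcription of the slide's formulas and example values; the function name is illustrative.

```python
import math

def nodes_needed(m, r, t):
    """Number of nodes a stratum needs: each node absorbs (t*r)/m of the
    incoming stream, so add nodes until the shares cover 100%."""
    share = (t * r) / m
    return math.ceil(1.0 / share)

m, r, t = 200_000, 36_000, 0.95
share_pct = (t * r) / m * 100   # percentage of the stream one node can take
count = nodes_needed(m, r, t)
derived_rate = m * 0.5          # with d/r = 0.5, input rate to stratum n+1
```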
14. Overview of Dynamic Load Distribution Algorithm
Statistics collected by event Proxy:
– Number of input events processed by execution of agents in a
particular context
– Number of derived events produced by the execution of agents in
this context
– Number of different agent executions evaluated in this context
– Total amount of latency to evaluate all agents executed in this
context
For these statistics, event proxy maintains a time series, and computes
statistics such as mean, standard deviation, covariance and correlation
coefficient (between agents on the same node, and between contexts
for the same agent).
These statistics dictate the choice of load donor and recipient nodes.
The definition of load is purposely generic so that it can incorporate
application priorities and preferences.
15. Overview of Dynamic Load Distribution Algorithm
[Figure: strata n and n+1, each containing several engine queues running AMiT, with the EP proxy relaying events between them.]
The event proxy collects statistics, maintains a time series, and makes
the following decisions:
1. Identify the most heavily loaded node in a stratum (donor node).
2. Identify a heavy context to migrate from the donor node (also using load
correlation as a guiding factor).
3. Identify a recipient node for the migrated load.
4. Estimate the post-migration utilization of the donor and recipient nodes. If the
post-migration utilization of the recipient node is unsatisfactory, go back to step 3 and
identify a new recipient node. If the post-migration utilization of the donor node is
unsatisfactory, go back to step 2 and identify a new context to migrate.
5. Execute the migration and wait for a time interval of length x. Go to step 1.
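Steps 1 to 4 of this decision loop can be sketched as follows. This is a simplified illustration under stated assumptions (load correlation and the donor-side threshold are omitted); all names and the example loads are hypothetical.

```python
def pick_migration(node_loads, max_util=0.85):
    """Pick (donor, context, recipient) for one migration round.
    node_loads: {node: {context: load}}, load in utilization units [0, 1]."""
    util = {n: sum(ctxs.values()) for n, ctxs in node_loads.items()}
    donor = max(util, key=util.get)  # step 1: most heavily loaded node
    # step 2: try the donor's contexts, heaviest first
    for ctx in sorted(node_loads[donor], key=node_loads[donor].get, reverse=True):
        # step 3: try recipients, least loaded first
        for recip in sorted(util, key=util.get):
            if recip == donor:
                continue
            # step 4: accept only if the recipient stays under threshold
            if util[recip] + node_loads[donor][ctx] <= max_util:
                return donor, ctx, recip
    return None  # no acceptable migration this round

plan = pick_migration({
    "n1": {"c1": 0.5, "c2": 0.4},   # overloaded donor
    "n2": {"c3": 0.2},              # lightly loaded recipient
})
```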
16. Post Migration Utilization Calculation
We need to determine whether this migration will lead to overload; if it
triggers further migrations, the system will become unstable. Therefore we
compute the post-migration utilization of the donor and recipient machines.
If the average event arrival rate in time period t for context c is λ(c), and the
average latency to evaluate context c is p(c), then the load of this context in
time period t can be defined as λ(c) * p(c).
Thus the post-migration utilization Ud' of the donor machine and Ur' of the
recipient machine after migrating a context c1, where nd and nr are the
total number of contexts on the donor and recipient respectively, is:

Ud' = Ud * (1 - (λ(c1) * p(c1)) / Σ_{i=1..nd} λ(ci) * p(ci))
Ur' = Ur * (1 + (λ(c1) * p(c1)) / Σ_{i=1..nr} λ(ci) * p(ci))
Post migration utilization of donor and recipient nodes must be less than
preset quality thresholds.
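The two formulas above transcribe directly into code. This is a minimal sketch; the function name, data layout, and example rates/latencies are hypothetical.

```python
def post_migration(U_d, U_r, donor, recipient, c1):
    """Post-migration utilizations after context c1 moves donor -> recipient.
    donor/recipient: {context: (lam, p)} with arrival rate lam and latency p."""
    load = lambda ctxs: sum(lam * p for lam, p in ctxs.values())
    lam1, p1 = donor[c1]
    Ud_new = U_d * (1 - lam1 * p1 / load(donor))      # donor sheds c1's share
    Ur_new = U_r * (1 + lam1 * p1 / load(recipient))  # recipient gains it
    return Ud_new, Ur_new

# Example: c1 is a quarter of the donor's load, half the recipient's.
donor = {"c1": (100, 0.002), "c2": (300, 0.002)}   # context loads 0.2 and 0.6
recipient = {"c3": (200, 0.002)}                    # context load 0.4
Ud2, Ur2 = post_migration(0.8, 0.4, donor, recipient, "c1")
```

Both resulting utilizations would then be checked against the preset quality thresholds before executing the migration.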
17. Implementation
Used nodes running IBM Active Middleware Technology (AMiT), a CEP
engine that serves as a container for event processing agents.
Event processing scenario: credit card scenario
Node hardware characteristics:
– Type 1: Dual Core AMD Opteron 280, 2.4 GHz, 1 GB memory.
– Type 2: Intel Pentium D, 3.6 GHz, 2 GB memory.
– Type 3: Intel Xeon, 2.6 GHz, 2 GB memory.
[Figure: the stratified credit-card event processing graph (as on slide 11), deployed across the three strata.]
18. Results
[Chart: input event processing rate (events/sec) of stratified versus partitioned event processing networks for 1:1:1 = 3 machines, 2:2:1 = 5 machines, and 5:4:1 = 10 machines; bar values include 81,000; 90,000; 150,000; 162,000; 300,000; 398,000. Centralized baseline = 30,000 events/sec.]
Synthetic workload: consists strictly of events that trigger the generation of
derived events. Number of nodes: 3, 5, 10; a heterogeneous mix of Types 1, 2 and 3.
Ratios are selected to be "optimal."
Y-axis: the maximum input event processing rate, computed as the sum of
the average input event processing rates of all nodes in the network.
Illustrates the maximum performance that can be achieved by the event
processing network when it is overloaded.
19. Results
[Chart: derived event production rate (events/sec) of stratified versus partitioned event processing networks for the same three configurations; bar values include 4,500; 7,500; 9,000; 15,000; 21,000. Centralized baseline = 1,500 events/sec.]
Synthetic workload: consists strictly of events that trigger the generation of
derived events. Number of nodes: 3, 5, 10; a heterogeneous mix of Types 1, 2 and 3.
Ratios are selected to be "optimal."
Y-axis: the maximum derived event production rate, computed as the sum of
the average derived event production rates of all nodes in the network.
20. Results
[Chart: percentage improvement in performance of the stratified network relative to a partitioned network, in input event processing rate and derived events rate, at 100% and 12.5% of events participating in derived event production (5:4:1 configuration); values include 32.67%, 40.00%, -32.08%, and -31.75%.]
A stratified network of ten nodes, where the proportion of nodes in the three strata is
5:4:1, compared with a ten-node partitioned network.
All nodes used for this experiment are of Type 1.
Illustrates how changing the proportion of input events that participate in derived
event production in the first stratum level impacts the input event processing rate
and derived event production rate of the entire system.
21. Results
[Chart: average input event processing rate per node (events/sec) in a stratified network, comparing the fixed 5:4:1 ratio against the optimal ratio for each workload: 100% - 5:4:1, 50% - 6:3:1, 25% - 8:3:1, 12.5% - 11:3:1; values include 34,438; 34,813; 37,375; 39,800; 44,100; 49,083; 52,800.]
Compares the average input event processing rate per node of a ten-node
stratified network whose nodes are distributed 5:4:1 among the three strata
against an optimal assignment of nodes to strata.
Demonstrates that reconfiguring the system with the optimal ratio of nodes per
stratum improves performance and reacts effectively to changes in the
proportion of input events that participate in derived event production in the first
stratum level.
22. Results
[Charts: (left) total throughput (events/sec) since the system started event processing, over 0-1200 sec, for our dynamic load distribution versus no load distribution; (right) mean throughput of our load distribution compared with Largest Load First (LLF), Random, and no load distribution for 5-30 nodes.]
Load is defined as the total number of agent executions for a particular context.
0.5: load threshold for a node to initiate load distribution.
0.1: threshold for a context's contribution to the percentage of the total load on a node, where this
context has the highest load correlation coefficient with respect to the remaining contexts on the same
node.
0.85: acceptable post-migration utilization of a recipient node; 0.1 is the threshold for the percentage
decrease in utilization of a donor node to warrant a migration.
Periodically fluctuating workload.
23. Support for Dynamic Changes in EP Graph
Our algorithm supports:
– Addition of a new connected subgraph to the existing EPN.
– Addition of an agent to the EPN graph.
– Deletion of agents from the graph.
– Failure of one or more nodes in a stratum level.
The algorithm is also amenable to agent-level optimizations (e.g.,
coalescing of neighboring agents).
24. Conclusion and Future Work
We demonstrate a novel architecture for distributed event
processing that maximizes the throughput of the event
processing system, and a stratification algorithm to partition an
event processing application onto a distributed set of nodes.
The experimental results illustrate the effectiveness of the
stratification technique for achieving an initial partitioning of the
event processing graph in a distributed event processing system
that anticipates a high volume of agent triggering events.
Performance of a stratified network can be improved during
runtime with the dynamic load distribution algorithm.
Future Work:
– Investigate high availability
– Techniques for optimizing stateful load migration between nodes
dynamically during runtime.
– Investigate variations of stratification (currently in IBM HRL)
26. Goal of Implementation
Explore the benefits of event processing on a stratified vs. a centralized
(single node) vs. a partitioned network (a single stratum in which
load is distributed according to context) when the system is under
heavy load, i.e., when the number of incoming events that trigger the
generation of derived events increases.
Compare stratification with the partitioned approach when the system is
not heavily loaded.
Explore the effectiveness and scalability of the load distribution
algorithm.