Introducing the WSO2 
Complex Event Processor 
Simplifying Complexities of Data Processing 
S. Suhothayan 
Software Engineer, 
Data Technologies Team.
Outline 
ƒ Introduction to CEP 
ƒ WSO2 CEP Server 
ƒ Siddhi Runtime 
ƒ HA & Scalability of WSO2 CEP 
ƒ WSO2 CEP server and WSO2 BAM 
ƒ Use Cases
Event Processing (Contd.) 
ƒ Event processing is about listening to events and 
detecting patterns in near real-time without storing 
all events. 
ƒ Three models 
o Simple Event Processing 
- Simple filters (e.g. Is this a gold or platinum customer?) 
o Event Stream Processing 
- Looking across multiple event streams and joining 
multiple event streams, etc. 
o Complex Event Processing 
- Processing multiple event streams to identify meaningful 
patterns, using complex conditions & temporal windows 
- E.g. There has been a more than 10% increase in overall 
trading activity AND the average price of commodities 
has fallen 2% in the last 4 hours
Complex Event Processing 
ƒ We categorize events into different streams 
ƒ Process with minimal storage 
ƒ Use queries to evaluate the continuous event 
streams (Usually SQL like query language) 
ƒ Very fast results (in milliseconds range)
CEP Queries 
ƒ The main types of queries are: 
o Filters and Projection 
o Windows – events are processed within temporal 
windows (e.g. for aggregation and joins). 
Time window vs. length window. 
o Ordering – identify event sequences and patterns 
(e.g. a credit card used at a new location, followed by 
a small and then a large purchase, might suggest fraud) 
o Joins – join two streams
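The difference between a time window and a length window can be sketched in plain Java. This is only an illustration of the idea, not the API or implementation of Siddhi, Esper or any other engine; the class and method names (SlidingWindows, LengthWindow, TimeWindow, add) are hypothetical. A length window expires events by count, while a time window expires them by age.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: plain-Java sliding windows, not the API of any CEP engine. */
public class SlidingWindows {

    /** Keeps the last `capacity` values and reports their running average. */
    public static final class LengthWindow {
        private final int capacity;
        private final Deque<Double> values = new ArrayDeque<>();
        private double sum;

        public LengthWindow(int capacity) {
            this.capacity = capacity;
        }

        public double add(double value) {
            values.addLast(value);
            sum += value;
            if (values.size() > capacity) {
                sum -= values.removeFirst(); // oldest event expires by count
            }
            return sum / values.size();
        }
    }

    /** Keeps values that arrived less than `spanMillis` before the newest event. */
    public static final class TimeWindow {
        private static final class Entry {
            final long timestamp;
            final double value;

            Entry(long timestamp, double value) {
                this.timestamp = timestamp;
                this.value = value;
            }
        }

        private final long spanMillis;
        private final Deque<Entry> entries = new ArrayDeque<>();
        private double sum;

        public TimeWindow(long spanMillis) {
            this.spanMillis = spanMillis;
        }

        public double add(long timestamp, double value) {
            entries.addLast(new Entry(timestamp, value));
            sum += value;
            // Expire by age: drop everything at least spanMillis older than the newest event.
            while (entries.peekFirst().timestamp <= timestamp - spanMillis) {
                sum -= entries.removeFirst().value;
            }
            return sum / entries.size();
        }
    }
}
```

In both cases only the events currently inside the window are held in memory, which is how aggregation works without storing all events.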
Example Query 
from p=PINChangeEvents#window.time(3600) join 
t=TransactionEvents[amount>10000]#window.time(3600) 
on p.custid==t.custid 
return t.custid, t.amount;
Opensource CEP Runtimes 
ƒ Siddhi 
o Apache License, a Java library, tuple-based event 
model 
o Supports distributed processing 
o Supports multiple query models 
- Based on a SQL-like language 
- Filters, Windows, Joins, Ordering and others 
ƒ Esper, http://esper.codehaus.org 
o GPLv2 License, a Java library, Events can be XML, Map, 
Object 
o Supports multiple query models 
- Based on a SQL-like language 
- Filters, Windows, Joins, Ordering and others 
ƒ Drools Fusion 
o Apache License, a Java library 
o Support for temporal reasoning + windows
WSO2 CEP Server 
ƒ Enterprise grade server for CEP runtimes 
ƒ Provides support for several transports 
(network access) and data formats 
o SOAP/WS-Eventing – XML messages 
o REST/JSON – JSON messages 
o JMS – map messages, XML messages 
o Thrift – WSO2 data bridge format 
- High-performance event capture and delivery framework; 
supports Java/C/C++/C# via Thrift language bindings. 
ƒ Supports multiple CEP runtimes 
o Siddhi – WSO2, new, very fast, distributed 
o Esper - well known CEP runtime 
o Drools Fusion – rule based, but much slower 
ƒ Easy to plug in new brokers and new CEP engines
WSO2 CEP Server (Contd.) 
[Architecture diagram; the only recoverable label is "File System"]
CEP Buckets 
ƒ CEP Bucket is a 
logical execution 
unit 
ƒ Each CEP bucket has 
set of queries, 
event sources and 
input, output event 
mappings. 
ƒ Each bucket maps one-to-one 
to a CEP engine
Management UI 
ƒ To define 
buckets 
ƒ Update running 
queries without 
resetting 
current 
execution 
states 
ƒ Manage brokers 
(data adapters)
Developer Studio UI 
ƒ Eclipse based 
tool to define 
buckets 
ƒ Can manage 
the 
configurations 
through the 
production 
lifecycle
Siddhi Complex 
Event Processing 
Engine
Big Picture 
ƒ Users provide query/queries 
ƒ Map event streams to queries 
ƒ Siddhi keeps the queries running and invokes 
callbacks registered against one or more 
queries/streams 
ƒ Example Query 
from cseEventStream[symbol == 'IBM']#win.time(50000) 
insert into IBMStockQuote symbol, avg(price) as avgPrice
Siddhi High Level Architecture
Siddhi Queries: Filters 
from <stream-name> [<conditions>]* 
insert into <stream-name> 
ƒ Filters the events by conditions 
ƒ Conditions 
o >, <, ==, >=, <=, != 
o contains 
o and, or, not 
ƒ Example 
from cseEventStream[price >= 20 and symbol == 'IBM'] 
insert into StockQuote symbol, volume
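What this filter query does to each event can be sketched in plain Java: keep events that satisfy the condition, then project the selected attributes into the output stream. This is a sketch of the semantics only, not Siddhi's API; the Quote class, the filterAndProject method and the "symbol:volume" output format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: filter plus projection over a list of events. */
public class FilterSketch {

    /** Hypothetical event type standing in for one cseEventStream tuple. */
    public static final class Quote {
        final String symbol;
        final double price;
        final long volume;

        public Quote(String symbol, double price, long volume) {
            this.symbol = symbol;
            this.price = price;
            this.volume = volume;
        }
    }

    /** Applies [price >= 20 and symbol == 'IBM'] and projects symbol and volume. */
    public static List<String> filterAndProject(List<Quote> events) {
        List<String> stockQuote = new ArrayList<>();
        for (Quote q : events) {
            if (q.price >= 20 && "IBM".equals(q.symbol)) {
                // Projection: only symbol and volume reach the output stream.
                stockQuote.add(q.symbol + ":" + q.volume);
            }
        }
        return stockQuote;
    }
}
```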
Window 
from <stream-name> [<conditions>]#window.<window-name>(<parameters>) 
insert [<output-type>] into <stream-name> 
ƒ Types of Windows 
o (Time | Length) (Sliding| Batch) windows 
o Unique window, First unique (not supported in 1.0) 
ƒ Types of aggregate functions 
o sum, avg, max, min 
ƒ Example 
from cseEventStream[price >= 20]#window.lengthBatch(50) 
insert expired-events into StockQuote 
symbol, avg(price) as avgPrice 
group by symbol 
having avgPrice>50
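The batch-window example above can be approximated in plain Java as follows. This is a simplified sketch, not Siddhi's implementation: it computes the per-symbol average once each time a batch fills and ignores the distinction between current and expired output events. The class name LengthBatchSketch and its parameters are hypothetical; the batch size is configurable so that a small batch can be used when trying it out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: length-batch window with group-by and having. */
public class LengthBatchSketch {
    private final int batchSize;
    private final double minAvg;
    private final List<String> symbols = new ArrayList<>();
    private final List<Double> prices = new ArrayList<>();

    public LengthBatchSketch(int batchSize, double minAvg) {
        this.batchSize = batchSize;
        this.minAvg = minAvg;
    }

    /**
     * Feeds one event. Returns symbol -> avgPrice for a batch that has just
     * filled (after the having filter), or an empty map while it is filling.
     */
    public Map<String, Double> onEvent(String symbol, double price) {
        Map<String, Double> out = new LinkedHashMap<>();
        if (price < 20) {
            return out; // filter [price >= 20] rejects the event before the window
        }
        symbols.add(symbol);
        prices.add(price);
        if (symbols.size() < batchSize) {
            return out; // batch not full yet
        }
        // group by symbol: accumulate {sum, count} per symbol
        Map<String, double[]> acc = new LinkedHashMap<>();
        for (int i = 0; i < symbols.size(); i++) {
            double[] a = acc.computeIfAbsent(symbols.get(i), k -> new double[2]);
            a[0] += prices.get(i);
            a[1] += 1;
        }
        symbols.clear();
        prices.clear();
        // having avgPrice > minAvg
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            double avg = e.getValue()[0] / e.getValue()[1];
            if (avg > minAvg) {
                out.put(e.getKey(), avg);
            }
        }
        return out;
    }
}
```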
Join 
from <stream>#<window> [unidirectional] join <stream>#<window> 
on <condition> within <time> 
insert into <stream> 
ƒ Join two streams based on a condition and window 
ƒ Join can be in multiple forms ((left|right|full outer) | 
inner) join - only inner is supported in 1.0 
ƒ Unidirectional – only events arriving on the 
unidirectional stream trigger the join 
ƒ Example 
from TickEvent[symbol == 'IBM']#win.length(2000) 
join NewsEvent#win.time(500) 
insert into JoinStream *
Pattern 
from [every] <condition> -> [every] <condition> … <condition> 
within <time> 
insert into StockQuote (<attribute-name>* | * ) 
ƒ Checks whether condition A happens before/after condition B 
ƒ Can do iterative checks via the “every” keyword. 
ƒ With “within <time>”, Siddhi emits only events 
that occur within that time of each other 
ƒ Example 
from every (a1 = purchase[price < 10]) 
-> a2 = purchase[price > 10000 and a1.cardNo == a2.cardNo] 
within 300000 
insert into potentialFraud 
a2.cardNo as cardNo, a2.price as price, a2.place as place
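The matching logic behind this fraud pattern can be sketched as a small state machine in plain Java. It is a simplification for illustration, not how Siddhi executes patterns: every small purchase opens a pending match for its card, and the next large purchase on the same card closes all pending matches that are still within the time limit. The class name FraudPatternSketch, the onPurchase method and the alert string format are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: "every small purchase -> large purchase within T" per card. */
public class FraudPatternSketch {
    private final long withinMillis;
    // cardNo -> timestamps of small purchases still waiting for a large one
    private final Map<String, List<Long>> pendingSmall = new HashMap<>();

    public FraudPatternSketch(long withinMillis) {
        this.withinMillis = withinMillis;
    }

    /** Feeds one purchase event; returns one alert per pending match it completes. */
    public List<String> onPurchase(String cardNo, double price, String place, long timestamp) {
        List<String> alerts = new ArrayList<>();
        if (price > 10000) {
            // Condition a2: completes (and consumes) the pending a1 matches for this card.
            List<Long> starts = pendingSmall.remove(cardNo);
            if (starts != null) {
                for (long start : starts) {
                    if (timestamp - start <= withinMillis) {
                        alerts.add(cardNo + "," + price + "," + place);
                    }
                }
            }
        } else if (price < 10) {
            // Condition a1 with "every": each small purchase starts a new pending match.
            pendingSmall.computeIfAbsent(cardNo, k -> new ArrayList<>()).add(timestamp);
        }
        return alerts;
    }
}
```

Only the pending small purchases are kept as state, which is why such a pattern can run continuously over a stream without storing every event.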
Sequence 
from <event-regular-expression> within <time> insert into <stream> 
ƒ Regular Expressions supported 
o * - Zero or more matches (reluctant). 
o + - One or more matches (reluctant). 
o ? - Zero or one match (reluctant). 
o or – either event 
ƒ Events matched by * or + are referenced using 
square brackets to access a specific occurrence of 
that event 
from a1 = requestOrder[action == "buy"], 
b1 = cseEventStream[price > a1.price and symbol == a1.symbol]+, 
b2 = cseEventStream[price < b1.price] 
insert into purchaseOrder 
a1.symbol as symbol, b1[0].price as firstPrice, b2.price as orderPrice
Performance Results 
ƒ We compared Siddhi with Esper, a widely used 
open-source CEP engine 
ƒ For the evaluation, we set up different queries on both 
systems, pushed events into each system, and measured the 
time until all of them were processed. 
ƒ We used Intel(R) Xeon(R) X3440 @2.53GHz , 4 cores 8M 
cache 8GB RAM running Debian 2.6.32-5-amd64 Kernel
Performance Comparison With ESPER 
Simple filter without window 
from StockTick[prize >6] return symbol, prize
Performance Comparison With ESPER 
State machine query for pattern matching 
From f=FraudWarningEvent -> 
p=PINChangeEvent(accountNumber=f.accountNumber) 
return accountNumber;
Siddhi Features 
ƒ Supports State Persistence 
o Enabling Queries to span lifetimes much greater 
than server uptime. 
o By taking periodic snapshots and storing all state 
information and windows to a scalable persistence 
store (Apache Cassandra). 
o Pluggable persistent stores. 
ƒ Supports Highly Available Deployment 
o Using Hazelcast distributed cache as a shared 
working memory.
HA/ Persistence 
ƒ This is the ability to recover 
runtime state in the 
case of a failure 
ƒ The CEP server can support 
this if the CEP engine supports 
persistence (OK with 
Siddhi, Esper)
Scaling 
ƒ A CEP pipeline can be distributed, but queries such as 
windows, patterns, and joins are hard to distribute 
ƒ WSO2 CEP with Siddhi uses a distributed cache 
(Hazelcast) as shared memory and a selective 
processing approach to achieve massive scalability in 
distributed processing
Event Recording 
ƒ Ability to record all/some of the events for 
future processing 
ƒ A few options 
o Publish them to Cassandra cluster using WSO2 data 
bridge API or BAM (can process data in Cassandra 
with Hadoop using WSO2 BAM). 
o Write them to distributed cache 
o Custom Thrift-based event recorder
WSO2 BAM
CEP Role within WSO2 Platform
DEMO
Scenario 
ƒ Monitoring stock exchange for game changing 
moments 
ƒ Two input event streams. 
o Event stream of Stock Quotes from a stock 
exchange 
o Event stream of word count on various company 
names from twitter pages 
ƒ Check whether the last traded price of the 
stock has changed significantly (by 2%) within the 
last minute, and people are tweeting about that 
company (> 10) within the last minute
Example Scenario 
JMS Event 
Publisher 
JMS Event 
Receiver
Input events 
ƒ Input events are JMS Maps 
o Stock Exchange Stream 
Map<String, Object> map1 = new HashMap<String, Object>(); 
map1.put("symbol", "MSFT"); 
map1.put("price", 26.36); 
publisher.publish("AllStockQuotes", map1); 
o Twitter Stream 
Map<String, Object> map1 = new HashMap<String, Object>(); 
map1.put("company", "MSFT"); 
map1.put("wordCount", 8); 
publisher.publish("TwitterFeed", map1);
Queries
Queries 
from allStockQuotes[win.time(60000)] 
insert into fastMovingStockQuotes 
symbol,price, avg(price) as averagePrice 
group by symbol 
having ((price > averagePrice*1.02) or (averagePrice*0.98 > price )) 
from twitterFeed[win.time(60000)] 
insert into highFrequentTweets 
company as company, sum(wordCount) as words 
group by company 
having (words > 10) 
from fastMovingStockQuotes[win.time(60000)] as fastMovingStockQuotes 
join highFrequentTweets[win.time(60000)] as highFrequentTweets 
on fastMovingStockQuotes.symbol==highFrequentTweets.company 
insert into predictedStockQuotes 
fastMovingStockQuotes.symbol as company, 
fastMovingStockQuotes.averagePrice as amount, 
highFrequentTweets.words as words
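The arithmetic in the two having clauses, and the way the join combines them, can be written out in plain Java. This is only a sketch of the conditions, not code from the demo; the class and method names are hypothetical, and the windowed avg and sum are assumed to have been computed already.

```java
/** Illustrative only: the conditions behind the three demo queries. */
public class StockAlertSketch {

    /** First query's having clause: price deviates more than 2% from the window average. */
    public static boolean isFastMoving(double price, double averagePrice) {
        return price > averagePrice * 1.02 || averagePrice * 0.98 > price;
    }

    /** Second query's having clause: more than 10 words counted in the window. */
    public static boolean isHighFrequent(long words) {
        return words > 10;
    }

    /** Third query: the join emits only when both derived streams match for a company. */
    public static boolean shouldAlert(double price, double averagePrice, long words) {
        return isFastMoving(price, averagePrice) && isHighFrequent(words);
    }
}
```

For example, with a one-minute average of 26.36, a trade at 27.0 is above 26.36 * 1.02 (about 26.89) and counts as fast moving, while a trade at 26.5 does not.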
Alert 
ƒ As XML 
<quotedata:StockQuoteDataEvent 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
xmlns:quotedata="http://ws.cdyne.com/"> 
<quotedata:StockSymbol>{company}</quotedata:StockSymbol> 
<quotedata:LastTradeAmount>{amount}</quotedata:LastTradeAmount> 
<quotedata:WordCount>{words}</quotedata:WordCount> 
</quotedata:StockQuoteDataEvent>
Useful links 
ƒ WSO2 CEP 2.0.0 Milestone 2 
https://svn.wso2.org/repos/wso2/people/suho/packs/cep/wso2cep-2.0.0-M2.zip 
ƒ Distributed Processing Sample With Siddhi CEP 
and ActiveMQ JMS Broker. 
http://suhothayan.blogspot.com/2012/08/distributed-processing-sample-for-wso2.html
Questions?
Thank you.
