Introducing the WSO2 Complex Event Processor

Published in: Technology

  1. Introducing the WSO2 Complex Event Processor: Simplifying Complexities of Data Processing
     S. Suhothayan, Software Engineer, Data Technologies Team
  2. Outline
     • Introduction to CEP
     • WSO2 CEP Server
     • Siddhi Runtime
     • HA & Scalability of WSO2 CEP
     • WSO2 CEP server and WSO2 BAM
     • Use Cases
  3. Event Processing (Contd.)
     • Event processing is about listening to events and detecting patterns in near real time, without storing all events.
     • Three models
       ◦ Simple Event Processing: simple filters (e.g. Is this a gold or platinum customer?)
       ◦ Event Stream Processing: looking across multiple event streams, joining them, and so on
       ◦ Complex Event Processing: processing multiple event streams to identify meaningful patterns, using complex conditions and temporal windows
         - E.g. There has been a more than 10% increase in overall trading activity AND the average price of commodities has fallen 2% in the last 4 hours
  4. Complex Event Processing
     • We categorize events into different streams
     • Process with minimal storage
     • Use queries to evaluate the continuous event streams (usually in a SQL-like query language)
     • Very fast results (in the millisecond range)
  5. CEP Queries
     • The main query types are:
       ◦ Filters and projection
       ◦ Windows: events are processed within temporal windows (e.g. for aggregation and joins). Time window vs. length window.
       ◦ Ordering: identify event sequences and patterns (e.g. for a credit card, a new location followed by a small and then a large purchase might suggest fraud)
       ◦ Joins: join two streams
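
The "time window vs. length window" distinction on this slide can be made concrete with a small sketch. This is not Siddhi code: `WindowSketch`, `LengthWindow`, and `TimeWindow` are hypothetical names, timestamps are assumed to be in milliseconds, and both windows simply report a running average.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch (not the Siddhi API): two sliding windows reporting a running average.
class WindowSketch {

    /** Length window: keeps only the most recent `size` values. */
    static final class LengthWindow {
        private final int size;
        private final Deque<Double> values = new ArrayDeque<>();
        private double sum;

        LengthWindow(int size) {
            this.size = size;
        }

        double add(double value) {
            values.addLast(value);
            sum += value;
            if (values.size() > size) {
                sum -= values.removeFirst(); // evict the oldest value
            }
            return sum / values.size();
        }
    }

    /** Time window: keeps only values newer than (current timestamp - spanMillis). */
    static final class TimeWindow {
        private final long spanMillis;
        private final Deque<Long> times = new ArrayDeque<>();
        private final Deque<Double> values = new ArrayDeque<>();
        private double sum;

        TimeWindow(long spanMillis) {
            this.spanMillis = spanMillis;
        }

        double add(long timestampMillis, double value) {
            // Evict everything that has fallen out of the time span.
            while (!times.isEmpty() && times.peekFirst() <= timestampMillis - spanMillis) {
                times.removeFirst();
                sum -= values.removeFirst();
            }
            times.addLast(timestampMillis);
            values.addLast(value);
            sum += value;
            return sum / values.size();
        }
    }

    public static void main(String[] args) {
        LengthWindow byCount = new LengthWindow(2);
        TimeWindow byTime = new TimeWindow(1000);
        System.out.println(byCount.add(10) + " " + byCount.add(20) + " " + byCount.add(30));
        System.out.println(byTime.add(0, 10) + " " + byTime.add(500, 20) + " " + byTime.add(1200, 30));
    }
}
```

The length window evicts by count regardless of how much time has passed, while the time window evicts by age regardless of how many events arrived.
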
  6. Example Query
       from p=PINChangeEvents#window.time(3600)
         join t=TransactionEvents[amount>10000]#window.time(3600)
         on p.custid==t.custid
       return t.custid, t.amount;
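
What the example query computes can be sketched in plain Java. This is an illustration under assumptions, not how any CEP engine implements joins: `PinChangeJoinSketch` and its methods are hypothetical, the slide does not state the unit of `window.time(3600)` so the window here uses the same unit as the timestamps passed in, and events are assumed to arrive in timestamp order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not engine internals: a two-sided windowed join on custid.
class PinChangeJoinSketch {

    private static final class Event {
        final long timestamp;
        final String custId;
        final double amount;

        Event(long timestamp, String custId, double amount) {
            this.timestamp = timestamp;
            this.custId = custId;
            this.amount = amount;
        }
    }

    private final long window; // same (assumed) unit as the event timestamps
    private final List<Event> pinChanges = new ArrayList<>();
    private final List<Event> largeTransactions = new ArrayList<>();
    final List<String> matches = new ArrayList<>(); // "custid:amount" per joined pair

    PinChangeJoinSketch(long window) {
        this.window = window;
    }

    void onPinChange(long timestamp, String custId) {
        expire(timestamp);
        pinChanges.add(new Event(timestamp, custId, 0));
        // A new PIN change joins with every large transaction still in its window.
        for (Event t : largeTransactions) {
            if (t.custId.equals(custId)) {
                matches.add(t.custId + ":" + t.amount);
            }
        }
    }

    void onTransaction(long timestamp, String custId, double amount) {
        if (amount <= 10000) {
            return; // the [amount>10000] filter
        }
        expire(timestamp);
        largeTransactions.add(new Event(timestamp, custId, amount));
        // A new large transaction joins with every PIN change still in its window.
        for (Event p : pinChanges) {
            if (p.custId.equals(custId)) {
                matches.add(custId + ":" + amount);
            }
        }
    }

    private void expire(long now) {
        pinChanges.removeIf(e -> e.timestamp <= now - window);
        largeTransactions.removeIf(e -> e.timestamp <= now - window);
    }

    public static void main(String[] args) {
        PinChangeJoinSketch join = new PinChangeJoinSketch(3600);
        join.onPinChange(0, "c1");
        join.onTransaction(200, "c1", 15000);
        System.out.println(join.matches);
    }
}
```
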
  7. Open Source CEP Runtimes
     • Siddhi
       ◦ Apache License, a Java library, tuple-based event model
       ◦ Supports distributed processing
       ◦ Supports multiple query models
         - Based on a SQL-like language
         - Filters, Windows, Joins, Ordering and others
     • Esper, http://esper.codehaus.org
       ◦ GPLv2 License, a Java library, events can be XML, Map, Object
       ◦ Supports multiple query models
         - Based on a SQL-like language
         - Filters, Windows, Joins, Ordering and others
     • Drools Fusion
       ◦ Apache License, a Java library
       ◦ Support for temporal reasoning + windows
  8. WSO2 CEP Server
     • Enterprise-grade server for CEP runtimes
     • Provides support for several transports (network access) and data formats
       ◦ SOAP/WS-Eventing: XML messages
       ◦ REST/JSON: JSON messages
       ◦ JMS: map messages, XML messages
       ◦ Thrift: WSO2 data bridge format
         - High-performance event capturing and delivery framework; supports Java/C/C++/C# via Thrift language bindings
     • Supports multiple CEP runtimes
       ◦ Siddhi: from WSO2, new, very fast, distributed
       ◦ Esper: well-known CEP runtime
       ◦ Drools Fusion: rule based, but much slower
     • Easy to plug in new brokers and new CEP engines
  9. WSO2 CEP Server (Contd.) [diagram not reproduced; the only surviving label is "File System"]
  10. CEP Buckets
      • A CEP bucket is a logical execution unit
      • Each CEP bucket has a set of queries, event sources, and input and output event mappings
      • Each bucket maps one-to-one to a CEP engine
  11. Management UI
      • Define buckets
      • Update running queries without resetting current execution states
      • Manage brokers (data adapters)
  12. Developer Studio UI
      • Eclipse-based tool to define buckets
      • Can manage the configurations through the production lifecycle
  13. Siddhi Complex Event Processing Engine
  14. Big Picture
      • Users provide one or more queries
      • Map event streams to queries
      • Siddhi keeps the queries running and invokes callbacks registered against one or more queries/streams
      • Example Query
          from cseEventStream[symbol == 'IBM']#win.time(50000)
          insert into IBMStockQuote symbol, avg(price) as avgPrice
  15. Siddhi High Level Architecture
  16. Siddhi Queries: Filters
        from <stream-name> [<conditions>]* insert into <stream-name>
      • Filters the events by conditions
      • Conditions
        ◦ >, <, ==, >=, <=, !=
        ◦ contains
        ◦ and, or, not
      • Example
          from cseEventStream[price >= 20 and symbol=='IBM']
          insert into StockQuote symbol, volume
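
For readers more familiar with Java than with the query language, the filter example corresponds roughly to a filter-and-project step over a collection. The sketch below is an analogy only: `FilterSketch`, `event`, and `stockQuotes` are hypothetical names, events are modeled as maps, and a real engine evaluates the condition per arriving event rather than over a finished list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical analogy for: from cseEventStream[price >= 20 and symbol=='IBM']
//                           insert into StockQuote symbol, volume
class FilterSketch {

    /** Builds a map-based event with the three attributes used by the example. */
    static Map<String, Object> event(String symbol, double price, int volume) {
        Map<String, Object> e = new LinkedHashMap<>();
        e.put("symbol", symbol);
        e.put("price", price);
        e.put("volume", volume);
        return e;
    }

    /** Keeps events matching the condition and projects them to (symbol, volume). */
    static List<Map<String, Object>> stockQuotes(List<Map<String, Object>> cseEvents) {
        return cseEvents.stream()
                .filter(e -> ((Number) e.get("price")).doubleValue() >= 20
                        && "IBM".equals(e.get("symbol")))
                .map(e -> {
                    Map<String, Object> out = new LinkedHashMap<>();
                    out.put("symbol", e.get("symbol"));
                    out.put("volume", e.get("volume"));
                    return out;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = Arrays.asList(
                event("IBM", 25.0, 100),
                event("IBM", 10.0, 50),
                event("MSFT", 30.0, 70));
        System.out.println(stockQuotes(input));
    }
}
```
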
  17. Window
        from <stream-name> [<conditions>]#window.<window-name>(<parameters>)
        insert [<output-type>] into <stream-name>
      • Types of windows
        ◦ (Time | Length) (Sliding | Batch) windows
        ◦ Unique window, First unique (not supported in 1.0)
      • Types of aggregate functions
        ◦ sum, avg, max, min
      • Example
          from cseEventStream[price >= 20]#window.lengthBatch(50)
          insert expired-events into StockQuote symbol, avg(price) as avgPrice
          group by symbol having avgPrice>50
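
The batch-window example combines a filter, a fixed-size batch, a per-symbol average, and a `having` clause. A minimal sketch of that combination follows. It is a simplification with hypothetical names (`LengthBatchSketch`, `onEvent`): it emits results once per full batch and does not model the `expired-events` output type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a length-batch window with "group by symbol having avgPrice > 50".
class LengthBatchSketch {
    private final int batchSize;
    private final List<String> symbols = new ArrayList<>();
    private final List<Double> prices = new ArrayList<>();

    LengthBatchSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Returns per-symbol averages above 50 when a batch completes, otherwise an empty map. */
    Map<String, Double> onEvent(String symbol, double price) {
        if (price < 20) {
            return Collections.emptyMap(); // the [price >= 20] filter; event never enters the window
        }
        symbols.add(symbol);
        prices.add(price);
        if (symbols.size() < batchSize) {
            return Collections.emptyMap(); // batch not yet full
        }
        // Accumulate {sum, count} per symbol for the completed batch.
        Map<String, double[]> acc = new TreeMap<>();
        for (int i = 0; i < symbols.size(); i++) {
            double[] a = acc.computeIfAbsent(symbols.get(i), k -> new double[2]);
            a[0] += prices.get(i);
            a[1]++;
        }
        symbols.clear();
        prices.clear();
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            double avg = e.getValue()[0] / e.getValue()[1];
            if (avg > 50) { // the "having avgPrice > 50" clause
                result.put(e.getKey(), avg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LengthBatchSketch window = new LengthBatchSketch(4);
        window.onEvent("IBM", 60);
        window.onEvent("MSFT", 30);
        window.onEvent("IBM", 80);
        System.out.println(window.onEvent("MSFT", 40));
    }
}
```
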
  18. Join
        from <stream>#<window> [unidirectional] join <stream>#<window>
        on <condition> within <time>
        insert into <stream>
      • Joins two streams based on a condition and a window
      • A join can take multiple forms, ((left|right|full outer) | inner) join; only inner is supported in 1.0
      • Unidirectional: only events arriving on the unidirectional stream trigger the join
      • Example
          from TickEvent[symbol=='IBM']#win.length(2000)
          join NewsEvent#win.time(500)
          insert into JoinStream *
  19. Pattern
        from [every] <condition> -> [every] <condition> … <condition> within <time>
        insert into StockQuote (<attribute-name>* | * )
      • Checks whether condition A happens before/after condition B
      • Can do iterative checks via the "every" keyword
      • With "within <time>", Siddhi emits only events that are within that time of each other
      • Example
          from every (a1 = purchase[price < 10]) -> a2 = purchase[price > 10000 and a1.cardNo==a2.cardNo]
          within 300000
          insert into potentialFraud a2.cardNo as cardNo, a2.price as price, a2.place as place
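
The fraud pattern on this slide (a small purchase followed by a large one on the same card within a time limit) can be approximated with a small state machine. The sketch rests on assumptions the slide does not confirm: `FraudPatternSketch` and `onPurchase` are hypothetical names, timestamps are in milliseconds, events arrive in order, and every pending small purchase is consumed by the first matching large purchase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of: every (a1 = purchase[price < 10])
//                         -> a2 = purchase[price > 10000 and a1.cardNo == a2.cardNo] within <time>
class FraudPatternSketch {
    private final long withinMillis;
    // Timestamps of small purchases still waiting for a large one, per card.
    private final Map<String, Deque<Long>> pendingSmall = new HashMap<>();
    final List<String> potentialFraud = new ArrayList<>(); // "cardNo@place" per match

    FraudPatternSketch(long withinMillis) {
        this.withinMillis = withinMillis;
    }

    void onPurchase(long timestamp, String cardNo, double price, String place) {
        Deque<Long> pending = pendingSmall.computeIfAbsent(cardNo, k -> new ArrayDeque<>());
        // Drop partial matches that are too old to complete.
        while (!pending.isEmpty() && timestamp - pending.peekFirst() > withinMillis) {
            pending.removeFirst();
        }
        if (price < 10) {
            pending.addLast(timestamp); // "every": each small purchase starts a new partial match
        } else if (price > 10000) {
            while (!pending.isEmpty()) { // complete each waiting partial match once
                pending.removeFirst();
                potentialFraud.add(cardNo + "@" + place);
            }
        }
    }

    public static void main(String[] args) {
        FraudPatternSketch detector = new FraudPatternSketch(300000);
        detector.onPurchase(0, "c1", 5, "NY");
        detector.onPurchase(2000, "c1", 15000, "Paris");
        System.out.println(detector.potentialFraud);
    }
}
```
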
  20. Sequence
        from <event-regular-expression> within <time> insert into <stream>
      • Regular expressions supported
        ◦ * : zero or more matches (reluctant)
        ◦ + : one or more matches (reluctant)
        ◦ ? : zero or one match (reluctant)
        ◦ or : either event
      • Events matched by * or + must be referenced with square brackets to access a specific occurrence of that event
      • Example
          from a1 = requestOrder[action == "buy"],
               b1 = cseEventStream[price > a1.price and symbol==a1.symbol]+,
               b2 = cseEventStream[price < b1.price]
          insert into purchaseOrder a1.symbol as symbol, b1[0].price as firstPrice, b2.price as orderPrice
  21. Performance Results
      • We compared Siddhi with Esper, the widely used open source CEP engine
      • For the evaluation, we set up different queries on both systems, pushed events into each system, and measured the time until all of them were processed
      • We used an Intel(R) Xeon(R) X3440 @ 2.53GHz, 4 cores, 8M cache, 8GB RAM, running Debian with kernel 2.6.32-5-amd64
  22. Performance Comparison With ESPER
      Simple filter without window
        from StockTick[prize >6] return symbol, prize
  23. Performance Comparison With ESPER
      State machine query for pattern matching
        from f=FraudWarningEvent -> p=PINChangeEvent(accountNumber=f.accountNumber)
        return accountNumber;
  24. Siddhi Features
      • Supports state persistence
        ◦ Enables queries to span lifetimes much greater than server uptime
        ◦ Takes periodic snapshots and stores all state information and windows in a scalable persistence store (Apache Cassandra)
        ◦ Pluggable persistence stores
      • Supports highly available deployment
        ◦ Uses the Hazelcast distributed cache as a shared working memory
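
The snapshot idea behind state persistence can be illustrated without any particular store. The sketch below is not Siddhi's persistence mechanism: `SnapshotSketch` and `WindowState` are hypothetical, the window state is serialized with standard Java serialization, and a byte array stands in for the persistence store (Cassandra in the slide).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayDeque;

// Hypothetical sketch: snapshot a window's state to bytes and restore it later.
class SnapshotSketch {

    /** Minimal window state: the buffered prices and their running sum. */
    static final class WindowState implements Serializable {
        private static final long serialVersionUID = 1L;
        final ArrayDeque<Double> prices = new ArrayDeque<>();
        double sum;

        void add(double price) {
            prices.addLast(price);
            sum += price;
        }

        double avg() {
            return prices.isEmpty() ? 0 : sum / prices.size();
        }
    }

    /** Serializes the state; the byte array stands in for a write to a persistence store. */
    static byte[] snapshot(WindowState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new IllegalStateException("snapshot failed", e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the state from a snapshot, e.g. after a restart. */
    static WindowState restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (WindowState) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("restore failed", e);
        }
    }

    public static void main(String[] args) {
        WindowState live = new WindowState();
        live.add(10);
        live.add(20);
        WindowState recovered = restore(snapshot(live));
        System.out.println(recovered.avg());
    }
}
```

Because the restored state carries the window contents, a query can continue aggregating where it left off instead of starting from an empty window.
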
  25. HA / Persistence
      • The ability to recover runtime state after a failure
      • The CEP server can support this if the underlying CEP engine supports persistence (true for Siddhi and Esper)
  26. Scaling
      • The CEP pipeline can be distributed, but queries such as windows, patterns, and joins are hard to distribute
      • WSO2 CEP with Siddhi uses a distributed cache (Hazelcast) as shared memory, together with a selective processing approach, to achieve massive scalability in distributed processing
  27. Event Recording
      • Ability to record all or some of the events for future processing
      • A few options
        ◦ Publish them to a Cassandra cluster using the WSO2 data bridge API or BAM (data in Cassandra can then be processed with Hadoop using WSO2 BAM)
        ◦ Write them to the distributed cache
        ◦ Custom Thrift-based event recorder
  28. WSO2 BAM
  29. CEP Role within WSO2 Platform
  30. DEMO
  31. Scenario
      • Monitoring a stock exchange for game-changing moments
      • Two input event streams
        ◦ Event stream of stock quotes from a stock exchange
        ◦ Event stream of word counts on various company names from Twitter pages
      • Check whether the last traded price of a stock has changed significantly (by 2%) within the last minute, and whether people are tweeting about that company (> 10) within the last minute
  32. Example Scenario [diagram not reproduced; surviving labels: "JMS Event Publisher", "JMS Event Receiver"]
  33. Input events
      • Input events are JMS Maps
        ◦ Stock Exchange Stream
            Map<String, Object> map1 = new HashMap<String, Object>();
            map1.put("symbol", "MSFT");
            map1.put("price", 26.36);
            publisher.publish("AllStockQuotes", map1);
        ◦ Twitter Stream
            Map<String, Object> map1 = new HashMap<String, Object>();
            map1.put("company", "MSFT");
            map1.put("wordCount", 8);
            publisher.publish("TwitterFeed", map1);
  34. Queries
  35. Queries
        from allStockQuotes[win.time(60000)]
        insert into fastMovingStockQuotes symbol, price, avg(price) as averagePrice
        group by symbol
        having ((price > averagePrice*1.02) or (averagePrice*0.98 > price))

        from twitterFeed[win.time(60000)]
        insert into highFrequentTweets company as company, sum(wordCount) as words
        group by company
        having (words > 10)

        from fastMovingStockQuotes[win.time(60000)] as fastMovingStockQuotes
        join highFrequentTweets[win.time(60000)] as highFrequentTweets
        on fastMovingStockQuotes.symbol==highFrequentTweets.company
        insert into predictedStockQuotes fastMovingStockQuotes.symbol as company, fastMovingStockQuotes.averagePrice as amount, highFrequentTweets.words as words
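
The first demo query flags a quote whose price deviates by more than 2% from the one-minute average for its symbol. A minimal sketch of that check, outside any CEP engine, might look as follows. `FastMovingSketch` and `isFastMoving` are hypothetical names, timestamps are assumed to be in milliseconds, and the current quote is included in the average, as the `having` clause of the query suggests.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fastMovingStockQuotes condition:
// having (price > averagePrice*1.02) or (averagePrice*0.98 > price)
class FastMovingSketch {
    private final long windowMillis;
    // Per symbol: {timestamp, price} pairs still inside the time window.
    private final Map<String, Deque<double[]>> bySymbol = new HashMap<>();

    FastMovingSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    boolean isFastMoving(long timestampMillis, String symbol, double price) {
        Deque<double[]> window = bySymbol.computeIfAbsent(symbol, k -> new ArrayDeque<>());
        // Expire quotes that have left the time window.
        while (!window.isEmpty() && window.peekFirst()[0] <= timestampMillis - windowMillis) {
            window.removeFirst();
        }
        window.addLast(new double[] {timestampMillis, price});
        double sum = 0;
        for (double[] quote : window) {
            sum += quote[1];
        }
        double averagePrice = sum / window.size();
        return price > averagePrice * 1.02 || averagePrice * 0.98 > price;
    }

    public static void main(String[] args) {
        FastMovingSketch detector = new FastMovingSketch(60000);
        System.out.println(detector.isFastMoving(0, "MSFT", 100));
        System.out.println(detector.isFastMoving(1000, "MSFT", 100.5));
        System.out.println(detector.isFastMoving(2000, "MSFT", 110));
    }
}
```
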
  36. Alert
      • As XML
          <quotedata:StockQuoteDataEvent
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:quotedata="http://ws.cdyne.com/">
            <quotedata:StockSymbol>{company}</quotedata:StockSymbol>
            <quotedata:LastTradeAmount>{amount}</quotedata:LastTradeAmount>
            <quotedata:WordCount>{words}</quotedata:WordCount>
          </quotedata:StockQuoteDataEvent>
  37. Useful links
      • WSO2 CEP 2.0.0 Milestone 2
        https://svn.wso2.org/repos/wso2/people/suho/packs/cep/wso2cep-2.0.0-M2.zip
      • Distributed Processing Sample With Siddhi CEP and ActiveMQ JMS Broker
        http://suhothayan.blogspot.com/2012/08/distributed-processing-sample-for-wso2.html
  38. Questions?
  39. Thank you.
