WSO2 Complex Event Processor

At WSO2Con 2014 Europe

Published in: Data & Analytics, Technology

WSO2 Complex Event Processor

  1. 1. WSO2 Complex Event Processor Sriskandarajah Suhothayan (Suho) Srinath Perera WSO2 Inc.
  2. 2. Outline • BigData • Complex Event Processing • Basic Constructs of Query Language • CEP Solution Patterns • Scale, HA and Performance • Demo
  3. 3. Why is Big Data hard? • How to store? Assuming 1TB per computer, it takes 1000 computers to store 1PB • How to move? Assuming 10Gb network, it takes 2 hours to copy 1TB, or 83 days to copy 1PB • How to search? Assuming each record is 1KB and one machine can process 1000 records per sec, it needs about 277 CPU hours (roughly 11.6 CPU days) to process 1TB and about 32 CPU years to process 1PB • How to process? – How to convert algorithms to work at large scale – How to create new algorithms http://www.susanica.com/photo/9
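The search estimate on this slide is data size divided by per-machine throughput. A minimal sketch of that arithmetic, assuming decimal units (1TB = 10^12 bytes) and the slide's figures of 1KB records and 1000 records per second; the function name `cpu_seconds` is made up for illustration:

```python
# Assumptions taken from the slide: 1 KB records, 1000 records/s per machine.
RECORD_BYTES = 1_000
RECORDS_PER_SEC = 1_000

TB = 10**12  # decimal terabyte (assumption)
PB = 10**15  # decimal petabyte (assumption)


def cpu_seconds(total_bytes: int) -> float:
    """Seconds one machine needs to scan `total_bytes` of 1 KB records."""
    records = total_bytes / RECORD_BYTES
    return records / RECORDS_PER_SEC


hours_per_tb = cpu_seconds(TB) / 3600
years_per_pb = cpu_seconds(PB) / (3600 * 24 * 365)

print(f"1 TB: {hours_per_tb:.1f} CPU hours")
print(f"1 PB: {years_per_pb:.1f} CPU years")
```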
  4. 4. Why it is hard (Contd.)? • System built of many computers • That handles lots of data • Running complex logic • This pushes us to the frontier of Distributed Systems and Databases • More data does not mean there is a simple model • Some models can be as complex as the system http://www.flickr.com/photos/mariachily/5250487136, Licensed CC
  5. 5. Big data Processing Technologies Landscape
  6. 6. WSO2 Bigdata Offerings
  7. 7. CEP Is & Is NOT! • Is NOT! • Simple filters – Simple Event Processing – E.g. Is this a gold or platinum customer? • Joining multiple event streams – Event Stream Processing • Is! • Processing multiple event streams • Identify meaningful patterns among streams • Using temporal windows • E.g. Notify if there is a 10% increase in overall trading activity AND the average price of commodities has fallen 2% in the last 4 hours
  8. 8. What is ?
  9. 9. Query Functions of CEP • Filter • Transformation • Window + { Aggregation, group by } • Join • Event Sequence • Event Table
  10. 10. CEP Architecture
  11. 11. Event Streams • Event stream is a sequence of events • Event streams are defined by Stream Definitions • Event streams have in-flows and out-flows • Inflows can be from • Event builders: convert incoming XML, JSON, etc. events to an event stream • Execution plans • Outflows are to • Event formatters: convert an event stream to XML, JSON, etc. events • Execution plans
  12. 12. Stream Definition { 'name':'phone.retail.shop', 'version':'1.0.0', 'nickName': 'Phone_Retail_Shop', 'description': 'Phone Sales', 'metaData':[ {'name':'clientType','type':'STRING'} ], 'correlationData':[ {'name':'transactionID','type':'STRING'} ], 'payloadData':[ {'name':'brand','type':'STRING'}, {'name':'quantity','type':'INT'}, {'name':'total','type':'INT'}, {'name':'user','type':'STRING'} ] }
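A stream definition like the one above can be read as a typed schema for incoming events. The sketch below checks an event against it in plain Python. It is not how WSO2 CEP validates events; the `validate` helper, the `PY_TYPES` mapping and the sample event values are assumptions made for illustration.

```python
# Stream definition from the slide, written as a Python dict.
STREAM_DEF = {
    "name": "phone.retail.shop",
    "version": "1.0.0",
    "metaData": [{"name": "clientType", "type": "STRING"}],
    "correlationData": [{"name": "transactionID", "type": "STRING"}],
    "payloadData": [
        {"name": "brand", "type": "STRING"},
        {"name": "quantity", "type": "INT"},
        {"name": "total", "type": "INT"},
        {"name": "user", "type": "STRING"},
    ],
}

# Hypothetical mapping from definition types to Python types.
PY_TYPES = {"STRING": str, "INT": int}


def validate(event: dict, definition: dict) -> bool:
    """Return True if every declared attribute is present with the declared type."""
    for section in ("metaData", "correlationData", "payloadData"):
        values = event.get(section, {})
        for attr in definition.get(section, []):
            value = values.get(attr["name"])
            if not isinstance(value, PY_TYPES[attr["type"]]):
                return False
    return True


good_event = {
    "metaData": {"clientType": "web"},
    "correlationData": {"transactionID": "T-1"},
    "payloadData": {"brand": "Acme", "quantity": 2, "total": 400, "user": "suho"},
}
bad_event = {**good_event, "payloadData": {**good_event["payloadData"], "quantity": "2"}}

print(validate(good_event, STREAM_DEF))  # quantity is an int
print(validate(bad_event, STREAM_DEF))   # quantity is a string
```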
  13. 13. Event Format • Standard event formats are available for • XML • JSON • Text • Map • WSO2 Event • If events adhere to the standard format, they do not need data mapping. • If events do not adhere to it, custom event mapping should be configured in the Event Builder & Event Formatter appropriately.
  14. 14. Event Format Standard XML event format <events> <event> <metaData> <tenant_id>2</tenant_id> </metaData> <correlationData> <activity_id>ID5</activity_id> </correlationData> <payloadData> <clientPhoneNo>0771117673</clientPhoneNo> <clientName>Mohanadarshan</clientName> <clientResidenceAddress>15, Alexendra road, California</clientResidenceAddress> <clientAccountNo>ACT5673</clientAccountNo> </payloadData> </event> </events>
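The standard XML format groups attributes into metaData, correlationData and payloadData sections. As a rough illustration of that layout, the sketch below reads a shortened copy of the sample event with Python's standard library. The `parse_events` helper is hypothetical and is not part of WSO2 CEP.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the sample event from the slide.
SAMPLE = """<events>
  <event>
    <metaData><tenant_id>2</tenant_id></metaData>
    <correlationData><activity_id>ID5</activity_id></correlationData>
    <payloadData>
      <clientPhoneNo>0771117673</clientPhoneNo>
      <clientName>Mohanadarshan</clientName>
    </payloadData>
  </event>
</events>"""


def parse_events(xml_text: str) -> list:
    """Turn each <event> into a dict of its three sections (values stay strings)."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.findall("event"):
        parsed = {}
        for section in ("metaData", "correlationData", "payloadData"):
            node = event.find(section)
            parsed[section] = {} if node is None else {c.tag: c.text for c in node}
        events.append(parsed)
    return events


events = parse_events(SAMPLE)
print(events[0]["payloadData"]["clientName"])
```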
  15. 15. CEP Execution Plan ● Is an isolated logical execution unit ● Each execution plan imports some of the event streams available in CEP and defines the execution logic using queries and exports the results as output event streams. ● Has one-to-one relationship with CEP Backend Runtime. ● Has many-to-many relationship with Event Streams. ● Each execution plan spawns a Siddhi Engine Instance.
  16. 16. CEP Solution patterns 1. Transformation - project, translate, enrich, split 2. Filter 3. Composition / Aggregation / Analytics ● basic stats, group by, moving averages 4. Join multiple streams 5. Detect patterns ● Coordinating events over time ● Trends - increasing, decreasing, stable, non-increasing, non-decreasing, mixed 6. Blacklisting 7. Building a profile
  17. 17. Siddhi Query Structure define stream <event stream> (<attribute> <type>,<attribute> <type>, ...); from <event stream> select <attribute>,<attribute>, ... insert into <event stream> ;
  18. 18. Siddhi Query : Projection define stream TempStream (deviceID long, roomNo int, temp double); from TempStream select roomNo, temp insert into OutputStream ;
  19. 19. Siddhi Query : Inferred Streams from TempStream select roomNo, temp insert into OutputStream ; define stream OutputStream (roomNo int, temp double);
  20. 20. Siddhi Query : Enrich from TempStream select roomNo, temp, 'C' as scale insert into OutputStream ; define stream OutputStream (roomNo int, temp double, scale string);
  21. 21. Siddhi Query : Enrich from TempStream select deviceID, roomNo, avg(temp) as avgTemp insert into OutputStream ;
  22. 22. Siddhi Query : Transformation from TempStream select concat(deviceID, '-', roomNo) as uid, toFahrenheit(temp) as tempInF, 'F' as scale insert into OutputStream ;
  23. 23. Siddhi Query : Split from TempStream select roomNo, temp insert into RoomTempStream ; from TempStream select deviceID, temp insert into DeviceTempStream ;
  24. 24. Siddhi Query : Filter from TempStream [temp > 30.0 and roomNo != 2043] select roomNo, temp insert into HotRoomsStream ;
  25. 25. Siddhi Query : Window from TempStream select roomNo, avg(temp) as avgTemp insert into HotRoomsStream ;
  26. 26. Siddhi Query : Window from TempStream#window.time(1 min) select roomNo, avg(temp) as avgTemp insert into HotRoomsStream ;
  27. 27. Siddhi Query : Window from TempStream#window.time(1 min) select roomNo, avg(temp) as avgTemp group by roomNo insert into HotRoomsStream ;
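What `#window.time(1 min)` with `group by roomNo` computes can be sketched in plain Python: keep the events of the last minute and, on each arrival, average the temperatures of that event's room. This illustrates the semantics only and is not Siddhi's implementation. The class name `SlidingTimeAvg` and the caller-supplied millisecond timestamps are assumptions, and the sketch emits output only when an event arrives, not when one expires.

```python
from collections import deque


class SlidingTimeAvg:
    """Per-room average over the events seen in the last `window_ms` milliseconds."""

    def __init__(self, window_ms: int):
        self.window_ms = window_ms
        self.events = deque()  # (ts_ms, room_no, temp), oldest first

    def on_event(self, ts_ms: int, room_no: int, temp: float):
        self.events.append((ts_ms, room_no, temp))
        # Drop events that have fallen out of the time window.
        while self.events[0][0] <= ts_ms - self.window_ms:
            self.events.popleft()
        temps = [t for (_, r, t) in self.events if r == room_no]
        return room_no, sum(temps) / len(temps)


window = SlidingTimeAvg(window_ms=60_000)
print(window.on_event(0, 1, 20.0))
print(window.on_event(30_000, 1, 30.0))
print(window.on_event(40_000, 2, 10.0))
print(window.on_event(70_000, 1, 40.0))  # the event at t=0 has expired by now
```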
  28. 28. Siddhi Query : Batch Window from TempStream#window.timeBatch(5 min) select roomNo, avg(temp) as avgTemp group by roomNo insert into HotRoomsStream ;
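A batch window differs from a sliding window in that it collects events for a fixed period and emits results once per period. The sketch below shows that behaviour over a list of timestamped events. `time_batch_avg` is a hypothetical helper, batches are aligned to the first event's timestamp by assumption, and the final partial batch is flushed at the end of input, whereas a live engine would emit on a timer.

```python
from collections import defaultdict


def time_batch_avg(events, batch_ms: int):
    """events: (ts_ms, room_no, temp) tuples sorted by time.

    Returns a list of (batch_start_ms, {room_no: avg_temp}) per non-empty batch.
    """
    results = []
    batch_start = None
    sums = defaultdict(lambda: [0.0, 0])  # room_no -> [sum, count]

    for ts, room, temp in events:
        if batch_start is None:
            batch_start = ts
        # Close every batch that ended before this event arrived.
        while ts >= batch_start + batch_ms:
            if sums:
                results.append((batch_start, {r: s / n for r, (s, n) in sums.items()}))
                sums.clear()
            batch_start += batch_ms
        sums[room][0] += temp
        sums[room][1] += 1

    if sums:  # flush the last, possibly partial, batch
        results.append((batch_start, {r: s / n for r, (s, n) in sums.items()}))
    return results


FIVE_MIN_MS = 5 * 60 * 1000
sample = [(0, 1, 20.0), (1_000, 1, 30.0), (2_000, 2, 10.0), (FIVE_MIN_MS, 1, 40.0)]
print(time_batch_avg(sample, FIVE_MIN_MS))
```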
  29. 29. Siddhi Query : Join define stream TempStream (deviceID long, roomNo int, temp double); define stream RegulatorStream (deviceID long, roomNo int, isOn bool);
  30. 30. Siddhi Query : Join define stream TempStream (deviceID long, roomNo int, temp double); define stream RegulatorStream (deviceID long, roomNo int, isOn bool); from TempStream[temp > 30.0]#window.time(1 min) as T join RegulatorStream[isOn == false]#window.length(1) as R on T.roomNo == R.roomNo select T.roomNo, R.deviceID, 'start' as action insert into RegulatorActionStream ;
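The join above keeps two pieces of state: hot temperature readings from the last minute, and the single most recent "regulator is off" event. An arrival on either side is matched against the other side's window on roomNo. The plain-Python sketch below follows that reading of the query. `RegulatorJoin` and its method names are hypothetical, timestamps are supplied by the caller, and Siddhi's actual join semantics may differ in details such as expiry handling.

```python
from collections import deque


class RegulatorJoin:
    def __init__(self, window_ms: int = 60_000):
        self.window_ms = window_ms
        self.hot_temps = deque()  # (ts_ms, room_no) for readings with temp > 30.0
        self.last_off = None      # (device_id, room_no): the length(1) window

    def _expire(self, now_ms: int):
        while self.hot_temps and self.hot_temps[0][0] <= now_ms - self.window_ms:
            self.hot_temps.popleft()

    def on_temp(self, ts_ms, device_id, room_no, temp):
        self._expire(ts_ms)
        if temp <= 30.0:  # filter: only hot readings enter the window
            return []
        self.hot_temps.append((ts_ms, room_no))
        if self.last_off is not None and self.last_off[1] == room_no:
            return [(room_no, self.last_off[0], "start")]
        return []

    def on_regulator(self, ts_ms, device_id, room_no, is_on):
        self._expire(ts_ms)
        if is_on:  # filter: only "off" events enter the window
            return []
        self.last_off = (device_id, room_no)
        return [(r, device_id, "start") for (_, r) in self.hot_temps if r == room_no]


join = RegulatorJoin()
join.on_regulator(0, 7, 101, False)
print(join.on_temp(1_000, 5, 101, 35.0))  # room 101 is hot and its regulator is off
```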
  31. 31. Siddhi Query : Detect Trend from t1=TempStream, t2=TempStream [t1.temp < t2.temp and t1.deviceID == t2.deviceID]+ within 5 min select t1.temp as initialTemp, t2.temp as finalTemp, t1.deviceID, t1.roomNo insert into IncreasingHotRoomsStream ;
  32. 32. Siddhi Query : Partition define partition Device by TempStream.deviceID ; define partition Temp by range TempStream.temp <= 0 as 'ICE', range TempStream.temp > 0 and TempStream.temp < 100 as 'WATER', range TempStream.temp >= 100 as 'VAPOUR' ;
  33. 33. Siddhi Query : Detect Trend per Partition define partition Device by TempStream.deviceID ; from t1=TempStream, t2=TempStream [t1.temp < t2.temp and t1.deviceID == t2.deviceID]+ within 5 min select t1.temp as initialTemp, t2.temp as finalTemp, t1.deviceID, t1.roomNo insert into IncreasingHotRoomsStream partition by Device ;
  34. 34. Siddhi Query : Detect Pattern define stream Purchase (price double, cardNo long,place string); from every (a1 = Purchase[price < 10] -> a3= ..) -> a2 = Purchase[price >10000 and a1.cardNo == a2.cardNo] within 1 day select a1.cardNo as cardNo, a2.price as price, a2.place as place insert into PotentialFraud ;
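The core of this pattern is "a small purchase followed, within one day, by a very large purchase on the same card". The plain-Python sketch below covers only that a1 -> a2 part and ignores the elided middle step of the query. `FraudDetector` is a hypothetical name, timestamps are caller-supplied milliseconds, and emitting one alert per pending small purchase is an assumption about how `every` behaves.

```python
DAY_MS = 24 * 60 * 60 * 1000


class FraudDetector:
    def __init__(self, within_ms: int = DAY_MS):
        self.within_ms = within_ms
        self.small = {}  # card_no -> timestamps of purchases with price < 10

    def on_purchase(self, ts_ms, price, card_no, place):
        alerts = []
        if price > 10000:
            # A large purchase completes every pending small purchase on this card.
            for small_ts in self.small.pop(card_no, []):
                if ts_ms - small_ts <= self.within_ms:
                    alerts.append((card_no, price, place))
        elif price < 10:
            self.small.setdefault(card_no, []).append(ts_ms)
        return alerts


detector = FraudDetector()
detector.on_purchase(0, 5.0, 1234, "Colombo")
print(detector.on_purchase(1_000, 20000.0, 1234, "London"))
```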
  35. 35. Siddhi Query : Define Event Table define table CardUserTable (name string, cardNum long) ; define table CardUserTable (name string, cardNum long) from ('datasource.name'='CardDataSource', 'table.name'='UserTable', 'caching.algorithm'='LRU') ; Cache types supported ● Basic: A size-based algorithm based on FIFO. ● LRU (Least Recently Used): The least recently used event is dropped when cache is full. ● LFU (Least Frequently Used): The least frequently used event is dropped when cache is full.
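To make the LRU option concrete, here is a minimal least-recently-used cache in plain Python: when the cache is full, the entry that has gone longest without being read or written is dropped. This illustrates the eviction policy only and is not the cache inside WSO2 CEP. The `LRUCache` class and the sample card numbers are made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.rows = OrderedDict()  # least recently used entry first

    def get(self, key):
        if key not in self.rows:
            return None
        self.rows.move_to_end(key)  # mark as most recently used
        return self.rows[key]

    def put(self, key, value):
        self.rows[key] = value
        self.rows.move_to_end(key)
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put(1111, "Alice")
cache.put(2222, "Bob")
cache.get(1111)           # 1111 is now the most recently used
cache.put(3333, "Carol")  # evicts 2222
print(cache.get(2222))
```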
  36. 36. Siddhi Query : Query Event Table define stream Purchase (price double, cardNo long, place string); define table CardUserTable (name string, cardNum long) ; from Purchase#window.length(1) join CardUserTable on Purchase.cardNo == CardUserTable.cardNum select Purchase.cardNo as cardNo, CardUserTable.name as name, Purchase.price as price insert into PurchaseUserStream ;
  37. 37. Siddhi Query : Insert into Event Table define stream FraudStream (price double, cardNo long, userName string); define table BlacklistedUserTable (name string, cardNum long) ; from FraudStream select userName as name, cardNo as cardNum insert into BlacklistedUserTable ;
  38. 38. Siddhi Query : Update into Event Table define stream LoginStream (userID string, islogin bool, loginTime long); define table LastLoginTable (userID string, time long) ; from LoginStream select userID, loginTime as time update LastLoginTable on LoginStream.userID == LastLoginTable.userID ;
  39. 39. Siddhi Extensions ● Function extension ● Aggregator extension ● Window extension ● Transform extension
  40. 40. Siddhi Query : Function Extension from TempStream select deviceID, roomNo, custom:toKelvin(temp) as tempInKelvin, 'K' as scale insert into OutputStream ;
  41. 41. Siddhi Query : Aggregator Extension from TempStream select deviceID, roomNo, temp, custom:stdev(temp) as stdevTemp, 'C' as scale insert into OutputStream ;
  42. 42. Siddhi Query : Window Extension from TempStream #window.custom:lastUnique(roomNo,2 min) select * insert into OutputStream ;
  43. 43. Siddhi Query : Transform Extension from XYZSpeedStream #transform.custom:getVelocityVector(v,vx,vy,vz) select velocity, direction insert into SpeedStream ;
  44. 44. CEP Event Adaptors ● For receiving and publishing events ● Has the configurations to connect to external endpoints ● Has many-to-one relationship with Event Streams
  45. 45. CEP Event Adaptors Support for several transports (network access) ● SOAP ● HTTP ● JMS ● SMTP ● SMS ● Thrift ● Kafka Supported data formats ● XML ● JSON ● Map ● Text ● WSO2Event - WSO2 data format over Thrift for high-performance event transfer, supporting Java/C/C++/C# via Thrift language bindings
  46. 46. CEP Event Adaptors Supports database writes using Map messages ● Cassandra ● MySQL ● H2 Supports custom event adaptors via its pluggable architecture!
  47. 47. Monitoring & Debugging : Event Flow ● Visualization of the Event Stream flow in CEP ● Helps to get the big picture ● Good for debugging
  48. 48. Monitoring & Debugging : Event Tracer • Dump message traces in a textual format • Before and after processing each stage of event flow
  49. 49. Monitoring & Debugging : Event Statistics • Real-time statistics • via visual illustrations & JMX • Time based request & response counts • Stats on all components of CEP server
  50. 50. Real Time Dashboard • Provides tools to configure gadgets • Currently supports RDBMS only • Powered by WSO2 User Engagement Server (WSO2 UES)
  51. 51. Performance Results • Same JVM Performance (Siddhi vs. Esper, M means a Million), 4 core machine • Filters 8M Events/Sec vs. Esper 2M • Window 2.5M Events/Sec vs. Esper 1M • Patterns 1.4M Events/Sec, about 10X faster than Esper • Over the Network Performance (using Thrift-based WSO2 event format), 8 core machine • Filter 0.25M (or 250K) Events/Sec
  52. 52. CEP High Availability Execution plan in “RedundantNode” based distributed processing mode <executionPlan name="RedundantNodeExecutionPlan" statistics="enable" trace="enable" xmlns="http://wso2.org/carbon/eventprocessor"> ... <siddhiConfiguration> <property name="siddhi.enable.distributed.processing">RedundantNode</property> <property name="siddhi.persistence.snapshot.time.interval.minutes">0</property> </siddhiConfiguration> ... </executionPlan>
  53. 53. HA / Persistence • Option 1: Side by side • Recommended • Takes 2X hardware • Gives zero downtime • Option 2: Snapshot and restore • Uses less HW • Will lose events between snapshots • Downtime during recovery • ** In some scenarios you can use event tables to keep intermediate state
  54. 54. Scaling • Vertical scaling • Can be distributed as a pipeline • Horizontal scaling • Queries like windows, patterns, and joins have shared state • Hard to distribute!
  55. 55. Scaling (Contd.) • Currently users have to set up the pipeline manually (WSO2 team can help) • Work is underway to support the above pipeline and distributed operators out of the box
  56. 56. Lambda Architecture
  57. 57. Demo
  58. 58. Scenario MyPizzaShop – On time delivery or free Pizza Offer !!!
  59. 59. Order Event { "event": { "correlationData": { "orderNo": "0023" }, "payloadData": { "orderInfo": "2 L PEPPERONI", "amount": "25.70", "name": "James Mark", "address": "29BX Finchwood Ave, Clovis, CA 93611", "tpNo": "(626)446-4601" } } }
  60. 60. Delivered to customer event correlation_orderNo:23, isDelivered:true
  61. 61. Email Notification Hi Alis Miranda Your order for 1 L CHICKEN pizza will be delivered in 30 mins to 779 Burl Ave, Clovis, CA 93611. The total cost of the order is $14.5. If you didn't get the pizza within 30 min you will be eligible to have those pizzas for free..!! MyPizzaShop
  62. 62. Final Payment Notification <event xmlns="http://wso2.org/carbon/event"> <correlationData> <orderNo>3</orderNo> </correlationData> <payloadData> <name>James Clark</name> <amount>54.0</amount> </payloadData> </event>
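The demo events suggest a simple rule: correlate the order event with the delivery event by order number and charge the full amount only if delivery happened within 30 minutes. The plain-Python sketch below shows that correlation. The rule itself is inferred from the offer text rather than taken from the demo's execution plan, and `PizzaOrders`, its method names and the caller-supplied millisecond timestamps are hypothetical.

```python
import json

THIRTY_MIN_MS = 30 * 60 * 1000

# Order event from the demo slide.
ORDER_JSON = """{ "event": { "correlationData": { "orderNo": "0023" },
  "payloadData": { "orderInfo": "2 L PEPPERONI", "amount": "25.70",
    "name": "James Mark", "address": "29BX Finchwood Ave, Clovis, CA 93611",
    "tpNo": "(626)446-4601" } } }"""


class PizzaOrders:
    def __init__(self):
        self.orders = {}  # order_no -> (placed_ts_ms, name, amount)

    def on_order(self, ts_ms: int, order_json: str):
        event = json.loads(order_json)["event"]
        order_no = int(event["correlationData"]["orderNo"])  # "0023" -> 23
        payload = event["payloadData"]
        self.orders[order_no] = (ts_ms, payload["name"], float(payload["amount"]))

    def on_delivered(self, ts_ms: int, order_no: int) -> dict:
        placed, name, amount = self.orders.pop(order_no)
        on_time = ts_ms - placed <= THIRTY_MIN_MS
        # Assumed rule: a late delivery makes the order free.
        return {"orderNo": order_no, "name": name, "amount": amount if on_time else 0.0}


shop = PizzaOrders()
shop.on_order(0, ORDER_JSON)
print(shop.on_delivered(10 * 60 * 1000, 23))
```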
  63. 63. Thank You
