Complex Event Processing with Esper


Talk I gave at Codebits 2011 on 11/11/11 about Complex Event Processing using Esper.

Published in: Technology, Business

  1. 1. Complex Event Processing with Esper @antonioalegria
  2. 2. Complex Event Processing? CEP
  3. 3. “Complex Event is an event that could only happen if lots of other events happened” “CEP is a set of tools and techniques for analyzing and controlling the complex series of interrelated events that drive modern distributed information systems” David Luckham, 2002
  4. 4. Example
        • Church bell ringing
        • Appearance of a man in a tuxedo
        • Appearance of a woman in a white gown
        • Rice flying through the air
  5. 5. Example
        • Church bell ringing
        • Appearance of a man in a tuxedo
        • Appearance of a woman in a white gown
        • Rice flying through the air
        Wedding has happened!
  6. 6. CEP Use Cases
        • Are our business processes running on time and correctly?
        • Can we detect an opportunity for arbitrage in our trading department?
        • Are we servicing our call center customers’ requests in a timely fashion?
        • Was there a breach in our network?
  7. 7. It’s not a technology
  8. 8. It’s a Buzzword like SOA!
  9. 9. It’s an Architectural Pattern
  10. 10. What do you need for CEP?
  11. 11. Event driven
  12. 12. (soft) Real-time
  13. 13. (soft) Right-time (rather than Real-time)
  14. 14. Across all layers of the organization
  15. 15. Event Aggregation
  16. 16. Event Relationships
          • Causality
          • Membership
          • Timing
  17. 17. Event Patterns
  18. 18. Domain Specific Language for Event Processing
  19. 19. What you need for CEP
          • Event Driven
          • Right-time
          • Across all layers
          • Aggregation, Correlation & Traceability
          • Patterns
          • DSL
  20. 20. Common CEP Operations
          • Windowing
          • Transformation
          • Aggregation/Grouping
          • Merging/Union
          • Filtering
          • Sorting
          • Correlation
          • Pattern Detection
  21. 21. Esper
  22. 22. Esper makes it easier to build a CEP app
  23. 23. Not meant to replace Databases
  24. 24. But some parallels can be made
  25. 25. Esper                  | DB
          • Stores queries       | • Stores data
          • Continuous queries   | • On-demand queries
          • Time is a dimension  | • Time is a data type
  26. 26. Esper            | DB
          • EPL            | • SQL
          • Event Streams  | • Tables
          • Events         | • Rows
  27. 27. Esper Processing Model
  28. 28. EPL: Event Processing Language
  29. 29. Event Definition (1/2)
          create schema Event (
            id string,  // Event unique identifier
            ts long     // Timestamp (milliseconds)
          );
          create schema Tweet (
            user string,       // username (e.g. 'codebits')
            text string,       // actual tweet
            retweet_of string  // references a
          ) inherits Event;
  30. 30. Event Definition (2/2)
          create schema Hashtag (
            tweet_id string,  // references a
            user string,
            value string
          ) inherits Event;
          // Create Url and Mention event types as a copy of Hashtag
          create schema Url() copyfrom Hashtag;
          create schema Mention() copyfrom Hashtag;
  31. 31. Looks like SQL...
          // All events
          select * from Event;
          // Only tweets
          select user, text as status
          from Tweet;
  32. 32. Filtering
          // Tweets from @codebits
          select * from Tweet(user = 'codebits');
          // Another way to do it
          select * from Tweet where user = 'codebits';
          // All occurrences of #codebits not posted by @codebits
          select user, value as hashtag, current_timestamp() as ts
          from Hashtag(value = 'codebits' and user != 'codebits');
  33. 33. Stream Creation and Redirection
          insert into CodebitsTweets
          select * from Tweet(user = 'codebits');
          select * from CodebitsTweets;
  34. 34. Aggregation
          insert into UrlsPerSecond
          select count(*) as count from sec);
          // Every second (driven by above rule) calculate for last minute
          // - average Urls tweeted
          // - total Urls tweeted
          select avg(count), sum(count)
          from;
  35. 35. Grouping
          select value as hashtag, count(*)
          from Hashtag(value != null).win:time(30 seconds)
          group by value;
  36. 36. Simple Event Views
          select * from min);
          select * from hour);
          select * from;
          select * from;
  37. 37. Other Standard Event Views
          // Don’t use system clock, use event stream property
          select * from, 5 min);
          // Last 10 tweets per user
          select * from Tweet.std:groupwin(user).win:length(10);
          // Top 5 Hashtags
          select * from HashtagsPerMinute.ext:sort(5, count desc);
  38. 38. You can create your own custom Views
  39. 39. Correlation
          // Associate hashtags used to describe a URL
          insert into UrlTags
          select u.value as url, h.value as hashtag
          from Url.std:lastevent() as u, Hashtag.std:lastevent() as h
          where u.tweet_id = h.tweet_id;
          insert into UrlTagsCount
          select url, hashtag, count(*) as count
          from hour)
          group by url, hashtag;
  40. 40. Correlation (2/2)
          // Every minute, output Top 3 hashtags per URL
          select * from UrlTagsCount.ext:sort(3, count desc)
          output snapshot at(*/1,*,*,*,*);
  41. 41. Event Patterns
          // Measure how long it takes users to respond to Tweet
          insert into ResponseDelay
          select as tweet_id, t.user as author, m.value as responder,
                 t.ts as start_ts, m.ts as stop_ts, m.ts - t.ts as duration
          from pattern [
            every (t=Tweet -> m=Mention(value = t.user))
          ];
  42. 42. Detecting Missing Events
          // No Tweet from @codebits in 1 hour
          select *
          from pattern [
            every Tweet(user = 'codebits') ->
              (timer:interval(1 hour) and not Tweet(user = 'codebits'))
          ];
  43. 43. Other features
          • Subqueries
          • Inner, outer joins
          • Named windows
          • 1st class integration with databases (JDBC)
          • Regex-like Event Pattern matching (match_recognize)
  44. 44. Esper is awesome!
  45. 45. It’s not a silver bullet (well, duh!)
  46. 46. Memory Usage
  47. 47. Resilience & Persistence
  48. 48. Weak Pattern matching
  49. 49. Drill-down not trivial
  50. 50. It’s NOT distributed!
  51. 51. Not full-stack
  52. 52. Q&A. For more: @antonioalegria
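
The windowed grouping on slide 35 (a count per hashtag over the last 30 seconds) can be illustrated outside Esper. What follows is a minimal Python sketch of the idea, not Esper's implementation: the class `SlidingTimeWindowCounter` and its `add` method are hypothetical names, timestamps are assumed to be milliseconds arriving in non-decreasing order, and expiry is only evaluated when a new event arrives, whereas an engine like Esper also reacts to the passage of time.

```python
from collections import Counter, deque


class SlidingTimeWindowCounter:
    """Hypothetical helper: counts events per key over the last `window_ms` ms."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._events = deque()    # (timestamp, key) pairs in arrival order
        self._counts = Counter()  # current count per key inside the window

    def add(self, ts, key):
        """Record an event at time `ts` and return the current per-key counts."""
        self._expire(ts)
        self._events.append((ts, key))
        self._counts[key] += 1
        return dict(self._counts)

    def _expire(self, now):
        # Drop events that are at least `window_ms` older than `now`.
        while self._events and self._events[0][0] <= now - self.window_ms:
            _, old_key = self._events.popleft()
            self._counts[old_key] -= 1
            if self._counts[old_key] == 0:
                del self._counts[old_key]


# Usage: a 30-second window over a small, made-up stream of hashtag events.
window = SlidingTimeWindowCounter(window_ms=30_000)
window.add(0, "codebits")
window.add(10_000, "esper")
result = window.add(20_000, "codebits")  # all three events are in the window
late = window.add(35_000, "cep")         # the event at ts=0 has expired
print(result)
print(late)
```

Keeping the events in a deque keeps expiry cheap, because the oldest event is always at the left end and only expired events are ever touched.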