Complex Event Processing with Esper

Complex Event
Processing with Esper

@antonioalegria

Complex Event
Processing?
CEP

“Complex Event is an event
that could only happen if lots
of other events happened”

“CEP is a set of tools and
techniques for analyzing and
controlling the complex series
of interrelated events that drive
modern distributed information
systems”

David Luckham, 2002

Example

• Church bell ringing
• Appearance of a man in a tuxedo
• Appearance of a woman in a white gown
• Rice ﬂying through the air

Example

• Church bell ringing
• Appearance of a man in a tuxedo
• Appearance of a woman in a white gown
• Rice ﬂying through the air
Wedding has happened!

CEP Use Cases
• Are our business processes running on
time and correctly?
• Can we detect an opportunity for arbitrage
in our trading department?
• Are we servicing our call center customer’s
requests in a timely fashion?
• Was there a breach in our network?

It’s a Buzzword
like SOA!

It’s an Architectural
Pattern

What do you need for
CEP?

Across all layers of
organization

Event Relationships

• Causality
• Membership
• Timing

Domain Speciﬁc
Language
for Event Processing

What you need for
CEP
• Event Driven
• Right-time
• Across all layers
• Aggregation, Correlation & Traceability
• Patterns
• DSL

Common CEP
Operations
• Windowing
• Transformation
• Aggregation/Grouping
• Merging/Union
• Filtering
• Sorting
• Correlation
• Pattern Detection

Esper
http://esper.codehaus.org

Esper makes it easier to
build a CEP app

Not meant to replace
Databases

But some parallels can
be made

Esper DB

• Stores queries • Stores data

• Continuous queries • On-demand queries

• Time is a dimension • Time is a data type

Esper DB

• EPL • SQL

• Event Streams • Tables

• Events • Rows

Event Deﬁnition (1/2)

create schema Event (
id string, // Event unique identifier
ts long // Timestamp (milliseconds)
);

create schema Tweet (
user string, // username (e.g. ‘codebits’)
text string, // actual tweet
retweet_of string // references a Tweet.id
) inherits Event;

Event Deﬁnition (2/2)

create schema Hashtag (
tweet_id string, // references a Tweet.id
user string,
value string
) inherits Event;

// Create Url and Mention event types as a copy of Hashtag

create schema Url() copyfrom Hashtag;

create schema Mention() copyfrom Hashtag;

Looks like SQL...

// All events
select * from Event;

// Only tweets
select user, text as status
from Tweet;

Filtering

// Tweets from @codebits
select * from Tweet(user = 'codebits');

// Another way to do it
select * from Tweet where user = 'codebits';

// All occurrences of #codebits not posted by @codebits
select user,
value as hashtag,
current_timestamp() as ts
from Hashtag(value = 'codebits' and user != 'codebits');

Stream Creation and Redirection

insert into CodebitsTweets
select * from Tweet(user = ‘codebits’);

select * from CodebitsTweets;

Aggregation

insert into UrlsPerSecond
select count(*) as count from Url.win:time_batch(1 sec);

// Every second (driven by above rule) calculate for last minute
// - average Urls tweeted
// - total Urls tweeted
select avg(count), sum(count)
from UrlsPerSecond.win:length(60);

Grouping

select value as hashtag, count(*)
from Hashtag(value != null).win:time(30 seconds)
group by value;

Simple Event Views

select * from Tweet.win:time(5 min);

select * from Tweet.win:time_batch(1 hour);

select * from Tweet.win:length(10);

select * from Tweet.win:length_batch(10);

Other Standard Event Views

// Don’t use system clock, use event stream property
select * from Tweet.win:ext_timed(ts, 5 min);

// Last 10 tweets per user
select * from Tweet.std:groupwin(user).win:length(10);

// Top 5 Hashtags
select * from HashtagsPerMinute.std:sort(5, count desc);

You can create your
own custom Views

Correlation

// Associate hashtags used to describe a URL
insert into UrlTags
select u.value as url, h.value as hashtag
from Url.std:lastevent() as u,
Hashtag.std:lastevent() as h
where u.tweet_id = h.tweet_id;

insert into UrlTagsCount
select url,
hashtag,
count(*) as count
from UrlTags.win:time(1 hour)
group by url, hashtag;

Correlation (1/2)

// Every minute, output Top 3 hashtags per URL
select * from UrlTagsCount.ext:sort(3, count desc)
output snapshot at(*/1,*,*,*,*);

Event Patterns

// Measure how long it takes users to respond to Tweet
insert into ResponseDelay
select t.id as tweet_id,
t.user as author,
m.value as responder,
t.ts as start_ts,
m.ts as stop_ts,
m.ts - t.ts as duration
from pattern [
every (t=Tweet -> m=Mention(value = t.user))
];

Detecting Missing Events

// No Tweet from @codebits in 1 hour
select *
from pattern [ every Tweet(user = ‘codebits’) ->
(timer:interval(1 hour) and not Tweet(user = ‘codebits’))
];

Other features
• Subqueries
• Inner, outer joins
• Named windows
• 1 class integration with databases (JDBC)
st

• Regex-like Event Pattern matching (match-
recognize)

It’s not a silver bullet
well, duh!

Complex Event Processing with Esper

More Related Content

What's hot

Viewers also liked

Similar to Complex Event Processing with Esper

Recently uploaded

Complex Event Processing with Esper

Editor's Notes