SMDMS'13

Talk at International Workshop on Streaming Media Delivery and Management Systems

  1. A HIGH THROUGHPUT COMPLEX EVENT DETECTION TECHNIQUE WITH BULK EVALUATION
     Naotaka Nishimura (University of Tsukuba)
     Hideyuki Kawashima (University of Tsukuba)
     Hiroyuki Kitagawa (University of Tsukuba)
  2. Outline
     Background
     Related work (SASE)
       State of the art
       Chance for further improvement
     Proposal: Bulk evaluation
       Extension of SASE
     Evaluation
     Conclusions and Future work
  3. Big Data Streams (Volume, Velocity, Variety, Veracity, Value)
     Social network: Facebook, 600 TB in a day (VLDB’13 Keynote)
     Monitoring system: CISCO CRS-3, 322 Tbps
     Science: LHC, 15 PB / year; LSST, 20 TB / day
  4. Quick Review: Data Stream Management System (DSMS)
     Example question: how many packets arrived at port 80 in the last minute?
     Query Q1: SELECT COUNT(*) FROM eth0[TIME 1 MIN] WHERE port = 80
     Diagram: Q1 is registered in the DSMS, which outputs the result (20 in the slide).
     Relational schema of relation eth0:
       ・Destination IP
       ・Source IP
       ・Destination Port
       ・Source Port
       ・Interface (e.g. eth0)
       ・Length
       ・Version (e.g. IPv4)
       ・Payload
  5. SELECT COUNT(*) FROM eth0[TIME 1 MIN] WHERE port = 80
     Diagram: Data → Input adapter → DSMS (Query: w → σ → α) → Output adapter → Result → Users/Apps.
     SQL is translated into an operator tree. On arrival of data, the tree is evaluated.
     Operators are based on those of relational databases:
       w (Window): cuts a relation out of a stream
       σ (Selection): filter
       α (Aggregation): e.g. AVG, MIN, MAX
     CEP (complex event processing)
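The operator tree on this slide (window, then selection, then aggregation) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how any particular DSMS is implemented: the class name `SlidingCount`, the dictionary representation of a packet, and the sample timestamps are all hypothetical.

```python
from collections import deque


class SlidingCount:
    """Evaluate COUNT(*) over a time window for tuples matching a predicate."""

    def __init__(self, window_seconds, predicate):
        self.window_seconds = window_seconds
        self.predicate = predicate
        self.window = deque()  # (timestamp, packet) pairs currently in the window

    def push(self, timestamp, packet):
        # w (Window): add the new tuple and evict tuples older than the window.
        self.window.append((timestamp, packet))
        while self.window and self.window[0][0] <= timestamp - self.window_seconds:
            self.window.popleft()
        # sigma (Selection): keep only tuples that satisfy the predicate.
        selected = [p for _, p in self.window if self.predicate(p)]
        # alpha (Aggregation): COUNT(*) over the selected tuples.
        return len(selected)


# Hypothetical packets: (arrival time in seconds, destination port).
q1 = SlidingCount(60, lambda p: p["port"] == 80)
counts = [q1.push(t, {"port": port})
          for t, port in [(0, 80), (10, 443), (30, 80), (61, 80)]]
```

Re-evaluating the whole window on every arrival keeps the sketch close to the operator tree; a real engine would more likely maintain the count incrementally.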
  6. Complex Event Processing (CEP)
     Detect a certain pattern in input stream data.
     Stream data: A1 A2 B3 C4 E5 D6 D7 …
     Query pattern: A→B→D
     Pattern occurrences (sequences of events specified by the user):
       A1→B3→D6
       A2→B3→D6
       A1→B3→D7
       A2→B3→D7
       …
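To make the notion of a pattern occurrence concrete, the following brute-force sketch enumerates every in-order combination of events whose types match the query pattern. It is a naive reference enumeration, not the SASE algorithm or the proposal; the function name `pattern_occurrences` and the convention that the first character of an event label is its type are assumptions made for this example.

```python
def pattern_occurrences(stream, pattern):
    """Return all event sequences that match `pattern` in arrival order."""
    results = []

    def extend(prefix, start, depth):
        if depth == len(pattern):
            results.append(tuple(prefix))
            return
        for i in range(start, len(stream)):
            # The first character of a label such as "B3" is its event type.
            if stream[i][0] == pattern[depth]:
                extend(prefix + [stream[i]], i + 1, depth + 1)

    extend([], 0, 0)
    return results


stream = ["A1", "A2", "B3", "C4", "E5", "D6", "D7"]
occurrences = pattern_occurrences(stream, "ABD")
```

For the stream on the slide this yields the four occurrences listed there, which gives a baseline against which a faster detection scheme can be checked.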
  7. Complex Event Processing (CEP)
     A case for order management in a restaurant.
     Detect a guest who passed the entrance and took a seat.
     Pattern: Entrance→Seat
     Diagram: floor map with the places Entrance, Floor, Toilet, and Seat1 to Seat6, next to a table of RFID readings with columns Time, TagID, Place (tags xx and yy, times 9:54:11 to 10:11:11).
     A pattern occurrence is constructed by 2 …
  8. Outline
     Background
     Related work (SASE)
     Proposal
     Evaluation
     Conclusions and Future work
  9. SASE [1] Overview (1/2)
     [1]: High-Performance Complex Event Processing over Streams, ACM SIGMOD 2006
     SASE detects specified patterns using an NFA (nondeterministic finite automaton).
     NFA (quick review):
       An NFA is a finite automaton that can be in multiple states at the same time.
       A finite automaton moves from its current state to a next state on each input symbol. It consists of an initial state, acceptance states, a state set, input symbols, and a transition function.
       Ex) An NFA that detects A→B→D
       Self transition: a self-loop transition that is invoked by every event.
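The NFA review on this slide can be illustrated with a small simulation that tracks the set of active states. This is a sketch under assumptions, not SASE's implementation: the function name `run_nfa` is hypothetical, the self-loops are placed on the intermediate states, and the initial state is assumed to stay active so that a match can start at any event.

```python
def run_nfa(event_types, pattern):
    """Return the positions at which the acceptance state is reached."""
    final = len(pattern)          # states are numbered 0 .. len(pattern)
    active = {0}                  # an NFA can be in several states at once
    accepted_at = []
    for position, event_type in enumerate(event_types):
        next_active = {0}         # assumption: the initial state is always active
        for state in active:
            if 0 < state < final:
                next_active.add(state)       # self-loop, invoked by every event
            if state < final and event_type == pattern[state]:
                next_active.add(state + 1)   # forward transition on a matching type
        if final in next_active:
            accepted_at.append(position)
        active = next_active
    return accepted_at


# Event types of the stream A1 A2 B3 C4 E5 D6 D7.
accept_positions = run_nfa("AABCEDD", "ABD")
```

The simulation reports only when the pattern was completed (here at the two D events), not which earlier events took part in it.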
  10. SASE Overview (2/2)
      Problem of NFA: an NFA can detect specified patterns, but it does not produce pattern occurrences (the sequences of input events that reached the acceptance state).
      SASE uses a stack structure (AIS) to output pattern occurrences.
      AIS (Active Instance Stack): one AIS is prepared for each state.
      Diagram: states 0, 1, 2, 3 connected by transitions A, B, D, with self-loops (*) on states 1 and 2 and an AIS attached to states 1, 2 and 3.
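  11. Behavior of SASE (1/3)
      Translate a query pattern into an NFA.
      Diagram: pattern A→B→D becomes states 0, 1, 2, 3 with transitions A (0 to 1), B (1 to 2), D (2 to 3) and self-loops (*) on states 1 and 2.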
  12. Behavior of SASE (2/3)
      Prepare an AIS for each state of the NFA.
      Create a link when an event is pushed.
      Event arrival sequence (time t): a1 c2 b3 a4 d5
      Diagram: the AIS of state 1 holds a1 and a4, the AIS of state 2 holds b3, and the AIS of state 3 holds d5.
  13. Behavior of SASE (3/3)
      Create a pattern occurrence from the link information when the acceptance state is reached.
      Diagram: AIS contents a1, a4 (state 1), b3 (state 2), d5 (state 3); generated pattern occurrence: a1 b3 d5.
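The three SASE steps (one AIS per state, a link created on push, result construction at the acceptance state) can be sketched as follows. This is a simplified reading of the slides rather than the SASE implementation: the class name `Sase`, the choice to store each link as an index into the previous stack, and the restriction to patterns of at least two distinct event types are assumptions of this example.

```python
class Sase:
    """Simplified SASE-style matcher that builds results on every final event."""

    def __init__(self, pattern):
        self.pattern = pattern                    # e.g. "ABD"; distinct types, length >= 2
        self.stacks = [[] for _ in pattern[:-1]]  # one AIS per non-final step

    def push(self, event):
        """Push an event such as 'b3'; return the occurrences it completes."""
        step = self.pattern.find(event[0].upper())
        if step < 0:
            return []                             # irrelevant type, e.g. c2
        # Link: index of the newest instance in the previous AIS at push time.
        link = len(self.stacks[step - 1]) - 1 if step > 0 else None
        if step > 0 and link < 0:
            return []                             # nothing to extend yet
        if step < len(self.pattern) - 1:
            self.stacks[step].append((event, link))
            return []
        # Acceptance state reached: follow the links to build the results.
        return [p + (event,) for p in self._prefixes(step - 1, link)]

    def _prefixes(self, step, top):
        """Partial matches ending at any instance at or below index `top`."""
        if step == 0:
            return [(e,) for e, _ in self.stacks[0][: top + 1]]
        out = []
        for e, link in self.stacks[step][: top + 1]:
            out.extend(p + (e,) for p in self._prefixes(step - 1, link))
        return out


sase = Sase("ABD")
sase_results = []
for ev in "a1 c2 b3 a4 d5".split():
    sase_results.extend(sase.push(ev))
```

With the arrival sequence from the slides, a4 is not combined with b3 because b3 was linked while a1 was the newest A instance, so d5 completes only a1 b3 d5.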
  14. Problem of SASE we found
      Duplicate generation (e.g. b3 → a1)
      IDEA: If we can evaluate d7 and d9 in a lump, the cost for constructing a1-b3 should be reduced (2 to 1).
      Diagram: after b6, d7, a8, d9 arrive, result generation runs once for d7 and once for d9, and both runs traverse the same A and B instances (a1, a4, b3, b6).
  15. Outline
      Background
      Related work (SASE)
      Proposal: Bulk evaluation
        Extension of SASE
      Evaluation
      Conclusions and Future work
  16. Concept: Bulk Evaluation
      Event stream (time t): a1 c2 b3 a4 d5 b6 d7 a8 d9 b10 d11 d12 b13
      [SASE] Generate Result is invoked five times.
      [Proposal] Generate Result is invoked twice.
  17. Behavior of Proposal (1/3)
      Create a link when an event is pushed to an AIS.
      Keep D events, unlike SASE.
      Event arrival sequence (time t): a1 c2 b3 a4 d5 b6 d7 a8 d9
      Diagram: the AIS of state 1 holds a1, a4, a8; the AIS of state 2 holds b3, b6; the AIS of state 3 holds d5, d7, d9.
  18. Behavior of Proposal (2/3)
      Create a cluster on the final AIS.
      Diagram: the D events kept in the final AIS (d5, d7, d9) are grouped into clusters.
  19. Behavior of Proposal (3/3)
      Create pattern occurrences in bulk.
      The result for d9 is made from the result for d7.
      Pattern occurrences:
        a1 b3 d5
        a1 b3 d7, a1 b6 d7, a4 b6 d7
        a1 b3 d9, a1 b6 d9, a4 b6 d9
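One plausible reading of the bulk evaluation steps is sketched below: D events are kept instead of being evaluated immediately, D events that share the same link form a cluster, and the A-B prefixes are constructed once per cluster. The class name `BulkSase`, the explicit `flush()` call that triggers result generation, and the `prefix_builds` counter are assumptions of this example; the slides do not state when the proposal triggers generation.

```python
class BulkSase:
    """Simplified SASE variant that defers and batches result generation."""

    def __init__(self, pattern):
        self.pattern = pattern                    # e.g. "ABD"; distinct types, length >= 2
        self.stacks = [[] for _ in pattern[:-1]]  # AIS for every non-final step
        self.clusters = {}                        # link -> final events sharing that link
        self.prefix_builds = 0                    # how many times prefixes were constructed

    def push(self, event):
        step = self.pattern.find(event[0].upper())
        if step < 0:
            return                                # irrelevant type, e.g. c2
        link = len(self.stacks[step - 1]) - 1 if step > 0 else None
        if step > 0 and link < 0:
            return                                # nothing to extend yet
        if step < len(self.pattern) - 1:
            self.stacks[step].append((event, link))
        else:
            # Keep the final event and cluster it by its link.
            self.clusters.setdefault(link, []).append(event)

    def flush(self):
        """Generate all pending pattern occurrences in bulk."""
        results = []
        for link, finals in self.clusters.items():
            prefixes = self._prefixes(len(self.pattern) - 2, link)  # once per cluster
            self.prefix_builds += 1
            results.extend(p + (d,) for d in finals for p in prefixes)
        self.clusters = {}
        return results

    def _prefixes(self, step, top):
        if step == 0:
            return [(e,) for e, _ in self.stacks[0][: top + 1]]
        out = []
        for e, link in self.stacks[step][: top + 1]:
            out.extend(p + (e,) for p in self._prefixes(step - 1, link))
        return out


engine = BulkSase("ABD")
for ev in "a1 c2 b3 a4 d5 b6 d7 a8 d9".split():
    engine.push(ev)
bulk_results = engine.flush()
```

In this sketch d7 and d9 fall into the same cluster because both were linked to b6, so their three shared prefixes are built once instead of once per D event, while d5 forms its own cluster.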
  20. Outline
      Background
      Related work
      Proposal
      Evaluation
      Conclusions and Future work
  21. Environment for Experiment
      OS: Windows XP
      RAM: 3 GB
      CPU: Intel Core 2 Duo E8400 (3 GHz)
      Language: Java (JRE 1.7.0_4)
  22. Result of Experiment
      Pattern: A→B→D
      Throughput improvement: up to 5.24 times
  23. Outline
      Background
      Related work
      Proposal
      Evaluation
      Conclusions and Future work
  24. Conclusions and Future Work
      Conclusions
        SASE had room for further improvement in throughput.
        The bulk evaluation scheme improved throughput, by a factor of 5.24 in the best case.
      Future work
        Implementing the proposal in Falcon
  25. Diagram labels (related systems and topics): CryptDB Privacy Preservation Encryption FPGA Privacy ML&DM Jubatus Online LDA CPD SQL Norikra System S Borealis Puma MADLib @UCB Spring (DTW) Data Mining & Machine Learning Esper Kafka SASE STORM Cayuga Window join Bismarck Online @Stanford Intel MIC NoSQL MLBase Oracle-R Incr. LOCI Tilera Accelerator Falcon GPGPU Window aggregate Relational stream Continual query & Window Complex event processing Tuple stream S4
  26. Falcon
      - UDP-RX
      - Window join (64-cores)
      Performance Monitor
      Basic: 6.7 million tuples per second
      Proposal: 14.6 million tuples per second