
Event Detection and Characterization in Dynamic Graphs

Presented at ODD^2 @ KDD 2014 on August 24, 2014

Published in: Engineering

  1. Stony Brook University. Shebuti Rayana, Leman Akoglu
  2. Tax evasion, credit card fraud, network intrusion, healthcare fraud, and many more…
     (Slide footer: Shebuti & Leman, Event Detection & Characterization in Dynamic Graphs)
  3. Problem: Given a sequence of graphs,
     • Q1. Event detection: find time points at which the graph changes significantly
     • Q2. Characterization: find the (top k) nodes / edges / regions that change the most
  4. Main framework
     • Compute graph similarity/distance scores over time
     • Find unusual occurrences in time series
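The two-step framework on slide 4 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not something the slides state: each graph is represented as a set of edges, the distance between consecutive graphs is the Jaccard distance of their edge sets, and "unusual" means a z-score above a fixed threshold.

```python
from statistics import mean, pstdev

def jaccard_distance(edges_a, edges_b):
    """Distance in [0, 1] between two graphs given as edge collections."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def distance_series(graphs):
    """One distance per pair of consecutive graphs (length len(graphs) - 1)."""
    return [jaccard_distance(g0, g1) for g0, g1 in zip(graphs, graphs[1:])]

def flag_unusual(series, z_threshold=2.0):
    """Indices whose z-score exceeds the (assumed) threshold."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, s in enumerate(series) if (s - mu) / sigma > z_threshold]
```

Index `i` in the returned list refers to the transition from graph `i` to graph `i + 1`. Any other graph distance or time-series detector could be substituted for the two choices made here.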
  5. Flow of Ensemble Approach
     • Event Detection in Dynamic Graphs
     • Ensemble Algorithms
         - Eigen Behavior based Event Detection (EBED)
         - Probabilistic Approach (PTSAD)
         - SPIRIT
     • Consensus Method
         - Rank based
         - Score based
     • Results
         - Dataset 1: Challenge Network flow Data
         - Dataset 2: New York Times News Corpus
  6. Event Detection, then Consensus Rank Merging, then Characterization
     • Rank based
         - Inverse Rank
         - Kemeny Young
     • Score based
         - Unification (avg, max)
         - Mixture Model (avg, max)
     • Final Ensemble (Inverse Rank)
  7. • Numerous algorithms for event detection
     • Hard to decide which one will work well for a specific data set
     • Our goal: design an ensemble approach that might not give the best result but is “better” than most base algorithms
     • Challenges:
         - Different scores/scales
         - Different merging approaches
  8. • Extract “typical behavior” (eigen-behavior) of nodes/edges
     • eigen-behavior ≡ principal eigen-vector
     • Compare eigen-behavior over time
     • Score the time ticks depending on the amount of change in behavior from the previous time tick
     • Mark the ones with a high score as anomalous
     [Figure: T × N matrix, feature: degree]
  9. [Figure: Nodes × Time (T) × Features (egonet) data; for one feature (degree), a T × N matrix with a window W over past eigen-behaviors; the past pattern is the right singular vector of the eigen-behaviors in the window, compared with the eigen-behavior at t]
     Change-score metric: $Z = 1 - u^{T} r$
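The eigen-behavior scoring described on slides 8 and 9 can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: the slides do not say which matrix the principal eigenvector is taken from, so the code uses the uncentered Gram matrix of the last W rows (equivalently, the top right singular vector of the window). The names `ebed_scores` and `top_right_singular_vector` are hypothetical, and the absolute value in the score is added only to remove the arbitrary sign of singular vectors.

```python
import numpy as np

def top_right_singular_vector(M):
    """First right singular vector of M (rows: observations, columns: nodes)."""
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    return vt[0]

def ebed_scores(X, W):
    """X: T x N array holding one feature (e.g. degree) per node per time tick.

    Returns a length-T array of change scores Z = 1 - |u^T r|,
    with NaN wherever there is not yet enough history.
    """
    T, N = X.shape
    U = np.full((T, N), np.nan)   # eigen-behavior u at each time tick
    Z = np.full(T, np.nan)
    for t in range(W - 1, T):
        # Assumed definition of the eigen-behavior: top direction of the window.
        U[t] = top_right_singular_vector(X[t - W + 1 : t + 1])
        if t >= 2 * W - 1:
            # Past pattern r: right singular vector of the previous W eigen-behaviors.
            r = top_right_singular_vector(U[t - W : t])
            Z[t] = 1.0 - abs(U[t] @ r)
    return Z
```

Scores near 0 mean the current behavior matches the recent past; scores near 1 mean it is close to orthogonal to it. Time ticks would then be ranked by Z, highest first.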
  10. • Model individual node/edge time series with distributions
          - Poisson
          - Zero-inflated Poisson
          - Hurdle Process
              ▪ Hurdle Component: Bernoulli & Markov Chain
              ▪ Count Component: Zero-truncated Poisson
      • Model selection: AIC, log likelihood, Vuong’s test and log gain
      • Find the single-sided p-value as the probability of observing a count at least as extreme as v: P(X ≥ v)
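The single-sided p-value on slide 10 can be worked through for the simplest of the listed models. The sketch below covers only the plain Poisson case, fitted by its maximum-likelihood rate (the sample mean); the zero-inflated and hurdle variants and the model-selection step are not covered. The function names are hypothetical.

```python
import math

def fit_poisson(counts):
    """Maximum-likelihood Poisson rate: the sample mean of the counts."""
    return sum(counts) / len(counts)

def poisson_upper_tail(v, lam):
    """P(X >= v) for X ~ Poisson(lam), with v a non-negative integer."""
    # Compute 1 - P(X <= v - 1), building each pmf term from the previous one
    # to avoid evaluating large factorials.
    term = math.exp(-lam)   # P(X = 0)
    cdf = 0.0
    for k in range(v):
        cdf += term
        term *= lam / (k + 1)
    return max(0.0, 1.0 - cdf)
```

For example, with a history of counts averaging 2 per time tick, a new count of 9 gets a very small p-value, while a count of 2 does not.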
  11. [Figure-only slide; no text recovered]
  12. • Streaming Pattern dIscoveRy in multIple Time-series (SPIRIT) [Papadimitriou et al. 2005]
      • Discovers trends: whenever the trend changes, it introduces a new hidden variable and removes it when it is no longer needed
      • Detects anomalous points in trends
      • Node weights change at each step
      • At a change point, the node with the highest weight is the most anomalous
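The weight-tracking idea behind slide 12 can be illustrated with a heavily simplified sketch. It tracks a single hidden variable with an exponentially forgetting update of the kind SPIRIT is built on, and it leaves out the rule that adds or removes hidden variables. The forgetting factor `lam`, the initial energy `d0`, the starting weights, and the function name are all assumptions chosen for illustration.

```python
import numpy as np

def track_one_hidden_variable(stream, lam=0.96, d0=1e-3):
    """stream: iterable of length-N vectors, one per time tick.

    Yields (projection y, copy of the current node weights w) at each tick.
    """
    w = None
    d = d0
    for x in stream:
        x = np.asarray(x, dtype=float)
        if w is None:
            w = np.zeros_like(x)
            w[0] = 1.0            # arbitrary unit-length starting direction
        y = w @ x                 # hidden variable: projection onto the weights
        d = lam * d + y * y       # energy with exponential forgetting
        e = x - y * w             # reconstruction error
        w = w + (y / d) * e       # move the weights toward the data direction
        yield y, w.copy()
```

The node with the largest absolute weight at a given tick can be read off with `np.argmax(np.abs(w))`, which corresponds to the characterization rule in the last bullet of the slide.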
  13. [Diagram labels: Event Detection, Characterization]
  14. Consensus over RankList1/ScoreList1, RankList2/ScoreList2, RankList3/ScoreList3
      • Rank based
          - Inverse Rank
          - Kemeny Young [J. Kemeny 1959]
      • Score based
          - Unification [Zimek et al. 2011], avg & max
          - Mixture Model [Jing et al. 2006], avg & max
      • Final Ensemble: inverse rank, producing FinalRankList
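The rank-based and score-based merging on slide 14 can be sketched as below. The slides do not give formulas, so two assumptions are made: inverse-rank merging is taken to mean summing 1/rank across the rank lists, and the score-based variants are represented by plain min-max rescaling followed by an average or maximum. The min-max step is a stand-in, not the Unification or Mixture Model method cited on the slide. All function names are hypothetical.

```python
def ranks_from_scores(scores):
    """Rank 1 is the highest score; ties are broken by index order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

def inverse_rank_merge(rank_lists):
    """Consensus score per item: sum of 1/rank over all rank lists."""
    n = len(rank_lists[0])
    return [sum(1.0 / rl[i] for rl in rank_lists) for i in range(n)]

def minmax(scores):
    """Rescale scores to [0, 1] so that lists on different scales are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def combine_scores(score_lists, how="avg"):
    """Average or maximum of the rescaled scores, item by item."""
    columns = list(zip(*(minmax(s) for s in score_lists)))
    if how == "avg":
        return [sum(c) / len(c) for c in columns]
    if how == "max":
        return [max(c) for c in columns]
    raise ValueError(f"unknown combination rule: {how}")
```

A final rank list is obtained by passing the merged scores back through `ranks_from_scores`.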
  15. • We were given a “Cyber Challenge Network” from NGAS R&T Space Park
          - Simulated cyber network traffic
          - 10 days of activity
          - 125 hosts
          - To-from information with timestamps
      • Find “suspicious” events and the entities associated with the corresponding events in the Challenge Network
  16. [Figure: anomaly score per time tick for each base algorithm, feature: degree. Panels: Eigen-behaviors (Z-score); Probabilistic Approach (1 – norm. (sum p-value)); SPIRIT (projection)]
  17. [Figure: per-node scores at time tick 376. Panels: Eigen-behaviors (relative activity change); Probabilistic Approach (normal. |log(p)|); SPIRIT (projection weight)]
  18. Average Precision Table (Feature: Degree)

      Algorithm                            Sample rate (10 min)
      Base Algorithms
        EBED                               0.8333
        PTSAD                              0.5722
        SPIRIT                             0.7292
      Consensus Rank Merging Algorithms
        Inverse Rank (1/R)                 1.0000
        Kemeny Young                       0.8095
        Unification (avg)                  0.8056
        Unification (max)                  0.7255
        Mixture model (avg)                0.1684
        Mixture model (max)                0.1684
      Final Ensemble (1/R)                 0.8667
  19. Average Precision Table for Node anomalies (Feature: Degree, sample rate 10 min)

      Algorithm                            Event at 376    Event at 1126
      Base Algorithms
        EBED                               1.0000          1.0000
        PTSAD                              1.0000          0.2500
        SPIRIT                             0.3026          0.0213
      Consensus Rank Merging Algorithms
        Inverse Rank (1/R)                 1.0000          0.5000
        Kemeny Young                       1.0000          0.2000
        Unification (avg)                  1.0000          1.0000
        Unification (max)                  0.8333          1.0000
        Mixture model (avg)                1.0000          1.0000
        Mixture model (max)                1.0000          1.0000
      Final Ensemble (1/R)                 1.0000          1.0000
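The tables on slides 18 and 19 report average precision. The slides do not spell out how it was computed, so the sketch below uses the standard information-retrieval definition as an assumption: the mean of precision-at-k over the positions k where a true event appears in the ranked list. The function name and the example items are hypothetical.

```python
def average_precision(ranked_items, relevant):
    """Mean of precision@k over the ranks k at which a relevant item occurs."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    # Relevant items that never appear in the ranking contribute zero.
    return precision_sum / len(relevant)
```

As a worked example, if the true events are items 7 and 3 and a detector ranks the items as 7, 2, 3, 9, the precisions at the two hits are 1/1 and 2/3, so the average precision is 5/6.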
  20. [Figure-only slide; no text recovered]
  21. • ~8 years (Jan 2000 - July 2007) of articles published in the New York Times
      • Graph links: co-mention of named entities (people, places, organizations)
      • Sample rate: 1 week
      • No ground truth
      • Big events detected:
          - January, 2001 – George W. Bush elected US president
          - September 11, 2001 – Terrorist attack on the WTC
          - February 1, 2003 – Space Shuttle Columbia disaster
  22. [Figure: anomaly score over time for each base algorithm, feature: weighted degree. Panels: Eigen-behaviors (Z score); Probabilistic Approach (1 – norm. (sum p-value)); SPIRIT (projection). Annotated events: 2001 election, 9/11 WTC attack, Columbia disaster]
  23. [Figure-only slide; no text recovered]
  24. • Heterogeneous detectors
          - different scores
          - different effectiveness (depending on dataset)
      • Ensemble for event detection on dynamic graphs
      • Multiple consensus (merging) approaches
          - two-phase consensus finding
  25. • Near-future: robust consensus by automatically selecting effective base algorithms
          - Challenge: no ground truth
      • Near-future: real-time detection
      • Event detection under diverse data sources (e.g., news media, social media, the Web, …)
          - Challenges: different entity types, different time granularity, entity resolution
  26. Contact: srayana@cs.stonybrook.edu, http://www.cs.stonybrook.edu/~datalab/
      “Judge a man by his questions rather than his answers.” -Voltaire
      [Diagram labels: Event Detection, Characterization]
