Event Detection and Characterization in Dynamic Graphs

  1. Shebuti Rayana and Leman Akoglu, Stony Brook University
  2. Tax evasion, credit card fraud, network intrusion, healthcare fraud, and many more…
     (Slide footer: Shebuti & Leman, Event Detection & Characterization in Dynamic Graphs)
  3. Problem: given a sequence of graphs,
       • Q1. Event detection: find the time points at which the graph changes significantly
       • Q2. Characterization: find the (top-k) nodes / edges / regions that change the most
  4. Main framework
       • Compute graph similarity/distance scores over time
       • Find unusual occurrences in the resulting time series
  5. Flow of the ensemble approach: event detection in dynamic graphs
       • Ensemble algorithms: Eigen-Behavior based Event Detection (EBED), Probabilistic Approach (PTSAD), SPIRIT
       • Consensus methods: rank based, score based
       • Results: Dataset 1 (Challenge Network flow data), Dataset 2 (New York Times news corpus)
  6. Event Detection and Characterization: consensus rank merging
       • Rank based: Inverse Rank, Kemeny-Young
       • Score based: Unification (avg, max), Mixture Model (avg, max)
       • Final ensemble (Inverse Rank)
  7. Numerous algorithms exist for event detection
       • It is hard to decide which one will work well for a specific data set
       • Our goal: design an ensemble approach that may not give the best result but is "better" than most base algorithms
       • Challenges: different scores/scales, different merging approaches
  8. Eigen-behavior based event detection
       • Extract the "typical behavior" (eigen-behavior) of nodes/edges; eigen-behavior ≡ principal eigenvector
       • Compare eigen-behaviors over time
       • Score each time tick by the amount of change in behavior from the previous time tick
       • Mark the time ticks with high scores as anomalous
       • Figure: T × N matrix of node features over time (feature: degree)
  9. Figure: Nodes × Features (egonet) × Time; for one feature (degree), a T × N matrix scanned with a window of size W; the eigen-behavior at t is compared with the past pattern (right singular vector of the past eigen-behaviors)
       • Change-score metric: Z = 1 − uᵀr, where u is the eigen-behavior at t and r is the past pattern
  10. Probabilistic approach: model the time series of individual nodes/edges with count distributions
       • Poisson
       • Zero-inflated Poisson
       • Hurdle process
           ▪ Hurdle component: Bernoulli and Markov chain
           ▪ Count component: zero-truncated Poisson
       • Model selection: AIC, log likelihood, Vuong's test, and log gain
       • Find the single-sided p-value as the probability of observing a count at least as extreme as v, P(X ≥ v)
  12. Streaming Pattern dIscoveRy in multIple Time-series (SPIRIT) [Papadimitriou et al. 2005]
       • Discovers trends; whenever the trend changes, it introduces a new hidden variable and removes it when it is no longer needed
       • Detects anomalous points in trends
       • Node weights change at each step
       • At a change point, the node with the highest weight is the most anomalous
  13. Event Detection / Characterization
  14. Consensus: RankList1/ScoreList1, RankList2/ScoreList2, RankList3/ScoreList3 → FinalRankList
       • Rank based: Inverse Rank; Kemeny-Young [J. Kemeny 1959]
       • Score based: Unification [Zimek et al. 2011] (avg and max); Mixture Model [Jing et al. 2006] (avg and max)
       • Final ensemble: inverse rank
  15. We were given a "Cyber Challenge Network" from NGAS R&T Space Park
       • Simulated cyber network traffic
       • 10 days of activity
       • 125 hosts
       • To-from information with timestamps
       • Task: find "suspicious" events and the entities associated with those events in the Challenge Network
  16. Figure: event scores per time tick (feature: degree). Panels: Eigen-behaviors (Z-score), Probabilistic Approach (1 − norm. sum of p-values), SPIRIT (projection)
  17. Figure: node scores at time tick 376. Panels: Eigen-behaviors (relative activity change), Probabilistic Approach (normalized |log(p)|), SPIRIT (projection weight); x-axis: nodes
  18. Average precision table (feature: degree)
        Algorithm                            Sample rate: 10 min
        Base algorithms
          EBED                               0.8333
          PTSAD                              0.5722
          SPIRIT                             0.7292
        Consensus rank merging algorithms
          Inverse Rank (1/R)                 1.0000
          Kemeny-Young                       0.8095
          Unification (avg)                  0.8056
          Unification (max)                  0.7255
          Mixture model (avg)                0.1684
          Mixture model (max)                0.1684
        Final Ensemble (1/R)                 0.8667
  19. Average precision table for node anomalies (feature: degree; sample rate: 10 min)
        Algorithm                            Event at 376    Event at 1126
        Base algorithms
          EBED                               1.0000          1.0000
          PTSAD                              1.0000          0.2500
          SPIRIT                             0.3026          0.0213
        Consensus rank merging algorithms
          Inverse Rank (1/R)                 1.0000          0.5000
          Kemeny-Young                       1.0000          0.2000
          Unification (avg)                  1.0000          1.0000
          Unification (max)                  0.8333          1.0000
          Mixture model (avg)                1.0000          1.0000
          Mixture model (max)                1.0000          1.0000
        Final Ensemble (1/R)                 1.0000          1.0000
  21. New York Times news corpus
       • ~8 years (Jan 2000 – July 2007) of published New York Times articles
       • Graph links: co-mentions of named entities (people, places, organizations)
       • Sample rate: 1 week
       • No ground truth
       • Big events detected:
           ▪ January 2001 – George W. Bush elected US president
           ▪ September 11, 2001 – terrorist attack on the WTC
           ▪ February 1, 2003 – Space Shuttle Columbia disaster
  22. Figure: event scores over time (feature: weighted degree). Panels: Eigen-behaviors (Z-score), Probabilistic Approach (1 − norm. sum of p-values), SPIRIT (projection). Annotated events: 2001 election, 9/11 WTC attack, Columbia disaster
  24. Heterogeneous detectors: different scores, different effectiveness (depending on the dataset)
      Ensemble for event detection on dynamic graphs: multiple consensus (merging) approaches, two-phase consensus finding
  25. Near future: robust consensus by automatically selecting effective base algorithms (challenge: no ground truth)
      Near future: real-time detection
      Event detection under diverse data sources (e.g., news media, social media, the Web, …); challenges: different entity types, different time granularity, entity resolution
  26. Contact: srayana@cs.stonybrook.edu, http://www.cs.stonybrook.edu/~datalab/
      "Judge a man by his questions rather than his answers." (Voltaire)
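
The change score on slide 9, Z = 1 − uᵀr, can be illustrated with a small sketch. This is not the authors' implementation: the function names are hypothetical, the eigen-behavior is taken as the principal eigenvector of the node-to-node correlation matrix within one window, the past pattern is approximated by a normalized mean of past eigen-behaviors (the slide mentions a singular vector instead), and the absolute value in the score is an added guard against the arbitrary sign of eigenvectors.

```python
import math

def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v

def correlation_matrix(window):
    """window: W rows (time ticks), each a list of N node feature values."""
    n = len(window[0])
    normalized = []
    for j in range(n):
        col = [row[j] for row in window]
        mean = sum(col) / len(col)
        # A constant series has no defined correlation; unit() leaves it as zeros.
        normalized.append(unit([x - mean for x in col]))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in normalized]
            for ci in normalized]

def principal_eigenvector(matrix, iterations=200):
    """Power iteration; adequate for a small symmetric PSD matrix."""
    v = unit([1.0 + 0.01 * i for i in range(len(matrix))])  # fixed start vector
    for _ in range(iterations):
        v = unit([sum(m * x for m, x in zip(row, v)) for row in matrix])
    return v

def typical_pattern(past):
    """Assumption: summarize past eigen-behaviors by their sign-aligned mean."""
    ref = past[0]
    total = [0.0] * len(ref)
    for u in past:
        sign = 1.0 if sum(a * b for a, b in zip(u, ref)) >= 0 else -1.0
        total = [t + sign * x for t, x in zip(total, u)]
    return unit(total)

def change_score(u, r):
    """Z = 1 - u^T r, with abs() added to ignore eigenvector sign flips."""
    return 1.0 - abs(sum(a * b for a, b in zip(u, r)))

# Toy data: nodes 0 and 1 move together first, then nodes 0 and 2 do.
window_a = [[1, 2, 5], [2, 4, 1], [3, 6, 4], [4, 8, 2]]
window_b = [[1, 5, 2], [2, 1, 4], [3, 4, 6], [4, 2, 8]]
u_a = principal_eigenvector(correlation_matrix(window_a))
u_b = principal_eigenvector(correlation_matrix(window_b))
r = typical_pattern([u_a, u_a])
print(change_score(u_a, r), change_score(u_b, r))
```

A score near 0 means the current eigen-behavior matches the past pattern; a score near 1 means the two are close to orthogonal, which is what the slides flag as a candidate event.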
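
The single-sided p-value on slide 10, P(X ≥ v), can be sketched for the plain Poisson case only. The zero-inflated and hurdle variants and the model-selection step are not covered here. The function names are hypothetical, and fitting the rate as the mean of the historical counts (the Poisson maximum-likelihood estimate) is an assumption about how the model would be fitted.

```python
import math

def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate), computed in log space for stability."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))

def poisson_upper_pvalue(v, rate):
    """Single-sided p-value P(X >= v) for X ~ Poisson(rate), rate > 0."""
    if v <= 0:
        return 1.0
    below = sum(poisson_pmf(k, rate) for k in range(v))
    return max(0.0, 1.0 - below)

# Hypothetical degree counts of one node in past time ticks.
history = [2, 1, 3, 2, 2, 1, 3, 2]
rate = sum(history) / len(history)      # sample mean = 2.0

# A count of 10 in the current tick is very unlikely under this model.
print(poisson_upper_pvalue(10, rate))
```

A small p-value marks the node's count at that time tick as unusually high; the slides then combine the per-node p-values into a score per time tick.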
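
The inverse-rank consensus named on slides 6 and 14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each base detector is assumed to supply a rank for every time tick (1 = most anomalous), the consensus score is taken as the sum of 1/rank across detectors, and ties are broken by item value. The function name and the example ranks are hypothetical.

```python
def inverse_rank_merge(rank_lists):
    """Merge rank lists (dicts of item -> rank, 1 = most anomalous).

    Returns the items ordered by decreasing summed inverse rank.
    """
    scores = {}
    for ranks in rank_lists:
        for item, rank in ranks.items():
            scores[item] = scores.get(item, 0.0) + 1.0 / rank
    # Higher summed 1/rank means stronger agreement that the item is anomalous.
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical ranks of three time ticks from three base detectors.
detector_1 = {376: 1, 1126: 2, 50: 3}
detector_2 = {1126: 1, 376: 2, 50: 3}
detector_3 = {376: 1, 50: 2, 1126: 3}

print(inverse_rank_merge([detector_1, detector_2, detector_3]))
# → [376, 1126, 50]
```

Because 1/rank decays quickly, an item ranked first by even one detector keeps a high consensus score, which favors the top of each list over agreement further down.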
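
The tables on slides 18 and 19 report average precision. The slides do not spell out the formula, so the sketch below uses the standard information-retrieval definition as an assumption: the mean, over the true events, of the precision at the position where each true event appears in the ranked list. The function name and the example ranking are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average precision of a ranked list against a set of true events."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position   # precision at this position
    # True events missing from the ranking contribute zero.
    return total / len(relevant)

# Hypothetical ranking of time ticks with two true events.
print(average_precision([376, 50, 1126], {376, 1126}))   # (1/1 + 2/3) / 2
```

A value of 1.0 means every true event is ranked ahead of every non-event.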

Editor's Notes

  • My work focuses on discovering patterns and detecting anomalies in real-world data using graph analytics techniques, and on developing effective and efficient tools to do so.