
Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis

Talk IEEE RCIS 2013.


  1. LIP6, Université de Paris 1 - CRI
     Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis
     Sébastien Heymann, Bénédicte Le Grand
     Emails: Sebastien.Heymann@lip6.fr, Benedicte.Le-Grand@univ-paris1.fr
     May 30, 2013
  2. Monitoring user-system interactions
     What type of user-system interactions?
     • user-invoked services in information systems
     • social networks
     • ...
     What kind of monitoring?
     • discovery
     • conformance
     • model improvement
     Our ultimate goal: automatic and real-time anomaly detection.
  3. Studied social network
  4. GitHub interaction: code commit
  5. GitHub interaction: bug report
  6. Collected dataset
     [Figure: users linked to repositories, with one icon per interaction type]
     Examples of interactions:
     • commit code / merge repositories
     • open / close bug reports
     • comment on bug reports
     • edit the repository wiki
     "Who contributes to which source code repository"
     • 336,000 users and repositories monitored during 4 months.
     • 2.2 million interactions recorded sequentially with timestamps.
  7. Log trace sample
     User, user, repository, event, timestamp
     lukearmstrong, fuel, core, IssuesEvent, 1341420003
     Try-Git, clarkeash, try git, CreateEvent, 1341420006
     uGoMobi, jquery, jquery-mobile, IssuesEvent, 1341420009
     jexp, neo4j, java-rest-binding, IssueCommentEvent, 1341420011
     HosipLan, nette, nette, PullRequestEvent, 1341420152
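A minimal sketch of how such a trace could be parsed. It assumes the five columns are: acting user, repository owner, repository name, event type, and a Unix timestamp in seconds. That reading of the columns, the `Interaction` type and the `parse_trace` helper are assumptions made for illustration; they do not come from the talk.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    """One log line; field names are hypothetical, chosen for this sketch."""
    user: str
    repo: str        # "owner/name", so repositories sharing a name stay distinct
    event: str
    timestamp: int   # Unix time in seconds (assumed)


def parse_trace(text: str) -> list[Interaction]:
    """Parse comma-separated log lines and return them sorted by timestamp."""
    interactions = []
    for line in text.strip().splitlines():
        user, owner, name, event, ts = [field.strip() for field in line.split(",")]
        interactions.append(Interaction(user, f"{owner}/{name}", event, int(ts)))
    return sorted(interactions, key=lambda it: it.timestamp)


SAMPLE = """\
lukearmstrong, fuel, core, IssuesEvent, 1341420003
Try-Git, clarkeash, try git, CreateEvent, 1341420006
uGoMobi, jquery, jquery-mobile, IssuesEvent, 1341420009
jexp, neo4j, java-rest-binding, IssueCommentEvent, 1341420011
HosipLan, nette, nette, PullRequestEvent, 1341420152
"""

interactions = parse_trace(SAMPLE)
```

Sorting at parse time means later steps can rely on the links being in chronological order.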
  8. Bipartite graph
     [Figure: bipartite graph linking users to repositories]
     ⊤: users
     ⊥: repositories
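One possible in-memory representation of this bipartite graph is a pair of adjacency maps, one per node set. The sketch below assumes links are given as (user, repository) pairs and that repeated interactions between the same pair collapse into a single link; `build_bipartite` is a hypothetical helper name.

```python
from collections import defaultdict


def build_bipartite(links):
    """Return (top, bottom): user -> set of repos, repo -> set of users."""
    top = defaultdict(set)     # ⊤ nodes: users
    bottom = defaultdict(set)  # ⊥ nodes: repositories
    for user, repo in links:
        top[user].add(repo)
        bottom[repo].add(user)
    return dict(top), dict(bottom)


links = [
    ("lukearmstrong", "fuel/core"),
    ("uGoMobi", "jquery/jquery-mobile"),
    ("jexp", "neo4j/java-rest-binding"),
    ("HosipLan", "nette/nette"),
]
top, bottom = build_bipartite(links)
n_nodes = len(top) + len(bottom)  # users and repositories are counted separately
```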
  9-16. Links appear over time
     [Animation over eight slides: users, repositories and the links between them are added step by step]
     Detection of statistically abnormal link dynamics? Model of link dynamics? Link prediction?
  17. Methodology
      1. Order links by timestamp.
      2. Define a sliding window of width w (time unit?).
      3. Extract the bipartite graph from each window at interval i.
      4. Compute an appropriate property on each graph.
      5. Analyze the time series.
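The five steps can be sketched as follows. The sketch assumes real time as the window unit (w and i in seconds) and the number of distinct nodes as the property; both choices, and the `node_count_series` name, are illustrative assumptions. Windows start at the first timestamp and the last ones may be only partly filled.

```python
from bisect import bisect_left


def node_count_series(links, w, i):
    """Number of distinct nodes in each window [t, t + w), with t advancing by i.

    links: iterable of (timestamp, user, repo); w and i use the timestamp unit.
    """
    links = sorted(links)                        # step 1: order links by timestamp
    if not links:
        return []
    times = [t for t, _, _ in links]
    series = []
    start = times[0]
    while start <= times[-1]:                    # step 2: slide a window of width w
        lo = bisect_left(times, start)
        hi = bisect_left(times, start + w)
        users = {u for _, u, _ in links[lo:hi]}  # step 3: graph of this window
        repos = {r for _, _, r in links[lo:hi]}
        series.append((start, len(users) + len(repos)))  # step 4: the property
        start += i
    return series                                # step 5: time series to analyze


toy_links = [(0, "a", "r1"), (1, "b", "r1"), (5, "a", "r2"), (9, "c", "r3")]
series = node_count_series(toy_links, w=4, i=2)
```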
  18. Example
      [Figure: number of nodes vs. date, 11 March to 18 July: weekly pattern]
      [Figure: zoom, number of nodes vs. date, 15 April to 22 April: day-night pattern]
      w = 1 hour, i = 5 minutes.
      Question: don't temporal patterns hide information?
  19. Notions of time
      Extrinsic time (real time): time measured in units such as seconds. Good at revealing exogenous phenomena, e.g. day-night patterns.
      Intrinsic time (related to graph dynamics): time measured in units such as the transition between two states of the graph. Better at revealing endogenous phenomena independently from the graph dynamics?
  20. Window width: high resolution
      [Figure: number of nodes vs. time measured in number of links]
      w = 1000 links, i = 100 links.
      :) Additional observation
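With intrinsic time, the window is defined by a number of links rather than by a duration. A minimal sketch, assuming the links are already ordered by timestamp and given as (user, repository) pairs; `intrinsic_node_count_series` is a hypothetical name.

```python
def intrinsic_node_count_series(links, w, i):
    """Number of distinct nodes in windows of w consecutive links, shifted by i links.

    links: sequence of (user, repo) pairs already ordered by timestamp.
    Returns (index of the window's first link, node count) pairs.
    """
    series = []
    for start in range(0, len(links) - w + 1, i):
        window = links[start:start + w]
        users = {u for u, _ in window}
        repos = {r for _, r in window}
        series.append((start, len(users) + len(repos)))
    return series


toy_links = [("a", "r1"), ("b", "r1"), ("a", "r2"), ("c", "r3"), ("c", "r1"), ("a", "r1")]
series = intrinsic_node_count_series(toy_links, w=3, i=1)
```

Every window holds exactly w links, so a burst of activity no longer inflates the window content; it only makes intrinsic time advance faster relative to real time.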
  21. Window width: lower resolution
      [Figure: number of nodes vs. time measured in number of links]
      w = 50,000 links, i = 1000 links.
      :) No need for high resolution
  22. Event validation
      Visualization of the sub-graph: connected nodes are closer, disconnected nodes are more distant.
      In the sub-graph of 8,370 nodes and 10,000 links at the time of the event, one node has a high number of links: Try-Git interacts with 4,127 users (out of 5,000).
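Besides visual inspection, a quick programmatic check is to look for the node with the most distinct neighbours in the window's sub-graph. This is a sketch on toy data, not the validation procedure used in the talk; `highest_degree_node` is a hypothetical helper.

```python
from collections import defaultdict


def highest_degree_node(links):
    """Return (node, degree) for the node with the most distinct neighbours.

    links: iterable of (user, repo) pairs. Nodes are tagged with their side so
    that a user and a repository with the same name are not merged.
    """
    neighbours = defaultdict(set)
    for user, repo in links:
        neighbours[("user", user)].add(repo)
        neighbours[("repo", repo)].add(user)
    node = max(neighbours, key=lambda n: len(neighbours[n]))
    return node, len(neighbours[node])


toy_links = [("hub", "r1"), ("hub", "r2"), ("hub", "r3"), ("a", "r1"), ("b", "r4")]
node, degree = highest_degree_node(toy_links)
```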
  23. http://try.github.io
  24. Towards automatic anomaly detection
      Need for more elaborate properties, like:
      Internal links: their removal does not change the projection of the graph for a given set of nodes, either ⊤ or ⊥.
      [Figure: a bipartite graph G and G' = G - (red link); their ⊤-projections are equal, G'⊤ = G⊤]
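The definition can be turned into a direct test: a link (user, repo) is ⊤-internal when every other user of that repository still shares some other repository with the user, so no edge of the ⊤-projection disappears. The sketch below follows that reading; under it, a link to a repository with a single user is trivially internal. The function name and the toy data are assumptions.

```python
from collections import defaultdict


def top_internal_ratio(links):
    """Fraction of links whose removal leaves the ⊤-projection unchanged.

    links: iterable of (user, repo) pairs; users are ⊤ nodes, repos are ⊥ nodes.
    In the ⊤-projection, two users are linked if they share at least one repo.
    """
    links = set(links)
    if not links:
        return 0.0
    repos_of = defaultdict(set)
    users_of = defaultdict(set)
    for user, repo in links:
        repos_of[user].add(repo)
        users_of[repo].add(user)

    def is_internal(user, repo):
        # Users that remain neighbours of `user` through its other repositories.
        still_linked = set()
        for other_repo in repos_of[user] - {repo}:
            still_linked |= users_of[other_repo]
        # Internal if every co-contributor on `repo` remains a neighbour.
        return users_of[repo] - {user} <= still_linked

    return sum(is_internal(u, r) for u, r in links) / len(links)


toy_links = [("a", "r1"), ("b", "r1"), ("c", "r1"),
             ("a", "r2"), ("b", "r2"), ("c", "r3")]
ratio = top_internal_ratio(toy_links)
```

In the toy graph, the links (a, r2), (b, r2) and (c, r3) are internal, while the three links to r1 are not, because r1 is the only repository connecting c to a and b.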
  25. Results
      [Figure: ratio of ⊤-internal links (0.5 to 1.0) vs. time in number of links (0 to 2,300,000); points are colored by class (not outlier, potential outlier, outlier, unknown) and events are labeled A to K]
      w = 10,000 links, i = 1000 links. Color = outlier class using the automatic Outskewer method*.
      * S. Heymann, M. Latapy and C. Magnien. Outskewer: Using Skewness to Spot Outliers in Samples and Time Series, IEEE ASONAM 2012.
  26. Conclusion
      Contributions
      • Graph-based methodology to monitor user-system interactions
      • Intrinsic time unit avoids the impact of exogenous patterns
      • Smaller windows are not necessarily optimal
      • Checked relevance of detected events
      Applicable in other contexts
      • Client-server architectures
      • Process-message graphs
      • File-provider graphs
      • User-invoked services in information systems
  27. Future work
      • Which property for anomaly detection?
      • Models of interaction dynamics
      • Link prediction
  28. Questions? <sebastien.heymann@lip6.fr>
  29. Thank You!
  30. Backup Slides
  31. Statistically significant anomalies
      General definition: values which deviate remarkably from the remainder of values (Grubbs, 1969).
      Our definition (Outskewer method*): extremal value which skews a distribution of values.
      * Heymann, Latapy and Magnien. Outskewer: Using Skewness to Spot Outliers in Samples and Time Series, IEEE ASONAM 2012.
  32. Skewness coefficient
      $$\gamma = \frac{n}{(n-1)(n-2)} \sum_{x \in X} \left( \frac{x - \text{mean}}{\text{standard deviation}} \right)^{3}$$
      [Figure: examples of skewed distributions, density vs. x, one with γ < 0 and one with γ > 0]
      It is sensitive to extremal values (min/max) far from the mean!
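A small sketch of this coefficient in plain Python. It assumes the sample standard deviation (n - 1 denominator), which the slide does not specify; `skewness` is a hypothetical helper name.

```python
from math import sqrt


def skewness(values):
    """Sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / sd) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    mean = sum(values) / n
    # Sample standard deviation (n - 1 denominator): an assumption of this sketch.
    sd = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


symmetric = skewness([1, 2, 3, 4, 5])      # close to 0
right_tail = skewness([1, 1, 1, 1, 10])    # positive: one large extremal value
left_tail = skewness([-10, 1, 1, 1, 1])    # negative: one small extremal value
```

A single extremal value is enough to change the sign of the coefficient, which illustrates the sensitivity noted on the slide.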
  33. Automatic anomaly detection
      Outskewer classifies each value as: not outlier, potential outlier, outlier, or 'unknown' for heterogeneous distributions of values.
      [Figure: sample of values colored by status]
  34. Event detection in time series
      On a sliding window of size w, each value of X is classified w times. The final class of a value is the one that appears the most.
      [Figure: sliding window moving along the time axis]
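The voting scheme can be sketched independently of the classifier. In the sketch below, `toy_classifier` is a hypothetical stand-in and is not the Outskewer method; only the sliding-window majority vote is illustrated. Values near the ends of the series belong to fewer than w windows and therefore receive fewer votes.

```python
from collections import Counter


def classify_series(values, w, classify_window):
    """Classify each value in every window of size w containing it, then vote.

    classify_window takes a list of w values and returns one label per value.
    """
    votes = [Counter() for _ in values]
    for start in range(len(values) - w + 1):
        labels = classify_window(values[start:start + w])
        for offset, label in enumerate(labels):
            votes[start + offset][label] += 1
    # A value covered by no window (series shorter than w) gets no label.
    return [v.most_common(1)[0][0] if v else None for v in votes]


def toy_classifier(window):
    """Hypothetical stand-in, NOT Outskewer: flags values far above the window median."""
    median = sorted(window)[len(window) // 2]
    return ["outlier" if x > 3 * median else "not outlier" for x in window]


values = [1, 1, 1, 1, 50, 1, 1, 1, 1]
classes = classify_series(values, w=3, classify_window=toy_classifier)
```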
  35. Why Outskewer?
      • makes no strong hypothesis on the data
      • 1 parameter: the time window width
      • ignores regime changes (shifts in normality)
      • can be implemented on-line.
