  • A growing number of applications generate streams of data: computer network monitoring data, highway traffic data, call detail records in the telecom industry, online purchase logs, and data collected by sensor networks. Data stream management has become an active research area in recent years, and a new application area for data mining. Why is that? A defining feature of a data stream is its high volume: it is not possible to store all of the data as a traditional database would. Instead, the data stream must be modeled, and data mining is well suited to this modeling task. On the slide, the items in parentheses are the datasets available to us. The datasets highlighted in red are those used in this proposal. The Cisco VoIP data is an 8-week VoIP call log collected at Cisco. MnDot is provided by the Minnesota Department of Transportation; it is highway traffic data for the Twin Cities area, available from 2000 to the present. Ouse and Serwent are two sets of sensor network data. Ouse is river level data at three locations near York in the UK, and Serwent is water flow rate data at 7 locations near Serwent in the UK. (2’20’’)
  • Examining stream data reveals the following characteristics. The data are raw, because only online preprocessing is applicable. Records may arrive at a rapid rate. The volume of data is high, possibly infinite. The data profile may change on the fly, which we call concept change or concept drift. Data can be multidimensional. Temporal dependency may occur in the data series. To process a data stream, a technique must satisfy the following requirements. Data must be modeled, since it is not possible to store all of it; in the literature the modeled data is called a synopsis of the data stream. Single pass: each record or data point is read at most once, and random access to data as in relational databases is not possible. The storage space of the synopsis is bounded. Each record must be processed in a soft real-time manner. The system should respond to queries incrementally. (1’35’’)

    1. 1. Anomaly Detection Using Data Mining Techniques Margaret H. Dunham, Yu Meng, Donya Quick, Jie Huang, Charlie Isaksson CSE Department Southern Methodist University Dallas, Texas 75275 [email_address] This material is based upon work supported by the National Science Foundation under Grant No. IIS-0208741
    2. 2. Objectives/Outline <ul><li>Develop modeling techniques which can “learn/forget” past behavior of spatiotemporal stream events. Apply to prediction of anomalous events. </li></ul><ul><li>Introduction </li></ul><ul><li>EMM Overview </li></ul><ul><li>EMM Applications to Anomaly Detection </li></ul><ul><li>Future Work </li></ul>
    3. 4. Outline <ul><li>Introduction </li></ul><ul><ul><li>Motivation </li></ul></ul><ul><ul><li>What is an anomaly? </li></ul></ul><ul><ul><li>Spatiotemporal Data </li></ul></ul><ul><ul><li>Modeling Spatiotemporal Data </li></ul></ul><ul><li>EMM Overview </li></ul><ul><li>EMM Applications to Anomaly Detection </li></ul><ul><li>Future Work </li></ul>
    4. 5. Motivation <ul><li>A growing number of applications generate streams of data. </li></ul><ul><ul><li>Computer network monitoring data </li></ul></ul><ul><ul><li>Call detail records in telecommunications </li></ul></ul><ul><ul><li>Highway transportation traffic data </li></ul></ul><ul><ul><li>Online web purchase log records </li></ul></ul><ul><ul><li>Sensor network data </li></ul></ul><ul><ul><li>Stock exchange, transactions in retail chains, ATM operations in banks, credit card transactions. </li></ul></ul><ul><li>Data mining techniques play a key role in modeling and analyzing this data. </li></ul>
    5. 6. What is Anomaly? <ul><li>Event that is unusual </li></ul><ul><li>Event that doesn’t occur frequently </li></ul><ul><li>Predefined event </li></ul><ul><li>What is unusual? </li></ul><ul><li>What is deviation? </li></ul>
    6. 7. What is Anomaly in Stream Data? <ul><li>Rare - Anomalous – Surprising </li></ul><ul><li>Out of the ordinary </li></ul><ul><li>Not outlier detection </li></ul><ul><ul><li>No knowledge of data distribution </li></ul></ul><ul><ul><li>Data is not static </li></ul></ul><ul><ul><li>Must take temporal and spatial values into account </li></ul></ul><ul><ul><li>May be interested in sequence of events </li></ul></ul><ul><li>Ex: Snow in upstate New York is not an anomaly </li></ul><ul><ul><li>Snow in upstate New York in June is rare </li></ul></ul><ul><li>Rare events may change over time </li></ul>
    7. 8. Statistical View of Anomaly <ul><li>Outlier </li></ul><ul><li>Data item that is outside the normal distribution of the data </li></ul><ul><li>Identify by Box Plot </li></ul>Image from Data Mining, Introductory and Advanced Topics , Prentice Hall, 2002.
    8. 9. Statistical View of Anomaly <ul><li>Identify by looking at distribution </li></ul><ul><li>THIS DOES NOT WORK with stream data </li></ul>Image from www.wikipedia.org , Normal distribution .
    9. 10. Data Mining View of Anomaly <ul><li>Classification Problem </li></ul><ul><ul><li>Build classifier from training data </li></ul></ul><ul><ul><li>Problem is that training data shows what is NOT an anomaly </li></ul></ul><ul><ul><li>Thus an anomaly is anything that is not viewed as normal by the classification technique </li></ul></ul><ul><ul><li>MUST build dynamic classifier </li></ul></ul><ul><li>Identify anomalous behavior </li></ul><ul><ul><li>Signatures of what anomalous behavior looks like </li></ul></ul><ul><ul><li>Input data is identified as anomaly if it is similar enough to one of these signatures </li></ul></ul><ul><li>Mixed – Classification and Signature </li></ul>
    10. 11. Visualizing Anomalies <ul><li>Temporal Heat Map (THM) is a visualization technique for streaming data derived from multiple sensors. </li></ul><ul><li>Two dimensional structure similar to an infinite table. </li></ul><ul><li>Each row of the table is associated with one sensor value. </li></ul><ul><li>Each column of the table is associated with a point in time. </li></ul><ul><li>Each cell within the THM is a color representation of the sensor value </li></ul><ul><li>Colors normalized (in our examples) </li></ul><ul><ul><li>0 – White </li></ul></ul><ul><ul><li>0.5 – Blue </li></ul></ul><ul><ul><li>1.0 – Red </li></ul></ul>
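The color normalization on this slide can be sketched as a small function. This is a minimal illustration, not the authors' implementation: the name `thm_color`, the RGB triples, and the linear interpolation between the three stops (0 is white, 0.5 is blue, 1.0 is red) are all assumptions, since the slide gives only the three anchor colors.

```python
def thm_color(value):
    """Map a normalized sensor value in [0, 1] to an RGB triple.

    Assumption: colors are linearly interpolated between the three stops
    named on the slide (0 -> white, 0.5 -> blue, 1.0 -> red).
    """
    stops = [
        (0.0, (255, 255, 255)),  # white
        (0.5, (0, 0, 255)),      # blue
        (1.0, (255, 0, 0)),      # red
    ]
    v = min(1.0, max(0.0, value))  # clamp out-of-range readings
    for (low, c_low), (high, c_high) in zip(stops, stops[1:]):
        if v <= high:
            t = (v - low) / (high - low)
            return tuple(round(a + t * (b - a)) for a, b in zip(c_low, c_high))


# One THM column is the list of cell colors for one time point.
column = [thm_color(v) for v in (0.0, 0.5, 1.0)]
```

Each row of the heat map would then hold the colors of one sensor over time, and each column the colors of all sensors at one time point.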
    11. 12. THM of VoIP Data <ul><li>VoIP traffic data was provided by Cisco Systems and represents logged VoIP traffic in their Richardson, Texas facility from Mon Sep 22 12:17:32 2003 to Mon Nov 17 11:29:11 2003. </li></ul>
    12. 13. Spatiotemporal Stream Data <ul><li>Records may arrive at a rapid rate </li></ul><ul><li>High volume (possibly infinite) of continuous data </li></ul><ul><li>Concept drifts: Data distribution changes on the fly </li></ul><ul><li>Data does not necessarily fit any distribution pattern </li></ul><ul><li>Multidimensional </li></ul><ul><li>Temporal </li></ul><ul><li>Spatial </li></ul><ul><li>Data are collected in discrete time intervals. </li></ul><ul><li>Data are in structured format, <a1, a2, …> </li></ul><ul><li>Data hold an approximation of the Markov property. </li></ul>
    13. 14. Spatiotemporal Environment <ul><li>Events arriving in a stream </li></ul><ul><li>At any time, t, we can view the state of the problem as represented by a vector of n numeric values: </li></ul><ul><li>V t = <S 1t , S 2t , ..., S nt > </li></ul>
                V 1     V 2     …     V q
        S 1     S 11    S 12    …     S 1q
        S 2     S 21    S 22    …     S 2q
        …       …       …       …     …
        S n     S n1    S n2    …     S nq
        (columns are ordered by Time)
    14. 15. Data Stream Modeling <ul><li>Single pass: Each record is examined at most once </li></ul><ul><li>Bounded storage: Limited memory for storing synopsis </li></ul><ul><li>Real-time: Per record processing time must be low </li></ul><ul><li>Summarization (Synopsis) of data </li></ul><ul><li>Use data NOT SAMPLE </li></ul><ul><li>Temporal and Spatial </li></ul><ul><li>Dynamic </li></ul><ul><li>Continuous (infinite stream) </li></ul><ul><li>Learn </li></ul><ul><li>Forget </li></ul><ul><li>Sublinear growth rate - Clustering </li></ul>11/26/07 – IRADSN’07
    15. 16. MM <ul><li>A first order Markov Chain is a finite or countably infinite sequence of events {E1, E2, … } over discrete time points, where Pij = P(Ej | Ei), and at any time the future behavior of the process is based solely on the current state </li></ul><ul><li>A Markov Model (MM) is a graph with m vertices or states, S, and directed arcs, A, such that: </li></ul><ul><li>S = {N 1 , N 2 , …, N m }, and </li></ul><ul><li>A = {L ij | i ∈ 1, 2, …, m, j ∈ 1, 2, …, m} and each arc, </li></ul><ul><ul><li>L ij = <N i ,N j > is labeled with a transition probability </li></ul></ul><ul><ul><li>P ij = P(N j | N i ). </li></ul></ul>
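To make the definition of P ij concrete, the sketch below estimates first-order transition probabilities from an observed state sequence by counting consecutive pairs. The function name `transition_probabilities` and the toy sequence are illustrative assumptions, not part of the slides.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(N_j | N_i) from consecutive pairs in a state sequence."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        pair_counts[current][following] += 1

    probabilities = {}
    for current, counter in pair_counts.items():
        total = sum(counter.values())
        probabilities[current] = {
            following: count / total for following, count in counter.items()
        }
    return probabilities


# Hypothetical sequence of visited states.
states = ["N1", "N2", "N1", "N3", "N1", "N2"]
P = transition_probabilities(states)
```

In this toy sequence N1 is followed by N2 twice and by N3 once, so the estimate for the arc from N1 to N2 is 2/3 and for the arc from N1 to N3 is 1/3.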
    16. 17. Problem with Markov Chains <ul><li>The required structure of the MC may not be certain at the model construction time. </li></ul><ul><li>As the real world being modeled by the MC changes, so should the structure of the MC. </li></ul><ul><li>Not scalable – grows linearly as number of events. </li></ul><ul><li>Our solution: </li></ul><ul><ul><li>Extensible Markov Model (EMM) </li></ul></ul><ul><ul><li>Cluster real world events </li></ul></ul><ul><ul><li>Allow Markov chain to grow and shrink dynamically </li></ul></ul>
    17. 18. Outline <ul><li>Introduction </li></ul><ul><li>EMM Overview </li></ul><ul><li>EMM Applications to Anomaly Detection </li></ul><ul><li>Future Work </li></ul>
    18. 19. Extensible Markov Model (EMM) <ul><li>Time Varying Discrete First Order Markov Model </li></ul><ul><li>Nodes are clusters of real world states. </li></ul><ul><li>Learning continues during application phase. </li></ul><ul><li>Learning: </li></ul><ul><ul><li>Transition probabilities between nodes </li></ul></ul><ul><ul><li>Node labels (centroid/medoid of cluster) </li></ul></ul><ul><ul><li>Nodes are added and removed as data arrives </li></ul></ul>
    19. 20. Related Work <ul><li>Splitting Nodes in HMMs </li></ul><ul><ul><li>Create new states by splitting an existing state </li></ul></ul><ul><ul><li>M. J. Black and Y. Yacoob, “Recognizing facial expressions in image sequences using local parameterized models of image motion,” Int. Journal of Computer Vision, 25(1), 1997, 23-48. </li></ul></ul><ul><li>Dynamic Markov Modeling </li></ul><ul><ul><li>States and transitions are cloned </li></ul></ul><ul><ul><li>G. V. Cormack and R. N. S. Horspool, “Data compression using dynamic Markov Modeling,” The Computer Journal, Vol. 30, No. 6, 1987. </li></ul></ul><ul><li>Augmented Markov Model (AMM) </li></ul><ul><ul><li>Creates new states if the input data has never been seen in the model, and transition probabilities are adjusted </li></ul></ul><ul><ul><li>Dani Goldberg and Maja J. Mataric, “Coordinating mobile robot group behavior using a model of interaction dynamics,” Proceedings of the Third International Conference on Autonomous Agents (Agents ’99), Seattle, Washington. </li></ul></ul>
    20. 21. EMM vs AMM <ul><li>Our proposed EMM model is similar to AMM, but is more flexible: </li></ul><ul><li>EMM continues to learn during the application phase. </li></ul><ul><li>The EMM is a generic incremental model whose nodes can have any kind of representatives. </li></ul><ul><li>State matching is determined using a clustering technique. </li></ul><ul><li>EMM not only allows the creation of new nodes, but deletion (or merging) of existing nodes. This allows the EMM model to “forget” old information which may not be relevant in the future. It also allows the EMM to adapt to any main memory constraints for large scale datasets. </li></ul><ul><li>EMM performs one scan of data and therefore is suitable for online data processing. </li></ul>
    21. 22. EMM <ul><li>Extensible Markov Model (EMM): at any time t, EMM consists of an MM and algorithms to modify it, where algorithms include: </li></ul><ul><li>EMMSim, which defines a technique for matching between input data at time t + 1 and existing states in the MM at time t. </li></ul><ul><li>EMMIncrement algorithm, which updates MM at time t + 1 given the MM at time t and classification measure result at time t + 1. </li></ul><ul><li>Additional algorithms may be added to modify the model or to support applications. </li></ul>
    22. 23. EMMSim <ul><li>Find closest node to incoming event. </li></ul><ul><li>If none “close” create new node </li></ul><ul><li>Labeling of cluster is centroid/medoid of members in cluster </li></ul><ul><li>Problem </li></ul><ul><ul><li>Nearest Neighbor O(n) </li></ul></ul><ul><ul><li>BIRCH O(lg n) </li></ul></ul><ul><ul><ul><li>Requires second phase to recluster initial </li></ul></ul></ul>
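The matching step described on this slide can be sketched as an O(n) nearest-neighbor scan over the node labels. This is a hedged illustration only: the names `cosine_similarity` and `emm_sim`, the choice of cosine similarity, and the example labels are assumptions; the slides do not specify the matching code.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def emm_sim(event, node_labels, threshold):
    """Return the index of the most similar node label, or None if none is close.

    A None result signals that the caller should create a new node.
    """
    best_index, best_similarity = None, float("-inf")
    for index, label in enumerate(node_labels):
        similarity = cosine_similarity(event, label)
        if similarity > best_similarity:
            best_index, best_similarity = index, similarity
    if best_index is not None and best_similarity >= threshold:
        return best_index
    return None


# Hypothetical centroids for two existing nodes.
labels = [[18, 10, 3, 3, 1, 0, 0], [1, 0, 0, 0, 0, 5, 5]]
matched = emm_sim([17, 10, 2, 3, 1, 0, 0], labels, threshold=0.99)
unmatched = emm_sim([0, 0, 0, 0, 0, 0, 9], labels, threshold=0.99)
```

Here the first event is close enough to the first label to be matched, while the second event matches nothing at this threshold and would start a new node.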
    23. 24. EMMIncrement
        Input events: <18,10,3,3,1,0,0>, <17,10,2,3,1,0,0>, <16,9,2,3,1,0,0>, <14,8,2,3,1,0,0>, <14,8,2,3,0,0,0>, <18,10,3,3,1,1,0>
        [Figure: successive EMM graphs over nodes N1, N2, N3 as the events arrive, with arcs labeled by transition probabilities such as 1/3, 2/3, 1/2, 1/1.]
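A minimal sketch of the bookkeeping behind EMMIncrement, assuming the matching step has already mapped each event to a node identifier. The class name `EMMCounts`, its methods, and the choice to normalize each arc by the total outgoing count of its source node are assumptions made for illustration; the slide's figure does not spell out the update rule.

```python
from collections import defaultdict


class EMMCounts:
    """Keeps node and link counts and derives transition probabilities from them."""

    def __init__(self):
        self.node_count = defaultdict(int)  # how often each node was visited
        self.link_count = defaultdict(int)  # how often each (from, to) arc was taken
        self.current = None                 # node matched by the previous event

    def increment(self, node):
        """Record that the latest event was matched to `node` (new or existing)."""
        self.node_count[node] += 1
        if self.current is not None:
            self.link_count[(self.current, node)] += 1
        self.current = node

    def transition_probability(self, source, target):
        """Arc count divided by all counts leaving `source` (an assumed normalization)."""
        outgoing = sum(
            count for (src, _), count in self.link_count.items() if src == source
        )
        if outgoing == 0:
            return 0.0
        return self.link_count.get((source, target), 0) / outgoing


emm = EMMCounts()
for matched_node in ["N1", "N2", "N1", "N1", "N3"]:  # hypothetical match results
    emm.increment(matched_node)
```

Because only counts are stored, each update is constant time per event, which fits the single-pass requirement stated earlier in the deck.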
    24. 25. EMMDecrement [Figure: EMM over nodes N1, N2, N3, N5, N6 before and after deleting node N2, with arcs labeled by transition probabilities.]
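Deleting a node can be sketched as removing its count together with every arc that enters or leaves it. The function name `emm_decrement` and the dictionary layout are assumptions, and this sketch does not try to reproduce the renormalized probabilities shown in the slide's figure.

```python
def emm_decrement(node_count, link_count, node):
    """Return copies of the counts with `node` and all of its arcs removed."""
    remaining_nodes = {n: c for n, c in node_count.items() if n != node}
    remaining_links = {
        (source, target): count
        for (source, target), count in link_count.items()
        if node not in (source, target)
    }
    return remaining_nodes, remaining_links


# Hypothetical counts before deletion.
nodes = {"N1": 3, "N2": 2, "N3": 1}
links = {("N1", "N2"): 2, ("N2", "N1"): 1, ("N1", "N3"): 1, ("N2", "N3"): 1}
nodes_after, links_after = emm_decrement(nodes, links, "N2")
```

Returning new dictionaries rather than mutating the inputs keeps the model at time t available while the model at time t + 1 is built.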
    25. 26. EMM Advantages <ul><li>Dynamic </li></ul><ul><li>Adaptable </li></ul><ul><li>Use of clustering </li></ul><ul><li>Learns rare event </li></ul><ul><li>Scalable: </li></ul><ul><ul><li>Growth of EMM is not linear on size of data. </li></ul></ul><ul><ul><li>Hierarchical feature of EMM </li></ul></ul><ul><li>Creation/evaluation quasi-real time </li></ul><ul><li>Distributed / Hierarchical extensions </li></ul>
    26. 27. Growth of EMM: Serwent Data
    27. 28. EMM Performance – Growth Rate
        Data      Sim        Threshold
                             0.99    0.992   0.994   0.996   0.998
        Serwent   Jaccard    156     190     268     389     667
                  Dice       72      92      123     191     389
                  Cosine     11      14      19      31      61
                  Overlap    2       2       3       3       4
        Ouse      Jaccard    56      66      81      105     162
                  Dice       40      43      52      66      105
                  Cosine     6       8       10      13      24
                  Overlap    1       1       1       1       1
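The growth-rate slide names four similarity measures (Jaccard, Dice, Cosine, Overlap) without defining them. The sketch below uses the common dot-product forms for real-valued vectors; that these are the exact forms used in the experiments is an assumption, and the function names are illustrative.

```python
import math


def dot(x, y):
    """Inner product of two equal-length numeric vectors."""
    return sum(a * b for a, b in zip(x, y))


# Each measure raises ZeroDivisionError if an input vector is all zeros.
def jaccard(x, y):
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)


def dice(x, y):
    return 2 * dot(x, y) / (dot(x, x) + dot(y, y))


def cosine(x, y):
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


def overlap(x, y):
    return dot(x, y) / min(dot(x, x), dot(y, y))


# Two input vectors taken from the EMMIncrement slide.
a = [18, 10, 3, 3, 1, 0, 0]
b = [17, 10, 2, 3, 1, 0, 0]
scores = {f.__name__: f(a, b) for f in (jaccard, dice, cosine, overlap)}
```

For this pair the scores increase from Jaccard to Dice to Cosine to Overlap. At a fixed threshold a lower-scoring measure rejects more matches and so creates more nodes, which is consistent with the node counts in the table.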
    28. 29. EMM Performance – Growth Rate Minnesota Traffic Data
    29. 30. Outline <ul><li>Introduction </li></ul><ul><li>EMM Overview </li></ul><ul><li>EMM Applications to Anomaly Detection </li></ul><ul><li>Future Work </li></ul>
    30. 31. Datasets/Anomalies <ul><li>MnDot – Minnesota Department of Transportation </li></ul><ul><ul><li>Automobile Accident </li></ul></ul><ul><li>Ouse and Serwent – River flow data from England </li></ul><ul><ul><li>Flood </li></ul></ul><ul><ul><li>Drought </li></ul></ul><ul><li>KDD Cup’99 </li></ul><ul><li>http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html </li></ul><ul><ul><li>Intrusion </li></ul></ul><ul><li>Cisco VoIP – VoIP traffic data obtained at Cisco </li></ul><ul><ul><li>Unusual Phone Call </li></ul></ul>
    31. 32. Rare Event Detection [Figure: Minnesota DOT Traffic Data, with panels for Weekdays and Weekend.] Detected unusual weekend traffic pattern
    32. 33. Our Approach to Detect Anomalies <ul><li>By learning what is normal, the model can predict what is not </li></ul><ul><li>Normal is based on likelihood of occurrence </li></ul><ul><li>Use EMM to build model of behavior </li></ul><ul><li>We view a rare event as: </li></ul><ul><ul><li>Unusual event </li></ul></ul><ul><ul><li>Transition between event states which does not frequently occur. </li></ul></ul><ul><li>Base rare event detection on determining events or transitions between events that do not frequently occur. </li></ul><ul><li>Continue learning </li></ul>
    33. 34. EMMRare <ul><li>EMMRare algorithm indicates if the current input event is rare. Using a threshold occurrence percentage, the input event is determined to be rare if either of the following occurs: </li></ul><ul><ul><li>The frequency of the node at time t+1 is below this threshold </li></ul></ul><ul><ul><li>The updated transition probability of the MC transition from node at time t to the node at t+1 is below the threshold </li></ul></ul>
    34. 35. EMM Labels for Anomaly Detection <ul><li>Label of Nodes (CF): </li></ul><ul><ul><li>Cluster feature: <CN i , LS i > </li></ul></ul><ul><ul><ul><li>CN i : cardinality </li></ul></ul></ul><ul><ul><ul><li>LS i : first moment (medoid or centroid based) </li></ul></ul></ul><ul><li>Label of Links: </li></ul><ul><ul><li><CL ij > </li></ul></ul>
    35. 36. Determining Rare <ul><li>Occurrence Frequency (OF c ) of a node N c : </li></ul><ul><li>OF c = </li></ul><ul><li>Normalized Transition Probability (NTP mn ), from one state, N m , to another, N n : </li></ul><ul><ul><ul><li>NTP mn = </li></ul></ul></ul>
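The right-hand sides of both definitions were images on the original slide and did not survive extraction. As a hedged reading only, assuming each quantity is a count normalized by the total number of node occurrences (with CN i the cardinality of node N i and CL mn the count carried by the link from N m to N n, following the labels on the previous slide), the definitions would take the form below. Treat this as an assumption rather than the authors' stated formulas.

```latex
\[
OF_c = \frac{CN_c}{\sum_{i} CN_i},
\qquad
NTP_{mn} = \frac{CL_{mn}}{\sum_{i} CN_i}
\]
```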
    36. 37. EMMRare <ul><li>Given: </li></ul><ul><ul><li>Rule#1: CN i <= th CN </li></ul></ul><ul><ul><li>Rule#2: CL ij <= th CL </li></ul></ul><ul><ul><li>Rule#3: OF c <= th OF </li></ul></ul><ul><ul><li>Rule#4: NTP mn <= th NTP </li></ul></ul><ul><li>Input: G t : EMM at time t </li></ul><ul><li> i: Current state at time t </li></ul><ul><li> R = {R 1 , R 2 , …, R N }: A set of rules </li></ul><ul><li>Output: A t : Boolean alarm at time t </li></ul><ul><li>Algorithm: </li></ul><ul><li>A t = </li></ul><ul><ul><li>1 if R i = True </li></ul></ul><ul><ul><li>0 if R i = False </li></ul></ul>
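The alarm logic can be sketched with the two count-based rules (Rule#1 and Rule#2), which need only the node and link counts. The function name `emm_rare`, the dictionary layout, and the example thresholds are assumptions; raising the alarm when any rule fires follows the earlier EMMRare slide, which treats an event as rare if either condition holds.

```python
def emm_rare(node_count, link_count, previous_node, node, th_cn, th_cl):
    """Return True (alarm) if the current node or the arc into it is rare."""
    rules = [
        node_count.get(node, 0) <= th_cn,                   # Rule#1: CN_i <= th_CN
        link_count.get((previous_node, node), 0) <= th_cl,  # Rule#2: CL_ij <= th_CL
    ]
    return any(rules)


# Hypothetical counts: N1 is common, N2 has been seen once.
node_count = {"N1": 50, "N2": 1}
link_count = {("N1", "N1"): 48, ("N1", "N2"): 1}

alarm_on_rare = emm_rare(node_count, link_count, "N1", "N2", th_cn=2, th_cl=2)
alarm_on_common = emm_rare(node_count, link_count, "N1", "N1", th_cn=2, th_cl=2)
```

Rule#3 and Rule#4 could be appended to the `rules` list in the same way once the normalized quantities are defined.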
    37. 38. Rare Event in Cisco Data
    38. 39. <ul><li>Problem: Reduce the false alarm rate while maintaining a high detection rate. </li></ul><ul><li>Methodology: </li></ul><ul><ul><li>Historical feedback can be used as a free resource to filter out some anomalies that are probably safe </li></ul></ul><ul><ul><li>Combine the anomaly detection model with user feedback. </li></ul></ul><ul><ul><li>Risk level index </li></ul></ul><ul><li>Evaluation metrics: </li></ul><ul><ul><li>Detection rate </li></ul></ul><ul><ul><li>False alarm rate </li></ul></ul><ul><ul><li>Operational Curve </li></ul></ul>Risk assessment. Detection rate = TP/(TP+FN). False alarm rate = FP/(TP+FP).
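The two evaluation metrics can be computed from confusion counts as sketched below. This assumes detection rate is the usual fraction of true anomalies that raise an alarm, TP/(TP+FN), and takes false alarm rate as the fraction of raised alarms that are false, FP/(TP+FP), which is the form given on the slide. The function names and the example counts are illustrative.

```python
def detection_rate(true_positives, false_negatives):
    """Fraction of actual anomalies that were flagged (0.0 if there were none)."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def false_alarm_rate(true_positives, false_positives):
    """Fraction of raised alarms that were false (0.0 if no alarm was raised)."""
    total = true_positives + false_positives
    return false_positives / total if total else 0.0


# Hypothetical run: 8 anomalies caught, 2 missed, 2 false alarms.
dr = detection_rate(true_positives=8, false_negatives=2)
far = false_alarm_rate(true_positives=8, false_positives=2)
```

Sweeping the rarity thresholds and plotting the resulting pairs of rates would trace the operational curve mentioned on the slide.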
    39. 40. Reducing False Alarms <ul><li>Calculate Risk using historical feedback </li></ul><ul><li>Historical Feedback: </li></ul><ul><li>Count of true alarms: </li></ul>
    40. 41. Detection Rate Experiments
    41. 42. False Alarm Rate
    42. 43. Outline <ul><li>Introduction </li></ul><ul><li>EMM Overview </li></ul><ul><li>EMM Applications to Anomaly Detection </li></ul><ul><li>Future Work </li></ul>
    43. 44. Ongoing/Future Work <ul><li>Extend to Emerging Patterns </li></ul><ul><li>Extend to Hierarchical/Distributed </li></ul><ul><ul><li>Yu Su </li></ul></ul><ul><li>Test with more data – KDD Cup </li></ul><ul><li>Compare to other approaches </li></ul><ul><ul><li>Charlie Isaksson </li></ul></ul><ul><li>Apply to nuclear testing </li></ul>
    44. 45. Thanks!
