P47 Eait06


A Bayesian network based approach to network security.

Published in: Technology

  1. User Profiling for Host-Based Anomaly Intrusion Detection in Windows NT. Debapriyay Mukhopadhyay, Satyajit Banerjee
  2. Definition of IDS: An intrusion is a set of unauthorized activities that violate the security policy of a system; intrusion detection is the act of tracing those unauthorized users or activities on the system.
     • Two kinds of IDS:
       1) Misuse detection: previous attacks are captured as attack signatures, and the detector looks for any of these known signatures in the data under test.
       2) Anomaly detection: data that deviates strongly from the normal behavioral profile is considered intrusive, so the mechanism involves learning the normal behavioral profile of a user or system.
  3. Motivation:
     1) Prior work on IDS has mainly targeted UNIX machines, but the majority of the world’s computers run Windows.
     2) A major fraction of intrusive activities is actually launched from inside host machines.
     Problem Definition:
     1) In this paper, we address the problem of host-based anomaly intrusion detection on machines running Windows.
     2) The problem can be seen as learning the “normal behavior” of a user and then scoring new activities against this model to identify malicious insiders.
  4. Issues
     • How to model the “normal behavior” of a user is a highly non-trivial problem.
     • How to ensure significant coverage of the space of a user’s “normal behavior”, since poor coverage increases false alarms.
     • How to use the model of a user’s “normal behavior” to detect anomaly intrusions from an inside host.
  5. What have we achieved?
     • We have identified and categorized data that truly reflect a user’s normal behavior.
     • We have taken a user-profiling approach to learn and model the “normal behavior” of a user.
     • A Bayesian network is used both to profile a user and to detect host-based anomaly intrusions.
  6. Source Data and Feature Selection
     • System Processes: the set of processes or services that start running when the system starts up. These system processes provide a top-level profile of a user.
     • Application Processes: launched by the user shell explorer.exe. One application (user) process is launched by another application (user) process; by exploiting this dependency, a DAG can be learnt.
     • Window Title Bars: capture a large amount of information about a user’s behavior. The visible window titles of each process can be text mined to gain valuable information; e.g., those of iexplore.exe can be related to one’s browsing profile.
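
The process-launch dependency described on this slide can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes launch records are available as hypothetical `(parent_exe, child_exe)` pairs, skips self-launches so the graph stays acyclic, and uses the standard-library `graphlib` module (Python 3.9+) to confirm that the result is a DAG.

```python
from graphlib import TopologicalSorter


def build_process_dag(launches):
    """Build a node -> set-of-parents map from (parent_exe, child_exe) pairs.

    Assumption: the launch pairs are a hypothetical input format; the slides
    do not say how launch events are recorded.
    """
    parents = {}
    for parent, child in launches:
        if parent == child:
            continue  # assumption: ignore a process launching its own image
        parents.setdefault(parent, set())
        parents.setdefault(child, set()).add(parent)
    # static_order() raises graphlib.CycleError if the graph is not acyclic,
    # so obtaining an order doubles as a DAG check.
    order = list(TopologicalSorter(parents).static_order())
    return parents, order


# Hypothetical launch log for one session.
launches = [
    ("explorer.exe", "cmd.exe"),
    ("explorer.exe", "winword.exe"),
    ("cmd.exe", "notepad.exe"),
    ("explorer.exe", "notepad.exe"),
]
parents, order = build_process_dag(launches)
```

Here `parents["notepad.exe"]` holds both `cmd.exe` and `explorer.exe`, which shows how a node can be reachable along more than one path.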
  7. Source Data and Feature Selection
     • Application Usage Profile: captures how a user navigates through the different features of an application. For each application, we need to track both the user’s keystrokes and mouse click events. A closely related concept is program profiling.
     • For each user and each session, the following features can also be collected:
       i) maximum number of instances of each application in each user session;
       ii) average time spent on each instance of the application (normalized by session length);
       iii) percentage of the session length spent on the application;
       iv) average waiting time for an instance of the application to become active (normalized by session length).
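
The four per-session features listed on this slide could be computed roughly as follows. This is a sketch under stated assumptions, since the slides do not define the features precisely: the `InstanceRecord` type and its fields are hypothetical, "maximum number of instances" is read as peak concurrent instances, "time spent" is read as foreground time, and "waiting time" is read as the delay between launch and first activation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class InstanceRecord:
    """Hypothetical record of one application instance within a session."""
    app: str
    start: float           # launch time, seconds from session start
    end: float             # exit time, seconds from session start
    first_active: float    # first time the instance was in the foreground
    active_seconds: float  # total foreground time of this instance


def peak_concurrency(records):
    """Largest number of instances alive at the same time."""
    events = []
    for r in records:
        events.append((r.start, 1))
        events.append((r.end, -1))
    events.sort()  # at equal times, exits (-1) sort before launches (+1)
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def session_features(records, session_length):
    """Compute features i) to iv) for each application in one session."""
    by_app = defaultdict(list)
    for r in records:
        by_app[r.app].append(r)
    features = {}
    for app, recs in by_app.items():
        n = len(recs)
        total_active = sum(r.active_seconds for r in recs)
        total_wait = sum(r.first_active - r.start for r in recs)
        features[app] = {
            "max_instances": peak_concurrency(recs),                    # i)
            "avg_time_per_instance": total_active / n / session_length,  # ii)
            "pct_of_session": 100.0 * total_active / session_length,     # iii)
            "avg_wait_to_active": total_wait / n / session_length,       # iv)
        }
    return features


# Hypothetical session of 1000 seconds.
records = [
    InstanceRecord("notepad.exe", 0, 600, 10, 200),
    InstanceRecord("notepad.exe", 300, 900, 330, 100),
    InstanceRecord("winword.exe", 100, 1000, 120, 500),
]
features = session_features(records, session_length=1000.0)
```

Measuring foreground time rather than instance lifetime keeps feature iii) at or below 100%, because only one window is in the foreground at a time.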
  8. User Profiling
     • Bayesian Network: used to capture the mutual influence of different domain variables on target attributes. It is an effective tool for reasoning under uncertainty.
     • Category 1 and Category 2 data both exhibit a kind of causal relationship, in the sense that one process has generated another.
     • Each process is considered a domain variable, and “normal behavior” is the target attribute.
     • Intrusion detection is done by evaluating Prob(Normal | Evidence), where the evidence is the set of domain variables that are true at the time of evaluation.
  9. Learning the Bayesian Network
     • Each process executable corresponds to a node in the DAG and to a random variable of the underlying probability model.
     • Exploit the parent-child relationship between processes to construct the DAG.
     • For each random variable N and each distinct state S of the values of its parents, count the frequency with which N occurs together with S.
     • Calculate Prob(N | S), the entries of the Conditional Probability Table. For root nodes, these conditional probabilities are simply the a priori probabilities.
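
The frequency-counting step on this slide can be illustrated with a short sketch. It assumes each observation is a hypothetical dictionary mapping every process name to whether it was running, and it estimates Prob(N = true | S) as a plain relative frequency with no smoothing; the slides do not say whether the authors smooth the counts.

```python
from collections import Counter


def learn_cpts(parents, observations):
    """Estimate Prob(node is true | parent state) by relative frequency.

    parents: node -> iterable of parent names (the DAG).
    observations: list of dicts mapping every node to True/False
    (a hypothetical input format).
    Returns node -> {parent_state_tuple: probability}, where the tuple
    follows the sorted order of the node's parents.
    """
    cpts = {}
    for node, pars in parents.items():
        ordered = tuple(sorted(pars))
        state_counts = Counter()  # how often parent state S was seen
        true_counts = Counter()   # how often N was true together with S
        for obs in observations:
            state = tuple(obs[p] for p in ordered)
            state_counts[state] += 1
            if obs[node]:
                true_counts[state] += 1
        cpts[node] = {s: true_counts[s] / c for s, c in state_counts.items()}
    return cpts


# Hypothetical two-node DAG and four observed snapshots.
parents = {"explorer.exe": set(), "cmd.exe": {"explorer.exe"}}
observations = [
    {"explorer.exe": True, "cmd.exe": True},
    {"explorer.exe": True, "cmd.exe": False},
    {"explorer.exe": True, "cmd.exe": True},
    {"explorer.exe": False, "cmd.exe": False},
]
cpts = learn_cpts(parents, observations)
```

For a root node the only parent state is the empty tuple `()`, so its single table entry is the a priori probability. Parent states that never occur in the data get no entry, which a practical implementation would need to handle.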
  10. An Example Bayesian Net (Applications)
  11. An Example Bayesian Net (System Services)
  12. Inferencing
      • The polytree algorithm is not applicable, because there can be more than one path between two nodes.
      • We apply the Junction Tree algorithm for inference and calculate the following:
        P1 = Prob(Normal | Evidence of Category 1)
        P2 = Prob(Normal | Evidence of Category 2)
      • If P1 < T1 and P2 < T2, then the data may be a case of intrusion.
      • T1 and T2 are predetermined thresholds for Category 1 and Category 2 data, respectively.
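
To make the decision rule concrete, the sketch below computes Prob(Normal | Evidence) on a toy network and applies the two thresholds. It is not the Junction Tree algorithm named on the slide: for brevity it uses brute-force enumeration of the joint distribution, which gives the same posterior but is only feasible for very small networks. The network structure, probabilities, and threshold values are all hypothetical.

```python
from itertools import product


def joint_probability(assignment, parents, cpts):
    """Probability of one full True/False assignment of all nodes."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpts[node][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


def posterior(target, evidence, parents, cpts):
    """Prob(target is true | evidence), by enumerating unobserved nodes."""
    hidden = [n for n in parents if n != target and n not in evidence]
    weight = {True: 0.0, False: 0.0}
    for target_value in (True, False):
        for values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[target] = target_value
            assignment.update(zip(hidden, values))
            weight[target_value] += joint_probability(assignment, parents, cpts)
    # Assumes the evidence has non-zero probability under the model.
    return weight[True] / (weight[True] + weight[False])


def is_possible_intrusion(p1, p2, t1, t2):
    """Decision rule from the slide: flag only if both scores are low."""
    return p1 < t1 and p2 < t2


# Hypothetical toy net: two paths lead from Normal to ftp.exe,
# so it is not a polytree.
parents = {
    "Normal": (),
    "cmd.exe": ("Normal",),
    "ftp.exe": ("Normal", "cmd.exe"),
}
cpts = {
    "Normal": {(): 0.9},
    "cmd.exe": {(True,): 0.3, (False,): 0.8},
    "ftp.exe": {
        (True, True): 0.1, (True, False): 0.05,
        (False, True): 0.7, (False, False): 0.2,
    },
}
p_full = posterior("Normal", {"cmd.exe": True, "ftp.exe": True}, parents, cpts)
p_partial = posterior("Normal", {"ftp.exe": True}, parents, cpts)
# Hypothetical second score and thresholds, for illustration only.
flagged = is_possible_intrusion(p1=p_full, p2=0.2, t1=0.5, t2=0.5)
```

Because the rule requires both P1 < T1 and P2 < T2, a low score in only one category does not raise an alarm.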
  13. Conclusions
      • This is a work in progress.
      • We have identified five categories of data, but have so far provided means of using only the first two.
      • The different types of data can be used hierarchically or in parallel to help detect an anomaly intrusion.
      • We plan to use a Probabilistic Temporal Network to unify the temporal information of category 5 with the atemporal information of category 1 or 2.
  14. Thank You.