BUILDING A USAGE PROFILE FOR ANOMALY DETECTION

Usage Modelling and Threat Detection

Published in: Technology

  1. 1. BUILDING A USAGE PROFILE OF A SYSTEM FOR DETECTING THREATS ON THE SYSTEM October, 2016 Nathanael Asaam nat_ato@yahoo.com Abstract This paper investigates building usage profiles of a system using behavior models. Such behavior models are at the heart of machine learning and evolutionary computing. Other methods of building usage profiles include statistical models such as time series models, univariate models, and mean and standard deviation models. The aim of building these usage profiles is to detect unusual behavior on the system. This paper uses regression to determine the usage profile of a system by studying the relationship between the relevant system variables from which the usage profile is formulated. The dependent and independent variables for the usage profile can be determined from an audit trail. Additionally, the paper applies hidden Markov models to study the states a computer system can fall into and the transitions between those states, in order to predict unusual behavior in the system. Unusual behavior in this case may be a particular state, a transition from one state to another, or the manner in which a particular state transition occurred. The usage profile is composed of the usage profile equation, a mean and standard deviation model that captures average usage and its standard deviation, and a Markov chain model that captures the states of the system and its state transitions. With this usage profile it becomes possible to detect anomalies on the system. Using linear and nonlinear programming, the usage profile equation can be maximized to determine the states and points at which the system is optimal. This can help improve the system’s usage. Also, using the differential coefficient of the usage profile equation and other statistical models such as the mean and standard deviation model, a threat profile of the system can be developed. 
Minimizing the threat profile equation using linear and nonlinear programming helps prevent threats on the system. The benefit of this research is its application to the development of anomaly threat detection systems and risk analysis systems that can be used for performing computer security risk assessments and analysis.
  2. 2. INTRODUCTION If a usage profile of a system can be built, it becomes possible to detect unusual behavior on the system. The method for building such a usage profile involves determining the factors that are critical to the system. These factors can be seen as critical system variables that affect the system’s usage. The other consideration is how to obtain an abstract representation of the usage profile. This abstract representation can be achieved by applying behavior models such as statistical models, machine learning models and cognitive based models. The first goal of this research paper is to investigate techniques for building a usage profile of a computer system. The aim of building the usage profile is to have a working model that describes the system’s behavior. The second goal of the research is to detect unusual behavior on the system. Unusual behavior will be detected as deviation from the usage profile model built. The last goal is to build an anomaly threat detection system and a risk analysis system that can be used for detecting threats and performing risk analysis on a system. RESEARCH MODEL AND METHODOLOGY This paper investigates threat detection using machine learning, regression, linear and nonlinear programming, and calculus. The research model is inspired by regression, machine learning and statistical models, and the methodology for threat detection is inspired by linear and nonlinear programming, calculus and multithreading. These fields of study are mainly related to computer science, discrete mathematics and operations research. BEHAVIOR MODELS The behavior models that can be used for building a usage profile include machine learning models, statistical models, cognitive based models, user intention based models and computer immunology models. 
Examples of statistical models that can be used are time series models, univariate models, and the moments (mean and standard deviation) model. Machine learning
  3. 3. models that can be used include neural networks, Bayesian networks, hidden Markov models and genetic algorithms. SYSTEM THREATS There are three types of system logs that our intended threat analysis and detection hopes to produce. These are system errors, system threats, and usage rates, all categorized based on the magnitude and characteristics of an instance of the threat model. These logs must be audited by a security expert to analyze changes in the computer system that fit or deviate from the current usage profile, in order to project a more appropriate instance of the usage profile that will remain functional and suitable in the future. PROBLEM DEFINITION If the normal usage or behavior of a computer system can be represented by an abstract model, then this abstract model can be used to detect threats on the system. The threats can be detected as deviations from the abstract model, which represents the behavior of the system. The main problems this paper seeks to investigate are listed below. • Representing the normal usage or behavior of a system with an abstract model. • Determining activities and occurrences on the system that are deviations from the system’s normal behavior or usage. • Representing these activities or deviations with an abstract model. • Preventing such activities or occurrences on the system. In this paper the system’s normal behavior is known as the usage profile, and the deviations from the system’s normal behavior are known as the threat profile of the system. RESEARCH QUESTIONS The main questions to be investigated are listed below. • What are the best and most efficient techniques for modelling a system’s normal behavior or usage? • What are the best and most efficient techniques for the design and implementation of a threat detection system?
  4. 4. • How can we build a risk analysis system for performing risk assessment of a computer system? OBJECTIVES The main objectives of this research are as follows. • Representing a system’s normal functioning with an abstract model. • Design and implementation of a threat detection system. • Design and implementation of a mobile security audit framework.
  5. 5. LITERATURE REVIEW This section reviews the major topics that constitute this research and the work done in these areas. The topics reviewed are intrusion detection systems, behavior encryption, risk analysis, information security awareness and practices, ways of mitigating risks on social networking sites, and the application of Markov chain models for anomaly detection. INTRUSION DETECTION SYSTEMS There are two types of intrusion detection systems: signature based (or knowledge based) threat detection systems and anomaly based threat detection systems. Signature based threat detection systems detect threats by directly mapping incidents to a database of known threats. The database of known threats is called the threat signatures. Anomaly based threat detection systems detect threats based on deviation from an observed behavior pattern of the system for which the threat detection system has been built. They are also known as behavior threat detection systems. The threat signatures of a knowledge based threat detection system must be constantly updated with newly identified threats. Since new threats may not be in the threat signatures, the correctness of detection with such systems is sometimes compromised. Anomaly based threat detection systems, by contrast, are better at detecting such new threats, but they sometimes give false alarms since system incidents are not mapped to a database of known threats. Anomaly based threat detection systems are built using artificial intelligence technologies. Besides this, intrusion detection systems are classified by the purpose for which they are built and by whether they deal with threats actively or passively. By purpose, there are host based and network based threat detection systems. 
Also, active threat detection systems are configured to block or prevent attacks, while passive intrusion detection systems are configured to monitor, detect and raise alerts on threats. BEHAVIOR ENCRYPTION Behavior algorithms are applied to safeguard information on computing devices such as mobile phones and laptops. These algorithms are the basis for building systems that study and encrypt
  6. 6. user behavior on computing devices in order to ensure the security of the computing device. A study into mobile platform security reports that behavior encryption systems have been designed and built focusing on mobile platforms. Results from this study indicate that behavior encryption application systems are effective at ensuring mobile platform security. Cryptographic study into how to encrypt the usage profile of a system can fall under behavior encryption. This helps secure the information that embodies the usage profile. It is necessary because if the usage profile becomes known, it is possible to launch an attack on the system. The usage profile may consist of user behavior and network behavior. As such, if this information is compromised, a user can be impersonated or the network can be compromised. RISK ANALYSIS Computer risk analysis is also called risk assessment. It involves the process of analyzing and interpreting risk. To analyze risk, the scope and methodology have to be determined first. Information is then collected and analyzed before the risk analysis results are interpreted. Determining the scope means identifying the system to be analyzed for risk and the parts of the system that will be considered. The analytical method to be used, with its level of detail and formality, must also be planned. The boundary, scope and methodology used during risk assessment determine the total amount of work effort needed in risk management, and the type and usefulness of the assessment’s results. Risk has many components, including assets, threats, likelihood of threat occurrence, vulnerability, safeguards and consequences. Risk management includes risk acceptance, which takes place after risk analysis. 
Normally, after risk has been analyzed and safeguards implemented, the remaining or residual risk that allows the system to stay functional must be accepted by management. This may be due to constraints on the system such as ease of use, or features of the system for which strict safeguards would cause the organization operational problems. As such, risk acceptance, like the selection of safeguards, should take into account various factors besides those addressed in the risk assessment. In addition, risk acceptance should take into account the limitations of the risk assessment.
  7. 7. INFORMATION SECURITY AWARENESS AND PRACTICES A paper entitled “A study of information security awareness and practices in Saudi Arabia” discusses information security awareness and practices in that country. The paper emphasizes that information is under constant threat from cyber vandals. According to the paper, Saudi Arabia is rated poorly in terms of information security, which it attributes to the country’s highly suppressed, patriarchal and tribal culture. The paper examined the level of information security awareness among the general public using an anonymous online survey based on instruments produced by the Malaysian Security Organization. In all, 633 persons responded to the survey, and the analysis confirmed that information security awareness is low in the country, which the paper again relates mostly to these cultural factors. PROTOCOL FOR MITIGATING RISKS ON SOCIAL NETWORKING SITES According to an academic paper entitled “Protocol for mitigating the risk of hijacking social networking sites”, hackers can hijack a user’s session on social networking sites, impersonate the victim and take over the session. The paper deals with this risk by presenting a security authentication protocol. The protocol takes into account that users of social networking sites connect to the sites using several platforms and connection speeds. To cater for mobile devices and tablets using Wi-Fi connections, a novel Self-Configuring Repeatable Hash Chains (SCRHC) protocol was developed to prevent the hijacking of session cookies. This protocol supports three levels of caching, making it possible to trade storage space for enhanced performance and reduced workload. 
APPLICATION OF MARKOV CHAIN MODELS FOR ANOMALY DETECTION According to a research paper on the application of Markov chain models for anomaly detection, the temporal behavior of a system can be modelled using Markov chain models. The technique was used to represent the temporal profile of the normal behavior of a computer and network system. The Markov chain model of the normal profile was learnt from historic data. The observed behavior was analyzed to infer the probability that the Markov chain model of the norm
  8. 8. profile supports the observed behavior. A low probability of support indicates anomalous behavior that may result from intrusive activities. The technique was implemented and tested on audit data of a Sun Solaris system. The testing results show that the technique clearly distinguished intrusive activities from normal activities. According to the paper, two primary sources of data have been used to capture activities in a computer or network system for intrusion detection: network traffic data and audit data.
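The idea described above, learning a Markov chain from historic data and scoring observed behavior by how well the chain supports it, can be illustrated with a minimal Python sketch. This is not the cited paper's implementation: the event names, the function names learn_markov_chain and log_support, and the floor probability for unseen transitions are all assumptions made for illustration.

```python
from collections import defaultdict
import math


def learn_markov_chain(sequences):
    """Estimate initial-state and transition probabilities from historic sequences."""
    init_counts = defaultdict(int)
    trans_counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        if not seq:
            continue
        init_counts[seq[0]] += 1
        for prev, cur in zip(seq, seq[1:]):
            trans_counts[prev][cur] += 1
    total_init = sum(init_counts.values())
    initial = {state: count / total_init for state, count in init_counts.items()}
    transitions = {
        prev: {cur: count / sum(nxt.values()) for cur, count in nxt.items()}
        for prev, nxt in trans_counts.items()
    }
    return initial, transitions


def log_support(seq, initial, transitions, floor=1e-6):
    """Log-probability that the learnt chain supports an observed sequence.

    Unseen states and transitions receive a small floor probability
    (an assumption of this sketch, so that the logarithm stays defined).
    """
    logp = math.log(initial.get(seq[0], floor))
    for prev, cur in zip(seq, seq[1:]):
        logp += math.log(transitions.get(prev, {}).get(cur, floor))
    return logp


# Hypothetical audit events standing in for historic "normal" behavior.
normal_history = [
    ["login", "read", "read", "logout"],
    ["login", "read", "write", "logout"],
    ["login", "write", "read", "logout"],
]
initial, transitions = learn_markov_chain(normal_history)

normal_score = log_support(["login", "read", "logout"], initial, transitions)
odd_score = log_support(["login", "logout", "write", "write"], initial, transitions)
```

A sequence whose score falls far below the scores of known-normal sequences would be treated as anomalous; where to place that cut-off is a tuning decision the sketch leaves open.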
  9. 9. RESEARCH MODEL AND METHODOLOGY RESEARCH MODEL Assume that the normal usage (Y) of a system such as a smart phone, laptop or wireless network can be represented by a mathematical function Y = f(Xi, Ci), where Xi represents system variables such as the number of functions or the number of authentications, and Ci represents system constants such as the maximum or minimum number of authentications. When a change in Y exceeds the standard deviation determined from the usage data set, that change indicates a threat. To investigate this threat, machine learning algorithms, mathematical functions and behavior based intrusion detection systems will be studied to determine Y in terms of a number of variables that represent Y appropriately. The expected usage model of the network to be investigated includes the following components: Host Usage Model, Server Usage Model, Device Usage Model, Port Usage Model, Network Usage Model, Session Usage Model, Authentication Usage Model, Memory Usage Model, CPU Usage Model, Battery Usage Model and Program Usage Model. These components are expected to be derived from the variables listed below. • Average number of application software programs that run on the mobile system while it is in use. • Average number of system processes that run on the mobile system while it is in use. • Average number of authentications in the mobile system. • Average number of user actions that happen on the mobile system. • Average time a user spends before his session expires. • Average time the mobile facility or resource functions each day. • Number of paired ports communicating on the network. • Average amount of memory space used on devices while the network is being operated. • Average CPU time spent on a single device on the network. • Average life span of a single device battery on the network.
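The rule stated in the research model, that a change in Y beyond the standard deviation of the usage data indicates a threat, can be sketched in Python as follows. The sample values, the function names and the multiplier k are assumptions for illustration; the text itself only speaks of one standard deviation, which corresponds to k = 1.

```python
import statistics


def build_moment_model(usage_samples):
    """Mean and sample standard deviation of historic usage values."""
    return statistics.mean(usage_samples), statistics.stdev(usage_samples)


def is_threat(value, mean, std, k=1.0):
    """Flag a value whose distance from the mean exceeds k standard deviations.

    k is a hypothetical tuning parameter; k = 1 matches the rule in the text.
    """
    return abs(value - mean) > k * std


# Hypothetical daily authentication counts sampled from an audit trail.
daily_authentications = [20, 22, 19, 21, 23, 20, 22]
mean, std = build_moment_model(daily_authentications)

# One ordinary day and one day with an unusual burst of authentications.
flags = [is_threat(v, mean, std) for v in (21, 40)]
```

With k = 1 a sizeable share of ordinary observations would also be flagged, so a deployed detector would likely need a wider band; the sketch keeps the rule exactly as stated.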
  10. 10. METHODOLOGY The list below details the activities that will be followed to represent a computer system with an abstract mathematical model and to analyze changes in the system. It is hoped that following these processes will lead to the design and implementation of a normal usage model, a threat detection system and a mobile security audit framework. • Machine Learning Algorithms & Behavior Based Intrusion Systems: Investigate machine learning algorithms, mathematical functions, and behavior based intrusion detection systems in order to determine the extent to which the normal usage of a mobile system can be represented by the research model. • Audit trails: Analyze audit trails in order to formulate a set of independent and dependent variables and their associated data set that will help in modelling the usage of a mobile system. • Normal Usage Model: Apply the knowledge gained from the study of machine learning algorithms and behavior based intrusion detection systems, and from the audit trail analysis, to model and represent the normal usage of a mobile system such as a smart phone, laptop or wireless network. • Threat Modelling: Study differential equations of the normal usage model and their applications in order to model, detect and prevent threats. • Boolean Calculus: Apply Boolean algebra and the calculus of Boolean functions to design and implement the hardware and software that make up the Normal Usage and Threat Detection Systems. • Experimenting Usage and Threat Models: Use programming as a tool to experiment with representations of the normal usage and threat models, to aid the design and implementation of a mobile security audit framework. • Computer Usage Survey: Employ a questionnaire to collect information about the usage of computers and mobile phones. • Threat Detection Systems: Develop an anomaly based threat detection system to demonstrate the effectiveness of the research model. The goal is to measure the effectiveness of the threat detection system developed at preventing threats on a computer system. 
Machine Learning Algorithms & Behavior Based Intrusion Systems Machine learning techniques and algorithms will be investigated to establish the extent to which an expert system that learns a computer system’s usage can be built. Since the expected usage model is a mathematical model, various mathematical modelling techniques will be applied to determine the normal usage model.
  11. 11. Analyzing deviations from these mathematical models can lead to the design and implementation of behavior based intrusion detection systems. As such, a thorough study into the design and implementation of behavior based intrusion detection systems will be done. Audit Trail Analysis It is expected that computer security audit reports will be sampled and analyzed to arrive at a set of dependent and independent variables and their data set. These variables and their associated data set can be used to formulate the normal usage model. Normal Usage Model An investigation into applying the knowledge gained from the machine learning study, the mathematical modelling study, the behavior based intrusion detection system study and the audit trail analysis will be done. It is hoped that this will answer the question of how to represent the normal functioning of a computer system with an abstract mathematical model. Threat Modelling Differential equations of the normal usage model will be investigated to establish the extent to which deviations from the normal usage model can be analyzed. An abstract mathematical model of these deviations will be formulated. These abstract models are derivatives of the normal usage model. Boolean Calculus A study into representing the normal usage model with a Boolean function will be done. It is hoped that analyzing these Boolean functions will aid in building hardware that implements the expected usage system. Differential equations of these Boolean functions will be studied to analyze changes in the system that indicate deviation from the normal usage model. Experimenting Usage and Threat Models Programming will be used as a tool to experiment with various usage and threat models. These usage and threat models are expected to be derived from a computer system. This experiment will lead to the design and implementation of a normal usage system, a threat detection system and a risk
  12. 12. analysis system. These systems are expected to be components of a mobile security audit framework. Computer Usage Survey A questionnaire for obtaining information about computer and smart phone usage will be employed. It is expected that this will give an idea of the various statistics that make up a computer’s or smart phone’s usage. These statistics will be a guideline for sampling experimental data of a computer system’s usage when experimenting with the usage and threat models. Threat Detection Systems It is hoped that an anomaly based threat detection system will be developed to demonstrate the effectiveness of the research model at modelling system usage and threats. The effectiveness of the threat detection system at preventing threats on a computer system will also be measured. In this project, the threat detection system that will be developed is for ecommerce sites.
  13. 13. USAGE PROFILE OF A SYSTEM AND THREATS ASSOCIATED WITH THE SYSTEM Building a usage profile of a system requires determining an abstract model that represents the system’s usage appropriately. In this paper, we determine the usage profile using statistical models and machine learning models. The statistical model used is the moments (mean and standard deviation) model. The machine learning model used is the hidden Markov model. Additionally, we determine the usage profile by finding the relationship between the dependent and independent variables that make up the system, using regression. A study into the application of differentiation also gives insight into the kind of threats associated with the system. MATHEMATICAL MODELLING TECHNIQUES The mathematical relation that represents the usage of a system can be determined using regression analysis. Regression analysis is a field of statistics that employs the least squares method to determine the relationship between a dependent variable and one or more independent variables, given the data set for these variables. The least squares method determines the relationship by minimizing the sum of squared errors between the observed and predicted values. Additionally, differential equations will be used to model threats in the system given the usage equation. SIMPLE LINEAR REGRESSION Simple linear regression involves a dependent variable and a single independent variable. The goal is to find a linear relationship between the two variables. The linear relationship found is typically of the form y = b0 + b1x, where y is the dependent variable and x is the independent variable. The slope of the line is b1 and the y-intercept is b0. The relationship between the dependent and independent variables can be determined using the least squares method. 
First, the sums of the dependent and independent variables (∑y, ∑x) and the sum of their products (∑xy) are determined. Secondly, the sums of the squares of the dependent and independent variables (∑y², ∑x²) are determined.
  14. 14. The slope of the best-fit line is calculated as the sample size multiplied by the sum of products, minus the product of the sums of the two variables, all divided by the sample size multiplied by the sum of squares of the independent variable minus the square of the sum of the independent variable. This is given as b1 = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²), where n is the sample size. The y-intercept of the line is calculated as the product of the sum of the dependent variable and the sum of squares of the independent variable, minus the product of the sum of the independent variable and the sum of products, all divided by the same denominator as the slope. This is given as b0 = (∑y∑x² − ∑x∑xy) / (n∑x² − (∑x)²). Finally, the correlation coefficient of the relationship is calculated as the numerator of the slope divided by the square root of the product of two terms: the sample size multiplied by the sum of squares of the independent variable minus the square of its sum, and the sample size multiplied by the sum of squares of the dependent variable minus the square of its sum. This is given as r = (n∑xy − ∑x∑y) / √((n∑x² − (∑x)²)(n∑y² − (∑y)²)). MULTIPLE LINEAR REGRESSION Multiple linear regression problems involve a dependent variable and two or more independent variables. 
Using the least squares method, the goal is to find the linear relationship between the variables involved. The relationships are of the form y = b0 + b1x1 + b2x2 + … + bnxn, where n is the number of independent variables, x1, x2, …, xn are the independent variables and y is the dependent variable.
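For the single-variable case, the least squares sums described in the simple linear regression section translate directly into code. The Python sketch below computes the slope, the y-intercept and the correlation coefficient from ∑x, ∑y, ∑xy, ∑x² and ∑y². The function name and the sample data (chosen to lie exactly on y = 2x + 3) are assumptions for illustration only.

```python
import math


def simple_linear_regression(xs, ys):
    """Return (b0, b1, r) for the line y = b0 + b1*x fitted by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Shared denominator n*sum(x^2) - (sum x)^2 of the slope and the intercept.
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
        denom * (n * sum_y2 - sum_y ** 2)
    )
    return b0, b1, r


# Hypothetical samples: number of authentications (x) against usage (y).
authentications = [1, 2, 3, 4, 5]
usage = [5, 7, 9, 11, 13]
b0, b1, r = simple_linear_regression(authentications, usage)
```

The function assumes the x values are not all identical; otherwise the shared denominator is zero and no unique line exists.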
  15. 15. To solve multiple linear problems, we first reduce the multiple linear model to its simple linear forms. In this form, it is easier to determine the regression equation. To do this we determine y = b0 + b1x for every independent variable. That way, the set of regression coefficients b associated with the independent variables can be determined using the least squares method. The set b, made up of b1, b2, …, bn, contains all the regression coefficients of the predicted regression function. NON LINEAR REGRESSION Nonlinear regression problems involve finding a nonlinear relationship between a dependent variable and one or more independent variables. Because nonlinear relationships are difficult to analyze directly, they can be represented mathematically as linear models before they are analyzed. This makes it possible to use linear regression techniques on such relationships. One way to represent a nonlinear relationship with a linear model is to take logs on both sides of the relationship equation. An example is the equation y² = x²/(xy). Taking logs on both sides gives 2 log y = 2 log x − log x − log y. Collecting terms gives 3 log y = log x, so log y = (log x)/3. In this form, log y represents the dependent variable and log x represents the independent variable. Let K = log y and P = log x. It follows that K = P/3, which is the linear form of our nonlinear relation. SINGLE VARIABLE CALCULUS REVIEW (DIFFERENTIATION) Assume a system with exactly three major system variables. 
If sampling each of these variables leads to exactly one micro usage equation that best represents the behavior of the corresponding feature of the system, then we can use the derivatives of the three micro models to analyze and detect threats. Below are some examples of calculus basics for our usage profile and threat modelling. Y = 2X + 3 is a linear function that represents our first micro usage model, where X is the number of authentications. Y = 3X² + 2X + 6 is a quadratic function that represents our second micro usage
  16. 16. model, where X is the number of hosts on the system’s wireless network. Y = 40/X + 5 is a reciprocal function that represents our third micro usage model, where X is the number of applications on a host on the system’s wireless network. For each micro usage model, the differential coefficient can be computed using the laws of differentiation given below. THEOREM 1: d/dX (C) = 0, where C is a constant. THEOREM 2: For a term cX^n of f(Xi, Ci), where c is the constant coefficient and n is the exponent, d/dX (cX^n) = ncX^(n−1). The derivative of f(Xi, Ci) is obtained by applying this rule to every term of the simplified function and summing the results. From these laws, the differential coefficients of the three micro models are 2, 6X + 2 and −40/X² respectively. If the average usage and standard deviation of each micro model are computed, then we can analyze changes in our system by comparing the values of the usage equations and their derivatives with the average usage and its standard deviation, which helps us determine threats. INTEGRATION REVIEW Assume the functions y = 3, y = 4X + 2 and y = 9X² + 3 are threat model equations. Based on these three functions we give an introductory review of integration, the branch of calculus that reverses differentiation. The integrals of the functions are, respectively, 3X + C, 2X² + 2X + C and 3X³ + 3X + C, where C represents a system constant. Computing the integral can be tricky, so two laws are defined below to aid quick computation of the integral of an ordinary mathematical function. 
THEOREM 3: If a function is a constant k, its integral is kx + c, the product of the constant and the variable x plus a system constant c, which is determined from a known pair of x and y values.
  17. 17. THEOREM 4: If a function is not a constant, each term ax^n (with n ≠ −1) integrates to ax^(n+1)/(n+1): the coefficient of the term divided by the exponent plus one, multiplied by x raised to the exponent plus one. The integral of the function is the sum of these results over every term, plus the system constant c. BUILDING THE USAGE PROFILE To build a usage profile, we use a mathematical model that captures the behavior of the system and a Markov chain model that captures the states and transitions in the system. The mathematical model is made up of a usage equation, composed of a dependent variable and independent variables, and a statistical model that captures average usage and its standard deviation. The usage equation of the system can be summarized as Y = f(Xi, Ci), where Y is the system’s usage, Xi are the independent variables that constitute the normal usage or behavior of the system, and Ci are system constants. In order to determine the usage equation, it is essential to keep the method simple and the variables simple in abstraction and minimal in quantity. This makes it easier to represent the system’s usage appropriately with a usage equation. We then use regression to determine the mathematical equation that describes the usage of the system. In this paper, we break down the usage of a system into micro usage models that describe smaller parts of the system. Once the usage equations of these micro usage models and their associated mean and standard deviation models have been determined, the mathematical part of the usage profile is complete. What remains is to study the states and state transitions of the system, which are also part of the usage profile. AUTHENTICATION USAGE MODEL The authentication usage model represents the usage of an authentication system. 
The independent variables that must be sampled to determine the usage of an authentication system are the average data transmitted during an authentication (x1) and the average network speed for a single authentication (x2). The average data transmitted is the average of request and response data for a single authentication and the average network speed is the average upload
  18. 18. and download speed for a single authentication. The dependent variable that must be sampled is the time taken for an authentication (y). The goal of modelling the dependent and independent variables is to arrive at a mathematical relationship between y and the two independent variables x1 and x2. It is expected that the relationship will be Y=c1(x2/x1) +c2, where c1 and c2 are system constants. In addition to that, some system constants that will aid threat analysis must be determined. These are the total number of valid authentications, the expected authentications within a time frame, the minimum authentications within a time frame and the maximum authentications within a time frame. The mathematical relationship between y, x1 and x2 is the normal usage model of the authentication system. After this relationship has been determined, various occurrences that deviate from this relationship can be used to analyze threats. For instance, any occurrence that is not equal to the average usage is a threat. Additionally, any occurrence that indicates a change outside an acceptable threshold is a threat. The acceptable threshold is a range within which changes in the systems are deemed normal. Such a range is composed of the average usage and standard deviation. SESSION USAGE MODEL A session usage model represents a single user’s behavior before his session expires. To determine the mathematical model for a user’s session, two main independent variables must be sampled. These are size of session data accumulated (x1), and number of user actions (x2). The dependent variable that must be sampled is time spent before session expires (y). The session usage model is expected to be made up of two micro usage models. The mathematical representation of the micro usage models are expected to be Y=c1x1+c2 where c1 and c2 are systems constants and Y=c1x2+c2 where c1 and c2 are system constants. 
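The linear micro usage models above can be fitted with ordinary least squares. The following is a minimal sketch, assuming a single independent variable per micro usage model; the class name SimpleLinearFit and its methods are hypothetical and are not taken from the paper's appendices.

```java
// Hypothetical helper: fits y = c1*x + c2 by ordinary least squares.
final class SimpleLinearFit {
    final double slope;      // plays the role of c1
    final double intercept;  // plays the role of c2

    private SimpleLinearFit(double slope, double intercept) {
        this.slope = slope;
        this.intercept = intercept;
    }

    static SimpleLinearFit fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two aligned samples");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-deviations and squared deviations about the means.
        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new SimpleLinearFit(slope, meanY - slope * meanX);
    }

    // Predicted usage value for a given x.
    double predict(double x) {
        return slope * x + intercept;
    }
}
```

For the session usage model, x would hold either the sampled session data sizes or the sampled numbers of user actions, and y the sampled times spent before the session expires.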
In addition to the two mathematical functions, some system constants that will aid threat analysis must be determined. These include average user actions, average size of data
  19. 19. accumulated, average time spent. These constants can be determined from the data set used to determine the usage model. The two mathematical relationships represent the session usage model. Both are linear functions. It is expected that as user actions increase, the time spent also increases. It is also expected that as the data accumulated increases, the time spent also increases. MEMORY USAGE MODEL The memory usage model represents the usage of memory space in a system. The independent variables that must be sampled are the number of application programs running (x1), and the number of system processes running (x2). The dependent variable that must be sampled is the amount of memory space being used (y). The mathematical relationship between x1, x2, and y is expected to be y=c1x1+c2x2+c3 where c1 is the average memory space used per program, c2 is the average memory space used per process and c3 is the average memory being used when no process or program is running. In addition to these, some system constants that aid threat analysis must be determined. These include the minimum and maximum memory space for programs and the minimum and maximum memory space for processes. The mathematical relationship between x1, x2, and y is the memory usage model. When determined, the memory usage model can be used to analyze changes in the memory usage that indicate threats in the system. CPU USAGE MODEL The CPU usage model represents CPU usage in a system. The independent variables that must be sampled are the number of application programs running (x1), and the number of system processes running (x2). The dependent variable that must be sampled is the amount of CPU power being used (y). The mathematical relationship between x1, x2, and y is expected to be y=c1x1+c2x2+c3 where c1 is the average CPU power used per program, c2 is the average CPU power used per process and c3 is the average CPU power being used when no process or program is running. 
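A relation of the form y=c1x1+c2x2+c3 can also be fitted in one step by least squares on both independent variables together. The sketch below is one way to do this, solving the 3x3 normal equations by Gauss-Jordan elimination; it is an illustration under that assumption, not the procedure used in the experiment, and the class name TwoPredictorFit is hypothetical.

```java
// Hypothetical helper: fits y = c1*x1 + c2*x2 + c3 by least squares.
final class TwoPredictorFit {

    // Returns {c1, c2, c3}.
    static double[] fit(double[] x1, double[] x2, double[] y) {
        int n = y.length;
        if (x1.length != n || x2.length != n || n < 3) {
            throw new IllegalArgumentException("need at least three aligned samples");
        }
        // Augmented normal equations (A^T A | A^T y), where each row of A is [x1, x2, 1].
        double[][] a = new double[3][4];
        for (int i = 0; i < n; i++) {
            double[] row = {x1[i], x2[i], 1.0};
            for (int r = 0; r < 3; r++) {
                for (int c = 0; c < 3; c++) {
                    a[r][c] += row[r] * row[c];
                }
                a[r][3] += row[r] * y[i];
            }
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < 3; col++) {
            int pivot = col;
            for (int r = col + 1; r < 3; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new IllegalArgumentException("predictors are constant or collinear");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            for (int r = 0; r < 3; r++) {
                if (r == col) {
                    continue;
                }
                double factor = a[r][col] / a[col][col];
                for (int c = col; c < 4; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        return new double[] {a[0][3] / a[0][0], a[1][3] / a[1][1], a[2][3] / a[2][2]};
    }
}
```

Here x1 and x2 would be the sampled counts of programs and processes, and y the sampled memory or CPU usage; the units are whatever the sampling code records.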
In addition to these, some system constants that aid threat analysis must be determined. These include the minimum and maximum CPU power for programs and the
  20. 20. minimum and maximum CPU power for processes. The mathematical relationship between x1, x2 and y is the CPU usage model. When determined, the CPU usage model can be used to analyze changes in the CPU usage that indicate threats in the system. PROGRAM USAGE MODEL To determine the program usage model, the dependent and independent variables that must be sampled are the time spent using the program (y), and the number of functions used (x). In addition to that, the following constants must also be determined: the minimum and maximum number of functions used. The relationship between y and x determined after sampling various x and y values is the program usage model denoted by y=f(x). HOST USAGE MODEL The host usage model is composed of four independent variables: memory usage (x1), session usage (x2), CPU usage (x3), and program usage (x4), derived from their respective usage models. The dependent variable that must be sampled is the time spent on the host (y). Any relationship determined between the dependent and the independent variables is the host usage model. The resulting host usage model is denoted y=f (x1,x2, x3, x4). BATTERY USAGE MODEL The battery usage model is made up of the average CPU usage, the average memory usage and the average session usage in the system. These are the independent variables. The dependent variable is the battery lifespan. The independent variables are derived from their respective micro usage models. DEVICE USAGE MODEL The device usage model is made up of a battery usage model, a host usage model, and the time spent on the device. The usage models that make up the device usage model compute the average micro usage and try to relate that to the time spent on the device. The time spent on the device is the dependent variable.
  21. 21. SERVER USAGE MODEL The server usage model is made up of the CPU time being used, the memory space being used and the number of processes running. These variables are used to form two different micro usage models. As such, there are two dependent variables, CPU time and memory space. The independent variable for both micro usage models is the number of processes running. PORT USAGE MODEL The port usage model is made up of the time elapsed during communication, number of programs that use the port and the number of paired ports. The number of paired ports is the dependent variable and the remaining variables are the independent variables. NETWORK USAGE MODEL The network usage model is made up of average port usage, average server usage, average host usage, the average size of data transmitted on the network, and time spent on the network. The first three variables are the independent variables. The remaining two are the dependent variables. As such, two micro usage models make up the network usage model. AGGRESSIVE USAGE DETECTOR This model is a utility that detects aggressive behavior on a system. It is modelled just like the various micro usage models. Various factors that determine aggressive behavior during system usage are used to determine the mathematical representation of this utility. Aggressive behavior includes aggressive use of major system resources, and aggressive use of system components with limited resources. The average aggressive behavior and its standard deviation are determined. Any system occurrence that indicates the average aggressive behavior, or the average aggressive behavior plus its standard deviation or the average aggressive behavior minus its standard deviation is considered a threat and must be halted, alerted or stored for audit purposes. FALSE ALARM DETECTOR The false alarm detector is a utility that detects normal system usage that otherwise may be seen as threat. 
Occurrences that meet the criteria for false alarms are normal usage that seems to put
  22. 22. the entire usage of the system into a false state of vibration or anarchy. Such usage occurrences are therefore prioritized as normal optimal usage. The remedy for the vibrations such usage occurrences cause is to delay other normal usage occurrences in the system. The state and magnitude of other system occurrences plus the state and magnitude of the normal optimal usage determine the impact of the perceived anarchy. To make the system for which this utility is developed more convenient to use, the average delay time and its standard deviation must be determined. This utility is part of the normal usage. The utility is modelled just like the aggressive usage detector. SPECIAL PARAMETERS OF THE USAGE PROFILE This section discusses special parameters of our usage profile. These parameters include the average usage, the usage standard deviation, the minimum usage, the maximum usage and the most frequent usage value recorded. The average usage is the predicted average usage after the normal usage model function has been determined. The usage standard deviation is the standard deviation of the predicted normal usage function. The minimum and maximum usage values are the minimum and maximum usage predicted using the usage equation. These parameters together with usage rates, threat model constants and other usage constants are used in analyzing and detecting threats. BUILDING THE THREAT PROFILE OF THE SYSTEM To build a threat profile of a system we use differential equations of the usage profile. When we study differential equations of the usage profile, we arrive at occurrences in the system that deviate from our usage profile. The differential coefficient of the usage profile is known as a threat model. Threats in the system occur as a result of changes in the usage profile that go beyond an acceptable threshold, which is defined by the standard deviation of the usage profile. 
A threat model on the other hand is an abstract representation of this change in our system that is beyond the acceptable threshold. In addition to that, any state of the system that is unusual or any state
  23. 23. transition in the system that is unusual is also a threat. We will look at how to detect and prevent such unusual states and transitions in the system using hidden markov models. Integration can be performed on a threat model to determine the source of the threat. Integration is a reverse operation for differentiation in calculus. A threat model that can perform integration operations can be called a novel self integrating data structure. The sections that follow will look at how to analyze and prevent threats using the usage profile. Also, how to determine the sources of these threats using a novel self integrating threat model will be discussed. THREAT ANALYSIS AND DETECTION To do threat analysis in a system and abort processes that initiated those threats, linear and nonlinear programming techniques can be used. The goal here is to minimize the threat occurrence frequency and the overall impacts associated with the threat and optimize the usage equation. In addition to these two goals, there are some constants that aid threat analysis. These constants are associated with the usage profile and the threats in the system. Examples of these constants may be the rate at which usage is increasing with respect to a particular usage variable or the rate at which the threat impact and frequency increases with respect to a particular variable in the usage profile and other special parameters associated with the usage profile equation. The average usage, its standard deviation and the threat model equation make up the threat model. The average usage and standard deviation are constants in the threat model. Using the threat model equation, the average usage and standard deviation, threats analysis can be done using linear and nonlinear programming. The goal is to minimize threats using the threat model equation as the objective function and the average usage and standard deviation as constraints. 
Other parameters that may be used as constraints include the rate at which usage is increasing with respect to a particular usage variable or the rate at which the threat impact and frequency is increasing with respect to a particular usage variable.
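The relationship between a usage profile equation, its differential coefficient (the threat model) and the integral of that threat model can be illustrated with a small sketch. It assumes, purely for illustration, that the usage profile is a polynomial in a single variable x; the class Polynomial and its method names are hypothetical. The integration step follows the term-by-term rule of Theorems 3 and 4, with the system constant c fixed by one known pair of x and y values.

```java
// Hypothetical sketch: a one-variable polynomial usage profile.
final class Polynomial {
    // coeffs[i] is the coefficient of x^i.
    private final double[] coeffs;

    Polynomial(double... coeffs) {
        this.coeffs = coeffs.clone();
    }

    // Evaluates the polynomial at x using Horner's rule.
    double valueAt(double x) {
        double result = 0.0;
        for (int i = coeffs.length - 1; i >= 0; i--) {
            result = result * x + coeffs[i];
        }
        return result;
    }

    double coefficient(int power) {
        return power < coeffs.length ? coeffs[power] : 0.0;
    }

    // Differential coefficient: the threat model of this usage profile.
    Polynomial derivative() {
        if (coeffs.length <= 1) {
            return new Polynomial(0.0);
        }
        double[] d = new double[coeffs.length - 1];
        for (int i = 1; i < coeffs.length; i++) {
            d[i - 1] = i * coeffs[i];
        }
        return new Polynomial(d);
    }

    // Term-by-term integral; the constant c is chosen so that the
    // result passes through the known point (x0, y0).
    Polynomial integralThrough(double x0, double y0) {
        double[] in = new double[coeffs.length + 1];
        for (int i = 0; i < coeffs.length; i++) {
            in[i + 1] = coeffs[i] / (i + 1);
        }
        Polynomial withoutConstant = new Polynomial(in);
        in[0] = y0 - withoutConstant.valueAt(x0);
        return new Polynomial(in);
    }
}
```

For example, a usage profile 3 + 2x + x^2 has the threat model 2 + 2x, and integrating that threat model through the point (0, 3) recovers the original usage profile.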
  24. 24. THREAT PREDICTION This section discusses how to predict threats in a system. The network usage model discussed in this chapter and its associated threat model will be used to demonstrate how to predict or detect a threat in a system. As discussed in the previous section, threat can be detected using linear and nonlinear programming. The network usage model equation and its associated threat model equation are the objective functions. The constraints that will be used are the average network usage and its standard deviation, and other parameters such as the rate at which the network threat increases with respect to other network usage profile components such as average host usage, average server usage, average port usage, average time the network operates, average data transmitted on the network. The goal of the linear or nonlinear programming is to optimize the usage such that usage is within the range of the average usage minus its standard deviation and the average usage plus its standard deviation. These are the lower and upper bounds of our objective function. Every combination of system variables whose usage is within this usage range minimizes threat in the system. Since the average port, host and server usage are derived from their corresponding usage profile models, the linear and nonlinear programming analysis will be done independently for these ones. When a threat is predicted in a system, the chance of it being accurate is dependent on the usage value at that instance and whether it is within the range of the acceptable usage. This is constructed using the average usage and its standard deviation. Any usage value that is less than the average usage minus its standard deviation is a threat. Also, a usage value that is greater than the average usage plus its standard deviation is a threat. That means that any predicted threat at a point where the predicted usage is within the usage range has a high chance of being false. 
In addition to that, the actual and predicted usage values can be used to determine the chance that the predicted threat is accurate. If the difference between them is high, there is a chance that the predicted usage may be wrong. Since the predicted usage and the threat models are derived from the usage profile equation, there is a chance the predicted threat is also false.
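The acceptable usage range described in this section can be sketched as follows. The code computes the average usage and its standard deviation from sampled usage values and flags any value outside the range from the average minus the standard deviation to the average plus the standard deviation. The class name UsageBand is hypothetical, and the use of the population standard deviation (dividing by n) is an assumption, since the paper does not say which form was used.

```java
// Hypothetical sketch of the mean and standard deviation model.
final class UsageBand {
    final double mean;
    final double stdDev;

    private UsageBand(double mean, double stdDev) {
        this.mean = mean;
        this.stdDev = stdDev;
    }

    static UsageBand fromSamples(double[] usage) {
        if (usage.length == 0) {
            throw new IllegalArgumentException("no usage samples");
        }
        double sum = 0.0;
        for (double u : usage) {
            sum += u;
        }
        double mean = sum / usage.length;

        double squared = 0.0;
        for (double u : usage) {
            squared += (u - mean) * (u - mean);
        }
        // Assumption: population standard deviation (divide by n).
        return new UsageBand(mean, Math.sqrt(squared / usage.length));
    }

    double lower() {
        return mean - stdDev;
    }

    double upper() {
        return mean + stdDev;
    }

    // A usage value outside [lower, upper] is treated as a threat.
    boolean isThreat(double usage) {
        return usage < lower() || usage > upper();
    }
}
```

A predicted threat whose usage value falls inside the band would, by the reasoning above, have a high chance of being a false alarm.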
  25. 25. Finally, the closer the correlation coefficient of the usage profile equation is to zero, the higher the chance the predicted usage and its associated threats values are wrong. Usage model functions with correlation coefficient of 0.6 and above indicate that the predicted usage values and predicted threats values are accurate. These values are obtained from the usage model function and the threat model function respectively which are modeled using relevant systems variables that make it possible to model system usage and system threats. RISK ANALYSIS IN A SYSTEM To do risk analysis in a system, the frequency at which threats in the system occur and the impact they have on the system must be known. When a frequency table is constructed for all threats and their associated impacts stored, it becomes easy to analyze risks associated with a system. When a threat is predicted, the likelihood of the threat occurring in the system can be computed using the threat frequencies. The impacts various threats have can also be determined based on the types of threats and other parameters such as the number of such threats, the speed at which they occurred and the resources they affected or damaged. Risk in a system is computed as the product of the likelihood of threat occurrence and the impact that threat occurrence has on the system. These concepts are the basics for developing a risk analysis system using the techniques we have discussed so far. PROPERTIES AND METHODS OF THE NOVEL SELF INTEGRATING STRUCTURE The best properties or characteristics of the data structure that represents our threat model include just to mention a few, names of network software or host application software, version number of network and host software, license information that include date software was purchased or released and number of years needed for renewal, IP address and Mac address of a host on a network. 
The methods of such a gigantic or simulative object may include methods for computing the integral of a threat model, another for computing the differential coefficient of the predictive usage profile equation, a method for computing the differential equation of a network or host
  26. 26. threat model. These are mostly the methods needed to perform the major calculus operations that support the novel calculus simulation used to detect threats and their sources on a wireless network. Besides these, it may be necessary to implement methods that retrieve hidden network identities such as IP and Mac addresses on a local area network. INTERPRETATION OF THREAT MODEL INTEGRALS Since the novel self integrating data structure is a programmed threat model, it is important to discuss the meaning of its integrals. The integrals represent the source of the original threat. Integrating the threat model may, for example, identify the function, software, host or network from which the threat originated. With properties like software name, version number, IP and Mac addresses it becomes easy to pinpoint the source of the threat. If the integral of a threat model looks like the usage profile equation of a function of the system under examination, then that function can be predicted as the source of the threat. Similarly, if the integral is similar to the usage profile equation of a software, host, or network that forms part of the system being investigated, then the threat can be predicted to come from that software, host or network. APPLICATION OF HIDDEN MARKOV MODELS FOR STUDYING VARIOUS STATES IN THE SYSTEM AND THREATS THAT CAN BE DETECTED Hidden markov models are machine learning models that are used to model states in a system, the sequence in which they occur and the associated probabilities for each state transition. When a system has a set of states in which it usually falls and it can be predicted or established that each new state is dependent on the previous states, then hidden markov models can be used to learn the state transitions that usually happen in the system. 
It must be stated that the sequence in which states occur in a system can be characterized by a parametric random process. Also, the probability associated with each state transition is irrespective of the time in which the transition occurred in the system. For computer systems which have occurrences that happen based on a parametric random process, these occurrences can be seen as the set of states in the system. Some of these
  27. 27. occurrences may be the point at which the system is at its optimal, maximum or minimum usage, or the point at which the system attains average usage. It must be stated that, if the various states of the system’s usage are determined from the usage profile which is made up of the usage equation and the mean and standard deviation model, then any state that is unusual is seen as a threat. The usage states can be the minimum, average and maximum usage. Also the various state transitions from one of these states to another can be determined. As such, any state which is less than the minimum usage or greater than the maximum usage can be seen as an unusual behavior. Also any transition from one of these states to another which is not captured as a normal state transition can be seen as a threat. Additionally, when a set of threat types that happens in the system is determined, it becomes possible to study the sequence in which these threats occur in the system and the various transitions between the threats using hidden markov models. Also, the various usage points including the optimal, the minimum and the average usage and how they are transited in the system can be studied using hidden markov models. Because various occurrences and threats can be studied using hidden markov models, it becomes possible to predict the next occurrence or threat that will happen on a host or a computer network. Threat sources can also be predicted using threat models. When threat models are integrated, they give a general idea about the source of the threat. With such knowledge and ability, the next threat or occurrence that has a higher likelihood of happening on a host or network can be predicted using application of hidden markov models. As such, occurrences can be prevented if they are estimated to be disastrous. 
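The idea of learning normal state transitions and flagging unusual ones can be sketched with a simplified model. The code below is a first-order Markov chain over directly observed usage states, not a full hidden markov model with hidden states and emission probabilities; it is a deliberately simpler stand-in. The class name UsageStateChain, the integer state encoding and the probability threshold are all assumptions made for illustration.

```java
// Hypothetical sketch: a first-order Markov chain over observed usage states.
// States are integer indices, e.g. 0 = minimum, 1 = average, 2 = maximum usage.
final class UsageStateChain {
    private final double[][] prob;

    // Estimates transition probabilities from an observed state sequence.
    UsageStateChain(int stateCount, int[] sequence) {
        int[][] counts = new int[stateCount][stateCount];
        for (int i = 0; i + 1 < sequence.length; i++) {
            counts[sequence[i]][sequence[i + 1]]++;
        }
        prob = new double[stateCount][stateCount];
        for (int from = 0; from < stateCount; from++) {
            int total = 0;
            for (int c : counts[from]) {
                total += c;
            }
            for (int to = 0; to < stateCount; to++) {
                prob[from][to] = total == 0 ? 0.0 : (double) counts[from][to] / total;
            }
        }
    }

    double probability(int from, int to) {
        return prob[from][to];
    }

    // A transition is unusual if it was rarely or never seen during learning.
    boolean isUnusual(int from, int to, double minProbability) {
        return prob[from][to] < minProbability;
    }

    // The next state with the highest estimated probability (first one on ties).
    int mostLikelyNext(int from) {
        int best = 0;
        for (int to = 1; to < prob[from].length; to++) {
            if (prob[from][to] > prob[from][best]) {
                best = to;
            }
        }
        return best;
    }
}
```

With a learned sequence in which the system never jumps directly from minimum to maximum usage, such a jump observed during monitoring would be reported as an unusual transition.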
Also, if for some reason the optimal or minimal usage must be reached, it becomes possible to study ways of optimizing the transition from the current state or predicted next state to the required state. This makes it possible to move from a particular usage point to the desired usage point. This approach to threat detection and usage optimization makes it possible to build anomaly based intrusion detection systems that are correct, prompt and increase optimal use of the system. The anomaly based intrusion detection systems built using these techniques are correct because the threat models come from usage models that are built using similar
  28. 28. approaches and the threat prediction and prevention mechanisms are designed using robust techniques developed using these approaches. Also, there are likely going to be lower false alarms since the threats predicted on host or networks come from threat models designed from such robust methods. An example of a kind of cyber security threat that this approach can be used to model is a network problem where a student is determined or predicted to be sending threatening or socially unacceptable emails to colleagues. Typically, his identity is hidden on the network on which he sends the emails. As such, it is difficult to determine the likelihood that he will send such threatening emails on a particular day or hour so that his identity could be determined and brought to book. Using hidden markov models, a usage profile of the email system could be developed. This will make it possible to determine the day or hour in which he is likely going to send such threatening email so that his identity can be found and the problem solved.
  29. 29. EXPERIMENTING THE USAGE PROFILE AND THREATS ASSOCIATED WITH IT In this chapter, we discuss the experiment that was conducted to determine the usage of a computer system. We also discuss how to simulate the threat and usage models with the hope of developing a threat detection system. The experiment was conducted by implementing the usage models in java. The implementation uses an interface that captures what makes up a usage model. Each micro usage model implements the interface. The interface is given in Appendix B. Additionally, a multithreading object was implemented for learning the system’s usage, monitoring the system, and determining the relationship that describes the usage model. This object is given in Appendix A. The interface is made up of eight functions: computeval() for computing the usage value based on system variables, findchange() for finding any change in the usage model, learnsys (int t) for learning the system within a time frame, findrelationship() for determining the relationship that represents the usage model, monitor(int t) for monitoring the system, showalarm(String info) for displaying alarms, haltprocess() for halting system processes that are threats, and predictvals() for predicting system usage values. The multithreading object implements the run method of the thread class. The run method performs three main functions. It calls the learnsys method of the usage model when the system’s usage needs to be learnt. It calls the findrelationship method of the usage model when the system usage profile equation needs to be determined. It also calls the monitor method of the usage model when the system has to be monitored for various activities. During the experiment the various independent and dependent variables of the usage models were sampled. Each sample was captured for the particular usage model using random numbers. The samples were captured within a time frame. 
The time was in seconds and refers to the t variable that the learnsys method of the usage model takes. Each second a sample is captured for each usage model. Then after the samples have been captured, the relationship that describes the usage model for the system was determined using the mathematical modelling techniques reviewed in the previous chapter.
  30. 30. Also, after the usage profile had been built by learning the system and capturing the samples that help determine the various usage models, the system was monitored within a time frame. The time was set in seconds. This time refers to the t variable that the monitor method of the usage model takes. This was used to analyze the system in order to have an idea about the activities going on in the system. Based on the usage profile built, activities that deviate from the usage profile are flagged as incidents that indicate a threat in the system. Additionally, during the experiment processes that are independent were grouped first and executed together. Processes that depended on other processes had to wait for those processes to finish before being executed. This was important because it makes it possible to determine first the usage models of the micro usage models that form part of other usage models. Later these micro usage models can be used to determine the usage model of the parts of the system that depend on them. This can be seen in the main method in Appendix C. It must be stated that because the usage model for authentication was determined to be a rational function, logs must be taken on both sides of the relation as part of the experiment in order to reduce the relation to a linear form. The original function is Y=c1(x2/x1) +c2. The reduced form used is log Y= log c1+ log x2 - log x1 + log c2. Since log c1 and log c2 are constants, let us denote them by k1 and k2 respectively. Additionally, let B= log Y, let j1= log x1 and let j2= log x2. Therefore, the linear form of the usage for authentication is B= j2- j1 + k1 + k2. Since k1 + k2 is a constant, let it be represented by k. As such B= j2- j1 + k where B is the dependent variable and j2 and j1 are the independent variables. Note that this log-linear form is exact only for the multiplicative relation Y=c1c2(x2/x1), because the log of a sum is not the sum of the logs; with the additive constant c2, Y is already linear in the ratio x2/x1 and can also be fitted by regressing Y directly on x2/x1. When B, j2, and j1 are sampled, Y=c1(x2/x1) +c2 can be determined. The cpu and the memory usage models on the other hand are multiple linear. 
The original relations are of the form y=c1x1+c2x2+c3 where x1 and x2 are the independent variables. The original relations must be reduced to their simple linear form. To do this, determine y=b0+bx for each independent variable. The sum of the various b0 equals c3. The various b correspond to the constant associated with the independent variable for which y=b0+bx was determined. For example, the b for any y=b0+bx determined for x1 equals to c1 and that for x2 equals to c2. When
  31. 31. x1, x2, and y are sampled and the various y=b0+bx determined, y=c1x1+c2x2+c3 can be determined completely. CHALLENGES WITH PROJECT This section of the report mentions the challenges encountered during the research. First of all, it is important to state that it was not easy to get audit data, so the system variables enumerated in this research were based on a careful selection of critical aspects of a computer system that were judged suitable. Secondly, it is important to note that during the simulation of the usage and threat models, the multithreading object discussed earlier did not work as expected. As such, system incidents were captured while learning the system’s usage by a method that captures the information each second. This method uses the sleep method of the thread class. The sleep method is a static method. As such, a while loop that runs for t seconds was implemented, using the sleep method to make the system wait for a second before capturing the next set of sample data.
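The sleep-based capture loop described above can be sketched as follows. This is an illustrative reconstruction, not the code from the appendices: the class name PeriodicSampler, the use of a DoubleSupplier as the sample source, and the configurable interval are assumptions. With count set to t and intervalMillis set to 1000, it captures one sample per second for t seconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

// Hypothetical sketch of a sleep-based sampling loop.
final class PeriodicSampler {

    // Captures `count` samples from `source`, pausing `intervalMillis`
    // between consecutive captures. Stops early if the thread is interrupted.
    static List<Double> collect(DoubleSupplier source, int count, long intervalMillis) {
        List<Double> samples = new ArrayList<>();
        int captured = 0;
        while (captured < count) {
            samples.add(source.getAsDouble());
            captured++;
            if (captured < count) {
                try {
                    Thread.sleep(intervalMillis); // static method of the Thread class
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    break;
                }
            }
        }
        return samples;
    }
}
```

In the experiment the samples were generated with random numbers, so a call such as `PeriodicSampler.collect(new java.util.Random()::nextDouble, t, 1000L)` would play the same role for a time frame of t seconds.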
  32. 32. TOOLS AND COMPUTER PACKAGES This chapter discusses the tools and computer packages that were used throughout this research project. We will also look at the programming languages, database platforms and development frameworks that can be used to develop an anomaly based intrusion detection system for ecommerce sites using the concepts we have discussed in this paper. The simulation was implemented using java. It was a console based simulation. Java was chosen for its object oriented concepts such as encapsulation, inheritance, interfaces, objects, and polymorphism. To implement an intrusion detection system using the results of this research, the following tools, which are best suited for intrusion detection systems developed for ecommerce sites, will be essential: Bootstrap, CodeIgniter, MySQL Database Management System, SQLite, SQLyog, and Eclipse. The development platforms that will be used are PHP and Android. PHP is for the desktops and laptops that connect to the ecommerce sites and Android is for mobile phones that use the ecommerce sites. Bootstrap and CodeIgniter are web development frameworks. Bootstrap is for frontend development and CodeIgniter is a backend framework for PHP developers. Eclipse can be used as the IDE for Android development. MySQL and SQLyog are for the database servers that will run on the ecommerce site as part of the intrusion detection system implementation. SQLite is for the databases that run on the Android implementations that form part of the intrusion detection system developed for the ecommerce website. With all these tools, frameworks and packages, developers are ready to develop intrusion detection systems for ecommerce sites using the concepts in this research paper. It is expected that the micro usage models discussed will be integral libraries implemented in PHP and Android as part of an implementation for ecommerce sites or any group of web or mobile application systems.
  33. 33. CONCLUSION AND DISCUSSION To end this discussion, it is worth mentioning that the normal usage models and threat models experimented with in this paper represent a computer system and its associated threats. These threats can be analyzed periodically and audited as part of a computer security audit. This will fuel the development of a risk analysis system. A risk analysis system, threat detection system and normal usage system developed from experimenting with the usage and threat models will make up a mobile security audit framework that can be used for maintaining cyber security on computer systems. When practices and processes for maintaining this framework are drafted and adhered to, it will be easy to maintain cyber security on various computer systems. Additionally, it can be established that using the differential equations technique, the novel self integrating data structure, and the linear and nonlinear programming techniques, threats on a system can be analyzed and detected. To halt such threats, the intrusion detection system developed using the techniques stated above must possess certain qualities. These qualities include correctness, promptness, and ease of use. Correctness means how well the intrusion detection system can detect threats. This is important because correctness affects the rate at which a predicted threat is false or true. Promptness is related to the time it takes to detect or halt a threat, and ease of use is related to how well the intrusion detection system supports convenient use of the computer system for which it is developed. The techniques we have discussed make it possible to achieve correctness, promptness and ease of use. The usage model function with its associated average usage and standard deviation makes it possible to ensure correctness of the intrusion detection system. This is because the statistical data sampled for developing the intrusion detection system is within the range of the acceptable usage. 
The average usage and standard deviation are computed using statistical models. One such model used in this research is the moments model, also called the mean and standard deviation model. With this statistical model and the usage model equation, it becomes possible to ensure correctness of the intrusion detection system. To achieve promptness, multithreading is applied to analyze, predict, detect and halt threats. All threat alarms and detection must use multithreading. Multithreading is a
  34. 34. programming concept that allows several threads of execution to run on the computer at the same time. This concept makes it possible to predict multiple threats, perform multiple threat analyses and halt or raise alarms for multiple threats in a computer system. This makes the threat detection system prompt. Multithreading may be optimized to halt, prevent or raise alarms for threats with high magnitude and impact. Prioritizing the detection of such threats also uses multithreading. Ease of use is achieved using the mean and standard deviation model. Without that model, there would be no acceptable range of usage. That means that the average usage and its standard deviation prevent a rigid usage model and as such make usage convenient. Periodic audits also ensure that the normal usage function and its mean and standard deviation model are up to date. Applying machine learning techniques such as data mining also ensures that the usage model and its associated mean and standard deviation model are up to date with actual usage of the system. With monitoring mechanisms, intrusion detection systems developed using the normal usage model and its associated mean and standard deviation model help ensure ease of use. This is because an administrator monitoring a network or computer system using an intrusion detection system can prevent relevant threats with a click. The administrator can also use several configurations of the intrusion detection system to halt or prevent ordinary threats. All these mechanisms, together with the mean and standard deviation model of the normal usage, make the computer or network system being monitored easy to use. It is expected that the normal usage model can be given a hardware representation, and that Boolean algebra and the calculus of Boolean functions can be used to research how to implement it. 
These concepts are related to concepts from computer organization and architecture such as logic gates, multipliers, and the design of arithmetic and logic units, and to concepts from embedded systems such as the architectures of various embedded system implementations. These architectures include hardware-only implementations and hardware/software implementations.
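The multithreaded, prioritized handling of threats described in this section could be sketched as follows. This is only an illustration under stated assumptions: the class ThreatQueueSketch, the Threat type, its impact score and the handleAll method are hypothetical names, and the "handling" step is a placeholder for real alarm or halt logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: worker threads drain a queue ordered by threat impact. */
class ThreatQueueSketch {

    /** A detected threat with an assumed impact score (higher is worse). */
    static final class Threat implements Comparable<Threat> {
        final String name;
        final int impact;

        Threat(String name, int impact) {
            this.name = name;
            this.impact = impact;
        }

        @Override
        public int compareTo(Threat other) {
            return Integer.compare(other.impact, this.impact); // highest impact first
        }
    }

    /** Handles all threats with the given number of workers; returns the handling order. */
    static List<String> handleAll(List<Threat> threats, int workers) {
        PriorityBlockingQueue<Threat> queue = new PriorityBlockingQueue<>(threats);
        List<String> handled = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                Threat t;
                while ((t = queue.poll()) != null) {
                    handled.add(t.name); // placeholder for alarm or halt logic
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }

    public static void main(String[] args) {
        List<Threat> threats = Arrays.asList(
                new Threat("port scan", 2),
                new Threat("data exfiltration", 9),
                new Threat("failed login burst", 5));
        System.out.println(handleAll(threats, 1));
    }
}
```

With a single worker the threats are handled strictly in order of impact. With several workers the highest-impact threats are still picked up first, but the completion order across threads is no longer guaranteed.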
  35. 35. The two utilities that compose the usage model are essential for improving convenient usage and preventing false alarms. This makes the intrusion detection system correct and prompt at preventing threats. These utilities can be modelled mathematically as linear and quadratic functions. For instance, the process that causes aggressive usage can be modelled as quadratic or linear. If the process involves a download or data transfer on a network, then the size of the data being transferred or downloaded determines whether the mathematical model is linear or quadratic. If the size of the data is huge, the function is quadratic. If it is small, the function is linear. Aggressive behavior and false alarms are analogous to public road transportation. When a car wants to overtake another on a highway, the car or truck ahead of it must give way to the overtaking car. This prevents anarchy on the highway. At the same time, overtaking may look like speeding and may create a false impression of anarchy on the road. If a process has a high chance of completing quickly and has indicated that it wants to commence, then it is expected that the state of the system usage does not change. Any other process that changes the state of the system usage such that the commencement of the new process leads to vibrations or anarchy is aggressive and must be stopped. The implication of this research is its application to critical systems such as medical systems, banking systems and systems for monitoring incidents on a country’s road network. Usage of banking and medical systems may pose social threats such as theft and increased mortality rates. To prevent such threats, frameworks for auditing these systems must be used periodically to secure human life and the savings and investments of citizens and organizations. Incidents on a country’s road network can lead to prolonged court cases and health and work hazards. 
Building incident monitoring systems for such road networks can reduce these health and work hazards and the prolonged court cases in a country. The concepts discussed in this paper may be beneficial to the design and implementation of such systems.
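The choice between a linear and a quadratic utility for a data transfer, as described in this section, might be sketched as below. The size cutoff and both coefficients are purely illustrative assumptions, as are the class and method names; the paper does not give concrete values.

```java
/** Hypothetical sketch: pick a linear or quadratic usage utility by transfer size. */
class UtilityModelSketch {

    // All constants are illustrative assumptions, not values from the paper.
    static final double LARGE_TRANSFER_MB = 100.0; // cutoff between "small" and "huge"
    static final double LINEAR_RATE = 1.0;         // usage units per MB
    static final double QUADRATIC_RATE = 0.01;     // usage units per MB squared

    /** Linear utility, used for small transfers. */
    static double linearUtility(double sizeMb) {
        return LINEAR_RATE * sizeMb;
    }

    /** Quadratic utility, used for huge transfers. */
    static double quadraticUtility(double sizeMb) {
        return QUADRATIC_RATE * sizeMb * sizeMb + LINEAR_RATE * sizeMb;
    }

    /** Expected usage of a transfer, choosing the model from its size. */
    static double expectedUsage(double sizeMb) {
        return sizeMb > LARGE_TRANSFER_MB ? quadraticUtility(sizeMb) : linearUtility(sizeMb);
    }

    public static void main(String[] args) {
        System.out.println("50 MB  -> " + expectedUsage(50));
        System.out.println("200 MB -> " + expectedUsage(200));
    }
}
```

An observed usage far above the value predicted by the selected utility would then be a candidate for the aggressive behavior discussed above, while usage close to the prediction would not raise an alarm.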
  36. 36. APPENDIX A
import java.util.*;

class LearnSysProcess extends Thread {
    String process_name;
    model usage;
    int time;
    int time2;
    int N;
    Thread process;

    LearnSysProcess(String name, model x, int t, int n, int t2) {
        process_name = name;
        usage = x;
        time = t;
        N = n;
        time2 = t2;
    }

    public void run() {
        try {
            switch (N) {
                case 1: // learn system
                    usage.learnsys(time);
                    break;
                case 2: // find relationship
                    usage.findrelationship();
                    break;
                case 3: // monitor system
                    usage.monitor(time2);
                    break;
            }
        } catch (Exception e) {
            // errors in a learning or monitoring task are ignored
        }
    }

    public void start() {
        if (process == null) {
            // Start this thread itself rather than a wrapper thread, so that
            // join() and isAlive() called on this object reflect the running task.
            process = this;
            setName(process_name);
            super.start();
  37. 37.
        }
    }
}

APPENDIX B
public interface model {
    public double computeval();
    public double findchange();
    public void learnsys(int t);
    public Object findrelationship();
    public void monitor(int t);
    public void showalarm(String info);
    public void haltprocess();
    public void predictvals();
}

APPENDIX C
class normal_usage implements model {
    double usageVal;            // a system's usage value at a point in time
    network_usage net_usag;     // network behavior
    user_behavior u_bev;        // user behavior
    device_usage device_usag;   // average behavior of a mobile device on the network
    memory_usage memory_usag;
    cpu_usage cpu_usag;
    program_usage program_usag;
    host_usage host_usag;
    double net_usag_time;       // how long the system functions
    usage_modeller math_modeller;

    normal_usage() {
        net_usag = new network_usage();
        u_bev = new user_behavior();
        device_usag = new device_usage();
        memory_usag = new memory_usage();
        cpu_usag = new cpu_usage();
        program_usag = new program_usage();
        host_usag = new host_usage();
        math_modeller = new usage_modeller(4);
    }
  38. 38.
    public static void main(String args[]) throws InterruptedException {
        normal_usage usage = new normal_usage();
        int min = 1;
        int time = min * 60 * 1000;
        LearnSysProcess p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13;

        p1 = new LearnSysProcess("learn session_usage", usage.u_bev.x1, time, 1, 0);
        p1.start();
        p2 = new LearnSysProcess("learn auth_usage", usage.u_bev.x2, time, 1, 0);
        p2.start();
        p5 = new LearnSysProcess("learn memory_usage", usage.memory_usag, time, 1, 0);
        p5.start();
        p6 = new LearnSysProcess("learn cpu_usage", usage.cpu_usag, time, 1, 0);
        p6.start();
        p10 = new LearnSysProcess("learn server", usage.net_usag.server_usag, time, 1, 0);
        p10.start();
        p11 = new LearnSysProcess("learn port_usage", usage.net_usag.port_usag, time, 1, 0);
        p11.start();
        p8 = new LearnSysProcess("learn program_usage", usage.program_usag, time, 1, 0);
        p8.start();

        p1.join();
        p2.join();
        p5.join();
        p6.join();
        p10.join();
        p11.join();
        p8.join();

        p3 = null;
        // check if processes p1 and p2 are complete then
        if (!p1.isAlive() && !p2.isAlive()) {
            p3 = new LearnSysProcess("learn user_behavior", usage.u_bev, time, 1, 0);
            p3.start();
        }

        p4 = null;
        p7 = null;
        // check if processes p1, p5, and p6 are complete then
        if (!p1.isAlive() && !p5.isAlive() && !p6.isAlive()) {
            usage.device_usag.battery_usag.set_memory_usage(usage.memory_usag);
            usage.device_usag.battery_usag.set_session_usage(usage.u_bev.x1);
            usage.device_usag.battery_usag.set_cpu_usage(usage.cpu_usag);
            p4 = new LearnSysProcess("learn battery_usag", usage.device_usag.battery_usag, time, 1, 0);
            p4.start();
            p4.join();
  39. 39.
            p7 = new LearnSysProcess("learn device_usag", usage.device_usag, time, 1, 0);
            p7.start();
        }
        p3.join();
        p7.join();

        // check if process p5 is complete then
        if (!p5.isAlive()) {
            usage.host_usag.set_memory_usage(usage.memory_usag);
        }
        // check if process p6 is complete then
        if (!p6.isAlive()) {
            usage.host_usag.set_cpu_usage(usage.cpu_usag);
        }
        // check if process p8 is complete then
        if (!p8.isAlive()) {
            usage.host_usag.set_program_usage(usage.program_usag);
        }
        // check if process p1 is complete then
        if (!p1.isAlive()) {
            usage.host_usag.set_session_usage(usage.u_bev.x1);
        }

        p9 = null;
        // check if processes p1, p5, p6, and p8 are complete then
        if (!p1.isAlive() && !p5.isAlive() && !p6.isAlive() && !p8.isAlive()) {
            usage.net_usag.set_host_usage(usage.host_usag);
            p9 = new LearnSysProcess("learnhost_usage", usage.net_usag.host_usag, time, 1, 0);
            p9.start();
            p9.join();
        }

        p12 = null;
        // check if processes p9, p10 and p11 are complete then
        if (!p9.isAlive() && !p10.isAlive() && !p11.isAlive()) {
            p12 = new LearnSysProcess("learn network_usage", usage.net_usag, time, 1, 0);
            p12.start();
            p12.join();
        }

        p13 = null;
        // check if processes p9, p10, p11, p1, p2 and p3 are complete then
        if (!p9.isAlive() && !p10.isAlive() && !p11.isAlive()
                && !p1.isAlive() && !p2.isAlive() && !p3.isAlive()) {
            p13 = new LearnSysProcess("learn system usage", usage, time, 1, 0);
            p13.start();
        }
  40. 40.
        double x_vals[] = new double[4];
        x_vals[0] = 9;
        x_vals[1] = 7;
        x_vals[2] = 54;
        x_vals[3] = 43;
        usage.math_modeller.sample_x_set(x_vals);
        usage.math_modeller.sample_y(87);
        usage.math_modeller.queue_sample();

        x_vals[0] = 9;
        x_vals[1] = 7;
        x_vals[2] = 54;
        x_vals[3] = 43;
        usage.math_modeller.sample_x_set(x_vals);
        usage.math_modeller.sample_y(87);
        usage.math_modeller.queue_sample();
        usage.math_modeller.end_modeller = true;

        usage_stats ustats = usage.math_modeller.get_usage_stats();
        x_set avg_xset = usage.math_modeller.average_x_set;
        x_vals = avg_xset.x_set;
        System.out.println("mean 1: " + x_vals[0]);
        System.out.println("mean 2: " + x_vals[1]);
        System.out.println("mean 3: " + x_vals[2]);
        System.out.println("mean 4: " + x_vals[3]);
    }

    public double computeval() {
        return 0;
    }

    public double findchange() {
        return 0;
    }

    public void learnsys(int t) {
        int timer = 1;
        while (t >= timer) {
            net_usag.learnsys(t);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            timer++;
            if (timer == t) {
                try {
                    Thread.yield();
  41. 41.
                } catch (Exception e) {
                }
            }
        }
    }

    public Object findrelationship() {
        return null;
    }

    public void monitor(int t) {
        int timer = 0;
        while (timer < t) {
            try {
                net_usag.monitor(t);
            } catch (Exception e) {
            }
            timer++; // advance inside the loop so that monitoring terminates
        }
    }

    public void showalarm(String info) {
        System.out.println(info);
    }

    public void haltprocess() {
    }

    public void predictvals() {
    }
}
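The staged learning in Appendix C, where a model is learned only after the models it depends on, could alternatively be expressed with CompletableFuture from the Java standard library. This is a sketch under assumptions, not the simulation's code: the class StagedLearningSketch is hypothetical, and each stage only records its name where the real code would call learnsys on the corresponding model.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: run learning stages in dependency order. */
class StagedLearningSketch {

    /** Runs all stages and returns the order in which they completed. */
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();

        // Independent stages start immediately and run concurrently.
        CompletableFuture<Void> session = CompletableFuture.runAsync(() -> log.add("session_usage"));
        CompletableFuture<Void> auth = CompletableFuture.runAsync(() -> log.add("auth_usage"));
        CompletableFuture<Void> memory = CompletableFuture.runAsync(() -> log.add("memory_usage"));
        CompletableFuture<Void> cpu = CompletableFuture.runAsync(() -> log.add("cpu_usage"));

        // user_behavior is learned only after session and auth usage.
        CompletableFuture<Void> user = CompletableFuture.allOf(session, auth)
                .thenRun(() -> log.add("user_behavior"));

        // battery_usage needs session, memory and cpu usage; device_usage follows it.
        CompletableFuture<Void> device = CompletableFuture.allOf(session, memory, cpu)
                .thenRun(() -> log.add("battery_usage"))
                .thenRun(() -> log.add("device_usage"));

        // Wait for every stage before returning the log.
        CompletableFuture.allOf(user, device).join();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Compared with calling join() and then checking isAlive() on each thread, this form states the dependencies directly, so a dependent stage cannot start before its inputs are complete.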
