Building a Usage Profile of a Computer Network System for
Anomaly Detection on the Computer Network and various
Peripherals on the Network
Nathanael Ato Asaam
Founder and CEO
Equicksales Consulting Ltd.
2019
Abstract
This paper investigates building usage profiles of a system using behavior models. Such
behavior models are at the heart of machine learning and evolutionary computing. Other
methods of building such usage profiles include statistical models such as time series
models, univariate models, and mean and standard deviation models. The aim of building these
usage profiles is to detect unusual behavior on the system. This paper uses regression
to determine the usage profile of a system by studying the relationship between the relevant system
variables used to formulate the usage profile. The dependent and independent
variables for the usage profile can be determined from an audit trail.
Additionally, the paper applies hidden Markov models to study the various states a
computer system can fall into and the various state transitions, in order to predict
unusual behavior in the system. Unusual behavior in this case may be a particular state, a
transition from one state to another, or the manner in which a particular state transition occurred.
The usage profile is composed of the usage profile equation, a mean and standard
deviation model that captures average usage and its standard deviation, and a Markov chain
model that captures the various states of the system and the state transitions. With this
profile it becomes possible to detect anomalies on the system. Using linear and nonlinear
programming, the usage profile equation can be maximized or minimized to determine states of
the system and points at which the system is optimal. This can help improve the system’s usage.
Also, using the differential coefficient of the usage profile equation and other statistical models
such as the mean and standard deviation model, a threat profile of the system can be developed.
When the threat profile equation is minimized using linear and nonlinear programming, it will
help prevent threats on the system. The benefit of this research is its application to the development
of anomaly threat detection systems and risk analysis systems that can be used for performing
computer security risk assessments and analysis.
The research model of this paper is Y = f(Xi, Ci), where Xi represents system variables
such as the number of application programs running or the number of system processes, and Ci
represents system constants such as the average number of processes. During this research, an
experiment was conducted into how to represent a computer system’s usage with an abstract
mathematical model. The experiment was conducted on desktops using micro usage models of a
network system.
Threat analysis and detection are also done using some special parameters of the usage
model. These parameters are constants in the system. Examples of these constants include the rates at
which the system’s usage increases or decreases with respect to certain variables in the system, and
the rate at which threat occurrence increases or decreases in the system with respect to the variables
that make up the usage model. The normal usage and threat models, on the other hand, are the
objective functions that are used for analyzing threats.
The techniques discussed in this paper make it possible to achieve correctness,
promptness and ease of use. The usage model function, with its associated average usage and
standard deviation, makes it possible to ensure correctness of the intrusion detection system. This is
because the statistical data sampled for the development of an intrusion detection system built
using these techniques can be used to formulate an acceptable usage range. There are two special
utilities that compose the usage model. They are essential for improving convenient usage and
preventing false alarms. They make the intrusion detection system correct and prompt at
preventing threats. These concepts are the basics for developing a security audit framework.
A simulation was run four times within a week. In the first instance it was run for
15 minutes, in the second for 30 minutes, in the third for 45 minutes, and in the last for
60 minutes. The results indicate that a threat detection system can be built using the
differential equation technique, the novel self-integrating data structure, and linear and
nonlinear programming concepts.
To make the intrusion detection system that this research model proposes detect
threats promptly, multithreading is applied to analyze, predict, detect and halt threats.
Multithreading is a programming concept that allows several threads of execution to run
concurrently within a program. This concept makes it possible to predict multiple threats, perform
multiple threat analyses, and halt or raise alarms on multiple threats on a computer system at the
same time.
Ease of use of a system for which the intrusion detection system is developed is achieved
using the mean and standard deviation model. Without that model, there is no acceptable range of
usage. That means that the average usage and its standard deviation prevent a rigid usage
model and as such make usage convenient.
Table of Contents
Abstract
Introduction
Background
Problem Definition
Research Questions
Objectives
Behavior Models
System Threats
Boolean Calculus
Micro Usage Models
Properties of Intrusion Detection Systems
Research Model and Methodology
Statistical and Machine Learning Models
Cognitive Based, User Intention Based and Computer Immunology Based Models
Literature Review
Intrusion Detection Systems
Behavior Encryption
Risk Analysis
Information Security Awareness and Practices
Protocol for Mitigating Risks on Social Networking Sites
Research Model and Methodology
Research Model
Methodology
Machine Learning Algorithms & Behavior Based Intrusion Systems
Audit Trail Analysis
Normal Usage Model
Threat Modelling
Boolean Calculus
Experimenting Usage and Threat Models
Computer Usage Survey
Threat Detection Systems
Threats Associated with Computer Systems
Attacks associated with a computer system
Malicious Code
IP Scan and Attack
Web Browsing
Virus
Unprotected Shares
Mass emails
Simple Network Management Protocol (SNMP)
Hoaxes
Backdoors
Password Crack
Brute Force
Dictionary
Denial of Service (DoS) and Distributed Denial of Service (DDoS)
Spoofing
Man in the Middle
Spam
Mail Bombing
Mathematical Modelling Techniques and Machine Learning Based Models
Simple Linear Regression
Multiple Linear Regression
Non Linear Regression
Machine Learning based models Used for Developing Anomaly Based
The Normal Usage Model of a System
Single Variable Calculus Review and its Applications
Authentication Usage Model
Session Usage Model
Memory Usage Model
CPU Usage Model
Program Usage Model
Host Usage Model
Battery Usage Model
Device Usage Model
Server Usage Model
Port Usage Model
Network Usage Model
Aggressive Usage Detector
False Alarm Detector
Threat Models in a System
Properties and Methods of the Novel Self Integrating Data Structure
Integration Review
Interpretation of Threat Model Integrals
Threat Analysis and Detection
Threat Prediction
Risk Analysis in a System
Normal Usage Model and Threat Model Simulation
Tools and Computer Packages
Conclusion and Discussion
References
Introduction
Background
Cyber security threats on computer networks have the potential to damage resources on
the computer network. Examples of such damage include corrupting data stored or transmitted
on the network, infecting a host on the network with a virus, impersonating a valid user on the
network, and preventing the proper functioning of application software on various hosts on the
network. The security of computer systems is essential to organizations. Computer
system security is usually provided by computer software that protects the computer system for
which it was developed. An intrusion detection system is one such software system.
Other computer systems that provide security are antivirus, firewall and risk analysis systems.
Also, periodic computer security audits enable threat detection and prevention on computer
networks.
There are two types of intrusion detection systems. These are knowledge based intrusion
detection systems, also known as signature based intrusion detection systems, and behavior based
intrusion detection systems, also known as anomaly intrusion detection systems. Behavior based
intrusion detection systems detect and prevent intrusions based on deviations from an observed
behavior pattern of the computer system for which the intrusion detection system has been built.
These deviations represent threats on the system. Knowledge based intrusion detection systems
detect intrusions by mapping system occurrences to a database of known threats. The
entries in the database of known threats are known as threat signatures. Intrusion detection
systems are also known as threat detection systems.
The first goal of this paper is to investigate techniques for representing a computer system’s
normal usage with a mathematical abstract model. The mathematical abstract model is known in
this paper as a normal usage model. It is hoped that the normal usage model will aid in analyzing
activities and occurrences on a computer system that deviate from the system’s normal usage.
This will help in detecting and preventing threats on the system. The second goal of this paper is to
examine anomaly detection by analyzing changes in a system that deviate from the system’s
normal usage.
This research paper also investigates how to build a usage profile
that can be used to determine anomalous activities on a computer system. The paper proposes a
research model made up of a dependent variable and one or more independent variables that can
be used for modelling the usage of a computer system. This research model is a regression based
model. As such, simple linear or multiple linear regression can be used to develop the model. The
research paper also uses a statistical model known as the mean and standard deviation model. The
mean and standard deviation model captures the average usage of the system and its associated
standard deviation. Finally, the paper uses a Markov chain model to model the various states in a
computer system, their associated probabilities and the various state transitions. These three
different models are behavior models and together form the usage profile that this paper proposes.
Also, the paper uses a Java interface for implementing the usage model that describes a component
of a computer system whose usage can be modelled using simple or multiple linear regression.
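The regression part of the usage profile can be illustrated with a minimal sketch. It is written in Python for brevity rather than as the Java interface the paper refers to, and it assumes a single independent variable (number of running processes) and a dependent variable (CPU usage in percent). The sample values and the names fit_simple_linear and predict_usage are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical audit-trail samples (assumed values, not measured data):
# number of running processes versus CPU usage in percent.
processes = [10, 20, 30, 40, 50]
cpu_usage = [15.0, 25.0, 35.0, 45.0, 55.0]

slope, intercept = fit_simple_linear(processes, cpu_usage)


def predict_usage(x):
    """Expected usage for x running processes under the fitted model."""
    return slope * x + intercept
```

In this sketch the fitted slope plays the role of a rate at which usage changes with respect to the chosen variable, and the intercept plays the role of a system constant. A multiple linear regression would extend the same idea to several independent variables.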
Problem Definition
If the normal functioning of a computer system can be represented by an abstract model, then any
deviation from that abstract model can be used to analyze and detect threats in that system.
The main problems this paper seeks to investigate are listed below.
• To represent the normal usage of a computer network with a mathematical abstract
model.
• To investigate techniques for building a usage profile of a computer network.
• To determine activities and occurrences that are deviations from a system’s
normal usage and flag them as anomalous activities.
• To develop an anomaly intrusion detection system.
• To develop a risk analysis system.
• To develop a security audit framework made of an anomaly intrusion detection
system and a risk analysis system.
• To draft a document that will detail the operation and administration of the security
audit framework.
In this paper, the abstract model of the system’s usage is known as a normal usage model and the
deviations from the system’s normal usage are known as threats.
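The idea of flagging deviations from normal usage can be sketched with the mean and standard deviation model. The sketch below assumes that an acceptable usage range is the mean plus or minus k standard deviations. The value k = 2, the baseline samples and the function names are assumptions made for illustration, not values taken from the paper.

```python
from statistics import mean, stdev


def usage_range(samples, k=2.0):
    """Acceptable usage range: mean plus or minus k sample standard deviations."""
    mu = mean(samples)
    sigma = stdev(samples)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(value, samples, k=2.0):
    """True when the observed value falls outside the acceptable range."""
    low, high = usage_range(samples, k)
    return value < low or value > high


# Hypothetical baseline of memory usage readings in percent (assumed values).
baseline = [40.0, 42.0, 38.0, 41.0, 39.0]

normal_reading = is_anomalous(41.0, baseline)   # inside the range
suspect_reading = is_anomalous(75.0, baseline)  # far outside the range
```

A wider range (larger k) makes usage more convenient but lets more deviations pass, while a narrower range is stricter but is likely to raise more false alarms.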
Research Questions
The main questions to be investigated are listed below.
• What are the best and most efficient techniques for modelling a computer network’s
normal usage?
• How can we build a usage profile of a computer network that will be adequate for
detecting anomalous activities on the network?
• What are the best techniques for designing and implementing an anomaly intrusion
detection system?
• What are the best techniques for designing and implementing a risk analysis system?
• What are the best techniques for design and implementation of a security audit
framework?
• What are the procedures and processes that must be followed in the operation and
administration of a security audit framework?
Objectives
The main objectives of this paper are as follows.
• Representing a computer network’s normal functioning with an abstract
model
• Building a usage profile of a computer network.
• Detecting activities and occurrences that deviate from the normal usage of
a computer network and flagging these activities and occurrences as anomalous
activities on a computer network.
• Design and implementation of an Anomaly Intrusion Detection System.
• Design and implementation of a Risk Analysis System.
• Design and implementation of a Security Audit Framework.
• Drafting a document that details the procedures, processes and guidelines that
must be followed in the operation and administration of a security audit
framework.
Behavior Models
It is hoped that the abstract representation of a system’s normal usage will capture the entire
behavior of the system. Such models are known as behavior models. As such, the threat detection
system this paper seeks to explore is expected to be a behavior based threat detection system.
Examples of behavior models that this paper seeks to explore are statistical models, cognitive
based models, machine learning based models, user intention based models, and computer
immunology based models. These models are associated with the development of anomaly-based
intrusion detection systems.
System Threats
Our intended threat analysis and detection is expected to produce three types of system logs:
system errors, system threats and usage rates, all categorized based on the magnitude
and characteristics of an instance of the threat model.
These logs must therefore be audited by a security expert, who analyzes changes in the computer
system that fit or deviate from the current usage model in order to project a more appropriate
instance of the usage model that will remain functional and suitable in the future.
Boolean Calculus
It is expected that, using Boolean algebra and the calculus of Boolean functions, the normal usage
model can be given a hardware representation. How to implement this hardware
representation can be researched using Boolean algebra and the calculus of Boolean functions. These
concepts are related to concepts from computer organization and architecture such as logic gates,
multipliers and the design of arithmetic and logic units, and to concepts from embedded systems such as
the architectures of various embedded system implementations. These architectures include hardware
only implementations and hardware/software implementations.
Micro Usage Models
Micro usage models are sub models of our normal usage model. They are modelled using the same
research model. Examples of micro usage models that this paper explores are the Device Usage Model,
Host Usage Model, Server Usage Model, Authentication Usage Model, Session Usage Model,
CPU Usage Model, Memory Usage Model, Port Usage Model and Network Usage Model. These
micro usage models are expected to derive their mathematical representation from variables
sampled from an audit trail analysis. They are also expected to be components of
a usage profile developed for a computer network.
Properties of Intrusion Detection Systems
There are special properties of intrusion detection systems that make them effective and efficient
at detecting and preventing threats. Examples of these properties are correctness, promptness and
ease of use. Correctness means how well the intrusion detection system can detect threats. This is
important because correctness affects the rate at which a predicted threat is false or true.
Promptness is related to the time it takes to detect and halt a threat, and ease of use is related to
how well the intrusion detection system supports convenient use of the computer network or
system for which it was developed.
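Correctness, understood as the rate at which predicted threats turn out to be true or false, can be quantified once predictions are compared against known outcomes. The following is a minimal sketch assuming two commonly used measures, the detection rate and the false alarm rate. The function name and the sample labels are hypothetical.

```python
def detection_metrics(predicted, actual):
    """Return (detection_rate, false_alarm_rate) for boolean threat labels.

    predicted[i] is True when the detector flagged event i as a threat;
    actual[i] is True when event i really was a threat.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    # Guard against division by zero when a class is absent from the sample.
    threats = true_pos + false_neg
    benign = false_pos + true_neg
    detection_rate = true_pos / threats if threats else 0.0
    false_alarm_rate = false_pos / benign if benign else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical evaluation sample (assumed labels, not experimental results).
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
detection_rate, false_alarm_rate = detection_metrics(predicted, actual)
```

A detector with high correctness in this sense has a detection rate close to 1 and a false alarm rate close to 0.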
Research Model and Methodology
The research model of this paper investigates threat detection using application of Calculus,
Boolean algebra, Machine learning and Statistical models. These fields of study are mainly related
to Discrete Mathematics, Computer Science, Operational Research, Linear and Non Linear
Programming, Regression Analysis and Data Mining.
The research model is inspired by linear and non linear regression. The methodology for threat
detection is inspired by linear and non linear programming and calculus. Some Computer Science
fields that inspire the threat detection parts of this research are multithreading, architectures of
embedded system design and implementation, and concepts from computer organization and
architecture such as the implementation of arithmetic and logic units.
Statistical and Machine Learning Models
Statistical models are mathematical models that can be used in the development of intrusion
detection systems. These models have different types. Machine learning techniques are also used
to build intrusion detection systems. These techniques have special models or structures that aid
development of intrusion detection systems. Examples of statistical models are mean and standard
deviation models, univariate models, and time series models. Machine learning models include
Neural networks, Bayesian networks, Hidden Markov Models and Genetic algorithms.
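To make the state-based models more concrete, the sketch below estimates the transition probabilities of a plain first-order Markov chain from an observed sequence of system states and flags rarely seen transitions. It is a simplification and not a hidden Markov model, since the states are assumed to be directly observable. The state names, the 0.1 threshold and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_transitions(states):
    """Estimate P(next | current) from consecutive pairs in a state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, nexts in counts.items():
        total = sum(nexts.values())
        probs[current] = {nxt: n / total for nxt, n in nexts.items()}
    return probs


def is_unusual_transition(probs, current, nxt, threshold=0.1):
    """True when the transition was observed rarely or never in training."""
    return probs.get(current, {}).get(nxt, 0.0) < threshold


# Hypothetical observed state sequence of a host (assumed values).
observed = ["idle", "active", "idle", "active", "busy", "active", "idle"]
model = estimate_transitions(observed)
```

With this training sequence, a direct jump from "idle" to "busy" was never observed, so is_unusual_transition(model, "idle", "busy") reports it as unusual, while "active" to "idle" is treated as normal.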
Cognitive Based, User Intention Based and Computer Immunology Based Models
Besides the statistical and machine learning based models that can be used for developing anomaly
based intrusion detection systems, there are cognitive based models that are used to develop
anomaly intrusion detection systems.
Literature Review
This section reviews major topics that constitute this research paper and work done in some of
these areas. The topics considered include intrusion detection
systems, since any discussion or study of threats and the detection of their sources is centered on
intrusion detection systems. Behavior encryption is another computer security field that will be
discussed in detail, since it adds much value to the information hiding parts of this research. Risk
analysis will also be reviewed to sum up what constitutes risk analysis. Finally, there will be a
review of Normal Usage Models.
Intrusion Detection Systems
Basically, there are two types of intrusion detection systems in the industry based on the approach
used for threat detection and the technologies used to build the system. These are knowledge based
also known as signature based and behavior based intrusion detection systems. Each takes a
different approach to threat detection and each uses different technology for building the intrusion
detection systems. Also, every single one has its pros and cons.
Knowledge based intrusion detection systems are built on a database of already known
threats. These known vulnerabilities or threats are called threat signatures. Usually, detection is
done by directly mapping system incidents that indicate threats to threat signatures. As
a result, the database of threats must be constantly updated with newly identified threats. Because a
new threat cannot be detected until it has been included in the database, the correctness of threat
detection is sometimes compromised: threats that do not have corresponding signatures cannot be
mapped and detected. However, these types of intrusion detection systems have lower false alarm
rates, since each detected threat is registered in the database of threat signatures.
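The direct mapping of incidents to threat signatures can be sketched as a simple lookup. The signature names, their descriptions and the function name below are hypothetical. A real signature database would hold far richer patterns than plain event labels.

```python
# Hypothetical signature database: event label -> threat description.
THREAT_SIGNATURES = {
    "failed_login_burst": "possible password cracking",
    "port_sweep": "possible IP scan",
}


def match_signatures(events, signatures=THREAT_SIGNATURES):
    """Return (event, description) pairs for events with a known signature."""
    return [(event, signatures[event]) for event in events if event in signatures]


# An event without a signature passes undetected, which illustrates why
# the database must be kept up to date.
alerts = match_signatures(["port_sweep", "unknown_event"])
```

The lookup never raises an alert for an event that is absent from the database, which mirrors both the low false alarm rate and the blind spot for new threats described above.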
Behavior based intrusion detection systems take a different approach to threat detection.
They are built using artificial intelligence technologies. Usually, the behavior of the system for
which the intrusion detection system is built is modelled, and deviations from that behavior are
used to detect threats. Because of this, they are more correct at detecting
threats, including threats with no known signature; no threat signatures or mappings of incidents
that indicate threats are required. However, they have higher false alarm rates because detected
threats are not mapped to a database of known threats.
Besides these, intrusion detection systems are classified based on the purposes for which they
are built and on how actively or passively they deal with threats. By purpose, there are host based
and network based intrusion detection systems. Active intrusion detection
systems are configured to block or prevent attacks, while passive intrusion detection systems are
configured to monitor, detect and raise alerts on threats.
Anomaly Detection Systems
According to a research paper entitled “Design and Implementation of Anomaly Detection
System”, there are global variables of a network that can be used for detecting anomalous activities
on a network. The paper used a hybrid of signature based and anomaly intrusion detection to detect
anomalies. According to the paper, some of the techniques used for detecting intrusions include using
generic network rules to detect network anomalies. The paper also used dynamic network knowledge
such as network statistics to detect anomalous activities.
Behavior Encryption
Behavior algorithms are applied to safeguard information on computing devices such as mobile
phones and laptops. These algorithms are the basis for building systems that study and encrypt
user behavior on a computing device in order to ensure the security of information on the
computing device. A study into mobile platform security reports that behavior encryption
application systems have been designed and built, focusing on mobile platforms. Results from this
study indicated that encryption application systems are effective in ensuring mobile platform
security.
In addition, it must be noted that, since mobile devices can be secured through
behavior encryption systems, the behavior of hosts on a network or of network systems can also
be encrypted to ensure safe communication, since each host or user on a system or network has a
particular behavior pattern.
Cryptographic study into encrypting the normal usage model can fall under behavior encryption
since the usage model represents a system’s behavior and can be composed of a user’s behavior.
This can aid in securing the information that embodies the usage model. It is also necessary because
if the usage model can easily be predicted then it is possible to manipulate the usage model and
launch an attack.
Risk Analysis
Computer risk analysis is also called risk assessment. It involves the process of analyzing and
interpreting risk. To analyze risk, the scope and methodology have to be determined first. Then
information is collected and analyzed before the risk analysis results are interpreted. Determining the
scope means identifying the system to be analyzed for risk and the parts of the system
that will be considered. Also, the analytical method that will be used, with its level of detail and
formality, must be planned. The boundary, scope and methodology used during risk assessment determine
the total amount of work effort that is needed in the risk management, and the type and usefulness
of the assessment’s results.
Risk has many components, including assets, threats, likelihood of threat occurrence,
vulnerability, safeguards and consequences. Risk management includes risk acceptance, which takes
place after several risk analyses. Normally, after risk has been analyzed and safeguards
implemented, the remaining or residual risk that must be tolerated to keep the system functional
must be accepted by management. This may be due to constraints on the system such as ease of use, or
features of the system for which strict safeguards would cause the organization operational problems.
As such, risk acceptance, like the selection of safeguards, should take into account various factors
besides those addressed in the risk assessment. In addition, risk acceptance should take into
account the limitations of the risk assessment.
Information Security Awareness and Practices
A paper entitled “A study of information security awareness and practices in Saudi Arabia”
discusses information security awareness and practices in that country. The paper emphasizes
that information is under constant threat from cyber vandals. According to the paper, Saudi
Arabia is rated poorly in terms of information security, which it attributes to the country's
highly suppressed, patriarchal and tribal culture.
The paper examined the level of information security awareness among the general public
in the country using an anonymous online survey based on instruments the Malaysian Security
Organization produced. In all, 633 persons responded to the survey, and the analysis confirmed that
information security awareness is low in the country. The paper relates this mostly to the
same cultural factors.
Protocol for Mitigating Risks on Social Networking Sites
According to an academic paper entitled “Protocol for mitigating the risk of hijacking social
networking sites”, hackers can hijack a user’s session on a social networking site, impersonate the
victim and take over the session.
The paper addresses this risk by presenting a security authentication protocol. The protocol
takes into account that users of social networking sites connect to the sites
using several platforms and connection speeds. To cater for mobile devices and tablets using Wi-Fi
connections, a novel Self-Configuring Repeatable Hash Chains (SCRHC) protocol was developed
to prevent the hijacking of session cookies. This protocol supports three levels of caching, making
it possible to trade storage space for enhanced performance and reduced workload.
Behavior/Anomaly Based Intrusion Detection
Behavior models are used to detect intrusion in computer systems. This section reviews the behavior
models that can be used to build behavior based intrusion detection systems. These models fall
into several categories: statistical models, machine learning based techniques,
cognitive models, computer immunology and user intention. Statistical models include the operational
or threshold metric model, the Markov process or marker model, the multivariate model, the statistical
moments model, time series models and univariate models. Machine learning based models include
Bayesian networks, genetic algorithms, neural networks, fuzzy logic and outlier detection. Cognitive
models include finite state machines, description scripts and expert systems.
Research Model and Methodology
Research Model
Assume that the normal usage (Y) of a computer network can be represented by a mathematical
function
Y = f(Xi, Ci), where Xi represents system variables such as the number of functions or the number
of authentications, and Ci represents system constants such as the maximum or minimum number of
authentications. When a change in Y exceeds the standard deviation determined from the usage data
set, that change indicates a threat. To investigate this threat, machine learning
algorithms, mathematical functions and behavior based intrusion detection systems will be studied
to determine Y in terms of a number of variables that represent Y appropriately. The expected
usage model of the network to be investigated includes the following components: Host Usage
Model, Server Usage Model, Device Usage Model, Port Usage Model, Network Usage Model,
Session Usage Model, Authentication Usage Model, Memory Usage Model, CPU Usage Model,
Battery Usage Model and Program Usage Model. These components are expected to be derived
from the variables listed below.
• Average number of application software that run on the mobile system while using the
system
• Average number of system processes that run on the mobile system while using the
system.
• Average number of authentications in the mobile system.
• Average number of user actions that happen on the mobile system.
• Average time a user spends before his session expires.
• Average time the mobile facility or resource functions each day.
• Number of paired ports communicating on the network
• Average amount of memory space used on devices while the network is being operated.
• Average CPU time spent on a single device on the network
• Average life span of a single device battery on the network.
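The threshold rule stated in the research model can be illustrated with a minimal Java sketch. It assumes that "a change in Y beyond the standard deviation" means the distance of a new observation from the mean of previously sampled usage values; the class name UsageThreshold, the multiplier k and the sample data are hypothetical and are not part of the paper's model.java interface.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of model.java.
class UsageThreshold {
    // Arithmetic mean of the observed usage values.
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(0.0);
    }

    // Population standard deviation of the observed usage values.
    static double stdDev(double[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double m = mean(samples);
        double sumSq = 0.0;
        for (double s : samples) {
            sumSq += (s - m) * (s - m);
        }
        return Math.sqrt(sumSq / samples.length);
    }

    // Flags an observation whose distance from the baseline mean exceeds
    // k standard deviations (k = 1 corresponds to the rule in the text).
    static boolean isAnomalous(double[] baseline, double observed, double k) {
        return Math.abs(observed - mean(baseline)) > k * stdDev(baseline);
    }

    public static void main(String[] args) {
        // Made-up data: number of authentications per hour.
        double[] authsPerHour = {10, 12, 11, 9, 13};
        System.out.println(isAnomalous(authsPerHour, 11, 1.0)); // false
        System.out.println(isAnomalous(authsPerHour, 40, 1.0)); // true
    }
}
```

In practice the multiplier k would be tuned per variable, since a strict one-standard-deviation rule flags a large share of ordinary observations.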
USAGE MODEL: JAVA INTERFACE THAT IMPLEMENTS THE RESEARCH MODEL
For each component of a computer system under investigation, we will program a usage model,
which is an implementation of the research model for that component. Each usage model
implements an interface defined in a Java file called model.java.
There are eight functions in the model.java interface. The first is computeval, which
computes the usage value at an instance. The second is findchange, which finds
changes in the usage of the computer system. The third is learnsys, which learns the
usage of the system. The fourth is findrelationship, which finds the regression equation.
The fifth is monitor, which monitors the usage of the system. The sixth is
showalarm, which displays error messages and detected intrusions. The seventh is
haltprocess, which halts a detected intrusion, and the eighth is predictvals, which
predicts usage values based on the regression equation determined. A non-abstract class that
omits an implementation of any of these functions will fail to compile. To implement the usage
model, use the Java keyword implements. Below is the model.java file.
USAGE MODEL FILE
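public interface model {
    // Computes the usage value at an instance.
    public double computeval();
    // Finds changes in the usage of the computer system.
    public double findchange();
    // Learns the usage of the system.
    public void learnsys(int t);
    // Finds the regression equation.
    public Object findrelationship();
    // Monitors the usage of the system.
    public void monitor(int t);
    // Displays error messages and detected intrusions.
    public void showalarm(String info);
    // Halts a detected intrusion.
    public void haltprocess();
    // Predicts usage values from the regression equation.
    public void predictvals();
}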
IMPLEMENTING THE USAGE MODEL FOR AN AUTHENTICATION SYSTEM
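class auth_usage implements model {
    /* variable declarations for dependent and independent variables */

    public double computeval() {
        return 0.0; // placeholder: compute the usage value here
    }
    public double findchange() {
        return 0.0; // placeholder: compute the change in usage here
    }
    public void learnsys(int t) {
    }
    public Object findrelationship() {
        return null; // placeholder: return the regression equation here
    }
    public void monitor(int t) {
    }
    public void showalarm(String info) {
    }
    public void haltprocess() {
    }
    public void predictvals() {
    }
}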
Methodology
The list below details the activities or processes that will be followed to represent a computer
system with an abstract mathematical model and analyze changes in that system. It is hoped that
following these processes will lead to the design and implementation of a normal usage model, a
threat detection system and a mobile security audit framework.
• Machine Learning Algorithms & Behavior Based Intrusion Systems: Investigate machine
learning algorithms, mathematical functions, and behavior based intrusion detection
systems in order to determine the extent to which the normal usage of a mobile system can
be represented by the research model.
• Audit trails: Analyze audit trails in order to formulate a set of independent and dependent
variables and their associated data set that will help in modelling the usage model of a
mobile system.
• Normal Usage Model: Apply the knowledge gained from the machine learning algorithms
and behavior based intrusion detection systems study and the audit trails analysis to model
and represent the normal usage of a mobile system such as a smart phone, laptop or wireless
network.
• Threat Modelling: Study differential equations of the normal usage model and its
applications in order to model, detect and prevent threats.
• Boolean Calculus: Apply Boolean algebra and the calculus of Boolean functions to design and
implement the hardware and software that make up the Normal Usage and Threat Detection
Systems.
• Use programming as a tool to experiment with representations of the normal usage and threat
models to aid the design and implementation of a mobile security audit framework.
• Employ a questionnaire to collect information about the usage of computers and mobile
phones.
• Threat Detection Systems: Develop an anomaly based threat detection system to
demonstrate the effectiveness of the research model. The goal is to measure the
effectiveness of the threat detection system developed, at preventing threats on a computer
system.
Machine Learning Algorithms & Behavior Based Intrusion Systems
Machine learning techniques and algorithms will be investigated to determine the extent to which an
expert system that learns a computer system’s usage can be built. Since the expected usage model
is a mathematical model, various mathematical modelling techniques will be applied to
determine the normal usage model.
Analyzing deviations from these mathematical models can lead to the design and
implementation of behavior based intrusion detection systems. As such, a thorough study into
the design and implementation of behavior based intrusion detection systems will be done.
Audit Trail Analysis
It is expected that computer security audit reports will be sampled and analyzed to arrive at a set
of dependent and independent variables and their data set. These variables and their associated
data set can be used to formulate the normal usage model.
Normal Usage Model
An investigation into applying the knowledge gained from the machine learning study, the
mathematical modelling study, the behavior based intrusion detection system study and the audit
trail analysis will be done. It is hoped that this will answer the question of how to represent the
normal functioning of a computer system with an abstract mathematical model.
Threat Modelling
Differential equations of the normal usage model will be investigated to know the extent to which
deviations from the normal usage models can be analyzed. An abstract mathematical model of
these deviations will be formulated. These abstract models are derivatives of the normal usage
model.
Boolean Calculus
A study into representing the normal usage model with a Boolean function will be done. It is hoped
that analyzing these Boolean functions will aid in building hardware that implements the expected
usage system. Differential equations of these Boolean functions will be studied to analyze changes
in the system that indicate deviation from the normal usage model.
Experimenting Usage and Threat Models
Programming will be used as a tool to experiment with various usage and threat models. These usage
and threat models are expected to be derived from a computer system. This experiment will lead
to the design and implementation of a normal usage system, a threat detection system and a risk
analysis system. These systems are expected to be components of a mobile security audit
framework.
Computer Usage Survey
A questionnaire for obtaining information about computer and smart phone usage will be
employed. It is expected that this will give an idea of the various statistics that make up a computer
or smart phone’s usage. These statistics will be a guideline for sampling experimental data on a
computer system’s usage when experimenting with the usage and threat models.
Threat Detection Systems
It is hoped that an anomaly based threat detection system will be developed to demonstrate the
effectiveness of the research model in modelling system usage and threats. The
effectiveness of the developed threat detection system at preventing threats on a computer system
will also be measured. In this project, the threat detection system to be developed is for
ecommerce sites.
Threats Associated with Computer Systems
This chapter discusses some of the threats and attacks associated with computer and network
systems.
Attacks associated with a computer system
The attack types that will be discussed include Malicious Code, IP Scan and Attack, Web
Browsing, Virus, Unprotected Shares, Mass emails, Simple Network Management Protocol (SNMP),
Hoaxes, Backdoors, Password Crack, Brute Force, Dictionary, Denial of Service (DoS) and
Distributed Denial of Service (DDoS), Spoofing, Man in the Middle, Spam, Mail Bombing, Sniffers,
Social Engineering, Buffer Overflow and Timing Attack.
Malicious Code
Malicious code attacks include the execution of viruses, worms, Trojan horses and active Web
scripts with the intent to destroy or steal information. The state of the art in malicious code is the
polymorphic or multivector worm. These attack programs use up to six attack vectors to exploit a
variety of vulnerabilities in commonly found information system devices. Perhaps the best
illustration of such an attack remains the outbreak of Nimda in September 2001, which used five of
the six vectors to spread with startling speed. TruSecure Corporation, an industry source for
information security statistics and solutions, reports that Nimda spread to span the Internet
address space of 14 countries in less than 25 minutes.
IP Scan and Attack
The infested system scans a random or local range of IP addresses and targets any of several
vulnerabilities known to hackers or left over from previous exploits such as Code Red, Back
Orifice, or PoizonBox.
Web Browsing
If the infested system has write access to any Web pages, it makes all the Web content files (.html,
.asp, .cgi and others) infectious, so that users who browse to those pages become infected.
Virus
Each infested machine infects certain common executable or script files on all computers to which
it can write with virus code that can cause infection.
Unprotected Shares
Using vulnerabilities in file systems and the way many organizations configure them, the infested
machine copies the viral components to all locations it can reach.
Mass emails
By sending email infections to addresses found in the address book, the infected machine infects
many users, whose mail-reading programs also automatically run the program and infect other
systems.
Simple Network Management Protocol (SNMP)
By using the widely known and common passwords that were employed in the early versions of
the protocol (which is used for remote management of network and computer devices), the
attacking program can gain control of the device. Most vendors have closed these vulnerabilities
with software upgrades.
Hoaxes
A more devious approach to attacking computer systems is the transmission of a virus hoax with
a real virus attached. When the attack is masked in a seemingly legitimate message, unsuspecting
users readily distribute it. Even though those users are trying to do the right thing to avoid
infection, they end up sending the attack on to their coworkers and friends and infecting many
users along the way.
Backdoors
Using a known or previously unknown and newly discovered access mechanism, an attacker can
gain access into a system or network resource through a back door. Sometimes, these entries are
left behind by system designers or maintenance staff and thus referred to as trap doors. A trap door
is hard to detect, because, very often the programmer who puts it in place also makes the access
exempt from the usual audit logging features of the system.
Password Crack
Attempting to reverse-calculate a password is often called cracking. A cracking attack is a
component of many dictionary attacks. It is used when a copy of the Security Account Manager
(SAM) data file can be obtained. The SAM file contains the hashed representation of the user’s
password. A candidate password can be hashed using the same algorithm and compared to the stored
hash. If they are the same, the password has been cracked.
Brute Force
The application of computing and network resources to try every possible combination of options
of a password is called a brute force attack. Since this is often an attempt to repeatedly guess
passwords to commonly used accounts, it is sometimes called a password attack. If attackers can
narrow the field of accounts to be attacked, they can devote more time and resources to attacking
fewer accounts. That is one reason a recommended practice is to change account names for
common accounts from the manufacturer’s default. While often effective against low-security
systems, password attacks are often not useful against systems that have adopted the usual security
practices recommended by manufacturers.
Dictionary
This is another form of brute force attack. The dictionary attack narrows the field by selecting
specific accounts to attack and uses a list of commonly used passwords (the dictionary) instead of
random combinations. Organizations can use similar dictionaries to disallow passwords during
the reset process and thus guard against easy-to-guess passwords. In addition, rules requiring
additional numbers and/or special characters make the dictionary attack less effective.
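The defensive use of a dictionary during password reset can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class name PasswordPolicy, the sample dictionary and the exact rule (reject dictionary words, require at least one digit or special character) are hypothetical choices, not a policy prescribed by the source.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical reset-time check for illustration only.
class PasswordPolicy {
    // Rejects a candidate that appears in the dictionary (case-insensitive)
    // or that contains neither a digit nor a special character.
    static boolean isAcceptable(String candidate, Set<String> dictionary) {
        if (dictionary.contains(candidate.toLowerCase(Locale.ROOT))) {
            return false;
        }
        boolean hasDigit = false;
        boolean hasSpecial = false;
        for (char c : candidate.toCharArray()) {
            if (Character.isDigit(c)) {
                hasDigit = true;
            } else if (!Character.isLetter(c)) {
                hasSpecial = true;
            }
        }
        return hasDigit || hasSpecial;
    }

    public static void main(String[] args) {
        // Made-up dictionary of commonly used passwords (stored in lower case).
        Set<String> dictionary = Set.of("password", "letmein", "qwerty");
        System.out.println(isAcceptable("Password", dictionary));  // false
        System.out.println(isAcceptable("blue-sky7", dictionary)); // true
    }
}
```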
Denial of Service (DoS) and Distributed Denial of Service (DDoS)
In a denial of service attack, the attacker sends a large number of connection or information
requests to a target. So many requests are made that the target system cannot handle them
successfully along with legitimate requests for service. This may result in the system crashing or
simply becoming unable to perform ordinary functions. A distributed denial of service is an attack
in which a coordinated stream of requests is launched against a target from many locations at the
same time. Most DDoS attacks are preceded by a preparation phase in which many systems, perhaps
thousands, are compromised. The compromised machines are turned into zombies, machines that
are directed remotely (usually by a transmitted command) by the attacker to participate in the
attack. DDoS attacks are the most difficult to defend against and there are presently no controls
that any single organization can apply. There are, however, some cooperative efforts to enable
DDoS defenses among groups of service providers; among them is the Consensus Roadmap for
Defeating Distributed Denial of Service Attacks.
Spoofing
Spoofing is a technique used to gain unauthorized access to computers, wherein the intruder sends
messages to a computer with a source IP address indicating that the messages are coming from a
trusted host. To engage in IP spoofing, a hacker must first use a variety of techniques to find the
IP address of a trusted host and then modify the packet headers so that it appears that the packets
are coming from that host. Newer routers and firewall arrangements can offer protection against
IP spoofing.
Man in the Middle
In the well-known man-in-the-middle or TCP hijacking attack, an attacker monitors (or sniffs)
packets from the network, modifies them and inserts them back into the network. This type of
attack uses IP spoofing to enable an attacker to impersonate another entity on the network. It
allows the attacker to eavesdrop as well as to change, delete, reroute, add, forge, or divert data. In
a variant of the TCP hijacking session, the spoofing involves the interception of an encryption
key exchange, which enables the hacker to act as an invisible man-in-the-middle, that is, an
eavesdropper, with regard to encrypted communications.
Spam
Spam is unsolicited commercial email. While many consider spam a trivial nuisance rather than
an attack, it has been used as a means to make malicious code attacks more effective. In March
2002, reports emerged of malicious code embedded in MP3 files that were included as attachments
to spam. The most significant consequence of spam for the modern organization, however, is the
waste of both computer and human resources caused by the flow of unwanted electronic mail.
Many organizations attempt to cope with the flood of spam by using filtering technologies to stem
the flow. Other organizations tell the users of the mail system to delete unwanted messages.
Mail Bombing
Another form of e-mail attack that is also a DoS is called a mail bomb, in which an attacker routes
large quantities of e-mail to the target. This can be accomplished through social engineering or
by exploiting various technical flaws in the Simple Mail Transfer Protocol (SMTP). The target of the
attack receives unmanageably large volumes of unsolicited e-mail. By sending large e-mails with
forged header information, attackers can take advantage of poorly configured e-mail systems on
the Internet.
Mathematical Modelling Techniques and Machine Learning Based Models
The mathematical relation that represents the normal usage model can be determined using
regression analysis. Regression analysis is a field of statistics. It employs the least squares method
to determine the relationship between the variables in a data set composed of two or more variables.
The least squares method determines the relationship by minimizing the sum of the squared errors
of the derived relation.
Simple Linear Regression
Simple linear regression problems involve a dependent variable and a single independent variable.
The goal is to find a linear relationship between the two variables. The linear relationships are of
the form y = b0 + b1x, where y is the dependent variable and x is the independent variable. The slope
of the line is b1 and the y-intercept is b0. The relationship between the dependent and independent
variable can be derived using the least squares method. First, the sums of the dependent and the
independent variables, and the sum of the products of the dependent and the independent variables,
must be calculated. Secondly, the sums of the squares of the dependent and the independent
variables must be calculated.
In the formulas below, n is the sample size, Σx and Σy are the sums of the independent and
dependent variables, Σxy is the sum of their products, and Σx² and Σy² are the sums of their squares.
The constant that represents the slope of the fitted line is calculated as the product of the
sample size and the sum product of the dependent and independent variables, minus the product of
the sums of the dependent and the independent variables, all divided by the product of the sample
size and the sum of the squares of the independent variable minus the square of the sum of the
independent variable:
b1 = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)
The constant that represents the y-intercept of the line is calculated as the product of the sum
of the dependent variable and the sum of the squares of the independent variable, minus the product
of the sum of the independent variable and the sum product of the dependent and independent
variables, all divided by the same denominator:
b0 = (ΣyΣx² − ΣxΣxy) / (nΣx² − (Σx)²)
Finally, the correlation coefficient of the predictive relation is calculated as the product of the
sample size and the sum product of the dependent and independent variables, minus the product of
the sums of the dependent and independent variables, all divided by the square root of the product
of two terms: the sample size times the sum of the squares of the independent variable minus the
square of the sum of the independent variable, and the sample size times the sum of the squares of
the dependent variable minus the square of the sum of the dependent variable:
r = (nΣxy − ΣxΣy) / √[(nΣx² − (Σx)²)(nΣy² − (Σy)²)]
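The slope, intercept and correlation coefficient described above can be computed in a single pass over the data. The following Java sketch is an illustration only; the class name SimpleRegression and the sample data are hypothetical, and the method is not claimed to be the findrelationship implementation used in the paper.

```java
// Hypothetical helper for illustration only.
class SimpleRegression {
    // Fits y = b0 + b1*x by least squares and returns {b0, b1, r}.
    // r is NaN when all y values are identical.
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired samples");
        }
        int n = x.length;
        double sx = 0, sy = 0, sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];          // sum of the independent variable
            sy += y[i];          // sum of the dependent variable
            sxy += x[i] * y[i];  // sum product
            sxx += x[i] * x[i];  // sum of squares of x
            syy += y[i] * y[i];  // sum of squares of y
        }
        double denomX = n * sxx - sx * sx;
        double denomY = n * syy - sy * sy;
        if (denomX == 0.0) {
            throw new IllegalArgumentException("x values are all identical");
        }
        double b1 = (n * sxy - sx * sy) / denomX;
        double b0 = (sy * sxx - sx * sxy) / denomX;
        double r = (n * sxy - sx * sy) / Math.sqrt(denomX * denomY);
        return new double[] {b0, b1, r};
    }

    public static void main(String[] args) {
        // Made-up samples that lie exactly on y = 3 + 2x.
        double[] x = {1, 2, 3, 4};
        double[] y = {5, 7, 9, 11};
        double[] result = fit(x, y);
        System.out.printf("b0=%.3f b1=%.3f r=%.3f%n", result[0], result[1], result[2]);
    }
}
```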
Multiple Linear Regression
Multiple linear regression problems involve a dependent variable and two or more independent
variables. Using the least squares method, the goal is to find the linear relationship between the
variables involved. The relationships are of the form y = b0 + b1x1 + b2x2 + … + bnxn, where n is
the number of independent variables, x1, x2, …, xn are the various independent variables and y is
the dependent variable.
To solve multiple linear regression problems, we first reduce the multiple linear model to its
simple linear forms, in which it is easier to determine the regression equation. To do this, we
determine y = b0 + b1x for every independent variable. That way, the set of regression coefficients,
denoted b, associated with the independent variables can be determined using the least squares
method. The set b, made up of b1, b2, …, bn, contains all the regression coefficients of the
predicted regression function. Note that fitting each independent variable separately yields the
same slope coefficients as a joint least squares fit only when the independent variables are
uncorrelated with one another; otherwise the coefficients should be estimated jointly.
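For comparison with the per-variable reduction described above, the following Java sketch estimates all coefficients jointly by solving the least squares normal equations with Gaussian elimination. This is a different, standard technique swapped in for illustration, not the paper's procedure; the class name MultipleRegression, the singularity tolerance and the sample data are assumptions.

```java
// Hypothetical helper for illustration only: joint least squares fit.
class MultipleRegression {
    // Each row of x holds one observation of the independent variables.
    // Returns {b0, b1, ..., bn} for y = b0 + b1*x1 + ... + bn*xn.
    static double[] fit(double[][] x, double[] y) {
        int m = x.length;          // number of observations
        int p = x[0].length + 1;   // number of coefficients, including b0
        // Build the augmented normal equations [X^T X | X^T y].
        double[][] a = new double[p][p + 1];
        for (int r = 0; r < m; r++) {
            double[] row = new double[p];
            row[0] = 1.0;          // constant term for the intercept
            System.arraycopy(x[r], 0, row, 1, p - 1);
            for (int i = 0; i < p; i++) {
                for (int j = 0; j < p; j++) {
                    a[i][j] += row[i] * row[j];
                }
                a[i][p] += row[i] * y[r];
            }
        }
        // Forward elimination with partial pivoting.
        for (int col = 0; col < p; col++) {
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new IllegalArgumentException("variables are linearly dependent");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            for (int r = col + 1; r < p; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= p; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        // Back substitution.
        double[] b = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = a[i][p];
            for (int j = i + 1; j < p; j++) {
                s -= a[i][j] * b[j];
            }
            b[i] = s / a[i][i];
        }
        return b;
    }

    public static void main(String[] args) {
        // Made-up samples that lie exactly on y = 1 + 2*x1 + 3*x2.
        double[][] x = {{1, 1}, {2, 1}, {1, 2}, {3, 2}, {2, 3}};
        double[] y = {6, 8, 9, 13, 14};
        System.out.println(java.util.Arrays.toString(fit(x, y)));
    }
}
```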
Nonlinear Regression
Nonlinear regression problems involve finding a nonlinear relationship between a dependent
variable and one or more independent variables. Because nonlinear graphs are difficult to analyze,
many of them can be represented mathematically as linear models before they are analyzed. This
makes it possible to use linear regression techniques to analyze such relationships.
One of the ways used to represent nonlinear relationships with linear models is taking logs
on both sides of the relationship equation. That reduces the nonlinear relationship to a linear
relationship. An example is a relationship of the form y^2 = x^2/(xy). To reduce this relationship
to a linear relation we take logs on both sides of the relation.
The resulting relationship is 2 log y = 2 log x − log x − log y. When this relationship is
simplified, the result is log y = (log x)/3. In this form, the log y term represents the dependent
variable and the log x term represents the independent variable. Let K = log y and let P = log x. It
follows that K = P/3. This becomes the linear form of our nonlinear relation.
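The log transformation can be checked numerically. The Java sketch below takes logs of made-up samples that satisfy y = x^(1/3), fits a straight line to the transformed values and recovers a slope of about 1/3. The class name LogLinearFit is hypothetical, and the sketch assumes all x and y values are positive so that their logarithms exist.

```java
// Hypothetical helper for illustration only.
class LogLinearFit {
    // Fits K = b0 + b1*P by least squares, where P = log x and K = log y.
    // Returns {b0, b1}. Requires x[i] > 0 and y[i] > 0.
    static double[] fitLogLog(double[] x, double[] y) {
        int n = x.length;
        double sp = 0, sk = 0, spk = 0, spp = 0;
        for (int i = 0; i < n; i++) {
            double p = Math.log(x[i]);
            double k = Math.log(y[i]);
            sp += p;
            sk += k;
            spk += p * k;
            spp += p * p;
        }
        double denom = n * spp - sp * sp;
        if (denom == 0.0) {
            throw new IllegalArgumentException("x values are all identical");
        }
        double b1 = (n * spk - sp * sk) / denom;
        double b0 = (sk - b1 * sp) / n;
        return new double[] {b0, b1};
    }

    public static void main(String[] args) {
        // Made-up samples satisfying y = x^(1/3).
        double[] x = {1, 8, 27, 64};
        double[] y = {1, 2, 3, 4};
        double[] b = fitLogLog(x, y);
        System.out.printf("intercept=%.6f slope=%.6f%n", b[0], b[1]);
    }
}
```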
Machine Learning based models Used for Developing Anomaly Based
Intrusion Detection Systems.
This section discusses how hidden Markov models can be used to detect and prevent threats on a
computer system.
Application of hidden Markov models to detect threats and other critical occurrences in a
system.
Hidden Markov models are machine learning models that are used to model the states of a
system, the sequence in which they occur and the associated probabilities for each state transition.
When a system has a set of states in which it usually falls and it can be predicted or established
that each new state depends on the previous state, then hidden Markov models can be used to
learn the state transitions that usually happen in the system. It must be stated that the sequence in
which states occur in a system can be characterized by a parametric random process. Also, the
probability associated with each state transition is assumed not to depend on the time at which the
transition occurs in the system.
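As a simplified illustration of learning state transitions, the Java sketch below estimates transition probabilities from an observed sequence of states and flags transitions that were rarely or never seen. It deliberately models a plain first-order Markov chain with directly observed states, which is a simplification; a full hidden Markov model would also include hidden states and emission probabilities. The class name TransitionModel, the integer state encoding and the threshold value are assumptions.

```java
// Hypothetical helper for illustration only: first-order Markov chain.
class TransitionModel {
    // Estimates P(next = j | current = i) from an observed sequence.
    // States are encoded as integers 0 .. numStates - 1.
    static double[][] estimate(int[] sequence, int numStates) {
        double[][] p = new double[numStates][numStates];
        for (int t = 0; t + 1 < sequence.length; t++) {
            p[sequence[t]][sequence[t + 1]] += 1.0;  // count each transition
        }
        for (int i = 0; i < numStates; i++) {
            double rowTotal = 0.0;
            for (int j = 0; j < numStates; j++) {
                rowTotal += p[i][j];
            }
            if (rowTotal > 0.0) {
                for (int j = 0; j < numStates; j++) {
                    p[i][j] /= rowTotal;             // normalise counts to probabilities
                }
            }
        }
        return p;
    }

    // A transition is treated as unusual when its learned probability
    // falls below the given threshold.
    static boolean isUnusual(double[][] p, int from, int to, double threshold) {
        return p[from][to] < threshold;
    }

    public static void main(String[] args) {
        // Made-up encoding: 0 = average usage, 1 = optimal usage, 2 = threat.
        int[] observed = {0, 1, 0, 1, 0, 2};
        double[][] p = estimate(observed, 3);
        System.out.println(isUnusual(p, 0, 1, 0.05)); // false
        System.out.println(isUnusual(p, 1, 2, 0.05)); // true
    }
}
```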
For computer systems in which occurrences happen according to a parametric random
process, these occurrences can be seen as the set of states of the system. Some of these occurrences
may be the point at which the system is at its optimal usage, and the point at which a particular
threat occurs in the system. Once the set of threat types that occur in the system is determined, it
becomes possible to study the sequence in which these threats occur and the various
transitions between them using hidden Markov models. Also, the various usage points,
including the optimal, the minimum and the average usage, and the transitions between them
can be studied using hidden Markov models.
Because various occurrences and threats can be studied using hidden Markov models, it
becomes possible to predict the next occurrence or threat that will happen on a host or a computer
network. Threat sources can also be predicted using threat models. When threat models are
integrated, they give a general idea about the source of the threat. With such knowledge,
the next threat or occurrence that has a higher likelihood of happening on a host or network can be
predicted by applying hidden Markov models. As such, occurrences can be prevented if
they are estimated to be disastrous. Also, if for some reason the optimal or minimal
usage must be reached, it becomes possible to study ways of optimizing the transition from the
current state or predicted next state to the required state. This makes it possible to move from a
particular usage point to the desired usage point.
This approach to threat detection and usage optimization makes it possible to build anomaly
based intrusion detection systems that are correct and prompt and that increase optimal use of the
system. The anomaly based intrusion detection systems built using these techniques are expected to
be correct because the threat models come from usage models that are built using similar approaches,
and the threat prediction and prevention mechanisms are designed using the same techniques.
Also, there are likely to be fewer false alarms, since the threats predicted on hosts
or networks come from threat models designed with these methods.
An example of a cyber security threat that this approach can be used to model is a
network problem where a student is determined or predicted to be sending threatening or socially
unacceptable emails to colleagues. Typically, his identity is hidden on the network on which he
sends the emails. As such, it is difficult to determine the likelihood that he will send such
threatening emails on a particular day or hour, so that he could be identified and brought
to book. Using hidden Markov models, a usage model of the email system could be developed that
makes it possible to determine the day or hour in which he is likely to send such an email.
This will help in determining his identity and bringing him to book.
The Normal Usage Model of a System
If the normal usage of a mobile system can be represented by a mathematical function
made up of system variables Xi and system constants Ci, then any representation
of our mobile system can be summarized as Y = f(Xi, Ci), where Y is our system’s usage and Xi
are the various independent variables of our mobile system that constitute the normal usage model
of the system. A normal usage model is an abstract representation of the usual or normal
functioning or behavior of a system.
In order to model the normal usage of our system and determine its mathematical
representation, it is essential to keep the method simple and the variables simple in abstraction and
minimal in quantity. This makes it easy to analyze, model and detect threats by applying a branch
of calculus called differentiation. Simplicity and minimal number of variables make it possible to
arrive at a mathematical function whose differential coefficient can be easily computed using
differentiation. As such, two cases will be considered.
In the first case, the normal usage model of our system can be analyzed and modelled
based on simple but essential micro usage models. These micro usage models represent smaller
components of our mobile system, such as its authentication system or a
user’s session. Ideally, these models are derived from exactly one most appropriate system
variable when feasible, or at most two, in order to reduce the complexity involved in computing the
differential coefficient of the usage model.
For a mathematical function involving more than a single independent variable, our method
for threat detection using the differential equations techniques is within the scope of multivariable
calculus. Since it is easy to compute the differential coefficient of a single variable function, our
threat analysis and detection can be easy if all our micro models are single variable functions.
In the second case, however, our usage model derives its mathematical representation from
at least two or three of the most relevant system variables of the mobile system under examination.
This option increases the complexity involved in calculating the differential coefficient of our
normal usage model and analyzing the associated threat. This is because the normal usage model for
this case is a function that can be derived from two or more independent system variables.
To do this type of differentiation, we use a branch of calculus called partial differentiation,
where one of the independent variables of our usage model is held constant to analyze changes in
the usage. This type of differentiation is also within the scope of multivariable calculus. The
sections that follow the one below throw more light on how to model the normal usage of several
micro usage models. These micro usage models are expected to be components of a computer
network’s usage.
It must be noted that the usage model is made up of the usage model function and a
statistical model that captures the mean and standard deviation of the predicted usage function.
This statistical usage model is called the moments model, or the mean and standard deviation
model. Other statistical models could have been used, including time series models, univariate
models and bivariate models.
Single Variable Calculus Review and its Applications
Assume a mobile system with exactly three major system variables. If sampling each of these
variables helps us to arrive at exactly one micro usage model of our mobile system that best
represents the behavior or functioning of that feature of our system, then we can use differential
equations of the three micro models to analyze and detect threats. Below are some examples of
calculus basics for our threat modelling techniques.
Y = 2X + 3 is a linear function that represents our first micro usage model, where X is the number
of authentications. Y = 3X^2 + 2X + 6 is a quadratic function that represents our second micro
usage model, where X is the number of hosts on the mobile system’s wireless network. Y = 40/X + 5
is a rational (reciprocal) function that represents our third micro usage model, where X is the
number of applications on a host on the mobile system’s wireless network. For each micro usage
model, the differential coefficient can be computed using the laws of differentiation given below.
Theorem 1: d/dx (C) = 0, where C is a constant. Theorem 2 (power rule): to compute
d/dx f(Xi, Ci), first simplify f(Xi, Ci) into a sum of terms of the form c·Xi^n. The derivative of
each term is the product of its exponent n and its coefficient c, multiplied by the system variable
Xi raised to the power n - 1, that is, n·c·Xi^(n-1). Repeating this step for every term of
f(Xi, Ci) and adding the results gives the derivative of f, which is the sum of the derivatives of
its terms.
From the calculus basics review above, the corresponding differential coefficients of the
three micro models are 2, 6X + 2, and -40/X^2. If the standard deviations of
our micro models are computed, then we can analyze changes in our system by looking at values
of our usage model and its derivatives and how they relate to the average usage, its corresponding
standard deviation, and the acceptable thresholds for threats.
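The two differentiation theorems can be checked mechanically. The sketch below is only an illustration, not the paper's implementation: it assumes each micro usage model is stored as a list of (coefficient, exponent) pairs, and the function names `differentiate` and `evaluate` are hypothetical.

```python
from fractions import Fraction


def differentiate(terms):
    """Apply the power rule term by term.

    `terms` is a list of (coefficient, exponent) pairs, so
    [(3, 2), (2, 1), (6, 0)] stands for 3X^2 + 2X + 6.
    """
    result = []
    for coeff, exp in terms:
        if exp == 0:
            continue  # Theorem 1: the derivative of a constant is 0
        result.append((coeff * exp, exp - 1))  # Theorem 2: n*c*X^(n-1)
    return result


def evaluate(terms, x):
    """Evaluate a list of (coefficient, exponent) terms at a nonzero x."""
    return sum(Fraction(c) * Fraction(x) ** e for c, e in terms)


# The three micro usage models from the text (representation is assumed).
linear = [(2, 1), (3, 0)]              # Y = 2X + 3
quadratic = [(3, 2), (2, 1), (6, 0)]   # Y = 3X^2 + 2X + 6
reciprocal = [(40, -1), (5, 0)]        # Y = 40/X + 5

d_linear = differentiate(linear)          # represents 2
d_quadratic = differentiate(quadratic)    # represents 6X + 2
d_reciprocal = differentiate(reciprocal)  # represents -40/X^2
```

Evaluating a derivative at a sampled X, for example `evaluate(d_quadratic, 4)`, gives the rate of change of usage at that point, which is the quantity compared against the acceptable threshold.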
Any occurrence at a point where our usage model value is not equal to the average usage
indicates a possible threat. Any occurrence at a point where the usage model value is less than
the average usage minus its corresponding standard deviation is a denial of service threat. Any
occurrence at a point where the usage model value is greater than the average usage plus its
corresponding standard deviation is an intrusion. Also, any occurrence at a point where the value
of the usage’s derivative falls outside the acceptable threshold for threats is a threat.
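These rules can be sketched as a small classifier. This is a minimal illustration under stated assumptions: values inside the band from the average minus one standard deviation to the average plus one standard deviation are treated as normal, and the function names and the derivative bounds `low` and `high` are hypothetical.

```python
def classify(usage, mean, std):
    """Label a usage value relative to the band [mean - std, mean + std]."""
    if usage < mean - std:
        return "denial of service"  # below the lower bound
    if usage > mean + std:
        return "intrusion"          # above the upper bound
    return "normal"                 # assumption: inside the band is acceptable


def derivative_alert(rate, low, high):
    """Return True when the rate of change of usage leaves [low, high]."""
    return not (low <= rate <= high)


labels = [classify(u, mean=10.0, std=3.0) for u in (2.0, 11.0, 15.0)]
```

With an average of 10 and a standard deviation of 3, the sample values 2, 11 and 15 are labelled as a denial of service, normal usage and an intrusion respectively.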
Usage Model List
Authentication Usage Model
The authentication usage model represents the usage of an authentication system. The independent
variables that must be sampled to determine the usage of an authentication system are the average
data transmitted during an authentication (x1) and the average network speed for a single
authentication (x2). The average data transmitted is the average of request and response data for a
single authentication and the average network speed is the average upload and download speed for
a single authentication. The dependent variable that must be sampled is the time taken for an
authentication (y).
The goal of modelling the dependent and independent variables is to arrive at a mathematical
relationship between y and the two independent variables x1 and x2. It is expected that the
relationship will be Y=c1(x2/x1) +c2, where c1 and c2 are system constants. In addition to that, some
system constants that will aid threat analysis must be determined. These are the total number of
valid authentications, the expected authentications within a time frame, the minimum
authentications within a time frame and the maximum authentications within a time frame. The
mathematical relationship between y, x1 and x2 is the normal usage model of the authentication
system. After this relationship has been determined, various occurrences that deviate from this
relationship can be used to analyze threats. For instance, any occurrence that is not equal to the
average usage is a threat. Additionally, any occurrence that indicates a change outside an
acceptable threshold is a threat. The acceptable threshold is a range within which changes in the
system are deemed normal. Such a range is bounded by the average usage minus its standard
deviation and the average usage plus its standard deviation.
Session Usage Model
A session usage model represents a single user’s behavior before his session expires. To determine
the mathematical model for a user’s session, two main independent variables must be sampled.
These are size of session data accumulated (x1), and number of user actions (x2). The dependent
variable that must be sampled is time spent before session expires (y). The session usage model is
expected to be made up of two micro usage models. The mathematical representations of the micro
usage models are expected to be Y=c1x1+c2 and Y=c1x2+c2, where c1 and c2 are system constants
determined separately for each model.
In addition to the two mathematical functions, some system constants that will aid threat analysis
must be determined. These include the average number of user actions, the average size of data
accumulated, and the average time spent. These constants can be determined from the data set used
to determine the usage model.
The two mathematical relationships represent the session usage model. Both are linear
functions. It is expected that as user actions increase, the time spent also increases. It is also
expected that as the data accumulated increases, the time spent also increases.
Memory Usage Model
The memory usage model represents the usage of memory space in a system. The independent
variables that must be sampled are number of application programs running (x1), and the number
of system processes running (x2). The dependent variable that must be sampled is the amount of
memory space being used (y). The mathematical relationship between x1, x2, and y is expected to
be y=c1x1+c2x2+c3 where c1 is the average memory space for programs, c2 is the average memory
space for processes and c3 is the average memory being used when no process or program is
running.
In addition to these, some system constants that aid threat analysis must be determined. These
include the minimum and maximum memory space for programs and the minimum and maximum
memory space for processes. The mathematical relationship between x1, x2, and y is the memory
usage model. When determined, the memory usage model can be used to analyze changes in the
memory usage that indicate threats in the system.
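As an illustration of how the constants c1, c2 and c3 could be estimated from sampled data, the sketch below fits y=c1x1+c2x2+c3 by ordinary least squares using the normal equations. It is a sketch under assumptions rather than the paper's simulation code: the helper names `solve` and `fit_usage_model` are hypothetical, and the sample values are synthetic, generated from known constants so that the fit can be verified.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_usage_model(x1, x2, y):
    """Least-squares estimate of (c1, c2, c3) in y = c1*x1 + c2*x2 + c3."""
    rows = [(float(a), float(b), 1.0) for a, b in zip(x1, x2)]
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    return solve(ata, aty)


# Synthetic samples: programs running (x1), processes running (x2), memory used (y).
programs = [1, 2, 3, 4, 5, 2]
processes = [10, 12, 9, 15, 11, 20]
memory = [50 * p + 8 * q + 300 for p, q in zip(programs, processes)]

c1, c2, c3 = fit_usage_model(programs, processes, memory)
```

Because the synthetic data contain no noise, the fit recovers the constants used to generate them (50, 8 and 300). The same function applies unchanged to the CPU usage model, which has the same form.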
CPU Usage Model
The CPU usage model represents CPU usage in a system. The independent variables that must be
sampled are the number of application programs running (x1), and number of system processes
running (x2). The dependent variable that must be sampled is amount of CPU power being used
(y). The mathematical relationship between x1, x2, and y is expected to be y=c1x1+c2x2+c3 where
c1 is the average CPU power being used for programs, c2 is the average CPU power being used for
processes and c3 is average CPU power being used when no process or program is running. In
addition to these, some system constants that aid threat analysis must be determined. These include
the minimum and maximum CPU power for programs and the minimum and maximum CPU
power for processes. The mathematical relationship between x1, x2 and y is the CPU usage model.
When determined, the CPU usage model can be used to analyze changes in the CPU usage that
indicate threats in the system.
Program Usage Model
To determine the program usage model, the dependent and independent variables that must be
sampled are the time spent using the program (y) and the number of functions used (x). In
addition, the following constants must be determined: the minimum number of functions used and
the maximum number of functions used. The relationship between y and x determined after sampling
various x and y values is the program usage model, denoted by y=f(x).
Host Usage Model
The host usage model is composed of four independent variables: memory usage (x1), session
usage (x2), CPU usage (x3), and program usage (x4), derived from their respective usage models.
The dependent variable that must be sampled is the time spent on the host (y). Any relationship
determined between the dependent and the independent variables is the host usage model. The
resulting host usage model is denoted y=f(x1, x2, x3, x4).
Battery Usage Model
The battery usage model is made up of the average CPU usage, the average memory usage and the
average session usage in the system. These are the independent variables. The
dependent variable is the battery lifespan. The independent variables are derived from their
respective micro usage models.
Device Usage Model
The device usage model is made up of a battery usage model, a host usage model, and the time
spent on the device. The usage models that make up the device usage model compute the average
micro usage and try to relate that with the time spent on the device. The time spent on the device
is the dependent variable.
Server Usage Model
The server usage model is made up of the CPU time being used, the memory space being used and
the number of processes running. These variables are used to form two different micro usage
models. As such, there are two dependent variables, CPU time and memory space. The
independent variable for both micro usage models is the number of processes running.
Port Usage Model
The port usage model is made up of the time elapsed during communication, number of programs
that use the port and the number of paired ports. The number of paired ports is the dependent
variable and the remaining variables are the independent variables.
Network Usage Model
The network usage model is made up of average port usage, average server usage, average host
usage, the average size of data transmitted on the network, and time spent on the network. The first
three variables are the independent variables. The remaining two are the dependent variables. As
such, two micro usage models make up the network usage model.
Aggressive Usage Detector
This model is a utility that detects aggressive behavior on a system. It is modelled just like the
various micro usage models. Various factors that determine aggressive behavior during system
usage are used to determine the mathematical representation of this utility. Aggressive behavior
includes aggressive use of major system resources, and aggressive use of system components with
limited resources.
The average aggressive behavior and its standard deviation are determined. Any system
occurrence that indicates the average aggressive behavior, the average aggressive behavior plus
its standard deviation, or the average aggressive behavior minus its standard deviation is
considered a threat and must be halted, raised as an alert, or stored for audit purposes.
False Alarm Detector
The false alarm detector is a utility that detects normal system usage that may otherwise be
deemed a threat. Occurrences that meet the criteria for false alarms are normal usage that seems
to put the entire usage of the system into a false state of vibration or anarchy. Such usage
occurrences are therefore prioritized as normal optimal usage. The remedy for the vibrations such
usage occurrences cause is a delay in other normal usage occurrences in the system.
The state and magnitude of other system occurrences, together with the state and magnitude of the
normal optimal usage, determine the impact of the perceived anarchy. To increase the convenience
with which the system for which this utility is developed can be used, the average delay time and
its standard deviation must be determined. This utility is part of the normal usage. The utility is
modelled just like the aggressive usage detector.
Special parameters of the usage model
This section discusses special parameters of our normal usage model. These parameters include
the average usage, the usage standard deviation, the minimum usage, the maximum usage and the
most frequent usage value recorded.
The average usage is the predicted average usage after the normal usage model function has been
determined. The usage standard deviation is the standard deviation of the predicted normal usage
function. The minimum and maximum usage values are the minimum and maximum usage
predicted using the normal usage model. These parameters together with usage rates, threat model
constants and other usage constants are used in analyzing and detecting threats.
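The special parameters can be computed directly from a list of predicted usage values. The sketch below uses Python's standard `statistics` module. The function name `special_parameters` and the sample values are hypothetical, and the population standard deviation is an assumption, since the text does not say whether the population or the sample form is intended.

```python
import statistics


def special_parameters(predicted_usage):
    """Summarize predicted usage values from a normal usage model."""
    return {
        "average": statistics.mean(predicted_usage),
        "std_dev": statistics.pstdev(predicted_usage),  # population form (assumed)
        "minimum": min(predicted_usage),
        "maximum": max(predicted_usage),
        "most_frequent": statistics.mode(predicted_usage),
    }


usage = [4, 6, 6, 8, 6, 10, 2]  # hypothetical predicted usage values
params = special_parameters(usage)
```

The average and standard deviation returned here are the two values that bound the acceptable usage range used in threat analysis.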
BUILDING THE USAGE PROFILE
To build the usage profile, we will first program a usage model for each component of the
computer system under investigation. For this research, we want to build the usage profile for a
computer network. As such, we will program usage models for authentication on the computer
system, for a user’s session on the computer system, for memory usage and for CPU usage.
Additionally, we will program a usage model for a host on the network, another for a server on
the network and, finally, a usage model for the network itself.
The usage model for each component represents the behaviour of that component of a
computer system under investigation. The usage model when implemented will help us determine
the regression equation which represents the research model and the average usage and its standard
deviation. In addition to the regression equation and the mean and standard deviation model we
will develop a markov chain model for the system under investigation. As such we will determine
states in the entire computer network and the various state transitions and the associated
probabilities of state transitions. The rest of this chapter will explain how to build a usage profile
using an authentication system and explain the details of the critical variables of the other usage
models and explain the mathematical theory needed for building the usage profile.
BUILDING A MODEL PROFILE FOR AN AUTHENTICATION SYSTEM
To build a usage model for an authentication system, we must sample critical system variables of
a system. These variables include the download speed on the network, the upload speed on the
network, the size of data sent to the server during authentication, the size of data sent to the client
during authentication and the time it takes for a successful authentication. The data sent to the
server and the data received from it are the request data and the response data respectively.
To build the usage model for the authentication system, we will capture data for all the critical
variables at equal time intervals, say every 10 minutes, while the authentication system is being
used. After collecting a sample of about 10 observations, we will try to determine the relationship
between the dependent variable and the independent variables. As already stated, the relationship
can be determined using simple or multiple linear regression. In addition to the regression
equation, we will also determine other statistics that describe the behavior of the authentication
system, such as the means and standard deviations of the variables that were sampled.
BUILDING THE MARKOV CHAIN MODEL FOR THE AUTHENTICATION SYSTEM
Hidden Markov models are machine learning models that are used to model states in a system, the
sequence in which they occur and the associated probabilities for each state transition. When a
system has a set of states in which it usually falls, and it can be predicted or established that each
new state is dependent on the previous states, then hidden Markov models can be used to learn the
state transitions that usually happen in the system.
To build the markov chain model we will determine states on the authentication system and their
associated probabilities. Some of these states include the average usage of the authentication
system. This may be abstracted as the average time it takes for a successful authentication. Other
states include the minimum and maximum recorded time for a successful authentication and the
average time it takes for a failed authentication or the maximum and minimum recorded time for
failed authentications. With these states and their associated probabilities of occurrence during
a normal day, we have more information about the behaviour of the authentication system.
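A minimal sketch of estimating state transition probabilities from an observed sequence of authentication states follows. It models a plain, fully observed Markov chain rather than a hidden Markov model, which is a simplifying assumption. The state labels, the sample sequence and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate P(next state | current state) from an observed state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in counts.items()
    }


def sequence_probability(states, probs):
    """Probability of a sequence of transitions; 0.0 if any was never observed."""
    p = 1.0
    for current, nxt in zip(states, states[1:]):
        p *= probs.get(current, {}).get(nxt, 0.0)
    return p


# Hypothetical states logged during a normal day.
observed = ["average", "average", "maximum", "average", "failed",
            "average", "average", "minimum", "average"]
probs = transition_probabilities(observed)
```

A sequence whose probability under `probs` is zero or very low, such as two consecutive failed authentications in this sample, is a candidate for unusual behavior.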
Threat Models in a System
A threat is a change in the normal usage model that is beyond a certain acceptable threshold called
the standard deviation of the usage model. A threat model on the other hand is an abstract
representation of this change in our mobile system that is beyond the acceptable threshold.
Integration can be performed on a threat model to determine the source of the threat. Integration is
a reverse operation for differentiation in calculus. A threat model that can perform integration
operations can be called a novel self integrating data structure. This chapter of the paper will look
at threat models of the micro usage models that make up a computer network and how to analyze
these threats in order to prevent them.
Also, how to determine the sources of these threats using a novel self integrating threat
model will be discussed. To do this, three main functions are introduced: y = 3, y = 4X + 2 and
y = 9X^2 + 3. These functions are in the context of the novel self integrating data structure and
represent three different threat models. Additionally, the threat models of the various micro usage
models discussed in this paper will be explored.
Properties and Methods of the Novel Self Integrating Data Structure
The main properties or characteristics of the data structure that represents our threat model
include, to mention a few, the names of network software or host application software, the version
numbers of network and host software, license information such as the date the software was
purchased or released and the number of years until renewal, and the IP address and MAC address
of a host on a network.
The methods of such a large, simulation-oriented object may include a method for computing
the integral of a threat model, another for computing the differential coefficient of the predictive
normal usage model, and a method for computing the differential equation of a network or host
threat model. These are mostly the methods needed for performing the major calculus operations
that support the novel calculus simulation used to detect threats and their sources on a wireless
network. Besides these, it may be necessary to implement methods that retrieve hidden network
identity such as IP and MAC addresses on a local area network.
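One possible shape for such a data structure is sketched below. It is an illustration only: the class name `ThreatModel`, its field names, and the choice to store the threat model function as (coefficient, exponent) pairs are all assumptions, and the sample host details are placeholder values.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class ThreatModel:
    """Hypothetical self integrating threat model for one piece of software."""
    software_name: str
    version: str
    ip_address: str
    mac_address: str
    # Threat model function as (coefficient, exponent) pairs, e.g. 9X^2 + 3.
    terms: list = field(default_factory=list)

    def derivative(self):
        """Differential coefficient by the power rule; constant terms vanish."""
        return [(c * e, e - 1) for c, e in self.terms if e != 0]

    def integral(self):
        """Antiderivative terms without the constant C (assumes exponent != -1)."""
        return [(Fraction(c, e + 1), e + 1) for c, e in self.terms]

    def source(self):
        """Identify where the threat model is attached."""
        return (f"{self.software_name} {self.version} "
                f"@ {self.ip_address} ({self.mac_address})")


model = ThreatModel("auth-service", "1.0", "192.0.2.10",
                    "00:00:5e:00:53:01", [(9, 2), (3, 0)])
antiderivative = model.integral()  # represents 3X^3 + 3X (+ C)
```

Keeping the identifying properties next to the calculus methods means that once an integral points to a usage model, the same object can report which software and host it belongs to.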
Integration Review
Based on the three functions stated in this chapter, we will do an introductory review of
integration, the branch of calculus that reverses differentiation. The integrals of the functions
introduced in this chapter are 3X + C, 2X^2 + 2X + C and 3X^3 + 3X + C respectively, where C
represents system constants in the mobile system. Computing the integral can be tricky, so two
laws are defined below to aid quick computation of the integrals of a normal mathematical
function.
Theorem 1:
If a function is a constant such as a rational number, the integral is the product of the variable
x and that constant, plus a system constant c that is determined from a known pair of x and y
values.
Theorem 2:
If a function is not a constant, the integral of each term containing x is the coefficient of that
term divided by the sum of its exponent and 1, multiplied by the variable x raised to the power of
the sum of its exponent and 1. Constant terms are integrated using Theorem 1. The integral of the
function is the sum of the results for every term plus the corresponding system constant c.
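Applying the two theorems term by term to the three threat model functions of this chapter gives the worked integrals below. The final line uses a hypothetical known pair (X, Y) = (1, 7), chosen only to show how the system constant C can be fixed.

```latex
\begin{align*}
\int 3 \, dX &= 3X + C \\
\int (4X + 2) \, dX &= \frac{4}{1+1} X^{1+1} + 2X + C = 2X^2 + 2X + C \\
\int (9X^2 + 3) \, dX &= \frac{9}{2+1} X^{2+1} + 3X + C = 3X^3 + 3X + C \\
\text{With } Y = 2X^2 + 2X + C \text{ and } (X, Y) = (1, 7): \quad
7 &= 2 + 2 + C \;\Rightarrow\; C = 3
\end{align*}
```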
Interpretation of Threat Model Integrals
Since the novel self integrating data structure is a programmed threat model, it is important to
discuss the meaning of its integrals. The integrals represent the source of the original threat.
Examining the integrals of the threat model may result in detecting the function, software, host
or network from which the threat originated. With properties like software name, version
number, IP and MAC addresses, it becomes easy to pinpoint the source of the threat.
If the integral of a threat model looks like the normal usage model of a function of the
system under examination, then that function from the system under examination can be predicted
as the source of the threat. Similarly, if the integral is similar to the normal usage model of a
software, host, or network that forms part of the system which is being investigated, then that threat
can be predicted to be from that software, host or network.
Threat Analysis and Detection
To do threat analysis in a system and abort processes that initiated those threats, linear and non
linear programming techniques can be used. The goal here is to minimize the threat occurrence
frequency and the overall impacts associated with the threat and optimize the normal usage
function. In addition to these two goals, there are some constants that aid threat analysis. These
constants are associated with the normal usage model and the threats in the system.
Examples of these constants may be the rate at which usage is increasing with respect to a
particular usage variable or the rate at which the threat impact and frequency increases with respect
to a particular variable in the usage model and other special parameters associated with the usage
model function.
The average usage, its standard deviation and the threat model function make up the threat
model. The average usage and standard deviation are constants in the threat model. Using the threat
model function, the average usage and standard deviation, threat analysis can be done using linear
and non linear programming. The goal is to minimize threats using the threat model function as
the objective function and the average usage and standard deviation as constraints. Other
parameters that may be used as constraints include the rate at which usage is increasing with
respect to a particular usage variable or the rate at which the threat impact and frequency are
increasing with respect to a particular usage variable.
Threat Prediction
This section discusses how to predict threats in a system. The network usage model discussed in
the previous chapter and its associated threat model will be used to demonstrate how to predict or
detect a threat in a system. As discussed in the previous section, threats can be detected using
linear and non linear programming. The network usage model function and its associated threat
model function are the objective functions.
The constraints that will be used are the average network usage and its standard deviation,
and other parameters such as the rate at which the network threat increases with respect to other
network usage model components such as average host usage, average server usage, average port
usage, average time the network operates, average data transmitted on the network. The goal of the
linear or non linear programming is to optimize the usage such that usage is within the range of the
average usage minus its standard deviation and the average usage plus its standard deviation. These
are the lower and upper bounds of our objective function. Every combination of system variables
whose usage is within this usage range minimizes threat in the system.
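The idea of keeping only the combinations of system variables whose usage falls inside the acceptable range can be illustrated with a brute-force enumeration. This is a stand-in for a real linear or non linear programming solver, not an implementation of one. The weights in `network_usage`, the candidate values and the average and standard deviation are all hypothetical.

```python
from itertools import product


def within_band(usage, mean, std):
    """True when usage lies in [mean - std, mean + std]."""
    return mean - std <= usage <= mean + std


def acceptable_combinations(usage_fn, ranges, mean, std):
    """Enumerate every combination of variable values and keep the in-band ones."""
    return [combo for combo in product(*ranges)
            if within_band(usage_fn(*combo), mean, std)]


def network_usage(port, host, server):
    """Hypothetical network usage model with assumed weights 0.5, 0.3 and 0.2."""
    return (5 * port + 3 * host + 2 * server) / 10


levels = [0, 50, 100]  # candidate average port, host and server usage values
combos = acceptable_combinations(network_usage, [levels] * 3, mean=50.0, std=10.0)
```

Every tuple in `combos` is a (port, host, server) usage combination whose predicted network usage stays between 40 and 60, the lower and upper bounds implied by the assumed average and standard deviation.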
Since the average port, host and server usage are derived from their corresponding usage models,
the linear and non linear programming analysis will be done independently for these ones. When
a threat is predicted in a system, the chance of it being accurate is dependent on the usage value at
that instance and whether it is within the range of the acceptable usage. This is constructed using
the average usage and its standard deviation. Any usage value that is less than the average usage
minus its standard deviation is a threat. Also, a usage value that is greater than the average usage
plus its standard deviation is a threat. That means that any predicted threat at a point where the
predicted usage is within the usage range has a high chance of being false. In addition, the
actual and predicted usage values can be used to determine the chance that the predicted threat is
accurate. If the difference between them is high, there is a chance that the predicted usage is
wrong. Since the predicted usage and the threat models are derived from the usage model function,
there is a chance the predicted threat is also false. Finally, the closer the correlation coefficient of
the usage model function is to zero, the higher the chance that the predicted usage and its
associated threat values are wrong. Usage model functions with a correlation coefficient of 0.6 and
above are taken to indicate that the predicted usage values and predicted threat values are
accurate. These values are obtained from the usage model function and the threat model function
respectively, which are modeled using the relevant system variables that make it possible to model
system usage and system threats.
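The correlation check described above can be sketched with the Pearson correlation coefficient between actual and predicted usage. The function names and sample values are hypothetical. Using the absolute value of the coefficient is an assumption, made because the text measures reliability by distance from zero.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def is_reliable(r, threshold=0.6):
    """Treat a usage model as reliable when |r| reaches the 0.6 threshold."""
    return abs(r) >= threshold


actual = [10, 12, 15, 19, 24]      # hypothetical observed usage
predicted = [11, 12, 16, 18, 25]   # hypothetical model predictions
r = correlation(actual, predicted)
```

For these sample values the coefficient is close to 1, so the predicted usage and the threats derived from it would be treated as reliable under the 0.6 rule.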
Risk Analysis in a System
To do risk analysis in a system, the frequency at which threats in the system occur and the impact
they have on the system must be known. When a frequency table is constructed for all threats and
their associated impacts stored, it becomes easy to analyze risks associated with a system.
When a threat is predicted, the likelihood of the threat occurring in the system can be computed
using the threat frequencies. The impacts various threats have can also be determined based on the
types of threats and other parameters such as the number of such threats, the speed at which they
occurred and the resources they affected or damaged. Risk in a system is computed as the product
of the likelihood of threat occurrence and the impact that threat occurrence has on the system.
These concepts are the basics for developing a risk analysis system using the techniques we have
discussed so far.
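The risk computation can be sketched from a threat log and an impact table. The threat names, the impact scores and the function names below are hypothetical, and likelihood is taken to be the relative frequency of each threat in the log, which is one reasonable reading of the frequency table described above.

```python
from collections import Counter


def threat_likelihoods(threat_log):
    """Relative frequency of each threat type in the log."""
    counts = Counter(threat_log)
    total = sum(counts.values())
    return {threat: n / total for threat, n in counts.items()}


def risk_scores(threat_log, impacts):
    """Risk = likelihood of the threat times its impact on the system."""
    likelihoods = threat_likelihoods(threat_log)
    return {threat: likelihoods[threat] * impacts[threat] for threat in likelihoods}


log = ["intrusion", "denial of service", "intrusion", "intrusion"]
impacts = {"intrusion": 8, "denial of service": 4}  # hypothetical impact scale
risks = risk_scores(log, impacts)
```

Here intrusions account for three of the four logged threats, so their risk (0.75 × 8 = 6.0) is higher than that of denial of service (0.25 × 4 = 1.0).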
Normal Usage Model and Threat Model Simulation
In this chapter, we discuss the experiment that was conducted to determine the usage of a computer
system. We also discuss how to simulate the threat and usage models with the hope of developing
a threat detection system. Four of the micro usage models that were discussed in this paper were
used for the simulation. These are the ones for authentication, session, CPU and memory.
Because the usage model for authentications was determined to be a rational function, logs were
taken on both sides of the relation as part of the simulation in order to reduce the relation to its
linear form. The original function is Y=c1(x2/x1)+c2. When reduced to its linear form we have log
Y = log c1 + log x2 - log x1 + log c2. Since log c1 and log c2 are constants, let us denote them by
k1 and k2 respectively. Additionally, let B = log Y, let j1 = log x1 and let j2 = log x2. Therefore,
the linear form of the usage for authentication is B = j2 - j1 + k1 + k2. Since k1 + k2 is a constant,
let it be represented by k. As such, B = j2 - j1 + k, where B is the dependent variable and j2 and j1
are the independent variables. When B, j2, and j1 are sampled, Y=c1(x2/x1)+c2 can be determined.
Note that the logarithm of a sum is not the sum of the logarithms, so this linear form is exact only
when c2 enters as a multiplicative rather than an additive constant; alternatively, substituting
z = x2/x1 gives Y = c1z + c2, which is already linear in z.
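One way to estimate c1 and c2 in Y=c1(x2/x1)+c2 without taking logs is to substitute z = x2/x1 and fit a simple linear regression of Y on z. The sketch below illustrates this alternative and is not the procedure used in the paper's simulation. The function name `fit_line` is hypothetical and the sample values are synthetic, generated from known constants.

```python
def fit_line(z, y):
    """Least-squares slope and intercept for y = slope * z + intercept."""
    n = len(z)
    mean_z, mean_y = sum(z) / n, sum(y) / n
    slope = (sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
             / sum((a - mean_z) ** 2 for a in z))
    return slope, mean_y - slope * mean_z


# Synthetic samples: data transmitted (x1) and network speed (x2).
data_size = [2.0, 4.0, 5.0, 8.0, 10.0]
speed = [4.0, 4.0, 10.0, 4.0, 30.0]
z = [s / d for s, d in zip(speed, data_size)]  # z = x2 / x1

# Authentication times generated from assumed constants c1 = 3.0 and c2 = 1.5.
auth_time = [3.0 * v + 1.5 for v in z]

c1, c2 = fit_line(z, auth_time)
```

Because the synthetic data are noise free, the fit recovers the assumed constants. With real samples, the correlation coefficient of Y against z indicates how well the expected form holds.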
The CPU and the memory usage models are multiple linear forms. The original relation is of the
form y=c1x1+c2x2+c3, where x1 and x2 are the independent variables. The original relation must be
reduced to its simple linear forms. To do this, determine y=b0+bx for each independent variable.
The sum of the various b0 equals c3. Each b corresponds to the constant associated with the
independent variable for which y=b0+bx was determined. For example, the b of the y=b0+bx
determined for x1 equals c1 and that for x2 equals c2. When x1, x2, and y are sampled and the
various y=b0+bx determined, y=c1x1+c2x2+c3 can be determined completely.
The simulation was run four times within a week. On the first instance, it was run for 15
minutes. On the second instance, it was run for 30 minutes. On the third instance, it was run for 45
minutes. On the last instance, it was run for 60 minutes. The functions for the usage models and
their corresponding correlation coefficients were also determined.
Tools and Computer Packages
This chapter discusses the tools and computer packages that were used throughout this research
project. We will also look at the programming languages, database platforms and development
frameworks that can be used to develop an anomaly based intrusion detection system for ecommerce
sites using the concepts we have discussed in this paper. The simulation was implemented using
Java. It was a console based simulation. Java was chosen for its object oriented concepts such as
encapsulation, inheritance, interfaces, objects, and polymorphism.
To implement an intrusion detection system using the results of this research, the following
tools will be essential: Bootstrap, CodeIgniter, MySQL Database Management System, SQLite,
SQLyog, and Eclipse. These tools are best suited for intrusion detection systems developed for
ecommerce sites. The development platforms that will be used are PHP and Android. PHP
is for the desktops and laptops that connect to the ecommerce sites and Android is for mobile
phones that use the ecommerce sites.
Bootstrap and CodeIgniter are web development frameworks. Bootstrap is for frontend
development and CodeIgniter is a backend framework for PHP developers. For Android
development, Eclipse can be used as the IDE. MySQL and SQLyog are for the database
servers that will run on the ecommerce site as part of the intrusion detection system
implementation. SQLite is for the databases that run on the Android implementations that form
part of the intrusion detection system developed for the ecommerce website.
With all these tools, frameworks and packages, developers are ready to develop intrusion
detection systems for ecommerce sites using the concepts in this research paper. It is expected that
the micro usage models discussed will be integral libraries that will be implemented in PHP and
Android as part of an implementation for ecommerce sites or any group of web or mobile
application systems.
Conclusion and Discussion
It is worth mentioning that the normal usage models and threat models experimented with in this
paper represent a computer system and its associated threats. These threats can be analyzed
periodically and audited as part of a computer security audit. This will fuel development of a risk
analysis system. A risk analysis system, threat detection system and normal usage system developed
from experimenting with the usage and threat models will make up a mobile security audit
framework that can be used for maintaining cyber security on computer systems. When practices and
processes for maintaining this framework are drafted and adhered to, it will be easy to maintain
cyber security on various computer systems.
Additionally, it can be established that using the differential equations technique, the novel
self-integrating data structure, and the linear and nonlinear programming techniques, threats on a
system can be analyzed and detected. To halt such threats, the intrusion detection system developed
using the techniques stated above must possess certain qualities. These qualities include
correctness, promptness, and ease of use. Correctness is how well the intrusion detection system
detects threats. This is important because correctness determines how often a predicted threat
turns out to be true or false. Promptness is the time it takes to detect or halt a threat, and ease
of use is the extent to which the intrusion detection system allows convenient use of the computer
system for which it is developed.
The techniques we have discussed make it possible to achieve correctness, promptness, and
ease of use. The usage model function, with its associated average usage and standard deviation,
makes it possible to ensure the correctness of the intrusion detection system. This is because the
statistical data sampled for developing the intrusion detection system is within the range of
acceptable usage. The average usage and standard deviation are computed using statistical models.
One such model used in this research is the moments model, or mean and standard deviation model.
With this statistical model and the usage model equation, it becomes possible to ensure the
correctness of the intrusion detection system.
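As an illustration of the mean and standard deviation model described above, the following Java
sketch computes the average usage and standard deviation from sampled usage values and checks
whether a new observation falls within an acceptable range. The class name UsageRange, the sample
values, and the width factor k (the number of standard deviations allowed on either side of the
mean) are assumptions made for illustration; the paper does not specify them.

```java
import java.util.Arrays;

/** Hypothetical sketch of a mean and standard deviation usage model. */
final class UsageRange {
    private final double mean;
    private final double stdDev;
    private final double k; // assumed width of the acceptable range, in standard deviations

    UsageRange(double[] samples, double k) {
        if (samples == null || samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        final double m = Arrays.stream(samples).average().getAsDouble();
        // Population variance: the average squared distance from the mean.
        double variance = Arrays.stream(samples)
                .map(x -> (x - m) * (x - m))
                .average()
                .getAsDouble();
        this.mean = m;
        this.stdDev = Math.sqrt(variance);
        this.k = k;
    }

    double mean() { return mean; }

    double stdDev() { return stdDev; }

    /** True when the observed usage lies inside [mean - k*stdDev, mean + k*stdDev]. */
    boolean isAcceptable(double observed) {
        return Math.abs(observed - mean) <= k * stdDev;
    }

    public static void main(String[] args) {
        // Assumed sample: number of running processes recorded during normal use.
        double[] samples = {40, 42, 38, 41, 39};
        UsageRange range = new UsageRange(samples, 2.0);
        System.out.println("mean=" + range.mean() + " stdDev=" + range.stdDev());
        System.out.println("41 acceptable? " + range.isAcceptable(41));
        System.out.println("60 acceptable? " + range.isAcceptable(60));
    }
}
```

A wider k tolerates more variation in usage and raises fewer alarms, while a narrower k flags
deviations sooner; choosing it is a tuning decision that this sketch leaves open.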
To achieve promptness, multithreading is applied to analyze, predict, detect, and halt
threats. All threat alarms and detection must use multithreading. Multithreading is a programming
concept that allows several threads of execution to run on the computer at the same time. This
concept makes it possible to predict multiple threats, perform multiple threat analyses, and halt
or raise alarms on the occurrences of multiple threats on a computer system.
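
A minimal Java sketch of the multithreading idea follows: several usage observations are checked
concurrently, each as its own task in a thread pool. The class name ConcurrentThreatCheck, the
simple limit test that stands in for a threat analysis, and the pool size are assumptions for
illustration and are not taken from the paper's simulation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: analyzing several usage observations on separate threads. */
final class ConcurrentThreatCheck {

    /** Returns one flag per observation; true means the value exceeds the assumed limit. */
    static List<Boolean> analyze(double[] observations, double limit) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (double value : observations) {
                // Each task stands in for one threat analysis.
                tasks.add(() -> value > limit);
            }
            // invokeAll runs the tasks on the pool and returns futures in task order.
            List<Boolean> flags = new ArrayList<>();
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                flags.add(result.get());
            }
            return flags;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("analysis interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("analysis task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze(new double[] {41, 60, 39, 75}, 45.0));
    }
}
```

Because the results are collected in task order, each flag can be matched back to the observation
that produced it even though the analyses run at the same time.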
Building a usage profile for anomaly detection on computer networks

  • 1. 0 Building a Usage Profile of a Computer Network System for Anomaly Detection on the Computer Network and various Peripherals on the Network Nathanael Ato Asaam Founder and CEO Equicksales Consulting Ltd. 2019
  • 2. 1 1 Abstract This paper is an investigation into building usage profiles of a system using behavior models. Such behavior models are the heart of machine learning, and evolutionary computing. Some other methods of building such usage profiles include the use of statistical models such as time series models, univariate models and mean and standard deviation models. The aim of building these usage profiles is to be able to detect unusual behavior on the system. This paper uses regression to determine the usage profiles of a system by studying the relationship between relevant system variables that will be used to formulate the usage profile. The dependent and independent variables for the usage profile can be determined from an audit trail. Additionally, the paper applies hidden markov models to study the various states a computer system can fall into and the various stage transitions in order to be able to predict unusual behavior in the system. Unusual behavior in this case may be a particular state or a transition from one state to another or the manner in which a particular state transition occurred. With this usage profile which is composed of the usage profile equation and a mean and standard deviation model that capture average usage and its standard deviation and the markov chain model that captures the various states of the system and the various state transition it becomes possible to detect anomaly on the system. Using linear and nonlinear programming, the usage profile equation can be maximized or minimized to determine states of the system and points at which the system is optimal. This can help improve the system’s usage. Also using differential coefficient of the usage profile equation and other statistical models such as the mean and standard deviation model, a threat profile of the system can be developed. When the threat profile equation is minimized using linear and nonlinear programming, it will help prevent threats on the system. 
The benefit of this research is its application to the development of anomaly threat detection systems and risk analysis systems that can be used for performing computer security risk assessments and analysis. The research model of this paper is Y=f (Xi, Ci) such that Xi represents system variables like number of application software running or number of system processes. Ci represents system constants like average number of processes. During this research, an experiment was conducted into how to represent a computer system’s usage with an abstract mathematical model. The experiment was conducted on desktops using micro usage models of a network system.
  • 3. 2 2 Threat analysis and detection is also done using some special parameters of the usage model. These parameters are constants in the system. Examples of these constants include rates at which the system’s usage increase or decrease with respect to certain variables in the system, and the rate at which threat occurrence increase or decrease in the system with respect to variables that make up the usage model. The normal usage and threat models on the other hand are the objective functions that are used for analyzing threats. The techniques we have discussed in this paper make it possible to achieve correctness, promptness and ease of use. The usage model function with its associated average usage and standard deviation make it possible to ensure correctness of the intrusion detection system. This is because the statistical data sampled for development of an intrusion detection system developed using these techniques can be used to formulate an acceptable usage range. There are two special utilities that compose the usage model. They are essential for improving convenient usage and preventing false alarms. They make the intrusion detection system correct and prompt at preventing threats. These concepts are the basics for developing a security audit framework. A simulation was run for four times within a week. On the first instance, it was run for 15 minutes. On the second instance, it was run for 30 minutes and on the third instance it was run for 45 minutes. On the last instance it was run for 60 minutes. The results indicate that a threat detection system can be built using the differential equation technique, the novel self integrating data structure and linear and non linear programming concepts. To make the intrusion detection system for which this research model proposes detect threats promptly, multithreading is applied to analyze, predict, detect and halt threats. 
Multithreading is a programming concept that ensure that several processes run on the computer at the same time. This concept makes it possible to predict multiple threats, do multiple threat analysis and halt or alarm the occurrences of multiple threats on a computer system. Ease of use of a system for which the intrusion detection system is developed is achieved using the mean and standard deviation model. Without that model, there is no acceptable range of our usage. That means that the average usage and its standard deviation prevents a rigid usage model and as such makes usage convenient.
  • 4. Table of Contents
Abstract
Introduction
Background
Problem Definition
Research Questions
Objectives
Behavior Models
System Threats
Boolean Calculus
Micro Usage Models
Properties of Intrusion Detection Systems
Research Model and Methodology
Statistical and Machine Learning Models
Cognitive Based, User Intention Based and Computer Immunology Based Models
Literature Review
Intrusion Detection Systems
Behavior Encryption
Risk Analysis
Information Security Awareness and Practices
Protocol for Mitigating Risks on Social Networking Sites
Research Model and Methodology
Research Model
Methodology
  • 5. Machine Learning Algorithms & Behavior Based Intrusion Systems
Audit Trail Analysis
Normal Usage Model
Threat Modelling
Boolean Calculus
Experimenting Usage and Threat Models
Computer Usage Survey
Threat Detection Systems
Threats Associated with Computer Systems
Attacks associated with a computer system
Malicious Code
IP Scan and Attack
Web Browsing
Virus
Unprotected Shares
Mass emails
Simple Network Management Protocol (SNMP)
Hoaxes
Backdoors
Password Crack
Brute Force
Dictionary
Denial of Service (DoS) and Distributed Denial of Service (DDoS)
Spoofing
  • 6. Man in the Middle
Spam
Mail Bombing
Mathematical Modelling Techniques and Machine Learning Based Models
Simple Linear Regression
Multiple Linear Regression
Non Linear Regression
Machine Learning based models Used for Developing Anomaly Based
The Normal Usage Model of a System
Single Variable Calculus Review and its Applications
Authentication Usage Model
Session Usage Model
Memory Usage Model
CPU Usage Model
Program Usage Model
Host Usage Model
Battery Usage Model
Device Usage Model
Server Usage Model
Port Usage Model
Network Usage Model
Aggressive Usage Detector
False Alarm Detector
Threat Models in a System
  • 7. Properties and Methods of the Novel Self Integrating Data Structure
Integration Review
Interpretation of Threat Model Integrals
Threat Analysis and Detection
Threat Prediction
Risk Analysis in a System
Normal Usage Model and Threat Model Simulation
Tools and Computer Packages
Conclusion and Discussion
References
  • 8. Introduction
Background
Cyber security threats on computer networks have the potential to damage resources on the network. Examples of such damage include corrupting data stored or transmitted on the network, infecting a host on the network with a virus, impersonating a valid user on the network, and preventing the proper functioning of application software on various hosts on the network. The security of computer systems is essential to organizations. Computer system security is usually provided by software that protects the computer system for which it was developed. One such software system is an intrusion detection system. Other systems that provide security are antivirus, firewall and risk analysis systems. Also, periodic computer security audits enable threat detection and prevention on computer networks.
There are two types of intrusion detection systems: knowledge based intrusion detection systems, also known as signature based intrusion detection systems, and behavior based intrusion detection systems, also known as anomaly intrusion detection systems. Behavior based intrusion detection systems detect and prevent intrusions based on deviations from an observed behavior pattern of the computer system for which the intrusion detection system has been built. These deviations represent threats on the system. Knowledge based intrusion detection systems detect intrusions by mapping system occurrences to a database of known threats. The entries in this database are known as threat signatures. Intrusion detection systems are also known as threat detection systems.
The first goal of this paper is to investigate techniques for representing a computer system’s normal usage with an abstract mathematical model. This model is referred to in this paper as a normal usage model.
It is hoped that the normal usage model will aid in analyzing activities and occurrences on a computer system that deviate from the system’s normal usage. This will help in detecting and preventing threats on the system. The second goal of this paper is to examine anomaly detection by analyzing changes in a system that deviate from the system’s normal usage. This research paper also investigates how to build a usage profile that can be used to identify anomalous activities on a computer system. The paper proposes a
  • 9. research model made up of a dependent variable and one or more independent variables that can be used for modelling the usage of a computer system. This research model is a regression based model. As such, simple linear or multiple linear regression can be used to develop the model. The research paper also uses a statistical model known as the mean and standard deviation model, which models the average usage of the system and its associated standard deviation. Finally, the paper uses a Markov chain model to model the various states of a computer system, their associated probabilities and the various state transitions. These three models are behavior models and together form the usage profile that this paper proposes. The paper also uses a Java interface for implementing the usage model, which describes a component of a computer system whose usage can be modelled using simple or multiple linear regression.
Problem Definition
If the normal functioning of a computer system can be represented by an abstract model, then any deviation from that abstract model can be used to analyze and detect threats in that system. The main problems this paper seeks to investigate are listed below.
• To represent the normal usage of a computer network with an abstract mathematical model.
• To investigate techniques for building a usage profile of a computer network.
• To determine activities and occurrences that deviate from a system’s normal usage and flag them as anomalous activities.
• To develop an anomaly intrusion detection system.
• To develop a risk analysis system.
• To develop a security audit framework made up of an anomaly intrusion detection system and a risk analysis system.
• To draft a document that details the operation and administration of the security audit framework.
In this paper, the abstract model of the system’s usage is known as a normal usage model and deviations from the system’s normal usage are known as threats.
Research Questions
The main questions to be investigated are listed below.
• What are the best and most efficient techniques for modelling a computer network’s normal usage?
  • 10.
• How can we build a usage profile of a computer network that will be adequate for detecting anomalous activities on the network?
• What are the best techniques for designing and implementing an anomaly intrusion detection system?
• What are the best techniques for designing and implementing a risk analysis system?
• What are the best techniques for designing and implementing a security audit framework?
• What procedures and processes must be followed in the operation and administration of a security audit framework?
Objectives
The main objectives of this paper are as follows.
• Represent a computer network’s normal functioning with an abstract model.
• Build a usage profile of a computer network.
• Detect activities and occurrences that deviate from the normal usage of a computer network and flag them as anomalous activities on the network.
• Design and implement an Anomaly Intrusion Detection System.
• Design and implement a Risk Analysis System.
• Design and implement a Security Audit Framework.
• Draft a document that details the procedures, processes and guidelines that must be followed in the operation and administration of a security audit framework.
Behavior Models
It is hoped that the abstract representation of a system’s normal usage will capture the entire behavior of the system. Such models are known as behavior models. As such, the threat detection system this paper seeks to explore is expected to be a behavior based threat detection system. Examples of behavior models that this paper seeks to explore are statistical models, cognitive based models, machine learning based models, user intention based models, and computer immunology based models. These models are associated with the development of anomaly-based intrusion detection systems.
  • 11. System Threats
Our intended threat analysis and detection is expected to produce three types of system logs: system errors, system threats and usage rates, each categorized by the magnitude and characteristics of an instance of the threat model. These logs must be audited by a security expert, who analyzes changes in the computer system that fit or deviate from the current usage model in order to project a more appropriate instance of the usage model that will remain functional and suitable in the future.
Boolean Calculus
It is expected that, using Boolean algebra and the calculus of Boolean functions, the normal usage model can be given a hardware representation. How to implement this hardware representation can be researched using the same tools. These concepts are related to concepts from computer organization and architecture, such as logic gates, multipliers and the design of arithmetic and logic units, and to concepts from embedded systems, such as the architectures of various embedded system implementations. These architectures include hardware-only implementations and hardware/software implementations.
Micro Usage Models
Micro usage models are sub-models of our normal usage model. They are modelled using the same research model. Examples of micro usage models that this paper explores are the Device Usage Model, Host Usage Model, Server Usage Model, Authentication Usage Model, Session Usage Model, CPU Usage Model, Memory Usage Model, Port Usage Model and Network Usage Model. These micro usage models are expected to derive their mathematical representation from variables sampled from an audit trail analysis, and to be components of a usage profile developed for a computer network.
Properties of Intrusion Detection Systems
There are special properties of intrusion detection systems that make them effective and efficient at detecting and preventing threats.
Examples of these properties are correctness, promptness and ease of use. Correctness refers to how well the intrusion detection system detects threats. This is important because correctness affects the rate at which predicted threats turn out to be true or false. Promptness is related to the time it takes to detect and halt a threat, and ease of use is related to the
  • 12. property of the intrusion detection system aiding convenient use of the computer network or system for which it was developed.
Research Model and Methodology
The research model of this paper investigates threat detection by applying calculus, Boolean algebra, machine learning and statistical models. These fields of study are mainly related to discrete mathematics, computer science, operational research, linear and nonlinear programming, regression analysis and data mining. The research model is inspired by linear and nonlinear regression. The methodology for threat detection is inspired by linear and nonlinear programming and calculus. Some computer science topics that inspire the threat detection parts of this research are multithreading, architectures for embedded system design and implementation, and concepts from computer organization and architecture such as the implementation of arithmetic and logic units.
Statistical and Machine Learning Models
Statistical models are mathematical models that can be used in the development of intrusion detection systems. Examples of statistical models are mean and standard deviation models, univariate models, and time series models. Machine learning techniques are also used to build intrusion detection systems. These techniques have special models or structures that aid the development of intrusion detection systems. Machine learning models include neural networks, Bayesian networks, hidden Markov models and genetic algorithms.
Cognitive Based, User Intention Based and Computer Immunology Based Models
Besides the statistical and machine learning based models that can be used for developing anomaly based intrusion detection systems, there are cognitive based models that are used to develop anomaly intrusion detection systems.
  • 13. Literature Review
This section reviews the major topics that constitute this research paper and work done in some of these areas. The topics considered include intrusion detection systems, since any discussion or study of threats and the detection of their sources is centered on intrusion detection systems. Behavior encryption is another computer security field that will be discussed in detail, since it adds much value to the information hiding parts of this research. Risk analysis will also be reviewed to sum up what constitutes risk analysis. Finally, there will be a review of normal usage models.
Intrusion Detection Systems
There are two types of intrusion detection systems in the industry, distinguished by the approach used for threat detection and the technologies used to build the system. These are knowledge based, also known as signature based, and behavior based intrusion detection systems. Each takes a different approach to threat detection, each is built with different technology, and each has its pros and cons.
Knowledge based intrusion detection systems are built on a database of already known threats. These known vulnerabilities or threats are called threat signatures. Usually, detection is done by directly mapping system incidents that indicate threats to threat signatures. As a result, the database of threats must be constantly updated with newly identified threats. Because a new threat must first be identified before it can be included in the database, the correctness of threat detection is sometimes compromised: threats that do not have corresponding signatures cannot be mapped and detected. However, these types of intrusion detection systems raise fewer false alarms, since each detected threat is registered in the database of threat signatures.
Behavior based intrusion detection systems take a different approach to threat detection.
They are built using artificial intelligence technologies. Usually, the behavior of the system for which the intrusion detection system is built is modelled, and deviations from that behavior are used to detect threats. Because of this, they have better correctness at detecting threats. No threat signatures or mappings of incidents that indicate threats are required. However, they raise more false alarms because detected threats are not mapped to a database of known threats.
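The trade-off described above between correctness and false alarms can be made concrete with two ratios: the share of real threats that are detected and the share of normal events that raise an alarm. The sketch below is illustrative only. The class name DetectionMetrics, its fields and the idea of counting labelled outcomes are assumptions introduced here and are not part of the paper.

```java
// Illustrative sketch: DetectionMetrics and its counts are hypothetical,
// not taken from the paper.
final class DetectionMetrics {
    private final int truePositives;   // real threats that were flagged
    private final int falsePositives;  // normal events wrongly flagged (false alarms)
    private final int trueNegatives;   // normal events correctly left alone
    private final int falseNegatives;  // real threats that were missed

    DetectionMetrics(int truePositives, int falsePositives,
                     int trueNegatives, int falseNegatives) {
        this.truePositives = truePositives;
        this.falsePositives = falsePositives;
        this.trueNegatives = trueNegatives;
        this.falseNegatives = falseNegatives;
    }

    /** Fraction of real threats that were detected (0.0 if there were none). */
    double detectionRate() {
        int actualThreats = truePositives + falseNegatives;
        return actualThreats == 0 ? 0.0 : (double) truePositives / actualThreats;
    }

    /** Fraction of normal events that raised a false alarm (0.0 if there were none). */
    double falseAlarmRate() {
        int normalEvents = falsePositives + trueNegatives;
        return normalEvents == 0 ? 0.0 : (double) falsePositives / normalEvents;
    }
}
```

Comparing these two ratios for a signature based and a behavior based detector on the same labelled events is one possible way to quantify the difference described in this section.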
  • 14. Besides these, intrusion detection systems are classified by the purpose for which they are built and by how actively or passively they deal with threats. There are host based and network based intrusion detection systems built for such purposes. Active intrusion detection systems are configured to block or prevent attacks, while passive intrusion detection systems are configured to monitor, detect and raise alerts on threats.
Anomaly Detection Systems
According to a research paper entitled “Design and Implementation of Anomaly Detection System”, there are global variables of a network that can be used for detecting anomalous activities on the network. The paper used a hybrid of signature based and anomaly intrusion detection to detect anomalies. According to the paper, the techniques used for detecting intrusions include using generic network rules to detect network anomalies. The paper also used dynamic network knowledge, such as network statistics, to detect anomalous activities.
Behavior Encryption
Behavior algorithms are applied to safeguard information on computing devices such as mobile phones and laptops. These algorithms are the basis for building systems that study and encrypt user behavior on a computing device in order to secure the information on that device. A study into mobile platform security reports that behavior encryption application systems have been designed and built with a focus on mobile platforms. Results from this study indicated that such encryption application systems are effective in ensuring mobile platform security. In addition, since mobile devices can be secured through behavior encryption systems, the behavior of a host on a network, or of network systems, can also be encrypted to ensure safe communication, because each host or user on a system or network has a particular behavior pattern.
A cryptographic study into encrypting the normal usage model can fall under behavior encryption, since the usage model represents a system’s behavior and can include a user’s behavior. This can aid in securing the information that embodies the usage model. It is also necessary because, if the usage model can easily be predicted, it becomes possible to manipulate the usage model and launch an attack.
  • 15. Risk Analysis
Computer risk analysis is also called risk assessment. It involves the process of analyzing and interpreting risk. To analyze risk, the scope and methodology have to be determined first. Information is then collected and analyzed before the risk analysis results are interpreted. Determining the scope means identifying the system to be analyzed for risk and the parts of the system that will be considered. The analytical method to be used, with its level of detail and formality, must also be planned. The boundary, scope and methodology used during risk assessment determine the total amount of work effort needed in risk management, and the type and usefulness of the assessment’s results. Risk has many components, including assets, threats, likelihood of threat occurrence, vulnerability, safeguards and consequences.
Risk management includes risk acceptance, which takes place after several risk analyses. Normally, after risk has been analyzed and safeguards implemented, the remaining or residual risk that the system carries while remaining functional must be accepted by management. This may be due to constraints on the system such as ease of use, or to features of the system for which strict safeguards would cause the organization operational problems. As such, risk acceptance, like the selection of safeguards, should take into account various factors besides those addressed in the risk assessment. In addition, risk acceptance should take into account the limitations of the risk assessment.
Information Security Awareness and Practices
A paper entitled “A study of information security awareness and practices in Saudi Arabia” discusses information security awareness and practices in that country. The paper emphasizes the fact that information is under constant threat from cyber vandals.
However, Saudi Arabia is rated poorly in terms of information security, which the paper attributes to the country’s highly suppressed, patriarchal and tribal culture. The paper examined the level of information security awareness among the general public in the country using an anonymous online survey based on instruments produced by the Malaysian Security Organization. In all, 633 persons responded to the survey, and the analysis confirmed that information security awareness is low in the country and that this is mostly related to the highly suppressed, patriarchal and tribal nature of the country.
  • 16. Protocol for Mitigating Risks on Social Networking Sites
According to an academic paper entitled “Protocol for mitigating the risk of hijacking social networking sites”, hackers can hijack a user’s session on social networking sites, impersonate the victim and take over the session. The paper deals with this risk by presenting a security authentication protocol for mitigating it. The protocol takes into account that users of social networking sites connect to the sites using several platforms and connection speeds. To cater for mobile devices and tablets using a Wi-Fi connection, a novel Self-Configuring Repeatable Hash Chains (SCRHC) protocol was developed to prevent the hijacking of session cookies. This protocol supports three levels of caching, making it possible to trade storage space for enhanced performance and reduced workload.
Behavior/Anomaly Based Intrusion Detection
Behavior models are used to detect intrusions in computer systems. This section reviews the behavior models that can be used to build behavior based intrusion detection systems. These models fall into several categories: statistical models, machine learning based techniques, cognitive models, computer immunology models and user intention models. Statistical models include the operational or threshold metric model, the Markov process or marker model, the multivariate model, the statistical moments model, time series models and univariate models. Machine learning based models include Bayesian networks, genetic algorithms, neural networks, fuzzy logic and outlier detection. Cognitive models include finite state machines, description scripts and expert systems.
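As an illustration of the Markov process model listed above, the following sketch estimates state transition probabilities from an observed sequence of system states and flags transitions that were rare or never seen during training. It is a plain, fully observable Markov chain rather than a hidden Markov model, and the class name MarkovUsageModel, the integer state labels and the probability threshold are all assumptions made for this example.

```java
// Illustrative sketch: the class name, the integer state labels and the
// probability threshold are assumptions made for this example.
final class MarkovUsageModel {
    // transition[from][to] holds the estimated probability of moving from one state to another
    private final double[][] transition;

    /** Estimates transition probabilities from an observed sequence of states (0..stateCount-1). */
    MarkovUsageModel(int[] observedStates, int stateCount) {
        int[][] counts = new int[stateCount][stateCount];
        for (int i = 0; i + 1 < observedStates.length; i++) {
            counts[observedStates[i]][observedStates[i + 1]]++;
        }
        transition = new double[stateCount][stateCount];
        for (int from = 0; from < stateCount; from++) {
            int total = 0;
            for (int to = 0; to < stateCount; to++) {
                total += counts[from][to];
            }
            if (total == 0) {
                continue; // no transition out of this state was observed: row stays all zeros
            }
            for (int to = 0; to < stateCount; to++) {
                transition[from][to] = (double) counts[from][to] / total;
            }
        }
    }

    /** Estimated probability of the transition from -> to. */
    double probability(int from, int to) {
        return transition[from][to];
    }

    /** A transition is treated as anomalous when its estimated probability is below the threshold. */
    boolean isAnomalous(int from, int to, double minProbability) {
        return transition[from][to] < minProbability;
    }
}
```

For instance, with hypothetical labels 0 = idle, 1 = active and 2 = overloaded, a model trained on the sequence 0, 1, 0, 1, 0, 2 never sees the transition from active to overloaded, so isAnomalous(1, 2, 0.05) reports it as unusual.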
  • 17. Research Model and Methodology
Research Model
Assume that the normal usage (Y) of a computer network can be represented by a mathematical function Y = f(Xi, Ci), where Xi represents system variables such as the number of functions or the number of authentications, and Ci represents system constants such as the maximum or minimum number of authentications. When a change in Y exceeds the standard deviation determined from the data set of our usage, that change indicates a threat. To investigate this threat, machine learning algorithms, mathematical functions and behavior based intrusion detection systems will be studied to determine Y in terms of a number of variables that represent Y appropriately. The expected usage model of the network to be investigated includes the following components: Host Usage Model, Server Usage Model, Device Usage Model, Port Usage Model, Network Usage Model, Session Usage Model, Authentication Usage Model, Memory Usage Model, CPU Usage Model, Battery Usage Model and Program Usage Model. These components are expected to be derived from the variables listed below.
• Average number of application software programs that run on the mobile system while the system is in use.
• Average number of system processes that run on the mobile system while the system is in use.
• Average number of authentications in the mobile system.
• Average number of user actions that happen on the mobile system.
• Average time a user spends before his session expires.
• Average time the mobile facility or resource functions each day.
• Number of paired ports communicating on the network.
• Average amount of memory space used on devices while the network is being operated.
• Average CPU time spent on a single device on the network.
• Average life span of a single device battery on the network.
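The rule stated in the research model, that a change in Y beyond the standard deviation of the usage data indicates a threat, can be sketched as follows. The class name UsageBand, the use of the sample (n - 1) standard deviation and the example values are assumptions made for illustration; the paper does not specify them.

```java
// Illustrative sketch of the mean and standard deviation check.
// UsageBand and the choice of the sample (n - 1) standard deviation are assumptions.
final class UsageBand {
    private final double mean;
    private final double stdDev;

    /** Learns the average usage and its standard deviation from past usage values. */
    UsageBand(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("at least two usage samples are required");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        mean = sum / samples.length;
        double squaredDeviations = 0.0;
        for (double s : samples) {
            squaredDeviations += (s - mean) * (s - mean);
        }
        stdDev = Math.sqrt(squaredDeviations / (samples.length - 1));
    }

    double mean() {
        return mean;
    }

    double stdDev() {
        return stdDev;
    }

    /** True when the observed usage lies more than one standard deviation from the mean. */
    boolean isThreat(double observedUsage) {
        return Math.abs(observedUsage - mean) > stdDev;
    }
}
```

With hypothetical daily authentication counts of 10, 12, 11, 13 and 14, the learned mean is 12 and the sample standard deviation is about 1.58, so an observed count of 13 falls inside the acceptable range while a count of 20 is flagged.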
USAGE MODEL: JAVA INTERFACE THAT IMPLEMENTS THE RESEARCH MODEL
For each component of a computer system under investigation, we will program a usage model, which is an implementation of the research model for that component, which forms part of the
  • 18. computer system under investigation. Each usage model implements an interface captured in a Java file called model.java. There are eight functions in the model.java interface. The first, computeval, computes the usage value at an instance. The second, findchange, finds changes in the usage of the computer system. The third, learnsys, learns the usage of the system. The fourth, findrelationship, finds the regression equation. The fifth, monitor, monitors the usage of the system. The sixth, showalarm, displays error messages and detected intrusions. The seventh, haltprocess, halts a detected intrusion. The eighth, predictvals, predicts usage values based on the regression equation determined. A concrete class that omits an implementation of any of these functions will not compile. To implement the usage model, use the Java keyword implements. Below is the content of the model.java file.
USAGE MODEL FILE
public interface model {
    public double computeval();            // usage value at an instance
    public double findchange();            // change in usage
    public void learnsys(int t);           // learn the usage of the system
    public Object findrelationship();      // regression equation
    public void monitor(int t);            // monitor the usage of the system
    public void showalarm(String info);    // display errors and detected intrusions
    public void haltprocess();             // halt a detected intrusion
    public void predictvals();             // predict usage values from the regression equation
}
IMPLEMENTING THE USAGE MODEL FOR AN AUTHENTICATION SYSTEM
class auth_usage implements model {
    /* variable declarations for dependent and independent variables */
    public double computeval() { return 0.0; }        // placeholder body
    public double findchange() { return 0.0; }        // placeholder body
    public void learnsys(int t) { }
    public Object findrelationship() { return null; } // placeholder body
    public void monitor(int t) { }
    public void showalarm(String info) { }
    public void haltprocess() { }
    public void predictvals() { }
}
  • 19. Methodology
The list below details the activities or processes that will be followed to represent a computer system with an abstract mathematical model and analyze changes in that system. It is hoped that following these processes will lead to the design and implementation of a normal usage model, a threat detection system and a mobile security audit framework.
• Machine Learning Algorithms & Behavior Based Intrusion Systems: Investigate machine learning algorithms, mathematical functions, and behavior based intrusion detection systems in order to determine the extent to which the normal usage of a mobile system can be represented by the research model.
• Audit trails: Analyze audit trails in order to formulate a set of independent and dependent variables and their associated data set that will help in modelling the usage of a mobile system.
• Normal Usage Model: Apply the knowledge gained from the study of machine learning algorithms and behavior based intrusion detection systems, and from the audit trail analysis, to model and represent the normal usage of a mobile system such as a smart phone, laptop or wireless network.
• Threat Modelling: Study differential equations of the normal usage model and their applications in order to model, detect and prevent threats.
• Boolean Calculus: Apply Boolean algebra and the calculus of Boolean functions to design and implement the hardware and software that make up the Normal Usage and Threat Detection Systems.
• Experimenting Usage and Threat Models: Use programming as a tool to experiment with representations of the normal usage and threat models, to aid the design and implementation of a mobile security audit framework.
• Computer Usage Survey: Employ a questionnaire to collect information about the usage of computers and mobile phones.
• Threat Detection Systems: Develop an anomaly based threat detection system to demonstrate the effectiveness of the research model. The goal is to measure how effective the threat detection system is at preventing threats on a computer system.

Machine Learning Algorithms & Behavior Based Intrusion Systems

Machine learning techniques and algorithms will be investigated to determine the extent to which an expert system that learns a computer system's usage can be built. Since the expected usage model is a mathematical model, various mathematical modelling techniques will be applied to determine the normal usage model. Analyzing deviations from these mathematical models can lead to the design and implementation of behavior based intrusion detection systems. As such, a thorough study of the design and implementation of behavior based intrusion detection systems will be carried out.

Audit Trail Analysis

It is expected that computer security audit reports will be sampled and analyzed to arrive at a set of dependent and independent variables and their data set. These variables and their associated data set can be used to formulate the normal usage model.

Normal Usage Model

The knowledge gained from the machine learning study, the mathematical modelling study, the behavior based intrusion detection system study and the audit trail analysis will be applied in a further investigation. It is hoped that this will answer the question of how the normal functioning of a computer system can be represented with an abstract mathematical model.

Threat Modelling

Differential equations of the normal usage model will be investigated to determine the extent to which deviations from the normal usage model can be analyzed. An abstract mathematical model of these deviations will be formulated. These abstract models are derivatives of the normal usage model.

Boolean Calculus

A study into representing the normal usage model with a Boolean function will be done. It is hoped that analyzing these Boolean functions will aid in building the hardware for the expected usage system. Differential equations of these Boolean functions will be studied to analyze changes in the system that indicate deviation from the normal usage model.

Experimenting Usage and Threat Models

Programming will be used as a tool to experiment with various usage and threat models. These usage and threat models are expected to be derived from a computer system. This experiment will lead to the design and implementation of a normal usage system, a threat detection system and a risk analysis system. These systems are expected to be components of a mobile security audit framework.

Computer Usage Survey

A questionnaire for obtaining information about computer and smart phone usage will be employed. It is expected that this will give an idea of the various statistics that make up a computer's or smart phone's usage. These statistics will be a guideline for sampling experimental data on a computer system's usage while experimenting with the usage and threat models.

Threat Detection Systems

It is hoped that an anomaly based threat detection system will be developed to demonstrate how effectively the research model can be used to model system usage and threats. The effectiveness of the threat detection system at preventing threats on a computer system will also be measured. In this project, the threat detection system that will be developed is for ecommerce sites.

Threats Associated with Computer Systems

This chapter discusses some of the threats and attacks associated with computer and network systems.

Attacks associated with a computer system

The attack types that will be discussed include Malicious Code, IP Scan and Attack, Web Browsing, Virus, Unprotected Shares, Mass Emails, Simple Network Management Protocol (SNMP), Hoaxes, Backdoors, Password Crack, Brute Force, Dictionary, Denial of Service (DoS) and Distributed Denial of Service (DDoS), Spoofing, Man in the Middle, Spam, Mail Bombing, Sniffers, Social Engineering, Buffer Overflow and Timing Attack.

Malicious Code

A malicious code attack includes the execution of viruses, worms, Trojan horses and active Web scripts with the intent to destroy or steal information. The state of the art in malicious code is the polymorphic, or multivector, worm. These attack programs use up to six attack vectors to exploit a variety of vulnerabilities in commonly found information system devices. Perhaps the best illustration of such an attack remains the outbreak of Nimda in September 2001, which used five of the six vectors to spread with startling speed. TruSecure Corporation, an industry source for information security statistics and solutions, reports that Nimda spread to span the Internet address space of 14 countries in less than 25 minutes.

IP Scan and Attack

The infected system scans a random or local range of IP addresses and targets any of several vulnerabilities known to hackers or left over from previous exploits such as Code Red, Back Orifice or PoizonBox.

Web Browsing

If the infected system has write access to any Web pages, it makes all Web content files (.html, .asp, .cgi and others) infectious, so that users who browse to those pages become infected.

Virus

Each infected machine infects certain common executable or script files on all computers to which it can write, using virus code that can cause infection.

Unprotected Shares

Using vulnerabilities in file systems and the way many organizations configure them, the infected machine copies the viral components to all locations it can reach.

Mass Emails

By sending email infections to addresses found in the address book, the infected machine infects many users whose mail reading programs automatically run the program and infect other systems.

Simple Network Management Protocol (SNMP)

By using the widely known and common passwords that were employed in early versions of this protocol (which is used for remote management of network and computer devices), the attacking program can gain control of the device. Most vendors have closed these vulnerabilities with software upgrades.

Hoaxes

A more devious approach to attacking computer systems is the transmission of a virus hoax with a real virus attached. When the attack is masked in a seemingly legitimate message, unsuspecting users readily distribute it. Even though those users are trying to do the right thing to avoid infection, they end up sending the attack on to their coworkers and friends and infecting many users along the way.

Backdoors

Using a known or a previously unknown and newly discovered access mechanism, an attacker can gain access to a system or network resource through a back door. Sometimes these entries are left behind by system designers or maintenance staff, and are thus referred to as trap doors. A trap door is hard to detect because, very often, the programmer who puts it in place also makes the access exempt from the usual audit logging features of the system.

Password Crack

Attempting to reverse-calculate a password is often called cracking. A cracking attack is a component of many dictionary attacks. It is used when a copy of the Security Account Manager (SAM) data file can be obtained. The SAM file contains hashed representations of users' passwords. A candidate password can be hashed using the same algorithm and compared with the stored hash. If the two are the same, the password has been cracked.

Brute Force

The application of computing and network resources to try every possible combination of options for a password is called a brute force attack. Since this is often an attempt to repeatedly guess passwords to commonly used accounts, it is sometimes called a password attack. If attackers can narrow the field of accounts to be attacked, they can devote more time and resources to attacking fewer accounts. That is one reason why a recommended practice is to change the names of common accounts from the manufacturer's default. While often effective against low-security systems, password attacks are often not useful against systems that have adopted the usual security practices recommended by manufacturers.

Dictionary

This is another form of brute force attack. The dictionary attack narrows the field by selecting specific accounts to attack and uses a list of commonly used passwords (the dictionary) instead of random combinations. Organizations can use similar dictionaries to disallow passwords during the reset process and thus guard against easy-to-guess passwords. In addition, rules requiring additional numbers and/or special characters make the dictionary attack less effective.

Denial of Service (DoS) and Distributed Denial of Service (DDoS)

In a denial of service attack, the attacker sends a large number of connection or information requests to a target. So many requests are made that the target system cannot handle them successfully along with legitimate requests for service.
This may result in the system crashing or simply becoming unable to perform ordinary functions. A distributed denial of service is an attack in which a coordinated stream of requests is launched against a target from many locations at the same time. Most DDoS attacks are preceded by a preparation phase in which many systems, perhaps thousands, are compromised. The compromised machines are turned into zombies, machines that are directed remotely (usually by a transmitted command) by the attacker to participate in the attack. DDoS attacks are the most difficult to defend against, and there are presently no controls that any single organization can apply. There are, however, some cooperative efforts to enable DDoS defenses among groups of service providers; among them is the Consensus Roadmap for Defeating Distributed Denial of Service Attacks.

Spoofing

Spoofing is a technique used to gain unauthorized access to computers, wherein the intruder sends messages to a computer with an IP address that indicates that the messages are coming from a trusted host. To engage in IP spoofing, a hacker must first use a variety of techniques to find the IP address of a trusted host and then modify the packet headers so that it appears that the packets are coming from that host. Newer routers and firewall arrangements can offer protection against IP spoofing.

Man in the Middle

In the well-known man-in-the-middle or TCP hijacking attack, an attacker monitors (or sniffs) packets from the network, modifies them and inserts them back into the network. This type of attack uses IP spoofing to enable an attacker to impersonate another entity on the network. It allows the attacker to eavesdrop as well as to change, delete, reroute, add, forge or divert data. In a variant of TCP session hijacking, the spoofing involves the interception of an encryption key exchange, which enables the hacker to act as an invisible man in the middle (that is, an eavesdropper) with regard to encrypted communications.

Spam

Spam is unsolicited commercial email. While many consider spam a trivial nuisance rather than an attack, it has been used as a means to make malicious code attacks more effective. In March 2002, reports emerged of malicious code embedded in MP3 files that were included as attachments to spam.
The most significant consequence of spam for the modern organization, however, is the waste of both computer and human resources caused by the flow of unwanted electronic mail. Many organizations attempt to cope with the flood of spam by using filtering technologies to stem the flow. Other organizations tell the users of the mail system to delete unwanted messages.

Mail Bombing

Another form of e-mail attack that is also a DoS is the mail bomb, in which an attacker routes large quantities of e-mail to the target. This can be accomplished through social engineering or by exploiting various technical flaws in the Simple Mail Transfer Protocol (SMTP). The target of the attack receives unmanageably large volumes of unsolicited e-mail. By sending large e-mails with forged header information, attackers can take advantage of poorly configured e-mail systems on the Internet.

Mathematical Modelling Techniques and Machine Learning Based Models

The mathematical relation that represents the normal usage model can be determined using regression analysis. Regression analysis is a field of statistics. It employs the least squares method to determine the relationship within a data set composed of two or more variables. The least squares method determines the relationship by minimizing the sum of the squared errors of the derived relation.

Simple Linear Regression

Simple linear regression problems involve a dependent variable and a single independent variable. The goal is to find a linear relationship between the two variables. The linear relationships are of the form y = b0 + b1x, where y is the dependent variable and x is the independent variable. The slope of the line is b1 and the y-intercept is b0.

The relationship between the dependent and independent variables can be derived using the least squares method. Let n be the sample size. First, the sum of the independent variable (Σx), the sum of the dependent variable (Σy) and the sum of their products (Σxy) must be calculated. Secondly, the sums of the squares of the independent and dependent variables (Σx^2 and Σy^2) must be calculated. The slope, the y-intercept and the correlation coefficient r of the fitted line are then:

b1 = (n Σxy − Σx Σy) / (n Σx^2 − (Σx)^2)
b0 = (Σy Σx^2 − Σx Σxy) / (n Σx^2 − (Σx)^2)
r = (n Σxy − Σx Σy) / sqrt( (n Σx^2 − (Σx)^2) (n Σy^2 − (Σy)^2) )

Multiple Linear Regression

Multiple linear regression problems involve a dependent variable and two or more independent variables. Using the least squares method, the goal is to find the linear relationship between the variables involved. The relationships are of the form y = b0 + b1x1 + b2x2 + … + bnxn, where n is the number of independent variables, x1, x2, …, xn are the independent variables and y is the dependent variable. To solve multiple linear problems, the approach taken here first reduces the multiple linear model to its simple linear forms, in which it is easier to determine the regression equation. To do this, y = b0 + b1x is determined for every independent variable, so that the regression coefficients associated with the independent variables can be found using the least squares method. The set b, made up of b1, b2, …, bn, then contains all the regression coefficients of the predicted regression function. Note that fitting each variable separately gives the same coefficients as a joint least squares fit only when the independent variables are uncorrelated with one another; otherwise the coefficients must be solved for jointly.

Non Linear Regression

Non linear regression problems involve finding a non linear relationship between a dependent variable and one or more independent variables. Because non linear relationships are difficult to analyze directly, some of them can be rewritten as linear models before they are analyzed. This makes it possible to use linear regression techniques on such relationships. One way to represent a non linear relationship with a linear model is to take logs on both sides of the relationship equation. An example is y^2 = x^2/(xy). Taking logs on both sides gives 2 log y = 2 log x − log x − log y, so 3 log y = log x and therefore log y = (log x)/3. In this form, the log y term represents the dependent variable and the log x term represents the independent variable. Let K = log y and let P = log x. It follows that K = P/3, which is the linear form of the non linear relation.
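
The least squares quantities described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name LeastSquares and the methods fit and fitLogLog are hypothetical names introduced here, and the sample data in main is invented purely to exercise the formulas.

```java
import java.util.Arrays;

/** Hypothetical helper (not from the paper): fits y = b0 + b1*x by least squares. */
final class LeastSquares {

    /** Returns {b0, b1, r}: y-intercept, slope and correlation coefficient. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired samples");
        }
        int n = x.length;
        double sx = 0, sy = 0, sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];           // sum of the independent variable
            sy += y[i];           // sum of the dependent variable
            sxy += x[i] * y[i];   // sum of products
            sxx += x[i] * x[i];   // sum of squares of x
            syy += y[i] * y[i];   // sum of squares of y
        }
        double dx = n * sxx - sx * sx;
        double dy = n * syy - sy * sy;
        if (dx == 0) {
            throw new IllegalArgumentException("all x values are equal");
        }
        double b1 = (n * sxy - sx * sy) / dx;
        double b0 = (sy * sxx - sx * sxy) / dx;
        // r is NaN when all y values are equal (dy == 0).
        double r = (n * sxy - sx * sy) / Math.sqrt(dx * dy);
        return new double[] {b0, b1, r};
    }

    /** Fits log10(y) = b0 + b1*log10(x); inputs must be strictly positive. */
    static double[] fitLogLog(double[] x, double[] y) {
        double[] logX = Arrays.stream(x).map(Math::log10).toArray();
        double[] logY = Arrays.stream(y).map(Math::log10).toArray();
        return fit(logX, logY);
    }

    public static void main(String[] args) {
        // Invented data lying exactly on y = 2x + 3.
        double[] line = fit(new double[] {1, 2, 3, 4}, new double[] {5, 7, 9, 11});
        System.out.printf("b0=%.3f b1=%.3f r=%.3f%n", line[0], line[1], line[2]);

        // Invented data lying on y = x^(1/3), i.e. log y = (log x)/3.
        double[] curve = fitLogLog(new double[] {1, 8, 27, 64}, new double[] {1, 2, 3, 4});
        System.out.printf("log-log slope=%.3f%n", curve[1]);
    }
}
```

The log-log fit mirrors the K = P/3 example: data generated from the non linear relation should yield a slope close to one third once both variables are replaced by their logarithms.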

Machine Learning Based Models Used for Developing Anomaly Based Intrusion Detection Systems

This section discusses how hidden Markov models can be used to detect and prevent threats on a computer system.

Application of hidden Markov models to detect threats and other critical occurrences in a system

Hidden Markov models are machine learning models that are used to model the states of a system, the sequence in which they occur and the probability associated with each state transition. When a system has a set of states in which it usually falls, and it can be established that each new state depends on the previous states, hidden Markov models can be used to learn the state transitions that usually happen in the system. The sequence in which states occur must be characterizable by a parametric random process. Also, the probability associated with each state transition is independent of the time at which the transition occurs.

For computer systems whose occurrences follow a parametric random process, these occurrences can be seen as the set of states of the system. Such occurrences may include the point at which the system is at its optimal usage and the point at which a particular threat occurs. Once the set of threat types that happen in the system is determined, hidden Markov models make it possible to study the sequence in which these threats occur and the transitions between them. The various usage points, including the optimal, minimum and average usage, and how the system moves between them can be studied in the same way.

Because various occurrences and threats can be studied using hidden Markov models, it becomes possible to predict the next occurrence or threat that will happen on a host or a computer network. Threat sources can also be predicted using threat models.
When threat models are integrated, they give a general idea of the source of the threat. With such knowledge, the threat or occurrence most likely to happen next on a host or network can be predicted by applying hidden Markov models. As such, occurrences can be prevented if they are estimated to be disastrous. Also, if for some reason the optimal or minimal usage must be reached, it becomes possible to study ways of optimizing the transition from the current state or predicted next state to the required state. This makes it possible to move from a particular usage point to the desired usage point.

This approach to threat detection and usage optimization makes it possible to build anomaly based intrusion detection systems that are correct and prompt and that increase optimal use of the system. The systems built using these techniques are expected to be correct because the threat models come from usage models built using similar approaches, and the threat prediction and prevention mechanisms are designed with the same techniques. There are also likely to be fewer false alarms, since the threats predicted on hosts or networks come from threat models designed with these methods.

An example of a cyber security threat that this approach can be used to model is a network problem in which a student is determined or predicted to be sending threatening or socially unacceptable emails to colleagues. Typically, the sender's identity is hidden on the network from which the emails are sent. As such, it is difficult to determine the likelihood that such an email will be sent on a particular day or hour, and therefore difficult to identify the sender. Using hidden Markov models, a usage model of the email system could be developed that makes it possible to determine the day or hour in which such an email is likely to be sent. This will help to identify the sender and hold them accountable.
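
The idea of learning state transitions and predicting the next likely state can be illustrated with a small Java sketch. It deliberately simplifies the technique: it estimates a first-order Markov chain over directly observed states, whereas a full hidden Markov model would also need hidden states, emission probabilities and a training procedure such as Baum-Welch. The class name StateTransitionModel, its methods, the state encoding and the sample sequence are all hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: transition probabilities of a first-order Markov chain. */
final class StateTransitionModel {
    private final double[][] prob;

    /** Learns from an observed sequence of states encoded as 0..numStates-1. */
    StateTransitionModel(int[] sequence, int numStates) {
        int[][] counts = new int[numStates][numStates];
        for (int i = 0; i + 1 < sequence.length; i++) {
            counts[sequence[i]][sequence[i + 1]]++;
        }
        prob = new double[numStates][numStates];
        for (int from = 0; from < numStates; from++) {
            int total = Arrays.stream(counts[from]).sum();
            for (int to = 0; to < numStates; to++) {
                // States never seen as a source keep probability 0 for every target.
                prob[from][to] = total == 0 ? 0.0 : (double) counts[from][to] / total;
            }
        }
    }

    /** Estimated probability of moving from one state to another. */
    double probability(int from, int to) {
        return prob[from][to];
    }

    /** Most likely next state; ties resolve to the lowest state index. */
    int mostLikelyNext(int from) {
        int best = 0;
        for (int to = 1; to < prob[from].length; to++) {
            if (prob[from][to] > prob[from][best]) {
                best = to;
            }
        }
        return best;
    }

    /** Flags a transition whose learned probability is below the given threshold. */
    boolean isAnomalous(int from, int to, double minProbability) {
        return prob[from][to] < minProbability;
    }

    public static void main(String[] args) {
        // Assumed encoding for illustration: 0 = minimum, 1 = average, 2 = optimal usage.
        int[] observed = {0, 1, 1, 2, 1, 0, 1, 1, 2, 1};
        StateTransitionModel model = new StateTransitionModel(observed, 3);
        System.out.println("P(1 -> 0) = " + model.probability(1, 0));
        System.out.println("next after 0: " + model.mostLikelyNext(0));
        System.out.println("0 -> 2 anomalous: " + model.isAnomalous(0, 2, 0.05));
    }
}
```

A transition that was never or rarely seen during learning, such as a jump straight from minimum to optimal usage in this invented sequence, is the kind of event the text describes as unusual behavior.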

The Normal Usage Model of a System

If the normal usage of a mobile system can be represented by a mathematical function made up of system variables Xi and system constants Ci, then any representation of our mobile system can be summarized as Y = f(Xi, Ci), where Y is the system's usage and Xi are the independent variables of the mobile system that constitute its normal usage model. A normal usage model is an abstract representation of the usual or normal functioning or behavior of a system.

In order to model the normal usage of our system and determine its mathematical representation, it is essential to keep the method simple and the variables simple in abstraction and minimal in number. This makes it easy to analyze, model and detect threats by applying the branch of calculus called differentiation. Simplicity and a minimal number of variables make it possible to arrive at a mathematical function whose differential coefficient can be easily computed. As such, two cases will be considered.

In the first case, the normal usage model of our system is analyzed and modelled based on simple but essential micro usage models. These micro usage models represent smaller components of our mobile system, such as its authentication system or a user's session. Ideally, each of these models is derived from exactly one most appropriate system variable when feasible, or at most two, in order to reduce the complexity involved in computing the differential coefficient of the usage model. For a mathematical function involving more than one independent variable, threat detection using differentiation falls within the scope of multivariable calculus. Since it is easy to compute the differential coefficient of a single variable function, threat analysis and detection is easiest if all our micro models are single variable functions.
In the second case, however, our usage model derives its mathematical representation from at least two or three of the most relevant system variables of the mobile system under examination. This option increases the complexity involved in calculating the differential coefficient of the normal usage model and analyzing the associated threat, because the normal usage model in this case is a function of two or more independent system variables. To do this type of differentiation, we use the branch of calculus called partial differentiation, in which all but one of the independent variables of the usage model are held constant to analyze changes in the usage. This type of differentiation is also within the scope of multivariable calculus.

The sections that follow the one below shed more light on how to model the normal usage of several micro usage models. These micro usage models are expected to be components of a computer network's usage. It must be noted that the usage model is made up of the usage model function and a statistical model that captures the mean and standard deviation of the predicted usage function. This statistical usage model is called the moments model, or mean and standard deviation model. Other statistical models could have been used, including time series models, univariate models and bivariate models.

Single Variable Calculus Review and its Applications

Assume a mobile system with exactly three major system variables. If sampling each of these variables gives exactly one micro usage model that best represents the behavior or functioning of that feature of the system, then we can use the derivatives of the three micro models to analyze and detect threats. Below are some examples of calculus basics for our threat modelling techniques.

Y = 2X + 3 is a linear function that represents our first micro usage model, where X is the number of authentications.
Y = 3X^2 + 2X + 6 is a quadratic function that represents our second micro usage model, where X is the number of hosts on the mobile system's wireless network.
Y = 40/X + 5 is a reciprocal (rational) function that represents our third micro usage model, where X is the number of applications on a host on the mobile system's wireless network.

For each micro usage model, the differential coefficient can be computed using the laws of differentiation given below.

Theorem 1: d/dX (C) = 0, where C is a constant.
Theorem 2 (power rule, applied term by term): d/dX (c X^n) = n c X^(n−1). To differentiate f(Xi, Ci), simplify it into a sum of terms, multiply the constant of each term by that term's exponent, reduce the exponent of the system variable Xi by one, and repeat this step until every term of f(Xi, Ci) has been evaluated. The final result is the sum of the derivatives of the individual terms.

From the calculus review above, the differential coefficients of the three micro models are 2, 6X + 2 and −40/X^2 respectively. If the standard deviations of our micro models are computed, then we can analyze changes in our system by looking at the values of our usage model and its derivatives and how they relate to the average usage, its corresponding standard deviation, and the acceptable thresholds for threats. Any occurrence at a point where the usage model value is not equal to the average usage indicates a threat. Any occurrence at a point where the usage model value is less than the average usage minus its corresponding standard deviation is a denial of service threat. Any occurrence at a point where the usage model value is greater than the average usage plus its corresponding standard deviation is an intrusion. Also, any occurrence at a point where the value of the usage model's derivative is not equal to the acceptable threshold for threats is a threat.

Usage Model List

Authentication Usage Model

The authentication usage model represents the usage of an authentication system. The independent variables that must be sampled to determine the usage of an authentication system are the average data transmitted during an authentication (x1) and the average network speed for a single authentication (x2). The average data transmitted is the average of the request and response data for a single authentication, and the average network speed is the average of the upload and download speeds for a single authentication. The dependent variable that must be sampled is the time taken for an authentication (y). The goal of modelling the dependent and independent variables is to arrive at a mathematical relationship between y and the two independent variables x1 and x2. Since transfer time grows with the amount of data and falls as the network speed rises, the relationship is expected to be Y = c1(x1/x2) + c2, where c1 and c2 are system constants. In addition, some system constants that will aid threat analysis must be determined. These are the total number of valid authentications, the expected authentications within a time frame, the minimum authentications within a time frame and the maximum authentications within a time frame.
The mathematical relationship between y, x1 and x2 is the normal usage model of the authentication system. After this relationship has been determined, occurrences that deviate from it can be used to analyze threats. For instance, any occurrence that is not equal to the average usage is a threat. Additionally, any occurrence that indicates a change outside an acceptable threshold is a threat. The acceptable threshold is a range within which changes in the system are deemed normal. Such a range is defined by the average usage and its standard deviation.
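
The mean and standard deviation rules stated above can be sketched in Java as follows. This is an illustrative reading of those rules rather than the paper's code: it treats the band from the average minus one standard deviation to the average plus one standard deviation as the acceptable range, uses the population standard deviation, and the class name UsageBand, the Verdict enum and the sample values are hypothetical.

```java
/** Hypothetical sketch: classifies a usage value against a mean +/- std-dev band. */
final class UsageBand {
    enum Verdict { NORMAL, DENIAL_OF_SERVICE, INTRUSION }

    private final double mean;
    private final double stdDev;

    /** Learns the average usage and its population standard deviation. */
    UsageBand(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        mean = sum / samples.length;
        double squaredDiffs = 0;
        for (double s : samples) {
            squaredDiffs += (s - mean) * (s - mean);
        }
        stdDev = Math.sqrt(squaredDiffs / samples.length);
    }

    /** Below the band: denial of service. Above the band: intrusion. */
    Verdict classify(double usage) {
        if (usage < mean - stdDev) {
            return Verdict.DENIAL_OF_SERVICE;
        }
        if (usage > mean + stdDev) {
            return Verdict.INTRUSION;
        }
        return Verdict.NORMAL;
    }

    public static void main(String[] args) {
        // Invented authentication times (seconds): mean 5, standard deviation 2.
        UsageBand band = new UsageBand(new double[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println(band.classify(5.5)); // inside the band
        System.out.println(band.classify(2.5)); // below mean - stdDev
        System.out.println(band.classify(8.0)); // above mean + stdDev
    }
}
```

In practice the usage values passed to classify would come from evaluating the fitted usage model function, and the width of the band is a tuning choice that the text leaves open.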

Session Usage Model

A session usage model represents a single user's behavior before the session expires. To determine the mathematical model for a user's session, two main independent variables must be sampled. These are the size of the session data accumulated (x1) and the number of user actions (x2). The dependent variable that must be sampled is the time spent before the session expires (y). The session usage model is expected to be made up of two micro usage models, whose mathematical representations are expected to be Y = c1x1 + c2 and Y = c1x2 + c2, where c1 and c2 are system constants determined separately for each micro model. In addition to the two mathematical functions, some system constants that will aid threat analysis must be determined. These include the average user actions, the average size of data accumulated and the average time spent. These constants can be determined from the data set used to determine the usage model. The two mathematical relationships represent the session usage model. Both are linear functions. It is expected that as user actions increase, the time spent also increases. It is also expected that as the data accumulated increases, the time spent also increases.

Memory Usage Model

The memory usage model represents the usage of memory space in a system. The independent variables that must be sampled are the number of application programs running (x1) and the number of system processes running (x2). The dependent variable that must be sampled is the amount of memory space being used (y). The mathematical relationship between x1, x2 and y is expected to be y = c1x1 + c2x2 + c3, where c1 is the average memory space for programs, c2 is the average memory space for processes and c3 is the average memory being used when no process or program is running. In addition to these, some system constants that aid threat analysis must be determined. These include the minimum and maximum memory space for programs and the minimum and maximum memory space for processes.
The mathematical relationship between x1, x2 and y is the memory usage model. When determined, the memory usage model can be used to analyze changes in the memory usage that indicate threats in the system.

CPU Usage Model

The CPU usage model represents CPU usage in a system. The independent variables that must be sampled are the number of application programs running (x1) and the number of system processes running (x2). The dependent variable that must be sampled is the amount of CPU power being used (y). The mathematical relationship between x1, x2 and y is expected to be y = c1x1 + c2x2 + c3, where c1 is the average CPU power being used for programs, c2 is the average CPU power being used for processes and c3 is the average CPU power being used when no process or program is running. In addition to these, some system constants that aid threat analysis must be determined. These include the minimum and maximum CPU power for programs and the minimum and maximum CPU power for processes. The mathematical relationship between x1, x2 and y is the CPU usage model. When determined, the CPU usage model can be used to analyze changes in the CPU usage that indicate threats in the system.

Program Usage Model

To determine the program usage model, the dependent and independent variables that must be sampled are the time spent using the program (y) and the number of functions used (x). In addition, two constants must be determined: the minimum functions used and the maximum functions used. The relationship between y and x determined after sampling various x and y values is the program usage model, denoted y = f(x).

Host Usage Model

The host usage model is composed of four independent variables: memory usage (x1), session usage (x2), CPU usage (x3) and program usage (x4), derived from their respective usage models. The dependent variable that must be sampled is the time spent on the host (y). Any relationship determined between the dependent and the independent variables is the host usage model. The resulting host usage model is denoted y = f(x1, x2, x3, x4).

Battery Usage Model

The battery usage model is made up of the average CPU usage, the average memory usage and the average usage of a session in the system. These are the independent variables. The dependent variable is the battery lifespan.
The independent variables are derived from their respective micro usage models.

Device Usage Model
The device usage model is made up of a battery usage model, a host usage model, and the time spent on the device. The usage models that make up the device usage model compute the average
micro usage and try to relate that with the time spent on the device. The time spent on the device is the dependent variable.

Server Usage Model
The server usage model is made up of the CPU time being used, the memory space being used, and the number of processes running. These variables form two different micro usage models. As such, there are two dependent variables: CPU time and memory space. The independent variable for both micro usage models is the number of processes running.

Port Usage Model
The port usage model is made up of the time elapsed during communication, the number of programs that use the port, and the number of paired ports. The number of paired ports is the dependent variable and the remaining variables are the independent variables.

Network Usage Model
The network usage model is made up of average port usage, average server usage, average host usage, the average size of data transmitted on the network, and the time spent on the network. The first three variables are the independent variables. The remaining two are the dependent variables. As such, two micro usage models make up the network usage model.

Aggressive Usage Detector
This model is a utility that detects aggressive behavior on a system. It is modelled just like the various micro usage models. The factors that determine aggressive behavior during system usage are used to derive the mathematical representation of this utility. Aggressive behavior includes aggressive use of major system resources and aggressive use of system components with limited resources. The average aggressive behavior and its standard deviation are determined. Any system occurrence that matches the average aggressive behavior, the average plus its standard deviation, or the average minus its standard deviation is considered a threat and must be halted, raised as an alert, or stored for audit purposes.
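The mean and standard deviation step of the aggressive usage detector can be sketched as follows. This is a minimal illustration under two assumptions that are not stated in the paper: each occurrence is reduced to a single numeric score (the scores below are hypothetical), and the rule above is read as flagging any occurrence whose score lies within one standard deviation of the average aggressive behavior. The population standard deviation is also an assumption; the paper does not say which form is used.

```python
from statistics import mean, pstdev

def aggressive_band(scores):
    """Return (average - sd, average + sd) for known aggressive scores."""
    avg = mean(scores)
    sd = pstdev(scores)  # population standard deviation (assumed)
    return avg - sd, avg + sd

def is_aggressive(score, band):
    """Flag a score that falls inside the aggressive-behavior band."""
    low, high = band
    return low <= score <= high

# Hypothetical scores recorded for occurrences known to be aggressive.
known_aggressive = [90, 95, 100, 105, 110]
band = aggressive_band(known_aggressive)

for score in (98, 40):
    action = "halt, alert or log" if is_aggressive(score, band) else "allow"
    print(score, action)
```

An occurrence flagged this way would then be halted, reported, or written to the audit trail, depending on the policy of the system.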
False Alarm Detector
The false alarm detector is a utility that detects normal system usage that might otherwise be deemed a threat. Occurrences that meet the criteria for false alarms are normal usage that seems to put the entire usage of the system into a false state of vibration or anarchy. Such usage occurrences are as
such prioritized as normal optimal usage. The remedy for the vibrations such usage occurrences cause is a delay in other normal usage occurrences in the system. The state and magnitude of other system occurrences, together with the state and magnitude of the normal optimal usage, determine the impact of the perceived anarchy. To increase the convenience of the system for which this utility is developed, the average delay time and its standard deviation must be determined. This utility is part of the normal usage. The utility is modelled just like the aggressive usage detector.

Special parameters of the usage model
This section discusses special parameters of our normal usage model. These parameters include the average usage, the usage standard deviation, the minimum usage, the maximum usage, and the most frequent usage value recorded. The average usage is the predicted average usage after the normal usage model function has been determined. The usage standard deviation is the standard deviation of the predicted normal usage function. The minimum and maximum usage values are the minimum and maximum usage predicted using the normal usage model. These parameters, together with usage rates, threat model constants, and other usage constants, are used in analyzing and detecting threats.

BUILDING THE USAGE PROFILE
To build the usage profile, we will first program a usage model for each component of the computer system under investigation. For this research, we want to build the usage profile for a computer network. As such, we will program a usage model for authentication on the computer system and a usage model for a user's session on the computer system. We will also program a usage model for memory usage and a usage model for CPU usage.
Additionally, we will program a usage model for a host on a network, another usage model for a server on the network, and finally a usage model for the network itself. The usage model for each component represents the behaviour of that component of the computer system under investigation. When implemented, the usage model will help us determine the regression equation that represents the research model, as well as the average usage and its standard deviation. In addition to the regression equation and the mean and standard deviation model, we will develop a markov chain model for the system under investigation. As such we will determine
states in the entire computer network, the various state transitions, and the associated probabilities of those state transitions. The rest of this chapter explains how to build a usage profile using an authentication system, describes the critical variables of the other usage models, and presents the mathematical theory needed for building the usage profile.

BUILDING A MODEL PROFILE FOR AN AUTHENTICATION SYSTEM
To build a usage model for an authentication system, we must sample the critical variables of the system. These variables include the download speed on the network, the upload speed on the network, the size of data sent to the server during authentication, the size of data sent to the client during authentication, and the time it takes for a successful authentication. The data sent to the server and the data received from it are the request data and the response data respectively. To build the usage model for the authentication system, we will capture data for all the critical variables at equal time intervals, say every 10 minutes, while the authentication system is being used. After collecting a sample of about 10 observations, we will try to determine the relationship between the dependent variable and the independent variables. As already stated, the relationship can be determined using simple or multiple linear regression. In addition to the regression equation, we will also determine other statistics that describe the behavior of the authentication system, such as the mean and standard deviation of each variable that was sampled.

BUILDING THE MARKOV CHAIN MODEL FOR THE AUTHENTICATION SYSTEM
Hidden markov models are machine learning models used to model the states in a system, the sequence in which they occur, and the associated probability of each state transition.
When a system has a set of states into which it usually falls, and it can be predicted or established that each new state depends on the previous states, then hidden markov models can be used to learn the state transitions that usually happen in the system. To build the markov chain model, we will determine the states of the authentication system and their associated probabilities. One of these states is the average usage of the authentication system, which may be abstracted as the average time it takes for a successful authentication. Other states include the minimum and maximum recorded times for a successful authentication, the average time it takes for a failed authentication, and the maximum and minimum recorded times for failed authentications. With these states and their associated probabilities of occurrence during a normal day, we have more information about the behaviour of the authentication system.
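The estimation of state transition probabilities described above can be sketched as follows. This is a minimal illustration of a plain first-order markov chain over directly observed states, not a full hidden markov model training procedure. The state labels, the observed sequence, and the 0.05 threshold are hypothetical values chosen for the example.

```python
from collections import Counter, defaultdict

def estimate_transitions(states):
    """Estimate P(next state | current state) from an observed state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[current] = {s: n / total for s, n in nxt_counts.items()}
    return probs

def is_unusual(probs, current, nxt, threshold=0.05):
    """A transition is unusual if it was rarely or never seen in normal usage."""
    return probs.get(current, {}).get(nxt, 0.0) < threshold

# Hypothetical states of the authentication system observed over a normal day.
observed = ["average", "average", "max", "average", "min",
            "average", "average", "failed_average", "average"]
model = estimate_transitions(observed)

print(model["average"])                   # transition probabilities out of "average"
print(is_unusual(model, "max", "min"))    # a transition never seen in the sample
```

A transition that never appears in the normal-day sample gets probability zero here, so it is reported as unusual; a real deployment would need a longer observation period before trusting such a verdict.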
Threat Models in a System
A threat is a change in the normal usage model that goes beyond an acceptable threshold, taken here to be the standard deviation of the usage model. A threat model, on the other hand, is an abstract representation of this change in our mobile system beyond the acceptable threshold. Integration can be performed on a threat model to determine the source of the threat. Integration is the reverse operation of differentiation in calculus. A threat model that can perform integration operations can be called a novel self integrating data structure. This chapter looks at the threat models of the micro usage models that make up a computer network and at how to analyze these threats in order to prevent them. It also discusses how to determine the sources of these threats using a novel self integrating threat model. To do this, three main functions are introduced: y = 3, y = 4x + 2 and y = 9x^2 + 3. These functions are in the context of the novel self integrating data structure and represent three different threat models. Additionally, the threat models of the various micro usage models discussed in this paper will be explored.

Properties and Methods of the Novel Self Integrating Data Structure
The most useful properties of the data structure that represents our threat model include, to mention a few, the names of network software or host application software, the version numbers of network and host software, license information (including the date the software was purchased or released and the number of years until renewal), and the IP address and MAC address of a host on a network. The methods of such a large, simulation-oriented object may include a method for computing the integral of a threat model, a method for computing the differential coefficient of the predictive normal usage model, and a method for computing the differential equation of a network or host threat model.
These are mostly the methods needed to perform the major calculus operations that support the calculus simulation on a network, in order to detect threats and their sources on a wireless network. Besides these, it may be necessary to implement methods that retrieve hidden network identity, such as IP and MAC addresses, on a local area network.

Integration Review
Based on the three functions stated in this chapter, we give an introductory review of integration, the branch of calculus that is the reverse operation of differentiation. The integrals for the
functions introduced in this chapter are, respectively, 3x + C, 2x^2 + 2x + C and 3x^3 + 3x + C, where C represents system constants in the mobile system. Computing the integral can be tricky, so two rules are stated below to aid quick computation of the integral of an ordinary mathematical function.

Theorem 1: If a function is a constant, such as a rational number, its integral is the product of the variable x and that constant, plus a system constant c to be determined from a known pair of x and y values.

Theorem 2: If a function is not a constant, its integral is obtained term by term. For each term containing x, divide the coefficient of the term by the exponent plus 1, and multiply by x raised to the exponent plus 1. Any constant term is integrated using Theorem 1. The results are added together with the corresponding system constant c.

Interpretation of Threat Model Integrals
Since the novel self integrating data structure is a programmed threat model, it is important to discuss the meaning of its integrals. The integrals represent the source of the original threat. Integrating a threat model may point to the function, software, host or network from which the threat originated. With properties like software name, version number, and IP and MAC addresses, it becomes easy to pinpoint the source of the threat. If the integral of a threat model resembles the normal usage model of a function of the system under examination, then that function can be predicted to be the source of the threat. Similarly, if the integral resembles the normal usage model of a software, host, or network that forms part of the system under investigation, then the threat can be predicted to come from that software, host or network.
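The two integration rules can be sketched in code as follows. This is an illustrative sketch only, not the paper's self integrating data structure: it assumes a threat model is a polynomial stored as a mapping from exponent to coefficient (a hypothetical representation), applies the rules term by term, and fixes the system constant from one known (x, y) pair.

```python
from fractions import Fraction

def integrate(poly):
    """Integrate a polynomial given as {exponent: coefficient}, term by term.

    A constant term (exponent 0) becomes coefficient * x, as in Theorem 1.
    Any other term c * x^n becomes c / (n + 1) * x^(n + 1), as in Theorem 2.
    The system constant C is left out and solved for separately.
    """
    return {n + 1: Fraction(c) / (n + 1) for n, c in poly.items()}

def evaluate(poly, x):
    """Evaluate a polynomial {exponent: coefficient} at x."""
    return sum(c * x ** n for n, c in poly.items())

def solve_constant(integral, x, y):
    """Determine the system constant C from one known pair of x and y values."""
    return y - evaluate(integral, x)

threat_models = {
    "y = 3": {0: 3},                # integral: 3x + C
    "y = 4x + 2": {1: 4, 0: 2},     # integral: 2x^2 + 2x + C
    "y = 9x^2 + 3": {2: 9, 0: 3},   # integral: 3x^3 + 3x + C
}

for name, poly in threat_models.items():
    print(name, "->", integrate(poly))
```

For example, if the integral of y = 3 is known to pass through the hypothetical point (1, 5), then `solve_constant(integrate({0: 3}), 1, 5)` gives C = 2.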
Threat Analysis and Detection
To do threat analysis in a system and abort the processes that initiated those threats, linear and nonlinear programming techniques can be used. The goal is to minimize the frequency of threat occurrences and their overall impact, and to optimize the normal usage function. In addition to these two goals, there are some constants that aid threat analysis. These constants are associated with the normal usage model and the threats in the system.
Examples of these constants are the rate at which usage increases with respect to a particular usage variable, the rate at which threat impact and frequency increase with respect to a particular variable in the usage model, and other special parameters associated with the usage model function. The average usage, its standard deviation, and the threat model function make up the threat model. The average usage and standard deviation are constants in the threat model. Using the threat model function, the average usage, and the standard deviation, threat analysis can be done with linear and nonlinear programming. The goal is to minimize threats, using the threat model function as the objective function and the average usage and standard deviation as constraints. Other parameters that may be used as constraints include the rate at which usage increases with respect to a particular usage variable and the rate at which threat impact and frequency increase with respect to a particular usage variable.

Threat Prediction
This section discusses how to predict threats in a system. The network usage model discussed in the previous chapter and its associated threat model are used to demonstrate how to predict or detect a threat in a system. As discussed in the previous section, threats can be detected using linear and nonlinear programming. The network usage model function and its associated threat model function are the objective functions. The constraints are the average network usage and its standard deviation, together with other parameters such as the rate at which the network threat increases with respect to other components of the network usage model: average host usage, average server usage, average port usage, average time the network operates, and average data transmitted on the network.
The goal of the linear or nonlinear programming is to optimize usage so that it stays between the average usage minus its standard deviation and the average usage plus its standard deviation. These are the lower and upper bounds of our objective function. Every combination of system variables whose usage falls within this range minimizes threat in the system. Since the average port, host and server usage are derived from their corresponding usage models, the linear and nonlinear programming analysis is done independently for each of them. When a threat is predicted in a system, the chance of the prediction being accurate depends on the usage value at that instant and on whether it is within the range of acceptable usage. This is constructed using
the average usage and its standard deviation. Any usage value that is less than the average usage minus its standard deviation is a threat. Likewise, a usage value that is greater than the average usage plus its standard deviation is a threat. This means that any predicted threat at a point where the predicted usage is within the usage range has a high chance of being false. In addition, the actual and predicted usage values can be used to determine the chance that the predicted threat is accurate. If the difference between them is large, the predicted usage may be wrong. Since the predicted usage and the threat models are derived from the usage model function, the predicted threat may then also be false. Finally, the closer the correlation coefficient of the usage model function is to zero, the higher the chance that the predicted usage and its associated threat values are wrong. Usage model functions with a correlation coefficient of 0.6 and above indicate that the predicted usage values and predicted threat values are accurate. These values are obtained from the usage model function and the threat model function respectively, which are modeled using the relevant system variables that make it possible to model system usage and system threats.

Risk Analysis in a System
To do risk analysis in a system, the frequency at which threats occur and the impact they have on the system must be known. When a frequency table is constructed for all threats and their associated impacts are stored, it becomes easy to analyze the risks associated with a system. When a threat is predicted, the likelihood of the threat occurring in the system can be computed from the threat frequencies. The impact of various threats can also be determined from the types of threats and from other parameters such as the number of such threats, the speed at which they occurred, and the resources they affected or damaged.
Risk in a system is computed as the product of the likelihood of a threat occurring and the impact that the threat has on the system. These concepts are the basis for developing a risk analysis system using the techniques we have discussed so far.

Normal Usage Model and Threat Model Simulation
In this chapter, we discuss the experiment that was conducted to determine the usage of a computer system. We also discuss how to simulate the threat and usage models with the aim of developing a threat detection system. Four of the micro usage models discussed in this paper were used for the simulation. These are the ones for authentication, session, CPU and memory.
Because the usage model for authentication was determined to be a rational function, logarithms were taken on both sides of the relation as part of the simulation in order to reduce the relation to a linear form. The original function is Y = c1(x2/x1) + c2. When reduced to its linear form we have log Y = log c1 + log x2 - log x1 + log c2. Since log c1 and log c2 are constants, let us denote them by k1 and k2 respectively. Additionally, let B = log Y, j1 = log x1 and j2 = log x2. The linear form of the usage model for authentication is therefore B = j2 - j1 + k1 + k2. Since k1 + k2 is a constant, let it be represented by k. As such, B = j2 - j1 + k, where B is the dependent variable and j2 and j1 are the independent variables. When B, j2, and j1 are sampled, Y = c1(x2/x1) + c2 can be determined. Note that the logarithm of a sum is not in general the sum of the logarithms, so this linear form should be read as an approximation of the original relation.

The CPU and memory usage models are multiple linear forms. The original relation is of the form y = c1x1 + c2x2 + c3, where x1 and x2 are the independent variables. The original relation must be reduced to simple linear forms. To do this, determine y = b0 + bx for each independent variable. The sum of the various b0 equals c3. Each b corresponds to the constant associated with the independent variable for which y = b0 + bx was determined. For example, the b determined for x1 equals c1 and the b determined for x2 equals c2. When x1, x2, and y are sampled and the various y = b0 + bx determined, y = c1x1 + c2x2 + c3 can be determined completely.

The simulation was run four times within a week: for 15 minutes on the first run, 30 minutes on the second, 45 minutes on the third, and 60 minutes on the last. The functions for the usage models and their corresponding correlation coefficients were also determined.
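The fitting step and the correlation coefficient can be sketched as follows. Note that this sketch swaps in a different linearization from the logarithmic one described above: since Y = c1(x2/x1) + c2 is already linear in the ratio r = x2/x1, an ordinary least squares line is fitted to Y against r. The interpretation of x1 and x2 and all sample values are hypothetical; they are constructed so that c1 = 2 and c2 = 0.5 and are not results from the paper's simulation.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least squares fit of y = b0 + b*x; returns (b, b0, correlation r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical samples: x1 and x2 are the two sampled independent variables,
# times is the sampled authentication time Y.
x1 = [10, 20, 10, 40, 25]
x2 = [50, 40, 100, 40, 100]
times = [10.5, 4.5, 20.5, 2.5, 8.5]

ratios = [b / a for a, b in zip(x1, x2)]   # r = x2 / x1
c1, c2, r = fit_line(ratios, times)
print(c1, c2, r)
```

The same `fit_line` helper can be applied to each independent variable of the CPU and memory models to obtain the individual y = b0 + bx relations, and its third return value is the correlation coefficient that the threat prediction section compares against 0.6.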
Tools and Computer Packages
This chapter discusses the tools and computer packages that were used throughout this research project. We also look at the programming languages, database platforms and development frameworks that can be used to develop an anomaly based intrusion detection system for ecommerce sites using the concepts we have discussed in this paper. The simulation was implemented in Java as a console based program. Java was chosen for its object oriented concepts such as encapsulation, inheritance, interfaces, objects, and polymorphism. To implement an intrusion detection system using the results of this research, the following tools will be essential, as they are well suited to intrusion detection systems developed for ecommerce sites: Bootstrap, CodeIgniter, MySQL Database Management System, SQLite, SQLyog, and Eclipse. The programming languages and platforms that will be used are PHP and Android. PHP is for the desktops and laptops that connect to the ecommerce sites, and Android is for mobile phones that use the ecommerce sites. Bootstrap and CodeIgniter are web development frameworks. Bootstrap is for frontend development and CodeIgniter is a backend framework for PHP developers. For Android development, Eclipse can be used as the IDE. MySQL and SQLyog are for the database servers that will run on the ecommerce site as part of the intrusion detection system implementation. SQLite is for the databases that run on the Android implementations that form part of the intrusion detection system developed for the ecommerce website. With all these tools, frameworks and packages, developers are ready to develop intrusion detection systems for ecommerce sites using the concepts in this research paper. It is expected that the micro usage models discussed will be integral libraries implemented in PHP and Android as part of an implementation for ecommerce sites or any group of web or mobile application systems.
Conclusion and Discussion
It is worth mentioning that the normal usage models and threat models experimented with in this paper represent a computer system and its associated threats. These threats can be analyzed periodically and audited as part of a computer security audit. This will fuel the development of a risk analysis system. A risk analysis system, a threat detection system and a normal usage system developed from experimenting with the usage and threat models will make up a mobile security audit framework that can be used for maintaining cyber security on computer systems. When practices and processes for maintaining this framework are drafted and adhered to, it will be easier to maintain cyber security on various computer systems. Additionally, it can be established that using the differential equations technique, the novel self integrating data structure, and the linear and nonlinear programming techniques, threats on a system can be analyzed and detected. To halt such threats, the intrusion detection system developed using these techniques must possess certain qualities: correctness, promptness, and ease of use. Correctness means how well the intrusion detection system can detect threats. This is important because correctness affects the rate at which a predicted threat is false or true. Promptness relates to the time it takes to detect or halt a threat, and ease of use relates to how far the intrusion detection system supports convenient use of the computer system for which it is developed. The techniques we have discussed make it possible to achieve correctness, promptness and ease of use. The usage model function, with its associated average usage and standard deviation, makes it possible to ensure correctness of the intrusion detection system. This is because the statistical data sampled for developing the intrusion detection system is within the range of acceptable usage.
The average usage and standard deviation are computed using statistical models. One such model used in this research is the moments model, or mean and standard deviation model. With this statistical model and the usage model equation, it becomes possible to ensure correctness of the intrusion detection system. To achieve promptness, multithreading is applied to analyze, predict, detect and halt threats. All threat alarms and detection must use multithreading. Multithreading is a programming concept that allows several tasks to run concurrently within a program. This concept makes it possible to predict multiple threats, do multiple threat analysis and halt or alarm