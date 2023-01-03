A Cyber Security Audit Framework and a Secured Expert Medical Consultation System.pdf
A Cyber Security Audit Framework and a Secured Expert Medical Consultation System
Developed Using Artificial Intelligence Technologies
Abstract
This paper is an investigation into building a Security Audit Framework and a Secured Expert
Medical Consultation System using Artificial Intelligence Techniques. First of all, we describe
how to build a Usage Profile of a Computer Network. This Usage Profile of the Computer
Network will be the basic building block for the development of a Security Audit Framework
which is composed of an Anomaly Intrusion Detection System and a Risk Analysis System. The
proposed Usage Profile is made up of a Linear Regression model, a Mean and Standard
Deviation model, and a Hidden Markov model. These models can be built by sampling
experimental data of critical variables of a Computer Network.
Secondly, the Secured Expert Medical Consultation System will be developed using
Evolutionary Computing techniques. The proposed system will make it easier for medical
doctors to perform consultations, because it will suggest a diagnosis, medical tests and a
drug prescription based on the information captured during the consultation, such as the
symptoms and reactions of the patient. The system will do this by mapping a set of symptoms
and conditions to a database of diseases.
Additionally, this project will answer the question: how do we secure such an expert
medical consultation system? We will look at the security of the system itself and of its
other components, such as its database and the server on which the system and its database
are hosted. We will secure the system by performing Penetration Testing on it while it is
being developed. The attack techniques we will explore include SQL Injection and Dictionary
attacks. We will also look at how to secure the database by configuring appropriate Access
Controls on the Database Management System (DBMS) and configuring the DBMS to guard
against intrusion using authentication and authorization. We will also consider which
password types are suitable for such a system.
Finally, it is suggested that deviations from the usage profile of the computer network
can be flagged as anomalous activities. This can help us develop a cyber security audit
framework made up of an anomaly intrusion detection system and a risk analysis system.
Abstract
1.0 Introduction
2.0 Security Audit Framework
2.1 Problem Definition
2.2 Research Questions
2.3 Objectives
2.4 Literature Review
2.4.1 Intrusion Detection Systems
2.4.2 Anomaly Detection Systems
2.4.3 Behaviour Encryption
2.4.4 Risk Analysis
2.4.5 Information Security Awareness and Practices
2.4.6 Protocol For Mitigating Risks on Social Networking Sites
2.4.7 Behaviour Models and Anomaly Intrusion Detection
2.5 Research Model and Methodology
2.5.1 Research Model
2.5.2 Usage Model: A Java Interface That Implements the Research Model
2.5.3 Usage Model File: model.java
2.5.4 Implementing the Usage Model for an Authentication System
2.5.5 Methodology
2.5.5.1 Machine Learning Algorithms & Behaviour Based Intrusion Systems
2.5.5.2 Audit Trail Analysis
2.5.5.3 Normal Usage Model
2.5.5.4 Threat Modelling
2.5.5.5 Boolean Calculus
2.5.5.6 Experimenting Usage and Threat Models
2.5.5.7 Computer Usage Survey
2.5.5.8 Threat Detection Systems
2.6 Threats Associated With Computer Systems
2.6.1 Attacks associated with a computer system
2.6.2 Malicious Code
2.6.3 IP Scan and Attack
2.6.4 Web Browsing
2.6.5 Virus
2.6.6 Unprotected Shares
2.6.7 Mass emails
2.6.8 Simple Network Management Protocol (SNMP)
2.6.9 Hoaxes
2.6.10 Backdoors
2.6.11 Password Crack
2.6.12 Brute Force
2.6.13 Dictionary
2.6.14 Denial of Service (DoS) and Distributed Denial of Service (DDoS)
2.6.15 Spoofing
2.6.16 Man in the Middle
2.6.17 Spam
2.6.18 Mail Bombing
2.7 Mathematical Modelling Techniques and Machine Learning Based Models
2.7.1 Simple Linear Regression
2.7.2 Multiple Linear Regression
2.7.3 Non Linear Regression
2.7.4 Machine Learning Based Models Used for Developing Anomaly Based Intrusion Detection Systems
2.8 The Normal Usage Model of a System
2.8.1 Single Variable Calculus Review and its Applications
2.8.2 Usage Model List
2.8.3 Authentication Usage Model
2.8.4 Session Usage Model
2.8.5 Memory Usage Model
2.8.6 CPU Usage Model
2.8.7 Program Usage Model
2.8.8 Host Usage Model
2.8.9 Battery Usage Model
2.8.10 Device Usage Model
2.8.11 Server Usage Model
2.8.12 Port Usage Model
2.8.13 Network Usage Model
2.8.14 Aggressive Usage Detector
2.8.15 False Alarm Detector
2.8.16 Special Parameters of The Usage Model
2.8.17 Building The Usage Profile
2.8.18 Building a Usage Profile for an Authentication System
2.8.19 Building A Markov Chain Model for An Authentication System
2.8.20 Threat Models in a System
2.8.21 Properties and Methods of the Novel Self Integrating Data Structure
2.8.22 Integration Review
2.8.23 Interpretation of Threat Model Integrals
2.8.24 Threat Analysis and Detection
2.8.25 Threat Prediction
2.8.26 Risk Analysis in a System
2.9 Normal Usage Model and Threat Model Simulation
2.10 Tools and Computer Packages
3.0 Secured Expert Medical Consultation System
3.1 Problem Definition
3.2 Research Questions
3.3 Objectives of Paper
3.4 Literature Review
3.4.1 Evolutionary Computing Terminologies
3.4.2 Evolutionary Algorithms
3.4.2.1 Representation
3.4.2.2 Evaluation or Fitness Function
3.4.2.3 Population
3.4.2.4 Parent Selection Mechanism
3.4.2.5 Variation Operators
3.4.2.6 Mutation
3.4.2.7 Recombination
3.4.2.8 Survivor Selection Mechanism
3.4.2.9 Initialization
3.4.2.10 Termination Condition
3.4.3 Genetic Algorithms
3.4.4 Evolutionary Strategies
3.4.5 Genetic Programming
3.4.6 Evolutionary Programming
3.4.7 Differential Evolution
3.4.8 A Survey on Wearable Sensor-Based Systems for Health Monitoring and Prognosis
3.4.9 Sensors in Medicine
3.5 Research Model and Methodology
3.5.1 Research Model
3.5.2 Research Methodology
3.5.2.1 Assumption Enumeration
3.5.2.2 Hypothesis Formulation
3.5.2.3 Experimentation
3.5.2.4 Hypothesis Testing
3.5.2.5 Demonstration
3.5.2.6 Agile Development
3.6 Requirement Specification
3.6.1 Functional Requirement for the Android App
3.6.1.1 Consultation
3.6.1.2 Patient Basic details and medical information
3.6.1.3 User Settings and Authentication
3.6.2 Functional Requirements of the Web Application
3.6.2.1 Patient Details and Medical Information
3.6.2.2 Consultation
3.6.2.3 Drug Prescription and Medical Test
3.6.2.4 User Settings and Authentications
3.6.3 Non-Functional Requirements
3.7 Scope of Diseases
3.7.1 Symptoms and Reactions of Diseases that will be modelled
3.7.2 Symptoms of Malaria
3.7.3 Symptoms of Cholera
3.7.4 Symptoms of Diarrhoea
3.7.5 Symptoms of Bi-polar Disorder
3.7.6 Symptoms of Schizophrenia
3.7.7 Symptoms of Diabetes
3.7.8 Skin Diseases
3.7.9 Symptoms of Hypertension
3.7.10 Symptoms of Asthma
3.7.11 Medical Tests Associated with Diseases
3.8 The Evolutionary Computing World
3.8.1 Representation
3.8.2 Population
3.8.3 Initialization
3.8.4 Fitness Function
3.8.5 Parent Selection Mechanism
3.8.6 Survivor Selection Mechanism
3.8.7 Mutation
3.8.8 Recombination
3.8.9 Termination Condition
3.9 Design and Implementation
3.9.1 Designing the Mobile App
3.9.2 Implementing the Mobile App
3.9.3 Designing the Web App
3.9.4 Implementing the Web App
3.10 Expert System Methodologies and Application
3.10.1 Rule-based Systems
3.10.2 Knowledge-based Systems
3.10.3 Neural Networks
3.10.4 Fuzzy Expert System
3.10.5 Object Oriented Methodologies
3.10.6 Case-based Reasoning
3.10.7 Modelling
3.10.8 System Architecture
3.10.9 Intelligent Agents
3.10.10 Ontology
3.10.11 Database Methodology
4.0 Conclusion and Discussion
5.0 References
1.0 Introduction
If a usage profile of a system can be built, it will become possible to detect unusual behaviour
on the system. The method for building such usage profiles involves determining factors of
the system that are critical to the system. These factors can be seen as critical system variables
that affect the system’s usage. The other thing to consider is determining the way in which
you can obtain an abstract representation of the usage profile. The abstract representation
of the usage profile can be achieved by the application of behaviour models such as statistical
models, machine learning models and cognitive based models.
Secondly, cyber security threats on computer networks can damage the resources on
the network. Examples of such damage include corrupting data stored or transmitted on the
network, infecting a host on the network with a virus, impersonating a valid user on the
network and preventing application software on hosts from functioning properly. The
security of computer systems is essential to organizations. It is usually provided by
software that protects the computer system for which it was developed. An intrusion
detection system is one such software system. Others are antivirus software, firewalls and
risk analysis systems. Periodic computer security audits also enable threat detection and
prevention on computer networks.
Additionally, medical health care can be made more successful and timely, and can
yield the expected results, when an expert medical consultation system is used in
administering medical care and performing medical consultation. This type of system can
be developed through the application of artificial intelligence technologies such as machine
learning, artificial neural networks and evolutionary computing.
Applying concepts of evolutionary computing makes it possible to develop an expert
medical consultation system. This requires a database of diseases with their symptoms, a
database of medicines that are used to treat those diseases, and a database of reactions and
indications with their associated medical tests, all of which aid in administering medical
consultation.
This paper is an investigation into building a Security Audit Framework and a Secured
Expert Medical Consultation System using Artificial Intelligence Techniques. The security audit
framework consists of an anomaly intrusion detection system and a risk analysis system which
will be developed using behaviour models such as statistical models, machine learning models
and cognitive models. The expert medical consultation system will be developed using
evolutionary computing techniques.
2.0 Security Audit Framework
This chapter of the research paper is dedicated to the security audit framework. We will
discuss the objectives for building the security audit framework, the problem we seek to solve
and the research question for this part of the research paper. We will also describe how to
build the usage profile that is the basic building block of the security audit framework.
2.1 Problem Definition
If the normal usage or behaviour of a computer system can be represented by an abstract
model, then this abstract model can be used to detect threats on the system. The threats on
the system can be detected as deviations from the abstract model which is the behaviour of
the system. The main problems this paper seeks to investigate are listed below.
● Representing the normal usage or behaviour of a system with an abstract model.
● Determining activities and occurrences on the system that are deviations from the
system’s normal behaviour or usage.
● Representing these activities or deviations with an abstract model.
● Preventing such activities or occurrences from occurring on the system.
In this paper, the system's normal behaviour is known as the usage profile, and the
deviations from the system's normal behaviour are known as the threat profile of the
system.
2.2 Research Questions
The main questions to be investigated are listed below.
● What are the best and most efficient techniques for modelling a system’s normal
behaviour or usage?
● What are the best and most efficient techniques for design and implementation of a
threat detection system?
● How can we build a risk analysis system for performing risk assessment of a computer
system?
2.3 Objectives
The main objectives of this research are as follows.
● Representing a computer network’s normal functioning with an abstract model
● Building a usage profile of a computer network.
● Detecting activities and occurrences that deviate from the normal usage of a computer
network and flagging these activities and occurrences as anomalous activities on a
computer network.
● Design and implementation of an Anomaly Intrusion Detection System.
● Design and implementation of a Risk Analysis System.
● Design and implementation of a Security Audit Framework.
● Drafting of a document that details the procedures, processes and guidelines that
must be followed in the operation and administration of a security audit framework.
2.4 Literature Review
This section reviews the major topics that make up this research paper and the work done in
some of these areas. The topics include intrusion detection systems, since any study of
detecting threats and their sources is centred on intrusion detection systems. Behaviour
encryption is another computer security field that will be discussed in detail, since it adds
much value to the information hiding parts of this research. Risk analysis will also be
reviewed to sum up what it involves. Finally, there will be a review of Normal Usage Models.
2.4.1 Intrusion Detection Systems
Basically, there are two types of intrusion detection systems in the industry based on the
approach used for threat detection and the technologies used to build the system [25]. These
are knowledge based also known as signature based and behaviour based intrusion detection
systems [25]. Each takes a different approach to threat detection and each uses different
technology for building the intrusion detection systems. Also, every single one has its pros
and cons.
Knowledge based intrusion detection systems are built on a database of already
known threats [25]. These known vulnerabilities or threats are called threat signatures [25].
Detection is usually done by directly mapping system incidents that indicate threats to
threat signatures [25]. As a result, the database of threats must be constantly updated
with newly identified threats [25]. Because new threats keep emerging, the correctness of
detection is sometimes compromised: threats that do not yet have corresponding signatures
cannot be mapped and detected [25]. However, these types of intrusion detection systems
raise fewer false alarms, since each detected threat is registered in the database of threat
signatures [25].
Behaviour based intrusion detection systems take a different approach to threat
detection. They are built using artificial intelligence technologies [25]. Usually, the
behaviour of the system being protected is modelled, and deviations from that behaviour are
used to detect threats [25]. Because of this, they are better at detecting previously unseen
threats [25]. No threat signatures or mappings of incidents that indicate threats are
required [25]. However, they raise more false alarms, because detected threats are not
mapped to a database of known threats [25].
Besides these, intrusion detection systems are classified based on the purpose for
which they are built and on whether they deal with threats actively or passively [25]. By
purpose, there are host based and network based intrusion detection systems [25].
Active intrusion detection systems are configured to block or prevent attacks, while passive
intrusion detection systems are configured to monitor, detect threats and raise alerts [25].
2.4.2 Anomaly Detection Systems
According to a research paper entitled “Design and Implementation of Anomaly Detection
System”, there are global variables of a network that can be used for detecting anomalous
activities on a network [19]. The paper used a hybrid of signature based and anomaly based
intrusion detection to detect anomalies [19]. According to the paper, the techniques used for
detecting intrusion include applying generic network rules to detect network anomalies. The
paper also used dynamic network knowledge, such as network statistics, to detect anomalous
activities [19].
2.4.3 Behaviour Encryption
Behaviour algorithms are applied to safeguard information on computing devices such as
mobile phones and laptops [27]. These algorithms are the basics for building systems that
study and encrypt user behaviour on a computing device in order to ensure the security of
information on the computing devices [27]. A study into mobile platform security reports that
behaviour encryption application systems have been designed and built, focusing on mobile
platforms [27]. Results from this study indicated that behaviour encryption application
systems are effective in ensuring mobile platform security [27].
In addition, since mobile devices can be secured through behaviour encryption
systems, the behaviour of hosts on a network or of network systems can also be encrypted to
ensure safe communication, because each host or user on a system or network has a
particular behaviour pattern.
Cryptographic study into encrypting the normal usage model can fall under behaviour
encryption since the usage model represents a system’s behaviour and can be composed of
a user’s behaviour. This can aid in securing the information that embodies the usage model.
It is also necessary because if the usage model can easily be predicted then it is possible to
manipulate the usage model and launch an attack.
2.4.4 Risk Analysis
Computer risk analysis is also called risk assessment. It involves the process of analyzing and
interpreting risk. To analyze risk, the scope and methodology have to be determined first.
Information is then collected and analyzed before the risk analysis results are interpreted.
Determining the scope means identifying the system to be analyzed for risk and the parts of
the system that will be considered. The analytical method that will be used, with its level
of detail and formality, must also be planned. The boundary, scope and methodology used
during risk assessment determine the total amount of work needed in risk management, and
the type and usefulness of the assessment's results.
Risk has many components, including assets, threats, likelihood of threat occurrence,
vulnerabilities, safeguards and consequences. Risk management includes risk acceptance,
which takes place after several risk analyses. Normally, after risk has been analyzed and
safeguards implemented, the residual risk that remains in the functioning system must be
accepted by management. This may be due to constraints on the system, such as ease of use,
or to features of the system for which strict safeguards would cause the organization
operational problems. As such, risk acceptance, like the selection of safeguards, should take
into account various factors besides those addressed in the risk assessment. In addition, risk
acceptance should take into account the limitations of the risk assessment.
2.4.5 Information Security Awareness and Practices
A paper entitled "A study of information security awareness and practices in Saudi Arabia"
discusses information security awareness and practices in that country. The paper
emphasizes that information is under constant threat from cyber vandals [1]. It rates Saudi
Arabia as poor in terms of information security and attributes this to the country having a
highly suppressed, patriarchal and tribal culture [1].
The paper examined the level of information security awareness among the general
public in the country using an anonymous online survey based on instruments the Malaysian
Security Organization produced [1]. In all, 633 persons responded to the survey. The
analysis confirmed that information security awareness in the country is low, which the
paper relates mostly to the highly suppressed, patriarchal and tribal nature of the
country [1].
2.4.6 Protocol For Mitigating Risks on Social Networking Sites
According to an academic paper entitled "Protocol for mitigating the risk of hijacking social
networking sites", hackers can hijack a user's session on social networking sites and
impersonate the victim [7].
The paper deals with this risk by presenting a security authentication protocol for
mitigating it [7]. The protocol takes into account that users of social networking sites
connect to the sites using several platforms and connection speeds [7]. To cater for mobile
devices and tablets using a Wi-Fi connection, a novel Self-Configuring Repeatable Hash Chains
(SCRHC) protocol was developed to prevent the hijacking of session cookies [7]. This protocol
supports three levels of caching, making it possible to trade storage space for enhanced
performance and reduced workload [7].
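To make the idea of a hash chain concrete, the sketch below builds a generic one-way hash chain and checks one link of it. It is not the SCRHC protocol from [7]; the class name HashChainSketch, the use of SHA-256 and the chain length are all assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/* Generic one-way hash chain (illustrative only, not the SCRHC protocol). */
final class HashChainSketch {

    /* SHA-256 is an assumed choice of hash function. */
    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /* chain[0] is the secret seed; chain[i] = H(chain[i - 1]). */
    static byte[][] buildChain(byte[] seed, int length) {
        byte[][] chain = new byte[length + 1][];
        chain[0] = seed.clone();
        for (int i = 1; i <= length; i++) {
            chain[i] = sha256(chain[i - 1]);
        }
        return chain;
    }

    /* A presented value is accepted if hashing it once gives the last accepted value. */
    static boolean verify(byte[] presented, byte[] lastAccepted) {
        return MessageDigest.isEqual(sha256(presented), lastAccepted);
    }

    public static void main(String[] args) {
        byte[][] chain = buildChain("session-secret".getBytes(StandardCharsets.UTF_8), 3);
        // The server holds chain[3]; the client proves itself by revealing chain[2].
        System.out.println(verify(chain[2], chain[3])); // prints true
        System.out.println(verify(chain[1], chain[3])); // prints false
    }
}
```

Because the hash is one-way, an eavesdropper who captures chain[3] cannot compute chain[2], which is the property a session token scheme of this kind relies on.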
2.4.7 Behaviour Models and Anomaly Intrusion Detection
Behaviour models are used to detect intrusion in computer systems. This section reviews the
behaviour models that can be used to build behaviour based intrusion detection systems.
These models fall into the following categories: statistical models, machine learning based
techniques, cognitive models, computer immunology and user intention.
Statistical models include the operational (threshold metric) model, the Markov process
(marker) model, the multivariate model, the statistical moments model, time series models
and univariate models. Machine learning based models include Bayesian networks, genetic
algorithms, neural networks, fuzzy logic and outlier detection. Cognitive models include
finite state machines, description scripts and expert systems.
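As an illustration of the Markov process model named above, the following sketch estimates first-order transition probabilities from an observed event sequence and treats rarely seen transitions as unusual. The class name MarkovUsageModel, the event names and the probability threshold are hypothetical and are not taken from the reviewed literature.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/* First-order Markov chain over named events (illustrative sketch). */
final class MarkovUsageModel {
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    /* Count every observed transition from one event to the next. */
    void learn(List<String> events) {
        for (int i = 0; i + 1 < events.size(); i++) {
            String from = events.get(i);
            String to = events.get(i + 1);
            counts.computeIfAbsent(from, k -> new HashMap<>()).merge(to, 1, Integer::sum);
            totals.merge(from, 1, Integer::sum);
        }
    }

    /* Estimated P(to | from); 0.0 if 'from' was never seen as a source state. */
    double transitionProbability(String from, String to) {
        int total = totals.getOrDefault(from, 0);
        if (total == 0) {
            return 0.0;
        }
        return counts.get(from).getOrDefault(to, 0) / (double) total;
    }

    /* A transition is unusual if its estimated probability is below the threshold. */
    boolean isUnusual(String from, String to, double minProbability) {
        return transitionProbability(from, to) < minProbability;
    }

    public static void main(String[] args) {
        MarkovUsageModel model = new MarkovUsageModel();
        // Made-up training sequence of authentication events.
        model.learn(Arrays.asList("LOGIN_OK", "LOGOUT", "LOGIN_OK", "LOGOUT",
                                  "LOGIN_FAIL", "LOGIN_OK"));
        System.out.println(model.transitionProbability("LOGOUT", "LOGIN_FAIL")); // prints 0.5
        System.out.println(model.isUnusual("LOGIN_FAIL", "LOGIN_FAIL", 0.1));    // prints true
    }
}
```

A repeated LOGIN_FAIL to LOGIN_FAIL transition never occurs in the training sequence, so it is flagged, which mirrors how a deviation from learned behaviour can be treated as anomalous.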
2.5 Research Model and Methodology
This section describes the research model and methodology for developing the security audit
framework. We will describe the research model and the steps that make up the
methodology.
2.5.1 Research Model
Assume that the normal usage (Y) of a computer network can be represented by a
mathematical function
Y = f(Xi, Ci)
where Xi represents system variables, such as the number of functions or the number of
authentications, and Ci represents system constants, such as the maximum or minimum number
of authentications. When a change in Y exceeds the standard deviation determined from the
data set of our usage, that change indicates a threat. To investigate this threat, machine
learning algorithms, mathematical functions and behaviour based intrusion detection
systems will be studied to express Y in terms of a number of variables that represent it
appropriately. The expected usage model of the network to be investigated includes the
following components: Host Usage Model, Server Usage Model, Device Usage Model, Port
Usage Model, Network Usage Model, Session Usage Model, Authentication Usage Model,
Memory Usage Model, CPU Usage Model, Battery Usage Model and Program Usage Model.
These components are expected to be derived from the variables listed below.
● Average number of application software programs that run on the network system while
the system is in use.
● Average number of system processes that run on the network system while the system
is in use.
● Average number of authentications in the network system.
● Average number of user actions that happen on the network system.
● Average time a user spends before his session expires.
● Average time the network system functions each day.
● Number of paired ports communicating on the network.
● Average amount of memory space used on devices while the network is being
operated.
● Average CPU time spent on a single device on the network.
● Average life span of a single device battery on the network.
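The threshold rule stated in the research model (a change in Y beyond the standard deviation indicates a threat) can be sketched as follows. This is a minimal illustration under assumptions: the class name UsageThreshold, the multiplier k and the sample values are hypothetical, and the sample standard deviation is one of several reasonable choices.

```java
import java.util.Arrays;

/* Mean and standard deviation model for one usage variable (illustrative sketch). */
final class UsageThreshold {
    private final double mean;
    private final double stdDev;

    UsageThreshold(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double m = Arrays.stream(samples).average().orElse(0.0);
        double sumSquares = 0.0;
        for (double s : samples) {
            sumSquares += (s - m) * (s - m);
        }
        this.mean = m;
        // Sample standard deviation (divides by n - 1); an assumed choice.
        this.stdDev = Math.sqrt(sumSquares / (samples.length - 1));
    }

    double mean() { return mean; }

    double stdDev() { return stdDev; }

    /* True when y deviates from the mean by more than k standard deviations. */
    boolean isAnomalous(double y, double k) {
        return Math.abs(y - mean) > k * stdDev;
    }

    public static void main(String[] args) {
        double[] dailyAuthentications = {10, 12, 11, 9, 13}; // made-up samples
        UsageThreshold threshold = new UsageThreshold(dailyAuthentications);
        System.out.println(threshold.isAnomalous(11, 1.0)); // prints false
        System.out.println(threshold.isAnomalous(40, 1.0)); // prints true
    }
}
```

With k = 1.0 the check matches the rule as worded in the text; a larger k would reduce false alarms at the cost of missing smaller deviations.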
2.5.2 Usage Model: A Java Interface That Implements the Research Model
For each component of the computer system under investigation, we will program a usage
model, which is an implementation of the research model for that component. Each usage
model implements an interface defined in a Java file called model.java.
There are eight functions in the model.java interface. The first, computeval, computes
the usage value at an instant. The second, findchange, finds changes in the usage of the
computer system. The third, learnsys, learns the usage of the system. The fourth,
findrelationship, finds the regression equation. The fifth, monitor, monitors the usage of
the system. The sixth, showalarm, displays error messages and detected intrusions. The
seventh, haltprocess, halts a detected intrusion. The eighth, predictvals, predicts usage
values based on the regression equation determined. A concrete class that omits an
implementation of any of these functions will fail to compile. To implement the usage
model, use the Java keyword implements. The model.java file is shown below.
2.5.3 Usage Model File: model.java
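/* Contract that every usage model implements. */
public interface model {
    public double computeval();          /* usage value at an instant */
    public double findchange();          /* change in the usage of the system */
    public void learnsys(int t);         /* learn the usage of the system */
    public Object findrelationship();    /* find the regression equation */
    public void monitor(int t);          /* monitor the usage of the system */
    public void showalarm(String info);  /* display errors and detected intrusions */
    public void haltprocess();           /* halt a detected intrusion */
    public void predictvals();           /* predict usage values from the regression */
}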
2.5.4 Implementing the Usage Model for an Authentication System
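class auth_usage implements model {
    /* Dependent and independent variables are declared here. */

    @Override
    public double computeval() {
        return 0.0; /* placeholder until the usage value is computed */
    }
    @Override
    public double findchange() {
        return 0.0; /* placeholder until the change in usage is computed */
    }
    @Override
    public void learnsys(int t) {
    }
    @Override
    public Object findrelationship() {
        return null; /* placeholder until the regression equation is built */
    }
    @Override
    public void monitor(int t) {
    }
    @Override
    public void showalarm(String info) {
    }
    @Override
    public void haltprocess() {
    }
    @Override
    public void predictvals() {
    }
} /* end of class */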
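The regression equation that findrelationship is meant to find could, for a single predictor, be obtained by ordinary least squares. The sketch below shows one way to do this. The class name SimpleRegression, its fields and the sample data are assumptions for illustration, and it is not presented as the implementation used in this paper.

```java
/* Least-squares line y = intercept + slope * x (illustrative sketch). */
final class SimpleRegression {
    final double intercept;
    final double slope;

    SimpleRegression(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired samples");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        this.slope = sxy / sxx;
        this.intercept = meanY - slope * meanX;
    }

    /* Predicted usage value for a given x, as predictvals might report it. */
    double predict(double x) {
        return intercept + slope * x;
    }

    public static void main(String[] args) {
        double[] day = {1, 2, 3, 4};             // made-up predictor: day index
        double[] authentications = {3, 5, 7, 9}; // made-up response
        SimpleRegression r = new SimpleRegression(day, authentications);
        System.out.println(r.slope + " " + r.intercept + " " + r.predict(5));
    }
}
```

For the made-up data above the fitted line is y = 1 + 2x, so the predicted value for day 5 is 11. An implementation of findchange could then compare an observed value against this prediction.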
2.5.5 Methodology
The list below details the activities or processes that will be followed to represent a
computer system with an abstract mathematical model and to analyze changes in that system.
It is hoped that following these processes will lead to the design and implementation of a
normal usage model, an intrusion detection system and a risk analysis system that together
form a security audit framework.
2.5.5.1 Machine Learning Algorithms & Behaviour Based Intrusion Systems
Machine learning techniques and algorithms will be investigated to determine the extent to
which an expert system that learns a computer system's usage can be built. Since the expected
usage model is a mathematical model, various mathematical modelling techniques will be
applied to determine the normal usage model.
Analyzing deviations from these mathematical models can lead to the design and
implementation of behaviour based intrusion detection systems. As such, a thorough study
of the design and implementation of behaviour based intrusion detection systems will be done.
2.5.5.2 Audit Trail Analysis
It is expected that computer security audit reports will be sampled and analyzed to arrive at
a set of dependent and independent variables and their data set. These variables and their
associated data set can be used to formulate the normal usage model.
2.5.5.3 Normal Usage Model
The knowledge gained from the machine learning study, the mathematical modelling study,
the behaviour based intrusion detection system study and the audit trail analysis will be
applied. It is hoped that this will answer the question of how to represent the normal
functioning of a computer system with an abstract mathematical model.
2.5.5.4 Threat Modelling
Differential equations of the normal usage model will be investigated to know the extent to
which deviations from the normal usage models can be analyzed. An abstract mathematical
model of these deviations will be formulated. These abstract models are derivatives of the
normal usage model.
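When the usage model is only available as sampled values rather than as a closed-form function, its derivative can be approximated numerically. The sketch below uses forward differences. The class name UsageDerivative, the fixed sampling interval and the limit value are assumptions for illustration.

```java
/* Forward-difference approximation of dY/dt for sampled usage (illustrative sketch). */
final class UsageDerivative {

    /* rates[i] = (y[i + 1] - y[i]) / dt for samples taken every dt time units. */
    static double[] rateOfChange(double[] y, double dt) {
        if (dt <= 0.0) {
            throw new IllegalArgumentException("dt must be positive");
        }
        double[] rates = new double[Math.max(0, y.length - 1)];
        for (int i = 0; i < rates.length; i++) {
            rates[i] = (y[i + 1] - y[i]) / dt;
        }
        return rates;
    }

    /* Index of the first rate whose magnitude exceeds the limit, or -1 if none does. */
    static int firstExceeding(double[] rates, double limit) {
        for (int i = 0; i < rates.length; i++) {
            if (Math.abs(rates[i]) > limit) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] usage = {5, 6, 5, 20}; // made-up hourly samples
        double[] rates = rateOfChange(usage, 1.0);
        System.out.println(firstExceeding(rates, 3.0)); // prints 2
    }
}
```

Here the jump from 5 to 20 gives a rate of 15 per hour, which exceeds the assumed limit of 3 and is reported at index 2.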
2.5.5.5 Boolean Calculus
A study into representing the normal usage model with a boolean function will be done. It is
hoped that analyzing these boolean functions will aid in building hardware that implements
the expected usage system. Differential equations of these boolean functions will be studied
to analyze changes in the system that indicate deviation from the normal usage model.
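One standard notion of a derivative for boolean functions is the boolean difference: the function evaluated with one input set to false, XORed with the function evaluated with that input set to true. The result is true exactly when flipping that input changes the output. The sketch below computes it. The class name BooleanDifference and the example function are hypothetical, and the paper may intend a different formulation.

```java
import java.util.function.Predicate;

/* Boolean difference of f with respect to input i (illustrative sketch). */
final class BooleanDifference {

    /* Returns f(x with x[i] = false) XOR f(x with x[i] = true). */
    static boolean derivative(Predicate<boolean[]> f, boolean[] x, int i) {
        boolean[] low = x.clone();
        low[i] = false;
        boolean[] high = x.clone();
        high[i] = true;
        return f.test(low) ^ f.test(high);
    }

    public static void main(String[] args) {
        // Made-up usage rule: access is normal only if authenticated AND within hours.
        Predicate<boolean[]> normalUsage = v -> v[0] && v[1];
        // With v[1] = true, flipping v[0] changes the outcome.
        System.out.println(derivative(normalUsage, new boolean[] {false, true}, 0));  // prints true
        // With v[1] = false, flipping v[0] has no effect.
        System.out.println(derivative(normalUsage, new boolean[] {false, false}, 0)); // prints false
    }
}
```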
2.5.5.6 Experimenting Usage and Threat Models
Programming will be used as a tool to experiment with various usage and threat models. These
usage and threat models are expected to be derived from a computer system. This
experiment will lead to the design and implementation of a normal usage system, an intrusion
detection system and a risk analysis system. These systems are expected to be components
of a security audit framework.
2.5.5.7 Computer Usage Survey
A questionnaire for obtaining information about computer and smart phone usage will be
employed. It is expected that this will give an idea of the various statistics that make up a
computer or smart phone's usage. These statistics will guide the sampling of experimental
data on a computer system's usage when experimenting with the usage and threat models.
2.5.5.8 Threat Detection Systems
It is hoped that an anomaly based threat detection system will be developed to demonstrate
how effectively the research model can be used to model system usage and threats.
The effectiveness of the threat detection system at preventing threats on a computer
system will also be measured. In this project, the threat detection system that will
be developed is for ecommerce sites.
2.6 Threats Associated With Computer Systems
This chapter discusses some of the threats and attacks associated with computer and network
systems.
2.6.1 Attacks associated with a computer system
The attack types that will be discussed include Malicious Code, IP Scan and Attack, Web
Browsing, Virus, Unprotected Shares, Mass emails, Simple Network Management
Protocol (SNMP), Hoaxes, Backdoors, Password Crack, Brute Force, Dictionary, Denial of
Service (DoS) and Distributed Denial of Service (DDoS), Spoofing, Man in the Middle,
Spam, Mail Bombing, Sniffers, Social Engineering, Buffer Overflow and Timing Attack.
2.6.2 Malicious Code
Malicious code attacks include the execution of viruses, worms, Trojan horses and active Web
scripts with the intent to destroy or steal information. The state of the art in malicious
code is the polymorphic or multivector worm. These attack programs use up to six attack
vectors to exploit a variety of vulnerabilities in commonly found information system devices.
Perhaps the best illustration of such an attack remains the outbreak of Nimda in September
2001, which used five of the six vectors to spread with startling speed. TruSecure
Corporation, an industry source for information security statistics and solutions, reports
that Nimda spread to span the Internet address space of 14 countries in less than 25 minutes.
2.6.3 IP Scan and Attack
The infected system scans a random or local range of IP addresses and targets any of several
vulnerabilities known to hackers or left over from previous exploits such as Code Red, Back
Orifice or PoizonBox.
2.6.4 Web Browsing
If the infested system has write access to any Web page, it makes all the Web content files
(.html, .asp, .cgi and others) infectious so that users who browse to those pages become
infected.
2.6.5 Virus
Each infested machine infects certain common executable or script files on all computers to
which it can write with virus code that can cause infection.
2.6.6 Unprotected Shares
Using vulnerabilities in file systems and the way many organizations configure them, the
infested machine copies the viral components to all locations it can reach.
2.6.7 Mass emails
By sending email infections to addresses found in the address book, the infected machine
infects many users, whose mail reading programs also automatically run the program and
infect other systems.
2.6.8 Simple Network Management Protocol (SNMP)
By using the widely known and common passwords that were employed in the early versions
of the protocol (which is used for remote management of networks and computer devices),
the attacking program can gain control of the device. Most vendors have closed these
vulnerabilities with software upgrades.
2.6.9 Hoaxes
A more devious approach to attacking computer systems is the transmission of a virus hoax
with a real virus attached. When the attack is masked in a seemingly legitimate message,
unsuspecting users readily distribute it. Even though those users are trying to do the right
thing to avoid infection, they end up sending the attack on to their coworkers and friends and
infesting many users along the way.
2.6.10 Backdoors
Using a known or previously unknown and newly discovered access mechanism, an attacker
can gain access into a system or network resource through a back door. Sometimes, these
entries are left behind by system designers or maintenance staff and thus referred to as trap
doors. A trap door is hard to detect, because, very often the programmer who puts it in place
also makes the access exempt from the usual audit logging features of the system.
2.6.11 Password Crack
Attempting to reverse-calculate a password is often called cracking. A cracking attack is a
component of many dictionary attacks. It is used when a copy of the security account manager
(SAM) data file can be obtained. The SAM file contains the hashed representation of the user’s
password. A candidate password can be hashed using the same algorithm and compared with
the stored hash. If they are the same, the password has been cracked.
2.6.12 Brute Force
The application of computing and network resources to try every possible combination of
options of a password is called a brute force attack. Since this is often an attempt to repeatedly
guess passwords to commonly used accounts, it is sometimes called a password attack. If
attackers can narrow the field of accounts to be attacked, they can devote more time and
resources to attacking fewer accounts. That is one reason a recommended practice is to
change account names for common accounts from the manufacturer’s default. While often
effective against low-security systems, password attacks are often not useful against systems
that have adopted the usual security practices recommended by manufacturers.
2.6.13 Dictionary
This is another form of brute force attack. The dictionary attack narrows the field by selecting
specific accounts to attack and uses a list of commonly used passwords (the dictionary) instead
of random combinations. Organizations can use similar dictionaries to disallow passwords
during the reset process and thus guard against easy-to-guess passwords. In addition, rules
requiring additional numbers and/or special characters make the dictionary attack less
effective.
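The defensive use of a dictionary described above can be sketched as a simple check run
during a password reset. This is a minimal illustration only: the word list, the function name
is_acceptable and the exact character rules are assumptions made for this example, not part
of the system described in this work.

```python
import string

# Hypothetical mini-dictionary; a real deployment would load a much larger list.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein", "admin"}


def is_acceptable(password: str, dictionary: set = COMMON_PASSWORDS) -> bool:
    """Return True if the password passes the dictionary and character rules."""
    # Rule 1: reject anything that appears in the dictionary (case-insensitive).
    if password.lower() in dictionary:
        return False
    # Rule 2: require at least one digit and one special character.
    has_digit = any(ch.isdigit() for ch in password)
    has_special = any(ch in string.punctuation for ch in password)
    return has_digit and has_special


print(is_acceptable("letmein"))      # rejected: listed in the dictionary
print(is_acceptable("blue-Heron7"))  # accepted: not listed, has a digit and a symbol
```

Rejecting listed passwords at reset time removes the entries an attacker's dictionary is most
likely to contain, which is the effect the paragraph above describes.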
2.6.14 Denial of Service (DoS) and Distributed Denial of Service (DDoS)
In a denial of service attack, the attacker sends a large number of connection or information
requests to a target. So many requests are made that the target system cannot handle them
successfully along with legitimate requests for service. This may result in the system crashing
or simply becoming unable to perform ordinary functions. A distributed denial of service is an
attack in which a coordinated stream of requests is launched against a target from many
locations at the same time. Most DDoS attacks are preceded by a preparation phase in which
many systems, perhaps thousands, are compromised. The compromised machines are turned
into zombies, machines that are directed remotely (usually by a transmitted command) by
the attacker to participate in the attack. DDoS attacks are the most difficult to defend against
and there are presently no controls that any single organization can apply. There are,
however, some cooperative efforts to enable DDoS defences among groups of service
providers; among them is the Consensus Roadmap for Defeating Distributed Denial of Service
Attacks.
2.6.15 Spoofing
Spoofing is a technique used to gain unauthorized access to computers wherein the intruder
sends messages to a computer with a source IP address indicating that the messages are
coming from a trusted host. To engage in IP spoofing, a hacker must first use a variety of
techniques to find an IP address of a trusted host and then modify the packet headers so that
it appears that the packets are coming from that host. Newer routers and firewall
arrangements can offer protection against IP spoofing.
2.6.16 Man in the Middle
In the well-known man-in-the-middle or TCP hijacking attack, an attacker monitors (or sniffs)
packets from the network, modifies them and inserts them back into the network. This type
of attack uses IP spoofing to enable an attacker to impersonate another entity on the
network. It allows the attacker to eavesdrop as well as to change, delete, reroute, add, forge,
or divert data. In a variant of the TCP hijacking session, the spoofing involves the interception
of an encryption key exchange, which enables the hacker to act as an invisible
man-in-the-middle (that is, an eavesdropper) with regard to encrypted communications.
2.6.17 Spam
Spam is unsolicited commercial email. While many consider spam a trivial nuisance rather
than an attack, it has been used as a means to make malicious code attacks more effective.
In March 2002, reports emerged of malicious code embedded in MP3 files that were included
as attachments to spam. The most significant consequence of spam on the modern
organization, however, is the waste of both computer and human resources it causes by the
flow of unwanted electronic mail. Many organizations attempt to cope with the flood of spam
by using filtering technologies to stem the flow. Other organizations tell the users of the mail
system to delete unwanted messages.
2.6.18 Mail Bombing
Another form of e-mail attack that is also a DoS is called a mail bomb, in which an attacker routes
large quantities of e-mail to the target. This can be accomplished through social engineering
or by exploiting various technical flaws in the Simple Mail Transport Protocol. The target of
the attack receives unmanageably large volumes of unsolicited e-mail. By sending large
e-mails with forged header information, attackers can take advantage of poorly configured
e-mail systems on the Internet.
2.7 Mathematical Modelling Techniques and Machine Learning Based Models
The mathematical relation that represents the normal usage model can be determined using
regression analysis. Regression analysis is a field of statistics. It employs the least squares
method to determine the relationship between a data set composed of two or more variables.
The least squares method tries to determine the relationship by minimizing the error margin
of the derived relation.
2.7.1 Simple Linear Regression
Simple linear regression problems involve a dependent and a single independent variable.
The goal is to find a linear relationship between the two variables. The linear relationships are
of the form y=b0+b1x where y is the dependent variable and x is the independent variable.
The slope of the line is b1 and the y-intercept is b0. The relationship between the dependent
and independent variable can be derived using the least squares method. First, the sums of
the dependent and independent variables, $\sum y$ and $\sum x$, and the sum of their
products, $\sum xy$, must be calculated. Secondly, the sums of the squares of the dependent
and independent variables, $\sum y^2$ and $\sum x^2$, must be calculated. In what follows,
$n$ is the sample size.
The constant that represents the slope of the fitted line is calculated as the product of the
sample size and the sum of products, minus the product of the sums of the dependent and
independent variables, divided by the product of the sample size and the sum of the squares
of the independent variable minus the square of the sum of the independent variable:
$$b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
The constant that represents the y-intercept of the line is calculated as the product of the
sum of the dependent variable and the sum of the squares of the independent variable, minus
the product of the sum of the independent variable and the sum of products, divided by the
same denominator:
$$b_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - \left(\sum x\right)^2}$$
Finally, the correlation coefficient of the predictive relation shares the numerator of the
slope. Its denominator is the square root of the product of two terms: the sample size times
the sum of the squares of the independent variable minus the square of its sum, and the
sample size times the sum of the squares of the dependent variable minus the square of its
sum:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}$$
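The three calculations described above can be sketched in a few lines of Python. This is a
minimal illustration that assumes the samples are held in two equal-length lists; the function
name simple_linear_regression and the sample values are hypothetical.

```python
import math


def simple_linear_regression(xs, ys):
    """Return (b0, b1, r) for y = b0 + b1*x using the sums-based formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Shared denominator of the slope and intercept formulas.
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(denom * (n * sum_y2 - sum_y ** 2))
    return b0, b1, r


# Hypothetical samples lying exactly on y = 3 + 2x.
b0, b1, r = simple_linear_regression([1, 2, 3, 4], [5, 7, 9, 11])
print(b0, b1, r)
```

Because the sample points lie exactly on a line, the sketch recovers the intercept 3 and the
slope 2, with a correlation coefficient of 1.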
2.7.2 Multiple Linear Regression
Multiple linear regression problems involve a dependent variable and two or more
independent variables. Using the least squares method, the goal is to find the linear
relationship between the variables involved. The relationships are of the form y=b0 +
b1x1+b2x2+…+bnxn, where n is the number of independent variables, x1, x2,… ,xn are the
various independent variables and y is the dependent variable.
To solve multiple linear regression problems, we first reduce the expected function, or multiple
linear model, to its simple linear forms, in which it is easier to determine the regression
equation. To do this, we determine y=b0+b1x for every independent variable. That
way, the set of regression coefficients, denoted b, associated with the independent variables
can be determined using the least squares method. As such, the set b, made up of b1, b2,…, bn,
contains all the regression coefficients associated with the predicted regression
function.
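For comparison, the coefficients of y=b0+b1x1+…+bnxn can also be fitted jointly. The sketch
below uses the standard normal equations of least squares, which is an assumption of this
example and not necessarily the per-variable reduction described above. The helper names
and the sample data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix so row operations apply to both sides.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_linear_regression(rows, ys):
    """Return [b0, b1, ..., bn] minimising the squared error of y = b0 + sum(bi*xi)."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical samples generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coeffs = multiple_linear_regression(rows, ys)
print(coeffs)
```

The sketch assumes the independent variables are not perfectly collinear; otherwise the
normal equations have no unique solution and the elimination step divides by zero.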
2.7.3 Non Linear Regression
Non linear regression problems involve finding a non linear relationship between a dependent
variable and one or more independent variables. Because non linear graphs are difficult to
analyze, they can be represented mathematically as linear models before they are analyzed.
This makes it possible to use linear regression techniques to analyze such relationships.
One of the ways used to represent non linear relationships with linear models is taking logs
on both sides of the relationship equation. That reduces the non linear relationship to a linear
relationship. An example is of the form $y^2 = x^2/(xy)$. To reduce this relationship to a linear
relation we take logs on both sides of the relation.
The resulting relationship is $2\log y = 2\log x - \log x - \log y$. When this relationship is
simplified the resulting relationship is $\log y = (\log x)/3$. In this form, the $\log y$ term
represents the dependent variable and the $\log x$ term represents the independent variable.
Let $K = \log y$ and let $P = \log x$. It implies that $K = P/3$. This becomes the linear form of
our non linear relation.
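The log transformation can be checked numerically. The sketch below generates hypothetical
samples that satisfy the example relation, transforms both variables with logarithms and fits a
straight line to the transformed data. The sample values and variable names are assumptions
made for illustration.

```python
import math

# Hypothetical samples that satisfy y**2 = x**2 / (x*y), i.e. y = x**(1/3).
xs = [1.0, 8.0, 27.0, 64.0]
ys = [x ** (1.0 / 3.0) for x in xs]

# Transform to the linear form K = slope * P + intercept, with P = log x and K = log y.
ps = [math.log(x) for x in xs]
ks = [math.log(y) for y in ys]

# Least-squares slope and intercept of K against P.
n = len(ps)
mean_p = sum(ps) / n
mean_k = sum(ks) / n
slope = (sum((p - mean_p) * (k - mean_k) for p, k in zip(ps, ks))
         / sum((p - mean_p) ** 2 for p in ps))
intercept = mean_k - slope * mean_p
print(slope, intercept)  # slope close to 1/3, intercept close to 0
```

The fitted slope agrees with the coefficient 1/3 obtained algebraically, which suggests that
linear regression on the transformed variables recovers the non linear relation.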
2.7.4 Machine Learning Based Models Used for Developing Anomaly Based Intrusion Detection Systems
This section discusses how hidden Markov models can be used to detect and prevent threats
on a computer system.
Hidden Markov models are machine learning models that are used to model the states of a
system, the sequence in which they occur and the probability associated with each state
transition. When a system has a set of states in which it usually falls, and it can be predicted
or established that each new state depends on the previous states, then hidden Markov
models can be used to learn the state transitions that usually happen in the system. It must
be stated that the sequence in which states occur in a system can be characterized by a
parametric random process. Also, the probability associated with each state transition is
independent of the time at which the transition occurs in the system.
For computer systems whose occurrences follow a parametric
random process, these occurrences can be seen as the set of states in the system. Some of
these occurrences may be the point at which the system is at its optimal usage, and the point
at which a particular threat occurs in the system. When the set of threat types that happen in
the system is determined, it becomes possible to study the sequence in which these threats
occur in the system and the various transitions between the threats using hidden Markov
models. Also, the various usage points, including the optimal, the minimum and the average
usage, and the transitions between them can be studied using hidden Markov models.
Because various occurrences and threats can be studied using hidden Markov models,
it becomes possible to predict the next occurrence or threat that will happen on a host or a
computer network. Threat sources can also be predicted using threat models. When threat
models are integrated, they give a general idea about the source of the threat. With such
knowledge, the next threat or occurrence that is most likely to happen
on a host or network can be predicted by applying hidden Markov models. As such,
occurrences can be prevented if they are estimated to be disastrous. Also, if, for
some reason, the optimal or minimal usage must be reached, it becomes possible to study
ways of optimizing the transition from the current state or predicted next state to the
required state. This makes it possible to move from a particular usage point to the desired
usage point.
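The idea of learning state transitions and predicting the next state can be sketched as
follows. For simplicity, the sketch assumes that the states are directly observed, so it
estimates a first-order Markov chain; a full hidden Markov model would additionally need
hidden states and emission probabilities. The state names and the observed sequence are
hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequence):
    """Estimate first-order transition probabilities from an observed state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for state, followers in counts.items()
    }


def most_likely_next(transitions, state):
    """Return the successor of `state` with the highest estimated probability."""
    followers = transitions[state]
    return max(followers, key=followers.get)


# Hypothetical log of usage states observed on a host.
observed = ["average", "optimal", "average", "optimal",
            "average", "threat", "average", "optimal"]
model = estimate_transitions(observed)
print(model["average"])                    # optimal: 0.75, threat: 0.25
print(most_likely_next(model, "average"))  # optimal
```

In this hypothetical log, a threat follows the average state in one of four transitions, so
the estimated probability of moving from average usage to a threat is 0.25.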
This approach to threat detection and usage optimization makes it possible to build
anomaly based intrusion detection systems that are correct and prompt, and that increase
optimal use of the system. The anomaly based intrusion detection systems built using these
techniques are correct because the threat models come from usage models that are built using
similar approaches and the threat prediction and prevention mechanisms are designed using
robust techniques developed using these approaches. Also, there are likely to be fewer false
alarms since the threats predicted on hosts or on the network come from threat models
designed from such robust methods.
An example of a kind of cyber security threat that this approach can be used to model
is a network problem where a student is determined or predicted to be sending threatening
or socially unacceptable emails to colleagues. Typically, his identity is hidden on the network
on which he sends the emails. As such, it is difficult to determine the likelihood that he will
send such threatening emails on a particular day or hour so that he could be
identified and brought to book. Using hidden Markov models, a usage model of the email
system could be developed that will make it possible to determine the day or hour in which
he is likely to send such an email. This will help in determining his identity and bringing him
to book.
2.8 The Normal Usage Model of a System
If the normal usage of a network system can be represented by a mathematical function
made up of system variables Xi and system constants Ci, then any
representation of our mobile system can be summarized as Y=f (Xi, Ci), where Y is our system’s
usage and Xi are the various independent variables of our mobile system that constitute the
normal usage model of the system. A normal usage model is an abstract representation of
the usual or normal functioning or behaviour of a system.
In order to model the normal usage of our system and determine its mathematical
representation, it is essential to keep the method simple and the variables simple in
abstraction and minimal in quantity. This makes it easy to analyze, model and detect threats
by applying a branch of calculus called differentiation. Simplicity and minimal number of
variables make it possible to arrive at a mathematical function whose differential coefficient
can be easily computed using differentiation. As such, two cases will be considered.
In the first case, the normal usage model of our system can be analyzed and modelled
based on simple but essential micro usage models. These micro usage models represent
smaller components of our mobile system such as an authentication system of our mobile
system, and a user’s session. Ideally, these models are best derived from exactly one most
appropriate system variable when feasible or at most two in order to reduce the complexity
involved in computing the differential coefficient of the usage model.
For a mathematical function involving more than a single independent variable, our
method for threat detection using the differential equations techniques is within the scope
of multivariable calculus. Since it is easy to compute the differential coefficient of a single
variable function, our threat analysis and detection can be easy if all our micro models are
single variable functions.
In the second case however, our usage model derives its mathematical representation from
at least two or three most relevant system variables of the mobile system under examination.
This option increases the complexity involved in calculating the differential coefficient of our
normal usage model and analyzing the threat associated. This is because the normal usage
model for this case is a function that can be derived from two or more independent system
variables.
To do this type of differentiation, we use a branch of calculus called partial
differentiation, where one of the independent variables of our usage model is held constant
to analyze changes in the usage. This type of differentiation is also within the scope of
multivariable calculus. The sections that follow the one below throw more light on how to
model the normal usage of several micro usage models. These micro usage models are
expected to be components of a computer network’s usage.
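Holding one variable constant while varying the other can be approximated numerically. The
sketch below uses a central finite difference, which is an assumption of this example and
not the analytical differentiation used in this work. The two-variable usage function and its
coefficients are hypothetical.

```python
def partial_derivative(f, point, index, h=1e-6):
    """Central-difference estimate of df/dx_index at `point`, other variables held constant."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


# Hypothetical two-variable usage model; the coefficients are illustrative only.
def usage(x1, x2):
    return 4 * x1 + 0.5 * x2 ** 2 + 10


d_x1 = partial_derivative(usage, (3.0, 2.0), 0)  # x2 held constant, close to 4
d_x2 = partial_derivative(usage, (3.0, 2.0), 1)  # x1 held constant, close to 2
print(d_x1, d_x2)
```

Each estimate describes how the usage changes when only one system variable changes, which
is the quantity that would be compared against a threat threshold.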
It must be noted that the usage model is made up of the usage model function and a
statistical model that captures the mean and standard deviation of the predicted usage
function. This statistical usage model is called moments or mean and standard deviation
model. There are other statistical models that could have been used. These include time series
models, univariate models and bivariate models.
2.8.1 Single Variable Calculus Review and its Applications
Assume a mobile system with exactly three major system variables. If sampling each of these
variables helps us to arrive at exactly one micro usage model of our mobile system that best
represents the behavior or functioning of that feature of our system, then we can use
differential equations of the three micro models to analyze and detect threats. Below are
some examples of calculus basics for our threat modelling techniques.
$Y = 2X + 3$ is a linear function that represents our first micro usage model, where X is the
number of authentications. $Y = 3X^2 + 2X + 6$ is a quadratic function that represents our
second micro usage model, where X is the number of hosts on the mobile system’s wireless
network. $Y = 40/X + 5$ is a reciprocal function that represents our third micro usage model,
where X is the number of applications on a host on the mobile system’s wireless network. For
each micro usage model, the differential coefficient can be computed using the laws of
differentiation given below.
Theorem 1: $\frac{d}{dX}(C) = 0$, where C is a constant. Theorem 2: the derivative of f (Xi, Ci)
is computed term by term. After f (Xi, Ci) has been simplified, each term of the form $cX^n$
contributes $cnX^{n-1}$, that is, the product of the exponent and the constant beside it,
multiplied by the system variable raised to the original exponent minus one. This step is
repeated until every term of f (Xi, Ci) has been evaluated, and the final result is the sum of
these contributions.
From the calculus basics review above, the corresponding differential coefficients of the three
micro models are $2$, $6X + 2$, and $-40/X^2$. If the standard deviations of
our micro models are computed, then we can analyze changes in our system by looking at
values of our usage model and its derivatives and how they relate to the average usage, its
corresponding standard deviation, and the acceptable thresholds for threats.
Any occurrence at a point where our usage model value is not equal to the average usage
indicates a threat. Any occurrence at a point where the usage model value is less than the
average usage minus its corresponding standard deviation is a denial of service threat. Any
occurrence at a point where the usage model value is greater than the average usage plus its
corresponding standard deviation is an intrusion. Also, any occurrence at a point where the
value of the usage’s derivative is not equal to the acceptable threshold for threats is a threat.
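The mean and standard deviation rules above can be sketched as a small classifier. The
sketch makes one interpretive assumption: values inside the band between the average minus
and plus one standard deviation are labelled normal, following the idea of an acceptable
range. The baseline values and the function name classify are hypothetical.

```python
import statistics


def classify(value, mean, std):
    """Label a usage value using the mean +/- standard deviation rule."""
    if value < mean - std:
        return "denial of service"
    if value > mean + std:
        return "intrusion"
    return "normal"


# Hypothetical usage values sampled during normal operation.
baseline = [10, 12, 11, 13, 9, 11, 12, 10]
mean = statistics.mean(baseline)
std = statistics.pstdev(baseline)

for observed in (11, 3, 25):
    print(observed, classify(observed, mean, std))
```

With this baseline the average usage is 11, so an observed value of 3 falls below the band
and an observed value of 25 falls above it.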
2.8.2 Usage Model List
It must be stated that for each component of the system under investigation, we will create
a usage model.
2.8.3 Authentication Usage Model
The authentication usage model represents the usage of an authentication system. The
independent variables that must be sampled to determine the usage of an authentication
system are the average data transmitted during an authentication (x1) and the average
network speed for a single authentication (x2). The average data transmitted is the average
of request and response data for a single authentication and the average network speed is
the average upload and download speed for a single authentication. The dependent variable
that must be sampled is the time taken for an authentication (y).
The goal of modelling the dependent and independent variables is to arrive at a
mathematical relationship between y and the two independent variables x1 and x2. It is
expected that the relationship will be Y=c1(x2/x1) +c2, where c1 and c2 are system constants.
In addition to that, some system constants that will aid threat analysis must be determined.
These are the total number of valid authentications, the expected authentications within a
time frame, the minimum authentications within a time frame and the maximum
authentications within a time frame. The mathematical relationship between y, x1 and x2 is
the normal usage model of the authentication system. After this relationship has been
determined, various occurrences that deviate from this relationship can be used to analyze
threats. For instance, any occurrence that is not equal to the average usage is a threat.
Additionally, any occurrence that indicates a change outside an acceptable threshold is a
threat. The acceptable threshold is a range within which changes in the systems are deemed
normal. Such a range is composed of the average usage and standard deviation.
2.8.4 Session Usage Model
A session usage model represents a single user’s behavior before his session expires. To
determine the mathematical model for a user’s session, two main independent variables must
be sampled. These are size of session data accumulated (x1), and number of user actions (x2).
The dependent variable that must be sampled is time spent before session expires (y). The
session usage model is expected to be made up of two micro usage models. The mathematical
representations of the micro usage models are expected to be Y=c1x1+c2 and Y=c1x2+c2,
where c1 and c2 are system constants in each case.
In addition to the two mathematical functions, some system constants that will aid threat
analysis must be determined. These include average user actions, average size of data
accumulated, average time spent. These constants can be determined from the data set used
to determine the usage model.
The two mathematical relationships represent the session usage model. Both are linear
functions. It is expected that as user actions increase, the time spent also increases. It is also
expected that as the data accumulated increases, the time spent also increases.
2.8.5 Memory Usage Model
The memory usage model represents the usage of memory space in a system. The
independent variables that must be sampled are number of application programs running
(x1), and the number of system processes running (x2). The dependent variable that must be
sampled is the amount of memory space being used (y). The mathematical relationship between
x1, x2, and y is expected to be y=c1x1+c2x2+c3 where c1 is the average memory space for
programs, c2 is the average memory space for processes and c3 is the average memory being
used when no process or program is running.
In addition to these, some system constants that aid threat analysis must be determined.
These include the minimum and maximum memory space for programs and the minimum
and maximum memory space for processes. The mathematical relationship between x1, x2,
and y is the memory usage model. When determined, the memory usage model can be used
to analyze changes in the memory usage that indicate threats in the system.
2.8.6 CPU Usage Model
The CPU usage model represents CPU usage in a system. The independent variables that must
be sampled are the number of application programs running (x1), and number of system
processes running (x2). The dependent variable that must be sampled is amount of CPU
power being used (y). The mathematical relationship between x1, x2, and y is expected to be
y=c1x1+c2x2+c3 where c1 is the average CPU power being used for programs, c2 is the
average CPU power being used for processes and c3 is average CPU power being used when
no process or program is running. In addition to these, some system constants that aid threat
analysis must be determined. These include the minimum and maximum CPU power for
programs and the minimum and maximum CPU power for processes. The mathematical
relationship between x1, x2 and y is the CPU usage model. When determined, the CPU usage
model can be used to analyze changes in the CPU usage that indicate threats in the system.
2.8.7 Program Usage Model
To determine the program usage model, the dependent and independent variables that must
be sampled are the time spent using the program (y) and the number of functions used (x). In
addition to that, the following constants must also be determined: the minimum functions used
and the maximum functions used. The relationship between y and x determined after sampling
various x and y values is the program usage model denoted by y=f(x).
2.8.8 Host Usage Model
The host usage model is composed of four independent variables: memory usage (x1), session
usage (x2), CPU usage (x3), and program usage (x4), derived from their respective usage
models. The dependent variable that must be sampled is the time spent on the host (y). Any
relationship determined between the dependent and the independent variables is the host
usage model. The resulting host usage model is denoted y=f (x1, x2, x3, x4).
2.8.9 Battery Usage Model
The battery usage model is made up of the average CPU usage, the average memory usage and
the average session usage in the system. These are the independent
variables. The dependent variable is the battery lifespan. The independent variables are
derived from their respective micro usage models.
2.8.10 Device Usage Model
The device usage model is made up of a battery usage model, a host usage model, and the
time spent on the device. The usage models that make up the device usage model compute
the average micro usage and try to relate that with the time spent on the device. The time
spent on the device is the dependent variable.
2.8.11 Server Usage Model
The server usage model is made up of the CPU time being used, the memory space being used
and the number of processes running. These variables are used to form two different micro
usage models. As such, there are two dependent variables, CPU time and memory space. The
independent variable for both micro usage models is the number of processes running.
2.8.12 Port Usage Model
The port usage model is made up of the time elapsed during communication, number of
programs that use the port and the number of paired ports. The number of paired ports is the
dependent variable and the remaining variables are the independent variables.
2.8.13 Network Usage Model
The network usage model is made up of average port usage, average server usage, average
host usage, the average size of data transmitted on the network, and time spent on the
network. The first three variables are the independent variables. The remaining two are the
dependent variables. As such, two micro usage models make up the network usage model.
2.8.14 Aggressive Usage Detector
This model is a utility that detects aggressive behavior on a system. It is modelled just like the
various micro usage models. Various factors that determine aggressive behavior during
system usage are used to determine the mathematical representation of this utility.
Aggressive behavior includes aggressive use of major system resources, and aggressive use of
system components with limited resources.
The average aggressive behavior and its standard deviation are determined. Any system
occurrence that matches the average aggressive behavior, the average aggressive
behavior plus its standard deviation, or the average aggressive behavior minus its standard
deviation is considered a threat and must be halted, raised as an alert, or stored for audit
purposes.
2.8.15 False Alarm Detector
The false alarm detector is a utility that detects normal system usage that otherwise may be
deemed threats. Occurrences that meet the criteria for false alarms are normal usage that
seems to put the entire usage of the system into a false state of vibration or anarchy. Such
usage occurrences are as such prioritized as normal optimal usage. The remedy for the
vibrations such usage occurrences cause is delay in other normal usage occurrences in the
system.
The state and magnitude of other system occurrences plus the state and magnitude
of the normal optimal usage determine the impact of the perceived anarchy. To increase
the convenience of the system for which this utility is developed, the average delay time
is modelled just like the aggressive usage detector.
2.8.16 Special Parameters of The Usage Model
This section discusses special parameters of our normal usage model. These parameters
include the average usage, the usage standard deviation, the minimum usage, the maximum
usage and the most frequent usage value recorded.
The average usage is the predicted average usage after the normal usage model function has
been determined. The usage standard deviation is the standard deviation of the predicted
normal usage function. The minimum and maximum usage values are the minimum and
maximum usage predicted using the normal usage model. These parameters together with
usage rates, threat model constants and other usage constants are used in analyzing and
detecting threats.
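The special parameters can be computed directly from the values predicted by a fitted usage
function. The sketch below is a minimal illustration: the usage function y=2x+3, the sampled
inputs and the dictionary keys are assumptions made for this example.

```python
import statistics


def usage_parameters(predicted):
    """Summarise predicted usage values into the special parameters."""
    return {
        "average": statistics.mean(predicted),
        "std_dev": statistics.pstdev(predicted),
        "minimum": min(predicted),
        "maximum": max(predicted),
        "most_frequent": statistics.mode(predicted),
    }


# Hypothetical predictions from a fitted usage function y = 2x + 3.
predicted = [2 * x + 3 for x in (1, 2, 2, 3, 4)]
params = usage_parameters(predicted)
print(params)
```

The sketch uses the population standard deviation; the sample standard deviation
(statistics.stdev) could be used instead if the predictions are treated as a sample.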
2.8.17 Building The Usage Profile
To build the usage profile, we will first program a usage model for each component of the
computer system under investigation. For this research, we want to build the usage profile
for a computer network. As such, we will program usage models for authentication, a user’s
session, memory usage and CPU usage on the computer system. Additionally, we will program
a usage model for a host on the network, another for a server on the network and, finally, a
usage model for the network itself.
The usage model for each component represents the behaviour of that component of the
computer system under investigation. When implemented, the usage model will help us
determine the regression equation that represents the research model, together with the
average usage and its standard deviation. In addition to the regression equation and the
mean and standard deviation model, we will develop a Markov chain model for the system
under investigation. To do so, we will determine the states of the entire computer network,
the transitions between those states, and the probability of each transition. The rest of this
chapter explains how to build a usage profile using an authentication system, describes
the critical variables of the other usage models, and presents the mathematical
theory needed for building the usage profile.
2.8.18 Building a Usage Profile for an Authentication System
To build a usage model for an authentication system, we must sample the critical variables
of the system. These variables include the download speed on the network, the upload speed
on the network, the size of data sent to the server during authentication, the size of data sent
to the client during authentication and the time it takes for a successful authentication. The
data sent to the server and the data received from it are the request data and the response
data, respectively.
To build the usage model for the authentication data, we will capture data for all the critical
variables at equal time intervals, say every 10 minutes, while the authentication system is being
used. Once we have a sample of about 10 observations, we will determine the
relationship between the dependent variable and the independent variables. As already
stated, the relationship can be determined using simple or multiple linear regression. In
addition to the regression equation, we will also determine other statistics that describe the
behaviour of the authentication system, such as the mean and standard deviation of each
variable that was sampled.
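The simple linear regression case can be sketched as follows. This is a minimal
illustration using ordinary least squares with one independent variable. The function name
fit_simple_regression and the sample data are hypothetical and are not the experimental
data of this paper.

```python
from math import sqrt


def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r), where r is the correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical sample of 10 observations taken at equal intervals:
# request size in kilobytes against authentication time in milliseconds.
request_size = [1.0, 1.2, 1.1, 1.5, 1.4, 1.3, 1.6, 1.8, 1.7, 2.0]
auth_time = [110, 118, 115, 131, 127, 122, 136, 145, 140, 152]
slope, intercept, r = fit_simple_regression(request_size, auth_time)
```

With more than one independent variable, the same idea extends to multiple linear
regression, which is usually solved with a numerical library rather than by hand.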
2.8.19 Building A Markov Chain Model for An Authentication System
Hidden Markov models are machine learning models that are used to model the states of a
system, the sequence in which they occur and the probability associated with each state
transition. When a system has a set of states in which it usually falls, and it can be predicted
or established that each new state depends on the previous states, hidden Markov
models can be used to learn the state transitions that usually happen in the system.
To build the Markov chain model, we will determine the states of the authentication system and
their associated probabilities. Some of these states include the average usage of the
authentication system. This may be abstracted as the average time it takes for a successful
authentication. Other states include the minimum and maximum recorded time for a
successful authentication, the average time it takes for a failed authentication, and the
maximum and minimum recorded time for failed authentications. With these states and
their associated probabilities of occurrence during a normal day, we have more information
about the behaviour of the authentication system.
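Transition probabilities between such states can be estimated by counting how often one
state follows another in a recorded sequence. The sketch below assumes the states are
directly observable, so it estimates a plain Markov chain rather than a hidden Markov
model. The state labels are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(state_sequence):
    """Estimate P(next state | current state) from an observed sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(state_sequence, state_sequence[1:]):
        counts[current][following] += 1
    return {
        state: {
            nxt: count / sum(successors.values())
            for nxt, count in successors.items()
        }
        for state, successors in counts.items()
    }


# Hypothetical states observed during a normal day.
observed = ["average", "average", "maximum", "average", "failed", "average"]
transitions = estimate_transitions(observed)
```

Each row of the resulting table sums to 1, so it can be read as the probability
distribution over the next state given the current one.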
2.8.20 Threat Models in a System
A threat is a change in the normal usage model that is beyond a certain acceptable threshold
called the standard deviation of the usage model. A threat model on the other hand is an
abstract representation of this change in our mobile system that is beyond the acceptable
threshold. Integration can be performed on a threat model to determine the source of the
threat. Integration is a reverse operation for differentiation in calculus. A threat model that
can perform integration operations can be called a novel self integrating data structure. This
chapter of the paper will look at threat models of the micro usage models that make up a
computer network and how to analyze these threats in order to prevent them.
Also, how to determine the sources of these threats using a novel self integrating
threat model will be discussed. To do this, three main functions are introduced. The functions
are y = 3, y = 4X + 2 and y = 9X^2 + 3. These functions are in the context of the novel self
integrating data structure. These functions are three different threat models. Additionally,
the threat models of the various micro usage models discussed in this paper will be explored.
2.8.21 Properties and Methods of the Novel Self Integrating Data Structure
The most useful properties or characteristics of the data structure that represents our threat
model include, to mention a few, the names of network software or host application software,
the version numbers of network and host software, license information such as the date the
software was purchased or released and the number of years until renewal, and the IP address
and MAC address of a host on a network.
The methods of such a gigantic or simulative object may include a method for
computing the integral of a threat model, a method for computing the differential coefficient
of the predictive normal usage model, and a method for computing the differential equation
of a network or host threat model. These are mostly the methods needed for
performing the major calculus operations that support the novel calculus simulation used
to detect threats and their sources on a wireless network. Besides these, it may be
necessary to implement methods that retrieve hidden network identities such as IP and MAC
addresses on a local area network.
2.8.22 Integration Review
Based on the three functions stated in this chapter, we will do an introductory review of
integration, the branch of calculus that is the reverse operation of differentiation. The
integrals of the functions introduced in this chapter are 3X + C, 2X^2 + 2X + C and
3X^3 + 3X + C respectively, where C represents system constants in the mobile system.
Computing the integral can be tricky, so two laws are defined below to aid quick computation
of the integrals of a normal mathematical function.
Theorem 1:
If a function is represented by a constant such as a rational number, the integral is the product
of the variable x and that constant, plus a system constant c that is
determined from a known pair of x and y values.
Theorem 2:
If a function is not represented by a constant, the integral is computed term by term. For
each term containing x, divide the coefficient of the term by the exponent of x plus 1, and
multiply the result by x raised to the power of that exponent plus 1. Any constant term is
integrated as in Theorem 1. The integral is the sum of these results plus the
corresponding system constant c.
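The two rules can be checked with a short sketch. Here a polynomial is represented as a
dictionary mapping each exponent to its coefficient, which is an illustrative choice and not
the data structure proposed in this paper. The constant C is omitted from the result.

```python
from fractions import Fraction


def integrate(polynomial):
    """Integrate a polynomial term by term using the power rule.

    polynomial maps exponent -> integer coefficient, e.g. {1: 4, 0: 2} is 4X + 2.
    The returned dictionary has the same layout, without the constant C.
    """
    return {
        exponent + 1: Fraction(coefficient, exponent + 1)
        for exponent, coefficient in polynomial.items()
    }


threat_constant = integrate({0: 3})         # y = 3        gives 3X
threat_linear = integrate({1: 4, 0: 2})     # y = 4X + 2   gives 2X^2 + 2X
threat_quadratic = integrate({2: 9, 0: 3})  # y = 9X^2 + 3 gives 3X^3 + 3X
```

Fractions are used so that coefficients such as 1/3 stay exact instead of being rounded.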
2.8.23 Interpretation of Threat Model Integrals
Since the novel self integrating data structure is a programmed threat model, it is important
to discuss the meaning of its integrals. The integrals represent the source of the original
threat. For example, the integral of a threat model may identify the function,
software, host or network from which the threat originated. With properties like software
name, version number, IP and MAC addresses, it becomes easy to pinpoint the source of the
threat.
If the integral of a threat model looks like the normal usage model of a function of the
system under examination, then that function from the system under examination can be
predicted as the source of the threat. Similarly, if the integral is similar to the normal usage
model of a software, host, or network that forms part of the system which is being
investigated, then that threat can be predicted to be from that software, host or network.
2.8.24 Threat Analysis and Detection
To do threat analysis in a system and abort the processes that initiated those threats, linear
and nonlinear programming techniques can be used. The goal here is to minimize the threat
occurrence frequency and the overall impact associated with the threat, and to optimize the
normal usage function. In addition to these two goals, there are some constants that aid
threat analysis. These constants are associated with the normal usage model and the threats
in the system.
Examples of these constants may be the rate at which usage is increasing with respect
to a particular usage variable or the rate at which the threat impact and frequency increases
with respect to a particular variable in the usage model and other special parameters
associated with the usage model function.
The average usage, its standard deviation and the threat model function make up the
threat model. The average usage and standard deviation are constants in the threat model.
Using the threat model function, the average usage and standard deviation, threat analysis
can be done using linear and nonlinear programming. The goal is to minimize threats using
the threat model function as the objective function and the average usage and standard
deviation as constraints. Other parameters that may be used as constraints include the rate
at which usage is increasing with respect to a particular usage variable or the rate at which
the threat impact and frequency are increasing with respect to a particular usage variable.
2.8.25 Threat Prediction
This section discusses how to predict threats in a system. The network usage model discussed
in the previous chapter and its associated threat model will be used to demonstrate how to
predict or detect a threat in a system. As discussed in the previous section, threats can be
detected using linear and nonlinear programming. The network usage model function and its
associated threat model function are the objective functions.
The constraints that will be used are the average network usage and its standard
deviation, and other parameters such as the rate at which the network threat increases with
respect to other network usage model components such as average host usage, average
server usage, average port usage, average time the network operates, and average data
transmitted on the network. The goal of the linear or nonlinear programming is to optimize
the usage such that it lies between the average usage minus its standard
deviation and the average usage plus its standard deviation. These are the lower and upper
bounds of our objective function. Every combination of system variables whose usage is
within this usage range minimizes threat in the system.
Since the average port, host and server usage are derived from their corresponding
usage models, the linear and nonlinear programming analysis will be done independently for
each of them. When a threat is predicted in a system, the chance of the prediction being
accurate depends on the usage value at that instant and whether it is within the range of
acceptable usage. This range is constructed using the average usage and its standard
deviation. Any usage value that is less than the average usage minus its standard deviation
is a threat. Also, a usage value that is greater than the average usage plus its standard
deviation is a threat. That means that any predicted threat at a point where the predicted
usage is within the usage range has a high chance of being false. In addition, the actual and
predicted usage values can be used to determine the chance that the predicted threat is
accurate. If the difference between them is high, there is a chance that the predicted usage
is wrong.
Since the predicted usage and the threat models are derived from the usage model
function, there is then a chance that the predicted threat is also false. Finally, the closer the
correlation coefficient of the usage model function is to zero, the higher the chance that the
predicted usage and its associated threat values are wrong. Usage model functions with a
correlation coefficient of 0.6 and above indicate that the predicted usage values and predicted
threat values are accurate. These values are obtained from the usage model function and the
threat model function respectively, which are modelled using the relevant system variables
that make it possible to model system usage and system threats.
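The range test and the correlation check described in this section can be sketched as
follows. The function names are hypothetical. Comparing the absolute value of the
correlation coefficient with 0.6 is an assumption, since the text only mentions values of 0.6
and above.

```python
def classify_usage(value, average, std_dev):
    """Label a usage value as 'threat' when it falls outside average +/- std_dev."""
    lower = average - std_dev
    upper = average + std_dev
    return "threat" if value < lower or value > upper else "normal"


def prediction_is_reliable(correlation, threshold=0.6):
    """Treat predictions as reliable when the correlation is far enough from zero.

    Assumption: a strong negative correlation counts as reliable too.
    """
    return abs(correlation) >= threshold


# Hypothetical figures: average usage 40 units with a standard deviation of 5.
label = classify_usage(50, average=40, std_dev=5)
trusted = prediction_is_reliable(0.72)
```

A predicted threat would then be acted on only when the value is labelled a threat and the
underlying usage model function is considered reliable.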
2.8.26 Risk Analysis in a System
To do risk analysis in a system, the frequency at which threats in the system occur and the
impact they have on the system must be known. When a frequency table is constructed for
all threats and their associated impacts are stored, it becomes easy to analyze the risks
associated with a system.
When a threat is predicted, the likelihood of the threat occurring in the system can be
computed using the threat frequencies. The impacts various threats have can also be
determined based on the types of threats and other parameters such as the number of such
threats, the speed at which they occurred and the resources they affected or damaged. Risk
in a system is computed as the product of the likelihood of threat occurrence and the impact
that threat occurrence has on the system. These concepts are the basics for developing a risk
analysis system using the techniques we have discussed so far.
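The computation of risk as likelihood multiplied by impact can be sketched as follows. The
threat names and impact scores are hypothetical, and estimating likelihood as relative
frequency is an assumption made for illustration.

```python
from collections import Counter


def threat_likelihoods(observed_threats):
    """Estimate the likelihood of each threat as its relative frequency."""
    counts = Counter(observed_threats)
    total = sum(counts.values())
    return {threat: count / total for threat, count in counts.items()}


def risk_scores(observed_threats, impacts):
    """Risk of each threat = likelihood of occurrence * impact on the system."""
    likelihoods = threat_likelihoods(observed_threats)
    return {threat: likelihoods[threat] * impacts[threat] for threat in likelihoods}


# Hypothetical threat log and impact scores on an arbitrary scale.
threat_log = ["brute_force", "brute_force", "brute_force", "port_scan"]
impact_scores = {"brute_force": 8, "port_scan": 2}
risks = risk_scores(threat_log, impact_scores)
```

Sorting the resulting scores in descending order gives a simple priority list for the risk
analysis system.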
2.9 Normal Usage Model and Threat Model Simulation
In this chapter, we discuss the experiment that was conducted to determine the usage of a
computer system. We also discuss how to simulate the threat and usage models with the
hope of developing a threat detection system. Four of the micro usage models that were
discussed in this paper were used for the simulation. These are the ones for authentication,
session, CPU and memory.
Because the usage model for authentication was determined to be a rational
function, logs were taken on both sides of the relation as part of the simulation in order to
reduce the relation to its linear form. The original function is Y = c1(x2/x1) + c2. When
reduced to its linear form we have log Y = log c1 + log x2 - log x1 + log c2. Since log c1 and
log c2 are constants, let us denote them by k1 and k2 respectively. Additionally, let B = log Y,
let j1 = log x1 and let j2 = log x2. Therefore, the linear form of the usage model for
authentication is B = j2 - j1 + k1 + k2. Since k1 + k2 is a constant, let it be represented by k.
As such, B = j2 - j1 + k, where B is the dependent variable and j2 and j1 are the independent
variables. When B, j2, and j1 are sampled, Y = c1(x2/x1) + c2 can be determined.
The CPU and the memory usage models are multiple linear forms. The original relation
is of the form y = c1x1 + c2x2 + c3, where x1 and x2 are the independent variables. The original
relation must be reduced to its simple linear form. To do this, determine y = b0 + bx for each
independent variable. The sum of the various b0 equals c3. Each b corresponds to the
constant associated with the independent variable for which y = b0 + bx was determined. For
example, the b determined for x1 equals c1 and that for x2 equals c2.
When x1, x2, and y are sampled and the various y = b0 + bx determined, y = c1x1 + c2x2 + c3 can
be determined completely.
The simulation was run four times within a week. The first run lasted 15 minutes, the
second 30 minutes, the third 45 minutes and the last 60 minutes. The functions for the
usage models and their corresponding correlation coefficients were also determined.
2.10 Tools and Computer Packages
This chapter discusses the tools and computer packages that were used throughout this
research project. We will also look at the programming languages, database platforms and
development frameworks that can be used to develop an anomaly based intrusion detection
system for ecommerce sites using the concepts we have discussed in this paper. The simulation
was implemented in Java as a console based simulation. Java was chosen for its object
oriented concepts such as encapsulation, inheritance, interfaces, objects, and polymorphism.
To implement an intrusion detection system using the results of this research, the
following tools will be essential: Bootstrap, CodeIgniter, MySQL Database Management
System, SQLite, SQLyog, and Eclipse. These tools are best suited for intrusion detection
systems developed for ecommerce sites. The implementation will use PHP and the Android
platform. PHP is for the desktops and laptops that connect to the ecommerce sites and
Android is for mobile phones that use the ecommerce sites.
Bootstrap and CodeIgniter are web development frameworks. Bootstrap is for
frontend development and CodeIgniter is a backend framework for PHP developers. For
Android development, Eclipse can be used as the IDE. MySQL and SQLyog
are for the database servers that will run on the ecommerce site as part of the intrusion
detection system implementation. SQLite is for the databases that run on the Android
implementations that form part of the intrusion detection system developed for the
ecommerce website.
With all these tools, frameworks and packages, developers are ready to develop
intrusion detection systems for ecommerce sites using the concepts in this research paper. It
is expected that the micro usage models discussed will be integral libraries that will be
implemented in PHP and Android as part of an implementation for ecommerce sites or any
group of web or mobile application systems.
3.0 Secured Expert Medical Consultation System
In this chapter, we describe the objectives for the secured expert medical consultation
system, the research question for developing the system, and the problems that we seek to
address. We will also look at the requirements of the system and then describe how the
system will be developed.
3.1 Problem Definition
The three problems that this research project seeks to address are:
● Developing a medical system that will assist in medical consultation
● Determining diseases that a Patient is likely to get based on medical history.
● Measuring medical information such as height, weight, temperature, blood pressure
and blood sugar.
3.2 Research Questions
● What are the best and most efficient ways of modelling diseases for development of
a medical care system for administering medical care?
● How can an expert medical consultation system be developed?
● How can we determine the diseases that a Patient will get based on medical history?
● How can we develop a hardware system and a software that can be used to measure
medical information such as weight, height, temperature, blood pressure and blood
sugar?
3.3 Objectives of Paper
● The first goal is to model disease for the development of an expert system for
diagnosing diseases during medical consultation.
● The second goal is to optimize how to match a set of symptoms and indications to a
particular disease.
● The third goal is to find diseases that a Patient can get based on medical history.
● The fourth goal is to develop medical equipment that can be used to measure
medical information such as height, weight, temperature, blood pressure and blood
sugar.
● The last goal is to develop an expert medical consultation system for mobile phones,
tablets and personal computers.
3.4 Literature Review
In this section we describe the various literature that forms part of this research. We will look
at evolutionary algorithms, the various types of evolutionary algorithms and how sensors can
help in measuring medical data.
3.4.1 Evolutionary Computing Terminologies
Some of the terms used in evolutionary computing are phenotypes, genotypes,
chromosomes, genes and alleles [3]. Phenotypes are the elements of the search space that
correspond to the possible solutions of a problem [3]. Genotypes are the elements of the
result space that represent those possible solutions [3]. The transition from the search space
(phenotypes) to the result space (genotypes) is called encoding [3]. The transition from the
result space back to the search space is called decoding [3]. In some cases, the search space
may be a set of integers and the result space a set of binary numbers, each representing an
integer in the search space [3].
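Following the integer and binary example above, encoding and decoding can be sketched as a
pair of inverse functions. The function names and the fixed genotype length of 8 bits are
illustrative assumptions.

```python
def encode(value, length=8):
    """Encode an integer from the search space as a fixed-length binary string."""
    return format(value, f"0{length}b")


def decode(bits):
    """Decode a binary string from the result space back to its integer."""
    return int(bits, 2)


genotype = encode(5, length=4)
phenotype = decode(genotype)
```

Decoding an encoded value returns the original integer, which is the property a
representation needs for the two spaces to correspond.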
3.4.2 Evolutionary Algorithms
One of the techniques used in evolutionary computing is the evolutionary algorithm. The
components of an evolutionary algorithm are representation, evaluation or fitness function,
population, parent selection mechanism, variation operators (recombination and
mutation), survivor selection mechanism (replacement), initialization and termination
condition [3]. Some of the classes of Evolutionary Algorithms are Genetic Algorithms, Genetic
Programming, Differential Evolution, Evolutionary Strategies, and Evolutionary Programming
[2].
3.4.2.1 Representation
Representation involves mapping the real world into the evolutionary computing world [3].
The possible solution set, which is the set of phenotypes, is encoded into objects in the
evolutionary computing world called genotypes [3]. Many synonyms are used to describe
elements of the two spaces [3]. The genotypes are also called chromosomes [3]. Genes are
placeholders and alleles are the objects in those places [3].
3.4.2.2 Evaluation or Fitness Function
The evaluation function forms the basis for selection [3]. It is the requirement to adapt to. It
defines what improvement means [3]. From the problem-solving perspective, it represents
the task to solve in the evolutionary computing context [3]. Technically, it represents a
procedure that assigns a quality measure to the genotypes [3]. Typically, the procedure is
composed from a quality measure in the phenotype space and the reverse representation [3].
Often the problem to solve in an evolutionary algorithm is an optimization problem [3]. In
such cases the name objective function is used in the problem context and the fitness
function is identical to or a simple transformation of the objective function [3].
3.4.2.3 Population
The population is a multiset of genotypes [3]. The role of the population is to hold possible
solutions [3]. The population is the unit of evolution [3]. Genotypes are static individual
objects that do not change or adapt; it is the population that does [3].
3.4.2.4 Parent Selection Mechanism
The role of the parent selection mechanism is to distinguish among individuals based on their
quality [3]. This is to allow better individuals to become parents of the next generation. An
individual is seen as a parent if it has been selected to undergo variation in order to create
offspring [3]. The parent selection mechanism together with the survivor selection
mechanism is essential for quality improvements [3]. In EC the parent selection mechanism is
usually probabilistic. Thus, high quality individuals get a higher chance of becoming parents
than those with low quality [3]. However, low quality individuals are usually given a small
chance; otherwise the entire search becomes too greedy and gets stuck in a local optimum [3].
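One common probabilistic scheme is fitness proportionate selection, in which the chance of
being chosen is proportional to fitness. The sketch below is only one possible mechanism and
is not prescribed by [3]. The function name and the bit-counting fitness function are
hypothetical.

```python
import random


def select_parents(population, fitness, k, rng=random):
    """Pick k parents with probability proportional to their fitness.

    Assumption: fitness values are non-negative and not all zero.
    """
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=k)


# Hypothetical population of bit strings; fitness is the number of 1 bits.
population = ["0001", "0111", "1111"]
parents = select_parents(population, lambda bits: bits.count("1"), k=2)
```

Low quality individuals keep a small but non-zero chance of selection as long as their
fitness is above zero, which matches the behaviour described above.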
3.4.2.5 Variation Operators
The variation operators are mutation and recombination [3]. The review for mutation and
recombination is given below.
3.4.2.6 Mutation
Mutation is an operation that is performed on one genotype and produces a slightly modified
mutant [3]. As such, mutation is a unary operator [3]. A mutation operator is usually
stochastic [3]. As such, its output, which is the child, depends on a series of random choices
[3]. It should be noted that an arbitrary unary operator is not necessarily a mutation [3].
Mutation in general is supposed to cause an unbiased random change [3]. It must be noted
that the variation operators form the evolutionary implementation of basic steps within the
search space [3]. Theorems suggesting that, given sufficient time, evolutionary algorithms (EA)
find a global optimum depend on the property that each genotype representing a
possible solution can be reached by the variation operators [3].
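For genotypes represented as bit strings, a common mutation operator flips each bit
independently with a small probability. The sketch below assumes that representation. The
function name and the default rate are illustrative.

```python
import random


def mutate(genotype, rate=0.1, rng=random):
    """Return a copy of a bit-string genotype with each bit flipped with probability rate."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in genotype
    )


mutant = mutate("01010101", rate=0.1)
```

Because every bit can flip, any genotype of the same length can be reached from any other,
which relates to the reachability property mentioned above.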
3.4.2.7 Recombination
A binary variation operator is called recombination or crossover [3]. Like mutation,
recombination is a stochastic operator [3]. The choice of which parts of each parent are
combined, and how these parts are combined, is made at random [3]. Recombination
operators with higher arity, that is, using more than two parents, are mathematically
possible and easy to implement but have no biological equivalent [3]. That is perhaps why
they are not widely used, although several studies show that they have a positive effect on
evolution [3]. The principle behind recombination is simple: by mating two
individuals with different features, we can produce offspring that combine both features [3].
Biologically, recombination is the superior form of reproduction [3].
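One-point crossover is a simple example of such a binary operator. The sketch below assumes
two bit-string parents of equal length of at least two bits. The function name is
hypothetical.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length parents at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


children = one_point_crossover("0000", "1111")
```

The cut point is restricted to the interior of the string so that each child receives
material from both parents.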
3.4.2.8 Survivor Selection Mechanism
The survivor selection mechanism is often called replacement or replacement strategy [3].
However, the term survivor selection is used here to keep the terminology consistent [3]. The
role of the survivor selection mechanism is to distinguish among individuals based on their
quality. It is similar to parent selection, but it is used at a different stage of the evolution
cycle [3].
3.4.2.9 Initialization
Initializations are kept simple in most EA applications [3]. The first population is seeded by
randomly generated individuals [3].
3.4.2.10 Termination Condition
There are two types of termination conditions [3]. The first applies when the evolutionary
computing problem has a known optimal fitness level [3]. This may come from a known
optimum of the given objective function or fitness function [3]. In such cases, the search
can be stopped when that level is reached [3]. However, EAs are
stochastic in nature and the optimum may not be reached, so the fitness condition may
never be satisfied and the algorithm may never stop [3]. This requires that the condition be
extended with one that certainly stops the algorithm [3]. Some of these extensions are
the following. A maximum allowed CPU time can be set, so that the algorithm is stopped
when this time elapses [3]. The total number of fitness evaluations can be given a limit, so
that the algorithm is stopped when this limit is reached [3]. The evolution can also be run
for a fixed number of generations [3].
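The combined termination test can be sketched as a single function. The parameter names and
default limits are hypothetical, and the sketch assumes that higher fitness is better.

```python
import time


def should_stop(best_fitness, generation, evaluations, cpu_start,
                target_fitness=None, max_generations=100,
                max_evaluations=10_000, max_cpu_seconds=5.0):
    """Return True when any termination condition holds."""
    # First type: a known optimal fitness level has been reached.
    if target_fitness is not None and best_fitness >= target_fitness:
        return True
    # Extensions that certainly stop the algorithm.
    return (
        time.process_time() - cpu_start >= max_cpu_seconds
        or evaluations >= max_evaluations
        or generation >= max_generations
    )


cpu_start = time.process_time()
finished = should_stop(best_fitness=7, generation=3, evaluations=60,
                       cpu_start=cpu_start, target_fitness=12)
```

The function would be called once per generation inside the main evolutionary loop.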
3.4.3 Genetic Algorithms
A Genetic Algorithm is a search heuristic that is inspired by Charles Darwin’s theory of natural
evolution [24]. This algorithm reflects the process of natural selection where the fittest
individuals are selected for reproduction in order to produce offspring for the next generation
[24]. There are five phases in a genetic algorithm [24]. These are:
Initial Population, Fitness Function, Selection, Crossover, and Mutation [24].
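The five phases can be seen together in a small sketch. The problem used here, maximizing
the number of 1 bits in a list, is a standard toy example chosen for illustration and is not
the disease matching problem of this paper. All names and parameter values are hypothetical.

```python
import random


def run_genetic_algorithm(length=12, population_size=20, generations=40,
                          mutation_rate=0.05, seed=0):
    """Evolve bit lists toward all ones and return the best final individual."""
    rng = random.Random(seed)
    fitness = sum  # Phase 2: fitness function (number of 1 bits).

    # Phase 1: initial population of random bit lists.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]

    for _ in range(generations):
        # Adding 1 keeps every selection weight positive.
        weights = [fitness(individual) + 1 for individual in population]
        offspring = []
        while len(offspring) < population_size:
            # Phase 3: selection, proportional to fitness.
            mother, father = rng.choices(population, weights=weights, k=2)
            # Phase 4: one-point crossover.
            point = rng.randint(1, length - 1)
            child = mother[:point] + father[point:]
            # Phase 5: bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        population = offspring

    return max(population, key=fitness)


best = run_genetic_algorithm()
```

A fixed number of generations is used as the termination condition to keep the sketch
short.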
3.4.4 Evolutionary Strategies
Evolutionary Strategies (ES) is one type of black-box optimization algorithm that belongs to
the family of evolutionary algorithms [30]. The optimization targets of Evolutionary Strategies
are vectors of real numbers [30]. It must be noted that Evolutionary Strategies are stochastic
optimization algorithms and are designed specifically for continuous function optimization
[26].
3.4.5 Genetic Programming
Genetic Programming is a domain-independent method for genetically breeding a population
of computer programs to solve a problem [28]. That is, Genetic Programming iteratively
transforms a population of computer programs into a new generation of programs by
applying analogs of naturally occurring genetic operations [28]. It must be stated that Genetic
Programming is a form of Artificial Intelligence that mimics natural selection to find optimal
results [20].
3.4.6 Evolutionary Programming
Evolutionary Programming, originally conceived by Lawrence J. Fogel in 1960, is a stochastic
optimization technique similar to Genetic Algorithms [43]. One main difference between
Evolutionary Programming and Genetic Algorithms is that Evolutionary Programming places
emphasis on the behavioural linkage between parent and offspring rather than seeking to
emulate specific genetic operators as observed in nature [43]. It is also similar to
Evolutionary Strategies, although they were developed independently [43].
3.4.7 Differential Evolution
Differential Evolution is a heuristic approach for global optimization of nonlinear and
non-differentiable continuous space functions [38]. Differential Evolution is similar to popular
direct search approaches such as genetic algorithms and evolutionary strategies [38]. It must
be stated that this algorithm is advantageous over the other mentioned approaches because
it can handle nonlinear and non-differentiable multi-dimensional objective functions, while
requiring very few control parameters to steer the minimisation [38].
3.4.8 A Survey on Wearable Sensor-Based Systems for Health
Monitoring and Prognosis
A research paper entitled “A Survey on Wearable Sensor-Based Systems for Health
Monitoring and Prognosis” describes wearable and biomedical health systems for health
monitoring and prognosis [5]. The paper explains that these wearable and biomedical health
systems have gained a lot of attention in the scientific community [5]. The paper also explains
that this is “mainly motivated by increasing healthcare costs and propelled by recent
technological advances in miniature biosensing devices, smart textiles, microelectronics, and
wireless communications, the continuous advance of wearable sensor-based systems will
potentially transform the future of healthcare by enabling proactive personal health
management and ubiquitous monitoring of a patient's health condition” [5].
The paper attempts to review the current research and developments on wearable
biosensor systems for medical monitoring. According to the paper, a variety of system
implementations are compared in an effort to identify the technological shortcomings of
the current state of the art in wearable biosensor solutions and systems [5]. The paper also
explains that “an emphasis is given to multiparameter physiological sensing system designs,
providing reliable vital signs measurements and incorporating real-time decision support for
early detection of symptoms or context awareness” [5].
3.4.9 Sensors in Medicine
According to another research paper entitled “Sensors in Medicine”, sensors are devices that
detect physical, chemical and biological signals and provide a way for those signals to be
measured and recorded [8]. The paper also explains that “physical properties that can be
sensed include temperature, pressure, vibration, sound level, light intensity, load or weight,
flow rate of gases and liquids, amplitude of magnetic and electronic fields, and concentrations
of many substances in gaseous, liquid, or solid form. Although sensors of today are where
computers were in 1970, medical applications of sensors are taking off because of advances
in microchip technologies and molecular chemistry.” [8]
3.5 Research Model and Methodology
This section of the research paper describes the research model and methodology for
developing the expert medical consultation system.
3.5.1 Research Model
The research model is used for modelling the diseases that a patient has and other diseases
the patient may develop. The model is composed of two independent variables and a dependent
variable, and it centres on the chance that a patient has an illness. The chance is
computed on a percentage scale (0 to 100) based on the symptoms and reactions the patient is
having and on other medical parameters such as the weight, height, blood pressure, and glucose
level of the patient. The other factor that affects the research model is the number of years
a patient has had symptoms that point to a particular disease. The research model is given as
Y = aX1 + bX2 + c, where a, b, and c are constants.
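As a minimal sketch, the linear form of the research model can be evaluated as shown below.
The sketch assumes, for illustration only, that X1 is the symptom-based chance on the
percentage scale and that X2 is the number of years of symptoms; the text does not assign the
symbols explicitly. The function name research_model and the constants used in the example
are hypothetical and are not values obtained in this research.

```python
def research_model(x1: float, x2: float, a: float, b: float, c: float) -> float:
    """Evaluate the linear research model Y = a*X1 + b*X2 + c.

    Assumption (not stated in the text): x1 is the chance of illness in
    percent (0 to 100) and x2 is the number of years of symptoms.
    """
    if not 0.0 <= x1 <= 100.0:
        raise ValueError("x1 is a percentage and must lie in [0, 100]")
    if x2 < 0:
        raise ValueError("x2 is a number of years and cannot be negative")
    return a * x1 + b * x2 + c


# Hypothetical constants, chosen only to make the arithmetic easy to follow:
# 0.5 * 60 + 3.0 * 2 + 1.0 = 37.0
y = research_model(x1=60.0, x2=2.0, a=0.5, b=3.0, c=1.0)
print(y)  # prints 37.0
```

In practice the constants a, b, and c would have to be estimated from experimental data
rather than chosen by hand as they are here.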
3.5.2 Research Methodology
The methodology for this research is made up of five major steps. These are assumption
enumeration, hypothesis formulation, experimentation, hypothesis testing and
demonstration. All these steps contribute to the successful conduct of the research. It is
hoped that following these steps will lead to the development of an Expert Medical
Consultation System that will assist Medical Doctors in performing medical consultations.
3.5.2.1 Assumption Enumeration
At this stage, a number of assumptions must be made about providing medical care using
medical systems that run on Android or in browsers. Assumptions must also be made about the
symptoms that can be predicted and the diseases that can be diagnosed by applying
evolutionary algorithms. Assumptions will also be made about how well the system being
developed will be secured. The assumptions below form the basis of this research, and
several hypotheses are formulated from them.
● Sensors can be used to measure temperature, height, weight, blood pressure and
blood sugar.
● The weight of a patient can be computed using an accelerometer on a hardware system
that runs on the Android operating system (OS).
● The blood pressure of a patient can be computed using a sensor that measures the
patient’s blood pressure.
● The temperature of a patient can be measured using a sensor that measures
temperature.
● The height of a patient can be measured from motion data captured by a sensor that
measures height.
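The measurements named in the assumptions above could be held in a single record before they
are passed to the consultation system. The sketch below is one possible layout, not the
data structure used in this research. The class name PatientMeasurements, the field names,
the units, and the split of blood pressure into systolic and diastolic values are all
assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PatientMeasurements:
    """One set of sensor readings for a patient (names and units are assumed)."""

    temperature_c: float      # body temperature in degrees Celsius
    height_cm: float          # height in centimetres
    weight_kg: float          # weight in kilograms
    systolic_mmhg: float      # systolic blood pressure in mmHg
    diastolic_mmhg: float     # diastolic blood pressure in mmHg
    blood_sugar_mg_dl: float  # blood sugar in mg/dL

    def problems(self) -> List[str]:
        """Return the readings that cannot be valid sensor output."""
        # Every quantity here is strictly positive for a living patient.
        bad = [name for name, value in vars(self).items() if value <= 0]
        # Systolic pressure is always higher than diastolic pressure.
        if self.diastolic_mmhg >= self.systolic_mmhg:
            bad.append("diastolic_mmhg >= systolic_mmhg")
        return bad


reading = PatientMeasurements(36.8, 170.0, 65.0, 120.0, 80.0, 95.0)
print(reading.problems())  # prints []
```

The check only rejects readings that are physically impossible; it makes no clinical
judgement about whether a value is healthy.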
3.5.2.2 Hypothesis Formulation
At this stage, the researcher must formulate hypotheses about the extent to which Android
applications and web applications can be used to provide medical care. Hypotheses will also
be formulated about the extent to which ailments can be diagnosed using Android and web
applications developed by applying evolutionary algorithms. We will also formulate
hypotheses about the extent to which the system will be secured, including how to secure
the database, web server and other components of the system. Based on the above
assumptions, the hypotheses listed below can be formulated for conducting this research.
● The first group of hypotheses is based on the extent to which a medical system can
be used to provide medical care. In this project, it is claimed that an expert medical
consultation system can be developed for mobile phones and tablets using Android
programming and for personal computers using web technologies.
● The second group of hypotheses is based on what diseases can be diagnosed using
medical systems developed for mobile phones, tablets and personal computers. This
project claims that all kinds of diseases can be diagnosed using a medical system
developed for mobile phones, tablets and personal computers.
● The third group of hypotheses is based on the medical information that can be
measured for medical care using medical equipment that has sensors for measuring
medical data. In this project, it is claimed that all medical data can be measured. These
include weight, height, temperature, blood pressure and blood sugar.
3.5.2.3 Experimentation
At this stage, the hypotheses formulated will be tested experimentally using simulation. The
simulation can be done through the application of evolutionary algorithms. The simulation
will be performed on a personal computer and an Android device. We will also experiment