Mini projects MLP & SVM
General information about SVM & MLP, executed in the MATLAB program
By Hussain ALkabi & Mohammed alrekabi
Supervisor: Dr Saeed Shaerbaf
Soft computing – mini projects (2 - 3)
ABSTRACT
Artificial neural networks have been widely used as intelligent tools in various
fields in recent years, such as artificial intelligence, pattern recognition, medical
diagnosis, machine learning and so on.
The classification of cancer is a medical application that poses a great challenge
for researchers and scientists. Recently, the neural network has become a popular
tool in the classification of cancer datasets. Classification is one of the most active
research and application areas of neural networks.
In this soft computing course we studied the classification of 699 samples, to
distinguish the samples carrying the disease from the healthy samples, using the
MATLAB program. We explain the artificial neural network (ANN) structure, its
training, testing and overtraining, and show the steps to do this in MATLAB.
We then carry out the same task using support vector machines (SVM).
SVMs are supervised learning models with associated learning algorithms that
analyze data for classification and regression analysis.
KEYWORDS
Artificial neural network, training, testing, classification of cancer, overtraining,
support vector machines (SVM)
Cancer diagnostic research is one of the major research areas in the medical field.
We will explain it using an artificial neural network (ANN).
A multilayer perceptron (MLP) is a class of feedforward artificial neural network.
An MLP consists of at least three layers of nodes. Except for the input nodes, each
node is a neuron that uses a nonlinear activation function.
MLPs are useful in research for their ability to solve problems stochastically,
which often allows approximate solutions for extremely complex problems like
fitness approximation.
MLPs are universal function approximators, as shown by Cybenko's theorem, so
they can be used to create mathematical models by regression analysis. As
classification is a particular case of regression when the response variable is
categorical, MLPs make good classifier algorithms.
Fig 1 - A multilayer perceptron structure
Cancer dataset
The data set contains 699 samples (instances). The first attribute is the ID of an
instance, and the later 9 all represent different characteristics of an instance. Each
instance has one of 2 possible classes (benign or malignant). The characteristics that
are used in the prediction process are:
Clump Thickness
Uniformity of Cell Size
Uniformity of Cell Shape
Marginal Adhesion
Single Epithelial Cell Size
Bare Nuclei
Bland Chromatin
Normal Nucleoli
Mitoses
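Assuming the Neural Network Toolbox's built-in copy of this dataset, the shapes of the two matrices can be checked directly, as a quick sketch:

```matlab
% Load the built-in breast cancer dataset (Neural Network Toolbox)
load cancer_dataset

% cancerInputs holds the 9 characteristics for each of the 699 samples;
% cancerTargets one-hot encodes the two classes (benign / malignant)
size(cancerInputs)   % 9 x 699
size(cancerTargets)  % 2 x 699
```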
Procedures of training a neural network
In order to train a neural network, there are five steps:
1- Load the cancer dataset.
2- Divide the dataset matrix into (train data, test data).
3- Create a neural network.
4- Train the network.
5- Test the network to make sure that it is trained properly.
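The five steps above can also be carried out from the command line instead of the nntool GUI; a minimal sketch (the one hidden layer of 10 neurons is chosen only for illustration):

```matlab
% 1- load the cancer dataset
load cancer_dataset

% 2- divide the dataset into training and test parts
train_in  = cancerInputs(:,1:500);
train_tgt = cancerTargets(:,1:500);
test_in   = cancerInputs(:,501:699);
test_tgt  = cancerTargets(:,501:699);

% 3- create a pattern-recognition network
net = patternnet(10);

% 4- train the network
net = train(net, train_in, train_tgt);

% 5- test the network on the held-out samples
out = net(test_in);
[~, pred]  = max(out);        % predicted class index per sample
[~, truth] = max(test_tgt);   % true class index per sample
accuracy = mean(pred == truth)
```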
1- Loading the cancer dataset.
In the MATLAB program we can load the cancer data by writing this command:
load cancer_dataset
Then press the Enter key and two matrices (cancerInputs & cancerTargets) will
appear in the workspace, as shown in fig 2.
Fig 2 cancer dataset in workspace
2- Dividing the dataset matrix into (train data, test data).
In this step we divide the cancer dataset into two groups (two matrices, one for
training the ANN and one for testing the ANN), using these commands:
input_train_data = cancerInputs(:,1:500)';
target_train_data = cancerTargets(:,1:500)';
input_test_data = cancerInputs(:,501:699)';
target_test_data = cancerTargets(:,501:699)';
3- Create a neural network
By writing the nntool command in the MATLAB program, the Neural
Network/Data Manager window shown in fig 3 will appear.
Fig 3 – Neural Network/Data Manager (nntool)
Then push the Import key to import input_train_data and target_train_data; after
that push the New key to create the neural network.
Name:
The name of the network, such as hussain
Network properties:
The network type, such as Hopfield or feed-
forward backprop
Number of layers:
The total number of layers in the network, at
least 2 layers
Properties for (layer X):
We can set the number of neurons in
each layer, such as 10 neurons in layer 1
and 20 neurons in layer 2,
and we should select the transfer function
for each layer.
At the end, push the Create key.
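The same kind of network described in the dialog (two hidden layers of 10 and 20 neurons, used here only as an example) can also be created without the GUI; a sketch:

```matlab
% feed-forward backprop network with hidden layers of 10 and 20 neurons
net = feedforwardnet([10 20]);

% select a transfer function for each hidden layer (tansig as an example)
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'tansig';

view(net)   % shows the network structure, as the nntool window does
```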
After creating the neural network, we can see its structure, depending on our
settings, as in fig 4.
Fig 4 - neural network structure
4- Train the network.
Training the neural network is the most important step because it ensures the
efficiency of the neural network. To begin, let us define training.
Training is any change in the network that comes from responding to instruction
or accepting new knowledge. There are two types of training (supervised &
unsupervised).
Supervised learning: is the machine learning task of inferring a function from
labeled training data, the training data consist of a set of training examples. In
supervised learning, each example is a pair consisting of an input object (typically
a vector) and a desired output value (also called the supervisory signal).
Unsupervised learning: is a type of machine learning algorithm used to draw
inferences from datasets consisting of input data without labeled responses. The
most common unsupervised learning method is cluster analysis, which is used for
exploratory data analysis to find hidden patterns or grouping in data. The clusters
are modeled using a measure of similarity which is defined upon metrics such as
Euclidean or probabilistic distance.
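The difference shows up in what each call is given: labeled targets for supervised learning, inputs alone for unsupervised. A sketch in MATLAB (kmeans requires the Statistics Toolbox; the 2-cluster choice mirrors the two classes here):

```matlab
load cancer_dataset

% supervised: inputs AND labeled targets are both provided
net = patternnet(10);
net = train(net, cancerInputs, cancerTargets);

% unsupervised: only the inputs are given; cluster analysis finds
% groupings on its own (here, 2 clusters by Euclidean distance;
% rows of the input matrix must be observations, hence the transpose)
idx = kmeans(cancerInputs', 2);
```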
Fig 5 – train neural network
Training data:
Inputs = train cancer input
Targets = train cancer targets
Training results
Outputs = choose name for it such as
hussain_outputs
Errors = choose name for it such as
hussain_errors
After that, push the Train Network key.
After training, another window will appear, like fig 6:
Fig 6 Fig 7
Epochs (number of training iterations) = 11
Time of training = 0.1 sec
Performance = 0.0326
Gradient = 0.0251
Validation checks = 6
Fig 7 plots the performance of the neural
network during training and shows the point
of overtraining.
Overtraining means the errors of the neural
network increase as training continues.
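The validation checks shown in fig 6 are what stop training before overtraining sets in; the data split and stopping criterion can be set explicitly, as a sketch (the ratios below are MATLAB's defaults):

```matlab
load cancer_dataset
net = patternnet(10);

% hold some samples out for validation and testing
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;

% training stops after max_fail consecutive increases of validation error
net.trainParam.max_fail = 6;

[net, tr] = train(net, cancerInputs, cancerTargets);
plotperform(tr)   % performance plot, like fig 7
```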
5- Testing the neural network
A better way to represent the results of a neural network is a confusion matrix: a
table that is often used to describe the performance of a classification model (or
"classifier") on a set of test data for which the true values are known. The confusion
matrix itself is relatively simple to understand, but the related terminology can be
confusing.
Fig 8 confusion matrix
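A confusion matrix like the one in fig 8 can be produced directly in MATLAB; a sketch, assuming a network net already trained on the first 500 samples as in the earlier steps:

```matlab
load cancer_dataset
test_in  = cancerInputs(:,501:699);
test_tgt = cancerTargets(:,501:699);

% run the held-out samples through the trained network
out = net(test_in);

% draw the confusion matrix of true targets vs. network outputs
plotconfusion(test_tgt, out)
```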
Support vector machines (SVM)
SVMs are supervised learning models with associated learning algorithms that
analyze data for classification and regression analysis.
Fig 1 SVM
SVM code in MATLAB
load cancer_dataset
train_input=cancerInputs(:,1:500);
test_input=cancerInputs(:,501:699);
train_target=cancerTargets(:,1:500);
test_target=cancerTargets(:,501:699);
train_target=train_target(1,:); % use the first target row as the class label
net=svmtrain(train_input',train_target'); % rows must be observations
out_net=svmclassify(net,test_input');
test_target=test_target(1,:);
error=test_target'-out_net;
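svmtrain and svmclassify were removed in newer MATLAB releases; a sketch of the same steps using their replacements, fitcsvm and predict:

```matlab
load cancer_dataset
train_input  = cancerInputs(:,1:500);
test_input   = cancerInputs(:,501:699);
train_target = cancerTargets(1,1:500);    % first target row as the class label
test_target  = cancerTargets(1,501:699);

% rows must be observations, so transpose the 9 x N matrices
model   = fitcsvm(train_input', train_target');
out_net = predict(model, test_input');

err = test_target' - out_net;   % nonzero entries are misclassifications
```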
Errors in SVM
After testing on the cancer dataset, we have just two errors out of the 199 test
samples. That makes the error rate about 1%.
Fig 2 errors in SVM
Fig 3 error position
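The 1% figure is simply the two misclassified samples divided by the 199 test samples; using the error vector from the SVM script above:

```matlab
% nonzero entries of the difference vector mark misclassified samples
num_errors = nnz(error);
error_rate = num_errors / 199   % 2/199 ≈ 0.0101, i.e. about 1%
```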
Comparing between MLP & SVM
Accuracy:
For the MLP the training accuracy is 96.7%, and the accuracy of the SVM is 99%.
Errors:
For the MLP the error rate is 3.3%, and the error rate of the SVM is 1%.
Performance:
The performance of the SVM is substantially higher compared to the NN, for a
three-layer (one hidden-layer) NN.
Training Time:
SVMs are much slower to train than MLPs.
What we learned in this project
- What an artificial neural network (ANN) is.
- Structure of the artificial neural network.
- Training the artificial neural network.
- Kinds of learning (supervised learning & unsupervised learning).
- Overtraining.
- Testing the artificial neural network.
- Confusion matrix.
- General information about the MATLAB program (nntool).
- Classifying the samples using an ANN.
- What support vector machines (SVM) are.
- How support vector machines work.
- How to write SVM code in MATLAB.
- Comparing between SVM & MLP.
References
Classify Patterns with a Shallow Neural Network - MATLAB & Simulink
https://www.mathworks.com/help/nnet/gs/classify-patterns-with-a-neural-network.html
Divide Data for Optimal Neural Network Training - MATLAB & Simulink
https://www.mathworks.com/help/nnet/ug/divide-data-for-optimal-neural-network-training.html
Java Neural Network Framework Neuroph
http://neuroph.sourceforge.net/index.html