This document proposes a hybrid approach using genetic algorithm, K-nearest neighbor, and probabilistic neural network for classifying MRI brain tumors. It extracts texture features using gray level co-occurrence matrix from wavelet decomposed MRI images. A genetic algorithm is then used for feature selection to identify an optimal feature subset for classification. Finally, probabilistic neural network is used to classify tumors into seven types based on the selected features, achieving accurate classification results.
A Hybrid Approach for Classification of MRI Brain Tumors Using Genetic Algorithm, K-Nearest Neighbor and Probabilistic Neural Network
Raghad Majeed Azawi
Department of Computer Science, College of Science, University of Diyala, Diyala, Iraq
raghadma2016@yahoo.com

Dhahir Abdulhade Abdulah
Department of Computer Sciences, College of Science, University of Diyala, Diyala, Iraq
Dhahir@yahoo.com

Jamal Mustafa Abbas
Department of Computer Science, College of Science, University of Diyala, Diyala, Iraq
altuwaijari@yahoo.com.uk

Ibrahim Tareq Ibrahim
College of Medicine, University of Diyala, Diyala, Iraq
Ibr_33@yahoo.com
Abstract Detection and classification of brain tumors are very important because they provide anatomical
information about normal and abnormal tissues, which helps in early treatment planning and patient
follow-up. There are a number of techniques for medical image classification. This paper proposes an image
classification technique that uses a Probabilistic Neural Network (PNN) for classification and a Genetic
Algorithm (GA) with a K-Nearest Neighbor (K-NN) classifier for feature selection. The searching capabilities
of genetic algorithms are explored for appropriate selection of features from input data and to obtain an
optimal classification. The method is implemented to classify and label brain MRI images into seven
classes (normal and six tumor types). A number of texture features (Gray Level Co-occurrence Matrix (GLCM))
can be extracted from an image, so choosing the best features to avoid poor generalization and
over-specialization is of paramount importance. The images are then classified with the PNN algorithm and
the results are compared.
Keywords - Brain tumors, MRI, Gray Level Co-occurrence Matrix (GLCM), Classification accuracy,
Genetic Algorithm (GA), K-Nearest Neighbor (K-NN) and Probabilistic Neural Network Algorithm (PNN).
I. INTRODUCTION
The human body is made of many cells, and each cell has a specific job. Cells grow within the body
and divide to produce new cells, and these divisions serve certain functions in the body. When
cells lose the ability to control their growth, they divide without limit and a
tumor forms. The brain is the central organ of the human body, responsible for coordinating and
monitoring all other body organs, so if a tumor is present in any part of the brain, the activities
controlled by that part of the nervous system are also affected. There are two types of brain tumors:
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 16, No. 5, May 2018
74 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
malignant tumors and benign tumors [1]. Many imaging techniques can be used to detect and diagnose
brain tumors early. Compared with other imaging techniques, MRI is widely used for
brain tumor identification and detection, and it does not use ionizing radiation (X-rays) [2].
II. LITERATURE REVIEW
N. D. Pergad and Kshitija V. Shingare in 2015 [4] designed a system for brain tumor extraction.
The proposed system consists of a preprocessing method for removing noise and a Gray Level
Co-occurrence Matrix (GLCM) for the feature extraction step. A Probabilistic Neural Network (PNN) is
used in the classification step to label the image as normal or abnormal. The last step is a
segmentation technique. The accuracy of this system is 88.2%.
Naveena H. S. et al. in 2015 [5] exploited the capability of ANN algorithms to classify
MRI brain tumor images as either cancerous or non-cancerous. The K-means clustering algorithm was
used for the segmentation stage. Then, the gray level co-occurrence matrix (GLCM) was used for the
feature extraction stage on the segmented image. Finally, a Backpropagation Neural Network (BPNN) and a
Probabilistic Neural Network (PNN) were used for the classification stage of brain tumors. The overall
accuracy of the system is 79.02% with the BPNN algorithm and 97.25% with the PNN
algorithm.
Ata'a A. and Dhia A. in 2016 [3] proposed a system to detect and define the tumor type in MRI brain
images. The proposed system consists of multiple phases. Step one is preprocessing of the MRI
image. Step two is transformation (a feature extraction algorithm based on two levels of 2-D
discrete wavelet (DWT) and multiwavelet (DMWT) decomposition). Step three uses statistical
measurements to extract features from the GLCM. Step four is classification
using PNN, and the final step is a proposed segmentation algorithm, the Superpixel Hexagonal
Algorithm. The testing accuracy is 91% with DWT and 97% with DMWT.
S. U. Aswathy et al. in 2017 [11] designed a system for brain tumor segmentation using a
genetic algorithm with an SVM classifier. The proposed system consists of multiple phases.
Step one is preprocessing using high-pass, low-pass and median filters. Step
two is segmentation using a combination of the expectation maximization (EM) algorithm and
the level set method. Step three is feature extraction and selection using GA. Step four is
classification of the MRI brain image as normal or abnormal using SVM. That work segments
the tumor using the Genetic Algorithm and classifies the tumor using the SVM classifier.
III. THE PROPOSED SYSTEM
In the proposed system, seven types of MRI image are considered: normal and six types of tumors
(Lymphoma, Glioblastoma multiforme, Cystic oligodendroglioma, Ependymoma, Meningioma
and Anaplastic astrocytoma). The input data set consists of 140 images (20 images for each of the six
tumor types and 20 normal images) with 8-bit depth (pixel values 0-255). The methodology of the MRI
brain image classification is as follows:
1- Preprocessing step using a median filter.
2- Feature extraction using Haar Wavelet and GLCM.
3- Feature selection by GA and K-NN.
4- Classification step using PNN algorithm.
The block diagram of the proposed system is shown in Figure 1.
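Figure 1: Block Diagram of the Proposed MRI Brain Tumor System (Input MRI Image → Pre-Processing → Feature Extraction → Feature Selection → Classification → Class Name).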
A. Preprocessing Stage
In this step, noise reduction and image enhancement techniques are applied to improve the image
quality. A median filter is used for this purpose.
Median Filter
The median filter is used to reduce the salt-and-pepper noise caused by motion artifacts (movement
of the patient during the scan) in the MRI images, and it smooths the MRI brain image. Here
a 3×3 median filter is used to eliminate salt-and-pepper noise [6]. Figure 2 shows an image before
and after applying the median filter.
Figure 2: (a) input image and (b) After Applying Median Filter.
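The 3×3 median filtering step can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function name `median_filter_3x3` and the border rule (replicating the nearest edge pixel) are assumptions, since the text does not state how image borders are handled.

```python
from statistics import median

def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2-D list of gray levels (0-255).

    Border pixels are handled by replicating the nearest edge pixel.
    That border rule is an assumption; the paper does not specify one.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            out[r][c] = median(window)  # middle value of the 9 neighbors
    return out

# Hypothetical 4x4 patch with one "salt" and one "pepper" pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
filtered = median_filter_3x3(noisy)
```

Because each 3×3 window in this patch contains at most two outliers among nine values, the median replaces both impulse pixels with the background level while leaving the uniform region unchanged.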
B. Feature Extraction Stage
Feature extraction is a challenging task, since a good feature set must be extracted for classification.
The purpose of feature extraction is to reduce the original data set by measuring features, or certain
properties, which distinguish one input pattern from another. There are different feature extraction
methods, but texture-based ones can be the most effective for classifying medical images. Among the
several texture-based feature extraction methods, the Gray Level Co-occurrence Matrix (GLCM) is
very common and successful [7]. In the proposed method, a one-level Discrete Wavelet Transform (Haar
wavelet) is first used to decompose the input image into four sub-images, and the GLCM method is then
applied to each sub-image.
1. Discrete Wavelet Transform
The discrete wavelet transform is equivalent to a hierarchical sub-band system in which the sub-bands are
logarithmically spaced in frequency and represent an octave-band decomposition. Applying the DWT at
level one decomposes (i.e., divides) the image into four sub-bands. Figure 3 shows the
critically sub-sampled DWT [7]. As a result, there are four sub-band images (LL, LH, HL and HH)
at each scale. All four sub-bands of this one-level decomposition are then used for
GLCM-based feature extraction.
(a) Original image. (b) One level.
Figure 3: Image Decomposition.
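A one-level 2-D Haar decomposition can be sketched as below. This is an illustrative version under stated assumptions: the function name `haar_dwt2_level1` is hypothetical, orthonormal scaling (a factor of 1/2 per 2×2 block) is assumed, and the LH/HL labels follow one common convention that differs between references.

```python
def haar_dwt2_level1(image):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns the four sub-bands (LL, LH, HL, HH), each half the size of the
    input in both directions. Orthonormal scaling is an assumption.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top pixel pair
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom pixel pair
            ll_row.append((a + b + d + e) / 2)  # approximation (local average)
            lh_row.append((a + b - d - e) / 2)  # difference between the two rows
            hl_row.append((a - b + d - e) / 2)  # difference between the two columns
            hh_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh

# Hypothetical 4x4 image made of four flat 2x2 blocks.
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll, lh, hl, hh = haar_dwt2_level1(image)
```

For this flat-block image all detail sub-bands are zero and the LL band carries the whole signal energy, which is the behavior expected of an orthonormal transform.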
2. Gray Level Co-occurrence Matrix (GLCM)
GLCM is used for feature extraction from the MRI brain image. Features based on a pixel and
its neighboring pixels are extracted from the image: a GLCM is formed that contains textural information
based on pairs of pixel intensity values. Features based on a pixel and its neighboring pixel are
extracted from the GLCM(i, j) matrix. GLCM is a two-dimensional function, composed of n pixels in the
horizontal direction and m pixels in the vertical direction. The horizontal and vertical coordinates of
the image are given by i and j, with 0 ≤ i ≤ n and 0 ≤ j ≤ m, where the total number of pixels is m×n.
First, the intensity of each pixel and its neighboring pixel is calculated for the entire image. To obtain
more reliable texture features, multiple GLCMs are computed for different directions (0°, 45°, 90° and
135°), which capture the spatial relationship between neighboring pixels [8]. This method reduces the
computational complexity. After the GLCMs of the four sub-band images are calculated, they are used to
compute features that uniquely describe the images.
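As a rough sketch of this step, the code below builds a normalized, symmetric GLCM for one of the four directions and derives four of the texture features named in this paper (energy, entropy, contrast, homogeneity) using their common Haralick-style definitions. The names `glcm` and `texture_features`, the pixel distance of 1, the symmetric counting, and the base-2 logarithm are assumptions. Real sub-band coefficients would also have to be quantized to integer gray levels first, which is not shown.

```python
import math

# Pixel offsets (row step, column step) for the four directions, distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, angle, levels):
    """Normalized, symmetric GLCM of a 2-D list of integer gray levels."""
    dr, dc = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                i, j = image[r][c], image[rr][cc]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Four common texture features of a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),  # angular second moment
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Hypothetical 4x4 image with four gray levels (0..3).
sample = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features_by_angle = {a: texture_features(glcm(sample, a, 4)) for a in OFFSETS}
```

In the pipeline described here, this computation would be repeated for each of the four wavelet sub-bands and each direction.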
C. Feature Selection Stage
Medical images are high-volume in nature. If the data set contains redundant and irrelevant
attributes, classification may produce a less accurate result. The genetic algorithm can deal with large
search spaces efficiently and hence is less likely than other algorithms to get stuck in a locally
optimal solution. Our proposed algorithm consists of two parts:
1- The first part evaluates feature subsets (chromosomes) using a genetic algorithm (GA).
2- The second part builds the K-NN classifier and measures its accuracy.
In the proposed system, the K-NN classifier is run with a different value of k each time, starting from
k = 1 and going up to k = the square root of the size of the training set. The multi-classifier system
then uses the majority rule to identify the class, i.e. the class with the highest number of votes
(from 1-NN, 3-NN, 5-NN, …, √n-NN) is chosen.
1. K-Nearest Neighbor
This approach is one of the simplest and oldest methods used for pattern classification. It often yields
efficient performance and, in certain cases, its accuracy exceeds that of state-of-the-art classifiers. The
KNN classifier categorizes an unlabeled test example using the label of the majority of examples
among its k nearest (most similar) neighbors in the training set. The similarity depends on a specific
distance metric; therefore, the performance of the classifier depends significantly on the distance metric
used. Here the Euclidean distance between a test sample x and the samples of the training set is used. For
an N-dimensional space, the Euclidean distance between any two samples or vectors $x_1$ and $x_2$ is given
in (28) [14].
$$D(x_1, x_2) = \sqrt{\sum_{n=1}^{N} (x_{1n} - x_{2n})^2} \tag{28}$$
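The multi-classifier K-NN scheme described in this section (odd k from 1 up to the square root of the training set size, combined by majority vote, with Euclidean distance as the metric) might look like the sketch below. The function names, the toy data and the tie-breaking rule are assumptions introduced for illustration only.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_x, train_y, query, k):
    """Label of the majority among the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tie, Counter keeps insertion order, so the label seen first
    # (the nearer neighbor) wins. This tie rule is an assumption.
    return votes.most_common(1)[0][0]

def ensemble_knn_predict(train_x, train_y, query):
    """Majority vote over 1-NN, 3-NN, 5-NN, ... up to k = sqrt(n)."""
    k_max = max(1, math.isqrt(len(train_x)))
    predictions = [
        knn_predict(train_x, train_y, query, k) for k in range(1, k_max + 1, 2)
    ]
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical two-class training set with 9 samples (so k takes values 1 and 3).
train_x = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5],
]
train_y = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

label_near_a = ensemble_knn_predict(train_x, train_y, [0.4, 0.6])
label_near_b = ensemble_knn_predict(train_x, train_y, [5.2, 5.1])
```

Voting over several values of k avoids committing to a single k in advance, at the cost of running the neighbor search once per value.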
2. Genetic Algorithm
A genetic algorithm is a general adaptive optimization search methodology based on a direct analogy to
Darwinian natural selection and genetics in biological systems. A GA works with a set of candidate
solutions called a population. Based on the Darwinian principle of ‘survival of the fittest’, the GA
searches for an optimal solution through a series of iterative computations. It generates successive
populations of alternative solutions, each represented by a chromosome (i.e. a solution to the problem),
until acceptable results are obtained. A fitness function assesses the quality of a solution in the
evaluation step, as defined by formula (29).
$$\text{Fitness} = W_A \times \text{KNN\_accuracy} + \frac{W_{nb}}{N} \tag{29}$$
where $W_A$ is the weight of accuracy, which can be set between 0.75 and 1, and $W_{nb}$ is the weight for
the number N of features that participate in classification, where N ≠ 0.
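A small worked example of formula (29) is given below. The weight values are assumptions chosen for illustration ($W_A$ = 0.8 lies inside the stated 0.75 to 1 range; $W_{nb}$ = 0.2 is not given in the text), and the accuracy is taken as a fraction between 0 and 1.

```python
def fitness(knn_accuracy, n_features, w_a=0.8, w_nb=0.2):
    """Fitness = W_A * KNN_accuracy + W_nb / N, as in formula (29).

    w_a and w_nb are illustrative defaults, not values from the paper.
    """
    if n_features == 0:
        raise ValueError("a chromosome must select at least one feature (N != 0)")
    return w_a * knn_accuracy + w_nb / n_features

# Same accuracy, different subset sizes: the smaller subset scores higher.
fit_10_features = fitness(0.95, 10)
fit_20_features = fitness(0.95, 20)
```

With equal accuracy, the second term rewards the chromosome that uses fewer features, which is how the fitness pushes the search toward compact feature subsets.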
The crossover and mutation functions are the main operators that randomly impact the fitness value.
Chromosomes are selected for reproduction by evaluating their fitness values: fitter chromosomes
have a higher probability of being selected into the recombination pool using the roulette wheel method.
Crossover is a random mechanism for exchanging genes between two chromosomes, here using one-point
crossover. In mutation, genes may occasionally be altered, i.e. in a binary-coded chromosome a gene
changes from 0 to 1 or vice versa. Offspring replace the old population using the elitism or
diversity replacement strategy and form the new population of the next generation. Figure 4 illustrates
the genetic operators of crossover and mutation [11]. In the experiments,
10 features (energy, entropy, contrast, variance, sum entropy, difference entropy, homogeneity, cluster
prominence, cluster shade, and dissimilarity) were selected from the set of 20 features, and the
population size P was varied (50, 100 and 500).
Figure 4: Genetic crossover and mutation operation.
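The three operators named in this section (roulette wheel selection, one-point crossover and bit-flip mutation) can be sketched on binary chromosomes as follows. The function names, the 20-gene chromosome length (one gene per candidate feature) and the explicit random generator argument are assumptions made for the example.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return chromosome
    return population[-1]  # guard against floating-point round-off

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    """Flip each gene (0 <-> 1) independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)      # fixed seed so the sketch is repeatable
parent_a = [0] * 20         # hypothetical chromosome: no feature selected
parent_b = [1] * 20         # hypothetical chromosome: all 20 features selected
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
mutated = mutate(child_a, 0.05, rng)
```

Each gene set to 1 marks a feature that is kept, so a chromosome directly encodes a candidate feature subset.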
3. Proposed Algorithm
Step (1): Input the patterns of the wavelet transform (M).
Step (2): Apply genetic search to generate the random population (Gi).
Step (3): Compute the transformed patterns (N) by applying the following equation:
$$N = M \times G_i \tag{30}$$
Step (4): Calculate the accuracy of the K-NN classifier and return it to the GA using the following equation:
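$$\text{Accuracy} = \frac{\text{number of correctly classified samples}}{\text{total number of samples}} \times 100\% \tag{31}$$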
Step (5): Calculate the fitness value of the population by applying function (29).
Step (6): Select the feature subsets with higher fitness.
Step (7): Perform crossover between the fittest individuals.
Step (8): Perform mutation on the fittest individuals.
Step (9): Create the new population.
Step (10): If the final generation has not been reached, calculate the fitness value of the new population and repeat the process.
Step (11): End.
Figure 5 illustrates classification accuracy by using a GA-based features extractor.
Figure 5: Classification Accuracy Using a GA-Based Features Extractor. (Flow: input patterns of the wavelet transform M → genetic algorithm population of chromosomes G1 … Gn → transformed patterns N = M × Gi → K-NN classifier → fitness function → parent selection → crossover → mutation → new generation; the accuracy of the classifier on the patterns transformed by Gi is fed back to the GA.)
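Steps (3) and (4) of the algorithm can be illustrated with the sketch below. It reads N = M × Gi as keeping the feature columns whose gene is 1, which is an interpretation rather than something the text states (zeroing the unselected columns instead would give the same Euclidean distances). The leave-one-out 1-NN evaluation, the function names and the toy data are also assumptions.

```python
import math

def apply_mask(patterns, chromosome):
    """N = M x Gi: keep only the feature columns whose gene is 1."""
    return [[v for v, g in zip(row, chromosome) if g] for row in patterns]

def loo_1nn_accuracy(patterns, labels):
    """Leave-one-out accuracy (in percent) of a 1-NN classifier."""
    correct = 0
    for i, query in enumerate(patterns):
        nearest = min(
            (j for j in range(len(patterns)) if j != i),
            key=lambda j: math.dist(patterns[j], query),
        )
        correct += labels[nearest] == labels[i]
    return 100.0 * correct / len(patterns)

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
M = [
    [0.0, 9.0], [0.1, 1.0], [0.2, 5.0],
    [1.0, 8.8], [1.1, 1.2], [1.2, 5.1],
]
labels = ["a", "a", "a", "b", "b", "b"]

acc_first_only = loo_1nn_accuracy(apply_mask(M, [1, 0]), labels)
acc_both = loo_1nn_accuracy(apply_mask(M, [1, 1]), labels)
```

In this toy case the chromosome that drops the noisy feature scores far higher than the one that keeps both, which is the effect the GA search relies on. An all-zero chromosome selects no features and should be rejected before evaluation, in line with the N ≠ 0 condition of formula (29).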
D. Classification Stage by Probabilistic Neural Network (PNN)
PNN is a supervised feed-forward neural network algorithm derived from the Bayes classifier. It uses a
probability density function (pdf) for each class of training samples in the classification. As the
number of training samples increases, the estimate approaches the true density function of the class. The
purpose of the probabilistic neural network (PNN) is classification. In this stage, the test MRI brain
image is compared with the training MRI brain images, and the output is the training MRI image that is
most similar to the test image. The pdf is given by equation (32).
$$f_k(X) = \frac{1}{(2\pi)^{p/2}\,\sigma^{p}} \cdot \frac{1}{m} \sum_{i=1}^{m} \exp\left(-\frac{(X - X_{ki})^{T}(X - X_{ki})}{2\sigma^{2}}\right) \tag{32}$$
where $p$ denotes the dimension of the pattern vector $X$, $i$ is the pattern number, $m$ denotes the total
number of samples in the class, $X_{ki}$ is the vector of the i-th training pattern from class $k$, and $T$
denotes the vector transpose. The smoothing parameter σ is set as "σj = STD(Xi)", where Xi is a vector in
the training data and j is the class number. The PNN algorithm
consists of three layers. The input layer is the first layer, which distributes the training
input patterns. The number of neurons or nodes in the input layer is equal to the number of input
vectors or variables. The second layer is the pattern layer or hidden layer: each input vector in the
training set has a processing element, and each element in the pattern layer is trained once. The third
layer is the output layer; an equal number of processing elements should be used for each output class,
otherwise the network will generate poor results. When an input vector matches a training vector, the
corresponding element generates a high output value. Figure 6 illustrates the architecture of the
Probabilistic Neural Network [12].
Figure 6: Architecture of Probabilistic Neural Network.
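A compact sketch of PNN classification based on the Gaussian Parzen estimate of equation (32) is given below. It is not the authors' implementation: the function names are hypothetical, equal class priors are assumed, and a single fixed σ is used for all classes instead of the per-class standard deviation mentioned in the text.

```python
import math

def pnn_class_score(x, class_samples, sigma):
    """Parzen estimate of one class density at x, as in equation (32)."""
    p = len(x)                                         # dimension of the pattern vector
    norm = (2 * math.pi) ** (p / 2) * sigma ** p       # Gaussian normalizing constant
    total = sum(
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2 * sigma ** 2))
        for s in class_samples                         # one pattern unit per training vector
    )
    return total / (norm * len(class_samples))         # average over the m class samples

def pnn_classify(x, training, sigma=1.0):
    """Return the label with the highest class density, plus all scores.

    `training` maps each class label to its list of feature vectors.
    """
    scores = {
        label: pnn_class_score(x, samples, sigma)
        for label, samples in training.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical two-class training set with two features per image.
training = {
    "normal": [[0.0, 0.0], [0.2, 0.1]],
    "tumor": [[3.0, 3.0], [3.2, 2.9]],
}
label_a, scores_a = pnn_classify([0.1, 0.1], training)
label_b, scores_b = pnn_classify([3.1, 3.0], training)
```

The pattern layer corresponds to the exponential terms (one per training vector), and the output layer corresponds to the per-class averages that are compared in `pnn_classify`. No iterative training is needed because the training vectors themselves act as the network weights.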
IV. RESULTS AND DISCUSSION
This section presents the classification results of two systems and the comparison between them. In this
paper, an automatic brain tumor classifier was proposed. The proposed technique was implemented on an
MRI dataset covering normal images and six tumor types (Lymphoma, Cystic oligodendroglioma, Glioblastoma
multiforme, Meningioma, Ependymoma and Anaplastic astrocytoma). The number of collected images is 140.
The algorithm described in this paper was developed and successfully trained in Visual Basic.Net 2013
using a combination of image processing and neural network toolbox. The remaining 70 MRI brain images
from the different types were used as testing data. In the best case, all 70 test images are
classified correctly. The first system classifies the brain MRI images with 20 GLCM
features (without a genetic algorithm); its testing classification rate is 92.85%. The second system
(the proposed system) classifies the brain MRI images with 10 GLCM features selected using the genetic
algorithm and K-NN. The classification rates for the four cases (direction = 0°, 45°, 90° and 135°) are
98.57%, 100%, 97.14% and 98.57%, respectively. The maximum testing classification rate is 100%, in the
case of direction = 45°, so the proposed system with the hybrid approach (Genetic Algorithm and K-NN
classifier) is better than the first system (without Genetic Algorithm). Figure 7 illustrates a flowchart
of the classification rate of the first system and the proposed system.
Figure 7: Flow Chart of the Classification Rate of the First System and the Proposed System.
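As a quick arithmetic cross-check, the reported rates can be converted back into counts of correctly classified images, assuming every rate was measured on the 70 test images. The counts below are derived from the reported percentages and are not stated in the paper.

```python
TEST_IMAGES = 70  # size of the test set stated in the text

# Classification rates (percent) as reported in the text.
reported_rates = {
    "20 features, no GA": 92.85,
    "GA + K-NN, 0 deg": 98.57,
    "GA + K-NN, 45 deg": 100.0,
    "GA + K-NN, 90 deg": 97.14,
    "GA + K-NN, 135 deg": 98.57,
}

def implied_correct(rate_percent, total=TEST_IMAGES):
    """Nearest whole number of correctly classified images for a given rate."""
    return round(rate_percent / 100 * total)

implied = {name: implied_correct(rate) for name, rate in reported_rates.items()}
for name, count in implied.items():
    print(f"{name}: {count}/{TEST_IMAGES} = {100 * count / TEST_IMAGES:.2f}%")
```

Under that assumption the first system corresponds to 65 of 70 images (65/70 is about 92.857%, which the text gives as 92.85%), and the proposed system to 69, 70, 68 and 69 of 70 for the four directions, so the best case gains five test images over the baseline.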
The optimal solution of the proposed system is achieved with a population size of 100 (in the GA), and the
classification results for the four bands are close to one another (L1L1 = 22%, L1H1 = 26%, H1L1 = 28% and
H1H1 = 24%), as shown in Figure 8. The optimal solution is achieved with k = 7 in the K-NN classifier.
Figure 8: The Classification Rate for Four Bands
V. CONCLUSION
In this work, the new method is a combination of the Discrete Wavelet Transform (Haar wavelet), Genetic
Algorithm, K-NN and Probabilistic Neural Network. Using this algorithm, an efficient brain tumor
classification method has been constructed, with a maximum classification rate of 100%. This method
could support accurate classification in brain tumor diagnosis.
VI. REFERENCES
[1] Lashkari A., "A Neural Network-Based Method for Brain Abnormality Detection in MR
Images Using Zernike Moments and Geometric Moments", International Journal of Computer
Applications, Vol. 4, No. 7, pp. 1-8, July 2010.
[2] Abdullah N., Ngah U., and Aziz S., "Image Classification of Brain MRI Using Support Vector
Machine", Imaging Systems and Techniques (IST), IEEE International Conference, pp. 242-247, May
2011.
[3] Ata'a A.H and Dhia A., "Classification Human Brain Images and Detection Suspicious Abnormal
Area", IOSR Journal of Computer Engineering, Volume 18, May-Jun, 2016.
[4] N. Pergad and K. Shingare, "Brain MRI Image Classification Using Probabilistic Neural Network
and Tumor Detection Using Image Segmentation", International Journal of Advanced Research in
Computer Engineering & Technology (IJARCET), Vol. 4, Issue 6, 2015.
[5] H. Naveena, K. Shreedhara and M. Rafi, "Detection and Classification of Brain Tumor using BPN
and PNN Artificial Neural Network Algorithms", International Journal of Computer Science and
Mobile Computing, Vol. 4, Issue 4, India, 2015.
[6] M. Sudharson, S.R. Thangadurai Rajapandiyan, and P.U. Ilavarasi, "Brain Tumor Detection by
Image Processing Using MATLAB", Middle-East Journal of Scientific Research 24 (S1): 143-148,
2016.
[7] Ali Sh., Mohammad R. A. and Mohammad H. N. Sh. "A CAD System for Automatic Classification
of Brain Strokes in CT Images", International Journal of Mechatronics, Electrical and Computer
Technology, ISSN: 2305-0543, pp. 67-85, Vol. 4(10), Jan 2014.
[8] Dipanshu, N. Masalkar, and Shitole, A.S, "Advance Method for Brain Tumor Classification",
International Journal on Recent and Innovation Trends in Computing and Communication vol.2, 2014.
[9] Cheng L. and Chieh J., "A GA-based feature selection and parameters optimization for support
vector machines", ELSEVIER, Expert Systems with Applications 31, 2006.
[10] Ahmad B., Mohammad A., and Ahmad A., "Solving the Problem of the K Parameter in the KNN
Classifier Using Ensemble Learning Approach", (IJCSIS) International Journal of Computer Science
and Information Security, Vol. 12, No. 8, August 2014.
[11] S.U Aswathy, G.Glan Devadhas and S.S.Kumar, "MRI Brain Tumor Segmentation Using Genetic
Algorithm With SVM Classifier", Journal of Electronics and Communication Engineering, e-ISSN:
2278-2834, p-ISSN: 2278-8735 PP 22-26, 2017.
[12] Swapnali S. and Dimple C., "Classification of Brain Tumor Using Discrete Wavelet Transform,
Principal Component Analysis, and Probabilistic Neural Network", International Journal for Research
in Emerging Science and Technology, Volume.1, Issue.6, pp.13-19, November 2014.