Gene expression microarray classification using PCA–BEL
Ehsan Lotfi a, Azita Keshavarz b,⁎
a Department of Computer Engineering, Torbat-e-Jam Branch, Islamic Azad University, Torbat-e-Jam, Iran
b Department of Psychology, Torbat-e-Jam Branch, Islamic Azad University, Torbat-e-Jam, Iran
Article info
Article history:
Received 21 February 2014
Accepted 16 September 2014
Keywords:
Amygdala
BEL
Emotional neural network
Cancer
BELBIC
Diagnosis
Diagnostic method
Abstract
In this paper, a novel hybrid method is proposed based on Principal Component Analysis (PCA) and the Brain
Emotional Learning (BEL) network for the classification of gene-expression microarray data. The BEL
network is a computational neural model of the emotional brain that simulates its neuropsychological
features. The distinctive feature of BEL is its low computational complexity, which makes it suitable for
high-dimensional feature vector classification; BEL can therefore be adopted in pattern recognition to
overcome the curse of dimensionality. In the experimental studies, the proposed model is
applied to the classification of the small round blue cell tumors (SRBCTs), high grade gliomas
(HGG), lung, colon and breast cancer datasets. Based on 5-fold cross validation,
PCA–BEL provides average accuracies of 100%, 96%, 98.32%, 87.40% and 88% on these datasets,
respectively, and can therefore be effectively used in gene-expression microarray classification tasks.
© 2014 Elsevier Ltd. All rights reserved.
1. Introduction
Every cell in our body contains a number of genes that specify
the unique features of different types of cells. The gene expression
of cells can be obtained by DNA microarray technology which is
capable of showing simultaneous expressions of tens of thousands
of genes. This technology is widely used to distinguish between
normal and cancerous tissue samples and support clinical cancer
diagnosis [27]. There are certain challenges facing classification of
gene expression in cancer diagnosis. The main challenge is the
huge number of genes compared to the small number of available
training samples [47]. Microarray training sets are typically
gathered from fewer than one hundred patients, while each sample
usually contains thousands of genes. Furthermore, microarray data contain an
abundance of redundancy, missing values [7] and noise due to
biological and technical factors [25,75]. In the literature, there are
two general approaches to these issues including feature selection
and feature extraction. A feature selection method selects a feature
subset from the original feature space and provides the marker
and causal genes [9,4,1] which are able to identify cancers quickly
and easily. Feature extraction methods, in contrast, normally
transform the original data to other spaces to generate a new set of
features with high information-packing properties. In either
approach, the reduced features are then fed to a
proper classifier for diagnosis. A proper classifier increases the
accuracy of detection and can influence the feature reduction step.
This paper aims to review these approaches, investigate the
recently developed methodology and propose a proper feature
reduction-classification method for cancer detection. The organi-
zation of the paper is as follows: feature selection methods are
reviewed in Section 1.1. Section 1.2 explains the feature extraction
methods and Section 2 offers the proposed method. Experimental
results on cancer classification are evaluated in Section 3. Finally,
conclusions are made in Section 4.
1.1. Feature selection methods
Researchers have developed various feature selection methods
for classification. Feature selection methods are categorized into
three techniques including the filter model [62], wrapper model
and embedded model [19]. The filter model considers feature
selection and classifier's learning as two separate steps and utilizes
the general characteristics of training data to select features. The
filter model includes both traditional methods which often eval-
uate genes separately and new methods which consider gene-to-
gene correlation. These methods rank the genes and select top
ranked genes as input features for the learning step. The gene
ranking methods need a threshold for the number of genes to be
selected. For example Golub et al. [20] proposed the selection of
the top 50 genes. Additionally the filter model needs a criterion to
rank the genes. Liu et al. [35] and Golub et al. [20] have
investigated some filter methods based on statistical tests and
information gain. Examples of the filter criterion include Pearson
correlation coefficient method [84], t-statistics method [2] and
Computers in Biology and Medicine 54 (2014) 180–187. http://dx.doi.org/10.1016/j.compbiomed.2014.09.008
⁎ Corresponding author. E-mail address: esilotf@gmail.com (E. Lotfi).
signal-to-noise ratio method [20]. The time complexity of these
methods is O(N), where N is the dimensionality. They are efficient
but cannot remove redundant genes, an issue studied in
recent literature [83,78,14,26,37].
In the wrapper model, a subset is selected and then the
accuracy of a predetermined learning algorithm is estimated to
determine the suitability of the selected subset. In the wrapper
model of Xiong et al. [83], the selected subsets are learned by three
algorithms: linear discriminant analysis, logistic
regression and support vector machines. These classifiers must
be run for every subset of genes selected from the search space,
a procedure with high computational complexity. Like the
wrapper methods, in the embedded models, the genes are selected
as part of the specific learning method but with lower computa-
tional complexity [19]. The subset selection methods of wrapper
model can be categorized into the population-based methods
[71,34,53] and backward selection methods. Recently Lee and
Leu [34], and Tong and Schierz [69] shed light on the effectiveness
of the hybrid model in feature selection. The elements of a hybrid
method include Neural Network (NN), Fuzzy System, Genetic
Algorithm (GA; [76,23]) and Ant Colony [79]. Lee and Leu [34]
examined the GA's ability in the feature selection. Furthermore,
the abilities of fuzzy theories have been successfully applied by
many researchers [12,72,10]. Tong and Schierz [69] used a genetic
algorithm-Neural Network approach (GANN) as a wrapper model.
The feature subset extraction is performed by GA and then the
extracted subset is applied to learn the NN. These processes are
repeated until the best subset is determined. Given the high
dimensionality of the data, the GA appears to be a suitable strategy
for feature selection.
1.2. Feature extraction methods
In the literature, there are two well-known methods for feature
extraction including principal component analysis (PCA; [78]) and
linear discriminant analysis (LDA; [48]). Both transform
the original feature space to a lower-dimensional one.
PCA transforms the original data to a set of
reduced features that best approximate the original data. In the first
step, PCA calculates the data covariance matrix and then finds the
eigenvalues and the eigenvectors of the matrix. Finally it goes
through a dimensionality reduction step. According to the final step,
the only terms corresponding to the K largest eigenvalues are kept.
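These three steps can be sketched in a few lines of numpy (a minimal illustration; the function name `pca_reduce` and the toy data are ours, and for microarray data where the number of genes far exceeds the number of samples an SVD-based implementation is preferable):

```python
import numpy as np

def pca_reduce(X, k):
    """Reduce an (m x s) data matrix X to its first k principal components:
    center the data, compute the covariance matrix, eigendecompose it,
    and keep the eigenvectors of the k largest eigenvalues."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # (s x s) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]        # sort eigenvalues descending
    W = eigvecs[:, order[:k]]                # keep the top-k eigenvectors
    return Xc @ W                            # (m x k) reduced features

# Example: 10 samples with 50 "genes" reduced to 3 components
X = np.random.default_rng(0).normal(size=(10, 50))
P = pca_reduce(X, 3)
print(P.shape)  # (10, 3)
```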
In contrast to PCA, LDA first calculates the scatter matrices:
a within-class scatter matrix for each class and the
between-class scatter matrix. The within-class scatter matrix
measures the scatter around the respective class mean, and the
between-class scatter matrix measures the scatter of class means
around the mixture mean. LDA then transforms the data so as to
maximize the between-class scatter and minimize the within-class
scatter, so the dimension is reduced and the class separability is maximized.
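The two scatter matrices described above can be computed as follows (a minimal numpy sketch; the function name `lda_scatter` and the closing note reflect the standard formulation, not code from the paper):

```python
import numpy as np

def lda_scatter(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    mu = X.mean(axis=0)                       # mixture (overall) mean
    s = X.shape[1]
    Sw = np.zeros((s, s))
    Sb = np.zeros((s, s))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                  # class mean
        Sw += (Xc - mc).T @ (Xc - mc)         # scatter around the class mean
        d = (mc - mu).reshape(-1, 1)
        Sb += len(Xc) * (d @ d.T)             # class means around the mixture mean
    return Sw, Sb

# The LDA directions are then the leading eigenvectors of pinv(Sw) @ Sb.
```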
The feature extraction/selection method is the first step in gene
expression microarray classification and cancer detection. The
second step consists of a classifier learning the reduced features.
In the literature, various classifiers have been investigated in order
to find the best classifier. Various types of
NN [29,36,57,6,56,68,74,81,69,16], k nearest neighbors [61,13],
k-means algorithms [32], Fuzzy c-means algorithm [11], Bayesian
networks [4], vector quantization based classifier [59], manifold
methods [18,80], fuzzy approaches [54,58,30,60], complementary learn-
ing fuzzy neural network [64–67], ensemble learning [55,8,27,50],
logistic regression, support vector machines [22,5,82,73,63,46,70],
LSVM [44], wavelet transform [28] as well as radial basis-support
vector machines [51] have been investigated successfully in classi-
fication and cancer detection. But the recently developed classifiers
such as brain emotional learning (BEL) networks [42] have not been
examined in this field.
BEL networks are recently developed methodologies that use
simulated emotions to aid their learning process. BEL is motivated
by the neurophysiological knowledge of the human's emotional
brain. In contrast to the published models, the distinctive features of
the BEL are low computational complexity and fast training which
make it suitable for high dimensional feature vector classification.
In this paper, BEL is developed and examined for gene expression
microarray classification tasks. It is expected that a model with low
computational complexity can be more successful in solving the
challenges of high dimensional microarray classification.
2. Proposed PCA–BEL to microarray data classification
Fig. 1 shows the general view of the proposed method and the
final algorithm is presented in Fig. 2. What distinguishes the
proposed framework from published diagnostic methods is
the application of the BEL model to cancer classification. There are
various versions of BEL, including basic BEL [3], BELBIC (BEL based
intelligent controller; [45]), BELPR (BEL based pattern recognizer;
[39]), BELPIC (BEL based picture recognizer; [43]) and supervised
BEL [38,40–42]. They are learning algorithms of emotional neural
networks [42]. These models are inspired by the emotional brain.
The description of the relationship between the main components
of emotional brain is common among all these models. What
differs from one model to another is how they formulate the
reward signal in the learning process. For example in the model
presented by Balkenius and Morén [3], it is not clarified how the
reward is assigned. In the BELBIC, the reward signal is defined
explicitly and the formulization of other equations is formed
accordingly. However, the supervised BEL employs the target value
of input pattern instead of the reward signal in the learning phase.
So supervised BEL is model free and can be utilized in different
applications; here, this version is developed for the gene expression
microarray classification task. The computational
complexity of BEL is very low [39–42]: it is O(n), which makes it
suitable for high dimensional feature vector classification.
Fig. 1. General view of proposed method.
BEL [42] is inspired by the interactions of thalamus, amygdala
(AMYG) [15,17,21,24,31,33,77], orbitofrontal cortex (OFC) and
sensory cortex in the emotional brain [42].
The first step is PCA dimension reduction
(Fig. 1). The first k principal components p1, p2,…, pk
are the outputs of the first step and the inputs of the second step.
In the second step, this pattern is normalized to [0, 1]. The
normalized principal components are the outputs
of the second step and the inputs of the third step. Fig. 2 illustrates
the details of the proposed method. The input pattern of BEL is
the vector p1, p2,…, pk and E is the final output. The
model consists of two main subsystems, the AMYG and the
OFC. The AMYG receives p1, p2,…, pk
from the sensory cortex and pk+1 from the thalamus. The OFC
receives only p1, p2,…, pk from the sensory cortex. The thalamic
input pk+1 is calculated as:
pk+1 = max(p1, …, pk)   (1)
The weight vk+1 belongs to the AMYG and wk+1 to the OFC.
Ea is the internal output of the AMYG, used to
adjust the plastic connection weights v1, v2,…, vk+1 (Eq. (6)).
Eo is the output of the OFC, used to inhibit the AMYG output.
This inhibition is implemented by subtracting Eo from Ea
(Eq. (5)). As the corrected AMYG response, E is the final output
Fig. 2. The flowchart of proposed method in learning step.
node. It is evaluated by the monotonically increasing activation
function tansig and used to adjust the OFC connection weights
w1, w2,…, wk+1 (Eq. (7)). The activation function is:
tansig(x) = 2 / (1 + e^(−2x)) − 1   (2)
The AMYG output, the OFC output and the final output are
calculated by the following formulas, respectively:
Ea = Σ_{j=1…k+1} (vj × pj) + ba   (3)
Eo = Σ_{j=1…k} (wj × pj) + bo   (4)
E = tansig(Ea − Eo)   (5)
Let t be the target value associated with the nth pattern p. The t
should be binary encoded. The supervised learning rules are as follows:
vj = vj + lr × max(t − Ea, 0) × pj   for j = 1…k+1   (6)
wj = wj + lr × (Ea − Eo − t) × pj   for j = 1…k+1   (7)
ba = ba + lr × max(t − Ea, 0)   (8)
bo = bo + lr × (Ea − Eo − t)   (9)
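A minimal single-output implementation of these equations can be sketched as follows (our illustrative code, not the authors' released implementation; variable names follow the paper, the toy patterns are hypothetical, and, as in Eqs. (4) and (7), Eo sums only the first k OFC weights while the update touches all k+1):

```python
import numpy as np

def tansig(x):
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0           # Eq. (2)

def bel_train(P, T, lr=0.05, epochs=2000, seed=0):
    """Single-output supervised BEL following Eqs. (1) and (3)-(9).
    P: (m, k) patterns normalized to [0, 1]; T: (m,) binary targets."""
    m, k = P.shape
    rng = np.random.default_rng(seed)
    v = rng.uniform(-0.01, 0.01, k + 1)   # AMYG weights (incl. thalamic input)
    w = rng.uniform(-0.01, 0.01, k + 1)   # OFC weights
    ba, bo = 0.0, 0.0
    for _ in range(epochs):
        for p, t in zip(P, T):
            pa = np.append(p, p.max())                    # Eq. (1): p_{k+1}
            Ea = v @ pa + ba                              # Eq. (3)
            Eo = w[:k] @ p + bo                           # Eq. (4): j = 1..k
            v += lr * max(t - Ea, 0.0) * pa               # Eq. (6)
            w += lr * (Ea - Eo - t) * pa                  # Eq. (7): j = 1..k+1
            ba += lr * max(t - Ea, 0.0)                   # Eq. (8)
            bo += lr * (Ea - Eo - t)                      # Eq. (9)
    return v, w, ba, bo

def bel_predict(p, v, w, ba, bo):
    pa = np.append(p, p.max())
    return tansig(v @ pa + ba - (w[:len(p)] @ p + bo))    # Eq. (5)

# Toy two-feature example: high-valued patterns -> class 1, low -> class 0
P = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
T = np.array([1.0, 1.0, 0.0, 0.0])
params = bel_train(P, T)
hi = bel_predict(np.array([0.85, 0.85]), *params)
lo = bel_predict(np.array([0.15, 0.15]), *params)
```

Note the asymmetry the equations encode: the AMYG weights only grow (the max(·, 0) term), while the OFC learns to inhibit the excess of Ea over the target.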
where lr is the learning rate, t is the binary target, t − Ea is the
calculated error, ba is the bias of the AMYG neuron and bo is the
bias of the OFC neuron. The v1, v2,…, vk+1 are AMYG learning
weights and w1, w2,…, wk+1 are OFC learning weights. Eqs. (3)–(9)
define the multiple-input single-output model. In Figs. 2 and 3,
the equations are extended to multiple-input multiple-output usage.
The input training microarray data in Fig. 2 includes the two matrices
P and T. The size of the matrix P is m × s, where m is the number
of patterns and s is the number of features in each pattern (s ≫ k).
The size of the matrix T is m × c, where c is the number of classes.
The targets are binary encoded, so each row of matrix T includes
only one “1” and the other columns are “0”. In the flowcharts, pi
denotes the ith pattern and ti is the related target.
The learning rate lr can be adaptively adjusted to increase the
performance. The flowchart in Fig. 2 shows this adaptation and
the related parameters: the ratio to increase the learning
rate (lr_inc), initialized with 1.05; the ratio to decrease the learning
rate (lr_dec), with the initial value 0.7; the maximum performance
increase (minc), with the initial value 1.04; and the first performance
(perf_f; in step 4 of the flowchart) and last performance (perf_l),
both calculated as MSE. The initial lr = 0.001 and the learning
weights are initialized randomly (step 3 in the flowchart). According
to the algorithm, if (perf_l/perf_f) > minc then lr = lr × lr_dec;
else if perf_l < perf_f, then lr = lr × lr_inc. In Fig. 2, the stop
criterion is reaching a predetermined maximum number of learning
epochs (for example 10,000 epochs). Fig. 2 presents the learning
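The adaptation rule described above can be written directly (a sketch of the rule as stated; the function name `adapt_lr` is ours):

```python
def adapt_lr(lr, perf_f, perf_l, lr_inc=1.05, lr_dec=0.7, minc=1.04):
    """Adapt the learning rate from the first (perf_f) and last (perf_l)
    MSE performances, as in the Fig. 2 flowchart."""
    if perf_l / perf_f > minc:
        return lr * lr_dec    # error grew by more than minc: slow down
    elif perf_l < perf_f:
        return lr * lr_inc    # error decreased: speed up
    return lr                 # otherwise keep lr unchanged

print(adapt_lr(0.001, 1.0, 1.1))   # error grew  -> 0.001 * 0.7
print(adapt_lr(0.001, 1.0, 0.9))   # error fell  -> 0.001 * 1.05
```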
step and Fig. 3 shows the flowchart of the testing step. The inputs of
the algorithm presented in Fig. 3 are a testing pattern, number of
classes and the weights adjusted in the learning step. The last step of
the algorithm is associated to the diagnosis where the index of the
maximum E shows the class number of the pattern.
3. Experimental studies
The source code of the proposed method is accessible from
http://www.bitools.ir/tprojects.html, and the method is evaluated on
the gene expression microarray data of the small round blue cell
tumors (SRBCTs), high grade gliomas (HGG), lung, colon and breast
cancer datasets. The
SRBCTs dataset is a 4-class cDNA microarray dataset and contains 2308
genes and 83 samples including 29 samples in Ewing's sarcoma
(EWS), 25 in rhabdomyosarcoma (RMS), 18 in neuroblastoma (NB)
and 11 in Burkitt lymphoma (BL). This data set can be obtained from
http://research.nghri.nih.gov/microarray/Supplement/. In the
proposed algorithm, the maximum learning epoch = 10,000, k = 100,
and the initial lr is set to 0.001, 0.000001 and 0.001 for the SRBCT,
HGG and lung cancer datasets respectively. These parameters were
picked empirically; k = 100 with lr = 0.001 or 0.000001 gives
better results for these datasets. In other applications, these
parameters should be optimized.
The HGG dataset applied here consists of 50 samples with 12,625
genes, including 14 classic glioblastomas, 14 non-classic
glioblastomas, 7 classic anaplastic oligodendrogliomas and 15
non-classic anaplastic oligodendrogliomas. The HGG dataset is
accessible from http://www.broadinstitute.org. In this dataset, the
number of patterns is much less than the number of features in each
sample, so it may be difficult for classification methods to classify the data.
In the lung cancer dataset, there are 181 tissue samples in two
classes: 31 points are malignant pleural mesothelioma and 150 points
are adenocarcinoma. Each sample is described by 12,533 genes. This
data set is also accessible from http://datam.i2r.a-star.edu.sg/datasets/.
Other datasets, applied here, are colon and breast cancer datasets that
are accessible from http://genomics-pubs.princeton.edu/oncology/
affydata/index.html and http://datam.i2r.a-star.edu.sg/datasets/krbd/
BreastCancer/BreastCancer.html, respectively. The colon dataset
includes 62 tissue samples with 2000 genes and the breast cancer
dataset consists of 97 samples and 24,481 genes.
Here and prior to entering comparative numerical studies, let
us analyze the computational complexity of the proposed BEL.
Regarding the learning step, the algorithm adjusts O(2n) weights for
each pattern–target sample, where n is the number of input
attributes (for example, n = 12,625 for the HGG database). Let us
compare this computational complexity with traditional neural
networks and a supervised orthogonal discriminant projection
classifier (SODP; [80]) applied in cancer detection. As mentioned
above, the computational complexity of the proposed classifier is
O(n). In contrast, the computational time is O(cn) for a neural
network and O(n²) for SODP. In the NN architecture, c is the number
of hidden neurons (generally c = 10), and SODP uses a Lagrangian
multiplier that imposes a complexity of O(n²). So the proposed
method has a lower computational complexity. This improved
computing efficiency can be important for high dimensional feature
vector classification and cancer detection. The key to the proposed
method is the fast processing resulting from low computational
complexity, which makes it suitable for cancer detection.
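To make the comparison concrete, the rough number of adjustable parameters can be counted (an illustration under the stated assumptions; the NN count assumes a single hidden layer with c = 10 neurons, and exact totals depend on the architecture):

```python
def bel_params(n):
    # (n + 1) AMYG weights + (n + 1) OFC weights + two biases -> O(n)
    return 2 * (n + 1) + 2

def nn_params(n, c=10):
    # one hidden layer: c * (n + 1) hidden weights + (c + 1) output weights -> O(c * n)
    return c * (n + 1) + (c + 1)

n = 12625                # number of genes in the HGG dataset
print(bel_params(n))     # 25254
print(nn_params(n))      # 126271
```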
Another important point observed across the experimental
implementations is that the results of the proposed model change
with the initial lr and k values. lr is the learning rate and k
specifies the number of initial principal components kept by PCA in
the algorithm. In other words, the values of lr and k should be
optimized for each problem. Here, the optimum values of 0.001,
0.000001, 0.001, 0.00001 and 0.000001 are assigned to lr, and 100
to k, for the SRBCT, HGG, lung, colon and breast cancer datasets
respectively. The values assigned to lr were chosen from 0.1, 0.001,
0.0001, 0.00001… 0.0000000001 and, in the case of k, from the
values 10, 50 and 100 through implementation and observation.
The proposed method is compared with the results of the
methods which have been reported by Zhang and Zhang [80].
They have reported the results based on the 5-fold cross validation
method. This implementation can result in the assessment of
accuracy and repeatability and it can be used to validate the
proposed method [46]. The compared methods include supervised
locally linear embedding (SLLE), probability-based locally linear
embedding (PLLE), locally linear discriminant embedding (LLDE),
constrained maximum variance mapping (CMVU), orthogonal
discriminant projection (ODP) and supervised orthogonal discri-
minant projection (SODP).
These methods are extended manifold approaches that have been
successfully used in tumor classification. SLLE, PLLE and LLDE are
extended versions of the locally linear embedding (LLE) that is a
classical manifold method. SODP is an extended version of ODP and
CMVU is a linear approximation of multi-manifolds learning method.
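The 5-fold cross-validation protocol behind these comparisons can be sketched generically (a plain-numpy illustration; `train_fn` and `predict_fn` are placeholders for any classifier, such as the PCA–BEL pipeline):

```python
import numpy as np

def five_fold_accuracy(X, y, train_fn, predict_fn, seed=0):
    """Average accuracy over 5-fold cross validation: shuffle the sample
    indices, split them into 5 folds, and train on 4 while testing on 1."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, 5)
    accs = []
    for i in range(5):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(5) if j != i])
        model = train_fn(X[train], y[train])
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return float(np.mean(accs))
```

A stratified split (keeping class proportions per fold) is often preferred for the small, unbalanced class sizes typical of microarray data.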
Figs. 4–8 show the comparative results based on average
accuracy of 5-fold cross validation. As illustrated in the figures,
the proposed model shows consistent results and provides higher
performance in SRBCT, HGG and Lung cancer (Figs. 4–6). Table 1
presents the percentage improvement of PCA–BEL with respect to
the best compared method reported by [80]. The best method in
SRBCT and HGG detection is the supervised orthogonal discriminant
projection (SODP) algorithm, with 96.56% and 73.74% average
accuracy, while in lung cancer classification the best method is
locally linear discriminant embedding (LLDE), with an average
accuracy of 93.18%. The proposed method improves these results.
Fig. 3. The testing step of the proposed method for class diagnosis of an input tissue.
Fig. 4. The accuracy comparison between various methods and proposed PCA–BEL
in SRBCTs classification problem.
Fig. 5. The accuracy comparison between various methods and proposed PCA–BEL
in HGG classification problem.
Fig. 6. The accuracy comparison between various methods and proposed PCA–BEL
in the lung cancer classification problem.
Fig. 7. The accuracy comparison between various methods reported by Zhang and
Zhang [80] and proposed PCA–BEL in the colon classification problem.
It seems that SRBCT and lung cancer are rather simple challenges
for the classifiers in terms of complexity, since the best
compared classifiers, i.e. SODP and LLDE (Table 1 and Figs. 4 and 6),
have been able to exhibit detection accuracies of 96.56% and
93.18%. The proposed model improves these numbers by 3.56%
and 5.52%, raising the accuracy to 100% and 98.32% for SRBCT
and lung cancer, respectively (Table 1).
The detection accuracy of the proposed model is most
significant for HGG. This dataset seems too complex for the other
classifiers, because the best detection accuracy achieved for HGG
is 73.74%, using the SODP method (Table 1 and Fig. 5). The proposed
PCA–BEL achieves a 30.18% improvement, which results in a 96%
accuracy rate. However, the results for the colon and breast cancers
obtained from PCA–BEL, 87.40% and 88% accuracy, do not show any
significant improvement over the existing methods (Figs. 7 and 8).
The percentage improvement of the proposed PCA–BEL is
summarized in Table 1 and calculated by the following formula:
Percentage improvement = 100 × (proposed method result − compared result) / (compared result)   (10)
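As a quick check, Eq. (10) reproduces the Table 1 figures (the small discrepancy for HGG comes from rounding of the reported value):

```python
def pct_improvement(proposed, compared):
    # Eq. (10): relative improvement in percent
    return 100.0 * (proposed - compared) / compared

print(round(pct_improvement(100.0, 96.56), 2))  # SRBCT: 3.56
print(round(pct_improvement(96.0, 73.74), 2))   # HGG: 30.19 (reported 30.18)
print(round(pct_improvement(98.32, 93.18), 2))  # lung: 5.52
```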
As illustrated in Table 1, the average accuracy of SRBCT, HGG
and lung cancer classification are 100%, 96% and 98.32% respec-
tively obtained from proposed PCA–BEL. Table 2 shows the
statistical details of the improved results. The confidence level
(confiLevel) in Table 2 is the Student's t-test with 95% confidence.
Finally Fig. 9 shows the averaged confusion matrix including
accuracy, precision and recall of improved results obtained from
proposed PCA–BEL in 5-fold. In Fig. 9a, the class numbers 1, 2,
3 and 4 belong to EWS, RMS, BL and NB respectively. In the
experimental results, 10,000 cycles is considered the maximum
number of learning cycles in every run, although this parameter
can change for different problems. The model needs at most 220
cycles to reach convergence and 100% accuracy, while more than
8000, or even the whole 10,000 cycles, are needed to reach
convergence in some folds of the HGG and lung cancer datasets.
This parameter should preferably be set to a large value; given the
low computational complexity of the method, increasing the number
of learning cycles even to 100,000 still results in an acceptable
computation time on modern computers.
4. Conclusions
In this paper, a novel gene-expression microarray classification
method is proposed based on PCA and the BEL network. In contrast
to many other classifiers, the proposed method has lower
computational complexity. Thus BEL can be considered an
alternative approach to overcome the curse of dimensionality
Table 1
Percentage improvement of classification of the small round blue cell tumor
(SRBCT), high grade gliomas (HGG) and Lung cancer, obtained from proposed
method. The compared methods are the supervised orthogonal discriminant
projection classifier (SODP) and locally linear discriminant embedding (LLDE)
which are the best compared methods (Figs. 4–6).
Problem SRBCT HGG Lung cancer
Compared method SODP SODP LLDE
Detection accuracy of compared method 96.56% 73.74% 93.18%
Detection accuracy of our PCA–BEL method 100% 96% 98.32%
Percentage improvement 3.56% 30.18% 5.52%
Table 2
The statistical results of proposed PCA–BEL in three following improved problems:
the small round blue cell tumor (SRBCT), high grade gliomas (HGG) and Lung
cancer datasets. The first five rows show the detection accuracy of the folds and the
remaining rows present statistical information including the maximum, mean,
standard deviation (STD) of the results, and the confidence level (ConfiLevel) based
on the Student's t-test with 95% confidence.
Foldnumber SRBCT (%) HGG (%) Lung cancer (%)
F#1 100.00 100.00 100.00
F#2 100.00 80.00 94.40
F#3 100.00 100.00 97.22
F#4 100.00 100.00 100.00
F#5 100.00 100.00 100.00
Max 100.00 100.00 100.00
Average 100.00 96.00 98.32
STD 0.00 8.94 2.50
ConfiLevel 0.00 11.10 3.10
Fig. 8. The accuracy comparison between various methods reported by Zhang and
Zhang [80] and proposed PCA–BEL in the breast cancer classification problem.
Fig. 9. The averaged confusion matrix of improved problems including (a) SRBCT, (b) HGG and (c) lung cancer datasets.
problem. The proposed model is accessible from
http://www.bitools.ir/projects.html and is utilized for classification tasks of
SRBCT, HGG, lung, colon and breast cancer datasets. According to
the experimental results, the proposed method is more accurate
than traditional methods in SRBCT, HGG and lung datasets. PCA–
BEL improves the detection accuracy about 3.56%, 30.18% and
5.52% obtained respectively from SRBCT, HGG and lung cancer. The
results indicate the superiority of the approach in terms of higher
accuracy and lower computational complexity. Hence, it is
expected that the proposed approach can be generally applicable
to high dimensional feature vector classification problems.
However, the proposed approach has a drawback. Like many
other methods that use PCA, it does not extract the informative
genes. As mentioned in Section 1, PCA is a feature extraction
method and cannot select features. For future improvement, the
informative genes should be determined, which requires adding a
feature selection step to the proposed method. This can be
considered the next step of this research effort, i.e. a proper
feature selection method should be found and substituted for the
PCA step of the proposed method. Furthermore, in order for the
proposed method to provide a proper response in other cancer
classification problems, the lr and k parameters should be
optimized specifically for each problem. This can also be
considered in future work, on other datasets such as prostate cancer.
Conflict of interest
There is no conflict of interest.
References
[1] M. Alshalalfa, G. Naji, A. Qabaja, R. Alhajj, Combining multiple perspective as
intelligent agents into robust approach for biomarker detection in gene
expression data, Int. J. Data Min. Bioinform. 5 (3) (2011) 332–350.
[2] P. Baldi, A.D. Long, A Bayesian framework for the analysis of microarray
expression data: regularized t-test and statistical inferences of gene changes,
Bioinformatics 17 (6) (2001) 509–519.
[3] C. Balkenius, J. Morén, Emotional learning: a computational model of amyg-
dala, Cybern. Syst. 32 (6) (2001) 611–636.
[4] R. Cai, Z. Zhang, Z. Hao, Causal gene identification using combinatorial
V-structure search, Neural Netw. 43 (2013) 63–71.
[5] A.H. Chen, C.H. Lin, A novel support vector sampling technique to improve
classification accuracy and to identify key genes of leukaemia and prostate
cancers, Expert Syst. Appl. 38 (4) (2011) 3209–3219.
[6] J.H. Chiang, S.H. Ho, A combination of rough-based feature selection and RBF
neural network for classification using gene expression data, NanoBiosci. IEEE
Trans. 7 (1) (2008) 91–99.
[7] W.K. Ching, L. Li, N.K. Tsing, C.W. Tai, T.W. Ng, A. Wong, K.W. Cheng, A
weighted local least squares imputation method for missing value estimation
in microarray gene expression data, Int. J. Data Min. Bioinform. 4 (3) (2010)
331–347.
[8] D. Chung, H. Kim, Robust classification ensemble method for microarray data,
Int. J. Data Min. Bioinform. 5 (5) (2011) 504–518.
[9] Y.R. Cho, A. Zhang, X. Xu, Semantic similarity based feature extraction from
microarray expression data, Int. J. Data Min. Bioinform. 3 (3) (2009) 333–345.
[10] J. Dai, Q. Xu, Attribute selection based on information gain ratio in fuzzy rough
set theory with application to tumor classification, Appl. Soft Comput. 13 (1)
(2013) 211–221.
[11] D. Dembele, P. Kastner, Fuzzy C-means method for clustering microarray data,
Bioinformatics 19 (8) (2003) 973–980.
[12] Z. Deng, K.S. Choi, F.L. Chung, S. Wang, EEW-SC: Enhanced Entropy-Weighting
Subspace Clustering for high dimensional gene expression data clustering
analysis, Appl. Soft Comput. 11 (8) (2011) 4798–4806.
[13] M. Dhawan, S. Selvaraja, Z.H. Duan, Application of committee kNN classifiers
for gene expression profile classification, Int. J. Bioinform. Res. Appl. 6 (4)
(2010) 344–352.
[14] C. Ding, H. Peng, Minimum redundancy feature selection from microarray
gene expression data, J. Bioinform. Comput. Biol. 3 (02) (2005) 185–205.
[15] J.P. Fadok, M. Darvas, T.M. Dickerson, R.D. Palmiter, Long-term memory for
pavlovian fear conditioning requires dopamine in the nucleus accumbens and
basolateral amygdala, PloS One 5 (9) (2010) e12751.
[16] F. Fernández-Navarro, C. Hervás-Martínez, R. Ruiz, J.C. Riquelme, Evolutionary
generalized radial basis function neural networks for improving prediction
accuracy in gene classification using feature selection, Appl. Soft Comput. 12
(6) (2012) 1787–1800.
[17] R. Gallassi, L. Sambati, R. Poda, M.S. Maserati, F. Oppi, M. Giulioni, P. Tinuper,
Accelerated long-term forgetting in temporal lobe epilepsy: evidence of
improvement after left temporal pole lobectomy, Epilepsy Behav. 22 (4)
(2011) 793–795.
[18] J.M. García-Gómez, J. Gómez-Sanchs, P. Escandell-Montero, E. Fuster-Garcia,
E. Soria-Olivas, Sparse Manifold Clustering and Embedding to discriminate
gene expression profiles of glioblastoma and meningioma tumors, Comput.
Biol. Med. 43 (11) (2013) 1863–1869.
[19] S. Ghorai, A. Mukherjee, P.K. Dutta, Gene expression data classification by
VVRKFA, Procedia Technol. 4 (2012) 330–335.
[20] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov,
E.S. Lander, Molecular classification of cancer: class discovery and class
prediction by gene expression monitoring, Science 286 (5439) (1999)
531–537.
[21] E.M. Griggs, E.J. Young, G. Rumbaugh, C.A. Miller, MicroRNA-182 regulates
amygdala-dependent memory formation, J. Neurosci. 33 (4) (2013) 1734–1740.
[22] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification
using support vector machines, Mach. Learn. 46 (1) (2002) 389–422.
[23] C. Gillies, N. Patel, J. Akervall, G. Wilson, Gene expression classification using
binary rule majority voting genetic programming classifier, Int. J. Adv. Intell.
Paradig. 4 (3) (2012) 241–255.
[24] O. Hardt, K. Nader, L. Nadel, Decay happens: the role of active forgetting in
memory, Trends Cogn. Sci. 17 (3) (2013) 111–120.
[25] H. Hong, Q. Hong, J. Liu, W. Tong, L. Shi, Estimating relative noise to signal in
DNA microarray data, Int. J. Bioinform. Res. Appl. 9 (5) (2013) 433–448.
[26] D.S. Huang, C.H. Zheng, Independent component analysis-based penalized
discriminant method for tumor classification using gene expression data,
Bioinformatics 22 (15) (2006) 1855–1862.
[27] N. Iam-On, T. Boongoen, S. Garrett, C. Price, New cluster ensemble approach to
integrative biological data analysis, Int. J. Data Min. Bioinform. 8 (2) (2013)
150–168.
[28] A. Jose, D. Mugler, Z.H. Duan, A gene selection method for classifying cancer
samples using 1D discrete wavelet transform, Int. J. Comput. Biol. Drug Des. 2
(4) (2009) 398–411.
[29] J. Khan, J.S. Wei, M. Ringner, L.H. Saal, M. Ladanyi, F. Westermann, P.S. Meltzer,
Classification and diagnostic prediction of cancers using gene expression
profiling and artificial neural networks, Nat. Med. 7 (6) (2001) 673–679.
[30] M. Khashei, A. Zeinal Hamadani, M. Bijari, A fuzzy intelligent approach to the
classification problem in gene expression data analysis, Knowl.-Based Syst. 27
(2012) 465–474.
[31] J.H. Kim, S. Li, A.S. Hamlin, G.P. McNally, R. Richardson, Phosphorylation of
mitogen-activated protein kinase in the medial prefrontal cortex and the
amygdala following memory retrieval or forgetting in developing rats,
Neurobiol. Learn. Mem. 97 (1) (2011) 59–68.
[32] Y.K. Lam, P.W. Tsang, eXploratory K-Means: a new simple and efficient
algorithm for gene clustering, Appl. Soft Comput. 12 (3) (2012) 1149–1157.
[33] R. Lamprecht, S. Hazvi, Y. Dudai, cAMP response element-binding protein in
the amygdala is required for long-but not short-term conditioned taste
aversion memory, J. Neurosci. 17 (21) (1997) 8443–8450.
[34] C.P. Lee, Y. Leu, A novel hybrid feature selection method for microarray data
analysis, Appl. Soft Comput. 11 (1) (2011) 208–213.
[35] H. Liu, J. Li, L. Wong, A comparative study on feature selection and classifica-
tion methods using gene expression profiles and proteomic patterns, Genome
Inform. Ser. 13 (2002) 51–60.
[36] B. Liu, Q. Cui, T. Jiang, S. Ma, A combinational feature selection and ensemble
neural network method for classification of gene expression data, BMC
Bioinform. 5 (1) (2004) 136.
[37] Y. Liu, Wavelet feature extraction for high-dimensional microarray data,
Neurocomputing 72 (4) (2009) 985–990.
[38] E. Lotfi, M.R. Akbarzadeh-T, Supervised brain emotional learning, in: IEEE
International Joint Conference on Neural Networks (IJCNN), 2012, pp. 1–6, http://
dx.doi.org/10.1109/IJCNN.2012.6252391.
[39] E. Lotfi, M.R. Akbarzadeh-T, Brain Emotional Learning-Based Pattern Recogni-
zer, Cybern. Syst. 44 (5) (2013) 402–421.
[40] E. Lotfi, M.R. Akbarzadeh-T, Emotional brain-inspired adaptive fuzzy decayed
learning for online prediction problems, in: 2013 IEEE International Confer-
ence on Fuzzy Systems (FUZZ), IEEE, 2013, July, pp. 1–7.
[41] E. Lotfi, M.R. Akbarzadeh-T, Adaptive brain emotional decayed learning for
online prediction of geomagnetic activity indices, Neurocomputing 126 (2014)
188–196.
[42] E. Lotfi, M.R. Akbarzadeh-T, Practical emotional neural networks, Neural
Networks 59 (2014) 61–72. http://dx.doi.org/10.1016/j.neunet.2014.06.012.
[43] E. Lotfi, S. Setayeshi, S. Taimory, A neural basis computational model of
emotional brain for online visual object recognition, Appl. Artif. Intell. 28
(2014) 1–21. http://dx.doi.org/10.1080/08839514.2014.952924.
[44] Z. Liu, D. Chen, Y. Xu, J. Liu, Logistic support vector machines and their
application to gene expression data, Int. J. Bioinform. Res. Appl. 1 (2) (2005)
169–182.
[45] C. Lucas, D. Shahmirzadi, N. Sheikholeslami, Introducing BELBIC: brain emo-
tional learning based intelligent controller, Int. J. Intell. Autom. Soft Comput.
10 (2004) 11–21.
[46] M. Meselhy Eltoukhy, I. Faye, B. Belhaouari Samir, A statistical based feature
extraction method for breast cancer diagnosis in digital mammogram using
multiresolution representation, Comput. Biol. Med. 42 (1) (2012) 123–128.
E. Lotfi, A. Keshavarz / Computers in Biology and Medicine 54 (2014) 180–187
[47] V.S. Tseng, H.H. Yu, Microarray data classification by multi-information based
gene scoring integrated with Gene Ontology, Int. J. Data Min. Bioinform. 5 (4)
(2011) 402–416.
[48] M. Xiong, L. Jin, W. Li, E. Boerwinkle, Computational methods for gene
expression-based tumor classification, Biotechniques 29 (6) (2000) 1264–1271.
[50] M. Reboiro-Jato, D. Glez-Peña, F. Díaz, F. Fdez-Riverola, A novel ensemble
approach for multicategory classification of DNA microarray data using
biological relevant gene sets, Int. J. Data Min. Bioinform. 6 (6) (2012)
602–616.
[51] L. Nanni, A. Lumini, Ensemblator: an ensemble of classifiers for reliable
classification of biological data, Pattern Recognit. Lett. 28 (5) (2007) 622–630.
[53] T. Prasartvit, A. Banharnsakun, B. Kaewkamnerdpong, T. Achalakul, Reducing
bioinformatics data dimension with ABC-kNN, Neurocomputing 116 (2013)
367–381. http://dx.doi.org/10.1016/j.neucom.2012.01.045.
[54] M. Perez, D.M. Rubin, L.E. Scott, T. Marwala, W. Stevens, A hybrid fuzzy-svm
classifier, applied to gene expression profiling for automated leukaemia
diagnosis, in: IEEE 25th Convention of Electrical and Electronics Engineers
in Israel, 2008, IEEEI 2008, IEEE, 2008, December, pp. 041–045.
[55] Y. Peng, A novel ensemble machine learning for robust microarray data
classification, Comput. Biol. Med. 36 (6) (2006) 553–573.
[56] L.P. Petalidis, A. Oulas, M. Backlund, M.T. Wayland, L. Liu, K. Plant, V.P. Collins,
Improved grading and survival prediction of human astrocytic brain tumors
by artificial neural network analysis of gene expression microarray data, Mol.
Cancer Ther. 7 (5) (2008) 1013–1024.
[57] L.E. Peterson, M. Ozen, H. Erdem, A. Amini, L. Gomez, C.C. Nelson, M. Ittmann,
Artificial neural network analysis of DNA microarray-based prostate cancer
recurrence, in: Proceedings of the 2005 IEEE Symposium on Computational
Intelligence in Bioinformatics and Computational Biology, 2005, CIBCB'05,
IEEE, 2005, November, pp. 1–8.
[58] L.E. Peterson, M.A. Coleman, Machine learning-based receiver operating
characteristic (ROC) curves for crisp and fuzzy classification of DNA micro-
arrays in cancer research, Int. J. Approx. Reason. 47 (1) (2008) 17–36.
[59] I. Porto-Díaz, V. Bolón-Canedo, A. Alonso-Betanzos, O. Fontenla-Romero, A
study of performance on microarray data sets for a classifier based on
information theoretic learning, Neural Netw. 24 (8) (2011) 888–896.
[60] S. Saha, A. Ekbal, K. Gupta, S. Bandyopadhyay, Gene expression data clustering
using a multiobjective symmetry based clustering technique, Comput. Biol.
Med. 43 (11) (2013) 1965–1977.
[61] A. Statnikov, C.F. Aliferis, I. Tsamardinos, D. Hardin, S. Levy, A comprehensive
evaluation of multicategory classification methods for microarray gene
expression cancer diagnosis, Bioinformatics 21 (5) (2005) 631–643.
[62] X. Sun, Y. Liu, M. Xu, H. Chen, J. Han, K. Wang, Feature selection using dynamic
weights for classification, Knowl.-Based Syst. 37 (2013) 541–549. http://dx.doi.
org/10.1016/j.knosys.2012.10.001.
[63] M. Song, S. Rajasekaran, A greedy algorithm for gene selection based on SVM
and correlation, Int. J. Bioinform. Res. Appl. 6 (3) (2010) 296–307.
[64] T.Z. Tan, C. Quek, G.S. Ng, Ovarian cancer diagnosis by hippocampus and
neocortex-inspired learning memory structures, Neural Netw. 18 (5) (2005)
818–825.
[65] T.Z. Tan, C. Quek, G.S. Ng, E.Y.K. Ng, A novel cognitive interpretation of breast
cancer thermography with complementary learning fuzzy neural memory
structure, Expert Syst. Appl. 33 (3) (2007) 652–666.
[66] T.Z. Tan, G.S. Ng, C. Quek, Complementary learning fuzzy neural network: an
approach to imbalanced dataset, in: International Joint Conference on Neural
Networks, 2007, IJCNN 2007, IEEE, 2007, pp. 2306–2311.
[67] T.Z. Tan, C. Quek, G.S. Ng, K. Razvi, Ovarian cancer diagnosis with comple-
mentary learning fuzzy neural network, Artif. Intell. Med. 43 (3) (2008)
207–222.
[68] M. Takahashi, H. Hayashi, Y. Watanabe, K. Sawamura, N. Fukui, J. Watanabe,
T. Someya, Diagnostic classification of schizophrenia by neural network
analysis of blood-based gene expression signatures, Schizophr. Res. 119 (1)
(2010) 210–218.
[69] D.L. Tong, A.C. Schierz, Hybrid genetic algorithm-neural network: feature
extraction for unpreprocessed microarray data, Artif. Intell. Med. 53 (1) (2011)
47–56.
[70] M. Tong, K.H. Liu, C. Xu, W. Ju, An ensemble of SVM classifiers based on gene
pairs, Comput. Biol. Med. 43 (6) (2013) 729–737.
[71] M.H. Tseng, H.C. Liao, The genetic algorithm for breast tumor diagnosis – the
case of DNA viruses, Appl. Soft Comput. 9 (2) (2009) 703–710.
[72] P. Vadakkepat, L.A. Poh, Fuzzy-rough discriminative feature selection and
classification algorithm, with application to microarray and image datasets,
Appl. Soft Comput. 11 (4) (2011) 3429–3440.
[73] V. Vinaya, N. Bulsara, C.J. Gadgil, M. Gadgil, Comparison of feature selection
and classification combinations for cancer classification using microarray data,
Int. J. Bioinform. Res. Appl. 5 (4) (2009) 417–431.
[74] S.L. Wang, X. Li, S. Zhang, J. Gui, D.S. Huang, Tumor classification by combining
PNN classifier ensemble with neighborhood rough set based gene reduction,
Comput. Biol. Med. 40 (2) (2010) 179–189.
[75] Y.F. Wang, Z.G. Yu, V. Anh, Fuzzy C–means method with empirical mode
decomposition for clustering microarray data, Int. J. Data Min. Bioinform. 7 (2)
(2013) 103–117.
[76] A. Yardimci, Soft computing in medicine, Appl. Soft Comput. 9 (3) (2009)
1029–1043.
[77] S.H. Yeh, C.H. Lin, P.W. Gean, Acetylation of nuclear factor-κB in rat amygdala
improves long-term but not short-term retention of fear memory, Mol.
Pharmacol. 65 (5) (2004) 1286–1292.
[78] K.Y. Yeung, W.L. Ruzzo, Principal component analysis for clustering gene
expression data, Bioinformatics 17 (9) (2001) 763–774.
[79] Y. Zhang, J. Xuan, R. Clarke, H.W. Ressom, Module-based breast cancer
classification, Int. J. Data Min. Bioinform. 7 (3) (2013) 284–302.
[80] C. Zhang, S. Zhang, A supervised orthogonal discriminant projection for tumor
classification using gene expression data, Comput. Biol. Med. 43 (5) (2013)
568–575. http://dx.doi.org/10.1016/j.compbiomed.2013.01.019.
[81] Z. Zainuddin, P. Ong, Reliable multiclass cancer classification of microarray
gene expression profiles using an improved wavelet neural network, Expert
Syst. Appl. 38 (11) (2011) 13711–13722.
[82] X.L. Xia, K. Li, G.W. Irwin, Two-stage gene selection for support vector machine
classification of microarray data, Int. J. Model. Identif. Control 8 (2) (2009)
164–171.
[83] M. Xiong, X. Fang, J. Zhao, Biomarker identification by feature wrappers,
Genome Res. 11 (11) (2001) 1878–1887.
[84] H. Xiong, S. Shekhar, P.N. Tan, V. Kumar, Exploiting a support-based upper
bound of Pearson's correlation coefficient for efficiently identifying strongly
correlated pairs, in: Proceedings of the Tenth ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, ACM, 2004, August,
pp. 334–343.
 
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
IJECEIAES
 

Similar to 2014 Gene expressionmicroarrayclassification usingPCA–BEL. (20)

Ijetcas14 327
Ijetcas14 327Ijetcas14 327
Ijetcas14 327
 
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
 
An Ensemble of Filters and Wrappers for Microarray Data Classification
An Ensemble of Filters and Wrappers for Microarray Data Classification An Ensemble of Filters and Wrappers for Microarray Data Classification
An Ensemble of Filters and Wrappers for Microarray Data Classification
 
An Ensemble of Filters and Wrappers for Microarray Data Classification
An Ensemble of Filters and Wrappers for Microarray Data Classification An Ensemble of Filters and Wrappers for Microarray Data Classification
An Ensemble of Filters and Wrappers for Microarray Data Classification
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
A Threshold Fuzzy Entropy Based Feature Selection:  Comparative StudyA Threshold Fuzzy Entropy Based Feature Selection:  Comparative Study
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
 
A hybrid wrapper spider monkey optimization-simulated annealing model for opt...
A hybrid wrapper spider monkey optimization-simulated annealing model for opt...A hybrid wrapper spider monkey optimization-simulated annealing model for opt...
A hybrid wrapper spider monkey optimization-simulated annealing model for opt...
 
Updated proposal powerpoint.pptx
Updated proposal powerpoint.pptxUpdated proposal powerpoint.pptx
Updated proposal powerpoint.pptx
 
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO... ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...
 
Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...
Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...
Leave one out cross validated Hybrid Model of Genetic Algorithm and Naïve Bay...
 
Feature Selection Approach based on Firefly Algorithm and Chi-square
Feature Selection Approach based on Firefly Algorithm and Chi-square Feature Selection Approach based on Firefly Algorithm and Chi-square
Feature Selection Approach based on Firefly Algorithm and Chi-square
 
Comparison of breast cancer classification models on Wisconsin dataset
Comparison of breast cancer classification models on Wisconsin  datasetComparison of breast cancer classification models on Wisconsin  dataset
Comparison of breast cancer classification models on Wisconsin dataset
 
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
 
Feature selection and microarray data
Feature selection and microarray dataFeature selection and microarray data
Feature selection and microarray data
 
Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...
 
Evolving Efficient Clustering and Classification Patterns in Lymphography Dat...
Evolving Efficient Clustering and Classification Patterns in Lymphography Dat...Evolving Efficient Clustering and Classification Patterns in Lymphography Dat...
Evolving Efficient Clustering and Classification Patterns in Lymphography Dat...
 
Positive Impression of Low-Ranking Microrn as in Human Cancer Classification
Positive Impression of Low-Ranking Microrn as in Human Cancer ClassificationPositive Impression of Low-Ranking Microrn as in Human Cancer Classification
Positive Impression of Low-Ranking Microrn as in Human Cancer Classification
 
JUNE-77.pdf
JUNE-77.pdfJUNE-77.pdf
JUNE-77.pdf
 
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fu...
 

2014 Gene expressionmicroarrayclassification usingPCA–BEL.

gene expression data in cancer diagnosis. The main challenge is the huge number of genes compared to the small number of available training samples [47]. Microarray training samples are typically gathered from fewer than one hundred patients, while each sample usually contains thousands of genes. Furthermore, microarray data contain abundant redundancy, missing values [7] and noise arising from biological and technical factors [25,75]. In the literature, there are two general approaches to these issues: feature selection and feature extraction. A feature selection method selects a feature subset from the original feature space and provides the marker and causal genes [9,4,1], which can identify cancers quickly and easily. Feature extraction methods, in contrast, normally transform the original data into another space to generate a new set of features with high information-packing properties. In either approach, the reduced features are passed to a proper classifier for diagnosis. A proper classifier increases detection accuracy and can influence the feature reduction step.

This paper aims to review these approaches, investigate recently developed methodology and propose a proper feature reduction and classification method for cancer detection. The organization of the paper is as follows: feature selection methods are reviewed in Section 1.1. Section 1.2 explains feature extraction methods and Section 2 presents the proposed method. Experimental results on cancer classification are evaluated in Section 3. Finally, conclusions are drawn in Section 4.

1.1. Feature selection methods

Researchers have developed various feature selection methods for classification. These methods fall into three categories: the filter model [62], the wrapper model and the embedded model [19].
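The filter model ranks genes independently of any classifier. As a minimal sketch (assuming a two-class problem and a pooled-variance t-statistic as the ranking criterion; the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def t_statistic_rank(X, y, top_n=50):
    """Filter-style gene ranking: score each gene independently with a
    two-sample t-statistic and return the indices of the top genes."""
    a, b = X[y == 0], X[y == 1]
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) +
                 b.var(axis=0, ddof=1) / len(b))
    scores = np.abs(mean_diff / (se + 1e-12))   # guard against zero variance
    return np.argsort(scores)[::-1][:top_n]     # highest scores first

# Toy data: 20 samples x 100 genes; gene 0 is made strongly informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))
y = np.array([0] * 10 + [1] * 10)
X[y == 1, 0] += 5.0
top_genes = t_statistic_rank(X, y, top_n=5)
```

Keeping, say, the top 50 scored genes corresponds to the kind of threshold choice attributed to Golub et al. [20] below.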
[Computers in Biology and Medicine 54 (2014) 180–187. http://dx.doi.org/10.1016/j.compbiomed.2014.09.008. Corresponding author e-mail: esilotf@gmail.com (E. Lotfi).]

The filter model treats feature selection and classifier learning as two separate steps and uses general characteristics of the training data to select features. It includes both traditional methods, which often evaluate genes separately, and newer methods that take gene-to-gene correlation into account. These methods rank the genes and select the top-ranked genes as input features for the learning step. Gene ranking methods need a threshold on the number of genes to be selected; for example, Golub et al. [20] proposed selecting the top 50 genes. The filter model also needs a criterion by which to rank the genes. Liu et al. [35] and Golub et al. [20] investigated filter methods based on statistical tests and information gain. Examples of filter criteria include the Pearson correlation coefficient method [84], the t-statistics method [2] and the
signal-to-noise ratio method [20]. The time complexity of these methods is O(N), where N is the dimensionality. This is efficient, but filter methods cannot remove redundant genes, an issue studied in the recent literature [83,78,14,26,37].

In the wrapper model, a subset of genes is selected and the accuracy of a predetermined learning algorithm is estimated to judge the quality of the selected subset. In the wrapper model of Xiong et al. [83], the selected subsets are learned by three algorithms: linear discriminant analysis, logistic regression and support vector machines. These classifiers must be run for every subset of genes selected from the search space, so the procedure has a high computational complexity. Like the wrapper methods, the embedded models select genes as part of a specific learning method, but with lower computational complexity [19]. The subset selection methods of the wrapper model can be categorized into population-based methods [71,34,53] and backward selection methods.

Recently, Lee and Leu [34] and Tong and Schierz [69] shed light on the effectiveness of hybrid models in feature selection. The elements of a hybrid method include Neural Networks (NN), fuzzy systems, Genetic Algorithms (GA; [76,23]) and Ant Colony optimization [79]. Lee and Leu [34] examined the GA's ability in feature selection, and the abilities of fuzzy theories have been successfully applied by many researchers [12,72,10]. Tong and Schierz [69] used a genetic algorithm-neural network approach (GANN) as a wrapper model: the GA extracts a feature subset, the extracted subset is used to train the NN, and these steps are repeated until the best subset is found. Because of the high dimensionality of the data, the GA appears to be a proper strategy for feature selection.
1.2. Feature extraction methods

In the literature, there are two well-known methods for feature extraction: principal component analysis (PCA; [78]) and linear discriminant analysis (LDA; [48]). Both transform the original feature space into a lower-dimensional one. PCA transforms the original data into a reduced set of features that best approximate the original data. In the first step, PCA calculates the data covariance matrix and then finds its eigenvalues and eigenvectors. Finally, it performs a dimensionality reduction step in which only the terms corresponding to the K largest eigenvalues are kept. In contrast, LDA first calculates the scatter matrices: a within-class scatter matrix for each class and the between-class scatter matrix. The within-class scatter matrix measures the scatter of samples around their respective class means, and the between-class scatter matrix measures the scatter of class means around the mixture mean. LDA then transforms the data so as to maximize the between-class scatter and minimize the within-class scatter; the dimension is thus reduced while class separability is maximized.

The feature extraction/selection method is the first step in gene expression microarray classification and cancer detection. The second step consists of a classifier learning from the reduced features. In the literature, various classifiers have been investigated in order to find the best one.
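The PCA steps just described (covariance matrix, eigendecomposition, keeping the K largest eigenvalues) can be sketched as follows; this is an illustrative implementation, not the authors' code:

```python
import numpy as np

def pca_reduce(X, k):
    """Project X (samples x features) onto its k leading principal axes:
    covariance matrix -> eigendecomposition -> keep the K largest."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1][:k]   # indices of K largest eigenvalues
    return Xc @ eigvecs[:, order]           # reduced feature vectors

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 8))
Z = pca_reduce(X, k=3)
```

For microarray data with tens of thousands of genes, forming the full gene-by-gene covariance matrix is expensive; practical implementations usually obtain the same components from an SVD of the centered data matrix instead.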
Various types of NN [29,36,57,6,56,68,74,81,69,16], k-nearest neighbors [61,13], k-means algorithms [32], the fuzzy c-means algorithm [11], Bayesian networks [4], vector quantization based classifiers [59], manifold methods [18,80], fuzzy approaches [54,58,30,60], complementary learning fuzzy neural networks [64–67], ensemble learning [55,8,27,50], logistic regression, support vector machines [22,5,82,73,63,46,70], LSVM [44], wavelet transforms [28] and radial basis-support vector machines [51] have been investigated successfully in classification and cancer detection. But recently developed classifiers such as brain emotional learning (BEL) networks [42] have not been examined in this field. BEL networks use simulated emotions to aid their learning process and are motivated by neurophysiological knowledge of the human emotional brain. In contrast to the published models, the distinctive features of BEL are low computational complexity and fast training, which make it suitable for high-dimensional feature vector classification. In this paper, BEL is developed and examined for gene expression microarray classification tasks. It is expected that a model with low computational complexity can be more successful in meeting the challenges of high-dimensional microarray classification.

2. Proposed PCA–BEL for microarray data classification

Fig. 1 shows the general view of the proposed method, and the final proposed algorithm is presented in Fig. 2. What distinguishes the proposed framework from published diagnostic methods is the application of the BEL model to cancer classification. There are various versions of BEL, including the basic BEL [3], BELBIC (BEL based intelligent controller; [45]), BELPR (BEL based pattern recognizer; [39]), BELPIC (BEL based picture recognizer; [43]) and supervised BEL [38,40–42]. They are learning algorithms of emotional neural networks [42].
These models are inspired by the emotional brain, and the description of the relationships between the main components of the emotional brain is common to all of them. What differs from one model to another is how the reward signal is formulated in the learning process. For example, in the model presented by Balkenius and Morén [3], it is not clarified how the reward is assigned. In BELBIC, the reward signal is defined explicitly and the formulation of the other equations follows accordingly. The supervised BEL, however, employs the target value of the input pattern instead of a reward signal in the learning phase. Supervised BEL is therefore model-free and can be utilized in different applications; here, this version is developed for the gene expression microarray classification task. Generally, the computational complexity of BEL is very low [39–42]: it is O(n), which makes it suitable for high-dimensional feature vector classification.

Fig. 1. General view of proposed method.
BEL [42] is inspired by the interactions of the thalamus, amygdala (AMYG) [15,17,21,24,31,33,77], orbitofrontal cortex (OFC) and sensory cortex in the emotional brain [42]. The first step of the proposed method is PCA dimension reduction (Fig. 1). The first k principal components p1, p2, …, pk are the outputs of the first step and the inputs of the second step. In the second step, this pattern is normalized to [0, 1]. The normalized principal components p1, p2, …, pk are the outputs of the second step and the inputs of the third step. Fig. 2 illustrates the details of the proposed method. The input pattern of BEL is the vector p1, p2, …, pk and E is the final output. The model consists of two main subsystems, the AMYG and the OFC. The AMYG receives the inputs p1, p2, …, pk from the sensory cortex and pk+1 from the thalamus; the OFC receives the inputs p1, p2, …, pk from the sensory cortex only. The thalamic input pk+1 is calculated by the following formula:

pk+1 = max_{j=1..k}(pj)    (1)

The weight vk+1 belongs to the AMYG and the weight wk+1 to the OFC. Ea is the internal output of the AMYG, which is used to adjust the plastic connection weights v1, v2, …, vk+1 (Eq. (6)). Eo is the output of the OFC, which inhibits the AMYG output; this inhibition is implemented by subtracting Eo from Ea (Eq. (5)).

Fig. 2. The flowchart of the proposed method in the learning step.

As the corrected AMYG response, E is the final output
node. It is evaluated by the monotonically increasing activation function tansig and is used to adjust the OFC connection weights w1, w2, …, wk+1 (Eq. (7)). The activation function is as follows:

tansig(x) = 2 / (1 + e^(-2x)) - 1    (2)

The AMYG output, the OFC output and the final output are calculated by the following formulas, respectively:

Ea = Σ_{j=1..k+1} (vj × pj) + ba    (3)

Eo = Σ_{j=1..k} (wj × pj) + bo    (4)

E = tansig(Ea - Eo)    (5)

Let t be the binary-encoded target value associated with the pattern p. The supervised learning rules are as follows:

vj = vj + lr × max(t - Ea, 0) × pj    for j = 1…k+1    (6)

wj = wj + lr × (Ea - Eo - t) × pj    for j = 1…k+1    (7)

ba = ba + lr × max(t - Ea, 0)    (8)

bo = bo + lr × (Ea - Eo - t)    (9)

where lr is the learning rate, t is the binary target, t - Ea is the calculated error, ba is the bias of the AMYG neuron and bo is the bias of the OFC neuron. v1, v2, …, vk+1 are the AMYG learning weights and w1, w2, …, wk+1 are the OFC learning weights. Eqs. (3)–(9) define a multiple-input single-output model; in Figs. 2 and 3 the equations are extended to multiple-input multiple-output use.

The input training microarray data in Fig. 2 comprise the two matrices P and T. The size of P is m × s, where m is the number of patterns and s is the number of features in each pattern (s ≫ k). The size of T is m × c, where c is the number of classes. The targets are binary encoded, so each row of T contains a single "1" and "0" in the other columns. In the flowcharts, pi denotes the ith pattern and ti its related target. The learning rate lr can be adaptively adjusted to increase performance. The final flowchart, Fig.
2, shows this adaptation and the related parameters: the ratio by which the learning rate is increased (lr_inc), initialized to 1.05; the ratio by which it is decreased (lr_dec), with initial value 0.7; the maximum allowed performance increase (minc), with initial value 1.04; and the first performance (perf_f; step 4 of the flowchart) and last performance (perf_l), which can be calculated as the MSE. Initially lr = 0.001, and the learning weights are initialized randomly (step 3 of the flowchart). According to the algorithm, if (perf_l/perf_f) > minc then lr = lr × lr_dec; else if perf_l < perf_f then lr = lr × lr_inc. In Fig. 2, the stop criterion is reaching a predetermined learning epoch, i.e. the maximum number of epochs (for example 10,000 epochs).

Fig. 2 presents the learning step and Fig. 3 shows the flowchart of the testing step. The inputs of the algorithm in Fig. 3 are a testing pattern, the number of classes and the weights adjusted in the learning step. The last step of the algorithm performs the diagnosis: the index of the maximum E gives the class number of the pattern.

3. Experimental studies

The source code of the proposed method is accessible from http://www.bitools.ir/tprojects.html, and the method is evaluated on the gene expression microarray data of the small round blue cell tumors (SRBCTs), high grade gliomas (HGG), lung, colon and breast cancer datasets. The SRBCTs dataset is a 4-class complementary DNA (cDNA) microarray dataset containing 2308 genes and 83 samples: 29 samples of Ewing's sarcoma (EWS), 25 of rhabdomyosarcoma (RMS), 18 of neuroblastoma (NB) and 11 of Burkitt lymphoma (BL). This dataset can be obtained from http://research.nghri.nih.gov/microarray/Supplement/.
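The single-output learning unit of Eqs. (1)–(9), together with the learning-rate adaptation rule above, can be sketched as follows. This is an illustrative reading, not the authors' released code; note that the printed update rule (7) runs to k+1 while Eq. (4) sums only k OFC inputs, so this sketch keeps the OFC at k weights:

```python
import numpy as np

def tansig(x):
    """Eq. (2): tansig(x) = 2 / (1 + exp(-2x)) - 1 (equivalent to tanh)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

class BELUnit:
    """Single-output brain emotional learning unit (Eqs. (1)-(9))."""

    def __init__(self, k, lr=0.001, seed=0):
        rng = np.random.default_rng(seed)
        self.v = rng.normal(scale=0.01, size=k + 1)  # AMYG weights
        self.w = rng.normal(scale=0.01, size=k)      # OFC weights
        self.ba = 0.0                                # AMYG bias
        self.bo = 0.0                                # OFC bias
        self.lr = lr

    def forward(self, p):
        pa = np.append(p, p.max())        # Eq. (1): thalamic input p_{k+1}
        Ea = self.v @ pa + self.ba        # Eq. (3): AMYG output
        Eo = self.w @ p + self.bo         # Eq. (4): OFC output
        return Ea, Eo, tansig(Ea - Eo)    # Eq. (5): final output E

    def update(self, p, t):
        pa = np.append(p, p.max())
        Ea, Eo, _ = self.forward(p)
        self.v += self.lr * max(t - Ea, 0.0) * pa    # Eq. (6)
        self.w += self.lr * (Ea - Eo - t) * p        # Eq. (7)
        self.ba += self.lr * max(t - Ea, 0.0)        # Eq. (8)
        self.bo += self.lr * (Ea - Eo - t)           # Eq. (9)

def adapt_lr(lr, perf_first, perf_last,
             lr_inc=1.05, lr_dec=0.7, minc=1.04):
    """Flowchart rule: shrink lr if the MSE grew by more than minc,
    grow it if the MSE improved, otherwise leave it unchanged."""
    if perf_last / perf_first > minc:
        return lr * lr_dec
    if perf_last < perf_first:
        return lr * lr_inc
    return lr

# Drive one unit toward target t = 1 on a fixed normalized pattern.
unit = BELUnit(k=3, lr=0.1)
p = np.array([0.2, 0.5, 0.9])
for _ in range(200):
    unit.update(p, t=1.0)
E_final = unit.forward(p)[2]   # approaches tansig(1) as Ea -> 1, Eo -> 0
```

At the fixed point of rules (6)–(9), Ea reaches the target t and Eo is driven to Ea - t, so the corrected output settles at tansig(t); a multi-class classifier stacks one such unit per class and picks the class with the largest E, as in Fig. 3.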
In the proposed algorithm, the maximum learning epoch is 10,000, k = 100 and the initial lr is set to 0.001, 0.000001 and 0.001 for the SRBCT, HGG and lung cancer datasets respectively. These parameters were picked empirically; k = 100 with these lr values gives better results on these datasets, but in other applications the parameters should be optimized.

The HGG dataset applied here consists of 50 samples with 12,625 genes each: 14 classic glioblastomas, 14 non-classic glioblastomas, 7 classic anaplastic oligodendrogliomas and 15 non-classic anaplastic oligodendrogliomas. The HGG dataset is accessible from http://www.broadinstitute.org. In this dataset the number of patterns is much smaller than the number of features per sample, which can make the data difficult to classify. In the lung cancer dataset there are 181 tissue samples in two classes: 31 malignant pleural mesothelioma and 150 adenocarcinoma, each sample described by 12,533 genes. This dataset is accessible from http://datam.i2r.a-star.edu.sg/datasets/. The other datasets applied here are the colon and breast cancer datasets, accessible from http://genomics-pubs.princeton.edu/oncology/affydata/index.html and http://datam.i2r.a-star.edu.sg/datasets/krbd/BreastCancer/BreastCancer.html, respectively. The colon dataset includes 62 tissue samples with 2000 genes, and the breast cancer dataset consists of 97 samples and 24,481 genes.

Before entering the comparative numerical studies, let us analyze the computational complexity of the proposed BEL. In the learning step, the algorithm adjusts O(2n) weights for each pattern-target sample, where n is the number of input attributes (for example, n = 12,625 for the HGG database). Let us compare this with traditional neural networks and with a supervised orthogonal discriminant projection classifier (SODP; [80]) applied in cancer detection.
As mentioned above, the computational complexity of the proposed classifier is O(n). In contrast, the computational time is O(cn) for a neural network and O(n²) for SODP. In the NN architecture, c is the number of hidden neurons (generally c = 10), and SODP uses a Lagrangian multiplier that imposes O(n²) complexity. The proposed method therefore has lower computational complexity. This improved computing efficiency can be important for high-dimensional feature vector classification and cancer detection; the key to the proposed method is the fast processing that results from its low computational complexity.

Another important point observed across the experimental implementations is that the results of the proposed model can change with the initial lr and k values, where lr is the learning rate and k is the number of initial PCA components in the algorithm. In other words, lr and k should be optimized for each problem. Here, the optimum lr values of 0.001, 0.000001, 0.001, 0.00001 and 0.000001 are assigned to SRBCT, HGG, lung, colon and breast cancer respectively, with k = 100 in all cases. The lr values were chosen from 0.1, 0.001, 0.0001, 0.00001, …, 0.0000000001 and k from the values 10, 50 and 100 through implementation and observation.

The proposed method is compared with the results reported by Zhang and Zhang [80], which are based on the 5-fold cross validation method. This implementation allows the assessment of accuracy and repeatability, and it can be used to validate the proposed method [46]. The compared methods include supervised locally linear embedding (SLLE), probability-based locally linear embedding (PLLE), locally linear discriminant embedding (LLDE), constrained maximum variance mapping (CMVU), orthogonal
discriminant projection (ODP) and supervised orthogonal discriminant projection (SODP). These methods are extended manifold approaches that have been successfully used in tumor classification: SLLE, PLLE and LLDE are extended versions of locally linear embedding (LLE), a classical manifold method; SODP is an extended version of ODP; and CMVU is a linear approximation of a multi-manifold learning method. Figs. 4–8 show the comparative results based on the average accuracy of 5-fold cross validation. As illustrated in the figures, the proposed model shows consistent results and provides higher performance on SRBCT, HGG and lung cancer (Figs. 4–6). Table 1 presents the percentage improvement of PCA–BEL with respect to the best compared method reported by [80]. The best method in SRBCT and HGG detection is the supervised orthogonal discriminant projection (SODP) algorithm, with 96.56% and 73.74% average accuracy, while in lung cancer classification the best method is locally linear discriminant embedding (LLDE), with 93.18% average accuracy. The proposed method improves on these results.

Fig. 3. The testing step of the proposed method for class diagnosis of an input tissue.
Fig. 4. Accuracy comparison between various methods and the proposed PCA–BEL on the SRBCTs classification problem.
Fig. 5. Accuracy comparison between various methods and the proposed PCA–BEL on the HGG classification problem.
Fig. 6. Accuracy comparison between various methods and the proposed PCA–BEL on the lung cancer classification problem.
Fig. 7. Accuracy comparison between various methods reported by Zhang and Zhang [80] and the proposed PCA–BEL on the colon classification problem.
It seems that SRBCT and lung cancer are rather simple challenges for the classifiers in terms of complexity, since the best compared classifiers, i.e. SODP and LLDE (Table 1 and Figs. 4 and 6), achieve detection accuracies of 96.56% and 93.18%. The proposed model improves these numbers by 3.56% and 5.52%, raising the accuracy to 100% and 98.32% for SRBCT and lung cancer, respectively (Table 1). The detection accuracy of the proposed model is most significant for HGG. This dataset seems to be too complex for the other classifiers, because the best detection accuracy achieved for HGG is 73.74%, using the SODP method (Table 1 and Fig. 5). The proposed PCA–BEL achieves a 30.18% improvement, resulting in a 96% accuracy rate. However, the results of PCA–BEL on the colon and breast cancer datasets, 87.40% and 88% accuracy, do not show any significant improvement over the existing methods (Figs. 7 and 8).

The percentage improvement of the proposed PCA–BEL is summarized in Table 1 and calculated by the following formula:

Percentage improvement = 100 × (proposed method result - compared result) / (compared result)    (10)

As illustrated in Table 1, the average accuracies of SRBCT, HGG and lung cancer classification obtained by the proposed PCA–BEL are 100%, 96% and 98.32% respectively. Table 2 shows the statistical details of the improved results; the confidence level (ConfiLevel) in Table 2 is based on the Student's t-test with 95% confidence. Finally, Fig. 9 shows the averaged confusion matrices, including accuracy, precision and recall, of the improved results obtained by PCA–BEL over the 5 folds. In Fig. 9a, class numbers 1, 2, 3 and 4 belong to EWS, RMS, BL and NB respectively. In the experimental results, 10,000 cycles is considered the maximum number of learning cycles in every run; however, this parameter can change for different problems.
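Eq. (10) can be checked directly against the accuracies reported in Table 1; a small sketch of our own:

```python
def pct_improvement(proposed, compared):
    """Eq. (10): percentage improvement of the proposed method
    over the best compared method."""
    return 100.0 * (proposed - compared) / compared

# Table 1 accuracies: (PCA-BEL, best compared) for SRBCT, HGG, lung cancer.
gains = [pct_improvement(100.0, 96.56),   # SRBCT vs SODP
         pct_improvement(96.0, 73.74),    # HGG vs SODP
         pct_improvement(98.32, 93.18)]   # lung cancer vs LLDE
```

Evaluating these reproduces the 3.56%, 30.18% and 5.52% improvements quoted above (to rounding).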
The model needs a maximum of 220 cycles to reach convergence and 100% accuracy, while more than 8000, or even the full 10,000, cycles are needed for convergence in some folds of the HGG and lung cancer datasets. This parameter should preferably be given its maximum value; considering the low computational complexity of the method, increasing the number of learning cycles even to 100,000 still results in an acceptable computation time on modern computers.

4. Conclusions

In this paper, a novel gene-expression microarray classification method is proposed based on PCA and the BEL network. In contrast to many other classifiers, the proposed method has lower computational complexity. Thus BEL can be considered an alternative approach to overcoming the curse of dimensionality problem.

Table 1
Percentage improvement in the classification of the small round blue cell tumors (SRBCT), high grade gliomas (HGG) and lung cancer, obtained by the proposed method. The compared methods are the supervised orthogonal discriminant projection classifier (SODP) and locally linear discriminant embedding (LLDE), which are the best of the compared methods (Figs. 4–6).

Problem                                     SRBCT     HGG       Lung cancer
Compared method                             SODP      SODP      LLDE
Detection accuracy of compared method       96.56%    73.74%    93.18%
Detection accuracy of our PCA–BEL method    100%      96%       98.32%
Percentage improvement                      3.56%     30.18%    5.52%

Table 2
Statistical results of the proposed PCA–BEL on the three improved problems: the small round blue cell tumors (SRBCT), high grade gliomas (HGG) and lung cancer datasets. The rows F#1–F#5 show the detection accuracy of the folds, and the remaining rows present the maximum, mean, standard deviation (STD) and the confidence level (ConfiLevel) based on the Student's t-test with 95% confidence.
Fold number    SRBCT (%)   HGG (%)   Lung cancer (%)
F#1            100.00      100.00    100.00
F#2            100.00       80.00     94.40
F#3            100.00      100.00     97.22
F#4            100.00      100.00    100.00
F#5            100.00      100.00    100.00
Max            100.00      100.00    100.00
Average        100.00       96.00     98.32
STD              0.00        8.94      2.50
ConfiLevel       0.00       11.10      3.10

Fig. 8. The accuracy comparison between various methods reported by Zhang and Zhang [80] and the proposed PCA–BEL in the breast cancer classification problem.

Fig. 9. The averaged confusion matrices of the improved problems: (a) SRBCT, (b) HGG and (c) lung cancer datasets.

E. Lotfi, A. Keshavarz / Computers in Biology and Medicine 54 (2014) 180–187 185
problem. The proposed model is accessible from http://www.bitools.ir/projects.html and is utilized for the classification tasks of the SRBCT, HGG, lung, colon and breast cancer datasets. According to the experimental results, the proposed method is more accurate than traditional methods on the SRBCT, HGG and lung datasets: PCA–BEL improves the detection accuracy by about 3.56%, 30.18% and 5.52% for SRBCT, HGG and lung cancer, respectively. The results indicate the superiority of the approach in terms of higher accuracy and lower computational complexity. Hence, it is expected that the proposed approach is generally applicable to high dimensional feature vector classification problems. However, the proposed approach has a drawback. Like many other methods that use PCA, it does not extract only the informative genes. As mentioned in Section 1, PCA is a feature extraction method and cannot select features. For future improvements, the informative genes should be determined; to do so, the proposed method should include a feature selection step. This issue can be considered as the next step of this research effort, i.e. a proper feature selection method should be found to replace the PCA step of the proposed method. Furthermore, in order for the proposed method to provide a proper response in other cancer classification problems, the lr and k parameters should be optimized specifically for each problem. This issue can also be considered in future work on other datasets such as prostate cancer.

Conflict of interest

There is no conflict of interest.

References

[1] M. Alshalalfa, G. Naji, A. Qabaja, R. Alhajj, Combining multiple perspective as intelligent agents into robust approach for biomarker detection in gene expression data, Int. J. Data Min. Bioinform. 5 (3) (2011) 332–350. [2] P. Baldi, A.D.
Long, A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes, Bioinformatics 17 (6) (2001) 509–519. [3] C. Balkenius, J. Morén, Emotional learning: a computational model of amygdala, Cybern. Syst. 32 (6) (2001) 611–636. [4] R. Cai, Z. Zhang, Z. Hao, Causal gene identification using combinatorial V-structure search, Neural Netw. 43 (2013) 63–71. [5] A.H. Chen, C.H. Lin, A novel support vector sampling technique to improve classification accuracy and to identify key genes of leukaemia and prostate cancers, Expert Syst. Appl. 38 (4) (2011) 3209–3219. [6] J.H. Chiang, S.H. Ho, A combination of rough-based feature selection and RBF neural network for classification using gene expression data, NanoBiosci. IEEE Trans. 7 (1) (2008) 91–99. [7] W.K. Ching, L. Li, N.K. Tsing, C.W. Tai, T.W. Ng, A. Wong, K.W. Cheng, A weighted local least squares imputation method for missing value estimation in microarray gene expression data, Int. J. Data Min. Bioinform. 4 (3) (2010) 331–347. [8] D. Chung, H. Kim, Robust classification ensemble method for microarray data, Int. J. Data Min. Bioinform. 5 (5) (2011) 504–518. [9] Y.R. Cho, A. Zhang, X. Xu, Semantic similarity based feature extraction from microarray expression data, Int. J. Data Min. Bioinform. 3 (3) (2009) 333–345. [10] J. Dai, Q. Xu, Attribute selection based on information gain ratio in fuzzy rough set theory with application to tumor classification, Appl. Soft Comput. 13 (1) (2013) 211–221. [11] D. Dembele, P. Kastner, Fuzzy C-means method for clustering microarray data, Bioinformatics 19 (8) (2003) 973–980. [12] Z. Deng, K.S. Choi, F.L. Chung, S. Wang, EEW-SC: Enhanced Entropy-Weighting Subspace Clustering for high dimensional gene expression data clustering analysis, Appl. Soft Comput. 11 (8) (2011) 4798–4806. [13] M. Dhawan, S. Selvaraja, Z.H. Duan, Application of committee kNN classifiers for gene expression profile classification, Int. J.
Bioinform. Res. Appl. 6 (4) (2010) 344–352. [14] C. Ding, H. Peng, Minimum redundancy feature selection from microarray gene expression data, J. Bioinform. Comput. Biol. 3 (02) (2005) 185–205. [15] J.P. Fadok, M. Darvas, T.M. Dickerson, R.D. Palmiter, Long-term memory for pavlovian fear conditioning requires dopamine in the nucleus accumbens and basolateral amygdala, PloS One 5 (9) (2010) e12751. [16] F. Fernández-Navarro, C. Hervás-Martínez, R. Ruiz, J.C. Riquelme, Evolutionary generalized radial basis function neural networks for improving prediction accuracy in gene classification using feature selection, Appl. Soft Comput. 12 (6) (2012) 1787–1800. [17] R. Gallassi, L. Sambati, R. Poda, M.S. Maserati, F. Oppi, M. Giulioni, P. Tinuper, Accelerated long-term forgetting in temporal lobe epilepsy: evidence of improvement after left temporal pole lobectomy, Epilepsy Behav. 22 (4) (2011) 793–795. [18] J.M. García-Gómez, J. Gómez-Sanchs, P. Escandell-Montero, E. Fuster-Garcia, E. Soria-Olivas, Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors, Comput. Biol. Med. 43 (11) (2013) 1863–1869. [19] S. Ghorai, A. Mukherjee, P.K. Dutta, Gene expression data classification by VVRKFA, Procedia Technol. 4 (2012) 330–335. [20] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov, E.S. Lander, Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science 286 (5439) (1999) 531–537. [21] E.M. Griggs, E.J. Young, G. Rumbaugh, C.A. Miller, MicroRNA-182 regulates amygdala-dependent memory formation, J. Neurosci. 33 (4) (2013) 1734–1740. [22] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification using support vector machines, Mach. Learn. 46 (1) (2002) 389–422. [23] C. Gillies, N. Patel, J. Akervall, G. Wilson, Gene expression classification using binary rule majority voting genetic programming classifier, Int. J. 
Adv. Intell. Paradig. 4 (3) (2012) 241–255. [24] O. Hardt, K. Nader, L. Nadel, Decay happens: the role of active forgetting in memory, Trends Cogn. Sci. 17 (3) (2013) 111–120. [25] H. Hong, Q. Hong, J. Liu, W. Tong, L. Shi, Estimating relative noise to signal in DNA microarray data, Int. J. Bioinform. Res. Appl. 9 (5) (2013) 433–448. [26] D.S. Huang, C.H. Zheng, Independent component analysis-based penalized discriminant method for tumor classification using gene expression data, Bioinformatics 22 (15) (2006) 1855–1862. [27] N. Iam-On, T. Boongoen, S. Garrett, C. Price, New cluster ensemble approach to integrative biological data analysis, Int. J. Data Min. Bioinform. 8 (2) (2013) 150–168. [28] A. Jose, D. Mugler, Z.H. Duan, A gene selection method for classifying cancer samples using 1D discrete wavelet transform, Int. J. Comput. Biol. Drug Des. 2 (4) (2009) 398–411. [29] J. Khan, J.S. Wei, M. Ringner, L.H. Saal, M. Ladanyi, F. Westermann, P.S. Meltzer, Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, Nat. Med. 7 (6) (2001) 673–679. [30] M. Khashei, A. Zeinal Hamadani, M. Bijari, A fuzzy intelligent approach to the classification problem in gene expression data analysis, Knowl.-Based Syst. 27 (2012) 465–474. [31] J.H. Kim, S. Li, A.S. Hamlin, G.P. McNally, R. Richardson, Phosphorylation of mitogen-activated protein kinase in the medial prefrontal cortex and the amygdala following memory retrieval or forgetting in developing rats, Neurobiol. Learn. Mem. 97 (1) (2011) 59–68. [32] Y.K. Lam, P.W. Tsang, eXploratory K-Means: a new simple and efficient algorithm for gene clustering, Appl. Soft Comput. 12 (3) (2012) 1149–1157. [33] R. Lamprecht, S. Hazvi, Y. Dudai, cAMP response element-binding protein in the amygdala is required for long- but not short-term conditioned taste aversion memory, J. Neurosci. 17 (21) (1997) 8443–8450. [34] C.P. Lee, Y.
Leu, A novel hybrid feature selection method for microarray data analysis, Appl. Soft Comput. 11 (1) (2011) 208–213. [35] H. Liu, J. Li, L. Wong, A comparative study on feature selection and classification methods using gene expression profiles and proteomic patterns, Genome Inform. Ser. 13 (2002) 51–60. [36] B. Liu, Q. Cui, T. Jiang, S. Ma, A combinational feature selection and ensemble neural network method for classification of gene expression data, BMC Bioinform. 5 (1) (2004) 136. [37] Y. Liu, Wavelet feature extraction for high-dimensional microarray data, Neurocomputing 72 (4) (2009) 985–990. [38] E. Lotfi, M.R. Akbarzadeh-T, Supervised brain emotional learning, IEEE International Joint Conference on Neural Networks (IJCNN), 2012, pp. 1–6, http://dx.doi.org/10.1109/IJCNN.2012.6252391. [39] E. Lotfi, M.R. Akbarzadeh-T, Brain Emotional Learning-Based Pattern Recognizer, Cybern. Syst. 44 (5) (2013) 402–421. [40] E. Lotfi, M.R. Akbarzadeh-T, Emotional brain-inspired adaptive fuzzy decayed learning for online prediction problems, in: 2013 IEEE International Conference on Fuzzy Systems (FUZZ), IEEE, 2013, July, pp. 1–7. [41] E. Lotfi, M.R. Akbarzadeh-T, Adaptive brain emotional decayed learning for online prediction of geomagnetic activity indices, Neurocomputing 126 (2014) 188–196. [42] E. Lotfi, M.R. Akbarzadeh-T, Practical emotional neural networks, Neural Networks 59 (2014) 61–72. http://dx.doi.org/10.1016/j.neunet.2014.06.012. [43] E. Lotfi, S. Setayeshi, S. Taimory, A neural basis computational model of emotional brain for online visual object recognition, Appl. Artif. Intell. 28 (2014) 1–21. http://dx.doi.org/10.1080/08839514.2014.952924. [44] Z. Liu, D. Chen, Y. Xu, J. Liu, Logistic support vector machines and their application to gene expression data, Int. J. Bioinform. Res. Appl. 1 (2) (2005) 169–182. [45] C. Lucas, D. Shahmirzadi, N. Sheikholeslami, Introducing BELBIC: brain emotional learning based intelligent controller, Int. J. Intell.
Autom. Soft Comput. 10 (2004) 11–21. [46] M. Meselhy Eltoukhy, I. Faye, B. Belhaouari Samir, A statistical based feature extraction method for breast cancer diagnosis in digital mammogram using multiresolution representation, Comput. Biol. Med. 42 (1) (2012) 123–128.
[47] V.S. Tseng, H.H. Yu, Microarray data classification by multi-information based gene scoring integrated with Gene Ontology, Int. J. Data Min. Bioinform. 5 (4) (2011) 402–416. [48] M. Xiong, L. Jin, W. Li, E. Boerwinkle, Computational methods for gene expression-based tumor classification, Biotechniques 29 (6) (2000) 1264–1271. [50] Reboiro-Jato Miguel, Glez-Peña Daniel, Díaz Fernando, Fdez-Riverola Florentino, A novel ensemble approach for multicategory classification of DNA microarray data using biological relevant gene sets, Int. J. Data Min. Bioinform. 6 (6) (2012) 602–616. [51] L. Nanni, A. Lumini, Ensemblator: an ensemble of classifiers for reliable classification of biological data, Pattern Recognit. Lett. 28 (5) (2007) 622–630. [53] T. Prasartvit, A. Banharnsakun, B. Kaewkamnerdpong, T. Achalakul, Reducing bioinformatics data dimension with ABC-kNN, Neurocomputing 116 (2013) 367–381. http://dx.doi.org/10.1016/j.neucom.2012.01.045. [54] M. Perez, D.M. Rubin, L.E. Scott, T. Marwala, W. Stevens, A hybrid fuzzy-svm classifier, applied to gene expression profiling for automated leukaemia diagnosis, in: IEEE 25th Convention of Electrical and Electronics Engineers in Israel, 2008, IEEEI 2008, IEEE, 2008, December, pp. 041–045. [55] Y. Peng, A novel ensemble machine learning for robust microarray data classification, Comput. Biol. Med. 36 (6) (2006) 553–573. [56] L.P. Petalidis, A. Oulas, M. Backlund, M.T. Wayland, L. Liu, K. Plant, V.P. Collins, Improved grading and survival prediction of human astrocytic brain tumors by artificial neural network analysis of gene expression microarray data, Mol. Cancer Ther. 7 (5) (2008) 1013–1024. [57] L.E. Peterson, M. Ozen, H. Erdem, A. Amini, L. Gomez, C.C. Nelson, M.
Ittmann, Artificial neural network analysis of DNA microarray-based prostate cancer recurrence, in: Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2005, CIBCB'05, IEEE, 2005, November, pp. 1–8. [58] L.E. Peterson, M.A. Coleman, Machine learning-based receiver operating characteristic (ROC) curves for crisp and fuzzy classification of DNA microarrays in cancer research, Int. J. Approx. Reason. 47 (1) (2008) 17–36. [59] I. Porto-Díaz, V. Bolón-Canedo, A. Alonso-Betanzos, O. Fontenla-Romero, A study of performance on microarray data sets for a classifier based on information theoretic learning, Neural Netw. 24 (8) (2011) 888–896. [60] S. Saha, A. Ekbal, K. Gupta, S. Bandyopadhyay, Gene expression data clustering using a multiobjective symmetry based clustering technique, Comput. Biol. Med. 43 (11) (2013) 1965–1977. [61] A. Statnikov, C.F. Aliferis, I. Tsamardinos, D. Hardin, S. Levy, A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, Bioinformatics 21 (5) (2005) 631–643. [62] X. Sun, Y. Liu, M. Xu, H. Chen, J. Han, K. Wang, Feature selection using dynamic weights for classification, Knowl.-Based Syst. 37 (2013) 541–549. http://dx.doi.org/10.1016/j.knosys.2012.10.001. [63] M. Song, S. Rajasekaran, A greedy algorithm for gene selection based on SVM and correlation, Int. J. Bioinform. Res. Appl. 6 (3) (2010) 296–307. [64] T.Z. Tan, C. Quek, G.S. Ng, Ovarian cancer diagnosis by hippocampus and neocortex-inspired learning memory structures, Neural Netw. 18 (5) (2005) 818–825. [65] T.Z. Tan, C. Quek, G.S. Ng, E.Y.K. Ng, A novel cognitive interpretation of breast cancer thermography with complementary learning fuzzy neural memory structure, Expert Syst. Appl. 33 (3) (2007) 652–666. [66] T.Z. Tan, G.S. Ng, C.
Quek, Complementary learning fuzzy neural network: an approach to imbalanced dataset, in: International Joint Conference on Neural Networks, 2007, IJCNN 2007, IEEE, 2007, pp. 2306–2311. [67] T.Z. Tan, C. Quek, G.S. Ng, K. Razvi, Ovarian cancer diagnosis with complementary learning fuzzy neural network, Artif. Intell. Med. 43 (3) (2008) 207–222. [68] M. Takahashi, H. Hayashi, Y. Watanabe, K. Sawamura, N. Fukui, J. Watanabe, T. Someya, Diagnostic classification of schizophrenia by neural network analysis of blood-based gene expression signatures, Schizophr. Res. 119 (1) (2010) 210–218. [69] D.L. Tong, A.C. Schierz, Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data, Artif. Intell. Med. 53 (1) (2011) 47–56. [70] M. Tong, K.H. Liu, C. Xu, W. Ju, An ensemble of SVM classifiers based on gene pairs, Comput. Biol. Med. 43 (6) (2013) 729–737. [71] M.H. Tseng, H.C. Liao, The genetic algorithm for breast tumor diagnosis – the case of DNA viruses, Appl. Soft Comput. 9 (2) (2009) 703–710. [72] P. Vadakkepat, L.A. Poh, Fuzzy-rough discriminative feature selection and classification algorithm, with application to microarray and image datasets, Appl. Soft Comput. 11 (4) (2011) 3429–3440. [73] V. Vinaya, N. Bulsara, C.J. Gadgil, M. Gadgil, Comparison of feature selection and classification combinations for cancer classification using microarray data, Int. J. Bioinform. Res. Appl. 5 (4) (2009) 417–431. [74] S.L. Wang, X. Li, S. Zhang, J. Gui, D.S. Huang, Tumor classification by combining PNN classifier ensemble with neighborhood rough set based gene reduction, Comput. Biol. Med. 40 (2) (2010) 179–189. [75] Y.F. Wang, Z.G. Yu, V. Anh, Fuzzy C-means method with empirical mode decomposition for clustering microarray data, Int. J. Data Min. Bioinform. 7 (2) (2013) 103–117. [76] A. Yardimci, Soft computing in medicine, Appl. Soft Comput. 9 (3) (2009) 1029–1043. [77] S.H. Yeh, C.H. Lin, P.W.
Gean, Acetylation of nuclear factor-κB in rat amygdala improves long-term but not short-term retention of fear memory, Mol. Pharmacol. 65 (5) (2004) 1286–1292. [78] K.Y. Yeung, W.L. Ruzzo, Principal component analysis for clustering gene expression data, Bioinformatics 17 (9) (2001) 763–774. [79] Y. Zhang, J. Xuan, R. Clarke, H.W. Ressom, Module-based breast cancer classification, Int. J. Data Min. Bioinform. 7 (3) (2013) 284–302. [80] C. Zhang, S. Zhang, A supervised orthogonal discriminant projection for tumor classification using gene expression data, Comput. Biol. Med. 43 (5) (2013) 568–575. http://dx.doi.org/10.1016/j.compbiomed.2013.01.019. [81] Z. Zainuddin, P. Ong, Reliable multiclass cancer classification of microarray gene expression profiles using an improved wavelet neural network, Expert Syst. Appl. 38 (11) (2011) 13711–13722. [82] X.L. Xia, K. Li, G.W. Irwin, Two-stage gene selection for support vector machine classification of microarray data, Int. J. Model. Identif. Control 8 (2) (2009) 164–171. [83] M. Xiong, X. Fang, J. Zhao, Biomarker identification by feature wrappers, Genome Res. 11 (11) (2001) 1878–1887. [84] H. Xiong, S. Shekhar, P.N. Tan, V. Kumar, Exploiting a support-based upper bound of Pearson's correlation coefficient for efficiently identifying strongly correlated pairs, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2004, August, pp. 334–343.