Novel algorithms for Knowledge discovery from neural networks in Classification problems

Dr. M.Gethsiyal Augasta
Assistant Professor
Kamaraj College
Thoothukudi – 628 003
Presented on : 17-06-2013
Novel algorithms for
Knowledge discovery fromneural networks in Classification problems

Outline
Introduction
A New Mean wise Discretization and Pattern Selection Method
for Classification
A New Discretization Algorithm based on Range Coefficient of
Dispersion and Skewness for neural networks classifier
An Algorithm for Pruning Irrelevant Hidden Neurons of
feedforward Neural Network (PINS)
A Novel Pruning Algorithm (N2PS) for Optimizing Feedforward
Neural Networks
Reverse Engineering the Neural Networks for Rule Extraction
Conclusion

Overview
Classification is one of the data mining problems receiving
great attention recently in the data base community. This research has
focused on proposing novel algorithms for improving the performance
of feedforward neural networks on classification problems. All the
algorithms are proposed in three phases using three approaches
such as Preprocessing the data, Pruning & retraining and Rule
discovery & extraction.
Phase I : Two discretization algorithms namely MDC+PS
and DRDS have been proposed for preprocessing the
data.

Overview (Contd.,)
Phase II : Two pruning algorithms namely PIHNS and N2PS have
been proposed for optimizing the architecture of
neural network.
Phase III : A rule extraction algorithm RxREN has been proposed
for extracting classification rules of large datasets from
the trained neural network.
The efficiency of the proposed methods have been proved by
implementing them on various real datasets.

Implementation
•The proposed algorithms are implemented in JDK1.5.
•All experiments were run on a PC with Windows XP operating
system, Pentium IV 1.8GHz CPU and 504MB SDRAM memory.
The datasets used to test the algorithm are,
•The training and testing examples are selected based on 10-fold
cross validation method or Random selection method.
Properties
Datasets
iris iono hea pid wav breastw Creditg hepatitis
# of classes
# of examples
# of training
examples
# of testing
examples
# of attributes
# of continuous
attributes
3
150
75
75
4
4
2
351
176
175
34
34
2
270
135
135
13
13
2
768
384
384
8
8
3
5000
2501
2499
40
40
2
699
350
349
9
9
2
1000
550
450
20
7
2
155
81
74
19
6

Preprocessing the Data -
DiscretizationDiscretization
It transforms continuous attributes values into a finite number of
intervals and associates with each interval a numerical discrete value.
Why Essential? i. some learning methods do not handle continuous
attributes ii. the data transformed in a set of intervals are more
cognitively relevant for a human interpretation .
Main Goals of Discretization Methods
1. Generating high quality discretization scheme with least number of
intervals without any user supervision.
2. The generated discretization scheme should lead to the improvement
of accuracy and efficiency of learning algorithm.
3. the discretization process should be as fast as possible.

A New Discretization
and Pattern Selection Method
For Classification in Data Mining
Using Feedforward Neural Networks
Published in :
International Journal of Advanced Research in
Computer Science, 2 (1), Jan. –Feb, 2011, 615-
620. ISSN No. 0976-5697

Phases of Proposed Method – MDC+PS
This work consists of two phases,
 In the first phase, a new supervised mean wise (MDC)
discretization method is proposed to automatically discretize the
continuous attributes of large datasets into discrete intervals by the
computed mean value . It is aimed at reducing the discretization time
and number of intervals.
 In the second phase a novel pattern selection mechanism (PS)
is proposed to select the most informative training patterns based on
pattern disparity in advance of the training phase from the patterns
discretized in the first phase.

MDC Discretization Algorithm
Input : Consider a dataset with N continuous Attributes, M Patterns
and S target classes.
Begin
1. For each continuous Attribute
1.1 Initialize the first interval as d0 i.e., values < min1.
1.2 Let the dynamic value t as min1.
1.3 For each target class k.
1.3.1 Find the maximum value maxk, minimum
value mink and the mean value Ek.
1.3.2 Assign maxk−1 as t for all k classes where
k > 1 and maxk−1 > mink

MDC Discretization Algorithm
1.3.3 Compute the best interval length using lk = |Ek − t|
1.3.4 Compute the number of intervals using
n = ( maxk − t ) / ( lk )
1.3.5 Generate n number of intervals {dki/1≤i≤n}.
1.4 Include additional intervals if mink > maxk−1 to cover all
possible values of a continuous attribute for each class k.
1.5 Set the final interval as dm i.e., values > upper bound value of
the last interval.
2. The Discretization Scheme (D) for S classes would be
D = {d0, dk1, dk2, dk3, ..., dki, ..., dkn, dm}
Output: The Discretization Scheme D.

Pattern Selection Method (PS)
Pattern selection
 It is an active learning strategy to select the most informative
patterns for training the network.
 It obtains a good training set to increase the performance of
a neural network in terms of convergence speed and generalization.
Proposed Pattern Selection (PS) method :
A data which was discretized into many intervals by MDC is
converted into binary code using the Thermometer coding scheme
[27]. PS selects all distinct patterns based on pattern disparity for
training the feedforward neural network.

Steps of proposed pattern selection method
1. Let P be the set of discretized patterns, A be the number of
attributes i and S be the number of target classes k;
2. Compute threshold value η such as
If ( A / S ) > S then η= A / S else η= S;
3. Select a pattern pik from P randomly ; R=R+{pik}; P=P-{pik};
3.1. For each pattern pjk ,j ≠ i of P
3.1.1. Compare pik and pjk and find number of
differed bits e;
3.1.2. If e<= η then T=T+{pjk}; P=P-{pjk};
3.2. end
4. end

Experimental Results
The data are classified with feedforward neural network using
backpropagation algorithm.

Results Comparisons
The comparisons of six datasets results with other six discretization
schemes are shown below.
Table shows that the generated number of intervals of MDC is
comparable with all other discretization algorithms except CAIM.
Also the discretization time of MDC is smaller than all other methods
for all datasets.

Results Comparisons – Contd.
Here the MDC+PS always achieve the highest classification accuracy for
all datasets than Equal-w and CAIM discretization method.

MDC+PS– Summary
 MDC generates the smallest number of intervals that assumes
low computational cost and smaller discretization time
 PS method selects the most informative training patterns that
leads to the improvement in the classification performance of
neural networks.
 Simulation results show that MDC+PS achieves significant
improvement in classification accuracy in minimum training time
for maximum datasets among other six discretization algorithms.
 The main drawback of the proposed Meanwise Discretization
method (MDC) is that it has to be combined with the proposed PS
to achieve the best classification performance.
The MDC algorithm is very effective and easy to use supervised
discretization algorithm for any classifier if its training data has
been selected using the proposed pattern selection (PS) method.

A new Discretization algorithm
based on Range coefficient of
Dispersion and Skewness for
neural networks classifier
Published in:
Applied Soft Computing, Elsevier Publications.
2012; Vol.12 No. 2; pp:619-625

Proposed Discretization Method (DRDS)
 A new static, global, supervised, incremental and bottom-up
discretization algorithm based on coefficient of dispersion and
skewness of data range.
 It automates the discretization process by introducing the
number of intervals and stopping criterion.
The DRDS method has two phases,
Phase I : gets the Initial Discretization scheme (IDS) by searching
through globally.
Phase II : refines the intervals. Here the intervals are further
merged upto the stopping criterion without affecting the quality of the
discretization and FDS is obtained.

IDS of DRDS Method
• The degree to which numerical data tend to spread is called
dispersion. The range coefficient of dispersion is the relative
measure of dispersion based on the value of range
•When the dispersion is large, the values are widely scattered; when
it is small they are tightly clustered.
•A value jth minimum value jmink is taken between mink and maxk to
get best interval length.
• For a data series with large dispersion, smaller j value is selected
and for a data series with small dispersion, larger j value is selected.
• The value CDk of data of the discretized attribute in the class k is
estimated by

IDS of DRDS Method
The value of CDk is always in [-1; +1]. To decide the value j, the range [-1; +1] is divided
into set of intervals based on the magnitude of number of distinct values in the
discretizing attribute of the class k.
The value j is selected based on the value of CDk lies in the above interval.
The best interval length lk for a discretizing attribute of a class k can be obtained by,
A distribution of data is said to be skewed if the given data is symmetrical but
stretched more to one side than to the other. The selection of very small jmink value
due to the right skewness leads the interval length lk as too small and the number of
intervals n also as very high vice versa. This adjustment process of lk can be formulated
as,

IDS of DRDS Method -The Selection Process of ‘j’

IDS of DRDS Method
Let t be a dynamic variable and it specifies the value from which the discretiza-
tion process to be begun for a discretizing attribute of the class k.
The number of intervals n for a discretizing attribute of the target class cls(i), i = 1 to S
is calculated by
The intervals in the Initial Discretization Scheme (IDS) can be written as
where dij represents an interval j of a discretizing attribute of the class cls(i).

FDS of DRDS Method
• The goal of proposed discretization method is to reduce the
number of intervals while maximizing the classification accuracy.
• To achieve that the number of intervals in IDS are to be reduced
by merging the intervals as follows.
• Let b be the number of intervals in IDS and for each interval Ii,
i. Calculate the total number of examples qi within the interval Ii.
ii. Merge the interval Ii with the adjacent smallest interval until
Where i= 2 to b-1, M – total no. of examples

Results of DRDS
Discretization : The results obtained by the DRDS algorithm with the
six datasets are shown in Table 2.
Classification accuracy: computed using the feed forward neural
network with conjugate gradient training (MLP-CG) algorithm[21]
with the help of KEEL software [25].
Criterion
Datasets
iris iono heart pid wav breastw
Mean Number of Intervals 5.75 5.1 5.0 10.8 12.4 4.0
Discretization time (s) 0.09 0.64 0.31 1.74 35.7 0.15
Criterion
Datasets
iris iono heart pid wav breast
Topology 23-5-3 175-5-2 65-5-2 87-5-2 495-5-3 36-5-2
Learning
time (s) 0.18 0.53 0.59 0.54 34.5 0.34
Training
Accuracy (%)
97.9 99.3 96.8 80.4 83.1 99.2
Testing
Accuracy (%)
96 90.1 80.7 74.0 81.3 95.4

Comparison of Discretization
Methods• DRDS is compared with other discretizaion methods such as
Equal-w, Equal-F, Chimerge, Ex-chi2, CACC and CAIM.
Criterion
Discretizat
ion
Methods
Datasets
Mean Number
of Intervals
Equal-W
Equal-F
DRDS
Chimerge
Ex-chi2
CACC
CAIM
4.0
4.0
5.75
3.5
7.5
3.0
3.0
20.0
20.0
5.1
21.4
8.8
4.3
2.0
10.0
10.0
5.0
7.8
2.3
6.4
2.0
14.0
14.0
10.8
25.6
20.0
11.2
2.0
20.0
20.0
12.4
28.5
12.2
18.1
3.0
14.0
14.0
4.0
4.6
3.3
2.0
2.0
Discretization
Time (s)
Equal-w
Equal-F
DRDS
Chimerge
Ex-chi2
CACC
CAIM
0.02
0.03
0.09
0.09
0.11
0.08
0.08
1.72
1.84
0.64
4.28
11.11
3.62
3.43
0.12
0.12
0.31
0.39
1.68
0.22
0.20
0.33
0.33
1.74
0.94
3.23
0.90
0.80
9.06
9.33
35.7
64.33
136.0
61.41
52.38
0.26
0.27
0.15
0.66
1.91
0.58
0.58

MethodsFigure compares the discretization time of DRDS with only the
algorithms which require no parameters.
DRDS requires less discretization time due to its low computational
cost.

Methods
DRDS achieves a high or closer accuracy for all datasets.
The accuracies obtained by neural network (MLP-CG) for DRDS are
compared with the accuracies obtained for other six discretization
schemes on all datasets and it is shown in the following Table.
Discretization
Methods
Datasets
Equal-w
Equal-F
Chimerge
Ex-chi2
DRDS
CACC
CAIM
96.6
95.3
96.0
93.3
96.0
93.0
94.6
89.7
84.6
89.4
64.1
90.1
90.3
89.5
77.4
73.7
57.8
55.5
80.7
79.3
77.0
74.1
71.9
65.1
72.6
74.9
72.9
72.1
74.3
79.1
78.3
77.4
81.3
80.2
78.1
94.1
95.7
96.3
95.1
95.4
95.1
94.9

DRDS - Summary
The proposed the DRDS algorithm handles continuous and mixed
mode attributes.
 It does not require any user interaction in both phases and performs
automatic selection of the number of discrete intervals based on
coefficient of dispersion and skewness of data range.
 The results show that our DRDS method discretizes an attribute into
smallest number of intervals within less amount of time.
 The discretization time of DRDS is smaller than the other bottom-up
methods for maximum datasets.
Also our proposed algorithm DRDS achieves highest classification
accuracy among the other six discretization algorithms.

Pruning
 Pruning is defined as a network trimming within the assumed
initial architecture. The trimmed network is of smaller size and is
likely give higher accuracy than before its trimming.
Why Pruning?
The ANN with large number of hidden nodes able to learn fast
but with poor generalization.
 The better generalization performance can be achieved only by
the small network.
 The small trained networks are easier to interpret and the
knowledge can be easily extracted in the form of simple rules.

A Novel method for Pruning
Irrelevant
Hidden Neurons of
Feedforward Neural Network
Published in :
Proceedings of the International conference on
Emerging Trends in Mathematics and Computer
Applications, MEPCO Schlenk Engineering College,
Sivakasi, India. Dec 16-18, 2010. pp. 579-584.

Proposed Method (PIHNS)
 Prunes the irrelevant hidden neurons of the single hidden layer
Neural Network by sensitivity.
 The sensitivity of the global error
changes are computed using the Euclidean
distance with respect to each individual
hidden node after the training process.
Named as PIHNS as it Prunes Irrelevant Hidden Neurons by
Sensitivity.

PIHNS Algorithm
Input:
A feedforward neural network with l input neurons, m hidden neurons
and n output neurons, and a dataset with np patterns and q attributes.
Begin
1.Train the network until a predetermined accuracy rate is achieved using
the Backpropagation algorithm with momentum.
2. For each hidden node j,
2.1. Compute the total net value with all the patterns in a dataset
using

PIHNS Algorithm – contd.
2.3. Eliminate hidden neuron j if sj≤α, α € {1,2,…n}.
3. Retrain the currently pruned network.
4. If classification rate of the network falls below an acceptable level
then stop pruning, otherwise goto step 2.
Output:
The Pruned multilayer feedforward neural network.
2.2. Compute the sensitivity measure sj for the hidden neuron j by
(sj is calculated by finding the squared Euclidian distance between the node hj and
weight vjk of its all outgoing connections, where k = (1, 2, …n))

 The datasets used to test the algorithm are Iris, Wisconsin Breast
Cancer, Hepatitis domain, Wave form-5000.
 pruning parameter α is selected depending on the problem
Dataset Initial
Architecture
Acctest
%
mse Execution
time (s)
Final
Architecture
Acctest % Execution
time (s)
Pruning
parameter α
Pruning
Steps
iris 4-10-3 95.9 0.016 0.17 4-3-3 98.67 0.28 8 2
cancer 9-10-2 96.4 0.01 1.41 9-2-2 97.1 1.93 10 3
hepatit
is
19-25-2 78.2 0.08 0.63 19-2-2 83.95 0.76 4 3
wave 40-10-3 80.5 0.03 8.42 40-3-3 84.6 8.81 10 1
Pruned network of iris dataset with
the classification accuracy of 98.7%
with 4-3-3 architecture.

Hepatitis Pruning Results
Step Current
Architecture
Acctest % Epochs Pruned Neurons
1 19-25-2 78.2 200 18 hidden neurons
2 19-7-2 80.5 50 5 hidden neurons
3 19-2-2 83.95 50 Pruning stops
Original network with architecture 19-25-2 with accuracy 78.2%
is reduced to the architecture 19-2-2 with accuracy 83.95%.
 Requires 0.76 seconds to obtain the pruned network.

Comparison of Pruning methods
The proposed method PIHNS is compared with other five pruning
methods such as MBP, OBS,OBD,VNP and Xing-Hu’s method.
Better architecture with minimum number of hidden nodes .
Accuracy is similar or better than other pruning methods.
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
iris breast-w hepatitis
Datasets
ClassificationAccuracy
PHINS
OBD
OBS
MBP
VNP
Xing-Hu
PIHNS

Comparing hidden nodes removal with
other methods.
It shows that the PIHNS method removes more number of hidden
neurons for hepatitis and cancer datasets than all other pruning
methods.
0
5
10
15
20
25
iris cancer hepatitis
Datasets
Numberofprunedhiddenneurons
N2PS
VNP
Xing-Hu
OBD
OBS
MBP
PIHNS

PIHNS - Summary
 Determines the best architecture for feedforward Neural network
based on Sensitivity Analysis (SA) using the squared Euclidean distance.
 Efficient in identifying irrelevant hidden neurons
 Pruned Neural network are more accurate than the original Neural
network used in the training phase.
 Large decrease in number of hidden nodes with out affecting the
classification accuracy which leads to high degree of generalization, and
computational time.
 It prunes the nodes directly instead of removing unwanted
connections associated with those nodes, and hence reduces
computational time.

A Novel Pruning Algorithm
for Optimizing
Feedforward Neural Network
of Classification Problems
Published in:
Neural Processing Letters, Springer Publications
2011; 34(3):241-258

Proposed N2PS algorithm
This work deals with a new approach which determines the
insignificant input and hidden neurons to detect the optimum
structure of a feedforward neural network.
The proposed pruning algorithm, called as Neural Network Pruning
by Significance (N2PS), is based on a new significant measure which
is calculated by the Sigmoidal activation value of the node and all
the weights of its outgoing connections.

Pruning by Significance
N2PS considers all the nodes with significance value below the
threshold as insignificant and eliminates them.

Steps of N2PS method
1. Train the network T until a predetermined accuracy rate is
achieved using the Backpropagation algorithm with momentum.
2. Compute the significance of each hidden neuron using
where
,
and eliminate the neurons if where

Steps of N2PS method (Contd.,)
3. Compute the significance of each input neuron using
and eliminate the neurons which are below the threshold value α
where
4. Retrain the pruned network and compute its classification
accuracy on testing dataset.
5. If classification accuracy of the network P falls below an
acceptable level then stop pruning otherwise repeat the process.

The performances of the N2PS algorithm on six datasets are shown in
Table .
• the algorithm doesn’t require more iteration to prune the network
and requires maximum three pruning steps only.
• the pruned network achieves higher accuracy than the initially
selected network.

Results Comparisons
• Classification accuracy of N2PS is compared with other
pruning methods such as VNP, Xing-Hu’s method, MBP, OBD
and OBS.

Results Comparisons (Contd.,)
Comparing hidden nodes
removal of N2PS with
other five pruning
methods
Comparing input nodes
removal of N2PS with
VNP and XingHu’s
methods

N2PS Summary
• A new pruning algorithm to determine the optimal architecture for
feedforward neural network has been proposed based on new
significance measure which is estimated using the Sigmoidal
function and weights.
• Results indicate that the proposed algorithm is very efficient in
identifying insignificant input and hidden neurons and also confirm
that the pruned neural network yields better accurate results than
the original neural network used in the training phase.
• The main advantages of this algorithm are,
– no user defined parameters needs to be set
– large decrease in number of nodes without affecting the classification
accuracy
– requires small number of pruning steps and requires small number of
iterations for retraining the pruned network.

Rule Extraction
Why Rule extraction?
An important drawback of neural networks is their lack of
explanation capability i.e., it is very difficult to understand how
an ANN has solved a problem. To overcome this problem various
rule extraction algorithms have been developed.
Rule extraction : It changes a black box system into a white box system
by translating the internal knowledge of a neural network into a set
of symbolic rules .
It is the process of developing natural language like syntax that
describes the behaviour of a neural network

Reverse Engineering
the Neural Networks for Rule Extraction
in Classification Problems
Published in:
Neural Processing Letters, Springer Publications,
2012; vol.35 no.2, pp:131-150.

Proposed RxREN
algorithm In pedagogical approach the proposed algorithm extracts rules by
mapping the input output relationships as closely as possible to the
way the neural networks understand the relationship.
 Reverse engineering is a method of analyzing a product in which
the finished item is studied to determine its makeup or component
parts. The algorithm relies on reverse engineering technique since the
neural networks are black box i.e., how they solve a problem is not
interpretable..
The novelty of this algorithm lies in the simplicity of the extracted
rules and conditions in rule are involving both discrete and continuous
mode of attributes.

Phases of RxREN
algorithmThe proposed algorithm consists of two phases.
 The first phase removes the insignificant input neurons from the
trained neural network and finds the mandatory data range of each
significant input neuron for classifying the given testing data as in
particular class.
It learns about the importance of each input connection of the
trained neural network by analyzing the misclassifications occurred in its
absence.
 The second phase constructs the classification rules for each class
using the data ranges obtained in phase1 and refines the generated
rules by the process of rule pruning and rule updation.

Summarized steps of proposed
algorithm

Status of neural network at the removal of each neuron for PID
dataset.

Various steps of Rule Pruning and Rule Updation of
neural network for PID dataset.

Extracted Rules of 6 real datasets.

Performance of RxREN on 6 real datasets.
Random
10-fold
cross validation

Comparison of Proposed algorithm with
various rule extraction algorithms on WBC
dataset.
• RxREN obtains minimum number of rules with high accuracy.

 A new pedagogical approach rule extraction algorithm RxREN has
been proposed to determine the best classification rules from trained
neural networks by the technique of reverse engineering.
 The RxREN requires minimum time to search the rule since its search
space consists only misclassified data.
 it doesn't require retraining after pruning.
 It extracts the rules with low computational cost but with high
accuracy and it extracts more comprehensible set of rules.
 It improves the generalization of a rule by the process of rule pruning
and it increases the classification accuracy of obtained ruleset by
updating them based on the misclassification of the ruleset.
RxREN-Summary

Conclusion
This research provides novel algorithms for preprocessing the data for
classification in datamining, for identifying the optimal architecture of neural networks for
generalization and for extracting classification rules of large datasets from neural networks.
In MDC+PS method, MDC discretizes the continuous attributes into many
intervals by the computed mean value but with nominal accuracy. PS increases the
performance of a discretized data on neural network in terms of classification accuracy,
convergence speed and generalization by obtaining a good training set based on pattern
disparity. The results show that the discretization method MDC has to be combined with
PS to achieve the best performance. To overcome this drawback, a new static, global,
incremental, supervised and bottom-up discretization method DRDS which is based on
coefficient of dispersion and skewness of data range has been proposed. The results
obtained using this discretization algorithm show that the discretization scheme generated
by the algorithm almost has minimum number of intervals , requires smallest discretization
time and leads to highest classification accuracy .

Conclusion (Contd.,)
The pruning method PIHNS prunes irrelevant hidden neurons by the sensitivity using
the Euclidian distance. The main advantages of this algorithm are large decrease in number of
hidden nodes with out affecting the classification accuracy that leads to high degree of
generalization, large decrease in computational time for pruning procedure compared with
traditional pruning methods. But the main drawbacks of this algorithm are, the irrelevant input
neurons can’t be pruned by this algorithm and need the user to specify pruning parameters.
N2PS overcomes these drawbacks by pruning both irrelevant input neurons and
hidden neurons based on a significance of a node automatically. The main advantages of this
algorithm are, no user defined parameters needs to be set, large decrease in number of nodes
without affecting the classification accuracy, requires small number of pruning steps and
requires small number of iterations for retraining the pruned network compared with other
pruning methods and achieves better generalization ability on all datasets. The experimental
results demonstrate that the proposed N2PS algorithm is very promising method for
determining the optimal architecture of neural networks of arbitrary topology for classifying
large datasets.

Conclusion (Contd.,)
The rule extraction algorithm RxREN, proposed in this research extracts the
rules from neural networks using pedagogical approach. The algorithm relies on
reverse engineering technique to prune the insignificant input neurons and to
discover the technological principles of each significant input neuron of neural
network in classification. The results show that RxREN is quite efficient in extracting
smallest set of rules with high classification accuracy than those generated by other
neural network rule extraction methods. As a summary the proposed rule extraction
algorithm RxREN is very promising method for discovering the knowledge from neural
networks and for interpreting the behaviour of neurons in human understandable
format from large datasets with mixed mode attributes of data.
In a nutshell, various algorithms proposed in this research are very effective
and easy to use supervised knowledge discovery algorithms which can be applied to
problems that require classification of large datasets.

List of Publications
 A New Discretization and Pattern Selection Method For Classification in Data Mining Using
Feedforward Neural Networks, International Journal of Advanced Research in Computer
Science, 2 (1), Jan. –Feb, 2011, 615-620. ISSN No. 0976-5697.
 A Novel Method for pruning Irrelevant Hidden Neurons of Feedforward Neural Network,
Proceedings of the International conference on Emerging Trends in Mathematics and
Computer Applications, MEPCO Schlenk Engineering College, Sivakasi, India. Dec 16-18,
2010. pp. 579-584
 A Novel Pruning Algorithm for Optimizing Feedforward Neural Network of Classification
Problems, Neural Processing Letters, Springer, 2011; 34(3):241-258, Impact Factor : 0.75
 Reverse Engineering the Neural Networks for Rule Extraction in Classification Problems,
Neural Processing Letters, Springer 2012; 35(2):131-150, Impact Factor : 0.75
 A new Discretization algorithm based on Range coefficient of Dispersion and Skewness for
neural networks classifier, Applied Soft Computing., Elsevier, 2012; 12(2):619-625 , Impact
Factor : 2.61.
 M.Gethsiyal Augasta, T.Kathirvalavakumar, Rule extraction from neural networks – A
comparative Study, Proceedings of IEEE International conference on Pattern recognition,
Informatics and Medical Engineering (IEEE-PRIME 2012), Periyar University, India.

References
 Kaikhah.K, Doddmeti S., Discovering trends in large datasets using neural network, Applied
Intelligence 29 (2006) 51-60.
 Xing H.J., Gang Hu B., Two phase construction of multilayer perceptrons using Information
Theory. IEEE Transactions on Neural Networks 20(4) (2009) 715-721.
 Castellano G, Fanelli AM, Pelillo M, An iterative pruning algorithm for feedforward neural
networks. IEEE Transactions on Neural Networks 8(3) (1997) 519-530.
 Han J., Kamber M., Data Mining: Concepts and Techniques, Morgan Kaufman,2001.
 Tsai C.J., Lee C.I., Yang W.P., A Discretization algorithm based on Class Attribute Contingency
Coefficient, Information Sciences 178 (2008) 714-731.
 Saad E.W., Donald C.Wunsch II, Neural network explanation using inversion, Neural networks
20 (2007) 78-93.
 Kurgan L.A., Cios K.J., CAIM Discretization Algorithm, IEEE Transactions on knowledge and
Data Engineering 16 (2004) 145-152.
 Odajima K., Yoichi Hayashi , Gong Tianxia, Rudy Setiono, Greedy rule generation from discrete
data and its use in neural network rule extraction, Neural Networks 21 (2008) 1020-1028.

MDC - Example
Age : 10, 8, 24, 43, 12, 61,33
Mean : 27
Interval length : 27-8=19
No. of Intervals : (61-8)/19 ≈ 3
Intervals : [8-27][27-46][46-65]
Thermometer coding :
for age 12 : 100

DRDS - Example
Age : 10, 8, 24, 43, 12, 61,33
CD ≈ 0.8
j= 2 [2/3,3/3] jmin=10
Interval length = 2 (10-8) <11 (53/5)
Interval length = 4 <11 (53/5)
Interval length = 16
No. of Intervals : (61-8)/16 ≈ 4
Intervals : [8-23][24-39][40-55][56-61]
After Merging : [8-23][24-61] (if <sqrt(7 ) ≈ 3)
Thermometer coding :
for age 24 : 01

Novel algorithms for Knowledge discovery from neural networks in Classification problems

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Novel algorithms for Knowledge discovery from neural networks in Classification problems

Similar to Novel algorithms for Knowledge discovery from neural networks in Classification problems (20)

Recently uploaded

Recently uploaded (20)

Novel algorithms for Knowledge discovery from neural networks in Classification problems