9. Goal and Contribution
• Construct a fuzzy classifier (Goal)
• Map the attributes to predefined fuzzy sets
• Build rules with a confidence grade and a target class label (How?)
• Rule selection is a combinatorial optimization problem: with six fuzzy sets per attribute there are 6^n candidate rules for n attributes (e.g., 6^13 ≈ 1.3 × 10^10 for the 13-attribute wine data)
• Use simulated annealing (SA) to find a good set of fuzzy rules (Contribution, as stated by the authors)
10. The used antecedent fuzzy sets
[Figure: five triangular membership functions S, MS, M, ML, L over the normalized attribute value range 0.0–1.0, plus a "don't care" (DC) set with membership 1.0 over the whole range]
The six antecedent fuzzy sets:
1. small (S)
2. medium small (MS)
3. medium (M)
4. medium large (ML)
5. large (L)
6. don't care (DC)
Example: the rule "if x1 is small and x2 is medium and x3 is don't care" is encoded as 136.
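A minimal runnable sketch of this encoding (Python; the triangular shapes and the uniform partition of [0, 1] with centers 0, 0.25, 0.5, 0.75, 1.0 are the standard Ishibuchi-style setup and are assumed here rather than quoted from the paper):

    def triangular(x, center, width=0.25):
        # membership grade of x in a triangular fuzzy set
        return max(0.0, 1.0 - abs(x - center) / width)

    def membership(code, x):
        # codes 1..5 = S, MS, M, ML, L (centers 0, 0.25, 0.5, 0.75, 1.0); 6 = don't care
        if code == 6:
            return 1.0                      # "don't care" matches any value
        return triangular(x, center=(code - 1) / 4.0)

    # "if x1 is small and x2 is medium and x3 is don't care" -> encoding 136
    rule, x = (1, 3, 6), (0.1, 0.55, 0.9)
    mu = 1.0
    for code, xi in zip(rule, x):           # compatibility is the product
        mu *= membership(code, xi)
    print(f"mu(x) = {mu:.3f}")              # -> mu(x) = 0.480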
11. Determination of Cj and CFj
1. Calculate the compatibility of each training pattern xp with the rule Rj:
   µj(xp) = µj1(xp1) × ... × µjn(xpn),   p = 1, ..., m
2. For each class h, calculate the relative sum of the compatibility grades of the training patterns in class h with the rule Rj:
   βClass h(Rj) = Σ_{xp ∈ Class h} µj(xp) / NClass h
3. Find the class ĥj with the maximum grade:
   βClass ĥj(Rj) = max{βClass 1(Rj), ..., βClass c(Rj)}
   If the maximum is 0 or two or more classes tie (conflict), set Cj = φ.
4. If Cj = φ, set CFj of rule Rj to 0. Otherwise:
   CFj = (βClass ĥj(Rj) − β̄) / Σ_{h=1}^{c} βClass h(Rj),   where β̄ = Σ_{h ≠ ĥj} βClass h(Rj) / (c − 1)
5. Classify the sample xp with the rule set S by the single winner rule Rj*:
   µj*(xp) · CFj* = max{µj(xp) · CFj | Rj ∈ S}
   Reject xp if µj(xp) = 0 for all Rj ∈ S.
(A sketch of steps 1–4 follows below.)
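A hedged sketch of steps 1–4, reusing membership() from the slide-10 sketch; all other names are mine:

    from collections import defaultdict

    def consequent_and_cf(rule, X, y, classes):
        # assign consequent class Cj and certainty grade CFj to one rule
        beta, count = defaultdict(float), defaultdict(int)
        for xp, h in zip(X, y):
            mu = 1.0
            for code, xi in zip(rule, xp):
                mu *= membership(code, xi)  # membership() from the slide-10 sketch
            beta[h] += mu
            count[h] += 1
        for h in classes:                   # relative sum per class (step 2)
            beta[h] = beta[h] / count[h] if count[h] else 0.0
        best = max(classes, key=lambda h: beta[h])
        winners = [h for h in classes if beta[h] == beta[best]]
        if beta[best] == 0.0 or len(winners) > 1:   # zero or conflict: Cj = phi
            return None, 0.0
        beta_bar = (sum(beta.values()) - beta[best]) / (len(classes) - 1)
        return best, (beta[best] - beta_bar) / sum(beta.values())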
13. Structure of the goal classifier
[Diagram: the test dataset is fed in parallel to c classifiers (Classifier #1 ... Classifier #c), each holding a set of rules for one class; a decision-fusion stage combines their outputs into the detected class]
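The slide names a decision-fusion stage but not the operator. A natural reading, sketched below under that assumption, is that each per-class classifier reports its best weighted compatibility µ·CF and fusion picks the highest-scoring class (membership() again comes from the slide-10 sketch):

    def compat(rule, xp):
        mu = 1.0
        for code, xi in zip(rule, xp):      # product over antecedents
            mu *= membership(code, xi)
        return mu

    def score(rule_set, xp):
        # best weighted compatibility mu_j(xp) * CF_j over one class's rules
        return max((compat(rule, xp) * cf for rule, cf in rule_set), default=0.0)

    def fuse(classifiers, xp):
        # classifiers: {class_label: rule_set}; returns detected class or None (reject)
        scores = {h: score(rs, xp) for h, rs in classifiers.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0.0 else None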
15. Procedure of SAFCS
T = Tmax                    # Tmax = −ΔEF̄b / ln(Pinit), where ΔEF̄b = (1/Mb) Σ_{i=1}^{Mb} ΔEFb(i) is the
                            # average cost increase over the Mb bad moves among M = Mg + Mb trial moves
Scurrent = Sinit            # Sinit: Ninit randomly generated rules
Sbest = Scurrent
EFcurrent = NNCP(Scurrent)  # NNCP(S) = m − Σ_{j=1}^{N} NCP(Rj), the number of training patterns not classified correctly
EFbest = NNCP(Sbest)
While (T ≥ Tmin)            # Tmin = 0.01
  For i = 1 to k            # k is the number of calls of Metropolis per temperature
    Call Metropolis(Scurrent, EFcurrent, Sbest, EFbest, T)
  Time = Time + k           # Time is the time spent so far
  k = β × k                 # β is the iteration increment rate (set to 1)
  T = α × T                 # α is the cooling rate (set to 0.9)
Return(Sbest)
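A sketch of the Tmax rule reconstructed above: sample M trial moves, average the cost increases of the bad (uphill) ones, and solve exp(−ΔEF̄b/Tmax) = Pinit. M and Pinit are knobs the slides do not fix, so the defaults below are placeholders:

    import math

    def estimate_tmax(ef, perturb, s0, n_moves=100, p_init=0.95):
        bad = []
        s, cost = s0, ef(s0)
        for _ in range(n_moves):            # M = Mg + Mb trial moves
            s = perturb(s)
            new_cost = ef(s)
            if new_cost > cost:             # one of the Mb bad moves
                bad.append(new_cost - cost)
            cost = new_cost
        avg_bad = sum(bad) / max(1, len(bad))   # average bad-move cost increase
        return -avg_bad / math.log(p_init)      # solves exp(-avg_bad/Tmax) = Pinit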
16. Metropolis Procedure
Snew = Perturb(Scurrent)                 # generate a new rule set
EFnew = NNCP(Snew)
ΔEF = EFnew − EFcurrent
If (ΔEF < 0) Then                        # better rule set: always accept
  Scurrent = Snew
  If (EFnew < EFbest) Then               # better than the best so far
    Sbest = Snew
ElseIf (rand(0,1) < exp(−ΔEF/T)) Then    # accept a worse rule set with probability exp(−ΔEF/T)
  Scurrent = Snew
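Putting slides 15 and 16 together, a runnable Python sketch of the annealing loop with the slide-19 parameter values; ef() and perturb() stand in for NNCP() and the perturbation operators:

    import math, random

    def safcs(s_init, ef, perturb, t_max=100.0, t_min=0.01, alpha=0.9, k=40, beta=1):
        s_cur, ef_cur = s_init, ef(s_init)
        s_best, ef_best = s_cur, ef_cur
        t = t_max
        while t >= t_min:                   # cooling loop (slide 15)
            for _ in range(k):              # Metropolis step (slide 16)
                s_new = perturb(s_cur)
                ef_new = ef(s_new)
                d = ef_new - ef_cur
                if d < 0 or random.random() < math.exp(-d / t):
                    s_cur, ef_cur = s_new, ef_new
                    if ef_cur < ef_best:    # track the best rule set seen
                        s_best, ef_best = s_cur, ef_cur
            k = beta * k
            t = alpha * t
        return s_best, ef_best

    # toy usage: minimize |x| over the integers with +/-1 moves
    print(safcs(10, abs, lambda x: x + random.choice([-1, 1])))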
17. Perturbation (3 functions)
1. Modify
• select a rule from S randomly
• modify one or more of its antecedent fuzzy sets
• if the consequent class of the modified rule equals the original one, replace the rule; otherwise, repeat
2. Delete
• select the rule to delete with probability
  P(R) = (fitnessmax(SClass h) − fitness(R)) / (fitnessmax(SClass h) − fitnessmin(SClass h))
3. Create
• the same as Modify, but the new rule is added to S
(NB: Create changes more linguistic values than Modify; the authors say this lets the search jump to distant regions. A sketch of all three operators follows below.)
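A hedged sketch of the three operators. How many antecedents Modify versus Create touch, and the fitness function used by Delete, are not fixed on the slide, so the concrete choices below are placeholders:

    import random

    def modify(rule, n_changes=1):
        # randomly reassign n_changes antecedent positions to codes 1..6 (slide 10)
        rule = list(rule)
        for i in random.sample(range(len(rule)), n_changes):
            rule[i] = random.randint(1, 6)
        return tuple(rule)

    def delete_prob(fitness, r):
        # P(R) from the slide: lower-fitness rules are deleted with higher probability
        fmax, fmin = max(fitness.values()), min(fitness.values())
        if fmax == fmin:
            return 1.0 / len(fitness)       # degenerate case: uniform choice
        return (fmax - fitness[r]) / (fmax - fmin)

    def create(rules, n_attrs):
        # like Modify but changes more linguistic values, and the rule is added to S
        base = random.choice(rules)
        return rules + [modify(base, n_changes=min(3, n_attrs))]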
19. Parameters
Parameter                               Value
Initial rule set size (Ninit)           50
Initial temperature (Tmax)              100
Final temperature (Tmin)                0.01
Cooling rate (α)                        0.90
# Iterations at each temperature (k)    40
Iteration increment rate (β)            1
Estimate: cooling from 100 down to 0.01 at rate 0.9 takes 88 temperature levels, so 88 × 40 = 3520 Metropolis iterations in total (keep in mind; worked check below).
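Worked check of the iteration estimate, from the table values alone:

    import math
    levels = math.ceil(math.log(0.01 / 100) / math.log(0.9))
    print(levels, levels * 40)              # -> 88 3520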
21. Dataset (UCI)
Table 1
Features of the data sets used in computational experiments

Name   #Instances  #Attributes  #Real  #Nominal  #Classes  Dev. cla. (%)  Maj. cla. (%)  Min. cla. (%)
bswd   625         4            4      –         3         18.03          46.08          7.84
cra    690         15           6      9         2         5.51           55.51          44.49
ion    351         34           34     –         2         14.10          64.10          35.90
iris   150         4            4      –         3         –              33.33          33.33
lab    57          16           8      8         2         14.91          64.91          35.09
pima   768         8            8      –         2         15.10          65.10          34.90
wave   5000        40           40     –         3         0.36           33.84          33.06
wine   178         13           13     –         3         5.28           39.89          26.97

Dev. cla.: deviation of class distribution; Maj. cla.: percentage of majority-class instances; Min. cla.: percentage of minority-class instances.
Table 2
Parameter specification in computer simulations for the SAFCS (10-fold cross validation)

Parameter                               Value
Initial set of rules size (Ninit)       50
Initial temperature (Tmax)              100
Final temperature (Tmin)                0.01
Cooling rate (α)                        0.90
# Iterations at each temperature (k)    40
Iteration increment rate (β)            1

IBk [31] is the nearest-neighbor classifier technique. It uses the whole training set as the core of the classifier and Euclidean distance to select the k nearest instances. The class prediction provided by the system is the majority class among these k examples. Here, k is set equal to 3.
Naïve Bayes [32] is a very simple Bayesian network approach that assumes that the predictive attributes are conditionally independent given the class and also that no hidden or latent attributes influence the prediction process. These assumptions ...
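Both baselines are standard; a minimal sketch of how they could be reproduced today (scikit-learn is my choice of library, not the paper's, which used the WEKA-style IBk; k = 3 and 10-fold cross validation match the text above):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)                 # iris is one of the Table 1 sets
    for clf in (KNeighborsClassifier(n_neighbors=3),  # Euclidean distance, k = 3
                GaussianNB()):
        acc = cross_val_score(clf, X, y, cv=10).mean()   # 10-fold CV as in Table 2
        print(type(clf).__name__, round(acc, 3))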