Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Measures

J. Abellán, A. R. Masegosa
Department of Computer Science and A.I.
University of Granada
Spain

Outline
1. Introduction
2. Previous knowledge
3. Experimentation
4. Conclusions & future work

Introduction
Classification tree (decision tree)
[Figure: example classification tree with attribute nodes Tumor and Calcium and leaves "Classification: absent", "Classification: absent", "Classification: present"]
- Node: attribute variable
- Leaf: case (state) of the class variable
- SPLIT CRITERION
- STOP CRITERION
- 1 LEAF = 1 RULE

Introduction
Classification tree. New observation
- Observation: (high, a1, absent, present)
- Variables: [Calcium, Tumor, Coma, Migraine]
- Classification: Cancer present
[Figure: classification tree with attribute nodes Calcium (branches: normal, high) and Tumor (branches: a0, a1), and leaves "Absent 0.9", "Absent 0.7", "Present 0.8"]
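A minimal sketch of how a new observation is routed through a classification tree to a leaf. The tree layout below is an assumption (Calcium at the root, Tumor tested on the "high" branch), since the slide figure itself is not recoverable; the names `TREE` and `classify` are hypothetical.

```python
# Hypothetical tree layout (an assumption, not taken from the slide figure):
# Calcium is the root, Tumor is tested on the "high" branch.
TREE = {
    "var": "Calcium",
    "children": {
        "normal": {"leaf": ("absent", 0.9)},
        "high": {
            "var": "Tumor",
            "children": {
                "a0": {"leaf": ("absent", 0.7)},
                "a1": {"leaf": ("present", 0.8)},
            },
        },
    },
}


def classify(tree, observation):
    """Follow the branch matching the observation until a leaf is reached."""
    node = tree
    while "leaf" not in node:
        node = node["children"][observation[node["var"]]]
    return node["leaf"]  # (class state, probability stored at the leaf)


variables = ["Calcium", "Tumor", "Coma", "Migraine"]
values = ["high", "a1", "absent", "present"]
observation = dict(zip(variables, values))
label, probability = classify(TREE, observation)
```

Under this assumed layout the observation reaches the leaf labelled "present", which matches the classification given on the slide. Each root-to-leaf path corresponds to one rule.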

Introduction
Approach of the work presented
- Show how combining a few decision trees, obtained by a simple method based on the IDM, produces large improvements.
- As reference classifiers, we use NAIVE BAYES and J48 (an improved version of C4.5).
- We carry out EXPERIMENTS on a large set of databases.
- For comparing results, we use:
  - PERCENTAGE OF CORRECT CLASSIFICATIONS
  - NUMBER OF SELECTED VARIABLES
  - RUN TIME
Tools: WEKA & Elvira

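Previous knowledge
Naive Bayes (Duda & Hart, 1973)
- Attribute variables $\{X_i \mid i = 1, \dots, r\}$
- Class variable $C$ with states in $\{c_1, \dots, c_k\}$
- Select the state of $C$: $\arg\max_{c_i} P(c_i \mid X)$
- Assuming the attributes are independent given the class variable: $\arg\max_{c_i} \left( P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i) \right)$, where $z_j$ is the observed value of $X_j$
[Figure: graphical structure, with the class node C as parent of X1, X2, …, Xr]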
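The decision rule above can be sketched as follows. This is only an illustration: probabilities are estimated by plain relative frequencies (an assumption; no smoothing is applied), and the function names and toy data are hypothetical.

```python
from collections import Counter


def train_naive_bayes(rows, labels):
    """Count class frequencies and (attribute index, value, class) frequencies."""
    class_counts = Counter(labels)
    cond_counts = Counter()
    for row, c in zip(rows, labels):
        for j, value in enumerate(row):
            cond_counts[(j, value, c)] += 1
    return class_counts, cond_counts


def predict_naive_bayes(class_counts, cond_counts, z):
    """Return arg max over c of P(c) * prod_j P(z_j | c), using relative frequencies."""
    n = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / n
        for j, value in enumerate(z):
            score *= cond_counts[(j, value, c)] / n_c  # Counter gives 0 if unseen
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy data: attributes are (Calcium, Tumor), class is Cancer.
rows = [("high", "a1"), ("high", "a1"), ("normal", "a0"),
        ("normal", "a1"), ("high", "a0")]
labels = ["present", "present", "absent", "absent", "absent"]
model = train_naive_bayes(rows, labels)
```

Calling `predict_naive_bayes(*model, ("high", "a1"))` then selects the class state with the largest product of prior and conditional frequencies.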

Previous knowledge
J48 Classifier
- Selects the attribute variable with the highest positive value of IGR(Xi,C) = IG(Xi,C) / H(Xi)
J48 (an improved version of C4.5)
- Works with continuous data
- Applies a post-pruning process
- Penalizes the use of variables with a larger number of cases (states)
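As an illustration of the IGR criterion, here is a small sketch that computes IG(Xi,C) and divides it by H(Xi) from two parallel lists of attribute values and class labels. It is not J48's actual code; the function names are hypothetical and only discrete attributes are handled.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())


def info_gain(xs, cs):
    """IG(X, C) = H(C) - sum_x P(x) * H(C | X = x)."""
    n = len(xs)
    conditional = 0.0
    for x in set(xs):
        subset = [c for xi, c in zip(xs, cs) if xi == x]
        conditional += len(subset) / n * entropy(subset)
    return entropy(cs) - conditional


def info_gain_ratio(xs, cs):
    """IGR(X, C) = IG(X, C) / H(X); defined as 0 when H(X) is 0."""
    h_x = entropy(xs)
    return info_gain(xs, cs) / h_x if h_x > 0 else 0.0


cs = ["0", "0", "1", "1"]
two_states = ["a", "a", "b", "b"]   # separates the classes with 2 states
four_states = ["a", "b", "c", "d"]  # separates the classes with 4 states
```

Both attributes reach the same information gain on this toy data, but `four_states` has a larger H(X), so its ratio is lower. This is the penalty on variables with many states mentioned above.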

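Previous knowledge
Imprecise Info-Gain (Abellán & Moral, 2003)
- Representing the information from a database
Imprecise Dirichlet Model (IDM): probability estimation
$$P(c_j) \in \left[ \frac{n_{c_j}}{N + s}, \; \frac{n_{c_j} + s}{N + s} \right] \equiv I_{c_j}$$
with $n_{c_j}$ the number of cases with $C = c_j$, $N$ the sample size, and $s > 0$ the IDM hyperparameter.
Credal sets
$$K(C) = \{ q \mid q(c_j) \in I_{c_j} \}, \qquad K(C \mid X_i = x_i) = \{ q \mid q(c_j) \in I_{\{c_j, x_i\}} \}$$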
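A short sketch of the IDM probability intervals computed from class counts. The default `s=1.0` is an assumption (the slides do not state the value used), and `idm_intervals` is a hypothetical name.

```python
def idm_intervals(counts, s=1.0):
    """Map each class state to its IDM interval [n_c / (N + s), (n_c + s) / (N + s)]."""
    n = sum(counts.values())
    return {c: (n_c / (n + s), (n_c + s) / (n + s)) for c, n_c in counts.items()}


# Hypothetical counts: 6 cases with "absent" and 3 with "present".
intervals = idm_intervals({"absent": 6, "present": 3}, s=1.0)
```

The width of every interval is s / (N + s), so the intervals narrow as the sample size grows.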

Previous knowledge
Split criteria for decision trees:
Imprecise Info-Gain (Abellán & Moral, 2003)
- Select the attribute variable with the highest positive value of:
$$\mathrm{IIG}(X_i, C) = S(K(C)) - \sum_t P(x_i^t) \, S(K(C \mid X_i = x_i^t))$$
with S the maximum entropy function on a credal set.
- Global uncertainty measure ⊃ conflict & non-specificity
- Conflict favours branching.
- Non-specificity tends to reduce branching.
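The criterion can be sketched as follows, under two stated assumptions: the maximum entropy on an IDM credal set is obtained by spreading the extra mass s over the smallest counts until they level out, and P(x_i^t) is estimated by relative frequencies. The function names, the default `s=1.0`, and the toy data are hypothetical.

```python
from math import log2


def max_entropy_idm(counts, s=1.0):
    """Maximum-entropy distribution of the IDM credal set (returned in ascending order)."""
    values = sorted(counts)
    n = sum(values)
    level, k, remaining = values[0], 1, s
    # Raise the k smallest counts to the next count while mass remains.
    while k < len(values):
        need = k * (values[k] - level)
        if need >= remaining:
            break
        remaining -= need
        level = values[k]
        k += 1
    level += remaining / k  # share the rest equally among the k smallest
    return [max(v, level) / (n + s) for v in values]


def upper_entropy(counts, s=1.0):
    """S(K): Shannon entropy (bits) of the maximum-entropy distribution."""
    return -sum(p * log2(p) for p in max_entropy_idm(counts, s) if p > 0)


def imprecise_info_gain(xs, cs, s=1.0):
    """IIG(X, C) = S(K(C)) - sum_t P(x^t) * S(K(C | X = x^t))."""
    classes = sorted(set(cs))
    gain = upper_entropy([cs.count(c) for c in classes], s)
    for x in set(xs):
        branch = [c for xi, c in zip(xs, cs) if xi == x]
        weight = len(branch) / len(cs)  # relative frequency (assumption)
        gain -= weight * upper_entropy([branch.count(c) for c in classes], s)
    return gain


cs = ["0"] * 4 + ["1"] * 4
informative = ["a"] * 4 + ["b"] * 4  # two states, separates the classes
identifier = list("abcdefgh")        # one state per case
```

On this toy data the two-state attribute gets a positive IIG, while the identifier-like attribute gets none: each of its branches holds a single case, so the credal set there is wide and its maximum entropy is high. This illustrates how non-specificity works against branching.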

Previous Knowledge
INFORMATIVE ORDER by IIG FOR THE ROOT NODE (Abellán & Moral, 2003)
[Figure: M decision trees built from the same training set (DB)]
- Tree 1: root node = most informative variable
- Tree 2: root node = second most informative variable
- ...
- Tree M: root node = M-th most informative variable

Previous Knowledge
New observation x
C: class variable, with states {ci, i = 1, …, k}
[Figure: the observation x is classified by each of the m trees]
- Tree 1: (P1(c1|x), …, P1(ck|x))
- Tree 2: (P2(c1|x), …, P2(ck|x))
- ...
- Tree m: (Pm(c1|x), …, Pm(ck|x))
P(C|x) = Average(Pi(C|x))
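A minimal sketch of the combination step: the class distributions returned by the individual trees for one observation are averaged, and the state with the highest average is selected. The function names and the three example distributions are hypothetical.

```python
def average_distributions(distributions):
    """Average the per-tree class distributions P_i(C | x), state by state."""
    m = len(distributions)
    return {c: sum(d[c] for d in distributions) / m for c in distributions[0]}


def combined_prediction(distributions):
    """Return the class state with the highest averaged probability, and the average."""
    averaged = average_distributions(distributions)
    return max(averaged, key=averaged.get), averaged


# Hypothetical outputs of three trees for the same observation x.
tree_outputs = [
    {"absent": 0.2, "present": 0.8},
    {"absent": 0.6, "present": 0.4},
    {"absent": 0.3, "present": 0.7},
]
label, averaged = combined_prediction(tree_outputs)
```

Because each tree is evaluated independently before the average is taken, the per-tree predictions could be computed in parallel.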

Outline
1. Introduction
2. Previous knowledge
3. Experimentation
4. Conclusions & future work

Experimentation
Databases
- 27 UCI databases.
- Preprocessing:
  - Replacement of missing data
  - Discretization
- 10-fold cross-validation repeated 10 times.
- Comparison with a corrected paired t-test at the 5% significance level.
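A sketch of the test statistic, assuming "corrected paired t-test" refers to the corrected resampled t-test (Nadeau & Bengio) that WEKA's Experimenter offers; the slides do not spell out the variant. The function name and the example differences are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def corrected_t_statistic(diffs, test_train_ratio):
    """Corrected resampled t statistic for paired accuracy differences.

    diffs: one accuracy difference per train/test split
           (100 values for 10 repetitions of 10-fold cross-validation).
    test_train_ratio: test size / training size (1/9 for 10-fold CV).
    """
    k = len(diffs)
    var = variance(diffs)  # sample variance (divides by k - 1)
    if var == 0:
        raise ValueError("all differences are identical; statistic undefined")
    return mean(diffs) / sqrt((1.0 / k + test_train_ratio) * var)


# Hypothetical accuracy differences between two classifiers on four splits.
diffs = [0.01, 0.03, 0.02, 0.04]
t_corrected = corrected_t_statistic(diffs, test_train_ratio=1 / 9)
t_standard = mean(diffs) / sqrt(variance(diffs) / len(diffs))
```

The statistic is compared against a Student t distribution with k - 1 degrees of freedom. The extra variance term makes it smaller in magnitude than the standard paired t statistic, which compensates for the overlap between training sets across folds.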

Experimentation
Naive-Bayes comparison
- Adding decision trees improves the results.
- The optimal number of decision trees depends on the database.
- Adding decision trees causes no degradation.

Experimentation
Naive-Bayes comparison
- Audiology: Significant improvement with 3 trees.
- German: No differences with 4 trees.
- Optdigits: No differences with 5 trees.

Experimentation
J48 comparison
- Letter: Significant improvement with 2 trees.
- Mfeat: Large improvement with 6 trees.
- Vowel: Large improvement with 6 trees.

Experimentation
Summary comparison
"The more decision trees are combined, the better the results obtained."

State-of-the-art Classifiers
Classifiers compared: Combined, NB, J48, AODE, TAN, SVM

Time Complexity Analysis
- Training time:
- Test time:
- Average number of trees: 22.88

Conclusions & future work
- We have presented a simple method for combining decision trees obtained from the IDM and uncertainty measures.
- By combining a small number of simple classification trees, it is possible to obtain considerable accuracy improvements.
- This method can be easily parallelized and, consequently, the classification task can be sped up.
- Future work: apply this method to large data sets (text, gene analysis…).
- Future work: study methods for weighting decision trees.
