3. Classical tree-based methods:
1. Pick a variable (one dimension)
2. Find the best split on that variable to separate the classes
(Gini index as a measure of purity)
3. Repeat steps 1-2 over all variables and choose the best separating variable
4. Go back to step 1 within each child node (with fewer observations), until all nodes in the tree are pure
The feature space is now partitioned into a set of axis-aligned rectangles
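The steps above can be sketched in a few lines. This is a minimal illustration of a CART-style axis-aligned split search, assuming the procedure the slide describes (it is not the authors' code): for each variable, sort the values and score every threshold between consecutive points by weighted Gini impurity.

```python
# Minimal sketch of one axis-aligned split search (illustration only):
# for each variable, sort values and score every threshold between
# consecutive points by weighted Gini impurity.
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_axis_split(X, y):
    """Return (variable index, threshold) minimizing weighted Gini."""
    n = len(y)
    best = (None, None, float("inf"))
    for j in range(len(X[0])):                      # step 1: pick a variable
        order = sorted(range(n), key=lambda i: X[i][j])
        for k in range(1, n):                       # step 2: candidate splits
            left = [y[order[i]] for i in range(k)]
            right = [y[order[i]] for i in range(k, n)]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                thr = (X[order[k - 1]][j] + X[order[k]][j]) / 2
                best = (j, thr, score)
    return best[0], best[1]
```

Steps 3-4 of the slide then recurse: apply this search within each child node until the nodes are pure.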
4. Newly proposed algorithm:
1. Pick two variables
2. Find the best split in their 2D space to separate the classes (Gini-based criterion)
3. Repeat steps 1-2 over all pairs and choose the best separating pair of variables
4. Go back to step 1 within each child node (with fewer observations), until all nodes in the tree are pure
The feature space is now partitioned by oblique linear splits
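A sketch of the 2D step, as I read the slides (not the authors' code): candidate split directions come from lines through pairs of sample points, and projecting all points onto the normal of each candidate direction reduces the search to a 1D threshold scan.

```python
# Sketch of the 2D oblique-split search (my reading of the slides):
# candidate directions come from lines through pairs of points;
# projecting onto each direction's normal gives a 1D threshold scan.
from collections import Counter
from itertools import combinations

def gini(labels):
    n = len(labels)
    return 0.0 if n == 0 else 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_oblique_split(pts, y):
    """pts: list of (x1, x2). Returns ((a, b), c) for the split a*x1 + b*x2 <= c."""
    n = len(y)
    best = (None, None, float("inf"))
    for i, j in combinations(range(n), 2):          # slope from a point pair
        a = pts[i][1] - pts[j][1]                   # normal (-dy, dx) of the
        b = pts[j][0] - pts[i][0]                   # direction between them
        order = sorted(range(n), key=lambda m: a * pts[m][0] + b * pts[m][1])
        for k in range(1, n):                       # scan the 1D ordering
            left = [y[order[m]] for m in range(k)]
            right = [y[order[m]] for m in range(k, n)]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                lo = a * pts[order[k - 1]][0] + b * pts[order[k - 1]][1]
                hi = a * pts[order[k]][0] + b * pts[order[k]][1]
                best = ((a, b), (lo + hi) / 2, score)
    return best[0], best[1]
```

The brute-force rescoring here is what the later slides improve on: the orderings for successive slopes differ by single swaps, so the Gini can be updated instead of recomputed.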
7. Finding the best split in 2D:
[Figure: five labelled points A-E in the plane, with candidate split lines numbered 1-5]
Slopes of the lines through each pair of points, sorted:
DE  -3
AB  -1
CE  -0.5
AE  0.25
BE  0.67
AC  1
AD  1.33
CD  2
BD  2.5
BC  3
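A slope table like this one can be produced mechanically: enumerate every pair of labelled points and sort by the slope of the line through them. The coordinates below are made up for illustration; the slide's actual coordinates for A-E are not given in the text.

```python
# Enumerate pairwise slopes and sort them (hypothetical coordinates;
# the slide's actual points A-E are not reproduced in the text).
from itertools import combinations

def pair_slopes(points):
    """Slopes of the lines through every pair of labelled points, sorted."""
    slopes = []
    for (na, (xa, ya)), (nb, (xb, yb)) in combinations(points.items(), 2):
        if xb != xa:                     # skip vertical pairs
            slopes.append((na + nb, (yb - ya) / (xb - xa)))
    return sorted(slopes, key=lambda t: t[1])

pts = {"A": (1.0, 2.0), "B": (3.0, 4.0), "C": (2.0, 1.0)}  # hypothetical
print(pair_slopes(pts))
```

Each slope in the sorted list marks a direction at which two projected points coincide, which is why the orderings on the following slides change by one swap at a time.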
8. Finding the best split in 2D:
[Figure: the same five points A-E with the candidate split lines 1-5 shown at a different sweep position; slope table as on the previous slide]
9. Finding the best split in 2D:
[Figure: the same five points A-E with the candidate split lines 1-5 shown at a further sweep position; slope table as before]
10. Ordering of the points:
As the projection direction rotates past each pairwise slope (in the sorted order DE, AB, CE, AE, BE, AC, ...), exactly one adjacent pair of points swaps in the 1D ordering; the numbers are the split-criterion values between adjacent points:
A .3 B .47 C .47 D .4 E
A B C E .3 D (after crossing slope DE)
B .3 A C E D (after crossing slope AB)
B A E (same) C D (after crossing slope CE)
B E .26 A C D (after crossing slope AE)
E (same) B A C D (after crossing slope BE)
E B C 0 A D (after crossing slope AC)
11. Algorithm Performance:
O(n⋅n1 - n1²), i.e. between O(n - 1) and O(n²),
where n is the sample size and
n1 ∈ (0, n) is the size of the biggest class
No need to recalculate the Gini index from scratch
(it can easily be updated incrementally)
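A sketch of the incremental update this alludes to (my illustration, not the authors' code): keep class counts on each side of the split; moving one point across the boundary changes a single count, so each new score costs O(number of classes) rather than a full recount.

```python
# Incremental Gini scoring of every prefix/suffix split in one pass:
# moving one point across the boundary changes one count per side.
from collections import Counter

def gini_from_counts(counts, n):
    """Gini impurity from a dict of class counts summing to n."""
    return 0.0 if n == 0 else 1.0 - sum(c * c for c in counts.values()) / (n * n)

def sweep_ginis(labels):
    """Weighted Gini of every prefix/suffix split of `labels`."""
    n = len(labels)
    left, right = Counter(), Counter(labels)
    scores = []
    for k, lab in enumerate(labels[:-1], start=1):
        left[lab] += 1                   # one count goes up on the left...
        right[lab] -= 1                  # ...and down on the right
        scores.append((k * gini_from_counts(left, k)
                       + (n - k) * gini_from_counts(right, n - k)) / n)
    return scores
```

Since consecutive slope orderings differ by a single adjacent swap, the same bookkeeping carries over from one candidate direction to the next.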
24. Trees comparison:
Orthogonal tree:
  Petal Length < 2.45 → Class 1
  else, Petal Width < 1.75:
    Petal Length < 4.95 → Class 2, else Class 3
  else → Class 3
Oblique tree:
  SW > (.79)⋅SL - 1.27 → Class 1
  else, PW > 3.72 - (.45)⋅PL:
    PW > (.3)⋅SW + .81 → Class 3, else Class 2
  else → Class 2
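The two trees read naturally as prediction functions. This is my transcription of the flattened diagrams, assuming the usual Iris abbreviations (SL/SW = sepal length/width, PL/PW = petal length/width) and that a true condition takes the branch listed first.

```python
# The two comparison trees as prediction functions (my transcription
# of the flattened slide diagrams; branch orientation is assumed).
def orthogonal_tree(SL, SW, PL, PW):
    if PL < 2.45:
        return 1
    if PW < 1.75:
        return 2 if PL < 4.95 else 3
    return 3

def oblique_tree(SL, SW, PL, PW):
    if SW > 0.79 * SL - 1.27:           # oblique split in the SL-SW plane
        return 1
    if PW > 3.72 - 0.45 * PL:           # oblique split in the PL-PW plane
        return 3 if PW > 0.3 * SW + 0.81 else 2
    return 2
```

Both trees use three splits, but the oblique one replaces axis-aligned thresholds with linear combinations of two variables at each node.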
25. Summary:
Consistently smaller error rate
Fewer trees required for a desired accuracy
Smoother, more adaptive class separation
Provides insight into the data structure (variable selection)
If the separation is orthogonal, stay with the classics