Tree World.
By Leonardo Auslender.
Copyright 2019.
Leonardo ‘dot’ auslender ‘at’ gmail ‘dot’ com
Contents
Varieties of trees.
CART algorithm.
Tree variable selection.
Tree pruning.
Tree variable importance.
Tree model diagnostics.
Sections marked with *** can be skipped at first reading.
Varieties of Tree Methodologies.
CART
Tree (S+)
AID
THAID
CHAID
ID3
C4.5
C5.0
We'll focus on CART methodology.
Basic References.
Breiman L. et al. (1984).
Quinlan J. (1993).
"Easy reading": Auslender L. (1998, 1999, 2000a, 2001).
Bayesian perspective: Chipman et al. (1998).
Many, many other references.
Basic CART Algorithm: binary dependent
variable or target (0,1): Classification Trees.
[Figure: the range of continuous variable A with a splitting point. The original dependent variable is 50% '0's and '1's; the two regions created by the split show class concentrations of 70% and 20% instead.]
With a continuous dependent variable, the criterion is the decrease in variance
from root to nodes: Regression Trees.
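For regression trees the criterion can be written out explicitly (standard CART formulation, stated here for reference rather than taken from the slide): the chosen split s of node t maximizes the decrease in variance

$$ \Delta(s, t) \;=\; \mathrm{Var}(y \mid t) \;-\; \frac{n_{l}}{n_{t}}\,\mathrm{Var}(y \mid t_{l}) \;-\; \frac{n_{r}}{n_{t}}\,\mathrm{Var}(y \mid t_{r}), $$

where t_l and t_r are the child nodes and n_l, n_r, n_t their observation counts.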
Divide and Conquer: recursive
partitioning.
[Tree diagram: root node n = 5,000, 10% Event; split on Debits < 19. Yes branch: n = 3,350, 5% Event; no branch: n = 1,650, 21% Event.]
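As a quick consistency check (computed here, not shown on the slide), the child event rates recombine to the parent rate:

$$ \frac{3{,}350 \times 0.05 \;+\; 1{,}650 \times 0.21}{5{,}000} \;\approx\; 0.103 \;\approx\; 10\%. $$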
Ideal SAS code to find splits (for those who
dare).
proc summary data = .... nway;
   class /* all independent vars */;
   var depvar;
   output out = .... sum = ;
run;
For large data sets (large N, large p),
hardware and software constraints may
prevent completion.
(Binary case.)
Fitted Decision Tree: Interpretation
and structure.
[Figure: fitted decision tree with splits on VAR A (at 19: <19 vs. >=19), VAR B (values 0,1 vs. >1), and VAR C (0-52 vs. >52); the node percentages shown are 5%, 21%, 25%, and 45%.]
Cultivation of Trees.
• Split Search
– Which splits are to be considered?
• Splitting Criterion
– Which split is best?
• Stopping Rule
– When should splitting stop?
• Pruning Rule
– Should some branches be lopped-off?
Splitting Criterion: Gini, twoing, misclassification, entropy,
chi-square, etc.
A) Minimize Gini impurity criterion (favors node homogeneity)
B) Maximize Twoing impurity criterion (favors class separation)
Empirical results: for binary dependent variables, Gini and Twoing are
equivalent. For trinomial, Gini provides more accurate trees. Beyond three
categories, twoing performs better.
Gini impurity:

$$ i(t) \;=\; 1 \;-\; \sum_{k=1}^{K} p(k \mid t)^{2}, \qquad p(k \mid t) = \text{cond. prob. of class } k \text{ in node } t. $$

Twoing criterion:

$$ i(t) \;=\; \frac{P_{l}\,P_{r}}{4} \left[\, \sum_{k=1}^{K} \bigl|\, p(k \mid t_{l}) - p(k \mid t_{r}) \,\bigr| \right]^{2}, $$

where t_l and t_r are the left and right nodes, respectively, and P_l, P_r the proportions of observations they receive.
Choosing between No_claims and Dr. Visits: No_claims yields the lower impurity (0.237),
so the split at or below 0 is chosen. Dr. Visits' impurity is 0.280.
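As a hedged sketch (not the slides' own code), an impurity comparison like the one above can be reproduced in a SAS data step from hypothetical child-node class counts:

/* Gini impurity of a candidate binary split from class counts.     */
/* nl0/nl1 and nr0/nr1 are hypothetical counts of classes 0 and 1   */
/* in the left and right child nodes.                               */
data gini_example;
   nl0 = 90; nl1 = 10;                          /* left child       */
   nr0 = 40; nr1 = 60;                          /* right child      */
   nl = nl0 + nl1; nr = nr0 + nr1; n = nl + nr;
   gini_l = 1 - (nl0/nl)**2 - (nl1/nl)**2;      /* i(t_l)           */
   gini_r = 1 - (nr0/nr)**2 - (nr1/nr)**2;      /* i(t_r)           */
   split_impurity = (nl/n)*gini_l + (nr/n)*gini_r;  /* weighted avg */
   put gini_l= gini_r= split_impurity=;
run;

The candidate split (or variable) with the lowest weighted impurity would be chosen, mirroring the No_claims (0.237) vs. Dr. Visits (0.280) comparison.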
The Right-Sized Tree
Stunting
Pruning
Tree Prediction.
Let {R_1, ..., R_J} be the J disjoint regions (final nodes).
Classification:
Y ∈ {c_1, c_2, ..., c_K}, i.e., Y has K categories ➔ class predictors {F_1, ..., F_K}.
T(X) = arg max (F_1, ..., F_K) (the modal category is the predicted value).
Regression:
Prediction rule: obs X ∈ R_j ➔ T(X) = avg(y_j).
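A hedged SAS sketch of this prediction rule as scoring code (the split point and node statistics are hypothetical; actual generated code appears later in the deck):

data scored;
   set indata;                    /* hypothetical input data set    */
   if x1 < 19 then do;            /* region R1                      */
      pred_class = 0;             /* modal class in R1              */
      pred_mean  = 5.2;           /* avg(y) in R1, for regression   */
   end;
   else do;                       /* region R2                      */
      pred_class = 1;
      pred_mean  = 21.4;
   end;
run;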
Benefits of Trees.
• Interpretability: tree-structured presentation, easy to
conceptualize, but it gets crowded for large trees.
• Mixed Measurement Scales
– Nominal, ordinal, interval variables.
– Regression trees for continuous target variable.
• Robustness: outliers just become additional possible split values.
• Missing Values: treated as one more possible split value.
• Automatic variable selection, and even 'coefficients' (i.e.,
splitting points), because a splitter can be understood as a selected variable,
though not in the linear-model sense.
…Benefits.
• Automatically:
– Detects interactions (as in AID) through its hierarchical
conditioning search, i.e., the hierarchy level is all-important.
– Invariance under monotonic transformations: all that
matters is the ranking of values.
[Figure: predicted probability as a multivariate step function of two inputs.]
Drawbacks of Trees.
• Unstable: small perturbations in the data can lead to big changes in
the tree, because splitting points can change.
• Linear structures are approximated only in very rough form.
• Applications may require that rule descriptions for different
categories not share the same attributes (e.g., in finance, splitters
may appear just once).
Drawbacks of Trees (cont.).
• Tend to over-fit ➔ overly optimistic accuracy (even when pruned).
• Large trees are very difficult to interpret.
• Tree size is conditioned by data set size.
• No valid inferential procedures at present (does it matter?).
• Greedy search algorithm (one variable at a time, one step ahead).
• Difficulty in accepting the final fit, especially for data near boundaries.
• Difficulties when the data contain many missing values (but
other methods can be far worse in this case).
/* PROGRAM ALGOR8.PGM WITH 8 FINAL NODES*/
/* METHOD MISSCL ALACART TEST */
RETAIN ROOT 1;
IF ROOT & CURRDUE <= 105.38 & PASTDUE <= 90.36 & CURRDUE <= 12
THEN DO;
NODE = '4_1 ';
PRED = 0 ;
/* % NODE IMPURITY = 0.0399 ; */
/* BRANCH # = 1 ; */
/* NODE FREQ = 81 ; */
END;
ELSE IF ROOT & CURRDUE <= 105.38 & PASTDUE <= 90.36 & CURRDUE > 12
THEN DO;
NODE = '4_2 ';
PRED = 1 ;
/* % NODE IMPURITY = 0.4478 ; */
/* BRANCH # = 2 ; */
/* NODE FREQ = 212 ; */
END;
ELSE IF ROOT & CURRDUE <= 105.38 & PASTDUE > 90.36
THEN DO;
NODE = '3_2 ';
PRED = 0 ;
END;
/* ... REMAINING NODES OMITTED ... */
Scoring Recipe: example of scoring output generated by TREE-like programs.
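A hedged sketch of how generated scoring code of this shape might be applied to new data (assuming the rules above were saved to ALGOR8.PGM, as the header comment suggests; the data set name is hypothetical):

data scored;
   set newdata;                   /* hypothetical data set to score */
   %include 'algor8.pgm';         /* splices in the IF/THEN rules   */
run;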
Tree Variable Selection
With the same data set, a partial picture of the tree found. Example.
Tree Pruning.
A trained tree can be quite large and attain a seemingly low overall
misclassification rate due to over-fitting. Pruning (Breiman et al.,
1984) aims at remedying this over-fitting problem.
It starts from the tree originally created and selectively recombines nodes,
obtaining a decreasing sequence of sub-trees from the bottom up.
The decision as to which final nodes to recombine depends on
comparing the loss in accuracy from not splitting an intermediate node
with the number of final nodes that that split generates.
The comparison is made across all possible intermediate-node splits, and
the 'minimal cost-complexity' loss in accuracy is the pruning rule.
The sequence of sub-trees ends at the root node. The decision
as to which tree among the sub-trees to use rests on one of
two methods: 1) cross-validation, or 2) a test data set.
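The 'minimal cost-complexity' rule can be written compactly (the standard Breiman et al., 1984 formulation, stated here for reference):

$$ R_{\alpha}(T) \;=\; R(T) \;+\; \alpha\,|\tilde{T}|, \qquad \alpha \ge 0, $$

where R(T) is the misclassification cost of sub-tree T, |T̃| its number of final nodes, and α the complexity parameter; increasing α from 0 produces the decreasing sequence of sub-trees described above.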
Tree Pruning
1) Cross-validation.
Preferred when the original data set is not 'large'. 'v' samples, stratified on the
dependent variable, are created without replacement. From these, create 'v' training
data sets, each containing (v - 1) of the samples, and 'v' test data
sets, each consisting of the left-out sample. 'v' maximal trees are trained
on the 'v' training sets and pruned.
For instance, let v = 10 and obtain 10 samples from the original data set
without replacement. Then, from the 10 samples, create 10 additional
data sets, each combining 9 of the 10 samples and skipping a different one
each time. The left-out sample is used as test data. We thus obtain 10
training and 10 test samples. Create 10 maximal trees and prune them.
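A hedged SAS sketch of the fold construction (assumes a training data set TRAIN with dependent variable DEPVAR; all names are hypothetical):

/* Assign each observation to one of v = 10 groups, stratified on   */
/* the dependent variable, without replacement.                     */
proc sort data=train; by depvar; run;
proc surveyselect data=train out=folded groups=10 seed=20190908;
   strata depvar;
run;
/* Fold k: test set = GROUPID k; training set = the other 9 groups. */
data trn_fold1 tst_fold1;
   set folded;
   if groupid = 1 then output tst_fold1;
   else output trn_fold1;
run;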
Tree Pruning.
2) Test data set.
The test-data-set method is preferred when the size of the data set is not a
constraint on the estimation process. Split the original data set into training
and test subsets.
Once the maximal tree and the sequence of sub-trees due to pruning are
obtained, 'score' the different sub-trees with the test data set and obtain
the corresponding misclassification rates.
Choose the sub-tree that minimizes the misclassification rate. While
this rate decreases with the number of final nodes during tree
development, on the test data it typically plateaus at some number of
final nodes smaller than the maximal number of final nodes.
Tree Pruning
The test data sets are then used to obtain the misclassification rates of each
pruning subsequence. Index each pruned sub-tree and its
corresponding misclassification rate by its number of final nodes, and
obtain an array of misclassification rates by pruned sub-tree. Choose the tree
size that minimizes the overall misclassification rate.
The final tree is then taken from the original pruning sequence (derived
with the entire sample) at the number of final nodes just described.
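A hedged sketch of the final selection step (assumes a summary data set SUBTREE_RATES with hypothetical columns N_FINAL_NODES and MISCL_RATE, one row per pruned sub-tree):

proc sort data=subtree_rates;
   by miscl_rate n_final_nodes;   /* ties resolved to fewer nodes   */
run;
data best_size;
   set subtree_rates(obs=1);      /* minimum test misclassification */
run;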
Variable Importance
Variable importance can be defined in many ways.
It can be considered a measure of the actual splitting, or of the actual
and potential splitting capability, of all variables.
By actual we mean variables that were used to create splits; by
potential we mean variables that mimic the primary splitter, i.e.,
surrogates. It involves calculating, for each primary splitter and each
surrogate, the improvement in the Gini (or entropy, or chi-square)
index over all internal nodes, weighted by the size of the node.
The final result is scaled so that the maximum value is 1.00.
$$ \mathrm{Importance}(x_{j}) \;=\; \sum_{i=1}^{l} \frac{N_{i}}{N}\,\bigl(\text{improvement in Gini for variable } x_{j} \text{ at node } i\bigr), $$

where the sum runs over the l internal nodes, N_i is the size of node i, and N the total sample size.
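A hedged sketch of the computation (assumes a node-level data set NODE_IMPROVE with hypothetical columns VARIABLE, NODE_N, and IMPROVE holding the Gini improvement credited to each variable at each internal node):

proc sql;
   create table raw_imp as
   select variable, sum(node_n * improve) as raw_importance
   from node_improve
   group by variable;

   create table importance as
   select variable,
          raw_importance / max(raw_importance) as importance
                                 /* scaled so the maximum is 1.00   */
   from raw_imp
   order by importance desc;
quit;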
Fraud Data
Example.
[Figure: fitted tree for M1_TRN_TREES (no-event vs. event), showing the root node, decision (intermediate) nodes, and leaf (final) nodes.]
Requested Tree Models: Names & Descriptions.
(Split conditions by level, with node event probabilities in parentheses; Pred is the final-node predicted probability.)

no_claims < 0.5 (0.142)
   member_duration < 180.5 (0.201)
      total_spend < 4250 (0.718) ➔ Pred = 0.718
      total_spend >= 4250 (0.189)
         optom_presc >= 4.5 (0.444) ➔ Pred = 0.444
         optom_presc < 4.5 (0.177) ➔ Pred = 0.177
no_claims >= 0.5 (0.447)
   no_claims < 3.5 (0.389)
      optom_presc < 3.5 (0.341)
         member_duration < 92.5 (0.672) ➔ Pred = 0.672
         member_duration >= 92.5 (0.299) ➔ Pred = 0.299
      optom_presc >= 3.5 (0.813) ➔ Pred = 0.813
   no_claims >= 3.5 (0.825)
      no_claims < 4.5 (0.65)
         num_members >= 1.5 (0.476) ➔ Pred = 0.476
         num_members < 1.5 (0.842) ➔ Pred = 0.842
      no_claims >= 4.5 (0.947)
         member_duration < 318 (1) ➔ Pred = 1.000
         member_duration >= 318 (0.4) ➔ Pred = 0.400

A bit easier to see.
[Figure: diagnostic plot, rather flat for 0 - 1.]
Final-Nodes Tree Diagnostics: highly non-linear relations
with jagged connecting lines.
[Figure: diagnostics, very similar TRN / VAL.]
Very good performance in terms of TRN lift, relative to logistic.
Gains Table

Pctl  Model Name    Min Prob  Max Prob  % Events  Cum % Events  % Capt. Events  Cum % Capt. Events  Lift  Cum Lift
 10   M1_TRN_TREES    0.400     1.000     69.17       69.17          33.97            33.97          3.39    3.39
 10   M1_VAL_TREES    0.299     1.000     55.45       55.45          28.82            28.82          2.88    2.88
 20   M1_TRN_TREES    0.299     0.299     29.88       49.55          14.63            48.60          1.47    2.43
 20   M1_VAL_TREES      .         .       34.15       44.82          17.67            46.49          1.77    2.32
 30   M1_TRN_TREES    0.217     0.299     24.97       41.35          12.26            60.87          1.22    2.03
 30   M1_VAL_TREES      .         .       26.33       38.65          13.69            60.18          1.37    2.00
 40   M1_TRN_TREES    0.217     0.217     21.73       36.45          10.64            71.51          1.07    1.79
 40   M1_VAL_TREES      .         .       22.20       34.54          11.49            71.66          1.15    1.79
 50   M1_TRN_TREES    0.131     0.217     15.96       32.35           7.84            79.34          0.78    1.59
 50   M1_VAL_TREES      .         .       13.38       30.30           6.96            78.62          0.69    1.57
 60   M1_TRN_TREES    0.131     0.131     13.11       29.14           6.42            85.76          0.64    1.43
 60   M1_VAL_TREES      .         .       11.75       27.22           6.08            84.70          0.61    1.41
 70   M1_TRN_TREES    0.062     0.131     10.49       26.48           5.15            90.91          0.51    1.30
 70   M1_VAL_TREES      .         .        8.92       24.60           4.64            89.34          0.46    1.28
 80   M1_TRN_TREES    0.062     0.062      6.18       23.94           3.03            93.94          0.30    1.17
 80   M1_VAL_TREES      .         .        6.86       22.39           3.55            92.89          0.36    1.16
 90   M1_TRN_TREES    0.062     0.062      6.18       21.97           3.03            96.97          0.30    1.08
 90   M1_VAL_TREES      .         .        6.86       20.66           3.56            96.45          0.36    1.07
100   M1_TRN_TREES    0.062     0.062      6.18       20.39           3.03           100.00          0.30    1.00
100   M1_VAL_TREES      .         .        6.86       19.28           3.55           100.00          0.36    1.00
[Figure: lift, cumulative lift, and best lift.]
[Figure: precision + classification; similar for VAL.]
Comparing Gains-Chart Info with Precision-Recall.
The gains chart provides information on the cumulative # of
events per descending percentile / bin. These bins contain a
fixed number of observations.
Precision-recall, instead, works at the probability level, not at the bin
level, and thus the # of observations along the curve is not
uniform. Selecting a cutoff point from the gains chart therefore
invariably selects from within a range of probabilities,
whereas selecting from precision-recall selects a specific probability
point.
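A hedged sketch of how such a gains table can be built (assumes a scored data set SCORED with hypothetical columns P_EVENT, the predicted probability, and TARGET, the 0/1 event flag):

/* Bin observations into deciles of descending predicted probability */
proc rank data=scored out=ranked groups=10 descending;
   var p_event;
   ranks bin;                     /* bin 0 = highest probabilities   */
run;
proc sql;
   select (bin + 1) * 10                          as pctl,
          min(p_event)                            as min_prob,
          max(p_event)                            as max_prob,
          mean(target) * 100                      as pct_events,
          100 * sum(target)
              / (select sum(target) from ranked)  as pct_captured,
          mean(target)
              / (select mean(target) from ranked) as lift
   from ranked
   group by bin;
quit;

The cumulative columns would take one more pass, e.g., a data step with retained running sums over the percentile rows.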
References
Auslender L. (1998): Alacart, poor man's classification trees, NESUG.
Breiman L., Friedman J., Olshen R., Stone C. (1984): Classification and Regression Trees,
Wadsworth.
Chipman H., George E., McCulloch R. (2010): BART: Bayesian additive regression trees,
The Annals of Applied Statistics, 4(1), 266-298.
Friedman J. (2001): Greedy function approximation: a gradient boosting machine,
The Annals of Statistics, 29, 1189-1232. doi:10.1214/aos/1013203451.
Quinlan J. Ross (1993): C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers.
The End.