Combining decision trees
based on imprecise
probabilities and
uncertainty measures
J. Abellán, A. R. Masegosa
Department of Computer Science and A.I.
University of Granada
Spain
Outline
1. Introduction
2. Previous knowledge
3. Experimentation
4. Conclusions & future work
Introduction
Classification tree (decision tree)
[Figure: classification tree with attribute nodes Tumor and Calcium and leaves "Classification: absent", "Classification: absent", "Classification: present"]
Node: attribute variable
Leaf: case of the class variable
• SPLIT CRITERION
• STOP CRITERION
• 1 LEAF = 1 RULE
Introduction
Classification tree. New observation
• Observation: (high, a1, absent, present)
• Variables: [Calcium, Tumor, Coma, Migraine]
• Classification: Cancer present
[Figure: classification tree with nodes Calcium (branches: normal, high) and Tumor (branches: a0, a1) and leaves "Absent 0.9", "Absent 0.7", "Present 0.8"]
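As a rough illustration of how a new observation is routed through such a tree, the sketch below walks from the root to a leaf. The exact layout of the tree (Calcium at the root, and which leaf hangs under which branch) is an assumption reconstructed from the slide labels, and the names `TREE` and `classify` are hypothetical.

```python
# Hypothetical tree layout: the slide only lists the node names, the branch
# labels and the leaf values, so the arrangement below is an assumption.
TREE = {
    "attribute": "Calcium",
    "branches": {
        "normal": {"label": "absent", "prob": 0.9},
        "high": {
            "attribute": "Tumor",
            "branches": {
                "a0": {"label": "absent", "prob": 0.7},
                "a1": {"label": "present", "prob": 0.8},
            },
        },
    },
}


def classify(tree, observation):
    """Follow the branch matching the observed value until a leaf is reached."""
    node = tree
    while "attribute" in node:
        node = node["branches"][observation[node["attribute"]]]
    return node["label"], node["prob"]


variables = ["Calcium", "Tumor", "Coma", "Migraine"]
observation = dict(zip(variables, ["high", "a1", "absent", "present"]))
label, prob = classify(TREE, observation)
```

Only the variables that appear on the path (here Calcium and Tumor) influence the result; Coma and Migraine are never consulted.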
Introduction
Combination of Decision Trees
[Figure: m decision trees, each built from the training set (DB), following the INFORMATIVE ORDER FOR THE ROOT NODE (Abellán & Masegosa, 2007)]
Observation X
Tree 1: (P1(C1|X),…,P1(Cn|X))
Tree 2: (P2(C1|X),…,P2(Cn|X))
…
Tree m: (Pm(C1|X),…,Pm(Cn|X))
P(C|X) = Average(Pi(C|X))
Introduction
Approach of the work presented
• Show how the combination of a few decision trees obtained by a simple method from the IDM produces large improvements.
• As reference, we use NAIVE BAYES and J48 (improved version of C4.5).
• We carry out EXPERIMENTS on a large set of data bases.
• For the comparison of results, we use:
  • PERCENTAGE OF CORRECT CLASSIFICATIONS
  • NUMBER OF SELECTED VARIABLES
  • RUN TIME
Tools: WEKA & Elvira
Previous knowledge
Naive Bayes (Duda & Hart, 1973)
• Attribute variables {Xi | i=1,..,r}
• Class variable C with states in {c1,..,ck}
• Select the state of C: $\arg\max_{c_i} P(c_i \mid X)$.
• Assumption of independence of the attributes given the class variable: $\arg\max_{c_i} P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i)$, where $z_j$ is the observed value of $X_j$.
[Figure: graphical structure, with the class node C as parent of X1, X2, …, Xr]
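The decision rule above can be sketched in a few lines for discrete attributes. This is a minimal illustration, not the WEKA implementation used in the experiments: probabilities are plain relative frequencies with no smoothing (an assumption), and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(j, c)][z] = number of cases of class c with X_j = z
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for j, z in enumerate(row):
            value_counts[(j, c)][z] += 1
    return class_counts, value_counts


def predict_naive_bayes(model, row):
    """Return arg max over c of P(c) * prod_j P(z_j | c), plus all scores."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total
        for j, z in enumerate(row):
            score *= value_counts[(j, c)][z] / n_c
        scores[c] = score
    return max(scores, key=scores.get), scores


# Toy data (made up for illustration): columns are (Calcium, Tumor).
rows = [("high", "a1"), ("high", "a0"), ("normal", "a0"), ("normal", "a0")]
labels = ["present", "absent", "absent", "absent"]
model = train_naive_bayes(rows, labels)
predicted, scores = predict_naive_bayes(model, ("high", "a1"))
```

Without smoothing, a value never seen with a class drives that class score to zero, which is why practical implementations usually add a correction to the counts.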
Previous knowledge
J48 Classifier
• Selects the attribute variable with the highest positive value of IGR(Xi,C) = IG(Xi,C) / H(Xi)
J48 (improved version of C4.5)
• Works with data bases with continuous variables
• Has a posterior pruning process
• Penalizes the use of variables with a higher number of cases
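A small sketch of the Info-Gain Ratio for discrete data may make the penalty concrete: dividing by H(Xi) lowers the score of attributes with many values. The function names are hypothetical, entropies are in bits, and returning 0 when H(Xi) = 0 is a convention chosen here, not something stated in the slides.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def info_gain_ratio(xs, cs):
    """IGR(X, C) = IG(X, C) / H(X), with IG(X, C) = H(C) - H(C | X)."""
    n = len(xs)
    conditional = 0.0
    for value, count in Counter(xs).items():
        subset = [c for x, c in zip(xs, cs) if x == value]
        conditional += (count / n) * entropy(subset)
    gain = entropy(cs) - conditional
    h_x = entropy(xs)
    return gain / h_x if h_x > 0 else 0.0


# Toy example: the attribute separates the two classes perfectly.
igr = info_gain_ratio(["a", "a", "b", "b"], ["0", "0", "1", "1"])
```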
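Previous knowledge
Imprecise Info-Gain (Abellán & Moral, 2003)
• Representing the information from a data base
Imprecise Dirichlet Model (IDM): probability estimation
$$P(c_j) \in \left[ \frac{n_{c_j}}{N+s},\ \frac{n_{c_j}+s}{N+s} \right] \equiv I_{c_j}$$
where $n_{c_j}$ is the number of cases with $C = c_j$, $N$ is the total number of cases and $s$ is the IDM parameter.
Credal sets
$$K(C) = \{\, q \mid q(c_j) \in I_{c_j} \,\}, \qquad K(C \mid X_i = x_i^t) = \{\, q \mid q(c_j) \in I_{\{c_j,\, x_i^t\}} \,\}$$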
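The IDM intervals are straightforward to compute from class counts. The sketch below assumes s = 1 as the default value of the parameter (the slides do not state which value was used), and the function name is hypothetical.

```python
def idm_intervals(counts, s=1.0):
    """Map each class to its IDM interval [n / (N + s), (n + s) / (N + s)]."""
    total = sum(counts.values())
    return {
        c: (n / (total + s), (n + s) / (total + s))
        for c, n in counts.items()
    }


# Toy counts (made up): 7 cases of "absent" and 3 of "present".
intervals = idm_intervals({"absent": 7, "present": 3}, s=1.0)
```

Every interval has width s / (N + s), so the intervals shrink as the sample grows and widen when a node contains few cases.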
Previous knowledge
Split criteria for decision trees:
Imprecise Info-Gain (Abellán & Moral, 2003)
• Select the attribute variable with the highest positive value of:
$$IIG(X_i, C) = S(K(C)) - \sum_t P(x_i^t)\, S(K(C \mid X_i = x_i^t))$$
with S the maximum entropy function of a credal set.
• Global uncertainty measure ⊃ conflict & non-specificity
• Conflict favours branching.
• Non-specificity tends to reduce branching.
Previous Knowledge
Combination of Decision Trees
INFORMATIVE ORDER by IIG FOR THE ROOT NODE (Abellán & Moral, 2003)
Tree 1: DB → training set → root node: most informative variable
Tree 2: DB → training set → root node: second most informative variable
…
Tree m: DB → training set → root node: m-th most informative variable
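One way to obtain such an informative order is to score every attribute with the Imprecise Info-Gain and sort. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the maximum entropy over the IDM credal set is computed by sharing the extra mass s among the least frequent classes, P(x) is taken as the relative frequency of each attribute value, s = 1 is assumed, and all function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def max_entropy_idm(counts, s=1.0):
    """Maximum entropy (bits) over the IDM credal set built from class counts."""
    levels = sorted(counts)
    level, remaining, k = levels[0], s, 1
    # Water-filling: raise the k smallest counts to a common level using mass s.
    while k < len(levels) and remaining >= (levels[k] - level) * k:
        remaining -= (levels[k] - level) * k
        level = levels[k]
        k += 1
    level += remaining / k
    total = sum(counts) + s
    probs = [max(n, level) / total for n in counts]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def iig(xs, cs, s=1.0):
    """IIG(X, C) = S(K(C)) - sum_t P(x_t) * S(K(C | X = x_t))."""
    classes = sorted(set(cs))

    def class_counts(subset):
        freq = Counter(subset)
        return [freq[c] for c in classes]  # keep zero counts for absent classes

    score = max_entropy_idm(class_counts(cs), s)
    for value, n_value in Counter(xs).items():
        subset = [c for x, c in zip(xs, cs) if x == value]
        score -= (n_value / len(xs)) * max_entropy_idm(class_counts(subset), s)
    return score


def informative_order(data, cs, s=1.0):
    """Attribute names with positive IIG, most informative first."""
    scores = {name: iig(xs, cs, s) for name, xs in data.items()}
    positive = [name for name in scores if scores[name] > 0]
    return sorted(positive, key=scores.get, reverse=True)


# Toy data (made up): Calcium separates the classes, Coma does not.
cs = ["absent"] * 4 + ["present"] * 4
data = {
    "Calcium": ["normal"] * 4 + ["high"] * 4,
    "Coma": ["absent", "present"] * 4,
}
order = informative_order(data, cs)
```

The i-th tree would then use the i-th attribute of `order` as its root node; attributes whose IIG is not positive are left out.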
Previous Knowledge
Combination of Decision Trees
New observation x
C class variable, with states {ci, i = 1,…,k}
Tree 1: (P1(c1|x),…,P1(ck|x))
Tree 2: (P2(c1|x),…,P2(ck|x))
…
Tree m: (Pm(c1|x),…,Pm(ck|x))
P(C|x) = Average(Pi(C|x))
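The averaging step can be sketched as follows, assuming each tree returns a probability distribution over the class states for the observation. The function name and the three toy distributions are hypothetical.

```python
def combine(distributions):
    """Average the class distributions of m trees and pick the best class."""
    m = len(distributions)
    classes = distributions[0].keys()
    average = {c: sum(d[c] for d in distributions) / m for c in classes}
    return max(average, key=average.get), average


# Toy outputs of three trees for the same observation (made-up numbers).
tree_outputs = [
    {"absent": 0.2, "present": 0.8},
    {"absent": 0.6, "present": 0.4},
    {"absent": 0.3, "present": 0.7},
]
label, average = combine(tree_outputs)
```

Here the second tree alone would predict "absent", but the averaged distribution favours "present".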
Experimentation
Data Bases
• 27 UCI data bases.
• Preprocessing:
  • Replace missing data
  • Discretization
• 10-fold cross-validation repeated 10 times.
• Comparison with a corrected paired t-test at the 5% significance level.
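For reference, the statistic of a corrected paired t-test can be sketched as below. This assumes the slides refer to the corrected resampled t-test of Nadeau and Bengio, which is what WEKA's paired corrected tester implements; the slides do not say so explicitly. The function name and the toy differences are hypothetical.

```python
import math
import statistics


def corrected_paired_t(diffs, test_train_ratio):
    """Corrected resampled t statistic and its degrees of freedom.

    diffs: per-fold accuracy differences between two classifiers
           (100 values for 10 x 10-fold cross-validation).
    test_train_ratio: n_test / n_train (1/9 for 10-fold cross-validation).
    """
    k = len(diffs)
    mean = statistics.mean(diffs)
    variance = statistics.variance(diffs)  # sample variance, must be > 0
    t = mean / math.sqrt((1.0 / k + test_train_ratio) * variance)
    return t, k - 1


# Toy differences (made up); a real run would supply 100 of them.
diffs = [0.01, 0.03, 0.02, 0.02]
t_stat, dof = corrected_paired_t(diffs, test_train_ratio=1 / 9)
```

The statistic is compared with a Student t distribution with k - 1 degrees of freedom. The extra term n_test / n_train inflates the variance to account for the overlap between training sets across folds.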
Experimentation
Naive-Bayes comparison
• Adding decision trees improves the results.
• The optimal number of decision trees depends on the data base.
• No degradation from adding decision trees.
Experimentation
Naive-Bayes comparison
• Audiology: Significant improvement with 3 trees.
• German: No differences with 4 trees.
• Optdigits: No differences with 5 trees.
Experimentation
J48 comparison
• Letter: Significant improvement with 2 trees.
• Mfeat: Large improvement with 6 trees.
• Vowel: Large improvement with 6 trees.
Experimentation
Summary comparison
“The more decision trees are combined, the better the results obtained”
State-of-the-art Classifiers
[Table: comparison of Combined, NB, J48, AODE, TAN, SVM]
• Average number of trees: 22.88
Time Complexity Analysis
• Training time: [Figure: Combined, NB, J48, AODE, TAN, SVM]
• Test time: [Figure: Combined, NB, J48, AODE, TAN, SVM]
Conclusions & future work
• We have presented a simple method for combining decision trees obtained from the IDM and uncertainty measures.
• By combining a small number of simple classification trees, it is possible to obtain considerable accuracy improvements.
• This method can be easily parallelized and, consequently, the classification task can be sped up.
• Apply this method to large data sets (text, gene analysis…).
• Study methods for weighting decision trees.

Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Measures

  • 1. Combining decision trees based on imprecise probabilities and uncertainty measures. J. Abellán, A. R. Masegosa. Department of Computer Science and A.I., University of Granada, Spain.
  • 2. Outline: 1. Introduction 2. Previous knowledge 3. Experimentation 4. Conclusions & future work
  • 3. Introduction. Classification tree (decision tree). [Figure: example tree with nodes Tumor and Calcium and leaves "Classification: absent", "Classification: absent", "Classification: present".] Node = attribute variable; leaf = case (state) of the class variable. Split criterion; stop criterion; 1 leaf = 1 rule.
  • 4. Introduction. Classification tree: new observation. Observation: (high, a1, absent, present). Variables: [Calcium, Tumor, Coma, Migraine]. Classification: Cancer present. [Figure: tree over Calcium (normal / high) and Tumor (a0 / a1) with leaves Absent 0.9, Absent 0.7, Present 0.8.]
  • 5. Introduction. Combination of decision trees. Several trees are built from the training set of the database (DB), using the informative order for the root node (Abellán & Masegosa, 2007). For an observation $X$, tree $i$ outputs $(P_i(C_1 \mid X),\dots,P_i(C_n \mid X))$, $i = 1,\dots,m$, and the combined estimate is $P(C \mid X) = \mathrm{Average}(P_i(C \mid X))$.
  • 6. Introduction. Approach of the work presented. We show how combining a few decision trees, obtained by a simple method from the IDM, produces large improvements. As references, we use Naive Bayes and J48 (an improved version of C4.5). We carry out experiments on a large set of databases. To compare results, we use: percentage of correct classifications, number of selected variables, and run time. Tools: WEKA & Elvira.
  • 7. Previous knowledge. Naive Bayes (Duda & Hart, 1973). Attribute variables $\{X_i \mid i = 1,\dots,r\}$. Class variable $C$ with states in $\{c_1,\dots,c_k\}$. Select the state of $C$: $\arg\max_{c_i} P(c_i \mid X)$. Assuming the attributes are independent given the class variable: $\arg\max_{c_i} \left( P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i) \right)$. [Figure: graphical structure with $C$ as the parent of $X_1, X_2, \dots, X_r$.]
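The Naive Bayes rule on slide 7 can be sketched in a few lines of Python. This is only an illustration for discrete attributes: the function names, the toy data, and the Laplace smoothing parameter `alpha` are assumptions and do not come from the slides or from WEKA.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels, alpha=1.0):
    """Collect the counts needed for P(c) and P(z_j | c); alpha is an assumed smoothing constant."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> counts of attribute values
    values = defaultdict(set)            # attribute index -> set of observed values
    for row, c in zip(rows, labels):
        for j, z in enumerate(row):
            value_counts[(j, c)][z] += 1
            values[j].add(z)
    return class_counts, value_counts, values, alpha

def predict(model, row):
    """Return arg max over classes of log P(c) + sum_j log P(z_j | c)."""
    class_counts, value_counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, nc in class_counts.items():
        score = math.log(nc / n)
        for j, z in enumerate(row):
            smoothed = (value_counts[(j, c)][z] + alpha) / (nc + alpha * len(values[j]))
            score += math.log(smoothed)
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical toy data with attributes (Calcium, Tumor)
rows = [("high", "a1"), ("high", "a0"), ("normal", "a0"), ("normal", "a1")]
labels = ["present", "present", "absent", "absent"]
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("high", "a1"))
```

Working in log space avoids underflow when the product runs over many attributes.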
  • 8. Previous knowledge. J48 classifier (an improved version of C4.5). Selects the attribute variable with the highest positive value of $IGR(X_i, C) = IG(X_i, C) / H(X_i)$. Works with continuous data. Has a posterior pruning process. Penalizes variables with a higher number of cases (states).
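A minimal sketch of the info-gain ratio from slide 8, assuming discrete attributes and base-2 entropies. It implements only the ratio IG/H; the additional heuristics that C4.5 and J48 apply when choosing a split are left out, and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    n = len(items)
    return -sum((k / n) * math.log2(k / n) for k in Counter(items).values())

def info_gain_ratio(attr_values, labels):
    """IGR(X, C) = IG(X, C) / H(X) computed from relative frequencies."""
    n = len(labels)
    groups = {}
    for x, c in zip(attr_values, labels):
        groups.setdefault(x, []).append(c)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    h_x = entropy(attr_values)
    return info_gain / h_x if h_x > 0 else 0.0

labels = ["y", "y", "n", "n"]
two_states = info_gain_ratio(["a", "a", "b", "b"], labels)   # IG = 1 bit, H(X) = 1 bit
four_states = info_gain_ratio(["a", "b", "c", "d"], labels)  # IG = 1 bit, H(X) = 2 bits
```

Both attributes separate the classes perfectly, but the four-state attribute gets half the score, which is the penalization mentioned on the slide.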
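  • 9. Previous knowledge. Imprecise Info-Gain (Abellán & Moral, 2003). Representing the information from a database with the Imprecise Dirichlet Model (IDM). Probability estimation: $P(c_j) \in \left[ \frac{n_{c_j}}{N+s},\ \frac{n_{c_j}+s}{N+s} \right] \equiv I_{c_j}$, where $n_{c_j}$ is the number of cases of class $c_j$, $N$ the sample size and $s$ the IDM hyperparameter. Credal sets: $K(C) = \{ q \mid q(c_j) \in I_{c_j} \}$ and $K(C \mid X_i = x_i^t) = \{ q \mid q(c_j) \in I_{\{c_j, x_i^t\}} \}$.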
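The IDM intervals on slide 9 are simple to compute from class counts. The sketch below assumes `s = 1.0` as the default, since the slide does not state which value the authors used; the function name and the counts are hypothetical.

```python
def idm_intervals(counts, s=1.0):
    """Map each class to its IDM interval [n_c / (N + s), (n_c + s) / (N + s)]."""
    n = sum(counts.values())
    return {c: (k / (n + s), (k + s) / (n + s)) for c, k in counts.items()}

# Hypothetical counts at a leaf: 7 cases "absent", 3 cases "present"
intervals = idm_intervals({"absent": 7, "present": 3})
```

With these counts and `s = 1`, the interval for "absent" is [7/11, 8/11] and the interval for "present" is [3/11, 4/11]. The width `s / (N + s)` shrinks as the sample grows.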
  • 10. Previous knowledge. Split criteria for decision trees: Imprecise Info-Gain (Abellán & Moral, 2003). Select the attribute variable with the highest positive value of $IIG(X_i, C) = S(K(C)) - \sum_t P(x_i^t)\, S(K(C \mid X_i = x_i^t))$, with $S$ the maximum entropy function of a credal set. Maximum entropy is a global uncertainty measure that comprises conflict and non-specificity. Conflict favours ramification (branching); non-specificity tends to reduce it.
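A sketch of the Imprecise Info-Gain computation under stated assumptions: the maximum entropy over an IDM credal set is obtained by sharing the extra mass `s` among the least frequent classes (a water-filling step), `P(x_i^t)` is taken as the relative frequency of each attribute value, and `s = 1.0` is an assumed default. The function names are illustrative and are not taken from the authors' code.

```python
import math
from collections import Counter

def max_entropy_idm(counts, s=1.0):
    """Maximum entropy (bits) over the IDM credal set defined by the class counts."""
    n = sorted(counts)
    # Water-filling: raise the k lowest counts to a common level using the mass s.
    for k in range(1, len(n) + 1):
        level = (sum(n[:k]) + s) / k
        if k == len(n) or level <= n[k]:
            break
    total = sum(n) + s
    probs = [max(x, level) / total for x in n]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def imprecise_info_gain(attr_values, labels, s=1.0):
    """IIG(X, C) = S(K(C)) - sum_t P(x^t) S(K(C | X = x^t)), with P(x^t) as a relative frequency."""
    classes = sorted(set(labels))

    def upper_entropy(subset):
        tally = Counter(subset)
        return max_entropy_idm([tally[c] for c in classes], s)

    groups = {}
    for x, c in zip(attr_values, labels):
        groups.setdefault(x, []).append(c)
    n = len(labels)
    return upper_entropy(labels) - sum(len(g) / n * upper_entropy(g) for g in groups.values())

labels = ["y", "y", "n", "n"]
iig_two_states = imprecise_info_gain(["a", "a", "b", "b"], labels)
iig_four_states = imprecise_info_gain(["a", "b", "c", "d"], labels)
```

On this toy data the two-state attribute gets a positive score, while the attribute with one state per case scores 0, although its classical info-gain is 1 bit. Each branch then holds a single case, so the credal set at that branch is wide and its maximum entropy stays at 1 bit; this is how non-specificity limits ramification.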
  • 11. Previous knowledge. Combination of decision trees. Informative order by IIG for the root node (Abellán & Moral, 2003). From the training set of the database (DB): tree 1 takes the most informative variable as its root, tree 2 the second most informative variable, …, tree M the M-th most informative variable.
  • 12. Previous knowledge. Combination of decision trees. New observation $x$; class variable $C$ with states $\{c_i,\ i = 1,\dots,k\}$. Tree $j$ outputs $(P_j(c_1 \mid x),\dots,P_j(c_k \mid x))$ for $j = 1,\dots,m$, and the combined estimate is $P(C \mid x) = \mathrm{Average}_j\, P_j(C \mid x)$.
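The combination rule on slide 12 is a plain average of the class distributions returned by the individual trees. A minimal sketch, assuming each tree reports a dictionary from class state to probability; the three distributions below are made-up values for illustration.

```python
def combine(distributions):
    """Average the class distributions produced by m trees for one observation."""
    m = len(distributions)
    classes = distributions[0].keys()
    return {c: sum(d[c] for d in distributions) / m for c in classes}

# Hypothetical outputs of three trees for the same observation
tree_outputs = [
    {"absent": 0.2, "present": 0.8},
    {"absent": 0.7, "present": 0.3},
    {"absent": 0.1, "present": 0.9},
]
combined = combine(tree_outputs)
predicted = max(combined, key=combined.get)
```

Here the second tree disagrees with the other two, but the averaged distribution still favours "present". Because the trees are independent of each other, their outputs could also be computed in parallel before averaging.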
  • 13. Outline: 1. Introduction 2. Previous knowledge 3. Experimentation 4. Conclusions & future work
  • 14. Experimentation. Databases: 27 UCI databases. Preprocessing: replacement of missing data; discretization. 10-fold cross-validation repeated 10 times. Comparison with a corrected paired t-test at the 5% significance level.
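The slide does not say which correction is applied to the paired t-test. A common choice for repeated cross-validation, and the one sketched here as an assumption, is the corrected resampled t-test, which inflates the variance term by the test/train size ratio. The differences below are made-up numbers, not results from the experiments.

```python
import math
from statistics import mean, variance

def corrected_paired_t(diffs, test_train_ratio):
    """t statistic with the variance term scaled by (1/k + n_test/n_train)."""
    k = len(diffs)
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * variance(diffs))

def standard_paired_t(diffs):
    """Uncorrected paired t statistic, shown only for comparison."""
    k = len(diffs)
    return mean(diffs) / math.sqrt(variance(diffs) / k)

# Hypothetical accuracy differences between two classifiers over 4 train/test splits
diffs = [0.01, 0.03, 0.02, 0.04]
t_corrected = corrected_paired_t(diffs, test_train_ratio=1 / 9)  # 10-fold CV: 1/9
t_standard = standard_paired_t(diffs)
```

For 10-fold cross-validation repeated 10 times there would be k = 100 differences, and the statistic would be compared with the critical value of a t distribution with k - 1 degrees of freedom at the 5% level. The corrected statistic is always smaller in magnitude than the uncorrected one, which makes the test more conservative.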
  • 15. Experimentation. Naive Bayes comparison. Adding decision trees improves the results. The optimal number of decision trees depends on the database. Adding decision trees causes no degradation.
  • 16. Experimentation. Naive Bayes comparison. Audiology: significant improvement with 3 trees. German: no differences with 4 trees. Optdigits: no differences with 5 trees.
  • 17. Experimentation. J48 comparison. Letter: significant improvement with 2 trees. Mfeat: large improvement with 6 trees. Vowel: large improvement with 6 trees.
  • 18. Experimentation. Summary comparison: “The more decision trees are combined, the better the results obtained.”
  • 20. Time complexity analysis. [Figures: training time and test time for Combined, NB, J48, AODE, TAN and SVM.] Average number of trees: 22.88.
  • 21. Outline: 1. Introduction 2. Previous knowledge 3. Experimentation 4. Conclusions & future work
  • 22. Conclusions & future work. We have presented a simple method for combining decision trees obtained from the IDM and uncertainty measures. By combining a small number of simple classification trees, it is possible to obtain considerable accuracy improvements. The method can be easily parallelized, which speeds up the classification task. Future work: apply the method to large data sets (text, gene analysis…); study methods for weighting decision trees.