SlideShare a Scribd company logo
Bias-variance decomposition 
in Random Forests 
Gilles Louppe @glouppe 
Paris ML S02E04, December 8, 2014 
1 / 14
Motivation 
In supervised learning, combining the predictions of several 
randomized models often achieves better results than a single 
non-randomized model. 
Why ? 
2 / 14
Supervised learning 
 The inputs are random variables X = X1, ..., Xp ; 
 The output is a random variable Y . 
 Data comes as a
nite learning set 
L = f(xi , yi )ji = 0, . . . ,N - 1g, 
where xi 2 X = X1  ...  Xp and yi 2 Y are randomly drawn 
from PX,Y . 
 The goal is to
nd a model 'L : X7! Y minimizing 
Err ('L) = EX,Y fL(Y ,'L(X))g. 
3 / 14
Performance evaluation 
Classi
cation 
 Symbolic output (e.g., Y = fyes, nog) 
 Zero-one loss 
L(Y ,'L(X)) = 1(Y6= 'L(X)) 
Regression 
 Numerical output (e.g., Y = R) 
 Squared error loss 
L(Y ,'L(X)) = (Y - 'L(X))2 
4 / 14
Decision trees 
0.7 
0.5 
X1 
X2 
t5 t3 t4 
푡2 
푡1 
푋1 ≤ 0.7 
푡3 
푡4 푡5 
풙 
푝(푌 = 푐|푋 = 풙) 
Split node 
≤  Leaf node 
푋2 ≤ 0.5 ≤  
t 2 ' : nodes of the tree ' 
Xt : split variable at t 
vt 2 R : split threshold at t 
'(x) = arg maxc2Y p(Y = cjX = x) 
5 / 14
Bias-variance decomposition in regression 
Theorem. For the squared error loss, the bias-variance 
decomposition of the expected generalization error at X = x is 
ELfErr ('L(x))g = noise(x) + bias2(x) + var(x) 
where 
noise(x) = Err ('B(x)), 
bias2(x) = ('B(x) - ELf'L(x)g)2, 
var(x) = ELf(ELf'L(x)g - 'L(x))2g. 
6 / 14
Bias-variance decomposition 
7 / 14
Diagnosing the generalization error of a decision tree 
 (Residual error : Lowest achievable error, independent of 'L.) 
 Bias : Decision trees usually have low bias. 
 Variance : They often suer from high variance. 
 Solution : Combine the predictions of several randomized trees 
into a single model. 
8 / 14
Random forests 
풙 
휑1 휑푀 
푝휑1 (푌 = 푐|푋 = 풙) 
… 
푝휑푚 (푌 = 푐|푋 = 풙) 
Σ 
푝휓(푌 = 푐|푋 = 풙) 
Randomization 
 Bootstrap samples  Random selection of K 6 p split variables g Random Forests  Random selection of the threshold g Extra-Trees 
9 / 14
Bias-variance decomposition (cont.) 
Theorem. For the squared error loss, the bias-variance 
decomposition of the expected generalization error 
ELfErr ( L,1,...,M (x))g at X = x of an ensemble of M 
randomized models 'L,m is 
ELfErr ( L,1,...,M (x))g = noise(x) + bias2(x) + var(x), 
where 
noise(x) = Err ('B(x)), 
bias2(x) = ('B(x) - EL,f'L,(x)g)2, 
var(x) = (x)2 
L,(x) + 
1 - (x) 
M 
2 
L,(x). 
and where (x) is the Pearson correlation coecient between the 
predictions of two randomized trees built on the same learning set. 
10 / 14
Interpretation of (x) (Louppe, 2014) 
Theorem. (x) = 
VLfEjLf'L,(x)gg 
VLfEjLf'L,(x)gg+ELfVjLf'L,(x)gg 
In other words, it is the ratio between 
 the variance due to the learning set and 
 the total variance, accounting for random eects due to both 
the learning set and the random perburbations. 
(x) ! 1 when variance is mostly due to the learning set ; 
(x) ! 0 when variance is mostly due to the random 
perturbations ; 
(x)  0. 
11 / 14

More Related Content

What's hot

Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
Functional Imperative
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
Krish_ver2
 
SVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxSVM & KNN Presentation.pptx
SVM & KNN Presentation.pptx
MohamedMonir33
 
High Dimensional Data Visualization using t-SNE
High Dimensional Data Visualization using t-SNEHigh Dimensional Data Visualization using t-SNE
High Dimensional Data Visualization using t-SNE
Kai-Wen Zhao
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLP
Rupak Roy
 
Machine Learning: Bias and Variance Trade-off
Machine Learning: Bias and Variance Trade-offMachine Learning: Bias and Variance Trade-off
Machine Learning: Bias and Variance Trade-off
International Institute of Information Technology (I²IT)
 
Hierarchical clustering.pptx
Hierarchical clustering.pptxHierarchical clustering.pptx
Hierarchical clustering.pptx
NTUConcepts1
 
Learning Vector Quantization LVQ
Learning Vector Quantization LVQLearning Vector Quantization LVQ
Learning Vector Quantization LVQESCOM
 
Clustering
ClusteringClustering
Clustering
NLPseminar
 
Machine Learning Unit 2 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 2 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 2 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 2 Semester 3 MSc IT Part 2 Mumbai University
Madhav Mishra
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basic
Houw Liong The
 
Decision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning AlgorithmDecision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning Algorithm
Palin analytics
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
rameswara reddy venkat
 
LDA classifier: Tutorial
LDA classifier: TutorialLDA classifier: Tutorial
LDA classifier: Tutorial
Alaa Tharwat
 
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
Jin-Hwa Kim
 
Nearest Neighbor Algorithm Zaffar Ahmed
Nearest Neighbor Algorithm  Zaffar AhmedNearest Neighbor Algorithm  Zaffar Ahmed
Nearest Neighbor Algorithm Zaffar AhmedZaffar Ahmed Shaikh
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Madhav Mishra
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Salah Amean
 
Data mining-2
Data mining-2Data mining-2
Data mining-2
Nit Hik
 

What's hot (20)

Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
 
SVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxSVM & KNN Presentation.pptx
SVM & KNN Presentation.pptx
 
High Dimensional Data Visualization using t-SNE
High Dimensional Data Visualization using t-SNEHigh Dimensional Data Visualization using t-SNE
High Dimensional Data Visualization using t-SNE
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLP
 
Machine Learning: Bias and Variance Trade-off
Machine Learning: Bias and Variance Trade-offMachine Learning: Bias and Variance Trade-off
Machine Learning: Bias and Variance Trade-off
 
Hierarchical clustering.pptx
Hierarchical clustering.pptxHierarchical clustering.pptx
Hierarchical clustering.pptx
 
Learning Vector Quantization LVQ
Learning Vector Quantization LVQLearning Vector Quantization LVQ
Learning Vector Quantization LVQ
 
Clustering
ClusteringClustering
Clustering
 
Machine Learning Unit 2 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 2 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 2 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 2 Semester 3 MSc IT Part 2 Mumbai University
 
Capter10 cluster basic
Capter10 cluster basicCapter10 cluster basic
Capter10 cluster basic
 
Decision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning AlgorithmDecision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning Algorithm
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
 
LDA classifier: Tutorial
LDA classifier: TutorialLDA classifier: Tutorial
LDA classifier: Tutorial
 
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
A summary of Categorical Reparameterization with Gumbel-Softmax by Jang et al...
 
Decision tree
Decision treeDecision tree
Decision tree
 
Nearest Neighbor Algorithm Zaffar Ahmed
Nearest Neighbor Algorithm  Zaffar AhmedNearest Neighbor Algorithm  Zaffar Ahmed
Nearest Neighbor Algorithm Zaffar Ahmed
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
 
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis Data mining: Concepts and Techniques, Chapter12 outlier Analysis
Data mining: Concepts and Techniques, Chapter12 outlier Analysis
 
Data mining-2
Data mining-2Data mining-2
Data mining-2
 

Similar to Bias-variance decomposition in Random Forests

Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe
 
1 - Linear Regression
1 - Linear Regression1 - Linear Regression
1 - Linear Regression
Nikita Zhiltsov
 
Estimationtheory2
Estimationtheory2Estimationtheory2
Estimationtheory2
Gopi Saiteja
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon information
Frank Nielsen
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Frank Nielsen
 
Numarical values
Numarical valuesNumarical values
Numarical values
AmanSaeed11
 
Numarical values highlighted
Numarical values highlightedNumarical values highlighted
Numarical values highlighted
AmanSaeed11
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
Charles Deledalle
 
Distributions
DistributionsDistributions
Distributions
adnan gilani
 
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Christian Robert
 
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
The Statistical and Applied Mathematical Sciences Institute
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurations
VjekoslavKovac1
 
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
Frank Nielsen
 
02-Random Variables.ppt
02-Random Variables.ppt02-Random Variables.ppt
02-Random Variables.ppt
AkliluAyele3
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
Edgar Marca
 
Lecture notes
Lecture notes Lecture notes
Lecture notes
graciasabineKAGLAN
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihood
Frank Nielsen
 
Integration with kernel methods, Transported meshfree methods
Integration with kernel methods, Transported meshfree methodsIntegration with kernel methods, Transported meshfree methods
Integration with kernel methods, Transported meshfree methods
Mercier Jean-Marc
 

Similar to Bias-variance decomposition in Random Forests (20)

Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
 
1 - Linear Regression
1 - Linear Regression1 - Linear Regression
1 - Linear Regression
 
Estimationtheory2
Estimationtheory2Estimationtheory2
Estimationtheory2
 
The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon information
 
ma112011id535
ma112011id535ma112011id535
ma112011id535
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-...
 
Numarical values
Numarical valuesNumarical values
Numarical values
 
Numarical values highlighted
Numarical values highlightedNumarical values highlighted
Numarical values highlighted
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
Distributions
DistributionsDistributions
Distributions
 
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
Statistics (1): estimation Chapter 3: likelihood function and likelihood esti...
 
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
QMC: Operator Splitting Workshop, Progressive Decoupling of Linkages in Optim...
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurations
 
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
Voronoi diagrams in information geometry:  Statistical Voronoi diagrams and ...
 
02-Random Variables.ppt
02-Random Variables.ppt02-Random Variables.ppt
02-Random Variables.ppt
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
 
Lecture notes
Lecture notes Lecture notes
Lecture notes
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihood
 
Integration with kernel methods, Transported meshfree methods
Integration with kernel methods, Transported meshfree methodsIntegration with kernel methods, Transported meshfree methods
Integration with kernel methods, Transported meshfree methods
 

Recently uploaded

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 

Bias-variance decomposition in Random Forests

  • 1. Bias-variance decomposition in Random Forests Gilles Louppe @glouppe Paris ML S02E04, December 8, 2014 1 / 14
  • 2. Motivation In supervised learning, combining the predictions of several randomized models often achieves better results than a single non-randomized model. Why ? 2 / 14
  • 3. Supervised learning The inputs are random variables X = X1, ..., Xp ; The output is a random variable Y . Data comes as a
  • 4. nite learning set L = f(xi , yi )ji = 0, . . . ,N - 1g, where xi 2 X = X1 ... Xp and yi 2 Y are randomly drawn from PX,Y . The goal is to
  • 5. nd a model 'L : X7! Y minimizing Err ('L) = EX,Y fL(Y ,'L(X))g. 3 / 14
  • 7. cation Symbolic output (e.g., Y = fyes, nog) Zero-one loss L(Y ,'L(X)) = 1(Y6= 'L(X)) Regression Numerical output (e.g., Y = R) Squared error loss L(Y ,'L(X)) = (Y - 'L(X))2 4 / 14
  • 8. Decision trees 0.7 0.5 X1 X2 t5 t3 t4 푡2 푡1 푋1 ≤ 0.7 푡3 푡4 푡5 풙 푝(푌 = 푐|푋 = 풙) Split node ≤ Leaf node 푋2 ≤ 0.5 ≤ t 2 ' : nodes of the tree ' Xt : split variable at t vt 2 R : split threshold at t '(x) = arg maxc2Y p(Y = cjX = x) 5 / 14
  • 9. Bias-variance decomposition in regression Theorem. For the squared error loss, the bias-variance decomposition of the expected generalization error at X = x is ELfErr ('L(x))g = noise(x) + bias2(x) + var(x) where noise(x) = Err ('B(x)), bias2(x) = ('B(x) - ELf'L(x)g)2, var(x) = ELf(ELf'L(x)g - 'L(x))2g. 6 / 14
  • 11. Diagnosing the generalization error of a decision tree (Residual error : Lowest achievable error, independent of 'L.) Bias : Decision trees usually have low bias. Variance : They often suer from high variance. Solution : Combine the predictions of several randomized trees into a single model. 8 / 14
  • 12. Random forests 풙 휑1 휑푀 푝휑1 (푌 = 푐|푋 = 풙) … 푝휑푚 (푌 = 푐|푋 = 풙) Σ 푝휓(푌 = 푐|푋 = 풙) Randomization Bootstrap samples Random selection of K 6 p split variables g Random Forests Random selection of the threshold g Extra-Trees 9 / 14
  • 13. Bias-variance decomposition (cont.) Theorem. For the squared error loss, the bias-variance decomposition of the expected generalization error ELfErr ( L,1,...,M (x))g at X = x of an ensemble of M randomized models 'L,m is ELfErr ( L,1,...,M (x))g = noise(x) + bias2(x) + var(x), where noise(x) = Err ('B(x)), bias2(x) = ('B(x) - EL,f'L,(x)g)2, var(x) = (x)2 L,(x) + 1 - (x) M 2 L,(x). and where (x) is the Pearson correlation coecient between the predictions of two randomized trees built on the same learning set. 10 / 14
  • 14. Interpretation of (x) (Louppe, 2014) Theorem. (x) = VLfEjLf'L,(x)gg VLfEjLf'L,(x)gg+ELfVjLf'L,(x)gg In other words, it is the ratio between the variance due to the learning set and the total variance, accounting for random eects due to both the learning set and the random perburbations. (x) ! 1 when variance is mostly due to the learning set ; (x) ! 0 when variance is mostly due to the random perturbations ; (x) 0. 11 / 14
  • 15. Diagnosing the generalization error of random forests Bias : Identical to the bias of a single randomized tree. Variance : var(x) = (x)2 L,(x) + 1-(x) M 2 L,(x) As M ! 1, var(x) ! (x)2 L,(x) The stronger the randomization, (x) ! 0, var(x) ! 0. The weaker the randomization, (x) ! 1, var(x) ! 2 L,(x) Bias-variance trade-o. Randomization increases bias but makes it possible to reduce the variance of the corresponding ensemble model through averaging. The crux of the problem is to
  • 16. nd the right trade-o. Tips : tune max features in Random Forests. 12 / 14
  • 18. cation Theorem. For the zero-one loss and binary classi
  • 19. cation, the expected generalization error ELfErr ('L(x))g at X = x decomposes as follows : ELfErr ('L(x))g = P('B(x)6= Y ) + ( 0.5 - ELfbp p bp L(Y = 'B(x))g VLfL(Y = 'B(x))g )(2P('B(x) = Y ) - 1) For ELfbp L(Y = 'B(x)g 0.5, VLfbp L(Y = 'B(x))g ! 0 makes ! 0 and the expected generalization error tends to the error of the Bayes model. Conversely, for ELfbp L(Y = 'B(x)g 0.5, VLfbp L(Y = 'B(x))g ! 0 makes ! 1 and the error is maximal. 13 / 14