Genetic Programming-based Evolutionary
Feature Construction for Heterogeneous
Ensemble Learning (IEEE TEVC)
Hengzhe Zhang
Supervisors: Mengjie Zhang, Bing Xue, Qi Chen, Aimin Zhou (ECNU)
Victoria University of Wellington
18/07/2023
Table of Contents
1 Background
2 Algorithm Framework
3 Experimental Results
4 Conclusions
Background
Evolutionary Feature Construction
The general idea of feature construction is to construct a set of new features
{ϕ1, . . . , ϕm} that improve the learning performance on a given dataset
{{x1, y1}, . . . , {xn, yn}} compared to learning on the original features
{x1, . . . , xp}.
Genetic programming (GP) has been widely used for automatic feature
construction due to its flexible representation and gradient-free search
mechanism.
(a) Feature Construction on Linear Regression (b) New Feature Space
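As a toy sketch of this idea (the data and the constructed feature are invented for illustration, not taken from the paper), a single constructed product feature ϕ = x1·x2 can turn a target that plain linear regression cannot fit into an almost perfectly linear one:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))                 # original features {x1, x2}
y = X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=200)   # target depends on an interaction

# Linear regression on the original feature space fails (near-zero R2)
r2_orig = LinearRegression().fit(X, y).score(X, y)

# Append one constructed feature phi = x1 * x2 and refit
X_new = np.column_stack([X, X[:, 0] * X[:, 1]])
r2_new = LinearRegression().fit(X_new, y).score(X_new, y)
```

In GP-based feature construction, expressions like `x1 * x2` are not hand-picked but evolved as trees and evaluated by the learning performance they induce.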
GP for Ensemble Learning
Motivation: An ensemble of multiple simple/weak GP trees is better than a
single complex/strong GP tree.
GP is naturally suited for ensemble learning because it can generate a diverse set of candidate solutions (models) through genetic operations in a single run¹.
Multi-modal Landscape on Training Data
¹ H. Zhang, A. Zhou, et al., "An Evolutionary Forest for Regression," IEEE TEVC, 2022.
Research Objectives
Key Questions in Heterogeneous Ensemble Learning:
How to define base learners? (Decision Tree or Linear Model?)
How to select base learners and features of the final model? (Top-N?)
How to search features efficiently? (Guided Mutation?)
Algorithm Framework
Base Learner
Motivation:
Decision trees are good at fitting piecewise data.
Linear regression is good at fitting continuous data.
Genetic programming is good at constructing features.
How to combine them?
Combine decision tree and linear regression using gradient boosting.
Use GP for feature construction.
Feature construction on a mixed base learner.
Base Learner
Gradient boosting:
1. Train a linear regression model first.
2. Learn the residual using a decision tree.
Illustration of the heterogeneous base learner.
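A minimal sketch of this two-stage base learner (my own simplification under squared loss; the `max_depth` value and the test data are illustrative, and the paper's SR-Tree may differ in details):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

class SRTree:
    """Hybrid base learner: a linear model first, then a decision tree on the residual."""

    def fit(self, X, y):
        self.lr = LinearRegression().fit(X, y)
        residual = y - self.lr.predict(X)            # what the linear part cannot explain
        self.dt = DecisionTreeRegressor(max_depth=3).fit(X, residual)
        return self

    def predict(self, X):
        return self.lr.predict(X) + self.dt.predict(X)

# A target with a continuous trend plus a piecewise jump
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 1))
y = 2.0 * X[:, 0] + (X[:, 0] > 0.5) * 1.5

model = SRTree().fit(X, y)
pred = model.predict(X)
```

The linear stage captures the continuous trend; the tree stage captures the discontinuity that the linear model leaves in its residual.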
Greedy Ensemble Selection
Why select a subset?
Not all learners in an ensemble are both accurate and diverse.
How to select a subset:
1. Select the Top-5 models.
2. Add the model that most reduces training error on top of the already-selected models.
3. Repeat step 2 until a termination criterion is reached.
Ensemble Selection
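The three steps above can be sketched as a greedy forward-selection loop (a simplified version; `top_n`, the ensemble-size cap, and the stopping rule are illustrative, and a model may be selected more than once, which effectively weights it):

```python
import numpy as np

def greedy_ensemble_selection(predictions, y, top_n=5, max_size=30):
    """predictions: (n_models, n_samples) array of each model's training predictions.
    Returns the indices of the selected sub-ensemble (repeats allowed)."""
    mse = np.mean((predictions - y) ** 2, axis=1)
    selected = list(np.argsort(mse)[:top_n])          # step 1: seed with the Top-N models
    while len(selected) < max_size:
        current = predictions[selected].mean(axis=0)
        # step 2: try adding each model; keep the one that lowers training error most
        errors = [np.mean(((current * len(selected) + p) / (len(selected) + 1) - y) ** 2)
                  for p in predictions]
        best = int(np.argmin(errors))
        if errors[best] >= np.mean((current - y) ** 2):
            break                                     # step 3: stop when no candidate helps
        selected.append(best)
    return selected

# Toy usage: two complementary models whose errors cancel, plus one bad model
y = np.array([0.0, 1.0, 2.0, 3.0])
preds = np.stack([y + 0.5, y - 0.5, y + 5.0])
selected = greedy_ensemble_selection(preds, y, top_n=1)
```

Here the two biased models average out each other's errors, so both get selected while the inaccurate third model never does.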
Variable Selection
Why is terminal variable selection needed?
Not all variables in the training data are useful!
How to select?
1. Calculate the importance of each constructed feature.
2. Calculate the relative frequency of each variable.
3. Weight the frequencies by the feature importance values.
Variable Selection
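A sketch of the importance-weighted frequency computation behind guided mutation (the data structures and names are my own; the paper's exact weighting may differ):

```python
import numpy as np
from collections import Counter

def terminal_probabilities(features, importances, n_variables):
    """features: one list of terminal-variable indices per constructed GP tree.
    importances: importance value of each constructed feature.
    Returns a sampling distribution over terminal variables for guided mutation."""
    weighted = np.zeros(n_variables)
    for terminals, imp in zip(features, importances):
        freq = Counter(terminals)                     # frequency of each variable in the tree
        for var, count in freq.items():
            weighted[var] += imp * count / len(terminals)  # weight frequency by importance
    return weighted / weighted.sum()

# Two constructed features: x0*x1 (important) and x2 (unimportant)
probs = terminal_probabilities([[0, 1], [2]], importances=[0.9, 0.1], n_variables=3)
```

Mutation then samples terminals from `probs`, so variables appearing in important features are chosen more often than variables appearing only in unimportant ones.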
Experimental Results
R2 Scores
SR-Forest is the best on average in terms of test R2 scores.
[Figure: average test R2 scores of SR-Forest, EF, SBP-GP, Operon, GP-GOMEA (and variants), KTBoost, FEAT, XGB, EPLEX, AdaBoost, Random Forest, AFP-FE, AFP, ITEA, LGBM, Kernel Ridge, DSR, gplearn, MLP, Linear, MRGP, BSR, FFX, and AI Feynman, grouped into real-world, synthetic, and all datasets.]
Average R2 scores on 120 regression datasets.
Running Time
The training time of SR-Forest ranks in the middle among the compared algorithms.
[Figure: average training times of Linear, LGBM, MLP, Kernel Ridge, AdaBoost, Random Forest, XGB, FFX, KTBoost, Operon, AFP, EF, AFP-FE, SR-Forest, FEAT, GP-GOMEA (and variants), ITEA, EPLEX, BSR, gplearn, DSR, AI Feynman, SBP-GP, and MRGP, grouped into real-world, synthetic, and all datasets.]
Average training time on 120 regression datasets.
Heterogeneous Base Learner
Feature construction on SR-Tree (DT+LR) is better than construction on Ridge
(LR) in 22 out of 106 datasets.
Feature construction on SR-Tree (DT+LR) is better than construction on RDT in
82 out of 106 datasets.
Statistical Comparison on 106 regression datasets.
Ensemble Selection
Ensemble selection reduces the ensemble size from 100 to 30 on average.
Ensemble selection delivers better performance in 37 datasets.
[Figure (a): histogram of ensemble sizes after selection. Figure (b): statistical comparison of R2 on 106 datasets.]
Variable Selection
Importance-based terminal selection (GM) delivers better performance in 30 datasets.
Selecting only terminal variables is sufficient.
(a) Importance-based GM outperforms frequency-based GM in 15 datasets.
(b) GM outperforms random selection in 30 datasets on R2.
Constructed Features
Feature construction on a heterogeneous ensemble can make it outperform other machine learning algorithms.
Because the base learners provide feature importance values, we can visualize which constructed features are important.
[Figure: mean R2 scores of SR-Tree, SR-Bagging, GPR, KNN, Ridge, DT, RF, ET, AdaBoost, GBDT, XGBoost, LightGBM, and SR-Forest.]
(a) Performance Comparison
[Figure: importance values of the constructed features listed below.]
max(MCW, RA)
√(SS² + 1)
max(MCW, RA)
MCW − SS
max(MCW, RA, SS)
log(|max(SS, RA − 1)| + 1)
2 · MCW³
−SS + max(MCW, RA)
SS
√(MCW² + 1)
log(|SS · max(CBL, MCW)| + 1)
log(|MCW| + 1)
RA · (MCW + 1)
MCW
max(2, MCW, RA, SS)
CBL · (RL − 1)
RA
√(SS² + 1)
(b) Feature Importance
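The constructed features above are ordinary expressions over the dataset's variables (MCW, RA, SS, CBL), so applying them is a plain column transformation; a sketch with made-up values for those variables:

```python
import numpy as np

# Hypothetical values for the dataset variables named in the importance plot
MCW = np.array([1.0, 2.0, 3.0])
RA = np.array([0.5, 2.5, 1.0])
SS = np.array([2.0, 0.0, 4.0])

# Three of the constructed features listed above
phi1 = np.maximum(MCW, RA)        # max(MCW, RA)
phi2 = np.sqrt(SS ** 2 + 1)       # sqrt(SS^2 + 1)
phi3 = np.log(np.abs(MCW) + 1)    # log(|MCW| + 1), a protected logarithm

# Constructed feature matrix fed to the base learners
X_constructed = np.column_stack([phi1, phi2, phi3])
```

Because each column is an interpretable expression, the base learners' importance values map directly back to readable formulas, which is what panel (b) visualizes.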
Conclusions

Feature construction on a heterogeneous ensemble outperforms that on a homogeneous ensemble.
Ensemble selection effectively reduces ensemble size while enhancing prediction performance.
Utilizing feature importance-based variable selection (guided mutation) improves search effectiveness.
Open Source Project: Evolutionary Forest (90 GitHub Stars)

Thanks for listening!
Email: Hengzhe.zhang@ecs.vuw.ac.nz
GitHub Project: https://github.com/hengzhe-zhang/EvolutionaryForest/