Feature Subset Selection for Learning Huge Configuration Spaces: The Case of Linux Kernel Size
Mathieu Acher, Hugo Martin, Juliana Alves Pereira, Luc Lesoil, Arnaud Blouin, Jean-Marc Jézéquel, Djamel Eddine Khelladi, Olivier Barais
Preprint: https://hal.inria.fr/hal-03720273
Linux 5.2.8, arm: 15,000+ options
(figure: breakdown of option types, in % of all options)
≈10^6000 variants (without constraints), i.e., a 1 followed by 6,000 zeros
Linux Kernel: ≈10^6000 variants
≈10^80 is the estimated number of atoms in the universe
≈10^40 is the estimated number of possible chess positions
Dimensionality reduction with feature selection
Huge configuration space: ≈10^6000 configurations
Large option/feature* set: 9K+ options for x86_64
Hypothesis: only a subset of options matter when predicting properties of variants.
Very few studies at this scale.
p options → p' options with p' << p, over n configurations
*options (~Linux features) are encoded as features (~predictive variables in learning problems)
Hypothesis: only a subset of options matter when predicting properties of variants. Key results:
● Some state-of-the-art solutions do not scale, owing to "too many feature interactions" (think of the combinatorial explosion with thousands of features!)
● Only ~300 features* (instead of 9K+) are sufficient for efficient prediction, and even outperform the accuracy of learning over all features/options
● Training time can be decreased
● The identification of influential options is consistent with, and can even improve, expert knowledge about Linux kernel configuration
*options (~Linux features) are encoded as features (~predictive variables in learning problems)
Configurable software system → Configurations → Variants → Quantitative property (e.g., related to performance, security, energy consumption)
For the Linux kernel: .config (compile-time/Kconfig) → kernel variants (binaries) → binary size
Measured examples: 16.1MB, 77.2MB, 176.8MB... and, for a configuration not yet built: ?
Challenge: you cannot build ≈10^6000 configurations; sampling and learning to the rescue, but…
Is it accurate? Is it effective with p' features and feature selection?
How many features*? Which options matter?
(figure: example sizes 7.1MB and 176.8MB, plus an unknown ?)
p' options with p' << p
*options (~Linux features) are encoded as features (~predictive variables in learning problems)
A challenging case
● Targeted non-functional, quantitative property: binary size
○ of interest for maintainers/users of the Linux kernel (embedded systems, cloud, etc.)
○ challenging to predict (cross-cutting options, interplay with compilers/build systems, etc.)
● Dataset: version 4.13.3, x86_64 arch, measurements of 95K+ random configurations
○ paranoid about deep variability since 2017: Docker to control the build environment and scale
○ build: 8 minutes on average
○ diversity: from 7MB to 1.9GB
TUXML: Sampling, Measuring, Learning
Most existing work considers a relatively low number of options (<50); Linux has 9K+ options for x86_64.
Feature subset selection vs recursive feature elimination: do they scale? Are they accurate?
*Legend of the related-work table — EX: execution, SI: simulation, SA: static analysis, UF: user feedback, SM: synthetic measurements.
TUXML: Sampling, Measuring, Learning
Docker for a reproducible environment, with the needed tools/packages and Python procedures inside.
Easy to launch a campaign:
"python3 kernel_generator.py 10"
builds and measures 10 random configurations (information is sent to a database).
https://github.com/TuxML/
Data: version 4.13.3 (x86_64)
95K+ configurations for Linux 4.13.3 (and 15K hours of computation on a computing grid)
RQ1: How do state-of-the-art (SOTA) techniques perform on huge configuration spaces?
● Linear-based algorithms: high error rate (binary size is not additive!)
● Polynomial regression & performance-influence models: out of memory (too many interactions, and not designed for 9K+ options)
● Tree-based algorithms & neural networks: low error rate
Mean Absolute Percentage Error (MAPE): the lower, the better.
N: percentage of the dataset used for training.
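For reference, a minimal sketch of the MAPE computation (the measured and predicted sizes below are hypothetical; this is not the paper's evaluation code):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error: mean |error| relative to the measured value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / y_true)

# hypothetical kernel sizes in MB: measured vs predicted
print(f"{mape([16.1, 77.2, 176.8], [15.0, 80.0, 170.0]):.1f}%")  # ~4.8%
```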
Dimensionality reduction with feature selection
Huge configuration space: ≈10^6000 configurations
Large option/feature set: 9K+ options for x86_64
Only a subset of options matter when predicting properties of variants.
RQ2: How accurate is the prediction model with and without feature selection?
p options → p' options with p' << p, over n configurations
Dimensionality reduction with tree-based feature selection (sketched in code below):
1. Learn a tree-based algorithm (Random Forest) on the full dataset (p = 8,743 options).
2. Derive a feature ranking list (based on feature importance), e.g.:
DEBUG_INFO (0.33), active_options (0.19), group_129 (0.14), DEBUG_INFO_REDUCED (0.11), DEBUG_INFO_SPLIT (0.08)
3. Filter the dataset, keeping only the top p' <<<<< p options (reduced dataset).
4. Learn any algorithm on the reduced dataset.
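A minimal sketch of this pipeline with scikit-learn, on synthetic stand-in data (the 0/1 option matrix, toy size model, and the p' = 300 threshold are illustrative; the slides do not fix the hyper-parameters):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 8743)).astype(float)  # n configs x p options
y = 7 + X[:, :40] @ rng.uniform(1, 40, 40)               # toy binary sizes in MB

# Steps 1-2: train a Random Forest on the full dataset and rank options
rf = RandomForestRegressor(n_estimators=50, n_jobs=-1, random_state=0).fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]      # feature ranking list

# Step 3: filter, keeping only the top p' << p options
p_prime = 300
X_reduced = X[:, ranking[:p_prime]]

# Step 4: train any learning algorithm on the reduced dataset
gbt = GradientBoostingRegressor(random_state=0).fit(X_reduced, y)
```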
RQ2: Tree-based feature selection pays off!
● Tree-based algorithms & neural networks:
○ lower error rate
○ lower training time
■ Random Forest: 18x faster
■ Gradient Boosting Tree: 5x faster
● Simpler models, easier to train, and improved accuracy
● Bonus: interpretable and consistent with domain knowledge
RQ2: Optimal number of features/options when performing feature selection
● Depends on the algorithm:
○ Gradient Boosting Trees & neural networks: 1,500
○ Random Forest: 250 options
● Depends on the training set size
Sweet spot: only ~300 features are sufficient to efficiently train a Random Forest and a Gradient Boosting Tree and obtain a prediction model that outperforms the other baselines operating over the full set of features (6% prediction error with 40K configurations).
RQ3+4: Stability of influential options and training time reduction
Using an ensemble of Random Forests yields a far more stable list, with more than 95% of features in common in the top 300 across multiple lists (a measurement sketch follows below).
Tree-based feature selection speeds up model training by a factor of 5 to 48 (since p' <<<< p).
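One possible way to quantify such stability (a sketch on toy data; the paper's exact protocol may differ):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4000, 1000)).astype(float)  # toy configs
y = 7 + X[:, :40] @ rng.uniform(1, 40, 40)               # toy sizes in MB

def top_k(X, y, k=300, seed=0):
    """Top-k options by Random Forest feature importance."""
    rf = RandomForestRegressor(n_estimators=50, n_jobs=-1, random_state=seed).fit(X, y)
    return set(np.argsort(rf.feature_importances_)[::-1][:k])

# retrain on two random halves of the data and measure top-300 overlap
X_a, X_b, y_a, y_b = train_test_split(X, y, test_size=0.5, random_state=1)
common = top_k(X_a, y_a) & top_k(X_b, y_b)
print(f"{100 * len(common) / 300:.1f}% common options in the top 300")
```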
RQ5: How do feature ranking lists, as computed by tree-based feature selection, relate to Linux knowledge?

Rank range in feature ranking list | Documented options in Kconfig (147 in total)
0 - 50                             | 7
50 - 250                           | 6
250 - 500                          | 6
500 - 1500                         | 28
1500 and beyond                    | 69

The top 50 options in the feature ranking list represent 95% of the feature importance; collinearity and interpretability: beware!
Incompleteness of the Linux documentation:
● The vast majority of influential options are either not documented or not referring to size: only 7 options of the top 50 are documented as having a clear influence on size.
● Leveraging all 147 options in the Linux documentation (and only them) leads to a prediction error of 23.6% (instead of <6% for our feature ranking list).
Relevance: investigations and exchanges with domain experts confirm the relevance of the top 50, yielding 6 categories of options.
Effective identification of important features:
● consistent with Linux knowledge (Kconfig documentation and expert insight)
● can be used to refine or augment the incomplete documentation of the Linux kernel.
Kaggle competition using our dataset
https://www.kaggle.com/competitions/linux-kernel-size/overview
We can benefit from contributions of the machine learning community…
And our dataset/problems are attracting interest.
Conclusion: feature subset selection is effective over the huge configuration space of Linux:
● only ~300 features out of 9K+
● accuracy is better with tree-based feature selection than without
● training time is decreased
● interpretability: the identification of influential options is consistent with, and can even improve, expert knowledge about Linux kernel configuration
Future work:
● replication on different versions of Linux
● does the feature ranking list transfer to other versions?
https://www.kaggle.com/competitions/linux-kernel-size/overview
Computation time (figure)
Decision Tree
● Ability to handle interactions between features
● Low impact of combinatorial explosion
● Competitive accuracy
● Interpretability (see the sketch after this list)
○ decision rules
○ feature importance
● Ensembles: Random Forests, Gradient Boosting Trees...
○ more accurate, less interpretable
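A sketch of both interpretability mechanisms on a single tree (toy data; the option names and size effects are made up for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
options = ["DEBUG_INFO", "DEBUG_INFO_REDUCED", "KASAN", "UBSAN"]  # hypothetical
X = rng.integers(0, 2, size=(500, len(options))).astype(float)
# toy size model with an interaction between the first two options
y = 30 + 80 * X[:, 0] - 40 * X[:, 0] * X[:, 1] + 15 * X[:, 2]

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=options))                # decision rules
print(dict(zip(options, tree.feature_importances_.round(2))))  # feature importance
```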
Kpredict
Python module for Python 3.8+ (https://github.com/HugoJPMartin/kpredict)
Works for many kernel versions and any x86_64 configuration
Error: ≈6.3%; 97% of the predictions are below 20% error
H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J.-M. Jézéquel, and D. E. Khelladi, "Transfer learning across variants and versions: the case of Linux kernel size," IEEE Transactions on Software Engineering (TSE), 2021.
Preprint: https://hal.inria.fr/hal-03358817
Backup / Draft slides
Transfer learning
"Inductive transfer refers to any algorithmic process by which structure or knowledge derived from a learning problem is used to enhance learning on a related problem." - Jeremy West, in A Theoretical Foundation for Inductive Transfer
● 100,000 configuration measurements, 15,000 hours of computation
● Mission Impossible: Saving Private Model 4.13
○ Budget: 5,000 configuration measurements (one night's worth of ISTIC computing power)
Model 4.13: Genesis

Training data, with options as features (f1..fn) and binary size as target:

f1   f2   f3   ...  fn   | Size
1    0    0    ...  1    | 16MB
0    1    0    ...  0    | 52MB
...  ...  ...  ...  ...  | ...
1    1    1    ...  0    | 115MB

A Gradient Boosting Tree algorithm is trained on this table (Features → Target), yielding Model 4.13.
Model 4.13 then predicts the size of new, unmeasured configurations, and the predictions are accurate:

f1   f2   f3   ...  fn   | Predicted size
0    1    1    ...  0    | 18MB ✅
1    0    0    ...  1    | 25MB ✅
...  ...  ...  ...  ...  | ...
1    0    1    ...  0    | 228MB ✅
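A minimal sketch of this training-and-prediction step (synthetic stand-in data; the actual model is trained on the measured 4.13.3 dataset, and the slides do not fix the hyper-parameters):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(413)
w = rng.uniform(0, 2, 300)                                    # toy per-option costs
X_train = rng.integers(0, 2, size=(5000, 300)).astype(float)  # configs x options
y_train = 16 + X_train @ w                                    # toy sizes in MB

# "Model 4.13": a Gradient Boosting Tree mapping features (options) to the target (size)
model_413 = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# predict the size of new, unmeasured configurations
X_new = rng.integers(0, 2, size=(3, 300)).astype(float)
print(model_413.predict(X_new))
```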
Model Shifting

A new kernel version (4.15) arrives, with its own training data:

f1   f2   f3   ...  fn   | Size
1    0    0    ...  1    | 22MB
0    1    0    ...  0    | 68MB
...  ...  ...  ...  ...  | ...
1    1    1    ...  0    | 105MB

Training a Model 4.15 from scratch on this small budget gives poor predictions on new configurations:
Predicted: 19MB ❌, 26MB ❌, ..., 298MB ✅

Model shifting reuses Model 4.13: its predictions ("Old Size", e.g., 16MB, 52MB, ..., 115MB on the training configurations and 18MB, 25MB, ..., 228MB on the new ones) are added as an extra feature.
The resulting Shifting Model 4.15, a Gradient Boosting Tree trained on the augmented table, now predicts accurately:
Predicted: 21MB ✅, 35MB ✅, ..., 298MB ✅
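A sketch of model shifting on toy data (the sizes and drift factor are illustrative; the key point is that the old model's prediction becomes an input feature of the new model):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
w = rng.uniform(0, 2, 300)

# source version (4.13): large measurement budget
X_413 = rng.integers(0, 2, size=(5000, 300)).astype(float)
model_413 = GradientBoostingRegressor(random_state=0).fit(X_413, 16 + X_413 @ w)

# target version (4.15): small budget, shifted size behavior
X_415 = rng.integers(0, 2, size=(1000, 300)).astype(float)
y_415 = 22 + X_415 @ (1.1 * w)

def augment(X):
    """Append Model 4.13's prediction ("Old Size") as an extra feature."""
    return np.hstack([X, model_413.predict(X).reshape(-1, 1)])

shifting_415 = GradientBoostingRegressor(random_state=0).fit(augment(X_415), y_415)
X_new = rng.integers(0, 2, size=(3, 300)).astype(float)
print(shifting_415.predict(augment(X_new)))  # predicted 4.15 sizes
```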
Results

Error rates, model shifting vs learning from scratch:

Budget (configurations) | Model shifting     | Scratch
1,000                   | from 6.7% to 10.6% | from 14.9% to 16.7%
5,000                   | from 5.6% to 7.1%  | from 8.2% to 9.2%
10,000                  | from 5.2% to 6.1%  | from 7.1% to 7.7%
Incremental Model Shifting

Simple model shifting (Source + Shifting Model = Full Model), always starting from Model 4.13:
Model 4.13 + Shifting Model 4.15 = Model 4.15
Model 4.13 + Shifting Model 4.20 = Model 4.20
Model 4.13 + Shifting Model 5.0 = Model 5.0
Model 4.13 + Shifting Model 5.4 = Model 5.4
Model 4.13 + Shifting Model 5.7 = Model 5.7
Model 4.13 + Shifting Model 5.8 = Model 5.8

Incremental model shifting instead chains the models, each full model becoming the source for the next version (see the sketch below):
Model 4.13 + Shifting Model 4.15 = Model 4.15
Model 4.15 + Shifting Model 4.20 = Model 4.20
Model 4.20 + Shifting Model 5.0 = Model 5.0
Model 5.0 + Shifting Model 5.4 = Model 5.4
Model 5.4 + Shifting Model 5.7 = Model 5.7
Model 5.7 + Shifting Model 5.8 = Model 5.8
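A sketch of the chaining scheme (toy data; the ChainedModel wrapper and drift factors are illustrative, not the paper's implementation):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

class ChainedModel:
    """A shifting model stacked on top of a source model."""
    def __init__(self, source, X, y):
        self.source = source
        self.shift = GradientBoostingRegressor(random_state=0).fit(self._augment(X), y)
    def _augment(self, X):
        # the source model's prediction is appended as an extra feature
        return np.hstack([X, self.source.predict(X).reshape(-1, 1)])
    def predict(self, X):
        return self.shift.predict(self._augment(X))

rng = np.random.default_rng(0)
w = rng.uniform(0, 2, 100)
X0 = rng.integers(0, 2, size=(3000, 100)).astype(float)
model = GradientBoostingRegressor(random_state=0).fit(X0, 16 + X0 @ w)  # "Model 4.13"

# incremental shifting: each full model is the source for the next version
for drift in (1.1, 1.2, 1.3):  # stand-ins for versions 4.15, 4.20, 5.0, ...
    Xv = rng.integers(0, 2, size=(800, 100)).astype(float)
    model = ChainedModel(model, Xv, 20 + Xv @ (drift * w))

print(model.predict(rng.integers(0, 2, size=(3, 100)).astype(float)))
```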
Results

With the source Model 4.13 trained on a budget of 85,000 configurations:

Budget (configurations) | Model shifting     | Scratch             | Incremental shifting
1,000                   | from 6.7% to 10.6% | from 14.9% to 16.7% | from 6.7% to 13.3%
5,000                   | from 5.6% to 7.1%  | from 8.2% to 9.2%   | from 5.6% to 7.5%
10,000                  | from 5.2% to 6.1%  | from 7.1% to 7.7%   | from 5.2% to 6.5%

With the source Model 4.13 trained on a budget of 20,000 configurations:

Budget (configurations) | Model shifting     | Scratch             | Incremental shifting
1,000                   | from 8.5% to 11.6% | from 14.9% to 16.7% | from 8.5% to 13.8%
5,000                   | from 6.7% to 7.9%  | from 8.2% to 9.2%   | from 6.7% to 7.9%
10,000                  | from 6.2% to 6.7%  | from 7.1% to 7.7%   | from 6.1% to 6.7%
Summary
● Model 4.13 is saved
○ an old model can be productively reused on a new version, at a lower cost
○ better than learning from scratch, across years of kernel versions
● Incremental shifting
○ more sensitive to the errors of previous models
○ makes better use of larger transfer budgets