Tackling Deep Software Variability Together
Mathieu Acher @acherm
Special thanks to Luc Lesoil, Jean-Marc Jézéquel, Juliana Alves Pereira, and Paul Temple
SOFTWARE VARIANTS ARE EATING THE WORLD AND SCIENCE
[Figure: different variants of the same system yield different outcomes, e.g., build: OK in 8.16 seconds with high security; build: failure; build: OK in 2.16 seconds with low security.]
[Figure: learning pipeline: from the whole population of configurations, draw a training sample, collect performance measurements, build a prediction model, and use it for performance prediction.]
J. Alves Pereira, H. Martin, M. Acher, J.-M. Jézéquel, G. Botterweck and A. Ventresque, “Learning Software Configuration Spaces: A Systematic Literature Review”, JSS, 2021
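The pipeline above (sample, measure, learn) can be sketched end-to-end on a toy configurable system; the option names and the hidden performance function below are invented for illustration, and the "model" is a deliberately naive per-option effect estimate rather than any technique from the surveyed papers:

```python
import random

# Toy stand-in for a configurable system: 5 boolean options, hidden
# performance function (unknown to the learner). All names are illustrative.
OPTIONS = ["opt_a", "opt_b", "opt_c", "opt_d", "opt_e"]

def measure(config):
    """Pretend benchmark: execution time in seconds for one variant."""
    t = 10.0
    t += 4.0 * config["opt_a"]                     # costly option
    t -= 2.0 * config["opt_b"]                     # optimisation flag
    t += 3.0 * config["opt_a"] * config["opt_c"]   # option interaction
    return t + random.gauss(0, 0.1)                # measurement noise

def random_sample(n):
    """Uniform random sampling of the configuration space."""
    return [{o: random.randint(0, 1) for o in OPTIONS} for _ in range(n)]

def learn_effects(sample, measurements):
    """Naive interpretable model: per-option mean performance difference."""
    effects = {}
    for o in OPTIONS:
        on  = [m for c, m in zip(sample, measurements) if c[o] == 1]
        off = [m for c, m in zip(sample, measurements) if c[o] == 0]
        effects[o] = sum(on)/len(on) - sum(off)/len(off) if on and off else 0.0
    return effects

random.seed(0)
sample = random_sample(100)                  # training sample
measurements = [measure(c) for c in sample]  # performance measurements
effects = learn_effects(sample, measurements)
# opt_a should surface as the most influential option
print(max(effects, key=lambda o: abs(effects[o])))
```

Real approaches from the literature replace `learn_effects` with regression trees, linear models, or neural networks, but the sample/measure/learn loop is the same.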
Sampling, Measuring, Learning Software Configuration Spaces
State-of-the-art (SOTA) results: with the right algorithm and/or sampling strategy, fairly accurate and interpretable models can be derived for many systems and qualitative/quantitative properties.
Still, why are learning-based SOTA techniques not fully adopted and integrated into software-intensive projects?
● Academic/industry gap?
● Too costly?
● Hard to automatically observe/quantify variants?
● …
Learning variability can be easy…
● hardware variability
● operating system variability
● build/compiler variability
● software application variability
● version variability
● input data variability
… but we’re not looking deeply.
Prediction models and configuration knowledge may not generalize (i.e., they become misleading, inaccurate, or pointless for developers and users) if the software evolves, is compiled differently, runs on different hardware, is fed different input data, etc.
● 15,000+ options
● thousands of compiler flags and compile-time options
● dozens of preferences
● 100+ command-line parameters
● 1000+ feature toggles
deep software variability
[Figure: stacked layers of variability: Hardware (age, # cores, GPU), Operating System (distribution, version, option), Software (compiler, version, variant), Input Data (size, length, resolution); their combination determines bugs and performance.]
Sometimes, variability is consistent/stable across layers and knowledge transfer is immediate. But there are also interactions among variability layers, and variability knowledge may not generalize.
REAL WORLD Example (x264)
[Figure: x264 encodes two input videos (“vertical”, “animation”) with --mbtree vs --no-mbtree, on Ubuntu 10.4 vs 20.04, on a Dell Latitude 7400 vs a Raspberry Pi 4 Model B. Encoding duration ranges from 6 s to 359 s (ratios of roughly ×16 and ×12 between the two machines), and encoding size from 21 MB to 34 MB, with rankings flipping across environments.]
[Figure: the stacked layers of deep variability (Hardware, Operating System, Software, Input Data) interacting to produce bugs and performance differences.]
L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Deep Software Variability: Towards Handling Cross-Layer Configuration”, VaMoS 2021
The “best”/default software variant might be a bad one. Influential software options and their interactions vary. Performance prediction models and variability knowledge may not generalize.
Transferring Performance Prediction Models Across Different Hardware Platforms, Valov et al., ICPE 2017
“Linear model provides a good approximation of transformation between performance distributions of a system deployed in different hardware environments”
What about variability of input data? Compile-time options? Versions?
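The quoted observation can be sketched as follows: measure a few configurations on the target hardware, fit a linear map from source-hardware to target-hardware performance, then transfer predictions for the remaining configurations. Both "machines" below are synthetic stand-ins, not measurements from the paper:

```python
import random

# Hedged sketch of the Valov et al. observation: performance of the same
# configurations on two hardware platforms is often related by a simple
# linear transformation. The two "platforms" here are synthetic.

def perf_source(cfg):
    """Measurements on the source machine (cheap, already available)."""
    return 5.0 + 2.0 * cfg[0] + 1.5 * cfg[1]

def perf_target(cfg):
    """Target machine: here, roughly a linear rescaling of the source."""
    return 3.0 * perf_source(cfg) + 7.0 + random.gauss(0, 0.05)

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

random.seed(0)
configs = [(random.random(), random.random()) for _ in range(30)]
source = [perf_source(c) for c in configs]
# measure only a few configurations on the costly target hardware
a, b = fit_linear(source[:10], [perf_target(c) for c in configs[:10]])
# transfer: predict target performance for the remaining configurations
predicted_target = [a * x + b for x in source[10:]]
```

The whole point of the linear-map assumption is the last line: the 20 remaining configurations never have to be benchmarked on the target machine.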
Transfer Learning for Software Performance Analysis: An Exploratory Analysis, Jamshidi et al., ASE 2017
“Analyzing the Impact of Workloads on Modeling the Performance of Configurable Software Systems”, Stefan Mühlbauer et al., ICSE 2023
Let’s go deep with input data!
Intuition: video encoder behavior (and thus runtime configurations) hugely depends on the input video (different compression ratio, encoding size/type, etc.)
Is the best software configuration still the best? Are influential options always influential? Does the configuration knowledge generalize?
YouTube User General Content dataset: 1,397 videos. Measurements of 201 software configurations (with the same hardware, compiler, version, etc.): encoding time, bitrate, etc.
Do x264 software performances stay consistent across inputs? (1,397 videos × 201 software configurations)
● Encoding time: very strong correlations (low input sensitivity)
● FPS: very strong correlations (low input sensitivity)
● CPU usage: moderate correlation, a few negative correlations (medium input sensitivity)
● Bitrate: medium-low correlation, many negative correlations (high input sensitivity)
● Encoding size: medium-low correlation, many negative correlations (high input sensitivity)
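The input-sensitivity check behind these observations can be sketched with Spearman's rank correlation: rank the same configurations by a performance property under two inputs and compare the rankings. The numbers below are invented to show the two extremes, not the study's measurements:

```python
# Spearman's rho between two performance vectors (no tie handling,
# which is enough for this illustration).
def spearman(xs, ys):
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# encoding times (s) of 5 configurations on two hypothetical videos:
# same ranking on both inputs, i.e., low input sensitivity
time_video_a = [22, 25, 30, 41, 73]
time_video_b = [6, 7, 9, 12, 35]
# bitrates of the same configurations: ranking reversed, high sensitivity
bitrate_a = [1000, 1200, 900, 1500, 800]
bitrate_b = [1400, 700, 1600, 600, 1800]

print(spearman(time_video_a, time_video_b))  # 1.0 (identical rankings)
print(spearman(bitrate_a, bitrate_b))        # -1.0 (reversed rankings)
```

Computed over all 1,397 × 1,397 input pairs, the distribution of these correlations is what separates "low" from "high" input sensitivity above.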
Are there some configuration options more sensitive to input videos? (P = bitrate)
L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Input sensitivity on the performance of configurable systems: an empirical study”, JSS 2023
Practical impacts for users, developers, scientists, and self-adaptive systems
Threats to variability knowledge: predicting, tuning, or understanding configurable systems without being aware of inputs can be inaccurate and… pointless. E.g., the effectiveness of sampling strategies (random, 2-wise, etc.) is input-specific (see also Pereira et al., ICPE 2020).
Opportunities: for some performance properties (P) and subject systems, some stability is observed and performance remains consistent!
L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Input sensitivity on the performance of configurable systems: an empirical study”, JSS 2023
Stefan Mühlbauer, Florian Sattler, Christian Kaltenecker, Johannes Dorn, Sven Apel, Norbert Siegmund, “Analyzing the Impact of Workloads on Modeling the Performance of Configurable Software Systems”, ICSE 2023
Is there an interplay between compile-time and runtime options?
L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21
Linux 4.13 (Sep 2017): 6% prediction error. What about evolution? Can we reuse the 4.13 Linux prediction model? No, accuracy quickly decreases: 4.15 (5 months later): 20%; 5.7 (3 years later): 35%.
H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J.-M. Jézéquel and D. E. Khelladi, “Transfer learning across variants and versions: The case of Linux kernel size”, Transactions on Software Engineering (TSE), 2021
What can we do? (#1 studies)
Empirical studies about deep software variability:
● more subject systems
● more variability layers, including interactions
● more quantitative (e.g., performance) properties
with challenges for gathering measurement data:
● how to scale experiments? The variant space is huge!
● how to fix/isolate some layers? (e.g., hardware)
● how to measure in a reliable way?
Expected outcomes:
● significance of deep software variability in the wild
● identification of stable layers: sources of variability that should not affect the conclusions and that can be eliminated/forgotten
● identification/quantification of sensitive layers and interactions that matter
● variability knowledge
What can we do? (#2 cost)
Reducing the cost of exploring variability spaces:
● learning
○ many algorithms/techniques with interpretability/accuracy tradeoffs
○ transfer learning (instead of learning from scratch): Jamshidi et al. ASE ’17, ASE ’18; Martin et al. TSE ’21
● sampling strategies
○ uniform random sampling? t-wise? distance-based? …
○ sampling of hardware? input data?
● incremental build of configurations: Randrianaina et al. ICSE ’22
● white-box approaches: Velez et al. ICSE ’21, Weber et al. ICSE ’21
● …
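As a toy illustration of the sampling bullets, here is uniform random sampling over valid configurations (via rejection) plus a pairwise-coverage check, the criterion that 2-wise (t-wise) strategies target directly. The feature model, constraint, and option names are made up for the example:

```python
import itertools
import random

# Made-up feature model: 4 boolean options, one cross-tree constraint.
OPTIONS = ["opt_fast", "opt_cache", "opt_debug", "opt_lto"]

def valid(cfg):
    # constraint: opt_fast requires opt_cache
    return not (cfg["opt_fast"] and not cfg["opt_cache"])

def uniform_random_sample(n):
    """Rejection sampling: uniform over the *valid* configurations."""
    out = []
    while len(out) < n:
        cfg = {o: random.randint(0, 1) for o in OPTIONS}
        if valid(cfg):
            out.append(cfg)
    return out

def pairwise_coverage(sample):
    """Set of (option pair, value pair) combinations exercised by a sample."""
    covered = set()
    for cfg in sample:
        for a, b in itertools.combinations(OPTIONS, 2):
            covered.add((a, b, cfg[a], cfg[b]))
    return covered

random.seed(0)
sample = uniform_random_sample(20)
# 6 option pairs x 4 value combinations = 24 targets; the combination
# (opt_fast=1, opt_cache=0) is infeasible, so at most 23 are coverable
print(f"{len(pairwise_coverage(sample))}/24 pairwise combinations covered")
```

A dedicated 2-wise sampler would reach the same coverage with far fewer configurations; random sampling trades sample size for simplicity and unbiasedness.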
What can we do? (#3 modelling)
Modelling variability:
● Abstractions are definitely needed to…
○ reason about logical constraints and interactions
○ integrate domain knowledge
○ synthesize domain knowledge
○ automate and guide the exploration of variants
○ scope and prioritize experiments
● Challenges:
○ multiple systems, layers, concerns
○ different kinds of variability: technical vs domain, accidental vs essential, implicit vs explicit… when to stop modelling?
○ reverse engineering
Open, reproducible science for deep variability
deep.variability.io?
A collaborative resource and place for:
● evidence of deep software variability
● a continuous survey of papers/articles about deep software variability
● case studies and configuration knowledge associated to software projects
● datasets (e.g., measurements of performance in certain conditions)
● reproducible scripts (e.g., for building prediction models)
● descriptions and results of innovative solutions (e.g., transfer learning)
○ content based on already published papers
○ beyond PDFs
Ongoing work (a “live” book with Jupyter)… Looking for contributions/ideas/suggestions!
e.g., https://github.com/llesoil/input_sensitivity
Tackling Deep Software Variability Together
Mathieu Acher @acherm
BACKUP SLIDES
(BEYOND x264) Empirical and fundamental question: HOW DOES DEEP SOFTWARE VARIABILITY MANIFEST IN THE WILD? SOFTWARE SCIENTISTS should OBSERVE the JUNGLE/GALAXY!
[Figure: the deep variability layers: Hardware (age, # cores, GPU), Operating System (distribution, version, option), Software (compiler, version, variant), Input Data (size, length, resolution).]
Deep Software Variability: 4 challenges and opportunities
Luc Lesoil, Mathieu Acher, Arnaud Blouin, Jean-Marc Jézéquel: Deep Software Variability: Towards Handling Cross-Layer Configuration. VaMoS 2021: 10:1-10:8
Challenge 1: Identify the influential layers
[Figure: the deep variability layers (Hardware, Operating System, Software, Input Data).]
Problem: different layers have different importances on performances
Challenge: estimate their effects
Opportunity: leverage the useful variability layers & variables
Challenge 2: Test & benchmark environments
[Figure: many versions (e.g., 0.152.2854, 0.155.2917) and environments across the Hardware, Operating System, Software, and Input Data layers.]
Problem: combinatorial explosion and cost
Challenge: build a representative, cheap set of environments
Opportunity: dimensionality reduction
Challenge 3: Transfer performances across environments
[Figure: transferring performance measurements between environments, e.g., Ubuntu 10.4 on a Dell Latitude 7400 vs Ubuntu 20.04 on a Raspberry Pi 4 Model B.]
Problem: options’ importances change with environments
Challenge: transfer performances across environments
Opportunity: reduce the cost of measurement
Challenge 4: Cross-layer tuning
[Figure: the deep variability layers, whose interactions produce bugs and performance differences.]
Problem: (negative) interactions of layers
Challenge: find & fix values to improve performances
Opportunity: specialize the environment for a use case
CHALLENGES for Deep Software Variability
● Identify the influential layers
● Test & benchmark environments
● Transfer performances across environments
● Cross-layer tuning
x264 video encoder (compilation/build): compile-time options
Key results (for x264)
Worth tuning software at compile time: gain about 10% of execution time with the tuning of compile-time options (compared to the default compile-time configuration). The improvements can be larger for some inputs and some runtime configurations.
Stability of variability knowledge: for all the execution time distributions of x264 and all the input videos, the worst correlation is greater than 0.97. If the compile-time options change the scale of the distribution, they do not change the rankings of run-time configurations (i.e., they do not truly interact with the run-time options).
Reuse of configuration knowledge:
● linear transformation among distributions
● users can also trust the documentation of run-time options, consistent whatever the compile-time configuration is
L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21
Key results (for x264)
First good news: worth tuning software at compile time!
Second good news: for all the execution time distributions of x264 and all the input videos, the worst correlation is greater than 0.97. If the compile-time options change the scale of the distribution, they do not change the rankings of run-time configurations (i.e., they do not truly interact with the run-time options).
This has three practical implications:
1. Reuse of configuration knowledge: transfer learning of prediction models boils down to applying a linear transformation among distributions. Users can also trust the documentation of run-time options, which stays consistent whatever the compile-time configuration is.
2. Tuning at lower cost: finding the best compile-time configuration among all the possible ones allows one to immediately find the best configuration at run time. We can remove one dimension!
3. Measuring at lower cost: do not use a default compile-time configuration; use the least costly one, since it will generalize!
Did we recommend using two binaries? YES, one for measuring, another for reaching optimal performances!
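Implication 2 can be shown with a minimal sketch: if a tuned binary only rescales execution times (no true compile-time/run-time interaction), the best run-time configuration found with a cheap binary carries over to the tuned one. The preset names and all numbers below are illustrative:

```python
# Execution time (s) of 4 run-time configurations, measured with a cheap
# compile-time configuration. Names and values are made up.
cheap_binary = {"preset_a": 40.0, "preset_b": 28.0,
                "preset_c": 35.0, "preset_d": 31.0}

# A tuned binary that purely rescales the distribution (the x264 finding:
# scale changes, rankings do not).
tuned_binary = {k: 0.9 * v for k, v in cheap_binary.items()}

best_cheap = min(cheap_binary, key=cheap_binary.get)
best_tuned = min(tuned_binary, key=tuned_binary.get)
assert best_cheap == best_tuned == "preset_b"   # ranking preserved
print("measure with the cheap binary, deploy", best_tuned, "on the tuned one")
```

If the rescaling assumption broke (a real compile/run-time interaction), the two argmins could differ and the shortcut would be invalid; the >0.97 correlations are what justify it for x264.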
Interplay between compile-time and runtime options, and even input!
L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21
Are there some configuration options more sensitive to input videos? (bitrate)
● Linux as a subject software system (not as an OS interacting with other layers)
● Targeted non-functional, quantitative property: binary size
○ of interest for maintainers/users of the Linux kernel (embedded systems, cloud, etc.)
○ challenging to predict (cross-cutting options, interplay with compilers/build systems, etc.)
● Dataset: version 4.13.3 (September 2017), x86_64 arch, measurements of 95K+ random configurations
○ paranoiac about deep variability since 2017: Docker to control the build environment and scale
○ diversity of binary sizes: from 7 MB to 1.9 GB
○ 6% MAPE error: quite good, though costly…
H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J.-M. Jézéquel and D. E. Khelladi, “Transfer learning across variants and versions: The case of Linux kernel size”, Transactions on Software Engineering (TSE), 2021
Transfer learning to the rescue
● Mission impossible: saving the 4.13 variability knowledge and prediction model (15K hours of computation)
● Heterogeneous transfer learning: the feature space is different
● TEAMS: transfer evolution-aware model shifting
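A crude sketch of the model-shifting idea (not the actual TEAMS algorithm): keep the expensive source-version model and learn only a cheap correction from a handful of target-version measurements. The source model, the target behavior, and all numbers below are synthetic:

```python
import random

def source_model(config):
    """Stand-in for the expensive model learned on the old version
    (here just a fixed formula): predicted binary size in MB."""
    return 20.0 + 30.0 * config[0] + 10.0 * config[1]

def target_size(config):
    """Ground truth on the new version: the old behavior, shifted and
    rescaled, plus measurement noise."""
    return 1.2 * source_model(config) + 5.0 + random.gauss(0, 0.1)

def learn_shift(configs):
    """Least-squares fit of target = a * source_prediction + b,
    using only a handful of target-version measurements."""
    xs = [source_model(c) for c in configs]
    ys = [target_size(c) for c in configs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

random.seed(0)
few_configs = [(random.random(), random.random()) for _ in range(15)]
a, b = learn_shift(few_configs)
# transferred predictor: 15 target measurements instead of 95K+
shifted = lambda c: a * source_model(c) + b
```

The real setting is harder (the 4.13 and later feature spaces differ, hence *heterogeneous* transfer), but the economics are the same: a tiny target sample plus a shift function versus thousands of hours of re-measurement.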
Joelle Pineau, “Building Reproducible, Reusable, and Robust Machine Learning Software”, ICSE ’19 keynote: “[...] results can be brittle to even minor perturbations in the domain or experimental procedure”
● What is the magnitude of the effect hyperparameter settings can have on baseline performance?
● How does the choice of network architecture for the policy and value function approximation affect performance?
● How can the reward scale affect results?
● Can random seeds drastically alter performance?
● How do the environment properties affect variability in reported RL algorithm performance?
● Are commonly used baseline implementations comparable?
“Neuroimaging pipelines are known to generate different results depending on the computing platform where they are compiled and executed.”
“Statically building programs improves reproducibility across OSes, but small differences may still remain when dynamic libraries are loaded by static executables [...]. When static builds are not an option, software heterogeneity might be addressed using virtual machines. However, such solutions are only workarounds: differences may still arise between static executables built on different OSes, or between dynamic executables executed in different VMs.”
Reproducibility of neuroimaging analyses across operating systems, Glatard et al., Front. Neuroinform., 24 April 2015
Deep Questions?
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
 
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdf
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdfImplementing KPIs and Right Metrics for Agile Delivery Teams.pdf
Implementing KPIs and Right Metrics for Agile Delivery Teams.pdf
 
how-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdfhow-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdf
 
AI Hackathon.pptx
AI                        Hackathon.pptxAI                        Hackathon.pptx
AI Hackathon.pptx
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024
 
INGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by DesignINGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by Design
 
Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdfMastering Windows 7 A Comprehensive Guide for Power Users .pdf
Mastering Windows 7 A Comprehensive Guide for Power Users .pdf
 
What need to be mastered as AI-Powered Java Developers
What need to be mastered as AI-Powered Java DevelopersWhat need to be mastered as AI-Powered Java Developers
What need to be mastered as AI-Powered Java Developers
 
Agnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in KrakówAgnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in Kraków
 
Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM Integration
 
5 Reasons Driving Warehouse Management Systems Demand
5 Reasons Driving Warehouse Management Systems Demand5 Reasons Driving Warehouse Management Systems Demand
5 Reasons Driving Warehouse Management Systems Demand
 
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdfA Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
 
How to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabberHow to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabber
 
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
 
How to pick right visual testing tool.pdf
How to pick right visual testing tool.pdfHow to pick right visual testing tool.pdf
How to pick right visual testing tool.pdf
 
APVP,apvp apvp High quality supplier safe spot transport, 98% purity
APVP,apvp apvp High quality supplier safe spot transport, 98% purityAPVP,apvp apvp High quality supplier safe spot transport, 98% purity
APVP,apvp apvp High quality supplier safe spot transport, 98% purity
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
 
Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024
 

Tackling Deep Software Variability Together

  • 1. Tackling Deep Software Variability Together Mathieu Acher @acherm Special thanks to Luc Lesoil, Jean-Marc Jézéquel, Juliana Alves Pereira, and Paul Temple
  • 2. Special thanks to Luc Lesoil, Jean-Marc Jézéquel, Juliana Alves Pereira, and Paul Temple
  • 3. SOFTWARE VARIANTS ARE EATING THE WORLD AND SCIENCE 3
  • 4. build: OK 8.16 seconds security: high build: failure build: OK 2.16 seconds security: low 4
  • 5. Whole Population of Configurations Performance Prediction Training Sample Performance Measurements Prediction Model J. Alves Pereira, H. Martin, M. Acher, J.-M. Jézéquel, G. Botterweck and A. Ventresque, “Learning Software Configuration Spaces: A Systematic Literature Review”, JSS, 2021 Sampling, Measuring, Learning Software Configuration Spaces 5
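The sample–measure–learn loop of slide 5 can be sketched in a few lines; everything below (the three boolean options, the interaction effect, the nearest-neighbour predictor) is purely illustrative and not taken from the surveyed studies, which use regression trees, forests, or linear models:

```python
import random

# Toy configurable system with 3 boolean options; the "true" performance
# model (unknown to the learner) contains an interaction between opt0 and opt2.
def measure(config):
    time = 10.0
    if config[0]:
        time += 3.0
    if config[1]:
        time -= 1.0
    if config[0] and config[2]:
        time += 5.0  # cross-option interaction
    return time

random.seed(0)
# Sample part of the 2^3 configuration space and measure each sampled variant.
sample = [tuple(random.choice((0, 1)) for _ in range(3)) for _ in range(6)]
train = [(c, measure(c)) for c in sample]

# "Learn": a deliberately simple nearest-neighbour predictor over Hamming
# distance, standing in for the real learning algorithms of the literature.
def predict(config):
    nearest = min(train, key=lambda cm: sum(a != b for a, b in zip(cm[0], config)))
    return nearest[1]
```

The point of the sketch is the workflow (sample, measure, learn, predict), not the predictor: any configuration outside the training sample gets a prediction at the cost of some error.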
  • 6. State-of-the-art (SOTA) results: with the right algorithm and/or sampling strategy, fairly accurate and interpretable models for many systems and qualitative/quantitative properties can be derived. Still, why are learning-based SOTA techniques not fully adopted and integrated into software-intensive projects? ● Academic/industry gap? ● Too costly? ● Hard to automatically observe/quantify variants? ● … Learning variability can be easy… 6
  • 7. Learning variability can be easy… hardware variability operating system variability build/compiler variability software application variability version variability input data variability … but we’re not looking deeply. Prediction models and configuration knowledge may not generalize (i.e., be misleading/inaccurate/pointless for developers and users) if the software evolves, is compiled differently, a different hardware is used, the input data fed to it differ, etc. 7
  • 8. 15,000+ options thousands of compiler flags and compile-time options dozens of preferences 100+ command-line parameters 1000+ feature toggles 8 hardware variability deep software variability
  • 9. Age # Cores GPU SOFTWARE Variant Compil. Version Version Option Distrib. Size Length Res. Hardware Operating System Software Input Data Bug Perf. ① Perf. ② deep variability Sometimes, variability is consistent/stable across layers and knowledge transfer is immediate. But there are also interactions among variability layers, and variability knowledge may not generalize.
  • 10. Hardware Operating System Software Input Data 10.4 x264 --mbtree ... x264 --no-mbtree ... x264 --no-mbtree ... x264 --mbtree ... 20.04 Dell latitude 7400 Raspberry Pi 4 model B vertical animation vertical animation vertical animation vertical animation Duration (s) 22 25 73 72 6 6 351 359 Size (MB) 28 34 33 21 33 21 28 34 A B 2 1 2 1 REAL WORLD Example (x264)
  • 11. REAL WORLD Example (x264) Hardware Operating System Software Input Data 10.4 x264 --mbtree ... x264 --no-mbtree ... x264 --no-mbtree ... x264 --mbtree ... 20.04 Dell latitude 7400 Raspberry Pi 4 model B vertical animation vertical animation vertical animation vertical animation Duration (s) 22 25 73 72 6 6 351 359 Size (MB) 28 34 33 21 33 21 28 34 A B 2 1 2 1
  • 12. Hardware Operating System Software Input Data 10.4 x264 --mbtree ... x264 --no-mbtree ... x264 --no-mbtree ... x264 --mbtree ... 20.04 Dell latitude 7400 Raspberry Pi 4 model B vertical animation vertical animation vertical animation vertical animation Duration (s) 22 25 73 72 6 6 351 359 Size (MB) 28 34 33 21 33 21 28 34 A B 2 1 2 1 ≈*16 ≈*12 REAL WORLD Example (x264)
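A minimal way to check the kind of cross-layer flip these x264 slides illustrate is to compare the best runtime option per hardware platform. The numbers below are hypothetical stand-ins in the spirit of the slide's table, not its exact measurements:

```python
# Hypothetical encoding times (s) for the same input video under two runtime
# options, on two hardware platforms (inspired by, not copied from, the slide).
times = {
    "dell_latitude_7400": {"--mbtree": 22, "--no-mbtree": 25},
    "raspberry_pi_4":     {"--mbtree": 359, "--no-mbtree": 351},
}

def best_option(hardware):
    # Return the runtime option with the lowest encoding time on that hardware.
    opts = times[hardware]
    return min(opts, key=opts.get)

# The "best" option flips across hardware: runtime configuration knowledge
# does not automatically transfer to another hardware layer assignment.
```

With these illustrative numbers, `best_option` answers differently on the laptop and on the Raspberry Pi, which is exactly the deep-variability hazard the slide points at.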
  • 13. Age # Cores GPU SOFTWARE Variant Compil. Version Version Option Distrib. Size Length Res. Hardware Operating System Software Input Data Bug Perf. ① Perf. ② deep variability L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Deep Software Variability: Towards Handling Cross-Layer Configuration”, VaMoS 2021. The “best”/default software variant might be a bad one. Influential software options and their interactions vary. Performance prediction models and variability knowledge may not generalize.
  • 14. Transferring Performance Prediction Models Across Different Hardware Platforms Valov et al. ICPE 2017 “Linear model provides a good approximation of transformation between performance distributions of a system deployed in different hardware environments” What about variability of input data? Compile-time options? Version? 14
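The linear-transformation idea from Valov et al. can be sketched with ordinary least squares on paired measurements of the same configurations on both platforms; the measurement values here are invented for illustration:

```python
# Same configurations measured on hardware A and hardware B (hypothetical).
src = [2.0, 4.0, 6.0, 8.0]    # execution times on hardware A
dst = [5.1, 9.0, 13.2, 16.9]  # execution times on hardware B

# Fit time_B ≈ a * time_A + b by least squares (closed form, no library).
n = len(src)
mx, my = sum(src) / n, sum(dst) / n
a = sum((x - mx) * (y - my) for x, y in zip(src, dst)) / sum((x - mx) ** 2 for x in src)
b = my - a * mx

def transfer(time_on_a):
    # Predict the time on hardware B from a measurement on hardware A.
    return a * time_on_a + b
```

Once `a` and `b` are fitted from a few paired measurements, a prediction model built on hardware A can be reused on hardware B without re-measuring the whole configuration space, which is the cost argument behind this line of work.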
  • 15. Transfer Learning for Software Performance Analysis: An Exploratory Analysis Jamshidi et al. ASE 2017 15
  • 16. <Analyzing the Impact of Workloads on Modeling the Performance of Configurable Software Systems= Stefan Mühlbauer et al. ICSE 2023 16
  • 17. Let’s go deep with input data! Intuition: video encoder behavior (and thus runtime configurations) hugely depends on the input video (different compression ratios, encoding sizes/types, etc.) Is the best software configuration still the best? Are influential options always influential? Does the configuration knowledge generalize? ? YouTube User General Content dataset: 1397 videos Measurements of 201 soft. configurations (with same hardware, compiler, version, etc.): encoding time, bitrate, etc. 17
  • 18. Do x264 software performances stay consistent across inputs? ● Encoding time: very strong correlations ○ low input sensitivity ● FPS: very strong correlations ○ low input sensitivity ● CPU usage: moderate correlation, a few negative correlations ○ medium input sensitivity ● Bitrate: medium-low correlation, many negative correlations ○ high input sensitivity ● Encoding size: medium-low correlation, many negative correlations ○ high input sensitivity ? 1397 videos x 201 software configurations 18
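The correlations on this slide are rank correlations between the performance of the same configurations on different inputs. A self-contained Spearman sketch (ties not handled; the measurement values are illustrative, not from the dataset) looks like:

```python
def ranks(xs):
    # Rank of each value in xs (0 = smallest); assumes no ties.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(xs, ys):
    # Spearman rank correlation via the classic 1 - 6*sum(d^2)/(n*(n^2-1)).
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

video_a = [10, 12, 15, 20, 30]  # encoding times of 5 configs on input A
video_b = [11, 13, 16, 22, 33]  # same configs on input B: same ranking
```

A correlation near 1 (same configuration ranking on both inputs) means low input sensitivity; negative correlations, as observed for bitrate, mean the ranking of configurations actually reverses between inputs.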
  • 19. Are there some configuration options more sensitive to input videos? (P = bitrate) L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Input Sensitivity on the Performance of Configurable Systems: An Empirical Study”, JSS 2023 19
  • 20. Practical impacts for users, developers, scientists, and self-adaptive systems Threats to variability knowledge: predicting, tuning, or understanding configurable systems without being aware of inputs can be inaccurate and… pointless, e.g., the effectiveness of sampling strategies (random, 2-wise, etc.) is input specific (see also Pereira et al. ICPE 2020) Opportunities: for some performance properties (P) and subject systems, some stability is observed and performance remains consistent! L. Lesoil, M. Acher, A. Blouin and J.-M. Jézéquel, “Input Sensitivity on the Performance of Configurable Systems: An Empirical Study”, JSS 2023 Stefan Mühlbauer, Florian Sattler, Christian Kaltenecker, Johannes Dorn, Sven Apel, Norbert Siegmund, “Analyzing the Impact of Workloads on Modeling the Performance of Configurable Software Systems”, ICSE 2023 20
  • 21. Is there an interplay between compile-time and runtime options? L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21 21
  • 22. 4.13 version (Sep. 2017): 6%. What about evolution? Can we reuse the 4.13 Linux prediction model? No, accuracy quickly decreases: 4.15 (5 months later): 20%; 5.7 (3 years later): 35% 3 22 H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J. Jézéquel and D. E. Khelladi, “Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size”, Transactions on Software Engineering (TSE), 2021
  • 23. 23
  • 24. What can we do? (#1 studies) Empirical studies about deep software variability ● more subject systems ● more variability layers, including interactions ● more quantitative (e.g., performance) properties with challenges for gathering measurement data: ● how to scale experiments? The variant space is huge! ● how to fix/isolate some layers? (e.g., hardware) ● how to measure in a reliable way? Expected outcomes: ● significance of deep software variability in the wild ● identification of stable layers: sources of variability that should not affect the conclusion and that can be eliminated/forgotten ● identification/quantification of sensitive layers and interactions that matter ● variability knowledge 24
  • 25. What can we do? (#2 cost) Reducing the cost of exploring the variability spaces ● learning ○ many algorithms/techniques with tradeoffs interpretability/accuracy ○ transfer learning (instead of learning from scratch) Jamshidi et al. ASE’17, ASE’18, Martin et al. TSE’21 ● sampling strategies ○ uniform random sampling? t-wise? distance-based? … ○ sample of hardware? input data? ● incremental build of configurations Randrianaina et al. ICSE’22 ● white-box approaches Velez et al. ICSE’21, Weber et al. ICSE’21 ● … 25
  • 26. What can we do? (#3 modelling) Modelling variability ● Abstractions are definitely needed to… ○ reason about logical constraints and interactions ○ integrate domain knowledge ○ synthesize domain knowledge ○ automate and guide the exploration of variants ○ scope and prioritize experiments ● Challenges: ○ Multiple systems, layers, concerns ○ Different kinds of variability: technical vs domain, accidental vs essential, implicit vs explicit… when to stop modelling? ○ reverse engineering 26
  • 27. Open, reproducible science for deep variability deep.variability.io? A collaborative resource and place: ● Evidence of deep software variability ● Continuous survey of papers/articles about deep software variability ● Case studies and configuration knowledge associated with software projects ● Datasets (e.g., measurements of performance in certain conditions) ● Reproducible scripts (e.g., for building prediction models) ● Description and results of innovative solutions (e.g., transfer learning) ○ content based on already published papers ○ beyond PDFs Ongoing work (“live” book with Jupyter)… Looking for contributions/ideas/suggestions! 27
  • 29. Tackling Deep Software Variability Together Mathieu Acher @acherm 29
  • 31. (BEYOND x264) Empirical and fundamental question HOW DOES DEEP SOFTWARE VARIABILITY MANIFEST IN THE WILD? SOFTWARE SCIENTISTS should OBSERVE the JUNGLE/ GALAXY! Age # Cores GPU Variant Compil. Version Version Option Distrib. Size Length Res. Hardware Operating System Software Input Data
  • 32. Deep Software Variability 4 challenges and opportunities Luc Lesoil, Mathieu Acher, Arnaud Blouin, Jean-Marc Jézéquel: Deep Software Variability: Towards Handling Cross-Layer Configuration. VaMoS 2021: 10:1-10:8
  • 33. Identify the influential layers 1 Age # Cores GPU Variant Compil. Version Version Option Distrib. Size Length Res. Hardware Operating System Software Input Data Problem ≠ layers, ≠ importances on performances Challenge Estimate their effects Opportunity Leverage the useful variability layers & variables
  • 34. 2 Test & Benchmark environments 0.152.2854 0.155.2917 Problem Combinatorial explosion and cost Challenge Build a representative, cheap set of environments Opportunity dimensionality reduction Hardware Operating System Software Input Data
  • 35. Transfer performances across environments 10.4 20.04 Dell latitude 7400 Raspberry Pi 4 model B A B ? Problem Options’ importances change with environments Challenge Transfer performances across environments Opportunity Reduce cost of measure 3
  • 36. Cross-Layer Tuning Age # Cores GPU Variant Compil. Version Version Option Distrib. Size Length Res. Hardware Operating System Software Input Data 4 Problem (Negative) interactions of layers Challenge Find & fix values to improve performances Opportunity Specialize the environment for a use case Bug Perf. · Perf. ¸
  • 37. CHALLENGES for Deep Software Variability Identify the influential layers Test & Benchmark environments Transfer performances across environments Cross-Layer Tuning
  • 38. x264 video encoder (compilation/build) compile-time options 38
  • 39. Key results (for x264) Worth tuning software at compile-time: gain about 10% of execution time with the tuning of compile-time options (compared to the default compile-time configuration). The improvements can be larger for some inputs and some runtime configurations. Stability of variability knowledge: for all the execution time distributions of x264 and all the input videos, the worst correlation is greater than 0.97. If the compile-time options change the scale of the distribution, they do not change the rankings of run-time configurations (i.e., they do not truly interact with the run-time options). Reuse of configuration knowledge: ● Linear transformation among distributions ● Users can also trust the documentation of run-time options, consistent whatever the compile-time configuration is. L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21 39
  • 40. Key results (for x264) First good news: worth tuning software at compile-time! Second good news: for all the execution time distributions of x264 and all the input videos, the worst correlation is greater than 0.97. If the compile-time options change the scale of the distribution, they do not change the rankings of run-time configurations (i.e., they do not truly interact with the run-time options). It has three practical implications: 1. Reuse of configuration knowledge: transfer learning of prediction models boils down to applying a linear transformation among distributions. Users can also trust the documentation of run-time options, consistent whatever the compile-time configuration is. 2. Tuning at lower cost: finding the best compile-time configuration among all the possible ones allows one to immediately find the best configuration at run time. We can remove one dimension! 3. Measuring at lower cost: do not use a default compile-time configuration, use the least costly one since it will generalize! Did we recommend using two binaries? YES, one for measuring, another for reaching optimal performances! There is an interplay between compile-time and runtime options, and even the input! L. Lesoil, M. Acher, X. Tërnava, A. Blouin and J.-M. Jézéquel, “The Interplay of Compile-time and Run-time Options for Performance Prediction”, SPLC ’21 40
  • 41. Are there some configuration options more sensitive to input videos? (bitrate) 41
  • 42. ● Linux as a subject software system (not as an OS interacting with other layers) ● Targeted non-functional, quantitative property: binary size ○ interest for maintainers/users of the Linux kernel (embedded systems, cloud, etc.) ○ challenging to predict (cross-cutting options, interplay with compilers/build systems, etc.) ● Dataset: version 4.13.3 (September 2017), x86_64 arch, measurements of 95K+ random configurations ○ paranoiac about deep variability since 2017, Docker to control the build environment and scale ○ diversity of binary sizes: from 7 MB to 1.9 GB ○ 6% MAPE errors: quite good, though costly… 2 42 H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J. Jézéquel and D. E. Khelladi, “Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size”, Transactions on Software Engineering (TSE), 2021
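The 6% accuracy figure on this slide is a MAPE (mean absolute percentage error). As a reminder of the metric, with made-up kernel sizes and predictions (not from the dataset):

```python
def mape(actual, predicted):
    # Mean absolute percentage error, in percent; assumes actual values > 0.
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

sizes_mb = [7, 50, 120, 1900]   # measured binary sizes (hypothetical)
pred_mb  = [7.5, 47, 126, 1800] # model predictions (hypothetical)
```

Because the error is relative, MAPE treats a 0.5 MB miss on a 7 MB kernel and a 100 MB miss on a 1.9 GB kernel on comparable footing, which fits a property whose values span more than two orders of magnitude.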
  • 43. Transfer learning to the rescue ● Mission Impossible: saving the 4.13 variability knowledge and prediction model (15K hours of computation) ● Heterogeneous transfer learning: the feature space is different ● TEAMS: transfer evolution-aware model shifting 5 43 H. Martin, M. Acher, J. A. Pereira, L. Lesoil, J. Jézéquel and D. E. Khelladi, “Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size”, Transactions on Software Engineering (TSE), 2021 3 43
  • 44. Joelle Pineau, “Building Reproducible, Reusable, and Robust Machine Learning Software”, ICSE’19 keynote: “[...] results can be brittle to even minor perturbations in the domain or experimental procedure” What is the magnitude of the effect hyperparameter settings can have on baseline performance? How does the choice of network architecture for the policy and value function approximation affect performance? How can the reward scale affect results? Can random seeds drastically alter performance? How do the environment properties affect variability in reported RL algorithm performance? Are commonly used baseline implementations comparable? 44
  • 45. “Neuroimaging pipelines are known to generate different results depending on the computing platform where they are compiled and executed.” Statically building programs improves reproducibility across OSes, but small differences may still remain when dynamic libraries are loaded by static executables [...]. When static builds are not an option, software heterogeneity might be addressed using virtual machines. However, such solutions are only workarounds: differences may still arise between static executables built on different OSes, or between dynamic executables executed in different VMs. Reproducibility of neuroimaging analyses across operating systems, Glatard et al., Front. Neuroinform., 24 April 2015 45