Presented by Mohamed Boussaa on September 6, 2017 at INRIA Rennes, France.
The dissertation is available at this link: https://hal.inria.fr/tel-01598821/document
The jury is composed of:
Hélène WAESELYNCK, LAAS-CNRS Toulouse
Philippe MERLE, INRIA Lille
Erven ROHOU, INRIA Rennes
Franck FLEUREY, SINTEF Oslo
Jean-Marie MOTTU, Université de Nantes
Gerson SUNYÉ, Université de Nantes
Benoit BAUDRY, INRIA Rennes
Olivier BARAIS, Université de Rennes 1
7. Automatic code generation
a highly configurable process
[Figure: Haxe programs are translated into variants in several GPLs; C code is compiled into machine-code variants depending on optimisation flags (e.g., CFLAGS) such as -Os, -ftree-vectorize, -Og, -O3, -Ofast, -O2]
8. Automatic code generation
a highly configurable process
[Figure: generators take a high-level program specification, templates, and configurations (files, flags) and produce code in GPLs]
Generated code must be effectively tested
9. Automatic code generation
a highly configurable process
[Figure: generators take a high-level program specification, templates, and configurations (files, flags) and produce code in GPLs]
Tests pass: no bugs in generators
"I will never write code again!"
10. Automatic code generation
a highly configurable process
[Figure: generators take a high-level program specification, templates, and configurations (files, flags) and produce code in GPLs]
Tests fail: bugs must be fixed!
"I will never use generators again!"
11. Automatic code generation
a highly configurable process
[Figure: generators take a high-level program specification, templates, and configurations (files, flags) and produce code in GPLs]
Tests pass, but what about the non-functional properties (quality) of generated code?
"It is too slow!"
"I am running out of memory"
14. Related work
¤ Testing generators
• Functional testing: executable models, differential testing [Conrad et al. ’10, Stuermer et al. ’07]
⇒ Do not address the non-functional (NF) properties
¤ Auto-tuning generators:
• Auto-tuning: a mono-objective optimization [Bashkansky et al. ’07, Stephenson et al. ’03]
• Phase ordering problem [Kulkarni et al. ’06, Cooper et al. ’99]
• Predicting optimizations: a machine learning optimization [Fursin et al. ’11]
• Conflicting objectives: a multi-objective optimization [Hoste et al. ’08, Martinez et al. ’14]
⇒ Do not exploit recent advances in SBSE (e.g., diversity-based exploration)
16. Contributions
Contribution I: An automatic non-functional testing approach
Contribution II: An auto-tuning approach
Contribution III: A lightweight environment for monitoring and testing the generated code
[Figure: generator experts build/maintain generators and generator users use/configure them; the contributions help to identify code generator issues and to select the best configuration]
21. Leveraging metamorphic testing to automatically detect inconsistencies
Metamorphic testing¹ (MT):
- Oracles can be derived from properties of the system under test
- Exploit the relation between the inputs and outputs of special test cases of the system under test to derive metamorphic relations (MRs) defined as test oracles for new test cases
[Figure: a metamorphic relation is derived from the original test cases and their outputs, then verified against the new test cases and their outputs]
¹ Chen et al., Metamorphic testing: a new approach for generating next test cases, University of Science and Technology, Hong Kong, 1998.
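To make the idea of a non-functional metamorphic relation concrete, here is a minimal Python sketch. It assumes that the same test suite is run on functionally equivalent programs produced by several generators of one family, and that no target should deviate from the family mean by more than a threshold. The target names, the timings, the deviation formula, and the threshold are all illustrative assumptions, not the dissertation's exact definition.

```python
from statistics import mean


def relative_deviations(measurements):
    """Return each target's relative deviation from the family mean."""
    avg = mean(measurements.values())
    return {target: abs(value - avg) / avg for target, value in measurements.items()}


def violates_relation(measurements, threshold):
    """List the targets whose deviation from the family mean exceeds the threshold."""
    deviations = relative_deviations(measurements)
    return sorted(target for target, dev in deviations.items() if dev > threshold)


# Hypothetical execution times (seconds) of one test suite on five targets.
timings = {"java": 1.0, "js": 1.2, "cpp": 0.9, "cs": 1.1, "php": 5.8}
print(violates_relation(timings, threshold=1.0))
```

The oracle here needs no expected output per test case: it only compares equivalent programs with each other, which is what makes the relation usable when the "correct" execution time is unknown.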
23. Statistical methods
o We propose two variation analysis approaches to define the threshold value
R-Chart (Range Chart):
Ø Evaluate the variation as a Range R (Max - Min)
Ø Control limits (LCL and UCL) represent the limits of variation that should be expected from a process
Ø LCL < R < UCL
PCA (Principal Component Analysis):
Ø A multivariate statistical approach
Ø Reduce the dimensionality of the original data to two dimensions (PC1 & PC2)
Ø A score distance SD measures how far an observation lies from the rest of the data within the PCA subspace
Ø Observations whose SD exceeds the cutoff value (based on the 97.5% quantile Q of the chi-square distribution) are detected as outliers
T <
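The R-chart rule LCL < R < UCL can be sketched as follows. This is a minimal Python illustration that uses the standard tabulated Shewhart constants D3 and D4 for subgroup sizes 2 to 6. The slide does not state how measurements are grouped, so the subgroups and the sample values below are assumptions made for the example.

```python
from statistics import mean

# Standard Shewhart R-chart constants (D3, D4), indexed by subgroup size n.
R_CHART_CONSTANTS = {
    2: (0.0, 3.267),
    3: (0.0, 2.574),
    4: (0.0, 2.282),
    5: (0.0, 2.114),
    6: (0.0, 2.004),
}


def r_chart_limits(subgroups):
    """Return (ranges, LCL, UCL) for equally sized subgroups of measurements."""
    sizes = {len(group) for group in subgroups}
    if len(sizes) != 1:
        raise ValueError("all subgroups must have the same size")
    d3, d4 = R_CHART_CONSTANTS[sizes.pop()]
    ranges = [max(group) - min(group) for group in subgroups]
    r_bar = mean(ranges)  # centre line: the average range
    return ranges, d3 * r_bar, d4 * r_bar


def out_of_control(subgroups):
    """Return the indices of subgroups whose range falls outside [LCL, UCL]."""
    ranges, lcl, ucl = r_chart_limits(subgroups)
    return [i for i, r in enumerate(ranges) if r < lcl or r > ucl]


# Hypothetical measurements: one subgroup per test suite, five values each.
samples = [
    [10, 12, 11, 13, 12],
    [11, 12, 10, 12, 13],
    [11, 12, 12, 13, 11],
    [10, 30, 11, 12, 11],
]
print(out_of_control(samples))
```

The upper control limit then plays the role of a data-driven threshold: a range above it signals more variation than the process normally exhibits.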
30. Analysis
v For Core_TS4 in PHP:
• We observe the intensive use of « arrays »
• Arrays in PHP are allocated dynamically, leading to a slower writing speed
• We replace « arrays » by « SplFixedArray »
⇒ Speedup x5
⇒ Memory usage reduction x2
⇒ Issue fixed by the Haxe community
Key findings:
- Not using the specialized types available in the standard library has a real impact on the non-functional behavior of generated code.
31. Conclusion
§ A non-functional metamorphic relation is used to detect code generator
inconsistencies
− Two statistical methods are applied to find the right MR definition
§ The evaluation results show that:
− 11 performance and 15 memory usage inconsistencies are detected, violating the
metamorphic relation for Haxe code generators
− The analysis of test suites triggering the inconsistencies shows that there
exist potential issues in some code generators, affecting the quality of
delivered software
34. Motivating example
¤ GCC 4.8.4:
- 78 optimizations
- 2^78 combinations
[Cartoon: a developer ("WHY ALWAYS ME!!") has to improve speedup, memory, etc. under resource constraints]
-BOSS: Clients complain about the high memory consumption
-BOSS: Is it possible to consume less CPU? We don’t have enough resources/money
-BOSS: Please, can we optimize even more?
Good luck Son!!
- Testing each optimization configuration is impossible
- Heuristics are needed
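A quick back-of-the-envelope calculation shows why exhaustive testing is out of reach. The Python sketch below assumes that each of the 78 optimizations is an independent on/off switch and that one configuration can be compiled and measured per second. The one-second cost is an optimistic assumption for illustration only.

```python
NUM_OPTIMIZATIONS = 78          # from the slide (GCC 4.8.4)
SECONDS_PER_EVALUATION = 1      # assumption: compile + run + measure in one second
SECONDS_PER_YEAR = 365 * 24 * 3600

# Each optimization is either enabled or disabled, so the space doubles per flag.
combinations = 2 ** NUM_OPTIMIZATIONS
years = combinations * SECONDS_PER_EVALUATION // SECONDS_PER_YEAR

print(f"{combinations:.3e} configurations")
print(f"about {years:.3e} years of sequential evaluation")
```

Even with massive parallelism the search space stays intractable, which is why the approach relies on search heuristics rather than enumeration.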
35. Compiler auto-tuning is complex
¤ Constructing a good set of optimization levels (-Ox) is hard
• Conflicting objectives
• Complex interactions
• Unknown effect of some optimizations
40. RQ1- Results
RQ1: Mono-objective SBSE Validation.
- Training set: 10 Csmith programs
- Average S, MR, and CR
- Comparison: standard GCC levels (Ox), random search (RS), genetic algorithm (GA), and Novelty Search (NS)
Key findings for RQ1:
– The best optimization sequences discovered by mono-objective search techniques always provide better results than standard GCC optimization levels.
– Novelty Search is a good candidate to improve code in terms of non-functional properties since it is able to discover optimization combinations that outperform RS and GA.
[Figure: the search for the best optimization sequence takes the optimizations, a non-functional metric, and the training set programs as input and returns the best sequence]
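The principle behind Novelty Search can be sketched in a few lines of Python: candidates are selected for how different they are from what has already been explored, and the objective is only used to remember the best candidate seen. This is a minimal sketch, not the NOTICE implementation. The bit-vector encoding, the Hamming distance, the archive threshold, the mutation-only variation, and the `toy_speedup` stand-in objective are all assumptions made for illustration.

```python
import random


def hamming(a, b):
    """Number of positions where two flag vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def novelty(candidate, others, k):
    """Mean Hamming distance from candidate to its k nearest neighbours."""
    nearest = sorted(hamming(candidate, other) for other in others)[:k]
    return sum(nearest) / len(nearest)


def mutate(genome, rate, rng):
    """Flip each flag independently with probability `rate`."""
    return tuple(bit ^ 1 if rng.random() < rate else bit for bit in genome)


def novelty_search(evaluate, n_flags, pop_size=20, generations=30, k=5,
                   archive_threshold=None, seed=0):
    """Explore flag vectors by novelty; return the best one seen per `evaluate`."""
    rng = random.Random(seed)
    if archive_threshold is None:
        archive_threshold = n_flags / 4  # arbitrary default for this sketch
    population = [tuple(rng.randint(0, 1) for _ in range(n_flags))
                  for _ in range(pop_size)]
    archive = []
    best = max(population, key=evaluate)
    for _ in range(generations):
        scored = []
        for i, individual in enumerate(population):
            others = population[:i] + population[i + 1:] + archive
            scored.append((novelty(individual, others, k), individual))
        # Sufficiently novel individuals are remembered in the archive.
        archive.extend(ind for score, ind in scored if score > archive_threshold)
        # Selection is driven by novelty only, never by the objective.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [ind for _, ind in scored[: pop_size // 2]]
        population = [mutate(rng.choice(parents), 1 / n_flags, rng)
                      for _ in range(pop_size)]
        # The objective is only used to keep track of the best sequence found.
        best = max(population + [best], key=evaluate)
    return best


def toy_speedup(genome):
    """Hypothetical objective standing in for compiling and running a benchmark."""
    return sum(genome[::2]) - sum(genome[1::2])


print(novelty_search(toy_speedup, n_flags=12, seed=1))
```

In a real setting `evaluate` would compile the training programs with the selected flags and measure the non-functional metric, which dominates the cost of the search.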
43. RQ4- Results
RQ4: Trade-offs between non-functional properties.
- 1 Csmith program
- Trade-off <execution time-memory usage>
Key findings for RQ4:
– NOTICE is able to construct optimization levels that represent optimal trade-offs between non-functional properties.
– NSGA-II performs better than our NS adaptation for multi-objective optimization (NS-II). However, NS-II performs clearly better than standard GCC optimizations and previously discovered sequences in RQ1.
[Figure: the multi-objective search (time/memory trade-off) takes the optimizations and an input program and returns the Pareto front solutions. The plot compares the Pareto fronts of NS-II and NSGA-II (multi-objective) with Ofast, O3, O2, O1, the best CPU reduction (mono-objective), and the best memory reduction (mono-objective).]
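A Pareto front is the set of configurations that no other configuration beats on every objective at once. The Python sketch below filters such a front for two objectives to minimize (execution time, memory usage). The configuration names and all the numbers are made up for illustration and are not results from the dissertation.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one.

    Both objectives are minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the configurations that no other configuration dominates."""
    return {name: objectives for name, objectives in points.items()
            if not any(dominates(other, objectives) for other in points.values())}


# Hypothetical (execution time in s, memory usage in MB) per configuration.
measurements = {
    "-O1": (10.0, 50.0),
    "-O2": (8.0, 55.0),
    "-O3": (7.5, 60.0),
    "-Ofast": (7.6, 62.0),
    "seq_a": (7.0, 52.0),
    "seq_b": (9.0, 45.0),
}
print(sorted(pareto_front(measurements)))
```

Each point on the front is a distinct trade-off: picking one over another means accepting more of one resource to save on the other.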
44. Conclusion
§ Novel formulation of the compiler optimization problem based on Novelty
Search
§ Novelty Search is able to generate effective optimizations
− Generated sequences perform better than standard levels
− Our approach outperforms classical approaches (GA and RS)
§ Trade-offs between non-functional properties are constructed
− NSGA-II performs better than NS and mono-objective approaches
47. Infrastructure Overview
¤ We propose:
• A micro-service infrastructure, based on system containers (Docker) as execution platforms, that allows generator experts/users to evaluate the non-functional properties of generated code
[Figure: three stages. Code generation: code generators A, B, and C each generate a SUT. Code execution: each SUT runs in its own container (A, B, C). Runtime monitoring engine: containers A’, B’, and C’ handle resource usage extraction through HTTP requests, and the footprints A’, B’, and C’ are stored in a resource usage DB.]
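As a rough illustration of container-level resource monitoring, the Python sketch below takes one snapshot of CPU and memory usage per running container through the Docker command line (`docker stats --no-stream --format`). It is a simplified stand-in, not the runtime monitoring engine shown on the slide, which relies on HTTP requests and a database. The function names and the comma-separated format string are assumptions chosen for this example.

```python
import subprocess

# Go-template placeholders supported by `docker stats --format`.
STATS_FORMAT = "{{.Name}},{{.CPUPerc}},{{.MemUsage}}"


def parse_stats_line(line):
    """Parse a 'name,cpu%,used / limit' line into a dictionary."""
    name, cpu, memory = line.strip().split(",")
    used = memory.split("/")[0].strip()
    return {
        "container": name,
        "cpu_percent": float(cpu.rstrip("%")),
        "memory_used": used,
    }


def sample_containers():
    """Take one resource usage snapshot of all running containers.

    Requires the Docker CLI and a running Docker daemon.
    """
    output = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", STATS_FORMAT],
        check=True, capture_output=True, text=True,
    ).stdout
    return [parse_stats_line(line) for line in output.splitlines() if line.strip()]


# Parsing a hypothetical line, as Docker would print it for a container "sut-a".
print(parse_stats_line("sut-a,12.50%,34.2MiB / 1.9GiB"))
```

Running each generated program in its own container is what makes such per-program footprints comparable: the measurements are isolated from the host and from the other programs under test.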
50. Conclusion
§ Effective support for automatically deploying, executing, and testing the
generated code in different environment settings
§ The conducted experiments showed the usefulness of this infrastructure for
tuning and testing generators
52. Conclusion
[Figure: generator experts build and maintain generators; generator users use and configure them]
"I am now able to automatically test my code generator family in terms of NFP"
"I can now easily determine the best configuration settings for my generator"
Effective support for testing and resource usage monitoring
53. Perspectives
v Combine the proposed black-box approach with traceability tools:
• Tracking the source of code generator inconsistencies
v Reduce the time required to tune and test generators:
• Deploy tests on many nodes in the cloud using multiple containers in parallel
v Automatic test case generation:
• Test amplification
• Evaluate the quality of executed tests (e.g., code coverage)
v Improve the auto-tuning approach:
• Evaluate other compilers (e.g., LLVM, Clang)
• Explore more trade-offs among resource usage metrics
• Evaluate different hardware settings
54. Publications
• Mohamed Boussaa, Olivier Barais, Benoit Baudry, Gerson Sunyé: Automatic Non-functional Testing of Code Generators Families. In The 15th International Conference on Generative Programming: Concepts & Experiences (GPCE 2016), Amsterdam, Netherlands, October 2016.
• Mohamed Boussaa, Olivier Barais, Benoit Baudry, Gerson Sunyé: NOTICE: A Framework for Non-functional Testing of Compilers. In 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016), Vienna, Austria, August 2016.
• Mohamed Boussaa, Olivier Barais, Benoit Baudry, Gerson Sunyé: A Novelty Search-based Test Data Generator for Object-oriented Programs. In Genetic and Evolutionary Computation Conference Companion (GECCO 2015), Madrid, Spain, July 2015.
• Mohamed Boussaa, Olivier Barais, Benoit Baudry, Gerson Sunyé: A Novelty Search Approach for Automatic Test Data Generation. In 8th International Workshop on Search-Based Software Testing (SBST@ICSE 2015), Florence, Italy, May 2015.
Under review:
• Mohamed Boussaa, Olivier Barais, Benoit Baudry, Gerson Sunyé: Leveraging Metamorphic Testing to Automatically Detect Inconsistencies in Code Generator Families. IEEE Transactions on Reliability, August 2017.
58. NSGA-II overview (I)
• NSGA-II: Non-dominated Sorting Genetic Algorithm II (K. Deb et al., ’02)
[Figure: the parent population and the offspring population are ranked into fronts F1, F2, F3, F4 by non-dominated sorting; crowding distance sorting then selects the population of the next generation]
MOEA Framework http://moeaframework.org/
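The survivor selection step of NSGA-II can be sketched in plain Python: rank the merged population into fronts, fill the next generation front by front, and break ties in the last front by crowding distance. This is a minimal sketch that does not use the MOEA Framework. It uses a simple repeated-filter sort rather than the faster bookkeeping of the original algorithm, and the sample points are made up.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better once (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points):
    """Split the indices of `points` into successive fronts F1, F2, ..."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in one front (boundary points get infinity)."""
    distance = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        low, high = points[ordered[0]][m], points[ordered[-1]][m]
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (points[nxt][m] - points[prev][m]) / (high - low)
    return distance


def next_generation(points, size):
    """Select `size` survivors from the merged parent and offspring population."""
    selected = []
    for front in non_dominated_sort(points):
        if len(selected) + len(front) <= size:
            selected.extend(front)
        else:
            distance = crowding_distance(points, front)
            by_crowding = sorted(front, key=lambda i: distance[i], reverse=True)
            selected.extend(by_crowding[: size - len(selected)])
            break
    return selected


# Hypothetical objective vectors (both minimized) for a merged population.
population = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
print(non_dominated_sort(population))
print(next_generation(population, size=4))
```

Preferring large crowding distances keeps the survivors spread along the front, so the search does not collapse onto a single trade-off.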