QRS16: NOTICE: A Framework for Non-functional Testing of Compilers

This talk was presented at the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS 2016).
To have more details you can visit the tool website: https://noticegcc.wordpress.com/
Paper link: https://hal.archives-ouvertes.fr/hal-01344835/document
Author website: https://mboussaa.wordpress.com/

Published in: Engineering

  1. NOTICE: A Framework for Non-functional Testing of Compilers. Mohamed BOUSSAA, Olivier BARAIS, Gerson SUNYE, Benoit BAUDRY. INRIA Rennes, France. 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016), August 1-3, 2016, Vienna, Austria.
  2. Outline: (1) Context; (2) Motivating Example; (3) NOTICE: A Framework for Non-functional Testing of Compilers; (4) Performance Evaluation; (5) Conclusion.
  3. Motivation
     • Diagram: source code, C compilers (optimizations), machine code
     • Current innovations in science and industry demand ever-increasing computing resources while placing strict requirements on system performance, power consumption, size, response, reliability, portability and design time
     • Resource constraints
  4. Compiler fine/auto-tuning is complex
     • Huge design space for optimization options (more than 150 optimizations)
       - compiling a program means trading off between various objectives
       - compilation time, code quality, code size, ...
     • Constructing a good set of optimization levels (-Ox) is hard
       - conflicting objectives, complex interactions, unknown effect of some optimizations, ...
  5. Trying to please everyone
     • Program-independent universal sequences (e.g., -O1, -O2, -O3, etc.)
     • Based on heuristics and experience
     • Each optimization level allows trading off various objectives
       - O1: "take your time, give it your best shot"
       - O2: "optimize, and be quick about it"
       - O3: "I'm feeling lucky, and have lots of time"
     • How efficient are predefined/universal compiler levels?
  6. Motivating Example
     • GCC 4.8.4: 78 optimizations, 2^78 combinations
     • Resource constraints: speedup, memory, etc.
     • Testing each optimization configuration is impossible
       - BOSS: Clients complain about the high memory consumption
       - BOSS: Is it possible to consume less CPU? We don't have enough resources/money
       - BOSS: Please, can we optimize even more? Good luck, son!
       - Caption: "WHY ALWAYS ME !!"
     • Heuristics are needed
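
A quick calculation makes the "impossible" claim concrete. The sketch below assumes each of the 78 optimizations is an independent on/off flag, so the search space has 2^78 points. The cost of one second per compile-and-run evaluation is a hypothetical figure chosen only for illustration; it does not come from the slides.

```python
# Back-of-the-envelope size of the optimization search space.
NUM_OPTIMIZATIONS = 78          # GCC 4.8.4 figure quoted on the slide
SECONDS_PER_EVALUATION = 1.0    # hypothetical cost, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600


def search_space_size(num_flags: int) -> int:
    """Number of on/off combinations for num_flags independent flags."""
    return 2 ** num_flags


def exhaustive_years(num_flags: int, seconds_per_eval: float) -> float:
    """Years needed to evaluate every combination sequentially."""
    return search_space_size(num_flags) * seconds_per_eval / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"combinations: {search_space_size(NUM_OPTIMIZATIONS):.3e}")
    years = exhaustive_years(NUM_OPTIMIZATIONS, SECONDS_PER_EVALUATION)
    print(f"exhaustive search: {years:.3e} years")
```

Even under this optimistic cost assumption the exhaustive run is far out of reach, which is why the talk turns to heuristic search.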
  7. NOTICE: A Framework for Non-functional Testing of Compilers. https://noticegcc.wordpress.com
  8. Contributions. We propose:
     • 1- Diversity-based exploration
       - Novel formulation of the compiler optimization problem using Novelty Search
       - Diverse optimization sequences
       - Explore the large search space by considering novelty as the main objective
     • 2- Microservice-based infrastructure
       - Execute and monitor the different variants of optimized code using system containers
       - Resource isolation and management
       - Provide a fine-grained understanding and analysis of compiler behavior regarding optimizations
       - Automatic extraction of non-functional properties relative to resource usage
     • Goal: finely auto-tuning compilers according to user (non-functional) requirements
  9. Diversity-based exploration
     • Solution representation: a bit vector (0 0 1 0 …) encoding a command line such as gcc -c test.c -fno-dce -fno-dse -fdce -fno-align-loops …
     • Step 1: Random generation
     • Step 2: Evaluation
       - Novelty metric: calculate the distance of one solution from its K nearest neighbors in the current population and in the archive
       - Archive: saves solutions that get a novelty metric value higher than a specific novelty threshold value
     • Step 3: Selection
       - Tournament selection: select solutions to evolve based on novelty scores
     • Step 4: Evolutionary operators
       - Mutation and crossover on bit vectors (0 1 1 1 0 …, 1 0 0 1 1 …)
       - Go to Step 2
     • Best solution: the solution with the best non-functional improvement
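
The four steps on this slide can be sketched as a small loop. This is a minimal illustration, not the NOTICE implementation: the Hamming distance, the one-point crossover, the bit-flip mutation, the parameter defaults, and the `evaluate` callback (which in a real setup would compile and run the program and return a non-functional score) are all assumptions. The flag names `dce`, `dse`, and `align-loops` are taken from the command line shown on the slide.

```python
import random


def to_gcc_flags(bits, names):
    """Map a bit vector to -f<name> (1) or -fno-<name> (0) options."""
    return [f"-f{n}" if b else f"-fno-{n}" for b, n in zip(bits, names)]


def hamming(a, b):
    """Number of positions at which two bit vectors differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b))


def novelty(candidate, others, k):
    """Mean distance from candidate to its k nearest neighbours."""
    nearest = sorted(hamming(candidate, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def tournament(population, scores, rng, size=2):
    """Pick the most novel of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """One-point crossover producing a new bit vector."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def novelty_search(evaluate, num_flags, pop_size=20, generations=30,
                   k=3, threshold=2.0, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Step 1: random generation
    population = [[rng.randint(0, 1) for _ in range(num_flags)]
                  for _ in range(pop_size)]
    archive, best, best_score = [], None, float("-inf")
    for _ in range(generations):
        # Step 2: evaluation (novelty against population and archive)
        scores, newcomers = [], []
        for i, individual in enumerate(population):
            others = population[:i] + population[i + 1:] + archive
            score = novelty(individual, others, k)
            scores.append(score)
            if score > threshold:
                newcomers.append(individual)
            value = evaluate(individual)  # hypothetical non-functional score
            if value > best_score:
                best, best_score = individual, value
        archive.extend(newcomers)
        # Steps 3 and 4: tournament selection, crossover, mutation
        population = [
            mutate(crossover(tournament(population, scores, rng),
                             tournament(population, scores, rng), rng),
                   rng, mutation_rate)
            for _ in range(pop_size)
        ]
    return best, best_score, archive


if __name__ == "__main__":
    names = ["dce", "dse", "align-loops"]
    # Toy objective: count enabled flags (stands in for a measured metric).
    best, score, _ = novelty_search(sum, num_flags=len(names))
    print("gcc -c test.c", " ".join(to_gcc_flags(best, names)), score)
```

Selection is driven only by novelty scores, while the non-functional score is tracked on the side to report the best solution, which mirrors the split between "novelty as the main objective" and "solution with best non-functional improvement" on the slide.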
  10. Contributions. We propose:
     • 1- Diversity-based exploration
       - Novel formulation of the compiler optimization problem using Novelty Search
       - Diverse optimization sequences
       - Explore the large search space by considering novelty as the main objective
     • 2- Microservice-based infrastructure
       - Execute and monitor the different variants of optimized code using lightweight system containers
       - Provide a fine-grained understanding and analysis of compiler behavior regarding optimizations
       - Resource isolation and management
       - Automatic extraction of non-functional properties relative to resource usage
     • Goal: finely auto-tuning compilers according to user (non-functional) requirements
  11. NOTICE Infrastructure
     • 1. Code Execution: compile and execute optimized code within a new container instance
     • 2. Runtime Monitoring: gather at runtime non-functional properties of running programs under test
     • 3. Time Series Database: save information relative to resource consumption within a time series database
     • 4. Performance Analysis: analysis of the performance and non-functional properties of programs under test
  12. NOTICE Infrastructure (architecture diagram labels)
     • Optimizations
     • Component Under Test (running…)
     • Cgroup file systems
     • Monitoring Component
     • Monitoring records
     • Back-end Database Component (time-series database)
     • Front-end Visualization Component
     • HTTP requests
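
As an illustration of what a monitoring component can read from cgroup file systems, the sketch below parses the `key value` text format used by files such as `memory.stat` in the Linux cgroup v1 memory controller, and reads single-number counter files such as `memory.usage_in_bytes` or `cpuacct.usage`. The slides do not say which files NOTICE reads or how, so the choice of files and the function names here are assumptions.

```python
from pathlib import Path


def parse_stat(text: str) -> dict:
    """Parse 'key value' lines (cgroup memory.stat style) into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed lines: one key followed by one integer.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_counter(path) -> int:
    """Read a file holding a single integer, e.g. memory.usage_in_bytes."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    # Sample text with made-up values, in the memory.stat layout.
    sample = "cache 4096\nrss 1048576\nmapped_file 0\n"
    print(parse_stat(sample))
```

Keeping the parser separate from the file access makes it easy to test without a running container.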
  13. Evaluation. https://noticegcc.wordpress.com/experimental-results/
  14. Experimental Setup
     • Compiler: GCC v4.8.4
     • Program under test: random C code generator
     • Infrastructure: tools for monitoring and for storage
     • Meta-heuristics
       - Mono-objective: Novelty Search (NS), Genetic Algorithm (GA), Random Search (RS)
       - Multi-objective: Novelty Search (NS-II), NSGA-II
     • Evaluation metrics
       - Mono-objective, measured over -O0: Speedup (S), Memory consumption reduction (MR), CPU consumption reduction (CR)
       - Multi-objective: trade-off <execution time - memory usage>
     • Other labels on the slide: Optimizations, Algorithm parameters
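
The slide names the three metrics and their baseline (-O0) but not their formulas. The sketch below assumes the common definitions: speedup as the ratio of baseline time to optimized time, and memory or CPU reduction as the percentage decrease relative to the baseline. Both definitions, and the function names, are assumptions rather than the paper's exact formulas.

```python
def speedup(baseline_time: float, optimized_time: float) -> float:
    """Assumed definition: time at -O0 divided by time with the sequence."""
    return baseline_time / optimized_time


def reduction_percent(baseline: float, optimized: float) -> float:
    """Assumed definition: percentage decrease relative to the -O0 value.

    A negative result means the optimized build consumes more than -O0.
    """
    return 100.0 * (baseline - optimized) / baseline


if __name__ == "__main__":
    # Made-up measurements for illustration only.
    print(speedup(10.0, 5.0))               # time in seconds
    print(reduction_percent(200.0, 150.0))  # memory in MB
    print(reduction_percent(100.0, 120.0))  # CPU usage went up
```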
  15. Research Questions
     • RQ1: Mono-objective SBSE validation. Inputs: optimizations, non-functional metric, training set programs. Mono-objective search yields the best sequence.
     • RQ2: Sensitivity of input programs to optimization sequences. The best sequence in RQ1 is applied to unseen programs to measure non-functional improvement.
     • RQ3: Impact of speedup on resource consumption. The best speedup sequence in RQ1 is applied to the training set programs to measure the impact on resource consumption.
     • RQ4: Trade-offs between non-functional properties. Inputs: optimizations, input program, non-functional trade-off <time-memory>. Multi-objective search yields Pareto front solutions.
  16. RQ1 - Results. RQ1: Mono-objective SBSE validation (search for the best optimization sequence given optimizations, a non-functional metric, and training set programs).
     • Training set: 10 Csmith programs
     • Average S, MR, and CR
     • Comparison: Ox, RS, GA and NS
     Key findings for RQ1:
     • Best discovered optimization sequences using mono-objective search techniques always provide better results than standard GCC optimization levels.
     • Novelty Search is a good candidate to improve code in terms of non-functional properties since it is able to discover optimization combinations that outperform RS and GA.
  17. RQ2 - Results. RQ2: Sensitivity (best sequence in RQ1 applied to unseen programs to measure non-functional improvement).
     • 100 unseen Csmith programs
     • O2 vs O3 vs NS
     Key findings for RQ2:
     • It is possible to build general optimization sequences that perform better than standard optimization levels.
     • Best discovered sequences in RQ1 can mostly be used to improve the memory and CPU consumption of Csmith programs.
     • To answer RQ2: Csmith programs are sensitive to compiler optimizations.
  18. RQ3 - Results. RQ3: Impact of optimizations on resource consumption (best speedup sequence in RQ1 applied to the training set programs; impact on CPU and memory).
     • Ox vs RS vs GA vs NS
     • Plot labels: memory reduction, CPU reduction, increase of resource usage
     Key findings for RQ3:
     • Optimizing software performance can induce undesirable effects on system resources.
     • A trade-off is needed to find a correlation between software performance and resource usage.
  19. RQ4 - Results. RQ4: Trade-offs between non-functional properties (multi-objective search over optimizations for one input program, yielding Pareto front solutions).
     • 1 Csmith program
     • Trade-off <execution time - memory usage>
     • Plot legend: Pareto front NS-II (multi-objective), Pareto front NSGA-II (multi-objective), Ofast, O3, O2, O1, best CPU reduction (mono-objective), best memory reduction (mono-objective)
     Key findings for RQ4:
     • NOTICE is able to construct optimization levels that represent optimal trade-offs between non-functional properties.
     • NS is more effective when it is applied for mono-objective search.
     • NSGA-II performs better than our NS adaptation for multi-objective optimization. However, NS-II performs clearly better than standard GCC optimizations and previously discovered sequences in RQ1.
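
The notion of a Pareto front for the <execution time - memory usage> trade-off can be sketched in a few lines. The example assumes both objectives are minimized and uses made-up measurements; it only illustrates the dominance relation, not how NS-II or NSGA-II explore the space.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Both objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (execution time in s, memory in MB) per sequence.
    measurements = [(1.0, 50.0), (2.0, 30.0), (4.0, 10.0),
                    (3.0, 40.0), (5.0, 50.0)]
    print(pareto_front(measurements))
```

A sequence such as (3.0, 40.0) drops out because (2.0, 30.0) is both faster and lighter, while the three remaining points each win on one objective.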
  20. Conclusion
  21. Conclusion
     Summary:
     • Novel formulation of the compiler optimization problem based on Novelty Search
     • Novelty Search is able to generate effective optimizations
     • Automated tool for automatic extraction of non-functional properties of optimized code
     • Automatically extract information about memory and CPU consumption
     Future directions:
     • Explore more trade-offs among resource usage metrics
     • Evaluate NOTICE:
       - on real-world benchmarks
       - on other case studies (e.g., compilers, programs, etc.)
  22. Questions? https://noticegcc.wordpress.com/
  23. Additional slides
  24. Tool Support
  25. Literature Overview: Functional Testing of Compilers
  26. Literature Overview: Non-Functional Testing of Compilers
  27. Prior work is insufficient
     Testing the non-functional properties poses several new challenges:
     • Different cost-benefit trade-offs (e.g., speedup/memory or CPU usage)
     • Finely auto-tuning compilers according to user (non-functional) requirements
     Limitations of prior work:
     • Performance is the major concern (e.g., speedup)
     • Other important non-functional properties are ignored (e.g., resource consumption properties)
     • Evaluation is based on a small set of input programs (e.g., SPEC CPU benchmarks)
  28. Problem Statement
     • Given a set of compiler optimization options {F1, F2, ..., Fn}, how can we find the combination that maximizes program performance and does better than the standard optimization levels?
     • Do this efficiently, without the use of a priori knowledge of the optimizations and their interactions.
  29. NSGA-II overview
     • NSGA-II: Non-dominated Sorting Genetic Algorithm (K. Deb et al., '02)
     • Diagram: the parent population and the offspring population are combined, ranked by non-dominated sorting into fronts F1, F2, F3, F4, then truncated by crowding distance sorting to form the population in the next generation
     • MOEA Framework: http://moeaframework.org/
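
The survivor-selection step in this diagram can be sketched as follows. The code ranks a combined parent and offspring population into fronts, then fills the next generation front by front, using crowding distance to break ties in the last front that fits only partially. It is a simplified illustration that assumes minimized objectives: the sorting peels fronts one at a time instead of using the bookkeeping of the original fast non-dominated sort, and it is not the MOEA Framework's implementation.

```python
def dominates(a, b):
    """a dominates b when it is no worse everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(points):
    """Split point indices into fronts F1, F2, ... by repeated peeling."""
    remaining, fronts = list(range(len(points))), []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in one front (larger = lonelier)."""
    distance = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        low, high = points[ordered[0]][m], points[ordered[-1]][m]
        # Boundary solutions are always kept.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (points[nxt][m] - points[prev][m]) / (high - low)
    return distance


def next_generation(points, size):
    """Indices of the `size` survivors from the combined population."""
    selected = []
    for front in non_dominated_sort(points):
        if len(selected) + len(front) <= size:
            selected.extend(front)
            continue
        dist = crowding_distance(points, front)
        ranked = sorted(front, key=lambda i: dist[i], reverse=True)
        selected.extend(ranked[:size - len(selected)])
        break
    return selected


if __name__ == "__main__":
    # Hypothetical (time, memory) objective vectors, both minimized.
    combined = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
    print(non_dominated_sort(combined))
    print(next_generation(combined, 4))
```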
  30. NSGA-II overview
