GPCE16: Automatic Non-functional Testing of Code Generators Families

Automatic Non-functional Testing of
Code Generators Families
Mohamed
BOUSSAA
Olivier
BARAIS
Gerson
SUNYE
Benoit
BAUDRY
2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016)
August 1-3, 2016 - Vienna, Austria
INRIA Rennes, France
15th International Conference on Generative Programming: Concepts & Experiences (GPCE 2016)
Amsterdam, Netherlands, October 31 – November 1, 2016

Outline
1. Context
2. Motivation
3. Automatic Non-functional Testing of Code Generators Families
4. Performance Evaluation
5. Conclusion

Context
[Figure: software design and automatic code generation. A software designer writes specifications (DSL, model, GUI); code generators, built by code generator creators/maintainers, turn them into generated code (GPL) for a diversity of software platforms.]
All tests pass, but…
What about the non-functional properties (quality) of the generated code?
Code generators are used everywhere
They automatically transform high-level system specifications (models, DSLs,
GUIs, etc.) into general-purpose languages (Java, C++, C#, etc.)
They target diverse and heterogeneous software platforms

Context
• Testing issues:
- Defective code generators may generate poor-quality code
- Testing the non-functional properties is time-consuming
- Testing requires examining several different non-functional requirements
- Code generators are complex and difficult to understand (they involve complex
and heterogeneous technologies)

Motivation
 Non-functional testing of code generators: The traditional way
• Analyze the non-functional properties of generated code using platform-specific tools, profilers, etc.
Lack of tools for automatic non-functional testing of code generators
[Figure: the traditional workflow in four stages (Software Design, Code Generation, Code Execution, Non-functional Testing). A DSL model is designed once; code generators A, B and C each generate a SUT (in Java, C# and C++); each SUT is executed on its own platform (A, B, C) and measured with a platform-specific profiler (A, B, C); the resulting footprints A, B and C are reported for bug finding.]

Automatic Non-functional Testing of
Code Generators Families
https://testingcodegenerators.wordpress.com

Contributions
 We propose:
• A runtime monitoring infrastructure, based on system containers (Docker) as
execution platforms, that allows code-generator developers to evaluate the
non-functional properties of generated code
• A black-box testing approach to automatically detect potentially inefficient
code generators

Microservice-based infrastructure
 Execute and monitor the generated code using system containers
 Different configurations, instances, images, machines, etc.
 Resource isolation and management
 Lower performance overhead
 Provides a fine-grained understanding and analysis of compiler behavior
 Automatic extraction of non-functional properties related to resource usage

Approach Overview
[Figure: the traditional workflow (platforms A, B, C with profilers A, B, C producing footprints A, B, C) shown next to the proposed container-based workflow. Stages of the proposed workflow: Software Design, Code Generation, Code Execution, Runtime monitoring engine. The SUTs generated from the DSL model by code generators A, B and C (Java, C#, C++) run in containers A, B and C. Other components shown: containers A', B' and C', a monitoring container, a back-end database container, a front-end visualization container, REST calls and requests between them, footprints A', B' and C', and bug finding.]

Approach Overview
1. Code Execution: compile and execute the generated code within a new container instance
2. Runtime Monitoring: gather, at runtime, the non-functional properties of the running programs under test
3. Time-series Database: save information about resource consumption in a time-series database
4. Performance Analysis: analyze the performance and non-functional properties of the programs under test
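The monitoring and analysis steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the two metrics are derived from the raw records, so the `Sample` type, the `summarize` function, the use of peak memory and the first-to-last-sample time span are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One monitoring record for a running container (hypothetical shape)."""
    timestamp_s: float  # seconds since the container started
    memory_bytes: int   # memory usage reported at that instant


def summarize(samples):
    """Reduce one run's samples to (execution time in s, peak memory in MB).

    Assumption: execution time is the span between the first and last
    sample, and memory usage is summarized by its peak value.
    """
    if not samples:
        raise ValueError("no samples recorded for this run")
    ordered = sorted(samples, key=lambda s: s.timestamp_s)
    execution_time_s = ordered[-1].timestamp_s - ordered[0].timestamp_s
    peak_memory_mb = max(s.memory_bytes for s in ordered) / (1024 * 1024)
    return execution_time_s, peak_memory_mb


# Made-up samples for a single run, for illustration only.
MB = 1024 * 1024
run = [Sample(0.0, 10 * MB), Sample(1.0, 30 * MB), Sample(2.5, 20 * MB)]
execution_time_s, peak_memory_mb = summarize(run)
```

One such pair per target language is what the later comparison step would consume.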

Testing Infrastructure
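[Figure: testing infrastructure. The software tester triggers code generation and compilation; the component under test runs in a container; while it is running, the monitoring component collects monitoring records from the cgroup file systems; the records are sent through HTTP requests (port 8086) to the back-end database component, a time-series database; the front-end visualization component displays them.]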
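To give an idea of what a monitoring record looks like once it reaches the time-series database, here is a minimal sketch that formats one memory sample in InfluxDB's line protocol (`measurement,tags fields timestamp`). In the setup described here the monitoring component writes the records itself, so this code is not part of the infrastructure; the measurement name, the tag names and the `to_line` helper are hypothetical.

```python
def escape(text):
    """Escape the characters that line protocol treats as separators in tags."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, value, timestamp_ns):
    """Format one integer-valued sample as a line-protocol record.

    Tags are sorted by key; the trailing `i` marks the field as an integer,
    and the timestamp is expressed in nanoseconds.
    """
    tag_part = "".join(
        f",{escape(key)}={escape(val)}" for key, val in sorted(tags.items())
    )
    return f"{measurement}{tag_part} value={int(value)}i {timestamp_ns}"


# Hypothetical record: 50 MB used by a container running PHP-generated code.
line = to_line(
    "memory_usage",
    {"container": "color_ts4_php", "target": "php"},
    50 * 1024 * 1024,
    1465839830100400200,
)
```

A record like this would be sent in the body of an HTTP request to the database, which listens on port 8086 by default.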

Testing Method
Definition (Code generator family):
We define a code generator family as a set of code generators that take as input
the same language/model and generate code for different target platforms
(examples: Haxe, ThingML, etc.)
Differential Testing:
Compare equivalent implementations of the same program written in different
languages
Standard deviation (std_dev):
Quantify the amount of variation among the execution traces in terms of memory
usage and execution time

Testing Method
Test suites whose std_dev exceeds a threshold value k are interpreted as code generator
inconsistencies
[Figure: the memory usage measured for each target is compared; std_dev > k is reported as a bug, std_dev < k as no bug.]
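The threshold rule above can be sketched as follows. The slides do not say whether the population or the sample standard deviation is used, so the choice of `statistics.pstdev` is an assumption. The function names, the suite names and all measurement values are invented for illustration.

```python
from statistics import pstdev


def find_inconsistencies(measurements, k):
    """Return {suite: std_dev} for suites whose std_dev across targets exceeds k.

    `measurements` maps a test suite name to {target: metric value}.
    Assumption: population standard deviation (pstdev).
    """
    flagged = {}
    for suite, per_target in measurements.items():
        std_dev = pstdev(list(per_target.values()))
        if std_dev > k:
            flagged[suite] = std_dev
    return flagged


def most_deviating_target(per_target):
    """Return the target whose value lies farthest from the mean."""
    mean = sum(per_target.values()) / len(per_target)
    return max(per_target, key=lambda target: abs(per_target[target] - mean))


# Invented execution times in seconds, for illustration only.
execution_times = {
    "suite_a": {"cs": 2.0, "cpp": 1.0, "java": 3.0, "js": 2.0, "php": 2.0},
    "suite_b": {"cs": 5.0, "cpp": 3.0, "java": 4.0, "js": 6.0, "php": 200.0},
}

flagged = find_inconsistencies(execution_times, k=60)
suspects = {suite: most_deviating_target(execution_times[suite]) for suite in flagged}
```

With these invented values only `suite_b` exceeds the threshold, and the target farthest from the mean is `php`. A flagged suite only says that the targets disagree; finding which generator is responsible still requires looking at the individual measurements.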

Evaluation
https://testingcodegenerators.wordpress.com/experimental-results/

Experimental Setup
Code generators under test: Haxe compilers (5 targets: C#, C++, Java, JS, PHP)
Programs under test: Haxe libraries + test suites
Non-functional metrics: execution time (s), memory usage (MB)
For monitoring: Google cAdvisor
For storage: InfluxDB

Validation
• Comparison results of running each test suite across the five target languages; the metric used is the standard deviation between execution times
• Most standard deviations fall within the 0 to 8 interval
• 8 data points have an extremely high std_dev

Validation
 Test suites with the highest variation in terms of execution time (k=60)
We can identify a singular behavior of the PHP code regarding the execution
time

Validation
• Comparison results of running each test suite across the five target languages; the metric used is the standard deviation between memory consumptions
• Most standard deviations fall within the 0 to 150 interval
• 6 data points have an extremely high std_dev

Validation
 Test suites with the highest variation in terms of memory usage (k=400)
We can identify a singular behavior of the PHP code regarding the memory
usage

Validation
For Color_TS4 in PHP:
• We observe intensive use of "arrays"
• We replace "arrays" with "SplFixedArray"
=> Speedup: x5
=> Memory usage reduction: x2

Conclusion
Summary
 An approach for testing and monitoring code generator families using a container-based infrastructure
 Automatically extracts information about resource usage
 The evaluation results show that we can find real issues in existing code generators (i.e., PHP)
Future directions
 Detect more code generator issues (e.g., CPU consumption)
 Evaluate our approach:
• On other code generator families
• Against other state-of-the-art approaches

https://testingcodegenerators.wordpress.com
Questions?

Code Generators Testing: ThingML
