This document summarizes an approach for automatically testing the non-functional properties of code generated by code generators from a common input language or model across different target platforms. It proposes using system containers to execute the generated code and monitor runtime properties like execution time and memory usage. Differences in the monitored properties between implementations of the same program on different languages/platforms could indicate bugs or inefficiencies in the code generators.
GPCE16 Poster: Automatic Non-functional Testing of Code Generators Families
1. Automatic Non-functional Testing of Code Generators Families
Mohamed Boussaa, Olivier Barais, Gerson Sunye and Benoit Baudry
INRIA Rennes, France
Contact
Mohamed BOUSSAA
INRIA Rennes, France
Presentation date: GPCE 2016 - Tue 1 Nov: 16:40 at Z1
Email: mohamed.boussaa@inria.fr
Personal webpage: mboussaa@wordpress.com
Tool webpage: testingcodegenerators.wordpress.com
Phone: +33626492436
Context
Code generators are used everywhere
They automatically transform high-level system specifications (Models, DSLs, GUIs, etc.) into general-purpose languages
Target diverse and heterogeneous software platforms (JAVA, C++, C#, etc.)
The generated software artifacts work but…
How about the non-functional properties (quality) of generated code?
¤ Definition (Code generator family): We define a code generator family as a set of code generators that take the same language/model as input and generate code for different target platforms
¤ Non-functional testing of code generators: The classical way
• Analyze the non-functional properties of generated code using platform-specific tools, profilers, etc.
• Report inconsistencies, bugs, performance issues, etc.
Lack of tools for automatic non-functional testing of code generators
15th International Conference on Generative Programming: Concepts & Experiences (GPCE 2016)
Amsterdam, Netherlands, October 31, 2016
[Figure: The classical way of testing a code generator family, in four stages. Software Design: a DSL (Model) is designed. Code Generation: Code Generators A, B and C each generate a SUT. Code Execution: the SUTs are executed on Platforms A, B and C (target languages shown: JAVA, C#, C++). Non-functional Testing: Profilers A, B and C produce Footprints A, B and C, which are reported for bugs finding.]
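[Figure: Context. Software Design: a software designer writes DSL, Model, GPL, Specs or GUI inputs. Automatic Code Generation: code generators, built by code generator creators/maintainers, produce the generated code. Software Platform Diversity: the generated code targets diverse platforms.]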
[Figure: Approach overview, in four stages. Software Design: a DSL (Model) is designed. Code Generation: Code Generators A, B and C each generate a SUT. Code Execution: the SUTs run in Containers A, B and C (JAVA, C#, C++). Runtime monitoring engine: Containers A’, B’ and C’ collect Footprints A’, B’ and C’; a Monitoring Container, a Back-end Data Base Container and a Front-end Visualization Container exchange requests through REST calls, and the results feed bugs finding.]
Testing Infrastructure
Research Team: diverse.irisa.fr
Research Lab: inria.fr/centre/rennes
EU Funding Project: heads-project.eu
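[Figure: Testing infrastructure. A software tester provides a Haxe library and its test suites; code generation produces the code for each target language. The generated code runs in the Component Under Test. The Back-end Monitoring Component reads the cgroup file systems and sends monitoring records to the Back-end Database Component (time-series database, 8086:). The Front-end Visualization Component queries it through HTTP requests and displays CPU, Memory, …]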
Motivation
We propose:
• A black-box approach to automatically detect potentially inefficient code generators
• A runtime monitoring infrastructure, based on system containers (Docker) as execution platforms, that allows code-generator developers to evaluate the non-functional properties of generated code
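The infrastructure diagram lists cgroup file systems as the source of monitoring records. The poster does not say how they are read, so the following is only a minimal Python sketch under stated assumptions: the caller already knows the cgroup directory of the container under test, and memory usage is exposed in `memory.current` (cgroup v2) or `memory.usage_in_bytes` (cgroup v1). The function names are hypothetical.

```python
import time
from pathlib import Path

# File names used by the Linux kernel: cgroup v2 first, then cgroup v1.
MEMORY_FILES = ("memory.current", "memory.usage_in_bytes")


def read_memory_bytes(cgroup_dir: Path) -> int:
    """Return the current memory usage, in bytes, recorded for one cgroup."""
    for name in MEMORY_FILES:
        candidate = cgroup_dir / name
        if candidate.is_file():
            return int(candidate.read_text().strip())
    raise FileNotFoundError(f"no memory usage file found in {cgroup_dir}")


def sample_memory(cgroup_dir: Path, samples: int, interval_s: float = 0.0) -> list[int]:
    """Collect `samples` readings, sleeping `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(read_memory_bytes(cgroup_dir))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

Because the readings come from the container's cgroup rather than from a language-specific profiler, the same sampler can be reused for every target platform, which is what makes the comparison across generators possible.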
Differential Testing: Compare equivalent implementations of the same
program written in different languages
Standard deviation (std_dev): Quantify the amount of variation among
the execution traces in terms of memory usage and execution time
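As a rough illustration of the metric, the sketch below computes the standard deviation of one test suite's measurements across target languages. The poster does not state whether the population or the sample formula is used; the population form (`statistics.pstdev`) is an assumption here, and the function name and the numbers are made up for illustration.

```python
from statistics import pstdev


def trace_std_dev(measurements: dict[str, float]) -> float:
    """Standard deviation of one metric across target languages for one test suite.

    `measurements` maps a target language to a single value of the metric
    (for example, execution time or memory usage of the same test suite).
    """
    if len(measurements) < 2:
        raise ValueError("need measurements from at least two target languages")
    return pstdev(list(measurements.values()))


# Illustrative values only, not results from the poster.
example = {"java": 10.0, "cpp": 10.0, "cs": 10.0, "php": 30.0}
print(round(trace_std_dev(example), 2))  # about 8.66
```

A value near zero means the implementations behave alike; a large value means at least one target deviates from the others.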
Testing Method
Approach Overview
¤ Test suites with the highest variation in terms of execution time (k=60)
¤ Test suites with the highest variation in terms of memory usage (k=400)
We can identify a singular behavior of the PHP code regarding memory usage and execution time
Validation…
Test suites whose Std_dev exceeds the threshold value k are interpreted as code generator inconsistencies
[Figure: The memory usage of each implementation is compared. Std_dev > k: Bug. Std_dev < k: No Bug.]
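The decision rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's implementation: the population standard deviation is assumed, `most_deviant_target` is a hypothetical helper that picks the target farthest from the median, and the data are invented.

```python
from statistics import median, pstdev


def find_inconsistencies(results: dict[str, dict[str, float]], k: float) -> list[str]:
    """Return the test suites whose std_dev across targets is greater than k."""
    return [
        suite
        for suite, by_target in results.items()
        if pstdev(list(by_target.values())) > k
    ]


def most_deviant_target(by_target: dict[str, float]) -> str:
    """Hypothetical helper: the target whose value is farthest from the median."""
    med = median(by_target.values())
    return max(by_target, key=lambda target: abs(by_target[target] - med))


# Illustrative values only, not results from the poster.
results = {
    "suite_a": {"java": 10.0, "cpp": 11.0, "php": 10.0},
    "suite_b": {"java": 10.0, "cpp": 10.0, "cs": 10.0, "php": 30.0},
}
for suite in find_inconsistencies(results, k=5.0):
    print(suite, most_deviant_target(results[suite]))
```

The threshold k depends on the metric and its unit, so it has to be chosen separately for execution time and for memory usage.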