A Survey on Automatic Software Evolution Techniques

Jindae Kim
PhD Qualifying Examination, Aug 31, 2015
HKUST

Overview
• Automatic Software Evolution
• Approaches
• Challenges
• Proposed Idea

Automatic Software Evolution
• An activity or technique that evolves software automatically.
• Supports the software development process and increases the productivity of human developers.

Area: Techniques
• Refactoring: Henkel et al. (2005), Murphy-Hill et al. (2007), Higo et al. (2008), Tsantalis et al. (2009), Tsantalis et al. (2010), Tsantalis et al. (2011), Dijkman et al. (2011)
• Automatic Patch Generation: Arcuri (2008), Arcuri et al. (2008), Dallmeier et al. (2009), Weimer et al. (2009), Wei et al. (2010), Orlov and Sipper (2011), Le Goues et al. (2012), Kim et al. (2013), Nguyen et al. (2013), Long et al. (2015)
• Automatic Runtime Recovery: Rinard et al. (2004), Elkarablieh and Khurshid (2008), Dobolyi et al. (2008), Nagarajan et al. (2009), Perkins et al. (2009), Carbin et al. (2011), Kling et al. (2012), Carzaniga et al. (2013), Long et al. (2014)
• Performance Improvement: White et al. (2008), Langdon et al. (2010), Orlov et al. (2011), White et al. (2011), Harman et al. (2012), Langdon et al. (2013), Petke et al. (2014)

Generate and Validate
• The most recent and popular approach in automatic software evolution.
• Evolves a program in various aspects, validating each generated variant.

[Figure: Overview of Generate and Validate System. A seed program enters the Generate step, which produces program variants; the Validate step passes or fails each variant, and a passing variant becomes the target program.]
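The generate-and-validate loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any surveyed tool: the "program" is a hypothetical pair of coefficients [a, b] for f(x) = a * x + b, and the names run, generate, validate, and generate_and_validate are invented for this example.

```python
import random


def run(program, x):
    """Execute the toy program: program is [a, b], meaning f(x) = a * x + b."""
    a, b = program
    return a * x + b


def validate(program, tests):
    """Validate step: a variant passes only if it passes every test case."""
    return all(run(program, x) == expected for x, expected in tests)


def generate(seed, rng):
    """Generate step: endlessly yield variants one small mutation away from the seed."""
    while True:
        variant = list(seed)
        index = rng.randrange(len(variant))
        variant[index] += rng.choice([-1, 1])
        yield variant


def generate_and_validate(seed, tests, budget=1000, rng=None):
    """Return the first variant that passes validation, or None if the budget runs out."""
    rng = rng or random.Random(0)
    for trial, variant in enumerate(generate(seed, rng)):
        if trial >= budget:
            return None
        if validate(variant, tests):
            return variant
    return None


# The seed computes 2x + 3, but the tests describe 2x + 4.
repaired = generate_and_validate([2, 3], [(0, 4), (1, 6)])
```

A variant returned by this loop is only known to pass the given tests; whether it is also the intended program depends on how well the tests capture the intent.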

[Figure: Search Space of Possible Program Variants. Both axes are program features. The plot shows the seed program, plausible variants, and the target variant; successive frames label plausible variants "Good!", "Not Bad", and "Wrong!".]

Categorization of Generate and Validate Approaches
(by search method, then by variant generation)
Genetic Programming (GP)
• Simple Mutations: Arcuri et al. 2008, FINCH (Orlov et al. 2011).
• Mutations with Existing Source Code: GenProg (Le Goues et al. 2012), Petke et al. 2014.
• Pre-defined Templates: PAR (Kim et al. 2013).
Random/Heuristic Search
• Simple Mutations: Debroy and Wong 2010, SemFix (Nguyen et al. 2013).
• Mutations with Existing Source Code: AE (Weimer et al. 2013), TrpAutoRepair (Qi et al. 2013), RSRepair (Qi et al. 2014).
• Pre-defined Templates: SPR (Long et al. 2015), Prophet (Long et al. 2015).

Simple Mutations + Genetic Programming

Genetic Programming
[Figure: Overview of GenProg (Le Goues, 2013)]

Co-Evolutionary Method
• Co-evolution of source code and a test suite.
• Uses 34 basic primitives.
• Evaluated on eight seeded faults, of which five were fixed.

[Figure: Partial GP implementation of Bubble Sort by Arcuri et al.]

Simple Mutations + Random/Heuristic Search

SemFix
• Program repair via semantic analysis.
• Selects a target statement based on fault localization.
• Generates a repair constraint from a test suite.
• Synthesizes a new statement satisfying the constraint.
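The last two steps can be illustrated with a toy sketch. It is not SemFix's implementation: SemFix derives the repair constraint through semantic analysis of the program, whereas here the constraint is written by hand and the synthesis step is replaced by brute-force enumeration over a few assumed components. All names (REPAIR_CONSTRAINT, COMPONENTS, candidates, synthesize) are hypothetical.

```python
import itertools
import operator

# Assumed repair constraint: for each test, the variable values reaching the
# target statement and the value the repaired expression must produce.
REPAIR_CONSTRAINT = [
    ({"x": 1, "y": 5}, 6),
    ({"x": 4, "y": 2}, 6),
    ({"x": 0, "y": 0}, 0),
]

# Assumed basic components available to the synthesizer.
COMPONENTS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def candidates(variables, constants):
    """Enumerate expressions from simple to complex: leaves, then one operator."""
    leaves = [(name, lambda env, n=name: env[n]) for name in variables]
    leaves += [(str(c), lambda env, c=c: c) for c in constants]
    yield from leaves
    for (ltext, lfn), (rtext, rfn) in itertools.product(leaves, repeat=2):
        for symbol, op in COMPONENTS.items():
            yield (f"{ltext} {symbol} {rtext}",
                   lambda env, l=lfn, r=rfn, op=op: op(l(env), r(env)))


def synthesize(constraint, variables, constants=(0, 1)):
    """Return the first expression that satisfies the constraint on every test."""
    for text, fn in candidates(variables, constants):
        if all(fn(env) == expected for env, expected in constraint):
            return text
    return None


expression = synthesize(REPAIR_CONSTRAINT, ["x", "y"])  # "x + y"
```

Because the component set is tiny, only very simple expressions can be synthesized, which mirrors the limitation that the components used in statement synthesis are simple.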

SemFix
[Figure: Basic components used in SemFix by Nguyen et al.]

SemFix
• SemFix repaired 48 out of 90 bugs from SIR and GNU
Coreutils.
• SemFix also generated a repair faster than a GP-based
technique.

Techniques: Description and Limitation
Co-evolutionary Method
• Description: Evolving source code and a test suite together. Using simple primitives.
• Limitation: Evaluated on seeded faults for a simple program. Primitives are too small.
SemFix
• Description: Derives repair constraints from source code and test cases. Synthesizes a statement satisfying the constraints.
• Limitation: Components used in statement synthesis are simple. Only evaluated on small programs.

Existing Source Code + Genetic Programming

GenProg
[Figure: Overview of GenProg (Le Goues, 2013)]

GenProg
• Automatic program repair technique.
• Statement insertion/deletion/replacement.
• Uses source code from the same revision.
• Fixed 55 out of 105 bugs.
• Assumes the code needed for a patch already exists in the existing source code.
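A rough sketch of statement-level mutation that reuses existing code is given below. It only illustrates the three edit kinds on a hypothetical program represented as a list of statement strings; the names PROGRAM and mutate are invented here, and GenProg's actual search (fitness evaluation, crossover, weighted location choice) is left out.

```python
import random

# Hypothetical program: one statement per list element.
PROGRAM = ["x = read()", "y = x * 2", "check(y)", "print(y)"]


def mutate(program, location, rng):
    """Apply one statement-level edit at `location`, reusing statements from `program`."""
    variant = list(program)
    kind = rng.choice(["insert", "delete", "replace"])
    if kind == "delete":
        del variant[location]
    else:
        donor = rng.choice(program)  # new code comes only from the same program
        if kind == "insert":
            variant.insert(location + 1, donor)
        else:
            variant[location] = donor
    return kind, variant


rng = random.Random(1)
kind, variant = mutate(PROGRAM, 2, rng)
```

Every statement in a variant already occurs somewhere in the original program, so a fix that needs a statement not present in the code base is outside this search space.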

Genetic Improvement
• Petke et al. evolved the MiniSAT solver for Combinatorial Interaction Testing (CIT).
• Uses multiple human-written variations of the MiniSAT solver as the code base.
• The evolved MiniSAT is even faster than the human-written versions on CIT.

Existing Source Code + Random/Heuristic Search

AE
• Generates all possible variants one by one.
• Uses the same mutations as GenProg.
• Selects a fix location based on fault localization.
• Detects equivalent variants and skips their validation.
• Generates first-order mutants only.
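The enumeration described above might look roughly like the following sketch. It is an assumption-laden stand-in: equivalence is approximated by plain syntactic identity, the program is a hypothetical list of statement strings, and first_order_variants is an invented name. The checks AE actually uses to decide equivalence are not reproduced here.

```python
def first_order_variants(program, locations):
    """Yield each distinct program reachable by exactly one delete, insert, or replace."""
    seen = {tuple(program)}  # a variant identical to the seed needs no validation
    donors = sorted(set(program))  # statements available for reuse
    for loc in locations:  # assumed to be ordered by fault-localization rank
        edits = [program[:loc] + program[loc + 1:]]  # delete
        for donor in donors:
            edits.append(program[:loc + 1] + [donor] + program[loc + 1:])  # insert
            edits.append(program[:loc] + [donor] + program[loc + 1:])  # replace
        for variant in edits:
            key = tuple(variant)
            if key not in seen:  # skip variants already produced by another edit
                seen.add(key)
                yield variant


variants = list(first_order_variants(["a", "b", "a"], [1, 0]))
```

For this three-statement program and two locations, ten single edits are possible, but three of them duplicate the seed or an earlier variant, so only seven variants would be sent to validation.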

RSRepair
• RSRepair has the same search space as GenProg.
• Generates variants one by one, which requires fewer patch trials.
• Randomly selects a change location.
• Outperforms GenProg in 23 out of 24 cases.

GenProg
• Description: Statement level mutations. Insert/replace new statement from existing code.
Genetic Improvement
• Description: Statement level mutations. Using multiple variations of the same program as code base.
Limitation (GenProg and Genetic Improvement): Lack of ability to create new statements. Fitness guided search is not effective.
AE
• Description: Deterministic search. Equivalent variant detection.
RSRepair
• Description: Random search.
Limitation (AE and RSRepair): Lack of ability to create new statements. Search space is limited to variants with only one mutation.

Pre-defined Templates + Genetic Programming

PAR
[Figure: Overview of PAR framework by Kim et al.]

PAR
• PAR uses 10 pre-defined fix templates.
• Fix templates are drawn from manual inspection of
human patches.
• Fix templates include null checker, parameter changes
and expression changes.
• Patches requiring new code can be generated.
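As an illustration of what a fix template does, the sketch below applies a null-checker-style template to a Java-like statement held as a string. It is a simplified assumption, not PAR's implementation: the function name null_checker and the regex-based receiver detection are invented for this example, and chained accesses such as a.b.c are not handled correctly.

```python
import re


def null_checker(statement):
    """Wrap a Java-like statement in a guard for every receiver it dereferences."""
    receivers = []
    # A receiver is approximated as an identifier directly followed by a dot.
    for name in re.findall(r"\b([A-Za-z_]\w*)\s*\.", statement):
        if name not in receivers:
            receivers.append(name)
    if not receivers:
        return statement  # nothing is dereferenced, so the template does not apply
    guard = " && ".join(f"{name} != null" for name in receivers)
    return f"if ({guard}) {{ {statement} }}"


patched = null_checker("total = order.getPrice() + user.discount;")
# if (order != null && user != null) { total = order.getPrice() + user.discount; }
```

The guard condition is new code that does not appear anywhere in the original program, which is what separates a template from pure statement reuse.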

Patch Acceptability
[Figure: Average ranks evaluated by 68 developers by Kim et al.]

Pre-defined Templates + Heuristic Search

SPR
• Uses seven transformation schemas.
• Uses condition synthesis to instantiate transformation schemas.
• Applies schemas in a pre-defined order.
• Prioritizes transformations on branches and memory initializations.

Plausible vs. Correct Patches
Repair Generation Results by Long et al. (69 defects in total)
            SPR   GenProg   AE
Plausible   37    16        25
Correct     11    1         2

[Figure: Search Space comparison by Long et al.]

Prophet
• Uses the same transformation schemas as SPR.
• Learns a model from successful patches.
• Ranks candidate patches based on the trained model.
• Prophet generated correct patches for 15 defects, while SPR generated 11.
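Ranking candidates with a trained model can be sketched as follows. The feature names and weights are hypothetical placeholders standing in for parameters learned from successful patches, and the softmax over linear scores is an assumed simplification rather than Prophet's actual model.

```python
import math

# Hypothetical weights, standing in for parameters learned from past patches.
WEIGHTS = {"adds_guard": 1.2, "changes_condition": 0.8, "deletes_statement": -1.5}


def score(features):
    """Linear score of one candidate patch from its feature values."""
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())


def rank(candidates):
    """Return (patch, probability) pairs, most probable first (softmax over scores)."""
    scores = {patch: score(features) for patch, features in candidates.items()}
    total = sum(math.exp(s) for s in scores.values())
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [(patch, math.exp(scores[patch]) / total) for patch in ranked]


CANDIDATES = {
    "delete call": {"deletes_statement": 1},
    "add null guard": {"adds_guard": 1},
    "tighten condition": {"changes_condition": 1},
}
ordering = [patch for patch, _ in rank(CANDIDATES)]
# ['add null guard', 'tighten condition', 'delete call']
```

The search space is unchanged by the ranking; only the order in which candidates reach validation differs.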

[Figure: Repair Searches of SPR and Prophet. Both axes are program features. The plot shows the seed program, a plausible variant, and the target variant; successive frames show the search of SPR and then the search of Prophet.]

PAR
• Description: 10 fix templates from manual inspection.
• Limitation: Only some of the fix templates are useful. Template instantiation uses existing code.
SPR
• Description: Seven transformation schemas. Condition synthesis for schema instantiation.
• Limitation: Hard-coded heuristic search. Search space is limited by schemas.
Prophet
• Description: Same transformation schemas as SPR. Ranks variants based on a probabilistic model.
• Limitation: Search space is limited by schemas.

Search Space Explosion
• Pre-defined templates limit the search space.
• SPR's search space contains correct variants for only 19 out of 69 defects.
• 35 out of 69 defects can be fixed with an extended search space (Long et al. 2015).
• What about the additional costs?

Search Method
• Random application of mutations may generate plausible but incorrect variants (Qi et al. 2015).
• Prophet can find four more correct patches than SPR.
• The only difference between them is the search method.
• Extending the search space makes the search even harder.
• An effective and efficient search method is necessary.

How to address the issues?
• Avoid error-prone changes by learning from existing changes.
• PAR and SPR show that the template approach works.
• Identify frequent changes from software repositories, then use them as templates.
• Mine usage patterns of such changes to assist the search.

Summary
• Automatic software evolution has been used in many areas.
• Generate and Validate systems have advanced in two major directions: program variant generation and search method.
• Current challenges are search space explosion and the need for an effective search method.

Approaches: Limitation
Variant Generation
• Simple Mutations: Applied modifications are very simple. Scalability issue: only works for small programs.
• Mutations with Existing Source Code: Existing code restricts possible program variants. Low possibility that necessary code fragments exist.
• Pre-defined Templates: Pre-defined templates restrict search space. Only a small number of templates are used.
Search Method
• Genetic Programming: Additional costs for fitness evaluation. Fitness guided search is not effective.
• Random/Heuristic Search: Search space is limited based on the number of mutations. Mostly consider only one mutation.
