Reformulating Branch Coverage as a Many-Objective Optimization Problem
Annibale Panichella, Fitsum M. Kifetew, Paolo Tonella
Evolutionary Testing
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
[Figure: control-flow graph with nodes 1-6; test case TC1 in the test suite targets branch b1]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b1) = approach_level(b1) + branch_distance(b1)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
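A minimal sketch (in Java, not the authors' implementation) of the one-target fitness shown on this slide: approach level plus a normalised branch distance, to be minimised. The class name, method names, and example values are illustrative only.

    // One-target fitness: smaller is better, 0 means the target branch is covered.
    public final class OneTargetFitness {

        // Normalises a raw branch distance into [0, 1), as commonly done in
        // search-based testing, so it never dominates the approach level.
        static double normalise(double branchDistance) {
            return branchDistance / (branchDistance + 1.0);
        }

        // approachLevel: number of control-dependent decisions between the point
        // where execution diverged and the target branch.
        // branchDistance: how far the diverging condition was from evaluating
        // towards the target (e.g., |a - 0| for a "case 0" branch).
        static double fitness(int approachLevel, double branchDistance) {
            return approachLevel + normalise(branchDistance);
        }

        public static void main(String[] args) {
            // Example: execution diverged two decisions away, with distance 5.
            System.out.println(fitness(2, 5.0)); // ~2.833
        }
    }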
Evolutionary Testing
[Figure: same control-flow graph; a second test case TC2 targets branch b2]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b2) = approach_level(b2) + branch_distance(b2)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
[Figure: same control-flow graph; a third test case targets branch b3]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b3) = approach_level(b3) + branch_distance(b3)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
Limitations of the one-target approach:
- Infeasible branches (waste of time)
- Some branches require more search budget
- How to properly allocate the search budget?
[Figure: same control-flow graph and test suite as before]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
Whole Suite approach:
- Targeting all branches at once
- Chromosome = Test Suite (instead of a single Test Case)
- Running Genetic Algorithm only once
[Figure: control-flow graph with branches b1-b7; the whole test suite (TC1, TC2, ...) is evolved at once]
Fitness ≈ ∑ branch_distance(bi)
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Evolutionary Testing
Whole Suite approach:
- Targeting all branches at once
- Chromosome = Test Suite (instead of a single Test Case)
- Running Genetic Algorithm only once
Advantages:
- Whole Suite > One-Target
- No issues regarding budget allocation
- All branches are considered at the same time
[Figure: same control-flow graph and test suite as before]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Solving Multiple Branches at Once, How?
Fitness Function = branch_distance(b1) + branch_distance(b2) + ... + branch_distance(b7)
[Figure: the branch distances of all branches b1-b7 in the control-flow graph are summed into a single fitness value]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Solving Multiple Branches at Once, How?
Objectives: (b1, b2, b3, b4, b5, b6, b7)
Fitness Function = branch_distance(b1) + branch_distance(b2) + ... + branch_distance(b7)
Sum Scalarization: Many-objective Problem → Single-objective Problem
[Figure: the per-branch objectives are collapsed into one scalar fitness by summation]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
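A minimal sketch (an illustration of the idea, not EvoSuite's actual code) of the sum scalarization used by the whole-suite approach: the best per-branch distances achieved by the suite are summed into one scalar fitness to be minimised. The class name and map-based representation are hypothetical.

    import java.util.Map;

    public final class WholeSuiteFitness {

        // branchDistances: for each branch, the minimum (normalised) distance
        // achieved by any test case in the suite; 0 means the branch is covered.
        static double fitness(Map<String, Double> branchDistances) {
            double sum = 0.0;
            for (double d : branchDistances.values()) {
                sum += d; // sum scalarization: one scalar objective per suite
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(fitness(Map.of("b1", 0.0, "b2", 0.5, "b3", 0.75))); // 1.25
        }
    }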
Many-objective Optimization
Problem Re-formulation:
Given: B = {b1, . . . , bm} branches of a program.
Find: test cases T = {t1, . . . , tn} minimising the following fitness objectives:
min f1(T) = approach_level(b1) + branch_distance(b1)
min f2(T) = approach_level(b2) + branch_distance(b2)
…
min fm(T) = approach_level(bm) + branch_distance(bm)
Many-objective Optimization
Example:
String example(int a) {
    switch (a) {
        case 0:  return "0";
        case 1:  return "1";
        case -1: return "-1";
        default: return "default";
    }
}
Branch distances (one minimisation objective per branch): b1 = |a-0|, b2 = |a-1|, b3 = |a+1|
Many-objective Optimization
Same example, evaluating each candidate test case against all objectives at once:
b1 = |a-0|, b2 = |a-1|, b3 = |a+1|
Candidate test cases: example(0), example(2), example(-2)
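A minimal sketch of the branch distances b1, b2, b3 of the switch example above, treated as separate minimisation objectives; the class name and the printed output are illustrative only.

    public final class SwitchBranchDistances {

        static int b1(int a) { return Math.abs(a - 0); } // distance to "case 0"
        static int b2(int a) { return Math.abs(a - 1); } // distance to "case 1"
        static int b3(int a) { return Math.abs(a + 1); } // distance to "case -1"

        public static void main(String[] args) {
            // Inputs from the slides: example(0) makes b1 = 0, while the other
            // inputs leave nonzero distances to the remaining case branches.
            for (int a : new int[] {0, 2, -2}) {
                System.out.printf("a=%d -> b1=%d, b2=%d, b3=%d%n", a, b1(a), b2(a), b3(a));
            }
        }
    }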
How Many Goals?
Programs can have hundreds of branches (objectives):
- Utf8.java (Guava) = 63 branches (goals)
- LimitChronology (JodaTime) = 113 branches (goals)
- DateUtils.java (Apache Commons Lang) = 314 branches (goals)
Many-objective Solvers
- θ-Non-dominated Sorting Genetic Algorithm-III: ≤ 15 obj.
- Grid-based Evolutionary Algorithm: ≤ 10 obj.
- Hypervolume-based Evolutionary Algorithm: ≤ 25 obj.
- ε-Multi-objective Evolutionary Algorithm: ≤ 3 obj.
- Strength Pareto Evolutionary Algorithm: ≤ 3 obj.
- Non-dominated Sorting Genetic Algorithm-II: ≤ 3 obj.
Many-Objective Solvers do not scale beyond 25 objectives.
We cannot use standard Many-Objective Solvers.
Branch Coverage vs. Optimization Problem
[Figure: Pareto front of the DTLZ1 benchmark problem in the (f1, f2) objective space, with both axes ranging up to 0.5]
In a classical many-objective problem (e.g., DTLZ1):
- Reach as many trade-offs as possible
- Non-dominated solutions are identically optimal
In branch coverage (e.g., objectives b1 = |a-b| and b2 = |b-c|):
- Not all non-dominated solutions are optimal
- The point closest to covering b1 and the point closest to covering b2 (the minimum on each objective) are better than the others
MOSA: a New Many-Objective Sorting Algorithm
Preference Criterion:
Given a branch bi, a test case x is preferred over another test case y if and only if the values of the objective function for bi satisfy the following condition:
bi(x) < bi(y)
[Figure: first front in the (b1, b2) objective space, with b1 = |a-b| and b2 = |b-c|]
MOSA: a New Many-Objective Sorting Algorithm
Preference Criteria:
Given a branch bi, a test case x is preferred over another test case y if and only if the values of the objective function for bi satisfy the following condition:
bi(x) < bi(y)
or
bi(x) == bi(y) and #statement(x) < #statement(y)
[Figure: applying the preference criterion repeatedly ranks the test cases into first, second, third and fourth fronts in the (b1, b2) objective space, with b1 = |a-b| and b2 = |b-c|]
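A minimal sketch (not the authors' EvoSuite code) of the preference criterion above for a single branch bi: the smaller objective value wins, and ties are broken by test-case length. The TestCase representation and helper names are hypothetical.

    import java.util.Comparator;
    import java.util.List;

    final class TestCase {
        final double[] objectives; // objectives[i] = approach level + branch distance for branch bi
        final int statements;      // length of the test case (#statements)

        TestCase(double[] objectives, int statements) {
            this.objectives = objectives;
            this.statements = statements;
        }
    }

    public final class PreferenceCriterion {

        // Returns the test case preferred for branch bi among the candidates.
        static TestCase bestFor(int bi, List<TestCase> candidates) {
            return candidates.stream()
                    .min(Comparator
                            .comparingDouble((TestCase t) -> t.objectives[bi])
                            .thenComparingInt(t -> t.statements))
                    .orElseThrow();
        }

        public static void main(String[] args) {
            TestCase t1 = new TestCase(new double[] {0.0, 2.0}, 10);
            TestCase t2 = new TestCase(new double[] {0.0, 2.0}, 3);
            // Same distance on branch 0, but t2 is shorter, so it is preferred.
            System.out.println(bestFor(0, List.of(t1, t2)) == t2); // true
        }
    }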
MOSA: a New Many-Objective Sorting Algorithm
[Figure: evolutionary loop: Random Test Cases → Crossover → Mutation → Selection → Update Archive → End? (no: next generation; yes: Final Test Suite)]
1. Crossover: single point
2. Mutation: add, modify or delete statements
3. Selection: preference criterion + crowding distance
4. Archive: keep track of the best test cases covering branches
We have implemented MOSA in a prototype tool by extending EvoSuite.
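A high-level sketch of the loop on this slide, with hypothetical helper methods standing in for the real operators (random generation, single-point crossover, statement-level mutation, preference-based selection, archive update); it shows the control flow only, not EvoSuite's implementation.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public final class MosaSketch {

        // archive: branch id -> shortest test case seen so far that covers it
        private final Map<String, String> archive = new HashMap<>();

        List<String> run(int populationSize, int maxGenerations) {
            List<String> population = randomTestCases(populationSize);
            updateArchive(population);
            for (int gen = 0; gen < maxGenerations; gen++) {        // End? (budget check)
                List<String> offspring = mutate(crossover(population));
                population = select(population, offspring, populationSize);
                updateArchive(population);                           // keep best per branch
            }
            return new ArrayList<>(archive.values());                // final test suite
        }

        // Hypothetical stubs standing in for the operators described on the slide.
        private List<String> randomTestCases(int n) { return new ArrayList<>(); }
        private List<String> crossover(List<String> pop) { return pop; }   // single point
        private List<String> mutate(List<String> pop) { return pop; }      // add/modify/delete statements
        private List<String> select(List<String> parents, List<String> children, int n) {
            return parents;                                                 // preference criterion + crowding distance
        }
        private void updateArchive(List<String> pop) { /* record covering, shorter tests */ }
    }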
Empirical Evaluation
Systems N. Classes N. Branches (Mean)
Guava 4 132
Tullibee 2 187
Trove 11 141
JSci 4 244
Javex 1 173
JDom 6 115
JodaTime 6 173
Tartarus 3 344
XMLEnc 2 676
NanoXML 1 304
Apache Commons Cli 2 119
Apache Commons Codec 1 498
Apache Commons Primitives 1 81
Apache Commons Collections 2 152
Apache Commons Lang 9 431
Apache Commons Math 9 84
Context: 64 Java classes extracted from 16 widely used open source projects
Smallest class: IntervalSet.java (Apache Commons Math), 50 branches
Largest class: XMLChecker.java (XMLEnc), 1213 branches
>> far more than 15-20 objectives
Empirical Evaluation
We investigate the following research questions:
RQ1 (effectiveness): What is the coverage achieved by MOSA vs. Whole-Suite?
RQ2 (efficiency): What is the rate of convergence of MOSA vs. Whole-Suite?
Metrics:
- Effectiveness = N. of branches covered / total number of branches in the class
- Efficiency = N. of executed statements (only in cases where there is no statistically significant difference in effectiveness)
Settings:
- Population size = 50
- N. runs = 100
- Search budget = 1,000,000 executed statements
- Search timeout = 10 min
We performed a total of 2 (search strategies) ⨉ 64 (classes) ⨉ 100 (repetitions) = 12,800 experiments.
Empirical Results
RQ1 (effectiveness): What is the coverage achieved by MOSA vs. Whole-Suite?
[Figure: % branch coverage difference between MOSA and Whole-Suite for each of the 64 subjects, ranging from about -5% to 50%]
Statistical significance (Wilcoxon test):
- MOSA > Whole-Suite = 42/64
- Whole-Suite > MOSA = 9/64
- No difference = 13/64
Coverage increases between 2% and 42%; coverage decreases between 1% and 4%.
Empirical Results
RQ1 (effectiveness), example: Conversion.java (Apache Commons Lang)
- Total branches = 766
- Branches covered by MOSA = 712
- Branches covered by Whole-Suite = 584
- Search time = 10 minutes
Empirical Results
RQ2 (efficiency): What is the rate of convergence of MOSA vs. Whole-Suite?
[Figure: number of executed statements needed by Whole-Suite and MOSA for each subject, up to 500,000]
Statistical significance (Wilcoxon test):
- MOSA > Whole-Suite = 8/13
- Whole-Suite < MOSA = 1/13
- No difference = 4/13
Search budget: mean = -22%, min = -18%, max = -50%
What about the generated test cases?
Target Class
Class Name: BooleanUtils.java
Library: Apache Commons Lang
N. Branches: 271
What about the generated test cases?
public void test4() throws Throwable {
boolean boolean0 = BooleanUtils.isNotFalse((Boolean) true);
boolean boolean1 = BooleanUtils.toBoolean("no");
assertEquals(false, boolean1);
assertFalse(boolean1 == boolean0);
boolean boolean2 = BooleanUtils.isFalse((Boolean) true);
assertEquals(false, boolean2);
assertFalse(boolean2 == boolean0);
assertTrue(boolean2 == boolean1);
boolean boolean3 = BooleanUtils.toBoolean(113);
assertEquals(true, boolean3);
booleanArray0[1] = boolean0;
booleanArray0[1] = boolean0;
booleanArray0[0] = boolean0;
String string2 = BooleanUtils.toStringTrueFalse(boolean0);
assertNotNull(string2);
assertEquals("false", string2);
Boolean boolean4 = BooleanUtils.toBooleanObject("yes");
assertEquals(true, (boolean)boolean4);
assertFalse(boolean4.equals(boolean2));
Boolean boolean1 = Boolean.FALSE;
Boolean boolean2 = Boolean.valueOf(true);
Boolean boolean3 = Boolean.valueOf(false);
Integer integer2 = new Integer(1);
Boolean boolean4 = BooleanUtils.toBooleanObject(0);
assertEquals(false, (boolean)boolean4);
assertFalse(boolean4.equals(boolean2));
}
Whole-Suite:
Branch Coverage: 84.89%
Average Test Case Length: > 30 statements
Test Cases must be minimised after
the search process
What about the generated test cases?
MOSA:
Branch Coverage: 93.34%
Average Test Case Length: 5 statements
boolean boolean0 = true;
String string0 = BooleanUtils.toStringTrueFalse(boolean0);
assertNotNull(string0);
assertEquals("true", string0);
boolean boolean0 = false;
String string0 = BooleanUtils.toStringOnOff(boolean0);
assertNotNull(string0);
assertEquals("off", string0);
Test Cases already minimised
during the search process
Summary
We reformulated branch coverage as a many-objective problem with hundreds of objectives to optimise.
Traditional many-objective solvers are not enough for our problem.
We presented MOSA, a novel many-objective genetic algorithm for branch coverage.
Results:
1. Effectiveness was significantly improved in 66% of the subjects
2. Efficiency was improved in 62% of the remaining subjects
Future Work
Considering further subjects of different sizes (in terms of branches)
Investigating whether other adequacy testing
criteria (e.g., statement coverage, mutation
testing, etc.) can be reformulated as many-
objective problems
Refining the preference criteria to incorporate
other secondary objectives, such as
execution cost
Incorporating the preference criteria in other
evolutionary algorithms
Thank You
for
Your Attention!
Questions?
