Reformulating Branch Coverage as a Many-Objective Optimization Problem
Annibale Panichella, Fitsum M. Kifetew, Paolo Tonella
Evolutionary Testing
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
[Figure: control-flow graph with nodes 1-6; test case TC1 in the test suite targets branch b1]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b1) = approach_level(b1) + branch_distance(b1)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
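A minimal sketch (in Java, not the authors' implementation) of the one-target fitness shown on this slide: approach level plus a normalised branch distance, to be minimised. The class name, method names, and example values are illustrative only.

    // One-target fitness: smaller is better, 0 means the target branch is covered.
    public final class OneTargetFitness {

        // Normalises a raw branch distance into [0, 1), as commonly done in
        // search-based testing, so it never dominates the approach level.
        static double normalise(double branchDistance) {
            return branchDistance / (branchDistance + 1.0);
        }

        // approachLevel: number of control-dependent decisions between the point
        // where execution diverged and the target branch.
        // branchDistance: how far the diverging condition was from evaluating
        // towards the target (e.g., |a - 0| for a "case 0" branch).
        static double fitness(int approachLevel, double branchDistance) {
            return approachLevel + normalise(branchDistance);
        }

        public static void main(String[] args) {
            // Example: execution diverged two decisions away, with distance 5.
            System.out.println(fitness(2, 5.0)); // ~2.833
        }
    }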
Evolutionary Testing
[Figure: same control-flow graph; a second test case TC2 targets branch b2]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b2) = approach_level(b2) + branch_distance(b2)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
[Figure: same control-flow graph; a third test case targets branch b3]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Fitness(b3) = approach_level(b3) + branch_distance(b3)
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
Limitations of the one-target approach:
- Infeasible branches (waste of time)
- Some branches require more search budget
- How to properly allocate the search budget?
[Figure: same control-flow graph and test suite as before]
One-target approach:
- Targeting one branch at a time
- Chromosome = Test Case
- Running Genetic Algorithm multiple times
Phil McMinn, "Search-based Software Test Data Generation", Software Testing, Verification and Reliability
Evolutionary Testing
Whole Suite approach:
- Targeting all branches at once
- Chromosome = Test Suite (instead of a single Test Case)
- Running Genetic Algorithm only once
[Figure: control-flow graph with branches b1-b7; the whole test suite (TC1, TC2, ...) is evolved at once]
Fitness ≈ ∑ branch_distance(bi)
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Evolutionary Testing
Whole Suite approach:
- Targeting all branches at once
- Chromosome = Test Suite (instead of a single Test Case)
- Running Genetic Algorithm only once
Advantages:
- Whole Suite > One-Target
- No issues regarding budget allocation
- All branches are considered at the same time
[Figure: same control-flow graph and test suite as before]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Solving Multiple Branches at Once, How?
Fitness Function = branch_distance(b1) + branch_distance(b2) + ... + branch_distance(b7)
[Figure: the branch distances of all branches b1-b7 in the control-flow graph are summed into a single fitness value]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
Solving Multiple Branches at Once, How?
Objectives: (b1, b2, b3, b4, b5, b6, b7)
Fitness Function = branch_distance(b1) + branch_distance(b2) + ... + branch_distance(b7)
Sum Scalarization: Many-objective Problem → Single-objective Problem
[Figure: the per-branch objectives are collapsed into one scalar fitness by summation]
Gordon Fraser, Andrea Arcuri, "Whole Test Suite Generation", IEEE Transactions on Software Engineering
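A minimal sketch (an illustration of the idea, not EvoSuite's actual code) of the sum scalarization used by the whole-suite approach: the best per-branch distances achieved by the suite are summed into one scalar fitness to be minimised. The class name and map-based representation are hypothetical.

    import java.util.Map;

    public final class WholeSuiteFitness {

        // branchDistances: for each branch, the minimum (normalised) distance
        // achieved by any test case in the suite; 0 means the branch is covered.
        static double fitness(Map<String, Double> branchDistances) {
            double sum = 0.0;
            for (double d : branchDistances.values()) {
                sum += d; // sum scalarization: one scalar objective per suite
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(fitness(Map.of("b1", 0.0, "b2", 0.5, "b3", 0.75))); // 1.25
        }
    }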
Many-objective Optimization
Problem Re-formulation:
Given: B = {b1, . . . , bm} branches of a program.
Find: test cases T = {t1, . . . , tn} minimising the following fitness objectives:
min f1(T) = approach_level(b1) + branch_distance(b1)
min f2(T) = approach_level(b2) + branch_distance(b2)
…
min fm(T) = approach_level(bm) + branch_distance(bm)
Many-objective Optimization
Example:
String example(int a) {
    switch (a) {
        case 0:  return "0";
        case 1:  return "1";
        case -1: return "-1";
        default: return "default";
    }
}
Branch distances (one minimisation objective per branch): b1 = |a-0|, b2 = |a-1|, b3 = |a+1|
Many-objective Optimization
Same example, evaluating each candidate test case against all objectives at once:
b1 = |a-0|, b2 = |a-1|, b3 = |a+1|
Candidate test cases: example(0), example(2), example(-2)
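A minimal sketch of the branch distances b1, b2, b3 of the switch example above, treated as separate minimisation objectives; the class name and the printed output are illustrative only.

    public final class SwitchBranchDistances {

        static int b1(int a) { return Math.abs(a - 0); } // distance to "case 0"
        static int b2(int a) { return Math.abs(a - 1); } // distance to "case 1"
        static int b3(int a) { return Math.abs(a + 1); } // distance to "case -1"

        public static void main(String[] args) {
            // Inputs from the slides: example(0) makes b1 = 0, while the other
            // inputs leave nonzero distances to the remaining case branches.
            for (int a : new int[] {0, 2, -2}) {
                System.out.printf("a=%d -> b1=%d, b2=%d, b3=%d%n", a, b1(a), b2(a), b3(a));
            }
        }
    }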
How Many Goals?
Programs can have hundreds of branches (objectives):
- Utf8.java (Guava) = 63 branches (goals)
- LimitChronology (JodaTime) = 113 branches (goals)
- DateUtils.java (Apache Commons Lang) = 314 branches (goals)
Many-objective Solvers
- θ-Non-dominated Sorting Genetic Algorithm-III: ≤ 15 obj.
- Grid-based Evolutionary Algorithm: ≤ 10 obj.
- Hypervolume-based Evolutionary Algorithm: ≤ 25 obj.
- ε-Multi-objective Evolutionary Algorithm: ≤ 3 obj.
- Strength Pareto Evolutionary Algorithm: ≤ 3 obj.
- Non-dominated Sorting Genetic Algorithm-II: ≤ 3 obj.
Many-Objective Solvers do not scale beyond 25 objectives.
We cannot use standard Many-Objective Solvers.
Branch Coverage vs. Optimization Problem
[Figure: Pareto front of the DTLZ1 benchmark problem in the (f1, f2) objective space, with both axes ranging up to 0.5]
In a classical many-objective problem (e.g., DTLZ1):
- Reach as many trade-offs as possible
- Non-dominated solutions are identically optimal
In branch coverage (e.g., objectives b1 = |a-b| and b2 = |b-c|):
- Not all non-dominated solutions are optimal
- The point closest to covering b1 and the point closest to covering b2 (the minimum on each objective) are better than the others
MOSA: a New Many-Objective Sorting Algorithm
Preference Criterion:
Given a branch bi, a test case x is preferred over another test case y if and only if the values of the objective function for bi satisfy the following condition:
bi(x) < bi(y)
[Figure: first front in the (b1, b2) objective space, with b1 = |a-b| and b2 = |b-c|]
MOSA: a New Many-Objective Sorting Algorithm
Preference Criteria:
Given a branch bi, a test case x is preferred over another test case y if and only if the values of the objective function for bi satisfy the following condition:
bi(x) < bi(y)
or
bi(x) == bi(y) and #statement(x) < #statement(y)
[Figure: applying the preference criterion repeatedly ranks the test cases into first, second, third and fourth fronts in the (b1, b2) objective space, with b1 = |a-b| and b2 = |b-c|]
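A minimal sketch (not the authors' EvoSuite code) of the preference criterion above for a single branch bi: the smaller objective value wins, and ties are broken by test-case length. The TestCase representation and helper names are hypothetical.

    import java.util.Comparator;
    import java.util.List;

    final class TestCase {
        final double[] objectives; // objectives[i] = approach level + branch distance for branch bi
        final int statements;      // length of the test case (#statements)

        TestCase(double[] objectives, int statements) {
            this.objectives = objectives;
            this.statements = statements;
        }
    }

    public final class PreferenceCriterion {

        // Returns the test case preferred for branch bi among the candidates.
        static TestCase bestFor(int bi, List<TestCase> candidates) {
            return candidates.stream()
                    .min(Comparator
                            .comparingDouble((TestCase t) -> t.objectives[bi])
                            .thenComparingInt(t -> t.statements))
                    .orElseThrow();
        }

        public static void main(String[] args) {
            TestCase t1 = new TestCase(new double[] {0.0, 2.0}, 10);
            TestCase t2 = new TestCase(new double[] {0.0, 2.0}, 3);
            // Same distance on branch 0, but t2 is shorter, so it is preferred.
            System.out.println(bestFor(0, List.of(t1, t2)) == t2); // true
        }
    }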
MOSA: a New Many-Objective Sorting Algorithm
[Figure: evolutionary loop: Random Test Cases → Crossover → Mutation → Selection → Update Archive → End? (no: next generation; yes: Final Test Suite)]
1. Crossover: single point
2. Mutation: add, modify or delete statements
3. Selection: preference criterion + crowding distance
4. Archive: keep track of the best test cases covering branches
We have implemented MOSA in a prototype tool by extending EvoSuite.
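A high-level sketch of the loop on this slide, with hypothetical helper methods standing in for the real operators (random generation, single-point crossover, statement-level mutation, preference-based selection, archive update); it shows the control flow only, not EvoSuite's implementation.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public final class MosaSketch {

        // archive: branch id -> shortest test case seen so far that covers it
        private final Map<String, String> archive = new HashMap<>();

        List<String> run(int populationSize, int maxGenerations) {
            List<String> population = randomTestCases(populationSize);
            updateArchive(population);
            for (int gen = 0; gen < maxGenerations; gen++) {        // End? (budget check)
                List<String> offspring = mutate(crossover(population));
                population = select(population, offspring, populationSize);
                updateArchive(population);                           // keep best per branch
            }
            return new ArrayList<>(archive.values());                // final test suite
        }

        // Hypothetical stubs standing in for the operators described on the slide.
        private List<String> randomTestCases(int n) { return new ArrayList<>(); }
        private List<String> crossover(List<String> pop) { return pop; }   // single point
        private List<String> mutate(List<String> pop) { return pop; }      // add/modify/delete statements
        private List<String> select(List<String> parents, List<String> children, int n) {
            return parents;                                                 // preference criterion + crowding distance
        }
        private void updateArchive(List<String> pop) { /* record covering, shorter tests */ }
    }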
Empirical Evaluation
Systems N. Classes N. Branches (Mean)
Guava 4 132
Tullibee 2 187
Trove 11 141
JSci 4 244
Javex 1 173
JDom 6 115
JodaTime 6 173
Tartarus 3 344
XMLEnc 2 676
NanoXML 1 304
Apache Commons Cli 2 119
Apache Commons Codec 1 498
Apache Commons Primitives 1 81
Apache Commons Collections 2 152
Apache Commons Lang 9 431
Apache Commons Math 9 84
Context: 64 Java classes extracted from 16 widely used open source projects
Smallest class: IntervalSet.java (Apache Commons Math), 50 branches
Largest class: XMLChecker.java (XMLEnc), 1213 branches
>> far more than 15-20 objectives
Empirical Evaluation
We investigate the following research questions:
RQ1 (effectiveness): What is the coverage achieved by MOSA vs. Whole-Suite?
RQ2 (efficiency): What is the rate of convergence of MOSA vs. Whole-Suite?
Metrics:
- Effectiveness = N. of branches covered / total number of branches in the class
- Efficiency = N. of executed statements (only in cases where there is no statistically significant difference in effectiveness)
Settings:
- Population size = 50
- N. runs = 100
- Search budget = 1,000,000 executed statements
- Search timeout = 10 min
We performed a total of 2 (search strategies) ⨉ 64 (classes) ⨉ 100 (repetitions) = 12,800 experiments.
Empirical Results
RQ1 (effectiveness): What is the coverage achieved by MOSA vs. Whole-Suite?
[Figure: % branch coverage difference between MOSA and Whole-Suite for each of the 64 subjects, ranging from about -5% to 50%]
Statistical significance (Wilcoxon test):
- MOSA > Whole-Suite = 42/64
- Whole-Suite > MOSA = 9/64
- No difference = 13/64
Coverage increases between 2% and 42%; coverage decreases between 1% and 4%.
Empirical Results
RQ1 (effectiveness), example: Conversion.java (Apache Commons Lang)
- Total branches = 766
- Branches covered by MOSA = 712
- Branches covered by Whole-Suite = 584
- Search time = 10 minutes
Empirical Results
RQ2 (efficiency): What is the rate of convergence of MOSA vs. Whole-Suite?
[Figure: number of executed statements needed by Whole-Suite and MOSA for each subject, up to 500,000]
Statistical significance (Wilcoxon test):
- MOSA > Whole-Suite = 8/13
- Whole-Suite < MOSA = 1/13
- No difference = 4/13
Search budget: mean = -22%, min = -18%, max = -50%
What about the generated test cases?
Target Class
Class Name: BooleanUtils.java
Library: Apache Commons Lang
N. Branches: 271
What about the generated test cases?
public void test4() throws Throwable {
boolean boolean0 = BooleanUtils.isNotFalse((Boolean) true);
boolean boolean1 = BooleanUtils.toBoolean("no");
assertEquals(false, boolean1);
assertFalse(boolean1 == boolean0);
boolean boolean2 = BooleanUtils.isFalse((Boolean) true);
assertEquals(false, boolean2);
assertFalse(boolean2 == boolean0);
assertTrue(boolean2 == boolean1);
boolean boolean3 = BooleanUtils.toBoolean(113);
assertEquals(true, boolean3);
booleanArray0[1] = boolean0;
booleanArray0[1] = boolean0;
booleanArray0[0] = boolean0;
String string2 = BooleanUtils.toStringTrueFalse(boolean0);
assertNotNull(string2);
assertEquals("false", string2);
Boolean boolean4 = BooleanUtils.toBooleanObject("yes");
assertEquals(true, (boolean)boolean4);
assertFalse(boolean4.equals(boolean2));
Boolean boolean1 = Boolean.FALSE;
Boolean boolean2 = Boolean.valueOf(true);
Boolean boolean3 = Boolean.valueOf(false);
Integer integer2 = new Integer(1);
Boolean boolean4 = BooleanUtils.toBooleanObject(0);
assertEquals(false, (boolean)boolean4);
assertFalse(boolean4.equals(boolean2));
}
Whole-Suite:
Branch Coverage: 84.89%
Average Test Case Length: > 30 statements
Test Cases must be minimised after
the search process
What about the generated test cases?
MOSA:
Branch Coverage: 93.34%
Average Test Case Length: 5 statements
boolean boolean0 = true;
String string0 = BooleanUtils.toStringTrueFalse(boolean0);
assertNotNull(string0);
assertEquals("true", string0);
boolean boolean0 = false;
String string0 = BooleanUtils.toStringOnOff(boolean0);
assertNotNull(string0);
assertEquals("off", string0);
Test Cases already minimised
during the search process
Summary
We reformulated branch coverage as a many-objective problem with hundreds of objectives to optimise.
Traditional many-objective solvers are not enough for our problem.
We presented MOSA, a novel many-objective genetic algorithm for branch coverage.
Results:
1. Effectiveness was significantly improved in 66% of the subjects
2. Efficiency was improved in 62% of the remaining subjects
Future Work
Considering further subjects of different sizes (in terms of branches)
Investigating whether other adequacy testing
criteria (e.g., statement coverage, mutation
testing, etc.) can be reformulated as many-
objective problems
Refining the preference criteria to incorporate
other secondary objectives, such as
execution cost
Incorporating the preference criteria in other
evolutionary algorithms
Thank You
for
Your Attention!
Questions?
