Unit Testing Tool Competition-Eighth Round

Java Unit Testing Tool Competition -
8th Round
Xavier Devroey <x.d.m.devroey@tudelft.nl>
Sebastiano Panichella <sebastiano.panichella@zhaw.ch>
Alessio Gambi <alessio.gambi@uni-passau.de>

History
Round   | Year | Venue   | Coverage tool | Mutation tool           | #CUTs | #Projects | #Participants (+ baseline) | Statistical tests
Round 1 | 2013 | ICST    | Cobertura     | Javalanche              | 77    | 5         | 2                          | ✗
Round 2 | 2014 | FITTEST | JaCoCo        | PITest                  | 63    | 9         | 4                          | ✗
Round 3 | 2015 | SBST    | JaCoCo        | PITest                  | 63    | 9         | 8                          | ✗
Round 4 | 2016 | SBST    |               | Defects4J (real faults) | 68    | 5         | 4                          | ✗
Round 5 | 2017 | SBST    | JaCoCo        | PITest + our env.       | 69    | 8         | 2 (+ 2)                    | ✓
        |      |         |               | PITest + our env.       | 59    | 7         | 2 (+ 2)                    | ✓ + combined analysis
        |      |         |               | PITest + our env.       | 69    | 8         | 2 (+ 2)                    | ✓ + combined analysis
        |      |         |               | PITest + our env.       | 69    | 8         | 1 (+ 1)                    | ✓ + combined analysis

Infrastructure
JUnit Tool Contest Infrastructure
• Each tool (Randoop, …, Tool n) is invoked through runtool with a CUT and a time budget
• Each tool generates JUnit tests (@Test) for the CUT
• Filtering: flaky tests are filtered out
• Metrics are computed on the remaining tests
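
The flaky-test filtering step can be illustrated with a minimal sketch. The slides do not say how the infrastructure detects flakiness, so the rule used here (a test is flaky if repeated executions disagree) is an assumption, and the names is_flaky, filter_flaky, and reruns are hypothetical.

```python
import itertools
from typing import Callable, Dict, List


def is_flaky(run_test: Callable[[], bool], reruns: int = 5) -> bool:
    """Assumed rule: a test is flaky if repeated executions disagree."""
    outcomes = {run_test() for _ in range(reruns)}
    return len(outcomes) > 1


def filter_flaky(tests: Dict[str, Callable[[], bool]], reruns: int = 5) -> List[str]:
    """Return the names of the tests whose outcome is stable across reruns."""
    return [name for name, run in tests.items() if not is_flaky(run, reruns)]


# Hypothetical stand-ins for generated tests: each callable returns pass/fail.
_alternating = itertools.cycle([True, False])
tests = {
    "testStable": lambda: True,
    "testAlternating": lambda: next(_alternating),  # passes, fails, passes, ...
}
stable = filter_flaky(tests)
```

In this sketch only testStable survives the filter; the alternating test is dropped because its outcomes differ between reruns.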

Scoring Formula
T = generated test
B = search budget
C = class under test
R = independent run
Cov_i = statement coverage
Cov_b = branch coverage
Cov_m = strong mutation
genTime = generation time
penalty = percentage of flaky tests and non-compiling tests

\[ \mathit{covScore}(T, B, C, R) = 1 \times Cov_i + 2 \times Cov_b + 4 \times Cov_m \]
\[ \mathit{tScore}(T, B, C, R) = \mathit{covScore}(T, B, C, R) \times \min\left(1, \frac{2 \times B}{\mathit{genTime}}\right) \]
\[ \mathit{Score}(T, B, C, R) = \mathit{tScore}(T, B, C, R) + \mathit{penalty}(T, B, C, R) \]
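
As a worked illustration, the three formulas can be sketched in a few lines of Python. This is not the contest implementation: it assumes coverage values are fractions in [0, 1], that the budget and generation time use the same unit (seconds here), and that the penalty is passed in already computed, since the slide does not give its sign or scale.

```python
def cov_score(cov_i: float, cov_b: float, cov_m: float) -> float:
    """Weighted coverage: statement x1, branch x2, mutation x4."""
    return 1 * cov_i + 2 * cov_b + 4 * cov_m


def t_score(cov: float, budget: float, gen_time: float) -> float:
    """Scale the coverage score down once generation takes more than 2 x budget."""
    return cov * min(1.0, (2 * budget) / gen_time)


def score(cov_i: float, cov_b: float, cov_m: float,
          budget: float, gen_time: float, penalty: float) -> float:
    """Score = tScore + penalty; the penalty value is supplied by the caller (assumption)."""
    return t_score(cov_score(cov_i, cov_b, cov_m), budget, gen_time) + penalty


# Example run: 80% statement, 50% branch, 25% mutation coverage, 60 s budget.
within_limit = score(0.8, 0.5, 0.25, budget=60, gen_time=60, penalty=0.0)
over_limit = score(0.8, 0.5, 0.25, budget=60, gen_time=240, penalty=0.0)
```

With these inputs the coverage score is 0.8 + 1.0 + 1.0 = 2.8. A run that finishes within twice the budget keeps the full 2.8, while a run that takes 240 s on a 60 s budget is scaled by 120/240 and scores 1.4. Under the fraction assumption, the largest possible coverage score is 1 + 2 + 4 = 7.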

https://github.com/JUnitContest/junitcontest

Benchmark Projects
• Selection criteria
  • GitHub repositories
  • Project builds using Maven or Gradle
  • Contains JUnit 4 test suite
• 4 projects selected
  • Guava: https://github.com/google/guava
  • Fescar/Seata: https://github.com/seata/seata
  • PDFBox: https://github.com/apache/pdfbox
  • Spoon: https://github.com/INRIA/spoon/

Classes Under Test Selection
• Classes
• CC >= 5: 1,094 classes
• Randoop, 10 seconds: 382 classes
• Random: 70 CUTs
[Figure: candidate and sampled classes; Line coverage, Mutation score, and Branch, each from 0 to 100]
Project  CUTs
Fescar   20
Guava    20
PdfBox   20
Spoon    10
Total    70
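
The selection funnel above can be sketched as a filter followed by a per-project random sample. This is only an illustration under assumptions: the Candidate type and its fields are hypothetical, max_cc stands in for whatever complexity measure the CC >= 5 threshold is applied to, and randoop_ok stands in for whatever the 10-second Randoop run checks, which the slides do not specify.

```python
import random
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    project: str
    name: str
    max_cc: int       # hypothetical: complexity value compared against 5
    randoop_ok: bool  # hypothetical: class passed the 10-second Randoop run


# Per-project quotas taken from the table above.
QUOTA: Dict[str, int] = {"Fescar": 20, "Guava": 20, "PdfBox": 20, "Spoon": 10}


def select_cuts(candidates: List[Candidate],
                quota: Dict[str, int] = QUOTA,
                seed: int = 0) -> List[Candidate]:
    """Filter by complexity and the Randoop check, then sample per project."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    selected: List[Candidate] = []
    for project, n in quota.items():
        pool = [c for c in candidates
                if c.project == project and c.max_cc >= 5 and c.randoop_ok]
        selected.extend(rng.sample(pool, min(n, len(pool))))
    return selected


# Synthetic candidates for illustration only (not the contest data).
candidates = [
    Candidate(project, f"{project}.Class{i}", max_cc=i % 10, randoop_ok=(i % 3 != 0))
    for project in QUOTA
    for i in range(200)
]
cuts = select_cuts(candidates)
```

Sampling per project rather than from the whole pool keeps the project quotas fixed (20, 20, 20, 10) regardless of how many candidates each project contributes.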

Contest Methodology
• Search budgets: 1 min. and 3 min.
• Classes under test: 70 classes
• Repetitions: 10 repetitions
• Execution environment
• Statistical analysis: Friedman’s test with post-hoc Conover
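
To make the statistical step concrete, here is a minimal pure-Python sketch of the Friedman chi-square statistic. It is not the contest's analysis script: it assumes each block (row) holds one measurement per tool configuration, uses average ranks for ties without a tie correction, and computes neither the p-value nor the Conover post-hoc comparison. A statistics library (for example scipy.stats.friedmanchisquare) would normally be used instead.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman_statistic(blocks: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square for n blocks (rows) and k treatments (columns)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Illustrative data only: 4 blocks, 3 configurations, always ranked the same way.
stat = friedman_statistic([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]])
```

In this toy input every block ranks the configurations identically, so the statistic reaches its maximum n(k - 1) = 8; if the rankings cancel out across blocks, it drops to 0.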

Results
[Figure: line coverage, branch coverage, covered mutants, and mutation score (0 to 100) for each configuration R.1, E.1, R.3, E.3]
R.1: Randoop, 1 min. - E.1: EvoSuite, 1 min. - R.3: Randoop, 3 min. - E.3: EvoSuite, 3 min.

Final Ranking
[Figure: scores per project (FESCAR, GUAVA, PDFBOX, SPOON, and all) for each configuration R.1, E.1, R.3, E.3; score axis from 0 to 6]
1. Score: 406.14

Lessons Learnt
• Results are consistent with other independent evaluations
  • Shows that the bugs discovered in the infrastructure could be fixed
• The two-step selection procedure for the classes under test is useful
  • Discovers configuration issues early
  • Still largely a manual process
• Docker simplifies the evaluation procedure

What’s Next?
• Contest Infrastructure
  • https://github.com/JUnitContest/junitcontest
  • Improve usability
  • Facilitate setup of an evaluation
  • Facilitate evaluation in other contexts
  • Update the user documentation
  • Storage and versioning of the results (and participating tools?)
• For the next edition
  • More tools
  • More CUTs

Java Unit Testing Tool Competition -
9th Round
Chair: Sebastiano Panichella
Co-chair(s): Alessio Gambi, You?
