Search-Based and Fuzz Testing
Tool Competition 2024
Nicolas Erni, Zurich University of Applied Sciences (ZHAW)
Christian Birchler, Zurich University of Applied Sciences (ZHAW)
Pouria Derakhshanfar, JetBrains
Stephan Lukasczyk, University of Passau
Mohammed Al-Ameen, Zurich University of Applied Sciences (ZHAW)
Sebastiano Panichella, Zurich University of Applied Sciences (ZHAW)
Co-located with the 46th International Conference on Software Engineering (ICSE 2024)
History of the SBFT Python Tool Competition

Round     Year   Venue   Coverage tool   Mutation tool        #CUTs   #Projects   #Participants (+ baseline)
Round 1   2024   SBFT    PyTest          MutPy / Cosmic Ray   35      7           4
SBFT Tool Competition 2024
What is New?
Python tool competition: for the first time ever, we are inviting researchers to participate in our competition with their test-generation tools for Python. Tools are assessed on a benchmark that evaluates code coverage and mutation score.
Figure 1: Example of test generation for simple Python functions (software under test → generated test code).
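Figure 1 itself is not reproduced here; as a minimal sketch of the idea it illustrates, the snippet below pairs a simple function under test with the kind of pytest-style tests a generator might emit. The function and tests are illustrative assumptions, not taken from the competition benchmark.

```python
# --- software under test (illustrative; not from the competition benchmark) ---
def clamp(value: int, low: int, high: int) -> int:
    """Restrict `value` to the closed interval [low, high]."""
    return max(low, min(value, high))


# --- generated test code (what a generator might emit for the function above) ---
def test_clamp_within_bounds():
    assert clamp(5, 0, 10) == 5


def test_clamp_below_lower_bound():
    assert clamp(-3, 0, 10) == 0


def test_clamp_above_upper_bound():
    assert clamp(42, 0, 10) == 10
```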
Python Tool Competition Infrastructure
[Diagram: the python-tool-competition-2024 infrastructure runs each tool (Klara, …, Tool_n) on every CUT within the given time budget and collects the generated tests.]
Python Tool Competition Infrastructure
[Diagram: the generated tests are then executed to compute line and branch coverage metrics as well as mutation metrics (MutPy / Cosmic Ray).]
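As a rough sketch of this pipeline (not the actual competition runner), the loop below generates tests for one CUT within the time budget and then measures line and branch coverage with pytest-cov. The generator command, its --output flag, and the example paths are illustrative assumptions; the mutation step (MutPy / Cosmic Ray) is omitted.

```python
# Rough sketch of the per-CUT evaluation loop (NOT the real infrastructure):
# run a test-generation tool within the time budget, then execute the
# generated tests under pytest-cov for line and branch coverage.
import subprocess
from pathlib import Path

TIME_BUDGET_SECONDS = 400  # per CUT, as in the contest methodology


def generate_tests(tool_cmd: list[str], cut: Path, out_dir: Path) -> None:
    """Run a test generator on one file under test, capped at the time budget."""
    try:
        subprocess.run(
            [*tool_cmd, str(cut), "--output", str(out_dir)],  # hypothetical CLI
            timeout=TIME_BUDGET_SECONDS,
            check=False,
        )
    except subprocess.TimeoutExpired:
        pass  # budget exhausted: keep whatever tests were written so far


def measure_coverage(cut_module: str, test_dir: Path) -> None:
    """Execute the generated tests under pytest-cov (line + branch coverage)."""
    subprocess.run(
        ["pytest", str(test_dir),
         f"--cov={cut_module}", "--cov-branch", "--cov-report=term-missing"],
        check=False,
    )


if __name__ == "__main__":
    # Illustrative values; the real benchmark has 35 CUTs from 7 projects.
    generate_tests(["my-generator"], Path("project/module.py"), Path("generated_tests"))
    measure_coverage("project.module", Path("generated_tests"))
```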
Scoring Formula
T = generated test suite, B = search budget, C = class under test, R = independent run
Cov_i = statement coverage, Cov_b = branch coverage, Cov_m = strong mutation score, genTime = generation time

covScore(T, B, C, R) = 1 × Cov_i + 2 × Cov_b + 4 × Cov_m
tScore(T, B, C, R) = covScore(T, B, C, R) × min(1, (2 × B) / genTime)
Score(T, B, C, R) = tScore(T, B, C, R) + penalty(T, B, C, R)
Xavier Devroey, Alessio Gambi, Juan Pablo Galeotti, René Just, Fitsum Meshesha Kifetew, Annibale Panichella, Sebastiano Panichella: JUGE: An infrastructure for benchmarking Java unit test generators. Softw. Test. Verification Reliab. 33(3), 2023.
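A minimal Python sketch of this formula (a hypothetical helper, not the competition's implementation); coverage and mutation values are assumed to be in [0, 1], and the penalty term is passed in directly:

```python
# Minimal sketch of the scoring formula above (hypothetical helper).
def competition_score(cov_line: float, cov_branch: float, cov_mutation: float,
                      budget_s: float, gen_time_s: float,
                      penalty: float = 0.0) -> float:
    """Score for one (tool, CUT, run) triple, following the slide's formula."""
    cov_score = 1 * cov_line + 2 * cov_branch + 4 * cov_mutation
    # Discount test suites whose generation took much longer than the budget.
    time_factor = 1.0 if gen_time_s <= 0 else min(1.0, (2 * budget_s) / gen_time_s)
    return cov_score * time_factor + penalty


# Example: full line coverage, 80% branch coverage, 50% mutation score,
# generated well within the 400 s budget.
print(competition_score(1.0, 0.8, 0.5, budget_s=400, gen_time_s=120))
```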
https://github.com/ThunderKey/python-tool-competition-2024
Benchmark Projects
• Selection criteria
• GitHub repositories
• Open Source
• Simple files
• No system access (OS, process, network, disk)
Benchmark Projects
• Selection criteria
• GitHub repositories
• Open Source
• 3 projects selected:
  • Klara: https://github.com/usagitoneko97/klara
  • Pynguin: https://github.com/se2p/pynguin
  • Ghostwriter with Hypothesis: https://github.com/HypothesisWorks/hypothesis
Contest Methodology
• Search budget: 400 seconds
• Files under test: 35
• Repetitions: 4
• Execution environment: Linux VM
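A quick back-of-the-envelope sketch of the per-tool generation time these settings imply (generation only; coverage and mutation analysis add further time):

```python
# Rough per-tool compute budget implied by the methodology above.
SEARCH_BUDGET_S = 400
FILES_UNDER_TEST = 35
REPETITIONS = 4

total_generation_s = SEARCH_BUDGET_S * FILES_UNDER_TEST * REPETITIONS
print(f"{total_generation_s} s ≈ {total_generation_s / 3600:.1f} h per tool")  # 56000 s ≈ 15.6 h
```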
The Tools
[Diagram: the competitor UtBot versus the benchmark (baseline) tools Klara, Pynguin, and Ghostwriter.]
Results (1): Average line coverage for each project per tool
Results (2): Average branch coverage for each project per tool
Results (3): Average mutation score for each project per tool
Results (4): [chart only; no caption extracted]
Results (5): [chart only; no caption extracted]
Final Ranking
[Figure: final ranking of the competitor UtBot against the benchmark tools Klara, Pynguin, and Ghostwriter.]
Lessons Learned
• Identified aspects to improve and bugs to fix in the infrastructure
• Docker will simplify the evaluation procedure
• More participants for the competition!
  • From academia and industry
What’s Next?
• Contest Infrastructure
  • https://github.com/ThunderKey/python-tool-competition-2024
  • Improve usability
    • Facilitate setup of an evaluation
    • Facilitate evaluation in other contexts
  • Update the user documentation
• For the next edition
  • More tools
  • More CUTs
  • Time budgets
  • Time penalty
