Search-Based and Fuzz Testing
Tool Competition 2024
Nicolas Erni
Zurich University of Applied
Science (ZHAW)
Christian Birchler
Zurich University of Applied
Science (ZHAW)
Pouria Derakhshanfar
JetBrains
Stephan Lukasczyk
University of Passau
Mohammed Al-Ameen
Zurich University of Applied
Science (ZHAW)
Sebastiano Panichella
Zurich University of Applied
Science (ZHAW)
Co-located with the 46th International Conference on Software Engineering (ICSE 2024)
History SBFT Python Tool Competition
Round | Year | Venue | Coverage tool | Mutation tool | #CUTs | #Projects | #Participants (+ baseline)
1 | 2024 | SBST | PyTest | MutPy / Cosmic Ray | 35 | 7 | 4
SBFT Tool Competition - 2024
Python tool competition: For the first time ever, we are extending an invitation to researchers to participate in our
competition using their test generation tool for Python. Tools will be assessed based on a benchmark that evaluates code
coverage and mutation score.
What is New?
Figure 1: Example of test generation for simple Python functions (panels: software under test, generated test code).
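The content of Figure 1 did not survive extraction. As a rough illustration of the kind of pairing it describes, the sketch below puts a small function under test next to pytest-style tests that a generator might emit. The function `clamp` and all test names are hypothetical and are not taken from the slide.

```python
# Hypothetical software under test (not from the original figure).
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the closed interval [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


# Hypothetical generated test code in pytest style: plain functions
# whose names start with "test_" and that use bare assert statements.
def test_clamp_below_range():
    assert clamp(-5, 0, 10) == 0


def test_clamp_above_range():
    assert clamp(42, 0, 10) == 10


def test_clamp_inside_range():
    assert clamp(7, 0, 10) == 7
```

The three tests together exercise both early returns and the fall-through path, which is what line and branch coverage would reward.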
Python tool competition Infrastructure
[Diagram: the python-tool-competition-2024 infrastructure runs each tool (Klara, …., Tool n) on a CUT under a time budget and collects the generated tests. Line and branch coverage metrics and mutation metrics (MutPy / Cosmic Ray) are then computed for the generated tests.]
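The run loop suggested by the diagram can be sketched as follows. This is a minimal illustration under stated assumptions, not the API of python-tool-competition-2024: the `Generator` signature, `run_tools`, and `echo_tool` are hypothetical names, and the sketch only measures generation time rather than enforcing the budget.

```python
import time
from typing import Callable, Dict, List

# Hypothetical generator signature: (CUT source, time budget in seconds) -> test source.
Generator = Callable[[str, float], str]


def run_tools(
    tools: Dict[str, Generator],
    cuts: Dict[str, str],
    time_budget: float,
    repetitions: int = 1,
) -> List[dict]:
    """Run every tool on every CUT, `repetitions` times, and record the outcome."""
    records = []
    for tool_name, generate in tools.items():
        for cut_name, cut_source in cuts.items():
            for run in range(repetitions):
                start = time.perf_counter()
                tests = generate(cut_source, time_budget)
                gen_time = time.perf_counter() - start
                records.append(
                    {
                        "tool": tool_name,
                        "cut": cut_name,
                        "run": run,
                        "gen_time": gen_time,  # wall-clock generation time
                        "tests": tests,        # input to coverage and mutation analysis
                    }
                )
    return records


# Placeholder tool that ignores its input and returns a trivial test.
def echo_tool(cut_source: str, time_budget: float) -> str:
    return "def test_placeholder():\n    assert True\n"


records = run_tools(
    {"echo": echo_tool}, {"example.py": "x = 1\n"}, time_budget=400, repetitions=4
)
```

Each record carries what a later stage would need: the generated tests for the coverage and mutation tools, and the measured generation time.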
Scoring Formula
T = generated test
B = search budget
C = class under test
R = independent run
Cov_i = statement coverage
Cov_b = branch coverage
Cov_m = strong mutation score
genTime = generation time
\[ covScore(T, B, C, R) = 1 \times Cov_i + 2 \times Cov_b + 4 \times Cov_m \]
\[ tScore(T, B, C, R) = covScore(T, B, C, R) \times \min\left(1, \frac{2 \times B}{genTime}\right) \]
\[ Score(T, B, C, R) = tScore(T, B, C, R) + penalty(T, B, C, R) \]
Xavier Devroey, Alessio Gambi, Juan Pablo Galeotti, René Just, Fitsum Meshesha Kifetew, Annibale Panichella, Sebastiano Panichella: JUGE: An infrastructure for benchmarking Java unit test generators. Softw. Test. Verification Reliab. 33(3) (2023)
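The scoring formula can be worked through in a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, the slide does not define the penalty term, so it is passed in as a plain number defaulting to zero, and the coverage values are made-up inputs rather than competition results.

```python
def cov_score(cov_i: float, cov_b: float, cov_m: float) -> float:
    """Weighted sum: statement coverage x1, branch coverage x2, mutation score x4."""
    return 1 * cov_i + 2 * cov_b + 4 * cov_m


def t_score(cov: float, budget: float, gen_time: float) -> float:
    """Scale the coverage score by min(1, 2 * B / genTime)."""
    if gen_time <= 0:
        # Assumption: a non-positive generation time is treated as "within budget".
        return cov
    return cov * min(1.0, 2 * budget / gen_time)


def score(cov_i, cov_b, cov_m, budget, gen_time, penalty=0.0):
    """tScore plus the penalty term (assumed to be supplied by the caller)."""
    return t_score(cov_score(cov_i, cov_b, cov_m), budget, gen_time) + penalty


# Made-up inputs: 100% statement, 50% branch, 25% mutation score, B = 400 s.
within_budget = score(1.0, 0.5, 0.25, budget=400, gen_time=300)    # factor 1
over_budget = score(1.0, 0.5, 0.25, budget=400, gen_time=1600)     # factor 0.5
```

With these inputs the coverage score is 3.0. The time factor only starts to reduce the score once generation takes longer than twice the search budget, so finishing early earns no bonus.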
https://github.com/ThunderKey/python-tool-competition-2024
Benchmark Projects
• Selection criteria
• GitHub repositories
• Open Source
• Simple files
• No system access (OS, process, network, disk)
• 3 projects selected
Klara: https://github.com/usagitoneko97/klara
Pynguin: https://github.com/se2p/pynguin
Ghostwriter with Hypothesis: https://github.com/HypothesisWorks/hypothesis
Contest Methodology
Search budget: 400 seconds
Files under test: 35
Repetitions: 4
Execution environment: Linux VM
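The methodology figures imply a nominal generation workload per tool, sketched below. The calculation assumes that every file is run in every repetition with the full search budget, and it covers test generation only, not the coverage or mutation analysis that follows. The variable names are illustrative.

```python
SEARCH_BUDGET_S = 400   # seconds per generation run
FILES_UNDER_TEST = 35
REPETITIONS = 4

# One generation run per file and repetition.
runs_per_tool = FILES_UNDER_TEST * REPETITIONS                 # 140 runs
nominal_seconds_per_tool = runs_per_tool * SEARCH_BUDGET_S     # 56000 seconds
nominal_hours_per_tool = nominal_seconds_per_tool / 3600       # roughly 15.6 hours

print(f"{runs_per_tool} runs, about {nominal_hours_per_tool:.1f} h per tool")
```

Actual wall-clock time can differ, since a tool may finish before the budget expires or overrun it.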
The Tools
Competitors: UtBot
vs.
Benchmark: Klara, Pynguin, Ghostwriter
Results (1)
Average line coverage for each project per tool
Results (2)
Average branch coverage for each project per tool
Results (3)
Average mutation score for each project per tool
Results (4)
Results (5)
Final Ranking
Competitors: UtBot
vs.
Benchmark: Klara, Pynguin, Ghostwriter
1
2
Lessons Learned
• Identified aspects to improve and bugs that could be fixed in the infrastructure
• Docker will simplify the evaluation procedure
• More participants in the competition!
• From academia and industry
What’s Next?
• Contest Infrastructure
• https://github.com/ThunderKey/python-tool-competition-2024
• Improve usability
• Facilitate setup of an evaluation
• Facilitate evaluation in other contexts
• Update the user documentation
• For the next edition
• More tools
• More CUTs
• Time budgets
• Time penalty


SBFT Tool Competition 2024 -- Python Test Case Generation Track
