Java Unit Testing Tool
Competition — Fifth Round
Annibale Panichella, Urko Rueda Molina
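Previous Editions
| Round | Year | Venue | Coverage tool | Mutation tool | #CUTs | #Projects | #Participants | Statistical tests |
|---|---|---|---|---|---|---|---|---|
| Round 1 | 2013 | ICST | Cobertura | Javalanche | 77 | 5 | 2 | ✗ |
| Round 2 | 2014 | FITTEST | JaCoCo | PITest | 63 | 9 | 4 | ✗ |
| Round 3 | 2015 | SBST | JaCoCo | PITest | 63 | 9 | 8 | ✗ |
| Round 4 | 2016 | SBST | Defects4J (real faults) | Defects4J (real faults) | 68 | 5 | 4 | ✗ |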
New Edition
| Round | Year | Venue | Coverage tool | Mutation tool | #CUTs | #Projects | #Participants | Statistical tests |
|---|---|---|---|---|---|---|---|---|
| Round 5 | 2017 | SBST | JaCoCo | PITest + our environment | 69 | 8 | 2+2 | ✓ |
The Infrastructure
• The previous edition used Defects4J to detect flaky tests and to measure effectiveness
• In the new edition, we modified the infrastructure to work with libraries not in Defects4J
• We developed our own tool to detect flaky tests
• Effectiveness is based on mutation analysis: PITest + JaCoCo
Test Management
Flaky tests:
• Pass during generation but fail when re-executed
• Detection mechanism: we run each test suite five times
• Ignored when computing the coverage scores
Non-compiling tests:
• Generated test suites were re-compiled in our own execution environment
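The flaky-test detection described above can be sketched as follows. This is a minimal illustration, not the competition's actual tool: `find_flaky_tests`, `run_suite`, and the simulated suite are hypothetical names, and the sketch assumes a test counts as flaky as soon as it fails in any of the five re-executions.

```python
from typing import Callable, Dict, Set


def find_flaky_tests(run_suite: Callable[[], Dict[str, bool]], runs: int = 5) -> Set[str]:
    """Re-execute a suite `runs` times and collect tests that fail at least once.

    `run_suite` is a hypothetical callback returning {test name: passed?}.
    """
    flaky: Set[str] = set()
    for _ in range(runs):
        results = run_suite()
        # Any failure during re-execution marks the test as flaky.
        flaky.update(name for name, passed in results.items() if not passed)
    return flaky


def make_simulated_suite() -> Callable[[], Dict[str, bool]]:
    """Build a fake suite: testAdd always passes, testClock fails on every second run."""
    calls = {"n": 0}

    def run_suite() -> Dict[str, bool]:
        calls["n"] += 1
        return {"testAdd": True, "testClock": calls["n"] % 2 == 1}

    return run_suite


flaky_tests = find_flaky_tests(make_simulated_suite())
print(sorted(flaky_tests))
```

Tests returned by `find_flaky_tests` would then be excluded before computing the coverage scores.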
Metric Computation
Code Coverage:
• Statement coverage
• Condition coverage
Mutation Score:
• We did not use PITest’s running engine because it gave errors for test cases with ad-hoc/non-standard JUnit runners (e.g., in EvoSuite)
• We used the PITest engine only to generate mutants
• Combining PITest with JaCoCo: we executed only the mutants that infect covered lines
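The idea of combining the two tools can be sketched as follows, assuming the mutants and the set of covered line numbers have already been extracted from the tools' reports. `Mutant`, `mutation_score`, and `is_killed` are hypothetical names for illustration; they are not part of the PITest or JaCoCo APIs. Mutants on uncovered lines are never executed by any test, so they are counted as not killed without being run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set


@dataclass(frozen=True)
class Mutant:
    mutant_id: str
    line: int  # source line changed by the mutant


def mutation_score(
    mutants: Iterable[Mutant],
    covered_lines: Set[int],
    is_killed: Callable[[Mutant], bool],
) -> float:
    """Fraction of all mutants killed, running only those on covered lines."""
    all_mutants = list(mutants)
    if not all_mutants:
        return 0.0
    # Skip mutants that no generated test can reach.
    to_run = [m for m in all_mutants if m.line in covered_lines]
    killed = sum(1 for m in to_run if is_killed(m))
    return killed / len(all_mutants)


# Hypothetical data: four mutants, two of them on covered lines.
example_mutants = [Mutant("m1", 1), Mutant("m2", 2), Mutant("m3", 3), Mutant("m4", 4)]
example_score = mutation_score(example_mutants, {1, 2}, lambda m: m.mutant_id == "m1")
print(example_score)
```

Skipping the uncovered mutants does not change the score; it only avoids executions whose outcome is already known.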
Scoring Formula
We apply the same formula used in the last competition, since it combines coverage metrics, effectiveness, execution time, and the number of flaky/non-compiling tests.

$$\mathit{covScore}_{\langle T,B,C,r\rangle} = 1 \times \mathit{Cov}_i + 2 \times \mathit{Cov}_b + 4 \times \mathit{Cov}_m$$

$$\mathit{tScore}_{\langle T,B,C,r\rangle} = \mathit{covScore}_{\langle T,B,C,r\rangle} \times \min\left(1, \frac{L}{\mathit{genTime}}\right), \qquad L = 2 \times B$$

$$\mathit{Score}_{\langle T,B,C,r\rangle} = \mathit{tScore}_{\langle T,B,C,r\rangle} + \mathit{penalty}_{\langle T,B,C,r\rangle}$$

T = generated test
B = search budget
C = class under test
r = independent run
Cov_i = statement coverage
Cov_b = branch coverage
Cov_m = strong mutation coverage
genTime = generation time
penalty = percentage of flaky and non-compiling tests
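A worked example may make the formula easier to follow. The sketch below uses the weights 1, 2, and 4 from the slide and assumes the time threshold L is twice the search budget, as the "2 x B" annotation on the slide suggests. The function names and input values are hypothetical, and the penalty is passed in as an already computed number with the sign convention of the slide (it is added to tScore).

```python
def cov_score(cov_i: float, cov_b: float, cov_m: float) -> float:
    """Weighted sum of statement, branch, and strong mutation coverage (each in [0, 1])."""
    return 1 * cov_i + 2 * cov_b + 4 * cov_m


def t_score(cov: float, budget_s: float, gen_time_s: float) -> float:
    """Scale the coverage score down when generation takes longer than L = 2 * budget."""
    if gen_time_s <= 0:
        raise ValueError("generation time must be positive")
    limit = 2 * budget_s  # assumption: L = 2 x B
    return cov * min(1.0, limit / gen_time_s)


def score(cov_i: float, cov_b: float, cov_m: float,
          budget_s: float, gen_time_s: float, penalty: float) -> float:
    """Final score for one tool, budget, class, and run."""
    return t_score(cov_score(cov_i, cov_b, cov_m), budget_s, gen_time_s) + penalty


# Hypothetical run: 80% statements, 50% branches, 25% mutants, 60s budget,
# 240s actual generation time, no penalty.
example = score(0.8, 0.5, 0.25, budget_s=60, gen_time_s=240, penalty=0.0)
print(round(example, 3))
```

In this example covScore is 0.8 + 1.0 + 1.0 = 2.8, and because generation took 240s against a limit of 120s the time factor is 0.5, giving a score of about 1.4. A tool that stays within the limit keeps its full covScore, which is at most 7 per class.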
The Competition
The Tools
EvoSuite
JTExpert
Randoop: automatic unit test generation for Java
T3
Selection of the Benchmark Classes
| Project | Source | Application domain | # Classes | # Selected classes |
|---|---|---|---|---|
| BCEL | Apache Commons | Bytecode manipulation | 431 | 10 |
| Jxpath | Apache Commons | Java Beans manipulation with XPath syntax | 180 | 10 |
| Imaging | Apache Commons | Framework to write/read images in various formats | 427 | 4 |
| Google Gson | Google | Conversion of Java objects into their JSON representation and vice versa | 174 | 9 |
| Re2j | Google | Regular expression engine for time-linear regular expression matching | 47 | 8 |
| Freehep | Java Analysis Studio | Open-source repository providing Java utilities for high energy physics applications | 180 | 10 |
| LA4j | GitHub | Linear algebra primitives (matrices and vectors) and algorithms | 208 | 10 |
| Okhttp | GitHub | HTTP and HTTP/2 client for Android and Java applications | 193 | 8 |
Selection Procedure
HOW:
• Computing McCabe’s cyclomatic complexity (MCC) for all methods in each Java library
• Filtering out all trivial classes, i.e., classes that contain only methods with an MCC < 3
• Random sampling from the pruned projects
WHAT/WHY:
• Removing (likely) trivial classes that are not challenging for the tools
• Developers may use automated tools for complex classes
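The selection procedure can be sketched as follows, assuming the per-method MCC values have already been computed by some metrics tool. `select_classes`, the input dictionary, and the fixed seed are hypothetical choices for illustration, not the organizers' actual script.

```python
import random
from typing import Dict, List


def select_classes(mcc_by_class: Dict[str, List[int]], n: int, seed: int = 0) -> List[str]:
    """Drop trivial classes, then randomly sample up to `n` of the remaining ones.

    A class is trivial when every one of its methods has MCC < 3,
    so it is kept only if at least one method has MCC >= 3.
    """
    non_trivial = sorted(
        name for name, mccs in mcc_by_class.items() if any(m >= 3 for m in mccs)
    )
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(non_trivial, min(n, len(non_trivial)))


# Hypothetical project: class name -> MCC of each method.
project = {
    "Getters": [1, 1, 2],       # trivial: all methods below 3
    "Parser": [1, 12, 4],       # kept
    "Solver": [3],              # kept
    "Marker": [],               # no methods, treated as trivial
}
print(sorted(select_classes(project, n=2)))
```

Sorting the candidates before sampling keeps the result independent of dictionary order, so the same seed always yields the same benchmark.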
Benchmark Statistics
Largest Class:
Name = XPathParserTokenManager
Project = JXPATH
N. Statements = 1029
N. Branches = 872
Smallest Class:
Name = ForwardBackSubstitutionSolver
Project = LA4J
N. Statements = 26
N. Branches = 20
[Figure: histograms of the number of branches and the number of statements per class (y-axis: frequency)]
The Methodology
• Search Budgets = 10s, 30s, 60s, 120s, 240s, 300s, 480s
• Number of CUTs = 69
• Number of repetitions = 3
• All tools were executed in parallel (multi-threading) on the same machine
• Statistical analysis:
  • Friedman’s test: non-parametric test for multiple-problem analysis
  • Post-hoc Conover’s procedure for pairwise multiple comparisons
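To illustrate the first step of the statistical analysis, the sketch below computes the average rank of each tool across problems and the Friedman chi-square statistic in plain Python. It is a simplified illustration with hypothetical scores: the statistic omits the correction for ties, no p-value is computed, and the post-hoc Conover procedure is not covered. In practice a statistics package (for example `scipy.stats.friedmanchisquare`) would normally be used instead.

```python
from typing import Dict, List


def average_ranks(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Rank tools on each problem (rank 1 = highest score) and average the ranks.

    Tied tools share the mean of the rank positions they occupy.
    """
    tools = list(scores)
    n_problems = len(scores[tools[0]])
    totals = {tool: 0.0 for tool in tools}
    for p in range(n_problems):
        ordered = sorted(tools, key=lambda tool: -scores[tool][p])
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the group of tools tied with ordered[i].
            while j + 1 < len(ordered) and scores[ordered[j + 1]][p] == scores[ordered[i]][p]:
                j += 1
            mean_rank = (i + j) / 2 + 1
            for tool in ordered[i:j + 1]:
                totals[tool] += mean_rank
            i = j + 1
    return {tool: totals[tool] / n_problems for tool in tools}


def friedman_statistic(scores: Dict[str, List[float]]) -> float:
    """Friedman chi-square statistic (without tie correction)."""
    ranks = average_ranks(scores)
    k = len(ranks)                            # number of tools
    n = len(next(iter(scores.values())))      # number of problems
    expected = (k + 1) / 2                    # mean rank under the null hypothesis
    return 12 * n / (k * (k + 1)) * sum((r - expected) ** 2 for r in ranks.values())


# Hypothetical scores of three tools on three problems.
toy_scores = {
    "tool_a": [3.0, 2.5, 4.0],
    "tool_b": [2.0, 2.0, 3.0],
    "tool_c": [1.0, 1.5, 2.0],
}
print(average_ranks(toy_scores))       # {'tool_a': 1.0, 'tool_b': 2.0, 'tool_c': 3.0}
print(friedman_statistic(toy_scores))  # 6.0
```

A lower average rank means a tool scored better across the problems. The statistic is compared against a chi-square distribution with k − 1 degrees of freedom, and pairwise post-hoc comparisons are only meaningful when that test rejects the null hypothesis.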
The Results
Coverage Results
[Figure: coverage results for search budgets of 10s, 30s, 60s, and 480s]
There are 43 classes out of 69 (≈ 62%) for which at least one of the two participant tools could not generate any test case.
What happens if we consider only classes for which both EvoSuite and JTExpert could generate tests?
[Figure: filtered results with search budget = 480s]
Scalability
[Figure: % branch coverage and % strong mutation coverage vs. search budget (10s, 30s, 60s, 120s, 240s, 300s, 480s) for EvoSuite, JTExpert, T3, and Randoop]
Comparison for the class Parser.java extracted from the library Re2j.
N. Statements = 760, N. Branches = 565, N. Mutants = 203
Scoring
[Figure: score vs. search budget (10s, 30s, 60s, 120s, 240s, 300s, 480s) for EvoSuite, JTExpert, T3, and Randoop]
Generated vs. Manually-written Tests
Comparison of the scores achieved by
• EvoSuite after 480s
• JTExpert after 480s
• T3 after 480s
• Randoop after 480s
• Manually-written tests
• Optimal Score
N.B.: We only considered the 63 subjects for which we found developer-written tests.
[Figure: bar chart of scores (y-axis 0 to 500) for Optimal, EvoSuite, JTExpert, T3, Randoop, and Manual; value labels on the chart: 268, 61, 78, 125, 251]
Statistical Analysis
| Tool | Total score | St. dev. | Friedman’s test: rank | Friedman’s test: score | Statistically better than (Conover’s procedure) |
|---|---|---|---|---|---|
| EvoSuite | 1457 | 193 | 1 | 1.55 | JTExpert, T3, Randoop |
| JTExpert | 849 | 102 | 2 | 2.71 | T3, Randoop |
| T3 | 526 | 82 | 3 | 2.81 | Randoop |
| Randoop | 448 | 34 | 4 | 2.92 | |
Lessons Learnt
• Using multi-problem statistical tests
• Selection procedure to filter out (likely) trivial classes
• Subject categories: string manipulation, computationally intensive, object manipulation, etc.
• What next:
  • Publishing the benchmark infrastructure
  • Performing a more in-depth analysis for each subject category
  • More tools, new languages (e.g., C, C#)?
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Java Unit Testing Tool Competition — Fifth Round

  • 1. Java Unit Testing Tool Competition — Fifth Round. Annibale Panichella, Urko Rueda Molina
  • 2. Previous Editions
      Round | Year | Venue | Coverage tool | Mutation tool | #CUTs | #Projects | #Participants | Statistical tests
      Round 1 | 2013 | ICST | Cobertura | Javalanche | 77 | 5 | 2 | ✗
      Round 2 | 2014 | FITTEST | JaCoCo | PITest | 63 | 9 | 4 | ✗
      Round 3 | 2015 | SBST | JaCoCo | PITest | 63 | 9 | 8 | ✗
      Round 4 | 2016 | SBST | DEFECT4J (Real Faults) | DEFECT4J (Real Faults) | 68 | 5 | 4 | ✗
  • 3. New Edition (row added to the table of previous editions)
      Round 5 | 2017 | SBST | JaCoCo | PITest + Our Env. | 69 | 8 | 2+2 | ✓
  • 5. The Infrastructure
      • The previous edition used DEFECT4J to detect flaky tests and to measure effectiveness
      • In the new edition, we modified the infrastructure to work with libraries not in DEFECT4J
      • We developed our own tool to detect flaky tests
      • Effectiveness is based on mutation analysis: PITest + JaCoCo
      [Diagram: the two DEFECT4J components are replaced by "Our Tool" and "PITest + JaCoCo"]
  • 9. Test Management
      Flaky tests:
      • Pass during generation but fail when re-executed
      • Detection mechanism: we run each test suite five times
      • Ignored when computing the coverage scores
      Non-compiling tests:
      • Generated test suites were re-compiled in our own execution environment
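The five-run detection mechanism can be sketched as follows. This is a minimal illustration, not the organizers' tool: `FlakyTestDetector`, `findFlaky`, and the modelling of a test as a `BooleanSupplier` (true means pass) are all assumptions made for the example. It assumes every test passed at generation time, so any failure during re-execution marks the test as flaky.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: a "test" is any BooleanSupplier that returns true on pass.
final class FlakyTestDetector {
    // The slide states that each suite is re-executed five times.
    static final int RERUNS = 5;

    // Re-runs the whole suite `reruns` times and returns the names of tests
    // that failed at least once (by returning false or by throwing).
    static Set<String> findFlaky(Map<String, BooleanSupplier> suite, int reruns) {
        Set<String> flaky = new TreeSet<>();
        for (int run = 0; run < reruns; run++) {
            for (Map.Entry<String, BooleanSupplier> test : suite.entrySet()) {
                boolean passed;
                try {
                    passed = test.getValue().getAsBoolean();
                } catch (RuntimeException | AssertionError failure) {
                    passed = false; // an exception counts as a failed execution
                }
                if (!passed) {
                    flaky.add(test.getKey());
                }
            }
        }
        return flaky;
    }
}
```

Tests reported by `findFlaky` would then be left out before the coverage scores are computed, as the slide describes.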
  • 10. Metric Computation
      Code Coverage:
      • Statement coverage
      • Condition coverage
      Mutation Score:
      • We did not use PITest’s running engine, since it gave errors for test cases with ad-hoc/non-standard JUnit runners (e.g., in EvoSuite)
      • We only use the PITest engine for the generation of mutants
      • Combining PITest with JaCoCo: executing only mutants infecting covered lines
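The idea of executing only mutants on covered lines can be sketched as below. The class, the `Mutant` type, and the `isKilled` callback are hypothetical names for this example and do not correspond to PITest or JaCoCo APIs. The sketch also assumes that mutants on uncovered lines still count in the denominator as survivors, which the slide does not state.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch of coverage-filtered mutation analysis.
final class CoveredMutantScore {
    // A mutant is identified by an id and the source line it modifies.
    static final class Mutant {
        final int id;
        final int line;

        Mutant(int id, int line) {
            this.id = id;
            this.line = line;
        }
    }

    // Returns killed / total. `isKilled` stands in for running the test suite
    // against one mutant; it is only invoked for mutants on covered lines,
    // because a mutant on a line no test executes cannot be killed.
    static double score(List<Mutant> mutants, Set<Integer> coveredLines,
                        Predicate<Mutant> isKilled) {
        if (mutants.isEmpty()) {
            return 0.0;
        }
        int killed = 0;
        for (Mutant mutant : mutants) {
            if (!coveredLines.contains(mutant.line)) {
                continue; // skip the expensive execution, count as survived
            }
            if (isKilled.test(mutant)) {
                killed++;
            }
        }
        return (double) killed / mutants.size();
    }
}
```

The filter changes only the cost of the analysis, not its result, provided uncovered mutants are treated as survivors.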
  • 11. Scoring Formula
      We apply the same formula used in the last competition, since it combines coverage metrics, effectiveness, execution time, and the number of flaky/non-compiling tests.
      $$\mathit{covScore}_{\langle T,B,C,r\rangle} = 1 \times Cov_i + 2 \times Cov_b + 4 \times Cov_m$$
      $$\mathit{tScore}_{\langle T,B,C,r\rangle} = \mathit{covScore}_{\langle T,B,C,r\rangle} \times \min\left(1, \frac{L}{\mathit{genTime}}\right), \qquad L = 2 \times B$$
      $$\mathit{Score}_{\langle T,B,C,r\rangle} = \mathit{tScore}_{\langle T,B,C,r\rangle} + \mathit{penalty}_{\langle T,B,C,r\rangle}$$
      where T = generated test, B = search budget, C = class under test, r = independent run, Cov_i = statement coverage, Cov_b = branch coverage, Cov_m = strong mutation coverage, genTime = generation time, and penalty = percentage of flaky and non-compiling tests.
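A worked sketch of the first two parts of the formula follows. It assumes that coverage values are fractions in [0, 1] and that the time limit is L = 2 × B, as the slide annotation suggests; the class and method names are hypothetical. The penalty term is left out because the slides do not specify how it is scaled.

```java
// Hypothetical sketch of covScore and tScore from the scoring formula.
final class CompetitionScore {
    // Weights taken from the slide: statement 1, branch 2, strong mutation 4.
    static final double W_STATEMENT = 1.0;
    static final double W_BRANCH = 2.0;
    static final double W_MUTATION = 4.0;

    // covScore = 1 * Cov_i + 2 * Cov_b + 4 * Cov_m (inputs assumed in [0, 1]).
    static double covScore(double covStatement, double covBranch, double covMutation) {
        return W_STATEMENT * covStatement + W_BRANCH * covBranch + W_MUTATION * covMutation;
    }

    // tScore = covScore * min(1, L / genTime), with L = 2 * B (assumption).
    // A tool that stays within twice its budget keeps its full covScore.
    static double tScore(double covScore, double budgetSeconds, double genTimeSeconds) {
        double limit = 2.0 * budgetSeconds;
        return covScore * Math.min(1.0, limit / genTimeSeconds);
    }
}
```

For example, a run with full coverage on all three metrics has a covScore of 7; if it used a 60s budget but took 240s to generate tests, its tScore is 7 × (120 / 240) = 3.5.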
  • 15. The Tools: EvoSuite, JTExpert, Randoop (automatic unit test generation for Java), T3
  • 16. Selection of the Benchmark Classes
      Project | Source | Application Domain | # Classes | # Selected Classes
      BCEL | Apache commons | Bytecode manipulation | 431 | 10
      Jxpath | Apache commons | Java Beans manipulation with Path syntax | 180 | 10
      Imaging | Apache commons | Framework to write/read images with various formats | 427 | 4
      Google Gson | Google | Conversion of Java Objects into their JSON representation and vice versa | 174 | 9
      Re2j | Google | Regular expression engine for time-linear regular expression matching | 47 | 8
      Freehep | Java Analysis Studio | Open-source repository providing Java utilities for high energy physics applications | 180 | 10
      LA4j | Github | Linear Algebra primitives (matrices and vectors) and algorithms | 208 | 10
      Okhttp | Github | HTTP and HTTP/2 client for Android and Java applications | 193 | 8
  • 19. Selection Procedure
      HOW:
      • Computing McCabe’s cyclomatic complexity (MCC) for all methods in each Java library
      • Filtering out all trivial classes, i.e., classes that contain only methods with an MCC < 3
      • Random sampling from the pruned projects
      WHAT/WHY:
      • Removing (likely) trivial classes that are not challenging for the tools
      • Developers may use automated tools for complex classes
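The filtering and sampling steps can be sketched as below. The sketch takes per-method MCC values as input rather than computing them, and the names `BenchmarkSelection`, `isTrivial`, and `select`, as well as the seeded shuffle, are assumptions made for the example rather than the organizers' procedure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: prune trivial classes, then sample at random.
final class BenchmarkSelection {
    // From the slide: a class is trivial if every method has MCC < 3.
    static final int MCC_THRESHOLD = 3;

    static boolean isTrivial(List<Integer> methodComplexities) {
        for (int mcc : methodComplexities) {
            if (mcc >= MCC_THRESHOLD) {
                return false;
            }
        }
        return true;
    }

    // mccByClass maps a class name to the MCC of each of its methods.
    // Returns up to sampleSize (non-negative) non-trivial classes.
    static List<String> select(Map<String, List<Integer>> mccByClass, int sampleSize, long seed) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : mccByClass.entrySet()) {
            if (!isTrivial(entry.getValue())) {
                candidates.add(entry.getKey());
            }
        }
        Collections.sort(candidates); // fixed order so the seed alone determines the sample
        Collections.shuffle(candidates, new Random(seed));
        int n = Math.min(sampleSize, candidates.size());
        return new ArrayList<>(candidates.subList(0, n));
    }
}
```

Sorting before the seeded shuffle is a design choice for reproducibility: the same seed yields the same sample regardless of map iteration order.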
  • 20. Benchmark Statistics
      Largest Class: Name = XPathParserTokenManager, Project = JXPATH, N. Statements = 1029, N. Branches = 872
      Smallest Class: Name = ForwardBackSubstitutionSolver, Project = LA4J, N. Statements = 26, N. Branches = 20
      [Figures: frequency histograms of # Branches and # Statements]
  • 21. The Methodology
      • Search Budgets = 10s, 30s, 60s, 120s, 240s, 300s, 480s
      • Number of CUTs = 69
      • Number of repetitions = 3
      • All tools have been executed in parallel (multi-threading) on the same machine
      • Statistical analysis:
          • Friedman’s test: non-parametric test for multiple-problem analysis
          • Post-hoc Conover’s procedure for pairwise multiple comparisons
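The ranking step behind Friedman’s test can be sketched as follows. The code ranks the tools on each problem (rank 1 is the highest score, ties share the average rank), averages the ranks per tool, and computes the standard chi-square form of the Friedman statistic without tie correction. The class and method names are hypothetical, and the sketch covers neither the p-value nor Conover’s post-hoc procedure.

```java
// Hypothetical sketch of Friedman average ranks and test statistic.
final class FriedmanRanks {
    // Ranks one row of scores: rank 1 is the best (highest) score,
    // and tied scores receive the average of the ranks they span.
    static double[] rankRow(double[] scores) {
        int k = scores.length;
        double[] ranks = new double[k];
        for (int i = 0; i < k; i++) {
            int better = 0;
            int equal = 0; // counts scores[i] itself as well
            for (int j = 0; j < k; j++) {
                if (scores[j] > scores[i]) {
                    better++;
                } else if (scores[j] == scores[i]) {
                    equal++;
                }
            }
            ranks[i] = better + (equal + 1) / 2.0;
        }
        return ranks;
    }

    // scores[p][t] is the score of tool t on problem p.
    static double[] averageRanks(double[][] scores) {
        int n = scores.length;
        int k = scores[0].length;
        double[] sums = new double[k];
        for (double[] row : scores) {
            double[] ranks = rankRow(row);
            for (int t = 0; t < k; t++) {
                sums[t] += ranks[t];
            }
        }
        for (int t = 0; t < k; t++) {
            sums[t] /= n;
        }
        return sums;
    }

    // chi2_F = 12n / (k(k+1)) * (sum_t R_t^2 - k(k+1)^2 / 4),
    // where R_t is the average rank of tool t over n problems.
    static double statistic(double[][] scores) {
        int n = scores.length;
        int k = scores[0].length;
        double sumSquares = 0.0;
        for (double rank : averageRanks(scores)) {
            sumSquares += rank * rank;
        }
        return 12.0 * n / (k * (k + 1.0)) * (sumSquares - k * (k + 1.0) * (k + 1.0) / 4.0);
    }
}
```

With k tools, the average ranks always sum to k(k+1)/2, which gives a quick sanity check on any reported rank table.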
  • 23. Coverage Results [Figures: Search Budget = 10s, Search Budget = 30s]
  • 24. Coverage Results [Figures: Search Budget = 60s, Search Budget = 480s]
  • 25. Coverage Results
      There are 43 classes out of 69 (≈ 62%) for which at least one of the two participant tools could not generate any test case. What happens if we consider only classes for which both EvoSuite and JTExpert could generate tests?
      [Figure: filtered results with Search Budget = 480s]
  • 26. Scalability
      Comparison for the class Parser.java extracted from the library Re2j. N. Statements = 760, N. Branches = 565, N. Mutants = 203
      [Figures: % Branch Coverage (axis 0 to 100) and % Strong Mutation Coverage (axis 0 to 50) against Search Budget (10s, 30s, 60s, 120s, 240s, 300s, 480s) for EvoSuite, JTExpert, T3, Randoop]
  • 27. Scoring
      [Figure: Score (axis 0 to 300) against Search Budget (10s, 30s, 60s, 120s, 240s, 300s, 480s) for EvoSuite, JTExpert, T3, Randoop]
  • 28. Generated vs. Manually-written Tests
      Comparison of the scores achieved by:
      • EvoSuite after 480s
      • JTExpert after 480s
      • T3 after 480s
      • Randoop after 480s
      • Manually-written tests
      • Optimal Score
      N.B.: We only considered the 63 subjects for which we found developer-written tests.
      [Bar chart: bars for Optimal, EvoSuite, JTExpert, T3, Randoop, and Manual on a 0 to 500 score axis; extracted bar labels: 268, 61, 78, 125, 251]
  • 29. Statistical Analysis
      Tool | Total Score | St. Dev. | Friedman’s Test Rank | Friedman’s Test Score | Statistically better than (Conover’s procedure)
      EvoSuite | 1457 | 193 | 1 | 1.55 | JTExpert, T3, Randoop
      JTExpert | 849 | 102 | 2 | 2.71 | T3, Randoop
      T3 | 526 | 82 | 3 | 2.81 | Randoop
      Randoop | 448 | 34 | 4 | 2.92 |
  • 32. Lessons Learnt
      • Using multi-problem statistical tests
      • Selection procedure to filter out (likely) trivial classes
      • Subject categories: string manipulation, computationally intensive, object manipulation, etc.
      • What next:
          • Publishing the benchmark infrastructure
          • Performing a more in-depth analysis for each subject category
          • More tools, new languages (e.g., C, C#)?