Test-Driven Code Review:
An Empirical Study
Davide Spadini, Fabio Palomba,
Tobias Baum, Stefan Hanenberg, Magiel Bruntink, Alberto Bacchelli
@DavideSpadini ishepard
Code
Review
Software
Testing
When Testing Meets Code Review:
Why and How Developers Review Tests
Davide Spadini
Delft University of Technology
Software Improvement Group
Delft, The Netherlands
d.spadini@sig.eu
Maurício Aniche
Delft University of Technology
Delft, The Netherlands
m.f.aniche@tudelft.nl
Margaret-Anne Storey
University of Victoria
Victoria, BC, Canada
mstorey@uvic.ca
Magiel Bruntink
Software Improvement Group
Amsterdam, The Netherlands
m.bruntink@sig.eu
Alberto Bacchelli
University of Zurich
Zurich, Switzerland
bacchelli@i.uzh.ch
ABSTRACT
Automated testing is considered an essential process for ensuring
software quality. However, writing and maintaining high-quality
test code is challenging and frequently considered of secondary
importance. For production code, many open source and industrial
software projects employ code review, a well-established software
quality practice, but the question remains whether and how code
review is also used for ensuring the quality of test code. The aim
of this research is to answer this question and to increase our
understanding of what developers think and do when it comes to
reviewing test code. We used both quantitative and
qualitative methods to analyze more than 300,000 code reviews, and
interviewed 12 developers about how they review test files. This
work resulted in an overview of current code reviewing practices, a
set of identified obstacles limiting the review of test code, and a set
of issues that developers would like to see improved in code review
tools. The study reveals that reviewing test files is very different
from reviewing production files, and that the navigation within the
review itself is one of the main issues developers currently face.
Based on our findings, we propose a series of recommendations
and suggestions for the design of tools and future research.
CCS CONCEPTS
• Software and its engineering → Software testing and debugging;
KEYWORDS
software testing, automated testing, code review, Gerrit
ICSE ’18, May 27-June 3, 2018, Gothenburg, Sweden
© 2018 Copyright held by the owner/author(s). Publication rights licensed to
Association for Computing Machinery.
ACM ISBN 978-1-4503-5638-1/18/05...$15.00
https://doi.org/10.1145/3180155.3180192
ACM Reference Format:
Davide Spadini, Maurício Aniche, Margaret-Anne Storey, Magiel Bruntink,
and Alberto Bacchelli. 2018. When Testing Meets Code Review: Why and
How Developers Review Tests. In Proceedings of ICSE ’18: 40th International
Conference on Software Engineering, Gothenburg, Sweden, May 27-June 3,
2018 (ICSE ’18), 11 pages.
https://doi.org/10.1145/3180155.3180192
1 INTRODUCTION
Automated testing has become an essential process for improving
the quality of software systems [15, 31]. Automated tests (hereafter
referred to as just ‘tests’) can help ensure that production code is
robust under many usage conditions and that code meets performance
and security needs [15, 16]. Nevertheless, writing effective
tests is as challenging as writing good production code. A tester has
to ensure that test results are accurate, that all important execution
paths are considered, and that the tests themselves do not introduce
bottlenecks in the development pipeline [15]. Like production code,
test code must also be maintained and evolved [49].
As testing has become more commonplace, some have considered
that improving the quality of test code should help improve the
quality of the associated production code [21, 47]. Unfortunately,
there is evidence that test code is not always of high quality [11, 49].
Vahabzadeh et al. showed that about half of the projects they
studied had bugs in the test code [46]. Most of these bugs create
false alarms that can waste developer time, while other bugs cause
harmful defects in production code that can remain undetected. We
also see that test code tends to grow over time, leading to bloat and
technical debt [49].
As code review has been shown to improve the quality of source
code in general [12, 38], one practice that is now common in many
development projects is to use Modern Code Review (MCR) to
improve the quality of test code. But how is test code reviewed? Is
it reviewed as rigorously as production code, or is it reviewed at
all? Are there specific issues that reviewers look for in test files?
Does test code pose different reviewing challenges compared to the
review of production code? Do some developers use techniques for
reviewing test code that could be helpful to other developers?
To address these questions and find insights about test code
review, we conducted a two-phase study to understand how test
TDR: What? Who? Why? When? How?
Research Questions
RQ1: Does the order of presenting test code to the
reviewer influence the code review’s effectiveness?
• Code review experiment with 92 developers, ~150 reviews
RQ2: How do developers perceive the practice of
Test-Driven Code Review?
• 9 interviews + survey with 103 respondents
RQ1: Does the order of presenting test code to the reviewer influence the code review’s
effectiveness? — Method
[Figure: each participant performs a 1st review and an optional 2nd review.]
Dependent Variables
Bugs
• 2 different types:
1. Maintainability Issues
2. Functional Defects
• Both in test and in production code
Independent Variable: Treatment
• Production First: A.java, then ATest.java
• Test First: ATest.java, then A.java
• Only Production: A.java
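The three treatments differ only in which files the reviewer sees and in what order. A minimal sketch of that ordering follows. It is an illustration, not the tooling used in the study: the function names, the treatment labels, and the rule that a test file is any file ending in `Test.java` are all assumptions made for this example.

```python
from typing import List


def is_test_file(path: str) -> bool:
    """Hypothetical convention: test files are named <Something>Test.java."""
    return path.endswith("Test.java")


def order_for_review(files: List[str], treatment: str) -> List[str]:
    """Return the files in the order a reviewer would see them.

    The treatment labels ("production_first", "test_first",
    "only_production") are names chosen for this sketch.
    """
    production = [f for f in files if not is_test_file(f)]
    tests = [f for f in files if is_test_file(f)]
    if treatment == "production_first":
        return production + tests
    if treatment == "test_first":
        return tests + production
    if treatment == "only_production":
        return production
    raise ValueError(f"unknown treatment: {treatment}")


patch = ["A.java", "ATest.java"]
for treatment in ("production_first", "test_first", "only_production"):
    print(treatment, order_for_review(patch, treatment))
```

Keeping the patch fixed and changing only the presentation order is what lets the treatment act as the independent variable.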
[Figure: example treatment assignment. Participant A receives the Test First and Production First treatments across the two reviews. Participant B receives Production First in the 1st review and Only Production in the 2nd review.]
Control Variables
Code and Review Details, Participant’s Profile
• Duration of the code review
• Patch
• Role
• Experience:
• Reviewing
• Programming
• …
Demographics (confounders)
• 60 participants completed both reviews
• 32 participants completed only the 1st review (the 2nd review was optional)
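The completion counts are consistent with the totals quoted on the Research Questions slide (92 developers, ~150 reviews). A short check, using only the two counts given here:

```python
both_reviews = 60  # participants who completed both reviews
first_only = 32    # participants who completed only the 1st review

participants = both_reviews + first_only
reviews = 2 * both_reviews + first_only

print(participants)  # 92
print(reviews)       # 152
```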
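RQ1: Does the order of presenting test code to the reviewer influence the code review’s
effectiveness? — Results
[Figure: two bar charts, "Found Maintainability Issues" and "Found Functional Defects (Bugs)", each with the panels "In Test Code" and "In Production Code". Vertical axis from 0% to 100%. Bars for Test First, Production First, and Only Production.]
Bar labels as extracted, Found Maintainability Issues: 18%, 21%, 13% and 8%, 18%
Bar labels as extracted, Found Functional Defects (Bugs): 28%, 33%, 17% and 28%, 40%
Annotations: two comparisons are marked at the 0.01 level (one with a medium and one with a small effect size) and two at the 0.05 level.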
                          Test Bugs                Production Maintainability Issues
                          Estimate  Significance   Estimate  Significance
Total duration            -0.73     **              0.04     .
Is First Review            0.16     .              -0.15
Patch 2                   -0.92     ***            -2.84     ***
Treatment Prod. First                              -0.09
Treatment Test First       1.17     **             -1.24     **
Review Practice            0.06                     0.25     .
Programming Pract.        -0.66                    -0.21
Profession Dev Exp.       -0.09                    -0.29     *
Java Exp.                 -0.01                    -0.03
Worked Hours              -0.02                    -0.04
significance codes: ’***’ p < 0.001, ’**’ p < 0.01, ’*’ p < 0.05, ’.’ p < 0.1
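One way to read the treatment estimates is as odds ratios. The slides do not state the model family, so the sketch below assumes the estimates are coefficients of a logistic regression (log-odds). Under that assumption, exponentiating an estimate gives the multiplicative change in the odds of the outcome. The two values are the "Treatment Test First" estimates from the table above; the function name is chosen for this example.

```python
import math

# "Treatment Test First" estimates copied from the table above.
estimates = {
    "test bugs": 1.17,
    "production maintainability issues": -1.24,
}


def odds_ratio(estimate: float) -> float:
    """Convert a log-odds coefficient to an odds ratio.

    Only meaningful if the model is a logistic regression,
    which is an assumption of this sketch.
    """
    return math.exp(estimate)


for outcome, estimate in estimates.items():
    print(f"{outcome}: estimate {estimate:+.2f} -> odds ratio {odds_ratio(estimate):.2f}")
```

Under the stated assumption, this gives an odds ratio of about 3.22 for finding test bugs and about 0.29 for finding production maintainability issues with Test First, relative to each model's baseline treatment.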
Review Practice, Progr. Practice, Java Exp.
• With TDR, participants found more bugs in test code and the same number in production code
• With TDR, participants found fewer maintainability issues in production code
RQ2: How do developers perceive Test-Driven Code Review?
Topics: adoption, perceived advantages, perceived problems
Participants
• 9 interviews (working context: 3 OSS, 6 closed source)
• 103 survey respondents: 44% with 2 to 5 years of experience, 28% with 6 to 10 years of experience
Adoption: How often respondents use TDR
[Figure: bar chart of responses on the scale Never, Almost never, Sometimes, Almost always, Always. Vertical axis from 0 to 50.]
Perceived problems with TDR
• Tests give less knowledge on the production behavior
• Tests have low code quality
• Cannot choose a different order in CR tool
• Tests are perceived as less important
Perceived advantages of TDR
• Black-box view of production code
• Useful when the developer is unfamiliar with the code
• Improved test code quality
Research Questions
RQ1: Does the order of presenting test code to the reviewer influence the
code review’s effectiveness?
RQ2: How do developers perceive the practice of Test-Driven Code Review?
• The order matters
• Context dependent
• Educate on test writing and reviewing
• Test code importance == Prod. code importance

Test-Driven Code Review: An Empirical Study

  • 1. Test-Driven Code Review: An Empirical Study Davide Spadini, Fabio Palomba, Tobias Baum, Stefan Hanenberg, Magiel Bruntink, Alberto Bacchelli
  • 2. Davide Spadini, Fabio Palomba, Tobias Baum, Stefan Hanenberg, Magiel Bruntink, Alberto Bacchelli @DavideSpadini ishepard Test-Driven Code Review: An Empirical Study
  • 4. Code Review Software Testing When Testing Meets Code Review: Why and How Developers Review Tests Davide Spadini Delft University of Technology Software Improvement Group Delft, The Netherlands d.spadini@sig.eu Maurício Aniche Delft University of Technology Delft, The Netherlands m.f.aniche@tudelft.nl Margaret-Anne Storey University of Victoria Victoria, BC, Canada mstorey@uvic.ca Magiel Bruntink Software Improvement Group Amsterdam, The Netherlands m.bruntink@sig.eu Alberto Bacchelli University of Zurich Zurich, Switzerland bacchelli@i.uzh.ch ABSTRACT Automated testing is considered an essential process for ensuring software quality. However, writing and maintaining high-quality test code is challenging and frequently considered of secondary importance. For production code, many open source and industrial software projects employ code review, a well-established software quality practice, but the question remains whether and how code review is also used for ensuring the quality of test code. The aim of this research is to answer this question and to increase our un- derstanding of what developers think and do when it comes to reviewing test code. We conducted both quantitative and quali- tative methods to analyze more than 300,000 code reviews, and interviewed 12 developers about how they review test les. This work resulted in an overview of current code reviewing practices, a set of identied obstacles limiting the review of test code, and a set of issues that developers would like to see improved in code review tools. The study reveals that reviewing test les is very dierent from reviewing production les, and that the navigation within the review itself is one of the main issues developers currently face. Based on our ndings, we propose a series of recommendations and suggestions for the design of tools and future research. 
CCS CONCEPTS • Software and its engineering → Software testing and debugging; KEYWORDS software testing, automated testing, code review, Gerrit Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. ICSE ’18, May 27-June 3, 2018, Gothenburg, Sweden © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. ACM ISBN 978-1-4503-5638-1/18/05...$15.00 https://doi.org/10.1145/3180155.3180192 ACM Reference Format: Davide Spadini, Maurício Aniche, Margaret-Anne Storey, Magiel Bruntink, and Alberto Bacchelli. 2018. When Testing Meets Code Review: Why and How Developers Review Tests. In Proceedings of ICSE ’18: 40th International Conference on Software Engineering, Gothenburg, Sweden, May 27-June 3, 2018 (ICSE ’18), 11 pages. https://doi.org/10.1145/3180155.3180192 1 INTRODUCTION Automated testing has become an essential process for improving the quality of software systems [15, 31]. Automated tests (hereafter referred to as just ‘tests’) can help ensure that production code is robust under many usage conditions and that code meets performance and security needs [15, 16]. Nevertheless, writing effective tests is as challenging as writing good production code. A tester has to ensure that test results are accurate, that all important execution paths are considered, and that the tests themselves do not introduce bottlenecks in the development pipeline [15].
Like production code, test code must also be maintained and evolved [49]. As testing has become more commonplace, some have considered that improving the quality of test code should help improve the quality of the associated production code [21, 47]. Unfortunately, there is evidence that test code is not always of high quality [11, 49]. Vahabzadeh et al. showed that about half of the projects they studied had bugs in the test code [46]. Most of these bugs create false alarms that can waste developer time, while other bugs cause harmful defects in production code that can remain undetected. We also see that test code tends to grow over time, leading to bloat and technical debt [49]. As code review has been shown to improve the quality of source code in general [12, 38], one practice that is now common in many development projects is to use Modern Code Review (MCR) to improve the quality of test code. But how is test code reviewed? Is it reviewed as rigorously as production code, or is it reviewed at all? Are there specific issues that reviewers look for in test files? Does test code pose different reviewing challenges compared to the review of production code? Do some developers use techniques for reviewing test code that could be helpful to other developers? To address these questions and find insights about test code review, we conducted a two-phase study to understand how test
  • 6. Research Questions RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? (Code review experiment with 92 developers, ~150 reviews) RQ2: How do developers perceive the practice of Test-Driven Code Review? (9 interviews + survey with 103 respondents)
  • 8. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method 1st review, then 2nd review (optional)
  • 9. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method Dependent Variables Bugs • 2 different types: 1. Maintainability Issues 2. Functional Defects • Both in test and in production code
  • 10. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method Independent Variable: Treatment. Production First: A.java, then ATest.java. Test First: ATest.java, then A.java. Only Production: A.java.
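The three treatments above differ only in which files the reviewer sees and in what order. A minimal sketch of that ordering follows. It assumes that test files can be recognised by a `Test.java` suffix (as in `ATest.java`); the function and constant names are hypothetical and do not come from the study's tooling.

```python
from typing import List

# Treatment names are illustrative labels for the three conditions on the slide.
TREATMENTS = ("test_first", "production_first", "only_production")


def is_test_file(path: str) -> bool:
    # Assumption: test classes follow the "<Name>Test.java" naming convention.
    return path.endswith("Test.java")


def presentation_order(files: List[str], treatment: str) -> List[str]:
    """Return the files in the order a reviewer would see them."""
    if treatment not in TREATMENTS:
        raise ValueError(f"unknown treatment: {treatment}")
    tests = [f for f in files if is_test_file(f)]
    production = [f for f in files if not is_test_file(f)]
    if treatment == "test_first":
        return tests + production
    if treatment == "production_first":
        return production + tests
    # only_production: test files are not shown at all
    return production


if __name__ == "__main__":
    patch = ["A.java", "ATest.java"]
    for treatment in TREATMENTS:
        print(treatment, presentation_order(patch, treatment))
```

Splitting the files into two lists and concatenating them keeps the relative order within each group, so a larger patch would still list its production files in their original sequence.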
  • 11. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method Participant A: two reviews, one Test First and one Production First. Participant B: 1st review Production First, 2nd review Only Production.
  • 12. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method Control Variables Code and Review Details, Participant’s Profile • Duration of the code review • Patch • Role • Experience: • Reviewing • Programming • …
  • 13. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Method Demographics confounders. 1st review: 32 participants completed only the 1st review. 2nd review (optional): 60 participants completed both reviews.
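The participation figures on this slide can be checked against the "92 developers, ~150 reviews" stated with the research questions. The short calculation below uses only the two counts from the slide; the variable names are illustrative.

```python
# Counts taken from the slide.
completed_only_first = 32   # participants who did one review
completed_both = 60         # participants who did two reviews

participants = completed_only_first + completed_both
reviews = completed_only_first * 1 + completed_both * 2

print(participants)  # 92
print(reviews)       # 152
```

The total of 152 reviews agrees with the rounded "~150 reviews" figure.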
  • 14. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Results [Figure: two bar charts, "Found Maintainability Issues" and "Found Functional Defects (Bugs)", each with panels "In Test Code" and "In Production Code"; axis 0% to 100%; bars for Test First, Production First, Only Production. Bar labels as extracted: maintainability issues 18%, 21%, 13%, 8%, 18%; functional defects 28%, 33%, 17%, 28%, 40%. Annotations: p < 0.01 (medium effect size), p < 0.01 (small effect size), p < 0.05, p < 0.05.]
  • 15. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Results
      Variable                Test Bugs                 Production Maintainability Issues
                              Estimate  Significance    Estimate  Significance
      Total duration          -0.73     **              0.04      .
      Is First Review         0.16      .               -0.15
      Patch 2                 -0.92     ***             -2.84     ***
      Treatment Prod. First                             -0.09
      Treatment Test First    1.17      **              -1.24     **
      Review Practice         0.06                      0.25      .
      Programming Pract.      -0.66                     -0.21
      Profession Dev Exp.     -0.09                     -0.29     *
      Java Exp.               -0.01                     -0.03
      Worked Hours            -0.02                     -0.04
      significance codes: ’***’ p < 0.001, ’**’ p < 0.01, ’*’ p < 0.05, ’.’ p < 0.1
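The slide does not say which kind of regression produced these estimates. If they are logistic-regression coefficients on the log-odds scale (an assumption made here, not stated on the slide), exponentiating an estimate gives an odds ratio, which is easier to read than the raw coefficient. The sketch below applies this to the two "Treatment Test First" estimates; the helper name is illustrative.

```python
import math


def odds_ratio(estimate: float) -> float:
    # Assumption: the estimate is a log-odds coefficient from a logistic model.
    return math.exp(estimate)


# "Treatment Test First" estimates copied from the table.
test_first_on_test_bugs = 1.17
test_first_on_prod_maintainability = -1.24

print(round(odds_ratio(test_first_on_test_bugs), 2))
print(round(odds_ratio(test_first_on_prod_maintainability), 2))
```

Under that assumption, a ratio above 1 means higher odds of finding the issue than in the baseline treatment, and a ratio below 1 means lower odds. This matches the direction of the reported results: more test bugs found and fewer production maintainability issues found with Test First.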
  • 18. RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? — Results Review Practice, Progr. Practice, Java Exp. With TDR, participants found more bugs in test code and the same number in production code. With TDR, participants found fewer maintainability issues in production code.
  • 20. RQ2: How do developers perceive Test-Driven Code Review? Perceived advantages; adoption; perceived problems
  • 21. RQ2: How do developers perceive Test-Driven Code Review? 9 interviews (working context: 3 OSS, 6 closed source); 103 survey respondents (44% with 2 to 5 years of experience, 28% with 6 to 10 years of experience)
  • 22. RQ2: How do developers perceive Test-Driven Code Review? Adoption: How often respondents use TDR [Bar chart: axis 0 to 50; categories Never, Almost never, Sometimes, Almost always, Always]
  • 23. RQ2: How do developers perceive Test-Driven Code Review? Perceived problems with TDR: tests give less knowledge on the production behavior; tests have low code quality; cannot choose a different order in the CR tool; tests are perceived as less important
  • 24. RQ2: How do developers perceive Test-Driven Code Review? Perceived advantages of TDR: black-box view of production code; useful when the developer is unfamiliar with the code; improved test code quality
  • 25. Research Questions RQ1: Does the order of presenting test code to the reviewer influence the code review’s effectiveness? RQ2: How do developers perceive the practice of Test-Driven Code Review?
  • 26. The order matters; context dependent; educate on test writing and reviewing; test code importance == prod. code importance