Wait for it: identifying “On-Hold”
self-admitted technical debt
Rungroj Maipradit Christoph Treude Hideaki Hata Kenichi Matsumoto
Journal-First Paper (Empirical Software Engineering)
Self-Admitted Technical Debt (SATD)
// The replaceAll is an ugly workaround for CAMEL-4954, awaiting
// a cleaner fix once CAMEL-4425 is fully resolved in all components
String name = URLDecoder.decode(parameter.substring(0, p), CHARSET);
String value = URLDecoder.decode(parameter.substring(p + 1)
.replaceAll("%", "%25"), CHARSET);
SATD
Self-admitted technical debt (SATD) refers to situations where a software developer
knows that their current implementation is not optimal and indicates this using a
source code comment.
Previous findings
• Up to 31% of the files contain SATD. [1]
[1] A. Potdar and E. Shihab, “An exploratory study on self-admitted technical debt,” in 30th IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, September 29 - October 3, 2014, pp. 91–100.
[2] F. Zampetti, A. Serebrenik, and M. Di Penta, “Was self-admitted technical debt removal a real removal? An in-depth perspective,” in 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 2018, pp. 526–536.
An Exploratory Study on
Self-Admitted Technical Debt
Aniket Potdar
Department of Software Engineering
Rochester Institute of Technology
Rochester, NY, USA
Email: asp6719@rit.edu
Emad Shihab
Department of Computer Science and Software Engineering
Concordia University
Montreal, QC, Canada
Email: eshihab@cse.concordia.ca
Abstract—Throughout a software development life cycle, devel-
opers knowingly commit code that is either incomplete, requires
rework, produces errors, or is a temporary workaround. Such
incomplete or temporary workarounds are commonly referred to
as ’technical debt’. Our experience indicates that self-admitted
technical debt is common in software projects and may negatively
impact software maintenance, however, to date very little is
known about them.
Therefore, in this paper, we use source-code comments in four
large open source software projects - Eclipse, Chromium OS,
Apache HTTP Server, and ArgoUML to identify self-admitted
technical debt. Using the identified technical debt, we study
1) the amount of self-admitted technical debt found in these
projects, 2) why this self-admitted technical debt was introduced
into the software projects and 3) how likely is the self-admitted
technical debt to be removed after their introduction. We find
that the amount of self-admitted technical debt exists in 2.4%
- 31% of the files. Furthermore, we find that developers with
higher experience tend to introduce most of the self-admitted
technical debt and that time pressures and complexity of the
code do not correlate with the amount of self-admitted technical
debt. Lastly, although self-admitted technical debt is meant to be
addressed or removed in the future, only between 26.3% - 63.5%
of self-admitted technical debt gets removed from projects after
introduction.
I. INTRODUCTION
Delivering high quality, defect-free software is the goal of
all software projects. To ensure the delivery of high quality
software, software project often plan their development and
maintenance efforts. However, in many cases, developers are
rushed into completing tasks for various reasons. A few of
these reasons mentioned in prior work include, cost reduc-
tion, satisfying customers and market pressure from competi-
tion [1]. Intuition and general belief indicate that such rushed
development tasks (also known as technical debt) negatively
impact software maintenance and overall quality [2].
A plethora of prior work proposed techniques to support
software maintenance and ensure high software quality. For
example, prior work focused on understanding and predicting
software defects (e.g. [3]), analyzing bug fix patterns (e.g. [4]),
and attempting to understand and eliminate rework and main-
tenance (e.g., [5]). The majority of the aforementioned prior
work used historical development data and source-code met-
rics to perform their studies. More recently, researchers lever-
aged natural language to help identify potentially problematic
areas of the software. For example, work by Tan et al. [6]
developed natural language processing tools to find comment-
bug inconsistencies. Other work identified the coevolutionary
relationship between source code and its associated comments
(e.g., [7], [8]) and used task annotations to manage productiv-
ity [9].
The majority of the prior work focused on quality issues
that are due to unintentional errors by developers (i.e., er-
rors introduced by the developers are assumed to mistakes).
However, to the best of our knowledge, very few prior studies
examined the impact of errors that might be introduced due to
intentional (i.e., self admitted) quick or temporary fixes (i.e.,
technical debt). Studying this self-admitted technical debt is
important since they appear frequently in some projects (as
we show later in this study) and prior work indicated that
they negatively impact quality [2].
Therefore, in this paper we perform an exploratory study
to better understand self-admitted technical debt. Inspired
by prior work (e.g., [6], [8], [10], [11]), we use source-
code comments to detect self-admitted technical debt. We
perform our study on four large open source projects - namely
Eclipse, Chromium OS, ArgoUML and Apache httpd. We
focus on quantifying the amount of self-admitted technical
debt (RQ1), on determining why self-admitted technical debt
is introduced (RQ2) and how much of self-admitted technical
debt is actually removed after their introduction (RQ3).
We make the following contributions:
• Identify comment patterns that indicate self-admitted
technical debt. We manually read through 101,762
code comments to determine patterns that indicated self-
admitted technical debt. In the end, we identified 62
different comment patterns that indicate self-admitted
technical debt.
• Measure how much self-admitted technical debt
exists, why self-admitted technical debt is introduced
and how much self-admitted technical debt is
removed after their introduction. We find that 2.4% -
31.0% of the files contain self-admitted technical debt,
that more experienced developers introduce more self-
admitted technical debt and that self-admitted technical
debt is introduced throughout their development activity
(i.e., they do not only introduce self-admitted technical
2014 IEEE International Conference on Software Maintenance and Evolution
1063-6773/14 $31.00 © 2014 IEEE
DOI 10.1109/ICSME.2014.31
91
• 20% - 50% of SATD removals were accidental or even unintended. [2]
Was Self-Admitted Technical Debt Removal a real Removal?
An In-Depth Perspective
Fiorella Zampetti
University of Sannio, Italy
fiorella.zampetti@unisannio.it
Alexander Serebrenik
Eindhoven University of Technology,
The Netherlands
a.serebrenik@tue.nl
Massimiliano Di Penta
University of Sannio, Italy
dipenta@unisannio.it
ABSTRACT
Technical Debt (TD) has been defined as “code being not quite
right yet”, and its presence is often self-admitted by developers
through comments. The purpose of such comments is to keep track
of TD and appropriately address it when possible. Building on
a previous quantitative investigation by Maldonado et al. on the
removal of self-admitted technical debt (SATD), in this paper we
perform an in-depth quantitative and qualitative study of how SATD
is addressed in five Java open source projects. On the one hand, we
look at whether SATD is “accidentally” removed, and the extent
to which the SATD removal is being documented. We found that
that (i) between 20% and 50% of SATD comments are accidentally
removed while entire classes or methods are dropped, (ii) 8% of
the SATD removal is acknowledged in commit messages, and (iii)
while most of the changes addressing SATD require complex source
code changes, very often SATD is addressed by specific changes to
method calls or conditionals. Our results can be used to better plan
TD management or learn patterns for addressing certain kinds of
TD and provide recommendations to developers.
CCS CONCEPTS
• Software and its engineering → Software evolution;
ACM Reference format:
Fiorella Zampetti, Alexander Serebrenik, and Massimiliano Di Penta. 2018.
Was Self-Admitted Technical Debt Removal a real Removal?
An In-Depth Perspective. In Proceedings of MSR ’18: 15th International Con-
ference on Mining Software Repositories , Gothenburg, Sweden, May 28–29,
2018 (MSR ’18), 11 pages.
DOI: 10.1145/3196398.3196423
1 INTRODUCTION
During software development activities it frequently happens that
developers push code that is not in right shape yet. This can occur
for several reasons, including pressure to release new features, need
for quickly patching faulty code or lack of suitable components
needed to implement certain features. The presence of “not quite
right code which we postpone making it right” has been referred
as Technical Debt (TD) by Cunningham [10].
The TD awareness is a key point to manage it [13]. Luckily,
Potdar and Shihab [27] have observed that often developers tend
to “self-admit” TD. This is done by inserting a comment near the
source code being affected by TD, for example indicating a “TODO”
and/or “FIXME”, that the current code is actually a “hack” or, more
generally, annotating that the code needs some improvement or
refactoring at least. In order to help keep tracking of SATD, vari-
ous authors have proposed approaches to automatically identify
SATD comments, using regular expressions [5] or Natural Language
Processing (NLP) [21]. Also, recently there have been attempts to
identify, based on previous SATD and on source code features,
where there could be “TD that should be admitted” [36]. In sum-
mary, while developers keep track of TD, and there are ways of
detecting where TD has or should have been admitted, one may
wonder whether anybody takes care of improving that sub-optimal
code and therefore addressing the SATD.
Bavota and Russo [5] and Maldonado et al. [11] have studied
the extent to which SATDs are removed. In both studies removal
has been interpreted as removal of the comments reflecting SATD
rather than removal of the code affected by the SATD. Maldonado et
al. [11] have also surveyed developers involved in the introduction
and/or removal of technical debt finding that SATD is predomi-
nantly removed when bugs are fixed or new features are added.
While previous work has quantitatively studied the extent to
which SATD comments disappear, to the best of our knowledge
the existing literature lacks a deep and systematic analysis of how
“self-admitted” technical debt has been removed. This is relevant
for several reasons. First, it is useful for estimating the effort to
allocate for the improvement of the code quality. In other words,
does the improvement require massive code rewriting, refactoring
operations, library replacement/adaptation or is it simply matter
of making the implementation more robust by changing some pre-
conditions?
Second, the identification of recurring SATD-fixing patterns
might be useful, at least in bounded circumstances, to recommend
possible solutions. For example, there are cases when the code might
have been made more robust by adding a null check, simplifying a
complex condition, or replacing an API with an alternative one.
Last, but not least, SATD removal can be accidental, in other
words, the SATD disappears because the related code is no longer
in the system. It is indeed possible that, as it was also found for
code smells [32], the main reason for SATD removal is not develop-
ers intentionally taking care of it, but because the code is simply
no longer there. Also, it can happen that the “admission” (i.e., the
SATD comment) has been dropped, while the code still remains un-
changed. This either means that what it was foreseen as a problem
GOAL: To support developers in managing self-admitted technical debt.
How do developers remove SATD?
To obtain data on the removal of self-admitted technical debt, we used the online appendix of
Maldonado et al. [3]
Project               SATD removal commits   Sample
Apache Camel                           987      128
Apache Tomcat                          910      125
Apache Hadoop                          370       52
Gerrit Code Review                     133       19
Apache Log4j                           107        9
Total                                 2507      333
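As a quick consistency check on the table above, the following sketch recomputes the totals and the share of removal commits that was sampled in each project. The counts are copied from the table; the function name `sample_share` is only illustrative, and the slides do not say how the sample sizes were chosen.

```python
# Counts copied from the table above: (SATD removal commits, sampled commits).
projects = {
    "Apache Camel": (987, 128),
    "Apache Tomcat": (910, 125),
    "Apache Hadoop": (370, 52),
    "Gerrit Code Review": (133, 19),
    "Apache Log4j": (107, 9),
}


def sample_share(commits, sample):
    """Return the sampled share of removal commits as a percentage."""
    return 100.0 * sample / commits


total_commits = sum(commits for commits, _ in projects.values())
total_sample = sum(sample for _, sample in projects.values())

for name, (commits, sample) in projects.items():
    print(f"{name:<20} {sample_share(commits, sample):5.1f}%")
print(f"{'Total':<20} {sample_share(total_commits, total_sample):5.1f}%")
```

The per-project rows add up to the reported totals of 2507 commits and 333 sampled commits.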
[3] Maldonado E, Shihab E, Tsantalis N (2017b) Using natural language processing to automatically detect self-admitted technical debt. IEEE Transactions on Software Engineering 43(11):1044–1062
Does the comment represent SATD?
[Bar chart, number of comments] Yes: 284 (85%); No: 19 (6%); N/A: 30 (9%)
What kind of SATD was it?
[Bar chart, number of comments]
Functionality needed     124 (44%)
Refactoring needed        49 (17%)
Clarification request     43 (15%)
Workaround                24 (8%)
Wait                      13 (5%)
Bug                       12 (4%)
Explanation                5 (2%)
Other                     14 (5%)
Did the commit fix the SATD?
[Bar chart, number of comments] Yes: 118 (42%); No: 166 (58%)
What kind of fix was it?
[Bar chart, number of comments]
Implementation        68 (58%)
Refactoring           18 (15%)
Removing code         14 (12%)
Uncommenting code      8 (7%)
Removing workaround    5 (4%)
Other                  5 (4%)
Relationship between type of SATD and the
corresponding fixes
[Heat map; cell shading ranges from low frequency to high frequency]
Type of SATD            Implementation  Refactoring  Removing  Uncommenting  Removing    Other  Not fixed
                                                     code      code          workaround
Functionality needed                56            1         0             0           0      0         69
Refactoring needed                   2           16         1             0           0      1         29
Clarification request                5            0         4             0           0      1         33
Workaround                           2            0         2             3           5      0         12
Wait                                 2            0         2             3           0      0          6
Bug                                  2            0         1             2           0      1          6
Explanation                          0            0         0             0           0      0          5
Other                                1            1         4             0           0      2          6
Could the same fix be applied to similar SATD
in a different project?
[Bar chart, number of comments] Possibly: 40 (34%); No: 78 (66%)
Does the SATD include a condition?
[Bar chart, number of comments] Yes: 27 (10%); No: 257 (90%)
10% of self-admitted technical debt comments include a condition.
On-hold SATD
// The replaceAll is an ugly workaround for CAMEL-4954, awaiting
// a cleaner fix once CAMEL-4425 is fully resolved in all components
String name = URLDecoder.decode(parameter.substring(0, p), CHARSET);
String value = URLDecoder.decode(parameter.substring(p + 1)
        .replaceAll("%", "%25"), CHARSET);
“On-hold” SATD is technical debt whose comment contains a condition indicating
that the developer is waiting for a certain event or for updated functionality
to be implemented elsewhere.
Identifying on-hold SATD comments
with their conditions
An overall process of classifying on-hold SATD comments
and detecting the specific event developers are waiting for
Dataset
• Previous research collected data from commits removing SATD in 15 open source projects. [3][4]
• We manually classify SATD comments into “on-hold” or not.
Characteristic                    # of comments
Excluded (558)
  Non SATD                                  225
  Sample of removed SATD                    333
Classification data (5,248)
  SATD with condition                       267
  SATD without condition                  4,981
Term abstraction
Before: // TODO: CAMEL-1475 should fix this
After:  // TODO: @abstractproduct @abstractbugid should fix this
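The term abstraction step can be sketched with a few regular expressions. This is only an illustration: the patterns below (`DATE_RE`, `BUG_ID_RE`, `VERSION_RE`) and the function `abstract_terms` are assumptions made for this example, not the rules used in the study, and they cover only the date, bug ID, and product version forms shown on these slides.

```python
import re

# Hypothetical patterns, chosen only to cover the examples on the slides.
MONTHS = r"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*"
DATE_RE = re.compile(
    r"\b(?:\d{4}-\d{2}-\d{2}|\d{1,2}\s+" + MONTHS + r"\s+\d{4})\b",
    re.IGNORECASE,
)
# Issue-tracker style IDs such as CAMEL-1475: upper-case key, dash, number.
BUG_ID_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")
# A product name followed by a dotted version number, such as "CAMEL 3.0".
VERSION_RE = re.compile(r"\b[A-Za-z][A-Za-z0-9]*\s+v?\d+(?:\.\d+)+\b")


def abstract_terms(comment):
    """Replace concrete dates, bug IDs, and versions with abstract tags."""
    comment = DATE_RE.sub("@abstractdate", comment)
    comment = BUG_ID_RE.sub("@abstractproduct @abstractbugid", comment)
    comment = VERSION_RE.sub("@abstractproduct @abstractversion", comment)
    return comment


print(abstract_terms("// TODO: CAMEL-1475 should fix this"))
```

Abstracting project-specific tokens this way lets comments from different projects share the same vocabulary before features are extracted.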
N-gram features extraction
Input:             2004-01-27 Username tomcat successfully authenticated
Extracted n-grams: username, tomcat, successfully authenticated
N-gram IDF [5]
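To make the idea of n-gram features concrete, here is a minimal sketch that extracts word n-grams and weights them by inverse document frequency. It is a simplified stand-in and does not reproduce the N-gram IDF method cited as [5]; the tokenizer, the `max_n` limit, and all function names are assumptions for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens (digits are dropped)."""
    return re.findall(r"[a-z@]+", text.lower())


def word_ngrams(tokens, max_n=2):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def ngram_idf(comments, max_n=2):
    """Compute log(N / df) for every n-gram over a list of comments."""
    doc_freq = Counter()
    for comment in comments:
        # Count each n-gram at most once per comment.
        doc_freq.update(set(word_ngrams(tokenize(comment), max_n)))
    n_docs = len(comments)
    return {gram: math.log(n_docs / df) for gram, df in doc_freq.items()}


tokens = tokenize("2004-01-27 Username tomcat successfully authenticated")
print(word_ngrams(tokens))
```

N-grams that occur in every comment get a weight of zero, while rarer multi-word phrases such as "successfully authenticated" receive higher weights.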
Classification
Auto-sklearn [6]
• Data preprocessing
• 14 options for feature preprocessing
• 15 classifiers
• Hyperparameters
What is the best performance of a classifier to
automatically identify on-hold SATD?
            Naïve      TF-IDF   N-gram TF-IDF         N-gram
            baseline            without rebalancing   TF-IDF
Precision   0.12       0.73     0.76                  0.75
Recall      0.66       0.60     0.77                  0.78
F1-score    0.20       0.66     0.76                  0.77
AUC         0.70       0.97     0.98                  0.98
Our proposed classifier, N-gram TF-IDF, achieves the best or tied-best score on
every metric except precision, where it is within 0.01 of the best score.
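The F1-score is the harmonic mean of precision and recall, so the reported values can be recomputed from the other two rows. The sketch below does this using the rounded figures from the slide; because the inputs are already rounded to two decimals, a recomputed F1 may differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the results on this slide.
reported = {
    "Naive baseline": (0.12, 0.66),
    "TF-IDF": (0.73, 0.60),
    "N-gram TF-IDF without rebalancing": (0.76, 0.77),
    "N-gram TF-IDF": (0.75, 0.78),
}

for name, (precision, recall) in reported.items():
    print(f"{name:<36} F1 = {f1_score(precision, recall):.3f}")
```

The naïve baseline illustrates why F1 is reported: its recall of 0.66 looks acceptable on its own, but its precision of 0.12 pulls the harmonic mean down to about 0.20.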
Condition detection
How well can our classifier automatically identify the
specific conditions in on-hold SATD?
Condition          Abstract form
Dates              @abstractdate
Bug IDs            @abstractproduct @abstractbugid
Product version    @abstractproduct @abstractversion
Date condition:    // Can be removed after 29 JUNE 2013
                   → // Can be removed after @abstractdate
Library condition: // TODO cmueller, remove … in CAMEL 3.0
                   → // TODO cmueller, remove … in @abstractproduct @abstractversion
Bug condition:     // FIXME (CAMEL-3091)
                   → // FIXME (@abstractproduct @abstractbugid)
How well can our classifier automatically identify the
specific conditions in on-hold SATD?
Able to identify:
// must setup policy for each route
// TODO: @abstractproduct @abstractbugid should fix this
Unable to identify:
// This crap is required to workaround a bug in hibernate
90% of the detected specific conditions are correct.
For 43% of the on-hold comments, we were able to identify
the specific condition that a developer was waiting for.
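One simple way to picture condition detection is a lookup for the abstract tags in an already abstracted comment. The slides do not spell out the detection procedure, so the tag-to-condition mapping and the function `detect_conditions` below are assumptions for illustration only.

```python
# Hypothetical mapping from abstract tags to the kind of condition they signal.
CONDITION_TAGS = {
    "@abstractdate": "date",
    "@abstractbugid": "bug",
    "@abstractversion": "product version",
}


def detect_conditions(abstracted_comment):
    """Return the kinds of conditions whose tags appear in the comment."""
    return [
        kind
        for tag, kind in CONDITION_TAGS.items()
        if tag in abstracted_comment
    ]


examples = [
    "// TODO: @abstractproduct @abstractbugid should fix this",
    "// This crap is required to workaround a bug in hibernate",
]
for comment in examples:
    print(detect_conditions(comment), comment)
```

Under this assumption, the second comment yields no condition because it names neither a date, a bug ID, nor a version, which matches the "unable to identify" example on the slide.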

Wait for it: identifying “On-Hold” self-admitted technical debt

  • 1.
    Wait for it:identifying “On-Hold” self-admitted technical debt Rungroj Maipradit Christoph Treude Hideaki Hata Kenichi Matsumoto Journal-First Paper (Empirical Software Engineering)
  • 2.
    Self-Admitted Technical Debt(SATD) 2 // The replaceAll is an ugly workaround for CAMEL-4954, awaiting a cleaner fix once CAMEL-4425 // is fully resolved in all components String name = URLDecoder.decode(parameter.substring(0, p), CHARSET); String value = URLDecoder.decode(parameter.substring(p + 1) .replaceAll("%", "%25"), CHARSET); SATD Self-admitted technical debt (SATD) refers to situations where a software developer knows that their current implementation is not optimal and indicates this using a source code comment.
  • 3.
    Previous findings 3 • Upto 31% of the files contain SATD. [1] [1] A. Potdar and E. Shihab, “An exploratory study on self-admitted technical debt,” in30th IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, September 29 -October 3, 2014, 2014, pp. 91–100. [2] F. Zampetti, A. Serebrenik, and M. Di Penta, “Was self-admitted technical debt removal a real removal? an in-depth perspective,” in2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 2018, pp. 526–536. An Exploratory Study on Self-Admitted Technical Debt Aniket Potdar Department of Software Engineering Rochester Institute of Technology Rochester, NY, USA Email: asp6719@rit.edu Emad Shihab Department of Computer Science and Software Engineering Concordia University Montreal, QC, Canada Email: eshihab@cse.concordia.ca Abstract—Throughout a software development life cycle, devel- opers knowingly commit code that is either incomplete, requires rework, produces errors, or is a temporary workaround. Such incomplete or temporary workarounds are commonly referred to as ’technical debt’. Our experience indicates that self-admitted technical debt is common in software projects and may negatively impact software maintenance, however, to date very little is known about them. Therefore, in this paper, we use source-code comments in four large open source software projects - Eclipse, Chromium OS, Apache HTTP Server, and ArgoUML to identify self-admitted technical debt. Using the identified technical debt, we study 1) the amount of self-admitted technical debt found in these projects, 2) why this self-admitted technical debt was introduced into the software projects and 3) how likely is the self-admitted technical debt to be removed after their introduction. We find that the amount of self-admitted technical debt exists in 2.4% - 31% of the files. 
Furthermore, we find that developers with higher experience tend to introduce most of the self-admitted technical debt and that time pressures and complexity of the code do not correlate with the amount of self-admitted technical debt. Lastly, although self-admitted technical debt is meant to be addressed or removed in the future, only between 26.3% - 63.5% of self-admitted technical debt gets removed from projects after introduction. I. INTRODUCTION Delivering high quality, defect-free software is the goal of all software projects. To ensure the delivery of high quality software, software project often plan their development and maintenance efforts. However, in many cases, developers are rushed into completing tasks for various reasons. A few of these reasons mentioned in prior work include, cost reduc- tion, satisfying customers and market pressure from competi- tion [1]. Intuition and general belief indicate that such rushed development tasks (also known as technical debt) negatively impact software maintenance and overall quality [2]. A plethora of prior work proposed techniques to support software maintenance and ensure high software quality. For example, prior work focused on understanding and predicting software defects (e.g. [3]), analyzing bug fix patterns (e.g. [4]), and attempting to understand and eliminate rework and main- tenance (e.g., [5]). The majority of the aforementioned prior work used historical development data and source-code met- rics to perform their studies. More recently, researchers lever- aged natural language to help identify potentially problematic areas of the software. For example, work by Tan et al. [6] developed natural language processing tools to find comment- bug inconsistencies. Other work identified the coevolutionary relationship between source code and its associated comments (e.g., [7], [8]) and used task annotations to manage productiv- ity [9]. 
The majority of the prior work focused on quality issues that are due to unintentional errors by developers (i.e., er- rors introduced by the developers are assumed to mistakes). However, to the best of our knowledge, very few prior studies examined the impact of errors that might be introduced due to intentional (i.e., self admitted) quick or temporary fixes (i.e., technical debt). Studying this self-admitted technical debt is important since they appear frequently in some projects (as we show later in this study) and prior work indicated that they negatively impact quality [2]. Therefore, in this paper we perform an exploratory study to better understand self-admitted technical debt. Inspired by prior work (e.g., [6], [8], [10], [11]), we use source- code comments to detect self-admitted technical debt. We perform our study on four large open source projects - namely Eclipse, Chromium OS, ArgoUML and Apache httpd. We focus on quantifying the amount of self-admitted technical debt (RQ1), on determining why self-admitted technical debt is introduced (RQ2) and how much of self-admitted technical debt is actually removed after their introduction (RQ3). We make the following contributions: • Identify comment patterns that indicate self-admitted technical debt. We manually read through 101,762 code comments to determine patterns that indicated self- admitted technical debt. In the end, we identified 62 different comment patterns that indicate self-admitted technical debt. • Measure how much self-admitted technical debt exists, why self-admitted technical debt is introduced and how much self-admitted technical debt is removed after their introduction. 
We find that 2.4% - 31.0% of the files contain self-admitted technical debt, that more experienced developers introduce more self- admitted technical debt and that self-admitted technical debt is introduced throughout their development activity (i.e., they do not only introduce self-admitted technical 2014 IEEE International Conference on Software Maintenance and Evolution 1063-6773/14 $31.00 © 2014 IEEE DOI 10.1109/ICSME.2014.31 91 2014 IEEE International Conference on Software Maintenance and Evolution 1063-6773/14 $31.00 © 2014 IEEE DOI 10.1109/ICSME.2014.31 91 2014 IEEE International Conference on Software Maintenance and Evolution 1063-6773/14 $31.00 © 2014 IEEE DOI 10.1109/ICSME.2014.31 91 2014 IEEE International Conference on Software Maintenance and Evolution 1063-6773/14 $31.00 © 2014 IEEE DOI 10.1109/ICSME.2014.31 91 Authorized licensed use limited to: NARASENTAN-KAGAKUGIJYUTSU. Downloaded on May 02,2021 at 08:19:32 UTC from IEEE Xplore. Restrictions apply. • 20% - 50% of the removals of SATD were accidental and are even unintended. [2] Was Self-Admitted Technical Debt Removal a real Removal? An In-Depth Perspective Fiorella Zampetti University of Sannio, Italy fiorella.zampetti@unisannio.it Alexander Serebrenik Eindhoven University of Technology, The Netherlands a.serebrenik@tue.nl Massimiliano Di Penta University of Sannio, Italy dipenta@unisannio.it ABSTRACT Technical Debt (TD) has been defined as “code being not quite right yet”, and its presence is often self-admitted by developers through comments. The purpose of such comments is to keep track of TD and appropriately address it when possible. Building on a previous quantitative investigation by Maldonado et al. on the removal of self-admitted technical debt (SATD), in this paper we perform an in-depth quantitative and qualitative study of how SATD is addressed in five Java open source projects. 
On the one hand, we look at whether SATD is “accidentally” removed, and the extent to which the SATD removal is being documented. We found that that (i) between 20% and 50% of SATD comments are accidentally removed while entire classes or methods are dropped, (ii) 8% of the SATD removal is acknowledged in commit messages, and (iii) while most of the changes addressing SATD require complex source code changes, very often SATD is addressed by specific changes to method calls or conditionals. Our results can be used to better plan TD management or learn patterns for addressing certain kinds of TD and provide recommendations to developers. CCS CONCEPTS • Software and its engineering → Software evolution; ACM Reference format: Fiorella Zampetti, Alexander Serebrenik, and Massimiliano Di Penta. 2018. Was Self-Admitted Technical Debt Removal a real Removal? An In-Depth Perspective. In Proceedings of MSR ’18: 15th International Con- ference on Mining Software Repositories , Gothenburg, Sweden, May 28–29, 2018 (MSR ’18), 11 pages. DOI: 10.1145/3196398.3196423 1 INTRODUCTION During software development activities it frequently happens that developers push code that is not in right shape yet. This can occur for several reasons, including pressure to release new features, need for quickly patching faulty code or lack of suitable components needed to implement certain features. The presence of “not quite right code which we postpone making it right” has been referred as Technical Debt (TD) by Cunningham [10]. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. 
To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. MSR ’18, Gothenburg, Sweden © 2018 ACM. 978-1-4503-5716-6/18/05...$15.00 DOI: 10.1145/3196398.3196423 The TD awareness is a key point to manage it [13]. Luckily, Potdar and Shihab [27] have observed that often developers tend to “self-admit” TD. This is done by inserting a comment near the source code being affected by TD, for example indicating a “TODO” and/or “FIXME”, that the current code is actually a “hack” or, more generally, annotating that the code needs some improvement or refactoring at least. In order to help keep tracking of SATD, vari- ous authors have proposed approaches to automatically identify SATD comments, using regular expressions [5] or Natural Language Processing (NLP) [21]. Also, recently there have been attempts to identify, based on previous SATD and on source code features, where there could be “TD that should be admitted” [36]. In sum- mary, while developers keep track of TD, and there are ways of detecting where TD has or should have been admitted, one may wonder whether anybody takes care of improving that sub-optimal code and therefore addressing the SATD. Bavota and Russo [5] and Maldonado et al. [11] have studied the extent to which SATDs are removed. In both studies removal has been interpreted as removal of the comments reflecting SATD rather than removal of the code affected by the SATD. Maldonado et al. [11] have also surveyed developers involved in the introduction and/or removal of technical debt finding that SATD is predomi- nantly removed when bugs are fixed or new features are added. While previous work has quantitatively studied the extent to which SATD comments disappear, to the best of our knowledge the existing literature lacks a deep and systematic analysis of how “self-admitted” technical debt has been removed. 
This is relevant for several reasons. First, it is useful for estimating the effort to allocate for the improvement of the code quality. In other words, does the improvement require massive code rewriting, refactoring operations, library replacement/adaptation, or is it simply a matter of making the implementation more robust by changing some preconditions? Second, the identification of recurring SATD-fixing patterns might be useful, at least in bounded circumstances, to recommend possible solutions. For example, there are cases when the code might have been made more robust by adding a null check, simplifying a complex condition, or replacing an API with an alternative one. Last, but not least, SATD removal can be accidental; in other words, the SATD disappears because the related code is no longer in the system. It is indeed possible that, as was also found for code smells [32], the main reason for SATD removal is not developers intentionally taking care of it, but that the code is simply no longer there. Also, it can happen that the “admission” (i.e., the SATD comment) has been dropped, while the code still remains unchanged. This either means that what was foreseen as a problem …
GOAL: To support developers in managing self-admitted technical debt.
  • 4.
    How do developers remove SATD?
    To obtain data on the removal of self-admitted technical debt, we used the online appendix of Maldonado et al. [3]
    Project | SATD removal commits | Sample
    Apache Camel | 987 | 128
    Apache Tomcat | 910 | 125
    Apache Hadoop | 370 | 52
    Gerrit Code Review | 133 | 19
    Apache Log4j | 107 | 9
    Total | 2507 | 333
    [3] Maldonado E, Shihab E, Tsantalis N (2017b) Using natural language processing to automatically detect self-admitted technical debt. IEEE Transactions on Software Engineering 43(11):1044–1062
  • 5.
    Does the comment represent SATD?
    [Bar chart, number of comments] Yes: 284 (85%); No: 19 (6%); N/A: 30 (9%)
  • 6.
    What kind of SATD was it?
    [Bar chart, number of comments; breakdown of the 284 comments that represent SATD]
    Functionality needed: 124 (44%)
    Refactoring needed: 49 (17%)
    Clarification request: 43 (15%)
    Workaround: 24 (8%)
    Wait: 13 (5%)
    Bug: 12 (4%)
    Explanation: 5 (2%)
    Other: 14 (5%)
  • 7.
    Did the commit fix the SATD?
    [Bar chart, number of comments; breakdown of the 284 comments that represent SATD] Yes: 118 (42%); No: 166 (58%)
  • 8.
    What kind of fix was it?
    [Bar chart, number of comments; breakdown of the 118 comments whose SATD was fixed]
    Implementation: 68 (58%)
    Refactoring: 18 (15%)
    Removing code: 14 (12%)
    Uncommenting code: 8 (7%)
    Removing workaround: 5 (4%)
    Other: 5 (4%)
  • 9.
    Relationship between type of SATD and the corresponding fixes
    [Heat map; cell shading in the original slide ranges from low frequency to high frequency]
    Type of SATD | Implementation | Refactoring | Removing code | Uncommenting code | Removing workaround | Other | Not fixed
    Functionality needed | 56 | 1 | 0 | 0 | 0 | 0 | 69
    Refactoring needed | 2 | 16 | 1 | 0 | 0 | 1 | 29
    Clarification request | 5 | 0 | 4 | 0 | 0 | 1 | 33
    Workaround | 2 | 0 | 2 | 3 | 5 | 0 | 12
    Wait | 2 | 0 | 2 | 3 | 0 | 0 | 6
    Bug | 2 | 0 | 1 | 2 | 0 | 1 | 6
    Explanation | 0 | 0 | 0 | 0 | 0 | 0 | 5
    Other | 1 | 1 | 4 | 0 | 0 | 2 | 6
  • 10.
    Could the same fix be applied to similar SATD in a different project?
    [Bar chart, number of comments; breakdown of the 118 comments whose SATD was fixed] Possibly: 40 (34%); No: 78 (66%)
  • 11.
    Does the SATD include a condition?
    [Bar chart carried over: Does the comment represent SATD?] Yes: 284 (85%); No: 19 (6%); N/A: 30 (9%)
  • 12.
    Does the SATD include a condition?
    [Bar chart, number of comments; breakdown of the 284 comments that represent SATD] Yes: 27 (10%); No: 257 (90%)
    10% of self-admitted technical debt comments include a condition.
  • 13.
    On-hold SATD
    // The replaceAll is an ugly workaround for CAMEL-4954, awaiting a cleaner fix once CAMEL-4425
    // is fully resolved in all components
    String name = URLDecoder.decode(parameter.substring(0, p), CHARSET);
    String value = URLDecoder.decode(parameter.substring(p + 1)
            .replaceAll("%", "%25"), CHARSET);
    “On-hold” SATD is technical debt whose comment contains a condition indicating that a developer is waiting for a certain event or for updated functionality to be implemented elsewhere.
  • 14.
    Identifying on-hold SATD comments with their conditions
    [Figure: overall process of classifying on-hold SATD comments and detecting the specific event developers are waiting for]
  • 15.
    Dataset
    • Previous research collected data from commits removing SATD in 15 open source projects. [3][4]
    • We manually classified SATD comments as “on-hold” or not.
    Characteristic | # of comments
    Excluded (558): Non SATD | 225
    Excluded (558): Sample of removed SATD | 333
    Classification data (5,248): SATD with condition | 267
    Classification data (5,248): SATD without condition | 4,981
  • 16.
    Term abstraction
    Before: // TODO: CAMEL-1475 should fix this
    After: // TODO: abstractproduct abstractbugid should fix this
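The term abstraction step can be sketched with a few regular expressions. This is a minimal illustration, not the authors' implementation: the three patterns and the function name `abstract_terms` are assumptions made for this example, and the placeholder spelling with a leading `@` follows the later slide on specific conditions.

```python
import re

# Hypothetical patterns; the slides do not state the actual abstraction rules.
MONTHS = r"(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[A-Z]*"
DATE = re.compile(r"\b\d{1,2} " + MONTHS + r" \d{4}\b", re.IGNORECASE)
BUG_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")             # e.g. CAMEL-1475
VERSION = re.compile(r"\b[A-Z][A-Za-z]+ \d+(?:\.\d+)+\b")  # e.g. CAMEL 3.0


def abstract_terms(comment: str) -> str:
    """Replace concrete dates, bug IDs and product versions with placeholders."""
    comment = DATE.sub("@abstractdate", comment)
    comment = BUG_ID.sub("@abstractproduct @abstractbugid", comment)
    comment = VERSION.sub("@abstractproduct @abstractversion", comment)
    return comment


if __name__ == "__main__":
    for raw in (
        "// TODO: CAMEL-1475 should fix this",
        "// Can be removed after 29 JUNE 2013",
        "// FIXME (CAMEL-3091)",
    ):
        print(abstract_terms(raw))
```

Replacing project-specific tokens with shared placeholders lets comments from different projects produce the same features, which is presumably why the step precedes feature extraction. A pattern this loose would also rewrite strings such as `UTF-8`, so a real rule set would need to be stricter.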
  • 17.
    N-gram features extraction
    Input: 2004-01-27 Username tomcat successfully authenticated
    Extracted features (N-gram IDF [5]): username, tomcat, successfully authenticated
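To make the idea of n-gram features concrete, the sketch below enumerates word n-grams and computes a plain inverse document frequency for each. It is not the N-gram IDF method cited on the slide, which additionally selects which n-grams to keep; the tokenizer and the function names `tokenize`, `word_ngrams` and `idf` are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words only (drops dates and digits)."""
    return re.findall(r"[a-z]+", text.lower())


def word_ngrams(tokens, max_n=2):
    """Return all contiguous word n-grams for n = 1..max_n."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def idf(documents, max_n=2):
    """Plain IDF, log(N / document frequency), for every n-gram in the corpus."""
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(word_ngrams(tokenize(doc), max_n)))
    total = len(documents)
    return {gram: math.log(total / count) for gram, count in doc_freq.items()}


if __name__ == "__main__":
    tokens = tokenize("2004-01-27 Username tomcat successfully authenticated")
    print(word_ngrams(tokens))
```

For the slide's input this enumerates every unigram and bigram, including "successfully authenticated"; choosing only the informative ones, as the slide's output suggests, is the part this sketch leaves out.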
  • 18.
    Classification
    Auto-sklearn [6]
    • 14 options for feature preprocessing
    • 15 classifiers
    • Hyperparameters
    • Data preprocessing
  • 19.
    What is the best performance of a classifier to automatically identify on-hold SATD?
    Metric | Naïve baseline | TF-IDF | N-gram TF-IDF without rebalancing | N-gram TF-IDF
    Precision | 0.12 | 0.73 | 0.76 | 0.75
    Recall | 0.66 | 0.60 | 0.77 | 0.78
    F1-score | 0.20 | 0.66 | 0.76 | 0.77
    AUC | 0.70 | 0.97 | 0.98 | 0.98
    Our proposed classifier, N-gram TF-IDF, achieves the best or tied-best score on every metric except precision, where it trails the best score by 0.01.
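As a sanity check on the table, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the rounded precision and recall values on the slide; because those inputs are already rounded to two decimals, the recomputed value can differ from the reported one in the last digit. The dictionary layout and function name are choices made for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as shown on the slide.
REPORTED = {
    "Naive baseline": (0.12, 0.66, 0.20),
    "TF-IDF": (0.73, 0.60, 0.66),
    "N-gram TF-IDF without rebalancing": (0.76, 0.77, 0.76),
    "N-gram TF-IDF": (0.75, 0.78, 0.77),
}

if __name__ == "__main__":
    for name, (p, r, reported_f1) in REPORTED.items():
        print(f"{name}: recomputed {f1_score(p, r):.3f}, reported {reported_f1:.2f}")
```

The naïve baseline illustrates why F1 is reported alongside recall: its recall of 0.66 looks reasonable, but a precision of 0.12 pulls the harmonic mean down to about 0.20.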
  • 20.
  • 21.
    How well can our classifier automatically identify the specific conditions in on-hold SATD?
    Condition | Abstract form
    Dates | @abstractdate
    Bug IDs | @abstractproduct @abstractbugid
    Product version | @abstractproduct @abstractversion
    Date condition: // Can be removed after 29 JUNE 2013 → // Can be removed after @abstractdate
    Library condition: // TODO cmueller, remove … in CAMEL 3.0 → // TODO cmueller, remove … in @abstractproduct @abstractversion
    Bug condition: // FIXME (CAMEL-3091) → // FIXME (@abstractproduct @abstractbugid)
  • 22.
    How well can our classifier automatically identify the specific conditions in on-hold SATD?
    Able to identify:
    // must setup policy for each route
    // TODO: @abstractproduct @abstractbugid should fix this
    Unable to identify:
    // This crap is required to workaround a bug in hibernate
    90% of the detected specific conditions are correct. For 43% of the on-hold comments, we were able to identify the specific condition that a developer was waiting for.
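One simple way to read the two examples above is that a specific condition can only be reported when the abstracted comment contains one of the placeholder tokens. The sketch below implements that reading; it is an assumption about how detection might work, not the procedure used in the study, and the names `CONDITION_PATTERNS` and `find_conditions` are hypothetical.

```python
import re

# Placeholder sequences taken from the slides; the mapping to condition
# kinds is an assumption made for this illustration.
CONDITION_PATTERNS = {
    "date": re.compile(r"@abstractdate"),
    "bug": re.compile(r"@abstractproduct @abstractbugid"),
    "version": re.compile(r"@abstractproduct @abstractversion"),
}


def find_conditions(abstracted_comment):
    """Return the kinds of specific condition found in an abstracted comment."""
    return [
        kind
        for kind, pattern in CONDITION_PATTERNS.items()
        if pattern.search(abstracted_comment)
    ]


if __name__ == "__main__":
    print(find_conditions("// TODO: @abstractproduct @abstractbugid should fix this"))
    print(find_conditions("// This crap is required to workaround a bug in hibernate"))
```

Under this reading, the hibernate comment yields no condition because it names neither a bug ID, a version, nor a date, which matches its "unable to identify" label on the slide.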