Wait for it: identifying “On-Hold” self-admitted technical debt

Rungroj Maipradit Christoph Treude Hideaki Hata Kenichi Matsumoto
Journal-First Paper (Empirical Software Engineering)

Self-Admitted Technical Debt (SATD)
// The replaceAll is an ugly workaround for CAMEL-4954, awaiting a cleaner fix once CAMEL-4425
// is fully resolved in all components
String name = URLDecoder.decode(parameter.substring(0, p), CHARSET);
String value = URLDecoder.decode(parameter.substring(p + 1).replaceAll("%", "%25"), CHARSET);
Self-admitted technical debt (SATD) refers to situations where a software developer
knows that their current implementation is not optimal and indicates this using a
source code comment.

Previous findings
• Up to 31% of the files contain SATD. [1]
[1] A. Potdar and E. Shihab, “An exploratory study on self-admitted technical debt,” in 30th IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, September 29 - October 3, 2014, pp. 91–100.
[2] F. Zampetti, A. Serebrenik, and M. Di Penta, “Was self-admitted technical debt removal a real removal? An in-depth perspective,” in 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 2018, pp. 526–536.
An Exploratory Study on
Self-Admitted Technical Debt
Aniket Potdar
Department of Software Engineering
Rochester Institute of Technology
Rochester, NY, USA
Email: asp6719@rit.edu
Emad Shihab
Department of Computer Science and Software Engineering
Concordia University
Montreal, QC, Canada
Email: eshihab@cse.concordia.ca
Abstract—Throughout a software development life cycle, devel-
opers knowingly commit code that is either incomplete, requires
rework, produces errors, or is a temporary workaround. Such
incomplete or temporary workarounds are commonly referred to
as ’technical debt’. Our experience indicates that self-admitted
technical debt is common in software projects and may negatively
impact software maintenance, however, to date very little is
known about them.
Therefore, in this paper, we use source-code comments in four
large open source software projects - Eclipse, Chromium OS,
Apache HTTP Server, and ArgoUML to identify self-admitted
technical debt. Using the identified technical debt, we study
1) the amount of self-admitted technical debt found in these
projects, 2) why this self-admitted technical debt was introduced
into the software projects and 3) how likely is the self-admitted
technical debt to be removed after their introduction. We find
that the amount of self-admitted technical debt exists in 2.4%
- 31% of the files. Furthermore, we find that developers with
higher experience tend to introduce most of the self-admitted
technical debt and that time pressures and complexity of the
code do not correlate with the amount of self-admitted technical
debt. Lastly, although self-admitted technical debt is meant to be
addressed or removed in the future, only between 26.3% - 63.5%
of self-admitted technical debt gets removed from projects after
introduction.
I. INTRODUCTION
Delivering high quality, defect-free software is the goal of
all software projects. To ensure the delivery of high quality
software, software projects often plan their development and
maintenance efforts. However, in many cases, developers are
rushed into completing tasks for various reasons. A few of
these reasons mentioned in prior work include, cost reduc-
tion, satisfying customers and market pressure from competi-
tion [1]. Intuition and general belief indicate that such rushed
development tasks (also known as technical debt) negatively
impact software maintenance and overall quality [2].
A plethora of prior work proposed techniques to support
software maintenance and ensure high software quality. For
example, prior work focused on understanding and predicting
software defects (e.g. [3]), analyzing bug fix patterns (e.g. [4]),
and attempting to understand and eliminate rework and main-
tenance (e.g., [5]). The majority of the aforementioned prior
work used historical development data and source-code met-
rics to perform their studies. More recently, researchers lever-
aged natural language to help identify potentially problematic
areas of the software. For example, work by Tan et al. [6]
developed natural language processing tools to find comment-
bug inconsistencies. Other work identified the coevolutionary
relationship between source code and its associated comments
(e.g., [7], [8]) and used task annotations to manage productiv-
ity [9].
The majority of the prior work focused on quality issues
that are due to unintentional errors by developers (i.e., errors
introduced by the developers are assumed to be mistakes).
However, to the best of our knowledge, very few prior studies
examined the impact of errors that might be introduced due to
intentional (i.e., self admitted) quick or temporary fixes (i.e.,
technical debt). Studying this self-admitted technical debt is
important since they appear frequently in some projects (as
we show later in this study) and prior work indicated that
they negatively impact quality [2].
Therefore, in this paper we perform an exploratory study
to better understand self-admitted technical debt. Inspired
by prior work (e.g., [6], [8], [10], [11]), we use source-
code comments to detect self-admitted technical debt. We
perform our study on four large open source projects - namely
Eclipse, Chromium OS, ArgoUML and Apache httpd. We
focus on quantifying the amount of self-admitted technical
debt (RQ1), on determining why self-admitted technical debt
is introduced (RQ2) and how much of self-admitted technical
debt is actually removed after their introduction (RQ3).
We make the following contributions:
• Identify comment patterns that indicate self-admitted
technical debt. We manually read through 101,762
code comments to determine patterns that indicated self-
admitted technical debt. In the end, we identified 62
different comment patterns that indicate self-admitted
technical debt.
• Measure how much self-admitted technical debt
exists, why self-admitted technical debt is introduced
and how much self-admitted technical debt is
removed after their introduction. We find that 2.4% -
31.0% of the files contain self-admitted technical debt,
that more experienced developers introduce more self-
admitted technical debt and that self-admitted technical
debt is introduced throughout their development activity
(i.e., they do not only introduce self-admitted technical
2014 IEEE International Conference on Software Maintenance and Evolution
1063-6773/14 $31.00 © 2014 IEEE
DOI 10.1109/ICSME.2014.31
91
• 20% - 50% of SATD removals were accidental,
or even unintended. [2]
Was Self-Admitted Technical Debt Removal a real Removal?
An In-Depth Perspective
Fiorella Zampetti
University of Sannio, Italy
fiorella.zampetti@unisannio.it
Alexander Serebrenik
Eindhoven University of Technology,
The Netherlands
a.serebrenik@tue.nl
Massimiliano Di Penta
University of Sannio, Italy
dipenta@unisannio.it
ABSTRACT
Technical Debt (TD) has been defined as “code being not quite
right yet”, and its presence is often self-admitted by developers
through comments. The purpose of such comments is to keep track
of TD and appropriately address it when possible. Building on
a previous quantitative investigation by Maldonado et al. on the
removal of self-admitted technical debt (SATD), in this paper we
perform an in-depth quantitative and qualitative study of how SATD
is addressed in five Java open source projects. On the one hand, we
look at whether SATD is “accidentally” removed, and the extent
to which the SATD removal is being documented. We found that
(i) between 20% and 50% of SATD comments are accidentally
removed while entire classes or methods are dropped, (ii) 8% of
the SATD removal is acknowledged in commit messages, and (iii)
while most of the changes addressing SATD require complex source
code changes, very often SATD is addressed by specific changes to
method calls or conditionals. Our results can be used to better plan
TD management or learn patterns for addressing certain kinds of
TD and provide recommendations to developers.
CCS CONCEPTS
• Software and its engineering → Software evolution;
ACM Reference format:
Fiorella Zampetti, Alexander Serebrenik, and Massimiliano Di Penta. 2018.
Was Self-Admitted Technical Debt Removal a real Removal?
An In-Depth Perspective. In Proceedings of MSR ’18: 15th International Con-
ference on Mining Software Repositories , Gothenburg, Sweden, May 28–29,
2018 (MSR ’18), 11 pages.
DOI: 10.1145/3196398.3196423
1 INTRODUCTION
During software development activities it frequently happens that
developers push code that is not in right shape yet. This can occur
for several reasons, including pressure to release new features, need
for quickly patching faulty code or lack of suitable components
needed to implement certain features. The presence of “not quite
right code which we postpone making it right” has been referred
as Technical Debt (TD) by Cunningham [10].
The TD awareness is a key point to manage it [13]. Luckily,
Potdar and Shihab [27] have observed that often developers tend
to “self-admit” TD. This is done by inserting a comment near the
source code being affected by TD, for example indicating a “TODO”
and/or “FIXME”, that the current code is actually a “hack” or, more
generally, annotating that the code needs some improvement or
refactoring at least. In order to help keep tracking of SATD, vari-
ous authors have proposed approaches to automatically identify
SATD comments, using regular expressions [5] or Natural Language
Processing (NLP) [21]. Also, recently there have been attempts to
identify, based on previous SATD and on source code features,
where there could be “TD that should be admitted” [36]. In sum-
mary, while developers keep track of TD, and there are ways of
detecting where TD has or should have been admitted, one may
wonder whether anybody takes care of improving that sub-optimal
code and therefore addressing the SATD.
Bavota and Russo [5] and Maldonado et al. [11] have studied
the extent to which SATDs are removed. In both studies removal
has been interpreted as removal of the comments reflecting SATD
rather than removal of the code affected by the SATD. Maldonado et
al. [11] have also surveyed developers involved in the introduction
and/or removal of technical debt finding that SATD is predomi-
nantly removed when bugs are fixed or new features are added.
While previous work has quantitatively studied the extent to
which SATD comments disappear, to the best of our knowledge
the existing literature lacks a deep and systematic analysis of how
“self-admitted” technical debt has been removed. This is relevant
for several reasons. First, it is useful for estimating the effort to
allocate for the improvement of the code quality. In other words,
does the improvement require massive code rewriting, refactoring
operations, library replacement/adaptation or is it simply matter
of making the implementation more robust by changing some pre-
conditions?
Second, the identification of recurring SATD-fixing patterns
might be useful, at least in bounded circumstances, to recommend
possible solutions. For example, there are cases when the code might
have been made more robust by adding a null check, simplifying a
complex condition, or replacing an API with an alternative one.
Last, but not least, SATD removal can be accidental, in other
words, the SATD disappears because the related code is no longer
in the system. It is indeed possible that, as it was also found for
code smells [32], the main reason for SATD removal is not develop-
ers intentionally taking care of it, but because the code is simply
no longer there. Also, it can happen that the “admission” (i.e., the
SATD comment) has been dropped, while the code still remains un-
changed. This either means that what it was foreseen as a problem
526
2018 ACM/IEEE 15th International Conference on Mining Software Repositories
GOAL: To support developers in managing self-admitted technical debt.

How do developers remove SATD?
To obtain data on the removal of self-admitted technical debt, we used the online appendix of
Maldonado et al. [3]

Project               SATD removal commits   Sample
Apache Camel                           987      128
Apache Tomcat                          910      125
Apache Hadoop                          370       52
Gerrit Code Review                     133       19
Apache Log4j                           107        9
Total                                 2507      333

[3] E. Maldonado, E. Shihab, and N. Tsantalis, “Using natural language processing to automatically detect self-admitted technical debt,” IEEE Transactions on Software Engineering, vol. 43, no. 11, pp. 1044–1062, 2017.

Does the comment represent SATD?
[Bar chart; y-axis: number of comments]
• Yes: 284 (85%)
• No: 19 (6%)
• N/A: 30 (9%)

What kind of SATD was it?
[Bar chart of the 284 comments that represent SATD; y-axis: number of comments]
• Functionality needed: 124 (44%)
• Refactoring needed: 49 (17%)
• Clarification request: 43 (15%)
• Workaround: 24 (8%)
• Wait: 13 (5%)
• Bug: 12 (4%)
• Explanation: 5 (2%)
• Other: 14 (5%)

Did the commit fix the SATD?
[Bar chart of the 284 comments that represent SATD; y-axis: number of comments]
• Yes: 118 (42%)
• No: 166 (58%)

What kind of fix was it?
[Bar chart of the 118 fixed SATD comments; y-axis: number of comments]
• Implementation: 68 (58%)
• Refactoring: 18 (15%)
• Removing code: 14 (12%)
• Uncommenting code: 8 (7%)
• Removing workaround: 5 (4%)
• Other: 5 (4%)

Relationship between type of SATD and the
corresponding fixes
[Heat map; cell shading ranges from low frequency to high frequency]

Type of SATD            Implementation   Refactoring   Removing code   Uncommenting code   Removing workaround   Other   Not fixed
Functionality needed                56             1               0                   0                     0       0          69
Refactoring needed                   2            16               1                   0                     0       1          29
Clarification request                5             0               4                   0                     0       1          33
Workaround                           2             0               2                   3                     5       0          12
Wait                                 2             0               2                   3                     0       0           6
Bug                                  2             0               1                   2                     0       1           6
Explanation                          0             0               0                   0                     0       0           5
Other                                1             1               4                   0                     0       2           6

Could the same fix be applied to similar SATD
in a different project?
[Bar chart of the 118 fixed SATD comments; y-axis: number of comments]
• Possibly: 40 (34%)
• No: 78 (66%)

Does the SATD include a condition?
[Bar chart of the 284 comments that represent SATD; y-axis: number of comments]
• Yes: 27 (10%)
• No: 257 (90%)
10% of self-admitted technical debt comments include a condition.

On-hold SATD
// The replaceAll is an ugly workaround for CAMEL-4954, awaiting a cleaner fix once CAMEL-4425
// is fully resolved in all components
String name = URLDecoder.decode(parameter.substring(0, p), CHARSET);
String value = URLDecoder.decode(parameter.substring(p + 1).replaceAll("%", "%25"), CHARSET);
“On-hold” SATD is technical debt whose comment contains a condition indicating
that the developer is waiting for a certain event, or for updated functionality
to be implemented elsewhere.

Identifying on-hold SATD comments
with their conditions
An overall process of classifying on-hold SATD comments
and detecting the specific event developers are waiting for

Dataset
• Previous research collected data from commits removing SATD in 15 open source projects. [3][4]
• We manually classify SATD comments into “on-hold” or not.

Characteristic                  # of comments
Excluded (558)
  Non SATD                                225
  Sample of removed SATD                  333
Classification data (5,248)
  SATD with condition                     267
  SATD without condition                4,981

Term abstraction
• Before: // TODO: CAMEL-1475 should fix this
• After: // TODO: abstractproduct abstractbugid should fix this
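The slides do not spell out the rules used for term abstraction, so the following is only a minimal sketch of the idea, assuming regular-expression matching. The three patterns and the function name `abstract_terms` are hypothetical; the replacement tokens use the `@abstract...` forms listed on the later slide about specific conditions.

```python
import re

# Hypothetical patterns (assumptions for illustration, not the authors' rules).
MONTHS = r"(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[A-Z]*"
DATE = re.compile(
    r"\b\d{1,2} " + MONTHS + r" \d{4}\b|\b\d{4}-\d{2}-\d{2}\b",
    re.IGNORECASE,
)
# Issue keys such as CAMEL-1475. Note this would also match strings like UTF-8.
BUG_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")
# A product name followed by a dotted version number, such as CAMEL 3.0.
VERSION = re.compile(r"\b[A-Za-z][A-Za-z0-9]* \d+(?:\.\d+)+\b")


def abstract_terms(comment: str) -> str:
    """Replace dates, bug IDs, and product versions with abstract tokens."""
    comment = DATE.sub("@abstractdate", comment)
    comment = BUG_ID.sub("@abstractproduct @abstractbugid", comment)
    comment = VERSION.sub("@abstractproduct @abstractversion", comment)
    return comment


if __name__ == "__main__":
    for raw in (
        "// TODO: CAMEL-1475 should fix this",
        "// Can be removed after 29 JUNE 2013",
        "// FIXME (CAMEL-3091)",
    ):
        print(abstract_terms(raw))
```

Abstracting project-specific terms this way lets comments from different projects share the same features, which is presumably why the step precedes feature extraction.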

N-gram features extraction
• Input: 2004-01-27 Username tomcat successfully authenticated
• Extracted n-grams (N-gram IDF [5]): username, tomcat, successfully authenticated
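To make the notion of n-gram features concrete, here is a small sketch of plain word n-gram TF-IDF weighting. It is a simplified stand-in and not the N-gram IDF method cited as [5], which selects useful n-grams differently; the function names and the choice of unigrams plus bigrams are assumptions for illustration.

```python
import math
from collections import Counter


def word_ngrams(text: str, max_n: int = 2) -> list[str]:
    """Return all word n-grams of length 1..max_n, lowercased."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    ]


def tf_idf(documents: list[str], max_n: int = 2) -> list[dict[str, float]]:
    """Weight each n-gram by term frequency times log inverse document frequency."""
    grams_per_doc = [Counter(word_ngrams(doc, max_n)) for doc in documents]
    # Document frequency: in how many documents does each n-gram occur?
    doc_freq = Counter(gram for grams in grams_per_doc for gram in grams)
    n_docs = len(documents)
    weights = []
    for grams in grams_per_doc:
        total = sum(grams.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in grams.items()
        })
    return weights


if __name__ == "__main__":
    comments = ["TODO remove after release", "TODO fix this"]
    for comment, scores in zip(comments, tf_idf(comments)):
        print(comment, scores)
```

An n-gram that occurs in every comment (here `todo`) gets weight zero, while n-grams specific to one comment, such as `after release`, are weighted up.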

Classification
Auto-sklearn [6] searches over:
• Data preprocessing
• 14 options for feature preprocessing
• 15 classifiers
• Hyperparameters

What is the best performance of a classifier to
automatically identify on-hold SATD?

Metric      Naïve baseline   TF-IDF   N-gram TF-IDF without rebalancing   N-gram TF-IDF
Precision             0.12     0.73                                0.76            0.75
Recall                0.66     0.60                                0.77            0.78
F1-score              0.20     0.66                                0.76            0.77
AUC                   0.70     0.97                                0.98            0.98

Our proposed classifier, N-gram TF-IDF, has the best (or tied-best) score on every
metric except precision, where it is within 0.01 of the best score.
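As a quick consistency check on the table, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the reported precision and recall; the dictionary simply restates the table's values, and the helper name `f1_score` is our own.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as shown in the table.
REPORTED = {
    "Naive baseline": (0.12, 0.66, 0.20),
    "TF-IDF": (0.73, 0.60, 0.66),
    "N-gram TF-IDF without rebalancing": (0.76, 0.77, 0.76),
    "N-gram TF-IDF": (0.75, 0.78, 0.77),
}

if __name__ == "__main__":
    for name, (p, r, reported_f1) in REPORTED.items():
        print(f"{name}: computed F1 = {f1_score(p, r):.3f}, reported = {reported_f1:.2f}")
```

Small differences between the computed and reported values are expected, because the precision and recall on the slide are themselves rounded to two decimals.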

How well can our classifier automatically identify the
specific conditions in on-hold SATD?

Condition         Abstract form
Dates             @abstractdate
Bug IDs           @abstractproduct @abstractbugid
Product version   @abstractproduct @abstractversion

Examples (original comment → abstracted comment):
• Date condition: // Can be removed after 29 JUNE 2013 → // Can be removed after @abstractdate
• Library condition: // TODO cmueller, remove … in CAMEL 3.0 → // TODO cmueller, remove … in @abstractproduct @abstractversion
• Bug condition: // FIXME (CAMEL-3091) → // FIXME (@abstractproduct @abstractbugid)
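Once a comment has been abstracted, the specific condition can be read off from the abstract tokens it contains. The following is a simplified sketch of that lookup, assuming the three abstract forms from the table; the labels, the function name `detect_conditions`, and the exact-token matching are illustrative assumptions rather than the authors' implementation.

```python
# Abstract forms as listed in the table; the labels are illustrative.
CONDITION_FORMS = {
    "date": ("@abstractdate",),
    "bug": ("@abstractproduct", "@abstractbugid"),
    "version": ("@abstractproduct", "@abstractversion"),
}


def detect_conditions(abstracted_comment: str) -> list[str]:
    """Return the labels of all abstract forms found in an abstracted comment."""
    # Treat parentheses as separators so "(@abstractproduct" still matches.
    tokens = abstracted_comment.replace("(", " ").replace(")", " ").split()
    found = []
    for label, form in CONDITION_FORMS.items():
        n = len(form)
        if any(tuple(tokens[i:i + n]) == form for i in range(len(tokens) - n + 1)):
            found.append(label)
    return found


if __name__ == "__main__":
    for comment in (
        "// Can be removed after @abstractdate",
        "// FIXME (@abstractproduct @abstractbugid)",
        "// This crap is required to workaround a bug in hibernate",
    ):
        print(comment, "->", detect_conditions(comment))
```

A comment with no abstract token, such as the hibernate workaround, yields an empty list: it may still be on-hold, but the event being waited for cannot be identified this way.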

• Able to identify:
  // must setup policy for each route
  // TODO: @abstractproduct @abstractbugid should fix this
• Unable to identify:
  // This crap is required to workaround a bug in hibernate
90% of the detected specific conditions are correct.
For 43% of the on-hold comments, we were able to identify
the specific condition that a developer was waiting for.
