Automated Identification of On-hold Self-admitted Technical Debt
Rungroj Maipradit (1), Bin Lin (2), Csaba Nagy (2), Gabriele Bavota (2), Michele Lanza (2), Hideaki Hata (1), Kenichi Matsumoto (1)
(1) Nara Institute of Science and Technology
(2) Università della Svizzera italiana
Technical Debt
2
Self-Admitted Technical Debt (SATD)
3
SATD
https://github.com/apache/hadoop/blob/e346e3638c595a512cd582739ff51fb64c3b4950/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L512
On-hold SATD
4
On-hold SATD[1]
https://github.com/apache/hadoop/blob/e346e3638c595a512cd582739ff51fb64c3b4950/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L512
On-hold SATD with References to Issues
5
Once the waiting condition has been fulfilled, the SATD becomes a form of “wrong documentation”.
Figure callouts: Issue id; Status & Resolution
https://issues.apache.org/jira/browse/HADOOP-6223
Automated Identification of On-hold SATD
6
RQ1: What is the accuracy of our approach in identifying On-hold SATD?
RQ2: How does On-hold SATD evolve in open source projects?
RQ3: To what extent can our approach identify “ready-to-be-removed” On-hold SATD?
Dataset
7
10 projects
3 issue tracking systems
1,530 comments (133 On-hold / 1,397 Cross-ref)
Automated Identification of On-hold SATD
8
RQ1: What is the accuracy of our approach in identifying On-hold SATD?
Investigates the performance of our classifier in identifying On-hold SATD.
RQ2: How does On-hold SATD evolve in open source projects?
Inspects how long On-hold SATD exists and how long it takes to address SATD after the issue is resolved.
RQ3: To what extent can our approach identify “ready-to-be-removed” On-hold SATD?
Evaluates the reliability of our approach in identifying On-hold SATD that should be removed.
9
RQ1 Methodology
Data preprocessing
• Term abstraction
• Lemmatization
• Special character removal
Feature extraction
• Extract n-grams by applying N-gram IDF
Classification selection
• Auto-sklearn (automated machine learning)
10
RQ1 Methodology: Data preprocessing
Term abstraction
Before: // TODO: CAMEL-1475 should fix this
After: // TODO: abstractissueid should fix this
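Term abstraction can be sketched with two regular-expression passes. This is a minimal illustration, not the authors' implementation: the Jira-style key pattern and the URL fragments used to recognise issue-tracker links are assumptions; only the placeholder strings "abstractissueid" and "abstracturl" come from the talk.

```python
import re

# Assumed patterns (not from the source): Jira-style keys such as CAMEL-1475,
# and issue-tracker URLs recognised by a few common path fragments.
ISSUE_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")
URL = re.compile(r"https?://\S+")
ISSUE_URL_HINT = re.compile(r"/browse/|/issues/|show_bug\.cgi")


def abstract_terms(comment):
    """Replace issue references and URLs with placeholder tokens."""

    def replace_url(match):
        url = match.group(0)
        return "abstractissueid" if ISSUE_URL_HINT.search(url) else "abstracturl"

    # URLs first, so that an issue key inside a URL is not rewritten twice.
    comment = URL.sub(replace_url, comment)
    return ISSUE_ID.sub("abstractissueid", comment)


print(abstract_terms("// TODO: CAMEL-1475 should fix this"))
```

Handling URLs before bare issue keys keeps each link mapped to exactly one placeholder.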
11
RQ1 Methodology: Feature extraction
N-gram IDF [2]
Input: 2004-01-27 Username tomcat successfully authenticated
Extracted terms: username, tomcat, successfully authenticated
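N-gram IDF is motivated by a weakness of plain IDF: because plain IDF rewards rarity, long n-grams that occur only once receive the highest weights. The sketch below shows that weakness only; it computes plain IDF over n-grams and is not an implementation of N-gram IDF. The toy comments are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, max_n):
    """Yield every contiguous n-gram (as a tuple) with 1 <= n <= max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def plain_idf(documents, max_n=3):
    """Plain IDF, log(N / df), computed for every n-gram in the corpus."""
    doc_freq = Counter()
    for doc in documents:
        # Count each n-gram at most once per document.
        doc_freq.update(set(ngrams(doc.lower().split(), max_n)))
    total = len(documents)
    return {gram: math.log(total / df) for gram, df in doc_freq.items()}


# Hypothetical comments, already term-abstracted.
corpus = [
    "remove this workaround once abstractissueid is fixed",
    "workaround until abstractissueid is fixed",
    "see abstractissueid for details",
    "temporary hack remove once abstractissueid is resolved",
]
idf = plain_idf(corpus)

recurring = idf[("abstractissueid", "is", "fixed")]
one_off = idf[("see", "abstractissueid", "for")]
print(recurring < one_off)
```

Here the recurring pattern "abstractissueid is fixed" scores lower than a trigram that appears in a single comment, which is the opposite of what a pattern extractor needs.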
12
RQ1 Methodology: Classification selection
Auto-sklearn [3] (automated machine learning) search space:
• Data preprocessing
• 14 feature preprocessing methods
• 15 classifiers
• Hyperparameters
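The idea behind automated classifier selection can be illustrated without the library: score every candidate configuration with cross-validation and keep the best one. The sketch below is not auto-sklearn's API or its search strategy. It swaps in an exhaustive search over a tiny hypothetical space, and the two classifiers (a majority-class baseline and a Jaccard-similarity k-NN), the toy comments, and the leave-one-out scoring are all assumptions made to keep it self-contained.

```python
def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(train, query, k):
    """Majority vote among the k training comments most similar to `query`."""
    ranked = sorted(train, key=lambda item: jaccard(item[0], query), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return int(2 * sum(votes) > len(votes))


def majority_predict(train, query):
    """Baseline: always predict the majority label of the training data."""
    labels = [label for _, label in train]
    return int(2 * sum(labels) > len(labels))


def leave_one_out_f1(predict, data, **params):
    """F1-score of `predict` when each item is held out in turn."""
    tp = fp = fn = 0
    for i, (tokens, label) in enumerate(data):
        guess = predict(data[:i] + data[i + 1:], tokens, **params)
        tp += int(guess == 1 and label == 1)
        fp += int(guess == 1 and label == 0)
        fn += int(guess == 0 and label == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical labelled comments: 1 = On-hold SATD, 0 = cross-reference.
raw = [
    ("todo remove once abstractissueid is fixed", 1),
    ("workaround until abstractissueid is fixed", 1),
    ("todo revert once abstractissueid is resolved", 1),
    ("see abstractissueid for details", 0),
    ("see abstractissueid for the discussion", 0),
    ("behaviour described in abstractissueid", 0),
    ("see abstractissueid for background", 0),
]
data = [(frozenset(text.split()), label) for text, label in raw]

# Candidate (classifier, hyperparameters) configurations.
search_space = [(majority_predict, {}), (knn_predict, {"k": 1}), (knn_predict, {"k": 3})]
scores = [
    (leave_one_out_f1(predict_fn, data, **params), predict_fn.__name__, params)
    for predict_fn, params in search_space
]
best_score, best_name, best_params = max(scores, key=lambda s: s[0])
print(best_name, best_params, round(best_score, 2))
```

Exhaustive search is only workable for a handful of configurations; a larger search space is the case a guided search strategy is meant for.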
Result
13
Group                    Approach                              Precision  Recall  F1-score  AUC
Original approach        N-gram + Auto-sklearn                 0.79       0.70    0.73      0.97
BOW as feature           BOW + Auto-sklearn                    0.69       0.68    0.67      0.94
With oversampling        N-gram + Oversampling + Auto-sklearn  0.38       0.48    0.41      0.87
Different ML algorithms  N-gram + Naive Bayes                  0.64       0.56    0.59      0.81
Different ML algorithms  N-gram + SVM                          0.87       0.38    0.51      0.95
Different ML algorithms  N-gram + KNN                          0.88       0.15    0.25      0.76
In 10-fold cross-validation, our original approach achieves the best F1-score and AUC.
RQ1
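For reference, precision, recall, and F1-score for the On-hold class can be computed from true and predicted labels as in the minimal sketch below. The labels are hypothetical and unrelated to the study's data; AUC is omitted because it needs ranked prediction scores rather than hard labels.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1-score for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = On-hold SATD, 0 = cross-reference comment.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

With a minority class this small, F1-score is more informative than plain accuracy, which a classifier could inflate by predicting cross-reference for everything.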
Automated Identification of On-hold SATD
14
RQ1 finding: Our original approach achieves the best performance on F1-score and AUC.
Distribution of life spans of removed issue-referring comments
15
The median life span of On-hold SATD comments is 42 days,
while it is 119.5 days for cross-reference comments.
RQ2
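A life span here is the number of days between the commit that introduced a comment and the commit that removed it. A minimal sketch of the median computation follows; the dates are hypothetical, and the mining of commit history is assumed to have happened already.

```python
from datetime import date
from statistics import median


def median_life_span(history):
    """Median number of days between a comment's introduction and removal."""
    return median((removed - introduced).days for introduced, removed in history)


# Hypothetical (introduced, removed) dates for three removed comments.
on_hold_history = [
    (date(2019, 1, 1), date(2019, 1, 11)),   # 10 days
    (date(2019, 3, 1), date(2019, 3, 31)),   # 30 days
    (date(2019, 5, 1), date(2019, 9, 1)),    # 123 days
]
print(median_life_span(on_hold_history))
```

The median is less sensitive than the mean to the few comments that linger for years.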
Distribution of days needed to address SATD comments after issues were resolved
16
Around 53% of On-hold SATD comments were removed on the same day the issue was resolved.
RQ2
However, it took longer than one year to remove 13% of On-hold SATD.
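The two proportions above can be derived from pairs of dates: when the referenced issue was resolved and when the comment was removed. The sketch below uses hypothetical dates and assumes a delay of zero days counts as "same day" and more than 365 days as "longer than one year".

```python
from datetime import date


def removal_delay_summary(pairs):
    """Return (share removed on the same day, share removed after more than a year)."""
    delays = [(removed - resolved).days for resolved, removed in pairs]
    same_day = sum(d == 0 for d in delays) / len(delays)
    over_a_year = sum(d > 365 for d in delays) / len(delays)
    return same_day, over_a_year


# Hypothetical (issue resolved, comment removed) dates.
pairs = [
    (date(2018, 2, 1), date(2018, 2, 1)),
    (date(2018, 6, 15), date(2018, 6, 15)),
    (date(2018, 3, 1), date(2018, 3, 11)),
    (date(2017, 1, 1), date(2018, 2, 5)),
]
print(removal_delay_summary(pairs))
```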
Automated Identification of On-hold SATD
17
RQ1 finding: Our original approach achieves the best performance on F1-score and AUC.
RQ2 finding: On-hold SATD has a shorter life span than cross-reference comments, but some On-hold SATD takes longer than a year to be removed.
Methodology
18
RQ3
Ready-to-be-removed On-hold SATD
Report to developers
Feedback: “I think this is correct finding. Would you like to put a patch for this”
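An On-hold SATD comment is treated as ready to be removed once the issue it waits for has been resolved. The check can be sketched as a lookup against issue statuses. The record type, the status names, and the issue keys below are hypothetical; in practice the statuses would come from the issue tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnHoldComment:
    """A detected On-hold SATD comment (hypothetical record layout)."""
    file: str
    line: int
    issue_id: str


# Assumed names for statuses that mean the waiting condition is fulfilled.
RESOLVED_STATES = {"Resolved", "Closed"}


def ready_to_be_removed(comments, issue_status):
    """Keep the comments whose referenced issue is already resolved."""
    return [c for c in comments if issue_status.get(c.issue_id) in RESOLVED_STATES]


comments = [
    OnHoldComment("Foo.java", 42, "PROJ-1"),
    OnHoldComment("Bar.java", 7, "PROJ-2"),
]
issue_status = {"PROJ-1": "Resolved", "PROJ-2": "Open"}
for comment in ready_to_be_removed(comments, issue_status):
    print(f"{comment.file}:{comment.line} waits on resolved issue {comment.issue_id}")
```

Comments whose issue is unknown to the status map are left alone rather than flagged.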
Methodology
19
RQ3
6 On-hold SATD cases were reported
2 responses from developers
Overall, the two cases for which we have already received feedback indicate the practical value of our approach for On-hold SATD identification and removal.
Automated Identification of On-hold SATD
20
RQ1 finding: Our original approach achieves the best performance on F1-score and AUC.
RQ2 finding: On-hold SATD has a shorter life span than cross-reference comments, but some On-hold SATD takes longer than a year to be removed.
RQ3 finding: Feedback indicates the practical value of our approach for On-hold SATD identification and removal.
Questions
21
One of our findings is that, after the issue had been resolved, 13% of comments were removed with a delay of more than one year. Does this problem exist only in OSS, or does it also occur in industry?
If two On-hold SATD comments reference the same issue and one of them has already been removed, is it possible to suggest a code modification for the other one?

Editor's Notes

  • #2 Hello everyone, I'm Rungroj Maipradit, a PhD student from Nara Institute of Science and Technology, Japan. I would like to present my research paper, “Automated Identification of On-hold Self-admitted Technical Debt”.
  • #3 Technical debt is a metaphor for the trade-off of accepting a short-term “hack” that can have long-lasting consequences if not properly handled.
  • #4 In many cases, developers know when they are about to introduce technical debt, and they leave documentation, such as code comments, to indicate its presence. This is referred to as self-admitted technical debt (SATD). In other words, SATD refers to situations where a software developer knows that the current implementation is not optimal and indicates this in a source code comment.
  • #5 One particular type of SATD is called “On-hold” SATD, which is defined by a waiting condition: an external event must happen before the technical debt can be removed. The waiting condition can refer to an issue ID, a date, or a program release version.
  • #6 In this example, the waiting condition refers to an issue by its issue ID. !!Click!! In some cases, the referenced issue has already been closed, but the On-hold SATD has not been removed. !!Click!! In essence, On-hold SATD comments are intentional reminders left in the source code whose sole purpose is to be removed. Once the “waiting condition” has been fulfilled, the SATD becomes a form of “wrong documentation” in the code.
  • #7 To solve this problem, we aim to build a classifier that automatically detects On-hold SATD and indicates whether it is ready to be removed. This leads to three research questions. RQ1: What is the accuracy of our approach in identifying On-hold SATD? RQ2: How does On-hold SATD evolve in open source projects? RQ3: To what extent can our approach identify “ready-to-be-removed” On-hold SATD?
  • #8 In this study, we extract comments that reference issues from 10 open-source projects. These projects use 3 issue tracking systems: Jira, GitHub, and Bugzilla. After extracting both existing and removed comments, we obtained 1,530 comments in total. After manual classification, we obtained 133 On-hold SATD comments and 1,397 cross-reference comments. (Note to self: 133 on-hold, 1,397 cross-ref; this should be fixed.)
  • #9 RQ1: What is the accuracy of our approach in identifying On-hold SATD? To answer this question, we investigate the performance of our classifier in identifying On-hold SATD.
  • #10 This involves three steps: data preprocessing, feature extraction, and classification selection.
  • #11 Our model includes three data preprocessing steps: term abstraction, lemmatization, and special character removal. Term abstraction: we abstract issue IDs and hyperlinks referring to issues to the string “abstractissueid”, while hyperlinks unrelated to issues are abstracted to “abstracturl”. This eliminates the impact of specific issue IDs and hyperlinks during classification, as we are not interested in their actual content. Lemmatization: we reduce the inflected forms of words to their dictionary form by considering the context in the sentence, thus increasing the frequency of each word. Special character removal: we remove all characters that are neither English letters nor digits.
  • #12 In the feature extraction step, we extract n-grams by applying N-gram IDF. N-gram IDF is a theoretical extension of IDF (Inverse Document Frequency). The traditional IDF approach assigns more weight to terms occurring in fewer documents, which does not work well for n-grams. N-gram IDF is designed to address this issue; it can determine the dominant n-grams and extract key terms of any length. In this work, we extract n-grams from On-hold SATD comments only, because we want to extract important patterns for detecting On-hold SATD, and we use these patterns to discriminate between On-hold SATD and cross-reference comments.
  • #13 In machine learning, two problems are well known: (1) no single machine learning method performs best on all data sets, and (2) some machine learning methods rely heavily on hyperparameter optimization. Automated machine learning addresses these problems by running multiple classifiers with different parameters to optimize performance. Thus, in the classification selection step, we apply auto-sklearn, an automated machine learning tool.
  • #14 In this study, we compare our approach with three groups of alternatives. The first group uses BOW (bag of words) as features to see the impact of a different feature set. The second group applies oversampling to compare the results with and without oversampling on the imbalanced dataset. The third group uses other machine learning algorithms to compare against the performance of our classifier. In 10-fold cross-validation, our original approach achieves the best performance on F1-score and AUC.
  • #15 RQ2: How does On-hold SATD evolve in open source projects? To answer this question, we inspect how long On-hold SATD exists and how long it takes to address SATD after the issue is resolved.
  • #16 First, we looked into the life span of removed issue-referring comments, for On-hold SATD and cross-reference comments separately. The median life span of On-hold SATD comments is 42 days, while it is 119.5 days for cross-reference comments. On-hold SATD requires maintenance actions from developers, whereas cross-reference comments stay much longer, as they are usually used for documentation purposes.
  • #17 Second, we investigated how long it takes to address On-hold SATD comments after the corresponding issues are resolved. Around 53% of On-hold SATD comments were removed on the same day the issue was resolved. However, it took longer than one year to remove 13% of On-hold SATD comments. // Also, after checking the status of the existing On-hold comments, we found 10 On-hold comments whose condition had already been resolved, which leads to RQ3.
  • #18 RQ3: To what extent can our approach identify “ready-to-be-removed” On-hold SATD? To answer this question, we evaluate the reliability of our approach in identifying On-hold SATD that should be removed because its condition has been resolved.
  • #19 To evaluate this, we gather ready-to-be-removed On-hold SATD; we consider On-hold SATD ready to be removed when its condition has been resolved. !!Click!! Then we create issues about the ready-to-be-removed On-hold SATD and evaluate the responses. !!Click!! This is an example of the feedback we received: “I think this is correct finding. Would you like to put a patch for this”
  • #20 In total, we reported six identified cases to developers, and we have received feedback from the developers on two of them. In both cases, the developers agreed that the On-hold SATD needs to be addressed. Overall, the two cases for which we have already received feedback indicate the practical value of our approach for On-hold SATD identification and removal.
  • #21 In conclusion, across 10 OSS projects and 3 issue tracking systems, our classifier with n-grams and auto-sklearn achieves the best performance on F1-score and AUC. While the life span of On-hold SATD is shorter than that of cross-reference comments, some On-hold SATD takes longer than a year to be removed. Feedback from developers indicates the practical value of our approach for the identification and removal of On-hold SATD.
  • #22 These are my questions. First: if two On-hold SATD comments reference the same issue and one of them has already been removed, is it possible to suggest a code modification for the other one? Second: one of our findings is that, after the issue had been resolved, 13% of comments were removed with a delay of more than one year. Does this problem exist only in OSS, or does it also occur in industry? Thank you for your attention.
  • #23 Across 10 OSS projects and 3 issue tracking systems. Using N-gram IDF, auto-sklearn, and term abstraction. Our classifier outperforms the alternatives on AUC and F1-score. Life span. Response from developers.
  • #24 We reported six identified cases to developers in three issue reports, as these six cases correspond to three subsystems of the Apache Hadoop project (two for Hadoop Common, one for Hadoop HDFS, and three for Hadoop YARN). We found 10 ready-to-be-addressed cases (4 had already been addressed by the time we performed the study). (Note to self: check this slide.)
  • #25 We apply this process because we are more interested in the existence of these types of terms than in the actual terms, which do not appear frequently.
  • #26 N-gram IDF can detect dominant n-grams and extract key terms of any length. [1] Compared to plain n-grams, N-gram IDF provides more useful data. [2]
  • #27 Auto-sklearn is an automated machine learning tool built on the scikit-learn library. It finds the best classifier and hyperparameters using Bayesian optimization. [3]