Understanding the Rationale for Updating a
Function’s Comment
Haroon Malik, Istehad Chowdhury, Hsiao-Ming Tsou,
Zhen Ming Jiang, Ahmed E. Hassan
School of Computing, Queen’s University, Canada
Documentation is vital for the successful
evolution of a software system
Why understand the rationale for
updating a comment?
Because…
Reduce the effort needed to understand code
Reduce maintenance cost
Prevent bugs
Increase reliability
Likelihood of updating a comment
Function 1.
function incrementValue($val)
{
    return $val + 1;
}
Function 2.
function processInput($val)
{ //loop 11 times.
    for ($i = 0; $i < 10; $i++) {
        // loop executes for the upper bound of J
        for ($j = 0; $j < 10; $j++) {
            $val = ($val | $i) << 2;
            $val = $val & $j << 2;
        }
    }
    return $val;
}
Study Dimensions
• Modified function characteristics (8 attributes)
– Long vs. short functions
– Long vs. short function names
– Well-documented functions
– Complex vs. simple functions (# of control statements)
• Change characteristics (8 attributes)
– Complex vs. simple change
– Large vs. small change
• Time and code ownership characteristics (9 attributes)
– Do habits change over time? Weekdays vs. weekends
– Same developer that changed it last time
Modeled as a classification problem
Comment update? Yes or no?
Measuring Performance
                Classified As
True Class      YES     NO
YES             a       b
NO              c       d
We measure the overall misclassification rate
= (b+c)/(a+b+c+d)
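The formula above can be checked with a small sketch. The function name and the counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def misclassification_rate(a: int, b: int, c: int, d: int) -> float:
    """Overall misclassification rate from the four confusion-matrix cells.

    a and d are the correctly classified changes (YES as YES, NO as NO);
    b and c are the misclassified ones.
    """
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (b + c) / total


# Hypothetical counts, not taken from the case study.
rate = misclassification_rate(a=40, b=10, c=10, d=40)
print(rate)
```

A lower rate means fewer changes were put in the wrong class, whichever direction the mistake went.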
Need
• Explainable model
• Resistant to noise
• Handles correlated attributes
• Minimal configuration
Random Forests
[Diagram: the project's comment update history forms the data set; random samples of it train random trees; each tree votes (e.g. Yes, No, No, No) and the vote gives the prediction.]
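The sample, train, and vote flow can be sketched in a few lines. This is a deliberately simplified illustration, not the setup used in the study: each "tree" is a single split on one randomly chosen attribute, the split point is assumed to be the sample mean, and the attribute values (changed dependencies, comment count) are invented.

```python
import random
from collections import Counter


def majority_vote(votes):
    """Return the label that received the most votes."""
    return Counter(votes).most_common(1)[0][0]


def train_stump(rows, labels, rng):
    """Fit a one-split 'tree' on one randomly chosen attribute (assumption)."""
    attr = rng.randrange(len(rows[0]))
    threshold = sum(row[attr] for row in rows) / len(rows)  # split at the mean
    low = [lab for row, lab in zip(rows, labels) if row[attr] <= threshold]
    high = [lab for row, lab in zip(rows, labels) if row[attr] > threshold]
    fallback = majority_vote(labels)
    return (attr, threshold,
            majority_vote(low) if low else fallback,
            majority_vote(high) if high else fallback)


def predict_stump(stump, row):
    attr, threshold, low_label, high_label = stump
    return low_label if row[attr] <= threshold else high_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Let every stump vote and return the winning label."""
    return majority_vote(predict_stump(stump, row) for stump in forest)


# Hypothetical data: (changed dependencies, comment count) per change.
rows = [(1, 0), (2, 1), (0, 2), (1, 1), (8, 6), (9, 7), (7, 5), (10, 8)]
labels = ["NO"] * 4 + ["YES"] * 4
forest = train_forest(rows, labels)
prediction = predict_forest(forest, (9, 6))
print(majority_vote(["YES", "NO", "NO", "NO"]))  # the vote shown on the slide
```

Real random forests grow full decision trees and choose a random subset of attributes at every split; the stumps here only keep the example short.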
Finding Top Attributes
• Sensitivity analysis for a particular attribute
• Randomly change its value in all samples
• Re-classify and compare performance
– The drop in performance indicates the importance of the attribute
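One way to read the steps above is as a permutation test. The sketch below assumes that "randomly change the value" means shuffling one attribute's column across the samples, and it uses a hypothetical fixed rule in place of a trained classifier; all names and data are illustrative.

```python
import random


def error_rate(classify, rows, labels):
    """Fraction of samples whose predicted label differs from the true one."""
    wrong = sum(1 for row, lab in zip(rows, labels) if classify(row) != lab)
    return wrong / len(rows)


def attribute_importance(classify, rows, labels, attr, seed=0):
    """Increase in error rate after shuffling one attribute's values."""
    rng = random.Random(seed)
    baseline = error_rate(classify, rows, labels)
    column = [row[attr] for row in rows]
    rng.shuffle(column)  # break the link between this attribute and the label
    perturbed = [row[:attr] + (value,) + row[attr + 1:]
                 for row, value in zip(rows, column)]
    return error_rate(classify, perturbed, labels) - baseline


# Hypothetical data: (changed dependencies, day of week) per change.
rows = [(1, 0), (2, 3), (0, 5), (1, 2), (8, 1), (9, 4), (7, 6), (10, 0)]
labels = ["NO"] * 4 + ["YES"] * 4


def classify(row):
    """Hypothetical rule that only looks at the first attribute."""
    return "YES" if row[0] >= 5 else "NO"


importance = [attribute_importance(classify, rows, labels, attr)
              for attr in range(2)]
print(importance)
```

An attribute the classifier never uses scores exactly zero, while shuffling an attribute it relies on can only keep the error the same or raise it in this setup, because the baseline error here is zero.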
Case Study
• Used 4 open source projects with over 39 years of development:
– PostgreSQL, FreeBSD, Gcluster and GCC
• Conducted 5 experiments
– 1 for each dimension
– 1 for all attributes of each project
– 1 for total combined attributes of all projects
Exp. #1 Characteristics of changed function
• Intuition
– Modifications to complex functions are trickier and more likely to introduce integration bugs
• Findings
– Likelihood of comment update is higher in functions
  • With a large number of comments
  • That are complex
Exp. #2 Characteristics of the change
• Intuition
– More extensive and complex changes will increase the probability that a comment will get updated
• Findings
– Likelihood of comment update is higher for changes
  • That are bug fixes
  • With a large number of changed dependencies
  • Which increase the complexity of a function (control statements)
Exp. #3 Change time and code ownership
• Intuition
– To see if time has any impact on a developer's tendency to update a comment
– To highlight the relation between a function and its developer
• Findings
– Likelihood of comment update
  • Depends on the weekday: developers are reluctant to update comments on certain weekdays
  • Does not depend on the developer: a non-creator of the function will update too
Exp. #4 All attributes
• Intuition
– To find the general trend across all attributes instead of a specific trend per dimension
• Findings
– The top attributes are consistent across projects
– The top attributes are from the changed function and change characteristics dimensions
  • Number of changed dependencies
  • Percentage of changed dependencies
  • Total number of comments
Exp. #5 All Projects
• Intuition
– Determine the most influential attributes across all projects
• Added an extra attribute “Project Name”
• Findings
– Project name did not bubble up as an important attribute
How well did we do?
Numbers Speak
• Performance of the classifier improves when data from all projects is combined. Overall misclassification rate ~ 20%
Conclusion
Random Forests
[Diagram: the training set is sampled into n sets of random cases; a classification algorithm is trained on each, giving n classifiers; each classifier labels the test set (L1, L2, L3, …, Ln); the n labels are combined by a vote into the final label L.]
