Deep Learning Anti-patterns from Code Metrics History
35th IEEE International Conference on Software Maintenance and Evolution
September 30th - October 4th 2019
Cleveland, OH USA
Antoine Barbez, Foutse Khomh, Yann-Gaël Guéhéneuc
Problem
Definition
Anti-patterns
"structures in the design that indicate violation of fundamental design
principles and negatively impact design quality"
"Certain structures in the code that suggest (sometimes they scream for)
the possibility of refactoring."
Suryanarayana et al. (2014)
Fowler (1999)
1/16
Problem
Definition
Detection
- Rely on structural metrics, e.g., LOC, Cyclomatic Complexity …
- Computed for each code component to be classified
1. Structural Anti-patterns Detection
2/16
Problem
Definition
Detection
1. Structural Anti-patterns Detection
- Rely on structural metrics, e.g., LOC, Cyclomatic Complexity …
- Computed for each code component to be classified
Example 1: Rule-based approaches
Lanza and Marinescu (2007)
2/16
Problem
Definition
Detection
- Rely on structural metrics, e.g., LOC, Cyclomatic Complexity …
- Computed for each code component to be classified
Example 1: Rule-based approaches
Example 2: Machine-learning-based approaches
1. Structural Anti-patterns Detection
2/16
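To make the rule-based flavor concrete: a minimal sketch of a detection strategy in the spirit of Lanza and Marinescu (2007), combining per-class structural metrics against fixed thresholds. The metric triple and the threshold values are illustrative assumptions for this sketch, not the published rule.

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions for this sketch, not the
# published values from Lanza and Marinescu).
FEW = 5
WMC_VERY_HIGH = 47
ONE_THIRD = 1 / 3

@dataclass
class ClassMetrics:
    name: str
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion

def is_god_class(m: ClassMetrics) -> bool:
    """Rule-based God Class detection: a class that uses many foreign
    attributes, is very complex, and has low cohesion."""
    return m.atfd > FEW and m.wmc >= WMC_VERY_HIGH and m.tcc < ONE_THIRD

# Usage: classify one class from its structural metrics.
print(is_god_class(ClassMetrics("Dog", atfd=12, wmc=80, tcc=0.1)))  # True
```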
Problem
Definition
Detection
- Anti-patterns affect how source code evolves over time when
changes are applied to the system
- Rely on an analysis of co-changes occurring between code
components
2. Historical Anti-patterns Detection
3/16
Problem
Definition
Detection
- Anti-patterns affect how source code evolves over time when
changes are applied to the system
- Rely on an analysis of co-changes occurring between code
components
Example: HIST (Historical Information for Smell deTection)
2. Historical Anti-patterns Detection
Palomba et al. (2013)
3/16
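To make the co-change idea concrete: a toy sketch that counts how often pairs of classes are modified together across the change history. The commit data below is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical change history: each commit is the set of classes it touches.
commits = [
    {"Dog", "Kennel"},
    {"Dog", "Owner", "Kennel"},
    {"Dog"},
    {"Owner", "Leash"},
    {"Dog", "Kennel"},
]

# Count how often each pair of classes changes together.
co_changes = Counter()
for commit in commits:
    for pair in combinations(sorted(commit), 2):
        co_changes[pair] += 1

# Frequently co-changing pairs hint at coupling that structural
# metrics alone may not reveal.
for pair, n in co_changes.most_common(3):
    print(pair, n)
```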
Problem
Definition
Limitations
Structural and historical detection techniques are
complementary.
HIST does not take into account the structural properties of
the changes.
4/16
Approach Convolutional Analysis of code Metrics Evolution (CAME)
Main idea:
- Analyze the history of source code metrics
- Use a Convolutional Neural Network to perform
classification
Relies on structural and historical information
Processes changes at a code-level granularity
A deep-learning-based approach
5/16
Approach Input: example
Let’s compute the history of the class Dog
for three metrics:
• Number of Methods Declared (NMD)
• Number of Attributes Declared (NAD)
• Lines Of Code (LOC)
With a history length of Lh = 10
6/16
Approach Input: example
NMD NAD LOC
2
Master (commit N)
7/16
Approach Input: example
NMD NAD LOC
2 2
Master (commit N)
7/16
Approach Input: example
NMD NAD LOC
2 2 6
Master (commit N)
7/16
Approach Input: example
NMD NAD LOC
2 2 6
2 2 6
Commit N - 1
7/16
Approach Input: example
NMD NAD LOC
2 2 6
2 2 6
1 2 6
Commit N - 2
7/16
Approach Input: example
NMD NAD LOC
2 2 6
2 2 6
1 2 6
1 1 3
Commit N - 3
7/16
Approach Input: example
NMD NAD LOC
2 2 6
2 2 6
1 2 6
1 1 3
1 1 3
Commit N - 4
7/16
Approach Input: example
NMD NAD LOC
2 2 6
2 2 6
1 2 6
1 1 3
1 1 3
1 1 3
Commit N - 5
7/16
Approach Input: example
404
File not found
NMD NAD LOC
2 2 6
2 2 6
1 2 6
1 1 3
1 1 3
1 1 3
0 0 0
0 0 0
0 0 0
0 0 0
Commit N - 6
7/16
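Putting the walkthrough together: below is a minimal sketch of how the Lh × 3 input matrix could be assembled, zero-padding the rows for commits at which the class's file does not exist (the 404 above). The helper metrics_at is a hypothetical stand-in for the actual metric extraction.

```python
import numpy as np

L_H = 10  # history length

def metric_history(commits, cls, metrics_at, l_h=L_H):
    """Build the (l_h, 3) input matrix for one class: one row of
    (NMD, NAD, LOC) per commit, newest first, zero-padded whenever
    the class's file does not exist."""
    rows = []
    for commit in commits[:l_h]:          # commits ordered newest first
        values = metrics_at(commit, cls)  # (NMD, NAD, LOC) or None
        rows.append(values if values is not None else (0, 0, 0))
    while len(rows) < l_h:                # history shorter than l_h
        rows.append((0, 0, 0))
    return np.array(rows, dtype=np.float32)

# Reproducing the walkthrough for the class Dog (commits N .. N-9):
dog = {0: (2, 2, 6), 1: (2, 2, 6), 2: (1, 2, 6),
       3: (1, 1, 3), 4: (1, 1, 3), 5: (1, 1, 3)}  # absent before N-5
print(metric_history(range(10), "Dog", lambda c, _: dog.get(c)))
```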
Approach Model
8/16
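A minimal sketch of what a convolutional classifier over the (Lh × #metrics) input could look like, written against the Keras API. The layer sizes and depth are illustrative assumptions, not CAME's published architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

L_H, N_METRICS = 10, 7  # history length x number of selected metrics

# Illustrative architecture: 1D convolutions slide along the history
# axis, treating the metrics as input channels.
model = models.Sequential([
    layers.Input(shape=(L_H, N_METRICS)),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of God Class
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
model.summary()
```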
Study Design God Class
"… one object with a lion's share of the responsibilities, while most
other objects only hold data or execute simple processes."
Brown et al. (1998)
9/16
Study Design God Class
"… one object with a lion's share of the responsibilities, while most
other objects only hold data or execute simple processes."
Brown et al. (1998)
Selected metrics:
• ATFD (Access To Foreign Data)
• LCOM5 (Lack of COhesion in Methods)
• LOC (Lines Of Code)
• NAD (Number of Attributes Declared)
• NADC (Number of Associated Data Classes)
• NMD (Number of Methods Declared)
• WMC (Weighted Method Count)
9/16
Study Design Studied Systems
System #Class #God Class
Android Opt Telephony 192 10
Android Support 109 4
Apache Ant 694 7
Apache Lucene 155 3
Apache Tomcat 925 5
Apache Xerces 512 15
ArgoUML 1230 22
Jedit 423 5
Total 4240 71
10/16
Study Design Studied Systems
Evaluation
System #Class #God Class
Android Opt Telephony 192 10
Android Support 109 4
Apache Ant 694 7
Apache Lucene 155 3
Apache Tomcat 925 5
Apache Xerces 512 15
ArgoUML 1230 22
Jedit 423 5
Total 4240 71
10/16
Study Design Studied Systems
Evaluation
Training
&
Tuning
System #Class #God Class
Android Opt Telephony 192 10
Android Support 109 4
Apache Ant 694 7
Apache Lucene 155 3
Apache Tomcat 925 5
Apache Xerces 512 15
ArgoUML 1230 22
Jedit 423 5
Total 4240 71
10/16
Study 1 Definition
RQ1: To what extent can historical values of source code metrics
improve detection performance?
Approach: Monitor the performance achieved by CAME for different
lengths of metrics history: Lh ∈ {1, 10, 50, 100, 250, 500, 1000}
For each value of Lh:
- Perform hyper-parameter tuning
- Build and train 10 distinct CNNs
- Retrieve the mean and standard deviation of precision, recall,
and F-measure (see the sketch below)
11/16
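A minimal sketch of this evaluation loop; build_came and evaluate are hypothetical helpers standing in for model construction/training and test-set scoring.

```python
import statistics

HISTORY_LENGTHS = (1, 10, 50, 100, 250, 500, 1000)

def run_study_1(build_came, evaluate, n_runs=10):
    """For each history length Lh, train n_runs distinct CNNs and report
    the mean and standard deviation of precision, recall, and F-measure."""
    results = {}
    for l_h in HISTORY_LENGTHS:
        runs = [evaluate(build_came(l_h)) for _ in range(n_runs)]
        results[l_h] = {
            key: (statistics.mean(r[key] for r in runs),
                  statistics.stdev(r[key] for r in runs))
            for key in ("precision", "recall", "f_measure")
        }
    return results
```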
Study 1 Results
RQ1: To what extent can historical values of source code metrics
improve detection performance?
12/16
Study 2 Definition
RQ2: How does CAME compare to other static ML algorithms?
• Decision Tree
• Multi Layer Perceptron (MLP)
• Support Vector Machine (SVM)
13/16
Study 2 Definition
RQ2: How does CAME compare to other static ML algorithms?
RQ3: How does CAME compare to existing detection techniques?
• Decision Tree
• Multi Layer Perceptron (MLP)
• Support Vector Machine (SVM)
• DECOR Moha et al. (2010)
• HIST Palomba et al. (2013)
• JDeodorant Fokaefs et al. (2011)
13/16
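A minimal sketch of how the three ML baselines could be set up with scikit-learn on per-class structural metrics; the feature matrix below is random placeholder data, for illustration only.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: one row of structural metrics per class,
# label 1 for God Class (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.random((200, 7))
y = (X[:, 0] > 0.8).astype(int)

baselines = {
    "Decision Tree": DecisionTreeClassifier(),
    "MLP": MLPClassifier(max_iter=1000),
    "SVM": SVC(),
}
for name, clf in baselines.items():
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: F-measure = {f1:.2f}")
```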
Study 2 Results
Approaches Precision Recall F-measure
Decision Tree 68 % 29 % 40 %
MLP 41 % 86 % 56 %
SVM 68 % 14 % 24 %
CAME 71 % 86 % 77 %
RQ2: How does CAME compare to other static ML algorithms?
14/16
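For reference, F-measure is the harmonic mean of precision and recall, F = 2 · P · R / (P + R); CAME's balanced precision and recall therefore translate into a much higher F-measure than the baselines' lopsided scores.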
Study 2 Results
Approaches Precision Recall F-measure
DECOR 24 % 36 % 29 %
HIST 20 % 43 % 27 %
JDeodorant 4 % 57 % 8 %
CAME 71 % 86 % 77 %
RQ3: How does CAME compare to existing detection techniques?
15/16
References
G. Suryanarayana, G. Samarthyam, and T. Sharma, Refactoring for Software Design Smells:
Managing Technical Debt. Morgan Kaufmann, 2014.
W. J. Brown, R. C. Malveau, W. H. Brown, H. W. McCormick III, and T. J. Mowbray,
AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis, 1st ed. John Wiley
and Sons, March 1998.
M. Fowler, Refactoring: Improving the Design of Existing Code. Boston, MA, USA:
Addison-Wesley, 1999.
F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia, and D. Poshyvanyk, "Detecting
bad smells in source code using change history information," in ASE, 2013, pp. 268–278.
N. Moha, Y.-G. Guéhéneuc, L. Duchien, and A.-F. Le Meur, "DECOR: A method for
the specification and detection of code and design smells," IEEE Transactions on Software
Engineering (TSE), vol. 36, no. 1, pp. 20–36, 2010.
M. Fokaefs, N. Tsantalis, E. Stroulia, and A. Chatzigeorgiou, "JDeodorant: identification and
application of extract class refactorings," in Software Engineering (ICSE), 2011 33rd
International Conference on. IEEE, 2011, pp. 1037–1039.
M. Lanza and R. Marinescu, Object-Oriented Metrics in Practice: Using Software Metrics to
Characterize, Evaluate, and Improve the Design of Object-Oriented Systems. Springer Science
& Business Media, 2007.
16/16
