Presented by: Sayed Mohsin Reza, Ph.D. Student, Computer Science, University of Texas at El Paso
Abstract:
Code smells are structures in the source code that suggest the possibility of refactorings. Consequently, developers may identify refactoring opportunities by detecting code smells. However, manual identification of code smells is challenging and tedious. To this end, a number of approaches have been proposed to identify code smells automatically or semi-automatically. Most of such approaches rely on manually designed heuristics to map manually selected source code metrics into predictions. However, it is challenging to manually select the best features. It is also difficult to manually construct the optimal heuristics. To this end, in this paper we propose a deep learning based novel approach to detecting code smells. The key insight is that deep neural networks and advanced deep learning techniques could automatically select features of source code for code smell detection, and could automatically build the complex mapping between such features and predictions. A big challenge for deep learning based smell detection is that deep learning often requires a large number of labeled training data (to tune a large number of parameters within the employed deep neural network) whereas existing datasets for code smell detection are rather small. To this end, we propose an automatic approach to generating labeled training data for the neural network based classifier, which does not require any human intervention. As an initial try, we apply the proposed approach to four common and well-known code smells, i.e., feature envy, long method, large class, and misplaced class. Evaluation results on open-source applications suggest that the proposed approach significantly improves the state-of-the-art.
Deep learning based code smell detection - Qualifying Talk
1. IEEE Transactions on Software Engineering, 2019
Deep Learning based
Code Smell Detection
Hui Liu, Jiahao Jin, Zhifeng Xu, Yanzhen Zou, Yifan Bu and Lu Zhang
Presented by: Sayed Mohsin Reza, Ph.D. Student, CS, UTEP
Paper DOI: https://doi.org/10.1109/TSE.2019.2936376
2. What is Code Smell?
• A code smell is a surface indication that usually
corresponds to a deeper/inner problem in the software
• Code smells suggest the possibility of refactoring
• Software refactoring is an effective means to improve
software quality
• Examples: Feature Envy, Large Class, Long Method, etc.
Figure: Example of a Code Smell:
Duplicate Code
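To make the example concrete, here is a minimal sketch (in Python, for brevity; the paper targets Java) of the duplicate-code smell and its refactoring. The function names and discount logic are hypothetical:

```python
# Smelly version: the discount computation is duplicated verbatim
# in two functions, so a fix to one is easy to miss in the other.
def price_with_member_discount(price):
    discounted = price - price * 0.10
    return round(discounted, 2)

def price_with_coupon_discount(price):
    discounted = price - price * 0.10
    return round(discounted, 2)

# Refactored version: the duplicated logic is extracted into one helper.
def apply_discount(price, rate=0.10):
    return round(price - price * rate, 2)
```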
3. Introduction
• Background: code smells need to be fixed to improve
software quality
• Problem: manual identification of code smells is challenging
and tedious
• Solution: Use Deep Learning Technique to identify code
smells
• Motivation: unmaintained code increases the actual cost of
development over time, and identifying code smells can
reduce that cost
Figure: Technical debt arises for
unmaintained code2
2 Falon Fatemi, "Technical Debt: The Silent Company Killer ", Forbes Magazine Report
4. Objective
• Primary: Identifying following code smells using deep learning technique
1. Feature Envy
2. Long Method
3. God Class
4. Misplaced Class
• Secondary: Provide suggestions of possible refactoring opportunities.
5. Selected Code Smell Definitions
1. Feature Envy - when a method uses more features (i.e.,
fields and methods) of another class than of its own
2. Long Method – when a method has too much
functionality and too much code
3. Large Class – when a class is doing too much and
containing too much code
4. Misplaced Class – when a class belongs to one
package but fits better in another package
Figure: Examples of Code Smells
(2) Long Method (3) Large Class
(1) Feature Envy
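A small hypothetical sketch of feature envy (definition 1 above), in Python for brevity. The class and method names are illustrative, not from the paper:

```python
class Customer:
    """Plain data holder used by the envious method below."""
    def __init__(self, name, discount_rate):
        self.name = name
        self.discount_rate = discount_rate

class Order:
    def __init__(self, customer, amount):
        self.customer = customer
        self.amount = amount

    # Feature envy: this method reads Customer's fields more than
    # Order's own, so "Move Method" to Customer is the likely fix.
    def customer_label(self):
        c = self.customer
        return f"{c.name} ({c.discount_rate:.0%} off)"
```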
6. Proposed Approach
Download Repositories
from corpus website1
Step 1
1 http://qualitascorpus.com/, curated collection of software systems intended to be used for empirical studies of
code artefacts
Generation of
Training Data
Step 2
Labelled Code
Smells
Deep Learning
Techniques
Step 3
Model
Training Phase | Testing Phase
Provide new software
repository
Step 4
Generation of
Testing Data
Step 5
Classify the code
smells
• God Class
• Long Method
• Feature Envy
• Misplaced Class
7. Research Questions (RQs)
• Does the proposed approach
outperform the state-of-the-art
approaches in identifying code smells?
• feature envy (RQ1)
• long methods (RQ3)
• large class (RQ4)
• misplaced class (RQ5)
• Is the proposed approach accurate in
recommending destinations (target
classes) for methods associated with
feature envy (RQ2), target packages for
misplaced class (RQ6)?
8. Subject Codebases
• Consider Open-source applications (Java only)
• Facilitates researchers to repeat the
evaluation
• Popular and high-quality codebases
• Under development for more than 5
years
Table: Subject Codebases3
3 http://qualitascorpus.com/
9. Generation of Training Data
Step 1: Import projects into the Eclipse JDT IDE
Step 2: Look for suggested refactoring opportunities using the Eclipse JDT refactoring tool
Step 3: Generate training data
File Name  | Method Name | Refactoring   | Code Smell      | ...
Model.java | login       | Move Method   | Feature Envy    | ...
Model.java | login       | Inline Method | Long Method     | ...
...        | ...         | ...           | ...             | ...
Model.java | Model       | Extract Class | Large Class     | ...
Model.java | Model       | Move Class    | Misplaced Class | ...
10. Training Dataset Structure
Feature Sets Target Sets
File Name Enclosing
Class (ec)
Target Class (tc) Method
(m)
Dist(m,ec) Dist(m,tc) Line of
Code
(LOC)
Cohesion
(COH)
… Number of
Accessed
variables
(NOAV)
Number of
Public
attributes
(NOPA)
Number of
Methods
(NOM)
Feature
Envy
Long
Method
Large
Class
Misplaced
Class
Model.java Model Model3,model4 login 10 ... 0 0 1 0
Model2.java Model2 null login 20 ... 0 0 1 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Model3.java Model3 30 ... 1 1 0 0
Model4.java Model4 40 ... 1 0 0 0
Dataset is available at - J. Jin, Z. Xu, and Y. Bu. (2019) Deep Smell Detector. [Online]. Available:
https://github.com/liuhuigmail/DeepSmellDetection
10
Table: Deep Code Smell Dataset Structure
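One training sample from the table above could be represented as the following record (field names mirror the slide's columns; the elided metric values are illustrative, not from the dataset):

```python
# One illustrative training sample: identifier names plus code metrics
# as the feature set, and one binary label per smell as the target set.
sample = {
    "file_name": "Model.java",
    "enclosing_class": "Model",
    "target_class": "Model3, model4",
    "method": "login",
    "LOC": 10,
    # ... remaining metrics (COH, NOAV, NOPA, NOM, ...) elided ...
    "labels": {
        "feature_envy": 0,
        "long_method": 0,
        "large_class": 1,
        "misplaced_class": 0,
    },
}
```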
11. 1 Feature Envy Classifier
Classifier
Input Variables
1. Name(m) – name of the method under investigation
2. Name(ec) – name of the enclosing class
3. Name(tc) – name of the potential target class
4. Dist(m, ec) – distance between the method
and its enclosing class
5. Dist(m, tc) – distance between the method
and the target class
(See the detailed distance formula in Appendix A)
* ec = enclosing class, tc = target class, m = method
Figure: Classifier for Feature Envy Detection
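A sketch of how the Dist(m, ec) and Dist(m, tc) inputs can be computed, assuming the entity-set overlap distance described in Appendix A (the paper's exact formula may differ in detail; entity names below are hypothetical):

```python
def entity_distance(method_entities, class_entities):
    """Distance between a method and a class as 1 minus the overlap
    of their entity sets (the fields and methods they use).
    Assumption: a Jaccard-style distance as sketched in Appendix A."""
    union = method_entities | class_entities
    if not union:
        return 1.0
    return 1.0 - len(method_entities & class_entities) / len(union)

# A method that mostly uses another class's entities is closer to it,
# which is the signature of feature envy:
m_entities = {"name", "discount_rate", "total"}
enclosing = {"total", "ship"}                    # entities of ec
target = {"name", "discount_rate", "email"}      # entities of tc
```

Here `entity_distance(m_entities, target)` is smaller than `entity_distance(m_entities, enclosing)`, suggesting the method envies the target class.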
12. 2 Long Method Detection
Classifier
Input Variables
1. LOC(m) – Lines of Code
2. LCOM(m) – Lack of Cohesion of Methods
3. COH(m) – Cohesion of Method
4. CC(m) – Class Cohesion
5. NOAV(m) – Number of Accessed Variables
6. CD(m) – Coupling Dispersion [measures
how much the coupling of the method
involves external classes]
7. MCN(m) – McCabe's Cyclomatic Number
[measures the complexity of the method]
Figure: Classifier for Long Method Detection
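To illustrate input 7, McCabe's Cyclomatic Number counts independent paths through a method: 1 plus the number of decision points. The sketch below is a simplified Python approximation, not the metric extractor the paper uses:

```python
import ast

def mccabe_number(source):
    """Approximate McCabe's cyclomatic number MCN(m) for a function:
    1 plus the number of decision points (if/for/while/boolean ops).
    A simplified illustration; real tools also count cases, handlers,
    and ternaries."""
    decisions = sum(
        isinstance(node, (ast.If, ast.For, ast.While, ast.BoolOp))
        for node in ast.walk(ast.parse(source))
    )
    return 1 + decisions

code = '''
def validate(x):
    if x is None:
        return False
    for ch in x:
        if not ch.isalnum():
            return False
    return True
'''
```

For `code` above there are three decision points (two `if`s and one `for`), so `mccabe_number(code)` is 4.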
13. 3 Large Class Classifier
Classifier
Input Variables
1. ATFD(c) – Access To Foreign Data
2. DCC(c) – Direct Class Coupling
3. DIT(c) – Depth of Inheritance Tree
4. TCC(c) – Tight Class Cohesion
5. LCOM(c) – Lack of Cohesion of Methods
6. CAM(c) – Cohesion Among Methods
7. WMC(c) – Weighted Method Count
8. NOPA(c) – Number of Public Attributes
9. NOAM(c) – Number of Accessor Methods
10. NOA(c) – Number of Attributes
11. NOM(c) – Number of Methods
Figure: Classifier for Large Class
Detection
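Two of the simplest inputs above, NOM(c) and NOA(c), are plain counts over a class body. The paper extracts them from Java; the sketch below is a rough Python-side illustration using the `ast` module (class and attribute names are hypothetical):

```python
import ast

def class_size_metrics(source, class_name):
    """Rough NOM(c) (number of methods) and NOA(c) (number of
    attributes assigned in __init__) for a Python class.
    Illustration only; the paper computes these on Java code."""
    tree = ast.parse(source)
    cls = next(node for node in ast.walk(tree)
               if isinstance(node, ast.ClassDef) and node.name == class_name)
    methods = [n for n in cls.body if isinstance(n, ast.FunctionDef)]
    attrs = {
        target.attr
        for m in methods if m.name == "__init__"
        for stmt in ast.walk(m) if isinstance(stmt, ast.Assign)
        for target in stmt.targets if isinstance(target, ast.Attribute)
    }
    return {"NOM": len(methods), "NOA": len(attrs)}

code = '''
class Report:
    def __init__(self):
        self.title = ""
        self.rows = []
    def add_row(self, row):
        self.rows.append(row)
    def render(self):
        return "; ".join(map(str, self.rows))
'''
```

A class whose NOM, NOA, and WMC all run far above the codebase norm is a Large Class candidate.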
14. 4 Misplaced Class Classifier
Classifier
Input variables
1. Name(c) - name of class
2. Name(ep) - enclosing package
3. Name(tp) - target package
4. CBO(c, ep) – Coupling Between Objects
5. MPC(c, ep) – Maximum Message Passing
Coupling
* c = class, ep = enclosing package, tp = target
package
Figure: Classifier for Misplaced Class
Detection
15. Evaluation
Compare the proposed approach against
• JDeodorant, a refactoring tool that can detect Feature Envy.
• DECOR, a tool for detecting Long Methods and Large Classes.
• TACO, a textual-analysis-based technique to detect Misplaced Classes.
Tool Demo Video: https://www.youtube.com/watch?v=LtH8uF0epV0
16. Evaluation Results: (1) Feature Envy
RQ1: Does the proposed approach outperform the state-of-the-art approaches in identifying feature envy?
Answer: The proposed approach significantly outperforms the state-of-the-art in identifying feature envy.
Table: Evaluation Results on Feature Envy Detection
Observations
• Average F1 score of the proposed
approach is 51.91%, whereas the
average F1 score of JDeodorant
is 24.51%.
• Average recall of the proposed
approach is up to 88.11%,
whereas JDeodorant's is 16.6%.
17. Evaluation Results: (1) Feature Envy
Continue
RQ2: Is the proposed approach accurate in
recommending destinations (target classes) for methods
associated with feature envy?
Answer: The proposed approach is more accurate in
recommending destinations for feature envy methods.
Table: Accuracy in Recommending Target Classes
Observation:
• The proposed approach is 27.25% more accurate than
JDeodorant in recommending destinations for smelly
methods.
18. Evaluation Results: (2) Long Methods
RQ3: Does the proposed approach outperform the state-of-the-art approaches in identifying long methods?
Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying long methods.
Table: Evaluation Results on Long Method Detection
Observations:
• Proposed approach
identifies most of the long
methods with average
recall 78.99% and F1
score 55.53%.
• DECOR improves
precision at the cost of
significant reduction in
recall.
19. Evaluation Results: (3) Large Class
RQ4: Does the proposed approach outperform the state-of-the-art approaches in identifying large class?
Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying large class.
Table: Evaluation Results on Large Class Detection
Observations:
• Proposed approach improves
recall (80.95%) significantly at
the cost of reduced precision
• Proposed approach
outperforms DECOR in F1
score, MCC, and AUC
20. Evaluation Results: (4) Misplaced Class
RQ5: Does the proposed approach outperform the state-of-the-art approaches in identifying Misplaced class?
Answer: The proposed approach significantly outperforms the state-of-the-art in identifying Misplaced class.
Table: Evaluation Results on Misplaced Class Detection
Observations:
• Proposed approach
outperforms TACO in F1
Score, MCC, and AUC.
• Proposed approach
improves both precision
and recall significantly.
21. Evaluation Results: (4) Misplaced Class
RQ6: Is the proposed approach accurate in
recommending target packages for misplaced classes?
Answer: The proposed approach outperforms the
baseline in identifying misplaced classes, and it is
comparable to the baseline in recommending target
packages.
Table: Accuracy in Recommending Target Packages
Observations:
1. Proposed approach results in a greater number of
accepted recommendations
2. TACO is more accurate than the proposed
approach in recommending target packages
22. Conclusion & Future Work
• Proposed a deep learning-based approach to detect code smells
• Proposed a custom technique for creation of labeled training dataset
• Improves F-measure by 27.4% in feature envy detection, 15.11% in long method detection, 4.73% in
large class detection, and 48.18% in misplaced class detection
• Improves the state-of-the-art in software code smells detection
• Future works
• Detect additional categories of code smells: data clumps, lazy class, etc.
• Integration with IDE may benefit developers who are looking for refactoring opportunities
23. My Critique
• In the evaluation, they use accuracy, recall, precision, and F1 scores. Other relevant and important
metrics should be included.
Suggestion: False Positive Rate (FPR) and False Negative Rate (FNR) can be included to show how
many false alarms the models generate.
• A relatively small dataset, extracted from only 10 code repositories1.
Suggestion: include more codebases in the dataset.
1 http://qualitascorpus.com/ , curated collection of software systems intended to be used for empirical studies of code artefacts
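The two metrics suggested in the critique can be sketched directly from confusion-matrix counts (the counts below are illustrative, not from the paper):

```python
def fpr_fnr(tp, fp, tn, fn):
    """False Positive Rate and False Negative Rate from
    confusion-matrix counts, the two extra metrics suggested above."""
    fpr = fp / (fp + tn)   # false alarms among actual non-smelly cases
    fnr = fn / (fn + tp)   # missed smells among actual smelly cases
    return fpr, fnr

# Illustrative counts (not from the paper's evaluation):
fpr, fnr = fpr_fnr(tp=80, fp=10, tn=90, fn=20)
```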
24. Questions
Summary
Download Repositories
from corpus website1
Step 1
Generation of
Training Data
Step 2
Labelled Code
Smells
Deep Learning
Techniques
Step 3
Model
Training Phase | Testing Phase
Provide new software
repository
Step 4
Generation of
Testing Data
Step 5
Classify the code
smells
• God Class
• Long Method
• Feature Envy
• Misplaced Class
• The proposed approach establishes a better technique for
identifying code smells
• The proposed approach is successful in suggesting possible
refactoring opportunities
Figure: Proposed Approach
25. Appendix A - Distance metrics formula
1. If method m does not belong to Class C, the distance is computed as follows:
2. Otherwise, the distance is computed as follows:
Where S = the set of entities at the method or class level, and
e = an entity (attribute or method)
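The two formulas did not survive the slide export. A hedged reconstruction, consistent with the slide's definitions and with the Jaccard-style entity-set distance commonly used for Move Method detection (e.g., in JDeodorant); consult the paper for the exact form:

```latex
% Case 1: method m does NOT belong to class C
\mathrm{dist}(m, C) = 1 - \frac{|S_m \cap S_C|}{|S_m \cup S_C|}

% Case 2: m belongs to C, so m itself is excluded from C's entity set
\mathrm{dist}(m, C) = 1 - \frac{|S_m \cap (S_C \setminus \{m\})|}
                               {|S_m \cup (S_C \setminus \{m\})|}
```

where $S_m$ and $S_C$ are the sets of entities (attributes and methods) used by method $m$ and contained in class $C$, respectively.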
26. Appendix B - Performance Metrics
1. Accuracy is calculated as
2. Precision, recall, and F1 score are calculated as
3. Matthews Correlation Coefficient is calculated as
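The formulas were lost in the slide export; the standard definitions (with TP/FP/TN/FN denoting true/false positives/negatives) are:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

\mathrm{Precision} = \frac{TP}{TP + FP}, \quad
\mathrm{Recall} = \frac{TP}{TP + FN}, \quad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}
           {\mathrm{Precision} + \mathrm{Recall}}

\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                    {\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}
```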
Editor's Notes
Hello everyone,
Thank you for joining qualifying talk.
I am Sayed Mohsin Reza, and I am presenting my talk on "deep learning-based code smell detection". The paper was published in IEEE Transactions on Software Engineering in 2019.
I have shared the link of this slide on the chat for your convenience
Let me describe code smells briefly for those who are not familiar with this term.
A code smell ….
JUnit is a unit testing framework for the Java programming language.
CNN layer – filters = 128, kernel size = 1, activation = tanh; dense = 128 neurons
- The model employs binary cross-entropy as the loss function.
- A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer.
- A flatten layer collapses the spatial dimensions of the input into the channel dimension.
- Embedding layer – they convert words in identifiers into fixed-length numerical vectors using word2vec, a high-quality distributed vector representation
A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network capable of learning order dependence in sequence prediction problems.
An LSTM layer above provides a sequence output rather than a single value output to the LSTM layer below
- Embedding layer – they convert words in identifiers into fixed-length numerical vectors using word2vec, a high-quality distributed vector representation
- A convolutional layer contains a set of filters whose parameters need to be learned.
- A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer
- A flatten layer collapses the spatial dimensions of the input into the channel dimension.
Recall is the number of smelly classes predicted correctly out of the total number of actual smelly classes;
precision is the number of smelly classes predicted correctly out of the total number of predicted smelly classes.
F1 score is needed when you want to seek a balance between precision and recall.
- MCC- Matthews Correlation Coefficient - measure of the quality of binary (two-class) classifications,
- AUC - Area Under Curve
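The verbal definitions above can be sketched as one function over confusion-matrix counts (the counts in the example are illustrative, not from the paper):

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1, and MCC from confusion-matrix counts,
    matching the verbal definitions in the notes above."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}

# Illustrative counts (not from the paper's evaluation):
m = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```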