IEEE Transactions on Software Engineering, 2019
Deep Learning based
Code Smell Detection
Hui Liu, Jiahao Jin, Zhifeng Xu, Yanzhen Zou, Yifan Bu and Lu Zhang
Presented By: Sayed Mohsin Reza PhD Student, CS, UTEP
Paper DOI: https://doi.org/10.1109/TSE.2019.2936376
What is a Code Smell?
• A code smell is a surface indication that usually
corresponds to a deeper/inner problem in the software
• Code smells suggest the possibility of refactoring
• Software refactoring is an effective means to improve
software quality
• Examples: Feature Envy, Large Class, Long Method etc.
Figure: Example of a Code Smell: Duplicate Code
Introduction
• Background: Code smells need to be fixed to improve software
quality
• Problem: Manual identification of code smells is challenging
and tedious
• Solution: Use deep learning techniques to identify code
smells
• Motivation: Unmaintained code increases the actual cost of
development over time, and identifying code smells can
reduce that cost
Figure: Technical debt arises from unmaintained code [2]
[2] Falon Fatemi, "Technical Debt: The Silent Company Killer", Forbes Magazine Report
Objective
• Primary: Identify the following code smells using deep learning techniques
1. Feature Envy
2. Long Method
3. Large Class (God Class)
4. Misplaced Class
• Secondary: Suggest possible refactoring opportunities
Selected Code Smell Definition
1. Feature Envy - when a method uses more features (i.e.,
fields and methods) of another class than of its own
2. Long Method - when a method has too much
functionality and too much code
3. Large Class - when a class does too much and
contains too much code
4. Misplaced Class - when a class belongs to one
package but would fit better in another package
Figure: Examples of Code Smells: (1) Feature Envy, (2) Long Method, (3) Large Class
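The Feature Envy definition above can be illustrated with a small sketch. The classes and method names here (Customer, Order, address_label) are hypothetical and are not taken from the paper; the example only shows a method that uses another class's fields more than its own, and the Move Method refactoring that would address it.

```python
class Customer:
    def __init__(self, name, street, city):
        self.name = name
        self.street = street
        self.city = city

    # After a Move Method refactoring, the logic lives with the data it uses.
    def address_label(self):
        return f"{self.name}, {self.street}, {self.city}"


class Order:
    def __init__(self, customer, total):
        self.customer = customer
        self.total = total

    # Feature Envy: this method reads three fields of Customer and none of Order.
    def envious_address_label(self):
        c = self.customer
        return f"{c.name}, {c.street}, {c.city}"

    # Refactored version: delegate to the class that owns the data.
    def address_label(self):
        return self.customer.address_label()


# Hypothetical sample data, for illustration only.
customer = Customer("Ada", "1 Main St", "El Paso")
order = Order(customer, 42.0)
print(order.envious_address_label())
print(order.address_label())
```

Both calls return the same label; the difference is only in which class owns the behaviour.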
Proposed Approach
Training Phase
Step 1: Download repositories from the corpus website [1]
Step 2: Generate training data (labelled code smells)
Step 3: Train the model using deep learning techniques
Testing Phase
Step 4: Provide a new software repository
Step 5: Generate testing data
The trained model then classifies the code smells:
• God Class
• Long Method
• Feature Envy
• Misplaced Class
[1] http://qualitascorpus.com/, a curated collection of software systems intended to be used for empirical studies of code artefacts
Research Questions (RQs)
• Does the proposed approach
outperform the state-of-the-art
approaches in identifying code smells?
• feature envy (RQ1)
• long methods (RQ3)
• large classes (RQ4)
• misplaced classes (RQ5)
• Is the proposed approach accurate in
recommending destinations (target
classes) for methods associated with
feature envy (RQ2), and target packages for
misplaced classes (RQ6)?
Subject Codebases
• Consider open-source applications (Java only)
• Allows researchers to repeat the
evaluation
• Popular and high-quality codebases
• Under development for more than 5
years
Table: Subject Codebases [3]
[3] http://qualitascorpus.com/
Generation of Training Data
Step 1: Import projects into the Eclipse JDT IDE
Step 2: Look for suggested refactoring opportunities in the imported projects using the Eclipse JDT refactoring tool
Step 3: Generate training data
File Name | Method name | Refactoring | Code Smell | ...
Model.java | login | Move method | Feature envy | ...
Model.java | login | Inline method | Long method | ...
... | ... | ... | ... | ...
Model.java | Model | Extract class | Large class | ...
Model.java | Model | Move class | Misplaced class | ...
Training Dataset Structure
Feature Sets: File Name, Enclosing Class (ec), Target Class (tc), Method (m), Dist(m,ec), Dist(m,tc), Lines of Code (LOC), Cohesion (COH), …, Number of Accessed Variables (NOAV), Number of Public Attributes (NOPA), Number of Methods (NOM)
Target Sets: Feature Envy, Long Method, Large Class, Misplaced Class
Model.java Model Model3,model4 login 10 ... 0 0 1 0
Model2.java Model2 null login 20 ... 0 0 1 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Model3.java Model3 30 ... 1 1 0 0
Model4.java Model4 40 ... 1 0 0 0
Dataset is available at: J. Jin, Z. Xu, and Y. Bu. (2019) Deep Smell Detector. [Online]. Available: https://github.com/liuhuigmail/DeepSmellDetection
Table: Deep Code Smell Dataset Structure
1 Feature Envy Classifier
Input Variables
1. Name(m) - name of the method under investigation
2. Name(ec) - name of the enclosing class
3. Name(tc) - name of the potential target class
4. Dist(m, ec) - distance between the method and its enclosing class
5. Dist(m, tc) - distance between the method and the target class
(See the detailed distance formula in Appendix A)
* ec = enclosing class, tc = target class, m = method
Figure: Classifier for Feature Envy Detection
2 Long Method Classifier
Input Variables
1. LOC(m) - Lines of Code
2. LCOM(m) - Lack of Cohesion of Methods
3. COH(m) - Cohesion of the method
4. CC(m) - Class Cohesion
5. NOAV(m) - Number of Accessed Variables
6. CD(m) - Coupling Dispersion [measures how much the coupling of the method involves external classes]
7. MCN(m) - McCabe’s Cyclomatic Number [measures the complexity of the method]
Figure: Classifier for Long Method Detection
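Two of the inputs listed above, LOC and McCabe's Cyclomatic Number, can be sketched as follows. This is a simplified illustration on Python source using the standard ast module, whereas the paper analyses Java code; the set of counted decision points and the function names are assumptions, not the authors' metric extractor.

```python
import ast

# Assumption: these node types are treated as decision points.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_number(source):
    """Approximate McCabe's number: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of and/or adds one more branch.
            complexity += len(node.values) - 1
    return complexity


def lines_of_code(source):
    """Count non-blank lines that are not pure comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


# Hypothetical method under investigation.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(lines_of_code(SAMPLE), cyclomatic_number(SAMPLE))
```

For the sample, the count is two if statements, one for loop, and one "and" operator, plus the base path.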
3 Large Class Classifier
Input Variables
1. ATFD(c) - Access to Foreign Data
2. DCC(c) - Direct Class Coupling
3. DIT(c) - Depth of Inheritance Tree
4. TCC(c) - Tight Class Cohesion
5. LCOM(c) - Lack of Cohesion of Methods
6. CAM(c) - Cohesion Among Methods
7. WMC(c) - Weighted Method Count
8. NOPA(c) - Number of Public Attributes
9. NOAM(c) - Number of Accessor Methods
10. NOA(c) - Number of Attributes
11. NOM(c) - Number of Methods
Figure: Classifier for Large Class Detection
4 Misplaced Class Classifier
Input Variables
1. Name(c) - name of the class
2. Name(ep) - name of the enclosing package
3. Name(tp) - name of the target package
4. CBO(c, ep) - Coupling Between Objects
5. MPC(c, ep) - Max Message Passing Coupling
* c = class, ep = enclosing package, tp = target package
Figure: Classifier for Misplaced Class Detection
Evaluation
Compare the proposed approach against:
• JDeodorant, a refactoring tool that can detect Feature Envy.
• DECOR, a tool for detecting Long Method and Large Class.
• TACO, a textual-based technique for detecting Misplaced Class.
Tool Demo Video: https://www.youtube.com/watch?v=LtH8uF0epV0
Evaluation Results: (1) Feature Envy
RQ1: Does the proposed approach outperform the state-of-the-art approaches in identifying feature envy?
Answer: The proposed approach significantly outperforms the state-of-the-art in identifying feature envy.
Table: Evaluation Results on Feature Envy Detection
Observations:
• The average F1 score of the proposed approach is 51.91%, whereas the average F1 score of JDeodorant is 24.51%.
• The average recall of the proposed approach reaches 88.11%, whereas JDeodorant's is 16.6%.
Evaluation Results: (1) Feature Envy (Continued)
RQ2: Is the proposed approach accurate in recommending destinations (target classes) for methods associated with feature envy?
Answer: The proposed approach is more accurate in recommending destinations for feature envy methods.
Table: Accuracy in Recommending Target Classes
Observation:
• The proposed approach is 27.25% more accurate than JDeodorant in recommending destinations for smelly methods.
Evaluation Results: (2) Long Methods
RQ3: Does the proposed approach outperform the state-of-the-art approaches in identifying long methods?
Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying long methods.
Table: Evaluation Results on Long Method Detection
Observations:
• The proposed approach identifies most of the long methods, with an average recall of 78.99% and an F1 score of 55.53%.
• DECOR improves precision at the cost of a significant reduction in recall.
Evaluation Results: (3) Large Class
RQ4: Does the proposed approach outperform the state-of-the-art approaches in identifying large class?
Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying large class.
Table: Evaluation Results on Large Class Detection
Observations:
• The proposed approach improves recall (80.95%) significantly at the cost of reduced precision.
• The proposed approach outperforms DECOR in F1 score, MCC, and AUC.
Evaluation Results: (4) Misplaced Class
RQ5: Does the proposed approach outperform the state-of-the-art approaches in identifying Misplaced class?
Answer: The proposed approach significantly outperforms the state-of-the-art in identifying Misplaced class.
Table: Evaluation Results on Misplaced Class Detection
Observations:
• The proposed approach outperforms TACO in F1 score, MCC, and AUC.
• The proposed approach improves both precision and recall significantly.
Evaluation Results: (4) Misplaced Class (Continued)
RQ6: Is the proposed approach accurate in recommending target packages for misplaced classes?
Answer: The proposed approach outperforms the baseline in identifying misplaced classes, and it is comparable to the baseline in recommending target packages.
Table: Accuracy in Recommending Target Packages
Observations:
1. The proposed approach yields a greater number of accepted recommendations
2. TACO is more accurate than the proposed approach in recommending target packages
Conclusion & Future Work
• Proposed a deep learning-based approach to detect code smells
• Proposed a custom technique for creation of labeled training dataset
• Improves F-measure by 27.4% in feature envy detection, 15.11% in long method detection, 4.73% in
large class detection, and 48.18% in misplaced class detection
• Improves the state of the art in code smell detection
• Future work
• Detect additional categories of code smells: data clumps, lazy class, etc.
• Integrate with IDEs to benefit developers who are looking for refactoring opportunities
My Critique
• The evaluation uses accuracy, recall, precision, and F1 score. Other relevant and important
metrics should be included.
Suggestion: False Positive Rate (FPR) and False Negative Rate (FNR) can be included to show how
many false alarms and missed smells the models generate
• The dataset is relatively small, extracted from only 10 code repositories [1].
Suggestion: include more codebases in the dataset
[1] http://qualitascorpus.com/, a curated collection of software systems intended to be used for empirical studies of code artefacts
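The two rates suggested in the critique follow directly from a confusion matrix. The sketch below uses the standard definitions FPR = FP / (FP + TN) and FNR = FN / (FN + TP); the function name and the sample counts are hypothetical and are not results from the paper.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate) from confusion-matrix counts."""
    # FPR: share of clean entities wrongly flagged as smelly (false alarms).
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR: share of smelly entities that the detector misses.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical counts, for illustration only.
fpr, fnr = error_rates(tp=30, fp=5, tn=45, fn=20)
print(f"FPR = {fpr:.2f}, FNR = {fnr:.2f}")
```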
Questions
Summary
Training Phase
Step 1: Download repositories from the corpus website
Step 2: Generate training data (labelled code smells)
Step 3: Train the model using deep learning techniques
Testing Phase
Step 4: Provide a new software repository
Step 5: Generate testing data
The trained model then classifies the code smells:
• God Class
• Long Method
• Feature Envy
• Misplaced Class
Figure: Proposed Approach
• The proposed approach establishes a better technique for
identifying code smells
• The proposed approach is successful in suggesting possible
refactoring opportunities
Appendix A - Distance metrics formula
1. If method m does not belong to Class C, the distance is computed as follows:
2. Otherwise, the distance is computed as follows:
where S = the set of entities at the method or class level, and
e = an entity (attribute or method)
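As a hedged sketch, the two cases can be illustrated with a Jaccard-style distance, dist(m, C) = 1 - |S_m ∩ S_C| / |S_m ∪ S_C|, where m itself is excluded from S_C when m belongs to C. This form is an assumption based on the entity-set distance commonly used for Feature Envy detection (for example in JDeodorant), not a formula confirmed by these slides; the function and entity names are hypothetical.

```python
def distance(method_entities, class_entities, method_name=None):
    """Jaccard-style distance between a method's and a class's entity sets.

    Pass method_name when the method belongs to the class, so that the
    method itself is removed from the class's entity set (case 2).
    """
    s_m = set(method_entities)
    s_c = set(class_entities)
    if method_name is not None:
        s_c = s_c - {method_name}
    union = s_m | s_c
    if not union:
        return 1.0  # assumption: no entities at all means maximal distance
    return 1.0 - len(s_m & s_c) / len(union)


# Hypothetical entity sets: what login accesses, and what each class declares.
login_uses = {"Session.token", "Session.user", "Model.name"}
model_entities = {"Model.name", "Model.login", "Model.save"}
session_entities = {"Session.token", "Session.user", "Session.expire"}

dist_ec = distance(login_uses, model_entities, method_name="Model.login")
dist_tc = distance(login_uses, session_entities)
print(dist_ec, dist_tc)
```

In this example the method is closer to the target class than to its enclosing class, which is the pattern a Feature Envy detector looks for.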
Appendix B - Performance Metrics
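1. Accuracy is calculated as (TP + TN) / (TP + TN + FP + FN)
2. Precision, recall, and F1 score are calculated as Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1 = 2 × Precision × Recall / (Precision + Recall)
3. Matthews Correlation Coefficient is calculated as MCC = (TP × TN - FP × FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))
where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives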
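The metrics in this appendix can be computed from confusion-matrix counts as in the sketch below, which uses the standard definitions of accuracy, precision, recall, F1, and MCC. The function name and the sample counts are hypothetical and are not results from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```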

Deep learning based code smell detection - Qualifying Talk

  • 1. IEEE Transactions on Software Engineering, 2019
      Deep Learning based Code Smell Detection
      Hui Liu, Jiahao Jin, Zhifeng Xu, Yanzhen Zou, Yifan Bu and Lu Zhang
      Presented By: Sayed Mohsin Reza, PhD Student, CS, UTEP
      Paper DOI: https://doi.org/10.1109/TSE.2019.2936376
  • 2. What is a Code Smell?
      • A code smell is a surface indication that usually corresponds to a deeper problem in the software.
      • Code smells suggest the possibility of refactoring.
      • Software refactoring is an effective means of improving software quality.
      • Examples: Feature Envy, Large Class, Long Method, etc.
      Figure: Example of a Code Smell: Duplicate Code
  • 3. Introduction
      • Background: Code smells need to be fixed to improve software quality.
      • Problem: Manual identification of code smells is challenging and tedious.
      • Solution: Use deep learning techniques to identify code smells.
      • Motivation: Unmaintained code increases the actual cost of development over time, and identifying code smells can reduce that cost.
      Figure: Technical debt arises from unmaintained code (footnote 2)
      Footnote 2: Falon Fatemi, "Technical Debt: The Silent Company Killer", Forbes Magazine Report
  • 4. Objective
      • Primary: Identify the following code smells using deep learning techniques:
          1. Feature Envy
          2. Long Method
          3. Large Class (also called God Class)
          4. Misplaced Class
      • Secondary: Suggest possible refactoring opportunities.
  • 5. Selected Code Smell Definitions
      1. Feature Envy: a method uses more features (i.e., fields and methods) of another class than of its own.
      2. Long Method: a method has too much functionality and too much code.
      3. Large Class: a class does too much and contains too much code.
      4. Misplaced Class: a class belongs to one package but would fit better in another package.
      Figure: Examples of Code Smells: (1) Feature Envy, (2) Long Method, (3) Large Class
  • 6. Proposed Approach
      Training phase:
      • Step 1: Download repositories from the corpus website (footnote 1)
      • Step 2: Generation of training data (output: labelled code smells)
      • Step 3: Deep learning techniques (output: model)
      Testing phase:
      • Step 4: Provide a new software repository
      • Step 5: Generation of testing data
      • Output: Classify the code smells (God Class, Long Method, Feature Envy, Misplaced Class)
      Footnote 1: http://qualitascorpus.com/, a curated collection of software systems intended to be used for empirical studies of code artefacts
  • 7. Research Questions (RQs)
      • Does the proposed approach outperform the state-of-the-art approaches in identifying code smells?
          • Feature envy (RQ1)
          • Long methods (RQ3)
          • Large class (RQ4)
          • Misplaced class (RQ5)
      • Is the proposed approach accurate in recommending destinations (target classes) for methods associated with feature envy (RQ2) and target packages for misplaced classes (RQ6)?
  • 8. Subject Codebases
      • Only open-source applications are considered (Java only)
      • Open-source subjects make it easier for other researchers to repeat the evaluation
      • Popular and high-quality codebases
      • Each codebase has been under development for more than 5 years
      Table: Subject Codebases (footnote 3)
      Footnote 3: http://qualitascorpus.com/
  • 9. Generation of Training Data
      • Step 1: Import projects into the Eclipse JDT IDE
      • Step 2: Look for suggested refactoring opportunities, after importing, using the Eclipse JDT refactoring tool
      • Step 3: Generate training data
      File Name  | Method name | Refactoring   | Code Smell      | ...
      Model.java | login       | Move method   | Feature envy    | ...
      Model.java | login       | Inline method | Long method     | ...
      ...        | ...         | ...           | ...             | ...
      Model.java | Model       | Extract class | Large class     | ...
      Model.java | Model       | Move class    | Misplaced class | ...
  • 10. Training Dataset Structure
      Table: Deep Code Smell Dataset Structure
      Feature sets (columns): File Name, Enclosing Class (ec), Target Class (tc), Method (m), Dist(m,ec), Dist(m,tc), Lines of Code (LOC), Cohesion (COH), …, Number of Accessed Variables (NOAV), Number of Public Attributes (NOPA), Number of Methods (NOM)
      Target sets (columns): Feature Envy, Long Method, Large Class, Misplaced Class
      Example rows (intermediate feature values elided as "..."; the last four values are the target labels):
      Model.java  | Model  | Model3,model4 | login | 10 | ... | 0 | 0 | 1 | 0
      Model2.java | Model2 | null          | login | 20 | ... | 0 | 0 | 1 | 1
      ...
      Model3.java | Model3 | 30 | ... | 1 | 1 | 0 | 0
      Model4.java | Model4 | 40 | ... | 1 | 0 | 0 | 0
      Dataset is available at: J. Jin, Z. Xu, and Y. Bu. (2019) Deep Smell Detector. [Online]. Available: https://github.com/liuhuigmail/DeepSmellDetection
  • 11. (1) Feature Envy Classifier
      Classifier input variables:
      1. Name(m): name of the method under investigation
      2. Name(ec): name of the enclosing class
      3. Name(tc): name of the potential target class
      4. Dist(m, ec): distance between the method and its enclosing class
      5. Dist(m, tc): distance between the method and the target class (see the detailed distance formula in Appendix A)
      * ec = enclosing class, tc = target class, m = method
      Figure: Classifier for Feature Envy Detection
  • 12. (2) Long Method Detection
      Classifier input variables:
      1. LOC(m): Lines of Code
      2. LCOM(m): Lack of Cohesion of Methods
      3. COH(m): Cohesion of method
      4. CC(m): Class Cohesion
      5. NOAV(m): Number of Accessed Variables
      6. CD(m): Coupling Dispersion (measures how much the coupling of the method involves external classes)
      7. MCN(m): McCabe's Cyclomatic Number (measures the complexity of the method)
      Figure: Classifier for Long Method Detection
  • 13. (3) Large Class Classifier
      Classifier input variables:
      1. ATFD(c): Access to Foreign Data
      2. DCC(c): Direct Class Coupling
      3. DIT(c): Depth of Inheritance Tree
      4. TCC(c): Tight Class Cohesion
      5. LCOM(c): Lack of Cohesion of Methods
      6. CAM(c): Cohesion Among Methods
      7. WMC(c): Weighted Method Count
      8. NOPA(c): Number of Public Attributes
      9. NOAM(c): Number of Access Methods
      10. NOA(c): Number of Attributes
      11. NOM(c): Number of Methods
      Figure: Classifier for Large Class Detection
  • 14. (4) Misplaced Class Classifier
      Classifier input variables:
      1. Name(c): name of the class
      2. Name(ep): name of the enclosing package
      3. Name(tp): name of the target package
      4. CBO(c, ep): Coupling Between Objects
      5. MPC(c, ep): Max Message Passing Coupling
      * c = class, ep = enclosing package, tp = target package
      Figure: Classifier for Misplaced Class Detection
  • 15. Evaluation
      Compare the proposed approach against:
      • JDeodorant, a refactoring tool that can detect Feature Envy.
      • DECOR, a tool for detecting Long Methods and Large Classes.
      • TACO, a textual-based technique for detecting Misplaced Classes.
      Tool Demo Video: https://www.youtube.com/watch?v=LtH8uF0epV0
  • 16. Evaluation Results: (1) Feature Envy
      RQ1: Does the proposed approach outperform the state-of-the-art approaches in identifying feature envy?
      Answer: The proposed approach significantly outperforms the state-of-the-art in identifying feature envy.
      Table: Evaluation Results on Feature Envy Detection
      Observations:
      • The average F1 score of the proposed approach is 51.91%, whereas the average F1 score of JDeodorant is 24.51%.
      • The average recall of the proposed approach reaches 88.11%, whereas JDeodorant reaches 16.6%.
  • 17. Evaluation Results: (1) Feature Envy (continued)
      RQ2: Is the proposed approach accurate in recommending destinations (target classes) for methods associated with feature envy?
      Answer: The proposed approach is more accurate in recommending destinations for feature envy methods.
      Table: Accuracy in Recommending Target Classes
      Observation:
      • The proposed approach is 27.25% more accurate than JDeodorant in recommending destinations for smelly methods.
  • 18. Evaluation Results: (2) Long Methods
      RQ3: Does the proposed approach outperform the state-of-the-art approaches in identifying long methods?
      Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying long methods.
      Table: Evaluation Results on Long Method Detection
      Observations:
      • The proposed approach identifies most of the long methods, with an average recall of 78.99% and an F1 score of 55.53%.
      • DECOR improves precision at the cost of a significant reduction in recall.
  • 19. Evaluation Results: (3) Large Class
      RQ4: Does the proposed approach outperform the state-of-the-art approaches in identifying large classes?
      Result Summary: The proposed approach significantly outperforms the state-of-the-art in identifying large classes.
      Table: Evaluation Results on Large Class Detection
      Observations:
      • The proposed approach improves recall (80.95%) significantly at the cost of reduced precision.
      • The proposed approach outperforms DECOR in F1 score, MCC, and AUC.
  • 20. Evaluation Results: (4) Misplaced Class
      RQ5: Does the proposed approach outperform the state-of-the-art approaches in identifying misplaced classes?
      Answer: The proposed approach significantly outperforms the state-of-the-art in identifying misplaced classes.
      Table: Evaluation Results on Misplaced Class Detection
      Observations:
      • The proposed approach outperforms TACO in F1 score, MCC, and AUC.
      • The proposed approach improves both precision and recall significantly.
  • 21. Evaluation Results: (4) Misplaced Class (continued)
      RQ6: Is the proposed approach accurate in recommending target packages for misplaced classes?
      Answer: The proposed approach outperforms the baseline in identifying misplaced classes, and it is comparable to the baseline in recommending target packages.
      Table: Accuracy in Recommending Target Packages
      Observations:
      1. The proposed approach results in a greater number of accepted recommendations.
      2. TACO is more accurate than the proposed approach in recommending target packages.
  • 22. Conclusion & Future Work
      • Proposed a deep learning-based approach to detect code smells
      • Proposed a custom technique for creating a labeled training dataset
      • Improves F-measure by 27.4% in feature envy detection, 15.11% in long method detection, 4.73% in large class detection, and 48.18% in misplaced class detection
      • Improves the state-of-the-art in software code smell detection
      • Future work:
          • Detect additional categories of code smells: data clumps, lazy class, etc.
          • Integration with an IDE may benefit developers who are looking for refactoring opportunities
  • 23. My Critique
      • The evaluation uses accuracy, recall, precision, and F1 score, but other relevant metrics should be included. Suggestion: report the False Positive Rate (FPR) and False Negative Rate (FNR) to show how many false alarms and missed smells the models produce.
      • The dataset is relatively small, extracted from only 10 code repositories (footnote 1). Suggestion: include more codebases in the dataset.
      Footnote 1: http://qualitascorpus.com/, a curated collection of software systems intended to be used for empirical studies of code artefacts
  • 24. Questions & Summary
      Figure: Proposed Approach
      Training phase:
      • Step 1: Download repositories from the corpus website (footnote 1)
      • Step 2: Generation of training data (output: labelled code smells)
      • Step 3: Deep learning techniques (output: model)
      Testing phase:
      • Step 4: Provide a new software repository
      • Step 5: Generation of testing data
      • Output: Classify the code smells (God Class, Long Method, Feature Envy, Misplaced Class)
      Summary:
      • The proposed approach is a better technique for identifying code smells.
      • The proposed approach is successful in suggesting possible refactoring opportunities.
  • 25. Appendix A - Distance metrics formula
      1. If method m does not belong to class C, the distance is computed with the first formula shown on the slide.
      2. Otherwise, the distance is computed with the second formula shown on the slide.
      where S = the set of entities at the method or class level, and e = an entity (attribute or method)
  • 26. Appendix B - Performance Metrics
      1. Accuracy
      2. Precision, recall, and F1 score
      3. Matthews Correlation Coefficient (MCC)
      The formula for each metric is shown on the slide.
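
The metrics named in Appendix B, together with the FPR and FNR suggested in the critique slide, can all be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name `confusion_metrics` and the sample counts are made up for illustration and are not taken from the paper or the slides.

```python
import math


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Assumption: report 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
        "fpr": ratio(fp, fp + tn),  # false alarms among truly clean items
        "fnr": ratio(fn, fn + tp),  # missed smells among truly smelly items
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that FNR is simply 1 minus recall, so a tool with high recall necessarily has a low FNR; FPR adds information that precision and recall alone do not give, because it depends on the number of true negatives.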

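The formula images for the Appendix A distance are not part of this transcript. As a rough illustration only, the sketch below assumes the Jaccard-style entity-set distance commonly used in the feature envy literature (for example by JDeodorant): distance(m, C) = 1 - |S_m ∩ S_C| / |S_m ∪ S_C|, where the method itself is removed from S_C when m belongs to C. The function name, entity names, and classes are hypothetical.

```python
def distance(method_name: str, method_entities: set, class_entities: set,
             belongs_to_class: bool) -> float:
    """Jaccard-style distance between a method's entity set and a class's entity set.

    Assumed form: 1 - |S_m & S_C| / |S_m | S_C|, with the method itself
    excluded from S_C when the method is a member of the class.
    """
    s_c = class_entities - {method_name} if belongs_to_class else set(class_entities)
    union = method_entities | s_c
    if not union:
        return 1.0  # assumption: with no entities at all, treat the distance as maximal
    return 1.0 - len(method_entities & s_c) / len(union)


# Hypothetical example: Model.login mostly touches Session members.
model_entities = {"Model.login", "Model.user", "Model.save"}
session_entities = {"Session.token", "Session.open", "Session.close"}
login_accesses = {"Session.token", "Session.open", "Model.user"}

dist_ec = distance("Model.login", login_accesses, model_entities, belongs_to_class=True)
dist_tc = distance("Model.login", login_accesses, session_entities, belongs_to_class=False)
print(f"Dist(m, ec) = {dist_ec:.2f}, Dist(m, tc) = {dist_tc:.2f}")
```

In this made-up case the method is closer to the candidate target class than to its enclosing class, which is the kind of signal the Dist(m, ec) and Dist(m, tc) inputs give the feature envy classifier.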
Editor's Notes

  1. Hello everyone, thank you for joining my qualifying talk. I am Sayed Mohsin Reza, and I am presenting my talk on "Deep Learning based Code Smell Detection". The paper was published in IEEE Transactions on Software Engineering in 2019. I have shared the link to these slides in the chat for your convenience.
  2. I will briefly describe code smells for those who are not familiar with the term. A code smell ….
  3. JUnit is a unit testing framework for the Java programming language.
  4. - CNN layer: filters = 128, kernel size = 1, activation = tanh; dense layer: 128 neurons. - The model employs binary cross-entropy as the loss function. - A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer. - A Flatten layer collapses the spatial dimensions of the input into the channel dimension.
  5. - CNN layer: filters = 128, kernel size = 1, activation = tanh; dense layer: 128 neurons. - The model employs binary cross-entropy as the loss function. - A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer.
  6. - Embedding layer: the words in identifiers are converted into fixed-length numerical vectors using word2vec, a high-quality distributed vector representation. - A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer. - Long Short-Term Memory (LSTM) networks are a type of recurrent neural network capable of learning order dependence in sequence prediction problems. - An LSTM layer provides a sequence output, rather than a single value, to the next LSTM layer.
  7. - Embedding layer: the words in identifiers are converted into fixed-length numerical vectors using word2vec, a high-quality distributed vector representation. - A convolutional layer contains a set of filters whose parameters need to be learned. - A Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer. - A Flatten layer collapses the spatial dimensions of the input into the channel dimension.
  8. Demo Video: https://www.youtube.com/watch?v=LtH8uF0epV0
  9. Recall is the number of smelly classes predicted correctly divided by the total number of actual smelly classes; precision is the number of smelly classes predicted correctly divided by the total number of predicted smelly classes. The F1 score is useful when you want a balance between precision and recall. - MCC (Matthews Correlation Coefficient): a measure of the quality of binary (two-class) classifications. - AUC: Area Under the Curve.
  10. 1 - https://towardsdatascience.com/accuracy-precision-recall-or-f1-331fb37c5cb9
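
The layer descriptions in notes 4 to 7 can be made concrete with a small forward pass. The sketch below is a plain-Python illustration, not the authors' implementation: it wires a kernel-size-1 convolution with tanh (128 filters), a flatten step, a 128-neuron dense layer, and binary cross-entropy, as the notes describe. The tanh activation on the dense layer, the single sigmoid output neuron, the input shape, and the random untrained weights are all assumptions made for the example.

```python
import math
import random


def conv1d_kernel1(seq, weights, bias):
    """Kernel-size-1 convolution with tanh: every position is mapped independently."""
    return [
        [math.tanh(sum(w * x for w, x in zip(row, pos)) + b)
         for row, b in zip(weights, bias)]
        for pos in seq
    ]


def flatten(feature_map):
    """Collapse the (position, filter) grid into one flat vector."""
    return [value for pos in feature_map for value in pos]


def dense(vec, weights, bias, activation):
    """Fully connected layer: every input feeds every neuron."""
    return [activation(sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for a single example whose label is 0 or 1."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def init(rows, cols, rng):
    """Small random weights plus zero biases (untrained, for illustration only)."""
    scale = 1.0 / math.sqrt(cols)
    weights = [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]
    return weights, [0.0] * rows


def forward(seq, params):
    conv_w, conv_b = params["conv"]
    dense_w, dense_b = params["dense"]
    out_w, out_b = params["out"]
    features = flatten(conv1d_kernel1(seq, conv_w, conv_b))
    hidden = dense(features, dense_w, dense_b, math.tanh)  # assumed activation
    return dense(hidden, out_w, out_b, sigmoid)[0]  # assumed single output neuron


rng = random.Random(0)
SEQ_LEN, IN_CH = 4, 3        # assumed input shape, not from the notes
FILTERS, UNITS = 128, 128    # values stated in the notes
params = {
    "conv": init(FILTERS, IN_CH, rng),
    "dense": init(UNITS, SEQ_LEN * FILTERS, rng),
    "out": init(1, UNITS, rng),
}
sample = [[rng.uniform(-1, 1) for _ in range(IN_CH)] for _ in range(SEQ_LEN)]
probability = forward(sample, params)
loss = binary_cross_entropy(1, probability)
print(f"P(smelly) = {probability:.4f}, loss if the true label is 1 = {loss:.4f}")
```

With kernel size 1 the convolution never mixes neighbouring positions, so it acts as the same small dense layer applied at each position; the flatten step is what lets the following dense layer combine information across positions.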