SlideShare a Scribd company logo
1 of 29
Download to read offline
Mining Source Code Improvement
Patterns from Similar Code Review
Yuki Ueda1, Takashi Ishio1, Akinori Ihara2,
Kenichi Matsumoto1
1Nara Institute of Science and Technology
2Wakayama University
13th International Workshop on Software Clones (IWSC’19)
Background Approach Result Summary
Contents
• Goal:Reduce Code Review Cost
• Approach:Code Improvement Pattern Detection
That Appeared Review
• Evaluation: Measure Patterns’ Frequency and
Accuracy
2
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
Patch
Author
Reviewer Project
3
- i=key
+ i=dic[“key”]
Patch
Background
(1) Submit
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
Patch
Author
Reviewer Project
4
- i=key
+ i=dic[“key”]
Patch
You should fix
(1) Submit
(2) Review, Fix suggestion
Background
Background Approach Result Summary
Code review process:
Reviewers suggest code fix
5
- i=key
+ i=dic[“key”]
- i=key
+ i_=_dic[“KEY”]
(3) Integrate
Patch
Author
Reviewer Project(1) Submit
(2) Review, Fix suggestion
Reviewed Patch
(Integrated Patch)
Pre-Review Patch
(Initial Patch)
Background
Background Approach Result Summary
Problem:
Reviewers need to check several times
6
- i=key
+ i=dic[“key”]
- i=key
+ i_=_dic[“KEY”]
(2) (n) Review Fix suggestion
(n) Integrate
Patch
Author
Reviewer Project(1) Submit
Reviewed Patch
(Integrated Patch)
Pre-Review Patch
(Initial Patch)
String should be lower
Waste space
Background
Background Approach Result Summary
Goal
Reduce Similar Review Automatically
7
Auto Review
System
(2) Review Fix suggestion
(3) Review
request
Patch
Author
Reviewer(1) Submit
Similar patch is fixed in the
past like..
Background
Background Approach Result Summary
Approach:
Detect Pattern from Reviewed Patch Diff
8
”key” , it will be “KEY”
Pattern
i=dic[“key”] i=dic[“KEY”]
Dataset
i=dic[“key”] i=dic[“KEY”]i=dic[“key”]
Pre-Review Patch
i=dic[“KEY”]
Reviewed Patch
Approach
If patch has
Detect
Background Approach Result Summary
Approach:
Detect Pattern from Reviewed Patch Diff
9
Patch
Author
Auto Review
System
print(“key”)
print(“KEY”)
”key” , it will be “KEY”
Pattern
If patch has
Use
Dataset
i=dic[“key”] i=dic[“KEY”]i=dic[“key”] i=dic[“KEY”]i=dic[“key”]
Pre-Review Patch
i=dic[“KEY”]
Reviewed Patch
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Divide Patch Diff to Chunk
10
- if i␣==␣0:
+ if i==0:
break
- i=dic[“key”]
+ i=dic.get(“key”)
- i=dic[“key”]
+ i=dic.get(“key”)
- if i␣==␣0:
+ if i==0:
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Get Pattern by Sequential Pattern Mining
11
- i=dic[“key”]
+ i=dic.get(“key”)
- [i=dic - [ + .get(i=dic
- [i=dic
i=dic
- [ + .get(i=dic “key”
- ]
+ )
Length Length Length
Approach
Background Approach Result Summary
Detect Code Improved Pattern (1/2):
Get Pattern by Sequential Pattern Mining
12
- i=dic[“key”]
+ i=dic.get(“key”)
- [i=dic - [ + .get(i=dic
- [i=dic - ]
i=dic + )
- [ + .get(i=dic “key”
Length Length Length
Keep Frequently Appeared and Longer Patterns
Approach
Background Approach Result Summary
Pattern Evaluation
13
i=dic + .get( - ]
Appeared Time:
+ )(e.g. Pattern
i=dic
.get(
]
)
Pre-Reviewed Patches that have
Reviewed Patches that have
)
Number of Patch Pairs
Approach
Background Approach Result Summary
Pattern Evaluation
14
Appeared Time:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Number of Patch Pairs
i=dic[“key”]
i=dic.get(“key”)
i=dic[”KEY”]
Count
NOT Count
e.g.
Pre-Reviewed Patch
i=dic + .get( - ] + )(e.g. Pattern )
i=dic ]
Approach
Background Approach Result Summary
15
Appeared Time:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Number of Patch Pairs
Accuracy:
.get( )
Pre-Reviewed Patches that have
Reviewed Patches that have
Ratio of Patch Pairs
Pattern Evaluation
i=dic ]
i=dic ]
i=dic + .get( - ] + )(e.g. Pattern )
Approach
Background Approach Result Summary
Target
16
Project OpenStack
Language Python3
Time Period 2011-2016
# Patches 173,749
# Chunks for Detect Pattern 555,050
# Chunks for Evaluate Pattern 61,673
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
17
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
18
assertEquals() assertEqual()
Why?: Support for Python 2 to 3 changes
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
xrange() range()
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
19
assertEquals() assertEqual()
Why?: Support for Python 2 to 3 changes
assertTrue(x in array)
Why?: Improve readability
assertIn(x, array)
xrange() range()
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
8 Frequently Appeared Pattern
20
assertEquals() assertEqual()
Why?: Support for Python2 to 3 changes
assertTrue(x in array)
Why?: Improve readability
assertIn(x, array)
- xrange() + range()
self.stbout() self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Thresholds:
Appeared time > 300
Accuracy > 10%
Total 8 patterns
Cover: 32.3% (19,940/ 61,673) similar patches
Accuracy: 45.9%
Result
Background Approach Result Summary
Patterns are discussed on StackOverflow
21
- assertEquals() + assertEqual()
Why?: Support for Python2 to 3 changes
- assertTrue(x in array)
Why?: Improve readability
+ assertIn(x, array)
- xrange() + range()
- self.stbout() + self.stubs.Set()
Why?: Support for OpenStacks‘ library dependency changes
Result
Background Approach Result Summary
For Automatically Code Review:
Work as GitHub Bot
22
Patch authorBot
I fixed
Reviewer
OK
Sample URL: https://github.com/Ikuyadeu/ExtentionTest/pull/9
Result
Background Approach Result Summary
vs Other Tool (1 / 2)
Static Analysis Tool
FOO=0 foo_=_0
23
Bad name
Waste
space
Static Analysis Tool (pylint)
Fix based on Language
Other tools: ESlint, Pmd, checkstyle
Result
Background Approach Result Summary
vs Other Tool (1 / 2)
Static Analysis Tool
FOO=0 foo_=_0
24
Static Analysis Tool (pylint)
Fix based on Language
This research:
Project-specific
changes
self.stbout()
xrange()
self.stubs.Set()
range()
Old library
dependency
Language
definition
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
25
Choose best rule set from large rule set
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
IntelliCode
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
26
Choose best rule set from large rule set
Find NEW pattern set from history
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
• disk2disk_api
• stubs.Set2stub_out
• assert-equals2equal
IntelliCode
This Study
Result
Background Approach Result Summary
vs Other Tool (2 / 2)
27
Choose best rule set from large rule set
Find NEW pattern set from history
• Invalid-name
• Bad-continuation
• Wrong-import-order
• Invalid-name
• Bad-continuation
• Wrong-import-order
• disk2disk_api
• stubs.Set2stub_out
• assert-equals2equal
IntelliCode
This Study
Support project-specific problem
Support change of environment
Result
Background Approach Result Summary
Future Work
• Which pattern should bot choose?
üMost appeared pattern, High accuracy pattern
• Compare with Other Projects and Languages’
Patterns
• Evaluate by submitting pull request, and get ratio of
Accepted / Submitted pull request
28
Summary
!ueda.yuki.un7@is.naist.jp

More Related Content

What's hot

ENKI: Access Control for Encrypted Query Processing
ENKI: Access Control for Encrypted Query ProcessingENKI: Access Control for Encrypted Query Processing
ENKI: Access Control for Encrypted Query ProcessingMateus S. H. Cruz
 
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNoSuchCon
 
Some stuff about C++ and development
Some stuff about C++ and developmentSome stuff about C++ and development
Some stuff about C++ and developmentJon Jagger
 
C Programming: Pointer (Examples)
C Programming: Pointer (Examples)C Programming: Pointer (Examples)
C Programming: Pointer (Examples)Joongheon Kim
 
Compiler Construction | Lecture 2 | Declarative Syntax Definition
Compiler Construction | Lecture 2 | Declarative Syntax DefinitionCompiler Construction | Lecture 2 | Declarative Syntax Definition
Compiler Construction | Lecture 2 | Declarative Syntax DefinitionEelco Visser
 
02 Java Language And OOP Part II LAB
02 Java Language And OOP Part II LAB02 Java Language And OOP Part II LAB
02 Java Language And OOP Part II LABHari Christian
 
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Mateus S. H. Cruz
 
Integrating Erlang and Java
Integrating Erlang and Java Integrating Erlang and Java
Integrating Erlang and Java Dennis Byrne
 
Intro To C++ - Class #18: Vectors & Arrays
Intro To C++ - Class #18: Vectors & ArraysIntro To C++ - Class #18: Vectors & Arrays
Intro To C++ - Class #18: Vectors & ArraysBlue Elephant Consulting
 
From clever code to better code
From clever code to better codeFrom clever code to better code
From clever code to better codeDror Helper
 
Advanced Java Practical File
Advanced Java Practical FileAdvanced Java Practical File
Advanced Java Practical FileSoumya Behera
 
Machine learning in production with scikit-learn
Machine learning in production with scikit-learnMachine learning in production with scikit-learn
Machine learning in production with scikit-learnJeff Klukas
 
The Erlang Programming Language
The Erlang Programming LanguageThe Erlang Programming Language
The Erlang Programming LanguageDennis Byrne
 
Python programming workshop session 2
Python programming workshop session 2Python programming workshop session 2
Python programming workshop session 2Abdul Haseeb
 

What's hot (20)

ENKI: Access Control for Encrypted Query Processing
ENKI: Access Control for Encrypted Query ProcessingENKI: Access Control for Encrypted Query Processing
ENKI: Access Control for Encrypted Query Processing
 
Java practical
Java practicalJava practical
Java practical
 
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly AdviceNSC #2 - D2 06 - Richard Johnson - SAGEly Advice
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice
 
Some stuff about C++ and development
Some stuff about C++ and developmentSome stuff about C++ and development
Some stuff about C++ and development
 
Java Generics
Java GenericsJava Generics
Java Generics
 
TDD Training
TDD TrainingTDD Training
TDD Training
 
C Programming: Pointer (Examples)
C Programming: Pointer (Examples)C Programming: Pointer (Examples)
C Programming: Pointer (Examples)
 
Compiler Construction | Lecture 2 | Declarative Syntax Definition
Compiler Construction | Lecture 2 | Declarative Syntax DefinitionCompiler Construction | Lecture 2 | Declarative Syntax Definition
Compiler Construction | Lecture 2 | Declarative Syntax Definition
 
Unit Testing with Foq
Unit Testing with FoqUnit Testing with Foq
Unit Testing with Foq
 
02 Java Language And OOP Part II LAB
02 Java Language And OOP Part II LAB02 Java Language And OOP Part II LAB
02 Java Language And OOP Part II LAB
 
Java 103
Java 103Java 103
Java 103
 
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
 
Integrating Erlang and Java
Integrating Erlang and Java Integrating Erlang and Java
Integrating Erlang and Java
 
Intro To C++ - Class #18: Vectors & Arrays
Intro To C++ - Class #18: Vectors & ArraysIntro To C++ - Class #18: Vectors & Arrays
Intro To C++ - Class #18: Vectors & Arrays
 
From clever code to better code
From clever code to better codeFrom clever code to better code
From clever code to better code
 
Advanced Java Practical File
Advanced Java Practical FileAdvanced Java Practical File
Advanced Java Practical File
 
Machine learning in production with scikit-learn
Machine learning in production with scikit-learnMachine learning in production with scikit-learn
Machine learning in production with scikit-learn
 
The Erlang Programming Language
The Erlang Programming LanguageThe Erlang Programming Language
The Erlang Programming Language
 
Base de-datos
Base de-datosBase de-datos
Base de-datos
 
Python programming workshop session 2
Python programming workshop session 2Python programming workshop session 2
Python programming workshop session 2
 

Similar to Mining Source Code Improvement Patterns from Similar Code Review Works

Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works奈良先端大 情報科学研究科
 
Scientific Software Development
Scientific Software DevelopmentScientific Software Development
Scientific Software Developmentjalle6
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisSebastiano Panichella
 
Static analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesStatic analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesAndrey Karpov
 
Delivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using GitlabDelivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using GitlabTonny Adhi Sabastian
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?Andrey Karpov
 
Devry CIS 355A Full Course Latest
Devry CIS 355A Full Course LatestDevry CIS 355A Full Course Latest
Devry CIS 355A Full Course LatestAtifkhilji
 
Introduction to Aspect Oriented Programming
Introduction to Aspect Oriented ProgrammingIntroduction to Aspect Oriented Programming
Introduction to Aspect Oriented ProgrammingAmir Kost
 
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDTEclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDTElena Laskavaia
 
Compiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersCompiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersEelco Visser
 
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...Yuki Ueda
 
SQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxesSQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxesJasmine Chen
 
Kyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdfKyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdfFlavio W. Brasil
 
C++ Windows Forms L01 - Intro
C++ Windows Forms L01 - IntroC++ Windows Forms L01 - Intro
C++ Windows Forms L01 - IntroMohammad Shaker
 
SoCal Code Camp 2015: An introduction to Java 8
SoCal Code Camp 2015: An introduction to Java 8SoCal Code Camp 2015: An introduction to Java 8
SoCal Code Camp 2015: An introduction to Java 8Chaitanya Ganoo
 
SQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET FundamentalsSQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET Fundamentalsmikehuguet
 
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...EA User Group London 2018 - Extending EA with custom scripts to cater for spe...
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...Guillaume Finance
 

Similar to Mining Source Code Improvement Patterns from Similar Code Review Works (20)

Mining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review WorksMining Source Code Improvement Patterns from Similar Code Review Works
Mining Source Code Improvement Patterns from Similar Code Review Works
 
Scientific Software Development
Scientific Software DevelopmentScientific Software Development
Scientific Software Development
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
Static analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutesStatic analysis: Around Java in 60 minutes
Static analysis: Around Java in 60 minutes
 
Delivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using GitlabDelivering High Quality Elixir Code using Gitlab
Delivering High Quality Elixir Code using Gitlab
 
Tech talks#6: Code Refactoring
Tech talks#6: Code RefactoringTech talks#6: Code Refactoring
Tech talks#6: Code Refactoring
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Devry CIS 355A Full Course Latest
Devry CIS 355A Full Course LatestDevry CIS 355A Full Course Latest
Devry CIS 355A Full Course Latest
 
Introduction to Aspect Oriented Programming
Introduction to Aspect Oriented ProgrammingIntroduction to Aspect Oriented Programming
Introduction to Aspect Oriented Programming
 
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDTEclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
Eclipse Con 2015: Codan - a C/C++ Code Analysis Framework for CDT
 
Compiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | InterpretersCompiler Construction | Lecture 14 | Interpreters
Compiler Construction | Lecture 14 | Interpreters
 
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
Impact of Coding Style Checker on Code Review -A case study on the OpenStack ...
 
SQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxesSQLGitHub - Access GitHub API with SQL-like syntaxes
SQLGitHub - Access GitHub API with SQL-like syntaxes
 
Kyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdfKyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdf
 
C++ Windows Forms L01 - Intro
C++ Windows Forms L01 - IntroC++ Windows Forms L01 - Intro
C++ Windows Forms L01 - Intro
 
Magento code audit
Magento code auditMagento code audit
Magento code audit
 
Symbexecsearch
SymbexecsearchSymbexecsearch
Symbexecsearch
 
SoCal Code Camp 2015: An introduction to Java 8
SoCal Code Camp 2015: An introduction to Java 8SoCal Code Camp 2015: An introduction to Java 8
SoCal Code Camp 2015: An introduction to Java 8
 
SQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET FundamentalsSQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET Fundamentals
 
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...EA User Group London 2018 - Extending EA with custom scripts to cater for spe...
EA User Group London 2018 - Extending EA with custom scripts to cater for spe...
 

Recently uploaded

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 

Recently uploaded (20)

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 

Mining Source Code Improvement Patterns from Similar Code Review Works

  • 1. Mining Source Code Improvement Patterns from Similar Code Review Yuki Ueda1, Takashi Ishio1, Akinori Ihara2, Kenichi Matsumoto1 1Nara Institute of Science and Technology 2Wakayama University 13th International Workshop on Software Clones (IWSC’19)
  • 2. Background Approach Result Summary Contents • Goal:Reduce Code Review Cost • Approach:Code Improvement Pattern Detection That Appeared Review • Evaluation: Measure Patterns’ Frequency and Accuracy 2
  • 3. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 3 - i=key + i=dic[“key”] Patch Background (1) Submit
  • 4. Background Approach Result Summary Code review process: Reviewers suggest code fix Patch Author Reviewer Project 4 - i=key + i=dic[“key”] Patch You should fix (1) Submit (2) Review, Fix suggestion Background
  • 5. Background Approach Result Summary Code review process: Reviewers suggest code fix 5 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (3) Integrate Patch Author Reviewer Project(1) Submit (2) Review, Fix suggestion Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) Background
  • 6. Background Approach Result Summary Problem: Reviewers need to check several times 6 - i=key + i=dic[“key”] - i=key + i_=_dic[“KEY”] (2) (n) Review Fix suggestion (n) Integrate Patch Author Reviewer Project(1) Submit Reviewed Patch (Integrated Patch) Pre-Review Patch (Initial Patch) String should be lower Waste space Background
  • 7. Background Approach Result Summary Goal Reduce Similar Review Automatically 7 Auto Review System (2) Review Fix suggestion (3) Review request Patch Author Reviewer(1) Submit Similar patch is fixed in the past like.. Background
  • 8. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 8 ”key” , it will be “KEY” Pattern i=dic[“key”] i=dic[“KEY”] Dataset i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach If patch has Detect
  • 9. Background Approach Result Summary Approach: Detect Pattern from Reviewed Patch Diff 9 Patch Author Auto Review System print(“key”) print(“KEY”) ”key” , it will be “KEY” Pattern If patch has Use Dataset i=dic[“key”] i=dic[“KEY”]i=dic[“key”] i=dic[“KEY”]i=dic[“key”] Pre-Review Patch i=dic[“KEY”] Reviewed Patch Approach
  • 10. Background Approach Result Summary Detect Code Improved Pattern (1/2): Divide Patch Diff to Chunk 10 - if i␣==␣0: + if i==0: break - i=dic[“key”] + i=dic.get(“key”) - i=dic[“key”] + i=dic.get(“key”) - if i␣==␣0: + if i==0: Approach
  • 11. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 11 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic i=dic - [ + .get(i=dic “key” - ] + ) Length Length Length Approach
  • 12. Background Approach Result Summary Detect Code Improved Pattern (1/2): Get Pattern by Sequential Pattern Mining 12 - i=dic[“key”] + i=dic.get(“key”) - [i=dic - [ + .get(i=dic - [i=dic - ] i=dic + ) - [ + .get(i=dic “key” Length Length Length Keep Frequently Appeared and Longer Patterns Approach
  • 13. Background Approach Result Summary Pattern Evaluation 13 i=dic + .get( - ] Appeared Time: + )(e.g. Pattern i=dic .get( ] ) Pre-Reviewed Patches that have Reviewed Patches that have ) Number of Patch Pairs Approach
  • 14. Background Approach Result Summary Pattern Evaluation 14 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs i=dic[“key”] i=dic.get(“key”) i=dic[”KEY”] Count NOT Count e.g. Pre-Reviewed Patch i=dic + .get( - ] + )(e.g. Pattern ) i=dic ] Approach
  • 15. Background Approach Result Summary 15 Appeared Time: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Number of Patch Pairs Accuracy: .get( ) Pre-Reviewed Patches that have Reviewed Patches that have Ratio of Patch Pairs Pattern Evaluation i=dic ] i=dic ] i=dic + .get( - ] + )(e.g. Pattern ) Approach
  • 16. Background Approach Result Summary Target 16 Project OpenStack Language Python3 Time Period 2011-2016 # Patches 173,749 # Chunks for Detect Pattern 555,050 # Chunks for Evaluate Pattern 61,673 Result
  • 17. Background Approach Result Summary 8 Frequently Appeared Pattern 17 self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 18. Background Approach Result Summary 8 Frequently Appeared Pattern 18 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes xrange() range() Result
  • 19. Background Approach Result Summary 8 Frequently Appeared Pattern 19 assertEquals() assertEqual() Why?: Support for Python 2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) xrange() range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 20. Background Approach Result Summary 8 Frequently Appeared Pattern 20 assertEquals() assertEqual() Why?: Support for Python2 to 3 changes assertTrue(x in array) Why?: Improve readability assertIn(x, array) - xrange() + range() self.stbout() self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Thresholds: Appeared time > 300 Accuracy > 10% Total 8 patterns Cover: 32.3% (19,940/ 61,673) similar patches Accuracy: 45.9% Result
  • 21. Background Approach Result Summary Patterns are discussed on StackOverflow 21 - assertEquals() + assertEqual() Why?: Support for Python2 to 3 changes - assertTrue(x in array) Why?: Improve readability + assertIn(x, array) - xrange() + range() - self.stbout() + self.stubs.Set() Why?: Support for OpenStacks‘ library dependency changes Result
  • 22. Background Approach Result Summary For Automatically Code Review: Work as GitHub Bot 22 Patch authorBot I fixed Reviewer OK Sample URL: https://github.com/Ikuyadeu/ExtentionTest/pull/9 Result
  • 23. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 23 Bad name Waste space Static Analysis Tool (pylint) Fix based on Language Other tools: ESlint, Pmd, checkstyle Result
  • 24. Background Approach Result Summary vs Other Tool (1 / 2) Static Analysis Tool FOO=0 foo_=_0 24 Static Analysis Tool (pylint) Fix based on Language This research: Project-specific changes self.stbout() xrange() self.stubs.Set() range() Old library dependency Language definition Result
  • 25. Background Approach Result Summary vs Other Tool (2 / 2) 25 Choose best rule set from large rule set • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order IntelliCode Result
  • 26. Background Approach Result Summary vs Other Tool (2 / 2) 26 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Result
  • 27. Background Approach Result Summary vs Other Tool (2 / 2) 27 Choose best rule set from large rule set Find NEW pattern set from history • Invalid-name • Bad-continuation • Wrong-import-order • Invalid-name • Bad-continuation • Wrong-import-order • disk2disk_api • stubs.Set2stub_out • assert-equals2equal IntelliCode This Study Support project-specific problem Support change of environment Result
  • 28. Background Approach Result Summary Future Work • Which pattern should bot choose? üMost appeared pattern, High accuracy pattern • Compare with Other Projects and Languages’ Patterns • Evaluate by submitting pull request, and get ratio of Accepted / Submitted pull request 28 Summary