Impact of Tool Support in
Patch Construction
Anil Koyuncu, Tegawendé F. Bissyandé, Dongsun Kim, Jacques Klein,
Martin Monperrus and Yves Le Traon
Motivation
Traditional Patch Construction
Detection Localization Generation
Thanks to…
Test Automation
Automated Bug/Fault Localization
Program Repair
Manual
Patching
Fully
Automated
Patching
Static/Dynamic Analysis
Really?
They use?
Or don’t use?
Adopted by
whom?
Patch →
Accepted?
Stable?
1991 2017
…
Long Life Software
Problem
Solution
Single problem per patch
~15 million LOC
Rich code repository
Subject --- Linux Kernel Project
Subject --- Data Sources
Bug Reports
Change History
Developer
Discussion
Patch Construction Processes
Process H (Human)
• Fully Manual
Process DLH (Detection Localization Human)
• Automated Localization (static/dynamic analysis)
• Manual Patch Generation
Process HMG (Human Match Generation)
• Manual Design of Patch Template
• Automatic Application
H patches
• Identification based on direct link to Bugzilla IDs
DLH Patches
• Detection based on <tool> names
DLH Patches – Tools
[Figure: # of DLH patches per year (author dates, 2005 to 2016), by tool: checkpatch, Sparse, Linux Driver Verification Project, Smatch, Coverity, cppcheck, Strace, Syzkaller.]
HMG Patches
• mentioning “coccinelle” or “semantic patch”
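The three identification rules (Bugzilla links for H, tool names for DLH, "coccinelle" or "semantic patch" for HMG) can be sketched as a keyword classifier over commit messages. This is a minimal illustration, not the authors' actual tooling: the function name `classify_patch`, the Bugzilla URL pattern, and the order in which the rules are checked are assumptions; the tool names come from the DLH tools slide.

```python
import re

# Tool names as listed on the "DLH Patches - Tools" slide (lower-cased).
DLH_TOOLS = ["checkpatch", "sparse", "linux driver verification", "smatch",
             "coverity", "cppcheck", "strace", "syzkaller"]
# Keywords from the HMG slide.
HMG_KEYWORDS = ["coccinelle", "semantic patch"]
# Assumed shape of a Bugzilla link inside a commit message (hypothetical).
BUGZILLA_LINK = re.compile(r"bugzilla\.kernel\.org/show_bug\.cgi\?id=\d+",
                           re.IGNORECASE)


def classify_patch(message: str) -> str:
    """Return 'HMG', 'DLH', 'H' or 'other' for one commit message.

    The precedence HMG > DLH > H is an assumption made for this sketch.
    """
    text = message.lower()
    if any(keyword in text for keyword in HMG_KEYWORDS):
        return "HMG"
    # Whole-word match, so that e.g. "sparsely" does not count as "sparse".
    if any(re.search(r"\b" + re.escape(tool) + r"\b", text)
           for tool in DLH_TOOLS):
        return "DLH"
    if BUGZILLA_LINK.search(message):
        return "H"
    return "other"
```

Plain keyword matching will mislabel some messages (for example, "sparse" is also an ordinary English word), so a real study would likely need manual validation of a sample.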
HMG Patches – Coccinelle
SmPL (Semantic Patch Language) for specifying desired matches and
transformations in C code.
Patch derived from the SmPL
Dataset
Linux 2.6.12 (June 2005) - Linux 4.8 (October 2016): 616,291 commits
• H Patches: 5758 commits
• DLH Patches: 729 commits
• HMG Patches: 4050 commits
Temporal Distribution
[Figure: # of patches per year (author dates, 2004 to 2016) for H patches, DLH patches and HMG patches.]
Spatial Distribution
[Figure: percentages of H patches, DLH patches and HMG patches per kernel subsystem (code directory): arch, drivers, fs, include, kernel, net, sound, staging, others.]
Research Questions
• RQ1 - Community Reaction
• RQ2 - Who is using?
• RQ3 - Impact on Stability
• RQ4 - Kind of Bugs
RQ1 - Community Reaction
• Delays in integrating commits
• Gaps between proposed and integrated patches
RQ1 - Commit Acceptance Delay
• Finding: Integration of tool-supported patches is slower than that of traditional H patches.
Submission date acceptance date
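The delay between submission and acceptance can be illustrated with a short computation. This sketch assumes, hypothetically, that each patch is available as a (type, submission date, acceptance date) triple with ISO-formatted dates; the function names and the use of the median as the summary statistic are illustrative choices, not taken from the slides.

```python
from datetime import datetime
from statistics import median


def acceptance_delay_days(submitted: str, accepted: str) -> float:
    """Days elapsed between submission and acceptance (ISO date strings)."""
    start = datetime.fromisoformat(submitted)
    end = datetime.fromisoformat(accepted)
    return (end - start).total_seconds() / 86400.0


def median_delay_by_type(patches):
    """Group (patch_type, submitted, accepted) triples and take the median delay."""
    delays = {}
    for patch_type, submitted, accepted in patches:
        delays.setdefault(patch_type, []).append(
            acceptance_delay_days(submitted, accepted))
    return {patch_type: median(values) for patch_type, values in delays.items()}
```

For example, `median_delay_by_type([("H", "2016-03-01", "2016-03-03"), ("HMG", "2016-03-01", "2016-03-11")])` compares one H patch against one HMG patch; both dates of a patch must use the same format (both with or both without a time zone).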
RQ1- LKML mentioning HMG (Coccinelle)
• Finding: The gap is closing, patches appear to be accepted.
[Figure: # of references or commits per year (2008 to 2016), with series Patch Submitted in LKML, Accepted Commit, and Developer / Maintainer Reply.]
... correlate this frequency on a monthly basis with the corresponding statistics on accepted DLH patches related to the specific tools.
[Figure 7: # of Patches submitted / discussed / accepted. Panels: (a) Data on checkpatch-related (DLH) patches; (b) Evolution of the Gap; (c) Data on coccinelle-related (HMG) patches; (d) Evolution of the Gap. Panels (a) and (c) plot # of references or commits for Patch Submitted in LKML, Accepted Commit, and Developer / Maintainer Reply; panels (b) and (d) plot the Gap (%) over the timeline with a linear-regression slope.]
We have crawled all emails archived in the Linux Kernel Mailing List (LKML) using Scrapy. We use heuristics to differentiate mes…
RQ2 - Profile of Patch Authors
• Specialty
• Commitment
RQ2 - Specialty
• Finding: HMG Patches are often generated by less
specialized developers.
Speciality is defined as a metric for characterizing the extent to which a developer is focused on a specific subsystem. We compute it as the percentage of patches, among all her/his patches, which a developer contributes to a specific subsystem. Thus, speciality is measured with respect to each Linux code directory. We then draw, in Figure 8, the distributions of speciality metric values of developers for the different types of patches: e.g., for an automated patch applied to a file in a subsystem, we consider the commit author speciality w.r.t. that subsystem.
[Figure 8: Speciality of developers vs. patch types. Distributions of % of speciality for H Patches, DLH Patches and HMG Patches.]
H patches are mostly provided by specialized developers. This may imply that the developers focus on implementing specific functionalities over time. Similarly, DLH patches appear to be mostly…
Focus on specific modules
Contribute to all modules
RQ2 - Commitment of developers
• Finding: Patch application tools (HMG) enable developers to
remain committed to the code base.
# days between first patch
and last patch
#patches integrated
into Linux
Commitment
To roll back changes, it is common to revert a commit, e.g.,
Commit message: revert <hash>
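Revert detection of this kind can be sketched by scanning commit messages for the line that `git revert` writes by default ("This reverts commit <hash>."). The function names below are hypothetical, and relying only on that default message is an assumption: reverts whose messages were written by hand would be missed.

```python
import re

# Default line produced by `git revert`; hashes may be abbreviated.
REVERT_RE = re.compile(r"This reverts commit ([0-9a-f]{7,40})")


def reverted_hashes(messages):
    """Collect every commit hash that some message claims to revert."""
    found = set()
    for message in messages:
        found.update(REVERT_RE.findall(message))
    return found


def percent_reverted(patch_hashes, messages) -> float:
    """Share (in %) of the given full patch hashes that were later reverted."""
    if not patch_hashes:
        return 0.0
    reverted = reverted_hashes(messages)
    hits = sum(1 for full_hash in patch_hashes
               if any(full_hash.startswith(prefix) for prefix in reverted))
    return 100.0 * hits / len(patch_hashes)
```

Running `percent_reverted` separately on the H, DLH and HMG hash sets would give one percentage per patch type.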
RQ3 - Stability of Patches
[Figure: % of patches reverted, for H patches, DLH patches and HMG patches. Data labels, in the order extracted: 2.81, 0.27, 0.32.]
RQ3 - Stability of Patches
• Finding: Tool-supported patches are generally stable.
RQ3 - Stability of Patches
• Finding: Issues with fix patterns appear to be discovered quickly; bug detection tools, however, need a long time.
RQ4 - Kind of Bugs
• Spread of buggy code ~ Locality of the patches
• Complexity of the bugs ~ Change operations
RQ4 - Bug Locality
[Figure: % of patches by number of files changed (1 file, 2 files, 3 files, 4 files, 5+ files), for H Patches, DLH Patches and HMG Patches.]
[Figure: % of patches by number of hunks (1 hunk, 2 hunks, 3 hunks, 4 hunks, 5+ hunks), for H Patches, DLH Patches and HMG Patches.]
[Figure: % of patches by number of lines changed (1 line, 2 lines, 3 lines, 4 lines, 5+ lines), for H Patches, DLH Patches and HMG Patches.]
More Local
Several Hunks
Several Lines
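The three locality measures (files, hunks, lines) can be read directly off a patch in git's unified diff format. The following is a minimal sketch under that assumption; `patch_locality` is a hypothetical helper, not the script used in the study.

```python
def patch_locality(diff_text: str):
    """Return (files, hunks, changed_lines) for a git-style unified diff."""
    files = hunks = changed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new file section starts; its ---/+++ header lines follow.
            files += 1
            in_hunk = False
        elif line.startswith("@@"):
            hunks += 1
            in_hunk = True
        elif in_hunk and line.startswith(("+", "-")):
            # Only added/removed lines inside a hunk count as changes.
            changed += 1
    return files, hunks, changed
```

Tracking whether the parser is inside a hunk keeps the `---` and `+++` file headers from being counted as changed lines.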
Change operations - Gumtree
If/Unary:del
Ident/GenericString:mv
If/FunCall:add
Ident/GenericString:mv
• AST diff tool to identify change operations
RQ4 - Change operations in patches
• Finding: Several change operations
[Figure: % of HMG Patches per change category (Ident/GenericString, GenericList/Left, Compound/ExprStatement, Program/Declaration, Compound/If), split by operation: upd, mv, add, del.]
[Figure: % of H Patches per change category (Program/CppTop, Ident/GenericString, Program/Declaration, Compound/ExprStatement, Compound/If), split by operation: upd, mv, add, del.]
[Figure: % of DLH Patches per change category (Ident/GenericString, Compound/ExprStatement, Left/Constant, Compound/If, If/Compound), split by operation: upd, mv, add, del.]
Take-aways
(1) Tools are gradually adopted.
(2) HMG patches leverage micro-clones.
(1) DLH & HMG patches need more time to be accepted. Perhaps due to lower severity.
(2) HMG patch acceptance has been fast.
Take-aways
(1) DLH & HMG patches can also change several lines.
(2) HMG patches change several files due to APIs.
(1) More opportunities → HMG patches leverage redundancy.
(2) Need to target more complex defects.
RQ2 - Commitment of developers
• Finding: Patch application tools enable developers to remain committed to contributing patches to the code base.
Commitment = # patches integrated into Linux * # days between first patch and last patch
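The product above can be computed per developer from the dates of their integrated patches. This is a small sketch; the function name and the choice of `datetime.date` values as input are assumptions made for illustration.

```python
from datetime import date


def commitment(patch_dates) -> int:
    """# patches integrated * # days between the first and the last patch."""
    if not patch_dates:
        return 0
    span_days = (max(patch_dates) - min(patch_dates)).days
    return len(patch_dates) * span_days
```

Under this definition a developer with a single patch has a span of zero days and therefore a commitment of zero.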

Impact of Tool Support in Patch Construction

  • 1. Impact of Tool Support in Patch Construction Anil Koyuncu, Tegawendé F. Bissyandé, Dongsun Kim, Jacques Klein, Martin Monperrus and Yves Le Traon 1
  • 4. Thanks to… 4 Test Automation Automated Bug/Fault Localization Program Repair Manual Patching Fully Automated Patching Static/DynamicAnalysis
  • 5. Really? 5 They use? Or don’t use? Adopted by whom? Patch à Accepted? Stable?
  • 6. Subject --- Linux Kernel Project: 1991 … 2017; Long Life Software; ~15 million LOC; Rich code repository; Problem / Solution; Single problem per patch
  • 7. Subject --- Data Sources: Bug Reports; Change History; Developer Discussion
  • 8. Patch Construction Processes. Process H (Human): Fully Manual. Process DLH (Detection Localization Human): Automated Localization (static/dynamic analysis), Manual Patch Generation. Process HMG (Human Match Generation): Manual Design of Patch Template, Automatic Application.
  • 9. H patches • Identification based on direct link to Bugzilla IDs
  • 10. DLH Patches • Detection based on <tool> names
  • 11. DLH Patches – Tools [Figure: # of patches per year (author dates, 2005 to 2016) for each tool: checkpatch, Sparse, Linux Driver Verification Project, Smatch, Coverity, cppcheck, Strace, Syzkaller]
  • 12. HMG Patches • mentioning “coccinelle” or “semantic patch”
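The three identification heuristics on slides 9, 10 and 12 can be sketched as a small commit-message classifier. This is only an illustration under stated assumptions: the function name `classify_commit`, the Bugzilla URL pattern, the whole-word matching of tool names, and the precedence (HMG, then DLH, then H) are hypothetical choices, not the authors' implementation. The tool names are the ones listed on the "DLH Patches – Tools" slide.

```python
import re
from typing import Optional

# Assumption: an H patch links to a kernel Bugzilla entry with a URL of this shape.
BUGZILLA_LINK = re.compile(r"bugzilla\.kernel\.org/show_bug\.cgi\?id=\d+", re.IGNORECASE)

# HMG: the message mentions "coccinelle" or "semantic patch" (slide 12).
HMG_MENTION = re.compile(r"\b(coccinelle|semantic patch)", re.IGNORECASE)

# DLH: the message mentions one of the tools from the tools slide.
# Matching whole words is an assumption; "sparse" in particular can still
# produce false positives (e.g. "sparse file").
DLH_TOOL = re.compile(
    r"\b(checkpatch|sparse|linux driver verification|smatch|coverity|"
    r"cppcheck|strace|syzkaller)\b",
    re.IGNORECASE,
)


def classify_commit(message: str) -> Optional[str]:
    """Return "HMG", "DLH", "H", or None for a commit message.

    The precedence order below is an assumption of this sketch.
    """
    if HMG_MENTION.search(message):
        return "HMG"
    if DLH_TOOL.search(message):
        return "DLH"
    if BUGZILLA_LINK.search(message):
        return "H"
    return None
```

For example, `classify_commit("Fix NULL dereference\n\nFound by smatch.")` falls into the DLH branch, while a message without any of the markers is left unclassified.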
  • 13. HMG Patches – Coccinelle: SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. [Figure labels: SmPL; Patch derived from the SmPL]
  • 14. Dataset: Linux 2.6.12 (June 2005) to Linux 4.8 (October 2016), 616,291 commits. H Patches, DLH Patches, HMG Patches: 5758 commits, 729 commits and 4050 commits, respectively.
  • 16. Spatial Distribution [Figure: percentages of H patches, DLH patches and HMG patches per kernel subsystem (code directory): arch, drivers, fs, include, kernel, net, sound, staging, others]
  • 17. Research Questions: RQ1 - Community Reaction; RQ2 - Who is using?; RQ3 - Impact on Stability; RQ4 - Kind of Bugs
  • 18. RQ1 - Community Reaction • Delays in integrating commits • Gaps between proposed and integrated patches
  • 19. RQ1 - Commit Acceptance Delay • Finding: Integration of tool-supported patches is slower than that of traditional H patches. [Figure labels: submission date; acceptance date]
  • 20. RQ1 - LKML mentioning HMG (Coccinelle) • Finding: The gap is closing, patches appear to be accepted. [Figure: # of references or commits per year, 2008 to 2016: Patch Submitted in LKML; Accepted Commit; Developer / Maintainer Reply] Paper excerpt shown on the slide (cropped): "… correlate this frequency on a monthly basis with the corresponding statistics on accepted DLH patches related to the specific tools." Figure 7: # of Patches submitted / discussed / accepted. (a) Data on checkpatch-related (DLH) patches. (b) Evolution of the Gap (slope, linear regression, over % of timeline). (c) Data on coccinelle-related (HMG) patches. (d) Evolution of the Gap (slope, linear regression, over % of timeline). "We have crawled all emails archived in the Linux Kernel Mailing List (LKML) using Scrapy. We use heuristics to differentiate mes-"
  • 21. RQ2 - Profile of Patch Authors • Specialty • Commitment
  • 22. RQ2 - Specialty • Finding: HMG Patches are often generated by less specialized developers. Speciality is defined as a metric for characterizing the extent to which a developer is focused on a specific subsystem. We compute it as the percentage of patches, among all her/his patches, which a developer contributes to a specific subsystem. Thus, speciality is measured with respect to each Linux code directory. We then draw, in Figure 8, the distributions of speciality metric values of developers for the different types of patches: e.g., for an automated patch applied to a file in a subsystem, we consider the commit author speciality w.r.t. that subsystem. Figure 8: Speciality of developers vs. patch types (% of Speciality for H Patches, DLH Patches, HMG Patches). H patches are mostly provided by specialized developers. This may imply that the developers focus on implementing specific functionalities over time. Similarly, DLH patches appear to be mostly … [Slide annotations: Focus on Specific modules; Contribute to all modules]
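The speciality metric described on slide 22 (the percentage of a developer's patches that go to a given subsystem) can be sketched as follows. The function name `speciality` and the input shape are hypothetical, and the sketch assumes each patch is attributed to exactly one subsystem (top-level code directory); a patch touching several directories would need a policy that the slide does not specify.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def speciality(patches: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Map (author, subsystem) to the % of that author's patches in that subsystem.

    `patches` holds one (author, subsystem) pair per patch; this input
    format is an assumption of the sketch.
    """
    pairs = list(patches)
    patches_per_author = Counter(author for author, _ in pairs)
    patches_per_pair = Counter(pairs)
    return {
        (author, subsystem): 100.0 * count / patches_per_author[author]
        for (author, subsystem), count in patches_per_pair.items()
    }
```

A developer with three patches in `drivers` and one in `net` gets a speciality of 75% for `drivers` and 25% for `net`; a developer who only ever patches `fs` gets 100% there.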
  • 23. RQ2 - Commitment of developers • Finding: Patch application tools (HMG) enable developers to remain committed to the code base. [Figure labels: Commitment; # days between first patch and last patch; # patches integrated into Linux]
  • 24. RQ3 - Stability of Patches: to roll back changes, it is common to revert a commit, e.g., Commit message: revert hash
  • 25. RQ3 - Stability of Patches • Finding: Tool-supported patches are generally stable. [Figure: % of patches reverted for H patches, DLH patches, HMG patches: 2.81, 0.27, 0.32]
  • 26. RQ3 - Stability of Patches • Finding: Issues with fix patterns appear to be discovered quickly, whereas those involving bug detection tools take a long time.
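The stability measurement on slides 24 to 26 relies on spotting reverts in commit messages. A minimal sketch, assuming reverts carry the default line that `git revert` writes ("This reverts commit <sha>."); the function names and the prefix matching of abbreviated hashes are hypothetical and not taken from the slides.

```python
import re
from typing import Iterable, Set

# Default body line produced by `git revert`: "This reverts commit <sha>."
REVERT_LINE = re.compile(r"This reverts commit ([0-9a-f]{7,40})", re.IGNORECASE)


def reverted_hashes(messages: Iterable[str]) -> Set[str]:
    """Collect the (possibly abbreviated) hashes named in revert messages."""
    found = set()
    for message in messages:
        for sha in REVERT_LINE.findall(message):
            found.add(sha.lower())
    return found


def revert_rate(patch_hashes: Iterable[str], reverted: Set[str]) -> float:
    """Percentage of the given patches whose hash starts with a reverted hash."""
    hashes = [h.lower() for h in patch_hashes]
    if not hashes:
        return 0.0
    hits = sum(1 for h in hashes if any(h.startswith(r) for r in reverted))
    return 100.0 * hits / len(hashes)
```

Running `revert_rate` separately over the H, DLH and HMG hash sets would give one percentage per patch type, in the spirit of the chart on slide 25.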
  • 27. RQ4 - Kind of Bugs • Spread of buggy code ~ Locality of the patches • Complexity of the bugs ~ Change operations
  • 28. RQ4 - Bug Locality [Figure: three stacked bar charts of % of patches for H patches, DLH Patches and HMG Patches, broken down by number of files (1 to 5+), hunks (1 to 5+) and lines (1 to 5+) changed. Annotations: More Local; Several Hunks; Several Lines]
  • 29. Change operations - GumTree • AST diff tool to identify change operations. Examples: If/Unary:del, Ident/GenericString:mv, If/FunCall:add, Ident/GenericString:mv
  • 30. RQ4 - Change operations in patches • Finding: Several change operations [Figure: % of patches per change operation (upd, mv, add, del). HMG Patches: Ident/GenericString, GenericList/Left, Compound/ExprStatement, Program/Declaration, Compound/If. H Patches: Program/CppTop, Ident/GenericString, Program/Declaration, Compound/ExprStatement, Compound/If. DLH Patches: Ident/GenericString, Compound/ExprStatement, Left/Constant, Compound/If, If/Compound]
  • 31. Take-aways (1) Tools are gradually adopted. (2) HMG patches leverage micro-clones. (1) DLH and HMG patches need more time to be accepted, perhaps because they address less severe issues. (2) HMG patch acceptance has been fast.
  • 32. Take-aways (1) DLH and HMG patches can also change several lines. (2) HMG patches change several files due to APIs. (1) More opportunities → HMG patches leverage redundancy. (2) Need to target more complex defects.
  • 33. Really? They use? Or don’t use? Adopted by whom? Patch → Accepted? Stable? Subject --- Data Sources: Bug Reports; Change History; Developer Discussion. Patch Types. Process H: Fully Manual. Process DLH: Automated Localization (static/dynamic analysis), Manual Patch Generation. Process HMG: Manual Design of Patch Template, Automatic Application. RQ2 - Commitment of developers • Finding: Patch application tools enable developers to remain committed to contributing patches to the code base. [Figure label: = # patches integrated into Linux * # days between first patch and last patch]