A Closer Look at Real-World Patches

Kui Liu, Dongsun Kim, Anil Koyuncu, Tegawendé F. Bissyandé, and Yves Le Traon
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
Li Li
Monash Software Force (MSF), Monash University, Melbourne, Australia
34th ICSME 2018, Madrid, Spain, September 27, 2018
> Basic Process of Automated Program Repair (APR)
[Diagram: passing and failing tests → Fault Localization → suspicious buggy code → APR Tools → Patch Candidate → Test → Pass / Fail.]
Where is the code to be fixed? How to generate patches? Is the patch correct?
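The process on this slide can be read as a generate-and-validate loop. The Java sketch below is a minimal illustration under stated assumptions, not the implementation of any tool named in this talk: `AprLoopSketch`, `Patch`, `TestCase`, and the toy "program" (a single integer operator) are hypothetical names introduced only for this example.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.IntBinaryOperator;

// Hypothetical toy model: a "program" is one binary operator, a "test" is an
// input pair with an expected output, and a "patch" replaces the operator.
class AprLoopSketch {
    static final class Patch {
        final String description;
        final IntBinaryOperator patchedProgram;

        Patch(String description, IntBinaryOperator patchedProgram) {
            this.description = description;
            this.patchedProgram = patchedProgram;
        }
    }

    static final class TestCase {
        final int a, b, expected;

        TestCase(int a, int b, int expected) {
            this.a = a;
            this.b = b;
            this.expected = expected;
        }
    }

    // Validation step: a candidate is kept only if every test passes.
    static boolean passesAll(IntBinaryOperator program, List<TestCase> tests) {
        for (TestCase t : tests) {
            if (program.applyAsInt(t.a, t.b) != t.expected) {
                return false;
            }
        }
        return true;
    }

    // Generate-and-validate: try candidates in order, return the first that passes.
    static Optional<Patch> repair(List<Patch> candidates, List<TestCase> tests) {
        for (Patch p : candidates) {
            if (passesAll(p.patchedProgram, tests)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<TestCase> tests = List.of(new TestCase(2, 3, 6), new TestCase(4, 1, 4));
        List<Patch> candidates = List.of(
            new Patch("replace + with -", (a, b) -> a - b),
            new Patch("replace + with *", (a, b) -> a * b));
        System.out.println(repair(candidates, tests).map(p -> p.description).orElse("no patch"));
    }
}
```

A candidate that passes all tests is only plausible; whether it is correct is the third question on the slide and still has to be checked separately.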
> How many bugs are fixed by existing APR tools?
Benchmark: Defects4J [42] (395 bugs).
APR Tool      # Fixed bugs   # Correctly fixed bugs
jGenProg      29             5
jKali         22             1
jMutRepair    17             3
Nopol         35             5
HDRepair      23             6
ACS           23             18
ssFix         60             20
ELIXIR        41             26
JAID          26             9
CapGen        25             21
SketchFix     26             19
SimFix        56             34
Why are the quantity of bugs that APR tools can fix and the quality of the patches they generate so low?
> Scope Limitation of APR Tools
Existing APR tools fix bugs at the statement level.
Example: bug Chart_1 in Defects4J is fixed by jMutRepair, ELIXIR, ssFix, JAID, SketchFix, CapGen, and SimFix.
> Are Non-Statement Code Entities Bug-free?
Bugs are also located in non-statement code entities:
Bug located in a Type Declaration (Math-12 in Defects4J).
Bug located in a Method Declaration (Lang-29 in Defects4J).
Bug located in a Field Declaration (Lang-56 in Defects4J).
No existing APR tool can fix these bugs.
> Statement Level VS. Finer Granularity Level
Statement level: UPD ReturnStatement.
This repair action is hard to reuse for fixing similar bugs.
Expression level: dim / 2 → 0.5 * dim.
Project: Commons-math.
Bug Report ID: MATH-929, “Fix truncated value.”
Commit cedf0d27f9e9341a9e9fa8a192735a0c2e11be40
--- a/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
+++ b/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
@@ -895,1 +895,1 @@
-    return FastMath.pow(2 * FastMath.PI, -dim / 2) *
+    return FastMath.pow(2 * FastMath.PI, -0.5 * dim) *
         FastMath.pow(covarianceMatrixDeterminant, -0.5) * getExponentTerm(vals);
This expression-level fix pattern could be reused to fix similar bugs.
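Why the patch matters can be checked with a few lines of Java. The sketch below is an illustration, not code from Commons-math: `TruncationDemo` and its two helper methods are hypothetical, and `Math.pow` stands in for `FastMath.pow` so that the example has no dependencies.

```java
// Illustration of the truncation behind "dim / 2 → 0.5 * dim".
class TruncationDemo {
    // Buggy form: both operands are int, so the division discards the fraction
    // before the result is widened to double.
    static double buggyExponent(int dim) {
        return -dim / 2;
    }

    // Fixed form: multiplying by the double literal 0.5 keeps the fraction.
    static double fixedExponent(int dim) {
        return -0.5 * dim;
    }

    public static void main(String[] args) {
        int dim = 3; // the two forms differ only for odd dimensions
        System.out.println(buggyExponent(dim)); // -1.0
        System.out.println(fixedExponent(dim)); // -1.5
        System.out.println(Math.pow(2 * Math.PI, buggyExponent(dim)));
        System.out.println(Math.pow(2 * Math.PI, fixedExponent(dim)));
    }
}
```

For even `dim` both methods return the same value, which is one reason such a bug can go unnoticed by tests that only use even dimensions.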
> Objective
Deepen the knowledge of repair ingredients by studying real-world patches at a fine granularity, in support of automated program repair.
STUDY DESIGN
> Research Questions.
RQ1. Do patches impact some specific statement types?
RQ2. Are there code elements in statements that are prone to be faulty?
RQ3. Which expression types are most impacted by patches?
RQ4. Which parts of buggy expressions are prone to be buggy?
In APR, fault localization techniques (e.g., Tarantula [31], Ochiai [32], Ochiai2 [33], Zoltar [34], and DStar [35]) are used to identify bug positions at the code-line level.
[Figure: a statement annotated with its code elements: Data Type, Variable Name, Operator, Being Assigned Expression.]
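The spectrum-based fault localization techniques named on this slide rank code lines by a suspiciousness score computed from test coverage. As a hedged illustration, the sketch below implements the commonly cited Ochiai and Tarantula formulas; the class and parameter names are assumptions for this example (`ef`/`ep`: failing/passing tests that execute the line, `nf`/`np`: failing/passing tests that do not).

```java
// Sketch of two spectrum-based suspiciousness formulas for one code line.
class SuspiciousnessSketch {
    // Ochiai: ef / sqrt((ef + nf) * (ef + ep))
    static double ochiai(int ef, int ep, int nf) {
        double denominator = Math.sqrt((double) (ef + nf) * (ef + ep));
        return denominator == 0 ? 0.0 : ef / denominator;
    }

    // Tarantula: failRatio / (failRatio + passRatio)
    static double tarantula(int ef, int ep, int nf, int np) {
        double failRatio = (ef + nf) == 0 ? 0.0 : (double) ef / (ef + nf);
        double passRatio = (ep + np) == 0 ? 0.0 : (double) ep / (ep + np);
        double denominator = failRatio + passRatio;
        return denominator == 0 ? 0.0 : failRatio / denominator;
    }

    public static void main(String[] args) {
        // A line executed by both failing tests and by no passing test.
        System.out.println(ochiai(2, 0, 0));
        // A line executed by one of two failing and one of two passing tests.
        System.out.println(tarantula(1, 1, 1, 1));
    }
}
```

Both scores are per line, which is why the resulting ranking cannot by itself say which element inside the line (type, name, operator, or sub-expression) is faulty.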
> Bug-fixing Patches Collection
1) Keyword matching:
   bug, error, fault, fix, patch, or repair.
2) Bug linking:
   bug IDs (e.g., MATH-929) in the issue tracking system, where
   (1) Issue Type is ‘bug’,
   (2) Resolution is ‘fixed’.
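The two collection criteria above can be sketched as simple predicates. This is a minimal illustration, not the authors' tooling: `CommitFilterSketch`, its method names, and the issue-key regular expression (a JIRA-style key such as MATH-929) are assumptions, and the issue type and resolution are passed in as plain strings rather than fetched from a tracker. How the two criteria are combined is not stated on the slide, so they are kept separate here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers for selecting bug-fixing commits.
class CommitFilterSketch {
    // Criterion 1: a keyword at the start of a word, so "fixed" and "bugs"
    // match but "prefix" does not.
    private static final Pattern KEYWORDS =
        Pattern.compile("\\b(bug|error|fault|fix|patch|repair)", Pattern.CASE_INSENSITIVE);

    // Criterion 2: an assumed JIRA-style issue key, e.g. MATH-929.
    private static final Pattern ISSUE_KEY = Pattern.compile("\\b([A-Z][A-Z0-9]*-\\d+)\\b");

    static boolean mentionsFixKeyword(String commitLog) {
        return KEYWORDS.matcher(commitLog).find();
    }

    static Optional<String> linkedIssueKey(String commitLog) {
        Matcher m = ISSUE_KEY.matcher(commitLog);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // The linked issue must be a bug that was resolved as fixed.
    static boolean isFixedBug(String issueType, String resolution) {
        return "bug".equalsIgnoreCase(issueType) && "fixed".equalsIgnoreCase(resolution);
    }

    public static void main(String[] args) {
        String log = "MATH-929: Fix truncated value.";
        System.out.println(mentionsFixKeyword(log));
        System.out.println(linkedIssueKey(log).orElse("none"));
        System.out.println(isFixedBug("Bug", "Fixed"));
    }
}
```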
Projects        # Commits identified   # Commits selected
Commons-io      222                    191
Commons-lang    643                    522
Mahout          751                    717
Commons-math    1,021                  909
Derby           3,788                  3,356
Lucene-solr     11,408                 10,755
Total           18,013                 16,450
[Figure: hunk sizes of Buggy_Hunk and Fixed_Hunk; axis: Hunk Size, 0 to 10.]
Keywords are matched against commit logs.
> Patch Differencing at AST Node Level
[Diagram: the buggy and fixed versions of a patch are differenced with GumTree [25]; the resulting code change actions are regrouped into a hierarchical construct of code change actions.]
> Hierarchical Construct of Code Change Actions of a Patch
“Fixed truncated value.”
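The figure on this slide did not survive extraction. As a rough idea of what a hierarchical construct of code change actions might look like, the Java sketch below nests one action per changed AST node. Everything in it is an assumption made for illustration: the `ChangeAction` class, the output format, and the particular nodes chosen for the "-dim / 2" patch; it does not use or reproduce GumTree's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical data structure: one node per code change action, nested the
// same way the changed AST nodes are nested.
class ChangeAction {
    final String action;   // UPD, INS, DEL or MOV
    final String nodeType; // AST node type, e.g. ReturnStatement
    final String code;     // source text of the node, shortened
    final List<ChangeAction> children = new ArrayList<>();

    ChangeAction(String action, String nodeType, String code) {
        this.action = action;
        this.nodeType = nodeType;
        this.code = code;
    }

    // Returns the child so that calls can descend one level at a time.
    ChangeAction add(ChangeAction child) {
        children.add(child);
        return child;
    }

    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("---");
        }
        sb.append(action).append(' ').append(nodeType).append(": ").append(code).append('\n');
        for (ChangeAction child : children) {
            child.render(sb, depth + 1);
        }
    }

    // Illustrative tree for the truncated-value patch; the node choice is assumed.
    static ChangeAction example() {
        ChangeAction root = new ChangeAction("UPD", "ReturnStatement", "return FastMath.pow(...) * ...;");
        root.add(new ChangeAction("UPD", "InfixExpression", "FastMath.pow(...) * FastMath.pow(...) * ..."))
            .add(new ChangeAction("UPD", "MethodInvocation", "FastMath.pow(2 * FastMath.PI, -dim / 2)"))
            .add(new ChangeAction("UPD", "InfixExpression", "-dim / 2 -> -0.5 * dim"));
        return root;
    }

    public static void main(String[] args) {
        System.out.print(example().render());
    }
}
```

Reading the tree from the root gives the statement-level view (UPD ReturnStatement); reading it at the leaf gives the expression-level change that a fix pattern can be mined from.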
RESULTS
> RQ1: Root AST Nodes Impacted by Patches
• Statements are the main buggy code entities.
• Declaration entities (~27%) could also be buggy.
No existing APR tool can fix declaration-related bugs in Defects4J.
Distribution of root AST node types impacted by patches:
Statement, 73.29%
MethodDeclaration, 15.95%
FieldDeclaration, 9.32%
TypeDeclaration, 1.41%
EnumDeclaration, 0.03%
> RQ1: Statements Recurrently Impacted by Patches.
5 of the 22 statement types account for 88% of buggy statements.
APR tools could focus on fixing these specific statement types.
> RQ1: Adoption of Update
Supports the investigation of repair
ingredients in a fine-grained way.
“Update” accounts for half of the repair actions.
1. double d = FastMath.pow(2 * FastMath.PI, -dim / 2);
2. double d = FastMath.pow(2 * FastMath.PI, -dim / 3);
Update:
- a = a + b;
+ a = a * b;
Delete:
  int a = 0;
- a = a + b;
Move:
- a = a + b;
  sum(a,b);
+ a = a + b;
Insert:
  int a = 0;
+ a = a + b;
> Search Space at Statement Level VS. Expression Level
Expression-level granularity could reduce the search space.
Number of buggy ExpressionStatements: ~40,000.
Number of buggy PrefixExpressions: 1,362.
Commit log: added protection against infinite loops by setting a maximal number of evaluations.
> RQ2: Buggy Modifier.
Three kinds of repair actions for modifier-related bugs:
1) Add a missing modifier.
2) Delete an inappropriate modifier.
3) Replace an inappropriate modifier.
No existing APR tool can fix modifier-related bugs in Defects4J.
Distribution of inner-statement elements impacted by patches:
Expression, 82.40%
Type, 8.70%
Identifier, 5.50%
Modifier, 3.30%
Commit log: LANG-334: To avoid exposing a mutating map.
> RQ2: Buggy Type Usage.
Buggy Types:
1. Buggy primitive types.
2. Buggy non-primitive types.
Fixing bugs related to non-primitive types is a new challenge for APR tools.
Commit log: Fix integer overflow.
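A primitive-type bug of the kind the commit log hints at can be shown in a few lines. The sketch below is a hypothetical example, not the patch behind that commit log: `OverflowDemo` and its methods are invented here to show how changing a declared type from `int` to `long` repairs an overflow without touching any expression.

```java
// Hypothetical example of a fix that only changes a declared type.
class OverflowDemo {
    // Buggy: the accumulator is an int, so large sums wrap around.
    static int buggySum(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Fixed: the same statements, with the accumulator and return type widened to long.
    static long fixedSum(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {Integer.MAX_VALUE, 1};
        System.out.println(buggySum(values)); // -2147483648
        System.out.println(fixedSum(values)); // 2147483648
    }
}
```

A tool that only mutates expressions would not produce this fix, since the faulty element is the type in the declaration.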
> RQ2: Buggy Identifiers.
APR tools do not fix buggy identifiers.
Modifying an inconsistent identifier is also labeled as a bug fix by developers.
Debugging buggy names [58, 59, 60, 61, 62].
> RQ3: Expressions Recurrently Impacted by Patches
5 of the 34 expression types account for 80% of buggy expressions.
APR tools could focus on fixing these specific expression types.
[Figure: distribution of repair actions at the expression level.]
> RQ3: Buggy Literal Expressions.
Buggy Literal Expressions raise a new challenge for APR tools.
Commit log: SOLR-6959, fix incorrect base url for PDFs.
> RQ4: Fault-prone Parts in Expressions.
The non-buggy parts of an expression could provide context for fix pattern mining at the expression level.
Distribution of whole vs. sub-element changes in some buggy expressions:
Expression              % whole exp   % each sub-exp
Assignment              18.1%         Left_Hand_Exp (13.3%), Operator (0.8%), Right_Hand_Exp (73.5%)
CastExpression          45.8%         Type (11.9%), Exp (42.9%)
ClassInstanceCreation   15.5%         Pre_Exp (9.2%), ClassType (19.7%), Arguments (63%)
ConditionalExpression   22.9%         Condition_Exp (24.1%), Then_Exp (33%), Else_Exp (49.5%)
InfixExpression         27.3%         Left_Hand_Exp (35%), Operator (5.6%), Right_Hand_Exp (68.7%)
MethodInvocation        14.7%         MethodName (22.1%), Arguments (79.8%)
> Fix Pattern Mining at Expression Level
Commit 44854912194177d67cdfa1dc765ba684eb013a4c
--- a/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
+++ b/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
@@ -895,1 +895,1 @@
- final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase());
+ final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase(Locale.ROOT));
Fix pattern:
- value.toUpperCase()
+ value.toUpperCase(Locale.ROOT)
Commit log: use toUpperCase(Locale) internally to avoid i18n issues.
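The i18n issue this fix pattern addresses can be reproduced with the standard library alone. The sketch below is an illustration with hypothetical names (`LocaleUpperCaseDemo` and its two methods); it passes the locale explicitly instead of changing the JVM default, which is what the no-argument `toUpperCase()` would consult.

```java
import java.util.Locale;

// Shows why toUpperCase() without a fixed locale is fragile.
class LocaleUpperCaseDemo {
    // Stands in for value.toUpperCase() when the JVM default locale is `defaultLocale`.
    static String localeSensitive(String value, Locale defaultLocale) {
        return value.toUpperCase(defaultLocale);
    }

    // The fix pattern: always use the locale-neutral mapping.
    static String localeNeutral(String value) {
        return value.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish rules, 'i' maps to a dotted capital I (U+0130), not to 'I'.
        System.out.println(localeSensitive("title", turkish).equals("TITLE")); // false
        System.out.println(localeNeutral("title").equals("TITLE"));            // true
    }
}
```

Because the change is confined to the argument list of one method invocation, the surrounding, non-buggy code (`value.toUpperCase`) serves as the context that identifies where the pattern applies.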
> Take-away
RQ1:
1. APR scope should be extended to declaration entities.
2. APR changes can be prioritized on a few specific statement types.
3. Move action can be ignored by APR tools.
4. Real-world patches support further investigation in a fine-grained way.
RQ2:
1. APR scope should be extended to modifiers.
2. Buggy non-primitive types could be a new direction for APR.
RQ3:
1. APR changes can be prioritized on a few specific expression types.
2. Buggy literal expressions raise a new challenge for APR.
RQ4:
Non-buggy part of expressions could provide context for fix pattern mining at the expression level.
> Summary
https://github.com/AutoProRepair/PatchParser

Editor's Notes

  1. Chart_17, Lang_4: no APR tool can fix bugs related to non-primitive types.
  2. Some bugs are also related to literal expressions.