Abhik Roychoudhury, National University of Singapore
Satish Chandra, Google
PROGRAM REPAIR
& AUTO-CODING
Talk for ICSE 2023 10-year
Most Influential Paper Award
ICSE2023 MIP Award Talk
Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury
School of Computing, National University of Singapore, Singapore
Satish Chandra
IBM Research, USA
SemFix: ICSE 2013
In this paper, we present an automated repair method based on symbolic
execution, constraint solving and program synthesis.
Applications: Education, Correctness, Security
[Diagram: Buggy Program + Tests → Program Repair → Patched Program]
PROGRAM REPAIR
[Diagram: Baseline Codebase (Initial Req) → Modified Codebase (Augmented Req)]
[Diagram: Intelligent Tutoring System for first-year programming (CS1010S, ...)]
THE DEBUGGING PROBLEM
RETROSPECTIVE: DEBUGGING
Source of specifications:
• Previous version
• Another implementation
• Tests … ?
Dagstuhl Seminar 13061
Fault Prediction, Localization, and Repair, Feb 2013
Repair exploiting symbolic execution
• Scalability with respect to search space size.
Repair via constraint solving
• Synthesize rather than lifting fixes from elsewhere.
Repair without formal specifications
• Do not depend on any being available.
From https://www.dagstuhl.de/13061
SPEC from TESTS
Test ID   a   b   c   oracle        Pass
1        -1  -1  -1   INVALID       yes
2         1   1   1   EQUILATERAL   yes
3         2   2   3   ISOSCELES     yes
4         2   3   2   ISOSCELES     yes
5         3   2   2   ISOSCELES     no
6         2   3   4   SCALENE       no
int triangle(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)
        return INVALID;
    if (a == b && b == c)
        return EQUILATERAL;
    if (a == b || b != c)  // bug !
        return ISOSCELES;
    return SCALENE;
}
Correct fix: (a == b || b == c || a == c)
EXAMPLE
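The six tests and the buggy classifier above can be replayed mechanically. The following is a minimal C sketch, assuming the four result labels are modelled as an enum; the names Shape, Test, TESTS, triangle_buggy, triangle_fixed and count_failures are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { INVALID, EQUILATERAL, ISOSCELES, SCALENE } Shape;

typedef struct { int a, b, c; Shape oracle; } Test;

/* The six tests from the slide, with their expected outputs. */
const Test TESTS[] = {
    {-1, -1, -1, INVALID},
    { 1,  1,  1, EQUILATERAL},
    { 2,  2,  3, ISOSCELES},
    { 2,  3,  2, ISOSCELES},
    { 3,  2,  2, ISOSCELES},
    { 2,  3,  4, SCALENE},
};
const size_t NUM_TESTS = sizeof TESTS / sizeof TESTS[0];

/* Classifier with the buggy isosceles condition from the slide. */
Shape triangle_buggy(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0) return INVALID;
    if (a == b && b == c) return EQUILATERAL;
    if (a == b || b != c) return ISOSCELES;  /* bug */
    return SCALENE;
}

/* Classifier with the corrected condition. */
Shape triangle_fixed(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0) return INVALID;
    if (a == b && b == c) return EQUILATERAL;
    if (a == b || b == c || a == c) return ISOSCELES;
    return SCALENE;
}

/* Count the tests whose actual output differs from the oracle. */
int count_failures(Shape (*classify)(int, int, int)) {
    assert(classify != NULL);
    int failures = 0;
    for (size_t i = 0; i < NUM_TESTS; i++) {
        const Test *t = &TESTS[i];
        if (classify(t->a, t->b, t->c) != t->oracle) failures++;
    }
    return failures;
}
```

Calling count_failures(triangle_buggy) reports two failing tests, (3,2,2) and (2,3,4), while count_failures(triangle_fixed) reports none.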
Accumulated constraints
f(2,2,3) == true ∧
f(2,3,2) == true ∧
…
Find an f satisfying this constraint
by fixing the set of operators appearing in f:
program synthesis with a fixed set of operators
int triangle(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)
        return INVALID;
    if (a == b && b == c)
        return EQUILATERAL;
    if (f(a, b, c))
        return ISOSCELES;
    return SCALENE;
}
Symbolic Execution with a==2, b==2, c==3: f(2,2,3) = true
SPEC. from TESTS
Automatically generate the constraint
f (2,2,3) ^ f (2,3,2) ^ f (3,2,2) ^ ¬ f (2,3,4)
Solution
f(a,b,c) = (a == b || b == c || a == c)
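As a rough illustration of synthesis over a fixed set of operators, the C sketch below enumerates disjunctions of equality and disequality comparisons between a, b and c, smallest first, and returns the first one that satisfies the four constraints. Brute-force enumeration stands in for the constraint solver here; the atom set, the bitmask encoding and all names (eval_atom, eval_candidate, synthesize, SLIDE_CONSTRAINTS) are assumptions made for this example, not the SemFix implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_ATOMS 6  /* a==b, b==c, a==c, a!=b, b!=c, a!=c */

typedef struct { int a, b, c; bool expected; } Constraint;

/* f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4), as on the slide. */
const Constraint SLIDE_CONSTRAINTS[4] = {
    {2, 2, 3, true}, {2, 3, 2, true}, {3, 2, 2, true}, {2, 3, 4, false},
};

/* Evaluate one atomic comparison on a concrete input. */
bool eval_atom(int atom, int a, int b, int c) {
    switch (atom) {
        case 0: return a == b;
        case 1: return b == c;
        case 2: return a == c;
        case 3: return a != b;
        case 4: return b != c;
        default: return a != c;
    }
}

/* A candidate is the disjunction of the atoms whose bit is set in mask. */
bool eval_candidate(unsigned mask, int a, int b, int c) {
    for (int atom = 0; atom < NUM_ATOMS; atom++)
        if (((mask >> atom) & 1u) && eval_atom(atom, a, b, c)) return true;
    return false;
}

int count_bits(unsigned mask) {
    int bits = 0;
    for (; mask != 0; mask >>= 1) bits += (int)(mask & 1u);
    return bits;
}

/* Return the smallest disjunction consistent with all constraints,
   encoded as a bitmask over the atoms, or 0 if none exists. */
unsigned synthesize(const Constraint *cs, int n) {
    assert(cs != 0 && n >= 0);
    for (int size = 1; size <= NUM_ATOMS; size++) {
        for (unsigned mask = 1; mask < (1u << NUM_ATOMS); mask++) {
            if (count_bits(mask) != size) continue;
            bool ok = true;
            for (int i = 0; i < n && ok; i++)
                ok = eval_candidate(mask, cs[i].a, cs[i].b, cs[i].c) == cs[i].expected;
            if (ok) return mask;
        }
    }
    return 0;
}
```

On the slide's constraints, synthesize(SLIDE_CONSTRAINTS, 4) returns the mask 0x7, which decodes to a == b || b == c || a == c. Any candidate containing a disequality is rejected by ¬f(2,3,4), and each of the three positive constraints forces one of the equalities.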
“Program testing and program proving can
be considered as extreme alternatives. …
This paper describes a practical approach
between these two extremes…
Each symbolic execution result may be equivalent
to a large number of normal tests”
TESTING/
VERIFICATION
...
SYMBOLIC EXECUTION (1976)
Specification Inference
In the absence of formal specifications,
analyze the buggy program and its
artifacts such as execution traces via
various heuristics to glean a
specification about how it can pass tests
and what could have gone wrong!
int triangle(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)
        return INVALID;
    if (a == b && b == c)
        return EQUILATERAL;
    if (f(a, b, c))  // X
        return ISOSCELES;
    return SCALENE;
}
Symbolic Execution with a==2, b==2, c==3: X = true
SE for REPAIR
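The inference step above, finding which value X must take for a test to pass, can be sketched in C as follows. This is a simplification under stated assumptions: instead of treating X as a symbolic variable, the sketch re-runs the program concretely with X forced to true and to false, which is only feasible because the hole is a single Boolean. The names triangle_with_hole, HoleSpec and infer_hole_spec are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { INVALID, EQUILATERAL, ISOSCELES, SCALENE } Shape;

/* What a single test tells us about the value X at the suspicious condition. */
typedef enum { NO_VALUE_WORKS, MUST_BE_FALSE, MUST_BE_TRUE, UNCONSTRAINED } HoleSpec;

/* The triangle program with the suspicious condition replaced by a hole x. */
Shape triangle_with_hole(int a, int b, int c, bool x) {
    if (a <= 0 || b <= 0 || c <= 0) return INVALID;
    if (a == b && b == c) return EQUILATERAL;
    if (x) return ISOSCELES;
    return SCALENE;
}

/* Try both values of the hole and record which of them make the test pass. */
HoleSpec infer_hole_spec(int a, int b, int c, Shape oracle) {
    assert(oracle >= INVALID && oracle <= SCALENE);
    bool ok_true = triangle_with_hole(a, b, c, true) == oracle;
    bool ok_false = triangle_with_hole(a, b, c, false) == oracle;
    if (ok_true && ok_false) return UNCONSTRAINED;  /* hole is never reached */
    if (ok_true) return MUST_BE_TRUE;
    if (ok_false) return MUST_BE_FALSE;
    return NO_VALUE_WORKS;
}
```

For the test (2,2,3) with oracle ISOSCELES the result is MUST_BE_TRUE, matching X = true on the slide; for (2,3,4) with oracle SCALENE it is MUST_BE_FALSE; tests that return before the hole, such as (-1,-1,-1), leave X unconstrained.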
[Execution tree over the branch conditions, each with Yes/No outcomes:
  a <= 0 || b <= 0 || c <= 0
  a == b && b == c
  a == b || b != c
Sample inputs shown in the tree: (1,1,1), (1,1,2), (2,3,4)]
SE for TESTING
int triangle(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)
        return INVALID;
    if (a == b && b == c)
        return EQUILATERAL;
    if (a == b || b != c)
        return ISOSCELES;
    return SCALENE;
}
Sample input: (-1,-1,-1)
REPAIR / TESTING
[Execution tree over the branch conditions, each with Yes/No outcomes:
  a <= 0 || b <= 0 || c <= 0
  a == b && b == c
  a == b || b != c
Sample inputs shown in the tree: (1,1,1), (1,1,2), (2,3,4), (-1,-1,-1)]
int triangle(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)
        return INVALID;
    if (a == b && b == c)
        return EQUILATERAL;
    if (f(a, b, c))  // X
        return ISOSCELES;
    return SCALENE;
}
Symbolic Execution with a==2, b==2, c==3: X = true / X = f(2,2,3)
[Diagram: overview of automated program repair. Inputs: Buggy Program, Passing & failing tests, Code corpus, Code transformations. Shared step: Fault localization. Output: Patch.]
• Search-based Repair: Generate patch candidates, Validate patch candidates
• Semantic Repair (SemFix paper comes here): Extract constraints, Synthesize code via constraint solving
• Learning-based Repair: Learning/Inference, Model of patches, Predict patch
SUMMARY
Semantics-based Schematic
for t in Tests {
    generate repair constraint
}
Synthesize e from the repair constraints generated over all t
A BRIEF HISTORY OF 2013-2023
ML makes significant inroads into software tools
• code completion
• code search and recommendation
• troubleshooting
• test selection
• …
• and of course, automated program repair!
From research to mainstream in less than 10 years
A new era of
software tools
Large code repositories
aka “big code”
Huge progress in ML
esp. in deep learning
ML COMES TO
AUTOMATED PROGRAM REPAIR
Immense amount of code change data available on past fixes
• Sometimes even aligned with bug symptom
ML problem
• Given a potentially buggy code fragment, predict an edit
Software tool problem
• Localize the error [as before]
• Predict an edit [ML problem]
• Validate that the edited code works [as before]
[Diagram: learning-based repair pipeline. Inputs: Buggy Program, Passing & failing tests, Code corpus, Code transformations. Steps: Fault localization, Learning/Inference, Model of repair, Predict patch. Output: Patch.]
GETAFIX (META, 2019)
[Figure: commit history with per-commit line changes (+10, +2, +35, -10, -1, -7, +1, -1) for commits 42294d, 5cdd7c, 1ee3fc, 181d81, 1d89b2, f54c2d]
public int getWidth() {
@Nullable View v = this.getView();
- return v.getWidth();
+ return v != null ? v.getWidth() : 0;
}
Bader et al., Getafix: Learning to Fix Bugs Automatically, OOPSLA 2019
Concrete edit 1: x.foo()  →  x != null ? x.foo() : default
Concrete edit 2: y.bar()  →  y != null ? y.bar() : default
Generalized pattern: α.β()  →  α != null ? α.β() : default
Pattern discovery by anti-unification
Pattern application by probabilistic calculation
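To make the idea of anti-unification concrete, here is a small C sketch that generalizes two fixes into one pattern with holes. It is a deliberate simplification: Getafix anti-unifies edit trees, whereas this sketch works on flat token sequences of equal length. The two example fixes are modelled on the null-check fix in the getWidth diff; the token arrays FIX_X and FIX_Y, the output buffer PATTERN, the "$0"/"$1" hole notation and the function anti_unify are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 16
#define MAX_TOKEN_LEN 16

/* Two concrete fixes, pre-tokenized (illustrative data). */
const char *FIX_X[] = {"x", "!=", "null", "?", "x", ".", "foo", "(", ")", ":", "default"};
const char *FIX_Y[] = {"y", "!=", "null", "?", "y", ".", "bar", "(", ")", ":", "default"};

/* Output buffer for the generalized pattern. */
char PATTERN[MAX_TOKENS][MAX_TOKEN_LEN];

/* Anti-unify two token sequences of equal length n.
   Tokens that agree are copied; tokens that differ become holes $0, $1, ...
   The same (left, right) pair always maps to the same hole, so repeated
   variables stay linked. Returns the number of distinct holes, or -1 if n
   is out of range. */
int anti_unify(const char *left[], const char *right[], int n,
               char out[][MAX_TOKEN_LEN]) {
    int first_pos[MAX_TOKENS];  /* position where each hole's pair first occurred */
    int holes = 0;
    if (n < 0 || n > MAX_TOKENS) return -1;
    for (int i = 0; i < n; i++) {
        if (strcmp(left[i], right[i]) == 0) {
            snprintf(out[i], MAX_TOKEN_LEN, "%s", left[i]);
            continue;
        }
        int hole = -1;
        for (int k = 0; k < holes; k++) {
            int p = first_pos[k];
            if (strcmp(left[p], left[i]) == 0 && strcmp(right[p], right[i]) == 0) {
                hole = k;
                break;
            }
        }
        if (hole < 0) {
            hole = holes;
            first_pos[holes++] = i;
        }
        snprintf(out[i], MAX_TOKEN_LEN, "$%d", hole);
    }
    return holes;
}
```

Calling anti_unify(FIX_X, FIX_Y, 11, PATTERN) yields two holes and the pattern $0 != null ? $0 . $1 ( ) : default, where $0 plays the role of α and $1 the role of β.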
GETAFIX (META, 2019)
• Developers are picky about their code – semantic equivalence is not enough
• Emphasis on ranking and picking the most likely pattern – no budget to compile multiple fixes
• Convenient UI integration is important
Where has ML helped?
• Generalization in fix patterns
• Productive on static analysis errors, build errors, etc. (somewhat narrow domain, spec is easier)
CHALLENGES IN APR
Continued challenges
• Patch accuracy: tests may not capture the full spec
• Localization continues to be a challenge
[Diagram: GitHub Copilot. Components: Private Code, GitHub Copilot Service, OpenAI GPT-4 Model, GitHub, Public code and text on the internet. Flows: Provide Editor context, Provide Suggestions, Improve Suggestions.]
PROSPECTIVE: 2022-23
Modern LLMs trained on large code corpora have shown surprising capabilities (beyond code completion) out-of-the-box, and many more are accessible with few-shot prompting. These capabilities have a significant impact on research and on the profession.
PROGRAM REPAIR IN THE ERA OF ML-GENERATED CODE
1. ML-generated code does not mean bugs will not appear. In production, new unforeseen or untested conditions might occur, so the need to fix failures will remain.
2. Models will improve, becoming more predictable and avoiding the more routine kinds of bugs.
3. Prompts used in code generation might themselves become the entity of record, in which case the notion of "repair" might apply to prompts too.
4. The question will remain of when ML-generated code can be “trusted” enough to be integrated into your SW project!
Steering Search Specification Inference
GRADUAL CORRECTNESS
"EVIDENCE" from REPAIR
Automated Repair of Programs from Large Language Models, ICSE23.
Trustworthy Software
TRUSTED AUTOMATED PROGRAMMING
Repair techniques on code from LLMs
Evidence generation via repair
