Trustworthy Software & Automated Program Repair
Abhik Roychoudhury
Professor, National University of Singapore
MPI Distinguished Lecture 2019
Working in Program Analysis and Software Security (2001-)
Singapore Cybersecurity Consortium (2016)
National Satellite of Excellence in Trustworthy Software Systems (2019)
Comprehensive university with Science, Arts, Engineering, Medicine, Law, Business, Public Policy, Music, Computing, …
Public university: 30K undergraduate students, 10K+ graduate students overall.
2500+ faculty members overall, 100+ in Computing (two departments, CS and IS).
http://www.nus.edu.sg/about#corporate-information
Snapshot of the talk
• Search problems in software error detection.
• Fuzzing and symbolic execution: random search and logical analysis.
• Random search techniques are becoming more effective.
• [PRELUDE]
• The problem of program repair, as opposed to error detection.
• Symbolic techniques produce higher-quality patches than random or biased random search.
• Novel view of symbolic execution for spec. inference.
• [MAIN PART of the TALK]
Trustworthy SW: Space of Problems (Search Problems?)
• Fuzz testing
– Feed semi-random inputs to find hangs and crashes
• Continuous fuzzing
– Incrementally find new “problems” in software
• Crash reproduction
– Reconstruct a reported crash; the crashing input is not included due to privacy
• Reaching nooks and corners
• Localizing reported observable errors
• Patching reported errors from input-output examples
Trustworthy SW: Search Problems
• Random Search
– Less systematic
– Easy set-up, execute up to a time budget
– Use objective function to steer search.
• Symbolic Execution
– Systematic
– More involved set-up, solver calls.
– Use logical formula to steer search.
Use of Random Search: Fuzzing
Input: Seed inputs S
T✗ = ∅
T = S
if T = ∅ then
    add empty file to T
end if
repeat
    t = chooseNext(T)
    p = assignEnergy(t)
    for i from 1 to p do
        t' = mutate_input(t)
        if t' crashes then
            add t' to T✗
        else if isInteresting(t') then
            add t' to T
        end if
    end for
until timeout reached or abort-signal
Output: Crashing inputs T✗
Intuition [CCS16, and its adoption]
if (condition1)
    return // short path, frequented by many, many inputs
else if (condition2)
    exit // short path, frequented by many inputs
else ….
Results
$$p(i) = \begin{cases} 0 & \text{if } f(i) > \mu \\ \min\left(\dfrac{\alpha(i)}{\beta} \cdot 2^{s(i)},\ M\right) & \text{otherwise} \end{cases}$$
α(i) energy assigned to the input exercising path i by the baseline AFL schedule
β is a constant
s(i) #times the input exercising path i has been chosen for fuzzing
f(i) #fuzz exercising path i (path-frequency)
µ mean #fuzz exercising a discovered path (avg. path-frequency)
M maximum energy expendable on a state
Integrated into the main line of the AFL fuzzer within a year of publication (CCS16).
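The energy formula on this slide can be sketched as a small C function. This is only an illustration under stated assumptions: the name `assign_energy` and its parameter list are hypothetical, `alpha` stands for the energy a baseline schedule would hand out (it is passed in rather than computed), and the power of two is built by repeated doubling so that the cap M is applied before the value can overflow.

```c
#include <assert.h>

/* Cut-off exponential energy for the input exercising path i.
 * alpha: energy a baseline schedule would assign (assumed to be given)
 * beta:  constant divisor
 * s:     number of times this input was already chosen for fuzzing
 * f:     path frequency, mu: mean path frequency, M: energy cap
 * All names are hypothetical and only mirror the symbols on the slide. */
double assign_energy(double alpha, double beta, unsigned s,
                     double f, double mu, double M) {
    assert(beta > 0.0 && M >= 0.0);
    if (f > mu)
        return 0.0;              /* high-frequency path: no energy for now */
    double energy = alpha / beta;
    for (unsigned k = 0; k < s; k++) {
        energy *= 2.0;           /* builds (alpha / beta) * 2^s step by step */
        if (energy >= M)
            return M;            /* cap reached, stop doubling early */
    }
    return energy < M ? energy : M;
}
```

With alpha = beta = 1, an input on a low-frequency path that has been chosen three times would receive energy 8, while any input whose path frequency exceeds the mean receives 0.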
void SEARCH(int A[], int L, int U, int X, int *found, int *j) {
    *found = 0;
    while (L <= U && *found == 0) {
        *j = (L + U) / 2;
        if (X == A[*j]) { *found = 1; }
        else if (X < A[*j]) { U = *j - 1; }
        else { L = *j + 1; }
    }
    if (*found == 0) { *j = L - 1; }
}
SEARCH(A, 1, 5, 20, &found, &j)
SEARCH(A, 1, 5, X, &found, &j)
SEARCH(A, N, N+4, X, &found, &j)
SEARCH(A, 1, M, X, &found, &j)
Testing ?
Comprehension ??
Verification ???
Blurring the lines
“Program testing and program proving can be considered as extreme alternatives. …. This paper describes a practical approach between these two extremes … Each symbolic execution result may be equivalent to a large number of normal tests”
Symbolic Execution
int test_me(int Climb, int Up) {
    int sep, upward;
    if (Climb > 0) { sep = Up; }
    else { sep = add100(Up); }
    if (sep > 150) { upward = 1; }
    else { upward = 0; }
    if (upward < 0) { abort(); }
    else return upward;
}
Execute IF(r): FORK [provided r is unresolved]
Then: PC := PC ∧ r
Else: PC := PC ∧ ¬r
Execute IF(r): resolve branch condition r using concrete values
Suppose true, PC := PC ∧ r, OR
Suppose false, PC := PC ∧ ¬r
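To make the path-condition update concrete, here is a minimal sketch that runs test_me on concrete inputs and records, as a string, the conjunct added at each input-dependent branch. It is an illustration only: `trace_test_me` and `pc_and` are hypothetical names, `add100(Up)` is assumed to compute `Up + 100`, and the final `upward < 0` check is left out because `upward` holds a constant there and adds nothing to the path condition.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one conjunct to the path condition string (truncates if full). */
void pc_and(char *pc, size_t cap, const char *conjunct) {
    size_t len = strlen(pc);
    snprintf(pc + len, cap - len, "%s%s", len ? " && " : "", conjunct);
}

/* Concrete run of test_me that also records the path condition.
 * Assumption: add100(Up) behaves like Up + 100. */
int trace_test_me(int Climb, int Up, char *pc, size_t cap) {
    int sep, upward;
    const char *sep_expr;   /* symbolic expression currently stored in sep */
    char buf[64];
    assert(cap > 0);
    pc[0] = '\0';
    if (Climb > 0) {
        sep = Up;
        sep_expr = "Up";
        pc_and(pc, cap, "Climb > 0");
    } else {
        sep = Up + 100;
        sep_expr = "Up + 100";
        pc_and(pc, cap, "!(Climb > 0)");
    }
    if (sep > 150) {
        upward = 1;
        snprintf(buf, sizeof buf, "%s > 150", sep_expr);
    } else {
        upward = 0;
        snprintf(buf, sizeof buf, "!(%s > 150)", sep_expr);
    }
    pc_and(pc, cap, buf);
    return upward;
}
```

Calling `trace_test_me(1, 200, pc, sizeof pc)` follows the then-branches and leaves `Climb > 0 && Up > 150` in `pc`. A symbolic executor would instead fork at each unresolved branch and keep both conjunctions.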
Fuzzing vs. Symbolic Execution: Bug Finding
- Symbolic execution tree construction, e.g. KLEE [modeling the system environment]
- Grey-box fuzz testing for systematic path exploration, inspired by concolic execution, e.g. AFLFast
Fuzzing vs. Symbolic Execution: Reachability Analysis [CCS17]
Reachability of a location in the program
- Traverse the symbolic execution tree using search strategies, e.g. Hercules
- Encode it as an optimization problem inside the genetic search of grey-box fuzzing, e.g. AFLGo
Fuzzing vs Symbolic Execution
φ1 = (x>y)∧(x+y>10)
φ2 = ¬(x>y)∧(x+y>10)
Directed Fuzzing as optimization problem!
1. Instrumentation time:
• Instrument program to aggregate distance values.
2. Runtime, for each input:
• Decide how long to be fuzzed based on distance.
• If input is closer to the targets, it is fuzzed for longer.
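The runtime step (closer inputs are fuzzed for longer) can be illustrated with a deliberately simple schedule. This is a sketch under assumptions, not the schedule of any particular tool: both function names are hypothetical, the distance is assumed to be already aggregated per input, and energy falls off linearly with the normalised distance.

```c
#include <assert.h>

/* Normalise a seed's distance to [0, 1] given the smallest and largest
 * distance observed so far (hypothetical helper). */
double normalized_distance(double d, double min_d, double max_d) {
    assert(min_d <= d && d <= max_d);
    if (max_d == min_d)
        return 0.0;               /* all seeds equally far: treat as closest */
    return (d - min_d) / (max_d - min_d);
}

/* Hypothetical linear schedule: the closest seed gets max_energy, the
 * farthest gets min_energy, everything else lies in between. */
double directed_energy(double d, double min_d, double max_d,
                       double min_energy, double max_energy) {
    double nd = normalized_distance(d, min_d, max_d);
    return max_energy - nd * (max_energy - min_energy);
}
```

The only property the sketch is meant to show is monotonicity: a smaller distance to the targets never yields less fuzzing time.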
Digression: Fuzzing vs Symbolic Execution
Neuro-symbolic execution [NDSS19]
(Ack: figure from P Saxena and co-authors)
Trustworthy SW: Search Problems
• Random Search
– Less systematic
– Easy set-up, execute up to a time budget
– Use objective function to steer search.
– Enhance the effectiveness of search, with symbolic execution as inspiration
• Symbolic Execution
– More involved set-up, solver calls.
– Use logical formula to steer search.
• Novel view of symbolic execution for spec. inference
– Beyond error detection, self-healing
Beyond Error Detection
In the absence of formal specifications, analyze the buggy program and its artifacts, such as execution traces, via various heuristics to glean a specification about how it can pass tests and what could have gone wrong!
Specification Inference (application: self-healing)
Program Repair
[Figure labels: Buggy Program; Tests]
Repair: Why?
Education, Productivity, Security
Search
Applicability, Scalability, Over-fitting
Large program? Large search space?
(Ack: figure from C Le Goues)
Over-fitting
[Figure labels: Tests with oracles; Buggy Program; Symbolic Formulae; Program Repair; Patched Program]
Example
Test id  a   b   c   oracle       Pass
1        -1  -1  -1  INVALID      yes
2        1   1   1   EQUILATERAL  yes
3        2   2   3   ISOSCELES    yes
4        2   3   2   ISOSCELES    yes
5        3   2   2   ISOSCELES    no
6        2   3   4   SCALENE      no
1 int triangle(int a, int b, int c){
2   if (a <= 0 || b <= 0 || c <= 0)
3     return INVALID;
4   if (a == b && b == c)
5     return EQUILATERAL;
6   if (a == b || b != c) // bug!
7     return ISOSCELES;
8   return SCALENE;
9 }
Correct fix
(a == b || b == c || a == c)
Traverse all mutations of line 6 ??
Hard to generate the fix, since (a == c) or (c == a) never appears anywhere else in the program!
Example (continued)
Automatically generate the constraint
f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4)
Solution
f(a,b,c) = (a == b || b == c || a == c)
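The step from the constraint to a solution can be sketched as a tiny enumerative search. The search space here is an assumption made for illustration: a candidate is a disjunction of a subset of the three equality atoms, encoded as a 3-bit mask. The function names are hypothetical, and a real semantic repair tool would hand the constraint to a synthesizer or solver instead of enumerating.

```c
#include <assert.h>
#include <stdbool.h>

/* Candidate condition for line 6: the disjunction of the atoms selected by
 * a 3-bit mask (hypothetical search space).
 * bit 0: a == b, bit 1: b == c, bit 2: a == c */
bool eval_candidate(unsigned mask, int a, int b, int c) {
    assert(mask < 8u);
    return ((mask & 1u) && a == b) ||
           ((mask & 2u) && b == c) ||
           ((mask & 4u) && a == c);
}

/* Repair constraint collected from the tests that reach line 6:
 * f(2,2,3), f(2,3,2) and f(3,2,2) must hold, f(2,3,4) must not. */
bool satisfies_constraint(unsigned mask) {
    return eval_candidate(mask, 2, 2, 3) &&
           eval_candidate(mask, 2, 3, 2) &&
           eval_candidate(mask, 3, 2, 2) &&
           !eval_candidate(mask, 2, 3, 4);
}

/* Returns the first mask meeting the constraint, or 0 if none does. */
unsigned synthesize_condition(void) {
    for (unsigned mask = 1; mask < 8; mask++)
        if (satisfies_constraint(mask))
            return mask;
    return 0;
}
```

In this space only the mask selecting all three atoms satisfies the constraint, which corresponds to `a == b || b == c || a == c`.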
Comparison
Syntax-based Schematic
1. Where to fix, which line?
2. Generate patches in the candidate line
3. Validate the candidate patches against correctness criterion.
for e in Search-space {
    Validate e against Tests
}
Semantics-based Schematic
1. Where to fix, which line(s)?
2. What values should be returned by those lines, e.g. <inp == 1, ret == 0>
3. What are the expressions which will return such values?
for t in Tests {
    generate repair constraint Ψt
}
Synthesize e from ∧t Ψt
Specification Inference [ICSE13]
var = f(live_vars) // X
[Figure labels: Program; Test input t; Concrete Execution; Concrete values; Symbolic execution; Oracle (expected output); Output: Value-set or Constraint]
Example
inhibit  up_sep  down_sep  Observed o/p  Oracle  Pass
1        0       100       0             0       yes
1        11      110       0             1       no
0        100     50        1             1       yes
1        -20     60        0             1       no
0        0       10        0             0       yes
int is_upward(int inhibit, int up_sep, int down_sep) {
    int bias;
    if (inhibit)
        bias = down_sep; // bug; correct: bias = up_sep + 100
    else
        bias = up_sep;
    if (bias > down_sep)
        return 1;
    else
        return 0;
}
Debugging
• Given a test-suite T
– fail(s) ≡ # of failing executions in which s occurs
– pass(s) ≡ # of passing executions in which s occurs
– allfail ≡ Total # of failing executions
– allpass ≡ Total # of passing executions
• allfail + allpass = |T|
• Can also use other metrics like Ochiai.
$$\mathrm{Score}(s) = \frac{\dfrac{\mathrm{fail}(s)}{\mathrm{allfail}}}{\dfrac{\mathrm{fail}(s)}{\mathrm{allfail}} + \dfrac{\mathrm{pass}(s)}{\mathrm{allpass}}}$$
[Figure labels: Buggy Program; Test Suite; Investigate what this statement should be; Generate a fixed statement; Fixed Program; YES; NO]
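The suspiciousness score on this slide is straightforward to compute from coverage counts. The sketch below is illustrative: the function names are hypothetical, and the treatment of a statement that no test executes (score 0) is an assumption, since the slide does not cover that case.

```c
#include <assert.h>
#include <stddef.h>

/* Score(s) = (fail(s)/allfail) / (fail(s)/allfail + pass(s)/allpass). */
double tarantula_score(int fail_s, int pass_s, int allfail, int allpass) {
    assert(allfail > 0 && allpass > 0);
    double f = (double)fail_s / allfail;
    double p = (double)pass_s / allpass;
    if (f + p == 0.0)
        return 0.0;   /* assumption: a never-executed statement scores 0 */
    return f / (f + p);
}

/* Index of the statement with the highest score (first one on ties). */
size_t most_suspicious(const int *fail, const int *pass, size_t n,
                       int allfail, int allpass) {
    assert(n > 0);
    size_t best = 0;
    for (size_t s = 1; s < n; s++)
        if (tarantula_score(fail[s], pass[s], allfail, allpass) >
            tarantula_score(fail[best], pass[best], allfail, allpass))
            best = s;
    return best;
}
```

For the is_upward example, the assignment `bias = down_sep` is executed by both failing tests and by one of the three passing tests, so its score is 1 / (1 + 1/3) = 0.75, the highest among the statements of that function.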
Example
• Accumulated constraints
– f(1,11,110) > 110 ∧
– f(1,0,100) ≤ 100 ∧
– …
• Find an f satisfying this constraint
– By fixing the set of operators appearing in f
• Candidate methods
– Search over the space of expressions
– Program synthesis with a fixed set of operators
– Can also be achieved by second-order constraint solving
• Generated fix
– f(inhibit,up_sep,down_sep) = up_sep + 100
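The "search over the space of expressions" option can be sketched for this example. Everything about the search space is an assumption made for illustration: candidates have the shape `cu*up_sep + cd*down_sep + k` with `cu`, `cd` in {0, 1} and `k` between -100 and 100, and each candidate is validated by re-running the five tests from the example table instead of solving the accumulated constraints symbolically. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int inhibit, up_sep, down_sep, oracle; } Test;

/* The five tests from the example table. */
const Test TESTS[] = {
    {1, 0, 100, 0}, {1, 11, 110, 1}, {0, 100, 50, 1},
    {1, -20, 60, 1}, {0, 0, 10, 0},
};
const size_t NUM_TESTS = sizeof TESTS / sizeof TESTS[0];

/* Hypothetical template for f: cu*up_sep + cd*down_sep + k. */
typedef struct { int cu, cd, k; } Candidate;

/* is_upward with the suspicious assignment replaced by the candidate. */
int is_upward_patched(Candidate f, int inhibit, int up_sep, int down_sep) {
    int bias;
    if (inhibit)
        bias = f.cu * up_sep + f.cd * down_sep + f.k;
    else
        bias = up_sep;
    return bias > down_sep ? 1 : 0;
}

bool passes_all(Candidate f) {
    for (size_t i = 0; i < NUM_TESTS; i++) {
        const Test *t = &TESTS[i];
        if (is_upward_patched(f, t->inhibit, t->up_sep, t->down_sep) != t->oracle)
            return false;
    }
    return true;
}

/* Enumerate the template; returns true and fills *out on success. */
bool synthesize(Candidate *out) {
    assert(out != NULL);
    for (int cu = 0; cu <= 1; cu++)
        for (int cd = 0; cd <= 1; cd++)
            for (int k = -100; k <= 100; k++) {
                Candidate f = {cu, cd, k};
                if (passes_all(f)) { *out = f; return true; }
            }
    return false;
}
```

Within this template the only candidate that passes all five tests is `cu = 1, cd = 0, k = 100`, i.e. `up_sep + 100`, matching the generated fix on the slide.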
Repair Workflow
Simplified Workflow, but [ICSE15]
Applicability, Over-fitting, Scalability
Comparison
Tool                #Programs  Equivalent  Same Loc.  Diff.
SemFix [ICSE13]     44         17%         46%        6.36
DirectFix [ICSE15]  44         53%         95%        2.31
Workflows
Applicability, Over-fitting, Scalability
Angelix
Repair Constraint
• SemFix work (ICSE 2013)
– Example: for an identified expression e to be fixed
– [ X > 0 ] ∧ f(t) == X for each test t
• DirectFix work (ICSE 2015)
– Whole program as repair constraint
– Use the principle of minimality to synthesize a minimal patch.
• Angelix work (ICSE 2016)
– Example: for identified expressions e1, e2, … to be fixed
– [ (X == 1) ∨ (X == 2) ∨ (X == 3) ] ∧ f(t) == X for each test t.
– [ (X == 1 ∧ Y == 1) ∨ (X == 2 ∧ Y == 2) ] ∧ f(t) == X ∧ g(t) == Y for each test t.
Scalability
Subject    LoC
wireshark  2814K
php        1046K
gzip       491K
gmp        145K
libtiff    77K
Average time: 32 minutes
[Figure: bar chart over wireshark, php, gzip, gmp, libtiff and Overall, comparing Angelix, SPR and GenProg; vertical axis from 0 to 35]
Patch Quality
Experience of others [ICSE16 and its usage]
• “The core technique in Angelix using symbolic execution and program synthesis works well”.
• “It can potentially suffer from poor fault localization”.
• “With better fault localization, the patch synthesis seems hard to improve in effectiveness”
– Can still be improved in terms of efficiency
• [Anecdotal comments only from user groups]
Specification Inference
• Two approaches
– Get properties of function f via symbolic execution, and synthesize a function f satisfying these properties.
– Directly solve for function f by building a second-order symbolic execution engine.
• Allow for existentially quantified second-order variables.
• Restrict their interpretation to a language, e.g. linear integer arithmetic
Term = Var | Constant | Term + Term | Term - Term | Constant * Term
• Example SAT
– ρ(0) > 0 ∧ ρ(1) ≤ 0
– Satisfying solution ρ = λx. 1 - x
Second order Program Repair
Term = Var | Constant | Term + Term | Term - Term | Constant * Term
Buggy Program:
scanf("%d", &x);
for (i = 0; i < 10; i++) {
    int t = ρ(i, x);
    if (t > 0) printf("1");
    else printf("0");
}
Sample Test: P(5) → "1110000000", expected "1111111000"
Synthesis Specification: ∃ρ. ⋁i (πi ∧ output = expected)
Solve for ρ directly
[Figure: symbolic execution tree branching (Yes/No) on ρ(0,5) > 0, then ρ(1,5) > 0, then ρ(2,5) > 0; one path is marked UNSAT]
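For intuition about what a satisfying ρ looks like in this example, the sketch below searches for one by brute force. It is a stand-in for illustration, not the second-order symbolic execution the slides describe: ρ is restricted to the hypothetical template `a*i + b*x + c` (a small slice of the Term grammar) with small coefficients, and every candidate is simply run on the sample test. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical template for rho: a*i + b*x + c. */
typedef struct { int a, b, c; } Rho;

int rho_eval(Rho r, int i, int x) { return r.a * i + r.b * x + r.c; }

/* The loop from the slide, with printf replaced by writes into out.
 * out must have room for 10 characters plus the terminator. */
void run_program(Rho r, int x, char out[11]) {
    for (int i = 0; i < 10; i++)
        out[i] = rho_eval(r, i, x) > 0 ? '1' : '0';
    out[10] = '\0';
}

/* Brute-force search over small coefficients; returns true and fills
 * *found with the first rho whose output matches the expected string. */
bool synthesize_rho(int x, const char *expected, Rho *found) {
    assert(expected != NULL && found != NULL && strlen(expected) == 10);
    char out[11];
    for (int a = -2; a <= 2; a++)
        for (int b = -2; b <= 2; b++)
            for (int c = -10; c <= 10; c++) {
                Rho r = {a, b, c};
                run_program(r, x, out);
                if (strcmp(out, expected) == 0) { *found = r; return true; }
            }
    return false;
}
```

Several instances of the template satisfy the single sample test (for example `x - i + 2`), which is one reason more tests or a minimality criterion are needed to pick among them.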
Encoding for Synthesis
Digression: Library Synthesis
… error_severity(1);
return;
}
// r(ent->fts_info, ent->fts_errno, prev_depth)
else if (ent->fts_info == FTS_SLNONE) {
if (symlink_loop(ent->fts_accpath))
…
(Test-based) Program Repair
Syntax-based Schematic
for e in Search-space {
    Validate e against Tests
}
Semantic Schematic
for t in Tests {
    generate repair constraint Ψt
}
Synthesize e from ∧t Ψt
Middle Road (中道, "the middle way")
Test-equivalence based repair
scanf("%d", &x);
for (i = 0; i < 10; i++)
    if (x - i > 0)
        printf("1");
    else
        printf("0");
Consider all inequalities αx ± βi [> ≥ = ≠] γ
Sequence of values               Equivalence class (x = 4)
{T, T, T, T, T, T, T, T, T, T}   {x > 0, x > 1, …}
{T, T, T, T, T, T, T, T, T, F}   {x - i > -5, …}
{T, T, T, T, T, T, T, T, F, T}   EMPTY
{T, T, T, T, T, T, T, T, F, F}   {x - i > -4, …}
{T, T, T, T, T, T, T, F, T, T}   EMPTY
{T, T, T, T, T, T, T, F, T, F}   EMPTY
{T, T, T, T, T, T, T, F, F, T}   EMPTY
…
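The grouping in this table can be reproduced with a few lines of C. The sketch makes some simplifying assumptions: candidates are limited to the form `alpha*x + beta*i > gamma` (only the `>` comparison), a candidate's behaviour on the test is summarised as a 10-bit signature with one bit per loop iteration, and two candidates are test-equivalent when their signatures coincide. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Candidate branch condition: alpha*x + beta*i > gamma. */
typedef struct { int alpha, beta, gamma; } Cond;

/* Bit i of the result is the branch outcome at loop iteration i. */
unsigned signature(Cond c, int x) {
    unsigned sig = 0;
    for (int i = 0; i < 10; i++)
        if (c.alpha * x + c.beta * i > c.gamma)
            sig |= 1u << i;
    return sig;
}

/* Number of distinct signatures, i.e. test-equivalence classes,
 * among the n candidates for test input x. */
size_t count_classes(const Cond *cands, size_t n, int x) {
    assert(cands != NULL || n == 0);
    unsigned char seen[1024] = {0};  /* one flag per possible 10-bit signature */
    size_t classes = 0;
    for (size_t k = 0; k < n; k++) {
        unsigned sig = signature(cands[k], x);
        if (!seen[sig]) { seen[sig] = 1; classes++; }
    }
    return classes;
}
```

For x = 4, the 42 candidates `x > gamma` and `x - i > gamma` with gamma from -10 to 10 fall into only 11 classes, so the patched program needs to be evaluated once per class rather than once per candidate.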
Efficiency [TOSEM18]
Repair with Fuzzing
Fix2Fit
Integration of repair into programming environments?
The number of plausible patches can be reduced if the tests are empowered with more oracles [ISSTA19]
Provably Correct
From Tests?
From Programs?
Analyzing Linux Busybox [ICSE18]
Other Applications: Education
Education, Productivity, Security
Intelligent tutoring systems: automated grading and hint generation via program repair
Detailed study at IIT Kanpur, India [FSE17, and ongoing]
Repair in steps
Most Relevant Results
Semantic Program Repair Using a Reference Implementation, ICSE 2018.
Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis, ICSE 2016.
DirectFix: Looking for Simple Program Repairs, ICSE 2015.
SemFix: Program Repair via Semantic Analysis, ICSE 2013.
Symbolic execution with second order existential constraints, ESEC-FSE 2018.
Crash-Avoiding Program Repair, ISSTA 2019.
A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments, ESEC-FSE 2017.
ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
http://www.comp.nus.edu.sg/~tsunami/
ACKNOWLEDGEMENT: Sergey Mechtaev, Semantic Program Repair, ACM SIGSOFT Outstanding Doctoral Dissertation.
Other Results (mostly background technology)
Coverage-based Greybox Fuzzing as Markov Chain, CCS 2016.
Directed Greybox Fuzzing, CCS 2017.
Neuro-Symbolic Execution: Augmenting Symbolic Execution with Neural Constraints, NDSS 2019.
Hercules: Reproducing Crashes in Real-World Application Binaries, ICSE 2015.
ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
DSO National Laboratories, Singapore.
Relevant Projects in Singapore
https://www.comp.nus.edu.sg/~nsoe-tss/
https://www.comp.nus.edu.sg/~tsunami/
Consortium: 47 companies
Summary
Figure taken from:
Automated Program Repair
C. Le Goues, M. Pradel, A. Roychoudhury
Review Article, Communications of the ACM, 2019.
Selectively HIRING at VARIOUS LEVELS: Post Docs, …
+
Open Positions for ASST PROFs in NUS CS Department

More Related Content

What's hot

Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...
Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...
Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...Sangmin Park
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrellzukun
 
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
LSRepair: Live Search of Fix Ingredients for Automated Program RepairLSRepair: Live Search of Fix Ingredients for Automated Program Repair
LSRepair: Live Search of Fix Ingredients for Automated Program RepairDongsun Kim
 
Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareSangmin Park
 
TRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationTRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationGeorge Awad
 
TRECVID 2016 : Ad-hoc Video Search
TRECVID 2016 : Ad-hoc Video Search TRECVID 2016 : Ad-hoc Video Search
TRECVID 2016 : Ad-hoc Video Search George Awad
 
TRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionTRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionGeorge Awad
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Gael Varoquaux
 
Open Source Verification under a Cloud (OpenCert 2010)
Open Source Verification under a Cloud (OpenCert 2010)Open Source Verification under a Cloud (OpenCert 2010)
Open Source Verification under a Cloud (OpenCert 2010)Peter Breuer
 
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~Yamagishi Laboratory, National Institute of Informatics, Japan
 
Thesis+of+étienne+duclos.ppt
Thesis+of+étienne+duclos.pptThesis+of+étienne+duclos.ppt
Thesis+of+étienne+duclos.pptPtidej Team
 
Breaking Obfuscated Programs with Symbolic Execution
Breaking Obfuscated Programs with Symbolic ExecutionBreaking Obfuscated Programs with Symbolic Execution
Breaking Obfuscated Programs with Symbolic ExecutionSebastian Banescu
 
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Mateus S. H. Cruz
 
TRECVID 2016 : Instance Search
TRECVID 2016 : Instance SearchTRECVID 2016 : Instance Search
TRECVID 2016 : Instance SearchGeorge Awad
 

What's hot (20)

Symbexecsearch
SymbexecsearchSymbexecsearch
Symbexecsearch
 
Repair dagstuhl jan2017
Repair dagstuhl jan2017Repair dagstuhl jan2017
Repair dagstuhl jan2017
 
DETR ECCV20
DETR ECCV20DETR ECCV20
DETR ECCV20
 
Abhik-Satish-dagstuhl
Abhik-Satish-dagstuhlAbhik-Satish-dagstuhl
Abhik-Satish-dagstuhl
 
Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...
Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...
Griffin: Grouping Suspicious Memory-Access Patterns to Improve Understanding...
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrell
 
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
LSRepair: Live Search of Fix Ingredients for Automated Program RepairLSRepair: Live Search of Fix Ingredients for Automated Program Repair
LSRepair: Live Search of Fix Ingredients for Automated Program Repair
 
Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent Software
 
TRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationTRECVID 2016 : Concept Localization
TRECVID 2016 : Concept Localization
 
TRECVID 2016 : Ad-hoc Video Search
TRECVID 2016 : Ad-hoc Video Search TRECVID 2016 : Ad-hoc Video Search
TRECVID 2016 : Ad-hoc Video Search
 
TRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionTRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text Description
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
 
STAMP
STAMPSTAMP
STAMP
 
CSMR13b.ppt
CSMR13b.pptCSMR13b.ppt
CSMR13b.ppt
 
Open Source Verification under a Cloud (OpenCert 2010)
Open Source Verification under a Cloud (OpenCert 2010)Open Source Verification under a Cloud (OpenCert 2010)
Open Source Verification under a Cloud (OpenCert 2010)
 
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~
エンドツーエンド音声合成に向けたNIIにおけるソフトウェア群 ~ TacotronとWaveNetのチュートリアル (Part 2)~
 
Thesis+of+étienne+duclos.ppt
Thesis+of+étienne+duclos.pptThesis+of+étienne+duclos.ppt
Thesis+of+étienne+duclos.ppt
 
Breaking Obfuscated Programs with Symbolic Execution
Breaking Obfuscated Programs with Symbolic ExecutionBreaking Obfuscated Programs with Symbolic Execution
Breaking Obfuscated Programs with Symbolic Execution
 
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
 
TRECVID 2016 : Instance Search
TRECVID 2016 : Instance SearchTRECVID 2016 : Instance Search
TRECVID 2016 : Instance Search
 

Similar to Automated Program Repair, Distinguished lecture at MPI-SWS

Automated Repair - ISSTA Summer School
Automated Repair - ISSTA Summer SchoolAutomated Repair - ISSTA Summer School
Automated Repair - ISSTA Summer SchoolAbhik Roychoudhury
 
Constraint-Based Fault-Localization
Constraint-Based Fault-LocalizationConstraint-Based Fault-Localization
Constraint-Based Fault-LocalizationMohammed Bekkouche
 
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...IRJET Journal
 
IE-301_OptionalProject_Group2_Report
IE-301_OptionalProject_Group2_ReportIE-301_OptionalProject_Group2_Report
IE-301_OptionalProject_Group2_ReportSarp Uzel
 
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry People
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry PeopleSplunkLive! London 2017 - Using Machine Learning to Feed Hungry People
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry PeopleSplunk
 
Fault Tolerant Parallel Filters Based On Bch Codes
Fault Tolerant Parallel Filters Based On Bch CodesFault Tolerant Parallel Filters Based On Bch Codes
Fault Tolerant Parallel Filters Based On Bch CodesIJERA Editor
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CSusumu Tokumoto
 
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)STAIR Lab, Chiba Institute of Technology
 
Math Functions in C Scanf Printf
Math Functions in C Scanf PrintfMath Functions in C Scanf Printf
Math Functions in C Scanf Printfyarkhosh
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPCVictor Eijkhout
 
answer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdanswer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdSyed Mustafa
 
VTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CVTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CSyed Mustafa
 

Similar to Automated Program Repair, Distinguished lecture at MPI-SWS (20)

16May_ICSE_MIP_APR_2023.pptx
16May_ICSE_MIP_APR_2023.pptx16May_ICSE_MIP_APR_2023.pptx
16May_ICSE_MIP_APR_2023.pptx
 
Automated Repair - ISSTA Summer School
Automated Repair - ISSTA Summer SchoolAutomated Repair - ISSTA Summer School
Automated Repair - ISSTA Summer School
 
Constraint-Based Fault-Localization
Constraint-Based Fault-LocalizationConstraint-Based Fault-Localization
Constraint-Based Fault-Localization
 
slides5.ppt
slides5.pptslides5.ppt
slides5.ppt
 
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...
IRJET- Analysis of Cocke-Younger-Kasami and Earley Algorithms on Probabilisti...
 
Fuzzing.pptx
Fuzzing.pptxFuzzing.pptx
Fuzzing.pptx
 
Ch1
Ch1Ch1
Ch1
 
Ch1
Ch1Ch1
Ch1
 
Cs gate-2011
Cs gate-2011Cs gate-2011
Cs gate-2011
 
Cs gate-2011
Cs gate-2011Cs gate-2011
Cs gate-2011
 
ScaRR
ScaRRScaRR
ScaRR
 
IE-301_OptionalProject_Group2_Report
IE-301_OptionalProject_Group2_ReportIE-301_OptionalProject_Group2_Report
IE-301_OptionalProject_Group2_Report
 
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry People
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry PeopleSplunkLive! London 2017 - Using Machine Learning to Feed Hungry People
SplunkLive! London 2017 - Using Machine Learning to Feed Hungry People
 
Fault Tolerant Parallel Filters Based On Bch Codes
Fault Tolerant Parallel Filters Based On Bch CodesFault Tolerant Parallel Filters Based On Bch Codes
Fault Tolerant Parallel Filters Based On Bch Codes
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
 
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)
制約解消によるプログラム検証・合成 (第1回ステアラボソフトウェア技術セミナー)
 
Math Functions in C Scanf Printf
Math Functions in C Scanf PrintfMath Functions in C Scanf Printf
Math Functions in C Scanf Printf
 
Integrative Parallel Programming in HPC
Integrative Parallel Programming in HPCIntegrative Parallel Programming in HPC
Integrative Parallel Programming in HPC
 
answer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcdanswer-model-qp-15-pcd13pcd
answer-model-qp-15-pcd13pcd
 
VTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in CVTU PCD Model Question Paper - Programming in C
VTU PCD Model Question Paper - Programming in C
 

More from Abhik Roychoudhury

More from Abhik Roychoudhury (6)

IFIP2023-Abhik.pptx
IFIP2023-Abhik.pptxIFIP2023-Abhik.pptx
IFIP2023-Abhik.pptx
 
Art of Computer Science Research Planning
Art of Computer Science Research PlanningArt of Computer Science Research Planning
Art of Computer Science Research Planning
 
Issta13 workshop on debugging
Issta13 workshop on debuggingIssta13 workshop on debugging
Issta13 workshop on debugging
 
Repair dagstuhl
Repair dagstuhlRepair dagstuhl
Repair dagstuhl
 
PAS 2012
PAS 2012PAS 2012
PAS 2012
 
Pas oct12
Pas oct12Pas oct12
Pas oct12
 

Recently uploaded

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Key note speaker Neum_Admir Softic_ENG.pdf


Automated Program Repair, Distinguished lecture at MPI-SWS

  • 1. Trustworthy Software & Automated Program Repair. Abhik Roychoudhury, Professor, National University of Singapore. MPI Distinguished Lecture 2019
  • 2. Working in Program Analysis and Software Security (2001-). Singapore Cybersecurity Consortium (2016). National Satellite of Excellence in Trustworthy Software Systems (2019). Comprehensive university with Science, Arts, Engineering, Medicine, Law, Business, Public Policy, Music, Computing, … Public university: 30K undergraduate students, 10K+ graduate students overall. 2500+ faculty members overall, 100+ in Computing (two departments, CS and IS). http://www.nus.edu.sg/about#corporate-information
  • 3. Snapshot of the talk • Search problems in software error detection. • Fuzzing and symbolic execution: random search and logical analysis. • Random search techniques are becoming more effective. • [PRELUDE] • The problem of program repair, as opposed to error detection. • Symbolic techniques produce higher-quality patches than random or biased random search. • Novel view of symbolic execution for specification inference. • [MAIN PART of the TALK]
  • 4. Trustworthy SW: Space of Problems (Search Problems?) • Fuzz testing – Feed semi-random inputs to find hangs and crashes • Continuous fuzzing – Incrementally find new “problems” in software • Crash reproduction – Reconstruct a reported crash when the crashing input is not included due to privacy • Reaching nooks and corners • Localizing reported observable errors • Patching reported errors from input-output examples
  • 5. Trustworthy SW: Search Problems • Random Search – Less systematic – Easy set-up, execute up to a time budget – Use objective function to steer search. • Symbolic Execution – Systematic – More involved set-up, solver calls. – Use logical formula to steer search.
  • 6. Use of Random Search - Fuzzing
      Input: Seed Inputs S
      T✗ = ∅
      T = S
      if T = ∅ then
        add empty file to T
      end if
      repeat
        t = chooseNext(T)
        p = assignEnergy(t)
        for i from 1 to p do
          t′ = mutate_input(t)
          if t′ crashes then
            add t′ to T✗
          else if isInteresting(t′) then
            add t′ to T
          end if
        end for
      until timeout reached or abort-signal
      Output: Crashing Inputs T✗
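The loop on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the AFL implementation: `mutate_input` overwrites a single byte, `chooseNext` is a uniform random choice, `assignEnergy` returns a constant, and "interesting" means "exercises a path id not seen before". The `toy_program` target and its path ids are hypothetical.

```python
import random


def mutate_input(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (hypothetical mutation operator)."""
    if not data:
        return bytes([rng.randrange(256)])
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def fuzz(program, seeds, iterations=5000, energy=8, rng_seed=0):
    """Run the loop; program(data) returns a path id or raises on a crash."""
    rng = random.Random(rng_seed)
    crashing = []                    # crashing inputs (the slide's T-cross set)
    queue = list(seeds) or [b""]     # T: start from the empty file if no seeds
    seen_paths = set()
    for _ in range(iterations):      # stands in for "until timeout reached"
        t = rng.choice(queue)        # chooseNext: uniform choice (assumption)
        for _ in range(energy):      # assignEnergy: constant energy (assumption)
            child = mutate_input(t, rng)
            try:
                path = program(child)
            except Exception:
                crashing.append(child)
                continue
            if path not in seen_paths:   # isInteresting: new path observed
                seen_paths.add(path)
                queue.append(child)
    return crashing


def toy_program(data: bytes) -> int:
    """Toy target: crashes only on inputs that start with b"BU"."""
    if len(data) > 0 and data[0] == ord("B"):
        if len(data) > 1 and data[1] == ord("U"):
            raise RuntimeError("crash")
        return 1
    return 0


crashes = fuzz(toy_program, [b"AA"])
```

Keeping inputs that reach a new path is what lets the search reach the nested branch in two cheap steps instead of guessing both bytes at once.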
  • 7. Intuition [CCS16, and its adoption]
      if (condition1)
        return   // short path, frequented by very many inputs
      else if (condition2)
        exit     // short path, frequented by many inputs
      else …
  • 8. Results
      p(i) = 0 if f(i) > µ; otherwise p(i) = min((α(i)/β) · 2^s(i), M)
      β: a constant
      s(i): number of times the input exercising path i has been chosen for fuzzing
      f(i): number of generated inputs (fuzz) exercising path i (path frequency)
      µ: mean number of generated inputs exercising a discovered path (average path frequency)
      M: maximum energy expendable on a state
      Integrated into the main line of the AFL fuzzer within a year of publication (CCS16).
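A small sketch of the energy formula on this slide. The numerator of the ratio is treated here as a baseline energy value named `alpha`; that reading, and all parameter names, are assumptions made for illustration.

```python
def assign_energy(alpha, beta, s, f, mu, max_energy):
    """Energy for an input whose path has frequency f.

    alpha      -- baseline energy for the input (assumed meaning)
    beta       -- constant divisor
    s          -- times this input has already been chosen for fuzzing
    f          -- number of generated inputs exercising the same path
    mu         -- mean path frequency over all discovered paths
    max_energy -- upper bound M on the energy
    """
    if f > mu:
        # High-frequency paths get no energy until the mean catches up.
        return 0
    # Otherwise the energy doubles each time the input is chosen, up to M.
    return min((alpha / beta) * 2 ** s, max_energy)
```

Inputs on rarely exercised paths therefore receive exponentially more mutations over time, while inputs on crowded paths are skipped.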
  • 9. Blurring the lines: Testing? Comprehension?? Verification???
      SEARCH(A, L, U, X, found, j) {
        int j, found = 0;
        while (L <= U && found == 0) {
          j = (L + U) / 2;
          if (X == A[j]) { found = 1; }
          else if (X < A[j]) { U = j - 1; }
          else { L = j + 1; }
        }
        if (found == 0) { j = L - 1; }
      }
      SEARCH(A, 1, 5, 20, found, j)
      SEARCH(A, 1, 5, X, found, j)
      SEARCH(A, N, N+4, X, found, j)
      SEARCH(A, 1, M, X, found, j)
      “Program testing and program proving can be considered as extreme alternatives. … This paper describes a practical approach between these two extremes … Each symbolic execution result may be equivalent to a large number of normal tests”
  • 10. Symbolic Execution
      int test_me(int Climb, int Up) {
        int sep, upward;
        if (Climb > 0) { sep = Up; }
        else { sep = add100(Up); }
        if (sep > 150) { upward = 1; }
        else { upward = 0; }
        if (upward < 0) { abort(); }
        else return upward;
      }
  • 11. Symbolic Execution (same test_me program as on slide 10)
      Execute IF(r) symbolically: FORK, provided r is unresolved. Then-branch: PC := PC ∧ r. Else-branch: PC := PC ∧ ¬r.
      Execute IF(r) with the branch condition r resolved using concrete values: suppose true, PC := PC ∧ r; or suppose false, PC := PC ∧ ¬r.
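The forking rule can be illustrated on `test_me` with a small Python sketch. It enumerates the two input-dependent branches by hand and uses a brute-force search over a small input grid in place of a constraint solver; `add100(Up)` is assumed to compute `Up + 100`, and the helper names are hypothetical.

```python
from itertools import product


def explore_test_me():
    """Fork on the two input-dependent branches of test_me.

    Returns (path_condition, outcome) pairs; a path condition is a list of
    (text, predicate) conjuncts over the inputs (Climb, Up).
    """
    paths = []
    for take_then_1, take_then_2 in product([True, False], repeat=2):
        pc = []
        if take_then_1:                       # if (Climb > 0) sep = Up
            pc.append(("Climb > 0", lambda c, u: c > 0))
            sep_text, sep = "Up", (lambda u: u)
        else:                                 # else sep = add100(Up)
            pc.append(("!(Climb > 0)", lambda c, u: not c > 0))
            sep_text, sep = "Up + 100", (lambda u: u + 100)
        if take_then_2:                       # if (sep > 150) upward = 1
            pc.append((f"{sep_text} > 150",
                       lambda c, u, sep=sep: sep(u) > 150))
            upward = 1
        else:                                 # else upward = 0
            pc.append((f"!({sep_text} > 150)",
                       lambda c, u, sep=sep: not sep(u) > 150))
            upward = 0
        # upward is concrete here, so "upward < 0" is resolved without forking.
        outcome = "abort" if upward < 0 else f"return {upward}"
        paths.append((pc, outcome))
    return paths


def find_witness(pc, climbs=(-1, 1), ups=range(-200, 201, 10)):
    """Brute-force stand-in for a solver: search a small grid of inputs."""
    for c, u in product(climbs, ups):
        if all(pred(c, u) for _, pred in pc):
            return c, u
    return None


for pc, outcome in explore_test_me():
    print(" && ".join(text for text, _ in pc), "->", outcome, find_witness(pc))
```

Each path condition summarizes every concrete input that follows the same path, which is the sense in which one symbolic execution stands for many normal tests.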
  • 12. Fuzzing vs. Symbolic Execution: Bug Finding – Symbolic execution tree construction, e.g. KLEE [modeling system environment] – Grey-box fuzz testing for systematic path exploration, inspired by concolic execution: AFLFast
  • 13. Fuzzing vs. Symbolic Execution: Reachability Analysis [CCS17]. Reachability of a location in the program – Traverse the symbolic execution tree using search strategies, e.g. Hercules – Encode it as an optimization problem inside the genetic search of grey-box fuzzing: AFLGo
  • 14. Fuzzing vs. Symbolic Execution. φ1 = (x>y)∧(x+y>10), φ2 = ¬(x>y)∧(x+y>10). Directed fuzzing as an optimization problem! 1. Instrumentation time: instrument the program to aggregate distance values. 2. Runtime, for each input: decide how long it is fuzzed based on its distance. If an input is closer to the targets, it is fuzzed for longer.
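The rule "closer inputs are fuzzed for longer" can be sketched as a distance-to-energy mapping. The function below is a hypothetical schedule chosen for illustration only; the slide does not give the actual formula, and the names `base` and `max_factor` are assumptions.

```python
def energy_from_distance(distance, d_min, d_max, base=16, max_factor=4.0):
    """Map an input's distance to the targets onto a mutation budget.

    The distance is normalized to [0, 1] over the distances seen so far.
    The closest input gets base * max_factor mutations, the farthest gets
    base / max_factor, and an input in the middle gets base.
    """
    if d_max == d_min:
        norm = 0.5                      # no spread yet: use the base budget
    else:
        norm = (distance - d_min) / (d_max - d_min)
    return round(base * max_factor ** (1.0 - 2.0 * norm))
```

For example, with distances between 0 and 10, an input at distance 0 would receive 64 mutations and one at distance 10 would receive 4.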
  • 15. Digression: Fuzzing vs. Symbolic Execution. Neuro-symbolic execution [NDSS19] (Ack: figure from P Saxena and co-authors)
  • 16. Trustworthy SW: Search Problems • Random Search – Less systematic – Easy set-up, execute up to a time budget – Use objective function to steer search. – Enhance the effectiveness of search, with symbolic execution as inspiration • Symbolic Execution – More involved set-up, solver calls. – Use logical formula to steer search. • Novel view of symbolic execution for specification inference – Beyond error detection, self-healing
  • 17. Beyond Error Detection: Specification Inference (application: self-healing). In the absence of formal specifications, analyze the buggy program and its artifacts, such as execution traces, via various heuristics to glean a specification about how it can pass tests and what could have gone wrong! Inputs: Buggy Program, Tests
  • 18. Program Repair. REPLACE THIS FLOW. Buggy Program, Tests
  • 19. Repair: Why? Education, Productivity, Security
  • 20. Search: Applicability, Scalability, Over-fitting. Large program? Large search space? (Ack: figure from C Le Goues)
  • 21. Over-fitting. Figure labels: Tests with oracles; Buggy Program; Symbolic Formulae; Program Repair; Patched Program
  • 22. Example
      Test id  a   b   c   oracle       Pass
      1        -1  -1  -1  INVALID      yes
      2        1   1   1   EQUILATERAL  yes
      3        2   2   3   ISOSCELES    yes
      4        2   3   2   ISOSCELES    yes
      5        3   2   2   ISOSCELES    no
      6        2   3   4   SCALENE      no
      1 int triangle(int a, int b, int c) {
      2   if (a <= 0 || b <= 0 || c <= 0)
      3     return INVALID;
      4   if (a == b && b == c)
      5     return EQUILATERAL;
      6   if (a == b || b != c)  // bug!
      7     return ISOSCELES;
      8   return SCALENE;
      9 }
      Correct fix: (a == b || b == c || a == c)
      Traverse all mutations of line 6?? Hard to generate the fix, since (a == c) or (c == a) never appears anywhere else in the program!
  • 23. Example (same tests and program as on slide 22)
      Correct fix: (a == b || b == c || a == c)
      Automatically generate the constraint f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4)
      Solution: f(a,b,c) = (a == b || b == c || a == c)
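The constraint on f can be solved by a small enumerative search, sketched below. The candidate space (disjunctions of up to three equality or disequality atoms over a, b, c) is an assumption made for this illustration; a real tool would use a synthesis engine instead of brute force.

```python
from itertools import combinations

# Candidate atoms for the condition on line 6 (assumed component set).
ATOMS = [
    ("a == b", lambda a, b, c: a == b),
    ("b == c", lambda a, b, c: b == c),
    ("a == c", lambda a, b, c: a == c),
    ("a != b", lambda a, b, c: a != b),
    ("b != c", lambda a, b, c: b != c),
    ("a != c", lambda a, b, c: a != c),
]

# Repair constraint from the tests that reach line 6:
# f must hold on the three ISOSCELES inputs and fail on the SCALENE one.
REPAIR_CONSTRAINT = [
    ((2, 2, 3), True),
    ((2, 3, 2), True),
    ((3, 2, 2), True),
    ((2, 3, 4), False),
]


def synthesize(max_atoms=3):
    """Return the smallest disjunction of atoms satisfying the constraint."""
    for size in range(1, max_atoms + 1):
        for combo in combinations(ATOMS, size):
            if all(any(pred(*inp) for _, pred in combo) == want
                   for inp, want in REPAIR_CONSTRAINT):
                return " || ".join(text for text, _ in combo)
    return None


print(synthesize())  # a == b || b == c || a == c
```

The search finds `a == c` even though it never appears in the program, because candidates are judged against the constraint rather than copied from existing code.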
  • 24. Comparison
      Syntax-based schematic: 1. Where to fix, which line? 2. Generate patches in the candidate line. 3. Validate the candidate patches against the correctness criterion.
        for e in Search-space { Validate e against Tests }
      Semantics-based schematic: 1. Where to fix, which line(s)? 2. What values should be returned by those lines, e.g. <inp == 1, ret == 0>? 3. What are the expressions which will return such values?
        for t in Tests { generate repair constraint Ψt }
        Synthesize e from ∧t Ψt
  • 25. Specification Inference [ICSE13]. Figure labels: Program; Test input t; Concrete Execution; Concrete values; var = f(live_vars) // X; Symbolic execution; Oracle (expected output); Output: Value-set or Constraint
  • 26. Example
      inhibit  up_sep  down_sep  Observed o/p  Oracle  Pass
      1        0       100       0             0       yes
      1        11      110       0             1       no
      0        100     50        1             1       yes
      1        -20     60        0             1       no
      0        0       10        0             0       yes
      int is_upward(int inhibit, int up_sep, int down_sep) {
        int bias;
        if (inhibit)
          bias = down_sep;  // bias = up_sep + 100
        else bias = up_sep;
        if (bias > down_sep)
          return 1;
        else return 0;
      }
  • 27. Debugging • Given a test suite T – fail(s) ≡ # of failing executions in which s occurs – pass(s) ≡ # of passing executions in which s occurs – allfail ≡ total # of failing executions – allpass ≡ total # of passing executions • allfail + allpass = |T| • Score(s) = (fail(s)/allfail) / (fail(s)/allfail + pass(s)/allpass) • Can also use other metrics like Ochiai. Workflow figure labels: Buggy Program; Test Suite; investigate what this statement should be; generate a fixed statement; Fixed Program; YES / NO
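The suspiciousness score on this slide (commonly known as the Tarantula score) can be computed directly from the four counts. The sketch below applies it to the `is_upward` example, assuming that the tests with observed output different from the oracle are the failing ones; the zero-denominator handling is an added assumption.

```python
def suspiciousness(fail_s, pass_s, allfail, allpass):
    """Score(s) = (fail(s)/allfail) / (fail(s)/allfail + pass(s)/allpass)."""
    fail_ratio = fail_s / allfail if allfail else 0.0
    pass_ratio = pass_s / allpass if allpass else 0.0
    total = fail_ratio + pass_ratio
    # A statement executed by no test gets score 0 (assumption).
    return fail_ratio / total if total else 0.0


# is_upward example: 2 failing and 3 passing tests overall.
# "bias = down_sep" runs in the three tests with inhibit != 0 (2 fail, 1 pass).
score_then = suspiciousness(fail_s=2, pass_s=1, allfail=2, allpass=3)
# "bias = up_sep" runs in the two tests with inhibit == 0 (both pass).
score_else = suspiciousness(fail_s=0, pass_s=2, allfail=2, allpass=3)
# "if (bias > down_sep)" runs in every test.
score_cond = suspiciousness(fail_s=2, pass_s=3, allfail=2, allpass=3)

print(score_then, score_else, score_cond)
```

The assignment `bias = down_sep` receives the highest score of the three, so it is the first statement handed to the repair step.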
  • 30. Example • Accumulated constraints – f(1,11,110) > 110 ∧ – f(1,0,100) ≤ 100 ∧ – … • Find an f satisfying this constraint – by fixing the set of operators appearing in f • Candidate methods – Search over the space of expressions – Program synthesis with a fixed set of operators (can also be achieved by second-order constraint solving) • Generated fix – f(inhibit, up_sep, down_sep) = up_sep + 100
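A brute-force version of "search over the space of expressions" for this example is sketched below. The candidate space (one variable plus an integer constant) and the constant range are assumptions for illustration; the three constraints come from the tests with inhibit = 1 in the is_upward table.

```python
# Tests that execute the suspicious assignment: (inhibit, up_sep, down_sep), oracle.
TESTS = [
    ((1, 0, 100), 0),    # requires f(1, 0, 100) <= 100
    ((1, 11, 110), 1),   # requires f(1, 11, 110) > 110
    ((1, -20, 60), 1),   # requires f(1, -20, 60) > 60
]
VARS = ["inhibit", "up_sep", "down_sep"]


def satisfies(f):
    """Check that using f for bias makes every test return its oracle value."""
    for (inhibit, up_sep, down_sep), oracle in TESTS:
        returns_one = f(inhibit, up_sep, down_sep) > down_sep
        if returns_one != bool(oracle):
            return False
    return True


def synthesize(constants=range(-200, 201)):
    """Search expressions of the form <variable> + <constant>."""
    for idx, name in enumerate(VARS):
        for k in constants:
            if satisfies(lambda *args, idx=idx, k=k: args[idx] + k):
                return name, k
    return None


print(synthesize())  # ('up_sep', 100)
```

Within this candidate space the three constraints admit exactly one expression, which matches the generated fix on the slide.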
  • 32. Simplified Workflow, but … [ICSE15]. Applicability, Over-fitting, Scalability
  • 33. Comparison
      Tool                #Programs  Equivalent  Same Loc.  Diff.
      SemFix [ICSE13]     44         17%         46%        6.36
      DirectFix [ICSE15]  44         53%         95%        2.31
  • 34. Workflows: Applicability, Over-fitting, Scalability
  • 36. Repair Constraint • SemFix work (ICSE 2013) – Example: for an identified expression e to be fixed • [ X > 0 ] ∧ f(t) == X for each test t • DirectFix work (ICSE 2015) – Whole program as repair constraint – Use the principle of minimality to synthesize a minimal patch. • Angelix work (ICSE 2016) – Example: for identified expressions e1, e2, … to be fixed – [ (X == 1) ∨ (X == 2) ∨ (X == 3) ] ∧ f(t) == X for each test t. – [ (X == 1 ∧ Y == 1) ∨ (X == 2 ∧ Y == 2) ] ∧ f(t) == X ∧ g(t) == Y for each test t.
  • 37. Scalability
      Subject    LoC
      wireshark  2814K
      php        1046K
      gzip       491K
      gmp        145K
      libtiff    77K
      Average time == 32 minutes
      [Bar chart: one group per subject (wireshark, php, gzip, gmp, libtiff, Overall), axis from 0 to 35; series: Angelix, SPR, GenProg]
  • 39. Experience of others [ICSE16 and its usage] • “The core technique in Angelix using symbolic execution and program synthesis works well”. • “It can potentially suffer from poor fault localization”. • “With better fault localization, the patch synthesis seems hard to improve in effectiveness” – Can still be improved in terms of efficiency • [Anecdotal comments only, from user groups]
  • 40. Specification Inference • Two approaches – Get properties of function f via symbolic execution, and synthesize a function f satisfying these properties. – Directly solve for function f by building a second-order symbolic execution engine. • Allow for existentially quantified second-order variables. • Restrict their interpretation to a language, e.g. linear integer arithmetic: Term = Var | Constant | Term + Term | Term – Term | Constant * Term • Example: SAT – ρ(0) > 0 ∧ ρ(1) ≤ 0 – Satisfying solution: ρ = λx. 1 – x
  • 41. Second order Program Repair
      Term = Var | Constant | Term + Term | Term – Term | Constant * Term
      Buggy program:
        scanf("%d", &x);
        for (i = 0; i < 10; i++) {
          int t = ρ(i, x);
          if (t > 0) printf("1");
          else printf("0");
        }
      Sample test: P(5) outputs "1110000000", expected "1111111000"
      Synthesis specification: ∃ρ. ⋁i (πi ∧ output = expected)
      Solve for ρ directly
  • 42. Second order Program Repair (same program, test and specification as on slide 41)
      [Exploration tree: branches on ρ(0,5) > 0, then ρ(1,5) > 0, then ρ(2,5) > 0, with Yes/No edges; one leaf is marked UNSAT]
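Solving for ρ directly can be imitated on this example by enumerating linear terms and running the program with each candidate. This is only a brute-force sketch: the coefficient ranges are assumptions, and a second-order symbolic execution engine would explore paths and query a solver instead of executing every candidate.

```python
from itertools import product


def run(rho, x):
    """Execute the loop from the slide with a concrete candidate for rho."""
    return "".join("1" if rho(i, x) > 0 else "0" for i in range(10))


def synthesize(x, expected, coeffs=(-1, 0, 1), consts=range(-10, 11)):
    """Search terms ci*i + cx*x + k whose output matches the expected string."""
    for ci, cx, k in product(coeffs, coeffs, consts):
        def rho(i, x, ci=ci, cx=cx, k=k):
            return ci * i + cx * x + k
        if run(rho, x) == expected:
            return (ci, cx, k), rho
    return None


result = synthesize(5, "1111111000")
if result is not None:
    (ci, cx, k), rho = result
    print(f"rho(i, x) = {ci}*i + {cx}*x + {k}")
```

With a single test, several terms satisfy the specification, including ones that ignore x, so further tests or a minimality criterion are needed to select a patch that generalizes.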
  • 43. Encoding for Synthesis
      …
          error_severity(1);
          return;
      } // r(ent->fts_info, ent->fts_errno, prev_depth)
      else if (ent->fts_info == FTSSLNONE) {
          if (symlink_loop(ent->fts_accpath))
      …
  • 45. (Test-based) Program Repair
      Syntax-based schematic: for e in Search-space { Validate e against Tests }
      Semantic schematic: for t in Tests { generate repair constraint Ψt }, then synthesize e from ∧t Ψt
  • 47. Test-equivalence based repair
      scanf("%d", &x);
      for (i = 0; i < 10; i++)
        if (x - i > 0) printf("1");
        else printf("0");
      Consider all inequalities αx ± βi [> ≥ = ≠] γ
      Sequence of values (x = 4)          Equivalence class
      {T, T, T, T, T, T, T, T, T, T}      {x > 0, x > 1, …}
      {T, T, T, T, T, T, T, T, T, F}      {x – i > -5, …}
      {T, T, T, T, T, T, T, T, F, T}      EMPTY
      {T, T, T, T, T, T, T, T, F, F}      {x – i > -4, …}
      {T, T, T, T, T, T, T, F, T, T}      EMPTY
      {T, T, T, T, T, T, T, F, T, F}      EMPTY
      {T, T, T, T, T, T, T, F, F, T}      EMPTY
      …
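The grouping of candidate conditions by the sequence of values they produce can be sketched as follows. The candidate space here (left-hand sides x, x - i, x + i, four comparison operators, and constants from -6 to 6) is a small assumed subset of the inequalities on the slide.

```python
from collections import defaultdict
import operator

OPS = {">": operator.gt, ">=": operator.ge, "==": operator.eq, "!=": operator.ne}
LHS = {
    "x": lambda x, i: x,
    "x - i": lambda x, i: x - i,
    "x + i": lambda x, i: x + i,
}


def equivalence_classes(x, gammas=range(-6, 7)):
    """Group candidate conditions by their truth values over i = 0..9."""
    classes = defaultdict(list)
    for lhs_text, lhs in LHS.items():
        for op_text, op in OPS.items():
            for gamma in gammas:
                signature = tuple(op(lhs(x, i), gamma) for i in range(10))
                classes[signature].append(f"{lhs_text} {op_text} {gamma}")
    return classes


classes = equivalence_classes(4)
all_true = (True,) * 10
print(len(classes), "classes;", len(classes[all_true]), "candidates are always true")
```

Candidates in one class are indistinguishable on this test, so a repair tool only needs to run the test once per class instead of once per candidate.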
  • 50. Fix2Fit [ISSTA19]. Integration of repair into programming environments? Number of plausible patches that can be reduced if the tests are empowered with more oracles
  • 51. Provably Correct: From Tests? From Programs?
  • 52. Analyzing Linux Busybox [ICSE18]
  • 53. Other Applications: Education (Education, Productivity, Security). Intelligent tutoring systems: automated grading and hint generation via program repair. Detailed study at IIT Kanpur, India [FSE17, and ongoing]
  • 54. Repair in steps
  • 55. Most Relevant Results
      Semantic Program Repair Using a Reference Implementation, ICSE 2018.
      Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis, ICSE 2016.
      DirectFix: Looking for Simple Program Repairs, ICSE 2015.
      SemFix: Program Repair via Semantic Analysis, ICSE 2013.
      Symbolic execution with second order existential constraints, ESEC-FSE 2018.
      Crash-Avoiding Program Repair, ISSTA 2019.
      A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments, ESEC-FSE 2017.
      ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore, http://www.comp.nus.edu.sg/~tsunami/
      ACKNOWLEDGEMENT: Sergey Mechtaev, Semantic Program Repair, ACM SIGSOFT Outstanding Doctoral Dissertation.
  • 56. Other Results (mostly background technology)
      Coverage-based Greybox Fuzzing as Markov Chain, CCS 2016.
      Directed Greybox Fuzzing, CCS 2017.
      Neuro-Symbolic Execution: Augmenting Symbolic Execution with Neural Constraints, NDSS 2019.
      Hercules: Reproducing Crashes in Real-World Application Binaries, ICSE 2015.
      ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore; DSO National Laboratories, Singapore.
  • 58. Summary. Figure taken from: Automated Program Repair, C. Le Goues, M. Pradel, A. Roychoudhury, Review Article, Communications of the ACM, 2019. Selectively HIRING at VARIOUS LEVELS: Post Docs, … + Open Positions for ASST PROFs in NUS CS Department