Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

of

APSEC2020 Keynote Slide 1 APSEC2020 Keynote Slide 2 APSEC2020 Keynote Slide 3 APSEC2020 Keynote Slide 4 APSEC2020 Keynote Slide 5 APSEC2020 Keynote Slide 6 APSEC2020 Keynote Slide 7 APSEC2020 Keynote Slide 8 APSEC2020 Keynote Slide 9 APSEC2020 Keynote Slide 10 APSEC2020 Keynote Slide 11 APSEC2020 Keynote Slide 12 APSEC2020 Keynote Slide 13 APSEC2020 Keynote Slide 14 APSEC2020 Keynote Slide 15 APSEC2020 Keynote Slide 16 APSEC2020 Keynote Slide 17 APSEC2020 Keynote Slide 18 APSEC2020 Keynote Slide 19 APSEC2020 Keynote Slide 20 APSEC2020 Keynote Slide 21 APSEC2020 Keynote Slide 22 APSEC2020 Keynote Slide 23 APSEC2020 Keynote Slide 24 APSEC2020 Keynote Slide 25 APSEC2020 Keynote Slide 26 APSEC2020 Keynote Slide 27 APSEC2020 Keynote Slide 28 APSEC2020 Keynote Slide 29 APSEC2020 Keynote Slide 30 APSEC2020 Keynote Slide 31 APSEC2020 Keynote Slide 32 APSEC2020 Keynote Slide 33 APSEC2020 Keynote Slide 34 APSEC2020 Keynote Slide 35 APSEC2020 Keynote Slide 36 APSEC2020 Keynote Slide 37 APSEC2020 Keynote Slide 38 APSEC2020 Keynote Slide 39 APSEC2020 Keynote Slide 40 APSEC2020 Keynote Slide 41 APSEC2020 Keynote Slide 42 APSEC2020 Keynote Slide 43 APSEC2020 Keynote Slide 44 APSEC2020 Keynote Slide 45
Upcoming SlideShare
What to Upload to SlideShare
Next
Download to read offline and view in fullscreen.

0 Likes

Share

Download to read offline

APSEC2020 Keynote

Download to read offline

Keynote given at the Asia Pacific Software Engineering Conference (APSEC), December 2020, on Automated Program Repair technologies and their applications.

Related Audiobooks

Free with a 30 day trial from Scribd

See all
  • Be the first to like this

APSEC2020 Keynote

  1. 1. Automated Program Repair Abhik Roychoudhury National University of Singapore abhik@comp.nus.edu.sg APSEC 2020 Keynote 1 ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
  2. 2. SUPPOSE I AM UNWELL TODAY 2 APSEC 2020 Keynote Which one is more manageable?
  3. 3. Beyond Error Detection APSEC 2020 Keynote In the absence of formal specifications, analyze the buggy program and its artifacts such as execution traces via various heuristics to glean a specification about how it can pass tests and what could have gone wrong! Specification Inference (application: self-healing) 3 Buggy Program Tests
  4. 4. Repair: Why? 4 APSEC 2020 Keynote Education Productivity Security
  5. 5. Search APSEC 2020 Keynote Applicability Scalability Over-fitting Large program? Large search space? 5 Ack: figure from C Le Goues
  6. 6. Program Repair APSEC 2020 Keynote REPLACETHIS FLOW Buggy Program Tests 6
  7. 7. Over-fitting APSEC 2020 Keynote Tests with oracles Buggy Program Symbolic Formulae Program Repair Patched Program 7 Tests: (ip1,op1), (ip2,op2), (ip3,op3), … AVOID if (ip1) return op1 else if (ip2) return op2 else …
  8. 8. Example APSEC 2020 Keynote Test id a b c oracle Pass 1 -1 -1 -1 INVALID 2 1 1 1 EQUILATERAL 3 2 2 3 ISOSCELES 4 2 3 2 ISOSCELES 5 3 2 2 ISOSCELES 6 2 3 4 SCALANE 1 int triangle(int a, int b, int c){ 2 if (a <= 0 || b <= 0 || c <= 0) 3 return INVALID; 4 if (a == b && b == c) 5 return EQUILATERAL; 6 if (a == b || b != c) // bug! 7 return ISOSCELES; 8 return SCALENE; 9 } Correct fix (a == b || b== c || a == c) Traverse all mutations of line 6 ?? Hard to generate fix since (a ==c) or (c ==a) never appear anywhere else in the program ! 8
  9. 9. Example APSEC 2020 Keynote Test id a b c oracle Pass 1 -1 -1 -1 INVALID 2 1 1 1 EQUILATERAL 3 2 2 3 ISOSCELES 4 2 3 2 ISOSCELES 5 3 2 2 ISOSCELES 6 2 3 4 SCALANE 1 int triangle(int a, int b, int c){ 2 if (a <= 0 || b <= 0 || c <= 0) 3 return INVALID; 4 if (a == b && b == c) 5 return EQUILATERAL; 6 if (a == b || b != c) // bug! 7 return ISOSCELES; 8 return SCALENE; 9 } Correct fix (a == b || b== c || a == c) Automatically generate the constraint f(2,2,3)  f(2,3,2)  f(3,2,2)   f(2,3,4) Solution f(a,b,c) = (a == b || b == c || a == c) 9
  10. 10. Comparison Where to fix, which line? Generate patches in the candidate line Validate the candidate patches against correctness criterion. Where to fix, which line(s)? What values should be returned by those lines, • e.g. <inp ==1, ret== 0> What are the expressions which will return such values? APSEC 2020 Keynote Syntax-based Schematic for e in Search-space{ Validate e againstTests } Semantics-basedSchematic for t inTests { generate repair constraintΨt } Synthesize e from ∧tΨt 10
  11. 11. Specification Inference APSEC 2020 Keynote var = f(live_vars) // X Test input t Concrete values Oracle (expected output) Output: Value-set or Constraint Symbolic execution Program Concrete Execution [ICSE13] 11
  12. 12. Example inhibit up_sep down_sep Observed o/p Oracle Pass 1 0 100 0 0 1 11 110 0 1 0 100 50 1 1 1 -20 60 0 1 0 0 10 0 0 APSEC 2020 Keynote 1 int is_upward( int inhibit, int up_sep, int down_sep){ 2 int bias; 3 if (inhibit) 4 bias = down_sep; // bias= up_sep + 100 5 else bias = up_sep ; 6 if (bias > down_sep) 7 return 1; 8 else return 0; 9 } 12
  13. 13. Debugging • Given a test-suiteT – fail(s) º # of failing executions in which s occurs – pass(s) º # of passing executions in which s occurs – allfail ºTotal # of failing executions – allpass º Total # of passing executions • allfail+ allpass = |T| • Can also use other metric likeOchiai. Score(s) = fail(s) allfail fail(s) allfail pass(s) allpass + Buggy Program Test Suite -Investigate what this statement should be. - Generate a fixed statement Fixed Program YES NO APSEC 2020 Keynote 13
  14. 14. Example APSEC 2020 Keynote 14
  15. 15. Symbolic Execution (Inset) APSEC 2020 Keynote int test_me(int Climb, int Up){ int sep, upward; if (Climb > 0){ sep = Up;} else {sep = add100(Up);} if (sep > 150){ upward = 1; } else {upward = 0;} if (upward < 0){ abort; } else return upward; } 15
  16. 16. Example APSEC 2020 Keynote 16
  17. 17. Example APSEC 2020 Keynote • Accumulated constraints – f(1,11, 110) > 110  – f(1,0,100) ≤ 100  – … • Find a f satisfying this constraint – By fixing the set of operators appearing in f • Candidate methods • Search over the space of expressions • Program synthesis with fixed set of operators – Can also be achieved by second-order constraint solving • Generated fix – f(inhibit,up_sep,down_sep) = up_sep + 100 17
  18. 18. Second-order Reasoning APSEC 2020 Keynote 18 • Two approaches – Get property of function f via symbolic execution, and synthesize a function f satisfying these properties. – Directly solve for function f by building a second-order symbolic execution engine. • Allow for existentially quantified second order variables. • Restrict their interpretation to a language e.g. linear integer arithmetic Term =Var |Constant |Term +Term |Term –Term |Constant *Term • Example SAT – (0) > 0  (1) ≤ 0 – Satisfying solution  = x. 1 – x
  19. 19. First order vs. Second order 19 APSEC 2020 Keynote
  20. 20. Combat Over-fitting: Symbolic Inference APSEC 2020 Keynote 20 Tests with oracles Buggy Program Symbolic Formulae Program Repair Patched Program TCAS
  21. 21. Repair Workflow APSEC 2020 Keynote 21
  22. 22. Simplified Workflow, but APSEC 2020 Keynote Applicability Over-fitting Scalability [DirectFix,ICSE15] 22
  23. 23. Workflows APSEC 2020 Keynote Applicability Over-fitting Scalability 23
  24. 24. Angelix APSEC 2020 Keynote 24
  25. 25. Repair Constraint APSEC 2020 Keynote • SemFix work (ICSE 2013) – Example: for an identified expression e to be fixed • [ X > 0 ] ∧ f(t) == X for each test t • DirectFix work (ICSE 2015) – Whole Program as repair constraint – Use the principle of minimality to synthesize a minimal patch. • Angelix work (ICSE 2016) – Example: for identified expressions e1, e2, … to be fixed – [ (X == 1) ∨ (X == 2) ∨ (X== 3)] ∧ f(t) ==X for each test t. – [ (X== 1 ∧Y == 1) ∨ (X==2 ∧Y ==2)] ∧ f(t) ==X ∧g(t)==Y for each test t. 25
  26. 26. PATCH QUALITY 26 APSEC 2020 Keynote
  27. 27. (Test-based) Program Repair Syntax-based Schematic Semantic Schematic for t inTests { generate repair constraintΨt } Synthesize e from ∧tΨt APSEC 2020 Keynote for e in Search-space{ Validate e against Tests } 27
  28. 28. Middle Way 中道 Madhyamāpratipada APSEC 2020 Keynote 28
  29. 29. Test- equivalence APSEC 2020 Keynote scanf ("%d" ,&x); for (i = 0; i < 10; i++) if (x – i > 0) printf ("1"); else printf ("0"); Consider all inequalities 𝛼𝑥 ± 𝛽𝑖 [>≥=≠] 𝛾 Sequence of values: Equivalence class (x = 4): {T, T, T, T, T, T, T, T, T, T} {x > 0, x > 1, …} {T, T, T, T, T, T, T, T, T, F} {x – i > -5, …} {T, T, T, T, T, T, T, T, F, T} EMPTY {T, T, T, T, T, T, T, T, F, F} {x – i > -4, …} {T, T, T, T, T, T, T, F, T, T} EMPTY {T, T, T, T, T, T, T, F, T, F} EMPTY {T, T, T, T, T, T, T, F, F,T} EMPTY … 29
  30. 30. Repair Efficiency APSEC 2020 Keynote 30 [TOSEM18]
  31. 31. Combat over-fitting: Fuzz Testing APSEC 2020 Keynote 31 Crashing patches Search space Crash-free patches Distinguish crashing and crash-free patches (practical) Correct patches Crashing patches may (1) partially fix the crash or (2) unexpectedly introduce new crash Test generation Test cases Repair Buggy program Patched program Auto-generate tests P P
  32. 32. APSEC 2020 Keynote 32 Fix2Fit char* strncpy(char* s,char* t, int n) { for(int i=0; i<n;i++) // buffer overflow or data leakage t[i]=s[i]; } copy the first n characters of s to t. {p1, p2,p3} {p1, p3} {p2} {p1} {p3} ID Plausible patch P1 i <n && i!=3 p2 i <5 p3 i <n && i<strlen(s) correct patch crashing patch s=“foo”, n=5 s=“fo”, n=5 mutate crashing patch
  33. 33. Fix2Fit APSEC 2020 Keynote 33 Integration of repair into programming environments? Number of plausible patches that can be reduced if the tests are empowered with more oracles
  34. 34. Applications of Repair 34 APSEC 2020 Keynote Repair of security vulnerability Repair of embedded software Repair as feedback for programming education Automated grading Feedback to students for making progress.
  35. 35. Application: Security APSEC 2020 Keynote 35 “The C and C++ programming languages are notoriously insecure yet remain indispensable. Developers therefore resort to a multi-pronged approach to find security issues before adversaries. These include manual, static, and dynamic program analysis. Dynamic bug finding tools or "sanitizers" --- can find bugs that elude other types of analysis because they observe the actual execution of a program, and can therefore directly observe incorrect program behavior as it happens.” Song et al 2018. Time to Fix Number of vulnerabilities in 2019 overall number of new vulnerabilities: (20,362)
  36. 36. Combat Overfitting: Constraint Extraction APSEC 2020 Keynote 36 Repair Buggy program Patched program P P Constraints • Program vulnerability can be formalized as violations of constraints, e.g. buffer overflow access(buffer) < base(buffer) + size(buffer) char getValue(char[] arr,int index){ intlen =size(arr); if (index <= len) // errorlocation return arr[index]; return 0; } failing input: arr={1, 2, 3}, index=3 additional specifications to fix the bug for all tests Concrete Buggy state: arr[3] Abstracted constraint violation: index > len
  37. 37. Constraint Propagation APSEC 2020 Keynote 37 𝜑’ {P} 𝜑 crashing locationfix location • Propagate crash-free constraint 𝜑 from crash location to fix location by calculating weakest precondition [e ⟼ e’]𝜑’ {P} 𝜑 • The goal of repair is to ensure 𝜑’ is satisfied at the fix location. ExtractFix
  38. 38. Effectivenes s APSEC 2020 Keynote 38 Number of fixed vulnerabilities out of 30 subjects
  39. 39. Applications: Embedded SW APSEC 2020 Keynote FromTests? From Programs? 39
  40. 40. Analyzing Linux Busybox APSEC 2020 Keynote 40 [ICSE18]
  41. 41. Other Applications: Education APSEC 2020 Keynote Education Productivity Security Intelligent tutoring systems:Automated grading and hint generation via Program Repair Detailed Study in IIT-Kanpur, India [FSE17, and ongoing] 41
  42. 42. Repair in steps APSEC 2020 Keynote 42
  43. 43. Reference Solution Incorrect Student Program def search(x, seq): for i in range(len(seq)): if x <= seq[i]: return i return len(seq) def search(e, lst): for j in range(len(lst)): if e < lst[j]: return j else: j = j + 1 return len(lst) + 1 Repair Incorrect Student Program def search(e, lst): for j in range(len(lst)): if e <= lst[j]: return j else: pass return len(lst) def search(e, lst): for j in range(len(lst)): if e < lst[j]: return j else: j = j + 1 return len(lst) + 1 Refactored Correct Solution Incorrect Student Program def search(x, seq): for i in range(len(seq)): if x <= seq[i]: return i else: pass return len(seq) def search(e, lst): for j in range(len(lst)): if e < lst[j]: return j else: j = j + 1 return len(lst) + 1 Example: Write a Python program which * Given a sorted sequence seq * Counts the number of elements smaller than x 43
  44. 44. Most Relevant Results Semantic Program Repair Using a Reference Implementation ( PDF ) ICSE 2018. Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis ( pdf ) ICSE 2016. DirectFix: Looking for Simple Program Repairs ( PDF ) ICSE 2015. SemFix: Program Repair via Semantic Analysis ( pdf ) ICSE 2013. Symbolic execution with second order existential constraints ESEC-FSE 2018. ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore http://www.comp.nus.edu.sg/~tsunami/ https://www.comp.nus.edu.sg/~nsoe-tss/ Crash-Avoiding Program Repair ISSTA 2019. A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments ESEC-FSE 2017. 44
  45. 45. Perspective APSEC 2020 Keynote Automated Program Repair C. Le Goues, M. Pradel, A. Roychoudhury Review Article,Communications of the ACM, 2019. 45 abhik@comp.nus.edu.sg https://www.comp.nus.edu.sg/~abhik

Keynote given at the Asia Pacific Software Engineering Conference (APSEC), December 2020, on Automated Program Repair technologies and their applications.

Views

Total views

7,599

On Slideshare

0

From embeds

0

Number of embeds

7,471

Actions

Downloads

2

Shares

0

Comments

0

Likes

0

×