
Field Failure Reproduction Using Symbolic Execution and Genetic Programming

Talk at the 30th CREST Open Workshop on Search Based Software Testing (SBST) and Dynamic Symbolic Execution (DSE) -- http://crest.cs.ucl.ac.uk/cow/30/
University College London, January 2014

Video available at http://youtu.be/i4T2g-mdJ-U

A recent survey of developers of the Apache, Eclipse, and Mozilla projects showed that the ability to recreate field failures is considered of fundamental importance when investigating bug reports. Unfortunately, the information typically contained in a bug report, such as memory dumps or call stacks, is usually insufficient for recreating the problem. Even more advanced approaches for gathering field data and helping in-house debugging tend to collect either too little information, and be ineffective, or too much information, and be inefficient. This talk presents two techniques that address these issues: BugRedux and SBFR. Both techniques aim to support in-house debugging of field failures by synthesizing, from execution data collected in the field, executions that mimic the observed field failures. BugRedux relies on symbolic execution, whereas SBFR leverages genetic programming. The talk discusses the two techniques' complementary strengths and weaknesses.



  1. 1. FIELD FAILURE REPRODUCTION USING SYMBOLIC EXECUTION AND GENETIC PROGRAMMING — Alessandro (Alex) Orso, School of Computer Science – College of Computing, Georgia Institute of Technology. Partially supported by: NSF, IBM, and MSR
  2. 2. DSE SBST — FIELD FAILURE REPRODUCTION USING SYMBOLIC EXECUTION AND GENETIC PROGRAMMING — Alessandro (Alex) Orso, School of Computer Science – College of Computing, Georgia Institute of Technology. Partially supported by: NSF, IBM, and MSR
  4. 4. DSE SBST — FIELD FAILURE REPRODUCTION USING SYMBOLIC EXECUTION AND GENETIC PROGRAMMING — “Field failures are unavoidable!” — Alessandro (Alex) Orso, School of Computer Science – College of Computing, Georgia Institute of Technology. Partially supported by: NSF, IBM, and MSR
  6. 6. TYPICAL DEBUGGING PROCESS Bug Repository Very hard to (1) reproduce (2) debug
  7. 7. TYPICAL DEBUGGING PROCESS Recent survey of Apache, Eclipse, and Mozilla developers: Information on how to reproduce field failures is the most valuable, and difficult to obtain, piece of information for investigating such failures. [Zimmermann10] Bug Repository Very hard to (1) reproduce (2) debug
  8. 8. TYPICAL DEBUGGING PROCESS Recent survey of Apache, Eclipse, and Mozilla developers: Information on how to reproduce field failures is the most valuable, and difficult to obtain, piece of information for investigating such failures. [Zimmermann10] Bug Repository OVERARCHING GOAL: help developers (1) investigate field failures, (2) understand their causes, and (3) eliminate such causes. Very hard to (1) reproduce (2) debug
  9. 9. OUR WORK SO FAR Recording and replaying executions [icsm 2007, icse 2007] ✘ Input minimization [woda 2006, icse 2007] Input anonymization [icse 2011] Mimicking field failures [icse 2012, icst 2014] Explaining field failures [issta 2013, TR]
  10. 10. MIMICKING FIELD FAILURES — User run (R), with failure F, in the field; mimicked run (R’), with failure F’, in house. • F’ is analogous to F • R’ is an actual execution
  11. 11. MIMICKING FIELD FAILURES User run (R) Relevant events Mimicked run (R’) (breadcrumbs)
  12. 12. In house OVERALL VISION Software developer Application In the field Instrumentation sed.c:8958 -> sed.c: 8958 sed.c:8993 -> sed.c: 9011 sed.c:8785 -> sed.c: 8786 sed.c:8786 -> sed.c: 8786 sed.c:990 -> sed.c: 990 Likely faults Field Failure Debugging Synthesized Executions Field Failure Reproduction Crash report (execution data)
  13. 13. DSE BUGREDUX/SBFR Synthesized Executions Field Failure Reproduction SBST Crash report (execution data)
  14. 14. BUGREDUX (joint work with Wei Jin) — Crash report (execution data) → Synthesized Executions
  15. 15. BUGREDUX (joint work with Wei Jin) — Crash report (execution data) → Test Input
  16. 16. BUGREDUX (joint work with Wei Jin) — Crash report (execution data) → Input generator → Candidate input → Oracle → Test Input. Execution data: • Point of failure (POF) • Failure call stack • Call sequence • Complete trace. Input generation technique: • Guided symbolic execution
  17. 17. ALGORITHM (SIMPLIFIED)
     Input: icfg for P; goals (list of code locations)
     Output: If (candidate input)
     statesSet = {<cl, pc, ss, goal>}
     Main algorithm:
       init; currGoal = first(goals)
       repeat
         currState = SelNextState()
         if (!currState) backtrack or fail
         if (currState.cl == currGoal)
           if (currGoal == last(goals)) return solve(currState.pc)
           else currGoal = next(goals)
         currState.goal = currGoal
         symbolicallyExec(currState)
     SelNextState:
       minDis = ∞; retState = null
       foreach state in statesSet
         if (state.goal == currGoal)
           if (state.cl can reach currGoal)
             d = |shortest path state.cl, currGoal|
             if (d < minDis) { minDis = d; retState = state }
       return retState
  18. 18. ALGORITHM (SIMPLIFIED) — same algorithm, annotated: • Heuristics/optimizations to reduce the symbolic input space • Dynamic information to prune the search space • Program analysis: shortest-path computation • Some randomness in the search
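The SelNextState heuristic on the slides can be sketched in a few lines of Python. This is a simplified illustration, not BugRedux's actual implementation (which is built on a symbolic-execution engine): here the ICFG is assumed to be a plain adjacency map over code-location labels, distance is plain BFS, and a "state" is reduced to a (code location, goal) pair.

```python
from collections import deque

# Simplified sketch of the state-selection heuristic: among the pending
# symbolic-execution states, pick the one whose code location has the
# shortest ICFG path to the current goal location.

def shortest_distance(icfg, src, dst):
    """BFS distance from src to dst in the ICFG; None if unreachable."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        for succ in icfg.get(node, []):
            if succ == dst:
                return d + 1
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, d + 1))
    return None  # goal unreachable from this code location

def sel_next_state(states_set, icfg, curr_goal):
    """Return the pending state closest to curr_goal (None if none can reach it)."""
    min_dis, ret_state = float("inf"), None
    for state in states_set:          # state = (code_location, goal)
        cl, goal = state
        if goal != curr_goal:
            continue                  # state is still chasing an earlier goal
        d = shortest_distance(icfg, cl, curr_goal)
        if d is not None and d < min_dis:
            min_dis, ret_state = d, state
    return ret_state

# Tiny example ICFG: L1 -> L2 -> L4 and L1 -> L3 -> L4
icfg = {"L1": ["L2", "L3"], "L2": ["L4"], "L3": ["L4"], "L4": []}
states = [("L1", "L4"), ("L2", "L4"), ("L3", "L2")]
print(sel_next_state(states, icfg, "L4"))  # ("L2", "L4"): one hop from the goal
```

The state at L2 wins because it is one ICFG edge from the goal, while the state at L3 is skipped entirely: it is still targeting an earlier goal, which is the pruning the slide's `state.goal == currGoal` check performs.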
  19. 19. BUGREDUX EVALUATION – FAILURES CONSIDERED
     Name        Repository   Size (KLOC)   # Faults
     sed         SIR          14            2
     grep        SIR          10            1
     gzip        SIR          5             2
     ncompress   BugBench     2             1
     polymorph   BugBench     1             1
     aeon        exploit-db   3             1
     glftpd      exploit-db   6             1
     htget       exploit-db   3             1
     socat       exploit-db   35            1
     tipxd       exploit-db   7             1
     aspell      exploit-db   0.5           1
     exim        exploit-db   241           1
     rsync       exploit-db   67            1
     xmail       exploit-db   1             1
  20. 20. BUGREDUX EVALUATION – FAILURES CONSIDERED (same table) — None of the faults can be discovered by vanilla KLEE with a timeout of 72 hours.
  21. 21. BUGREDUX EVALUATION – RESULTS — For each failure (sed #1, sed #2, grep, gzip #1, gzip #2, ncompress, polymorph, aeon, rsync, glftpd, htget, socat, tipxd, aspell, xmail, exim) and each kind of execution data (POF, Call Stack, Call Seq., Compl. Trace), one of three outcomes: ✘: fail; ∼: synthesize; ✔: (synthesize and) mimic
  22. 22. BUGREDUX EVALUATION – RESULTS
     Name        POF    Call Stack   Call Seq.   Compl. Trace
     sed #1      ✘      ✘            ✔           ✘
     sed #2      ✘      ✘            ✔           ✘
     grep        ✘      ∼            ✔           ✘
     gzip #1     ✔      ✔            ✔           ✘
     gzip #2     ∼      ∼            ✔           ✘
     ncompress   ✔      ✔            ✔           ✘
     polymorph   ✔      ✔            ✔           ✘
     aeon        ✔      ✔            ✔           ✔
     rsync       ✘      ✘            ✔           ✘
     glftpd      ✔      ✔            ✔           ✘
     htget       ∼      ∼            ✔           ✘
     socat       ✘      ✘            ✔           ✘
     tipxd       ✔      ✔            ✔           ✘
     aspell      ∼      ∼            ✔           ✘
     xmail       ✘      ✘            ✔           ✘
     exim        ✘      ✘            ✔           ✔
     Synth.:     9/16   10/16        16/16       2/16
     Mimic:      6/16   6/16         16/16       2/16
  23. 23. BUGREDUX EVALUATION – RESULTS (same table). Observations: • Faults can be distant from the failure points ⇒ POFs and call stacks unlikely to help • More information is not always better • Symbolic execution can be a limiting factor
  24. 24. BUGREDUX EVALUATION – RESULTS (same table). Symbolic execution can be less effective on: • programs with highly structured inputs • programs that interact with external libraries • large, complex programs ⇒ a more general input generator?
  25. 25. SBFR (joint work with Kifetew, Jin, Tiella, Tonella) — Crash report (execution data) → Test Input. Execution data: • Call sequence. Input generation technique: • Genetic Programming
  26. 26. SBFR (joint work with Kifetew, Jin, Tiella, Tonella) — Grammar (<a> ::= <b> | λ). Crash report (execution data) → Test Input
  27. 27. SBFR (joint work with Kifetew, Jin, Tiella, Tonella) — Grammar (<a> ::= <b> | λ) → Derivation Tree → Genetic Programming → Test Input. Sentence derivation from the grammar: random application of grammar rules • Uniform • 80/20 • Stochastic (from a corpus)
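The "random application of grammar rules" step can be sketched as follows. The tiny expression grammar is an illustrative assumption (not one of the evaluation grammars), and so is the interpretation of "80/20" shown here: with probability 0.8 pick the alternative with the fewest nonterminals, which biases derivations toward small sentences; a depth cap guarantees termination.

```python
import random

# Sketch of sentence derivation by random rule application over a BNF-style
# grammar. Grammar and the 80/20 interpretation are illustrative assumptions.

GRAMMAR = {  # nonterminals are the dict keys; alternatives are symbol lists
    "<expr>": [["<num>"], ["<expr>", "+", "<expr>"], ["(", "<expr>", ")"]],
    "<num>": [["0"], ["1"], ["2"]],
}

def derive(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a list of terminals by random rule choices."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit as-is
    alts = GRAMMAR[symbol]
    # "80/20" bias: with probability 0.8 (or once too deep) take the
    # alternative with the fewest nonterminals; otherwise choose uniformly.
    cheap = min(alts, key=lambda a: sum(s in GRAMMAR for s in a))
    alt = cheap if depth >= max_depth or rng.random() < 0.8 else rng.choice(alts)
    out = []
    for sym in alt:
        out.extend(derive(sym, rng, depth + 1, max_depth))
    return out

rng = random.Random(0)
print("".join(derive("<expr>", rng)))  # e.g. a small arithmetic sentence
```

A derivation produced this way is equivalent to a derivation tree, which is what the genetic-programming search then evolves; a "stochastic" variant would instead weight each alternative by its frequency in a corpus of real inputs.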
  28. 28. SBFR (joint work with Kifetew, Jin, Tiella, Tonella) — Grammar → Derivation Tree → Genetic Programming → Test Input. Evolution — fitness function: distance b/w execution traces (candidate vs. actual failure). Sentence derivation from the grammar: random application of grammar rules • Uniform • 80/20 • Stochastic (from a corpus)
  29. 29. SBFR (joint work with Kifetew, Jin, Tiella, Tonella) — Grammar → Derivation Tree → Genetic Programming → Test Input ✔︎. Stopping criterion: • Success: Ic reaches the point of failure and the program fails “in the same way” • Search budget exhausted
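The fitness side of this search can be illustrated with a short sketch: score a candidate input by how close the call sequence it produces is to the call sequence reported from the field. The function names and traces below are hypothetical (the instrumented execution that would produce a candidate's trace is not shown); a classic edit distance stands in for whatever trace-distance measure the search uses.

```python
# Sketch of the fitness idea in SBFR-style search: lower distance between
# the candidate's call sequence and the failing call sequence = fitter.

def trace_distance(candidate_trace, failure_trace):
    """Levenshtein (edit) distance between two call sequences."""
    m, n = len(candidate_trace), len(failure_trace)
    prev = list(range(n + 1))          # row for the empty candidate prefix
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if candidate_trace[i - 1] == failure_trace[j - 1] else 1
            curr[j] = min(prev[j] + 1,       # delete a candidate call
                          curr[j - 1] + 1,   # insert a missing call
                          prev[j - 1] + cost)  # match / substitute
        prev = curr
    return prev[n]

# Hypothetical call sequences for illustration only:
failure = ["main", "parse", "alloc_array", "crash_site"]  # from the crash report
cand_a = ["main", "parse", "print"]
cand_b = ["main", "parse", "alloc_array"]

# cand_b gets closer to the failing trace, so selection would prefer it
assert trace_distance(cand_b, failure) < trace_distance(cand_a, failure)
print(trace_distance(cand_a, failure), trace_distance(cand_b, failure))
```

In the full search, individuals whose traces are closer to the failing one survive and are recombined (by swapping grammar subtrees), and the loop stops on the slide's criterion: distance-relevant success (the point of failure is reached and the program fails "in the same way") or budget exhaustion.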
  30. 30. SBFR EVALUATION – FAILURES CONSIDERED
     Name    Language   Size (KLOC)   # Productions   # Faults
     calc    Java       2             38              2
     bc      C          12            80              1
     MSDL    Java       13            140             5
     PicoC   C          11            194             1
     Lua     C          17            106             2
  31. 31. SBFR EVALUATION – FAILURES CONSIDERED (same table) — BugRedux was unable to reproduce any of these failures with a timeout of 72 hours.
  32. 32. SBFR EVALUATION – RESULTS
     Parameters: • Population: 500 • Budget: 10,000 unique fitness evaluations • Performed 10 runs • Measured failure reproduction probability (FRP) • Used both 80/20 and stochastic derivations
     Name         FRP (Random)
     calc bug 1   0.0
     calc bug 2   0.0
     bc           0.0
     MSDL bug 1   0.0
     MSDL bug 2   0.0
     MSDL bug 3   1.0
     MSDL bug 4   0.0
     MSDL bug 5   0.0
     PicoC        0.1
     Lua bug 1    0.0
     Lua bug 2    0.0
  33. 33. SBFR EVALUATION – RESULTS
     Name         FRP (SBFR)   FRP (Random)
     calc bug 1   0.6          0.0
     calc bug 2   0.8          0.0
     bc           1.0          0.0
     MSDL bug 1   1.0          0.0
     MSDL bug 2   1.0          0.0
     MSDL bug 3   1.0          1.0
     MSDL bug 4   1.0          0.0
     MSDL bug 5   1.0          0.0
     PicoC        0.8          0.1
     Lua bug 1    0.0          0.0
     Lua bug 2    0.5          0.0
  35. 35. SBFR EVALUATION – RESULTS (same table). Example: the failure in bc is a segmentation fault triggered by an instruction in a sequence that allocates at least 32 arrays and declares a number of variables higher than the number of allocated arrays.
  36. 36. SBFR EVALUATION – RESULTS (same table). Observations: • Search-based approaches can be effective in cases that symbolic-execution-based approaches cannot handle (e.g., the bc failure) • Search-based approaches are less directed, but more scalable • SBST and DSE are complementary ⇒ alternate the two techniques?
  37. 37. FUTURE WORK / FOOD FOR THOUGHT
     • Relevant execution data identification: Which types? Which specific ones?
     • Failure explanation: reproduction is not enough; can DSE and SBST help?
     • Use of different input generation techniques: grammar-based symbolic execution; backward symbolic analysis? other SBST approaches? SBST targeted at different kinds of programs?
     • Combination of techniques
