These slides contain an introduction to Symbolic execution and an introduction to KLEE.
I made this for a small demo/intro for my research group's meeting.
Symbolic Execution And KLEE
1. Symbolic execution: Overview of work done by Dawson Engler’s group at Stanford (EGT/EXE/KLEE*), by Shauvik Roy Choudhary, http://cc.gatech.edu/~shauvik. Some slides adapted from the EXE and KLEE presentations, plus slides from Saswat.
2. Old research area but still active. First introduced in 1975 (source: Saswat); 1976 by James King, IBM T. J. Watson. Very active area of research, e.g.: EGT / EXE / KLEE [Stanford]; DART [Bell Labs]; CUTE [UIUC]; SAGE, Pex [MSR Redmond]; Vigilante [MSR Cambridge]; BitScope [Berkeley/CMU]; CatchConv [Berkeley]; JPF [NASA Ames].
3. Symbolic Execution: Symbolic execution refers to executing a program with symbols as arguments. Unlike concrete execution, in symbolic execution the program can take any feasible path (limitation: constraint solver). During symbolic execution, program state consists of (1) symbolic values for some memory locations and (2) a path condition. The path condition is a conjunction of constraints on the symbolic input values. A solution of the path condition is a test input that covers the respective path.
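To make "path condition" and "solution" concrete, here is a minimal sketch in C. It is not how any real tool works: the names `constraint_fn`, `gt_10`, `is_even`, `lt_5` and `solve` are invented for illustration, and the "solver" is a brute-force scan over a small range rather than a real constraint solver.

```c
#include <stdbool.h>
#include <stddef.h>

/* One constraint on a single symbolic input x (illustrative type). */
typedef bool (*constraint_fn)(int x);

bool gt_10(int x)   { return x > 10; }
bool is_even(int x) { return x % 2 == 0; }
bool lt_5(int x)    { return x < 5; }

/* A path condition is the conjunction of every constraint collected on a path. */
bool path_condition_holds(const constraint_fn *pc, size_t n, int x) {
    for (size_t i = 0; i < n; i++)
        if (!pc[i](x)) return false;
    return true;
}

/* Stand-in for a solver: scan [lo, hi] and report the first satisfying x.
 * Returns false when no value in the range satisfies the path condition. */
bool solve(const constraint_fn *pc, size_t n, int lo, int hi, int *out) {
    for (int x = lo; x <= hi; x++) {
        if (path_condition_holds(pc, n, x)) {
            *out = x;
            return true;
        }
    }
    return false;
}
```

Solving the path condition { x > 10, x is even } over [-100, 100] yields a test input that drives execution down that path. Adding x < 5 makes the path infeasible, so no test is produced for it.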
4. Implementation of Symbolic Execution. Transformation approach: transform the program into another program that operates on symbolic values, such that executing the transformed program is equivalent to symbolically executing the original; difficult to implement, portable, suitable for Java, .NET. Instrumentation approach: callback hooks are inserted in the program so that symbolic execution is done in the background during normal execution; easy to implement for C. Customized runtime approach: customize the runtime (e.g., JVM) to support symbolic execution; applicable to Java, .NET; difficult to implement, flexible, not portable. Tools labeled on the slide: CUTE, KLEE, JPF.
5. Limitations of Symbolic Execution: Limited by the power of the constraint solver: cannot handle non-linear and very complex constraints. Does not scale when the number of paths is large (subject of ongoing research in this area). Source code, or an equivalent (e.g., Java class files), is required for precise symbolic execution.
7. Generic features: baroque interfaces, tricky input, rat's nest of conditionals. Enormous undertaking to hit with manual testing. Random “fuzz” testing: Charm: no manual work. Blind generation makes it hard to hit errors for a narrow input range. Also hard to hit errors that require structure. This talk: a simple trick to finesse. Goal: find many bugs in systems code.
8. EGT: Execution Generated Testing [SPIN’05]: How to make system code crash itself! Basic idea: use the code itself to construct its input! Basic algorithm: symbolic execution + constraint solving. Run code on symbolic inputs, initial value = “anything”. As code observes inputs, it tells us the values they can be. At conditionals that use symbolic input, fork: on the true branch, add the constraint that the input satisfies the check; on the false branch, that it does not. Then solve these constraints to generate inputs and re-run the code using them.
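The fork-at-each-conditional idea can be sketched as follows. This is a toy model under loud assumptions: the program is reduced to a fixed list of branch predicates on one input, `explore`, `path_t` and the predicate names are made up for this example, and a brute-force range scan stands in for the constraint solver. Real tools fork actual program state.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BRANCHES 8
#define MAX_TESTS 16

typedef bool (*branch_fn)(int x);

/* Decisions taken so far along one path (illustrative type). */
typedef struct {
    bool taken[MAX_BRANCHES];
    size_t depth;
} path_t;

bool is_negative(int x) { return x < 0; }
bool is_1234(int x)     { return x == 1234; }

/* Does x satisfy every constraint recorded on this path? */
bool satisfies(const branch_fn *br, const path_t *p, int x) {
    for (size_t i = 0; i < p->depth; i++)
        if (br[i](x) != p->taken[i]) return false;
    return true;
}

/* Brute-force stand-in for the solver. */
bool solve(const branch_fn *br, const path_t *p, int lo, int hi, int *out) {
    for (int x = lo; x <= hi; x++)
        if (satisfies(br, p, x)) { *out = x; return true; }
    return false;
}

/* "Fork" at each branch: the true side adds the check as a constraint,
 * the false side adds its negation. Infeasible paths stop early; every
 * completed path emits one concrete test input. nbr must not exceed
 * MAX_BRANCHES. Returns the new number of tests stored in tests[]. */
size_t explore(const branch_fn *br, size_t nbr, path_t p,
               int lo, int hi, int *tests, size_t ntests) {
    int x;
    if (!solve(br, &p, lo, hi, &x)) return ntests;   /* constraints impossible */
    if (p.depth == nbr) {                            /* path finished: emit */
        if (ntests < MAX_TESTS) tests[ntests++] = x;
        return ntests;
    }
    path_t t = p, f = p;
    t.taken[t.depth++] = true;
    f.taken[f.depth++] = false;
    ntests = explore(br, nbr, t, lo, hi, tests, ntests);
    return explore(br, nbr, f, lo, hi, tests, ntests);
}
```

With the two checks x < 0 and x == 1234, four decision combinations exist but only three are feasible (x cannot be both negative and 1234), so exploration produces three test inputs.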
9. The toy example: Initialize x to be “any int”. Code will run 3 times. Solve constraints at each to get our 3 test cases.
10. The big picture. Implementation prototype: do source-to-source transformation using CIL; use the CVCL decision procedure to solve constraints, then re-run code on concrete values. Robustness: use mixed symbolic and concrete execution. 3 ways to look at what’s going on: (1) grammar extraction; (2) turn code inside out, from input consumer to generator; (3) sort-of Heisenberg effect: observations perturb symbolic inputs into increasingly concrete ones. More definite observation = more definite perturbation.
11. Mixed execution. Basic idea: given an operation: if all of its operands are concrete, just do it. If any are symbolic, add constraint. If current constraints are impossible, stop. If current path causes something to blow up, solve & emit. If current path calls unmodelled function, solve & call. If program exits, solve & emit. How to track? Use variable addresses to determine if symbolic or concrete. Note: symbolic assignment is not destructive; it creates a new symbol.
12. Example transformation: “+”. Each var v has v.concrete and v.symbolic fields. If v is concrete, symbol = <invalid>, and vice versa.
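A rough sketch of what a transformed “+” might look like, assuming a single symbolic input x and symbolic expressions restricted to the linear form coef * x + offset. The `value_t` layout and the function names are illustrative assumptions, not the actual EGT data structures.

```c
#include <stdbool.h>

/* Each value is either concrete or symbolic; the other field is invalid. */
typedef struct {
    bool is_symbolic;
    int  concrete;   /* meaningful only when !is_symbolic */
    int  coef;       /* symbolic part coef * x + offset, */
    int  offset;     /* meaningful only when is_symbolic */
} value_t;

value_t concrete_value(int c) {
    value_t v = { false, c, 0, 0 };
    return v;
}

/* The symbolic input x itself: 1 * x + 0. */
value_t symbolic_input(void) {
    value_t v = { true, 0, 1, 0 };
    return v;
}

/* Transformed "+": run concretely when possible, otherwise build a new
 * symbolic value and leave both operands untouched. */
value_t mixed_add(value_t a, value_t b) {
    if (!a.is_symbolic && !b.is_symbolic)
        return concrete_value(a.concrete + b.concrete);   /* just do it */

    value_t r = { true, 0, 0, 0 };
    r.coef   = (a.is_symbolic ? a.coef : 0) + (b.is_symbolic ? b.coef : 0);
    r.offset = (a.is_symbolic ? a.offset : a.concrete)
             + (b.is_symbolic ? b.offset : b.concrete);
    return r;
}
```

Adding two concrete values stays concrete. Adding x and 3 yields the symbolic value 1 * x + 3, and adding x to that yields 2 * x + 3.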
14. Results. Mutt versions <= 1.4 have a buffer overflow (OSDI paper): input size 4, took 34 minutes to generate 458 tests with 98% statement coverage. printf (3 implementations: Pintos, gccfast, embedded): made format strings symbolic; two bugs: incorrect grouping of integers, and incorrect handling of plus flags (“%” followed by space).
15. More: WsMP3 server case study. 2000 LOC. Technique: make recv input symbolic. Found known security hole + 2 new bugs: network-controlled infinite loop, buffer overflow.
16. EXE: EXecution generated Executions [CCS’06]: Automatically generating inputs of death! Same ideas as EGT. Main contributions: a more practical tool that can test any code path and generates actual attacks. Constraint solver: STP, a decision procedure for bitvectors and arrays; if solvable, passes constraints to MiniSAT. Four times less code than CVCL and an order of magnitude faster. Array optimizations (substitution, refinements, simplification).
17. The mechanics. User marks input to treat symbolically using either: Compile with the EXE compiler, exe-cc, which uses CIL to (a) insert checks around every expression: if operands are all concrete, run as normal; otherwise, add as constraint; and (b) insert fork calls when a symbolic value could cause multiple acts. ./a.out: forks at each decision point. When a path terminates, use STP to solve constraints. Terminates when: (1) exit, (2) crash, (3) EXE detects an error. Rerun concrete values through uninstrumented code.
18. Isn’t exponential expensive? Only fork on symbolic branches; most are concrete (linear). Loops? Heuristics. Default: DFS. Linear processes with chain depth. Can get stuck. “Best first” search: choose branch, backtrack to the point that will run code hit fewest times. Can do better… However: happy to let it run for weeks as long as it is generating interesting test cases. Competition is manual and random.
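One possible reading of the “best first” heuristic, sketched in C: among the pending paths, resume the one whose next line of code has been executed the fewest times so far. The `pending_path_t` type, the per-line hit counter array and `pick_best_first` are assumptions made for this sketch, not EXE's actual bookkeeping.

```c
#include <stddef.h>

/* A path waiting to be resumed (illustrative type). */
typedef struct {
    size_t next_line;   /* index of the line this path would execute next */
} pending_path_t;

/* Returns the index of the pending path whose next line has the lowest
 * hit count, or -1 if there is no usable candidate. Ties go to the
 * earliest path in the array. */
int pick_best_first(const pending_path_t *paths, size_t npaths,
                    const unsigned *hits, size_t nlines) {
    int best = -1;
    for (size_t i = 0; i < npaths; i++) {
        if (paths[i].next_line >= nlines)
            continue;                       /* ignore out-of-range entries */
        if (best < 0 ||
            hits[paths[i].next_line] < hits[paths[best].next_line])
            best = (int)i;
    }
    return best;
}
```

A scheduler built on this would increment `hits[line]` each time a line runs, so rarely executed code is favored over code already covered many times, unlike plain DFS.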
19. Mixed execution. Basic idea: given an expression (e.g., deref, ALU op): if all of its operands are concrete, just do it. If any are symbolic, add as constraint. If current constraints are impossible, stop. If current path hits error or exit(), solve + emit. If it calls uninstrumented code: do call, or solve and do call. Example: “x = y + z”. If y, z are both concrete, execute; record x = concrete. Otherwise set “x = y + z”, record x = symbolic. Result: most code runs concretely; a small slice deals w/ symbolics. Robust: do not need all source code (e.g., OS). Just run
20. Limits. Missed constraints: if the code calls asm, or CIL cannot eat the file. STP cannot do div/mod: constrain the operand to be a power of 2, then use shift and mask respectively. Cannot handle **p where “p” is symbolic: must concretize *p. (Note: **p still symbolic.) Stops path if it cannot solve; can get lost in exponentials. Missing: no symbolic function pointers; symbolics passed to varargs are not tracked. No floating point. long long support is erratic.
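The div/mod workaround relies on a standard identity for unsigned integers: when the divisor is a power of two, division is a right shift and modulo is a bit mask. A small sketch, with helper names chosen for this example (signed operands behave differently and are not covered here):

```c
#include <stdbool.h>
#include <stdint.h>

/* True when d has exactly one bit set. */
bool is_power_of_two(uint32_t d) {
    return d != 0 && (d & (d - 1)) == 0;
}

/* log2 of a power of two, i.e. the position of its single set bit. */
unsigned log2_exact(uint32_t d) {
    unsigned k = 0;
    while (d > 1) {
        d >>= 1;
        k++;
    }
    return k;
}

/* x / d for a power-of-two d, expressed as a shift. */
uint32_t div_pow2(uint32_t x, uint32_t d) {
    return x >> log2_exact(d);
}

/* x % d for a power-of-two d, expressed as a mask. */
uint32_t mod_pow2(uint32_t x, uint32_t d) {
    return x & (d - 1);
}
```

For example, with d = 8, `div_pow2(29, 8)` is 29 >> 3 and `mod_pow2(29, 8)` is 29 & 7, which match 29 / 8 and 29 % 8.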
21. EXE Results. Berkeley Packet Filter: two buffer overflow exploits. udhcpd (well-tested user-level DHCP server): five memory errors. PCRE (Perl Compatible Regular Expressions): many out-of-bounds writes leading to abort in glibc on free. Disks of death (file systems): four bugs in ext2 & ext3 file systems; null pointer dereference in JFS.
24. Writing Systems Code Is Hard. Code complexity: tricky control flow, complex dependencies, abusive use of pointer operations. Environmental dependencies: code has to anticipate all possible interactions, including malicious ones.
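26. Toy Example: int bad_abs(int x) { if (x < 0) return -x; if (x == 1234) return -x; return x; } Execution tree, starting from x = any int: branch x < 0: TRUE (constraint x < 0) gives return -x; FALSE (constraint x ≥ 0) continues to branch x = 1234: TRUE (constraint x = 1234) gives return -x; FALSE (constraint x ≠ 1234) gives return x. Three generated test files (test1.out, test2.out, test3.out) with inputs x = -2, x = 1234 and x = 3.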
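For reference, the toy function written out as compilable C, with the three paths labeled. The helper `bad_abs_path` is an addition for illustration only: it reports which path a concrete input follows. Under KLEE the input would instead be marked symbolic (for example with `klee_make_symbolic(&x, sizeof(x), "x")` from `klee/klee.h`), which this plain-C sketch deliberately leaves out so it runs without KLEE.

```c
/* The function from the slide. */
int bad_abs(int x) {
    if (x < 0)
        return -x;      /* path 1: x < 0 */
    if (x == 1234)
        return -x;      /* path 2: the bug, returns -1234 */
    return x;           /* path 3: x >= 0 and x != 1234 */
}

/* Illustrative helper: which of the three paths does x follow? */
int bad_abs_path(int x) {
    if (x < 0)
        return 1;
    if (x == 1234)
        return 2;
    return 3;
}
```

The three inputs from the slide, -2, 1234 and 3, each exercise a different path, and only 1234 exposes the wrong result.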
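27. KLEE Architecture [diagram]: C code is compiled with LLVM to LLVM bytecode; KLEE runs the bytecode against a symbolic environment and uses a constraint solver (STP) on path constraints (x ≥ 0, x ≠ 1234) to produce test inputs (x = -2, x = 1234, x = 3).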
28. Outline: Motivation; Example and Basic Architecture; Scalability Challenges; Experimental Evaluation.
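40. Eliminating Irrelevant Constraints: In practice, each branch usually depends on a small number of variables. Example: … … if (x < 10) { … } with current constraints x + y > 10 and z & -z = z. For the query “x < 10 ?”, only x + y > 10 is relevant; the constraint on z can be dropped.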
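A sketch of how irrelevant constraints could be filtered, assuming each constraint is summarized by a bitmask of the variables it mentions. The encoding and the function `relevant_constraints` are illustrative assumptions, not KLEE's implementation. A constraint is kept if it shares a variable with the query, directly or through other kept constraints.

```c
#include <stdbool.h>
#include <stddef.h>

/* One bit per program variable (illustrative encoding). */
enum { VAR_X = 1 << 0, VAR_Y = 1 << 1, VAR_Z = 1 << 2 };

/* vars[i] is the set of variables mentioned by constraint i.
 * Sets relevant[i] for every constraint connected to the query's
 * variables, directly or transitively, and returns how many were kept. */
size_t relevant_constraints(const unsigned *vars, size_t n,
                            unsigned query_vars, bool *relevant) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        relevant[i] = false;

    bool changed = true;
    while (changed) {                    /* iterate to a fixed point */
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (!relevant[i] && (vars[i] & query_vars) != 0) {
                relevant[i] = true;
                query_vars |= vars[i];   /* pull in this constraint's variables */
                kept++;
                changed = true;
            }
        }
    }
    return kept;
}
```

For the slide's example, the constraints mention {x, y} and {z}, and the query mentions {x}. Only the first constraint is kept, so the solver sees a smaller problem.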
41. Caching Solutions: Static set of branches: lots of similar constraint sets. Cached: { 2 * y < 100, x > 3, x + y > 10 } has solution x = 5, y = 15. Eliminating constraints cannot invalidate a solution: { 2 * y < 100, x + y > 10 } is still satisfied by x = 5, y = 15. Adding constraints often does not invalidate a solution: { 2 * y < 100, x > 3, x + y > 10, x < 10 } is still satisfied by x = 5, y = 15. UBTree data structure [Hoffmann and Koehler, IJCAI ’99].
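The reuse check behind solution caching can be sketched like this: before calling the solver on a new constraint set, test whether a previously cached assignment already satisfies it. The types and function names are assumptions for illustration, and `x_gt_5` is a made-up extra constraint that is not on the slide. The sketch does not implement the UBTree lookup that finds candidate subsets and supersets.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { int x, y; } assignment_t;
typedef bool (*constraint_fn)(assignment_t a);

/* Constraints from the slide. */
bool two_y_lt_100(assignment_t a)   { return 2 * a.y < 100; }
bool x_gt_3(assignment_t a)         { return a.x > 3; }
bool x_plus_y_gt_10(assignment_t a) { return a.x + a.y > 10; }
bool x_lt_10(assignment_t a)        { return a.x < 10; }

/* Hypothetical extra constraint, used to show a cache miss. */
bool x_gt_5(assignment_t a)         { return a.x > 5; }

/* True when the cached assignment satisfies every constraint in the set,
 * in which case the solver call can be skipped. */
bool cached_solution_fits(const constraint_fn *set, size_t n,
                          assignment_t cached) {
    for (size_t i = 0; i < n; i++)
        if (!set[i](cached)) return false;
    return true;
}
```

The cached solution x = 5, y = 15 fits the original set, any subset of it, and the superset that adds x < 10. It does not fit a superset that adds x > 5, so that query would still have to go to the solver.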
54. …Variety of functions, different authors, intensive interaction with environment. Heavily tested, mature code.
55. Coreutils ELOC (incl. called lib) [histogram; axes: Number of applications, Executable Lines of Code (ELOC)].
57. High Line Coverage (Coreutils, non-lib, 1h/utility = 89 h). Overall: 84%, Average 91%, Median 95%. 16 at 100%. [Chart: Coverage (ELOC %), apps sorted by KLEE coverage.]
58. Beats 15 Years of Manual Testing. Avg/utility: KLEE 91%, Manual 68%. Manual tests also check correctness. [Chart: KLEE coverage - Manual coverage, apps sorted by KLEE coverage - Manual coverage.]
59. Busybox Suite for Embedded Devices. Overall: 91%, Average 94%, Median 98%. 31 at 100%. [Chart: Coverage (ELOC %), apps sorted by KLEE coverage.]
66. Ten command lines of death: (1) md5sum -c t1.txt (2) mkdir -Z a b (3) mkfifo -Z a b (4) mknod -Z a b p (5) seq -f %0 1 (6) pr -e t2.txt (7) tac -r t3.txt t3.txt (8) paste -d abcdefghijklmnopqrstuvwxyz (9) ptx -F abcdefghijklmnopqrstuvwxyz (10) ptx x t4.txt. Input files: t1.txt: MD5( t2.txt: t3.txt: t4.txt: A
71. An assert is just a branch, and KLEE proves feasibility/infeasibility of each branch it reaches
72. If KLEE determines infeasibility of the false side of an assert, the assert was proven on the current path.
73. Crosschecking: Assume f(x) and f’(x) implement the same interface. Make input x symbolic. Run KLEE on assert(f(x) == f’(x)). For each explored path: if KLEE terminates w/o error, the paths are equivalent; if KLEE terminates w/ error, a mismatch was found. Coreutils vs. Busybox: UNIX utilities should conform to IEEE Std. 1003.1. Crosschecked pairs of Coreutils and Busybox apps. Verified paths, found mismatches.
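The crosschecking idea can be approximated without KLEE for a tiny input domain: run both implementations on every possible input and report the first disagreement. In the sketch below, exhaustive enumeration of one byte stands in for a symbolic input, all function names are invented for the example, and the range-based versions assume an ASCII-compatible character set. With KLEE, the byte would instead be made symbolic and the comparison written as an assert.

```c
#include <stdbool.h>

typedef bool (*pred_fn)(unsigned char c);

/* Two implementations of the same interface (assumes ASCII-style a..z). */
bool is_lower_range(unsigned char c) {
    return c >= 'a' && c <= 'z';
}

bool is_lower_scan(unsigned char c) {
    const char *letters = "abcdefghijklmnopqrstuvwxyz";
    for (int i = 0; letters[i] != '\0'; i++)
        if ((unsigned char)letters[i] == c) return true;
    return false;
}

/* A deliberately wrong variant: off by one at the upper bound. */
bool is_lower_buggy(unsigned char c) {
    return c >= 'a' && c < 'z';
}

/* Returns -1 if f and g agree on every byte, otherwise the first
 * input on which they differ. */
int first_mismatch(pred_fn f, pred_fn g) {
    for (int c = 0; c <= 255; c++)
        if (f((unsigned char)c) != g((unsigned char)c)) return c;
    return -1;
}
```

The two correct versions agree on all 256 inputs, while checking against the buggy variant reports 'z' as the mismatching input. A symbolic executor reaches the same verdict per path without enumerating inputs one at a time.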
74. Mismatches Found (t1.txt: a, t2.txt: b, no newlines!)
Input | Busybox | Coreutils
tee "" <t1.txt | [infinite loop] | [terminates]
tee - | [copies once to stdout] | [copies twice]
comm t1.txt t2.txt | [doesn’t show diff] | [shows diff]
cksum / | "4294967295 0 /" | "/: Is a directory"
split / | "/: Is a directory" |
tr | [duplicates input] | "missing operand"
[ 0 ‘‘<’’ 1 ] | | "binary op. expected"
tail -2l | [rejects] | [accepts]
unexpand -f | [accepts] | [rejects]
split - | [rejects] | [accepts]
76. KLEE DEMO: Tool available at http://klee.llvm.org/. Experiments: tool examples (isLower(), RegExp); more experimentation.