BaaB: Bugs as a Backdoor
Shih-Kun Huang
Software Quality Lab
National Chiao Tung University
Hsinchu, Taiwan

This paper presents a new method capable of automatically generating attacks on binary programs from software crashes. We analyze software crashes with a symbolic failure model by performing concolic executions along the failure-directed paths, using a whole-system environment model and concrete-address-mapped symbolic memory in S2E. We propose a new selective symbolic input method and lazy evaluation on pseudo-symbolic variables to handle symbolic pointers and speed up the process. This is an end-to-end approach able to create exploits from crash inputs or existing exploits for various applications, including most of the existing benchmark programs and several large-scale applications, such as a word processor (Microsoft Office Word), a media player (mplayer), an archiver (unrar), and a PDF reader (Foxit). We can deal with vulnerability types including stack and heap overflows, format string, and the use of uninitialized variables. Notably, these applications have become software fuzz-testing targets, but still require a manual process with security knowledge to produce mitigation-hardened exploits. With this method, exploit generation for software failures becomes an automated process that does not require source code. The proposed method is simpler, more general, faster, and scales to larger programs than existing systems. We produce the exploits within one minute for most of the benchmark programs, including mplayer. We also transform existing exploits of Microsoft Office Word into new exploits within four minutes. The best speedup is 7,211 times faster than the initial attempt. For the heap overflow vulnerability, we can automatically exploit the unlink() macro of glibc, which formerly required sophisticated hacking effort.


Trusting Trust
• if (a = 1): an assignment mistyped as a comparison
• "Reflections on Trusting Trust", Ken Thompson, 1984 Turing Award Lecture

Introduction
• Constructing symbolic failure models from software crashes
• Producing attacks through the symbolic model
  – Software crash failures can be manipulated and exploited
• If bugs are exploited and attacked, arbitrary code can be executed and a backdoor channel is built
  – Bugs as a Backdoor

Finding Bugs and Backdoors
• If a backdoor channel is built by embedding bugs in the system
  – Trojan-horse identification is reduced to finding the software bugs
• Our work
  – Exploitable crash detection
  – Automatic exploitation (attack input) generation

Reliability/Bug vs. Security/Vulnerability
• CRAX: test if a CRAsh is eXploitable, by Automatic Exploit Generation (CRAXing mplayer in minutes)

CRAX Is the Second Binary AEG (Automatic Exploit Generator)
• Microsoft's !exploitable crash analyzer (plugged into many fuzzers), released in 2009
• Heelan's AEG and concolic methods for AEG, proposed by different groups (including us) around 2008 and 2009
• CMU's AEG (and later Q): claimed to be the first end-to-end AEG, but needs source code; published at NDSS 2011
• CMU's MAYHEM: claimed to be the first binary AEG, published at IEEE S&P in May 2012
• Compared with AEG and MAYHEM, ours (CRAX) is simpler, more general, faster, and scales to larger programs

Outline
• Introduction
  – The need for exploit generation
  – Current methods
  – Our CRAX framework
• Method
• Implementation
• Experiment results

The Need for Exploit Generation
• Crashes are inevitable in software
• We need a way to judge exploitability
  – Too many crashes are waiting to be fixed
  – Exploitable crashes without mitigations should be fixed first
  – Exploitable crashes with mitigations can be fixed later
  – Other crashes are prioritized in the normal order
⇒ Exploit generation: a convincing way to prove exploitability

Motivation 2: The Hacker's Tool Chain
• Bug fuzzer
  – Produces a crash
  – meta-fuzz, smart-fuzzer, zzuf, peach, taintscope, …
• Crash detector or failure monitor
  – Taint tracking
  – gdb, ollydbg, Pin, valgrind, CRED, Beagle, !exploitable, …
• Exploit-code generator: the missing link of the tool chain
  – Manual effort with expertise
  – Heelan's, AEG, Q, MAYHEM, and CRAX
• Shellcode forger
  – Customized payload; an easier botnet builder
  – metasploit

Current Exploit Generation Methods
• Manual exploit generation
  – Time consuming
  – Requires much skill and security knowledge
• Automatic exploit generation
  – Platform dependent
  – Requires source code (MAYHEM excluded)
  – Handles only a limited set of vulnerability types

Our CRAX Framework
• Based on whole-system emulation
  – Platform independent
  – Source code is not needed
• Generalized threat model
  – Can be applied to most vulnerabilities
  – Crash: tainted continuations
  – Exploitable: symbolic continuations

Outline
• Introduction
• Method
  – Overview
  – Code selection
• Implementation
• Experiment results

Overview of CRAX's Framework
• Built on S2E, a whole-system symbolic execution engine
• Exploit generation process:
  1. Explore the crash path with the crash input (only the crash path is explored ⇒ concolic mode, without forking another branch)
  2. Detect a symbolic EIP (program counter)
  3. Reason out an exploit

Symbolic EIP (Program Counter)
• Symbolic EIP vs. tainted EIP
  – Tainted EIP: only a bit, indicating that the EIP is tainted
  – Symbolic EIP: several megabytes of constraints
    • Path constraints: the control flow needed to reach the crash site
    • Continuation constraints: the next "malicious progress" of the exploit
    • Payload constraints: the code body of "malicious intents" to continue execution
• Symbolic continuations
  – while/for/if branch predicates, jmp_buf, SEH, GOT, RET, …
• The symbolic EIP detection process reconstructs a symbolic failure model; after that, we can manipulate the symbolic model at will
Exploit Generation Process
• Objective: automatically generate an exploit for a given program binary and crash input
• Initially, only the input is symbolic
• Symbolic data propagates as the program executes
• Constraints that force the program to follow the same path are also collected
• Collect the path constraint and the symbolic memory blocks
• When a vulnerable return/call/jmp/exception is executed, a symbolic EIP is detected
• Use the collected information to reason out an exploit
• Constrain the content of a selected symbolic block to be our shellcode, and constrain EIP to point to that block
• Query the solver to find a solution that satisfies both the path constraint and the exploit constraint
• The solution is an exploit (a sketch of the exploit constraint follows)
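To make the last three steps concrete, here is a minimal sketch in C of what the added exploit constraint amounts to. This is not CRAX's code: the addresses, the truncated shellcode, and the byte_constraint representation are illustrative, and a real implementation would hand these byte equalities to the constraint solver together with the path constraint instead of printing them.

    /* Sketch: the exploit constraint pins every byte of the chosen symbolic
     * block to the shellcode, and pins the four symbolic bytes that flow
     * into EIP to the block's address in little-endian order.  Solving these
     * equalities together with the path constraint yields the exploit. */
    #include <stdint.h>
    #include <stdio.h>

    struct byte_constraint { uint32_t addr; uint8_t value; };

    static size_t build_exploit_constraint(struct byte_constraint *out,
                                           uint32_t block_addr,
                                           const uint8_t *shellcode, size_t len,
                                           uint32_t ret_slot_addr)
    {
        size_t n = 0;

        /* 1. The selected symbolic block must contain the shellcode. */
        for (size_t i = 0; i < len; i++) {
            out[n].addr  = block_addr + (uint32_t)i;
            out[n].value = shellcode[i];
            n++;
        }
        /* 2. The bytes that become EIP must hold the block address (LE). */
        for (unsigned i = 0; i < 4; i++) {
            out[n].addr  = ret_slot_addr + i;
            out[n].value = (uint8_t)(block_addr >> (8 * i));
            n++;
        }
        return n;
    }

    int main(void)
    {
        const uint8_t shellcode[] = { 0x31, 0xc0, 0x50 };   /* truncated demo */
        struct byte_constraint c[64];
        size_t n = build_exploit_constraint(c, 0xbffff100u, shellcode,
                                            sizeof shellcode, 0xbffff1f0u);

        for (size_t i = 0; i < n; i++)
            printf("mem[0x%08x] == 0x%02x\n", c[i].addr, c[i].value);
        return 0;
    }

Any concrete input returned by the solver for these equalities plus the collected path constraint reaches the same crash site and lands in the shellcode.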
Code Selection
• Kernel and library code are huge and would add lots of constraints
• Some kernel and library functions are irrelevant, such as fopen() or perror()
⇒ Execute them concretely

Outline
• Introduction
• Method
• Implementation
  – Concolic mode
  – Code selection
  – Symbolic EIP detection
  – Exploit generation
  – Other types of exploits
• Experiment results

Concolic Mode
• Keep the concrete values in an extra constraint set, the concolic constraint
• If a branch condition is symbolic, we want to find its concrete value
  ⇒ query the constraint solver with the concolic constraint
• Query the solver to find the concrete value of the branch condition
• Follow the concrete path, and constrain the branch condition to equal that concrete value (a sketch follows)
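A minimal sketch of the branch handling described above, assuming a toy setting in which constraints are plain strings and the crash input is already concrete. The real logic lives inside S2E/KLEE; concolic_branch() and add_path_constraint() are hypothetical stand-ins.

    /* Conceptual concolic branch handling: every symbolic branch is decided
     * by the concrete crash input, and the matching condition is appended to
     * the path constraint so the final solver query stays on the crash path. */
    #include <stdio.h>

    #define MAX_CONSTRAINTS 128

    static const char *path_constraint[MAX_CONSTRAINTS];  /* collected conditions */
    static int n_constraints;

    /* Hypothetical helper: record one branch condition on the crash path. */
    static void add_path_constraint(const char *cond)
    {
        if (n_constraints < MAX_CONSTRAINTS)
            path_constraint[n_constraints++] = cond;
    }

    /* Hypothetical concolic branch: 'taken' is the outcome computed from the
     * concrete input (the "concolic constraint" set); the branch condition is
     * recorded so the generated exploit must follow the same path. */
    static int concolic_branch(int taken, const char *cond, const char *neg_cond)
    {
        add_path_constraint(taken ? cond : neg_cond);
        return taken;                 /* follow the concrete path, never fork */
    }

    int main(void)
    {
        unsigned char input[4] = { 'A', 'B', 'C', 'D' };  /* concrete crash input */

        /* The guest branch "if (input[0] == 'A')" becomes: */
        if (concolic_branch(input[0] == 'A', "in[0] == 'A'", "in[0] != 'A'"))
            puts("followed the crash path");

        for (int i = 0; i < n_constraints; i++)
            printf("path constraint %d: %s\n", i, path_constraint[i]);
        return 0;
    }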
Code Selection
• Uses the selective-execution facilities of S2E:
  – s2e_disable_symbolic_execution()
  – s2e_enable_symbolic_execution()
• Uses the LD_PRELOAD environment variable on Linux (a sketch follows)
  – Intercept calls to perror()/fopen()/…
  – Disable symbolic execution before entering libc
  – Enable symbolic execution after leaving libc
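A sketch of the LD_PRELOAD interception. s2e_disable_symbolic_execution() and s2e_enable_symbolic_execution() are the S2E guest calls named above (normally pulled in from S2E's guest header inside the guest); the wrapper library, its build line, and the choice of fopen() are illustrative.

    /* fopen() wrapper preloaded into the target: run the irrelevant libc work
     * concretely, then switch symbolic execution back on.
     *
     *   gcc -shared -fPIC -o hook.so hook.c -ldl
     *   LD_PRELOAD=./hook.so ./target crash_input
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    /* Provided by S2E's guest API (s2e.h) inside the guest; declared here
     * only so the sketch is self-contained. */
    void s2e_disable_symbolic_execution(void);
    void s2e_enable_symbolic_execution(void);

    FILE *fopen(const char *path, const char *mode)
    {
        static FILE *(*real_fopen)(const char *, const char *);
        if (!real_fopen)        /* look up the real libc fopen once */
            real_fopen = (FILE *(*)(const char *, const char *))
                         dlsym(RTLD_NEXT, "fopen");

        s2e_disable_symbolic_execution();   /* concrete execution inside libc */
        FILE *f = real_fopen(path, mode);
        s2e_enable_symbolic_execution();    /* back to symbolic execution */
        return f;
    }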
Symbolic EIP Detection
• In the symbolic execution engine of S2E:
  – The state of the emulated CPU is stored in the CPUX86State structure
  – Guest code is translated into LLVM IR before being symbolically executed
  – Accesses to CPU registers are translated into load/store IR on the CPUX86State structure
⇒ Check each executed store IR to see whether its target is EIP and its value is symbolic (a sketch follows)
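A sketch of that check. CPUX86State and its eip field do exist in QEMU, but the structure below is radically simplified, and on_store_to_cpu_state(), struct expr, and expr_is_symbolic() are hypothetical stand-ins for the corresponding S2E/KLEE internals.

    /* Hypothetical hook invoked for every store the translated guest code
     * performs into the emulated CPU state.  If the store targets the eip
     * field and the stored value is a symbolic expression, the crash becomes
     * a candidate for exploit generation. */
    #include <stddef.h>
    #include <stdio.h>

    struct expr { int is_symbolic; };                      /* hypothetical */
    struct CPUX86State { unsigned int regs[8]; unsigned int eip; }; /* simplified */

    static int expr_is_symbolic(const struct expr *e)      /* hypothetical */
    {
        return e->is_symbolic;
    }

    /* Called with the byte offset into CPUX86State being written, the store
     * width, and the value expression. */
    void on_store_to_cpu_state(size_t offset, size_t width, const struct expr *value)
    {
        size_t eip_off = offsetof(struct CPUX86State, eip);

        if (offset == eip_off && width == sizeof(unsigned int) &&
            expr_is_symbolic(value)) {
            printf("symbolic EIP detected: start exploit generation\n");
        }
    }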
Exploit Generation
• Finding symbolic memory blocks
  – Memory model in S2E
  – Search method
• Shellcode injection
  – Determine the position of the shellcode
  – Determine the length of the NOP sled
  – Determine the EIP range

Memory Model in S2E
• concreteMask records which bytes of an ObjectState are symbolic
⇒ Find blocks of consecutive 0s in concreteMask (a sketch follows)
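A sketch of the "consecutive 0s in concreteMask" scan, assuming the mask carries one bit per byte of the ObjectState (1 = concrete, 0 = symbolic); the function and field names are illustrative, not S2E's.

    /* Scan a per-byte concreteness mask and report maximal runs of symbolic
     * bytes (mask bit == 0) that are long enough to hold shellcode. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    static int bit_is_set(const uint8_t *mask, size_t i)
    {
        return (mask[i / 8] >> (i % 8)) & 1;
    }

    /* Report every maximal run of symbolic bytes of length >= min_len. */
    void find_symbolic_blocks(const uint8_t *mask, size_t nbytes, size_t min_len)
    {
        size_t run_start = 0, run_len = 0;

        for (size_t i = 0; i <= nbytes; i++) {
            int symbolic = (i < nbytes) && !bit_is_set(mask, i);
            if (symbolic) {
                if (run_len == 0)
                    run_start = i;
                run_len++;
            } else {
                if (run_len >= min_len)
                    printf("symbolic block: offset %zu, length %zu\n",
                           run_start, run_len);
                run_len = 0;
            }
        }
    }

    int main(void)
    {
        /* 16 bytes: bytes 4..11 symbolic (bits cleared), the rest concrete. */
        uint8_t mask[2] = { 0x0F, 0xF0 };
        find_symbolic_blocks(mask, 16, 4);  /* only blocks big enough to use */
        return 0;
    }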
Search Method
• Search the entire 2^32 address space of the guest process
• Hierarchical search (a sketch follows):
  1. Check the existence of each guest page
  2. For each existing guest page, check which of its ObjectStates contain symbolic data
  3. For each ObjectState that contains symbolic data, search for consecutive symbolic blocks in it
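A compact sketch of the three-level walk. guest_page_exists() and page_object_state() are hypothetical stand-ins for S2E's guest memory map, stubbed with a single fake page so the loop runs on its own; level 3 would call the mask scanner from the previous sketch.

    /* Hierarchical search over the 2^32 guest address space: skip unmapped
     * pages, skip ObjectStates without symbolic data, then scan the mask. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 0x1000u

    struct object_state {              /* simplified ObjectState */
        const uint8_t *concrete_mask;  /* one bit per byte, 0 = symbolic */
        size_t         size;
        int            has_symbolic;   /* cached "any symbolic byte?" flag */
    };

    static uint8_t demo_mask[PAGE_SIZE / 8];   /* all zero: fully symbolic page */
    static struct object_state demo_os = { demo_mask, PAGE_SIZE, 1 };

    static int guest_page_exists(uint32_t page)                  /* hypothetical */
    {
        return page == 0xbfffe000u;
    }
    static struct object_state *page_object_state(uint32_t page) /* hypothetical */
    {
        return page == 0xbfffe000u ? &demo_os : NULL;
    }

    int main(void)
    {
        for (uint64_t addr = 0; addr < (1ull << 32); addr += PAGE_SIZE) {
            if (!guest_page_exists((uint32_t)addr))
                continue;                              /* level 1: page mapped? */
            const struct object_state *os = page_object_state((uint32_t)addr);
            if (!os || !os->has_symbolic)
                continue;                              /* level 2: any symbolic? */
            /* level 3: scan os->concrete_mask (previous sketch) for usable runs */
            printf("page 0x%08x: %zu-byte ObjectState holds symbolic data\n",
                   (uint32_t)addr, os->size);
        }
        return 0;
    }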
Shellcode Injection

Determine NOP Sled Length
• Binary-search-like algorithm (a sketch follows)
• Ensure that
  1. EIP can point into the NOP range
  2. NOPs can fill the range
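A sketch of the binary-search idea for the sled length. sled_is_feasible() stands in for the solver query "can a sled of this length plus the shellcode still satisfy the path constraint in the chosen block?"; it is stubbed with a fixed capacity so the search itself runs.

    /* Binary-search the largest NOP sled the selected symbolic block can
     * still hold together with the shellcode (illustrative sketch). */
    #include <stddef.h>
    #include <stdio.h>

    #define SHELLCODE_LEN 24               /* illustrative */

    /* Hypothetical solver query, stubbed with a fixed block capacity. */
    static int sled_is_feasible(size_t sled_len)
    {
        const size_t block_capacity = 200; /* pretend solver answer */
        return sled_len + SHELLCODE_LEN <= block_capacity;
    }

    /* Largest feasible sled length in [0, max_len]. */
    static size_t max_nop_sled(size_t max_len)
    {
        size_t lo = 0, hi = max_len;       /* invariant: lo is feasible */
        while (lo < hi) {
            size_t mid = lo + (hi - lo + 1) / 2;  /* bias up to make progress */
            if (sled_is_feasible(mid))
                lo = mid;
            else
                hi = mid - 1;
        }
        return lo;
    }

    int main(void)
    {
        printf("usable NOP sled: %zu bytes\n", max_nop_sled(4096));
        return 0;
    }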
Determine EIP Range
• Binary-search-like algorithm
• Try to point EIP to the middle of the NOP sled (a sketch follows)
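In the same spirit, the EIP target can be aimed at the middle of the sled and the tolerated interval widened by a binary-search-like loop. eip_range_is_feasible() is a hypothetical solver check, stubbed here with a fixed sled location so the sketch runs.

    /* Widen the EIP interval around the sled midpoint as far as the (stubbed)
     * feasibility check allows, so small layout shifts still land in NOPs. */
    #include <stdint.h>
    #include <stdio.h>

    static int eip_range_is_feasible(uint32_t lo, uint32_t hi)
    {
        /* Hypothetical solver query; stub: the sled occupies [base, base+176). */
        const uint32_t base = 0xbfff0000u;
        return lo >= base && hi <= base + 176;
    }

    int main(void)
    {
        const uint32_t mid = 0xbfff0000u + 176 / 2;  /* middle of the NOP sled */
        uint32_t radius = 0, step = 4096;

        /* Grow the radius while feasible; otherwise halve the step. */
        while (step > 0) {
            if (eip_range_is_feasible(mid - (radius + step), mid + (radius + step)))
                radius += step;
            else
                step /= 2;
        }
        printf("EIP may land anywhere in [0x%08x, 0x%08x]\n",
               mid - radius, mid + radius);
        return 0;
    }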
Other Optimizations
• Fast construction of the symbolic failure model
  – Fast concolic execution (reducing the input constraint, branch conditions, and path constraint along the failure path) by selective symbolic execution
• Input selection (adaptive symbolic input)
  – Most of the benchmarks used by AEG and MAYHEM can be resolved by dividing the input into smaller symbolic blocks
  – An iterative and still automatic process (a sketch follows)
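A sketch of the adaptive-input iteration as we read it from the slide: mark progressively larger windows of the crash input as symbolic, starting small, until exploit generation succeeds. try_exploit_with_window() is a hypothetical driver that would rerun the concolic exploration with only that window symbolic; it is stubbed so the loop runs.

    /* Adaptive symbolic input (sketch): instead of making the whole crash
     * input symbolic, try small symbolic windows first and double them until
     * one window is enough to generate the exploit. */
    #include <stddef.h>
    #include <stdio.h>

    static int try_exploit_with_window(size_t off, size_t len)
    {
        /* Stub so the loop can run: pretend bytes 512..767 are enough. */
        return off <= 512 && off + len >= 768;
    }

    static int adaptive_input(size_t input_len)
    {
        for (size_t len = 64; len <= input_len; len *= 2)        /* grow window */
            for (size_t off = 0; off + len <= input_len; off += len)
                if (try_exploit_with_window(off, len)) {
                    printf("exploit found with symbolic window [%zu, %zu)\n",
                           off, off + len);
                    return 1;
                }
        return 0;                     /* fall back to a fully symbolic input */
    }

    int main(void)
    {
        adaptive_input(1050);         /* e.g. the hsolink crash input length */
        return 0;
    }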
Outline
• Introduction
• Method
• Implementation
• Experiment results
  – CRAX results
  – Comparisons with AEG benchmarks
  – Comparisons with MAYHEM benchmarks
  – Results of larger programs
CRAX Results (model building; times in seconds)
Parenthesized factors are the speedups of fast concolic execution over plain model building.

Program   | Input source       | Input Length | Advisory ID   | CRAX Time | CRAX Time (fast concolic)
aeon      | Env. Var.          | 550          | CVE-2005-1019 | 298.12    | 19.67 (15.1x)
iwconfig  | Arguments          | 85           | BID-8901      | 4.21      | 2.68 (1.57x)
glftpd    | Arguments          | 300          | OSVDB-16373   | 50.07     | 4.71 (10.63x)
ncompress | Arguments          | 1050         | CVE-2001-1413 | 2000.41   | 53.79 (37.18x)
htget     | Arguments          | 276          | CVE-2004-0852 | 146.72    | 27.19 (5.39x)
htget     | Env. Var.          | 180          | CMU AEG 0-day | -         | -
expect    | Env. Var. (HOME)   | 300          | OSVDB-60979   | 172.50    | 23.51 (7.33x)
expect    | Env. Var. (DOTDIR) | 300          | CMU AEG 0-day | -         | -
rsync     | Env. Var.          | 201          | CVE-2004-2093 | 210.53    | 7.75 (27.1x)
acon      | Env. Var.          | 1300         | CVE-2008-1994 | 3782.50   | 68.86 (54.93x)
gif2png   | Arguments          | 1080         | CVE-2009-5018 | 12254.87  | 89.43 (25.21x)
hsolink   | Arguments          | 1050         | CVE-2010-2930 | 2422.07   | 47.47 (51.02x)
exim      | Arguments          | 304          | EDB-ID#796    | -         | -
aspell    | Stdin              | 300          | CVE-2004-0548 | -         | -
xserver   | Socket             | 104          | CVE-2007-3957 | -         | -
xmail     | Stdin              | 307          | CVE-2005-2943 | -         | -

CRAX Results (fast concolic and adaptive input; times in seconds)

Program   | Input source       | Input Length | Advisory ID   | CRAX Time (fast concolic) | CRAX Time (adaptive)
aeon      | Env. Var.          | 550          | CVE-2005-1019 | 32.0                      | 2.6
iwconfig  | Arguments          | 85           | BID-8901      | 3.6                       | 0.7
glftpd    | Arguments          | 300          | OSVDB-16373   | 8.0                       | 0.5
ncompress | Arguments          | 1050         | CVE-2001-1413 | 99.4                      | 0.7
htget     | Arguments          | 276          | CVE-2004-0852 | 35.5                      | 2.9
htget     | Env. Var.          | 180          | CMU AEG 0-day | 5.1                       | 1.17
expect    | Env. Var. (HOME)   | 300          | OSVDB-60979   | 29.4                      | 2.7
expect    | Env. Var. (DOTDIR) | 300          | CMU AEG 0-day | 29.3                      | 3.56
rsync     | Env. Var.          | 201          | CVE-2004-2093 | 9.9                       | 2.7
acon      | Env. Var.          | 1300         | CVE-2008-1994 | 32.0                      | 2.7
gif2png   | Arguments          | 1080         | CVE-2009-5018 | 154.7                     | 1.69
hsolink   | Arguments          | 1050         | CVE-2010-2930 | 103.9                     | 2.4
exim      | Arguments          | 304          | EDB-ID#796    | 122.3                     | 4.3
aspell    | Stdin              | 300          | CVE-2004-0548 | 14.5                      | 1.7
xserver   | Socket             | 104          | CVE-2007-3957 | 14.4                      | 2.5
xmail     | Stdin              | 307          | CVE-2005-2943 | 371.7                     | 171.0

Comparisons with AEG Benchmarks (times in seconds)
AEG times were measured on a Core i7 3.4 GHz; CRAX times on a Core 2 2.66 GHz. Speedup is AEG time over CRAX (adaptive) time.

Program   | Input source       | Input Length | AEG Time | CRAX Time | CRAX Time (adaptive) | Speedup
aeon      | Env. Var.          | 550          | 3.8      | 32.0      | 2.6                  | 1.5x
iwconfig  | Arguments          | 85           | 1.5      | 3.6       | 0.7                  | 2.1x
glftpd    | Arguments          | 300          | 2.3      | 8.0       | 0.5                  | 4.6x
ncompress | Arguments          | 1050         | 12.3     | 99.4      | 0.7                  | 17.6x
htget     | Arguments          | 276          | 57.2     | 35.5      | 2.9                  | 19.7x
htget     | Env. Var.          | 180          | 1.2      | 5.1       | 1.17                 | 1.0x
expect    | Env. Var. (HOME)   | 300          | 187.6    | 29.4      | 2.7                  | 69.5x
expect    | Env. Var. (DOTDIR) | 300          | 186.7    | 29.3      | 3.56                 | 52.44x
rsync     | Env. Var.          | 201          | 19.7     | 9.9       | 2.7                  | 7.3x
acon      | Env. Var.          | 1300         | -        | 32.0      | 2.7                  | -
gif2png   | Arguments          | 1080         | -        | 154.7     | 1.69                 | -
hsolink   | Arguments          | 1050         | -        | 103.9     | 2.4                  | -
exim      | Arguments          | 304          | 33.8     | 122.3     | 4.3                  | 7.9x
aspell    | Stdin              | 300          | 15.2     | 14.5      | 1.7                  | 8.9x
xserver   | Socket             | 104          | 31.9     | 14.4      | 2.5                  | 12.8x
xmail     | Stdin              | 307          | 1276.0   | 371.7     | 171.0                | 7.5x

Comparisons with MAYHEM Benchmarks (Linux; times in seconds)
Mayhem times were measured on a Core i7 3.4 GHz; CRAX times on a Core 2 2.66 GHz. Parenthesized factors are the reported speedups over Mayhem.

Program       | Input source | Input Length | Mayhem Time | CRAX Time | CRAX Time (adaptive)
Aeon          | Env. Var.    | 550          | 10          | 32.0      | 2.6 (38.4x)
Aspell        | Stdin        | 750          | 82          | 14.5      | 1.7 (48.2x)
Glftpd        | Arguments    | 300          | 4           | 8.0       | 0.5 (8x)
Htget         | Env. Var.    | 350          | 7           | 5.1       | 1.17 (5.9x)
Iwconfig      | Arguments    | 400          | 2           | 3.6       | 0.7 (2.9x)
nCompress     | Arguments    | 1400         | 11          | 99.4      | 0.7 (15.7x)
Rsync         | Env. Var.    | 100          | 8           | 9.9       | 2.7 (3x)
Mbse-bbs      | Env. Var.    | 4200         | 362.0       | 784.5     | 26.9 (13.4x)
PSUtils       | Arguments    | 300          | 46.0        | 122.6     | 25.4 (1.8x)
Htpasswd      | Arguments    | 400          | 4.0         | 5.2       | 0.4 (10x)
Squirrel Mail | Arguments    | 150          | 2           | 5.6       | 0.9 (2.2x)

Comparisons with MAYHEM Benchmarks (Windows; times in seconds)
Mayhem times were measured on a Core i7 3.4 GHz; CRAX times on a Core 2 2.66 GHz. Parenthesized factors are the reported speedups over Mayhem.

Program    | Input source | Input Length | Mayhem Time | CRAX Time
Coolplayer | File         | 210          | 164.0       | 140.7 (1.4x)
Distiny    | File         | 2100         | 963.0       | 60.8 (15.8x)
Dizzy      | Arguments    | 519          | 13260.0     | 313.0 (only explore)
GAlan      | File         | 1500         | 831.0       | 26.1 (31x)
GSPlayer   | File         | 400          | 120.0       | 33.3 (36x)

Results of CRAXing Larger Programs (times in seconds)

Program           | Input source | Input Length | Explore Time | Exploit Gen. Time | Explore Time (adaptive) | Exploit Gen. Time (adaptive)
Unrar             | Arguments    | 5000         | 1388.5       | 2569.8            | 11.7                    | 1.8
Mplayer (Linux)   | File         | 145          | 145.8        | 151.2             | 3.3                     | 0.3
Mplayer (Windows) | File         | 5568         | 1713.8       | 2939.4            | -                       | -
Foxit Reader      | File         | 10503        | 5211.1       | 10094.2           | -                       | -

Program           | Constraint Size (Bytes) | Symbolically Executed Instructions
Unrar             | 2.91M                   | 1,177,301
Mplayer (Windows) | 3.89M                   | 1,146,887
Foxit Reader      | 3.91M                   | 1,825,260
Comparisons of AEG Features

System                       | Heelan's (Sep 2009) | APEG (May 2008) | AEG (Feb 2011)        | MAYHEM (May 2012)                         | CRAX (June 2012)
Exploit-gen                  | Yes                 | No              | Yes                   | Yes                                       | Yes
End-to-end                   | No                  | No              | Yes                   | Yes                                       | Yes
Source/Binary                | Source              | Binary          | Source                | Binary                                    | Binary
Instrument                   | PIN                 | -               | QEMU                  | PIN                                       | QEMU
Symbolic Environment         | No                  | -               | Incomplete (8000 LOC) | Incomplete (30 system calls on Linux)     | All environment via 6 models of S2E (100 LOC)
Symbolic Memory (Concrete)   | -                   | -               | No (abstract)         | Yes (implemented with effort, 27000 LOC)  | Yes (built into S2E, small effort)
Selected Symbolic Execution  | -                   | -               | -                     | Partial                                   | Selected code/path/input (6000 LOC)
Performance                  | -                   | -               | Fast                  | Slow                                      | Faster (larger and much faster, about 10x)
Scale                        | XBMC                | -               | xmail                 | Dizzy                                     | Mplayer / Foxit PDF reader
Platforms                    | Linux               | -               | Linux                 | Linux/Windows                             | Linux/Windows/Web
Applicability                | -                   | -               | process               | process                                   | process/system/kernel
Conclusions
CRAX: test if a crash is exploitable
• Exploit generation is a single-path concolic execution (without forking), with no path explosion
  – It should be separated from the bug-finding process (which may suffer path explosion)
  – AEG and MAYHEM mix bug finding with exploit generation
• Vulnerability independent
  – Memory corruption (stack, heap, use of uninitialized variables)
  – Crash: tainted continuations
    • ret / jmp_buf / SEH / for, while, if branch predicates tainted
  – Exploitable: symbolic continuations

Lessons Learned
• Symbolic EIP detection process
  – Reconstructing the symbolic failure model (the crash model)
• Applications of a realistic symbolic crash model
  – Manipulate the crash (exploit generation)
  – Diagnose the crash (bug forensics)
  – Better understand the crash (fault localization)

Further Work
• CRAXing IE, Firefox, Acrobat PDF Reader, Office, and antivirus software in driver mode
• Automating exploit generation for most CVEs within a few hours
• Zero-day exploit generation (needs zero-day crash generation)
• Anti-mitigation exploit generation (ASLR + W⊕X, EMET)
• Web platform-independent exploit generation (PHP, JSP, ASP, Ruby, Python)
• A bug is an implicit backdoor
  – Symbolic continuations as implicit backdoors for crashed software (with process continuations)

The Impact
• Much easier to implement a binary AEG
  – S2E is available to the "poor man"
  – Symbolic EIP detection is quite easy in S2E
  – Binary AEG is no longer a challenging task
• BUG = Vulnerability?
• BUG = Backdoor?
