AEG: Automatic Exploit Generation
Thanassis Avgerinos, Sang Kil Cha, Brent Lim Tze Hao and David Brumley
Carnegie Mellon University, Pittsburgh, PA
NDSS Symposium 2011
redhung@SQLab, NYCU
>_ Outline
● Introduction
● Overview
● Approach
● Problems and solutions
● Evaluation
Introduction
>_ Introduction
● Manual Exploit Generation
○ Locate the address where the overflow occurs
○ Locate the return address
○ Construct the shellcode
● Automatic Exploit Generation
○ Automatically find vulnerabilities
○ Detect exploitable bugs
○ Generate exploits
>_ Introduction
● Challenges
○ Source code analysis alone is inadequate
■ char src[12], dst[10]; strncpy(dst, src, sizeof(src));
○ Infinite number of possible paths
■ Which paths should we check first?
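The strncpy example above can be sketched with a toy memory model. This is only an illustration in Python, not how AEG analyzes programs: the flat 24-byte "stack", the two return-address offsets (10 and 16) and the marker bytes b"RETA" are all assumptions made up for the demonstration. It shows why the source alone is not enough: the same 2-byte overflow may or may not reach the return address depending on the layout the compiler chose.

```python
DST_SIZE = 10   # char dst[10]
SRC_SIZE = 12   # char src[12]

def bytes_past_dst(n, dst_size=DST_SIZE):
    """Number of bytes an n-byte copy writes beyond the end of dst."""
    return max(0, n - dst_size)

def return_bytes_hit(n, ret_off, ret_size=4):
    """How many return-address bytes an n-byte write starting at offset 0 covers."""
    return max(0, min(n, ret_off + ret_size) - ret_off)

def strncpy_model(mem, src, n):
    """Mimic C strncpy into offset 0: write exactly n bytes, NUL-padding a short src."""
    body = src.split(b"\x00", 1)[0][:n]
    mem[0:n] = body + b"\x00" * (n - len(body))

def simulate(ret_off):
    """Place a fake return address at ret_off (hypothetical layout) and run the copy."""
    mem = bytearray(24)
    mem[ret_off:ret_off + 4] = b"RETA"
    strncpy_model(mem, b"A" * SRC_SIZE, SRC_SIZE)
    return bytes(mem[ret_off:ret_off + 4])

if __name__ == "__main__":
    print("bytes written past dst:", bytes_past_dst(SRC_SIZE))
    print("tight layout  (ret at 10):", simulate(10))
    print("padded layout (ret at 16):", simulate(16))
```

In the tight layout the marker is partly overwritten, while in the padded layout it survives, so run-time information from the binary is needed to decide exploitability.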
Overview
>_ Overview
● Condition
○ Stack overflow or format string
○ Overwrite the return address to hijack control flow
○ Not designed to defeat common security defenses, e.g. NX, ASLR
○ Source code is needed
○ Linux x86
>_ Overview
● Steps for spawning a shell
○ Find the bug
○ Get the run-time information
○ Generate the exploit
○ Verify the exploit
>_ Overview
● Modeling
○ The Unsafe Path Predicate Πbug
■ Safety property Φ
○ The Exploit Predicate Πexploit
■ Attacker’s logic
○ Πbug(ε) ∧ Πexploit(ε) = true
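The condition Πbug(ε) ∧ Πexploit(ε) = true can be illustrated with two ordinary Python predicates over a concrete input ε. This is a minimal sketch under stated assumptions: BUF_SIZE, RET_OFF, BUF_ADDR and the placeholder SHELLCODE bytes are hypothetical values, and a real system would hand the conjunction to a constraint solver instead of checking hand-built inputs.

```python
BUF_SIZE = 8                 # hypothetical buffer size guarded by Φ
RET_OFF = 12                 # hypothetical offset of the return address from the buffer start
BUF_ADDR = 0xBFFFF710        # hypothetical run-time address of the buffer
SHELLCODE = b"\xcc\xcc"      # placeholder bytes, not real shellcode

def pi_bug(eps: bytes) -> bool:
    """Unsafe path predicate: the input violates Φ ("writes stay inside the buffer")."""
    return len(eps) > BUF_SIZE

def pi_exploit(eps: bytes) -> bool:
    """Exploit predicate: return address points at the buffer, which holds the shellcode."""
    ret = eps[RET_OFF:RET_OFF + 4]
    return (len(ret) == 4
            and int.from_bytes(ret, "little") == BUF_ADDR
            and eps.startswith(SHELLCODE))

def is_exploit(eps: bytes) -> bool:
    """ε is an exploit only when both predicates hold."""
    return pi_bug(eps) and pi_exploit(eps)

def build_candidate() -> bytes:
    """Hand-built input: shellcode, padding up to RET_OFF, then the buffer address."""
    padding = b"A" * (RET_OFF - len(SHELLCODE))
    return SHELLCODE + padding + BUF_ADDR.to_bytes(4, "little")

if __name__ == "__main__":
    for eps in (b"hi", b"A" * 16, build_candidate()):
        print(eps, pi_bug(eps), pi_exploit(eps), is_exploit(eps))
```

The input b"A" * 16 triggers the bug but is not an exploit, which is the gap between bug finding and exploit generation that the two predicates capture.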
Approach
>_ Approach
● 1. Pre-Process
○ src → (Bgcc, Bllvm)
○ Bgcc for binary analysis
○ Bllvm for source code analysis
>_ Approach
● 2. Src-Analysis
○ Bllvm → max
○ Static analysis
○ Generate the maximum size of symbolic data max
>_ Approach
● 3. Bug-Find
○ (Bllvm, Φ, max) → (Πbug, V)
○ V contains source-level information
○ e.g. buffer name, vulnerable function name
>_ Approach
● 4. DBA ( Dynamic Binary Analysis )
○ (Bgcc, (Πbug, V)) → R
○ R represents run-time information
○ e.g. stack address, return address, stack frame
>_ Approach
● 5. Exploit-Gen
○ (Πbug, R) → Πbug ∧ Πexploit
○ Program counter points to a user-determined location
○ The location contains shellcode
>_ Approach
● 6. Verify
○ (Bgcc, Πbug ∧ Πexploit) → { ε, ⊥ }
○ ε: a working exploit, if verification succeeds
○ ⊥: returned if the generated exploit fails
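The data flow between the six steps above can be sketched as plain function composition. This is only a sketch of how the inputs and outputs listed on the slides connect: the Stages container, the run_pipeline name and the use of None for ⊥ are assumptions for illustration, and every stage is a caller-supplied stand-in rather than a real analysis.

```python
from typing import Callable, NamedTuple

class Stages(NamedTuple):
    """Caller-supplied stand-ins for the six steps (all hypothetical)."""
    pre_process: Callable    # src -> (b_gcc, b_llvm)
    src_analysis: Callable   # b_llvm -> max_len
    bug_find: Callable       # (b_llvm, phi, max_len) -> iterable of (pi_bug, v)
    dba: Callable            # (b_gcc, pi_bug, v) -> r
    exploit_gen: Callable    # (pi_bug, r) -> formula for pi_bug AND pi_exploit
    verify: Callable         # (b_gcc, formula) -> exploit, or None for ⊥

def run_pipeline(src, phi, st: Stages):
    """Chain the steps; try each reported bug until one yields a verified exploit."""
    b_gcc, b_llvm = st.pre_process(src)
    max_len = st.src_analysis(b_llvm)
    for pi_bug, v in st.bug_find(b_llvm, phi, max_len):
        r = st.dba(b_gcc, pi_bug, v)            # run-time info for this bug
        formula = st.exploit_gen(pi_bug, r)
        exploit = st.verify(b_gcc, formula)
        if exploit is not None:
            return exploit
    return None                                  # ⊥: no working exploit found
```

Passing the stages in as callables keeps the sketch honest about what it does not implement: only the wiring between steps is shown.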
Problems and Solutions
>_ Problems
● Traditional symbolic execution
○ State space explosion problem
○ Path selection problem
● Other
○ Environment modelling problem
>_ Solutions
● Preconditioned symbolic execution
○ State space is determined by Πprec
○ Known length
○ Known prefix
○ Concolic execution
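The effect of a known-length precondition can be sketched with a toy explorer for a strlen-style loop over a symbolic string. This is an illustration under stated assumptions, not the paper's engine: the explore function, its known_len parameter and the fork counter are hypothetical, and feasibility is decided by a simple rule instead of a solver.

```python
def explore(max_len, known_len=None):
    """Walk a loop that tests each symbolic byte for NUL.

    Returns (exits, forks): the indices at which the loop can exit,
    and how many times both branches were feasible (a state fork).
    """
    def feasible(i, is_nul):
        if known_len is None:
            return True                    # no precondition: any byte value is possible
        # Πprec (known length): byte i is NUL exactly when i == known_len
        return is_nul == (i == known_len)

    exits, forks = [], 0
    i = 0
    while i <= max_len:
        can_exit = feasible(i, True)                         # branch: byte i == NUL
        can_continue = i < max_len and feasible(i, False)    # branch: byte i != NUL
        if can_exit:
            exits.append(i)
        if can_exit and can_continue:
            forks += 1
        if not can_continue:
            break
        i += 1
    return exits, forks

if __name__ == "__main__":
    print("no precondition:", explore(4))
    print("known length 3: ", explore(4, known_len=3))
```

Without a precondition every loop iteration forks; with the length fixed, the branch that contradicts Πprec is pruned and a single path remains.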
>_ Solutions
● Path prioritization
○ Buggy-Path-First
○ Loop exhaustion
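Path prioritization amounts to choosing which pending state the executor runs next. The scheduler below is a minimal sketch of that idea with a priority queue. The Scheduler class and the ordering key, which puts states that already hit a bug first and then prefers states with more loop iterations, are assumptions for illustration; the heuristics in the paper are more involved than a single sort key.

```python
import heapq
from itertools import count

class Scheduler:
    """Toy priority queue of pending execution states (hypothetical design)."""

    def __init__(self):
        self._heap = []
        self._tie = count()      # insertion order breaks ties deterministically

    def push(self, name, saw_bug=False, loop_iters=0):
        # Lower keys pop first: buggy paths (0) before others (1),
        # then paths that have iterated loops more often.
        key = (0 if saw_bug else 1, -loop_iters, next(self._tie))
        heapq.heappush(self._heap, (key, name))

    def pop(self):
        return heapq.heappop(self._heap)[1]

    def __len__(self):
        return len(self._heap)

if __name__ == "__main__":
    s = Scheduler()
    s.push("plain path", loop_iters=1)
    s.push("path with earlier bug", saw_bug=True)
    s.push("loop-heavy path", loop_iters=5)
    while len(s):
        print(s.pop())
```

Folding both heuristics into one key is a simplification; separate queues or a weighted score would be equally reasonable designs.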
>_ Solutions
● Environment modelling
○ Symbolic files
○ Symbolic sockets
○ Environment variables
○ Library function calls and system calls
Evaluation
>_ Evaluation
● Experimental setup
○ 2.4 GHz Intel(R) Core 2 Duo CPU
○ 4GB of RAM
○ Debian Linux 2.6.26-2
○ LLVM-GCC 2.7
○ GCC 4.2.4
THANKS
