- MAYHEM is a system for automatically generating exploits for binary programs by combining concrete and symbolic execution. It is designed to keep making forward progress without repeating or discarding earlier analysis work.
- It uses a hybrid execution approach built on two concurrently running processes: a concrete executor client and a symbolic executor server. The client runs the program natively, while the server performs symbolic analysis.
- A key challenge is symbolic memory, where a load or store address depends on user input. MAYHEM handles this with techniques such as value set analysis to bound the possible addresses and index search trees to search the memory state space efficiently.
MAYHEM Unleashing Automatic Exploit Generation on Binary Code
1. redhung@SQLab, NYCU
Unleashing MAYHEM on Binary Code
Sang Kil Cha, Thanassis Avgerinos, Alexandre Rebert and David Brumley, Carnegie Mellon University
2012 IEEE Symposium on Security and Privacy
Exporting from Keynote to PowerPoint breaks the layout; please bear with it.
4. >_ Introduction
•AEG
---
-The automatic exploit generation (AEG) challenge: given a program, automatically find vulnerabilities and generate exploits for them.
-Examples: Heelan, AEG, MAYHEM, CRAX
5. >_ Introduction
•MAYHEM
---
-MAYHEM’s design is based on four main principles:
1. The system should be able to make forward progress for an arbitrarily long time, ideally running “forever”
2. To maximize performance, the system should not repeat work
3. The system should not throw away any work: previous analysis results should be reusable on subsequent runs
4. The system should be able to reason about symbolic memory, where a load or store address depends on user input
6. >_ Introduction
•Symbolic Executors
---
-Current executors fall into two main categories: offline executors and online executors
-Offline executors concretely run a single execution path and then symbolically execute it
-Online executors try to execute all possible paths in a single run of the system
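The difference between the two categories can be sketched on a toy program. This is only an illustration, not MAYHEM's implementation: the three branch conditions, the single input byte `x`, and the brute-force `feasible` check (a stand-in for a real constraint solver) are all assumptions made for the example.

```python
# Toy "program": three branches that all depend on one input byte x.
BRANCHES = [
    ("x > 100", lambda x: x > 100),
    ("x % 2 == 0", lambda x: x % 2 == 0),
    ("x < 50", lambda x: x < 50),
]


def feasible(path):
    """Stand-in for a solver: is there a byte x that takes this (partial) path?"""
    return any(
        all(cond(x) == taken for (_, cond), taken in zip(BRANCHES, path))
        for x in range(256)
    )


def offline_execute(x):
    """Offline style: follow only the path taken by one concrete input."""
    return tuple(cond(x) for _, cond in BRANCHES)


def online_execute():
    """Online style: fork at every branch and keep all feasible states."""
    states = [()]
    for _ in BRANCHES:
        states = [
            s + (taken,)
            for s in states
            for taken in (True, False)
            if feasible(s + (taken,))
        ]
    return states


one_path = offline_execute(7)   # a single path per run
all_paths = online_execute()    # every feasible path in one run
```

In this sketch the offline executor needs a fresh run from the start for every new path, while the online executor holds all forked states at once, which is where memory pressure comes from on real programs.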
7. >_ Introduction
•Overall, MAYHEM makes the following contributions:
---
-Hybrid execution
-Index-based memory modeling
-Binary-only exploit generation
9. >_ Overview
---
•We use an HTTP server, orzHttpd, as an example to highlight the main challenges and show how MAYHEM works.
•Note that we show source for clarity and simplicity; MAYHEM runs on binary code
17. >_ Overview
---
•We highlight several key points for finding exploitable bugs:
-Low-level details matter
-There are an enormous number of paths
-The more paths checked, the better
-Execute as much natively as possible
18. >_ Overview
---
•MAYHEM consists of two concurrently running processes:
-Concrete Executor Client (CEC)
-Symbolic Executor Server (SES)
38. >_ Techniques
---
•Symbolic indices
x = user_input();
y = mem[x];
assert ( y == 42 );
-x can be anything: which memory cell contains 42?
-2^32 cells to check (memory addresses 0 to 2^32 - 1)
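A minimal sketch of why an unconstrained symbolic index is expensive. The memory is scaled down to 2^16 cells so it runs quickly, and the cell holding 42 is placed at an arbitrary address; both choices are assumptions for illustration only.

```python
MEM_SIZE = 2 ** 16  # scaled down from 2^32 so the sketch runs quickly


def make_memory():
    """Build a flat memory where exactly one (arbitrary) cell holds 42."""
    mem = [0] * MEM_SIZE
    mem[0x1234] = 42
    return mem


def solve_symbolic_load(mem, target):
    """Find every index x with mem[x] == target by examining every cell.

    Returns the satisfying indices and the number of cells examined.
    """
    solutions = [x for x, value in enumerate(mem) if value == target]
    return solutions, len(mem)


solutions, cells_checked = solve_symbolic_load(make_memory(), 42)
```

With no bound on `x`, the number of cells examined equals the size of the address space, which is what makes the 2^32 case impractical to handle naively.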
41. >_ Techniques
---
•Another cause: table lookups
-Very frequent in standard APIs
-e.g. sscanf, vfprintf, toupper, tolower, etc.
-If user input is passed as an argument, these functions use it as a memory index to look up a table
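The table-lookup pattern can be sketched as follows. `TOUPPER_TABLE` and `table_toupper` are hypothetical names that model a table-driven case conversion for ASCII; they are not taken from any particular C library.

```python
# Hypothetical 256-entry table: lowercase ASCII letters map to uppercase,
# every other byte maps to itself.
TOUPPER_TABLE = [
    c - 32 if ord("a") <= c <= ord("z") else c for c in range(256)
]


def table_toupper(c):
    """The input byte is used directly as a memory index into the table."""
    return TOUPPER_TABLE[c]


def inputs_mapping_to(target):
    """All input bytes whose table entry equals target."""
    return [c for c in range(256) if TOUPPER_TABLE[c] == target]


candidates = inputs_mapping_to(ord("A"))
```

If the argument is symbolic, the load inside `table_toupper` has a symbolic address, and a later check on the result has to consider every table entry that could satisfy it.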
45. >_ Techniques
---
•Use symbolic execution state to:
1. Bound the memory addresses referenced
-Value Set Analysis (VSA)
2. Build a search tree over the possible memory address values
•Other algorithms:
-Refinement Cache
-Lemma Cache
-Index Search Trees (ISTs)
-Bucketization with Linear Functions
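A simplified sketch of how an index search tree and bucketization with linear functions might fit together. It assumes the index bounds `lo..hi` are already known (for example from VSA), uses a hypothetical uppercase-conversion table as the memory region, and uses a greedy grouping strategy chosen for illustration; it is not MAYHEM's actual algorithm.

```python
def bucketize(table, lo, hi):
    """Greedily group indices lo..hi into runs where table[i] == a * i + b.

    Each bucket is a 4-tuple (first_index, last_index, a, b).
    """
    buckets = []
    i = lo
    while i <= hi:
        if i == hi:  # single trailing entry: constant function
            buckets.append((i, i, 0, table[i]))
            break
        a = table[i + 1] - table[i]
        b = table[i] - a * i
        j = i + 1
        while j + 1 <= hi and table[j + 1] == a * (j + 1) + b:
            j += 1
        buckets.append((i, j, a, b))
        i = j + 1
    return buckets


def build_ist(buckets):
    """Balanced binary search tree keyed on the index; leaves are buckets.

    Internal nodes are 3-tuples (split_index, left, right).
    """
    if len(buckets) == 1:
        return buckets[0]
    mid = len(buckets) // 2
    split = buckets[mid][0]  # first index covered by the right subtree
    return (split, build_ist(buckets[:mid]), build_ist(buckets[mid:]))


def lookup(tree, i):
    """Walk the tree with comparisons on i, then apply the leaf's function."""
    while len(tree) == 3:
        split, left, right = tree
        tree = left if i < split else right
    _, _, a, b = tree
    return a * i + b


# Hypothetical memory region: an ASCII uppercase-conversion table.
table = [c - 32 if 97 <= c <= 122 else c for c in range(256)]
buckets = bucketize(table, 0, 255)
tree = build_ist(buckets)
```

Here 256 index-value pairs collapse into three linear buckets, so a lookup on the index needs at most two comparisons instead of one case per table entry.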