SlideShare a Scribd company logo
Klee and Angr
bananaappletw @ UCCU Hacker
2017/10/29
$whoami
• 陳威伯(bananaappletw)
• Master of National Chiao Tung
University
• Organizations:
• Software Quality Laboratory
• Bamboofox member
• Vice president of NCTUCSC
• Specialize in:
• symbolic execution
• binary exploit
• Talks:
• HITCON CMT 2015
• HITCON CMT 2017
What is symbolic execution?
• Symbolic execution is a means of analyzing a program to determine
what inputs cause each part of a program to execute
• System-level
• S2e(https://github.com/dslab-epfl/s2e)
• User-level
• Angr(http://angr.io/)
• Triton(https://triton.quarkslab.com/)
• Code-based
• klee(http://klee.github.io/)
Symbolic execution
Z == 12
fail() "OK"
Klee
• Symbolic virtual machine built on top of the LLVM compiler
infrastructure
• Website: http://klee.github.io/
• Github: https://github.com/klee/klee
• Klee paper: http://llvm.org/pubs/2008-12-OSDI-KLEE.pdf (Worth
reading)
• Main goal of Klee:
1. Hit every line of executable code in the program
2. Detect at each dangerous operation
Installation
• Install docker
• sudo apt-get install docker.io
• Get klee image
• sudo docker pull klee/klee
• Run klee docker image
• sudo docker run --rm -ti --ulimit='stack=-1:-1' klee/klee
• Test environment
• Ubuntu 17.04 desktop amd64
Introduction
• Klee is a symbolic machine to generate test cases
• Need source code to compile to LLVM bitcode
• Steps:
• Replace input with Klee function to make memory region symbolic
• Compile source code to LLVM bitcode
• Run Klee
• Get the test cases and path's information
Steps
• Source code at ~/klee_src/examples/get_sign.c
• Include klee library
#include <klee/klee.h>
• Make input symbolic
klee_make_symbolic(&a, sizeof(a), "a");
• Compile to LLVM bitcode
clang -I ../../include -emit-llvm -c -g get_sign.c
Steps
• Examine LLVM bitcode
llvm-dis get_sign.bc
• Running KLEE
klee get_sign.bc
• Show information about test case
ktest-tool ./klee-last/*.ktest
get_sign.c
#include <klee/klee.h>
int get_sign(int x) {
if (x == 0)
return 0;
if (x < 0)
return -1;
else
return 1;
}
int main() {
int a;
klee_make_symbolic(&a, sizeof(a), "a");
return get_sign(a);
}
get_sign.ll
define i32 @main() #0 {
%1 = alloca i32, align 4
%a = alloca i32, align 4
store i32 0, i32* %1
call void @llvm.dbg.declare(metadata !{i32*
%a}, metadata !25), !dbg !26
%2 = bitcast i32* %a to i8*, !dbg !27
call void @klee_make_symbolic(i8* %2, i64 4,
i8* getelementptr inbounds ([2 x i8]* @.str,
i32 0, i32 0)), !dbg !27
%3 = load i32* %a, align 4, !dbg !28
%4 = call i32 @get_sign(i32 %3), !dbg !28
ret i32 %4, !dbg !28
}
Result
Functions
• klee_make_symbolic(address, size, name)
Make memory address symbolic
klee_make_symbolic(&c, sizeof(c), "c");
Notice: Don't overlap symbolic memory regions
• klee_assume(condition)
Add condition constraint on input
klee_assume((c==2)||(d==3));
• klee_prefer_cex(object, condition)
When generating test cases, prefer certain values between equivalent test
cases
klee_prefer_cex(input, 32 <= input[i] && input[i] <= 126);
Functions
• Insert assert on target path
klee_assert(0)
• Find assert test case
ls ./klee-last/ | grep .assert
• Show information about test case
ktest-tool ./klee-last/*.ktest
• Using GNU c library functions
Run with --libc=uclibc --posix-runtime parameters
Environment Modeling
• For example:
• command-line arguments
• environment variables
• file
• data and metadata
• network packets
• ……
• Klee redirects library call to models
• If fd refers to a concrete file
Klee use pread to multiplex access from KLEE’s many states onto the one actual
underlying file descriptor
• If fd refers to a symbolic file
read() copies from the underlying symbolic buffer
Environment Modeling
• -sym-arg <N>
Replace by a symbolic argument with length N
• -sym-args <MIN> <MAX> <N>
Replace by at least MIN arguments and at most MAX arguments, each with
maximum length N
• -sym-files <NUM> <N>
Make NUM symbolic files (‘A’, ‘B’, ‘C’, etc.), each with size N (excluding stdin)
• -sym-stdin <N>
Make stdin symbolic with size N
Solver
• MetaSMT
• STP(Default option for Klee)
• Z3
Architecture
• Represent Klee as an interpreter loop which select a state to run and
then symbolically executes a single instruction in the context of that
state
• When Klee meets the conditional branch instruction, it clones the
state so that it could explore both paths
• By implementing heap as immutable map, portions of the heap
structure itself can also be shared amongst multiple states(copy on
write)
Architecture
• Potentially dangerous operations implicitly generate branches that
check if any input value exists that could cause an error(e.g., zero
divisor)
• When processing instruction, if all given operands are concrete,
performs the operation natively, returning a constant expression
• When KLEE detects an error or when a path reaches an exit call, Klee
solves the current path's constraints
Diagram
1. Step the program until it
meets the branch
#include <klee/klee.h>
int get_sign(int x) {
if (x == 0)
return 0;
if (x < 0)
return -1;
else
return 1;
}
int main() {
int a;
klee_make_symbolic(&a, sizeof(a), "a");
return get_sign(a);
}
Diagram
1. Step the program until it
meets the branch
2. If all given operands are
concrete, return constant
expression. If not, record
current condition constraints
and clone the state.
#include <klee/klee.h>
int get_sign(int x) {
if (x == 0)
return 0;
if (x < 0)
return -1;
else
return 1;
}
int main() {
int a;
klee_make_symbolic(&a, sizeof(a), "a");
return get_sign(a);
}
Diagram
1. Step the program until it
meets the branch
2. If all given operands are
concrete, return constant
expression. If not, record
current condition constraints
and clone the state
3. Step the states until they hit
exit call or error
X==0
Constraints:
X!=0
Next instruction:
if (x < 0)
Constraints:
X==0
Next instruction:
return 0;
Diagram
1. Step the program until it
meets the branch
2. If all given operands are
concrete, return constant
expression. If not, record
current condition constraints
and clone the state
3. Step the states until they hit
exit call or error
4. Solve the conditional
constraint
X==0
Constraints:
X!=0
Next instruction:
if (x < 0)
Constraints:
X==0
Next instruction:
return 0;
Diagram
1. Step the program until it meets the branch
2. If all given operands are concrete, return constant expression. If
not, record current condition constraints and clone the state
3. Step the states until they hit exit call or error
4. Solve the conditional constraint
5. Loop until no remaining states or user-defined timeout is reached
Exercises
• ccr
Sum of continuous number
https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/67
• Suicide
Product of continuous number
https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/68
Angr
• Website: http://angr.io/
• Angr is a python framework for analyzing binaries. It combines both
static and dynamic symbolic ("concolic") analysis, making it applicable
to a variety of tasks.
• Flow
• Loading a binary into the analysis program.
• Translating a binary into an intermediate representation (IR).
• Performing the actual analysis
Installation
• sudo apt-get install -y python-pip python-dev libffi-dev build-essential
virtualenvwrapper
• Virtualenv
• mkvirtualenv angr && pip install angr
• Direct install in system
• sudo pip install angr
• Test environment
• Ubuntu 17.04 desktop amd64
Flow
• Import angr
import angr
• Load the binary and initialize angr project
project = angr.Project('./ais3_crackme')
• Define argv1 as 100 bytes bitvectors
argv1 = claripy.BVS("argv1",100*8)
• Initialize the state with argv1
state = project.factory.entry_state(args=["./crackme1",argv1])
Flow
• Initialize the simulation manager
simgr = p.factory.simgr(state)
• Explore the states that matches the condition
simgr.explore(find= 0x400602)
• Extract one state from found states
found = simgr.found[0]
• Solve the expression with solver
solution = found.solver.eval(argv1, cast_to=str)
ais3 crackme
• Binary could be found in: https://github.com/angr/angr-
doc/blob/master/examples/ais3_crackme/
• Run binary with argument
• If argument is correct
print "Correct! that is the secret key!"
• else
print "I'm sorry, that's the wrong secret key!"
Target address
Solution
import angr
import claripy
project = angr.Project("./ais3_crackme")
argv1 = claripy.BVS("argv1",100*8)
state = project.factory.entry_state(args=["./crackme1",argv1])
simgr = project.factory.simgr(state)
simgr.explore(find=0x400602)
found = simgr.found[0]
solution = found.solver.eval(argv1, cast_to=str)
print(repr(solution))
Result
Loader
• Load the binary through angr.Project
• Loading options
• auto_load_libs
• When there is no SimProcedure
• True(Default, real library function is executed)
• False(return a generic "stub" SimProcedure called ReturnUnconstrained)
Intermediate Representation
• In order to be able to analyze and execute machine code from
different CPU architectures, Angr performs most of its analysis on an
intermediate representation
• Angr's intermediate representation is VEX(Valgrind), since the
uplifting of binary code into VEX is quite well supported
Intermediate Representation
• IR abstracts away several architecture differences when dealing with
different architectures
• Register names: VEX models the registers as a separate memory space, with
integer offsets
• Memory access: The IR abstracts difference between architectures access
memory in different ways
• Memory segmentation: Some architectures support memory segmentation
through the use of special segment registers
• Instruction side-effects: Most instructions have side-effects
Intermediate Representation
• addl %eax, %ebx • t3 = GET:I32(0)
• # get %eax, a 32-bit integer
• t2 = GET:I32(12)
• # get %ebx, a 32-bit integer
• t1 = Add32(t3,t2)
• # addl
• PUT(0) = t1
• # put %eax
Simulation Managers
• simgr.step()
Step forward all states in a given stash by one basic block
• simgr.run()
Step until everything terminates
• simgr.explore(conditions)
• find, avoid:
• An address to find or avoid
• A set or list of addresses to find or avoid
• A function that takes a state and returns whether or not it matches.
Bit-Vector Symbol
• claripy.BVS(name, size)
• name: The name of the symbol.
• size: The size (in bits) of the bit-vector.
• chop(bits)
• bits: A list of smaller bitvectors, each bits in length.
State
• Property
• registers: The state's register file as a flat memory region
• memory: The state's memory as a flat memory region
• solver(e.g., se): The solver engine for this state
• posix: information about the operating system or environment model(e.g.,
posix.files[fd])
• add_constraints(BVS condition)
• inspect.b(event, when=angr.BP_AFTER, action=debug_func)
• event: mem_read, reg_write
• when: BP_BEFORE or BP_AFTER
• action: debug_func
SimMemory
• load(addr, size=None)
• addr: memory address
• size: The sizein bytes
• find(addr, what)
• addr: The start address
• what: what to search for
• store(addr, data, size=None)
• addr: address to store at
• data: The data(claripy expression or something convertable to a claripy expression)
• size: The data size(claripy expression or something convertable to a claripy
expression)
Stash types
active This stash contains the states that will be stepped by default, unless an alternate stash is specified.
deadended
A state goes to the deadended stash when it cannot continue the execution for some reason,
including no more valid instructions, unsat state of all of its successors, or an invalid instruction
pointer.
pruned
When using LAZY_SOLVES, states are not checked for satisfiability unless absolutely necessary.
When a state is found to be unsat in the presence of LAZY_SOLVES, the state hierarchy is traversed
to identify when, in its history, it initially became unsat. All states that are descendants of that point
(which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash.
unconstrained
If the save_unconstrained option is provided to the SimulationManager constructor, states that are
determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some
other source of symbolic data) are placed here.
unsat
If the save_unsat option is provided to the SimulationManager constructor, states that are
determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input
having to be both "AAAA" and "BBBB" at the same time) are placed here.
Hooking
• project.hook(address, hook)
• hook is a SimProcedure instance
• At every step, angr checks if the current address has been hooked,
and if so, runs the hook instead of the binary code at that address
• You could also use symbol to locate the address
• proj.hook_symbol(name, hook)
• You could use hook as function decorator, length is to jump after
finishing the hook @proj.hook(0x20000, length=5)
def my_hook(state):
state.regs.rax = 1
Hooking
• proj.is_hooked(address)
True or False
• proj.unhook(address)
Unhook hook on address
• proj.hooked_by(address)
Return SimProcedure instance
Symbolic Function
• Project tries to replace external calls to library functions by using
symbolic summaries termed SimProcedures
• Because SimProcedures are library hooks written in Python, it has
inaccuracy
• If you encounter path explosion or inaccuracy, you can do:
1. Disable the SimProcedure
2. Replace the SimProcedure with something written directly to the situation
in question
3. Fix the SimProcedure
Symbolic Function(scanf)
• Source code:
https://github.com/angr/angr/blob/master/angr/procedures/libc/sca
nf.py
1. Get first argument(pointer to format string)
2. Define function return type by the architecture
3. Parse format string
4. According format string, read input from file descriptor 0(i.e.,
standard input)
5. Do the read operation
Symbolic Function(scanf)
from angr.procedures.stubs.format_parser import FormatParser
from angr.sim_type import SimTypeInt, SimTypeString
class scanf(FormatParser):
def run(self, fmt):
self.argument_types = {0: self.ty_ptr(SimTypeString())}
self.return_type = SimTypeInt(self.state.arch.bits, True)
fmt_str = self._parse(0)
f = self.state.posix.get_file(0)
region = f.content
start = f.pos
(end, items) = fmt_str.interpret(start, 1, self.arg, region=region)
# do the read, correcting the internal file position and logging the action
self.state.posix.read_from(0, end - start)
return items
Symbolic Function(scanf)
class SimProcedure(object):
@staticmethod
def ty_ptr(self, ty):
return SimTypePointer(self.arch, ty)
class FormatParser(SimProcedure):
def _parse(self, fmt_idx):
"""
fmt_idx: The index of the (pointer to the) format string in the
arguments list.
"""
def interpret(self, addr, startpos, args, region=None):
"""
Interpret a format string, reading the data at `addr` in `region` into `args`
starting at `startpos`.
"""
def _parse(self, fmt_idx):
• int scanf ( const char * format, ...
);
• scanf ("%d",&i);
fmt_str = self._parse(0)
• int sscanf ( const char * s, const
char * format, ...);
• sscanf (sentence,"%s %*s
%d",str,&i);
fmt_str = self._parse(1)
def interpret(self, addr, startpos, args,
region=None):
• int scanf ( const char * format, ...
);
• scanf ("%d",&i);
f = self.state.posix.get_file(0)
region = f.content
start = f.pos
(end, items) =
fmt_str.interpret(start, 1, self.arg,
region=region)
• int sscanf ( const char * s, const
char * format, ...);
• sscanf (sentence,"%s %*s
%d",str,&i);
_, items =
fmt_str.interpret(self.arg(0), 2,
self.arg, region=self.state.memory)
Exercises
• angrman
https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/69
Local binary
• magicpuzzle
Server response the base64-encoded binary, try to find out the differences,
and automatically generate the exploit
https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/70
• Sushi
Server response the base64-encoded binary, try to find out the differences,
and automatically generate the exploit
https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/72
Q & A

More Related Content

What's hot

中3女子が狂える本当に気持ちのいい constexpr
中3女子が狂える本当に気持ちのいい constexpr中3女子が狂える本当に気持ちのいい constexpr
中3女子が狂える本当に気持ちのいい constexpr
Genya Murakami
 
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
CODE BLUE
 
C++ マルチスレッドプログラミング
C++ マルチスレッドプログラミングC++ マルチスレッドプログラミング
C++ マルチスレッドプログラミング
Kohsuke Yuasa
 
MCC CTF講習会 pwn編
MCC CTF講習会 pwn編MCC CTF講習会 pwn編
MCC CTF講習会 pwn編
hama7230
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
Angel Boy
 
Virtual Machine Constructions for Dummies
Virtual Machine Constructions for DummiesVirtual Machine Constructions for Dummies
Virtual Machine Constructions for Dummies
National Cheng Kung University
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
Brendan Gregg
 
PWNの超入門 大和セキュリティ神戸 2018-03-25
PWNの超入門 大和セキュリティ神戸 2018-03-25PWNの超入門 大和セキュリティ神戸 2018-03-25
PWNの超入門 大和セキュリティ神戸 2018-03-25
Isaac Mathis
 
Windows Operating System Archaeology
Windows Operating System ArchaeologyWindows Operating System Archaeology
Windows Operating System Archaeology
enigma0x3
 
Interpreter, Compiler, JIT from scratch
Interpreter, Compiler, JIT from scratchInterpreter, Compiler, JIT from scratch
Interpreter, Compiler, JIT from scratch
National Cheng Kung University
 
シェル芸初心者によるシェル芸入門
シェル芸初心者によるシェル芸入門シェル芸初心者によるシェル芸入門
シェル芸初心者によるシェル芸入門
icchy
 
Effective CMake
Effective CMakeEffective CMake
Effective CMake
Daniel Pfeifer
 
How A Compiler Works: GNU Toolchain
How A Compiler Works: GNU ToolchainHow A Compiler Works: GNU Toolchain
How A Compiler Works: GNU Toolchain
National Cheng Kung University
 
イマドキC++erのモテカワリソース管理術
イマドキC++erのモテカワリソース管理術イマドキC++erのモテカワリソース管理術
イマドキC++erのモテカワリソース管理術
Kohsuke Yuasa
 
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
Y Watanabe
 
Introduction to kotlin coroutines
Introduction to kotlin coroutinesIntroduction to kotlin coroutines
Introduction to kotlin coroutines
NAVER Engineering
 
高速な暗号実装のためにしてきたこと
高速な暗号実装のためにしてきたこと高速な暗号実装のためにしてきたこと
高速な暗号実装のためにしてきたこと
MITSUNARI Shigeo
 
RSA暗号運用でやってはいけない n のこと #ssmjp
RSA暗号運用でやってはいけない n のこと #ssmjpRSA暗号運用でやってはいけない n のこと #ssmjp
RSA暗号運用でやってはいけない n のこと #ssmjp
sonickun
 
PHP の GC の話
PHP の GC の話PHP の GC の話
PHP の GC の話
y-uti
 
書くネタがCoqしかない
書くネタがCoqしかない書くネタがCoqしかない
書くネタがCoqしかない
Masaki Hara
 

What's hot (20)

中3女子が狂える本当に気持ちのいい constexpr
中3女子が狂える本当に気持ちのいい constexpr中3女子が狂える本当に気持ちのいい constexpr
中3女子が狂える本当に気持ちのいい constexpr
 
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
Master Canary Forging: 新しいスタックカナリア回避手法の提案 by 小池 悠生 - CODE BLUE 2015
 
C++ マルチスレッドプログラミング
C++ マルチスレッドプログラミングC++ マルチスレッドプログラミング
C++ マルチスレッドプログラミング
 
MCC CTF講習会 pwn編
MCC CTF講習会 pwn編MCC CTF講習会 pwn編
MCC CTF講習会 pwn編
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
 
Virtual Machine Constructions for Dummies
Virtual Machine Constructions for DummiesVirtual Machine Constructions for Dummies
Virtual Machine Constructions for Dummies
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
PWNの超入門 大和セキュリティ神戸 2018-03-25
PWNの超入門 大和セキュリティ神戸 2018-03-25PWNの超入門 大和セキュリティ神戸 2018-03-25
PWNの超入門 大和セキュリティ神戸 2018-03-25
 
Windows Operating System Archaeology
Windows Operating System ArchaeologyWindows Operating System Archaeology
Windows Operating System Archaeology
 
Interpreter, Compiler, JIT from scratch
Interpreter, Compiler, JIT from scratchInterpreter, Compiler, JIT from scratch
Interpreter, Compiler, JIT from scratch
 
シェル芸初心者によるシェル芸入門
シェル芸初心者によるシェル芸入門シェル芸初心者によるシェル芸入門
シェル芸初心者によるシェル芸入門
 
Effective CMake
Effective CMakeEffective CMake
Effective CMake
 
How A Compiler Works: GNU Toolchain
How A Compiler Works: GNU ToolchainHow A Compiler Works: GNU Toolchain
How A Compiler Works: GNU Toolchain
 
イマドキC++erのモテカワリソース管理術
イマドキC++erのモテカワリソース管理術イマドキC++erのモテカワリソース管理術
イマドキC++erのモテカワリソース管理術
 
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
JavaでWebサービスを作り続けるための戦略と戦術 JJUG-CCC-2018-Spring-g1
 
Introduction to kotlin coroutines
Introduction to kotlin coroutinesIntroduction to kotlin coroutines
Introduction to kotlin coroutines
 
高速な暗号実装のためにしてきたこと
高速な暗号実装のためにしてきたこと高速な暗号実装のためにしてきたこと
高速な暗号実装のためにしてきたこと
 
RSA暗号運用でやってはいけない n のこと #ssmjp
RSA暗号運用でやってはいけない n のこと #ssmjpRSA暗号運用でやってはいけない n のこと #ssmjp
RSA暗号運用でやってはいけない n のこと #ssmjp
 
PHP の GC の話
PHP の GC の話PHP の GC の話
PHP の GC の話
 
書くネタがCoqしかない
書くネタがCoqしかない書くネタがCoqしかない
書くネタがCoqしかない
 

Similar to Klee and angr

Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)
Emilio Coppa
 
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
Cyber Security Alliance
 
Tips and tricks for building high performance android apps using native code
Tips and tricks for building high performance android apps using native codeTips and tricks for building high performance android apps using native code
Tips and tricks for building high performance android apps using native code
Kenneth Geisshirt
 
Valgrind
ValgrindValgrind
Valgrind
aidanshribman
 
How to fake_properly
How to fake_properlyHow to fake_properly
How to fake_properly
Rainer Schuettengruber
 
React Native One Day
React Native One DayReact Native One Day
React Native One Day
Troy Miles
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
Andrey Karpov
 
Introduction to C ++.pptx
Introduction to C ++.pptxIntroduction to C ++.pptx
Introduction to C ++.pptx
VAIBHAVKADAGANCHI
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzer
Andrey Karpov
 
GCC Summit 2010
GCC Summit 2010GCC Summit 2010
GCC Summit 2010
regehr
 
Blocks & GCD
Blocks & GCDBlocks & GCD
Blocks & GCD
rsebbe
 
cb streams - gavin pickin
cb streams - gavin pickincb streams - gavin pickin
cb streams - gavin pickin
Ortus Solutions, Corp
 
CBDW2014 - MockBox, get ready to mock your socks off!
CBDW2014 - MockBox, get ready to mock your socks off!CBDW2014 - MockBox, get ready to mock your socks off!
CBDW2014 - MockBox, get ready to mock your socks off!
Ortus Solutions, Corp
 
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMockUnit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
Robot Media
 
Introduction to Elixir
Introduction to ElixirIntroduction to Elixir
Introduction to Elixir
Diacode
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
NIKHIL NAWATHE
 
Fortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLASFortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLAS
Jongsu "Liam" Kim
 
Triton and Symbolic execution on GDB@DEF CON China
Triton and Symbolic execution on GDB@DEF CON ChinaTriton and Symbolic execution on GDB@DEF CON China
Triton and Symbolic execution on GDB@DEF CON China
Wei-Bo Chen
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic library
AdaCore
 
JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020
Joseph Kuo
 

Similar to Klee and angr (20)

Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)
 
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
 
Tips and tricks for building high performance android apps using native code
Tips and tricks for building high performance android apps using native codeTips and tricks for building high performance android apps using native code
Tips and tricks for building high performance android apps using native code
 
Valgrind
ValgrindValgrind
Valgrind
 
How to fake_properly
How to fake_properlyHow to fake_properly
How to fake_properly
 
React Native One Day
React Native One DayReact Native One Day
React Native One Day
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
 
Introduction to C ++.pptx
Introduction to C ++.pptxIntroduction to C ++.pptx
Introduction to C ++.pptx
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzer
 
GCC Summit 2010
GCC Summit 2010GCC Summit 2010
GCC Summit 2010
 
Blocks & GCD
Blocks & GCDBlocks & GCD
Blocks & GCD
 
cb streams - gavin pickin
cb streams - gavin pickincb streams - gavin pickin
cb streams - gavin pickin
 
CBDW2014 - MockBox, get ready to mock your socks off!
CBDW2014 - MockBox, get ready to mock your socks off!CBDW2014 - MockBox, get ready to mock your socks off!
CBDW2014 - MockBox, get ready to mock your socks off!
 
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMockUnit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
 
Introduction to Elixir
Introduction to ElixirIntroduction to Elixir
Introduction to Elixir
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
 
Fortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLASFortran & Link with Library & Brief Explanation of MKL BLAS
Fortran & Link with Library & Brief Explanation of MKL BLAS
 
Triton and Symbolic execution on GDB@DEF CON China
Triton and Symbolic execution on GDB@DEF CON ChinaTriton and Symbolic execution on GDB@DEF CON China
Triton and Symbolic execution on GDB@DEF CON China
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic library
 
JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020
 

More from Wei-Bo Chen

Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
Wei-Bo Chen
 
Python
PythonPython
Python
Wei-Bo Chen
 
Some tips
Some tipsSome tips
Some tips
Wei-Bo Chen
 
Ctf For Beginner
Ctf For BeginnerCtf For Beginner
Ctf For Beginner
Wei-Bo Chen
 
Format String
Format StringFormat String
Format String
Wei-Bo Chen
 
x86
x86x86

More from Wei-Bo Chen (6)

Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
 
Python
PythonPython
Python
 
Some tips
Some tipsSome tips
Some tips
 
Ctf For Beginner
Ctf For BeginnerCtf For Beginner
Ctf For Beginner
 
Format String
Format StringFormat String
Format String
 
x86
x86x86
x86
 

Recently uploaded

Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
TIPNGVN2
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Zilliz
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 

Recently uploaded (20)

Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 

Klee and angr

  • 1. Klee and Angr bananaappletw @ UCCU Hacker 2017/10/29
  • 2. $whoami • 陳威伯(bananaappletw) • Master of National Chiao Tung University • Organizations: • Software Quality Laboratory • Bamboofox member • Vice president of NCTUCSC • Specialize in: • symbolic execution • binary exploit • Talks: • HITCON CMT 2015 • HITCON CMT 2017
  • 3. What is symbolic execution? • Symbolic execution is a means of analyzing a program to determine what inputs cause each part of a program to execute • System-level • S2e(https://github.com/dslab-epfl/s2e) • User-level • Angr(http://angr.io/) • Triton(https://triton.quarkslab.com/) • Code-based • klee(http://klee.github.io/)
  • 4. Symbolic execution Z == 12 fail() "OK"
  • 5. Klee • Symbolic virtual machine built on top of the LLVM compiler infrastructure • Website: http://klee.github.io/ • Github: https://github.com/klee/klee • Klee paper: http://llvm.org/pubs/2008-12-OSDI-KLEE.pdf (Worth reading) • Main goal of Klee: 1. Hit every line of executable code in the program 2. Detect at each dangerous operation
  • 6. Installation • Install docker • sudo apt-get install docker.io • Get klee image • sudo docker pull klee/klee • Run klee docker image • sudo docker run --rm -ti --ulimit='stack=-1:-1' klee/klee • Test environment • Ubuntu 17.04 desktop amd64
  • 7. Introduction • Klee is a symbolic machine to generate test cases • Need source code to compile to LLVM bitcode • Steps: • Replace input with Klee function to make memory region symbolic • Compile source code to LLVM bitcode • Run Klee • Get the test cases and path's information
  • 8. Steps • Source code at ~/klee_src/examples/get_sign.c • Include klee library #include <klee/klee.h> • Make input symbolic klee_make_symbolic(&a, sizeof(a), "a"); • Compile to LLVM bitcode clang -I ../../include -emit-llvm -c -g get_sign.c
  • 9. Steps • Examine LLVM bitcode llvm-dis get_sign.bc • Running KLEE klee get_sign.bc • Show information about test case ktest-tool ./klee-last/*.ktest
  • 10. get_sign.c #include <klee/klee.h> int get_sign(int x) { if (x == 0) return 0; if (x < 0) return -1; else return 1; } int main() { int a; klee_make_symbolic(&a, sizeof(a), "a"); return get_sign(a); }
  • 11. get_sign.ll define i32 @main() #0 { %1 = alloca i32, align 4 %a = alloca i32, align 4 store i32 0, i32* %1 call void @llvm.dbg.declare(metadata !{i32* %a}, metadata !25), !dbg !26 %2 = bitcast i32* %a to i8*, !dbg !27 call void @klee_make_symbolic(i8* %2, i64 4, i8* getelementptr inbounds ([2 x i8]* @.str, i32 0, i32 0)), !dbg !27 %3 = load i32* %a, align 4, !dbg !28 %4 = call i32 @get_sign(i32 %3), !dbg !28 ret i32 %4, !dbg !28 }
  • 13. Functions • klee_make_symbolic(address, size, name) Make memory address symbolic klee_make_symbolic(&c, sizeof(c), "c"); Notice: Don't overlap symbolic memory regions • klee_assume(condition) Add condition constraint on input klee_assume((c==2)||(d==3)); • klee_prefer_cex(object, condition) When generating test cases, prefer certain values between equivalent test cases klee_prefer_cex(input, 32 <= input[i] && input[i] <= 126);
  • 14. Functions • Insert assert on target path klee_assert(0) • Find assert test case ls ./klee-last/ | grep .assert • Show information about test case ktest-tool ./klee-last/*.ktest • Using GNU c library functions Run with --libc=uclibc --posix-runtime parameters
  • 15. Environment Modeling • For example: • command-line arguments • environment variables • file • data and metadata • network packets • …… • Klee redirects library call to models • If fd refers to a concrete file Klee use pread to multiplex access from KLEE’s many states onto the one actual underlying file descriptor • If fd refers to a symbolic file read() copies from the underlying symbolic buffer
  • 16. Environment Modeling • -sym-arg <N> Replace by a symbolic argument with length N • -sym-args <MIN> <MAX> <N> Replace by at least MIN arguments and at most MAX arguments, each with maximum length N • -sym-files <NUM> <N> Make NUM symbolic files (‘A’, ‘B’, ‘C’, etc.), each with size N (excluding stdin) • -sym-stdin <N> Make stdin symbolic with size N
  • 17. Solver • MetaSMT • STP(Default option for Klee) • Z3
  • 18. Architecture • Represent Klee as an interpreter loop which select a state to run and then symbolically executes a single instruction in the context of that state • When Klee meets the conditional branch instruction, it clones the state so that it could explore both paths • By implementing heap as immutable map, portions of the heap structure itself can also be shared amongst multiple states(copy on write)
  • 19. Architecture • Potentially dangerous operations implicitly generate branches that check if any input value exists that could cause an error(e.g., zero divisor) • When processing instruction, if all given operands are concrete, performs the operation natively, returning a constant expression • When KLEE detects an error or when a path reaches an exit call, Klee solves the current path's constraints
  • 20. Diagram 1. Step the program until it meets the branch #include <klee/klee.h> int get_sign(int x) { if (x == 0) return 0; if (x < 0) return -1; else return 1; } int main() { int a; klee_make_symbolic(&a, sizeof(a), "a"); return get_sign(a); }
  • 21. Diagram 1. Step the program until it meets the branch 2. If all given operands are concrete, return constant expression. If not, record current condition constraints and clone the state. #include <klee/klee.h> int get_sign(int x) { if (x == 0) return 0; if (x < 0) return -1; else return 1; } int main() { int a; klee_make_symbolic(&a, sizeof(a), "a"); return get_sign(a); }
  • 22. Diagram 1. Step the program until it meets the branch 2. If all given operands are concrete, return constant expression. If not, record current condition constraints and clone the state 3. Step the states until they hit exit call or error X==0 Constraints: X!=0 Next instruction: if (x < 0) Constraints: X==0 Next instruction: return 0;
  • 23. Diagram 1. Step the program until it meets the branch 2. If all given operands are concrete, return constant expression. If not, record current condition constraints and clone the state 3. Step the states until they hit exit call or error 4. Solve the conditional constraint X==0 Constraints: X!=0 Next instruction: if (x < 0) Constraints: X==0 Next instruction: return 0;
  • 24. Diagram 1. Step the program until it meets the branch 2. If all given operands are concrete, return constant expression. If not, record current condition constraints and clone the state 3. Step the states until they hit exit call or error 4. Solve the conditional constraint 5. Loop until no remaining states or user-defined timeout is reached
  • 25. Exercises • ccr Sum of continuous number https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/67 • Suicide Product of continuous number https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/68
  • 26. Angr • Website: http://angr.io/ • Angr is a python framework for analyzing binaries. It combines both static and dynamic symbolic ("concolic") analysis, making it applicable to a variety of tasks. • Flow • Loading a binary into the analysis program. • Translating a binary into an intermediate representation (IR). • Performing the actual analysis
  • 27. Installation • sudo apt-get install -y python-pip python-dev libffi-dev build-essential virtualenvwrapper • Virtualenv • mkvirtualenv angr && pip install angr • Direct install in system • sudo pip install angr • Test environment • Ubuntu 17.04 desktop amd64
  • 28. Flow • Import angr import angr • Load the binary and initialize angr project project = angr.Project('./ais3_crackme') • Define argv1 as 100 bytes bitvectors argv1 = claripy.BVS("argv1",100*8) • Initialize the state with argv1 state = project.factory.entry_state(args=["./crackme1",argv1])
  • 29. Flow • Initialize the simulation manager simgr = p.factory.simgr(state) • Explore the states that matches the condition simgr.explore(find= 0x400602) • Extract one state from found states found = simgr.found[0] • Solve the expression with solver solution = found.solver.eval(argv1, cast_to=str)
  • 30. ais3 crackme • Binary could be found in: https://github.com/angr/angr- doc/blob/master/examples/ais3_crackme/ • Run binary with argument • If argument is correct print "Correct! that is the secret key!" • else print "I'm sorry, that's the wrong secret key!"
  • 32. Solution import angr import claripy project = angr.Project("./ais3_crackme") argv1 = claripy.BVS("argv1",100*8) state = project.factory.entry_state(args=["./crackme1",argv1]) simgr = project.factory.simgr(state) simgr.explore(find=0x400602) found = simgr.found[0] solution = found.solver.eval(argv1, cast_to=str) print(repr(solution))
  • 34. Loader • Load the binary through angr.Project • Loading options • auto_load_libs • When there is no SimProcedure • True(Default, real library function is executed) • False(return a generic "stub" SimProcedure called ReturnUnconstrained)
  • 35. Intermediate Representation • In order to be able to analyze and execute machine code from different CPU architectures, Angr performs most of its analysis on an intermediate representation • Angr's intermediate representation is VEX(Valgrind), since the uplifting of binary code into VEX is quite well supported
  • 36. Intermediate Representation • IR abstracts away several architecture differences when dealing with different architectures • Register names: VEX models the registers as a separate memory space, with integer offsets • Memory access: The IR abstracts difference between architectures access memory in different ways • Memory segmentation: Some architectures support memory segmentation through the use of special segment registers • Instruction side-effects: Most instructions have side-effects
  • 37. Intermediate Representation • addl %eax, %ebx • t3 = GET:I32(0) • # get %eax, a 32-bit integer • t2 = GET:I32(12) • # get %ebx, a 32-bit integer • t1 = Add32(t3,t2) • # addl • PUT(0) = t1 • # put %eax
  • 38. Simulation Managers • simgr.step() Step forward all states in a given stash by one basic block • simgr.run() Step until everything terminates • simgr.explore(conditions) • find, avoid: • An address to find or avoid • A set or list of addresses to find or avoid • A function that takes a state and returns whether or not it matches.
  • 39. Bit-Vector Symbol • claripy.BVS(name, size) • name: The name of the symbol. • size: The size (in bits) of the bit-vector. • chop(bits) • bits: A list of smaller bitvectors, each bits in length.
  • 40. State • Property • registers: The state's register file as a flat memory region • memory: The state's memory as a flat memory region • solver(e.g., se): The solver engine for this state • posix: information about the operating system or environment model(e.g., posix.files[fd]) • add_constraints(BVS condition) • inspect.b(event, when=angr.BP_AFTER, action=debug_func) • event: mem_read, reg_write • when: BP_BEFORE or BP_AFTER • action: debug_func
  • 41. SimMemory • load(addr, size=None) • addr: memory address • size: The sizein bytes • find(addr, what) • addr: The start address • what: what to search for • store(addr, data, size=None) • addr: address to store at • data: The data(claripy expression or something convertable to a claripy expression) • size: The data size(claripy expression or something convertable to a claripy expression)
  • 42. Stash types active This stash contains the states that will be stepped by default, unless an alternate stash is specified. deadended A state goes to the deadended stash when it cannot continue the execution for some reason, including no more valid instructions, unsat state of all of its successors, or an invalid instruction pointer. pruned When using LAZY_SOLVES, states are not checked for satisfiability unless absolutely necessary. When a state is found to be unsat in the presence of LAZY_SOLVES, the state hierarchy is traversed to identify when, in its history, it initially became unsat. All states that are descendants of that point (which will also be unsat, since a state cannot become un-unsat) are pruned and put in this stash. unconstrained If the save_unconstrained option is provided to the SimulationManager constructor, states that are determined to be unconstrained (i.e., with the instruction pointer controlled by user data or some other source of symbolic data) are placed here. unsat If the save_unsat option is provided to the SimulationManager constructor, states that are determined to be unsatisfiable (i.e., they have constraints that are contradictory, like the input having to be both "AAAA" and "BBBB" at the same time) are placed here.
  • 43. Hooking • project.hook(address, hook) • hook is a SimProcedure instance • At every step, angr checks if the current address has been hooked, and if so, runs the hook instead of the binary code at that address • You could also use symbol to locate the address • proj.hook_symbol(name, hook) • You could use hook as function decorator, length is to jump after finishing the hook @proj.hook(0x20000, length=5) def my_hook(state): state.regs.rax = 1
  • 44. Hooking • proj.is_hooked(address) True or False • proj.unhook(address) Unhook hook on address • proj.hooked_by(address) Return SimProcedure instance
  • 45. Symbolic Function • Project tries to replace external calls to library functions by using symbolic summaries termed SimProcedures • Because SimProcedures are library hooks written in Python, it has inaccuracy • If you encounter path explosion or inaccuracy, you can do: 1. Disable the SimProcedure 2. Replace the SimProcedure with something written directly to the situation in question 3. Fix the SimProcedure
  • 46. Symbolic Function(scanf) • Source code: https://github.com/angr/angr/blob/master/angr/procedures/libc/sca nf.py 1. Get first argument(pointer to format string) 2. Define function return type by the architecture 3. Parse format string 4. According format string, read input from file descriptor 0(i.e., standard input) 5. Do the read operation
  • 47. Symbolic Function(scanf) from angr.procedures.stubs.format_parser import FormatParser from angr.sim_type import SimTypeInt, SimTypeString class scanf(FormatParser): def run(self, fmt): self.argument_types = {0: self.ty_ptr(SimTypeString())} self.return_type = SimTypeInt(self.state.arch.bits, True) fmt_str = self._parse(0) f = self.state.posix.get_file(0) region = f.content start = f.pos (end, items) = fmt_str.interpret(start, 1, self.arg, region=region) # do the read, correcting the internal file position and logging the action self.state.posix.read_from(0, end - start) return items
  • 48. Symbolic Function(scanf) class SimProcedure(object): @staticmethod def ty_ptr(self, ty): return SimTypePointer(self.arch, ty) class FormatParser(SimProcedure): def _parse(self, fmt_idx): """ fmt_idx: The index of the (pointer to the) format string in the arguments list. """ def interpret(self, addr, startpos, args, region=None): """ Interpret a format string, reading the data at `addr` in `region` into `args` starting at `startpos`. """
  • 49. def _parse(self, fmt_idx): • int scanf ( const char * format, ... ); • scanf ("%d",&i); fmt_str = self._parse(0) • int sscanf ( const char * s, const char * format, ...); • sscanf (sentence,"%s %*s %d",str,&i); fmt_str = self._parse(1)
  • 50. def interpret(self, addr, startpos, args, region=None): • int scanf ( const char * format, ... ); • scanf ("%d",&i); f = self.state.posix.get_file(0) region = f.content start = f.pos (end, items) = fmt_str.interpret(start, 1, self.arg, region=region) • int sscanf ( const char * s, const char * format, ...); • sscanf (sentence,"%s %*s %d",str,&i); _, items = fmt_str.interpret(self.arg(0), 2, self.arg, region=self.state.memory)
  • 51. Exercises • angrman https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/69 Local binary • magicpuzzle Server response the base64-encoded binary, try to find out the differences, and automatically generate the exploit https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/70 • Sushi Server response the base64-encoded binary, try to find out the differences, and automatically generate the exploit https://bamboofox.cs.nctu.edu.tw/courses/1/challenges/72
  • 52. Q & A