SlideShare a Scribd company logo
Triton: Concolic Execution Framework
Florent Saudel
Jonathan Salwan
SSTIC
Rennes – France
Slides: detailed version
June 3 2015
Keywords: program analysis, DBI, DBA, Pin, concrete execution,
symbolic execution, concolic execution, DSE, taint analysis,
context snapshot, Z3 theorem prover and behavior analysis.
2
● Jonathan Salwan is a student at Bordeaux University
(CSI Master) and also an employee at Quarkslab
● Florent Saudel is a student at the Bordeaux
University (CSI Master) and applying to an
Internship at Amossys
● Both like playing with low-level computing, program
analysis and software verification methods
Who are we?
3
● Triton is a project started on January 2015 for our
Master final project at Bordeaux University (CSI)
supervised by Emmanuel Fleury from laBRI
● Triton is also sponsored by Quarkslab from the
beginning
Where does Triton come from?
4
● Triton is a concolic execution framework as Pintool
● It provides advanced classes to improve dynamic
binary analysis (DBA) using Pin
– Symbolic execution engine
– SMT semantics representation
– Interface with SMT Solver
– Taint analysis engine
– Snapshot engine
– API and Python bindings
What is Triton?
5
What is Triton?
Plug what you want which supports
the SMT2-LIB format
Pin
Taint
Engine
Symbolic
Execution
Engine
Snapshot
Engine
SMT
Semantics
IR
SMT
Solver
Interface
Python
Bindings
Triton internal components
Triton.so pintool
user
script.py
Z3 *
Front-endBack-end User side
6
● Well-known projects
– SAGE
– Mayhem
– Bitblaze
– S2E
● The difference?
– Triton works online* through a higher level
languages using the Pin engine
Relative projects
online*: Analysis is performed at runtime and data can be modified directly
in memory to go through specific branches.
7
● You can build tools which:
– Analyze a trace with concrete information
● Registers and memory values at each program point
– Perform a symbolic execution
● To know the symbolic expression of registers and memory at each program point
– Perform a symbolic fuzzing session
– Generate and solve path constraints
– Gather code coverage
– Runtime registers and memory modification
– Replay traces directly in memory
– Scriptable debugging
– Access to Pin functions through a higher level languages (Python bindings)
– And probably lots of others things
What kind of things you can
build with Triton?
8
● Triton's Internal Components
9
● Symbolic Engine
10
● Symbolic Engine
● Symbolic execution is the execution of a program using
symbolic variables instead of concrete values
● Symbolic execution translates the program's semantics into a
logical formula
● Symbolic execution can build and keep a path formula
– By solving the formula and its negation we can take all paths
and “cover” a code
● Instead of concrete execution which takes only one path
● Then a symbolic expression is given to a SMT solver to
generate a concrete value
11
● Symbolic Engine inside Triton
●
A trace is a sequence of instructions
T = (Ins1 Ins∧ 2 Ins∧ 3 Ins∧ 4 … Ins∧ ∧ i)
●
Instructions are represented with symbolic expressions
●
A symbolic trace is a sequence of symbolic expressions
● Each symbolic expression is translated like this:
REFout = semantic
– Where :
●
REFout := unique ID
●
Semantic := REFin | <<smt expression>>
●
Each register or byte of memory points to its last reference →
Single Static Assignment Form (SSA)
12
● Register References
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : -1,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
}
13
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #0,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#0, symvar_1>
}
14
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2 #1 = (bvadd #0 2)
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : -1,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
15
● Register references
movzx eax, byte ptr [mem] #0 = symvar_1
add eax, 2 #1 = (bvadd #0 2)
mov ebx, eax #2 = #1
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
16
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
17
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
18
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ?
19
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = #2
20
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = #1
21
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, (bvadd #0 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = (bvadd #0 2)
22
● Rebuild the trace with
backward analysis
movzx eax, byte ptr [mem]
add eax, 2
mov ebx, eax
Example:
// All refs initialized to -1
Register Reference Table {
EAX : #1,
EBX : #2,
ECX : -1,
...
}
// Empty set
Symbolic Expression Set {
<#2, #1>,
<#1, add(#0, 2)>,
<#0, symvar_1>
}
What is the semantic trace of EBX ?
EBX holds the reference #2
What is #2 ? Reconstruction: EBX = (bvadd symvar_1 2)
23
● Assigning a reference for each register is not enough, we must also add
references on memory
● Follow references over memory
mov dword ptr [rbp-0x4], 0x0
...
mov eax, dword ptr [rbp-0x4]
push eax
...
pop ebx
What do we want to know?
Eax = 0 from somewhere ebx = eax
References
#1 = 0x0
...
#x = #1
#2 = #1
...
#x = #2
24
● All registers, flags and each byte of memory are references
● A reference assignment is in SSA form during the execution
● The registers, flags and bytes of memory are assigned in the same
way
● A memory reference can be assigned from a register reference
(mov [mem], reg)
● A register reference can be assigned from a memory reference
(mov reg, [mem])
● If a reference doesn't exist yet, we concretize the value and we
affect a new reference
● References conclusion
25
● SMT Semantics Representation
with SSA Form
26
● SMT Semantics Representation
with SSA Form
● All instructions semantics are represented via SMT2-LIB representation
● This SMT2-LIB representation is on SSA form
add rax, rdx
rax = (bvadd ((_ extract 63 0) rax) ((_ extract 63 0) rdx))
... (af, cf, of, pf) ...
sf = (ite (= ((_ extract 63 63) rax) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
zf = (ite (= rax (_ bv0 64)) (_ bv1 1) (_ bv0 1))
#60 = (bvadd ((_ extract 63 0) #58) ((_ extract 63 0) #54))
... (af, cf, of, pf) ...
#64 = (ite (= ((_ extract 63 63) #60) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
#65 = (ite (= #60 (_ bv0 64)) (_ bv1 1) (_ bv0 1))
Assembly
SMT
SSA SMT
New rax reference Old rax reference Old rdx reference
27
● SMT Semantics Representation
with SSA Form
● Why use SMT2-LIB representation?
– SMT-LIB is an international initiative aimed at facilitating research and
development in Satisfiability Modulo Theories (SMT)
– As all Triton's expressions are in the SMT2-LIB representation, you
can plug all solvers which supports this representation
● Currently Triton has an interface with Z3 but feel free to plug what
you want
28
● Symbolic Execution Guided By The
Taint Analysis
29
● Symbolic Execution Guided By The
Taint Analysis
● Taint analysis provides information about which registers and memory
addresses are controllable by the user at each program point:
– Assists the symbolic engine to setup the symbolic variables (a
symbolic variable is a memory area that the user can control)
– Limit the symbolic engine to the relevant part of the program
– At each branch instruction, we directly know if the user can go through
both branches (this is mainly used for code coverage)
30
● Symbolic Execution Guided By The
Taint Analysis
● Transform a tainted area into a symbolic variable
0x40058b: movzx eax, byte ptr [rax]
-> #33 = SymVar_0 ; Controllable by the user
-> #34 = (_ bv4195726 64) ; RIP
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64) ; RIP
0x40058b: movzx eax, byte ptr [rax]
-> #33 = ((_ zero_extend 24) (_ bv97 8))
-> #34 = (_ bv4195726 64) ; RIP
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64) ; RIP
rax points on a tainted area
Use symbolic variable instead of concrete value
31
● Symbolic Execution Guided By The
Taint Analysis
● Can I go through this branch?
– Check if flags are tainted
0x4005ae: cmp ecx, eax
-> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70))
...CF, OF, SF, AF, and PF skipped...
-> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) ; ZF
-> #79 = (_ bv4195760 64) ; RIP
0x4005b0: jz 0x4005b9
-> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) ; RIP
eax is tainted
tainted
32
● Taint Analysis guided by the Symbolic
Engine and the Solver Engine
●
As the symbolic execution may be guided by the taint analysis, the taint analysis may
also be guided by the symbolic execution and the solver engine
●
What to choose between an over-approximation and under-approximation?
– Over-approximation: We can generate inputs for infeasible concrete paths.
– Under-approximation: We can miss some feasible paths.
● The goal of the taint engine is to say YES or NO if a register and memory is probably
tainted (byte-level over approximation)
● The goal of the symbolic engine is to build symbolic expressions based on
instructions semantics
● The goal of the solver engine is to generate a model of an expression (path condition)
– If your target is not tainted, don't ask a model → gain time
– If the solver engine returns UNSAT → the tainted inputs can't influence the control
flow to go through this path.
– If the solver engine returns SAT → the path can be triggered with the actual tainted
inputs. The model give us the set of concrete inputs for this path.
33
● Snapshot Engine – Replay your trace
34
● Snapshot Engine – Replay your trace
●
The snapshot engine offers the possibility to take and restore snapshot
– Mainly used to apply code coverage in memory. Useful when you fuzz the binary
– In future versions, it will be possible to take different snapshots at several program point
● The snapshot engine only restores registers and memory states
– If there is some disk, network,... I/O, Triton won't be able to restore the files modification
Restore context
Snapshot
Restore
SnapshotTarget
function / basic block
Dynamic Symbolic Execution Possible paths in the target
35
● Stop talking about back-end!
Let's see how I can use Triton
36
● How to install Triton?
● Easy is easy
● You just need:
– Pin v2.14-71313
– Z3 v4.3.1
– Python v2.7
$ cd pin-2.14-71313-gcc.4.4.7-linux/source/tools/
$ git clone git@github.com:JonathanSalwan/Triton.git
$ cd Triton
$ make
$ ../../../pin -t ./triton.so -script your_script.py -- ./your_target_binary.elf64
Shell 1: Installation
37
● Start an analysis
import triton
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
triton.startAnalysisFromSymbol('check')
# Run the instrumentation - Never returns
triton.runProgram()
import triton
if __name__ == '__main__':
# Start the symbolic analysis from address
triton.startAnalysisFromAddr(0x40056d)
triton.stopAnalysisFromAddr(0x4005c9)
# Run the instrumentation - Never returns
triton.runProgram()
Code 1: Start analysis from symbols
Code 2: Start analysis from address
38
● Predicate taint and untaint
import triton
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
triton.startAnalysisFromSymbol('check')
# Taint the RAX and RBX registers when the address 0x40058e is executed
triton.taintRegFromAddr(0x40058e, [IDREF.REG.RAX, IDREF.REG.RBX])
# Untaint the RCX register when the address 0x40058e is executed
triton.untaintRegFromAddr(0x40058e, [IDREF.REG.RCX])
# Run the instrumentation - Never returns
triton.runProgram()
Code 3: Predicate taint and untaint at specific addresses
39
● Callbacks
● Triton supports 8 kinds of callbacks
– AFTER
● Defines a callback after the instruction processing
– BEFORE
● Defines a callback before the instruction processing
– BEFORE_SYMPROC
● Defines a callback before the symbolic processing
– FINI
● Define a callback at the end of the execution
– ROUTINE_ENTRY
● Define a callback at the entry of a specified routine.
– ROUTINE_EXIT
● Define a callback at the exit of a specified routine.
– SYSCALL_ENTRY
● Define a callback before each syscall processing
– SYSCALL_EXIT
● Define a callback after each syscall processing
40
● Callback on SYSCALL
def my_callback_syscall_entry(threadId, std):
print '-> Syscall Entry: %s' %(syscallToString(std, getSyscallNumber(std)))
if getSyscallNumber(std) == IDREF.SYSCALL.LINUX_64.WRITE:
arg0 = getSyscallArgument(std, 0)
arg1 = getSyscallArgument(std, 1)
arg2 = getSyscallArgument(std, 2)
print ' sys_write(%x, %x, %x)' %(arg0, arg1, arg2)
def my_callback_syscall_exit(threadId, std):
print '<- Syscall return %x' %(getSyscallReturn(std))
if __name__ == '__main__':
startAnalysisFromSymbol('main')
addCallback(my_callback_syscall_entry, IDREF.CALLBACK.SYSCALL_ENTRY)
addCallback(my_callback_syscall_exit, IDREF.CALLBACK.SYSCALL_EXIT)
runProgram()
Code 4: Callback before and after syscalls processing
-> Syscall Entry: fstat
<- Syscall return 0
-> Syscall Entry: mmap
<- Syscall return 7fb7f06e1000
-> Syscall Entry: write
sys_write(1, 7fb7f06e1000, 6)
Code 4 result
41
● Callback on ROUTINE
def mallocEntry(threadId):
sizeAllocated = getRegValue(IDREF.REG.RDI)
print '-> malloc(%#x)' %(sizeAllocated)
def mallocExit(threadId):
ptrAllocated = getRegValue(IDREF.REG.RAX)
print '<- %#x' %(ptrAllocated)
if __name__ == '__main__':
startAnalysisFromSymbol('main')
addCallback(mallocEntry, IDREF.CALLBACK.ROUTINE_ENTRY, "malloc")
addCallback(mallocExit, IDREF.CALLBACK.ROUTINE_EXIT, "malloc")
runProgram()
Code 5: Callback before and after routine processing
-> malloc(0x20)
<- 0x8fc010
-> malloc(0x20)
<- 0x8fc040
-> malloc(0x20)
<- 0x8fc010
Code 5 result
42
● Callback BEFORE and AFTER
instruction processing
def my_callback_before(instruction):
print 'TID (%d) %#x %s' %(instruction.threadId,
instruction.address,
instruction.assembly)
if __name__ == '__main__':
# Start the symbolic analysis from the 'check' function
startAnalysisFromSymbol('check')
# Add a callback.
addCallback(my_callback_before, IDREF.CALLBACK.BEFORE)
# Run the instrumentation - Never returns
runProgram()
Code 6: Callback before instruction processing
TID (0) 0x40056d push rbp
TID (0) 0x40056e mov rbp, rsp
TID (0) 0x400571 mov qword ptr [rbp-0x18], rdi
TID (0) 0x400575 mov dword ptr [rbp-0x4], 0x0
...
TID (0) 0x4005b2 mov eax, 0x1
TID (0) 0x4005b7 jmp 0x4005c8
TID (0) 0x4005c8 pop rbp
Code 6 result
43
● Instruction class
def my_callback(instruction):
...
● instruction.address
● instruction.assembly
● instruction.imageName - e.g: libc.so
● instruction.isBranch
● instruction.opcode
● instruction.opcodeCategory
● instruction.operands
● instruction.symbolicElements – List of SymbolicElement class
● instruction.routineName - e.g: main
● instruction.sectionName - e.g: .text
● instruction.threadId
44
● SymbolicElement class
instruction.symbolicElements[0]
● symbolicElement.comment → blah
● symbolicElement.destination → #41
● symbolicElement.expression → #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39))
● symbolicElement.id → 41
● symbolicElement.isTainted → True or False
● symbolicElement.source → (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39))
Instruction: add rax, rdx
SymbolicElement: #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ; blah
45
● Dump the symbolic expressions trace
def my_callback_after(instruction):
print '%#x: %s' %(instruction.address, instruction.assembly)
for se in instruction.symbolicElements:
print 't -> ', se.expression
print
if __name__ == '__main__':
startAnalysisFromSymbol('check')
addCallback(my_callback_after, IDREF.CALLBACK.AFTER)
runProgram()
Code 7: Dump a symbolic expression trace
0x4005ab: movsx eax, al
-> #70 = ((_ sign_extend 24) ((_ extract 7 0) #68))
-> #71 = (_ bv4195758 64)
0x4005ae: cmp ecx, eax
-> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70))
...
-> #77 = (ite (= ((_ extract 31 31) #72) (_ bv1 1)) (_ bv1 1) (_ bv0 1))
-> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1))
-> #79 = (_ bv4195760 64)
0x4005b0: jz 0x4005b9
-> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64))
Code 7 result
46
● Play with the Taint engine at runtime
# 0x40058b: movzx eax, byte ptr [rax]
def cbeforeSymProc(instruction):
if instruction.address == 0x40058b:
rax = getRegValue(IDREF.REG.RAX)
taintMem(rax)
if __name__ == '__main__':
startAnalysisFromSymbol('check')
addCallback(cbeforeSymProc, IDREF.CALLBACK.BEFORE_SYMPROC)
runProgram()
Code 8: Taint memory at runtime
0x40058b: movzx eax, byte ptr [rax]
-> #33 = SymVar_0
-> #34 = (_ bv4195726 64)
0x40058e: movsx eax, al
-> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33))
-> #36 = (_ bv4195729 64)
Code 8 result
Modifications must be done before
the symbolic processing
47
● Taint argv[x][x] at the main function
def mainAnalysis(threadId):
rdi = getRegValue(IDREF.REG.RDI) # argc
rsi = getRegValue(IDREF.REG.RSI) # argv
while rdi != 0:
argv = getMemValue(rsi + ((rdi-1) * 8), 8)
offset = 0
while getMemValue(argv + offset, 1) != 0x00:
taintMem(argv + offset)
offset += 1
print '[+] %03d bytes tainted from the argv[%d] (%#x) pointer'
%(offset, rdi-1, argv)
rdi -= 1
return
Code 9: Taint all arguments when the main function occurs
$ pin -t ./triton.so -script taint_main.py -- ./example.bin64 12 123456 123456789
[+] 009 bytes tainted from the argv[3] (0x7fff802ad116) pointer
[+] 006 bytes tainted from the argv[2] (0x7fff802ad10f) pointer
[+] 002 bytes tainted from the argv[1] (0x7fff802ad10c) pointer
[+] 015 bytes tainted from the argv[0] (0x7fff802ad0ef) pointer
Code 9 result
48
● Play with the Symbolic engine
0x40058b: movzx eax, byte ptr [rax]
...
...
0x4005ae: cmp ecx, eax
Example 10: Assembly code
def callback_beforeSymProc(instruction):
if instruction.address == 0x40058b:
rax = getRegValue(IDREF.REG.RAX)
taintMem(rax)
def callback_after(instruction):
if instruction.address == 0x4005ae:
# Get the symbolic expression ID of ZF
zfId = getRegSymbolicID(IDREF.FLAG.ZF)
# Backtrack the symbolic expression ZF
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
print expr
Code 10: Backtrack symbolic expression
We know that rax points on a tainted area
(assert (= (ite (= (bvsub ((_ extract 31 0) ((_ extract 31 0) (bvxor ((_ extract 31 0)
(bvsub ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) SymVar_0))) (_ bv1 32)))
(_ bv85 32)))) ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) ((_ zero_extend 24)
(_ bv49 8)))))) (_ bv0 32)) (_ bv1 1) (_ bv0 1)) (_ bv1 1)))
Example 10 result
Symbolic Variable
49
● Play with the Symbolic engine
...
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
...
Extract of the Code 10
● What does it really mean?
– Triton builds symbolic formulas based on the instructions
semantics
– Triton also exports smt2lib functions which allows you to create
your own formula
– In this example, we want that the ZF expression is equal to 1
50
● Play with the Solver engine
...
zfExpr = getBacktrackedSymExpr(zfId)
# Craft a new expression over the ZF expression : (assert (= zfExpr True))
expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue()))
...
model = getModel(expr)
print model
Extract of the Code 10
{'SymVar_0': 0x65}
Result
● getModel() returns a dictionary of valid model for each
symbolic variable
0x40058b: movzx eax, byte ptr [rax]
...
...
0x4005ae: cmp ecx, eax
Example 10: Assembly code
We know now that the first character must be 0x65 to
set the ZF at the compare instruction
51
● Play with the Solver engine and inject
values directly in memory
...
model = getModel(expr)
print model
Extract of the Code 10 Result
● Each symbolic variable is assigned to a memory address (SymVar ↔ Address)
– Possible to get the symbolic variable from a memory address
● getSymVarFromMemory(addr)
– Possible to get the memory address from a symbolic variable
● getMemoryFromSymVar(symVar)
{'SymVar_0': 0x65}
for k, v in model.items():
setMemValue(getMemoryFromSymVar(k), getSymVarSize(k), v)
Inject values given by the solver in memory
52
● Inject values in memory is not enough
Play with the snapshot engine
● Inject values in memory after instructions processing is useless
● That's why Triton offers a snapshot engine
φ1 φ2 φ3 φ4 φ5 φ6
0x40058b: movzx eax, byte ptr [rax]
0x4005ae: cmp ecx, eax
def callback_after(instruction):
if instruction.address == 0x40058b and isSnapshotEnable() == False:
takeSnapshot()
if instruction.address == 0x4005ae:
if getFlagValue(IDREF.FLAG.ZF) == 0:
zfExpr = getBacktrackedSymExpr(...) # Described on slide 45
expr = smt2lib.smtAssert(...zfExpr...) # Described on slide 45
for k, v in getModel(expr).items(): # Described on slide 48
saveValue(...)
restoreSnapshot()
restore snapshottake snapshot
53
● Stop pasting fucking code
Show me a global vision
54
● Stop pasting fucking code
Show me a global vision
● Full API and Python bindings describes here
– https://github.com/JonathanSalwan/Triton/wiki/Python-Bindings
– ~80 functions exported over the Python bindings
● Basically we can:
– Taint and untaint memory and registers
– Inject value in memory and registers
– Add callbacks at each program point, syscalls, routine
– Assign symbolic expression on registers and bytes of memory
– Build and customize symbolic expressions
– Solve symbolic expressions
– Take and restore snapshots
– Do all this in Python!
55
● Conclusion
56
● Conclusion
● Triton:
– is a Pintool which provides others classes for DBA
– is designed as a concolic execution framework
– provides an API and Python bindings
– supports only x86-64 binaries
– currently supports ~100 semantics but we are working hard on it to
increase the semantics support
● An awesome thanks to Kevin `wisk` Szkudlapski and Francis `gg` Gabriel for
the x86.yaml from the Medusa project :)
– is free and open-source :)
– is available here : github.com/JonathanSalwan/Triton
57
● Contacts
– fsaudel@gmail.com
– jsalwan@quarkslab.com
● Thanks
– We would like to thank the SSTIC's staff and especially Rolf Rolles, Sean Heelan,
Sébastien Bardin, Fred Raynal and Serge Guelton for their proofreading and awesome
feedbacks! Then, a big thanks to Quarkslab for the sponsor.
Thanks For Your Attention
Question(s)?
58
● Q&A - Performances
● Host machine configuration
– Tested with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
– 16 Go DDR3
– 415 Go SSD Swap
● The targeted binary analyzed was /usr/bin/z3
– 6,789,610 symbolic expressions created for 1 trace
– The binary has been analyzed in 180 seconds
● One trace with SMT2-LIB translation and the taint spread
– 19 Go of RAM consumed
● Due to the SMT2-LIB strings manipulation

More Related Content

What's hot

Lecture03
Lecture03Lecture03
Lecture03
sean chen
 
Delays in verilog
Delays in verilogDelays in verilog
Delays in verilog
JITU MISTRY
 
Switch level modeling
Switch level modelingSwitch level modeling
Switch level modeling
Devi Pradeep Podugu
 
Coding verilog
Coding verilogCoding verilog
Coding verilog
umarjamil10000
 
Verilog 語法教學
Verilog 語法教學 Verilog 語法教學
Verilog 語法教學
艾鍗科技
 
C++ process new
C++ process newC++ process new
C++ process new
敬倫 林
 
Quiz 9
Quiz 9Quiz 9
Verilog Lecture2 thhts
Verilog Lecture2 thhtsVerilog Lecture2 thhts
Verilog Lecture2 thhts
Béo Tú
 
Semaphores OS Basics
Semaphores OS BasicsSemaphores OS Basics
Semaphores OS Basics
Shijin Raj P
 
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 TutorialSystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
Amiq Consulting
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
Béo Tú
 
Verilogforlab
VerilogforlabVerilogforlab
Verilogforlab
Shankar Bhukya
 
Verilog overview
Verilog overviewVerilog overview
Verilog overview
posdege
 
Verilog HDL Training Course
Verilog HDL Training CourseVerilog HDL Training Course
Verilog HDL Training Course
Paul Laskowski
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
Abhiraj Bohra
 
VHDL- gate level modelling
VHDL- gate level modellingVHDL- gate level modelling
VHDL- gate level modelling
VandanaPagar1
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
Maryala Srinivas
 
C programming session10
C programming  session10C programming  session10
C programming session10
Keroles karam khalil
 
Experiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gatesExperiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gates
Ricardo Castro
 
Digital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language VerilogDigital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language Verilog
Abhiraj Bohra
 

What's hot (20)

Lecture03
Lecture03Lecture03
Lecture03
 
Delays in verilog
Delays in verilogDelays in verilog
Delays in verilog
 
Switch level modeling
Switch level modelingSwitch level modeling
Switch level modeling
 
Coding verilog
Coding verilogCoding verilog
Coding verilog
 
Verilog 語法教學
Verilog 語法教學 Verilog 語法教學
Verilog 語法教學
 
C++ process new
C++ process newC++ process new
C++ process new
 
Quiz 9
Quiz 9Quiz 9
Quiz 9
 
Verilog Lecture2 thhts
Verilog Lecture2 thhtsVerilog Lecture2 thhts
Verilog Lecture2 thhts
 
Semaphores OS Basics
Semaphores OS BasicsSemaphores OS Basics
Semaphores OS Basics
 
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 TutorialSystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
SystemVerilog Assertions verification with SVAUnit - DVCon US 2016 Tutorial
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
 
Verilogforlab
VerilogforlabVerilogforlab
Verilogforlab
 
Verilog overview
Verilog overviewVerilog overview
Verilog overview
 
Verilog HDL Training Course
Verilog HDL Training CourseVerilog HDL Training Course
Verilog HDL Training Course
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
 
VHDL- gate level modelling
VHDL- gate level modellingVHDL- gate level modelling
VHDL- gate level modelling
 
Verilog tutorial
Verilog tutorialVerilog tutorial
Verilog tutorial
 
C programming session10
C programming  session10C programming  session10
C programming session10
 
Experiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gatesExperiment write-vhdl-code-for-realize-all-logic-gates
Experiment write-vhdl-code-for-realize-all-logic-gates
 
Digital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language VerilogDigital Circuit Verification Hardware Descriptive Language Verilog
Digital Circuit Verification Hardware Descriptive Language Verilog
 

Viewers also liked

Blue Ocean Strategy
Blue Ocean StrategyBlue Ocean Strategy
Blue Ocean Strategy
Sreekanth Jayanti
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things done Execution - The Discipline of getting things done
Execution - The Discipline of getting things done
GMR Group
 
Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures
Francisco Gonzalez
 
Strategic Business Execution Introduction
Strategic Business Execution IntroductionStrategic Business Execution Introduction
Strategic Business Execution Introduction
PMO Advisory LLC
 
CRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program StrategyCRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program Strategy
Michael Moir
 
The Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For ExecutionThe Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For Execution
Davender Gupta
 
IT Strategy
IT StrategyIT Strategy
Execution - The Discipline of getting things done
Execution - The Discipline of getting things doneExecution - The Discipline of getting things done
Execution - The Discipline of getting things done
Sathish Kumar P
 
Strategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational HealthStrategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational Health
Richard Swartzbaugh
 
Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)
Paul Potenzone
 
Best Practices Frameworks 101
Best Practices Frameworks 101Best Practices Frameworks 101
Best Practices Frameworks 101
shailsood
 
Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...
Sigortam.net
 
Performance Execution Framework (PEF).
Performance Execution Framework (PEF).Performance Execution Framework (PEF).
Performance Execution Framework (PEF).
Mindtree Ltd.
 
Strategy frameworks-and-models
Strategy frameworks-and-modelsStrategy frameworks-and-models
Strategy frameworks-and-models
Taposh Roy
 

Viewers also liked (14)

Blue Ocean Strategy
Blue Ocean StrategyBlue Ocean Strategy
Blue Ocean Strategy
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things done Execution - The Discipline of getting things done
Execution - The Discipline of getting things done
 
Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures Consulting - Stanford Strategic Framework for Matrix Structures
Consulting - Stanford Strategic Framework for Matrix Structures
 
Strategic Business Execution Introduction
Strategic Business Execution IntroductionStrategic Business Execution Introduction
Strategic Business Execution Introduction
 
CRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program StrategyCRM 2.0 - Frameworks for Program Strategy
CRM 2.0 - Frameworks for Program Strategy
 
The Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For ExecutionThe Visioneering Operating System - A Framework For Execution
The Visioneering Operating System - A Framework For Execution
 
IT Strategy
IT StrategyIT Strategy
IT Strategy
 
Execution - The Discipline of getting things done
Execution - The Discipline of getting things doneExecution - The Discipline of getting things done
Execution - The Discipline of getting things done
 
Strategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational HealthStrategic Planning, Execution Frameworks & Organizational Health
Strategic Planning, Execution Frameworks & Organizational Health
 
Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)Content Strategy Frameworks (from KBS)
Content Strategy Frameworks (from KBS)
 
Best Practices Frameworks 101
Best Practices Frameworks 101Best Practices Frameworks 101
Best Practices Frameworks 101
 
Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...Porter's strategies (generic strategies, five forces, diamond model) with ref...
Porter's strategies (generic strategies, five forces, diamond model) with ref...
 
Performance Execution Framework (PEF).
Performance Execution Framework (PEF).Performance Execution Framework (PEF).
Performance Execution Framework (PEF).
 
Strategy frameworks-and-models
Strategy frameworks-and-modelsStrategy frameworks-and-models
Strategy frameworks-and-models
 

Similar to Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan

ElixirでFPGAを設計する
ElixirでFPGAを設計するElixirでFPGAを設計する
ElixirでFPGAを設計する
Hideki Takase
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compiler
Mahesh Kumar Chelimilla
 
Introduction to Assembly Language
Introduction to Assembly Language Introduction to Assembly Language
Introduction to Assembly Language
ApekshaShinde6
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
Radhika Talaviya
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overview
Colin Bell
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflow
Keon Kim
 
Assembler
AssemblerAssembler
Assembler
SheetalAwate2
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
Dr. Jaydeep Patil
 
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming BasicsReversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
securityxploded
 
Coal (1)
Coal (1)Coal (1)
Coal (1)
talhashahid40
 
Reversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basicsReversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basics
Cysinfo Cyber Security Community
 
ISA.pptx
ISA.pptxISA.pptx
ISA.pptx
FarrukhMuneer2
 
SS-assemblers 1.pptx
SS-assemblers 1.pptxSS-assemblers 1.pptx
SS-assemblers 1.pptx
kalavathisugan
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
Himajanaidu2
 
Interm codegen
Interm codegenInterm codegen
Interm codegen
Anshul Sharma
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
vvcetit
 
W8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational ProcessorW8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational Processor
Daniel Roggen
 
CODch3Slides.ppt
CODch3Slides.pptCODch3Slides.ppt
CODch3Slides.ppt
Anonymous9etQKwW
 
Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086
Shehrevar Davierwala
 
Lecture7.pdf
Lecture7.pdfLecture7.pdf
Lecture7.pdf
HiNguynNgc22
 

Similar to Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan (20)

ElixirでFPGAを設計する
ElixirでFPGAを設計するElixirでFPGAを設計する
ElixirでFPGAを設計する
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compiler
 
Introduction to Assembly Language
Introduction to Assembly Language Introduction to Assembly Language
Introduction to Assembly Language
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overview
 
Attention mechanisms with tensorflow
Attention mechanisms with tensorflowAttention mechanisms with tensorflow
Attention mechanisms with tensorflow
 
Assembler
AssemblerAssembler
Assembler
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming BasicsReversing & Malware Analysis Training Part 4 - Assembly Programming Basics
Reversing & Malware Analysis Training Part 4 - Assembly Programming Basics
 
Coal (1)
Coal (1)Coal (1)
Coal (1)
 
Reversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basicsReversing malware analysis training part4 assembly programming basics
Reversing malware analysis training part4 assembly programming basics
 
ISA.pptx
ISA.pptxISA.pptx
ISA.pptx
 
SS-assemblers 1.pptx
SS-assemblers 1.pptxSS-assemblers 1.pptx
SS-assemblers 1.pptx
 
CD U1-5.pptx
CD U1-5.pptxCD U1-5.pptx
CD U1-5.pptx
 
Interm codegen
Interm codegenInterm codegen
Interm codegen
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
 
W8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational ProcessorW8_1: Intro to UoS Educational Processor
W8_1: Intro to UoS Educational Processor
 
CODch3Slides.ppt
CODch3Slides.pptCODch3Slides.ppt
CODch3Slides.ppt
 
Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086Assembly language programming_fundamentals 8086
Assembly language programming_fundamentals 8086
 
Lecture7.pdf
Lecture7.pdfLecture7.pdf
Lecture7.pdf
 

Recently uploaded

Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
AbdullaAlAsif1
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
IshaGoswami9
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
HongcNguyn6
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 

Recently uploaded (20)

Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 

Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsalwan

  • 1. Triton: Concolic Execution Framework Florent Saudel Jonathan Salwan SSTIC Rennes – France Slides: detailed version June 3 2015 Keywords: program analysis, DBI, DBA, Pin, concrete execution, symbolic execution, concolic execution, DSE, taint analysis, context snapshot, Z3 theorem prover and behavior analysis.
  • 2. 2 ● Jonathan Salwan is a student at Bordeaux University (CSI Master) and also an employee at Quarkslab ● Florent Saudel is a student at the Bordeaux University (CSI Master) and applying to an Internship at Amossys ● Both like playing with low-level computing, program analysis and software verification methods Who are we?
  • 3. 3 ● Triton is a project started on January 2015 for our Master final project at Bordeaux University (CSI) supervised by Emmanuel Fleury from laBRI ● Triton is also sponsored by Quarkslab from the beginning Where does Triton come from?
  • 4. 4 ● Triton is a concolic execution framework as Pintool ● It provides advanced classes to improve dynamic binary analysis (DBA) using Pin – Symbolic execution engine – SMT semantics representation – Interface with SMT Solver – Taint analysis engine – Snapshot engine – API and Python bindings What is Triton?
  • 5. 5 What is Triton? Plug what you want which supports the SMT2-LIB format Pin Taint Engine Symbolic Execution Engine Snapshot Engine SMT Semantics IR SMT Solver Interface Python Bindings Triton internal components Triton.so pintool user script.py Z3 * Front-endBack-end User side
  • 6. 6 ● Well-known projects – SAGE – Mayhem – Bitblaze – S2E ● The difference? – Triton works online* through a higher level languages using the Pin engine Relative projects online*: Analysis is performed at runtime and data can be modified directly in memory to go through specific branches.
  • 7. 7 ● You can build tools which: – Analyze a trace with concrete information ● Registers and memory values at each program point – Perform a symbolic execution ● To know the symbolic expression of registers and memory at each program point – Perform a symbolic fuzzing session – Generate and solve path constraints – Gather code coverage – Runtime registers and memory modification – Replay traces directly in memory – Scriptable debugging – Access to Pin functions through a higher level languages (Python bindings) – And probably lots of others things What kind of things you can build with Triton?
  • 10. 10 ● Symbolic Engine ● Symbolic execution is the execution of a program using symbolic variables instead of concrete values ● Symbolic execution translates the program's semantics into a logical formula ● Symbolic execution can build and keep a path formula – By solving the formula and its negation we can take all paths and “cover” a code ● Instead of concrete execution which takes only one path ● Then a symbolic expression is given to a SMT solver to generate a concrete value
  • 11. 11 ● Symbolic Engine inside Triton ● A trace is a sequence of instructions T = (Ins1 Ins∧ 2 Ins∧ 3 Ins∧ 4 … Ins∧ ∧ i) ● Instructions are represented with symbolic expressions ● A symbolic trace is a sequence of symbolic expressions ● Each symbolic expression is translated like this: REFout = semantic – Where : ● REFout := unique ID ● Semantic := REFin | <<smt expression>> ● Each register or byte of memory points to its last reference → Single Static Assignment Form (SSA)
  • 12. 12 ● Register References movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : -1, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { }
  • 13. 13 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #0, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { <#0, symvar_1> }
  • 14. 14 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 #1 = (bvadd #0 2) mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : -1, ECX : -1, ... } // Empty set Symbolic Expression Set { <#1, (bvadd #0 2)>, <#0, symvar_1> }
  • 15. 15 ● Register references movzx eax, byte ptr [mem] #0 = symvar_1 add eax, 2 #1 = (bvadd #0 2) mov ebx, eax #2 = #1 Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> }
  • 16. 16 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ?
  • 17. 17 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2
  • 18. 18 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ?
  • 19. 19 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = #2
  • 20. 20 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = #1
  • 21. 21 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, (bvadd #0 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = (bvadd #0 2)
  • 22. 22 ● Rebuild the trace with backward analysis movzx eax, byte ptr [mem] add eax, 2 mov ebx, eax Example: // All refs initialized to -1 Register Reference Table { EAX : #1, EBX : #2, ECX : -1, ... } // Empty set Symbolic Expression Set { <#2, #1>, <#1, add(#0, 2)>, <#0, symvar_1> } What is the semantic trace of EBX ? EBX holds the reference #2 What is #2 ? Reconstruction: EBX = (bvadd symvar_1 2)
  • 23. 23 ● Assigning a reference for each register is not enough, we must also add references on memory ● Follow references over memory mov dword ptr [rbp-0x4], 0x0 ... mov eax, dword ptr [rbp-0x4] push eax ... pop ebx What do we want to know? Eax = 0 from somewhere ebx = eax References #1 = 0x0 ... #x = #1 #2 = #1 ... #x = #2
  • 24. 24 ● All registers, flags and each byte of memory are references ● A reference assignment is in SSA form during the execution ● The registers, flags and bytes of memory are assigned in the same way ● A memory reference can be assigned from a register reference (mov [mem], reg) ● A register reference can be assigned from a memory reference (mov reg, [mem]) ● If a reference doesn't exist yet, we concretize the value and we affect a new reference ● References conclusion
  • 25. 25 ● SMT Semantics Representation with SSA Form
  • 26. 26 ● SMT Semantics Representation with SSA Form ● All instructions semantics are represented via SMT2-LIB representation ● This SMT2-LIB representation is on SSA form add rax, rdx rax = (bvadd ((_ extract 63 0) rax) ((_ extract 63 0) rdx)) ... (af, cf, of, pf) ... sf = (ite (= ((_ extract 63 63) rax) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) zf = (ite (= rax (_ bv0 64)) (_ bv1 1) (_ bv0 1)) #60 = (bvadd ((_ extract 63 0) #58) ((_ extract 63 0) #54)) ... (af, cf, of, pf) ... #64 = (ite (= ((_ extract 63 63) #60) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) #65 = (ite (= #60 (_ bv0 64)) (_ bv1 1) (_ bv0 1)) Assembly SMT SSA SMT New rax reference Old rax reference Old rdx reference
  • 27. 27 ● SMT Semantics Representation with SSA Form ● Why use SMT2-LIB representation? – SMT-LIB is an international initiative aimed at facilitating research and development in Satisfiability Modulo Theories (SMT) – As all Triton's expressions are in the SMT2-LIB representation, you can plug all solvers which supports this representation ● Currently Triton has an interface with Z3 but feel free to plug what you want
  • 28. 28 ● Symbolic Execution Guided By The Taint Analysis
  • 29. 29 ● Symbolic Execution Guided By The Taint Analysis ● Taint analysis provides information about which registers and memory addresses are controllable by the user at each program point: – Assists the symbolic engine to setup the symbolic variables (a symbolic variable is a memory area that the user can control) – Limit the symbolic engine to the relevant part of the program – At each branch instruction, we directly know if the user can go through both branches (this is mainly used for code coverage)
  • 30. 30 ● Symbolic Execution Guided By The Taint Analysis ● Transform a tainted area into a symbolic variable 0x40058b: movzx eax, byte ptr [rax] -> #33 = SymVar_0 ; Controllable by the user -> #34 = (_ bv4195726 64) ; RIP 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) ; RIP 0x40058b: movzx eax, byte ptr [rax] -> #33 = ((_ zero_extend 24) (_ bv97 8)) -> #34 = (_ bv4195726 64) ; RIP 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) ; RIP rax points on a tainted area Use symbolic variable instead of concrete value
  • 31. 31 ● Symbolic Execution Guided By The Taint Analysis ● Can I go through this branch? – Check if flags are tainted 0x4005ae: cmp ecx, eax -> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70)) ...CF, OF, SF, AF, and PF skipped... -> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) ; ZF -> #79 = (_ bv4195760 64) ; RIP 0x4005b0: jz 0x4005b9 -> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) ; RIP eax is tainted tainted
  • 32. 32 ● Taint Analysis guided by the Symbolic Engine and the Solver Engine ● As the symbolic execution may be guided by the taint analysis, the taint analysis may also be guided by the symbolic execution and the solver engine ● What to choose between an over-approximation and under-approximation? – Over-approximation: We can generate inputs for infeasible concrete paths. – Under-approximation: We can miss some feasible paths. ● The goal of the taint engine is to say YES or NO if a register and memory is probably tainted (byte-level over approximation) ● The goal of the symbolic engine is to build symbolic expressions based on instructions semantics ● The goal of the solver engine is to generate a model of an expression (path condition) – If your target is not tainted, don't ask a model → gain time – If the solver engine returns UNSAT → the tainted inputs can't influence the control flow to go through this path. – If the solver engine returns SAT → the path can be triggered with the actual tainted inputs. The model give us the set of concrete inputs for this path.
  • 33. 33 ● Snapshot Engine – Replay your trace
  • 34. 34 ● Snapshot Engine – Replay your trace ● The snapshot engine offers the possibility to take and restore snapshot – Mainly used to apply code coverage in memory. Useful when you fuzz the binary – In future versions, it will be possible to take different snapshots at several program point ● The snapshot engine only restores registers and memory states – If there is some disk, network,... I/O, Triton won't be able to restore the files modification Restore context Snapshot Restore SnapshotTarget function / basic block Dynamic Symbolic Execution Possible paths in the target
  • 35. 35 ● Stop talking about back-end! Let's see how I can use Triton
  • 36. 36 ● How to install Triton? ● Easy is easy ● You just need: – Pin v2.14-71313 – Z3 v4.3.1 – Python v2.7 $ cd pin-2.14-71313-gcc.4.4.7-linux/source/tools/ $ git clone git@github.com:JonathanSalwan/Triton.git $ cd Triton $ make $ ../../../pin -t ./triton.so -script your_script.py -- ./your_target_binary.elf64 Shell 1: Installation
  • 37. 37 ● Start an analysis import triton if __name__ == '__main__': # Start the symbolic analysis from the 'check' function triton.startAnalysisFromSymbol('check') # Run the instrumentation - Never returns triton.runProgram() import triton if __name__ == '__main__': # Start the symbolic analysis from address triton.startAnalysisFromAddr(0x40056d) triton.stopAnalysisFromAddr(0x4005c9) # Run the instrumentation - Never returns triton.runProgram() Code 1: Start analysis from symbols Code 2: Start analysis from address
  • 38. 38 ● Predicate taint and untaint import triton if __name__ == '__main__': # Start the symbolic analysis from the 'check' function triton.startAnalysisFromSymbol('check') # Taint the RAX and RBX registers when the address 0x40058e is executed triton.taintRegFromAddr(0x40058e, [IDREF.REG.RAX, IDREF.REG.RBX]) # Untaint the RCX register when the address 0x40058e is executed triton.untaintRegFromAddr(0x40058e, [IDREF.REG.RCX]) # Run the instrumentation - Never returns triton.runProgram() Code 3: Predicate taint and untaint at specific addresses
  • 39. 39 ● Callbacks ● Triton supports 8 kinds of callbacks – AFTER ● Defines a callback after the instruction processing – BEFORE ● Defines a callback before the instruction processing – BEFORE_SYMPROC ● Defines a callback before the symbolic processing – FINI ● Define a callback at the end of the execution – ROUTINE_ENTRY ● Define a callback at the entry of a specified routine. – ROUTINE_EXIT ● Define a callback at the exit of a specified routine. – SYSCALL_ENTRY ● Define a callback before each syscall processing – SYSCALL_EXIT ● Define a callback after each syscall processing
  • 40. 40 ● Callback on SYSCALL def my_callback_syscall_entry(threadId, std): print '-> Syscall Entry: %s' %(syscallToString(std, getSyscallNumber(std))) if getSyscallNumber(std) == IDREF.SYSCALL.LINUX_64.WRITE: arg0 = getSyscallArgument(std, 0) arg1 = getSyscallArgument(std, 1) arg2 = getSyscallArgument(std, 2) print ' sys_write(%x, %x, %x)' %(arg0, arg1, arg2) def my_callback_syscall_exit(threadId, std): print '<- Syscall return %x' %(getSyscallReturn(std)) if __name__ == '__main__': startAnalysisFromSymbol('main') addCallback(my_callback_syscall_entry, IDREF.CALLBACK.SYSCALL_ENTRY) addCallback(my_callback_syscall_exit, IDREF.CALLBACK.SYSCALL_EXIT) runProgram() Code 4: Callback before and after syscalls processing -> Syscall Entry: fstat <- Syscall return 0 -> Syscall Entry: mmap <- Syscall return 7fb7f06e1000 -> Syscall Entry: write sys_write(1, 7fb7f06e1000, 6) Code 4 result
  • 41. 41 ● Callback on ROUTINE def mallocEntry(threadId): sizeAllocated = getRegValue(IDREF.REG.RDI) print '-> malloc(%#x)' %(sizeAllocated) def mallocExit(threadId): ptrAllocated = getRegValue(IDREF.REG.RAX) print '<- %#x' %(ptrAllocated) if __name__ == '__main__': startAnalysisFromSymbol('main') addCallback(mallocEntry, IDREF.CALLBACK.ROUTINE_ENTRY, "malloc") addCallback(mallocExit, IDREF.CALLBACK.ROUTINE_EXIT, "malloc") runProgram() Code 5: Callback before and after routine processing -> malloc(0x20) <- 0x8fc010 -> malloc(0x20) <- 0x8fc040 -> malloc(0x20) <- 0x8fc010 Code 5 result
  • 42. 42 ● Callback BEFORE and AFTER instruction processing def my_callback_before(instruction): print 'TID (%d) %#x %s' %(instruction.threadId, instruction.address, instruction.assembly) if __name__ == '__main__': # Start the symbolic analysis from the 'check' function startAnalysisFromSymbol('check') # Add a callback. addCallback(my_callback_before, IDREF.CALLBACK.BEFORE) # Run the instrumentation - Never returns runProgram() Code 6: Callback before instruction processing TID (0) 0x40056d push rbp TID (0) 0x40056e mov rbp, rsp TID (0) 0x400571 mov qword ptr [rbp-0x18], rdi TID (0) 0x400575 mov dword ptr [rbp-0x4], 0x0 ... TID (0) 0x4005b2 mov eax, 0x1 TID (0) 0x4005b7 jmp 0x4005c8 TID (0) 0x4005c8 pop rbp Code 6 result
  • 43. 43 ● Instruction class def my_callback(instruction): ... ● instruction.address ● instruction.assembly ● instruction.imageName - e.g: libc.so ● instruction.isBranch ● instruction.opcode ● instruction.opcodeCategory ● instruction.operands ● instruction.symbolicElements – List of SymbolicElement class ● instruction.routineName - e.g: main ● instruction.sectionName - e.g: .text ● instruction.threadId
  • 44. 44 ● SymbolicElement class instruction.symbolicElements[0] ● symbolicElement.comment → blah ● symbolicElement.destination → #41 ● symbolicElement.expression → #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ● symbolicElement.id → 41 ● symbolicElement.isTainted → True or False ● symbolicElement.source → (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) Instruction: add rax, rdx SymbolicElement: #41 = (bvadd ((_ extract 63 0) #40) ((_ extract 63 0) #39)) ; blah
  • 45. 45 ● Dump the symbolic expressions trace def my_callback_after(instruction): print '%#x: %s' %(instruction.address, instruction.assembly) for se in instruction.symbolicElements: print 't -> ', se.expression print if __name__ == '__main__': startAnalysisFromSymbol('check') addCallback(my_callback_after, IDREF.CALLBACK.AFTER) runProgram() Code 7: Dump a symbolic expression trace 0x4005ab: movsx eax, al -> #70 = ((_ sign_extend 24) ((_ extract 7 0) #68)) -> #71 = (_ bv4195758 64) 0x4005ae: cmp ecx, eax -> #72 = (bvsub ((_ extract 31 0) #52) ((_ extract 31 0) #70)) ... -> #77 = (ite (= ((_ extract 31 31) #72) (_ bv1 1)) (_ bv1 1) (_ bv0 1)) -> #78 = (ite (= #72 (_ bv0 32)) (_ bv1 1) (_ bv0 1)) -> #79 = (_ bv4195760 64) 0x4005b0: jz 0x4005b9 -> #80 = (ite (= #78 (_ bv1 1)) (_ bv4195769 64) (_ bv4195762 64)) Code 7 result
  • 46. 46 ● Play with the Taint engine at runtime # 0x40058b: movzx eax, byte ptr [rax] def cbeforeSymProc(instruction): if instruction.address == 0x40058b: rax = getRegValue(IDREF.REG.RAX) taintMem(rax) if __name__ == '__main__': startAnalysisFromSymbol('check') addCallback(cbeforeSymProc, IDREF.CALLBACK.BEFORE_SYMPROC) runProgram() Code 8: Taint memory at runtime 0x40058b: movzx eax, byte ptr [rax] -> #33 = SymVar_0 -> #34 = (_ bv4195726 64) 0x40058e: movsx eax, al -> #35 = ((_ sign_extend 24) ((_ extract 7 0) #33)) -> #36 = (_ bv4195729 64) Code 8 result Modifications must be done before the symbolic processing
  • 47. 47 ● Taint argv[x][x] at the main function def mainAnalysis(threadId): rdi = getRegValue(IDREF.REG.RDI) # argc rsi = getRegValue(IDREF.REG.RSI) # argv while rdi != 0: argv = getMemValue(rsi + ((rdi-1) * 8), 8) offset = 0 while getMemValue(argv + offset, 1) != 0x00: taintMem(argv + offset) offset += 1 print '[+] %03d bytes tainted from the argv[%d] (%#x) pointer' %(offset, rdi-1, argv) rdi -= 1 return Code 9: Taint all arguments when the main function occurs $ pin -t ./triton.so -script taint_main.py -- ./example.bin64 12 123456 123456789 [+] 009 bytes tainted from the argv[3] (0x7fff802ad116) pointer [+] 006 bytes tainted from the argv[2] (0x7fff802ad10f) pointer [+] 002 bytes tainted from the argv[1] (0x7fff802ad10c) pointer [+] 015 bytes tainted from the argv[0] (0x7fff802ad0ef) pointer Code 9 result
  • 48. 48 ● Play with the Symbolic engine 0x40058b: movzx eax, byte ptr [rax] ... ... 0x4005ae: cmp ecx, eax Example 10: Assembly code def callback_beforeSymProc(instruction): if instruction.address == 0x40058b: rax = getRegValue(IDREF.REG.RAX) taintMem(rax) def callback_after(instruction): if instruction.address == 0x4005ae: # Get the symbolic expression ID of ZF zfId = getRegSymbolicID(IDREF.FLAG.ZF) # Backtrack the symbolic expression ZF zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) print expr Code 10: Backtrack symbolic expression We know that rax points on a tainted area (assert (= (ite (= (bvsub ((_ extract 31 0) ((_ extract 31 0) (bvxor ((_ extract 31 0) (bvsub ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) SymVar_0))) (_ bv1 32))) (_ bv85 32)))) ((_ extract 31 0) ((_ sign_extend 24) ((_ extract 7 0) ((_ zero_extend 24) (_ bv49 8)))))) (_ bv0 32)) (_ bv1 1) (_ bv0 1)) (_ bv1 1))) Example 10 result Symbolic Variable
  • 49. 49 ● Play with the Symbolic engine ... zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) ... Extract of the Code 10 ● What does it really mean? – Triton builds symbolic formulas based on the instructions semantics – Triton also exports smt2lib functions which allows you to create your own formula – In this example, we want that the ZF expression is equal to 1
  • 50. 50 ● Play with the Solver engine ... zfExpr = getBacktrackedSymExpr(zfId) # Craft a new expression over the ZF expression : (assert (= zfExpr True)) expr = smt2lib.smtAssert(smt2lib.equal(zfExpr, smt2lib.bvtrue())) ... model = getModel(expr) print model Extract of the Code 10 {'SymVar_0': 0x65} Result ● getModel() returns a dictionary of valid model for each symbolic variable 0x40058b: movzx eax, byte ptr [rax] ... ... 0x4005ae: cmp ecx, eax Example 10: Assembly code We know now that the first character must be 0x65 to set the ZF at the compare instruction
  • 51. 51 ● Play with the Solver engine and inject values directly in memory ... model = getModel(expr) print model Extract of the Code 10 Result ● Each symbolic variable is assigned to a memory address (SymVar ↔ Address) – Possible to get the symbolic variable from a memory address ● getSymVarFromMemory(addr) – Possible to get the memory address from a symbolic variable ● getMemoryFromSymVar(symVar) {'SymVar_0': 0x65} for k, v in model.items(): setMemValue(getMemoryFromSymVar(k), getSymVarSize(k), v) Inject values given by the solver in memory
  • 52. 52 ● Inject values in memory is not enough Play with the snapshot engine ● Inject values in memory after instructions processing is useless ● That's why Triton offers a snapshot engine φ1 φ2 φ3 φ4 φ5 φ6 0x40058b: movzx eax, byte ptr [rax] 0x4005ae: cmp ecx, eax def callback_after(instruction): if instruction.address == 0x40058b and isSnapshotEnable() == False: takeSnapshot() if instruction.address == 0x4005ae: if getFlagValue(IDREF.FLAG.ZF) == 0: zfExpr = getBacktrackedSymExpr(...) # Described on slide 45 expr = smt2lib.smtAssert(...zfExpr...) # Described on slide 45 for k, v in getModel(expr).items(): # Described on slide 48 saveValue(...) restoreSnapshot() restore snapshottake snapshot
  • 53. 53 ● Stop pasting fucking code Show me a global vision
  • 54. 54 ● Stop pasting fucking code Show me a global vision ● Full API and Python bindings describes here – https://github.com/JonathanSalwan/Triton/wiki/Python-Bindings – ~80 functions exported over the Python bindings ● Basically we can: – Taint and untaint memory and registers – Inject value in memory and registers – Add callbacks at each program point, syscalls, routine – Assign symbolic expression on registers and bytes of memory – Build and customize symbolic expressions – Solve symbolic expressions – Take and restore snapshots – Do all this in Python!
  • 56. 56 ● Conclusion ● Triton: – is a Pintool which provides others classes for DBA – is designed as a concolic execution framework – provides an API and Python bindings – supports only x86-64 binaries – currently supports ~100 semantics but we are working hard on it to increase the semantics support ● An awesome thanks to Kevin `wisk` Szkudlapski and Francis `gg` Gabriel for the x86.yaml from the Medusa project :) – is free and open-source :) – is available here : github.com/JonathanSalwan/Triton
  • 57. 57 ● Contacts – fsaudel@gmail.com – jsalwan@quarkslab.com ● Thanks – We would like to thank the SSTIC's staff and especially Rolf Rolles, Sean Heelan, Sébastien Bardin, Fred Raynal and Serge Guelton for their proofreading and awesome feedbacks! Then, a big thanks to Quarkslab for the sponsor. Thanks For Your Attention Question(s)?
  • 58. 58 ● Q&A - Performances ● Host machine configuration – Tested with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz – 16 Go DDR3 – 415 Go SSD Swap ● The targeted binary analyzed was /usr/bin/z3 – 6,789,610 symbolic expressions created for 1 trace – The binary has been analyzed in 180 seconds ● One trace with SMT2-LIB translation and the taint spread – 19 Go of RAM consumed ● Due to the SMT2-LIB strings manipulation