An introduction to binary exploitation
Gábor Pék
About me
● Co-founder and CTO at Avatao
● PhD in virtualization and malware security (CrySyS Lab)
● Virtualization hacks (e.g., XSA-59)
● Research of advanced malware (e.g., Duqu, Flame)
● Co-founder of !SpamAndHex (3x DEFCON CTF Finalist team)
So why this topic?
https://www.theverge.com
Let’s begin
L Szekeres, M Payer, Wei Tao, D Song: SoK – Eternal War in Memory, IEEE Security and Privacy, 2013
Maybe at another time :)
L Szekeres, M Payer, Wei Tao, D Song: SoK – Eternal War in Memory, IEEE Security and Privacy, 2013
First, Understand the basics
● Compilers (native, JIT, AOT) vs. interpreters
● Native binaries
○ compiled from C, C++ or Rust for example
● Loaders
● Program memory layout
● CPU instructions and registers, including assembly
● Types of bugs
Interpreters
“An interpreter executes some
representation of a program’s source
file at run-time.”
Randall Hyde
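To make the definition concrete, here is a minimal sketch of an interpreter in C. The function name interpret_postfix and the tiny postfix language it accepts (single-digit operands and the operators + - *) are assumptions made for illustration; they are not taken from the talk or from Hyde's book.

```c
#include <ctype.h>

/* Hypothetical toy interpreter: walks the program text at run-time and
 * executes it directly, without ever producing machine code.
 * Returns 0 and stores the value in *result, or -1 on malformed input. */
int interpret_postfix(const char *program, int *result) {
    int stack[32];   /* operand stack of the interpreter itself */
    int depth = 0;

    for (const char *p = program; *p != '\0'; p++) {
        unsigned char ch = (unsigned char)*p;
        if (isspace(ch)) {
            continue;                       /* separators carry no meaning */
        }
        if (isdigit(ch)) {
            if (depth == 32) return -1;     /* operand stack is full */
            stack[depth++] = ch - '0';      /* push the operand */
        } else if (ch == '+' || ch == '-' || ch == '*') {
            if (depth < 2) return -1;       /* not enough operands */
            int rhs = stack[--depth];
            int lhs = stack[--depth];
            stack[depth++] = (ch == '+') ? lhs + rhs
                           : (ch == '-') ? lhs - rhs
                                         : lhs * rhs;
        } else {
            return -1;                      /* unknown symbol */
        }
    }
    if (depth != 1) return -1;              /* leftover or missing operands */
    *result = stack[0];
    return 0;
}
```

Calling interpret_postfix("7 3 +", &value) stores 10 in value. A compiler would instead translate the same expression into machine code once, ahead of execution.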
Compilers
“A compiler translates a source
program in text form into executable
machine code.”
Randall Hyde
Compilation process (simplified)
Source code → Analysis → Optimization → Code generation → Machine code
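A small function is enough to watch this pipeline at work. The sketch below assumes a GCC toolchain on Linux; the file name square.c and the function name square are hypothetical and chosen only for this example.

```c
/* Hypothetical file: square.c
 *
 * Commands one might use to inspect each stage (assuming GCC and binutils):
 *   gcc -S square.c              writes the generated assembly to square.s
 *   gcc -S -masm=intel square.c  same, but in Intel syntax on x86
 *   gcc -c square.c              writes the object (machine) code to square.o
 *   objdump -d square.o          disassembles the object code again
 */
int square(int x) {
    return x * x;   /* a single multiplication in the generated code */
}
```

Comparing the output with and without -O2 shows the effect of the optimization stage on the generated code.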
Example Source code
Machine (object) code
Basic x86 system
Source: [1]
CPU registers (32-bit)
Source: [1]
Interested in architecture internals? :D
Intel® 64 and IA-32 Architectures
Software Developer Manuals
(4846 pages)
Assembly basics
Assembly language is a human-readable
representation of machine code
Assembly instructions
AT&T syntax: <instruction (mnemonic)> <source>, <destination>
Intel syntax: <instruction (mnemonic)> <destination>, <source>
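The operand order is easiest to see on a single register-to-register move. The sketch below assumes GCC-style inline assembly, which uses AT&T syntax by default on x86; the function name copy_with_mov is hypothetical, and on other compilers or architectures the code falls back to a plain C assignment.

```c
/* The same instruction in both notations (registers chosen for illustration):
 *   AT&T:   movl %ebx, %eax     source first, destination second
 *   Intel:  mov  eax, ebx       destination first, source second
 */
int copy_with_mov(int src) {
    int dst;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T order: %1 (src) is the source, %0 (dst) is the destination. */
    __asm__("movl %1, %0" : "=r"(dst) : "r"(src));
#else
    dst = src;   /* fallback for non-x86 targets or other compilers */
#endif
    return dst;
}
```

The compiler picks the actual registers for %0 and %1, so the registers in the comment are only an example.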
Machine code vs Assembly
So how is data stored in memory?
Stack: built up from the stack frames of functions, LIFO, grows downwards (towards lower addresses)
Heap: dynamically allocated memory, grows upwards (towards higher addresses)
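A short sketch of the two kinds of storage in C. The function name sum_with_heap_buffer is hypothetical; the placement described in the comments is what typical implementations do, not something the C standard guarantees.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sums 0..n-1 using a temporary heap buffer.
 * Returns the sum, 0 for n <= 0, or -1 if the allocation fails. */
int sum_with_heap_buffer(int n) {
    int total = 0;   /* local variable: lives in this function's stack frame */

    if (n <= 0) {
        return 0;
    }

    /* Dynamically allocated: lives on the heap until free() is called. */
    int *values = malloc((size_t)n * sizeof *values);
    if (values == NULL) {
        return -1;
    }

    for (int i = 0; i < n; i++) {
        values[i] = i;
    }
    for (int i = 0; i < n; i++) {
        total += values[i];
    }

    free(values);    /* heap memory must be released explicitly */
    return total;    /* the stack frame disappears when the function returns */
}
```

The local variable total is released automatically on return, while the buffer behind values stays allocated until free() is called.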
ELF binary in memory … too much for today
Function call example
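#include <stdio.h>

int addnum(int a, int b);   /* declared before its first use in main */

int main(void){
    int a, b, c;
    a = 7;
    b = 3;
    c = addnum(a, b);
    printf("%d", c);
    return 0;
}

int addnum(int a, int b){
    int c = 4;      /* local variable, overwritten on the next line */
    c = a + b;
    return c;
}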
Stack frame example (x86)
Lower memory addresses
Free memory
Local variables ← ESP
Saved EBP ← EBP
Return address
Function parameters
Previous frame
Higher memory addresses
Stack frame before function call (x86)
Lower memory addresses
Free memory
← ESP (the stack pointer, points to the top of the stack frame)
Second function parameter placed first
Lower memory addresses
Free memory
3 (b) ← ESP
    int main(void){
        int a, b, c;
        a=7;
        b=3;
        c = addnum(a,b);
        printf("%d", c);
    }
First function parameter placed after
Lower memory addresses
Free memory
7 (a) ← ESP
3 (b)
    int main(void){
        int a, b, c;
        a=7;
        b=3;
        c = addnum(a,b);
        printf("%d", c);
    }
Place return address (address of the next instruction)
Lower memory addresses
Free memory
Return address ← ESP (address of __asm__ add esp, 8)
7 (a)
3 (b)
    int main(void){
        int a, b, c;
        a=7;
        b=3;
        c = addnum(a,b);
        printf("%d", c);
    }
Save the base pointer of the previous stack frame
Lower memory addresses
Free memory
Saved EBP ← ESP
Return address
7 (a)
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ push ebp
__asm__ mov ebp, esp
__asm__ sub esp, 0x10
Point the base pointer at the current stack frame
Lower memory addresses
Free memory
Saved EBP ← ESP, EBP
Return address
7 (a)
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ push ebp
__asm__ mov ebp, esp
__asm__ sub esp, 0x10
Create space for local variables
Lower memory addresses
Free memory
Space for local variables ← ESP
Saved EBP ← EBP
Return address
7 (a)
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ push ebp
__asm__ mov ebp, esp
__asm__ sub esp, 0x10
Let’s return: remove local variables from the stack frame
Lower memory addresses
Free memory
...
Saved EBP ← EBP, ESP
Return address
7 (a)
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ mov esp, ebp
__asm__ pop ebp
__asm__ ret
Restore EBP to its previous value
Lower memory addresses
Free memory
...
Return address ← ESP
7 (a)
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ mov esp, ebp
__asm__ pop ebp
__asm__ ret
Return to the address we saved before
Lower memory addresses
Free memory
...
7 (a) ← ESP
3 (b)
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
__asm__ mov esp, ebp
__asm__ pop ebp
__asm__ ret
Caller removes the function parameters from the stack
Lower memory addresses
Free memory
← ESP
    int main(void){
        int a, b, c;
        a=7;
        b=3;
        c = addnum(a,b);
        printf("%d", c);
    }
__asm__ add esp, 8
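The whole call sequence can be replayed in a small simulation. This is only a sketch under stated assumptions: the Machine struct, the helper names, the 64-word stack size and the return address 0x1000 are all hypothetical, and the stack is modelled as an array of 32-bit words, so 4 bytes on the real stack correspond to one array slot here.

```c
#include <stdint.h>

enum { STACK_WORDS = 64 };          /* hypothetical size of the simulated stack */

typedef struct {
    uint32_t mem[STACK_WORDS];      /* one slot per 32-bit word */
    int esp;                        /* slot index; a smaller index is a lower address */
    int ebp;
} Machine;

typedef struct {
    uint32_t value;                 /* what addnum leaves in EAX */
    int stack_balanced;             /* 1 if ESP and EBP are back where they started */
} CallResult;

static void push(Machine *m, uint32_t v) { m->mem[--m->esp] = v; }
static uint32_t pop(Machine *m) { return m->mem[m->esp++]; }

CallResult simulate_addnum_call(uint32_t a, uint32_t b) {
    Machine m = { .esp = STACK_WORDS, .ebp = STACK_WORDS };
    const uint32_t return_address = 0x1000;   /* hypothetical address of "add esp, 8" */

    /* Caller: push the parameters right to left, then call. */
    push(&m, b);                              /* second parameter first */
    push(&m, a);                              /* first parameter after */
    push(&m, return_address);                 /* call addnum */

    /* Callee prologue. */
    push(&m, (uint32_t)m.ebp);                /* push ebp */
    m.ebp = m.esp;                            /* mov ebp, esp */
    m.esp -= 4;                               /* sub esp, 0x10 (16 bytes = 4 words) */

    /* Callee body: a is at [ebp+8], b at [ebp+12], local c at [ebp-4]. */
    m.mem[m.ebp - 1] = 4;                                 /* int c = 4; */
    m.mem[m.ebp - 1] = m.mem[m.ebp + 2] + m.mem[m.ebp + 3]; /* c = a + b; */
    uint32_t eax = m.mem[m.ebp - 1];                      /* return c; */

    /* Callee epilogue. */
    m.esp = m.ebp;                            /* mov esp, ebp */
    m.ebp = (int)pop(&m);                     /* pop ebp */
    (void)pop(&m);                            /* ret: pops the return address */

    /* Caller: remove the two parameters. */
    m.esp += 2;                               /* add esp, 8 (8 bytes = 2 words) */

    CallResult result = { eax, m.esp == STACK_WORDS && m.ebp == STACK_WORDS };
    return result;
}
```

For a = 7 and b = 3 the simulation yields 10 and reports a balanced stack, matching the slides: every push in the call sequence is undone by the epilogue and the caller's cleanup.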
What about 64-bit mode (e.g., System V AMD64)?
Function parameters are passed in CPU registers:
RDI, RSI, RDX, RCX, R8, R9 for integer and pointer arguments,
XMM0-7 for floating-point arguments
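As an illustration of this register assignment, the sketch below annotates two ordinary C functions with the registers their parameters would use under the System V AMD64 calling convention. The function names sum6 and scale are hypothetical, and the mapping only applies when the code is compiled for that ABI.

```c
/* Integer arguments, in order: RDI, RSI, RDX, RCX, R8, R9.
 * A seventh integer argument would be passed on the stack. */
long sum6(long a,   /* RDI */
          long b,   /* RSI */
          long c,   /* RDX */
          long d,   /* RCX */
          long e,   /* R8  */
          long f)   /* R9  */
{
    return a + b + c + d + e + f;   /* integer result is returned in RAX */
}

/* Floating-point and integer arguments are counted separately. */
double scale(double x,      /* XMM0: first floating-point argument */
             long factor)   /* RDI:  first integer argument */
{
    return x * (double)factor;      /* floating-point result is returned in XMM0 */
}
```

Compared with the 32-bit walkthrough, a call such as sum6(1, 2, 3, 4, 5, 6) needs no parameter pushes at all; only the return address still goes on the stack.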
Debuggers (e.g., GDB, WinDbg, IDA)
Understand the runtime behaviour of
a program
Memory corruption errors
Attackers can read or write
arbitrary memory locations of a
vulnerable program
Vulnerability Examples
Stack buffer overflow, heap overflow,
integer overflow, format string,
use-after-free, ...
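The first item on this list is the classic case. The sketch below shows the vulnerable pattern only as a comment and implements a bounds-checked alternative; the function name copy_bounded and the 8-byte buffer are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Vulnerable pattern (shown as a comment only):
 *
 *     char buf[8];
 *     strcpy(buf, input);   // writes past buf if input has 8 or more characters
 *
 * On x86 the bytes written past the end of buf can reach the saved EBP and
 * the return address stored in the same stack frame.
 */

/* Copies src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, -1 if src is too long or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (dst_size == 0 || len >= dst_size) {
        return -1;              /* refuse instead of overflowing */
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL byte */
    return 0;
}
```

With an 8-byte destination, copy_bounded accepts "abc" and rejects "0123456789" without touching memory outside the buffer.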
Protections
Stack canaries, DEP, ASLR, Control
and Data Flow Integrity, using safe
libraries, ...
Some exploitation techniques
Return-to-libc, ROP, JOP, GOT
hijacking, off-by-one, ...
Maybe next time
L Szekeres, M Payer, Wei Tao, D Song: SoK – Eternal War in Memory, IEEE Security and Privacy, 2013
My two cents
Write less and more secure code :)
Readings
[1] Randall Hyde: Write Great Code, Volume 2 - Thinking low-level, writing high-level, No Starch Press, 2006
[2] L Szekeres, M Payer, Wei Tao, D Song: SoK – Eternal War in Memory, IEEE Security and Privacy, 2013
[3] Intel® 64 and IA-32 Architectures Software Developer Manuals, 2016
Huge thanks for coming!
Questions? gabor.pek@avatao.com

Hacker Thursdays: An introduction to binary exploitation