Inspecting and Manipulating binaries
Introduction.
x86 architecture. Assembler.
Binary inspection.
  General sample (crackme)
Binary manipulation.
  Python to the rescue!
Malware analysis

What we (you) are going to do
Welcome to the city of death
$ whoami




  Braaaaiiinnnssss
$ whoami
           Binary Reverse Engineering
           Malware
           Programming
           The walking dead




Yeah, but what is reversing?

  Kind of a reconstruction: from the binary back to the code.

  Code                              Binary
  int main(int argc, char **argv)   push ebp
  {                                 mov ebp, esp
   int gv = 0; // global            mov dword_403394, 0
   […]                              […]
So, I want to do some reversing

Prerequisites
   ASM knowledge
   Binary formats (PE32, etc.)
   OS knowledge
   Patience ;)
In the beginning, there was c0de

    #include "stdafx.h"
    #include <iostream>

    using namespace std;

    int add(int x, int y)
    {
        return x + y;
    }

    int main()
    {
        int a = 0, b = 0;
        cout << "Enter a: ";
        […]
        return 0;
    }
The compiler fucks it all up

Compiling is like…
… that scene from Apollo 13
It may hurt a little bit…

Bytes :: opcodes

Opcodes :: basic blocks

Basic blocks :: functions
CPU registers
"General purpose" registers:
  eax, ebx, ecx, edx, esi, edi, ebp, esp
  32 bits wide (extended to 64 bits in x86_64: rax, rbx, …)
  Some of them have special uses.

Picture: Bill Blunden ("The Rootkit Arsenal")
CPU registers
Not so "general purpose" registers:
  eax: arithmetic and function return values
  ecx: counter (loops, etc.)
  esi, edi: source and destination in memcpy, strcpy, etc.
  ebp, esp: stack operations :)
  and more...
Some common instructions
Read (memory to register)
  mov eax, [ecx]
  mov eax, [00401000]
Write (immediate to register, register to memory)
  mov ebx, 0x20
  mov [ebx+0x1C], ecx
Some common instructions
push
  push 0x100
  push dword_1377
pop
  pop ecx
  pop dword ptr [0x00c0ffee]
  (pop needs a register or memory destination, never an immediate)
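To make push and pop concrete, here is a minimal sketch of a toy stack in Python. The `Stack` class, its size and the pushed values are assumptions for illustration only. It models just the two rules that matter: push decrements ESP by 4 and then stores a dword, pop loads a dword and then increments ESP by 4.

```python
import struct

class Stack:
    """Toy model of the x86 stack: a byte buffer plus an ESP value."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.esp = size  # empty stack: ESP points just past the buffer

    def push(self, value):
        # push: ESP goes down by 4, then the dword is stored at [ESP]
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self):
        # pop: the dword at [ESP] is read, then ESP goes up by 4
        (value,) = struct.unpack_from("<I", self.mem, self.esp)
        self.esp += 4
        return value

stack = Stack()
stack.push(0x100)
stack.push(0xC0FFEE)
print(hex(stack.esp))    # 0xf8: two dwords below the starting point
print(hex(stack.pop()))  # 0xc0ffee: last in, first out
```

Note how ESP moves towards lower addresses as the stack grows.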
Some common instructions
inc, dec
mul, div
  Use your imagination :)

add, sub
  add esp, 0x08
  sub esp, 0x1c
Some common instructions
lea dst, src
  lea eax, [esi*2]
  lea ecx, [esi+ecx]

shl, shr
  shl eax, 2
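A small sketch of what these instructions compute, assuming 32-bit registers. The helper names and the register values are made up for the example. The point is that lea only evaluates the address expression and never touches memory, and that shifting left by n multiplies by 2^n.

```python
MASK32 = 0xFFFFFFFF

def lea(base=0, index=0, scale=1, disp=0):
    """Evaluate base + index*scale + disp, truncated to 32 bits (no memory access)."""
    return (base + index * scale + disp) & MASK32

def shl(value, count):
    """Shift left, dropping the bits that fall off a 32-bit register."""
    return (value << count) & MASK32

def shr(value, count):
    """Logical shift right on a 32-bit value."""
    return (value & MASK32) >> count

esi, ecx = 0x10, 0x2000                # hypothetical register contents
print(hex(lea(index=esi, scale=2)))    # lea eax, [esi*2]   -> 0x20
print(hex(lea(base=esi, index=ecx)))   # lea ecx, [esi+ecx] -> 0x2010
print(hex(shl(0x10, 2)))               # shl eax, 2         -> 0x40
```

Compilers like lea because it does arithmetic without modifying the flags.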
Some common instructions
jxx dst
  jmp 0xbadc0de
  jnz 0xc0de
  ja 0xc0ffee
  jl 0xf00d
  (conditional jumps only take a relative target, never a register)

call dst
  call 0x00412F1E
Some common instructions
cmp dst, src     (computes dst - src, only sets the flags)
  cmp eax, ebx
  cmp ecx, 0xFF

test dst, src    (computes dst AND src, only sets the flags)
  test ecx, edx
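As a rough illustration, the sketch below models how cmp and test set three of the flags that conditional jumps look at. The function names are assumptions, and the overflow, parity and auxiliary flags are left out on purpose to keep it short.

```python
MASK32 = 0xFFFFFFFF

def flags_after_cmp(dst, src):
    """Flags after `cmp dst, src`: dst - src is computed and thrown away."""
    dst &= MASK32
    src &= MASK32
    result = (dst - src) & MASK32
    return {
        "ZF": result == 0,        # operands were equal
        "SF": bool(result >> 31), # top bit of the result
        "CF": dst < src,          # unsigned borrow
    }

def flags_after_test(dst, src):
    """Flags after `test dst, src`: dst AND src is computed and thrown away."""
    result = dst & src & MASK32
    return {"ZF": result == 0, "SF": bool(result >> 31), "CF": False}

print(flags_after_cmp(0xFF, 0xFF))  # ZF set: a following jz / je is taken
print(flags_after_test(0, 0))       # `test eax, eax` with eax == 0 sets ZF
```

This is why `test eax, eax` followed by `jz` is the usual "is the return value zero?" idiom.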
Important segments
.data: statically allocated, *initialized*
  int g = 1; char str[] = "yomamma";
.bss: statically allocated, *UNinitialized*
  int var; char *ptr;
.rdata: read-only data (const int c = 0;)
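These names live in the section table of the PE file. The following is a sketch of how that table can be walked with nothing but `struct`, assuming a well-formed 32-bit PE. The demo image is a hypothetical, heavily truncated buffer built only to exercise the parser, and its sizes are invented.

```python
import struct

def list_sections(image: bytes):
    """Return (name, virtual_size, raw_size) for every section header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF header: NumberOfSections at +6, SizeOfOptionalHeader at +20
    (count,) = struct.unpack_from("<H", image, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", image, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size
    sections = []
    for i in range(count):
        entry = table + 40 * i  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        virtual_size, _rva, raw_size = struct.unpack_from("<III", image, entry + 8)
        sections.append((name, virtual_size, raw_size))
    return sections

def build_demo_image():
    """Hypothetical image: headers only, no optional header, invented sizes."""
    image = bytearray(0x100)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x40)   # e_lfanew
    image[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", image, 0x46, 2)      # NumberOfSections
    struct.pack_into("<H", image, 0x54, 0)      # SizeOfOptionalHeader
    demo = [(b".data", 0x1000, 0x200), (b".bss", 0x400, 0)]
    for i, (name, vsize, rsize) in enumerate(demo):
        entry = 0x58 + 40 * i
        image[entry:entry + len(name)] = name
        struct.pack_into("<III", image, entry + 8, vsize, 0x1000 * (i + 1), rsize)
    return bytes(image)

for name, vsize, rsize in list_sections(build_demo_image()):
    print(f"{name:8} virtual={vsize:#x} raw={rsize:#x}")
```

In the demo, .bss has a raw size of zero: uninitialized data needs address space at run time but no bytes on disk.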
It's all about the .(r)data

Moar data
Imports and all that stuff…

The problem with imports

Problem: O(n²) vs. O(n)

"Thunks and dynamic resolution make the binary
*portable* between Windows versions.
XP and Win7 don't have the same addresses for all
kernel32 functions."
Regular call: thunks
  Opcode: 0xE8 + relative offset
  memcpy = thunk (just a jmp)
  Not resolved yet
call ds:xxx: "direct" import calls
  Opcodes: 0xFF 0x15 + absolute address
  Not resolved yet
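The two encodings can be told apart by their first bytes. Below is a minimal sketch of a decoder for exactly these two forms. The function name, the call-site address 0x401000 and the import slot 0x402000 are assumptions for the example. For E8 the target is relative to the end of the 5-byte instruction; for FF 15 the operand is the address of the slot that holds the real function pointer.

```python
import struct

def decode_call(code: bytes, address: int):
    """Decode `call rel32` (E8) or `call dword ptr [abs32]` (FF 15).

    Returns ("relative", target), ("indirect", slot_address) or None.
    """
    if code[:1] == b"\xE8" and len(code) >= 5:
        (rel,) = struct.unpack_from("<i", code, 1)  # signed 32-bit offset
        return "relative", (address + 5 + rel) & 0xFFFFFFFF
    if code[:2] == b"\xFF\x15" and len(code) >= 6:
        (slot,) = struct.unpack_from("<I", code, 2)
        return "indirect", slot
    return None

kind, target = decode_call(bytes.fromhex("e8191f0100"), 0x401000)
print(kind, hex(target))   # relative 0x412f1e

kind, slot = decode_call(bytes.fromhex("ff1500204000"), 0x401000)
print(kind, hex(slot))     # indirect 0x402000
```

The loader only has to fill in the slot once, and every `call ds:xxx` that references it picks up the resolved address.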
Page based (4 KB pages, arch dependent)

Pages (virtual mem.) -> page tables -> page frames (physical mem.)

Pages have several attributes:
  User / Supervisor (Kernel)
  read-only, read-write, read-execute, etc.

Memory is not disk, though!
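As an illustration of the page-table lookup, the sketch below splits a 32-bit virtual address the way classic x86 paging does. It assumes non-PAE mode with 4 KB pages: 10 bits of page directory index, 10 bits of page table index, 12 bits of offset. The function name is made up.

```python
def split_virtual_address(va: int):
    """Split a 32-bit virtual address for non-PAE x86 paging with 4 KB pages."""
    return {
        "directory_index": (va >> 22) & 0x3FF,  # selects a page table
        "table_index": (va >> 12) & 0x3FF,      # selects a page frame
        "offset": va & 0xFFF,                   # byte within the 4 KB page
    }

print(split_virtual_address(0x00401234))
# {'directory_index': 1, 'table_index': 1, 'offset': 564}
```

Only the two indices are translated. The offset is copied unchanged into the physical address.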
Picture: CC Hameed (http://blogs.technet.com)
Picture: Bill Blunden (The Rootkit Arsenal)
An application starts: a process is created.
   Process = running instance of an application.
   Processes are isolated from each other.
   Every process gets its own virtual address space (32-bit: 4 GB)

A process is a container (own virtual memory, handles, at least one thread)
   Thread = execution context within a process.
   Multithreading.
   Threads share the process's resources (code, data, handles)
       Own stack, though.
       Thread Local Storage.
   Threads (within a process) can share memory.

Processes and Threads
Important data structures in every process

  PEB: Process Environment Block (1 per process)
      Location of the executable (ImageBase)
      Information about DLLs
      Information regarding the heap

  TEB: Thread Environment Block (1 per thread)
      Location of the PEB
      Location of the stack
      Pointer to the first SEH chain entry

PEB vs. TEB
Process memory segmentation:

   Code (.text): like the segment on disk

   Data (.data): like the segment on disk

   Stack: function arguments, local variables
       Grows towards lower addresses
       Current frame delimited by its top (ESP) and its base (EBP)
       PUSH vs POP (dword)

   Heap: managed by allocator/deallocator algorithms

Two new segments
Picture (courtesy of Corelan, Copyright (c) Corelan GCV): 32-bit address space, user land below 2 GB, kernel land from 2 GB up to 4 GB.
Open cmd.exe in Immunity Debugger

Time to take a peek
The stack. Push & pop.

Function prologue
Function calls
Different calling conventions:
  cdecl (C programs, variable number of arguments allowed)
    The *caller* is responsible for cleaning up the stack.
    How many args were there? The callee does not know.
  stdcall (Windows API, fixed number of arguments)
    The function itself cleans up the stack before returning.
    The number of arguments is known in advance:
    _funcName@nrBytes
Argument passing

Argument passing (by ref)

Argument passing (by value)
Go home compiler, you are drunk

    #include <stdio.h>

    int main(void) {
        char x = 0xff;  /* where char is signed, x now holds -1 */

        if (x == 0xff)  /* x is promoted to int: -1 != 255 */
            puts("YES");
        else
            puts("NO");

        return 0;
    }

Sign extensions from hell
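The effect can be replayed without a C compiler. Here is a sketch using `ctypes` fixed-width integers to stand in for `char`. The helper name is an assumption, and whether plain `char` is signed depends on the compiler and platform.

```python
import ctypes

def compare_like_c(byte_value: int, char_is_signed: bool) -> str:
    """Mimic `char x = byte_value; if (x == 0xff) ...` from the slide."""
    char_type = ctypes.c_int8 if char_is_signed else ctypes.c_uint8
    x = char_type(byte_value).value  # value after storing into the char
    # In C, x is promoted to int before the comparison with 0xff (255)
    return "YES" if x == 0xFF else "NO"

print(compare_like_c(0xFF, char_is_signed=True))   # NO  (x is -1)
print(compare_like_c(0xFF, char_is_signed=False))  # YES (x is 255)
```

In the disassembly the signed case shows up as a `movsx` (sign extension), the unsigned one as a `movzx` (zero extension).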
There is NO such thing as static-ONLY reversing

  Combine static (IDA) and dynamic (debugger) analysis.
  The trick is to optimize the information transfer
  between the two.

Best of both worlds
Examples:
  IDA debugging capabilities
  Dynamic Binary Instrumentation (PIN)
     Import the results into IDA to see code coverage

  Trace with .py
     Differential debugging
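Differential debugging boils down to a set difference between two traces: one recorded while the program idles, one recorded while the interesting action happens. A minimal sketch follows, with made-up addresses and an assumed trace format (a plain list of executed addresses).

```python
def new_addresses(baseline_trace, action_trace):
    """Addresses hit only in action_trace, in order of first execution."""
    baseline = set(baseline_trace)
    seen = set()
    result = []
    for address in action_trace:
        if address not in baseline and address not in seen:
            seen.add(address)
            result.append(address)
    return result

# Hypothetical traces: same program, without and with the action of interest
idle = [0x401000, 0x401010, 0x401020, 0x401010]
with_click = [0x401000, 0x401010, 0x402F00, 0x402F10, 0x402F00, 0x401020]

print([hex(a) for a in new_addresses(idle, with_click)])
# ['0x402f00', '0x402f10']
```

The surviving addresses are the candidates worth opening in the disassembler first.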
Not really scary…
Don't get distracted!
Look, it works!
Loooooong function
Don't really feel like doing this manually
F*ck this shit, I'm outta here!
We will solve this soon… intelligently
Take control!
Not elegant but effective…
I want more… finesse…
Python to the rescue!
  Tzzzzzz. Tttzzzz…
  If it's on Twitter it must be true
KeePass stalker
Example: uTorrent ReadFile
Which ReadFile?!

There are several references…
Manually inspecting them all is a tedious job.
Reading from a file
CreateFile(...)
  Returns a handle
ReadFile(handle)
CloseHandle(handle)
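The handle is what ties the three calls together, so it can be used to pick the right ReadFile out of many. The sketch below works on a hypothetical API trace: the event dictionaries, file names, handle values and caller addresses are all invented for illustration, and a real trace would come from a debugger script or hooks.

```python
def reads_for_file(trace, wanted_name):
    """Return the call sites of ReadFile calls that read from wanted_name."""
    open_handles = {}  # handle value -> file name it was opened for
    hits = []
    for event in trace:
        api = event["api"]
        if api == "CreateFile":
            open_handles[event["ret"]] = event["name"]
        elif api == "ReadFile":
            if open_handles.get(event["handle"]) == wanted_name:
                hits.append(event["caller"])
        elif api == "CloseHandle":
            # handle values get reused, so forget closed ones
            open_handles.pop(event["handle"], None)
    return hits

trace = [  # hypothetical events
    {"api": "CreateFile", "name": "settings.dat", "ret": 0x1A4},
    {"api": "ReadFile", "handle": 0x1A4, "caller": 0x40A100},
    {"api": "CloseHandle", "handle": 0x1A4},
    {"api": "CreateFile", "name": "movie.torrent", "ret": 0x1A4},
    {"api": "ReadFile", "handle": 0x1A4, "caller": 0x41B220},
    {"api": "CloseHandle", "handle": 0x1A4},
]
print([hex(c) for c in reads_for_file(trace, "movie.torrent")])  # ['0x41b220']
```

Instead of inspecting every reference to ReadFile, only the call sites returned here need a closer look.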

It's a long shot

And... found!
Binary manipulation

    Binaries can be easily modified
        Patched (on disk, live)
        Functions intercepted (hooking, live)

    Usage:
        Inspection (e.g. tracing)
        Changing the execution flow
        Whatever you can imagine
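Patching on disk is, at its simplest, overwriting a few bytes at a known offset. The following is a sketch that checks the original bytes before replacing them. The helper name and the byte sequence are assumptions for the example; the opcodes used are `test eax, eax` (85 C0), `jnz rel8` (75), and `nop` (90).

```python
def patch(image: bytearray, offset: int, expected: bytes, replacement: bytes):
    """Overwrite `expected` with `replacement` at `offset`, in place."""
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the size of the code")
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at offset, wrong binary or version?")
    image[offset:offset + len(replacement)] = replacement

# Hypothetical check: test eax, eax / jnz +5 / nop
code = bytearray(bytes.fromhex("85c0750590"))
patch(code, 2, bytes.fromhex("7505"), bytes.fromhex("9090"))  # jnz -> nop nop
print(code.hex())  # 85c0909090
```

With the conditional jump replaced by NOPs, execution always falls through, whatever the check returned. Live patching follows the same idea, only the bytes are written into the memory of the running process.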
What you all have been waiting for…
So many questions…

 APIs used to send() and recv() data?
   Sure? Think twice
   Functions I don't see?
 What happens with the data received?
 How does the malware achieve persistence?

Braaaaiiinnnsss… I mean, credeeeennttiiiaaalllssssss…

Are these the ONLY APIs used by this malware?
  Malware samples are deceiving bastards…

Braaaaiiinnnsss… I mean, CPU cycles…

How does the malware achieve persistence?
  I like it here. I think I'm gonna stick around for a while…

Getting into the enemy's mind
Hands on: strings
  Juicy info in two minutes
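A quick way to get that juicy info is to pull every run of printable ASCII out of the binary. A minimal sketch follows; the function name, the minimum length of 4 and the sample bytes (including the URL) are assumptions for illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return all runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]

# Hypothetical bytes standing in for a sample
sample = b"\x00\x01MZ\x90\x00http://example.com/gate.php\x00\xff\xfeRunOnce\x00ab\x00"
print(extract_strings(sample))
# ['http://example.com/gate.php', 'RunOnce']
```

This only finds single-byte ASCII. Windows binaries often store UTF-16 strings too, and packed or encrypted samples hide theirs until run time.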
Hands on: imports
  Very interesting imports…

Hands on: resources
  The .rsrc section is perfect for hiding data

"Crypto" stuff
  Takes less time and is less brain damaging

Hands on: sneaky bastards
  I see what you did there…

There are so many! What do I do?!

Who you gonna call? Sneakbuster!
  Takes less time and is less brain damaging

Hands on: WTFTLS
  NO ME GUSTA
  Call, call, call…

Hands on: running it
  Thanks for the info…
  Here's (another?) candy for you…
Last chance! ;)
Twitter: @m0n0sapiens

The walking 0xDEAD
