www.intertel.co.za
MOTIVATION
• The malware landscape is diverse and constantly evolving
• Large botnets
• Diverse propagation vectors, exploits, C&C
• Capabilities: backdoors, keylogging, rootkits
• Logic bombs, time-bombs
• Diverse targets: desktops, mobile platforms, SCADA systems
(Stuxnet)
• Malware is not about script kiddies anymore; it’s real
business. Recent events indicate that it can be a
powerful weapon in cyber warfare.
• Manual reverse-engineering is close to impossible
• Need automated techniques to extract system logic,
interactions and side-effects, derive intent, and devise mitigating
strategies.
OUTLINE
• Review of the workflow of binary program analysis
• Review of the challenges in binary program analysis:
• Obfuscation Techniques
• Techniques for reverse engineering stripped binaries:
• Systematic deobfuscation
• Examples of obfuscation: Conficker, Hydraq (Google attack), Stuxnet, …
CAPTURING MALWARE
• Honeynets: Capture malware that scans the Internet for vulnerable
targets
• Mining SPAM for attachments
• Mining SPAM for malicious URLs, and capturing drive-by downloads
• AV heuristics
MALWARE BINARY ANALYSIS
Input: an .exe file
• Typically a stripped binary with no debugging information
• In the case of malicious code, it is often obfuscated and packed
• Often has embedded suicide logic and anti-analysis logic
• What does the malware do?
• How does it do it?
• Identify triggers
• What is the purpose of the malware?
• Is this an instance of a known threat or a new malware?
• Who is the author?
• …
Challenges:
• lack of automation
• time-critical analysis
• labor intensive
• requires a human in the loop
DYNAMIC VS STATIC MALWARE ANALYSIS
• Dynamic Analysis
• Techniques that profile actions of binary at runtime
• More popular
• CWSandbox, TTAnalyze, multipath exploration
• Only provides a partial “effects-oriented profile” of malware potential
• Static Analysis
• Can provide complementary insights
• Potential for more comprehensive assessment
MALWARE EVASIONS AND OBFUSCATIONS
• To defeat signature-based detection schemes
• Polymorphism, metamorphism: started appearing in viruses in the 1990s, primarily to defeat AV tools
• To defeat dynamic malware analysis
• Anti-debugging, anti-tracing, anti-memory dumping
• VMM detection, emulator detection
• To defeat static malware analysis
• Encryption (packing)
• API and control-flow obfuscations
• Anti-disassembly
• The main purpose of obfuscation is to slow down the security community
MY PERSONAL PHILOSOPHY
• Push the limits of static analysis as much as possible.
• Rebuild the binary in its original form prior to obfuscation.
• Recover a higher-level description of the malware binary that makes deriving the purpose of the malware attainable: I want to stare at C code rather than assembly code
MALWARE REVERSE ENGINEERING SYSTEM GOALS
• Desiderata for a Static Analysis Framework
• Unpack most contemporary malware
• Handle most if not all packers
• Deobfuscate API references
• Automate identification of capabilities
• Provide feedback on unpacking success
• Simplify and annotate call graphs to illustrate interactions between key logical blocks
• Enable decompilation of assembly code into a higher-level language
• Identify key logical blocks (crypto for instance)
REVERSE ENGINEERING PHASES
• Unpacking phase: the image of a running malware sample is often considered damaged. There is no known OEP, imported APIs are invoked dynamically and the original import table is destroyed, and section names and r/w/e permissions are arbitrary.
• Disassembly phase:
- Identification of code and data segments
- Relies on the unpacker to capture all code and data segments
• Decompilation phase:
- Reconstruction of the code segment into a C-like higher-level representation
- Relies on the disassembler to recognize function boundaries, targets of call sites, imports, and the OEP
• Program understanding phase:
- Relies on the decompiler to produce readable C code by recognizing the compiler, calling conventions, stack-frame manipulation, function prologues and epilogues, and user-defined data structures
PHASE 1: MALWARE UNPACKING
THE EUREKA FRAMEWORK
• Novel unpacking technique based on coarse-grained execution tracing
• Heuristic-based and statistics-based unpacking
• Implements several techniques to handle obfuscated API references
• Multiple metrics to evaluate unpack success
• Annotated call graphs provide a bird’s-eye view of system interaction
THE EUREKA WORKFLOW
[Figure: Eureka workflow. Malware syscalls are traced in a VM; heuristic-based offline analysis of the syscall trace selects a favorable execution point; Eureka’s unpacker turns the packed binary into an unpacked binary; IDA Pro disassembles both into packed and unpacked .ASM; a statistics-based evaluator produces the unpack evaluation.]
Raw unpacked executable:
• Unknown OEP
• No debug information
• Unresolved library calls
• Snapshot of data segment
• Unreachable code
• Loss of structures
COARSE-GRAINED EXECUTION MONITORING
• Generalized unpacking principle
• Execute the binary until it has sufficiently revealed itself
• Dump the process execution image for static analysis
• Monitoring execution progress
• Eureka employs a Windows driver that hooks the SSDT (System Service Dispatch Table)
• Callback invoked on each NTDLL system call
• Filtering based on the PID of the malware process
HEURISTIC-BASED UNPACKING
• How do you determine when to dump?
• Heuristic #1: Dump as late as possible (NtTerminateProcess)
• Heuristic #2: Dump when the program generates errors (NtRaiseHardError)
• Heuristic #3: Dump when the program forks a child process (NtCreateProcess)
• Issues
• Weak adversarial model: too easy to evade
• Doesn’t work well for packed non-malware programs
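The three heuristics above can be sketched as a scan over a recorded syscall trace. This is a minimal illustration, not Eureka's implementation: the trace format (a list of syscall names in execution order) and the function name are assumptions.

```python
# Syscalls named on the slide as dump triggers, with the heuristic they stand for.
DUMP_TRIGGERS = {
    "NtTerminateProcess": "dump as late as possible",
    "NtRaiseHardError": "program generates an error",
    "NtCreateProcess": "program forks a child process",
}

def find_dump_points(trace):
    """Return (index, syscall, reason) for every trigger seen in the trace.

    `trace` is assumed to be a list of syscall names in execution order.
    """
    return [(i, name, DUMP_TRIGGERS[name])
            for i, name in enumerate(trace)
            if name in DUMP_TRIGGERS]

# Hypothetical trace, for illustration only.
trace = ["NtAllocateVirtualMemory", "NtProtectVirtualMemory",
         "NtCreateProcess", "NtTerminateProcess"]
for index, name, reason in find_dump_points(trace):
    print(index, name, reason)
```

A sample that never reaches any of these calls yields no dump point at all, which is one way to see why the adversarial model is weak.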
STATISTICS-BASED UNPACKING
• Observations
• Statistical properties of a packed executable differ from those of an unpacked executable
• As the malware executes, the code-to-data ratio increases
• Complications
• Code and data sections are interleaved in PE executables
• Data directories (import tables) look similar to data but are often found in code sections
• Properties of data sections vary with packers
STATISTICS-BASED UNPACKING (2)
• Our Approach
• Model statistical properties of unpacked code
• Estimating unpacked code
• N-gram analysis to look for frequent instructions
• We use bigrams (2-grams) because x86 opcodes are 1 or 2 bytes
• Extract subroutine code from 9 benign executables
• Frequent patterns: FF 15 (call), FF 75 (push), E8 _ _ _ FF (call), E8 _ _ _ 00 (call)
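A rough sketch of how the four byte patterns above could be counted in a memory buffer. It is an illustration under assumptions, not the evaluator used by Eureka: the function names and the idea of dividing the count by the buffer length are choices made here.

```python
def count_code_markers(buf):
    """Count the byte patterns listed on the slide in a bytes buffer."""
    counts = {"FF 15": 0, "FF 75": 0, "E8 ?? ?? ?? FF": 0, "E8 ?? ?? ?? 00": 0}
    for i in range(len(buf) - 1):
        if buf[i] == 0xFF and buf[i + 1] == 0x15:
            counts["FF 15"] += 1            # indirect call through memory
        elif buf[i] == 0xFF and buf[i + 1] == 0x75:
            counts["FF 75"] += 1            # push of an [ebp+disp8] operand
        elif buf[i] == 0xE8 and i + 4 < len(buf):
            # call rel32: the top displacement byte is FF for short backward
            # calls and 00 for short forward calls
            if buf[i + 4] == 0xFF:
                counts["E8 ?? ?? ?? FF"] += 1
            elif buf[i + 4] == 0x00:
                counts["E8 ?? ?? ?? 00"] += 1
    return counts

def marker_density(buf):
    """Markers per byte; an assumed proxy for how code-like a buffer looks."""
    return sum(count_code_markers(buf).values()) / max(len(buf), 1)

sample = bytes.fromhex("ff1500104000" "e88d010000" "e826ffffff" "ff7508")
print(count_code_markers(sample), marker_density(sample))
```

Comparing the density of successive memory snapshots is one way to watch the code-to-data ratio rise as the sample unpacks itself.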
EVALUATION (ASPACK)
EVALUATION (MOLEBOX)
EVALUATION (ARMADILLO)
SYSTEMATIC APPROACH TO DEOBFUSCATION: UNPACKING
• Automatic Unpacking: involves running the malware and capturing its
memory image.
• Monitoring the execution of the malware is an intrusive process and is often
detected using anti-tracing and anti-debugging techniques embedded in the
malware.
• Our multi-strategy approach consists of minimal monitoring and capturing
the process image at key events:
• ExitProcess
• Byte bigram monitoring: call, push instructions for instance
• Number of seconds elapsed
• Run the malware without monitoring, then suspend its execution and inspect its memory
• In practice, we always manage to get a dump (memory snapshot) of the running process, but it has no OEP and no import table
PHASE 2: DISASSEMBLY
• The disassembler reads the PE data structure in order to:
1. Determine the sections of the file, separate code from data, and identify resource information such as import tables
– The disassembler relies on the PE data structure (which could be corrupt)
– The disassembler treats as code any address referenced from a known code location
2. Translate code segments into assembly language
– The disassembler relies on the hardware instruction set documentation
3. Interpret data according to identified types
– Data referenced by code can be of any type: integer, string, struct, etc.
Integer:
0x0040F45C dword_40F45C dd 0E06D7363h, 1, 2 dup(0) ; DATA XREF: 408C98
String:
0x0040F45C unk_40F45C db 63h ; c ; DATA XREF: sub_408C98
0x0040F45D db 73h ; s
0x0040F45E db 6Dh ; m
0x0040F45F db 0E0h ; a
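The two listings above show the same four bytes at 0x0040F45C typed two ways. A small sketch makes the ambiguity concrete; the helper names are illustrative only.

```python
import struct

# The four bytes shown at 0x0040F45C in the listings above.
raw = bytes([0x63, 0x73, 0x6D, 0xE0])

def as_dword(data):
    """Interpret the first four bytes as a little-endian 32-bit integer."""
    return struct.unpack("<I", data[:4])[0]

def as_chars(data):
    """Interpret bytes as characters, with '.' for non-printable values."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)

print(hex(as_dword(raw)))  # 0xe06d7363, the dword in the first listing
print(as_chars(raw))       # csm.
```

Nothing in the bytes themselves says which reading is right; the type has to come from how the code uses the address.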
IDA PRO DISASSEMBLER
• http://www.hex-rays.com/idapro/
• It supports a variety of executable formats for different processors
and operating systems. It also can be used as a debugger for
Windows PE, Mac OS X, and Linux ELF executables.
• IDA performs a large degree of automatic code analysis, leveraging cross-references between code sections, knowledge of parameters of API calls, limited dataflow analysis, and recognition of standard libraries.
• Hashes of known statically linked libraries are compared to hashes of
identified subroutines in the code
• Provides scripting languages to interact with the system to improve
the analysis.
• Supports plug-ins: the IDA decompiler is the most impressive plug-in.
PE EXECUTION
1. Read the Portable Executable (PE) file data structure and map the file into memory
2. Load import modules
3. Start execution at entry point
4. Runtime unpacking
5. Jump to OEP
PHASE 3: FIXING THE DISASSEMBLED CODE
• Unpacked & disassembled code does not have an OEP.
• Import tables are rebuilt dynamically and there are no static references
to dynamically loaded libraries
• Header information is not reliable
• Data is not typed
PARSING THE PE EXECUTABLE FORMAT
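As a minimal sketch of the parsing step, the following reads the entry point and section table from a PE image using only the documented header layout (DOS header, PE signature, COFF header, optional header, section headers). The function name and returned dictionary shape are assumptions; on a dumped malware image the values should be treated as hints, since header information is not reliable.

```python
import struct

def parse_pe_header(data):
    """Return the entry-point RVA and section table of a PE image (bytes)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    (_machine, n_sections, _time, _symtab, _nsyms,
     opt_size, _flags) = struct.unpack_from("<HHIIIHH", data, coff)
    opt = coff + 20                                       # optional header
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    sections = []
    table = opt + opt_size                                # section headers
    for i in range(n_sections):
        off = table + 40 * i
        name, vsize, vaddr, rsize, rptr = struct.unpack_from("<8sIIII", data, off)
        (chars,) = struct.unpack_from("<I", data, off + 36)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_address": vaddr,
            "virtual_size": vsize,
            "raw_offset": rptr,
            "raw_size": rsize,
            "executable": bool(chars & 0x20000000),       # IMAGE_SCN_MEM_EXECUTE
            "writable": bool(chars & 0x80000000),         # IMAGE_SCN_MEM_WRITE
        })
    return {"entry_rva": entry_rva, "sections": sections}
```

A section that is both writable and executable, or an entry point that falls outside every section, is a common sign that the headers belong to a packer stub rather than to the original program.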
CHALLENGES IN BINARY CODE DISASSEMBLY
• Disassembly is not an exact science: On CISC platforms
with variable-width instructions, or in the presence of self-
modifying code, it is possible for a single program to have
two or more reasonable disassemblies. Determining which
instructions would actually be encountered during a run of
the program reduces to the proven-unsolvable halting
problem.
• Bad disassembly because of variable-length instructions
• Jumps into middle of instructions
• No reachability analysis: Unreachable code can hide data.
EXAMPLES OF DISASSEMBLY PROBLEMS (THE STORM WORM)
Data hidden as code:
ArcadeWorld.exe:0042BDB0 mov eax, 0
ArcadeWorld.exe:0042BDB5 test eax, eax
ArcadeWorld.exe:0042BDB7 jnz short loc_42BDD8; unreachable code
ArcadeWorld.exe:0042BDD8 loc_42BDD8: ; CODE XREF: 0042BDB7
ArcadeWorld.exe:0042BDD8 lea edx, [ebp+401C42h]
ArcadeWorld.exe:0042BDDE lea eax, [ebp+40B220h]
ArcadeWorld.exe:0042BDE4 push eax
ArcadeWorld.exe:0042BDE5 push 8A20h
ArcadeWorld.exe:0042BDEA push edx
ArcadeWorld.exe:0042BDEB call near ptr unk_42BF7D
ArcadeWorld.exe:0042BDF0 pusha
ArcadeWorld.exe:0042BDF1 mov edi, [ebp+40C067h]
ArcadeWorld.exe:0042BDF7 add edi, [ebp+40C03Fh]
ArcadeWorld.exe:0042BDFD lea esi, [ebp+40BFBFh]
0042BDD8 8D 95 42 1C 40 00 8D 85 20 B2 40 00 50 68 20 8A ìòB.@.ìà ¦@.Ph è
0042BDE8 00 00 52 E8 8D 01 00 00 60 8B BD 67 C0 40 00 03 ..RFì...`ï+g+@..
0042BDF8 BD 3F C0 40 00 8D B5 BF BF 40 00 B9 74 00 00 00 +?+@.ì¦++@.¦t...
0042BE08 68 00 20 00 00 57 E8 26 FF FF FF F3 A4 8D 85 AF h. ..WF& =ñìà»
0042BE18 BF 40 00 8B 9D 3F C0 40 00 FF B5 43 C0 40 00 FF +@.ï¥?+@. ¦C+@.
0042BE28 B5 37 C0 40 00 6A 01 50 53 E8 9E 00 00 00 FF B5 ¦7+@.j.PSFP... ¦
0042BE38 6B C0 40 00 FF B5 3F C0 40 00 E8 DE FD FF FF 8B k+@. ¦?+@.F¦² ï
0042BE48 85 18 B2 40 00 85 C0 74 1C FF B5 18 B2 40 00 FF à.¦@.à+t. ¦.¦@.
0042BE58 B5 57 C0 40 00 FF B5 3F C0 40 00 E8 E7 10 00 00 ¦W+@. ¦?+@.Ft...
0042BE68 E8 3F FE FF FF 53 E8 E5 04 00 00 E8 14 00 00 00 F?¦ SFs...F....
0042BE78 5C 64 6C 6C 63 61 63 68 65 5C 74 63 70 69 70 2E \dllcache\tcpip.
0042BE88 73 79 73 00 E8 16 0A 00 00 E8 13 00 00 00 5C 64 sys.F....F....\d
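One way to read the last three rows of the dump above is as a call that jumps over inline data: the E8 at 0042BE73 has a 32-bit displacement of 0x14, so the bytes it skips are never executed. The sketch below decodes that displacement and extracts the skipped bytes. Treating this E8 as a `call rel32` is an interpretation of the dump rather than something the slide states, and the function name is illustrative.

```python
# Rows 0042BE68 to 0042BE97 of the hex dump above.
BASE = 0x0042BE68
CODE = bytes.fromhex(
    "E8 3F FE FF FF 53 E8 E5 04 00 00 E8 14 00 00 00"
    "5C 64 6C 6C 63 61 63 68 65 5C 74 63 70 69 70 2E"
    "73 79 73 00 E8 16 0A 00 00 E8 13 00 00 00 5C 64"
)

def call_over_data(code, base, call_addr):
    """Decode a `call rel32` at call_addr; return (target, skipped bytes)."""
    off = call_addr - base
    if code[off] != 0xE8:
        raise ValueError("not a call rel32 opcode")
    rel = int.from_bytes(code[off + 1:off + 5], "little", signed=True)
    next_insn = call_addr + 5                 # address pushed as return address
    target = next_insn + rel
    skipped = code[off + 5:off + 5 + rel] if rel > 0 else b""
    return target, skipped

target, skipped = call_over_data(CODE, BASE, 0x0042BE73)
print(hex(target), skipped)
```

Because the call pushes the address of the skipped bytes as its return address, the callee can use that address as a pointer to the string, while a linear disassembler tries to decode the string as instructions.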
API RESOLUTION
• User-level malware programs require system calls to perform malicious actions
• Use the Win32 API to access user-level libraries
• Obfuscations impede malware analysis using IDA Pro or OllyDbg
• Packers use non-standard linking and loading of DLLs
• Obfuscated API resolution
STANDARD API RESOLUTION
• Imports in the IAT are identified by IDA by looking at the import table
HANDLING THUNKS
• Identify subroutines with a JMP instruction only
• Treat any calls to these subs as an API call
IsDebuggerPresent
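The thunk rule above can be sketched on a toy representation of disassembled subroutines. The data layout (a dict from subroutine name to a list of `(mnemonic, operand)` pairs) and the subroutine names are hypothetical; a real implementation would work on the disassembler's own database.

```python
def find_thunks(functions):
    """Map each subroutine that consists of a single JMP to its jump target."""
    return {name: body[0][1]
            for name, body in functions.items()
            if len(body) == 1 and body[0][0] == "jmp"}

def resolve_call_targets(call_targets, thunks):
    """Replace calls to thunks by the API the thunk jumps to."""
    return [thunks.get(target, target) for target in call_targets]

# Hypothetical subroutines, for illustration only.
functions = {
    "sub_401000": [("jmp", "IsDebuggerPresent")],
    "sub_401010": [("push", "ebp"), ("mov", "ebp, esp"), ("retn", "")],
}
thunks = find_thunks(functions)
print(resolve_call_targets(["sub_401000", "sub_401010"], thunks))
```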
LEVERAGING STANDARD API ADDRESS LOADING
==================================================
Function Name : ADSICloseDSObject
Address : 0x76e30826
Relative Address : 0x00020826
Ordinal : 142 (0x8e)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICloseSearchHandle
Address : 0x76e3050a
Relative Address : 0x0002050a
Ordinal : 143 (0x8f)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICreateDSObject
Address : 0x76e30447
Relative Address : 0x00020447
Ordinal : 144 (0x90)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
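An export listing like the one above can be turned into a lookup table from absolute address to API name, so that a constant call target found in a dump can be named. The sketch uses the three entries shown; the module base is derived from them (address minus relative address), and the function name is illustrative.

```python
# (name, relative address) pairs taken from the export listing above.
EXPORTS = [
    ("ADSICloseDSObject", 0x00020826),
    ("ADSICloseSearchHandle", 0x0002050A),
    ("ADSICreateDSObject", 0x00020447),
]

# Address minus relative address, from the first entry of the listing.
MODULE_BASE = 0x76E30826 - 0x00020826

def build_address_map(base, exports):
    """Map absolute load addresses to exported function names."""
    return {base + rva: name for name, rva in exports}

address_map = build_address_map(MODULE_BASE, EXPORTS)
print(hex(MODULE_BASE), address_map.get(0x76E3050A))
```

The table is only valid for the load address and DLL version it was built from, which is an assumption that has to hold for the analysis machine.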
USING DATAFLOW ANALYSIS
• Identify register-based indirect calls
[Figure: a def-use chain resolves a register-based indirect call to GetEnvironmentStringW]
HANDLING DYNAMIC POINTER UPDATES
• Identify register-based indirect calls
• dword_41e304 has no static value to look up the API
• A def to dword_41e308 is found
• Look for a probable call to GetProcAddress earlier
[Figure: listing annotated with the use, the def, and the call to GetProcAddress]
STANDARD API ADDRESS LOADING IS NOT ENOUGH
==================================================
Function Name : ADSICloseDSObject
Address : 0x76e30826
Relative Address : 0x00020826
Ordinal : 142 (0x8e)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICloseSearchHandle
Address : 0x76e3050a
Relative Address : 0x0002050a
Ordinal : 143 (0x8f)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
==================================================
Function Name : ADSICreateDSObject
Address : 0x76e30447
Relative Address : 0x00020447
Ordinal : 144 (0x90)
Filename : adsldpc.dll
Full Path : c:\WINDOWS\system32\adsldpc.dll
Type : Exported Function
==================================================
There are many indirect ways to load and call a Windows API:
• access to the list of loaded DLLs
• access to a loaded DLL and use of GetModuleHandle() + offset
• …
CONSEQUENCE OF FAILURE TO IDENTIFY APIS
...
.text:004011A7 push offset unk_40A2DC ; arg 1
.text:004011AC xor ebx, ebx
.text:004011AE call dword ptr unk_40A0E4 ; unk_40A0E4 .data:0040A0E4 00000000
.text:004011B4 mov edi, eax
.text:004011B6 cmp edi, ebx
.text:004011B8 jz short loc_401211
.text:004011BA push esi
.text:004011BB mov esi, dword ptr unk_40A0E8
.text:004011C1 push offset unk_40A2C4 ; arg 2
.text:004011C6 push edi ; arg 1
.text:004011C7 call esi ; unk_40A0E8 .data:0040A0E8 00000000
.text:004011C9 push offset unk_40A2AC
.text:004011CE push edi
.text:004011CF mov dword_433480, eax
...
...
.text:00401132 lea eax, [ebp+var_4]
.text:00401135 push eax
.text:00401136 push ebx
.text:00401137 push 0
.text:00401139 mov [ebp+var_4], esi
.text:0040113C call dword_433480
.text:00401142 test eax, eax
…
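Annotations, in listing order:
• 004011A7: name of a library
• 004011AE: load library call (LoadLibrary)
• 004011C1: name of the library function
• 004011C6: name of the library
• 004011C7: API call to get the address of the loaded library function (GetProcAddress)
• 0040113C: library function call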
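The chain in the listing above (push a name, call GetProcAddress, store eax, call through the stored pointer later) can be followed with a very small forward dataflow pass. This is a sketch under strong assumptions: instructions are simplified to tuples, only `push`, `mov`, and `call` are modelled, the two import slots are taken as already identified, and the string behind `unk_40A2C4` is a made-up placeholder because the slide does not show it.

```python
def resolve_calls(insns, known_pointers, strings):
    """Return (call operand, API name or None) for each call, in order."""
    env = dict(known_pointers)   # location -> API whose address it holds
    stack = []                   # operands pushed since the last call
    resolved = []
    for ins in insns:
        op = ins[0]
        if op == "push":
            stack.append(ins[1])
        elif op == "mov":
            dst, src = ins[1], ins[2]
            if src in env:
                env[dst] = env[src]          # propagate the def
            else:
                env.pop(dst, None)           # dst now holds something unknown
        elif op == "call":
            api = env.get(ins[1])
            resolved.append((ins[1], api))
            env.pop("eax", None)             # return value overwrites eax
            if api == "GetProcAddress" and len(stack) >= 2:
                # stdcall pushes right to left: stack[-2] is lpProcName
                name = strings.get(stack[-2])
                if name is not None:
                    env["eax"] = name
            stack.clear()
    return resolved

# Simplified version of the listing above.
insns = [
    ("push", "unk_40A2DC"), ("call", "unk_40A0E4"), ("mov", "edi", "eax"),
    ("mov", "esi", "unk_40A0E8"),
    ("push", "unk_40A2C4"), ("push", "edi"), ("call", "esi"),
    ("push", "unk_40A2AC"), ("push", "edi"), ("mov", "dword_433480", "eax"),
    ("call", "dword_433480"),
]
known = {"unk_40A0E4": "LoadLibrary", "unk_40A0E8": "GetProcAddress"}
strings = {"unk_40A2C4": "ExampleExportedFunction"}   # hypothetical contents
print(resolve_calls(insns, known, strings))
```

Without this pass the final `call dword_433480` stays an anonymous indirect call, which is the consequence the slide illustrates.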
FAILURE TO PERFORM CONTROL FLOW ANALYSIS
• CreateThread
.text:009A3A4C push eax
.text:009A3A4D xor eax, eax
.text:009A3A4F push eax
.text:009A3A50 push eax
.text:009A3A51 push offset dword_9A3939
.text:009A3A56 push eax
.text:009A3A57 push eax
.text:009A3A58 call [ebx]
.data:009A3939 xxxxxxx
• Starting Services
• Thread synchronization
• Critical sections
• Callback functions
Annotations:
• 009A3A51: location of the start address of a thread
• 009A3A58: call to CreateThread
ADVANCED API RESOLUTION
• There are many ways in which a library or API can be invoked.
• There are many ways an API call can be obfuscated.
• But there is one invariant associated with each API and library: its signature
• i.e., the number of arguments, the types of the arguments, and the type of the return value, if any.
ADVANCED API RESOLUTION: TYPE INFERENCE FOR BINARY PROGRAM ANALYSIS
• Use type inference as a single solution to solve three fundamental problems:
• Identifying API and function calls (call and jump targets)
• Building a precise CFG
• Recovering user-defined types for proper decompilation
• For Windows executable files:
• Integers: object handles, addresses, IP addresses, ports, etc.
• Strings: file names, service names, etc.
• Structures: sockaddr
struct sockaddr_in {
    short sin_family;
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8];
};
TYPE PROPAGATION AND MATCHING
• Type propagation using dataflow analysis
• Propagation of return values and arguments of functions
sub_403649 proc near
...
.text:00403649 arg_0 = dword ptr 8
.text:00403649 arg_4 = dword ptr 0Ch
.text:00403649 arg_8 = dword ptr 10h
.text:00403649 arg_C = dword ptr 14h
...
.text:00403668 xor ebx, ebx
...
.text:00403704 mov esi, ds:dword_40A06C
.text:0040370A push ebx ; arg 7 type(f,7) = type (ebx)
.text:0040370B mov edi, 80h
.text:00403710 push edi ; arg 6 type(f,6) = type (edi)
.text:00403711 push 4 ; arg 5 type(f,5) = union(int,char)
.text:00403713 push ebx ; arg 4 type(f,4) = type (ebx)
.text:00403714 push 2 ; arg 3 type(f,3) = union(int,char)
.text:00403716 push 2 ; arg 2 type(f,2) = union(int,char)
.text:00403718 push [ebp+arg_4] ; arg 1 type(f,1) = type([ebp+arg_4])
.text:0040371B mov [ebp+var_18], ebx
.text:0040371E call esi ; dword_40A06C type(ret(f)) = type(eax)
...
There is only one API that has 7 arguments such that the seventh, fourth, and first can be pointers and all others are not.
HANDLE WINAPI CreateFile(
    __in LPCTSTR lpFileName,
    __in DWORD dwDesiredAccess,
    __in DWORD dwShareMode,
    __in LPSECURITY_ATTRIBUTES lpSecurityAttributes,
    __in DWORD dwCreationDisposition,
    __in DWORD dwFlagsAndAttributes,
    __in HANDLE hTemplateFile
);
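The matching step can be sketched as a filter over a table of API signatures. Everything about the table is an assumption made for illustration: it is hand-written, it holds four entries instead of the whole Win32 API, and it flattens types to "ptr" or "int", with handles counted as pointer-like. The observed constraints mirror the call site above, where the two arguments pushed from ebx (zero) could be either a NULL pointer or an integer.

```python
PTR, INT = "ptr", "int"

# Hypothetical, hand-written signature table (argument kinds in order).
SIGNATURES = {
    "CreateFile":   [PTR, INT, INT, PTR, INT, INT, PTR],
    "SetWindowPos": [PTR, PTR, INT, INT, INT, INT, INT],
    "ReadFile":     [PTR, PTR, INT, PTR, PTR],
    "CloseHandle":  [PTR],
}

def matching_apis(observed, signatures=SIGNATURES):
    """Return APIs whose signature is compatible with the observed call site.

    `observed` lists, per argument, the set of kinds the pushed value may have.
    """
    return sorted(
        name for name, sig in signatures.items()
        if len(sig) == len(observed)
        and all(kind in allowed for kind, allowed in zip(sig, observed))
    )

# Constraints read off the call at 0040371E: arg 1 is a pointer, args 4 and 7
# come from ebx (zero, so pointer or integer), the rest are small constants.
observed = [{PTR}, {INT}, {INT}, {PTR, INT}, {INT}, {INT}, {PTR, INT}]
print(matching_apis(observed))
```

With a full signature database the same filter may return several candidates; propagating the type of the return value and of later uses narrows the set further.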
ADVANTAGES OF TYPE INFERENCE ANALYSIS
• Programmers data structures and types are going to be based
on known data structures and types provided by the libraries
• Identifying API calls and type information help capture better
the semantics of the program execution
• Not restricted to Windows but require knowledge of the
libraries and their documentation
• Can deal with some of the widely used obfuscation techniques
• Import table obfuscation
• Code rewrite: code rewrite preserves the types!
PHASE 3: REBUILDING THE UNPACKED EXECUTABLE
• From a damaged dumped image of a running malware to a PE executable:
• Knowing all APIs allows us to identify the OEP.
• Semantic approach: ExitProcess, CreateMutex, GetCommandLine, GetModuleHandle, etc. are called close to the OEP. About 20 APIs are often called at the beginning of the execution of the code.
• Structural approach: find sources of call graphs in the binary
• Rebuilding an import table with all references to identified APIs
• The disassembly of the reconstructed PE is often of better quality
than the disassembly of the dumped process image
• The new PE code bypasses the unpacking routine embedded in the packed
code
• The new PE contains the original code
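The two OEP heuristics above can be combined in a short sketch: the structural approach finds subroutines that nothing else calls, and the semantic approach ranks them by how many startup APIs they invoke. The call-graph format, the subroutine names, and the ranking rule are assumptions; the API set contains only the four names given on the slide, not the full list of about 20.

```python
# Startup APIs named on the slide (the full list is not given there).
STARTUP_APIS = {"ExitProcess", "CreateMutex", "GetCommandLine", "GetModuleHandle"}

def call_graph_sources(call_graph):
    """Structural approach: subroutines that no other subroutine calls."""
    callees = {callee for targets in call_graph.values() for callee in targets}
    return sorted(f for f in call_graph if f not in callees)

def rank_oep_candidates(call_graph, api_calls):
    """Semantic approach: order the sources by number of startup APIs called."""
    def score(func):
        return len(STARTUP_APIS & api_calls.get(func, set()))
    return sorted(call_graph_sources(call_graph), key=score, reverse=True)

# Hypothetical call graph and per-subroutine API calls.
call_graph = {
    "sub_401000": ["sub_401100", "sub_401200"],
    "sub_401100": ["sub_401200"],
    "sub_401200": [],
    "sub_402000": [],
}
api_calls = {
    "sub_401000": {"GetCommandLine", "GetModuleHandle"},
    "sub_402000": {"Sleep"},
}
print(rank_oep_candidates(call_graph, api_calls))
```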
PHASE 4: DECOMPILATION
• Identifies local variables
• Identifies arguments: registers, stack, or any combination
• Identifies global variables
• Identifies calling conventions
• Identifies common idioms and compiler features
• Eliminates the use of registers as intermediate variables
• Identifies control structures
DECOMPILATION DEPENDS ON PREVIOUS ANALYSIS PHASES
MALWARE OBFUSCATION EFFECT ON DECOMPILATION
• While packing is the most used obfuscation
technique, it is often combined with other
advanced forms of obfuscation that make
decompilation often impossible:
• Call obfuscation in general and API
obfuscation in particular
• Binary Rewrite to create semantically
equivalent code with vastly different
structure
• Chunking or “code spaghettisation”
• …
ANALYSIS PHASES
Ideally: Source code → Compiler → Executable code → Disassembly & analysis → Assembly code → Decompilation → Legitimate C/C++ that a compiler would generate
Reality: Malware → Unpacking → Non-executable code → Disassembly & analysis → Obfuscated assembly → Decompilation → A mess
Obfuscated assembly → Undo obfuscation → Assembly code → Decompilation → Legitimate C/C++
EXAMPLE OF BINARY REWRITE
.text:009B327C OBFUSCATED_VERSION_OF_is_private_subnet proc near
.text:009B327C mov ecx, eax
.text:009B327E and ecx, 0FFFFh
.text:009B3284 cmp ecx, 0A8C0h
.text:009B328A jz loc_9B4264
.text:009B3290 jmp off_9BAAA5
/* .text:009BAAA5 off_9BAAA5 dd offset loc_9AB10C */
.text:009B3290 OBFUSCATED_VERSION_OF_is_private_subnet endp
.
.text:009B4264 loc_9B4264:
.text:009B4264
.text:009B4264 mov eax, 1
.text:009B4269 retn
.
.text:009AB10C cmp al, 0Ah
.text:009AB10E jz loc_9B4264
.text:009AB114 jmp off_9BA137
/* .text:009BA137 off_9BA137 dd offset loc_9B2CE4 */
.
.text:009B2CE4 and eax, 0F0FFh
.text:009B2CE9 cmp eax, 10ACh
.text:009B2CEE jz loc_9B4264
.text:009B2CF4 jmp off_9BA3CC
/* .text:009BA3CC off_9BA3CC dd offset loc_9ABA24 */
.
.text:009ABA24 xor eax, eax
.text:009ABA26 retn
.text:009A1311 is_private_subnet proc near
.text:009A1311 mov ecx, eax
.text:009A1313 and ecx, 0FFFFh
.text:009A1319 cmp ecx, 0A8C0h
.text:009A131F jz loc_9A1340
.text:009A1325 cmp al, 0Ah
.text:009A1327 jz loc_9A1340
.text:009A132D and eax, 0F0FFh
.text:009A1332 cmp eax, 10ACh
.text:009A1337 jz loc_9A1340
.text:009A133D xor eax, eax
.text:009A133F retn
.text:009A1340
.text:009A1340 loc_9A1340:
.text:009A1340 mov eax, 1
.text:009A1345 retn
.text:009A1345 is_private_subnet endp
bool __usercall is_private_subnet (unsigned __int16 a1) {
    return a1 == 43200 || a1 == 10 || (a1 & 0xF0FF) == 4268;
}
int __usercall OBFUSCATED_VERSION_OF_is_private_subnet<eax>(unsigned __int16 a1 <ax>)
{
    int result; // eax@2
    if ( a1 == 43200 ) result = 1;
    else result = off_9BAAA5();
    return result;
}
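To see what the three comparisons in the unobfuscated assembly test, the sketch below replays them on dotted-quad addresses. It assumes that eax holds an IPv4 address in network byte order that was loaded as a little-endian 32-bit value, so the first octet ends up in al; under that assumption 0A8C0h is 192.168, 0Ah is 10, and 10ACh under the mask 0F0FFh covers 172.16 through 172.31. The Python function mirrors the assembly, not the decompiler output.

```python
def is_private_subnet(ip):
    """Replay the three checks of the is_private_subnet listing on an IPv4 string."""
    raw = bytes(int(part) for part in ip.split("."))   # network byte order
    eax = int.from_bytes(raw, "little")                # as x86 would load it
    if eax & 0xFFFF == 0xA8C0:      # cmp ecx, 0A8C0h  -> 192.168.x.x
        return True
    if eax & 0xFF == 0x0A:          # cmp al, 0Ah      -> 10.x.x.x
        return True
    if eax & 0xF0FF == 0x10AC:      # cmp eax, 10ACh   -> 172.16.x.x to 172.31.x.x
        return True
    return False

for addr in ["192.168.1.1", "10.1.2.3", "172.31.0.9", "172.32.0.1", "8.8.8.8"]:
    print(addr, is_private_subnet(addr))
```

The obfuscated version performs exactly these checks, but each one sits in its own chunk, so the decompiler only recovers the first.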
SYSTEMATIC APPROACH TO CODE DEOBFUSCATION: BINARY REWRITING
• Dechunking: The control flow of Conficker's P2P module has been significantly obfuscated to hinder its disassembly and decompilation. Specifically, the contents of code blocks from each subroutine have been extracted and relocated throughout different portions of the executable. These blocks (or chunks) are then referenced through unconditional and conditional jump instructions. In effect, the logical control flow of the P2P module has been obscured (spaghetti code) to a degree that the module cannot be decompiled into coherent C-like code, which typically drives more in-depth and accurate code interpretation. The fix is to move all blocks of a subroutine back into one contiguous memory block.
• Normalize x86 instructions: a push followed by a pop is a mov
• Normalize calling conventions: cdecl, fastcall, or stdcall instead of user-defined ones
CONFICKER AND HYDRAQ DECHUNKING
• Identify all chunks in a function and rewrite the function
• Applied to all Conficker C P2P Protocol subroutines
• Unlike the Conficker P2P logic, Hydraq did not exhibit the same
level of obfuscation. It did, however, share some obfuscation
features with Conficker. The functions of the Hydraq binary have
been subjected to chunking, which renders decompilation
difficult. We applied our transformations to automatically
generate the C-like code for each subroutine and build a
complete CFG of the binary. The IDA disassembler identified 185
subroutines in the binary prior to our analysis. After running the
dechunking transformation, only 141 subroutines remained and
were decompiled.
PURPOSE OF CODE OBFUSCATION
• While packing is often used to reduce the size of
binaries and to create polymorphic malware
samples, the more advanced obfuscation
techniques are designed to slow down reverse
engineering efforts and to prevent:
• the identification of API calls: identify the basic building blocks of the malware
• the control-flow reconstruction of the malware: follow and reconstruct the logic flow
• static analysis: determine the full functionality, triggers, hidden logic, time bombs, etc.
• timely reverse engineering and mitigation of the threat
WHY CODE OBFUSCATION IS NOT EASY
• Malware authors can design binary code that is extremely difficult to analyze. With advanced knowledge of programming languages, it is possible to create such code.
• Malware authors do not always feel the need to obfuscate their code. They can easily defeat signature-based detection and overwhelm analysts and tools with large numbers of samples.
• Malware code should be able to run in a reliable manner. Obfuscation should not compromise this important requirement and should maintain the reliability of the initial code. This requires a proof or guarantee of some sort.
• Malware deobfuscation is therefore more attainable than you might think. Systematic obfuscation informs systematic deobfuscation.
OUR APPROACH
• Because obfuscation is introduced in a rather systematic way, there is hope that it can be dealt with in an automated way.
• Systematically identifying an obfuscation step and undoing its effect.
• Focus on generic approaches as opposed to packer/obfuscator
specifics
• Focus on metrics that allow us to assess the effect and success of our
deobfuscation strategies
EXAMPLE: STATIC ANALYSIS OF CONFICKER
• Conficker appeared on November 20th, 2008
• Infected millions of machines worldwide
• Millions of machines still infected despite extensive news coverage of the threat
• Four versions have been released: A, B, B++, and C
• It is a sophisticated piece of malicious code created by professionals who
have extensive knowledge about networking, cryptography, system and
network programming, and security
• Defeated the security community's efforts to stop its progression by using strong crypto, code obfuscation, aggressive propagation strategies, and constant monitoring of the security community's actions
• Dynamic analysis provided a limited understanding of the threat:
• Identification of what appears to be a P2P protocol
• Identification of ports opened by the malware
• Deobfuscation and static analysis were the only techniques that were able
to uncover the full capability of the malware.
EXAMPLE: STATIC ANALYSIS OF CONFICKER (2)
Static Analysis of Conficker Code:
– Domain generation algorithm: provided a list of daily domains to be blocked
– Quarter of the Internet scanned: Understand what part of the Internet was
targeted for scanning and what infections were due to USB ports and mobile
devices
– List of disabled security products: detection
– Ukrainian keyboard avoidance: Geo-location database poisoning
– Use of MD6 and related crypto algorithms: Attribution
– DNS APIs patching to disable list of websites (including SRI!): detection
– Distribute a number of modified versions of the binary
– TCP and UDP ports based on the IP address of the infected machine:
detection
DEOBFUSCATION OF THE CONFICKER C P2P
• Heavily obfuscated protocol code
• 88 APIs obfuscated
• Use of chunking led to poor decompilation
• Benefits of the deobfuscation
– P2P Protocol description: protocol understanding and P2P
structure
– Peer selection algorithm: proved the peer poisoning approach
useless
– Possibility to hot patch code without DGA updates: proved C&C
domains obsolete
The P2P protocol was not just a mechanism for distributing PE executable
files but also digitally signed sets of x86 instructions that are executed in a
separate thread and take as argument the IP address of the sender. This
would provide a hot patch mechanism for all data manipulated by Conficker:
list of peers, encryption/decryption keys, the Conficker code itself, etc.
STUXNET: KEEPING IT “RELATIVELY” SIMPLE
• Stuxnet does not use advanced binary obfuscation techniques.
• The analysis of the code is challenging nevertheless
• Stuxnet Code Characteristics:
• Use of C++
• Use of C++ exception handling
• Use of C++ classes
• Use of simple data encoding (encryption)
• Use of C structures for all data passed to the main subroutines:
• Over 40 user-defined structures
• Not recognized by disassemblers and decompilers
PHASE 5: PROGRAM UNDERSTANDING
• Need to identify higher-level concepts from the deobfuscated code
• Need to interpret the code into a higher-level malware objective
• Need to identify particular features, such as crypto:
• Functions that use crypto-related opcodes, loops, etc.
• Known constants in crypto algorithms
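The known-constants approach can be sketched as a byte search for well-known initial-state words. The table below holds only a handful of constants from MD5, SHA-1, and SHA-256 and is an illustration, not a complete signature set; the function name and the little-endian encoding (as the constants would appear as x86 immediates or data) are assumptions.

```python
import struct

# A few well-known 32-bit constants; a real table would be much larger.
KNOWN_CONSTANTS = {
    0x67452301: "MD5/SHA-1 initial state",
    0xEFCDAB89: "MD5/SHA-1 initial state",
    0x98BADCFE: "MD5/SHA-1 initial state",
    0x10325476: "MD5/SHA-1 initial state",
    0xC3D2E1F0: "SHA-1 initial state",
    0x6A09E667: "SHA-256 initial state",
}

def find_crypto_constants(buf):
    """Return sorted (offset, constant, label) for each constant found in buf."""
    hits = []
    for value, label in KNOWN_CONSTANTS.items():
        needle = struct.pack("<I", value)      # little-endian, as stored on x86
        pos = buf.find(needle)
        while pos != -1:
            hits.append((pos, hex(value), label))
            pos = buf.find(needle, pos + 1)
    return sorted(hits)

sample = (b"\x90" * 3 + struct.pack("<I", 0x67452301)
          + b"\x00" + struct.pack("<I", 0xC3D2E1F0))
print(find_crypto_constants(sample))
```

A constant search only finds known algorithms; unknown or modified crypto has to be caught by the opcode and loop heuristics mentioned above.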
FINDING KNOWN AND UNKNOWN CRYPTO
CONCLUSIONS
• It is always desirable to recover from the malware a
description that is as close as possible to the original
code produced by the authors.
• It is often possible to do that in practice
• It is often the only way to really determine the full
capability of the malware
• The benefits are important when it comes to high-
profile targets
• Easily integrated into common analysis tools: disassemblers (IDA), decompilers.
 
Application Threat Modeling
Application Threat ModelingApplication Threat Modeling
Application Threat Modeling
 
Securing Remote Access
Securing Remote AccessSecuring Remote Access
Securing Remote Access
 
Malware analysis, threat intelligence and reverse engineering
Malware analysis, threat intelligence and reverse engineeringMalware analysis, threat intelligence and reverse engineering
Malware analysis, threat intelligence and reverse engineering
 
Metamorphic Malware Analysis and Detection
Metamorphic Malware Analysis and DetectionMetamorphic Malware Analysis and Detection
Metamorphic Malware Analysis and Detection
 
OWASP Top 10 2021 What's New
OWASP Top 10 2021 What's NewOWASP Top 10 2021 What's New
OWASP Top 10 2021 What's New
 
Threat Modeling Everything
Threat Modeling EverythingThreat Modeling Everything
Threat Modeling Everything
 
Bug bounty
Bug bountyBug bounty
Bug bounty
 
Basic Malware Analysis
Basic Malware AnalysisBasic Malware Analysis
Basic Malware Analysis
 
Threat hunting 101 by Sandeep Singh
Threat hunting 101 by Sandeep SinghThreat hunting 101 by Sandeep Singh
Threat hunting 101 by Sandeep Singh
 
Practical Malware Analysis Ch 14: Malware-Focused Network Signatures
Practical Malware Analysis Ch 14: Malware-Focused Network SignaturesPractical Malware Analysis Ch 14: Malware-Focused Network Signatures
Practical Malware Analysis Ch 14: Malware-Focused Network Signatures
 
Introduction To Information Security
Introduction To Information SecurityIntroduction To Information Security
Introduction To Information Security
 
HTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versionsHTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versions
 
Secure coding practices
Secure coding practicesSecure coding practices
Secure coding practices
 
Malware Analysis 101 - N00b to Ninja in 60 Minutes at BSidesLV on August 5, ...
Malware Analysis 101 -  N00b to Ninja in 60 Minutes at BSidesLV on August 5, ...Malware Analysis 101 -  N00b to Ninja in 60 Minutes at BSidesLV on August 5, ...
Malware Analysis 101 - N00b to Ninja in 60 Minutes at BSidesLV on August 5, ...
 
Analysing Ransomware
Analysing RansomwareAnalysing Ransomware
Analysing Ransomware
 

Reverse Engineering Malware - A Practical Guide

  • 2. MOTIVATION • Malware landscape is diverse and constantly evolving • Large botnets • Diverse propagation vectors, exploits, C&C • Capabilities – backdoor, keylogging, rootkits • Logic bombs, time-bombs • Diverse targets: desktops, mobile platforms, SCADA systems (Stuxnet) • Malware is not about script-kiddies anymore; it’s real business. Recent events indicate that it can be a powerful weapon in cyber warfare. • Manual reverse-engineering is close to impossible • Need automated techniques to extract system logic, interactions and side-effects, derive intent, and devise mitigating strategies.
  • 3. OUTLINE • Review of the workflow of binary program analysis • Review of the challenges in binary program analysis: • Obfuscation techniques • Techniques for reverse engineering stripped binaries: • Systematic deobfuscation • Examples of obfuscation: Conficker, Hydraq (Google attack), Stuxnet, …
  • 4. CAPTURING MALWARE • Honeynets: Capture malware that scans the Internet for vulnerable targets • Mining SPAM for attachments • Mining SPAM for malicious URLs, and capturing drive-by downloads • AV heuristics
  • 5. MALWARE BINARY ANALYSIS • Input: a .exe, typically a stripped binary with no debugging information. In the case of malicious code, it is often obfuscated and packed, and often has embedded suicide logic and anti-analysis logic • What does the malware do • How does it do it • identify triggers • What is the purpose of the malware • is this an instance of a known threat or a new malware • who is the author • … Challenges: • lack of automation • time-critical analysis • labor intensive • requires a human in the loop
  • 6. DYNAMIC VS STATIC MALWARE ANALYSIS • Dynamic Analysis • Techniques that profile actions of binary at runtime • More popular • CWSandbox, TTAnalyze, multipath exploration • Only provides partial “effects-oriented profile” of malware potential • Static Analysis • Can provide complementary insights • Potential for more comprehensive assessment
  • 7. MALWARE EVASIONS AND OBFUSCATIONS • To defeat signature-based detection schemes • Polymorphism, metamorphism: started appearing in viruses in the 1990s, primarily to defeat AV tools • To defeat dynamic malware analysis • Anti-debugging, anti-tracing, anti-memory dumping • VMM detection, emulator detection • To defeat static malware analysis • Encryption (packing) • API and control-flow obfuscations • Anti-disassembly • The main purpose of obfuscation is to slow down the security community
  • 8. MY PERSONAL PHILOSOPHY • Push the limits of static analysis as much as possible. • Rebuild the binary in its original form prior to obfuscation. • Recover a higher-level description of the malware binary that makes deriving the purpose of the malware attainable: I want to stare at C code as opposed to staring at assembly code
  • 9. MALWARE REVERSE ENGINEERING SYSTEM GOALS • Desiderata for a Static Analysis Framework • Unpack most contemporary malware • Handle most if not all packers • Deobfuscate API references • Automate identification of capabilities • Provide feedback on unpacking success • Simplify and annotate call graphs to illustrate interactions between key logical blocks • Enable decompilation of assembly code into a higher-level language • Identify key logical blocks (crypto for instance)
  • 10. REVERSE ENGINEERING PHASES • Unpacking phase: the image of a running malware sample is often considered damaged: no known OEP; imported APIs are invoked dynamically and the original import table is destroyed; arbitrary section names and r/w/e permissions. • Disassembly phase: - Identification of code and data segments - Relies on the unpacker to capture all code and data segments. • Decompilation phase: - Reconstruction of the code segment into a C-like higher-level representation - Relies on the disassembler to recognize function boundaries, targets of call sites, imports, and OEP. • Program understanding phase: - Relies on the decompiler to produce readable C code, by recognizing the compiler, calling conventions, stack frame manipulation, function prologs and epilogs, and user-defined data structures
  • 11. PHASE 1: MALWARE UNPACKING
  • 12. THE EUREKA FRAMEWORK • Novel unpacking technique based on coarse-grained execution tracing • Heuristic-based and statistics-based unpacking • Implements several techniques to handle obfuscated API references • Multiple metrics to evaluate unpack success • Annotated call graphs provide bird’s eye view of system interaction
  • 13. THE EUREKA WORKFLOW • [Workflow diagram: packed binary → trace malware syscalls in VM → syscall trace → heuristic-based offline analysis → favorable execution point → Eureka’s unpacker → unpacked binary; packed and unpacked binaries → IDA Pro disassembly → packed .ASM and unpacked .ASM → statistics-based evaluator → unpack evaluation] • Raw unpacked executable: • Unknown OEP • No debug information • Unresolved library calls • Snapshot of data segment • Unreachable code • Loss of structures
  • 14. COARSE-GRAINED EXECUTION MONITORING • Generalized unpacking principle • Execute binary till it has sufficiently revealed itself • Dump the process execution image for static analysis • Monitoring execution progress • Eureka employs a Windows driver that hooks the SSDT (System Service Dispatch Table) • Callback invoked on each NTDLL system call • Filtering based on malware process pid
  • 15. HEURISTIC-BASED UNPACKING • How do you determine when to dump? • Heuristic #1: Dump as late as possible. NtTerminateProcess • Heuristic #2: Dump when your program generates errors. NtRaiseHardError • Heuristic #3: Dump when program forks a child process. NtCreateProcess • Issues • Weak adversarial model, too simple to evade… • Doesn’t work well for packed non-malware programs
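
The three heuristics above can be sketched as a scan over a recorded syscall trace. This is only an illustration, not Eureka's implementation: the trace is assumed to be a plain list of syscall names, and `find_dump_point` and `DUMP_TRIGGERS` are hypothetical names introduced here.

```python
# Hypothetical sketch: pick the dump point from a recorded syscall trace.
# The trigger names come from the slide; the trace format is an assumption.
DUMP_TRIGGERS = ("NtTerminateProcess", "NtRaiseHardError", "NtCreateProcess")


def find_dump_point(trace):
    """Return (index, syscall) of the first trigger in the trace, or None."""
    for index, syscall in enumerate(trace):
        if syscall in DUMP_TRIGGERS:
            return index, syscall
    return None


# Made-up trace for illustration only.
demo_trace = ["NtOpenFile", "NtAllocateVirtualMemory",
              "NtCreateProcess", "NtTerminateProcess"]
dump_point = find_dump_point(demo_trace)
```

A sample that never terminates, raises an error, or forks yields `None`, which is one way to see why the slide calls the adversarial model weak.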
  • 16. STATISTICS-BASED UNPACKING • Observations • Statistical properties of a packed executable differ from those of an unpacked executable • As malware executes, the code-to-data ratio increases • Complications • Code and data sections are interleaved in PE executables • Data directories (import tables) look similar to data but are often found in code sections • Properties of data sections vary with packers
  • 17. STATISTICS-BASED UNPACKING (2) • Our Approach • Model statistical properties of unpacked code • Estimating unpacked code • N-gram analysis to look for frequent instructions • We use bi-grams (2-grams) because x86 opcodes are 1 or 2 bytes • Extract subroutine code from 9 benign executables • FF 15 (call), FF 75 (push), E8 _ _ _ ff (call), E8 _ _ _ 00 (call)
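
A minimal sketch of the byte-pattern counting described above, assuming the memory image is available as a `bytes` object. The function names and the density measure are illustrative assumptions, not Eureka's actual metric; only the four byte patterns come from the slide.

```python
def count_code_markers(buf: bytes) -> dict:
    """Count the byte patterns the slide lists as typical of unpacked code."""
    counts = {"FF 15": 0, "FF 75": 0, "E8 rel32": 0}
    # Bigrams: FF 15 (indirect call) and FF 75 (push [ebp+disp8]).
    for i in range(len(buf) - 1):
        if buf[i] == 0xFF and buf[i + 1] == 0x15:
            counts["FF 15"] += 1
        elif buf[i] == 0xFF and buf[i + 1] == 0x75:
            counts["FF 75"] += 1
    # E8 _ _ _ 00 / E8 _ _ _ FF: call whose 32-bit displacement has a
    # high byte of 00 or FF, i.e. a nearby forward or backward target.
    for i in range(len(buf) - 4):
        if buf[i] == 0xE8 and buf[i + 4] in (0x00, 0xFF):
            counts["E8 rel32"] += 1
    return counts


def marker_density(buf: bytes) -> float:
    """Markers per byte; an assumed, simplified stand-in for a code estimate."""
    return sum(count_code_markers(buf).values()) / len(buf) if buf else 0.0


# Hand-built sample: call [401000h]; push [ebp+8]; call +10h; call -10h; nop
sample = bytes.fromhex("ff1500104000" "ff7508" "e810000000" "e8f0ffffff" "90")
sample_counts = count_code_markers(sample)
```

Comparing this density before and after execution is one way to picture the observation that the code-to-data ratio rises as the sample unpacks itself.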
  • 21. SYSTEMATIC APPROACH TO DEOBFUSCATION: UNPACKING • Automatic Unpacking: involves running the malware and capturing its memory image. • Monitoring the execution of the malware is an intrusive process and is often detected using anti-tracing and anti-debugging techniques embedded in the malware. • Our multi-strategy approach consists of minimal monitoring and capturing the process image at key events: • ExitProcess • Byte bigram monitoring: call, push instructions for instance • Number of seconds elapsed • Run the malware without monitoring and suspend its execution and perform memory inspection • In practice, we always manage to get a dump (memory snapshot) of the running process: no OEP and no Import table
  • 22. PHASE 2: DISASSEMBLY • The disassembler reads the PE data structure in order to: 1. Determine the different sections of the file, separate code from data, and identify resource information such as import tables – The disassembler relies on the PE data structure (could be corrupt) – The disassembler translates into code any referenced address from a known code location 2. Translate code segments into assembly language – The disassembler relies on the hardware instruction set documentation 3. Interpret data according to identified types – Data referenced by code can be of any type: integer, string, struct, etc.
    Integer:
      0x0040F45C dword_40F45C dd 0E06D7363h, 1, 2 dup(0) ; DATA XREF: 408C98
    String:
      0x0040F45C unk_40F45C db 63h ; c ; DATA XREF: sub_408C98
      0x0040F45D db 73h ; s
      0x0040F45E db 6Dh ; m
      0x0040F45F db 0E0h ; a
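
The integer and string listings above are two readings of the same four bytes at 0x0040F45C. A short sketch of that ambiguity, using only the standard `struct` module (variable names are illustrative):

```python
import struct

# The four bytes shown in the slide at 0x0040F45C..0x0040F45F.
raw = bytes([0x63, 0x73, 0x6D, 0xE0])

# Reading 1: one little-endian 32-bit integer (the "dd" view).
as_dword = struct.unpack("<I", raw)[0]

# Reading 2: individual bytes (the "db" view); the first three are ASCII.
as_text = raw[:3].decode("ascii")
```

Nothing in the bytes themselves says which reading is intended; the disassembler has to infer the type from how the code uses the address.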
  • 23. IDA PRO DISASSEMBLER • http://www.hex-rays.com/idapro/ • It supports a variety of executable formats for different processors and operating systems. It can also be used as a debugger for Windows PE, Mac OS X, and Linux ELF executables. • IDA performs extensive automatic code analysis, leveraging cross-references between code sections, knowledge of parameters of API calls, limited dataflow analysis, and recognition of standard libraries. • Hashes of known statically linked libraries are compared to hashes of identified subroutines in the code • Provides scripting languages to interact with the system to improve the analysis. • Supports plug-ins: the IDA decompiler is the most impressive plug-in.
  • 24. PE EXECUTION 1. Read the Portable Executable (PE) file data structure and map the file into memory 2. Load import modules 3. Start execution at entry point 4. Runtime unpacking 5. Jump to OEP
  • 25. PHASE 3: FIXING THE DISASSEMBLED CODE • Unpacked & disassembled code does not have an OEP. • Import tables are rebuilt dynamically and there are no static references to dynamically loaded libraries • Header information is not reliable • Data is not typed
  • 26. PARSING THE PE EXECUTABLE FORMAT
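
As a rough illustration of what parsing the PE format involves, the sketch below reads a few header fields with the standard `struct` module. The field offsets follow the published PE/COFF layout; `parse_pe_header` and `build_demo_image` are hypothetical helpers, and the demo image is a synthetic, mostly zero-filled buffer rather than a real executable.

```python
import struct


def parse_pe_header(image: bytes) -> dict:
    """Read a few fields from the DOS, COFF and optional headers."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header follows the 4-byte signature.
    machine, num_sections = struct.unpack_from("<HH", image, e_lfanew + 4)
    # Optional header starts after the 20-byte COFF header;
    # AddressOfEntryPoint sits 16 bytes into it.
    optional = e_lfanew + 24
    (entry_rva,) = struct.unpack_from("<I", image, optional + 16)
    return {"machine": machine, "sections": num_sections, "entry_rva": entry_rva}


def build_demo_image() -> bytes:
    """Synthetic header for demonstration; values are made up."""
    image = bytearray(0x100)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)          # e_lfanew
    image[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<HH", image, 0x84, 0x014C, 3)    # i386, 3 sections
    struct.pack_into("<H", image, 0x80 + 24, 0x010B)   # PE32 magic
    struct.pack_into("<I", image, 0x80 + 24 + 16, 0x1000)  # entry point RVA
    return bytes(image)


header = parse_pe_header(build_demo_image())
```

On a memory dump of unpacked malware, the entry point read this way is typically the packer's stub rather than the OEP, which is why the preceding slide treats header information as unreliable.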
  • 27. CHALLENGES IN BINARY CODE DISASSEMBLY • Disassembly is not an exact science: On CISC platforms with variable-width instructions, or in the presence of self-modifying code, it is possible for a single program to have two or more reasonable disassemblies. Determining which instructions would actually be encountered during a run of the program reduces to the proven-unsolvable halting problem. • Bad disassembly because of variable-length instructions • Jumps into the middle of instructions • No reachability analysis: Unreachable code can hide data.
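
The point about a single byte sequence having more than one reasonable disassembly can be shown with a toy decoder. This is a deliberately tiny sketch covering five x86 opcodes, not a real disassembler; the byte sequence is hand-built, and `OPCODES` and `linear_sweep` are names invented for the example.

```python
# opcode byte -> (mnemonic, total instruction length in bytes)
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xEB: ("jmp short", 2),       # EB rel8
    0xB8: ("mov eax, imm32", 5),  # B8 imm32
    0xE8: ("call", 5),            # E8 rel32
}


def linear_sweep(code: bytes, start: int = 0) -> list:
    """Decode instructions back to back from `start`, ignoring control flow."""
    listing = []
    pos = start
    while pos < len(code):
        name, length = OPCODES.get(code[pos], ("db", 1))
        if pos + length > len(code):
            name, length = "db", 1  # truncated instruction: emit a raw byte
        listing.append((pos, name))
        pos += length
    return listing


# jmp short +1 skips the junk byte E8 and lands on "mov eax, 2Ah; ret".
CODE = bytes([0xEB, 0x01, 0xE8, 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])

from_start = linear_sweep(CODE)             # what a naive sweep reports
from_target = linear_sweep(CODE, start=3)   # what actually executes after the jmp
```

The sweep from offset 0 swallows the real `mov` into a bogus `call`, while decoding from the jump target at offset 3 recovers it; both listings are locally valid, which is the ambiguity the slide describes.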
  • 28. EXAMPLES OF DISASSEMBLY PROBLEMS (THE STORM WORM) Data hidden as code: ArcadeWorld.exe:0042BDB0 mov eax, 0 ArcadeWorld.exe:0042BDB5 test eax, eax ArcadeWorld.exe:0042BDB7 jnz short loc_42BDD8; unreachable code ArcadeWorld.exe:0042BDD8 loc_42BDD8: ; CODE XREF: 0042BDB7 ArcadeWorld.exe:0042BDD8 lea edx, [ebp+401C42h] ArcadeWorld.exe:0042BDDE lea eax, [ebp+40B220h] ArcadeWorld.exe:0042BDE4 push eax ArcadeWorld.exe:0042BDE5 push 8A20h ArcadeWorld.exe:0042BDEA push edx ArcadeWorld.exe:0042BDEB call near ptr unk_42BF7D ArcadeWorld.exe:0042BDF0 pusha ArcadeWorld.exe:0042BDF1 mov edi, [ebp+40C067h] ArcadeWorld.exe:0042BDF7 add edi, [ebp+40C03Fh] ArcadeWorld.exe:0042BDFD lea esi, [ebp+40BFBFh] 0042BDD8 8D 95 42 1C 40 00 8D 85 20 B2 40 00 50 68 20 8A ìòB.@.ìà ¦@.Ph è 0042BDE8 00 00 52 E8 8D 01 00 00 60 8B BD 67 C0 40 00 03 ..RFì...`ï+g+@.. 0042BDF8 BD 3F C0 40 00 8D B5 BF BF 40 00 B9 74 00 00 00 +?+@.ì¦++@.¦t... 0042BE08 68 00 20 00 00 57 E8 26 FF FF FF F3 A4 8D 85 AF h. ..WF& =ñìà» 0042BE18 BF 40 00 8B 9D 3F C0 40 00 FF B5 43 C0 40 00 FF +@.ï¥?+@. ¦C+@. 0042BE28 B5 37 C0 40 00 6A 01 50 53 E8 9E 00 00 00 FF B5 ¦7+@.j.PSFP... ¦ 0042BE38 6B C0 40 00 FF B5 3F C0 40 00 E8 DE FD FF FF 8B k+@. ¦?+@.F¦² ï 0042BE48 85 18 B2 40 00 85 C0 74 1C FF B5 18 B2 40 00 FF à.¦@.à+t. ¦.¦@. 0042BE58 B5 57 C0 40 00 FF B5 3F C0 40 00 E8 E7 10 00 00 ¦W+@. ¦?+@.Ft... 0042BE68 E8 3F FE FF FF 53 E8 E5 04 00 00 E8 14 00 00 00 F?¦ SFs...F.... 0042BE78 5C 64 6C 6C 63 61 63 68 65 5C 74 63 70 69 70 2E dllcachetcpip. 0042BE88 73 79 73 00 E8 16 0A 00 00 E8 13 00 00 00 5C 64 sys.F....F....d www.intertel.co.za
  • 29. API RESOLUTION • User-level malware programs require system calls to perform malicious actions • Use Win32 API to access user level libraries • Obfuscations impede malware analysis using IDA Pro or OllyDbg • Packers use non-standard linking and loading of DLLs • Obfuscated API resolution
  • 30. STANDARD API RESOLUTION Imports in the IAT are identified by IDA by looking at the Import Table
  • 31. HANDLING THUNKS • Identify subroutines with a JMP instruction only • Treat any calls to these subs as an API call IsDebuggerPresent
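A minimal Python sketch of the thunk rule follows. It assumes 32-bit code where a thunk is exactly one `jmp dword ptr [addr]` instruction (encoded `FF 25` plus a 4-byte address) and that a map from IAT slot address to API name is already available. The addresses in the usage example are hypothetical.

```python
import struct

def thunk_slot(body):
    """Return the pointer address if 'body' is exactly 'jmp dword ptr [addr]'."""
    if len(body) == 6 and body[0] == 0xFF and body[1] == 0x25:
        return struct.unpack_from("<I", body, 2)[0]
    return None

def resolve_thunks(subroutines, iat):
    """subroutines: address -> raw bytes; iat: slot address -> API name.
    Returns address -> API name for every subroutine that is a pure thunk."""
    named = {}
    for addr, body in subroutines.items():
        slot = thunk_slot(body)
        if slot is not None and slot in iat:
            named[addr] = iat[slot]
    return named

if __name__ == "__main__":
    subs = {
        0x401000: b"\xFF\x25" + struct.pack("<I", 0x40A010),  # thunk
        0x401010: b"\x55\x8B\xEC\xC3",                        # ordinary code
    }
    print(resolve_thunks(subs, {0x40A010: "IsDebuggerPresent"}))
```

Any `call 0x401000` can then be displayed and analysed as a call to the named API.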
  • 32. LEVERAGING STANDARD API ADDRESS LOADING ================================================== Function Name : ADSICloseDSObject Address : 0x76e30826 Relative Address : 0x00020826 Ordinal : 142 (0x8e) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ================================================== ================================================== Function Name : ADSICloseSearchHandle Address : 0x76e3050a Relative Address : 0x0002050a Ordinal : 143 (0x8f) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ================================================== ================================================== Function Name : ADSICreateDSObject Address : 0x76e30447 Relative Address : 0x00020447 Ordinal : 144 (0x90) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ==================================================
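One way to use an export listing like this is as a reverse index from load address to API name, so that a pointer value found in a memory dump can be labeled. The Python sketch below uses the three entries shown on the slide; it assumes the DLLs were loaded at the same addresses as on the reference system, and the function names are illustrative.

```python
# (DLL, exported function, load address) taken from the listing above.
EXPORTS = [
    ("adsldpc.dll", "ADSICloseDSObject", 0x76E30826),
    ("adsldpc.dll", "ADSICloseSearchHandle", 0x76E3050A),
    ("adsldpc.dll", "ADSICreateDSObject", 0x76E30447),
]

def build_address_index(exports):
    """Map each exported address to a 'dll!function' label."""
    return {addr: "%s!%s" % (dll, name) for dll, name, addr in exports}

def name_pointer(value, index):
    """Return the API label for a pointer value, or None if it is not an export."""
    return index.get(value)

if __name__ == "__main__":
    index = build_address_index(EXPORTS)
    print(name_pointer(0x76E3050A, index))
```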
  • 33. USING DATAFLOW ANALYSIS • Identify register based indirect calls GetEnvironmentStringW use def
  • 34. HANDLING DYNAMIC POINTER UPDATES • Identify register based indirect calls dword_41e304 has no static value to look up API use def A def to dword_41e308 is found Look for probable call to GetProcAddress earlier Call to GetProcAddress
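The use/def idea on the two slides above can be sketched as a backward scan: starting from the indirect call (the use), walk back to the most recent instruction that defines the register. The Python sketch below works on a single straight-line block only and models instructions as hypothetical `(address, mnemonic, destination, source)` tuples; a real analysis has to follow the control flow graph.

```python
def resolve_indirect_call(block, call_index):
    """Return the source operand last moved into the register used by the
    call at block[call_index], or None if it cannot be determined."""
    register = block[call_index][3]
    for _addr, mnemonic, dest, source in reversed(block[:call_index]):
        if dest == register:
            # Only plain moves are modeled; anything else clobbers the value.
            return source if mnemonic == "mov" else None
    return None

# Straight-line fragment modeled on the call-site listing later in this deck.
BLOCK = [
    (0x4011BA, "push", None, "esi"),
    (0x4011BB, "mov", "esi", "dword ptr unk_40A0E8"),
    (0x4011C1, "push", None, "offset unk_40A2C4"),
    (0x4011C6, "push", None, "edi"),
    (0x4011C7, "call", None, "esi"),
]

if __name__ == "__main__":
    print(resolve_indirect_call(BLOCK, 4))
```

If the resolved location has no static value, the same scan can be repeated on that location to look for the instruction that stores the result of an earlier `GetProcAddress` call into it.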
  • 35. STANDARD API ADDRESS LOADING IS NOT ENOUGH ================================================== Function Name : ADSICloseDSObject Address : 0x76e30826 Relative Address : 0x00020826 Ordinal : 142 (0x8e) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ================================================== ================================================== Function Name : ADSICloseSearchHandle Address : 0x76e3050a Relative Address : 0x0002050a Ordinal : 143 (0x8f) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ================================================== ================================================== Function Name : ADSICreateDSObject Address : 0x76e30447 Relative Address : 0x00020447 Ordinal : 144 (0x90) Filename : adsldpc.dll Full Path : c:\WINDOWS\system32\adsldpc.dll Type : Exported Function ================================================== There are many indirect ways to load and call a Windows API: • access to the list of loaded DLLs • access to a loaded DLL and use of GetModuleHandle() + offset • …
  • 36. CONSEQUENCE OF FAILURE TO IDENTIFY APIS ... .text:004011A7 push offset unk_40A2DC ; arg 1 .text:004011AC xor ebx, ebx .text:004011AE call dword ptr unk_40A0E4 .data:0040A0E4 00000000 .text:004011B4 mov edi, eax .text:004011B6 cmp edi, ebx .text:004011B8 jz short loc_401211 .text:004011BA push esi .text:004011BB mov esi, dword ptr unk_40A0E8 .text:004011C1 push offset unk_40A2C4 ; arg 2 .text:004011C6 push edi ; arg 1 .text:004011C7 call esi ; unk_40A0E8 .data:0040A0E8 00000000 .text:004011C9 push offset unk_40A2AC .text:004011CE push edi .text:004011CF mov dword_433480, eax ... ... .text:00401132 lea eax, [ebp+var_4] .text:00401135 push eax .text:00401136 push ebx .text:00401137 push 0 .text:00401139 mov [ebp+var_4], esi .text:0040113C call dword_433480 .text:00401142 test eax, eax … Name of a library Load library call (LoadLibrary) Name of the library function Name of the library API call to get the address Of the loaded library function (GetProcAddress) library function call www.intertel.co.za
  • 37. FAILURE TO PERFORM CONTROL FLOW ANALYSIS • CreateThread .text:009A3A4C push eax .text:009A3A4D xor eax, eax .text:009A3A4F push eax .text:009A3A50 push eax .text:009A3A51 push offset dword_9A3939 .text:009A3A56 push eax .text:009A3A57 push eax .text:009A3A58 call [ebx] .data:009A3939 xxxxxxx • Starting Services • Thread synchronization • Critical sections • Callback functions Location of the start address of a thread Call to CreateThread www.intertel.co.za
  • 38. ADVANCED API RESOLUTION • There are many ways in which a library or API can be invoked. • There are many ways an API call can be obfuscated. • But there is one invariant associated with each API and library: its signature • i.e., the number of arguments, the types of the arguments, and the type of the return value, if any.
  • 39. ADVANCED API RESOLUTION: TYPE INFERENCE FOR BINARY PROGRAM ANALYSIS • Use type inference as a single solution to solve three fundamental problems: • Identifying API and function calls (call and jump targets) • Building a precise CFG • Recovering user-defined types for proper decompilation • For Windows executable files: • Integers: object handles, addresses, IP addresses, ports, etc. • Strings: file names, service names, etc. • Structures: sockaddr struct sockaddr_in { short sin_family; u_short sin_port; struct in_addr sin_addr; char sin_zero[8]; };
  • 40. TYPE PROPAGATION AND MATCHING • Type propagation using dataflow analysis • Propagation of return values and arguments of functions sub_403649 proc near ... .text:00403649 arg_0 = dword ptr 8 .text:00403649 arg_4 = dword ptr 0Ch .text:00403649 arg_8 = dword ptr 10h .text:00403649 arg_C = dword ptr 14h ... .text:00403668 xor ebx, ebx ... .text:00403704 mov esi, ds:dword_40A06C .text:0040370A push ebx ; arg 7 type(f,7) = type (ebx) .text:0040370B mov edi, 80h .text:00403710 push edi ; arg 6 type(f,6) = type (edi) .text:00403711 push 4 ; arg 5 type(f,5) = union(int,char) .text:00403713 push ebx ; arg 4 type(f,4) = type (ebx) .text:00403714 push 2 ; arg 3 type(f,3) = union(int,char) .text:00403716 push 2 ; arg 2 type(f,2) = union(int,char) .text:00403718 push [ebp+arg_4] ; arg 1 type(f,1) = type([ebp+arg_4]) .text:0040371B mov [ebp+var_18], ebx .text:0040371E call esi ; dword_40A06C type(ret(f)) = type(eax) ... There is only one API that has 7 arguments such that the seventh, fourth, and first can be pointers and all others are not. HANDLE WINAPI CreateFile( __in LPCTSTR lpFileName, __in DWORD dwDesiredAccess, __in DWORD dwShareMode, __in LPSECURITY_ATTRIBUTES lpSecurityAttributes, __in DWORD dwCreationDisposition, __in DWORD dwFlagsAndAttributes, __in HANDLE hTemplateFile );
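The matching step can be sketched in Python as a filter over a signature database: keep every API whose arity equals the number of pushed arguments and whose declared argument kinds are compatible with what was observed at the call site. The three-entry database below is a tiny hypothetical stand-in for a full Win32 signature set, handles are treated as pointers, and the observation labels (`unknown`, `zero`, `small_const`) are assumptions of this sketch rather than the authors' type lattice.

```python
# Declared argument kinds per API, in call order ("ptr" includes handles).
API_SIGNATURES = {
    "CreateFile":         ("ptr", "int", "int", "ptr", "int", "int", "ptr"),
    "CreateRemoteThread": ("ptr", "ptr", "int", "ptr", "ptr", "int", "ptr"),
    "CreateThread":       ("ptr", "int", "ptr", "ptr", "int", "ptr"),
}

# What each observation at the call site is compatible with.
COMPATIBLE = {
    "unknown":     {"ptr", "int"},  # value comes from elsewhere, e.g. [ebp+arg_4]
    "zero":        {"ptr", "int"},  # 0 can be NULL or an integer
    "small_const": {"int"},         # 2, 4, 80h are not plausible pointers
}

def candidate_apis(observed, signatures=API_SIGNATURES):
    """Return the names of all APIs consistent with the observed arguments."""
    return sorted(
        name for name, declared in signatures.items()
        if len(declared) == len(observed)
        and all(kind in COMPATIBLE[obs] for obs, kind in zip(observed, declared))
    )

# Arguments 1..7 of the 'call esi' at 0x0040371E (ebx was zeroed by xor).
OBSERVED = ("unknown", "small_const", "small_const", "zero",
            "small_const", "small_const", "zero")

if __name__ == "__main__":
    print(candidate_apis(OBSERVED))
```

With a realistic database the filter may leave several candidates, which is where propagating the type of the return value and of later uses narrows the choice further.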
  • 41. ADVANTAGES OF TYPE INFERENCE ANALYSIS • Programmers' data structures and types are going to be based on known data structures and types provided by the libraries • Identifying API calls and type information helps to better capture the semantics of the program execution • Not restricted to Windows, but requires knowledge of the libraries and their documentation • Can deal with some of the widely used obfuscation techniques • Import table obfuscation • Code rewrite: code rewrite preserves the types!
  • 42. PHASE 3: REBUILDING THE UNPACKED EXECUTABLE • From a damaged dumped image of a running malware to a PE executable: • Knowing all APIs allows us to identify the OEP. • Semantic approach: ExitProcess, CreateMutex, GetCommandLine, GetModuleHandle, etc. are close to the OEP. There are about 20 APIs that are often called at the beginning of the execution of the code. • Structural approach: find sources of call graphs in the binary • Rebuilding the import table with all references to identified APIs • The disassembly of the reconstructed PE is often of better quality than the disassembly of the dumped process image • The new PE code bypasses the unpacking routine embedded in the packed code • The new PE contains the original code
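The structural approach to locating the OEP amounts to finding the functions that nothing else calls. A minimal Python sketch, assuming the call graph has already been recovered as a dictionary from caller to callees (the subroutine names are hypothetical):

```python
def call_graph_sources(calls):
    """calls: dict of function -> list of functions it calls.
    Returns the functions that are never the target of a call."""
    callees = {target for targets in calls.values() for target in targets}
    everyone = set(calls) | callees
    return sorted(f for f in everyone if f not in callees)

if __name__ == "__main__":
    graph = {
        "sub_401000": ["sub_401200", "sub_401300"],
        "sub_401200": ["sub_401300"],
        "sub_401300": [],
    }
    print(call_graph_sources(graph))
```

Routines reached only through callbacks, such as thread start addresses, also show up as sources, so the result is a candidate list that the semantic cue (calls to the APIs typically seen near program start) helps to rank.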
  • 45. PHASE 4: DECOMPILATION • Identifies local variables • Identifies arguments: registers, stack, or any combination • Identifies global variables • Identifies calling conventions • Identifies common idioms and compiler features • Eliminates the use of registers as intermediate variables • Identifies control structures
  • 46. DECOMPILATION DEPENDS ON PREVIOUS ANALYSIS PHASES
  • 47. MALWARE OBFUSCATION EFFECT ON DECOMPILATION • While packing is the most used obfuscation technique, it is often combined with other advanced forms of obfuscation that often make decompilation impossible: • Call obfuscation in general and API obfuscation in particular • Binary rewrite to create semantically equivalent code with vastly different structure • Chunking or “code spaghettisation” • …
  • 48. ANALYSIS PHASES Compiler Source Code Executable code Assembly code Legitimate C/C++ that a compiler would generate Disassembly & Analysis Decompilation Ideally Unpacking Malware Non-executable code Obfuscated assembly A mess Disassembly & Analysis Decompilation Reality Assembly code Legitimate C/C++ Undo Obfuscation Decompilation www.intertel.co.za
  • 49. EXAMPLE OF BINARY REWRITE .text:009B327C OBFUSCATED_VERSION_OF_is_private_subnet proc near .text:009B327C mov ecx, eax .text:009B327E and ecx, 0FFFFh .text:009B3284 cmp ecx, 0A8C0h .text:009B328A jz loc_9B4264 .text:009B3290 jmp off_9BAAA5 /* .text:009BAAA5 off_9BAAA5 dd offset loc_9AB10C */ .text:009B3290 OBFUSCATED_VERSION_OF_is_private_subnet endp . .text:009B4264 loc_9B4264: .text:009B4264 .text:009B4264 mov eax, 1 .text:009B4269 retn . .text:009AB10C cmp al, 0Ah .text:009AB10E jz loc_9B4264 .text:009AB114 jmp off_9BA137 /* .text:009BA137 off_9BA137 dd offset loc_9B2CE4 */ . .text:009B2CE4 and eax, 0F0FFh .text:009B2CE9 cmp eax, 10ACh .text:009B2CEE jz loc_9B4264 .text:009B2CF4 jmp off_9BA3CC /* .text:009BA3CC off_9BA3CC dd offset loc_9ABA24 */ . .text:009ABA24 xor eax, eax .text:009ABA26 retn .text:009A1311 is_private_subnet proc near .text:009A1311 mov ecx, eax .text:009A1313 and ecx, 0FFFFh .text:009A1319 cmp ecx, 0A8C0h .text:009A131F jz loc_9A1340 .text:009A1325 cmp al, 0Ah .text:009A1327 jz loc_9A1340 .text:009A132D and eax, 0F0FFh .text:009A1332 cmp eax, 10ACh .text:009A1337 jz loc_9A1340 .text:009A133D xor eax, eax .text:009A133F retn .text:009A1340 .text:009A1340 loc_9A1340: .text:009A1340 mov eax, 1 .text:009A1345 retn .text:009A1345 is_private_subnet endp bool __usercall is_private_subnet (unsigned __int16 a1) { return a1 == 43200 || a1 == 10 || (a1 & 0xF0FF) == 4268; } int __usercall OBFUSCATED_VERSION_OF_is_private_subnet<eax>(unsigned __int16 a1 <ax>) { int result; // eax@2 if ( a1 == 43200 ) result = 1; else result = off_9BAAA5(); return result; } www.intertel.co.za
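To make the constants in the listing easier to read, here is a Python rendering of the three comparisons in the clean `is_private_subnet` assembly. It assumes the input register holds an IPv4 address in network byte order loaded as a little-endian dword, as an `in_addr` would be on x86; that reading is an interpretation of the listing, not something the slide states.

```python
import socket
import struct

def is_private_subnet(ip_text):
    """Mirror the three tests in the assembly on an x86-style in_addr value."""
    (eax,) = struct.unpack("<I", socket.inet_aton(ip_text))
    if (eax & 0xFFFF) == 0xA8C0:   # first two octets C0 A8 -> 192.168.x.x
        return True
    if (eax & 0xFF) == 0x0A:       # cmp al, 0Ah -> 10.x.x.x
        return True
    if (eax & 0xF0FF) == 0x10AC:   # AC 1x -> 172.16.x.x through 172.31.x.x
        return True
    return False

if __name__ == "__main__":
    for address in ("192.168.1.1", "10.0.0.1", "172.16.0.1", "8.8.8.8"):
        print(address, is_private_subnet(address))
```

Read this way, the routine recognises the three RFC 1918 private ranges, which is consistent with its name.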
  • 50. SYSTEMATIC APPROACH TO CODE DEOBFUSCATION: BINARY REWRITING • Dechunking: The control flow of Conficker's P2P module has been significantly obfuscated to hinder its disassembly and decompilation. Specifically, the contents of code blocks from each subroutine have been extracted and relocated throughout different portions of the executable. These different blocks (or chunks) are then referenced through unconditional and conditional jump instructions. In effect, the logical control flow of the P2P module has been obscured (spaghetti-code) to a degree that the module cannot be decompiled into coherent C-like code, which typically drives more in-depth and accurate code interpretation. Move all blocks to a contiguous memory block. • Normalize x86 instructions: push followed by a pop is a mov • Normalize calling convention: cdecl, fastcall, stdcall, instead of user-defined. www.intertel.co.za
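The instruction-normalization bullet can be sketched as a peephole pass in Python. The version below only rewrites a `push X` that is immediately followed by `pop Y` into `mov Y, X`, models instructions as simple tuples, and ignores corner cases such as operands that involve `esp`; it is an illustration, not the authors' rewriting tool.

```python
def normalize_push_pop(instructions):
    """Rewrite adjacent ('push', src), ('pop', dst) pairs as ('mov', dst, src)."""
    result = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if current[0] == "push" and following is not None and following[0] == "pop":
            result.append(("mov", following[1], current[1]))
            i += 2  # both instructions consumed
        else:
            result.append(current)
            i += 1
    return result

if __name__ == "__main__":
    code = [("push", "eax"), ("pop", "ebx"), ("push", "ecx"), ("call", "sub_401000")]
    print(normalize_push_pop(code))
```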
  • 51. CONFICKER AND HYDRAQ DECHUNKING • Identify all chunks in a function and rewrite the function • Applied to all Conficker C P2P protocol subroutines • Unlike the Conficker P2P logic, Hydraq did not exhibit the same level of obfuscation. It did, however, share some obfuscation features with Conficker. The functions of the Hydraq binary have been subjected to chunking, which renders decompilation difficult. We applied our transformations to automatically generate the C-like code for each subroutine and build a complete CFG of the binary. The IDA disassembler identified 185 subroutines in the binary prior to our analysis. After running the dechunking transformation, only 141 subroutines remained and were decompiled.
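A minimal Python sketch of dechunking follows: starting at the function entry, it follows each unconditional `jmp` into the next chunk and drops the jump, so the chunks end up contiguous. It assumes indirect jump targets (such as `jmp off_9BAAA5`) have already been resolved to labels, represents instructions as plain strings, and is a simplified illustration rather than the transformation the authors implemented. The data is the obfuscated `is_private_subnet` from the binary-rewrite example earlier in this deck.

```python
def dechunk(chunks, entry):
    """chunks: label -> list of instruction strings. Returns one linear listing."""
    ordered, seen, pending = [], set(), [entry]
    while pending:
        label = pending.pop(0)
        while label is not None and label not in seen:
            seen.add(label)
            ordered.append(label + ":")
            follow = None
            for ins in chunks[label]:
                op, _, target = ins.partition(" ")
                if op == "jmp" and target not in seen:
                    follow = target             # inline the successor, drop the jmp
                else:
                    if op.startswith("j") and op != "jmp":
                        pending.append(target)  # conditional target, emitted later
                    ordered.append("    " + ins)
            label = follow
    return ordered

CHUNKS = {
    "loc_9B327C": ["mov ecx, eax", "and ecx, 0FFFFh", "cmp ecx, 0A8C0h",
                   "jz loc_9B4264", "jmp loc_9AB10C"],
    "loc_9AB10C": ["cmp al, 0Ah", "jz loc_9B4264", "jmp loc_9B2CE4"],
    "loc_9B2CE4": ["and eax, 0F0FFh", "cmp eax, 10ACh", "jz loc_9B4264",
                   "jmp loc_9ABA24"],
    "loc_9ABA24": ["xor eax, eax", "retn"],
    "loc_9B4264": ["mov eax, 1", "retn"],
}

if __name__ == "__main__":
    print("\n".join(dechunk(CHUNKS, "loc_9B327C")))
```

Apart from the leftover labels, the output has the same instruction order as the unobfuscated routine, which is the form a decompiler handles well.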
  • 52. PURPOSE OF CODE OBFUSCATION • While packing is often used to reduce the size of binaries and to create polymorphic malware samples, the more advanced obfuscation techniques are designed to slow down reverse engineering efforts and to prevent: • the identification of API calls: identify the basic building blocks of the malware • the control-flow reconstruction of the malware: follow and reconstruct the logic flow • static analysis: determine the full functionality, triggers, hidden logic, time bombs, etc. • timely reverse engineering and mitigation of the threat
  • 53. WHY CODE OBFUSCATION IS NOT EASY • Malware authors can design binary code that is extremely difficult to analyze. With advanced knowledge of programming languages, it is possible to create such code. • Malware authors do not always feel the need to obfuscate their code: they can easily defeat signature-based detection and overwhelm analysts and tools with large numbers of samples. • Malware code should be able to run in a reliable manner. Obfuscation should not compromise this important requirement and should maintain the reliability of the initial code. This requires a proof or guarantee of some sort. • Malware deobfuscation is therefore more attainable than you might think. Systematic obfuscation informs systematic deobfuscation.
  • 54. OUR APPROACH • Because obfuscation is introduced in a rather systematic way, there is a hope that it can be dealt with in an automated way. • Systematically identifying an obfuscation step and undoing its effect. • Focus on generic approaches as opposed to packer/obfuscator specifics • Focus on metrics that allow us to assess the effect and success of our deobfuscation strategies
  • 55. EXAMPLE: STATIC ANALYSIS OF CONFICKER • Conficker appeared on November 20th, 2008 • Infected millions of machines worldwide • Millions of machines still infected despite extensive news coverage of the threat • Four versions have been released: A, B, B++, and C • It is a sophisticated piece of malicious code created by professionals who have extensive knowledge of networking, cryptography, system and network programming, and security • Managed to defeat the security community's efforts to stop its progression by using strong crypto, code obfuscation, aggressive propagation strategies, and constant monitoring of the security community's actions • Dynamic analysis provided a limited understanding of the threat: • Identification of what appears to be a P2P protocol • Identification of ports opened by the malware • Deobfuscation and static analysis were the only techniques that were able to uncover the full capability of the malware.
  • 56. EXAMPLE: STATIC ANALYSIS OF CONFICKER Static Analysis of Conficker Code: – Domain generation algorithm: provided a list of daily domains to be blocked – Quarter of the Internet scanned: Understand what part of the Internet was targeted for scanning and what infections were due to USB ports and mobile devices – List of disabled security products: detection – Ukrainian keyboard avoidance: Geo-location database poisoning – Use of MD6 and related crypto algorithms: Attribution – DNS APIs patching to disable list of websites (including SRI!): detection – Distribute a number of modified versions of the binary – TCP and UDP ports based on the IP address of the infected machine: detection www.intertel.co.za
  • 57. DEOBFUSCATION OF THE CONFICKER C P2P • Heavily obfuscated protocol code • 88 APIs obfuscated • Use of chunking led to poor decompilation • Benefits of the deobfuscation – P2P Protocol description: protocol understanding and P2P structure – Peer selection algorithm: proved the peer poisoning approach useless – Possibility to hot patch code without DGA updates: proved C&C domains obsolete The P2P protocol was not just a mechanism for distributing PE executable files but also digitally signed sets of x86 instructions that are executed in a separate thread and take as argument the IP address of the sender. This would provide a hot patch mechanism for all data manipulated by Conficker: list of peers, encryption/decryption keys, the Conficker code itself, etc.
  • 58. STUXNET: KEEPING IT “RELATIVELY” SIMPLE • Stuxnet does not use advanced binary obfuscation techniques. • The analysis of the code is challenging nevertheless • Stuxnet Code Characteristics: • Use of C++ • Use of C++ exception handling • Use of C++ classes • Use of simple data encoding (encryption) • Use of C structures for all data passed to the main subroutines: • Over 40 user-defined structures • Not recognized by disassemblers and decompilers
  • 59. PHASE 5: PROGRAM UNDERSTANDING • Need to identify higher-level concepts from the deobfuscated code • Need to interpret the code into a higher-level malware objective • Need to identify particular features, such as crypto: • Functions that use crypto-related opcodes, loops, etc. • Known constants in crypto algorithms
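The "known constants" heuristic can be sketched in Python as a byte-pattern scan. The table below holds a few well-known values (the first MD5/SHA-1 initialization word, the first SHA-256 initialization word and round constant, and the first eight bytes of the AES S-box); it is far from exhaustive, and it assumes 32-bit constants appear as little-endian immediates or data, as they typically do in x86 code.

```python
import struct

CRYPTO_CONSTANTS = {
    "MD5/SHA-1 init (0x67452301)": struct.pack("<I", 0x67452301),
    "SHA-256 init (0x6A09E667)": struct.pack("<I", 0x6A09E667),
    "SHA-256 K[0] (0x428A2F98)": struct.pack("<I", 0x428A2F98),
    "AES S-box (first 8 bytes)": bytes([0x63, 0x7C, 0x77, 0x7B,
                                        0xF2, 0x6B, 0x6F, 0xC5]),
}

def find_crypto_constants(blob):
    """Return sorted (offset, label) pairs for every known pattern in 'blob'."""
    hits = []
    for label, pattern in CRYPTO_CONSTANTS.items():
        position = blob.find(pattern)
        while position != -1:
            hits.append((position, label))
            position = blob.find(pattern, position + 1)
    return sorted(hits)

if __name__ == "__main__":
    sample = (b"\x90" * 4 + struct.pack("<I", 0x67452301) + b"\x00" * 3
              + CRYPTO_CONSTANTS["AES S-box (first 8 bytes)"])
    print(find_crypto_constants(sample))
```

A hit only says that a known algorithm is probably present; custom or modified algorithms need the structural cues from the previous bullet (crypto-style opcodes and loops).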
  • 60. FINDING KNOWN AND UNKNOWN CRYPTO
  • 61. CONCLUSIONS • It is always desirable to recover from the malware a description that is as close as possible to the original code produced by the authors. • It is often possible to do that in practice • It is often the only way to really determine the full capability of the malware • The benefits are important when it comes to high-profile targets • Easily integrated in common analysis tools: disassemblers (IDA), decompilers.