 
Mitchell Adair
utdcsg.org
 Debugging libraries (for Windows)
o WinAppDbg, PyDBG
• Examples
• Pros and con
 Fuzzer design
o Design concepts
o Fuzzer goals
o Github
o Future work

 PyDBG
o “A pure-python win32 debugger interface.”
o Part of the Paimei reverse engineering framework
• Awesome
o Created by Pedram Amini
• Badass, you should be following him on Twitter etc.
 https://github.com/OpenRCE/pydbg
 So… what can it do?
o Launch or attach to processes
o Breakpoints, step into, step over, etc.
o Get / set memory or register values
o Give you access to PEB
o Resolve functions
o Disassemble
o Set callbacks for signals, events, breakpoints, etc.
o Snapshots
o … (seriously)
 And… you can use it stand-alone, or from within IDA!
 How is this different from Immunity, OllyDBG, etc?
o It’s scriptable!
 How about automating…
o Unpacking
o Malware analysis
• General statistics, system calls of interest, etc.
o Crash analysis
• Trace my path, save operand values, etc.
o Fuzzing!
• Debug a process, set callbacks on signals of interest, log the run…
• In memory fuzzing with snapshots
 Let’s see some examples!
 Create a debugging object
 Load the target executable
 Run it
 Pretty painless
 From the interpreter
 The entire dbg object is passed to the callback handler
 Some sort of continue status is returned
 Let’s handle some signals. How about access violation
 On Microsoft Windows, a process that accesses invalid
memory receives the STATUS_ACCESS_VIOLATION exception.
o Wikipedia
 Why do we care about access violations?
o “invalid memory” = ?
o Virtual memory that does not map to physical memory
o Virtual memory marked with permissions, and the process does not
have permission to perform the operation
• Memory is read/write/executable
• Trying to perform a read on non-readable memory… access violation
 We are typically trying to influence pointers, influence
length values, overflow boundaries, etc.
 The above usually results in access violations
 Illegal instruction is another good signal (usually means we
messed with EIP and it now points to an invalid instruction)
 We can
o Launch or attach to an application
o Set our callback handlers
o Run the application
 But… we want to collect as much information as possible
from the access violation handler
 Paimei comes with the great util, crash_binning.py that will
record lots of useful information
 Just create a crash_binning object and record the crash
with the dbg object passed to the callback handler
 That’s a pretty powerful 16 lines of code…
 Sample output from
crash_binning
 Registers, assembly,
stack trace, SEH
 All with a function
call, so easy!
 Now import multiprocessing
 Mutate some files
 Launch the target application with the new files
 Find bugs 
 WinAppDbg
 “The WinAppDbg python module allows developers to
quickly code instrumentation scripts in Python under
a Windows environment.”
 “It uses ctypes to wrap many Win32 API calls related to
debugging…”
 “The intended audience are QA engineers and software
security auditors wishing to test or fuzz Windows
applications with quickly coded Python scripts.”
 http://winappdbg.sourceforge.net/
 Why not just stick with PyDBG?
o Rumor has it PyDBG development has become OSX focused
o It rocks, but it’s a little old and antiquated
o Might have to write some wrappers, depending on your usage
 WinAppDbg is *only* windows, but it has a *ton* of stuff to
work with
 If you’re doing heavy PE work WinAppDbg might be the way
to go
 The WinAppDbg site has some great examples
o http://winappdbg.sourceforge.net/ProgrammingGuide.html
o Instrumentation
• Enumerating processes, loading a DLL into a process, control windows
o Debugging
• Starting and attaching, handling events, breakpoints, etc.
o Win32 API wrappers
• Enumerating heap blocks, modules and device drivers
o Misc
• Dump process memory, find alphanumeric jump addresses, etc.
 We’ll compare WinAppDbg with our last PyDBG example,
then show one more interesting example
 Picking up where we left off with PyDBG
A custom event handler
is optional, but is an
easy way to catch any
signals of interest
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
4. Hook it
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
4. Hook it
5. wsprintf hit at run time
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
4. Hook it
5. wsprintf hit at run time
6. Dereference
format string
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
4. Hook it
5. wsprintf hit at run time
6. Dereference
format string
7. Count args
 Hooking a
function,
wsprintfW
 Catch the
load_dll signal
 If it’s
user32.dll,
resolve
wsprintf, hook
it
 Print the args
1. Catch load_dll
signal
2. If it’s user32.dll
3. Resolve “wsprintfW”
4. Hook it
5. wsprintf hit at run time
6. Dereference
format string
7. Count args
8. Read
off stack,
print args
 Way too many great examples on their site to go into
o Hooking functions
o Watching variables
o Watching buffers
o Etc… very powerfull
 If you want to automate anything PE related, this is a great
library to look into

 Design goals
o Modularity
• Ex: generator, executor, monitor
o Reusability
• A new target program or file type should make little to no difference
o Speed
• A large file might have hundreds of thousands of mutations
• Multiprocessing or a distributed architecture is helpful
o False negatives
• We don’t want to miss anything…
 What are the general tasks performed during fuzzing?
o Generating mutated data
o Launching the target application
o Sending the data to the application
o Monitoring the application for signals of interest
o Logging results
o …more?
Mutate Data
Launch
Application
Monitor
Application
Log Results
Mutate Data
Launch
Application
Monitor
Application
Log Results
Executor.py
Mutator.py
Fuzzer.py
?
 Part 1 discussed possible values you may want to try
 Yield is a nice python feature
 Sole job is to mutate the bytes, any changes in possible
values can easily be handled here
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
1. Establish timeout
and queues
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
1. Establish timeout
and queues
2. Wait for new job
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
1. Establish timeout
and queues
2. Wait for new job
3. Execute job
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
1. Establish timeout
and queues
2. Wait for new job
3. Execute job
4. Check timeout
 My actual executor
 Continually check
queue for new jobs
 When one is
available, call
execute
 Create a new pydbg
instance, setup
callbacks, execute
1. Establish timeout
and queues
2. Wait for new job
3. Execute job
4. Check timeout
5. Handle av
 handle_av we’ve seen, uses crash_binning to
capture relevant data
 timeout_callback is a custom callback. Every
itteration of the main debugging loop, it gets called.
An easy way to implement a max timeout
Start the
consumers
Start the
monitor thread
When the
queue is empty,
put a new job
 Feel free to grab my *work in progress* from the above link
 (I will update the site after the presentation)
 Producer / Consumer model
 Multiprocessing
 All in about 260 lines of python
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
2. Yield a new
mutated file
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
2. Yield a new
mutated file
3. Add the new job
to the in_queue
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
2. Yield a new
mutated file
3. Add the new job
to the in_queue
4. Execute, and
monitor the job
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
2. Yield a new
mutated file
3. Add the new job
to the in_queue
4. Execute, and
monitor the job
5. Return the results
to the out_queue
Fuzzer.py
Mutator.py
Executor nExecutor 2Executor 1
queue
…
1. For each file
mutation in
mutator
2. Yield a new
mutated file
3. Add the new job
to the in_queue
4. Execute, and
monitor the job
5. Return the results
to the out_queue
6. Log results
 There is actually an incoming queue and an outgoing queue
as shown in the fuzzer.py slide, but it took me long enough
to get that graphic, I’m not changing it ;)
 How can we improve our fuzzer, increase our odds?
 Code coverage would be a nice feature
o PyDBG and WinAppDbg both support process “stalking”
o Used to determine the first time a basic block or something specific
is hit
• Enumerate basic blocks ahead of time, count ones hit during execution
• Find common pitfalls, track code coverage, etc.
 Cluster instead of consumer producer?
 Support specific file format fields?
o Just use Peach ;)
 Where can I find some sample files?
o Google.com, with the filter “filetype:xyz”
o ie. “filetype:zip”
o http://samples.mplayerhq.hu/
o http://www.filecrop.com/
• Be careful!
 Gray Hat Python: Python Programming for Hackers and
Reverse Engineers
o http://www.amazon.com/Gray-Hat-Python-Programming-
Engineers/dp/1593271921
 Fuzzing: Brute Force Vulnerability Discovery
o http://fuzzing.org/

Fuzzing - Part 2

  • 1.
  • 2.
     Debugging libraries(for Windows) o WinAppDbg, PyDBG • Examples • Pros and con  Fuzzer design o Design concepts o Fuzzer goals o Github o Future work
  • 3.
  • 4.
     PyDBG o “Apure-python win32 debugger interface.” o Part of the Paimei reverse engineering framework • Awesome o Created by Pedram Amini • Badass, you should be following him on Twitter etc.  https://github.com/OpenRCE/pydbg
  • 5.
     So… whatcan it do? o Launch or attach to processes o Breakpoints, step into, step over, etc. o Get / set memory or register values o Give you access to PEB o Resolve functions o Disassemble o Set callbacks for signals, events, breakpoints, etc. o Snapshots o … (seriously)  And… you can use it stand-alone, or from within IDA!
  • 6.
     How isthis different from Immunity, OllyDBG, etc? o It’s scriptable!  How about automating… o Unpacking o Malware analysis • General statistics, system calls of interest, etc. o Crash analysis • Trace my path, save operand values, etc. o Fuzzing! • Debug a process, set callbacks on signals of interest, log the run… • In memory fuzzing with snapshots
  • 7.
     Let’s seesome examples!
  • 8.
     Create adebugging object  Load the target executable  Run it  Pretty painless
  • 9.
     From theinterpreter  The entire dbg object is passed to the callback handler  Some sort of continue status is returned
  • 10.
     Let’s handlesome signals. How about access violation  On Microsoft Windows, a process that accesses invalid memory receives the STATUS_ACCESS_VIOLATION exception. o Wikipedia
  • 12.
     Why dowe care about access violations? o “invalid memory” = ? o Virtual memory that does not map to physical memory o Virtual memory marked with permissions, and the process does not have permission to perform the operation • Memory is read/write/executable • Trying to perform a read on non-readable memory… access violation  We are typically trying to influence pointers, influence length values, overflow boundaries, etc.  The above usually results in access violations  Illegal instruction is another good signal (usually means we messed with EIP and it now points to an invalid instruction)
  • 13.
     We can oLaunch or attach to an application o Set our callback handlers o Run the application  But… we want to collect as much information as possible from the access violation handler  Paimei comes with the great util, crash_binning.py that will record lots of useful information
  • 14.
     Just createa crash_binning object and record the crash with the dbg object passed to the callback handler
  • 15.
     That’s apretty powerful 16 lines of code…
  • 16.
     Sample outputfrom crash_binning  Registers, assembly, stack trace, SEH  All with a function call, so easy!
  • 17.
     Now importmultiprocessing  Mutate some files  Launch the target application with the new files  Find bugs 
  • 18.
     WinAppDbg  “TheWinAppDbg python module allows developers to quickly code instrumentation scripts in Python under a Windows environment.”  “It uses ctypes to wrap many Win32 API calls related to debugging…”  “The intended audience are QA engineers and software security auditors wishing to test or fuzz Windows applications with quickly coded Python scripts.”  http://winappdbg.sourceforge.net/
  • 19.
     Why notjust stick with PyDBG? o Rumor has it PyDBG development has become OSX focused o It rocks, but it’s a little old and antiquated o Might have to write some wrappers, depending on your usage  WinAppDbg is *only* windows, but it has a *ton* of stuff to work with  If you’re doing heavy PE work WinAppDbg might be the way to go
  • 20.
     The WinAppDbgsite has some great examples o http://winappdbg.sourceforge.net/ProgrammingGuide.html o Instrumentation • Enumerating processes, loading a DLL into a process, control windows o Debugging • Starting and attaching, handling events, breakpoints, etc. o Win32 API wrappers • Enumerating heap blocks, modules and device drivers o Misc • Dump process memory, find alphanumeric jump addresses, etc.  We’ll compare WinAppDbg with our last PyDBG example, then show one more interesting example
  • 21.
     Picking upwhere we left off with PyDBG A custom event handler is optional, but is an easy way to catch any signals of interest
  • 22.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args
  • 23.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal
  • 24.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll
  • 25.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW”
  • 26.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW” 4. Hook it
  • 27.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW” 4. Hook it 5. wsprintf hit at run time
  • 28.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW” 4. Hook it 5. wsprintf hit at run time 6. Dereference format string
  • 29.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW” 4. Hook it 5. wsprintf hit at run time 6. Dereference format string 7. Count args
  • 30.
     Hooking a function, wsprintfW Catch the load_dll signal  If it’s user32.dll, resolve wsprintf, hook it  Print the args 1. Catch load_dll signal 2. If it’s user32.dll 3. Resolve “wsprintfW” 4. Hook it 5. wsprintf hit at run time 6. Dereference format string 7. Count args 8. Read off stack, print args
  • 31.
     Way toomany great examples on their site to go into o Hooking functions o Watching variables o Watching buffers o Etc… very powerfull  If you want to automate anything PE related, this is a great library to look into
  • 32.
  • 33.
     Design goals oModularity • Ex: generator, executor, monitor o Reusability • A new target program or file type should make little to no difference o Speed • A large file might have hundreds of thousands of mutations • Multiprocessing or a distributed architecture is helpful o False negatives • We don’t want to miss anything…
  • 34.
     What arethe general tasks performed during fuzzing? o Generating mutated data o Launching the target application o Sending the data to the application o Monitoring the application for signals of interest o Logging results o …more?
  • 35.
  • 36.
  • 37.
     Part 1discussed possible values you may want to try  Yield is a nice python feature  Sole job is to mutate the bytes, any changes in possible values can easily be handled here
  • 38.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute
  • 39.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute 1. Establish timeout and queues
  • 40.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute 1. Establish timeout and queues 2. Wait for new job
  • 41.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute 1. Establish timeout and queues 2. Wait for new job 3. Execute job
  • 42.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute 1. Establish timeout and queues 2. Wait for new job 3. Execute job 4. Check timeout
  • 43.
     My actualexecutor  Continually check queue for new jobs  When one is available, call execute  Create a new pydbg instance, setup callbacks, execute 1. Establish timeout and queues 2. Wait for new job 3. Execute job 4. Check timeout 5. Handle av
  • 44.
     handle_av we’veseen, uses crash_binning to capture relevant data  timeout_callback is a custom callback. Every itteration of the main debugging loop, it gets called. An easy way to implement a max timeout
  • 45.
    Start the consumers Start the monitorthread When the queue is empty, put a new job
  • 46.
     Feel freeto grab my *work in progress* from the above link  (I will update the site after the presentation)  Producer / Consumer model  Multiprocessing  All in about 260 lines of python
  • 47.
  • 48.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator
  • 49.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator 2. Yield a new mutated file
  • 50.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator 2. Yield a new mutated file 3. Add the new job to the in_queue
  • 51.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator 2. Yield a new mutated file 3. Add the new job to the in_queue 4. Execute, and monitor the job
  • 52.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator 2. Yield a new mutated file 3. Add the new job to the in_queue 4. Execute, and monitor the job 5. Return the results to the out_queue
  • 53.
    Fuzzer.py Mutator.py Executor nExecutor 2Executor1 queue … 1. For each file mutation in mutator 2. Yield a new mutated file 3. Add the new job to the in_queue 4. Execute, and monitor the job 5. Return the results to the out_queue 6. Log results
  • 54.
     There isactually an incoming queue and an outgoing queue as shown in the fuzzer.py slide, but it took me long enough to get that graphic, I’m not changing it ;)
  • 55.
     How canwe improve our fuzzer, increase our odds?  Code coverage would be a nice feature o PyDBG and WinAppDbg both support process “stalking” o Used to determine the first time a basic block or something specific is hit • Enumerate basic blocks ahead of time, count ones hit during execution • Find common pitfalls, track code coverage, etc.  Cluster instead of consumer producer?  Support specific file format fields? o Just use Peach ;)
  • 56.
     Where canI find some sample files? o Google.com, with the filter “filetype:xyz” o ie. “filetype:zip” o http://samples.mplayerhq.hu/ o http://www.filecrop.com/ • Be careful!
  • 57.
     Gray HatPython: Python Programming for Hackers and Reverse Engineers o http://www.amazon.com/Gray-Hat-Python-Programming- Engineers/dp/1593271921  Fuzzing: Brute Force Vulnerability Discovery o http://fuzzing.org/