WHITE PAPER
Man-In-The-Browser: Apple Mac OS X Edition
ThreatMetrix™ Labs Report February 2012
V122412
Authors: Nick Blievers and Andreas Baumhof
Contents
Introduction
MitB: Mac OS X Edition Part 2
Lazy Symbol Resolution
Sound like too much work?
The three approaches in detail
DYLD_LIBRARY_PATH
DYLD_INSERT_LIBRARIES
Code Injection
Conclusion
Introduction
This ThreatMetrix™ Labs report is the second part of a series about Man-in-the-Browser (MitB) for
Apple Mac OS X. In our first report in November 2011, we provided an overview of different ways to
perform MitB on Apple Mac OS X. We identified three possible ways and provided initial details.
All three approaches will overload a function in one way or another and each approach has its
advantages and disadvantages. We will look in detail at each of these three approaches.
This ThreatMetrix™ Labs report will provide important intelligence to understand the threat of MitB
for platforms other than Windows.
MitB: Mac OS X Edition Part 2
As mentioned above, the main problem we have to solve is to “hook” a function. If we can “redirect”
a function that resides within the operating system to our own “malicious” code, then we can
successfully perform a MitB attack. (For example, all browsers call an operating system function to
make an Internet request. If we can redirect this function, we can see all Internet traffic.)
But what do we really need to hook a function?
At a basic level, hooking requires a way of diverting a system function call to somewhere else.
The system function call (the victim of our hook) will then be called by our hook code, so the
apparent functionality stays the same. Last time we mentioned three different methods. There are
undoubtedly more, but these three make a good start.
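To make the idea concrete, here is a minimal sketch in C of a hook that forwards to the function it replaces. It is an illustration only, not code from any real MitB sample: the names system_send, send_ptr, hooked_send, install_hook and app_send are all hypothetical, and a plain function pointer stands in for the indirection that the dynamic linker provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an operating system function. */
size_t system_send(const char *data) {
    return strlen(data);              /* pretend every byte was sent */
}

/* The application calls through this pointer (assumed indirection). */
size_t (*send_ptr)(const char *) = system_send;

/* Address of the original function, saved when the hook is installed. */
size_t (*original_send)(const char *) = NULL;

/* Number of calls the hook has observed. */
unsigned hook_calls = 0;

/* The hook: observe the call, then forward it to the original function
 * so that the caller sees the same result as before. */
size_t hooked_send(const char *data) {
    assert(original_send != NULL);
    hook_calls++;
    return original_send(data);
}

/* Divert the pointer to the hook. Installing twice is a no-op, otherwise
 * the hook would end up forwarding to itself. */
void install_hook(void) {
    if (send_ptr != hooked_send) {
        original_send = send_ptr;
        send_ptr = hooked_send;
    }
}

/* What the application does: it simply calls through the pointer. */
size_t app_send(const char *data) {
    return send_ptr(data);
}
```

After install_hook() runs, app_send() returns exactly what it returned before, but every call now passes through hooked_send() first. That is the property the rest of this report relies on.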
To recap:
1. Library overloading using DYLD_LIBRARY_PATH
2. Function overloading using DYLD_INSERT_LIBRARIES
3. Code injection
Before we take a look at each of these, let’s take a quick look at how symbol resolution
works on Mac OS X. Afterwards we will take each method in turn and examine the benefits
and disadvantages.
Lazy Symbol Resolution
When an application is run, it almost always uses system libraries to perform common tasks. In fact,
today Mac OS X SDKs ship with very few static libraries, so outside of a trivial program, it’s very
difficult to avoid using shared libraries. Things as simple as comparing strings, or as complex as
drawing a window on the screen, are all done via system (shared) libraries. This is a good thing, as it
means there is a lot of code reuse, and binaries are smaller than they otherwise would be. However,
there is a disadvantage to this method as well. When your binary runs, there needs to be a process
whereby the functions or symbols it needs are found.
In the early days, this would happen at start-up, but the result was very slow program starts.
It was realized that many symbols are not needed until much later in a program’s life cycle, or
not at all. So, the idea of lazy resolution was introduced. This means that a symbol is not resolved until
it’s called the first time. The dynamic linker (dyld) is responsible for finding (linking) the symbols that a
binary needs.
As an aside, if you want to see just how slow program starts can be without lazy linking, it’s possible
to start an application with DYLD_BIND_AT_LAUNCH set, which forces non-lazy linking. Combine this
with a C++ application that makes use of C++ libraries (C++ is particularly bad as classes generate a
large number of symbols) and the results can be less than ideal.
The following diagram shows what an executable looks like when it’s loaded into memory. There
are a few important things to note. First, the TEXT segment is executable but not writable, while the
DATA segment is writable but not executable. Second, the shared library is loaded independently of the
executable rather than embedded in it (which is pretty much the point of a shared library). However,
this poses a problem, as we need to know where in memory it was loaded to be able to call functions
from that library. Additionally, it has to be loadable at any address to avoid conflicts with other libraries.
It works like this: we call a function in our code (‘printf’ for example), but the address that the
compiler actually embeds in our code is in the symbol stub section. The symbol stub is very small;
it simply jumps to whatever address is stored in its entry in the lazy symbol pointer table. The first
time this happens, the pointer points to the stub helper section. The stub helper calls the dynamic
linker and essentially says “where is printf?” The dynamic linker finds the function in question and
the pointer is then updated with the address of the function.
The second time our code calls ‘printf’, the pointer in the lazy symbol pointer table points directly
to the shared library (the dotted line). That is roughly how symbol resolution works on Mac OS X. The
details are slightly different for 64-bit binaries, but the concept is the same.
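The sequence above can be imitated in ordinary C. The sketch below is a simulation, not how dyld is implemented: resolve_symbol, stub_helper, strlen_stub and lazy_strlen_ptr are hypothetical names, the "dynamic linker" is a simple string comparison, and strlen is used instead of printf only because a fixed argument list keeps the function pointer type simple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Counts how often the simulated dynamic linker is consulted. */
int resolver_calls = 0;

size_t stub_helper(const char *s);

/* Simulated lazy symbol pointer: a writable slot that initially points
 * at the stub helper rather than at the real function. */
strlen_fn lazy_strlen_ptr = stub_helper;

/* Simulated dynamic linker lookup (hypothetical; the real dyld searches
 * the symbol tables of the loaded libraries). */
strlen_fn resolve_symbol(const char *name) {
    resolver_calls++;
    if (strcmp(name, "strlen") == 0) {
        return strlen;
    }
    return NULL;
}

/* Simulated stub helper: ask the linker where the function is, patch the
 * lazy pointer, then complete the call that triggered the lookup. */
size_t stub_helper(const char *s) {
    strlen_fn target = resolve_symbol("strlen");
    assert(target != NULL);
    lazy_strlen_ptr = target;
    return target(s);
}

/* Simulated symbol stub: all it does is jump through the lazy pointer. */
size_t strlen_stub(const char *s) {
    return lazy_strlen_ptr(s);
}
```

The first call to strlen_stub() goes through stub_helper() and increments resolver_calls. Every later call jumps straight to strlen because the pointer has been patched, so resolver_calls stays at 1.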
This may seem to be a bit of a roundabout way of handling the symbol resolution. You may ask,
wouldn’t it be simpler to just rewrite the symbol stub with the actual address? We mentioned earlier
that each segment has different permissions. The symbol stub lives in the TEXT segment, which the
process cannot write to, hence the need for the lazy symbol pointer table inside the (writable) DATA segment.
If we want to hook a library function and pervert it somehow, we can simply change the pointer in the
lazy symbol pointer table to point to our injected code.
If you read the last article on MitB, then this image will look strangely familiar.
Sound like too much work?
Any knowledgeable UNIX user might be thinking that the above is all a bit too hard and that there must
be an easier way. Well, there is. Given that the dynamic linker’s specialty is resolving symbols,
can’t we get it to do some of the heavy lifting here?
Most UNIX variants have some form of LD_LIBRARY_PATH (and LD_PRELOAD), which allows you to
specify your own path for loading libraries. In this way you can tell the dynamic linker to load your
library instead, and ensure that your code runs first. Mac OS X is no different. It has its own variant
of these, but it also has something much better that we will discuss later.
The three approaches in detail
DYLD_LIBRARY_PATH
• Please refer to the full ThreatMetrix Labs report for technical details. You can
request a copy of the report by contacting us at labs@threatmetrix.com
DYLD_INSERT_LIBRARIES
• Please refer to the full ThreatMetrix Labs report for technical details. You can
request a copy of the report by contacting us at labs@threatmetrix.com
Code Injection
The last method that we are going to discuss is, in some ways, the best method. It is definitely the
most complicated, but it is also the hardest of these three to detect. Depending on exactly how the
other two are implemented, detecting them could be as simple as checking the environment variables,
or querying dyld. Code injection, however, is done without dyld knowing about it.
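As a rough illustration of the simplest check mentioned above, the following C sketch looks for the two environment variables named in this report. It is a minimal sketch, not a complete detection method: the helper names env_is_set and count_dyld_hook_vars are hypothetical, it only inspects the environment of the process it runs in, and it says nothing about code injection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Environment variables from this report that change how dyld loads
 * libraries. */
const char *const dyld_hook_vars[] = {
    "DYLD_LIBRARY_PATH",
    "DYLD_INSERT_LIBRARIES",
};

/* Returns 1 if the named variable is set to a non-empty value. */
int env_is_set(const char *name) {
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL && value[0] != '\0';
}

/* Returns how many of the variables above are set in this process. */
int count_dyld_hook_vars(void) {
    int found = 0;
    size_t n = sizeof dyld_hook_vars / sizeof dyld_hook_vars[0];
    for (size_t i = 0; i < n; i++) {
        found += env_is_set(dyld_hook_vars[i]);
    }
    return found;
}
```

A non-zero result from count_dyld_hook_vars() is only a hint that one of the first two approaches may be in use. Legitimate tools set these variables as well.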
• Please refer to the full ThreatMetrix Labs report for technical details. You can
request a copy of the report by contacting us at labs@threatmetrix.com
Easy enough. So all we need to do is put this address into the lazy symbol pointer of our injected code.
We could, of course, do this for every symbol we wanted to use, but it’s easier to let dlsym() do the
work for us.
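For readers unfamiliar with dlsym(), the sketch below shows only the lookup itself, in an ordinary process looking up a symbol in its own address space. It does not show injection, and it is not the code from the full report: lookup_symbol and strlen_via_dlsym are hypothetical helper names, and strlen is just a convenient symbol that is always present.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef size_t (*strlen_fn)(const char *);

/* Look up a symbol by name among the images already loaded into this
 * process. Returns NULL if the symbol cannot be found. */
void *lookup_symbol(const char *name) {
    assert(name != NULL);
    /* A NULL path asks for a handle to the main program and the
     * libraries it was loaded with. */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL) {
        return NULL;
    }
    void *address = dlsym(self, name);
    dlclose(self);
    return address;
}

/* Call strlen through the address that dlsym() returned. */
size_t strlen_via_dlsym(const char *s) {
    strlen_fn fn = (strlen_fn)lookup_symbol("strlen");
    assert(fn != NULL);
    return fn(s);
}
```

The point is that the caller needs only the name of the function. The dynamic linker already knows where each library was loaded, so it can supply the address at run time. On some older Linux systems the program must also be linked with -ldl; on Mac OS X no extra library is needed.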