1Buttercup On Network-based Detection of Polymorphic B.docx

1
Buttercup: On Network-based Detection of
Polymorphic Buffer Overflow Vulnerabilities
Archana Pasupulati, Jason Coit, Karl Levitt, S. Felix Wu
Department of Computer Science
University of California, Davis
{pasupula, coit,levitt, wu}@cs.ucdavis.edu
S.H. Li, R.C. Kuo, Kuo-Pao Fan
Computer Communication Labortory
Industry Technology Research Institute
{shli, rckuo}@itri.org.tw
Abstract — Attack polymorphism is a powerful tool for the
attackers in the Internet to
evade signature-based intrusion detection/prevention systems.
On the other hand, new and faster
Internet worms can be coded and launched easily by even high
school students at any moment of
time to against our critical infrastructures such as DNS or
update servers. And, we believe that

polymorphic Internet worms will be developed in the future
such that many of our current
solutions might have very small chance to survive. In this
paper, we propose a simple solution
called “Buttercup” to counter against attacks based on buffer-
overflow exploits (such as
CodeRed, Nimda, Slammer, and Blaster). We have implemented
our idea in SNORT, and included
19 return address ranges of buffer-overflow exploits. With a
suite of tests against 13 TCPdump
traces, the false positive for our best algorithm is as low as
0.01%. This indicates that,
potentially, Buttercup can drop 100% worm attack packets on
the wire while only 0.01% of the
good packets will be sacrificed.
I. Introduction
Since a signature-based Network Intrusion Detection System
(NIDS) identifies an attack instance
by exactly matching attack signatures against the incoming and
outgoing data packets, when the
well-known attacks are modified/transformed differently, the
NIDS might fail due to its inability
to match them in its signature database. Sometimes, we call

these transformed attacks (but all
from one single original attack signature, for the purpose of IDS
evasion) “polymorphic attacks”.
2
In this paper, we propose a new solution to accurately identify
one particular type of polymorphic
attacks, known as polymorphic shellcode. Due to the space
limitation, solutions for dealing with
other types of polymorphic attacks are discussed in [1].
Under the polymorphic shellcode attacks, the attacker can
choose an unknown encryption
algorithm to encrypt the attack code and include the decryption
code as part of the attack packet.
The trick to make the whole thing work is to utilize an existing
buffer-overflow exploit and to set
the “return” memory address on the over-flowed stack to be the
entrance point of the decryption
code module. The attacker can transform every other bit in the
packet payload to avoid being
detected by a signature-based IDS, but a critical constraint
exists on the range of the “return”
memory address that can be twisted. Our solution, Buttercup, is

simply to identify the ranges of
the possible return memory addresses for existing buffer-
overflow exploits, and if a packet
contains such addresses, a red/yellow flag might be raised. For
the evaluation of false positive, we
have modified SNORT and selected 19 exploits to run against
13 different TCPdump traffic files.
For one of our range matching algorithms, the false positive is
as low as 0.01%, while other
simpler algorithms are all below 1.13%.
One significant motivation/objective for the Buttercup project is
to identify and drop internet
worm attacks at the edge of the Internet. Today, most existing
solutions for worms require a
human analysis of the worm binary code, first, and then develop
signatures for catching the
worms. Unfortunately, we believe that the process of worm code
analysis will take some
significant amount of time, and by the time the puzzle is solved,
the damage has been widely
spread and maybe uncontrollable. If the attacker developed a
polymorphic version of worms, the
analysis will be much harder because, first, we need to
understand an unknown encryption

algorithm. However, since all the worms (CodeRed, Nimda,
Slammer, Blaster) utilize some
existing “buffer-overflow” exploits, if we can recognize the
potential return memory address
ranges, we can catch it from the birth of the worms.
Furthermore, when the worm is identified via
the Buttercup module, it should be dropped immediately. For
worms, based on our experience,
3
we have to drop every single worm packet, otherwise, they will
be spreading themselves in very
high speed. With Buttercup, we can drop all worms based on the
known buffer-overflow
vulnerabilities, while, according to our evaluation, only 0.01%
of the good packets in the Internet
will be mistakenly dropped.
II. SNORT, a Signature-Based IDS
SNORT [2,3] is an open source lightweight signature-based IDS
and it is a representative of any
signature-based IDS. Snort rules are simple to write, yet
powerful enough to detect a wide variety

of hostile or merely suspicious network traffic. An example rule
below contains protocol,
direction, port, and other attack related information:
alert tcp any any -> 10.1.1.0/24 80 (content: "/cgi-bin/phf";
msg: "PHF probe!";).
There are, however, some weaknesses in a signature-based
NIDS like SNORT, and these
weaknesses can be exploited by an attacker to evade the NIDS
and to successfully attack his/her
target. SNORT has a preprocessor, spp_fnord, for detecting
polymorphic shellcode, by searching
for a certain length pattern of no-op like characters, but it is
port and length dependent. Please
note that a really skillful attacker can avoid or transform the no-
op operations as well.
III. Some Background about Buffer Overflow
On many C implementations, writing past the end of an array
declared auto in a routine causes the
execution stack to get corrupted. This code is said to smash the
stack [4] and can cause return
from the routine to jump to a random address. By placing our
own code at a particular memory
location, and causing the return address variable on stack to
point to that location, it is possible to

take over control of a system and obtain root privileges on it.
Over the last few years, there has
been a great increase in the number of buffer overflow
vulnerabilities being discovered and
exploited. Some of the examples of attacks exploiting buffer
overflow vulnerabilities are Code
Red I, Nimda, SQL/Sapphire/Slammer, and Blaster worms.
Processes are divided into three regions: Text, Data and Stack.
The text region is fixed by the
program and includes code (instructions) and read-only data.
This region corresponds to the text
4
section of the executable file. This region is normally marked as
read-only and any attempt to
write to it will result in a segmentation violation. The data
region contains initialized and
uninitialized data. Static variables are allocated at load time on
the data segment and dynamic
variables are allocated at run time on the stack.
A stack is an abstract data type, which has the LIFO (last in,
first out) property i.e., the object that

has been placed last on the stack will be the first object
removed. The stack is used to
dynamically allocate the local variables used in functions, to
pass parameters to functions, and to
return values from functions as shown in Fig. 1. It also stores
the return addresses for function
calls i.e. the address of the instruction to be executed after the
return from the function call. This
is what makes it vulnerable.
A buffer is a contiguous block of computer memory that holds
multiple instances of the same data
type. A buffer overflow [5,6,7] is the result of stuffing more
data into a buffer than it can handle.
A typical example is when a function copies a supplied string
into an allocated buffer space
without bounds checking by using a strcpy() instead of
strncpy(). The contents of the supplied
string that do not fit into the allocated buffer space overwrite
the bytes after the allocated buffer
space in the stack, including the return address. When the stored
return address on the stack gets
replaced by some arbitrary value due to a buffer overflow, the
function returns and tries to read
the next instruction from that address. This results in a

segmentation violation.
bottom of/top of top of/bottom of
memory /stack memory/stack
buffer sfp ret *str
<------ [ ][ ][ ][ ]
Fig. 1: Structure of a stack
By sending a string that overflows a buffer such that it fills the
return address on the stack, with
an address where arbitrary code is placed by the attacker, he/she
could use the buffer overflow
vulnerability to execute his/her own code. This kind of an attack
is mostly used by a malicious
user to gain root access on a machine and to execute code on it.
In most cases, a buffer overflow
attack is simply used to spawn a shell. From the shell, other
commands can be issued. The
5
hexadecimal representation, of the commands in machine
language, which are used to spawn a
shell, is sent as a part of the string that is used to overflow the
buffer. This string is thus called the
shell code.

Importance of return address: When a buffer overflow
vulnerability is discovered, the most
important requirement for an exploit to work is to get the return
address right. A buffer overflow
exploit involves loading shellcode onto the buffer we are
overflowing and overwriting the return
address variable of the stack frame (which contains parameters
to a function, its local variables,
and the data necessary to recover the previous stack frame,
including the value of the instruction
pointer at the time of the function call) so it points back into the
buffer. Hence, the address placed
in the return address variable would be a value within the
address space allocated for the process
i.e., the shellcode is executed off the stack. If the shellcode
occupies a portion of memory other
than the memory space of the program we are trying to exploit,
a segmentation violation occurs.
The problem faced when trying to overflow the buffer of
another program is to figure out at what
address the buffer (and thus the exploit code) will be. The
answer is that, for every program, the
stack starts at the same address. Most programs do not push
more than a few hundred or a few

thousand bytes into the stack at any one time. Therefore by
knowing where the stack starts one
can try to guess where the buffer one is trying to overflow, will
be. The program can take as a
parameter the buffer size, and an offset from its own stack
pointer (where we believe the buffer
we want to overflow may live). This method of guessing the
offset is only applicable to local
buffer overflow exploits and to exploits that are run on the same
operating system as the target
machine.
However, trying to guess the offset, even while knowing where
the beginning of the stack lives, is
nearly impossible. The problem is, we need to guess exactly
where the address of our code will
start. If we are off by one byte more or less, we will just get a
segmentation violation or an invalid
instruction. One possible solution is to pad the front of the
buffer overflow with NOP instructions
that perform NULL operations. Hence, half of the overflow
buffer is filled with them. The shell
6

code is placed at the center, and then followed with the return
addresses. If the return address
points anywhere in the string of NOPs, they will just get
executed until they reach the shell code.
Assuming the stack starts at 0xFF, that S stands for shell code,
and that N stands for a NOP
instruction, the new stack would look like this:
bottom of EEEEEEEEEEE EEEE FFFF FFFF FFFF FFFF
top of
memory 123456789AB CDEF 0123 4567 89AB CDEF
memory
buffer sfp ret a b c
<----- [NNNNSSSSSSS][0xE2][0xE2][0xE2][0xE2][0xE2]
^ |
|_____________________|
Fig. 2: Structure (graphical representation) of a buffer
overflow exploit
IV. Polymorphic shellcode
Polymorphic shellcode [8] is basically a functionally equivalent
form of a buffer overflow exploit
with a different signature on the network. The attack code is
subtly transformed such that it looks
different from the known signature. As it hits the target
machine, it reassembles, having eluded
the IDS [9].

A well-known tool that generates polymorphic shellcode is a
polymorphic buffer-overflow engine
called ADMutate [10]. An attacker feeds the ADMutate a buffer
overflow exploit to generate
hundreds or thousands of functionally equivalent exploits [11].
This is accomplished by using
simple encryption techniques, along with the substitution of
functionally equivalent machine-
language instructions. This confuses many IDS tools (including
Snort) that search for the familiar
NOP sled or the known machine-language exploit included in
buffer overflows, as ADMutate
dynamically modifies these elements.
A buffer overflow attack script consists of three parts, a set of
NOPs, the shellcode, and the return
address in the form [NNNN][SSSS][RRRR]. In polymorphic
shellcode, the NOPs are replaced by
a random mix of no-effect instructions and the shellcode is
encrypted differently each time, thus
making signature-based detection by an NIDS, that looks for
NOPs or certain strings within the
shellcode, impossible. Having generated encoded shellcode and
substituted NOPs, ADMutate

7
then places the decoder in the code. The shellcode is then of the
form
[NNNN][DDDD][SSSS][RRRR], where “D” represents the
decoder. It is not possible to detect
the decoder either since techniques such as multiple code paths,
non-operational pad instructions,
out-of-order decoder generation and randomly generated
instructions make it look different each
time. The use of sliding keys effectively eliminates the ability
to recover the plaintext shellcode
by means of the reversible nature of xor. The only part of the
script that remains constant through
each instance of a buffer overflow attack is the return address.
In fact, even the return address is
modified by modulating its least significant bit, but when this is
done, sometimes, the address
may no longer be valid when it hits the target. Hence, we intend
to use this part of a buffer
overflow attack script in enabling an IDS to detect polymorphic
shellcode.
V. ButterCup: an IDS architecture against Attack Polymorphism

As we saw above, one solution to the problem of determining
the return address to exploit a
buffer overflow vulnerability is, to pad the front of the
shellcode with NOP instructions. If the
return address points anywhere within the NOPs, they will just
get executed till the exploit code
is reached. Using this method, the exploit might work for a
certain range of the offset values since
the return address could point anywhere within the string of
NOPs.
Hence, for every buffer overflow vulnerability, the return
address is overwritten with a value,
which can only lie within a certain range of values (the process’
address space). By determining
the address range for a particular buffer overflow exploit and
looking for values that lie within
this range, in incoming packets, we hope to detect the exploit.
Determination of address range values: Determining a lower
limit and an upper limit within
which the return address can fall can reduce the range of values,
which need to be checked,
further. The lower limit would be the address at which the
buffer starts since the string we send to
overflow starts at the start of the buffer and cannot be placed in

a memory area with an address
less than the address of the buffer.
8
Let’s take a look at the example we saw above (fig. 2). In this
example, since the buffer starts at
address 0xE1, the lower limit of our address range would be
0xE1 and not any value lower than
that. Since the string in the example can be changed by
increasing or decreasing the number of
NOPs, we try to determine a suitable range that would help us
detect the attacks even if the
number of NOPs is changed.
In addition to having the form [NNNN][SSSS][RRRR], the
attack script can also be of the form
[RRRR] [NNNN][SSSS], especially in cases where the buffer is
small. In this case, the buffer and
the return address field are filled up with the address where the
shellcode is to be found. The
attack in this case looks like this:
bottom of DDDDDEEEEEEEEEEEE EEEE FFFF FFFF
FFFF FFFF top of
memory BCDEF0123456789AB CDEF 0123 4567 89AB

CDEF memory
<------
[0xF80xF80xF80xF8][0xF8][0xF8][NNNN][SSSS][SSSS]
| ^
|_________|
Fig. 3: Structure of a buffer overflow exploit demonstrating the
range of address values
In the case where the attack is of the form
[NNNN][SSSS][RRRR], the upper limit would be the
(address of the return address field - length of the shellcode).
In the case where the attack is of
the form [RRRR] [NNNN][SSSS], the upper limit would be
(bottom of stack – length of the
shellcode). The higher of these two values is obviously the one
in the second case.
We have thus determined that there is definitely an upper limit
and a lower limit within which the
return address of the shellcode of a buffer overflow exploit
should fall. The only task left now is
determining the address range. This range of values can aid in
the detection of a particular buffer
overflow exploit.
An example: We modified the values, for the offset from the

stack pointer and the number of
NOPs, in an exploit code that exploited a local buffer overflow
vulnerability and found that there
definitely was a range of values within which the return address
value had to fall. If the values of
the offset and number of NOPs were changed such that the
return address value fell outside this
9
range, there was a segmentation fault. The lower value of the
range was found to be 0xbffff62c,
which was the point where the buffer started, and the higher
value was 0xbffff9c4.
By analyzing the exploit codes, one can determine the range of
the return address values. The
solution we provide is to enable Snort to analyze packets and
check for 32-bit values, which lie
within the range of addresses for a particular buffer overflow
vulnerability.
Implementation of proposed solution: We implemented the
solution by including a new keyword
in Snort-2.0.0 called “range”. We call this implementation of
our solution in Snort, Buttercup. In

Buttercup, a new detection plugin file named sp_range_check
was included, which takes 32 bits
at a time from the payload of the incoming packet, starting from
the first byte, and compares it
against the two values provided as the values for the “range”
keyword. If it lies within the range,
then the buffer overflow alert corresponding to those return
address values is generated. Else, the
32 bits starting from the next byte are compared with the two
values. The range values are
obtained by getting the return address used for a particular
buffer overflow exploit and initially,
the lower limit is taken to be a value –200 from the return
address value and the upper limit is a
value +200 from the return address value. In this way, the entire
packet is analyzed. An example
of a rule to detect a buffer overflow exploit using the range
keyword is as follows:
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS
$HTTP_PORTS (msg:"Intel PXE buffer
overflow"; range:"|bfffef94-bffff124|";)
From the exploit code for the Intel PXE Buffer overflow, the
return address of the shellcode was
determined to be 0xbffff05c. The lower limit was obtained by
subtracting 200 from this value

(0xbffff05c – 200 = 0xbfffef94) and the upper limit was
obtained by adding 200 to this value
(0xbffff05c + 200 = 0xbffff124). A rules file named my.rules
was included in the rules directory
of snort and 22 rules were included for 22 different buffer
overflow attacks, and the code was
tested for false positives. Among the 22 rules, 3 were later
commented out since they generated a
lot of false positives.
10
Among these, the Sendmail’s prescan buffer overflow, the
XFree86 XLOCALEDIR buffer
overflow, the Linux ATM buffer overflow and the KON buffer
overflow vulnerabilities are local
vulnerabilities i.e., they can be exploited by a local user to gain
root privileges.
We also obtained a range for the Microsoft Windows RPC
Buffer Overflow vulnerability, which
was exploited by the very recent Blaster worm that caused a lot
of damage worldwide. We
obtained this range by studying some exploit codes for this
vulnerability. The lower range and

higher range values were found to be 0x77d73713 and
0x77f92b63 respectively. The return
address values are different for different versions and service
packs of the Windows operating
system, which the code exploits, and hence, these values were
derived by subtracting 200 from
the lowest of the return address values and adding 200 to the
highest of the return address values.
However, a rule for detecting this attack wasn’t added to our
rules file before we performed all
the tests, since this vulnerability was exploited only recently.
Steps proposed to reduce false positives: In order to reduce the
number of false positives further, 2
other keywords, ‘rangeoffset’ and ‘rangedepth’ were introduced.
The value provided with the
‘rangeoffset’ indicates the starting point in the packet payload
from where the 32-bit values are
checked. The ‘rangedepth’ sets the maximum search depth for
the range check function to search
from the beginning of its search region. The ‘rangeoffset’ and
‘rangedepth’ options are used as
modifiers to rules using the ‘range’ option keyword. By
carefully studying the buffer overflow

exploit code, we can determine the part of the shellcode in
which the return address is placed and
thus provide values for the above two option keywords. We also
used the ‘dsize’ option keyword,
already implemented in Snort, in order to flag alerts only for
those packets that have payloads
whose length falls within a given range in addition to
containing the particular return address
values. Using these three additional keywords, the number of
false positives was brought down
considerably. An example of a Snort rule containing the ‘dsize’,
‘rangeoffset’ and ‘rangedepth’
keywords, in addition to the ‘range’ keyword, is as follows:
11
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS any
(msg:"MSSQL2000 remote UDP
exploit"; range:"|42ae1000-42b0caa4|"; dsize:475<>550;
rangeoffset:97; rangedepth:20; )
The above rule is used for detecting attacks that exploit the MS
SQL 2000 buffer overflow
vulnerability. We detect these attacks by looking for values
lying between 42ae1000 and

42b0caa4, only in packets whose size falls in the range 475-550.
Also, we look for these values
starting from the 97th character in the packet payload and only
within 20 characters from the
starting point. As we can see from this example, this greatly
reduces the amount of processing
that the IDS needs to do since it looks for the address values
only in certain packets and only in
certain portions of those packets instead of searching all the
packet payloads from start to finish.
The values for the ‘dsize’ keyword were obtained by studying
the exploit codes for each of the
buffer overflow vulnerabilities mentioned above, and
determining the size of the shellcode.
Similarly, the values for ‘rangeoffset ‘ and ‘rangedepth’ were
obtained by studying the shellcode
and determining exactly which parts of the shellcode contain the
return address values. However,
due to the complex nature of the shellcodes, we could not get
values for the ‘dsize’, ‘rangeoffset’
and ‘rangedepth’ keywords for all of the exploits.

Buffer Overflow OS/application Low address High
address Dsize<> Range Range
vulnerability value value Offset Depth
- ATFTPd buffer overflow Linux-Debian 3.0 0x08055544
0x080556d4 502<>570 248 16
- Snort TCP Stream Reassembly
Integer Overflow Snort 1.9.1 0x0819fdfa
0x0819ff8a 3830<>3900 646 4
- IIS WebDAV buffer overflow Windows 0x4142427c
0x4142440c 1014<>1050 922 102
- MSSQL200 Remote UDP
exploit Windows 0x42ae1000 0x42b0caa4
475<>550 97 20
- SQL/Sapphire/Slammer worm Windows 0x42b0c914
0x42b0caa4 - - -
- IIS5.0 .idq overrun Windows 0x77e51616
0x77e517a6 1048<>1100 - -
- Code Red Worms Windows (IIS) 0x7801cb0b
0x7801cc9b 285<>350 245 -
- Kerio Personal Firewall buffer
overflow Windows 0x780705c8 0x78070758
5267<>6050 5268 4
- Sendmail’s prescan buffer
overflow FreeBSD 0xbffe1e6e 0xbffe1ffe
3062<>3100 - -
- File buffer overflow Linux 0xbfffbc78 0xbfffbe08
6180<>6250 0 20
- PGP4Pine buffer overflow Linux 0xbfffdb08

0xbfffdc98 291<>350 256 45
- ISC DHCPD buffer overflow Linux 0xbfffdd70
0xbfffdf00 240<>300 - -
- XFree86 XLOCALEDIR
buffer overflow Xfree86 v 4.2.x 0xbfffe7a5
0xbfffe935 5090<>6050 58 -
- Intel PXE buffer overflow Linux -Red Hat 0xbfffef94
0xbffff124 1018<>1050 1024 4
- PoPToP PPTP Server buffer
overflow Linux 0xbffff478 0xbffff648
490<>550 320 4
- GKrellM buffer overflow Linux-Debian 3.0 0xbffff703
0xbffff893 490<>550 156 4
- Linux ATM buffer overflow Linux 0xbffff798
0xbffff928 242<>300 227 25
- KON buffer overflow Linux 0xbffffef9
0xc0000089 790<>850 - -
- WebAdmin.exe buffer
overflow Windows 2000 0xd6bf523
0xd6bf53cf - - -
12
Table 1: Buffer overflow vulnerabilities, included in the rules
file, their address ranges and the values of
‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords.
We hope that through deeper evaluation, these values can be
obtained for all the exploits. Table 2
below lists the buffer overflow vulnerabilities for which rules

have been included in our version
of Snort, alongwith the address ranges (in hex) we look for
(when the range is +-200), and the
values for the ‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords
for all of our rules.
The symbol ‘<>’ denotes that the address range is checked only
in packets whose size falls within
a certain range. The lower value of this range was determined
by subtracting a value of 10 from
the size of the shellcode obtained from the buffer overflow
exploit, and the higher value was
obtained by rounding off, the value determined from the
shellcode, to the nearest 50. We also
performed tests checking for address ranges only in packets
whose size exceeds a certain value
and in this case, the symbol ‘>’ is used with the ‘dsize’
keyword.
VI. Simulation and Analysis
In this section, we describe the various tests that were
performed on Buttercup in order to
compare its performance with the original version of Snort. In
order to determine the performance
of our IDS architecture against polymorphic shellcode, various
parameters, such as ‘range’ and

‘dsize’ values, were changed in our implementation and the
performance of Snort observed in
terms of processing time and percentage of alerts generated.
Simulation: For our simulation, approximately 50 real tcpdump
files of network traffic were
obtained from the MIT Lincoln Laboratory IDS evaluation Data
Sets. These tcpdump files were
provided as input to Buttercup, which included the ‘range’,
‘dsize’, ‘rangeoffset’ and
‘rangedepth’ keywords and 19 new rules. Buttercup was then
tested for false positives on each of
these files.
13
In Table 2, we look at the total number of packets that each of
the tcpdump files has, and we then
compare the number of alerts generated by the unmodified Snort
against the number of alerts
generated by Buttercup. The version of Buttercup used in this
case has rules that have the range
values of +-200 and do not include the ‘dsize’, ‘rangeoffset’ and
‘rangedepth’ keywords.

Table 3 depicts the results obtained in the form of the
percentage of alerts generated i.e. (no. of
alerts / no. of packets) when several tcpdump files were taken
as input by Buttercup. In order to
observe how the number of alerts would change when the range
values were changed, we present
the percentage of alerts for range values of +-50, +-100, +-200,
+-250, +-300, +-400 and +-500 in
table 1 below.
Table 4 again depicts the change in the percentage of alerts, but
his time, comparison is made
between the cases where the rules have just the ‘range’ keyword
alone, the rules have the ‘dsize’
keyword, with symbol ‘<>’, in addition to the ‘range’ keyword,
the rules have the ‘range’,
‘dsize’, ‘rangeoffset’ and ‘rangedepth’ keywords and the
symbol ‘>’ is used with the ‘dsize’
keyword.
Since, in the above two cases, we only want to concentrate on
how many alerts Buttercup
generates due to the buffer overflow rules we have added, we
only include our rules file my.rules
in the configuration file, snort.conf.

Finally, Table 5 depicts the change in the processing times of
original Snort and Buttercup. In this
case, since we are concerned about how our modified Snort
compares with the unmodified Snort,
we include all the rules files in the configuration file,
snort.conf.
Fig. 5 and fig. 6 are graphical representations of the results
presented in Table 3. Fig.7 is a bar
graph representing the results presented in Table 4.
Tcpdump files Total no. of No. of Snort No.
of Buttercup
packets alerts alerts
inside.tcpdump-00 159658 87 1064
outside.tcpdump-00 583050 132 4242
sampledata01-dump 14523 38 32
tcpdins-00 649787 34056 2023
tcpdwk1mon-98 634595 174 7131
tcpdwk1tue-98 598569 165 6417
tcpdwk2wed-98 811678 169 6402
tcpdwk2thu-98 966468 273 9536
tcpdwk2fri-98 475060 37725 1423
tcpdinswk1mon-99 1492331 20394 7533
tcpdinswk1tue-99 1237119 3435 7161
tcpdinswk1wed-99 1726319 37316 8994

14
Table 2: Total no. of packets and no. of alerts generated by
Snort and Buttercup for
various tcpdump files
Table 3: Percentage of alerts generated by Buttercup for various
address ranges and tcpdump files

Table 4: Percentage of alerts generated for various versions of
Snort for a range value of +-200.
where
RANGE
Tcpdump files
+-50 +-100 +-200
+-250 +-300 +-400 +-500
inside.tcpdump-00 0.3488 0.6213 0.6664 0.6746
0.6927 0.7165 0.7973
outside.tcpdump-00 0.3967 0.6727 0.7276 0.7356
0.7541 0.7797 0.8174
sampledata01-dump 0.1928 0.2066 0.2203 0.2203
0.2203 0.2203 0.2617
tcpdins-00 0.1617 0.2846 0.3113
0.3176 0.3296 0.3421 0.3684
tcpdwk1mon-98 0.7181 0.9904 1.1237
1.1336 1.2203 1.2704 1.3104
tcpdwk1tue-98 0.6796 0.9422 1.0721
1.0804 1.1566 1.2009 1.2415
tcpdwk2wed-98 0.4927 0.7035 0.7887
0.8080 0.9022 0.9390 1.0755
tcpdwk2thu-98 0.5730 0.8782 0.9867
1.0081 1.1236 1.1945 1.3179
tcpdwk2fri-98 0.2450 0.2823 0.2995 0.3092
0.3360 0.3517 0.5054
tcpdinswk1mon-99 0.2630 0.4639 0.5048
0.5125 0.5337 0.5551 0.6076
tcpdinswk1tue-99 0.3039 0.5186 0.5788
0.5875 0.6125 0.6366 0.7188
tcpdinswk1wed-99 0.2678 0.4734 0.5210
0.5284 0.5431 0.5634 0.6031

tcpdinswk2mon-99 0.2670 0.4439 0.4927
0.4996 0.5172 0.5393 0.5749
Snort versions
Tcpdump files
BC-range BC-range- BC-range- BC-range-
BC-range-
dsize<> dsize<>-RO-RD dsize> dsize>-RO-
RD
Inside.tcpdump-00 0.6664 0.0144 0.0138 0.5293
0.2468
Outside.tcpdump-00 0.7276 0.0245 0.0249 0.5987
0.2833
sampledata01-dump 0.2203 0 0 0.2203 0.0275
tcpdins-00 0.3113 0.0057 0.0051 0.2408
0.1039
tcpdwk1mon-98 1.1237 0.0077 0.0093 0.9642
0.3240
tcpdwk1tue-98 1.0721 0.0075 0.0092 0.9275
0.3229
tcpdwk2wed-98 0.7887 0.0067 0.0059 0.6779
0.2592
tcpdwk2thu-98 0.9867 0.0153 0.0110 0.8788
0.3857
tcpdwk2fri-98 0.2995 0 0 0.2804 0.0539
tcpdinswk1mon-99 0.5048 0.0138 0.0132 0.4020
0.1916
tcpdinswk1tue-99 0.5788 0.0122 0.0117 0.4210
0.2202
tcpdinswk1wed-99 0.5210 0.0106 0.0099 0.4083
0.1902
tcpdinswk2mon-99 0.4927 0.0107 0.0098 0.3845
0.1710

15
BC-range – Buttercup with only ‘range’ keyword and range of
+-200.
BC-range-dsize<> – Buttercup with range of +-200 and ‘dsize’
<> values (values derived from
size of shellcode).
BC-range-dsize<>-RO-RD – Buttercup with ‘range’ of +-200
and ‘dsize’ <> values (values
derived from size of shellcode) and ‘rangeoffset’ and
‘rangedepth’ keywords included.
BC-range-dsize> – Buttercup with ‘range’ of +-200 and ‘dsize’
> value (size of shellcode
obtained from buffer overflow exploits).
BC-range-dsize>-RO-RD - Buttercup with range of +-200 and
‘dsize’ > value (size of shellcode
obtained from buffer overflow exploits) and ‘rangeoffset’ and
‘rangedepth’ keywords included.
Table 5: Processing times (in seconds) of different versions of
Snort

where
Snort-2.0.0 - original snort-2.0.0 with all rules files included in
snort.conf.
BC-range – Buttercup with all rules files included in snort.conf
and only ‘range’ keyword with a
range of +-200.
BC-range-dsize<> – Buttercup with all rules files included in
snort.conf and ‘range’ keyword
with a range of +-200 and ‘dsize’ (<> values) keyword.
Snort versions
Tcpdump files
No. of packets Snort-2.0.0 BC-range BC-range-
BC-range-
dsize<> dsize<>-RO-RD
phase-1-dump-00 40 0.311 0.301 0.308
0.314
phase-1-dump-2-00 4 0.156 0.181 0.162
0.160
phase-2-dump-00 158 0.12466 0.21532 0.12144
0.13130
phase-2-dump2-00 6 0.1394 0.1335 0.1330
0.1353
phase-3-dump-00 225 0.10749 0.19784 0.12670
0.42505
phase-3-dump-2-00 72 0.12826 0.19548 0.46062
0.47378
phase-4-dump-00 520 0.54868 0.36267
0.33663 1.63335
phase-4-dump-2-00 203 0.17332 0.53444

0.54236 0.76637
phase-5-dump-2-00 954 0.30983 0.36433 0.35798
0.48870
sampledata01-dump 14523 1.33127 5.76940
3.51988 3.43390
tcpdwk3mon-98 793256 73.11217 215.2422
219.2653 230.4325
tcpdwk3tue-98 393566 37.42337 135.6459
125.3525 149.3899
16
BC-range-dsize<>-RO-RD – Buttercup with all rules files
included in snort.conf and ‘range’
keyword with a range of +-200 and ‘dsize’ (<> values),
‘rangeoffset’ and ‘rangedepth’ keywords.
Fig. 5: Graph showing change in percentage of alerts with
change in address range values for 2 tcpdump
files.
Bar graph for pe rce ntage of ale rts for various addre s s range
values
0
0.2
0.4
0.6

0.8
1
1.2
inside.tcpdump_00 tcpdw k2w ed-98 tcpdinsw k1w ed-99
Tcpdum p file s
P
er
ce
nt
ag
e
of
a
le
rt
s
+-50
+-100
+-200
+-250
+-300

+-400
+-500
Fig. 6: Bar graph showing percentage of alerts for various
address range values for 3 tcpdump files.
Percentage of alerts vs Range values
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 100 200 300 400 500 600
Range values
P

er
ce
nt
ag
e
of
a
le
rt
s
inside.tcpdump-00
outside.tcpdump-00
17
Bar graph s how ing pe rce ntage of ale rts for various ve rs
ions of Snort
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8

0.9
inside.tcpdump_00 tcpdw k2w ed-98 tcpdinsw k1w ed-99
Tcpdum p file s
P
er
ce
nt
ag
e
of
a
le
rt
s
BC-range
BC-range-dsize<>
BC-range-dsize<>-RO-RD
BC-range-dsize>
BC-range-dszie>-RO-RD
Fig. 7: Bar graph showing percentage of alerts for various
versions of Snort for 3 tcpdump files
Performance: In Table 2, we see that the number of alerts
generated by Buttercup is far greater
than those generated by Snort. This is most probably due to a

large number of false positives
generated by Buttercup since the version of Buttercup used here
does not contain the ‘dsize’,
‘rangeoffset’ and ‘rangedepth’ keywords.
From Table 3, we observe that as the address range values are
increased from +-50 to +-500,
there is a corresponding rise in the percentage of alerts
generated. The rise is however, not a
linear one, as can be observed from the graph in fig. 5. The
percentage of alerts increase sharply
between a range of 50 and 100, increases less sharply for range
values between 100 and 200,
doesn’t change too much for the range values between 200 and
300 and again between range
values of 300-400 and 400-500, the increase is pretty sharp.
From Fig. 6, which shows the bar
graph comparing the percentage of alerts for the various address
range values, we can see that the
percentage of alerts for the range values of 200 and 250 are the
closest in value. It can thus be
safely concluded that the optimum range values are between 200
and 250.
Table 4 compares the percentage of alerts generated for
different versions of Buttercup for

various tcpdump files. It can be observed from Fig. 7, which
shows the bar graph more clearly
depicting the percentage of alerts for the various versions of
Snort, that the percentage of alerts is
18
the greatest when only the ‘range’ keyword is used, is lesser
when the ‘dsize’ (with symbol ‘<>’)
keyword is included and is the least when the ‘rangeoffset’ and
‘rangedepth’ keywords are also
included. Hence, by studying a buffer overflow exploit carefully
and determining the size of the
shellcode and the part of the shellcode that contains the return
addresses, the number of false
positives can be brought down considerable, thereby, enabling a
more accurate detection of buffer
overflow attacks.
However, the drawback of narrowing the payload in which to
look for address ranges is that some
of the buffer overflow attacks may not be detected if there is a
miscalculation in the ‘rangeoffset’
and ‘rangedepth’ values or if the shellcode is modified
considerably. The same behavior repeats

for the cases where symbol ‘>’ is used with the ‘dsize’
keyword, but the percentage of alerts is far
greater than those where symbol ‘<>’ is used. Hence,
calculating the range in which the size of
the shellcode falls helps us determine buffer overflow attacks
more accurately than just looking
for packets that are larger than a given size. It must be pointed
out here that there is definitely a
range for the size of the shellcode, since there aren’t too many
ways of modifying the size of a
particular shellcode other than varying the number of NOPs
included.
Table 5 compares the processing times of four different versions
of Snort for different tcpdump
files and also lists the number of packets in each file. It can be
observed that the tcpdump files
used aren’t the same as the ones used in the above three cases.
This is because these are the
smaller tcpdump files, which didn’t generate too many alerts
and hence, are unsuitable for
determining the performance of Snort in the first three cases.
These files are, however, useful in
this case, since the larger files cannot be used for determining
the processing times because all the

rules files are included in the snort.conf (since the performance
is compared to the unmodified
Snort with all its rules files included). Since all the rules files
are included, when the large
tcpdump files are used, too many alerts are generated and Snort
halts.
It can be observed that the processing time increases sharply
when the ‘range’ keyword is
included as compared to the unmodified version. However,
when the ‘dsize’ keyword is included,
19
the processing time decreases since only packets whose payload
size falls within a specific
payload are searched for the address ranges. This considerably
brings down the processing time.
We would expect the processing time to decrease further when
the ‘rangeoffset’ and ‘rangedepth’
keywords are added since the payload in which to look for
address ranges is further narrowed
down, but this doesn’t happen. In fact, the processing time
increases slightly. It can be concluded
that this happens due to the extra processing involved with the

inclusion of two new keywords.
Also, it should be noted that this behavior is true for most of the
tcpdump files, but, as can be
observed from Table 5, for some of them, the results are
different. This is due to the fact that due
to the complexity of some of the exploit codes, the ‘rangeoffset’
and ‘rangedepth’ values for all
the rules could not be determined. Hence, some of the rules
have just the ‘range’ and ‘dsize’
keywords, thereby leading to the inconsistency in the results of
the tcpdump files. A final
observation is that as the size of the tcpdump files increases, the
processing time increases
significantly.
VII. Conclusion
In this paper, we focus on the weakness of signature-based
Network Intrusion Detection Systems
in detecting polymorphic attacks. When a regular attack, for
which an IDS already has a signature
available in its signature database, is modified or transformed,
the IDS might fail to identify
correctly. The same principle can be applied to future Internet
worm attacks that we have not seen

before.
We present a new solution here called “Buttercup” to counter
against any attacks based on buffer-
overflow vulnerabilities (such as CodeRed, Nimda, Slammer,
and Blaster). We have implemented
our idea in SNORT, and included 19 return address ranges of
buffer-overflow vulnerabilities. We
introduce three new keywords in SNORT namely ‘range’,
‘rangeoffset’ and ‘rangedepth’ and a
keyword already existing in Snort namely ‘dsize’ to detect
packets with potentially return address
values lying within specific ranges. For evaluation, with a suite
of tests against 13 TCPdump
traces, the false positive for our best algorithm is as low as
0.01%. This indicates that, potentially,
20
Buttercup can drop 100% worm and other attack packets on the
wire while only 0.01% of the
good packets will be sacrificed. We believe that our solution is
simple and practical as normally
an exploit is known long before the worms based on that
particular exploit are developed and

launched.
Currently, Buttercup will need an accurate input of the return
address ranges to be effective. For
high-speed Internet worms, we are currently developing
solutions such that Buttercup can
intelligently discover previous unknown address ranges. With
this particular capability, we can
even handle attacks with totally “unknown” exploits.
Acknowledgement
This research is sponsored by NSF and ITRI.
References
[1] “On Network-Based Attack Polymorphism” MS thesis,
Computer Science Department, UC Davis.
[2] Martin Roesch, “Snort-Lightweight Intrusion Detection for
Networks”.
[3] Martin Roesch, “Snort Users Manual”, Snort Release: 1.9.x.
[4] Aleph One, “Smashing the Stack for fun and profit”,
http://www.phrack.org/show.php?p=49&a=14
[5] “Buffer Overflows Demystified”,
http://www.enderunix.org/docs/eng/bof-eng.txt
[6] Lefty, “Buffer Overruns, what’s the real story?”,
http://destroy.net/machines/security/stack.nfo.txt

[7] Fides, “Simple buffer-overflow exploits”,
http://www.collusion.org/Article.cfm?ID=176
[8] K. Timm, “IDS Evasion Techniques and Tactics”,
http://online.securityfocus.com/infocus/1577
[9] E. Messmer, “Put to the test”,
http://www.nwfusion.com/news/2002/0415idsevad.html
[10] “ADMuate Readme”, http://www.ktwo.ca/readme.html
[11] E. Skoudis, “Sneaking Past IDS”,
http://www.infosecuritymag.com/2002/jul/sneaking.shtml
[12] “Polymorphic Shellcodes vs. Application IDSs”, NGSEC
White Paper, http://www.ngsec.com
CEG 4420/6420
Total 60 Points
1. (25 points) True and False with Justification (2 for
True/False and 3 for Justification for each question)
a. Shellcode developed for i386/Linux cannot be used to exploit
a Buffer-Over-Flow-vulnerable program running on
SPARC/Solaris. (SPARC and i386 correspond to different
instruction sets)
b. The StackGuard solution (“canary word” solution) includes
the operation to push a number into the stack for detection.
c. The control-flow-integrity solution includes the operation to
push a number into the stack for detection.
d. The instruction-set randomization solution includes the
operation to push a number into the stack for detection.
e. A kernelized reference monitor analyzes the interactions
between a user-space process with the kernel (e.g., analyzing

system calls). A browser loads a plugin into its memory in the
user space of the memory and this plugin modifies some critical
data inside the user-space-based memory for this browser. This
modification is visible to the reference monitor.
2. (10 points) In order to launch successful buffer-over-flow
attacks, one problem is we need to guess *exactly* where the
address of our code will start. If we are off by one byte more or
less we will just get a segmentation violation or an invalid
instruction. Please propose one solution to increase the chance
of success. (hint: the answer can be found in the “Smashing The
Stack For Fun and Profit” paper.)
3. (10 points) The shellcode and BoF attack layout discussed in
the “Smashing” article can be summarized in the following
figure (e.g., Figure 2). However, the size of the shellcode is
bounded by the size of the buffer to be exploited. In certain
cases, the size of the shellcode might be too large to be put into
the buffer to be exploited. Please propose one solution to solve
this challenge (you should plot the stack layout of your
solution.) (hint: the answer can be found in Section V. – Page 7
and 8 in the “Buttercup” paper.)
4. (15 points) Consider the following C code fragment. We have
a server program that uses this fragment once to process user
input (both “input” and “i” are provided by unprivileged users).
We compile the server program in a 32-bit operating system, in
which the type of int, memory address, and registers, such as
EIP, EBP, as well as ESP, are represented by 32 bits.
a. Is it possible to design an attack to exploit this “foo” function
so that the CPU will execute an instruction at an arbitrary
address in the user space? Please plot the stack layout and your
attack. (5 points)
b. Will the StackGuard approach be able to detect this attack?
Please justify. (5 points)
c. Will the control flow integrity approach be able to detect this
attack? Please justify. (5 points)

void foo(int i, int *input){
int *arr[10];
arr[i] = input;
}
6
code is placed at the center, and then followed with the return
addresses. If the return address
points anywhere in the string of NOPs, they will just get
executed until they reach the shell code.
Assuming the stack starts at 0xFF, that S stands for shell code,
and that N stands for a NOP
instruction, the new stack would look like this:
bottom of EEEEEEEEEEE EEEE FFFF FFFF FFFF FFFF
top of
memory
^ |
|_____________________|
overflow exploit

the IDS [9].

6code is placed at the center, and then followed with the return
addresses. If the return address points anywhere in the string of
NOPs, they will just get executed until they reach the shell
code. Assuming the stack starts at 0xFF, that S stands for shell
code, and that N stands for a NOP instruction, the new stack
would look like this: bottom of EEEEEEEEEEE EEEE
FFFF FFFF FFFF FFFF top of
memory
^ |
|_____________________|
overflow exploit
the IDS [9].

1Buttercup On Network-based Detection of Polymorphic B.docx

Recommended

Recommended

More Related Content

Similar to 1Buttercup On Network-based Detection of Polymorphic B.docx

Similar to 1Buttercup On Network-based Detection of Polymorphic B.docx (20)

More from aryan532920

More from aryan532920 (20)

Recently uploaded

Recently uploaded (20)

1Buttercup On Network-based Detection of Polymorphic B.docx