  • 1. Tel Aviv University, Raymond and Beverly Sackler Faculty of Exact Sciences. Classification of Traffic Sources on The Internet: A study of the adaptability level of network agents. Thesis submitted as partial fulfillment of the requirements towards the M.Sc. degree, School of Computer Science, Tel Aviv University. By Uri Gilad. The research work in this thesis has been conducted under the supervision of Prof. Hezy Yeshurun, Tel Aviv University (Coordinating Instructor), and Dr. Vern Paxson, International Computer Science Institute. May 2005.
  • 2. Abstract
The subject of investigating the different types and purposes of network intruders has received relatively little attention in the academic community, unlike the extensively researched area of identifying intrusions. In this paper, we examine how adaptive intruders are to changing environments. We investigate the different groups that form when we rank intruders according to their adaptability. This research lends new understanding about the level of threat presented by highly adaptive intruders. The paper is organized as follows: first we examine the current work in intrusion detection and related subjects. We establish ground truth by performing an experiment in a controlled environment. We present and justify our methods of data collection and filtering, and move on to present results based on real data. Finally we lay out our conclusions and discuss directions for further research. The resulting algorithm is a useful tool to rank the intruders attacking a defended network according to their adaptability levels.
  • 3. Contents
Classification of Traffic Sources on The Internet
Abstract
Contents
1. Introduction
2. Algorithm
3. Results
4. Discussion
5. Further work
References
תקציר (Abstract in Hebrew)
  • 4. 1. Introduction
Present day computer security technologies are focused on detecting and protecting against intruders. While there are many solutions for network defense, there has been surprisingly little research on the subject of understanding the nature of attackers. Present day Internet traffic contains a variety of threats, all of which may direct traffic at any random host connected to the network. The traffic originates from sources such as worms, attacking hackers, automated scanning tools and legitimate traffic. The rough classification above begs the question: can one automatically categorize and determine the type of a traffic source? While it may seem secondary, the nature of an attacker is paramount to understanding the seriousness of a threat. Attacks that come from worms and scanning tools are "impersonal" and will not "insist" on lingering at a specific site, while a human may return again and again. A human can employ a wider array of techniques and keep hammering at a site until finally brute-forcing a way inside. In addition, humans can be subject to retaliation by contacting the responsible party, while the threat from autonomous agents does not have a clear identity behind it to contact.
1.1 Traffic Sources on the Internet
A random host connected to the Internet will be scanned and attacked within a short while [21]. Most of the attacks come from automated agents, due to their overwhelming numbers. Automated agents are human-written pieces of software such as worms. Worms are designed to propagate between computer systems by "infecting" a vulnerable machine using a built-in exploit. After infection the worm places a copy of itself on the targeted system and moves on to infect the next victim. The worm thus spawned on the infected system enters the same cycle. Research suggests that probes from worms and autorooters heavily dominate the Internet [37]. As mentioned above, another source of traffic on the Internet is automated scanning tools such as the abovementioned autorooters. Autorooters are designed to quickly inspect as many hosts as possible for information, without the propagation feature. There are benign scanning tools such as web crawlers, designed to collect information for indexes such as Internet search engines. Other scanning tools, however, are designed to seek vulnerabilities to exploit. The people behind the last type of automated scanning tool use the data gathered to return and break into a site found to be vulnerable. Autorooters, an evolution of these automated vulnerability scanners, actually install a backdoor automatically on a vulnerable machine and report the machine's IP address back to the human operating them, who can then take control of the host.
  • 5. Rarely seen on the Internet are fully manual sources of hostile traffic: adaptive sources actually interacting with a specific network site in order to break into it. There is a distinction between these adaptive (possibly human) traffic sources and the two rough classifications above. While humans may employ worms and automated scanning tools, sometimes a human hacker will manually work against a site, with the intention of achieving better results by employing a wider range of techniques. This paper presents a practical method to classify sources of attack on the Internet. The classification is believed to be helpful when determining the nature of an attacker: is it an automated agent or a manual one (a human)? The development and experimentation are based on actual data from several distinct sites.
1.2 Related Work
The subjects of examining the behavior of an attacker on the Internet, understanding an attacker's nature, or classifying the attacker by behavior have not received much attention from the academic community. However, the related subjects of differentiating between humans and computers and of detecting worm outbreaks (a subset of the larger subject of intrusion detection) have been researched extensively. Several papers touch upon this work's ideas. Telling computers and humans apart is a problem discussed by Von Ahn et al. [1]. The researchers describe a technique dubbed "CAPTCHA" (Completely Automated Public Turing Test to Tell Computers and Humans Apart). As the acronym implies, the work details an automatic algorithm to generate a test that human beings can pass while computer programs cannot. The test is based upon "hard to solve AI problems" that human beings can solve easily, while computers cannot. The definition of "hard to solve AI problems" relies on consensus among researchers, and therefore the problem set may change in the future. The proof that human beings can solve these problems is empirical. The work outlined in this paper and the CAPTCHA paper share an interest in telling humans and software agents apart. While the CAPTCHA solution is taken from the domain of AI, and relies on voluntary test takers (if the test is not passed, resources will be denied), this paper will attempt to present an algorithm that deals with more severe restrictions: the differentiation accomplished here is performed without the knowledge or cooperation of the tested subject. In [7] Staniford et al. discuss theoretical advanced worm algorithms that use new propagation mechanisms, random and non-random. Discussed are worms equipped with a hit-list of targets to infect, worms sharing a common permutation to avoid repeating infection attempts, and worms that study the application topology (for example: harvesting email addresses) to decide which computers to target for infection. Staniford and co. envision a "cyber center for disease control" to identify and combat threats. In this paper, we touch upon the task of identifying the threat and categorizing it as an adaptive source attack or an automated agent that may be part of a worldwide infection. The subject of "non-random" agents (such as those worms equipped with a hit-list, for
  • 6. example) is important, as the issue of identifying the origin of the attack (a human who may change tactics, or a worm) is of importance. Zou et al. [2] present a generic worm monitoring system. The system described contains an ingress scan monitor, used for analyzing traffic to an unused IP address space. The researchers observe that worms follow a simple attack behavior while a hacker's attack is more complicated, and cannot be simulated by known models. They suggest using a recursive filtering algorithm to detect new traffic trends that may indicate a case of worm infection. Zou's observation that hacker and worm behavior differ is not expanded upon. The observation is implicitly used here, as this paper describes a method for intruder classification based on behavior. This work will present an algorithm that uses the mentioned difference to group intruders with similar behavior. In [3] Wu et al. expand upon the subject of worm detection mentioned in [2]. Wu discusses several worm scanning techniques and re-introduces the subject of monitoring unassigned IP address space. Jiang Wu and co. discuss several worm propagation algorithms similar to those presented in [7]. The authors discuss worm detection, and propose the hypothesis that random scanning worms scan the unassigned IP address space and that this fact may present a way to detect them. The authors search for common characteristics of worms (as opposed to other agents); one such common characteristic is that a worm will scan a large number of unassigned IP addresses in a short while. Wu and co. suggest an adaptive threshold as a way to detect worm outbreaks. Examining the algorithm's results on traffic traces validates the work, and the conclusion is that unknown worms can be detected after only 4% of the vulnerable machines in the network are infected. Unused IP address monitoring is employed in this research as well; this research draws upon the common characteristic of worms to detect "automaton-like" behavior, as opposed to more random or adaptive behavior. In [17] Jung et al. present an algorithm to separate port scanning from benign traffic by examining the number of connections made to active hosts vs. the number of connections made to inactive hosts. Their observation is that there is a disparity in the number of connections made to active hosts between benign and hostile sources. The disparity in the access attempts to active and inactive hosts is implemented in an algorithm named TRW (Threshold Random Walk), which is used to successfully and efficiently detect scanning hosts. The research by Jung et al. brings to light an important difference between hostile scanners and benign users, which touches on this research. In this research an attempt is made to establish a range of behaviors, ranging from fully automated agents (such as worms) to attackers who react to environmental changes and modify their algorithm of attack accordingly, a behavior indicative of humans. The work done in [17] assumes certain behavior on the part of users: they will access more active sources than inactive sources. This research relies on the diversity of resources accessed as another characteristic useful to classify attackers. In [18] Weaver et al. show that even a simplified version of TRW achieves quite good results, which emphasizes the point that the difference in behavior between naïve and hostile sources is of importance.
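As background, the sequential hypothesis test underlying TRW can be sketched compactly. The following is a minimal Python sketch for illustration only; the success probabilities and decision thresholds are invented placeholders, not the values chosen by Jung et al. [17].

    def trw_classify(outcomes, theta0=0.8, theta1=0.2, eta0=0.01, eta1=100.0):
        """Sketch of Threshold Random Walk (after Jung et al. [17]).

        outcomes: first-contact connection results for one remote source,
        True for success (active host), False for failure (inactive host).
        theta0/theta1 are the assumed connection success probabilities
        under the "benign" and "scanner" hypotheses; eta0/eta1 are the
        decision thresholds. All four defaults are illustrative.
        """
        # Likelihood ratio P(observations | scanner) / P(observations | benign).
        likelihood = 1.0
        for success in outcomes:
            if success:
                likelihood *= theta1 / theta0              # successes push toward "benign"
            else:
                likelihood *= (1 - theta1) / (1 - theta0)  # failures push toward "scanner"
            if likelihood >= eta1:
                return "scanner"
            if likelihood <= eta0:
                return "benign"
        return "undecided"

With these illustrative defaults, a source whose first four connection attempts all fail reaches a likelihood ratio of 4^4 = 256 and crosses the scanner threshold, while a source with mostly successful connections drifts toward the benign threshold.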
  • 7. Spitzner [4] describes the honeypot: a system that has no production value, so that any access to it can be assumed to be illegitimate. In [5] Spitzner makes the distinction between different hackers and bases this distinction on their behavior, as evident from their actions when attacking a network. This work is related to honeypot technology, as it uses the working assumption that most traffic directed at unused IP space is hostile. Further, Spitzner notes the classification of Internet hostiles based on their activity, which is one of the foundations for this research. Spitzner, however, does not expand on the idea of classification beyond describing a specific group of Internet "operatives" dubbed "script kiddies". The intention of this research is to go beyond that and provide a method of telling apart different hostile sources. Lee [6] presents a data mining approach to intrusion detection. The approach presented by Lee is to apply current machine learning techniques to network sessions, extracting features and analyzing them, finally building a decision tree to separate intrusion attempts from legitimate traffic. Building decision trees to aid in classification of traffic is a recognized method, but in this research we chose to rely on expert knowledge, as this approach is assumed to produce better results. In Lee's research the resulting detection rate of below 70% is claimed to be unsatisfactory. The low success rate in Lee's work encouraged us to try the approach presented here.
1.3 Rationale for this study
In this paper, we present a practical algorithm to classify intruders according to their activity as captured at the targeted network. The classification defines a range between automated attacks and fully manual sources of attack such as human hackers. This kind of classification is touched upon in some works, but is not explored to the fullest extent. The algorithm is based on tested hypotheses about human and worm behavioral differences. The method relies on past behavior of simple worms in order to find the common denominator of the behavior of automated attackers. After determining this common denominator, attackers can be ranked and differentiated.
  • 8. 2. Algorithm
2.1 Establishing ground truth
The research presented below is based on the assumption that there are core differences between the traffic generated by a fully manual source of attack, such as a human, and a fully automated source of attack, such as a worm. This assumption was tested in an experiment before embarking on the development of a full-scale algorithm.
2.1.1 Experiment
An initial algorithm was developed to test attacker behavior. The algorithm goes over a pre-recorded network traffic dump. For each traffic source, the algorithm calculates the series of ICMP echo requests and TCP/UDP ports the source accessed on each destination address. For each new target the algorithm processes for this source, the resulting access series is compared to those already performed by the source on different targets. If the access series is different, it is added to the list. It is proposed that a higher number of different access series for a specific source is indicative of the ability to react to changes, and possibly the guidance of a human. The basis for this assumption is that we believe a human will react to different hosts in different ways, producing different access series, while automated sources will act upon a simple deterministic algorithm programmed beforehand, independent of the conditions in the network currently attacked. This algorithm is used to test the access series for humans and for simple automatons/network worms. This is not the same as testing the behavior of an attacker to the fullest extent, which would include examining the data passed in the various sessions in the traffic collected. We believe that establishing that humans and worms differ in their "access series" to targets is a good argument towards proving that there is a difference in behavior between an automaton and a more manual source. This conclusion is valid because a different access series is an example of different behavior. For the remainder of this paper, we term the number of different access series for a single source an "adaptability score". The method of calculating an access series for an intruder is given under section 2.3 – Algorithm details.
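To make the bookkeeping concrete, here is a minimal Python sketch of the access-series calculation described above. It is an illustration under assumed inputs, not the thesis's actual implementation: the (source, destination, service) record format and all names are invented for this sketch, and the full method is specified in section 2.3.

    from collections import defaultdict

    def adaptability_scores(events):
        """Count distinct per-target access series for each source.

        events: an iterable of (source, destination, service) records in
        capture order, where service is a string such as "80/TCP",
        "137/UDP" or "ICMP". This record format is an assumption made
        for the sketch; the text does not fix one.
        """
        # For each source, accumulate the ordered series of services it
        # accessed on each destination address.
        per_source = defaultdict(lambda: defaultdict(list))
        for src, dst, service in events:
            per_source[src][dst].append(service)

        # The adaptability score of a source is the number of distinct
        # access series across all of its targets.
        return {
            src: len({tuple(series) for series in targets.values()})
            for src, targets in per_source.items()
        }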
  • 9. 2.1.2 Experiment layout
The experiment was composed of two stages. In the first stage, the algorithm's response to the activity of human attackers was tested, humans being an example of a highly adaptive agent. In the second stage, the algorithm's response to a network worm in a lab environment was tested. Network worms, with their simple algorithms, are examples of completely automatic agents.
2.1.2.1 Testing human subjects
A group of volunteers was assembled for testing the assumptions about human behavior. Besides being human, the test subjects had to be knowledgeable security experts, with practical experience in penetration testing (i.e. breaking into company web sites for testing and improving their security). The human test subjects were given the task of breaking into a prepared site. Only the IP address range was provided, without mention of the purpose of the experiment or which defense mechanisms were employed at the site. The site is protected by an ActiveScout [16] machine, which creates virtual resources for the humans to interact with. The test subjects are able to connect to interactive and transaction based services such as FTP, NetBIOS, HTTP and Telnet. The virtual site presented appears to contain some "security holes" in the form of vulnerable services and open ports. The vulnerable services show a welcome banner that clearly proclaims an aged version of a common server application, one that is known to the security community to be open for exploitation and can be abused to break into the hosting computer. Most of the versions reported in the welcome banners have publicly available exploits. The protection of the site is configured so that after scanning for ports, many more "virtual" resources than real resources will answer, and accessing (as opposed to scanning) these virtual resources will cause the attacker to be locked out of the site for a short duration of time (4 hours). As we are using a commercial product to implement the experiment, the lockout period and conditions are configurable. Specifically, in this experiment lockout occurs under the following conditions:
1. If the intruder accesses a NetBIOS based resource, lockout will occur after a user/password combination is tried, but enumerating servers, users and other NetBIOS resources does not trigger lockout.
2. If the intruder accesses an HTTP server, lockout will not be triggered.
3. After the intruder completes a handshake with a TCP based simulated resource, lockout will occur.
These are the default settings of the product, which are claimed to provide maximum protection while ensuring a low number of false positives.
  • 10. A real web server was installed in the site to serve as the "trophy" to be found. This web server, a Linux machine running an old (unpatched) version of Apache, was vulnerable to several well-known weaknesses. A hacker finding (and breaking into) this web server will find a message notifying him of his success. The structure of the experiment would appear to introduce only minor bias. The experiment only approximates a "double blind" experiment, as the tester knows that the attackers are human. The test subjects are not aware which algorithms are being used, or what is being tested. The researcher is running an algorithm that is not aware of the type of test subject. Some bias may have been introduced when designing the test environment, however, as the site is obviously not a "real" one. There are no corporate assets in the site, there is no real content beyond that which is provided by the ActiveScout machine, and there are no legitimate uses for this site. These facts make the site markedly different from a commercial presence on the web. However, the site can pass for a bare bones website, and the attackers employ the same techniques they use when penetrating "real" web sites. Thus, the experiment will record human attacker activity.
2.1.2.2 Testing automated agents - worms
To test the assumptions about automated agent behavior, a representative group of current and past network worms was analyzed. Worms are an example of an automated agent, as a static pre-defined algorithm dictates a worm's behavior. The worm experimentation is logistically simpler than running the experiment with humans, since most worms' algorithms are well known. Analysis of each worm's state machine produced enough knowledge about its expected access series. The purpose of testing worms is to understand how a very simple automaton behaves. Worms represent the simplest automatons available.
2.1.3 Experiment results, human subjects
We had each participant keep a journal while conducting the test, containing the activities performed, the tools used and his conclusions. All human subjects achieved scores higher than or equal to 3 on the adaptability scale, meaning they had 3 or more different access series for the targets they accessed, while worms ordinarily had an adaptability score of 1, as they perform according to a deterministic state machine. Below is a summary of the human experts' results, some comments on each, and the resulting adaptability calculation.
  • 11. Test Subject #1
1. Test subject #1 started the experiment by researching the Internet Whois database and DNS to find out information about the site.
2. The test subject continued the experiment with port scans for various common ports. These include FTP, SSH, HTTP, NetBIOS, and some other ports, including 6000/TCP (X-Windows) and the range 1-1024/TCP (looking for other open services). These scans were conducted on the entire range supplied, in deterministic, sequential order.
3. The test subject followed the above scans by manually connecting to the open SSH and HTTP ports, and querying the services for versions and banner information.
4. The test subject attempted to browse the HTTP sites found, but was unable to, as the ActiveScout [16] had by this time determined him to be hostile, and locked him out of the site for the duration of 4 hours.
5. The test subject did not realize at first that he was locked out, and continued various attempts, finally giving up and returning in 8 hours, to accomplish basically the same results.
6. The test concluded after the second scan.
7. The test subject was able to determine which of the simulated computers was a "real" web server, but did not manage to break into it. The test subject arrived at this conclusion after noting the different content of the real web site and the simulated web sites. Following up on this hunch, the test subject mapped out the TCP/IP fingerprints for the simulated web site and the real web site, and found a bug in the implementation of the simulated web site. The simulated web site, while claiming to be the same OS model as the real web site, had a different response to specific TCP packets.
Discussion: From the log this test subject kept, we learn that he attempted to use outside information (DNS, Whois) to learn about the site before accessing it, attempting to glean information about available services and servers before doing the actual scan. This kind of behavior is rarely seen in automated agents to date; although possible, most automated agents do not tend to seek outside information about the site being attacked, an exception being the topologically aware worms discussed below. The classification thesis proved robust against this intruder, as the test subject was attracted to open ports and attempted to penetrate them in various ways in order to break into the site. The test subject did not follow a specific plan once the attack passed the "reconnaissance" phase. This fact contributed to the adaptability and to a greater score in the algorithm. The access series for this test subject appear below in figure 1. These port groups are counted towards an adaptability score of 5.
  • 12.
1. 22/TCP,21/TCP,80/TCP,135/TCP,137/UDP, 22/TCP
2. 22/TCP,21/TCP,80/TCP,135/TCP,137/UDP, 22/TCP,80/TCP
3. 22/TCP,21/TCP,80/TCP,135/TCP,137/UDP, 80/TCP
4. 22/TCP,21/TCP,80/TCP,135/TCP,137/UDP, 22/TCP,6000/TCP
5. 1-1024/TCP,6000/TCP
Score = 5
Figure 1: Test subject #1 port groups.
Test Subject #2
1. The test subject started the experiment by making a select few connections to FTP, HTTP, NetBIOS and ports 467/TCP and 484/TCP. It is interesting to note that ports 467/TCP and 484/TCP are not used by any known protocol. When later queried about this, the test subject replied that the ports were selected without a practical reason, as an attempt to confuse an operator of the site, if there was one.
2. These selective connections to specific ports suggest that the test subject made an earlier reconnaissance attempt from a different source address. The test subject confirmed this suspicion in a later interview.
3. After a break of four days, the test subject returned from the same source address, and performed a "ping sweep" of the available address space.
4. The test subject also performed scans to ports 98/TCP and 96/TCP.
5. Finally, the test subject concluded with a horizontal NetBIOS scan.
Discussion: In a later interview, the test subject confirmed that in the four days he did not actively attack the site, he contemplated the problem and deduced that a network device of some sort protects the site. The test subject did not divine the nature of the "network device", but his later behavior is explained by this deduction. Apparently he realized that the device was reacting to scans. He attempted to access unusual ports in order to test the device's response. Failing to find any weakness, and receiving what seemed like fake results from the NetBIOS scan, he gave up. The access series output by the algorithm are summarized in figure 2 below.
1. 137/UDP 80/TCP ICMP
2. 137/UDP 21/TCP ICMP
3. 98/TCP 96/TCP
4. 484/TCP 467/TCP
Score = 4
Figure 2: Test subject #2 port groups.
  • 13. Test Subject #3
1. Test subject #3 started the experiment by automatically mapping out the entire network range.
2. The test subject then followed up on several ports in random order, concentrating on 22/TCP (SSH), 80/TCP (HTTP) and 21/TCP (FTP).
3. The scan included attempts to determine the operating system of the attacked virtual computers by testing TCP flags, using an automated tool called "nmap" [20].
4. After every connection attempt to a responsive "virtual host" the test subject would be blocked out of the network for a period of time.
5. This behavior of the network frustrated the test subject, and the experiment was concluded.
Discussion: While the experiment with this test subject did not itself provide a large amount of data to analyze, as the test subject gave up quickly on the test, the establishment of ground truth for the research benefited. The test subject, although not very thorough, did work in an adaptive way, returning to those ports that were responsive on the scanned hosts. The test subject spent more time on responsive resources, and tried to determine versions of applications and the operating system. The test subject's operating mode was completely adaptive; summarized below are the different port series output by the algorithm for this test subject. The score for this test subject is low, 3, and can be explained by his lack of interest and frustration. Although low, this score is still higher than the scores awarded to automated agents, below.
1. 22/TCP 21/TCP 22/TCP 80/TCP
2. 21/TCP 22/TCP 21/TCP 80/TCP
3. 21/TCP 22/TCP 80/TCP
Score = 3
Figure 3: Test subject #3 port groups.
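As a check on the scoring logic, feeding figure 3's three port series into the sketch from section 2.1.1 reproduces this subject's score. The source and host names below are invented placeholders, one host per distinct target.

    events = [
        # Figure 3, series 1 (target "host1"): 22, 21, 22, 80.
        ("subject3", "host1", "22/TCP"), ("subject3", "host1", "21/TCP"),
        ("subject3", "host1", "22/TCP"), ("subject3", "host1", "80/TCP"),
        # Figure 3, series 2 (target "host2"): 21, 22, 21, 80.
        ("subject3", "host2", "21/TCP"), ("subject3", "host2", "22/TCP"),
        ("subject3", "host2", "21/TCP"), ("subject3", "host2", "80/TCP"),
        # Figure 3, series 3 (target "host3"): 21, 22, 80.
        ("subject3", "host3", "21/TCP"), ("subject3", "host3", "22/TCP"),
        ("subject3", "host3", "80/TCP"),
    ]
    print(adaptability_scores(events))  # {'subject3': 3}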
  • 14. 2.1.4 Testing human subjects, conclusions
Several conclusions arise from interviewing the test subjects and analyzing their actions. It appears that all test subjects became bored at one stage or another with the work. The fact that this was volunteer work, along with their growing suspicion that "something fishy" was going on, contributed to their wrapping up the experiment quicker than they would have if their work were paid for or of personal interest, two common motivations for human hackers. This could lead to another indicator for human activity, but this research will not focus on it. All test subjects reported a suspicion that "something fishy" was going on. This feeling was caused by the fact that the ActiveScout protection mechanism generated resources in response to scans, and blocked users for a period of time after they had proved to be hostile by accessing the generated resources. All test subjects eventually got blocked. The blocking was released automatically after a period of time. This behavior contributed to the frustration of the test subjects, as the site did not seem to provide a consistent image of resources over time. Frustration is not a trait of automated agents. While the test subjects were familiar with mechanisms that block offensive users, they were not familiar with mechanisms that work by detecting access to "virtual" resources, but rather with signature based mechanisms. Another conclusion is that hackers may not immediately follow the reconnaissance phase with an attack phase. Some hackers will take the results and analyze them manually, and then employ exploits against each vulnerable spot at their leisure. Additionally, these separate stages may come at the target from different source IP addresses, due to the use of DHCP or even due to an active attempt to disguise the origin of an attack. During the reconnaissance phase, when hackers employ automated scanning tools, they are still considered to be automated agents. When hackers turn to selectively attacking scanned resources, they will appear to be manual sources of attack. The algorithm proved robust even when hackers performed the reconnaissance phase from a different source address than the one they attacked from. The reconnaissance source address will be detected (correctly) as an automated source, and the attack will be detected as a more adaptive source. Running several automated tools from the same source IP can cause enough adaptability to be counted as a manual source, which is as expected: if the attacker employs a fully automated script it will produce different results than if the attacker runs several scanning tools in response to results from the site. While human test subjects in this experiment employed automated tools for most of the reconnaissance phase, they performed the attack phase manually. This result is reinforced by the fact that the availability of scanning tools is much greater than the availability of automated "autorooter" tools, tools that perform an attack automatically. Autorooters perform the attack cycle from beginning to end, finally providing the user with a list of compromised hosts. Most of the "attack tools" available are usually tools which break into a single host, and the user is left to decide on his own what to do after the break-in. A reasonable assumption would be that users who write even more advanced
  • 15. attack tools show even greater adaptability and randomness of actions when finally performing manual actions on their own.
2.1.5 Analysis of recent network worms as examples of automated agents
To complete the process of establishing ground truth, we need to look at the other side of the spectrum, at fully automated sources of attack. The most common automated agent is the network worm. The following is a study of recent well-known worms discovered on the Internet. We make the distinction between email worms and network worms. The difference between these two kinds of malicious agents is that email based worms usually require some sort of human interaction, such as opening the mail message and executing an attachment. The claim we need to establish is that network worms have a simple state machine, resulting in a predictable order of actions. A simple, predictable behavior will result in a low adaptability score. The worms studied are summarized below; for each worm, the adaptability score is calculated according to the worm's algorithm. Included is the list of ICMP echo requests and TCP/UDP ports the worm accesses; this list is the basis for the adaptability score calculated. In some cases, a copy of network traffic from a captured host was not found, and we had to rely on public knowledge bases such as SANS [38] and CERT [39]. This knowledge was used to divine the worm's algorithm, for which an adaptability score was calculated.
2.1.5.1 Sadmind
First report by CERT: May 10, 2001 [22]. The Sadmind worm (also named PoisonBox.worm in some resources) is a worm that employs an exploit for a Solaris service (sadmind) to spread. Once a machine is infected, in addition to propagating, the machine will scan for and deface IIS web servers. This is an example of a multimode worm, although the second mode is used for defacing and not propagation. Since the IIS infection is performed in conjunction with the propagation process, there are two possible port series for this worm. The worm will attempt the same series of ports for all victims, be they exploitable Solaris workstations or defaceable IIS machines. The port series calculated for this worm are:
1. 111/TCP, 600/TCP: the propagation phase. The first port is the portmapper service used for exploitation; the exploit will open a backdoor on port 600/TCP.
2. 80/TCP: the port used to launch an exploit against the IIS machines, which will then download the defaced web pages from the attacking machine.
The adaptability score for this worm would be 2, due to the additional feature of webpage defacement, a new series of actions attempted against hosts independently of the first series.
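For contrast with the human subjects, applying the scoring sketch from section 2.1.1 to Sadmind's two fixed series yields a score of 2 no matter how many hosts the worm touches. The addresses below are invented for the illustration, with one group of hosts receiving the propagation series and another the defacement series.

    events = []
    for i in range(1, 50):
        # Propagation phase: portmapper exploit, then the 600/TCP backdoor.
        events += [("sadmind", f"10.0.0.{i}", "111/TCP"),
                   ("sadmind", f"10.0.0.{i}", "600/TCP")]
    for i in range(1, 50):
        # Defacement phase: IIS exploit over HTTP.
        events.append(("sadmind", f"10.0.1.{i}", "80/TCP"))
    print(adaptability_scores(events))  # {'sadmind': 2}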
  • 16. 2.1.5.2 Code Red I & II
The Code Red worm has several variants, but the security community makes a distinction between two major variants sharing the use of the same vulnerability: Code Red's first variant was reported by CERT on Jul. 19, 2001 [23]; the Code Red II report by CERT is from Aug. 6, 2001. Code Red connects to the HTTP port of any victim and sends a specially crafted request containing the worm itself; the worm code will execute from the stack of the exploited process. This worm's state machine is simple, and consists of two stages: the generation of a random IP, and the probe/attack/spread phase, which is contained within the malicious payload. The list of ports contains only port 80/TCP, resulting in an adaptability score of 1.
2.1.5.3 Nimda
First report by CERT: Sep. 18, 2001 [26]. Nimda is a multi-mode worm, which spreads by either:
1. Infecting files in open NetBIOS shares.
2. Infecting web pages present on the attacked computer, thereby spreading to unsuspecting web clients.
3. Replicating by sending itself as an executable attached to an email message from the infected computer.
4. Attacking IIS machines, using a weakness of the HTTP server.
5. Exploiting the backdoor left by Code Red II and Sadmind.
The Nimda worm will gain a score of 2 with the algorithm, as the worm's algorithm attacks the following port groups:
1. 80/TCP [IIS & Code Red backdoor]
2. 445/TCP, 139/TCP [open NetBIOS shares]
The web page infection and the infecting email cannot be seen by our chosen IDS. Although this worm has several modes of attack, its algorithm is based on a simple state machine; even taking into consideration the modes not seen by our IDS, the worm will still retain a low score of 3. This worm does, however, merit a discussion of multi-mode worms, below.
2.1.5.4 Spida
First report by CERT: May 22, 2002 [24]. The Spida worm will connect to a Microsoft SQL server and exploit a default "null" password for an administrative account in order to propagate and spread. The worm operates entirely over 1433/TCP, providing it with an adaptability score of 1.
  • 17. 2.1.5.5 Slapper
First report by CERT: Sep. 14, 2002 [25]. The Slapper worm will test systems for mod_ssl (the vulnerable web server module) by grabbing the web page banner from 80/TCP. If mod_ssl is present, the worm will connect to port 443/TCP and launch an exploit. Because this worm is Linux based, it downloads and recompiles its code on the infected system, as opposed to other worms that use static binaries. Since the worm operates in a pre-defined port order for all victims (80/TCP, 443/TCP), the worm has an adaptability score of 1.
2.1.5.6 Slammer
First report by CERT: Jan. 27, 2003 [27]. The worm sends a specially crafted UDP packet to SQL services, which will cause the SQL server to start sending the same packet to random IP addresses in a never ending loop. The worm's state machine is extremely simple and is similar to the one employed by Code Red II above: generate an IP address and probe/attack/spread. This malware has an adaptability score of 1, where the port series includes simply 1434/UDP.
2.1.5.8 Blaster
First report by CERT: Aug. 11, 2003 [28]. Blaster attacks by exploiting a weakness in Microsoft DCOM RPC [41] services. The worm connects to port 135/TCP, and the compromised host is instructed to "call back" to the attacker and retrieve the worm code. The worm operates on this outgoing port only, gaining it a score of 1.
2.1.5.9 Welchia/Nachi
First report by CERT: August 18, 2003. Shortly after Blaster was released, a worm dubbed "Welchia" [40] was unleashed. This worm was especially interesting, as it seems it was written with the intent of being a "benign worm". When Welchia successfully infects a host, it will first seek out and remove any copies of the "Blaster" worm. Additionally, the worm will attempt to install the relevant patch from Microsoft. Welchia exploits two vulnerabilities in Microsoft systems: the RPC DCOM vulnerability used by Blaster, and a vulnerability in NTDLL commonly known as the "WebDAV" vulnerability. The worm will seek one of 76 hard-coded class B networks when attempting to infect with the WebDAV vulnerability. Presumably, the worm's author scanned these networks beforehand, as the networks are owned by Chinese organizations, and the WebDAV exploit bundled with Welchia only works on some double-byte character platforms, Chinese being one of these vulnerable platforms.
  • 18. As Welchia targets different hosts with its two exploits, the worm will gain an adaptability score of 2.
2.1.5.10 Sasser
First report by CERT: May 1, 2004; reported by Symantec on April 30, 2004. The Sasser worm operates in a way not dissimilar to Blaster. The worm will exploit an RPC service on the Microsoft Windows platform; the exploit will cause a remote command backdoor to be opened, through which the worm will instruct the victim to download and execute the malicious payload. The port series for this worm is 445/TCP, with 9996/TCP for the remote backdoor created. Some variants use ICMP ping requests to speed up the process, but all known variants follow a single predetermined port series for all victims, gaining the worm a score of 1.
2.1.5.11 Santy/PHP include worms
First report by CERT: December 21, 2004. The Santy worm is an example of a topological worm. This worm gathers a "hit list" of sources to attack from a Google query (later variants use other search engines) looking for a vulnerable version of a PHP (an HTML scripting language used to present dynamic web pages) application, specifically phpBB, a bulletin board package. Since this worm does not perform any reconnaissance, it will bypass our chosen IDS system: Active Response Technology depends on intruders using baits distributed during an early reconnaissance phase. However, going over captured traffic from this worm shows that the worm performs a simple series of actions, all over the HTTP port. The algorithm will award this worm an adaptability score of 1, as it does not divert from this order of actions, nor does it try any other services.
2.1.6 Multi mode worms, discussion
As seen above, most worms have a very simple state machine and they tend to follow a very specific order of actions for each host attacked. However, there are worms that will attempt several exploits against a single target; examples from the above list are Nimda and Sadmind. Such worms will gain a higher adaptability score if they attempt a different set of attacks for each host. Multi mode worms will still linger a very short while with each victim, and they will still follow a pre-defined order of actions, although their algorithm will be more complex. Multi mode worms tend to have a lower number of variants, and are generally rare (although this trend may change in the future).
  • 19. 2.1.7 Topologically aware worms, discussion
Topological worms are worms that use additional sources of information when deciding which victim to attack. An example is Santy (above). Such worms are not more or less adaptive than other worms, as the process of choosing victims is independent from the process of attacking each victim. These worms do present a problem for this research: our chosen IDS depends on an intruder (be it an adaptive source such as a human or an automatic source) doing some reconnaissance before attacking. If a worm is topologically aware, it may be able to ignore the virtual resources the Active Response technology sets as baits, by using information gleaned elsewhere, such as a search engine. This problem is mitigated by the fact that, looking at a traffic dump of a source infected by a topologically aware worm, an adaptability score can be determined manually.